[Feature] Add statistics sidecar with mergeable NDV sketches

### Search before asking

- [x] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.


### Motivation

  Paimon already has table and column statistics exposed through `Statistics` and `ColStats`, and Spark can use these statistics through `PaimonStatistics` for CBO. However, the current
  NDV information is stored as a scalar `distinctCount`, which is not naturally mergeable or incrementally maintainable.

  For large append-heavy tables, recomputing full-table column statistics can be expensive. At the same time, simply combining scalar NDV values from partitions or snapshots is not
  correct: summing them overestimates when values overlap, while taking the maximum underestimates in many cases. As a result, NDV statistics can easily become stale or unavailable for
  large tables, which affects Spark's cardinality estimation, join reordering, broadcast decisions, and aggregation planning.

  For example, a table may append new data every day and queries may frequently join or aggregate by a high-cardinality column such as `user_id` or `device_id`. Each day's data can have
  its own distinct value count, but the global distinct count cannot be derived correctly from those scalar counts because the same values may appear across multiple days. A mergeable
  sketch, such as a Theta sketch, can represent each partition or snapshot as a compact summary and produce an approximate global NDV by unioning those sketches.

  This proposal focuses on adding the missing statistics infrastructure in a backward-compatible way. Existing scalar statistics can continue to be exposed to engines, while richer
  statistics can be stored in optional sidecar files and consumed when available.

### Solution

_No response_

### Anything else?

_No response_

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Feature] Add statistics sidecar with mergeable NDV sketches #8381

Search before asking

Motivation

Solution

Anything else?

Are you willing to submit a PR?

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

[Feature] Add statistics sidecar with mergeable NDV sketches #8381

Description

Search before asking

Motivation

Solution

Anything else?

Are you willing to submit a PR?

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions