
[vector] Support VectorType in common row paths #8370

Open
QuakeWang wants to merge 2 commits into apache:master from QuakeWang:vector-common


@QuakeWang

Member

Purpose

VectorType was not supported in the common row-to-column conversion or in compacted row serialization, which left the common data paths incomplete for vector columns. Nullable vector columns also need to preserve fixed-size dense child offsets (a null parent row still occupies its child slots) for vectorized consumers such as Arrow writers.

This PR adds a heap vector column vector, supports VectorType in RowToColumnConverter and RowCompactedSerializer, and rejects vector fields from key/comparator paths because vectors are not comparable.
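
To illustrate what "fixed-size dense child offsets" means for a nullable vector column, here is a minimal sketch. It is not the code in this PR: `DenseVectorColumn` and all of its methods are hypothetical names, and the element type is assumed to be `float`. The idea shown is that a null parent row still reserves `length` child slots, so row `i` always starts at child offset `i * length`. The length and null-element checks are assumptions inferred from the test names listed under Tests.

```java
import java.util.Arrays;

// Hypothetical sketch: class and method names are illustrative, not Paimon's API.
final class DenseVectorColumn {
    private final int length;       // fixed element count of every vector
    private final float[] child;    // dense child storage: capacity * length slots
    private final boolean[] isNull; // parent null flags, one per row
    private int rows;

    DenseVectorColumn(int length, int capacity) {
        if (length <= 0 || capacity < 0) {
            throw new IllegalArgumentException("length must be > 0 and capacity >= 0");
        }
        this.length = length;
        this.child = new float[length * capacity];
        this.isNull = new boolean[capacity];
    }

    // Appends one row. A null vector marks the parent row as null but still
    // consumes `length` child slots, which keeps child offsets dense.
    void append(Float[] vector) {
        if (rows == isNull.length) {
            throw new IllegalStateException("column is full");
        }
        int base = offsetOf(rows);
        if (vector == null) {
            isNull[rows] = true;
            Arrays.fill(child, base, base + length, 0f); // placeholder values
        } else {
            if (vector.length != length) {
                throw new IllegalArgumentException(
                        "expected " + length + " elements but got " + vector.length);
            }
            // Validate before writing so a rejected row leaves no partial data.
            for (Float v : vector) {
                if (v == null) {
                    throw new IllegalArgumentException("vector elements must not be null");
                }
            }
            for (int i = 0; i < length; i++) {
                child[base + i] = vector[i];
            }
        }
        rows++;
    }

    // Child offset of a row: always row * length, regardless of earlier nulls.
    int offsetOf(int row) {
        return row * length;
    }

    boolean isNullAt(int row) {
        return isNull[row];
    }

    // Returns a copy of the row's elements, or null for a null parent row.
    float[] get(int row) {
        if (isNull[row]) {
            return null;
        }
        int base = offsetOf(row);
        return Arrays.copyOfRange(child, base, base + length);
    }

    public static void main(String[] args) {
        DenseVectorColumn col = new DenseVectorColumn(3, 4);
        col.append(new Float[] {1f, 2f, 3f});
        col.append(null); // null row still occupies child slots 3..5
        col.append(new Float[] {4f, 5f, 6f});
        System.out.println(col.offsetOf(2));             // 6
        System.out.println(Arrays.toString(col.get(2))); // [4.0, 5.0, 6.0]
    }
}
```

With this layout, a consumer that expects a fixed stride per row can read the child buffer directly and consult only the parent null flags to decide which rows are valid.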

Tests

  • mvn -pl paimon-common -Dtest=RowToColumnConverterTest#testConvertVectorType+testConvertNullableVectorType+testConvertVectorTypeWithInvalidLength+testConvertVectorTypeWithNullElement test
  • mvn -pl paimon-common -DfailIfNoTests=false '-Dtest=RowCompactedSerializerTest$VectorTypesTest' test
  • mvn -pl paimon-core -Dtest=SchemaValidationTest#testVectorTypeCanNotBeKey test
  • mvn -pl paimon-arrow -am -DfailIfNoTests=false -Dtest=ArrowVectorizedBatchConverterTest#testNullableVectorColumnFromRowToColumnConverter test

Support VectorType in RowToColumnConverter with a heap vector column vector that preserves fixed-size dense child offsets, including nullable parent vectors.

Also add compacted row serialization for vector fields and reject vector fields from key/comparator paths.

Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
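
The rejection of vector fields from key paths can be pictured with the following sketch. It is only an illustration under stated assumptions: `KeyFieldValidator`, `TypeRoot`, and `Field` are hypothetical stand-ins, not Paimon's schema classes, and the real validation in this PR may be structured differently.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: these types stand in for a real schema model.
final class KeyFieldValidator {
    enum TypeRoot { INT, STRING, VECTOR }

    static final class Field {
        final String name;
        final TypeRoot type;

        Field(String name, TypeRoot type) {
            this.name = name;
            this.type = type;
        }
    }

    // Fails fast if any key column has a vector type, since vectors
    // have no defined ordering and cannot be compared as keys.
    static void validateKeys(List<Field> fields, List<String> keyNames) {
        for (Field field : fields) {
            if (keyNames.contains(field.name) && field.type == TypeRoot.VECTOR) {
                throw new IllegalArgumentException(
                        "Vector field '" + field.name + "' can not be used as a key");
            }
        }
    }

    public static void main(String[] args) {
        List<Field> fields = Arrays.asList(
                new Field("id", TypeRoot.INT),
                new Field("embedding", TypeRoot.VECTOR));
        validateKeys(fields, Arrays.asList("id")); // passes
        try {
            validateKeys(fields, Arrays.asList("embedding"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Checking at schema-validation time surfaces the problem when the table is defined, rather than later when a comparator is first built for the key.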
