Core: Make partition field in TrackedFile optional#17000

Open
gaborkaszab wants to merge 1 commit into apache:main from gaborkaszab:main_trackedfile_partition_optional

@gaborkaszab

Contributor

No description provided.

  FILE_SIZE_IN_BYTES,
  SPEC_ID,
- Types.NestedField.required(PARTITION_ID, PARTITION_NAME, partitionType, PARTITION_DOC),
+ Types.NestedField.optional(PARTITION_ID, PARTITION_NAME, partitionType, PARTITION_DOC),

Contributor

Now that this field is optional, partition() can return null (for example a projection without the partition column, per TestTrackedFileStruct.projectionWithoutPartition). The partition() javadoc at line 145 still says only "Returns partition for this file as a StructLike", while the other optional getters in this interface all document "or null". Suggest noting the null case there, e.g. "or null if the partition is not present".

Contributor Author

The pattern seems to be simply writing ", or null.". I added that to the existing comment.
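
For readers following the thread, a minimal sketch of what the agreed javadoc wording implies for callers. The interfaces and class below are hypothetical stand-ins (they are not the real `TrackedFile` or `StructLike` types), and the javadoc text is only an approximation of the ", or null." pattern mentioned above.

```java
/** Hypothetical stand-in for StructLike: an ordered tuple of partition values. */
interface PartitionTupleSketch {
  int size();
}

/** Hypothetical, trimmed-down stand-in for the TrackedFile interface. */
interface TrackedFileSketch {
  /** Returns the partition for this file as a tuple, or null. */
  PartitionTupleSketch partition();
}

class NullablePartitionJavadocSketch {
  /** Callers of a nullable getter need a guard before dereferencing the result. */
  static int partitionFieldCount(TrackedFileSketch file) {
    PartitionTupleSketch partition = file.partition();
    return partition == null ? 0 : partition.size();
  }

  public static void main(String[] args) {
    // A file read through a projection that omits the partition column.
    TrackedFileSketch withoutPartition = () -> null;
    // A file whose partition tuple has two fields.
    TrackedFileSketch withPartition = () -> () -> 2;

    System.out.println(partitionFieldCount(withoutPartition));
    System.out.println(partitionFieldCount(withPartition));
  }
}
```

Documenting the null case on the getter is what tells callers that a guard like the one in `partitionFieldCount` is needed.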

gaborkaszab force-pushed the main_trackedfile_partition_optional branch from d6cf350 to ebe1087 on June 29, 2026 14:08
/** Adapts {@link TrackedFile} entries to the {@link DataFile} and {@link DeleteFile} APIs. */
class TrackedFileAdapters {

static final Types.StructType EMPTY_STRUCT_TYPE = Types.StructType.of();

Contributor

We can probably use the package-private constants from the BaseFile class.

Contributor Author

You're right. I no longer introduce it here and reuse the one in BaseFile instead.

  @Override
  public StructLike partition() {
-   return file().partition();
+   return file().partition() != null ? file().partition() : EMPTY_PARTITION_DATA;

Member

nit: We could use the MoreObjects.firstNonNull method. Feel free to disregard this comment since this method isn't widely used in the codebase.

Contributor Author

I just grepped the Java codebase, and apparently there is only a single usage of this. I'd prefer to keep the ternary as is, to follow the pattern around these files.
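
To make the two options in this thread concrete, here is a small dependency-free sketch. It uses `java.util.Objects.requireNonNullElse` as a JDK analog of Guava's `MoreObjects.firstNonNull` so the example runs without Guava; `EMPTY_PARTITION` and the `List<Object>` partition type are hypothetical stand-ins for `EMPTY_PARTITION_DATA` and `StructLike`, not the actual Iceberg types.

```java
import java.util.List;
import java.util.Objects;

class PartitionFallbackSketch {
  // Hypothetical stand-in for the EMPTY_PARTITION_DATA constant reused from BaseFile.
  static final List<Object> EMPTY_PARTITION = List.of();

  // Ternary form, matching the shape kept in the PR.
  static List<Object> withTernary(List<Object> partition) {
    return partition != null ? partition : EMPTY_PARTITION;
  }

  // Helper form: returns the first argument if it is non-null, otherwise the second.
  static List<Object> withHelper(List<Object> partition) {
    return Objects.requireNonNullElse(partition, EMPTY_PARTITION);
  }

  public static void main(String[] args) {
    List<Object> present = List.of("2026-06-30");

    // Both forms fall back to the empty partition when the input is null.
    System.out.println(withTernary(null) == withHelper(null));
    // Both forms pass a non-null partition through unchanged.
    System.out.println(withTernary(present) == withHelper(present));
  }
}
```

The two forms behave the same for these inputs. One practical difference is that a helper call evaluates its argument once, whereas the ternary in the diff calls `file().partition()` twice.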

gaborkaszab force-pushed the main_trackedfile_partition_optional branch from ebe1087 to 21f8c80 on June 30, 2026 07:13

gaborkaszab left a comment

Contributor Author

Thanks for the reviews, @stevenzwu, @ebyhr!

// partition is rebuilt with the supplied struct types inside schemaWithContentStats, so its
// ordinal is looked up by field ID.
private static final int PARTITION_ORDINAL = ordinalOf(TrackedFile.PARTITION_ID);
private static final int CONTENT_STATS_ORDINAL = ordinalOf(TrackedFile.CONTENT_STATS_ID);

Contributor Author

Note: this was unused, so I removed it as part of this PR.

- // should return EMPTY_PARTITION_DATA
- assertThat(file.partition()).isNotNull();
- assertThat(file.partition().size()).isEqualTo(0);
+ assertThat(file.partition()).isNull();

Contributor

Curious and non-blocking: if the intention is to test projection without the partition, do we also need coverage for a projection where the file has a valid, non-null partition that is not included in the projection?
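
A rough sketch of the two cases this question distinguishes. Everything here is hypothetical: the `projectedPartition` helper, the column names, and the assumption that a projected-out column reads as null regardless of the stored value are illustrative only and do not reflect the actual `TrackedFile` struct implementation or its tests.

```java
import java.util.List;
import java.util.Set;

class ProjectionCoverageSketch {
  /**
   * Hypothetical projected view: the partition is visible only when the "partition" column is
   * part of the projection. This models an assumption, not Iceberg's real behavior.
   */
  static List<Object> projectedPartition(List<Object> storedPartition, Set<String> projected) {
    return projected.contains("partition") ? storedPartition : null;
  }

  public static void main(String[] args) {
    // Hypothetical projection that leaves out the partition column.
    Set<String> withoutPartition = Set.of("file_size_in_bytes", "spec_id");

    // Case 1: no partition is stored and the column is not projected.
    System.out.println(projectedPartition(null, withoutPartition) == null);

    // Case 2 (the one raised above): a non-null partition is stored but not projected.
    System.out.println(projectedPartition(List.of("2026-06-30"), withoutPartition) == null);
  }
}
```

Under that assumption both cases read as null, so a test for the second case would mainly guard against a stored value leaking through a projection that excludes it.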

Projects

Status: In review

5 participants