fix: handle where predicate contains list #158

Merged
drgroot merged 1 commit into main from drgroot-patch-1
Apr 4, 2026

Conversation

Member

@drgroot drgroot commented Apr 4, 2026

No description provided.

Copilot AI review requested due to automatic review settings April 4, 2026 15:59
@drgroot drgroot enabled auto-merge April 4, 2026 15:59
Contributor

Copilot AI left a comment

Pull request overview

Updates Delta Lake overwrite predicate generation to support partition filters where a partition key has multiple values (i.e., an IN (...) predicate), preventing failures when _filters() returns list-valued filter tuples.

Changes:

  • Build overwrite predicates incrementally instead of " ".join(tuple) so list-valued filters can be rendered as col in (v1, v2, ...).
  • Introduce special handling for list-valued filter tuples when assembling the predicate string.
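
As a rough illustration of the two changes above, the sketch below rebuilds the predicate incrementally. The function name `build_predicate`, the `(column, op, value)` tuple shape, and the `" AND "` joiner are assumptions inferred from the diff, not the project's actual API; values are assumed to arrive as pre-quoted strings, as the merged code expects.

```python
from typing import Any, List, Sequence, Tuple

# (column, op, value): shape assumed from the diff under review
FilterTuple = Tuple[str, str, Any]


def build_predicate(filters: Sequence[FilterTuple], operator: str = " AND ") -> str:
    """Mirror the loop in the diff; values are assumed to be pre-quoted strings."""
    predicate_list: List[str] = []
    for column, op, value in filters:
        if isinstance(value, list):
            # List-valued filter: render as "col in (v1, v2, ...)"
            in_list = ", ".join(value)
            predicate_list.append(" ".join([column, op, f"({in_list})"]))
        else:
            # Scalar filter: the original " ".join(tuple) behaviour
            predicate_list.append(" ".join((column, op, value)))
    return operator.join(predicate_list)


filters = [("year", "=", "'2024'"), ("month", "in", ["'01'", "'02'"])]
print(build_predicate(filters))  # year = '2024' AND month in ('01', '02')
```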

Comment on lines +134 to +138
if isinstance(tuple_value[-1], list):
    in_list = ", ".join(tuple_value[-1])
    predicate_list.append(" ".join([tuple_value[0], tuple_value[1], f"({in_list})"]))
else:
    predicate_list.append(" ".join(tuple_value))

Copilot AI Apr 4, 2026

Both branches build the predicate using str joins (", ".join(tuple_value[-1]) and " ".join(tuple_value)), which will raise TypeError if partition values aren't already strings (e.g., int partitions). Since _filters() returns Any for the value, this should defensively stringify/format values (and ideally handle quoting/escaping for strings) instead of assuming pre-quoted string literals.
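
One possible shape for that defensive formatting is sketched below. `format_literal` and `format_filter` are hypothetical helpers, not part of the project. The sketch assumes values reach it unquoted and that doubling single quotes is an acceptable escape for the predicate parser; if _filters() already returns quoted strings, this would quote them twice.

```python
from typing import Any


def format_literal(value: Any) -> str:
    """Render a Python value as a SQL-style literal (hypothetical helper)."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Treat everything else as a string and double any embedded single quotes.
    escaped = str(value).replace("'", "''")
    return f"'{escaped}'"


def format_filter(column: str, op: str, value: Any) -> str:
    """Render one (column, op, value) filter without assuming str values."""
    if isinstance(value, list):
        in_list = ", ".join(format_literal(v) for v in value)
        return f"{column} {op} ({in_list})"
    return f"{column} {op} {format_literal(value)}"


print(format_filter("year", "=", 2024))          # year = 2024
print(format_filter("month", "in", [1, 2]))      # month in (1, 2)
print(format_filter("name", "=", "O'Brien"))     # name = 'O''Brien'
```

Dates, decimals, and None are not handled here and would need their own rules.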

Comment on lines +133 to +136
for tuple_value in filter:
    if isinstance(tuple_value[-1], list):
        in_list = ", ".join(tuple_value[-1])
        predicate_list.append(" ".join([tuple_value[0], tuple_value[1], f"({in_list})"]))

Copilot AI Apr 4, 2026

If a partition key is provided with an empty list (len==0), _filters() currently emits an in filter and this code will produce col in (), which is typically an invalid Delta predicate. Consider filtering out empty lists in _filters() (skip the key) or raising a clear error before calling write_deltalake.
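
A minimal sketch of the "raise a clear error" option follows. `filters_from_partitions` is a hypothetical stand-in for _filters(), whose real signature and return shape are not shown in this PR; the `"="` and `"in"` operators are likewise assumed.

```python
from typing import Any, Dict, List, Tuple


def filters_from_partitions(partitions: Dict[str, Any]) -> List[Tuple[str, str, Any]]:
    """Hypothetical stand-in for _filters(): reject empty lists up front."""
    filters: List[Tuple[str, str, Any]] = []
    for key, value in partitions.items():
        if isinstance(value, list):
            if not value:
                # An empty list would otherwise render as "col in ()"
                raise ValueError(f"partition {key!r} has an empty value list")
            filters.append((key, "in", value))
        else:
            filters.append((key, "=", value))
    return filters


print(filters_from_partitions({"year": 2024, "month": [1, 2]}))
```

Skipping the key instead of raising would widen the overwrite to every value of that partition, so failing loudly is arguably the safer default.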

Comment on lines +132 to +139
predicate_list: List[str] = []
for tuple_value in filter:
    if isinstance(tuple_value[-1], list):
        in_list = ", ".join(tuple_value[-1])
        predicate_list.append(" ".join([tuple_value[0], tuple_value[1], f"({in_list})"]))
    else:
        predicate_list.append(" ".join(tuple_value))
predicate = operator.join(predicate_list)

Copilot AI Apr 4, 2026

This change adds new behavior for list-valued predicates (in (...)) but there’s no unit test coverage verifying overwrite(..., partitions={key: [v1, v2]}) actually deletes/overwrites only the intended partitions. Adding a test for the multi-value IN case would help prevent regressions (especially around literal formatting/quoting).
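
A starting point for such coverage might look like the pytest-style sketch below. It only checks how the predicate string is rendered; an end-to-end test that writes to a real Delta table and confirms only the targeted partitions change would still be needed. `render_predicate` is a hypothetical stand-in for the code under test, and the `" AND "` joiner is an assumption.

```python
from typing import Any, List, Sequence, Tuple


def render_predicate(filters: Sequence[Tuple[str, str, Any]], operator: str = " AND ") -> str:
    """Stand-in for the predicate assembly under test (not the project's code)."""
    parts: List[str] = []
    for column, op, value in filters:
        if isinstance(value, list):
            parts.append(f"{column} {op} ({', '.join(value)})")
        else:
            parts.append(f"{column} {op} {value}")
    return operator.join(parts)


def test_multi_value_in_predicate() -> None:
    filters = [("month", "in", ["'01'", "'02'"])]
    assert render_predicate(filters) == "month in ('01', '02')"


def test_single_element_list() -> None:
    assert render_predicate([("month", "in", ["'01'"])]) == "month in ('01')"


def test_scalar_and_list_filters_are_joined() -> None:
    filters = [("year", "=", "'2024'"), ("month", "in", ["'01'", "'02'"])]
    assert render_predicate(filters) == "year = '2024' AND month in ('01', '02')"
```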

@drgroot drgroot disabled auto-merge April 4, 2026 16:03
@drgroot drgroot merged commit 7f09c59 into main Apr 4, 2026
19 of 39 checks passed
@drgroot drgroot deleted the drgroot-patch-1 branch April 4, 2026 16:03