Stop using Vec::remove(0) #5

Merged: sgrif merged 1 commit into main from sg-stop-doing-remove-0-fork on Jun 12, 2026

sgrif commented Jun 12, 2026

When using `pg_query::parse` on a very large query (for example `SELECT COUNT(*) FROM users WHERE users.id IN ({1.3M entries})`), pg_query would effectively hang (5-6 minutes in production; I never waited long enough on my local machine to get a measurement).

Profiling the code showed 99.99% of the time was spent in `__memmove_avx_unaligned_erms`, called from `flatten` within `ParseResult::new`. Digging into `nodes`, it was clear why. These functions work by maintaining two lists: the remaining elements to iterate over, and the results to return. When an element is removed from the list to iterate over, they use `Vec::remove(0)`, which must shift the entire remaining vector down by one slot each time. Each removal is therefore O(N) in the length of the list, and processing N elements this way is O(N^2) overall.
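To make the pattern concrete, here is a minimal sketch of a flatten that uses a `Vec` as a queue. It is not the actual pg_query code: the `Node` type and the `flatten_slow` name are hypothetical, chosen only to illustrate the two-list shape described above.

```rust
// Hypothetical stand-in for a parse tree node; the real pg_query nodes
// are generated protobuf types and look nothing like this.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: u32,
    pub children: Vec<Node>,
}

/// Breadth-first flatten that treats a `Vec` as a queue.
pub fn flatten_slow(root: Node) -> Vec<u32> {
    let mut remaining = vec![root]; // elements still to visit
    let mut results = Vec::new(); // elements to return

    while !remaining.is_empty() {
        // `remove(0)` shifts every remaining element left by one slot,
        // so this single call is O(remaining.len()).
        let node = remaining.remove(0);
        results.push(node.id);
        remaining.extend(node.children);
    }
    results
}
```

With a wide, shallow tree (such as an `IN` list with over a million entries), `remaining` stays large for almost the whole loop, so nearly every iteration pays for a large `memmove`.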

This commit changes the code to use `VecDeque` instead, which is built for exactly this purpose and can pop from the front in constant time. Ideally these functions wouldn't allocate at all. Rather than returning `Vec`, they'd be much better off returning `impl Iterator` and yielding without allocation (especially since every internal caller just immediately iterates over the result). But that's a bigger rewrite, and would be a backwards-incompatible change to the API. I am willing to implement that if such a change would be accepted.
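As a rough sketch of both ideas, assuming a hypothetical `Node` type rather than the real pg_query nodes: `flatten_fast` shows the `VecDeque` swap, and `flatten_iter` shows one possible shape of an `impl Iterator` variant. Neither function name comes from the crate.

```rust
use std::collections::VecDeque;

// Hypothetical stand-in for a parse tree node, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: u32,
    pub children: Vec<Node>,
}

/// Same breadth-first flatten, but with a `VecDeque` as the work queue.
pub fn flatten_fast(root: Node) -> Vec<u32> {
    let mut remaining = VecDeque::from([root]);
    let mut results = Vec::new();

    // `pop_front` is O(1): it moves the ring buffer's head index
    // instead of shifting the remaining elements.
    while let Some(node) = remaining.pop_front() {
        results.push(node.id);
        remaining.extend(node.children);
    }
    results
}

/// Lazy variant: yields ids one at a time instead of building a result `Vec`.
/// It still keeps a queue of pending nodes, so it avoids the output
/// allocation but is not entirely allocation-free.
pub fn flatten_iter(root: &Node) -> impl Iterator<Item = u32> + '_ {
    let mut remaining = VecDeque::from([root]);
    std::iter::from_fn(move || {
        let node = remaining.pop_front()?;
        remaining.extend(node.children.iter());
        Some(node.id)
    })
}
```

The eager version keeps the existing `Vec` return type, which is why it can land without an API break. The lazy version changes the signature, which is the backwards-incompatible part.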

In the short term though, using `VecDeque` removes the degenerate performance case, and changes the runtime from so slow that I don't have a measurement to 1.2s on my local machine, which is roughly in line with the Ruby implementation's 975ms.

levkk left a comment

boom

sgrif (Author) commented Jun 12, 2026

The CI failure is due to stable Rust changing the error message of a panic, which is unrelated to this change. Gonna merge and let upstream handle it.

sgrif merged commit 97019d0 into main on Jun 12, 2026
1 of 7 checks passed