Stop using `Vec::remove(0)` by sgrif · Pull Request #5 · pgdogdev/pg_query.rs

sgrif · 2026-06-12T17:01:21Z

When using `pg_query::parse` on a very large query (for example `SELECT COUNT(*) FROM users WHERE users.id IN ({1.3M entries})`), pg_query would effectively hang (5-6 minutes in production; I never waited long enough on my local machine to get a measurement).

Profiling the code showed 99.99% of the time was spent in `__memmove_avx_unaligned_erms`, called from `flatten` within `ParseResult::new`. Digging into `nodes`, it was clear why. These functions maintain two lists: the remaining elements to iterate over, and the results to return. When an element is removed from the list to iterate over, the code uses `Vec::remove(0)`, which must shift the entire remaining vector down by one slot each time. With N elements passing through the list, that is `O(N)` work per removal and `O(N^2)` overall.
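
To make the pattern concrete, here is a minimal sketch of a work-list traversal that pops from the front with `Vec::remove(0)`. It is not the actual pg_query code: the `Node` type, `sample_tree`, and `flatten_slow` are hypothetical names used only for illustration.

```rust
// Hypothetical tree type for illustration; not pg_query's real node type.
struct Node {
    id: u32,
    children: Vec<Node>,
}

// Builds a tiny tree: 1 -> [2 -> [4], 3].
fn sample_tree() -> Node {
    Node {
        id: 1,
        children: vec![
            Node { id: 2, children: vec![Node { id: 4, children: vec![] }] },
            Node { id: 3, children: vec![] },
        ],
    }
}

// Breadth-first flatten using two lists: `remaining` and `results`.
fn flatten_slow(root: &Node) -> Vec<u32> {
    let mut remaining: Vec<&Node> = vec![root];
    let mut results = Vec::new();
    while !remaining.is_empty() {
        // remove(0) shifts every remaining element left by one slot,
        // so each call costs O(len) and the whole loop is quadratic.
        let node = remaining.remove(0);
        results.push(node.id);
        remaining.extend(node.children.iter());
    }
    results
}

fn main() {
    let tree = sample_tree();
    println!("{:?}", flatten_slow(&tree));
}
```

On a four-node tree the cost is invisible; with around a million entries in the list, each `remove(0)` has to move everything behind it, which is consistent with the `memmove` time seen in the profile.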

This commit changes the code to use `VecDeque` instead, which is built for exactly this purpose and can pop from the front in constant time. Ideally these functions wouldn't allocate at all. Rather than returning `Vec`, they'd be much better off returning `impl Iterator` and yielding without allocation (especially since every internal caller immediately iterates over the result). But that's a bigger rewrite, and would be a backwards-incompatible change to the API. I am willing to implement that if such a change would be accepted.
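
As a rough sketch of both ideas, assuming a hypothetical `Node` type rather than pg_query's real one: `flatten_fast` swaps the work list for a `VecDeque`, and `iter_ids` shows one possible shape for an `impl Iterator` variant. The iterator version here only avoids building the result `Vec`; it still keeps an internal work queue, so it is not the fully allocation-free design described above.

```rust
use std::collections::VecDeque;

// Hypothetical tree type for illustration; not pg_query's real node type.
struct Node {
    id: u32,
    children: Vec<Node>,
}

// Builds a tiny tree: 1 -> [2 -> [4], 3].
fn sample_tree() -> Node {
    Node {
        id: 1,
        children: vec![
            Node { id: 2, children: vec![Node { id: 4, children: vec![] }] },
            Node { id: 3, children: vec![] },
        ],
    }
}

// Same breadth-first traversal, but the work list is a VecDeque.
fn flatten_fast(root: &Node) -> Vec<u32> {
    let mut remaining: VecDeque<&Node> = VecDeque::from([root]);
    let mut results = Vec::new();
    // pop_front is O(1): it advances the head index instead of shifting elements.
    while let Some(node) = remaining.pop_front() {
        results.push(node.id);
        remaining.extend(node.children.iter());
    }
    results
}

// Lazy variant: yields ids one at a time instead of collecting them.
fn iter_ids(root: &Node) -> impl Iterator<Item = u32> + '_ {
    let mut remaining: VecDeque<&Node> = VecDeque::from([root]);
    std::iter::from_fn(move || {
        let node = remaining.pop_front()?;
        remaining.extend(node.children.iter());
        Some(node.id)
    })
}

fn main() {
    let tree = sample_tree();
    println!("{:?}", flatten_fast(&tree));
    for id in iter_ids(&tree) {
        println!("{id}");
    }
}
```

Callers that only loop over the result could use the iterator form directly, which is why changing the return type would be an API break for anyone relying on `Vec`.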

In the short term, though, using `VecDeque` removes the degenerate performance case and changes the running time from too slow to measure to 1.2s on my local machine, which is roughly in line with the Ruby implementation's 975ms.

levkk

boom

sgrif · 2026-06-12T17:28:47Z

CI failure is due to stable changing the error message of a panic, unrelated to this change. Gonna merge and let upstream handle it

levkk approved these changes Jun 12, 2026

sgrif merged commit 97019d0 into main Jun 12, 2026
1 of 7 checks passed

sgrif merged 1 commit into main from sg-stop-doing-remove-0-fork