chore: Updating VectorStore batch size to improve performance #182
Draft
jamie-ons wants to merge 1 commit into
…ngle source of truth
        return result_df

    def search(self, query: VectorStoreSearchInput, n_results=10, batch_size=8) -> VectorStoreSearchOutput:  # noqa: C901, PLR0912, PLR0915
Contributor
I think we'd like to retain the option for users to specify a different batch size at this point, but we'd want the default behaviour to follow the single source of truth.
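One way to read this suggestion, as a minimal sketch rather than the actual VectorStore code: the per-call `batch_size` defaults to `None` and falls back to the value set at construction time. The class `VectorStoreSketch`, the constant `DEFAULT_BATCH_SIZE`, the helper `_resolve_batch_size`, and the simplified `search` signature are all hypothetical and only illustrate the fallback logic.

```python
from typing import List, Optional, Sequence

# Placeholder only: the default chosen by this PR is not stated in the description.
DEFAULT_BATCH_SIZE = 8


class VectorStoreSketch:
    """Hypothetical stand-in for VectorStore, showing only batch-size resolution."""

    def __init__(self, batch_size: int = DEFAULT_BATCH_SIZE) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be a positive integer")
        # Single source of truth for every method on this instance.
        self.batch_size = batch_size

    def _resolve_batch_size(self, batch_size: Optional[int]) -> int:
        """Return the per-call override if given, else the instance default."""
        if batch_size is None:
            return self.batch_size
        if batch_size < 1:
            raise ValueError("batch_size must be a positive integer")
        return batch_size

    def search(
        self,
        queries: Sequence[str],
        n_results: int = 10,
        batch_size: Optional[int] = None,
    ) -> List[List[str]]:
        """Split queries into batches; the real search logic is omitted."""
        size = self._resolve_batch_size(batch_size)
        return [list(queries[i : i + size]) for i in range(0, len(queries), size)]
```

With this shape, `VectorStoreSketch(batch_size=4).search(queries)` batches in fours, while `search(queries, batch_size=5)` overrides the default for that call only and leaves the instance value unchanged.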
✨ Summary
VectorStore previously exposed batch_size as a repeated parameter on individual methods, creating multiple independent sources of truth. This PR consolidates that to a single value set at construction time.
To inform the choice of default, a profiling analysis was run across the target GCP instance range at batch sizes from 8 to 512. The default has been updated to the value that minimises search time without risking OOM on the smallest supported instances.
Constraints: must not break or perform significantly worse on 2 vCPU instances; optimised for typical cloud deployments at 4–8 vCPUs.
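For readers who want to repeat the comparison on their own instance, a timing sweep along the lines described above might look like the sketch below. This is not the profiling script used for this PR: the function names, the `run_search` callable, and the intermediate batch sizes (powers of two between 8 and 512) are assumptions, and the sketch measures wall-clock time only, so memory headroom on small instances still has to be checked separately.

```python
import time
from typing import Callable, Dict, Sequence


def time_batch_sizes(
    run_search: Callable[[Sequence[str], int], object],
    queries: Sequence[str],
    batch_sizes: Sequence[int] = (8, 16, 32, 64, 128, 256, 512),  # assumed grid
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time in seconds for each batch size."""
    timings: Dict[int, float] = {}
    for size in batch_sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_search(queries, size)  # hypothetical: runs one full search pass
            best = min(best, time.perf_counter() - start)
        timings[size] = best
    return timings


def fastest_batch_size(timings: Dict[int, float]) -> int:
    """Pick the batch size with the lowest recorded time."""
    return min(timings, key=timings.get)
```

Taking the best of several repeats reduces noise from other processes on shared cloud instances; the fastest size is only a candidate default until it has also been run on the smallest supported instance.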
📜 Changes Introduced
✅ Checklist
🔍 How to Test
To test this code, run the DEMO/general_workflow_demo.ipynb notebook.
To test that it does not break on small instances, run on the following GCP instances and locally: