[SOLR-18176] HttpShardHandler query throughput bottleneck from ZooKeeper #4237

Open
mlbiscoc wants to merge 1 commit into apache:main from mlbiscoc:SOLR-18176-shardhandler-bottleneck

@mlbiscoc
Contributor

https://issues.apache.org/jira/browse/SOLR-18176

HttpShardHandler throughput was bottlenecked because CloudReplicaSource fetched the collection state from ZooKeeper on every distributed request: the lookup was missing the allowCached=true parameter. This caused high CPU utilization in ZooKeeper, and the synchronized call blocked QTP threads while they waited for the ZooKeeper response.

See JIRA above for more detailed information.
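
To illustrate why the flag matters, here is a minimal, self-contained sketch. It is not Solr's code: the class CachedStateSketch, the LazyRef type, the 60-second TTL, and the "state-v1" payload are all hypothetical stand-ins. It only models the behavior described above, where an uncached lookup goes to the remote store on every request while a cached lookup reuses the last result, and where the synchronized method makes concurrent callers queue behind the slow fetch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical model of a cached vs. uncached state lookup; not Solr's actual classes. */
class CachedStateSketch {

    /** Stand-in for a lazily loaded collection reference. */
    static final class LazyRef {
        private final Supplier<String> remoteFetch; // stands in for a ZooKeeper read
        private final long ttlNanos;                // assumed cache lifetime
        private String cached;
        private long cachedAtNanos;

        LazyRef(Supplier<String> remoteFetch, long ttlNanos) {
            this.remoteFetch = remoteFetch;
            this.ttlNanos = ttlNanos;
        }

        // synchronized: concurrent callers wait for each other, so a slow
        // remote fetch stalls every thread that asks for the state.
        synchronized String get(boolean allowCached) {
            long now = System.nanoTime();
            if (allowCached && cached != null && now - cachedAtNanos < ttlNanos) {
                return cached; // served from memory, no remote call
            }
            cached = remoteFetch.get(); // remote round trip
            cachedAtNanos = now;
            return cached;
        }
    }

    /** Performs the given number of lookups and returns how many hit the remote store. */
    static int countFetches(boolean allowCached, int requests) {
        AtomicInteger fetches = new AtomicInteger();
        LazyRef ref = new LazyRef(() -> {
            fetches.incrementAndGet();
            return "state-v1";
        }, TimeUnit.SECONDS.toNanos(60));
        for (int i = 0; i < requests; i++) {
            ref.get(allowCached);
        }
        return fetches.get();
    }

    public static void main(String[] args) {
        System.out.println("uncached fetches: " + countFetches(false, 1000));
        System.out.println("cached fetches:   " + countFetches(true, 1000));
    }
}
```

In this model, 1000 uncached lookups trigger 1000 remote fetches, while 1000 cached lookups inside the TTL trigger a single fetch. The real cache lifetime and invalidation rules in Solr may differ from this sketch.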

@mlbiscoc mlbiscoc requested a review from dsmiley March 24, 2026 20:17
Contributor

@dsmiley dsmiley left a comment

Nice!!!
Remember to run ./gradlew writeChangelog

Thankfully, org.apache.solr.servlet.HttpSolrCall is also asking for the cached version (2 places).

Makes me wonder which other callers of getCollection (and thus allowCached) made a deliberate decision and which were accidental. IMO the cached version should be the default, if there should even be a default.

@mlbiscoc
Contributor Author

> IMO the cached version should be the default, if there should even be a default.

I agree. I thought about the same thing, but I have no idea of the impact and potential consequences of making that change.
