Avoid casting a tsvector column to text and back again. #262
Open
savef wants to merge 2 commits into
When you have a tsvector column but want to search against both it and other text columns (without having to deal with triggers), the :tsvector_column option is no use, as it causes everything in :against to be ignored. Also, since a column's type is available, I don't see why there is a need to specify that something is a tsvector column. This change allows you to simply pass column names to :against: any that are tsvectors are used directly, and the rest are converted to tsvectors as before.

My data set is over 900 records. As well as the document, I search 4 additional small text fields, and a Postgres GIN index is in use. On a query that matched just 4 results, the time dropped from 40ms to 3ms. On a query that matched 397 results, it dropped from over 3000ms to 60ms.
+1'ing this pull request, also it would be nice to make it work in |
I think this essentially solves the problem that the `:tsvector_column` option tries to solve, but automatically. It takes the columns in :against and, if a column is a tsvector, does not cast it (to text, and back to tsvector). I think the only slight difference is that with this commit the column will still be coalesced with an empty string.

Here's what the relevant part of an example query looks like:
setweight(to_tsvector('english', coalesce("documents"."title"::text, '')), 'A') || setweight(to_tsvector('english', coalesce("documents"."author"::text, '')), 'B') || setweight(coalesce("documents"."content_tsv", ''), 'C')

Thanks for reading. If this is something you'll consider merging, here are a couple of follow-up questions.
If you agree that this renders the `:tsvector_column` option obsolete, is it worth removing that feature as part of this PR?

Do the changes here apply to either trigram or dmetaphone searching? I don't use those, but would look at improving them too if the changes are relevant.
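For anyone skimming, here is a rough standalone sketch of the rule this PR applies. It is not the code in the diff: `Column`, `quote_ident`, `tsvector_expression` and `tsdocument` are hypothetical names made up for illustration, and the column type is passed in by hand rather than read from the model. The idea is only that a tsvector column is coalesced and used directly, while any other column goes through `::text` and `to_tsvector` as before.

```ruby
# Hypothetical stand-in for the column metadata the model would expose.
Column = Struct.new(:table, :name, :type, :weight)

# Quote an identifier for PostgreSQL (double quotes, embedded quotes doubled).
def quote_ident(name)
  '"' + name.to_s.gsub('"', '""') + '"'
end

# Build the tsvector expression for a single column.
def tsvector_expression(column, dictionary: "english")
  qualified = "#{quote_ident(column.table)}.#{quote_ident(column.name)}"
  document =
    if column.type == :tsvector
      # Already a tsvector: use it directly, no cast to text and back.
      "coalesce(#{qualified}, '')"
    else
      # Anything else is cast to text and converted, as before.
      "to_tsvector('#{dictionary}', coalesce(#{qualified}::text, ''))"
    end
  column.weight ? "setweight(#{document}, '#{column.weight}')" : document
end

# Concatenate the per-column expressions into one searchable document.
def tsdocument(columns, dictionary: "english")
  columns.map { |c| tsvector_expression(c, dictionary: dictionary) }.join(" || ")
end

columns = [
  Column.new("documents", "title", :string, "A"),
  Column.new("documents", "author", :string, "B"),
  Column.new("documents", "content_tsv", :tsvector, "C")
]

puts tsdocument(columns)
```

With the three columns above, this should print the same fragment as the example query in this description.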
This was #247, but I needed it to point to a new branch on my fork.