Add a stored normalized name column with an index #1160
jobselko
left a comment
Partial review - I will look at viewsets.py tomorrow.
- names = content.order_by("name").values_list("name", flat=True).distinct().iterator()
+ names = (
+     content.order_by("name_normalized")
+     .values_list("name", flat=True)
+     .distinct("name_normalized")
+     .iterator()
+ )
Why was name changed to name_normalized and why was name_normalized added to distinct?
> Why was name changed to name_normalized
Now that we have a name_normalized field, we can use it for better ordering: instead of treating Django, django, and DJANGO as different entries that sort into different positions, we can treat them as the same entry when ordering.
> why was name_normalized added to distinct?
https://docs.djangoproject.com/en/5.2/ref/models/querysets/#django.db.models.query.QuerySet.distinct
"On PostgreSQL only, you can pass positional arguments (*fields) in order to specify the names of fields to which the DISTINCT should apply. [..]For a normal distinct() call, the database compares each field in each row when determining which rows are distinct. For a distinct() call with specified field names, the database will only compare the specified field names."
"When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order." (order_by and distinct fields must match)
Add a name_normalized field to PythonPackageContent that stores the pre-computed LOWER(REGEXP_REPLACE(name, ...)) value, populated via a BEFORE_SAVE hook.
Add db_index=True.
Change all name__normalize= lookups to use name_normalized__exact=. This eliminates the regex computation at query time.
closes: #1159
Assisted By: claude-opus-4.6
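The idea in the description (compute the normalized name once at save time, then look it up by exact match through an index) can be sketched in plain Python without Django. Everything here is hypothetical and only for illustration: `Package`, `PackageTable`, `before_save`, and `filter_name_normalized_exact` are not names from this PR, the dict stands in for the database index, and `normalize` assumes PEP 503 normalization rather than the exact expression in the migration.

```python
import re
from dataclasses import dataclass


def normalize(name: str) -> str:
    # Assumption: PEP 503 normalization; the PR's expression may differ.
    return re.sub(r"[-_.]+", "-", name).lower()


@dataclass
class Package:
    name: str
    name_normalized: str = ""

    def before_save(self) -> None:
        # Stand-in for a BEFORE_SAVE hook: compute the value once, on write.
        self.name_normalized = normalize(self.name)


class PackageTable:
    """Toy table with a dict acting as the index on name_normalized."""

    def __init__(self) -> None:
        self.rows: list[Package] = []
        self.index: dict[str, list[Package]] = {}

    def save(self, pkg: Package) -> None:
        pkg.before_save()
        self.rows.append(pkg)
        self.index.setdefault(pkg.name_normalized, []).append(pkg)

    def filter_name_normalized_exact(self, query: str) -> list[Package]:
        # Only the query value is normalized at read time; stored rows are
        # found by exact key lookup, with no per-row regex work.
        return self.index.get(normalize(query), [])


table = PackageTable()
for raw in ["Zope.Interface", "zope_interface", "requests"]:
    table.save(Package(raw))
matches = [p.name for p in table.filter_name_normalized_exact("ZOPE-INTERFACE")]
```

Whether the real lookups normalize the query value in the same way is an assumption of this sketch.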
📜 Checklist
See: Pull Request Walkthrough