Merge pull request #46 from NCAR/hua-work-dbms
Pull request overview
Updates the pgchksum workflow for web-file checksum evaluation and publishes a new package version.
Changes:
- Skip missing `wfile` records when expanding `wid`/`dsid` results into full `wfile` records, and adjust the effective count.
- Switch the object-store bucket used for web-file object checks from `rda-data` to `gdex-data`.
- Bump package version to `2.0.7`.
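The first change can be sketched as follows. This is a minimal illustration, not the code from `pgchksum.py`: `expand_wfile_records` and `fetch_wfile` are hypothetical names, the dataset IDs are made up, and the column-oriented dict layout is assumed from the `pgrecs['wid']` usage quoted later in this review.

```python
def expand_wfile_records(pgrecs, fetch_wfile):
    """Expand wid/dsid pairs into full records, skipping missing ones.

    pgrecs is assumed to be a column-oriented dict such as
    {'wid': [...], 'dsid': [...]}; fetch_wfile(wid, dsid) is a hypothetical
    lookup that returns a dict for one record, or None when it is missing.
    """
    records = []
    for wid, dsid in zip(pgrecs['wid'], pgrecs['dsid']):
        rec = fetch_wfile(wid, dsid)
        if rec is None:
            continue  # missing record: skip it rather than keep a hole
        records.append(rec)
    # The effective count is the number of records actually found.
    return records, len(records)


# Usage with an in-memory stand-in for the database (illustrative data).
fake_db = {
    (1, 'd000001'): {'wid': 1, 'wfile': 'a.nc'},
    (3, 'd000002'): {'wid': 3, 'wfile': 'c.nc'},
}
pgrecs = {'wid': [1, 2, 3], 'dsid': ['d000001', 'd000001', 'd000002']}
records, count = expand_wfile_records(pgrecs, lambda w, d: fake_db.get((w, d)))
print(count)  # 2
```

Returning the compacted list together with its length keeps the count and the records from drifting apart when some lookups fail.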
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/rda_python_dbms/pgchksum.py | Filters out missing per-wid fetches and changes the object-store bucket used for web object checks. |
| pyproject.toml | Increments project version. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Pull request overview
This PR updates pgchksum’s object-store targeting to use new bucket names and reduces hardcoded bucket usage during checksum evaluation, alongside a package version bump.
Changes:
- Switch web/saved object store buckets from `rda-*` to `gdex-*`.
- Pass the bucket via `self.PVALS['BUCKET']` into object-store checksum evaluation instead of hardcoding.
- Make multi-dataset webfile collection skip missing `pgget_wfile()` results and compact the resulting record set.
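A rough sketch of the second change, reading the bucket from configuration instead of a literal. Only the `PVALS['BUCKET']` key comes from this review; the `ChecksumRunner` class and its `object_path` method are hypothetical, and the real object-store call is not shown.

```python
class ChecksumRunner:
    """Hypothetical runner; only the PVALS['BUCKET'] key is from the review."""

    def __init__(self, bucket):
        self.PVALS = {'BUCKET': bucket}

    def object_path(self, key):
        # Read the bucket from configuration rather than a literal such as
        # 'rda-data', so a rename only touches the place where PVALS is set.
        return f"{self.PVALS['BUCKET']}/{key}"


runner = ChecksumRunner('gdex-data')
print(runner.object_path('d000001/a.nc'))  # gdex-data/d000001/a.nc
```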
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/rda_python_dbms/pgchksum.py | Updates bucket names, removes hardcoded bucket args during evaluation, and filters missing webfile records when aggregating. |
| pyproject.toml | Bumps project version to 2.0.7. |
Comments suppressed due to low confidence (2)
src/rda_python_dbms/pgchksum.py:185
- When running with the `-f` option (`self.PGSUM['f']` is non-empty), this method returns `cnt` but never updates `self.PGSUM['c']`. Since `start_actions()` uses `fcnt = self.PGSUM['c']` to drive the evaluation loop, the default `0` will cause no files to be checked even though `cnt > 0`. Set `self.PGSUM['c'] = cnt` (or return the records and have `start_actions()` use the returned count) in the `-f` branch so the requested files actually get processed.
    if self.PGSUM['f']:
        pgrecs = self.pgmhget('wfile', 'wid, dsid', {'wfile' : self.PGSUM['f']})
        cnt = len(pgrecs['wid']) if pgrecs else 0
        if cnt > 0:
            self.PGSUM['f'] = {}
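The suggested fix can be illustrated with a reduced model of the `-f`/`-c` interaction. This is a sketch under assumptions, not the project's code: the `Checker` class and `get_file_list()` are hypothetical, and only the `PGSUM['f']`/`PGSUM['c']` keys and the `start_actions()` name come from the review.

```python
class Checker:
    """Hypothetical reduction of the class to the -f/-c interaction."""

    def __init__(self, files=None, count=0):
        self.PGSUM = {'f': files or [], 'c': count}
        self.checked = []

    def get_file_list(self):
        cnt = len(self.PGSUM['f'])
        if cnt > 0:
            # Proposed fix: keep the loop bound in sync with the -f matches.
            self.PGSUM['c'] = cnt
        return cnt

    def start_actions(self):
        self.get_file_list()
        fcnt = self.PGSUM['c']  # loop bound, as described in the review
        for i in range(fcnt):
            self.checked.append(self.PGSUM['f'][i])
        return len(self.checked)


checker = Checker(files=['a.nc', 'b.nc'])
print(checker.start_actions())  # 2
```

Without the assignment in `get_file_list()`, `fcnt` would stay at its default of `0` and the loop body would never run.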
src/rda_python_dbms/pgchksum.py:240
- Same issue as `get_checksum_wfilelist()`: in the `-f` path (`cnt > 0` from `self.PGSUM['f']`), the function returns the number of matched records but does not update `self.PGSUM['c']`. `start_actions()` then iterates `range(self.PGSUM['c'])` and will skip evaluating the explicitly provided files when `-c` is not also specified. Update `self.PGSUM['c']` to the returned count in this branch.
    fcnt = self.PGSUM['c']
    cnt = len(self.PGSUM['f'])
    if cnt > 0:
        self.PGSUM['f'] = self.pgmhget('sfile', flds, {'sfile' : self.PGSUM['f']})
        return (len(self.PGSUM['f']['sid']) if self.PGSUM['f'] else 0)
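The alternative mentioned in the first suppressed comment, having the caller use the returned count, might look like the following for the saved-file path. All function names here are hypothetical, `fake_lookup` stands in for the database query, and the file names and IDs are illustrative; the column-oriented result with a `sid` key is assumed from the snippet above.

```python
def get_sfile_records(requested, lookup):
    """Return (records, count) for explicitly requested files.

    lookup is a hypothetical stand-in for the database query; it returns a
    column-oriented dict such as {'sid': [...], 'sfile': [...]} or None.
    """
    pgrecs = lookup(requested)
    count = len(pgrecs['sid']) if pgrecs else 0
    return pgrecs, count


def start_actions(requested, lookup):
    # Use the returned count as the loop bound, so no shared counter
    # has to be updated for the requested files to be processed.
    pgrecs, fcnt = get_sfile_records(requested, lookup)
    return [pgrecs['sfile'][i] for i in range(fcnt)]


def fake_lookup(names):
    # Illustrative data only.
    known = {'a.tar': 10, 'b.tar': 11}
    hits = [n for n in names if n in known]
    if not hits:
        return None
    return {'sid': [known[n] for n in hits], 'sfile': hits}


print(start_actions(['a.tar', 'missing.tar'], fake_lookup))  # ['a.tar']
```

Passing the count back explicitly avoids relying on shared state in `self.PGSUM`, at the cost of changing the function's return shape.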