Complete Rewrite of varsome-api-client-python#29
Conversation
BREAKING CHANGE: models/elements/* consolidated into models/annotation.py; CLI moved from scripts/ to varsome_api/cli/ package; client and VCF interfaces rewritten. - Migrate build tooling from setup.py/tox.ini to Poetry - Rewrite client.py with improved async/sync support via _sync.py - Rewrite vcf.py with expanded VCF annotation handling - Consolidate all fragmented models/elements/* into models/annotation.py - Add models/slim/ for lightweight annotation models - Restructure CLI into varsome_api/cli/ package (varsome_api_run, varsome_api_annotate_vcf, utils) - Expand exception hierarchy in exceptions.py - Add GitHub Actions workflows for tests and Docker release - Add Dockerfile, .dockerignore, and Docker documentation - Add parquet annotation example under examples/parquet/ - Expand test suite: test_cli, test_models, test_vcf, test_sync, test_exception with fixtures - Rewrite README and add docs/developer-guide.md, docs/docker.md - Add .editorconfig, .flake8, updated .pre-commit-config.yaml
16a6afb to
faa321c
Compare
Python Test Results216 tests 209 ✅ 2s ⏱️ For more details on these failures, see this check. Results for commit faa321c. |
Python Test Results216 tests 216 ✅ 1s ⏱️ Results for commit 6dbad8a. ♻️ This comment has been updated with latest results. |
6dbad8a to
15054e9
Compare
Python Test Results (3.13)216 tests 216 ✅ 1s ⏱️ Results for commit 960e38b. ♻️ This comment has been updated with latest results. |
Python Test Results (3.11)216 tests 216 ✅ 2s ⏱️ Results for commit 960e38b. ♻️ This comment has been updated with latest results. |
Python Test Results (3.14)216 tests 216 ✅ 1s ⏱️ Results for commit 960e38b. ♻️ This comment has been updated with latest results. |
Python Test Results (3.12)216 tests 216 ✅ 2s ⏱️ Results for commit 960e38b. ♻️ This comment has been updated with latest results. |
15054e9 to
6b1b6a1
Compare
… types Refactor the VarSomeAPIClient to handle queries for variants, genes, and CNVs. Additionally, the CLI tool and developer guide have been updated to reflect these changes
There was a problem hiding this comment.
Pull request overview
This PR is a full modernization/rewrite of the varsome_api Python client to support newer Python versions (3.11–3.14), migrate packaging/tooling to Poetry + GitHub Actions + Docker, and redesign the client/VCF/CLI layers around async-first I/O with sync wrappers.
Changes:
- Rewrote the HTTP client to use
aiohttp, added async batch streaming with a sync wrapper (_sync.py). - Replaced legacy
jsonmodelselement models with Pydantic v2 models (generatedmodels/annotation.py+ slim models for performance). - Rebuilt VCF annotation to use
pysam(optional[vcf]extra) and moved CLI tools intovarsome_api/cli/with Poetry entry points.
Reviewed changes
Copilot reviewed 67 out of 75 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| version.py | Removed legacy version constant file. |
| varsome_api/vcf.py | Rewritten VCF annotator built on pysam + async batching. |
| varsome_api/models/variant.py | Replaced old jsonmodels variant with Pydantic-based model + convenience mixin. |
| varsome_api/models/slim/annotation.py | Added slim Pydantic model for VCF pipelines. |
| varsome_api/models/slim/init.py | Exported slim model symbols. |
| varsome_api/models/fields.py | Removed jsonmodels custom fields. |
| varsome_api/models/elements/wustl.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/uniprot.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/transcript.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/thousand_genomes.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/sanger.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/ncbi.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/isb.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/icgc.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/iarc.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/gwas.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/gnomad.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/gerp.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/dbnsfp.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/dann.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/broad.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/acmg.py | Removed legacy jsonmodels element definitions. |
| varsome_api/models/elements/init.py | Removed legacy wildcard element exports. |
| varsome_api/models/init.py | Added explicit exports for new models. |
| varsome_api/log.py | Introduced library logger with NullHandler. |
| varsome_api/exceptions.py | Added VarSomeAPIException with HTTP status mapping. |
| varsome_api/constants.py | Centralized literal types/defaults for ref genome + query types. |
| varsome_api/client.py | Rewritten API client using aiohttp, async batch generator, and sync wrappers. |
| varsome_api/cli/varsome_api_run.py | New CLI entry point for lookup (variants/genes/CNVs). |
| varsome_api/cli/varsome_api_annotate_vcf.py | New CLI entry point for VCF annotation. |
| varsome_api/cli/utils.py | Shared CLI parsing/validation/output helpers. |
| varsome_api/cli/init.py | Added CLI package marker. |
| varsome_api/_sync.py | Added run_sync + sync_wrapper utilities. |
| varsome_api/init.py | Added package __version__. |
| tox.ini | Removed legacy tox config. |
| tests/test_vcf.py | Added VCF annotator test suite. |
| tests/test_sync.py | Added tests for sync-wrapper utilities. |
| tests/test_models.py | Added tests for full vs slim model deserialization + mixin properties. |
| tests/test_exception.py | Added tests for VarSomeAPIException. |
| tests/test_cli.py | Added CLI utilities + parser validation tests. |
| tests/fixtures/variants.vcf | Added VCF fixture for tests. |
| tests/conftest.py | Added shared fixtures/mocks for request stubbing. |
| setup.py | Removed legacy setuptools packaging. |
| scripts/varsome_api_run.py | Removed legacy scripts-based CLI. |
| scripts/varsome_api_annotate_vcf.py | Removed legacy scripts-based CLI. |
| pyproject.toml | Migrated to Poetry packaging, deps, extras, pytest + coverage config. |
| MANIFEST.in | Removed legacy manifest. |
| examples/README.md | Added examples index doc. |
| examples/parquet/writer.py | Added Parquet writer example support code. |
| examples/parquet/requirements.txt | Added example-local requirements. |
| examples/parquet/models.py | Added Parquet schema + row conversion. |
| examples/parquet/annotate_to_parquet.py | Added runnable CSV→Parquet annotation example. |
| docs/docker.md | Added Docker usage guide. |
| docs/developer-guide.md | Added developer guide for client/VCF/models usage. |
| Dockerfile | Added multi-stage Docker build with [vcf] extra. |
| CHANGELOG.md | Updated changelog for v1 beta releases. |
| .pre-commit-config.yaml | Updated pre-commit hooks; added Poetry lock/export hooks. |
| .gitignore | Replaced with expanded Python gitignore template. |
| .github/workflows/python-tests.yml | Added CI matrix for Python 3.11–3.14 with Poetry installs. |
| .github/workflows/docker-release.yml | Added GHCR multi-arch docker publish on release. |
| .flake8 | Added flake8 config. |
| .editorconfig | Added EditorConfig defaults. |
| .dockerignore | Added docker ignore rules (excludes tests/docs/examples/etc.). |
| .cz.toml | Updated commitizen versioning to new version sources. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…nt.py Updated the error handling logic to check for response status codes greater than or equal to 400
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 67 out of 75 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| path = kwargs.get("path", "<unknown path>") | ||
| method = kwargs.get("method", "GET") | ||
| json_body = kwargs.get("json") | ||
| if method == "POST" and isinstance(json_body, dict): | ||
| variant_count = len(json_body.get("variants", [])) | ||
| else: | ||
| variant_count = 1 |
There was a problem hiding this comment.
_log_request_time only counts batch size from the "variants" JSON key. For gene batches (payload key "genes") this will log 0 variant(s) and make request timing logs misleading. Consider detecting the batch list generically (e.g., first list value in the JSON body, or handle both variants/genes).
What Changed
models/elements/*→ unifiedmodels/annotation.py_sync.pyscripts/→varsome_api/cli/package (3 entry points)Breaking Changes⚠️
models.elements.*→models.annotationvarsome_api_run,varsome_api_annotate_vcfinstead of their .py equivalents in the previous versionsRelated to #28