feat: add suggest command and harden citation resolution #41

Open
HzaCode wants to merge 1 commit into main from feat/suggest-cli-and-citation-fixes

@HzaCode HzaCode commented Jun 14, 2026

Owner

Summary

  • Add the candidate-only `suggest` command: it returns candidate matches for
    plain-text references without producing authoritative BibTeX. `process` now
    resolves only strong identifiers (DOI/PMID/arXiv/ISBN/URL) and points plain
    text to `suggest`.
  • Add stable JSON/NDJSON envelopes for `process`/`suggest`, plus `doctor`,
    `benchmark`, and `templates` commands; CrossRef polite-pool headers; reuse of
    input BibTeX citation keys.
  • Scope Google Scholar to `suggest` only, as an opt-in, best-effort fallback
    (off by default, may be CAPTCHA-blocked, not guaranteed reproducible).
    Removed `--google-scholar` from `process` and the `use_google_scholar`
    parameter from `process_references()` (both were no-ops there).
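
For reviewers unfamiliar with the polite-pool item: Crossref gives better-behaved service to clients that identify themselves with a contact address, typically through a `mailto:` in the `User-Agent` header. The following is only a minimal sketch of what such headers could look like. The helper name `polite_headers`, its signature, the version string, and the email address are all assumptions for illustration, not the code in this PR.

```python
def polite_headers(tool: str, version: str, mailto: str) -> dict[str, str]:
    """Build HTTP headers that identify the client to Crossref.

    Hypothetical helper: name, signature, and header layout are assumptions.
    """
    if "@" not in mailto:
        # A contact address is the whole point of the polite pool.
        raise ValueError("mailto must be a contact email address")
    return {
        # Tool name, version, and a contact address in one User-Agent string.
        "User-Agent": f"{tool}/{version} (mailto:{mailto})",
        "Accept": "application/json",
    }


# Placeholder values; a real client would read the address from configuration.
headers = polite_headers("onecite", "0.0.0", "maintainer@example.org")
```

The resulting dictionary can be passed as the `headers` argument of whichever HTTP client issues the CrossRef requests.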

Fixes

  • DOI-backed BibTeX keeps canonical CrossRef/DataCite fields instead of the
    original entry's values (original still fills gaps; citation key preserved).
  • CrossRef 404 now always falls back to DataCite (was limited to ~11 hardcoded
    prefixes).
  • `suggest` thesis routing uses whole-word matching (no longer fires on
    "synthesis"/"hypothesis"/"parenthesis").
  • GitHub clone URLs ending in `.git` resolve to the correct repository.
  • Plain-text entry ids stay contiguous across multi-blank-line gaps; removed a
    dead PLOS article-id branch in the text parser.
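
To illustrate the whole-word matching fix, here is a small sketch of the difference between a substring test and a word-boundary regex. The function name `looks_like_thesis` and the exact pattern are assumptions; the real routing code may use other keywords or a different mechanism.

```python
import re

# Assumed pattern: "thesis" as a standalone word, case-insensitive.
# \b asserts a word boundary, so "synthesis" and "hypothesis" do not match.
THESIS_WORD = re.compile(r"\bthesis\b", re.IGNORECASE)


def looks_like_thesis(reference: str) -> bool:
    """Return True if the reference mentions 'thesis' as a whole word."""
    return THESIS_WORD.search(reference) is not None


samples = [
    "Doe, J. PhD thesis, MIT, 2020",
    "Total synthesis of a natural product",
    "A hypothesis on citation drift",
]
for text in samples:
    # A plain substring check ("thesis" in text.lower()) would flag all three.
    print(text, "->", looks_like_thesis(text))
```

Only the first sample is routed as a thesis under this pattern; the other two contain "thesis" only as part of a longer word.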

Test plan

  • pytest: 191 passed, 1 skipped, 6 deselected (live-marked).
  • `flake8 onecite tests`: clean.
  • `onecite process --help` no longer shows `--google-scholar`;
    `onecite suggest --help` still does.

@HzaCode HzaCode force-pushed the feat/suggest-cli-and-citation-fixes branch 2 times, most recently from 732ce67 to 1a4ac0e on June 15, 2026 02:40
Add the candidate-only `suggest` command plus a batch of correctness
fixes, and scope Google Scholar to `suggest` only.

Features:
- `onecite suggest`: return candidate matches for plain-text references
  without producing authoritative BibTeX. `process` now resolves only
  strong identifiers (DOI/PMID/arXiv/ISBN/URL) and points plain text to
  `suggest`.
- Stable JSON/NDJSON envelopes for process/suggest, plus `doctor`,
  `benchmark`, and `templates` commands; CrossRef polite-pool headers;
  reuse of input BibTeX citation keys.

Fixes:
- DOI-backed BibTeX keeps canonical CrossRef/DataCite fields instead of
  the original entry's values (original still fills gaps; key preserved).
- CrossRef 404 always falls back to DataCite (was limited to ~11
  hardcoded prefixes).
- `suggest` thesis routing uses whole-word matching (no longer fires on
  "synthesis"/"hypothesis"/"parenthesis").
- GitHub clone URLs ending in `.git` resolve to the correct repository.
- Plain-text entry ids stay contiguous; removed a dead PLOS branch.
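
The field precedence described in the first fix (canonical values win, the original entry fills gaps, the citation key is preserved) can be sketched as a dictionary merge. This is an illustration only: `merge_doi_entry` is a hypothetical name, and storing the citation key under `"ID"` is an assumption about the entry layout, not a statement about onecite's internals.

```python
KEY_FIELD = "ID"  # assumed name of the citation-key field


def merge_doi_entry(original: dict, canonical: dict) -> dict:
    """Merge a user-supplied BibTeX entry with resolved DOI metadata."""
    # Start from the non-empty original fields so they can fill gaps.
    merged = {field: value for field, value in original.items() if value}
    # Non-empty canonical fields overwrite the original on conflict.
    merged.update({field: value for field, value in canonical.items() if value})
    # The user's citation key always survives, so existing \cite{} calls keep working.
    if original.get(KEY_FIELD):
        merged[KEY_FIELD] = original[KEY_FIELD]
    return merged


original = {"ID": "smith2020", "title": "Old titl", "note": "keep me"}
canonical = {"ID": "Smith_2020", "title": "Correct Title", "year": "2020"}
merged = merge_doi_entry(original, canonical)
```

Here `merged` keeps the key `smith2020` and the `note` field from the original, while `title` and `year` come from the canonical record.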

Changes:
- Google Scholar is now an opt-in, best-effort fallback on `suggest`
  only; removed `--google-scholar` from `process` and the
  `use_google_scholar` parameter from `process_references()` (both were
  no-ops there). Documented as not guaranteed reproducible.

Tests: 191 passed, 1 skipped; flake8 clean on onecite + tests.
@HzaCode HzaCode force-pushed the feat/suggest-cli-and-citation-fixes branch from 1a4ac0e to cbea632 on June 15, 2026 02:55