Fix project resolution in admin ingest and improve TF operator transparency #50

Merged
clincoln8 merged 17 commits into datacommonsorg:main from clincoln8:uvx
May 18, 2026

Contributor

clincoln8 commented May 16, 2026

This PR addresses a bug where the admin ingest start command would infer the target GCP project from the user's ambient gcloud context instead of using the project explicitly configured in the Terraform state. It also removes the hardcoded region (us-central1) limitation by reading the region from Terraform outputs.

It also improves operator transparency and observability by parsing Cloud Run "operations" to provide direct, clickable links to the job in the GCP Console.
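
As a rough illustration of the link construction described above, the sketch below derives a Console URL from an operation name. The operation name pattern and the Console URL layout are assumptions based on common Cloud Run resource naming, not taken from this PR's diff, and `job_console_url` is a hypothetical helper rather than a function in datacommons-admin.

```python
import re
from urllib.parse import quote

# Assumed shape of a Cloud Run long-running operation name:
#   projects/{project}/locations/{region}/operations/{operation_id}
_OPERATION_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<region>[^/]+)"
    r"/operations/(?P<operation>[^/]+)$"
)


def job_console_url(operation_name: str, job_name: str) -> str:
    """Build a Console link for a job from an operation name (hypothetical helper).

    The operation name carries the project and region but not the job,
    so the job name has to be passed in separately.
    """
    match = _OPERATION_RE.match(operation_name)
    if match is None:
        raise ValueError(f"Unrecognized operation name: {operation_name!r}")
    project = quote(match.group("project"), safe="")
    region = quote(match.group("region"), safe="")
    # Assumed Console URL layout for a Cloud Run job details page.
    return (
        "https://console.cloud.google.com/run/jobs/details/"
        f"{region}/{quote(job_name, safe='')}?project={project}"
    )
```

Because the project and region are read from the operation name itself, the link points at wherever the job was actually started rather than at the caller's ambient gcloud configuration.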

Finally, it resolves SSL/impersonation issues encountered when running the CLI tool in isolated environments (like uvx on macOS) by adding pyopenssl as a dependency.

Key Changes

CLI (datacommons-admin)

  • Project & Region Resolution Fix: ingest start now reads project_id and region from Terraform outputs instead of falling back to google.auth.default() or hardcoded defaults.
  • Improved Operator Transparency: Added support for parsing operations paths returned by the Cloud Run API. This allows the CLI to output direct, clickable links to the job in the GCP Console even for long-running async operations.
  • Templates: Updated the generated main.tf and terraform.tfvars templates to make the web service image configurable (defaulting to stable) and to expose project_id, cdc_service_name, and region outputs.
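
The resolution step in the first bullet can be sketched roughly as follows, assuming the CLI shells out to `terraform output -json` in the directory holding the generated main.tf. The function names and error messages here are hypothetical and may differ from what the PR actually implements; only the output names `project_id` and `region` come from the description above.

```python
import json
import subprocess

# Output names exposed by the Terraform modules, per the PR description.
REQUIRED_OUTPUTS = ("project_id", "region")


def parse_terraform_outputs(raw_json: str) -> dict[str, str]:
    """Extract the required values from `terraform output -json` text.

    Each output in that JSON is an object with a "value" key
    (alongside "type" and "sensitive").
    """
    outputs = json.loads(raw_json)
    resolved = {}
    for name in REQUIRED_OUTPUTS:
        entry = outputs.get(name)
        value = entry.get("value") if isinstance(entry, dict) else None
        if not value:
            # Fail loudly instead of falling back to ambient credentials
            # or a hardcoded region.
            raise ValueError(f"Terraform output {name!r} is missing or empty")
        resolved[name] = str(value)
    return resolved


def read_terraform_outputs(terraform_dir: str) -> dict[str, str]:
    """Run Terraform in `terraform_dir` and return project_id and region."""
    result = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=terraform_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_terraform_outputs(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the resolution logic testable without a Terraform installation.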

Terraform

  • Outputs: Exposed project_id, cdc_service_name, and region as outputs in the modules to support the CLI.

Dependencies

  • datacommons-admin: Added pyopenssl>=24.0.0 to resolve SSL validation warnings in isolated environments.
  • Root: Declared the root project as a meta-package to prevent accidental code bundling.

How to Test

You can test all these commands directly from this branch without installing them locally by using uvx.

1. Generate Scaffolding:

uvx --from "git+https://github.com/clincoln8/datacommons.git@uvx#subdirectory=packages/datacommons-cli" datacommons admin init

2. Initialize Database:

uvx --from "git+https://github.com/clincoln8/datacommons.git@uvx#subdirectory=packages/datacommons-cli" datacommons admin init-db

3. Start Ingestion Job:

uvx --from "git+https://github.com/clincoln8/datacommons.git@uvx#subdirectory=packages/datacommons-cli" datacommons admin ingest start

clincoln8 requested a review from dwnoble May 16, 2026 02:44
Comment thread packages/datacommons-admin/datacommons_admin/ingestion_job_client.py Outdated
clincoln8 requested a review from gmechali May 16, 2026 02:46
Contributor

gemini-code-assist (bot) left a comment

Code Review

This pull request enhances the infrastructure and CLI for the CDC data ingestion process. Key changes include exposing the GCP project ID and CDC service name via Terraform outputs, allowing the IngestionJobClient to resolve the project ID dynamically, and adding support for a configurable CDC web service image. The CLI now also provides more detailed console links by parsing operation names. Feedback suggests avoiding variable shadowing in the CLI logic and making the hardcoded GCP region configurable to improve flexibility.

Comment thread packages/datacommons-admin/datacommons_admin/ingest_cli.py
Comment thread packages/datacommons-admin/datacommons_admin/ingestion_job_client.py Outdated
@clincoln8
Contributor Author

/gemini review

Contributor

gemini-code-assist (bot) left a comment

Code Review

This pull request updates the infrastructure and CLI to support multi-region deployments by exposing and utilizing GCP project_id and region from Terraform outputs. Key changes include updating the IngestionJobClient to accept these parameters, enhancing the CLI's console link generation, and adding pyopenssl as a dependency. Feedback suggests removing the hardcoded us-central1 default in the IngestionJobClient and instead raising an error if the region is not provided, which would prevent accidental resource management in the wrong region.
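
A minimal sketch of the fail-fast behavior suggested in this review, assuming a small value object that validates its inputs at construction time. `JobTarget` is a hypothetical name used for illustration only; it is not the PR's IngestionJobClient, and its signature is not known from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobTarget:
    """Hypothetical holder for where a job runs; neither field has a default."""

    project_id: str
    region: str

    def __post_init__(self) -> None:
        for field_name in ("project_id", "region"):
            value = getattr(self, field_name)
            if not isinstance(value, str) or not value.strip():
                # Refuse to guess (e.g. us-central1) so a job is never
                # managed in an unintended region.
                raise ValueError(f"{field_name} must be provided explicitly")

    @property
    def parent(self) -> str:
        """Resource parent path in the usual projects/.../locations/... form."""
        return f"projects/{self.project_id}/locations/{self.region}"
```

With no default region, a missing Terraform output surfaces as an immediate error at construction time instead of a job silently created elsewhere.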

Comment thread packages/datacommons-admin/datacommons_admin/ingestion_job_client.py Outdated
Contributor

gmechali left a comment

Thanks Christie! Couple small questions but no objections

click.secho(job_url, fg="blue", underline=True)
click.secho("Execution Console Link: ", fg="cyan", bold=True, nl=False)
click.secho(exec_url, fg="blue", underline=True)
elif (
Contributor

When would it go through here?

Contributor Author

I briefly looked into this, and from my initial understanding the endpoint only returns operations, not executions.

I might submit this as is, then remove the previous block after some more testing and refactor the method a little.

Comment thread packages/datacommons-admin/datacommons_admin/ingestion_job_client.py Outdated
clincoln8 requested a review from gmechali May 18, 2026 15:56
clincoln8 added this pull request to the merge queue May 18, 2026
Merged via the queue into datacommonsorg:main with commit 0040172 May 18, 2026
3 checks passed