71 changes: 71 additions & 0 deletions R/bucket_access.R
@@ -0,0 +1,71 @@
# Demo: access a Google Cloud Storage bucket from R running in a
# Google Cloud Workstation.
#
# This example shows how to:
# 1. authenticate with Application Default Credentials (ADC),
# 2. confirm access to a bucket,
# 3. upload a small example file, and
# 4. download that file back into R.
#
# Before running this script, authenticate once from the workstation terminal:
# gcloud auth application-default login --no-browser
#
# That command starts a copy/paste authentication flow:
# 1. copy the long URL from the workstation terminal,
# 2. open it in a browser on your local machine,
# 3. sign in and approve access,
# 4. copy the resulting code back into the workstation terminal.
#
# Alternative interactive approach:
# You can use `credentials_user_oauth2(scopes = scope)` instead of
# `token_fetch()` if you prefer to initiate authentication from R.
#
# Packages:
# install.packages(c("googleCloudStorageR", "gargle", "readr"))

library(gargle)
library(googleCloudStorageR)
library(readr)

# Bucket can live in a different project as long as the user or service account
# has permission on that bucket.
bucket_name <- "josh-london"

# Full Cloud Platform scope is convenient for demos, but narrower scopes are
# preferable in production workflows when possible.
scope <- "https://www.googleapis.com/auth/cloud-platform"

# Fetch a token (via ADC, when configured) and use it for GCS requests.
token <- token_fetch(scopes = scope)
gcs_auth(token = token)

# Confirm access to the bucket.
gcs_list_objects(bucket = bucket_name)

# Set a default bucket so later calls can omit `bucket =`.
gcs_global_bucket(bucket_name)

# Write a small example file locally, then upload it to the bucket.
local_file <- tempfile(fileext = ".csv")
write_csv(mtcars, local_file)

gcs_upload(
  file = local_file,
  name = "testing/mtcars.csv"
)

# Verify that the object now exists in the bucket.
gcs_list_objects(bucket = bucket_name)

# Download the uploaded object.
downloaded_file <- tempfile(fileext = ".csv")
gcs_get_object(
  object_name = "testing/mtcars.csv",
  bucket = bucket_name,
  saveToDisk = downloaded_file,
  overwrite = TRUE
)

# Read the downloaded file back into R.
mtcars_from_bucket <- read_csv(downloaded_file, show_col_types = FALSE)
head(mtcars_from_bucket)
123 changes: 123 additions & 0 deletions bucket-access-demo.md
@@ -0,0 +1,123 @@
# GCS Bucket access demo from Google Cloud Workstation (RStudio or Positron)

This guide explains [R/bucket_access.R](R/bucket_access.R), a demonstration of
read and write access from a Google Cloud Workstation to a Google Cloud Storage bucket,
including a bucket that lives in a different GCP project.

## What the demo script does

The script is intentionally small. It demonstrates four steps:

1. authenticate from R using Application Default Credentials (ADC),
2. confirm that the authenticated identity can access the target bucket,
3. upload a small example CSV file to the bucket, and
4. download that file back into R.

This is a simple test to verify that a workstation can interact with a bucket.
You can use it as a starting point for your own workflows.

## What you need before running it

You should have:

- access to a Google Cloud Workstation,
- the ability to run `gcloud` from the workstation terminal,
- permission to access the target bucket, and
- the required R packages installed.

Install the R packages if needed:

```r
install.packages(c("googleCloudStorageR", "gargle", "readr"))
```
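
If you prefer to install only what is missing, the check below is a minimal sketch. The helper name `missing_packages()` is hypothetical, and the `gcloud` lookup only confirms that an executable is on the `PATH`, not that it is configured.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("googleCloudStorageR", "gargle", "readr")
to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)
}

# Sys.which() returns "" when the executable is not found on the PATH.
if (!nzchar(Sys.which("gcloud"))) {
  message("gcloud was not found on the PATH.")
}
```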

## One-time authentication setup

The recommended way to authenticate is with Application Default
Credentials (ADC). After a one-time login, the credentials persist on the
workstation. The `gcloud` CLI is installed by default
on all Google Cloud Workstations.

From the workstation terminal, run:

```bash
gcloud auth application-default login --no-browser
```

This starts a manual browser-based flow. In practice, the steps are:

1. copy the long authentication URL from the workstation terminal,
2. open that URL in a browser on your local machine,
3. sign in with the account that has bucket access,
4. approve the request,
5. copy the returned code, and
6. paste it back into the workstation terminal.

This creates Application Default Credentials that R can use
through `gargle::token_fetch()`.
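
To check from R whether the login produced a credentials file, a sketch like the following may help. It assumes the default Linux location that `gcloud` uses for ADC (`~/.config/gcloud/application_default_credentials.json`) and honours the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. The helper names `adc_file()` and `has_adc()` are hypothetical, and a file that exists can still hold expired or revoked credentials.

```r
# Hypothetical helper: path where ADC is expected to live.
adc_file <- function() {
  env_path <- Sys.getenv("GOOGLE_APPLICATION_CREDENTIALS", unset = "")
  if (nzchar(env_path)) {
    return(env_path)
  }
  # Assumption: default gcloud configuration directory on Linux.
  file.path(path.expand("~"), ".config", "gcloud",
            "application_default_credentials.json")
}

# Hypothetical helper: TRUE when a file exists at the expected path.
has_adc <- function(path = adc_file()) {
  file.exists(path)
}

if (!has_adc()) {
  message("No ADC file found at ", adc_file())
}
```

This only tests for the file's presence. `gargle::token_fetch()` remains the real check that the credentials are usable.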

## How to configure the script

Open [R/bucket_access.R](R/bucket_access.R) and update this value:

```r
bucket_name <- "your-bucket-name"
```

The bucket can live in a different project from the workstation as long as the authenticated identity has the necessary permissions on that bucket.

## How to run the demo

Run the script in R from the workstation.

The script will:

- fetch ADC credentials,
- authenticate with Google Cloud Storage,
- list objects in `bucket_name`,
- upload `mtcars` as `testing/mtcars.csv`,
- download the same object to a temporary local file, and
- read the downloaded CSV back into R.

A successful run is a good indication that both authentication and bucket permissions are working.
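
To go one step beyond "it ran", you could compare the downloaded table with the original. The sketch below is one possible check, not part of the demo script. `same_table()` is a hypothetical helper, it assumes an all-numeric table such as `mtcars`, and the local CSV write and read stands in for the real upload and download.

```r
# Hypothetical helper: TRUE when two data frames have the same column names,
# the same dimensions, and numerically equal values.
same_table <- function(original, downloaded) {
  original <- as.data.frame(original)
  downloaded <- as.data.frame(downloaded)
  if (!identical(names(original), names(downloaded))) return(FALSE)
  if (!identical(dim(original), dim(downloaded))) return(FALSE)
  isTRUE(all.equal(as.matrix(original), as.matrix(downloaded),
                   check.attributes = FALSE))
}

# Local stand-in for the bucket round trip: write a CSV, then read it back.
tmp <- tempfile(fileext = ".csv")
write.csv(mtcars, tmp, row.names = FALSE)
round_trip <- read.csv(tmp)

same_table(mtcars, round_trip)
```

In the demo script, the equivalent call would be `same_table(mtcars, mtcars_from_bucket)`. Row names are ignored because the CSV does not carry them.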

## Common permission requirements

Exact IAM roles vary, but the authenticated identity generally needs permission to:

- access the target bucket,
- create objects in the bucket to test upload, and
- get objects from the bucket to test download.

For this demo, bucket-level permissions matter more than project-level visibility. A user can successfully read from and write to a known bucket in another project without being able to list all buckets in that project.

## Typical failure points

### Authentication fails

If `token_fetch()` fails, the workstation may not have valid Application Default Credentials yet. Re-run:

```bash
gcloud auth application-default login --no-browser
```

### Bucket listing, upload, or download fails

If calls involving `bucket_name` fail, verify:

- the bucket name is correct,
- the bucket exists,
- your identity has read and/or write permissions, and
- any organization or project policy allows cross-project access.
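
For the first item, a quick local sanity check can catch typos before any request is made. The sketch below covers only the basic naming rules as I understand them for names without special cases (3 to 63 characters; lowercase letters, digits, dashes, underscores, and dots; starting and ending with a letter or digit). It is not a full validator, and `is_plausible_bucket_name()` is a hypothetical helper.

```r
# Hypothetical helper: basic shape check for a bucket name.
# Assumption: simplified rules only; dotted names and reserved prefixes
# have extra constraints that are not checked here.
is_plausible_bucket_name <- function(name) {
  is.character(name) && length(name) == 1 && !is.na(name) &&
    grepl("^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$", name, perl = TRUE)
}

is_plausible_bucket_name("your-bucket-name")
is_plausible_bucket_name("gs://your-bucket-name")
```

A name that passes this check can still be wrong or refer to a bucket that does not exist; the check only rules out malformed input.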

## Why the script writes a temporary file

`googleCloudStorageR::gcs_upload()` is clearest when you upload a file from disk. The script writes `mtcars` to a temporary CSV first, then uploads that file. This makes the demo easier to understand and mirrors many real workflows where outputs are written locally and then copied to cloud storage.

## Suggested next steps for real workflows

For production code, it is usually worth adapting this demo in three ways:

1. move `bucket_name` and object paths into a small config section,
2. add explicit error handling and messages around authentication and uploads, and
3. avoid broad scopes when a narrower scope will do.
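
The three adaptations above could be sketched as follows. This is an illustration under stated assumptions, not the demo's own code: the `config` list, the `DEMO_BUCKET` environment variable, and the `run_step()` helper are all hypothetical, and the narrower `devstorage.read_write` scope is one option that may or may not suit your workflow.

```r
# Hypothetical config section: one place for bucket, object path, and scope.
config <- list(
  bucket_name = Sys.getenv("DEMO_BUCKET", unset = "your-bucket-name"),
  object_name = "testing/mtcars.csv",
  # Assumption: read/write access to Cloud Storage is enough for the workflow.
  scope = "https://www.googleapis.com/auth/devstorage.read_write"
)

# Hypothetical helper: run one step and re-raise any error with the step name.
run_step <- function(step, fn) {
  tryCatch(
    fn(),
    error = function(e) {
      stop(sprintf("Step '%s' failed: %s", step, conditionMessage(e)),
           call. = FALSE)
    }
  )
}

# Example usage (commented out because it needs credentials and a bucket):
# run_step("upload", function() {
#   googleCloudStorageR::gcs_upload(
#     file = local_file,
#     name = config$object_name,
#     bucket = config$bucket_name
#   )
# })
```

Wrapping each call in a named step makes it clearer whether a failure came from authentication, upload, or download.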