docs(studios): add S3 versioning guidance for checkpoint storage costs #1447
Open
ejseqera wants to merge 3 commits into
Studios writes a checkpoint every five minutes to the same S3 key. When S3 versioning is enabled on the work bucket, each write creates a new object version rather than an overwrite, producing up to 96 non-current versions per day per active session.

Add a new subsection under "Studio session checkpoints" that:
- Explains the versioning interaction and its cost implications
- Recommends an S3 Lifecycle rule (`NoncurrentVersionExpiration`: 1 day) with a ready-to-use JSON policy block and `aws s3api` CLI command
- Provides a bulk-delete shell command for clearing existing accumulated non-current versions
- Clarifies that non-current versions are safe to delete, while the current version and checkpoint directories must not be removed

Changes applied to both platform-cloud and platform-enterprise docs.
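For readers without the diff open, the kind of lifecycle policy described above can be sketched as follows. This is a minimal illustration, not the exact block from the PR: the rule ID and function name are hypothetical, and the `.studios/checkpoints/` prefix assumes the work directory sits at the bucket root (a work directory under a key prefix would need that prefix prepended).

```python
import json


def noncurrent_expiry_policy(prefix: str = ".studios/checkpoints/", days: int = 1) -> dict:
    """Build an S3 lifecycle configuration that expires non-current object versions.

    Only versions that are no longer current are affected; the current
    version of each object under the prefix is left in place.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": "expire-noncurrent-studio-checkpoints",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # assumed prefix, see note above
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }


if __name__ == "__main__":
    # Print the policy so it can be redirected to a file such as lifecycle.json.
    print(json.dumps(noncurrent_expiry_policy(), indent=2))
```

The printed JSON could then be passed to `aws s3api put-bucket-lifecycle-configuration --bucket <bucket> --lifecycle-configuration file://lifecycle.json`. That call replaces the bucket's existing lifecycle configuration, so any rules already in place would need to be merged into the same file first.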
✅ Deploy Preview for seqera-docs ready!
justinegeffen approved these changes on May 21, 2026
robnewman requested changes on May 26, 2026
robnewman (Member) left a comment:
This should not be limited to S3.
| Checkpoints vary in size depending on libraries installed in your session environment. This can potentially result in many large files stored in the compute environment's pipeline work directory and saved to cloud storage. This storage will incur costs based on the cloud provider. Due to the architecture of Studios, you cannot delete any checkpoint files to save on storage costs. Deleting a Studio session's checkpoints will result in a corrupted Studio session that cannot be started nor recovered. | ||
| ::: | ||
|
|
||
| ### S3 versioning and checkpoint storage costs |
Member
This is a generic problem across any cloud provider that supports object versioning, which is all of them. Don't limit to just S3
|
|
||
| ### S3 versioning and checkpoint storage costs | ||
|
|
||
| If your compute environment work directory uses an S3 bucket with **versioning enabled**, checkpoint writes create a new S3 object version every five minutes rather than overwriting the previous one. For an active Studio session, this produces up to 96 new object versions per day per session. Over time, these non-current versions accumulate and can significantly increase storage costs. |
Member
Suggested change
| If your compute environment work directory uses an S3 bucket with **versioning enabled**, checkpoint writes create a new S3 object version every five minutes rather than overwriting the previous one. For an active Studio session, this produces up to 96 new object versions per day per session. Over time, these non-current versions accumulate and can significantly increase storage costs. | |
| If your compute environment work directory uses an object storage bucket with **versioning enabled**, checkpoint writes create a new object version rather than overwriting the previous one. For an active Studio session, this produces many object versions per session. Over time, these non-current versions accumulate and can significantly increase storage costs. |
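As a rough check on the figures in the original wording, the number of versions created per session follows directly from the checkpoint interval and how long the session stays active. The sketch below assumes the five-minute interval stated in the PR; the function name and the session lengths are illustrative only.

```python
def writes_per_session(active_hours: float, interval_minutes: float = 5) -> int:
    """Return how many checkpoint writes occur in a session of the given length."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    # Each write to the same key on a versioned bucket adds one object version.
    return int(active_hours * 60 // interval_minutes)


for hours in (8, 24):
    print(hours, writes_per_session(hours))
# prints:
# 8 96
# 24 288
```

At a five-minute interval, 96 versions corresponds to about eight hours of activity, while a session that stays active for a full 24 hours would produce 288.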
| If your compute environment work directory uses an S3 bucket with **versioning enabled**, checkpoint writes create a new S3 object version every five minutes rather than overwriting the previous one. For an active Studio session, this produces up to 96 new object versions per day per session. Over time, these non-current versions accumulate and can significantly increase storage costs. | ||
|
|
||
| :::warning | ||
| Only the latest version of each checkpoint file is read by Platform. However, non-current S3 object versions are not automatically removed and will continue to accrue storage costs until explicitly deleted or expired. |
Member
Suggested change
| Only the latest version of each checkpoint file is read by Platform. However, non-current S3 object versions are not automatically removed and will continue to accrue storage costs until explicitly deleted or expired. | |
| Only the latest version of each checkpoint file is read by Platform. However, non-current object versions are not automatically removed and will continue to accrue storage costs until explicitly deleted or expired. |
| Only the latest version of each checkpoint file is read by Platform. However, non-current S3 object versions are not automatically removed and will continue to accrue storage costs until explicitly deleted or expired. | ||
| ::: | ||
|
|
||
| **Recommended mitigation:** Apply an S3 Lifecycle rule to expire non-current object versions on the `.studios/checkpoints/` prefix. A one-day expiry retains the current version while removing intermediate five-minute writes. You can also delete existing accumulated non-current versions manually using your cloud provider's console or CLI. |
Member
Suggested change
| **Recommended mitigation:** Apply an S3 Lifecycle rule to expire non-current object versions on the `.studios/checkpoints/` prefix. A one-day expiry retains the current version while removing intermediate five-minute writes. You can also delete existing accumulated non-current versions manually using your cloud provider's console or CLI. | |
| **Recommended mitigation:** Apply lifecycle rules to expire non-current object versions on the `.studios/checkpoints/` prefix. A one-day expiry retains the current version while removing intermediate five-minute writes. You can also delete existing accumulated non-current versions manually using your cloud provider's console or CLI. |
| **Recommended mitigation:** Apply an S3 Lifecycle rule to expire non-current object versions on the `.studios/checkpoints/` prefix. A one-day expiry retains the current version while removing intermediate five-minute writes. You can also delete existing accumulated non-current versions manually using your cloud provider's console or CLI. | ||
|
|
||
| :::note | ||
| Non-current object versions (intermediate checkpoint writes) are safe to delete. Do **not** delete the current (latest) version of any checkpoint file or the checkpoint directory itself — doing so will corrupt the Studio session and it cannot be recovered. |
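To make the distinction in this note concrete, the sketch below filters a version listing so that only non-current versions are selected for deletion. It assumes input shaped like the output of `aws s3api list-object-versions` and produces payloads shaped like the `--delete` argument of `aws s3api delete-objects`. The function name, keys, and version IDs are hypothetical, and pagination and delete markers are not handled.

```python
from __future__ import annotations


def noncurrent_delete_batches(listing: dict, batch_size: int = 1000) -> list[dict]:
    """Build delete payloads that cover only non-current object versions."""
    targets = [
        {"Key": v["Key"], "VersionId": v["VersionId"]}
        for v in listing.get("Versions", [])
        # Never select the current version; a missing flag is treated as current.
        if not v.get("IsLatest", True)
    ]
    # delete-objects accepts at most 1000 keys per request, so split into batches.
    return [
        {"Objects": targets[i:i + batch_size], "Quiet": True}
        for i in range(0, len(targets), batch_size)
    ]


# Hypothetical listing for a single checkpoint key with three versions.
example = {
    "Versions": [
        {"Key": ".studios/checkpoints/example/ckpt", "VersionId": "v3", "IsLatest": True},
        {"Key": ".studios/checkpoints/example/ckpt", "VersionId": "v2", "IsLatest": False},
        {"Key": ".studios/checkpoints/example/ckpt", "VersionId": "v1", "IsLatest": False},
    ]
}
batches = noncurrent_delete_batches(example)
print([o["VersionId"] for o in batches[0]["Objects"]])
```

Selecting on `IsLatest` rather than on age or position in the listing keeps the current version out of every batch, which is the property the note depends on.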
Member
Won't this impact the "Start as new" functionality?
| Checkpoints vary in size depending on libraries installed in your session environment. This can potentially result in many large files stored in the compute environment's pipeline work directory and saved to cloud storage. This storage will incur costs based on the cloud provider. Due to the architecture of Studios, you cannot delete any checkpoint files to save on storage costs. Deleting a Studio session's checkpoints will result in a corrupted Studio session that cannot be started nor recovered. | ||
| ::: | ||
|
|
||
| ### S3 versioning and checkpoint storage costs |
Member
Same problems as the above Cloud docs.
Summary
- Adds a `### S3 versioning and checkpoint storage costs` subsection under the existing checkpoint section, explaining the interaction and providing actionable remediation steps.
- Applies the change to both the `platform-cloud` and `platform-enterprise` docs.

Test plan
🤖 Generated with Claude Code