Avoid workspace data object naming collisions #44

@rbdavid

Currently, the Nextflow code, and thus the KBase apps, write all output to hardcoded file names. If a user runs a cell multiple times, the results from previous runs of that cell are overwritten, whether or not the same parameters are used. I think this can be avoided, but it may require some reorganization of the EST nextflow repo. Will link with an issue on that repo.

In KBase, you can rename data objects in the data tab so that they aren't overwritten and can be used in later cells. I have not tested extensively whether this works as expected, but I have noticed that when a data object is renamed, the originating app's report loses its connection to the data object and errors when trying to show the report.

- [ ] Apply a data-provenance principle. Check for the presence of files with the to-be-used filenames. If they already exist, kill the app's run with an informative error.
- [ ] Add an input parameter to define a naming convention for files and data objects created during an app's run. Use the input as a string that represents the stem of the files/KBase data objects to be created.
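
As a rough illustration of the two tasks above, here is a minimal Python sketch, assuming the check runs on the app side before any output is written. The function names, the `OutputCollisionError` class, and the file suffixes in the usage note are all hypothetical; none of them come from the EST code base. It only covers files on disk. Checking for existing KBase data objects would need a separate lookup against the workspace, which is not shown here.

```python
from pathlib import Path


class OutputCollisionError(RuntimeError):
    """Raised when a planned output file already exists (hypothetical name)."""


def build_output_paths(out_dir, stem, suffixes):
    """Build output paths by prefixing each suffix with a user-supplied stem.

    `stem` stands in for the proposed input parameter; the suffixes are
    whatever fixed endings the app would otherwise hardcode.
    """
    # Reject empty stems and stems that could escape out_dir or break parsing.
    if not stem or any(c in stem for c in "/\\ "):
        raise ValueError(f"invalid output stem: {stem!r}")
    return [Path(out_dir) / f"{stem}{suffix}" for suffix in suffixes]


def check_no_collisions(paths):
    """Stop the run with an informative error if any planned output exists."""
    existing = [str(p) for p in paths if Path(p).exists()]
    if existing:
        raise OutputCollisionError(
            "Refusing to overwrite existing output(s): " + ", ".join(existing)
        )
```

A caller might run `check_no_collisions(build_output_paths("results", "run1", ["_ssn.xgmml", "_stats.tsv"]))` before launching the workflow, so that a repeated run with the same stem fails early instead of silently overwriting earlier results.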

Edit: Renaming data objects should be avoided; as far as I can tell, it breaks things. Instead, use the versioning capabilities to access past data object versions, or enable users to name their data objects uniquely so they can store multiple data objects easily without naming/versioning clashes.

Labels: enhancement (New feature or request)