More reproducibility #30

@msyriac

Say that I run:

bbpipe test.yml

where test.yml defines the stages stage1 and stage2 and uses the config file config.yml. Currently, it is not obvious to me that the information in test.yml and config.yml is fully saved for future reference. Also, the outputs of the two stages end up in the same directory by default, which makes it difficult to, e.g., sync only some of the expensive stages to a different compute system. Here's a proposal for a slight restructuring.

output_dir will be the root directory. Pipeline products will be written to sub-directories of output_dir named after the stages. log_dir no longer has to be specified; instead, the log is always written to $output_dir/run_{TIME}/log.txt, where TIME is some identifier for the time bbpipe was run. Similarly, test.yml and config.yml are copied into $output_dir/run_{TIME}/. This will save more information about the submission. Let me know what you think and I can submit a PR.
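To make the proposed layout concrete, here is a minimal sketch in Python of how the directories could be set up before the stages run. It is not bbpipe's actual code: the helper name `prepare_run_dir`, its arguments, and the use of a timestamp string for TIME are all assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path


def prepare_run_dir(output_dir, pipeline_file, config_file, stage_names, run_id=None):
    """Hypothetical helper: create the proposed layout and snapshot the inputs.

    Returns (run_dir, stage_dirs, log_file).
    """
    root = Path(output_dir)

    # TIME identifier. A wall-clock timestamp is assumed here; any unique
    # identifier for the submission would serve the same purpose.
    if run_id is None:
        run_id = time.strftime("%Y%m%d_%H%M%S")

    # $output_dir/run_{TIME}/ holds the log and the copied YAML files.
    # exist_ok=False makes a clash between two runs fail loudly instead of
    # silently overwriting an earlier record.
    run_dir = root / f"run_{run_id}"
    run_dir.mkdir(parents=True, exist_ok=False)

    # One sub-directory per stage, named after the stage, so that a single
    # stage's products can be synced on their own.
    stage_dirs = {}
    for name in stage_names:
        stage_dir = root / name
        stage_dir.mkdir(parents=True, exist_ok=True)
        stage_dirs[name] = stage_dir

    # Keep copies of the pipeline and config files exactly as submitted.
    for src in (pipeline_file, config_file):
        shutil.copy2(src, run_dir / Path(src).name)

    # The log always lives at a fixed location inside the run directory.
    log_file = run_dir / "log.txt"
    log_file.touch()

    return run_dir, stage_dirs, log_file
```

Under these assumptions, calling `prepare_run_dir("out", "test.yml", "config.yml", ["stage1", "stage2"])` would produce `out/stage1/`, `out/stage2/`, and `out/run_{TIME}/` containing `log.txt`, `test.yml`, and `config.yml`. Syncing only an expensive stage then reduces to copying its sub-directory.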
