diff --git a/docs/assays/metadata/testing/converted/4i.md b/docs/assays/metadata/testing/converted/4i.md
new file mode 100644
index 0000000..89db4bc
--- /dev/null
+++ b/docs/assays/metadata/testing/converted/4i.md
@@ -0,0 +1,39 @@
+---
+layout: page-triary
+---
+
+# 4i Metadata Attributes
+
+Fields that are collected for 4i data, available at ```dataset.metadata.```
+ 
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | |
+| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. Example: 10 | |
+| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | |
+| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field.
If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | |
+| number_of_antibodies | | The number of antibodies used in the assay. If no antibodies were utilized, enter 0. Example: 5 | |
+| number_of_biomarker_imaging_rounds | | The number of imaging rounds required to capture the tagged biomarkers. For CODEX, a biomarker imaging round includes steps such as (1) oligo application, (2) fluor application, and (3) washes. For Cell DIVE, it involves (1) the staining of a biomarker via secondary detection or direct conjugate, followed by (2) dye inactivation. Example: 3 | |
+| number_of_total_imaging_rounds | | The total number of imaging rounds performed using a microscope to collect either autofluorescence/background or stained signals, such as those used in histological analysis. Example: 5 | |
+| slide_id | | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | |
+| dataset_type | | The specific type of dataset being produced.
Example: RNAseq | ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| analyte_class | | The analyte class which is the target molecule that the assay is measuring. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". 
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. 
For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | |
+| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | |
+| antibodies_path | | The path to the antibodies.tsv file relative to the root directory of the upload structure. This path should start with "." and is typically formatted as "./extras/antibodies.tsv". Example: ./extras/antibodies.tsv | |
+| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | |
+| non_global_files | | Specifies a semicolon-separated list of non-global files that are to be included in the dataset. The file paths assume that the files are located in the "TOP/non-global/" directory. For instance, if the file is located at TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value for this field would be "./lab_processed/images/1-tissue-boundary.geojson". Once ingested, these files will be copied to their appropriate locations within the respective dataset directory tree. This field is intended for internal HuBMAP processing.
Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee Example: ./lab_processed/images/1-tissue-boundary.geojson | |
+| cell_boundary_marker_or_stain | | The name of the marker or stain used to identify all cell boundaries in the tissue. This name must exactly match the antibody-targeted molecule marker or non-antibody targeted molecule stain as found in the imaging data. For example, in the case of using the PhenoCycler, ensure the name corresponds to the value in the XPD output file. If multiple markers or stains are employed, list them in a comma-separated format. Example: Pan-Cytokeratin, E-Cadherin | |
+| nuclear_marker_or_stain | | The nuclear marker or stain used, which can be an antibody-targeted molecule present in or around the cell nucleus. For protein targets, use the protein or gene symbol that identifies the antibody target, ensuring it matches the antibody target from the panel used or custom panels. Preferably, if using a custom antibody marker, this symbol should be the HGNC symbol (https://www.genenames.org/). For non-protein targets, provide the stain name (e.g., DAPI) and, when applicable, include the associated staining kit and vendor. For the PhenoCycler, ensure the symbol matches the value found in the XPD output file. Example: DAPI | |
+| number_of_channels | | The number of fluorescent channels that are imaged during each cycle.
Example: 3 | |
+
+
diff --git a/docs/assays/metadata/testing/converted/comet.md b/docs/assays/metadata/testing/converted/comet.md
new file mode 100644
index 0000000..c6336b6
--- /dev/null
+++ b/docs/assays/metadata/testing/converted/comet.md
@@ -0,0 +1,39 @@
+---
+layout: page-triary
+---
+
+# COMET Metadata Attributes
+
+Fields that are collected for COMET data, available at ```dataset.metadata.```
+ 
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | |
+| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. Example: 10 | |
+| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | |
+| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field.
If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | |
+| number_of_antibodies | | The number of antibodies used in the assay. If no antibodies were utilized, enter 0. Example: 5 | |
+| number_of_biomarker_imaging_rounds | | The number of imaging rounds required to capture the tagged biomarkers. For CODEX, a biomarker imaging round includes steps such as (1) oligo application, (2) fluor application, and (3) washes. For Cell DIVE, it involves (1) the staining of a biomarker via secondary detection or direct conjugate, followed by (2) dye inactivation. Example: 3 | |
+| number_of_total_imaging_rounds | | The total number of imaging rounds performed using a microscope to collect either autofluorescence/background or stained signals, such as those used in histological analysis. Example: 5 | |
+| slide_id | | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | |
+| dataset_type | | The specific type of dataset being produced.
Example: RNAseq | ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```iCLAP``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```Virtual Histology``` ```DBiT-seq``` | +| analyte_class | | The analyte class which is the target molecule that the assay is measuring. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Lipid + metabolite + protein``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". 
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. 
For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | |
+| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | |
+| antibodies_path | | The path to the antibodies.tsv file relative to the root directory of the upload structure. This path should start with "." and is typically formatted as "./extras/antibodies.tsv". Example: ./extras/antibodies.tsv | |
+| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | |
+| non_global_files | | Specifies a semicolon-separated list of non-global files that are to be included in the dataset. The file paths assume that the files are located in the "TOP/non-global/" directory. For instance, if the file is located at TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value for this field would be "./lab_processed/images/1-tissue-boundary.geojson". Once ingested, these files will be copied to their appropriate locations within the respective dataset directory tree. This field is intended for internal HuBMAP processing.
Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee Example: ./lab_processed/images/1-tissue-boundary.geojson | |
+| cell_boundary_marker_or_stain | | The name of the marker or stain used to identify all cell boundaries in the tissue. This name must exactly match the antibody-targeted molecule marker or non-antibody targeted molecule stain as found in the imaging data. For example, in the case of using the PhenoCycler, ensure the name corresponds to the value in the XPD output file. If multiple markers or stains are employed, list them in a comma-separated format. Example: Pan-Cytokeratin, E-Cadherin | |
+| nuclear_marker_or_stain | | The nuclear marker or stain used, which can be an antibody-targeted molecule present in or around the cell nucleus. For protein targets, use the protein or gene symbol that identifies the antibody target, ensuring it matches the antibody target from the panel used or custom panels. Preferably, if using a custom antibody marker, this symbol should be the HGNC symbol (https://www.genenames.org/). For non-protein targets, provide the stain name (e.g., DAPI) and, when applicable, include the associated staining kit and vendor. For the PhenoCycler, ensure the symbol matches the value found in the XPD output file. Example: DAPI | |
+| number_of_channels | | The number of fluorescent channels that are imaged during each cycle.
Example: 3 | |
+
+
diff --git a/docs/assays/metadata/testing/converted/cosmx-proteomics.md b/docs/assays/metadata/testing/converted/cosmx-proteomics.md
new file mode 100644
index 0000000..2299517
--- /dev/null
+++ b/docs/assays/metadata/testing/converted/cosmx-proteomics.md
@@ -0,0 +1,45 @@
+---
+layout: page-triary
+---
+
+# CosMx Proteomics Metadata Attributes
+
+Fields that are collected for CosMx Proteomics data, available at ```dataset.metadata.```
+ 
+
+* indicates a required field
+
+| Attribute | Type | Description | Allowable Values |
+|------|------|-------------|-------------------|
+| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | |
+| dataset_type | | The specific type of dataset being produced. Example: RNAseq | ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` |
+| analyte_class | | The analyte class which is the target molecule that the assay is measuring.
Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. 
Example: 10 | |
+| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` |
+| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | |
+| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | |
+| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | |
+| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | |
+| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section.
If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | |
+| mapped_area_value | | The specific area covered or captured by the assay. For Visium, it is the area of the spots covered by tissue within the capture area, not the total possible capture area. For GeoMx, it refers to the area of the AOI being captured. For HiFi-Slide, it is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, it indicates the area of the FOV (also known as ROI) region being captured. For Xenium, it is the total area of the FOV regions (also known as ROI) being captured. For Stereo-seq, this value represents the number of beads. Example: 42.25 | |
+| mapped_area_unit | | The unit of measurement for the mapped area value. If the mapped area is not specified, this field may be left blank. Example: um^2 | ```um^2``` ```mm^2``` |
+| slide_id | | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | |
+| target_retrieval_incubation_temperature | | The incubation temperature required for target retrieval, which is typically 100 degrees Celsius for RNA assays and 80 degrees Celsius for protein assays. Example: 100 | |
+| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. Example: 15 | |
+| target_retrieval_incubation_time_unit | | The unit of measurement for the target retrieval incubation time value. If no incubation time is specified, this field may be left blank.
Example: minute | ```minute``` | +| probe_hybridization_time_value | | The duration for which the oligo-conjugated RNA or oligo-conjugated antibody probes were hybridized with the sample. Example: 30 | | +| probe_hybridization_time_unit | | The unit of measurement for the probe hybridization time value. If the hybridization time is not specified, this field may be left blank. Example: minute | ```hour``` ```minute``` | +| oligo_probe_panel | | The oligo probe panel used to target genes and/or proteins. If there is a core panel along with add-on modules, the core panel should be selected in this field. Any additional panels utilized should be documented in the "additional_panels_used.csv" file, which must be uploaded alongside the dataset. Example: 10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363 | ```NanoString Technologies; GeoMx Mouse Whole Transcriptome Atlas, 4 slides; PN GMX-RNA-NGS-MsWTA-4``` ```NanoString Technologies; CosMx Mouse Neuroscience Panel (RNA, 1000 Plex); PN CMX-M-NEUP-R``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit, 16 rxns; PN 1000420``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome 4 rxns x 4 BC; PN 1000475``` ```NanoString Technologies; CosMx Human 6K Discovery Panel (RNA, 6175 Plex); PN 121500041``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome, 4 rxns x 1 BC; PN 1000474``` ```10x Genomics; Xenium Human Multi-Tissue and Cancer Panel v1; PN 1000626``` ```NanoString Technologies; CosMx Human Universal Cell Characterization Panel (RNA, 1000 Plex); PN CMX-H-USCP-1KP-R``` ```10x Genomics; Xenium Custom Gene Expression Panel (up to 50 genes); PN 1000464``` ```NanoString Technologies; CosMx Hs Univ Cell (RNA, 1000 Plex); PN 121500002``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Large; PN 1000364``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit - Small; PN 
1000365``` ```10x Genomics; Xenium Mouse Multi-Tissue Atlassing Panel; PN 1000627``` ```10x Genomics; Xenium Custom Gene Expression Panel (51-100 genes); PN 1000561``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit, 64 rxns; PN 1000456``` ```NanoString Technologies; GeoMx Human Whole Transcriptome Atlas, 4 slides; PN GMX-RNA-NGS-HuWTA-4``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit v2.0 - Small; PN 1000667``` ```NanoString Technologies; CosMx Mouse Universal Cell Characterization Panel (RNA, 1000 Plex); PN CMX-M-USCP-1KP-R``` ```NanoString Technologies; CosMx Human Immuno-Oncology Panel (Protein, 64 Plex); PN CMX-H-IOP-64P-P``` ```Custom``` ```10x Genomics; Visium Human Transcriptome Probe Kit v2 - Small; PN 1000466``` ```10x Genomics; Xenium Human Colon Gene Expression Panel; PN 1000642``` ```10x Genomics; Chromium Next GEM Single Cell Fixed RNA Mouse Transcriptome Probe Kit, 64 rxns; PN 1000492``` ```NanoString Technologies; CosMx Mouse Neuroscience Panel (Protein, 64 Plex); PN CMX-M-Neuro-64P-P``` ```NanoString Technologies; CosMx Hs WTX RNA Panel Kit, 2 slides: PN 121500047``` ```10x Genomics; Xenium Human Lung Gene Expression Panel; PN 1000601``` ```10x Genomics; Xenium Prime 5K Human Pan Tissue & Pathways Panel; PN 1000724``` | +| is_custom_probes_used | | Indicates whether custom RNA or antibody probes were utilized in the assay. If custom probes were employed, they should be documented in the "custom_probe_set.csv" file. Example: No | | +| number_of_panel_targets | | The number of panel targets, which refers to the total count of genes, RNA isoforms, or RNA regions that are targeted by probes. Example: 1000 | | +| roi_label | | The label for the region of interest (ROI). For Resolve and CosMx, this corresponds to the field of view (FOV) label. In the case of Xenium, it refers to the ID of the region containing the analysis. 
For GeoMx, this information can be located in the "Initial Dataset" spreadsheet, which can be downloaded from within the Data Analysis Suite. Example: Decidua | | +| anatomical_structure_label | | The label for the overarching anatomical structure. If the anatomical structure is not applicable or not specified, this field may be left blank. Example: Kidney | | +| anatomical_structure_id | | The ontology ID associated with the anatomical structure, typically represented by an UBERON ID. Example: UBERON:0002113 | | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| non_global_files | | Specifies a semicolon-separated list of non-global files that are to be included in the dataset. The file paths assume that the files are located in the "TOP/non-global/" directory. For instance, if the file is located at TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value for this field would be "./lab_processed/images/1-tissue-boundary.geojson". Once ingested, these files will be copied to their appropriate locations within the respective dataset directory tree. This field is intended for internal HuBMAP processing. 
Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee Example: ./lab_processed/images/1-tissue-boundary.geojson | | + + diff --git a/docs/assays/metadata/testing/converted/cosmx-transcriptomics.md b/docs/assays/metadata/testing/converted/cosmx-transcriptomics.md new file mode 100644 index 0000000..ecbbc05 --- /dev/null +++ b/docs/assays/metadata/testing/converted/cosmx-transcriptomics.md @@ -0,0 +1,90 @@ +--- +layout: page-triary +--- + +# CosMx Transcriptomics Metadata Attributes + +Fields that are collected for CosMx Transcriptomics data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | +| dataset_type | | The specific type of dataset being produced. 
Example: RNAseq | ```Visium HD``` ```4i``` ```Illumina Spatial v0``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```MACSima``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```iCLAP``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` | +| analyte_class | | The analyte class, that is, the type of target molecule that the assay measures. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Lipid + metabolite + protein``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House".
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Miltenyi Biotec``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```LSM 710 Confocal Microscope``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```MACSima System``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive HF-X``` | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. 
Example: 10 | | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | | +| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | +| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. 
If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | +| mapped_area_value | | The mapped area value, which refers to the specific area covered or captured in various assays. For Visium, it is the area of spots covered by tissue within the captured area, excluding the total possible captured area. For GeoMx, it refers to the area of the AOI being captured. In HiFi, it is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, it indicates the area of the FOV (also known as ROI) region being captured. For Xenium, it is the total area of the FOV regions (also known as ROI) being captured. For Stereo-Seq, this value represents the number of beads. Example: 42.25 | | +| mapped_area_unit | | The unit of measurement for the mapped area value. If mapping area is not specified, this field may be left blank. Example: um^2 | ```um^2``` ```mm^2``` | +| slide_id | | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | | +| target_retrieval_incubation_temperature | | The incubation temperature required for target retrieval, which is typically 100 degrees Celsius for RNA assays and 80 degrees Celsius for protein assays. Example: 100 | | +| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. Example: 15 | | +| target_retrieval_incubation_time_unit | | The unit of measurement for the target retrieval incubation time value. If no incubation time is specified, this field may be left blank. 
Example: minute | ```minute``` | +| proteinasek_concentration | | The concentration of the enzyme Proteinase K within a sample, measured in micrograms per milliliter (ug/ml). Example: 10 | | +| proteinasek_incubation_time_value | | The duration for which a sample is incubated with Proteinase K. Example: 15 | | +| proteinasek_incubation_time_unit | | The unit of measurement for the Proteinase K incubation time value. If no incubation time is specified, this field may be left blank. Example: minute | ```minute``` | +| probe_hybridization_time_value | | The duration for which the oligo-conjugated RNA or oligo-conjugated antibody probes were hybridized with the sample. Example: 30 | | +| probe_hybridization_time_unit | | The unit of measurement for the probe hybridization time value. If the hybridization time is not specified, this field may be left blank. Example: minute | ```hour``` ```minute``` | +| oligo_probe_panel | | The oligo probe panel used to target genes and/or proteins. If there is a core panel along with add-on modules, the core panel should be selected in this field. Any additional panels utilized should be documented in the "additional_panels_used.csv" file, which must be uploaded alongside the dataset.
Example: 10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363 | ```NanoString Technologies; GeoMx Mouse Whole Transcriptome Atlas, 4 slides; PN GMX-RNA-NGS-MsWTA-4``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome 16 rxns x 16 BC; PN 1000547``` ```NanoString Technologies; CosMx Mouse Neuroscience Panel (RNA, 1000 Plex); PN CMX-M-NEUP-R``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit, 16 rxns; PN 1000420``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome 4 rxns x 4 BC; PN 1000475``` ```NanoString Technologies; CosMx Human 6K Discovery Panel (RNA, 6175 Plex); PN 121500041``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome, 4 rxns x 1 BC; PN 1000474``` ```10x Genomics; Xenium Human Multi-Tissue and Cancer Panel v1; PN 1000626``` ```NanoString Technologies; CosMx Human Universal Cell Characterization Panel (RNA, 1000 Plex); PN CMX-H-USCP-1KP-R``` ```10x Genomics; GEM-X Flex Human Transcriptome Probe Kit, 16 samples; PN 1000785``` ```10x Genomics; Xenium Custom Gene Expression Panel (up to 50 genes); PN 1000464``` ```NanoString Technologies; CosMx Hs Univ Cell (RNA, 1000 Plex); PN 121500002``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Large; PN 1000364``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit - Small; PN 1000365``` ```10x Genomics; Xenium Mouse Multi-Tissue Atlassing Panel; PN 1000627``` ```10x Genomics; Xenium Custom Gene Expression Panel (51-100 genes); PN 1000561``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit, 64 rxns; PN 1000456``` ```NanoString Technologies; GeoMx Human Whole Transcriptome Atlas, 4 slides; PN GMX-RNA-NGS-HuWTA-4``` ```NanoString Technologies; GeoMx Human IO Proteome Atlas, 4 slides; PN 121300160``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit v2.0 - Small; PN 1000667``` ```NanoString Technologies; 
CosMx Mouse Universal Cell Characterization Panel (RNA, 1000 Plex); PN CMX-M-USCP-1KP-R``` ```NanoString Technologies; CosMx Human Immuno-Oncology Panel (Protein, 64 Plex); PN CMX-H-IOP-64P-P``` ```Custom``` ```10x Genomics; Visium Human Transcriptome Probe Kit v2 - Small; PN 1000466``` ```10x Genomics; Xenium Human Colon Gene Expression Panel; PN 1000642``` ```10x Genomics; Chromium Next GEM Single Cell Fixed RNA Mouse Transcriptome Probe Kit, 64 rxns; PN 1000492``` ```NanoString Technologies; CosMx Mouse Neuroscience Panel (Protein, 64 Plex); PN CMX-M-Neuro-64P-P``` ```NanoString Technologies; CosMx Hs WTX RNA Panel Kit, 2 slides: PN 121500047``` ```10x Genomics; Xenium Human Lung Gene Expression Panel; PN 1000601``` ```10x Genomics; Xenium Prime 5K Human Pan Tissue & Pathways Panel; PN 1000724``` | +| is_custom_probes_used | | Indicates whether custom RNA or antibody probes were utilized in the assay. If custom probes were employed, they should be documented in the "custom_probe_set.csv" file. Example: No | | +| number_of_panel_targets | | The number of panel targets, which refers to the total count of genes, RNA isoforms, or RNA regions that are targeted by probes. Example: 1000 | | +| roi_label | | The label for the region of interest (ROI). For Resolve and CosMx, this corresponds to the field of view (FOV) label. In the case of Xenium, it refers to the ID of the region containing the analysis. For GeoMx, this information can be located in the "Initial Dataset" spreadsheet, which can be downloaded from within the Data Analysis Suite. Example: Decidua | | +| anatomical_structure_label | | The label for the overarching anatomical structure. If the anatomical structure is not applicable or not specified, this field may be left blank. Example: Kidney | | +| anatomical_structure_id | | The ontology ID associated with the anatomical structure, typically represented by an UBERON ID. 
Example: UBERON:0002113 | | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| non_global_files | | Specifies a semicolon-separated list of non-global files that are to be included in the dataset. The file paths assume that the files are located in the "TOP/non-global/" directory. For instance, if the file is located at TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value for this field would be "./lab_processed/images/1-tissue-boundary.geojson". Once ingested, these files will be copied to their appropriate locations within the respective dataset directory tree. This field is intended for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee Example: ./lab_processed/images/1-tissue-boundary.geojson | | + + + +  + +## Deprecated Attributes + +* indicates a field that was previously required + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | +| dataset_type | | The specific type of dataset being produced. 
Example: RNAseq | ```Visium HD``` ```4i``` ```Illumina Spatial v0``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```MACSima``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```iCLAP``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` | +| analyte_class | | The analyte class, that is, the type of target molecule that the assay measures. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Lipid + metabolite + protein``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House".
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Miltenyi Biotec``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```LSM 710 Confocal Microscope``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```MACSima System``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive HF-X``` | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. 
Example: 10 | | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | | +| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | +| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. 
If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | +| mapped_area_value | | The mapped area value, which refers to the specific area covered or captured in various assays. For Visium, it is the area of spots covered by tissue within the captured area, excluding the total possible captured area. For GeoMx, it refers to the area of the AOI being captured. In HiFi, it is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, it indicates the area of the FOV (also known as ROI) region being captured. For Xenium, it is the total area of the FOV regions (also known as ROI) being captured. For Stereo-Seq, this value represents the number of beads. Example: 42.25 | | +| mapped_area_unit | | The unit of measurement for the mapped area value. If mapping area is not specified, this field may be left blank. Example: um^2 | ```um^2``` ```mm^2``` | +| slide_id | | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | | +| target_retrieval_incubation_temperature | | The incubation temperature required for target retrieval, which is typically 100 degrees Celsius for RNA assays and 80 degrees Celsius for protein assays. Example: 100 | | +| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. Example: 15 | | +| target_retrieval_incubation_time_unit | | The unit of measurement for the target retrieval incubation time value. If no incubation time is specified, this field may be left blank. 
Example: minute | ```minute``` | +| proteinasek_concentration | | The concentration of the enzyme Proteinase K within a sample, measured in micrograms per milliliter (ug/ml). Example: 10 | | +| proteinasek_incubation_time_value | | The duration for which a sample is incubated with Proteinase K. Example: 15 | | +| proteinasek_incubation_time_unit | | The unit of measurement for the Proteinase K incubation time value. If no incubation time is specified, this field may be left blank. Example: minute | ```minute``` | +| probe_hybridization_time_value | | The duration for which the oligo-conjugated RNA or oligo-conjugated antibody probes were hybridized with the sample. Example: 30 | | +| probe_hybridization_time_unit | | The unit of measurement for the probe hybridization time value. If the hybridization time is not specified, this field may be left blank. Example: minute | ```hour``` ```minute``` | +| oligo_probe_panel | | The oligo probe panel used to target genes and/or proteins. If there is a core panel along with add-on modules, the core panel should be selected in this field. Any additional panels utilized should be documented in the "additional_panels_used.csv" file, which must be uploaded alongside the dataset.
Example: 10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363 | ```NanoString Technologies; GeoMx Mouse Whole Transcriptome Atlas, 4 slides; PN GMX-RNA-NGS-MsWTA-4``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome 16 rxns x 16 BC; PN 1000547``` ```NanoString Technologies; CosMx Mouse Neuroscience Panel (RNA, 1000 Plex); PN CMX-M-NEUP-R``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit, 16 rxns; PN 1000420``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome 4 rxns x 4 BC; PN 1000475``` ```NanoString Technologies; CosMx Human 6K Discovery Panel (RNA, 6175 Plex); PN 121500041``` ```10x Genomics; Chromium Fixed RNA Kit, Human Transcriptome, 4 rxns x 1 BC; PN 1000474``` ```10x Genomics; Xenium Human Multi-Tissue and Cancer Panel v1; PN 1000626``` ```NanoString Technologies; CosMx Human Universal Cell Characterization Panel (RNA, 1000 Plex); PN CMX-H-USCP-1KP-R``` ```10x Genomics; GEM-X Flex Human Transcriptome Probe Kit, 16 samples; PN 1000785``` ```10x Genomics; Xenium Custom Gene Expression Panel (up to 50 genes); PN 1000464``` ```NanoString Technologies; CosMx Hs Univ Cell (RNA, 1000 Plex); PN 121500002``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Large; PN 1000364``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit - Small; PN 1000365``` ```10x Genomics; Xenium Mouse Multi-Tissue Atlassing Panel; PN 1000627``` ```10x Genomics; Xenium Custom Gene Expression Panel (51-100 genes); PN 1000561``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit, 64 rxns; PN 1000456``` ```NanoString Technologies; GeoMx Human Whole Transcriptome Atlas, 4 slides; PN GMX-RNA-NGS-HuWTA-4``` ```NanoString Technologies; GeoMx Human IO Proteome Atlas, 4 slides; PN 121300160``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit v2.0 - Small; PN 1000667``` ```NanoString Technologies; 
CosMx Mouse Universal Cell Characterization Panel (RNA, 1000 Plex); PN CMX-M-USCP-1KP-R``` ```NanoString Technologies; CosMx Human Immuno-Oncology Panel (Protein, 64 Plex); PN CMX-H-IOP-64P-P``` ```Custom``` ```10x Genomics; Visium Human Transcriptome Probe Kit v2 - Small; PN 1000466``` ```10x Genomics; Xenium Human Colon Gene Expression Panel; PN 1000642``` ```10x Genomics; Chromium Next GEM Single Cell Fixed RNA Mouse Transcriptome Probe Kit, 64 rxns; PN 1000492``` ```NanoString Technologies; CosMx Mouse Neuroscience Panel (Protein, 64 Plex); PN CMX-M-Neuro-64P-P``` ```NanoString Technologies; CosMx Hs WTX RNA Panel Kit, 2 slides: PN 121500047``` ```10x Genomics; Xenium Human Lung Gene Expression Panel; PN 1000601``` ```10x Genomics; Xenium Prime 5K Human Pan Tissue & Pathways Panel; PN 1000724``` | +| is_custom_probes_used | | Indicates whether custom RNA or antibody probes were utilized in the assay. If custom probes were employed, they should be documented in the "custom_probe_set.csv" file. Example: No | | +| number_of_panel_targets | | The number of panel targets, which refers to the total count of genes, RNA isoforms, or RNA regions that are targeted by probes. Example: 1000 | | +| roi_label | | The label for the region of interest (ROI). For Resolve and CosMx, this corresponds to the field of view (FOV) label. In the case of Xenium, it refers to the ID of the region containing the analysis. For GeoMx, this information can be located in the "Initial Dataset" spreadsheet, which can be downloaded from within the Data Analysis Suite. Example: Decidua | | +| anatomical_structure_label | | The label for the overarching anatomical structure. If the anatomical structure is not applicable or not specified, this field may be left blank. Example: Kidney | | +| anatomical_structure_id | | The ontology ID associated with the anatomical structure, typically represented by an UBERON ID. 
Example: UBERON:0002113 | | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | + diff --git a/docs/assays/metadata/testing/converted/cycif.md b/docs/assays/metadata/testing/converted/cycif.md new file mode 100644 index 0000000..dc63687 --- /dev/null +++ b/docs/assays/metadata/testing/converted/cycif.md @@ -0,0 +1,36 @@ +--- +layout: page-triary +--- + +# CyCIF Metadata Attributes + +Fields that are collected for CyCIF data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal field that labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This field will not be accessible to anyone outside of the consortium and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| source_storage_duration_value | | How long the source material (parent) was stored prior to this sample being processed. | | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." 
whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "/TEST001-RK/" for this field. If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| number_of_antibodies | | The number of antibodies used in the assay. | | +| number_of_biomarker_imaging_rounds | | Number of imaging rounds to capture the tagged biomarkers. For CODEX, a biomarker imaging round consists of (1) oligo and fluor application, (2) imaging, and (3) removal of oligo and fluor along with washes. For Cell DIVE, a biomarker imaging round consists of (1) staining of a biomarker via secondary detection or direct conjugate and (2) dye inactivation. | | +| number_of_total_imaging_rounds | | The total number of acquisitions performed on the microscope to collect autofluorescence/background or stained signal (e.g., histology). | | +| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| dataset_type | | The specific type of dataset being produced. 
| ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Pixel-seq``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx``` ```MERFISH``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| 
source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and end with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| number_of_channels | | Number of fluorescent channels imaged during each cycle. 
| | + + diff --git a/docs/assays/metadata/testing/converted/cytof.md b/docs/assays/metadata/testing/converted/cytof.md new file mode 100644 index 0000000..93b7a4d --- /dev/null +++ b/docs/assays/metadata/testing/converted/cytof.md @@ -0,0 +1,42 @@ +--- +layout: page-triary +--- + +# CyTOF Metadata Attributes + +Fields that are collected for CyTOF data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal attribute labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This attribute will not be accessible to anyone outside of the consortium and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| dataset_type | | The specific type of dataset being produced. | | +| analyte_class | | Analytes are the target molecules being measured with the assay. | | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). 
For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example for an imaging assay, the protocol might begin with staining of a section and finalize with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: [https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1](https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1). | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Custom``` ```None``` ```Sigma Aldrich; Cisplatin 25mg; PN P4394``` ```Standard BioTools; Cell-ID Cisplatin-198Pt 100 uL; PN 201198``` ```Standard BioTools; Cell-ID Intercalator-103Rh 2,000 um; PN 201103B``` ```Standard BioTools; Cell-ID Cisplatin-196Pt 100 uL; PN 201196``` ```Standard BioTools; Cell-ID Cisplatin 100 uL; PN 201064``` ```Standard BioTools; Cell-ID Intercalator-103Rh 500 um; PN 201103A``` ```Standard BioTools; Cell-ID Cisplatin-194Pt 100 uL; PN 201194``` ```Standard BioTools; Cell-ID Cisplatin-195Pt 100 uL; PN 201195``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. 
| | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| number_of_mass_channels | | The number of mass channels that measure the expression of markers in single cells. | | +| is_erythrocyte_lysis_performed | | Indicates whether erythrocyte lysis was performed, a process in which red blood cells (RBCs) are broken down in the sample prior to analysis, thereby allowing researchers to focus primarily on white blood cells (WBCs). | ```Yes``` ```No``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. 
Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| antibody_reagent_kit | | The kit containing the set of antibodies pre-conjugated with different heavy metal isotopes. These metal-labeled antibodies attach to specific cellular targets so that multiple protein markers can be detected and quantified simultaneously on individual cells during analysis on the CyTOF machine. | | +| viability_reagent_kit | | The kit used to differentiate between live and dead cells within a sample by selectively staining dead cells with a dye that can be detected by the instrument, allowing researchers to exclude dead cell data from their analysis. | | +| is_cell_activation_performed | | Indicates whether cell activation was performed, a process by which a ligand binds to its receptors on a cell, enhancing the cell's ability to respond to various stimuli. | ```Yes``` ```No``` | +| activation_stimulus | | Specific type of stimulus used to provoke cell activation. Examples would include PMA/ionomycin or CD28in/brefeldin A. This field is required if "is_cell_activation_performed" is Yes. | | +| is_fcr_blocking_applied | | Indicates whether a reagent was added to the staining procedure to block the binding of antibodies to Fc receptors (FcRs) on cells. Blocking minimizes false positive signals by preventing antibodies from attaching to the cell via their Fc region instead of the antigen-specific binding site, so that only the intended target antigen is detected. | ```Yes``` ```No``` | +| is_heparin_used | | Indicates whether heparin was used ("Yes") or not ("No") during staining to prevent non-specific binding of metal-labeled antibodies to eosinophils, which reduces background noise. 
| ```Yes``` ```No``` | +| loaded_cell_concentration_value | | The number of cells present within a given volume of liquid for the experiment immediately prior to the experiment, essentially indicating how densely packed the cells are in a solution. | | +| loaded_cell_concentration_unit | | Unit of measure for cell concentration, e.g. cells per milliliter (cells/mL). | | +| instrument_calibration_bead_kit | | A set of beads of known mass intensity used to adjust the settings of a flow cytometer to ensure accurate measurements. | | +| calibration_kit_lot_number | | Manufacturer's lot number for the calibration bead kit used for the experiment. | | + + diff --git a/docs/assays/metadata/testing/converted/dna-methylation.md b/docs/assays/metadata/testing/converted/dna-methylation.md new file mode 100644 index 0000000..36ca0bc --- /dev/null +++ b/docs/assays/metadata/testing/converted/dna-methylation.md @@ -0,0 +1,30 @@ +--- +layout: page-triary +--- + +# DNA Methylation Metadata Attributes + +Fields that are collected for DNA Methylation data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | +| dataset_type | | The specific type of dataset being produced. 
Example: RNAseq | ```Visium HD``` ```4i``` ```Illumina Spatial ver0``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```DNA Methylation``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```MACSima``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```iCLAP``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` | +| analyte_class | | The analyte class which is the target molecule that the assay is measuring. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Lipid + metabolite + protein``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". 
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Miltenyi Biotec``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```LSM 710 Confocal Microscope``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```MACSima System``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive HF-X``` | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. 
Example: 10 | | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | | +| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | +| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. 
If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: d70bfe24-e82a-46cb-9369-28ae03660d97 | | + + diff --git a/docs/assays/metadata/testing/converted/enhancedsrs.md b/docs/assays/metadata/testing/converted/enhancedsrs.md new file mode 100644 index 0000000..2db9bca --- /dev/null +++ b/docs/assays/metadata/testing/converted/enhancedsrs.md @@ -0,0 +1,36 @@ +--- +layout: page-triary +--- + +# Enhanced SRS Metadata Attributes + +Fields that are collected for Enhanced SRS data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| source_storage_duration_value | | How long the source material (parent) was stored prior to this sample being processed. | | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "/TEST001-RK/" for this field. 
If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| is_image_preprocessing_required | | Indicates whether any level of preprocessing was required to assemble the image (e.g., fusing image tiles). This depends on whether the acquisition instrument was a microscope, slide scanner, etc. | ```Yes``` ```No``` | +| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| tiled_image_columns | | The number of columns used in stitching. This is sometimes referred to as the grid size x. | | +| tiled_image_count | | The total number of raw (tiled) images captured that are to be stitched together. | | +| intended_tile_overlap_percentage | | The amount of overlap between tiled images. This is the set point, whereas during image acquisition there will be slight variations due to stage registration. | | +| dataset_type | | The specific type of dataset being produced. 
| ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| tile_configuration | | This is how the tiles are configured for stitching. | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| scan_direction | | This is the direction of imaging, which is required for stitching. 
| ```Left-and-down``` ```Left-and-up``` ```Not applicable``` ```Right-and-down``` ```Right-and-up``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might span from staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. 
Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | + + diff --git a/docs/assays/metadata/testing/converted/facs.md b/docs/assays/metadata/testing/converted/facs.md new file mode 100644 index 0000000..5f831ac --- /dev/null +++ b/docs/assays/metadata/testing/converted/facs.md @@ -0,0 +1,41 @@ +--- +layout: page-triary +--- + +# FACS Metadata Attributes + +Fields that are collected for FACS data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal field labs can use it to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This field will not be accessible to anyone outside of the consortium and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| dataset_type | | The specific type of dataset being produced. 
| ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Pixel-seq``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx``` ```MERFISH``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| 
source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and finalize with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. 
For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| is_erythrocyte_lysis_performed | | Process in which red blood cells (RBCs) are broken down in the sample prior to analysis, thereby allowing researchers to focus primarily on white blood cells (WBCs). | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| antibody_reagent_kit | | The kit containing the set of antibodies pre-conjugated with different heavy metal isotopes used to simultaneously detect and quantify multiple protein markers on individual cells by attaching these metal-labeled antibodies to specific cellular targets, essentially acting as the key component for labeling cells with the various markers needed for analysis on the CyTOF machine. 
| ```Standard BioTools; Maxpar Nuclear Antigen Staining Kit; PN 201603``` ```Standard BioTools; Maxpar Phosphoprotein Staining Kit; PN 201604``` ```Standard BioTools; Maxpar Cell Surface Staining Kit; PN 201601``` ```Standard BioTools; Maxpar Cytoplasmic/Secreted Antigen Staining Kit; PN 201602``` ```Custom``` | +| viability_reagent_kit | | The kit used to differentiate between live and dead cells within a sample by selectively staining dead cells with a dye that can be detected by the instrument, allowing researchers to exclude dead cell data from their analysis and ensure accurate results when studying cell populations. | ```Sigma Aldrich; Cisplatin 25mg; PN P4394``` ```Standard BioTools; Cell-ID Cisplatin-198Pt 100 uL; PN 201198``` ```None``` ```Standard BioTools; Cell-ID Intercalator-103Rh 2,000 um; PN 201103B``` ```Standard BioTools; Cell-ID Cisplatin-196Pt 100 uL; PN 201196``` ```Standard BioTools; Cell-ID Cisplatin 100 uL; PN 201064``` ```Standard BioTools; Cell-ID Intercalator-103Rh 500 um; PN 201103A``` ```Standard BioTools; Cell-ID Cisplatin-194Pt 100 uL; PN 201194``` ```Standard BioTools; Cell-ID Cisplatin-195Pt 100 uL; PN 201195``` ```Custom``` | +| is_cell_activation_performed | | Process by which a ligand binds to its receptors on a cell, which enhances the cell's ability to respond to various stimuli. | | +| activation_stimulus | | Specific type of stimulus used to provoke cell activation. Examples would include PMA/ionomycin or CD28in/brefeldin A. This field is required if "is_cell_activation_performed" is "Yes". 
| | +| is_fcr_blocking_applied | | Process by which a reagent has been added to the staining procedure to block the binding of antibodies to Fc receptors (FcRs) on cells, preventing non-specific binding and ensuring that only the intended target antigen is detected by the antibodies; essentially, it helps to minimize false positive signals by preventing antibodies from attaching to the cell via their Fc region instead of the antigen-specific binding site. | | +| is_heparin_used | | Indicates whether heparin was used ("Yes") or not ("No") during staining to prevent non-specific binding of metal-labeled antibodies to eosinophils to reduce background noise. | | +| loaded_cell_concentration_value | | The number of cells present within a given volume of liquid for the experiment immediately prior to the experiment, essentially indicating how densely packed the cells are in a solution. | | +| loaded_cell_concentration_unit | | Unit of measure for cell concentration, e.g. cells per milliliter (cells/mL). | ```cells/mL``` | +| instrument_calibration_bead_kit | | A set of beads of known mass intensity used to adjust the settings of a flow cytometer to ensure accurate measurements. | ```Standard BioTools; EQ Six Element Calibration Beads 100 mL; PN 201245``` ```Standard BioTools; EQ Four Element Calibration Beads 100 mL; PN 201078``` ```None``` ```Standard BioTools; CyTOF Calibration Beads; PN 201073``` ```Custom``` | +| calibration_kit_lot_number | | Manufacturer's lot number for the calibration bead kit used for the experiment. 
| | + + diff --git a/docs/assays/metadata/testing/converted/geomx.md b/docs/assays/metadata/testing/converted/geomx.md new file mode 100644 index 0000000..09966c8 --- /dev/null +++ b/docs/assays/metadata/testing/converted/geomx.md @@ -0,0 +1,98 @@ +--- +layout: page-triary +--- + +# GeoMx Metadata Attributes + +Fields that are collected for GeoMx data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| dataset_type | | The specific type of dataset being produced. | | +| analyte_class | | Analytes are the target molecules being measured with the assay. | | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. 
| | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. 
Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | | +| mapped_area_unit | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | | +| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| number_of_channels | | The number of distinct color channels in the image. | | +| target_retrieval_incubation_temperature | | Will normally be 100 degrees Celsius for RNA assays, and 80 degrees Celsius for protein assays. | | +| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. | | +| target_retrieval_incubation_time_unit | | The units for target retrieval incubation time value. | | +| proteinasek_concentration | | The amount or concentration of the enzyme Proteinase K within a sample (in ug/ml). | | +| proteinasek_incubation_time_value | | The duration for which a sample is exposed to Proteinase K. | | +| proteinasek_incubation_time_unit | | The units for proteinaseK incubation time value. | | +| roi_label | | A label for the region of interest (ROI). For Xenium, Resolve and CosMx, this is the field of view (FOV) label. For GeoMx this can be found in the "Initial Dataset" spreadsheet (download from within Data Analysis Suite). | | +| is_roi_segmentation_performed | | Was the image segmented. 
For GeoMx this refers to whether segmentation was used to split ROIs (regions of interest) into AOIs (areas of interest). | | +| roi_segmentation_strategy | | The method of segmentation that was applied in a GeoMx assay. If an overlay was used the overlay image needs to be included in the dataset upload. | | +| anatomical_structure_label | | The overarching anatomical structure. | | +| anatomical_structure_id | | The ontology ID for the parent structure. Typically this would be an UBERON ID. | | +| targeted_entity_label | | State what cell type(s) or functional tissue unit was targeted in this ROI/AOI. | | +| targeted_entity_id | | The ontology ID for the targeted entity. | | +| segment_id | | This is the ID for the area of interest (AOI) in a GeoMx dataset. From "Initial Dataset" spreadsheet (download from within Data Analysis Suite), e.g. 9a828e39-43d8-4051-9bcc-581a520a85d4. | | +| is_technical_replicate | | Is the sequencing reaction run in replicate, "Yes" or "No". If "Yes", FASTQ files in dataset need to be merged. | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| non_global_files | | A semicolon-separated list of non-shared files to be included in the dataset. The path assumes the files are located in the "TOP/non-global/" directory. For example, for the file TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be "./lab_processed/images/1-tissue-boundary.geojson". After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. 
Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee | | + + + +  + +## Deprecated Attributes + +* indicates a field that was previously required + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| dataset_type | | The specific type of dataset being produced. | | +| analyte_class | | Analytes are the target molecules being measured with the assay. | | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. 
| | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. 
Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | | +| mapped_area_unit | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | | +| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| number_of_channels | | The number of distinct color channels in the image. | | +| target_retrieval_incubation_temperature | | Will normally be 100 degrees Celsius for RNA assays, and 80 degrees Celsius for protein assays. | | +| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. | | +| target_retrieval_incubation_time_unit | | The units for target retrieval incubation time value. | | +| proteinasek_concentration | | The amount or concentration of the enzyme Proteinase K within a sample (in ug/ml). | | +| proteinasek_incubation_time_value | | The duration for which a sample is exposed to Proteinase K. | | +| proteinasek_incubation_time_unit | | The units for proteinaseK incubation time value. | | +| roi_label | | A label for the region of interest (ROI). For Xenium, Resolve and CosMx, this is the field of view (FOV) label. For GeoMx this can be found in the "Initial Dataset" spreadsheet (download from within Data Analysis Suite). | | +| is_roi_segmentation_performed | | Was the image segmented. 
For GeoMx this refers to whether segmentation was used to split ROIs (regions of interest) into AOIs (areas of interest). | | +| roi_segmentation_strategy | | The method of segmentation that was applied in a GeoMx assay. If an overlay was used the overlay image needs to be included in the dataset upload. | | +| anatomical_structure_label | | The overarching anatomical structure. | | +| anatomical_structure_id | | The ontology ID for the parent structure. Typically this would be an UBERON ID. | | +| targeted_entity_label | | State what cell type(s) or functional tissue unit was targeted in this ROI/AOI. | | +| targeted_entity_id | | The ontology ID for the targeted entity. | | +| segment_id | | This is the ID for the area of interest (AOI) in a GeoMx dataset. From "Initial Dataset" spreadsheet (download from within Data Analysis Suite), e.g. 9a828e39-43d8-4051-9bcc-581a520a85d4. | | +| is_technical_replicate | | Is the sequencing reaction run in replicate, "Yes" or "No". If "Yes", FASTQ files in dataset need to be merged. | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| hybcode_pack_lot_number | | Enter the lot number noted within the LabWorksheet.txt file (and used in downstream nCounter processing). | | +| probe_hybridization_time_value | | How many hours were the oligo-conjugated RNA or oligo-conjugated antibody probes hybridized with the sample? | | +| probe_hybridization_time_unit | | The units for probe hybridization time value. | | +| oligo_probe_panel | | This is the probe panel used to target genes and/or proteins. In cases where there is a core panel and add-on modules, the core panel should be selected here. If additional panels are used, then they must be included in the "additional_panels_used.csv" file that's uploaded with the dataset. 
| | +| is_custom_probes_used | | State ("Yes" or "No") whether custom RNA or antibody probes were used. If custom probes were used, they must be listed in the "custom_probe_set.csv" file. | | +| non_global_files | | A semicolon-separated list of non-shared files to be included in the dataset. The path assumes the files are located in the "TOP/non-global/" directory. For example, for the file TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be "./lab_processed/images/1-tissue-boundary.geojson". After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee | | + diff --git a/docs/assays/metadata/testing/converted/hifi.md b/docs/assays/metadata/testing/converted/hifi.md new file mode 100644 index 0000000..27a8465 --- /dev/null +++ b/docs/assays/metadata/testing/converted/hifi.md @@ -0,0 +1,49 @@ +--- +layout: page-triary +--- + +# HiFi-Slide Metadata Attributes + +Fields that are collected for HiFi-Slide data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| dataset_type | | The specific type of dataset being produced. 
| ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. 
| | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. 
If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area, which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | | +| mapped_area_unit | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` | +| spot_size_value | | For assays where spots are used to define discrete capture areas, this is the area of a spot. | | +| spot_size_unit | | The unit for spot size value. | ```um^2``` ```mm^2``` | +| number_of_spots | | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | | +| spot_spacing_value | | Approximate center-to-center distance between capture spots. Synonyms: Inter-Spot distance, Spot resolution, Pit size | | +| spot_spacing_unit | | Units corresponding to inter-spot distance | ```um``` | +| capture_area_id | | Which capture area on the slide was used. For Visium this would be ```A1, B1, C1, D1```. For HiFi this would be the lane on the flowcell. | ```A1``` ```B1``` ```C1``` ```D1``` ```Lane 1``` ```Lane 2``` ```Lane 3``` ```Lane 4``` ```Lane 5``` ```Lane 6``` ```Lane 7``` ```Lane 8``` | +| permeabilization_time_value | | Permeabilization time used for this tissue section. | | +| permeabilization_time_unit | | The unit for the permeabilization time. | ```minute``` | +| slide_id | | A unique ID denoting the slide used. 
This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| target_retrieval_incubation_temperature | | Will normally be 100 degrees Celsius for RNA assays, and 80 degrees Celsius for protein assays. | | +| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. | | +| target_retrieval_incubation_time_unit | | The units for target retrieval incubation time value. | ```minute``` | +| proteinasek_concentration | | The amount or concentration of the enzyme Proteinase K within a sample (in ug/ml). | | +| proteinasek_incubation_time_value | | The duration for which a sample is exposed to Proteinase K. | | +| proteinasek_incubation_time_unit | | The units for proteinaseK incubation time value. | ```minute``` | +| anatomical_structure_label | | The overarching anatomical structure. | | +| anatomical_structure_id | | The ontology ID for the parent structure. Typically this would be an UBERON ID. | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| roi_label | | A label for the region of interest (ROI). For Xenium, Resolve and CosMx, this is the field of view (FOV) label. For GeoMx this can be found in the "Initial Dataset" spreadsheet (download from within Data Analysis Suite). 
| | + + diff --git a/docs/assays/metadata/testing/converted/iclap.md b/docs/assays/metadata/testing/converted/iclap.md new file mode 100644 index 0000000..33be59c --- /dev/null +++ b/docs/assays/metadata/testing/converted/iclap.md @@ -0,0 +1,36 @@ +--- +layout: page-triary +--- + +# iCLAP Metadata Attributes + +Fields that are collected for iCLAP data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. Example: 10 | | +| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. 
If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | +| number_of_antibodies | | The number of antibodies used in the assay. If no antibodies were utilized, enter 0. Example: 5 | | +| number_of_biomarker_imaging_rounds | | The number of imaging rounds required to capture the tagged biomarkers. For CODEX, a biomarker imaging round includes steps such as (1) oligo application, (2) fluor application, and (3) washes. For Cell DIVE, it involves (1) the staining of a biomarker via secondary detection or direct conjugate, followed by (2) dye inactivation. Example: 3 | | +| number_of_total_imaging_rounds | | The total number of imaging rounds performed using a microscope to collect either autofluorescence/background or stained signals, such as those used in histological analysis. Example: 5 | | +| slide_id | | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | | +| dataset_type | | The specific type of dataset being produced. 
Example: RNAseq | ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```iCLAP``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```Virtual Histology``` ```DBiT-seq``` | +| analyte_class | | The analyte class which is the target molecule that the assay is measuring. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Lipid + metabolite + protein``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". 
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. 
For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | | +| antibodies_path | | The path to the antibodies.tsv file relative to the root directory of the upload structure. This path should start with "." and is typically formatted as "./extras/antibodies.tsv". Example: ./extras/antibodies.tsv | | +| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | +| number_of_channels | | The number of fluorescent channels that are imaged during each cycle. Example: 3 | | + + diff --git a/docs/assays/metadata/testing/converted/illumina-spatial.md b/docs/assays/metadata/testing/converted/illumina-spatial.md new file mode 100644 index 0000000..c854a41 --- /dev/null +++ b/docs/assays/metadata/testing/converted/illumina-spatial.md @@ -0,0 +1,43 @@ +--- +layout: page-triary +--- + +# Illumina Spatial Metadata Attributes + +Fields that are collected for Illumina Spatial data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | A locally assigned identifier provided by the data provider for the dataset. 
It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| dataset_type | | The specific type of dataset being produced. Example: RNAseq | ```Visium HD``` ```4i``` ```Illumina Spatial v0``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```MACSima``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```iCLAP``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` | +| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | +| data_path | | The top-level directory containing the raw and/or processed data. 
For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | +| mapped_area_value | | The mapped area value, which refers to the specific area covered or captured in various assays. For Visium, it is the area of spots covered by tissue within the captured area, excluding the total possible captured area. For GeoMx, it refers to the area of the AOI being captured. In HiFi, it is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, it indicates the area of the FOV (also known as ROI) region being captured. For Xenium, it is the total area of the FOV regions (also known as ROI) being captured. For Stereo-Seq, this value represents the number of beads. Example: 42.25 | | +| mapped_area_unit | | The unit of measurement for the mapped area value. If mapping area is not specified, this field may be left blank. Example: um^2 | ```um^2``` ```mm^2``` | +| capture_area_id | | The capture area on the slide that was used during the process. For example, in the case for Visium, this would correspond to areas such as [A1, B1, C1, D1], while for HiFi, it would refer to the lane on the flowcell. Example: A1 | | +| permeabilization_time_value | | Permeabilization time used for this tissue section. | | +| permeabilization_time_unit | | The unit for the permeabilization time. | ```minute``` | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. 
Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | +| preparation_instrument_vendor | | The company that manufactures the instrument used to prepare the sample (e.g., for staining or other processing steps) prior to the assay. If the instrument was custom-built or developed internally, enter "In-House". If no sample preparation occurred, enter "Not applicable". Example: 10X Genomics | ```Thermo Fisher Scientific``` ```SunChrom``` ```Akoya Biosciences``` ```Leica Biosystems``` ```Ionpath``` ```Roche Diagnostics``` ```In-House``` ```Not applicable``` ```Hamamatsu``` ```HTX Technologies``` ```10x Genomics``` | +| preparation_instrument_model | | The specific model of the instrument used for sample preparation, such as staining. Manufacturers may offer multiple models with varying features or sensitivities, which can influence how the sample is processed and how the resulting data is interpreted. If no sample preparation occurred, enter "Not applicable". Example: Chromium X | ```AutoStainer XL``` ```ST5020 Multistainer``` ```Visium CytAssist``` ```SunCollect Sprayer``` ```Chromium X``` ```Chromium iX``` ```EVOS M7000``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```Discovery Ultra``` ```Sublimator``` ```Not applicable``` ```TM-Sprayer``` ```M5 Sprayer``` ```M3+ Sprayer``` ```Chromium Controller``` ```Chromium Connect``` ```Custom``` | +| capture_area_width_value | | The width of RNA capture area. Example: 10 | | +| capture_area_width_unit | | The unit of measurement for the capture area width value. 
If the width value is not specified, this field may be left blank. Example: mm | ```mm``` | +| capture_area_height_value | | The height of the RNA capture area. Example: 10 | | +| capture_area_height_unit | | The unit of measurement for the capture area height value. If the height value is not specified, this field may be left blank. Example: mm | ```mm``` | +| spatial_discreatization_method | | The segmentation method used to divide the capture area into smaller, defined regions for analysis. Example: Cell segmentation | ```Square binning``` ```Cell segmentation``` ```Hexagonal binning``` | +| bin_size | | The size (in µm) of each discrete spatial unit ("bin") used to partition the capture area in bin-based spatial discretization. Example: 100 | | +| analyte_class | | The analyte class, which is the target molecule that the assay is measuring. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Lipid + metabolite + protein``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". 
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Miltenyi Biotec``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```LSM 710 Confocal Microscope``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```MACSima System``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive HF-X``` | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. 
Example: 10 | | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | + + diff --git a/docs/assays/metadata/testing/converted/imc.md b/docs/assays/metadata/testing/converted/imc.md new file mode 100644 index 0000000..af93f18 --- /dev/null +++ b/docs/assays/metadata/testing/converted/imc.md @@ -0,0 +1,207 @@ +--- +layout: page-triary +--- + +# 2D Imaging Mass Cytometry Metadata Attributes + +Fields that are collected for 2D Imaging Mass Cytometry data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| dataset_type | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. 
Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| 
source_storage_duration_value | | How long the source material was stored before this sample was processed. For assays applied to tissue sections, this is how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays applied to suspensions, such as sequencing, this is how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be "."
whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| total_run_time_value | | How long the tissue was on the acquisition instrument. | | +| total_run_time_unit | | The units for the total run time value field. | ```Hour``` ```Minute``` | +| number_of_antibodies | | Number of antibodies | | +| number_of_channels | | The number of distinct color channels in the image. | | +| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| data_precision_bytes | | Numerical data precision in bytes. | | +| ablation_frequency_value | | Frequency value of laser ablation | | +| ablation_frequency_unit | | Frequency unit of laser ablation | ```Hz``` | +| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". 
| | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | + + + +  + +## Deprecated Attributes + +* indicates a field that was previously required + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| version | | Version of the schema to use when validating this metadata. | ```'1'``` | +| description | | Free-text description of this assay. | | +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. | | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'mass_spectrometry_imaging'``` | +| assay_type | | The specific type of assay being executed. | ```'Imaging Mass Cytometry'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. 
| ```'protein'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare the sample for the assay. | | +| preparation_instrument_model | | The model number/name of the instrument used to prepare the sample for the assay | | +| section_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | +| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | +| number_of_channels | | Number of mass channels measured | | +| ablation_distance_between_shots_x_value | | x resolution. Distance between laser ablation shots in the X-dimension. | | +| ablation_distance_between_shots_x_units | | Units of x resolution distance between laser ablation shots. | ```'um'``` ```'nm'``` | +| ablation_distance_between_shots_y_value | | y resolution. Distance between laser ablation shots in the Y-dimension. | | +| ablation_distance_between_shots_y_units | | Units of y resolution distance between laser ablation shots. 
| ```'um'``` ```'nm'``` | +| ablation_frequency_value | | Frequency value of laser ablation (in Hz) | | +| ablation_frequency_unit | | Frequency unit of laser ablation | ```'Hz'``` | +| roi_description | | A description of the region of interest (ROI) captured in the image. | | +| roi_id | | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | +| acquisition_id | | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | +| dual_count_start | | Threshold for dual counting. | | +| max_x_width_value | | Image width value of the ROI acquisition | | +| max_x_width_unit | | Units of image width of the ROI acquisition | ```'um'``` | +| max_y_height_value | | Image height value of the ROI acquisition | | +| max_y_height_unit | | Units of image height of the ROI acquisition | ```'um'``` | +| segment_data_format | | This refers to the data type, which is a "float" for the IMC counts. | ```'float'``` ```'integer'``` ```'string'``` | +| signal_type | | Type of signal measured per channel (usually dual counts) | ```'dual count'``` ```'pulse count'``` ```'intensity value'``` | +| data_precision_bytes | | Numerical data precision in bytes | | +| antibodies_path | | Relative path to file with antibody information for this dataset. | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. 
| | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'mass_spectrometry_imaging'``` | +| assay_type | | The specific type of assay being executed. | ```'Imaging Mass Cytometry'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'protein'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare the sample for the assay. | | +| preparation_instrument_model | | The model number/name of the instrument used to prepare the sample for the assay | | +| section_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | +| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | +| number_of_channels | | Number of mass channels measured | | +| ablation_distance_between_shots_x_value | | x resolution. Distance between laser ablation shots in the X-dimension. | | +| ablation_distance_between_shots_x_units | | Units of x resolution distance between laser ablation shots. | ```'um'``` ```'nm'``` | +| ablation_distance_between_shots_y_value | | y resolution. Distance between laser ablation shots in the Y-dimension. | | +| ablation_distance_between_shots_y_units | | Units of y resolution distance between laser ablation shots. | ```'um'``` ```'nm'``` | +| ablation_frequency_value | | Frequency value of laser ablation (in Hz) | | +| ablation_frequency_unit | | Frequency unit of laser ablation | ```'Hz'``` | +| roi_description | | A description of the region of interest (ROI) captured in the image. | | +| roi_id | | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | +| acquisition_id | | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | +| dual_count_start | | Threshold for dual counting. 
| | +| end_datetime | | Time stamp indicating end of ablation for ROI | | +| max_x_width_value | | Image width value of the ROI acquisition | | +| max_x_width_unit | | Units of image width of the ROI acquisition | ```'um'``` | +| max_y_height_value | | Image height value of the ROI acquisition | | +| max_y_height_unit | | Units of image height of the ROI acquisition | ```'um'``` | +| segment_data_format | | This refers to the data type, which is a "float" for the IMC counts. | ```'float'``` ```'integer'``` ```'string'``` | +| signal_type | | Type of signal measured per channel (usually dual counts) | ```'dual count'``` ```'pulse count'``` ```'intensity value'``` | +| start_datetime | | Time stamp indicating start of ablation for ROI | | +| data_precision_bytes | | Numerical data precision in bytes | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | +| version | | Version of the schema to use when validating this metadata. | ```'1'``` | +| description | | Free-text description of this assay. | | +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. | | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. 
| | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'mass_spectrometry_imaging'``` | +| assay_type | | The specific type of assay being executed. | ```'3D Imaging Mass Cytometry'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'protein'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare the sample for the assay. | | +| preparation_instrument_model | | The model number/name of the instrument used to prepare the sample for the assay | | +| section_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | +| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | +| number_of_channels | | Number of mass channels measured | | +| number_of_sections | | Number of sections | | +| ablation_distance_between_shots_x_value | | x resolution. 
Distance between laser ablation shots in the X-dimension. | | +| ablation_distance_between_shots_x_units | | Units of x resolution distance between laser ablation shots. | ```'um'``` ```'nm'``` | +| ablation_distance_between_shots_y_value | | y resolution. Distance between laser ablation shots in the Y-dimension. | | +| ablation_distance_between_shots_y_units | | Units of y resolution distance between laser ablation shots. | ```'um'``` ```'nm'``` | +| ablation_frequency_value | | Frequency value of laser ablation (in Hz) | | +| ablation_frequency_unit | | Frequency unit of laser ablation | ```'Hz'``` | +| roi_description | | A description of the region of interest (ROI) captured in the image. | | +| roi_id | | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | +| acquisition_id | | The acquisition_id refers to the directory containing the ROI images for a slide. Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | +| max_x_width_value | | Image width value of the ROI acquisition | | +| max_x_width_unit | | Units of image width of the ROI acquisition | ```'um'``` | +| max_y_height_value | | Image height value of the ROI acquisition | | +| max_y_height_unit | | Units of image height of the ROI acquisition | ```'um'``` | +| segment_data_format | | This refers to the data type, which is a "float" for the IMC counts. | ```'float'``` ```'integer'``` ```'string'``` | +| signal_type | | Type of signal measured per channel (usually dual counts) | ```'dual count'``` ```'pulse count'``` ```'intensity value'``` | +| antibodies_path | | Relative path to file with antibody information for this dataset. | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. 
Downstream processing will depend on filename extension conventions. | | +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. | | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'mass_spectrometry_imaging'``` | +| assay_type | | The specific type of assay being executed. | ```'3D Imaging Mass Cytometry'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'protein'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare the sample for the assay. | | +| preparation_instrument_model | | The model number/name of the instrument used to prepare the sample for the assay | | +| section_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | +| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | +| number_of_channels | | Number of mass channels measured | | +| number_of_sections | | Number of sections | | +| ablation_distance_between_shots_x_value | | x resolution. Distance between laser ablation shots in the X-dimension. | | +| ablation_distance_between_shots_x_units | | Units of x resolution distance between laser ablation shots. | ```'um'``` ```'nm'``` | +| ablation_distance_between_shots_y_value | | y resolution. Distance between laser ablation shots in the Y-dimension. | | +| ablation_distance_between_shots_y_units | | Units of y resolution distance between laser ablation shots. | ```'um'``` ```'nm'``` | +| ablation_frequency_value | | Frequency value of laser ablation (in Hz) | | +| ablation_frequency_unit | | Frequency unit of laser ablation | ```'Hz'``` | +| roi_description | | A description of the region of interest (ROI) captured in the image. | | +| roi_id | | Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3, etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on a slide. | | +| acquisition_id | | The acquisition_id refers to the directory containing the ROI images for a slide. 
Together, the acquisition_id and the roi_id indicate the slide-ROI represented in the image. | | +| max_x_width_value | | Image width value of the ROI acquisition | | +| max_x_width_unit | | Units of image width of the ROI acquisition | ```'um'``` | +| max_y_height_value | | Image height value of the ROI acquisition | | +| max_y_height_unit | | Units of image height of the ROI acquisition | ```'um'``` | +| segment_data_format | | This refers to the data type, which is a "float" for the IMC counts. | ```'float'``` ```'integer'``` ```'string'``` | +| signal_type | | Type of signal measured per channel (usually dual counts) | ```'dual count'``` ```'pulse count'``` ```'intensity value'``` | +| antibodies_path | | Relative path to file with antibody information for this dataset. | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. 
| | + diff --git a/docs/assays/metadata/testing/converted/index.md b/docs/assays/metadata/testing/converted/index.md new file mode 100644 index 0000000..d89daa9 --- /dev/null +++ b/docs/assays/metadata/testing/converted/index.md @@ -0,0 +1,37 @@ +--- +layout: page +--- + +# Assay metadata pages + +- [4i](4i) +- [COMET](comet) +- [CosMx Proteomics](cosmx-proteomics) +- [CosMx Transcriptomics](cosmx-transcriptomics) +- [CyCIF](cycif) +- [CyTOF](cytof) +- [DNA Methylation](dna-methylation) +- [Enhanced SRS](enhancedsrs) +- [FACS](facs) +- [GeoMx](geomx) +- [HiFi-Slide](hifi) +- [iCLAP](iclap) +- [Illumina Spatial](illumina-spatial) +- [2D Imaging Mass Cytometry](imc) +- [MERFISH](merfish) +- [MPLEx](mplex) +- [Olink](olink) +- [PhenoCycler](phenocycler) +- [Pixel-seqV2](pixel-seqv2) +- [Raman Imaging](raman-imaging) +- [Second Harmonic Generation](secondharmonicgeneration) +- [Seq-Scope](seq-scope) +- [seqFISH](seqfish) +- [SIMS](sims) +- [Slide-seq](slide-seq) +- [SNARE-seq2](snareseq2) +- [STARmap](starmap) +- [Thick Section Multiphoton MxIF](thicksectionmultiphotonmxif) +- [Visium HD](visium-hd) +- [Visium (with probes)](visiumwithprobes) +- [WGS](wgs) diff --git a/docs/assays/metadata/testing/converted/merfish.md b/docs/assays/metadata/testing/converted/merfish.md new file mode 100644 index 0000000..9b950fe --- /dev/null +++ b/docs/assays/metadata/testing/converted/merfish.md @@ -0,0 +1,50 @@ +--- +layout: page-triary +--- + +# MERFISH Metadata Attributes + +Fields that are collected for MERFISH data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| dataset_type | | The specific type of dataset being produced. 
| ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. 
| | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. 
If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | | +| mapped_area_unit | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` | +| permeabilization_time_value | | Permeabilization time used for this tissue section. | | +| permeabilization_time_unit | | The unit for the permeabilization time. | ```minute``` | +| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| target_retrieval_incubation_temperature | | Will normally be 100 degrees Celsius for RNA assays, and 80 degrees Celsius for protein assays. | | +| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. | | +| target_retrieval_incubation_time_unit | | The units for target retrieval incubation time value. | ```minute``` | +| proteinasek_concentration | | The amount or concentration of the enzyme Proteinase K within a sample (in ug/ml). | | +| proteinasek_incubation_time_value | | The duration for which a sample is exposed to Proteinase K. | | +| proteinasek_incubation_time_unit | | The units for proteinaseK incubation time value. 
| ```minute``` | +| probe_hybridization_time_value | | How long were the oligo-conjugated RNA or oligo-conjugated antibody probes hybridized with the sample? | | +| probe_hybridization_time_unit | | The units for probe hybridization time value. | ```Hour``` ```Minute``` | +| oligo_probe_panel | | This is the probe panel used to target genes and/or proteins. In cases where there is a core panel and add-on modules, the core panel should be selected here. If additional panels are used, then they must be included in the "additional_panels_used.csv" file that's uploaded with the dataset. | ```10x Genomics; Chromium Fixed RNA Kit``` ```Human Transcriptome``` ```4 rxns x 1 BC; PN 1000474``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit``` ```16 rxns; PN 1000420``` ```10X Genomics; Chromium Next GEM Single Cell Fixed RNA Human Transcriptome Probe Kit``` ```64 rxns; PN 1000456``` ```10x Genomics; Visium Human Transcriptome Probe Kit v2 - Small; PN 1000466``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Large; PN 1000364``` ```10x Genomics; Visium Human Transcriptome Probe Kit-Small; PN 1000363``` ```10x Genomics; Visium Mouse Transcriptome Probe Kit - Small; PN 1000365``` ```Custom``` ```NanoString Technologies; GeoMx Human Whole Transcriptome Atlas``` ```4 slides; PN GMX-RNA-NGS-HuWTA-4``` ```NanoString Technologies; GeoMx Mouse Whole Transcriptome Atlas``` ```4 slides; PN GMX-RNA-NGS-MsWTA-4``` | +| is_custom_probes_used | | State ("Yes" or "No") whether custom RNA or antibody probes were used. If custom probes were used, they must be listed in the "custom_probe_set.csv" file. | ```Yes``` ```No``` | +| number_of_panel_targets | | How many genes, RNA isoforms or RNA regions are targeted by probes. | | +| roi_label | | A label for the region of interest (ROI). For Xenium, Resolve and CosMx, this is the field of view (FOV) label. 
For GeoMx this can be found in the "Initial Dataset" spreadsheet (download from within Data Analysis Suite). | | +| anatomical_structure_label | | The overarching anatomical structure. | | +| anatomical_structure_id | | The ontology ID for the parent structure. Typically this would be an UBERON ID. | | +| non_global_files | | A semicolon-separated list of non-shared files to be included in the dataset. The path assumes the files are located in the "TOP/non-global/" directory. For example, for the file TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be "./lab_processed/images/1-tissue-boundary.geojson". After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee | | +| number_of_additional_stains | | This would be minimally 2 (always include DAPI and polyT) and can include 6 more. | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. 
Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | + + diff --git a/docs/assays/metadata/testing/converted/mplex.md b/docs/assays/metadata/testing/converted/mplex.md new file mode 100644 index 0000000..bc4e9af --- /dev/null +++ b/docs/assays/metadata/testing/converted/mplex.md @@ -0,0 +1,61 @@ +--- +layout: page-triary +--- + +# MPLEx Metadata Attributes + +Fields that are collected for MPLEx data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal field that labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This field will not be accessible to anyone outside of the consortium, and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| dataset_type | | The specific type of dataset being produced. 
| ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| 
source_storage_duration_value | | How long the source material (parent) was stored prior to this sample being processed. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and end with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. 
This is an internal metadata field that is just used for ingest. | | +| mass_analysis_polarity | | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Positive ion mode``` ```Negative ion mode``` | +| mass_to_charge_range_low_value | | The low value of the scanned mass-to-charge range, for MS1. (unitless) | | +| mass_to_charge_range_high_value | | The high value of the scanned mass-to-charge range, for MS1. (unitless) | | +| mass_resolving_power | | The mass resolving power m/∆m, where ∆m is defined as the full width at half-maximum (FWHM) for a given peak with a specified mass-to-charge (m/z). (unitless) | | +| mass_to_charge_resolving_power | | The peak (m/z) used to calculate the resolving power. | | +| ion_mobility | | Specifies which technology was used for ion mobility spectrometry. Technologies for measuring ion mobility: Traveling Wave Ion Mobility Spectrometry (TWIMS), Trapped Ion Mobility Spectrometry (TIMS), High Field Asymmetric waveform ion Mobility Spectrometry (FAIMS), Drift Tube Ion Mobility Spectrometry (DTIMS), Structures for Lossless Ion Manipulations (SLIM), and cyclic Ion Mobility Spectrometry (cIMS). | ```TIMS``` ```SLIM``` ```FAIMS``` ```DTIMS``` ```cIMS``` ```TWIMS``` | +| ms_ionization_technique | | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```MALDI``` ```SIMS-C60``` ```LDI``` ```HESI``` ```nanoDESI``` ```MALDI-2``` ```DESI``` ```LA``` ```SIMS-H20``` ```ESI``` | +| ms_scan_mode | | MS (mass spectrometry) scan mode refers to the number of steps in the separation of fragments. | ```MS2``` ```MS1``` ```MS3``` | +| label_name | | If the samples were labeled (e.g. TMT), provide the name/ID of the label on this sample. Leave blank if not applicable. | | +| lc_instrument_vendor | | The manufacturer of the instrument used for liquid chromatography. 
| ```Thermo Fisher Scientific``` ```Sciex``` ```In-House``` ```Agilent Technologies``` ```Waters``` ```Bruker``` ```Evosep``` | +| lc_instrument_model | | The model number/name of the instrument used for liquid chromatography. | | +| lc_column_model | | The model number/name of the liquid chromatography column. If a custom self-packed, pulled tip capillary is used, enter “Pulled tip capilary”. | | +| lc_resin | | Details of the resin used for liquid chromatography, including vendor, particle size, pore size. | | +| lc_column_length_value | | Liquid chromatography column length. | | +| lc_column_length_unit | | Units for liquid chromatography column length (typically cm). | ```um``` ```mm``` ```cm``` | +| lc_temperature_value | | Liquid chromatography temperature. | | +| lc_inner_diameter_value | | Liquid chromatography column inner diameter. | | +| lc_flow_rate_value | | Value of flow rate. | | +| lc_gradient_value | | Liquid chromatography gradient. | | +| lc_gradient_unit | | Unit for liquid chromatography gradient | ```minute``` | +| lc_mobile_phase_a | | Composition of mobile phase A. | | +| lc_mobile_phase_b | | | | +| spatial_sampling_technique | | | ```nanoSPLITS``` ```nanoPOTS``` ```LESA``` ```microPOTS``` ```LCM``` ```microLESA``` | +| spatial_sampling_target | | Specifies the cell-type or functional tissue unit (FTU) that is targeted in the spatial profiling experiment. Leave blank if data are generated in imaging mode without a specific target structure. | | +| analysis_protocol_doi | | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. 
Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| data_collection_mode | | Mode of data collection in tandem MS assays. Either DDA (data-dependent acquisition), DIA (data-independent acquisition), SRM (selected reaction monitoring), or PRM (parallel reaction monitoring). | ```DDA``` ```PRM``` ```DIA``` ```SRM``` | +| lc_column_vendor | | The manufacturer of the liquid chromatography column unless self-packed, pulled tip capillary is used. | ```Thermo Fisher Scientific``` ```In-House``` ```Waters``` ```Bruker``` ```Evosep``` ```IonOpticks``` | +| lc_temperature_unit | | | ```celsius``` | +| lc_inner_diameter_unit | | | ```um``` ```mm``` ```cm``` | +| lc_flow_rate_unit | | Units of flow rate. | ```nL/min``` ```mL/min``` | +| spatial_sampling_type | | Specifies whether or not the analysis was performed in a spatially targeted manner. Spatial profiling experiments target specific tissue foci but do not necessarily generate images. Spatial imaging experiments collect data from a regular array (pixels) that can be visualized as heat maps of ion intensity at each location (molecular images). Leave blank if data are derived from bulk analysis. | ```Imaging``` ```Profiling``` | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. 
Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | + + diff --git a/docs/assays/metadata/testing/converted/olink.md b/docs/assays/metadata/testing/converted/olink.md new file mode 100644 index 0000000..b5c8810 --- /dev/null +++ b/docs/assays/metadata/testing/converted/olink.md @@ -0,0 +1,30 @@ +--- +layout: page-triary +--- + +# Olink Metadata Attributes + +Fields that are collected for Olink data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal field that labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This field will not be accessible to anyone outside of the consortium, and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| dataset_type | | The specific type of dataset being produced. 
| ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Pixel-seq``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx``` ```MERFISH``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| 
source_storage_duration_value | | How long was the source material stored prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored prior to the assay beginning (e.g., imaging). For assays applied to suspensions, such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and end with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. 
For instance, if the data is within a directory called "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | + + diff --git a/docs/assays/metadata/testing/converted/phenocycler.md b/docs/assays/metadata/testing/converted/phenocycler.md new file mode 100644 index 0000000..6abe12d --- /dev/null +++ b/docs/assays/metadata/testing/converted/phenocycler.md @@ -0,0 +1,40 @@ +--- +layout: page-triary +--- + +# PhenoCycler Metadata Attributes + +Fields that are collected for PhenoCycler data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| source_storage_duration_value | | How long the source material (parent) was stored prior to this sample being processed. | | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. 
| | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use the syntax "/TEST001-RK/" for this field. If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2", in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| number_of_antibodies | | Number of antibodies | | +| number_of_channels | | Number of fluorescent channels imaged during each cycle. | | +| number_of_biomarker_imaging_rounds | | Number of imaging rounds to capture the tagged biomarkers. For CODEX a biomarker imaging round consists of 1. oligo application, 2. fluor application, 3. washes. For Cell DIVE a biomarker imaging round consists of 1. staining of a biomarker via secondary detection or direct conjugate and 2. dye inactivation. | | +| number_of_total_imaging_rounds | | The total number of acquisitions performed on the microscope to collect autofluorescence/background or stained signal (e.g., histology). | | +| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| total_run_time_value | | How long the tissue was on the acquisition instrument. | | +| dataset_type | | The specific type of dataset being produced. 
| ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| total_run_time_unit | | The units for the total run time unit field. | ```Hour``` ```Minute``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. 
For example, for an imaging assay, the protocol might span from staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| nuclear_marker_or_stain | | For markers (antibody-targeted molecules present in or around the cell nucleus), this is the protein or gene symbol that identifies the antibody target used as the nuclear marker. This symbol must match the antibody target that is either generated from the panel used or entered with custom panels. Preferably, if using a custom antibody marker, this symbol should be the HGNC symbol (https://www.genenames.org/). For non-protein targets this is the stain name (e.g., DAPI) and, when appropriate, associated staining kit and vendor. For the PhenoCycler, this symbol must match the value found in the XPD output file. 
| ```DAPI``` ```Not applicable``` | +| cell_boundary_marker_or_stain | | If a marker or stain was used to identify all cell boundaries in the tissue, then the name of the marker or stain should be included here. The name of the antibody-targeted molecule marker or non-antibody targeted molecule stain included here must be identical to what is found in the imaging data. For example, with the PhenoCycler, this name must match the value found in the XPD output file. If multiple markers or stains are used to identify all cell boundaries, then a comma-separated list should be used here. | ```NAKATPASE``` ```CD298``` ```Not applicable``` | +| non_global_files | | A semicolon-separated list of non-shared files to be included in the dataset. The path assumes the files are located in the "TOP/non-global/" directory. For example, for the file TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be "./lab_processed/images/1-tissue-boundary.geojson". After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. 
Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee | | + + diff --git a/docs/assays/metadata/testing/converted/pixel-seqv2.md b/docs/assays/metadata/testing/converted/pixel-seqv2.md new file mode 100644 index 0000000..8217d8f --- /dev/null +++ b/docs/assays/metadata/testing/converted/pixel-seqv2.md @@ -0,0 +1,39 @@ +--- +layout: page-triary +--- + +# Pixel-seqV2 Metadata Attributes + +Fields that are collected for Pixel-seqV2 data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal field that labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This field will not be accessible to anyone outside of the consortium, and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| dataset_type | | The specific type of dataset being produced. 
| ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| 
source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and finalize with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. 
For instance, if the data is within a directory called "TEST001-RK", use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, this is the area of the FOV (aka ROI) region being captured. For Xenium this is the total area of the FOV regions (aka ROI) being captured. For Stereo-Seq this is the number of beads. | | +| mapped_area_unit | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` | +| spot_size_value | | For assays where spots are used to define discrete capture areas, this is the area of a spot. | | +| spot_size_unit | | The unit for spot size value. | ```um^2``` ```mm^2``` | +| number_of_spots | | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | | +| permeabilization_time_value | | Permeabilization time used for this tissue section. 
| | +| permeabilization_time_unit | | The unit for the permeabilization time. | ```minute``` | +| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| number_of_additional_stains | | This would be minimally 2 (always include DAPI and polyT) and can include 6 more. | | + + diff --git a/docs/assays/metadata/testing/converted/raman-imaging.md b/docs/assays/metadata/testing/converted/raman-imaging.md new file mode 100644 index 0000000..b569452 --- /dev/null +++ b/docs/assays/metadata/testing/converted/raman-imaging.md @@ -0,0 +1,47 @@ +--- +layout: page-triary +--- + +# Raman Imaging Metadata Attributes + +Fields that are collected for Raman Imaging data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. 
Example: 12 | | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. Example: 10 | | +| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | +| is_image_preprocessing_required | | Indicates whether image preprocessing is necessary based on the type of acquisition instrument used, such as a microscope or slide scanner. This may involve steps like fusing image tiles to assemble the complete image. Example: Yes | | +| slide_id | | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | | +| tiled_image_columns | | The number of columns used in the stitching process of a tiled image, often referred to as the grid size in the x-dimension. Example: 5 | | +| tiled_image_count | | The total number of raw tiled images captured, which are intended to be stitched together. Example: 75 | | +| intended_tile_overlap_percentage | | The intended percentage of overlap between tiled images. 
This value serves as the set point, although slight variations may occur during image acquisition due to stage registration. Example: 5 | | +| dataset_type | | The specific type of dataset being produced. Example: RNAseq | ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```Virtual Histology``` ```DBiT-seq``` ```PhenoCycler``` | +| analyte_class | | The analyte class which is the target molecule that the assay is measuring. Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". 
Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| tile_configuration | | The configuration of tiles used for stitching in the assay process. If no tile configuration is applicable, enter "Not applicable". Example: Row-by-row | ```Column-by-column``` ```Not applicable``` ```Snake-by-columns``` ```Row-by-row``` ```Snake-by-rows``` | +| scan_direction | | The direction of imaging, which is necessary for the stitching process. 
Example: Left-and-down | ```Left-and-down``` ```Right-and-down``` ```Not applicable``` ```Right-and-up``` ```Left-and-up``` | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | | +| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | +| number_of_pixels | | The total number of spatial sampling points in an image; for example, in a Raman image, each pixel corresponds to one recorded Raman spectrum. Example: 40000 | | +| pixel_physical_size_height_value | | The physical height of a single pixel in the image. Example: 1000 | | +| pixel_physical_size_height_unit | | The unit of measurement for the pixel physical size height value. If the pixel height is not specified, this field may be left blank. 
Example: um | ```um``` ```mm``` ```nm``` | +| pixel_physical_size_width_value | | The physical width of a single pixel in the image. Example: 1000 | | +| pixel_physical_size_width_unit | | The unit of measurement for the pixel physical size width value. If the pixel width value is not specified, this field may be left blank. Example: um | ```um``` ```mm``` ```nm``` | +| pixel_physical_size_depth_value | | The physical depth of a single pixel in the image. Example: 10 | | +| pixel_physical_size_depth_unit | | The unit of measurement for the pixel physical size depth value. If the pixel depth value is not specified, this field may be left blank. Example: um | ```um``` ```mm``` ```nm``` | +| objective_numerical_aperture | | Numerical aperture of the microscope objective used to focus the excitation laser on the sample and collect the resulting scattered signal, such as Raman-scattered light. Example: 0.5 | | +| laser_power | | Power of the excitation laser at the sample’s focal plane, measured after the objective and reported in milliwatts (mW). Example: 10 | | +| raman_shift_range | | Range of Raman shifts acquired in the measurement, expressed in wavenumbers (cm⁻¹). Example: 400-3200 | | + + diff --git a/docs/assays/metadata/testing/converted/secondharmonicgeneration.md b/docs/assays/metadata/testing/converted/secondharmonicgeneration.md new file mode 100644 index 0000000..9057baa --- /dev/null +++ b/docs/assays/metadata/testing/converted/secondharmonicgeneration.md @@ -0,0 +1,37 @@ +--- +layout: page-triary +--- + +# Second Harmonic Generation Metadata Attributes + +Fields that are collected for Second Harmonic Generation data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| source_storage_duration_value | | How long was the source material (parent) stored, prior to this sample being processed. 
| | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK", use syntax "/TEST001-RK/" for this field. If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| is_image_preprocessing_required | | Indicates whether any level of preprocessing was required to assemble the image (e.g., fusing image tiles). This depends on whether the acquisition instrument was a microscope, slide scanner, etc. | ```Yes``` ```No``` | +| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| tiled_image_columns | | The number of columns used in stitching. This is sometimes referred to as the grid size x. | | +| tiled_image_count | | The total number of raw (tiled) images captured that are to be stitched together. | | +| intended_tile_overlap_percentage | | The amount of overlap between tiled images. 
This is the set point, whereas during image acquisition there will be slight variations due to stage registration. | | +| dataset_type | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| 
time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| tile_configuration | | This is how the tiles are configured for stitching. | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| scan_direction | | This is the direction of imaging, which is required for stitching. | ```Left-and-down``` ```Left-and-up``` ```Not applicable``` ```Right-and-down``` ```Right-and-up``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. 
If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | + + diff --git a/docs/assays/metadata/testing/converted/seq-scope.md b/docs/assays/metadata/testing/converted/seq-scope.md new file mode 100644 index 0000000..b995653 --- /dev/null +++ b/docs/assays/metadata/testing/converted/seq-scope.md @@ -0,0 +1,41 @@ +--- +layout: page-triary +--- + +# Seq-Scope Metadata Attributes + +Fields that are collected for Seq-Scope data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal field that labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., "9OL; 9OLC.A2; Visium_9OLC_A4_S1"). This field will not be accessible to anyone outside of the consortium, and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| dataset_type | | The specific type of dataset being produced. 
| ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Pixel-seq``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx``` ```MERFISH``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Q Exactive HF``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` ```Orbitrap Eclipse Tribrid``` ```MIBIscope``` ```IN Cell Analyzer 2200``` ```timsTOF FleX MALDI-2``` | +| 
source_storage_duration_value | | How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. | | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and finalize with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. 
For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, this is the area of the FOV (aka ROI) region being captured. For Xenium this is the total area of the FOV regions (aka ROI) being captured. For Stereo-Seq this is the number of beads. | | +| mapped_area_unit | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` | +| spot_size_value | | For assays where spots are used to define discrete capture areas, this is the area of a spot. | | +| spot_size_unit | | The unit for spot size value. | ```um^2``` ```mm^2``` | +| number_of_spots | | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | | +| spot_spacing_value | | Approximate center-to-center distance between capture spots. 
Synonyms: Inter-Spot distance, Spot resolution, Pit size | | +| spot_spacing_unit | | Units corresponding to inter-spot distance | ```um``` | +| permeabilization_time_value | | Permeabilization time used for this tissue section. | | +| permeabilization_time_unit | | The unit for the permeabilization time. | ```minute``` | +| slide_id | | A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| number_of_additional_stains | | This would be minimally 2 (always include DAPI and polyT) and can include 6 more. | | + + diff --git a/docs/assays/metadata/testing/converted/seqfish.md b/docs/assays/metadata/testing/converted/seqfish.md new file mode 100644 index 0000000..cbb835f --- /dev/null +++ b/docs/assays/metadata/testing/converted/seqfish.md @@ -0,0 +1,91 @@ +--- +layout: page-triary +--- + +# seqFISH Metadata Attributes + +Fields that are collected for seqFISH data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| version | | Version of the schema to use when validating this metadata. | ```'1'``` | +| description | | Free-text description of this assay. | | +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. | | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. 
YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'imaging'``` | +| assay_type | | The specific type of assay being executed. | ```'seqFISH'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'RNA'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| resolution_x_value | | The width of a pixel. | | +| resolution_x_unit | | The unit of measurement of the width of a pixel. 
| ```'nm'``` ```'um'``` | +| resolution_y_value | | The height of a pixel | | +| resolution_y_unit | | The unit of measurement of the height of a pixel. | ```'nm'``` ```'um'``` | +| resolution_z_value | | Optional if assay does not have multiple z-levels. Note that this is resolution within a given sample: z-pitch (resolution_z_value) is the increment distance between image slices (for Akoya, z-pitch=1.5um) ie. the microscope stage is moved up or down in increments of 1.5um to capture images of several focal planes. The best one will be used & the rest discarded. The thickness of the sample itself is sample metadata. | | +| resolution_z_unit | | The unit of incremental distance between image slices. | ```'mm'``` ```'um'``` ```'nm'``` | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare the sample for the assay. | | +| preparation_instrument_model | | The model number/name of the instrument used to prepare the sample for the assay | | +| number_of_barcode_probes | | Number of barcode probes targeting mRNAs (eg. 24,000 barcode probes = 24,000 mRNAs - 1 per mRNA of interest) | | +| number_of_barcode_regions_per_barcode_probe | | Number of barcode regions on each mRNA barcode probe (the paper describes mRNA probes with 4 barcoded regions) | | +| number_of_readout_probes_per_channel | | Number of readout probes that can be interrogated per channel per cycle (the paper describes 20 readout probes per channel (x 3 channels -> total = 60)) | | +| number_of_pseudocolors_per_channel | | Number of pseudocolors that can be assigned to each fluorescent channel (the paper describes 20 pseudocolors per channel (x 3 channels -> total = 60) | | +| number_of_channels | | Number of fluorescent channels (the paper describes 3 channels - for 3 fluorescent dyes) | | +| number_of_cycles | | For each barcode region being interrogated, the number of cycles of 1. Hybridization of readout probes, 2. imaging, 3. 
Washes (the paper describes 1 readout probe per hyb cycle -> 20 readout probes = 20 hyb cycles) | | +| section_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | +| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | + + + +  + +## Deprecated Attributes + +* indicates a field that was previously required + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. | | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'imaging'``` | +| assay_type | | The specific type of assay being executed. 
| ```'seqFISH'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'RNA'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| resolution_x_value | | The width of a pixel. | | +| resolution_x_unit | | The unit of measurement of the width of a pixel. | ```'nm'``` ```'um'``` | +| resolution_y_value | | The height of a pixel | | +| resolution_y_unit | | The unit of measurement of the height of a pixel. | ```'nm'``` ```'um'``` | +| resolution_z_value | | Optional if assay does not have multiple z-levels. Note that this is resolution within a given sample: z-pitch (resolution_z_value) is the increment distance between image slices (for Akoya, z-pitch=1.5um) ie. the microscope stage is moved up or down in increments of 1.5um to capture images of several focal planes. The best one will be used & the rest discarded. The thickness of the sample itself is sample metadata. | | +| resolution_z_unit | | The unit of incremental distance between image slices. | ```'mm'``` ```'um'``` ```'nm'``` | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare the sample for the assay. 
| | +| preparation_instrument_model | | The model number/name of the instrument used to prepare the sample for the assay | | +| number_of_barcode_probes | | Number of barcode probes targeting mRNAs (eg. 24,000 barcode probes = 24,000 mRNAs - 1 per mRNA of interest) | | +| number_of_barcode_regions_per_barcode_probe | | Number of barcode regions on each mRNA barcode probe (the paper describes mRNA probes with 4 barcoded regions) | | +| number_of_readout_probes_per_channel | | Number of readout probes that can be interrogated per channel per cycle (the paper describes 20 readout probes per channel (x 3 channels -> total = 60)) | | +| number_of_pseudocolors_per_channel | | Number of pseudocolors that can be assigned to each fluorescent channel (the paper describes 20 pseudocolors per channel (x 3 channels -> total = 60) | | +| number_of_channels | | Number of fluorescent channels (the paper describes 3 channels - for 3 fluorescent dyes) | | +| number_of_cycles | | For each barcode region being interrogated, the number of cycles of 1. Hybridization of readout probes, 2. imaging, 3. Washes (the paper describes 1 readout probe per hyb cycle -> 20 readout probes = 20 hyb cycles) | | +| section_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing tissue sections for the assay. | | +| reagent_prep_protocols_io_doi | | DOI for protocols.io referring to the protocol for preparing reagents for the assay. | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. 
| | + diff --git a/docs/assays/metadata/testing/converted/sims.md b/docs/assays/metadata/testing/converted/sims.md new file mode 100644 index 0000000..01852ec --- /dev/null +++ b/docs/assays/metadata/testing/converted/sims.md @@ -0,0 +1,41 @@ +--- +layout: page-triary +--- + +# SIMS Metadata Attributes + +Fields that are collected for SIMS data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| dataset_type | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| source_storage_duration_value | | How long was the source material (parent) stored, prior to this sample being processed. 
| | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload this might be ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. 
| | +| mass_analysis_polarity | | The polarity of the mass analysis (positive or negative ion modes). | ```Negative and positive ion mode``` ```Negative ion mode``` ```Positive ion mode``` | +| mass_resolving_power | | The mass resolving power m/∆m, where ∆m is defined as the full width at half-maximum (FWHM) for a given peak with a specified mass-to-charge (m/z). (unitless) | | +| mass-to-charge_resolving_power | | The peak (m/z) used to calculate the resolving power. | | +| matrix_deposition_method | | Common methods of depositing matrix for assisting in desorption and ionization in imaging mass spectrometry include robotic spotting, electrospray deposition, and sublimation. | ```Electrospray deposition``` ```Not applicable``` ```Robotic spotting``` ```Robotic spraying``` ```Sublimation``` | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | +| preparation_instrument_model | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` | +| preparation_matrix | | The matrix is a compound of crystallized molecules that acts like a buffer between the sample and the ionizing probe. It also helps ionize the sample, carrying it along the flight tube so it can be detected. | ```2,5-DHA (2,5-dihydroxyacetophenone)``` ```2,5-DHB (2,5-Dihydroxybenzoic acid)``` ```9-AA (9-aminoacridine)``` ```CHCA (alpha-cyano-4-hydroxy-cinnamic acid)``` ```DAN (1,5-diaminonapthalene)``` ```DMACA (4-(dimethylamino)cinnamic acid)``` ```NEDC (N-(1-naphthyl) ethylenediamine dihydrochloride)``` ```SA (sinapic acid)``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| mass-to-charge_range_low_value | | The low value of the scanned mass-to-charge range, for MS1. (unitless) | | +| mass-to-charge_range_high_value | | The high value of the scanned mass-to-charge range, for MS1. (unitless) | | +| analysis_protocol_doi | | A DOI to a protocols.io protocol describing the software and database(s) used to process the raw data. Example: https://dx.doi.org/10.17504/protocols.io.bsu5ney6 | | +| ms_ionization_technique | | The ionization approach (i.e., sample probing method) for performing imaging mass spectrometry. | ```DESI``` ```ESI``` ```HESI``` ```LA``` ```LDI``` ```MALDI``` ```MALDI-2``` ```nanoDESI``` ```SIMS-C60``` ```SIMS-H20``` | +| ms_scan_mode | | MS (mass spectrometry) scan mode refers to the number of steps in the separation of fragments. 
| ```MS1``` ```MS2``` ```MS3``` | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | + + diff --git a/docs/assays/metadata/testing/converted/slide-seq.md b/docs/assays/metadata/testing/converted/slide-seq.md new file mode 100644 index 0000000..199209c --- /dev/null +++ b/docs/assays/metadata/testing/converted/slide-seq.md @@ -0,0 +1,95 @@ +--- +layout: page-triary +--- + +# Slide-seq Metadata Attributes + +Fields that are collected for Slide-seq data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| version | | Version of the schema to use when validating this metadata. | ```'1'``` | +| description | | Free-text description of this assay. | | +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. | | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. 
| | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'sequence'``` | +| assay_type | | The specific type of assay being executed. | ```'Slide-seq'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'RNA'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| rnaseq_assay_method | | The kit used for the RNA sequencing assay | | +| library_construction_protocols_io_doi | | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | +| library_layout | | State whether the library was generated for single-end or paired end sequencing. | ```'single-end'``` ```'paired-end'``` | +| library_adapter_sequence | | Adapter sequence to be used for adapter trimming | | +| puck_id | | Slide-seq captures RNA sequence data on spatially barcoded arrays of beads. Beads are fixed to a slide in a region shaped like a round puck. Each puck has a unique puck_id. 
| | +| is_technical_replicate | | Whether the sequencing reaction was run in replicate (Yes or No). | ```'Yes'``` ```'No'``` | +| bead_barcode_read | | Which read file contains the bead barcode | | +| bead_barcode_offset | | Position(s) in the read at which the bead barcode starts | | +| bead_barcode_size | | Length of the bead barcode in base pairs | | +| library_pcr_cycles | | Number of PCR cycles to amplify cDNA | | +| library_pcr_cycles_for_sample_index | | Number of PCR cycles performed for library indexing | | +| library_final_yield_value | | Total number of ng of library after final PCR amplification step. This is the concentration (ng/ul) * volume (ul) | | +| library_final_yield_unit | | Units of final library yield | ```'ng'``` | +| library_average_fragment_size | | Average size in base pairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | +| sequencing_reagent_kit | | Reagent kit used for sequencing | | +| sequencing_read_format | | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | +| sequencing_read_percent_q30 | | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | +| sequencing_phix_percent | | Percent PhiX loaded to the run | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | + + + +  + +## Deprecated Attributes + +* indicates a field that was previously required + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. 
| | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'sequence'``` | +| assay_type | | The specific type of assay being executed. | ```'Slide-seq'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'RNA'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| | +| rnaseq_assay_method | | The kit used for the RNA sequencing assay | | +| library_construction_protocols_io_doi | | A link to the protocol document containing the library construction method (including version) that was used, e.g. "Smart-Seq2", "Drop-Seq", "10X v3". | | +| library_layout | | State whether the library was generated for single-end or paired-end sequencing. | ```'single-end'``` ```'paired-end'``` | +| library_adapter_sequence | | Adapter sequence to be used for adapter trimming | | +| puck_id | | Slide-seq captures RNA sequence data on spatially barcoded arrays of beads. Beads are fixed to a slide in a region shaped like a round puck. Each puck has a unique puck_id. | | +| is_technical_replicate | | Whether the sequencing reaction was run in replicate (Yes or No). | ```'Yes'``` ```'No'``` | +| bead_barcode_read | | Which read file contains the bead barcode | | +| bead_barcode_offset | | Position(s) in the read at which the bead barcode starts | | +| bead_barcode_size | | Length of the bead barcode in base pairs | | +| library_pcr_cycles | | Number of PCR cycles to amplify cDNA | | +| library_pcr_cycles_for_sample_index | | Number of PCR cycles performed for library indexing | | +| library_final_yield_value | | Total number of ng of library after final PCR amplification step. This is the concentration (ng/ul) * volume (ul) | | +| library_final_yield_unit | | Units of final library yield | ```'ng'``` | +| library_average_fragment_size | | Average size in base pairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | +| sequencing_reagent_kit | | Reagent kit used for sequencing | | +| sequencing_read_format | | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | +| sequencing_read_percent_q30 | | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) 
| | +| sequencing_phix_percent | | Percent PhiX loaded to the run | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | + diff --git a/docs/assays/metadata/testing/converted/snareseq2.md b/docs/assays/metadata/testing/converted/snareseq2.md new file mode 100644 index 0000000..ce2fa9d --- /dev/null +++ b/docs/assays/metadata/testing/converted/snareseq2.md @@ -0,0 +1,22 @@ +--- +layout: page-triary +--- + +# SNARE-seq2 Metadata Attributes + +Fields that are collected for SNARE-seq2 data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| dataset_type | | The specific type of dataset being produced. | ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." 
whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| number_of_pre-amplification_pcr_cycles | | The number of PCR cycles run after the Chromium Controller step and prior to separating the suspension and initiating library construction | | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. 
Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | + + diff --git a/docs/assays/metadata/testing/converted/starmap.md b/docs/assays/metadata/testing/converted/starmap.md new file mode 100644 index 0000000..cbc8d39 --- /dev/null +++ b/docs/assays/metadata/testing/converted/starmap.md @@ -0,0 +1,45 @@ +--- +layout: page-triary +--- + +# STARmap Metadata Attributes + +Fields that are collected for STARmap data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | +| dataset_type | | The specific type of dataset being produced. Example: RNAseq | ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```COMET``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Raman Imaging``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```STARmap``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```Virtual Histology``` ```DBiT-seq``` ```PhenoCycler``` | +| analyte_class | | The analyte class which is the target molecule that the assay is measuring. 
Example: DNA | ```Nucleic acid + protein``` ```Lipid + metabolite``` ```Collagen``` ```RNA``` ```Fluorochrome``` ```DNA``` ```Metabolite``` ```DNA + RNA``` ```Saturated lipid``` ```Lipid``` ```RNA + protein``` ```Peptide``` ```Protein``` ```Unsaturated lipid``` ```Endogenous fluorophore``` ```Chromatin``` ```Polysaccharide``` | +| acquisition_instrument_vendor | | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter "In-House". Example: Illumina | ```Complete Genomics``` ```Cytek Biosciences``` ```Thermo Fisher Scientific``` ```Sciex``` ```Vizgen``` ```Leica Microsystems``` ```Akoya Biosciences``` ```Keyence``` ```Andor``` ```Standard BioTools (Fluidigm)``` ```Leica Biosystems``` ```Zeiss Microscopy``` ```Ionpath``` ```Motic``` ```In-House``` ```Revvity``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Element Biosciences``` ```Hamamatsu``` ```Waters``` ```Bruker``` ```Illumina``` ```3DHISTECH``` ```Singular Genomics``` ```Huron Digital Pathology``` ```Resolve Biosciences``` ```NanoString``` ```Cytiva``` ```10x Genomics``` ```Microscopes International``` ```BGI Genomics``` | +| acquisition_instrument_model | | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter "In-House". If the model is unknown, enter "Unknown". 
Example: HiSeq 4000 | ```NovaSeq X``` ```NovaSeq X Plus``` ```Cytek Northern Lights``` ```Lightsheet 7``` ```Resolve Biosciences Molecular Cartography``` ```timsTOF HT``` ```timsTOF Pro 2``` ```timsTOF Pro``` ```timsTOF Ultra``` ```timsTOF Ultra 2``` ```timsTOF SCP``` ```Axio Scan.Z1``` ```MALDI timsTOF Flex Prototype``` ```CosMx Spatial Molecular Imager``` ```Unknown``` ```MERSCOPE Ultra``` ```Juno System``` ```timsTOF FleX``` ```Custom: Multiphoton``` ```CyTOF XT``` ```Helios``` ```EVOS M7000``` ```Aperio AT2``` ```Phenocycler-Fusion 2.0``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Observer 3``` ```NanoZoomer-SQ``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```DM6 B``` ```MoticEasyScan One``` ```In-House``` ```NextSeq 500``` ```BZ-X710``` ```QTRAP 5500``` ```DMi8``` ```NextSeq 550``` ```HiSeq 2500``` ```HiSeq 4000``` ```NovaSeq 6000``` ```Opera Phenix Plus HCS``` ```SYNAPT G2-Si``` ```Q Exactive HF``` ```Orbitrap Fusion Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Q Exactive``` ```VS200 Slide Scanner``` ```Not applicable``` | +| source_storage_duration_value | | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | +| source_storage_duration_unit | | The unit of measurement used to specify the source storage duration value. Example: hour | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_value | | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. 
Example: 10 | | +| time_since_acquisition_instrument_calibration_unit | | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | ```month``` ```day``` ```year``` | +| preparation_protocol_doi | | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | | +| contributors_path | | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | +| data_path | | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as ".", whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named "TEST001-RK", use the syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2", where "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | +| parent_sample_id | | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. 
If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | +| mapped_area_value | | The mapped area value, which refers to the specific area covered or captured in various assays. For Visium, it is the area of spots covered by tissue within the captured area, excluding the total possible captured area. For GeoMx, it refers to the area of the AOI being captured. In HiFi, it is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, it indicates the area of the FOV (also known as ROI) region being captured. For Xenium, it is the total area of the FOV regions (also known as ROI) being captured. For Stereo-Seq, this value represents the number of beads. Example: 42.25 | | +| mapped_area_unit | | The unit of measurement for the mapped area value. If mapping area is not specified, this field may be left blank. Example: um^2 | ```um^2``` ```mm^2``` | +| target_retrieval_incubation_temperature | | The incubation temperature required for target retrieval, which is typically 100 degrees Celsius for RNA assays and 80 degrees Celsius for protein assays. Example: 100 | | +| target_retrieval_incubation_time_value | | The duration for which a sample is exposed to a target retrieval solution. Example: 15 | | +| target_retrieval_incubation_time_unit | | The unit of measurement for the target retrieval incubation time value. If no incubation time is specified, this field may be left blank. Example: minute | ```minute``` | +| proteinasek_concentration | | The concentration of the enzyme Proteinase K within a sample, measured in micrograms per milliliter (ug/ml). Example: 10 | | +| proteinasek_incubation_time_value | | The duration for which a sample is incubated with Proteinase K. Example: 15 | | +| proteinasek_incubation_time_unit | | The unit of measurement for the proteinaseK incubation time value. 
If no incubation time is specified, this field may be left blank. Example: minute | ```minute``` | +| probe_hybridization_time_value | | The duration for which the oligo-conjugated RNA or oligo-conjugated antibody probes were hybridized with the sample. Example: 30 | | +| probe_hybridization_time_unit | | The unit of measurement for the probe hybridization time value. If the hybridization time is not specified, this field may be left blank. Example: minute | ```hour``` ```minute``` | +| is_custom_probes_used | | Indicates whether custom RNA or antibody probes were utilized in the assay. If custom probes were employed, they should be documented in the "custom_probe_set.csv" file. Example: No | | +| number_of_panel_targets | | The number of panel targets, which refers to the total count of genes, RNA isoforms, or RNA regions that are targeted by probes. Example: 1000 | | +| anatomical_structure_label | | The label for the overarching anatomical structure. If the anatomical structure is not applicable or not specified, this field may be left blank. Example: Kidney | | +| anatomical_structure_id | | The ontology ID associated with the anatomical structure, typically represented by an UBERON ID. Example: UBERON:0002113 | | +| non_global_files | | Specifies a semicolon-separated list of non-global files that are to be included in the dataset. The file paths assume that the files are located in the "TOP/non-global/" directory. For instance, if the file is located at TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value for this field would be "./lab_processed/images/1-tissue-boundary.geojson". Once ingested, these files will be copied to their appropriate locations within the respective dataset directory tree. This field is intended for internal HuBMAP processing. 
Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee Example: ./lab_processed/images/1-tissue-boundary.geojson | | +| metadata_schema_id | | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | + + diff --git a/docs/assays/metadata/testing/converted/thicksectionmultiphotonmxif.md b/docs/assays/metadata/testing/converted/thicksectionmultiphotonmxif.md new file mode 100644 index 0000000..fc7b8bf --- /dev/null +++ b/docs/assays/metadata/testing/converted/thicksectionmultiphotonmxif.md @@ -0,0 +1,37 @@ +--- +layout: page-triary +--- + +# Thick Section Multiphoton MxIF Metadata Attributes + +Fields that are collected for Thick Section Multiphoton MxIF data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| source_storage_duration_value | | How long the source material (parent) was stored prior to this sample being processed. | | +| time_since_acquisition_instrument_calibration_value | | The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture. | | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. 
For instance, if the data is within a directory called "TEST001-RK" use syntax "/TEST001-RK/" for this field. If there are multiple directory levels, use the format "/TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| is_image_preprocessing_required | | Indicates whether any level of preprocessing was required to assemble the image (e.g., fusing image tiles). This depends on whether the acquisition instrument was a microscope, slide scanner, etc. | ```Yes``` ```No``` | +| slide_id | | A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers. | | +| tiled_image_columns | | The number of columns used in stitching. This is sometimes referred to as the grid size x. | | +| tiled_image_count | | The total number of raw (tiled) images captured that are to be stitched together. | | +| intended_tile_overlap_percentage | | The amount of overlap between tiled images. This is the set point; during image acquisition there will be slight variations due to stage registration. | | +| dataset_type | | The specific type of dataset being produced. 
| ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```Chromatin``` ```DNA``` ```DNA + RNA``` ```Endogenous fluorophores``` ```Fluorochrome``` ```Lipid``` ```Metabolite``` ```Nucleic acid and protein``` ```Peptide``` ```Polysaccharide``` ```Protein``` ```RNA``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | ```Akoya Biosciences``` ```Andor``` ```BGI Genomics``` ```Bruker``` ```Cytiva``` ```Evident Scientific (Olympus)``` ```GE Healthcare``` ```Hamamatsu``` ```Huron Digital Pathology``` ```Illumina``` ```In-House``` ```Ionpath``` ```Keyence``` ```Leica Biosystems``` ```Leica Microsystems``` ```Motic``` ```NanoString``` ```Resolve Biosciences``` ```Sciex``` ```Standard BioTools (Fluidigm)``` ```Thermo Fisher Scientific``` ```Zeiss Microscopy``` | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. 
| ```Aperio AT2``` ```Aperio CS2``` ```Axio Observer 3``` ```Axio Observer 5``` ```Axio Observer 7``` ```Axio Scan.Z1``` ```BZ-X710``` ```BZ-X800``` ```BZ-X810``` ```CosMx Spatial Molecular Imager``` ```Custom: Multiphoton``` ```Digital Spatial Profiler``` ```DM6 B``` ```DNBSEQ-T7``` ```EVOS M7000``` ```HiSeq 2500``` ```HiSeq 4000``` ```Hyperion Imaging System``` ```IN Cell Analyzer 2200``` ```Lightsheet 7``` ```MALDI timsTOF Flex Prototype``` ```MIBIscope``` ```MoticEasyScan One``` ```NanoZoomer 2.0-HT``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```NanoZoomer-SQ``` ```NextSeq 2000``` ```NextSeq 500``` ```NextSeq 550``` ```NovaSeq 6000``` ```NovaSeq X``` ```NovaSeq X Plus``` ```Orbitrap Eclipse Tribrid``` ```Orbitrap Fusion Lumos Tribrid``` ```Phenocycler-Fusion 1.0``` ```Phenocycler-Fusion 2.0``` ```PhenoImager Fusion``` ```Q Exactive``` ```Q Exactive HF``` ```Q Exactive UHMR``` ```QTRAP 5500``` ```Resolve Biosciences Molecular Cartography``` ```SCN400``` ```STELLARIS 5``` ```TissueScope LE Slide Scanner``` ```Unknown``` ```VS200 Slide Scanner``` ```Xenium Analyzer``` ```Zyla 4.2 sCMOS``` | +| source_storage_duration_unit | | The time duration unit of measurement | ```hour``` ```month``` ```day``` ```minute``` ```year``` | +| time_since_acquisition_instrument_calibration_unit | | The time unit of measurement | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| tile_configuration | | This is how the tiles are configured for stitching. | ```Column-by-column``` ```Not applicable``` ```Row-by-row``` ```Snake-by-columns``` ```Snake-by-rows``` | +| scan_direction | | This is the direction of imaging, which is required for stitching. 
| ```Left-and-down``` ```Left-and-up``` ```Not applicable``` ```Right-and-down``` ```Right-and-up``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay ("Yes" or "No"). The CODEX analyte is protein. | ```Yes``` ```No``` | +| antibodies_path | | This is the location of the antibodies.tsv file relative to the root of the top level of the upload directory structure. This path should begin with "." and would likely be something like "./extras/antibodies.tsv". | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. 
Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | + + diff --git a/docs/assays/metadata/testing/converted/visium-hd.md b/docs/assays/metadata/testing/converted/visium-hd.md new file mode 100644 index 0000000..9d19d96 --- /dev/null +++ b/docs/assays/metadata/testing/converted/visium-hd.md @@ -0,0 +1,35 @@ +--- +layout: page-triary +--- + +# Visium HD Metadata Attributes + +Fields that are collected for Visium HD data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| lab_id | | An internal field that labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., "Visium_9OLC_A4_S1") or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This field will not be accessible to anyone outside of the consortium and no effort will be made to check if IDs provided by one data provider are also used by another. | | +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and end with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| dataset_type | | The specific type of dataset being produced. 
| ```Visium HD``` ```4i``` ```LC-MS``` ```Thick section Multiphoton MxIF``` ```Light Sheet``` ```ATACseq``` ```Resolve``` ```HiFi-Slide``` ```MPLEx``` ```10X Multiome``` ```MALDI``` ```Histology``` ```Cell DIVE``` ```FACS``` ```MS Lipidomics``` ```Visium (no probes)``` ```MUSIC``` ```RNAseq``` ```GeoMx (NGS)``` ```GeoMx (nCounter)``` ```RNAseq (with probes)``` ```Singular Genomics G4X``` ```Molecular Cartography``` ```CosMx Transcriptomics``` ```MERFISH``` ```Pixel-seqV2``` ```2D Imaging Mass Cytometry``` ```Confocal``` ```seqFISH``` ```DART-FISH``` ```MIBI``` ```Olink``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```DESI``` ```Xenium``` ```CyCIF``` ```SNARE-seq2``` ```nanoSPLITS``` ```Stereo-seq``` ```Visium (with probes)``` ```SIMS``` ```Auto-fluorescence``` ```CyTOF``` ```CosMx Proteomics``` ```DBiT-seq``` ```PhenoCycler``` ```CODEX``` ```Second Harmonic Generation (SHG)``` ```Seq-Scope``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. 
For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, this is the area of the FOV (aka ROI) region being captured. For Xenium this is the total area of the FOV regions (aka ROI) being captured. For Stereo-Seq this is the number of beads. | | +| mapped_area_unit | | The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` | +| spot_size_value | | For assays where spots are used to define discrete capture areas, this is the area of a spot. | | +| spot_size_unit | | The unit for spot size value. | ```um^2``` ```mm^2``` | +| number_of_spots | | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | | +| spot_spacing_value | | Approximate center-to-center distance between capture spots. Synonyms: Inter-Spot distance, Spot resolution, Pit size | | +| spot_spacing_unit | | Units corresponding to inter-spot distance | ```um``` | +| capture_area_id | | Which capture area on the slide was used. For Visium this would be [A1, B1, C1, D1]. For HiFi this would be the lane on the flowcell. | | +| permeabilization_time_value | | Permeabilization time used for this tissue section. | | +| permeabilization_time_unit | | The unit for the permeabilization time. | ```minute``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. 
If an assay comes from multiple parent samples then this should be a comma separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```Thermo Fisher Scientific``` ```SunChrom``` ```Leica Biosystems``` ```Roche Diagnostics``` ```In-House``` ```Not applicable``` ```Hamamatsu``` ```HTX Technologies``` ```10x Genomics``` | +| preparation_instrument_model | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```ST5020 Multistainer``` ```Visium CytAssist``` ```SunCollect Sprayer``` ```Chromium X``` ```Chromium iX``` ```EVOS M7000``` ```NanoZoomer S210``` ```NanoZoomer S60``` ```NanoZoomer S360``` ```Discovery Ultra``` ```Sublimator``` ```Not applicable``` ```TM-Sprayer``` ```M5 Sprayer``` ```M3+ Sprayer``` ```Chromium Controller``` ```Chromium Connect``` | +| non_global_files | | A semicolon separated list of non-shared files to be included in the dataset. The path assumes the files are located in the "TOP/non-global/" directory. For example, if the file is TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be "./lab_processed/images/1-tissue-boundary.geojson". After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. 
Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee | | + + diff --git a/docs/assays/metadata/testing/converted/visiumwithprobes.md b/docs/assays/metadata/testing/converted/visiumwithprobes.md new file mode 100644 index 0000000..595a6d4 --- /dev/null +++ b/docs/assays/metadata/testing/converted/visiumwithprobes.md @@ -0,0 +1,33 @@ +--- +layout: page-triary +--- + +# Visium (with probes) Metadata Attributes + +Fields that are collected for Visium (with probes) data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| preparation_protocol_doi | | DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | +| dataset_type | | The specific type of dataset being produced. 
| ```10X Multiome``` ```2D Imaging Mass Cytometry``` ```ATACseq``` ```Auto-fluorescence``` ```Cell DIVE``` ```CODEX``` ```Confocal``` ```CosMx``` ```CyCIF``` ```DBiT``` ```DESI``` ```Enhanced Stimulated Raman Spectroscopy (SRS)``` ```GeoMx (nCounter)``` ```GeoMx (NGS)``` ```HiFi-Slide``` ```Histology``` ```LC-MS``` ```Light Sheet``` ```MALDI``` ```MERFISH``` ```MIBI``` ```Molecular Cartography``` ```MUSIC``` ```nanoSPLITS``` ```PhenoCycler``` ```Resolve``` ```RNAseq``` ```RNAseq (with probes)``` ```Second Harmonic Generation (SHG)``` ```SIMS``` ```SNARE-seq2``` ```Stereo-seq``` ```Thick section Multiphoton MxIF``` ```Visium (no probes)``` ```Visium (with probes)``` ```Xenium``` | +| contributors_path | | The path to the file with the ORCID IDs for all contributors of this dataset (e.g., "./extras/contributors.tsv" or "./contributors.tsv"). This is an internal metadata field that is just used for ingest. | | +| data_path | | The top level directory containing the raw and/or processed data. For a single dataset upload this might be "." where as for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called "TEST001-RK" use syntax "./TEST001-RK" for this field. If there are multiple directory levels, use the format "./TEST001-RK/Run1/Pass2" in which "Pass2" is the subdirectory where the single dataset's data is stored. This is an internal metadata field that is just used for ingest. | | +| mapped_area_value | | For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured. | | +| mapped_area_unit | | The unit of measurement for the mapping area. 
For Visium and GeoMx this is typically um^2. | ```um^2``` ```mm^2``` | +| spot_size_value | | For assays where spots are used to define discrete capture areas, this is the area of a spot. | | +| spot_size_unit | | The unit for spot size value. | ```um^2``` ```mm^2``` | +| number_of_spots | | Number of capture spots within the mapped area. For Visium this would be the number of spots covered by tissue, while it's the number of spots within ROIs for HiFi. | | +| spot_spacing_value | | Approximate center-to-center distance between capture spots. Synonyms: Inter-Spot distance, Spot resolution, Pit size | | +| spot_spacing_unit | | Units corresponding to inter-spot distance | ```um``` | +| capture_area_id | | Which capture area on the slide was used. For Visium this would be ```A1, B1, C1, D1```. For HiFi this would be the lane on the flowcell. | ```A1``` ```B1``` ```C1``` ```D1``` ```Lane 1``` ```Lane 2``` ```Lane 3``` ```Lane 4``` ```Lane 5``` ```Lane 6``` ```Lane 7``` ```Lane 8``` | +| permeabilization_time_value | | Permeabilization time used for this tissue section. | | +| permeabilization_time_unit | | The unit for the permeabilization time. | ```minute``` | +| metadata_schema_id | | The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | +| parent_sample_id | | Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for a RNAseq assay, the parent would be the suspension, whereas, for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma separated list. 
Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 | | +| preparation_instrument_vendor | | The manufacturer of the instrument used to prepare (staining/processing) the sample for the assay. If an automatic slide staining method was indicated this field should list the manufacturer of the instrument. | ```10x Genomics``` ```Hamamatsu``` ```HTX Technologies``` ```In-House``` ```Leica Biosystems``` ```Not applicable``` ```Roche Diagnostics``` ```SunChrom``` ```Thermo Fisher Scientific``` | +| preparation_instrument_model | | Manufacturers of a staining system instrument may offer various versions (models) of that instrument with different features. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | ```AutoStainer XL``` ```Chromium Connect``` ```Chromium Controller``` ```Chromium iX``` ```Chromium X``` ```Discovery Ultra``` ```EVOS M7000``` ```M3+ Sprayer``` ```M5 Sprayer``` ```NanoZoomer S210``` ```NanoZoomer S360``` ```NanoZoomer S60``` ```Not applicable``` ```ST5020 Multistainer``` ```Sublimator``` ```SunCollect Sprayer``` ```TM-Sprayer``` ```Visium CytAssist``` | + + diff --git a/docs/assays/metadata/testing/converted/wgs.md b/docs/assays/metadata/testing/converted/wgs.md new file mode 100644 index 0000000..ac5ab56 --- /dev/null +++ b/docs/assays/metadata/testing/converted/wgs.md @@ -0,0 +1,87 @@ +--- +layout: page-triary +--- + +# WGS Metadata Attributes + +Fields that are collected for WGS data, available at ```dataset.metadata.``` +  + +* indicates a required field + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| version | | Version of the schema to use when validating this metadata. | ```'1'``` | +| description | | Free-text description of this assay. | | +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. 
| | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'sequence'``` | +| assay_type | | The specific type of assay being executed. | ```'WGS'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'DNA'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. | | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| gdna_fragmentation_quality_assurance | | Is the gDNA integrity good enough for WGS? This is usually checked through running a gel. 
| ```'Pass'``` ```'Fail'``` | +| dna_assay_input_value | | Amount of DNA input into library preparation | | +| dna_assay_input_unit | | Units of DNA input into library preparation | ```'ug'``` | +| library_construction_method | | Describes the DNA library preparation kit: the modality of isolating gDNA, fragmentation, and generating sequencing libraries. | | +| library_construction_protocols_io_doi | | A link to the protocol document containing the library construction method (including version) that was used. | | +| library_layout | | State whether the library was generated for single-end or paired-end sequencing. | ```'single-end'``` ```'paired-end'``` | +| library_adapter_sequence | | The adapter sequence to be used for adapter trimming starting with the 5' end. (e.g., 5-ATCCTGAGAA) | | +| library_final_yield | | Total amount of library after final PCR amplification step | | +| library_final_yield_unit | | Total units of library after final PCR amplification step | ```'ng'``` | +| library_average_fragment_size | | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | +| sequencing_reagent_kit | | Reagent kit used for sequencing | | +| sequencing_read_format | | Slash-delimited list of the number of sequencing cycles for each read, for example Read1, i7 index, i5 index, and Read2. | | +| sequencing_read_percent_q30 | | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) | | +| sequencing_phix_percent | | Percent PhiX loaded to the run | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. 
| | + + + +  + +## Deprecated Attributes + +* indicates a field that was previously required + +| Attribute | Type | Description | Allowable Values | +|------|------|-------------|-------------------| +| donor_id | | HuBMAP Display ID of the donor of the assayed tissue. | | +| tissue_id | | HuBMAP Display ID of the assayed tissue. | | +| execution_datetime | | Start date and time of assay, typically a date-time stamped folder generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year, MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour with leading zeros, mm are the minutes with leading zeros. | | +| protocols_io_doi | | DOI for protocols.io referring to the protocol for this assay. | | +| operator | | Name of the person responsible for executing the assay. | | +| operator_email | | Email address for the operator. | | +| pi | | Name of the principal investigator responsible for the data. | | +| pi_email | | Email address for the principal investigator. | | +| assay_category | | Each assay is placed into one of the following 4 general categories: generation of images of microscopic entities, identification & quantitation of molecules by mass spectrometry, imaging mass spectrometry, and determination of nucleotide sequence. | ```'sequence'``` | +| assay_type | | The specific type of assay being executed. | ```'WGS'``` | +| analyte_class | | Analytes are the target molecules being measured with the assay. | ```'DNA'``` | +| is_targeted | | Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay. | ```'Yes'``` ```'No'``` | +| acquisition_instrument_vendor | | An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass. 
| | +| acquisition_instrument_model | | Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. | | +| gdna_fragmentation_quality_assurance | | Is the gDNA integrity good enough for WGS? This is usually checked through running a gel. | ```'Pass'``` ```'Fail'``` | +| dna_assay_input_value | | Amount of DNA input into library preparation | | +| dna_assay_input_unit | | Units of DNA input into library preparation | ```'ug'``` | +| library_construction_method | | Describes DNA library preparation kit. Modality of isolating gDNA, Fragmentation and generating sequencing libraries. | | +| library_construction_protocols_io_doi | | A link to the protocol document containing the library construction method (including version) that was used. | | +| library_layout | | State whether the library was generated for single-end or paired end sequencing. | ```'single-end'``` ```'paired-end'``` | +| library_adapter_sequence | | The adapter sequence to be used for adapter trimming starting with the 5' end. (eg. 5-ATCCTGAGAA) | | +| library_final_yield | | Total amount of library after final pcr amplification step | | +| library_final_yield_unit | | Total units of library after final pcr amplification step | ```'ng'``` | +| library_average_fragment_size | | Average size in basepairs (bp) of sequencing library fragments estimated via gel electrophoresis or bioanalyzer/tapestation. | | +| sequencing_reagent_kit | | Reagent kit used for sequencing | | +| sequencing_read_format | | Slash-delimited list of the number of sequencing cycles for, for example, Read1, i7 index, i5 index, and Read2. | | +| sequencing_read_percent_q30 | | Q30 is the weighted average of all the reads (e.g. # bases UMI * q30 UMI + # bases R2 * q30 R2 + ...) 
| | +| sequencing_phix_percent | | Percent PhiX loaded to the run | | +| contributors_path | | Relative path to file with ORCID IDs for contributors for this dataset. | | +| data_path | | Relative path to file or directory with instrument data. Downstream processing will depend on filename extension conventions. | | +
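The allowable values and the `execution_datetime` format listed in the WGS tables above lend themselves to a simple pre-submission check. The following is a minimal sketch, not the project's actual validator: the helper name `check_wgs_row` is hypothetical, and it assumes that the quotes shown around values such as ```'sequence'``` are rendering artifacts, so the stored value is the bare string.

```python
import re
from datetime import datetime

# Allowable values copied from the WGS tables above (quotes assumed to be
# rendering artifacts, so bare strings are compared).
ALLOWED = {
    "assay_category": {"sequence"},
    "assay_type": {"WGS"},
    "analyte_class": {"DNA"},
    "is_targeted": {"Yes", "No"},
    "gdna_fragmentation_quality_assurance": {"Pass", "Fail"},
    "dna_assay_input_unit": {"ug"},
    "library_layout": {"single-end", "paired-end"},
    "library_final_yield_unit": {"ng"},
}

# YYYY-MM-DD hh:mm with leading zeros, as the execution_datetime row describes.
DATETIME_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}")


def check_wgs_row(row):
    """Hypothetical helper: return a list of problems found in one metadata row."""
    problems = []
    for field, allowed in ALLOWED.items():
        value = row.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}: {value!r} not in {sorted(allowed)}")

    stamp = row.get("execution_datetime")
    if stamp is not None:
        valid = bool(DATETIME_SHAPE.fullmatch(stamp))
        if valid:
            try:
                # Rejects impossible dates such as month 13.
                datetime.strptime(stamp, "%Y-%m-%d %H:%M")
            except ValueError:
                valid = False
        if not valid:
            problems.append(f"execution_datetime: {stamp!r} is not YYYY-MM-DD hh:mm")
    return problems
```

Fields that are absent from the row are skipped rather than reported, because the tables above do not mark which attributes are required.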