Below we present some comments and advice on using the ISA-TAB for Phenotyping configuration. You can download the current version of the configuration from Downloads.
How to use this configuration
The configuration is meant as a minimal set of fields that are necessary to describe the experiment and data according to MIAPPE. We propose it as a configuration that can be used to format data from phenotyping experiments, or as a starting point for building specific configurations that deal with different phenotyping situations. However, in the constructed data files and in the “derived” configurations, the names of fields should not be changed, and fields should not be removed even if they are not used. Specifically:
- At the “study” level there are two Characteristics[] fields, Organism and Infra-specific name, that should be used to describe the biological source. We propose to use Organism to define the taxonomic unit. Infra-specific name is to be used for any further descriptive attribute such as accession, variety, ecotype, line, etc. The field name Infra-specific name should not be changed, but its actual meaning in every situation should be resolvable from the corresponding Term Source REF pointing to a collection of names in a database.
- According to suggestions from users, there are two places where the organism part can be specified: Organism part at the “study” level (if the same plant tissue is used in all assays) and Material type at the “assay” level (if assays use different tissues). At least one of them should be used.
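The "at least one of them" rule above can be checked mechanically. The following is a minimal sketch, assuming the “study” and “assay” files are tab-separated with a header row; the header strings `Characteristics[Organism part]` and `Material Type` and the helper names are assumptions made for illustration and may differ from the headers in the actual configuration.

```python
import csv
import io

# Assumed header names (hypothetical); adjust to the headers used in your files.
ORGANISM_PART = "Characteristics[Organism part]"
MATERIAL_TYPE = "Material Type"


def read_tab(text):
    """Parse tab-separated text with a header row into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def column_is_filled(rows, column):
    """True if there is at least one row and every row has a non-empty value."""
    return bool(rows) and all((row.get(column) or "").strip() for row in rows)


def organism_part_is_specified(study_rows, assay_rows):
    """True if the organism part is given at the study or at the assay level."""
    return (column_is_filled(study_rows, ORGANISM_PART)
            or column_is_filled(assay_rows, MATERIAL_TYPE))


# Invented example data: the study leaves Organism part empty,
# but the assay specifies Material Type for every row.
study_rows = read_tab(
    "Source Name\tCharacteristics[Organism part]\n"
    "plant1\t\n"
    "plant2\t\n"
)
assay_rows = read_tab(
    "Sample Name\tMaterial Type\n"
    "sample1\tleaf\n"
    "sample2\troot\n"
)
print(organism_part_is_specified(study_rows, assay_rows))
```

The check only tests whether the fields are filled; it does not remove or rename any column, in line with the requirement that unused fields stay in the files.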
Two levels: “study” and “assay”
There are different possible ways of using the configuration in practice. The biosource specified at “study” level can mean a plant variety (ecotype, etc.) or a “variety x treatment” combination, if all assays within the study use the same combinations. If not, the treatments can be specified at “assay” level. Also, the “biosource” can mean different pools of material:
- plant population represented in different assays – possibly run in different experimental designs – by a subset of its samples,
- field plot represented in different assays by samples of plants or different plant parts,
- one plant represented in different assays by the same parts used for many measurements (non-destructive phenotyping) or different parts.
In every situation, all “assays” within one “study” can use only Sample names defined in this “study”. Of course, Sample names can be repeated in different “assays” and any “assay” can use just a subset of Sample names.
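The constraint on Sample names can be expressed as a subset check. The sketch below assumes the Sample names have already been extracted from the “study” and “assay” files into plain Python lists; the function name, the file names, and the example values are hypothetical.

```python
def undefined_samples(study_samples, assays):
    """Return, per assay, the Sample names that are not defined in the study.

    study_samples: iterable of Sample names defined at the study level.
    assays: dict mapping an assay identifier to the Sample names it uses.
    Repeated names and assays that use only a subset are both allowed.
    """
    defined = set(study_samples)
    problems = {}
    for assay_id, samples in assays.items():
        missing = sorted(set(samples) - defined)
        if missing:
            problems[assay_id] = missing
    return problems


# Invented example: the second assay repeats plot1 (allowed)
# and refers to plot4, which the study does not define.
study_samples = ["plot1", "plot2", "plot3"]
assays = {
    "a_height.txt": ["plot1", "plot2", "plot3"],
    "a_leaf_area.txt": ["plot1", "plot1", "plot4"],
}
print(undefined_samples(study_samples, assays))
# prints {'a_leaf_area.txt': ['plot4']}
```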
Derived Data File
The Derived Data File specified at “assay” level may be of any format that is fully described in the corresponding Protocol. If there is no description of the protocol, the Derived Data File should be a plain tab-separated text file in which the first column contains values from the Assay Name field in the “assay” file and the remaining columns give the observations of all traits obtained for each Assay Name. The observation columns should be named according to the Measurement Type column in the Trait Definition File (see below). The file is thus an “Assay Name x Trait” matrix of observations (quantitative or qualitative). The key field linking the “assay” file and the Derived Data File is Assay Name. The Derived Data File may be incomplete with respect to the “assay” file, that is, it may contain rows for only some of the Assay Names.
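To make the default layout concrete, here is a minimal sketch that reads such a tab-separated matrix and checks it against the “assay” file and the Trait Definition File. It assumes the Assay Names and Measurement Types have already been collected into sets; the function names and the example data are hypothetical and not part of the configuration.

```python
import csv
import io


def read_derived_data(text):
    """Read a tab-separated matrix with Assay Names in the first column.

    Returns (traits, observations): the trait column names, and a dict
    mapping each Assay Name to {trait name: observed value as text}.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    traits = header[1:]
    observations = {}
    for row in reader:
        if not row:  # skip blank lines
            continue
        observations[row[0]] = dict(zip(traits, row[1:]))
    return traits, observations


def check_derived_data(traits, observations, assay_names, measurement_types):
    """Return a list of error messages; an empty list means no problem found."""
    errors = []
    for name in observations:
        if name not in assay_names:
            errors.append(f"row '{name}' is not an Assay Name in the assay file")
    for trait in traits:
        if trait not in measurement_types:
            errors.append(f"column '{trait}' is not a defined Measurement Type")
    return errors


# Invented example: assay2 has no row, which is allowed.
derived_text = (
    "Assay Name\tplant height\tleaf colour\n"
    "assay1\t102.5\tgreen\n"
    "assay3\t98.0\tyellow\n"
)
assay_names = {"assay1", "assay2", "assay3"}
measurement_types = {"plant height", "leaf colour"}

traits, observations = read_derived_data(derived_text)
errors = check_derived_data(traits, observations, assay_names, measurement_types)
print(errors)
```

Values are kept as text because the matrix may mix quantitative and qualitative traits; converting a column to numbers is left to the consumer, who knows the trait's unit or scale.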
Trait definitions
All observed traits must be described in the Trait Definition File. The description consists of:
- Measurement Type: description of the trait, or its local name annotated to an external ontology,
- Technology Type: description of the measurement method, or its local name annotated to an external ontology,
- Unit Name: description of the units of measurement or the scale in which the observations are expressed; if possible, standard units and scales should be used, with a reference to existing ontologies; for a non-standard scale, a full explanation should be given in the Technology Type field.
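A completeness check for the three descriptors listed above might look as follows. This is a sketch under the assumption that the Trait Definition File is tab-separated with one row per trait and columns named exactly `Measurement Type`, `Technology Type` and `Unit Name`; the function name and the example rows are hypothetical.

```python
import csv
import io

REQUIRED_FIELDS = ("Measurement Type", "Technology Type", "Unit Name")


def incomplete_traits(text):
    """Return (line number, missing field names) for each incomplete trait row.

    Line numbers count the header as line 1, so the first trait is line 2.
    """
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    problems = []
    for line_no, row in enumerate(rows, start=2):
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            problems.append((line_no, missing))
    return problems


# Invented example: the second trait lacks a Unit Name.
trait_text = (
    "Measurement Type\tTechnology Type\tUnit Name\n"
    "plant height\truler measurement\tcm\n"
    "leaf colour\tvisual scoring\t\n"
)
print(incomplete_traits(trait_text))
# prints [(3, ['Unit Name'])]
```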