Sequence string set


Sequences for DNA, RNA or protein molecules, plus some extra annotations. This corresponds to an XStringSet object from the Biostrings Bioconductor package. If the sequences are associated with quality scores, it corresponds instead to a QualityScaledXStringSet object.

Type: object

Type: string

The schema to use.

Type: array of object

Authors of this resource.

Each item of this array must be:

Type: object

Type: string

Email of the author.

Must match regular expression: ^[^@]+@[^@]+$

Type: string

Name of the author.

Type: string

ORCID of the author.

Must match regular expression: ^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}$
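The two author patterns above can be checked with Python's re module. This is only a minimal sketch: the helper name check_author and the dictionary keys "email" and "orcid" are assumptions made for illustration, since this listing does not show the property names.

```python
import re

# Patterns copied verbatim from the author rules above.
EMAIL_RE = re.compile(r"^[^@]+@[^@]+$")
ORCID_RE = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}$")


def check_author(author):
    """Return a list of problems for one author entry (hypothetical helper)."""
    problems = []
    email = author.get("email")  # hypothetical key name
    if email is not None and not EMAIL_RE.search(email):
        problems.append("email does not match pattern")
    orcid = author.get("orcid")  # hypothetical key name
    if orcid is not None and not ORCID_RE.search(orcid):
        problems.append("orcid does not match pattern")
    return problems
```

An entry with a well-formed email and ORCID yields an empty list; each failing field adds one message.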

Type: string

Description of the resource.

Type: array of object

UCSC, Ensembl or other genome builds involved in generating this resource.

Each item of this array must be:

Type: object

Type: string

Identifier for this genome build.


Examples:

"mm10"
"NCBIm37"

Type: enum (of string)

Source of the genome build identifier.

Must be one of:

  • "Ensembl"
  • "UCSC"
  • "Wormbase"
  • "Flybase"
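
As a rough illustration of the genome build rules above, the sketch below checks one entry against the allowed sources. The key names "id" and "source" and the helper genome_is_valid are assumptions; only the four allowed values come from the list above.

```python
# Allowed values copied from the list above.
GENOME_SOURCES = {"Ensembl", "UCSC", "Wormbase", "Flybase"}


def genome_is_valid(entry):
    """Check one genome build entry (hypothetical helper and key names)."""
    has_id = isinstance(entry.get("id"), str)
    has_source = entry.get("source") in GENOME_SOURCES
    return has_id and has_source
```

For example, an entry pairing the identifier "mm10" with the source "UCSC" passes, while a source that is not in the list does not.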

Type: boolean
Default: false

Is this a child document, only to be interpreted in the context of the parent document from which it is linked? This may have implications for search and metadata requirements.

Type: array of object

Origins of this resource.

Each item of this array must be:


Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "PubMed"

Then:

Type: object

Type: string
Must match regular expression: ^[0-9]+$
Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "GEO"

Then:

Type: object

Type: string
Must match regular expression: ^GSE[0-9]+$
Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "ArrayExpress"

Then:

Type: object

Type: string
Must match regular expression: ^E-MTAB-[0-9]+$
Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "DOI"

Then:

Type: object

Type: string
Must match regular expression: ^[0-9a-zA-Z\._-]+/[0-9a-zA-Z\._-]+$
Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "URI"

Then:

Type: object

Type: string
Must match regular expression: ^(http|ftp|https|s3|sftp)://

Type: string

Identifier for the resource within the specified source.

Type: enum (of string)

Source database or repository.

Must be one of:

  • "PubMed"
  • "GEO"
  • "ArrayExpress"
  • "DOI"
  • "URI"
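
The conditional rules for origins pair each source with an identifier pattern. The following is a minimal sketch of that lookup, not the schema's own implementation; the names ORIGIN_ID_PATTERNS and origin_is_valid are hypothetical, and the patterns are copied from the rules above.

```python
import re

# Identifier pattern for each allowed source, as listed in the rules above.
ORIGIN_ID_PATTERNS = {
    "PubMed": r"^[0-9]+$",
    "GEO": r"^GSE[0-9]+$",
    "ArrayExpress": r"^E-MTAB-[0-9]+$",
    "DOI": r"^[0-9a-zA-Z\._-]+/[0-9a-zA-Z\._-]+$",
    "URI": r"^(http|ftp|https|s3|sftp)://",
}


def origin_is_valid(source, identifier):
    """Return True if the identifier fits the pattern for its source."""
    pattern = ORIGIN_ID_PATTERNS.get(source)
    if pattern is None:
        # Source is not one of the five allowed values.
        return False
    # search() rather than fullmatch(): the URI pattern only anchors the start.
    return re.search(pattern, identifier) is not None
```

Note that the GEO pattern accepts only series accessions (GSE) and the ArrayExpress pattern accepts only E-MTAB accessions.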

Type: string

Path to the file in the project directory.

Type: object
No Additional Properties

Type: boolean
Default: false

Whether the sequences were named. If false, placeholder names are created in the sequence file at sequence_file; these should be ignored for further processing.

Type: object

A list containing additional annotations for the object. Omitted if no annotations are present.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: object

A data frame containing additional annotations for each sequence. Each row corresponds to a sequence in the FASTA/FASTQ file. Omitted if no annotations are present.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: object

Biological sequences in FASTA or FASTQ format. The nature of the sequences and the presence and type of quality scores are determined from the sequence_file's own metadata.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: array of integer

Each item of this array must be:

Type: integer

NCBI taxonomy IDs of the species involved in this resource.

Type: array of object

Terms from a controlled vocabulary, used to annotate this resource in a machine-readable manner.

Each item of this array must be:


No Additional Properties

Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "Experimental Factor Ontology"

Then:

Type: object

Type: string
Must match regular expression: ^EFO:[0-9]{7}$
Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "Human Disease Ontology"

Then:

Type: object

Type: string
Must match regular expression: ^DOID:[0-9]+$
Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "Cell Ontology"

Then:

Type: object

Type: string
Must match regular expression: ^CL:[0-9]{7}$
Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Type: object

Type: const
Specific value: "UBERON"

Then:

Type: object

Type: string
Must match regular expression: ^UBERON:[0-9]{7}$

Type: string

Identifier for the term.


Examples:

"EFO:0008913"
"DOID:13250"
"CL:0000097"
"UBERON:0005870"

Type: enum (of string)

Name of the vocabulary or ontology that is the source for this term.

Must be one of:

  • "Experimental Factor Ontology"
  • "Human Disease Ontology"
  • "Cell Ontology"
  • "UBERON"

Type: string

Version of the vocabulary.
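
The term rules above tie each vocabulary to an identifier format. Below is a minimal sketch of that check, using hypothetical names (TERM_ID_PATTERNS, term_is_valid). It assumes the UBERON rule is meant as a regular expression like the other three, which the example "UBERON:0005870" suggests but which this listing does not confirm.

```python
import re

# Identifier pattern for each vocabulary, taken from the rules above.
TERM_ID_PATTERNS = {
    "Experimental Factor Ontology": r"^EFO:[0-9]{7}$",
    "Human Disease Ontology": r"^DOID:[0-9]+$",
    "Cell Ontology": r"^CL:[0-9]{7}$",
    # Assumption: treated as a pattern, not as a literal constant.
    "UBERON": r"^UBERON:[0-9]{7}$",
}


def term_is_valid(source, term_id):
    """Return True if term_id fits the pattern for the given vocabulary."""
    pattern = TERM_ID_PATTERNS.get(source)
    if pattern is None:
        # Vocabulary is not one of the four allowed values.
        return False
    return re.search(pattern, term_id) is not None
```

Under these assumptions, each of the four example identifiers above passes when paired with its own vocabulary and fails when paired with another.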

Type: string

Title of the resource.

Type: object

If the conditions under "If" are met, the conditions under "Then" must also be met. Otherwise, the "Else" conditions, if any, apply.

If:

Must not be:

Type: object

Type: const
Specific value: true

Then:

Type: object

The following properties are required:

  • title
  • description
  • authors
  • species
  • genome
  • origin
  • terms
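
The final rule makes the seven properties above mandatory unless the document is marked as a child. A minimal sketch of that logic follows; the key name "is_child" and the helper missing_required are assumptions, and a missing flag is treated as its documented default of false, which a strict JSON Schema validator may handle differently.

```python
# Property names copied from the required list above.
REQUIRED_UNLESS_CHILD = [
    "title", "description", "authors", "species",
    "genome", "origin", "terms",
]


def missing_required(metadata):
    """List required properties absent from a non-child document."""
    # Hypothetical key name; an absent flag is assumed to mean False.
    if metadata.get("is_child", False) is True:
        return []
    return [key for key in REQUIRED_UNLESS_CHILD if key not in metadata]
```

A child document therefore returns an empty list regardless of its contents, while a top-level document returns every property it still lacks.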