HDF5 dense array


Dense array, saved in a HDF5 file as an N-dimensional dataset for positive N.

The dimensions reported in array.dimensions are be ordered from fastest-changing to slowest, i.e., the first entry of the array.dimensions property corresponds to the fastest-changing dimension. In the context of matrices, this implies a column-major layout where the first (faster) dimension corresponds to the rows and the second (slower) dimension corresponds to the columns. Note that this ordering is reversed in the dimensions of the HDF5 dataset, as the HDF5 library lists the dimensions from slowest-changing to fastest.

The file may also contain the dimnames of the array, stored in a separate HDF5 group. If present, the name of the group should be listed in the hdf5_dense_array.dimnames property.

For a signed integer dataset, missing values are represented by -2147483648.

A string dataset may contain a missing attribute. This should be a scalar string dataset that contains the string used to represent missing values. If no attribute exists, it is assumed that all strings are non-missing.

Derived from array/v1.json: some kind of multi-dimensional array, where we store metadata about the dimensions and type of data. The exact implementation of the array is left to concrete subclasses.

Type: object

Type: string

The schema to use.

Type: object
No Additional Properties

Type: array of integer

Dimensions of an n-dimensional array.

Must contain a minimum of 1 items

Each item of this array must be:

Type: enum (of string)

Type of data stored in this array.

Must be one of:

  • "boolean"
  • "number"
  • "integer"
  • "string"
  • "other"

Type: array of object

Authors of this resource.

Each item of this array must be:

Type: object

Type: string

Email of the author.

Must match regular expression: ^[^@]+@[^@]+$

Type: string

Name of the author.

Type: string

ORCID of the author.

Must match regular expression: ^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}$

Type: string

Description of the resource.

Type: array of object

UCSC, Ensembl or other genome builds involved in generating this resource.

Each item of this array must be:

Type: object

Type: string

Identifier for this genome build.


Examples:

"mm10"
"NCBIm37"

Type: enum (of string)

Source of the genome build identifier.

Must be one of:

  • "Ensembl"
  • "UCSC"
  • "Wormbase"
  • "Flybase"

Type: object
No Additional Properties

Type: string

Name of the dataset inside the HDF5 file that contains the array.

Type: string

Name of the HDF5 group containing the dimnames. This group should contain zero or one string datasets for each dimension. Each string dataset is numbered after a dimension of the dense array (ordered from fastest- to slowest-changing) and should have length equal to the extent of that dimension, i.e., the dataset named "0" should have length equal to the first entry of array.dimensions. If this property is not provided, it can be assumed that no dimnames are available. Each dataset should not contain any missing values, so each string can be interpreted as-is.

Type: boolean Default: false

Is this a child document, only to be interpreted in the context of the parent document from which it is linked? This may have implications for search and metadata requirements.

Type: string

MD5 checksum for the file.

Type: array of object

Origins of this resource.

Each item of this array must be:


Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "PubMed"
Type: object

Type: string
Must match regular expression: ^[0-9]+$
Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "GEO"
Type: object

Type: string
Must match regular expression: ^GSE[0-9]+$
Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "ArrayExpress"
Type: object

Type: string
Must match regular expression: ^E-MTAB-[0-9]+$
Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "DOI"
Type: object

Type: string
Must match regular expression: ^[0-9a-zA-Z\._-]+/[0-9a-zA-Z\._-]+$
Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "URI"
Type: object

Type: string
Must match regular expression: ^(http|ftp|https|s3|sftp)://

Type: string

Identifier for the resource in the specified type.

Type: enum (of string)

Source database or repository.

Must be one of:

  • "PubMed"
  • "GEO"
  • "ArrayExpress"
  • "DOI"
  • "URI"

Type: string

Path to the file in the project directory.

Type: array of integer

Each item of this array must be:

Type: integer

NCBI taxonomy IDs of the species involved in this resource.

Type: array of object

Terms from a controlled vocabulary, used to annotate this resource in a machine-readable manner.

Each item of this array must be:


No Additional Properties

Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "Experimental Factor Ontology"
Type: object

Type: object
Must match regular expression: ^EFO:[0-9]{7}$
Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "Human Disease Ontology"
Type: object

Type: object
Must match regular expression: ^DOID:[0-9]+$
Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "Cell Ontology"
Type: object

Type: object
Must match regular expression: ^CL:[0-9]{7}$
Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.

Type: object

Type: const
Specific value: "UBERON"
Type: object

Type: const
Specific value: "^UBERON:[0-9]{7}$"

Type: string

Identifier for the term.


Examples:

"EFO:0008913"
"DOID:13250"
"CL:0000097"
"UBERON:0005870"

Type: enum (of string)

Name of the vocabulary or ontology that is the source for this term.

Must be one of:

  • "Experimental Factor Ontology"
  • "Human Disease Ontology"
  • "Cell Ontology"
  • "UBERON"

Type: string

Version of the vocabulary.

Type: string

Title of the resource.

Type: object

If the conditions in the "If" tab are respected, then the conditions in the "Then" tab should be respected. Otherwise, the conditions in the "Else" tab should be respected.


Must not be:

Type: object

Type: const
Specific value: true
Type: object

The following properties are required:

  • title
  • description
  • authors
  • species
  • genome
  • origin
  • terms