Geff specification

The graph exchange file format is zarr based. A graph is stored in a zarr group, which can have any name. This allows storing multiple geff graphs inside the same zarr root directory. A geff group is identified by the presence of a geff key in the .zattrs. Other geff metadata is also stored in the .zattrs file of the geff group, nested under the geff key. The geff group must contain a nodes group and an edges group (albeit both can be empty). geff graphs have the option to provide properties for nodes and edges.
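The identification rule above can be sketched with the standard library alone. This is a minimal illustration, assuming the zarr v2 on-disk layout (attributes in a `.zattrs` JSON file per group); the helper name `find_geff_groups` and the example directory names are hypothetical and not part of the geff package.

```python
import json
import tempfile
from pathlib import Path


def find_geff_groups(zarr_root: Path) -> list[Path]:
    """Return directories under zarr_root whose .zattrs has a top-level "geff" key."""
    groups = []
    for zattrs in sorted(zarr_root.rglob(".zattrs")):
        attrs = json.loads(zattrs.read_text())
        if isinstance(attrs, dict) and "geff" in attrs:
            groups.append(zattrs.parent)
    return groups


# Build a small throwaway directory tree to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "example.zarr"
    graph = root / "tracking_graph"
    (graph / "nodes").mkdir(parents=True)
    (graph / "edges").mkdir()
    (graph / ".zattrs").write_text(json.dumps({"geff": {"directed": True}}))
    other = root / "raw"
    other.mkdir()
    (other / ".zattrs").write_text(json.dumps({"note": "not a geff group"}))
    found = [p.name for p in find_geff_groups(root)]

print(found)
```

Only the group whose attributes contain the geff key is reported; the sibling group is ignored. A zarr v3 store keeps attributes elsewhere, so this sketch would not apply there unchanged.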

geff graphs have the option to provide time and spatial dimensions as special attributes. These attributes are specified in the axes section of the metadata, inspired by the OME-zarr axes specification.

Zarr specification

Currently, geff supports zarr specifications 2 and 3. However, geff will default to writing specification 2 because graphs written to the zarr v3 spec will not be compatible with all applications. When zarr 3 is more fully adopted by other libraries and tools, we will move to a zarr spec 3 default.

Geff metadata

GeffSchema

Type: object

root geff

geff_metadata

Type: object

geff_metadata

root geff geff_version

Geff Version

Type: string

Geff version string following semantic versioning (MAJOR.MINOR.PATCH), optionally with .devN and/or +local parts (e.g., 0.3.1.dev6+g61d5f18).
If not provided, the version will be set to the current geff package version.

Must match regular expression: ^\d+\.\d+(?:\.\d+)?(?:\.dev\d+)?(?:\+[a-zA-Z0-9]+)?
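As a quick illustration of the pattern above, the following sketch checks a few candidate strings with Python's `re` module. The pattern is copied from the schema; the function name `looks_like_geff_version` is hypothetical. Because the pattern is anchored only at the start, a prefix match (`re.match`) is used here rather than a full match.

```python
import re

# Pattern copied from the schema. It is anchored at the start only,
# so a prefix match mirrors how it is written.
GEFF_VERSION_PATTERN = re.compile(
    r"^\d+\.\d+(?:\.\d+)?(?:\.dev\d+)?(?:\+[a-zA-Z0-9]+)?"
)


def looks_like_geff_version(version: str) -> bool:
    """Return True if the string starts with a version accepted by the pattern."""
    return GEFF_VERSION_PATTERN.match(version) is not None


for candidate in ["0.3.1.dev6+g61d5f18", "0.4", "v0.4.0", "dev"]:
    print(candidate, looks_like_geff_version(candidate))
```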

root geff directed

Directed

Type: boolean

True if the graph is directed, otherwise False.

root geff axes

Axes

Default: null

Optional list of Axis objects defining the axes of each node in the graph.
Each object's name must be an existing attribute on the nodes. The optional type key must be one of space, time or channel, though readers may not use this information. Each axis can additionally define an optional unit key, which should match the valid OME-Zarr units, and min and max keys to define the range of the axis.

root geff axes anyOf item 0

Type: array
No Additional Items

Each item of this array must be:

root geff axes anyOf item 0 Axis

Axis

Type: object

root geff axes anyOf item 0 Axis name

Name

Type: string

root geff axes anyOf item 0 Axis type

Type

Default: null

Any of

Option 1
Option 2

root geff axes anyOf item 0 Axis type anyOf item 0

Type: string

root geff axes anyOf item 0 Axis type anyOf item 1

Type: null

root geff axes anyOf item 0 Axis unit

Unit

Default: null

Any of

Option 1
Option 2

root geff axes anyOf item 0 Axis unit anyOf item 0

Type: string

root geff axes anyOf item 0 Axis unit anyOf item 1

Type: null

root geff axes anyOf item 0 Axis min

Min

Default: null

Any of

Option 1
Option 2

root geff axes anyOf item 0 Axis min anyOf item 0

Type: number

root geff axes anyOf item 0 Axis min anyOf item 1

Type: null

root geff axes anyOf item 0 Axis max

Max

Default: null

Any of

Option 1
Option 2

root geff axes anyOf item 0 Axis max anyOf item 0

Type: number

root geff axes anyOf item 0 Axis max anyOf item 1

Type: null

root geff axes anyOf item 1

Type: null

root geff node_props_metadata

Node Props Metadata

Default: null

Metadata for node properties. The keys are the property identifiers, and the values are PropMetadata objects describing the properties.

Any of

Option 1
Option 2

root geff node_props_metadata anyOf item 0

Type: object

Each additional property must conform to the following schema

root geff node_props_metadata anyOf item 0 PropMetadata

PropMetadata

Type: object

Metadata describing a property in the geff graph.

root geff node_props_metadata anyOf item 0 PropMetadata identifier

Identifier

Type: string

root geff node_props_metadata anyOf item 0 PropMetadata dtype

Dtype

Type: string

root geff node_props_metadata anyOf item 0 PropMetadata encoding

Encoding

Default: null

Any of

Option 1
Option 2

root geff node_props_metadata anyOf item 0 PropMetadata encoding anyOf item 0

Type: string

root geff node_props_metadata anyOf item 0 PropMetadata encoding anyOf item 1

Type: null

root geff node_props_metadata anyOf item 0 PropMetadata unit

Unit

Default: null

Any of

Option 1
Option 2

root geff node_props_metadata anyOf item 0 PropMetadata unit anyOf item 0

Type: string

root geff node_props_metadata anyOf item 0 PropMetadata unit anyOf item 1

Type: null

root geff node_props_metadata anyOf item 0 PropMetadata name

Name

Default: null

Any of

Option 1
Option 2

root geff node_props_metadata anyOf item 0 PropMetadata name anyOf item 0

Type: string

root geff node_props_metadata anyOf item 0 PropMetadata name anyOf item 1

Type: null

root geff node_props_metadata anyOf item 0 PropMetadata description

Description

Default: null

Any of

Option 1
Option 2

root geff node_props_metadata anyOf item 0 PropMetadata description anyOf item 0

Type: string

root geff node_props_metadata anyOf item 0 PropMetadata description anyOf item 1

Type: null

root geff node_props_metadata anyOf item 1

Type: null

root geff edge_props_metadata

Edge Props Metadata

Default: null

Metadata for edge properties. The keys are the property identifiers, and the values are PropMetadata objects describing the properties.

Any of

Option 1
Option 2

root geff edge_props_metadata anyOf item 0

Type: object

Each additional property must conform to the following schema

root geff edge_props_metadata anyOf item 0 PropMetadata

PropMetadata

Type: object

Metadata describing a property in the geff graph.

Same definition as PropMetadata

root geff edge_props_metadata anyOf item 1

Type: null

root geff sphere

Node property: Detections as spheres

Default: null

Name of the optional `sphere` property.

A sphere is defined by
- a center point, already given by the `space` type properties
- a radius scalar, stored in this property

Any of

Option 1
Option 2

root geff sphere anyOf item 0

Type: string

root geff sphere anyOf item 1

Type: null

root geff ellipsoid

Node property: Detections as ellipsoids

Default: null

Name of the `ellipsoid` property.

An ellipsoid is assumed to be in the same coordinate system as the `space` type properties.

It is defined by
- a center point :math:`c`, already given by the `space` type properties
- a covariance matrix :math:`\Sigma`, symmetric and positive-definite, stored in this property as a `2x2`/`3x3` array.

To plot the ellipsoid:
- Compute the eigendecomposition of the covariance matrix :math:`\Sigma = Q \Lambda Q^{\top}`
- Sample points :math:`z` on the unit sphere
- Transform the points to the ellipsoid by :math:`x = c + Q \Lambda^{1/2} z`.
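The plotting steps above can be sketched for the 2D case, where the eigendecomposition of a symmetric 2x2 matrix has a closed form and needs only the standard library. The function name `ellipse_points` and the example values are hypothetical; the covariance entries are assumed to be ordered like the spatial axes. For 3x3 matrices a numerical eigensolver such as `numpy.linalg.eigh` would typically replace the closed form.

```python
import math


def ellipse_points(center, cov, n=64):
    """Sample n boundary points of the 2D ellipse x = c + Q sqrt(Lambda) z."""
    (a, b), (b2, d) = cov
    if not math.isclose(b, b2):
        raise ValueError("covariance matrix must be symmetric")
    # Closed-form eigendecomposition of a symmetric 2x2 matrix.
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lam1, lam2 = mean + radius, mean - radius
    if lam2 <= 0:
        raise ValueError("covariance matrix must be positive-definite")
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # angle of the major axis
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    s1, s2 = math.sqrt(lam1), math.sqrt(lam2)
    points = []
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        z0, z1 = math.cos(phi), math.sin(phi)  # point on the unit circle
        # Q = [[cos_t, -sin_t], [sin_t, cos_t]], Lambda^(1/2) = diag(s1, s2)
        x0 = center[0] + cos_t * s1 * z0 - sin_t * s2 * z1
        x1 = center[1] + sin_t * s1 * z0 + cos_t * s2 * z1
        points.append((x0, x1))
    return points


outline = ellipse_points(center=(10.0, 5.0), cov=[[4.0, 1.0], [1.0, 2.0]])
```

Every sampled point x satisfies (x - c)^T Sigma^{-1} (x - c) = 1, which is a convenient sanity check for any implementation of these steps.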

Any of

Option 1
Option 2

root geff ellipsoid anyOf item 0

Type: string

root geff ellipsoid anyOf item 1

Type: null

root geff track_node_props

Track Node Props

Default: null

Node properties denoting tracklet and/or lineage IDs.
A tracklet is defined as a simple path of connected nodes where the initiating node has any incoming degree and outgoing degree at most 1, the terminating node has incoming degree at most 1 and any outgoing degree, and all other nodes along the path have in/out degree of 1. Each tracklet must contain the maximal set of connected nodes that match this definition; sub-tracklets are not allowed.
A lineage is defined as a weakly connected component on the graph.
The dictionary can store one or both of 'tracklet' or 'lineage' keys.
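To make the lineage definition concrete, here is a minimal sketch that labels weakly connected components with a union-find structure. The function name `lineage_ids` is hypothetical, and using the smallest node ID of each component as its lineage label is an arbitrary choice for this example; geff does not prescribe how the IDs are assigned.

```python
def lineage_ids(node_ids, edges):
    """Map each node to a lineage label (its weakly connected component)."""
    parent = {n: n for n in node_ids}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for source, target in edges:
        # Edge direction is ignored: weak connectivity.
        root_s, root_t = find(source), find(target)
        if root_s != root_t:
            # Attach the larger root under the smaller one, so the
            # component root is always its smallest node ID.
            parent[max(root_s, root_t)] = min(root_s, root_t)

    return {n: find(n) for n in node_ids}


nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (2, 4), (5, 6)]
lineages = lineage_ids(nodes, edges)
print(lineages)  # {1: 1, 2: 1, 3: 1, 4: 1, 5: 5, 6: 5}
```

The resulting values could be stored as an ordinary node property and referenced from the 'lineage' key of this dictionary.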

Any of

Option 1
Option 2

root geff track_node_props anyOf item 0

Type: object

Each additional property must conform to the following schema

root geff track_node_props anyOf item 0 additionalProperties

Type: string

root geff track_node_props anyOf item 1

Type: null

root geff related_objects

Related Objects

Default: null

A list of dictionaries of related objects such as labels or images. Each dictionary must contain 'type' and 'path' properties, and may optionally contain a 'label_prop' property. The 'type' represents the data type. 'labels' and 'image' should be used for label and image objects, respectively. Other types are also allowed. The 'path' should be relative to the geff zarr-attributes file. It is strongly recommended that all related objects are stored as siblings of the geff group within the top-level zarr group. The 'label_prop' is only valid for type 'labels' and specifies the node property that will be used to identify the labels in the related object.
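The rules in this description can be sketched as a small check over plain dictionaries. The function name `check_related_objects` is hypothetical and is not the validation performed by the geff package; the key name `label_prop` follows the RelatedObject schema.

```python
def check_related_objects(related_objects):
    """Return a list of problems found in a related_objects list."""
    problems = []
    for i, obj in enumerate(related_objects):
        for key in ("type", "path"):
            if key not in obj:
                problems.append(f"item {i}: missing required key '{key}'")
        if obj.get("label_prop") is not None and obj.get("type") != "labels":
            problems.append(f"item {i}: 'label_prop' is only valid for type 'labels'")
    return problems


related = [
    {"type": "labels", "path": "../segmentation/", "label_prop": "seg_id"},
    {"type": "image", "path": "../raw/"},
    {"type": "image", "path": "../raw/", "label_prop": "seg_id"},
]
print(check_related_objects(related))
```

Only the third entry is reported, because it combines 'label_prop' with a type other than 'labels'.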

Any of

Option 1
Option 2

root geff related_objects anyOf item 0

Type: array
No Additional Items

Each item of this array must be:

root geff related_objects anyOf item 0 RelatedObject

RelatedObject

Type: object

root geff related_objects anyOf item 0 RelatedObject type

Type

Type: string

Type of the related object. 'labels' for label objects, 'image' for image objects. Other types are also allowed, but may not be recognized by reader applications.

root geff related_objects anyOf item 0 RelatedObject path

Path

Type: string

Path of the related object within the zarr group, relative to the geff zarr-attributes file. It is strongly recommended all related objects are stored as siblings of the geff group within the top-level zarr group.

root geff related_objects anyOf item 0 RelatedObject label_prop

Label Prop

Default: null

Property name for label objects. This is the node property that will be used to identify the labels in the related object. This is only valid for type 'labels'.

Any of

Option 1
Option 2

root geff related_objects anyOf item 0 RelatedObject label_prop anyOf item 0

Type: string

root geff related_objects anyOf item 0 RelatedObject label_prop anyOf item 1

Type: null

root geff related_objects anyOf item 1

Type: null

root geff affine

Default: null

Affine transformation matrix to transform the graph coordinates to the physical coordinates. The matrix is homogeneous, with shape (N + 1, N + 1), where N is the number of axes in the graph.

Any of

Affine
Option 2

root geff affine anyOf Affine

Affine

Type: object

Affine transformation class following scipy conventions.

Internally stores transformations as homogeneous coordinate matrices (N+1, N+1).
The transformation matrix follows scipy.ndimage.affine_transform convention
where the matrix maps output coordinates to input coordinates (inverse/pull transformation).

For a point p_out in output space, the corresponding input point p_in is computed as:
p_in_homo = matrix @ p_out_homo
where p_out_homo = [p_out; 1] and p_in = p_in_homo[:-1]

Attributes:
matrix: Homogeneous transformation matrix as list of lists (ndim+1, ndim+1)

root geff affine anyOf Affine matrix

Matrix

Type: object

Homogeneous transformation matrix as list of lists (ndim+1, ndim+1)

root geff affine anyOf item 1

Type: null

root geff display_hints

Default: null

Metadata indicating how spatiotemporal axes are displayed by a viewer

Any of

DisplayHint
Option 2

root geff display_hints anyOf DisplayHint

DisplayHint

Type: object

Metadata indicating how spatiotemporal axes are displayed by a viewer

root geff display_hints anyOf DisplayHint display_horizontal

Display Horizontal

Type: string

Which spatial axis to use for horizontal display

root geff display_hints anyOf DisplayHint display_vertical

Display Vertical

Type: string

Which spatial axis to use for vertical display

root geff display_hints anyOf DisplayHint display_depth

Display Depth

Default: null

Optional, which spatial axis to use for depth display

Any of

Option 1
Option 2

root geff display_hints anyOf DisplayHint display_depth anyOf item 0

Type: string

root geff display_hints anyOf DisplayHint display_depth anyOf item 1

Type: null

root geff display_hints anyOf DisplayHint display_time

Display Time

Default: null

Optional, which temporal axis to use for time

Any of

Option 1
Option 2

root geff display_hints anyOf DisplayHint display_time anyOf item 0

Type: string

root geff display_hints anyOf DisplayHint display_time anyOf item 1

Type: null

root geff display_hints anyOf item 1

Type: null

root geff extra

Extra

Type: object

Extra metadata that is not part of the schema

Note

The axes list is modeled after the OME-zarr specifications and is used to identify spatio-temporal properties on the graph nodes. If the same names are used in the axes metadata of the related image or segmentation data, applications can use this information to align graph node locations with image data.

geff.units.VALID_AXIS_TYPES `module-attribute`

VALID_AXIS_TYPES = ['space', 'time', 'channel']

geff.units.VALID_SPACE_UNITS `module-attribute`

VALID_SPACE_UNITS = [
    None,
    "angstrom",
    "attometer",
    "centimeter",
    "decimeter",
    "exameter",
    "femtometer",
    "foot",
    "gigameter",
    "hectometer",
    "inch",
    "kilometer",
    "megameter",
    "meter",
    "micrometer",
    "mile",
    "millimeter",
    "nanometer",
    "parsec",
    "petameter",
    "picometer",
    "terameter",
    "yard",
    "yoctometer",
    "yottameter",
    "zeptometer",
    "zettameter",
]

geff.units.VALID_TIME_UNITS `module-attribute`

VALID_TIME_UNITS = [
    None,
    "attosecond",
    "centisecond",
    "day",
    "decisecond",
    "exasecond",
    "femtosecond",
    "gigasecond",
    "hectosecond",
    "hour",
    "kilosecond",
    "megasecond",
    "microsecond",
    "millisecond",
    "minute",
    "nanosecond",
    "petasecond",
    "picosecond",
    "second",
    "terasecond",
    "yoctosecond",
    "yottasecond",
    "zeptosecond",
    "zettasecond",
]
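The following sketch shows how an axis entry might be checked against these lists. It is illustrative only: the function name `check_axis` is hypothetical, the unit lists are abbreviated excerpts of the full lists above, and the min/max ordering check is an assumption rather than a stated rule of the specification.

```python
VALID_AXIS_TYPES = ["space", "time", "channel"]
# Abbreviated excerpts of the full unit lists, for illustration only.
SPACE_UNITS_EXCERPT = [None, "nanometer", "micrometer", "millimeter", "meter"]
TIME_UNITS_EXCERPT = [None, "millisecond", "second", "minute", "hour"]


def check_axis(axis, node_prop_names):
    """Return a list of human-readable problems for one axis dictionary."""
    problems = []
    name = axis.get("name")
    if name not in node_prop_names:
        problems.append(f"axis {name!r} is not a node property")
    axis_type = axis.get("type")
    if axis_type is not None and axis_type not in VALID_AXIS_TYPES:
        problems.append(f"axis {name!r}: unknown type {axis_type!r}")
    unit = axis.get("unit")
    if axis_type == "space" and unit not in SPACE_UNITS_EXCERPT:
        problems.append(f"axis {name!r}: unit {unit!r} is not a listed space unit")
    if axis_type == "time" and unit not in TIME_UNITS_EXCERPT:
        problems.append(f"axis {name!r}: unit {unit!r} is not a listed time unit")
    low, high = axis.get("min"), axis.get("max")
    # Assumption: a declared range should be ordered.
    if low is not None and high is not None and low > high:
        problems.append(f"axis {name!r}: min {low} exceeds max {high}")
    return problems


props = {"t", "y", "x"}
ok = check_axis({"name": "t", "type": "time", "unit": "second", "min": 0, "max": 125}, props)
bad = check_axis({"name": "z", "type": "space", "unit": "microns"}, props)
print(ok)
print(bad)
```

Note that the unit names are singular ("second", "micrometer"), so plural or colloquial spellings are flagged by a check like this.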

Affine transformations

The optional affine field allows specifying a global affine transformation that maps the graph coordinates stored in the node properties to a physical coordinate system. Its matrix value is stored as an (N + 1) × (N + 1) homogeneous matrix following the scipy.ndimage.affine_transform convention, where N equals the number of spatio-temporal axes declared in axes.
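As a small worked example of the homogeneous-coordinate convention described for the Affine object, the sketch below multiplies a point, extended with a trailing 1, by an (N + 1) × (N + 1) matrix and drops the last component. The function name `apply_affine` and the example matrix are hypothetical; which side counts as input and which as output follows the scipy convention stated in the schema.

```python
def apply_affine(matrix, point):
    """Compute matrix @ [point; 1] and drop the homogeneous coordinate."""
    n = len(point)
    if len(matrix) != n + 1 or any(len(row) != n + 1 for row in matrix):
        raise ValueError(f"expected a {n + 1}x{n + 1} homogeneous matrix")
    homogeneous = list(point) + [1.0]
    result = [sum(m * p for m, p in zip(row, homogeneous)) for row in matrix]
    return result[:-1]


# Shift the first axis by 10 and scale the last two axes by 0.5.
matrix = [
    [1.0, 0.0, 0.0, 10.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
transformed = apply_affine(matrix, (2.0, 4.0, 8.0))
print(transformed)  # [12.0, 2.0, 4.0]
```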

Extra attributes

The optional extra object is a free-form dictionary that can hold any additional, application-specific metadata that is not covered by the core geff schema. Users may place arbitrary keys and values inside extra without fear of clashing with future reserved fields. Although the core geff reader makes these attributes available, their meaning and use are left entirely to downstream applications.

The `nodes` group

The nodes group will contain an ids array and optionally a props group.

The `ids` array

The nodes\ids array is a 1D array of node IDs of length N >= 0, where N is the number of nodes in the graph. Node ids must be unique. Node IDs can have any type supported by zarr (except floats), but we recommend integer dtypes. For large graphs, uint64 might be necessary to provide enough range for every node to have a unique ID. In the minimal case of an empty graph, the ids array will be present but empty.

The `props` group and `node property` groups

The nodes\props group is optional and will contain one or more node property groups, each with a values array and an optional missing array.

values arrays can be any zarr supported dtype, and can be N-dimensional. The first dimension of the values array must have the same length as the node ids array, such that each row of the property values array stores the property for the node at that index in the ids array.
The missing array is an optional, one-dimensional boolean array to support properties that are not present on all nodes. A 1 at an index in the missing array indicates that the value of that property for the node at that index is None, and the value in the values array at that index should be ignored. If the missing array is not present, all nodes have values for the property.
Geff provides special support for spatio-temporal properties, although they are not required. When axes are specified in the geff metadata, each axis name identifies a spatio-temporal property. Spatio-temporal properties are not allowed to have missing arrays. Otherwise, they are identical to other properties from a storage specification perspective.
The seg_id property is an optional, special node property that stores the segmentation label for each node. The seg_id values do not need to be unique, in case labels are repeated between time points. If the seg_id property is not present, it is assumed that the graph is not associated with a segmentation.
Geff provides special support for predefined shape properties, although they are not required. These currently include: sphere, ellipsoid. Values can be marked as missing, and a geff graph may contain multiple different shape properties. Units of shapes are assumed to be the same as the units on the spatial axes. Otherwise, shape properties are identical to other properties from a storage specification perspective.
- sphere: Hypersphere in n spatial dimensions, defined by a scalar radius.
- ellipsoid: Defined by a symmetric positive-definite covariance matrix, whose dimensionality is assumed to match the spatial axes.

Note

When writing a graph with missing properties to the geff format, you must fill in a dummy value in the values array for the nodes that are missing the property, in order to keep the indices aligned with the node ids.
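The note above can be illustrated with plain Python lists standing in for the values and missing arrays. The helper names `encode_property` and `decode_property` and the choice of 0.0 as the dummy value are hypothetical; any value of the right dtype would serve, since readers are expected to ignore it.

```python
def encode_property(values, fill_value=0.0):
    """Split optional per-node values into aligned `values` and `missing` lists."""
    filled = [fill_value if v is None else v for v in values]
    missing = [v is None for v in values]
    return filled, missing


def decode_property(filled, missing=None):
    """Restore None for entries flagged as missing; no mask means all present."""
    if missing is None:
        return list(filled)
    return [None if m else v for v, m in zip(filled, missing)]


radius = [2.5, None, 4.0]  # one entry per node, in the order of the ids array
values, missing = encode_property(radius)
print(values)   # [2.5, 0.0, 4.0]
print(missing)  # [False, True, False]
```

Both lists keep one entry per node, so index i always refers to the node at index i of the ids array.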

The `edges` group

Similar to the nodes group, the edges group will contain an ids array and an optional props group.

The `ids` array

The edges\ids array is a 2D array with the same dtype as the nodes\ids array. It has shape (E, 2), where E is the number of edges in the graph. If there are no edges in the graph, the edges group and ids array must still be present, with shape (0, 2). All elements in the edges\ids array must also be present in the nodes\ids array. Each row represents an edge between two nodes. For directed graphs, the first column holds the source nodes and the second column holds the target nodes. For undirected graphs, the order is arbitrary. Edges should be unique (no multiple edges between the same two nodes), and edges from a node to itself are not supported.
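The constraints in this paragraph can be sketched as a check over plain Python sequences. The function name `check_edges` is hypothetical, and one interpretation is assumed and should be treated as such: in a directed graph, (a, b) and (b, a) are counted as distinct edges, while in an undirected graph they are counted as duplicates.

```python
def check_edges(node_ids, edge_ids, directed):
    """Return a list of problems found in an (E, 2) list of edge endpoints."""
    problems = []
    known = set(node_ids)
    if len(known) != len(node_ids):
        problems.append("node ids are not unique")
    seen = set()
    for row, (source, target) in enumerate(edge_ids):
        if source not in known or target not in known:
            problems.append(f"row {row}: endpoint not present in node ids")
        if source == target:
            problems.append(f"row {row}: self-loop")
        # Assumption: direction distinguishes edges only in directed graphs.
        key = (source, target) if directed else frozenset((source, target))
        if key in seen:
            problems.append(f"row {row}: duplicate edge")
        seen.add(key)
    return problems


nodes = [1, 2, 3]
edges = [(1, 2), (2, 1), (3, 3), (1, 4)]
print(check_edges(nodes, edges, directed=False))
print(check_edges(nodes, edges, directed=True))
```

An empty edge list passes trivially, which matches the requirement that a graph without edges still stores an ids array of shape (0, 2).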

The `props` group and `edge property` groups

The edges\props group will contain zero or more edge property groups, each with a values array and an optional missing array.

values arrays can be any zarr supported dtype, and can be N-dimensional. The first dimension of the values array must have the same length as the edges\ids array, such that each row of the property values array stores the property for the edge at that index in the ids array.
The missing array is an optional, one-dimensional boolean array to support properties that are not present on all edges. A 1 at an index in the missing array indicates that the value of that property for the edge at that index is missing, and the value in the values array at that index should be ignored. If the missing array is not present, all edges have values for the property.

The edges\props group is optional. If you do not have any edge properties, it can be absent.

Example file structure and metadata

Here is a schematic of the expected file structure.

/path/to.zarr
    /tracking_graph
        .zattrs  # graph metadata with `geff_version`
        nodes/
            ids  # shape: (N,)  dtype: uint64
            props/
                t/
                    values # shape: (N,) dtype: uint16
                z/
                    values # shape: (N,) dtype: float32
                y/
                    values # shape: (N,) dtype: float32
                x/
                    values # shape: (N,) dtype: float32
                radius/
                    values # shape: (N,) dtype: int | float
                    missing # shape: (N,) dtype: bool
                covariance3d/
                    values # shape: (N, 3, 3) dtype: float
                    missing # shape: (N,) dtype: bool
                color/
                    values # shape: (N, 4) dtype: float16
                    missing # shape: (N,) dtype: bool
        edges/
            ids  # shape: (E, 2) dtype: uint64
            props/
                distance/
                    values # shape: (E,) dtype: float16
                score/
                    values # shape: (E,) dtype: float16
                    missing # shape: (E,) dtype: bool
    # optional:
    /segmentation 

    # unspecified, but totally okay:
    /raw

This is a geff metadata .zattrs file that matches the above example structure.

# /path/to.zarr/tracking_graph/.zattrs
# Annotated for readability: the `#` comments and the `...` placeholder are not valid JSON.
{
    "geff": {
        "directed": true,
        "geff_version": "0.1.3.dev4+gd5d1132.d20250616",
        "axes": [ # optional
            {"name": "t", "type": "time", "unit": "second", "min": 0, "max": 125},
            {"name": "z", "type": "space", "unit": "micrometer", "min": 1523.36, "max": 4398.1},
            {"name": "y", "type": "space", "unit": "micrometer", "min": 81.667, "max": 1877.7},
            {"name": "x", "type": "space", "unit": "micrometer", "min": 764.42, "max": 2152.3}
        ],
        # predefined node attributes for storing detections as spheres or ellipsoids
        "sphere": "radius", # optional
        "ellipsoid": "covariance3d", # optional
        "display_hints": {
            "display_horizontal": "x",
            "display_vertical": "y",
            "display_depth": "z",
            "display_time": "t"
        },
        # node attributes corresponding to tracklet and/or lineage IDs
        "track_node_props": {
            "lineage": "ultrack_lineage_id",
            "tracklet": "ultrack_id"
        },
        "related_objects": [
            {"type": "labels", "path": "../segmentation/", "label_prop": "seg_id"},
            {"type": "image", "path": "../raw/"}
        ],
        # optional coordinate transformation, stored in homogeneous coordinates
        # as a (D+1)x(D+1) matrix, where D is the number of axes
        "affine": {
            "matrix": [
                [1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0],
                [0, 0, 1, 0, 0],
                [0, 0, 0, 1, 0],
                [0, 0, 0, 0, 1]
            ]
        },
        # custom other things must be placed **inside** the extra attribute
        "extra": {
            ...
        }
    }
}
