sparrow.tb.flowsom


sparrow.tb.flowsom(sdata, labels_layer_cells, labels_layer_clusters, output_layer, q=0.999, chunks=None, n_clusters=20, index_names_var=None, index_positions_var=None, random_state=100, overwrite=False, **kwargs)

Prepares the data obtained from pixel clustering for cell clustering (see docstring of sp.tb.cell_clustering_preprocess) and then executes the FlowSOM clustering algorithm on the resulting table layer (output_layer) of the SpatialData object.

This function applies the FlowSOM clustering algorithm (via fs.FlowSOM) to the spatial data contained in a SpatialData object. The algorithm first trains a self-organizing map (SOM) on the data and then groups the SOM clusters into n_clusters metaclusters. The clustering results are added to a table layer in the sdata object.

Typically one would first process sdata via sp.im.pixel_clustering_preprocess and sp.im.flowsom before using this function.

Parameters:
  • sdata (SpatialData) – The input SpatialData object.

  • labels_layer_cells (Union[str, Iterable[str]]) – The labels layer(s) in sdata that contain cell segmentation masks. These masks should be previously generated using sp.im.segment. If a list of labels layers is provided, they will be clustered together (e.g. multiple samples).

  • labels_layer_clusters (Union[str, Iterable[str]]) – The labels layer(s) in sdata that contain metacluster or SOM cluster masks. These should be obtained via sp.im.flowsom.

  • output_layer (str) – The output table layer in sdata where results of the clustering and metaclustering will be stored.

  • q (float | None (default: 0.999)) – Quantile used for normalization. If specified, each pixel SOM/meta cluster column in output_layer is divided by its q-quantile before FlowSOM clustering, and the normalized values are then multiplied by 100. If None, no normalization is applied.

  • chunks (Union[str, tuple[int, ...], int, None] (default: None)) – Chunk sizes for processing the data. If provided as a tuple, it should give the chunk size for each dimension, in the order (z), y, x.

  • n_clusters (int (default: 20)) – The number of metaclusters to form from the self-organizing maps.

  • index_names_var (Optional[Iterable[str]] (default: None)) – Specifies the variable names to be used from sdata.tables[table_layer].var for clustering. If None, index_positions_var is used instead, provided it is not None.

  • index_positions_var (Optional[Iterable[int]] (default: None)) – Specifies the positions of variables to be used from sdata.tables[table_layer].var for clustering. Used if index_names_var is None.

  • random_state (int (default: 100)) – A random state for reproducibility of the clustering.

  • overwrite (bool (default: False)) – If True, overwrites the existing data in output_layer if it already exists.

  • **kwargs – Additional keyword arguments passed to the fs.FlowSOM clustering algorithm.
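The quantile normalization described for q can be illustrated with a small, library-free sketch. It is not the library's implementation: the helper names (quantile, normalize_columns), the linear-interpolation quantile, the handling of a zero quantile, and the example numbers are all assumptions made for illustration.

```python
import math


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty sequence (0 <= q <= 1)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def normalize_columns(rows, q=0.999):
    """Divide each column by its q-quantile, then multiply by 100.

    If q is None, the values are returned unchanged (no normalization).
    """
    if q is None:
        return [list(row) for row in rows]
    columns = list(zip(*rows))
    scales = [quantile(col, q) for col in columns]
    # Assumption: a column whose quantile is 0 is mapped to 0 to avoid division by zero.
    return [
        [100.0 * value / scale if scale else 0.0 for value, scale in zip(row, scales)]
        for row in rows
    ]


# Rows are cells, columns are per-cell pixel-cluster values (made-up numbers).
counts = [[0, 10], [5, 20], [10, 40]]
normalized = normalize_columns(counts, q=1.0)
print(normalized)
```

With q=1.0 each column is scaled by its maximum, so the largest value in every column becomes 100. A quantile slightly below 1, such as the default 0.999, makes the scaling less sensitive to a few extreme cells.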

Return type:

tuple[SpatialData, FlowSOM]

Returns:

A tuple containing:

  • The updated sdata with the clustering results added.

  • An instance of fs.FlowSOM containing the trained FlowSOM model.
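A call might look like the following minimal sketch, which uses only the parameters documented above. The helper name run_cell_flowsom and the three layer names are hypothetical placeholders, and sdata is assumed to already contain the cell segmentation masks and the pixel cluster masks.

```python
def run_cell_flowsom(sdata):
    """Cluster cells with sp.tb.flowsom and return (sdata, fitted FlowSOM model)."""
    # Imported lazily so this helper can be defined without sparrow installed.
    import sparrow as sp

    sdata, fsom = sp.tb.flowsom(
        sdata,
        labels_layer_cells="segmentation_mask",      # hypothetical layer name
        labels_layer_clusters="pixel_metaclusters",  # hypothetical layer name
        output_layer="table_cell_clustering",        # hypothetical layer name
        q=0.999,           # quantile normalization (documented default)
        n_clusters=20,     # number of metaclusters (documented default)
        random_state=100,  # documented default, kept for reproducibility
        overwrite=True,    # replace output_layer if it already exists
    )
    return sdata, fsom
```

The returned sdata holds the clustering results in the chosen output_layer, and the second element is the trained fs.FlowSOM instance.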

See also

sparrow.im.flowsom – FlowSOM pixel clustering.

sparrow.tb.cell_clustering_preprocess – prepares data for cell clustering.