sparrow.im.flowsom

sparrow.im.flowsom(sdata, img_layer, output_layer_clusters, output_layer_metaclusters, channels=None, fraction=0.1, n_clusters=5, random_state=100, chunks=None, scale_factors=None, overwrite=False, **kwargs)

Applies flowsom clustering on image layer(s) of a SpatialData object.

This function runs the flowsom clustering algorithm (via fs.FlowSOM) on the image data held in a SpatialData object. The predicted clusters and metaclusters are added as labels layers to sdata.labels[output_layer_clusters] and sdata.labels[output_layer_metaclusters], respectively.

Parameters:
  • sdata (SpatialData) – The input SpatialData object.

  • img_layer (Union[str, Iterable[str]]) – The image layer(s) of sdata on which flowsom is run. It is recommended to preprocess the data with sp.im.pixel_clustering_preprocess.

  • output_layer_clusters (Union[str, Iterable[str]]) – The name(s) of the output labels layer(s) in sdata in which the predicted flowsom SOM clusters are saved.

  • output_layer_metaclusters (Union[str, Iterable[str]]) – The name(s) of the output labels layer(s) in sdata in which the predicted flowsom metaclusters are saved.

  • channels (Union[int, str, Iterable[int], Iterable[str], None] (default: None)) – Specifies the channels to be included in the pixel clustering.

  • fraction (float | None (default: 0.1)) – Fraction of the data to sample for training flowsom. Inference is run on all pixels in img_layer.

  • n_clusters (int (default: 5)) – The number of meta clusters to form.

  • random_state (int (default: 100)) – A random state for reproducibility of the clustering and sampling.

  • chunks (Union[str, tuple[int, ...], int, None] (default: None)) – Chunk sizes for processing. If provided as a tuple, it should contain chunk sizes for c, (z), y, x.

  • scale_factors (Optional[Sequence[Union[dict[str, int], int]]] (default: None)) – Scale factors to apply for multiscale.

  • overwrite (bool (default: False)) – If True, overwrites output_layer_clusters and/or output_layer_metaclusters if they already exist in sdata.

  • **kwargs – Additional keyword arguments passed to fs.FlowSOM.
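The parameters above can be put together in a call like the following. This is a minimal sketch, not an official example: it assumes the package is importable as `sparrow` (imported as `sp`, matching the `sp.im` prefix used elsewhere on this page), that `sdata` already holds a preprocessed image layer, and the layer names are hypothetical placeholders.

```python
def run_flowsom_sketch(sdata, img_layer="raw_image_preprocessed"):
    """Call sparrow.im.flowsom with the documented defaults spelled out.

    The layer names used here are hypothetical placeholders,
    not names that the library defines.
    """
    # Imported lazily so that defining this function does not require sparrow.
    # Assumption: the package is importable as `sparrow`.
    import sparrow as sp

    sdata, fsom, mapping = sp.im.flowsom(
        sdata,
        img_layer=img_layer,
        output_layer_clusters="flowsom_clusters",          # hypothetical name
        output_layer_metaclusters="flowsom_metaclusters",  # hypothetical name
        fraction=0.1,      # train on a 10% sample of the pixels
        n_clusters=5,      # number of metaclusters to form
        random_state=100,  # fixed seed for reproducible sampling and clustering
        overwrite=True,    # replace the output layers if they already exist
    )
    return sdata, fsom, mapping
```

All values except overwrite repeat the documented defaults, so they can be omitted in practice; they are written out here only to show where each parameter goes.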

Return type:

tuple[SpatialData, FlowSOM, Series]

Returns:

tuple:

  • The input sdata with the clustering results added.

  • FlowSOM object containing a MuData object and a trained fs.models.FlowSOMEstimator. The MuData object contains only the fraction of the data (set via the fraction parameter) that was sampled from img_layer to train the FlowSOM model.

  • A pandas Series object containing a mapping between the clusters and the metaclusters.
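The third return value can be used to translate SOM cluster IDs into metacluster IDs. The sketch below illustrates the idea with a plain dict standing in for the pandas Series so that it runs without dependencies. The IDs and the helper name `to_metaclusters` are made up for illustration, and whether the real Series is indexed with 0-based or 1-based IDs is not stated here and should be checked before combining it with the labels layers.

```python
# Hypothetical stand-in for the returned cluster -> metacluster mapping.
# Keys are SOM cluster IDs, values are metacluster IDs (invented values).
cluster_to_metacluster = {0: 2, 1: 0, 2: 2, 3: 1}


def to_metaclusters(cluster_ids, mapping):
    """Look up the metacluster for each SOM cluster ID."""
    return [mapping[c] for c in cluster_ids]


# Cluster IDs for five example pixels.
pixel_clusters = [3, 0, 0, 2, 1]
pixel_metaclusters = to_metaclusters(pixel_clusters, cluster_to_metacluster)
print(pixel_metaclusters)
```

Several clusters can share one metacluster (here clusters 0 and 2), which is why the mapping goes from clusters to metaclusters and not the other way around.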

See also

sparrow.im.pixel_clustering_preprocess

preprocess image layers before applying flowsom clustering.

Warning

  • The function is intended for use with spatial proteomics data. Input data should be appropriately preprocessed (e.g. via sp.im.pixel_clustering_preprocess) to ensure meaningful clustering results.

  • The cluster and metacluster IDs found in output_layer_clusters and output_layer_metaclusters count from 1, while they count from 0 in the FlowSOM object.
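The offset between the two numbering schemes can be handled with a pair of small helpers. This is a sketch only: the helper names are hypothetical, and treating 0 in a labels layer as "no label" is an assumption based on the usual labels-layer convention, not something this page states.

```python
BACKGROUND = 0  # assumption: 0 in a labels layer means "no label"


def label_to_flowsom_id(label_id):
    """Convert a 1-based ID from a labels layer to the 0-based FlowSOM ID."""
    if label_id == BACKGROUND:
        return None  # background has no counterpart in the FlowSOM object
    return label_id - 1


def flowsom_to_label_id(flowsom_id):
    """Convert a 0-based FlowSOM ID to the 1-based ID used in labels layers."""
    return flowsom_id + 1


# Round trip for the first three IDs.
for fs_id in range(3):
    assert label_to_flowsom_id(flowsom_to_label_id(fs_id)) == fs_id
```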