sparrow.im.pixel_clustering_preprocess

sparrow.im.pixel_clustering_preprocess(sdata, img_layer, output_layer, channels=None, q=99, q_sum=5, q_post=99.9, sigma=2, norm_sum=True, chunks=None, scale_factors=None, overwrite=False)

Preprocess the image layers specified in img_layer. The images are normalized and blurred according to the quantile and Gaussian blur parameters described below. The results are added to sdata as specified in output_layer.

This function is designed specifically for preparing images before using sp.im.flowsom.

Parameters:
  • sdata (SpatialData) – The SpatialData object containing the image data.

  • img_layer (Union[str, Iterable[str]]) – The image layer(s) from sdata to process. This can be a single layer or a list of layers, e.g., when multiple fields of view are available.

  • output_layer (Union[str, Iterable[str]]) – The preprocessed images are saved under this layer in sdata.

  • channels (Union[int, str, Iterable[int], Iterable[str], None] (default: None)) – Specifies the channels to be included in the processing.

  • q (float | None (default: 99)) – Quantile used for normalization. If specified, pixel values are normalized by this quantile across the specified channels. Each channel is normalized by its own calculated quantile.

  • q_sum (float | None (default: 5)) – If the sum of the channel values at a pixel is below this quantile, the pixel values across all channels are set to NaN.

  • q_post (float (default: 99.9)) – Quantile used for normalization after other preprocessing steps (q, q_sum, norm_sum normalization and Gaussian blurring) are performed. If specified, pixel values are normalized by this quantile across the specified channels. Each channel is normalized by its own calculated quantile.

  • sigma (Union[float, Iterable[float], None] (default: 2)) – Gaussian blur parameter for each channel. Use 0 to omit blurring for specific channels or None to skip blurring altogether.

  • norm_sum (bool (default: True)) – If True, each channel is normalized by the sum of all channels at each pixel.

  • chunks (Union[str, tuple[int, ...], int, None] (default: None)) – Chunk sizes for processing. If provided as a tuple, it should contain chunk sizes for c, (z), y, x.

  • scale_factors (Optional[Sequence[Union[dict[str, int], int]]] (default: None)) – Scale factors to apply for multiscale.

  • overwrite (bool (default: False)) – If True, overwrites existing data in output_layer.
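
The normalization parameters above can be illustrated with a small NumPy sketch. This is not the library's implementation: the function name normalize_sketch is hypothetical, the step order (q, then q_sum, then norm_sum) is assumed from the order in which q_post lists the earlier steps, and it is an assumption that the q_sum quantile is computed on the already q-normalized image. Gaussian blurring (sigma) and the final q_post normalization are left out to keep the example short.

```python
import numpy as np


def normalize_sketch(img, q=99, q_sum=5, norm_sum=True):
    """Illustrative only: img is an array of shape (c, y, x)."""
    img = img.astype(float)  # astype returns a copy, so the input is untouched

    if q is not None:
        # Each channel is divided by its own q-th percentile.
        per_channel = np.percentile(img, q, axis=(1, 2), keepdims=True)
        img = img / per_channel

    if q_sum is not None:
        # Pixels whose channel sum falls below the q_sum-th percentile
        # of all channel sums are set to NaN in every channel.
        total = img.sum(axis=0)
        threshold = np.percentile(total, q_sum)
        img[:, total < threshold] = np.nan

    if norm_sum:
        # Each channel is divided by the sum over channels at that pixel.
        img = img / img.sum(axis=0, keepdims=True)

    return img


rng = np.random.default_rng(0)
example = rng.uniform(1.0, 10.0, size=(3, 8, 8))  # synthetic 3-channel image
result = normalize_sketch(example)
```

With norm_sum=True, the channel values at every retained pixel sum to 1, which is why the Notes below treat norm_sum and q_sum as a source of leakage between channels.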

Notes

To avoid data leakage:
  • In the single-FOV case (one image layer provided), set q_sum=None and norm_sum=False to prevent data leakage between channels. The only normalization performed is then a division by the q and q_post quantile values per channel.

  • In the multiple-FOV case (multiple image layers provided), set q_sum, norm_sum, q and q_post all to None to prevent data leakage both between channels and between images.
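
The two cases above can be written down as keyword arguments. The helper below is a hypothetical convenience, not part of sparrow; it only encodes the settings recommended in these notes, using the parameter names from the signature above.

```python
def leakage_safe_kwargs(img_layer):
    """Hypothetical helper: keyword arguments that follow the Notes above."""
    # A single string counts as one image layer (single FOV).
    layers = [img_layer] if isinstance(img_layer, str) else list(img_layer)

    if len(layers) == 1:
        # Single FOV: keep the per-channel q and q_post normalization,
        # drop the steps that mix information between channels.
        return {"q_sum": None, "norm_sum": False}

    # Multiple FOVs: values taken verbatim from the Notes, which ask for
    # all four parameters to be set to None.
    return {"q": None, "q_sum": None, "norm_sum": None, "q_post": None}


single = leakage_safe_kwargs("raw_image")
multi = leakage_safe_kwargs(["fov0", "fov1"])
```

The returned dictionary could then be unpacked into the call, for example sparrow.im.pixel_clustering_preprocess(sdata, img_layer, output_layer, **leakage_safe_kwargs(img_layer)). The layer names used here are placeholders.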

Return type:

SpatialData

Returns:

An updated SpatialData object with the preprocessed image data stored in the specified output_layer.

See also

sparrow.im.flowsom

flowsom pixel clustering on image layers.