sparrow.pl.analyse_genes_left_out
- sparrow.pl.analyse_genes_left_out(sdata, labels_layer, table_layer, points_layer='transcripts', to_coordinate_system='global', name_x='x', name_y='y', name_gene_column='gene', output=None)
Analyse and visualize the proportion of genes that could not be assigned to a cell during the allocation step.
- Parameters:
sdata (SpatialData) – Data containing spatial information for plotting.
labels_layer (str) – The layer in sdata that contains the segmentation masks. This layer is used to calculate the crd (region of interest) that was used in the segmentation step; otherwise, transcript counts in points_layer of sdata (containing all transcripts) and the counts obtained via sdata.tables[table_layer] are not comparable. It is also used to select the cells in sdata.tables[table_layer] that are linked to this labels_layer via the _REGION_KEY.
table_layer (str) – The table layer in sdata on which to perform analysis.
points_layer (str (default: 'transcripts')) – The layer in sdata containing transcript information.
to_coordinate_system (str (default: 'global')) – The coordinate system that holds labels_layer and points_layer.
name_x (str (default: 'x')) – The column name representing the x-coordinate in points_layer.
name_y (str (default: 'y')) – The column name representing the y-coordinate in points_layer.
name_gene_column (str (default: 'gene')) – The column name representing the gene name in points_layer.
output (Union[str, Path, None] (default: None)) – The path to save the generated plots. If None, plots will be shown directly using plt.show().
- Return type:
DataFrame
- Returns:
A DataFrame containing information about the proportion of transcripts kept for each gene, raw counts (i.e. obtained from points_layer of sdata), and the log of raw counts.
- Raises:
AttributeError – If the provided sdata does not contain the necessary attributes (i.e., ‘labels’ or ‘points’).
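The signature above can be exercised with a call along the following lines. This is only a sketch: the wrapper name `run_gene_loss_analysis` and the layer names `"segmentation_mask"` and `"table_transcriptomics"` are hypothetical placeholders, and it assumes the package is importable as `sparrow`. Replace the layer names with the ones actually present in your `sdata`.

```python
def run_gene_loss_analysis(sdata, output=None):
    """Call analyse_genes_left_out with every argument spelled out."""
    # Imported inside the function so the sketch loads without the package;
    # assumes the library is importable under the name `sparrow`.
    import sparrow as sp

    return sp.pl.analyse_genes_left_out(
        sdata,
        labels_layer="segmentation_mask",      # hypothetical layer name
        table_layer="table_transcriptomics",   # hypothetical layer name
        points_layer="transcripts",            # documented default
        to_coordinate_system="global",         # documented default
        name_x="x",                            # documented default
        name_y="y",                            # documented default
        name_gene_column="gene",               # documented default
        output=output,                         # None shows plots via plt.show()
    )
```

The returned DataFrame can then be inspected like any other, for example to look at the genes with the lowest proportion of transcripts kept.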
Notes
- This function produces two plots:
  - A scatter plot of the log of raw gene counts vs. the proportion of transcripts kept.
  - A regression plot for the same data with Pearson correlation coefficients.
- The function also prints the ten genes with the highest proportion of transcripts filtered out.
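To make the reported quantities concrete, the per-gene summary can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names `summarise_genes` and `pearson`, the toy counts, and the use of the natural logarithm are all assumptions (the documentation does not state the log base).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def summarise_genes(raw_counts, assigned_counts):
    """Per gene: raw count, proportion kept, and log of the raw count.

    raw_counts: transcripts per gene in the points layer (all transcripts).
    assigned_counts: transcripts per gene that ended up assigned to a cell.
    """
    rows = []
    for gene, raw in raw_counts.items():
        kept = assigned_counts.get(gene, 0)
        rows.append(
            {
                "gene": gene,
                "raw_counts": raw,
                "proportion_kept": kept / raw,
                "log_raw_counts": math.log(raw),  # natural log: an assumption
            }
        )
    # Lowest proportion kept first, i.e. most heavily filtered genes on top.
    rows.sort(key=lambda row: row["proportion_kept"])
    return rows


# Toy counts, invented purely for illustration.
raw = {"GeneA": 1000, "GeneB": 200, "GeneC": 50}
assigned = {"GeneA": 900, "GeneB": 150, "GeneC": 20}

rows = summarise_genes(raw, assigned)
most_filtered = rows[:10]  # analogous to the ten genes the function prints
r = pearson(
    [row["log_raw_counts"] for row in rows],
    [row["proportion_kept"] for row in rows],
)
```

Sorting by `proportion_kept` in ascending order puts the genes that lost the largest share of their transcripts first, which mirrors the printed list described in the notes.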
See also