minian.initialization module

minian.initialization.da_label(im)[source]

Label connected features in a 2d array.

Parameters

im (np.ndarray) – Input array.

Returns

label (np.ndarray) – Label array. Should have same shape as input im.
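
A minimal sketch of connected-feature labelling, assuming scipy.ndimage.label as the labelling routine (the text above does not state which routine da_label relies on). The example array is hypothetical.

```python
import numpy as np
from scipy import ndimage

# Two separate blobs of non-zero pixels in a small 2d array.
im = np.array(
    [
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
    ]
)

# ndimage.label returns (label_array, number_of_features);
# background stays 0 and each connected feature gets its own integer.
label, n_features = ndimage.label(im)
```

The label array has the same shape as im, and pixels within one blob share a label.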

minian.initialization.gmm_refine(varr, seeds, q=(0.1, 99.9), n_components=2, valid_components=1, mean_mask=True)[source]

Filter seeds by fitting a GMM to peak-to-peak values.

This function assumes that the distribution of peak-to-peak fluorescence values across all seeds can be modeled by a Gaussian Mixture Model (GMM) whose components have different means. It computes the peak-to-peak value for every seed, fits a GMM with n_components components to the distribution, and filters out the seeds belonging to the n_components - valid_components Gaussians with the lowest means.

Parameters
  • varr (xr.DataArray) – The input movie data. Should have dimensions “spatial” and “frame”.

  • seeds (pd.DataFrame) – The input over-complete set of seeds to be filtered.

  • q (tuple, optional) – Percentiles used to compute the peak-to-peak values. For a given seed with corresponding fluorescence fluctuation f, the peak-to-peak value for that seed is computed as np.percentile(f, q[1]) - np.percentile(f, q[0]). By default (0.1, 99.9).

  • n_components (int, optional) – Number of components (Gaussians) in the GMM model. By default 2.

  • valid_components (int, optional) – Number of components (Gaussians) to be considered as modeling the distribution of peak-to-peak values of valid seeds. Should be smaller than n_components. By default 1.

  • mean_mask (bool, optional) – Whether to apply an additional criterion under which a seed is valid only if its peak-to-peak value exceeds the mean of the lowest Gaussian. Only useful in corner cases where the Gaussians overlap heavily. By default True.

Returns

  • seeds (pd.DataFrame) – The resulting seeds dataframe with an additional column “mask_gmm”, indicating whether the seed is considered valid by this function. If the column already exists in input seeds it will be overwritten.

  • varr_pv (xr.DataArray) – The computed peak-to-peak value for each seed.

  • gmm (GaussianMixture) – The fitted GMM model object.
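
The filtering logic described above can be sketched on synthetic peak-to-peak values. This is an illustration under stated assumptions, not the function's implementation: the data are made up, scikit-learn's GaussianMixture is assumed as the GMM, and the way the mean_mask criterion is combined with the component assignment is a guess.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical peak-to-peak values: 100 dim seeds and 50 bright seeds.
ptp = np.concatenate([rng.normal(1.0, 0.1, 100), rng.normal(10.0, 0.5, 50)])

n_components, valid_components = 2, 1
gmm = GaussianMixture(n_components=n_components, random_state=0)
gmm.fit(ptp.reshape(-1, 1))

# Indices of the `valid_components` Gaussians with the highest means.
means = gmm.means_.ravel()
valid_idx = np.argsort(means)[-valid_components:]

# A seed passes if it is assigned to one of the valid Gaussians ...
mask_gmm = np.isin(gmm.predict(ptp.reshape(-1, 1)), valid_idx)
# ... and, mirroring mean_mask=True, exceeds the mean of the lowest Gaussian
# (assumed here to be combined with a logical AND).
mask_gmm &= ptp > means.min()
```

With two well-separated groups like these, only the bright seeds remain valid.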

minian.initialization.initA(varr, seeds, thres_corr=0.8, wnd=10, noise_freq=None)[source]

Initialize spatial footprints from seeds.

For each input seed, this function computes the correlation between the fluorescence activity of the seed and that of each neighboring pixel up to wnd pixels away. It then sets all correlations below thres_corr to zero and uses the resulting correlation image as the spatial footprint of the seed.

Parameters
  • varr (xr.DataArray) – Input movie data. Should have dimensions “height”, “width” and “frame”.

  • seeds (pd.DataFrame) – Dataframe of seeds.

  • thres_corr (float, optional) – Threshold of correlation, below which the values will be set to zero in the resulting spatial footprints. By default 0.8.

  • wnd (int, optional) – Radius (in pixels) of a disk window within which correlation will be computed for each seed. By default 10.

  • noise_freq (float, optional) – Cut-off frequency for optional smoothing of activities before computing the correlation. If None then no smoothing will be done. By default None.

Returns

A (xr.DataArray) – The initial estimation of spatial footprint for each cell. Should have dimensions (“unit_id”, “height”, “width”).

See also

minian.cnmf.graph_optimize_corr

for how the correlation is computed in an out-of-core fashion
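
For a single seed, the thresholded correlation image can be sketched in plain NumPy. This is an in-memory illustration only: the movie is synthetic, the axis order (frame, height, width) is an assumption, and Pearson correlation via np.corrcoef stands in for whatever out-of-core computation the library performs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frame, height, width = 200, 9, 9
signal = rng.normal(size=n_frame)

# Hypothetical movie: weak noise everywhere, a shared signal in a 3x3 patch.
movie = 0.1 * rng.normal(size=(n_frame, height, width))
movie[:, 3:6, 3:6] += signal[:, None, None]

seed_h, seed_w, wnd, thres_corr = 4, 4, 3, 0.8

# Pixels inside a disk of radius `wnd` centred on the seed.
hh, ww = np.mgrid[:height, :width]
disk = (hh - seed_h) ** 2 + (ww - seed_w) ** 2 <= wnd**2

# Pearson correlation between the seed trace and every pixel in the disk.
seed_trace = movie[:, seed_h, seed_w]
footprint = np.zeros((height, width))
for h, w in zip(*np.nonzero(disk)):
    footprint[h, w] = np.corrcoef(seed_trace, movie[:, h, w])[0, 1]

# Correlations below the threshold are set to zero.
footprint[footprint < thres_corr] = 0
```

Only the pixels that share the seed's signal keep a non-zero value, which is what makes the correlation image usable as a footprint.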

minian.initialization.initC(varr, A)[source]

Initialize temporal component given spatial footprints.

The temporal component is computed as the least-square solution between the input movie and the spatial footprints over the “height” and “width” dimensions.

Parameters
  • varr (xr.DataArray) – Input movie data. Should have dimensions (“height”, “width”, “frame”).

  • A (xr.DataArray) – Spatial footprints of cells. Should have dimensions (“unit_id”, “height”, “width”).

Returns

C (xr.DataArray) – The initial estimation of temporal components for each cell. Should have dimensions (“unit_id”, “frame”).
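
The least-squares step can be sketched with NumPy by flattening the spatial dimensions. The footprints and traces below are synthetic, and np.linalg.lstsq is an assumed solver, not necessarily the one used by initC.

```python
import numpy as np

rng = np.random.default_rng(0)
n_unit, height, width, n_frame = 2, 4, 4, 50

# Hypothetical non-overlapping footprints with dims (unit_id, height, width).
A = np.zeros((n_unit, height, width))
A[0, :2, :2] = 1.0
A[1, 2:, 2:] = 0.5

# Ground-truth traces with dims (unit_id, frame), and a noiseless movie
# with dims (height, width, frame).
C_true = rng.random((n_unit, n_frame))
movie = np.einsum("uhw,uf->hwf", A, C_true)

# Flatten the spatial dimensions and solve A_flat.T @ C = Y
# in the least-squares sense.
A_flat = A.reshape(n_unit, -1)   # (unit_id, pixel)
Y = movie.reshape(-1, n_frame)   # (pixel, frame)
C, *_ = np.linalg.lstsq(A_flat.T, Y, rcond=None)
```

Because the synthetic movie is exactly A times C_true, the solution recovers the original traces up to floating-point error.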

minian.initialization.intensity_refine(varr, seeds, thres_mul=2)[source]

Filter seeds by thresholding the intensity of their corresponding pixels in the max projection of the movie.

This function generates a histogram of the max projection by splitting the intensity into bins of roughly 10 pixels. The intensity threshold is then defined as the intensity at the peak of the histogram multiplied by thres_mul.

Parameters
  • varr (xr.DataArray) – Input movie data. Should have dimensions “height”, “width” and “frame”.

  • seeds (pd.DataFrame) – The input over-complete set of seeds to be filtered.

  • thres_mul (int, optional) – Scalar by which the intensity value at the peak of the max projection histogram is multiplied. By default 2, which can be interpreted as “seeds are only valid if they are more than twice as bright as the majority of the pixels”.

Returns

seeds (pd.DataFrame) – The resulting seeds dataframe with an additional column “mask_int”, indicating whether the seed is considered valid by this function.
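
A rough sketch of the histogram-based threshold on a synthetic max projection. It reads "bins of roughly 10 pixels" as "about one bin per 10 pixels of the image", which is an interpretation; the seed coordinates are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical max projection: background near 10, one bright pixel at 50.
max_proj = rng.normal(10.0, 0.5, size=(40, 40))
max_proj[5, 7] = 50.0

thres_mul = 2
# Choose the number of bins so that each holds roughly 10 pixels on average.
n_bins = max(1, round(max_proj.size / 10))
hist, edges = np.histogram(max_proj, bins=n_bins)

# Intensity at the histogram peak (left edge of the fullest bin),
# scaled by thres_mul.
peak_intensity = edges[np.argmax(hist)]
thres = peak_intensity * thres_mul

# Hypothetical seeds as (height, width) pairs.
seeds = np.array([[5, 7], [20, 20]])
mask_int = max_proj[seeds[:, 0], seeds[:, 1]] > thres
```

Here the histogram peak sits near the background level, so only the seed on the bright pixel passes.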

minian.initialization.ks_perseed(a)[source]

Perform KS test on input and return the p-value.

Parameters

a (np.ndarray) – Input data.

Returns

p (float) – The p-value of the KS test.

minian.initialization.ks_refine(varr, seeds, sig=0.01)[source]

Filter the seeds using Kolmogorov-Smirnov (KS) test.

This function assumes that the fluorescence of a valid seed across frames notionally follows a bimodal distribution: a large normal distribution representing baseline activity, and a second peak representing when the seed/cell is active. The KS test is used to discard the seeds where the null hypothesis (i.e. the fluorescence intensity simply follows a normal distribution) cannot be rejected at sig.

Parameters
  • varr (xr.DataArray) – Input movie data. Should have dimensions “height”, “width” and “frame”.

  • seeds (pd.DataFrame) – The input over-complete set of seeds to be filtered.

  • sig (float, optional) – The significance threshold to reject null-hypothesis. By default 0.01.

Returns

seeds (pd.DataFrame) – The resulting seeds dataframe with an additional column “mask_ks”, indicating whether the seed is considered valid by this function. If the column already exists in input seeds it will be overwritten.
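
The per-seed test can be illustrated with scipy.stats.kstest on two synthetic traces. Standardizing the trace before testing against a standard normal is an assumption of this sketch, and ks_pvalue is a hypothetical helper name.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_frame = 2000

# Hypothetical traces: pure Gaussian baseline vs. baseline plus an active state.
baseline_only = rng.normal(0.0, 1.0, n_frame)
bimodal = np.concatenate(
    [rng.normal(0.0, 1.0, n_frame - 400), rng.normal(8.0, 1.0, 400)]
)


def ks_pvalue(a):
    """Standardize `a` and test it against a standard normal distribution."""
    z = (a - a.mean()) / a.std()
    return stats.kstest(z, "norm").pvalue


sig = 0.01
p_baseline = ks_pvalue(baseline_only)
p_bimodal = ks_pvalue(bimodal)
```

Under the bimodal assumption above, the p-value of the bimodal trace falls far below sig, while that of the Gaussian-only trace does not.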

minian.initialization.local_max_roll(fm, k0, k1, diff)[source]

Compute local maxima of a frame with a range of kernel size.

This function wraps minian.utilities.local_extreme() and computes local maxima of the input frame with kernels of sizes ranging from k0 to k1. It then takes the union of all the local maxima and additionally merges all connected local maxima, representing each connected group by its middle pixel.

Parameters
  • fm (np.ndarray) – The input frame.

  • k0 (int) – The lower bound (inclusive) of the range of kernel sizes.

  • k1 (int) – The upper bound (inclusive) of the range of kernel sizes.

  • diff (Union[int, float]) – Intensity threshold for the difference between a local maximum and its neighbours, passed to minian.utilities.local_extreme().

Returns

max_res (np.ndarray) – The image of local maxima. Has the same shape as fm, with value 1 at local maxima.
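
The union-then-merge idea can be sketched with scipy.ndimage. Everything below is a stand-in: local_max_union is a hypothetical name, square windows replace whatever kernel shape the library uses, and the max/min criterion is only an assumed approximation of minian.utilities.local_extreme().

```python
import numpy as np
from scipy import ndimage


def local_max_union(fm, k0, k1, diff):
    """Union of local maxima over square windows of half-width k0..k1 (inclusive)."""
    union = np.zeros(fm.shape, dtype=bool)
    for k in range(k0, k1 + 1):
        size = 2 * k + 1
        win_max = ndimage.maximum_filter(fm, size=size)
        win_min = ndimage.minimum_filter(fm, size=size)
        # Assumed criterion: the pixel is the window maximum and stands out
        # from the window minimum by more than `diff`.
        union |= (fm == win_max) & (win_max - win_min > diff)
    # Merge touching maxima: keep one pixel at the median coordinate of each group.
    labels, n_groups = ndimage.label(union)
    max_res = np.zeros(fm.shape, dtype=np.uint8)
    for lb in range(1, n_groups + 1):
        rows, cols = np.nonzero(labels == lb)
        max_res[int(np.median(rows)), int(np.median(cols))] = 1
    return max_res


fm = np.zeros((9, 9))
fm[2, 2] = 5.0      # an isolated peak
fm[6, 5:8] = 4.0    # a flat ridge of three equal pixels
max_res = local_max_union(fm, k0=1, k1=2, diff=1)
```

The isolated peak is kept as is, and the three-pixel ridge collapses to its middle pixel.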

minian.initialization.max_proj_frame(varr, idx)[source]

Compute max projection on a given subset of frames.

Parameters
  • varr (xr.DataArray) – The input movie data containing all frames.

  • idx (np.ndarray) – The subset of frames to use to compute max projection.

Returns

max_proj (xr.DataArray) – The max projection.
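
On a plain NumPy array the operation reduces to indexing followed by a maximum over the frame axis. The axis order (frame, height, width) and the index values are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical movie with axes (frame, height, width).
movie = rng.random((20, 3, 4))

# Select a subset of frames, then take the pixel-wise maximum over them.
idx = np.array([0, 5, 10])
max_proj = movie[idx].max(axis=0)
```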

minian.initialization.pnr_perseed(a, freq, q)[source]

Compute peak-to-noise ratio of a given timeseries.

Parameters
  • a (np.ndarray) – Input timeseries.

  • freq (float) – Cut-off frequency of the high-pass filtering used to define noise.

  • q (tuple) – Percentile used to compute peak-to-peak values.

Returns

pnr (float) – Peak-to-noise ratio.

See also

pnr_refine

for definition of peak-to-noise ratio

minian.initialization.pnr_refine(varr, seeds, noise_freq=0.25, thres=1.5, q=(0.1, 99.9), med_wnd=None)[source]

Filter seeds by thresholding peak-to-noise ratio.

For each input seed, the noise is defined as the high-pass filtered fluorescence trace of the seed. The peak-to-noise ratio (pnr) of that seed is then defined as the ratio between the peak-to-peak value of the original fluorescence trace and that of the noise trace. Optionally, if abrupt changes in baseline fluorescence are expected, the baseline can be estimated by median-filtering the fluorescence trace and subtracted from the original trace before computing the peak-to-noise ratio. In addition, if a hard threshold on pnr is not desired, a Gaussian Mixture Model with 2 components can be fitted to the distribution of pnr across all seeds, and only seeds with pnr belonging to the higher-mean Gaussian will be considered valid.

Parameters
  • varr (xr.DataArray) – Input movie data, should have dimensions “height”, “width” and “frame”.

  • seeds (pd.DataFrame) – The input over-complete set of seeds to be filtered.

  • noise_freq (float, optional) – Cut-off frequency for the high-pass filter used to define noise, specified as fraction of sampling frequency. By default 0.25.

  • thres (Union[float, str], optional) – Threshold of the peak-to-noise ratio. If “auto” then a GMM will be fit to the distribution of pnr. By default 1.5.

  • q (tuple, optional) – Percentiles used to compute the peak-to-peak values. For a given fluorescence fluctuation f, the peak-to-peak value for that seed is computed as np.percentile(f, q[1]) - np.percentile(f, q[0]). By default (0.1, 99.9).

  • med_wnd (int, optional) – Size of the median filter window to remove baseline. If None then no filtering will be done. By default None.

Returns

  • seeds (pd.DataFrame) – The resulting seeds dataframe with an additional column “mask_pnr”, indicating whether the seed is considered valid by this function. If the column already exists in input seeds it will be overwritten.

  • pnr (xr.DataArray) – The computed peak-to-noise ratio for each seed.

  • gmm (GaussianMixture, optional) – The GMM model object fitted to the distribution of pnr. Will be None unless thres is “auto”.
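
The definition of peak-to-noise ratio given above can be sketched for a single trace. The FFT-based high-pass filter here is an assumed stand-in for the library's filtering, the helper names are hypothetical, and the traces are synthetic; median filtering and the "auto" threshold are left out.

```python
import numpy as np


def percentile_ptp(a, q):
    """Peak-to-peak value between the q[0] and q[1] percentiles."""
    return np.percentile(a, q[1]) - np.percentile(a, q[0])


def highpass_fft(a, freq):
    """Zero out FFT components below `freq` (fraction of the sampling frequency)."""
    spectrum = np.fft.rfft(a)
    spectrum[np.fft.rfftfreq(a.size) < freq] = 0
    return np.fft.irfft(spectrum, n=a.size)


def pnr_single(a, freq, q):
    """Ratio of the peak-to-peak value of the trace to that of its noise."""
    return percentile_ptp(a, q) / percentile_ptp(highpass_fft(a, freq), q)


rng = np.random.default_rng(0)
t = np.arange(1000)
noise = 0.1 * rng.normal(size=t.size)
# Hypothetical traces: a slow, large oscillation plus noise, and noise alone.
active = 5.0 * np.sin(2 * np.pi * t / 200) + noise

noise_freq, thres, q = 0.25, 1.5, (0.1, 99.9)
pnr_active = pnr_single(active, noise_freq, q)
pnr_noise = pnr_single(noise, noise_freq, q)
mask_active = pnr_active > thres
```

The slow oscillation is removed by the high-pass filter, so the active trace has a much larger ratio than the noise-only trace.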

minian.initialization.ptp_q(a, q)[source]

Compute peak-to-peak value of input with percentile values.

Parameters
  • a (np.ndarray) – Input array.

  • q (tuple) – Tuple specifying low and high percentile values.

Returns

ptp (float) – The peak-to-peak value.
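
A short illustration of why percentiles may be preferred over the raw minimum and maximum: a single outlier dominates the plain range but barely affects the percentile-based value. The trace and the spike are made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 10_000)
a[0] = 1000.0  # a single artefactual spike

q = (0.1, 99.9)
# Percentile-based peak-to-peak value.
ptp = np.percentile(a, q[1]) - np.percentile(a, q[0])
# Plain maximum minus minimum, for comparison.
ptp_full = a.max() - a.min()
```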

minian.initialization.seeds_init(varr, wnd_size=500, method='rolling', stp_size=200, nchunk=100, max_wnd=10, diff_thres=2)[source]

Generate over-complete set of seeds by finding local maxima across frames.

This function computes the maximum intensity projection of a subset of frames and finds the local maxima. The subsetting uses either a rolling window or random sampling of frames. wnd_size, stp_size and nchunk control different aspects of the subsetting. max_wnd and diff_thres control how local maxima are computed. The set of all local maxima found in this process constitutes an over-complete set of seeds, representing putative locations of cells.

Parameters
  • varr (xr.DataArray) – Input movie data. Should have dimensions “frame”, “height” and “width”.

  • wnd_size (int, optional) – Number of frames in each chunk, for which a max projection will be calculated. By default 500.

  • method (str, optional) – Either “rolling” or “random”. Controls whether to use rolling window or random sampling of frames to construct chunks. By default “rolling”.

  • stp_size (int, optional) – Number of frames between the center of each chunk when stepping through the data with rolling windows. Only used if method is “rolling”. By default 200.

  • nchunk (int, optional) – Number of chunks to sample randomly. Only used if method is “random”. By default 100.

  • max_wnd (int, optional) – Radius (in pixels) of the disk window used for computing local maxima. Local maxima are defined as pixels with maximum intensity in such a window. By default 10.

  • diff_thres (int, optional) – Intensity threshold for the difference between a local maximum and its neighbours. Any local maximum that is not brighter than its neighbours (defined by the same disk window) by diff_thres intensity values will be filtered out. By default 2.

Returns

seeds (pd.DataFrame) – Seeds dataframe with each seed as a row. Has columns “height” and “width”, which give the location of each seed. Also has a column “seeds”, an integer giving the number of chunks in which the seed is considered a local maximum.
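
How the two subsetting methods might carve a movie into chunks can be sketched as follows. chunk_indices is a hypothetical helper; the exact placement of window centers, the clipping at the movie edges, and sampling without replacement are all assumptions rather than documented behaviour.

```python
import numpy as np


def chunk_indices(n_frame, wnd_size=500, method="rolling", stp_size=200,
                  nchunk=100, seed=0):
    """Return a list of frame-index arrays, one per chunk."""
    if method == "rolling":
        half = wnd_size // 2
        centers = range(0, n_frame, stp_size)
        # Windows are clipped at the start and end of the movie.
        return [
            np.arange(max(c - half, 0), min(c + half, n_frame)) for c in centers
        ]
    if method == "random":
        rng = np.random.default_rng(seed)
        return [
            rng.choice(n_frame, size=wnd_size, replace=False)
            for _ in range(nchunk)
        ]
    raise ValueError(f"unknown method: {method}")


rolling = chunk_indices(1000, wnd_size=500, method="rolling", stp_size=200)
random_chunks = chunk_indices(1000, wnd_size=500, method="random", nchunk=3)
```

A max projection and a local-maxima search would then be run on each chunk, and the number of chunks in which a pixel is a local maximum gives its count.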

minian.initialization.seeds_merge(varr, max_proj, seeds, thres_dist=5, thres_corr=0.6, noise_freq=None)[source]

Merge seeds based on spatial distance and temporal correlation of their activities.

This function builds an adjacency matrix by thresholding the spatial distance between seeds and the temporal correlation between their activities. It then merges seeds using the adjacency matrix, keeping only the seed with maximum intensity in the max projection within each connected group of seeds. The merge is therefore transitive.

Parameters
  • varr (xr.DataArray) – Input movie data. Should have dimensions “height”, “width” and “frame”.

  • max_proj (xr.DataArray) – Max projection of the movie data.

  • seeds (pd.DataFrame) – Dataframe of seeds to be merged.

  • thres_dist (int, optional) – Threshold of distance between seeds in pixels. By default 5.

  • thres_corr (float, optional) – Threshold of temporal correlation between activities of seeds. By default 0.6.

  • noise_freq (float, optional) – Cut-off frequency for optional smoothing of activities before computing the correlation. If None then no smoothing will be done. By default None.

Returns

seeds (pd.DataFrame) – The resulting seeds dataframe with an additional column “mask_mrg”, indicating whether the seed should be kept after the merge. If the column already exists in input seeds it will be overwritten.
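
The merge can be sketched with SciPy's connected-components routine. The seed locations, traces and intensities are synthetic, and whether the thresholds are inclusive is an assumption of this sketch.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
n_frame = 300
sig_a, sig_b = rng.normal(size=(2, n_frame))

# Hypothetical seeds: (height, width) locations, activity traces, and
# intensities in the max projection. Seeds 0 and 1 are close together
# and share the same underlying signal.
loc = np.array([[10, 10], [12, 11], [40, 40]])
traces = np.stack([sig_a, sig_a + 0.1 * rng.normal(size=n_frame), sig_b])
intensity = np.array([5.0, 9.0, 7.0])

thres_dist, thres_corr = 5, 0.6
dist = squareform(pdist(loc))
corr = np.corrcoef(traces)

# Two seeds are adjacent if they are both close in space and correlated in time.
adj = (dist <= thres_dist) & (corr >= thres_corr)
n_groups, group = connected_components(adj.astype(int), directed=False)

# Within each connected group, keep only the seed that is brightest
# in the max projection.
mask_mrg = np.zeros(len(loc), dtype=bool)
for g in range(n_groups):
    members = np.nonzero(group == g)[0]
    mask_mrg[members[np.argmax(intensity[members])]] = True
```

Using connected components is what makes the merge transitive: if seed 0 is adjacent to seed 1 and seed 1 to another seed, all three end up in one group even if the first and last are not directly adjacent.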