minian.cnmf module#

minian.cnmf.adj_corr(varr, adj, nod_df, freq, idx_dims=None, chunk=600)[source]#

Compute correlation in an optimized fashion given an adjacency matrix and node attributes.

Wraps around graph_optimize_corr() and construct computation graph from adj and nod_df. Also convert the result into a sparse matrix with same shape as adj.

Parameters:

varr (xr.DataArray) – Input time series. Should have “frame” dimension in addition to column names of nod_df.
adj (np.ndarray) – Adjacency matrix.
nod_df (pd.DataFrame) – Dataframe containing node attributes. Should have length adj.shape[0]. Must contain height and width columns (the 2D positions used to spatially partition the graph) in addition to the columns named in idx_dims.
freq (float) – Cut-off frequency for the optional smoothing. If None then no smoothing will be done.
idx_dims (list, optional) – Columns of nod_df used to index the time series in varr. By default [“height”, “width”]. When the time series is indexed by something other than position (e.g. unit_merge indexes C by unit_id), pass that here; height / width are still read off nod_df for partitioning.
chunk (int, optional) – Chunk size passed through to graph_optimize_corr(). By default 600.

Returns:

adj_corr (scipy.sparse.csr_matrix) – Sparse matrix of the same shape as adj but with values corresponding the computed correlation.

minian.cnmf.compute_AtC(A, C)[source]#

Compute the outer product of spatial and temporal components.

This funtion computes the outer product of spatial and temporal components. The result is a 3d array representing the movie data as estimated by the spatial and temporal components.

Parameters:

A (xr.DataArray) – Spatial footprints of cells. Should have dimensions (“unit_id”, “height”, “width”).
C (xr.DataArray) – Temporal components of cells. Should have dimensions “frame” and “unit_id”.

Returns:

AtC (xr.DataArray) – The outer product representing estimated movie data. Has dimensions (“frame”, “height”, “width”).

minian.cnmf.compute_trace(Y, A, b, C, f)[source]#

Compute the residule traces YrA for each cell.

YrA is computed as C + A_norm(YtA - CtA), where YtA is (Y - b.dot(f)).tensordot(A, [“height”, “width”]), representing the projection of background-subtracted movie onto the spatial footprints, and CtA is C.dot(AtA, [“unit_id”]) with AtA = A.tensordot(A, [“height”, “width”]), hence CtA represent for each cell the sum of temporal activities that’s shared with any other cells, then finally A_norm is a “unit_id”x”unit_id” diagonal matrix that normalize the result with sum of squares of spatial footprints for each cell. Together, the YrA trace is a “unit_id”x”frame” matrix, representing the sum of previous temporal components and the residule temporal fluctuations as estimated by projecting the data onto the spatial footprints and subtracting the cross-talk fluctuations.

Parameters:

Y (xr.DataArray) – Input movie data. Should have dimensions (“frame”, “height”, “width”).
A (xr.DataArray) – Spatial footprints of cells. Should have dimensions (“unit_id”, “height”, “width”).
b (xr.DataArray) – Spatial footprint of background. Should have dimensions (“height”, “width”).
C (xr.DataArray) – Temporal components of cells. Should have dimensions (“frame”, “unit_id”).
f (xr.DataArray) – Temporal dynamic of background. Should have dimension “frame”.

Returns:

YrA (xr.DataArray) – Residule traces for each cell. Should have dimensions(“frame”, “unit_id”).

minian.cnmf.filt_butter(x, freq, btype)[source]#

Filter 1d timeseries with Butterworth filter using scipy.signal.butter().

Parameters:

x (np.ndarray) – Input timeseries.
freq (float) – Cut-off frequency.
btype (str) – Either “low” or “high” specify low or high pass filtering.

Returns:

x_filt (np.ndarray) – Filtered timeseries.

minian.cnmf.filt_fft(x, freq, btype)[source]#

Filter 1d timeseries by zero-ing bands in the fft signal.

Parameters:

x (np.ndarray) – Input timeseries.
freq (float) – Cut-off frequency.
btype (str) – Either “low” or “high” specify low or high pass filtering.

Returns:

x_filt (np.ndarray) – Filtered timeseries.

minian.cnmf.filt_fft_vec(x, freq, btype)[source]#

Vectorized wrapper of filt_fft().

Parameters:

x (np.ndarray) – Input timeseries. Should have 2 dimensions, and the filtering will be applied along the last dimension.
freq (float) – Cut-off frequency.
btype (str) – Either “low” or “high” specify low or high pass filtering.

Returns:

x_filt (np.ndarray) – Filtered timeseries

minian.cnmf.get_ar_coef(y, sn, p, add_lag, pad=None)[source]#

Estimate Autoregressive coefficients of order p given a timeseries y.

Parameters:

y (np.ndarray) – Input timeseries.
sn (float) – Estimated noise level of the input y.
p (int) – Order of the autoregressive process.
add_lag (int) – Additional number of timesteps of covariance to use for the estimation.
pad (int, optional) – Length of the output. If not None then the resulting coefficients will be zero-padded to this length. By default None.

Returns:

g (np.ndarray) – The estimated AR coefficients.

minian.cnmf.get_noise_fft(varr, noise_range=(0.25, 0.5), noise_method='logmexp')[source]#

Estimates noise along the “frame” dimension aggregating power spectral density within noise_range.

This function compute a Fast Fourier transform (FFT) along the “frame” dimension in a vectorized fashion, and estimate noise by aggregating its power spectral density (PSD). Note that noise_range is specified relative to the sampling frequency, so 0.5 represents the Nyquist frequency. Three noise_method are availabe for aggregating the psd: “mean” and “median” will use the mean and median across all frequencies as the estimation of noise. “logmexp” takes the mean of the logarithmic psd, then transform it back with an exponential function.

Parameters:

varr (xr.DataArray) – Input data, should have a “frame” dimension.
noise_range (tuple, optional) – Range of noise frequency to be aggregated as a fraction of sampling frequency. By default (0.25, 0.5).
noise_method (str, optional) – Method of aggreagtion for noise. Should be one of “mean” “median” “logmexp” or “sum”. By default “logmexp”.

Returns:

sn (xr.DataArray) – Spectral density of the noise. Same shape as varr with the “frame” dimension removed.

minian.cnmf.get_noise_welch(varr, noise_range=(0.25, 0.5), noise_method='logmexp')[source]#

Estimates noise along the “frame” dimension aggregating power spectral density within noise_range.

The PSD is estimated using welch method as an alternative to FFT. The welch method assumes the noise in the signal to be a stochastic process and attenuates noise by windowing the original signal into segments and averaging over them.

Parameters:

varr (xr.DataArray) – Input data. Should have a “frame” dimension.
noise_range (tuple, optional) – Range of noise frequency to be aggregated as a fraction of sampling frequency. By default (0.25, 0.5).
noise_method (str, optional) – Method of aggreagtion for noise. Should be one of “mean” “median” “logmexp” or “sum”. By default “logmexp”.

Returns:

sn (xr.DataArray) – Spectral density of the noise. Same shape as varr with the “frame” dimension removed.

See also

get_noise_fft: For more details on the parameters.

minian.cnmf.graph_optimize_corr(varr, G, freq, idx_dims=None, chunk=600, step_size=50)[source]#

Compute correlation in an optimized fashion given a computation graph.

This function carry out out-of-core computation of large correaltion matrix. It takes in a computaion graph whose node represent timeseries and edges represent the desired pairwise correlation to be computed. The actual timeseries are stored in varr and indexed with node attributes. The function can carry out smoothing of timeseries before computation of correlation. To minimize re-computation of smoothing for each pixel, the graph is first partitioned spatially via a k-d tree median split on node positions. Then the computation is performed in chunks with size chunk, with nodes from the same partition being in the same chunk as much as possible.

Parameters:

varr (xr.DataArray) – Input timeseries. Should have “frame” dimension in addition to those specified in idx_dims.
G (nx.Graph) – Graph representing computation to be carried out. Should be undirected and un-weighted. Each node should have unique attributes with keys specified in idx_dims, which will be used to index the timeseries in varr. Each node must additionally carry height and width attributes; these 2D coordinates are used to spatially partition the graph (via spatial_partition()) so the chunked computation recomputes as little smoothing as possible. Each edge represent a desired correlation.
freq (float) – Cut-off frequency for the optional smoothing. If None then no smoothing will be done.
idx_dims (list, optional) – The dimension used to index the timeseries in varr. By default [“height”, “width”].
chunk (int, optional) – Chunk size of each computation. By default 600.
step_size (int, optional) – Step size to iterate through all edges. If too small then the iteration will take a long time. If too large then the variances in the actual chunksize of computation will be large. By default 50.

Returns:

eg_df (pd.DataFrame) – Dataframe representation of edge list. Has column “source” and “target” representing the node index of the edge (correlation), and column “corr” with computed value of correlation.

minian.cnmf.idx_corr(X, ridx, cidx)[source]#

Compute partial pairwise correlation based on index.

This function compute a subset of a pairwise correlation matrix. The correlation to be computed are specified by two vectors ridx and cidx of same length, representing the row and column index of the full correlation matrix. The function use them to index the timeseries matrix X and compute only the requested correlations. The result is returned flattened.

Parameters:

X (np.ndarray) – Input time series. Should have 2 dimensions, where the last dimension should be the time dimension.
ridx (np.ndarray) – Row index of the correlation.
cidx (np.ndarray) – Column index of the correlation.

Returns:

res (np.ndarray) – Flattened resulting correlations. Has same shape as ridx or cidx.

minian.cnmf.label_connected(adj, only_connected=False)[source]#

Label connected components given adjacency matrix.

Parameters:

adj (np.ndarray) – Adjacency matrix. Should be 2d symmetric matrix.
only_connected (bool, optional) – Whether to keep only the labels of connected components. If True, then all components with only one node (isolated) will have their labels set to -1. Otherwise all components will have unique label. By default False.

Returns:

labels (np.ndarray) – The labels for each components. Should have length adj.shape[0].

minian.cnmf.lstsq_vec(a, b)[source]#

Estimate a least-square scaling from a to b in vectorized fashion.

Parameters:

a (np.ndarray) – Source of the scaling.
b (np.ndarray) – Target of the scaling.

Returns:

scale (np.ndarray) – A scaler that scales a to b.

minian.cnmf.noise_fft(px, noise_range=(0.25, 0.5), noise_method='logmexp', threads=1)[source]#

Estimates noise of the input by aggregating power spectral density within noise_range.

The PSD is estimated using FFT.

Parameters:

px (np.ndarray) – Input data.
noise_range (tuple, optional) – Range of noise frequency to be aggregated as a fraction of sampling frequency. By default (0.25, 0.5).
noise_method (str, optional) – Method of aggreagtion for noise. Should be one of “mean” “median” “logmexp” or “sum”. By default “logmexp”.
threads (int, optional) – Number of threads to use for pyfftw. By default 1.

Returns:

noise (float) – The estimated noise level of input.

See also

get_noise_fft

minian.cnmf.noise_welch(y, noise_range=(0.25, 0.5), noise_method='logmexp')[source]#

Estimates noise of the input by aggregating power spectral density within noise_range.

The PSD is estimated using welch method.

Parameters:

px (np.ndarray) – Input data.
noise_range (tuple, optional) – Range of noise frequency to be aggregated as a fraction of sampling frequency. By default (0.25, 0.5).
noise_method (str, optional) – Method of aggreagtion for noise. Should be one of “mean” “median” “logmexp” or “sum”. By default “logmexp”.
threads (int, optional) – Number of threads to use for pyfftw. By default 1.

Returns:

noise (float) – The estimated noise level of input.

See also

get_noise_welch

minian.cnmf.partition_diagnostics(membership, adj=None, n_frames=None, bytes_per_sample=4)[source]#

Compute quality statistics for a spatial partition.

Parameters:

membership (np.ndarray) – (N,) integer partition label per node.
adj (scipy.sparse.spmatrix, optional) – (N, N) adjacency, symmetric or triangular. Self-loops and duplicate entries are collapsed. When omitted, edge keys are absent from the returned dict.
n_frames (int, optional) – Length of the time axis. When supplied, "mem_mb" is added to the returned dict.
bytes_per_sample (int, optional) – Per-sample dtype size in bytes (default 4 = float32). Ignored if n_frames is None.

Returns:

diag (dict) – Diagnostic keys vary with which optional inputs were supplied:

============================== ============ ===================== ============================================================ # noqa: E501 key condition type / shape meaning ============================== ============ ===================== ============================================================ # noqa: E501 "sizes" always (n_parts,) int64 Node count in each partition (np.bincount(membership)). # noqa: E501 "n_parts" always int Number of distinct partition labels. # noqa: E501 "edges_per_partition" adj (n_parts,) int64 Intra-partition edge count per partition. # noqa: E501 "cross_edges" adj int Edges whose endpoints are in different partitions. # noqa: E501 "total_edges" adj int Total unique undirected edges (self-loops dropped). # noqa: E501 "cross_fraction" adj float cross_edges / total_edges (0.0 if no edges). # noqa: E501 "mem_mb" n_frames (n_parts,) float Per-partition trace memory estimate in MiB. # noqa: E501 ============================== ============ ===================== ============================================================ # noqa: E501

Returned as a plain dict for consistency with the rest of the codebase. A structured-return migration is on the roadmap with the project’s pydantic adoption.

minian.cnmf.smooth_corr(X, ridx, cidx, freq)[source]#

Wraps around filt_fft_vec() and idx_corr() to carry out both smoothing and computation of partial correlation.

Parameters:

X (np.ndarray) – Input time series.
ridx (np.ndarray) – Row index of the resulting correlation.
cidx (np.ndarray) – Column index of the resulting correlation.
freq (float) – Cut-off frequency for the smoothing.

Returns:

corr (np.ndarray) – Resulting partial correlation.

minian.cnmf.smooth_sig(sig, freq, method='fft', btype='low')[source]#

Filter the input timeseries with a cut-off frequency in vecorized fashion.

Parameters:

sig (xr.DataArray) – The input timeseries. Should have dimension “frame”.
freq (float) – The cut-off frequency.
method (str, optional) – Method used for filtering. Either “fft” or “butter”. If “fft”, the filtering is carried out with zero-ing fft signal. If “butter”, the fiilterings carried out with scipy.signal.butter(). By default “fft”.
btype (str, optional) – Either “low” or “high” specify low or high pass filtering. By default “low”.

Returns:

sig_smth (xr.DataArray) – The filtered signal. Has same shape as input sig.

Raises:

NotImplementedError – if method is not “fft” or “butter”

minian.cnmf.spatial_partition(positions, target_chunk)[source]#

Partition 2D points into balanced groups via k-d tree median split.

Used by graph_optimize_corr() to chunk pixels/units before correlation. The graphs graph_optimize_corr sees only connect spatially-adjacent nodes (radius-neighbour seed graph for seeds_merge, dilated-footprint overlap graph for unit_merge), so any partition that is geometrically compact in 2D is also a near-optimal cut: edge length is bounded, so partitions with short perimeters cross few edges.

Parameters:

positions (np.ndarray) – (N, 2) array of 2D coordinates, one row per node. Must be finite; non-finite rows raise – unit_merge() enforces this by raising on zero-mass footprints upstream.
target_chunk (int) – Maximum allowed partition size. Leaves end up at or below this size, so partition count is between ceil(N / target_chunk) and roughly twice that (the median split halves N at each step but cannot split a leaf below target_chunk).

Returns:

membership (np.ndarray) – Integer partition label per row, dense in [0, n_parts). Order of labels is the order in which leaves are emitted by the depth-first recursion (left-half before right-half at every node), so consecutive labels correspond to spatially-adjacent tiles.

Notes

Algorithm. Recursive k-d tree with median split:

If the current point set has <= target_chunk rows, assign it the next label and return.
Otherwise pick the axis with the larger extent (max - min over the current rows). Splitting along the longer axis produces squarer leaf tiles, which have shorter perimeter than narrow strips and therefore cross fewer edges.
Stable-sort the rows along the chosen axis, split at the median index (len(idx) // 2), recurse on each half.

The split is on the median index, not the median value, so the two halves differ in size by at most one regardless of how the points are distributed. The stable sort makes the output deterministic for a given input even when many points share a coordinate.

Complexity. Each recursion level argsorts an O(level_size)-sized array. With balanced halving the depth is log2(N / target_chunk) and the per-level work sums to O(N log N), giving an overall O(N log^2 N). N here is the number of pixels/units handed to one graph_optimize_corr call (a few thousand at most), so this is negligible next to the correlation work itself.

Why not pymetis. This replaces a pymetis.part_graph call. METIS minimises edge cuts on general graphs via multilevel coarsening, which is the right answer when edges may span the graph arbitrarily. The graphs we hand it never do – they only encode spatial neighbours by construction – so the geometric optimum coincides with the cut optimum, and the recursive median split achieves it without an external compiled dependency.

minian.cnmf.sps_any(x)[source]#

Compute any on a sparse array.

Parameters:: x (sparse.COO) – Input sparse array.
Returns:: x_any (np.ndarray) – 2d boolean numpy array.

minian.cnmf.unit_merge(A, C, add_list=None, thres_corr=0.9, noise_freq=None, chunk=600)[source]#

Merge cells given spatial footprints and temporal components

This function merge all cells that have common pixels based on correlation of their temporal components. The cells to be merged will become one cell, with spatial and temporal components taken as mean across all the cells to be merged. Additionally any variables specified in add_list will be merged in the same manner. Optionally the temporal components can be smoothed before being used to caculate correlation. Despite the name any timeseries be passed as C and used to calculate the correlation.

Parameters:

A (xr.DataArray) – Spatial footprints of the cells.
C (xr.DataArray) – Temporal component of cells.
add_list (list[xr.DataArray], optional) – list of additional variables to be merged. By default None.
thres_corr (float, optional) – The threshold of correlation. Any pair of spatially overlapping cells with correlation higher than this threshold will be transitively grouped together and merged. By default 0.9.
noise_freq (float, optional) – The cut-off frequency used to smooth C before calculation of correlation. If None then no smoothing will be done. By default None.
chunk (int, optional) – Chunk size for the out-of-core correlation computation, passed through to adj_corr() / graph_optimize_corr(). By default 600.

Returns:

A_merge (xr.DataArray) – Merged spatial footprints of cells.
C_merge (xr.DataArray) – Merged temporal components of cells.
add_list (list[xr.DataArray], optional) – list of additional merged variables. Only returned if input add_list is not None.

minian.cnmf.update_background(Y, A, C, b=None)[source]#

Update background terms given spatial and temporal components of cells.

A movie representation (with dimensions “height” “width” and “frame”) of estimated cell activities are computed as the product between the spatial components matrix and the temporal components matrix of cells over the “unit_id” dimension. Then the residule movie is computed by subtracting the estimated cell activity movie from the input movie. Then the spatial footprint of background b is the mean of the residule movie over “frame” dimension, and the temporal component of background f is the least-square solution between the residule movie and the spatial footprint b.

Parameters:

Y (xr.DataArray) – Input movie data. Should have dimensions (“frame”, “height”, “width”).
A (xr.DataArray) – Estimation of spatial footprints of cells. Should have dimensions (“unit_id”, “height”, “width”).
C (xr.DataArray) – Estimation of temporal activities of cells. Should have dimensions (“unit_id”, “frame”).
b (xr.DataArray, optional) – Previous estimation of spatial footprint of background. If provided it will be returned as-is, and only temporal activity of background will be updated

Returns:

b_new (xr.DataArray) – New estimation of the spatial footprint of background. Has dimensions (“height”, “width”).
f_new (xr.DataArray) – New estimation of the temporal activity of background. Has dimension “frame”.

minian.cnmf.update_spatial(Y, A, C, sn, b=None, f=None, dl_wnd=5, sparse_penal=0.5, update_background=False, normalize=True, size_thres=(9, None), in_memory=False)[source]#

Update spatial components given the input data and temporal dynamic for each cell.

This function carries out spatial update of the CNMF algorithm. The update is done in parallel and independently for each pixel. To save computation time, we compute a subsetting matrix sub by dilating the initial spatial foorprint of each cell. The window size of the dilation is controled by dl_wnd. Then for each pixel, only cells that have a non-zero value in sub at the current pixel will be considered for update. Optionally, the spatial footprint of the background can be updated in the same fashion based on the temporal dynamic of the background. After the update, the spatial footprint of each cell can be optionally noramlized to unit sum, so that difference in fluorescent intensity will not be reflected in spatial footprint. A size_thres can be passed in to filter out cells whose size (number of non-zero values in spatial footprint) is outside the specified range. Finally, the temporal dynamic of cells C can either be load in memory before the update or lazy-loaded during the update. Note that if in_memory is False, then C must be stored under the intermediate folder specified as environment variable MINIAN_INTERMEDIATE.

Parameters:

Y (xr.DataArray) – Input movie data. Should have dimensions “height”, “width” and “frame”.
A (xr.DataArray) – Previous estimation of spatial footprints. Should have dimension “height”, “width” and “unit_id”.
C (xr.DataArray) – Estimation of temporal component for each cell. Should have dimension “frame” and “unit_id”.
sn (xr.DataArray) – Estimation of noise level for each pixel. Should have dimension “height” and “width”.
b (xr.DataArray, optional) – Previous estimation of spatial footprint of background. Fhould have dimension “height” and “width”.
f (xr.DataArray, optional) – Estimation of temporal dynamic of background. Should have dimension “frame”.
dl_wnd (int, optional) – Window of morphological dilation in pixel when computing the subsetting matrix. By default 5.
sparse_penal (float, optional) – Global scalar controlling sparsity of the result. The higher the value, the sparser the spatial footprints. By default 0.5.
update_background (bool, optional) – Whether to update the spatial footprint of background. If True, then both b and f need to be provided. By default False.
normalize (bool, optional) – Whether to normalize resulting spatial footprints of each cell to unit sum. By default True
size_thres (tuple, optional) – The range of size in pixel allowed for the resulting spatial footprints. If None, then no filtering will be done. By default (9, None).
in_memory (bool, optional) – Whether to load C into memory before spatial update. By default False.

Returns:

A_new (xr.DataArray) – New estimation of spatial footprints. Same shape as A except the “unit_id” dimension might be smaller due to filtering.
mask (xr.DataArray) – Boolean mask of whether a cell passed size filtering. Has dimension “unit_id” that is same as input A. Useful for subsetting other variables based on the result of spatial update.
b_new (xr.DataArray) – New estimation of spatial footprint of background. Only returned if update_background is True. Same shape as b.
norm_fac (xr.DataArray) – Normalizing factor. Userful to scale temporal activity of cells. Only returned if normalize is True.

Notes

During spatial update, the algorithm solve the following optimization problem for each pixel:

\[\begin{split}\begin{aligned} & \underset{\mathbf{a}}{\text{minimize}} & & \left \lVert \mathbf{y} - \mathbf{a}^T \mathbf{C} \right \rVert ^2 + \alpha \left \lvert \mathbf{a} \right \rvert \\ & \text{subject to} & & \mathbf{a} \geq 0 \end{aligned}\end{split}\]

Where \(\mathbf{y}\) is the fluorescent dynamic of the pixel, \(\mathbf{a}\) is spatial footprint values across all cells on that pixel, \(\mathbf{C}\) is temporal component matrix across all cells. The parameter \(\alpha\) is the product of the noise level on each pixel sn and the global scalar sparse_penal. Higher value of \(\alpha\) will result in more sparse estimation of spatial footprints.

minian.cnmf.update_spatial_block(y, alpha, sub, **kwargs)[source]#

Carry out spatial update for each 3d block of data.

This function wraps around update_spatial_perpx() so that it can be applied to 3d blocks of data. Keyword arguments are passed to update_spatial_perpx().

Parameters:

y (np.ndarray) – Input data, should have dimension (height, width, frame).
alpha (np.ndarray) – Alpha parameter for the optimization problem. Should have dimension (height, width).
sub (sparse.COO) – Subsetting matrix. Should have dimension (height, width, unit_id).

Returns:

A_blk (sparse.COO) – Resulting spatial footprints. Should have dimension (height, width, unit_id).

minian.cnmf.update_spatial_perpx(y, alpha, sub, C_store, f)[source]#

Update spatial footprints across all the cells for a single pixel.

This function use sklearn.linear_model.LassoLars to solve the optimization problem. C_store can either be a in-memory numpy array, or a zarr array, in which case it will be lazy-loaded. If f is not None, then sub[-1] is expected to be the subsetting mask for background, and the last element of the return value will be the spatial footprint of background.

Parameters:

y (np.ndarray) – Input fluorescent trace for the given pixel.
alpha (float) – Parameter of the optimization problem controlling sparsity.
sub (sparse.COO) – Subsetting matrix.
C_store (Union[np.ndarray, zarr.core.Array]) – Estimation of temporal dynamics of cells.
f (np.ndarray, optional) – Temporal dynamic of background.

Returns:

A_px (sparse.COO) – Spatial footprint values across all cells for the given pixel.

See also

update_spatial: for more explanation of parameters

minian.cnmf.update_temporal(A, C, b=None, f=None, Y=None, YrA=None, noise_freq=0.25, p=2, add_lag='p', jac_thres=0.1, sparse_penal=1, bseg=None, med_wd=None, zero_thres=1e-08, max_iters=200, use_smooth=True, normalize=True, warm_start=False, post_scal=False, scs_fallback=False, concurrent_update=False)[source]#

Update temporal components and deconvolve calcium traces for each cell given spatial footprints.

This function carries out temporal update of the CNMF algorithm. The update is done in parallel and independently for each group of cells. The grouping of cells is controlled by jac_thres. The relationship between calcium and deconvolved spikes is modeled as an Autoregressive process (AR) of order p. The AR coefficients are estimated from autocovariances of YrA traces for each cell, with add_lag controls how many timesteps of autocovariances are used. Optionally, the YrA traces can be smoothed for the estimation of AR coefficients only. The noise level for each cell is estimated using FFT with noise_freq as cut-off, and controls the sparsity of the result together with the global sparse_penal parameter. YrA traces for each cells can be optionally normalized to unit sum to make sparse_penal to have comparable effects across cells. If abrupt change of baseline fluorescence is expected, a bseg vector can be passed to enable estimation of independent baseline for different segments of time. The temporal update itself is performed by solving an optimization problem using cvxpy, with concurrent_update, warm_start, max_iters, scs_fallback controlling different aspects of the optimization. Finally, the results can be filtered with zero_thres to suppress small values caused by numerical errors, and a post-hoc scaling process can be optionally used to scale the result based on YrA to get around unwanted effects from sparse penalty or normalization.

Parameters:

A (xr.DataArray) – Estimation of spatial footprints for each cell. Should have dimensions (“unit_id”, “height”, “width”).
C (xr.DataArray) – Previous estimation of calcium dynamic of cells. Should have dimensions (“frame”, “unit_id”). Only used if warm_start = True or if YrA is None.
b (xr.DataArray, optional) – Estimation of spatial footprint of background. Should have dimensions (“height”, “width”). Only used if YrA is None. By default None.
f (xr.DataArray, optional) – Estimation of temporal dynamic of background. Should have dimension “frame”. Only used if YrA is None. By default None.
Y (xr.DataArray, optional) – Input movie data. Should have dimensions (“frame”, “height”, “width”). Only used if YrA is None. By default None.
YrA (xr.DataArray, optional) – Estimation of residule traces for each cell. Should have dimensions (“frame”, “unit_id”). If None then one will be computed using computea_trace with relevant inputs. By default None.
noise_freq (float, optional) – Frequency cut-off for both the estimation of noise level and the optional smoothing, specified as a fraction of sampling frequency. By default 0.25.
p (int, optional) – Order of the AR process. By default 2.
add_lag (str, optional) – Additional number of timesteps in covariance to use for the estimation of AR coefficients. If 0, then only the first p number of timesteps will be used to estimate the p number of AR coefficients. If greater than 0, then the system is over-determined and least square will be used to estimate AR coefficients. If “p”, then p number of additional timesteps will be used. By default “p”.
jac_thres (float, optional) – Threshold for Jaccard Index. Cells whose overlap in spatial footprints (number of common pixels divided by number of total pixels) exceeding this threshold will be grouped together transitively for temporal update. By default 0.1.
sparse_penal (int, optional) – Global scalar controlling sparsity of the result. The higher the value, the sparser the deconvolved spikes. By default 1.
bseg (np.ndarray, optional) – 1d vector with length “frame” representing segments for which baseline should be estimated independently. An independent baseline will be estimated for frames corresponding to each unique label in this vector. If None then a single scalar baseline will be estimated for each cell. By default None.
med_wd (int, optional) – Window size for the median filter used for baseline correction. For each cell, the baseline flurescence is estimated by median-filtering the temporal activity. Then the baseline is subtracted from the temporal activity right before the optimization step. If None then no baseline correction will be performed. By default None.
zero_thres (float, optional) – Threshold to filter out small values in the result. Any values smaller than this threshold will be set to zero. By default 1e-8.
max_iters (int, optional) – Maximum number of iterations for optimization. Can be increased to get around sub-optimal results. By default 200.
use_smooth (bool, optional) – Whether to smooth the YrA for the estimation of AR coefficients. If True, then a smoothed version of YrA will be computed by low-pass filter with noise_freq and used for the estimation of AR coefficients only. By default True.
normalize (bool, optional) – Whether to normalize YrA for each cell to unit sum such that sparse penalty has simlar effect for all the cells. Each group of cell will be normalized together (with mean of the sum for each cell) to preserve relative amplitude of fluorescence between overlapping cells. By default True.
warm_start (bool, optional) – Whether to use previous estimation of C to warm start the optimization. Can lead to faster convergence in theory. Experimental. By default False.
post_scal (bool, optional) – Whether to apply the post-hoc scaling process, where a scalar will be estimated with least square for each cell to scale the amplitude of temporal component to YrA. Useful to get around unwanted dampening of result values caused by high sparse_penal or to revert the per-cell normalization. By default False.
scs_fallback (bool, optional) – Whether to fall back to scs solver if the default ecos solver fails. By default False.
concurrent_update (bool, optional) – Whether to update a group of cells as a single optimization problem. Yields slightly more accurate estimation when cross-talk between cells are severe, but significantly increase convergence time and memory demand. By default False.

Returns:

C_new (xr.DataArray) – New estimation of the calcium dynamic for each cell. Should have same shape as C except the “unit_id” dimension might be smaller due to dropping of cells and filtering.
S_new (xr.DataArray) – New estimation of the deconvolved spikes for each cell. Should have dimensions (“frame”, “unit_id”) and same shape as C_new.
b0_new (xr.DataArray) – New estimation of baseline fluorescence for each cell. Should have dimensions (“frame”, “unit_id”) and same shape as C_new. Each cell should only have one unique value if bseg is None.
c0_new (xr.DataArray) – New estimation of a initial calcium decay, in theory triggered by calcium events happened before the recording starts. Should have dimensions (“frame”, “unit_id”) and same shape as C_new.
g (xr.DataArray) – Estimation of AR coefficient for each cell. Useful for visualizing modeled calcium dynamic. Should have dimensions (“lag”, “unit_id”) with “lag” having length p.
mask (xr.DataArray) – Boolean mask of whether a cell has any temporal dynamic after the update and optional filtering. Has dimension “unit_id” that is same as input C. Useful for subsetting other variables based on the result of temporal update.

Notes

During temporal update, the algorithm solve the following optimization problem for each cell:

\[\begin{split}\begin{aligned} & \underset{\mathbf{c} \, \mathbf{b_0} \, \mathbf{c_0}}{\text{minimize}} & & \left \lVert \mathbf{y} - \mathbf{c} - \mathbf{c_0} - \mathbf{b_0} \right \rVert ^2 + \alpha \left \lvert \mathbf{G} \mathbf{c} \right \rvert \\ & \text{subject to} & & \mathbf{c} \geq 0, \; \mathbf{G} \mathbf{c} \geq 0 \end{aligned}\end{split}\]

Where \(\mathbf{y}\) is the estimated residule trace (YrA) for the cell, \(\mathbf{c}\) is the calcium dynamic of the cell, \(\mathbf{G}\) is a “frame”x”frame” matrix constructed from the estimated AR coefficients of cell, such that the deconvolved spikes of the cell is given by \(\mathbf{G}\mathbf{c}\). If bseg is None, then \(\mathbf{b_0}\) is a single scalar, otherwise it is a 1d vector with dimension “frame” constrained to have multiple independent values, each corresponding to a segment of time specified in bseg. \(\mathbf{c_0}\) is a 1d vector with dimension “frame” constrained to be the product of a scalar (representing initial calcium concentration) and the decay dynamic given by the estimated AR coefficients. The parameter \(\alpha\) is the product of estimated noise level of the cell and the global scalar sparse_penal. Higher value of \(\alpha\) will result in more sparse estimation of deconvolved spikes.

minian.cnmf.update_temporal_block(YrA, noise_freq, p, add_lag='p', normalize=True, use_smooth=True, med_wd=None, concurrent=False, **kwargs)[source]#

Update temporal components given residule traces of a group of cells.

This function wraps around update_temporal_cvxpy(), but also carry out additional initial steps given YrA of a group of cells. Additional keyword arguments are passed through to update_temporal_cvxpy().

Parameters:

YrA (np.ndarray) – Residule traces of a group of cells. Should have dimension (“unit_id”, “frame”).
noise_freq (float) – Frequency cut-off for both the estimation of noise level and the optional smoothing. Specified as a fraction of sampling frequency.
p (int) – Order of the AR process.
add_lag (str, optional) – Additional number of timesteps in covariance to use for the estimation of AR coefficients. By default “p”.
normalize (bool, optional) – Whether to normalize YrA for each cell to unit sum. By default True.
use_smooth (bool, optional) – Whether to smooth the YrA for the estimation of AR coefficients. By default True.
med_wd (int, optional) – Median window used for baseline correction.
concurrent (bool, optional) – Whether to update a group of cells as a single optimization problem. By default False.

Returns:

c (np.ndarray) – New estimation of the calcium dynamic of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as YrA.
s (np.ndarray) – New estimation of the deconvolved spikes of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as c.
b (np.ndarray) – New estimation of baseline fluorescence of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as c.
c0 (np.ndarray) – New estimation of a initial calcium decay of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as c.
g (np.ndarray) – Estimation of AR coefficient for each cell. Should have dimensions (“unit_id”, “lag”) with “lag” having length p.

See also

update_temporal: for more explanation of parameters

minian.cnmf.update_temporal_cvxpy(y, g, sn, A=None, bseg=None, **kwargs)[source]#

Solve the temporal update optimization problem using cvxpy

Parameters:

y (np.ndarray) – Input residule trace of one or more cells.
g (np.ndarray) – Estimated AR coefficients of one or more cells.
sn (np.ndarray) – Noise level of one or more cells.
A (np.ndarray, optional) – Spatial footprint of one or more cells. Not used. By default None.
bseg (np.ndarray, optional) – 1d vector with length “frame” representing segments for which baseline should be estimated independently. By default None.
sparse_penal (float) – Sparse penalty parameter for all the cells.
max_iters (int) – Maximum number of iterations.
use_cons (bool, optional) – Whether to try constrained version of the problem first. By default False.
scs_fallback (bool) – Whether to fall back to scs solver if the default ecos solver fails.
c_last (np.ndarray, optional) – Initial estimation of calcium traces for each cell used to warm start.
zero_thres (float) – Threshold to filter out small values in the result.

Returns:

c (np.ndarray) – New estimation of the calcium dynamic of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as y.
s (np.ndarray) – New estimation of the deconvolved spikes of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as c.
b (np.ndarray) – New estimation of baseline fluorescence of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as c.
c0 (np.ndarray) – New estimation of a initial calcium decay of the group of cells. Should have dimensions (“unit_id”, “frame”) and same shape as c.

See also

update_temporal: for more explanation of parameters