Signal processing algorithms#

Algorithms to work with 1D signals.

tidyms2.algorithms.signal.detect_peaks(x, noise, baseline, **kwargs)#

Find peaks in a 1D signal.

Parameters:
  • x (ndarray) – 1D array.

  • noise (ndarray) – the noise level at each point in x. MUST have the same size as x.

  • baseline (ndarray) – the baseline level at each point in x. MUST have the same size as x.

  • kwargs – extra keyword arguments passed to scipy.signal.find_peaks(). A prominence argument, if passed, is ignored: the prominence is always set to three times the noise level.

Return type:

tuple[ndarray, ndarray, ndarray]

Returns:

a tuple consisting of three int arrays: the start location of each peak, the apex location of each peak, and the end location of each peak.

Algorithm#

  1. Peak apexes are detected using scipy.signal.find_peaks() with a minimum distance of three points and a minimum prominence equal to three times the noise level.

  2. Points in \(x\) are classified as either signal or baseline. The k-th point in \(x\) is classified as baseline if the following condition is met:

    \[|x[k] - b[k]| < e[k]\]

    where \(b\) is the baseline and \(e\) is the noise.

  3. Peaks are removed if their apex falls in a region classified as baseline.

  4. Peak extensions, i.e., beginning and end, are defined as the closest baseline point to the left and right of each apex.

  5. Overlapping peak extensions are fixed by setting the boundary between the two peaks to the location of the minimum value of \(x\) between the two apexes.
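The steps above can be sketched with NumPy and SciPy as follows. This is a minimal illustration, not the tidyms2 implementation: the function name sketch_detect_peaks is hypothetical, and both the use of the array edges as fallback boundaries and the treatment of the end location as an inclusive index are assumptions.

```python
import numpy as np
from scipy.signal import find_peaks


def sketch_detect_peaks(x, noise, baseline):
    """Illustrative version of steps 1-5; not the tidyms2 implementation."""
    # Step 1: apexes at least 3 points apart with prominence >= 3 * noise.
    apex, _ = find_peaks(x, distance=3, prominence=3 * noise)

    # Step 2: point k is baseline when |x[k] - b[k]| < e[k].
    is_baseline = np.abs(x - baseline) < noise

    # Step 3: drop apexes located on baseline points.
    apex = apex[~is_baseline[apex]]

    # Step 4: closest baseline point on each side of every apex.
    # Assumption: the array edges act as boundaries when no baseline point exists.
    baseline_index = np.flatnonzero(is_baseline)
    padded = np.concatenate(([0], baseline_index, [x.size - 1]))
    pos = np.searchsorted(baseline_index, apex)
    start, end = padded[pos], padded[pos + 1]

    # Step 5: split overlapping extensions at the minimum of x between apexes.
    for i in range(apex.size - 1):
        if end[i] > start[i + 1]:
            valley = apex[i] + np.argmin(x[apex[i]:apex[i + 1] + 1])
            end[i] = start[i + 1] = valley
    return start, apex, end


t = np.arange(50)
x = 100.0 * np.exp(-0.5 * ((t - 25) / 2.0) ** 2)  # one Gaussian peak at index 25
start, apex, end = sketch_detect_peaks(x, np.ones(50), np.zeros(50))
```

With a unit noise level and a zero baseline, the single apex at index 25 is kept and its extension runs between the nearest points where the signal drops below the noise level.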


tidyms2.algorithms.signal.estimate_baseline(x, noise, min_proba=0.05)#

Estimate the baseline level of a 1D signal.

Parameters:
  • x (ndarray) – non-empty 1D array

  • noise (ndarray) – the noise level at each point in x. MUST have the same size as x.

  • min_proba (float) – number between 0 and 1, default=0.05

Return type:

ndarray

Returns:

an array that contains the baseline level at each point in x.

Algorithm#

The baseline is estimated by classifying each point in the signal as either signal or baseline. The baseline is obtained by interpolation of baseline points. See [ADD LINK] for a detailed explanation of how the method works.
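The interpolation step can be sketched as below. The classification rule that uses min_proba is not described here, so this sketch assumes a boolean mask of baseline points is already available; the function name sketch_interpolate_baseline and the is_baseline argument are hypothetical.

```python
import numpy as np


def sketch_interpolate_baseline(x, is_baseline):
    """Linearly interpolate the points flagged as baseline over the whole signal."""
    index = np.arange(x.size)
    # np.interp holds the first and last baseline values constant outside
    # the range covered by baseline points.
    return np.interp(index, index[is_baseline], x[is_baseline])


t = np.arange(20.0)
x = 0.5 * t             # linear drift acting as the true baseline
x[8:12] += 10.0         # a peak sitting on top of the drift
is_baseline = np.ones(x.size, dtype=bool)
is_baseline[8:12] = False  # assumed classification: the peak region is signal
baseline = sketch_interpolate_baseline(x, is_baseline)
```

In this example the interpolated baseline recovers the linear drift under the peak, because only the points outside the peak region contribute to the interpolation.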


tidyms2.algorithms.signal.estimate_noise(x, n_chunks=5, robust=True, min_chunk_size=200)#

Estimate the noise level in a 1D signal.

x is split into equally sized chunks, and the noise is estimated in each chunk assuming independent and identically distributed (i.i.d.) Gaussian noise. See [ADD LINK] for a detailed description of how the method works.

Parameters:
  • x (ndarray[tuple[Any, ...], dtype[TypeVar(FloatDtype, bound= floating)]]) – a 1D array

  • n_chunks (int) – number of chunks to create. The size of each chunk must be greater than min_chunk_size.

  • robust (bool) – if set to True, estimates the noise using the median absolute deviation. Otherwise, noise estimation uses the standard deviation.

  • min_chunk_size (int) – minimum size of a chunk. If the size of x is smaller than this value, the noise is estimated using the whole array.

Return type:

ndarray

Returns:

an array that contains the noise level at each point in x.
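A rough sketch of the chunk-wise estimation, under several assumptions: the estimator is applied directly to the raw values of each chunk (so it is only meaningful for a signal that fluctuates around a constant level), the median absolute deviation is scaled by 1.4826 to match the standard deviation of a Gaussian, and a single chunk is used whenever the chunks would be smaller than min_chunk_size. The function name sketch_estimate_noise is hypothetical and the actual method may differ.

```python
import numpy as np


def sketch_estimate_noise(x, n_chunks=5, robust=True, min_chunk_size=200):
    """Assign one noise estimate to every point of each chunk."""
    # Assumed fallback: use the whole array if the chunks would be too small.
    if x.size // n_chunks < min_chunk_size:
        n_chunks = 1
    noise = np.empty(x.size, dtype=float)
    for idx in np.array_split(np.arange(x.size), n_chunks):
        chunk = x[idx]
        if robust:
            # Median absolute deviation, scaled to estimate a Gaussian sigma.
            sigma = 1.4826 * np.median(np.abs(chunk - np.median(chunk)))
        else:
            sigma = np.std(chunk)
        noise[idx] = sigma
    return noise


rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=5000)
noise = sketch_estimate_noise(x)
```

The result has the same size as x and is piecewise constant, with one value per chunk close to the standard deviation used to generate the data.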

tidyms2.algorithms.signal.find_centroids(mz, spint, min_snr, min_distance)#

Find the centroids of a mass spectrum in profile mode.

Parameters:
  • mz (ndarray) – array of m/z in profile mode.

  • spint (ndarray) – array of spectral intensity in profile mode.

  • min_snr (float) – minimum peak signal-to-noise ratio

  • min_distance (float) – minimum m/z distance between consecutive centroids

Return type:

tuple[ndarray[tuple[Any, ...], dtype[TypeVar(FloatDtype, bound= floating)]], ndarray[tuple[Any, ...], dtype[TypeVar(FloatDtype, bound= floating)]]]

Returns:

a tuple consisting of: centroid_mz, an array with the centroid m/z of each peak, and centroid_int, an array with the area of each peak.
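To illustrate what a centroid and an area mean for a single profile peak, the sketch below computes an intensity-weighted mean m/z and a trapezoidal area over a given index range. Both definitions are assumptions made for illustration, as are the function name sketch_centroid and the start and end arguments; peak detection, min_snr and min_distance filtering are not covered.

```python
import numpy as np


def sketch_centroid(mz, spint, start, end):
    """Centroid m/z and area of the profile peak stored in mz[start:end]."""
    peak_mz = mz[start:end]
    peak_int = spint[start:end]
    # Assumed definition: intensity-weighted mean of the m/z values.
    centroid_mz = np.sum(peak_mz * peak_int) / np.sum(peak_int)
    # Assumed definition: area from the trapezoidal rule.
    area = np.sum(0.5 * (peak_int[1:] + peak_int[:-1]) * np.diff(peak_mz))
    return centroid_mz, area


mz = np.linspace(99.9, 100.1, 201)
spint = 1000.0 * np.exp(-0.5 * ((mz - 100.0) / 0.01) ** 2)  # synthetic peak
centroid_mz, area = sketch_centroid(mz, spint, 0, mz.size)
```

For this symmetric synthetic peak the centroid coincides with the peak maximum at m/z 100.0, and the area is close to the analytical value of the Gaussian integral.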

tidyms2.algorithms.signal.smooth(x, smoothing_strength)#

Smooth a signal using a Gaussian kernel.

Parameters:
  • smoothing_strength – standard deviation of the Gaussian kernel.
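Gaussian smoothing can be sketched with plain NumPy as a convolution with a normalized kernel. The function name sketch_gaussian_smooth is hypothetical, and the kernel truncation at four standard deviations, the edge padding, and the interpretation of smoothing_strength in units of samples are assumptions rather than documented behavior.

```python
import numpy as np


def sketch_gaussian_smooth(x, smoothing_strength):
    """Convolve x with a Gaussian kernel; smoothing_strength must be positive."""
    # Truncate the kernel at 4 standard deviations (assumption).
    radius = int(np.ceil(4 * smoothing_strength))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / smoothing_strength) ** 2)
    kernel /= kernel.sum()  # unit sum, so constant signals are preserved
    # Repeat the edge values so the output has the same size as the input.
    padded = np.pad(x, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


rng = np.random.default_rng(1)
noisy = rng.normal(size=500)
smoothed = sketch_gaussian_smooth(noisy, 2.0)
flat = sketch_gaussian_smooth(np.full(50, 3.0), 2.0)
```

A larger standard deviation averages over more neighboring points, which reduces noise at the cost of broadening and lowering narrow peaks.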