pyin

Description: libf0 pYIN implementation
Contributors: Sebastian Rosenzweig, Simon Schwär, Edgar Suárez, Meinard Müller
License: The MIT license, https://opensource.org/licenses/MIT
This file is part of libf0.
libf0.pyin.pyin(x, Fs=22050, N=2048, H=256, F_min=55.0, F_max=1760.0, R=10, thresholds=array([0.01, 0.02, ..., 0.98, 0.99]), beta_params=[1, 18], absolute_min_prob=0.01, voicing_prob=0.5)

Implementation of the pYIN F0-estimation algorithm [1].

[1] Matthias Mauch and Simon Dixon. “PYIN: A fundamental frequency estimator using probabilistic threshold distributions”. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2014): 659-663.

Parameters
  • x (ndarray) – Audio signal

  • Fs (int) – Sampling rate

  • N (int) – Window size

  • H (int) – Hop size

  • F_min (float or int) – Minimal frequency

  • F_max (float or int) – Maximal frequency

  • R (int) – Frequency resolution given in cents

  • thresholds (ndarray) – Range of thresholds (default: 0.01 to 0.99 in steps of 0.01)

  • beta_params (tuple or list) – Parameters of beta-distribution in the form [alpha, beta]

  • absolute_min_prob (float) – Probability of choosing the absolute minimum of the CMNDF

  • voicing_prob (float) – Prior probability of a frame being voiced

Returns

  • f0 (ndarray) – Estimated F0-trajectory

  • t (ndarray) – Time axis

  • conf (ndarray) – Confidence
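The parameters F_min, F_max, and R together suggest a logarithmically spaced frequency grid. The following is a minimal sketch of such a grid, assuming bins start at F_min and are spaced R cents apart. The helper name `cent_axis` is hypothetical and not part of libf0, and the library may construct its axis differently.

```python
import numpy as np

def cent_axis(F_min=55.0, F_max=1760.0, R=10):
    """Frequencies from F_min up to (roughly) F_max, spaced R cents apart."""
    # Number of R-cent steps between F_min and F_max (1200 cents per octave)
    n_bins = int(np.ceil(1200.0 * np.log2(F_max / F_min) / R))
    # Each step multiplies the frequency by 2**(R / 1200)
    return F_min * 2.0 ** (np.arange(n_bins) * R / 1200.0)

# Hypothetical usage with the default values from the signature above
F_axis = cent_axis()
```

With R=10, 120 bins span one octave, so bin 120 lies one octave above F_min.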

libf0.pyin.refine_estimates_yin(f0, p_orig, val_orig, Fs, tol)

Refine estimates using YIN CMNDF information.

Parameters
  • f0 (ndarray) – F0 in Hz

  • p_orig (ndarray) – Original lag as computed by YIN

  • val_orig (ndarray) – Original CMNDF values as computed by YIN

  • Fs (float) – Sampling frequency

  • tol (float) – Tolerance for refinements in cents

Returns

f0_refined – Refined F0-trajectory

Return type

ndarray
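To illustrate what a tolerance in cents means here, the sketch below replaces an F0 value by the frequency of the original YIN lag whenever the two agree to within tol cents. This is an assumption about the general idea, not the library's code: the helper `refine_within_tolerance` is hypothetical, and it ignores the CMNDF values (val_orig) that the real function also receives.

```python
import numpy as np

def refine_within_tolerance(f0, p_orig, Fs, tol):
    """Replace f0[n] by Fs / p_orig[n] when both agree to within tol cents."""
    f0 = np.asarray(f0, dtype=float)
    p_orig = np.asarray(p_orig, dtype=float)
    f0_refined = f0.copy()
    # Only compare frames with a positive estimate and a positive lag
    valid = (f0 > 0) & (p_orig > 0)
    f_yin = Fs / p_orig[valid]
    # Absolute deviation in cents between the two estimates
    dev = np.abs(1200.0 * np.log2(f_yin / f0[valid]))
    idx = np.flatnonzero(valid)[dev <= tol]
    f0_refined[idx] = Fs / p_orig[idx]
    return f0_refined

# Hypothetical input: an unvoiced frame (0 Hz), a near match, and a mismatch
refined = refine_within_tolerance([0.0, 220.0, 440.0], [100.0, 100.1, 80.0],
                                  Fs=22050, tol=10.0)
```

Only the second frame is changed: its YIN lag corresponds to a frequency about 2 cents away from 220 Hz, which is inside the 10-cent tolerance.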

libf0.pyin.probabilistic_thresholding(cmndf, thresholds, p_min, p_max, absolute_min_prob, F_axis, Fs, beta_distr, parabolic_interp=True)

Probabilistic thresholding of the YIN CMNDF.

Parameters
  • cmndf (ndarray) – Cumulative Mean Normalized Difference Function

  • thresholds (ndarray) – Array of thresholds for CMNDF

  • p_min (float) – Period corresponding to the lower frequency bound

  • p_max (float) – Period corresponding to the upper frequency bound

  • absolute_min_prob (float) – Probability of choosing the absolute minimum

  • F_axis (ndarray) – Frequency axis

  • Fs (float) – Sampling rate

  • beta_distr (ndarray) – Beta distribution that defines mapping between thresholds and probabilities

  • parabolic_interp (bool) – Switch to activate/deactivate parabolic interpolation

Returns

  • O_m (ndarray) – Observations for given frame

  • lag_thr (ndarray) – Computed lags for every threshold

  • val_thr (ndarray) – CMNDF values for computed lag
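The idea of probabilistic thresholding can be sketched as follows, following the description in [1]: every threshold selects one lag, and the lag receives the weight that the beta distribution assigns to that threshold. This is a simplified sketch under stated assumptions, not the libf0 code. The helper `threshold_candidates` is hypothetical, it works on integer lags only (no parabolic interpolation), it does not map lags to F_axis, and the weights are an unnormalized Beta(1, 18) density evaluated at the thresholds and then normalized.

```python
import numpy as np

def threshold_candidates(cmndf, thresholds, weights, p_min, p_max, absolute_min_prob):
    """Accumulate a probability per lag by applying every threshold to one CMNDF frame."""
    prob = np.zeros(len(cmndf))
    search = cmndf[p_min:p_max + 1]
    lag_abs_min = p_min + int(np.argmin(search))
    for thr, w in zip(thresholds, weights):
        below = np.flatnonzero(search < thr)
        if below.size > 0:
            # Start at the first lag below the threshold and walk to its local minimum
            lag = p_min + int(below[0])
            while lag < p_max and cmndf[lag + 1] < cmndf[lag]:
                lag += 1
            prob[lag] += w
        else:
            # No dip below this threshold: fall back to the absolute minimum,
            # but only with the reduced weight absolute_min_prob
            prob[lag_abs_min] += w * absolute_min_prob
    return prob

# Weights for the thresholds (assumption: normalized Beta(alpha, beta) density)
alpha, beta = 1, 18
thresholds = np.arange(0.01, 1, 0.01)
weights = thresholds ** (alpha - 1) * (1 - thresholds) ** (beta - 1)
weights /= weights.sum()

# Synthetic CMNDF frame with two dips, at lags 20 and 40
cmndf = np.ones(50)
cmndf[[19, 21, 39, 41]] = 0.3
cmndf[20] = 0.055
cmndf[40] = 0.025
prob = threshold_candidates(cmndf, thresholds, weights, p_min=5, p_max=45,
                            absolute_min_prob=0.01)
```

In this example only lags 20 and 40 receive probability mass, and the total is slightly below 1 because the two smallest thresholds find no dip and contribute with the reduced weight.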

libf0.pyin.yin_multi_thr(x, Fs, N, H, F_min, F_max, thresholds, beta_distr, absolute_min_prob, F_axis, voicing_prob, parabolic_interp=True)

Applies YIN multiple times to the input audio signal, using a different CMNDF threshold each time.

Parameters
  • x (ndarray) – Input audio signal

  • Fs (int) – Sampling rate

  • N (int) – Window size

  • H (int) – Hop size

  • F_min (float) – Lower frequency bound

  • F_max (float) – Upper frequency bound

  • thresholds (ndarray) – Array of thresholds

  • beta_distr (ndarray) – Beta distribution that defines mapping between thresholds and probabilities

  • absolute_min_prob (float) – Probability of choosing the absolute minimum

  • F_axis (ndarray) – Frequency axis

  • voicing_prob (float) – Probability of a frame being voiced

  • parabolic_interp (bool) – Switch to activate/deactivate parabolic interpolation

Returns

  • O (ndarray) – Observations based on YIN output

  • rms (ndarray) – Root mean square power

  • p_orig (ndarray) – Original YIN period estimates

  • val_orig (ndarray) – CMNDF values corresponding to original YIN period estimates
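For readers unfamiliar with the CMNDF and the RMS value mentioned above, here is a minimal sketch of both for a single frame. It uses the standard YIN definition d'(tau) = d(tau) / ((1/tau) * sum of d(j) for j = 1..tau) with d'(0) = 1. The helper `cmndf_frame` is hypothetical, and the library's framing and windowing may differ.

```python
import numpy as np

def cmndf_frame(frame, p_max):
    """Cumulative mean normalized difference function for lags 0..p_max."""
    W = len(frame) - p_max  # number of samples compared at every lag
    d = np.zeros(p_max + 1)
    for tau in range(1, p_max + 1):
        diff = frame[:W] - frame[tau:tau + W]
        d[tau] = np.sum(diff ** 2)  # difference function d(tau)
    cmndf = np.ones(p_max + 1)  # by definition the value at lag 0 is 1
    cumsum = np.cumsum(d[1:])
    # Normalize d(tau) by its running mean; guard against division by zero
    cmndf[1:] = d[1:] * np.arange(1, p_max + 1) / np.maximum(cumsum, 1e-12)
    return cmndf

# Synthetic frame: a sinusoid with a period of exactly 50 samples
n = np.arange(400)
frame = np.sin(2 * np.pi * n / 50)
cmndf = cmndf_frame(frame, p_max=100)
rms = np.sqrt(np.mean(frame ** 2))  # root mean square of the frame
```

The CMNDF has a deep dip at lag 50, the period of the sinusoid, which is the lag YIN would select for any reasonable threshold.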

libf0.pyin.compute_transition_matrix(M, triang_distr)

Compute a transition matrix for pYIN Viterbi decoding.

Parameters
  • M (int) – Matrix dimension

  • triang_distr (ndarray) – (Triangular) distribution, defining tolerance for jumps deviating from the main diagonal

Returns

A – Transition matrix

Return type

ndarray
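One way to picture such a matrix is a band around the main diagonal whose shape is given by the distribution. The sketch below builds that band and normalizes each row. It is an illustration under assumptions, not the libf0 implementation: the helper `banded_transition_matrix` is hypothetical, the distribution is assumed to have odd length and be centered on offset 0, and any handling of voiced and unvoiced states is left out.

```python
import numpy as np

def banded_transition_matrix(M, triang_distr):
    """M x M matrix whose rows place triang_distr around the main diagonal."""
    triang_distr = np.asarray(triang_distr, dtype=float)
    half = len(triang_distr) // 2  # assumes an odd-length, centered distribution
    A = np.zeros((M, M))
    for i in range(M):
        for k, w in enumerate(triang_distr):
            j = i + k - half  # target state for offset k - half
            if 0 <= j < M:
                A[i, j] = w
        A[i] /= A[i].sum()  # renormalize rows that were clipped at the border
    return A

# Hypothetical triangular distribution allowing jumps of up to two states
triang = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
A = banded_transition_matrix(6, triang / triang.sum())
```

Each row sums to 1, staying in the same state is most likely, and jumps of more than two states have probability 0.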

libf0.pyin.viterbi_pyin(A, C, O)

Viterbi algorithm (pYIN variant)

Parameters
  • A (ndarray) – State transition probability matrix of dimension I x I

  • C (ndarray) – Initial state distribution of dimension I x 1

  • O (ndarray) – Likelihood matrix of dimension I x N

Returns

idxs – Optimal state sequence of length N

Return type

ndarray

libf0.pyin.viterbi_log_likelihood(A, C, B_O)

Viterbi algorithm (log variant) for solving the uncovering problem

Notebook: C5/C5S3_Viterbi.ipynb

Parameters
  • A (ndarray) – State transition probability matrix of dimension I x I

  • C (ndarray) – Initial state distribution of dimension I

  • B_O (ndarray) – Likelihood matrix of dimension I x N

Returns

S_opt – Optimal state sequence of length N

Return type

ndarray
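A log-domain Viterbi decoder with the same inputs and output can be sketched as below. This is a generic textbook formulation written for illustration, not a copy of the library's code. The function name `viterbi_log` is hypothetical, and adding a tiny constant before taking logarithms is one common way to avoid log(0).

```python
import numpy as np

def viterbi_log(A, C, B_O):
    """Most likely state sequence, computed with log probabilities."""
    I, N = B_O.shape
    tiny = np.finfo(float).tiny  # avoids log(0)
    A_log = np.log(A + tiny)
    C_log = np.log(C + tiny)
    B_log = np.log(B_O + tiny)
    D = np.zeros((I, N))                 # accumulated log probabilities
    E = np.zeros((I, N - 1), dtype=int)  # best predecessor of each state
    D[:, 0] = C_log + B_log[:, 0]
    for n in range(1, N):
        # scores[i, j]: best path ending in state i at n-1, then moving to j
        scores = A_log + D[:, n - 1][:, None]
        E[:, n - 1] = np.argmax(scores, axis=0)
        D[:, n] = scores.max(axis=0) + B_log[:, n]
    # Backtracking from the most likely final state
    S_opt = np.zeros(N, dtype=int)
    S_opt[-1] = np.argmax(D[:, -1])
    for n in range(N - 2, -1, -1):
        S_opt[n] = E[S_opt[n + 1], n]
    return S_opt

# Two states that prefer to stay where they are
A = np.array([[0.9, 0.1], [0.1, 0.9]])
C = np.array([0.5, 0.5])
# A single weak outlier in frame 2 is smoothed away ...
S_smooth = viterbi_log(A, C, np.array([[0.9, 0.9, 0.4, 0.9, 0.9],
                                       [0.1, 0.1, 0.6, 0.1, 0.1]]))
# ... while a sustained change of the likelihoods is followed
S_switch = viterbi_log(A, C, np.array([[0.9, 0.9, 0.1, 0.1],
                                       [0.1, 0.1, 0.9, 0.9]]))
```

Working with sums of logarithms instead of products of probabilities avoids numerical underflow for long sequences.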

libf0.pyin.delete_numba(arr, num)

Delete number from array, Numba compatible. Inspired by: https://stackoverflow.com/questions/53602663/delete-a-row-in-numpy-array-in-numba
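A deletion that avoids `np.delete` can be written with an explicit loop and a boolean mask, as in the sketch below. The helper `delete_value` is hypothetical and shown in plain NumPy. It assumes a one-dimensional array and removes every occurrence of the value, which may or may not match the behavior of `delete_numba`. Compiling it with Numba is likewise left as an assumption.

```python
import numpy as np

def delete_value(arr, num):
    """Return a copy of a 1-D array without the entries equal to num."""
    keep = np.ones(arr.shape[0], dtype=np.bool_)
    for i in range(arr.shape[0]):
        if arr[i] == num:
            keep[i] = False  # mark this entry for removal
    return arr[keep]

result = delete_value(np.array([3, 1, 3, 2]), 3)
```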