salience
- libf0.salience.salience(x, Fs=22050, N=2048, H=256, F_min=55.0, F_max=1760.0, R=10.0, num_harm=10, freq_smooth_len=11, alpha=0.9, gamma=0.0, constraint_region=None, tol=5, score_low=0.01, score_high=1.0)
Implementation of a salience-based F0-estimation algorithm using pitch contours, inspired by Melodia.
- Parameters:
x (ndarray) – Audio signal
Fs (int) – Sampling rate
N (int) – Window size
H (int) – Hop size
F_min (float or int) – Minimal frequency
F_max (float or int) – Maximal frequency
R (int) – Frequency resolution given in cents
num_harm (int) – Number of harmonics (Default value = 10)
freq_smooth_len (int) – Filter length for vertical smoothing (Default value = 11)
alpha (float) – Weighting parameter for harmonics (Default value = 0.9)
gamma (float) – Logarithmic compression factor (Default value = 0.0)
constraint_region (None or ndarray) – Constraint regions, row-format: (t_start_sec, t_end_sec, f_start_hz, f_end_hz) (Default value = None)
tol (int) – Tolerance parameter for transition matrix (Default value = 5)
score_low (float) – Score (low) for transition matrix (Default value = 0.01)
score_high (float) – Score (high) for transition matrix (Default value = 1.0)
- Returns:
f0 (ndarray) – Estimated F0-trajectory
T_coef (ndarray) – Time axis
sal (ndarray) – Salience value of estimated F0
See also
[FMP] Notebook: C8/C8S2_SalienceRepresentation.ipynb
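A minimal usage sketch for the function above, assuming libf0 and NumPy are installed and that the function is importable under the documented path `libf0.salience.salience`. The helpers `make_tone` and `estimate_f0` are hypothetical names introduced only for illustration, and the constraint-region values are arbitrary.

```python
import math


def make_tone(f0_hz=220.0, dur_sec=1.0, Fs=22050, num_harm=5):
    """Synthesize a harmonic test tone as a list of floats (hypothetical helper)."""
    n_samples = int(dur_sec * Fs)
    x = []
    for n in range(n_samples):
        t = n / Fs
        # Sum a few harmonics with 1/h amplitude decay.
        x.append(sum(math.sin(2 * math.pi * h * f0_hz * t) / h
                     for h in range(1, num_harm + 1)))
    peak = max(abs(v) for v in x)
    return [v / peak for v in x]


def estimate_f0(x, Fs=22050):
    """Call the documented function with some of its documented defaults spelled out."""
    # Imported lazily so that make_tone works without libf0 or NumPy installed.
    import numpy as np
    from libf0.salience import salience

    # One constraint region in the documented row format
    # (t_start_sec, t_end_sec, f_start_hz, f_end_hz); the values are illustrative.
    constraint_region = np.array([[0.0, 1.0, 150.0, 300.0]])
    f0, T_coef, sal = salience(np.asarray(x), Fs=Fs, N=2048, H=256,
                               F_min=55.0, F_max=1760.0, R=10.0,
                               constraint_region=constraint_region)
    return f0, T_coef, sal
```

Calling `estimate_f0(make_tone())` should return the three documented arrays (F0 trajectory, time axis, salience values); passing `constraint_region=None` instead tracks over the full frequency range.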
- libf0.salience.compute_salience_rep(x, Fs, N, H, F_min, F_max, R, num_harm, freq_smooth_len, alpha, gamma)
Compute salience representation [FMP, Eq. (8.56)]
- Parameters:
x (ndarray) – Audio signal
Fs (int) – Sampling rate
N (int) – Window size
H (int) – Hop size
F_min (float or int) – Minimal frequency
F_max (float or int) – Maximal frequency
R (int) – Frequency resolution given in cents
num_harm (int) – Number of harmonics
freq_smooth_len (int) – Filter length for vertical smoothing
alpha (float) – Weighting parameter for harmonics
gamma (float) – Logarithmic compression factor
- Returns:
Z (ndarray) – Salience representation
F_coef_hertz (ndarray) – Frequency axis in Hz
See also
[FMP] Notebook: C8/C8S2_SalienceRepresentation.ipynb
- libf0.salience.compute_y_lf_if_bin_eff(X, Fs, N, H, F_min, F_max, R)
Binned log-frequency spectrogram with variable frequency resolution based on instantaneous frequency. This implementation is more efficient than the one in FMP.
- Parameters:
X (ndarray) – Complex spectrogram
Fs (int) – Sampling rate in Hz
N (int) – Window size
H (int) – Hop size
F_min (float or int) – Minimal frequency
F_max (float or int) – Maximal frequency
R (int) – Frequency resolution given in cents
- Returns:
Y_LF_IF_bin (ndarray) – Binned log-frequency spectrogram using instantaneous frequency (shape: [freq, time])
F_coef_hertz (ndarray) – Frequency axis in Hz
- libf0.salience.compute_salience_from_logfreq_spec(lf_spec, R, n_harmonics, alpha, beta, gamma, harmonic_win_len=11)
Compute salience representation using harmonic summation following [1]
- [1] J. Salamon and E. Gomez, “Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 6, pp. 1759–1770, Aug. 2012.
- Parameters:
lf_spec (ndarray) – (F, T) log-spectrogram
R (int) – Frequency resolution given in cents
n_harmonics (int) – Number of harmonics
alpha (float) – Weighting parameter for harmonics
beta (float) – Compression parameter for spectrogram magnitudes
gamma (float) – Magnitude threshold
harmonic_win_len (int) – Length of a frequency weighting window in bins
- Returns:
Z – (F, T) salience representation of the input spectrogram
- Return type:
ndarray
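The idea of harmonic summation on a log-frequency axis can be illustrated with a simplified sketch. It is not the library's implementation: it processes a single frame, omits the magnitude threshold (`gamma`) and the frequency weighting window (`harmonic_win_len`), and the name `harmonic_sum_sketch` is hypothetical. It assumes only that harmonic h lies 1200·log2(h) cents above the fundamental, which corresponds to a fixed bin offset at resolution R.

```python
import math


def harmonic_sum_sketch(lf_frame, R, n_harmonics, alpha, beta=1.0):
    """Salience per bin for one log-frequency frame (simplified harmonic summation)."""
    n_bins = len(lf_frame)
    # Harmonic h lies 1200*log2(h) cents above the fundamental; convert to bins.
    offsets = [int(round(1200.0 * math.log2(h) / R))
               for h in range(1, n_harmonics + 1)]
    sal = [0.0] * n_bins
    for b in range(n_bins):
        for h, off in enumerate(offsets, start=1):
            if b + off < n_bins:
                # Weight harmonic h by alpha^(h-1) and compress magnitudes by beta.
                sal[b] += alpha ** (h - 1) * lf_frame[b + off] ** beta
    return sal


# Usage: semitone bins (R = 100 cents), fundamental at bin 5 plus three harmonics.
frame = [0.0] * 40
for k in (5, 17, 24, 29):
    frame[k] = 1.0
sal = harmonic_sum_sketch(frame, R=100, n_harmonics=4, alpha=0.8)
best_bin = max(range(len(sal)), key=lambda b: sal[b])
```

In this toy frame the salience maximum falls on the fundamental's bin, because only that bin collects energy from all four harmonics.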
- libf0.salience.define_transition_matrix(B, tol=0, score_low=0.01, score_high=1.0)
Generate transition matrix for dynamic programming
- Parameters:
B (int) – Number of bins
tol (int) – Tolerance parameter for transition matrix (Default value = 0)
score_low (float) – Score (low) for transition matrix (Default value = 0.01)
score_high (float) – Score (high) for transition matrix (Default value = 1.0)
- Returns:
T – (B, B) Transition matrix
- Return type:
ndarray
See also
[FMP] Notebook: C8/C8S2_FundFreqTracking.ipynb
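As a rough illustration of the parameters above, the following sketch builds a banded matrix in which transitions between bins at most `tol` apart receive `score_high` and all others receive `score_low`. This assumes the banded structure described in the referenced FMP notebook; the function name `transition_matrix_sketch` is hypothetical, and nested lists stand in for an ndarray.

```python
def transition_matrix_sketch(B, tol=0, score_low=0.01, score_high=1.0):
    """(B, B) matrix as nested lists: high score within tol bins of the diagonal."""
    return [[score_high if abs(i - j) <= tol else score_low for j in range(B)]
            for i in range(B)]


# Usage: 4 bins, jumps of at most one bin are rated high.
T = transition_matrix_sketch(4, tol=1)
```

A larger `tol` permits wider frequency jumps between consecutive frames, while a smaller `score_low` penalizes jumps outside the band more strongly.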
- libf0.salience.compute_trajectory_dp(Z, T)
Trajectory tracking using dynamic programming
- Parameters:
Z (ndarray) – Salience representation
T (ndarray) – Transition matrix
- Returns:
eta_DP – Trajectory indices
- Return type:
ndarray
See also
[FMP] Notebook: C8/C8S2_FundFreqTracking.ipynb
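The tracking step can be sketched as a Viterbi-style dynamic program in the log domain: accumulate the best score per bin frame by frame, store backpointers, then backtrack from the best final bin. This is an illustrative pure-Python sketch under those assumptions, not the library's code; `trajectory_dp_sketch` and the `eps` offset are hypothetical.

```python
import math


def trajectory_dp_sketch(Z, T, eps=1e-12):
    """Bin index per frame maximizing summed log-salience and log-transition scores.

    Z is a list of B rows (bins) with N frames each; T is a B x B list of lists.
    """
    B, N = len(Z), len(Z[0])
    log_Z = [[math.log(v + eps) for v in row] for row in Z]
    log_T = [[math.log(v + eps) for v in row] for row in T]

    D = [[0.0] * N for _ in range(B)]  # accumulated scores
    E = [[0] * N for _ in range(B)]    # E[b][n]: best predecessor of bin b at frame n
    for b in range(B):
        D[b][0] = log_Z[b][0]
    for n in range(1, N):
        for b in range(B):
            best_prev = max(range(B), key=lambda p: log_T[p][b] + D[p][n - 1])
            E[b][n] = best_prev
            D[b][n] = log_T[best_prev][b] + D[best_prev][n - 1] + log_Z[b][n]

    # Backtracking from the best-scoring bin in the last frame.
    eta = [0] * N
    eta[N - 1] = max(range(B), key=lambda b: D[b][N - 1])
    for n in range(N - 1, 0, -1):
        eta[n - 1] = E[eta[n]][n]
    return eta


# Usage: 3 bins, 4 frames, with an outlier in bin 2 at frame 2.
Z = [[0.9, 0.8, 0.1, 0.9],
     [0.1, 0.1, 0.2, 0.1],
     [0.0, 0.1, 0.9, 0.0]]
T_strict = [[1.0 if i == j else 0.01 for j in range(3)] for i in range(3)]
T_free = [[1.0] * 3 for _ in range(3)]
eta_strict = trajectory_dp_sketch(Z, T_strict)
eta_free = trajectory_dp_sketch(Z, T_free)
```

With the strict matrix the trajectory stays in bin 0 despite the outlier, whereas the uniform matrix reduces the procedure to a frame-wise maximum.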
- libf0.salience.compute_trajectory_cr(Z, T_coef, F_coef_hertz, constraint_region=None, tol=5, score_low=0.01, score_high=1.0)
Trajectory tracking with constraint regions
- Parameters:
Z (ndarray) – Salience representation
T_coef (ndarray) – Time axis
F_coef_hertz (ndarray) – Frequency axis in Hz
constraint_region (ndarray or None) – Constraint regions, row-format: (t_start_sec, t_end_sec, f_start_hz, f_end_hz) (Default value = None)
tol (int) – Tolerance parameter for transition matrix (Default value = 5)
score_low (float) – Score (low) for transition matrix (Default value = 0.01)
score_high (float) – Score (high) for transition matrix (Default value = 1.0)
- Returns:
eta – Trajectory indices, unvoiced frames are indicated with -1
- Return type:
ndarray
See also
[FMP] Notebook: C8/C8S2_FundFreqTracking.ipynb
- libf0.salience.frequency_to_bin_index(F, R, F_ref)
Binning function with variable frequency resolution. Note: Indexing starts with 0 (as opposed to [FMP, Eq. (8.49)]).
- Parameters:
F (float or ndarray) – Frequency in Hz
R (float) – Frequency resolution in cents
F_ref (float) – Reference frequency in Hz
- Returns:
bin_index – Index for bin (starting with index 0)
- Return type:
int
See also
[FMP] Notebook: C8/C8S2_SalienceRepresentation.ipynb
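For orientation, a zero-based binning of this kind can be sketched as follows, assuming the FMP definition Bin(ω) = ⌊(1200/R)·log2(ω/ω_ref) + 1.5⌋ shifted by one so that the reference frequency maps to index 0. Both function names are hypothetical, and the inverse mapping is added only for illustration.

```python
import math


def frequency_to_bin_index_sketch(F, R, F_ref):
    """Zero-based log-frequency bin for frequency F in Hz (illustrative sketch)."""
    # Distance from F_ref in cents, divided by the bin width R, rounded to nearest.
    return int(math.floor((1200.0 / R) * math.log2(F / F_ref) + 0.5))


def bin_index_to_frequency_sketch(b, R, F_ref):
    """Center frequency in Hz of zero-based bin b (hypothetical inverse helper)."""
    return F_ref * 2.0 ** (b * R / 1200.0)


# Usage with R = 10 cents and F_ref = 55 Hz: one octave spans 120 bins.
b_ref = frequency_to_bin_index_sketch(55.0, 10.0, 55.0)
b_octave = frequency_to_bin_index_sketch(110.0, 10.0, 55.0)
b_top = frequency_to_bin_index_sketch(1760.0, 10.0, 55.0)
f_octave = bin_index_to_frequency_sketch(120, 10.0, 55.0)
```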