salience
- libf0.salience.salience(x, Fs=22050, N=2048, H=256, F_min=55.0, F_max=1760.0, R=10.0, num_harm=10, freq_smooth_len=11, alpha=0.9, gamma=0.0, constraint_region=None, tol=5, score_low=0.01, score_high=1.0)
Implementation of a salience-based F0-estimation algorithm using pitch contours, inspired by Melodia [1].
- [1] Justin Salamon and Emilia Gómez, “Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics.” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 6, pp. 1759–1770, Aug. 2012.
- Parameters
x (ndarray) – Audio signal
Fs (int) – Sampling rate
N (int) – Window size
H (int) – Hop size
F_min (float or int) – Minimal frequency
F_max (float or int) – Maximal frequency
R (int) – Frequency resolution given in cents
num_harm (int) – Number of harmonics (Default value = 10)
freq_smooth_len (int) – Filter length for vertical smoothing (Default value = 11)
alpha (float) – Weighting parameter for harmonics (Default value = 0.9)
gamma (float) – Logarithmic compression factor (Default value = 0.0)
constraint_region (None or ndarray) – Constraint regions, row-format: (t_start_sec, t_end_sec, f_start_hz, f_end_hz) (Default value = None)
tol (int) – Tolerance parameter for transition matrix (Default value = 5)
score_low (float) – Score (low) for transition matrix (Default value = 0.01)
score_high (float) – Score (high) for transition matrix (Default value = 1.0)
- Returns
f0 (ndarray) – Estimated F0-trajectory
T_coef (ndarray) – Time axis
sal (ndarray) – Salience value of estimated F0
See also
[FMP] Notebook: C8/C8S2_SalienceRepresentation.ipynb
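The relation between a tracked bin-index trajectory and the returned `f0` and `sal` arrays can be pictured with a short sketch. This is not the library's code: the helper name `trajectory_to_f0` is hypothetical, and reporting unvoiced frames (index -1) as 0 Hz with zero salience is an assumption made for illustration.

```python
import numpy as np

def trajectory_to_f0(eta, F_coef_hertz, Z):
    """Hypothetical helper: map bin indices to F0 values (Hz) and salience scores."""
    eta = np.asarray(eta, dtype=np.int64)
    voiced = eta >= 0                      # -1 marks unvoiced frames
    frames = np.nonzero(voiced)[0]         # frame indices of voiced frames
    f0 = np.zeros(len(eta))                # assumption: unvoiced frames get 0 Hz
    sal = np.zeros(len(eta))
    f0[voiced] = F_coef_hertz[eta[voiced]]
    sal[voiced] = Z[eta[voiced], frames]   # salience at the selected bin per frame
    return f0, sal

# Toy data: 3 frequency bins, 3 frames
F_coef_hertz = np.array([55.0, 110.0, 220.0])
Z = np.array([[0.9, 0.1, 0.2],
              [0.1, 0.3, 0.1],
              [0.0, 0.2, 0.8]])
eta = np.array([0, -1, 2])
f0, sal = trajectory_to_f0(eta, F_coef_hertz, Z)
```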
- libf0.salience.compute_salience_rep(x, Fs, N, H, F_min, F_max, R, num_harm, freq_smooth_len, alpha, gamma)
Compute salience representation [FMP, Eq. (8.56)]
- Parameters
x (ndarray) – Audio signal
Fs (int) – Sampling rate
N (int) – Window size
H (int) – Hop size
F_min (float or int) – Minimal frequency
F_max (float or int) – Maximal frequency
R (int) – Frequency resolution given in cents
num_harm (int) – Number of harmonics
freq_smooth_len (int) – Filter length for vertical smoothing
alpha (float) – Weighting parameter for harmonics
gamma (float) – Logarithmic compression factor
- Returns
Z (ndarray) – Salience representation
F_coef_hertz (ndarray) – Frequency axis in Hz
See also
[FMP] Notebook: C8/C8S2_SalienceRepresentation.ipynb
- libf0.salience.compute_y_lf_if_bin_eff(X, Fs, N, H, F_min, F_max, R)
Binned log-frequency spectrogram with variable frequency resolution based on instantaneous frequency; a more efficient implementation than the one in FMP
- Parameters
X (ndarray) – Complex spectrogram
Fs (int) – Sampling rate in Hz
N (int) – Window size
H (int) – Hop size
F_min (float or int) – Minimal frequency
F_max (float or int) – Maximal frequency
R (int) – Frequency resolution given in cents
- Returns
Y_LF_IF_bin (ndarray) – Binned log-frequency spectrogram using instantaneous frequency (shape: [freq, time])
F_coef_hertz (ndarray) – Frequency axis in Hz
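A log-frequency axis with a resolution of R cents can be sketched as follows. The helper name `log_frequency_axis` is hypothetical, and both the bin count and the choice of F_min as the center of bin 0 are assumptions; the library may construct its axis differently.

```python
import numpy as np

def log_frequency_axis(F_min=55.0, F_max=1760.0, R=10.0):
    """Hypothetical helper: center frequencies (Hz) of bins spaced R cents apart."""
    # Assumption: bin 0 is centered at F_min and the last bin is the one nearest F_max
    num_bins = int(np.floor(1200.0 / R * np.log2(F_max / F_min) + 0.5)) + 1
    return F_min * 2.0 ** (np.arange(num_bins) * R / 1200.0)

F_coef_hertz = log_frequency_axis()
```

With F_min = 55 Hz, F_max = 1760 Hz and R = 10 cents, the range spans five octaves (6000 cents), so this sketch yields 601 bins, with one octave every 120 bins.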
- libf0.salience.compute_salience_from_logfreq_spec(lf_spec, R, n_harmonics, alpha, beta, gamma, harmonic_win_len=11)
Compute salience representation using harmonic summation following [1]
- [1] J. Salamon and E. Gómez, “Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics.” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 6, pp. 1759–1770, Aug. 2012.
- Parameters
lf_spec (ndarray) – (F, T) log-spectrogram
R (int) – Frequency resolution given in cents
n_harmonics (int) – Number of harmonics
alpha (float) – Weighting parameter for harmonics
beta (float) – Compression parameter for spectrogram magnitudes
gamma (float) – Magnitude threshold
harmonic_win_len (int) – Length of a frequency weighting window in bins
- Returns
Z – (F, T) salience representation of the input spectrogram
- Return type
ndarray
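The idea of harmonic summation on a log-frequency axis can be sketched in a few lines. This is a deliberately simplified illustration, not the documented function: the name `harmonic_summation_sketch` is hypothetical, and magnitude compression (beta), thresholding (gamma) and the frequency weighting window (harmonic_win_len) are all omitted. On a log-frequency axis the h-th harmonic lies a fixed 1200·log2(h)/R bins above the fundamental, so summation reduces to adding shifted copies of the spectrogram.

```python
import numpy as np

def harmonic_summation_sketch(lf_spec, R, n_harmonics, alpha):
    """Hypothetical helper: weighted sum of harmonics on a log-frequency axis."""
    F, T = lf_spec.shape
    Z = np.zeros((F, T))
    for h in range(1, n_harmonics + 1):
        # Offset (in bins) of the h-th harmonic above the fundamental
        shift = int(round(1200.0 * np.log2(h) / R))
        if shift >= F:
            break
        # Harmonic h contributes with weight alpha**(h-1)
        Z[:F - shift, :] += alpha ** (h - 1) * lf_spec[shift:, :]
    return Z

# Toy spectrogram: energy at bin 0 and one octave above (120 bins at R = 10 cents)
lf_spec = np.zeros((200, 1))
lf_spec[0, 0] = 1.0
lf_spec[120, 0] = 1.0
Z = harmonic_summation_sketch(lf_spec, R=10, n_harmonics=2, alpha=0.5)
```

In this toy case bin 0 collects its own energy plus the weighted second harmonic, so it scores higher than bin 120, which has no harmonic support within the axis.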
- libf0.salience.define_transition_matrix(B, tol=0, score_low=0.01, score_high=1.0)
Generate transition matrix for dynamic programming
- Parameters
B (int) – Number of bins
tol (int) – Tolerance parameter for transition matrix (Default value = 0)
score_low (float) – Score (low) for transition matrix (Default value = 0.01)
score_high (float) – Score (high) for transition matrix (Default value = 1.0)
- Returns
T – (B, B) Transition matrix
- Return type
ndarray
See also
[FMP] Notebook: C8/C8S2_FundFreqTracking.ipynb
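One plausible reading of the parameters is a banded matrix: transitions between bins at most tol apart receive score_high, and all others receive score_low. The sketch below builds such a matrix with plain NumPy; the function name is hypothetical and the banded structure is an assumption based on the parameter descriptions above.

```python
import numpy as np

def transition_matrix_sketch(B, tol=0, score_low=0.01, score_high=1.0):
    """Hypothetical helper: (B, B) matrix with score_high within +/- tol bins of the diagonal."""
    idx = np.arange(B)
    distance = np.abs(idx[:, None] - idx[None, :])   # bin distance |i - j|
    return np.where(distance <= tol, score_high, score_low)

T = transition_matrix_sketch(4, tol=1)
```

A larger tol permits faster pitch changes between consecutive frames, while a smaller score_low penalizes jumps outside the band more strongly.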
- libf0.salience.compute_trajectory_dp(Z, T)
Trajectory tracking using dynamic programming
- Parameters
Z (ndarray) – Salience representation
T (ndarray) – Transition matrix
- Returns
eta_DP – Trajectory indices
- Return type
ndarray
See also
[FMP] Notebook: C8/C8S2_FundFreqTracking.ipynb
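Trajectory tracking by dynamic programming can be sketched as a Viterbi-style recursion in the log domain. The code below is a minimal illustration under stated assumptions, not the library's implementation: the function name is hypothetical, a small constant is added to Z before taking the logarithm, and T is assumed to be symmetric with strictly positive entries.

```python
import numpy as np

def track_trajectory_dp_sketch(Z, T):
    """Hypothetical helper: best path through Z (bins x frames) given transition scores T."""
    B, N = Z.shape
    Z_log = np.log(Z + np.finfo(np.float32).eps)   # avoid log(0)
    T_log = np.log(T)
    D = np.zeros((B, N))                           # accumulated scores
    E = np.zeros((B, N - 1), dtype=np.int64)       # best predecessor per bin and frame
    D[:, 0] = Z_log[:, 0]
    for n in range(1, N):
        # cand[b, c]: score of reaching bin b at frame n from bin c at frame n-1
        cand = T_log + D[:, n - 1][None, :]
        E[:, n - 1] = np.argmax(cand, axis=1)
        D[:, n] = cand[np.arange(B), E[:, n - 1]] + Z_log[:, n]
    # Backtracking from the best final bin
    eta = np.zeros(N, dtype=np.int64)
    eta[-1] = np.argmax(D[:, -1])
    for n in range(N - 2, -1, -1):
        eta[n] = E[eta[n + 1], n]
    return eta

# Toy salience: frame 1 has a spurious peak in bin 4
Z = np.full((5, 3), 0.1)
Z[0, 0], Z[0, 1], Z[4, 1], Z[0, 2] = 1.0, 0.9, 1.0, 1.0
idx = np.arange(5)
T = np.where(np.abs(idx[:, None] - idx[None, :]) <= 0, 1.0, 0.01)  # tol = 0
eta_dp = track_trajectory_dp_sketch(Z, T)
eta_max = np.argmax(Z, axis=0)   # frame-wise maximum, for comparison
```

In this toy example the frame-wise maximum jumps to the spurious peak in the middle frame, whereas the dynamic-programming path stays in bin 0 because the jump is penalized by the low transition score.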
- libf0.salience.compute_trajectory_cr(Z, T_coef, F_coef_hertz, constraint_region=None, tol=5, score_low=0.01, score_high=1.0)
Trajectory tracking with constraint regions
- Parameters
Z (ndarray) – Salience representation
T_coef (ndarray) – Time axis
F_coef_hertz (ndarray) – Frequency axis in Hz
constraint_region (ndarray or None) – Constraint regions, row-format: (t_start_sec, t_end_sec, f_start_hz, f_end_hz) (Default value = None)
tol (int) – Tolerance parameter for transition matrix (Default value = 5)
score_low (float) – Score (low) for transition matrix (Default value = 0.01)
score_high (float) – Score (high) for transition matrix (Default value = 1.0)
- Returns
eta – Trajectory indices, unvoiced frames are indicated with -1
- Return type
ndarray
See also
[FMP] Notebook: C8/C8S2_FundFreqTracking.ipynb
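The role of constraint regions can be illustrated with a simplified sketch: each row is converted from seconds and Hz to frame and bin indices, tracking is restricted to that sub-block of Z, and all frames outside any region stay at -1. The function name is hypothetical, nearest-neighbor index lookup is an assumption, and a frame-wise maximum stands in for the dynamic-programming step to keep the example short.

```python
import numpy as np

def track_in_regions_sketch(Z, T_coef, F_coef_hertz, constraint_region):
    """Hypothetical helper: per-region tracking; frames outside all regions get -1."""
    eta = np.full(len(T_coef), -1, dtype=np.int64)
    for t_start, t_end, f_start, f_end in constraint_region:
        # Assumption: map region boundaries to the nearest frame and bin
        t0 = np.argmin(np.abs(T_coef - t_start))
        t1 = np.argmin(np.abs(T_coef - t_end))
        b0 = np.argmin(np.abs(F_coef_hertz - f_start))
        b1 = np.argmin(np.abs(F_coef_hertz - f_end))
        region = Z[b0:b1 + 1, t0:t1 + 1]
        # Stand-in for DP tracking: frame-wise maximum inside the region
        eta[t0:t1 + 1] = np.argmax(region, axis=0) + b0
    return eta

T_coef = np.arange(5) * 0.1
F_coef_hertz = np.array([55.0, 110.0, 220.0, 440.0])
Z = np.zeros((4, 5))
Z[3, :] = 1.0        # strong component outside the allowed frequency range
Z[2, 1:4] = 0.5      # weaker component inside the constraint region
constraint_region = np.array([[0.1, 0.3, 110.0, 220.0]])
eta = track_in_regions_sketch(Z, T_coef, F_coef_hertz, constraint_region)
```

Here the stronger component in bin 3 is ignored because it lies outside the region's frequency range, and the first and last frames remain unvoiced.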
- libf0.salience.frequency_to_bin_index(F, R, F_ref)
Binning function with variable frequency resolution. Note: Indexing starts with 0 (as opposed to [FMP, Eq. (8.49)]).
- Parameters
F (float or ndarray) – Frequency in Hz
R (float) – Frequency resolution in cents
F_ref (float) – Reference frequency in Hz
- Returns
bin_index – Index for bin (starting with index 0)
- Return type
int
See also
[FMP] Notebook: C8/C8S2_SalienceRepresentation.ipynb
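A 0-based binning function of this kind can be sketched as below. The formula is an assumption: it takes the logarithmic binning of [FMP, Eq. (8.49)] and shifts it so that F_ref falls into bin 0. The function name is hypothetical.

```python
import numpy as np

def frequency_to_bin_index_sketch(F, R, F_ref):
    """Hypothetical helper: 0-based log-frequency bin index for F (scalar or array, in Hz)."""
    F = np.asarray(F, dtype=float)
    # Distance from F_ref in units of R cents, rounded to the nearest bin
    return np.floor(1200.0 / R * np.log2(F / F_ref) + 0.5).astype(np.int64)

single = frequency_to_bin_index_sketch(55.0, 10.0, 55.0)
several = frequency_to_bin_index_sketch(np.array([55.0, 110.0, 1760.0]), 10.0, 55.0)
```

With R = 10 cents and F_ref = 55 Hz, each octave spans 120 bins, so 110 Hz maps to bin 120 and 1760 Hz to bin 600.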