Utils (libsoni.utils)

libsoni.utils.fade_signal(signal: ndarray, fading_duration: float = 0, fs: int = 22050, fade_type: str = 'squared_sine') → ndarray

Fade in / out audio signal

Parameters:

signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Signal to be faded
fs (int, default = 22050) – sampling rate
fading_duration (float or tuple of 2 floats, default = 0) – Duration of fade-in and fade-out, in seconds. If a single float is given, fade-in and fade-out have the same length. If the total fading duration is longer than the signal, the fades are scaled proportionally.
fade_type (str, default = “squared_sine”) – Fading function. Options: “squared_sine”, “sine”, “linear”

Returns:

normalized_signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Faded signal
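
The fading behavior described above can be sketched with NumPy. This is an illustrative sketch, not libsoni's implementation: the function name fade_sketch, the exact ramp formulas for each fade_type, and the rounding of fade lengths are assumptions.

```python
import numpy as np


def fade_sketch(signal, fading_duration=0.0, fs=22050, fade_type="squared_sine"):
    """Illustrative fade-in/out (hypothetical helper, not the libsoni code)."""
    # One float means identical fade-in and fade-out; a pair sets them separately.
    if np.isscalar(fading_duration):
        fade_in_sec = fade_out_sec = float(fading_duration)
    else:
        fade_in_sec, fade_out_sec = fading_duration

    n = len(signal)
    n_in = int(round(fade_in_sec * fs))
    n_out = int(round(fade_out_sec * fs))

    # If both fades together exceed the signal, shrink them proportionally.
    total = n_in + n_out
    if total > n:
        n_in = int(n * n_in / total)
        n_out = int(n * n_out / total)

    def ramp(length):
        # Rising ramp from 0 towards 1 (assumed shapes for each fade type).
        t = np.linspace(0.0, 1.0, length, endpoint=False)
        if fade_type == "squared_sine":
            return np.sin(0.5 * np.pi * t) ** 2
        if fade_type == "sine":
            return np.sin(0.5 * np.pi * t)
        if fade_type == "linear":
            return t
        raise ValueError(f"unknown fade_type: {fade_type}")

    faded = np.array(signal, dtype=float)
    if n_in > 0:
        faded[:n_in] *= ramp(n_in)
    if n_out > 0:
        faded[n - n_out:] *= ramp(n_out)[::-1]
    return faded
```

With a 100-sample signal, fs = 10000 and fading_duration = 0.001, the first and last 10 samples are attenuated and the middle of the signal is left untouched.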

libsoni.utils.mix_sonification_and_original(sonification: ndarray, original_audio: ndarray, gain_lin_sonification: float = 1.0, gain_lin_original_audio: float = 1.0, panning: float = 1.0, duration: int = None)

Mixes a sonification and the original audio into a stereo signal.

Parameters:

sonification (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Sonification
original_audio (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Original audio
gain_lin_sonification (float, default = 1.0) – linear gain for sonification
gain_lin_original_audio (float, default = 1.0) – linear gain for original audio
panning (float, default = 1.0) – Controls the panning of the mixed output.
panning = 1.0: original audio on the left channel, sonification on the right channel.
panning = 0.5: equal amounts of both signals on both channels.
panning = 0.0: sonification on the left channel, original audio on the right channel.
duration (int, default = None) – Duration of the output waveform, given in samples.

Returns:

stereo_audio (np.ndarray (np.float32 / np.float64) [shape=(N, 2)]) – Mix of the signals
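
One way to realize the panning behavior described above is a linear crossfade between the two channels. The sketch below is an assumption about how such a mix could work, not libsoni's actual code: the name mix_sketch, the linear panning law, and the zero-padding of the shorter signal are all hypothetical, and the duration parameter is omitted.

```python
import numpy as np


def mix_sketch(sonification, original_audio, gain_son=1.0, gain_orig=1.0, panning=1.0):
    """Illustrative stereo mix (hypothetical helper, not the libsoni code)."""
    # Assumption: zero-pad the shorter signal so both have the same length.
    n = max(len(sonification), len(original_audio))
    son = np.zeros(n)
    orig = np.zeros(n)
    son[:len(sonification)] = gain_son * np.asarray(sonification, dtype=float)
    orig[:len(original_audio)] = gain_orig * np.asarray(original_audio, dtype=float)

    # panning = 1.0: original left, sonification right.
    # panning = 0.0: sonification left, original right.
    left = panning * orig + (1.0 - panning) * son
    right = panning * son + (1.0 - panning) * orig

    # Stack the channels as columns to obtain shape (N, 2).
    return np.column_stack((left, right))
```

With panning = 0.5 both columns of the result are identical, which matches the "same amount of both signals on both channels" case.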

libsoni.utils.normalize_signal(signal: ndarray) → ndarray

Max-normalize audio signal

Parameters:

signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Signal to be normalized

Returns:

normalized_signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Normalized signal
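
Max-normalization is commonly understood as dividing a signal by its largest absolute value, so that the peak becomes 1. A minimal sketch under that assumption follows; the name max_normalize_sketch and the handling of an all-zero signal are hypothetical and may differ from libsoni.

```python
import numpy as np


def max_normalize_sketch(signal):
    """Illustrative max-normalization (hypothetical helper, not the libsoni code)."""
    signal = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(signal))
    # Assumption: an all-zero signal is returned unchanged to avoid dividing by zero.
    if peak == 0:
        return signal.copy()
    return signal / peak
```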

libsoni.utils.pitch_to_frequency(pitch: int, reference_pitch: int = 69, tuning_frequency: float = 440.0) → float

Calculates the corresponding frequency for a given pitch.

Parameters:

pitch (int) – Pitch to calculate frequency for.
reference_pitch (int, default = 69) – Reference pitch for calculation.
tuning_frequency (float, default = 440.0) – Tuning frequency for calculation, in Hertz.

Returns:

frequency (float) – Calculated frequency for given pitch, in Hertz.
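
The defaults (reference pitch 69 at 440 Hz) suggest MIDI pitch numbers in twelve-tone equal temperament, where each semitone multiplies the frequency by 2^(1/12). The sketch below assumes that convention; pitch_to_frequency_sketch is a hypothetical stand-in, not the libsoni function itself.

```python
def pitch_to_frequency_sketch(pitch, reference_pitch=69, tuning_frequency=440.0):
    """Equal-tempered pitch-to-frequency conversion (assumed formula)."""
    # Each semitone above the reference multiplies the frequency by 2 ** (1 / 12).
    return tuning_frequency * 2.0 ** ((pitch - reference_pitch) / 12.0)


a4 = pitch_to_frequency_sketch(69)   # reference pitch, equals the tuning frequency
a5 = pitch_to_frequency_sketch(81)   # one octave up, twice the frequency
c4 = pitch_to_frequency_sketch(60)   # middle C, roughly 261.63 Hz
```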

libsoni.utils.replace_zeros(x: ndarray, zero_count: int = 1000, replace_with_previous=True, value=0)

Replaces runs of consecutive zeros (up to a specified length) with the previous value or with a given value. Runs longer than zero_count are left unchanged.

Parameters:

x (np.ndarray) – 1D array of size (N).
zero_count (int, default = 1000) – Maximum number of consecutive zeros that will be replaced. Must be greater than 2.
replace_with_previous (bool, default = True) – If True, zeros will be replaced with the last non-zero value in the array. If False, zeros will be replaced with the specified value.
value (int | float, optional) – The value used to replace zeros when replace_with_previous = False.

Returns:

y (np.ndarray) – 1D array of size (N) with the specified runs of zeros replaced.
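
The replacement rule can be sketched as a scan over runs of zeros. This is an illustration under assumptions, not libsoni's implementation: the name replace_zeros_sketch is hypothetical, and leaving a leading run of zeros untouched (because it has no previous value) is a choice made here.

```python
import numpy as np


def replace_zeros_sketch(x, zero_count=1000, replace_with_previous=True, value=0):
    """Illustrative zero-run replacement (hypothetical helper, not the libsoni code)."""
    y = np.array(x, dtype=float)
    n = len(y)
    i = 0
    while i < n:
        if y[i] != 0:
            i += 1
            continue
        # Find the end (exclusive) of the current run of zeros.
        j = i
        while j < n and y[j] == 0:
            j += 1
        # Only runs of at most zero_count zeros are replaced.
        if j - i <= zero_count:
            if not replace_with_previous:
                y[i:j] = value
            elif i > 0:
                y[i:j] = y[i - 1]
            # Assumption: a leading run has no previous value and stays zero.
        i = j
    return y
```

For example, with zero_count = 3 the input [1, 0, 0, 2, 0, 0, 0, 0, 3] keeps its run of four zeros, while the run of two zeros is filled with the preceding value 1.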

libsoni.utils.smooth_weights(weights: ndarray, fading_samples: int = 0) → ndarray

Weight smoothing

Parameters:

weights (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Input weights
fading_samples (int, default = 0) – Number of samples for fade-in/out.

Returns:

weights_smoothed (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Smoothed weights

libsoni.utils.split_freq_trajectory(frequencies: ndarray, max_change_cents: float = 50.0)

Splits a frequency array into regions where the change in frequency from frame to frame remains within a specified threshold, e.g., to isolate note events in an F0 trajectory.

Parameters:

frequencies (np.ndarray) – 1D array of frequencies (Hz) to be split into regions with minimal pitch changes.
max_change_cents (float, default = 50.0) – Maximum allowed change (in cents) between successive frames before splitting the trajectory.

Returns:

splits (np.ndarray) – 1D array containing indices where the input array should be split. Within each resulting region, the change in frequency from frame to frame remains below the specified threshold. Can be used with np.split().
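
The splitting criterion can be illustrated by converting the ratio of successive frequencies to cents (1200 · log2 of the ratio) and marking the frames where the absolute change exceeds the threshold. The sketch below is an assumption about the approach, not libsoni's code: split_sketch is a hypothetical name, and it assumes strictly positive frequencies (zeros or unvoiced frames would need separate handling).

```python
import numpy as np


def split_sketch(frequencies, max_change_cents=50.0):
    """Illustrative trajectory splitting (hypothetical helper, not the libsoni code)."""
    f = np.asarray(frequencies, dtype=float)
    # Interval between successive frames in cents: 1200 * log2(f[n + 1] / f[n]).
    cents = 1200.0 * np.log2(f[1:] / f[:-1])
    # An index k in the result means that a new region starts at frame k.
    return np.flatnonzero(np.abs(cents) > max_change_cents) + 1


f0 = np.array([440.0, 441.0, 442.0, 880.0, 882.0])
splits = split_sketch(f0)
regions = np.split(f0, splits)  # two regions: around 440 Hz and around 880 Hz
```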

libsoni.utils.visualize_pianoroll(pianoroll_df: DataFrame, xlabel: str = 'Time (seconds)', ylabel: str = 'Pitch', title: str = None, colors: str = 'FMP_1', velocity_alpha: bool = False, figsize: Tuple[float, float] = (12, 4), ax: Axes = None, dpi: int = 72) → Tuple[Figure, Axes]

Visualization function for piano-roll representations, given in a pd.DataFrame format

Parameters:

pianoroll_df (pd.DataFrame) – Dataframe containing pitch-event information.
xlabel (str, default = ‘Time (seconds)’) – Label text for the x-axis.
ylabel (str, default = ‘Pitch’) – Label text for the y-axis.
title (str, default = None) – Title of the figure.
colors (str, default = ‘FMP_1’) – Colormap, for the default colormap see https://github.com/meinardmueller/libfmp.
velocity_alpha (bool, default = False) – Set True to weight the visualized rectangular regions for each pitch based on their velocity value.
figsize (Tuple[float, float], default = (12, 4)) – Figure size
ax (matplotlib.axes.Axes, default = None) – Axes object
dpi (int, default = 72) – Resolution of the figure.

Returns:

fig (matplotlib.figure.Figure) – Figure instance
ax (matplotlib.axes.Axes) – Axes object

libsoni.utils.warp_sample(sample: ndarray, reference_pitch: int, target_pitch: int, target_duration_sec: float, gain: float = 1.0, fs: int = 22050, fading_duration: float = 0.01)

Warps a sample in pitch and duration. Given the reference pitch of the sample (provided as an np.ndarray), the sample is pitch-shifted to the target pitch using librosa.effects.pitch_shift(). For the temporal alignment, the sample is cropped if the desired duration is shorter than the sample; if the desired duration is longer than the sample, the returned signal is zero-padded at the end.

Parameters:

sample (np.ndarray (np.float32 / np.float64) [shape=(K, )]) – Sample to be warped.
reference_pitch (int) – Reference pitch for the given sample.
target_pitch (int) – Target pitch for the warped sample.
target_duration_sec (float) – Duration, given in seconds, for the returned signal.
gain (float, default = 1.0) – Gain of the generated tone
fs (int, default = 22050) – Sampling rate, in samples per second.
fading_duration (float, default = 0.01) – Duration of fade-in and fade-out, in seconds (to avoid clicks)

Returns:

warped_sample (np.ndarray (np.float32 / np.float64) [shape=(M, )]) – Warped sample.
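
The temporal alignment step (crop or zero-pad) can be sketched on its own with NumPy. The pitch-shifting, gain, and fading steps are deliberately left out here; fit_duration_sketch is a hypothetical helper, and the rounding of the target length is an assumption.

```python
import numpy as np


def fit_duration_sketch(sample, target_duration_sec, fs=22050):
    """Illustrative crop / zero-pad step (hypothetical helper, not the libsoni code)."""
    sample = np.asarray(sample, dtype=float)
    # Assumption: the target length is the duration in seconds rounded to whole samples.
    n_target = int(round(target_duration_sec * fs))
    if len(sample) >= n_target:
        # Desired duration is shorter than the sample: crop.
        return sample[:n_target].copy()
    # Desired duration is longer than the sample: zero-pad at the end.
    out = np.zeros(n_target)
    out[:len(sample)] = sample
    return out
```

In either case the output has M = target_duration_sec · fs samples (after rounding), independent of the input length K.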