Utils (libsoni.util.utils)

libsoni.util.utils.fade_signal(signal: ndarray, fading_duration: float = 0, fs: int = 22050) -> ndarray

Fade in / out audio signal

Parameters:
  • signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Signal to be faded

  • fs (int, default = 22050) – Sampling rate, in samples per second

  • fading_duration (float, default = 0) – Duration of fade-in and fade-out, in seconds

Returns:

faded_signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Faded signal
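To illustrate what fading an N-sample signal means, here is a minimal NumPy sketch. It is not libsoni's implementation: the linear ramp shape, the clamping of the fade length to half the signal, and the name `fade_signal_sketch` are all assumptions made for illustration.

```python
import numpy as np


def fade_signal_sketch(signal, fading_duration=0.0, fs=22050):
    """Hypothetical helper: linear fade-in and fade-out (assumed ramp shape)."""
    faded = np.asarray(signal, dtype=np.float64).copy()
    # Convert the fade duration from seconds to samples.
    num_samples = int(fading_duration * fs)
    # Assumption: never fade more than half of the signal on each side.
    num_samples = min(num_samples, len(faded) // 2)
    if num_samples > 0:
        ramp = np.linspace(0.0, 1.0, num_samples)
        faded[:num_samples] *= ramp          # fade-in
        faded[-num_samples:] *= ramp[::-1]   # fade-out
    return faded


fs = 22050
tone = np.ones(fs)  # one second of a constant signal
faded = fade_signal_sketch(tone, fading_duration=0.1, fs=fs)
```

The output keeps the shape of the input; only the first and last `fading_duration` seconds are attenuated.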

libsoni.util.utils.mix_sonification_and_original(sonification: ndarray, original_audio: ndarray, gain_lin_sonification: float = 1.0, gain_lin_original_audio: float = 1.0, panning: float = 1.0, duration: int | None = None)

This function takes a sonification and the original audio and mixes them into a stereo signal.

Parameters:
  • sonification (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Sonification

  • original_audio (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Original audio

  • gain_lin_sonification (float, default = 1.0) – linear gain for sonification

  • gain_lin_original_audio (float, default = 1.0) – linear gain for original audio

  • panning (float, default = 1.0) –

    Controls the panning of the mixed output:

    panning = 1.0 means original audio on the left and sonification on the right channel.

    panning = 0.5 means the same amount of both signals on both channels.

    panning = 0.0 means sonification on the left and original audio on the right channel.

  • duration (int, default = None) – Duration of the output waveform, given in samples.

Returns:

stereo_audio (np.ndarray (np.float32 / np.float64) [shape=(N, 2)]) – Mix of the signals
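The panning behavior described above can be sketched with a linear crossfade between the two channels. This is an illustration under stated assumptions, not the library's code: the linear panning law, the zero-padding to the longer input when `duration` is None, and the names `fit_length` and `mix_sketch` are hypothetical.

```python
import numpy as np


def fit_length(x, num_samples):
    """Hypothetical helper: crop or zero-pad x to num_samples."""
    out = np.zeros(num_samples)
    n = min(len(x), num_samples)
    out[:n] = x[:n]
    return out


def mix_sketch(sonification, original_audio, gain_son=1.0, gain_orig=1.0,
               panning=1.0, duration=None):
    """Mix two mono signals to stereo with an assumed linear panning law."""
    if duration is None:
        # Assumption: default to the length of the longer input.
        duration = max(len(sonification), len(original_audio))
    son = gain_son * fit_length(sonification, duration)
    orig = gain_orig * fit_length(original_audio, duration)
    # panning = 1.0: original left, sonification right; 0.0: swapped.
    left = panning * orig + (1.0 - panning) * son
    right = (1.0 - panning) * orig + panning * son
    return np.column_stack((left, right))  # shape (duration, 2)


son = np.ones(4)
orig = np.full(6, 2.0)
stereo = mix_sketch(son, orig, panning=1.0)
centered = mix_sketch(son, orig, panning=0.5)
```

With `panning=0.5`, both channels carry the same equal-weight blend of the two signals.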

libsoni.util.utils.normalize_signal(signal: ndarray) -> ndarray

Max-normalize audio signal

Parameters:

signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Signal to be normalized

Returns:

normalized_signal (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Normalized signal
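Max-normalization scales a signal so that its largest absolute sample value becomes 1. A minimal sketch follows; the guard for an all-zero signal and the name `normalize_sketch` are assumptions and may differ from what libsoni does.

```python
import numpy as np


def normalize_sketch(signal):
    """Hypothetical helper: divide by the peak absolute amplitude."""
    signal = np.asarray(signal, dtype=np.float64)
    peak = np.max(np.abs(signal))
    if peak == 0:
        # Assumption: return silence unchanged instead of dividing by zero.
        return signal.copy()
    return signal / peak


x = np.array([0.1, -0.5, 0.25])
y = normalize_sketch(x)
```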

libsoni.util.utils.pitch_to_frequency(pitch: int, reference_pitch: int = 69, tuning_frequency: float = 440.0) -> float

Calculates the corresponding frequency for a given pitch.

Parameters:
  • pitch (int) – Pitch to calculate frequency for.

  • reference_pitch (int, default = 69) – Reference pitch for calculation.

  • tuning_frequency (float, default = 440.0) – Tuning frequency for calculation, in Hertz.

Returns:

frequency (float) – Calculated frequency for given pitch, in Hertz.
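The defaults (reference pitch 69, tuning frequency 440 Hz) match the usual twelve-tone equal temperament mapping of MIDI pitches, where each semitone multiplies the frequency by 2^(1/12). Assuming that standard formula is what the function computes, a sketch looks like this (the name `pitch_to_frequency_sketch` is hypothetical):

```python
def pitch_to_frequency_sketch(pitch, reference_pitch=69, tuning_frequency=440.0):
    """Equal-temperament conversion (assumed formula, not taken from libsoni)."""
    # Each semitone above the reference multiplies the frequency by 2**(1/12).
    return tuning_frequency * 2.0 ** ((pitch - reference_pitch) / 12.0)


a4 = pitch_to_frequency_sketch(69)  # reference pitch maps to the tuning frequency
a5 = pitch_to_frequency_sketch(81)  # one octave up doubles the frequency
c4 = pitch_to_frequency_sketch(60)  # middle C, roughly 261.63 Hz
```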

libsoni.util.utils.smooth_weights(weights: ndarray, fading_samples: int = 0) -> ndarray

Weight smoothing

Parameters:
  • weights (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Input weights

  • fading_samples (int, default = 0) – Number of samples for fade-in/out.

Returns:

weights_smoothed (np.ndarray (np.float32 / np.float64) [shape=(N, )]) – Smoothed weights
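The smoothing method itself is not specified here. One simple way to turn abrupt weight changes into fades of roughly `fading_samples` samples is a moving average, which converts each step into a linear ramp. The sketch below uses that technique purely as an assumption; libsoni may smooth differently, and `smooth_weights_sketch` is a hypothetical name.

```python
import numpy as np


def smooth_weights_sketch(weights, fading_samples=0):
    """Moving-average smoothing (an assumed stand-in for the actual method)."""
    weights = np.asarray(weights, dtype=np.float64)
    # Keep the kernel no longer than the input so the output length stays N.
    k = min(int(fading_samples), len(weights))
    if k <= 1:
        return weights.copy()
    kernel = np.ones(k) / k
    # mode='same' preserves the length; edges are treated as zero-padded.
    return np.convolve(weights, kernel, mode="same")


weights = np.zeros(100)
weights[30:70] = 1.0  # a rectangular "on" region with hard edges
smoothed = smooth_weights_sketch(weights, fading_samples=10)
```

With this approach, the hard edges at samples 30 and 70 become ramps about 10 samples wide, while the plateau in between stays at 1.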

libsoni.util.utils.visualize_pianoroll(pianoroll_df: DataFrame, xlabel: str = 'Time (seconds)', ylabel: str = 'Pitch', title: str | None = None, colors: str = 'FMP_1', velocity_alpha: bool = False, figsize: Tuple[float, float] = (12, 4), ax: Axes | None = None, dpi: int = 72) -> Tuple[Figure, Axes]

Visualization function for piano-roll representations given as a pd.DataFrame.

Parameters:
  • pianoroll_df (pd.DataFrame) – Dataframe containing pitch-event information.

  • xlabel (str, default = ‘Time (seconds)’) – Label text for the x-axis.

  • ylabel (str, default = ‘Pitch’) – Label text for the y-axis.

  • title (str, default = None) – Title of the figure.

  • colors (str, default = ‘FMP_1’) – Colormap; for the default colormap, see https://github.com/meinardmueller/libfmp.

  • velocity_alpha (bool, default = False) – Set to True to weight the visualized rectangular regions for each pitch based on their velocity value.

  • figsize (Tuple[float, float], default = (12, 4)) – Figure size

  • ax (matplotlib.axes.Axes, default = None) – Axes object

  • dpi (int, default = 72) – Resolution of the figure.

Returns:
  • fig (matplotlib.figure.Figure) – Figure instance

  • ax (matplotlib.axes.Axes) – Axes object

libsoni.util.utils.warp_sample(sample: ndarray, reference_pitch: int, target_pitch: int, target_duration_sec: float, gain: float = 1.0, fs: int = 22050, fading_duration: float = 0.01)

This function warps a sample. Given the reference pitch of the sample (provided as an np.ndarray), the sample is pitch-shifted to the target pitch using librosa.effects.pitch_shift(). For the temporal alignment, if the desired duration is shorter than the original sample, the sample is cropped; if the desired duration is longer than the provided sample, the returned signal is zero-padded at the end.

Parameters:
  • sample (np.ndarray (np.float32 / np.float64) [shape=(K, )]) – Sample to be warped.

  • reference_pitch (int) – Reference pitch for the given sample.

  • target_pitch (int) – Target pitch for the warped sample.

  • target_duration_sec (float) – Duration, given in seconds, for the returned signal.

  • gain (float, default = 1.0) – Gain of the generated tone

  • fs (int, default = 22050) – Sampling rate, in samples per second.

  • fading_duration (float, default = 0.01) – Duration of fade in and fade out (to avoid clicks)

Returns:

warped_sample (np.ndarray (np.float32 / np.float64) [shape=(M, )]) – Warped sample.
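The temporal alignment step described above (crop if the target is shorter, zero-pad at the end if it is longer) can be sketched in plain NumPy. The pitch shift itself is left out here; in the real function it is delegated to librosa.effects.pitch_shift(), presumably with a shift of target_pitch minus reference_pitch semitones. The name `fit_duration` and the truncating conversion from seconds to samples are assumptions.

```python
import numpy as np


def fit_duration(sample, target_duration_sec, fs=22050):
    """Hypothetical helper: crop or zero-pad a sample to a target duration."""
    sample = np.asarray(sample)
    # Assumption: the target length in samples is truncated to an integer.
    target_len = int(target_duration_sec * fs)
    out = np.zeros(target_len, dtype=sample.dtype)
    n = min(len(sample), target_len)
    out[:n] = sample[:n]  # anything beyond n stays zero (padding at the end)
    return out


fs = 1000
sample = np.ones(1000)                    # one second at fs = 1000
cropped = fit_duration(sample, 0.5, fs)   # shorter target: cropped
padded = fit_duration(sample, 2.0, fs)    # longer target: zero-padded
```

The shapes K (input) and M (output) in the parameter list correspond to `len(sample)` and `target_len` in this sketch.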