Commit 1c21cb56 authored by valentin.emiya

Waveform.fs can now be of type float

parent cc3eb783
%% Cell type:code id: tags:
``` python
%pylab inline
```
%% Cell type:markdown id: tags:
# Tutorial on how to use `Waveform` objects
%% Cell type:markdown id: tags:
A *Waveform* is a *MadArray* dedicated to handling audio signals. As such, it has a mandatory attribute *fs*, which gives the sampling frequency of the signal.
## Initialization
As with *MadArray*, a *Waveform* can be initialized from a 1D nd-array, with or without a mask. The parameter *fs* should be given explicitly.
%% Cell type:code id: tags:
``` python
from madarrays import Waveform
fs = 8000
f0 = 200
f1 = 220
x_len = fs // 4
x = np.cos(2*np.pi*f0*np.arange(x_len)/fs) + np.cos(2*np.pi*f1*np.arange(x_len)/fs)
x *= np.hanning(x_len)
x /= np.max(np.abs(x))
mask = np.zeros_like(x, dtype=bool)  # `np.bool` was removed from recent NumPy releases
mask[int(0.4*x_len):int(0.6*x_len)] = 1
# initialization without missing samples
w = Waveform(x, fs=fs)
w.plot()
print(w)
```
%% Cell type:code id: tags:
``` python
# initialization with missing samples
wm = Waveform(x, fs=fs, mask=mask)
wm.plot()
print(wm)
```
%% Cell type:markdown id: tags:
A *Waveform* can also be initialized from another *Waveform*. In this case, the parameter *fs* is optional.
%% Cell type:code id: tags:
``` python
wm2 = Waveform(wm)
wm2.plot()
print(wm2)
```
%% Cell type:markdown id: tags:
If *fs* is provided, the audio signal is **not** resampled; only the sampling frequency attached to the samples changes:
%% Cell type:code id: tags:
``` python
wm3 = Waveform(wm, fs=22050)
wm3.plot()
print(wm3)
```
%% Cell type:markdown id: tags:
Stereo signals are handled as $N \times 2$ arrays:
%% Cell type:code id: tags:
``` python
x_stereo = np.array([np.cos(2*np.pi*0.001*np.arange(2000)),
                     np.sin(2*np.pi*0.001*np.arange(2000))]).T
mask_stereo = np.zeros_like(x_stereo, dtype=bool)
mask_stereo[250:500, 0] = 1
mask_stereo[1000:1500, 1] = 1
w_stereo = Waveform(x_stereo, mask=mask_stereo, fs=1)
w_plot = w_stereo.plot()
legend(w_plot, ('left', 'right'))
print(w_stereo)
```
%% Cell type:markdown id: tags:
Extracting left and right channels as mono *Waveform* objects is easy:
%% Cell type:code id: tags:
``` python
w_left = w_stereo[:, 0]
w_right = w_stereo[:, 1]
w_left.plot(label='left mono')
w_right.plot(label='right mono')
legend()
print('Is w_left stereo?', w_left.is_stereo())
print('Is w_right stereo?', w_right.is_stereo())
print(w_left)
print(w_right)
```
%% Cell type:markdown id: tags:
## Special audio abilities
### Resampling
A *Waveform* can be resampled using the *resample* method:
%% Cell type:code id: tags:
``` python
wr = Waveform(w)
wr.resample(22050)
plt.subplot(211)
w.plot()
plt.subplot(212)
wr.plot()
print(w)
print(wr)
```
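%% Cell type:markdown id: tags:
When the current or the target sampling frequency is a float, the `resample` change in this commit approximates the ratio between the two frequencies by a rational number with a bounded denominator, so that integer rates can be handed to the resampler. The following standalone sketch illustrates that idea with the standard library only; the helper name `rational_resampling_plan` is hypothetical and not part of `madarrays`, and the bound of 10000 is taken from the diff further down.
%% Cell type:code id: tags:
```python
from fractions import Fraction
import math


def rational_resampling_plan(old_fs, new_fs, max_denominator=10000):
    """Approximate new_fs / old_fs by a fraction with a bounded denominator.

    Returns the integer rates to pass to a resampler (new, old) and the
    sampling frequency that is effectively obtained.
    """
    ratio = Fraction(new_fs / old_fs).limit_denominator(max_denominator)
    # The effective frequency may differ slightly from the requested one.
    effective_fs = float(ratio * old_fs)
    return ratio.numerator, ratio.denominator, effective_fs


# Exact rational ratio: the requested frequency is obtained.
print(rational_resampling_plan(1, 1.5))
# Ratio not well approximated by a rational: the frequency is adjusted.
print(rational_resampling_plan(math.sqrt(2), math.pi))
```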
%% Cell type:markdown id: tags:
### Changing the sampling frequency without resampling the waveform
%% Cell type:code id: tags:
``` python
w_fs = Waveform(w)
w_fs.fs = 22050
plt.subplot(211)
w.plot()
plt.subplot(212)
w_fs.plot()
print(w)
print(w_fs)
```
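%% Cell type:markdown id: tags:
The difference between the two operations can be summarized with plain arithmetic. Setting `fs` keeps the number of samples and therefore changes the duration, whereas resampling changes the number of samples to `floor(new_fs * length / old_fs)`, which is the value checked in `test_resample`. The sketch below is independent of `madarrays`; both helper names are hypothetical, and the duration is assumed to be the number of samples divided by `fs`.
%% Cell type:code id: tags:
```python
import math


def duration_after_relabeling(n_samples, new_fs):
    """Duration in seconds when only `fs` is changed (samples untouched)."""
    return n_samples / new_fs


def length_after_resampling(n_samples, old_fs, new_fs):
    """Number of samples expected after resampling from old_fs to new_fs."""
    return int(math.floor(new_fs * n_samples / old_fs))


n_samples, old_fs, new_fs = 2000, 8000, 22050
print('Original duration:', n_samples / old_fs, 's')
print('Duration after setting fs:', duration_after_relabeling(n_samples, new_fs), 's')
print('Length after resampling:', length_after_resampling(n_samples, old_fs, new_fs))
```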
%% Cell type:markdown id: tags:
### Intensity
A *Waveform* has an attribute *rms* giving the root mean square of the audio signal, computed with missing samples treated as zeros. It can be changed with the *set_rms* method.
%% Cell type:code id: tags:
``` python
w_rms = Waveform(w)
plt.subplot(211)
w_rms.plot()
print('RMS before modification: ', w_rms.rms)
w_rms.set_rms(1)
plt.subplot(212)
w_rms.plot()
print('RMS after modification: ', w_rms.rms)
```
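%% Cell type:markdown id: tags:
As an illustration of the definition above, the RMS with missing samples counted as zeros can be written in a few lines of NumPy. This is a sketch under that stated convention, not the `madarrays` implementation; the function names `rms` and `scaled_to_rms` are hypothetical.
%% Cell type:code id: tags:
```python
import numpy as np


def rms(x, mask=None):
    """Root mean square of x, with masked (missing) samples set to zero."""
    x = np.array(x, dtype=float)  # work on a copy
    if mask is not None:
        x[np.asarray(mask, dtype=bool)] = 0.0
    return float(np.sqrt(np.mean(x ** 2)))


def scaled_to_rms(x, target_rms):
    """Return a rescaled copy of x whose RMS equals target_rms."""
    x = np.asarray(x, dtype=float)
    return x * (target_rms / rms(x))


x = np.array([3.0, 4.0, 3.0, 4.0])
mask = np.array([False, False, True, True])
print('RMS without mask:', rms(x))
print('RMS with mask:', rms(x, mask))
print('RMS after rescaling:', rms(scaled_to_rms(x, 1.0)))
```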
%% Cell type:markdown id: tags:
### Properties
A *Waveform* has several attributes that give information about the audio signal:
%% Cell type:code id: tags:
``` python
print('Length: {} samples'.format(w.length))
print('Duration: {} s'.format(w.duration))
print('Time axis: {}'.format(w.time_axis))
```
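%% Cell type:markdown id: tags:
The time axis is the sample index divided by the sampling frequency, which is the relation checked in `test_resample`. A minimal NumPy sketch follows; the standalone function `time_axis` is hypothetical and only mirrors the attribute of the same name. It also works for a float `fs`.
%% Cell type:code id: tags:
```python
import numpy as np


def time_axis(n_samples, fs):
    """Time stamps in seconds of n_samples samples taken at rate fs."""
    return np.arange(n_samples) / fs


print(time_axis(4, 2))     # integer sampling frequency
print(time_axis(3, 0.5))   # float sampling frequency
```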
%% Cell type:markdown id: tags:
### Plotting
A *Waveform* can be plotted, as well as the associated mask.
%% Cell type:code id: tags:
``` python
plt.figure()
wm.plot()
plt.title('Audio signal')
plt.figure()
wm.plot_mask()
plt.title('Mask')
pass
```
%% Cell type:markdown id: tags:
### Playing sound
The sound can be played using *show_player* in a notebook or *play* in a console.
%% Cell type:code id: tags:
``` python
w.show_player()
```
%% Cell type:markdown id: tags:
### I/O
A *Waveform* can be exported as a .wav file using *to_wavfile*:
%% Cell type:code id: tags:
``` python
f0_io = 10
fs_io = 8000
x_io_len = fs_io
x_io = np.array([np.cos(2*np.pi*f0_io/fs_io*np.arange(x_io_len)),
np.sin(2*np.pi*f0_io/fs_io*np.arange(x_io_len))]).T
mask_io = np.zeros_like(x_io, dtype=bool)
# x_io has shape (N, 2): samples along axis 0, channels along axis 1
mask_io[-1000:, 0] = mask_io[-500:, 1] = True
w_io = Waveform(x_io, mask=mask_io, fs=fs_io)
w_io.plot()
print(w_io)
w_io.to_wavfile('my_sound.wav')
```
%% Cell type:markdown id: tags:
A .wav file can be read using the static method *from_wavfile*, which returns a *Waveform*:
%% Cell type:code id: tags:
``` python
w_load = Waveform.from_wavfile('my_sound.wav')
w_load.plot()
print(w_load)
```
%% Cell type:code id: tags:
``` python
# Stereo files may be converted to mono
for mode in ('left', 'right', 'mean'):
    w_load = Waveform.from_wavfile('my_sound.wav', conversion_to_mono=mode)
    w_load.plot(label=mode)
legend()
pass
```
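%% Cell type:markdown id: tags:
The three conversion modes can be pictured on a plain $N \times 2$ array. The sketch below assumes that column 0 is the left channel, column 1 the right channel, and that `'mean'` averages the two channels sample by sample; the function `stereo_to_mono` is hypothetical and not the `madarrays` code.
%% Cell type:code id: tags:
```python
import numpy as np


def stereo_to_mono(x, mode):
    """Convert an [N, 2] array to a length-N array (assumed semantics)."""
    x = np.asarray(x)
    if mode == 'left':
        return x[:, 0]
    if mode == 'right':
        return x[:, 1]
    if mode == 'mean':
        return x.mean(axis=1)
    raise ValueError('Unknown mode: {}'.format(mode))


x_demo = np.array([[1.0, 3.0], [2.0, 6.0]])
for mode in ('left', 'right', 'mean'):
    print(mode, stereo_to_mono(x_demo, mode))
```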
%% Cell type:markdown id: tags:
Note that:
* sampling frequency: only a restricted set of sampling frequencies are allowed for input/output (see set of supported frequencies `waveform.VALID_IO_FS`)
* *dtype*: float/int data types are conserved when exporting a *Waveform*, since the .wav format allows many data types. However, many audio players only read .wav files coded with int16 values so you may not be able to listen to your exported sound with your favorite player. In that case, you may convert the data type of your *Waveform* using the optional *dtype* argument of method *to_wavfile*.
* mask: the mask is lost when exporting to a .wav file.
* sampling frequency: sampling frequencies may be arbitrary `float` or `int` values; however, only a restricted set of sampling frequencies is allowed for input/output (see the set of supported frequencies `madarrays.waveform.VALID_IO_FS` below).
%% Cell type:code id: tags:
``` python
from madarrays.waveform import VALID_IO_FS
print(VALID_IO_FS)
```
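%% Cell type:markdown id: tags:
A membership test against a set of integers accepts a float whose value is a whole number and rejects any other float, which is worth keeping in mind before exporting a *Waveform* with a float `fs`. The sketch below uses a made-up set `EXAMPLE_VALID_FS` for illustration only; the actual content of `VALID_IO_FS` is the one printed above, and the helper `is_valid_io_fs` is hypothetical.
%% Cell type:code id: tags:
```python
# Illustrative values only, NOT the actual content of VALID_IO_FS
EXAMPLE_VALID_FS = {8000, 16000, 44100}


def is_valid_io_fs(fs, valid_fs=EXAMPLE_VALID_FS):
    """Return True if fs equals one of the supported frequencies."""
    return fs in valid_fs


print(is_valid_io_fs(44100))     # int in the set
print(is_valid_io_fs(44100.0))   # float equal to a member of the set
print(is_valid_io_fs(22050.5))   # non-integer float
```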
%% Cell type:markdown id: tags:
### Clipping
Clipping a *Waveform* is done by using the `clip` method, taking as arguments the minimal and maximal values. Warnings are displayed to inform the user if any value has been clipped.
%% Cell type:code id: tags:
``` python
wm_clipped = wm.copy()
wm_clipped.clip(min_value=-0.75, max_value=0.25)
# Plot signals
plt.figure()
wm.plot('b', label='x')
wm_clipped.plot('y', label='y')
plt.legend()
```
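%% Cell type:markdown id: tags:
The behaviour described above, clipping plus a warning when at least one value is affected, can be sketched with NumPy and the `warnings` module. This is an illustration, not the `madarrays` implementation: the function `clip_with_warning` is hypothetical, the warning message is invented, and it returns a new array instead of modifying the data in place.
%% Cell type:code id: tags:
```python
import warnings

import numpy as np


def clip_with_warning(x, min_value, max_value):
    """Clip x to [min_value, max_value] and warn if any value was changed."""
    x = np.asarray(x)
    n_clipped = int(np.sum((x < min_value) | (x > max_value)))
    if n_clipped:
        warnings.warn('{} value(s) clipped to [{}, {}].'.format(
            n_clipped, min_value, max_value), UserWarning)
    return np.clip(x, min_value, max_value)


# Record the warnings to show that exactly one is emitted
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    y = clip_with_warning(np.array([-1.0, 0.0, 0.5]), -0.75, 0.25)
print(y, len(caught))
```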
%% Cell type:markdown id: tags:
## Type of entries in Waveform
This section is for advanced usages.
Audio data can have different types, each associated with specific constraints on the values:
* *float* (np.float16, np.float32, np.float64): the values are floats between -1 and 1;
* *int* (np.uint8, np.int16, np.int32): the values are integers within a range that depends on the precision;
* *complex* (np.complex64, np.complex128): the real and imaginary parts are floats between -1 and 1.
%% Cell type:markdown id: tags:
### Integer-valued waveforms
The method *Waveform.astype* not only converts data types but also scales values to the range of the target type. The choice among the available integer types results in different ranges. The following figures show integer-valued waveforms with different types: on the first row, waveforms are created without conversion, from integer-valued data arrays where the full `dtype` range is used; on the second row, similar waveforms are created by converting a float-valued array with entries in [-1, 1].
%% Cell type:code id: tags:
``` python
fs = 1000
f0 = 10
duration = 1
t = np.linspace(0, duration, int(duration*fs))
x_cos = 0.5 * np.cos(2*np.pi*f0*t)
w_uint8 = Waveform((2**7*x_cos + 128).astype(np.uint8), fs=fs)
w_int16 = Waveform((2**15*x_cos).astype(np.int16), fs=fs)
w_int32 = Waveform((2**31*x_cos).astype(np.int32), fs=fs)
plt.figure(figsize=(20, 5))
plt.subplot(131)
plt.title('uint8')
w_uint8.plot()
plt.subplot(132)
plt.title('int16')
w_int16.plot()
plt.subplot(133)
plt.title('int32')
w_int32.plot()
w_uint8 = Waveform(x_cos, fs=fs).astype(np.uint8)
w_int16 = Waveform(x_cos, fs=fs).astype(np.int16)
w_int32 = Waveform(x_cos, fs=fs).astype(np.int32)
plt.figure(figsize=(20, 5))
plt.subplot(131)
plt.title('uint8')
w_uint8.plot()
plt.subplot(132)
plt.title('int16')
w_int16.plot()
plt.subplot(133)
plt.title('int32')
w_int32.plot()
pass
```
%% Cell type:markdown id: tags:
### Real-valued waveforms
The choice among the available float types does not affect the range of the values, only the precision. In the following example, one may observe how the floating-point precision varies with the float type when the fractional part is very small compared to the exponent part, which equals 1 here (see right column).
%% Cell type:code id: tags:
``` python
fs = 1000
f0 = 10
duration = 1
t = np.linspace(0, duration, int(duration*fs))
w_float16 = Waveform(np.cos(2*np.pi*f0*t).astype(np.float16) + 1, fs=fs)
w_float32 = Waveform(np.cos(2*np.pi*f0*t).astype(np.float32) + 1, fs=fs)
w_float64 = Waveform(np.cos(2*np.pi*f0*t).astype(np.float64) + 1, fs=fs)
plt.figure(figsize=(20, 15))
plt.subplot(321)
plt.title('float16')
w_float16.plot()
plt.subplot(323)
plt.title('float32')
w_float32.plot()
plt.subplot(325)
plt.title('float64')
w_float64.plot()
eps16=np.finfo(np.float16).eps * 4
eps32=np.finfo(np.float32).eps * 4
eps64=np.finfo(np.float64).eps * 4
print(eps16, eps32, eps64)
w_float16 = Waveform(np.cos(2*np.pi*f0*t).astype(np.float16) * eps16 + 1, fs=fs)
w_float32 = Waveform(np.cos(2*np.pi*f0*t).astype(np.float32) * eps32 + 1, fs=fs)
w_float64 = Waveform(np.cos(2*np.pi*f0*t).astype(np.float64) * eps64 + 1, fs=fs)
plt.subplot(322)
plt.title('float16')
w_float16.plot()
plt.subplot(324)
plt.title('float32')
w_float32.plot()
plt.subplot(326)
plt.title('float64')
w_float64.plot()
plt.ylim(1 - 1.2 * eps64, 1 + 1.2 * eps64)
pass
```
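%% Cell type:markdown id: tags:
The precision of each float type near 1 can also be checked numerically with `np.finfo`: adding the machine epsilon to 1 gives a different number, whereas adding half of it is rounded back to 1. The short sketch below uses NumPy only.
%% Cell type:code id: tags:
```python
import numpy as np

results = {}
for dtype in (np.float16, np.float32, np.float64):
    eps = np.finfo(dtype).eps   # gap between 1 and the next representable float
    one = dtype(1)
    results[dtype.__name__] = (
        bool(one + eps != one),              # a full step is representable
        bool(one + eps / dtype(2) == one),   # half a step is rounded away
    )
    print(dtype.__name__, eps, results[dtype.__name__])
```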
%% Cell type:markdown id: tags:
### Complex-valued waveforms
%% Cell type:code id: tags:
``` python
fs = 1000
f0 = 10
duration = 1
t = np.linspace(0, duration, int(duration*fs))
w_complex128 = Waveform((np.cos(2*np.pi*f0*t) + 1j*np.sin(2*np.pi*f0*t)).astype(np.complex128), fs=fs)
w_complex256 = Waveform((np.cos(2*np.pi*f0*t) + 1j*np.sin(2*np.pi*f0*t)).astype(np.complex256), fs=fs)
plt.figure(figsize=(20, 5))
plt.subplot(121)
plt.title('complex128')
w_complex128.plot(cpx_mode='both')
plt.subplot(122)
plt.title('complex256')
w_complex256.plot(cpx_mode='both')
pass
```
%% Cell type:markdown id: tags:
### Casting into another dtype
%% Cell type:markdown id: tags:
The casting of a waveform in a different dtype depends on the current dtype and the desired dtype:
* *Integer-to-real* casting is performed by applying on each entry $x$ the function $f(x)=\frac{x - z}{2^{n-1}}$, where the source integral type is coded with $n$ bits, and $z$ is the integer associated with zero, i.e., $z=0$ for a signed type (`int`) and $z=2^{n-1}$ for an unsigned type (`uint`).
* *Real-to-integer* casting is performed by applying on each entry $x$ the function $f(x)=\lfloor\left(x + 1\right) 2^{n-1} + m\rfloor$, where the target integral type is coded with $n$ bits, and $m$ is the minimum integer value, i.e., $m=-2^{n-1}$ for a signed type (`int`) and $m=0$ for an unsigned type (`uint`);
* *Real-to-real* casting is obtained by a basic rounding operation;
* *Integer-to-integer* casting is obtained by chaining an integer-to-float64 casting and a float64-to-integer casting.
These constraints are only applied when explicitly calling the method `astype`.
Clipping is performed for unexpected values:
* When casting to `float`, values outside $[-1, 1]$ are clipped;
* When casting to `int`, values outside the minimum and maximum values allowed by the integral type are clipped:
    * $\left[-2^{n-1}, 2^{n-1}-1\right]$ for $n$-bit signed integers;
    * $\left[0, 2^{n}-1\right]$ for $n$-bit unsigned integers.
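%% Cell type:markdown id: tags:
The integer-to-real and real-to-integer formulas above, together with the clipping rule for integers, can be transcribed directly in NumPy. The sketch below is a reading of those formulas, not the `madarrays` code; the function names `int_to_float` and `float_to_int` are hypothetical, and integer types are assumed to use at most 32 bits so that all intermediate values are exact in `float64`.
%% Cell type:code id: tags:
```python
import numpy as np


def int_to_float(x, dtype=np.float64):
    """Apply f(x) = (x - z) / 2**(n-1) to an integer-valued array."""
    x = np.asarray(x)
    info = np.iinfo(x.dtype)
    half_range = 2.0 ** (info.bits - 1)
    # z = 0 for signed types, z = 2**(n-1) for unsigned types
    zero = 0.0 if info.min < 0 else half_range
    return ((x.astype(np.float64) - zero) / half_range).astype(dtype)


def float_to_int(x, dtype):
    """Apply f(x) = floor((x + 1) * 2**(n-1) + m), then clip to the range."""
    info = np.iinfo(dtype)
    half_range = 2.0 ** (info.bits - 1)
    # m = info.min, i.e. -2**(n-1) for signed types and 0 for unsigned types
    y = np.floor((np.asarray(x, dtype=np.float64) + 1) * half_range + info.min)
    return np.clip(y, info.min, info.max).astype(dtype)


x_float = np.array([-1.0, 0.0, 0.5, 1.0])
print(float_to_int(x_float, np.int16))
print(float_to_int(x_float, np.uint8))
print(int_to_float(np.array([0, 128, 255], dtype=np.uint8)))
```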
%% Cell type:code id: tags:
``` python
w_float32 = Waveform(np.cos(2*np.pi*f0*t).astype(np.float32), fs=fs)
plt.figure(figsize=(20, 5))
plt.subplot(121)
plt.title('float32')
w_float32.plot()
plt.subplot(122)
plt.title('uint8')
w_float32.astype('uint8').plot()
pass
```
%% Cell type:code id: tags:
``` python
```
......
......@@ -236,6 +236,7 @@ class TestWaveform:
def test_resample(self):
# Common integer values
for fs in FS:
# Mono
w = Waveform(self.x_mono, fs=self.fs)
......@@ -262,16 +263,34 @@ class TestWaveform:
w.time_axis,
np.arange(np.floor(fs * self.length / self.fs)) / fs)
# Floating values with ratios that are exact rationals
for (old_fs, new_fs) in [(1, 1.5), (0.5, 3), (100.1, 200.2)]:
w = Waveform(self.x_mono, fs=old_fs)
w.resample(fs=new_fs)
assert w.fs == new_fs
assert w.length == int(np.floor(new_fs * self.length / old_fs))
# Floating values with ratios that are not well approximated
# by rationals
old_fs = np.sqrt(2)
new_fs = np.pi
with pytest.warns(UserWarning):
w = Waveform(self.x_mono, fs=old_fs)
w.resample(fs=new_fs)
np.testing.assert_almost_equal(w.fs, new_fs)
np.testing.assert_almost_equal(
w.length, int(np.floor(new_fs * self.length / old_fs)))
# Negative frequency sampling
with pytest.raises(
ValueError,
match='`fs` should be a positive integer \(given: -\d+\)'):
match='`fs` should be a positive number \(given: -\d+\)'):
w.resample(fs=-self.fs)
# Frequency sampling equal to 0
with pytest.raises(
ValueError,
match='`fs` should be a positive integer \(given: 0\)'):
match='`fs` should be a positive number \(given: 0\)'):
w.resample(fs=0)
# Masked data
......
......@@ -47,6 +47,7 @@
"""
import warnings
from fractions import Fraction
import numpy as np
import resampy
import simpleaudio as sa
......@@ -133,7 +134,7 @@ class Waveform(MadArray):
data : nd-array [N] or [N, 2]
Audio samples, as a N-length vector for a mono signal or a
[N, 2]-shape array for a stereo signal
fs : int, optional
fs : int or float, optional
Sampling frequency of the original signal, in Hz. If float, truncated.
If None and :paramref:`data` is a Waveform, use `data.fs`, otherwise it
is set to 1.
......@@ -169,7 +170,7 @@ class Waveform(MadArray):
raise ValueError('`data` should be either mono or stereo.')
# add the new attribute to the created instance
obj.fs = int(fs)
obj.fs = fs
return obj
......@@ -206,7 +207,7 @@ class Waveform(MadArray):
@property
def fs(self):
"""Frequency sampling of the audio signal.
"""Frequency sampling of the audio signal (int or float).
The signal is not resampled when the sampling frequency is modified.
......@@ -222,7 +223,7 @@ class Waveform(MadArray):
if fs <= 0:
errmsg = 'fs is not strictly positive (given: {})'
raise ValueError(errmsg.format(fs))
self._fs = int(fs)
self._fs = fs
@property
def rms(self):
......@@ -290,29 +291,60 @@ class Waveform(MadArray):
Can be only performed on a waveform without missing data.
Note that if the current or the new sampling frequencies are not
integers, the new sampling frequency may be different from the
desired value since the resampling method only allows input and
output frequencies of type ``int``. In this case, a warning is
raised.
Parameters
----------
fs : int
fs : int or float
New sampling frequency.
Raises
------
ValueError
If `fs` is not a positive integer.
If `fs` is not a positive number.
If `self` has missing samples.
UserWarning
"""
assert np.issubdtype(type(fs), np.integer) or np.issubdtype(type(fs),
np.float)
if fs <= 0:
errmsg = '`fs` should be a positive integer (given: {})'
errmsg = '`fs` should be a positive number (given: {})'
raise ValueError(errmsg.format(fs))
if np.issubdtype(type(fs), np.float) or np.issubdtype(type(self.fs),
np.float):
# Find a good rational number to approximate the ratio between
# sampling frequencies
fs_ratio = Fraction(fs/self.fs)
fs_ratio = fs_ratio.limit_denominator(10000)
# Sampling frequencies used for the resampling (need to be int)
resample_new_fs = fs_ratio.numerator
resample_old_fs = fs_ratio.denominator
# Adjust new sampling frequency
new_fs = fs_ratio * self.fs
if new_fs != fs:
warnings.warn('New sampling frequency adjusted to {} instead '
'of {} in order to use ``int`` values in '
'``resample``.'.format(fs_ratio * self.fs, fs))
fs = new_fs
else:
resample_new_fs = fs
resample_old_fs = self.fs
if self.is_masked():
errmsg = 'Waveform has missing entries.'
raise ValueError(errmsg)
if fs != self.fs:
if resample_new_fs != resample_old_fs:
x = self.to_np_array()
y = resampy.resample(x, self.fs, fs, axis=0)
y = resampy.resample(x, resample_old_fs, resample_new_fs, axis=0)
self.resize(y.shape, refcheck=False)
self[:] = y
......@@ -411,8 +443,8 @@ class Waveform(MadArray):
UserWarning
If the signal is complex.
ValueError
If the sampling frequency is not supported (see set of supported
frequencies `waveform.VALID_IO_FS`).
If the sampling frequency is not an integer from the set of
supported frequencies ``madarrays.waveform.VALID_IO_FS``).
NotImplementedError
If dtype is not supported by the current implementation.
......@@ -420,7 +452,7 @@ class Waveform(MadArray):
--------
scipy.io.wavfile.write
"""
if int(self.fs) not in VALID_IO_FS:
if self.fs not in VALID_IO_FS:
errmsg = '`fs` is not a valid sampling frequency (given: {}).'
raise ValueError(errmsg.format(self.fs))
......