Audio Processing on the Raspberry Pi

Scope

The Raspberry Pi (RPI) shall be used as a sound surveillance system with the following features:

  1. A microphone records audio samples in some kind of cyclic buffer.
  2. The audio samples are repeatedly evaluated to detect sound events.
    • A sound event is characterised by a time slot in which the audio activity exceeds some threshold.
  3. The RPI may then react to a sound event as follows:
    • A notification is sent to a PC connected to the RPI via a wireless connection.
    • The RPI stores audio samples in a file. The recorded time interval shall cover the time around the instant at which the sound event occurred.
    • The PC may then download the audio file for further signal processing and analysis.
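The detection step in item 2 can be illustrated with a minimal sketch. It assumes that "audio activity" is measured as the RMS value of fixed-length blocks; the function names, the block length and the threshold are illustrative choices and not taken from the actual implementation.

```python
import numpy as np

def block_rms(samples, block_len):
    """Return the RMS value of each complete block of `block_len` samples."""
    n_blocks = len(samples) // block_len
    blocks = np.asarray(samples[:n_blocks * block_len], dtype=float)
    blocks = blocks.reshape(n_blocks, block_len)
    return np.sqrt(np.mean(blocks ** 2, axis=1))

def detect_events(samples, block_len, threshold):
    """Return the indices of the blocks whose RMS exceeds `threshold`."""
    return np.flatnonzero(block_rms(samples, block_len) > threshold)

# Synthetic test signal: quiet noise with a loud 440 Hz burst in the third block
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(4000)
signal[2000:3000] += 0.5 * np.sin(2 * np.pi * 440 * np.arange(1000) / 44100)

events = detect_events(signal, block_len=1000, threshold=0.1)
print(events)  # indices of the blocks classified as sound events
```

A block-wise measure keeps the evaluation cheap, which suits the idea that the RPI only collects data and leaves heavier analysis to the PC.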

Basically the sound surveillance system may be decomposed into two building blocks:

  • Capturing and processing of audio samples
  • Control interface
    • interacting with the RPI via wireless link
    • configuring the application on the RPI
    • starting and stopping the application
    • data retrieval
    • postprocessing of audio data on the PC (motivation: the RPI shall act as a data collector; more time-consuming operations are delegated to the PC).

Some more details are presented here. Specific sections will deal with implementation details.


Capturing audio samples

Hardware

Capturing audio samples on the RPI shall use standard components that work not only on the RPI but also on a Windows or Linux PC. A reasonably cheap solution has been found that works on both platforms (RPI, PC): a USB sound card with a microphone attached to it. A figure shows the setup:

Component        Description
USB sound card   Creative Sound Blaster Play! 3
Microphone       Lavalier microphone

External components for sound processing

Software

The sound card is accessed programmatically using Python libraries that are available on both Linux and Windows. The table provides an overview of the libraries I found useful when implementing the sound processing with Python:

Library       Description
sounddevice   Accessing the sound card, storing audio samples, setting parameters of the sound card (sound device), etc.
              https://python-sounddevice.readthedocs.io/en/0.4.6/
soundfile     Reading and writing audio files, with different options for the format of the audio file.
              https://python-soundfile.readthedocs.io/en/0.11.0/

Python libraries for sound processing
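As a rough illustration of how soundfile could be used to store the samples around a sound event, here is a minimal sketch. The helper names, the default pre/post durations and the 16-bit PCM subtype are assumptions made for this example; only soundfile.write itself is part of the library.

```python
def event_window(n_samples, fs, event_index, pre_s, post_s):
    """Return (start, stop) sample indices around an event, clipped to the data."""
    start = max(0, event_index - int(pre_s * fs))
    stop = min(n_samples, event_index + int(post_s * fs))
    return start, stop

def save_event_clip(path, samples, fs, event_index, pre_s=2.0, post_s=3.0):
    """Write the samples around `event_index` to an audio file.

    The file format is derived from the file suffix (e.g. ".wav").
    """
    import soundfile as sf  # imported here so event_window works without it
    start, stop = event_window(len(samples), fs, event_index, pre_s, post_s)
    sf.write(path, samples[start:stop], fs, subtype="PCM_16")
    return start, stop

# Usage (assuming `samples` is a NumPy array recorded at 44100 Hz):
#   save_event_clip("event_001.wav", samples, 44100, event_index=200000)
```

Keeping the window calculation separate from the file access makes it easy to check the clipping at the start and end of the recording.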

Control

The RPI shall be controlled by a PC via a wireless link. To minimise external hardware, the communication between RPI and PC shall be set up as a WiFi direct link, which does not require an additional WLAN router. The RPI is therefore set up as a hotspot with a DHCP server. The PC connects to the hotspot after the RPI has booted. The diagram provides some more details:

And here is a picture indicating how an external USB WiFi adapter establishes a WiFi direct link between the RPI and the PC. (The internal WiFi adapter of the RPI could still be used to connect to a WLAN router.)

Note

At least the PC should have an additional wireless connection to a WLAN router to access the internet or any other devices connected to the router. This router-based connection is shown in the figure above.


Preliminary Steps

Getting familiar with

  • sound processing with Python
  • setting up WiFi direct on the Raspberry Pi

are prerequisites for implementing a sound surveillance system on the RPI. These topics are therefore addressed first, before the details of setting up the sound surveillance system are defined.

Sound processing with Python

Required Software

On a Windows PC I have installed:

  • Python 3.11
  • Numpy
  • Scipy
  • Matplotlib
  • soundfile
  • sounddevice

On the Raspberry Pi I use:

  • Python 3.9 – 3.10 (depending on the version of the Raspberry Pi; currently RPI 3 (4 GB) and RPI 4 (8 GB))
  • Numpy
  • soundfile
  • sounddevice
    • plus some libraries required by soundfile, sounddevice

How to find audio devices

With

import sounddevice as sd
devices = sd.query_devices()

a list of sound cards / sound devices is returned, with one dictionary per device. An example from my PC is shown here:

{'name': 'Microsoft Soundmapper - Input', 'index': 0, 'hostapi': 0, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.09, 'default_low_output_latency': 0.09, 'default_high_input_latency': 0.18, 'default_high_output_latency': 0.18, 'default_samplerate': 44100.0}

{'name': 'Mikrofon (Sound Blaster Play! 3', 'index': 1, 'hostapi': 0, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.09, 'default_low_output_latency': 0.09, 'default_high_input_latency': 0.18, 'default_high_output_latency': 0.18, 'default_samplerate': 44100.0}

{'name': 'Mikrofon (Realtek(R) Audio)', 'index': 2, 'hostapi': 0, 'max_input_channels': 2, 'max_output_channels': 0, 'default_low_input_latency': 0.09, 'default_low_output_latency': 0.09, 'default_high_input_latency': 0.18, 'default_high_output_latency': 0.18, 'default_samplerate': 44100.0}

Since we want to use the Sound Blaster device with a microphone connected to it, the appropriate sound device is the one with index = 1. What is important to note is the parameter default_samplerate=44100.

Currently I assume that this is a fixed sampling rate of this specific sound device that cannot be reconfigured.
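Because the device index can change when devices are plugged in or removed, it may be more robust to look the device up by name. The following is a minimal sketch; find_input_device and supported_rates are hypothetical helper names, while query_devices, check_input_settings and PortAudioError are part of sounddevice. The second helper is one way to probe the assumption about the sampling rate.

```python
def find_input_device(devices, name_fragment):
    """Return the index of the first input device whose name contains the fragment."""
    for dev in devices:
        if (name_fragment.lower() in dev["name"].lower()
                and dev["max_input_channels"] > 0):
            return dev["index"]
    raise LookupError(f"no input device matching {name_fragment!r}")

def supported_rates(device, candidates=(16000, 22050, 44100, 48000)):
    """Return the candidate sampling rates that are accepted for recording."""
    import sounddevice as sd  # needs the PortAudio library at run time
    rates = []
    for fs in candidates:
        try:
            sd.check_input_settings(device=device, channels=1, samplerate=fs)
            rates.append(fs)
        except sd.PortAudioError:
            pass  # this rate is rejected for the device
    return rates

# Usage on a machine with a sound card:
#   import sounddevice as sd
#   idx = find_input_device(sd.query_devices(), "Sound Blaster")
#   print(idx, supported_rates(idx))
```

A rate accepted by this check may still be converted by the operating system, so the result does not prove what the hardware uses natively.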

Selecting and configuring the audio device

For recording purposes it is sufficient to instantiate an InputStream. A code snippet illustrates the procedure:

iStream = sd.InputStream(samplerate=fs, device=1, channels=1, callback=wrapped_cb2)

The class accepts the sampling rate via the parameter samplerate (here fs, e.g. 44100). The device is selected with the parameter device. Since only a single channel needs to be recorded, the number of channels is set to 1. The callback parameter accepts a callback function which retrieves chunks of audio data from the sound device (wrapped_cb2 is defined further below). The callback function runs at a high priority and should not be burdened with additional postprocessing tasks.

A code snippet demonstrates the use of a callback function which only collects data and avoids any sort of postprocessing.

First a queue is set up. Audio samples will be put into this queue (FIFO).

import queue
q = queue.Queue()

The callback function takes the queue variable q as its first parameter. The other parameters indata, frames, time, status are provided by the InputStream when it invokes the callback. Whenever a chunk of audio samples becomes available the callback is invoked and the audio samples are copied into the queue.

def callback_ref2(q, indata, frames, time, status):
    if status:
        print(status)
    q.put(indata.copy())

The callback function as defined above cannot be passed directly to InputStream. Using functools.partial, a new callback function wrapped_cb2 is generated. This wrapped function takes only the parameters indata, frames, time, status, which is the signature required by InputStream.

from functools import partial
wrapped_cb2 = partial(callback_ref2, q)

The library sounddevice offers some convenience functions to obtain a chunk of audio samples. But for more flexibility the callback based approach is recommended.
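The pieces described above can be combined into one small sketch. The names collect_cb, drain and record are chosen for this example; the recording itself only works on a machine with an audio input device, so sounddevice is imported inside record.

```python
import queue
from functools import partial

import numpy as np

def collect_cb(q, indata, frames, time, status):
    """Audio callback: report problems and hand a copy of the chunk to the queue."""
    if status:
        print(status)
    q.put(indata.copy())

def drain(q):
    """Remove all chunks currently in the queue and join them into one array."""
    chunks = []
    while True:
        try:
            chunks.append(q.get_nowait())
        except queue.Empty:
            break
    return np.concatenate(chunks) if chunks else np.empty((0, 1), dtype="float32")

def record(seconds, fs=44100, device=None):
    """Record mono audio for `seconds` and return it as an (n, 1) array."""
    import sounddevice as sd  # imported here; requires an audio device
    q = queue.Queue()
    with sd.InputStream(samplerate=fs, device=device, channels=1,
                        callback=partial(collect_cb, q)):
        sd.sleep(int(seconds * 1000))  # the callback fills q in the background
    return drain(q)

# Usage: samples = record(2.0, fs=44100, device=1)
```

The callback only copies data; joining and evaluating the chunks happens outside the audio thread, in line with the advice to keep the callback lightweight.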

A Jupyter Notebook provides a couple of examples which demonstrate the use of the callback-based procedure.

https://github.com/michaelbiester/audio/blob/master/JupyterNb/AudioProcessing/Audio_1.ipynb

The example below shows captured audio samples. A list of 5 buffers has been set up to store them. Each buffer can hold up to 10 seconds of audio samples, and the buffers are reused cyclically.

The oldest recording is in buffer1, followed by buffer2, buffer3 and buffer4. The most recent audio samples are in buffer0 (which holds only about 1000 audio samples).

For details of the implementation look at the Jupyter Notebook.
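For readers who want a quick impression without opening the notebook, the cyclic reuse of a list of buffers could look roughly like this. BufferRing is a hypothetical class written for this example and is not the code from the notebook.

```python
import numpy as np

class BufferRing:
    """A fixed number of equally sized buffers that are reused cyclically."""

    def __init__(self, n_buffers, buffer_len):
        self.buffers = [np.zeros(buffer_len, dtype="float32")
                        for _ in range(n_buffers)]
        self.fill = [0] * n_buffers   # number of valid samples per buffer
        self.current = 0              # index of the buffer being written

    def write(self, chunk):
        """Append a chunk, moving on to the next buffer whenever one is full."""
        chunk = np.asarray(chunk, dtype="float32").ravel()
        while len(chunk):
            buf, used = self.buffers[self.current], self.fill[self.current]
            n = min(len(chunk), len(buf) - used)
            buf[used:used + n] = chunk[:n]
            self.fill[self.current] = used + n
            chunk = chunk[n:]
            if self.fill[self.current] == len(buf):
                # buffer full: advance and start overwriting the oldest buffer
                self.current = (self.current + 1) % len(self.buffers)
                self.fill[self.current] = 0

    def oldest_first(self):
        """Return the valid samples of all buffers, oldest buffer first."""
        n = len(self.buffers)
        order = [(self.current + 1 + i) % n for i in range(n)]
        return [self.buffers[i][:self.fill[i]] for i in order]

# Five buffers of 10 seconds each at 44100 Hz, as in the example above:
#   ring = BufferRing(5, 10 * 44100)
```

With the current write position in buffer0, oldest_first returns buffer1 to buffer4 followed by buffer0, which matches the ordering described above.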

Other sample programs

A set of example programs which have been tested on a Windows PC and on the Raspberry Pi can be found on GitHub.

https://github.com/michaelbiester/audio

https://github.com/michaelbiester/audio/tree/master/src


A more advanced example

This example uses two programs:

  • a client program
  • a server program

Client program

The client program initiates the capturing of audio data by the server program. The server program must already be running and listening for a request from the client. The client connects to the server by establishing a websocket connection. Once the connection has been established, the server begins capturing audio data. The client may also configure the server to offer its audio files for download, so that they can be transferred from the computer running the server program to the computer running the client program. Typically the client and server programs run on different platforms and exchange data over a websocket connection across the wireless link.

Server program

The server program captures audio data and implements a sound detector which generates a sound event if the sound activity exceeds user-specified thresholds (set in a configuration file). The server sends a notification to the client when one of these events occurs:

  • a sound event has been detected
  • audio data have been saved to a file on the computer which runs the server program
  • an audio file is ready for download by the client program (if the client has configured the server to enable such downloads)
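How such notifications might be encoded as text messages on the websocket can be sketched as follows. This is purely illustrative: the event names, the JSON layout and both helper functions are assumptions made for this example, and the actual programs may use a different message format.

```python
import json

# Hypothetical event names; the actual programs may use different ones.
EVENTS = ("sound_event", "file_saved", "file_ready")

def make_notification(event, **details):
    """Encode a server notification as a JSON text message."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event}")
    return json.dumps({"event": event, **details})

def parse_notification(message):
    """Decode a notification on the client side into (event, details)."""
    data = json.loads(message)
    event = data.pop("event")
    return event, data

msg = make_notification("file_saved", path="rec_001.wav")
print(parse_notification(msg))
```

A text format such as JSON keeps the messages easy to inspect while debugging the link between client and server.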

Flow of information

The figure provides an overview of the flow of information exchanged between client and server program.

The client and server programs are available on GitHub.

https://github.com/michaelbiester/audio


Setting up WiFi Direct on a Raspberry Pi

As outlined in the overview section, a WiFi direct connection between the Raspberry Pi and the PC shall be set up. Here are the steps I had to follow to make it work:

Does the PC support WiFi direct?

On the Windows PC it is checked whether WiFi direct is supported at all. The required information is obtained using command ipconfig /all.

C:\Users\micha>ipconfig /all

Windows-IP-Konfiguration

   Hostname  . . . . . . . . . . . . : hp_mbi1955
   Primäres DNS-Suffix . . . . . . . :
   Knotentyp . . . . . . . . . . . . : Hybrid
   IP-Routing aktiviert  . . . . . . : Nein
   WINS-Proxy aktiviert  . . . . . . : Nein

Drahtlos-LAN-Adapter LAN-Verbindung* 1:

   Medienstatus. . . . . . . . . . . : Medium getrennt
   Verbindungsspezifisches DNS-Suffix:
   Beschreibung. . . . . . . . . . . : Microsoft Wi-Fi Direct Virtual Adapter
   Physische Adresse . . . . . . . . : 62-E9-AA-18-79-45
   DHCP aktiviert. . . . . . . . . . : Ja
   Autokonfiguration aktiviert . . . : Ja

Drahtlos-LAN-Adapter LAN-Verbindung* 2:

   Medienstatus. . . . . . . . . . . : Medium getrennt
   Verbindungsspezifisches DNS-Suffix:
   Beschreibung. . . . . . . . . . . : Microsoft Wi-Fi Direct Virtual Adapter #2
   Physische Adresse . . . . . . . . : E2-E9-AA-18-79-45
   DHCP aktiviert. . . . . . . . . . : Nein
   Autokonfiguration aktiviert . . . : Ja

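The listing, taken from a German-language Windows installation, shows that this PC has two virtual WLAN adapters, both of which support WiFi Direct: their description ("Beschreibung") reads "Microsoft Wi-Fi Direct Virtual Adapter".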
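The same check can be scripted. The sketch below searches the output of ipconfig /all for the adapter description; the function names are chosen for this example, and it assumes the description contains the text "Wi-Fi Direct Virtual Adapter", as in the listing above.

```python
import subprocess

def wifi_direct_adapters(ipconfig_text):
    """Return the adapter descriptions that mention a Wi-Fi Direct virtual adapter."""
    return [line.split(":", 1)[1].strip()
            for line in ipconfig_text.splitlines()
            if "Wi-Fi Direct Virtual Adapter" in line and ":" in line]

def query_ipconfig():
    """Run `ipconfig /all` (Windows only) and return its output as text."""
    result = subprocess.run(["ipconfig", "/all"], capture_output=True,
                            text=True, errors="replace", check=True)
    return result.stdout

# Usage on the Windows PC:
#   print(wifi_direct_adapters(query_ipconfig()))
```

Matching on the adapter name instead of the field label keeps the check independent of the display language of Windows.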

Using the network manager on the Raspberry Pi to set up WiFi Direct

First it must be checked whether the network manager is installed and enabled.

On my Raspberry Pi the network manager tool nmcli was already installed. However it had to be enabled. This is done using the raspi-config tool. It is started with

sudo raspi-config

The documentation of the raspi-config tool can be found here:

https://www.raspberrypi.com/documentation/

and specifically here:

https://www.raspberrypi.com/documentation/computers/configuration.html

The RPI has a built-in WLAN adapter which shows up as interface wlan0. For the WiFi direct connection an external USB WLAN adapter has been attached to the RPI. This interface shows up as wlan1.

With the command nmcli the hotspot is configured like this (the SSID and the password are user-specific):

sudo nmcli device wifi hotspot ssid rpi3_mbi1955_hotspot password <my password> ifname wlan1

With command ifconfig the status is checked and the interface wlan1 shows up like this:

wlan1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.42.0.1  netmask 255.255.255.0  broadcast 10.42.0.255
        inet6 fe80::20c:aff:fe68:8cb  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:0a:68:08:cb  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 46  bytes 6440 (6.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

The important information here is that the WLAN interface has been assigned the local IP address 10.42.0.1/24.

Using this address the PC can connect to the RPI (e.g. by logging in via SSH).
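Before starting an SSH session it can be useful to check from the PC whether the RPI is reachable at all. The sketch below simply tries to open a TCP connection; can_reach is a hypothetical helper, and port 22 is assumed because it is the default SSH port.

```python
import socket

def can_reach(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Usage from the PC once it has joined the hotspot:
#   can_reach("10.42.0.1")
```

A failed check can mean that the PC is not connected to the hotspot or that the SSH service on the RPI is not running.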

At this point the hotspot is not yet set up permanently. To achieve this I found some extremely helpful documentation here:

https://www.raspberrypi.com/tutorials/host-a-hotel-wifi-hotspot/

From this document we learn that the autoconnect property must be set to yes and the autoconnect-priority must be changed to a high value.

sudo nmcli connection modify <hotspot UUID> connection.autoconnect yes connection.autoconnect-priority 100

The hotspot UUID is queried using nmcli:

mbi1955@raspberrypi:~ $ sudo nmcli device
DEVICE         TYPE      STATE         CONNECTION
wlan1          wifi      connected     Hotspot
wlan0          wifi      connected     Vodafone-8293
p2p-dev-wlan0  wifi-p2p  disconnected  --
eth0           ethernet  unavailable   --
lo             loopback  unmanaged     --

mbi1955@raspberrypi:~ $ sudo nmcli connect
NAME                UUID                                  TYPE      DEVICE
Hotspot             c95711aa-efc2-468e-8b3f-5e2315a5bab1  wifi      wlan1
Vodafone-8293       f511c832-9ad2-432d-90cc-78e311691db4  wifi      wlan0
Wired connection 1  b156082b-36c4-3424-b038-4ab041dc7550  ethernet  

The connections of the two WLAN adapters show up with their respective UUIDs.
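If the UUID is needed in a script, it can be read from nmcli's terse output instead of being copied by hand. The following sketch assumes the connection is named "Hotspot" as in the listing above; parse_connections and hotspot_uuid are hypothetical helper names, while -t (terse) and -f (field selection) are standard nmcli options.

```python
import subprocess

def parse_connections(terse_output):
    """Map connection names to UUIDs from `nmcli -t -f NAME,UUID connection show`."""
    table = {}
    for line in terse_output.splitlines():
        if not line:
            continue
        # A UUID contains no colon, so the last colon separates NAME from UUID.
        name, uuid = line.rsplit(":", 1)
        # In terse mode nmcli escapes colons inside values with a backslash.
        table[name.replace("\\:", ":")] = uuid
    return table

def hotspot_uuid(name="Hotspot"):
    """Ask NetworkManager for the UUID of the connection called `name`."""
    result = subprocess.run(
        ["nmcli", "-t", "-f", "NAME,UUID", "connection", "show"],
        capture_output=True, text=True, check=True)
    return parse_connections(result.stdout)[name]

# Usage on the Raspberry Pi:
#   print(hotspot_uuid())
```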

With

nmcli connection show c95711aa-efc2-468e-8b3f-5e2315a5bab1

the properties of the hotspot connection are displayed.

With

nmcli connection show c95711aa-efc2-468e-8b3f-5e2315a5bab1 > hotspot.txt

the properties are collected into a text file.

When the PC scans for available WLANs it also displays the SSID of the hotspot rpi3_mbi1955_hotspot.

Once the PC has connected (user credentials must be entered) it is possible to log in via SSH. The screenshot shows the procedure when PuTTY is used.


Documentation

Comprehensive documentation of the nmcli tool can be found in various places. I found the following resources quite readable:

https://access.redhat.com/documentation/de-de/red_hat_enterprise_linux/7/html/networking_guide/sec-configuring_ip_networking_with_nmcli

https://developer-old.gnome.org/NetworkManager/stable/nmcli.html

https://manpages.ubuntu.com/manpages/trusty/man1/nmcli.1.html