Introducing Aurio and AudioAlign
AudioAlign is a tool that I started developing in 2010 for my master’s thesis and have been actively developing ever since, with the goal of creating software for the automatic synchronization of audio and video recordings. Although I never quite reached the point of a fully automatic synchronization system, it showed promising results compared to the few similar commercial applications on the market, and it continues to be a helpful tool for my research. I gave up on the plan to commercialize it because of patent issues I didn’t know how to deal with and decided to open-source it instead, so that others can make use of it and hopefully even help me improve it.
Aurio is a library extracted from AudioAlign, providing the underlying core audio processing functionality, including audio fingerprinting and time warping algorithms. Both Aurio and AudioAlign are now available on GitHub under the AGPL license.
Aurio: Audio Fingerprinting & Retrieval for .NET
Aurio is a library for audio retrieval purposes, with a focus on audio fingerprinting algorithms. It is written in C# for the .NET Framework 4.0 but has only minimal dependencies on Windows APIs, so it could easily be ported to the Mono platform. It provides many basic building blocks for implementing audio features and algorithms, e.g. an easily extensible stream-based 32-bit audio processing engine, file I/O through FFmpeg, and FFT and resampling through various well-known libraries. Based on these blocks, it currently implements multiple audio features, two kinds of time warping algorithms, and four well-known, easy-to-use fingerprinting algorithms. It additionally provides a few audio-related UI widgets.
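To give an idea of what a time warping algorithm does, here is a minimal sketch of classic dynamic time warping in Python. It is an illustration only: it is not Aurio's code or API, the text above does not say which two time warping variants Aurio implements, and the function name `dtw_path` and the use of plain number sequences (instead of audio feature vectors) are assumptions made for this example.

```python
def dtw_path(a, b):
    """Align two numeric sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    meaning that a[i] is matched with b[j]. Hypothetical helper for
    illustration; not part of Aurio.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            distance = abs(a[i - 1] - b[j - 1])
            cost[i][j] = distance + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Walk back from the end along the cheapest predecessors.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


total_cost, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
print(total_cost, path)
```

In this toy input the second sequence repeats one value, so the resulting path matches that value twice. This full-matrix version needs time and memory proportional to the product of the two sequence lengths, which is why long recordings usually call for a constrained or incremental variant.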
A more detailed description with usage examples is available in the GitHub repository and the accompanying website. This library has also been accepted to the ACM Multimedia 2015 Open-Source Software Competition, and the paper describing the library is available here.
AudioAlign: Audio Synchronization And Analysis Tool
AudioAlign is a Windows application for the semi-automatic synchronization of media files, including drifted recordings. It can read many different media formats, synchronize them (semi-)automatically, and export them for further processing in other applications. The user interface provides a continuously zoomable and scrollable multi-track waveform timeline view, similar to many digital audio workstations. This timeline allows the user to place and synchronize tracks manually, but also to inspect and edit automatically generated synchronization points. Most of the functionality comes from the Aurio library; AudioAlign basically just wraps a GUI and multi-threading around Aurio. The main use cases are the synchronization of multiple overlapping recordings of the same event, e.g. generating long-running videos from separate short clips, detecting and removing drift from several long-running recordings, or generating multi-camera cuts. It can also be “abused” for use cases like generating video mashups, synchronizing different cover interpretations, or voice dubbing. An extended description is available in the GitHub repository and the Aurio paper linked above.
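To illustrate how synchronization points relate to drift, here is a small Python sketch that fits a constant offset and a constant clock-rate factor to a handful of points. It is not how AudioAlign is implemented. It assumes that drift is linear over the whole recording and that each synchronization point is a pair of timestamps in seconds (time in the drifted recording, time in the reference recording). The names `fit_offset_and_rate` and `to_reference_time` are made up for this example.

```python
def fit_offset_and_rate(sync_points):
    """Least-squares fit of t_ref = offset + rate * t_rec.

    sync_points is a list of (t_rec, t_ref) pairs in seconds.
    Hypothetical helper for illustration; not part of AudioAlign or Aurio.
    """
    n = len(sync_points)
    if n < 2:
        raise ValueError("need at least two synchronization points")
    mean_rec = sum(t_rec for t_rec, _ in sync_points) / n
    mean_ref = sum(t_ref for _, t_ref in sync_points) / n
    spread = sum((t_rec - mean_rec) ** 2 for t_rec, _ in sync_points)
    if spread == 0:
        raise ValueError("synchronization points must differ in t_rec")
    covariance = sum(
        (t_rec - mean_rec) * (t_ref - mean_ref) for t_rec, t_ref in sync_points
    )
    rate = covariance / spread            # 1.0 means no drift
    offset = mean_ref - rate * mean_rec   # start offset in seconds
    return offset, rate


def to_reference_time(t_rec, offset, rate):
    """Map a timestamp of the drifted recording onto the reference timeline."""
    return offset + rate * t_rec


# Made-up example data: the recording starts 5 s late and runs 0.1 % slow.
points = [(0.0, 5.0), (100.0, 105.1), (200.0, 205.2)]
offset, rate = fit_offset_and_rate(points)
print(offset, rate, to_reference_time(300.0, offset, rate))
```

A rate different from 1.0 indicates drift, and resampling the drifted recording by that factor would remove it under the linear assumption. Real recordings can drift non-linearly, in which case a piecewise mapping between neighboring synchronization points would be the more suitable choice.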