Audio Fingerprinting

Introducing Aurio and AudioAlign

AudioAlign is a tool that I started developing in 2010 for my master’s thesis and have been actively developing ever since, with the goal of creating software for the automatic synchronization of audio and video recordings. Although I never quite reached the point of a fully automatic synchronization system, it showed promising results compared to the few similar commercial applications on the market, and it continues to be a helpful tool for my research. I gave up on the plan to commercialize it because of patent issues I didn’t know how to deal with, and decided to open-source it instead, so that others can make use of it and hopefully even help me improve it.

Aurio is a library extracted from AudioAlign, providing the underlying core audio processing functionality, including audio fingerprinting and time warping algorithms. Both Aurio and AudioAlign are now available on GitHub under the AGPL license.

Aurio adds support for real-time live fingerprinting and cross-platform cloud deployments with .NET Core 2.0

A collaboration with eyecandylab, a company developing products for augmenting TV programs, recently gave me the opportunity to add great new features to Aurio. The latest version, released today, extends the architecture to support processing of real-time audio streams of unbounded length, which means that live streams can now be fingerprinted on the fly with minimal latency. Additionally, the Aurio core library has been ported to .NET Standard 2.0 and runs on .NET Core 2.0 on Windows, Linux, and macOS, which makes it possible to build microservices in containerized environments like Docker.
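To give a rough idea of what processing a stream of unbounded length involves, here is a minimal Python sketch. It is not Aurio's API (Aurio is a .NET library); the function name `sliding_windows` and its parameters are made up for illustration. It cuts incoming sample chunks of arbitrary size into overlapping analysis windows, which is the kind of input a fingerprinting stage typically consumes, while buffering only the samples that future windows still need.

```python
from typing import Iterable, Iterator, List


def sliding_windows(
    chunks: Iterable[List[float]], window_size: int, hop_size: int
) -> Iterator[List[float]]:
    """Yield overlapping windows from a (possibly endless) stream of chunks.

    Hypothetical helper for illustration only. Samples that no future
    window needs are dropped right away, so memory use stays bounded no
    matter how long the stream runs.
    """
    if not 0 < hop_size <= window_size:
        raise ValueError("hop_size must be in the range 1..window_size")

    buffer: List[float] = []
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit every window that is complete with the samples seen so far.
        while len(buffer) >= window_size:
            yield buffer[:window_size]
            # Advance by one hop; the overlap stays in the buffer.
            del buffer[:hop_size]


# Chunks arrive in arbitrary sizes, as they would from a live source.
incoming = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]
windows = list(sliding_windows(incoming, window_size=4, hop_size=2))
```

Because the function is a generator, each window becomes available as soon as its last sample has arrived, which is what keeps latency low in a live setting.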

Automatic subtitle synchronization for edited TV productions

Two years ago I was approached by someone from a public TV broadcaster in Germany with the following problem: given multiple video files containing differently cut versions of the same production, is it possible to use the technology from AudioAlign/Aurio to automatically generate edit decision lists (EDL/XML) and use them to transfer subtitles from a reference version to the different cuts? The answer is “yes”, and that’s just one of many use cases. This article describes the challenges and how Aurio solves them almost magically in a successful prototype developed for the TV station.
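The subtitle transfer step itself can be sketched in a few lines of Python, assuming the edit decision list has already been derived from the audio matches. This is not the prototype's code: the `Segment` and `Subtitle` types and the `transfer_subtitles` function are hypothetical names, and a real EDL carries more information than the three fields used here. Each segment states which span of the reference version appears where in the cut; subtitles are shifted accordingly and clipped at cut points.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One EDL entry (hypothetical, simplified): a span of the reference
    version that also appears in the cut version. Times are in seconds."""
    source_start: float  # position in the reference version
    target_start: float  # position in the cut version
    duration: float


@dataclass(frozen=True)
class Subtitle:
    start: float
    end: float
    text: str


def transfer_subtitles(subtitles: List[Subtitle], edl: List[Segment]) -> List[Subtitle]:
    """Map reference subtitles onto the cut version described by the EDL."""
    result: List[Subtitle] = []
    for seg in edl:
        seg_end = seg.source_start + seg.duration
        offset = seg.target_start - seg.source_start
        for sub in subtitles:
            # Keep only the part of the subtitle that lies inside the segment.
            start = max(sub.start, seg.source_start)
            end = min(sub.end, seg_end)
            if start < end:
                result.append(Subtitle(start + offset, end + offset, sub.text))
    return sorted(result, key=lambda s: s.start)


# Example: the cut version drops seconds 10 to 18 of the reference.
reference_subs = [
    Subtitle(1.0, 3.0, "kept"),
    Subtitle(12.0, 14.0, "removed with the cut"),
    Subtitle(20.0, 22.0, "shifted"),
]
edl = [Segment(0.0, 0.0, 10.0), Segment(18.0, 10.0, 10.0)]
cut_subs = transfer_subtitles(reference_subs, edl)
```

In this example the first subtitle keeps its timing, the second disappears together with the removed material, and the third moves eight seconds earlier. A subtitle that straddles a cut point is truncated at the cut, which is a simplification; in practice it may be preferable to flag such cases for manual review.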