Protyposis

Travelling through time & space in real-time...

CATEGORY: Audio Fingerprinting

Automatic subtitle synchronization for edited TV productions

Two years ago I was approached by someone from a public TV broadcaster in Germany with the following problem: Given multiple video files with differently cut versions of the same production, is it possible to use the technology from AudioAlign/Aurio to automatically generate edit decision lists (EDL/XML) and use them to transfer subtitles from a reference version to the different cuts? The answer is “yes”, and that’s just one of many use-cases. This article describes the challenges and how the Aurio technology solves them almost magically in a successful prototype developed for the TV station.

Aurio adds support for realtime live fingerprinting and cross-plattform cloud deployments with .NET Core 2.0

A collaboration with eyecandylab, a company developing products for augmenting TV programs, recently gave me the opportunity to implement great new features into Aurio. The most recent version released today extends the architecture to support processing of realtime audio streams with infinite lengths, which means that live streams can now be fingerprinted on the fly with minimal latency. Additionally, the Aurio core library has been ported to .NET Standard 2.0 and will run with the .NET Core 2.0 framework on Windows, Linux and MacOS, enabling building microservices in containerized environments like Docker.

An example application named Aurio.Test.RealtimeFingerprinting has been added to demonstrate how realtime live fingerprinting can be implemented with only a few lines of code. As part of supporting .NET Core, resampler and FFT dependencies building on native code have been moved from the core into optional add-on packages, and the WDL resampler ported to purely managed .NET code by NAudio has been integrated as an alternative for deployments where native dependencies are undesired. This means that now there are purely managed implementations available for both FFT (Exocortex.DSP with Aurio.Exocortex) and resampling (NAudioWdlResampler within the Aurio core).

Of course Aurio continues to support the good old WPF GUI applications, but the framework requirement had to be increased from .NET Framework 4.0 to 4.6.2. AudioAlign has also been updated to the latest Aurio version.

Aurio on GitHub
AudioAlign on GitHub

Introducing Aurio and AudioAlign

AudioAlign is a tool that I started developing in 2010 for my master’s thesis (and has been actively developed since then), with the goal to create a software for the automatic synchronization of audio and video recordings. Although I never quite reached the point of a fully automatic synchronization system, it showed promising results compared to the few similar commercial applications available on the market, and continues to be a helpful tool for my research purposes. I gave up on the plan to commercialize it due to patenting problems I didn’t know how to deal with, but instead decided on open sourcing it so others could still make use of it and hopefully even help me improve it. Aurio is a library extracted from AudioAlign, providing the underlying core audio processing functionality like an audio processing engine and audio fingerprinting and time warping algorithms. Both Aurio and AudioAlign are now available as AGPL licensed open source software on GitHub.

There are no more results.