How to create Training Data sets and turn Audio into Video

Two articles captured my reading attention this week. 

How to create large training data sets for Machine Learning

Training data has always been important in building machine learning algorithms, and the rise of data-hungry deep learning models has heightened the need for labeled data sets. In fact, the challenge of creating training data is ongoing for many companies; specific applications change over time, and what were gold standard data sets may no longer apply to changing situations.

Ben Lorica spoke with Alex Ratner, a graduate student at Stanford and a member of Christopher Ré’s Hazy research group.

By developing a framework for mining low-quality sources in order to build high-quality machine learning models, Ré and his collaborators help researchers extract information previously hidden in unstructured data sources (so-called “dark data” buried in text, images, charts, and so on).
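
To make the idea of building good labels from low-quality sources a little more concrete, here is a minimal sketch in Python. It is not the Hazy group's framework or its API. It assumes a toy setup in which a few hand-written heuristics (the hypothetical `lf_*` functions below) each cast a noisy vote on a piece of text, and a simple majority vote combines them into a training label. The label names, the heuristics, and the majority-vote rule are all illustrative assumptions.

```python
from collections import Counter

# Sentinel meaning "this heuristic has no opinion" (an assumption of this sketch).
ABSTAIN = None


# Hypothetical, deliberately crude heuristics. Each one is a "low-quality source".
def lf_mentions_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN


def lf_mentions_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    return "complaint" if text.count("!") >= 2 else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_refund, lf_mentions_thanks, lf_many_exclamations]


def weak_label(text):
    """Combine the noisy votes by simple majority; abstain on no votes or a tie."""
    votes = [vote for lf in LABELING_FUNCTIONS if (vote := lf(text)) is not ABSTAIN]
    if not votes:
        return ABSTAIN
    (top_label, top_count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == top_count:
        return ABSTAIN  # the heuristics disagree evenly, so no label is assigned
    return top_label


def build_training_set(texts):
    """Keep only the examples that received a label."""
    return [(t, label) for t in texts if (label := weak_label(t)) is not ABSTAIN]
```

Calling `build_training_set` on a list of unlabeled strings returns `(text, label)` pairs that could then be fed to an ordinary classifier. A real system would likely weight each source by its estimated accuracy rather than count votes equally.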

How do you turn Audio into Video?

A new artificial intelligence tool can create realistic videos from audio files alone. This technology, developed at the University of Washington, has been tested on speeches made by former President Obama.

The technology is based on new algorithms designed to overcome a limitation in computer vision: turning audio clips into realistic, lip-synced video of the person speaking the words. The algorithms learn from videos that exist "in the wild", such as on the Internet or elsewhere.

To test the technology, the research group generated a realistic video of Barack Obama discussing such diverse subjects as terrorism, fatherhood and employment. The video was created from audio clips combined with separate video of the former president. It overcomes a major problem with matching audio to video, where the mouth of the speaker appears unrealistic.

The advantages to businesses are considerable. High-quality audio recordings could be made and later turned into video of a higher resolution than would be possible using a standard camera. The same could be done with archival sound recordings, an area that may appeal to the entertainment industry. Imagine, for example, being able to hold a conversation with a historical figure in virtual reality by creating visuals just from audio.


August's AIHappyHour has sold out

For the third month in a row our AIHappyHour event has sold out. Tonight we're at Microsoft. 

Next month we're planning a social event. Then we're at Datacom in October before the final AIHappyHour event in November at the new Generator building in Wynyard Quarter.