
Api.video + api.audio: Localize Advertisement Videos with Personalized Voice Overs

Guest Post: Erikka Innes, Developer Evangelist at api.video, explains how to combine api.video with api.audio. In this tutorial, learn how to create a video ad and produce it in 50 different versions.

Erikka Innes

In a collaboration with api.video (what a coincidence and no, it's not related to api.audio) developer evangelist Erikka Innes created this amazing blogpost. Power to her and a big thank you to all of the folks at api.video for the collaboration.

You've prepared the perfect ad copy for your business, but now you're trying to figure out how to personalize the ad for each of your locations. One great way to do that is to use a voice over tool that lets you customize the content based on where your customers are. Fortunately, this is available! Combining api.video with api.audio's voice over makes this problem easy to solve.

Imagine, if you will, that you own a pizza chain

To make this fun, let's imagine you are the owner of a popular pizza chain - Renzo's pizza chain. You have fifty different outlets and you've prepared the perfect ad:


This ad is great for one of your locations, but you want people near all of your restaurants to know they can eat the best pizza for the least dough. For each location, you'd like to run this ad, but have it play information that tells the listener the city, address, day of the week and the time of the offer being described.

No problem! We can do this in the tutorial today. Let's get started.

Prerequisites

For this project, you're going to need:

Installation

We will use the api.video Python client and Aflorithmic's apiaudio library.

Installation for api.video:

pip install api.video

Installation for api.audio:

pip install -U apiaudio

ffmpeg installation

If you want to run the project as-is, you'll also need to install ffmpeg. These instructions cover installation on a Mac. First make sure you have Homebrew (brew) installed. Then installing ffmpeg is a single command:

brew install ffmpeg

You need to install ffmpeg before you install the next two items.

pydub and pyaudio installation

pydub and pyaudio can also be difficult to install, depending on your setup and what you've tried to install already. If you made the mistake of trying to install these before installing ffmpeg, first run:

brew remove portaudio

Then, reinstall this like so:

brew install portaudio

After these steps, you should be able to successfully install the modules you'll need. Both are Python packages, so install them with pip. Here are the commands:

pip install pyaudio

and

pip install pydub

Project overview

Here's what we're going to do with the example script today:

  1. Prompt to collect the api.video API key and the api.audio API key.
  2. Select our video from the Videos folder (for your own tweaks later you can drop other videos here to use, or use a different folder system to organize your content).
  3. Select the script we want from the Script folder and offer the user a preview of the script to make sure it's the right one.
  4. Select the localization .csv file we want to use with our script.
  5. Create a sample audio file with the voice over, using the first line from the .csv file
  6. Give the user the sample file so they can make sure it sounds like what they want.
  7. Next we'll combine the first audio file with the video.
  8. We'll display that file for review by the user to make sure it looks right.
  9. If the user approves the file, one by one we'll create an audio file, combine it with the video, tag each video with the information from the csv, and upload it for storage and hosting on api.video.
  10. We'll return the user a list of video titles and a link to a playable copy of the final file.

Code sample

Here is the code sample. It's a wizard that will walk you through all the steps. The complete project is available on github here: https://github.com/apivideo/python-api-client/blob/master/examples/video_audio/README.md


Walk through the important stuff

A bunch of the demo is set up to walk you through the process of combining information. We don't need to go over each while loop, but let's go over some details regarding API behavior and the tools used to create the demo.

The script

You can place your script into your code directly by using quotes. It could look like this (I'm saying could because there are a few different ways to set up your api.audio script):

"""
    <<soundSegment::intro>> 
    <<sectionName::intro>> 
    On {{day_of_week}} it's Bring a friend day in {{city}} at Renzo's pizza!

    <<soundSegment::main>> 
    <<sectionName::main>> 
    Bring a friend and they will get their pizza for free if you order one or more premium size pizzas, between {{offer_time}}. Only at Renzo's, {{address}} in {{city}}.

    <<soundSegment::outro>> 
    <<sectionName::outro>> 
    Renzo's. Eat the best pizza for the least dough.
"""

You can see that the script is split into three sections. This demo doesn't really make use of the features available per section so I'm not going to go into detail for this part. Personalization is achieved by using {{ }} around the name of a column from your csv spreadsheet. They can be used in whatever order you like.
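To check a script against one row of your spreadsheet before sending anything to api.audio, you can fill in the placeholders locally. This is only a rough preview sketch: the `fill_placeholders` helper is hypothetical and not part of the apiaudio library, and the service does its own substitution when it renders the speech.

```python
import re


def fill_placeholders(script_text, row):
    """Replace each {{name}} in script_text with row[name] (local preview only)."""
    def substitute(match):
        key = match.group(1).strip()
        # Leave unknown placeholders untouched so typos in column names stand out.
        return str(row.get(key, match.group(0)))

    return re.sub(r"\{\{(.*?)\}\}", substitute, script_text)


row = {"day_of_week": "Tuesday", "city": "Springfield"}
line = "On {{day_of_week}} it's Bring a friend day in {{city}} at Renzo's pizza!"
print(fill_placeholders(line, row))
```

Any placeholder that has no matching column is left as-is, which makes a misspelled header easy to spot in the preview.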

You can also choose to read your script from a .txt file. If you do, make sure to take the quotes off the beginning and end, or they will screw up the parser. A common error message says there's a problem with creating a final file or working with a URL. It usually means something broke before that point, so keep that in mind when debugging.

In your code, when you set up your script, you'll use a command that looks something like this:

In this snippet, you've already authenticated with your api key

apiaudio.api_key = your_key

so now you can work with the api's endpoints.

The only required field is scriptText, which will contain the text of your script. However it's useful to provide the other names for reference. Be aware that you cannot delete projects or modules. After you set up your script this way, you'll be able to reference it in your code easily via scriptId. Like so:

script.get("scriptId")
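As a rough sketch of what script setup might look like, the helper below wraps the call in a function. It assumes the apiaudio library exposes `Script.create` with the keyword arguments `scriptText`, `projectName`, `moduleName` and `scriptName`; check those names against the current api.audio docs. The `create_script` helper and the default project, module and script names are made up for illustration.

```python
def create_script(client, script_text, project="ad", module="pizza", name="bring_a_friend"):
    """Create a script and return its scriptId.

    `client` is expected to be the imported apiaudio module, already
    authenticated via client.api_key = your_key. The keyword names below
    are an assumption about the library, not confirmed by this post.
    """
    script = client.Script.create(
        scriptText=script_text,  # the only required field
        projectName=project,     # optional labels, handy for reference later
        moduleName=module,
        scriptName=name,
    )
    return script.get("scriptId")
```

With the real library you would call `create_script(apiaudio, text)` after setting `apiaudio.api_key`. Passing the module in as an argument also makes it easy to swap in a stand-in while testing the rest of your wizard.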

The audio file

For starters, when you create an audio file with api.audio, it names your file based on your project name and personalization parameters. In this demo, if you named your project "ad" then every audio file would begin with the word "ad." Next, it puts the headers for the csv file in alphabetical order, and adds the appropriate parameter next to each header. For example, our .csv file has the columns (in this order):

  • city
  • address
  • day_of_week
  • offer_time

Each column title is followed by that row's value for the column. Columns are separated by two underscores. Here's a sample:

ad__address_1353%20harlem%20street__city_springfield__day_of_week_tuesday__offer_time_6%20to%208%20pm

So something you'll possibly want to do is rename the files when they arrive. An easy way to handle this is with the built-in os module.
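Here is one way the renaming could look, as a minimal sketch. The `friendly_name` and `rename_audio` helpers are hypothetical, and the choice of city plus day of week for the new name is just an example. If two outlets share a city, you would want to add the address as well.

```python
import os
import tempfile


def friendly_name(row, extension=".mp3"):
    """Build a short file name such as 'springfield_tuesday.mp3' from a csv row."""
    parts = [row["city"], row["day_of_week"]]
    slug = "_".join(p.strip().lower().replace(" ", "-") for p in parts)
    return slug + extension


def rename_audio(path, row):
    """Rename the file at `path` within its folder and return the new path."""
    new_path = os.path.join(os.path.dirname(path), friendly_name(row))
    os.replace(path, new_path)
    return new_path


# Demonstration with a throwaway file standing in for a downloaded track.
row = {"city": "Springfield", "address": "1353 Harlem Street",
       "day_of_week": "Tuesday", "offer_time": "6 to 8 pm"}
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "long_generated_name.mp3")
    open(original, "wb").close()
    renamed = rename_audio(original, row)
    print(os.path.basename(renamed))
```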

When creating an audio file you'll do two steps, one is text-to-speech and one is mastering the audio. For text-to-speech you can pick a voice, how fast it will speak and then give it a dictionary containing the list of personalization terms you want to insert into your script.

You can also choose a template, which will play some background music under your spoken audio. You can see in the demo I used "copacabana." To list voices or background music options, go to the api.audio API reference docs and use the endpoint for listing voices or the endpoint for listing music.

The .csv

This demo reads information from a .csv file. You can read from another source if you like, as long as what your program ends up working with is a list of dictionaries. .csv is pretty simple to use, so I went with that option.
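Getting from a .csv file to a list of dictionaries takes only a few lines with the standard library. The sketch below uses an in-memory sample so it runs on its own. The `read_rows` helper and the sample row are illustrative, not taken from the demo.

```python
import csv
import io


def read_rows(csv_file):
    """Return the csv contents as a list of dictionaries keyed by column header."""
    return [dict(row) for row in csv.DictReader(csv_file)]


# In-memory stand-in for a localization file. With a real file you would pass
# open("locations.csv", newline="") instead ("locations.csv" is a made-up name).
sample = io.StringIO(
    "city,address,day_of_week,offer_time\n"
    "Springfield,1353 Harlem Street,Tuesday,6 to 8 pm\n"
)
rows = read_rows(sample)
print(rows[0]["city"])
```

Each dictionary's keys are the column headers, which are the same names you put between {{ }} in the script.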

Playing a track from your application

You can play a track to check out the audio by using a variety of tools. For this one, I used pydub and pyaudio. These are fairly popular modules. In order to use them, however, audio must be converted to .wav. You will see in the code that two imports are made from pydub:

from pydub import AudioSegment
from pydub.playback import play

These allow us to play straight from the terminal or wherever we may be. The code to convert to .wav and play is very straightforward:

wav_track = AudioSegment.from_mp3(file)
wav_track.export(audio_title, format="wav")

There are other choices available for converting, but api.audio returns .mp3 files, so we use the from_mp3 choice.

After we have the track, playing it is as simple as this:

play(wav_track)

Merging audio and video

Prior to uploading your video to api.video, you will want to combine the sound and video. This can be accomplished with ffmpeg, which you need to install to use pydub and pyaudio anyway. The code for merging audio and video is:

input_video = ffmpeg.input(video)
input_audio = ffmpeg.input(wav_track)
title = title + ".mp4"
ffmpeg.concat(input_video, input_audio, v=1, a=1).output(title).run()

This will produce an mp4. You can then upload it to api.video.
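If you would rather call the ffmpeg command-line tool directly, a similar merge can be sketched with subprocess. This is an alternative to the snippet above, not what the demo does: it copies the video stream without re-encoding, takes the audio from the second input, and stops at the shorter of the two. The helper names and file names are placeholders.

```python
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Return the ffmpeg argument list that lays audio_path over video_path."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it already exists
        "-i", video_path,    # input 0: the original ad video
        "-i", audio_path,    # input 1: the localized voice over
        "-map", "0:v:0",     # keep the video stream from input 0
        "-map", "1:a:0",     # keep the audio stream from input 1
        "-c:v", "copy",      # do not re-encode the video
        "-shortest",         # stop when the shorter stream ends
        output_path,
    ]


def merge(video_path, audio_path, output_path):
    """Run ffmpeg and raise CalledProcessError if it fails."""
    subprocess.run(build_merge_command(video_path, audio_path, output_path), check=True)


print(" ".join(build_merge_command("ad.mp4", "springfield.wav", "springfield.mp4")))
```

Keeping the command builder separate from the call that runs it lets you print and inspect the exact command before ffmpeg touches any files.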

Upload to api.video

For details about uploading a video with api.video, you can check out the tutorial about it here: Upload a Video with the api.video Python Client

Note that the file you upload must be in the same folder as your application, or the upload will fail.

Once it's uploaded, you can retrieve the .mp4 from the response and play it right away in your browser using the built-in webbrowser module.

To retrieve the mp4 from the response, you do this:

link_mp4 = video_response['assets']['player']

And then you can open the link like this:

webbrowser.open(link_mp4)

This will let you make sure everything combined properly so that the audio matches with the video the way you want.

Create all the localized videos

After all the steps to make sure you're creating the right type of video, you can use the recipe from api.audio with a couple of tweaks to create all your new videos with personalized ads by location, then upload them for hosting to api.video.

This demo deletes the local copy of each video right after it is uploaded, so you don't end up with fifty videos sitting in a folder.
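The overall loop can be sketched like this. It is an outline under assumptions, not the demo's actual code: `make_video` and `upload` are hypothetical callables standing in for the api.audio, ffmpeg and api.video steps, and the stand-ins at the bottom exist only so the example runs on its own.

```python
import os
import tempfile


def localize_all(rows, make_video, upload):
    """For each csv row: build a video, upload it, then delete the local file.

    make_video(row) -> path of the finished .mp4 (audio already merged in)
    upload(path, row) -> link to the hosted video
    Both callables are placeholders for the real API steps.
    """
    links = []
    for row in rows:
        path = make_video(row)
        link = upload(path, row)
        os.remove(path)  # the hosted copy is enough, so drop the local one
        links.append((row["city"], link))
    return links


# Stand-ins so the sketch runs without any API keys.
workdir = tempfile.mkdtemp()


def fake_make_video(row):
    path = os.path.join(workdir, row["city"].lower() + ".mp4")
    open(path, "wb").close()
    return path


def fake_upload(path, row):
    return "https://example.com/" + os.path.basename(path)


results = localize_all([{"city": "Springfield"}], fake_make_video, fake_upload)
print(results)
```

Because the file is removed only after `upload` returns, a failed upload leaves the local video in place for another attempt.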

Thanks for reading! Happy coding. :)

TLDNR? Watch the video tutorial:

About:

Aflorithmic is a London/Barcelona-based technology company. Its api.audio platform enables fully automated, scalable audio production by using synthetic media, voice cloning, and audio mastering, to then deliver it on any device, such as websites, mobile apps, or smart speakers.

With this Audio-As-A-Service, anybody can create beautiful sounding audio, starting from a simple text to including music and complex audio engineering without any previous experience required.

The team consists of highly skilled specialists in machine learning, software development, voice synthesizing, AI research, audio engineering, and product development.

API.video is just what it says, a video API built by developers, for developers.


Their mission is to connect people through their cameras and deliver valuable insights from the video-1st World.

Distributing video inside traditional, online and mobile apps remains a challenge because of the complexity of managing heavy files and making them available on any screen, worldwide, in seconds. That’s why they've built api.video, the new standard for managing online video streaming.