Producing quality audio content is expensive and time-consuming. So what do you do when your content is constantly changing, as with the daily news? Have you ever listened to news that is a few hours old, let alone a few days? Probably not, because no one cares about “yesterday’s news”.
As more content moves from the screen to the ears, a number of publishers have started to embed content readers on their websites. The user can then choose to either read or listen to a news article. The advantages of using synthetically generated audio (text-to-speech) for this purpose are obvious: audio production is all of a sudden inexpensive, fully scalable, and has ultra-fast turnarounds. This would be impossible within a traditional audio production process. As an added benefit, publishers are also excited about reusing their content for channels that were completely inaccessible to them before, such as the smart speaker.
While this reads like fantastic news for all publishers and content creators, it does not necessarily make news sound fantastic to the listener. Using a text-to-speech voice - no matter how expressive that voice may be - is usually unengaging and hard on the ear when used in isolation. Just listen to this example:
Not bad, but something is missing. When have you ever listened to news on the radio that consisted of only a voice in isolation? No jingle at the beginning, no audio cue to keep the different sections or news items apart, no audible indicator that the “headlines” are over and it is time for the details. Further, depending on the pace of the audio format, some news producers even like to make it sound a bit more active by keeping a constant level of audio activity in the background.
So let’s consider this alternative. It has been produced with an artificial voice as well, although one that is slightly more fitting for the news item at hand. It has a sound design that keeps the listener engaged, and post-production that makes it sound professional. All of this is generated automatically.
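To make the missing ingredients concrete, here is a minimal sketch of how a voice-only script could be arranged into a radio-style running order: a jingle up front, a cue between items, a marker once the headlines are over, and a background bed underneath. It is plain Python for illustration only and does not call api.audio. The `Segment` class, the `build_newscast` function, and asset names such as `intro_jingle` and `news_bed` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    kind: str     # "jingle", "speech", or "cue"
    content: str  # text to be spoken, or a hypothetical sound asset name


def build_newscast(headlines: List[str], details: List[str],
                   background: str = "news_bed") -> dict:
    """Arrange headlines and details into a radio-style running order."""
    # Open with a jingle (hypothetical asset name).
    segments = [Segment("jingle", "intro_jingle")]

    # Headlines, with a short cue between consecutive items.
    for i, headline in enumerate(headlines):
        if i > 0:
            segments.append(Segment("cue", "item_separator"))
        segments.append(Segment("speech", headline))

    # Audible indicator that the headlines are over and details follow.
    if details:
        segments.append(Segment("cue", "headlines_over"))

    # Details, again separated by cues.
    for i, detail in enumerate(details):
        if i > 0:
            segments.append(Segment("cue", "item_separator"))
        segments.append(Segment("speech", detail))

    # The background bed is meant to run under the whole newscast.
    return {"background": background, "segments": segments}


plan = build_newscast(
    headlines=["Markets close higher.", "Rain expected tomorrow."],
    details=["Stocks rose for a third day.", "Showers will arrive by noon."],
)
for seg in plan["segments"]:
    print(f"{seg.kind:>6}: {seg.content}")
```

A plan like this only describes the order of elements; the voice, sound design, and mastering would still have to be rendered by an audio production service.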
Which one does your ear prefer?
With api.audio you can produce both versions, but we strongly recommend the latter!
Here is an easy-to-follow recipe that will let you replicate the example above in less than two minutes:
At Aflorithmic, we certainly love to see our customers save costs by automating and streamlining their audio production with api.audio. At the same time, we find it much more inspiring to think about all the new and exciting innovations that can be built with this technology, including audio formats that were not possible even two years ago. Think about a sports newscast that only highlights your favorite sports teams, leagues, and players. Or a publisher’s weekly local newscast that summarizes the local press of each town it serves, no matter how small. Fresh news formats that update every five minutes. Or how about a personal newscast that is exactly as long as your morning commute? What is your idea? Contact us and let’s build it together.
Aflorithmic is a London/Barcelona-based technology company. Its api.audio platform enables fully automated, scalable audio production by using synthetic media, voice cloning, and audio mastering, to then deliver it on any device, such as websites, mobile apps, or smart speakers.
With this Audio-as-a-Service, anybody can create beautiful-sounding audio, from simple text to music and complex audio engineering, with no previous experience required.
The team consists of highly skilled specialists in machine learning, software development, voice synthesis, AI research, audio engineering, and product development.