In this blog post we will create a simple Alexa Skill from scratch using api.audio content. The focus is to give a detailed overview of how to create Alexa skills and how to connect them to an external API (api.audio) to produce and retrieve audio content based on the user's request. Finally, we will explain how to test the skill using the Alexa Developer Console, the Alexa phone app, and your Echo devices. Special thanks to VozLab for collaborating with us on this; more about VozLab can be found at the end of this article.
First, we will review and install all the necessary tools. Then we will create audio content using api.audio, in this case a personalized newscast. Finally, we will connect the Alexa skill to the api.audio API, deploy it, and test it with an Echo device and the Alexa app.
Preparing the tools
First of all, let's check we have all the tools before we start:
1. api.audio account - Sign up at console.api.audio. You will need your api-key later on.
2. Alexa Developer Account - Create an account in the Alexa Developer Console.
3. Amazon Web Services (AWS) Account - Create an account in the AWS Console.
4. IAM user - Log in to your AWS Account and search for the IAM service. Create an IAM user with the necessary permissions and at least programmatic access. If you don't know how to do that, don't panic and follow this simple guide. Make sure you copy your AWS Access Key ID and your AWS Secret Access Key; you will need them for the next step.
5. AWS CLI v2 - Download the AWS Command Line Interface. Once it's installed, just do:
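Assuming the standard AWS CLI setup flow, the command to run here is:

```shell
# Configure the AWS CLI with the IAM credentials from the previous step
aws configure
# When prompted, paste your AWS Access Key ID and AWS Secret Access Key,
# then choose a default region (e.g. us-east-1) and output format (e.g. json)
```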
6. node.js & npm - Download them from here
7. Alexa Skills Kit (ASK) CLI - Install it following the official guide. A useful command-line tool to easily manage and debug your Alexa Skill.
8. VS Code - Install it from here
9. VS Code ASK extension - Extension to use the ASK SDK in VS Code. Install instructions here.
10. Apiaudio Alexa Streaming Skill GitHub Repo - Download it from here, or just:
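Assuming you clone it with git (the repository URL is on the GitHub page linked above):

```shell
# Clone the tutorial repository; replace <repo-url> with the GitHub URL linked above
git clone <repo-url>
```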
11. Amazon Alexa Phone App
Download the official Amazon Alexa app on your mobile phone, as it will be very useful for testing. Once it's installed, sign in using the Alexa developer account you created in step 2, and change the language of the Alexa app to English (US).
12. (Optional) Amazon Alexa Echo Device
If you have one of these - great! Just make sure in the Alexa phone app (see 11) that the app is linked with your new user: go to More/Settings/Device Settings, tap the + symbol, choose Add Device/Amazon Echo, and select your Echo device. That's it!
Tools ready? Let's start.
Creating Audio content with api.audio
First, let's create some audio content using api.audio. In this case, we will generate a personalized newscast for each user. Have a listen to Sam's personal newscast:
Sounds good, doesn't it? Everything was automatically generated in seconds by api.audio.
1. Go to console.api.audio and copy your api-key.
2. Create a new example.js file and copy the following code. You can also copy the code directly from GitHub. Make sure you paste your api-key in the second line of code. Of course, feel free to modify the text!
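A sketch of what example.js does (the real file is in the tutorial's GitHub repo). It calls the api.audio REST API directly with Node 18+'s built-in fetch; the endpoint paths and payload fields here are assumptions based on the api.audio API, so prefer the repo version if they differ:

```javascript
const API_KEY = "your-api-key-here"; // ← paste your api-key here (second line, as noted above)
const BASE = "https://v1.api.audio"; // assumed api.audio REST base url

// Small helper around fetch that adds the api-key header and parses JSON
async function call(method, path, body) {
  const res = await fetch(BASE + path, {
    method,
    headers: { "x-api-key": API_KEY, "Content-Type": "application/json" },
    body: body ? JSON.stringify(body) : undefined,
  });
  if (!res.ok) throw new Error(`${method} ${path} failed with status ${res.status}`);
  return res.json();
}

async function main() {
  // 1) Create the script; {{username}} is a personalisation placeholder
  const script = await call("POST", "/script", {
    scriptName: "personal_newscast",
    scriptText: "Hey {{username}}, here is your personal newscast for today...",
  });
  console.log("scriptId:", script.scriptId); // keep this for the Alexa skill

  // 2) Synthesise speech from the script with a chosen voice
  await call("POST", "/speech", { scriptId: script.scriptId, voice: "Joanna" });

  // 3) Master the speech and retrieve the final mp3 url
  await call("POST", "/mastering", { scriptId: script.scriptId });
  const mastered = await call("GET", `/mastering?scriptId=${script.scriptId}`);
  console.log("audio url:", mastered.url);
}

// Only runs once a real api-key has been pasted in above
if (API_KEY !== "your-api-key-here") {
  main().catch(console.error);
}
```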
3. Run the script:
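Assuming you saved the file as example.js, run it with node:

```shell
node example.js
```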
4. DONE! You already have your news produced! Just paste the URL into your browser and have a listen 😉 You should get an .mp3 file back!
5. Copy the scriptId generated, you'll need it for the next part of the tutorial.
Do you want to go a step further and grab the news copy from a News API to produce it on the fly based on user preferences? Stay tuned, we will do this in the next tutorial 😉
So we are done with the api.audio part. We already have our newscast, and now we want to create an Alexa Skill that can play personalized news for each user. We want something like this. In this video, we are producing the speech and mastering the audio on the fly. Quite impressive - right? 😎
Let's create our Alexa skill real quick:
1. Install the dependencies first (run this command in the /alexa folder):
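In the /alexa folder of the cloned repo:

```shell
# Install the skill's node dependencies
npm install
```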
2. Initialize ASK to configure our Alexa skill:
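The exact commands are not shown here; with ASK CLI v2 the setup is usually:

```shell
# Link your Amazon developer account and the AWS profile of your IAM user
ask configure
# Then initialize the skill from the project folder
ask init
```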
Now that our Alexa Skill is configured and all the code is already built for you, let's review the folder structure and the important parts of the code. For a detailed explanation, I recommend the official documentation here:
- skill-package folder - Here you find the skill.json file (also known as the Skill Manifest), which holds the general configuration of the skill: the locales (a.k.a. languages) supported (in our case only en-US, but feel free to add more), the invocation name per language, and the category of the Alexa Skill. This folder also contains the interactionModels folder, which holds the configuration of the interaction model for each language. This is a vital part of the skill. In the interactionModels/custom/en-US.json file you can set the invocationName of the skill (in our case "api audio maker"); so, to open your skill on your Alexa device, you'll say: "Alexa, open api audio maker". Feel free to play with this. Then you have the intents, which let you define the different "voice actions/queries" your skill understands. In this case, we only have one intent set up: PlaySoundIntent, which lets the Alexa Skill user trigger a Lambda function based on their query. You can also capture information from the user's response; in this example, we grab the name of the user.
In the lambda/index.js file, you'll find the Alexa Skill logic. The important bits are:
- LaunchRequestHandler (line 4) - Here you have the logic that handles the launch request from Alexa. This is the function triggered when you invoke the skill by saying: "Alexa, open api audio maker".
- PlaySoundIntentHandler (line 84) - Here you have the logic that handles our PlaySoundIntent created in the interactionModels/custom/en-US.json file.
In the lambda/util.js file, you'll find the code used by the index.js handlers to connect with api.audio and create/retrieve personalized audio. More on that in the following section.
Connecting your api.audio content with your Alexa Skill
Before deploying, let's copy the scriptId and the api-key into our util.js code; otherwise it won't work.
1. Go to lambda/util.js (lines 5-6) and paste the apiaudio api-key and the scriptId you created in the previous section.
Please note this is an example and we have simplified things. For a production version, we HIGHLY recommend using environment variables and never hardcoding api keys in your code and/or GitHub repositories.
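The real lambda/util.js is in the repo; conceptually, the api.audio call it makes looks something like the following sketch. The `audience` personalisation field and the endpoint paths are assumptions based on the api.audio API, not the repo code:

```javascript
// Conceptual sketch of the api.audio call in lambda/util.js (not the repo code).
const APIAUDIO_KEY = "your-api-key-here"; // line 5 of lambda/util.js
const SCRIPT_ID = "your-script-id-here";  // line 6, the scriptId from the previous section

// Produce a personalized audio version of the script for a given user name
// and return an mp3 url that the AudioPlayer directive can stream.
async function getPersonalizedAudioUrl(username) {
  const headers = { "x-api-key": APIAUDIO_KEY, "Content-Type": "application/json" };

  // Fill the {{username}} placeholder and synthesise speech for this listener
  await fetch("https://v1.api.audio/speech", {
    method: "POST",
    headers,
    body: JSON.stringify({ scriptId: SCRIPT_ID, voice: "Joanna", audience: [{ username }] }),
  });

  // Retrieve the mastered, ready-to-stream audio file
  const res = await fetch(`https://v1.api.audio/mastering?scriptId=${SCRIPT_ID}`, { headers });
  const mastered = await res.json();
  return mastered.url;
}
```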
Deploying your Alexa Skill
The code is ready; we are only missing one thing: deploying the code from our computer to the AWS servers. Let's do it:
1. Deploy your new Alexa Skill. Just do:
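With ASK CLI v2, deployment is a single command from the project folder:

```shell
# Deploy the skill package and the lambda code
ask deploy
```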
Testing your Alexa Skill
The Skill has been deployed, now it's time to test it.
You have several options for testing:
1. Alexa Developer Console - This is the easiest. Just go to the link provided and click on your Alexa Skill, then go to the Test tab and write "open api audio maker". The only problem with the Developer Console is that the AudioPlayer directives (used to play the audio coming from api.audio) don't work, so you cannot listen to the audio coming from the API. But it is a great way to check that your skill works.
2. Using the ASK CLI in your terminal. Run the following command to run the skill locally:
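```shell
# Start a local debugging session for the skill
ask run
```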
Now open a new terminal window, and run the following command:
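```shell
# Simulate a conversation with your skill from the terminal
ask dialog --locale en-US
```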
In the dialog prompt you can type "open api audio maker" and your skill should respond. Again, AudioPlayer directives are not supported in the terminal.
3. Using VS Code's Alexa Skill plugin. Similar to the first option, but it runs on your local machine, inside VS Code. It also does not support AudioPlayer directives.
4. Using the Alexa Phone App. This is the first testing mechanism that lets us play the audio and have an end-to-end testing experience. Just open the app, tap the Alexa icon so the app starts recording, and say: "open api audio maker"; after the welcome message, just say: "open <yourname>". This will automatically render your name and produce personalized news. If it does not work, please check the language settings, and remember they have to match the skill locales exactly (in this case, English (US)).
5. Using an Echo device. If your Echo device is linked to the Alexa app account and uses English (US) as its main language, you will be able to run your skill directly by talking to your Echo device: "Alexa, open api audio maker".
Are you happy with the results? Would you like to create your own skill using api.audio and certify your skill, so it can be put onto the official Alexa Skill Store?
Please let us know and we will provide you with guidance and a more in-depth tutorial!
Do you want to know more?
- Api.video + api.audio: Localize Advertisement Videos with Personalized Voice Overs
- Introducing: The World’s Largest Artificial Voice Library!
- Creating Personalized Audio Ads
- How To Create Engaging Voice Over For Your Video
Aflorithmic is a London/Barcelona-based technology company. Its api.audio platform enables fully automated, scalable audio production using synthetic media, voice cloning, and audio mastering, and delivers the result to any device, such as websites, mobile apps, or smart speakers.
With this Audio-as-a-Service, anybody can create beautiful-sounding audio, from a simple text to productions including music and complex audio engineering, without any previous experience required.
The team consists of highly skilled specialists in machine learning, software development, voice synthesizing, AI research, audio engineering, and product development.
VozLab is a technology company based in London, building the next generation of voice-first applications. The company's self-service platform allows voice experiences to be created, edited, and modified on live campaigns, without any coding experience required.
The platform enables these voice experiences to evolve with a campaign, which facilitates the conversation between brands and their customers. ‘We understand the best way to get revenue and results through smart speakers,’ concludes Maria Noel Reyes, CEO of VozLab ‘Through our self-serviced technologies, we have developed voice apps focused on longevity and flexibility that will challenge the traditional online mechanics and tactics.’