Text-to-speech can help us create more versatile, accessible content. We could purchase recording equipment and spend hours recording and editing each narration, but if we want most of the benefits for only a couple of minutes and a few pennies per post, we can use AWS Polly instead.
AWS Polly uses advanced deep learning technologies to synthesize speech that sounds like a human voice, which helps us create applications that talk. With AWS Polly, we can build entirely new categories of speech-enabled products, all without knowing much about machine learning or deep learning.
In this blog, we will discuss AWS Polly, its use cases, and how to use it. At the end of the blog, we will integrate AWS Polly with AWS Lightsail to create a blogging platform with text-to-speech features. With voice capabilities added, readers can also consume blog content via new channels such as inline audio players and podcast applications. Let’s explore!
In this blog, we will cover:
- Text-to-Speech Software and Challenges
- What is Amazon Polly?
- Benefits of using Amazon Polly
- Key Features of Amazon Polly
- Use cases of AWS Polly
- Content Creation
- E-learning
- Telephony
- Companies using AWS Polly
- Hands-on
- Conclusion
Text-to-Speech Software and Challenges
The concept of text-to-speech software is simple – you take a paragraph, a page, a blog, or even a whole book and have a computer read it aloud to you. When people think about text-to-speech, they often associate it with robotic voices. However, this usually isn’t the case anymore, particularly with modern software.
Text-to-Speech is a technology with very practical applications, such as:
- Enabling people with disabilities to read
- Providing a hands-free reading experience
- Covering situations where audio versions of content aren’t available
Text-to-Speech is a good fit for modern applications, but it comes with a few challenges. From a technical perspective, getting text-to-speech right is much more difficult than you might imagine! But you can make it easy and simple using AWS Polly. Yes, you read that right! Let’s explore more about AWS Polly in this blog!
What is Amazon Polly?
Amazon Polly is an AWS cloud service that enables you to turn text into speech in many languages, using many unique voices. The service has been around since 2016, and in 2018 Amazon launched a plugin to help WordPress users integrate it into their websites.
In addition to Standard TTS voices, Amazon Polly offers Neural Text-to-Speech (NTTS) voices that deliver advanced improvements in speech quality through a new machine learning approach.
Common use cases for Amazon Polly are mobile applications such as newsreaders, games, eLearning platforms, accessibility applications for visually impaired people, and the rapidly growing segment of the Internet of Things (IoT).
Benefits of using Amazon Polly
Key Features of Amazon Polly
Use cases of AWS Polly
Content Creation
Adding voice to content lets audio serve as a complementary medium to written and visual communication. Amazon Polly can generate speech in dozens of languages, making it easy to add speech to applications with a global audience, such as RSS feeds, websites, or videos.
Example: Convert a blog to speech
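As a rough illustration of this use case, here is a minimal Python sketch that turns the text of a post into an MP3 file with the boto3 SDK. The function names, the default voice ‘Joanna’, the default region, and the 3000-character chunk size are assumptions made for this sketch; check the current Polly limits and voices before relying on them.

```python
import re

# Assumption: SynthesizeSpeech accepts a limited amount of text per request
# (3000 billed characters at the time of writing), so long posts are split.
MAX_CHARS = 3000


def split_into_chunks(text, max_chars=MAX_CHARS):
    """Split text on sentence boundaries into chunks of at most max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_post(text, output_path, voice_id="Joanna", region_name="us-east-1"):
    """Write the post as MP3 audio to output_path; returns the chunk count."""
    import boto3  # imported lazily so split_into_chunks works without the SDK

    polly = boto3.client("polly", region_name=region_name)
    chunks = split_into_chunks(text)
    with open(output_path, "wb") as audio_file:
        for chunk in chunks:
            response = polly.synthesize_speech(
                Text=chunk, OutputFormat="mp3", VoiceId=voice_id
            )
            # AudioStream is a streaming body; append its bytes to the file.
            audio_file.write(response["AudioStream"].read())
    return len(chunks)
```

Calling synthesize_post(post_text, "post.mp3") requires AWS credentials with Polly access. Appending MP3 segments back to back usually plays fine in common players, but treat that as a convenience of this sketch rather than a guarantee.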
E-learning
Amazon Polly enables visual experiences such as speech-synchronized facial animation or karaoke-style word highlighting. Amazon Polly makes it easy to request an additional stream of metadata with information about when particular sentences, words, and sounds are being pronounced. Using this metadata stream alongside the synthesized speech audio stream, customers can animate avatars and highlight text as it is being spoken in their app.
Example: Play audio and highlight spoken text
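The metadata stream mentioned above is what Polly calls speech marks: when synthesize_speech is called with OutputFormat="json" and SpeechMarkTypes=["word"], the response body contains one JSON object per line. The sketch below parses such output and looks up the word being spoken at a given playback position. The helper names and the sample timings are made up for illustration.

```python
import json


def parse_speech_marks(raw):
    """Parse speech marks (one JSON object per line) into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_at(marks, elapsed_ms):
    """Return the word mark being spoken at elapsed_ms, or None before the first word."""
    current = None
    for mark in marks:
        if mark["type"] != "word":
            continue
        if mark["time"] > elapsed_ms:
            break
        current = mark
    return current


# Sample in the shape Polly returns for word speech marks. "time" is the
# offset in milliseconds from the start of the audio; "start" and "end" are
# offsets into the input text (byte offsets, per the Polly documentation).
SAMPLE = """\
{"time": 6, "type": "word", "start": 0, "end": 4, "value": "Mary"}
{"time": 373, "type": "word", "start": 5, "end": 8, "value": "had"}
{"time": 604, "type": "word", "start": 9, "end": 10, "value": "a"}
"""
```

A player could call word_at(marks, position_ms) on every time update and use the returned start and end offsets to highlight the matching span of text.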
Telephony
Business call centers can engage customers with natural-sounding voices using Amazon Polly. You can cache and replay Amazon Polly’s speech output to prompt callers through interactive voice response (IVR) systems, such as Amazon Connect.
Example: Text-to-speech for telephony systems
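The cache-and-replay idea can be sketched as a small in-memory cache keyed by voice and prompt text. This is only an illustration: the PromptCache class is hypothetical, and the synthesize callable stands in for whatever function wraps the Polly call in a real IVR integration.

```python
import hashlib


class PromptCache:
    """Cache synthesized prompts so each distinct IVR prompt is generated once."""

    def __init__(self, synthesize):
        # synthesize is any callable (text, voice_id) -> bytes. In a real
        # system it would wrap a Polly synthesize_speech call.
        self._synthesize = synthesize
        self._store = {}

    @staticmethod
    def key_for(text, voice_id):
        """Build a stable cache key from the voice and the prompt text."""
        return hashlib.sha256(f"{voice_id}\n{text}".encode("utf-8")).hexdigest()

    def get(self, text, voice_id="Joanna"):
        """Return cached audio for the prompt, synthesizing it on first use."""
        key = self.key_for(text, voice_id)
        if key not in self._store:
            self._store[key] = self._synthesize(text, voice_id)
        return self._store[key]
```

Fixed prompts such as menu options are synthesized once and replayed from the cache afterwards, so only the first caller pays for the synthesis request. A production setup would more likely keep the audio in durable storage than in process memory.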
Companies using AWS Polly
Hands-on
Let’s take Workfall as an example. Workfall is building a new blog platform and wants to add a voice feature that converts complete blogs into audio, giving readers the flexibility of listening to a blog rather than reading it. Sounds interesting, right? The design calls for a solution that can spin up a blog site immediately with the least configuration and minimal cost. Based on traffic estimates and configuration effort, the Workfall tech team has decided to use AWS Lightsail to spin up the blog site and AWS Polly to convert the text of each blog to speech. Let’s go hands-on to understand how to create a blog website with audio features using AWS Lightsail and AWS Polly.
Following is the architecture diagram for this configuration:
To implement this, we will do the following:
- Sign in to your AWS console and navigate to the IAM dashboard.
- Create a new user with Programmatic access and attach the ‘AWSForWordPressPluginPolicy’ policy.
- Navigate to the Amazon Lightsail dashboard.
- Create a WordPress instance using the options provided in Amazon Lightsail.
- Connect to your newly created instance via SSH and get the password to access the dashboard of your WordPress site.
- Sign in to the admin dashboard for your WordPress site to customize the look and feel.
- Install the AWS plugin for WordPress.
- Enable the Text-To-Speech support for your posts.
- Create a new post, select a voice based on your preference, and publish the post.
- Navigate to your post on the site and test it out by clicking the play button.
Log in to your AWS console and navigate to the Amazon IAM Dashboard. Click on ‘Add user’.
Enter a ‘user name’ and select the Access type as ‘Programmatic access’. Once done, click on ‘Next: Permissions’.
In the search box, enter ‘AWSForWordPressPluginPolicy’, check the checkbox for that policy and click on ‘Next: Tags’.
Enter tags (if any) for the new user you are creating and click on ‘Next: Review’.
Review the details you filled in and the policy you attached, then click on ‘Create user’.
Once the user has been created, click on ‘Download .csv’ to store a copy of the ‘Access key ID’ and ‘Secret access key’ of the created user.
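For readers who prefer scripting over the console, the same IAM steps can be approximated with boto3. This is a sketch under assumptions: the function name create_plugin_user is hypothetical, and the policy ARN is assumed to follow the usual pattern for AWS managed policies, so verify it in the IAM console before use.

```python
# Assumption: the managed policy lives at the default path for AWS managed
# policies. Confirm the exact ARN in the IAM console.
PLUGIN_POLICY_ARN = "arn:aws:iam::aws:policy/AWSForWordPressPluginPolicy"


def create_plugin_user(user_name, iam=None):
    """Create an IAM user for the plugin and return its access key pair."""
    if iam is None:
        import boto3  # imported lazily so the sketch can be tested with a stub

        iam = boto3.client("iam")
    iam.create_user(UserName=user_name)
    iam.attach_user_policy(UserName=user_name, PolicyArn=PLUGIN_POLICY_ARN)
    access_key = iam.create_access_key(UserName=user_name)["AccessKey"]
    # The secret access key is only returned at creation time, so store it
    # securely right away (the console's 'Download .csv' serves the same purpose).
    return {
        "access_key_id": access_key["AccessKeyId"],
        "secret_access_key": access_key["SecretAccessKey"],
    }
```

Calling create_plugin_user("wordpress-polly") with administrator credentials configured would return the two values that are later entered in the plugin settings. The user name here is only a placeholder.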
Now, navigate to the Lightsail service to create a WordPress instance. On the console, click on ‘Create instance’.
Select an ‘AWS Region’ and an ‘Availability Zone’ for your instance based on your preferences.
Under the ‘Pick your instance image’ section, select the platform as ‘Linux/Unix’ and for the ‘Select a blueprint’ subsection, select ‘WordPress’.
Scroll down and under the ‘Choose your instance plan’ section, select the instance plan based on your needs. We will use the ‘USD 3.5’ instance plan, which includes free usage for the first month.
Now, scroll down and enter a name for your WordPress instance under the ‘Identify your instance’ section, select a count for the number of instances and add tags (if any) for your instance.
Finally, click on ‘Create instance’. Amazon Lightsail will begin spinning up the new server, and the status will be displayed as ‘Pending’.
After a few minutes, once the server has launched, the status for your instance changes to ‘Running’.
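The instance creation above can also be expressed with the boto3 Lightsail client. The sketch below is illustrative only: the helper names are hypothetical, and the default availability zone, blueprint ID, and bundle ID are placeholder assumptions. The get_blueprints and get_bundles calls list the IDs that are valid for your account and Region.

```python
def build_instance_request(name, availability_zone="us-east-1a",
                           blueprint_id="wordpress", bundle_id="nano_2_0"):
    """Build the keyword arguments for the Lightsail create_instances call."""
    # The defaults are assumptions; look up real values with get_blueprints()
    # and get_bundles() before using them.
    return {
        "instanceNames": [name],
        "availabilityZone": availability_zone,
        "blueprintId": blueprint_id,
        "bundleId": bundle_id,
    }


def create_wordpress_instance(name, lightsail=None, **overrides):
    """Request a WordPress instance and return its state name (e.g. 'pending')."""
    if lightsail is None:
        import boto3  # imported lazily so the sketch can be tested with a stub

        lightsail = boto3.client("lightsail")
    lightsail.create_instances(**build_instance_request(name, **overrides))
    instance = lightsail.get_instance(instanceName=name)["instance"]
    return instance["state"]["name"]
```

Right after creation the returned state is typically ‘pending’. Calling get_instance again after a few minutes should report ‘running’, matching what the console shows.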
Click on the created instance and you will see the actions to ‘Stop’ and ‘Reboot’ your WordPress instance. Along with that, you will see the status as ‘Running’ and, below that, the Public and Private IP addresses. The Public IP address is the one you will use to access your WordPress site.
Copy the Public IP address, paste it in a new tab in your browser, and hit enter to access your newly created WordPress site. On accessing the Public IP in your browser, you will view your blog page as shown below.
To customize your WordPress site, you will need a username and password to log in to your WordPress admin console. Navigate back to your instance details page on the Amazon Lightsail console.
Click on ‘Connect using SSH’. This opens a secure SSH connection to the server console in a new browser window.
Now, run the command “cat bitnami_application_password” at the server command prompt. The password for your WordPress admin console is stored in a file named bitnami_application_password in the home directory, and this command prints it.
Now, navigate back to the WordPress site that you opened in a new browser tab using the public IP and click on ‘Manage’ at the bottom right of the page.
You will be navigated to an instructions page to access the WordPress admin console.
Note down the username displayed under the Access data for WordPress section and then click on ‘Login’.
Enter the username you noted down and the password you fetched from the server command prompt.
Click on ‘Log In’ and you will be directed to your WordPress Admin Console.
Click on ‘Customize Your Site’ to customize the look and feel of your site. Once done customizing the site based on your preferences, click on ‘Publish’.
After customization, our WordPress site looks as shown below.
Now, navigate back to the admin console and select ‘Plugins’ in the left navigation pane to install the AWS plugin for WordPress.
In the search bar, enter ‘AWS for WordPress’ and then click on the ‘Activate’ button for that plugin.
On success, you will see a confirmation message as shown below.
Now, if you click on the ‘Active’ tab, you will find all the activated plugins, including the ‘AWS for WordPress’ plugin we just activated.
Select ‘AWS’ from the left navigation pane to link the user you created above with the AWS plugin. Enter the ‘AWS access key’ and ‘AWS secret key’ from the CSV file you downloaded above and select the ‘AWS Region’ based on your preference. Once done, click on ‘Save Changes’ to link the user.
On success, you will see a confirmation message as shown below.
Now, in the left navigation pane, under AWS select Text-To-Speech.
Select the source language based on your preference and check the box to enable text-to-speech support. Click on ‘Save Changes’ once done.
On success, you will be navigated to the ‘Text To Speech – Amazon Polly’ configuration dashboard. There are different settings available so you can change them based on your preferences.
Once done, scroll down to the bottom of the page and click on ‘Save Changes’.
Now, navigate to the ‘Posts’ dashboard to test out the feature. Click on ‘Posts’ in the left navigation pane. On the posts dashboard, click on ‘Add New’ to add a new blog.
Enter a title and the text for your blog. We have added one of our blogs as shown below.
If you scroll down to the bottom of the page, you will find the settings for enabling or disabling Amazon Polly to provide text-to-speech support for your post.
Make sure that you enable it, and select the voice name of your choice. Once done, click on ‘Publish’ in the top right corner to publish the post on your WordPress site.
On success, you will see a message as shown in the bottom left in the above image. You can navigate to your published post by clicking on ‘View Post’. Once you click on it, it will take you to your post on your WordPress site. As you can see in the image below, you will find the entire blog’s text converted to speech by Amazon Polly. You can click on Play to let Polly render the audio and read the entire blog for you.
Conclusion
In this blog, we gained insights into Amazon Polly. We also went through a business scenario in which a company needed a text-to-speech feature for each post on its blog website, and we saw how to configure Amazon Polly and integrate it with Amazon Lightsail to implement that feature for a WordPress site. We will discuss more of Amazon Polly and its other configurations in our upcoming blog. Stay tuned for updates about our upcoming blogs on AWS and relevant technologies.
Meanwhile …
Keep Exploring -> Keep Learning -> Keep Mastering
This blog is part of our effort towards building a knowledgeable and kick-ass tech community. At Workfall, we strive to provide the best tech and pay opportunities to AWS-certified talents. If you’re looking to work with global clients, build kick-ass products while making big bucks doing so, give it a shot at workfall.com/partner today.