-
Notifications
You must be signed in to change notification settings - Fork 20
Text-to-speech adapter for Amazon Polly #824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
a7cdcd9 to
e25cdbf
Compare
e25cdbf to
5fd6727
Compare
|
@NPavie I'm finally having a look at your PR. I have rebased the branch onto master and did some minor tweaks to the code formatting, and commit messages. I also removed some commented out code. These all seemed to stem from the original code (which is still available on the polly branch). Incidentally, I have also rebased the original branch to polly-rebased, and am checking whether there could be some useful stuff that isn't in this PR. First thing I notice (apart from all the changes in that branch not related to the AWS adapter) is that there were a lot of tests. I haven't looked at them yet. Were they not useful you think? I will have some more remarks on specific pieces of code. |
...apters/tts-adapter-aws/src/main/java/org/daisy/pipeline/tts/aws/impl/AWSPollyTTSService.java
Show resolved
Hide resolved
...dapters/tts-adapter-aws/src/main/java/org/daisy/pipeline/tts/aws/impl/AWSPollyTTSEngine.java
Show resolved
Hide resolved
| awsVoicesDescription.get(engine).voices() | ||
| .stream() | ||
| .map(i -> new Voice("aws", i.idAsString() + " (" + engine.toString() + ")", | ||
| new Locale(i.languageName()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other places locales are parsed like this:
(new Locale.Builder()).setLanguageTag(info.getLocale().replace("_", "-")).build()In the original code there was some language normalisation going on.
arbwas replaced byarcmn-CNwas replaced byzh_CN
(Originally this transformation was done in VoiceInfo.tagToLocale)(, not in TTSEngine.getAvailableVoices(), but the result should be the same.)
I wonder whether this normalisation is still needed. Note that we have the org.daisy.pipeline.common.NormalizeLang class (in common-utils) which could in theory be used for it.
The original code was provided by ONCE. The adapter was then updated for the latest version of Pipeline and refactored to match the structure of other adapters (notably the Azure and Google ones). The adapter was also updated to allow for the use of generative, long-form, or neural voices if available for the user's AWS account. The main changes compared to original code from ONCE are: - Updated the service and engine classes to mimic Azure adapter. - Added the following configuration properties for connecting to AWS: - org.daisy.pipeline.tts.aws.accesskey - org.daisy.pipeline.tts.aws.secretkey - org.daisy.pipeline.tts.aws.region - Omit namespace prefixes from SSML as AWS seems to refuse it. - Cleaned up tests files
- Change name of engine from "polly" to "aws" for mapping in UI project - Prioritize standard voices in the voices collection. This is to ensure a standard voice is used for the sanity check of the engine. - Add the engine (standard, neural, generative, long form) in the voice name to help identify which engine would be use (as it can greatly impact costs for the user). - Remove IDs in SSML as AWS rejects SSML with IDs.
to align with Azure and Google adapters.
5fd6727 to
980dc51
Compare
|
Merged in 6dbdb2f. |
|
I was able to create an account and run the tests succesfully. It was not clear from the README where I can find the region identifier, so I just took the same as you (eu-west-3). One thing I noticed was that there are debug messages on stdout. This is because of the logback.xml file in src/test/resources. @NPavie Did you do this on purpose? |
|
Hi @bertfrees For the region identifier, if you open the "IAM Identity center" service of your account (not the "IAM" service, those are 2 different services) and open the dashboard of the service, you should see the region identifier in the parameters summary on the right. For the logback report, I might have left it for debug. You probably can remove it if it is too verbose. |
|
I'll check if I can find the parameter that way, and update the README. I was also thinking the README would benefit from a short list of clear instructions specifically for creating a new access key for Polly. TBH I didn't follow the linked instructions because I didn't feel like reading that much, and also those instructions seemed very general and broad. I didn't need to know all the possibilities, I just wanted to create an access key quickly and specifically for Polly. So I tried to find in the user interface how to "create a new IAM user" and "access key", and the UI was intuitive enough (except to find the region). |
This PR integrates the text-to-speech adapter for Amazon Polly engines into the DAISY Pipeline 2 modules tree.
Original code was authored by @mmartida and provided by ONCE to the DAISY Pipeline team.
The adapter was then updated for the latest version of Daisy Pipeline 2 and refactored to match similar adapters structure (mainly azure and google TTS adapters).
The adapter has also been updated to allow the use of generative, long-form, or neural voices if available for the user's AWS account.
The following properties are used by the adapter :
org.daisy.pipeline.tts.aws.accesskey(mandatory) : Access key to connect to Amazon Web Service.org.daisy.pipeline.tts.aws.secretkey(mandatory) : Secret key to connect to Amazon Web Service.org.daisy.pipeline.tts.aws.region(mandatory) : Region associated with your Amazon Web Service account.org.daisy.pipeline.tts.aws.priority(defaults to15) : priority of usage within the pipeline if other text engine are availableThe adapter relies on AWS java Polly client library. All dependencies required for its usage are added to the
modules/bomproject.In case it would be preferable to release the adapter on maven before including it's first release version, this PR :
modules/bom(@bertfrees I'll leave you the squashing and rephrase for the pipeline history depending on how you want it merged in)