How to Use Text-to-Speech Speech Synthesis on GCP

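【Overview】Text-to-Speech, powered by Google's AI technology, converts text into natural, lifelike speech through an API. You send the text to Text-to-Speech in an API call, and the response contains synthesized human speech in a playable audio format.
Suppose you have a voice assistant app that gives your users natural-language feedback as playable audio files. The app performs some action and then reports the result to the user in a human voice.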
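【Advantages】

Improve customer interactions with natural, lifelike intelligent responses;
Let users interact with the voice interfaces in your devices and applications;
Personalize communication according to each user's preferred voice and language;
Supports voices in many languages, see https://cloud.google.com/text-to-speech/docs/voices;
Lets you configure speaking rate, pitch, volume, and sample rate (in hertz).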
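
The tuning options in the last item correspond to fields of the request's audioConfig object. The following is a minimal Python sketch, assuming the v1 REST field names speakingRate, pitch, volumeGainDb, and sampleRateHertz. The helper name build_audio_config and the sample values are illustrative only and do not come from the original article.

```python
import json


def build_audio_config(encoding="MP3", speaking_rate=None, pitch=None,
                       volume_gain_db=None, sample_rate_hertz=None):
    """Return an audioConfig dict that includes only the options that were set."""
    config = {"audioEncoding": encoding}
    # Assumed v1 REST field names; unset options are left out so the
    # service falls back to its own defaults.
    optional = {
        "speakingRate": speaking_rate,
        "pitch": pitch,
        "volumeGainDb": volume_gain_db,
        "sampleRateHertz": sample_rate_hertz,
    }
    config.update({k: v for k, v in optional.items() if v is not None})
    return config


# Illustrative values: slightly faster speech, slightly lower pitch.
print(json.dumps(build_audio_config(speaking_rate=1.25, pitch=-2.0), indent=2))
```

The returned dict can take the place of the audioConfig section of a synthesis request body.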

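【Hands-on】

Create a new service account

If the project does not yet have a service account, create one. A service account is required to use Text-to-Speech. Go to "Create service account" and, in the Service account name box, enter a unique name for the new service account.
Assign the service account a basic IAM role: click the Select a role drop-down list and scroll down to Basic. Choose a role for this service account from the options shown in the right-hand column, then click Continue.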

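Create a JSON key for the service account

You can reach your service accounts at any time through IAM & Admin -> Service Accounts in the main navigation menu, to generate keys and/or change account details.
To create a key, click the service account and select Keys. Click Add Key -> Create new key, and create the key in JSON format.

The new key is downloaded automatically in the format you selected. Store this file in a secure location and note its path. At the start of each new Text-to-Speech session, the GOOGLE_APPLICATION_CREDENTIALS environment variable must point to this file as part of authentication. This is an essential step in authenticating requests sent to Text-to-Speech. The key's unique ID is shown next to the service account name.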

Set the authentication environment variable

export GOOGLE_APPLICATION_CREDENTIALS="KEY_PATH"
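
Before sending any requests, it can help to confirm that the variable really points at a service account key. The following is a minimal Python sketch, assuming the downloaded key is a JSON file whose "type" field is "service_account". The function name load_service_account_key is hypothetical.

```python
import json
import os


def load_service_account_key(env=os.environ):
    """Read the key file named by GOOGLE_APPLICATION_CREDENTIALS and sanity-check it."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    # Assumption: service account key files identify themselves via "type".
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    return key
```

Calling load_service_account_key()["client_email"] then shows which service account the session will authenticate as.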

Synthesize audio from text

Create request.json

{
  "input":{
    "text":" Google Cloud Text-to-Speech enables developers to synthesize natural-sounding speech with 100+ voices, available in multiple languages and variants. It applies DeepMind’s groundbreaking research in WaveNet and Google’s powerful neural networks to deliver the highest fidelity possible. As an easy-to-use API, you can create lifelike interactions with your users, across many applications and devices. "
  },
  "voice":{
    "languageCode":"en-gb",
    "name":"en-GB-Standard-A",
    "ssmlGender":"FEMALE"
  },
  "audioConfig":{
    "audioEncoding":"MP3"
  }
}

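Specify the text to synthesize in the text field of the input section, and the type of audio to create in the audioConfig section.
Specify the type of voice to synthesize in the voice section of the POST request body.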
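
The same request body can also be generated programmatically, which avoids hand-editing JSON when the text changes. The following is a minimal Python sketch that mirrors the three sections of request.json. The helper names build_request and write_request are hypothetical, and the defaults simply repeat the voice settings used above.

```python
import json


def build_request(text, language_code="en-gb", voice_name="en-GB-Standard-A",
                  gender="FEMALE", encoding="MP3"):
    """Assemble the input, voice, and audioConfig sections of a synthesis request."""
    return {
        "input": {"text": text},
        "voice": {
            "languageCode": language_code,
            "name": voice_name,
            "ssmlGender": gender,
        },
        "audioConfig": {"audioEncoding": encoding},
    }


def write_request(path, text):
    """Write the request body to a file that curl can send with -d @path."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII text readable in the file.
        json.dump(build_request(text), f, ensure_ascii=False, indent=2)
```

For example, write_request("request.json", "Hello from Text-to-Speech.") produces a file with the same structure as the one shown above.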

Run the text-to-speech conversion command

Command:

curl -s -X POST -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" -H "Content-Type: application/json; charset=utf-8" -d @request.json "https://texttospeech.googleapis.com/v1/text:synthesize" | jq -r .audioContent > synthesize-output-base64.txt

Note: the gcloud auth application-default print-access-token command retrieves the authorization token for the request.
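
For readers who prefer to call the endpoint from code, the curl invocation can be mirrored with Python's standard library. This is only a sketch: make_synthesize_request is a hypothetical helper, and the access token is assumed to have been obtained separately, for example with the gcloud command in the note above.

```python
import json
import urllib.request

ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"


def make_synthesize_request(body, access_token):
    """Build (but do not send) a POST request equivalent to the curl command."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
```

Passing the returned object to urllib.request.urlopen() sends the request. The response body is JSON, and its audioContent field holds the base64-encoded audio.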

Decode the contents of the synthesize-output-base64.txt file into a new file named synthesized-audio.mp3

Command:

base64 synthesize-output-base64.txt -d > synthesized-audio.mp3
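
The decoding step can also be done in Python when the full JSON response is at hand, rather than the extracted base64 text. This is a minimal sketch: the function names decode_audio and save_audio are hypothetical, and it assumes that a failed request returns an "error" object in place of audioContent.

```python
import base64
import json


def decode_audio(response_text):
    """Extract and decode the base64 audioContent field of a synthesize response."""
    response = json.loads(response_text)
    if "audioContent" not in response:
        # Assumption: failed requests return an "error" object instead.
        raise ValueError(f"no audioContent in response: {response.get('error')}")
    return base64.b64decode(response["audioContent"])


def save_audio(response_text, path="synthesized-audio.mp3"):
    """Write the decoded audio bytes to a playable file."""
    with open(path, "wb") as f:
        f.write(decode_audio(response_text))
```

Checking for the missing field first turns an authentication or quota failure into a readable error, where the shell pipeline would quietly write an unplayable file.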
