Today, we’re announcing a breakthrough in generative AI for speech. We’ve developed Voicebox, a state-of-the-art AI model that uses in-context learning to perform speech generation tasks it wasn’t specifically trained to do, such as editing, sampling and stylizing.
Voicebox can produce high-quality audio clips and edit pre-recorded audio, for example by removing car horns or a dog barking, all while preserving the content and style of the audio. The model is also multilingual and can produce speech in six languages.
In the future, multipurpose generative AI models like Voicebox could give natural-sounding voices to virtual assistants and non-player characters in the metaverse. They could allow visually impaired people to hear written messages from friends read by AI in their voices, give creators new tools to easily create and edit audio tracks for videos, and much more.
The versatility of Voicebox enables a variety of tasks, including:
In-context text-to-speech synthesis: Using an audio sample as short as two seconds long, Voicebox can match the audio style and use it for text-to-speech generation.
Speech editing and noise reduction: Voicebox can recreate a portion of speech that’s interrupted by noise or replace misspoken words without having to re-record an entire speech. For example, you can identify a segment of a speech that’s interrupted by a dog barking, crop it, and instruct Voicebox to re-generate that segment – like an eraser for audio editing.
Cross-lingual style transfer: When given a sample of someone’s speech and a passage of text in English, French, German, Spanish, Polish or Portuguese, Voicebox can produce a reading of the text in any of those languages, even when the sample speech and the text are in different languages. This capability could be used in the future to help people communicate in a natural, authentic way even if they don’t speak the same languages.
Diverse speech sampling: Having learned from diverse data, Voicebox can generate speech that is more representative of how people talk in the real world and in the six languages listed above.
Voicebox is an important step forward in our generative AI research, and we look forward to continuing our exploration in the audio space and seeing how other researchers build on our work.
Learn more about Voicebox.
The post Introducing Voicebox: The Most Versatile AI for Speech Generation appeared first on Meta.
source https://about.fb.com/news/2023/06/introducing-voicebox-ai-for-speech-generation/