Audiobox, Meta's AI for cloning voices: how does it work?

Meta has taken another step forward in artificial intelligence with a program that can clone your voice in seconds. Audiobox is the tool Meta created for this, and its results are quite realistic. Below we look at how Audiobox, Meta's voice-cloning AI, works.

According to its developers, this program is the result of several years of research and many hours of work. It can now be used in Spain, with functions for voice generation, editing, stylization and sampling. It can seem a little unsettling, since it has stirred controversy, for example over the use of dubbing actors' voices.

How does Audiobox work to clone voices?

Its operation turns out to be simpler than it seems. Meta has an explanation or tutorial on its website that covers everything you need to start using the tool.

Cloning the voice

The tutorial explains how to create voices from scratch. You can do this by recording your own voice, or another person's voice, while reading a text that appears on the screen. It also offers the option of using other voices included in the program.

Although this program seems incredible, in reality it is not well suited to dubbing, since the dialogue cannot be synchronized with the actors' lips, even when it is spoken in another language. Nor can it convey emotional expression or natural intonation.

It has also been noted that, although the results are impressive, the voices still tend to have a noticeably robotic tone. The technology is getting closer to its goal, but there is no need to sound the alarm yet, since it still cannot be used to convincingly impersonate another person's voice.

Some functions that Audiobox offers

Among the functions Audiobox offers, these may be of interest:

  • You can create sounds such as the sea, river water, birds singing or a storm. You only need to describe the sound you want to the AI.
  • Place a given audio in a different setting or tone. For example, the AI can be asked to render a voice with an echo.
  • You can slow down a voice, speed it up, or make it sound very loud.

How can we use it?

You can access a free trial of this program. You can use your own voice to generate a recording, or another voice that interests you. After that, you simply download the generated voice and use it wherever you need it.

Among the other uses it suggests:

  • Generate a voice from a short audio sample.
  • Create a voice with a personalized intonation and style, although the result will not sound very natural.
  • Mix voices to design a new one.
  • Create sound effects, some of them specific and very recognizable, like those described above.
  • Replace parts of an audio track with new sounds.
  • Remove noise from an audio track.

Meta warns against misuse of its Audiobox program

The company is offering the program for limited use and says it is being released transparently, for research purposes. In Spain it can be used, although many parts of the program are in English. In the United States, legal restrictions prevent access to the program. This should not be a lasting problem, since it will gradually be released as open source.

In Spain it can be used, but Meta warns that Audiobox must not be used for harmful purposes. All generated audio carries a watermark, so that it can be traced back to its origin. This mark is imperceptible to the human ear, but it gives the developers an easy way to trace audio: they have the tools to find segments generated by artificial intelligence.

The watermark the program generates does not prevent its use in podcasts or other services that require audio, since it is almost inaudible. It is, however, a security measure designed so that, in the event of misuse, the person responsible can be quickly identified.

The aim of this program is to make it accessible to the people who need it. It tries to clone a voice in a simple and natural way, but not against the will of the person concerned. As can be seen, the text to be read changes, with rises and falls in intonation and a pace that is sometimes fast and irregular, so it is tricky to add someone else's voice using pre-recorded audio.
