David Alexandre, Partner and Alejandro González Vega, Senior Associate, DLA Piper Luxembourg (Photo credit: DLA Piper)
The proliferation and growing sophistication of AI techniques make it increasingly difficult to tell reality from machine-generated content. The music industry isn't immune to this technological shift and must be ready to harness its potential and navigate its legal challenges.

In the past few months, social media and video-sharing platforms have been inundated with audio clips of the late Queen Elizabeth reciting Sex Pistols lyrics or former US President George Bush performing 50 Cent's "In Da Club."

It quickly becomes apparent that these videos are fake. But what about a new release from your favourite rapper? Or an unexpected duet featuring a retired star and the latest teenage idol? Listening to them for the first time, you probably won't be able to tell whether they're real or an audio deepfake.

What are audio deepfakes?

An audio deepfake is an AI-generated synthetic voice that mimics real human speech. Created using advanced algorithms and machine learning, these voice cloning techniques can imitate the voice of virtually anyone and make it say, recite or sing any words or lyrics, either completely at random or by carefully emulating the speaker's style and vocabulary.

Audio deepfakes (or voice cloning) can be used in a myriad of contexts, from innocuous entertainment to more nefarious purposes, such as identity theft or fraud.

The use of deepfakes in the music industry

The music industry is no stranger to integrating technology into song production. Since the first mainstream use of Auto-Tune in Cher's all-time hit "Believe," synthesised vocal recordings have become ever more popular.

Thanks to their versatility, deepfake technologies have the potential to revolutionise the music industry. They could be a useful tool to help a performer reach higher notes or to correct speech problems or minor mistakes.

They can also be used to dub famous artists' songs into virtually any language without the artist having to learn it, enabling fast-paced global expansion. Or they can help edit certain words without the artist having to record the track again, creating a more streamlined localisation process for certain markets.

Voice cloning can also be used to recreate the voices of iconic vocalists or deceased musicians, allowing their music to live on in new compositions and performances, preserving their musical legacies.

The legal challenges of using voice cloning

Besides the ethical concerns they may raise, such as the potential detriment to the performer's individuality and artistic flair, audio deepfakes don't come risk-free from a legal standpoint.

From the performers' perspective, the use of their voice to create a synthetic song without their authorisation is likely to infringe their exclusive rights.

First, if the artist is also the author of the lyrics or the music, generating a song based on those words or tunes may lead to copyright infringement. Additionally, any artist also holds exclusive performer's rights. These include the right to control the fixation, reproduction, distribution and communication to the public of their performances.

The simple act of mining hundreds or thousands of samples of audio tracks from different sources to train the algorithm that will generate the vocal clone could in itself infringe the exclusive reproduction rights that the artist may hold over the performance or the musical work.

Copyright over the musical work is usually held by music publishers, and the performer's exclusive rights will generally be in the hands of the record label. So, when creating a deepfake, it's important to identify the chain of title to prevent legal liability.

Performers, both as performers and, where applicable, as authors, also have moral rights. These rights generally can't be assigned by contract in the EU – with the notable exception of Luxembourg. They include the right to be identified as the performer (or author) of their performances (or works) and the right to oppose any modification of their recordings (or musical works) that may harm their reputation.

The voice is a unique characteristic of each individual. It's probably one of the most distinctive features of our species. As such, it deserves a special level of legal protection.

Even in the absence of harmonisation at EU level, the voice is generally protected as part of the right to one's image, either as a self-standing right, as in Spain, or as an expression of the right to a private life, as is the case in Luxembourg or France. Be that as it may, self-image, and by extension an individual's voice, should be considered protected as a fundamental right. This would give the holder the right to oppose its use and even, in some jurisdictions, to withdraw any licence of use granted, upon compensation to the licensee.

The voice is also biometric data protected under the GDPR at EU level. Processing an artist's voice to create an artificial reproduction of it, even for a seemingly innocent purpose, could meet resistance from the performer. Using biometric data requires the free, specific, informed and unambiguous consent of the data subject (ie the performer). This includes the need for the deepfake creator to inform the performer that their voice will be used to generate an AI replica.

Failure to obtain the artist's authorisation to use their voice to create a digital clone could expose the person or entity responsible for generating and disseminating the audio deepfake to liability.

So any use of the artificially generated voice of an artist requires a thorough clearance of rights and carefully drafted contractual provisions that sufficiently allow for the training of the algorithm, the use of the voice and the exploitation of the results. Record labels and music publishers will likely have to review their existing contracts and standard terms very thoroughly if they want to be able to surf this new wave of "talented" AI-generated voices.

What role for the AI Act?

No one would be surprised to read that deepfakes, including voice cloning techniques, are tackled in the recently published EU Regulation 2024/1689, more commonly known as the AI Act. This piece of legislation classifies AI systems according to their level of risk, imposing fewer restrictions on those presenting lower risks and more stringent obligations, or even outright bans, on those posing a more serious threat to humans. Deepfakes may fall into any of these categories depending on their nature and their potential or actual use.

When it comes to the type of audio deepfakes you might come across in the music industry, they'd fall under the umbrella of limited-risk AI. This means that these deepfakes are allowed but subject to transparency obligations: it must be disclosed that the content has been artificially generated or manipulated.

Where the audio deepfake forms part of an evidently artistic, creative, satirical, fictional or analogous work or programme, as would be the case if the AI-generated song is incorporated into a real video clip, the existence of the deepfake should be disclosed in a way that doesn't adversely affect the display or enjoyment of the work (eg as a notice in the opening credits).

But the new provisions of the AI Act don't apply to everyone who uses AI. The transparency obligations are only imposed on deployers, a concept that encapsulates any natural or legal person who uses an AI system under its authority for any purpose, excluding strictly personal non-professional activities.

A record label or a music publisher will fall under the definition of deployer and would have to comply with the disclosure requirements whenever they release a new hit featuring the machine-made vocal sounds of their all-time diva or their freshly signed talent.

Now, what's next?

As AI continues to develop, the ramifications and opportunities for the music industry are simply immense. Performers, publishers, record labels, platforms and other actors in the industry could benefit from the vast range of opportunities that voice cloning techniques offer, from the purely technical to those facilitating the diffusion of an artist's work or the preservation of their musical legacy.

Beyond the ethical implications, audio deepfakes raise complex legal questions from an intellectual property perspective and in the areas of data protection and fundamental rights. Mitigating the legal risks will require close examination of the chain of title of the works and performances, thorough clearance of rights and careful drafting of appropriate contractual arrangements.

On top of that, all audio deepfakes created for commercial purposes have to comply with the transparency requirements of the AI Act.

All these issues must be carefully addressed by the parties involved before pushing the golden buzzer for the singing AI clone.

Do you want to know more? Check out our podcast.