AI and copyright: who holds the pen in the age of machines?
September 2024 | EXPERT BRIEFING | INTELLECTUAL PROPERTY
financierworldwide.com
Artificial intelligence (AI) has existed for decades. However, the pace at which it has advanced in recent years means that rightsholders, users and developers are increasingly asking questions about the legal implications of its use.
This article will focus on the potential issues under UK copyright law arising from the use of AI, including in relation to the risk of third-party copyright infringement claims.
Which law applies to the interaction between AI and copyright?
The UK government decided not to implement the Directive on Copyright in the Digital Single Market ((EU) 2019/790) (DSM Directive), which contained two copyright exceptions in relation to text and data mining. In addition, the UK government has not yet introduced specific laws to regulate AI.
In the absence of AI-specific laws in the UK, we need to look at the existing legislative framework, namely the Copyright, Designs and Patents Act 1988 (as amended), for answers to the following questions about the interaction between AI and copyright.
Who is the author of a work generated by AI?
Under UK copyright law, the author of a work is the person who creates it. Unlike many jurisdictions, UK copyright law goes further than this and sets out the authorship position for certain ‘computer-generated’ works. If a literary, dramatic, musical or artistic work is generated by computer in circumstances such that there is no human author of that work, then UK copyright law says the author is the person by whom the arrangements necessary for the creation of the work are undertaken.
Although these provisions were included in the original version of the Copyright, Designs and Patents Act 1988 with computer-generated works in mind, they appear to contemplate a single stage at which the necessary arrangements are undertaken. In the context of AI-generated works, however, it is not clear who would be considered the person responsible for undertaking those arrangements, because different people may be involved at different stages of the creation process.
For example, is the author the person who: (i) writes the code for the AI; (ii) chooses and handles the training data; (iii) provides feedback on decisions taken by AI; (iv) provides the prompt; or (v) funds, owns or controls the AI? While some answers seem more likely than others, this is a question of fact and degree and is likely to depend on the relative contributions of those involved in the creation of the AI-generated work.
The UK’s Intellectual Property Office (IPO) consulted in 2021 on whether to change the law in relation to computer-generated works. Commenting that use of AI was “still in its early stages”, the UK government decided not to make any changes to the law but said it would keep this under review. Acknowledging the pace at which AI has developed since then, the UK IPO indicated earlier this year that it might be time to revisit the law relating to computer-generated works.
Can a work generated by AI be protected by copyright?
Under UK copyright law, for literary, dramatic, musical and artistic works to be protected by copyright, they must be considered ‘original’.
The traditional test for originality in the UK requires that the author use their skill, judgment and effort in creating the work. The European Union (EU) test for originality, which is assimilated law following Brexit, requires a work to be the “author’s own intellectual creation”. In a 2023 case, an English court referred to the EU test as the correct test for originality, stating that this requires the author to be able to make “free and creative choices” so as to stamp the work with their “personal touch”.
It is not clear how an AI-generated work would satisfy the originality requirement. For many potential authors, their contribution to the AI-generated work seems too remote to be considered a ‘personal touch’. For the person providing the prompt, although their prompt results in the work being generated, the same prompt can result in different outputs. In this context, is it really correct to say that the resulting work is the intellectual creation of the person providing the prompt?
Where a work is generated using an AI model trained on large volumes of data, rightsholders are likely to argue that the AI-generated work has been derived from the training data and constitutes a derivative work. If this is the case then, in addition to the work being the author’s own intellectual creation, UK copyright law requires that there is a level of alteration sufficient to make the totality of the new derivative work an original work. It is not clear that this would automatically be the case for an AI-generated work, and so this would need to be assessed on a case-by-case basis.
Can a copyright work be used as part of training an AI model?
Most generative AI models need to be trained on a vast amount of data. The data used in training may include works protected by copyright.
Under UK copyright law, copyright is infringed if the whole or a substantial part of a copyright work is copied without the consent of the rightsholder, unless a copyright exception applies.
Although there is a copyright exception for copying for text and data analysis, often referred to as text and data mining (TDM), the exception is not available unless the analysis is solely for research for non-commercial purposes. If a copyright work is used to train an AI model in the UK, unless that is solely for non-commercial purposes, rightsholders are likely to argue that the TDM exception does not apply and that a licence is needed for the use of that copyright work.
The use of copyright works as training data for AI models is currently being considered in a case involving Getty Images, the visual media company, and Stability AI, a generative AI company. Getty Images is claiming that Stability AI has infringed copyright by using images from the Getty Images websites to train Stable Diffusion, Stability AI’s deep learning AI model, without consent from Getty Images. The case is yet to reach trial but, depending on what the court decides, may have significant implications for rightsholders and AI developers.
Even if Getty Images is successful, this may not be the outcome for future claims by rightsholders if UK copyright law is updated to cater for the use of copyright works for training AI.
In 2022, the IPO proposed introducing a new TDM exception for any use (whether commercial or non-commercial). Following significant concerns raised by representatives from the creative sector, the UK government confirmed in 2023 that it would not introduce the proposed exception. Instead, the IPO was tasked with producing a voluntary code of practice by summer 2023 to support AI firms to use copyright works as an input to their models. However, the UK government confirmed in February 2024 that it would not be possible to agree an effective voluntary code and, by the time the general election was announced in May 2024, had not proposed a way forward.
Although the Labour Party’s manifesto did not include a proposal in relation to the use of copyright works for training AI, its plan for the creative industries (launched in March 2024) stated that Labour would find the “right balance between fostering innovation in AI while ensuring protection for creators”. It remains to be seen how the new Labour government will ensure that the right balance is struck.
Can a work generated by AI infringe someone else’s copyright?
The traditional test for copyright infringement in the UK is whether a substantial part of a copyright work has been copied. If a claimant cannot produce direct evidence of copying, they will need to ask the court to infer copying from sufficient similarities between the works and evidence that the defendant had access to the claimant’s work; the defendant can rebut the resulting presumption of copying by showing that the later work was created independently.
The EU test for copying, assimilated into UK law, is whether the elements copied reproduce the expression of the author’s intellectual creation. It is not entirely clear how this interacts with the UK ‘substantial part’ test.
Where an AI model has been trained using a perfect copy of a copyright work, the rightsholder may argue that any output generated by that AI model automatically infringes that copyright work. However, the AI developer is likely to resist this and may argue that the complexities of the machine learning process, together with the vast amounts of training data involved, make it unlikely that a substantial part of any particular copyright work will be reproduced in the output.
If an AI model has not been trained using a particular copyright work, but output generated using that AI model is very similar to that copyright work, it is not clear if the absence of copying at the training stage would rule out a finding of copyright infringement, or if this would depend on the prompts used to generate the output. For example, what if the prompt asked the AI model to create something similar to that copyright work?
Conclusion
UK law does not yet fully address the questions arising from the interplay between AI and copyright, resulting in uncertainty for rightsholders, users and developers. Until new or amended legislation is introduced, or the courts provide AI-specific guidance, these questions are likely to remain unresolved. It remains to be seen whether the new Labour government will take decisive action in this respect.
Hannah Curtis is a partner, Sarah Hopton is a senior associate and Carter Rich is an associate at CMS. Ms Curtis can be contacted on +44 (0)20 7367 3726 or by email: hannah.curtis@cms-cmno.com. Ms Hopton can be contacted on +44 (0)20 7067 3236 or by email: sarah.hopton@cms-cmno.com. Mr Rich can be contacted on +44 (0)20 7067 3306 or by email: carter.rich@cms-cmno.com.
© Financier Worldwide