Generative AI & Copyright Law: Viewing the Legal Landscape

The term “artificial intelligence” is no longer confined to the pages of science fiction or the runtime of Blade Runner. But don’t let that fool you. AI has been a part of our lives longer than you may realize. From email automations to voice assistants like Alexa or Siri, AI has become a prevalent and useful umbrella of technology. Over the past five years, however, generative AI products like ChatGPT and Midjourney, and developers like OpenAI, have become household names as new technology floods the global markets. With the rise of generative AI, ethical and legal questions regarding its use have followed, and the intellectual property considerations of both AI programs and their outputs are reaching a boiling point. Yet AI controversies are complex, and the law has not yet caught up to these advancements. Attorneys, legislators, and scholars alike are already contemplating the privacy, intellectual property, free speech, equal employment, and national security implications of artificial intelligence. Among these considerations, copyright law has been a driving force in the legal analysis of AI. In this article, I provide a bird’s-eye survey of the legal landscape related to copyright law and generative AI.

What is Generative AI?

Generally speaking, artificial intelligence is software and/or hardware that is trained to solve complex problems or complete tasks that require the instrument to mimic human thought. Make no mistake, however. AI cannot think, nor is it sentient. The futuristic, philosophical implications of this premise are encapsulated in the currently unrealized idea of Artificial General Intelligence (AGI).

In practice, AI cannot operate independently or of its own mind (it doesn’t have one). Its behavior is not arbitrary: although many models generate their outputs probabilistically, every model is bound to the instructions, guardrails, and training parameters of its developer. Further, “artificial intelligence” is a term used to encapsulate a vast blanket of probabilistic models ranging from robotics to voice assistant products (Siri and Alexa), website helpdesk chatbots, and reactive machine learning models. Here, we’ll focus on deep learning models like Midjourney, ChatGPT, GPT-4 (OpenAI), DALL-E 2 (OpenAI), and Bard (Google).

There are two primary categories of what is called “Generative AI.” First, we have large language models (LLMs). These models, like Bard or ChatGPT, are supervised or semi-supervised chatbots that generate predictive, textual responses to prompts from a user. The second category includes large visual models (LVMs), which operate similarly to LLMs but generate an image output instead (e.g., Midjourney and DALL-E 2). Both kinds of models are trained on vast data sets and respond to prompts by making predictions based on the data they have ingested. Many are semi-supervised, meaning the AI is initially tuned on a pre-defined data set and may then be refined further based on user prompts and reactions to produce more accurate generations.

Functionally, the user-facing process looks like this: a user inputs a prompt [X value], and the model generates [Y output] based on the analytic similarity of the prompt to past data. Here is an example from ChatGPT:

User Prompt: Explain briefly how generative, machine-learning, text-based AI models work as though you were explaining it to a fifth grader.

ChatGPT 3.5: Sure! Imagine a computer that loves to read books. It reads so many books and learns how words and sentences work. Then, when you ask it a question, it uses what it learned to create new sentences and give you an answer. It’s like having a really smart friend who learns from books and can talk to you!
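
For readers who want to see the “prediction based on past data” idea in miniature, the sketch below may help. It is a toy word-pair counter written for this article, not how ChatGPT or any commercial model is actually built; the function names (train_bigram_model, generate) and the one-line training text are hypothetical and chosen purely for illustration. It “ingests” a sentence, records which word tends to follow which, and then extends a prompt with the most frequently seen follower.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """'Ingestion': count, for every word, which words followed it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def generate(model, prompt_word, max_words=5):
    """'Output': greedily extend the prompt with the most frequent follower."""
    output = [prompt_word]
    for _ in range(max_words):
        followers = model.get(output[-1])
        if not followers:
            break  # the model never saw anything after this word
        # Pick the most frequent follower; break ties alphabetically
        # so the result is repeatable.
        best = min(followers, key=lambda w: (-followers[w], w))
        output.append(best)
    return " ".join(output)


# Hypothetical training data, assumed only for this illustration.
TRAINING_TEXT = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(TRAINING_TEXT)
print(generate(model, "cat", max_words=3))  # prints "cat ate the cat"
```

Note that the output recombines fragments of the training text rather than inventing words the model has never seen. Real generative models are vastly larger and more sophisticated, but this dependence on ingested material is what drives the copyright questions discussed below.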

The Legal Landscape of Copyright & AI

Copyright applies to artificial intelligence in two categories: ingestion and output. Ingestion refers to the process of tuning, where the developer uses pre-defined data sets to train the AI so that the model can accurately react to prompts and generate outcomes based on similarities to its training data. Outputs refer to the actual generations of the model. In LLMs, this would be the text-based response to a user prompt. For LVMs, it is the image(s) generated. Copyright protects original works of authorship from unauthorized reproduction or display by anyone other than the copyright owner. See 17 U.S.C. §§ 102, 106. These rights vest in the author the moment the work is created. However, a copyright owner can obtain valuable additional protections by registering the work with the U.S. Copyright Office. The application of these rights to AI is best viewed in light of the parties with potential copyright interests in AI-related works.

The Three Parties With Potential Copyright Interests

AI Developers

AI developers may have several copyright interests related to their models. First, and perhaps most settled, developers may have copyright in the AI model itself. While it is well established that developers can obtain copyright protection over computer code, copyright’s merger and useful article doctrines complicate matters when it comes to protecting utilitarian works like software. Licensing plays a role as well, and a developer’s ability to claim copyright over his model can be frustrated if he used open-source software as the base for the program. Open-source code is software offered freely to the public by its developer under a general use license to incentivize innovation. While these free licenses allow for use, they do not mean the original developer has relinquished their copyright. In fact, the opposite is true. The existence of a license implies the assertion of an otherwise protected right. While enforcement is unlikely unless the AI developer uses the open-source software in an unethical manner, AI developers should be wary of the limits on their rights to the AI models they develop.

Further, AI developers have already attempted to claim copyright over the outputs generated by their AI models. Stephen Thaler attempted to register an image (titled “A Recent Entrance to Paradise”) that was created using his “Creativity Machine,” a generative AI model, and the Copyright Office refused to register it. In Thaler v. Perlmutter (2023), the U.S. District Court for the District of Columbia agreed with the U.S.C.O., explaining that copyright authors must be human. Some have taken this to mean that the court held AI-generated works can never be the product of human authorship. This is not the case. The decision noted that Thaler, in his registration application, listed the AI as the author and himself as the owner of the copyright through the work-for-hire doctrine because he was the creator of the AI and prompt. The court therefore did not decide whether a human who uses AI can qualify as an author, and copyright in AI-generated outputs remains an unsettled issue. Nonetheless, in cases where the developer is not the prompter, the developer likely does not exert sufficient control over the creative process to claim the work as his own.

Authors With Pre-Existing Works

The second interested party consists of authors of pre-existing works that are used to train the AI models. This raises the issue of whether a developer’s training of an AI on data sets (which often include millions of unfiltered data points) constitutes an unauthorized reproduction of the copyrighted works contained in them. Filed on December 27, 2023, The New York Times Co. v. OpenAI embodies the essence of this conflict. The complaint alleges that generative AI models engage in mass copyright infringement: in that case, the mass reproduction of NYT’s journalism, among other data, in training OpenAI’s GPT models.

That is not to say, however, that AI developers are doomed. Many argue that training AI models constitutes a fair use of copyrighted works. Whether AI training falls under copyright’s fair use protections will depend on how courts square the scope of “transformative” uses with two Supreme Court decisions: Google LLC v. Oracle Am., Inc. (2021) and Andy Warhol Found. v. Goldsmith (2023). In Google, the use of Oracle’s API software packages by Google to develop the Android platform was considered a transformative use, but Warhol’s use of Goldsmith’s photograph of Prince was not. Both cases involve the direct copying of substantive elements of the underlying work, yet they reach different results, with one major distinction appearing to be the amount of weight the Court ascribed to “commercial use” in each case.

Prompt Engineers

The third and final category of interested parties is the prompt engineer or prompter. These are the people who use generative AI services like ChatGPT and Midjourney to create writings or images. The precise question here is one that the court avoided in Thaler: is a prompter the author of a work generated by AI according to his prompt? The courts have not answered this question, though the Copyright Office has, perhaps too hastily, begun to do so. In February 2023, the U.S.C.O. refused to register the individual images in the graphic novel Zarya of the Dawn because the images were created by AI. It did, however, allow the compilation of the images as a graphic novel to be protected as its own work. Nonetheless, the Copyright Office seemed to recognize that this blanket refusal was problematic and issued a request for commentary on the subject in August 2023. I worked with other members of the Copyright Committee of the American Intellectual Property Law Association (AIPLA) to draft commentary addressing the application of copyright law to AI in October 2023. In the commentary, we addressed the importance of flexibility in determining what constitutes a “human creation” for the purposes of copyright authorship. A blanket rejection may raise questions about findings of authorship in other mediums that attenuate the creator from the creation (e.g., certain forms of photography use automatic exposures and slow shutter speeds to allow many random movements to be captured in a single frame, leaving the photographer with limited control over the subject). As of this post, the question of authorship in AI outputs remains unanswered, though some form of protection, however thin, seems imminent.

However, prompters should be wary of the outputs they generate because an output can itself infringe an existing work. For example, if I ask Midjourney to create an image of “an old, withered man standing on top of the Empire State Building dressed as Spider-Man,” the use of the Spider-Man design may constitute an infringement of Marvel’s copyright, not only in the suit design but in the character itself. The use of AI does not diminish the rights of existing copyright owners, and prompt engineers should take care to ensure they are not infringing the works of others.

Conclusion

Ultimately, the advent of generative AI has raised many questions regarding intellectual property. Whether copyright protection exists for generative AI outputs and whether the ingestion process constitutes fair use or infringement per se are vital questions, the answers to which will shape the relationship between intellectual property and AI. The landscape is evolving, and the need for developers, prompters, and existing authors to seek legal advice will only become more pressing as litigation escalates. Yet, it seems inevitable that some form of copyright protection, however thin, will find itself married to works created with the assistance of AI. As the legal field continues to navigate emerging technologies, connecting with an intellectual property attorney regarding protection when interacting with generative AI models is more important than ever.

If you have questions or needs regarding intellectual property and generative AI, reach out to Darkhorse Attorneys today to schedule a consultation.

The contents of this post do not constitute legal advice, and no client-attorney relationship is formed. The contents of this post are offered for educational purposes only. Every situation is unique. If you require legal assistance, submit an inquiry through our website or call our firm at 540-553-8149.

Darkhorse Staff

© 2016-2026 Darkhorse Law PLLC. All rights reserved. The information on this website is for general information purposes only. Nothing on this site should be taken as legal advice for any individual case or situation. Every case is different. This information is not intended to create, and receipt or viewing does not constitute, an attorney-client relationship.