With the breakthrough of ChatGPT into mainstream culture and the resulting arms race among tech giants, society is fast approaching an era in which man-made machines surpass their creators not only in memory and processing speed (which has already occurred), but in areas where humans previously held a monopoly: artistic creativity, problem solving, and intellectual insight. This new technology will quickly flow into numerous industries, with experts predicting that generative AI could disrupt $100 billion in cloud spending, $500 billion in digital advertising, and $5.4 trillion in e-commerce. The disruption is understandable given the technology’s capabilities: ChatGPT has already assisted in the development of new drugs, detected various types of cancer, and scored in the 90th percentile on the Uniform Bar Exam (don’t get too cocky, top 10%; it’s coming for you).
Few dispute that AI-generated content will be valuable, even integral, to companies. As the proliferation and value of these outputs increase, so will companies’ need to protect the proprietary information that gives them a competitive advantage. This article focuses on the existing and evolving legal framework for protecting AI-generated content (“outputs”) and ownership of that content, and also addresses the need to protect the AI inputs that may include your company’s confidential information.
A. How it Works
How this new “generative AI” works, and the processes used to create its outputs, are where the legal battle lines will be drawn and battles fought. Before diving into the legal landscape, therefore, it is helpful to understand how machine learning and generative AI work, their uses, and their drawbacks.
Prior to the debut of generative AI, artificial intelligence was largely predictive, meaning the AI focused on making predictions and forecasts. Streaming services, for example, utilize predictive AI in their recommendation algorithms. Taking into account a user’s past behavior and the behavior of similar users, the AI models identify patterns and predict the types of content the user is likely to enjoy. Google’s search engine uses a similar model: indexing and ranking results in order to “predict” the responses that are the best fit for the user’s query.
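To make the “predictive” idea concrete, the pattern-matching described above can be sketched as a toy recommender. This is purely illustrative, with made-up users, titles, and ratings; it is not any streaming service’s actual algorithm, which would involve far more data and far more sophisticated models.

```python
# Illustrative only: a toy "predictive" recommender. It predicts what a
# user may enjoy by finding users with similar ratings and borrowing
# their preferences. All names and ratings below are hypothetical.
from math import sqrt

ratings = {
    "alice": {"Drama A": 5, "Comedy B": 1, "Drama C": 4},
    "bob":   {"Drama A": 4, "Comedy B": 2, "Drama C": 5, "Drama D": 5},
    "carol": {"Comedy B": 5, "Comedy E": 4},
}

def similarity(u, v):
    """Cosine similarity over the titles both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[t] * v[t] for t in shared)
    norm = sqrt(sum(u[t] ** 2 for t in shared)) * sqrt(sum(v[t] ** 2 for t in shared))
    return dot / norm

def recommend(user):
    """Predict an unseen title for the user, weighted by similar users' ratings."""
    scores = {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], their)
        for title, rating in their.items():
            if title not in ratings[user]:
                scores[title] = scores.get(title, 0.0) + sim * rating
    return max(scores, key=scores.get) if scores else None

print(recommend("alice"))  # alice's ratings resemble bob's, so an unseen drama surfaces
```

The point of the sketch is that the system never creates anything new: it only ranks existing items by predicted fit.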
Generative AI, on the other hand, doesn’t predict the correct answer. Instead, it “generates” the next word, brushstroke, or line of code. It does so based on an enormous amount of data, continuously scraped from the internet to “train” the generative AI program. These generative models produce a comprehensible response (or “output”) to the user’s input, and those inputs may themselves become part of the model’s expanded intelligence. Initially, the AI’s outputs are reviewed and “graded” by people, who reward correct (or at least comprehensible) answers. At a certain point, however, the responses are graded not by a person but by the AI itself, which contributes to the exponential growth of the AI’s abilities. Generative AI thus creates new content based on the patterns in the data it has consumed, sharpening its responses as it ingests more data, until it can essentially teach itself.
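The “generate the next word” mechanism can likewise be sketched in miniature. Real generative models use neural networks trained on vast corpora; this toy bigram model, trained on a single made-up sentence, shows only the core idea that output is produced one token at a time from patterns learned during training.

```python
# Illustrative only: a toy next-word generator. The "training" step
# simply counts which word follows which in a tiny hypothetical corpus;
# generation then emits one word at a time from those learned patterns.
import random
from collections import defaultdict

corpus = "the court held that the use was fair because the copying was transformative"

# "Training": record, for each word, the words observed to follow it.
follows = defaultdict(list)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev].append(nxt)

def generate(start, length=6, seed=0):
    """Generate text one word at a time from the learned bigram counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(rng.choice(options))
    return " ".join(out)

print(generate("the"))
```

Even at this scale, the output is new text assembled from learned patterns rather than a lookup of a stored answer, which is the distinction the copyright cases discussed below turn on.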
Generative AI will likely disrupt industries and the legal landscape in ways that we cannot currently predict, but this is not the first time a new frontier of technology reshaped commerce and the rules that govern it. Just look at the legal battles fought in the early days of the internet. We had the battle over Napster and other file-sharing platforms, which pitted copyright holders against 66-year-old grandmothers who allegedly infringed the holders’ rights by illegally downloading songs. While the copyright holders won the legal battles, music labels made a futile attempt to resurrect the prevalence of the CD. Ultimately, the labels succumbed to the new world order and settled their dispute with the World Wide Web, putting their music on streaming services for a licensing fee. The ultimate result? Music labels increased profits because people spent more money on streaming services than they did on CDs.
One could imagine a similar result in the AI context, with owners agreeing to license their intellectual property to AI companies for a fee. However, there is an important backdrop in the file-sharing cases that may not be present here: courts flatly rejected Napster’s (and others’) argument that file-sharing was protected by the fair use doctrine.[1] The copyright holder, therefore, had a clear right to his or her creation, which was identically copied through file-sharing platforms, and the infringer had no valid defense. In the AI context, it would appear[2] that this rationale would not apply, as the ultimate output borrows from, but does not identically copy, otherwise protectable works.
The boundaries of this fight may be unclear and undeveloped, but as the battle lines begin to be drawn, companies will need help navigating the benefits and managing the legal risks of this new AI-generated frontier. Lawyers should therefore be prepared to advise clients in a fast-changing environment, understanding the parameters of intellectual property protections and these doctrines’ imperfect applications to this new and evolving technology.
B. Legal Overview
As an initial matter, users of ChatGPT own the input and the output generated from the AI, so long as they don’t violate ChatGPT’s content policy and terms of use. There is a notable caveat here, however: OpenAI, the owner of ChatGPT, “assigns” users its “right, title and interest in and to” the output “subject to” users’ compliance with ChatGPT’s terms of use. This qualification makes it easier for OpenAI to revoke a user’s rights in outputs that violate the company’s terms of use. For purposes of this article, though, so long as any output generated would otherwise qualify for intellectual property protection, we will assume that users can enforce their rights to property created by ChatGPT.
1. Copyright
Copyright could be a useful tool in protecting content that is assisted by AI-generated outputs. However, there must be some spark of human creativity to qualify for copyright protections. The United States Copyright Office recently published guidance on the topic, noting “that copyright can protect only material that is the product of human creativity.” However, the guidance provides that in certain circumstances “human authorship combined with uncopyrightable material” could qualify for copyright protections. The key question will be whether the AI contributions are a result of the author’s “own original mental conception.” If so, the result is likely copyrightable. This will “necessarily [be] a case-by-case inquiry.”
Beyond being a tool to protect AI-generated content, copyright can also be used to prevent AI companies and their users from pillaging holders’ intellectual property used to “train” the AI (and generate outputs). As already mentioned, the finding that Napster was not protected by the fair use doctrine was central to the demise of the peer-to-peer industry and the subsequent prevalence of streaming music services. But what if the AI-generated content does not directly copy the underlying work, and only “learns” from it? Could that be fair use? Does asking an AI to create an output in the “style of Michiko Itatani” (a well-known Chicago-based artist) necessarily violate her copyright? What if you ask the AI to reproduce a specific work?
Two recently filed cases, one against GitHub, Microsoft, and OpenAI[3] and another against an image-based AI model,[4] seek to answer some of these questions. The battle lines appear to focus on whether the input-stage ingestion of copyrighted material to train the AI model constitutes fair use. Courts have rejected arguments similar to, though distinguishable from, those likely to be made in these cases. In a seminal fair use case, an appellate court rejected the Authors Guild’s argument that Google Books was infringing copyrights by transforming copyrighted books into an online searchable database through scanning and digitization. The end product provided users with certain snippets of books, although not the entire text.[5] The court held that Google’s actions constituted “fair use,” as the copying was highly transformative and did not provide a meaningful market substitute for the copyrighted work (i.e., people would still buy books).
AI proponents will undoubtedly argue fair use, asserting that AI’s creations are sufficiently transformative, making the end product something entirely different from the original work. However, the Google case involved a static product: Google Books’ database was not going to produce an exact reproduction of an entire book, no matter how many different prompts users inputted. As discussed above, generative AI is far from static; the content generated by such machines is limited only by the creativity of the user (and the AI’s ever-increasing capabilities). Unlike Google Books, therefore, users could mimic artists’ styles to the point of impersonation and essentially copy entire works. Additionally, the “most important” fair use factor, the effect on the market for the copyrighted work, weighs more heavily in the AI context. Unlike snippets of books (as in Google Books), AI could easily reproduce entire works, causing real economic harm. And Getty’s fight, even if victorious, may prove akin to the music industry’s battle against Napster: even a legal win may force companies to evolve into the AI space or to create previously unforeseen sources of revenue.
Laws and courts are going to have to grapple with these issues. However, companies trying to protect intellectual property based on AI-generated content should be mindful that certain outputs could be pulled from protected works. This may or may not be a cause for concern, and the legal landscape will change as quickly as the technology rapidly becomes more sophisticated.
2. Patent
Patent protection could be a good fit for certain AI-created information. Common patentable technologies—computer software, diagnostic tools, and pharmaceutical tech—are all susceptible to disruption by generative AI. However, notably, courts have held that an AI machine cannot be an “inventor” and therefore its invention cannot be protected by the Patent Act. Thaler v. Hirshfeld, 558 F. Supp. 3d 238 (E.D. Va. 2021).
There are important observations to make regarding Thaler. First, the Thaler court was presented with the unusual circumstance of an applicant literally listing the AI as the “inventor.” The court did not even address the meatier scenario where the AI-generated content is only part of a larger whole. Second, and relatedly, assuming there must be some human contribution added to the AI’s production, how much is sufficient for the result to be an invention? Courts have defined invention in the patent context as the “formation in the inventor’s mind of a definite and permanent idea of the complete and operative invention, as it is . . . to be applied in practice.”[6] It seems clear an AI can never have such a formation (or a mind), but simply parroting or utilizing an AI’s response doesn’t appear to meet this standard, either.
Regardless, patent protection has drawbacks: the patentable invention must be publicly disclosed, and a patent can be enforced only for a limited term. Neither issue applies to trade secrets.
3. Trade Secrets
This leads to the final intellectual property protection, trade secrets, which is likely the most readily accessible and useful in the AI-generated context. To be a protectable trade secret, the information or technique must (a) be secret (it cannot be commonly known or easily discovered); (b) derive independent economic value from being secret (i.e., giving the owner an advantage over its competitors); and (c) be kept secret through reasonable efforts. 18 U.S.C. § 1839(3).
There are numerous benefits to trade secret protection. For example, trade secrets do not require up-front fees (e.g., filing and legal fees for obtaining patent grants) or a lengthy approval process. The upfront costs of trade secret protections are, generally, the cost to develop a trade secret and the cost to implement reasonable efforts to keep it secret. Further, unlike patents, trade secrets do not have to be disclosed, and trade secret law can protect information that does not qualify for patent protection at all, such as customer lists or other compilations of otherwise publicly available information that derive independent economic value from not being disclosed. Additionally, trade secrets have no expiration date so long as the qualifying factors continue to be in force.
The crux of any trade secret dispute regarding AI-generated content will likely be whether the secret can be easily discovered. Common sense tells us that if the “secret” is merely the copied-and-pasted output of the AI model, it is unlikely to be protectable. Similar to the Copyright Office guidance discussed above, there likely must be some additional protectable component added to the AI’s output. However, trade secrets do not require the spark of human creativity demanded by copyright. The primary barrier precluding AI-generated content from being copyrightable is therefore not present for trade secrets.
Companies attempting to protect secrets that utilize AI-generated content should therefore take active steps to ensure that the end product is not easily discovered (i.e., reverse engineered). Some additional component, benefit, or application should be added to any AI-generated output that could provide a competitive advantage. The more layers that are added onto such outputs, the more likely they will be protectable. Indeed, courts have held that products and techniques can be protectable as trade secrets even when there are public domain counterparts.[7] Companies should also create confidentiality policies for all information they would like (or may like) to protect as a trade secret.
Moreover, companies should be concerned about employees using company information as “inputs,” as companies have reported seeing their proprietary information appear in AI-generated outputs. As we navigate this evolving technology with old legal constructs, companies will have to adopt policies on how employees can and cannot use this technology without compromising confidential company information. Likewise, agreements with vendors, suppliers, and third parties will have to address the use of exchanged confidential information in generative AI.
C. Conclusion
The arrival of generative artificial intelligence into realms previously thought to be within the sole purview of humans will disrupt countless industries and cause just as many legal headaches. As with the internet in the 2000s and cryptocurrency today, practitioners find themselves in the legal equivalent of the Wild West, where the laws read more like guidelines and all parties attempt to find their way in territory that is, in many respects, uncharted. Courts and practitioners will attempt to fit this new order into existing legal doctrines, and there will undoubtedly be growing pains as new applications arise and the case law adjusts.
We will continue to write on AI-generated intellectual property and related issues going forward as the legal battles unfold. Our lawyers have deep experience in advising clients on how to protect their intellectual property rights, including in the artificial intelligence arena. If you have any questions or would like to discuss anything in this article, please contact your regular Locke Lord LLP contact or the authors, Jennifer Kenedy and Jorden Rutledge.
[1] A&M Records, Inc. v. Napster, Inc., 239 F.3d 1004 (9th Cir. 2001).
[2] Authors Guild, Inc. v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
[3] Doe v. GitHub, Inc., No. 3:22-cv-06823 (N.D. Cal.). At its core, the complaint alleges that Defendants released Codex and Copilot, two “assistive AI-based systems” that are alleged in some instances to generate copied copyrighted material without attribution. (Complaint ¶¶ 46, 77.) Plaintiffs allege that after Defendants trained Copilot and Codex by exposing them to large quantities of data gathered from publicly accessible open-source code repositories on GitHub, they used Copilot and Codex to distribute similar code to users. (Id. ¶¶ 46, 140.) As a result, Plaintiffs assert that Defendants violated open-source licenses and infringed intellectual property rights. (Id. ¶¶ 143-171.)
[4] Getty Images (US), Inc. v. Stability AI, Inc., No. 1:23-cv-00135 (D. Del.). In this case, Getty Images filed a lawsuit in the US against Stability AI, creator of the open-source AI art generator Stable Diffusion. The stock photography company alleges that Stability AI copied more than 12 million images from its database “without permission ... or compensation ... as part of its efforts to build a competing business,” and that the startup has infringed both the company’s copyright and trademark protections.
[5] Authors Guild, Inc. v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
[6] Round Rock Rsch., LLC v. SanDisk Corp., 81 F. Supp. 3d 339, 348 (D. Del. 2015).
[7] Weston v. Buckley, 677 N.E.2d 1089, 1092 (Ind. App. 1997).