Steve and Katie take a look at the many developments around generative AI and fine art, including debates and litigation on copyrightability and infringement as well as the policy concerns surrounding increased use of generative AI to create artworks.
Steve Schindler: Hi, I’m Steve Schindler.
Katie Wilson-Milne: I’m Katie Wilson-Milne.
Steve Schindler: Welcome to the Art Law Podcast, a monthly podcast exploring the places where art intersects with and interferes with the law.
Katie Wilson-Milne: The Art Law Podcast is sponsored by the law firm of Schindler Cohen & Hochman LLP, a premier litigation and art law boutique in New York City. Hi, Steve.
Steve Schindler: Hi, Katie. How are you?
Katie Wilson-Milne: I’m great. Happy holidays.
Steve Schindler: Happy Holidays to you.
Katie Wilson-Milne: We’re trying to do an end of year wrap up, which we have done before. And we realized that one single topic we wanted to talk about in the wrap up would be an entire episode. So here we are. We’re just going to talk about AI at long last. And we’ll talk about a few different developments this year in the AI, art, and copyright world, and I think there’ll be a lot more developments next year for sure, and we will try to follow them.
Steve Schindler: Yeah, I mean, given the lawsuits that are now really gaining momentum, I think we now have just a couple of small decisions to talk about, but mostly complaints and theories on that side. But next year I think we’ll probably have a lot more.
Katie Wilson-Milne: Yeah. And the Copyright Office has also been pretty active in issuing opinions and declining registrations, and they’ve issued a notice of inquiry and all the comments are wrapping up around now. So we’re going to have their report, I think, as a result of that notice of inquiry on generative AI and copyrightability and infringement next year, too, which will be big. So I guess we should start by explaining what generative AI is, and that’s what we’re going to be talking about, this one subset of artificial intelligence. And within this subset of generative AI, we’re going to be talking about how it applies to the fine art world for the most part, obviously, as we do. So, generative AI is, as I said, a subset of artificial intelligence that generates content that looks like it was produced by a human being. And so users typically insert text prompts, although I think in some platforms you can put in suggested images as well. And then the AI system produces outputs that could be text or could be images, it could be music, you know, depending on what the prompts are. And to be able to do this, the generative AI system has to train itself on huge data sets of existing material that are “scraped” from publicly available sources all over the internet. And this process, people may have heard of being called text and data mining, that’s sort of a regulatory legal term for it.
So after the scraping occurs, where the AI system has sort of looked around everywhere to consume as much content as possible, it draws on, I guess we would call it deep learning techniques and algorithms that are programmed into it to “learn” patterns and structures within the existing internet scape to know how to generate new content in response to user prompts. So that’s what we’re talking about today, that kind of AI, we’re talking about generative AI. And in the art world, this looks like AI systems learning how to create artistic works that appear to be created by a human artist and sometimes actually by particular human artists. So a user could prompt a certain system to create an artwork depicting a certain thing that looks like it’s by Picasso, for example.
That’s one possibility. So there are a lot of legal, ethical, and policy issues with these tools, and we’re only going to cover some of them. But the two general categories we’re going to talk about are the copyrightability of this kind of AI output, and then the copyright infringement issues, which are coming out mostly from the training data that these AI systems are using.
Steve Schindler: Perfect summary.
Katie Wilson-Milne: So I guess we’ll start with copyrightability. I’ll take that one.
Steve Schindler: Okay.
Katie Wilson-Milne: And then we’ll turn to you, Steve, to get us up to speed on the infringement landscape, which is pretty interesting. All right, so a little more background for our listeners just to take a step back on what it means to— for something to be copyrightable. Not every creation is subject to copyright, right? The standard in the US is that copyright attaches to original works of authorship that are fixed in a tangible medium of expression. So there are certain requirements, namely originality, which requires that a work is independently created and there’s some small amount of creativity in that.
Steve Schindler: Right. Very small.
Katie Wilson-Milne: Very. It could be a modicum, as the cases say. And then the next requirement is this term authorship, which has long been understood to mean human creation. And that’s the word “author” that really defines that requirement for human creation, which is going to define this— all the issues around copyrightability with AI. So copyright does not protect non-human creativity. A very famous case somewhat recently in the Ninth Circuit involved a monkey that had taken a photograph, and someone had tried to register the copyright in the photograph on the monkey’s behalf, and the Copyright Office declined. And that went all the way up through the court system. And the Ninth Circuit said, absolutely not, registration and copyright itself protects human authors, not non-human authors. So that’s pretty clear in the law. And then the last requirement, which is pretty simple, is that it has to be— this expression has to be fixed somewhere. So it can’t just be in someone’s head, it can’t be the spoken word, it has to be put down in writing, video, sound, something that’s recorded. So the main sort of animating distinction in copyright is that ideas are not protected, but expression of those ideas is. You know, original, creative, fixed human expression of an idea is protected. That protection occurs right away when something— a work of expression is created and fixed. The copyright occurs at that moment and belongs to the author, but you can also register that copyright with the US Copyright Office. And there are good reasons to do that, including the ability to sue for infringement in federal court and get enhanced damages and attorney’s fees for infringement. So it’s a good idea to register, but the copyright exists whether you do or not. And the registration by the Copyright Office is an acknowledgement that a work is copyrightable. So there’s not daylight there.
If it’s copyrightable, it should be registered, it can be registered, and if the Copyright Office declines to register something, it means in their opinion it is not copyrightable. That’s the background here. And the big question for generative AI, which is what we’re talking about, is whether the output of generative AI is copyrightable, given that it is output by a computer system or an algorithm.
Steve Schindler: Even though there’s a person programming the computer.
Katie Wilson-Milne: Right. Even though there are lots of people involved. There are the people who created the system, the people who program the system, and the people who put in prompts. And I think those are all the complicating factors which make this something worth talking about. I mean, if it was just so obvious that there’s no human making it, case closed, no controversy, no one disagrees, then this wouldn’t be interesting. But in fact, there’ve been several controversies, and as we already mentioned, the Copyright Office has felt the need to issue guidance and also to issue a notice of inquiry, because they see a lot of discussion about these issues, including copyrightability. So it’s an interesting area. Now, unlike some other countries, the Copyright Act itself, the Copyright Act of 1976, doesn’t have any provisions for computer-generated work. So there’s no special copyright law for computer-related products. So in applying the law, we apply the same law we apply to everything. You know, the Copyright Office has taken a position on the registration of AI and the copyrightability of generative AI, and it published that position in March of this year in a guidance paper that you can find on the Copyright Office website that makes very clear that expression created only by generative AI with no human involvement is clearly not copyrightable. So it will not be registered and is not protected by copyright. That requirement, of course, as I said, is nothing new, not particularly controversial. But the more interesting question and the tricky question is that for many, if not most, of the AI-generated works being registered, or, you know, whose applications are submitted for registration, there is some human involvement, and we’ll call these hybrid works. So there’s a spectrum from extremely detailed prompting of the generative AI system to actually manipulating the final AI images by painting over them or using Photoshop or collaging them or cutting them up.
That is the area that the Copyright Office focused on to try to explain how applicants could fill out applications and the Copyright Office would know how to deal with them. So, you know, when a human gives a prompt, the Copyright Office does say that no matter how detailed the prompt, that is not enough. The machine is still making all the creative decisions, it’s deciding how to organize the creative material, at least in some part. And so the prompt is really just an idea. And again, an idea is not copyrightable, only the expression of that idea. The Copyright Office has specifically likened this to commissioning a work of art from an artist. So if I were to commission a work of art, I might say I want a portrait of my daughter or a sculpture of my pet or something. So that’s the idea. That’s not copyrightable. Even if I was very specific, you know, I want my daughter painted at this age, in this light, in this color dress and whatever, that would all just be ideas. And so it’s really the artist and in this case the AI system that is generating the final product. And so that’s the distinction. But where a human being selects and arranges the AI-generated material in a creative way, that compilation might be copyrightable. And that’s the same as we see in book compilations and all kinds of compilations that in their whole are copyrightable, but each independent part is obviously not copyrightable to the author of the compilation. The same would be true if a human modifies materials originally generated by AI to a significant degree, again, painting over it, using Photoshop to significantly change the images, or collaging different AI works, arranging different AI works. Those are examples of true human additional authorship that the Copyright Office has said, yes, that’s subject to copyright.
Steve Schindler: Well, and I think just as an example, if you look at the Schindler Cohen & Hochman holiday card video that just went out, it uses a variety of generative AI programs, but it is a compilation of both text, of video, of stills, of music, of voice, and all of that output had to be edited and arranged into something that we hope looks beautiful by a human being. And so we have at least claimed a copyright on that work of generative art. And who knows if the Copyright Office would uphold that or not. But it does seem to be on the end of the spectrum that we’re discussing of sort of high degree of human input in sort of editing and making a lot of decisions about how that looks.
Katie Wilson-Milne: Right, exactly. There’s a really big spectrum. And for individuals that are claiming copyright for an AI work, the Copyright Office has been clear that they can claim it for their own contributions, but they have to disclaim the AI components. So in the standard application, there are spaces to do this and explain, you know, that you used generative AI, what the generative AI output was, and exactly what you did in addition to that generative AI output. And then the Copyright Office will issue a registration certificate only for the human components of that work, disclaiming the AI. So it’s not the case that if you do a lot to an AI image, you get the copyright to the entire work; you don’t get the copyright to the entire work. You get the copyright to your additional contributions. So that is somewhat limiting. And that’s come up in a few cases. I mean, the first famous case was about a comic book called Zarya of the Dawn. It’s not the type of comic that I would—
Steve Schindler: Something that you read?
Katie Wilson-Milne: —I would normally be reading. I saw this only in relation to art and copyright. It was created by someone named Kristina Kashtanova. She used a generative AI platform to create the visual images. And then she wrote in the text, and then she arranged and coordinated various AI images with the text. She originally registered this comic book with the Copyright Office, but did not mention the use of AI. I believe this was before the guidance came out, so it was maybe not clear that she should have, or she wouldn’t have known how to do it. And they registered the work not knowing that there was an AI component. The Copyright Office independently found out, actually, that the images in the comic book were created by generative AI, and they canceled the original registration and issued a new one that made clear it was only covering the expressive content that Kashtanova contributed and then disclaimed the AI images. So that was an interesting case.
In a far simpler case, actually, and the only case that we’re aware of that’s been up through the court system and subject to a published decision, an applicant whose name is Stephen Thaler tried to register a work authored by a computer system that he owned. And actually his application to the Copyright Office said that the author was the computer system, but that the copyright registration could be in his name because he owned the computer system, which is also a totally wrong, confusing part of this application. He had some work-made-for-hire idea that is not relevant at all for copyrightability. So part of this is that the application itself made it clear there was no human author, and that the computer system was entirely responsible for the output. Anyway, the Copyright Office obviously declined to register this. He tried again, it worked its way up through the Copyright Office review system and to the federal district court in the District of Columbia, which basically acts as the appellate court for Copyright Office decisions. And in a pretty simple decision, but a helpful decision, the court emphasized that human authorship is the keystone of copyright, and that it has always been the case, and it is still the case, that only human-authored works are subject to copyright. And because this work of art lacked human authorship, it obviously could not be registered. But the court was very clear that it was not making any decision about works of art where there were a lot of human inputs, even in the prompts. I mean, the court said, look, this case is very narrow. We’re looking only at the application. In fact, Thaler later tried to put in all this evidence about his contributions to the art, but that was not part of the application. And the court said, we can’t look at that; that’s not what this case is about. We’re just looking at the application. And your application says that this was solely created by a computer system.
So that case is helpful, because it does go over the progression of copyright from simply the human hand in a painting to photography to other computer-assisted outputs, Photoshop, things like that, where, of course, a computer system is a tool, or technology is a tool for artists, but the artist is still directing that tool and is entirely responsible for the output, even though they’re using that tool. So the case tries to draw that distinction, although I think people have questions about whether that’s okay.
Steve Schindler: Because I also wonder, when I read that case, whether or not we’re just in the sort of infancy of a technology and everyone is trying to grapple with what it really means. And obviously one of the cases cited in the Thaler decision is the Burrow-Giles Lithographic [Co.] v. Sarony, the photography case that you mentioned decided by the Supreme Court in the 1880s. And that was a case in which the central argument against copyright was that photography is just a mechanical process and that you can’t copyright a mechanical process. You might be able to have a patent on it, but there’s no copyrightability. And what the court did there, of course, is to say that, no, of course there is a mechanical aspect of it, but there are decisions that are being made by a human being about light and angle and exposure and things like that. And all of that can be part of a copyrighted expression, but it probably wasn’t so obvious at the time. Now we look back on it after many years and say, oh, of course that’s sensible. And I’m just sort of wondering, as I read the Thaler decision, if we’re just not at a point where the courts and all other decision makers are still trying to figure out if and where the human component lies.
Katie Wilson-Milne: Yeah. I mean, the Thaler case is easy. Actually, the last example that we have so far kind of touches on this, but it hasn’t gone to court. But, you know, I think the distinction the Copyright Office and the district court drew is that the photograph is the mechanical capturing of the image for sure, but it does nothing to create the image— the camera itself makes no selections about light, or color, or style, or tone. It just makes no choices. It just literally captures what a human being is looking at and has framed. Whereas in the generative AI context, actually, all those aspects of authorship are done by the computer system.
The selection of the arrangement of images, the tone, the colors, you know, all of the sort of discretionary aspects of expression are done by the computer system. Now, a human being can put in more and more detailed prompts, they can reject an image and try again. They can look at 600 images and pick the best one. So it’s not that human beings aren’t doing a lot of work, it’s that the work that is the authorship is always done by the system, which is that final discretionary selection.
So I think that that is an interesting case, and I think the more and more detailed the prompts get and the more retouching and arrangement human beings do to AI images, you know, the harder or more interesting these cases will get. But the Thaler case was, you know, kind of stupid in that there were none of those issues, and the court, I think, went out of its way to make that clear so that it wasn’t seen as deciding any of these more complicated questions.
Now, in the last example, from this September, the review board of the Copyright Office issued an opinion, which is sort of like an administrative appeal process, denying registration to a two-dimensional work by an artist whose name is Jason Allen. And Allen created this work by inputting maybe hundreds of prompts, like a large, large number of very specific prompts into Midjourney’s generative AI platform. He selected an image he liked from many, many image options, so a lot of involvement here from Allen. And then after he selected the image he liked from all these detailed prompts, he altered the details of that image to create a more crisp visual, to add a lot of detail to the final project. So he did a lot with his own hands. Now, Allen tried to register this work, and he argued that the detailed prompts themselves and the human alteration of the final image were enough to make him the author of the work.
The Copyright Office, as I said, does register works that are modified by human hand if the right disclaimers are made. But in this case, Allen refused to disclaim all the AI content from the registration, which is what the Copyright Office asked him to do. They went back to him and they said, look, please disclaim all the AI input from this product.
Tell us what you did, and we’ll consider whether that’s proper to register just your contributions. And he wouldn’t do it. His stance was, no, I’m the author of this total work. I want the registration to this total work. And so the Copyright Office refused to register, because there was clearly more than de minimis AI content. You know, that’s where that was left. I don’t know if more and more artists are going to take sort of a policy stance and try to register works to make a point, which is I think part of what Allen was doing, and if the Copyright Office will become more flexible on this absent court intervention. But the US does appear to be the strictest in terms of the application of human authorship to copyrightability.
Steve Schindler: Right, and I wonder if we think there’s any chance of that evolving, you know, through— obviously you could amend the Copyright Act, you could change the requirement. And just one of the things that I think people think about is also AI is a big business, right? And countries may vie a little bit to want to attract that business.
Katie Wilson-Milne: I think that’s right. I mean, there’s this question of whether the copyright law should be changed, or there should be generous rights for generative AI and AI in general. You know, I think the trouble with copyright, and we talk about this so much on the podcast, is that we try to make copyright fit every need, every legal need in the art world. And it just doesn’t. I mean, it’s not — like we see with conceptual art or appropriation art, it’s not always going to do the work. And that doesn’t mean you can’t make your art, it just means that you don’t get the protection of the Copyright Act. And that might be fine, because there might be value in that production from something else besides copyright, like the market or authenticity or, you know, the fame of the artist and author. So I think this is another example of whether we should— just a question of whether we should be trying to stuff the question of generative AI into the protection of copyright, or whether it can still exist and flourish without that. And that is the essential question. It’s just like, will people be incentivized to create better and better AI systems without copyright protection? Now, of course they can still get patent protection, they can still get trademark protection. There are other systems that—
Steve Schindler: Right. And we also see some of the claims in the litigation go beyond intellectual property into unfair competition and unjust enrichment and some other types of legal theories that are being deployed, at least at the outset, to see whether or not any of them stick, because there are some shortcomings of copyright. And I think though, you know, one of the things that you still sort of come down to on some level— and we saw this a lot with appropriation art, we saw this with the Warhol case— is that on some fundamental level, human beings look at these systems and say, you know, you are going out and just taking billions of images that were created— many of them created by people who are trying to protect their output, their intellectual output, and you’re just taking it.
Katie Wilson-Milne: Yeah. Well, that’s the infringement conversation.
Steve Schindler: That’s the infringement part. But it is the part that I think legitimately makes people angry. And it’s the same thing with appropriation art. You know, some artist is taking someone else’s stuff and using it. And obviously, there are reasons why in some cases that’s okay, you know, in terms of fair use. But it does still, you know, just on some fundamental level, make people very angry.
Katie Wilson-Milne: Yeah. And I think, you know, in the copyrightability question, there’s some of that too, right, artists are being— could be replaced, right? Someone who is commissioned to do something commercially could be replaced by a generative AI system that then is going to get copyright protection. I mean, that feels wrong to a lot of artists.
Steve Schindler: Right. And that was one of the issues in the writers’ strike in Hollywood, right? And that I think is a real legitimate concern, that you could at some point have AI generating screenplays, perhaps, that resemble, you know, a particular author. And that I think is a real problem. We talked about our holiday card before— one of the things I think we discovered in just using ChatGPT for the text output is that there are real limitations of that, of what the output of that is going to be, because of the sort of brakes that are put around it. So since we were doing a sort of winter-themed, non-denominational holiday story, it’s never going to be edgy. It’s never going to be controversial. The output from, say, ChatGPT, as we saw when we did this, was always going to be very middle of the road, very conventional. And so when you think about art and artists, and particularly in writing and performance, it’s usually not that kind of bland, middle-of-the-road output that makes for great art.
Katie Wilson-Milne: There’s no discernment. But I don’t know, I mean, I think a lot of digital art is not appealing to me. And so I don’t know if I’d tell the difference between digital art or NFT-related art that I don’t get, and some of this AI-generated art. I think the other thing with copyrightability, before we talk about infringement more, is what you touched on, Steve, which is that the copyright law comes from the Constitution. And the purpose of that constitutional authorization for Congress to create these limitations basically on speech, is that we are incentivizing human creation. That it’s important for a society to put in place systems that incentivize human beings to make new and better things. That’s true of why we give patents, it’s why we give copyrights. Is limiting the applicability of copyright in the AI context doing that? And I think you could see it both ways. So certainly if all kinds of AI systems are going to be protected by copyright, maybe human beings are going to care a lot less about creating this stuff themselves. And also giving copyright to a computer system doesn’t incentivize the computer system to do anything, because a computer system is not incentivizable.
Steve Schindler: It has the same incentive whether you reward it with copyright or not.
Katie Wilson-Milne: So giving copyright to a computer-generated output is a departure from that original purpose. On the other hand, though, as you also said, if we are worried about and wanting to incentivize technological advancement in the United States, certainly as a competitive player in the field of AI, then we want to give as many carrots as possible, you know, to people building these systems responsibly. And one of those carrots is copyright. So it may incentivize the programmers and the developers. It’s just that if the author is the AI system, that author obviously doesn’t have any incentive.
Steve Schindler: Maybe that leads us a little bit to infringement and some of the legal cases that have been brought. But a sentence that kind of struck me from one of the cases I want to talk about, which is Sarah Andersen and others against Stability AI and others, brought in the District Court for the Northern District of California. And one of the first paragraphs in that complaint that was filed says, and I’m going to quote, “an AI image product is a software-product designed to output images through so-called artificial intelligence techniques, but artificial intelligence is a misnomer. The AI image products at issue in this complaint are all built around the same asset: human intelligence and creative expression in the form of billions of artworks copied from the internet. An AI image product simply divorces these artworks from the artists and attaches a new price tag.” And so I think that also frames the way, you know, when we think about artificial intelligence and what it is doing or creating, if anything, that should or shouldn’t get protection, the flip side of that is that it isn’t actually intelligence at all. It’s just simply the end result of taking a lot of other people’s creative expressions and figuring out a way to move it through this system to an output that’s useful for other people.
Katie Wilson-Milne: So when, I guess when you were talking about an infringement, Steve, we’re talking about this— in these cases we’re talking about the scraping. The use of people’s data.
Steve Schindler: I think what we’re really talking about, there are sort of three main claims in the cases that I’m sort of focused on a little bit, which are Getty Images versus Stability AI et al., the Sarah Andersen case that I just referenced, and then another case called Richard Kadrey, Sarah Silverman, and Christopher Golden against Meta Platforms. And that’s unlike the first two cases— that’s a case that’s brought by writers. In terms of the copyright claims, you have a couple of different stages here, I think, to look at. And one is this sort of, so-called scraping. And the way these systems kind of work is that the internet, if you will, is scraped for images and also sometimes text that goes along with them, labels that go along with the images. Those are sort of put into these networks. And they kind of just exist. And they’re usually all open source, which— they’re available to anybody, and there are billions of these images that are scraped. And so claim number one really does involve the sort of scraping of those images, which is just a form of copying. And one of the things that you get as a copyright owner is the right to reproduce your image. It’s one of the bundle of rights that you have.
Katie Wilson-Milne: Stop other people from doing it.
Steve Schindler: And stop other people from doing it. And so the scraping is one of the claims.
Katie Wilson-Milne: And it’s scraping of copyrighted images. That’s the distinction here. Because it’s scraping lots of stuff, but the problem is some of what it’s scraping is copyrighted expression by human authors.
Steve Schindler: That’s correct. And one of the problems that they have in the case, which I see the plaintiffs are getting a little better at, is you have these large sort of databases of scraped images and text, but it’s all mixed together. It’s— the scraping occurs out on the vast internet. And so it could be scraping images in the public domain, it can be scraping, you know, Monet Water Lilies, but it’s also scraping works of art that are either copyrightable or in some cases registered with copyrights. But the system makes no distinction between them. And that’s what makes it a little bit difficult sometimes, because then on the other side, the output is a sort of amalgamation of both copyrighted images and non-copyrighted images. So claim number one for copyright infringement is the copying of the actual image. The second piece of it really is through programs like Stable Diffusion, which was created by Stability AI. These programs are so-called “trained” on images that have been scraped. There’s a very long explanation as to how this process works, but the result is an output image that is obtained by putting in prompts or sometimes pictures
Katie Wilson-Milne: You could actually put in a work you want it to imitate.
Steve Schindler: You can do that also. And then among the billions of images that were scraped, some subset of those are used in these various different programs that are trained. And unlike the typical kind of software we think of as being programmed by human beings, the training of these programs is implemented by these collected images, and then that results in an output. And so the output is not done by, say, Stability AI, it’s done by you or me going into the system and saying, I want a Basquiat, or name the artist. And so that claim is basically for vicarious copyright infringement. That is essentially that the defendant is making it possible by what they’re doing for you to violate somebody’s copyright. And that’s another theory that’s in all of these cases. And then I would say the third theory that’s definitely percolating in the sort of intellectual property realm is usually a violation of the Digital Millennium Copyright Act. And that has to do with either removing what they call copyright management information or somehow altering it or, in some cases, falsely disseminating it. But mostly it’s the removal of it. So for example, there are certain kinds of copyright management information. The typical one would be, you know, a © symbol with the name of the copyright holder that might appear on a work. It’s a violation of the Digital Millennium Copyright Act to take a work that has this kind of identifying copyright information, and there are other types as well, and then delete it. That’s a violation in and of itself. And that is what is happening in some of these cases. So those are really the copyright infringement and intellectual property violations.
Katie Wilson-Milne: Can I ask you about that second theory? Because, we’ve had this with some clients too, and this does come up when someone is imitating an artist, but they’re not copying an exact work by the artist. They’re just mimicking the artist. And it’s so well done that you would think perhaps it’s by an artist, like by a famous artist, but it’s not. But unless the person making this imitation work is lying about who the author is, I don’t think copyright protects it. And I don’t know if anything in the law protects it. If it’s not fraud, if you’re not lying about who makes it, but it just looks like it’s by someone else. Artists do have this problem all the time, but style is not copyrightable. And this is a major issue in the visual arts.
Steve Schindler: That’s the output part. And that part is more complicated. As we know from prior discussions on the podcast, there are two ways to show copyright infringement. One is evidence of direct copying, and in some ways the scraping is that. I mean, it’s direct copying. And the second, more complicated one, if you don’t have that evidence, is showing what we call substantial similarity between the original work and the secondary work. There are legal tests for that, and they vary a little bit from jurisdiction to jurisdiction. But there you have to show some substantial similarity between the two works, and that of course can be done.
So the Anderson case, this is the case that was filed by Sarah Anderson and other visual artists. It was filed as a class action complaint, as was the Kadrey and Sarah Silverman case. So the idea is that there are several named plaintiffs, but they’re really filing a complaint on behalf of many, many unnamed plaintiffs whose claims share some of the same characteristics. And then the idea is to try to get that class certified, and so then it’s a much more serious lawsuit. To some extent, damages are an issue here. But of course, if you have a case that has thousands of plaintiffs, the stakes get much higher even for companies like Stability AI. The basic idea is that Stable Diffusion, Stability AI’s product, is trained on billions of images scraped from the internet, and there are a couple of different intermediary companies involved along the way. Now, in October, the judge hearing this case dismissed most of the complaint. And again, the plaintiffs alleged copyright infringement, vicarious copyright infringement as we discussed, violation of the Digital Millennium Copyright Act, the right of publicity, which is a right in California, where the case was brought, meaning that if you’re a person of some renown, you have a right to benefit from any publicity using your name or likeness, and unfair competition. The judge dismissed most of the complaint in October with what we call leave to amend. And essentially the judge said the complaint that was filed was too bare bones for him to make any real decisions about these allegations. And one of the problems that the judge had in looking at the factual allegations is that there were not enough allegations about the registration of the copyrights that the plaintiffs owned. And as you said earlier, one of the prerequisites for suing is that you have to register your copyrights. Not a big deal.
But also there was a lack of clarity about how each of these defendants along the way violated any single individual copyright owner’s rights, given the fact that there are billions of these images sort of floating around and a few different companies along the way. And the judge just said, you haven’t said enough about how all this works.
Katie Wilson-Milne: They weren’t specific enough.
Steve Schindler: They were not specific enough. And so the result of that was a complaint that was literally just filed a few weeks ago, November 20th.
Katie Wilson-Milne: You mean an amended complaint?
Steve Schindler: An amended complaint, basically taking the judge’s decision to heart and filing what is an almost hundred-page complaint. And this complaint, which I will post in the show notes, is recommended reading, I think, for anyone. It does a masterful job of tracking these individual works through the process, showing how they’re manipulated and how they’re associated with these individual artists, and how, at the end of it, people can actually make a request for a work of art that resembles the work of, say, Sarah Anderson, and what comes out is not really distinguishable from her original work. And the other thing they go through in excruciating detail is the connection between the visual works of art and, say, the titles that exist along with those visual works on the internet, and how important it is that those two things are lifted together. And then how the intervening programming uses that information, couples that information, to create the desired output. And it’s fascinating. The complaint has color illustrations in it, and I would say that the plaintiffs’ lawyers here have moved the ball considerably forward. It’s hard to imagine this complaint being dismissed for the same lack of clarity.
Katie Wilson-Milne: Yeah, I mean it has detail in it that you would normally only get in discovery, so it’s kind of—
Steve Schindler: Right. No, clearly, they have retained experts. There’s very technical information, coding information; it’s a very detailed complaint, which I think is considerably stronger than what’s out there. And the same thing happened with the complaint that was filed by the group of writers, including Sarah Silverman, which was dismissed in the Northern District of California, by a different judge, for the same types of pleading deficiencies. It’s a very thin complaint; it speaks in generalities. That was dismissed on November 20th, and I think it will now be up to the plaintiffs in that class action to go back and do something similar to what they did in the Anderson case.
Katie Wilson-Milne: The claims in the writers’ infringement cases are the same. They’re about infringement for scraping and infringement for use in the output.
Steve Schindler: It’s all the same. It’s basically copyright infringement, vicarious copyright infringement, which is the output, removal of copyright management information, and in this case, unfair competition. And it’s the same type of thing: there are these large language models, which are trained by feeding on massive amounts of text that have been scraped. You would then say to the system, produce a comedy routine that resembles one of the plaintiffs’. But again, the complaint in that case was very conclusory, and I think it was dismissed for the same types of deficiencies as the Anderson case. The defendant in that case is Meta, which is a different defendant than Stability AI, because we’re dealing with text and not images.
Katie Wilson-Milne: I think these cases are rare in the sense that they present so many novel issues of law, such that they will necessarily be critical to our understanding of at least infringement. I think copyrightability is a little different. But even the scraping raises questions. Like, how long is too long to hold a copy? Is it really infringement if a computer system copies an image for a few seconds and that image never gets shown to anyone in the world? There’s arguably no damage. I mean, they could argue some damage, but the copy is never shown to anyone; a computer system just holds it for a few seconds. That goes to what copying really is and the nature of infringement. And how many seconds is too long?
I think we’ve had some cases speak to this. I don’t remember where it is, but I feel like there’s a case that does make some comment about a few seconds not being enough. But we don’t know. And I think the output question, which is the harder one, at least from the artist’s perspective, is that it’s agonizing for an artist to see their style being ripped off, to see someone deliberately imitating them. But because style is not protected, I think these claims are really, really difficult.
The question is, is it a derivative work or is it just a new work that has a similar style, in which case there’s no protection. And if it’s a derivative work, there’s protection. But that really shows that you used that one copyrighted original work as the basis for your new work rather than just picking up a style.
Steve Schindler: Right. Well, there are two points embedded in that that I want to address. One is that you make the central point about whether or not the new work is just derivative or whether it’s fair use. And as we know, the cases have not gone far enough along at this point to reach what is certainly going to be one of the principal defenses once you get past the technical pleading and jurisdictional issues, which is whether or not the output is fair use. And then of course—
Katie Wilson-Milne: Even if it is infringing.
Steve Schindler: Even if it is infringing, we know that there’s an affirmative defense of fair use, which has undergone some changes, radical or otherwise. As we know, over the last year the Warhol case was decided, and it seemed to trim the sails a little bit, certainly with respect to appropriation art and whether or not just the manipulation of an existing work of art, without more, is sufficient. And so that may play out a little bit here when we see it raised as a defense by some of these defendants. And the second is—
Katie Wilson-Milne: It’s going to be a harder argument now, right? Because—
Steve Schindler: It’ll be a harder argument. Yeah, it would’ve been an easier argument before; I think harder now. And obviously this is all commercial. Interestingly, in these structures there are a number of companies, and usually the company that is scraping the images is a non-profit type of company. It’s not charging anything. It sort of scrapes these images and technically, usually, says it’s open source, but it’s not going to provide the images to anyone who’s going to use them for commercial benefit.
They have these kind of academic strictures around it. But where the money is made here is basically on the tools that are used. Even after the training programs are deployed, they’re still not very user friendly. The Anderson complaint goes to great lengths to describe the source of the training data sets, which is something called LAION, L-A-I-O-N, an acronym for Large-scale Artificial Intelligence Open Network. LAION is an organization based in Hamburg, Germany, and its stated goal is to make large-scale machine learning models, data sets, and related code available to the general public. All of its projects are made available for free, and that’s where the scraping takes place.
Katie Wilson-Milne: So some of that material is copyrighted?
Steve Schindler: Correct, and used without permission. And of course, again, there is some descriptive information about what LAION says in making this available. Now, according to the complaint, Stability funded LAION’s creation. So there are interconnections between all of these companies, but the money is ultimately made in the keys and licenses that are given out for use. Even the training programs, which are also available, are not easily manipulated, and if you want access to the training programs in a way that’s useful, then you have to pay a fee. That’s where the money gets made. Again, I would refer our listeners to the amended complaint in the Anderson case for a very detailed description of how the structure of this particular enterprise, with Stability AI at the top of it, is organized. And I think there are parallels in the case involving the writers as well.
Katie Wilson-Milne: What’s the explanation of the infringement that occurs in the output stage? The machine is learning off of the images it scraped, and then a user puts in a prompt. It uses that learning and the algorithms to pull from whatever images it’s scraped that are relevant, but it spits out something new, right? So what is the argument that that act is infringing?
Steve Schindler: The allegation is it’s not really spitting out anything new, it’s taking existing images and then manipulating them internally, and then it spits out something that looks very much like the inputted image.
Katie Wilson-Milne: And so is what you’re saying that it’s not that the AI system is learning how to draw in the style of Sarah Anderson and then imitating her, which is maybe not infringement; you’re saying it’s actually copying so many of her existing works that it’s taking those works and actually using them to create?
Steve Schindler: That’s what the allegation of the complaint is.
Katie Wilson-Milne: That’s an important distinction.
Steve Schindler: And also, again, the initial premise is: is there such a thing as artificial intelligence, or is what we’re really talking about just the wholesale inputting of a lot of people’s creative output into a process that allows you to then access that creative output in a different way? And that, I think, is what the cases are going to turn on. And it’s going to be very technical—
Katie Wilson-Milne: So the Getty Images case though is quite different. Which is interesting, because Getty Images isn’t suing as an artist or a creator, although they do create some of their own work. But tell us about that case.
Steve Schindler: Yeah, so that’s, I think, a really interesting case. There are really two cases brought by Getty Images. One was brought in the district court in Delaware. This is Getty Images v. Stability AI and some others, but it’s mostly focused on Stability AI. The case in Delaware is also for copyright infringement, removal and alteration of copyright management information, providing false copyright management information, and unfair competition, both under the Lanham Act, which is the federal trademark law, and Delaware state law. And Getty then brought a similar claim in the UK, which is moving forward there as well. One of the many procedural issues in these cases is who has jurisdiction: where is the infringement actually happening, particularly when you have these different entities that are incorporated and exist in different places, doing different things? So there are some overarching issues of where these cases should properly be brought. And one of the arguments that Stability AI has made in trying to dismiss the Delaware case is that Delaware has no connection to the case and it should be dismissed on jurisdictional grounds and some other technical grounds. But I think what’s interesting about the Getty Images case, as you’ve suggested, is that Getty Images is in a very similar business to what Stability AI is doing. Getty Images has, over the years, developed an enormous database of mostly copyrighted works that it licenses. And its business is, if you go online and you’re trying to get a particular image, you go to their website, you see it, and you have to pay money to license it for whatever it is that you want to do. Now, one of the things that Getty Images does, as other companies do, in order to prevent people from taking the images and using them improperly, is it puts information across the image.
Katie Wilson-Milne: Copyright information.
Steve Schindler: Copyright information, so you can’t easily just use it for your own purposes. And so one of the allegations in the Getty Images case is that in some cases their copyright information is removed, and in other cases it wasn’t removed but was altered slightly. You can see it in a couple of the illustrations in the complaint. There’s one of a couple of soccer players, and instead of the Getty Images watermark, you see something like the Getty Images watermark, but swirled around a little bit. So it’s actually very good proof that they took that very image. So one of the claims is that they’ve removed copyright management information, and another claim is that you’re not allowed to provide false copyright management information as if it’s legitimate. The thrust of the complaint is that Stability AI copied more than 12 million photographs from Getty Images’ collection, along with captions and metadata, and then, without permission or compensation, used them to build a competing business. And that’s a very different kind of complaint. Getty very meticulously maintains its information about its sources and copyrights, and it has a licensing business that can be affected by what Stability AI is doing.
Katie Wilson-Milne: That fact seems to be the most important for a fair use analysis. Fair use is much less likely if someone’s running a directly competing business, especially after Warhol, where we’re looking at each use and comparing it to the use of the original. But it also matters for trademark, because copyright law doesn’t really care about whether you’re competing with someone else, except in the fair use analysis, where some market factors do come into play. But trademark really does. And so it seems like one of the big claims, although we’re not focusing on it in all these cases, is that the trademark claims may be more successful, at least on this output infringement question, than the copyright claims.
Steve Schindler: Because there’s also the claim under the Lanham Act, in which, by providing an image that, let’s say, says Getty Images on it but that you’ve manipulated, you’re essentially deceiving the public. You’re creating a situation where you’re falsely suggesting what the origin is.
Katie Wilson-Milne: But even the artist cases, if I see a work that looks exactly like the style of an artist, that’s source confusion, right?
Steve Schindler: That is source confusion too, right.
Katie Wilson-Milne: It’s not maybe copyright for the reasons we discussed. I mean, maybe it is, but I think that’s a hard case. But there might be some trademark or unfair competition issues.
Steve Schindler: Yeah. For sure. I mean, we’ve been talking all about copyright, but there are these other claims out there and maybe some of those better fit some of these allegations.
Katie Wilson-Milne: And as you said already, and as we’ve said before, whether or not copyright protects the owners of the underlying information that’s being scraped and then used to create the generative AI output, it could still be extremely damaging for the owners of that information, mainly the artists who created the source material. I mean, they make their money off of the unique output that they create with their hands. And if some machine can do that, can spit out thousands and thousands of images in a minute, how are they going to get paid for that? Right? Like, why would anyone pay them?
Steve Schindler: No, that’s— that is their argument.
Katie Wilson-Milne: And it’s not just the indignity of something that looks like their work being out there, which is a problem in and of itself, but there’s real potential financial damage.
Steve Schindler: So there are apparently some technological tools that are being developed and rolled out by creative types to sort of poison the images that they put online. I think there’s a product called Nightshade that we’ve read about, where basically they’ll embed into the images that they put online this poisonous programming. So they may have an image of a dog, but it’ll be programmed to read as a cat. And so if somebody is using—
Katie Wilson-Milne: Very clever.
Steve Schindler: —using the product and asks, give me a dog that is begging for food, then instead a cat shows up. And that obviously has serious implications for all of these AI platforms. And so I’m sure we’re going to see more technological innovations on both sides.
Katie Wilson-Milne: Yeah. And that’s a great example of you don’t always need the law. Like the law is not always going to provide the mechanisms. And sometimes you don’t want it to. And that’s a great example of technology itself providing some solutions to this problem. Anyway, stay tuned.
Steve Schindler: Was that a roundup or a wrap up?
Katie Wilson-Milne: It was a very limited roundup on one issue, but a critical issue and we’ve neglected it a bit. So we’ll definitely follow these issues and follow up next year.
Steve Schindler: All right. See you in 2024. And that’s it for today’s podcast. Please subscribe to us wherever you get your podcasts, and send us feedback at firstname.lastname@example.org. And if you like what you hear, give us a five-star rating. We are also featuring the original music of Chris Thompson. And finally, we want to thank our fabulous producer Jackie Santos, for making us sound so good.
Katie Wilson-Milne: Until next time, I’m Katie Wilson-Milne.
Steve Schindler: And I’m Steve Schindler, bringing you the Art Law Podcast, a podcast exploring the places where art intersects with and interferes with the law.
Katie Wilson-Milne: The information provided in this podcast is not intended to be a source of legal advice. You should not consider the information provided to be an invitation for an attorney-client relationship. You should not rely on the information as legal advice for any purpose, and should always seek the legal advice of competent counsel in the relevant jurisdiction.
Music by Chris Thompson. Produced by Jackie Santos.