Google Print

October 28, 2005


Neil J. Rosini, Michael I. Rudell

Suits by Authors and Publishers Against Google Raise Fair Use Questions

Last month, the Authors Guild and three individual authors commenced proceedings in the Southern District of New York against Google, Inc. for copyright infringement. They were joined last week by five large publishing companies in a separate action brought against Google, also in the Southern District. The plaintiffs in both suits target part of a service called Google Print, which would allow users to search the texts of library books online. The Authors Guild complaint seeks damages, injunctive relief, and class action certification that would embrace most of those copyright owners whose still-protected works are contained in the library of the University of Michigan. The publishing companies seek a declaration that Google’s activities infringe their copyrights as well as injunctive relief.

As Google explains at its website, the content of Google Print will come from two sources: publishers and libraries. Under the “Publisher Program,” which is not in contention, copyright holders who choose to participate will authorize Google to make their books searchable in Google’s database. The online user will see the page of the book containing the search terms, a few adjacent pages, some bibliographic information, and a link to online booksellers. Participating publishers also would share in advertising revenue if they permit ads to be displayed on their pages. Because the content for this “Publisher Program” will be licensed to Google, it raises no copyright issues.

The Google Print “Library Project” is the one in contention. Google has partnered with the libraries of the University of Michigan, Stanford, Harvard, Oxford and the New York Public Library to scan books from their collections into its database and make them searchable. Public domain books from these sources will be viewable at Google in their entirety. (This part raises no copyright infringement issue.) Among the libraries, at least the University of Michigan is also permitting Google to scan copyright-protected works from its collection without first obtaining authorization from the copyright owners. In contrast to the public domain works, these copyrighted books will be accessible only in “brief snippets” containing the search terms and several lines of text before and after. Bibliographic information and direct links to booksellers will be furnished, too. Although Google allows authors and other copyright holders to opt out of the Library Project, it plans to reproduce in its database all copyrighted books for which no opt-out is elected. These two functions – reproduction in the database and display online of snippets of copyrighted works without permission – are challenged by both lawsuits.

Google’s position is that the use it will make of the copyrighted books scanned through its Library Project will be a fair use consistent with “principles underlying copyright law itself.” Specifically, Google says that copyright law “has always been about ensuring the authors will continue to write books and publishers continue to sell them,” and that its service will increase the incentives for authors to write and publishers to sell books by making them “easier to find, buy and borrow from libraries.” According to Google, “backlist, out of print and lightly marketed new titles will be suggested to countless readers who wouldn’t have found them otherwise.” The company defends its making of digital copies of copyrighted books for its database as a necessary step to achieve this end, just as its search engines must make digital copies of vast numbers of web pages without obtaining an express license for online search services.

In their complaint, the plaintiffs in the Authors Guild action declare that by “reproducing for itself a copy of those works that are not in the public domain … Google is engaging in massive copyright infringement.” They plead that Google’s retention of a digital copy of the library archives for its own commercial use, as well as its announced plans to reproduce and display the works all without the authorization of the copyright holders, infringes the exclusive rights of the plaintiffs and the proposed class. The publishers plead in their complaint that Google’s “infringements are likely to usurp [the publishers’] present and future business relationships and opportunities for the digital copying, archiving, searching and public display of their works” and adversely impact the potential market for those books. From the perspective of the plaintiffs in both actions, Google’s opt-out election inverts the rights of copyright holders, who possess the exclusive right to grant or deny a license in the first instance.

The plaintiffs in both actions also point toward anticipated consumer traffic and advertising revenues that Google will derive from the use. And at its website, the Authors Guild further emphasizes that “Google is a commercial, not a charitable, enterprise” that is “digitizing countless texts” with no compensation to authors. Instead, the Authors Guild wants to make “far more than ‘snippets’” available to users through services that first obtain a license.

To support their positions, Google and the Authors Guild can each point to prior legal authority that deals with unlicensed archiving by online services. Google would hope for the same result reached by the Ninth Circuit in Kelly v. Arriba Soft Corp., while the plaintiffs would prefer a decision like UMG Recordings Inc. v. MP3.com, Inc. from the Southern District of New York. Each decision is worth reviewing because, taken together, they frame the debate.

In Kelly, the court considered a search engine that created a searchable database of images available online; the plaintiff objected to the inclusion of his copyrighted work without permission. The plaintiff, a professional photographer, displayed his images of the American West at his own website and others. The defendant’s computer program “crawled” the web looking for images to include in its searchable database and when it found one, downloaded a full-sized copy of the image onto its server. Only low-resolution “thumbnails” of these images could be called up by users of the database, and once the thumbnails were created, the program deleted the full-sized originals from the server. When Arriba learned of Kelly’s objection, it deleted thumbnails of his images and placed the sites on an off-limits list. Kelly sought damages for Arriba’s copying and display prior to the deletion, and the defendant, like Google, relied on the defense of fair use.

The court analyzed the defense using the four-part balancing test set out in Section 107 of the Copyright Act. The first factor, “purpose and character of the use,” weighed in favor of the defendant because of the “public benefit of the search engine and the minimal loss of integrity to Kelly’s images.” The court reasoned that Arriba’s use was “transformative” in that the small low-resolution images were there to improve access to information on the Internet, not for any “aesthetic purpose,” and therefore served an entirely different function than the original. And though commercial, the use was not “highly exploitative” because Kelly’s images were among thousands and the defendant did not try to profit by selling the images or use them directly to promote its web site.

The second factor, “the nature of the copyrighted work,” weighed slightly in favor of the plaintiff because Kelly’s photos were creative and therefore “closer to the core of intended copyright protection” than fact-based works. The third factor, “the amount and substantiality of the portion used” in relation to the copyrighted work as a whole, weighed in favor of neither party because the defendant’s copies of the images as a whole were reasonable in light of its use; copying only part of the image would have reduced the usefulness of the visual search engine. The last factor, “the effect of the use upon the potential market for or value of the copyrighted work,” weighed in favor of the defendant. The defendant’s use did not harm the market for the plaintiff’s images or the value of them and the defendant’s search engine guided users to the plaintiff’s website where full-sized images could be found. Accordingly, use of the plaintiff’s images in the thumbnails did not harm Kelly’s ability to sell or license his full-sized images. In total, two factors weighed in favor of the defendant, one was neutral, and one weighed slightly in favor of the plaintiff; Arriba’s use was found to be a fair one.

Google can point to the similarities between its own services and Arriba’s. The creation of the database is unlicensed in both cases and copyright owners receive no payment for inclusion. The works of copyright owners are included in their entirety, though each copyrighted work represents a very small part of each database. Both services are commercial in nature, but neither seeks to sell rights in the constituent works. Both offer copyright owners an opt-out choice but do not wait for an opt-in election before copying begins. Both would tend to bring published works, and places to buy them, to the attention of database users. Perhaps most importantly, both offer the public access to a new index of information that might be called “transformative.” And users of both services can access only a much-diminished version of the original that would not supplant demand for that original. On the other hand, Arriba destroyed the high-resolution original after creating a low-resolution replacement, while Google will keep the entire works in its database at all times. Moreover, Arriba’s service would be far less interesting to average members of the public and probably prove much less profitable than Google’s over time.

For their part, the plaintiffs likely will rely on Judge Rakoff’s decision in MP3.com, which begins, “The complex marvels of cyberspatial communication may create difficult legal issues; but not in this case.” The defendant’s service enabled subscribers to store, customize and listen to the recordings contained on their CDs from any internet connection. To make this work, the defendant purchased tens of thousands of popular CDs in which the plaintiffs held the copyrights and, without authorization, copied them onto its database so as to be able to replay the recordings for subscribers. Not just anyone could access the database; a subscriber first had to “prove” that he already owned the CD version of the recording by inserting his copy in his computer’s CD-ROM drive or by purchasing the CD from one of the defendant’s cooperating online retailers. Thereafter the subscriber could access that CD on the database from any location, which the defendant tried to portray as the “functional equivalent” of storing its subscribers’ own CDs. The court found, however, that MP3.com infringed by “re-playing for the subscribers converted versions of the recordings it copied, without authorization, from plaintiff’s copyrighted CDs.” It then proceeded to consider and reject the defendant’s fair use defense by applying the same four-factor test used in Kelly.

The court found that the first factor — “the purpose and character of the use” — weighed in favor of the plaintiffs for two reasons. First, the commercial nature of the use was undisputed. Second, re-playing the unauthorized copies in a different medium did not amount to a transformative use even if it saved subscribers the trouble of “lugging around the physical discs themselves.” The second factor — “the nature of the copyrighted work” — also weighed in favor of the plaintiffs because the recordings were creative and close to the core of intended copyright protection. The court found that the third factor — “the amount and substantiality of the portion used” — also favored the plaintiffs because the defendant copied and played the entirety of the copyrighted works. Finally, the fourth factor — “the effect of the use upon the potential market for or value of the copyrighted work” — also was found to favor the plaintiffs because any “allegedly positive impact of defendant’s activities on plaintiffs’ prior market in no way frees defendant to usurp a further market that directly derives from reproduction of the plaintiffs’ copyrighted works.” Moreover, the plaintiffs did not object in principle to licensing their recordings for such uses as long as they got “the remuneration the law reserves for them as holders of copyrights on creative works.”

The plaintiffs in both copyright infringement actions against Google will be encouraged by this decision for a number of reasons. Both the Google service and the MP3.com service are commercial and designed to attract large numbers of users for purposes of profit, without compensating authors. Both services would create databases without authorization (although MP3.com was not said to offer an “opt-out” choice). Both services would copy a large number of copyrighted works and store them permanently while arguably leaving undiminished the market for the works in their original forms.

But the MP3.com case can be distinguished on a number of grounds. First, the Google service offers unprecedented support for academic research and its credentials as a “transformative” use that benefits the public are commensurately more impressive; MP3.com merely aimed to make musical entertainment more convenient. Second, MP3.com not only created a database filled with entire copyrighted works but also made those works available to users in their entirety, unlike Google. This difference between presenting users with several lines of text from a book and presenting them with an entire musical recording is important for several reasons. For one, preserving for copyright owners the right to license access by users to entire recordings seems more apt than licensing access to “snippets” of books. Google’s use therefore seems less likely to “usurp” a licensing market of copyright owners. For another, quoting a few lines from a book, particularly for purposes of research or scholarship, is much closer to the “core” of fair use.

It also bears mentioning that the Copyright Act itself obliquely speaks to the creation of such databases in a limited way. For example, Section 112(a) allows for the unlicensed storage of certain copyrighted works embodied in “transmission programs,” which amounts to a database of a sort. But the permission is tightly circumscribed, limited for example to the transmitting organization’s own authorized transmissions within its “local service area” and an active life span of six months. This implies a considerable value to the control by copyright owners over the licensing of third party archiving. Further, Section 512(b), added to the Copyright Act in 1998 through the Online Copyright Infringement Liability Limitation Act, sends a mixed message about the creation of another kind of unauthorized database, “system caching.” Caching refers to the copying and temporary storage of web site pages that Google and others might use to reduce congestion and speed up search results. Section 512(b) allows internet service providers an exemption from liability for copyright infringement when their cached databases store infringing material from third party websites. To qualify, the service providers must satisfy a list of conditions concerning the nature and use of the cache: for example, it must be “intermediate” and “temporary” and subject to certain requirements of the third party website for “refreshing, reloading or other updating” the site, and if a third party website has restricted access, the restrictions must be respected. But at the same time, the right of the service provider to copy the web site pages into the database without authorization in the first place goes unquestioned.

Resolution of the Google cases might ultimately turn on the court’s assessment of the value of Google’s database. Will it regard the database copies as mere stepping stones toward a legitimate fair use, or valuable uses in and of themselves that authors should have the right to control, and license for compensation? The answer could have broad implications not only for Google and the plaintiffs in these actions but also for the economic model of search engines on the web. Will content owners be compensated solely through the advertising, promotion and sale of their works that are located through “unauthorized” databases? Or, should they be compensated for the inclusion of their works in the search function of the database itself, as an ancillary revenue stream? The answer will take considerably longer to arrive than a typical Google search.


1 336 F.3d 811 (9th Cir. 2003).

2 92 F. Supp.2d 349 (S.D.N.Y. 2000).