Smart Screen Exclusive

Court Ruling in Google Books A Win For Metadata Automation

The U.S. Second Circuit Court of Appeals on Friday handed Google a major victory in its long-running legal fight with the Authors Guild over Google’s massive book-scanning project.

A three-judge panel ruled unanimously that Google’s scanning and digitizing of entire books to make their contents searchable qualifies as a fair use under copyright law, as does making snippets of the texts available in response to search queries.

“[W]e conclude that ... Google’s unauthorized digitizing of copyright-protected works, creation of a search functionality, and display of snippets from those works are non-infringing fair uses,” Judge Pierre Leval wrote for the court. “The purpose of the copying is highly transformative, the public display of text is limited, and the revelations do not provide a significant market substitute for the protected aspects of the originals. Google’s commercial nature and profit motivation do not justify denial of fair use.”

The ruling is certain to be a landmark in fair use and digital copyright law.

But the court’s framing of its ruling could also prove a boon to the use of automated tools to parse and analyze libraries of digital content and to extract metadata about that content.

According to the court, information about a piece of content is outside the scope of the author’s copyright, and deriving and compiling such metadata, even without a license, does not infringe that copyright.

“Google’s making of a digital copy to provide a search function is a transformative use, which augments public knowledge by making available information about [sic] Plaintiffs’ books without providing the public with a substantial substitute for matter protected by the Plaintiffs’ copyright interests in the original works or derivatives of them,” Leval wrote. “The same is true, at least under present conditions, of Google’s provision of the snippet function. Plaintiffs’ contention that Google has usurped their opportunity to access paid and unpaid licensing markets for substantially the same functions that Google provides fails, in part because the licensing markets in fact involve very different functions than those that Google provides, and in part because an author’s derivative rights do not include an exclusive right to supply information (of the sort provided by Google) about her works.”

Google began scanning print books into machine-readable digital form in 2004, through partnerships with a number of leading research libraries, to make their contents searchable. Google also retains a copy of the original digital image, in part so that it can improve the accuracy of the machine-readable texts as image-to-text conversion technologies improve. Since 2004 it has scanned more than 20 million books, including copyrighted works as well as those in the public domain.

Google then provided the digital archives to the participating libraries and created the Google Books search engine that displays snippets of text from scanned books in response to queries. 

The Authors Guild, along with the Association of American Publishers, sued in 2005, charging Google with copyright infringement for reproducing and displaying the works without a license. After several years of often tortuous negotiation among the parties and the trial court, the AAP eventually settled with Google. The Authors Guild persisted, however, and the district court ultimately ruled in favor of Google. The Guild appealed that decision, leading to Friday’s ruling.

The case drew widespread attention, including from the Motion Picture Association of America, which filed an amicus brief in support of the Guild.

In a statement issued immediately after the ruling was handed down, the Authors Guild said it would appeal the decision to the Supreme Court. But if Friday’s decision is ultimately upheld, it could open the door to other mass-digitization projects and to indexing and analyzing digital archives to extract information about their content.

The full Second Circuit opinion is here