The internet engineer and entrepreneur Brewster Kahle took a shot at the book publishing industry a few weeks ago by pointing out something well-known to technologists but unappreciated by the general public: that ebooks and other digital artifacts have shorter lifespans than their physical counterparts.
"Our paper books have lasted hundreds of years on our shelves and are still readable," Kahle observed in a post on the website of the Internet Archive, the invaluable historical repository of old web pages and other digital artifacts that he founded in 1996. "Without active maintenance, we will be lucky if our digital books last a decade."
It may be misleading to say that Kahle took a shot at the publishers. More accurately, he took another shot at them. That's because for more than two years Kahle has been embroiled in a bitter court fight with the industry over his effort to make digital copies of copyrighted books and lend them out for free.
Kahle says he's just doing what public libraries do. The publishers who have sued the Internet Archive in federal court in New York — Hachette Book Group, HarperCollins, John Wiley & Sons and Penguin Random House — have a different take.
They say the Archive is engaged in "willful digital piracy on an industrial scale." (HarperCollins is my book publisher.)
What's really happening here is that everyone involved — publishers, online distributors, authors and readers — is trying to come to terms with the capacity of digital technology to overthrow the traditional models of printing, selling and buying readable content.
Publishers and authors are predictably, and rightly, fearful that they'll lose out financially; but it's also quite possible that, properly managed, the technological revolution will make them more money.
To see how this may unfold, let's start with some fundamentals of digitization. Kahle's recent post is a good jumping-off point.
New technologies allow us to convert what's on the printed page into bits and bytes readable by computer. The process can reproduce a printed page exactly, or only the text. Some content begins as a computer file produced by a writer at a keyboard, which can then be used to produce a bound book.
The product can be an ebook, which can appear on the screen exactly like its paper analog, or can provide only the text in a nearly infinite variety of formats.
As consumer products, ebooks began to make their mark with Amazon's introduction of its first Kindle e-reader in 2007.
Since then e-formats have proliferated, as have methods for reading them — dedicated devices, web browsers and apps, smartphones and tablets. What hasn't changed is the turmoil that the digitization of reading material has produced for publishers and libraries.
That leads us to Kahle's point. It's tempting to regard digital content as eternal, and in some respects it may be — it doesn't degrade as it's recopied, unlike recordings made from masters. On the other hand, as Kahle observed, it's vulnerable to becoming technologically outdated. A digital file produced in one technical format may be unreadable in another; the devices made to read the first version may become obsolete, leaving no way to read content produced for them.
That process can happen with unexpected speed. I have a CD-ROM set of every issue of The New Yorker that can't be read today on my Apple computer, because it was formatted for a Windows operating system that's incompatible with my desktop and that Microsoft doesn't even make anymore. (The New Yorker now provides the same archive via the web, but it's available only by subscription, not a one-time purchase.)
By contrast, physical books can survive for centuries, through floods, droughts, heatwaves and deep freezes, and handling by hundreds of readers.
Nevertheless, publishers and librarians persist in thinking of books as perishable and digital files as eternal.
This fundamental error, which prompted libraries all over the world to discard their precious collections of actual books and periodicals in favor of digital facsimiles, was deemed "absolute nonsense" by the novelist Nicholson Baker in his passionate and meticulously researched 2001 exposé "Double Fold."
The ability to make identical copies of printed materials by digital scanning has been a boon for the cause of distributing the accumulated wisdom of the ages, but also a headache for contemporary publishers.
Digital archives of works that have outlived their copyrights make it easy for researchers to access older material; I'm a devoted user of the HathiTrust Digital Library, an enormous archive that was founded in 2008 by the University of California and other major institutions, built from digitized versions of volumes in their libraries.
But publishers and distributors, fearing that the ability to easily create identical digital copies of their products would open the door to unlimited piracy and copyright infringements, have imposed unprecedented restrictions on ownership rights of ebooks.
The industry argues that the Internet Archive's lending library is exactly what it is trying to fight. Kahle says he initially founded the Archive's Open Library to provide free online access to millions of public domain books that had been digitally scanned by the Archive and a consortium of other institutions.
Eventually the Open Library included some copyrighted books in its stacks. Kahle said in a court declaration that the Archive generally has made its digital scans from print books it owns and that it avoids lending out books less than 5 years old in order to steer clear of contemporary bestsellers as an "accommodation to publishers."
The Archive maintains that allowing those books each to be borrowed by one user at a time and for limited periods for free like library books, a system it calls "controlled digital lending," falls within the "fair use" exception to U.S. copyright law, which allows copies of books or excerpts to be made for research or artistic purposes.
The publishers maintain that the Open Library's unauthorized scans of copyrighted books step over the fair-use line.
"What's clearly not allowed is the type of systematic, broad-brush copying and public distribution of huge swaths of copyrighted work that Open Library is doing," says Terry Hart, general counsel of the Assn. of American Publishers.
Authors and publishers are afraid that efforts like that of the Open Library will cannibalize commercial, revenue-producing markets, sapping incentives to create and publish.
The ebook market, however, arguably has given publishers greater control over the dissemination of their products than they have had in the past.
In most cases, consumers don't own their ebooks — they've acquired only a limited license for their money. Although Amazon customers click on a button reading "Buy Now" to acquire a Kindle ebook, the license terms make clear that "buyers" don't get all the rights of conventional book owners.
They can't sell their ebooks and in many cases can't loan them to friends, as they can do with physical books. They can only read their ebooks on an Amazon device or Amazon applications. If they try to evade those terms, Amazon's proprietary digital rights management software embedded in the ebook will prevent them from doing so.
(In a notorious 2009 episode, Amazon remotely deleted ebooks of George Orwell's "1984" and "Animal Farm" from customers' libraries without their permission after discovering the publisher lacked rights to the titles in the U.S. Following an uproar, the company restored the books and pledged not to take such a step again.)
In 2019, Macmillan Publishers announced that it would refuse to sell libraries more than one ebook copy of any new title for the first eight weeks after its publication date. Other major publishers have imposed their own restrictions on ebooks for libraries, including changing perpetual rights to one- or two-year terms and paid renewals.
Macmillan lifted its embargo in March 2020 during the pandemic. But by then the restrictions had motivated some state legislators to consider laws requiring that publishers license ebooks to libraries on the same terms as those offered to consumers. One such law passed in Maryland was overturned in June by a federal judge who ruled that the law was preempted by federal copyright laws; a similar law passed in New York was vetoed by Gov. Kathy Hochul.
Kahle frankly acknowledges that one motivation of the Archive's lending program is to challenge the publishers' control of reading content.
The publishers "would like to force libraries and their patrons into a world in which books can only be accessed, never owned, and in which availability is subject to the rightsholders' whim," the Archive said in its response to the publishers' lawsuit.
"We want an ebook to be a book," Kahle told me. "When you buy an ebook you should have the same rights as a buyer in the physical world. In the same way that people and libraries bought books in the past, they should be able to buy ebooks."
Kahle is right about that. By downplaying the terms of sale, ebook purveyors such as Apple and Amazon are plainly misleading their customers, law professors Aaron Perzanowski of the University of Michigan and Chris Hoofnagle of UC Berkeley wrote in 2016. "Sales of digital media generate hundreds of billions in revenue, and some percentage of this revenue is based on deception," they wrote.
As digital books mature from a novel technology into a quotidian one, there is no reason why the rights conferred by ownership should be materially different from those that come with a book one can hold in one's hands. None, except that publishers and distributors have been able to get away with quietly shrinking those rights.
Perzanowski and Hoofnagle called on the Federal Trade Commission to force publishers to "align business practices with consumer perceptions." That still hasn't happened. Until it does, insurgents such as Kahle will have an incentive to test the limits of copyright law by taking it into their own hands.
That's not good for anybody.