The Justice Department has launched an investigation into whether Google is violating antitrust laws by reaching an agreement with authors and publishers to digitize millions of printed books and post the contents online. We speak to Brewster Kahle, founder of the non-profit internet library Archive.org. He’s among critics warning Google could end up with a monopoly of access to information and exclusive license to profit from millions of books.
This is a rush transcript. Copy may not be in its final form.
AMY GOODMAN: The Justice Department has launched an investigation into whether Google is violating antitrust laws by reaching an agreement with authors and publishers to digitize millions of printed books and post the contents online.
Over the past five years, Google has partnered with some of the world’s most famous research libraries to scan over seven million books. In 2005, the Authors Guild and the Association of American Publishers filed lawsuits against Google challenging the company’s right to scan copyrighted material and make it searchable online. A $125 million settlement was reached last year, but it’s still awaiting court approval.
Google has defended its project, saying its goal is to improve access to books and to give the public access to millions of out-of-print books. But critics warn the settlement could give Google a monopoly of access to information and an exclusive license to profit from millions of books.
Well, recently, when I was in San Francisco, I interviewed Brewster Kahle, the founder of the non-profit online library, the Internet Archive, located at archive.org. The Internet Archive hosts an online text archive of over a million written books. Brewster Kahle is a prominent critic of Google’s book plan. We invited Google to join us but didn’t receive a response. I began by asking Brewster Kahle to outline his concerns.
BREWSTER KAHLE: This is about Google scanning books, and they’ve been scanning books and libraries now for several years. And the out-of-copyright works are fabulous to go and scan and make publicly available. The significance of this case is about the twentieth century, the books in the twentieth century, almost all of which are under copyright but out of print. And this case will give Google, and Google alone, the explicit license to scan and make those available in the digital world. So, in effect, in this digital world, Google will be able to control the library.
AMY GOODMAN: How does a court case allow one company to have a monopoly on all the books of the twentieth century that are out of print?
BREWSTER KAHLE: That’s the really curious thing about this. It has to do with how class actions work. And it’s an unprecedented use, as best my lawyer friends say, of how to abuse class actions.
Class actions are usually done to address some kind of harm. If you’re a pharmaceutical company, and you make pills or something like that and give them out, and people get sick, then a set of lawyers come together and build up a class of people that have suffered some kind of harm and then try to get money to help out the people that were in this harm.
What Google did is, when there’s this lawsuit, the Authors Guild pulled together a class of everybody having to do with books in the twentieth century and called it a class. And instead of just —
AMY GOODMAN: What do you mean, “everybody having to do with books”?
BREWSTER KAHLE: They formed a class of anybody that had any rights associated with any books published during the twentieth century that are still under copyright, such that this class could then negotiate on behalf of all of those authors and all of those heirs, whether you can find them or not, to be able to negotiate with Google, because they scanned a lot of the books, which sounds fine, in the sense that if Google shouldn’t have done that, then maybe they should pay these people some money.
But what they did is go further than that and go and say, “Not only are we going to address past harm, we’re going to set up new structures for dealing with things in the future. We’re going to come up with a new copyright regime that allows Google to go and sell access to these works,” in this kind of bizarre new scheme that nobody had ever heard of, on a perpetual, going-forward basis. So class action usually tries to address past harm. Here, it’s setting up completely new copyright structures for going and dealing with things in the future. It’s unprecedented.
AMY GOODMAN: And what is this period that we’re talking about right now? Why is there a public comment allowed?
BREWSTER KAHLE: What’s going on right now is, this class-action settlement between the Authors Guild, the Association of American Publishers and Google is open to public comment or objections until about one month from now. And during this period, people can come forward and say, “This doesn’t make any sense.” And there are a growing number of libraries, law school professors, publishers, booksellers going and saying, “We don’t really want to build a monopoly here.”
AMY GOODMAN: So, explain exactly how it works. What does it mean to say that Google will have the sole access to the libraries of this country? What exactly are they doing?
BREWSTER KAHLE: What they’re doing is they’re digitizing books up a storm, so out-of-copyright works, which are works before 1923; in-copyright, but out-of-print works, which are the vast majority between 1923 and the present. Most books are out of print. And they’re even digitizing books that are in print. And they’re working with publishers to try to make sure that things that are in print, they can have in their search engine. In the out of copyright, it’s OK; there’s no rights issues there, they can make those available.
It’s those books in this limbo. It’s the books that are not commercially viable, things that are — make up the vast majority of our libraries, the books in our libraries, but they’re not saleable. The question is, who can do what with those in the digital age?
And there’s been orphan works legislation. These works are called “orphan works.” These are things that are — there is no real owner. There’s no one to speak for them. And there’s been attempts to deal with this through the legislature, and the libraries are starting to loan these books. But Google had a different idea. They thought they could go and digitize these works and, through this class-action settlement, get an explicit license to be a digital bookstore of these works.
AMY GOODMAN: To sell?
BREWSTER KAHLE: To sell and make subscription access, to build a subscription library that is the only library that has the ability to go and sell and subscribe access to these.
AMY GOODMAN: Brewster Kahle is founder of the Internet Archive, archive.org. We’ll come back to our conversation in a minute.
AMY GOODMAN: We return to my interview with Brewster Kahle, founder of the non-profit online library, the Internet Archive, located at archive.org, talking about Google’s plan to digitize millions of books.
AMY GOODMAN: Why would any library agree to give over their work to a private company?
BREWSTER KAHLE: It seemed like a good idea at the time.
AMY GOODMAN: Why?
BREWSTER KAHLE: Because Google was going to pay for the digitization of these books. And what they said originally is that they would — like a web search engine, they would go and index these books and then allow people to see bits and pieces, but direct people back to the libraries or direct people back to bookstores to be able to get them. What we now find through this suit is Google’s ambitions were far greater than just directing people back to where they came from; they wanted to be the library or the bookstore themselves.
AMY GOODMAN: So they will make money on these libraries?
BREWSTER KAHLE: Not only will they make money, they will be the sole organization to control access to these works.
AMY GOODMAN: What do you mean, “control access”?
BREWSTER KAHLE: Well, if they want those books to be available to people, they can have it in their search engine and rank it high. If books are things they don’t want to have available, I don’t know, for any reason that corporations might want to do that, they can take it effectively out of the library. If they get to be the library that the next generation grows up with, then they get to decide who has access to works, and if you happen to be reading a book, they’ll know about it.
AMY GOODMAN: Can you talk about the libraries who have made deals with Google and why they did?
BREWSTER KAHLE: It started out with five great libraries: New York Public; Oxford; Stanford University Libraries —
AMY GOODMAN: The Oxford Library in Britain.
BREWSTER KAHLE: Yes, in Britain.
AMY GOODMAN: Stanford in California.
BREWSTER KAHLE: California — University of California, the public library system — the public university system here in California; University of Michigan. And they allowed Google — they actually — they didn’t pay Google anything, but they did all of the work to go and hand over the books that were on their shelves to Google, so that Google could take photographs of the pages and then process those to make them readable on devices or on the net.
AMY GOODMAN: I don’t understand something. If you have the New York Public Library, the California Public Library, how does a public library that’s been supported with public funds have the right to even decide that a private corporation now will determine who gets access and who doesn’t, what book can be read and what book can’t?
BREWSTER KAHLE: I think it was the — the issue is, it didn’t seem like that at the time, that that was really what was going on. These libraries wanted to have digital access. It’s what everybody wants these days. And Google was coming along and saying, “Hey, we’ll do this, and we’ll do it for free. Oh, and we’ll give you back a digital copy that you can use yourselves.” It just turns out that if you look at these contracts, which were secretly negotiated — and actually, it took real process to try to get these out of the libraries, because they were under nondisclosure agreements. But what —
AMY GOODMAN: How did you get it out?
BREWSTER KAHLE: They were Freedom of Information Act requests, that basically they were demands on librarians. It’s now starting to take legal action to get answers from librarians. It’s getting a little sick out there.
But anyway, some of these contracts are now publicly available, and the restrictions are severe of what it is the libraries can do with the copies they get back. So they’re pretty much useless to the libraries.
AMY GOODMAN: Well, wait, because that is the argument that Google uses: “We make a digital copy, we invest money, and we give a copy back,” which is why a library would want to do this. They then have a fully digitized library at no cost. What are these restrictions you’re saying on the digitized copies of the books that Google gives back to the library?
BREWSTER KAHLE: Well, let’s take — there’s two sets. There’s the out-of-copyright, and there’s the in-copyright work. Let’s take the out of copyright, the stuff that’s really — it’s public domain, meaning belongs to the public. It’s lived long enough to become part of the public sphere. But there are perpetual restrictions that the libraries must perform, that if they get these digital copies back, they must put up restrictions on use, such that they cannot be accessible by the general public.
AMY GOODMAN: Who can they be accessed by?
BREWSTER KAHLE: People on campus can use them, for the out-of-copyright works, but just on campus. And otherwise, they have to put up restrictions. And what’s turning out is a lot of these libraries aren’t even bothering to get copies back, because what can they use them for? I mean, in the future, people are going to want to have access to as many books as possible. And what Google is doing is pulling these together for many libraries to build a great collection. Terrific. But the bits and pieces that are going back to these libraries don’t make up a great collection. And what they can do with them is very, very limited. So these libraries aren’t, in many cases, even bothering to get the digital copies back. So there really is no quid pro quo.
AMY GOODMAN: Can you talk about whether there is a transformation of consciousness going on among these library directors, who at first thought, “Great! Free digitization. We get the copies ourselves. We’ll decide what we do with our own copies,” and now?
BREWSTER KAHLE: Yes. It’s starting to really dawn on people that this wasn’t the deal that they thought. Or even if they thought it was this deal, it’s building a world where libraries, traditional libraries, don’t really have a future in this world.
Bob Darnton from Harvard has been very eloquent on this fact in a long piece in the New York Review of Books, that we’ve got a problem out there, that the idea of having a single corporation is a problem. There are law school professors that are starting to realize what this really could mean for the future of information access by having single corporate control. Harvard has filed a — law professors at Harvard have filed objections. The Internet Archive is filing objections. There is a growing consciousness that this is a problem. But it’s been done in such a clever way that there’s really very little avenue for coming back at this.
AMY GOODMAN: Let’s talk about the Internet Archive that you run, archive.org. Explain exactly what you’re doing now and what your vision is for libraries of the future. In fact, you’re, in a sense, a competitor with Google. You have been digitizing books also.
BREWSTER KAHLE: Yes. So, I work at the Internet Archive, and we work with about 2,000 libraries now.
AMY GOODMAN: But before you say the libraries, what does the Internet Archive do?
BREWSTER KAHLE: The Internet Archive is a non-profit library here in San Francisco, and we digitize books and make them available on the internet. We’re probably best known for the Wayback Machine, which is a collection of web pages, historical web pages, that we collect from all websites and make them freely accessible. We also collect —
AMY GOODMAN: You’re archiving the internet.
BREWSTER KAHLE: We’re archiving the whole World Wide Web. We take a snapshot every two months of every website of anywhere in the world, and we record all of the pages, so that you can be able to see the World Wide Web as it was. You could surf the web as it was. We have the out-of-print web pages.
The average life of a web page is about 100 days. So, if you wanted to see what it is some corporation or a government claimed before, if you go back to the website, it could be gone. And so, the Wayback Machine plays the role of a library in the digital realm to be able to make it so that accountability is there towards what it is people said in the past.
AMY GOODMAN: And so, you, like Google, are digitizing books. And you’ve done — they’ve done seven million; so far, you’ve done what? One-point-seven million?
BREWSTER KAHLE: We have about 1.3 million books on our website. We’ve actually gone and done the digitization of about a half-a-million books. We have scanning centers in eighteen libraries now, and those are scanning books at about 1,000 books a day. So we’re in the Library of Congress. We’re scanning books in the Boston Public Library. We’re in Scotland and Guatemala digitizing books.
AMY GOODMAN: And what happens when you approach, say, the Harvard Library, when you approach, say, the Oxford Library? Do they say they cannot, you cannot digitize the books, because Google is?
BREWSTER KAHLE: Actually, there’s no restriction in the Google settlement or Google contracts that say they can’t deal with the open world, but in practice they won’t. So, the University of California, which we were working with in scanning their books, once they signed this agreement with Google — and we found out that they had been negotiating in, but they couldn’t tell us about it — but once that happened, within days, they said any books that they’re ever going to give to Google, they will not give to the Internet Archive to scan.
New York Public Library also has one of the fantastic library collections in the world, and they committed to Google to go and give access to that research collection.
AMY GOODMAN: Sole access?
BREWSTER KAHLE: And what turns out is sole access. It’s not legally required that they not give it to anybody else, but in practice, they said they will not. Columbia University, as well.
AMY GOODMAN: Explain what you mean when you say it’s not legally required. You mean in the contract, what they have with Google? And so, if Google was here, they’d say, “We didn’t say they couldn’t give it to Internet Archive. That’s their prerogative.”
BREWSTER KAHLE: Correct, that basically Google didn’t put it in their contract. Yet from a library’s perspective, why have a book scanned twice? It’s wear and tear on the books. If they think that — and they wouldn’t have signed it if they didn’t think that the Google thing was a good idea. But now that they’ve signed this with Google, they don’t want it scanned again. And this is a problem, because the books, even the out-of-copyright books, are locked up perpetually.
AMY GOODMAN: Conceivably, Google could give you the digitized copies, is that right?
BREWSTER KAHLE: Yes, Google could, but they have refused.
AMY GOODMAN: Why?
BREWSTER KAHLE: They say that they’ve paid for the work. They want to be the place that people go to get them. So they are going to be the proprietors of the public domain. And now, with this settlement, they’re looking to make a grab for the orphan works, the out-of-print works that are in copyright of the twentieth century.
AMY GOODMAN: What would be the difference between how Internet Archive makes available these digitized books and Google does?
BREWSTER KAHLE: The way the Internet Archive does it is that we digitize — photograph the pages. They get transferred onto servers that then take these pages and find the words and phrases so you can search them, package them as PDFs and a couple other formats, and put them on servers for broad public access. So not only can you come to the website and see the books there, you can download them, and you can download them in bulk. And we have people going and downloading thousands, tens of thousands, hundreds of thousands of books, and going off and doing whatever it is they want with them. This is what the public domain is for. It’s the dream.
So we’ve actually gone and not only scanned them and put them here in the United States, but we have a relationship with the Library of Alexandria in Egypt, and they’re going and downloading them and storing them there. The idea is if we put multiple copies in multiple places around the world, we may have a library that will live for hundreds of years, as things happen, as libraries come and go, as laws change. The idea is to have this library live. And having multiple copies, we think, is the only way to go. So not only can researchers download these, or readers, people are taking these books, doing print on demand, making new things out of them. That’s what the public domain is for.
AMY GOODMAN: How much does it cost to digitize a book?
BREWSTER KAHLE: It costs ten cents a page to basically photograph a book — a page and then run it through all of these steps. So, a book, which is about 300 pages on average, a book then costs $30 to go and digitize. So if you wanted to make a million-book library — it’s a million books times $30 — thirty million dollars would be the cost of building a digital library of a million books that anybody could have access to, copyright willing.
AMY GOODMAN: Can these contracts that have been written between Google and Harvard or Oxford or the New York Public Library or University of California library system be broken?
BREWSTER KAHLE: I don’t know. You’d have to ask lawyers. But it’s — these things are pretty rock solid. These are —
AMY GOODMAN: For all time, in perpetuity?
BREWSTER KAHLE: At least the digital copies that are coming back to the libraries are under these restrictions forever.
AMY GOODMAN: Do you see the end of libraries as we know them?
BREWSTER KAHLE: Libraries as a physical place to go, I think will continue. But if this trend continues, if we let Google make a monopoly here, then we’ll — what libraries are in terms of repositories of books, places that buy books, own them, be a guardian of them, will cease to exist. Libraries, going forward, may just be subscribers to a few monopoly corporations’ databases.
AMY GOODMAN: What about antitrust laws? Why wouldn’t they apply here? Google owning all the access to the books of the twentieth century that are out of print?
BREWSTER KAHLE: It’s horrendous, isn’t it? So, what about antitrust laws and all? I think what’s taking people by surprise is the way this is happening.
Usually these monopolies — and we’ve had to wrestle with in the tech world, seemingly every decade — AT&T and IBM, most recently Microsoft — usually they achieve it through some kind of market dominance. They go and, you know, play not fair. They go and do something and just deal with it in that way, and then the courts come in to try to take apart the monopoly.
But in this case, they’re actually using the courts to create the monopoly. So the idea of using a class-action settlement to make a court-sanctioned monopoly for Google and this bizarre other thing called the Book Rights Registry is really a new use of class-action law to go and use the courts to create a monopoly.
AMY GOODMAN: So, could legislation nullify these agreements? Could they break the monopoly?
BREWSTER KAHLE: I guess Congress can do anything. But since this is sort of coming in through the back door, through the judiciary, it’s a little bit odd. There are people in the Justice Department that are starting to look at this. And I hope they take a close look at not only the monopoly of what Google is trying to make, but this price-setting organization called the Book Rights Registry is another sort of bizarre outcome of the secretly negotiated settlement that could determine the future of libraries.
AMY GOODMAN: So, Brewster Kahle, what difference does it make if people offer public comment? And where can they go?
BREWSTER KAHLE: What happens with this case is there are people that can come forward and object. They can either say the class isn’t a good class, that the idea of the Authors Guild, which has about 8,000 members representing millions of authors and dead authors and heirs, isn’t working. That’s one approach. Another is potentially that there’s an antitrust issue. But I’d say, actually, law professors are scratching their heads about how to actually deal with this.
But there’s starting to become groups of people. The ALA, American Library Association, is going to object. The Electronic Frontier Foundation, the Internet Archive, Public Knowledge — there are organizations that are becoming very concerned and trying to figure out what can you do at this late stage. It’s happening all very fast. The comments are due in less than a month, and people are starting to really understand these hundreds of pages of settlement documents that have been foisted on the public.
AMY GOODMAN: Do you see the end of the book?
BREWSTER KAHLE: The book, as we know it, which is printed and put between two covers, there will still be books. But in their primacy of how intellectual discourse happens, it’s going to move online, in very likely form, so really how long-form narratives have got to find a way online. And if we want to have a publishing system that’s a distributed publishing system that has lots of authors that get paid, then we don’t want to have single corporate control over the distribution of those works. Otherwise, what it is we’ve had for centuries as the book and the freedom of the press will become so restricted that it won’t look like what it is we grew up with.
AMY GOODMAN: Brewster Kahle, founder of the non-profit online library, the Internet Archive, located at archive.org. The Justice Department has launched an investigation into whether Google is violating antitrust laws.