Use of archive.org PDFs of out-of-copyright books

Posted: Fri Dec 06, 2024 12:35 pm
by ras52
The Internet Archive (archive.org) has a large library of digitised old books which are available to view online and also as PDF downloads. For the purpose of this question, let’s assume the books of interest are all out of copyright, though I’m aware that can be a complex question to answer.

The site’s terms of use start by saying ‘When accessing an archived page, you will be presented with the terms of use agreement’. I’m unclear whether ‘an archived page’ is supposed to include a digitised book, but you’re not presented with the terms of use when accessing a book (even in a private browsing window), and certainly don’t have to explicitly accept the terms. Nevertheless, I dare say there’s an argument that you’re still bound by them. They say that use is governed by Californian law. If it’s relevant, I’m accessing the site from England.

The terms of use say very little about what you can do with digitised books, other than that you must respect the original work’s copyright. I do not know whether English law is relevant and whether similar provisions hold in other jurisdictions, but in England, a recent Court of Appeal case, THJ v Sheridan, has ruled that there is no copyright interest in images of documents or 2D works of art that are themselves out of copyright. Obviously you are still subject to whatever terms you consented to in accessing the site, and they could say you mustn’t reproduce or distribute any images you download, though in this case they seem not to say anything like that.

Can I assume in this situation that I am free to use the PDFs however I want? (What I want to do is thoroughly unobjectionable: I want to clean up the images a bit, fix some errors in the OCR, add PDF bookmarks, and make them available for free download on a personal site, acknowledging that they were originally from the Internet Archive, with a link back to the originals.)

Re: Use of archive.org PDFs of out-of-copyright books

Posted: Fri Dec 06, 2024 3:00 pm
by AndyJ
Hi ras52 and welcome to the forum,

You are right to assume that if you are reasonably sure that the original work is out of copyright, you only need to pay attention to the conditions laid down by the Internet Archive. Scans of the sort made by the Open Library and Google Books etc do not qualify for separate copyright as they lack the necessary element of human creativity. This is relatively settled law in both the UK and the USA.

However, as you mention, the terms provided by the Internet Archive are fairly general in nature, and are largely there to cover any works which might still be in copyright. I think it is safe to assume, given the ethos of the Internet Archive, that they are not seeking to stop users from using the works which they have curated, and the terms are there merely to cover themselves in case any of their users are found to have infringed someone's copyright. Generally speaking, the IA would be covered by the US Fair Use doctrine, whereas someone in the UK might not be, when it came to works which were arguably still in copyright somewhere outside the USA. And even if I am wrong in that assumption, it would be a matter of contract law, not copyright law, if you had violated a condition set down by the IA. They would have to sue you in the California courts if they were aggrieved about this, which of course they would not be.

This means that as long as you have independently assured yourself that the original work is now out of copyright here in the UK, you are free to use the PDF version from the IA however you wish. I don't think it is even necessary to provide a credit to the IA since they don't insist on this in their terms, although it does no harm.

Re: Use of archive.org PDFs of out-of-copyright books

Posted: Fri Dec 06, 2024 3:48 pm
by ras52
Thank you. That's very helpful, and confirms what I thought (and hoped!) was probably the case.

The books I'm interested in were published in London during the seventeenth or early eighteenth centuries by authors who have been dead for more than 250 years. I can't see any grounds for them to be covered by anything like Crown Copyright, so I'm confident they're out of copyright. Even if they are somehow still in copyright, it seems highly unlikely anyone would know who held the copyright after so long.

Crediting the Internet Archive seems the appropriate thing to do, even if it's not legally required. They do an excellent job and I'd like to acknowledge it.