We can’t say it’s something that’s ever crossed our minds, but it seems software engineer Leonid Taycher and his team of boffins at Google were simply burning up over the question of how many books there are in the world.
Taycher and his techie team set out to solve this bibliophilic conundrum, and employed some impressive-sounding “intensive analysis” to produce a number.
The figure was reached by researching data from libraries and cataloguing organisations and mixing it up with their own computational resources and “experience of organizing millions of books through our Books Library Project and Books Partner Program since 2004.”
Ta-da!
So we can tell you that Google reckon that there are no fewer than 129,864,880 different books in the world – and if you’d like to feast on all the technical details of their calculations, proceed with haste to their Booksearch blog.
Here’s a snippet to get you all excited:
So after all is said and done, how many clusters does our algorithm come up with? The answer changes every time the computation is performed, as we accumulate more data and fine-tune the algorithm. The current number is around 210 million.
Is that a final number of books in the world? Not quite. We still have to exclude non-books such as microforms (8 million), audio recordings (4.5 million), videos (2 million), maps (another 2 million), t-shirts with ISBNs (about one thousand), turkey probes (1, added to a library catalog as an April Fools joke), and other items for which we receive catalog entries.
Counting only things that are printed and bound, we arrive at about 146 million. This is our best answer today. It will change as we get more data and become more adept at interpreting what we already have.
Our handling of serials is still imperfect. Serials cataloging practices vary widely across institutions. The volume descriptions are free-form and are often entered as an afterthought. For example, “volume 325, number 6”, “no. 325 sec. 6”, and “V325NO6” all describe the same bound volume. The same can be said for the vast holdings of the government documents in US libraries. At the moment we estimate that we know of 16 million bound serial and government document volumes. This number is likely to rise as our disambiguating algorithms become smarter.
After we exclude serials, we can finally count all the books in the world. There are 129,864,880 of them. At least until Sunday.
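If you fancy checking Google's sums, here is a rough sketch of the arithmetic using only the rounded figures quoted above (in millions). The variable names are ours, not Google's, and the "other excluded" figure is simply whatever is left over once the named non-books are removed, since the quote doesn't itemise it. The t-shirts and the lone turkey probe are too small to register at this scale.

```python
# Figures quoted from the Booksearch blog, all in millions.
raw_clusters = 210.0
named_non_books = {
    "microforms": 8.0,
    "audio recordings": 4.5,
    "videos": 2.0,
    "maps": 2.0,
}
printed_and_bound = 146.0
serials_and_gov_docs = 16.0

# Total of the non-book categories the quote actually names.
named_total = sum(named_non_books.values())

# Our inference, not a quoted figure: the remainder that must also have
# been excluded ("other items") to get from 210 million down to 146 million.
other_excluded = raw_clusters - named_total - printed_and_bound

# Printed-and-bound items minus serials and government documents.
books = printed_and_bound - serials_and_gov_docs

print(f"Named non-books: {named_total} million")
print(f"Other excluded (inferred): {other_excluded} million")
print(f"Books: about {books} million")
```

The result lands on roughly 130 million, which squares with the exact 129,864,880 quoted above.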
Interesting, but it doesn’t answer the next logical question: how much disk space would that little lot need? 😀
Of course it varies wildly depending on whether you just store the text or digitise everything, including the pictures.
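For a back-of-the-envelope feel, here is a minimal sketch. The two per-book sizes are pure guesses on our part for illustration (roughly 1 MB for plain text, roughly 100 MB for full page scans), not figures from Google, so swap in your own.

```python
BOOKS = 129_864_880  # Google's count, as quoted above


def storage_tb(books: int, mb_per_book: float) -> float:
    """Return total storage in decimal terabytes (1 TB = 1,000,000 MB)."""
    return books * mb_per_book / 1_000_000


# ASSUMPTIONS: both sizes are illustrative guesses, not measured values.
TEXT_ONLY_MB = 1.0    # hypothetical average for plain text
FULL_SCAN_MB = 100.0  # hypothetical average for scanned pages with pictures

for label, size in [("Text only", TEXT_ONLY_MB), ("Full scans", FULL_SCAN_MB)]:
    print(f"{label}: about {storage_tb(BOOKS, size):,.0f} TB")
```

Whatever sizes you plug in, the gap between text-only and full scans scales directly with the per-book figure, which is why the answer swings so much.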