I began studying biology in the 1960s and went to graduate school when a literature review meant wrestling with huge volumes of Biological Abstracts. Not only were they physically difficult to deal with, but if my topic had a long history, I tediously had to comb many volumes. After a few hours of this research, I often suffered from a syndrome I called “library malaise,” an overwhelming urge to take a nap. It was reading the Biodiversity Heritage Library’s (BHL) annual report that brought these not-so-good old days to mind. I hadn’t thought about them in a long time, because at this point they’ve faded into oblivion. No self-respecting scientist runs to the library to search for references. Now the big problem is sifting through too many citations to find the most valuable. One way to home in on what’s needed is to use the right database or portal, and for me this is often BHL. That’s because my interests are in botany and the history of botany, areas in which BHL is strong. With this series of posts I’ll explore this amazing resource and why, since its founding in 2006, it has become so valuable.
BHL’s strong points are that it’s massive, well-organized, and committed to expanding its user base. The recently published BHL 2016 annual report gives collection statistics such as: 51,460,159 pages from 196,801 volumes digitized; over 175 million taxonomic names indexed; 1,162,346 unique users, up 10% from 2015. Two new members joined this year, BHL Australia and the Natural History Museum in Paris, bringing the total to 17. There were ten original members including the Smithsonian, Missouri Botanical Garden, and the Natural History Museum, London; all had sizeable digital collections and the digitization expertise to get the enterprise going. The Smithsonian still plays a pivotal role, with the BHL project director, Martin Kalfatovic, being a Smithsonian librarian. From the list of original members, it’s obvious that the focus is on English-language literature, though with institutions in France, Brazil, Mexico, and the Netherlands having joined, this is changing, and of course, some of the older literature is in Latin. Since all the text in BHL is available as optical character recognition (OCR) text, it is at least somewhat translatable using Google Translate (another amazing tool for someone of my vintage).
What makes BHL particularly powerful is that it’s linked to several other rich portals, making its holdings available to a broad audience. One of its new affiliates this year is the Internet Archive (IA), with which it has been collaborating since BHL’s inception. Much of what’s available through BHL is also available in IA, which is a much broader storehouse. This is also becoming true for the newer Digital Public Library of America (DPLA). While a biologist might go directly to BHL to find a resource like Linnaeus’s Species Plantarum, a student doing a project on Linnaeus might not be aware of BHL, but instead use DPLA or IA. In all three cases, they will find what they need. But portal hopping can be a nuisance. Each interface is different, and it helps to become familiar with one. I’ve used BHL enough that I’m comfortable with its search functions and other tools. It provides an easy way to create a PDF of an entire document or of selected pages from it. Downloading PDFs or JPGs of images is also easy, though admittedly PDFs are easier, at least for the moment. BHL is promising updates on image processing, and since it has improved its interface substantially over the years, this will in all likelihood happen.
Besides working to broaden its user base, BHL has not forgotten those for whom it was originally designed: the biodiversity research community. The pages in BHL are tagged with the taxon names they contain, which means that the entire library is searchable if a user is looking for a particular genus or species. The word “miraculous” comes to mind when I consider this, and I’ve had fun testing it out with my favorite species, Darlingtonia californica. It’s good to keep in mind that because everything in BHL is open access, much of its collection dates to before 1923 and thus is out of copyright. However, since taxonomy is very much a historical science, particularly in botany, it is important to be able to trace new names back to old ones, and BHL is crucial in doing this. Also, over the past several years it has been increasing its in-copyright holdings through agreements with a number of organizations, such as Arnold Arboretum, the Field Museum, and the California Academy of Sciences, to host digital copies of some of their in-copyright publications. BHL is expanding in other ways as well. It partnered with the Smithsonian’s Field Book Project, which had been digitizing the field notes of Smithsonian researchers. These are absolutely fascinating and contain valuable information on where and when organisms were sighted and specimens collected. BHL is now continuing this effort as the BHL Field Notes Project, not only by hosting the already digitized materials, but also by getting 450,000 more pages online through a Digitizing Hidden Special Collections grant from the Council on Library and Information Resources.
If all these connections that BHL has made are impressive, there are still more, including major efforts in using social media to get the word out about the riches it holds. This aspect of the portal will be the subject of my next post.