BHL and Social Media

I have a Facebook account that I ignore. I go into it about once every six months with the intention of using it, but I can never figure out its attractions, so I abandon it yet again. However, I use Twitter a lot, not to communicate so much as to keep up on the doings at institutions that interest me such as botanical gardens, herbaria, and natural history museums. Along the way, I’ve found several people and institutions posting notable items and I follow them too. For example, Donna Young (@HerbariumDonna) of the World Museum of Liverpool tweets and retweets great material, as does the herbarium at St. Andrews University, Scotland (@STA_herbarium). Needless to say, in light of my last post, I also follow the Biodiversity Heritage Library, BHL (@BioDivLibrary). This is how I keep up with its blog and all its latest endeavors. Because it’s trying to engage with as large an audience as possible, BHL communicates through a variety of social media outlets, since, like me, people have different tastes in their favorite apps. In 2016 it added Instagram and Tumblr to its internet presence along with its more longstanding Twitter and Facebook accounts. In total, it had a 76% increase in followers between 2015 and 2016, suggesting that these efforts have been successful. Perhaps its most fruitful outreach has been through Flickr, where it has posted over 100,000 images from its resources, but I’ll get back to that later. I also want to note that there was a 54% increase in the number of visits to BHL from other social media sites, almost 100,000 in all, indicating that users are coming to BHL from a variety of platforms. The most notable is Pinterest; posts from Pinterest accounts were responsible for more than half of this traffic. Evidently many Pinterest users posted images sourced from BHL directly or from its Flickr account. These numbers suggest the general expansion of the social media universe and particularly of BHL’s participation in it. They also indicate its sophisticated approach to outreach.

At the moment BHL’s efforts in this area are being substantially assisted through the work of five one-year residents in the National Digital Stewardship Residency (NDSR) program developed by the Library of Congress in conjunction with the Institute of Museum and Library Services (IMLS). The five residents, now at the halfway point in their work, are at five different BHL member institutions. Pamela McClanahan at the Smithsonian Library has posted a user survey and will analyze the results, which are important to planning BHL’s future direction and where it will focus its resources. Ariadne Rehbein at the Missouri Botanical Garden has joined a Codergirl cohort in St. Louis and is also interviewing Flickr and BHL volunteer taggers about their work and how the workflows can be improved. These contributors took part in a project, funded by a two-year grant from the NEH, to develop a system for volunteers to identify and tag images in BHL volumes. This is a great example of a citizen science project where a pool of interested and committed individuals can help to enhance BHL.

At the Natural History Museum of Los Angeles County, Marissa Kings, along with several summer interns, is creating and editing metadata for the museum’s Contributions in Science publications in preparation for uploading these and other in-house publications to BHL. She is also exploring how recently digitized museum entomology specimens and related data can be linked to the relevant literature in BHL. I have very limited experience in this area, but I know enough to realize that none of this is trivial. Having well-defined workflows and metadata can make all the difference when it comes to linking different types of data. Another resident, Alicia Esquivel at the Chicago Botanic Garden, is doing statistical analyses to estimate the total size of the biodiversity literature, a difficult task to say the least. But even a rough estimate would give some idea of what percentage of that literature is now in BHL, in other words, how big its impact could be on the biodiversity research community. At Harvard’s Museum of Comparative Zoology, the fifth NDSR resident, Katie Mika, is learning about adding structured bibliographic metadata in Wikidata to improve the quality of references in the Wikimedia universe and to reconcile messy data. Adding BHL IDs to Wikidata makes it a more robust knowledge base and improves the discoverability of BHL’s content. As you can see from these brief synopses, the NDSR program is providing BHL with expertise in several key areas and allowing it to both strengthen its foundations and move in new directions.

Before I close this post on BHL and social media, I want to get back to Flickr. BHL’s Flickr site is quite literally a joy to behold. There are now over 100,000 images from BHL content in Flickr and that number continues to rise. The contributions are arranged in albums, with each album representing one publication. For example, the album for Curtis’s Botanical Magazine, Volume 136 from 1910 has 60 images. Searching for this item in BHL will provide all these images as well as the related text, but to just enjoy the beautiful illustrations, BHL at Flickr is the way to go. All these images are copyright-free and downloadable. I should note that while I gravitate to the botanical literature, Audubon’s birds are here, as are Gessner’s animals. Needless to say, many people stumble upon this treasure trove when they are surfing in Flickr and don’t investigate further or go into BHL at all. However, some do, and that is the point of social media outreach: the more the right outlets are used, the larger the payoff.

Flickr has turned out to be an effective tool for BHL. It is also a wonderful place for a biologist to spend time on one of those days when spreadsheets and graphs make no sense and it’s easy to forget what makes biology so wonderful. Another fun way to join in is with Color Our Collections. Users can download black and white illustrations contributed by member institutions and then satisfy their urge to color them in any way they want. This project, which has become popular on the web and is continuing, grew out of a social media exchange between a librarian from the New York Academy of Medicine and a committed citizen scientist/BHL tagger from Australia—a beautiful example of BHL’s global scope (Garner, Goldberg & Pou, 2016).

Reference

Garner, A., Goldberg, J., & Pou, R. (2016). Collaborative social media campaigns and special collections: A case study on #ColorOurCollections. RBM: A Journal of Rare Books, Manuscripts, and Cultural Heritage, 17(2), 100–117. https://doi.org/10.5860/rbm.17.2.9663

Biodiversity Heritage Library (BHL): An Introduction

I began studying biology in the 1960s and went to graduate school when a literature review meant wrestling with huge volumes of Biological Abstracts. Not only were they physically difficult to deal with, but if my topic had a long history, I tediously had to comb many volumes. After a few hours of this research, I often suffered from a syndrome I called “library malaise,” an overwhelming urge to take a nap. It was reading the Biodiversity Heritage Library’s (BHL) annual report that brought these not-so-good old days to mind. I hadn’t thought about them in a long time, because at this point they’ve faded into oblivion. No self-respecting scientist runs to the library to search for references. Now the big problem is sifting through too many citations to find the most valuable. One way to home in on what’s needed is to use the right database or portal, and for me this is often BHL. That’s because my interests are in botany and the history of botany, areas in which BHL is strong. With this series of posts I’ll explore this amazing resource and why, since its founding in 2006, it has become so valuable.

BHL’s strong points are that it’s massive, well-organized, and committed to expanding its user base. The recently published BHL 2016 annual report gives collection statistics such as: 51,460,159 pages from 196,801 volumes digitized; over 175 million taxonomic names indexed; 1,162,346 unique users, up 10% from 2015. Two new members joined this year, BHL Australia and the Natural History Museum in Paris, bringing the total to 17. There were ten original members including the Smithsonian, Missouri Botanical Garden, and the Natural History Museum, London, all with sizeable digital collections and the digitization expertise to get the enterprise going. The Smithsonian still plays a pivotal role, with the BHL project director, Martin Kalfatovic, being a Smithsonian librarian. From the list of original members, it’s obvious that the focus is on English-language literature, though with institutions in France, Brazil, Mexico, and the Netherlands having joined, this is changing, and of course, some of the older literature is in Latin. Since all the text in BHL is available as optical character recognition (OCR) text, it is at least somewhat translatable using Google Translate (another amazing tool for someone of my vintage).

What makes BHL particularly powerful is that it’s linked to several other rich portals, making its holdings available to a broad audience. One of its new affiliates this year is the Internet Archive (IA), with which it has been collaborating since BHL’s inception. Much of what’s available through BHL is also available in IA, which is a much broader storehouse. This is also becoming true for the newer Digital Public Library of America (DPLA). While a biologist might go directly to BHL to find a resource like Linnaeus’s Species Plantarum, a student doing a project on Linnaeus might not be aware of BHL, but instead use DPLA or IA. In all three cases, they will find what they need. But portal hopping can be a nuisance. Each interface is different, and it helps to become familiar with one. I’ve used BHL enough that I’m comfortable with its search functions and other tools. It provides an easy way to create a PDF of an entire document or of selected pages from it. Downloading PDFs or JPGs of images is also easy, though admittedly PDFs are easier, at least for the moment. BHL is promising updates on image processing, and since it has improved its interface substantially over the years, this will in all likelihood happen.

Besides working to broaden its user base, BHL has not forgotten those for whom it was originally designed: the biodiversity research community. The pages in BHL are tagged with the taxon names they contain, which means that the entire library is searchable if a user is looking for a particular genus or species. The word “miraculous” comes to mind when I consider this, and I’ve had fun testing it out with my favorite species, Darlingtonia californica. It’s good to keep in mind that because everything in BHL is open access, much of its collection dates to before 1923 and thus is out of copyright. However, since taxonomy is very much a historical science, particularly in botany, it is important to be able to trace new names back to old ones, and BHL is crucial in doing this. Also, over the past several years it has been increasing its in-copyright holdings through agreements with a number of organizations, such as the Arnold Arboretum, the Field Museum, and the California Academy of Sciences, to host digital copies of some of their in-copyright publications. BHL is expanding in other ways as well. It partnered with the Smithsonian’s Field Book Project, which had been digitizing the field notes of Smithsonian researchers. These are absolutely fascinating and contain valuable information on where and when organisms were sighted and specimens collected. BHL is now continuing this effort as the BHL Field Notes Project by not only hosting the already digitized materials, but also getting 450,000 more pages online through a Digitizing Hidden Special Collections grant from the Council on Library and Information Resources.

If all these connections that BHL has made are impressive, there are still more, including major efforts in using social media to get the word out about the riches it holds. This aspect of the portal will be the subject of my next post.

History and Herbaria Online

Online Herbarium of West Virginia Wesleyan University

Over the past 15 years vast digitization projects have made the internet a researcher’s paradise, a paradise in two dimensions. Because book pages are relatively flat, as are paintings, these were among the first resources to become richly available online. Pressed plant specimens are also well suited to flatbed scanning, and many herbarium projects, particularly at smaller institutions, were taken on using library equipment and staff. This meant that botany had a leg up on geology and zoology in natural history digitization. These fields are catching up, at least as far as taking photographs or scans of specimens, though 3-D imaging lags behind.

Making all these resources available on the web in a useable form is a further challenge, especially because “useable” can mean very different things, from simply being broadly accessible to being linked to other types of related resources so that users in a variety of fields can benefit from them. What I want to discuss in this and the next few posts is what this means for digital herbaria. While in their present form they are useable by botanists and ecologists studying everything from taxonomy to environmental change, they may be almost invisible to other potential users, including artists searching for inspiration, historians investigating our relationship with nature in the past, economists, sociologists, and pharmacists.

As Roderic Page (2016) notes, taxonomists themselves could greatly benefit from linking library and herbarium resources. It would be ideal to be able to click on the reference for the original paper on a species from its type specimen record. Often both are available electronically, and in some cases they have been linked, as in JSTOR Global Plants, where images of over two million type specimens are online, linked to related information in the Biodiversity Heritage Library (BHL), the Tropicos plant information website, the Global Biodiversity Information Facility, and journals housed in JSTOR. However, JSTOR lies behind a paywall, even though individual items may also be available on separate free websites. Even when economics isn’t involved, there are obstacles. The Smithsonian’s Field Book Project has put hundreds of notebooks kept by Smithsonian scientists online through BHL. Many have been not only photographed, but also transcribed. Many plant species are mentioned in some of these notebooks, but BHL doesn’t have links to herbarium specimens. There are some portals that do connect various types of information. For example, the Encyclopedia of Life (EOL) provides access to visual as well as textual resources for species. These often include original research articles, photographs, herbarium specimens, and even botanical illustrations. The Plant List also links to many resources (EOL, BHL, GenBank, etc.), but these must be accessed one at a time, and there is no guarantee that there will be useful information in any particular resource.

In the cases I’ve been discussing so far, the resources being connected are primarily scientific. Even here, there are many herbaria, especially smaller ones, that have unique and valuable collections, but for these institutions, just digitizing the information on the sheets, let alone imaging them, is a massive task that involves equipment, sophisticated software, expertise, and a great deal of labor. Launching a website to provide access to this data, once it is entered, is yet another challenge, and a great accomplishment when it’s achieved. To give one example, West Virginia Wesleyan University (WVWU) has an active herbarium with 25,000+ specimens. It is used in teaching and is available to researchers both at the University and elsewhere. Katharine Gregg, now professor emerita of botany, applied for an NSF grant with a consortium of West Virginia and Appalachian institutions to digitize their collections. While the grant wasn’t funded, it spurred Gregg to apply for a smaller state grant to fund a similar project for WVWU. The grant from the West Virginia Higher Education Policy Commission Division of Science and Research won approval, and the university was able to buy the necessary equipment and then fund student workers to begin digitization and imaging. Now more than half the collection is online through the university library’s website. This collection is valuable for a number of reasons including the richness of its local plant collection, and WVWU’s experiences paved the way for the digitization of other West Virginia herbaria. Thanks to iDigBio, the NSF-funded project to make data and images of millions of biological specimens available on the web, WVWU’s specimens are now freely accessible to researchers and the general public.

However, I would like to argue that this is just the first step in the creation of a rich, multidisciplinary resource including historical and anthropological materials. My vision is quite ambitious, and perhaps even grandiose, but I think it will come and will indicate a new stage in the development of the internet. Before I get to that, however, I would like to investigate in my next post a number of projects that are leading in that direction. They vary in emphasis, aim, and scope, but all deal with linking resources from different disciplines in often novel ways.

Reference
Page, R. (2016). Surfacing the deep data of taxonomy. ZooKeys, 550, 247–260. https://doi.org/10.3897/zookeys.550.9293.