Herbaria and More

Platanthera psycodes collected in 1838, University of Michigan Herbarium

Since the cosponsor of the Digital Data in Biodiversity Research Conference (see last post), along with iDigBio, was the University of Michigan, it’s not surprising that there was a trip to its Research Museums Center south of the main campus. Along with a reception, there were tours of the various zoological, paleontological, archaeological, and botanical collections housed there. Naturally, I went on the herbarium tour offered by three botanists who work with the collection: Christopher Dick, the director; Richard Rabeler, the collection manager; and Anton Reznicek, the curator. As with many large plant collections, the UM staff can only estimate its size: about 1.8 million specimens. Digitization efforts are making for more accurate estimates, but they are also unearthing more specimens. The herbarium is actively involved in iDigBio and the national digitization effort, with more than 560,000 of its sheets digitized. Since the Research Museums Center is a converted warehouse, the herbarium has room to grow, a valuable resource for the future. The herbarium is strong in Michigan plants, including collections from 1837-1838 for a survey of natural resources made at the time Michigan gained statehood (see figure above). Many of these have habitat information, making them valuable in environmental change studies. There is also a large collection amassed by Harley Harris Bartlett, who led the UM Department of Botany from 1927 to 1944. He used his considerable wealth to fund explorations in the tropics, so there are a significant number of specimens from these areas, including many wood specimens from Sumatra that are now being digitized. They are particularly important because of the dramatic changes logging has wrought in Sumatra and also because the labels include the names of the trees recorded in the indigenous language.

On my way home from Michigan, I took a rather roundabout route so I could visit the Cornell University herbarium. I wanted to go there primarily because Cornell was the long-time home campus of the botanist/horticulturalist/agronomist Liberty Hyde Bailey (1858-1954). Bailey had incredible energy and drive throughout his life and became the first dean of Cornell’s agricultural college (Dorf, 1956). He served on Theodore Roosevelt’s National Commission on Country Life, which recommended the formation of the 4-H movement, agricultural extension programs, and rural electrification. Bailey retired from the deanship in 1913 when he reached 55 and spent the next 40 years writing on agricultural and horticultural topics as well as studying the taxonomy of palms, on which he published extensively.

As the herbarium’s collections manager, Anna Stalter, explained, Bailey left his extensive herbarium and book collection to Cornell, which explains why a third of the specimens are cultivated plants. This makes the collection interesting horticulturally, and associated materials increase its value. Bailey collected seed and plant catalogues from the late nineteenth century through the first half of the twentieth. The herbarium librarian, Peter Fraissinet, pulled out a selection of these fascinating historical documents. He also showed me an extensive card catalogue maintained for almost 60 years by Bailey’s daughter and assistant, Ethel Zoe Bailey. There was a card for each plant variety, noting the catalogues and years in which it was offered. Researchers interested in heirloom plants and plant lineages still consult it. Fraissinet also showed me some rare volumes Bailey had collected, including the oldest book in the library, an Italian translation from 1575 of Nicolas Monardes’s work on Mexican plants that includes the first known image of tobacco (see figure below). I had read about this treasure, but it was a thrill to see it, as well as a German translation of Pietro Matthioli’s commentaries from 1678.

Tobacco plant pictured in Monardes’s 16th-century work on Mexican plants. Bailey Hortorium Library, Cornell University

Obviously, one day was not enough time to even glance at most of Cornell’s botanical treasures, but I did get to see a few of the massive palm pods Bailey collected and was also introduced to a totally different aspect of botany at the Cornell herbarium: its plant anatomy slide collection. Curator Kevin Nixon and senior research associate Maria Gandolfo are heading the NSF-funded project called CUPAC: Cornell University Plant Anatomy Collection. The goal is to digitize the information on 200,000 slides and to image a significant portion of them, at least one from each set of serial sections, often at more than one magnification. They also hope to include slides from other institutions’ collections as a way to preserve a valuable research tool and make it broadly accessible to future botanists. There are already many images available on their website, and in the future they hope to link the records to the relevant literature. So this is yet another government-funded digital asset available to all researchers and also, I might add, to artists, since many of the images are stunning and include not only slides but peels of fossil plant structures.

When I left the herbarium I walked through the Cornell Botanic Gardens where living collections complement the horticultural specimens in the herbarium. It’s wonderful to have the two resources so close to each other. And close by is the plant pathology herbarium, still another treasure, but one I had to leave for the future. As my father always said on road trips: “You have to leave something for next time.” On this trip, I had seen a lot, from living plant collections, to personal collections representing place (see post), to herbaria, and the digital future (see 1, 2). I can’t wait to get on the road again.


Dorf, P. (1956). Liberty Hyde Bailey: An Informal Biography. Ithaca, NY: Cornell University Press.


Digitizing Collections

The Digital Data in Biodiversity Research Conference in Ann Arbor, Michigan was cosponsored by the University of Michigan and the iDigBio project, which deals with the digitization of natural history collections at non-government institutions in the United States. iDigBio is a 10-year project now in its sixth year. As its director, Larry Page, noted, it is designed to provide the infrastructure necessary to store and distribute the results of natural history specimen digitization efforts and also to offer training and tools to support these projects. In addition, it aims to encourage development of a community to further this work and to ensure that these electronic resources are maintained and upgraded in the future. That is obviously a tall order, and just how tall became clearer during the two-day conference.

The first general sessions set the stage, with Maureen Kearney of the Smithsonian arguing for the importance of “liberating” data from the paper silos where they have been kept and also for including paleobiological information to provide a longer view. Pam Soltis of the Florida Museum of Natural History at the University of Florida discussed the difficulties of linking heterogeneous data, for example, information on specimens, genomics, and phylogeny. Yes, there are data sets dealing with each for many species, but the challenge is to make it all available through one portal. Issues include locating disparate data and dealing with its patchiness and with format differences. There are also the vagaries of taxonomic names and the problem of getting these systems to talk to each other. Progress is being made, particularly in the automation of some phases, such as recording label data using optical character recognition systems, but this work takes a great deal of time and money, and it’s never finished, as maintenance is a key issue.

Next came Donald Hobern, executive secretary of GBIF, the Global Biodiversity Information Facility, to which the US contributes data in the form of information not only on specimens but on species occurrences. From the GBIF portal, researchers can create species checklists for particular areas and also access data on particular taxa. The GBIF network has over 700 million georeferenced occurrence records, making it a massive resource. Organizationally, it is divided into geographic nodes, with each node responsible for inputting and maintaining its data. In the afternoon, I attended the session on the North American node, which includes contributions from Canada and the United States. There Hobern spoke again, outlining the network’s three main goals. The first is to remove obstacles to collaboration in the sharing and use of biodiversity data, in other words, to provide tools that allow for uploading and maintaining data in a usable form. The second is to organize evidence of recorded occurrence of any species in time and space; that is, users should be able to access data on species occurrences worldwide or within a particular geographic area and timeframe. Finally, GBIF aims to support the development of a global virtual natural history collection. In one sense, this goal has already been met because there is so much data in GBIF from so many areas, but it is hardly complete in terms of extent or data richness. In order to function at such a large scale, GBIF can only provide limited information on each occurrence. However, the infrastructure that GBIF has created and is continuing to develop is a firm foundation for a richer and more robust information system in the future. An indication of this is Science Review 2017, its annual review of the scientific articles published over the past year using GBIF data. Along with this is a bibliography of these 438 peer-reviewed articles.

The next speaker presented still another acronym, or really two. Gerald “Stinger” Guala of the US Geological Survey is director of both BISON (Biodiversity Information Serving Our Nation) and ITIS (Integrated Taxonomic Information System). BISON provides access to 375 million US occurrence records, including 275 million in GBIF. However, for some US records, BISON holds more data than what’s in GBIF. Essentially, BISON is a clearinghouse for US government information on natural history collections. It cleans the data, formats it, takes quality control measures, and allows for data discovery. One of its major services is providing checklists at the local, state, and national levels; a user can draw a map around an area and get a species checklist for it. Datasets on particular areas or species are also downloadable. ITIS is more limited in scope; its aim is to provide stable nomenclature. It is linked to the Catalogue of Life, a worldwide database that publishes an annual checklist with over 1.7 million species. The biggest difficulty for the latter, as discussed by its director Tom Orrell of the Smithsonian, is how to deal with synonyms. This is a tough problem for all taxonomy and for all biodiversity projects, as noted by Stephen Garnett and Les Christidis (2017) in a recent Nature article on how “taxonomic anarchy” impedes conservation efforts. To put it simply: it’s difficult to enforce regulations on an endangered species if its name changes.
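To make the checklist and synonym ideas a little more concrete, here is a minimal sketch in Python of how a checklist for a mapped area might be assembled from occurrence records. It is only an illustration, not how BISON, ITIS, or GBIF actually work: the `Occurrence` class, the `checklist` function, the synonym table, and the coordinates are all invented for the example. The one botanical assumption is that Habenaria psycodes is treated as a synonym of Platanthera psycodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One hypothetical occurrence record: a name and a location."""
    name: str
    lat: float
    lon: float


# Hypothetical synonym table: each synonym points to its accepted name.
SYNONYMS = {
    "Habenaria psycodes": "Platanthera psycodes",
}


def accepted_name(name, synonyms=SYNONYMS):
    """Return the accepted name, or the name itself if it is not a known synonym."""
    return synonyms.get(name, name)


def checklist(records, south, west, north, east, synonyms=SYNONYMS):
    """List the accepted names recorded inside a rectangular area, in alphabetical order."""
    names = {
        accepted_name(r.name, synonyms)
        for r in records
        if south <= r.lat <= north and west <= r.lon <= east
    }
    return sorted(names)


# Invented sample records: three near Ann Arbor, one near Ithaca.
records = [
    Occurrence("Platanthera psycodes", 42.28, -83.74),
    Occurrence("Habenaria psycodes", 42.30, -83.70),
    Occurrence("Acer saccharum", 42.25, -83.80),
    Occurrence("Nicotiana tabacum", 42.44, -76.50),
]

# A box drawn around the Ann Arbor records only.
print(checklist(records, south=42.0, west=-84.0, north=42.5, east=-83.5))
```

Without the synonym table, the same orchid would appear on the checklist twice under two names, which is a small-scale version of the problem the speakers described.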

These presentations were followed by two about Canadian projects: James Macklin spoke on CBIF, Canada’s GBIF node, and Anne Bruneau on Canadensys, which aims to provide richer information on species than that available in GBIF. Jon Coddington of the Global Genome Biodiversity Network (GGBN) then brought up a whole different set of issues, namely those involved in storing genetic information, both sequences and specimen data. And Martin Kalfatovic, the program director of the Biodiversity Heritage Library (BHL), discussed its role in providing links to relevant literature. In all, this was a mind-bending session that helped me see the differences among the many portals I have come across as I try to educate myself botanically and technologically. In the next post, I’ll discuss some even more ambitious projects that move into the 3D realm.


Garnett, S. T., & Christidis, L. (2017). Taxonomy anarchy hampers conservation. Nature, 546, 25-27.