Or, where’d you get your information from, huh?
Non-Mormon researchers are frequently shocked by things like the total sales of the Joseph Smith Papers Project’s volumes. Comparable papers editions frequently sell in the hundreds, whereas the first Journals volume of the JSPP sold scores of thousands. Now, I realize that the vast majority of those volumes are destined to reside as trophies on Latter-day Saint bookshelves, unread. However, Mormons clearly have an interest in history that drives feats of strength that would be absurd to believers in other traditions. Voici, the digitization of published and manuscript (or holograph) materials. Various institutions have, over the last decade, digitized a shocking amount of material, an oeuvre that has, for example, allowed me to research and publish in Mormon history when I otherwise would not have been able.
Inspired by WVS’s recent post, I’d like to evaluate aspects of the various digitization efforts. Basically, I’m Anubis, this is my scale, and I just happen to have the feather of Ma’at. Things that are important:
If you live far from the repositories or simply don’t have time to run to the library, having access to images of the documents is really a tremendous win. Image-only offerings are typically of manuscript documents, since OCR (see below) is so readily available for print. Important manuscript collections include the Selected Collections DVDs, which are available online if you are a BYU student, and are partially available to everyone at the CHL website. The full DVDs were apparently supposed to be made available there last fall and then last spring. Not sure where that update stands. Having color images of these documents is, in a word, huge. Some of this material was restricted not too long ago. Another example is the Cache Valley diaries at the USU Library digital collections.
Transcripts come in two flavors: OCR and human.
OCR stands for Optical Character Recognition, and it has been around for a long time, though the technology certainly has improved. Back in the early nineties, several groups took it upon themselves to scan books and sell digital libraries using the now-deprecated NFO file format. The internet sort of killed this, but many of these tools are still in circulation, as they also include sources that are otherwise not accessible. These include Signature’s New Mormon Studies CD-ROM and Deseret Book’s Gospelink (LDS Library is another, though now defunct, example). The former just sounds like a period piece, but it is still a must-have for the Woodruff diaries alone.
The thing about OCR is that, especially twenty years ago, it was sort of crap. So there are not a few errors, and it is generally advisable to compare against originals. Good for searching, but always verify.
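To make “always verify” concrete, here is a minimal sketch of checking an OCR text against a transcription you have proofed against the original. It uses Python’s standard `difflib`; the function name `ocr_discrepancies` and the sample sentences are made up for illustration and do not come from any of the collections mentioned here.

```python
import difflib


def ocr_discrepancies(ocr_text, verified_text):
    """Return (ocr_words, verified_words) pairs wherever the two texts disagree."""
    ocr_words = ocr_text.split()
    verified_words = verified_text.split()
    # Compare word by word; autojunk=False keeps common words from being ignored.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=verified_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(ocr_words[i1:i2]),
                          " ".join(verified_words[j1:j2])))
    return diffs


# Invented sample text, showing typical OCR confusions (rn for m, b for h).
ocr = "The rnorning was clear and tbe camp moved on"
checked = "The morning was clear and the camp moved on"
for bad, good in ocr_discrepancies(ocr, checked):
    print(f"OCR read {bad!r}, original has {good!r}")
```

This only tells you where two texts differ; it cannot tell you which one is right, so the image of the original is still the final word.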
More recently, groups are making published materials available as images with OCR texts associated with them. Examples include the university digital collections, which have also produced the Utah Digital Newspapers; the LDS Church History Library’s ambitious and already enormous collection at the Internet Archive; and the ever-beneficent Google Books.
Human transcripts take a lot more effort. My Adobe Acrobat can OCR a scanned document in seconds. It would take me much, much longer to transcribe it by hand. Because of the effort required, these transcripts are typically produced only for manuscript documents. And the only two institutions of which I am aware that provide both images and manual transcripts of manuscript materials are the Joseph Smith Papers and the BYU Digital Collections, viz., the Mormon Missionary Diaries and the Overland Trails Diaries. And really, the quality of these two groups’ offerings is simply incredible. With BYU the document transcripts are available in both HTML and PDF.
What good do these 0s and 1s floating in the cloud do if no one can find them? For example, if you didn’t know that the Utah Genealogical and Historical Magazine was recently moved from the BYU Digital Collections to the LDS Family History Library digital repository, there would be no way to find it. No soup for you! Google is definitely your friend here, but drilling down into the catalogues can sometimes be the only way to find things. Consequently, search-engine-friendly finding aids are a real boon (like those that existed at BYU before their upgrade [shakes fist at sky]).
And here is a thing. The amount of content that is now digitized is so enormous that one cannot read it all (for interesting discussions see here and here). And while it is very helpful to be able to search through a single document transcript, the ability to search globally across a collection yields truly glorious fruit (the consumption of which leads to knowledge or death, depending on how lazy you are). I can honestly say that every project that I have worked on has been improved by global search functions. Who has global search? Legacy NFOs, universities, Google, and the JSPP. Losers: the CHL and the FHL.
Even better than global search is the advanced search of the university digital collections. This allows you to search for terms near each other, truncate search terms, search particular collections, and constrain searches, all at the same time (granted, legacy NFOs let you do this as well).
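For anyone who hasn’t used these features, here is a rough sketch of what “terms near each other” and “truncated terms” amount to, run over plain-text transcripts. It is not how any of the collections above implement their search. The function names, the `*` wildcard convention, the five-word default window, and the two diary snippets are all assumptions of mine for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_matches(pattern, word):
    """A trailing * truncates the term: 'bapti*' matches baptism, baptized, etc."""
    pattern = pattern.lower()
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def near(text, term_a, term_b, max_distance=5):
    """True if term_a and term_b occur within max_distance words of each other."""
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if term_matches(term_a, w)]
    pos_b = [i for i, w in enumerate(words) if term_matches(term_b, w)]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)


def search_collection(documents, term_a, term_b, max_distance=5):
    """Global search: return the titles of every document with a proximity hit."""
    return [title for title, text in documents.items()
            if near(text, term_a, term_b, max_distance)]


# Invented transcripts, standing in for a collection.
diaries = {
    "diary_a": "We were baptized for our health in the river this morning",
    "diary_b": "Baptism was discussed. Many days later, after much travel "
               "and sickness and toil, her health improved.",
}
print(search_collection(diaries, "bapti*", "health"))
```

With the default window only `diary_a` comes back, because the two terms in `diary_b` sit too far apart; widening `max_distance` pulls it in. That is the whole appeal: the proximity constraint cuts hits that merely contain both words somewhere in the document.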
Winner, winner, chicken dinner
BYU. For both prolificacy and quality, they are unmatched. JSPP, you came close. Your transcripts are gold, but your search is still unwieldy and shallow.
As wonderful as it is to have documents available in any form, having images with advanced searchable transcripts is wonderfuler.