Digitization of the JBNHS: Getting Started

This is Part 2 of the story of the digitization of the Journal of the Bombay Natural History Society (JBNHS). My introductory post, “Digitizing 100 years of the Journal of the Bombay Natural History Society”, contains the background of this project and also links to subsequent posts.

The project kicked off in earnest in 2001. Kumaran, who knew Mr. J. C. Daniel, Honorary Secretary of the BNHS, personally, met with him to convey our intentions, and Mr. Daniel was very encouraging. Mr. Daniel is one of the many truly remarkable people in the BNHS’s long history, and continues to be active after over 50 years of association with the Society!

I had the good fortune to chat with Dr. Gary Starkweather, inventor of the laser printer, who was working at Microsoft Research in Redmond at the time. Dr. Starkweather gave me very good advice on the kind of scanning equipment I would need, what DPI to scan at, and so forth. He strongly suggested cutting off the spines of the journals if that could be done, because everything goes smoothly when the pages lie flat. He told me he used a band saw. I performed one, mercifully short, experiment trying to use my radial arm saw to saw through the spine of a discarded book – it shredded that poor book to pieces! I eventually found that the local Kinkos™ had a book-slicing machine (“the Titan”, as it was appropriately called) and used them for the entire project. Gary also suggested I get a Fujitsu duplex scanner with an ADF (automatic document feeder), as he’d had good experience with that brand. I ended up buying a Fujitsu M4097D scanner, a real workhorse. You could feed up to 100 pages at a time, and it scanned both sides of each page in a single pass (hence the term “duplex”). Here is a picture of it, just after unpacking:

Fujitsu M4097D Scanner

I realized very early on that to get full control over the project, I had to get a copy of the entire Journal for myself, and get it shipped over to Redmond, WA, USA, where I was living at the time. I did not mind sacrificing the “collector’s value” of the Journal (by slicing off the spines) in order to get it into digital form. I also wanted full control over the scanning process, in particular the quality control and the handling of the earlier volumes of the Journal especially, which are quite rare these days. In fact, the BNHS’s own copies of the earliest volumes of the Journal are not in particularly good shape, because of the weather in Mumbai and no doubt due to the amount of handling those volumes have received over the years.

Serendipitously, Kumaran’s friend, Andrew Robertson, knew of a book dealer, Dieter Schierenberg b.v., based in Amsterdam. This dealer specialized in older scientific journals, and had the entire set from Volume 1 to 72 for sale! I purchased that set and had it shipped over to Redmond. I remember hiring a station wagon and driving to the cargo area of SeaTac Airport, and picking up 2 pallets shrouded in black plastic, which were loaded onto the station wagon with a forklift! I don’t have pictures from that memorable ride, but here is a picture of the books, both inside and outside their packaging (there were many more of them, of course):

Boxes Bound Journals

Here’s a picture of me with some of the books (the enormity of the task seems to be sinking in…):

Me with the Journals

The BNHS provided me with the more recent journals – from 1973 through 2000. I had those shipped too (by my friend Deepak Amin) from Mumbai to Redmond. By November 2001 I had the entire 100 volumes of the Journal, and all the equipment. At that time, I wrote up a status report on the project which we sent to the BNHS. A PDF version of the report is here.

The journals themselves were marvelous, and it is easy to get lost in them. Here is the first page of the very first issue, dating from January 1886!

Page 1, Issue 1, Volume 1

I was surprised by the high-quality color plates in even the earlier volumes of the Journal. Here’s a sample from Volume 17, which was published before 1910! I didn’t realize one could do that kind of high-quality printing at the turn of the century (NOTE: this is a color photograph of one of the pages; all the scanning I did was in black & white, i.e., 1 bit per pixel):

Snakes Plate from Vol 17

A detail from the above page is here:

Snakes Plate Detail

In my next post I will go through a sample of some of the earlier content from the Journal. The content and style of writing of the earlier articles, mostly by British authors, are fascinating. Many of these authors straddled the roles of naturalist and hunter as if it were the most natural thing to do. There were many articles whose very titles speak of eras gone by, such as a series of articles on the “Poisonous Plants of Bombay”!

[Back to my introductory post on the Digitization of the JBNHS]

4 Responses to Digitization of the JBNHS: Getting Started

  1. Ritesh says:

    Wow. It's great that you are doing this to keep historical texts alive. These texts will probably be the only place where people can get to see these wonderful specimens later on. Is there a count of how many species recorded in this journal are now extinct? Or are endangered? That should prove the importance of this preservation process. Kudos.

  2. Joseph says:

    Ritesh, re statistics of species recorded in this journal: There are naturalists who have cataloged certain classes of species over the years, but I do not think there are comprehensive statistics or lists maintained, certainly not online. One thing I would like to do, once we have an *online* version of the Journal available (something Kumaran, Dr. Yogananda and I are discussing with the BNHS), is to make it easy for the naturalist community to contribute structured information like this. Besides statistics, it would be nice to, for example, have every article geocoded – pointing to the regions in the Indian Subcontinent that the article applies to.

  3. Aditya says:

    A fascinating account indeed. Not to mention Issue 1, Vol. 1! I was curious as to why you chose b/w, 1 bit per pixel for the digital scans (as opposed to grayscale, which may capture subtleties in the typesetting and glyphs themselves and could subsequently be down-sampled into b/w if required). This was answered to some extent in the ‘Status Report’: "We will scan all pages at 1-bit per pixel (in other words, black & white). This is perfectly adequate for text and line art. Additionally we will scan pages with photographs and color images using grayscale and color respectively." Would greyscale for pages have taken longer? Or was there not enough reason to use anything more than 1 bit per pixel for ‘text and line art’?

  4. Joseph says:

    Note that 1-bit per pixel is the industry standard for large-scale scanning. In fact, 1-bit at 400 DPI or better is perfectly fine for most text, though not good enough for severely degraded text. Nowadays (10 years since I started), it is conceivable to simply do grayscale or even selective color scans. This is how it goes with the evolution of technology. I’ve told people that perhaps the most valuable thing I have done is securing a really good physical copy of the Journal – it is very hard to get hold of a complete set.
