Digital Humanities Research
April 2017
Ten years ago, I was looking for a big change. I got a sabbatical and moved to Leiden in the Netherlands. I did not know anyone there, nor did I know any Dutch. After a couple of months, I found a gap in the research: the life of Antony van Leeuwenhoek, who lived in Delft from 1632 to 1723.
Leeuwenhoek made over 500 tiny magnifying glasses like those on the right sidebar. The plates are about four inches high and the lenses less than two millimeters in diameter. The specimen goes on the pin and the screws bring it into focus. Using these magnifying glasses, Leeuwenhoek was the first human to see protists (protozoa), bacteria, sperm, and red blood cells. Every human society had dealt with the stars in some way, the macro world. No human society had even imagined the micro world. Leeuwenhoek was among the first to discover it. He was the first human to see almost everything that he saw with his little hand-made devices.
Wat doe je als je dingen ziet dat niemand ooit heeft gezien?
What do you do when you see things that no one has ever seen before?
That has been the guiding question of my research. What did Leeuwenhoek do when he saw things no one had ever even imagined?
For more than fifty years, Leeuwenhoek observed an astonishing variety of plants and animals. His enduring themes included reproduction, growth, and the movement of fluids within living bodies.
He wrote only letters, often recording observations on several unrelated topics. He self-published 165 of these letters. An equal number either were published only in excerpt in the Philosophical Transactions of the Royal Society of London or were not published at all. He was elected to membership in the Royal Society in 1680.
The image on the right came from a letter published in Philosophical Transactions in 1702, showing rotifera, ciliates, vorticellids, and the budding of hydra. The left sidebar shows the figures that accompanied two letters from the mid-1680s.
Letter 35 of March 3, 1682, addressed to Robert Hooke
Letter 38 of July 16, 1683, addressed to Christopher Wren
Through these letters, much is known about Leeuwenhoek's science. What I soon discovered when I got to the Netherlands in 2007 was that little was known about his life. For starters, I found contradictory information, right down to his birthdate!
Perhaps there were no records?
I went to the Delft City archives. The first day, I found a document about his life that was in none of the biographies. (For a short period, Leeuwenhoek was one of Delft's community trash supervisors. His name is on the bottom line of this document snippet.)
The two major biographies concentrated on his science and included only short sections on his life. For research, Clifford Dobell spent about two weeks in Delft while writing Antony van Leeuwenhoek and his "Little Animals" (1932), and he never went to the city archives. Abraham Schierbeek did some archival research for the two-volume Antoni van Leeuwenhoek: Zijn leven en zijn werken (1950). He included a few more documented events than Dobell had, but Schierbeek's main interest was always the science.
Examples of documents from the Delft archives:
Baptismal records -- the painter Vermeer on the first line, Leeuwenhoek on the eighth
Leeuwenhoek appointed curator of Vermeer's insolvent estate
By the time my sabbatical was over, I suspected that I had found a gap in the research. Was it worth pursuing? Two Dutch institutions seemed to cover 17th-century Dutch science: the Descartes Center at Utrecht Universiteit and the Huygens Institute for Dutch History, a government-funded research center in Den Haag, now moved to Amsterdam. I went to the Huygens Institute and introduced myself to Huib Zuidervaart, the researcher whose interests seemed closest to mine. Huib assured me that no one else was working on Leeuwenhoek's life. I had a green light.
I returned to Medaille for the fall semester of 2008 and then was able to return to the Netherlands for another 8-month stay until mid-August 2009. Since then, I have spent every summer there as well as another 8-month sabbatical, a year ago.
It wasn't long before I had spreadsheets full of information based on the documents I found in the archives. I had biographical data about Leeuwenhoek, his family, and his friends. I had hundreds of events in their lives. I had data about the properties they owned. Most importantly, I had notary records and city government records that fleshed out their daily lives. Photographs of relevant documents that I found in the Delft archives take up 16 GB on my hard drive. I couldn't get to them all if I had another whole lifetime. While my acquisition rate has slowed considerably, I have no doubt that there is more to be uncovered.
What to do with it all?
Structure documents or structure information?
Documents, I knew how to structure. Articles:
Still going strong: Leeuwenhoek at eighty, an invited review for Antonie van Leeuwenhoek's 80th Anniversary Issue.
Antony van Leeuwenhoek's microscopes and other scientific instruments: new information from the Delft archives in Annals of Science.
The tensions between facts and fantasy in Studium, the journal of Gewina, the Belgian-Dutch Society for History of Science.
and a video:
Animated Life: Seeing the Invisible, which I scripted and co-narrated, in the New York Times' Op-Docs section on September 15, 2014.
Information? How is structuring information different from structuring documents? That was one of the hardest things I had to learn because it involved new patterns of thinking. I had to re-structure my thinking, too.
In terms of technology, I did not need a document management system like WordPress, nor the kind of static web pages I used to make with software like Dreamweaver. Spreadsheets full of data needed a content management system, one that assembles the pages users see from a database full of each page's parts, that is, the cells of the spreadsheets. In fact, because my content was so specific, I could best benefit from a content management framework. The framework would let me create a custom content management system tailored to the different types of content that I had, such as property records, which are different from biographical data, which are different from information about events.
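To make the difference between structuring documents and structuring information concrete, here is a minimal sketch in Python. It is not how Drupal works internally; the type names (Person, Event, PropertyRecord), their fields, and the render function are all hypothetical, chosen only to show the idea that each kind of content has its own fields and that a page is assembled from those fields rather than written by hand.

```python
from dataclasses import dataclass, fields


# Hypothetical content types: each one mirrors the columns of one spreadsheet.
@dataclass
class Person:
    name: str
    born: str
    died: str


@dataclass
class Event:
    date: str
    description: str
    person: str


@dataclass
class PropertyRecord:
    address: str
    owner: str
    acquired: str


def render(record) -> str:
    """Assemble a plain-text 'page' from the fields of a single record."""
    title = type(record).__name__
    lines = [title, "-" * len(title)]
    for f in fields(record):
        label = f.name.replace("_", " ").capitalize()
        lines.append(f"{label}: {getattr(record, f.name)}")
    return "\n".join(lines)


# Values taken from the text above; the structure itself is an assumption.
leeuwenhoek = Person(name="Antony van Leeuwenhoek", born="1632", died="1723")
election = Event(
    date="1680",
    description="Elected to membership in the Royal Society",
    person=leeuwenhoek.name,
)

print(render(leeuwenhoek))
print()
print(render(election))
```

The same render function serves every content type, which is the point: once the information is structured, the pages follow from the data.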
My courses have been paperless since I began running my own web server in 1998. Internet technologies (HTML, style sheets, scripting languages, database management, web analytics, etc.) have been integral to my teaching and scholarship ever since. For the Leeuwenhoek project, I chose the free, open-source Drupal content management framework. Drupal powers some of the most popular websites, from WhiteHouse.gov to Weather.com to Medaille.edu.
I went live with the first Drupal version of LensonLeeuwenhoek.net in October 2009. It now has about 400,000 words and several thousand images. In the last year, it had just shy of 20,000 visitors from over 150 different countries. The big day was October 24, 2016, when the Google Doodle celebrated Leeuwenhoek's 384th birthday, resulting in a little over four thousand visitors in less than 24 hours.
Lens on Leeuwenhoek is the most comprehensive Leeuwenhoek resource on the web and has directly caused many good things to happen to me.
Huygens Institute
It has led to my association with the Huygens Institute. Staying in the Netherlands past the standard three-month tourist visa requires a sponsor to keep the immigration folks happy. For my sabbatical last year, the Huygens Institute sponsored me. I expect that when I retire from teaching in a year or two and move to the Netherlands, the Huygens Institute will again sponsor me because of the contribution I am making to the history of Dutch science.
The Huygens Institute has long specialized in what is now known as the digital humanities.
We regard ourselves as a humanities laboratory in which we develop, test and apply new methods in order to extract more and different information from the sources than has been possible until now. That represents a new approach within the humanities. The Huygens ING wishes to pose new research questions, provide better answers to old ones, and to underpin the answers with much more data. We build and retrieve large bodies of text and datasets with hundreds of thousands of records. We ourselves develop the tools that are required for our research projects.
I have been doing the digital humanities for about 20 years, long before the term became current. At the Huygens Institute, I found a group of scholars with similar interests in the power of database-driven research. For my project, that takes two forms.
Lens on Leeuwenhoek, a database-driven biographical web
Leeuwenhoek's collected letters, for which text-mining tools will let us "extract more and different information from his work than has been possible until now"
Text mining uses the power of a database to find patterns that would not be apparent otherwise. One quick example: In letters from 1673 to 1695, Leeuwenhoek usually used the Dutch term microskoop (microscope) and sometimes used the term vergrootglas (magnifying glass) to refer to his devices. After 1695, he used only vergrootglas. What happened that caused him to change? Without text mining, I wouldn't even know to look.
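A pattern like that can be surfaced with very little code. The sketch below is only an illustration, not the tooling used for the project: the sample texts are invented placeholders rather than quotations from the letters, and the prefix matching is a rough assumption that ignores the spelling variation of 17th-century Dutch.

```python
import re
from collections import Counter

# Label -> word prefix to match. The prefixes are an assumption meant to
# catch simple inflections (for example, the plural "vergrootglazen").
TERM_PREFIXES = {
    "microskoop": "microsko",
    "vergrootglas": "vergrootgla",
}

# (year, text) pairs. Invented placeholder text, NOT quotations from the letters.
SAMPLE_LETTERS = [
    (1674, "door mijn microskoop gezien"),
    (1683, "met een vergrootglas en een microskoop"),
    (1700, "door het vergrootglas"),
    (1712, "mijn vergrootglazen"),
]


def term_counts_by_period(letters, split_year=1695):
    """Count each term in letters dated up to split_year and after it."""
    counts = {"through": Counter(), "after": Counter()}
    for year, text in letters:
        period = "through" if year <= split_year else "after"
        words = re.findall(r"\w+", text.lower())
        for label, prefix in TERM_PREFIXES.items():
            counts[period][label] += sum(1 for w in words if w.startswith(prefix))
    return counts


counts = term_counts_by_period(SAMPLE_LETTERS)
for period, counter in counts.items():
    print(period, dict(counter))
```

Run against a full, dated corpus instead of the placeholder list, the same two-way count would show whether one term really disappears after a given year, which is the kind of question that then sends you back to the archives.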
Text mining is only as good as the text. It is best to have a complete corpus. Unfortunately, we do not have a complete corpus of Leeuwenhoek's letters. In 1932, a committee of Dutch scientists began to collect, edit, translate into English, annotate, and publish the approximately 350 available letters. By last year, they had published 294 of those letters in 16 volumes. Volume 17 was moving toward publication but already six years behind schedule. Volumes 18 and 19 remained. Lodewijk Palm (right), the fifth editor in the series, was retiring, so last summer the committee asked me to finish. Two problems:
I did not want to do it without a native speaker to help me. My Dutch is pretty good, but not that good.
I have this teaching gig that takes too much of my time, so I can't be on the ground in the Netherlands.
Thus, I am the co-editor along with my collaborator Huib Zuidervaart. Together with an intern, we quickly got volume 17 to the point where we are currently proofreading the typeset text. Two more volumes to go, and we will finally have a complete corpus, in two languages! Then I can get to work mining the text and looking for patterns to relate to events in his life.
The 300th anniversary of Leeuwenhoek's death occurs in 2023. The 400th anniversary of his birth occurs in 2032. I am already talking with the Huygens Institute about appropriate celebrations. Could an international collaboration for 2023 draw grant funding? I would also note that Corning Glass in nearby Corning, N.Y., has an interest in Leeuwenhoek's lens-making techniques and that all of Western New York, including Buffalo, was once The Holland Purchase, owned and surveyed by Dutch investors.
Featured sections of the web:
His publications - an extended bibliography. If it were in print, it would be a book-length bibliographic essay.
Bibliography -- by far the most comprehensive Leeuwenhoek bibliography